ARTICLE

https://doi.org/10.1038/s41467-019-12339-7 OPEN

Enhanced CRISPR-based DNA demethylation by Casilio-ME-mediated RNA-guided coupling of methylcytosine oxidation and DNA repair pathways

Aziz Taghbalout1, Menghan Du1, Nathaniel Jillette1, Wojciech Rosikiewicz1, Abhijit Rath2, Christopher D. Heinen2, Sheng Li1 & Albert W. Cheng 1,3,4*

Here we develop a methylation editing toolbox, Casilio-ME, that enables not only RNA-guided methylcytosine editing by targeting TET1 to genomic sites, but also by co-delivering TET1 and factors that couple methylcytosine oxidation to DNA repair activities, and/or promote TET1 to achieve enhanced activation of methylation-silenced genes. Delivery of TET1 activity by Casilio-ME1 robustly alters the CpG methylation landscape of promoter regions and activates methylation-silenced genes. We augment Casilio-ME1 to simultaneously deliver the TET1-catalytic domain and GADD45A (Casilio-ME2) or NEIL2 (Casilio-ME3) to streamline removal of oxidized cytosine intermediates to enhance activation of targeted genes. Using two-in-one effectors or modular effectors, Casilio-ME2 and Casilio-ME3 remarkably boost activation and methylcytosine demethylation of targeted loci. We expand the toolbox to enable a stable and expression-inducible system for broader application of the Casilio-ME platforms. This work establishes a platform for editing DNA methylation to enable research investigations interrogating DNA methylomes.

1 The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA. 2 Center for Molecular Oncology, University of Connecticut Health, 263 Farmington Avenue, Farmington, CT 06030, USA. 3 Department of Genetics and Genome Sciences, University of Connecticut Health, 400 Farmington Avenue, Farmington, CT 06030, USA. 4 Institute for Systems Genomics, UConn Health Science Center, 400 Farmington Avenue, Farmington, CT 06030, USA. *email: [email protected]

NATURE COMMUNICATIONS | (2019) 10:4296 | https://doi.org/10.1038/s41467-019-12339-7 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12339-7

DNA methylation is part of the multifaceted epigenetic modifications of chromatin that shape cellular differentiation, gene expression, and maintenance of cellular homeostasis. Aberrant DNA methylation is implicated in various diseases including cancer, imprinting disorders, and neurological diseases1. Developing tools to directly edit the methylation state of a specific genomic locus is of significant importance both for studying the biology of DNA methylation as well as for development of therapies to treat DNA methylation-associated diseases.

In mammalian cells, the 5-methylcytosine (5mC) epigenetic mark generated by covalent linkage of a methyl group to the 5th position of the cytosine ring of CpG sequences is catalyzed by one of the three canonical DNA methyltransferases DNMT1, DNMT3A, and DNMT3B2–4. DNA methylation is dynamic and involves demethylation pathways which erase 5mC to restore unmethylated DNA. Active demethylation involves the ten–eleven translocation (TET) family of methylcytosine dioxygenases that iteratively oxidize 5mC into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) intermediates5. Subsequently, 5fC and 5caC are processed by the base-excision repair (BER) machinery to restore unmethylated cytosines. Restoration of an intact DNA base is initiated by DNA glycosylases that excise damaged bases to generate an apurinic/apyrimidinic site (AP site) for processing by the rest of the BER machinery. Thymine DNA glycosylase (TDG)-based BER has been functionally linked to TET1-mediated demethylation, suggesting an interplay between TET1 and enzymes of the BER machinery to actively erase 5mC marks. TDG acts on 5fC and 5caC, and NEIL1 and NEIL2 DNA glycosylase/AP-lyase activities facilitate restoration of unmethylated cytosine by displacing TDG from the AP site to create a single strand DNA break substrate for further BER processing6–12.

Interestingly, DNA demethylation is enhanced by GADD45A (Growth Arrest and DNA-Damage-inducible Alpha), a multifaceted nuclear protein involved in maintenance of genomic stability, DNA repair and suppression of cell growth13–15. GADD45A interacts with TET1 and TDG, and was suggested to play a role in coupling 5mC oxidation to DNA repair16,17.

Advances in artificial transcription factor (ATF) technologies have enabled direct control of gene expression and epigenetic states18–20. CRISPR/Cas9-based technologies allow much flexibility and scalability because the specificity is programmable by a single guide RNA (sgRNA)21,22. Tethering of TET1 or DNMT3a to genomic targets by use of ATFs has been shown to allow targeted removal or deposition of DNA methylation23–29. However, these ATF systems have inherent limitations in enabling multiplexed targeting, effector multimerization or formation of protein complexes at the targeted sequence. We recently developed the Casilio system which uses an extended sgRNA scaffold to assemble protein factors at target sites, enabling multimerization, differential multiplexing30, and potentially stoichiometric complex formation.

Here we develop an advanced DNA methylation editing technology which allows targeted bridging of TET1 activity to BER machinery to efficiently alter the epigenetic state of CpG targets and activate methylation-silenced genes. Casilio-DNA Methylation Editing (ME) platforms enable targeted delivery of the TET1 effector alone (Casilio-ME1) or in association with GADD45A (Casilio-ME2) or NEIL2 (Casilio-ME3) to achieve enhanced 5mC demethylation and gene activation. We show that Casilio-ME-mediated delivery of TET1 activity to gene promoters induces robust cytosine demethylation within the targeted CpG island (CGI) and activation of gene expression. When systematically compared to other reported methylation editing systems, Casilio-ME shows superior activities in mediating transcriptional activation of a methylation-silenced gene and 5mC demethylation. The ability of Casilio-ME to mediate co-delivery of TET1 activity along with other protein factors, which enhance turnover of oxidized cytosine intermediates, paves the way for new areas of research to efficiently address the cause–effect relationships of DNA methylation in normal and pathological processes.

Results

Casilio-ME1 delivers TET1 activity to a genomic target site. Casilio-ME1 is a three-component DNA Methylation Editing platform built on Casilio which uses nuclease-deficient Cas9 (dCas9), an effector module made of a Pumilio/FBF (PUF) domain linked to an effector protein, and a modified sgRNA containing PUF-binding sites (PBS) (Fig. 1a)30. The dCas9/sgRNA ribonucleoprotein complex binds DNA targets without cutting to serve as an RNA-guided DNA-binding vehicle whose specificity is dictated by the spacer sequence of the sgRNA and a short protospacer adjacent motif located on the target DNA. PUF-tethered effectors are recruited to the ribonucleoprotein complex via binding to their cognate PBS present on the sgRNA scaffold. PUF domains are found in members of an evolutionarily conserved family of eukaryotic RNA-binding proteins whose specificity is encoded within their structural tandem repeats, each of which recognizes a single ribonucleobase31. PUF domains can be programmed to bind to any 8-mer RNA sequence, e.g., PUFa and PUFc used in this study were designed to bind PBSa (UGUAUGUA) and PBSc (UUGAUGUA), respectively30,31. Multiple PBS added in tandem to the 3′ end of the sgRNA allow concurrent recruitment of multiple PUF-effectors to targeted DNA sequences without interfering with dCas9 targeting, and therefore allow amplification of the response to associated effector modules30.

To enable targeted cytosine demethylation and subsequent activation of methylation-silenced genes, we built a DNA methyl editor TET1-effector Casilio-ME1 as a protein fusion of hTET1 catalytic domain (TET1(CD)) to the carboxyl end of PUFa (Fig. 1a). We chose as a target the MLH1 promoter region that is part of a large CGI whose aberrant hypermethylation induces MLH1-silencing in 10–30% of colorectal and other cancers32,33. MLH1 is silenced in HEK293T cells and therefore represents a clinically relevant model for developing Casilio-ME.

To test the system, cells were transiently transfected with plasmids encoding Casilio-ME1 components PUFa-TET1(CD) effector, dCas9 and six MLH1-promoter-targeting sgRNAs each containing five copies of PBSa (Fig. 1a). This resulted in robust MLH1 activation as indicated by the obtained fold changes in MLH1 mRNA in cells collected on day 3 post-transfection (Fig. 1b, c upper panel). In contrast, MLH1 activation was not obtained with a non-targeting sgRNA (NT-sgRNA) (Fig. 1b), indicating that Casilio-ME1-mediated MLH1 activation requires specific targeting of the PUFa-TET1(CD) module directed by the programmable sgRNAs.

Evidence that the Casilio-ME1-induced activation of MLH1 results from TET1-mediated 5mC demethylation came from high throughput bisulfite sequencing (BSeq) of MLH1 amplicons derived from the cells analyzed in Fig. 1b. BSeq showed that targeted delivery of the TET1(CD) effector induces a profound decrease in CpG methylation frequency within the MLH1 promoter region (Fig. 1c lower panel). Demethylation activity was prominently higher within CpGs neighboring MLH1-sgRNA sites (Fig. 1c lower panel (arrows)), and seemed to spread away, albeit with relatively reduced activities. These data indicate that Casilio-ME1 mediates delivery of TET1


[Figure 1, panels a–e: graphic not reproduced. Panels a and d: schematics of the Casilio-ME1, SunTag, MS2 and TALE systems; panels b and e: MLH1 fold change (RT qPCR); panel c: 5mC + 5hmC at CpG coordinates relative to the MLH1 TSS.]

activity to promoter regions to induce 5mC demethylation within the targeted CGI and subsequent activation of the methylation-silenced gene.

Comparison of Casilio-ME1 with other TET1 delivery systems. Although other technologies enabling targeted delivery of TET1 activity to genomic loci have been reported to induce activation of methylation-silenced genes23,24,26,29, a direct comparison of their efficiency is lacking. Here we compared Casilio-ME1 efficiency to alter expression of methylation-regulated genes to alternative technologies for 5mC demethylation that are based on TALEs (transcription activator like effector), dCas9/MS2 or dCas9/SunTag systems23,24,26 (Fig. 1d). We therefore assembled four TALE-TET1(CD) fusions each designed to bind to one of the four MLH1-sgRNA target sequences used for dCas9-based delivery


Fig. 1 Evaluation of Casilio-ME1-mediated gene activation and 5mC demethylation. a Schematic representation of Casilio-ME1 components. PUFa-TET1(CD) effector (TET1 residues 1418–2136), dCas9, and sgRNA with 3′ extension scaffold comprising five PUFa-binding sites (PBSa) are shown. Amino (N) and carboxyl (C) termini of protein fusions are arbitrarily shown. b Column plot showing fold changes in MLH1 mRNA levels in cells transfected with Casilio-ME1 components comprising MLH1-sgRNAs or NT-sgRNA. Cells were collected three days after transfection and were not subjected to selection. Error bars represent mean ± S.E.M (n = 3), data from two independent experiments are shown. NS, not significant, p > 0.05, two-way ANOVA. c Upper panel: MLH1 promoter and associated CpG island. Regions B and C are depicted according to a report correlating MLH1-silencing to region C hypermethylation33. CpGs (lollipops), transcription start site (TSS) (arrow), and the sgRNAs used (A to F) are shown. Coordinates are relative to annotated TSS. Lower panel: high throughput BSeq analysis of MLH1 amplicons obtained from cells analyzed in (b). CpG methylation frequency of MLH1 promoter regions (mean ± S.E.M; n = 2) is shown. Arrows indicate locations of CpGs overlapping the MLH1 sgRNA (A–F) target sequences or TSS (blue arrow) as shown. CpG coordinates represent positions of cytosines relative to annotated TSS. p < 0.0001, two-tail t-test and two-way ANOVA. d DNA demethylation technologies compared in panel (e). TET1 effectors are tethered to dCas9 nucleoprotein at the targeted site via binding of PUFa to PBSa, MS2 coat protein to stem-loop RNA structures appended to sgRNA, or scFv (single-chain fragment variable) antibody against short peptide (GCN4) appended in array to dCas9 carboxy-terminus. TALE mediates delivery of TET1 activity via binding to the targeted sequence. In the MS2 system mouse TET1(CD) is also C-terminally fused to dCas9. e Evaluation of Casilio-ME1 platform as compared with alternative 5mC demethylation systems. MLH1 mRNA relative levels (mean ± S.E.M.; n = 3) in cells transfected with Casilio-ME1, MS2, TALE or SunTag components. Four MLH1-sgRNAs or NT-sgRNA were used with dCas9-based delivery systems. Four TALE effectors each targeting the sequences targeted by the MLH1-sgRNAs A, B, D, or F were used. ***p < 0.05, one-way ANOVA
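
As an illustration of how a per-CpG methylation frequency of the kind plotted in panel c can be derived from bisulfite reads, the following minimal Python sketch counts, at each CpG cytosine, the fraction of reads that retain a C (unconverted, i.e., 5mC or 5hmC) among reads showing either C or T (converted). It is not the authors' analysis pipeline: the function name, the toy reference, and the toy reads are hypothetical, and reads are assumed to be top-strand, ungapped, and already aligned to the start of the reference.

```python
from collections import Counter


def cpg_methylation_frequency(reference, reads):
    """Return {position: fraction of reads with C} for each CpG cytosine.

    Assumptions (hypothetical, for illustration only): every read is a
    top-strand, bisulfite-converted sequence of the same length as
    `reference`, aligned without gaps at position 0.
    """
    # 0-based positions of the cytosine in every CG dinucleotide.
    cpg_positions = [
        i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"
    ]
    frequencies = {}
    for pos in cpg_positions:
        # C = protected from conversion (5mC or 5hmC); T = converted.
        calls = Counter(read[pos] for read in reads if read[pos] in "CT")
        total = calls["C"] + calls["T"]
        if total:
            frequencies[pos] = calls["C"] / total
    return frequencies


# Toy data: two CpGs (cytosines at positions 1 and 5), four reads.
reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ACGTTCGA", "ATGTTTGA", "ACGTTTGA"]
frequencies = cpg_methylation_frequency(reference, reads)
print(frequencies)
```

Because bisulfite conversion alone does not separate 5mC from 5hmC, a frequency computed this way corresponds to the combined 5mC + 5hmC signal shown on the panel c axis.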

systems. Relative quantitation of MLH1 mRNA indicated that the SunTag, TALEs or MS2 based systems only achieved 63%, 7% or 1%, respectively, of the Casilio-ME1-mediated activation level (Fig. 1e). BSeq analysis of MLH1 promoter comparing Casilio-ME1 and SunTag systems showed that Casilio-ME1 induced stronger demethylation at most of the CpG sites examined (Supplementary Fig. 1a, b). The efficient MLH1 activation obtained with Casilio-ME1 delivery of TET1 activity as compared to SunTag is not driven by sgRNA composition as a similar trend was obtained when one sgRNA was used for targeting MLH1 CGI (Supplementary Fig. 1c). These results suggest that Casilio-ME1-mediated delivery of TET1 activity to CGI target enables stronger 5mC demethylation and gene activation compared to published systems.

Co-delivery of TET1(CD) and GADD45A enhances gene activation. Active 5mC erasure is a two-step process initiated by TET1-mediated iterative 5mC oxidations followed by base-excision (BER) or nucleotide-excision (NER) repair conversions of oxidized intermediates to cytosines5,15. Thus, coupling these two steps could streamline 5mC active erasure to efficiently activate methylation-silenced genes. Because GADD45A promotes TET1 activity and/or could recruit key player(s) of DNA repair14–17, we sought to augment Casilio-ME by constructing an upgraded version Casilio-ME2 which simultaneously recruits TET1(CD) and GADD45A to bridge 5mC oxidation with DNA repair at a specific genomic locus.

To test Casilio-ME2 in inducing MLH1 activation in a comparison with Casilio-ME1 and our previously reported Casilio-p65HSF1 activator, we introduced plasmids encoding the protein fusions of PUFa-p65HSF1 (activator), PUFa-TET1(CD) (Casilio-ME1), PUFa-GADD45A-TET1(CD) (Casilio-ME2.1), or GADD45A-PUFa-TET1(CD) (Casilio-ME2.2) (Fig. 2a), along with dCas9 and sgRNAs plasmids into HEK293T cells. Casilio-ME1-mediated MLH1 activation was 46% higher than that obtained with the PUFa-p65HSF1 activator module (Fig. 2b). Interestingly, when GADD45A was added as part of Casilio-ME2.1 or Casilio-ME2.2 TET1-effectors, MLH1 mRNA expression was augmented by 3- and 6-fold, respectively, compared to Casilio-ME1 (Fig. 2b). This enhanced MLH1 activation, obtained with GADD45A as part of the TET1 effector modules, does not result from higher expression of Casilio-ME2 effectors (Supplementary Fig. 2). Thus, coupling of GADD45A and TET1(CD) as a two-in-one effector enhances TET1-mediated activation of methylation-silenced genes.

To obtain further evidence that co-delivery of TET1(CD) and GADD45A effectors to target sites enhances gene activation compared to delivery of TET1(CD) alone, we fused each effector to a separate PUF protein, i.e., TET1(CD) to PUFa and GADD45A to PUFc, and used sgRNA containing both PBSa and PBSc to tether the respective PUF-fusion to the sgRNA scaffold (Fig. 2c). When Casilio-ME2.3 (PUFc-GADD45A) or Casilio-ME2.4 (GADD45A-PUFc) components were introduced to cells, 3- and 6-fold increases in TET1-mediated MLH1 activation were obtained, respectively (Fig. 2d). However, no MLH1 expression was detected using Casilio-ME2.3 and ME2.4 systems when a catalytically dead TET1(CD) (dTET1(CD)) replaced wild-type TET1(CD), indicating that the observed GADD45A-mediated stimulation of gene activation requires the oxidative activity of TET1 (Fig. 2d). Similarly, no MLH1 activation was obtained when the GADD45A modules of Casilio-ME2.3 and ME2.4 systems were introduced into cells without the PUFa-TET1(CD) component (Fig. 2c, d), indicating that expression of GADD45A alone, in the absence of the TET1 module, does not mediate gene activation. In addition, when the sgRNAs contained PBSa but lacked PBSc required for tethering PUFc-associated modules, GADD45A modules failed to stimulate MLH1 activation (Supplementary Fig. 3). Thus, enhancement of gene activation in Casilio-ME2 requires co-delivery of GADD45A and TET1-effector modules to the target site.

Co-delivery of TET1(CD) and BER enzymes. TDG, NEIL1, and NEIL2 have been functionally linked to active DNA demethylation as they are involved in the initial step of removing oxidized cytosines 5fC and 5caC produced by TET1 activities6–9,11,12. Because initiating repair of oxidized cytosines by the BER machinery might be a rate limiting step to TET1-mediated activation of methylation-silenced genes, we reasoned that coupling TET1 activities with DNA glycosylases could facilitate 5mC active erasure and enhance subsequent gene activation.

We therefore linked NEIL1, NEIL2, NEIL3, or TDG to the PUFa-TET1(CD) effector as single-chain protein fusions and looked for potential gains in Casilio-ME1-mediated MLH1 activation. Among these, only NEIL2 fusions showed enhanced activation of MLH1 expression (Supplementary Fig. 4). Casilio-ME3.1 and Casilio-ME3.2, in which NEIL2 is fused N-terminally to PUFa or between PUFa and TET1(CD) of the PUFa-TET1(CD) effector, respectively, increased MLH1 activation 4-fold in the presence of MLH1-sgRNAs compared to Casilio-ME1 (Fig. 3a, b). Thus, co-delivering TET1(CD) and NEIL2 as a two-in-one effector to target sites improves activation of 5mC-silenced genes.

To further show that NEIL2 promotes demethylation-mediated gene activation, TET1(CD) and NEIL2 were co-delivered as separate effectors to MLH1 promoter regions by fusing TET1(CD) and NEIL2 to PUFa and PUFc, respectively, and using


[Figure 2, panels a–d: graphic not reproduced. Panels a and c: schematics of the Casilio/p65HSF1, Casilio-ME1 and Casilio-ME2.1 to ME2.4 effector configurations; panels b and d: MLH1 fold change (RT qPCR).]

Fig. 2 Casilio-ME2 mediates dual delivery of TET1(CD) and GADD45A to targeted genomic sites. a Schematic representation of the indicated Casilio-ME and Casilio platforms showing effector modules of PUFa fusion proteins used to transfect cells analyzed in (b). TET1(CD) (black), GADD45A (blue), p65HSF1 (green), PUFa (light gray), amino (N), and carboxyl (C) termini of protein fusions and occupancy of PBSa are arbitrarily shown. b Plot showing MLH1 mRNA relative levels (mean fold change ± S.E.M.; n = 3) in cells transfected with components of Casilio-ME1, Casilio-ME2.1, Casilio-ME2.2, or Casilio/p65HSF1 in the presence of MLH1-sgRNAs or NT-sgRNA as indicated. Cells were collected 3 days after transfection. Drawing of promoter regions with the MLH1-sgRNAs used (A–F), CpGs (lollipops), and TSS (arrow) is shown above the plot. ***p < 0.0005, one-way ANOVA. c Schematic representation of the indicated Casilio-ME platforms showing effector modules of PUFa and PUFc fusion proteins used to transfect cells analyzed in panel (d). TET1(CD) (black), GADD45A (blue), PUFa (light gray), PUFc (orange), sgRNA containing both PBSa and PBSc, amino (N) and carboxyl (C) termini of protein fusions are arbitrarily shown. d MLH1 mRNA relative levels (mean fold change ± S.E.M.; n = 3) in cells transfected with components of Casilio-ME1, Casilio-ME2.3, or Casilio-ME2.4 in the presence of MLH1-sgRNAs or NT-sgRNA are shown. When indicated, the PUFa-TET1(CD) effector component of Casilio-ME2.3 and Casilio-ME2.4 was replaced by a catalytically dead PUFa-TET1(CD) effector containing TET1-inactivating mutations H1671Y, D1673A57, or omitted. MLH1 promoter with the sgRNAs used (A–F), CpGs (lollipops), and TSS (arrow) is depicted above the plot. ***p < 0.0005, one-way ANOVA
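
The sgRNA 3′ extensions drawn in panels a and c can be pictured as tandem arrays of 8-mer PUF-binding sites. The minimal Python sketch below assembles such an array from the PBSa (UGUAUGUA) and PBSc (UUGAUGUA) sequences given in the text. Everything else is an assumption made for illustration: the function name, the "GCC" linker between sites, and the site order in the dual example are hypothetical and are not taken from the authors' constructs.

```python
# 8-mer PUF-binding sites as stated in the article.
PBS_SEQUENCES = {"a": "UGUAUGUA", "c": "UUGAUGUA"}


def build_pbs_extension(order, linker="GCC"):
    """Join PBS sites in the given order, 5' to 3'.

    `order` is a string of site names, e.g. "aaaaa" for five PBSa.
    `linker` is a hypothetical spacer; the real inter-site sequence
    is not given in this section of the article.
    """
    unknown = [name for name in order if name not in PBS_SEQUENCES]
    if unknown:
        raise ValueError(f"unknown PBS name(s): {unknown}")
    return linker.join(PBS_SEQUENCES[name] for name in order)


# Five copies of PBSa, as used for the Casilio-ME1 sgRNAs.
five_pbsa = build_pbs_extension("aaaaa")

# One PBSc followed by one PBSa (copy number and order are assumptions).
dual_sites = build_pbs_extension("ca")

print(five_pbsa)
print(dual_sites)
```

Keeping the sites in a lookup table means a further PUF/PBS pair could be added without touching the assembly function, which mirrors the modular recruitment the platform relies on.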


[Figure 3, panels a–d: graphic not reproduced. Panels a and c: schematics of the Casilio-ME1 and Casilio-ME3.1 to ME3.4 effector configurations; panels b and d: MLH1 fold change (RT qPCR).]

Fig. 3 Casilio-ME3 mediates dual delivery of TET1(CD) and NEIL2 to a targeted genomic site. a Illustration of the indicated Casilio-ME platforms showing effector modules of PUFa fusion proteins used to transfect cells analyzed in panel (b). TET1(CD) (black), NEIL2 (blue), PUFa (light gray), occupancy of PBSa, amino (N) and carboxyl (C) termini of protein fusions are arbitrarily shown. b MLH1 mRNA relative levels (mean fold change ± S.E.M.; n = 3) in cells transfected with components of Casilio-ME1, Casilio-ME3.1, or Casilio-ME3.2 in the presence of MLH1-sgRNAs or NT-sgRNA as indicated. Drawing of promoter regions with the MLH1-sgRNAs used (A–F), CpGs (lollipops), and TSS (arrow) is shown above the plot. ***p < 0.001, one-way ANOVA. c Schematic representation of the indicated Casilio-ME platforms showing effector modules of PUFa and PUFc protein fusions used to transfect cells analyzed in panel (d). TET1(CD) (black), NEIL2 (blue), PUFa (light gray), PUFc (orange), sgRNA containing both PBSa and PBSc, occupancy of PBSa and PBSc, amino (N) and carboxyl (C) termini of protein fusions are arbitrarily shown. d Plot showing MLH1 mRNA relative levels (mean fold change ± S.E.M.; n = 3) in cells transfected with components of Casilio-ME1, Casilio-ME3.3, or Casilio-ME3.4 in the presence of MLH1-sgRNAs. Drawing of promoter regions with the MLH1-sgRNAs used (A–F), CpGs (lollipops), and TSS (arrow) is shown above the plot. ***p < 0.005, one-way ANOVA

sgRNAs comprising both PBSa and PBSc (Fig. 3c). Binding of these effectors to a sgRNA scaffold brings TET1(CD) and NEIL2 into close proximity, and potentially enables coupling of DNA demethylation with BER. When Casilio-ME3.3 (PUFc-NEIL2) or Casilio-ME3.4 (NEIL2-PUFc) components were used, MLH1 activation was increased by 7-fold as compared to Casilio-ME1 (Fig. 3d). Taken together, these results show that co-delivery of TET1(CD) and NEIL2 DNA glycosylase/AP-lyase stimulates activation of a methylation-silenced gene.

To determine whether the enhanced gene activation obtained with Casilio-ME3.3 and ME3.4 systems requires co-targeting of TET1(CD) and NEIL2 to a genomic site and does not result from NEIL2 overexpression, we disabled targeting of NEIL2 effector modules by using sgRNAs comprising PBSa but lacking the PBSc required for targeting PUFc-based NEIL2 effectors (Supplementary Fig. 5a). Cells transfected with Casilio-ME3.3 or ME3.4 components comprising sgRNAs that lacked PBSc tethering sites showed no significant gains in TET1-mediated MLH1 activation, indicating that enhanced TET1-mediated gene activation requires co-targeting of NEIL2 and TET1(CD) modules via an RNA scaffold (Supplementary Fig. 5b). Thus, these data show that co-delivery of NEIL2 and TET1(CD) to genomic loci synergistically promotes TET1-mediated gene activation, likely via facilitated coupling of 5mC demethylation and BER activities to efficiently restore unmethylated cytosine to targeted sites.

Comparison of Casilio-ME platforms. Casilio-ME2 and Casilio-ME3 platforms showed an enhanced activation of a methylation-silenced gene compared to Casilio-ME1. Here we sought to compare these platforms to one another in their efficiencies to activate MLH1 and alter the methylation landscape of the targeted CGI. The comparison included the previously reported dCas9-TET1 as an alternative system for reference25. MLH1 levels, normalized to those obtained with the dCas9-TET1 system, showed that Casilio-


ME2.2 gave augmented MLH1 activation higher than Casilio-ME1, ME2.1, ME3.1, and ME3.2 platforms (Supplementary Fig. 6a). This augmented MLH1 activation does not necessarily require targeting with multiple sgRNAs as a similar trend in MLH1 activation was obtained with one sgRNA targeting MLH1 promoter region (Supplementary Fig. 6b). When modular Casilio-ME2.3, ME2.4, and ME3.4 platforms were compared, Casilio-ME2.4 showed the most MLH1 activation (Supplementary Fig. 6c). Only background MLH1 mRNA levels could be detected with TET1(CD)-dead mutants of the Casilio-ME platforms (Casilio-dME) (Supplementary Fig. 6a, c), indicating that TET1 activity is required for gene activation and that delivery of GADD45A or NEIL2 without TET1 oxidative activity is not sufficient for activating methylation-silenced genes.

To ask whether the augmented MLH1 activation of Casilio-ME2.2 and Casilio-ME3.1 came from an increased efficiency in 5mC erasure, we performed BSeq and oxidative BSeq (oxBSeq) by high throughput amplicon sequencing of MLH1 promoter regions derived from cells transfected with Casilio-ME components or dCas9-TET1. Analysis of 5mC frequencies within MLH1 CGI showed that Casilio-ME2.2 and Casilio-ME3.1 produced higher demethylation activities compared to Casilio-ME1 and dCas9-TET1 as indicated by the reduced levels of 5mC (Fig. 4a upper panel, b). This is consistent with the observed higher accumulation trends of 5mC-oxidation products 5hmC and bisulfite converted CpGs (5fC, 5caC and C) (Fig. 4). Interestingly, a noticeable trend appears to exist when looking at the levels of 5mC-oxidation products; Casilio-ME2.2 produced more bisulfite converted CpGs (5fC, 5caC, and C), whereas Casilio-ME3.1 produced more 5hmC (Fig. 4b, c). This apparent difference in 5mC oxidation patterns could explain the relative efficiencies of Casilio-ME2.2 and ME3.1 in enhancing MLH1 activation. The higher accumulation of 5hmC in the NEIL2-based Casilio-ME platform could be explained by NEIL2 competing with TET1(CD) for processing 5fC and 5caC substrates to potentially steer TET1 activity more toward the 5mC substrate. Alternatively, TET1 activity may be promoted in the presence of NEIL2 or NEIL2-associated proteins. For Casilio-ME2.2, the observed 5mC oxidation profiles are consistent with GADD45A promoting TET1 activity and/or recruiting BER to the target site, leading to accumulation of bisulfite converted CpGs (5fC, 5caC, and C).

Evidence that the enhanced gene activation obtained with Casilio-ME2.2 and Casilio-ME3.1 required fully active GADD45A or NEIL2, respectively, was obtained when point mutations were introduced to alter key functional features or inactivate corresponding proteins. GADD45A lacks any obvious enzymatic activity; however, previous reports pointed us to key amino acids required for chromatin interaction (G39A) or dimerization/self-association (L77E) of the protein34,35. Catalytically inactive NEIL2 with (C291S) or (R310Q) mutations located at the zinc finger domain required for NEIL2-binding to DNA substrate were also reported36. Introduction of these point mutations to Casilio-ME2.2 or Casilio-ME3.1 abrogated the enhanced MLH1 activation (Supplementary Fig. 7a). The reduced MLH1 activations obtained were not due to protein destabilization caused by amino-acid changes introduced to GADD45A and NEIL2 (Supplementary Fig. 7b, c). Interestingly, Casilio-ME3.1 containing the NEIL2(R310Q) mutation seemed to retain a weak enhancement that is likely attributed to residual catalytic and DNA-binding activities of the R310Q NEIL2 mutant (Supplementary Fig. 7a)36.

The enhanced demethylation activities, taken together with the fact that the boost in MLH1 activation mediated by Casilio-ME2.2 and Casilio-ME3.1 required TET1 oxidative activity and functionally active GADD45A or NEIL2 enhancer proteins, are consistent with the idea that these platforms might facilitate bridging oxidative removal of 5mC to DNA repair pathways to efficiently restore unmethylated cytosine to targeted loci.

Evaluation of potential off-target activities of Casilio-ME. The CRISPR/dCas9 system inherently tolerates mismatches, to some extent, between guide RNAs and genomic loci to subsequently give rise to potential off-target effects37,38. To evaluate Casilio-ME platforms for potential off-target effects, we performed reduced representation bisulfite sequencing (RRBS) of genomic DNA extracted from cells transfected with Casilio-ME components, dCas9-TET1 or SunTag systems in the presence of either MLH1 or non-targeting sgRNAs. Pairwise comparisons between all samples, including untransfected cells, gave similar correlations. The correlations were within the same range as previously reported for RRBS replicates39, suggesting that Casilio-ME platform associated off-target activities, if any existed, do not exceed those of alternative 5mC editing systems (Supplementary Fig. 8a).

To evaluate further the specificity of the Casilio-ME platforms, we performed RNAseq to compare MLH1-sgRNAs and NT-sgRNA transfected cells. The overall pattern of gene expression of Casilio-ME2.2, SunTag and dCas9-TET1 systems seemed largely similar with high correlations of FPKM values among MLH1-sgRNA and NT-sgRNA transfected cells in each system (Supplementary Fig. 8b). In addition to MLH1, FSBP was called significantly upregulated in Casilio-ME2.2 by RNAseq analysis, thus representing a potential off-target effect. However, quantitation by TaqMan assays of FSBP levels in the RNA samples used for RNAseq showed no expression changes, indicating that FSBP activation observed in RNAseq is a false positive (Supplementary Fig. 9a). MLH1 was more prominently upregulated in Casilio-ME2.2 transfected cells. The other differentially expressed RNAseq hits of Casilio-ME2.2, SunTag and dCas9-TET1 systems could represent off-target effects or reflect potential transcriptome changes subsequent to MLH1 reactivation (Supplementary Fig. 8b).

Recruitment of BER associated proteins via Casilio-ME platforms might introduce mutations to targeted sites. To evaluate potential mutagenicity of Casilio-ME, we performed deep sequencing of the MLH1 locus, a 1 kb targeted region that comprises the promoter and part of the first exon, and compared sequence identity distribution among reads to untransfected cells. Casilio-ME transfected cells showed no significant difference in sequence identity within MLH1 reads, ruling out the possibility of Casilio-ME platforms introducing mutations to targeted sites (Supplementary Fig. 9b, c).

To ask whether expression of the PUFa-TET1(CD) fusion proteins, Casilio-ME components comprising GADD45A or NEIL2, could cause cellular alterations, we performed proliferation assays (MTT) on transfected cells. Cells passaged after transfection with Casilio-ME showed no measurable growth changes compared to controls or cells transfected with components of alternative DNA demethylation systems (Supplementary Fig. 10a).

To further evaluate potential alteration of transcriptomes that may occur subsequent to expression of GADD45A or NEIL2 as part of PUFa fusions, we compared RNA expression profiles in cells transfected with Casilio-ME1, Casilio-ME2.2 or Casilio-ME3.1. RNAseq analysis showed overall comparable RNA expression profiles in these cells with high correlations of FPKM values (Supplementary Fig. 10b). However, a few off-target genes were called significantly upregulated in the case of Casilio-ME2.2 (Supplementary Fig. 10b). None of the off-target hits called were genes associated with known GADD45A cellular functions. TaqMan assays on four upregulated off-target genes, HSPA1A, PKP3, SRPX2, and THBS2, showed an activated expression of these off-

NATURE COMMUNICATIONS | (2019) 10:4296 | https://doi.org/10.1038/s41467-019-12339-7 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12339-7

a C BER

BER

5mC 5hmC 5fC 5caC TET1 TET1 TET1

Casilio-ME1 Casilio-ME2.2 Casilio-ME3.1 dCas9-TET1 1.0

0.8

0.6 5mC 0.4

0.2 E DCB A 0 2 –9 17 86 –93 –84 –70 –53 –46 –36 –34 –28 109 133 141 174 189 209 220 224 240 243 251 267 –310 –294 –291 –266 –250 –225 –213 –185 –169 –162 –130 –124

0.4

5hmC 0.2

0 2 –9 17 86 –93 –84 –70 –53 –46 –36 –34 –28 109 133 141 174 189 209 220 224 240 243 251 267 –310 –294 –291 –266 –250 –225 –213 –185 –169 –162 –130 –124

1.0

0.8

0.6

0.4

5fC + 5caC C 0.2

0 2 –9 17 86 –93 –84 –70 –53 –46 –36 –34 –28 109 133 141 174 189 209 220 224 240 243 251 267 –310 –294 –291 –266 –250 –225 –213 –185 –169 –162 –130 –124 CpG coordinates relative to MLH1 TSS

bc*** *** *** *** 1.0 dCas9-TET1 1.0 NS dCas9-TET1 *** *** Casilio-ME1 Casilio-ME1 *** 0.8 Casilio-ME2.2 0.8 Casilio-ME2.2 Casilio-ME3.1 Casilio-ME3.1 0.6 0.6 *** NS *** *** *** *** *** *** *** *** *** 0.4 *** 0.4 NS *** *** ** *** NS 0.2 0.2 *** MLH1 promoter methylation promoter MLH1 ***

Methylation (MLH1 CpG 86-209) 86-209) CpG (MLH1 Methylation *** 0 0 *** 5mC 5hmC (5fC + 5caC + C) 5mC 5hmC (5fC + 5caC + C) target genes in Casilio-ME2.2 transfected cells (Supplementary be reduced by using dCas9 variants with higher fidelity as a Fig. 10c). Nonetheless, the activated expression of the four genes delivery vehicle40–43. required TET1 activity and fully functional GADD45A, indicat- ing that expression of GADD45A as part of PUFa-TET1 (CD) protein fusion is not sufficient for inducing the observed Portability of the Casilio-ME platforms. To show that Casilio- changes in expression of untargeted genes (Supplementary ME platforms enable efficient activation of other 5mC-silenced Fig. 10c). The CRISPR associated off-target activity could likely genes and in different cell types, we measured changes in


Fig. 4 Efficiency of 5mC demethylation induced by Casilio-ME platforms and dCas9-TET1 system. The MLH1 promoter was targeted by using components of the indicated methylation editing system in the presence of MLH1-sgRNAs. Cells were collected 3 days after transfection and the corresponding genomic DNA was subjected to high throughput amplicon BSeq and oxBSeq. a 5mC conversion to cytosine by TET1 and BER pathways is depicted above the panels. Frequencies (mean ± S.E.M.; n = 2) of 5mC (upper panel), 5hmC (middle panel) and bisulfite converted CpG (C, 5fC, and 5caC) (lower panel) plotted against CpG positions within the MLH1 promoter region are shown. The obtained levels of (5mC + 5hmC) and 5mC from two experiments were concordant. Arrows indicate locations of CpGs overlapping one of the six sgRNA target sequences. Statistical significance of differences in methylation patterns was tested: p < 0.0001, two-way ANOVA. b Box plot of frequencies of different CpG variants measured by BSeq and oxBSeq across the MLH1 promoter regions in cells transfected with the indicated 5mC demethylation system. For each box plot the thick line inside the box represents the median value and the surrounding bottom and top lines represent the 25th and 75th percentiles. The whiskers represent min and max values; the x represents the mean value. **p < 0.01, and ***p < 0.0001, two-way ANOVA. c Same as b but focusing on CpG 86–209 of the MLH1 promoter proximal-intron1 region. NS, not significant p > 0.05, and ***p < 0.01, two-way ANOVA
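As a reading aid for the three quantities plotted in Fig. 4, the following minimal Python sketch illustrates the usual BSeq/oxBSeq subtraction logic: BSeq leaves both 5mC and 5hmC unconverted, oxBSeq leaves only 5mC unconverted, so 5hmC is estimated as the difference and the remainder (C, 5fC, and 5caC) is the bisulfite converted fraction. The function name `cpg_fractions` and the clamping of negative differences to zero are illustrative assumptions; the study itself used BiQ Analyzer HiMod, and this sketch does not reproduce that software.

```python
def cpg_fractions(bs_unconverted: float, oxbs_unconverted: float) -> dict:
    """Split the signal at one CpG into 5mC, 5hmC and bisulfite converted fractions.

    bs_unconverted:   fraction of reads read as C after BSeq   (5mC + 5hmC)
    oxbs_unconverted: fraction of reads read as C after oxBSeq (5mC only)
    """
    for name, value in (("bs_unconverted", bs_unconverted),
                        ("oxbs_unconverted", oxbs_unconverted)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction in [0, 1]")

    five_mc = oxbs_unconverted
    # Sampling noise can make the oxBSeq value exceed the BSeq value;
    # clamping at zero is an assumption made for this sketch.
    five_hmc = max(bs_unconverted - oxbs_unconverted, 0.0)
    # Everything converted by bisulfite: unmodified C, 5fC and 5caC.
    converted = 1.0 - bs_unconverted
    return {"5mC": five_mc, "5hmC": five_hmc, "converted": converted}


if __name__ == "__main__":
    # Hypothetical CpG: 60% unconverted in BSeq, 40% unconverted in oxBSeq.
    print(cpg_fractions(0.6, 0.4))
```

With these hypothetical inputs the CpG would be scored as roughly 40% 5mC, 20% 5hmC and 40% bisulfite converted.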

expression of MGMT, SOX17, RHOXF2 (HEK293T), CDH1 (U2OS), and GSTP1 (LNCaP) in cells transfected with the components of Casilio-ME platforms. This showed that Casilio-ME platforms also enabled activation of these methylation-silenced genes. Casilio-ME2.2, Casilio-ME3.1, or ME3.2 produced significantly enhanced gene activations compared to Casilio-ME1 for the genes tested (Supplementary Fig. 11a–e). The improved gene activation is consistent with the increased demethylation efficiency obtained by Casilio-ME3.2 targeting GSTP1 CGI as compared to Casilio-ME1 (Supplementary Fig. 11g, h). Expression of PUFa protein fusions of Casilio-dME1, dME2.2, and dME3.1, containing catalytically inactive TET1(CD), in the absence of sgRNAs failed to activate MGMT, SOX17, and RHOXF2 (Supplementary Fig. 11f), indicating that expression of GADD45A or NEIL2 as part of PUFa-TET1(CD) is not sufficient for activating expression of the tested genes.

Interestingly, the superiority and the levels of enhancement in TET1-mediated gene activation achieved by Casilio-ME2.2 and Casilio-ME3.1 or ME3.2 varied for some gene targets, suggesting the existence of some locus dependency for GADD45A or NEIL2 to efficiently augment TET1-mediated gene activation. Nonetheless, activation of methylation-silenced genes obtained by Casilio-ME1 targeting different CGIs in different cell types was more effective compared to previously reported methods (Supplementary Fig. 11i).

Inducible Casilio-ME platform. To enable tunable and “on-demand” targeted DNA demethylation and gene activation, we constructed piggyBac (PB) transposon vectors hosting doxycycline-inducible PB Casilio-ME1 cassettes (DIP_Casilio-ME1) where the expression of dCas9 and PUFa-TET1(CD) is under the control of Tet-On promoters (Supplementary Fig. 12a). A DIP_Casilio-ME1 stable cell line is established by piggyBac transposition followed by antibiotic selection. When the DIP_Casilio-ME1 cell line was transiently transfected with targeting sgRNAs, we obtained robust MLH1 activation in the presence of doxycycline. MLH1 mRNA level also increased in response to increasing amounts of doxycycline (Supplementary Fig. 12b). Without doxycycline added, only background levels of MLH1 were detected and no detectable amounts of Casilio-ME1 protein components were observed in Western blot analysis of protein extracts from transfected cells (Supplementary Fig. 12b, c). This DIP_Casilio-ME1 will enable establishment of isogenic cell lines that can be used to study different target CGIs in a tunable manner by supplying different target-specific sgRNAs and adjusting doxycycline dosage.

Discussion

This study establishes a modular RNA-guided DNA methylation editing platform that not only recruits the TET1 effector to initiate DNA demethylation by 5mC oxidations, but also delivers protein factors to facilitate coupling 5mC oxidation to DNA repair pathways to effectively restore intact DNA to targeted sites. Such dual delivery enhanced 5mC demethylation at CGI target and augmented gene activations when compared to TET1(CD) delivered alone. In addition to the robustness of the platform, the modular design of Casilio-ME allows a high degree of tunability and flexibility in editing 5mC epigenetic marks.

Turnover of 5fC and 5caC by DNA repair machinery lags behind TET1-mediated 5mC oxidations as these intermediates accumulate before getting converted to unmethylated cytosine44. Coupling TET1 activity with BER or NER pathways could accelerate 5fC and 5caC turnover, thereby enhancing activation of methylation-silenced genes. Consistent with this idea, Casilio-ME2 and Casilio-ME3 platforms designed to facilitate coupling TET1 and DNA repair activities gave an enhanced gene activation and CpG demethylation of targeted sites. This enhanced gene activation requires TET1 catalytic activity, fully functional GADD45A or NEIL2 proteins and co-targeting relevant effectors in close proximity to genomic target sites.

Previous studies revealed interesting functional and physical interactions among proteins involved in oxidizing 5mC and removal of oxidized cytosine intermediates via BER or NER. NEIL2 promotes substrate turnover by TDG during DNA demethylation12. GADD45A physically interacts with TET1 or TDG and seems to promote TET1 activity, and enhances removal of 5fC and 5caC by TDG14,16,17. GADD45A also recruits repair enzymes such as the 3′-NER endonuclease XPG to genomic sites DNA45,46. As GADD45A is devoid of any enzymatic activity, it was proposed to function as a liaison protein to physically couple 5mC oxidation with DNA repair16. Consistent with these observations, Casilio-mediated co-targeting of TET1(CD) with GADD45A or NEIL2 within close proximity of their substrates enhanced 5mC demethylation and activation of methylation-silenced genes. However, it was reported that GADD45A and TDG failed to enhance demethylation of methylated plasmid by TET1(CD) in vitro47, and the addition of TDG to Casilio-ME modules failed to augment gene activation. TDG protein fusions tested here might not be functional or other factor(s) might be required for TDG to produce enhanced gene activation. The enhanced activation of methylation-silenced genes observed could alternatively be achieved by TDG-independent pathways, by perhaps recruiting yet to be found players that could act by enhancing TET1/BER activities in the presence of GADD45A or NEIL2, or inhibiting DNA methyltransferases leading to replication-dependent demethylation at the targeted sites. The mechanisms by which these potential partners enhance 5mC erasure are in need of further studies. The Casilio-ME2 and Casilio-ME3 have the potential to be used in such mechanistic studies as different combinations of protein assemblies that include mutants or other proteins could be tested. Future characterization of protein domains enabling an enhanced TET1-mediated gene activation and of protein interactions taking place

at TET1-targeted genomic sites could lead to further improvement of the Casilio-ME platforms and shed light on 5mC editing in mammalian cells.

Different levels of activation of methylation-silenced genes could be obtained by using one of the three flavors of Casilio-ME and by varying doxycycline concentrations with the DIP_Casilio-ME1 platform described here. This equips Casilio-ME platforms with the capability to fine-tune gene activation. These Casilio-ME platforms significantly expand 5mC editing capability to efficiently address the causal-effect relationships of methylcytosine epigenetic marks in numerous biological and pathological systems.

Methods

Cell culture and transfection. HEK293T and U2OS cells (both from ATCC) were cultivated in Dulbecco’s modified Eagle’s medium (DMEM) (Sigma) with 10% fetal bovine serum (FBS) (Lonza), 4% Glutamax (Gibco), 1% Sodium Pyruvate (Gibco), and penicillin-streptomycin (Gibco) in an incubator set to 37 °C and 5% CO2. LNCaP cells were obtained from ATCC and cultivated in RPMI-1640 supplemented with 10% FBS. Doxycycline (Dox) (Sigma) (1 µg/ml) or as otherwise indicated was added on the day of transfections with a daily change of media supplemented with Dox. Cells were seeded into 12-well plates at 150,000 cells per well the day before being transfected with plasmids each encoding dCas9 (100 ng), sgRNAs (100 ng) or PUF-fusion (200 ng) in the presence of Attractene or Lipofectamine 3000 transfection reagents according to manufacturers’ instructions (Qiagen, Thermo Fisher Scientific, respectively). The same plasmid ratio and total amount of DNA were used in cell transfections with components of dCas9/MS2 and SunTag systems. In the two-component systems, where TET1(CD) was fused to dCas9 or TALEs, 200 ng effector, 100 ng sgRNAs, and 100 ng empty vector of plasmid DNA were used. The combinations of MLH1 sgRNAs used were based on the obtained Casilio-ME1-mediated MLH1 activation efficiency in preliminary experiments. However, no sgRNAs optimizations were performed for the other methylation-silenced genes. Nucleofections of LNCaP cells were performed by using 4D-nucleofector according to manufacturer’s instructions (Lonza) using 400 ng plasmid DNA. Cells were harvested 3 days after transfection or as otherwise indicated with media changes at 24 h post-transfection, and cell pellets were used for extractions of RNA, genomic DNA and protein using AllPrep DNA/RNA/Protein Mini Kit according to the manufacturer’s instructions (Qiagen). Stable and Dox-inducible expression cell line was generated by transfecting three plasmids including the hyperactive transposase plasmid (hyPBase)48,49 and the indicated PiggyBac vectors hosting cassette enabling a Dox-inducible expression of dCas9 or PUFa-TET1(CD) effector. The transfected cells were then subjected to double selection in the presence of blasticidin and hygromycin.

Plasmid constructions. A list of plasmids with links to their Addgene entries are provided in Supplementary Table 1. Detailed descriptions and sequences of oligonucleotides and proteins are given in the Supplementary Tables 2-5. The plasmids pCAG-dCas9-5xPlat2AflD and pCAG-scFvGCN4sfGFPTET1CD (Addgene #82560 and 82561, respectively), pdCas9-Tet1-CD, and pcDNA3.1-MS2-Tet1-CD (Addgene #83340 and 83341, respectively) were gifts from Izuho Hatada and Ronggui Hu, respectively.

Quantitative RT-PCR analysis. Harvested cells were washed with Dulbecco’s phosphate-buffered saline (dPBS), centrifuged at 125 × g for 5 min and then the flash-frozen pellets were stored at −80 °C. Extracted RNA (500 ng–2 µg) were used as templates to make cDNA libraries using a High Capacity RNA-to-cDNA kit (Applied Biosystems). TaqMan gene expression assays were designed using GAPDH (Hs03929097, VIC) as an endogenous control and CDH1 (Hs01023895_m1, FAM), GSTP1 (Hs00943350_g1, FAM), HSPA1A (Hs00359163_s1, FAM), MGMT (Hs01037698_m1, FAM), MLH1 (Hs00179866_m1, FAM), PKP3 (Hs00170887_m1, FAM), RHOXF2 (Hs00261259_m1, FAM), SOX17 (Hs00751752_s1, FAM), SRPX2 (Hs00997580_m1, FAM), or THBS2 (Hs01568063_m1, FAM) as targets (Thermo Fisher Scientific). Quantitative PCR (qPCR) was performed in 10 µL reactions by using TaqMan Universal Master Mix II with UNG and 2 μL of diluted cDNA from each sample (Applied Biosystems). Gene expression levels were calculated by “delta delta Ct” and normalized to control samples using ViiA7 version 1.2.2 or QuantStudio version 1.3 software (Applied Biosystems by Life technologies).

Bisulfite and oxidative bisulfite sequencing. Bisulfite and oxidative bisulfite conversion experiments were performed by using the EpiTect Fast DNA Bisulfite Kit, True Methyl oxBS Module and genomic DNA according to manufacturers’ instructions (Qiagen and NuGen, respectively). For oxBSeq, oxidation reactions were carried out in parallel and all treated samples developed same orange color expected for successful oxidative reactions. The average bisulfite conversion rates of cytosines in oxidized and non-oxidized sample sets were 0.996 and 0.995, respectively. Bisulfite treated DNA served as templates to PCR-amplify three DNA fragments of 350–400 bp or a single fragment that cover MLH1 or GSTP1 promoter regions, respectively, using ZymoTaq PreMix according to manufacturer’s instructions (Zymo Research). The MLH1 PCR fragments were then cloned by SLIC into EcoRI-linearized pUC19 plasmid using T4 DNA polymerase50. Ten independent positive clones for each sample were then subjected to Sanger sequencing to determine methylation profiles based on bisulfite-mediated cytosine to thymine conversion frequency of individual CpGs. MLH1 amplicons obtained from bisulfite converted DNA templates and from unconverted DNA were subjected to high throughput sequencing (2 × 250 paired-end reads) conducted at Genewiz (South Plainfield, NJ, USA). Fifty to 120 thousand reads were obtained per sample. Sequence analysis of the plasmids extracted from MLH1 clones to determine methylation frequencies was performed by using BiQ Analyzer 3 with minimal bisulfite conversion rate and sequence identity set to 97 and 95%, respectively51. Reads from high throughput amplicon sequencing, on the other hand, were analyzed for 5mC and 5hmC by using BiQ Analyzer HiMod with minimal read quality score, alignment score, sequence identity and bisulfite conversion rate set to 30, 1000, 0.9, and 0.9, respectively52. Obtained levels of (5mC + 5hmC) and 5mC from two experiments were concordant across the CpGs. BiQ Analyzer HiMod was used without sequence identity filter to analyze sequence integrity of MLH1 amplicons derived from genomic DNA without bisulfite treatment.

Reduced representation bisulfite sequencing (RRBS). Library preparation for RRBS was performed according to manufacturer’s instructions (Diagenode). Briefly, 100 ng of genomic DNA for each sample was enzymatically digested, end-repaired and ligated with an adaptor. Samples with different adaptors were then pooled together and subjected to bisulfite treatment followed by purification steps. The pooled DNA was PCR-amplified and then cleaned up with Ampure XP beads. Libraries were quantified with real time qPCR and sequenced using Illumina NextSeq (1 × 75 single end reads). Forty to 60 million reads per sample were obtained. To compute the CpG methylation levels, RRBS reads were aligned to the hg38 human reference genome using Bismark (version 0.16.0)53. CpG sites with coverage lower than 10 or higher than 400 reads were filtered out and Pearson’s correlation of methylation profiles across indicated samples were computed by using methylKit (version 1.0.0)54.

RNA sequencing. RNA sequencing libraries were prepared for independent replicates of mRNA samples at Genewiz (South Plainfield, NJ, USA) by using NEBNext Ultra RNA Library Prep Kit for Illumina according to manufacturer’s instructions (New England Biolabs). Briefly, mRNA was enriched with Oligo d(T) beads, fragmented for 15 min at 94 °C and then reverse transcribed followed by second strand cDNA synthesis. cDNA fragments were end repaired and adenylated at 3′ ends, and universal adapters were ligated to cDNA fragments, followed by index addition and library enrichment by limited cycle PCR. The libraries were validated on the Agilent TapeStation (Agilent Technologies), and quantified by using Qubit 2.0 Fluorometer (Invitrogen) and quantitative PCR (KAPA Biosystems). The libraries were clustered on two lanes of a flowcell then loaded on the Illumina HiSeq instrument according to manufacturer’s instructions. The samples were sequenced using a 2 × 150 paired-end configuration and image analysis and base calling were conducted by the HiSeq Control Software. Generated raw sequence data (.bcl files) from Illumina HiSeq was converted into fastq files and de-multiplexed using Illumina’s bcl2fastq 2.17 software. Thirty to 40 million read pairs were obtained per sample. Reads were then quantitated by Salmon into transcript estimates55, then subjected to DESeq2 for differential gene expression analysis56. Genes with low read counts in all samples were filtered out to eliminate noise in the analysis. Differentially expressed genes were called with adjusted p-values less than 0.05. MA-plots and FPKM scatter plots were generated using DESeq2.

Western blot analysis. Protein cell extracts (30 µg) were separated by electrophoresis on 10% SDS-polyacrylamide gels and then transferred to nitrocellulose membranes at 100 V for 1 h using Bjerrum Schafer-Nielsen buffer with SDS. Blocked membrane in 5% Blotting-Grade Blocker (BioRad) in TBS-T (50 mM Tris pH 7.6, 200 mM NaCl, 0.1% Tween 20) were incubated overnight at 4 °C with the indicated antibodies, and then protein bands were detected using Horseradish peroxidase-conjugated secondary antibodies (Sigma) and Clarity Western ECL Substrate according to manufacturer’s instructions (BioRad). Monoclonal anti-Flag (dilution 1:1000, Sigma cat #F1804), monoclonal CRISPR/Cas9 (dilution 1:5000, Epigentek cat #A-9000-100) and monoclonal anti-β-actin (dilution 1:5000, Sigma cat #MAB 1501) antibodies were used according to manufacturers’ instructions. Blots were imaged using a G:Box (Syngene). Uncropped and unprocessed images of the blots are supplied in Source Data file.

Cell proliferation assay. Cells were split 48 h after transfection in serial dilutions as indicated, incubated for 48 h and then MTT assay was performed according to manufacturer’s instructions (Thermo Fisher Scientific), and absorbance at 540 nm was recorded using a SpectraMax M5 plate reader (Molecular devices).
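To make the “delta delta Ct” calculation named under Quantitative RT-PCR analysis concrete, a minimal Python sketch of the standard 2^(−ΔΔCt) formula is given here. It assumes a perfect doubling of product per cycle, and the function name `relative_expression` and the example Ct values are hypothetical; in the study the calculation was carried out by the ViiA7 and QuantStudio software, whose internals are not reproduced.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene by the 2^(-ddCt) method.

    Ct values of the target (e.g. MLH1) are first normalized to the
    endogenous control (e.g. GAPDH) within each condition, then the
    treated sample is compared with the control sample.
    Assumes 100% amplification efficiency (a factor of 2 per cycle).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt, treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt, control sample
    delta_delta = delta_sample - delta_control           # ddCt
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles earlier
    # in the treated sample while the reference gene is unchanged.
    print(relative_expression(24.0, 18.0, 27.0, 18.0))
```

In this hypothetical case a 3-cycle shift corresponds to an 8-fold increase relative to the control sample.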

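The RRBS comparison described in the Methods (removal of CpG sites covered by fewer than 10 or more than 400 reads, followed by Pearson’s correlation of methylation profiles) can be illustrated with a small, dependency-free Python sketch. The study used methylKit for this step; the data layout (a dictionary mapping a CpG key to coverage and methylation fraction) and the function names below are assumptions made for illustration only and do not mirror the methylKit API.

```python
import math


def filter_by_coverage(sites, min_cov=10, max_cov=400):
    """Keep CpG sites whose read coverage lies within [min_cov, max_cov].

    sites: dict mapping a CpG key (e.g. "chr3:36993000") to a tuple
           (coverage, methylation_fraction).
    """
    return {key: val for key, val in sites.items()
            if min_cov <= val[0] <= max_cov}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def methylation_correlation(sample_a, sample_b, min_cov=10, max_cov=400):
    """Correlate methylation at CpGs that pass the filter in both samples."""
    a = filter_by_coverage(sample_a, min_cov, max_cov)
    b = filter_by_coverage(sample_b, min_cov, max_cov)
    shared = sorted(a.keys() & b.keys())
    return pearson([a[k][1] for k in shared], [b[k][1] for k in shared])
```

Restricting the correlation to sites that pass the filter in both samples is one reasonable design choice for this sketch; whether methylKit applies the same intersection by default is not stated in the text.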

Statistical analyses. Information on replication, statistical tests and presentation are given in the respective figure legends. GraphPad Prism 8 and Microsoft Excel were used to perform the indicated tests. Differences in all comparisons were considered significant if the obtained p values were less than 0.05.

Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The authors declare that the data that support the findings of this study are included in the published article and in the Supplemental Information, and are available from corresponding author upon reasonable request. Data containing RNAseq and RRBS raw sequencing read files were deposited onto sequencing read archive (SRA) with accession number PRJNA515359. The source data underlying Figs. 1b, 1c, 1e, 2b, 2d, 3b, 3d, and 4a–c and Supplementary Figs. 2b, 7b–c and 12c are provided as a Source Data file. New plasmids have been deposited to the Addgene repository.

Received: 20 May 2019; Accepted: 4 September 2019;

References

1. Robertson, K. D. DNA methylation and human disease. Nat. Rev. Genet 6, 597–610 (2005).
2. Smith, Z. D. & Meissner, A. DNA methylation: roles in mammalian development. Nat. Rev. Genet 14, 204–220 (2013).
3. Jones, P. A. & Liang, G. Rethinking how DNA methylation patterns are maintained. Nat. Rev. Genet 10, 805–811 (2009).
4. Lyko, F. The DNA methyltransferase family: a versatile toolkit for epigenetic regulation. Nat. Rev. Genet 19, 81–92 (2018).
5. Wu, X. & Zhang, Y. TET-mediated active DNA demethylation: mechanism, function and beyond. Nat. Rev. Genet 18, 517–534 (2017).
6. Kohli, R. M. & Zhang, Y. TET enzymes, TDG and the dynamics of DNA demethylation. Nature 502, 472–479 (2013).
7. Maiti, A. & Drohat, A. C. Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites. J. Biol. Chem. 286, 35334–35338 (2011).
8. Hashimoto, H., Hong, S., Bhagwat, A. S., Zhang, X. & Cheng, X. Excision of 5-hydroxymethyluracil and 5-carboxylcytosine by the thymine DNA glycosylase domain: its structural basis and implications for active DNA demethylation. Nucleic Acids Res 40, 10203–10214 (2012).
9. Zhang, L. et al. Thymine DNA glycosylase specifically recognizes 5-carboxylcytosine-modified DNA. Nat. Chem. Biol. 8, 328–330 (2012).
10. He, Y. F. et al. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333, 1303–1307 (2011).
11. Muller, U., Bauer, C., Siegl, M., Rottach, A. & Leonhardt, H. TET-mediated oxidation of methylcytosine causes TDG or NEIL glycosylase dependent gene reactivation. Nucleic Acids Res 42, 8592–8604 (2014).
12. Schomacher, L. et al. Neil DNA glycosylases promote substrate turnover by Tdg during DNA demethylation. Nat. Struct. Mol. Biol. 23, 116–124 (2016).
13. Niehrs, C. & Schafer, A. Active DNA demethylation by Gadd45 and DNA repair. Trends Cell Biol. 22, 220–227 (2012).
14. Barreto, G. et al. Gadd45a promotes epigenetic gene activation by repair-mediated DNA demethylation. Nature 445, 671–675 (2007).
15. Schuermann, D., Weber, A. R. & Schar, P. Active DNA demethylation by DNA repair: facts and uncertainties. DNA Repair (Amst.) 44, 92–102 (2016).
16. Kienhofer, S. et al. GADD45a physically and functionally interacts with TET1. Differentiation 90, 59–68 (2015).
17. Li, Z. et al. Gadd45a promotes DNA demethylation through TDG. Nucleic Acids Res 43, 3986–3997 (2015).
18. van Tol, N. & van der Zaal, B. J. Artificial transcription factor-mediated regulation of gene expression. Plant Sci. 225, 58–67 (2014).
19. Pulecio, J., Verma, N., Mejia-Ramirez, E., Huangfu, D. & Raya, A. CRISPR/Cas9-Based Engineering of the Epigenome. Cell Stem Cell 21, 431–447 (2017).
20. Gaj, T., Gersbach, C. A. & Barbas, C. F. 3rd ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends Biotechnol. 31, 397–405 (2013).
21. Cheng, A. W. et al. Multiplexed activation of endogenous genes by CRISPR-on, an RNA-guided transcriptional activator system. Cell Res 23, 1163–1171 (2013).
22. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).
23. Maeder, M. L. et al. Targeted DNA demethylation and activation of endogenous genes using programmable TALE-TET1 fusion proteins. Nat. Biotechnol. 31, 1137–1142 (2013).
24. Morita, S. Targeted DNA demethylation in vivo using dCas9-peptide repeat and scFv-TET1 catalytic domain fusions. Nat. Biotechnol. 34, 1060–1065 (2016).
25. Liu, X. S. et al. Editing DNA Methylation in the Mammalian Genome. Cell 167, 233–247 e217 (2016).
26. Xu, X. et al. A CRISPR-based approach for targeted DNA demethylation. Cell Disco. 2, 16009 (2016).
27. Huang, Y. H. et al. DNA epigenome editing using CRISPR-Cas SunTag-directed DNMT3A. Genome Biol. 18, 176 (2017).
28. Nunna, S., Reinhardt, R., Ragozin, S. & Jeltsch, A. Targeted methylation of the epithelial cell adhesion molecule (EpCAM) promoter to silence its expression in ovarian cancer cells. PLoS ONE 9, e87703 (2014).
29. Chen, H. et al. Induced DNA demethylation by targeting Ten-Eleven Translocation 2 to the human ICAM-1 promoter. Nucleic Acids Res 42, 1563–1574 (2014).
30. Cheng, A. W. et al. Casilio: a versatile CRISPR-Cas9-Pumilio hybrid for gene regulation and genomic labeling. Cell Res 26, 254–257 (2016).
31. Abil, Z., Denard, C. A. & Zhao, H. Modular assembly of designer PUF proteins for specific post-transcriptional regulation of endogenous RNA. J. Biol. Eng. 8, 7 (2014).
32. Herman, J. G. et al. Incidence and functional consequences of hMLH1 promoter hypermethylation in colorectal carcinoma. Proc. Natl Acad. Sci. USA 95, 6870–6875 (1998).
33. Deng, G., Chen, A., Hong, J., Chae, H. S. & Kim, Y. S. Methylation of CpG in a small region of the hMLH1 promoter invariably correlates with the absence of gene expression. Cancer Res 59, 2029–2033 (1999).
34. Chen, K. et al. Gadd45a is a heterochromatin relaxer that enhances iPS cell generation. EMBO Rep. 17, 1641–1656 (2016).
35. Sanchez, R. et al. Solution structure of human growth arrest and DNA damage 45alpha (Gadd45alpha) and its interactions with proliferating cell nuclear antigen (PCNA) and Aurora A kinase. J. Biol. Chem. 285, 22196–22201 (2010).
36. Das, A. et al. Identification of a zinc finger domain in the human NEIL2 (Nei-like-2) protein. J. Biol. Chem. 279, 47132–47138 (2004).
37. Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat. Biotechnol. 31, 822–826 (2013).
38. Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013).
39. Harris, R. A. et al. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat. Biotechnol. 28, 1097–1105 (2010).
40. Slaymaker, I. M. et al. Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84–88 (2016).
41. Hu, J. H. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018).
42. Kleinstiver, B. P. et al. High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects. Nature 529, 490–495 (2016).
43. Chatterjee, P., Jakimo, N. & Jacobson, J. M. Minimal PAM specificity of a highly similar SpCas9 ortholog. Sci. Adv. 4, eaau0766 (2018).
44. Shen, L. et al. Genome-wide analysis reveals TET- and TDG-dependent 5-methylcytosine oxidation dynamics. Cell 153, 692–706 (2013).
45. Le May, N. et al. NER factors are recruited to active promoters and facilitate chromatin modification for transcription in the absence of exogenous genotoxic attack. Mol. Cell 38, 54–66 (2010).
46. Schmitz, K. M. et al. TAF12 recruits Gadd45a and the nucleotide excision repair complex to the promoter of rRNA genes leading to active DNA demethylation. Mol. Cell 33, 344–353 (2009).
47. Weber, A. R. et al. Biochemical reconstitution of TET1-TDG-BER-dependent active DNA demethylation reveals a highly coordinated mechanism. Nat. Commun. 7, 10806 (2016).
48. Yusa, K., Zhou, L., Li, M. A., Bradley, A. & Craig, N. L. A hyperactive piggyBac transposase for mammalian applications. Proc. Natl Acad. Sci. USA 108, 1531–1536 (2011).
49. Wang, T., Wei, J. J., Sabatini, D. M. & Lander, E. S. Genetic screens in human cells using the CRISPR-Cas9 system. Science 343, 80–84 (2014).
50. Jeong, J. Y. et al. One-step sequence- and ligation-independent cloning as a rapid and versatile cloning method for functional genomics studies. Appl. Environ. Microbiol 78, 5440–5443 (2012).
51. Bock, C. et al. BiQ Analyzer: visualization and quality control for DNA methylation data from bisulfite sequencing. Bioinformatics 21, 4067–4068 (2005).
52. Becker, D. et al. BiQ Analyzer HiMod: an interactive software tool for high-throughput locus-specific analysis of 5-methylcytosine and its oxidized derivatives. Nucleic Acids Res 42, W501–W507 (2014).


53. Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
54. Akalin, A. et al. methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol. 13, R87 (2012).
55. Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
56. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
57. Tahiliani, M. et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324, 930–935 (2009).

Additional information
Supplementary information is available for this paper at https://doi.org/10.1038/s41467-019-12339-7.

Correspondence and requests for materials should be addressed to A.W.C.

Peer review information Nature Communications thanks Rahul Kohli and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Reprints and permission information is available at http://www.nature.com/reprints

Acknowledgements
This work has been supported by the Jackson Laboratory internal grants (to A.C.), National Research Institute 1R01HG009900 (to A.C.) and National Cancer Institute P30CA034196 (to A.C.), Leukemia Research Foundation New Investigator Grant (to S.L.), The Jackson Laboratory Director’s Innovation fund 19000-17-31 (to S.L.), The Jackson Laboratory Cancer Center New Investigator Award (to S.L.), and National Cancer Institute of the National Institutes of Health P30CA034196 (to S.L.), and the National Institutes of Health grant CA115783 (to C.D.H.).

Author contributions
A.T., C.H. and A.C. conceived and designed the study. A.T., N.J., M.D., A.R. performed the experiments. W.R. and S.L. analyzed RRBS data. A.T., C.H., and A.C. analyzed data and wrote the manuscripts.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Competing interests
The authors declare no competing interests.

© The Author(s) 2019
