UNIVERSITY OF CALIFORNIA

Los Angeles

New Roles of DNA Methylation in the Epigenome and

Developmental Overgrowth Syndrome

A dissertation submitted in partial satisfaction of the

requirements for the degree Doctor of Philosophy

in Molecular Biology

by

Andrew King

2017

© Copyright by

Andrew King

2017 ABSTRACT OF THE DISSERTATION

New Roles of DNA Methylation in the Epigenome and

Developmental Overgrowth Syndrome

by

Andrew King

Doctor of Philosophy in Molecular Biology

University of California, Los Angeles, 2017

Professor Michael F. Carey, Chair

DNA methylation is an epigenetic mark classically described to repress gene expression in a long and stable manner. DNA methylation plays important roles throughout development as well as throughout life where it maintains many tissue-specific genes in the silent state. DNA methylation is also altered in many diseases. Despite recent interest and innovation in measuring genome-wide DNA methylation, its function in the genome and its role in disease is not well understood. In this body of work, I aim to understand DNA methylation’s role in the cell through perturbation of DNA methylation in two separate contexts: normal mouse embryonic stem cells

(mESCs) and a mESC-derived neural (NSC) disease model.

In the first part of my dissertation (Chapter 2), global perturbation of DNA methylation in mESCs are used to address DNA methylation’s relationship with other members of the epigenome, namely modifications. Using a series of mESCs with DNA

ii methyltransferases knocked-out and subsequently rescued, I dissect the causal effects of DNA

methylation. By measuring how histone modifications and gene expression change with respect

to global DNA methylation, I address whether DNA methylation is capable of regulating histone

modifications and gene expression in mESCs. I find that genome-wide DNA demethylation

alters occupancy of histone modifications in multiple genomic contexts. Most interestingly,

global remethylation of the genome reverses changes in histone modification occupancy,

indicating causal regulation by DNA methylation.

In the second part of my dissertation (Chapter 3), highly specific perturbations in DNA

methylation are made in mESCs and mESC-derived NSCs to understand what role DNA

methylation may play in disease pathogenesis. Mutations recently described in the gene

DNMT3A cause an overgrowth syndrome associated with intellectual disability named Tatton-

Brown Rahman Syndrome (TBRS). Through genome editing, I establish a cellular model for

TBRS where the endogenous Dnmt3a gene contains knock-in of individual missense mutations

homologous to those found in TBRS patients. Differentiation of TBRS mESCs into NSCs along

with measurements of global DNA methylation, gene expression, and histone modifications

reveal downstream consequences of several of these mutations. I find that TBRS-associated

DNMT3A mutations largely result in loss of methylation at specific DNMT3A targets. Of

particular interest are a group of imprinted genes with known roles in growth and intellectual

function. Furthermore, specific DNA methylation changes are also found to associate with

changes in histone modification in NSCs.

iii The dissertation of Andrew King is approved.

Guoping Fan

Jason Ernst

Harley Kornblum

William Lowry

Michael F. Carey, Committee Chair

University of California, Los Angeles

2017

iv

I dedicate this dissertation to my family, without whom I could not have achieved.

v Table of Contents

List of Figures ...... viii

Acknowledgements ...... x

VITA ...... xii

Chapter 1 : Introduction to DNA Methylation ...... 1

Historical Context ...... 3

Epigenetic Landscapes ...... 6

DNA Methylation Mechanisms – Lessons from Biochemistry ...... 13

Biological Functions ...... 18

DNA Methylation in Disease ...... 21

Relation to this Dissertation ...... 26

References ...... 28

Chapter 2 : Reversible Regulation of and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells ...... 40

Introduction ...... 41

Dnmt Reconstitution in Demethylated mESCs Restores Global Cytosine Methylation and Causes Various Changes in Histone Modifications ...... 43

DNA Methylation is Required for Maintenance and Reestablishment of Promoter H3K27me3 ...... 45

Impact of DNA Methylation on Promoter H3K27me3 and Gene Expression Differs Between Bivalent and Silent Promoters ...... 47

DNA Methylation Maintains and Reestablishes Silent or Primed States at Enhancer Elements ...... 48

De Novo Enhancers Are Tissue-Specific and Contain Methylation-Sensitive Factors ...... 51

Regulation of H3K27me3 Depends on 5-Methylcytosine and DNMT Catalytic Activity52

vi DNA Methylation Regulation of H3K27me3 is Mediated by PRC2 Targeting ...... 53

DNMT3 Isoform Differences in Histone Modification Regulation ...... 54

Discussion ...... 56

Experimental Procedures ...... 60

Figures...... 68

References ...... 83

Chapter 3 : Molecular and genomic defects of mutations in DNMT3A-associated Overgrowth Syndrome with Intellectual Disability ...... 90

Introduction ...... 91

Establishing a Cellular Model to Study Overgrowth-Associated DNMT3A Mutations .. 92

DNMT3A Mutants Differ in Functionality in mESC and mNSCs ...... 93

DNA Methylation Alterations Occur at Sites of de novo Methylation in NSCs ...... 95

Epigenetic Signatures of OG-DMRs and Antagonism of H3K27me3 ...... 97

Altered Gene Expression in Overgrowth Mutant NSCs ...... 99

Structural Interpretation of DNMT3A Mutations ...... 101

Discussion ...... 103

Experimental Procedures ...... 106

Figures...... 111

References ...... 123

Chapter 4: Discussion and Concluding Remarks...... 126

First principles of epigenetics ...... 127

Understanding epigenetic alterations in disease context ...... 130

References ...... 133

vii List of Figures

Figure 2.1. Alteration of DNA Methylation Causes Selective Genome-wide Changes in Histone

Modifications ...... 68

Figure 2.2. DNA Methylation Causally Regulates the Maintenance and Establishment of

Promoter H3K27me3 ...... 69

Figure 2.3. DNA Methylation Regulates Histone Modification Status and Gene Activity at

Enhancers ...... 72

Figure 2.4. DNA Methylation-Dependent Loci Are Tissue-Specific Promoters and Enhancers . 73

Figure 2.5. Regulation of H3K27me3 Depends on DNMT Catalytic Activity ...... 74

Figure 2.6. DNA Methylation Directly Regulates Loci Gaining H3K27me3 via PRC2 Occupancy

...... 75

Figure 2.7. DNA Methylation Regulation of Histone Modifications Across Promoter and

Enhancer Contexts ...... 76

Figure 2.8. Overview of RRBS and ChIP-seq Data ...... 77

Figure 2.9. Histone Modification at Promoters Change with Global Methylation ...... 78

Figure 2.10. Comparison of and H3K27me3 loss and rescue in DKO and TKO mESCs

...... 79

Figure 2.11. Enriched de novo Transcription Factor Motifs in DNA Methylation-Sensitive

Enhancers ...... 80

Figure 2.12. Isoform-specific effects at promoters and enhancers ...... 82

Figure 3.1. DNMT3A mutants differ in functionality in genome methylation ...... 112

Figure 3.2. DNA methylation changes occur at sites of de novo methylation in NSCs ...... 113

viii Figure 3.3. Epigenetic signatures of OG-DMRs and antagonism with H3K27me3 ...... 116

Figure 3.4. Gene expression changes in mutant NSCs ...... 117

Figure 3.5. Structural analysis of Dnmt3a mutants ...... 118

Figure 3.6. Cellular characterization of cell lines ...... 120

Figure 3.7. WGBS DNA methylation metrics ...... 121

Figure 3.8. DNA Methylation and H3K27me3 of Nr2f2 in developing mouse embryos ...... 122

ix Acknowledgements

I would like to acknowledge my advisor Guoping Fan for providing me not only the

opportunity to work in his laboratory, but also a broad range of activities including grant writing,

manuscript review, lab management, and collaboration. I am thankful for the projects that I find

exceptionally fascinating and the diversity of work that is performed in the lab.

I am thankful for the comments and direction provided by my committee members:

Professors Michael Carey, Jason Ernst, Harley Kornblum, and Bill Lowry. I would also like to

acknowledge members of the Fan laboratory. I am especially grateful to Dr. Kevin Huang, Dr.

Kumi Sakurai, and Dr. Ganlu Hu for helping me get started and for making the lab a warm and

welcoming place. Also included are the talented undergraduates, Stella Joh, Kai Chang, Harrison

Wu, and Katherine Sheu, whom I’ve had the pleasure of working with.

Members of the Broad Stem Cell Research Center Sequencing Core including Suhua

Feng and Shawn Cokus have been invaluable in generating sequencing data. Also, members of

the Center of Excellence for Stem Cell Genomics (CESCG) at the Salk Institute for Biological

Studies, including Joe Nery, Cesar Barragan, Manoj Hariharan, whom I relied on heavily.

Lastly, I will never forget the mentorship from Professors Sylvie Garneau-Tsodikova and

Ivan Maillard who have been my role models as scientists from my first steps into biology. I am

also grateful to Professor Gopalan Srinivasan who generously provided my first research

experience. It was in the environment cultitvated by these mentors that nurtured by passion and

excitement for research.

Chapter 2 of this dissertation is adapted from a publication in Cell Reports: King, A.D.,

Huang, K., Rubbi, L., Liu, S., Wang, C.Y., Wang, Y., Pellegrini, M., Fan, G. (2016). Reversible

Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse

x Embryonic Stem Cells. Cell Reports 17, 289-302 doi: 10.1016/j.celrep.2016.08.083. It was originally published as an open access article under the Creative Commons CC BY license. This work was co-authored with Kevin Huang, Liudmilla Rubbi, and Shuo Liu. Kevin Huang helped conceive the original project and provided guidance and discussion throughout. Liudmilla Rubbi prepared RRBS libraries used in the analysis of DNA methylation. Shuo Liu performed mass spectrometry measurements for analysis of global DNA methylation.

Chapter 3 is unpublished work performed by Andrew King. I would like to acknowledge

Yang Yu for helping perform mass spectrometry measurements of global DNA methylation and

Katherine Sheu helped generate several cell lines.

This work would not have been possible without financial support from the National

Institute of Dental and Craniofacial Research (R21DE022928 and R01DE025474) and from the

California Institute for Regenerative Medicine (CIRM) Center for Excellence for Stem Cell

Genomics (GC1R-06673-A). I am grateful for training support from the Eli and Edythe Broad

Stem Cell Research Center (BSCRC), Rose Hills Foundation, and UCLA Molecular Biology

Institute Philip Whitcome fellowship.

xi VITA

2010 B.S.E. in Chemical Engineering, minor in Biochemistry

magna cum laude

University of Michigan

PUBLICATIONS

King, A.D., Huang, K., Rubbi, L., Liu, S., Wang, C.Y., Wang, Y., Pellegrini, M., Fan, G. (2016). Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Reports 17, 289-302. doi:10.1016/j.celrep.2016.08.083

Chen, Y.A., King, A.D., Wu, C.Y., Liao, W.H., Peng, C.C., Tung, Y.C. (2011). Generation of oxygen gradients in microfluidic devices for cell culture using spatially confined chemical reactions. Lab Chip 11, 3626-3633. doi: 10.1039/c1lc20325h

Sheoran, A., King, A., Velasco, A., Pero, J.M., Garneau-Tsodikova, S. (2008). Characterization of TioF, a tryptophan 2,3-dioxygenase involved in 3-hydroxyquinaldic acid formation during thiocoraline biosynthesis. Molecular Biosystems 4(6), 622-628. doi: 10.1039/b801391h

xii

Chapter 1 :

Introduction to DNA Methylation

1 One of the most fascinating questions in biology pertains to how multicellular organisms generate such an enormous diversity of functional cell types and how these numerous cell types remember their developmental history. Anyone who has tried to organize a large group of people has experienced the challenge of coordination. The challenge faced by cells in an animal is enormous from this perspective, where cells in a developing embryo are one among billions and whose time-frames operate at a scale orders of magnitude smaller than that of the whole organism. Yet this feat of coordination is accomplished by every developing embryo.

One of the key processes allowing this to occur is epigenetic regulation. Epigenetics is currently defined as the heritable transfer of information across both cells and organisms without changing the underlying DNA sequence. This system forms the basis of cellular memory and confers the ability for cells to be both rigid and plastic in their behaviors. It is these properties that contribute to ideas of fate determination, differentiation, restriction, and potential that are key to understanding how cells normally navigate development and what the consequences are in processes of disease.

In this introduction, I will describe the rich history in the best characterized epigenetic modification – DNA methylation. How our current understanding of DNA methylation reflects a post-revisionist view of DNA methylation where the currents of evidence had led scientists to overestimate the functionality of DNA methylation in the 1980s. This was soon followed by a period of revision, where the function of DNA methylation was moderated. In the post-genomic era, detailed study of DNA methylation has hinted at vital roles that are still not fully understood.

This dissertation aims to contribute to furthering our understanding about DNA methylation’s role in gene regulation, development, and disease.

2 Historical Context

Foundations: 5-methylcytosine, the prototypical epigenetic regulator

Observations of phenomena, such as the ability for imaginal discs in Drosophila

melanogaster to retain their determined state after more than 70 transplants, equivalent to one thousand replication cycles, motivated questions relating to stability of cellular phenotypes and the mechanisms behind these properties (Gehring, 1976). DNA methylation was first described

in animals in the mid-twentieth century with measurement of 5-methylcytosine from several sources (Hotchkiss, 1948; Wyatt, 1951). Different tissues showed tissue-specific methylation patterns and changes across different development stages (Kappler, 1971; Monk et al., 1987;

Vanyushin et al., 1970).

DNA methyltransferases were first postulated by Werner Arber to serve as the

“modification” enzymes in the bacterial restriction and modification (R-M) system that yielded the discovery of restriction endonucleases (Arber, 1965). In the bacterial R-M system, methylation of host bacterial DNA protected it from host-expressed methylation-sensitive restriction enzymes. However, bacteriophage DNA, which would be unmethylated, was subject to digestion. The first bacterial DNA methylase was shortly isolated from H. influenzae in

Nobel-winning work by Hamilton Smith and colleagues (Roy and Smith, 1973).

Two landmark articles speculating on how epigenetic mechanisms may function are

widely cited as a starting point to the formalization of epigenetics and how DNA methylation

may operate as an epigenetic mark (Holliday and Pugh, 1975; Riggs, 1975). Central to the

hypothesis is the ability for ordered control of transcription to occur through methylation of

DNA without change to the sequence. This hypothesis, which has since been demonstrated

experimentally is the basis of how we define epigenetics today. Taking note from bacterial DNA

3 methylases, the model proposed, there would need to exist two enzymes – one that initially

writes the DNA methylation mark and a second that takes the hemimethylated DNA as substrate and copies the information to the palindromic site while not having activity on nonmethylated

DNA, a maintenance enzyme. Methylation patterns could then be written and propagated via this two-part system. This model provided a mechanism for inheritance, not relying on DNA base changes, well before evidence of existence of these enzymes existed.

Classic Roles for DNA Methylation in Gene Regulation – Overview of the 1980s

The classic role of DNA methylation stemmed from early observations on the variability

of DNA methylation at several different genes across different tissues and the correlation of

actively expressed genes with hypomethylation and silenced genes with hypermethylation. The

1980s were punctuated by mapping of methylation patterns using newly described methylation

sensitive restriction enzymes discovered by Adrian Bird (Bird and Southern, 1978) and

functional tests on how DNA methylation regulated gene activity, which is reviewed extensively

during the period (Bird, 1984; Doerfler, 1983; Razin and Cedar, 1991; Razin and Riggs, 1980).

Additionally, on the mechanistic level, methylation near that 5’ end of genes, which was being

actively studied for transcriptional initiation, implicated a role for DNA methylation in the

blocking of transcription. From the experiments in this decade, several paradigms for how DNA

methylation is understood were established.

Paradigms of DNA methylation regulation

The early study of DNA methylation is mostly characterized by measurement of

methylation as a modification of DNA and correlations in its level with activity of genes in

4 different tissues. Many correlations existed where DNA methylation was absent when genes were on and present when genes were silenced, leading to the growing notion that DNA methylation regulated gene expression. Here I describe a number of key experiments that established DNA methylation as a driver of regulation, moving beyond the correlations previously observed.

Some of the earliest in vivo evidence was shown by Rudolf Jaenisch’s lab, where they observed de novo methylation and subsequent silencing of transduced viruses at the stage of pre- implantation embryos, but not in post-implantation embryos (Jähner et al., 1982). This strongly supported the causal nature of DNA methylation in repression of genes. Separately, gene transfer experiments where unmethylated versus in vitro methylated DNA, transduced into cells, showed different activity at the well-studied globin gene (Busslinger et al., 1983).

Following the discovery of the demethylation activity of 5-azacytidine by Peter Jones and colleagues and its ability to alter differentiation states (Jones and Taylor, 1980), several 5- azacytidine demethylation studies showed cell state changes, including those involved in the discovery of master regulator transcription factor MyoD, argued strongly for DNA methylation as a key regulator of differentiation state (Davis et al., 1987). Other experiments also proved that

DNA methylation can not only silence genes, but are also stably maintained for generations of cells (Wigler et al., 1981).

The most dramatic example of DNA methylation repression is where DNA methylation partakes in silencing the entire X chromosome during X-chromosome inactivation in female cells

(Monk, 1986). It is evident that silenced X chromosomes are stably repressed for the life of the organism. The notion that DNA methylation actively silenced the inactive X chromosome was

5 supported through experiments showing 5-azacytidine reactivating genes located on the inactive

X chromosome (Mohandas et al., 1981)

The foundation of how DNA methylation and to some extent epigenetics as a whole

operates, was established in the 1980s with the seminal studies demonstrating functions of DNA

methylation. These studies strongly provided evidence for the idea that DNA methylation silences genes. Extrapolation from a limited number of studies set forth an idea that DNA methylation may be a key regulator throughout development. It is these very experiments that

were the basis of my undergraduate and medical instruction on DNA methylation. However, the

extent to which these foundational experiments prove generalized causality is limited. In part,

this is due to the limited contexts studied, both genomic contexts as well as in different tissues of

the body. Future studies of genomic contexts would be better understood in the post-genomic

era. However we are still understanding the different biological contexts to this day.

Epigenetic Landscapes

DNA methylation occurs in CpG dinucleotide context, which is simply consecutive

cytosine and guanosine nucleotides. This context is unique in that it is rotationally symmetric

and exists as a palindrome, where a methylated cytosine on one strand (5’-CG-3’) is matched on

the complementary strand (3’-GC-5’). Mammalian genomes are depleted of CpG dinucleotides

as a result of spontaneous deamination of methyl-cytosine to thymine (Bird, 1980; Coulondre et

al., 1978; Josse et al., 1961). Due to this phenomenon, only 0.9% of the human genome consists

of CpG dinucleotide, compared to 6.25% as would be expected by random chance or 4% given a

G+C fraction of approximately 40%. Despite their rarity, 60-90% of CpGs are found methylated

throughout the vertebrate genomes.

6

Rudimentary maps in the pre-genomic era

CpG islands are one of the first genomic features relating to DNA methylation that were mapped and provided an indication of DNA methylation’s function in the genome. The landmark study by Adrian Bird and colleagues identified CpG islands using methylation sensitive restriction enzymes HpaII and HhaI in the mouse genome (Bird et al., 1985). Bird et al. observed that most vertebrate genomes were methylated and poorly cleaved by methylation-sensitive restriction enzymes. However, a consistent small fraction accounting for <1% of DNA contained the majority of HpaII cut sites across tissues and were shown to map to islands of unmethylated

DNA near genes.

Unlike the rest of the genome, CpG islands displayed CpG content close to the expected value based on random chance. These islands were found to be GC-rich, have high CpG content, and remained unmethylated during evolution. CpG islands were more formally defined after measurement of CpG islands near genes in several different vertebrate genomes. The classic definition is a region greater than 200 base pairs in length with >50% G+C content and >0.6 observed/expected ratio of CpG (Gardiner-Garden and Frommer, 1987). In their original description, Bird et al. estimated around 30,000 CpG islands in the haploid genome. Sequencing of the human and mouse genomes, eventually showed these estimates to be very close, with approximately 27,000 CpG islands in human and 15,500 in mouse (Mouse Genome Sequencing

Consortium et al., 2002)

Soon after their discovery, CpG islands were found to be associated with housekeeping genes and lacking at tissue-specific genes (Bird, 1986). In a review, Bird writes,

7

Given that there are usually rather few CpGs near tissue specific genes … one would not

expect to find CpG or methylation built into the activation mechanism of genes of this type.

This conclusion is heretical given recent interest in DNA methylation as a mechanism of

controlling tissue specific genes (Bird, 1986)

In observing the first glimpse of methylation’s landscape through analysis of CpG islands, the relationship between DNA methylation, genomic context, and biological function were starting to be clarified. Analysis of CpG content in the fashion of identifying CpG islands has resulted in classification of promoters into low and high CpG content promoters with distinguishing biological functions and frequency of methylation (Saxonov et al., 2006). While

CpG islands have been shown to be constitutively unmethylated, recent studies have shown that the adjacent 2-kb, termed CpG island shores, show more variability in methylation between tissues as well as between normal and cancer (Irizarry et al., 2009)

Epigenetic landscapes in the genomic era

Genome-wide DNA methylation maps are important both for understanding generalities of DNA methylation in the genome as well as finding where differences occur between cells.

The techniques to measure DNA methylation have improved dramatically, particularly over the last two decades. The field first driven by innovations in array technology followed by sequencing technology has moved from a reliance on methylation sensitive restriction enzymes, to methylation-specific antibodies, and finally to the current gold-standard of whole-genome bisulfite sequencing (WGBS). With the improvement in the ability to create high-resolution and

8 genome-wide maps, the understanding in DNA methylation’s role in biology has become more

apparent.

The first genome-wide maps utilized immunoprecipitation of methylation-specific

antibodies and were in the flowering plant Arabidopsis, due to its relatively small genome (135

Mbp) and well characterized DNA methylation system with several genetic knockouts of the

methylation machinery available. Unexpectedly, these maps showed heavy methylation genome-

wide, particularly in gene bodies, which correlated with expression level (Zhang et al., 2006;

Zilberman et al., 2007). Additionally, promoters that were methylated tended to exhibit tissue-

specific gene expression. These patterns of global DNA methylation were shortly confirmed

using the higher resolution whole-genome bisulfite sequencing (WGBS), which involves

bisulfite conversion of fragmented genomic DNA followed by next-generation sequencing

(Cokus et al., 2008; Lister et al., 2008). Previously, methylation of CpG islands on the inactive X

(Xi) chromosome suggested that DNA methylation would be higher across Xi than the active X

chromosome (Xa). Surprisingly, early genome-wide experiments in human cells showed that the

Xa, in fact, had twice as much methylation than Xi, where extra methylation was concentrated within gene bodies (Hellman and Chess, 2007; Weber et al., 2005). Gene body methylation has since been studied for its mechanism of establishment and functional role (Baubec et al., 2015;

Neri et al., 2017).

The study of methylation maps in mammalian genomes paralleled studies in Arabidopsis.

However, at around 3 billion base-pairs, human and mouse genomes faced more challenges in scaling up. Several early studies tested new technologies to measure DNA methylation genome- wide. These approaches were often constrained to limited regions of the genome, most often due to reliance on methylation-sensitive restriction sites or focused DNA probe sets. From these

9 studies, it was shown that methylation could distinguish between different cell types based on the

methylation state of CpGs across the genome (Bibikova et al., 2006). Additionally, by sampling

many additional sites, CpG islands were also found to be methylated in a tissue-specific manner

in normal tissues. For example, using methylation array CGH showed, Ching et al. found several promoter-associated differentially methylated CpG islands between astrocytes and PBLs (Ching et al., 2005). Separately, using protein domains with specific affinity for unmethylated CpG sites in combination with an array containing CpG islands, Illingworth et al. found that around 6-8% of CpG islands were methylated in somatic tissues and found 1,657 differentially methylated

CpG islands in one or more tissues studied (Illingworth et al., 2008). Furthermore, the methylation status of CpG islands generally associated with gene expression, where methylated

CpG islands tended to show low expression. From early genome-wide maps in mammals, CpG islands were shown to not always stay unmethylated, rather methylation occured in a tissue- specific manner.

As the use of whole-genome bisulfite sequencing becomes more wide-spread, the full complement of methylation levels at the majority of CpG and non-CpG sites can be easily measured. One goal of these studies was to identify individual regulatory elements and how these regulatory elements are differentially modified by DNA methylation. In one of the earliest large-scale bisulfite sequencing studies in human, measuring DNA methylation on chromosomes

6, 20, and 22 from 12 different human tissues revealed an underrepresentation of tissue-specific differentially methylated regions in CpG islands with some located up to 100 kb away from annotated genes (Eckhardt et al., 2006). Consistent with this, recent systematic studies in both mouse and human across dozens of tissues reveal that the most dynamic methylation sites across tissues are distal to promoters. On average, nearly 75% of tissue-specific DMRs were found near

10 distal regulatory elements rather than near promoters (Hon et al., 2013; Ziller et al., 2013). These

systematic studies identified hundreds of thousands of tissue DMRs, which is closer to the order

of magnitude of distal regulatory elements, such as enhancers, than promoter-proximal regions.

Relationship between methylation and histone landscape

In the same period that DNA methylation landscapes were being mapped, great efforts

were made in parallel on mapping the landscape of chromatin features through large

collaborative projects like the ENCODE and Roadmap projects. Numerous

genome-wide maps of histone modifications allowed a better understanding of the epigenomic

landscape (Barski et al., 2007; Hawkins et al., 2010; Heintzman et al., 2007; Mikkelsen et al.,

2007). Examples include the identification of bivalent promoters, which are marked by both

active H3K4me3 and repressive H3K27me3 histone modifications, found to mark developmental

genes (Bernstein et al., 2006). Other combinations of histone modifications also predicted

enhancer elements across the genome (Creyghton et al., 2010; Heintzman et al., 2007, 2009;

Rada-Iglesias et al., 2011).

Cross-referencing newly generated DNA methylation maps with histone modification

maps identified where DNA methylation may function in the epigenetic architecture and

revealed associations between DNA methylation and the histone landscape. Work from our

laboratory and others studying the methylome of embryonic stem cells found that DNA

methylation at promoters varied based on chromatin context (Fouse et al., 2008; Meissner et al.,

2008; Weber et al., 2007). Specifically, methylated promoters were predominately found in a

minority of low/intermediate CpG-content promoters lacking histone modifications, while promoters with higher CpG contents were mostly unmethylated and marked with active histone

11 modifications. Low CpG content promoters were enriched for lineage specific genes, whereas

developmental genes were generally associated with higher CpG content and bivalent histone modifications. Measurement of histone modification and DNA methylation during stem cell differentiation showed correlation between the loss of histone modifications and DNA hypermethylation (Meissner et al., 2008).

In the first WGBS study in mouse embryonic stem cells, the authors identified low- methylated regions (<30%) that were enriched in K4me1 and p300, resembling distal regulatory regions (Stadler et al., 2011). The authors also tested how these low methylated regions are formed by manipulating CTCF binding sites on reporter transgenes and found that introduction

of binding sites led to local reduction in DNA methylation, arguing that regulatory factor binding

is able to regulate methylation levels. Together, mapping of multiple layers of the epigenome have revealed a complex landscape, where genes have different modes of epigenetic regulation during different stages of development and different genomic contexts.

However, relating between epigenetic modification and gene expression is ultimately limited by association. Mapping studies have allowed a high-resolution view across different tissues, allowing association between variable sites and regions of functional importance. Newer studies now are utilizing traditional genetic systems measured using genome-wide maps to understand how genetic perturbations affect the epigenetic landscape.

12 DNA Methylation Mechanisms – Lessons from Biochemistry

Cloning of DNMTs

During the inception of epigenetics as a field, DNA methylation was functionally

recognized as a regulatory mark, however the players that wrote the mark onto the genome in mammals were only hypothesized. It was not until the end of the 1980s that the first mammalian

DNA methyltransferase (DNMT) was identified and cloned by Tim Bestor with the late Vernon

Ingram (Bestor et al., 1988). This first methyltransferase, now known as DNMT1, was shown to have preference for hemimethylated DNA, or 5-methylcytosine present on only one strand, which is typically found after DNA replication as the newly synthesized strand has not been methylated. Thus, the newly cloned DNMT enzyme fulfilled the postulated function of the maintenance methyltransferase.

An observation about the sequence of DNMT1 was that it showed striking similarity in

its C-terminal domain to bacterial methyltransferases, including several key conserved motifs.

Comparative analysis between 45 methyltransferases showed high levels of conservation of 10 amino acid motifs throughout the methyltransferase domain (Kumar et al., 1994). Utilizing this sequence similarity between mammalian and bacterial methyltransferases, Masaki Okano and En

Li identified and cloned mammalian Dnmt3a and Dnmt3b by searching human EST clones for these conserved sequences (Okano et al., 1998). Unlike DNMT1 these new enzymes showed activity on unmethylated DNA, and thus fulfilled the role of de novo DNA methyltransferase enzymes.

While identification of the mammalian methylation machinery confirmed previous notions about how epigenetic information could be propagated between generations of cells, it remained difficult to understand how patterns of methylation arose. Mammalian

13 methyltransferases were found to have little sequence specificity aside from increased activity towards CpG dinucleotides, in contrast to bacterial methyltransferases that had sequence specificity like their restriction enzyme counterparts. The identification of a few general acting methyltransferases with little sequence specificity necessitates regulation through interaction with other proteins.

Biochemical insights into DNMT function

Both de novo and maintenance DNMTs share a similar domain structure, with a large N- terminal regulatory domains and C-terminal catalytic methyltransferase domain. Mammalian

DNMT3A and DNMT3B share similar domain arrangements with a N-terminal variable region,

PWWP domain, ADD domain, followed by a C-terminal catalytic domain.

The PWWP domain is named after the Pro-Trp-Trp-Pro motif found in many nuclear proteins. In DNMT3A, PWWP has been shown to interact with chromatin and is necessary for

DNMT to localize to chromatin targets (Chen et al., 2004; Ge et al., 2004; Qiu et al., 2002). The

ADD domain is a unique zinc finger motif named after ATRX, DNMT3, and DNMT3L proteins in which it was identified (Aapola et al., 2000). The ADD domain consists of a C2C2 GATA- like finger and an imperfect PHD finger, ending with a long C-terminal alpha helix that contacts the GATA finger forming a self-contained domain (Argentaro et al., 2007). The ADD domain of

DNMT3A/3L has been shown to interact with the H3 histone tail, specifically sensing the presence of H3K4 and H3K9 methylation states (Li et al., 2011; Otani et al., 2009; Zhang et al.,

2010). DNMT1 meanwhile is organized into a N-terminal PCNA binding domain, nuclear localization signal, replication foci targeting (RFT) domain, a CXXC domain, two bromo- adjacent homology (BAH) domains, and a C-terminal catalytic domain (Leonhardt et al., 1992) .

14 Recently, structural and biochemical experiments have provided evidence for novel

regulatory principles of DNMT (Jeltsch and Jurkowska, 2016). Most recently, structures of

DNMT3A have shown the existence of intrinsic autoinhibitory mechanisms that limit activity of

the enzymes outside of proper contexts. Autoinhibition in DNMT3A is mediated through

interactions between the ADD domain and the C-terminal catalytic domain, which normally block the catalytic sites from DNA access in the absence of activating conditions (Guo et al.,

2014b). When the ADD domain interacts with histone H3 tails, specifically tails unmethylated at the H3K4 position, DNMT3A undergoes a large conformational change moving the ADD domain away from the catalytic site allowing access and methylation of DNA. Autoinhibition has also been demonstrated for DNMT1, where the RFT domain blocks the catalytic site while the CXXC domain positions unmethylated DNA away from the catalytic site inhibiting DNMT1 in the absence of methylation and coactivators (Song et al., 2011; Takeshita et al., 2011).

Lastly, biochemical studies have shown that de novo methyltransferases do not operate as monomers, but rather interact with one another in different combinations. The most well studied complex is a heterotetramer consisting of a central DNMT3A dimer and two DNMT3L molecules on the periphery (3L-3A-3A-3L) (Jia et al., 2007). DNMT3L is an inactive homolog of DNMT3A consisting of an ADD domain and a catalytically inactive methyltransferase

domain. The interaction of DNMT3L with DNMT3A increases the catalytic activity of

DNMT3A (Gowher et al., 2005; Suetake et al., 2004). Both homotypic interactions with

DNMT3A and heterotypic interactions with DNMT3L occur through the C-terminal catalytic

domain of DNMT3A and have been well characterized. DNMT3A has also been shown to

mainly interact with DNMT3B in ESCs (Li et al., 2007).

15 DNA Methylation interactions with chromatin

Activity of de novo DNMTs were shown to differ between naked DNA and chromatin templates (Zhang et al., 2010). Furthermore, methylation of H3K4 on chromatin templates in vitro further reduced methyltransferase activity indicating responsiveness to chromatin context on the activity of DNMTs. An increasing number of interactions have been described between

DNMT and chromatin, indicating a high level of regulation of when and where DNA methylation can occur. I will focus on specific examples of how de novo DNMTs have interacted with the different modifications of histone tails. As mentioned earlier, the conserved domains in the N-terminal regulatory regions of DNMT3A/3B confer the ability of de novo DNMTs to interact with specific histone modifications. The PWWP domain of DNMT3A was shown to specifically interact with H3K36me3, where binding increased DNMT3A activity (Dhayalan et al., 2010). Most recently, studies of genome-wide localization of DNMT3B have found that the

PWWP domain targets DNMT3B to gene bodies through interactions with H3K36me3 histone modification (Baubec et al., 2015; Morselli et al., 2015).

Another key interaction of the DNMT3A/3L ADD domain is with H3K4, as previously described (Otani et al., 2009). The interaction of DNMT3A with unmethylated H3K4 is involved in interruption of intrinsic autoinhibition. The ADD domain of both DNMT3A and DNMT3L show high binding affinity for unmethylated H3K4 and low affinities for methylated forms. This is thought to prevent DNMT from aberrantly methylating active promoters. DNMT3A enzymes engineered with mutations in the ADD domain could bind and methylate more in H3K4me3 marked regions with generalized repression of genes (Noh et al., 2015).

Fewer studies have investigated the interaction between DNMT and H3K27 histone modifications. One highly cited study showed EZH2, the writer for H3K27me3, recruits

16 DNMT3A/3B in cancer cells via the ADD domain (Viré et al., 2006). For some time H2K27me3 and DNA methylation were thought to be mutually exclusive, but improved techniques showed overlapping regions in the genome (Brinkman et al., 2012). Most recently, DNA methylation was shown to inhibit writing of H3K27me3 (Jermann et al., 2014).

Lastly, DNA methylation also has a close relationship with methylation of H3K9 and cooperate in closed chromatin structure and gene repression. DNA methylation was first shown to require H3K9 histone methyltransferases in Arabidopsis thaliana, indicating that H3K9me can direct DNA methylation (Jackson et al., 2002). In support of this idea, interaction between

DNMTs, SUV39H1, and HP1 were shown in vitro (Fuks et al., 2003; Smallwood et al., 2007).

Interactions among DNMTs, H3K9 histone methyltransferases, and HP1 form a positive feedback loop allowing establishment of heterochromatin in different environments across the genome (Du et al., 2015; Epsztejn-Litman et al., 2008). The relationship between DNA methylation and H3K9 methylation is complicated by the fact there are five known H3K9 histone methyltransferases in mammals. Knockout of different H3K9 histone methyltransferases resulted in specific loss of DNA methylation at different sites. For example, Suv39h1/2 is required for de novo DNA methylation at pericentric heterochromatin (Lehnertz et al., 2003), while knockout of G9a resulted in hypomethylation of a few dozen specific loci (Ikegami et al.,

2007). However, the converse is not always true, where genome-wide demethylation in mESCs resulted in minimal changes in H3K9 (Karimi et al., 2011; Tsumura et al., 2006). However, in human cancer cell lines, DNMT knockout does reveal global reduction in K9me2/3, suggesting the relationship may differ based on cellular context.

17 Biological Functions

Genetics - Revisiting function through experimental models

The function of DNA methylation in the cellular context was studied soon after cloning

of the respective Dnmt genes. Knockout of Dnmt1 in mouse embryonic stem cells and derivative

mouse lines revealed an essential role for DNA methylation in early development (Lei et al.,

1996). In this study, homozygous mutations causing genomic hypomethylation led to arrest as

early as the 8-somite stage. Heterozygous knockouts on the other hand showed no abnormalities.

Interestingly, homozygous knockout ESCs were viable. However, upon differentiation to

embryoid bodies, knockout cells were unable to differentiate compared to normal cells.

Conditional Dnmt1 deletion in fibroblasts also resulted in cell death indicating a differential requirement or distinct roles for DNA methylation in embryonic development and differentiated somatic cells (Jackson-Grusby et al., 2001). Further study of Dnmt1 knockouts showed chromosomal instability, reactivation of transposable and viral elements, and initiation of tumorigenesis (Holm et al., 2005; Kim et al., 2004).

Studies of de novo DNA methyltransferases showed similar requirements for DNA methylation during development (Okano et al., 1999). Homozygous Dnmt3b knockout mice resembled Dnmt1 knockouts showing an embryonic lethal phenotype. Mutant embryos

developed normally before E9.5, but showed neural tube defects and other abnormalities later in

development. In contrast, Dnmt3a knockout mice developed to term, but became runted and died

several weeks after birth. Double mutant embryos (Dnmt3a-/-, Dnmt3b-/-) showed abnormal

morphology as early as E8.5 and died before E11.5. Finally, in the most extreme case, where all

active methyltransferases were knocked out, triple-knockout (TKO) ESCs were shown to

18 maintain pluripotency with no changes in H3K9 and H3K4 localization in cells (Tsumura et al.,

2006).

At the same time as the mouse studies of DNMT were conducted, human cell line models

were used to study the role of DNMTs in human, principally in the study of human cancers. In

the lab of Stephen Baylin, Bert Vogelstein, Kenneth Kinzler, and others, hypomethylated human

colorectal carcinoma cell line (HCT116) showed demethylation of repeat sequence, loss of

imprinting, growth suppression, and chromosomal instability (Egger et al., 2006; Karpf and

Matsui, 2005; Rhee et al., 2002). The most complete demethylation of human cells, utilizing a full loss-of-function allele in DNMT1 alone, led to “mitotic catastrophe”, which was characterized by severe mitotic defects and cell death, consistent with results from Dnmt1 knockout in mouse fibroblasts (Chen et al., 2007).

Roles during development

DNA methylation is highly dynamic during the earliest stages of development where

rapid and global changes in DNA methylation occur prior to implantation of the blastocyst into the uterine wall (Guo et al., 2014a; Santos et al., 2002; Smith et al., 2014). Specifically, global demethylation occurs during the preimplantation, followed by global remethylation. During demethylation, parental imprints are reset. After the initial landscape of DNA methylation is established for pluripotency around the post-implantation step of early embryogenesis, DNA methylation is both added de novo to silence programs no longer needed in more differentiated cells as well removed concomitant with activation of specific developmental programs

(Bogdanović et al., 2016; Li et al., 2007).

19 Early evidence of DNA methylation’s involvement in neural development was during the neurogenic to gliogenic switch, where changes in DNA methylation of Gfap coincided with the period (Takizawa et al., 2001; Teter et al., 1994, 1996). As a result of Dnmt knockout mice being embryonic lethal or dying shortly after birth, the study of DNA methylation in the nervous system relied on conditional knockouts, work largely done by Guoping Fan and Rudolf Jaenisch.

The first in vivo of hypomethylation in the entire CNS precursors compartment via Nestin-Cre- meditated Dnmt1 knockout resulted in neonatal death due to disruption in breathing control centers in the brain (Fan et al., 2001). Specifically, in severely hypomethylated cells, neural cells died immediately after birth, while in mosaic animals, most cells were eliminated postnatally.

Hypomethylation additionally resulted in precocious differentiation to glial cells in early embryos due to JAK-STAT pathway activation (Fan et al., 2005).

Studying methylation’s role in more specific regions on the brain, Emx1-Cre-driven

Dnmt1 knockout mice showed defects in cortical lamination as well as progressive degeneration in early postnatal stages (Golshani et al., 2005). Follow-up studies on surviving hypomethylated postnatal neuron showed defects, including dendritic arborization and excitability (Hutnick et al.,

2009). Hypomethylation in postmitotic neurons led to learning deficits in CamK2a-Cre-mediated

Dnmt knockout mice (Feng et al., 2010).

Separate from global loss of methylation using Dnmt1, Dnmt3a knockout in Nes-Cre mice resulted in loss of motor neurons in hypoglossal nucleus and defects in neuromuscular junctions of the diaphragm (Nguyen et al., 2007). Furthermore, cultured neural stem cells from

Dnmt3a knockout mice showed precocious differentiation with increased glial differentiation

(Wu et al., 2010, 2012). Mapping of DNA methylation profiles of the developing and postnatal brain showed increased non-CpG methylation after birth, particularly in the neuronal lineage

20 compared to glial. The increases in mCH coincided with periods of synaptogenesis and synaptic

pruning, indicating potentially specific roles for DNA methylation in postnatal brain

development and maturation. Interestingly, DNMT3A binding is enriched at sites of mCH in

neurons (Lister et al., 2013). DNMT3A has also been studied in other adult stem cells including

hematopoietc stem cells (HSCs), where DNMT3A was found to be essential for maintenance of

multipotency in HSCs (Challen et al., 2012).

One interesting concept that has recently arisen is of different modes of methylation

pattern maintenance (Shipony et al., 2014). By contrasting DNA methylation dynamics and noise

in human embryonic stem cells and somatic fibroblast cell lines, distinct modes of maintenance

were observed between the two states. Embryonic stem cells utilize a “dynamic equilibrium” model involving high levels of demethylation and de novo methylation. Meanwhile, fibroblast

cells utilize the more traditional static or “clonal maintenance” model, where methylation

patterns are highly reliant on DNMT1 maintenance methylation. These different modes

dominating different stages of development may explain some differences observed between

ESCs and somatic cell lines.

DNA Methylation in Disease

DNA methylation, like the broader epigenetic apparatus, has been identified to play roles

in nearly every tissue studied. Likewise, alterations in DNA methylation are also increasingly

recognized in the mechanism of more and more diseases. Here I will briefly describe three classic and more relevant disease states where methylation changes are either widespread or play

key mechanistic roles.

21 Cancer

As interest in DNA methylation as a gene regulatory mechanism increased in the 1980s,

investigations of how DNA methylation differed between normal and cancerous tissues were

conducted. Among these early studies, Andrew Feinberg, Bert Vogelstein, and others observed

that genes would progressively become hypomethylated, where the degree of hypomethylation

correlated to disease stage (Feinberg and Vogelstein, 1983; Gama-Sosa et al., 1983). Around the

same time, serendipitous measurement of DNA methylation around the calcitonin CpG island

promoter showed aberrant hypermethylation in cancer compared to normal counterparts (Baylin

et al., 1986). Measurement of additional loci together with identification of examples where bona fide tumor suppressors were silenced due to methylation led to the broader application of the

hypermethylation concept (Baylin et al., 1998; Jones, 1996). Furthermore, a distinct group of cancers showed higher levels of CpG island methylation, termed the CpG island methylator

phenotype (CIMP), first in colorectal carcinomas (Issa, 2004; Toyota et al., 1999). Together, the

hypo- and hypermethylation phenotypes have been observed in many different cancers. Today,

the epigenetic alterations in human cancers are often described as a combination of global

hypomethylation with regional hypermethylation where hypermethylation is able to methylation

and silence tumor suppressor genes, contributing to the progression of disease (Esteller, 2008).

Recently specific mutations in the epigenetic machinery have been identified in a large

proportion of hematological malignancies. Specifically, mutations of DNMT3A at the hotspot

residue R882 are found in approximately one fifth to a quarter of acute myeloid leukemia (AML)

cases (Ley et al., 2010). This mutation was recently shown to act in a dominant-negative fashion,

acting through disruption of DNMT3A dimerization (Russler-Germain et al., 2014). While

22 current research has improved our understanding of how the mutations alter DNMT3A function,

how the methylation changes precisely lead to malignancy are still unclear.

Imprinting disorders

Among the human disorders rooted in epigenetics, one of the most well-known classes

are disorders resulting from misexpression of imprinted genes. Imprinted genes are a class of

genes that are differentially expressed depending on the parent-of-origin. Early evidence of imprinting was supported by nonequivalence of parental genomes, shown by the necessity of both paternal and maternal genomes for development. Formal studies of imprinting utilizing random insertion of transgenes in mice and observation of parent-of-origin effects identified methylation as a main mechanism for imprinting (Reik et al., 1987; Sapienza et al., 1987; Swain et al., 1987). Furthermore, the pattern of expression correlated with the methylation state.

As a class, imprinting disorders often have phenotypes involving growth and neurocognition (Constância et al., 2004). One of the oldest and best described imprinting disorders is Beckwith-Wiedemann syndrome (BWS), described first in 1964 by Drs. Beckwith and Wiedemann. BWS is characterized by pathognomonic features including macroglossia, midline abdominal wall defects, and somatic overgrowth. It was not until 1989 that the locus was linked to 11p15 and a year later hypothesized to be caused by an imprinted gene due to the observed parental patterns of deletion and duplication (Brown et al., 1990; Ping et al., 1989). In the following years, further study of the Igf2 imprinting locus culminated in generation of animal models recapitulating BWS features (Sun et al., 1997; Zhang et al., 1997).

The number of imprinted genes has steadily grown across the last few decades with the overall number on the order of a hundred imprinted genes. New genome-wide technologies have

23 enabled systematic identification of novel predicted imprinting regions and have even identified

tissue-specific imprints. Many of these genes are clustered together and controlled locally via

imprinting control regions. Associated with many imprinted loci are developmental disorders

that exhibit reciprocal phenotypes. For example, Beckwith-Wiedemann syndrome and Silver-

Russell syndrome display overgrowth and growth restriction respectively, both due to alterations

at chr 11p15. Similarly, more recently described Kagami-Ogata syndrome and Temple syndrome

exhibit increased and decreased birth weight respectively, where both are due to alterations at chr

14q32 (Ogata and Kagami, 2016; Temple et al., 2007).

Overgrowth Syndromes

Related to Beckwith Wiedemann syndrome is another class of developmental disorders

exhibiting somatic overgrowth resulting from mutation of the epigenetic machinery.

Haploinsufficiency of NSD1, a H3K36-specific histone methyltransferase (HMTase), was found

to cause Sotos syndrome, a syndrome characterized by overgrowth, macrocephaly, brain

anomalies and seizure, and mental retardation (Kurotaki et al., 2002). Later, autosomal dominant

mutations in EZH2, a H3K27-specific HMTase was found to cause Weaver syndrome, characterized by tall stature (>2.5 SD) and head circumference, facial features including a broad forehead and hypertelorism (abnormally distant eyes), and mild intellectual disability (IQ < 70)

(Gibson et al., 2012; Tatton-Brown et al., 2011). Furthermore, additional mutations in SETD2

(H3K36 HMTase) for Sotos overgrowth syndrome and EED (H3K27 HMTase) for Weaver syndrome were identified, indicating that the epigenetic marks rather than protein-specific interactions cause the phenotype (Cohen et al., 2015; Luscan et al., 2014). Most recently,

DNMT3A was identified in a cohort of overgrowth syndrome patients without a known etiology

24 (Tatton-Brown et al., 2014). The newly named Tatton-Brown-Rahman Syndrome (TBRS) presents with increased height and head circumference, mild to moderate intellectual disability in all cases, and variable penetrance of other findings (e.g. cardiovascular malformations, scoliosis).

With the recent discovery and only 13 patients described in literature, our knowledge of

DNMT3A overgrowth syndrome is limited.

Distinct from constitutional overgrowth syndromes, regional mosaic overgrowth syndromes are also well known, with Proteus/Wiedemann syndrome popularized by the Elephant

Man in popular culture. It is hypothesized that these syndromes are caused by mutations that would be lethal in the non-mosaic state. Proteus syndrome results from mosaic mutations in

AKT1, causing disproportionate and asymmetric localized overgrowth in skin, connective tissue, and the brain (Lindhurst et al., 2011). The PI3K-AKT signaling axis is also downstream of constitutional Beckwith-Wiedemann Syndrome, where IGF2 signaling through IGF1R, activates the PI3K-AKT or Ras pathway (Sun et al., 1997; Weksberg et al., 2010; Zhang et al., 1997).

Together, overgrowth syndromes predominantly influence components of the PI3K-AKT and

Ras pathways or epigenetic machinery.

An interesting observation that hints at the molecular underpinnings of overgrowth syndromes is the shared presence in many malignancies. NSD1, the causal gene in Sotos syndrome, is also found fused to NUP98 in recurrent t(5;11)(q35;p15.5) translocations in childhood AML and epigenetically silenced in neuroblastoma and gliomas (Berdasco et al.,

2009; Jaju et al., 2001). Similarly, mutations and upregulation of EZH2, the causal gene in

Weaver syndrome, are found in a number of hematopoietic malignancies as well as other solid tumors (Ernst et al., 2010; Varambally et al., 2002).

25 Relation to this Dissertation

As presented in the introduction, early studies of DNA methylation's silencing

functionality prematurely attributed an instructive role for DNA methylation in gene regulation.

Observation that CpG islands were generally hypomethylated provided early indication that

DNA methylation may not have as direct a role as previously imagined. Together with evidence

showing silencing of endogenous loci precedes methylation, the hypothesis of DNA methylation

as a driver of gene regulation fell by the wayside. Recent observations in more comprehensive

genome-wide maps have renewed interest in DNA methylation as tissue-specific differentially

methylated regions are found colocalized with features characteristic of distal regulatory

elements, such as enhancers.

Epigenetic mechanisms are general tools used in cells to remember and respond to its

environment. However, this system interfaces with many different tissue-specific cues and instructions forming tissue-specific outcomes. Our understanding of the basic principles of how

DNA methylation are relatively well established, but new understanding is focused on the interactions with a number of other regulators. With this in mind, to understand the role of DNA

methylation in development and disease, it is necessary to understand not only the first principles

on which DNA methylation operates, but also how and where it operates in specific contexts. In this dissertation, I attempt to address these two major aspects important to understanding DNA methylation.

Foremost is understanding the first principles of how DNA methylation functions among the epigenomic community and its role in gene regulation. One fundamental question relating to this is whether DNA methylation acts as a driver or a follower in the epigenetic and gene regulatory hierarchy. Chapter two delves into asking a fundamental question about the order of

26 events. The debate preceding the work described in this chapter was over DNA methylation’s place in the regulatory hierarchy. The dramatic iPSC reprogramming experiments by Yamanaka

and others, reminiscent of 5-azacytidine transdifferentiation of 3T3 and 10T½ experiments in the

1980s, provided strong evidence that transcription factors alone can control a cell’s state.

Mapping of Polycomb-repressive complex and its associated histone modifications also revealed histone modifications as the major repressive mechanism of developmental genes. Furthermore, highly cited work in cancer cells provided a mechanistic argument that DNA methylation was recruited by histone modifications.

The second question is what role does DNA methylation have in the proper biological setting, particularly in a recently discovered disease that results from mutations of DNMT3A that have unknown functional consequences. Also, to what extent can classic methylation function explain new diseases with mutations in the DNA methylation machinery and what can new paradigms coming out from this past decade teach us? Chapter three thus takes a global omics view of a cellular model of a human overgrowth syndrome asking how alterations in the methylation machinery effect the epigenome.

27 References

Aapola, U., Kawasaki, K., Scott, H.S., Ollila, J., Vihinen, M., Heino, M., Shintani, A., Kawasaki, K., Minoshima, S., Krohn, K., et al. (2000). Isolation and initial characterization of a novel zinc finger gene, DNMT3L, on 21q22.3, related to the cytosine-5-methyltransferase 3 gene family. Genomics 65, 293–298.

Arber, W. (1965). Host-controlled modification of bacteriophage. Annu. Rev. Microbiol. 19, 365–378.

Argentaro, A., Yang, J.-C., Chapman, L., Kowalczyk, M.S., Gibbons, R.J., Higgs, D.R., Neuhaus, D., and Rhodes, D. (2007). Structural consequences of disease-causing mutations in the ATRX-DNMT3-DNMT3L (ADD) domain of the chromatin-associated protein ATRX. Proc. Natl. Acad. Sci. U. S. A. 104, 11939–11944.

Barski, A., Cuddapah, S., Cui, K., Roh, T.-Y., Schones, D.E., Wang, Z., Wei, G., Chepelev, I., and Zhao, K. (2007). High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837.

Baubec, T., Colombo, D.F., Wirbelauer, C., Schmidt, J., Burger, L., Krebs, A.R., Akalin, A., and Schübeler, D. (2015). Genomic profiling of DNA methyltransferases reveals a role for DNMT3B in genic methylation. Nature.

Baylin, S.B., Höppener, J.W., de Bustros, A., Steenbergh, P.H., Lips, C.J., and Nelkin, B.D. (1986). DNA methylation patterns of the calcitonin gene in human lung cancers and lymphomas. Cancer Res. 46, 2917–2922.

Baylin, S.B., Herman, J.G., Graff, J.R., Vertino, P.M., and Issa, J.P. (1998). Alterations in DNA methylation: a fundamental aspect of neoplasia. Adv. Cancer Res. 72, 141–196.

Berdasco, M., Ropero, S., Setien, F., Fraga, M.F., Lapunzina, P., Losson, R., Alaminos, M., Cheung, N.-K., Rahman, N., and Esteller, M. (2009). Epigenetic inactivation of the Sotos overgrowth syndrome gene histone methyltransferase NSD1 in human neuroblastoma and glioma. Proc. Natl. Acad. Sci. U. S. A. 106, 21830–21835.

Bernstein, B.E., Mikkelsen, T.S., Xie, X., Kamal, M., Huebert, D.J., Cuff, J., Fry, B., Meissner, A., Wernig, M., Plath, K., et al. (2006). A Bivalent Chromatin Structure Marks Key Developmental Genes in Embryonic Stem Cells. Cell 125, 315–326.

Bestor, T., Laudano, A., Mattaliano, R., and Ingram, V. (1988). Cloning and sequencing of a cDNA encoding DNA methyltransferase of mouse cells. The carboxyl-terminal domain of the mammalian enzymes is related to bacterial restriction methyltransferases. J. Mol. Biol. 203, 971– 983.

Bibikova, M., Chudin, E., Wu, B., Zhou, L., Garcia, E.W., Liu, Y., Shin, S., Plaia, T.W., Auerbach, J.M., Arking, D.E., et al. (2006). Human embryonic stem cells have a unique epigenetic signature. Genome Res. 16, 1075–1083.

28 Bird, A. (1986). CpG-rich islands and the function of DNA methylation. Nature 321, 209–213.

Bird, A.P. (1980). DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res. 8, 1499–1504.

Bird, A.P. (1984). Gene expression: DNA methylation ? how important in gene control? Nature 307, 503–504.

Bird, A.P., and Southern, E.M. (1978). Use of restriction enzymes to study eukaryotic DNA methylation: I. The methylation pattern in ribosomal DNA from Xenopus laevis. J. Mol. Biol. 118, 27–47.

Bird, A., Taggart, M., Frommer, M., Miller, O.J., and Macleod, D. (1985). A fraction of the mouse genome that is derived from islands of nonmethylated, CpG-rich DNA. Cell 40, 91–99.

Bogdanović, O., Smits, A.H., de la Calle Mustienes, E., Tena, J.J., Ford, E., Williams, R., Senanayake, U., Schultz, M.D., Hontelez, S., van Kruijsbergen, I., et al. (2016). Active DNA demethylation at enhancers during the vertebrate phylotypic period. Nat. Genet. 48, 417–426.

Brinkman, A.B., Gu, H., Bartels, S.J.J., Zhang, Y., Matarese, F., Simmer, F., Marks, H., Bock, C., Gnirke, A., Meissner, A., et al. (2012). Sequential ChIP-bisulfite sequencing enables direct genome-scale investigation of chromatin and DNA methylation cross-talk. Genome Res. 22, 1128–1138.

Brown, K.W., Williams, J.C., Maitland, N.J., and Mott, M.G. (1990). Genomic imprinting and the Beckwith-Wiedemann syndrome. Am. J. Hum. Genet. 46, 1000–1001.

Busslinger, M., Hurst, J., and Flavell, R.A. (1983). DNA methylation and the regulation of globin gene expression. Cell 34, 197–206.

Challen, G. a, Sun, D., Jeong, M., Luo, M., Jelinek, J., Berg, J.S., Bock, C., Vasanthakumar, A., Gu, H., Xi, Y., et al. (2012). Dnmt3a is essential for hematopoietic stem cell differentiation. Nat. Genet. 44, 23–31.

Chen, T., Tsujimoto, N., and Li, E. (2004). The PWWP domain of Dnmt3a and Dnmt3b is required for directing DNA methylation to the major satellite repeats at pericentric heterochromatin. Mol. Cell. Biol. 24, 9048–9058.

Chen, T., Hevi, S., Gay, F., Tsujimoto, N., He, T., Zhang, B., Ueda, Y., and Li, E. (2007). Complete inactivation of DNMT1 leads to mitotic catastrophe in human cancer cells. Nat. Genet. 39, 391–396.

Ching, T.-T., Maunakea, A.K., Jun, P., Hong, C., Zardo, G., Pinkel, D., Albertson, D.G., Fridlyand, J., Mao, J.-H., Shchors, K., et al. (2005). Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3. Nat. Genet. 37, 645–651.

Cohen, A.S. a, Tuysuz, B., Shen, Y., Bhalla, S.K., Jones, S.J.M., and Gibson, W.T. (2015). A

29 novel mutation in EED associated with overgrowth. J. Hum. Genet. 60, 339–342.

Cokus, S.J., Feng, S., Zhang, X., Chen, Z., Merriman, B., Haudenschild, C.D., Pradhan, S., Nelson, S.F., Pellegrini, M., and Jacobsen, S.E. (2008). Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature 452, 215–219.

Constância, M., Kelsey, G., and Reik, W. (2004). Resourceful imprinting. Nature 432, 53–57.

Coulondre, C., Miller, J.H., Farabaugh, P.J., and Gilbert, W. (1978). Molecular basis of base substitution hotspots in Escherichia coli. Nature 274, 775–780.

Creyghton, M.P., Cheng, A.W., Welstead, G.G., Kooistra, T., Carey, B.W., Steine, E.J., Hanna, J., Lodato, M. a, Frampton, G.M., Sharp, P. a, et al. (2010). Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl. Acad. Sci. U. S. A. 107, 21931–21936.

Davis, R.L., Weintraub, H., and Lassar, A.B. (1987). Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell 51, 987–1000.

Dhayalan, A., Rajavelu, A., Rathert, P., Tamas, R., Jurkowska, R.Z., Ragozin, S., and Jeltsch, A. (2010). The Dnmt3a PWWP domain reads histone 3 lysine 36 trimethylation and guides DNA methylation. J. Biol. Chem. 285, 26114–26120.

Doerfler, W. (1983). DNA Methylation and Gene Activity. Annu. Rev. Biochem. 52, 93–124.

Du, J., Johnson, L.M., Jacobsen, S.E., and Patel, D.J. (2015). DNA methylation pathways and their crosstalk with histone methylation. Nat. Rev. Mol. Cell Biol. 16, 519–532.

Eckhardt, F., Lewin, J., Cortese, R., Rakyan, V.K., Attwood, J., Burger, M., Burton, J., Cox, T. V, Davies, R., Down, T.A., et al. (2006). DNA methylation profiling of human chromosomes 6, 20 and 22. Nat. Genet. 38, 1378–1385.

Egger, G., Jeong, S., Escobar, S.G., Cortez, C.C., Li, T.W.H., Saito, Y., Yoo, C.B., Jones, P. a, and Liang, G. (2006). Identification of DNMT1 (DNA methyltransferase 1) hypomorphs in somatic knockouts suggests an essential role for DNMT1 in cell survival. Proc. Natl. Acad. Sci. U. S. A. 103, 14080–14085.

Epsztejn-Litman, S., Feldman, N., Abu-Remaileh, M., Shufaro, Y., Gerson, A., Ueda, J., Deplus, R., Fuks, F., Shinkai, Y., Cedar, H., et al. (2008). De novo DNA methylation promoted by G9a prevents reprogramming of embryonically silenced genes. Nat. Struct. Mol. Biol. 15, 1176–1183.

Ernst, T., Chase, A.J., Score, J., Hidalgo-Curtis, C.E., Bryant, C., Jones, A. V, Waghorn, K., Zoi, K., Ross, F.M., Reiter, A., et al. (2010). Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat. Genet. 42, 722–726.

Esteller, M. (2008). Epigenetics in cancer. N. Engl. J. Med. 358, 1148–1159.

Fan, G., Beard, C., Chen, R.Z., Csankovszki, G., Sun, Y., Siniaia, M., Biniszkiewicz, D., Bates,

30 B., Lee, P.P., Kuhn, R., et al. (2001). DNA hypomethylation perturbs the function and survival of CNS neurons in postnatal animals. J. Neurosci. 21, 788–797.

Fan, G., Martinowich, K., Chin, M.H., He, F., Fouse, S.D., Hutnick, L., Hattori, D., Ge, W., Shen, Y., Wu, H., et al. (2005). DNA methylation controls the timing of astrogliogenesis through regulation of JAK-STAT signaling. Development 132, 3345–3356.

Feinberg, A.P., and Vogelstein, B. (1983). Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature 301, 89–92.

Feng, J., Zhou, Y., Campbell, S.L., Le, T., Li, E., Sweatt, J.D., Silva, A.J., and Fan, G. (2010). Dnmt1 and Dnmt3a maintain DNA methylation and regulate synaptic function in adult forebrain neurons. Nat. Neurosci. 13, 423–430.

Fouse, S.D., Shen, Y., Pellegrini, M., Cole, S., Meissner, A., Van Neste, L., Jaenisch, R., and Fan, G. (2008). Promoter CpG methylation contributes to ES cell gene regulation in parallel with Oct4/Nanog, PcG complex, and histone H3 K4/K27 trimethylation. Cell Stem Cell 2, 160–169.

Fuks, F., Hurd, P.J., Wolf, D., Nan, X., Bird, A.P., and Kouzarides, T. (2003). The methyl-CpG- binding protein MeCP2 links DNA methylation to histone methylation. J. Biol. Chem. 278, 4035–4040.

Gama-Sosa, M.A., Slagel, V.A., Trewyn, R.W., Oxenhandler, R., Kuo, K.C., Gehrke, C.W., and Ehrlich, M. (1983). The 5-methylcytosine content of DNA from human tumors. Nucleic Acids Res. 11, 6883–6894.

Gardiner-Garden, M., and Frommer, M. (1987). CpG islands in vertebrate genomes. J. Mol. Biol. 196, 261–282.

Ge, Y.-Z., Pu, M.-T., Gowher, H., Wu, H.-P., Ding, J.-P., Jeltsch, A., and Xu, G.-L. (2004). Chromatin targeting of de novo DNA methyltransferases by the PWWP domain. J. Biol. Chem. 279, 25447–25454.

Gehring, W.J. (1976). Developmental Genetics of Drosophila. Annu. Rev. Genet. 10, 209–252.

Gibson, W.T., Hood, R.L., Zhan, S.H., Bulman, D.E., Fejes, A.P., Moore, R., Mungall, A.J., Eydoux, P., Babul-Hirji, R., An, J., et al. (2012). Mutations in EZH2 Cause Weaver Syndrome. Am. J. Hum. Genet. 90, 110–118.

Golshani, P., Hutnick, L., Schweizer, F., and Fan, G. (2005). Conditional Dnmt1 deletion in dorsal forebrain disrupts development of somatosensory barrel cortex and thalamocortical long- term potentiation. Thalamus Relat. Syst. 3, 227–233.

Gowher, H., Liebert, K., Hermann, A., Xu, G., and Jeltsch, A. (2005). Mechanism of stimulation of catalytic activity of Dnmt3A and Dnmt3B DNA-(cytosine-C5)-methyltransferases by Dnmt3L. J. Biol. Chem. 280, 13341–13348.

Guo, H., Zhu, P., Yan, L., Li, R., Hu, B., Lian, Y., Yan, J., Ren, X., Lin, S., Li, J., et al. (2014a).

31 The DNA methylation landscape of human early embryos. Nature 511, 606–610.

Guo, X., Wang, L., Li, J., Ding, Z., Xiao, J., Yin, X., He, S., Shi, P., Dong, L., Li, G., et al. (2014b). Structural insight into autoinhibition and histone H3-induced activation of DNMT3A. Nature.

Hawkins, R.D., Hon, G.C., Lee, L.K., Ngo, Q., Lister, R., Pelizzola, M., Edsall, L.E., Kuan, S., Luu, Y., Klugman, S., et al. (2010). Distinct epigenomic landscapes of pluripotent and lineage- committed human cells. Cell Stem Cell 6, 479–491.

Heintzman, N.D., Stuart, R.K., Hon, G., Fu, Y., Ching, C.W., Hawkins, R.D., Barrera, L.O., Van Calcar, S., Qu, C., Ching, K.A., et al. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 39, 311–318.

Heintzman, N.D., Hon, G.C., Hawkins, R.D., Kheradpour, P., Stark, A., Harp, L.F., Ye, Z., Lee, L.K., Stuart, R.K., Ching, C.W., et al. (2009). Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112.

Hellman, A., and Chess, A. (2007). Gene body-specific methylation on the active X chromosome. Science 315, 1141–1143.

Holliday, R., and Pugh, J.E. (1975). DNA modification mechanisms and gene activity during development. Science 187, 226–232.

Holm, T.M., Jackson-Grusby, L., Brambrink, T., Yamada, Y., Rideout, W.M., and Jaenisch, R. (2005). Global loss of imprinting leads to widespread tumorigenesis in adult mice. Cancer Cell 8, 275–285.

Hon, G.C., Rajagopal, N., Shen, Y., McCleary, D.F., Yue, F., Dang, M.D., and Ren, B. (2013). Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat. Genet. 45, 1198–1206.

Hotchkiss, R.D. (1948). The quantitative separation of purines, pyrimidines, and nucleosides by paper chromatography. J. Biol. Chem. 175, 315–332.

Hutnick, L.K., Golshani, P., Namihira, M., Xue, Z., Matynia, A., Yang, X.W., Silva, A.J., Schweizer, F.E., and Fan, G. (2009). DNA hypomethylation restricted to the murine forebrain induces cortical degeneration and impairs postnatal neuronal maturation. Hum. Mol. Genet. 18, 2875–2888.

Ikegami, K., Iwatani, M., Suzuki, M., Tachibana, M., Shinkai, Y., Tanaka, S., Greally, J.M., Yagi, S., Hattori, N., and Shiota, K. (2007). Genome-wide and locus-specific DNA hypomethylation in G9a deficient mouse embryonic stem cells. Genes Cells 12, 1–11.

Illingworth, R., Kerr, A., Desousa, D., Jørgensen, H., Ellis, P., Stalker, J., Jackson, D., Clee, C., Plumb, R., Rogers, J., et al. (2008). A novel CpG island set identifies tissue-specific methylation at developmental gene loci. PLoS Biol. 6, e22.

32 Irizarry, R.A., Ladd-Acosta, C., Wen, B., Wu, Z., Montano, C., Onyango, P., Cui, H., Gabo, K., Rongione, M., Webster, M., et al. (2009). The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat. Genet. 41, 178– 186.

Issa, J.-P. (2004). CpG island methylator phenotype in cancer. Nat. Rev. Cancer 4, 988–993.

Jackson, J.P., Lindroth, A.M., Cao, X., and Jacobsen, S.E. (2002). Control of CpNpG DNA methylation by the KRYPTONITE histone H3 methyltransferase. Nature 416, 556–560.

Jackson-Grusby, L., Beard, C., Possemato, R., Tudor, M., Fambrough, D., Csankovszki, G., Dausman, J., Lee, P., Wilson, C., Lander, E., et al. (2001). Loss of genomic methylation causes p53-dependent apoptosis and epigenetic deregulation. Nat. Genet. 27, 31–39.

Jähner, D., Stuhlmann, H., Stewart, C.L., Harbers, K., Löhler, J., Simon, I., and Jaenisch, R. (1982). De novo methylation and expression of retroviral genomes during mouse embryogenesis. Nature 298, 623–628.

Jaju, R.J., Fidler, C., Haas, O.A., Strickson, A.J., Watkins, F., Clark, K., Cross, N.C., Cheng, J.F., Aplan, P.D., Kearney, L., et al. (2001). A novel gene, NSD1, is fused to NUP98 in the t(5;11)(q35;p15.5) in de novo childhood acute myeloid leukemia. Blood 98, 1264–1267.

Jermann, P., Hoerner, L., Burger, L., and Schubeler, D. (2014). Short sequences can efficiently recruit histone H3 lysine 27 trimethylation in the absence of enhancer activity and DNA methylation. Proc. Natl. Acad. Sci. 1400672111 – .

Jia, D., Jurkowska, R.Z., Zhang, X., Jeltsch, A., and Cheng, X. (2007). Structure of Dnmt3a bound to Dnmt3L suggests a model for de novo DNA methylation. Nature 449, 248–251.

Jones, P.A. (1996). DNA methylation errors and cancer. Cancer Res. 56, 2463–2467.

Jones, P.A., and Taylor, S.M. (1980). Cellular differentiation, cytidine analogs and DNA methylation. Cell 20, 85–93.

Josse, J., Kaiser, A.D., and Kornberg, A. (1961). Enzymatic synthesis of deoxyribonucleic acid. VIII. Frequencies of nearest neighbor base sequences in deoxyribonucleic acid. J. Biol. Chem. 236, 864–875.

Kappler, J.W. (1971). The 5-methylcytosine content of DNA: tissue specificity. J. Cell. Physiol. 78, 33–36.

Karimi, M.M., Goyal, P., Maksakova, I. a, Bilenky, M., Leung, D., Tang, J.X., Shinkai, Y., Mager, D.L., Jones, S., Hirst, M., et al. (2011). DNA methylation and SETDB1/H3K9me3 regulate predominantly distinct sets of genes, retroelements, and chimeric transcripts in mESCs. Cell Stem Cell 8, 676–687.

Karpf, A.R., and Matsui, S. (2005). Genetic disruption of cytosine DNA methyltransferase enzymes induces chromosomal instability in human cancer cells. Cancer Res. 65, 8635–8639.

33 Kim, M., Trinh, B.N., Long, T.I., Oghamian, S., and Laird, P.W. (2004). Dnmt1 deficiency leads to enhanced microsatellite instability in mouse embryonic stem cells. Nucleic Acids Res. 32, 5742–5749.

Kumar, S., Cheng, X., Klimasauskas, S., Mi, S., Posfai, J., Roberts, R.J., and Wilson, G.G. (1994). The DNA (cytosine-5) methyltransferases. Nucleic Acids Res. 22, 1–10.

Kurotaki, N., Imaizumi, K., Harada, N., Masuno, M., Kondoh, T., Nagai, T., Ohashi, H., Naritomi, K., Tsukahara, M., Makita, Y., et al. (2002). Haploinsufficiency of NSD1 causes Sotos syndrome. Nat. Genet. 30, 365–366.

Lehnertz, B., Ueda, Y., Derijck, A.A.H.A., Braunschweig, U., Perez-Burgos, L., Kubicek, S., Chen, T., Li, E., Jenuwein, T., and Peters, A.H.F.M. (2003). Suv39h-mediated histone H3 lysine 9 methylation directs DNA methylation to major satellite repeats at pericentric heterochromatin. Curr. Biol. 13, 1192–1200.

Lei, H., Oh, S.P., Okano, M., Jüttermann, R., Goss, K. a, Jaenisch, R., and Li, E. (1996). De novo DNA cytosine methyltransferase activities in mouse embryonic stem cells. Development 122, 3195–3205.

Leonhardt, H., Page, a W., Weier, H.U., and Bestor, T.H. (1992). A targeting sequence directs DNA methyltransferase to sites of DNA replication in mammalian nuclei. Cell 71, 865–873.

Ley, T.J., Ding, L., Walter, M.J., McLellan, M.D., Lamprecht, T., Larson, D.E., Kandoth, C., Payton, J.E., Baty, J., Welch, J., et al. (2010). DNMT3A mutations in acute myeloid leukemia. N. Engl. J. Med. 363, 2424–2433.

Li, B.-Z., Huang, Z., Cui, Q.-Y., Song, X.-H., Du, L., Jeltsch, A., Chen, P., Li, G., Li, E., and Xu, G.-L. (2011). Histone tails regulate DNA methylation by allosterically activating de novo methyltransferase. Cell Res. 21, 1172–1181.

Li, J.-Y., Pu, M.-T., Hirasawa, R., Li, B.-Z., Huang, Y.-N., Zeng, R., Jing, N.-H., Chen, T., Li, E., Sasaki, H., et al. (2007). Synergistic function of DNA methyltransferases Dnmt3a and Dnmt3b in the methylation of Oct4 and Nanog. Mol. Cell. Biol. 27, 8748–8759.

Lindhurst, M.J., Sapp, J.C., Teer, J.K., Johnston, J.J., Finn, E.M., Peters, K., Turner, J., Cannons, J.L., Bick, D., Blakemore, L., et al. (2011). A mosaic activating mutation in AKT1 associated with the Proteus syndrome. N. Engl. J. Med. 365, 611–619.

Lister, R., O’Malley, R.C., Tonti-Filippini, J., Gregory, B.D., Berry, C.C., Millar, A.H., and Ecker, J.R. (2008). Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell 133, 523–536.

Lister, R., Mukamel, E.A., Nery, J.R., Urich, M., Puddifoot, C.A., Johnson, N.D., Lucero, J., Huang, Y., Dwork, A.J., Schultz, M.D., et al. (2013). Global epigenomic reconfiguration during mammalian brain development. Science 341, 1237905.

Luscan, A., Laurendeau, I., Malan, V., Francannet, C., Odent, S., Giuliano, F., Lacombe, D.,

34 Touraine, R., Vidaud, M., Pasmant, E., et al. (2014). Mutations in SETD2 cause a novel overgrowth condition. J. Med. Genet. 51, 512–517.

Meissner, A., Mikkelsen, T.S., Gu, H., Wernig, M., Hanna, J., Sivachenko, A., Zhang, X., Bernstein, B.E., Nusbaum, C., Jaffe, D.B., et al. (2008). Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454, 766–770.

Mikkelsen, T.S., Ku, M., Jaffe, D.B., Issac, B., Lieberman, E., Giannoukos, G., Alvarez, P., Brockman, W., Kim, T.-K., Koche, R.P., et al. (2007). Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560.

Mohandas, T., Sparkes, R.S., and Shapiro, L.J. (1981). Reactivation of an inactive human X chromosome: evidence for X inactivation by DNA methylation. Science 211, 393–396.

Monk, M. (1986). Methylation and the X chromosome. Bioessays 4, 204–208.

Monk, M., Boubelik, M., and Lehnert, S. (1987). Temporal and regional changes in DNA methylation in the embryonic, extraembryonic and germ cell lineages during mouse embryo development. Development 99, 371–382.

Morselli, M., Pastor, W.A., Montanini, B., Nee, K., Ferrari, R., Fu, K., Bonora, G., Rubbi, L., Clark, A.T., Ottonello, S., et al. (2015). In vivo targeting of de novo DNA methylation by histone modifications in yeast and mouse. Elife 4, e06205.

Mouse Genome Sequencing Consortium, Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., et al. (2002). Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562.

Neri, F., Rapelli, S., Krepelova, A., Incarnato, D., Parlato, C., Basile, G., Maldotti, M., Anselmi, F., and Oliviero, S. (2017). Intragenic DNA methylation prevents spurious transcription initiation. Nature 543, 72–77.

Nguyen, S., Meletis, K., Fu, D., Jhaveri, S., and Jaenisch, R. (2007). Ablation of de novo DNA methyltransferase Dnmt3a in the nervous system leads to neuromuscular defects and shortened lifespan. Dev. Dyn. 236, 1663–1676.

Noh, K.-M., Wang, H., Kim, H.R., Wenderski, W., Fang, F., Li, C.H., Dewell, S., Hughes, S.H., Melnick, A.M., Patel, D.J., et al. (2015). Engineering of a Histone-Recognition Domain in Dnmt3a Alters the Epigenetic Landscape and Phenotypic Features of Mouse ESCs. Mol. Cell 59, 89–103.

Ogata, T., and Kagami, M. (2016). Kagami-Ogata syndrome: a clinically recognizable upd(14)pat and related disorder affecting the chromosome 14q32.2 imprinted region. J. Hum. Genet. 61, 87–94.

Okano, M., Xie, S., and Li, E. (1998). Cloning and characterization of a family of novel mammalian DNA (cytosine-5) methyltransferases. Nat. Genet. 19, 219–220.

35 Okano, M., Bell, D.W., Haber, D. a, and Li, E. (1999). DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 99, 247–257.

Otani, J., Nankumo, T., Arita, K., Inamoto, S., Ariyoshi, M., and Shirakawa, M. (2009). Structural basis for recognition of H3K4 methylation status by the DNA methyltransferase 3A ATRX-DNMT3-DNMT3L domain. EMBO Rep. 10, 1235–1241.

Ping, A.J., Reeve, A.E., Law, D.J., Young, M.R., Boehnke, M., and Feinberg, A.P. (1989). Genetic linkage of Beckwith-Wiedemann syndrome to 11p15. Am. J. Hum. Genet. 44, 720–723.

Qiu, C., Sawada, K., Zhang, X., and Cheng, X. (2002). The PWWP domain of mammalian DNA methyltransferase Dnmt3b defines a new family of DNA-binding folds. Nat. Struct. Biol. 9, 217– 224.

Rada-Iglesias, A., Bajpai, R., Swigut, T., Brugmann, S.A., Flynn, R.A., and Wysocka, J. (2011). A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283.

Razin, A., and Cedar, H. (1991). DNA methylation and gene expression. Microbiol. Rev. 55, 451–458.

Razin, A., and Riggs, A. (1980). DNA methylation and gene function. Science (80-. ). 210, 604– 610.

Reik, W., Collick, A., Norris, M.L., Barton, S.C., and Surani, M.A. (1987). Genomic imprinting determines methylation of parental alleles in transgenic mice. Nature 328, 248–251.

Rhee, I., Bachman, K.E., Park, B.H., Jair, K.-W., Yen, R.-W.C., Schuebel, K.E., Cui, H., Feinberg, A.P., Lengauer, C., Kinzler, K.W., et al. (2002). DNMT1 and DNMT3b cooperate to silence genes in human cancer cells. Nature 416, 552–556.

Riggs, A.D. (1975). X inactivation, differentiation, and DNA methylation. Cytogenet. Cell Genet. 14, 9–25.

Roy, P.H., and Smith, H.O. (1973). DNA methylases of Hemophilus influenzae Rd. II. Partial recognition site base sequences. J. Mol. Biol. 81, 445–459.

Russler-Germain, D. a., Spencer, D.H., Young, M. a., Lamprecht, T.L., Miller, C. a., Fulton, R., Meyer, M.R., Erdmann-Gilmore, P., Townsend, R.R., Wilson, R.K., et al. (2014). The R882H DNMT3A mutation associated with AML dominantly inhibits wild-type DNMT3A by blocking its ability to form active tetramers. Cancer Cell 25, 442–454.

Santos, F., Hendrich, B., Reik, W., and Dean, W. (2002). Dynamic reprogramming of DNA methylation in the early mouse embryo. Dev. Biol. 241, 172–182.

Sapienza, C., Peterson, A.C., Rossant, J., and Balling, R. (1987). Degree of methylation of transgenes is dependent on gamete of origin. Nature 328, 251–254.

36 Saxonov, S., Berg, P., and Brutlag, D.L. (2006). A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters. Proc. Natl. Acad. Sci. U. S. A. 103, 1412–1417.

Shipony, Z., Mukamel, Z., Cohen, N.M., Landan, G., Chomsky, E., Zeliger, S.R., Fried, Y.C., Ainbinder, E., Friedman, N., and Tanay, A. (2014). Dynamic and static maintenance of epigenetic memory in pluripotent and somatic cells. Nature 513, 115–119.

Smallwood, A., Estève, P.-O., Pradhan, S., and Carey, M. (2007). Functional cooperation between HP1 and DNMT1 mediates gene silencing. Genes Dev. 21, 1169–1178.

Smith, Z.D., Chan, M.M., Humm, K.C., Karnik, R., Mekhoubad, S., Regev, A., Eggan, K., and Meissner, A. (2014). DNA methylation dynamics of the human preimplantation embryo. Nature 511, 611–615.

Song, J., Rechkoblit, O., Bestor, T.H., and Patel, D.J. (2011). Structure of DNMT1-DNA complex reveals a role for autoinhibition in maintenance DNA methylation. Science 331, 1036– 1040.

Stadler, M.B., Murr, R., Burger, L., Ivanek, R., Lienert, F., Schöler, A., van Nimwegen, E., Wirbelauer, C., Oakeley, E.J., Gaidatzis, D., et al. (2011). DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480, 490–495.

Suetake, I., Shinozaki, F., Miyagawa, J., Takeshima, H., and Tajima, S. (2004). DNMT3L stimulates the DNA methylation activity of Dnmt3a and Dnmt3b through a direct interaction. J. Biol. Chem. 279, 27816–27823.

Sun, F.L., Dean, W.L., Kelsey, G., Allen, N.D., and Reik, W. (1997). Transactivation of Igf2 in a mouse model of Beckwith-Wiedemann syndrome. Nature 389, 809–815.

Swain, J.L., Stewart, T.A., and Leder, P. (1987). Parental legacy determines methylation and expression of an autosomal transgene: a molecular mechanism for parental imprinting. Cell 50, 719–727.

Takeshita, K., Suetake, I., Yamashita, E., Suga, M., Narita, H., Nakagawa, A., and Tajima, S. (2011). Structural insight into maintenance methylation by mouse DNA methyltransferase 1 (Dnmt1). Proc. Natl. Acad. Sci. U. S. A. 108, 9055–9059.

Takizawa, T., Nakashima, K., Namihira, M., Ochiai, W., Uemura, A., Yanagisawa, M., Fujita, N., Nakao, M., and Taga, T. (2001). DNA methylation is a critical cell-intrinsic determinant of astrocyte differentiation in the fetal brain. Dev. Cell 1, 749–758.

Tatton-Brown, K., Hanks, S., Ruark, E., Zachariou, A., Duarte Sdel, V., Ramsay, E., Snape, K., Murray, A., Perdeaux, E.R., Seal, S., et al. (2011). Germline mutations in the oncogene EZH2 cause Weaver syndrome and increased human height. Oncotarget 2, 1127–1133.

Tatton-Brown, K., Seal, S., Ruark, E., Harmer, J., Ramsay, E., Del Vecchio Duarte, S., Zachariou, A., Hanks, S., O’Brien, E., Aksglaede, L., et al. (2014). Mutations in the DNA

37 methyltransferase gene DNMT3A cause an overgrowth syndrome with intellectual disability. Nat. Genet. 46, 385–388.

Temple, I.K., Shrubb, V., Lever, M., Bullman, H., and Mackay, D.J.G. (2007). Isolated imprinting mutation of the DLK1/GTL2 locus associated with a clinical presentation of maternal uniparental disomy of chromosome 14. J. Med. Genet. 44, 637–640.

Teter, B., Osterburg, H.H., Anderson, C.P., and Finch, C.E. (1994). Methylation of the rat glial fibrillary acidic protein gene shows tissue-specific domains. J. Neurosci. Res. 39, 680–693.

Teter, B., Rozovsky, I., Krohn, K., Anderson, C., Osterburg, H., and Finch, C. (1996). Methylation of the glial fibrillary acidic protein gene shows novel biphasic changes during brain development. Glia 17, 195–205.

Toyota, M., Ahuja, N., Ohe-Toyota, M., Herman, J.G., Baylin, S.B., and Issa, J.-P.P. (1999). CpG island methylator phenotype in colorectal cancer. Proc. Natl. Acad. Sci. U. S. A. 96, 8681– 8686.

Tsumura, A., Hayakawa, T., Kumaki, Y., Takebayashi, S., Sakaue, M., Matsuoka, C., Shimotohno, K., Ishikawa, F., Li, E., Ueda, H.R., et al. (2006). Maintenance of self-renewal ability of mouse embryonic stem cells in the absence of DNA methyltransferases Dnmt1, Dnmt3a and Dnmt3b. Genes Cells 11, 805–814.

Vanyushin, B.F., Tkacheva, S.G., and Belozersky, A.N. (1970). Rare bases in animal DNA. Nature 225, 948–949.

Varambally, S., Dhanasekaran, S.M., Zhou, M., Barrette, T.R., Kumar-Sinha, C., Sanda, M.G., Ghosh, D., Pienta, K.J., Sewalt, R.G. a B., Otte, A.P., et al. (2002). The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 419, 624–629.

Viré, E., Brenner, C., Deplus, R., Blanchon, L., Fraga, M., Didelot, C., Morey, L., Van Eynde, A., Bernard, D., Vanderwinden, J.-M., et al. (2006). The Polycomb group protein EZH2 directly controls DNA methylation. Nature 439, 871–874.

Weber, M., Davies, J.J., Wittig, D., Oakeley, E.J., Haase, M., Lam, W.L., and Schübeler, D. (2005). Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation in normal and transformed human cells. Nat. Genet. 37, 853–862.

Weber, M., Hellmann, I., Stadler, M.B., Ramos, L., Pääbo, S., Rebhan, M., and Schübeler, D. (2007). Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat. Genet. 39, 457–466.

Weksberg, R., Shuman, C., and Beckwith, J.B. (2010). Beckwith–Wiedemann syndrome. Eur. J. Hum. Genet. 18, 8–14.

Wigler, M., Levy, D., and Perucho, M. (1981). The somatic replication of DNA methylation. Cell 24, 33–40.

38 Wu, H., Coskun, V., Tao, J., Xie, W., Ge, W., Yoshikawa, K., Li, E., Zhang, Y., and Sun, Y.E. (2010). Dnmt3a-dependent nonpromoter DNA methylation facilitates transcription of neurogenic genes. Science 329, 444–448.

Wu, Z., Huang, K., Yu, J., Le, T., Namihira, M., Liu, Y., Zhang, J., Xue, Z., Cheng, L., and Fan, G. (2012). Dnmt3a regulates both proliferation and differentiation of mouse neural stem cells. J. Neurosci. Res. 90, 1883–1891.

Wyatt, G.R. (1951). Recognition and estimation of 5-methylcytosine in nucleic acids. Biochem. J. 48, 581–584.

Zhang, P., Liégeois, N.J., Wong, C., Finegold, M., Hou, H., Thompson, J.C., Silverman, A., Harper, J.W., DePinho, R.A., and Elledge, S.J. (1997). Altered cell differentiation and proliferation in mice lacking p57KIP2 indicates a role in Beckwith-Wiedemann syndrome. Nature 387, 151–158.

Zhang, X., Yazaki, J., Sundaresan, A., Cokus, S., Chan, S.W.-L., Chen, H., Henderson, I.R., Shinn, P., Pellegrini, M., Jacobsen, S.E., et al. (2006). Genome-wide high-resolution mapping and functional analysis of DNA methylation in arabidopsis. Cell 126, 1189–1201.

Zhang, Y., Jurkowska, R., Soeroes, S., Rajavelu, A., Dhayalan, A., Bock, I., Rathert, P., Brandt, O., Reinhardt, R., Fischle, W., et al. (2010). Chromatin methylation activity of Dnmt3a and Dnmt3a/3L is guided by interaction of the ADD domain with the histone H3 tail. Nucleic Acids Res. 38, 4246–4253.

Zilberman, D., Gehring, M., Tran, R.K., Ballinger, T., and Henikoff, S. (2007). Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat. Genet. 39, 61–69.

Ziller, M.J., Gu, H., Müller, F., Donaghey, J., Tsai, L.T.-Y., Kohlbacher, O., De Jager, P.L., Rosen, E.D., Bennett, D. a, Bernstein, B.E., et al. (2013). Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481.

39

Chapter 2 :

Reversible Regulation of Promoter and Enhancer Histone Landscape by

DNA Methylation in Mouse Embryonic Stem Cells

40 Introduction

Recent studies have revealed that global DNA methylation is dramatically altered during

pre- and post-implantation development (Guo et al., 2014a; Smith et al., 2014), primordial germ cell reprogramming (Gkountela et al., 2015; Guo et al., 2015; Tang et al., 2015), as well as stem

cell differentiation (Xie et al., 2013), and cellular reprogramming (Lister et al., 2011). A major

challenge in the field has been to understand how drastic changes in methylomes contribute to altered transcriptional programs associated with cell fate commitment and differentiation. The prevailing hypothesis posits that DNA methylation is a crucial silencer of pluripotency and

tissue-specific genes via promoter hypermethylation. However, gene promoters account for a

tiny fraction of the genome, and increasing evidence repudiates the obligate role for promoter

methylation in gene silencing (Bogdanovic et al., 2011; Hammoud et al., 2014; Noh et al., 2015).

For instance, pluripotency genes can be silenced during differentiation in the absence of

promoter methylation (Sinkkonen et al., 2008). Therefore, it is becoming increasingly clear that

DNA methylation works in conjunction with other factors to properly regulate gene expression.

DNA methylation and histone modifications are two mediators of epigenetic regulation.

These two marks cooperate at many times during development, including silencing of

pluripotency genes, genomic imprinting, and X chromosome inactivation (Cedar and Bergman,

2009). Currently, two models describe the relationship between DNA methylation and histone

modifications. A number of studies support the idea that DNA methylation is targeted and

patterned by histone modifications in the “follower” model, where DNA methylation acts

downstream in the regulatory hierarchy. For example, de novo DNA methyltransferases (DNMT)

are shown to recognize unmethylated histone H3 (H3K4me0) at promoters specifying methylation patterns at promoters (Guo et al., 2014b; Ooi et al., 2007; Otani et al., 2009; Zhang

41 et al., 2010). Similarly, H3K36me3 has recently been shown to target DNMT3B to gene bodies

contributing to genic methylation (Baubec et al., 2015; Morselli et al., 2015). Alternatively,

mounting evidence argues for an instructive role for DNA methylation that regulates histone

modification patterns, acting higher in the hierarchy. Under certain circumstances, DNA methylation has been found to be antagonistic to H3K27me3 at promoters. At these sites, methylated DNA is found to exclude binding of PRC2 components to their targets, providing a mechanistic basis for mutual exclusion (Bartke et al., 2010; Jermann et al., 2014). In addition,

DNMT3A enzyme was found to facilitate neurogenic gene expression through the exclusion of

Polycomb protein binding in gene bodies (Wu et al., 2010). In short, the complex relationship between DNA methylation and histone modifications remains to be illustrated, possibly in a cell- type and/or genomic region-specific manner.

To understand how DNA methylation may coordinate with histone modifications to regulate genome-wide gene expression, we and others have leveraged hypomethylated mouse embryonic stem cells (mESCs), which are viable despite complete loss of genomic DNA methylation. We have previously found that mESCs null of DNA methylation show upregulation of genes primarily associated with bivalent (H3K4me3/H3K27me3 positive) or unmarked

(H3K4me3/H3K27me3 double negative) gene promoters in wild-type cells (Fouse et al., 2008).

In contrast, minimal changes in H3K9me3 occupancy were observed in hypomethylated mESCs, leading to the idea that DNA methylation and H3K9me3 act non-redundantly (Karimi et al.,

2011). Meanwhile, H3K27me3 is dramatically redistributed in response to DNA hypomethylation (Brinkman et al., 2012; Cooper et al., 2014; Reddington et al., 2013). Despite these findings, it is still inconclusive whether the changes of histone modifications observed in

DNA methylation null mESCs are directly or indirectly caused by hypomethylation.

42 In this current study, we set out to understand how DNA methylation shapes the histone landscape and in mESCs. To study this, we employed a mESC model allowing sequential establishment of fully methylated, globally demethylated, and various intermediate

levels of hypomethylation followed by subsequent measurement of histone modifications and

RNA transcriptome across all states. Thus, our experimental setup is designed to determine

whether DNA methylation acts upstream or downstream with respect to select histone

modifications. We show that DNA methylation regulates occupancy of both H3K27me3 and

H3K27ac at gene promoters and tissue-specific enhancer elements. Furthermore, changes in

H3K27me3 and H3K27ac histone modifications are reversed upon DNMT reconstitution, dependent on DNMT catalytic activity, indicating the ability of DNA methylation to outcompete established chromatin states.

RESULTS

Dnmt Reconstitution in Demethylated mESCs Restores Global Cytosine Methylation

and Causes Various Changes in Histone Modifications

To dissect causal relationships between DNA methylation and histone modifications we

simultaneously knocked-out all three DNA methyltransferases, Dnmt1, Dnmt3a, and Dnmt3b via

Cre-lox recombination to generate triple Dnmt knockout (TKO) mouse ES cells that are

completely devoid of DNA methylation after several cell passages (Figure 2.1A). By contrast,

double knockout of the de novo DNA methyltransferases Dnmt3a and Dnmt3b (DKO) leads to

slower global demethylation (presumably due to the robust Dnmt1 maintenance enzyme) and

reaches 90% loss of global methylation by 30 passages (Jackson et al., 2004) (Figure 2.1B). In

these two demethylated mESC systems, we then reconstituted Dnmt3a1, Dnmt3a2, Dnmt3b1

43 isoforms individually (Figure 2.1A and Figure 2.8A). Reconstitution of the de novo DNMTs in

TKO mESCs resulted in increased global methylation to approximately half of WT levels

(Figure 2.1B). In DKO mESCs, reconstitution of Dnmt3a1, Dnmt3a2, or Dnmt3b1 led to a greater increase of methylation to >70% WT levels, indicating a significant contribution from

Dnmt1. Profiling the DNA methylomes of TKO and DKO reconstitution cell lines using reduced representation bisulfite sequencing (RRBS) showed global levels of cytosine methylation consistent with mass spectrometry results (Figure 2.1B and Figure 2.8B). Together, these cell lines enable the study of relationships between varying global methylation levels and histone occupancy.

Mapping average methylation across all genes, we find similar methylation distribution patterns, but different amplitudes proportional to global methylation levels (Figure 2.8C).

Inspection of individual CpG sites revealed that the amplitude differences between cell lines are explained by cytosines being partially methylated rather than a skewed distribution of fully methylated and unmethylated CpGs (Figure 2.8B). Together, our data suggests that Dnmt3a and

Dnmt3b isoforms are all capable of shaping the overall DNA methylome patterning.

Using ChIP-seq, we profiled genomic occupancy of H3K4me3, associated with active

promoters, H3K27me3 associated with repressive chromatin, H3K27ac, associated with active

promoters and enhancers (Creyghton et al., 2010), and H3K4me1, associated with regulatory

regions including enhancers (Barski et al., 2007; Heintzman et al., 2007). Comparison across

wild-type (WT), TKO, DKO, and Dnmt reconstitution cell lines reveals selective global changes

in specific histone modifications. Genome-wide correlation between all histone modifications

and cell lines revealed that correlation generally exists among each histone modification. For

44 example, H3K4me3 across samples are highly correlated (Pearson correlation, r > 0.96) (Figure

2.1C), indicating that H3K4me3 is largely unaffected by global hypomethylation.

Among H3K27ac datasets, inter-sample correlation is also high (r ~ 0.93), however there may be few site-specific differences resulting from hypomethylation. By contrast, H3K27me3 appears to be most sensitive to varying levels of global DNA methylation. WT have 0.63 and

0.65 Pearson correlation with TKO and DKO respectively, >0.7 correlation with reconstitution in

TKO, and >0.8 correlation with reconstitution in DKO (Figure 2.1C, lower left corner).

Correlations are also seen between histone modifications, for example between H3K4me1 and

H3K27ac (r ~ 0.40) and between H3K27me3 and H3K4me3 in WT (r ~ 0.18), suggesting co- localization in a significant portion of the genome. Epigenetic profiling across mESC lines of different global methylation states is shown at several genomic loci, illustrating histone modification changes sensitive to DNA methylation (Figure 2.1D). Overall, this indicates a broad role for DNA methylation in influencing histone modifications, with different histone modifications showing different relationships.

DNA Methylation is Required for Maintenance and Reestablishment of Promoter

H3K27me3

As we observed the greatest global variation in H3K27me3 occupancy in response to

DNA hypomethylation, we aimed to further investigate the relationship between DNA

methylation and H3K27me3 at promoters. We first organized promoters by chromatin

environment, categorizing promoters into H3K4me3+, bivalent, H3K27me3+, and silent devoid of both H3K4me3 and H3K27me3 histone modifications (Figure 2.2A). Consistent with previous studies, we find that H3K27me3 is reduced at nearly all bivalent promoters

45 (3,231/5,362 = 60.3%) in demethylated mESCs (Brinkman et al., 2012; Cooper et al., 2014)

(Figure 2.2A and 2.2B). The loss of H3K27me3 at bivalent promoters is also associated with a

small gain in H3K27ac and loss of H3K4me1 at a subgroup of promoters (Figure 2.9A-C). In contrast, for a subgroup of silent promoters (H3K4me3 and H3K27me3 negative) (705/9,068 =

7.8%), we find increased H3K27me3 occupancy in the demethylated state. For example,

promoters of Cdkn2a and Pdgfb are bivalently marked and upon demethylation, entirely lose

H3K27me3 (Figure 2.2C). In the case of Cdkn2a, the promoter is seen to gain H3K27ac. In

contrast, Trpv1, which is devoid of H3K4me3 and H3K27me3, gains H3K27me3 upon

demethylation (Figure 2.2C). Interestingly, this increase of H3K27me3 extends to adjacent

regions including the gene body. Together, these data indicate that DNA methylation is

necessary for the maintenance of normal promoter H3K27me3 patterns in a context-specific

manner.

We next asked whether DNA methylation can reestablish normal H3K27me3 patterns by

reconstituting Dnmt in demethylated mESCs. Expression of Dnmt3a1, Dnmt3a2, and Dnmt3b1

in TKO and DKO mESCs restore variable levels of H3K27me3 occupancy around promoters

(Figure 2.2A and 2.2D). At bivalent promoters, Dnmt3a1 and Dnmt3a2 appear to have a slightly stronger impact in restoring H3K27me3 compared to Dnmt3b1. Nonetheless, neither of the three enzymes alone could fully recapitulate WT patterns when reconstituted in TKO mESCs (Figure

2.9D). By contrast, reconstitution of Dnmt3a1, Dnmt3a2, and Dnmt3b1 in DKO mESCs all show greater H3K27me3 rescue, particularly Dnmt3a1, which shows H3K27me3 patterns/levels phenocopying WT (Figure 2.2A and 2.2D). Meanwhile, at silent promoters, where demethylation leads to increased levels of H3K27me3 occupancy, we also observe differences in rescue between TKO and DKO reconstitution cell lines. Whereas TKO Dnmt reconstitution

46 appears to have weak rescue of H3K27me3 at silent promoter, all Dnmt isoforms in DKO are able to restore H3K27me3 down to near WT levels (Figures 2.2D and 2.9D). These data show the ability of reconstituted DNMT to reestablish WT histone modification patterns, indicating

DNMT’s ability to outcompete established chromatin states (i.e. the demethylated epigenome) at

gene promoters.

Impact of DNA Methylation on Promoter H3K27me3 and Gene Expression Differs

Between Bivalent and Silent Promoters

To determine the role of DNA methylation in regulation of histone modifications at

promoters, we quantified methylation in different categories of promoters (Figure 2.2E, top).

Bivalent promoters are hypomethylated with an average methylation fraction of 0.17, while

silent promoters are hypermethylated with an average of 0.88 (Figure 2.2E). As CpG islands are

frequently found at gene promoters and are hypomethylated, we measured the CpG content of

promoters in each category, represented as CpG observed-to-expected ratio (CpG O/E). We find bivalent promoters are CpG-rich (O/E > 0.6), consistent with the association of bivalent promoters with CpG islands (Bernstein et al., 2006). Silent promoters that gain H3K27me3 are characterized by low CpG content (O/E < 0.4), consistent with low CpG-content promoters

(LCPs) being predominately methylated in mESCs (Fouse et al., 2008; Weber et al., 2007).

Analysis of DNA methylation in KO and reconstitution cell lines show loss and recovery of WT methylation levels and patterns (Figure 2.2E, bottom). Bivalent promoters, such as Cdkn2a and

Pdgfb, showed a characteristic hypomethylated CpG island surrounding the TSS, while silent promoters, such as Trpv1, were heavily methylated (Figure 2.2C and 2.2E).

47 Comparing H3K27me3 occupancy at the TSS with global 5mC between cell lines reveals

strong associations between DNA methylation and promoter H3K27me3 (Figure 2.2F).

H3K27me3 positively correlates with global DNA methylation at bivalent promoters (Pearson

correlation, r = 0.84) whereas H3K27me3 is anticorrelated with global DNA methylation at

silent promoters (r = 0.95) (Figure 2.2F). These data suggest a predictive ability of global DNA

methylation levels on H3K27me3 occupancy in two promoter contexts.

To test whether alterations in histone modification resulting from demethylation changed gene expression, we profiled the transcriptome from all cell lines. Genes with bivalent promoters

generally had low expression (RPKM < 10), but significantly increased (mean RPKM increase

from 3.9 to 4.8) upon demethylation (Figure 2.2G). Alternatively, silent promoters were not

expressed (RPKM < 1) and on average remained silenced despite demethylation. In the case of

bivalent promoter gene Cdkn2a, we saw gene expression increase over 2-fold in DKO with

resuppression upon remethylation in Dnmt reconstitution mESCs.

DNA Methylation Maintains and Reestablishes Silent or Primed States at Enhancer

Elements

Differential DNA methylation is prevalent at tissue-specific enhancers (Andersson et al.,

2014; Elliott et al., 2015; Hon et al., 2013; Stadler et al., 2011; Thurman et al., 2012; Ziller et al.,

2013), suggesting a regulatory role for DNA methylation at enhancers elements. We therefore extended our analysis to enhancers, analyzing H3K27ac/me3 and H3K4me1 modifications

(Heintzman et al., 2007). We predicted that genomic hypomethylation would lead to activation of silent enhancers either as appearance of active enhancers (marked with both H3K27ac and

48 H3K4me1) or appearance of poised enhancers (marked with H3K27me3 and H3K4me1) in TKO

(Figure 2.3A) (Creyghton et al., 2010; Rada-Iglesias et al., 2011; Zentner et al., 2011).

To identify putative methylation-dependent enhancer sites, we identified differentially

enriched peaks between demethylated DKO/TKO and their respective methylated wild-type

mESCs. Differential peaks were overlapped with H3K4me1 and those within 2 kb of known TSS

were excluded to avoid identification of gene promoters. Identifying differential peaks common

to both DKO and TKO yielded a number of changes in the chromatin state of candidate

enhancers upon genomic hypomethylation (Figure 2.3A). For example, we observed 138 (1.3%)

and 551 (5.2%) peaks gaining and losing H3K27ac respectively. Similarly, we found 1,041

(46.5%) peaks gained and 162 (7.2%) peaks lost H3K27me3 in response to global demethylation. We noted that candidate enhancers that become active or poised in Dnmt KO mESCs predominately arise de novo from previously primed enhancer loci in WT mESCs

(Figure 2.3B). A few new active or poised enhancers derive from the other indicating that methylation does not regulate transitions between active and poised status.

We then sought to determine whether or not remethylation would be able to return the enhancers to their original state. If chromatin state took precedence over DNA methylation, we would expect histone modification status of reconstituted cell lines to resemble the demethylated state. On the other hand, if DNA methylation is capable of acting upstream in the hierarchy, the chromatin state would be reversible. We find that for both H3K27ac and H3K27me3, reestablishing DNA methylation returns putative enhancers to the wild-type state, exemplified by

de novo intergenic active enhancer upstream of Foxd3 as previously characterized in TKO

mESCs (Domcke et al., 2015), and an intragenic poised enhancer in Tmem104 (Figure 2.3C).

The histone modification changes characterized in demethylated DKO and TKO mESCs were

49 uniformly reversible across all enhancers identified (Figures 2.3D and 2.10). Similar to rescue at promoters, Dnmt reconstitution in DKO more efficiently reestablishes wild-type patterns of enhancers compared to in TKO (Figure 2.3D).

Enhancers that tend to lose H3K27ac or H3K27me3 are relatively hypomethylated in WT mESCs, whereas enhancers that gain H3K27ac or H3K27me3 are relatively hypermethylated

(Figure 2.3E). This is consistent with antagonism between modifications of the H3K27 residue with DNA methylation (Bartke et al., 2010; Jermann et al., 2014). Measuring CpG content, all enhancers sensitive to DNA methylation consistently show low CpG O/E (O/E < 0.4) (Figure

2.3E). In contrast with CpG-rich bivalent promoters that lose H3K27me3, poised enhancers losing H3K27me3 are low in CpG content.

Lastly, to test whether chromatin state at enhancers alters gene expression, we measured gene expression at both genes nearest to each enhancer (data not shown) as well as a subset of promoters with previously confirmed Hi-C interactions with DNA methylation-sensitive enhancers we identified (Shen et al., 2012). Expression changes at genes associated with methylation-sensitive enhancers varied in their response to demethylation where each category included genes up- and down-regulated (Figure 2.3F). Median fold change within enhancer categories indicated nearly equal up- and downregulation, however several genes exhibit expression patterns indicative of regulation by enhancer methylation. In the case of Elmo1, gene expression increased greater than 2-fold in association with a de novo active enhancer within the first intron (Figure 2.3G). Reconstitution of Dnmt also returned both expression of Elmo1 as well as the intragenic enhancer to wild-type levels. Together, this shows that there are hundreds of enhancers in mESCs where DNA methylation is the primary factor in determining the modification status of chromatin.

50

De Novo Enhancers Are Tissue-Specific and Contain Methylation-Sensitive

Transcription Factors

To characterize DNA methylation-sensitive enhancers, we cross-referenced these

enhancers with previously identified tissue-specific enhancers to assess their identity (Shen et al.,

2012) (Figure 2.4A). Of the active enhancers that significantly gain or lose H3K27ac, 33.3%

(46/138) and 40.8% (225/551) overlapped with tissue-specific enhancers respectively. In both categories, mESC-specific enhancers were found, with nearly two-thirds of active enhancers

losing H3K27ac predictably consisting of mESC-specific enhancers. De novo active enhancers

that gained H3K27ac showed overlap with several testes, E14.5 liver, placenta, and kidney-

specific enhancers with the remainder divided between 14 other tissues (Figure 2.4A). Both

enhancers that gain and lose H3K27me3 also overlapped with tissue-specific enhancers showing

33.2% (346/1,041) and 23.5% (38/162) overlap respectively. Interestingly, enhancers with de

novo poised enhancers overlap with enhancers from a variety of different tissue types.

Meanwhile, enhancers losing H3K27me3 are predominately specific to testes (76%), with a

smaller contribution from other tissue types.

To determine how DNA methylation regulates enhancers we assessed DNA methylation-

sensitive enhancers for enrichment of transcription factor motifs. We focused on enhancers

gaining H3K27ac or H3K27me3, representing de novo active or poised enhancers, as sites losing

these marks are already active/poised and are predictably enriched in pluripotency factors

(Figure 2.11). We find that de novo active enhancers (gain H3K27ac in Dnmt KO) are strongly

enriched in motifs for NRF1 (16%) and Nkx family transcription factors (87%) (Figure 2.4B).

Interestingly, NRF1 was recently described to be a DNA methylation-dependent transcription

51 factor where methylation of CpG within the binding motif affected binding to DNA (Domcke et

al., 2015). In our data, we identified several instances of de novo methylation-dependent active

enhancers with underlying methylated NRF1 motifs (Figure 2.4C). For example, at a testes- specific enhancer upstream of Zfp92, we find a DNA methylation-dependent peak with adjacent

Nkx2.5 and NRF1 motifs. NRF1 at this enhancer is methylated at an intermediate level (0.29-

0.57) in mESCs and is specifically demethylated (<0.05) in adult germline stem cells (AGSCs) of the testis (Hammoud et al., 2014). Similarly, a placenta-specific enhancer is found intragenically in Ears2 where a local NRF1 motif is methylated (0.83) in mESCs. The motif for

ZBTB33 (also known as Kaiso), a transcriptional regulator described to bind methylated CGCG motif in vitro, was also found at 6 sites (4.4%).

At de novo poised enhancers (gain H3K27me3 in Dnmt KO) we find enrichment of several motifs containing the canonical E-box sequence CACGTG as well as motifs for pluripotency factor Esrrb (Figure 2.4B). Notably, and Max were also identified as putative methylation-dependent transcription factors (Domcke et al., 2015). Together, these data show

that DNA methylation can act upstream of modified H3K27, likely by means of modulating

binding of DNA methylation-dependent transcription factors.

Regulation of H3K27me3 Depends on 5-Methylcytosine and DNMT Catalytic

Activity

In understanding the relationship between DNA methylation and histone modifications, it

is important to distinguish between effects directly conferred by 5-methylcytosine (5mC),

DNMT-protein interactions, or other indirect effects. To test whether the methylcytosine moiety

of DNA itself or DNMT protein is responsible for reestablishing H3K27me3 at gene promoters,

52 we reconstituted a catalytically null mutant of Dnmt3b1 (Dnmt3b1MUT) that contains missense

mutations in the essential PC motif in DKO mESCs. Promoter H3K27me3 in the reconstitution with catalytic mutant resembled DKO at both bivalent and silent promoters. This demonstrates

that the ability of DNMT to reestablish H3K27me3 patterns in TKO and DKO mESCs is

dependent on DNMT catalytic activity and 5mC (Figures 2.5A and 2.5B). Similarly, poised

enhancer elements gaining/losing H3K27me3 also show inability of Dnmt3b1MUT to rescue

H3K27me3 redistribution compared to wild-type Dnmt3b (Figure 2.5C). For example,

Dnmt3b1MUT showed lack of regulation of H3K27me3 at promoters of bivalent Cdkn2a and silent Trpv1, as well as intragenic de novo poised enhancer in Tmem104 and intergenic poised enhancer lost upstream of Foxp4 (Figure 2.5D). This is consistent with previous study of

Dnmt3a1 catalytic mutants in neural stem cells, which show an inability to reverse

PRC2/H3K27me3 occupancy (Wu et al., 2010).

DNA Methylation Regulation of H3K27me3 is Mediated by PRC2 Targeting

We have shown that both promoters and enhancers prone to gaining H3K27me3 are

heavily methylated in WT, consistent with a direct antagonism between DNA methylation and

H3K27me3. To test whether this direct relationship extends to the respective catalytic enzymes, we analyzed published genome binding data of DNMT3A2, DNMT3B1, and SUZ12 (Baubec et al., 2015; Cooper et al., 2014). Similar to measurements of DNA methylation profiles at promoters and enhancers, we find that DNMT3A2 and DNMT3B1 are preferentially bound to silent promoters and de novo poised enhancers that both gain H3K27me3 in TKO (Figures 2.6A and 2.6B). Likewise, SUZ12 binding mirrors H3K27me3 occupancy, where SUZ12 binding increases at silent promoters in TKO (Cooper et al., 2014). In a similar manner, SUZ12 increases

53 strongly at de novo poised enhancers (Figures 2.6A and 2.6B). The strong presence of DNMT

and 5mC at silent promoters and de novo poised enhancers that gain SUZ12 and H3K27me3

upon demethylation suggests direct antagonism between DNA methylation and PRC2 complex

as previously described in NPCs (Wu et al., 2010). In contrast, bivalent promoters are

unmethylated and lack DNMT binding, implying an indirect mechanism for regulating

H3K27me3 and PRC2.

DNMT3 Isoform Differences in Histone Modification Regulation

For the two catalytically active de novo methyltransferases, multiple isoforms have been

characterized. These isoforms are expressed in a tissue and developmentally-regulated manner.

In mESCs, Dnmt3a2 and Dnmt3b1 are the major isoforms, while Dnmt3a1 is expressed at a lower level and increases in expression throughout development (Chen et al., 2002; Feng et al.,

2005). Methylation patterns are determined through differential expression of Dnmt3a and

Dnmt3b isoforms (Chen et al., 2003), but are redundant as certain targets such as Oct4 and

Nanog are still able to be methylated in single knockouts (Li et al., 2007). We therefore asked whether different Dnmt3 isoforms have unique or shared targets across the genome. Segmenting the genome into 500 bp windows, we found that the vast majority of methylated windows (26M bases) can be methylated by all three isoforms (90%). Interestingly, we do identify a tiny fraction

of the genome (in total about 200 kb or <1%) that appear either Dnmt3a1-specific, Dnmt3a2- specific, or Dnmt3b1-specific. Therefore, the three Dnmt3 isoforms have highly redundant genomic targets.

We next evaluated the effect on histone modification by different Dnmt isoforms. One challenge in measuring these differences is the inherent lack of quantitative ability of

54 conventional ChIP-seq, precluding accurate and quantitative comparison between isoforms. We therefore compared histone modifications between Dnmt reconstitution lines, and performed comparison with reference to background to account for systematic differences between ChIP- seq libraries.

Evaluation of H3K27me3 at gene promoters showed that all DNMT isoforms studied were able to regulate H3K27me3 and reestablish WT patterns to some extent indicating redundancy in their function. This is also evident in the ability of different isoforms to reestablish

similar global methylation levels (Figure 2.1B). However, of the three enzymes, Dnmt3b1

reconstitution showed the least recapitulation towards WT despite similar ability to remethylate the genome. Measuring the ratio of H3K27me3 occupancy between Dnmt3 isoforms at bivalent promoters, we find that Dnmt3a1 and Dnmt3a2 restore H3K27me3 to higher levels than

Dnmt3b1 (Figures 2.12A-B). Similarly, at silent promoters, both Dnmt3a2 and Dnmt3a1

resuppress H3K27me3 to a greater extent than Dnmt3b1 (Figures 22D and 2.12A-B). At

enhancers, we found Dnmt3a1 and Dnmt3a2 show greater recovery of H3K27me3 at poised

enhancers losing H3K27me3 and greater resuppression of H3K27me3 at de novo poised

enhancers than Dnmt3b1 (Figures 2.12C-D). Likewise, comparison of H3K27ac changes

between reconstitution lines showed that both Dnmt3a2 and Dnmt3a1 restored WT patterns more than Dnmt3b1 (Figure 2.12E). Overall, these comparisons indicate a potential bias between

Dnmt3a and Dnmt3b in the regulation of histone modifications at promoters and enhancer elements.

55 Discussion

In the current study, we combined sequential genetic alteration of DNA methylation

levels with systematic and comprehensive mapping of histone modifications to assess how DNA

methylation influences the epigenomic and transcriptomic landscape. Our experiments show the

active histone modification H3K4me3 acts independently of DNA methylation, while DNA

methylation acts upstream of H3K27me3, H3K27ac, and H3K4me1, which have regulatory

consequences at gene promoters and enhancers. Another key finding in our study was the

reversibility of the histone landscape in response to newly established methylcytosine marks.

Thus, DNA methylation is situated within a hierarchy of epigenetic regulation where it is

required for regulation of some histone modifications.

Measurement of histone modifications showed different responses to genome-wide

demethylation. For example, H3K4me3 did not change in response to global hypomethylation.

This is in contrast to somatic cells, where hypomethylation results in appearance of new

H3K4me3 peaks, albeit far fewer than H3K27me3 (Reddington et al., 2013). This indicates that

DNMT acts downstream of H3K4me3 in ESCs. The downstream position of DNA methylation is

consistent with mechanistic studies of de novo methylation showing exclusion of DNMT binding

to sites marked with H3K4me3 (Noh et al., 2015; Ooi et al., 2007; Otani et al., 2009; Zhang et

al., 2010). H3K4me1 on the other hand showed subtle changes at promoters in Dnmt KO

indicating sensitivity to DNA methylation. Binding affinity of H3K4me1 to DNMT3A-ADD and

DNMT3L was over an order of magnitude greater than H3K4me3, indicating methylation states of H3K4 differ in their relationship with DNA methylation (Noh et al., 2015; Ooi et al., 2007).

H3K4me3 therefore belongs to a list of histone modifications including H3K9me3 and

H3K36me3 that dictate DNMT localization in ESCs.

56 One model for the relationship between H3K27me3 and DNA methylation is that EZH2 targets DNMT for de novo methylation. The preferential de novo methylation at Polycomb targets in many cancers lends credence to this notion (Gal-Yam et al., 2008). EZH2 and

DNMT1/3A/3B were shown to physically interact, where EZH2 is required for DNMT binding at EZH2 targets in HeLa cells (Viré et al., 2006). Based on this model, DNMTs would not influence H3K27me3 as PRC2 acts upstream of DNMT. However, in contrast to this model, we find that loss of DNA methylation affects the maintenance of H3K27me3 patterns, which are then able to be reestablished with DNMT reconstitution, indicating that DNA methylation functionally acts upstream of PRC2 activity. DNA methylation acting upstream of histone modifications is not unprecedented, where DNA methylation has been shown to target H3K9 methyltransferase activity in fibroblasts (Fuks et al., 2003). Recently, DNA methylation was found to recruit SETDB1 and consequently H3K9me3 to key developmental genes in preadipocytes (Matsumura et al., 2015). However, in ESCs, DNA methylation is not required for

H3K9me3 deposition (Karimi et al., 2011). These differences could be due to intrinsic differences between embryonic and somatic cells or particular cell states.

We found the relationship between DNA methylation and H3K27me3 is dependent on genomic context. The negative correlation at silent promoter is consistent with the antagonism between DNA methylation and H3K27me3 (Bartke et al., 2010; Fouse et al., 2008; Mikkelsen et al., 2007). However, the positive correlation at bivalent promoters is unexplained. Bivalent promoters are unmethylated precluding a direct influence of methylcytosine on PRC2 at these sites. The leading thought is that antagonism between DNA methylation and PRC2 constrains

H3K27me3 within CpG islands, which is lost in TKO cells (Brinkman et al., 2012; Reddington et al., 2013). In this case, in cells lacking DNA methylation, H3K27me3 is aberrantly deposited

57 to regions normally restricted by DNA methylation, resulting in depletion of H3K27me3 at sites

such as bivalent promoters. In other words, DNA demethylated regions act as a sink for PcG activity. Alternatively, PRC2 has been shown to bind as a default state to CpG-rich DNA, while being obstructed by active gene expression (Jermann et al., 2014; Riising et al., 2014). DNA methylation could therefore influence gene expression of bivalent genes and consequently regulate H3K27me3. The order of events between PRC2 binding and gene activity are not yet known, preventing evaluation of such a mechanism. It remains of great interest mechanistically how DNA methylation regulates H3K27me3 at bivalent promoters.

The relationship between DNA methylation and chromatin state at enhancers has been largely been based on association with little indication of order of events (Andersson et al., 2014;

Elliott et al., 2015; Hon et al., 2013; Ziller et al., 2013). Our data showing de novo gain of

H3K27me3 and H3K27ac at enhancers upon demethylation demonstrates DNA methylation’s regulation of chromatin state. This is in agreement with previous work showing loss of H3K27ac resulting from hypermethylation by Tet2 knockout (Hon et al., 2014; Lu et al., 2014). Notably, our findings showed reversibility of H3K27me3 and H3K27ac changes at both promoters and enhancers. This indicates that DNA methylation can establish silent chromatin states, even over preexisting active states.

At de novo active enhancers, DNA methylation most likely regulates H3K27ac at active enhancers through modulation of transcription factor binding. Recent results showed DNA

methylation competes with binding of transcription factor NRF1, where changes in NRF1

binding were reversible between 2i and Serum conditions (Domcke et al., 2015). We identified

the NRF1 binding motif enriched at methylation-sensitive active enhancers, indicating regulation

of “settler” transcription factor binding as a target of regulation by DNA methylation. We also

58 identified motifs for Kaiso, a factor that binds methylated CpG in vitro and recruits deacetylation

complex NCoR to repress gene expression (Yoon et al., 2003). The huge diversity in transcription factors and their individual relationships with DNA methylation represent many

potential mechanisms underlying methylation-dependent changes in histone modifications.

Several models describing the relationship between DNA methylation and transcription factors

have been previously reviewed (Blattler and Farnham, 2013).

Similarly, poised enhancers showed reversible regulation by DNA methylation. Studies

supporting the idea that the three-dimensional architecture of the genome is structured into dense

polycomb interfaces (so-called polycomb bodies) have shown evidence of higher-order

interactions among H3K27me3 enriched sites (Joshi et al., 2015; Li et al., 2015; Wani et al.,

2016). It remains to be seen whether DNA methylation can control higher-order interactions via

regulation of histone modifications and other structural proteins.

Another question in the field that lacks understanding is what differences exist between

Dnmt isoforms. Early evidence suggested strong redundancy between isoforms in ESCs as single

gene knockouts of either Dnmt3a or Dnmt3b showed minimal changes in DNA methylation

(Chen et al., 2003; Okano et al., 1999). Knockdown of Dnmt isoforms revealed hypermethylation of highly-transcribed gene bodies upon Dnmt3b knockdown whereas Dnmt3a knockdown resulted in global hypomethylation (Tiedemann et al., 2014). DNMT3B was also found to selectively localize to gene bodies of highly transcribed genes through H3K36me3, while DNMT3A showed decreased methylation of highly transcribed genes (Baubec et al.,

2015). In our results, we find a dominant role for Dnmt3a isoforms over Dnmt3b in reestablishing H3K27me3 patterns at both promoters and enhancers. This provides some clues

59 that differential expression of Dnmt isoforms may play a role in regulating chromatin state and

gene expression, but much still remains to be studied.

In sum, this research extends the role of DNA methylation and provides a comprehensive

resource benefitting the understanding of how alterations in DNA methylation sculpt the rest of

the epigenome. Our results present a model where DNA methylation plays an active role in

establishing chromatin states in addition to its classic roles in promoter silencing. Thus, DNA

methylation is important yet underappreciated in regulation of chromatin across the mammalian

genome.

Experimental Procedures

ES Cell Line Generation

Mice containing flanking loxP sites in exons 4 and 5 of Dnmt12loxP (Jackson-Grusby et

al., 2001), exons 19 of Dnmt3a2loxP (Kaneda et al., 2004), and exons 16-19 of Dnmt3b2loxP loci

(Dodge et al., 2005; Lin et al., 2006), were crossed and bred to establish a line of triple floxed

(TFX) mice. mESCs were derived from TFX mouse embryos. All three DNA

methyltransferases, Dnmt1, Dnmt3a, and Dnmt3b, were simultaneously knocked-out via Cre/lox recombination to generate triple knockout (TKO) mESCs. Previous Dnmt triple knockouts either utilized knockout of Dnmt3a/3b in combination with knockdown of Dnmt1 or sequential knockout of de novo Dnmt3a/3b then Dnmt1 (Meissner et al., 2005; Tsumura et al., 2006).

Double knockout (DKO) mESC lines with Dnmt3a and Dnmt3b knock-out are as previously described (Okano et al., 1999). For reconstitution, Dnmt isoforms were cloned into pCAG-IRES-

Blast vector at the EcoRI cloning site as previously described (Chen et al., 2003). Briefly, constructs were transfected by electroporation with Gene Pulser II (Bio-Rad) and selected with 5

60 ug/mL Blasticidin. Individual clones were picked and expanded for a minimal of five passages to

generate stable reconstitution cell lines. In total, the following cell lines were generated for this

study: TKO, TKO+Dnmt3a1, TKO+Dnmt3a2, TKO+Dnmt3b1, DKO+Dnmt3a1,

DKO+Dnmt3a2, DKO+Dnmt3b1, and DKO+Dnmt3b1MUT.

ES Cell Culture

Mouse embryonic stem cells (mESCs) were grown on irradiated mouse embryonic

fibroblasts (MEF) and maintained in DMEM (Thermo) supplemented with 15% fetal bovine

serum, 1 mM GlutaMAX (Life Technologies), 1X Non-essential amino acids (NEAA), 1X

Penicillin/Streptomycin (Life Technologies), mLIF, and 0.001% β-mercaptoethanol (Sigma).

Chromatin Immunoprecipitation

ChIP was performed using protocol as described previously (Wang et al., 2005), with minor modification. Briefly, mESCs were pre-plated three times to deplete MEF feeders before

collection. Briefly, 2x107 cells were fixed with 1% formaldehyde for 8 min after which

chromatin was isolated and sonicated to 150-300 bp using Bioruptor (Diagenode). 5x106 cells worth of sonicated chromatin was precleared with Protein A/G Dynabeads (Life Technologies) for 1 hour then incubated in antibody at 4°C overnight. Antibodies for immunoprecipitation were all ChIP grade and validated previously in ENCODE datasets

(http://compbio.med.harvard.edu/antibodies/about) (Egelhofer et al., 2011). 2 ug of anti-

H3K4me3 antibody (Abcam ab8580), 5 ug of anti-H3K27me3 antibody (Active Motif 39155), 5 ug of anti-H3K27ac antibody (Active Motif 39133), and 5 ug of anti-H3K4me1 antibody

(Abcam ab8895) were used for ChIP. Chromatin was then immunoprecipitated with Protein A/G

61 Dynabeads for > 2 hours. ChIP was washed successively by two rounds of low-salt buffer, two rounds of high-salt buffer, two rounds of LiCl buffer, and finally two rounds of TE buffer. ChIP

DNA was eluted in TE, reverse crosslinked overnight at 65°C, and purified by incubation with

RNase A, then proteinase K, and finally phenol-chloroform extraction and ethanol precipitation.

ChIP-seq and Analysis

ChIP DNA was quantified using Qubit Fluorometer (Life Technologies) and approximately 10 ng were used for libraries construction using Kapa Library Preparation Kit for

Illumina sequencing (Kapa Biosystems) following manufacturer’s recommendations. ChIP-seq libraries were quantified using Qubit Fluorometer and Bioanalyzer High Sensitivity DNA

Analysis Kit (Agilent). Libraries were pooled and sequenced using the Illumina HiSeq 2000 machine as either 50-bp or 100-bp single-end sequencing reads. Sequenced reads were trimmed and mapped to mouse genome (mm9) using bowtie 1.1.0 allowing up to two mismatches and only unique alignments using the following options: “–v2 –m1 --best” (Langmead et al., 2009) .

Mapped reads were processed using Homer tool suite v4.7 where reads were extended by fragment length and normalized to 1x107 tags per sample (Heinz et al., 2010). Peak calling was done using Homer’s findPeaks function using input to filter peaks. Parameters for H3K4me3 included “-F 8”; H3K27me3 included “-size 1000 -minDist 3000 -F 4 -tagThreshold 32”;

H3K27ac included “-F 4”; H3K4me1 included “-size 1000 -minDist 1000 -nfr”. Peaks were further filtered using hard cut-offs for average reads/bp to retain only the most robust peaks.

Global correlation analysis was performed by first dividing the mouse genome into non- overlapping 1-kb windows using the BEDTools suite (Quinlan and Hall, 2010) and measuring coverage of ChIP-seq tags, which were extended by the approximate fragment length of 200 bp.

62 A matrix containing tag counts in 1 kb bins for each histone modification in each cell type was

then arranged in R, where empty bins were eliminated and counts were normalized across

samples. Matrix of pair-wise Pearson correlations were then calculated and used to generate

correlation heatmap.

Promoter categories were identified using overlap of H3K4me3 and H3K27me3.

Candidate active and poised enhancers were identified by first identifying differential peaks for both H3K27ac and H3K27me3 in J1-DKO and TFX-TKO pairs. Differential peaks identified in

both were then overlapped with H3K4me1 peaks and excluded from promoter regions. Promoter

and enhancer analysis was performed by identifying genomic regions and quantifying

normalized tags between samples within genomic regions in bed format. Scatterplot and boxplot

analyses used normalized tag counts summed over 2 kb from the center of genomic features of

interest. Heatmaps were generated from Homer -ghist output and visualized using Java

TreeView (Saldanha, 2004).

Reduced Representation Bisulfite Sequencing

Genomic DNA was purified from mESCs using standard phenol-chloroform-isoamyl alcohol extraction followed by ethanol precipitation. RRBS libraries were generated starting

from 400-800 ng of genomic DNA as previously described by Meissner et al. with minor

modifications. Briefly, MspI- generated fragments were end-repaired and adenylated before

being ligated with Illumina TruSeq adapters. DNA purifications after each enzymatic reaction as

well as size-selection of adapter-ligated fragments ranging from 200-350 bp were carried out

using AMPure XP beads (Beckman Coulter). Bisulfite conversion was performed with EpiTect

kit (Qiagen). In order to optimize the efficiency, bisulfite conversion, carried out as described by

63 the manufacturer, was performed twice. Bisulfite-converted libraries were amplified using

MyTaq Mix (Bioline) with the following program: 98°C for 2min, (98°C for 15sec, 60°C for

30sec, 72°C for 30sec) 12 cycles, 72°C for 5 min, 4°C indefinitely. Libraries were sequenced using Illumina HiSeq 2000 machine as 100-bp single-end sequencing reads.

DNA Methylation Analysis

Sequenced reads were trimmed and mapped to mouse genome (mm9) using Trim Galore and Bismark. Methylation state of individual CpGs were determined using Bismark Methylation

Extractor and used to generate bedGraph tracks. Cytosines that were covered by at least 5 reads were kept for downstream analysis. Cytosine coverage maps were generated using Bismark’s coverage2cytosine function and converted to tag format using Homer’s makeTagDirectory function with options –format bismark –mCcontext CG –precision 2. Definition of global 5mC levels was calculated by summation of average methylation per site and dividing by total cytosines. Analysis of methylation at specific genomic features was performed using Homer’s annotatePeaks function with –ratio option. Methylation at bivalent and silent promoters was calculated by averaging fractional CpG methylation across 2 kb surrounding all bivalent and silent promoter TSS. Metaplot and heatmap analysis were similarly generated from –hist and – ghist outputs. Whole genome bisulfite sequencing data were re-analyzed from the literature.

RNA-seq and Analysis

Total RNA was purified using RNeasy per manufacturer’s instructions (Qiagen). PolyA- tailed RNA was purified from total RNA and libraries were constructed using the TruSeq RNA

Library Prep Kit v2.0 (Illumina) following manufacturer’s recommendations. RNA-seq libraries

64 were quantified using Qubit Fluorometry and Bioanalyzer High Sensitivity DNA Analysis Kit.

Libraries were pooled and sequenced using the Illumina HiSeq 2000 machine as 50-bp or 100-bp

single-end sequencing reads. Reads with mapped to the mouse genome (mm9) with STAR aligner (Dobin et al., 2013).

Gene Ontology Analysis

Genes of which promoters were identified to have histone modification changes were compiled and analyzed using DAVID (Huang et al., 2007). Gene ontology of candidate enhancers was performed by formatting enhancer regions and random nonpromoter H3K4me1 background regions in BED genomic intervals format. These genomic intervals were then analyzed with GREAT (McLean et al., 2010).

Published Datasets

ChIP-seq datasets for DNMT3A2 and DNMT3B1 in mESCs were obtained from

GSE57413 (Baubec et al., 2015), SUZ12 in WT and Dnmt TKO mESCs from ERP005575

(Cooper et al., 2014), and H3K27me3 in WT and Dnmt TKO mESCs from GSE28254

(Brinkman et al., 2012) and ERP005575 (Cooper et al., 2014). Whole-genome bisulfite sequencing data for WT mESC was obtained from GSE30206 (Stadler et al., 2011), GSE48519

(Hon et al., 2014), and for DKO mESCs from GSE47894 (Leung et al., 2014).

Mass Spectrometry

One microgram of genomic DNA was digested by nuclease P1 (0.5 U), alkaline phosphatase (0.05 U), phosphodiesterase 1 (0.0005 U) and 2 (0.00025 U), following previously

65 published procedures (Liu et al., 2013). For liquid chromatography-tandem mass spectrometry- based quantification of 5-methyl-2’-deoxycytidine (5-mdC) and 2’-deoxyguanosine (dG), 600 fmol of [1’,2’,3’,4’,5’-13C5]-5-mdC and 16.2 pmol of [15N5]-dG were added to 200 ng of digested DNA. Enzymes were then removed by chloroform extraction, and the aqueous layer was subjected to LC-MS/MS analysis on an LTQ linear ion-trap mass spectrometer (Thermo

Fisher Scientific, San Jose, CA) coupled with an Agilent 1200 capillary HPLC (Agilent

Technologies, Santa Clara, CA). A 0.5 × 250 mm Zorbax SB-C18 column (5 μm in particle size,

Agilent Technologies, Santa Clara, CA) was eluted at a flow rate of 8.0 μL/min. A solution of 2 mM sodium bicarbonate (pH 7.0, solution A) and methanol (solution B) were used as mobile phases, and solvent composition was linearly changed from 0-20% B in 5 min and subsequently to 70% B in 25 min. The instrument was operated in positive-ion ESI-MSn mode to monitor the transitions of m/z 242126 for 5-mdC, m/z 247126 for [1’,2’,3’,4’,5’-13C5]-5-mdC, m/z

268152134 for dG, and m/z 273157139 for [15N5]-dG. Electrospray ionization conditions were as follows: electrospray voltage, 5 kV; capillary temperature, 275°C; capillary voltage, 9 V; tube lens voltage, 40 V; sheath gas flow rate, 15 arbitrary units. The calibration curves for the quantifications of 5-mdC and dG were generated by spiking isotope-labeled standards (600 fmol of [1’,2’,3’,4’,5’-D5]-5-mdC or 16.2 pmol of [15N5]-dG) with different amounts of the corresponding unlabeled analytes. The level of 5-mdC was expressed as a percentage of dG.

Accession Numbers

66 ChIP-seq and RNA-seq are available in the Gene Expression Omnibus (GEO) under accession number GSE77004.

67 Figures

Figure 2.1. Alteration of DNA Methylation Causes Selective Genome-wide Changes in Histone Modifications (A) Schematic of experimental design. Wild-type, TKO, DKO, and Dnmt Reconstitution cell lines each are assayed for DNA methylation by RRBS, histone modifications by ChIP-seq, and gene expression using RNA-seq. (B) Left: Mass spectrometry measurement of global 5mC levels between WT (n=6), TKO (n=7), TKO+Dnmt3a1 (m=5), TKO+Dnmt3a2 (n=6), TKO+Dnmt3b1 (n=4), and DKO and reconstitution cell lines (n=2). Data represented as mean ± SEM. Right: Global 5mC levels measured by RRBS. (C) Global Pearson correlation analysis between H3K27me3, H3K4me1, H3K4me3, and H3K27ac histone modifications across mESC lines. Correlation were calculated based on 1-kb genome-wide bins. (D) Genome browser tracks of RRBS and histone modification ChIP-seq between WT, KO, and Dnmt reconstitution mESC lines. Regions of interest exhibiting histone modification changes are boxed.

68

Figure 2.2. DNA Methylation Causally Regulates the Maintenance and Establishment of Promoter H3K27me3 (A) Heatmap of H3K4me3 (red) and H3K27me3 (blue) at ± 5 kb region centered around all Refseq TSS ranked by H3K4me3. Promoter categories are shown on left. Representative genes from bivalent and silent promoter categories are shown on the right. (B) Comparison of H3K27me3 between TKO vs WT at bivalent (top) and silent (bottom) promoters. Grey dots include all promoters and colored dots are statistically different between TKO and WT (p < 0.0001). Data are represented as H3K27me3 read coverage in log-

69 transformed tags per 10 million (TPM) reads within 2 kb surrounding TSS of individual promoters. (C) Genome browser tracks showing gene model, CpG islands, RRBS DNA methylation, H3K27me3/H3K27ac, and H3K4me1 (from top to bottom) at different loci. Promoter regions of Cdkn2a and Pdgfb (bivalent) and Trpv1 (silent) are shown boxed. (D) H3K27me3 occupancy in WT, demethylated DKO, and remethylated Dnmt reconstitution lines. Promoter metaplot and boxplot quantification of H3K27me3 occupancy between cell lines for bivalent (left) and silent (right) promoters. Metaplot represented as normalized H3K27me3 reads (in tags per 10 million; TPM) measured in 100 bp bins across ± 5 kb centered at TSS. Boxplot data similarly represented as H3K27me3 TPM within 2 kb surrounding TSS. (E) Comparison of fractional CpG methylation and observed-to-expected CpG ratio between bivalent and silent promoters in 2 kb surrounding TSS (top). Promoter metaplot showing average CpG methylation at ± 5 kb region centered around TSS (bottom). (F) Promoter H3K27me3 correlates with global DNA methylation (measured by mass spec) at bivalent and silent promoters. Data represented as normalized H3K27me3 in TPM within 5 kb of promoters and global 5mC (as in Figure 1B) are shown for each cell line. H3K27me3 TPM represents an average across all bivalent and silent promoters respectively. Note y-axis scale is different between bivalent and silent promoters. (G) Reads per kilobase million reads (RPKM) expression values for individual genes (left) and all bivalent and silent promoters (right). Bivalent promoters are expressed at a higher level in DKO relative to WT (P = 4.75x10-7, Kolmogorov-Smirnov test).

70

71 Figure 2.3. DNA Methylation Regulates Histone Modification Status and Gene Activity at Enhancers (A) Altered histone modification occupancy at putative enhancer sites. Metaplot indicates H3K27ac and H3K27me3 read coverage from DKO (solid) and TKO (dashed), normalized by depth in 100 bp bins within ± 5 kb around non-promoter H3K4me1 peaks. (B) Schematic showing changes in candidate enhancer status between primed (H3K4me1+), active (H3K4me1+/H3K27ac+), and poised (H3K4me1+/H3K27me3+) states. Numbers in circles indicate number of sites in WT mESC. Arrows represent changes occurring with demethylation in DKO/TKO. (C) Genome browser visualization of histone modification changes at candidate de novo active (chr4:99,210,711-99,257,661; 80 kb upstream of Foxd3) and poised (chr11:115,049,693- 115,089,693; Tmem104 intragenic) enhancer sites. (D) Heatmap analysis of histone modification changes to Dnmt DKO/TKO and reconstitution. Heatmap shows H3K4me1, H3K27ac, H3K27me3, and 5mCpG ± 5 kb genomic region centered on H3K4me1 separated by candidate enhancer categories: H3K27ac Gain (n=138), H3K27ac Loss (n=551), H3K27me3 Gain (n=1,041), H3K27me3 Loss (n=162). (E) Comparison of fractional CpG methylation (top) and observed-to-expected CpG ratio (bottom) in 2 kb surrounding enhancer center between categories of DNA methylation-sensitive enhancers. (F) Distribution of log2 fold-change (DKO/WT) in gene expression (in RPKM) between enhancer categories. (G) Representative example of gene expression changes at Elmo1 (chr13:20,175,630- 20,231,247) showing increase in H3K27ac at intronic enhancer (boxed and labeled with E) and expression in DKO and reversion with Dnmt reconstitution.

72

Figure 2.4. DNA Methylation-Dependent Loci Are Tissue-Specific Promoters and Enhancers (A) Distribution of candidate enhancers among previously identified tissue-specific enhancers (Shen et al., 2012). Tissues with candidate enhancer overlap >4% are labeled, all remaining tissues are grouped in “Other” category. (B) DNA sequence motifs enriched in de novo DNA methylation-dependent active and poised enhancer categories. (C) Examples of tissue-specific enhancers overlapping differentially methylated regions and methylation-dependent transcription factor motifs enriched in our analysis. Genome browser tracks show enhancers upstream of Zfp92 (chrX:70,622,988-70,673,292) and intragenic to Ears2 (chr7:129,179,862-129,202,881). Whole genome-bisulfite sequencing tracks are shown for mESC, adult germline stem cell (AGSC) from testis, and placenta (Stadler et al., 2011; Hammoud et al., 2014; Hon et al., 2013).

73

Figure 2.5. Regulation of H3K27me3 Depends on DNMT Catalytic Activity (A) Heatmap of H3K4me3 (red) and H3K27me3 (blue) at ± 5 kb region centered around all Refseq TSS ranked by H3K4me3. H3K27me3 compared between DKO, DKO+Dnmt3b1 (wild- type), and DKO+Dnmt3b1MUT (catalytic mutant). (B) Comparison of H3K27me3 occupancy between wild-type Dnmt3b1 and catalytic mutant reconstitution at bivalent and silent promoters. Average profile of H3K27me3 across ± 5 kb centered on TSS and quantification between cell lines shown for bivalent (top) and silent promoters (bottom). (C) Comparison between reconstitution of wild-type Dnmt3b1 and Dnmt3b1MUT at enhancers. Metaplot showing average profiles of H3K27me3 occupancy centered on de novo poised enhancers (top) and poised enhancers losing H3K27me3 (bottom). (D) Genome browser tracks showing examples of H3K27me3 changes at promoters of Cdkn2a and Trpv1 as well as de novo poised enhancer in Tmem104 and poised enhancer upstream of Foxp4 (chr17:48,106,091-48,116,091).

74

Figure 2.6. DNA Methylation Directly Regulates Loci Gaining H3K27me3 via PRC2 Occupancy (A) Heatmap comparison of 5mC, DNMT3A2/B1, H3K27me3, and SUZ12 between different genomic contexts. Data is shown within +/- 5 kb centered on TSS or center of enhancer and quantified as averaged depth-normalized tags in 100 bp bins. (B) Metaplot profiles showing average occupancy of DNMT3A2/3B1 and SUZ12 across silent promoters (top) and de novo poised enhancers (bottom) (Baubec et al., 2015; Cooper et al., 2014).

75

Figure 2.7. DNA Methylation Regulation of Histone Modifications Across Promoter and Enhancer Contexts Schematic of DNA methylation shaping of the epigenome. At promoters (left) hypomethylation results in loss or gain of H3K27me3, but not H3K4me3. Loss/gain of H3K27me3 is associated with H3K4me3 occupancy, where bivalent promoters marked by both H3K4me3 and H3K27me3 lose H3K27me3 with hypomethylation, while silent promoters absent of histone modifications gain H3K27me3 in the demethylated state. At enhancers (right) methylation represses enhancers, which become poised (gain H3K27me3) or active (gain H3K27ac) upon global demethylation. Reconstitution of DNMT in all contexts reestablishes WT patterns.

76

Figure 2.8. Overview of RRBS and ChIP-seq Data (A) RT-PCR quantification of Dnmt1, Dnmt3a, and Dnmt3b in WT, TKO, DKO, and respective reconstitution cell lines. (B) Histogram plot of all CpG sites measured by RRBS that are covered by at least 5 reads. X- axis indicates the methylation level of individual CpG sites defined by the number of methylated reads divided by the total number of methylated (C) and unmethylated (T) reads (C/[T+C]). (C) Metaplot showing average methylation profiles across all genes; DKO shown as solid lines and TKO shown as dashed lines. (D) ChIP-seq reproducibility. Comparison of H3K27me3 WT and TKO ChIP-seq data with published datasets. Top row compares between biological replicates of WT (r=0.895), WT from ‘B’ Brinkman et al. (r=0.858), and WT from ‘C’ Cooper et al. (r=0.761) respectively from left to right. Bottom row compares between biological replicates of TKO (r=0.872), TKO from ‘B’ Brinkman et al. (r=0.835), and TKO from ‘C’ Cooper et al. (r=0.836) respectively from left to right.

77

Figure 2.9. Histone Modification at Promoters Change with Global Methylation (A) Metaplot profiles showing average profiles of H3K4me3, H3K4me1, and H3K27ac centered on TSS of bivalent promoters. (B) H3K27me3 loss at bivalent promoters coincides with a general gain in H3K27ac. Scatter plot showing ratio (TKO/WT) of H3K27ac versus H3K27me3. (C) H3K27me3 loss at bivalent promoters coincides with a general loss in H3K4me1. Scatter plot showing ratio (TKO/WT) of H3K4me1 versus H3K27me3. (D) H3K27me3 occupancy in wild-type, demethylated TKO, and remethylated Dnmt reconstitution lines. Promoter metaplot and boxplot quantification of H3K27me3 occupancy between cell lines for bivalent and silent promoters. Metaplot represented as normalized H3K27me3 reads (in tags per 10 million; TPM) measured in 100 bp bins across ± 5 kb centered at TSS. Boxplot data similarly represented as H3K27me3 TPM within 2 kb surrounding TSS

78

Figure 2.10. Comparison of H3K27ac and H3K27me3 loss and rescue in DKO and TKO mESCs (A) Metaplot profiles of H3K27ac occupancy, measured in tags per 10 million reads, across de novo active enhancers (Gain H3K27ac in KO) and active enhancers that lose H3K27ac. (B) Metaplot profiles of H3K27me3 occupancy, measured in tags per 10 million reads, across de novo poised enhancers (Gain H3K27me3 in KO) and poised enhancers that lose H3K27me3.

79

Figure 2.11. Enriched de novo Transcription Factor Motifs in DNA Methylation-Sensitive Enhancers (A-D) Top enriched de novo motifs from Homer for (A) de novo active enhancers; (B) de novo poised enhancers; (C) active enhancers losing H3K27ac; (D) poised enhancers losing H3K27me3 categories.

80

81

Figure 2.12. Isoform-specific effects at promoters and enhancers (A-D) Histogram of H3K27me3 normalized read count at promoters/enhancers as a ratio of Dnmt3a/Dnmt3b1. Blue line represents histogram of H3K27me3 ratio at promoter/enhancer categories, while grey line in all plots are H3K27me3 ratio measured at all promoters, representing systematic differences in ChIP-seq signal. P-value shown for two sample t-test. (A) Comparison of Dnmt3a2 and Dnmt3b1 at promoters. Background ratio measured at all promoters. (B) Comparison of Dnmt3a1 and Dnmt3b1 at promoters. (C) Comparison of Dnmt3a2 and Dnmt3b1 at enhancers. (D) Comparison of Dnmt3a1 and Dnmt3b1 at enhancers. (E) Histogram of H3K27ac normalized read counts at enhancers as a ratio of Dnmt3a/Dnmt3b1.

82 References

Andersson, R., Gebhard, C., Miguel-Escalada, I., Hoof, I., Bornholdt, J., Boyd, M., Chen, Y., Zhao, X., Schmidl, C., Suzuki, T., et al. (2014). An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461.

Barski, A., Cuddapah, S., Cui, K., Roh, T.-Y., Schones, D.E., Wang, Z., Wei, G., Chepelev, I., and Zhao, K. (2007). High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837.

Bartke, T., Vermeulen, M., Xhemalce, B., Robson, S.C., Mann, M., and Kouzarides, T. (2010). Nucleosome-interacting proteins regulated by DNA and histone methylation. Cell 143, 470–484.

Baubec, T., Colombo, D.F., Wirbelauer, C., Schmidt, J., Burger, L., Krebs, A.R., Akalin, A., and Schübeler, D. (2015). Genomic profiling of DNA methyltransferases reveals a role for DNMT3B in genic methylation. Nature.

Bernstein, B.E., Mikkelsen, T.S., Xie, X., Kamal, M., Huebert, D.J., Cuff, J., Fry, B., Meissner, A., Wernig, M., Plath, K., et al. (2006). A Bivalent Chromatin Structure Marks Key Developmental Genes in Embryonic Stem Cells. Cell 125, 315–326.

Blattler, A., and Farnham, P.J. (2013). Cross-talk between site-specific transcription factors and DNA methylation states. J. Biol. Chem. 288, 34287–34294.

Bogdanovic, O., Long, S.W., van Heeringen, S.J., Brinkman, A.B., Gómez-Skarmeta, J.L., Stunnenberg, H.G., Jones, P.L., and Veenstra, G.J.C. (2011). Temporal uncoupling of the DNA methylome and transcriptional repression during embryogenesis. Genome Res. 21, 1313–1327.

Brinkman, A.B., Gu, H., Bartels, S.J.J., Zhang, Y., Matarese, F., Simmer, F., Marks, H., Bock, C., Gnirke, A., Meissner, A., et al. (2012). Sequential ChIP-bisulfite sequencing enables direct genome-scale investigation of chromatin and DNA methylation cross-talk. Genome Res. 22, 1128–1138.

Cedar, H., and Bergman, Y. (2009). Linking DNA methylation and histone modification: patterns and paradigms. Nat. Rev. Genet. 10, 295–304.

Chen, T., Ueda, Y., Xie, S., and Li, E. (2002). A novel Dnmt3a isoform produced from an alternative promoter localizes to euchromatin and its expression correlates with active de novo methylation. J. Biol. Chem. 277, 38746–38754.

Chen, T., Ueda, Y., Dodge, J.E., Wang, Z., and Li, E. (2003). Establishment and maintenance of genomic methylation patterns in mouse embryonic stem cells by Dnmt3a and Dnmt3b. Mol. Cell. Biol. 23, 5594–5605.

Cooper, S., Dienstbier, M., Hassan, R., Schermelleh, L., Sharif, J., Blackledge, N.P., DeMarco, V., Elderkin, S., Koseki, H., Klose, R., et al. (2014). Targeting Polycomb to Pericentric Heterochromatin in Embryonic Stem Cells Reveals a Role for H2AK119u1 in PRC2 Recruitment. Cell Rep. 7, 1456–1470.

83 Creyghton, M.P., Cheng, A.W., Welstead, G.G., Kooistra, T., Carey, B.W., Steine, E.J., Hanna, J., Lodato, M. a, Frampton, G.M., Sharp, P. a, et al. (2010). Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl. Acad. Sci. U. S. A. 107, 21931–21936.

Dobin, A., Davis, C. a, Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15– 21.

Dodge, J.E., Okano, M., Dick, F., Tsujimoto, N., Chen, T., Wang, S., Ueda, Y., Dyson, N., and Li, E. (2005). Inactivation of Dnmt3b in mouse embryonic fibroblasts results in DNA hypomethylation, chromosomal instability, and spontaneous immortalization. J. Biol. Chem. 280, 17986–17991.

Domcke, S., Bardet, A.F., Adrian Ginno, P., Hartl, D., Burger, L., and Schübeler, D. (2015). Competition between DNA methylation and transcription factors determines binding of NRF1. Nature 528, 575–579.

Egelhofer, T. a, Minoda, A., Klugman, S., Lee, K., Kolasinska-Zwierz, P., Alekseyenko, A. a, Cheung, M.-S., Day, D.S., Gadel, S., Gorchakov, A. a, et al. (2011). An assessment of histone- modification antibody quality. Nat. Struct. Mol. Biol. 18, 91–93.

Elliott, G., Hong, C., Xing, X., Zhou, X., Li, D., Coarfa, C., Bell, R.J.A., Maire, C.L., Ligon, K.L., Sigaroudinia, M., et al. (2015). Intermediate DNA methylation is a conserved signature of genome regulation. Nat. Commun. 6, 6363.

Feng, J., Chang, H., Li, E., and Fan, G. (2005). Dynamic expression of de novo DNA methyltransferases Dnmt3a and Dnmt3b in the central nervous system. J. Neurosci. Res. 79, 734–746.

Fouse, S.D., Shen, Y., Pellegrini, M., Cole, S., Meissner, A., Van Neste, L., Jaenisch, R., and Fan, G. (2008). Promoter CpG methylation contributes to ES cell gene regulation in parallel with Oct4/Nanog, PcG complex, and histone H3 K4/K27 trimethylation. Cell Stem Cell 2, 160–169.

Fuks, F., Hurd, P.J., Wolf, D., Nan, X., Bird, A.P., and Kouzarides, T. (2003). The methyl-CpG- binding protein MeCP2 links DNA methylation to histone methylation. J. Biol. Chem. 278, 4035–4040.

Gal-Yam, E.N., Egger, G., Iniguez, L., Holster, H., Einarsson, S., Zhang, X., Lin, J.C., Liang, G., Jones, P. a, and Tanay, A. (2008). Frequent switching of Polycomb repressive marks and DNA hypermethylation in the PC3 prostate cancer cell line. Proc. Natl. Acad. Sci. U. S. A. 105, 12979–12984.

Gkountela, S., Zhang, K.X., Shafiq, T.A., Liao, W.-W., Hargan-Calvopiña, J., Chen, P.-Y., and Clark, A.T. (2015). DNA Demethylation Dynamics in the Human Prenatal Germline. Cell 161, 1425–1436.

Guo, F., Yan, L., Guo, H., Li, L., Hu, B., Zhao, Y., Yong, J., Hu, Y., Wang, X., Wei, Y., et al.

84 (2015). The transcriptome and DNA methylome landscapes of human primordial germ cells. Cell 161, 1437–1452.

Guo, H., Zhu, P., Yan, L., Li, R., Hu, B., Lian, Y., Yan, J., Ren, X., Lin, S., Li, J., et al. (2014a). The DNA methylation landscape of human early embryos. Nature 511, 606–610.

Guo, X., Wang, L., Li, J., Ding, Z., Xiao, J., Yin, X., He, S., Shi, P., Dong, L., Li, G., et al. (2014b). Structural insight into autoinhibition and histone H3-induced activation of DNMT3A. Nature.

Hammoud, S.S., Low, D.H.P., Yi, C., Carrell, D.T., Guccione, E., and Cairns, B.R. (2014). Chromatin and transcription transitions of mammalian adult germline stem cells and spermatogenesis. Cell Stem Cell 15, 239–253.

Heintzman, N.D., Stuart, R.K., Hon, G., Fu, Y., Ching, C.W., Hawkins, R.D., Barrera, L.O., Van Calcar, S., Qu, C., Ching, K.A., et al. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 39, 311–318.

Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X., Murre, C., Singh, H., and Glass, C.K. (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589.

Hon, G.C., Rajagopal, N., Shen, Y., McCleary, D.F., Yue, F., Dang, M.D., and Ren, B. (2013). Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat. Genet. 45, 1198–1206.

Hon, G.C., Song, C.-X., Du, T., Jin, F., Selvaraj, S., Lee, A.Y., Yen, C.-A., Ye, Z., Mao, S.-Q., Wang, B.-A., et al. (2014). 5mC oxidation by Tet2 modulates enhancer activity and timing of transcriptome reprogramming during differentiation. Mol. Cell 56, 286–297.

Huang, D., Sherman, B.T., Tan, Q., Collins, J.R., Alvord, W.G., Roayaei, J., Stephens, R., Baseler, M.W., Lane, H.C., and Lempicki, R.A. (2007). The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol. 8, R183.

Jackson, M., Krassowska, A., Gilbert, N., Chevassut, T., Forrester, L., Ansell, J., and Ramsahoye, B. (2004). Severe global DNA hypomethylation blocks differentiation and induces histone hyperacetylation in embryonic stem cells. Mol. Cell. Biol. 24, 8862–8871.

Jackson-Grusby, L., Beard, C., Possemato, R., Tudor, M., Fambrough, D., Csankovszki, G., Dausman, J., Lee, P., Wilson, C., Lander, E., et al. (2001). Loss of genomic methylation causes p53-dependent apoptosis and epigenetic deregulation. Nat. Genet. 27, 31–39.

Jermann, P., Hoerner, L., Burger, L., and Schubeler, D. (2014). Short sequences can efficiently recruit histone H3 lysine 27 trimethylation in the absence of enhancer activity and DNA methylation. Proc. Natl. Acad. Sci. 1400672111 – .

85 Joshi, O., Wang, S.-Y., Kuznetsova, T., Atlasi, Y., Peng, T., Fabre, P.J., Habibi, E., Shaik, J., Saeed, S., Handoko, L., et al. (2015). Dynamic Reorganization of Extremely Long-Range Promoter-Promoter Interactions between Two States of Pluripotency. Cell Stem Cell 17, 748– 757.

Kaneda, M., Okano, M., Hata, K., and Sado, T. (2004). Essential role for de novo DNA methyltransferase Dnmt3a in paternal and maternal imprinting. 429, 2–5.

Karimi, M.M., Goyal, P., Maksakova, I. a, Bilenky, M., Leung, D., Tang, J.X., Shinkai, Y., Mager, D.L., Jones, S., Hirst, M., et al. (2011). DNA methylation and SETDB1/H3K9me3 regulate predominantly distinct sets of genes, retroelements, and chimeric transcripts in mESCs. Cell Stem Cell 8, 676–687.

Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25.

Leung, D., Du, T., Wagner, U., Xie, W., Lee, A.Y., Goyal, P., Li, Y., Szulwach, K.E., Jin, P., Lorincz, M.C., et al. (2014). Regulation of DNA methylation turnover at LTR retrotransposons and imprinted loci by the histone methyltransferase Setdb1. Proc. Natl. Acad. Sci. U. S. A. 111, 6690–6695.

Li, J.-Y., Pu, M.-T., Hirasawa, R., Li, B.-Z., Huang, Y.-N., Zeng, R., Jing, N.-H., Chen, T., Li, E., Sasaki, H., et al. (2007). Synergistic function of DNA methyltransferases Dnmt3a and Dnmt3b in the methylation of Oct4 and Nanog. Mol. Cell. Biol. 27, 8748–8759.

Li, L., Lyu, X., Hou, C., Takenaka, N., Nguyen, H.Q., Ong, C.-T., Cubeñas-Potts, C., Hu, M., Lei, E.P., Bosco, G., et al. (2015). Widespread rearrangement of 3D chromatin organization underlies polycomb-mediated stress-induced silencing. Mol. Cell 58, 216–231.

Lin, H., Yamada, Y., Nguyen, S., Linhart, H., Jackson-Grusby, L., Meissner, A., Meletis, K., Lo, G., and Jaenisch, R. (2006). Suppression of intestinal neoplasia by deletion of Dnmt3b. Mol. Cell. Biol. 26, 2976–2983.

Lister, R., Pelizzola, M., Kida, Y.S., Hawkins, R.D., Nery, J.R., Hon, G., Antosiewicz-Bourget, J., O’Malley, R., Castanon, R., Klugman, S., et al. (2011). Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 471, 68–73.

Liu, S., Wang, J., Su, Y., Guerrero, C., Zeng, Y., Mitra, D., Brooks, P.J., Fisher, D.E., Song, H., and Wang, Y. (2013). Quantitative assessment of Tet-induced oxidation products of 5- methylcytosine in cellular and tissue DNA. Nucleic Acids Res. 41, 6421–6429.

Lu, F., Liu, Y., Jiang, L., Yamaguchi, S., and Zhang, Y. (2014). Role of Tet proteins in enhancer activity and telomere elongation. 2103–2119.

Matsumura, Y., Nakaki, R., Inagaki, T., Yoshida, A., Kano, Y., Kimura, H., Tanaka, T., Tsutsumi, S., Nakao, M., Doi, T., et al. (2015). H3K4/H3K9me3 Bivalent Chromatin Domains Targeted by Lineage-Specific DNA Methylation Pauses Adipocyte Differentiation. Mol. Cell 60, 584–596.

86 McLean, C.Y., Bristor, D., Hiller, M., Clarke, S.L., Schaar, B.T., Lowe, C.B., Wenger, A.M., and Bejerano, G. (2010). GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501.

Meissner, A., Gnirke, A., Bell, G.W., Ramsahoye, B., Lander, E.S., and Jaenisch, R. (2005). Reduced representation bisulfite sequencing for comparative high-resolution DNA methylation analysis. Nucleic Acids Res. 33, 5868–5877.

Mikkelsen, T.S., Ku, M., Jaffe, D.B., Issac, B., Lieberman, E., Giannoukos, G., Alvarez, P., Brockman, W., Kim, T.-K., Koche, R.P., et al. (2007). Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560.

Morselli, M., Pastor, W.A., Montanini, B., Nee, K., Ferrari, R., Fu, K., Bonora, G., Rubbi, L., Clark, A.T., Ottonello, S., et al. (2015). In vivo targeting of de novo DNA methylation by histone modifications in yeast and mouse. Elife 4, e06205.

Noh, K.-M., Wang, H., Kim, H.R., Wenderski, W., Fang, F., Li, C.H., Dewell, S., Hughes, S.H., Melnick, A.M., Patel, D.J., et al. (2015). Engineering of a Histone-Recognition Domain in Dnmt3a Alters the Epigenetic Landscape and Phenotypic Features of Mouse ESCs. Mol. Cell 59, 89–103.

Okano, M., Bell, D.W., Haber, D. a, and Li, E. (1999). DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 99, 247–257.

Ooi, S.K.T., Qiu, C., Bernstein, E., Li, K., Jia, D., Yang, Z., Erdjument-Bromage, H., Tempst, P., Lin, S.-P., Allis, C.D., et al. (2007). DNMT3L connects unmethylated lysine 4 of histone H3 to de novo methylation of DNA. Nature 448, 714–717.

Otani, J., Nankumo, T., Arita, K., Inamoto, S., Ariyoshi, M., and Shirakawa, M. (2009). Structural basis for recognition of H3K4 methylation status by the DNA methyltransferase 3A ATRX-DNMT3-DNMT3L domain. EMBO Rep. 10, 1235–1241.

Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.

Rada-Iglesias, A., Bajpai, R., Swigut, T., Brugmann, S.A., Flynn, R.A., and Wysocka, J. (2011). A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283.

Reddington, J.P., Perricone, S.M., Nestor, C.E., Reichmann, J., Youngson, N. a, Suzuki, M., Reinhardt, D., Dunican, D.S., Prendergast, J.G., Mjoseng, H., et al. (2013). Redistribution of H3K27me3 upon DNA hypomethylation results in de-repression of Polycomb target genes. Genome Biol. 14, R25.

Riising, E.M., Comet, I., Leblanc, B., Wu, X., Johansen, J.V., and Helin, K. (2014). Gene silencing triggers polycomb repressive complex 2 recruitment to CpG islands genome wide. Mol. Cell 55, 347–360.

87 Saldanha, A.J. (2004). Java Treeview–extensible visualization of microarray data. Bioinformatics 20, 3246–3248.

Shen, Y., Yue, F., McCleary, D.F., Ye, Z., Edsall, L., Kuan, S., Wagner, U., Dixon, J., Lee, L., Lobanenkov, V. V, et al. (2012). A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116–120.

Sinkkonen, L., Hugenschmidt, T., Berninger, P., Gaidatzis, D., Mohn, F., Artus-Revel, C.G., Zavolan, M., Svoboda, P., and Filipowicz, W. (2008). MicroRNAs control de novo DNA methylation through regulation of transcriptional repressors in mouse embryonic stem cells. Nat. Struct. Mol. Biol. 15, 259–267.

Smith, Z.D., Chan, M.M., Humm, K.C., Karnik, R., Mekhoubad, S., Regev, A., Eggan, K., and Meissner, A. (2014). DNA methylation dynamics of the human preimplantation embryo. Nature 511, 611–615.

Stadler, M.B., Murr, R., Burger, L., Ivanek, R., Lienert, F., Schöler, A., van Nimwegen, E., Wirbelauer, C., Oakeley, E.J., Gaidatzis, D., et al. (2011). DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480, 490–495.

Tang, W.W.C., Dietmann, S., Irie, N., Leitch, H.G., Floros, V.I., Bradshaw, C.R., Hackett, J.A., Chinnery, P.F., and Surani, M.A. (2015). A unique gene regulatory network resets the human germline epigenome for development. Cell 161, 1453–1467.

Thurman, R.E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M.T., Haugen, E., Sheffield, N.C., Stergachis, A.B., Wang, H., Vernot, B., et al. (2012). The accessible chromatin landscape of the human genome. Nature 489, 75–82.

Tiedemann, R.L., Putiri, E.L., Lee, J.-H., Hlady, R.A., Kashiwagi, K., Ordog, T., Zhang, Z., Liu, C., Choi, J.-H., and Robertson, K.D. (2014). Acute Depletion Redefines the Division of Labor among DNA Methyltransferases in Methylating the Human Genome. Cell Rep. 9, 1554–1566.

Tsumura, A., Hayakawa, T., Kumaki, Y., Takebayashi, S., Sakaue, M., Matsuoka, C., Shimotohno, K., Ishikawa, F., Li, E., Ueda, H.R., et al. (2006). Maintenance of self-renewal ability of mouse embryonic stem cells in the absence of DNA methyltransferases Dnmt1, Dnmt3a and Dnmt3b. Genes Cells 11, 805–814.

Viré, E., Brenner, C., Deplus, R., Blanchon, L., Fraga, M., Didelot, C., Morey, L., Van Eynde, A., Bernard, D., Vanderwinden, J.-M., et al. (2006). The Polycomb group protein EZH2 directly controls DNA methylation. Nature 439, 871–874.

Wang, G., Balamotis, M.A., Stevens, J.L., Yamaguchi, Y., Handa, H., and Berk, A.J. (2005). Mediator Requirement for Both Recruitment and Postrecruitment Steps in Transcription Initiation. Mol. Cell 17, 683–694.

Wani, A.H., Boettiger, A.N., Schorderet, P., Ergun, A., Münger, C., Sadreyev, R.I., Zhuang, X., Kingston, R.E., and Francis, N.J. (2016). Chromatin topology is coupled to Polycomb group protein subnuclear organization. Nat. Commun. 7, 10291.

88 Weber, M., Hellmann, I., Stadler, M.B., Ramos, L., Pääbo, S., Rebhan, M., and Schübeler, D. (2007). Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat. Genet. 39, 457–466.

Wu, H., Coskun, V., Tao, J., Xie, W., Ge, W., Yoshikawa, K., Li, E., Zhang, Y., and Sun, Y.E. (2010). Dnmt3a-dependent nonpromoter DNA methylation facilitates transcription of neurogenic genes. Science 329, 444–448.

Xie, W., Schultz, M.D., Lister, R., Hou, Z., Rajagopal, N., Ray, P., Whitaker, J.W., Tian, S., Hawkins, R.D., Leung, D., et al. (2013). Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell 153, 1134–1148.

Yoon, H.-G., Chan, D.W., Reynolds, A.B., Qin, J., and Wong, J. (2003). N-CoR mediates DNA methylation-dependent repression through a methyl CpG binding protein Kaiso. Mol. Cell 12, 723–734.

Zentner, G.E., Tesar, P.J., and Scacheri, P.C. (2011). Epigenetic signatures distinguish multiple classes of enhancers with distinct cellular functions. Genome Res. 21, 1273–1283.

Zhang, Y., Jurkowska, R., Soeroes, S., Rajavelu, A., Dhayalan, A., Bock, I., Rathert, P., Brandt, O., Reinhardt, R., Fischle, W., et al. (2010). Chromatin methylation activity of Dnmt3a and Dnmt3a/3L is guided by interaction of the ADD domain with the histone H3 tail. Nucleic Acids Res. 38, 4246–4253.

Ziller, M.J., Gu, H., Müller, F., Donaghey, J., Tsai, L.T.-Y., Kohlbacher, O., De Jager, P.L., Rosen, E.D., Bennett, D. a, Bernstein, B.E., et al. (2013). Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481.

89

Chapter 3 :

Molecular and genomic defects of mutations in DNMT3A-associated

Overgrowth Syndrome with Intellectual Disability

90 Introduction

Mutations in epigenetic machinery have revealed a class of genetic overgrowth syndromes with a unique etiology in a class of epigenetic regulators (Tatton-Brown et al., 2017).

Mutations in DNMT3A were recently identified in one group of distinct Overgrowth Syndrome with Intellectual disability patients, named Tatton-Brown Rahman Syndrome (TBRS) (Tatton-

Brown et al., 2014). Variants in DNMT3A are also found in autism spectrum patients suggesting a broader role for mutation on intellectual function (Sanders et al., 2015). Previously most well characterized DNMT3A mutations occur in the methyltransferase domain or are involved in

dimerization as in missense mutation at the R882 residue (Ley et al., 2010; Russler-Germain et

al., 2014). Disease-associated variants found in regulatory domains have not been well studied.

Given their newfound prevalence in an epigenetic-related overgrowth and intellectual disability disorder, it is important to understand their function.

DNA methyltransferase activity differs between naked DNA and chromatin, indicating

DNMT activity responds to epigenomic context (Zhang et al., 2010). Recently, mechanisms of intrinsic auto-inhibition in DNMT3A by regulatory domains have been described providing a means to understand mutations in these regions (Guo et al., 2014). The effect of these mutations

within the cellular environment are unknown and will be important in understanding how these

mutations relate to problems found in overgrowth syndrome patients.

DNA methylation patterns are cell-type specific in human and mouse (Hon et al., 2013;

Ziller et al., 2013). De novo DNMTs such as DNMT3A are not intrinsically sequence specific,

therefore the domain structure of DNMT3A contributes to genomic specificity. Recent work has

shown that alterations in DNMT3A’s regulatory domains can dramatically alter the specificity

and function of DNMT3A (Noh et al., 2015).

91 We therefore address the question of what effect mutations in DNMT3A associated with

TBRS have in a cellular context. We utilize genome editing and neural differentiation to study the molecular effects of mutant DNMT3A. Here we study overgrowth associated mutations found in all three DNMT3A conserved domains. Mutations in DNMT3A showed variation in functional defects. We identified a consistent set of 341 overgrowth-associated differentially methylated regions (OG-DMRs) in NSCs that were found near CpG islands and developmental genes. Surprisingly, OG-DMRs were enriched for a number of imprinted genes associated with growth and intellectual function phenotypes in humans. OG-DMRs were also associated with changes in histone modification H3K27me3. Relating the mutations found in different domains, we find consistency between structural understanding and functional effects of methylation in the genome.

RESULTS

Establishing a Cellular Model to Study Overgrowth-Associated DNMT3A

Mutations

In the original description of TBRS, 13 mutations were found to be spread across the

entire DNMT3A enzyme (Tatton-Brown et al., 2014). We chose four of the originally described

mutations to investigate further, with at least one mutation in each of DNMT3A’s conserved

domains (Figure 3.1A). I310N is found in the PWWP domain, G532S and M548K are both in

the ADD domain, and P904L is in the methyltransferase domain. P904L, however, is not near the conserved catalytic site nor does it overlap with known dimerization surfaces. Each of these human mutations are found in highly conserved regions of both the human and mouse protein.

For each mutation, the homologous residue in the mouse genome was identified and used. To

92 introduce these mutations we utilized CRISPR-Cas9 to introduce the missense mutations in the

endogenous Dnmt3a locus and exploiting ssODN-associated homology directed repair (Figure

3.1B). Individual ESC colonies were screened, subcloned, and confirmed by Sanger sequencing

(Figure 3.1C). CRISPR modification of the alleles was efficient with clones being

predominately indels and indels mixed with knock-in. We were unable to obtain heterozygous

knock-in therefore all experiments were performed in homozygous knock-in lines, herein

referred to as mutant lines. Homozygous mESC lines for I306N (IN), G528S (GS), M544K

(MK), and P900L (PL) were obtained.

Wild-type and mutant mESC were then differentiated through Pax6+/RC2+ cortical neural progenitors and subsequently expanded in neural stem cell (NSC) conditions to stable and homogenous neural stem cells (Conti et al., 2005; Gaspard et al., 2009) (Figure 3.1D). NSCs expressed Nestin and were proliferative (Figure 3.6). No overt differences were observed upon differentiation.

DNMT3A Mutants Differ in Functionality in mESC and mNSCs

We first assessed global methylation in mutant mESC and NSCs by measurement of 5- methylcytosine via mass spectrometry (Figure 3.1E). Mutant mESCs were observed to differ in global methylation where IN were hypomethylated in both mESCs and NSCs, similar to KO cells. Meanwhile, GS and PL mutant lines, showed no difference in mESCs. GS NSCs showed some variability and were similar to wild-type J1 NSCs. PL NSCs showed moderately diminished methylation, but not to the extent of IN NSCs.

Next, in order to better understand the effect of mutations in the cellular context, we utilized an ectopic expression system in stably transfected cell lines (Figures 3.1F and 3.1G). In

93 ESCs, both de novo methyltransferases Dnmt3a and Dnmt3b are expressed and have been shown

to be functionally redundant where both are able to methylate targets in the absence of the other

(King et al., 2016). Furthermore, while Dnmt3a is known to operate as dimers, Dnmt3a and

Dnmt3b also interact in ESC (Li et al., 2007). We therefore reconstituted wild-type and mutant

DNMT3A in DKO (Figure 3.1H, left). We have previously shown that reconstitution of wild- type Dnmt in DKO allows robust measurement of de novo methylation activity as the presence of Dnmt1 can maintain de novo methylated sites rather than being passively diluted (King et al.,

2016). Our DKO lines were also late passage allowing the genome to fully demethylate. As expected, DKO mESCs were dramatically hypomethylated. Reconstitution of wild-type

DNMT3A consistently remethylated to an average of 2.4% global 5mdC, around 80% of normal

J1 mESCs. Reconstitution of IN mutant DNMT3A surprisingly only resulted in 0.9% 5mdC, equivalent to 30% of normal. In contrast, both GS and PL mutant DNMT3A remethylated the genome to similar levels as wild-type DNMT3A at 2.5% and 2.0% (83% and 67% of normal) in

GS and PL respectively.

Given that IN mutant DNMT3A shown minimal activity on its own, we next ectopic expressed the three mutants in wild-type ESC grown in two different culture conditions. We cultured mESCs in both ground-state conditions, which normally are hypomethylated with lower expression of endogenous Dnmt3a and Dnmt3b, as well as conventional conditions that show high methylation. In both ground-state and conventional cultures, all three mutant DNMT3A were able to increase methylation (Figure 3.1H, right). These experiments in ESCs are summarized and show that GS and PL mutants show the ability to methylate a demethylated genome in the absence of endogenous Dnmt3a or Dnmt3b, while IN shows minimal activity

94 (Figure 3.1I). However, in the presence of endogenous de novo Dnmts, all three mutants studied

were showed the ability to increase global methylation.

DNA Methylation Alterations Occur at Sites of de novo Methylation in NSCs

We next aimed to understand where the methylation changes were occurring in the

genome. We therefore collected and analyzed whole-genome bisulfite sequencing (WGBS) data in duplicate with 8-12X coverage per replicate over an average of 25 million CpG sites across all

samples. Replicates were consistent, with all replicates having Pearson correlation >0.8 at CpG

sites with >10X coverage fitting with ENCODE quality standards (Figure 3.7). Global analysis

of CpG methylation showed increased methylation in neural stem cells compared to embryonic stem cells where methylated CpG sites increased from 62% in ESCs to 75% in NSCs (Figure

3.2A). Nearly all overgrowth mutants showed reduced proportions of methylation CpGs with KO at 67%, IN at 65%, MK at 70%, and PL at 70%. Meanwhile, 75% of CpG sites in GS NSCs were methylation, similar to wild-type J1 NSCs.

Dnmt3a has been reported to be involved in methylation in the Non-CpG context, so we also measured methylation in CHG and CHH contexts between mutants. Methylation in Non-

CpG sites account for approximately 23% of all methylation in mESC and lowered to 18% in

NSCs. Non-CpG methylation ranged from 17.3% to 19.8% in mutants (Figure 3.7).

Pair-wise comparison between each mutant NSC and wild-type J1 identified methylation changes consistent with changes observed at the global level. Specifically, in KO, IN, MK, and

PL NSC lines, significant DMRs were predominately lower in mutant lines compared to wild-

type J1-derived NSCs (Figure 3.2C, red dots). In contrast, GS NSCs showed many more regions

with increased methylation compared to J1 NSCs. Differentially methylated regions identified

95 for each pair were approximately 1kb and highly similar between mutations showing reduced methylation (Figure 3.2C, bottom).

As to not exclude any sites changing in methylation, we took the union of all differentially methylated regions identified by pair-wise comparison yielding a total of 508 sites.

The majority of OG-DMRs (341/508 = 67%) showed consistently lower methylation in comparison to J1 NSCs. We identify these 341 sites as overgrowth-associated differentially methylated regions (OG-DMRs) (Figure 3.2D). The remainder of sites were consistently methylated/unmethylated across all NSCs with a few individual lines showing differences, possibly due to variability in differentiation and culture in individual lines or batches.

Interestingly, we observe that OG-DMRs tend to be hypomethylated in mESCs indicating the loss of de novo methylation during neural differentiation (Figures 3.2C and 3.2D). Consistent with this, OG-DMRs were enriched in ontology terms relating to embryo organ development, pattern specification, and regionalization. Promoter associated transcription factors ontologies were also enriched in OG-DMRs (Figure 3.2E).

Of all OG-DMRs 113 (33%) were within 2-kb and 49 (14%) within 500 bp of a TSS.

Similarly, 116 (34%) overlapped with a CpG island and an additional 96 were within 4-kb of a

CpG island, commonly referred to as CpG island shores and shelves (Figure 3.2F). DNA methylation has been shown to most variable at CpG island shores rather than within CpG islands themselves (Irizarry et al., 2009; Ziller et al., 2013). As CpG content can distinguish

between CpG islands and neighboring shores, we measured CpG content of OG-DMRs represented as observed-to-expected CpG ratio. We find that most DMRs have CpG contents

lower than the CpG content normally found within CpG islands (>0.6), rather the CpG is (Figure

3.2G).

96

Epigenetic Signatures of OG-DMRs and Antagonism of H3K27me3

Having found that two-thirds of OG-DMRs are found near CpG island shores/shelves and

one-third are distal to promoters and CpG islands, we next asked whether these sites correspond

to cis-regulatory elements that may be responsible in gene regulating during development.

Towards this question, we profiled H3K4me3, H3K4me1, H3K27ac, and H3K27me3 in wild-

type J1 NSCs to assess the epigenetic signatures of OG-DMRs (Figure 3.3A).

Separating between promoter and CpG island-associated OG-DMRs and those distal, we

find OG-DMRs in a number of different chromatin contexts. Across both categories of OG-

DMRs, we see many sites that are found near H3K4me1 without H3K4me3, H3K27ac, or

H3K27me3 present with 45% (106/235) and 40% (42/106) of representation between

promoter/CGI-associated and non-promoter/CGI OG-DMRs respectively (Figure 3.3B). These

regions have been referred to as primed enhancer or permissive regulatory regions, which are

capable of activity upon activation. Promoter/CGI-associated OG-DMRs were more abundant in

regions marked by H3K27me3 where nearly 20% (46/235) are co-occupied by H3K4me1 and

H3K27me3 (poised enhancers) with a further 5% (12/235) solely marked with H3K27me3. In

contrast, 6% (6/106) are found in either poised enhancer or polycomb-repressed contexts in non-

promoter OG-DMRs. Four non-promoter DMRs showed overlap with H3K4me3, which is

unexpected. Visual inspection revealed that one was approximately 5 kb from , which is

found in a broad H3K4me3 domain nearly 10 kb wide. These sites are located distal to proximal

promoters, but may still be involved in regulation. Lastly, non-promoter/non-CGI OG-DMRs tended to more associated with larger hypermethylated regions. While we identify these sites as

other, these could be equivalent to quiescent regions that are heavily methylated in larger

97 genomic studies (Figure 3.3B). While 19% (44/235) of promoter/CGI OG-DMRs are in the

other category, nearly half (46%; 49/106) of non-promoter DMRs were found to not be

significantly marked with the histone modifications we measured.

Previously, Dnmt3a has been shown to regulate neural genes through antagonism of

PRC2 and H3K27me3 in postnatal NSCs and ESC-derived NPCs (Wu et al., 2010). We therefore

sought to answer whether DNA methylation changes at OG-DMRs were associated with changes

in H3K27me3. Profiling H3K27me3 in all mutant NSC lines, we find that a number of sites

show increase in H3K27me3 (Figure 3.3C). While nonpromoter OG-DMRs show an

appreciable increase in H3K27me3 at several sites and on average across all sites, the higher

levels of H3K27me3 at more noticeable at promoter/CGI-associated DMRs (Figure 3.3D).

Despite the size of DMRs limited to an average of approximately 680 bp, we observed broad

changes in H3K27me3 spanning multiple kilobases. Taking the Nr2f2 locus as an example, we

see dramatically higher H3K27me3 occupancy most evident in KO and IN NSCs in comparison

to wild-type J1 NSCs (Figure 3.3E). The OG-DMR near Nr2f2 is located 3 kb downstream of

the 3’UTR of Nr2f2 and is bordered by several CpG islands. The differences in H3K27me3 vary

with nearly all mutants showing increases on the left of the OG-DMR, however only KO and IN

show increases on the right, overlapping the Nr2f2 gene. One interesting observation was that

across all OG-DMRs, nearly all mutant NSC lines showed increased H3K27me3 despite minor

reductions in DNA methylation at OG-DMRs in GS mutant NSCs. In the case of Nr2f2, we see

that all mutants including GS mutants show reduced DNA methylation.

We had previously shown that OG-DMRs are associated with many ontologies relating to developmental biological processes. We wondered whether these de novo methylation events that are disrupted in overgrowth syndrome, may relate to fate restriction. We therefore utilized a

98 comprehensive set of tissue-specific differentially methylated regions (tsDMRs) that were

identified across 17 adult mouse tissues spanning all three germ layers to assess whether OG-

DMRs were associated with methylation events important in tissue-specificity (Hon et al., 2013).

We found that half of the OG-DMRs in our study overlapped with a tsDMR with 49% and 56% in promoter/CGI-associated OG-DMRS and non-promoter respectively. Overlap occurred broadly across all tissues (Figure 3.3F). For example, Nr2f2 OG-DMR overlapped with bone marrow, spleen, thymus, and olfactory bulb tsDMRs. Visual inspection of methylation revealed methylation over this region with hypomethylation specific to blood-related tissues and olfactory bulb.

Referencing DNA methylation measurements from developing mouse embryos by

ENCODE, the Nr2f2 OG-DMR is progressively methylated during forebrain development between E10.5 and E15.5 (Figure 3.8A). Meanwhile, in heart and intestinal tissues, the OG-

DMR is hypermethylated at all stages. Furthermore, comparison with H3K27me3 at similar stages during mouse embryonic development, showed an associated decrease in H3K27me3 matching the progressive increase in methylation in both forebrain and hindbrain (Figure 3.8B).

Together, this indicates that OG-DMRs show substantial tissue specificity and dynamics during

development, suggesting a role in specifying cell fate.

Altered Gene Expression in Overgrowth Mutant NSCs

After identifying both DNA methylation and histone modification alterations in Dnmt3a

mutant NSC lines, we asked whether any gene expression changes could be observed in mutant

NSCs. Comparison between mutant and wild-type J1 NSCs resulted in 87 differentially

expressed genes (DEGs) shared between KO, IN, GS, and PL NSCs (Figure 3.4A). The majority

99 of DEGs were higher in mutant NSCs, where 82% (80/87) genes showed higher gene expression.

Expression patterns of DEGs clustered into three general patterns. In the first group, DEGs are

normally expressed in ESCs and are silenced in wild-type NSCs. However, in nearly all mutant

lines, we observe a moderate derepression. The second group of DEGs are expressed in the

opposite pattern where expression is low in ESCs and increases in wild-type NSCs. In mutant

NSCs, however, the genes are further upregulated. In the last group, DEGs are heterogeneously

expressed in ESCs and downregulated in wild-type J1 NSCs. Nearly all mutant NSCs showed

derepression. Looking at enrichment within curated gene sets, we found the latter two groups

shared similar functional terms, including focal adhesion signaling and osteoblast genes (Figure

3.4B). The first group was enriched in placental genes as well as fibroblast proliferation and

regulation of ERK signaling.

To find out how many DEGs may be regulated directly by changes in DNA methylation,

we overlapped our set of 87 DEGs with 304 unique genes closest to each OG-DMR and found

minimal overlap. Only five genes were differentially expressed and were near OG-DMRs

(Figure 3.4C). Included in these five genes are Col5a2, where the associated OG-DMR is found overlapping the TSS with a 20-50% reduction in DNA methylation between KO, IN, and PL

lines suggesting direct regulation. The remainder of genes have DMRs that are not located near

the TSS.

We last asked whether changes in DNA methylation can lead to derepression without statistically significant gene expression changes, which may be the case if genes of different fates are unmethylated, but not activated due to cellular context. We therefore looked at promoter/CGI-associated OG-DMRs only and compared changes in methylation with changes in expression for each gene. On average, mutant NSCs showed a 30-40% lower methylation at

100 promoter-associated OG-DMRs. Similarly, expression of associated genes increased on average

between 1.4 and 1.6-fold (Figure 3.4D). Looking at individual genes, we see that the majority of

hypomethylated OG-DMRs show minor increase with a handful of genes being upregulated at

higher levels. However, while these genes show larger fold-changes, they are expressed at low

absolute levels.

Together, this indicates that while a number of epigenetic changes are occurring as a

result of overgrowth-associated Dnmt3a mutations, the link between DMRs and expression

changes does not immediately result in expression changes through direct action of DNA

methylation at gene promoters.

Structural Interpretation of DNMT3A Mutations

To interpret changes in methylation, we mapped the mutation to crystal structures and

structural models of DNMT3A. DNMT3A has been shown to form a heterotetrameric complex consisting of catalytically inactive coactivator DNMT3L and DNMT3A (Jia et al., 2007).

Visualizing the location of DNMT3A mutants on a complete structural model including PWWP,

ADD, C-terminal MTase domain, and nucleosomes, it is apparent these mutations are spatially

distal to both the active site and the interfaces surfaces (Figure 3.5A) (Rondelet et al., 2016). We

therefore aimed to study the structural context of each mutant to corroborate our functional

findings in NSCs.

I310N is located in the second beta strand of the canonical beta-barrel core found in

PWWP domains (Figure 3.5B). The PWWP domain was found to be essential for chromatin targeting as mutation of individual residues within the domain completely abolish targeting, as is found in human ICF syndrome. The beta-barrel constitutes a hydrophobic core, which is

101 important for the stability of the individual domain. Mutation of conserved hydrophobic residues

in the PWWP domain of DNMT3B abrogated targeting, which was hypothesized to affect the

structure of the hydrophobic core (Ge et al., 2004). We predict the I310 mutation to asparagine

would disrupt the hydrophobic core due to polarity.

The activity of Dnmt3a differs between naked DNA and chromatinized DNA due to

repression of catalytic activity by the N-terminal portion of DNMT3A (Zhang et al., 2010).

Crystal structures of DNMT3A revealed that large conformational changes occur due to movement of the N-terminal ADD domain and its interaction with either histone H3 tail or self- interactions with the C-terminal methyltransferase domain (Figure 3.5C, left) (Guo et al., 2014).

In the active state, the ADD domain interacts with histone H3 tail and is positioned away from the active site. In the inactive state, however, the ADD domain interacts intimately with the methyltransferase domain, particularly between an ADD domain loop containing several acidic residues and a pocket of the methyltransferase domain formed by basic residues (Figure 3.5C, right). Mutation of the acidic residues in the loop to alanine showed reduced autoinhibition (Guo et al., 2014). In silico mutation of G532S showed all three rotamers of serine in conflict within the confinements of the ADD-MTase interface. Interestingly, D529N has recently been identified in Sotos-like cohort of overgrowth patients (Tlemsani et al., 2016). We predict G532S mutation to interfere with the autoinhibitory mechanism, due to destabilization of the ADD- methyltransferase domain interface in the inactive state.

Lastly, P904 normally exists in the terminal most helix of the methyltransferase domain.

The terminal helix in the catalytic domain normally abuts the ADD domain in the active form when H3 is bound (Guo et al., 2014). This terminal helix normally exists with a 30 degree kink that is highly likely due to P904. Consistent with most characterized kinked helices, proline is

102 found at the +2 position relative to the kink (Wilman et al., 2014). We predicted that P904L

mutation will result in removal of the kink potentially leading to straightening of the terminal

helix. In silico mutation of P904L results in all three rotamers clashing with steric conflict due to

the bulky side-chain of leucine (Figure 3.5E). Furthermore, modeling the straightened helix, we find it is compatible with the remainder of the methyltransferase domain, suggesting a straightened helix could exist with functional methyltransferase activity, but with altered stability of the active state.

Discussion

In this work, we aimed to understand the effect that regulatory domain mutations had on

the ability of DNMT3A to methylate in a cellular context. We particularily looked at DNMT3A

mutations associated with TBRS and studied these mutations in mESCs and mNSCs. One

question of interest is whether DNMT3A mutations in TBRS act as a gain- or loss-of-function.

We found that DNMT3A containing overgrowth-associated G532S and P904L mutations were

capable of methylation activity in mESCs, in contrast to AML-associated R882 residue

mutations that show minimal activity (Kim et al., 2013). On the other hand, we found that

DNMT3A with overgrowth-associated I310N mutation was incapable of methylation in

demethylated mESCs, but showed methylation activity when ectopically expressed in wild-type

mESCs. Together, these data indicate that mutations in the regulatory domains can differ based

on the location of the mutation, specifically that the PWWP domain is important for intrinsic

activity of DNMT3A as opposed to other regulatory regions, such the ADD domain and non-

catalytic and non-multimerization regions of the MTase domain.

103 Utilizing genome-wide base-pair resolution DNA methylation maps of knock-in mutant

NSCs, we also identified differentially methylated regions. In Dnmt3a knock-out, I306N,

M544K, and P900L NSC lines, regions of differential methylation showed almost exclusively hypomethylation. In contrast, G528S NSCs showed both hyper- and hypomethylated regions.

Previous study of Dnmt3a knock-out HSCs had also showed both hyper- and hypomethylation at specific genomic regions, indicating a more complex role for Dnmt3a and specificity for the region of mutation (Challen et al., 2012).

Interestingly, a set of consistently hypomethylated regions shared across mutant NSCs, which we termed overgrowth-associated differentially methylated regions (OG-DMRs), were enriched for a number of imprinted genes associated with growth and intellectual function phenotypes in humans. Included in this list were Meg3, Peg3, Peg1 (Mest), and Asb4. The

Dlk1/Meg3 locus is known for the reciprocal diseases Kagami-Ogata Syndrome and Temple

syndrome, which result in increased and decreased birth weight respectively (Ogata and Kagami,

2016; Temple et al., 2007). It is an appealing idea for mutations in epigenetic machinery, such as

DNMT3A, to act through imprinted genes as these genes are epigenetically regulated and

coordinate both somatic growth as well as intellectual function.

OG-DMRs were also associated with changes in histone modification H3K27me3 where

hypomethylation was associated with increased H3K27me3. This observation is consistent with

previous work showing the ability of DNA methylation to regulate histone modification occupancy as well as experiments showing antagonism between DNA methylation and

H3K27me3 at neural genes in postnatal NSCs (King et al., 2016; Wu et al., 2010). If H3K27me3 modifications do indeed change as a direct consequence of DNMT3A mutation-associated hypomethylation, this could be another means of gene expression alteration.

104 Lastly, we observed differences in methylation depending on the location of the

overgrowth-assocated mutations. We analyzed the location of these mutations in three-

dimensional structural models of DNMT3A. Relating the mutations to their location within

known domains, we find that our functional DNA methylation data matches the predicted

functional effect based on the structural location of each mutation. In our hDNMT3A

reconstitution experiments in DKO mESCs, we found I310N to behave differently from G532S

and P904L. This difference is explained by previous work showing that the PWWP domain is necessary for targeting Dnmt3a to chromatin (Ge et al., 2004). In knock-in mNSCs, we found

G528S to differ from I306N, M544K, and P900L. This difference is explained by structural

features central to the auto-inhibitory mechanism intrinsic to DNMT3A and the fact that G532 in

hDNMT3A (G528 in mouse) plays an essential and unique role, differing from other regulatory

residues (Guo et al., 2014).

105 Experimental Procedures

Mouse Embryonic Stem Cell Culture

Feeder-dependent mouse embryonic stem cells (mESCs) were grown on irradiated mouse

embryonic fibroblasts (MEF) and maintained in DMEM (Thermo) supplemented with 15% fetal

bovine serum (GE HyClone), 1 mM GlutaMAX (Life Technologies), 1X Non-essential amino

acids (NEAA), 1X Penicillin/Streptomycin (Life Technologies), 1000 U/mL LIF (ESGRO;

Millipore), and 0.001% β-mercaptoethanol (Sigma). Ground state feeder-free mESCs were

cultured on gelatinized culture plates in mESC medium with the addition of 1 uM CHIR99021

(Stemgent 04-0004) and 3 uM PD0325901 (Stemgent 04-0006).

Mouse Neural differentiation

ESC neural induction was performed as previously described with minor modification (Gaspard

et al., 2009). Briefly, mESCs were adapted to feeder-free conditions and seeded at low densities

(5,000-10,000 cells/cm2) on gelatinized culture plates. Cells were grown as a monolayer in

defined default medium (DDM), a DMEM-F12-based medium supplemented with N-2 (Thermo

Fisher), for 10 days with addition of 1 uM cyclopamine (Calbiochem) starting on day 2. Neural progenitors were replated on culture dishes coated with 33.3 ug/mL poly-D-lysine (Corning) and

3.33 ug/mL laminin (Corning) on day 10 post-induction. Progenitors were expanded in neural stem cell expansion medium as previously described (Conti et al., 2005). Briefly, cells were grown in DMEM-F12-based medium supplemented with N-2 (Thermo Fisher), 10 ng/mL hEGF

(Peprotech), and 10 ng/mL bFGF (Sino Biological). Cells were replated for minimally two passages to maintain a more steady-state and homogenous culture, which are referred to as

106 neural stem cells. Neural stem cells were collected for analysis generally four passages into expansion.

Generation of Knock-in mouse embryonic stem cell lines sgRNAs were designed using the Broad Institute sgRNA design tool, prioritizing distance from site of mutation and on-target efficacy scores (Doench et al., 2016). Sequences were synthesized as complementary oligonucleotides (IDT) and cloned into vector pX459 (Addgene 48139) containing specific sgRNA, Cas9, and puromycin selection marker (Ran et al., 2013).

Corresponding single-stranded oligo donors (ssODNs) were designed to match target-strand and included silent mutations disabling PAM sites, RFLP sites for genotyping, and overgrowth disease mutations (Richardson et al., 2016). Transfection was carried out using Lipofectamine

3000 per manufacturer’s instruction, where 4 ug each of sgRNA-Cas9 plasmid DNA and ssODN were transfected into approximately 50,000 cells in 6-well format. Transient selection using 2 ug/mL Puromycin with 1 uM Scr7 was started 24 hours after transfection for a total of 48 hours.

ESCs were allowed to recover for an additional 24-48 hours. mESCs were then transferred to new well at clonal density, individual colonies were expanded then manually picked into 96-well plates for subsequent genotyping and expansion.

Generation of ectopic expression mouse embryonic stem cell lines

Site-directed mutagenesis for each mutation was carried out on pcDNA3/Myc-DNMT3A

(Addgene 35521). EcoRI/MfeI digested Myc-DNMT3A fragments were inserted into EcoRI cloning site of pCAG-IRES-Blast vector as previously described (Chen et al., 2003; King et al.,

2016). Note the EcoRI site is not conserved. Briefly, constructs were linearized with MluI and

107 transfected by electroporation with Gene Pulser II using 250V and 500 uF in PBS (Bio-Rad) into

J1 and DKO mESCs and selected with 5 ug/mL Blasticidin until stable resistant clones grew out.

DKO mESC lines with Dnmt3a and Dnmt3b knock-out are as previously described (Okano et al.,

1999). Cells were maintained in 5 ug/mL blasticidin during routine growth.

Quantification of DNA Methylation by Mass Spectrometry

Measurement of 5-mdC and dG were conducted as previously described (Yu et al., 2016).

Genomic DNA was extracted using phenol-chloroform-isoamyl alcohol extraction followed by

ethanol precipitation. One microgram of cellular DNA was enzymatically digested into

nucleoside mixtures. Enzymes in the digestion mixture were removed by chloroform extraction

and the resulting aqueous layer subjected directly to LC-MS analysis for quantification of 5-mdC and dG. The amounts of 5-mdC and dG (in moles) in the nucleoside mixtures were calculated from area ratios of peaks found in selected-ion chromatograms (SICs) for the analytes over their

corresponding stable isotope-labeled standards, the amounts of the labeled standards added (in

moles), and the calibration curves. The final levels of 5-mdC, in terms of percentages of dG,

were calculated by comparing the moles of 5-mdC relative to the moles of dG.

Whole Genome Bisulfite Sequencing (WGBS)

Genomic DNA was purified from mESCs and NSCs using standard phenol-chloroform-isoamyl alcohol extraction followed by ethanol precipitation. Genomic DNA was spiked with 0.5%

lambda DNA, fragmented to 250-350 bp using Bioruptor (Diagenode), and concentrated by

ethanol precipitation. 500 ng of fragmented DNA was used for bisulfite conversion using EZ

DNA Methylation-Direct Kit (Zymo). Libraries were constructed using Accel-NGS Methyl-Seq

108 DNA Library Kit (Swift) according to manufacturer’s recommendations. WGBS libraries were

quantified using Qubit Fluorometer and Agilent 2200 TapeStation (Agilent). Libraries were

pooled and sequenced using the Illumina HiSeq 4000 as 150-bp paired-end sequencing reads.

Chromatin immunoprecipitation and ChIP-seq

ChIP was performed using protocol as described previously (King et al., 2016). Briefly, 2x107

neural stem cells were fixed with 1% formaldehyde for 10 min after which chromatin was

isolated and sonicated to 150-300 bp using Bioruptor (Diagenode). 5*106 cells worth of

sonicated chromatin was precleared with Protein A/G Dynabeads (Life Technologies) for 1 hour

then incubated in antibody at 4°C overnight. Antibodies for immunoprecipitation were all ChIP

grade and validated previously in ENCODE datasets

(http://compbio.med.harvard.edu/antibodies/about) (Egelhofer et al., 2011). 2 ug of anti-

H3K4me3 antibody (Abcam ab8580), 5 ug of anti-H3K4me1 antibody (Abcam ab8895), 5 ug of

anti-H3K27me3 antibody (Active Motif 39155), and 5 ug of anti-H3K27ac antibody (Active

Motif 39133) were used for ChIP. Chromatin was then immunoprecipitated with Protein A/G

Dynabeads for > 2 hours. ChIP was washed successively by two rounds of low-salt buffer, two

rounds of high-salt buffer, two rounds of LiCl buffer, and finally two rounds of TE buffer. ChIP

DNA was eluted in TE, reverse crosslinked overnight at 65°C, and purified by incubation with

RNase A, then proteinase K, and finally phenol-chloroform extraction and ethanol precipitation.

ChIP DNA was quantified using Qubit Fluorometer (Life Technologies) and

approximately 10 ng were used for libraries construction using Kapa Library Preparation Kit for

Illumina sequencing (Kapa Biosystems) following manufacturer’s recommendations. ChIP-seq

libraries were quantified using Qubit Fluorometer and Agilent 2200 TapeStation (Agilent).

109 Libraries were pooled and sequenced using the Illumina HiSeq 4000 machine as 50-bp single- end sequencing reads.

RNA-seq

As previously described (King et al., 2016). Total RNA was purified using RNeasy per manufacturer’s instructions (Qiagen). PolyA-tailed RNA was purified from total RNA and libraries were constructed using the TruSeq RNA Library Prep Kit v2.0 (Illumina) following manufacturer’s recommendations. RNA-seq libraries were quantified using Qubit Fluorometry and Bioanalyzer High Sensitivity DNA Analysis Kit. Libraries were pooled and sequenced using the Illumina HiSeq 2500 machine as 100-bp single-end sequencing reads.

110 Figures

111 Figure 3.1. DNMT3A mutants differ in functionality in genome methylation A: Schematic showing domain structure of DNMT3A and amino acid sequence homology across species. PWWP domain is shown in orange, ADD domain shown in green, and catalytic methyltransferase domain shown in yellow. B: Schematic of Dnmt3a gene structure with exons shown as blocks. Sequence used for CRISPR-Cas9 genome editing is shown for IN, GS, and PL mutations in inset. PAM sequence is highlighted in green, overgrowth substitution is shown in red, and RFLP genotyping site is italicized and underlined. C: Example of genotyping result from several P904L clones. Top: RFLP analysis using silent HhaI knock-in. Bottom: Sanger sequencing results from individual clones. D: Neural differentiation scheme with representative immunofluorescent images of embryonic stem cells (left), neural progenitor intermediates (middle), and neural stem cells (right). E: Mass spectrometry measurements of global methylation in wild-type J1 versus knock-in ESC and NSCs. F: Schematic of CAG-based expression vector. G: Table indicating de novo Dnmt complements in different cell lines studied. H: Mass spectrometry measurements of global methylation in DKO reconstitution mESCs and ecotopic expression in J1 mESCs. I: Summary of mass spectrometry measurements of global methylation across ectopic expression cell lines.

112

Figure 3.2. DNA methylation changes occur at sites of de novo methylation in NSCs A: Global methylation distribution between unmethylated (<20%), partially methylated (20- 70%), and methylated (>70%) in CpG context. B: Global correlation of promoter methylation between ESC and NSCs. C: Differentially methylated regions in each mutant NSC lines. Above: volcano plots showing p- value and absolute methylation changes. Methylation change are shown as difference in average

113 methylation between mutant and wild-type NSCs. Negative values indicate lower methylation and positive values indicate high methylation in mutants compared to wild-type J1 NSC. Below: Metaplot profiles showing average spatial differences in DNA methylation. D: Heatmap of methylation values across the union of differentially methylated regions (n=508). Regions shown as rows, were organized by k-means clustering. Clusters that show consistent changes are indicated by red vertical line, referred to as overgrowth-associated differentially methylated regions (OG-DMRs). E: Ontology enrichment of OG-DMRs. F: Overlap between OG-DMRs, CpG-islands/shores/shelves, and promoters (within 2kb). G: CpG content comparison between OG-DMRs, CpG islands, CpG island shores, and background genome.

114

115

Figure 3.3. Epigenetic signatures of OG-DMRs and antagonism with H3K27me3 A: Heatmap of H3K4me3, H3K4me1, H3K27ac histone modifications, 5mCpG, and locations of CpG islands. B: Classification of promoter/CGI-associated OG-DMR and non-promoter/non-CGI OG-DMR into different chromatin contexts based on histone modification combinations in NSC C: Heatmap of H3K27me3 between wild-type J1 and mutant NSCs. D: Metaplot profiles of DNA methylation (top) and H3K72me3 (bottom) across all mutant NSC lines. E: Genome browser view around the Nr2f2 locus. Top: smoothed DNA methylation profiles across all NSCs. Colors are as shown previously. Middle: H3K27me3 signal across the genomic region with gene models and CpG islands (green). Bottom: CpG content plot with CpG islands highlighted in red. F: Distribution of OG-DMRs between tissue-specific DMRs.

116

Figure 3.4. Gene expression changes in mutant NSCs A: Heatmap of differentially expressed genes (n=87) between wild-type and mutant neural stem cells. Values are shown as log2(RPKM) centered by subtraction of the row-wise mean with reds higher and blues lower than the mean. Rows were clustered by k-means clustering. B: Functional term enrichment within DEG clusters. C: Overlap between promoter/CGI associated OG-DMR and differentially expressed genes D: Left: Average changes in DNA methylation and fold-change expression between lines. Right: Scatter plot comparison between log-scale fold-change in expression and absolute change in DNA methylation. Each point represents one promoter region in one NSC.

117

Figure 3.5. Structural analysis of Dnmt3a mutants A: Mapping of overgrowth associated mutants on a structural model of DNMT3A (PWWP- ADD-C-terminal domain)-DNMT3L heterotetramer (Rondelet et al., 2016). B: Crystal structure of PWWP domain with key features highlighted (Wu et al., 2011). IN mutation is marked in red. Hydrophobic residues of the beta-barrel are marked in black.

118 C: Left: Different conformations of ADD domain (green) in active and inactive state (Xu et al. 2014). The molecule of SAH shown in stick representation to indicate location of active site. In the inactive state, the ADD domain contacts the methyltransferase domain (yellow) hindering access to the active site. Right: Key residues involved in the ADD-MTase interface in the inactive state. D: Left: Location of the terminal helix (orange) within the methyltransferase domain (yellow). Right: Pro900 causes a 30 degree kink the in the terminal helix. E: All three rotamers of P904L show steric clashing with adjacent structures.

119

Figure 3.6. Cellular characterization of cell lines A: Expression of Nestin across knock-in neural stem cell lines as measured by flow cytometry. B: Cell cycle analysis of neural stem cell lines where % cells in S-phase (red), G1 (green), and G2 (grey) are shown (n=2-3). Error bars represent standard deviation. C: Comparison of growth curves between wild-type and mutant neural stem cells. D: Evaluation of ectopic expression and reconstitution of DNMT in mESCs. Top: RT-PCR quantification of exogenous DNMT expression across cell lines. Bottom: Immunofluorescent staining of DNMT3A reconstitution in DKO mESCs.

120

Figure 3.7. WGBS DNA methylation metrics A: Comparison of WGBS samples between biological replicates. Pearson correlation is shown for CpG sites with >10X coverage. B: Base pair resolution methylation infromation across WGBS samples. Top: Extent of methylation with percent of individual CpG sites that are methylation (>70%), partially methylated (20-70%), or unmethylated (<20%). Middle: Distribution of 5mC between CpG and non-CpG contexts. Bottom: Distribution of 5mC between CHH and CHG contexts C: Genome browser view near Meg3 imprinted locus. Smoothed DNA methylation profile shown for each cell lines with colors as shown previously. Gene model, ChromHMM, and CpG content are shown below. Two known DMRs, intergenic DMR (Ig-DMR) and Meg3-DMR, involved in gene regulation are shown.

121

Figure 3.8. DNA Methylation and H3K27me3 of Nr2f2 in developing mouse embryos A: WGBS tracks at the Nr2f2 locus across developmental stages E10.5-P0 in Forebrain, Hindbrain, Gut, and Heart tissues (data from ENCODE Consortium). B: H3K27me3 tracks at Nr2f2 locus across developmental stages similar to panel A.

122 References

Challen, G. a, Sun, D., Jeong, M., Luo, M., Jelinek, J., Berg, J.S., Bock, C., Vasanthakumar, A., Gu, H., Xi, Y., et al. (2012). Dnmt3a is essential for hematopoietic stem cell differentiation. Nat. Genet. 44, 23–31.

Chen, T., Ueda, Y., Dodge, J.E., Wang, Z., and Li, E. (2003). Establishment and maintenance of genomic methylation patterns in mouse embryonic stem cells by Dnmt3a and Dnmt3b. Mol. Cell. Biol. 23, 5594–5605.

Conti, L., Pollard, S.M., Gorba, T., Reitano, E., Toselli, M., Biella, G., Sun, Y., Sanzone, S., Ying, Q.-L., Cattaneo, E., et al. (2005). Niche-independent symmetrical self-renewal of a mammalian tissue stem cell. PLoS Biol. 3, e283.

Doench, J.G., Fusi, N., Sullender, M., Hegde, M., Vaimberg, E.W., Donovan, K.F., Smith, I., Tothova, Z., Wilen, C., Orchard, R., et al. (2016). Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol. 34, 184–191.

Egelhofer, T. a, Minoda, A., Klugman, S., Lee, K., Kolasinska-Zwierz, P., Alekseyenko, A. a, Cheung, M.-S., Day, D.S., Gadel, S., Gorchakov, A. a, et al. (2011). An assessment of histone- modification antibody quality. Nat. Struct. Mol. Biol. 18, 91–93.

Gaspard, N., Bouschet, T., Herpoel, A., Naeije, G., van den Ameele, J., and Vanderhaeghen, P. (2009). Generation of cortical neurons from mouse embryonic stem cells. Nat. Protoc. 4, 1454– 1463.

Ge, Y.-Z., Pu, M.-T., Gowher, H., Wu, H.-P., Ding, J.-P., Jeltsch, A., and Xu, G.-L. (2004). Chromatin targeting of de novo DNA methyltransferases by the PWWP domain. J. Biol. Chem. 279, 25447–25454.

Guo, X., Wang, L., Li, J., Ding, Z., Xiao, J., Yin, X., He, S., Shi, P., Dong, L., Li, G., et al. (2014). Structural insight into autoinhibition and histone H3-induced activation of DNMT3A. Nature.

Hon, G.C., Rajagopal, N., Shen, Y., McCleary, D.F., Yue, F., Dang, M.D., and Ren, B. (2013). Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat. Genet. 45, 1198–1206.

Irizarry, R.A., Ladd-Acosta, C., Wen, B., Wu, Z., Montano, C., Onyango, P., Cui, H., Gabo, K., Rongione, M., Webster, M., et al. (2009). The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat. Genet. 41, 178– 186.

Jia, D., Jurkowska, R.Z., Zhang, X., Jeltsch, A., and Cheng, X. (2007). Structure of Dnmt3a bound to Dnmt3L suggests a model for de novo DNA methylation. Nature 449, 248–251.

Kim, S.J., Zhao, H., Hardikar, S., Singh, A.K., Goodell, M. a, and Chen, T. (2013). A DNMT3A

123 mutation common in AML exhibits dominant-negative effects in murine ES cells. Blood 122, 4086–4089.

King, A.D., Huang, K., Rubbi, L., Liu, S., Wang, C.-Y., Wang, Y., Pellegrini, M., and Fan, G. (2016). Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Rep. 17, 289–302.

Ley, T.J., Ding, L., Walter, M.J., McLellan, M.D., Lamprecht, T., Larson, D.E., Kandoth, C., Payton, J.E., Baty, J., Welch, J., et al. (2010). DNMT3A mutations in acute myeloid leukemia. N. Engl. J. Med. 363, 2424–2433.

Li, J.-Y., Pu, M.-T., Hirasawa, R., Li, B.-Z., Huang, Y.-N., Zeng, R., Jing, N.-H., Chen, T., Li, E., Sasaki, H., et al. (2007). Synergistic function of DNA methyltransferases Dnmt3a and Dnmt3b in the methylation of Oct4 and Nanog. Mol. Cell. Biol. 27, 8748–8759.

Noh, K.-M., Wang, H., Kim, H.R., Wenderski, W., Fang, F., Li, C.H., Dewell, S., Hughes, S.H., Melnick, A.M., Patel, D.J., et al. (2015). Engineering of a Histone-Recognition Domain in Dnmt3a Alters the Epigenetic Landscape and Phenotypic Features of Mouse ESCs. Mol. Cell 59, 89–103.

Ogata, T., and Kagami, M. (2016). Kagami-Ogata syndrome: a clinically recognizable upd(14)pat and related disorder affecting the chromosome 14q32.2 imprinted region. J. Hum. Genet. 61, 87–94.

Okano, M., Bell, D.W., Haber, D. a, and Li, E. (1999). DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 99, 247–257.

Ran, F.A., Hsu, P.D., Wright, J., Agarwala, V., Scott, D.A., and Zhang, F. (2013). Genome engineering using the CRISPR-Cas9 system. Nat. Protoc. 8, 2281–2308.

Richardson, C.D., Ray, G.J., DeWitt, M.A., Curie, G.L., and Corn, J.E. (2016). Enhancing homology-directed genome editing by catalytically active and inactive CRISPR-Cas9 using asymmetric donor DNA. Nat. Biotechnol. 34, 339–344.

Rondelet, G., Dal Maso, T., Willems, L., and Wouters, J. (2016). Structural basis for recognition of histone H3K36me3 nucleosome by human de novo DNA methyltransferases 3A and 3B. J. Struct. Biol. 194, 357–367.

Russler-Germain, D. a., Spencer, D.H., Young, M. a., Lamprecht, T.L., Miller, C. a., Fulton, R., Meyer, M.R., Erdmann-Gilmore, P., Townsend, R.R., Wilson, R.K., et al. (2014). The R882H DNMT3A mutation associated with AML dominantly inhibits wild-type DNMT3A by blocking its ability to form active tetramers. Cancer Cell 25, 442–454.

Sanders, S.J., He, X., Willsey, A.J., Ercan-Sencicek, A.G., Samocha, K.E., Cicek, A.E., Murtha, M.T., Bal, V.H., Bishop, S.L., Dong, S., et al. (2015). Insights into Autism Spectrum Disorder Genomic Architecture and Biology from 71 Risk Loci. Neuron 87, 1215–1233.

Tatton-Brown, K., Seal, S., Ruark, E., Harmer, J., Ramsay, E., Del Vecchio Duarte, S.,

124 Zachariou, A., Hanks, S., O’Brien, E., Aksglaede, L., et al. (2014). Mutations in the DNA methyltransferase gene DNMT3A cause an overgrowth syndrome with intellectual disability. Nat. Genet. 46, 385–388.

Tatton-Brown, K., Loveday, C., Yost, S., Clarke, M., Ramsay, E., Zachariou, A., Elliott, A., Wylie, H., Ardissone, A., Rittinger, O., et al. (2017). Mutations in Epigenetic Regulation Genes Are a Major Cause of Overgrowth with Intellectual Disability. Am. J. Hum. Genet. 100, 725– 736.

Temple, I.K., Shrubb, V., Lever, M., Bullman, H., and Mackay, D.J.G. (2007). Isolated imprinting mutation of the DLK1/GTL2 locus associated with a clinical presentation of maternal uniparental disomy of chromosome 14. J. Med. Genet. 44, 637–640.

Tlemsani, C., Luscan, A., Leulliot, N., Bieth, E., Afenjar, A., Baujat, G., Doco-Fenzy, M., Goldenberg, A., Lacombe, D., Lambert, L., et al. (2016). SETD2 and DNMT3A screen in the Sotos-like syndrome French cohort. J. Med. Genet. 53, 743–751.

Wilman, H.R., Shi, J., and Deane, C.M. (2014). Helix kinks are equally prevalent in soluble and membrane proteins. Proteins 82, 1960–1970.

Wu, H., Coskun, V., Tao, J., Xie, W., Ge, W., Yoshikawa, K., Li, E., Zhang, Y., and Sun, Y.E. (2010). Dnmt3a-dependent nonpromoter DNA methylation facilitates transcription of neurogenic genes. Science 329, 444–448.

Zhang, Y., Jurkowska, R., Soeroes, S., Rajavelu, A., Dhayalan, A., Bock, I., Rathert, P., Brandt, O., Reinhardt, R., Fischle, W., et al. (2010). Chromatin methylation activity of Dnmt3a and Dnmt3a/3L is guided by interaction of the ADD domain with the histone H3 tail. Nucleic Acids Res. 38, 4246–4253.

Ziller, M.J., Gu, H., Müller, F., Donaghey, J., Tsai, L.T.-Y., Kohlbacher, O., De Jager, P.L., Rosen, E.D., Bennett, D. a, Bernstein, B.E., et al. (2013). Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481.

125

Chapter 4:

Discussion and Concluding Remarks

126 DNA methylation is one of the most well studied and understood epigenetic regulators in

the genome. However, as we appreciate new complexities, such as genomic and biological

contexts, our fundamental understanding of DNA methylation is continually being refined. My

work presented in this dissertation looks at DNA methylation in these new contexts. Here I

discuss how my findings relate to the broader field and introduce some interesting frontiers in the field of DNA methylation.

First principles of epigenetics

Interaction with transcription factors

Histone modifications have been an invaluable marker used to identify different environments across the genome due to strong associations between specific modifications and functional activity, such as binding of activating transcription factors. In Chapter 2, I observed that distal regions that became activated upon global demethylation, as indicated by gain of

H3K27ac, were reversibly silenced upon remethylation of the genome (King et al., 2016). The

reversible changes in H3K27ac likely indicate the reversible binding of an activating

transcription factor. This was previously shown to be true, specifically for the transcription

NRF1, where NRF1 bound to its motif in the absence of DNA methylation and was associated

with a gain in H3K27ac (Domcke et al., 2015). The NRF1 motif was also identified by motif

enrichment in my work described in Chapter 2 (King et al., 2016). Of note, reversibility was

shown in the study by Domcke et al., albeit by switching culture conditions (2i versus conventional serum), which may introduce other confounding variables. Nonetheless, these two studies together suggest that DNA methylation has an important role in regulating a subset of transcription factors.

127 Several recent studies have further investigated the relationship between DNA

methylation and transcription factors (Kribelbauer et al., 2017; Yin et al., 2017). Utilizing

massive protein binding assays to DNA templates, the authors identified that around 60% of

transcription factors studied could bind methylated sequence and whose binding was influenced

either positively or negatively. More than half of these transcription factors were “MethylPlus”,

which exhibited enhanced binding when the motif was methylated. Among these MethylPlus

transcription factors was OCT4 (POU5F1), which showed greater occupancy in hypermethylated

cells. Based on my work along with a number of recent studies, I believe it is likely that DNA

methylation plays a much more important role in regulating transcription factor binding than previously thought. It will be interesting to cross-reference future transcription factor binding data with histone modification changes in methylation altered mESCs.

Interface of epigenetic machinery with environmental cues

The epigenetic apparatus is an interface between the highly variable and dynamic

environment, changing day-to-day, and the contrastingly static landscape of the genome,

changing slowly across lifetimes. Epigenetics, thus, allows for different interpretations of the

same genome based on the environment. DNA methylation has been shown to change rapidly in

response to environmental changes. One of the most striking examples of this comes from a

study measuring methylation changes in rats undergoing fear conditioning. In this set of

experiments, odor fear conditioning in rats resulted in DNA methylation alterations at olfactory

genes in sperm within a mere 10 days (Dias and Ressler, 2014). Another example is of stimulus-

induced methylation changes in the mouse brain, where changes in DNA methylation can be

observed as soon as 4 hours after stimulation (Guo et al., 2011).

128 New mechanisms shed light on how environmental changes could alter DNA methylation

in short timespans. Recent work demonstrated that phosphorylation of DNMT3A by CK2 could regulate localization of Dnmt3a to repeats in heterochromatin (Deplus et al., 2014). Post- translational modifications of the de novo DNMTs have not received much attention. Future work studying how post-translational regulation of DNA methylation may yield exciting lessons on how certain regions of the genome can respond exquisitely quickly to environmental stimuli.

Cause versus Consequence

One of the most important questions relating to DNA methylation remains whether DNA

methylation is a cause or consequence of particular changes in the genome. Evidence exists

supporting both arguments. My work in Chapter 2 has provided some support that DNA

methylation is capable of serving an instructive or causal role at promoters and enhancers.

However this is limited to ESCs and the generality must still be investigated.

This question is most pertinent to DNA methylation’s activity at specific genomic

features. For example, what does DNA methylation do at CpG island shores? Could DNA

methylation at CpG island shores be regulating the activity of these promoters? In Chapter 2, I

found that global DNA methylation regulates H3K27me3 at bivalent promoters, which are

highly associated with CpG island promoters. In Chapter 3, I also find that DNA methylation

changes at specific OG-DMRs also are associated with H3K27me3 changes near CpG islands.

Interestingly, DNMT3A has been shown to localize adjacent to CpG islands in neural stem cells

(Wu et al., 2010). However, little is known about whether this localization is a cause or a

consequence. Future studies will require specific targeting of methylation changes and

measurement of whether DNMTs can cause local changes in chromatin structure.

129

Emerging concepts

One of the ground-breaking studies of DNA methylation in the last few years has been a

novel study of methylation dynamics in different cell states (Shipony et al., 2014). Shipony et al.

used a combination of bisulfite sequencing and unique molecular identifiers (UMI), which allowed measurement of methylation status of individual DNA molecules, to explain the observation that DNA methylation in ESC and testis is highly homogenous compared to all other tissues (Landan et al., 2012). They found that ESCs utilize a turnover-based equilibrium, which reinforces the methylation state. In contrast, somatic cells use clonal maintenance, leading to the emergence of epipolymorphisms or epimutations.

As a consequence of these findings and previous work, Amos Tanay’s group has provided a mechanism of how alterations in methylation patterns may occur in cancers (Landan et al., 2012). In the same study, the authors investigated regions that are hypomethylated in cancer cell lines and showed that regions with higher noise in ESCs and clonal maintenance in fibroblasts have a predisposition for hypermethylation in cancer cell lines. Cancer cell lines, similar to fibroblasts and T cells, showed more clonal methylation memory, which shows that these cells can accumulate and clonally propagate epimutations. Thus, in addition to genetic mutations, the genome can accumulate epimutations. The extent that epimutations play a role in disease, and how these methylation alterations function are of great interest.

Understanding epigenetic alterations in disease context

Similar to the increasing complexity and challenge moving from single gene Mendelian

disorders to complex traits encoded by multiple genes, epigenetic-focused diseases also fall on

130 the same spectrum. Single locus epigenetic diseases such as imprinting disorders have been

studied and well understood. The mechanism of the Beckwith-Wiedemann locus on chr11 has been finely mapped with the identity and function of most genomic elements understood.

However, many epigenetically-related disease are not consequences of individual loci or have not yet had their main actors identified. For example, in cancers, global methylation changes occur in conjunction with genetic alterations. In some cases, genetic alterations are associated with epigenetic changes, such as the mutator phenotype often resulting from hypermethylation of

DNA repair genes associated with the methylator phenotype. The multitude of possible effects makes studying these epigenetic diseases particularly difficult.

In the case of overgrowth syndromes, we are currently trying to answer some of the most

elementary questions about this new class genetic diseases rooted in mutations of the epigenetic

machinery. It is unclear what effect the mutations have on the epigenetic machinery and whether

the changes effect multiple processes or converge on a single pathway. In Chapter 3, I answer

some of these questions in the first step towards understanding one example of this class of

disorders. Mutations in almost all cases are loss of function. Although there are several hundred

loci exhibiting changes in methylation and several dozen genes differentially expressed, very few

genes show correlation between methylation and expression owing to few promoters under direct

regulation by DNA methylation. This is consistent with the one publication to date measuring

global methylation and gene expression in peripheral blood leukocytes from DNMT3A

overgrowth patients, where the authors found no trends between expression and methylation

(Spencer et al., 2017). However, methylation changes are enriched in imprinted genes with

known function in growth and intellectual function. This provides early evidence that mutations

131 in epigenetic machinery may converge on individual mechanisms rather than multiple disparate gene programs.

Many technical challenges exist for studying epigenetic changes in overgrowth syndromes. Most studies utilizing genomic techniques have been aimed at investigating differences between cell types. In Chapter 3, I aim to identify specific changes arising from single point mutations in DNMT3A. Not only are the downstream effects of DNMT not known, but the effect sizes are often far smaller than differences related to cell types. Furthermore, point mutations may alter other properties, such as dynamics/noise. Another challenge that will require further attention is how truly generalizable are the fundamental principles of how DNA methylation operates. As noted previously, the emerging concept that cells in different developmental stages maintain DNA methylation patterns in different ways demonstrates that caution is needed moving forward.

In conclusion, this dissertation takes two key steps into furthering the field’s understanding about DNA methylation. The first is to better understand the fundamental principles by which DNA methylation acts in the epigenome. In this work, I’ve shown that DNA methylation is capable of playing an instructive role, causing remodeling of the histone modification landscape in response to global DNA methylation changes in mouse embryonic stem cells. The second step is to understand the role of DNA methylation in a biological context.

Overgrowth-associated mutations in Dnmt3a were shown to result in loss-of-function at a number of sites, including known imprinted loci.

132 References

Deplus, R., Blanchon, L., Rajavelu, A., Boukaba, A., Defrance, M., Luciani, J., Rothé, F., Dedeurwaerder, S., Denis, H., Brinkman, A.B., et al. (2014). Regulation of DNA methylation patterns by CK2-mediated phosphorylation of Dnmt3a. Cell Rep. 8, 743–753.

Dias, B.G., and Ressler, K.J. (2014). Parental olfactory experience influences behavior and neural structure in subsequent generations. Nat. Neurosci. 17, 89–96.

Domcke, S., Bardet, A.F., Adrian Ginno, P., Hartl, D., Burger, L., and Schübeler, D. (2015). Competition between DNA methylation and transcription factors determines binding of NRF1. Nature 528, 575–579.

Guo, J.U., Ma, D.K., Mo, H., Ball, M.P., Jang, M.-H., Bonaguidi, M.A., Balazer, J.A., Eaves, H.L., Xie, B., Ford, E., et al. (2011). Neuronal activity modifies the DNA methylation landscape in the adult brain. Nat. Neurosci. 14, 1345–1351.

King, A.D., Huang, K., Rubbi, L., Liu, S., Wang, C.-Y., Wang, Y., Pellegrini, M., and Fan, G. (2016). Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Rep. 17, 289–302.

Kribelbauer, J.F., Laptenko, O., Chen, S., Martini, G.D., Freed-Pastor, W.A., Prives, C., Mann, R.S., and Bussemaker, H.J. (2017). Quantitative Analysis of the DNA Methylation Sensitivity of Transcription Factor Complexes. Cell Rep. 19, 2383–2395.

Landan, G., Cohen, N.M., Mukamel, Z., Bar, A., Molchadsky, A., Brosh, R., Horn-Saban, S., Zalcenstein, D.A., Goldfinger, N., Zundelevich, A., et al. (2012). Epigenetic polymorphism and the stochastic formation of differentially methylated regions in normal and cancerous tissues. Nat. Genet. 44, 1207–1214.

Shipony, Z., Mukamel, Z., Cohen, N.M., Landan, G., Chomsky, E., Zeliger, S.R., Fried, Y.C., Ainbinder, E., Friedman, N., and Tanay, A. (2014). Dynamic and static maintenance of epigenetic memory in pluripotent and somatic cells. Nature 513, 115–119.

Spencer, D.H., Russler-Germain, D.A., Ketkar, S., Helton, N.M., Lamprecht, T.L., Fulton, R.S., Fronick, C.C., O’Laughlin, M., Heath, S.E., Shinawi, M., et al. (2017). CpG Island Hypermethylation Mediated by DNMT3A Is a Consequence of AML Progression. Cell 168, 801–816.e13.

Wu, H., Coskun, V., Tao, J., Xie, W., Ge, W., Yoshikawa, K., Li, E., Zhang, Y., and Sun, Y.E. (2010). Dnmt3a-dependent nonpromoter DNA methylation facilitates transcription of neurogenic genes. Science 329, 444–448.

Yin, Y., Morgunova, E., Jolma, A., Kaasinen, E., Sahu, B., Khund-Sayeed, S., Das, P.K., Kivioja, T., Dave, K., Zhong, F., et al. (2017). Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356, eaaj2239.

133