Biology of maintenance and de novo methylation mediated by DNA methyltransferase-1

Olga Yarychkivska

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy under the Executive Committee of the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2017

© 2017 Olga Yarychkivska All rights reserved ABSTRACT

Biology of maintenance and de novo methylation mediated by DNA methyltransferase-1 Olga Yarychkivska

Within the past 70 years since the discovery of 5-methylcytosine, we have acquired considerable knowledge about genomic DNA methylation patterns, the dynamics of DNA methylation throughout development, and the enzymatic machinery that establishes and perpetuates genomic methylation patterns. Nonetheless, in the field of epigenetics major questions remain open about the mechanisms of spatiotemporal control that exist to ensure the fidelity of methylation patterns. This thesis aims to decipher the regulatory logic and upstream pathways influencing one of the DNA methyltransferases by leveraging the diverse resources of molecular genetics, biochemistry, and structural biology.

The primary subject of my research, DNA methyltransferase 1 (DNMT1), is crucial for maintaining genomic methylation patterns upon DNA replication and cell division. In addition to its C-terminal catalytic domain, mammalian DNMT1 harbors several N-terminal domains of unknown function: a succession of seven glycine-lysine (GK) repeats, resembling histone tails, and two Bromo-Adjacent Homology (BAH) domains that are absent from bacterial DNA methyltransferases. The work I present in this thesis characterizes the role of these hitherto enigmatic domains in regulating DNMT1 activity.

In my studies, I found that mutation of the (GK) repeats motif leads to de novo methylation by DNMT1 specifically at paternally imprinted . Conventionally, de novo methylation is thought to be undertaken by complete different enzymes, DNMT3A and

DNMT3B, whereas DNMT1 is limited to perpetuating the patterns these other methyltransferases had set down. Recombinant DNMT1 had been previously shown to efficiently methylate unmethylated DNA substrate in vitro, but this is the first time its de novo methyltransferase capability has been observed in vivo. Based on these data, I propose a new model in which DNMT1 is the enzyme responsible for laying down de novo methylation patterns at paternally imprinted genes in the male germline, explaining the previously observed non-essential role of other DNA methyltransferases in the establishment of paternal imprints.

Furthermore, I demonstrated that acetylation of the (GK) repeats motif inhibits this de novo methyltransferase activity of DNMT1, making this particular motif an essential regulatory platform for controlling the diverse in vivo functions of the enzyme.

Though the (GK) repeats motif had previously been proposed to regulate the stability of

DNMT1 through its interaction with the deubiquitinase USP7, I tested the biological relevance of this interaction and found that USP7 deletion does not alter DNMT1 protein levels.

In fact, USP7 appears to play no part in regulating maintenance DNA methylation, as I present evidence that USP7 localization to replication foci is entirely independent of DNMT1.

Finally, I demonstrated that the tandem BAH domains of DNMT1 are required for its maintenance methyltransferase activity as they are involved in targeting the enzyme to replication foci during S phase. Based on biochemical data supporting an interaction between

DNMT1's BAH1 domain and histones, I propose that this targeting could occur through BAH1's recognition of specific histone modifications, thus providing a potential mechanistic link between maintenance DNA methylation and chromatin markings.

This thesis identifies DNMT1 as a novel de novo methyltransferase in vivo and also characterizes the regulatory functions of the enzyme's BAH domains and the (GK) repeats. These results elucidate the multiple regulatory mechanisms within the DNMT1 molecule itself that control its functions in mammalian cells, thereby providing critical insights as to how the DNA methylation landscape takes shape and yielding surprising revelations about the parts that well- studied have to play in this process.

TABLE OF CONTENTS

LIST OF FIGURES ...... iv LIST OF ABBREVIATIONS ...... v CHAPTER I. GENERAL INTRODUCTION ...... 1 I.1. Patterns and Machinery of DNA Methylation ...... 3 a. Overall composition of the mammalian genome ...... 3 b. Methylation patterns and different genomic compartments ...... 6 c. DNA methyltransferases ...... 7 I.2. Functions of DNA Methylation ...... 11 a. Defense against transposable elements ...... 11 b. Genomic stability and apoptosis ...... 11 c. Imprinting ...... 13 d. X chromosome inactivation ...... 15 I.3. Methylation Dynamics ...... 17 a. Developmental overview...... 17 b. Establishment of sex-specific genomic methylation in the germline ...... 18 c. Correlation with chromatin landscape ...... 20 I.4. Role and Regulation of DNMT1 ...... 26 a. DNMT1’s discovery and genetic studies ...... 26 b. Structural insights into DNMT1 regulation ...... 29 i. RFTS and targeting to sites of DNA replication ...... 29 ii. Autoinhibitory role of CXXC domain ...... 32 iii. BAH domains ...... 32 iv. (GK) repeats ...... 34 v. Methyltransferase domain and mechanism of methylation ...... 34 c. Regulation of DNMT1 by other proteins ...... 36 i. UHRF1 is essential for maintenance DNA methylation ...... 36 ii. Acetylation-ubiquitination regulation by USP7 and TIP60 ...... 37 I.5. Outline of Thesis ...... 39 CHAPTER II. ACETYLATED (GK) REPEATS PREVENT DNMT1 FROM CARRYING OUT DE NOVO METHYLATION AT PATERNALLY-IMPRINTED GENES IN EMBRYONIC STEM CELLS ...... 40 II.1. Rationale and Summary ...... 40 II.2. Materials and Methods ...... 41 a. Cell lines and cell culture ...... 41

i

b. Immunoblotting ...... 42 c. Methylation analysis ...... 42 d. Southern blots ...... 42 e. Bisulfite sequencing ...... 43 f. RT-qPCR ...... 44 g. Statistical analysis ...... 44 II.3. Results ...... 45 a. GR substitution of (GK) repeats does not alter DNMT1 stability ...... 45 b. GR DNMT1 rescues methylation globablly and at repetitive sequences ...... 45 c. GR DNMT1 produces ectopic gain of methylation at paternally-imprinted genes ...... 46 d. Expression of GR DNMT1 has no effect on methylation at maternally-imprinted genes ...... 47 e. Gain of methylation at paternally-imprinted genes results in transcriptional changes ...... 48 II.4. Interpretation ...... 61 a. Expression of DNMT1 in gonocytes during male imprint establishment ...... 61 b. Sex-specific function of DNA methyltransferases in the germline ...... 62 c. Phenotypic specificity to paternal differentially methylated regions ...... 63 d. Distinction between paternal and maternal imprints ...... 63 e. Future directions ...... 64 CHAPTER III. USP7-INDEPENDENT REGULATION OF DNMT1 EXPRESSION ...... 68 III.1. Rationale and Summary ...... 68 III.2. Manuscript ...... 69 III.3. Interpretation ...... 86 a. Manuscript contributions...... 87 CHAPTER IV. THE ROLE OF BAH DOMAINS IN REGULATING DNMT1 LOCALIZATION AND FUNCTION ...... 88 IV.1. Rationale and Summary...... 88 IV.2. Materials and Methods ...... 90 a. Cell lines and cell culture ...... 90 b. CRISPR mutagenesis ...... 90 c. Immunoblotting ...... 91 d. Methylation analysis ...... 91 e. Southern blots ...... 92 f. Immunofluorescence...... 92

ii

g. Protein purification ...... 93 h. Histone peptide array ...... 93 i. In vitro pull-down of ES cell extracts and recombinant proteins...... 94 j. Structural analysis ...... 94 IV.3. Results ...... 95 a. Deletion of BAH domains results in stable DNMT1 protein ...... 95 b. BAH domains are required for global maintenance of DNA methylation ...... 95 c. BAH domains are required for methylation at different genomic compartments to variable degrees ...... 96 d. Deletion of BAH domains prevents targeting of DNMT1 to replication foci during S phase ...... 106 e. BAH1 domain is predicted to be a histone-binding domain ...... 106 f. BAH1 domain recognizes specific modifications on histone tails in vitro ...... 107 g. BAH1 domain binds to chromatin factors in vitro ...... 108 h. Point mutation in BAH1 binding cage is not sufficient for disruption of BAH1 function ...... 108 IV.4. Interpretation ...... 110 a. Proper targeting of DNMT1 is required for maintenance methylation ...... 110 b. Future directions ...... 112 CHAPTER V. GENERAL DISCUSSION ...... 114 V.1. Further Perspectives ...... 118 REFERENCES ...... 122

iii

LIST OF FIGURES

FIGURE 1A: Composition of the human genome and distribution of DNA methylation ...... 4 FIGURE 1B: De novo versus maintenance activities of DNA methyltransferases ...... 5 FIGURE 1C: Structure and regulation of DNMT1 ...... 24 FIGURE 1D: The (GK) repeats in DNMT1 ...... 25 TABLE 1: List of genetic studies of Dnmt1 ...... 27 FIGURE 1E: Pathogenic mutations of DNMT1 ...... 30 FIGURE 2A: DNMT1 minigene system used for genetic rescue experiments ...... 41 TABLE 2: Primers for bisulfite sequencing of imprinted genes ...... 43 FIGURE 2B: Substitution of lysines by arginines within the (GK) repeats does not affect maintenance methylation ...... 49 FIGURE 2C: Methylation status of H19 paternally imprinted DMR...... 50 FIGURE 2D: Methylation status of Dlk1/Gtl2 paternally imprinted DMR ...... 51 FIGURE 2E: Methylation status of Rasgfr1 paternally imprinted DMR ...... 52 FIGURE 2F: Methylation status of Kcnq1ot1 maternally imprinted DMR ...... 53 FIGURE 2G: Methylation status of Igfr2r maternally imprinted DMR...... 54 FIGURE 2H: Methylation status of Snrpn maternally imprinted DMR...... 55 FIGURE 2I: Methylation status of Peg1 maternally imprinted DMR ...... 56 FIGURE 2J: Methylation status of Peg3 maternally imprinted DMR ...... 57 TABLE 3: Summary of methylation status of imprinted genes in GK and GR cells ...... 58 FIGURE 2K: Expression of H19 is altered in GR clones ...... 59 FIGURE 2L: Model of acetylation-dependent maintenance and de novo functions of DNMT1 ...... 60 FIGURE 4A: Deletion of BAH domains at endogenous locus yields stable protein at wild-type levels .. 97 FIGURE 4B: ES cells with BAH deletion exhibit genome-wide demethylation ...... 98 FIGURE 4C: DNMT1∆BAH clones exhibit mislocalization of the protein during S phase ...... 99 FIGURE 4D: Identification of BAH1 and BAH2 binding pockets ...... 100 FIGURE 4E: BAH domains bind to methylated H3K27 and acetylated H2B peptides in vitro ...... 101 FIGURE 4F: BAH1 and BAH2 bind to methylated H3K27 and acetylated H2B peptides in vitro ...... 102 FIGURE 4G: BAH1 alone binds to methylated H3K27 and acetylated H2B peptides in vitro ...... 103 FIGURE 4H: Identification of BAH1 potential binding partners in vitro ...... 103 FIGURE 4I: Point mutation in BAH1 is not sufficient to disrupt its function ...... 104 FIGURE 4J: Structural model of DNMT1 recruitment to nucleosomes via its BAH domains ...... 105

iv

List of Abbreviations

BAH domain Bromo-Adjacent Homology domain

(GK) repeats glycine-lysine repeats

DNMT1 DNA methyltransferase-1

DNMT3A DNA methyltransferase-3A

DNMT3B DNA methyltransferase-3B

DNMT3L DNA methyltransferase-3L

DNMT3C DNA methyltransferase-3C

USP7 Ubiquitin-specific protease 7 (also known as HAUSP)

CpG site Cytosine-phosphate-guanine site

LTR Long Terminal Repeat

IAP Intracisternal A-particles

LINE Long Interspersed Element

SINE Short Interspersed Element

PWWP Pro-Trp-Trp-Pro domain

PHD-like Plant Homeodomain-like

H1 Histone H1

v

ESC Embryonic Stem Cells

DMR Differentially methylated region

RFTS Replication focus targeting sequence

CXXC Cys-X-X-Cys domain

TRD Target Recognition domain

UHRF1 Ubiquitin-like, containing PHD and RING finger domain 1

SDS Sodium Dodecyl Sulfate

PBS Phosphate Buffer Saline

RT-qPCR Real-Time Quantitative PCR

TIP60 Tat-interactive protein 60kDa shRNA short hairpin RNA

MEF Mouse Embryonic Fibroblasts

UBL3 Ubiquitin-like 3 domain

PCNA Proliferating Cell Nuclear Antigen

vi

Acknowledgements

I am indebted to many people for their guidance, support and help throughout my work on this thesis dissertation. I spent some of the best years of my life so far pursuing my Ph.D., and the successful attainment of this degree was by the grace of God. I am humbled to have studied fascinating biological mechanisms and complex molecules, which are the beautiful creation of the Lord, and this Ph.D. dissertation is to glorify His perfect design, wisdom and power.

I would like to express sincere gratitude to my Ph.D. advisor Dr. Timothy H. Bestor, who accepted me in his lab and gave me much freedom to explore different scientific questions. His kindness and respect were always a source of great encouragement, and his clear focus on the problem at hand and desire to help were essential in guiding me. I also want to give profound thanks to a post-doctoral researcher in the Bestor lab, Dr. Mathieu Boulard, for his close mentorship and companionship throughout all these years in the lab. He taught me that something should be accomplished every single day and that persistence and optimism bring great results. Mathieu was always an excellent example to follow and representative of the kind of passionate post-doc I want to be. My lab in general taught me to be a freestanding scientist with my own opinions, for which I am immensely grateful.

I want to thank my graduate school colleagues, classmates, and the Department of

Genetics and Development for creating a wonderful collaborative atmosphere, where we taught each other, learned from each other, and shared reagents. I especially want to thank Dr. Gina

DeStefano, a fellow graduate student, for her special friendship, inspiration, and help with personal and professional challenges, and Dr. Tulsi Patel, my classmate and roommate of five years, who suggested a wonderfully efficient way to end my cloning struggles. I am grateful to

vii my other close graduate school friends: Dr. Nilushi DeSilva, Dr. Matt Borok, Dr. Sarah Deng,

Dr. Łucja Grajkowska, and Sedef Onal. I want to thank Dr. Victor Lin from the transgenic mouse facility at CUMC for his advice on cloning, CRISPR, mouse embryonic stem cell techniques and his generous sharing of reagents. I want to recognize members of the Costantini and

Papaioannou labs, located on the same floor as my lab (14th floor of Hammer building); I especially wish to thank Dr. Adam Packard for teaching me how to pick ES cell clones, a technique I have since used many times. I also owe thanks to our collaborators: Dr. Nicholas

Babault and Dr. Ming-Ming Zhou from Mount Sinai School of Medicine for help with protein purifications, Dr. Jikui Song from UC Riverside for protein expressing plasmids, and Dr. Omid

Tawana and Dr. Wei Gu from Columbia University Medical Center (CUMC) for collaboration on Usp7.

I want to acknowledge the great role that my high school biology teacher, Ulyana

Miskevych, played in the development of my interest in research. I am grateful that she chose me to do a student research study on the ecology of the endangered plant Lunaria, which I subsequently presented at the Minor Academy of Sciences competition and at the International

Conference on Biodiversity in Chernivtsi, Ukraine. I want to thank Dr. Mark Andrake, who mentored me during a summer research program at the Fox Chase Center. I am grateful to Dr.

Peter Oelkers and Dr. Felice Elephant for letting me do research in their labs during my undergraduate years at Drexel University. I want to sincerely thank my chemistry teacher James

Purcell from Manor College for encouraging me to think about graduate school.

I owe everything to my wonderful parents, sister Oksana, grandparents and family. My father’s parents taught me the meaning of hard work, and my mother’s parents taught me the value and importance of good education. My grandmother on my mother’s side was the first one

viii who introduced me to a book on genetics, and as a child I loved visiting her biology classroom at the local school. My brilliant parents, who are both engineers, were always ready to help with anything and greatly encouraged my sister’s and my education. Thanks to their decision to move to the United States from Ukraine, I had the opportunity to end up at Columbia University and work with some of the best scientists in the world.

I am immensely grateful to my dear husband Dr. Joseph M. Villarin for his support, love, help and companionship throughout my Ph.D. It was special to start our relationship at Columbia

University and go through all the wonderful experiences and challenges together.

ix

To my husband Joseph Villarin and our unborn son

x

CHAPTER I. GENERAL INTRODUCTION

The existence of DNA methylation at cytosine residues has been known for almost 70 years (HOTCHKISS 1948). In 1948, Hotchkiss first discovered 5-methylcytosine in calf thymus preparation using paper chromatography by observing that this novel nucleotide separated from cytosine in the same manner that thymine (also known as methyluracil) separated from uracil

(HOTCHKISS 1948). Biochemical studies with deoxyribonucleic and ribonucleic acids and cell extracts showed that DNA polymerase has no ability to distinguish between 5-methylcytosine and cytosine (Gold et al. 1963; Bessman et al. 1958), suggesting that addition of a small methyl group to cytosine does not block DNA replication nor, most likely, .

More than 40 years ago, scientists first predicted the existence of two general classes of

DNA methyltransferases: de novo enzymes that would establish methylation patterns during embryonic development, and maintenance enzymes that would replicate methylation patterns during cell division by methylating hemimethylated CpG sites produced by semiconservative

DNA replication (Riggs 1975; Holliday & Pugh 1975). This prediction has since been experimentally confirmed. In 1981, it was demonstrated that cytosine methylation patterns on an artificially methylated plasmid transfected into mouse cells can be stably maintained for up to 25 cell divisions (Wigler et al. 1981). In 1988, the first mammalian enzyme involved in DNA methylation, DNA methyltransferase 1 (DNMT1), was cloned and sequenced (Bestor et al.

1988); to date, it remains the only known maintenance methyltransferase. It took another 10 years after this work before the de novo methyltransferases DNMT3A and DNMT3B were cloned and characterized (Okano et al. 1998).

1 In 2011, the first crystal structure of DNMT1 with DNA was solved (Song et al. 2011), providing the basis for elucidation of the molecular mechanisms of maintenance DNA methylation as well as DNMT1’s regulatory interactions. This thesis aims to further expand our knowledge about the maintenance DNA methyltransferase DNMT1, specifically its regulation by

N-terminal domains of previously unknown function: the lysine-glycine (GK) repeats and Bromo

Adjacent Homology (BAH) domains.

2 I. 1. Patterns and Machinery of DNA Methylation a. Overall composition of the mammalian genome

The mammalian genome is mainly populated by repeat sequences and transposable elements (Rollins et al. 2006; Waterston et al. 2002), and fully half consists of autonomous retrotransposons (LTR and non-LTR), non-autonomous retrotransposons and DNA transposons

(Kazazian 2004) (Figure 1A).

Autonomous LTR retrotransposons include intracisternal A-particles (IAPs), which are the most active retrotransposons in the mouse genome (Kazazian 2004). There are about 1,000 distinct IAP elements in the mouse genome, one third of which are functional and capable of autonomous retrotransposition (Dewannieux et al. 2004). Autonomous non-LTR retrotransposons include long interspersed elements (LINEs), present in almost all eukaryotes, which contain a 5’ promoter region and two open reading frames that encode proteins necessary for retrotransposition (Smit, 1996). Non-autonomous retrotransposons include short interspersed elements (SINEs) that lack a functional reverse transcriptase but can be mobilized in trans by

LINEs (Kazazian 2004). Other repeat elements in the mouse genome are pericentromeric major satellites and centromeric minor satellites (Lehnertz et al. 2003).

LTR retrotransposons and DNA transposons make up 9.3% and 3.6% of the human genome respectively, while LINEs and SINEs constitute 22.3% and 16.1% (Rollins et al. 2006).

While about 40% of the genome is composed of unnanotated and intergenic compartments, the remaining genomic sequences constitute only a small fraction of the entire genome: simple and other repeats – 3.15%, satellites – 4.17%, exons – 2.14%, and isolated CpG islands – 0.57%

(Rollins et al. 2006).

3 CpG dinucleotides are not equally distributed across the mammalian genome. The majority of CpG dinucleotides in the human genome are found in the unannotated compartment

(30.4%) and in SINEs (27.2%), while CpG islands, both isolated and those associated with promoters, account for 6.8% of genomic CpG dinucleotides (Rollins et al. 2006). Interestingly, even though LINEs make up a greater proportion of the human genome than SINEs, they contain a lower number of CpG sites. This is due to the fact that LINEs are older retrotransposons compared to SINEs and, therefore, have endured a longer time period and more opportunities for mutation of methylated CpG sites (Zietkiewicz et al. 1998; Rollins et al. 2006).

FIGURE 1A. Composition of the human genome and distribution of DNA methylation. (Figure adopted from John Edwards). Vertical axis indicates percentage of total CpG dinucleotides in each compartment; horizontal axis indicates percentage of total genome in each compartment, light blue at the top of each compartment indicates unmethylated fraction. Unmethylated compartment of CpG islands/first exons occupies <0.5% of the genome. The ICR/DMR compartment represents 0.001% of the genome.

4

FIGURE 1B. De novo versus maintenance activities of DNA methyltransferases. (Adopted from Okano et al. 1998). Baculovirus expressed DNMT3A, DNMT3B and His-tagged DNMT1 were incubated with double-stranded unmethylated and hemimethylated oligonucleotides. DNMT1 exhibits higher activity on unmethylated substrate than DNMT3A and DNMT3B.

5 b. Methylation patterns and different genomic compartments

Advances in next-generation sequencing technologies have allowed for more precise genome-wide analysis of mammalian cytosine methylation patterns and understanding of the genomic compartments prone to CpG methylation or resistant to methylation. Multiple approaches have been developed to this end, including immunoprecipitation of methylated DNA followed by massive sequencing or hybridization onto an array, known as MeDIP-seq

(Maunakea et al. 2010); parallel sequencing of bisulfite-treated DNA (Lister et al. 2009); enzymatic fractionation of the genome into methylated and unmethylated compartments followed by deep sequencing, called MethylMAPS (J. R. Edwards et al. 2010); and reduced representation bisulfite sequencing, or RRBS (Meissner et al. 2005). Three principal findings are common across all of the genome-wide methylation studies: 1) repeat elements are heavily methylated in mammals, regardless of the studied tissue; 2) CpG-dense regions (CpG islands) are unmethylated; and 3) gene bodies of transcriptional units exhibit elevated methylation levels relative to surrounding intergenic regions (Figure 1A).

Each genomic compartment exhibits specific correlation between CpG methylation and

CpG density up to a certain threshold (Rollins et al. 2006; J. R. Edwards et al. 2010). Repeat elements are hypermethylated (about 80% methylated in human and mouse), with the exception of simple repeats (Rollins et al. 2006; J. R. Edwards et al. 2010). This observation is consistent with the role of cytosine methylation in silencing transposons in vertebrates (Bestor 2003).

Within repeat elements, methylation levels increase as local CpG density increases, up to a density of 1 per 15 nucleotides, beyond which methylation levels decrease significantly (J. R.

Edwards et al. 2010).

6 The majority of CpG-dense promoters are kept unmethylated (Meissner et al. 2008;

Mohn et al. 2008; Zemach et al. 2010; J. R. Edwards et al. 2010) (Figure 1A). Interestingly,

CpG-dense domains outside the repeats, including those within promoters, show increased methylation levels up to a density of 1 in 40 nucleotides, which is a significantly lower methylation density than that of repeat sequences (J. R. Edwards et al. 2010). This suggests that the refractory nature of CpG islands is not fixed by nucleotide content alone but involves additional factors that protect them from methylation. Consistent with this observation, FBXL10 has been discovered as a required factor that protects CpG islands co-occupied by Polycomb repressive complexes from acquisition of methylation (Boulard et al. 2015).

Internal introns and exons show high levels of methylation, with exons displaying higher methylation levels and CpG density (J. R. Edwards et al. 2010). This observation demonstrates that, in mammals, gene bodies exhibit high levels of methylation that correspond to local CpG density. To date, the significance of exonic methylation remains unclear.

Overall, these studies and observations demonstrate the importance of taking into account the type of sequences studied and CpG density within these sequences when studying DNA methylation. For example, while 75% of all promoters are located within CpG islands (Ioshikhes

& M. Q. Zhang 2000), the remaining promoters have very low CpG densities and methylation is unlikely to regulate their expression.

c. DNA methyltransferases

Mammals use three catalytically active DNA methyltransferases to regulate DNA methylation levels: DNMT1, DNMT3A and DNMT3B (Okano et al. 1998; Bestor et al. 1988) as well as a co-factor DNMT3L that is itself devoid of catalytic activity and is expressed

7 exclusively in the germline (Bourc'his et al. 2001; Bourc'his & Bestor 2004). Recently, a rodent- specific DNA methyltransferase called DNMT3C has been identified (Barau et al. 2016).

Dnmt3c evolved as a duplication of Dnmt3b and is responsible for methylation of evolutionarily young retrotransposons in the male germline, which is required for mouse fertility (Barau et al.

2016).

Dnmt3a and Dnmt3b are closely related genes, sharing a common domain composition formed by a PWWP domain, a PHD-like domain and a C-terminal DNA methyltransferase domain (Okano et al. 1998; Otani et al. 2009; Qiu et al. 2002). The human genes display high amino acid conservation in the catalytic C-terminal domain with 81% sequence identity between them, though only having 30% identity in the variable N-terminal domain (S. Xie et al. 1999) .

DNMT3A and DNMT3B are required to initiate methylation of unmethylated DNA template in vivo, as shown by the inability of embryonic stem cells deleted for both Dnmt3a and

Dnmt3b (hereafter Dnmt3ab-/-) to methylate a newly integrated retroviral fragment, despite the presence of DNMT1 (Okano et al. 1999). Based on this evidence, de novo genomic methylation is thought to be exclusively catalyzed by DNMT3A and DNMT3B, even though DNMT1 exhibits higher activity on unmethylated DNA in vitro than both DNMT3A and DNMT3B

(Okano et al. 1998) (Figure 1-B). DNMT3A and DNMT3B share redundant de novo DNA methylation activity in some genomic compartments, such as retroviral DNA or interspersed repeats (T. Chen et al. 2003; T. Chen et al. 2004; Okano et al. 1999). Both DNMT3A and

DNMT3B show strong association with chromatin, the nucleoprotein complex consisting of

DNA and histone proteins (Jeong et al. 2009).

Although DNMT3A and DNMT3B cooperate and interact together, they are generally expressed at different levels in a given cell type, suggesting incomplete overlap of functions

8 (Okano et al. 1998) (Kashiwagi et al. 2011) . Indeed, DNMT3B has a non-redundant activity on pericentromeric tandem repeats (satellite sequences) (T. Chen et al. 2003; T. Chen et al. 2004;

Okano et al. 1999); DNMT3A is not able to compensate this function when reintroduced into

Dnmt3ab-/- embryonic stem cells (T. Chen et al. 2003; T. Chen et al. 2004).

Despite strong structural similarities in the C-terminus, DNMT3A and DNMT3B differ in the N-terminus. These differences include the PWWP domain, shown to be important for

DNMT3B-directed methylation of pericentromeric repeats (T. Chen et al. 2004). Notably, the

PWWP of DNMT3B was shown to bind DNA, whereas the slightly different PWWP of

DNMT3A does not, indicating that the binding of either DNMT3A or DNMT3B to specific loci is mediated in part by the PWWP domain (T. Chen et al. 2004; Qiu et al. 2002).

In addition, DNMT3B preferentially associates with H1-containing condensed chromatin

(Kashiwagi et al. 2011). Co-sedimentation experiments have shown that DNMT3B is released from polynucleosomes containing less that 5 nucleosomes, indicating that DNMT3B association with chromatin is mediated by higher-order chromatin structure (Kashiwagi et al. 2011).

Loss of function genetics in mouse has revealed many differences between Dnmt3a and

Dnmt3b. Dnmt3a mutants develop normally, but die at 4 weeks of age; hence Dnmt3a seems to be important for survival but not for embryonic development (Okano et al. 1999). In contrast, loss-of-function of Dnmt3b results in embryonic lethality and is characterized by partial demethylation, mainly at pericentromeric tandem repeats (Okano et al. 1999). Mutations in the

Dnmt3b gene cause an autosomal recessive developmental disorder in humans called ICF syndrome (immunodeficiency, centromeric instability, facial anomalies) (G. L. Xu et al. 1999).

Affected individuals have centromeric heterochromatin instability at chromosomes 1, 9 and 16 caused by hypomethylation of classic satellites II and III (G. L. Xu et al. 1999) and exhibit

9 hypomethylation of subtelomeric regions as well as decreased length of telomeres (Yehezkel et al. 2008). ICF mutations of Dnmt3b are not complete loss-of-function mutations and do not abolish catalytic activity of DNMT3B (Ueda et al. 2006), but they prevent the interaction between DNMT3B and DNMT3L (Z.-H. Xie et al. 2006). This impairment of DNMT3B recruitment causes the significant developmental abnormalities seen in ICF syndrome.

DNMT1 is the only known maintenance DNA methyltransferase (Bestor et al. 1988; E.

Li et al. 1992) and is discussed in greater detail in section I.4.

10 I.2. Functions of DNA methylation a. Defense against transposable elements

In mammals, cytosine methylation in a CpG context plays a crucial role for the restriction of transposons and other parasitic elements that threaten the integrity of the genome (Smit 1996;

Bestor 2003).The potential of DNA methylation to silence transposons was first demonstrated by transient transfection assays of an artificial IAP element in which in vitro methylation at the LTR was sufficient to inhibit the expression of the IAP element (Feenstra et al. 1986). Since then accumulating genetic evidence has demonstrated that DNA methylation is necessary and sufficient for the silencing of parasitic elements (Bourc'his & Bestor 2004; Tsumura et al. 2006;

Walsh et al. 1998). The first observation of retrotransposon activation following reduction of genomic methylation levels came from the study of Dnmt1-null mouse embryos, where IAP expression was massively reanimated (Walsh et al. 1998). The catalytic activity of DNMT1 was shown to be crucial for the silencing of IAP elements (Damelin & Bestor 2007). Dnmt1- conditional deletion demonstrated that DNA methylation is also necessary for repression of IAPs in differentiated cells (Jackson-Grusby et al. 2001). The essential role of DNA methylation for transposon silencing has been further illustrated using Dnmt3l loss-of-function mutant males that exhibit severe hypomethylation at interspersed repeats (IAP and LINEs) in the germline

(Bourc'his & Bestor 2004). This hypomethylation induces transcription of those elements, causes meiotic catastrophe and, as a consequence, complete male sterility (Bourc'his & Bestor 2004).

b. Genome stability and apoptosis

Loss of DNA methylation in differentiated cells results in genome instability and cell death (Jackson-Grusby et al. 2001; Dodge et al. 2005). Global genomic demethylation resulting

11 from Dnmt1-conditional disruption in murine embryonic fibroblasts (MEFs) induces cell death by apoptosis (Jackson-Grusby et al. 2001). Similarly, global demethylation of the human cancer cell line HCT1168, induced by conditional deletion of Dnmt1, leads to G2 arrest followed by progressive cell death (T. Chen et al. 2007). Moreover, Dnmt1-deficient embryonic stem cells

(ESCs) rapidly die when induced to differentiate (E. Li et al. 1992; Panning & Jaenisch 1996).

Cell death of differentiating Dnmt1-null ESCs occurs even when a catalytically inactive DNMT1 is re-introduced into the cell (Damelin & Bestor 2007) showing that apoptosis is triggered by the loss of DNA methylation rather than by the absence of DNMT1 protein. These experiments show that there is an absolute necessity for differentiated cells to maintain DNA methylation levels above a certain threshold in order to survive. While mouse ESCs are not affected by the loss of DNA methylation and grow normally when all three DNA methyltransferases

(DNMT3A, DNMT3B, DNMT1) are deleted (Tsumura et al. 2006), human ESCs exhibit massive cell death upon deletion of DNMT1 while remaining unaffected by disruption of

DNMT3A and DNMT3B (Liao et al. 2015). This observation reflects that 1) deletion of DNMT1 yields a more severe demethylation of the genome; and 2) mouse ESCs are known to be in a more naïve pluripotent state than human ESCs (Weinberger et al. 2016), such that they can survive severe demethylation.

Simultaneous inactivation of both p53 and Dnmt1 in MEFs partially rescues the apoptotic phenotype (Jackson-Grusby et al. 2001), indicating that apoptosis resulting from loss of genomic methylation is both P53-dependent and -independent. The pathway by which loss of methylation activates apoptosis in differentiated cells remains to be determined.

In addition to cell cycle arrest and cell death, demethylation in differentiated cells causes genomic instability (Bourc'his & Bestor 2004; T. Chen et al. 2007). Loss of methylation in

12 HCT1168 cells results in mitotic defects that include broken chromosomes, chromosome congressions or alignment defects (T. Chen et al. 2007). A similar DNA double strand break phenotype, characterized by the appearance of numerous foci of -H2A.X, has also been documented in Dnmt3l-deficient germ cells, in which demethylation of transposable elements is correlated with meiotic catastrophe (Bourc'his & Bestor 2004).

c. Imprinting

Parthenogenetic mouse embryos that contain only chromosomes of maternal origin fail to develop to term, indicating that a contribution of the paternal genome is crucial for mammalian embryonic development (Kaufman et al. 1977; Surani et al. 1984). This lethality has been associated with the impairment of the monoallelic expression of paternally imprinted genes (E.

Li, Beard, Forster, et al. 1993), as evidenced by the live birth, normal development and normal reproductive ability of a parthenogenetic mouse carrying a 13-kb deletion in the H19 paternally imprinted gene in one of the maternal genomes (Kono et al. 2004). Experiments using transgenic mice have demonstrated that differential expression of paternally and maternally imprinted genes is a consequence of parent-of-origin specific DNA methylation (Reik et al. 1987; Sapienza et al.

1987; Swain et al. 1987). Expression of imprinted genes is under the control of differentially methylated regions (DMRs) or imprints which serves as imprint control regions (ICRs) (Wutz et al. 1997; Thorvaldsen et al. 1998). One DMR often controls a cluster of co-regulated imprinted genes (Tomizawa et al. 2011); the 23 DMRs identified to date control monoallelic expression of some 130 genes (Duffié & Bourc'his 2013). The crucial role of DNA methylation for silencing one allele has been demonstrated using Dnmt1-deficient mice, in which imprinted genes become biallelically expressed (E. Li, Beard, Forster, et al. 1993). The oocyte-specific form of DNMT1

13 (DNMT1o) is important for imprint maintenance in 8-cell embryos (Howell et al. 2001), and the zygotic somatic form of DNMT1 (DNMT1s) is also required to maintain maternal and paternal imprints in the early embryo (Hirasawa et al. 2008). Similarly to parthenogenetic conceptuses, embryos depleted for Dnmt1 fail to develop beyond mid-gestation (E. Li et al. 1992) suggesting that proper maintenance of genomic imprints is essential for normal mammalian development.

There are several notable differences between paternal and maternal DMRs. First, only three paternal DMRs are known (H19, Dlk1/Gtl2, and Rasgrf1), while about 20 maternal DMRs have been characterized to date (Duffié & Bourc'his 2013). Second, paternal DMRs have lower

CpG density than the maternal DMRs and are located in intergenic regions, while maternal

DMRs lie within promoters (Kobayashi et al. 2006; Bourc'his & Bestor 2006). Third, paternal imprints are established early in the developing male germline around the time of birth, while maternal imprints are established in the growing oocyte shortly before ovulation (Bourc'his &

Bestor 2006). Fourth, in addition to methyltransferases, different chromatin factors may be required for maternal versus paternal imprint establishment. For example, KDM1B, a histone H3 lysine 4 demethylase is necessary for methylation of four maternal imprints (Ciccone et al.

2009), while methylation of paternally imprinted Rasgrf1, but not of H19 or Dlk1/Gtl2, is dependent on PIWI (Watanabe et al. 2011). Differences in paternal and maternal imprinting further demonstrate a striking sexual dimorphism.

Failure of correct imprint establishment can result in several developmental and cognitive disorders. Prader-Willi syndrome (PWS) and Angelman syndrome (AS) are clinically distinct complex disorders that map to chromosome 15q11-q13 in human (chromosome 7 in mouse), which contains the imprinting cluster for Snurf/Snrpn (C. A. Edwards & Ferguson-Smith

2007). Both disorders can result from microdeletion, uniparental disomy or a defect in the

14 imprint’s DMR, although Prader-Willi is caused by the loss of normal paternal contribution whereas Angelman syndrome is caused by the loss of normal maternal contribution (Cassidy et al. 2000). Specifically, SNRPN, a small nuclear ribonucleoprotein, is implicated in causation of

Prader-Willi syndrome, while UBE3A, a ubiquitin ligase, is implicated in Angelman syndrome

(Cassidy et al. 2000; C. A. Edwards & Ferguson-Smith 2007). The phenotypic consequence of these disorders resulting from defects in the same chromosomal region are strikingly different:

Prader-Willi patients exhibit more severe behavioral and endocrine disorders including obsessive-compulsive symptoms and hypothalamic insufficiency, while Angelman syndrome patients have a more severe cognitive and neurologic impairment, including seizures and ataxia

(Cassidy et al. 2000).

d. X chromosome inactivation

It has been shown that double dosage of X-linked genes is detrimental for extra- embryonic development (Marahrens et al. 1997; Wang et al. 2001; Blewitt et al. 2008). In mammals, sex chromosome dosage compensation is achieved by inactivation of one of the two

X-chromosomes in females (Okamoto & Heard 2009). Given random inactivation, female organisms are thus composed of a mosaic of cells that inactivate either the paternal or the maternal X-chromosome, while the extra-embryonic lineage specifically silences the paternal X- chromosome (Okamoto & Heard 2009). Failure of paternal X-chromosome inactivation in the trophoblast causes lethality of female embryos (Marahrens et al. 1997; Wang et al. 2001; Blewitt et al. 2008). Given the absence of sexual dimorphism of mutants targeted for DNA methyltransferases, it is unlikely that genomic methylation plays an essential role in this process

(E. Li et al. 1992; Okano et al. 1999). By using a reporter-transgene inserted into one of the X-

15 chromosomes of female Dnmt1-mutant embryos, paternal X-chromosome imprinting (XCI) was shown to be independent of DNA methylation (Sado et al. 2000).

XCI relies on the transcription of a non-coding RNA, Xist, from the future inactive X- chromosome (Xi). Proper monoallelic expression of Xist, and subsequent X-inactivation have been reported to occur normally in the absence of DNA methyltransferases (Beard et al. 1995;

Panning & Jaenisch 1996; Sado et al. 2004). However, cytosine methylation is important for the stable repression of Xist, since loss of methylation causes derepression of Xist in a fraction of cells (Panning & Jaenisch 1996; Sado et al. 2004). In addition, demethylation has been shown to induce Xist expression in male embryos (which normally do not express Xist), though demethylation does not reactivate Xist transcription in male undifferentiated ESCs (Beard et al.

1995). This indicates that ESCs utilize mechanisms other than DNA methylation for Xist silencing.

De novo methylation is dispensable for proper X-inactivation as evidenced by the normal expression level of X-linked genes in Dnmt3a- and Dnmt3b- null mutant mice (Sado et al. 2004).

Nevertheless, more than half of the cells in Dnmt1-deficient female embryos express a transgene inserted in the paternal X-chromosome (Sado et al. 2000). Therefore, genomic methylation synergizes with the heterochomatin state of the Xi to impede transcription. Paradoxically, both male and female active X-chromosomes were reported to be hypermethylated outside of CpG islands (relative to the Xi) (Hellman & Chess 2007). Approximately half of the hypermethylated regions were found downstream of transcription start sites and exhibit no correlation with gene activity (Hellman & Chess 2007).

16 I.3. Methylation Dynamics a. Developmental overview

DNA methylation patterns undergo drastic changes during early embryonic development in primordial germ cells (PGCs), which give rise to oocytes and spermatocytes, and in cleavage stage pre-implantation embryos (Monk et al. 1987), but later become very stable in all somatic tissues (Ziller et al. 2013).

Global loss of methylation in PGCs takes place during their migration from the proximal epiblast to the gonadal ridge between E9.5-E11.5. However sequences that carry long-term epigenetic memory, such as imprints and CpG islands on the X chromosome, resist demethylation until PGCs enter the site of the future gonad (Seisenberger et al. 2012). While in general repeat elements are demethylated in PGCs, evolutionarily young an active restrotransposons (e.g. IAPs) are kept methylated to prevent their reactivation and parasitic transposition in the genomes of future germ cells (Seisenberger et al. 2012).

Methylation patterns are re-established in PGCs with the endowment of sex-specific imprinting: this process is characterized by a profound sexual dimorphism in the identity and characteristics of sequences that acquire methylation and in the timing of de novo methylation

(Schaefer et al. 2007). Acquisition of methylation is a premeiotic event in male germ cells, occurring perinatally in prospermatogonia, while in female germ cells it does not take place until oocytes are arrested in meiosis I shortly before ovulation (Schaefer et al. 2007). Establishment of sex-specific genomic methylation in the germline and role of DNMTs is discussed in greater detail in the following section.

Cleavage stage embryos prior to implantation progressively lose methylation with each round of cell division, and, upon implantation, normal methylation is reconstituted globally (Howlett &

17 Reik 1991; Santos et al. 2002; Borgel et al. 2010; Z. D. Smith et al. 2012). Importantly, during the wave of early embryonic demethylation, both genomic imprints and CpG-rich transposons retain methylation (Morgan et al. 2005). Sequences that undergo demethylation in the pre- implantation stage are primarily old and inactive transposon remnants, satellite repeats and unannotated genomic sequences.

b. Establishment of sex-specific genomic methylation in the germline

Novel patterns of genomic methylation are established in the germline by coordinated action of DNMT3A and its co-factor DNMT3L (Bourc'his et al. 2001; Bourc'his & Bestor 2004;

Kaneda et al. 2004). Dnmt3l is homologous to the Dnmt3a/3b methyltransferases, as it carries the

PHD-like domain, but lacks the methyltransferase domain and therefore is devoid of catalytic activity (Aapola et al. 2001). Conditional disruption of Dnmt3b in the gonads revealed that

DNMT3B is dispensable for methylation in the gametes (Kaneda et al. 2004).

While DNMT3A and DNMT3B are found in most tissues, Dnmt3l is specifically expressed in the germline at stages when de novo methylation occurs: in females it is present only in growing oocytes arrested in meiosis I, whereas in males it is expressed in diploid prospermatogonia (Okano et al. 1998; Bourc'his et al. 2001; Bourc'his & Bestor 2004). Deletions of Dnmt3a or Dnmt3l in the female germline produce phenotypically indistinguishable progeny

(i.e. neural tube closure defects, pericardial edema, and extra-embryonic tissue abnormalities), with embryonic lethality at E9.5 (Bourc'his et al. 2001; Kaneda et al. 2004). The embryos lack maternal imprints and display biallelic expression of maternally imprinted genes. Therefore, both

Dnmt3a and Dnmt3l are crucial for establishment of maternal imprints. Repeat elements show normal methylation patterns in the progeny from either Dnmt3a or Dnmt3l female mutants

18 (Bourc'his et al. 2001; Kaneda et al. 2004). Hence, methylation at repeats is likely catalyzed by

DNMT3B in the absence of DNMT3A, indicating that DNMT3B can substitute for DNMT3A to methylate both tandem and interspersed repeats. Furthermore, normal methylation of repeats in the progeny from Dnmt3l deficient mothers indicates that the recovery of full methylation of repeat elements in growing oocytes is independent of DNMT3L.

Deletion of Dnmt3a or Dnmt3l in the male germline causes azoospermia and sterility

(Bourc'his & Bestor 2004; Kaneda et al. 2004). In striking contrast to female germline, deletion of Dnmt3l results in demethylation and activation of both LTR (IAP) and non-LTR (LINE) retrotransposons, while H19 imprinted gene exhibits a minor loss of methylation and Dlk1/Gtl2 has normal methylation (Bourc'his & Bestor 2004). Conditional deletion of Dnmt3b in male germline has no effect on paternal imprint methylation. The role of Dnmt3a in de novo methylation of paternal imprints is not well understood as variable methylation phenotypes have been reported after conditional disruption of Dnmt3a in the male germline (Kaneda et al. 2004;

Kato et al. 2007). The importance of DNMT3A for paternal imprint establishment requires further investigation. Quite recently, it has been demonstrated that the newly discovered rodent- specific de novo methyltransferase Dnmt3C is required for methylation of Rasgrf1 but not of

H19 or Dlk1/Gtl2 (Barau et al. 2016).

Dnmt3l is dispensable for female meiosis, but is required for male gamete development

(Bourc'his & Bestor 2004). Even though Dnmt3l is not expressed during male meiosis, testes from adult mutants display severe hypogonadism and Sertoli cell-only phenotype(Bourc'his &

Bestor 2004). Loss of Dnmt3l in males causes meiotic catastrophe characterized by asynapsis or abnormal synapsis (Bourc'his & Bestor 2004).

19 Although the mechanism by which DNMT3L stimulates DNA methylation is still unclear, the activation of DNMT3A/3B by DNMT3L was shown to be mediated by direct interaction as indicated by the binding of both DNMT3A2 and DNMT3B onto DNMT3L (Ooi et al. 2007). The C-terminal extremity of DNMT3L was shown to interact with the catalytic domain of DNMT3A and DNMT3B (Hata et al. 2002; Z.-X. Chen et al. 2005). In vitro DNMT3L strongly increases the binding of a methyl group donor S-adenosyl-L-methionine (SAM) to

DNMT3A2 (Kareta et al. 2006), but DNMT3L does not bind to DNA itself. In addition,

DMNT3L stimulates the catalytic activity of both DNMT3A and DNMT3B (Qiu et al. 2002;

Suetake et al. 2004; Gowher, Liebert, et al. 2005).

As such, the existing literature appears to be clear in establishing that methyltransferase

DNMT3A and co-factor DNMT3L are required for methylation of maternally imprinted genes.

However, DNMT3L is dispensable for de novo methylation of paternal ICRs, indicating that establishment of imprints in the male germline is mediated by other factors.

c. Correlation with chromatin landscape

Chromatin marks show greater correlation than do DNA sequences with the local DNA methylation profile (Zemach et al. 2010; Meissner et al. 2008; J. R. Edwards et al. 2010). Such interplays is not surprising given the recruitment of the DNMT3 family onto chromatin (Jeong et al. 2009; Nimura et al. 2006). Absence of DNA methylation does not influence the global levels of common histone modifications; however genome-wide distribution of histone modification has not been investigated in hypomethylated cells (Tsumura et al. 2006).

The nucleosome is the basic organizational structure of chromatin; it consists of 146 base pairs of DNA wrapped around a histone octamer composed of 2 molecules of H2A, H2B,

20 H3 and H4 (Luger et al. 1997). In contrast to acetylation, which reduces electrostatic interaction of histones with DNA, histone methylation has no effect on the nucleosomal structure. Instead, the methylation of histone tails is thought to modify the chromatin structure by recruitment of specific factors (B. Li et al. 2007). For example, the recruitment of DNMT3A onto the chromatin of transcribed regions could explain the presence of 5-methylcytosine downstream of transcription start sites. Indeed, a recent report shows that the PWWP domain of DNMT3A binds, with weak affinity, to methylated lysine 36 of histone H3 (H3K36me3) (Dhayalan et al.

2010; Y. Zhang et al. 2010). The enrichment of H3K36me3 within actively transcribed genes could account for DNMT3A recruitment downstream of active promoters (B. Li et al. 2007).

Other modifications of the nucleosome have been shown to influence genomic methylation patterns. These chromatin marks include the incorporation of the non-allelic variant

H2A.Z (Zemach et al. 2010), the hypermethylation at lysine 9 of histone H3 (H3K9me3)

(Lehnertz et al. 2003), the hypomethylation at lysine 4 of histone H3 (H3K4me0) (Ooi et al.

2007; Meissner et al. 2008; Otani et al. 2009; J. R. Edwards et al. 2010) and the presence of the linker histone H1 (Fan et al. 2005). In addition, two ATP-dependant chromatin remodeling factors have been shown to play a role in the maintenance of DNA methylation: LSH and ATRX

(Dennis et al. 2001; Garrick et al. 2006).

The association of the histone H1 with the nucleosome promotes the formation of higher order structure of the chromatin fiber (Fan et al. 2005). ESCs targeted for the histone H1 display a specific loss of methylation at imprinted genes but not at repeat elements (Fan et al. 2005).

Given the ubiquitous distribution of H1 within the nucleus, its presence is necessary but not sufficient to keep imprinted DMRs methylated.

21 H3K9me3 is generally found in heterochromatin regions such as pericentromeric regions

(B. Li et al. 2007). H3K9me3 was reported to be important for the localization of DNMT3B at pericentromeric satellites in ESCs; reduced global level of H3K9me3 results in loss of methylation at major satellites but not in other compartments, showing that H3K9me3 plays a role in maintaining methylation at major satellites in ESCs (Lehnertz et al. 2003).

Genome-wide analysis shows that incorporation of the histone variant H2A.Z is anti- correlated with presence of 5mC in human cells (J. R. Edwards et al. 2010). In fish, the antagonistic relationship between H2A.Z and DNA methylation was shown to be very strong at promoters and coding sequences (Zemach et al. 2010). Interestingly, H2A.Z was shown to preferentially occupy the promoter of developmental genes known to be under control of unmethylated promoters (Tanay et al. 2007; Creyghton et al. 2008; J. R. Edwards et al. 2010).

Although data suggest that H2A.Z might be important to protect CpG-dense promoters against

DNA methylation, there is no genetic evidence to support this hypothesis.

Methylation of H3K4 is antagonistic to DNA methylation (Meissner et al. 2008; J. R.

Edwards et al. 2010). The recent finding of binding specificity of the DNMT3 family towards histone H3 unmethylated at lysine 4, has revealed the existence of a molecular mechanism which explains the anti-correlation seen between H3K4me and DNA methylation (Meissner et al. 2008;

J. R. Edwards et al. 2010). The PHD-like motif of both DNMT3A and DNMT3L directly interacts with the N-terminal tail of histone H3 (Ooi et al. 2007; Otani et al. 2009). This finding suggests that the presence of a nucleosomal structure is required for the methylation of the underlying DNA sequence. Importantly, DNMT3A and DNMT3L nucleosome targeting is regulated by the methylation of lysine 4, as the binding is abolished when H3K4 is mono-, di-, or tri-methylated (Ooi et al. 2007; Otani et al. 2009). Furthermore, the inhibition of DNMT3L

22 recruitment by H3K4me3 was confirmed in vivo by showing that DNMT3L is only associated with mono-nucleosomes devoid of methylation at H3K4 (Ooi et al. 2007). Crystallographic analysis of DNMT3L bound to the H3 tail revealed that introduction of a bulky tryptophan side chain into the H3 binding site of DNMT3L (I107W) or disruption of interaction between the aspartic acid side chain with the amino group of H3 lysine 4 (D90A) eliminates the interaction of the H3 tail with DNMT3L (Ooi et al. 2007).

The vertebrate specific histone variant macroH2A1 has a complicated relationship with

DNA methylation: Its sub-nuclear localization is dependent on genomic methylation in ESCs

(Damelin & Bestor 2007). In contrast, its deletion has no effects on DNA methylation

(unpublished data). MacroH2A1 is dispensable for the silencing of methylated promoters as evidenced by the mild phenotype produced by its deletion in mice (Boulard et al. 2010). This indicates that genomic methylation plays a mysterious role for the localization of macroH2A1 independent of its function in silencing and development.

Global loss of genomic methylation does not influence global levels of common histone modifications; however genome wide distribution of histone modification has not been investigated in hypomethylated cells.

23

FIGURE 1C. Structure and regulation of DNMT1. Domains of DNMT1 (top).

24 N terminal regulatory domain contains nuclear localization sequence (NLS), replication focus targeting sequence (RFTS), CXXC domain that recognizes unmetylated DNA, and tandem Bromo-Adjacent-Homology (BAH) domains. N terminus is connected to the catalytic C-terminal domain via seven alternating lysine and glycine residues, the (GK) repeats. Letters below the domain diagram indicate the position of N-terminal truncations in the crystal structures shown in a-d. a, Superposition of active DNMT1 and M. HhaI (bacterial methyltransferase). b, Superposition of DNMT1 in autoinhibited state with unmethylated DNA (blue) with DNMT1 in active state with hemimethylated DNA (pink). Autoinhibitory BAH1-CXXC linker is depicted in electrostatic potential view.c. DNMT1 molecule is connected by several linkers and loops. Note that the (GK) repeats are not resolved in the structure and depicted with dashes.d. Almost full- length DNMT1 showing the inhibitory effect of the RFTS and CXXC domains.

FIGURE 1D. The (GK) repeats in DNMT1. a., Alignment of the (GK) repeats of DNMT1 homologs from mammals, insects and plants. b., At least four out of seven lysines within the (GK) repeats are acetylated. c., The (GK) repeats resembles N-terminal tails of histones, which are post-translationaly modified as well.

25 I.4. Role and regulation of DNMT1 a. DNMT1’s discovery and genetic studies

DNMT1 was the first eukaryotic DNA methyltransferase to be purified and cloned

(Bestor et al. 1988). This protein is 1620 amino acids long, with a 1100-amino acid N-terminal domain, a 500-amino acid C-terminal domain containing the catalytic site, and between them, a region of alternating glycine and lysine (GK) residues (Bestor et al. 1988) (Figure 1C).

Two forms of DNMT1 are known: somatic and oocytic; the latter is lacking the first 118 amino acids at the N-terminus, which allows it to be more stable and degradation-resistant (Ding

& Chaillet 2002). A number of genetic mutations in Dnmt1 have been generated (summarized in

Table 1). The first targeted mutation of Dnmt1 in mice, which resulted in a hypomorphic allele

(Dnmt1n/n) with 5-10% residual activity, demonstrated global loss of methylation, demethylation of IAPs and embryonic lethality before mid-gestation (E. Li et al. 1992; Walsh et al. 1998).

Subsequently, two null alleles were produced by disruption of the RFTS domain (Dnmt1s/s) (E.

Li, Beard & Jaenisch 1993), and deletion of conserved PC and ENV motifs in the catalytic domain (Dnmt1c/c) (Lei et al. 1996). Both null mutations resulted in global demethylation more severe than that observed for the Dnmt1n/n allele as well as developmental arrest prior to the 8- somite embryonic stage and loss of imprinting (E. Li, Beard & Jaenisch 1993; Lei et al. 1996).

The consequence of DNMT1 overexpression was studied by introduction of a BAC carrying the full Dnmt1 gene into wild type ES cells, which resulted in DNMT1 levels increased to 4 times that of wild type and 150% increase in methylation activity (Biniszkiewicz et al. 2002). Injection of DNMT1-overexpressing ES cells into diploid and tetraploid embryos resulted in mid-gestation lethality, and, while global methylation levels were normal, both alleles of H19 DMR were methylated (Biniszkiewicz et al. 2002).

26 Meanwhile, null mutation of the oocyte form of DNMT1 was created by deletion of exon

1o. (Howell et al. 2001). This resulted in normal viability of homozygous mutants, but heterozygous offspring of mutant females died in late gestation and displayed loss of allele- specific expression and methylation at certain imprinted genes suggesting that DNMT1o is required for maintenance of imprints that otherwise would be lost during the fourth embryonic S phase in the 8-cell embryo (Howell et al. 2001). Deletion of exon 1s in mice, forcing the production of the oocytic form of DNMT1 in all cells demonstrated that DNMT1o can functionally substitute for somatic DNMT1 and maintain normal methylation at all sequences, even though DNMT1o accumulates over time to protein levels 5 times that of wild type and exhibits 350% greater methylation activity (Ding & Chaillet 2002). This mutation thus demonstrated the intrinsic stability of DNMT1o that allows for its prolonged storage in oocytic cytoplasm, where it remains until being trafficked into the nuclei of the 8-cell embryo to maintain methylation at imprinted genes (Ding & Chaillet 2002; Howell et al. 2001).

TABLE 1. List of genetic studies of Dnmt1.

27 Biochemical studies have demonstrated that DNMT1 has a 5- to 30-fold preference for hemimethylated over unmethylated DNA (Yoder et al. 1997) (Bestor 1992; S. Pradhan et al.

1999), though its activity on unmethylated DNA is greater than that seen for DNMT3A and

DNMT3B (Okano et al. 1998) (Figure 1B). Still, the primary in vivo function of DNMT1 is thought to be the precise replication of genomic DNA methylation patterns to maintain them after each cell division. It appears that regulatory domains within the N-terminal section of the protein are responsible for limiting DNMT1 function in vivo to solely maintenance methylation, as the cleavage of the N-terminal section from the C-terminal catalytic domain causes increased de novo methylation activity on unmethylated substrate (Bestor 1992; Bestor and Ingram, 1983,

Gruenbaum, 1982).

The regulatory domains within DNMT1’s N-terminus include (Figure 1C) the replication foci targeting sequence (RFTS, also known as RFD), nuclear localization signal, a zinc finger

CXXC (Cys-X-X-Cys) domain, CXXC-BAH1 autoinhibitory linker and tandem BAH domains.

Structurally, the catalytic C-terminal domain exhibits a high degree of similarity to the bacterial restriction methyltransferases (Bestor et al. 1988; Bestor 1992), which do not posses any analogue of DNMT1’s N-terminal regulatory structures (Figure 1C).

Within the past several years, different laboratories were successful in solving crystal structures of mouse and human DNMT1 molecules in active (with hemimethylated DNA) (Song et al. 2012), inhibited (with unmethylated DNA) (Song et al. 2011), and free states (no DNA)

(Takeshita et al. 2011) (Z.-M. Zhang et al. 2015). Such a diverse database of DNMT1 structures offers a unique opportunity to study the molecular mechanisms of maintenance DNA methylation, with special emphasis on the role of N-terminal regulatory domains. In the

28 following sections, the various regulatory domains of the DNMT1 molecule are considered with respect to their structures and what is consequently known about their function.

b. Structural insights into DNMT1 regulation i. RFTS and targeting to sites of DNA replication

The RFTS, or replication foci targeting sequence (also knows at RFD or replication focus domain), is essential for the localization of DNMT1 to replication foci during S phase of the cell cycle (Leonhardt et al. 1992). This localization occurs through the interaction of RFTS with

UHRF1, an E3 ubiquitin ligase (Bostick et al. 2007) (Sharif et al. 2007) (discussed further in section I.c.I). The single most important signal for the recruitment of DNMT1 to replication foci is hemimethylated DNA: this conclusion derives from several studies showing that catalytic mutation in the methyltransferase domain renders DNMT1 inactive and causes its mislocalization (Damelin & Bestor 2007; Takebayashi et al. 2007).

The crystal structure of hDNMT1 in the free state, resolved at 2.62 A, shows that the

RFTS domain exercises an inhibitory role through its direct binding at the catalytic site (Z.-M.

Zhang et al. 2015) (Figure 1C(d)). It is also possible that RFTS binding to UHRF1 at replication foci induces allosteric changes in DNMT1 shifting its conformation from inhibited to the active form.

The only pathogenic mutations identified in Dnmt1 to date are all located within the

RFTS domain (Figure 1E), and contribute to the development of the autosomal dominant

DNMT1-complex disorder, which is characterized by narcolepsy, hearing loss, mild to severe neuropathy and cognitive decline (Baets et al. 2015). Previously, this disorder was separated into two distinct conditions: hereditary sensory and autonomic neuropathy type 1E (HSAN1E) and

29

FIGURE 1E. Pathogenic mutations of DNMT1. The RFTS domain of DNMT1 is subject to different dominant mutations in a variable adult onset cerebellar ataxia, deafness, dementia, and narcolepsy syndrome (autosomal dominant DNMT1-complex disorder).

cerebellar ataxia, deafness, narcolepsy (ADCA-DN) (Sun et al. 2014; Klein et al. 2011; Klein et al. 2013). However, after examination of a large cohort of patients and identification of high

30 phenotypic overlap of HSAN1E and ADCA-DN, Baets et al., 2015 suggested to unite these two conditions under the name of “DNMT1-complex disorder”. As yet, the following mutations in the RFTS domain have been identified in patients with DNMT1-complex disorder: C353F,

T481P, D490E, P491L, P491Y, P491R, Y495C, Y495H, K505del, Y524D, I531N, H553R,

A554V, C580R, G589A, V590F ((Baets et al. 2015; Klein et al. 2013; Klein et al. 2011;

Moghadam et al. 2014)). Methylation analysis of patients with the most common Y495C mutation displayed a decrease in the global methylation level of only 0.3-3%, with most of the demethylation occurring in intergenic regions (Sun et al. 2014), implying that changes in DNA methylation do not, in fact, contribute to the pathogenesis of this disease. Interestingly, it is known that DNMT1 continues to be expressed in post-mitotic neurons (Goto et al. 1994; Trasler et al. 1996; Inano et al. 2000), which do not need methylation maintenance. Accordingly, it can most likely be inferred that a neuron-specific effect of DNMT1 contributes to DNMT1-compex disorder. Baets at el., 2015 also found that mutant DNMT1 proteins when overexpressed in

HEK293 cells mislocalize and aggregate in the cytoplasm. This observation potentially provides a link between DNMT1-complex disorder and the aggregative proteinopathies seen in many neurodegenerative disorders (Ross & Poirier 2004; Ciryam et al. 2013). However, the role of these DNMT1 mutations has not been studied in vivo in the brain; the creation of a mouse model would be an essential tool to understanding the pathogenic mechanisms behind DNMT1 mutants but has not yet been undertaken.

Interestingly, one of the ubiquitination sites within DNMT1, lysine 586 (Lecona et al.

2016), is located within one cluster of DNMT1-complex disorder-associated mutations (A554V,

C580R, G589A and V590F).

31 ii. Autoinhibitory role of CXXC domain

In other DNA-binding proteins, CXXC domains previously have been reported to specifically bind unmethylated CpG dinucleotides (Allen et al. 2006; Birke et al. 2002; Lee et al.

2001; M. Pradhan et al. 2008). For example, the CXXC-containing FBXL10 protein binds to unmethylated CpG islands and protects them from acquiring DNA methylation (Boulard et al.

2015). The fact that the maintenance DNA methyltransferase DNMT1 contains a domain that recognizes unmethylated rather than hemimethylated DNA (M. Pradhan et al. 2008) is thus rather surprising. Nonetheless, the crystal structure of mouse and human DNMT1 with unmethylated DNA revealed that the CXXC domain in DNMT1 actually participates in autoinhibitory mechanism in the presence of unmethylated DNA (Song et al. 2011)

(Figure 1-C (b)). First, CXXC anchors unmethylated DNA in such a way that it keeps it at a distance from the active site of DNMT1. Second, the CXXC-BAH1 autoinhibitory linker, which is composed of negatively charged residues that repel DNA, is positioned between the active site and the unmethylated DNA. As such, unmethylated CpG site are prevented from entering the active site of DNMT1. Mutagenesis studies have demonstrated that the CXXC domain contributes 6- to 7-fold preference for hemimethylated over unmethylated DNA (Song et al.

2011), whereas DNMT1 itself has up to 30-fold preference for the hemimethylated species in vitro (Yoder et al. 1997). It thus cannot be excluded that other mechanisms might exist to account for any remaining difference.

iii. BAH domains

The first BAH (Bromo-Adjacent-Homology) domain was identified in the chicken protein Polybromo, where, as its name implies, it is situated next to several Bromo domains

32 (Yang & R.-M. Xu 2013). BAH domains are mostly found in chromatin factors, such as S. cerevisiae Risc1/2, D. melanogaster Ash1, and S. pombe Orc1 (Origin of Replication Complex

1). The precise function of BAH domains remained enigmatic for some time until their further characterization in two proteins, ORC1 and Sir3, categorized them as histone-recognizing modules (Kuo et al. 2012; Armache et al. 2011). Mutational analysis and crystal structures have showed that the BAH domain of ORC1 specifically recognizes dimethylated lysine 20 of histone

H4 (Kuo et al. 2012), while Sir3’s BAH domain binds nucleosome possessing unmodified lysine

16 of histone H4 and unmodified lysine 79 of histone H3 (Armache et al. 2011).

DNMT1 contains two tandem BAH domains that have a dumbbell-shaped configuration in the crystal structure (Song et al. 2011; Song et al. 2012; Takeshita et al. 2011); their potential function in the process of DNA methylation is unknown (Figure 1C). Structural and sequence homology with the ORC1 BAH domain suggests that DNMT1’s BAH1 domain could be a methyllysine-binding domain (Yang & R.-M. Xu 2013). Both of DNMT1’s BAH domains are positioned on the surface remote from the bound DNA, although BAH1 connects to the CXXC via the autoinhibitory linker (Song et al. 2011), and BAH2 contains a long protruding loop that makes contacts with both the TRD and with the phosphate backbone of the bound DNA in the active structure (Song et al. 2012).

There is disagreement in the literature on the question of DNMT1 binding to nucleosomes (Jeong et al. 2009; Gowher, Stockdale, et al. 2005). Biochemical experiments seem to strongly support the conclusion that in contrast to DNMT3A and DNMT3B, DNMT1 does not make stable association with chromatin (Jeong et al. 2009). Given the dynamics of maintenance methylation, such that DNMT1 must be constantly moving onto the next target hemimethylated

33 site, any stable interaction with chromatin indeed seems implausible, though a transient interaction with nucleosomes cannot be excluded.

iv. (GK) repeats

The (GK) repeats motif is constituted by 13 alternating glycine and lysine residues that link the regulatory N-terminus to the catalytic C-terminal domain (Figure 1C). Though little studied, the (GK) repeats stand out as a potentially important functional sequence in DNMT1 for several reasons. First, they are conserved among eukaryotic DNMT1 homologs (Figure 1D (a)).

Second, four out of seven lysines within the (GK) repeats can be acetylated (Kim et al. 2006;

Choudhary et al. 2009), indicating a possible regulatory function (Figure 1D (b)). Third, the

(GK) repeats significantly resemble the GK dipeptides in the N-terminal tails of core histones

H2A, H4 and noncanonical histone variant H2A.Z (Figure 1D (c)). Acetylation of these GK dipeptides within histones is well known to reduce electrostatic interaction with DNA, thus relaxing chromatin.

v. Methyltransferase domain and mechanism of methylation

The methyltransferase domain of DNMT1 consists of two subdomains called the catalytic core and the TRD (target recognition domain), adopting a class I methyltransferase fold (Song et al. 2011). Comparison of mouse DNMT1 with M.HhaI (Haemophilus haemolyticus methyltransferase) (Figure 1C (a)) demonstrates that the overall fold of the catalytic domain is similar between mammalian and bacterial enzymes, but several differences stand out: 1) five (I,

VI, VIII, IX, X) out of six conserved sequence motifs adopt a similar conformation, while motif

IV containing the catalytic loop and catalytic cysteine adopts different conformations; 2) the

34 DNA helix in M.HhaI can only be positioned within a cleft formed by the catalytic core and the

TRD, while DNA can shift positions within mDNMT1, entering the cleft in the active structure and being laterally displaced from the catalytic site by the CXXC domain in the inhibited structure; 3) the TRD subdomains have little sequence similarity while both contain an 8-amino acid DNA recognition loop, a second 8-amino acid loop in M.HhaI is replaced by a 73-residue loop in mDNMT1 (Song et al. 2011; Song et al. 2012).

The formation of the active DNMT1-DNA complex involves invasion of the DNA major groove by two TRD loops and the minor groove by the catalytic loop (Song et al. 2012). The

DNA duplex of hemimethylated DNA is embedded in the catalytic cleft of DNMT1 with the target cytosine looped out of the helix and inserted into the catalytic pocket of the methyltransferase domain, where it forms a covalent bond with the reactive cysteine (Cys 1229)

(Song et al. 2012). The target cytosine is surrounded by residues that are completely conserved between mammalian and bacterial enzymes and is in proximity to the S-adenosyl homocysteine

(AdoHcy) (Song et al. 2012).

Interestingly, the methylated cytosine on the parental strand is accommodated within a shallow concave and hydrophobic surface within the TRD domain, and in vitro mutation of tryptophan 1512 of the TRD, which is involved in base-stacking with the 5-methylcytosine yields a catalytically dead DNMT1 (Song et al. 2012). Comparison of active (Song et al. 2012) and autoinhibited (Song et al. 2011) DNMT1-DNA complexes shows that the largest conformational change takes place within the catalytic loop of the methyltransferase domain, where its penetration into the DNA helix is controlled by a kink at the N-terminal end.

35 c. Regulation of DNMT1 by other proteins i. UHRF1 is essential for maintenance DNA methylation

UHRF1 (ubiquitin-like, containing PHD and RING finger domain 1), formerly known as

NP95 (nuclear protein 95), is a homolog of the plant protein VIM1/ORTH2, required for maintenance of CG methylation at centromeric repeat sequences in Arabidopsis ( Johnson 2007,

Woo 2007). Null mutation of Uhrf1 phenocopies null mutation of Dnmt1 resulting in almost complete loss of genomic methylation and embryonic arrest in early gestation (Bostick et al.

2007; Sharif et al. 2007). UHRF1 binds with high affinity to hemimethylated CpG sites through its SET and RING finger associated (SRA) domain (Bostick et al. 2007; Sharif et al. 2007) via a base-flipping mechanism (Arita et al. 2008; Avvakumov et al. 2008; Hashimoto et al. 2008), and recruits DNMT1 to sites of newly replicated hemimethylated DNA at replication foci during S phase (Bostick et al. 2007; Sharif et al. 2007). Interaction of UHRF1 and DNMT1 is mediated via the latter’s RFTS domain, which is important for proper localization of DNMT1 during S phase (Leonhardt et al. 1992; Garvilles et al. 2015).

UHRF1 also contains tandem tudor domains that recognize H3K9me3 and a PHD finger that assists the tudor domains and also recognizes unmodified H3R2 (Rothbart et al. 2012). In vitro and cell culture studies have demonstrated that interaction of UHRF1 with H3K9me3 might be necessary for DNA methylation (Rothbart et al. 2012; Liu et al. 2013) however, mutation of

Uhrf1 in mice making the protein incapable of binding histones showed that loss of this interaction leads to only 10% loss of global DNA methylation, suggesting that maintenance methylation is mostly independent of H3K9me3 (Zhao et al. 2016).

Direct interaction of DNMT1 and UHRF1 has been demonstrated by several groups

(Felle et al. 2011; Du et al. 2010; Garvilles et al. 2015), however, recently it has been proposed

36 that ubiquitination of H3K23 in Xenopus (Nishiyama et al. 2013) and H3K18 in mammals (Qin et al. 2015) might act as an intermediate for their functional interaction. This model suggests that

RING domains of UHRF1 ubiquitinates histone H3, which is subsequently recognized by the

RFTS domain of DNMT1 (Nishiyama et al. 2013; Qin et al. 2015). It is presently not known if

UHRF1 has the ability to ubiquitinate DNMT1, though two ubiquitination sites in the enzyme recently have been identified: lysines 586 in the RFTS domain and lysine 997 in the BAH2 domain (Lecona et al. 2016).

UHRF1 is the only characterized factor up to date that is known to be required for maintenance DNA methylation in addition to DNMT1.

ii. Acetylation-ubiquitination regulation by USP7 and TIP60

A recently proposed model on the control of DNMT1 by an acetylation-ubiquitination coupled mechanism suggests a link between DNMT1 biosynthesis/degradation and the cell cycle; the molecular mechanism involves interactions with deubiquitinase USP7 and acetyltransferase TIP60 (Du et al. 2010).

Ubiquitin-specific protease 7 (USP7; also known as Herpes-associated ubiquitin-specific protease or HAUSP) is important for regulation of p53 and MCM2 stability (M. Li et al. 2002;

Hu et al. 2006; Sheng et al. 2006). It has been reported that during S phase USP7 binds to unacetylated (GK) repeats of DNMT1 and stabilizes the enzyme by deubiquitination (Cheng et al. 2015). Later in S phase and during early G2 phase, TIP60 acetylates the (GK) repeats, which subsequently leads to the dissociation of DNMT1/USP7 complex (Du et al. 2010), ubiquitination of DNMT1 at lysine 586 (RFTS domain) and lysine 997 (BAH2 domain) (Lecona et al. 2016), and proteasomal degradation of DNMT1.

37 This model was formulated based on in vitro biochemical studies, while genetic analysis is complicated by the fact that Usp7 mutation leads to DNA replication defects (Lecona et al.

2016) and cell lethality via p53 misregulation (M. Li et al. 2002). We further test the proposed model in Chapter 3.

38 I.5. Outline of Thesis

Having reviewed the relevant body of literature underlying this work, we now turn our attention to reporting the new findings we have generated concerning regulatory domains of the maintenance methyltransferase DNMT1.

Chapter 2 describes how the previously unknown function of DNMT1's (GK) repeats was elucidated using reverse genetics. It also provides evidence establishing a novel in vivo role of

DNMT1 in de novo methylation of paternally imprinted DMRs. In Chapter 3, we round out our investigation of the (GK) repeats by challenging the preliminary model put forward in prior literature that implicates this motif in DNMT1's interaction with USP7 and regulation throughout the cell cycle. Chapter 4 explores the function of DNMT1's BAH domains as protein-binding modules and demonstrates that they are required for targeting of the enzyme to replication foci during S phase. Finally, with Chapter 5, we take the opportunity to consider more broadly the significance of these findings as well as future perspectives for the field of epigenetics.

In the end, this thesis aims to overturn some of the basic assumptions that have long held sway in the study of the molecular mechanisms of DNA methylation and yield a more nuanced understanding of the complex machinery that has evolved for this purpose.

39 CHAPTER II. ACETYLATED (GK) REPEATS PREVENT DNMT1 FROM CARRYING

OUT DE NOVO METHYLATION AT PATERNALLY IMPRINTED GENES IN

EMBRYONIC STEM CELLS

II.1. Rationale and Summary

Mammalian DNMT1 has a large N-terminus with several distinct domains that regulate the enzyme’s targeting (RFTS), nuclear localization, and binding to DNA (CXXC) (Figure 1-C).

However, several domains within DNMT1 have unknown function, among which is a short motif of (GK) repeats. The (GK) repeats are composed of 13 alternating glycine and lysine residues and link the regulatory N-terminus to the catalytic C-terminus. The (GK) repeats stood out as a potential functional sequence in DNMT1 for several reasons. First, they are conserved among eukaryotic DNMT1 homologs (Figure 1D(a)). Second, four out of seven lysines within the (GK) repeats can be acetylated (Kim et al. 2006; Choudhary et al. 2009) which indicates a possible regulatory function (Figure 1D(b)). Third, the (GK) repeats closely resemble the GK dipeptides in the N-terminal tails of histones H2A, H4 and H2A.Z (Figure 1D(c)). Acetylation of these GK dipeptides within histones is well known to reduce electrostatic interaction with DNA, which relaxes chromatin. Hence, I hypothesized that enzymatic activity of DNMT1 could be modulated through neutralization of the positive charge of the (GK) repeats.

This chapter describes the phenotype and functional consequences of the first attempted genetic mutagenesis of the (GK) repeats in mouse ES cells. Methylation analysis of cells expressing DNMT1 with fully substituted (GK) repeats identifies the function of this motif in methylation of paternally imprinted genes. The data presented here revealed that mutation of the

(GK) repeats motif confers de novo methyltransferase activity to DNMT1.

40 II. 2. Materials and Methods a. Cell lines and cell culture

Mouse ES cells were cultured on gelatin following standard techniques. Stable ES cell lines were generated by nucleofection of Dnmt1-null ES cells (Lei et al. 1996) with MT80 minigene (Figure

2A) and pGKPuro plasmid for Puromycin resistance. MT80 minigene, carrying 12kb of 5’

Dnmt1 genomic sequence with endogenous promoter and 5.5kb of Dnmt1 cDNA (Tucker,

Talbot, et al. 1996; Damelin & Bestor 2007), was modified by the addition of the N-terminal

Flag-HA tag after the translation start site. Point mutations were produced using QuikChange

Site-Directed Mutagenesis kit (Agilent). Individual clones were selected with Puromycin for 10-

14 days and picked into 96-well plates. Clones were genotyped using primers specific to the

Flag-HA tag . Positive clones were propagated and levels of DNMT1 expression were tested by

Western blotting. Clones expressing DNMT1 at wild-type levels were used for further studies.

FIGURE 2A. Dnmt1 minigene system used for genetic rescue experiments. Construct contains 12 kb of genomic DNA with endogenous Dnmt1 promoter. Minigene was modified by the addition of HA and FLAG tags after the translation start site. Dnmt1 point mutations were introduced by site-directed mutagenesis.

41 b. Immunoblotting

Whole cell extracts were prepared by lysis in RIPA buffer (150mM NaCl, 1% NP-40, 0.5%

Deoxycholic acid, 0.1% SDS, 50mM Tris pH 7.5) and briefly sonicated to disrupt genomic

DNA, then heated to 100° in SDS and loaded onto SDS-PAGE gels. Proteins were transferred to nitrocellulose membrane and blocked in 5% milk, 0.1% Tween20, PBS for 1 hour at room temperature. Blots were incubated at 40C overnight with primary antibodies in 10% FBS, 0.1%

Tween20. Blots were washed with PBST, after incubation with DNMT1 antibody blots were washed briefly with PBST and PBS. Antibodies used: rabbit polyclonal to DNMT1 (Cell

Signaling Technology, 5032 (D63A6) 1:2,500 dilution), rabbit polyclonal to HA tag (Abcam, ab9110, 1:5000), mouse monoclonal to alpha-tubulin (Abcam, ab7291, 1:10,000).

c. Methylation analysis

Genomic DNA was digested for two rounds with methylation-sensitive enzyme HpaII

(recognition sequence CCGG), its isoschizomer MspI as a control, or McrBC that cuts DNA between two methylated sites. DNA was quantified and ran on 0.8% agarose gel, and stained with ethidium bromide.

d. Southern blots

Southern blot analysis was performed with IAP probe generated by PCR. Primers (matching to

LTR+GAG) used for probe amplification: probe_IAP_F (GGTAAACAAATAATCTGCGC); probe_IAP_R (CTGGTAATGGGCTGCTTCTTCC). Probes for minor and major satellites were prepared by EcoRI digestion from plasmids (gift of Mathieu Boulard). Agarose gels were transferred to a Nytran SPC membrane (GE Healthcare) overnight in 10X SSPE buffer. After

42 crosslinking, membrane was prehybridized with 6X SSC, 5X Denhardts, 1% SDS, 10% Dextran

Sulfate for 1hour at 45° and incubated overnight with IAP probe at 45°. Membranes were washed once with 2X SSC, 0.5% SDS; 2X with 1X SSC, 0.5% SDS, and 1X with 0.2X SSC,

0.5% SDS.

e. Bisulfite sequencing

Genomic DNA isolated from ES cells was bisulfite converted using the EZ DNA Methylation-

Gold kit. All imprinted genes were amplified with two rounds of nested or semi-nested PCR using the primer pairs listed in Table 2. The PCR product resulting from the second round of

PCR was sequenced using Sanger sequencing.

Primer name Forward primer Reverse primer H19_BS1 GAGTATTTAGGAGGTATAAGAATT ATCAAAAACTAACATAAACCCCT H19_BS2 GTAAGGAGATTATGTTTATTTTTGG CCTCATTAATCCCATAACTAT

Kcnq1ot1_BS1 TATAAGGAAGGTTAAGAATTTATTGTAATTATTG AATTTCTTCTCTAAATCAACACRACACAAA Kcnq1ot1_BS2 GTTTTTATTTTAAGTTTTTAATTTTTTATATTTGAATTT ACAAACACTCACACCAAAAACAAAAATAAT

IG-DMR_BS1 AGATGTGTTGTGGATTTAGGTTGTAG CTAAACTACAATCTATATAATCACAACAC IG-DMR_BS2 AAGTGTTGTGGTTTGTTATGGGTAAG CCATTCCCAATCTATAAAAATATTTTAACC

Rasgrf1_1 GAGAGTATGTAAAGTTAGAGTTGTGTTG ATAATACAACAACAACAATAACAATC Rasgrf1_2 TAAAGATAGTTTAGATATGGAATTTTGGG

Peg1_1 GATTTGGGATATAAAAGGTTAATGAG TCATTAAAAACACAAACCTCCTTTAC Peg1_2 TTTTAGATTTTGAGGGTTTTAGGTTG AATCCCTTAAAAATCATCTTTCACAC

Peg3_1 TTTTTAGATTTTGTTTGGGGGTTTTTAATA AATCCCTATCACCTAAATAACATCCCTACA Peg3_2 TTGATAATAGTAGTTTGATTGGTAGGGTGT ATCTACAACCTTATCAATTACCCTTAAAAA

Snrpn_BS1 TATGTAATATGATATAGTTTAGAAATTAG AATAAACCCAAATCTAAAATATTTTAATC Snrpn_BS2 AATTTGTGTGATGTTTGTAATTATTTGG ATAAAATACACTTTCACTACTAAAATCC

TABLE 2. Primers for bisulfite sequencing of imprinted genes

43

f. RT-qPCR

RNA was extracted from ES cells using Trizol and chloroform, and subsequently treated with

TurboDNAse. After precipitation with sodium acetate and ethanol, 1ug of RNA was used for cDNA synthesis using Superscript IV. cDNA was diluted 1/15 and used for RT-qPCR. We used

PowerUp SYBR Green PCR Master Mix (catalog no. A25742; Applied Biosystems). Primers for

H19 and b-actin were a kind gift of Mathieu Boulard. g. Statistical analysis

Peak heights for methylated and unmethylated alleles at each CpG site were quantified using

4Peaks program. Statistical analysis was performed using Mann-Whitney test.

44 II.3. Results a. GR substitution of the (GK) repeats does not alter DNMT1 stability

To investigate the function of the (GK) repeats, we made seven substitutions of all lysine residues within the repeats to arginines (Figure 2B). Such substitution preserved the positive charge of the (GK) repeats while rendering them unacetylatable. Expression constructs that produced wild-type GK or GK→GR DNMT1 were transfected into Dnmt1-null ES cells and stable clones were established as described in the Methods. Expression was driven by a 12kb segment of DNA 5’ to the Dnmt1 gene, which contains the endogenous Dnmt1 promoter and other regulatory sequences necessary for expression of DNMT1 at wild-type levels(Tucker,

Talbot, et al. 1996; Damelin & Bestor 2007). As shown in Figure 2B(b), GK and GR proteins were expressed at levels similar to that of endogenous DNMT1 in the wild type, suggesting that the GK→GR substitution results in a stable DNMT1 protein.

b. GR DNMT1 rescues methylation globally and at repetitive sequences

The methylation status of GR cells was determined by digestion of genomic DNA with the methyl-sensitive restriction enzyme HpaII. Figure 2B(c) shows that similarly to GK protein,

GR DNMT1 rescues global methylation levels to wild-type levels. Further Southern blot analysis revealed that GR DNMT1 is able to normally methylate IAP retrotransposons and minor satellite repeat sequences (Figure 2B(c)) indicating that mutation of the (GK) motif does not compromise the maintenance activity of DNMT1. Accordingly, we can conclude that acetylation of the (GK) repeats is not involved in the regulation of the methyltransferase activity of DNMT1 or its targeting to replication foci.

45 c. GR DNMT1 produces ectopic gain of methylation at paternally imprinted genes

While restrotransposons and repetitive elements constitute the majority of methylated

DNA, another essential compartment where methylation plays an important regulatory role are imprinted genes. Imprinted genes are a set of genes that are monoallelically expressed in a parent-of-origin dependent manner. The choice of the expressed allele in the embryo is under the control of sex-specific methylation patterns that are established in the parental germline

(Bourc'his & Bestor 2006).

It has been shown previously that reintroduction of Dnmt1 after its ablation or transient inhibition in ES cells fails to restore genomic imprints both at paternal and maternal DMRs

(Tucker, Beard, et al. 1996; McGraw et al. 2015). In fact, imprinting information can be restored only upon germline transmission (Tucker, Beard, et al. 1996) suggesting either the existence of a germline factor necessary for imprint establishment or the presence of an inhibitory factor in ES cells that must be absent in the germline.

The methylation status of all three paternally imprinted DMRs H19, Dlk1/Gtl2 and

Rasgrf1 was examined using bisulfite sequencing of genomic DNA derived from GK or GR cells. Analysis revealed that all three paternal imprints exhibited de novo methylation in GR cells

(Figures 2C, 2D, 2E), which was statistically significant when compared to wild type GK (all p- values are equal or below 0.01). On average GR cells exhibited a 30% gain of methylation

(Figures 2C(b); 2D(b); 2E(b)). H19 result was validated by subcloning and sequencing of individual clones (Figure 2C(c)).

Importantly, reintroduction of wild-type GK DNMT1 does not catalyze de novo methylation at paternal imprints. Therefore, ectopic methylation at paternally imprinted genes originates from the mutation of the (GK) motif. This result was surprising as it was thought that

46 DNMT1 specialized in maintenance methylation. Our data reveals that mutation that prevents acetylation of the (GK) repeats confers a de novo methylation activity unto DNMT1.

Bisulfite analysis was done on two independent wild type GK and two GR lines, and quantified data for both clones for all three genes is statistically significant (Figures 2C(b);

2D(b); 2E(b)). This experiment does not discriminate between paternal and maternal alleles, therefore this gain of methylation could be restricted to one allele or randomly distributed between both alleles.

The same experiment was conducted on wild-type and Dnmt1-null ES cells as an important control to show that wild type cells exhibit normal monoallelic methylation, while cells lacking Dnmt1 show a complete loss of methylation at imprinted loci (Figures 2C(a); 2D(a);

2E(a)).

d. Expression of GR DNMT1 has no effect on methylation at maternally imprinted genes.

Using bisulfite analysis methylation levels were determined at the following maternally imprinted DMRs: Igf2r, Kcnq1ot1, Snrpn, Peg3 and Peg1 (Figures 2F; 2G; 2H; 2I; 2J). In contrast to paternally imprinted genes, at each of the maternal imprints examined there was no gain of ectopic methylation in GR cells. Bisulfite analysis was done on two independent GK and two GR lines producing the same result. Control experiments on wild-type and Dnmt1-null ES cells were carried out to confirm the presence of expected patterns of methylation (Figures 2F;

2G; 2H; 2I; 2J).

The striking difference in the effects on paternal and maternal imprints in GR cells points to a previously unknown mechanism by which DNMT1 is capable of recognizing and de novo methylating specifically paternal DMRs.

47

e. Gain of methylation at paternally imprinted genes results in transcriptional changes

To investigate the functional consequences of the observed gain of methylation at paternal imprints, RT-qPCR was conducted on RNA from GK and GR cells with primers for

H19 and Igf2. RNA from wild-type and Dnmt1-null ES cells was used as a control to show that when H19 DMR is demethylated, the expression of H19 gene increases, while the expression of

Igf2 drops. The role of DNA methylation in regulation of imprinted genes has been well characterized (Reik et al. 1987; Sapienza et al. 1987; Swain et al. 1987; Wutz et al. 1997;

Thorvaldsen et al. 1998). We hypothesized that expression of paternally imprinted genes in GR cells will be altered in comparison to Dnmt1-null and GK cells. RT-qPCR analysis revealed that expression of H19 is significantly decreased in GR cells (Figure 2K), and is lower than in wild- type cells suggesting that both alleles of H19 are methylated and silenced in GR cells.

48

FIGURE 2B. Substitution of lysines by arginines within the (GK) repeats does not affect maintenance methylation. a., Position of GK to GR substitution within the (GK) repeats of DNMT1. b., GR substitution does not effect DNMT1 stability. Stable clones expressing wild- type levels of DNMT1 were further analyzed. c., Normal methylation of genomic DNA as measured by resistance to methyl-sensitive restriction enzyme HpaII; Southern blot analysis with probes for IAP retrotransposon and Minor Satellites revealed normal methylation in GK and GR clones.

49

FIGURE 2C. Methylation status of H19 paternally imprinted DMR. a., Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers for H19. Unmethylated molecules are represented by TG, methylated molecules are represented by CG. Wild type and Dnmt1-/- were used as controls. GR clones exhibit de novo methylation at all CpG sites. b., Methylation levels across all CpG sites were quantified for two independent wild type GK and GR clones. p-value was calculated using Mann-Whitney test (left). Quantification of methylation levels at individual CpG sites (right). c., Amplified PCR products were subcloned into pJET and individual colonies sequenced, confirming the phenotype when sequencing bulk PCR product. p-value was calculated using Mann-Whitney test.

50

FIGURE 2D. Methylation status of Dlk1/Gtl2 paternally imprinted DMR. a., Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers for Dlk1/Gtl2. Unmethylated molecules are represented by TG, methylated molecules are represented by CG. Wild type and Dnmt1-/- were used as controls. GR clones exhibit de novo methylation at all CpG sites. b., Methylation levels across all CpG sites were quantified for two independent wild type GK and GR clones. p-value was calculated using Mann-Whitney test (left). Quantification of methylation levels at individual CpG sites (right).

51

FIGURE 2E. Methylation status of Rasgrf1 paternally imprinted DMR. a., Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers for Rasgrf1. Unmethylated molecules are represented by TG, methylated molecules are represented by CG. Wild type and Dnmt1-/- were used as controls. GR clones exhibit de novo methylation at all CpG sites. b., Methylation levels across all CpG sites were quantified for two independent wild type GK and GR clones. p-value was calculated using Mann-Whitney test (left). Quantification of methylation levels at individual CpG sites (right).

52

FIGURE 2F. Methylation status of Kcnq1ot1 maternally imprinted DMR. Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers. Methylated sites are denoted by filled in blue peaks and marked by CG, unmethylated sites are marked by red trace and TG. Wild type and Dnmt1-/- were used as controls. GR clones do not gain methylation.

53

FIGURE 2G. Methylation status of Igf2r maternally imprinted DMR. Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers. Methylated sites are denoted by filled in blue peaks and marked by CG, unmethylated sites are marked by red trace and TG. Wild type and Dnmt1-/- were used as controls. GR clones do not gain methylation.

54

FIGURE 2H. Methylation status of Snrpn maternally imprinted DMR. Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers. Methylated sites are denoted by filled in blue peaks and marked by CG, unmethylated sites are marked by red trace and TG. Wild type and Dnmt1-/- were used as controls. GR clones do not gain methylation.

55

FIGURE 2I. Methylation status of Peg1 maternally imprinted DMR. Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers. Methylated sites are denoted by filled in blue peaks and marked by CG, unmethylated sites are marked by red trace and TG. Wild type and Dnmt1-/- were used as controls. GR clones do not gain methylation.

56

FIGURE 2J. Methylation status of Peg3 maternally imprinted DMR. Sequencing traces for PCR products amplified from bisulfite-converted genomic DNA using nested primers. Methylated sites are denoted by filled in blue peaks and marked by CG, unmethylated sites are marked by red trace and TG. Wild type and Dnmt1-/- were used as controls. GR clones do not gain methylation.

57 de novo methylation

(p<0.01)

Paternal DNMT1GK DNMT1GR

H19 – +

Rasgrf1 – +

Dlk1/Gtl2 – +

Maternal DNMT1GK DNMT1GR

Igf2r – –

Snrpn – –

Kcnq1ot1 – –

Peg1 – –

Peg3 – –

TABLE 3. Summary of methylation status of imprinted genes in GK and GR cells.

58 Expression of H19 0.16

0.14

0.12

0.1

Actin -

0.08

0.06 Relative to B to Relative

0.04

0.02

0 Wild type Dnmt1c/c GK GR

FIGURE 2K. Expression of H19 is altered in GR clones. RT-qPCR was performed on biological triplicates. Wild type and and Dnmt1-/- cDNA were used as controls. H19 is upregulated in Dnmt1-/- due to the demethylation of both alleles. Decreased expression in GR clones demonstrates silencing of both alleles.

59

FIGURE 2L. Model of acetylation-dependent maintenance and de novo functions of DNMT1. When acetylated at the (GK) repeats DNMT1 carrying out only maintenance DNA methylation, whereas deacetylation results in de novo activity at paternal imprinting control regions.

60 II.4. Interpretation

The data presented in this chapter demonstrate that cells expressing mutant DNMT1 with unacetylatable arginines substituting for the lysine of the (GK) repeats with arginines and thereby rendering them un-acetylatable, have an ectopic gain of methylation at paternally imprinted genes, while maternal imprints do not similarly acquire abnormal methylation (Table

3). Furthermore, the observed gain of methylation has a functional effect and results in changes in the transcription of paternally imprinted genes, such as H19. Importantly, GR protein is stable and functional as evidenced by the fact that it rescues global methylation levels as well as methylation at IAPs and minor satellites. Based on these findings, we propose a model in which

DNMT1 in its unacetylated state acquires a de novo methyltransferase function specifically at paternal DMRs. This model raises the possibility that unacetylated DNMT1 establishes de novo methylation at paternal imprints in prospermatogonia.

a. Expression of DNMT1 in gonocytes during male imprint establishment

Paternal imprints are established in the male germline between embryonic days E14.5 and E16.5 (Kato et al. 2007). The paradigm in the field of DNA methylation has been that

DNMT1 is strictly a maintenance enzyme, and accordingly it was concluded that DNMT1 is not expressed during stages of genome-wide remethylation. However, careful examination of the primary literature shows that this interpretation might not be true on all the cases, especially as it concerns the male germline.

Immunoflourescence/immunohistochemistry studies show that DNMT1 is expressed in gonocytes at embryonic days E13.5 (La Salle et al. 2004), E14.5 and at a lower level at E16.5

(Sakai et al. 2001). DNMT1 expression is not detectable at E18.5 but resumes expression at 3dpp

61 (La Salle et al. 2004). Thus, DNMT1 is in fact expressed in gonocytes during the window of paternal imprint establishment, lending plausibility to its de novo methylation role according to our proposed model.

b. Sex-specific function of DNA methyltransferases in the germline

Different DNA methyltransferases are required for the establishment of paternal and maternal imprints. Genetic data demonstrates that Dnmt3l and Dnmt3a are required for establishment of maternal imprints in the oocyte (Bourc'his et al. 2001; Kaneda et al. 2004).

However, deletion of Dnmt3l in the male germline leads to a partial loss of methylation at H19

(Bourc'his & Bestor 2004). Conditional deletion of Dnmt3b in male germline has no effect on paternal imprint methylation. The role of Dnmt3a in de novo methylation of paternal imprints is not well understood as variable methylation phenotypes have been reported after conditional disruption of Dnmt3a in the male germline (Kaneda et al. 2004; Kato et al. 2007). Thus importance of DNMT3A for paternal imprint establishment requires further investigation. Quite recently, it has been demonstrated that the newly discovered rodent-specific de novo methyltransferase Dnmt3C is required for methylation of Rasgrf1 but not of H19 and Dlk1/Gtl2

(Barau et al. 2016). As such, the existing literature appears to be clear in establishing that methyltransferase DNMT3A and co-factor DNMT3L are required for methylation of maternally imprinted genes. However, DNMT3L is dispensable for de novo methylation of paternal ICRs, indicating that establishment of imprints in the male germline is mediated by other factors. Our results evidence the likely role of DNMT1 in this process.

62 c. Phenotypic specificity to paternal differentially methylated regions

Two possibilities could explain the observation that GR mutants exhibit methylation at paternal DMRs but not at maternal DMRs: 1) male germ cells express unknown factor (s) that targets DNMT1 to paternal imprints, or 2) paternal DMRs carry an intrinsic signal that recruits

DNMT1.

Since experiments were performed in male ES cells, it cannot be excluded that those cells express a germline-specific protein or RNA that targets DNMT1 to the paternal DMRs. To further investigate the influence of sex and sex-specific factors on the phenotype of GR cells, it would be interesting to generate the same mutation in female ES cells. Unfortunately, female ES cells are not desirable for use in methylation studies since they lose methylation over time (Ooi et al. 2010), and are known to randomly lose one X chromosome (Eggan et al. 2002).

d. Distinction between paternal and maternal imprints

We show that GR DNMT1 deposits ectopic methylation specifically at paternally imprinted genes, while having no effect on maternal imprints, which reflects a well-established sexual dimorphism in genomic imprinting (Bourc'his & Bestor 2006). There are several notable differences between paternal and maternal DMRs. First, only three paternal DMRs are known, while about 20 maternal DMRs have been characterized to date (Duffié & Bourc'his 2013).

Second, paternal DMRs have lower CpG density than the maternal DMRs and are located in intergenic regions, while maternal DMRs have higher CpG density and are found within promoters (Kobayashi et al. 2006; Bourc'his & Bestor 2006). Third, paternal imprints are established early in the developing male germline around the time of birth, while maternal imprints are established in the growing oocyte before ovulation (Bourc'his & Bestor 2006;

63 Schaefer et al. 2007). Fourth, in addition to methyltransferases, different chromatin factors may be required for maternal versus paternal imprint establishment. For example, KDM1B, a histone

H3 lysine 4 demethylase, is necessary for methylation of four maternal imprints (Ciccone et al.

2009), while methylation of paternally imprinted Rasgrf1, but not of H19 or Dlk1/Gtl2, is dependent on PIWI (Watanabe et al. 2011).

In summary, the published literature demonstrates that 1) DNMT1 is expressed in gonocytes during the acquisition of paternal imprinting; 2) DNMT3A, DNMT3L and DNMT3C seem to have only partial roles in paternal imprint establishment; and 3) there is a well- characterized difference between paternal and maternal imprints supporting our own observed phenotype. We predict that DNMT1 expressed in gonocytes between E14.5-16.5 is found in the unacetylated form, and this lack of acetylation allows DNMT1 to carry out de novo methylation at paternally imprinted DMRs.

Significantly, these findings have demonstrated that maintenance and de novo methylation activities are not as distinctly separated as previously thought. In fact, in vitro studies have shown that DNMT1 has greater affinity towards unmethylated DNA than de novo methyltransferases DNMT3A and DNMT3B (Okano et al. 1998). To date, all of the studied domains of DNMT1 were shown to be important for its maintenance activity (Leonhardt et al.

1992; Song et al. 2011). The (GK) repeats are therefore the first identified motif of DNMT1 that is important for its de novo methyltransferase function in vivo.

e. Future directions

To further test our model, we propose several additional experiments.

Immunofluorescence/immunohistochemistry should be performed to test for the presence of the

64 unacetylated form of DNMT1 in gonocytes at E14.5-16.5. The caveat to this experiment is that low levels of the protein might not be detectable, especially when compared to the neighboring somatic cells in the developing testes. Gonocytes could also be isolated by cell sorting to perform experiments on a purified population without any contamination of somatic tissue.

To test whether unacetylated DNMT1 plays a role in the establishment of paternal imprints in vivo, a loss-of-function mutation consisting of permanently acetylated lysines within the (GK) repeats should be engineered in the mouse. Scientists have traditionally used glutamine to mimic acetyl-lysine(Kamieniarz & Schneider 2009). However, due to the differences in structure lysine to glutamine substitution is not appropriate for the study of the role of acetylation in vivo (Kamieniarz & Schneider 2009). To this end, orthogonal aminoacyl-tRNA synthetase

(aaRS)/tRNA pair method has been developed to genetically incorporate non-canonical amino acids into endogenous proteins (reviewed by (Xiao & Schultz 2016). The most recent paper describes genetic incorporation of six acetyl-lysines into histone H3 in mouse ES cells (Elsässer et al. 2016). While this method is promising and remains the method of choice to study the role of the (GK) repeats in the germline directly, it requires several steps of genetic manipulations and has not been applied to mutagenesis in the mouse as of yet.

To further confirm the phenotype of the GR mutation at the endogenous level in ES cells,

CRISPR/Cas9 technology can be used to substitute lysines within the (GK) repeats with arginines in the endogenous Dnmt1 locus in wild-type ES cells (Ran et al. 2013; Doench et al.

2016). Since wild-type cells carry methylation only at one allele at imprinted genes, we postulate that the GR mutant would exhibit a gain of methylation at the second allele of paternally imprinted genes. The technical challenge of this experiment lies in the fact that the coding sequence for the (GK) repeats is split between two exons. Targeting two separate but relatively

65 adjacent genomic sequences simultaneously using CRISPR/Cas9 might be challenging as it would most likely lead to the deletion of the sequence between the two cut sites. This issue could be addressed by a two-step mutagenesis of the (GK) repeats: first targeting half of the motif in one exon, and then targeting the other half of the repeats in the downstream exon. This experiment also requires that both alleles acquire the exact mutation. To this end, using the

CRISPR/Cas9 system I have developed a heterozygous ES cell line harboring a deletion of the

1kb genomic sequence encoding the BAH domains and the (GK) repeats. This allows for the remaining wild-type allele of Dnmt1 to be targeted with greater efficiency. This cell line exhibits normal methylation status due to the fact that Dnmt1 mutation is not haploinsufficient(E. Li et al.

1992), and therefore it would be useful for the creation of various point mutations within the

BAH domains and the (GK) repeats using CRISPR/Cas9 technology.

66

This page intentionally left blank

67 CHAPTER III. USP7-INDEPENDENT REGULATION OF DNMT1 EXPRESSION

(Manuscript #1)

III.1. Rationale and Summary

DNMT1 is present in nuclei at constant levels throughout the cell cycle (Estève et al.

2011), and is recruited to hemimethylated sited at replication foci during S phase to copy the methylation pattern (Leonhardt et al. 1992). There has been an interest in the field to identify the role of DNMT1 during G1 and G2 cell cycle phases, when it exhibits diffuse nucleoplasmic distribution. Recently a model has been postulated linking DNMT1 biosynthesis and degradation to cell cycle (Du et al. 2010). This model suggests that DNMT1 levels during cell cycle are controlled by an acetylation-ubiquitination coupled mechanism, which involves interactions with the deubiquitinase USP7 and acetyltransferase TIP60 (Du et al. 2010).

Ubiquitin-specific protease 7 (USP7; also known as Herpes-associated ubiquitin-specific protease or HAUSP) is a critical regulator of p53 and MCM2 stability (M. Li et al. 2002; Hu et al. 2006; Sheng et al. 2006). It has been reported that during S phase USP7 binds to unacetylated

(GK) repeats of DNMT1 and stabilizes it by deubiquitination (Cheng et al. 2015). Later in S phase and during early G2 phase, TIP60 acetylates the (GK) repeats, which results in the dissociation of DNMT1/USP7 complex (Du et al. 2010), ubiquitination of DNMT1 at lysine 586

(RFTS domain) and lysine 997 (BAH2 domain) (Lecona et al. 2016), and proteosomal degradation of DNMT1.

This model was formulated based on in vitro biochemical studies, while any genetic analysis is complicated by the fact that Usp7 mutation leads to DNA replication defects (Lecona et al. 2016) and cell lethality via p53 misregulation (M. Li et al. 2002). We further test the

68 proposed model by characterizing genetic loss-of-function of USP7 in MEF cells, knocking down of USP7 in lung carcinoma H1299 cells, and by expression of DNMT1 carrying point mutations of the (GK) repeats that mimic acetylated lysines in vitro thus blocking the interaction of USP7 and DNMT1 (Cheng et al. 2015).

III.2. Manuscript

69 USP7-independent Regulation of DNMT1 Expression

Olya Yarychkivska1, Omid Tavana2, Wei Gu2, and Timothy H. Bestor1,3

1Department of Genetics and Development College of Physicians and Surgeons of Columbia University New York NY 10032 USA

2Department of Pathology and Cell Biology Institute for Cancer Genetics College of Physicians and Surgeons of Columbia University New York NY 10032 USA

3Author for correspondence: Department of Genetics and Development College of Physicians and Surgeons of Columbia University 701 W. 168th St. New York NY 10032 USA

[email protected] Tel 212 305 5331

70 It has been reported that USP7 (ubiquitin-specific protease 7) prevents ubiquitylation and degradation of DNA methyltransferase 1 (DNMT1) by direct binding of USP7 to the glycine-lysine (GK) repeats that join the N-terminal regulatory domain of DNMT1 to the C-terminal methyltransferase domain. The interaction of USP7 and DNMT1 had also been reported to depend on the acetylation status of the (GK) repeats. We found that DNMT1 was present at normal levels in mouse and human cells that contained undetectable levels of USP7, and loss of USP7 did not affect levels of UHRF1 or of histone H2A ubiquitylated at lysine 119. Mutation of the (GK) repeats to (GQ) repeats prevents binding of USP7 but did not affect the stability of DNMT1 or the ability of the mutant protein to restore genomic methylation levels when expressed in Dnmt1-null ES cells under the control of the Dnmt1 promoter. Furthermore, both USP7 and PCNA were recruited to sites of DNA replication in a manner that was independent of the presence or absence of DNMT1. These and other findings call into question the role of post-translational regulation of steady state levels of DNMT1, while several lines of evidence indicate that homeostasis of DNMT1 is instead controlled primarily at the level of transcription.

71 DNMT1 methylates hemimethylated CpG dinucleotides that appear after passage of the replication fork during S phase to ensure mitotic inheritance of genomic methylation patterns 1. As shown in figure 1a, DNMT1 bears a large N-terminal region that contains multiple functional domains that mediate nuclear import (the NLS domain), the suppression of de novo methylation (the CXXC and autoinhibitory domains) 2, two bromo-adjacent homology (BAH) domains of unknown function, and a set of glycine- lysine (GK) repeats consisting of 13 alternating glycine and lysine residues. The (GK) repeats join the N-terminal regulatory domain to the C-terminal methyltransferase domain that is closely related in sequence and structure to DNA (cytosine-5) methyltransferases from both prokaryotes and eukaryotes 3. The (GK) repeats are conserved among all eukaryotic DNMT1 homologs, all of which are lysine-rich and have at least one GK dipeptide between the regulatory and catalytic domains. GK dipeptides are also enriched in the N-terminal tails of histones H2A and H4 (Fig. 1b).

The structures of several mouse and human DNMT1 proteins have been determined 2,4- 6, and in all cases the (GK) repeats are disordered in the crystal structure and not resolved, which implies that they do not form stable associations with other domains of DNMT1. The approximate positions of the (GK) repeats of with respect to other domains of human DNMT1 are shown in Fig. 1c.

It has been reported that the (GK) repeats are involved in controlling proteosomal degradation of DNMT1 via reversible acetylation of lysines within the (GK) repeats 7-11; this has been suggested to couple DNMT1 biosynthesis and degradation to the cell cycle 9. Ubiquitin-specific protease 7 (USP7; also known as Herpes-associated ubiquitin-specific protease or HAUSP) 12-14 has been reported to bind to the unacetylated (GK) repeats of DNMT1 10; the acetylated form does not bind to USP7 and has been proposed to undergo ubiquitylation at lysine 586 (RFTS domain) and lysine 997(BAH2 domain) 15, and to be targeted for proteosomal degradation 7,16. DNMT1 was also reported to be required for the recruitment of USP7 to sites of DNA replication in S- phase nuclei 10. These reports depended on the results of in vitro studies or transfection of tagged and mutated forms of DNMT1 into cells that contained normal levels of

72 endogenous DNMT1; the relative levels of endogenous and exogenous DNMT1 were not reported and only exogenous DNMT1 was observed. Furthermore, levels of DNMT1 do not change appreciably during the cell cycle 17, although DNMT1 is not expressed in 18 G0 cells . We report here that reduction of USP7 to undetectable levels in mouse and human cells does not cause a measurable reduction in content of DNMT1 or in DNA methylation, that substitution of the acetylated lysines within the (GK) repeats by glutamines does not affect the amount or activity of endogenous DNMT1, and that recruitment of USP7 to replication foci during S phase is independent of the presence of DNMT1.

RESULTS Genetic removal of USP7 does not affect stability or activity of DNMT1. Mouse embryonic fibroblasts (MEFs) homozygous for Floxed alleles of Usp7 were cultured, immortalized by transfection of an expression construct that encoded SV40 large T antigen, and infected with a recombinant Adenovirus that drives expression of GFP and Cre recombinase. Cultures that showed near-complete infection as measured by GFP expression were cultured for 3-5 days and evaluated for expression of USP7 and DNMT1 by immunoblot. As shown in Fig. 2a, levels of DNMT1 were not affected when USP7 was reduced to undetectable levels. USP7 was present at normal levels in embryonic stem (ES) cells null for Dnmt1 (Fig.2a). As shown in Fig. 2b, global levels of DNA methylation were not notably affected by the removal of USP7, which is consistent with the lack of effect on DNMT1 levels upon removal of USP7. The human lung carcinoma cell line H1299 was transfected with an expression cassette that drives expression of an inducible USP7 shRNA and a stable cell line was derived. The shRNA produced is complementary to nucleotides 2766-2748 of transcript variant 1 and caused strong repression of all Usp7 transcript variants. This inducible system allows simultaneous induction of shRNA within the entire cell population. As shown in Fig. 2c, shRNA-mediated reduction in USP7 to undetectable levels did not affect expression of DNMT1 or UHRF1, and the levels of ubiquitylation of histone H2A at lysine 119 were normal. Fig. 2d shows that global DNA methylation was not measurably affected by the

73 loss of USP7 expression. These genetic data indicate that removal of USP7 does not lead to a significant reduction in DNMT1 or in global DNA methylation.

Mutation of GK repeats results in normal in vivo stability and activity of DNMT1. To further investigate the role of the reported interaction of USP7 with the (GK) repeats of DNMT1, we substituted the lysine residues within the (GK) repeats that have been shown to undergo acetylation 19,20 with glutamines (Fig. 3a), which cannot be acetylated but which block the interaction of USP7 and DNMT1 10. Expression constructs that directed production of wild type GK or GK→GQ DNMT1 were transfected into Dnmt1- null ES cells and stable clones established as described. Expression was driven by 14 kb of DNA 5’ of the Dnmt1 gene, which contains the Dnmt1 promoter and other regulatory sequences and has been shown to produce DNMT1 at wild type levels 21,22. As shown in Fig. 3a, the GK and GQ proteins were expressed at similar levels, even though the GQ substitution has been reported to be unable to bind to USP7 10 and according to previous reports would have been expected to be unstable. Fig. 3c also shows that the GK and GQ proteins were equally effective in rescuing global DNA methylation after expression in Dnmt1-null ES cells and that rescue of methylation of the IAP retrotransposon is equivalent when the GK and GQ variants of DNMT1 are expressed. The data of Figs. 2 and 3 indicate that the steady-state levels of DNMT1 are independent of USP7, that removal of USP7 does not affect global DNA methylation, and that substitution of the lysines in the (GK) repeats of DNMT1 that have been reported to be required for the DNMT1-USP7 interaction and stabilization of DNMT110 do not affect expression or activity of DNMT1.

Endogenous USP7 localizes to replication foci independently of DNMT1. It was reported that the recruitment of USP7 to replication foci is dependent on the (GK) repeats of DNMT1 10. We tested for recruitment of USP7 to replication foci in ES cells that were wild type for DNMT1 and in cells that completely lacked DNMT1. As shown in Fig. 3, USP7 colocalizes with the replication focus marker PCNA independently of DNMT1. These data show that USP7 is not recruited to replication foci by DNMT1 and are consistent with a recent report in which USP7 is required to maintain an enrichment

74 in SUMOylation and depletion of ubiquitylation at sites of DNA replication that is essential for DNA replication 15. DNMT1 is not required for DNA replication, as shown by normal DNA replication in Dnmt1-null ES cells.

DISCUSSION Several lines of evidence indicate the USP7 does not determine steady state levels of DNMT1. First, DNMT1 levels are independent of the presence of USP7, as shown in Fig. 2. Second, substitution of the GK repeats with GQ repeats, which inhibits the interaction of DNMT1 and USP7, does not affect steady state levels of DNMT1 (Fig. 3). Third, USP7 localizes to replication foci independently of DNMT1 (Fig 4). Fourth, modification of the endogenous Dnmt1 locus so as to delete the first 118 amino acids of DNMT1 caused the accumulation and persistence of truncated but fully active DNMT1 to ~ 5 times normal levels 23. This is the only genetically-defined region of DNMT1 that affects protein stability. However, this shortened and stabilized form of DNMT1 is produced only in growing oocytes and has not been detected in somatic cells 24.

The (GK) repeats are both highly basic and unstructured; they are therefore capable of adopting many conformations and would be expected to bind in vitro to many proteins that contain acidic pockets, as is the case for USP7. Overexpression of tagged or mutated forms of DNMT1 in cells that express endogenous but undetected DNMT1 at wild type levels may lead to oversaturation of DNMT1 binding sites in replication foci and lead to ectopic processing of the excess DNMT1 protein.

Despite the numerous reports of post-translational regulation of DNMT1 expression in an acetylation-dependent manner 7-11, strong evidence indicates that DNMT1 levels are normally controlled at the transcriptional rather than the post-translational level. Cells heterozygous for loss-of-function mutations at Dnmt1 express one-half the amount of DNMT1 protein when compared to wild type cells 25, and mice that contain additional copies of the Dnmt1 gene overexpress DNMT1 protein 26. DNMT1 is overexpressed in Friend Murine Erythroleukemia cells as a result of spontaneous amplification of the Dnmt1 gene in this cell line 27. DNMT1 is present in nuclei at constant levels throughout

75 the cell cycle and is recruited to replication foci during S phase 28. DNMT1 is down- 18 regulated in G0 cells, but this is likely to occur at the transcriptional level . Treatment of cells with drugs that induce entry into G0 phase will cause a reduction in DNMT1 levels due to cell cycle effects rather than a direct effect on DNMT1 stability 18. There is no direct evidence of a post-translational mechanism that compensates for reduced or increased DNMT1 transcript levels, and changes in Dnmt1 gene dosage result in proportionate increases or decreases in DNMT1 protein level. This is difficult to reconcile with the claims of post-translational homeostasis of DNMT1 expression. If there is post-translational regulation of DNMT1 protein levels, it must be small in effect compared to transcriptional regulation.

METHODS

Cell lines Stable ES cell lines were generated as described previously21 with the addition of an N- terminal Flag-HA tag. Point mutations were generated using QuikChange Site-Directed Mutagenesis kit (Agilent).

Usp7cl/+ mice 13 were intercrossed to generate homozygous conditional mutant embryos. MEFs were derived and then immortalized with SV40 large T antigen. MEFs were subsequently infected with Ad–GFP and Ad–Cre–GFP viruses which were purchased from Vector Biolabs (Catalogue numbers 1761 and 1710). Cultures that showed near complete infection with Ad-Cre-GFP virus as assessed by visualization of GFP expression were analyzed after five days.

To generate the inducible Usp7shRNA knockdown cells, H1299 cells were transfected with pTRIPZ encoding a Tet-inducible Usp7 short hairpin RNA purchased from Thermo Open Biosystems (clone ID: V2THS_172409). 48 hours later Puromycin (5g/mL) was added to the transfected cells for two weeks and positive clones were selected. To induce the shRNA, 5g/mL of doxycycline was added to the culture medium.

76 Immunoblotting Whole cell extracts were prepared by lysis in RIPA buffer (150mM NaCl, 1% NP-40, 0.5% Deoxycholic acid, 0.1% SDS, 50mM Tris pH 7.5) and briefly sonicated to disrupt genomic DNA, then heated to 100° in SDS and loaded onto SDS-PAGE gels. Proteins were transferred to nitrocellulose membrane and blocked in 5% milk, 0.1% Tween20, PBS for 1 hour at room temperature. Blots were incubated at 40C overnight with primary antibodies in 10% FBS, 0.1% Tween20. Blots were washed with PBST, after incubation with DNMT1 antibody blots were washed briefly with PBST and PBS. Antibodies used: rabbit polyclonal to DNMT1 (Cell Signaling Technology, 5032 (D63A6) 1:2,500 dilution), rabbit polyclonal to USP7 (Bethyl Laboratories, IHC00018; 1:5,000 dilution), rabbit monoclonal to Histone H2A monoubiquitylated at Lys119 (Cell Signaling Technology, 8240 (D27C4); 1:5,000 dilution), rabbit polyclonal to HA tag (Abcam, ab9110, 1:5000), rabbit polyclonal to UHRF1 (Bethyl Laboratories, A301-470A, 1:1000), mouse monoclonal to alpha-tubulin (Abcam, ab7291, 1:10,000).

Methylation analysis Genomic DNA was digested for two rounds with methylation-sensitive enzyme HpaII, its isoschizomer MspI as a control, or McrBC. DNA was quantified and ran on 0.8% agarose gel, which was stained with ethidium bromide. Southern blot analysis was performed with IAP probe generated by PCR. Primers (matching to LTR+GAG) used for probe amplification: probe_IAP_F (GGTAAACAAATAATCTGCGC); probe_IAP_R (CTGGTAATGGGCTGCTTCTTCC). Agarose gels were transferred to a Nytran SPC membrane (GE Healthcare) overnight in 10X SSPE buffer. After crosslinking, membrane was prehybridized with 6X SSC, 5X Denhardts, 1% SDS, 10% Dextran Sulfate for 1hour at 45° and incubated overnight with IAP probe at 45°. Membranes were washed once with 2X SSC, 0.5% SDS; 2X with 1X SSC, 0.5% SDS, and 1X with 0.2X SSC, 0.5% SDS.

Immunofluorescence ES cells were cultured on glass slides. For the PCNA immunostaining, cells were treated with 0.5% Triton X-100 in CSK buffer (100 mM NaCl, 300 mM sucrose, 10 mM

77 PIPES [piperazine-N,N-bis(2-ethanesulfonic acid)], pH 6.8, 3 mM MgCl2, 1 mM EGTA) for 30 seconds at 4°C, and then treated with methanol for 20 min at -20°C 29.

Cells were permeabilized/blocked in Block solution (5% Donkey serum, 0.3% Triton, 1X PBS), then incubated overnight with primary antibody diluted in Block solution at 40C. The following primary antibodies were used: mouse monoclonal to PCNA (Abcam, ab29, 1:500 dilution), rabbit polyclonal to USP7 (BethylLaboratories, IHC00018; 1:100 dilution).

Cells were washed ten times with PBS and incubated for 1 hour at room temperature with the following secondary antibodies diluted in Block buffer: Cy3-conjugated donkey anti-mouse (Jackson lmmunoResearch; 1:200 dilution) and IgM Alexa 488–conjugated IgG donkey anti-rabbit (Jackson lmmunoResearch; 1:500 dilution). Slides were subsequently washed with PBS, counterstained with Hoechst 33342 (Invitrogen) and mounted with Vectorshield mounting medium (Vector Labs).

78

Fig.1. The (GK) repeats in DNMT1. a., Domain organization in murine DNMT1. NLS: nuclear localization sequence. RFTS: replication focus targeting sequence. CXXC: Zinc-containing domain that binds to unmethylated CpG sites in double stranded DNA. Autoinhibitory: An acidic linker interposed between DNA and the active site of DNMT1

79 when the CXXC domain is bound to unmethylated DNA. BAH1 and 2: Bromo-adjacent homology domains that are of unknown function in DNMT1 but are involved in binding to specific histone modifications in other proteins. (GK): The run of alternating glycine and lysine amino acids at the junction between the N-terminal regulatory and C-terminal catalytic domains of DNMT1. Methyltransferase domain: Catalytic domain related in sequence and structure to eukaryotic and prokaryotic DNA (cytosine-5) methyltransferases. b., Alignment of (GK) repeats of DNMT1 homologs from mammals, insects, and plants, and with the N-terminal tails of histones H2A and H4. GK dipeptides are outlined in red; the related AK dipeptides are outlined in blue. c., Structure of autoinhibited form of human DNMT1 in complex with unmethylated DNA to show spatial relationships of domains diagrammed in a. The protein shown was truncated just N- terminal of the CXXC domain prior to crystallization. The (GK) repeats were unstructured in all the crystallographic studies of DNMT1; they are shown here roughly to scale in an arbitrary position.

80

Fig 2. Elimination of USP7 does not affect steady state level of DNMT1 or global DNA methylation. a., MEFs that lack USP7 after Cre-mediated excision of a Floxed allele of Usp7 show normal levels of DNMT1 (lane 2) but no detectable USP7 protein. b., Normal DNA methylation in MEFs that lack USP7. McrBC cleaves methylated DNA; HpaII cleaves unmethylated DNA at CCGG sites; MspI cleaves CCGG sites whether methylated or unmethylated. DNA from wild type and Usp7–/– MEFs show similar patterns of resistance to both McrBC and HpaII. c., Removal of USP7 by an inducible shRNA against USP7 from H1299 human lung carcinoma cells does not affect levels of DNMT1, UHRF1, or H2AK119ub1. d., DNA methylation is not measurably affected by removal of USP7 as assessed by resistance of DNA to HpaII.

81

Fig. 3. Substitution of lysines by glutamines within (GK) repeats does not affect DNMT1 stability or function. a., Positions of GK→GQ substitutions within the (GK) repeats of DNMT1.

The substituted lysines correspond to those reported to be acetylated in vivo. b., Normal stability of

GK→GQ DNMT1 in stably transfected Dnmt1-null ES cells. c., Normal methylation of genomic DNA as measured by resistance to HpaII in stably transfected ES cells at left; at right, equal methylation of IAP retrotransposon DNA by wild type and GK→GQ DNMT1. Three independent GK→GQ clones were analyzed in a-c.

82

Fig. 4. USP7 is recruited to replication foci in the absence of DNMT1. Replication foci where identified by staining for the replication factor PCNA. Colocalization of USP7 and PCNA is clearly apparent in both wild type (top three rows) and Dnmt1-null (bottom three rows) ES cell nuclei.

1. Goll, M. G. & Bestor, T. H. Eukaryotic cytosine methyltransferases. Annu. Rev. Biochem. 74, 481–514 (2005). 2. Song, J., Rechkoblit, O., Bestor, T. H. & Patel, D. J. Structure of DNMT1-DNA complex reveals a role for autoinhibition in maintenance DNA methylation. Science 331, 1036–1040 (2011). 3. Bestor, T., Laudano, A., Mattaliano, R. & Ingram, V. Cloning and sequencing of a cDNA encoding DNA methyltransferase of mouse cells. The carboxyl-terminal domain of the mammalian enzymes is related to bacterial restriction

83 methyltransferases. J. Mol. Biol. 203, 971–983 (1988). 4. Song, J., Teplova, M., Ishibe-Murakami, S. & Patel, D. J. Structure-based mechanistic insights into DNMT1-mediated maintenance DNA methylation. Science 335, 709–712 (2012). 5. Takeshita, K. et al. Structural insight into maintenance methylation by mouse DNA methyltransferase 1 (Dnmt1). Proc. Natl. Acad. Sci. U.S.A. 108, 9055–9059 (2011). 6. Zhang, Z.-M. et al. Crystal Structure of Human DNA Methyltransferase 1. J. Mol. Biol. 427, 2520–2531 (2015). 7. Du, Z. et al. DNMT1 stability is regulated by proteins coordinating deubiquitination and acetylation-driven ubiquitination. Sci Signal 3, ra80–ra80 (2010). 8. Felle, M. et al. The USP7/Dnmt1 complex stimulates the DNA methylation activity of Dnmt1 and regulates the stability of UHRF1. Nucleic Acids Res. 39, 8355–8365 (2011). 9. Bronner, C. Control of DNMT1 abundance in epigenetic inheritance by acetylation, ubiquitylation, and the histone code. Sci Signal 4, pe3–pe3 (2011). 10. Cheng, J. et al. Crystal Structure of human DNMT1 and USP7/HAUSP complex. (2015). doi:10.2210/pdb4yoc/pdb 11. Qin, W., Leonhardt, H. & Spada, F. Usp7 and Uhrf1 control ubiquitination and stability of the maintenance DNA methyltransferase Dnmt1. J. Cell. Biochem. 112, 439–444 (2011). 12. Kon, N. et al. Inactivation of HAUSP in vivo modulates p53 function. Oncogene 29, 1270–1279 (2010). 13. Kon, N. et al. Roles of HAUSP-mediated p53 regulation in central nervous system development. Cell Death Differ. 18, 1366–1375 (2011). 14. Brooks, C. L., Li, M., Hu, M., Shi, Y. & Gu, W. The p53--Mdm2--HAUSP complex is involved in p53 stabilization by HAUSP. Oncogene 26, 7262–7266 (2007). 15. Lecona, E. et al. USP7 is a SUMO deubiquitinase essential for DNA replication. Nature Structural & Molecular Biology (2016). doi:10.1038/nsmb.3185 16. Zhou, Q., Agoston, A. T., Atadja, P., Nelson, W. G. & Davidson, N. E. Inhibition of histone deacetylases promotes ubiquitin-dependent proteasomal degradation of DNA methyltransferase 1 in human breast cancer cells. Mol. Cancer Res. 6, 873– 883 (2008). 17. Estève, P.-O. et al. A methylation and phosphorylation switch between an adjacent lysine and serine determines human DNMT1 stability. Nature Structural & Molecular Biology 18, 42–48 (2011). 18. Kimura, H., Nakamura, T., Ogawa, T., Tanaka, S. & Shiota, K. Transcription of mouse DNA methyltransferase 1 (Dnmt1) is regulated by both E2F-Rb-HDAC- dependent and -independent pathways. Nucleic Acids Res. 31, 3101–3113 (2003). 19. Choudhary, C. et al. Lysine acetylation targets protein complexes and co- regulates major cellular functions. Science 325, 834–840 (2009). 20. Kim, S. C. et al. Substrate and Functional Diversity of Lysine Acetylation Revealed by a Proteomics Survey. Mol. Cell 23, 607–618 (2006). 21. Damelin, M. & Bestor, T. H. Biological functions of DNA methyltransferase 1 require its methyltransferase activity. Mol. Cell. Biol. 27, 3891–3899 (2007).

84 22. Tucker, K. L., Talbot, D., Lee, M. A., Leonhardt, H. & Jaenisch, R. Complementation of methylation deficiency in embryonic stem cells by a DNA methyltransferase minigene. Proceedings of the National Academy of Sciences 93, 12920–12925 (1996). 23. Ding, F. & Chaillet, J. R. In vivo stabilization of the Dnmt1 (cytosine-5)- methyltransferase protein. Proceedings of the National Academy of Sciences 99, 14861–14866 (2002). 24. Mertineit, C. et al. Sex-specific exons control DNA methyltransferase in mammalian germ cells. Development 125, 889–897 (1998). 25. Li, E., Bestor, T. H. & Jaenisch, R. Targeted mutation of the DNA methyltransferase gene results in embryonic lethality. Cell 69, 915–926 (1992). 26. Biniszkiewicz, D. et al. Dnmt1 overexpression causes genomic hypermethylation, loss of imprinting, and embryonic lethality. Mol. Cell. Biol. 22, 2124–2135 (2002). 27. Bestor, T. & Ingram, V. M. The DNA-methylating system of murine erythroleukemia cells. Prog. Clin. Biol. Res. 198, 95–104 (1985). 28. Leonhardt, H., Page, A. W., Weier, H. U. & Bestor, T. H. A targeting sequence directs DNA methyltransferase to sites of DNA replication in mammalian nuclei. Cell 71, 865–873 (1992). 29. Takebayashi, S.-I., Tamura, T., Matsuoka, C. & Okano, M. Major and essential role for the DNA methylation mark in mouse embryogenesis and stable association of DNMT1 with newly replicated regions. Mol. Cell. Biol. 27, 8243– 8258 (2007).

85 III.3. Interpretation

The results described in the manuscript demonstrate that USP7 does not control levels of

DNMT1 and plays no regulatory role in maintenance DNA methylation. In addition, USP7 localization to replication foci is shown to be independent of DNMT1. Our data therefore contradicts the established model in the field.

The detected in vitro binding (Du et al. 2010) and crystal structure (Cheng et al. 2015) of

DNMT1 and USP7 can be explained by the observation that the (GK) repeats are highly basic and unstructured; as a result they could possibly adopt many conformations and bind in vitro to proteins with acidic pockets. Moreover, the crystal structure of DNMT1 and USP7 demonstrates that the two proteins bind via two interfaces: interface-1 involving the interaction of the (GK) repeats with the acidic groove on the surface of UBL1-2 domains of USP7; interface-2 involving the TRD of DNMT1 and UBL3 of USP7 (Cheng et al. 2015), suggesting that DNMT1/USP7 interaction does not exclusively depend on the (GK) repeats. Furthermore, only three lysines

(K1109, K1111 and K1115) of the (GK) repeats participate in the interaction with USP7 (Cheng et al. 2015), while the rest of the (GK) motif is disordered, suggesting that USP7 binding with

DNMT1 can take place via different modes of binding. Further mutagenesis of residues involved in the two interfaces is needed to understand the functional importance of these binding modes in vivo.

An alternative explanation for the interaction of DNMT1 and USP7 could be the requirement of DNMT1 for USP7 function at replication foci. A recent report demonstrated that USP7 is, in fact, required to maintain a SUMO-rich and Ub-poor chromatin environment at replisomes in mammalian cells through the process of deubiquitination of SUMO2/3 (Lecona et al. 2016). We

86 tested whether DNMT1 is required for localization of USP7 to replisomes by performing an immunofluorescence experiment, and showed that USP7 localization is DNMT1-independent.

Little is known about the nature of DNMT1 ubiquitination and proteasome degradation. Two lysines (K586 in RFTS domain and K997 in BAH2) were recently identified as possible sites of ubiquitination, and deubiquitination by USP7 (Lecona et al. 2016). It is not known which E3 ubiquitin ligase ubiquitnates these residues. UHRF1 has been suggested as a possible candidate

(Qin et al. 2011; Du et al. 2010) however protein levels of DNMT1 do not change in UHRF1- null cells (Sharif et al. 2007; Bostick et al. 2007) demonstrating that UHRF1 does not participate in targeting DNMT1 for proteosomal degradation. In addition, it remains uncertain whether ubiquitination of lysines 586 and 997, and deubiquitynation by USP7 involves mono- or polyubiquitination. Only certain types of polyubiquitynation (K48- and K11-linked) lead to proteasomal degradation of proteins, while monoubiquitynation and other forms of polyubiquitination (K63-linked) perform other regulatory functions such as kinase activation, signal transduction, endocytosis and others (Sadowski et al. 2012). While many of the details bear further investigation, it seems clear that in general USP7 is not involved in regulation of

DNMT1 stability, nor in maintenance DNA methylation.

a. Manuscript contributions

Experiments were designed and analyzed by O.Y. and T.H.B. All experiments were performed by O.Y. Usp7–null MEF and H1299 cells were provided by O.T. and W.G.

87 CHAPTER IV. THE ROLE OF BAH DOMAINS IN REGULATING DNMT1

LOCALIZATION AND FUNCTION.

IV.1. Rationale and Summary

DNMT1 contains two tandem Bromo-Adjacent-Homology (BAH) that have a dumbbell- shaped configuration in the crystal structure (Song et al. 2011; Song et al. 2012; Takeshita et al.

2011). Their function or role in DNA methylation is unknown.

Both BAH domains are positioned on the surface of the DNMT1 molecule remote from the bound DNA at the active site, although they are each connected to the rest of the molecule via long linkers: BAH1 connects to the CXXC domain via the autoinhibitory linker (Song et al.

2011), and BAH2 contains a long protruding loop that makes interactions with the TRD (Song et al. 2012). While the crystal structure suggests that the BAH domains could not have a direct effect on the catalytic site of DNMT1, they might possess regulatory functions via interactions with other proteins given their large, exposed surface area.

In addition to DNMT1, BAH domains are found in proteins involved in chromatin remodeling such as ASH1 (also known as KMT2H), MTA1, MTA2, RSC1, RSC2, Sir3 and

ORC1 (Yang & R.-M. Xu 2013). Recently, it has been shown that BAH domains in Sir3 and

ORC1 proteins are histone-binding modules (Armache et al., 2011; Kuo et al., 2012) that recognize specific histone modifications. ORC1 selectively binds to histone H4K20me2 (Kuo et al. 2012), while Sir3 makes several contacts with nucleosome containing unmodified H4K16 and

H3K79 (Armache et al. 2011). Therefore, we hypothesized that the BAH domains of DNMT1 might regulate targeting of DNMT1 based on the interaction with specific chromatin modifications, and we set out to test this using genetic, biochemical and structural approaches,

88 specifically CRISPR/Cas9 to delete BAH domains, genetic rescue with point mutations in the

BAH1, in vitro screens to identify interaction partners and structural analysis.

89 IV. 2. Materials and Methods a. Cell lines and cell culture

Mouse ES cells were cultured on gelatin following standard techniques. Stable ES cell lines were generated by nucleofection of Dnmt1-null ES cells (Lei et al. 1996) with MT80 minigene (Figure

2A) and pGKPuro plasmid for Puromycin resistance. MT80 minigene, carrying 12kb of 5’

Dnmt1 genomic sequence with endogenous promoter and 5.5kb of Dnmt1 cDNA (Tucker,

Talbot, et al. 1996; Damelin & Bestor 2007), was modified by the addition of the N-terminal

Flag-HA tag after the translation start site. Point mutations were produced using QuikChange

Site-Directed Mutagenesis kit (Agilent). Individual clones were selected with Puromycin for 10-

14 days and picked into 96-well plates. Clones were genotyped using primers specific to the

Flag-HA tag . Positive clones were propagated and levels of DNMT1 expression were tested by

Western blotting. Clones expressing DNMT1 at wild-type levels were used for further studies.

b. CRISPR mutagenesis

Broad Institute software was used to design gRNAs targeting intron 23-24

(Forward oligo: CACCGATAAAAACCAAGCCTGATGG; Reverse oligo:

AAACCCATCAGGCTTGGTTTTTATC) and intron 29-30

(Forward oligo: CACCGCAGACGCTAGGCTAATCCT; Reverse oligo:

AAACAGGATTAGCCTAGCGTCTGC) of endogenous Dnmt1. gRNAs were cloned into pX330-PURO plasmid and nucleofected into wild-type ES cells. Transient selection was applied for 24 hours, and surviving clones were picked 10-14 days after nucleofection. Remaining clones were pulled for genomic DNA and tested with T7 endonuclease to check the percentage of cuts in the population. Individual clones were genotyped using primers flanking the deletion site (F:

90 CAGGTAGCCCATCCGCTTG; R: AATTCCTAGCACCCACACGG). Homozygous and heterozygous clones were distinguished by using one primer inside the deleted site and one primer outside.

c. Immunoblotting

Whole cell extracts were prepared by lysis in RIPA buffer (150mM NaCl, 1% NP-40, 0.5%

Deoxycholic acid, 0.1% SDS, 50mM Tris pH 7.5) and briefly sonicated to disrupt genomic

DNA, then heated to 100° in SDS and loaded onto SDS-PAGE gels. Proteins were transferred to nitrocellulose membrane and blocked in 5% milk, 0.1% Tween20, PBS for 1 hour at room temperature. Blots were incubated at 40C overnight with primary antibodies in 10% FBS, 0.1%

Tween20. Blots were washed with PBST, after incubation with DNMT1 antibody blots were washed briefly with PBST and PBS. Antibodies used: rabbit polyclonal to DNMT1 (Cell

Signaling Technology, 5032 (D63A6) 1:2,500 dilution), rabbit polyclonal to USP7 (Bethyl

Laboratories, IHC00018; 1:5,000 dilution), rabbit monoclonal to Histone H2A monoubiquitylated at Lys119 (Cell Signaling Technology, 8240 (D27C4); 1:5,000 dilution), rabbit polyclonal to HA tag (Abcam, ab9110, 1:5000), rabbit polyclonal to UHRF1 (Bethyl

Laboratories, A301-470A, 1:1000), mouse monoclonal to alpha-tubulin (Abcam, ab7291,

1:10,000).

d. Methylation analysis

Genomic DNA was digested for two rounds with methylation-sensitive enzyme HpaII

(recognition sequence CCGG), its isoschizomer MspI as a control, or McrBC. DNA was quantified and ran on 0.8% agarose gel, and stained with ethidium bromide.

91

e. Southern blots

Southern blot analysis was performed with IAP probe generated by PCR. Primers (matching to

LTR+GAG) used for probe amplification: probe_IAP_F (GGTAAACAAATAATCTGCGC); probe_IAP_R (CTGGTAATGGGCTGCTTCTTCC). Probes for minor and major satellites were prepared by EcoRI digestion from plasmids (gift of Mathieu Boulard). Agarose gels were transferred to a Nytran SPC membrane (GE Healthcare) overnight in 10X SSPE buffer. After crosslinking, membrane was prehybridized with 6X SSC, 5X Denhardts, 1% SDS, 10% Dextran

Sulfate for 1hour at 45° and incubated overnight with IAP probe at 45°. Membranes were washed once with 2X SSC, 0.5% SDS; 2X with 1X SSC, 0.5% SDS, and 1X with 0.2X SSC,

0.5% SDS.

f. Immunofluorescence

ES cells were cultured on glass slides. For the PCNA immunostaining, cells were treated with

0.5% Triton X-100 in CSK buffer (100 mM NaCl, 300 mM sucrose, 10 mM PIPES [piperazine-

N,N-bis(2-ethanesulfonic acid)], pH 6.8, 3 mM MgCl2, 1 mM EGTA) for 30 seconds at 4°C, and then treated with methanol for 20 min at -20°C (Takebayashi et al. 2007). Cells were permeabilized/blocked in Block solution (5% Donkey serum, 0.3% Triton, 1X PBS), then incubated overnight with primary antibody diluted in Block solution at 40C. The following primary antibodies were used: mouse monoclonal to PCNA (Abcam, ab29, 1:500 dilution), rabbit polyclonal to USP7 (BethylLaboratories, IHC00018; 1:100 dilution). Cells were washed ten times with PBS and incubated for 1 hour at room temperature with the following secondary antibodies diluted in Block buffer: Cy3-conjugated donkey anti-mouse (Jackson

92 lmmunoResearch; 1:200 dilution) and IgM Alexa 488–conjugated IgG donkey anti-rabbit

(Jackson lmmunoResearch; 1:500 dilution). Slides were subsequently washed with PBS, counterstained with Hoechst 33342 (Invitrogen) and mounted with Vectorshield mounting medium (Vector Labs). g. Protein purification pRSF-Dnmt1 plasmid (with 6xHis tag) for expression of recombinant truncated DNMT1 with

BAH1, BAH2 and catalytic domains was a kind gift of Dr. Jikui Song. pRSF-DNMT1 was modified using QuikChange Site-Directed Mutagenesis kit (Agilent) to introduce STOP codons after BAH2 domain (for generation of BAH1-BAH2 recombinant DNMT1) and after BAH1 domain (for generation of BAH1 recombinant protein). Dnmt1 sequence was also subcloned into pGEX-6P-1 GST expression vector. Recombinant proteins were expressed in BL21(DE3)RIL E. coli cells and grown at 37C until OD600 reached 0.6. Temperature was then changed to 18C and cells were induced with 0.4mM IPTG overnight. Recombinant proteins were purified using a modified protocol from (Song et al. 2011). One step purification was performed using Ni-NTA beads (ThermoScientific, 88221). Purified proteins were dialized overnight in 20mM Tris-HCl,

7.5; 200mM NaCl and 5mM DTT, and stored at -80C.

h. Histone peptide array

Histone peptide arrays were purchased from ActiveMotif. Arrays were briefly blocked in 5% milk and incubated with fresh recombinant proteins for 1 hour at 4C. Binding was detected using antibodies against 6xHis tag or GST tags following a standard Western blotting as described above.

93 i. In vitro pull-down of ES cell extracts and recombinant proteins

Nuclear extracts from Dnmt1c/c ES cells were prepared using detergent-based buffer for lysis and

Dounce homogenizer, and subsequently digested with MNase to acquire mononucleosomes. At the same time purified recombinant proteins were bound to magnetic nickel beads, and then incubated with the nuclear extract overnight at 4oC. Proteins were eluted using acidic elution with Glycine. Eluted proteins were resolved on SDS-PAGE gel. Two bands were cut out and sent for mass spectrometry analysis at Harvard University.

j. Structural analysis

BAH1 and BAH2 structures were analyzed using Cn3D and Pymole programs. 3D sequence analysis to identify binding pockets was performed using VAST online software (Gibrat et al.

1996).

94 IV.3. Results a. Deletion of BAH domains results in stable DNMT1 protein

To study the function of BAH domains in mouse ESCs we utilized CRISPR/Cas9 technology to create a deletion of the exons 24-29, encoding the BAH1 and BAH2 domains, in the endogenous locus of Dnmt1 (Figure 4A(a)). The deletion of exons 24-29 does not alter the frame of the RNA messenger produced by splicing of exon 23 to exon 30. Cas9 was targeted to make cuts in the introns just upstream of exon 24 and downstream of exon 29. 22.5% of genotyped clones were homozygous for the desired deletion. Two homozygous clones were selected for analysis and one clone with unaltered wild-type alleles served as a control.

In-frame deletion of BAH domains had no effect on DNMT1 stability in vivo. Fig. X shows that DNMT1ΔBAH was produced at levels similar to wild type with the expected size of

150 kDa. No degradation products were detected, arguing that DNMT1ΔBAH is stable.

Importantly, antibody recognizing an epitope within the BAH2 domain failed to detect any protein for either clone (Figure 4A(b)). Our CRISPR/Cas9 strategy yielded production of wild type levels of DNMT1 protein lacking BAH domains from the endogenous locus.

b. BAH domains are required for global maintenance of DNA methylation

Global methylation levels of DNMT1ΔBAH clones were determined by digestion of genomic

DNA with the methyl-sensitive restriction enzyme HpaII, which is inhibited by methylation at

CpG sites. Figure 4B(a) shows that both DNMT1ΔBAH clones exhibit genome-wide demethylation similar to Dnmt1-/- ESCs. Hence, we can conclude that BAH domains of DNMT1 are required for global methylation in ESCs.

95 c. BAH domains are required for methylation at different genomic compartments to variable degrees

To examine the methylation status of DNMT1ΔBAH clones at different genomic sequences,

Southern blot analysis was performed with probes against IAP retrotransposons, minor and major satellite repeats. To analyze methylation at major satellites genomic DNA was digested with MaeII (HpyCH4IV) restriction enzyme, since there are no HpaII sites within the major satellite sequence. Figure 4B shows that both IAPs and major satellites lose methylation upon deletion of the BAH domains, while minor satellites retain partial methylation. These data indicate that DNMT1’s BAH domains are required for maintenance methylation at retrotransposons and major satellites, while they are only partially required for methylation at minor satellite sequences.

96

FIGURE 4A. Deletion of BAH domains at the endogenous locus yields stable protein at wild- type levels. a., Endogenous locus of Dnmt1 was targeting using CRISPR/Cas9 resulting in complete deletion of BAH domains, while allowing in frame splicing. b., Two homozygous clones for BAH deletion produced wild-type levels of truncated protein that did not exhibit degradation.

97

FIGURE 4B. ES cells with BAH deletion exhibit genome-wide demethylation. a., Genomic DNA was digested with methyl-sensitive enzyme HpaII and resolved on an agarose gel (left), subsequently probed with IAP (middle) and minor satellite (right) probes. b., For major satellite Southern blot, DNA was digested with MaeII. IAP and major satellites exhibit complete demethylation, while minor satellites retain partial methylation.

98

FIGURE 4C. DNMT1BAH clones exhibit mislocalization of the protein during S phase. a., Replication foci where identified by staining for the replication factor PCNA. Wild-type cells show colocalization of full-length DNMT1 and PCNA. b., Quantification of cells with DNMT1 foci during S phase.

99

FIGURE 4D. Figure 4D. Identification of BAH1 and BAH2 binding pockets. Protein sequence of DNMT1’s BAH domains was aligned with BAH of ORC1 and Sir3 using a 3D sequence alignment tool VAST. Residues that constitute binding pockets are outlined. Predicted BAH1 and BAH2 binding residues are in red. a., 3D structural alignment; b., sequence alignment based on structure.

100

FIGURE 4E. BAH domains bind to methylated H3K27 and acetylated H2B peptides in vitro. a., Truncated DNMT1 with BAH1, BAH2 and methyltransferase domains was purified. b., Recombinant protein was incubated with a histone peptide array (ActiveMotif). Binding was detected with antibodies against the His-tag.

101

FIGURE 4F. BAH1 and BAH2 bind to methylated H3K27 and acetylated H2B peptides in vitro. a., Truncated DNMT1 with BAH1 and BAH2 was purified. b., Recombinant protein was incubated with a histone peptide array (ActiveMotif). Binding was detected with antibodies against the His-tag.

102

FIGURE 4G. BAH1 alone can bind to methylated H3K27 and acetylated H2B peptides in vitro. a., BAH1 tagged with a GST-tag was purified. b., BAH1 was incubated with a histone peptide array (ActiveMotif). Binding was detected with antibodies against the GST-tag.

FIGURE 4H. Identification of BAH1 potential binding partners in vitro. Purified recombinant BAH1 domain or BAH1Y778A were incubated with the nuclear extract from Dnmtc/c ES cells. Eluted proteins were resolved on an acrylamide gel. Bands present in the BAH1 lane and absent in the BAH1Y778A lane were excised and sent for mass spectrometry analysis.

103

FIGURE 4I. Point mutation in BAH1 is not sufficient to disrupt its function. a., BAH1 point mutation does not effect DNMT1 stability. Stable clones expressing wild-type levels of DNMT1 were further analyzed. b., Position of Y778A substitution within the BAH1. c., Normal methylation of genomic DNA as measured by resistance to methyl-sensitive restriction enzyme HpaII; Southern blot analysis with probes for IAP retrotransposon and Minor Satellites revealed normal methylation.

104

FIGURE 4J. Structural model of DNMT1 recruitment to nucleosomes via its BAH domains. BAH domains of DNMT1 might contribute to proper localization during S phase by recognition of specific histone marks on nucleosomes. Modeling was done using Pymol by alignment of solved crystal structures of DNMT1 (3PT9) and Sir3 with nucleosome (3TU4).

105 d. Deletion of BAH domains prevents targeting of DNMT1 to replication foci during S phase

To investigate subcellular localization of DNMT1ΔBAH, immunofluorescence studies were performed with antibodies against PCNA to mark replication foci and cells in S phase as well as against DNMT1, using an epitope within the methyltransferase domain. In wild-type cells

DNMT1, is targeted to replication foci during S phase (Leonhardt et al. 1992), so that co- localization pattern between PCNA and DNMT1 was observed in wild type control cells.

Strikingly, both DNMT1ΔBAH clones exhibited loss of DNMT1 at replication foci during S phase, displaying instead a diffuse signal across the nucleus similar to that seen in non-S phase cells

(Figure 4C). This phenotype was observed for all mutant S phase cells demonstrating that the

BAH domains are necessary for targeting of DNMT1 to replication foci during S phase. Thus, the global loss of methylation phenotype seen in these mutants originates from the failed targeting of DNMT1ΔBAH to the replication foci.

e. BAH1 domain is predicted to be a histone-binding domain

Since ORC1 and Sir3 were shown to be histone-binding domains(Sampath et al. 2009;

Armache et al. 2011; Kuo et al. 2012) we hypothesized that either one or both of the BAH domains of DNMT1 might also bind to specific histone modifications. To identify the potential histonebinding pockets of BAH1 and BAH2, 3D sequence alignments were performed with

BAH domains of ORC1 and Sir3 (Gibrat et al. 1996). Individual residues of BAH1 and BAH2 binding pockets were identified based on their spatial alignment against the residues from the known histone-binding pockets of ORC1 and Sir3 (Figure 4D). 3D sequence alignment revealed that BAH1’s pocket is composed of tyrosine 778, tryptophan 798, aspartic acid 804, glutamic acid 821 and valine 819. BAH2’s pocket is composed of arginine 995, lysine 1021, tyrosine

106 1023, asparagine 1027 and tyrosine 1023. The BAH1 binding pocket thus shows close conservation with ORC1’s, implicating it as a potential methyllysine-recognizing domain. In contrast, the BAH2 pocket is quite divergent, acquiring positively charged arginine and lysine residues instead of the conserved tyrosine and tryptophan, leading to the conclusion that the

BAH2 domain might have another function than histone binding.

f. BAH1 domain recognizes specific modifications on histone tails in vitro

To identify if the BAH1 domain binds to histone tails an unbiased in vitro screen was performed using a commercial histone peptide array with 384 combinations of histone modifications spotted onto a cellulose membrane. To this end the following truncated recombinant proteins were purified from E. coli: BAH1 domain, BAH1-BAH2 tandem domains and BAH1-BAH2-MTase (Figure 4E(a); 4F, 4G(b)). Purified truncated proteins were incubated with the histone peptide arrays and binding was detected with antibodies against the tag epitope on the recombinant protein. Both BAH domains recognized histone H3 mono-, di- and trimethylated at lysine 27 and histone H4 acetylated at lysines 12 and 15 (Figure 4E(b); 4F).

Furthermore, the BAH1 domain alone could recognize all of these modifications (Figure 4G(a), demonstrating that histone-binding ability is confided mostly to the BAH1 domain. Importantly, this experiment was performed with BAH1 tagged with either His-tag or a GST-tag and yielded the same pattern of recognition, ruling out any artifactual binding due to tags or antibodies.

Interestingly, it was observed that BAH1’s interaction with H3 lysine 27 was blocked by phosphorylation of serine 28 (Figure 4E(b); 4F; 4G(a).

To further confirm histone targets of BAH1, fluorescent polarization (FP) was performed with purified BAH1 and fluorescently-tagged tail of histone H3 trimethylated at lysine 27. FP

107 failed to detect an interaction between BAH1 and H3K27me3. Other modifications identified on the array were not tested due to the costly nature of this technique.

g. BAH1 domain binds to chromatin factors in vitro

To identify potential non-histone interaction partners of the BAH1 domain, a pull-down experiment was performed in which purified wild-type BAH1 domain or mutated BAH1, with substitution of a conserved tyrosine 778 to alanine (BAH1Y778A), were incubated with nuclear extract from Dnmt1-/- ESCs. After the pull-down, proteins were resolved on an acrylamide gel; bands present in the BAH1 lane and absent in the BAH1Y778A lane were excised and sent for mass spectrometry analysis (Figure 4H). The following proteins were identified as potential interaction partners of BAH1: histone H1 and a number of factors involved in chromatin remodeling, namely RUVBL1, RUVBL2, GATAD2A, RBBP4 (also known as chromatin assembly factor 1 subunit C and RBAP48) and SMARCE1 (Figure 4H). Histone H1 has previously been reported to be involved in the maintenance of methylation at imprinted control regions in mammals (Fan et al. 2005) and plants (Rea et al. 2012). It should be noted that histone

H1 was not present on the histone peptide array and that core histones were not detectable in our assay because proteins with molecular weight below 25kDa were not analyzed by mass spectrometry. The putative interacting partners identified in this pull-down experiment have to be further validated.

h. Point mutation in BAH1 binding cage is not sufficient for disruption of BAH1 function

To understand more precisely the mechanism of BAH1 function we utilized a genetic rescue system as described in Chapter 2. Expression of Dnmt1 was driven by 12kb of DNA 5’ of the

108 Dnmt1 gene, which contains the endogenous Dnmt1 promoter and other regulatory sequences necessary for expression of DNMT1 at wild-type levels (Tucker, Talbot, et al. 1996; Damelin &

Bestor 2007). A point mutation in the BAH1 domain was introduced that lead to the production of DNMT1 with substitution of conserved tyrosine 778 to alanine (Y778A) (Figure 4I(a)). A construct carrying a wild-type Dnmt1 gene was introduced as a control. As shown in Figure

4I(b), DNMT1WT and DNMT1Y778A were expressed at levels similar to that of endogenous

DNMT1 in wild type cells, suggesting that Y778A substitution does not affect the stability of

DNMT1.

Next, we analyzed the methylation status of cells expressing DNMT1WT and DNMT1Y778A by digestion of genomic DNA with the methyl-sensitive restriction enzyme HpaII. We observed that

DNMT1Y778A rescues global methylation levels to wild-type levels similarly to DNMT1WT

(Fig. 4I(c)). Further Southern blot analysis revealed that DNMT1Y778A is able to normally methylate IAP retrotransposons and minor satellite repeat sequences (Fig. 4I(c)), indicating that

BAH1 point mutation does not compromise the maintenance methyltransferase activity of

DNMT1. We cannot exclude the possibility that this single substitution did not sufficiently disrupt BAH1 binding pocket; simultaneous substitution of several or all binding pocket residues might be necessary to create loss-of-function mutation of the BAH1 domain.

109 IV.4. Interpretation

The data presented in this chapter demonstrate that the BAH domains of DNMT1 are required for its targeting to replication foci during S phase, and this targeting possibly involves interaction with histones (Figure 4J). Expression of DNMT1 with deleted BAH domains in mouse ES cells leads to mislocalization of the protein and, consequently leads to global genome demethylation. In vitro studies using histone peptide arrays, structural analysis and pull-down experiments identified BAH1 as a histone-binding domain. Expression of DNMT1 harboring a point mutation in BAH1 domain exhibited normal methylation suggesting that the entire binding pocket should be substituted for future BAH1 loss-of-function studies.

a. Proper targeting of DNMT1 is required for maintenance methylation

During S phase DNMT1 localizes to the replication foci to carry out methylation of the newly replicated DNA (Leonhardt et al. 1992). This localization depends on RFTS domain

(Leonhardt et al. 1992), presence of methylated cytosines (Damelin & Bestor 2007; Takebayashi et al. 2007), and ubiquitin ligase UHRF1, which binds to RFTS of DNMT1 and selectively recognizes hemimethylated CpG sites via its SRA domain(Bostick et al. 2007; Sharif et al.

2007). It was thought that UHRF1’s recognition of histone H3 lysine 9 methylation is required for maintenance DNA methylation, but a UHRF1 mutant incapable of binding to histones has only a slight effect on DNA methylation levels (Zhao et al. 2016) showing that recognition of hemimethylated DNA by UHRF1 is necessary and sufficient for proper targeting and methylation by DNMT1.

Surprisingly, we found that the BAH domains are also required for proper targeting of

DNMT1 during S phase and that presence of a functional RFTS domain cannot compensate for

110 the loss of this BAH-associated function. Our data suggests that both the RFTS and BAH domains are necessary and sufficient to target DNMT1 to replication foci.

We suggest that BAH domains target DNMT1 to replisomes by interaction with specific histone modifications (Figure 4J). We identified the following possible interaction partners of the

BAH1 domain: methylated lysine 27 of histone H3 (H3K27me), acetylated histone H2B

(H2Bac), and linker histone H1.

H3K27 trimethylation is catalyzed by EZH2, a component of Polycomb Repressive

Complex 2 (PRC2) (Simon & Kingston 2009). Recent genome-wide analysis using sequential

ChIP-bisulfite-sequencing (ChIP-BS-seq) identified that DNA methylation and H3K27me3 co- occur in mammalian cells in general, but they are mutually exclusive in CpG-dense regions

(Brinkman et al. 2012). In addition, loss of DNA methylation in ES cells is associated with the establishment of broad domains of H3K27me3 at regions that were marked by high DNA methylation in wild-type cells (Brinkman et al. 2012). The link between DNA methylation and

H2B acetylation has been previously established in the filamentous fungus Neurospora crassa, where mutation of histone deacetylase hdac-1, which uses H2B as its substrate, results in DNA methylation defect (K. M. Smith et al. 2010). Analysis of triple H1-null ES cells demonstrated decreased methylation at some imprinted genes (H19, Igf2, Dlk1, Gtl2), leading to their misexpression, while global DNA methylation levels were normal (Fan et al. 2005). Therefore,

DNA methylation has an antagonistic relationship with H3K27me3 and H2B acetylation, suggesting that DNMT1 could be excluded from chromatin domains harboring these histone modifications via an inhibitory interaction of the BAH1 domain. On the other hand, linker histone H1 associated with DNA methylation at imprinted genes suggesting an activating interaction between DNMT1 and H1 at these genomic regions.

111 In accordance with previous work characterizing BAH domains in other proteins as a histone-recognizing modules (Kuo et al. 2012; Armache et al. 2011), our structural and sequence analysis suggests that the BAH1 domain of DNMT1 could function similarly, but BAH2 is unlikely to participate in such interactions due to non-conservative substitutions in its binding pocket and anchoring of the BAH2-TRD loop to the binding pocket.

We observed differential methylation of several distinct genomic compartments in BAH mutants. While IAP retrotransposons and major satellites lost all genomic methylation, minor satellite sequences retained partial methylation. Interestingly, major satellites are more abundant in the genome than minor satellites (Guenatri et al. 2004), so the observed difference in methylation is not due to the number of repeats or sensitivity of the probe. Major and minor satellites both possess trimethylated lysine 9 of histone H3 (Peters et al. 2001), though they have different sensitivities to micrococcal nuclease (Guenatri et al. 2004). In addition, DNMT3B is required for de novo methylation of minor satellite sequences (Okano et al. 1999). Importantly, we see that DNMT1 with BAH deletion clearly possesses enzymatic activity in vivo, though, at present, it is not clear why this truncated version of the enzyme has a preference for minor satellites.

b. Future directions

In order to further dissect the mechanism of function of BAH domains, I propose the following experiments:

1) To test loss-of-function of BAH1 domain specifically, our rescue experiment shows

that it is necessary to mutate all of the residues that form the predicted histone-

112 binding cage. I propose to utilize the MT80 system to re-introduce Dnmt1 carrying

point mutations in all five residues of the binding cage.

2) To confirm BAH1 interaction partners from histone peptide array and pull-down

experiments, further biochemical confirmation and target validation is necessary. A

possible orthogonal approach to this proof is the use of the BioID system to identify

in vivo protein-protein interactions (Roux et al. 2012) between DNMT1 and its

targets. The caveat of this experiment is that BioID tagging is based on the proximity

of two proteins and doesn’t necessarily mean their direct interaction. Nevertheless,

comparison of targets from our biochemical experiments and BioID pull-down would

allow for validation of in vivo interactors.

113 CHAPTER V. GENERAL DISCUSSION

The aim of this thesis was to expand our knowledge about the maintenance DNA methyltransferase DNMT1, specifically with regard to its regulation by N-terminal domains of previously unknown function: the (GK) repeats and BAH domains.

Although recombinant DNMT1 protein was shown to efficiently methylate unmethylated

DNA substrate in vitro (Okano et al. 1998; Yoder et al. 1997), its de novo methyltransferase activity has never been observed in vivo. In Chapter 2, I presented evidence showing that mutation of the (GK) repeats leads to de novo methylation at paternally imprinted DMRs. In particular, I found that cells expressing DNMT1 with unacetylatable arginines substituted for all lysine residues within the (GK) repeats acquire ectopic gain of methylation at paternally imprinted genes, while maternal imprints do not similarly acquire abnormal methylation. These data indicate that DNMT1 is capable of establishing de novo methylation at paternal imprinting control regions in vivo, and its de novo methyltransferase activity is inhibited by acetylation of the (GK) repeats. The observed gain of methylation has a functional effect and results in the change of the transcription of paternally imprinted genes, such as H19.

It is striking that ectopic methylation occurs exclusively at paternal imprinted control regions. Genetic studies of other DNA methyltransferases have not been clear about the establishment of paternal imprints in the male germline (Kaneda et al. 2004; Kato et al. 2007;

Bourc'his & Bestor 2004). Based on these findings, we propose a new model in which DNMT1 with unacetylated (GK) repeats is the enzyme responsible for the establishment of de novo methylation patterns at paternally imprinted genes in prospermatogonia. The similarity between histone H4 and the (GK) repeats raises the possibility that the two proteins could be acetylated

114 and deacetylated by the same enzymes (HAT/HDAC). Although the acetylation status of

DNMT1 in germ cells is unknown, de novo methylation in prospermatogonia is pre-meiotic and precedes the wave of histone H4 acetylation that occurs in haploid spermatids (Meistrich et al.

1992).

The (GK) repeats of DNMT1 has previously been proposed to play a role in the stability of the protein through its interaction with the deubiquitinase USP7 (Cheng et al. 2015). Chapter

3 aimed to test by genetic means the prevalent model stating that binding of the (GK) repeats to

USP7 modulates DNMT1 levels during the cell cycle in an acetylation/ubiquitination-dependent manner (Du et al. 2010; Cheng et al. 2015). Contrary to the biochemical prediction, my genetic data demonstrated that USP7 deletion has no influence on DNMT1 protein levels and plays no regulatory role in maintenance DNA methylation. In addition, my data showed that USP7 localization to replication foci is independent of DNMT1. Therefore, the significance of the physical interaction between USP7 and DNMT1 needs to be reexamined.

The data presented in Chapter 4 demonstrate that BAH domains of DNMT1 are required for its targeting to replication foci during S phase, and this targeting possibly involves interaction with histones. Using CRISPR/Cas9 technology, I generated a mutant of Dnmt1 that encodes a truncated DNMT1 protein with deleted BAH domains in mouse ES cells. I found that the mutated protein is stable but mislocalized, which causes global genome demethylation. Hence, it can be concluded that the BAH domains target DNMT1 to the replication fork. Based on structural similarities with BAH domains of other proteins, it is likely that the BAH domains mediate protein-protein interactions. However the putative protein that binds to DNMT1’s BAH domains has not been identified. In vitro studies using a histone peptide array and recombinant

BAH domains, structural analysis and pull-down experiments identified BAH1 as a histone-

115 binding domain. Further investigations are necessary to verify the significance of these newly found interactions in vivo.

My doctoral work revealed that maintenance and de novo methylation activities are not as distinctly separated as previously thought. In fact, in vitro studies have shown that DNMT1 has greater affinity towards unmethylated DNA than de novo methyltransferases DNMT3A and

DNMT3B (Okano et al. 1998). To date, all of the studied domains of DNMT1 were shown to be important for its maintenance activity (Leonhardt et al. 1992; Song et al. 2011). Our data identifies the (GK) repeats as the first known motif of DNMT1 that is important for its de novo methyltransferase function in vivo.

Proper subcellular localization of DNMT1 during S phase is a crucial aspect of normal maintenance DNA methylation. It has been shown that the RFTS domain of DNMT1 is important for its targeting to replication foci (Leonhardt et al. 1992) and interaction with a co- factor UHRF1(Bostick et al. 2007; Sharif et al. 2007). Importantly, our study of the BAH domains identified a novel pathway for targeting DNMT1 to replication foci during S phase.

Loss of either RFTS- or BAH-related targeting leads to mislocalization of DNMT1, suggesting that these mechanisms are not redundant. The fact that normal RFTS targeting cannot compensate for the disruption of the BAH domains, and vice versa, suggests that these targeting pathways act in a coordinated fashion. New studies propose that RFTS-dependent targeting to

UHRF1 is dependent on the ubiquitination of histone H3 (H3K23 in Xenopus; H3K18 in mammals) (Nishiyama et al. 2013; Qin et al. 2015). It is possible that the histone-binding ability of BAH1 domain might be necessary for securing interactions with the nucleosome.

Unexpectedly, DNMT1 has been found in a screen for proteins that bind to nucleosomes containing ubiquitinated H2A (Kalb et al. 2014). The combination of RFTS and BAH1 targeting

116 would provide a dual mode of recognition to ensure proper maintenance methylation. Further study of the mechanism of BAH targeting of DNMT1 is necessary to understand this interplay.

The bulk of the mammalian genome is methylated, but cytosine methylation has a functional role only at certain sequences. Even though DNMT1 is the only known maintenance methyltransferase, it becomes increasingly clear that it possess different affinities and regulatory interactions at different genomic sequences. This thesis demonstrates that different domains of

DNMT1 are required for methylation of different genomic compartments. The (GK) repeats motif plays no role in maintenance DNA methylation, though it is involved in the regulation of de novo methylation of paternal imprints. On the other hand, BAH domains are necessary for methylation of IAP retrotransposons and major satellites, while they are only partially required for methylation of minor satellite sequences.

117 V.1. Further Perspectives

The successful resolution of the DNMT1 crystal structure provides a platform from which to understand the mechanism of maintenance DNA methylation, along with the opportunity to study the functions of individual domains of the DNMT1 molecule as opposed to the loss-of-function genetics. In this thesis, I employed a strategy of studying the (GK) repeats and BAH domains by creating point mutations and deletions of domains within the endogenous protein. This technical approach allowed me to uncouple maintenance DNA methylation from other possible functions of DNMT1, and, as a result, I identified a novel de novo methyltransferase function of DNMT1 at paternally imprinted genes.

Similar methodology could be applied to the study of other domains of DNMT1. For example, our understanding of CXXC function derives only from in vitro studies, where it has been shown that CXXC restricts the methylation activity of DNMT1 on unmethylated substrates in vitro (Song et al. 2011). It would be interesting to see how a point mutant of the CXXC domain behaves in vivo, and if DNMT1 is able to gain robust de novo activity by losing CXXC inhibition.

Similarly, such an approach could be applied to the study of point mutations in the RFTS domain that cause DNMT1-complex disorder (Sun et al. 2014; Klein et al. 2013; Klein et al.

2011; Baets et al. 2015). Better understanding of the functional consequences of DNMT1 mutations could provide a possible cure and major life improvement for many patients that develop this autosomal dominant disorder in adulthood. Rapid development and application of

CRISPR/Cas9 technology would be very helpful for carrying out precise genetic manipulations at the Dnmt1 locus and for the creation of a mouse model of the disease.

118 Dedicated studies of the mutations associated with the DNMT1-complex disorder would provide valuable insight into the enigmatic function of DNMT1 in post-mitotic neurons (Goto et al. 1994; Trasler et al. 1996), specifically in neuronal cytoplasm (Inano et al. 2000). Baets at el.,

2015 found that mutant DNMT1 proteins when overexpressed in HEK293 cells mislocalize and aggregate in the cytoplasm. This observation potentially provides a link between DNMT1- complex disorder and the aggregative proteinopathies seen in many neurodegenerative disorders

(Ross & Poirier 2004; Ciryam et al. 2013). The only other instance of cytoplasmic trafficking of

DNMT1 known to occur is in the oocyte and in the cleavage-stage embryo(Howell et al. 2001).

Our discovery of a novel pathway for the de novo methylation of paternal imprints by the unacetylated form of DNMT1 opens many questions in the field of imprinting and post- translational modification of the methylation machinery. First, our model suggests that only the unacetylated form of DNMT1 present in prospermatogonia is capable of establishing de novo methylation at paternal imprints. Even low amounts of protein would be sufficient to perform this function at only three paternal DMRs. Regulation of acetylation/deacetylation and the specific HAT/HDAC enzymes required for this process should be further investigated. This especially raises the question of a potentially aberrant acetylation/deacetylation switch for

DNMT1 function. It would be beneficial to investigate if other sequences, in addition, to paternally imprinted genes, could acquire ectopic methylation due to misregulation of DNMT1 acetylation. This would be especially applicable to the study of cancer, where global hypomethylation is observed together with focal gains of ectopic methylation (Bestor 2003). No mutation in Dnmt1 has been identified as causative in cancer to date, but a defect in DNMT1’s acetylation/deacetylation could result from a mutation in other factors, such as HATs/HDACs.

119 Our model also broadens the already profound sexual dimorphism that surrounds the establishment of imprinting. Previously, it was thought that DNMT3A is involved in establishment of both maternal and paternal imprints, even though data regarding paternal imprints was not clear (Kaneda et al. 2004; Kato et al. 2007). In contrast, we propose, on the basis of our data, that two different enzymes and two distinct mechanisms are involved in paternal and maternal imprinting. Further investigation of the specific mechanism for the establishment of paternal imprints by the unacetylated form of DNMT1 would provide answers to some long-standing questions in the field, such as the distinction of maternal versus paternal differentially methylated sequences, establishment of complex imprinting systems and the origin of imprinting sequences in the mammalian genome.

It is paradoxical that the maintenance methyltransferase DNMT1 does not possess an ability to recognize hemimethylated DNA directly and instead contains CXXC domain that recognizes unmethylated DNA. Mechanisms targeting DNMT1 to hemimethylated DNA and inhibiting it at unmethylated sequences have been discovered and culminate in DNMT1’s ability to replicate methylation patterns with great fidelity. However, these mechanisms do not completely prevent the enzyme’s innate affinity towards unmethylated DNA. Our discovery of de novo methyltransferase function of DNMT1 changes the paradigm and our understanding of the methylation machinery, where there was thought to be a strict division of roles in establishment versus maintenance methylation.

Alignment of the (GK) repeats from DNMT1 homologues (Figure 1D) demonstrates an expansion of this motif with evolutionary progression. In Arabidopsis, there are two lysine- glycine dipeptides, four in Bombyx, and seven in the mouse. Interestingly, Arabidopsis has been shown to utilize imprinting in the endosperm, a nutrient-supplying tissue (Raissig et al. 2013)

120 bearing resemblance to the mammalian placenta. It has been suggested that imprinting arose in mammals with the acquisition of the placenta, though some reptiles and insects possess a simpler placenta as well (Renfree et al. 2013). This observation raises a question about the emergence of the (GK) motif in species that possess genomic imprinting and DNA methylation and would warrant further study.

121 REFERENCES

Aapola, U. et al., 2001. Isolation and initial characterization of the mouse Dnmt3l gene. Cytogenetics and cell genetics, 92(1-2), pp.122–126.

Allen, M.D. et al., 2006. Solution structure of the nonmethyl-CpG-binding CXXC domain of the leukaemia-associated MLL histone methyltransferase. The EMBO journal, 25(19), pp.4503– 4512.

Arita, K. et al., 2008. Recognition of hemi-methylated DNA by the SRA protein UHRF1 by a base-flipping mechanism. Nature, 455(7214), pp.818–821.

Armache, K.-J. et al., 2011. Structural basis of silencing: Sir3 BAH domain in complex with a nucleosome at 3.0 Å resolution. Science (New York, N.Y.), 334(6058), pp.977–982.

Avvakumov, G.V. et al., 2008. Structural basis for recognition of hemi-methylated DNA by the SRA domain of human UHRF1. Nature, 455(7214), pp.822–825.

Baets, J. et al., 2015. Defects of mutant DNMT1 are linked to a spectrum of neurological disorders. Brain : a journal of neurology, 138(Pt 4), pp.845–861.

Barau, J. et al., 2016. The DNA methyltransferase DNMT3C protects male germ cells from transposon activity. Science (New York, N.Y.), 354(6314), pp.909–912.

Beard, C., Li, E. & Jaenisch, R., 1995. Loss of methylation activates Xist in somatic but not in embryonic cells. Genes & development, 9(19), pp.2325–2334.

Bessman, M.J. et al., 1958. ENZYMATIC SYNTHESIS OF DEOXYRIBONUCLEIC ACID. III. THE INCORPORATION OF PYRIMIDINE AND PURINE ANALOGUES INTO DEOXYRIBONUCLEIC ACID. Proceedings of the National Academy of Sciences of the United States of America, 44(7), pp.633–640.

Bestor, T. et al., 1988. Cloning and sequencing of a cDNA encoding DNA methyltransferase of mouse cells. The carboxyl-terminal domain of the mammalian enzymes is related to bacterial restriction methyltransferases. Journal of molecular biology, 203(4), pp.971–983.

Bestor, T.H., 1992. Activation of mammalian DNA methyltransferase by cleavage of a Zn binding regulatory domain. The EMBO journal, 11(7), pp.2611–2617.

Bestor, T.H., 2003. Imprinting errors and developmental asymmetry. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 358(1436), pp.1411–1415.

Biniszkiewicz, D. et al., 2002. Dnmt1 overexpression causes genomic hypermethylation, loss of imprinting, and embryonic lethality. Molecular and cellular biology, 22(7), pp.2124–2135.

Birke, M. et al., 2002. The MT domain of the proto-oncoprotein MLL binds to CpG-containing DNA and discriminates against methylation. Nucleic acids research, 30(4), pp.958–965.

122 Blewitt, M.E. et al., 2008. SmcHD1, containing a structural-maintenance-of-chromosomes hinge domain, has a critical role in X inactivation. Nature genetics, 40(5), pp.663–669.

Borgel, J. et al., 2010. Targets and dynamics of promoter DNA methylation during early mouse development. Nature genetics, 42(12), pp.1093–1100.

Bostick, M. et al., 2007. UHRF1 plays a role in maintaining DNA methylation in mammalian cells. Science (New York, N.Y.), 317(5845), pp.1760–1764.

Boulard, M. et al., 2010. Histone variant macroH2A1 deletion in mice causes female-specific steatosis. Epigenetics & chromatin, 3(1), p.8.

Boulard, M., Edwards, J.R. & Bestor, T.H., 2015. FBXL10 protects Polycomb-bound genes from hypermethylation. Nature genetics, 47(5), pp.479–485.

Bourc'his, D. & Bestor, T.H., 2004. Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature, 431(7004), pp.96–99.

Bourc'his, D. & Bestor, T.H., 2006. Origins of extreme sexual dimorphism in genomic imprinting. Cytogenetic and genome research, 113(1-4), pp.36–40.

Bourc'his, D. et al., 2001. Dnmt3L and the establishment of maternal genomic imprints. Science (New York, N.Y.), 294(5551), pp.2536–2539.

Brinkman, A.B. et al., 2012. Sequential ChIP-bisulfite sequencing enables direct genome-scale investigation of chromatin and DNA methylation cross-talk. Genome research, 22(6), pp.1128–1138.

Cassidy, S.B., Dykens, E. & Williams, C.A., 2000. Prader-Willi and Angelman syndromes: sister imprinted disorders. American journal of medical genetics, 97(2), pp.136–146.

Chen, T. et al., 2007. Complete inactivation of DNMT1 leads to mitotic catastrophe in human cancer cells. Nature genetics, 39(3), pp.391–396.

Chen, T. et al., 2003. Establishment and maintenance of genomic methylation patterns in mouse embryonic stem cells by Dnmt3a and Dnmt3b. Molecular and cellular biology, 23(16), pp.5594–5605.

Chen, T., Tsujimoto, N. & Li, E., 2004. The PWWP Domain of Dnmt3a and Dnmt3b Is Required for Directing DNA Methylation to the Major Satellite Repeats at Pericentric Heterochromatin. Molecular and cellular biology, 24(20), pp.9048–9058.

Chen, Z.-X. et al., 2005. Physical and functional interactions between the human DNMT3L protein and members of the de novo methyltransferase family. Journal of cellular biochemistry, 95(5), pp.902–917.

Cheng, J. et al., 2015. Crystal Structure of human DNMT1 and USP7/HAUSP complex.

123 Choudhary, C. et al., 2009. Lysine acetylation targets protein complexes and co-regulates major cellular functions. Science (New York, N.Y.), 325(5942), pp.834–840.

Ciccone, D.N. et al., 2009. KDM1B is a histone H3K4 demethylase required to establish maternal genomic imprints. Nature, 461(7262), pp.415–418.

Ciryam, P. et al., 2013. Widespread Aggregation and Neurodegenerative Diseases Are Associated with Supersaturated Proteins. Cell Reports, 5(3), pp.781–790.

Creyghton, M.P. et al., 2008. H2AZ is enriched at polycomb complex target genes in ES cells and is necessary for lineage commitment. Cell, 135(4), pp.649–661.

Damelin, M. & Bestor, T.H., 2007. Biological functions of DNA methyltransferase 1 require its methyltransferase activity. Molecular and cellular biology, 27(11), pp.3891–3899.

Dennis, K. et al., 2001. Lsh, a member of the SNF2 family, is required for genome-wide methylation. Genes & development, 15(22), pp.2940–2944.

Dewannieux, M. et al., 2004. Identification of autonomous IAP LTR retrotransposons mobile in mammalian cells. Nature genetics, 36(5), pp.534–539.

Dhayalan, A. et al., 2010. The Dnmt3a PWWP domain reads histone 3 lysine 36 trimethylation and guides DNA methylation. The Journal of biological chemistry, 285(34), pp.26114– 26120.

Ding, F. & Chaillet, J.R., 2002. In vivo stabilization of the Dnmt1 (cytosine-5)- methyltransferase protein. Proceedings of the National Academy of Sciences, 99(23), pp.14861–14866.

Dodge, J.E. et al., 2005. Inactivation of Dnmt3b in mouse embryonic fibroblasts results in DNA hypomethylation, chromosomal instability, and spontaneous immortalization. The Journal of biological chemistry, 280(18), pp.17986–17991.

Doench, J.G. et al., 2016. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nature biotechnology, 34(2), pp.184–191.

Du, Z. et al., 2010. DNMT1 stability is regulated by proteins coordinating deubiquitination and acetylation-driven ubiquitination. Science signaling, 3(146), pp.ra80–ra80.

Duffié, R. & Bourc'his, D., 2013. Parental epigenetic asymmetry in mammals. Current topics in developmental biology, 104, pp.293–328.

Edwards, C.A. & Ferguson-Smith, A.C., 2007. Mechanisms regulating imprinted genes in clusters. Current opinion in cell biology, 19(3), pp.281–289.

Edwards, J.R. et al., 2010. Chromatin and sequence features that define the fine and gross structure of genomic methylation patterns. Genome research, 20(7), pp.972–980.

124 Eggan, K. et al., 2002. Male and female mice derived from the same embryonic stem cell clone by tetraploid embryo complementation. Nature biotechnology, 20(5), pp.455–459.

Elsässer, S.J. et al., 2016. Genetic code expansion in stable cell lines enables encoded chromatin modification. Nature Methods, 13(2), pp.158–164.

Estève, P.-O. et al., 2011. A methylation and phosphorylation switch between an adjacent lysine and serine determines human DNMT1 stability. Nature Structural & Molecular Biology, 18(1), pp.42–48.

Fan, Y. et al., 2005. Histone H1 depletion in mammals alters global chromatin structure but causes specific changes in gene regulation. Cell, 123(7), pp.1199–1212.

Feenstra, A. et al., 1986. In vitro methylation inhibits the promotor activity of a cloned intracisternal A-particle LTR. Nucleic acids research, 14(10), pp.4343–4352.

Felle, M. et al., 2011. The USP7/Dnmt1 complex stimulates the DNA methylation activity of Dnmt1 and regulates the stability of UHRF1. Nucleic acids research, 39(19), pp.8355–8365.

Garrick, D. et al., 2006. Loss of Atrx affects trophoblast development and the pattern of X- inactivation in extraembryonic tissues. PLoS genetics, 2(4), p.e58.

Garvilles, R.G. et al., 2015. Dual Functions of the RFTS Domain of Dnmt1 in Replication- Coupled DNA Methylation and in Protection of the Genome from Aberrant Methylation. PLOS ONE, 10(9), p.e0137509.

Gibrat, J.F., Madej, T. & Bryant, S.H., 1996. Surprising similarities in structure comparison. Current opinion in structural biology, 6(3), pp.377–385.

Gold, M., Hurwitz, J. & Anders, M., 1963. THE ENZYMATIC METHYLATION OF RNA AND DNA, II. ON THE SPECIES SPECIFICITY OF THE METHYLATION ENZYMES. Proceedings of the National Academy of Sciences of the United States of America, 50(1), pp.164–169.

Goto, K. et al., 1994. Expression of DNA methyltransferase gene in mature and immature neurons as well as proliferating cells in mice. Differentiation; research in biological diversity, 56(1-2), pp.39–44.

Gowher, H., Liebert, K., et al., 2005. Mechanism of stimulation of catalytic activity of Dnmt3A and Dnmt3B DNA-(cytosine-C5)-methyltransferases by Dnmt3L. The Journal of biological chemistry, 280(14), pp.13341–13348.

Gowher, H., Stockdale, C.J., et al., 2005. De novo methylation of nucleosomal DNA by the mammalian Dnmt1 and Dnmt3A DNA methyltransferases. Biochemistry, 44(29), pp.9899– 9904.

Guenatri, M. et al., 2004. Mouse centric and pericentric satellite repeats form distinct functional heterochromatin. The Journal of cell biology, 166(4), pp.493–505.

125 Hashimoto, H. et al., 2008. The SRA domain of UHRF1 flips 5-methylcytosine out of the DNA helix. Nature, 455(7214), pp.826–829.

Hata, K. et al., 2002. Dnmt3L cooperates with the Dnmt3 family of de novo DNA methyltransferases to establish maternal imprints in mice. Development (Cambridge, England), 129(8), pp.1983–1993.

Hellman, A. & Chess, A., 2007. Gene body-specific methylation on the active X chromosome. Science (New York, N.Y.), 315(5815), pp.1141–1143.

Hirasawa, R. et al., 2008. Maternal and zygotic Dnmt1 are necessary and sufficient for the maintenance of DNA methylation imprints during preimplantation development. Genes & development, 22(12), pp.1607–1616.

Holliday, R. & Pugh, J.E., 1975. DNA modification mechanisms and gene activity during development. Science (New York, N.Y.), 187(4173), pp.226–232.

HOTCHKISS, R.D., 1948. The quantitative separation of purines, pyrimidines, and nucleosides by paper chromatography. The Journal of biological chemistry, 175(1), pp.315–332.

Howell, C.Y. et al., 2001. Genomic imprinting disrupted by a maternal effect mutation in the Dnmt1 gene. Cell, 104(6), pp.829–838.

Howlett, S.K. & Reik, W., 1991. Methylation levels of maternal and paternal genomes during preimplantation development. Development (Cambridge, England), 113(1), pp.119–127.

Hu, M. et al., 2006. Structural basis of competitive recognition of p53 and MDM2 by HAUSP/USP7: implications for the regulation of the p53-MDM2 pathway. G. Petsko, ed. PLoS biology, 4(2), p.e27.

Inano, K. et al., 2000. Maintenance-Type DNA Methyltransferase Is Highly Expressed in Post- Mitotic Neurons and Localized in the Cytoplasmic Compartment. Journal of Biochemistry, 128(2), pp.315–321.

Ioshikhes, I.P. & Zhang, M.Q., 2000. Large-scale human promoter mapping using CpG islands. Nature genetics, 26(1), pp.61–63.

Jackson-Grusby, L. et al., 2001. Loss of genomic methylation causes p53-dependent apoptosis and epigenetic deregulation. Nature genetics, 27(1), pp.31–39.

Jeong, S. et al., 2009. Selective anchoring of DNA methyltransferases 3A and 3B to nucleosomes containing methylated DNA. Molecular and cellular biology, 29(19), pp.5366– 5376.

Kalb, R. et al., 2014. Histone H2A monoubiquitination promotes histone H3 methylation in Polycomb repression. Nature Structural & Molecular Biology, 21(6), pp.569–571.

Kamieniarz, K. & Schneider, R., 2009. Tools to Tackle Protein Acetylation. Chemistry &

126 biology, 16(10), pp.1027–1029.

Kaneda, M. et al., 2004. Essential role for de novo DNA methyltransferase Dnmt3a in paternal and maternal imprinting. Nature, 429(6994), pp.900–903.

Kareta, M.S. et al., 2006. Reconstitution and mechanism of the stimulation of de novo methylation by human DNMT3L. The Journal of biological chemistry, 281(36), pp.25893– 25902.

Kashiwagi, K. et al., 2011. DNA methyltransferase 3b preferentially associates with condensed chromatin. Nucleic acids research, 39(3), pp.874–888.

Kato, Y. et al., 2007. Role of the Dnmt3 family in de novo methylation of imprinted and repetitive sequences during male germ cell development in the mouse. Human molecular genetics, 16(19), pp.2272–2280.

Kaufman, M.H., Barton, S.C. & Surani, M.A., 1977. Normal postimplantation development of mouse parthenogenetic embryos to the forelimb bud stage. Nature, 265(5589), pp.53–55.

Kazazian, H.H., 2004. Mobile elements: drivers of genome evolution. Science (New York, N.Y.), 303(5664), pp.1626–1632.

Kim, S.C. et al., 2006. Substrate and Functional Diversity of Lysine Acetylation Revealed by a Proteomics Survey. Molecular cell, 23(4), pp.607–618.

Klein, C.J. et al., 2013. DNMT1 mutation hot spot causes varied phenotypes of HSAN1 with dementia and hearing loss. Neurology, 80(9), pp.824–828.

Klein, C.J. et al., 2011. Mutations in DNMT1 cause hereditary sensory neuropathy with dementia and hearing loss. Nature genetics, 43(6), pp.595–600.

Kobayashi, H. et al., 2006. Bisulfite sequencing and dinucleotide content analysis of 15 imprinted mouse differentially methylated regions (DMRs): paternally methylated DMRs contain less CpGs than maternally methylated DMRs. Cytogenetic and genome research, 113(1-4), pp.130–137.

Kono, T. et al., 2004. Birth of parthenogenetic mice that can develop to adulthood. Nature, 428(6985), pp.860–864.

Kuo, A.J. et al., 2012. The BAH domain of ORC1 links H4K20me2 to DNA replication licensing and Meier-Gorlin syndrome. Nature, 484(7392), pp.115–119.

La Salle, S. et al., 2004. Windows for sex-specific methylation marked by DNA methyltransferase expression profiles in mouse germ cells. Developmental biology, 268(2), pp.403–415.

Lecona, E. et al., 2016. USP7 is a SUMO deubiquitinase essential for DNA replication. Nature Structural & Molecular Biology.

127 Lee, J.H., Voo, K.S. & Skalnik, D.G., 2001. Identification and characterization of the DNA binding domain of CpG-binding protein. The Journal of biological chemistry, 276(48), pp.44669–44676.

Lehnertz, B. et al., 2003. Suv39h-mediated histone H3 lysine 9 methylation directs DNA methylation to major satellite repeats at pericentric heterochromatin. Current biology : CB, 13(14), pp.1192–1200.

Lei, H. et al., 1996. De novo DNA cytosine methyltransferase activities in mouse embryonic stem cells. Development (Cambridge, England), 122(10), p.3195.

Leonhardt, H. et al., 1992. A targeting sequence directs DNA methyltransferase to sites of DNA replication in mammalian nuclei. Cell, 71(5), pp.865–873.

Li, B., Carey, M. & Workman, J.L., 2007. The role of chromatin during transcription. Cell, 128(4), pp.707–719.

Li, E., Beard, C. & Jaenisch, R., 1993. Role for DNA methylation in genomic imprinting. Nature, 366(6453), pp.362–365.

Li, E., Beard, C., Forster, A.C., et al., 1993. DNA methylation, genomic imprinting, and mammalian development. Cold Spring Harbor symposia on quantitative biology, 58, pp.297–305.

Li, E., Bestor, T.H. & Jaenisch, R., 1992. Targeted mutation of the DNA methyltransferase gene results in embryonic lethality. Cell, 69(6), pp.915–926.

Li, M. et al., 2002. Deubiquitination of p53 by HAUSP is an important pathway for p53 stabilization. Nature, 416(6881), pp.648–653.

Liao, J. et al., 2015. Targeted disruption of DNMT1, DNMT3A and DNMT3B in human embryonic stem cells. Nature genetics, 47(5), pp.469–478.

Lister, R. et al., 2009. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature, 462(7271), pp.315–322.

Liu, X. et al., 2013. UHRF1 targets DNMT1 for DNA methylation through cooperative binding of hemi-methylated DNA and methylated H3K9. Nature communications, 4, p.1563.

Luger, K. et al., 1997. Crystal structure of the nucleosome core particle at 2.8 A resolution. Nature, 389(6648), pp.251–260.

Marahrens, Y. et al., 1997. Xist-deficient mice are defective in dosage compensation but not spermatogenesis. Genes & development, 11(2), pp.156–166.

Maunakea, A.K. et al., 2010. Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature, 466(7303), pp.253–257.

128 McGraw, S. et al., 2015. Transient DNMT1 suppression reveals hidden heritable marks in the genome. Nucleic acids research, 43(3), pp.1485–1497.

Meissner, A. et al., 2008. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature, 454(7205), pp.766–770.

Meissner, A. et al., 2005. Reduced representation bisulfite sequencing for comparative high- resolution DNA methylation analysis. Nucleic acids research, 33(18), pp.5868–5877.

Meistrich, M.L. et al., 1992. Highly acetylated H4 is associated with histone displacement in rat spermatids. Molecular reproduction and development, 31(3), pp.170–181.

Moghadam, K.K. et al., 2014. Narcolepsy is a common phenotype in HSAN IE and ADCA-DN. Brain : a journal of neurology, 137(Pt 6), pp.1643–1655.

Mohn, F. et al., 2008. Lineage-specific polycomb targets and de novo DNA methylation define restriction and potential of neuronal progenitors. Molecular cell, 30(6), pp.755–766.

Monk, M., Boubelik, M. & Lehnert, S., 1987. Temporal and regional changes in DNA methylation in the embryonic, extraembryonic and germ cell lineages during mouse embryo development. Development (Cambridge, England), 99(3), pp.371–382.

Morgan, H.D. et al., 2005. Epigenetic reprogramming in mammals. Human molecular genetics, 14 Spec No 1(suppl_1), pp.R47–58.

Nimura, K. et al., 2006. Dnmt3a2 targets endogenous Dnmt3L to ES cell chromatin and induces regional DNA methylation. Genes to cells : devoted to molecular & cellular mechanisms, 11(10), pp.1225–1237.

Nishiyama, A. et al., 2013. Uhrf1-dependent H3K23 ubiquitylation couples maintenance DNA methylation and replication. Nature, 502(7470), pp.249–253.

Okamoto, I. & Heard, E., 2009. Lessons from comparative analysis of X-chromosome inactivation in mammals. Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology, 17(5), pp.659–669.

Okano, M. et al., 1999. DNA Methyltransferases Dnmt3a and Dnmt3b Are Essential for De Novo Methylation and Mammalian Development. Cell, 99(3), pp.247–257.

Okano, M., Xie, S. & Li, E., 1998. Cloning and characterization of a family of novel mammalian DNA (cytosine-5) methyltransferases. Nature genetics, 19(3), pp.219–220.

Ooi, S.K. et al., 2010. Dynamic instability of genomic methylation patterns in pluripotent stem cells. Epigenetics & chromatin, 3(1), p.17.

Ooi, S.K.T. et al., 2007. DNMT3L connects unmethylated lysine 4 of histone H3 to de novo methylation of DNA. Nature, 448(7154), pp.714–717.

129 Otani, J. et al., 2009. Structural basis for recognition of H3K4 methylation status by the DNA methyltransferase 3A ATRX-DNMT3-DNMT3L domain. EMBO reports, 10(11), pp.1235– 1241.

Panning, B. & Jaenisch, R., 1996. DNA hypomethylation can activate Xist expression and silence X-linked genes. Genes & development, 10(16), pp.1991–2002.

Peters, A.H. et al., 2001. Loss of the Suv39h histone methyltransferases impairs mammalian heterochromatin and genome stability. Cell, 107(3), pp.323–337.

Pradhan, M. et al., 2008. CXXC domain of human DNMT1 is essential for enzymatic activity. Biochemistry, 47(38), pp.10000–10009.

Pradhan, S. et al., 1999. Recombinant human DNA (cytosine-5) methyltransferase. I. Expression, purification, and comparison of de novo and maintenance methylation. The Journal of biological chemistry, 274(46), pp.33002–33010.

Qin, W. et al., 2015. DNA methylation requires a DNMT1 ubiquitin interacting motif (UIM) and histone ubiquitination. Cell research, 25(8), pp.911–929.

Qin, W., Leonhardt, H. & Spada, F., 2011. Usp7 and Uhrf1 control ubiquitination and stability of the maintenance DNA methyltransferase Dnmt1. Journal of cellular biochemistry, 112(2), pp.439–444.

Qiu, C. et al., 2002. The PWWP domain of mammalian DNA methyltransferase Dnmt3b defines a new family of DNA-binding folds. Nature structural biology, 9(3), pp.217–224.

Raissig, M.T. et al., 2013. Genomic Imprinting in the Arabidopsis Embryo Is Partly Regulated by PRC2 N. M. Springer, ed. PLoS genetics, 9(12), p.e1003862.

Ran, F.A. et al., 2013. Genome engineering using the CRISPR-Cas9 system. Nature protocols, 8(11), pp.2281–2308.

Rea, M. et al., 2012. Histone H1 affects gene imprinting and DNA methylation in Arabidopsis. The Plant journal : for cell and molecular biology, 71(5), pp.776–786.

Reik, W. et al., 1987. Genomic imprinting determines methylation of parental alleles in transgenic mice. Nature, 328(6127), pp.248–251.

Renfree, M.B., Suzuki, S. & Kaneko-Ishino, T., 2013. The origin and evolution of genomic imprinting and viviparity in mammals. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 368(1609), pp.20120151–20120151.

Riggs, A.D., 1975. X inactivation, differentiation, and DNA methylation. Cytogenetics and cell genetics, 14(1), pp.9–25.

Rollins, R.A. et al., 2006. Large-scale structure of genomic methylation patterns. Genome research, 16(2), pp.157–163.

130 Ross, C.A. & Poirier, M.A., 2004. Protein aggregation and neurodegenerative disease. Nature Medicine, 10(7), pp.S10–S17.

Rothbart, S.B. et al., 2012. Association of UHRF1 with methylated H3K9 directs the maintenance of DNA methylation. Nature Structural & Molecular Biology, 19(11), pp.1155–1160.

Roux, K.J. et al., 2012. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology, 196(6), pp.801–810.

Sado, T. et al., 2004. De novo DNA methylation is dispensable for the initiation and propagation of X chromosome inactivation. Development (Cambridge, England), 131(5), pp.975–982.

Sado, T. et al., 2000. X inactivation in the mouse embryo deficient for Dnmt1: distinct effect of hypomethylation on imprinted and random X inactivation. Developmental biology, 225(2), pp.294–303.

Sadowski, M. et al., 2012. Protein monoubiquitination and polyubiquitination generate structural diversity to control distinct biological processes. IUBMB life, 64(2), pp.136–142.

Sakai, Y. et al., 2001. Expression of DNA methyltransferase (Dnmt1) in testicular germ cells during development of mouse embryo. Cell structure and function, 26(6), pp.685–691.

Sampath, V. et al., 2009. Mutational analysis of the Sir3 BAH domain reveals multiple points of interaction with nucleosomes. Molecular and cellular biology, 29(10), pp.2532–2545.

Santos, F. et al., 2002. Dynamic reprogramming of DNA methylation in the early mouse embryo. Developmental biology, 241(1), pp.172–182.

Sapienza, C. et al., 1987. Degree of methylation of transgenes is dependent on gamete of origin. Nature, 328(6127), pp.251–254.

Schaefer, C.B. et al., 2007. Epigenetic decisions in mammalian germ cells. Science (New York, N.Y.), 316(5823), pp.398–399.

Seisenberger, S. et al., 2012. The dynamics of genome-wide DNA methylation reprogramming in mouse primordial germ cells. Molecular cell, 48(6), pp.849–862.

Sharif, J. et al., 2007. The SRA protein Np95 mediates epigenetic inheritance by recruiting Dnmt1 to methylated DNA. Nature, 450(7171), pp.908–912.

Sheng, Y. et al., 2006. Molecular recognition of p53 and MDM2 by USP7/HAUSP. Nature Structural & Molecular Biology, 13(3), pp.285–291.

Simon, J.A. & Kingston, R.E., 2009. Mechanisms of polycomb gene silencing: knowns and unknowns. Nature reviews. Molecular cell biology, 10(10), pp.697–708.

Smit, A.F., 1996. The origin of interspersed repeats in the human genome. Current opinion in

131 genetics & development, 6(6), pp.743–748.

Smith, K.M. et al., 2010. H2B- and H3-specific histone deacetylases are required for DNA methylation in Neurospora crassa. Genetics, 186(4), pp.1207–1216.

Smith, Z.D. et al., 2012. A unique regulatory phase of DNA methylation in the early mammalian embryo. Nature, 484(7394), pp.339–344.

Song, J. et al., 2011. Structure of DNMT1-DNA complex reveals a role for autoinhibition in maintenance DNA methylation. Science (New York, N.Y.), 331(6020), pp.1036–1040.

Song, J. et al., 2012. Structure-based mechanistic insights into DNMT1-mediated maintenance DNA methylation. Science (New York, N.Y.), 335(6069), pp.709–712.

Suetake, I. et al., 2004. DNMT3L stimulates the DNA methylation activity of Dnmt3a and Dnmt3b through a direct interaction. The Journal of biological chemistry, 279(26), pp.27816–27823.

Sun, Z. et al., 2014. Aberrant signature methylome by DNMT1 hot spot mutation in hereditary sensory and autonomic neuropathy 1E. Epigenetics, 9(8), pp.1184–1193.

Surani, M.A., Barton, S.C. & Norris, M.L., 1984. Development of reconstituted mouse eggs suggests imprinting of the genome during gametogenesis. Nature, 308(5959), pp.548–550.

Swain, J.L., Stewart, T.A. & Leder, P., 1987. Parental legacy determines methylation and expression of an autosomal transgene: a molecular mechanism for parental imprinting. Cell, 50(5), pp.719–727.

Takebayashi, S.-I. et al., 2007. Major and essential role for the DNA methylation mark in mouse embryogenesis and stable association of DNMT1 with newly replicated regions. Molecular and cellular biology, 27(23), pp.8243–8258.

Takeshita, K. et al., 2011. Structural insight into maintenance methylation by mouse DNA methyltransferase 1 (Dnmt1). Proceedings of the National Academy of Sciences of the United States of America, 108(22), pp.9055–9059.

Tanay, A. et al., 2007. Hyperconserved CpG domains underlie Polycomb-binding sites. Proceedings of the National Academy of Sciences, 104(13), pp.5521–5526.

Thorvaldsen, J.L., Duran, K.L. & Bartolomei, M.S., 1998. Deletion of the H19 differentially methylated domain results in loss of imprinted expression of H19 and Igf2. Genes & development, 12(23), pp.3693–3702.

Tomizawa, S.-I. et al., 2011. Dynamic stage-specific changes in imprinted differentially methylated regions during early mammalian development and prevalence of non-CpG methylation in oocytes. Development (Cambridge, England), 138(5), pp.811–820.

Trasler, J.M. et al., 1996. DNA methyltransferase in normal andDnmtn/Dnmtn mouse embryos.

132 Developmental Dynamics, 206(3), pp.239–247.

Tsumura, A. et al., 2006. Maintenance of self-renewal ability of mouse embryonic stem cells in the absence of DNA methyltransferases Dnmt1, Dnmt3a and Dnmt3b. Genes to cells : devoted to molecular & cellular mechanisms, 11(7), pp.805–814.

Tucker, K.L., Beard, C., et al., 1996. Germ-line passage is required for establishment of methylation and expression patterns of imprinted but not of nonimprinted genes. Genes & development, 10(8), pp.1008–1020.

Tucker, K.L., Talbot, D., et al., 1996. Complementation of methylation deficiency in embryonic stem cells by a DNA methyltransferase minigene. Proceedings of the National Academy of Sciences, 93(23), pp.12920–12925.

Ueda, Y. et al., 2006. Roles for Dnmt3b in mammalian development: a mouse model for the ICF syndrome. Development (Cambridge, England), 133(6), pp.1183–1192.

Walsh, C.P., Chaillet, J.R. & Bestor, T.H., 1998. Nature genetics, 20(2), pp.116–117.

Wang, J. et al., 2001. Imprinted X inactivation maintained by a mouse Polycomb group gene. Nature genetics, 28(4), pp.371–375.

Watanabe, T. et al., 2011. Role for piRNAs and noncoding RNA in de novo DNA methylation of the imprinted mouse Rasgrf1 locus. Science (New York, N.Y.), 332(6031), pp.848–852.

Waterston, R.H., Lander, E.S. & Sulston, J.E., 2002. On the sequencing of the human genome. Proceedings of the National Academy of Sciences, 99(6), pp.3712–3716.

Weinberger, L. et al., 2016. Dynamic stem cell states: naive to primed pluripotency in rodents and humans. Nature reviews. Molecular cell biology, 17(3), pp.155–169.

Wigler, M., Levy, D. & Perucho, M., 1981. The somatic replication of DNA methylation. Cell, 24(1), pp.33–40.

Wutz, A. et al., 1997. Imprinted expression of the Igf2r gene depends on an intronic CpG island. Nature, 389(6652), pp.745–749.

Xiao, H. & Schultz, P.G., 2016. At the Interface of Chemical and Biological Synthesis: An Expanded Genetic Code. Cold Spring Harbor Perspectives in Biology, 8(9), p.a023945.

Xie, S. et al., 1999. Cloning, expression and chromosome locations of the human DNMT3 gene family. Gene, 236(1), pp.87–95.

Xie, Z.-H. et al., 2006. Mutations in DNA methyltransferase DNMT3B in ICF syndrome affect its regulation by DNMT3L. Human molecular genetics, 15(9), pp.1375–1385.

Xu, G.L. et al., 1999. Chromosome instability and immunodeficiency syndrome caused by mutations in a DNA methyltransferase gene. Nature, 402(6758), pp.187–191.

133 Yang, N. & Xu, R.-M., 2013. Structure and function of the BAH domain in chromatin biology. Critical reviews in biochemistry and molecular biology, 48(3), pp.211–221.

Yehezkel, S. et al., 2008. Hypomethylation of subtelomeric regions in ICF syndrome is associated with abnormally short telomeres and enhanced transcription from telomeric regions. Human molecular genetics, 17(18), pp.2776–2789.

Yoder, J.A. et al., 1997. DNA (cytosine-5)-methyltransferases in mouse cells and tissues. Studies with a mechanism-based probe. Journal of molecular biology, 270(3), pp.385–395.

Zemach, A. et al., 2010. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science (New York, N.Y.), 328(5980), pp.916–919.

Zhang, Y. et al., 2010. Chromatin methylation activity of Dnmt3a and Dnmt3a/3L is guided by interaction of the ADD domain with the histone H3 tail. Nucleic acids research, 38(13), pp.4246–4253.

Zhang, Z.-M. et al., 2015. Crystal Structure of Human DNA Methyltransferase 1. Journal of molecular biology, 427(15), pp.2520–2531.

Zhao, Q. et al., 2016. Dissecting the precise role of H3K9 methylation in crosstalk with DNA maintenance methylation in mammals. Nature communications, 7, p.12464.

Zietkiewicz, E. et al., 1998. Monophyletic origin of Alu elements in primates. Journal of molecular evolution, 47(2), pp.172–182.

Ziller, M.J. et al., 2013. Charting a dynamic DNA methylation landscape of the human genome. Nature, 500(7463), pp.477–481.

134