DNA Methylation Changes During Aging and in Age-Associated Diseases

Dissertation approved by the Faculty of Mathematics, Computer Science and Natural Sciences of RWTH Aachen University for the award of the academic degree of Doctor of Natural Sciences

submitted by

M.Sc. Monika Eipel

from Jülich

Reviewers: Univ.-Prof. Dr. Dr. Wolfgang Wagner, Univ.-Prof. Dr. Martin Zenke

Date of the oral examination: 04.06.2019

This dissertation is available on the website of the university library.

Abstract
Zusammenfassung (Summary)

1 Introduction
1.1 DNA Methylation in the Aging Organism
1.1.1 Mechanisms of DNA Methylation and Demethylation
1.1.2 DNA Methylation Changes During Aging
1.1.3 DNA Methylation as a Biomarker for Aging
1.2 Clonal Development During Aging and Age-Associated Diseases
1.2.1 Age-Related Clonal Hematopoiesis
1.2.2 DNA Methylation Changes During Malignant Transformation
1.2.3 Inherited Age-Associated Diseases and Telomeropathies
1.2.4 The PRDM Family
1.3 Stem Cells
1.3.1 Induced Pluripotent Stem Cells
1.3.2 Induced Pluripotent Stem Cells as a Model System in Clinical Research
1.4 Objectives

2 Materials and Methods
2.1 Molecular Biology
2.1.1 DNA Isolation
2.1.2 Polymerase Chain Reaction
2.1.3 Agarose Gel Electrophoresis
2.1.4 Bisulfite Conversion
2.1.5 Pyrosequencing
2.1.6 Barcoded Bisulfite Amplicon Next Generation Sequencing
2.1.7 RNA Isolation
2.1.8 cDNA Synthesis and Semi-qPCR
2.2 Age Prediction Models
2.3 Cloning
2.3.1 Using CRISPR/Cas9n to Generate Gene Knockouts
2.3.2 Generation of PRDM8 Overexpression Vector
2.3.3 Transformation of Escherichia Coli
2.3.4 Plasmid Isolation and Colony PCR
2.3.5 Transfection of Induced Pluripotent Stem Cells via Electroporation
2.3.6 Virus Production
2.3.7 Transduction of Induced Pluripotent Stem Cells
2.4 Cell Culture
2.4.1 Culture of Induced Pluripotent Stem Cells
2.4.2 Embryoid Body Assay
2.4.3 Neural differentiation
2.4.4 HEK 293T Cell Culture
2.4.5 Cell Sorting
2.5 Proteochemistry
2.5.1 H and E staining of Buccal Swab Smears
2.5.2 Immunofluorescence Staining
2.6 Statistics and Data Analyses

3 Results
3.1 Epigenetic Age Predictions of Buccal Swab Samples
3.1.1 Comparative Analyses of Age-Associated DNA Methylation Changes in Blood and Mouth Swab Samples
3.1.2 Development of a Predictor to Determine Cell Compositions in Buccal Swabs
3.1.3 Combination of Age-Associated and Cell Type Specific CG Dinucleotides to Improve Age Predictions
3.2 Analysis of DNA Methylation Variability of Neighboring CG Dinucleotides in Clonal Diseases
3.2.1 Identifying Myeloid Malignancies by Analyses of DNA Methylation Variability at Neighboring CG Dinucleotides
3.2.2 Using Barcoded Bisulfite Amplicon Next Generation Sequencing to Classify DNA Methylation Patterns of AML Samples
3.3 The Functional Role of PRDM8 on Differentiation
3.3.1 PRDM8 is Differentially Methylated in Dyskeratosis Congenita Patients
3.3.2 Using CRISPR/Cas9n to Modulate PRDM8 Expression in Induced Pluripotent Stem Cells
3.3.3 Influence of PRDM8 on Neural Differentiation
3.3.4 Genome-Wide Analyses of PRDM8-Dependent DNA Methylation Changes During Neural Differentiation

4 Discussion
4.1 Tissue Specific DNA Methylation Enhances Epigenetic Age Predictions
4.2 Malignant Clonal Development can be Identified by DNA Methylation Variability of Successive CG Dinucleotides
4.3 Classification of DNA Methylation Patterns of Individual DNA Strands Provides a New Perspective to Track Clonal Hematopoiesis
4.4 Lack of PRDM8 Expression Leads to Impaired Neural Differentiation and is Associated with Differential Promoter Methylation

5 Conclusion and Future Perspective

6 Bibliography

7 Appendix
7.1 Abbreviations
7.2 List of Figures
7.3 List of Tables
7.4 Publications
7.5 Acknowledgements
7.6 Declaration of Authorship

Abstract

Highly reproducible changes in DNA methylation (DNAm) can be used for epigenetic age predictions. In forensic science, the specimens of choice for age predictions are buccal swabs, which contain leukocytes and epithelial cells. However, it is well known that DNAm varies between cell types. To take these differences into account, we developed a model to predict the proportion of cell types in buccal swabs based on DNAm levels at only two CpG sites. The combination of these CpG sites with three age-associated CpG sites generated a new model for buccal swab samples that allows age prediction with high accuracy. Of note, epigenetic age predictors are not applicable to blood samples from patients with hematological malignancies, indicating that the DNAm landscape is disturbed during malignant transformation. To further elucidate how methylation levels at age-associated CpG sites change during malignant transformation, we focused on the age-associated region in PDE4C. We demonstrate that the variability of DNAm patterns of successive CpG sites in PDE4C is indicative of clonal hematopoiesis. Barcoded Bisulfite Amplicon Next Generation Sequencing was used to derive patient-specific methylation patterns of AML samples with single-strand resolution. These patterns could be classified as healthy or AML-derived by machine learning algorithms. We hypothesize that the quantification of AML-derived DNAm patterns provides a new perspective to track clonal hematopoiesis. Furthermore, we have recently described reduced expression and hypermethylation of the PRDM8 gene in patients with the premature aging disease dyskeratosis congenita. To study the physiological role of this epigenetically controlled histone methyltransferase, we modulated its expression in iPSCs using the CRISPR/Cas9n technology. Spontaneous and directed differentiation experiments revealed impaired neural differentiation of PRDM8-/- and PRDM8+/- clones. Genome-wide methylation analyses indicate that PRDM8 positively regulates neurogenesis by promoting specific DNAm changes in promoter regions of genes associated with neural functions. In conclusion, this thesis enhances our understanding of DNAm changes and variability at specific CpG sites during aging and in age-associated diseases and demonstrates how these changes can be used for forensic and clinical applications.

Zusammenfassung (Summary)

Reproducible changes in DNA methylation (DNAm) can be used for epigenetic age predictions. For the forensic determination of epigenetic age, buccal swabs, which consist of leukocytes and epithelial cells, are preferred. However, DNAm varies between cell types. We therefore developed a model to determine the cellular composition of buccal swabs based on the DNAm of two CpG sites. Combining these CpG sites with three age-associated CpG sites yielded a new model for age determination of buccal swabs with high accuracy. Epigenetic age estimates are, however, not applicable to blood samples from patients with hematological diseases, which points to an alteration of the DNAm landscape during malignant transformation. To investigate how DNAm in age-associated regions changes during malignant transformation, we focused on the age-associated region in PDE4C. We were able to show that the variability of DNAm patterns of neighboring CpG sites in PDE4C is indicative of clonal hematopoiesis. BBA-Seq was applied to generate patient-specific DNAm patterns of AML patients with single-strand resolution. These patterns were classified as healthy or AML-derived by means of machine learning. We assume that the quantification of AML-derived DNAm patterns offers a new perspective for tracking clonal hematopoiesis. We recently described hypermethylation and reduced expression of PRDM8 in patients with the age-associated disease dyskeratosis congenita. To study the functional role of this epigenetically regulated histone methyltransferase, its gene expression was modulated in iPSCs using CRISPR/Cas9n. Under spontaneous and directed differentiation conditions, reduced neural differentiation of PRDM8+/- and PRDM8-/- cells was demonstrated. Genome-wide methylation analyses suggest that PRDM8 regulates neural development via DNA methylation changes in promoters of genes that are associated with neural functions. In summary, this thesis deepens our understanding of specific DNAm changes during aging as well as in age-associated diseases and demonstrates how these changes can be used for forensic and clinical purposes.

1 Introduction

1.1 DNA Methylation in the Aging Organism

Aging is characterized by a variety of hallmarks, such as genomic instability, telomere attrition, loss of proteostasis, deregulated nutrient sensing, mitochondrial dysfunction, cellular senescence, stem cell exhaustion and, importantly, epigenetic abnormalities including changes in DNA methylation (DNAm). Epigenetic systems control gene activity and thus, either directly or indirectly, affect all other hallmarks (Ashapkin et al. 2017). Therefore, current research focuses on the nature and mechanisms of epigenetic variability.

1.1.1 Mechanisms of DNA Methylation and Demethylation

Epigenetics can be described as the interface between the genome and the environment (Feil and Fraga 2012). The term “epigenetics” was coined by Waddington in 1942 to refer to “the causal mechanisms by which the genes of a genotype bring about a phenotype” (Waddington 1942). Currently, the widely accepted definition of the term is “heritable changes in genome function that occur without changes in the DNA sequence” (Russo et al. 1996; Kim and Costello 2017). The best understood epigenetic modifications are DNAm and histone modifications such as methylation, acetylation and phosphorylation (Portela and Esteller 2010). This study focuses on changes in DNAm, which is considered a relatively stable component of the epigenome (Bird 2002).

Methylation of DNA in vertebrates is mainly restricted to cytosine-guanine dinucleotides (CpG sites). However, significant non-CpG methylation has been identified in pluripotent stem cells (Ziller et al. 2011). The human genome comprises approximately 29 million CpG sites, 60% - 80% of which are methylated (Lister et al. 2009). Of the methylated CpG sites, 7% are found in regions of high CpG density, so-called CpG islands (Deaton and Bird 2011). However, CpG islands are largely resistant to DNAm and are associated with a majority of annotated gene promoters (Saxonov et al. 2006).

The enzymes that transfer methyl groups to DNA are the DNA methyltransferases (DNMTs). DNMT1 maintains global DNAm after cell division and shows a strong preference for hemimethylated DNA (Hermann et al. 2004). De novo methylation of DNA is carried out by DNMT3A and DNMT3B (Okano et al. 1999). In contrast, DNA demethylation is achieved by active as well as passive mechanisms (Figure 1.1). Passive DNA demethylation occurs when the DNAm maintenance machinery does not function during successive rounds of DNA replication. By contrast, active DNA demethylation occurs in a series of enzymatic processes (Kohli and Zhang 2013). Ten-eleven translocation (TET) family enzymes are the key players in active demethylation. TET proteins oxidize 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) and subsequently oxidize 5hmC to generate 5-formylcytosine and 5-carboxylcytosine (Ito et al. 2011). Alternatively, AID/APOBEC enzymes can deaminate 5mC and 5hmC to form thymine and 5-hydroxymethyluracil (5hmU). These derivatives are then either diluted during cell divisions or removed by base excision repair (Wu and Zhang 2017).
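The replication-dependent dilution that underlies passive demethylation can be illustrated with a small calculation. The following is a minimal sketch under a simplifying assumption that is not taken from the text: at each replication the methylated parental strand is retained, and the newly synthesized strand is methylated with a fixed probability (the hypothetical parameter `maintenance_efficiency`). The function name and parameters are illustrative only.

```python
def methylated_fraction(initial_fraction, divisions, maintenance_efficiency=0.0):
    """Fraction of DNA strands methylated at one CpG site after cell divisions.

    Toy model (assumption): every replication keeps the methylated parental
    strand, and the new strand opposite it is methylated with probability
    `maintenance_efficiency` (0 = no maintenance, 1 = perfect maintenance).
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must not be negative")
    # Strand number doubles per division; methylated strands grow by (1 + e).
    retained_per_division = (1.0 + maintenance_efficiency) / 2.0
    return initial_fraction * retained_per_division ** divisions


# Without any maintenance, the methylated fraction halves with every division.
for n in range(4):
    print(n, methylated_fraction(1.0, n))
```

In this simplified picture, a complete lack of maintenance halves the methylated fraction per division, whereas perfect maintenance keeps it constant.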

Figure 1.1 DNA methylation and demethylation pathway. DNA methyltransferases (DNMTs) catalyze the formation of 5-methylcytosine (5mC) from cytosine (C). Iterative oxidations by TET enzymes produce 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Passive DNA demethylation occurs via a reduced activity or failed recruitment of DNMTs during cell divisions. 5hmC, 5fC and 5caC are not recognized by DNMT enzymes and can promote passive DNA demethylation. Active demethylation can be achieved through the removal of 5fC and 5caC by the base excision repair (BER) pathway. Alternatively, 5mC and 5hmC can be deaminated by AID/APOBEC enzymes to form 5-hydroxymethyluracil (5hmU) which can be excised via the BER pathway. (Adapted from Ciccarone et al. 2018)

DNA methylation plays an important role in mammalian development, cell fate decision and maintenance of cellular identity through direct or indirect regulation of gene expression (Smith and Meissner 2013). Thus, it is not surprising that changes in DNAm have been observed during development and in many human diseases, especially cancer (Baylin and Jones 2016). During reprogramming of somatic cells into pluripotent cells, a global reset of the mature somatic epigenome occurs. The epigenome of induced pluripotent stem cells (iPSCs) closely resembles the epigenome of embryonic stem cells (ESCs) (Bock et al. 2011), and age-associated changes in DNAm are reversed during reprogramming (Weidner et al. 2014). Nonetheless, iPSCs harbor residual DNAm signatures that are specific to their tissue of origin (Polo et al. 2010; Bar-Nur et al. 2011). Recently, an increasing number of studies have examined changes in DNAm across different tissues and across the lifespan (Jones et al. 2015).

1.1.2 DNA Methylation Changes During Aging

Aging is characterized by a progressive loss of tissue functionality, and the risk of malignant transformation is increased in the elderly. Many intrinsic and extrinsic factors are linked to progressive physiological, mechanical and cognitive age-associated decline (Ciccarone et al. 2018). Thus, aging represents one of the main risk factors for various diseases, such as cancer, atherosclerosis and Alzheimer's disease (Niccoli and Partridge 2012).

Aging is associated with highly reproducible changes in DNAm. As early as 1967, Berdyshev et al. obtained the first experimental data on age-dependent DNAm changes in humpback salmon (Berdyshev et al. 1967; Ashapkin et al. 2017). Since then, genome-wide studies of DNAm have become a focus of aging research. With the development of the Illumina Infinium HumanMethylation BeadChip hybridization assay platforms, changes in the epigenome during aging could be analyzed in large cohorts (Ashapkin et al. 2017).

In neonatal blood, global DNAm levels are lower than at other points during development and adolescence (Martino et al. 2011; Wang et al. 2012). Analogously, the epigenetic age of ESCs is close to zero. The same is true for iPSCs derived from bone marrow samples and hematopoietic stem cells isolated from cord blood (Weidner et al. 2014). After birth, global DNAm levels in blood increase throughout the first year of life (Herbstman et al. 2013). These changes occur preferentially at CpG island shores and shelves, enhancers, and promoters lacking CpG islands (McClay et al. 2014). After the first year of life, median global DNAm levels are relatively stable (Martino et al. 2011). Studies that examined DNAm changes during childhood have reported that DNAm levels increase rapidly and then stabilize by adulthood in both brain and blood. Therefore, the vast majority of changes are more accurately modeled as a function of logarithmic age than linear age (Alisch et al. 2012; Lister et al. 2013).

Many studies describe a decrease in mean blood DNAm with increasing age after adulthood is reached (Bjornsson et al. 2004; Heyn et al. 2012; Hannum et al. 2013; Horvath 2013; Johansson et al. 2013; Florath et al. 2014). Of note, age-associated DNAm changes in blood samples do not necessarily reflect DNAm changes in other tissues. Genome-wide analysis of DNAm in solid tissues of adult donors revealed a clear dependence of the age-associated methylation patterns on the tissue source of the DNA. This is why special care needs to be taken when analyzing methylation data from sources with more than one cell type, for example buccal swabs.

Age-associated hypomethylation occurs in heterochromatic regions of the genome, such as telomeres, repetitive elements and transposons, which contain the majority of methylated CG dinucleotides in human DNA (Figure 1.2). Conversely, focal hypermethylation occurs at gene-associated CpG islands, which are generally unmethylated (Bollati et al. 2009; Madrigano et al. 2012). Gain of DNAm during aging is enriched at targets of Polycomb proteins, which generally show high levels of DNAm (Teschendorff et al. 2010; Heyn et al. 2012), highlighting the physiological relevance of DNAm changes during aging. Genes involved in telomere maintenance or epigenetic regulation, such as DNA methyltransferases or histone deacetylases, also tend to be increasingly methylated with aging (Christensen et al. 2009; Heyn et al. 2012; Florath et al. 2014). However, it is important to note that age-related DNAm changes are not necessarily associated with transcriptional changes (Peters et al. 2015; Yuan et al. 2015).
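The statement above that rapid early changes followed by stabilization are better modeled with logarithmic than with linear age can be made concrete with a small comparison of fits. The sketch below uses synthetic, noise-free values generated from a logarithmic curve purely for illustration. The coefficients, the `log(age + 1)` transform and the helper name `r_squared` are assumptions and do not come from the cited studies.

```python
import numpy as np


def r_squared(x, y):
    """Coefficient of determination of a straight-line least-squares fit y ~ x."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return 1.0 - residuals.var() / y.var()


# Synthetic illustration (not measured data): a CpG site whose methylation
# rises quickly in childhood and levels off in adulthood.
ages = np.arange(1, 81, dtype=float)         # chronological age in years
beta = 0.30 + 0.10 * np.log(ages + 1.0)      # hypothetical DNAm levels (0..1)

r2_linear = r_squared(ages, beta)
r2_log = r_squared(np.log(ages + 1.0), beta)
print(f"linear age: R^2 = {r2_linear:.3f}; log age: R^2 = {r2_log:.3f}")
```

Because the synthetic values were generated from a logarithmic curve, the log-age fit is essentially perfect by construction. The example only shows how the two parameterizations would be compared on real data.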

Figure 1.2 DNAm patterns in early life and aging. A) In early life, the CpG island promoters are widely protected from DNA methylation (white boxes). In contrast, repetitive elements (LTR, SINE and LINE), telomeric and centromeric heterochromatin regions are typically methylated (blue boxes). B) During aging, repetitive elements undergo a generalized hypomethylation, while locus-specific epigenetic changes are observed in gene-associated regulatory regions. (Adapted from Ciccarone et al. 2018)

There are two views on how epigenetic changes occur during aging. In addition to individual, global changes in DNAm patterns during aging, which can be interpreted as “epigenetic drift”, DNAm levels at specific sites in the genome are highly associated with age and coherently altered among individuals (Hannum et al. 2013; Jones et al. 2015). These highly reproducible DNAm changes at specific sites are commonly referred to as the “epigenetic clock” (Horvath 2013; Teschendorff et al. 2013). While epigenetic drift and the epigenetic clock are both related to age, epigenetic drift represents the tendency for increasing discordance between epigenomes over time (Figure 1.3; Jones et al. 2015). Early indications of epigenetic drift were found in cell culture studies. Humpherys et al. observed that clones of a single cell line became epigenetically divergent upon multiple passages (Humpherys et al. 2001). The concept of epigenetic drift is also used to describe the increase in discordance of DNAm between monozygotic twins during their lifespan (Fraga and Esteller 2007; Poulsen et al. 2007).

Figure 1.3 Epigenetic drift vs. epigenetic clock. A specific CpG site may be undergoing either epigenetic drift (left) or the epigenetic clock (right). The two phenomena have different characteristics when examined across a population. Epigenetic drift and epigenetic clock are shown for the DNAm changes of one exemplary CpG site during aging. Each colored line represents one individual. While a CpG site subject to epigenetic drift reveals donor-dependent correlations with age, the age correlation of epigenetic clock CpG sites is consistent across individuals. (Adapted from Jones et al. 2015)

The highly reproducible changes in DNAm of the epigenetic clock have been extensively used for the generation of epigenetic age predictors which estimate a donor’s age based on DNAm levels.

1.1.3 DNA Methylation as a Biomarker for Aging

In a large-scale investigation using the Illumina HumanMethylation450 BeadChip platform, Hannum et al. analyzed methylation levels of 650 blood DNA samples from donors aged 19 to 101 years (Hannum et al. 2013). A correlation with age was found for approximately 15% of CpG sites. An age prediction model was generated using a set of 71 CpG sites, allowing age prediction with a mean error of 3.9 years. Of the 71 CpG sites chosen, most were associated with genes involved in age-related processes, such as DNA damage, oxidative stress and tissue degradation. Notably, this age predictor is only applicable to blood samples. To build a tissue-independent age predictor, Horvath performed a bioinformatic analysis of nearly 8000 samples from 51 tissues and cell types. A set of 353 CpG sites gave rise to a model for age prediction with a median absolute deviation of only 3.6 years among the different tissues (Horvath 2013).

Although these genome-wide age predictors are very stable and less prone to outliers, their application is rather time-consuming and cost-intensive. Thus, researchers are developing age predictors based on only a few CpG sites. For the prediction of epigenetic age of blood samples, Weidner et al. described an “aging signature” based on only 3 CpG sites associated with the genes integrin alpha 2b (ITGA2B), aspartoacylase (ASPA), and phosphodiesterase 4C (PDE4C), which will be applied in parts of this thesis. They achieved prediction accuracies with a MAD of just 4.5 years (Weidner et al. 2014).

However, aging rates are heterogeneous among individuals. Genetic predisposition can affect the biological and epigenetic age; for example, women are known to have a longer average lifespan than men. The majority of all super-centenarians that have reached their 110th year of life are women (Ashapkin et al. 2017). In addition, epigenetic age depends on environmental influences and can be accelerated by unhealthy habits or slowed down by healthy ones. Overall, the epigenetic age analysis of blood confirms the conventional wisdom regarding the benefits of a plant-rich diet, moderate alcohol consumption, physical activity and education, as well as the health risks of obesity and metabolic syndrome (Quach et al. 2017). These findings suggest that epigenetic age reflects biological rather than chronological age (Wagner 2017).

It is important to note that epigenetic age predictions do not work reliably in patients with clonal diseases. Lin and Wagner systematically analyzed age-associated DNAm patterns in different kinds of cancer. They found that, in contrast to non-malignant tissues, the epigenetic aging signatures hardly reflect the chronological age of cancer patients. They suggest that this may be at least partially attributed to the fact that cancer is a clonal disease that captures only the epigenetic make-up of the tumor-initiating cell (Lin and Wagner 2015).
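How an age predictor based on a few CpG sites can be fitted and evaluated is sketched below, assuming a simple multivariate linear model. The cohort is synthetic and all slopes, offsets and noise levels are hypothetical; they are not the coefficients of any published predictor. The accuracy measure is taken here to be the mean absolute deviation between predicted and chronological age, which is an assumption about how "MAD" is used.

```python
import numpy as np


def fit_age_model(dnam, ages):
    """Least-squares fit of: age = w0 + w1*site1 + w2*site2 + w3*site3."""
    design = np.column_stack([np.ones(len(ages)), dnam])
    weights, *_ = np.linalg.lstsq(design, ages, rcond=None)
    return weights


def predict_age(weights, dnam):
    """Apply the fitted linear model to a samples-by-sites DNAm matrix."""
    return weights[0] + dnam @ weights[1:]


def mean_absolute_deviation(predicted, actual):
    """Mean absolute difference between predicted and chronological age."""
    return float(np.mean(np.abs(predicted - actual)))


# Synthetic cohort (illustration only, not measured data).
rng = np.random.default_rng(0)
ages = rng.uniform(20, 90, size=200)
slopes = np.array([-0.004, -0.003, 0.003])   # hypothetical DNAm change per year
offsets = np.array([0.80, 0.70, 0.05])       # hypothetical DNAm level at age 0
dnam = offsets + np.outer(ages, slopes) + rng.normal(0, 0.02, size=(200, 3))

weights = fit_age_model(dnam, ages)
mad = mean_absolute_deviation(predict_age(weights, dnam), ages)
print(f"in-sample MAD: {mad:.1f} years")
```

The deviation is computed on the training samples here. A realistic evaluation would use an independent validation cohort, since in-sample errors tend to be optimistic.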

1.2 Clonal Development During Aging and Age-Associated Diseases

1.2.1 Age-Related Clonal Hematopoiesis

Mammalian hematopoiesis is maintained by the activation of hematopoietic stem and progenitor cells (Kondo et al. 2003). During human aging, the expansion of specific hematopoietic stem and progenitor cells results in clones that may contribute disproportionately and sustainably to the blood cell population. Aging hematopoiesis is the result of clonal selection at the level of the hematopoietic stem cells (HSCs). While previous studies described hematopoietic aging as the manifestation of uniform changes in lineage output with age, more recent studies suggest a model in which a pool of HSCs that are heterogeneous with respect to their self-renewal and differentiation capacities at birth are clonally selected over time (Chung and Park 2017; Park 2017). Under selective pressure during aging, the composition of the HSC pool changes, resulting in a change in the hematopoietic cell population and a myeloid bias. Single-cell analyses of human HSCs have shown that HSCs may exhibit either myeloid-biased, lymphoid-biased, or balanced lymphomyeloid differentiation potential (Biasco et al. 2016). Thus, aging hematopoiesis may be regarded as the result of selection pressures that drive alterations in the composition of HSC clones (Chung and Park 2017).

Accordingly, age-related clonal hematopoiesis (ARCH) is defined as the expansion of HSC clones harboring specific disruptive genetic variants in elderly individuals without a clear diagnosis of hematological malignancies (Figure 1.4; Shlush 2018). This phenomenon is also referred to as clonal hematopoiesis of indeterminate potential (CHIP). Although CHIP is frequently observed in the elderly, clonal evolution is, by definition, not restricted to aging (Steensma 2018).

Initially, skewed X inactivation was used to identify clonal hematopoiesis. The first study to describe somatic clonal expansions in chronic myeloid leukemia was published in 1967 by Fialkow et al. (Fialkow et al. 1967). The authors found that all hematopoietic cells harbored the same allele of the G6PD locus on the X chromosome, suggesting that all chronic myeloid leukemia cells were derived from a single cell and were hence clonal. The first study to describe age-related clonal hematopoiesis emerged from an analysis of clonal blood cells from a cohort of African women and their daughters in the 1980s. The authors observed a decreased frequency of heterozygous G6PD alleles in the mothers, but not in the daughters, and concluded that the heterozygous phenotype undergoes negative selection during aging (Hitzeroth and Bender 1981).

A major advance in understanding ARCH was made in 2012. Copy number variation and next generation sequencing analyses provided evidence that ARCH is associated with specific, leukemia-related somatic mutations (Busque et al. 2012; Jacobs et al. 2012; Laurie et al. 2012). Laurie et al. described a low level of clonal mosaicism in peripheral blood from birth until 50 years of age, after which it rapidly rises to 2-3% in the elderly. In addition, although only 3% of subjects with detectable clonal mosaicism had any record of hematological cancer before DNA sampling, those without a previous diagnosis showed an estimated tenfold higher risk of a subsequent hematological cancer (Laurie et al. 2012).

In 2014, three independent studies provided evidence of age-related accumulation of mutations in leukemia-related genes (DNMT3A, TET2, ASXL1) (Genovese et al. 2014; Jaiswal et al. 2014; Xie et al. 2014). Jaiswal et al. reported mutations in leukemia-associated genes in over 18% of healthy individuals between 90 and 108 years of age. The presence of a somatic mutation was associated with an increased risk of hematologic cancer, increased all-cause mortality and an increased risk of incident coronary heart disease. The authors concluded that ARCH is a common condition that is associated with an increased risk of hematologic cancer and all-cause mortality (Jaiswal et al. 2014). The finding that many ARCH-related mutations occur in epigenetic modifiers provides a possible link between the genetics of ARCH and changes in the epigenome of blood cells, specifically in cytosine methylation (Shlush 2018).

Figure 1.4 Schematic representation of age-related clonal hematopoiesis. Age-related clonal hematopoiesis is the clonal expansion of multipotent hematopoietic stem cells (HSCs). HSCs harboring leukemia-associated mutations gain a growth advantage and pass the mutations on to progenitor cells and finally to mature blood cells. It should be noted that clones with leukemia-associated mutations can also be found in healthy individuals. (Adapted from Shlush 2018)

Although, by definition, ARCH occurs in healthy elderly individuals, it is associated not only with leukemia but also with other age-related conditions, such as inflammation, type 2 diabetes and vascular diseases. However, it remains to be elucidated whether ARCH is the cause of these diseases or rather a marker due to its high correlation with aging (Pamukcu et al. 2010; Fuster et al. 2017). In addition, studies have so far not addressed the question of why some individuals without ARCH may exhibit a hematopoietic aging phenotype, or why individuals with clonal hematopoiesis may not exhibit aging phenotypes (Chung and Park 2017). Finally, ARCH plays an important role in the premalignant state of clonal diseases and could potentially be used in the future as a predictive tool for the development of hematological malignancies (Shlush 2018). Hence, advanced techniques to detect clonal hematopoiesis are needed.

1.2.2 DNA Methylation Changes During Malignant Transformation

Aging HSCs are not necessarily associated with changes that are sufficient to produce overt clinical manifestations. Many studies focus on the shift from non-malignant HSC aging to the development of myelodysplastic syndromes (MDS) (reviewed e.g. in Li et al. 2016 and Chung and Park 2017).

Figure 1.5 Clonal hematopoiesis during malignant transformation. HSCs can acquire leukemia-associated mutations during aging. These mutations increase the risk of malignant transformation and lead to clonal evolution. In malignant HSCs, reconstitution potential decreases, resulting in clinically relevant cytopenias. Malignant transformation is accompanied by changes in the DNAm pattern. (Adapted from Chung and Park 2017)

The most frequently mutated genes found in individuals with ARCH are epigenetic modifiers, including DNA methyltransferase 3A (DNMT3A), ten-eleven translocation 2 (TET2) and additional sex combs-like 1 (ASXL1) (Kulasekararaj et al. 2014; Yoshizato et al. 2015; Kunimoto and Nakajima 2017). Notably, these epigenetic modifiers are also recurrently mutated in myeloid malignancies. Somatic mutations of DNMT3A were first reported in adult AML cases (Ley et al. 2010; Yan et al. 2011). In addition, DNMT3A mutations, which occur in about 30% of AML cases, are associated with adverse prognosis and decreased overall survival in cytogenetically normal AML (Marcucci et al. 2012; Kunimoto and Nakajima 2017). These findings suggest an essential role of DNAm changes in the early phase of leukemia development (Figure 1.5).

In contrast to rather limited variations in somatic mutations, patients with AML exhibit profound and heterogeneous disruption of their cytosine methylation landscapes (Figueroa et al. 2010b). Figueroa et al. used DNAm signatures to classify AML into sixteen disease subtypes with distinct clinical features. Analyses of these epigenetic signatures together with somatic mutations revealed how genetic lesions can directly affect the epigenetic programming of hematopoietic cells (Figueroa et al. 2010a). The perturbation of DNAm landscapes in patients with hematological diseases is variable: some patients exhibit dominant hypomethylation, others show dominant hypermethylation, and yet others show signatures with intermediate hypo- and hypermethylation, indicating heterogeneity of DNAm changes during malignant transformation (Figueroa et al. 2010b).

Epigenetic heterogeneity is not random but occurs predominantly at specific hypervariable hotspots (Hansen et al. 2011). It is possible that sites of focal epigenetic variance allow the cells to sample different transcriptional states, thus resulting in greater evolutionary fitness. In this regard, focal “epialleles” can be attributed a similar significance to genetic alleles and perhaps follow similar subclonal distributions (Li et al. 2016b). In 2014, Landau et al. reported inferior clinical outcome of chronic lymphocytic leukemia patients with increased DNAm heterogeneity (Landau et al. 2014), supporting the clinical relevance of DNAm variability in hematological diseases. However, the mechanisms and consequences of epigenetic heterogeneity in the context of MDS and AML remain to be elucidated. Improved methods to track clonal populations of cells in hematological diseases are required to gain a full understanding of this phenomenon (Alizadeh et al. 2015).
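The notion of epiallele heterogeneity discussed above can be quantified in several ways. One option, shown here as a minimal sketch, is the Shannon entropy of the methylation patterns observed across sequencing reads of a locus. This is not necessarily the measure used in the cited studies. The read encoding (one string per read, with `1` for a methylated and `0` for an unmethylated CpG) and the function name are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def epiallele_entropy(reads):
    """Shannon entropy (in bits) of methylation patterns across reads.

    Each read is a string over {'0', '1'} describing successive CpG sites
    on one DNA strand (hypothetical encoding: 1 = methylated).
    """
    if not reads:
        raise ValueError("at least one read is required")
    counts = Counter(reads)
    total = len(reads)
    # Sum over distinct patterns (epialleles) weighted by their frequency.
    return -sum((n / total) * log2(n / total) for n in counts.values())


# A population dominated by one pattern has low entropy,
# while an even mixture of patterns has high entropy.
clonal_like = ["1111"] * 8
mixed = ["0000", "0011", "1100", "1111"]
print(epiallele_entropy(clonal_like), epiallele_entropy(mixed))
```

Under this measure, a sample dominated by a single pattern has an entropy near zero, whereas a sample containing many equally frequent patterns approaches the maximum for the number of patterns observed.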

1.2.3 Inherited Age-Associated Diseases and Telomeropathies

Due to their association with ARCH, hematological malignancies can be regarded as age-associated diseases but need to be distinguished from premature aging diseases. Premature aging-related phenotypes characterize a number of rare genetic disorders known as progeroid syndromes. These disorders are mainly caused by mutations in lamin proteins or in DNA repair enzymes (Ciccarone et al. 2018). However, in some progeroid syndrome patients, no genetic aberration can be detected, indicating that epimutations may drive the premature aging phenotype. The most common form of progeroid syndromes is the Hutchinson-Gilford Progeria Syndrome (HGPS). HGPS is an extremely rare genetic disorder affecting about one per four to eight million live births (Pollex and Hegele 2004). While children with HGPS appear healthy at birth, they develop distinctive clinical features during early childhood. These symptoms include severe growth retardation, usually associated with skeletal alterations, loss of subcutaneous fat and skin appendages, and a delay of some developmental processes (Coppede 2013). A majority of patients die in their early teens from heart attacks and strokes caused by progressive atherosclerotic disease (Hennekam 2006). Another class of clinical disorders associated with premature aging phenotypes includes telomeropathies (reviewed in Townsley et al. 2014). In contrast to progeroid syndromes, telomeropathies are caused by mutations in telomere-associated genes and are characterized by malfunctioning telomere maintenance. Physiologically, telomere loss eliminates cells with a long proliferative history as well as cells at risk for replication-dependent adverse genetic events (Hackett and Greider 2002; Townsley et al. 2014). Telomerase, which maintains the telomeres, is a ribonucleoprotein composed of two core components: the telomerase reverse transcriptase (TERT) as catalytic component and the RNA component TERC, which acts as a template. Without telomerase, telomeres shorten with each successive round of replication. When a critical length is reached, the cells enter senescence. Telomerase activity is mainly restricted to cells such as germ cells, stem cells and their immediate progeny, activated T cells and monocytes (Blasco 2007).


Figure 1.6 Components of the telomerase complex commonly found mutated in telomeropathies. The telomerase complex, consisting of the enzymatic component TERT, the RNA component TERC, Dyskerin and additional associated proteins, adds a species-dependent telomere repeat sequence to the 3' end of telomeres. Heterozygous mutations of TERT, TERC and components of the shelterin complex are found in aplastic anemia and dyskeratosis congenita patients. Hemizygous mutation of Dyskerin is a marker for X-linked DKC.

Telomere deficiencies were first linked to clinical disease through bone marrow failure syndromes (Figure 1.6). In 1998, Ball et al. described shortened telomeres in the leukocytes of acquired aplastic anemia (AA) patients (Ball et al. 1998), followed by the discovery of mutated telomerase genes in an inherited AA called dyskeratosis congenita (DKC) (Heiss et al. 1998; Knight et al. 1999; Mitchell et al. 1999; Vulliamy et al. 2001). Aplastic anemia is the exemplar of bone marrow failure syndromes and is a rare disorder with an incidence of approximately two to three cases per million per year (Montane et al. 2008). It is characterized by marrow hypocellularity resulting in peripheral cytopenias. The molecular pathogenesis of AA is not fully understood. However, an antigen-driven and autoimmune dysregulated T-cell homeostasis is implicated. Another factor for AA development is defective telomere maintenance, as evidenced by recurring mutations in telomerase complex genes such as TERT and TERC (Shallis et al. 2018). A central problem in AA is the development of MDS and AML late in the disease course, which occurs in approximately 15% of cases over a decade (Maciejewski et al. 2002). Clonal evolution into MDS is accompanied by cytogenetic abnormalities, usually monosomy 7, but also trisomy 8 and a variety of chromosome translocations. Calado et al. examined the impact of telomere content on clinical outcomes in a large cohort of almost 200 patients with severe AA. They found that telomere attrition was a major risk factor for clonal evolution, with estimated rates five to six times higher than in patients with higher telomere content. Interestingly, bone marrow cells from patients collected years prior to clinical development of MDS already showed telomere loss and chromosomal aberrations (Calado et al. 2012). In contrast to AA, DKC is a form of inherited bone marrow failure. Children with DKC have an up to 100-fold increased risk of developing AML (Alter et al. 2010). The bone marrow in DKC is hypocellular and aplastic, leading to pancytopenia. With severe pancytopenia, the bone marrow is indistinguishable from that of patients with acquired AA. Mutations in the DKC1 gene cause X chromosome linked dyskeratosis. Affected male patients frequently show the classic dermatologic triad of dystrophic nails, leukoplakia and hypopigmented skin (Heiss et al. 1998). Loss of dyskerin function destabilizes the telomerase repair complex, thus leading to telomere attrition. Autosomal DKC is more heterogeneous than X chromosome linked dyskeratosis. To date, heterozygous mutations in three genes (TERC, TERT and TINF2) have been identified, providing a link between DKC and telomerase (Vulliamy et al. 2001; Dokal 2011). The identification of mutations in DKC1 and TERC established telomeropathy as the principal underlying cause of DKC (Figure 1.7).


Figure 1.7 Pathophysiology of dyskeratosis congenita. Mutations in telomerase and shelterin components (primary defects) cause excessive telomere attrition. This leads to premature cell death and chromosome instability. With increasing age, stem cell pools become exhausted, causing bone marrow failure and hematological as well as non-hematological cancers. (Adapted from Dokal 2011)

Due to heterogeneity in clinical features, care needs to be taken in the diagnosis of DKC and AA. Measurement of average telomere lengths within peripheral blood leukocytes is used for disease screening. The standard assay to measure telomere lengths is the terminal restriction fragment length analysis by Southern blot. However, this method is difficult to quantify and inappropriate for high-throughput screening (Aubert et al. 2012). As quick, cost-effective and standardized alternatives, fluorescence in situ hybridization (FISH) analysis of telomere repeats by flow cytometry and quantitative polymerase chain reaction (qPCR) were developed for clinical samples. These methods are based on the comparison of patient samples to age-matched controls (Diez Roux et al. 2009). It is important to note that in cohorts of patients lacking typical features of DKC, the sensitivity and specificity are less established. In 194 patients with identified DKC-associated gene mutations (DKC1, TINF2, TERT, and TERC), telomere lengths differed between disease subtypes as measured by qPCR and Southern blot analyses of terminal restriction fragments (Vulliamy et al. 2011). In addition, mutations in telomere-associated genes are not reported consistently for all DKC patients.


To assist DKC diagnosis, Weidner et al. developed an epigenetic biomarker based on DNAm that is independent of mutations and telomere length (Weidner et al. 2016). The authors identified a CpG site in the promoter region of the short transcript of the PRDI-BF1 and RIZ homology domain containing protein 8 (PRDM8) gene to be significantly hypermethylated in DKC and also in AA patients. Of note, this hypermethylation was associated with a decrease in PRDM8 expression.

1.2.4 The PRDM Family

The PRDM family plays an important role in stem cell fate and a wide range of developmental processes, as well as in malignant transformation (Morishita 2007; Fog et al. 2012). It is characterized by a conserved N-terminal domain. This domain was originally identified in two proteins, PRDI-BF1 (positive regulatory domain I-binding factor 1) and RIZ1 (retinoblastoma protein-interacting zinc finger gene 1) (Buyse et al. 1995), and named the PR (PRDI-BF1-RIZ1 homologous) domain. PR domains are related to the catalytic SET domains which represent a large group of histone methyltransferases (Huang et al. 1998; Schneider et al. 2002; Sun et al. 2008; Wu et al. 2010). However, histone methyltransferase activity is only described for PRDM2, PRDM8 and PRDM9 (Hohenauer and Moore 2012). The PR domain is followed by repeated zinc finger domains for DNA binding (Fumasoni et al. 2007; Kinameri et al. 2008). Non-catalytic PRDM proteins carry out their regulatory function by recruiting cofactors to enhancer regions that, in turn, associate with histone-modifying enzymes (Figure 1.8).


Figure 1.8 PRDM family domain structure and relationships. The domain structure for each human PRDM family member is illustrated, along with the relationships between their PR domains. Only the longest reported isoform is shown. Orange lines highlight the PRDMs conserved in C. elegans and Drosophila; a putative Prdm9 without zinc fingers has been reported in C. elegans. PRDM11 alone does not contain zinc fingers; instead, it has a smaller protein-protein interaction motif known as a zinc knuckle that is also present in several other family members. Protein features marked with asterisks are derived from UniProt. CtBP: C-terminal binding protein, KRAB: Krüppel-associated box, Rb: retinoblastoma, SSX: synovial sarcoma X (Hohenauer and Moore 2012). (With kind permission of Development)

This study focuses on PRDM8 because it showed the highest differential DNAm and decreased expression in the premature aging disease dyskeratosis congenita, which points to epigenetically controlled gene expression (Weidner et al. 2016). Especially in the human system, little is known about the physiological role of PRDM8. In mice, Prdm8 methylates H3K9 of histones, indicating transcriptional repression activity (Eom et al. 2009). PRDM8 gene expression is regulated in the developing central nervous system and during neuronal circuit assembly in mice. Mice lacking Prdm8 have cellular and behavioral abnormalities including axonal mistargeting by neurons of the dorsal telencephalon and abnormal itch-like behavior (Komai et al. 2009; Ross et al. 2012). Prdm family genes are expressed at high levels in a spatially and temporally restricted manner in the developing murine telencephalon. To date, the regulation and function of PRDM8 expression during human neural development are unclear.


1.3 Stem Cells

Stem cells are defined as cells that have the potential for unlimited or prolonged self-renewal, as well as the ability to give rise to at least one type of mature, differentiated cells (Weissman 2000). This definition of “stemness” applies to all stem cells. However, embryonic and adult stem cells need to be considered separately, as they differ significantly, e.g. in their differentiation potential and thus in their capability to contribute to clinical research and applications (Chagastelles and Nardi 2011). Embryonic stem cells (ESCs) are pluripotent cells that can be isolated from the blastocyst, which forms approximately five days after fertilization. They are able to give rise to all cell types of the organism (Edwards 2001). In contrast, multipotent adult somatic stem cells can be isolated from various tissues throughout life. They are rare, quiescent cells with a limited self-renewal and differentiation capacity. Adult stem cells have been isolated and described for a wide range of body compartments including the hematopoietic, epithelial, muscular, intestinal and neural systems (Alison and Islam 2009).

1.3.1 Induced Pluripotent Stem Cells

A milestone in human stem cell research was reached in 2007. Building on an earlier study in mice (Takahashi and Yamanaka 2006), Takahashi et al. and Yu et al. described the reprogramming of human somatic cells into so-called induced pluripotent stem cells (iPSCs) by expression of four transcription factors: OCT4, SOX2, KLF4 and c-MYC. iPSCs share characteristics of human ESCs, such as unlimited replication potential and the capability to form any cell type of the human body (Takahashi et al. 2007; Yu et al. 2007). However, Deng et al. demonstrated differences in DNAm between ESCs and iPSCs in eight functional gene categories related to ion transport, transcription, metabolic and developmental regulations (Deng et al. 2009). Interestingly, studies reported epigenetic memories of donor cells in derived iPSCs, an issue that needs to be considered when performing genome-wide epigenetic analyses and during re-differentiation experiments with iPSCs (Lister et al. 2011; Ohi et al. 2011). One of the major drawbacks of early iPSC technology was the integration of retroviral vectors, which led to an increased risk of tumorigenicity. The use of episomal plasmids removed the need for viral transduction and provided a method for generating integration-free iPSCs (Yu et al. 2009). In combination with xeno-free culture media, the iPSC technology gave rise to a new era of clinical research (Chen et al. 2011).

1.3.2 Induced Pluripotent Stem Cells as a Model System in Clinical Research

The prevalence of age-associated diseases is increasing due to population aging and improved treatment options for patients with chronic or clonal diseases. The development of new therapeutic options requires a detailed understanding of molecular and physiological traits of the disease. However, in vitro cultures and animal models recapitulate only some of the underlying mechanisms of human diseases. The advent of iPSCs gave rise to new perspectives for personalized medicine and promises to advance drug screening and development (Mullard 2012; Scannell et al. 2012). Patient-derived iPSCs provide unprecedented human models to study disease pathology in the context of different genetic backgrounds (Figure 1.9). The alleviation of disease-specific phenotypes in iPSCs and their derivatives may indicate new treatment options for the donor. To date, several studies demonstrated that cells derived from differentiated stem cells affected by a specific disease can recapitulate disease traits in vitro, thus providing proof of concept that human iPSCs can be used to model genetic diseases (Matsa et al. 2011; Bellin et al. 2012). When using iPSCs as a model system, special care needs to be taken to apply the appropriate controls during experimental procedures. One control consists of human iPSC lines in which the candidate gene mutation has been corrected (Liu et al. 2011b; Yusa et al. 2011). In case of dominant-active mutations, knockout of the mutated gene must be performed. To generate isogenic controls, genome editing technologies are a prerequisite. The invention of the CRISPR/Cas9 technology in 2012 enabled targeted genome editing of iPSCs (Jinek et al. 2012). The main advantage of the CRISPR technology is that it is less expensive and more efficient than other genetic engineering techniques such as zinc-finger nucleases (ZFN) or transcription activator-like effector nucleases (TALEN) (Antony et al. 2018; Jiang et al. 2018; Xu et al. 2018). Genome editing by CRISPR is guided by RNA sequences, which are simpler to design than the complex proteins on which ZFNs and TALENs are based (DiCarlo et al. 2018). Despite its specificity and efficiency, the CRISPR technology is so far only used in one active and ten recruiting clinical interventional studies (clinicaltrials.gov, retrieved on 2019-01-18).


Figure 1.9 Human iPSC applications in clinical research. Adult somatic donor cells can be reprogrammed into iPSCs by a variety of experimental procedures, including lentiviral, adenoviral or episomal approaches. After inducing differentiation in vitro, human iPSCs give rise to specialized cells that can be subsequently used for various applications. a) Human iPSCs can be used in disease modelling to understand the molecular mechanisms underlying disease phenotypes. b) They can also be used for drug screening and discovery, to determine the effects of candidate drugs and to identify target pathways. c) Human iPSCs are also valuable for toxicity tests to assess cellular toxic responses of various tissues. The combination of drug screening and toxicity tests allows the introduction of “the patient” in early stages of the drug discovery and development process (Bellin et al. 2012). (With kind permission of SpringerNature)

In translational clinical research, CRISPR is used to replace disease-specific mutations (Antony et al. 2018; Artero Castro et al. 2018; Ma et al. 2018). In addition, the CRISPR/Cas9 technology can be used to generate mutations in iPSCs derived from healthy individuals to mimic disease-specific genotypes. Such disease models are of special interest for research in the field of neurology. As brain tissue samples are hardly accessible through biopsies, the generation of iPSC-derived neural derivatives allows experiments that, until recently, were inconceivable. To date, diseases like Huntington’s disease, Alzheimer’s disease, Parkinson’s disease, amyotrophic lateral sclerosis and schizophrenia are modeled via iPSC differentiation (Park et al. 2008; Devine et al. 2011; Mitne-Neto et al. 2011; Israel et al. 2012). Recent neural differentiation protocols result in reasonably homogeneous populations of neural subtypes. Nevertheless, these protocols need to be widely tested and validated (Kriks et al. 2011; Shi et al. 2012).


With regard to clinical trials, iPSCs significantly decrease the risk of immunological rejection and also circumvent the ethical criticism raised by the use of human ESCs. Multiple studies have found that administration of iPSCs in preclinical models safely ameliorates, for example, retinal function (Tucker et al. 2011; Li et al. 2012; Jiang et al. 2018). However, iPSCs are not yet widely used for clinical studies due to increased genomic instability and risk of teratoma formation (Volarevic et al. 2018). To date, there are only two active or recruiting interventional clinical studies using iPSC derivatives for cell therapy (clinicaltrials.gov, retrieved on 2019-01-18). A major concern is the occurrence of genetic and epigenetic abnormalities. Studies on chromosome numbers (Mayshar et al. 2010), copy number variations (Hussein et al. 2011; Laurent et al. 2011) and point mutations (Gore et al. 2011) in iPSCs and ESCs revealed that reprogramming and subsequent expansion of iPSCs in culture can lead to the accumulation of diverse abnormalities at the chromosomal and single-base level (Pera 2011). In addition, Lister et al. examined epigenome-wide DNA methylation of ESCs and iPSCs at single-base level and described aberrant methylation of CG dinucleotides and abnormalities in non-CG methylation (Lister et al. 2011). These studies demonstrate that iPSCs display more genetic and epigenetic abnormalities than ESCs. Chromosomal abnormalities appear early during the culture of iPSCs, and the frequency of mutations in iPSCs was estimated to be ten times higher than in fibroblasts (Gore et al. 2011; Laurent et al. 2011; Pera 2011). Nevertheless, iPSCs are a valuable tool to identify and modulate pathways involved in a variety of diseases (Lee et al. 2018; Tanigawa et al. 2018; Gorabi et al. 2019).


1.4 Objectives

Highly reproducible changes in DNA methylation (DNAm) can be used for epigenetic age predictions of various cell types and tissues. However, the tissue of origin is a major source of DNAm differences. To enable age predictions for buccal swabs, which are composed of leukocytes and epithelial cells, we compensate for differences in the cellular composition by developing a model to predict the proportions of leukocytes and epithelial cells based on the DNAm levels of only two cell type specific CpG sites. These cell type specific CpG sites are combined with age-associated CpG sites into a new multivariate model to improve epigenetic age predictions of buccal swab samples.

While epigenetic age predictors are applicable to healthy donors, several studies reported inadequate age predictions of samples from patients with hematological malignancies due to changes in the epigenetic landscape. To investigate the variability of DNAm in age-associated CpG sites during malignant transformation, we focus on neighboring CpG sites in PDE4C. The differences in DNAm pattern variability of blood samples from healthy and diseased donors will be assessed. By investigating DNAm patterns of longitudinal samples, we intend to provide further evidence that the DNAm patterns of AML and MDS patients resemble the epigenetic make-up of the malignant clone. In addition, clonal hematopoiesis is identified by training machine learning algorithms on DNAm patterns generated by barcoded bisulfite amplicon next generation sequencing of AML and control samples. This work demonstrates how the classification of patient-specific DNAm patterns by machine learning approaches can be used to quantify the prevalence of malignant clones.

To gain further insight into the functional consequences of differentially methylated genes in age-associated diseases, we modify the expression of PRDM8 in induced pluripotent stem cells using the CRISPR/Cas9n technology. This gene was selected because it revealed the highest differential DNAm and decreased expression in the premature aging disease dyskeratosis congenita and its functional relevance is largely unclear. Spontaneous and directed differentiation is performed to investigate the effect of PRDM8 expression on stem cell differentiation potential. We aim to understand whether changes in the differentiation potential can be explained by PRDM8-dependent differential DNAm.

Taken together, this thesis aims to improve our understanding of DNAm changes during aging and in age-associated diseases and how these changes can be used for forensic or clinical applications.

Materials and Methods

2 Materials and Methods

All samples were taken after written consent and according to the guidelines of the local ethics committees.

2.1 Molecular Biology

2.1.1 DNA Isolation

For isolation of DNA, up to 2 million cells were detached with accutase (StemCell Technologies) at 37°C for 5 – 10 min. Accutase breaks up cell colonies into single cells which were collected with KO-DMEM. Cells were centrifuged at 10 000 g for 2 min and the supernatant was removed. Cell pellets can be stored at -80°C until further processing. DNA was isolated from cultured cells using the NucleoSpin Tissue Kit (Macherey-Nagel). For DNA isolation of blood samples, the QIAamp DNA Blood Mini Kit (Qiagen) was used. The concentration of elution products was quantified by spectrophotometry using the NanoDrop2000 (ThermoFisher-Scientific). DNA was stored at -20°C.

2.1.2 Polymerase Chain Reaction

All polymerase chain reactions (PCR) were performed with the PyroMark PCR Kit (Qiagen). This hot-start polymerase is specifically optimized for pyrosequencing analysis but provides highly specific and unbiased amplification of template DNA for various applications. Standard protocols were applied for all samples. Primer specificity was ensured by performing annealing temperature gradients and testing for specific amplification products on agarose gels.

Table 2.1 PCR reaction mix
Reagent (stock concentration)    Amount
PyroMark master mix (2x)         12.2 µl
CoralLoad (10x)                  2.5 µl
Primer forward (10 µM)           1 µl
Primer reverse (10 µM)           1 µl
Template DNA                     50 - 100 ng
Aqua dest.                       ad 25 µl


Table 2.2 Temperature profile PCR
Temp [°C]    Time      Cycles
95           15 min
95           30 sec    32x
52-58        30 sec    32x
72           30 sec    32x
72           10 min

2.1.3 Agarose Gel Electrophoresis

For quality control of DNA and to ensure specificity of PCR products, agarose gel electrophoresis was performed. Samples were mixed with a loading dye (New England Biolabs, NEB) and separated in a 1.5% agarose gel in 0.5x TAE buffer. The 100 bp ladder from NEB was used as size standard. The samples were separated in the gel for 40 min at 120 V. Bands were visualized with UV light using a GelDoc (BioRad).

Table 2.3 TAE buffer (50x)
Reagent (stock concentration)    Amount
Tris base                        242 g
Acetic acid                      57.1 ml
EDTA (0.5 M)                     100 ml
Aqua dest.                       ad 1 l

2.1.4 Bisulfite Conversion

As a first step to analyze methylation frequencies in DNA samples, bisulfite conversion was performed. During bisulfite conversion, unmethylated cytosines become sulfonated at the C6 atom by sodium bisulfite. Deamination and subsequent desulfonation generate uracils. Methylated cytosines do not undergo sulfonation and remain 5-methyl-cytosine (Figure 2.1).


Figure 2.1 Bisulfite conversion of genomic DNA. Cytosines are converted to uracils in a three-step process including sulfonation, hydrolytic deamination and desulfonation. 5-methyl-cytosine is not susceptible to sulfonation and will not be replaced by uracil.

By detecting either cytosines or uracils via pyrosequencing or next generation sequencing (NGS), the methylation frequency can be determined.
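The principle can be illustrated with a minimal Python sketch. It is not part of the protocol used in this thesis: the function names and the short template sequence are hypothetical, and the sketch assumes complete conversion, so that every unmethylated cytosine is read as thymine after PCR (uracil is amplified as thymine), while methylated cytosines are still read as cytosine.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence as read after bisulfite conversion and PCR.

    Unmethylated C is converted to U and amplified as T; a methylated C
    (0-based index listed in methylated_positions) remains C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


def methylation_frequency(reads, position):
    """Fraction of C calls (methylated) among all C/T calls at one position."""
    calls = [read[position] for read in reads if read[position] in "CT"]
    if not calls:
        raise ValueError("no informative reads at this position")
    return calls.count("C") / len(calls)


# Hypothetical template with CpG cytosines at index 1 and index 5.
template = "ACGTTCGA"
reads = [
    bisulfite_convert(template, {1, 5}),  # both CpGs methylated
    bisulfite_convert(template, {1}),     # only the first CpG methylated
    bisulfite_convert(template),          # unmethylated strand
    bisulfite_convert(template, {1}),
]

for cpg in (1, 5):
    print(cpg, methylation_frequency(reads, cpg))
```

In this toy example, three of four strands carry a cytosine at the first CpG and one of four at the second, which corresponds to the methylation frequency reported per CpG site.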

2.1.5 Pyrosequencing

Pyrosequencing was used in this study to measure DNAm levels of specific age-associated or cell type specific CpG sites. The benefit of pyrosequencing is the capability to measure minor changes in DNAm levels in a simple and cost-effective manner. For this purpose, bisulfite-converted DNA was amplified via PCR using the PyroMark kit as described in 2.1.2 with an annealing temperature of 52°C and 40 cycles. One primer of the primer set is biotinylated to enable binding of the PCR product to streptavidin/sepharose beads (Figure 2.2).


Figure 2.2 Pyrosequencing assays for cell type specific CpG sites. Shown are the pyrosequencing assays for the cell type specific CpG sites in the CD6 and SERPINB5 genes. The sequence is shown after in silico bisulfite conversion. Red letters indicate the site of sequencing primer binding. Yellow regions show the location of CpG sites. Indicated are the positions of the cell type specific CpG sites identified from the Infinium HumanMethylation 450K BeadChip. (Eipel et al. 2016)

Using the vacuum prep tool of the PyroMark ID sequencer (Biotage), the bead-bound amplification products are linearized and denatured by successive treatment with 70% ethanol and 0.5 M NaOH. After washing in PyroMark Wash Buffer (Qiagen), the biotinylated single strands are transferred to the sequencing primer and the run can be started.


Figure 2.3 Pyrosequencing reaction. Complementary deoxyribonucleoside triphosphates are added to the growing DNA chain by the polymerase, releasing pyrophosphate (PPi) to the reaction. The sulfurylase converts PPi to ATP in the presence of adenosine 5´ phosphosulfate. ATP serves as a substrate for the luciferase to convert luciferin into oxyluciferin, which emits light proportional to the amount of generated ATP. Light signals can be detected by a camera and visualized as peaks in pyrograms. The apyrase degrades excess deoxyribonucleoside triphosphates.

The dispensation order of nucleotides is generated by the Pyro Q-CpG software (Biotage) and optimized for the respective sequences. Bisulfite conversion controls were added to the dispensation order to test for incomplete bisulfite conversion. The light emission is analyzed by the Pyro Q-CpG software (Biotage) and represented as peaks. Pyrosequencing of the age-associated regions in ITGA2B, ASPA, and PDE4C and of the region in DNMT3A of all AML and MDS patients was performed by Cygenia GmbH. Primer sequences for the pyrosequencing assays are provided in Table 2.4.


Table 2.4 Primers for pyrosequencing
Primer          Sequence (5’ – 3’)
ASPA Fw         biotin-ATTATTTGGTGAAATGATT
ASPA Rv         CAACCCTATTCTCTAAATCTC
ASPA Seq        CCCTATTCTCTAAATCTCA
ITGA2B Fw       biotin-TAATTTTTTTTGGGTGATG
ITGA2B Rv       ACCAAAAATAAACAATATACTCAAT
ITGA2B Seq      CAATATACTCAATACTATACCT
PDE4C Fw        AGGTTTGTAGTAGGTTGAG
PDE4C Rv        biotin-AACTCAAATCCCTCTC
PDE4C Seq       GTTATAGTATGATTAGAGTTT
CD6 Fw          biotin-AGTATAGGTAGTTGGGGTTTTTTTTATTAGTTTTTGTA
CD6 Rv          CCAAATCTACTCTACCCTTTACTATTCTTATTCCTAT
CD6 Seq         CCTATATCTCTCTCTACTCTCTCC
SERPINB5 Fw     ATTGTGGATAAGTTGTTAAGAGGTTTGAGTAGG
SERPINB5 Rv     biotin-AAACAAACAAACCAAAAACACAAAAACCTAAATAT
SERPINB5 Seq    GGTGTTGTTTAGGTGAGTT
PRDM8 Fw        biotin-GGGGTTGTTTATTGTTAGTAATATTGTATAAAAGGAGGA
PRDM8 Rv        ACCCCGCTCTAAACCCAAATTCTT
PRDM8 Seq       GCCTACCCTAAAAATATACC

2.1.6 Barcoded Bisulfite Amplicon Next Generation Sequencing

To detect methylation frequencies and DNAm patterns with single-strand resolution, a self-developed barcoded bisulfite amplicon next generation sequencing (BBA-Seq) protocol was applied. Genomic DNA was isolated and bisulfite converted as described in 2.1.1 and 2.1.4. DNA from different donors and for specific genomic locations was amplified in a first PCR reaction (32 cycles, annealing temperature 56°C) using gene-specific primers containing handle sequence overhangs. All amplicons for the same donor were pooled equimolarly and primers were removed via Agencourt AMPure XP Beads (Beckman Coulter). DNA was amplified in a second PCR (16 cycles, annealing temperature 56°C) using primers binding to the handle sequence and containing donor-specific barcodes as well as Illumina adapter sequences. Samples were pooled, and amplicons and primers smaller than 200 bp were removed by using the Select-a-Size DNA Clean & Concentrator (Zymo). The library was diluted to 10 pM according to the MiSeq System Denature and Dilute Libraries Guide and 20% PhiX DNA (Illumina) was spiked in to account for the low complexity of the library. The runs were performed using the MiSeq Reagent Nano Kit v2 (500-cycles) and the MiSeq Reagent Kit v2 (500-cycles) on a MiSeq System (Illumina).
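What is meant by DNAm patterns with single-strand resolution can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the analysis pipeline of this thesis: reads are assumed to be already demultiplexed, aligned and trimmed to the amplicon, and the function names, example reads and CpG positions are hypothetical.

```python
from collections import Counter


def methylation_pattern(read, cpg_positions):
    """Encode one bisulfite read: 1 = C (methylated), 0 = T, - = other base."""
    symbols = {"C": "1", "T": "0"}
    return "".join(symbols.get(read[pos], "-") for pos in cpg_positions)


def pattern_frequencies(reads, cpg_positions):
    """Relative frequency of each complete methylation pattern among reads."""
    patterns = [methylation_pattern(read, cpg_positions) for read in reads]
    # Reads with an uninformative call at any CpG are excluded.
    complete = [p for p in patterns if "-" not in p]
    counts = Counter(complete)
    return {pattern: n / len(complete) for pattern, n in counts.items()}


# Hypothetical aligned reads covering two neighboring CpGs (index 1 and 5).
example_reads = ["ACGTTCGA", "ACGTTTGA", "ATGTTTGA", "ACGTTTGA"]
example_patterns = pattern_frequencies(example_reads, [1, 5])
print(example_patterns)
```

In contrast to the averaged methylation frequency per CpG site, each read contributes one binary string here, so that the co-occurrence of methylation at neighboring CpG sites on the same DNA strand is preserved.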

Figure 2.4 Library preparation for BBA-Seq. Regions of interest are amplified via specific primers containing handle sequence overhangs. Amplicons of the same donor are pooled and primers are removed by magnetic bead purification. In a second PCR, Illumina adapters and donor-specific barcodes are added to the amplicons. All samples are pooled in equimolar amounts and primers are removed via size exclusion purification.

Table 2.5 Primers for Next Generation Sequencing
Primer               Sequence (5’ – 3’)
PDE4C_handle_Fw      CTCTTTCCCTACACGACGCTCTTCCGATCTTATGGAGAATTTGGGG
PDE4C_handle_Rv      CTGGAGTTCAGACGTGTGCTCTTCCGATCTCTACAAAACCCCTACC
Barcoded adapter     CAAGCAGAAGACGGCATACGAGANNNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
Universal adapter    AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT


2.1.7 RNA Isolation

RNA isolation was performed using the NucleoSpin RNA Plus kit (Macherey-Nagel) according to the manufacturer's instructions. The RNA was eluted in 60 µl nuclease-free water and quantified using the NanoDrop2000 (ThermoFisher-Scientific). RNA was stored at -80°C.

2.1.8 cDNA Synthesis and Semi-qPCR

Up to 500 ng of RNA was reverse transcribed using the High Capacity cDNA Reverse Transcription Kit (ThermoFisher-Scientific).

Table 2.6 cDNA synthesis reaction mix
Reagent (stock concentration)                  Amount
RT Buffer (10x)                                2.0 µl
dNTPs (100 mM)                                 0.8 µl
RT random primer (10x)                         2.0 µl
MultiScribe reverse transcriptase (50 U/µl)    1.0 µl
RNase Inhibitor (20 U/µl)                      1.0 µl
RNA                                            500 ng
Nuclease-free water                            ad 10 µl

Table 2.7 Temperature profile for cDNA synthesis
Temp [°C]    Time [min]
25           10
37           120
85           5

Semi-quantitative real time polymerase chain reaction (semi-qPCR) was performed using the PowerSYBR® Green PCR Master Mix (ThermoFisher-Scientific). The real time PCR was performed with the StepOnePlus Real-Time PCR System (ThermoFisher-Scientific). Primer sequences are provided in Table 2.10.

Table 2.8 semi-qPCR reaction mix
Reagent (stock concentration)    Amount
SYBR Green PCR Master Mix        5.0 µl
Primer forward (10 mM)           1.0 µl
Primer reverse (10 mM)           1.0 µl
cDNA                             10-20 ng
Nuclease-free water              ad 10 µl

Table 2.9 Temperature profile for semi-qPCR
Temp [°C]    Time      Cycles
95           10 min
95           15 sec    40x
60           60 sec    40x

The specificity of all primers used was checked by a dissociation step. Relative expression was calculated as fold expression of the housekeeping gene GAPDH or via the ΔΔCT method.
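The two ways of expressing relative expression mentioned above can be sketched as follows. This is a minimal illustration, not the analysis script used in this thesis; it assumes ideal amplification efficiency (a doubling of product per cycle), and the function and argument names are hypothetical.

```python
def fold_of_reference(ct_target, ct_reference):
    """Expression of a target gene as fold of the reference gene (e.g. GAPDH).

    Assumes perfect doubling per cycle, so the fold value is 2^-(delta CT).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


def delta_delta_ct(ct_target_sample, ct_ref_sample,
                   ct_target_control, ct_ref_control):
    """Fold change of a sample relative to a control condition (delta-delta CT)."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)
```

For example, a target gene that crosses the threshold five cycles later than GAPDH would be reported as 2^-5 of the GAPDH level under this assumption.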

Table 2.10 Primers for semi-qPCR
Primer       Sequence (5’ – 3’)
AFP Fw       GCCAAGCTCAGGGTGTAG
AFP Rv       CAATGACAGCCTCAAGTTGT
Nav1.7 Fw    CACAATCCCAGCCTCACAGT
Nav1.7 Rv    CTGAGGAGCTTGACCGGTTTA
NEFH Fw      CGACATTGCCTCCTACCAG
NEFH Rv      TCCGACACTCTTCACCTTCC
NES Fw       CCTCAAGATGTCCCTCAGCC
NES Rv       CCAGCTTGGGGTCCTGAAAG
NKX2.5 Fw    ACCTCAACAGCTCCCTGACTCT
NKX2.5 Rv    ATAATCGCCGCCACAAACTCTCC
OCT4 Fw      GGGGGTTCTATTTGGGAAGGTA
OCT4 Rv      ACCCACTTCTGCAGCAAGGG
PAX6 Fw      CAGACACAGCCCTCACAAAC
PAX6 Rv      TCATAACTCCGCCCATTCAC
PRDM8 Fw     ACCGTATATCTTTCGGGTAGACA
PRDM8 Rv     CTAGAGGGGCAGAGCAAGAG
RUNX1 Fw     CCGAGAACCTCGAAGACATC
RUNX1 Rv     GTCTGACCCTCATGGCTGT
SOX1 Fw      CCTGTGTGTACCCTGGAGTTTCTGT
SOX1 Rv      TGCACGAAGCACCTGCAATAAGATG
SOX17 Fw     AGGAAATCCTCAGACTCCTGGGTT
SOX17 Rv     CCCAAACTGTTCAAGTGGCAGACA
TAC1 Fw      CACAATCCCAGCCTCACAGT
TAC1 Rv      CAAAGAACTGCTGAGGCTTG

2.2 Age Prediction Models

Several linear models were used in this thesis. All are based on methylation values determined by pyrosequencing at the following age-associated CpG sites: (α) = cg02228185; (β) = cg25809905; and (γ) = a CpG site upstream of cg17861230, which showed better correlation with age than cg17861230 itself. In addition, we used two cell type-specific CpG sites: (δ) = cg07380416 and (ε) = cg20837735.

• The multivariate 3-CpG-blood-model (Aging Signature) has been described in detail in previous work (Weidner et al. 2014). It was based on pyrosequencing results of 82 blood samples. Predicted age (in years) = 38.0 - 26.4 α - 23.7 β + 164.7 γ.

• In analogy, we trained a similar multivariate model, the so-called 3-CpG-swab-model, on 55 swab samples. Predicted age (in years) = 32.69 - 8.42 α - 47.38 β + 183.25 γ.

• We developed two 1-CpG-PDE4C-models based on the methylation values of the CpG site in PDE4C in blood or mouth swab samples. Predicted age (in years) for blood samples = (γ - 0.0745) / 0.0038, and predicted age (in years) for mouth swab samples = (γ - 0.0648) / 0.0046.

• We combined the linear regressions of the individual cell type-specific CpG sites into the Buccal-Cell-Signature to estimate the cell composition of swab samples. Percentage of buccal epithelial cells (ϐ) = (99.8 δ + 1.9) / 2 + (-98.1 ε + 88.5) / 2.

• Finally, using ϐ as estimated with the Buccal-Cell-Signature, we estimated the parameters of age-associated linear models for buccal epithelial cells based on the 55 swab samples of the training set using the software R. This led to the following 5-CpG-model for age predictions of buccal swab samples: Predicted age (in years) = (1 - ϐ / 100) * (38.0 - 26.4 α - 23.7 β + 164.7 γ) + (ϐ / 100) * (2.6 - 11.0 α - 15.6 β + 181.7 γ).
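The models listed above can be written down directly as functions. The following is a sketch for illustration only: it assumes that the methylation values α to ε are given as fractions between 0 and 1 rather than as percentages, and all function names are hypothetical.

```python
# Assumption: alpha..epsilon are methylation fractions in the range 0..1.

def age_3cpg_blood(alpha, beta, gamma):
    """3-CpG-blood-model (Weidner et al. 2014)."""
    return 38.0 - 26.4 * alpha - 23.7 * beta + 164.7 * gamma


def age_3cpg_swab(alpha, beta, gamma):
    """3-CpG-swab-model trained on 55 swab samples."""
    return 32.69 - 8.42 * alpha - 47.38 * beta + 183.25 * gamma


def age_1cpg_pde4c(gamma, sample_type):
    """1-CpG-PDE4C-model; sample_type is 'blood' or 'swab'."""
    params = {"blood": (0.0745, 0.0038), "swab": (0.0648, 0.0046)}
    if sample_type not in params:
        raise ValueError("sample_type must be 'blood' or 'swab'")
    intercept, slope = params[sample_type]
    return (gamma - intercept) / slope


def buccal_cell_percentage(delta, epsilon):
    """Buccal-Cell-Signature: mean of the two single-CpG estimates (in percent)."""
    return (99.8 * delta + 1.9) / 2 + (-98.1 * epsilon + 88.5) / 2


def age_5cpg(alpha, beta, gamma, delta, epsilon):
    """5-CpG-model: blood and buccal models weighted by cell composition."""
    buccal_fraction = buccal_cell_percentage(delta, epsilon) / 100
    blood_age = 38.0 - 26.4 * alpha - 23.7 * beta + 164.7 * gamma
    buccal_age = 2.6 - 11.0 * alpha - 15.6 * beta + 181.7 * gamma
    return (1 - buccal_fraction) * blood_age + buccal_fraction * buccal_age
```

Note that the Buccal-Cell-Signature is a linear estimate, so values slightly below 0% or above 100% can occur for extreme methylation values; the sketch does not clip them.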

2.3 Cloning

Cloning was performed to generate PRDM8 knockout iPSCs. In addition, PRDM8 overexpressing native iPSCs and iPSCs with PRDM8 knockout background were generated. All cloned plasmids and genetically modified clones were Sanger sequenced and aligned to reference sequences using the “ApE - A plasmid Editor” Software by M. Wayne Davis to confirm genetic editing.

2.3.1 Using CRISPR/Cas9n to Generate Gene Knockouts

For genome editing, we applied the CRISPR/Cas9n technology, which is based on the clustered regularly interspaced short palindromic repeats (CRISPR) of the prokaryotic immune system. A modified version of the Cas9 protein was used which creates single strand breaks, so-called “nicks”, in the DNA instead of inducing double strand breaks (DSBs) (Jinek et al. 2012; Cong et al. 2013). Hence, a pair of guide RNAs (gRNAs) is needed to create two nicks which are located in close proximity and together result in a DNA DSB. The advantage of this approach is that off-target effects due to mispairing of the gRNAs can be reduced. Two pairs of gRNAs were designed to specifically target the intron/exon boundary of the first coding exon of PRDM8 (Figure 2.5). With this design, CRISPR-guided Cas9 nickases (Cas9n) induce DSBs that lead to the deletion of the first exon, resulting in a reading frame shift.

Figure 2.5 CRISPR/Cas9 design for the PRDM8 knockout Shown are the short transcript of PRDM8 and the four gRNA binding sites at the intron/exon boundary of the first exon (exon sequences shown in blue). Binding of Cas9n proteins to the binding sites flanked by the Protospacer Adjacent Motif (PAM) results in a nick of the DNA strand (red arrow head), leading to two double strand breaks and the excision of the gDNA between the target sites.

Single gRNAs were designed according to the 5’-(N)20NGG-3’ approach, which positions the gRNA next to the PAM sequence NGG (Ran et al. 2013). BpiI restriction site sequences were added to the primers for subsequent cloning. The sequences of oligonucleotides to introduce the gRNAs to the Cas9n vector are listed in Table 2.11.
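The 5’-(N)20NGG-3’ rule and the oligonucleotide layout can be illustrated with a short sketch. It is not the design tool used in this thesis. The overhang pattern (CACCG before the 20-nt guide on one oligonucleotide, AAAC before its reverse complement plus a trailing C on the other) is inferred from the sequences in Table 2.11, and the function names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_protospacers(seq):
    """List (position, 20-nt protospacer, PAM) for every N20-NGG site.

    Only the given strand is scanned; pass the reverse complement to
    search the opposite strand. Overlapping sites are reported.
    """
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


def cloning_oligos(protospacer):
    """Oligo pair with overhangs for ligation into a BpiI-digested vector.

    Overhang layout inferred from Table 2.11 (assumption).
    """
    guide_oligo = "CACCG" + protospacer.upper()
    complementary_oligo = "AAAC" + reverse_complement(protospacer) + "C"
    return guide_oligo, complementary_oligo
```

Applied to the 20-nt guide GTTAGCTGACACAGAATTAG, `cloning_oligos` reproduces the PRDM8 1b oligonucleotide pair of Table 2.11.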

Table 2.11 gRNA oligonucleotides
Primer         Sequence (5’ – 3’)
PRDM8 1a Fw    AAACTCTTGTAGGCTGCAGAGAAAC
PRDM8 1a Rv    CACCGTTTCTCTGCAGCCTACAAGA
PRDM8 1b Fw    CACCGGTTAGCTGACACAGAATTAG
PRDM8 1b Rv    AAACCTAATTCTGTGTCAGCTAACC
PRDM8 2a Fw    AAACGCGTTTACACCACCTGCGACC
PRDM8 2a Rv    CACCGGTCGCAGGTGGTGTAAACGC
PRDM8 2b Fw    CACCGATCCCTGAGAATGCTATATT
PRDM8 2b Rv    AAACAATATAGCATTCTCAGGGATC

The gRNA oligonucleotides were resuspended in aqua dest. to a concentration of 100 µM and kept under slow agitation for 1 h. Oligonucleotides were phosphorylated with T4 polynucleotide kinase (PNK; see Table 2.12) at 37°C for 30 min, followed by denaturation at 95°C for 5 min and annealing by cooling down to 25°C in steps of 5°C/min. pX335A vectors, which also encode GFP and a puromycin resistance cassette (kindly provided by Dr. Stefan Frank and Dr. Boris Greber, Max-Planck-Institute for Molecular Biomedicine, Münster, Germany), were digested with BpiI (Thermo-Fisher Scientific) for 2 h at 37°C to generate sticky ends for subsequent ligation. Digestion reactions were purified with the NucleoSpin Gel and PCR Clean-up kit (Qiagen). Ligation was performed at 16°C for 16 h using T4 DNA ligase (Thermo-Fisher Scientific), 50 ng pX335A vector and 4 µl of annealed oligonucleotides. After confirmation of gRNA insertion via Sanger sequencing (Eurofins), the modified vector was amplified via transformation into Escherichia coli DH5α, purified using the NucleoBond Xtra Maxi Kit (see 2.3.4) and used for electroporation of iPSCs.

Table 2.12 Phosphorylation reaction mix
Reagent (stock concentration)    Amount [µl]
T4 PNK (10 U/µl)                 0.5
T4 DNA ligase buffer (10x)       1
gRNA forward (100 µM)            1
gRNA reverse (100 µM)            1
Nuclease-free water              ad 10

Table 2.13 pX335A vector digestion mix
Reagent (stock concentration)    Amount
pX335A                           1 µg
BpiI (10 U/µl)                   1.5 µl
Fast Digest Buffer (10x)         3.75 µl
Nuclease-free water              ad 20 µl

2.3.2 Generation of PRDM8 Overexpression Vector

To analyze the impact of PRDM8 overexpression on the knockout clones, overexpression was induced via lentiviral transduction of iPSCs (see 2.3.7). The GFP-encoding pWPXLCX-GFP vector was used as expression vector. The PRDM8 cDNA sequence was amplified from a vector containing the PRDM8 open reading frame (abmgoods), and EcoRI and BamHI restriction sites were added in frame via PCR. In addition to the primers resulting in PRDM8-GFP fusion proteins, PCR primers containing a T2A linker between the GFP and PRDM8 sequences were used to generate a vector construct which results in separate GFP and PRDM8 proteins. 2A sequences are viral elements which cause steric hindrance and ribosome skipping, thus resulting in “self-cleavage” of proteins. Hence, steric interactions within a fusion protein are avoided and functionality of the expressed proteins is ensured. The benefit of 2A sequences in comparison to internal ribosome entry site (IRES) sequences is the generation of equimolar amounts of proteins (Donnelly et al. 2001). A GSG linker and additional nucleotides were integrated in the primer sequences and added between the GFP and PRDM8 coding sequences to avoid steric hindrance and a reading frame shift. For the restriction reaction, 1 µg vector and 1 µg PRDM8 amplicons were digested with EcoRI and BamHI (both NEB) for 3 h at 37°C. Ligation was performed with T4 ligase as described in 2.3.1. The ligation mix was transformed into Escherichia coli DH5α and insertion of the PRDM8 sequences was confirmed via Sanger sequencing (Eurofins).

Table 2.14 Primers for the generation of PRDM8 overexpression vectors
Primer                 Sequence (5’ – 3’)
PRDM8_OE_FusGFP_Fw     CAGTGAATTCTATGGAGGATACTGGCATCCAGCG
PRDM8_OE_T2AGFP_Fw     CAGTGAATTCTGGATCTGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGACCTATGGAGGATACTGGCATCCAGCG
PRDM8_OE_RV            ACTGGGATCCTCAATTATGCGAGGTCATGTGCCTGG

2.3.3 Transformation of Escherichia Coli

To amplify vectors for CRISPR/Cas9n cloning or PRDM8 overexpression, competent Escherichia coli DH5α cells were transformed with the respective plasmids. 100 µl of chemically competent DH5α cells were incubated on ice for 30 min with 10 µl of the ligation reaction or 100 ng of cloning vector, respectively. Heat shock was performed at 42°C for 45 sec and stopped by incubating on ice for 2 min. 900 µl of LB medium was added and the suspension was incubated at 37°C in a shaking incubator for 1 h at 200 rpm. 100 µl and 250 µl of the transformation stock were plated onto LB agar plates containing 100 µg/ml ampicillin (Roth) and incubated at 37°C overnight. Colony PCR was performed (see 2.3.4) or individual colonies were transferred to 500 µl ampicillin-containing LB medium using sterile pipette tips for clone expansion. After 16 h of incubation at 37°C and 200 rpm, plasmids were isolated with the NucleoSpin Plasmid DNA Purification kit, or the culture was transferred to 300 ml LB medium and incubated for an additional 16 h to generate batches for plasmid isolation via maxiprep.

2.3.4 Plasmid Isolation and Colony PCR

Plasmid isolation was performed using the NucleoSpin Plasmid DNA Purification Kit (Macherey-Nagel) according to the manufacturer’s instructions. Isolation of larger plasmid quantities was performed with the NucleoBond Xtra Maxi Kit (Macherey-Nagel) according to the manufacturer’s instructions. Quantity and quality of isolated plasmids were assessed via spectrophotometry using the NanoDrop2000. Plasmid DNA was sequenced via Sanger sequencing (Eurofins). To directly screen plasmids for insertions or deletions without prior DNA isolation, colony PCR was performed. Part of each colony was transferred to 10 µl aqua dest. using a pipette tip. The suspension was heated to 95°C for 2 min and centrifuged at 10 000 g for 2 min. 3 µl of the supernatant was used as PCR template. Primers were chosen to flank the modified region of the vector. Deletions or insertions of plasmid DNA could be visualized via gel electrophoresis.

2.3.5 Transfection of Induced Pluripotent Stem Cells via Electroporation

To introduce the four pX335A vectors coding for the four individual gRNAs and the Cas9n protein into iPSCs, electroporation was performed using the NEON Transfection System Kit and the NEON transfection device (Thermo-Fisher Scientific) according to the manufacturer’s instructions. The medium of the iPSCs was changed 1 h prior to transfection and 10 µM Rho-associated kinase (ROCK) inhibitor was added (Y-27632, Abcam). iPSCs were detached and singularized with accutase. After washing with PBS (200 g, 4 min), cells were counted with a Neubauer counting chamber. 1 x 10^6 cells were resuspended in 120 µl of NEON buffer R and mixed with plasmid DNA solutions (2 µg of each CRISPR/Cas9 plasmid). Electroporation was performed with the NEON 100 µl pipette in NEON electrolyte buffer E2 (1500 V, 20 ms, 1 pulse). Immediately after electroporation, iPSCs were resuspended in 4 ml antibiotics-free iPS-Brew Medium (Miltenyi) containing 10 µM ROCK inhibitor. 1.5 ml and 2.5 ml of the cell suspension were seeded into vitronectin-coated dishes (Ø 10 cm). After 24 h, penicillin and streptomycin were added to the culture medium, and after 48 h positively transfected cells were selected by puromycin treatment (1 µg/ml) for 24 h. After reaching a size of about 500 µm, colonies were transferred to individual wells by scratching the clones from the plate with a pipette tip and expanded for DNA and RNA isolation to check for deletion of the gDNA or cDNA sequences, respectively (see 2.3.1). In total, one PRDM8+/- and one PRDM8-/- iPSC clone could be generated.

2.3.6 Virus Production

All steps involving viral particles were performed under S2 safety conditions. For virus production, pMD2.G served as envelope plasmid and psPAX as packaging plasmid. One day prior to transfection, 750 000 human embryonic kidney (HEK) 293T cells were seeded into tissue culture plastic dishes (Ø 6 cm). One hour before transfection, the medium was changed.

The precipitation mix containing vectors, CaCl2 and aqua dest. (see Table 2.15) was added to 2x HBS buffer (see Table 2.16) on ice while gently mixing the solution several times. The solution was added dropwise to the HEK 293T cell culture while carefully moving the plate.

Table 2.15 Precipitation mix for virus production
Reagent (stock concentration)    Amount
pWPXLCX                          5 µg
pMD2.G                           1.25 µg
psPAX                            3.75 µg
CaCl2 (2 M)                      187.5 µl
Nuclease-free water              ad 500 µl

Table 2.16 2x HBS buffer for transfection
Reagent (stock concentration)    Amount
HEPES (1 M)                      50 ml
NaCl                             14.6 g
Na2HPO4                          0.21 g
Nuclease-free water              ad 1 l

24 h after transfection, the HEK 293T medium was changed. 48 h and 72 h after transfection, the supernatant of the HEK culture was collected in a centrifuge tube and centrifuged for 5 min at 200 g. The supernatant was filtered with a 0.45 µm Whatman filter and aliquoted into microcentrifuge tubes. The virus particles were stored at -80°C.

2.3.7 Transduction of Induced Pluripotent Stem Cells

For viral transduction, iPSCs were grown to 70% confluency. Cells were detached with 0.5 mM EDTA and passaged in a 1:4 ratio to new tissue culture plastic 6-well plates. Virus particles were thawed at room temperature and supplemented with 8 µg/ml polybrene to enhance transduction efficiency. 300 µl of viral suspension was added to each well of freshly passaged cells. Cells were incubated for 16 h and the medium was changed. 48 h after the first transduction, the medium was changed again and a further 300 µl of viral suspension containing polybrene was added to the culture to increase the fraction of transduced cells. Transduction efficiency was assessed via microscopy of GFP-positive cells.

2.4 Cell Culture

All cell culture work was performed under sterile conditions in a laminar flow hood (Heraeus) and cells were cultured in Heracell incubators (Heraeus) at 37°C and 5% CO2 in a humidified atmosphere. Centrifugation steps were performed with a Heraeus Multifuge 3L (Thermo-Fisher Scientific), a Rotanta 460 centrifuge (Hettich) or a Heraeus Megafuge 16 (Thermo-Fisher Scientific). Disposable plastic ware from Becton Dickinson (BD Biosciences) was used. For all cell culture media, reagents were sterile filtered and suspended in sterile aqua dest. or PBS.

2.4.1 Culture of Induced Pluripotent Stem Cells

Tissue culture plastic for iPSC culture was coated with 0.5 µg/cm2 vitronectin (Stemcell Technologies) dissolved in PBS and incubated for at least 15 min at 37°C or 20 min at RT. Cultivation was performed in iPS-Brew Medium (Miltenyi) with 100 U/ml penicillin/streptomycin (Gibco) at 37°C in a humidified atmosphere with 5% CO2. Medium changes were performed daily and cells were passaged at 70% confluency. Old medium was aspirated and cells were incubated with pre-warmed 0.5 mM EDTA for 4 min at 37°C. Once colonies started to fragment, EDTA was aspirated and cell clusters were suspended in fresh culture medium by gentle pipetting. For maintenance culture, cells were reseeded in a 1:4 ratio on vitronectin-coated plates. Cells were frozen at 70% confluency as follows. Old medium was aspirated and cells were incubated with accutase (Stemcell Technologies) for 5 - 10 min at 37°C. Accutase breaks up colonies into single cells, which were transferred with KO-DMEM (Gibco) into centrifuge tubes. After centrifugation at 200 g for 4 min, KO-DMEM was aspirated and cells were resuspended in Cryo-SFM (PromoCell) containing 10 µM ROCK inhibitor. Cells were frozen at -80°C for 1 - 3 days and transferred to -140°C for long-term storage. For thawing, cells were taken up in pre-warmed KO-DMEM and centrifuged at 200 g for 4 min. After aspiration of the supernatant, pellets were resuspended in iPS-Brew medium with 10 µM ROCK inhibitor and seeded on vitronectin-coated tissue culture plastic.

2.4.2 Embryoid Body Assay

To test for three-lineage potential of iPSC clones, cells were spontaneously differentiated via the embryoid body (EB) assay. iPSCs were incubated with 1 mg/ml collagenase IV for 5 - 15 min and rinsed off with KO-DMEM. Resulting cell clusters were sedimented by gravity at 37°C for 10 min. By removing the supernatant, singularized and dead cells could be removed from the culture. Cell clusters were resuspended in EB culture medium and transferred to ultra-low attachment plates (Corning) to form EBs in a 3D culture.

Table 2.17 EB culture medium
Reagent (stock concentration)          Amount [ml]
KO-DMEM                                250
Serum                                  50
Penicillin/Streptomycin (5000 U/ml)    2.5
L-Glutamine (200 mM)                   2.5
NEAA (100x)                            2.5
β-Mercaptoethanol                      0.5

Partial medium change was performed every second day by carefully aspirating half of the medium without removing EBs from the culture and adding new medium. On day 5, EBs were transferred to plates coated with 0.1% gelatin for 2D culture. Cells were cultured for an additional 16 days. At distinct time points, samples were harvested with trypsin for subsequent RNA isolation (Figure 2.6).

Figure 2.6 Embryoid body assay scheme iPSCs were detached with collagenase IV at 70% confluency. Cell aggregates were transferred into low-binding plates and cultured for 5 days in 3D culture. At day 6, cells were transferred to gelatin-coated wells for 2D culture. Samples were harvested at different time points for semi-qPCR analysis.

2.4.3 Neural differentiation

Directed neural differentiation was performed in collaboration with Corinna Rösseler (Institute of Physiology, Lampert Lab, RWTH Aachen University). After culturing in mTESR medium (StemCell Technologies) for 2 days, iPSCs were detached and singularized using accutase. Cells were counted using a Neubauer counting chamber and 100 000 cells/cm2 were seeded with ROCK inhibitor on Matrigel GFR (Fisher Scientific) coated plates. Medium was changed according to Table 2.20.

Table 2.18 Neural differentiation medium
Reagent (stock concentration)     Amount [ml]
DMEM/F12 KO                       409
KO serum replacement              75
L-Glutamine (200 mM)              5
MEM NEAA (10 mM)                  5
β-mercaptoethanol (50 mM)         1
Penicillin/Streptomycin (100x)    5

Table 2.19 Neural maturation medium
Reagent                               Amount [ml]
DMEM/F12                              480
B27 supplement (Life Technologies)    10
N2 supplement (Life Technologies)     5
Penicillin/Streptomycin (100x)        5

For maturation, plates were coated with PORN-LAM for 2 h at 37°C. Cells were singularized with TrypLE™ Express Enzyme (Life Technologies) and seeded on coated plates at a density of 100 000 cells/cm2 (Figure 2.7). Medium was refreshed every 3 to 4 days according to Table 2.20. Table 2.18 and Table 2.19 list the components of the respective basal media.

Figure 2.7 Neural differentiation scheme Schematic visualization of directed neural differentiation. iPSCs were cultured in differentiation medium for 10 days and subsequently in maturation medium for up to another 55 days. In the first 6 days, cells were primed for neuroectodermal differentiation by treatment with dual SMAD inhibitors. Further cultivation with CHIR99021, SU5402, DAPT and additional growth factors induced a sensory phenotype. (Scheme provided by the Institute of Physiology, Lampert Lab, RWTH Aachen University)

Table 2.20 Neural differentiation timetable
Day: -2 -1 0 1 2 3 4 5 6 7 8 9 10 …
Medium (%)
mTESR: 100 100
Differentiation medium: 100 100 100 100 75 75 50 50 25 25
Maturation medium: 25 25 50 50 75 75 100 100
Growth factors
Y-27632 (10 µM): x x
LDN-193189 (100 nM): x x x x x x
SB 431542 (10 µM): x x x x x x
SU 5402 (10 µM): x x x x x x x x
DAPT (10 µM): x x x x x x x x
CHIR 99021 (3 mM): x x x x x x x x
BDNF (20 ng/ml): x x
GDNF (20 ng/ml): x x
NGF (20 ng/ml): x x
Ascorbic acid (200 ng/ml): x x

2.4.4 HEK 293T Cell Culture

HEK 293T cells were cultured for virus production. Cells were kept in DMEM with 10% FCS, 100 U/ml penicillin/streptomycin and 2 mM L-Glutamine at 37°C in a humidified atmosphere with 5% CO2. Cells were passaged with trypsin twice a week. For storage, cells were frozen in culture medium supplemented with 10% DMSO and kept at -140°C.

2.4.5 Cell Sorting

Virally transduced iPSCs were sorted according to their GFP expression in the Flow Cytometry Facility of the IZKF, RWTH Aachen University. Cells were detached by incubating with accutase for 5 min and suspended in KO-DMEM (Gibco). Cell aggregates were removed using a 40 µm cell strainer (StemCell Technologies). Cells were centrifuged at 200 g for 4 min and the supernatant was aspirated. 1 million cells were resuspended in 200 µl iPS-Brew medium containing 10 µM ROCK inhibitor. GFP-positive cells were sorted using the FACS Aria (Becton Dickinson) and cultured in iPS-Brew with ROCK inhibitor for 12 h.

2.5 Proteochemistry

2.5.1 H and E staining of Buccal Swab Smears

Cells were harvested by gently exfoliating the mouth mucosa with swabs (Copan Flock Technologies). Swabs were moved over glass slides to generate smears, which were fixed with M-FIX Fixation spray (Merck). After the slides had dried for 2 h, cells were incubated in 50% EtOH for 5 min and afterwards in hematoxylin and eosin for 5 min each. After washing in aqua dest., cells were dehydrated in an ascending alcohol series and in xylol for 2 min each.

2.5.2 Immunofluorescence Staining

Staining of iPSCs for the typical pluripotency markers OCT4 and TRA-1-60 was performed to verify their pluripotency after gene editing. Following passaging, iPSCs were seeded onto vitronectin-coated glass cover slides in 4-well plates and cultured for 3 days. Cells were fixed with 4% PFA (Sigma Aldrich) at RT for 20 min. After washing thrice with PBS, cells were blocked with goat serum (Merck Millipore) at RT for 30 min. Then, the primary antibody for TRA-1-60 (clone TRA-1-60 mouse anti-human IgM) was added in a 1:100 dilution and incubated at 4°C for 16 h under gentle shaking. After removing the antibody and washing thrice with PBS, the secondary antibody (goat anti-mouse IgM Alexa Fluor 594; ThermoFisher-Scientific) was incubated for 1 h in the dark in a 1:200 dilution. Staining was repeated for the OCT4 primary antibody (rabbit anti-human OCT4 IgG; clone sc-9081, Santa Cruz Biotechnology) in a 1:50 dilution (in 0.1% Triton-X100) and the secondary anti-rabbit IgG FITC (ThermoFisher-Scientific) in a 1:200 dilution. Finally, iPSCs were incubated for 5 min with 1 µg/ml DAPI (Vector Laboratories) and washed twice, and the glass slides were mounted upside down with mounting solution (Dako) onto microscope slides. For microscopy of the sorted, GFP-positive cells, the nuclei were counterstained with DAPI solution (1 µg/ml) in 0.1% Triton-X100 for 5 min at RT.

2.6 Statistics and Data Analyses

Statistical significance was analyzed using two-tailed paired or unpaired Student’s t-tests (GraphPad Prism) for paired and unpaired cohorts, respectively. For multiple comparisons of metric and normally distributed values, ANOVA was performed. Differences were considered significant (*) when p < 0.05, very significant (**) when p < 0.005 and extremely significant (***) when p < 0.001. DNAm analyses via the Infinium MethylationEPIC BeadChip (Illumina) were performed for the differentiated neural cells. The arrays were processed at Life and Brain (Bonn, Germany). The calculation of β-values and adjusted p-values and the quantile normalization of the raw data were performed by Julia Franzen (Helmholtz Institute for Biomedical Engineering, Stem Cell Biology and Cellular Engineering, RWTH Aachen University). For comparative analyses of the three WT clones from different donors, CpG sites located on the sex chromosomes were excluded. Gene ontology analyses were performed using the software GoMiner (Zeeberg et al. 2005). Analysis was performed on genes with differentially methylated CpGs in the promoter region (located in TSS1500, TSS200, and 5’UTR as annotated by the Infinium MethylationEPIC BeadChip). Categories comprising more than 1000 genes were not considered and similar categories are only listed once. Heatmaps were generated using the software “R” and hierarchical clustering was performed using Euclidean distances.
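The significance levels defined above translate into a simple lookup. The sketch below is only an illustration of the thresholds; the function name and the label "ns" for non-significant differences are assumptions and do not appear in the original analysis.

```python
def significance_label(p_value):
    """Map a p-value to the asterisk notation used in this thesis."""
    if p_value < 0.001:
        return "***"  # extremely significant
    if p_value < 0.005:
        return "**"   # very significant
    if p_value < 0.05:
        return "*"    # significant
    return "ns"       # assumption: label for non-significant differences
```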

46

Results

3 Results

3.1 Epigenetic Age Predictions of Buccal Swab Samples

Saliva as well as mouth swabs are very heterogeneous in their composition of buccal epithelial cells and leukocytes (Thiede et al. 2000). It is easily anticipated that epigenetic modifications such as age-associated methylation patterns differ significantly between these two cell types. Nevertheless, buccal swabs are a widely used specimen in legal medicine due to advantages, such as their non-invasive harvesting procedure and low contagiousness. Bocklandt and coworkers have demonstrated epigenetic age predictions of saliva samples via a predictor based on three CpG sites (Bocklandt et al. 2011). However, the authors did not apply their predictor to mouth swab samples or an independent validation set and did not correct for differences in the cellular composition. We therefore followed the hypothesis that the precision of epigenetic age predictions in buccal swabs can be improved by taking the cellular composition of buccal epithelial cells vs. leukocytes into account. The results of this study are published in Eipel et al. 2016.

3.1.1 Comparative Analyses of Age-Associated DNA Methylation Changes in Blood and Mouth Swab Samples

For age predictions, we applied a recently developed epigenetic age predictor for blood samples based on DNAm levels at three age-associated CpG sites located in the genes ITGA2B, ASPA and PDE4C (Weidner et al. 2014), subsequently referred to as the “3-CpG-blood-model”. To assess the age prediction accuracy of this predictor for buccal swabs, samples were taken from 55 healthy donors (age range from 1 to 85 years) and DNAm levels were measured by pyrosequencing at these three age-associated CpG sites. Age was predicted via the 3-CpG-blood-model as described in 2.2. The correlation of predicted and chronological age was R² = 0.91 (Pearson correlation; Figure 3.1A). This correlation was higher than that observed in 151 blood samples (R² = 0.81; Figure 3.1B). However, there was a clear offset in the age predictions of buccal swabs: the age of buccal swab samples was overestimated by 14.6 years on average.
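A high correlation and a systematic offset are not contradictory, because a constant shift of all predictions leaves the Pearson correlation unchanged. The following sketch illustrates this with hypothetical helper functions; it is not the evaluation code used for Figure 3.1.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_offset(predicted, chronological):
    """Mean signed error; positive values indicate overestimation."""
    return sum(p - c for p, c in zip(predicted, chronological)) / len(predicted)
```

For hypothetical donors whose predicted ages are all shifted upwards by the same amount, `pearson_r` squared remains 1 while `mean_offset` returns that shift.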

Figure 3.1 Epigenetic age prediction of swab and blood samples A) Epigenetic age predictions of 55 mouth swab samples using an age predictor that was trained on blood samples as described before (Weidner et al. 2014). B) For comparison, we demonstrate the predictions for 151 whole blood samples. (Adapted from Eipel et al. 2016)

To investigate the influence of individual CpG sites on the age prediction accuracy of buccal swabs, the age correlations of the individual CpG sites were determined (Figure 3.2). Pyrosequencing results of the training and the validation set, containing 55 samples which were collected and analyzed independently by different labs (Institute for Legal Medicine, Heinrich Heine University, Düsseldorf, Germany; Varionostic GmbH, Ulm, Germany), revealed only moderate age correlation of DNAm at the CpG sites in ASPA and ITGA2B (Figure 3.2A, B). Only the CpG site in the PDE4C gene showed high correlation with age in the training and validation set (R² = 0.91 and R² = 0.90, respectively), indicating that methylation at this site can be considered cell type independent. To gain better insight into the tissue-specificity of individual CpG sites, we used the publicly available Illumina 450K Bead Chip datasets of blood (GSE41037), saliva (GSE28746), and mouth swabs (GSE50586) (Bocklandt et al. 2011; Horvath 2013; Jones et al. 2013). Of note, the Bead Chip does not represent the site used for pyrosequencing but a neighboring CpG site (cg17861230) within the PDE4C gene with slightly lower age correlation. Methylation of the CpG sites in ASPA and ITGA2B was moderately correlated with chronological age in blood samples (R² = 0.39 and R² = 0.23, respectively) but not in saliva or buccal swab samples (Figure 3.2D, E). In contrast, positive correlation between methylation frequencies and chronological age could be observed in blood, saliva and buccal swabs for the CpG site in PDE4C. These findings indicated that the CpG sites in ASPA and ITGA2B were differentially methylated in different cell types and could not be used for age predictions without adjusting for the cellular composition.

Figure 3.2 Age correlation of DNAm at age-associated CpG sites A-C) Methylation frequencies of the CpG sites in the genes ITGA2B, ASPA and PDE4C were determined by pyrosequencing and correlated with chronological age. Buccal swab samples from the training and validation datasets are indicated in black and red, respectively. D-F) Publicly available datasets of blood (GSE41037), saliva (GSE28746), and mouth swabs (GSE50586) were used to visualize the correlation of DNAm with chronological age. The CpG site cg17861230 is a neighbor of the CpG site in PDE4C that was used in the pyrosequencing models (the latter is not represented on Illumina Bead Chips). (Adapted from Eipel et al. 2016)

To account for differences in DNAm between cell types, we developed a 3-CpG-swab-model which was trained on buccal swab samples. We obtained age predictions with a mean absolute deviation (MAD) of 4.3 years for the training set and 7.03 years for the independent validation set (Figure 3.3A). The improved age prediction accuracy of the 3-CpG-swab-model underlines the importance of cell type-specific training of age prediction models.
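The accuracy measure used here can be sketched in a few lines. The example assumes that MAD denotes the mean of the absolute differences between predicted and chronological age, as the term is spelled out above; the function name is hypothetical.

```python
def mean_absolute_deviation(predicted, chronological):
    """Mean of |predicted age - chronological age| over all samples."""
    if len(predicted) != len(chronological):
        raise ValueError("both sequences must have the same length")
    errors = [abs(p - c) for p, c in zip(predicted, chronological)]
    return sum(errors) / len(errors)
```

Unlike the mean signed offset, this measure does not cancel out over- and underestimation, so it reflects the typical size of the prediction error.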

Figure 3.3 Age prediction models without cell type correction A) To generate the “3-CpG-swab-model”, the age predictor was re-trained on swab samples exclusively. B) Linear regression of correlation between methylation of the CpG site in PDE4C and chronological age was used to develop the 1-CpG-model. Age prediction accuracy is shown for the training (black) and validation set (red). (Adapted from Eipel et al. 2016)

Next, we analyzed the age prediction potential of the CpG site in PDE4C. This CpG site has been shown to have the highest age correlation among the CpG sites of the 3-CpG-blood-model. We trained a linear regression model on the correlation of DNAm at the PDE4C CpG site and chronological age (Figure 3.2C). This linear regression was applied to predict the age of the training and validation set and achieved a MAD of 5.2 years and 7.6 years, respectively (Figure 3.3B). The prediction accuracies of the 3-CpG-swab-model and the 1-CpG-swab-model are overall comparable. This indicates that the CpG site in PDE4C has the highest age prediction potential. This finding is supported by the high coefficient for this CpG site in the multivariate 3-CpG-swab-model equation. Compared to other age predictors, e.g. the predictor developed by Bocklandt et al., the generated models show less accurate age predictions. Thus, we aimed at developing an age predictor that can be adjusted for cell composition.
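The form of the 1-CpG-models in 2.2, (γ - intercept) / slope, suggests that methylation was regressed on age and the fitted line was then inverted for prediction. A minimal sketch of this procedure, using ordinary least squares and hypothetical function names, is given here; it is not the original analysis code.

```python
def fit_line(ages, methylation):
    """Ordinary least squares fit: methylation = slope * age + intercept."""
    n = len(ages)
    mean_age = sum(ages) / n
    mean_meth = sum(methylation) / n
    sxy = sum((a - mean_age) * (m - mean_meth)
              for a, m in zip(ages, methylation))
    sxx = sum((a - mean_age) ** 2 for a in ages)
    slope = sxy / sxx
    intercept = mean_meth - slope * mean_age
    return slope, intercept


def predict_age(methylation_value, slope, intercept):
    """Invert the fitted line to obtain an age estimate."""
    return (methylation_value - intercept) / slope
```

With training data that lie exactly on the line of the swab model (slope 0.0046, intercept 0.0648), `fit_line` recovers these two parameters and `predict_age` returns the donor age.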

3.1.2 Development of a Predictor to Determine Cell Compositions in Buccal Swabs

Since buccal swabs consist of buccal epithelial cells and leukocytes, we aimed to develop a tool to measure the proportions of these cell types in swab samples. Specifically, we screened for differentially methylated CpG sites in blood and swab datasets from Illumina 450K BeadChip arrays. Such DNAm based tools have the advantage that analysis is still possible after DNA isolation and can be performed retrospectively on processed samples. To identify suitable CpG sites, DNAm datasets of swab (GSE50586) and whole blood samples (GSE39981) were analyzed (Jones et al. 2013; Accomando et al. 2014). We filtered with the following criteria: i) high difference in mean DNAm levels between swabs and blood (Figure 3.4A), ii) low variance in DNAm levels within each of these datasets (Figure 3.4B) and iii) no correlation with chronological age in blood samples (GSE40279) of 656 donors aged 19 to 101 (Hannum et al. 2013, Pearson correlation < 0.05). Based on these parameters, we identified a CpG site within the gene for T-cell differentiation antigen (CD6; cg07380416) and a CpG site in the gene for serpin peptidase inhibitor clade B member 5 (SERPINB5; cg20837735) as the best suited candidates. The identified CpG sites are located in the promoter regions of the respective genes. Furthermore, neighboring CpG sites of cg07380416 and cg20837735 demonstrated similar differences between the cell types (Figure 3.4C). Datasets of blood, blood subsets, saliva and swabs were analyzed to gain insight into the methylation of the respective CpG sites in different tissues (Figure 3.4D, E). The CpG site cg20837735 was found to be hypermethylated in all hematopoietic cell fractions, whereas cg07380416 was consistently hypomethylated. As expected, β-values in saliva samples, which usually comprise a higher percentage of leukocytes than buccal swabs, lay between those of blood and swabs.
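The three filter criteria can be sketched as a small function. This is an illustration under stated assumptions rather than the analysis code of this thesis: the variance threshold (sum of variances < 0.01) and the correlation threshold (0.05) are taken from the text and the legend of Figure 3.4, whereas the minimum difference in mean β-values (min_diff = 0.5), the use of the absolute Pearson correlation, the function names and all β-values are hypothetical.

```python
# Sketch of the CpG filter: large mean difference between swabs and blood,
# low variance within both datasets, and no age correlation in blood.
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if one input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def is_cell_type_marker(swab_betas, blood_betas, aging_betas, ages,
                        min_diff=0.5,      # assumed threshold, not from the text
                        max_var_sum=0.01,  # from the legend of Figure 3.4B
                        max_abs_r=0.05):   # from the text
    """Return True if a CpG site passes all three filter criteria."""
    diff_ok = abs(mean(swab_betas) - mean(blood_betas)) >= min_diff
    var_ok = variance(swab_betas) + variance(blood_betas) < max_var_sum
    age_ok = abs(pearson(aging_betas, ages)) < max_abs_r
    return diff_ok and var_ok and age_ok


# Hypothetical beta values for one candidate CpG site.
ages = [20, 40, 60, 80]
candidate = is_cell_type_marker(
    swab_betas=[0.10, 0.12, 0.11],
    blood_betas=[0.90, 0.88, 0.91],
    aging_betas=[0.90, 0.88, 0.88, 0.90],  # no trend with age
    ages=ages,
)
age_associated = is_cell_type_marker(
    swab_betas=[0.10, 0.12, 0.11],
    blood_betas=[0.90, 0.88, 0.91],
    aging_betas=[0.20, 0.40, 0.60, 0.80],  # strongly age-correlated
    ages=ages,
)
print(candidate, age_associated)
```

Excluding age-correlated sites keeps the cell type estimate independent of the age signal that the age-associated CpG sites are meant to capture.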


Figure 3.4 Identification of differentially methylated CpG sites A) Mean β-values of CpG sites on the Illumina 27k BeadChip in datasets of buccal swabs (GSE50586) and blood (GSE39981). Red dots indicate CpG sites selected for the “Buccal-Cell-Signature”. B) As an additional criterion for suitable cell type specific CpG sites, we used the sum of variances in both datasets and filtered for CpG sites with a variance < 0.01. C) Mean β-values of all CpG sites that are represented on the 450k BeadChip for the respective genes are depicted for blood (red lines; GSE40279) and mouth swab samples (blue lines; GSE50586). Grey bars highlight the selected CpG sites. TSS1500: within 1500 bp of the transcription start site; TSS200: within 200 bp of the transcription start site; UTR: untranslated region. D, E) Mean β-values at cg07380416 (CD6) and cg20837735 (SERPINB5) were compared in whole blood (GSE41037, GSE39981), hematopoietic subsets (GSE39981), saliva (GSE28746, GSE34035, GSE39560), and buccal swabs (GSE25892, GSE50586). Error bars represent standard deviation. (Adapted from Eipel et al. 2016)

As shown in various studies, smoking, gender and ethnicity can influence methylation frequencies of specific CpG sites (Zhang et al. 2011; Teschendorff et al. 2015). To test for these confounding factors, we analyzed the methylation levels of the age-associated and cell type specific CpG sites in publicly available datasets. The β-values did not differ in blood samples of 22 smokers and 179 non-smokers (GSE50660) with similar age distribution (Tsaprouni et al. 2014). As a positive control, the differential methylation of CpG sites which were previously described as smoking-associated (Shenker et al. 2013; Zeilinger et al. 2013; Tsaprouni et al. 2014) could be recapitulated (Figure 3.5A).

Figure 3.5 Analyses of smoking, gender and ethnicity as confounding factors A) DNA methylation levels at the three age-associated CpG sites (ITGA2B, ASPA, and PDE4C) and the two cell type associated CpG sites (CD6 and SERPINB5) in blood samples of current smokers (red) and never-smokers (blue; GSE50660). Differential methylation of smoking-associated CpG sites was validated in three CpG sites which have previously been described as smoking-associated (cg05575927, cg23480021, cg23576855). B) DNA methylation values of pure nasal epithelial cells of smokers (red) and non-smokers (blue) (GSE28368). C) DNA methylation levels of children (1 to 17 years) from different ethnic groups (GSE36054; blue: black donor; red: white donor; black lines: Asian donor). D) Blood samples of 40 to 50 year old donors were used to test for gender as a confounding factor (GSE40279; blue: female, red: male). Statistical significance was tested using the two-sided Student’s t-test or univariate ANOVA, respectively. *: p < 0.05, ***: p < 0.0005. Whiskers indicate the 10th and 90th percentiles. (Adapted from Eipel et al. 2016)


In addition to blood samples, datasets with nasal epithelial samples of six smokers and six non-smokers were analyzed (Rager et al. 2013). Again, no difference in the β-values could be detected, thus excluding smoking as a confounding factor (Figure 3.5B). DNA methylation profiles of 8 white, 74 black and 3 Asian children (GSE36054) were analyzed to estimate the influence of ethnicity on the DNAm levels (Alisch et al. 2012). As shown in Figure 3.5C, no significant differences between groups could be detected. In analogy, we compared DNAm levels in male and female samples (Figure 3.5D) and found no gender-associated differences (GSE40279, 40 to 50 year old donors; Hannum et al. 2013).

After excluding possible confounding factors, the two identified cell type specific CpG sites were used to develop a tool to predict cell compositions of mouth swab and saliva samples. To generate reference values for the correlation of methylation frequency and cell type composition, we determined the fractions of leukocytes and buccal epithelial cells in 11 mouth swab samples by cell counting in hematoxylin/eosin stained smears (Figure 3.6A, B). In addition, epithelial cell counts of blood samples were set to zero. Correlations of epithelial cell counts and methylation values of cg07380416 and cg20837735 reached R² = 0.93 and R² = 0.92, respectively (Figure 3.6C, D). A combination of linear regressions was used to generate the Buccal-Cell-Signature and to predict the cellular composition of swab samples; its predictions reached a correlation of R² = 0.94 with the cell counts (Figure 3.6E). We applied the Buccal-Cell-Signature to 55 samples of the training set and 26 samples of the validation set (Figure 3.6F). The predicted fraction of buccal cells ranged between 24% and 91% (mean = 71%) for the training and validation set, and no difference between the sample sets could be detected. This analysis was performed in the same lab for all samples to exclude technical variation in pyrosequencing analysis.
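How two single-site regressions might be combined into one estimate of the epithelial cell fraction is sketched below. The text only states that "a combination of linear regressions" was used; averaging the two single-site estimates is an assumption made for this example and not necessarily the combination used for the Buccal-Cell-Signature. The reference values and function names are hypothetical.

```python
# Sketch: estimate the buccal epithelial cell fraction from two cell type
# specific CpG sites. Each site gets its own linear regression against counted
# cell fractions; the two estimates are then averaged (an assumed combination).

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical reference samples: counted epithelial cells (%) and methylation (%).
# Blood corresponds to 0% epithelial cells.
counted_epithelial = [0.0, 30.0, 60.0, 90.0]
meth_cd6 = [5.0, 30.0, 55.0, 80.0]        # hypomethylated in blood
meth_serpinb5 = [90.0, 65.0, 40.0, 15.0]  # hypermethylated in blood

cd6_fit = fit_line(meth_cd6, counted_epithelial)
serpinb5_fit = fit_line(meth_serpinb5, counted_epithelial)


def predict_epithelial_fraction(cd6, serpinb5):
    """Average the two single-site estimates and clip the result to 0-100%."""
    est_cd6 = cd6_fit[0] + cd6_fit[1] * cd6
    est_serpinb5 = serpinb5_fit[0] + serpinb5_fit[1] * serpinb5
    combined = (est_cd6 + est_serpinb5) / 2
    return min(100.0, max(0.0, combined))


print(predict_epithelial_fraction(cd6=55.0, serpinb5=40.0))
```

Using two sites with opposite methylation direction means that a measurement error at one site is partly compensated by the other.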


Figure 3.6 Buccal-Cell-Signature to estimate cellular compositions of buccal swab samples A, B) Representative mouth swab smears with different proportions of leukocytes and epithelial cells. Smears of freshly harvested cells were stained with hematoxylin and eosin. C, D) The percentage of buccal epithelial cells vs. leukocytes was determined by cell counting in 11 stained mouth swab smears. DNA methylation levels at the two cell type specific CpG sites were determined by pyrosequencing and correlated with cell counts. E) Linear regressions of both CpG sites were combined into the Buccal-Cell-Signature. Predicted percentages of buccal epithelial cells correlated with cell counts. F) Percentages of epithelial cells were subsequently estimated using the Buccal-Cell-Signature for 55 samples of the training set and 26 samples of the validation set. Error bars represent standard deviation. (Adapted from Eipel et al. 2016)

3.1.3 Combination of Age-Associated and Cell Type Specific CG Dinucleotides to Improve Age Predictions

To investigate if the accuracy of epigenetic age predictions is affected by the cellular composition in buccal swabs, the estimated fractions of buccal epithelial cells were correlated with the MAD of predicted and chronological age. The offset of age predictions in the training and validation datasets by the 3-CpG-blood-model was higher in samples with a higher fraction of epithelial cells. These cell type specific differences were less pronounced when using the 3-CpG-swab-model (Figure 3.7A, B).


Figure 3.7 Combination of age-associated and cell type specific CpG sites A) The differences of chronological age and predicted age were compared to the predicted percentage of buccal epithelial cells. B) In analogy, we compared age predictions by the 3-CpG-swab-model to the estimated percentage of buccal epithelial cells. C) Age was predicted by a combination of age-associated and cell type specific CpG sites into a multivariate regression model of five CpG sites (5-CpG-model) and compared to chronological age. Training set samples are shown in black; validation set samples are shown in red. (Adapted from Eipel et al. 2016)

Finally, we followed the hypothesis that age predictions can be improved by taking the cellular composition into account. Thus, we combined the age-associated CpG sites and the Buccal-Cell-Signature into one linear “5-CpG-model”. Generation of this model was performed by Ivan Costa (IZKF Computational Biology Research Group, University Hospital of RWTH Aachen, Germany). Applying the 5-CpG-model, we achieved age predictions with a MAD of 4.66 years and 5.09 years for the training and validation set, respectively (Figure 3.7C). To further compare and evaluate the precision of our models, buccal swabs of an additional independent validation set of 37 donors between 18 and 35 years were analyzed. As described above, the blood based model overestimates the age by a mean of 17.3 years. The two swab trained models performed with equal accuracy as the cell type adjusted 5-CpG-model. It was unexpected that our 5-CpG-model did not outperform the other models. This may be due to the fact that methylation values, especially for the PDE4C associated CpG site, differ less in the age range of our cohorts than in older donors.


Figure 3.8 Comparison of age prediction accuracies in different models A) The models for age-prediction were subsequently validated and compared in a second, independent dataset of 37 samples (18 to 35 years). Shown is the absolute deviation between predicted and chronological age. Whiskers indicate the 10% – 90% percentiles. B) Samples of the validation group were stratified by an age of 35 years and absolute deviation was compared for each group using the paired two-sided Student’s t-test. Black lines indicate the mean. ***: p < 0.0005 (Adapted from Eipel et al. 2016)

We therefore stratified all validation samples by a cut-off of 35 years. Age prediction was significantly improved when applying the 5-CpG-model in older donors (p = 0.0003; Figure 3.8B). This suggests that cell type adjustment is particularly beneficial when predicting the age of samples from older donors.

3.2 Analysis of DNA Methylation Variability of Neighboring CG Dinucleotides in Clonal Diseases

The DNAm landscape in patients with hematological diseases is variable, with some patients exhibiting global hypomethylation, while others show dominant hypermethylation (Figueroa et al. 2010b). There is evidence that epigenetic heterogeneity is not random but occurs predominantly at specific hypervariable loci (Hansen et al. 2011). Here, we analyzed changes in DNAm at a specific age-associated region within PDE4C in patients with hematopoietic diseases. We follow the hypothesis that variability of DNAm patterns at neighboring CpG sites can be used to track clonal development.


3.2.1 Identifying Myeloid Malignancies by Analyses of DNA Methylation Variability at Neighboring CG Dinucleotides

It is well known that myeloid neoplasms are characterized by aberrant DNAm which can be used for disease stratification (Li et al. 2016a; Li et al. 2016b). So far, these methods have been based primarily on bioinformatic analyses of genome-wide DNAm profiles. In this thesis, we propose two alternative scoring systems based on the variability of DNAm levels at successive CpG sites in a specific age-associated genomic region. Parts of this study are accepted for publication in Haematologica.

DNA methylation at age-associated regions can be used to predict donor age in blood samples of healthy individuals, whereas in clonal malignancies the tumor captures the very specific methylation pattern of the malignant clone (Lin and Wagner 2015). Thus, age-associated genomic regions might be ideally suited to investigate variability of DNAm levels at neighboring CpG sites. Therefore, we applied our previously described 3-CpG-blood-model based on methylation levels of three CpG sites associated with the genes ASPA, ITGA2B and PDE4C (Weidner et al. 2014) to 91 peripheral blood and bone marrow samples of AML patients and 155 peripheral blood samples of healthy controls. Age predictions revealed a moderate MAD of 5.1 years for 155 control samples and a MAD of 8.2 years for 61 patients after complete remission (Figure 3.9). In contrast, age predictions of 20 patients at first diagnosis and 10 patients with persistent tumors or under relapse revealed high differences to chronological age (MAD = 41.1 years and MAD = 49.1 years, respectively).


Figure 3.9 Aberrant methylation pattern of neighboring CpG sites in AML patients A) Epigenetic age predictions were performed based on three CpG sites (in the genes ASPA, ITGA2B and PDE4C) in 91 AML patient samples and 155 healthy controls. B-D) Methylation of the 7 CpG sites in the PDE4C amplicon of the pyrosequencing assays are shown for healthy control and patient samples at different stages of therapy.

To better understand how these differences in the epigenetic age are reflected by variances in DNAm levels of successive CpG sites, we analyzed methylation frequencies of 7 neighboring CpG sites in the PDE4C amplicon. DNA methylation patterns of control samples and samples from patients under remission followed a distinct, reproducible pattern. In contrast, samples from patients with acute disease status displayed high variations in the DNAm patterns and loss of covariation. We reasoned that the outgrowth of the malignant, epigenetically altered clones in the blood cell population is reflected by the variation in DNAm patterns. This leads to the assumption that DNAm patterns are patient specific. Indeed, we observed unique, patient specific patterns which reoccurred in relapsing patients, indicating that the methylation pattern was preserved in the residual malignant cells during treatment (Figure 3.10).

Figure 3.10 Variations in DNAm patterns are patient specific A-C) Representative analyses of longitudinal data of three AML patients at different stages of therapy. Shown are the 7 neighboring CpG sites of the PDE4C amplicon as measured by pyrosequencing.

We aimed at quantifying the variability of DNAm patterns by two scoring systems: the delta score (d-score), which estimates the degree of epigenetic changes of individual CpG sites in comparison to healthy controls, and the neighborhood score (n-score), which takes the degree of covariation of neighboring CpG sites into account. The d-score was calculated as the sum of all absolute differences in DNAm to the expected, age-adjusted value of healthy controls. The n-score was calculated as the sum of absolute differences between neighboring CpG sites. Both variability scores were significantly increased in samples from patients at first diagnosis and at relapse when compared to healthy controls and complete remission samples (Figure 3.11). Positive correlations of d-score and n-score indicated a higher diversity and decreased covariation of DNAm patterns in AML samples in comparison to healthy controls.
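The two score definitions translate directly into code. The following sketch implements them as stated in the text; the function names and the methylation values (in percent, for 7 neighboring CpG sites) are hypothetical, and the age-adjusted expected values are simply passed in, since their derivation from healthy controls is not reproduced here.

```python
# Sketch of the two variability scores for one sample.

def d_score(observed, expected):
    """Sum of absolute differences to the expected (age-adjusted) values."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must cover the same CpG sites")
    return sum(abs(o - e) for o, e in zip(observed, expected))


def n_score(observed):
    """Sum of absolute differences between directly neighboring CpG sites."""
    return sum(abs(b - a) for a, b in zip(observed, observed[1:]))


# Hypothetical methylation levels (%) at 7 neighboring CpG sites.
expected = [10, 12, 15, 20, 18, 14, 11]     # assumed age-adjusted reference
healthy = [11, 12, 14, 21, 18, 15, 11]      # follows the reference pattern
aml_like = [60, 5, 80, 20, 95, 10, 70]      # erratic pattern, no covariation

print(d_score(healthy, expected), n_score(healthy))    # 4 20
print(d_score(aml_like, expected), n_score(aml_like))  # 262 410
```

The d-score depends on a reference from healthy controls, whereas the n-score is computed from the sample alone, which is why the two scores can diverge when a whole region is shifted uniformly.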


Figure 3.11 Quantification of DNAm variability during disease progression A) The d-scores (delta scores; sum of differences to age-adjusted expected values) and n-scores (neighborhood scores; sum of differences between neighboring CpG sites) of AML patients at different stages of therapy. B) d-score and n-score are positively correlated in AML patients. Error bars represent the mean with standard deviation; ns: not significant, **: p < 0.01, ***: p < 0.001

Patient specificity of DNAm patterns was validated via barcoded bisulfite amplicon next generation sequencing (BBA-Seq) (Figure 3.12). This technique enables multiplexing and analysis of larger amplicons. In total, 14 CpG sites could be covered by the sequencing assay. In analogy to the pyrosequencing results, the observed patterns revealed high patient specificity across the amplicon. In addition, functionality of the scoring systems could be validated via BBA-Seq (Figure 3.12D). As observed for the pyrosequencing samples, d-score and n-score were significantly increased for patients at first diagnosis and under relapse, and they allowed discrimination of healthy and diseased samples. Correlation of both scores was also observed for the BBA-Seq samples (Figure 3.12E).


Figure 3.12 Validation of scoring systems via next generation sequencing A-C) AML samples were re-analyzed by barcoded bisulfite amplicon sequencing (BBA-Seq). In total, 14 neighboring CpG sites were covered by the BBA-Seq assay (shown in grey). D) The d-scores (sum of differences to age-adjusted expected values) and n-scores (sum of differences between neighboring CpG sites) of AML patients at different stages of therapy were measured by BBA-Seq. E) d-score and n-score are positively correlated in AML patients. The higher number of CpG sites in BBA-Seq resulted in higher discriminatory power of the d-scores and n-scores. Error bars represent the mean with standard deviation; ns: not significant, *: p < 0.05, ***: p < 0.001

To test whether our scoring systems are also applicable to patients at an early disease stage, we analyzed a cohort of 126 peripheral blood samples from patients at first diagnosis of myelodysplastic syndrome (MDS). MDS patients are characterized by lower blast counts than AML patients, which may lead to reduced detectability of DNAm variability. In analogy to AML patients, the age of MDS patients was overestimated by a mean of 10 years (MAD = 16.7 years). In comparison to healthy controls, the methylation patterns of MDS patients revealed higher variances in the PDE4C amplicon (Figure 3.13A, B). The d-scores and n-scores of MDS samples were significantly increased (p < 0.001), indicating that our scoring systems can detect clonal development also at early disease stages. Positive correlation of both scores revealed that an increase in DNAm variability together with the loss of covariation in this genomic region is characteristic for clonal diseases, also in early stages of malignant transformation (Figure 3.13C, D).

Figure 3.13 MDS patients show increased variability scores A) Epigenetic age predictions based on three CpG sites (in the genes ASPA, ITGA2B and PDE4C) for 126 MDS patients and 155 healthy controls. B) DNA methylation patterns of the 7 CpG sites of PDE4C for MDS patients and healthy controls. C) The d-scores (sum of differences to age-adjusted expected values) and n-scores (sum of differences between neighboring CpG sites) are shown for MDS patients. D) d-score and n-score are positively correlated. Error bars represent the mean with standard deviation. ***: p < 0.001

The prognostic potential of our scoring systems was supported by Cox regression analyses, which revealed a significantly better overall survival for patients with low scores (d-score: Hazard Ratio = 1.01, 95% CI = 1.004 - 1.017, p = 0.001; n-score: Hazard Ratio = 1.02, 95% CI = 1.007 - 1.027, p < 0.001).
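A hazard ratio close to 1 can still be relevant when the covariate spans a wide range. The short sketch below illustrates this under the assumption that the reported hazard ratios refer to a one-unit increase of a continuous score, as is usual for Cox regression with a continuous covariate; the score difference of 50 units and the function name are hypothetical values chosen only for illustration.

```python
# Sketch: scaling a per-unit hazard ratio to a larger score difference.
# In a Cox model the log hazard is linear in the covariate, so the hazard
# ratio for a difference of `delta` units is hr_per_unit ** delta.
import math


def hazard_ratio_for_difference(hr_per_unit, delta):
    """Hazard ratio between two patients whose scores differ by `delta` units."""
    return math.exp(delta * math.log(hr_per_unit))


# Hypothetical comparison: two patients whose d-scores differ by 50 units.
hr_50 = hazard_ratio_for_difference(1.01, 50)
print(f"hazard ratio for a 50-unit difference: {hr_50:.2f}")
```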


In previous work, hypermethylation of a single CpG site in DNMT3A (cg08485187), the so-called epimutation, was described as associated with shorter overall survival of AML patients (Jost et al. 2014). The effect of the epimutation on overall survival could be confirmed for the MDS cohort, indicating that epigenetic changes in this region also occur at early disease stages (Figure 3.14A). Pyrosequencing analyses of the MDS cohort revealed a hypermethylation of successive CpG sites in the DNMT3A amplicon (Figure 3.14B). These aberrant methylation patterns were reflected in an increased d-score. However, the n-score was not increased, indicating that covariation of CpG sites is not altered in the DNMT3A amplicon, in contrast to the PDE4C amplicon (Figure 3.14C).

Figure 3.14 Variability scores of a region in DNMT3A A) Hypermethylation at an individual CpG site in DNMT3A (cg08485187) is associated with shorter overall survival of MDS patients. Samples are stratified by a methylation level of 10%. B) DNA methylation patterns of the 7 CpG sites of DNMT3A were measured by pyrosequencing. C) The d-scores (sum of differences to age-adjusted expected values) and n-scores (sum of differences between neighboring CpG sites) were calculated for a region in DNMT3A. Error bars represent the mean with standard deviation. ns: not significant, **: p < 0.005

Lack of correlation of the respective scores in the PDE4C and DNMT3A amplicons (Figure 3.15) indicated that changes in DNAm variability and covariation appear independently in different genomic loci.


Figure 3.15 Correlation of variability scores in different genomic regions A) The d-scores (sum of differences to age-adjusted expected values) and B) the n-scores (sum of differences between neighboring CpG sites) for the genomic regions in PDE4C and DNMT3A of 126 MDS samples were calculated based on pyrosequencing measurements. Of note, the variability scores of the two regions do not correlate.

3.2.2 Using Barcoded Bisulfite Amplicon Next Generation Sequencing to Classify DNA Methylation Patterns of AML Samples

We reasoned that the prevalence of malignant clones could be assessed via their patient specific DNAm patterns in the PDE4C amplicon. By identifying malignant DNAm patterns via machine learning approaches, we aimed to understand the development of malignant clones. To this end, methylation patterns of individual DNA strands needed to be analyzed, and a BBA-Seq technique was established in cooperation with Julia Franzen (Helmholtz Institute for Biomedical Engineering, Stem Cell Biology and Cellular Engineering, RWTH Aachen University). In brief, after bisulfite conversion, handle sequences and Illumina adapter sequences for NGS were added to the amplicon of the region of interest via two sequential PCRs. NGS was then performed, generating single strand resolution data of DNAm profiles. Data processing was performed by Dr. Jan Hapala (Helmholtz Institute for Biomedical Engineering, Stem Cell Biology and Cellular Engineering, RWTH Aachen University). Methylation levels of 41 healthy controls, 47 samples at first diagnosis of AML, 19 samples with and 6 samples without minimal residual disease (MRD) were analyzed in cooperation with Tanja Božić (Helmholtz Institute for Biomedical Engineering, Stem Cell Biology and Cellular Engineering, RWTH Aachen University). In line with our previous findings of the pyrosequencing measurements, we detected high variations of DNAm patterns in AML samples (Figure 3.16A). In contrast, DNAm patterns of healthy controls were highly reproducible. Follow-up data demonstrated that patients with MRD also show increased variability in DNAm patterns (Figure 3.16B). In addition, DNAm patterns of MRD+ samples closely reflected DNAm patterns of the same patient at first diagnosis (data not shown), as previously observed for the pyrosequencing samples. We reasoned that the development of the patient specific malignant clone is reflected by the composition of DNAm patterns.

Figure 3.16 DNAm patterns of AML samples as measured by BBA-Seq Methylation levels of an age-associated region in PDE4C of A) 41 healthy controls and 47 samples at first diagnosis of AML and B) 19 samples with MRD (MRD+) and 6 samples without MRD (MRD-) were analyzed via BBA-Seq. Shown is the mean methylation for each CpG site. Each line represents one sample. MRD: minimal residual disease

Consequently, we analyzed DNAm patterns of individual DNA molecules, which provide binary information on the methylation status of each CpG site. The combination of the methylation status of all CpG sites in the amplicon provides an individual pattern for each strand. Age-associated DNAm occurred reproducibly at specific CpG sites in the PDE4C amplicon, thus leading to a limited variability in DNAm patterns in healthy individuals. In contrast, methylated CpG sites in AML samples occurred throughout the amplicon in a stochastic manner. Higher variability of DNAm patterns of diseased patient samples was reflected by a higher number of total individual patterns, thus providing a new measure of epigenetic variability (Figure 3.17A).
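The pattern-based measure can be sketched as follows. Each read is represented as a string of 0 (unmethylated) and 1 (methylated), one character per CpG site; this encoding, the function name and the example reads are assumptions for illustration. The linear scaling to 1000 reads follows the normalization named in the legend of Figure 3.17A, although the exact procedure used there is not specified in the text.

```python
# Sketch: number of distinct DNAm patterns per sample, normalized per 1000 reads.
from collections import Counter


def distinct_patterns_per_1000_reads(reads):
    """Count distinct binary methylation patterns, scaled to 1000 reads."""
    if not reads:
        raise ValueError("sample contains no reads")
    return len(set(reads)) * 1000 / len(reads)


# Hypothetical reads covering 4 CpG sites.
healthy_reads = ["0000"] * 90 + ["0100"] * 10
aml_reads = (["0000"] * 40 + ["1010"] * 20 + ["0111"] * 20
             + ["1111"] * 10 + ["1001"] * 10)

print(Counter(healthy_reads).most_common(1))            # dominant pattern
print(distinct_patterns_per_1000_reads(healthy_reads))  # 20.0
print(distinct_patterns_per_1000_reads(aml_reads))      # 50.0
```

Because the number of distinct patterns tends to grow with sequencing depth, some form of normalization by read count is needed before samples are compared.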


Figure 3.17 Unsupervised clustering of AML samples by DNAm patterns A) The number of different patterns per sample is shown. Data are normalized as different patterns per 1000 reads. Unsupervised hierarchical clustering was performed and data are visualized by B) principal component analysis and C) Euclidean distance hierarchical clustering on samples represented as normalized vectors of pattern abundances in the samples. The colors in the heatmap correspond to the mutual similarity between sample pairs. Data analysis was performed by Dr. Jan Hapala, RWTH Aachen University. MRD: minimal residual disease


We assumed that the aberrant methylation patterns, which were exclusively found in AML patients, were derived from the developing malignant clone. Thus, we hypothesized that the composition of DNAm patterns could be used to discriminate between healthy and diseased samples. Unsupervised principal component analysis (PCA) of DNAm patterns revealed high discriminatory power of the first and second principal component (PC) (Figure 3.17B). In concordance, hierarchical clustering with Euclidean distance demonstrated that AML samples at first diagnosis and relapse clustered together (Figure 3.17C), providing proof of concept that DNAm patterns can be used to classify healthy and AML samples. However, PCA and hierarchical clustering analyses of DNAm patterns do not provide quantitative measures of the prevalence of the malignant clone. To overcome these limitations, machine learning was used to classify DNAm patterns of individual DNA strands into a diseased or healthy status. These analyses were performed by Dr. Jan Hapala (Helmholtz Institute for Biomedical Engineering, Stem Cell Biology and Cellular Engineering, RWTH Aachen University). Comparative analyses of different machine learning approaches identified the support vector machine (SVM) algorithm as the most suitable classifier. This classifier finds a hyperplane providing the best separation between the healthy and malignant samples. Healthy samples (n = 40) and first diagnoses of AML patients (n = 47) were randomly divided into a training (75%) and test (25%) set using the caTools package in the software R to train the SVM. Regions with coverage lower than 100 reads were removed. The read number of a particular DNAm pattern was normalized by the total number of reads of the corresponding region and sample, resulting in pattern frequencies. In healthy control samples up to 10.3% of patterns were classified as AML derived (mean = 4.4%, Figure 3.18). The percentage of AML classified reads was significantly increased in first diagnosis (mean = 42.2%, p < 0.001) or MRD+ samples (mean = 20.6%, p < 0.001). For MRD- samples, percentages of AML classified reads were close to the numbers of healthy controls (mean = 5.2%).
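The preprocessing steps around the classifier (coverage filter, normalization to pattern frequencies, 75%/25% split) can be sketched in a few lines. This is not the R/caTools workflow used for the analysis and it contains no SVM: the split below is a plain random shuffle, and is_aml_pattern is a hypothetical stand-in for whatever decision a trained classifier would return for a single pattern. All names and data are illustrative assumptions.

```python
# Sketch of the preprocessing around the pattern classifier.
import random

MIN_COVERAGE = 100  # regions with fewer reads are removed (threshold from the text)


def pattern_frequencies(counts, min_coverage=MIN_COVERAGE):
    """Normalize read counts per pattern to frequencies; None if coverage is too low."""
    total = sum(counts.values())
    if total < min_coverage:
        return None
    return {pattern: n / total for pattern, n in counts.items()}


def train_test_split(sample_ids, train_fraction=0.75, seed=0):
    """Randomly divide sample IDs into a training and a test set."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def percent_aml_classified(freqs, is_aml_pattern):
    """Percentage of reads whose pattern the classifier labels as AML derived."""
    return 100 * sum(f for p, f in freqs.items() if is_aml_pattern(p))


# Hypothetical sample: pattern -> number of reads.
freqs = pattern_frequencies({"00": 150, "11": 50})

# Stand-in classifier (NOT an SVM): any methylated CpG site counts as AML derived.
percent = percent_aml_classified(freqs, lambda pattern: "1" in pattern)

# 40 healthy + 47 AML samples, identified here only by running numbers.
train_ids, test_ids = train_test_split(range(87))
print(percent, len(train_ids), len(test_ids))
```

Holding out a test set that the classifier never sees during training is what allows the reported percentages to be read as an estimate of performance on new samples.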


Figure 3.18 Support vector machine algorithms classify healthy and malignant DNAm patterns The machine learning algorithm was trained on 40 healthy and 47 AML derived samples. DNAm patterns were classified as derived from healthy controls or AML patients by a support vector machine algorithm. Shown are the percentages of AML classified patterns for each sample. MRD: minimal residual disease

In theory, this classifier can be used to determine the proportion of malignant clones in a blood sample. However, further studies need to be conducted to analyze the specificity and sensitivity of this classifier.

3.3 The Functional Role of PRDM8 on Stem Cell Differentiation

A variety of genes are known to be involved in the aging process, such as telomerase-associated genes and genes encoding DNA repair enzymes (Christensen et al. 2009; Florath et al. 2014). These genes might be mutated or epigenetically modified during aging or in age-related diseases. In a genome-wide screening for DNAm changes in patients with telomeropathies, Weidner et al. identified CpG sites in the histone methyltransferase PRDM8 as significantly hypermethylated in comparison to healthy controls (Weidner et al. 2016). In addition, PRDM8 expression was significantly reduced in patient samples. In this thesis, we modify PRDM8 expression to explore the possible effects of this epigenetic modifier on stem cell differentiation.

3.3.1 PRDM8 is Differentially Methylated in Dyskeratosis Congenita Patients

To validate hypermethylation of the CpG site cg27242132 within the PRDM8 gene, MassARRAY analysis of peripheral blood samples from 10 healthy controls, 18 DKC patients and 41 AA patients was performed by Varionostic GmbH (Figure 3.19). In concordance with the findings published by Weidner et al., significant hypermethylation could be detected in DKC and AA patients (p = 0.009 and p = 0.0004, respectively; Weidner et al. 2016).

Figure 3.19 DNAm frequencies of the PRDM8 gene in patients with telomeropathies DNA methylation frequencies of the CpG site cg27242132 within the PRDM8 gene were determined for 10 healthy controls, 18 DKC patients and 41 AA patients by MassARRAY analysis. Error bars represent the mean with standard deviation. ns: not significant, ***: p < 0.001, *: p < 0.05

Together with the previously described decrease in PRDM8 expression in DKC patients, this leads to the assumption that PRDM8 plays an important role in telomeropathies. To investigate the role of this epigenetic modifier on stem cell differentiation, gene knockouts in iPSCs were generated using the CRISPR/Cas9n technology.

3.3.2 Using CRISPR/Cas9n to Modulate PRDM8 Expression in Induced Pluripotent Stem Cells

The CRISPR/Cas9n technology was applied to generate PRDM8 gene knockouts. All generated cell preparations were clonally derived from single cells. Deletion of the targeted PRDM8 gDNA regions was confirmed via PCR using flanking primers which generated a 320 bp band for the edited and a 1046 bp band for the unedited genomic region. Two edited clones could be generated from the same iPSC clone, which was reprogrammed from bone marrow mesenchymal stromal cells (MSCs). For the PRDM8 -/- clone, deletion of gDNA was confirmed by PCR (Figure 3.20A). Sequencing of cDNA confirmed the removal of the second exon of transcript 2 (Figure 3.20B). The deletion of exon 2, which contains the start codon, leads to an altered transcription start site and a subsequent stop codon after 84 bp, thus resulting in the complete knockout of the PRDM8 protein in the homozygous clone. As an additional clone, a heterozygous PRDM8 +/- clone was generated. To ensure clonal origin of this cell preparation, cells were sub-cloned by limiting dilution. Nine subclones were generated, each revealing two bands on the agarose gel after control PCR on gDNA: a 320 bp and a 1046 bp band. These bands represent the edited and unedited gDNA region, respectively, proving heterozygosity of the generated clones (Figure 3.20C). In addition, Sanger sequencing of gDNA resulted in indistinct base calls due to heterogeneity of the PCR amplicon (Figure 3.20D).

Figure 3.20 Screening for genetically engineered iPSC clones A) Homozygous deletion of the targeted PRDM8 region was confirmed via PCR of gDNA and subsequent gel electrophoresis. Expected lengths of PCR amplicons are 320 bp for the edited and 1046 bp for the unedited amplicon. B) Alignment of Sanger sequencing results for the PRDM8 -/- clone cDNA to the reference sequence. Exon 2 of transcript 2 is shown in yellow and marked in gray. Interrupted alignment due to exon deletion is indicated in red. C) Control PCR of subclones derived from heterozygous cell preparation. Expected lengths of PCR amplicons are 320 bp for the edited and 1046 bp for the unedited amplicon. D) Base calls for Sanger sequencing of gDNA of heterozygous PRDM8 +/- clone. Exon 2 of transcript 2 is marked in gray. The observed double peaks indicate heterozygosity. Green, yellow and red color code beneath peaks represents sequencing quality. Low sequencing quality is a result of double peaks caused by heterozygosity.

The specificity of the genome editing process is substantially increased by using the mutated Cas9n instead of wildtype (WT) Cas9, so that off-target effects were considered negligible. For this reason, genome-wide off-target effects were not analyzed in this study. Western blot analyses could not be performed at this stage due to the lack of mRNA and protein expression of PRDM8 in iPSCs. Pluripotency of all generated iPSC clones was analyzed by immunofluorescence staining (Figure 3.21). Colonies were stained with antibodies against the pluripotency markers OCT4 and TRA-1-60. Expression of these markers was detected in the genetically engineered iPSC clones and the isogenic WT clone.

Figure 3.21 Staining of pluripotency markers for the genetically engineered iPSC clones Staining of iPSCs for the pluripotency markers OCT4 and TRA-1-60. Nuclei were counterstained with DAPI. All clones show positive staining for both pluripotency markers.

Pluripotency is defined by the potential to differentiate into all three germ layers. To validate the differentiation potential, all clones were subjected to EB assays which allow for spontaneous differentiation of iPSCs. After 14 days of differentiation, gene expression analysis of lineage markers and the pluripotency marker OCT4 was performed (Figure 3.22).

Figure 3.22 Multi-lineage differentiation potential of genetically engineered iPSC clones Embryoid body assays of genetically engineered clones and the isogenic WT control were performed for 14 days. At day 14, RNA was harvested and gene expression analyses were performed. A) Gene expression is depicted as mean fold change relative to GAPDH. OCT4 was used as a marker for pluripotency. PAX6 and NES were used as ectodermal markers. SOX17 and AFP were used as endodermal markers. RUNX1 and NKX2.5 were used as mesodermal markers. Gradient bar represents scale of expression levels; red: high expression, blue: low expression. B) Quantification of PRDM8 expression in EB assays at day 14 using the ΔΔCT method. Values were normalized to WT control day 0.

After 14 days of differentiation, a moderate reduction of OCT4 expression was observed in all clones. The isogenic WT clone showed increased expression of all lineage markers. In addition, increased expression of the mesodermal marker RUNX1, the endodermal marker AFP and the ectodermal marker PAX6 was observed for the PRDM8+/- and PRDM8-/- clones in comparison to day 0 (Figure 3.22A). Overall, this reveals that all clones lose their pluripotency under EB assay conditions and are able to form progenitor cells of all three germ layers. However, the genetically engineered clones showed reduced expression of the mesodermal marker NKX2.5, which is important for the development of cardiomyocytes. Importantly, the PRDM8+/- and PRDM8-/- clones revealed a lack of expression of the early neural markers PAX6 and NES when compared to the isogenic control, indicating impaired potential for neural differentiation. The expression of ectodermal markers correlates with the PRDM8 genotype, as the expression of these markers is even lower in the PRDM8-/- clones than in the PRDM8+/- clones. Of note, PRDM8 expression in WT iPSCs is limited (Figure 3.22B) but increases about 20-fold after 14 days of differentiation when compared to day 0. For the PRDM8-/- clones, no expression could be detected since the primers for the semi-qPCR bind to the deleted cDNA region. For the heterozygous clone, however, moderate expression was observed. We assume that this expression results from the native, unedited allele. Taken together, these findings reveal that PRDM8 expression is increased during early spontaneous stem cell differentiation, indicating that it may play an important role in the regulation of stem cell differentiation.

3.3.3 Influence of PRDM8 on Neural Differentiation

The role of PRDM8 in neural development of mice has been elucidated by recent studies (Inoue et al. 2015; Iwai et al. 2018). This project aims to enhance our understanding of the physiological role of PRDM8 during human neurogenesis. In line with current literature, undirected differentiation via EB assays confirmed the impaired neural development of PRDM8 depleted cells also for the human system. Directed neural differentiation was performed (in collaboration with the Institute of Physiology, Lampert Lab, RWTH Aachen University) to further investigate the role of PRDM8 expression on human neural development. After 10 days of neural differentiation into the sensory phenotype, gene expression analyses were performed (Figure 3.23A). In comparison to the isogenic control, the PRDM8 depleted clones revealed decreased expression of the neural markers TAC1, NAV1.7, NES, NEFH and SOX1. The expression of neural markers in the homozygous knockout clone was reduced in comparison to the heterozygous clone, indicating that the neural differentiation capacity correlates with the genotype. After three additional days of neural maturation, microscopic analyses revealed prominent morphological differences (Figure 3.23B). While WT cells formed ganglion like structures with bridging axon like fibers, hardly any agglomerates of neuronal like cells could be detected in the PRDM8-/- and PRDM8+/- cultures. These findings suggest that PRDM8 plays an essential role for early neural development and the development of human sensory neurons.

Figure 3.23 Neural differentiation of PRDM8 depleted iPSCs A) Neural differentiation of the PRDM8+/- and PRDM8-/- clones and the isogenic WT control was performed for 10 days. RNA was harvested and gene expression analyses were performed. Undifferentiated WT cells serve as control. Gene expression is depicted as mean fold change of GAPDH. Gradient bar represents scale of expression levels; red: high expression, blue: low expression. B) Microscopic analyses of cell morphology after three days of maturation into the sensory phenotype.

We hypothesized that overexpression of PRDM8 could rescue or promote neural differentiation. Hence, PRDM8 overexpressing clones were generated using two strategies. Based on an overexpression vector encoding GFP as an expression marker, we designed either a construct encoding a GFP-PRDM8 fusion protein with a GSG linker or a construct encoding GFP and PRDM8 separated by a T2A sequence to fully avoid steric hindrance. Successful overexpression of PRDM8 was confirmed via semi-qPCR (Figure 3.24). In comparison to iPSCs transduced with the GFP vector only, transduction with the GFP-PRDM8 fusion vector or the GFP-T2A-PRDM8 vector resulted in a 70- to 230-fold increase in PRDM8 expression.

Figure 3.24 Gene expression analysis of PRDM8 in transduced iPSCs iPSCs were transduced with vectors coding for GFP only, a GFP-PRDM8 fusion protein or T2A separated GFP and PRDM8 proteins. RNA was harvested ten days after transduction and relative expression of PRDM8 was determined. Values were normalized to GAPDH and gene expression is depicted as relative expression normalized to the WT cells transduced with the control vector (ΔΔCT-method).
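
The ΔΔCT normalization described in this caption can be illustrated with a minimal sketch. It assumes a PCR efficiency of approximately 100% (one doubling per cycle); the function name and all CT values are hypothetical and serve only as an illustration, they are not measured data from this study.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target gene by the delta-delta-CT method.

    Assumes ~100% amplification efficiency (factor 2 per cycle).
    """
    # Normalize the sample to its reference gene (e.g. GAPDH).
    delta_ct_sample = ct_target - ct_reference
    # Normalize the control condition to the same reference gene.
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl
    # Difference between sample and control, converted to a fold change.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: target detected 7 cycles earlier than in the control.
fold_change = relative_expression(
    ct_target=24.0, ct_reference=18.0,
    ct_target_ctrl=31.0, ct_reference_ctrl=18.0,
)
print(fold_change)  # 2 ** 7 = 128.0
```

In this hypothetical case, a shift of seven cycles relative to the vector control corresponds to a 128-fold increase, which lies within the order of magnitude reported for the overexpression constructs.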

Functionality of the T2A sequence was validated via confocal microscopy of sorted, GFP positive cells (Figure 3.25). Green fluorescence was detected exclusively in the nuclei of the cells transduced with the GFP-PRDM8 fusion construct, indicating that the fusion protein is transported to the nuclei via the nuclear localization sequence of PRDM8. In contrast, in the control cells transduced with the GFP vector only and in the cells transduced with the GFP-T2A-PRDM8 construct, green fluorescence was detected in the nuclei and in the cytoplasm. This indicates that the product of the GFP-T2A-PRDM8 construct is separated via ribosome skipping at the T2A sequence, thus distributing the GFP into the cytoplasm. The proportion of GFP positive cells varied between the cell preparations, indicating that either the sorting efficiency was limited or the transgene was silenced during culture.

Figure 3.25 Confocal microscopy of transduced iPSCs after cell sorting After cell sorting, GFP positive iPSCs were seeded onto vitronectin coated glass slides and grown for 2 days. Cells were fixed with 4% PFA for 10 min and counterstained with DAPI. In contrast to the clones transduced with the control vector and the vector coding for GFP-T2A-PRDM8, the cells transduced with the construct for GFP-PRDM8 showed GFP localization exclusively in the nucleus.

Meanwhile, our cooperation partners optimized the differentiation protocol by increasing the concentration of the SMAD inhibitor SB 431542 in the media 10-fold. The PRDM8+/- and PRDM8-/- clones, the isogenic WT control and two additional WT control clones derived from bone marrow MSCs of different donors were then differentiated with the optimized protocol into the sensory neuronal phenotype, each either untransduced or transduced with the control vector, the GFP-PRDM8 vector or the GFP-T2A-PRDM8 vector. The PRDM8-/- and isogenic WT clones after 28 days of maturation are shown as examples (Figure 3.26).

Figure 3.26 Maturation of sensory neurons derived from transduced iPSCs Shown is day 28 of sensory neuronal maturation. For transduction, a vector control and constructs encoding for a GFP-PRDM8 fusion protein and for T2A separated GFP and PRDM8 proteins were applied. The vector control contained only sequences encoding for GFP. Untransduced cells served as additional control. Red arrowheads indicate prominent neuronal structures such as ganglia-like aggregates and neuronal fibers.

Compared to the untransduced control, less prominent neuronal structures were observed in the transduced cells of the isogenic WT control, indicating that the treatment with an integrative virus may interfere with the neural differentiation potential. The number and morphology of neuronal structures of the PRDM8 overexpressing cells, however, were similar to those of the cells transduced with the control vector. As observed in previous experiments, the neural differentiation capacity of the PRDM8-/- clone was reduced. Our hypothesis that PRDM8 overexpression may lead to a complete rescue of the differentiation could not be confirmed, as neuronal structures of the PRDM8 overexpressing clones with PRDM8-/- background were less prominent in comparison to the isogenic WT control. Notably, the number and morphology of neuronal structures under all conditions (untransduced, vector control, GFP-PRDM8 and GFP-T2A-PRDM8) varied between the three WT controls, so that no further conclusions could be drawn from the morphological analyses.

3.3.4 Genome-Wide Analyses of PRDM8-Dependent DNA Methylation Changes During Neural Differentiation

To gain quantitative insight into changes in global DNAm, Infinium MethylationEPIC BeadChip analyses were performed with all clones, either untransduced or transduced with the GFP-T2A-PRDM8 construct, after ten days of neural differentiation. It is important to note that this part of the thesis aims to provide a rationale for further, statistically sound studies on global DNAm; accordingly, all experiments were performed only once and with one sample each. First, we applied the Horvath predictor to all samples because we hypothesized that the PRDM8 knockout or overexpression might influence the epigenetic age (Horvath 2013). However, the epigenetic age of all samples was estimated close to zero years. As a next step, clustering analyses were performed. Principal component analyses demonstrated that PC1 and PC2 seem to mainly reflect donor-specific differences (Figure 3.27A). Hence, PC2 and PC3 were used to visualize PRDM8-dependent differences in the DNAm data (Figure 3.27B). This PC combination allowed clustering of the PRDM8+/- and PRDM8-/- clones and revealed differences in comparison to the isogenic control (WT1). However, for the PRDM8+/- and PRDM8-/- clones as well as the isogenic control clone, the untransduced and PRDM8 overexpressing conditions clustered together.

Figure 3.27 Clustering of DNAm data of neurally differentiated cells Principal component analyses of DNAm data of PRDM8+/- and PRDM8-/- cells, the isogenic control (WT1) and two additional controls (WT2, WT3) after 10 days of neural differentiation. The cells are either untransduced or overexpressing PRDM8 via the GFP-T2A-PRDM8 construct. A) PC1 and PC2 of DNAm data reveal donor-specific clustering of samples. B) PC2 and PC3 are visualized to demonstrate differences between the PRDM8+/- and PRDM8-/- clones and the isogenic control.

To further investigate the influence of PRDM8 overexpression, mean DNAm data for the three untransduced and PRDM8 overexpressing WT controls were compared. Only a small number of CpG sites revealed substantial differential methylation (Figure 3.28A). In addition, adjusted p-values calculated for these samples revealed no significantly differentially methylated CpGs. Together with the PCAs, these findings led us to the conclusion that the overexpression does not influence the DNAm pattern. An alternative explanation is that PRDM8 expression was silenced during neural differentiation and thus induced no differences in DNAm. Comparison of DNAm levels of the PRDM8-/- clone and the isogenic control revealed 6950 hypomethylated and 7630 hypermethylated CpG sites in the PRDM8-/- clone (Figure 3.28B). To test whether the overexpression led to inverse methylation in comparison to the knockout, we plotted the methylation values of the same CpG sites as in Figure 3.28B for the overexpressing clone (WT1 O.E.) instead of the knockout clone (Figure 3.28C). Notably, no evident inversion of the methylation levels was observed. Overall, the small number of CpG sites that were hypermethylated in the overexpressing clone were also hypermethylated in the knockout clone and vice versa, thus strengthening our hypothesis that the PRDM8 overexpression experiments did not influence the methylation landscape. To further elucidate the specificity of PRDM8-dependent DNAm changes, hierarchical clustering of the differentially methylated CpG sites from Figure 3.28B was performed. To account for the different genders of the donors, CpG sites located on the sex chromosomes were excluded. The PRDM8+/- and PRDM8-/- clones were grouped by hierarchical clustering and overall methylation levels of these two clones were comparable, indicating that the homozygous and heterozygous knockout led to similar changes in global DNAm (Figure 3.28D). CpG sites that were differentially methylated between PRDM8-/- cells and the isogenic control were also differentially methylated in comparison to the other WT control clones, substantiating that the observed differential methylation is PRDM8 dependent.
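
The selection of differentially methylated CpG sites by a fixed methylation difference can be sketched as follows. This is a minimal illustration of a 20% threshold on beta values between a knockout and a control sample, not the analysis pipeline used in this thesis; the function name, the probe identifiers and the beta values are hypothetical.

```python
def classify_cpgs(beta_knockout, beta_control, min_diff=0.20):
    """Split CpG sites into hyper- and hypomethylated lists.

    Both arguments map a CpG identifier to a beta value between 0 and 1.
    A site counts as differentially methylated if the knockout differs
    from the control by at least `min_diff` (0.20 = 20% methylation).
    """
    hyper, hypo = [], []
    for cpg, beta_ko in beta_knockout.items():
        if cpg not in beta_control:
            continue  # skip probes that are missing in the control sample
        diff = beta_ko - beta_control[cpg]
        if diff >= min_diff:
            hyper.append(cpg)
        elif diff <= -min_diff:
            hypo.append(cpg)
    return hyper, hypo


# Hypothetical beta values for four placeholder probes.
knockout = {"cpg_a": 0.80, "cpg_b": 0.10, "cpg_c": 0.55, "cpg_d": 0.90}
control = {"cpg_a": 0.40, "cpg_b": 0.50, "cpg_c": 0.50}

hyper_sites, hypo_sites = classify_cpgs(knockout, control)
print(hyper_sites, hypo_sites)
```

With a single sample per condition, as in this part of the study, such a threshold only flags candidate sites and does not replace a statistical test.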

Figure 3.28 PRDM8-dependent differential DNAm A) Scatter plot of mean methylation values of untransduced and PRDM8 overexpressing WT controls. Shown are CpG sites with a methylation difference of at least 20%. B) Methylation levels of the PRDM8-/- clone and the isogenic control (WT1). Shown are CpG sites with a methylation difference of at least 20%. Red indicates hypermethylation and blue indicates hypomethylation of the knockout clone. C) The CpG sites from B) are shown for the PRDM8 overexpressing WT1 clone and the untransduced WT1 clone. No evident PRDM8-dependent reversion of the DNAm values can be observed. D) Hierarchical clustering of CpG sites from B) using Euclidean distances. Red indicates hypermethylation and blue indicates hypomethylation of the knockout clone.

Most of the differentially methylated sites from Figure 3.28B are located in gene bodies and outside of CpG islands (Figure 3.29A, B). In the PRDM8-/- clone, promoter regions and CpG islands tended to be hypomethylated more often than hypermethylated. Thus, we hypothesize that PRDM8 indirectly functions as a repressor of gene expression by promoting methylation in promoter regions.

Figure 3.29 Distribution of differentially methylated regions Localization of 14 580 CpG sites with at least 20% differential methylation between PRDM8-/- cells and the isogenic control with regard to A) gene structures and B) CpG islands. Gene ontology enrichment analyses were performed for genes revealing C) hyper- or D) hypomethylation in their promoter regions (TSS200, TSS1500, 5’UTR) using the software GoMiner.

We reasoned that differential methylation in promoter regions may alter expression of the respective genes and thus contribute fundamentally to the impaired neural differentiation. Gene ontology analyses of hypermethylated CpG sites in promoter regions of the PRDM8-/- clone revealed enrichment in categories involved in intracellular signaling, neuronal signaling and sensory perception (Figure 3.29C). Promoter hypomethylation was associated with categories involved in cell-cell adhesion, transmission of nerve impulses and synapse formation (Figure 3.29D). In conclusion, this is the first report of impaired neural differentiation of human PRDM8-/- cells. Notably, we identified PRDM8-dependent differential DNAm in promoter regions of genes involved in neural development and function.

4 Discussion

4.1 Tissue Specific DNA Methylation Enhances Epigenetic Age Predictions

Using differential DNAm profiling for the purpose of age predictions is a rapidly growing research field. Since the first epigenetic age predictors were described in 2011 (Bocklandt et al. 2011; Koch and Wagner 2011), many studies have focused on the generation of age predictors applicable to single or multiple tissues (Garagnani et al. 2012; Hannum et al. 2013; Horvath 2013; Weidner et al. 2014; Zbiec-Piekarska et al. 2015). In addition to age estimations for adults, recent studies also report epigenetic predictions of gestational age or predictions for mouse samples (Girchenko et al. 2017; Han et al. 2018). DNA methylation analyses have also become of growing interest for forensic science. Using differential DNAm profiling to predict the age of an unknown blood stain donor is by far the most popular and fast-expanding forensic epigenetic application (Vidaki and Kayser 2018). Buccal swabs are a widely used specimen in forensic science (Thongngam et al. 2017; Vidaki et al. 2018). They have the benefit that they are easy to harvest and store. A study on the Danish nurse cohort examined the compliance of participants in a pilot study (Hansen et al. 2007). Their data showed that only 31% of the requested participants delivered a blood sample, whereas 72% and 80% delivered a saliva or buccal cell sample via mouth swabs, respectively. In addition, sample collections for, e.g., genetic testing or age predictions of allegedly underaged unaccompanied refugees require non-invasive procedures (Barata et al. 2015). Saliva samples and buccal swabs are therefore an attractive source for epigenetic age predictions. In 2011, Bocklandt et al. described epigenetic age predictions for saliva samples, which are mainly comprised of leukocytes. From 88 identified age-related genes in saliva samples, only 10 CpG sites had previously been described as correlated with age in whole blood samples (Bocklandt et al. 2011).
They generated a multivariate model based on three CpG sites and performed age predictions with a MAD of 5.3 years in a cohort of 66 saliva samples. However, the authors did not apply their predictor to an independent validation cohort or to buccal swab samples. It is often suggested that buccal swabs can be used as an alternative to saliva, which is a misguided assumption due to the different proportions of buccal epithelial cells and leukocytes in saliva and swab samples (Thiede et al. 2000; Vidaki et al. 2016; Vidaki and Kayser 2018). Tissue of origin has been widely accepted as a major source of DNAm differences across samples, regardless of whether they originate from the same or different individuals (Ziller et al. 2011; Davies et al. 2012; Jiang et al. 2015). In line with this, our analyses of publicly available DNAm data demonstrate that the proportion of leukocytes in saliva samples lies in between those of blood and mouth swab samples. These findings underline the importance of accounting for the cellular composition when performing epigenetic analyses. The deviation of age predictions of swab samples with our 3-CpG-blood model (Weidner et al. 2014) can be explained by the differences in age correlation of the individual CpG sites in buccal epithelial cells as compared to blood. Out of the three CpG sites of the age predictor, only one, the CpG site associated with PDE4C, revealed correlation with age in buccal swab samples and could be used for age prediction in a 1-CpG-model (Figure 3.2). Nonetheless, age predictions based on pyrosequencing of only one CpG site are prone to errors caused by outliers. Re-training of our previous predictor into a 3-CpG-swab-model also allowed moderate age predictions. However, this model cannot be adjusted for cell types and is susceptible to outliers with above-average amounts of either leukocytes or epithelial cells in the sample. To overcome these limitations, we aimed at developing an epigenetic age predictor that can be adjusted for the proportion of cell types in swab samples. We found the CpG site in the promoter of CD6 to be hypomethylated in blood samples, whereas the CpG site in the SERPINB5 promoter was hypomethylated in epithelial cells. CD6 is a surface marker of immune cells and SERPINB5 was originally described as a tumor suppressor (Sager et al. 1997; Brown 2016). It is well known that smoking, ethnicity and gender have an influence on the methylation of specific CpG sites in the genome (Zhang et al. 2011; Elliott et al. 2014).
Investigations on the influence of these known confounding factors revealed no effect on our implemented CpG sites. This thorough investigation of confounding factors represents a major advantage of our generated age predictors. By combining the CD6 and SERPINB5 markers, we generated the so-called Buccal-Cell-Signature to estimate the percentage of epithelial and blood cells in buccal swabs. The predicted proportion of leukocytes varied between 12% and 63% (mean = 35%). This is in line with a previous study based on short tandem repeats after allogeneic transplantation that reported percentages of leukocytes between 5% and 60% in buccal swab samples (Thiede et al. 2000).
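
The principle of estimating cell proportions from two cell type specific CpG sites can be illustrated with a simple linear mixing model. The following sketch is not the published Buccal-Cell-Signature: the reference methylation levels for pure leukocytes and pure epithelial cells are hypothetical placeholders, and the averaging of the two marker-wise estimates is an assumption made for illustration only.

```python
def leukocyte_fraction(beta_cd6, beta_serpinb5,
                       ref_cd6=(0.05, 0.95), ref_serpinb5=(0.95, 0.05)):
    """Estimate the leukocyte fraction of a swab from two marker CpGs.

    Each reference tuple is (beta in pure leukocytes, beta in pure
    epithelial cells). The default values are hypothetical placeholders.
    """
    estimates = []
    markers = ((beta_cd6, ref_cd6), (beta_serpinb5, ref_serpinb5))
    for beta, (ref_blood, ref_epi) in markers:
        # Linear mixing: beta = f * ref_blood + (1 - f) * ref_epi
        fraction = (beta - ref_epi) / (ref_blood - ref_epi)
        # Clip to the valid range to absorb measurement noise.
        estimates.append(min(1.0, max(0.0, fraction)))
    # Average the two independent estimates.
    return sum(estimates) / len(estimates)


# Hypothetical swab with intermediate methylation at both markers.
estimate = leukocyte_fraction(beta_cd6=0.635, beta_serpinb5=0.365)
print(round(estimate, 2))
```

Because the two markers are hypomethylated in opposite cell types, each of them yields an independent estimate of the same fraction, which makes the combined value less sensitive to an outlier at a single CpG site.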

Recently, Zheng et al. developed an algorithm to determine the cellular composition of buccal swabs (Zheng et al. 2018). The cellular content was estimated based on methylation profiles with Hierarchical Epigenetic Dissection of Intra-Sample-Heterogeneity (HEpiDISH), which is a reference-based cell type deconvolution algorithm. Predicted epithelial cell percentages of buccal swabs ranged from 57.6% to 96.7%. In a comparative study, van Dongen et al. applied the HEpiDISH approach and our Buccal-Cell-Signature to the same set of samples. Estimates of epithelial cell proportions of both methods correlated significantly, thus supporting functionality of our predictor (R² = 0.97, p < 2.2 × 10⁻¹⁶). However, HEpiDISH seemed to provide higher discriminatory power in the higher range of epithelial cell percentages and has the advantage of allowing estimations of the proportions of leukocyte subtypes (van Dongen et al. 2018). It is important to note that models for the prediction of cellular compositions of buccal swab samples can be used not only for age prediction purposes but also to improve interpretations of data derived from swab samples in general. Marinova et al. successfully applied our Buccal-Cell-Signature to correct for cell type differences in methylation array data derived from buccal swab samples to study the epigenetic effect of childhood labor and trauma (Marinova et al. 2017). In this thesis, the Buccal-Cell-Signature was combined with three age-associated CpG sites into a new model for epigenetic age estimations of buccal swab samples. Notably, this 5-CpG-model outperformed our previously developed age predictors when applied to samples of donors over the age of 35 years. Previous studies have highlighted greater deviations of predicted age in samples from children and elderly people, relative to young adults (Zbiec-Piekarska et al. 2015; Zubakov et al. 2016). Thus, the 5-CpG-model might provide a promising tool for age estimations, especially for elderly persons.
After publication of this project (Eipel et al. 2016), Hong et al. described age predictions of saliva samples by the combination of six age-associated and one cell type specific CpG site. They achieved age predictions with a MAD of only 3.2 years (Hong et al. 2017). The improved age prediction accuracy may result from the inclusion of a higher number of age-related CpG sites. In addition, new methylation arrays and epigenome-wide techniques are available to identify CpG sites which reveal even higher age correlation in comparison to those applied in our study. Recently, Jung et al. developed an epigenetic age predictor applicable to blood, saliva and buccal swab samples based on 5 age-associated CpG sites (located in the genes ELOVL2, FHL2, KLF14, C1orf132/MIR29B2C, and TRIM59) which were previously described to perform accurate age prediction in blood samples (Zbiec-Piekarska et al. 2015; Jung et al. 2019). The authors constructed a multivariate linear regression model allowing age prediction of all three sample types with a MAD of 3.8 years across all samples and an additional model applicable specifically to buccal swab samples with a MAD of 4.3 years in an independent test cohort. This reveals improved age prediction in comparison to our 5-CpG-model (MAD = 5.09 years). This can be explained by the fact that three out of the five CpG sites used by Jung et al. show high age correlation in swab samples, thus diminishing the need for cell type adjustments. In general, when developing age prediction models, special care should be taken regarding the choice of suitable marker sets. Models with higher numbers of CpG sites are more robust against outliers. However, this benefit comes with the disadvantage of higher sequencing costs and the need for bioinformatic processing of the generated data when performing next generation sequencing or Illumina BeadChip based techniques. Although an epigenetic age prediction model has been proposed that can be applied across human tissues (Horvath 2013), the number of CpG sites used in this method is too large for multiplex-based trace analysis with current technologies (Vidaki and Kayser 2017). In conclusion, we demonstrated that cell type adjustments can improve epigenetic analysis of buccal swab samples. We provided a simple tool for cell type estimations which, together with age-associated CpG sites, can be used for accurate age predictions of buccal swab samples.
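
The comparison of predictors by their MAD can be made concrete with a short sketch. It assumes that MAD denotes the mean absolute deviation between predicted and chronological age; the linear model, its intercept and coefficients, and all ages below are hypothetical and do not correspond to any of the published predictors discussed here.

```python
def predict_age(betas, intercept, coefficients):
    """Multivariate linear age model: intercept + sum(coefficient * beta)."""
    if len(betas) != len(coefficients):
        raise ValueError("one coefficient is required per CpG site")
    return intercept + sum(w * b for w, b in zip(coefficients, betas))


def mean_absolute_deviation(predicted_ages, chronological_ages):
    """Mean absolute difference between predicted and chronological age."""
    if not predicted_ages or len(predicted_ages) != len(chronological_ages):
        raise ValueError("both lists must be non-empty and of equal length")
    deviations = [abs(p - c) for p, c in zip(predicted_ages, chronological_ages)]
    return sum(deviations) / len(deviations)


# Hypothetical two-CpG model applied to one sample.
example_age = predict_age(betas=[0.5, 0.2], intercept=10.0,
                          coefficients=[40.0, 50.0])

# Hypothetical predictions for three donors.
mad = mean_absolute_deviation([30.0, 42.0, 55.0], [33.0, 40.0, 60.0])
print(example_age, round(mad, 2))
```

A MAD calculated on the training cohort is usually lower than on an independent validation cohort, which is why the values reported for different predictors are only comparable if they were obtained on independent samples.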

4.2 Malignant Clonal Development can be Identified by DNA Methylation Variability of Successive CG Dinucleotides

It is well known that myeloid neoplasms are reflected by aberrant DNAm which can be used for disease stratification (Li et al. 2016a; Li et al. 2016b). So far, these methods have mainly been based on the comparison of DNAm levels at specific CpG sites or on genome-wide DNAm profiles and require bioinformatic data processing. In this study, two alternative scoring systems based on the variation of DNAm levels at successive CpG sites are proposed. Our finding that patients with hematological malignancies have a higher predicted epigenetic age is in line with a recent study by Lin and Wagner. In a meta-analysis, the authors described for 25 different cancer tissues a loss of age correlations of CpG sites which show age-association in healthy tissue (Lin and Wagner 2015). This loss of age correlation leads to a vast overestimation of the epigenetic age. However, the authors reasoned that aberrant DNAm at age-associated regions cannot be attributed solely to accelerated epigenetic aging caused by higher cell proliferation, because epigenetic age is not exclusively overestimated in malignant diseases but can also be underestimated (Kim et al. 2014). Lin and Wagner concluded that epigenetic aging signatures in cancer tissues do not reflect accelerated or decelerated aging rates but rather reflect the state of aging in the tumor-initiating malignant clone. In fact, our finding that methylation patterns at successive CpG sites are patient specific supports this theory, as it can be anticipated that the pattern is derived from the tumor-initiating clone. The age-associated region within PDE4C seems to be particularly suited to track epigenetic variability during malignant transformation and biological age, even though it represents only a short genomic region. In 2016, Lin et al. demonstrated that methylation of a single CpG site in PDE4C, the CpG site originally identified for the generation of our epigenetic age predictor, is indicative of life expectancy in the Lothian Birth Cohort (Lin et al. 2016). This finding highlights the clinical relevance of methylation pattern analysis of age-associated regions in general and the PDE4C gene in particular. However, other AML associated genomic regions, such as the analyzed region in DNMT3A, do not depict high levels of inter-CpG variation in hematological malignancies, as shown by the low n-scores (Figure 3.14). In addition, d-score and n-scores of both analyzed genomic regions did not correlate, indicating that DNAm variability in different genomic loci is not coherently modified during malignant transformation.
Why variability in DNAm levels in PDE4C can be used to predict life expectancy and to track malignant clones remains to be elucidated. Hansen et al. found evidence that DNAm heterogeneity is not completely random throughout the epigenome and may occur preferentially at epigenetic hypervariable hotspots (Hansen et al. 2011; Li et al. 2016b). The authors suggested that sites of focal epigenetic variance may provide an evolutionary benefit for cells within a population by sampling different transcriptional states. To what extent this also applies to the region in PDE4C remains to be investigated. Epigenetic variability has been used previously to track clonal malignancies. In 2016, Li et al. published a study on allelic diversity of epigenetic compartments in AML samples (Li et al. 2016a). They examined epigenetic heterogeneity, as assessed by DNAm within defined genomic loci containing four CpG sites, as well as somatic mutations and transcriptomes of AML patient samples at serial time points. They observed that epigenetic allele burden varies considerably during disease progression and identified subsets of AMLs with high epiallele and low somatic mutation burden at diagnosis, a subset with high somatic mutation and lower epiallele burdens at diagnosis, and a subset with a mixed profile, suggesting distinct modes of tumor heterogeneity. This result indicates that epigenetic variability is independent of the frequency of somatic mutations. Interestingly, epiallele burden was associated with poor clinical outcome. This is in line with our findings that increased d-scores and n-scores are indicative of clinical outcome. Nonetheless, Li et al. reported that epiallele shifting is highly variable within individual patients at diagnosis and relapse, thus contradicting our finding that DNAm patterns are highly patient specific and reoccur in relapsed patients. The observed variability in DNAm patterns may therefore not be a marker for changes in physiological function during malignant transformation but may reflect stochastic changes in DNAm, or epigenetic drift, of the malignant clone. Still, the reproducibility of variability in this specific genomic region in MDS and AML patients leads to the assumption that the quantification of this variability has the potential to be used as a diagnostic marker to detect clonal hematopoiesis. However, further studies are needed to enhance our understanding of the dynamics of DNAm changes during malignant transformation. Teschendorff et al. studied genome-wide inter-CpG correlations in DNAm data of cervical carcinoma samples (Teschendorff et al. 2014). They identified CpG sites that indicate the risk of neoplastic transformation in stages prior to neoplasia by an increased covariation of DNAm changes. Consistent with a phase transition model of carcinogenesis, their data indicate that epigenetic diversity is maximal prior to the onset of cancer.
However, our findings suggest that epigenetic variability of successive CpG sites in defined loci increases with the expansion of the malignant clones. This disparity can be explained by the different study designs. Teschendorff et al. performed a genome-wide study based on DNAm levels determined by Illumina 27k BeadChip arrays, which do not cover directly neighboring CpG sites as analyzed by our d-score and n-score. On the other hand, the findings for the PDE4C region must not be generalized. Further studies on the loss of covariation of successive CpG sites, measured by our n-score in larger cohorts and on a genome-wide scale, may shed light on the dynamics of DNAm variability during malignant transformation.
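
The notion of covariation between directly neighboring CpG sites can be illustrated with a minimal Python sketch. It is not the d-score or n-score used in this thesis; it merely computes the Pearson correlation of DNAm levels for each pair of successive CpG sites across a set of samples. The function names and the toy DNAm values are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / sqrt(var_x * var_y)


def neighbor_covariation(samples):
    """samples: list of per-sample DNAm levels, one value per successive CpG.

    Returns the correlation across samples for each pair of adjacent CpGs.
    """
    n_cpgs = len(samples[0])
    columns = [[sample[i] for sample in samples] for i in range(n_cpgs)]
    return [pearson(columns[i], columns[i + 1]) for i in range(n_cpgs - 1)]


# Hypothetical DNAm levels (fraction methylated) at three successive CpGs
toy_samples = [
    [0.10, 0.20, 0.90],
    [0.20, 0.30, 0.70],
    [0.30, 0.40, 0.80],
    [0.40, 0.50, 0.60],
]
print(neighbor_covariation(toy_samples))
```

In this toy data set the first two CpG sites change together across samples, whereas the third does not follow its neighbor. A loss of covariation of this kind is what a neighbor-based score would be expected to pick up.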

In summary, this is the first report of easy and cost-effective scoring systems to identify the prevalence of a malignant clone in hematological malignancies without the need for bioinformatic analyses. To what extent these scores can be used to quantify malignant clones requires further studies. Nonetheless, the patient specificity of DNAm changes led us to the hypothesis that single-strand resolution analysis of DNAm patterns by next generation sequencing can be used to quantify clonal development in AML patients.
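
To make the idea of single-strand resolution more concrete, the following Python sketch tabulates DNAm patterns of individual reads and summarizes their diversity. It assumes that each read has already been reduced to a string with one character per CpG site ('1' methylated, '0' unmethylated). The function names, the toy reads and the use of Shannon entropy as a summary are illustrative assumptions and not part of the analysis pipeline of this thesis.

```python
from collections import Counter
from math import log2


def pattern_frequencies(reads):
    """reads: iterable of strings, one per sequenced DNA strand, with
    '1' for a methylated and '0' for an unmethylated CpG site."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


def shannon_entropy(frequencies):
    """Shannon entropy (in bits) of a pattern frequency distribution."""
    return sum(-p * log2(p) for p in frequencies.values() if p > 0)


# Hypothetical reads covering four successive CpG sites
homogeneous = ["0000"] * 8
mixed = ["0000"] * 4 + ["1010"] * 2 + ["0111"] * 2

for name, reads in (("homogeneous", homogeneous), ("mixed", mixed)):
    freqs = pattern_frequencies(reads)
    print(name, freqs, shannon_entropy(freqs))
```

The frequency table keeps the information that is lost when only mean DNAm levels per CpG site are reported. The entropy is one possible way to express how evenly the reads are spread across different patterns.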

4.3 Classification of DNA Methylation Patterns of Individual DNA Strands Provides a New Perspective to Track Clonal Hematopoiesis

In line with our previous pyrosequencing based findings, increased DNAm variability was detected via BBA-Seq in AML patients. In addition, patient specificity of mean DNAm patterns of samples at first diagnosis and with MRD was confirmed. This leads to the conclusion that the epigenetic makeup of the blood cell population is shaped by the composition of blood cell clones. Thus, hematological malignancies capture the arbitrary pattern of the prevailing malignant clone. In contrast to age-associated DNAm changes in the age-associated region of PDE4C, which appeared at highly reproducible CpG sites with a highly reproducible frequency, changes of DNAm in AML samples seemed to occur in a random, stochastic manner, leading to an increased variability of DNAm patterns (Figure 3.17). It is well known that patients with AML exhibit profound and heterogeneous disruption of their DNAm landscapes (Figueroa et al. 2010b) and that increased DNAm heterogeneity is associated with inferior clinical outcome in chronic lymphocytic leukemia (Landau et al. 2014). We reasoned that epigenetic heterogeneity reflects the clonal composition of blood cell populations and thus investigated how DNAm patterns at individual DNA strands can be used to quantify clonal development. Hierarchical clustering and PCA demonstrated that healthy and AML samples could be grouped via the composition of their DNAm patterns. As a next step, we aimed at classifying individual DNAm patterns. However, classifying single DNAm patterns by SVM algorithms requires large patient cohorts and needs to be performed with caution with regard to the design of the machine learning training set. Also, the trade-off between sensitivity and specificity needs to be chosen with special care for the machine learning algorithm. So far, our classifier achieved a high sensitivity for malignant patterns and low false positive events.

Consequently, the specificity of classifying malignant patterns needs to be improved before this method is applied for prognostic or predictive purposes, although this may reduce the number of AML-derived patterns that are identified. Importantly, we reason that specificity can also be increased by training the SVM on individual patient samples. This reduces the variability of DNAm patterns for the training of the machine learning algorithm and improves the sensitivity of classifications for MRD+ samples. Furthermore, the integration of additional selected loci with increased DNAm variability can potentially increase the sensitivity and render the method more robust against outliers and technical variations.

One major aim of this study is to provide a perspective to improve MRD diagnostics. The state of the art technique for MRD diagnostics is the quantitative PCR of genomic mutations, overexpressed genes or chimeric fusion genes (Selim and Moore 2018). PCR based analyses of fusion genes and NPM1 mutations can collectively monitor only approximately 60% of AML presenting in children and younger adults. In addition, applicability declines with increasing age to only 32% in patients older than 60 years (Grimwade and Freeman 2014). PCR of overexpressed genes, most commonly WT1, shows a wide range of sensitivity across different studies and is restricted to cases which lack other molecular markers (Cilloni et al. 2009; Marani et al. 2013; Selim and Moore 2018). Still, PCR based methods are characterized by their high resolution. For example, quantitative PCR of NPM1 mutations reached a sensitivity of 10^-5 and could be used to reliably predict relapse of AML patients (Ivey et al. 2016). Our proposed epigenetic classifier has the benefit of being applicable to a high number of AML cases independent of genetic mutations; however, further studies are required to assess the sensitivity of this approach.
To this end, the number of AML-classified reads can be correlated with dilution series of AML samples or with blast counts of AML patients.

It will be interesting to investigate whether our SVM classifications can also be used to detect age-related clonal hematopoiesis (ARCH). Jaiswal et al. used NGS to screen for the enrichment of genomic mutations as a marker for clonal hematopoiesis in elderly persons (Jaiswal et al. 2014). They found that 10% of persons older than 70 years carried clonal outgrowth of hematopoietic cells. Based on the proportion of alleles with specific somatic mutations, they estimated that a median of 18% of peripheral-blood leukocytes were derived from an abnormal clone. As clonal hematopoiesis, with or without known driver mutations, is associated with an increased risk of developing hematological cancer, early detection of ARCH may indicate the need for expanded screenings for hematological disorders and therefore improve patient care. Our DNAm based method provides a new, mutation independent perspective for the screening of ARCH. In addition, several studies have demonstrated clonal hematopoiesis in 5% - 36% of aplastic anemia patients (Lane et al. 2013; Kulasekararaj et al. 2014; Yoshizato et al. 2015). Further studies will elucidate to what extent our DNAm based technique can be used to detect clonal hematopoiesis in patients with AA or other telomeropathies. In conclusion, we demonstrate that DNAm patterns at individual DNA strands can be classified as healthy or AML derived via machine learning. The quantification of classified patterns provides a new perspective to track clonal hematopoiesis and to identify MRD.
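
The trade-off between sensitivity and specificity discussed in this section, and the quantification of AML-classified reads, can be sketched in a few lines of Python. The sketch assumes that a classifier has already assigned a decision value to every read, with higher values being more AML-like. The function names, decision values, labels and thresholds are hypothetical and do not reproduce the SVM used in this thesis.

```python
def confusion_rates(scores, labels, threshold):
    """scores: classifier decision values per read (higher = more AML-like).
    labels: True for reads known to be AML derived, False for healthy reads.
    Assumes that both classes are present.
    Returns (sensitivity, specificity) at the given threshold."""
    tp = fp = tn = fn = 0
    for score, is_aml in zip(scores, labels):
        called_aml = score >= threshold
        if is_aml and called_aml:
            tp += 1
        elif is_aml:
            fn += 1
        elif called_aml:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def aml_read_fraction(scores, threshold):
    """Fraction of reads in a sample that are classified as AML derived."""
    return sum(score >= threshold for score in scores) / len(scores)


# Hypothetical decision values: five healthy reads followed by five AML reads
scores = [-1.2, -0.8, -0.3, 0.1, 0.4, 0.2, 0.6, 0.9, 1.5, -0.1]
labels = [False] * 5 + [True] * 5

for threshold in (-0.5, 0.0, 0.5):
    sensitivity, specificity = confusion_rates(scores, labels, threshold)
    print(threshold, sensitivity, specificity)

print(aml_read_fraction(scores, 0.0))
```

Raising the threshold removes false positive calls at the price of missing AML-derived reads. For a dilution series, the fraction returned by `aml_read_fraction` could be compared with the known proportion of AML material in each dilution step.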

4.4 Lack of PRDM8 Expression Leads to Impaired Neural Differentiation and is Associated with Differential Promoter Methylation

In 2016, Weidner et al. demonstrated hypermethylation of the PRDM8 promoter in DKC, AA and Down syndrome patients along with a reduced expression of transcript 2 of PRDM8, indicating an epigenetically controlled gene expression (Weidner et al. 2016). In continuation of this work, we validated hypermethylation of PRDM8 in an independent set of 18 DKC and 41 AA patients. Future studies will investigate to what extent this hypermethylation can be used as a biomarker to improve DKC and AA diagnostics. To date, about 40% of DKC cases show no mutational background that could be used to identify the disease, and AA diagnosis still relies on the exclusion of mimicking disorders (Fernandez Garcia and Teruya-Feldstein 2014; Stanley et al. 2017), underlining the need for specific biomarkers to reliably identify telomeropathies. However, it has been shown that PRDM8 is also hypermethylated in a variety of tumors (Weidner et al. 2016), which limits its specificity. Nevertheless, we reasoned that the histone methyltransferase PRDM8 might play an important role in the establishment of the clinical phenotype of DKC. Therefore, genetically engineered PRDM8+/- and PRDM8-/- iPSCs were generated. Subcloning of the PRDM8+/- clone confirmed heterozygosity and monoclonal origin of the cell preparation (Figure 3.20C). Immunofluorescence staining of pluripotency markers and embryoid body assays demonstrated pluripotency and three lineage potential of all generated clones (Figure 3.21, Figure 3.22). Thus, the generated cell lines provide a useful tool for future studies on the function of PRDM8. However, because appropriate antibodies were lacking, western blot analyses to confirm the knockouts and the overexpression at the protein level still need to be performed. For validation of the differentiation experiments and DNAm data it is advisable to generate PRDM8-/- clones from additional donors and to repeat the experiments with more replicates.

Gene expression and morphological analyses of PRDM8 depleted clones after spontaneous and directed differentiation revealed impaired neural differentiation potential, even for the heterozygous knockout. This is the first report of impaired early neural differentiation upon PRDM8 depletion in a human in vitro model. Our findings are in line with studies performed in mice. It has been demonstrated that PRDM8 is expressed in human (www.proteinatlas.org) and murine (Uhlen et al. 2010; Inoue et al. 2015) brain tissues. In mice, Prdm8 is described as an important key regulator of neural development. Ross et al. demonstrated that Prdm8 functions as a repressor of Cadherin-11 to ensure proper neural circuit formation (Ross et al. 2012). In a further study, Inoue et al. revealed that knockdown of Prdm8 results in a premature change from multipolar to bipolar morphology during murine neocortical development (Inoue et al. 2015). We provide evidence that the findings of the mouse studies may also apply to the human organism. To assess the influence of PRDM8 expression on the epigenetic age, we performed age predictions via the Horvath predictor (Horvath 2013) on genome-wide methylation data and observed no differences in the predicted epigenetic age; in fact, all samples were estimated close to zero years. This might be due to the fact that the cells were harvested after only ten days of differentiation. It will be interesting to study the aging phenotype of the generated clones during long term culture after neuronal maturation.
Such experiments should include epigenetic age predictions and telomere length measurements and are required to improve our understanding of the influence of PRDM8 on the aging phenotype. No significant methylation changes could be detected in the PRDM8 overexpressing clones. Since the overexpression was driven by an EF1a promoter, which was previously described to be active during neural differentiation of human ES cells (Norrman et al. 2010), we conclude that either PRDM8 overexpression led to cell death of the differentiating cells, so that the culture consisted only of untransduced cells that had escaped cell sorting, or PRDM8 expression was epigenetically silenced during the culture. Further experiments need to track PRDM8 and GFP expression levels during the differentiation to exclude gene silencing and to confirm functional PRDM8 overexpression.

Although PRDM8 overexpression experiments revealed no significant differences in DNAm, specific PRDM8-dependent changes in DNAm levels could be observed for the homozygous and heterozygous knockout clones. Promoter regions and CpG islands tended to be hypomethylated more often than hypermethylated, indicating that PRDM8 might indirectly function as a repressor of gene expression via its H3K9 histone methylation activity and the subsequent promotion of methylation in promoter regions. This hypothesis is supported by several studies which report a crosstalk between epigenetic modifications such as H3K9 methylation and DNA methylation, which is generally regarded as a repressive epigenetic mark (Zhao et al. 2016; Huang et al. 2018; Symmank et al. 2018). Prdm8 was reported to methylate H3K9 in mice (Eom et al. 2009). Hence, it is of special significance that Meissner et al. described the presence of H3K4 methylation and the absence of H3K9 methylation as a powerful predictor of unmethylated CpG sites (Meissner et al. 2008) and that Esteve et al. reported the co-localization of DNMT1 and different histone methyltransferases with dimethylated H3K9 (Esteve et al. 2006). This crosstalk might provide a link between lack of PRDM8 expression and DNA hypomethylation in promoter regions. Gene ontology analyses revealed promoter hypermethylation of gene categories involved in intracellular signaling, neuronal signaling and sensory perception. Promoter hypomethylation was associated with categories involved in cell-cell adhesion and synapse formation (Figure 3.29), suggesting that PRDM8 promotes DNA methylation changes in promoters of genes directly associated with neural development and function, thus providing a mechanism via which PRDM8 positively regulates neurogenesis.
Several studies linked neuronal dysfunction to age-associated diseases such as Down syndrome, which reveals aging-related features that particularly affect the central nervous system (Ciccarone et al. 2018). In addition, a severe form of DKC is associated with cerebellar hypoplasia (Townsley et al. 2014). This leads to the assumption that impaired neural differentiation due to lack of PRDM8 expression might be indicative of the clinical premature aging phenotype. Amongst the hypomethylated genes, several members of the protocadherin superfamily were identified. As reviewed by Peek et al., protocadherins are involved in a wide range of neural functions and are associated with neuronal disorders like Down syndrome and schizophrenia (Peek et al. 2017). In addition, several members of the protocadherin superfamily were described to be differentially methylated during aging (Salpea et al. 2012; McClay et al. 2014), suggesting that knockout of PRDM8 contributes to the premature aging phenotype by the regulation of protocadherin methylation.

The differentiation of patient-derived iPSCs is a powerful tool for investigating key parameters of human development and disease. Batista et al. demonstrated that even in the undifferentiated state, iPSCs from DKC patients harbor the same premature aging characteristics as primary cells and that the magnitude of the telomere maintenance defect in iPSCs correlated with clinical severity (Batista et al. 2011). Furthermore, Liu et al. reported that iPSCs derived from Hutchinson-Gilford progeria syndrome patients revealed a normal phenotype. Upon differentiation, however, progerin expression and the aging-associated phenotype are restored (Liu et al. 2011a). To what extent the generated PRDM8+/- and PRDM8-/- clones can be used to investigate characteristics of DKC patients, and whether PRDM8 or its downstream targets are suitable as targets for drug treatment or cell therapy to ameliorate clinical features of premature aging, needs to be a main focus of further studies. To this end, the DNAm data of PRDM8-/- clones should be compared to DNAm data of DKC patients. In addition, the substitution of PRDM8 in DKC-derived iPSCs will be an important experiment for further studies. Since PRDM8 was originally identified to be hypermethylated in peripheral blood samples of DKC patients, it will be interesting to perform hematopoietic differentiation of PRDM8-/- iPSCs. DNAm data of neural and hematopoietic differentiated cells can be compared to reveal to what extent the effects of the PRDM8 knockout are tissue specific. In addition, RNASeq data of the differentiated cells will reveal which of the genes with differentially methylated promoter regions show differential gene expression.
This analysis will elucidate how the impaired neural differentiation is promoted by DNAm-dependent gene expression changes. Chromatin immunoprecipitation sequencing experiments can be performed to study changes of the histone code in PRDM8+/- and PRDM8-/- cells. Eom et al. described histone methyltransferase activity for PRDM8 in mice (Eom et al. 2009), and it will be of special interest to investigate methyltransferase activity in human cells. Analyses of differentially methylated histones may give additional insight into PRDM8 targets and into the downstream signaling pathways involved.

Taken together, evidence is provided that PRDM8 positively regulates neurogenesis by promoting DNAm changes in promoter regions of genes associated with neural functions and organization. Via this mechanism, PRDM8 might also contribute to the clinical phenotype of DKC patients. Further studies need to elucidate whether the upregulation of PRDM8 expression can ameliorate the molecular and cellular symptoms of premature aging.

5 Conclusion and Future Perspective

This thesis highlights the importance of cell type adjustments when performing epigenetic age predictions. Our Buccal-Cell-Signature improves age predictions of buccal swab samples and can be integrated not only into new age prediction models, but also into epigenetic studies of buccal swab samples in general to adjust for the cellular composition (Marinova et al. 2017). We demonstrate that variability in DNAm patterns of specific successive CpG sites can be used as a potent marker for malignant clonal development. Further studies will improve the quantification of AML-associated patterns via patient specific machine learning to ultimately provide a new sensitive tool for the detection of clonal hematopoiesis and minimal residual disease. It will be interesting to study to what extent this method is also applicable to other hematological malignancies and age-associated diseases such as dyskeratosis congenita. In addition, this is the first report of impaired neural differentiation of human PRDM8-/- stem cells. We demonstrate that PRDM8 knockout leads to differential methylation of genes associated with neural development. It will be a main focus of further studies to investigate to what extent PRDM8 is involved in the establishment of clinical features of DKC patients and whether PRDM8 or downstream signaling components are suitable targets for drug treatment or cell therapy. Therefore, comparative analyses of our DNAm data of neural precursors with DNAm data of DKC patients should be performed. RNASeq analyses of the differentiated cells will reveal which genes with differentially methylated promoter regions show altered gene expression and might therefore substantially promote the aging phenotype. Finally, PRDM8 expression should be modulated in DKC-derived iPSCs (Batista et al. 2011) to investigate whether the substitution of PRDM8 can ameliorate the clinical phenotype of DKC patients.

6 Bibliography

Accomando WP, Wiencke JK, Houseman EA, Nelson HH, Kelsey KT. 2014. Quantitative reconstruction of leukocyte subsets using DNA methylation. Genome Biol 15: R50.

Alisch RS, Barwick BG, Chopra P, Myrick LK, Satten GA, Conneely KN, Warren ST. 2012. Age-associated DNA methylation in pediatric populations. Genome Res 22: 623-632.

Alison MR, Islam S. 2009. Attributes of adult stem cells. J Pathol 217: 144-160.

Alizadeh AA, Aranda V, Bardelli A, Blanpain C, Bock C, Borowski C, Caldas C, Califano A, Doherty M, Elsner M et al. 2015. Toward understanding and exploiting tumor heterogeneity. Nat Med 21: 846-853.

Alter BP, Giri N, Savage SA, Peters JA, Loud JT, Leathwood L, Carr AG, Greene MH, Rosenberg PS. 2010. Malignancies and survival patterns in the national cancer institute inherited bone marrow failure syndromes cohort study. Br J Haematol 150: 179-188.

Antony JS, Latifi N, Haque A, Lamsfus-Calle A, Daniel-Moreno A, Graeter S, Baskaran P, Weinmann P, Mezger M, Handgretinger R et al. 2018. Gene correction of hbb mutations in cd34(+) hematopoietic stem cells using cas9 mrna and ssodn donors. Mol Cell Pediatr 5: 9.

Artero Castro A, Long K, Bassett A, Machuca C, Leon M, Avila-Fernandez A, Corton M, Vidal-Puig T, Ayuso C, Lukovic D et al. 2018. Generation of gene-corrected human induced pluripotent stem cell lines derived from retinitis pigmentosa patient with ser331cysfs*5 mutation in mertk. Stem Cell Res 34: 101341.

Ashapkin VV, Kutueva LI, Vanyushin BF. 2017. Aging as an epigenetic phenomenon. Curr Genomics 18: 385-407.

Aubert G, Hills M, Lansdorp PM. 2012. Telomere length measurement-caveats and a critical assessment of the available technologies and tools. Mutat Res 730: 59-67.

Ball SE, Gibson FM, Rizzo S, Tooze JA, Marsh JC, Gordon-Smith EC. 1998. Progressive telomere shortening in aplastic anemia. Blood 91: 3582-3592.

Bar-Nur O, Russ HA, Efrat S, Benvenisty N. 2011. Epigenetic memory and preferential lineage-specific differentiation in induced pluripotent stem cells derived from human pancreatic islet beta cells. Cell Stem Cell 9: 17-23.

Barata LP, Starks H, Kelley M, Kuszler P, Burke W. 2015. What DNA can and cannot say: Perspectives of immigrant families about the use of genetic testing in immigration. Stanford Law Pol Rev 26: 597-638.

Batista LF, Pech MF, Zhong FL, Nguyen HN, Xie KT, Zaug AJ, Crary SM, Choi J, Sebastiano V, Cherry A et al. 2011. Telomere shortening and loss of self-renewal in dyskeratosis congenita induced pluripotent stem cells. Nature 474: 399-402.

Baylin SB, Jones PA. 2016. Epigenetic determinants of cancer. Cold Spring Harb Perspect Biol 8.

Bellin M, Marchetto MC, Gage FH, Mummery CL. 2012. Induced pluripotent stem cells: The new patient? Nat Rev Mol Cell Biol 13: 713-726.

Berdyshev GD, Korotaev GK, Boiarskikh GV, Vaniushin BF. 1967. [nucleotide composition of DNA and rna from somatic tissues of humpback and its changes during spawning]. Biokhimiia 32: 988-993.

Biasco L, Pellin D, Scala S, Dionisio F, Basso-Ricci L, Leonardelli L, Scaramuzza S, Baricordi C, Ferrua F, Cicalese MP et al. 2016. In vivo tracking of human hematopoiesis reveals patterns of clonal dynamics during early and steady-state reconstitution phases. Cell Stem Cell 19: 107-119.

Bird A. 2002. DNA methylation patterns and epigenetic memory. Genes Dev 16: 6-21.

Bjornsson HT, Fallin MD, Feinberg AP. 2004. An integrated epigenetic and genetic approach to common human disease. Trends Genet 20: 350-358.

Blasco MA. 2007. Telomere length, stem cells and aging. Nat Chem Biol 3: 640-649.

Bock C, Kiskinis E, Verstappen G, Gu H, Boulting G, Smith ZD, Ziller M, Croft GF, Amoroso MW, Oakley DH et al. 2011. Reference maps of human es and ips cell variation enable high-throughput characterization of pluripotent cell lines. Cell 144: 439-452.

Bocklandt S, Lin W, Sehl ME, Sanchez FJ, Sinsheimer JS, Horvath S, Vilain E. 2011. Epigenetic predictor of age. PLoS One 6: e14821.

Bollati V, Schwartz J, Wright R, Litonjua A, Tarantini L, Suh H, Sparrow D, Vokonas P, Baccarelli A. 2009. Decline in genomic DNA methylation through aging in a cohort of elderly subjects. Mech Ageing Dev 130: 234-239.

Brown MH. 2016. Cd6 as a cell surface receptor and as a target for regulating immune responses. Curr Drug Targets 17: 619-629.

Busque L, Patel JP, Figueroa ME, Vasanthakumar A, Provost S, Hamilou Z, Mollica L, Li J, Viale A, Heguy A et al. 2012. Recurrent somatic tet2 mutations in normal elderly individuals with clonal hematopoiesis. Nat Genet 44: 1179-1181.

Buyse IM, Shao G, Huang S. 1995. The retinoblastoma protein binds to riz, a zinc-finger protein that shares an epitope with the adenovirus e1a protein. Proc Natl Acad Sci U S A 92: 4467-4471.

Calado RT, Cooper JN, Padilla-Nash HM, Sloand EM, Wu CO, Scheinberg P, Ried T, Young NS. 2012. Short telomeres result in chromosomal instability in hematopoietic cells and precede malignant evolution in human aplastic anemia. Leukemia 26: 700-707.

Chagastelles PC, Nardi NB. 2011. Biology of stem cells: An overview. Kidney Int Suppl (2011) 1: 63-67.

Chen G, Gulbranson DR, Hou Z, Bolin JM, Ruotti V, Probasco MD, Smuga-Otto K, Howden SE, Diol NR, Propson NE et al. 2011. Chemically defined conditions for human ipsc derivation and culture. Nat Methods 8: 424-429.

Christensen BC, Houseman EA, Marsit CJ, Zheng S, Wrensch MR, Wiemels JL, Nelson HH, Karagas MR, Padbury JF, Bueno R et al. 2009. Aging and environmental exposures alter tissue-specific DNA methylation dependent upon cpg island context. PLoS Genet 5: e1000602.

Chung SS, Park CY. 2017. Aging, hematopoiesis, and the myelodysplastic syndromes. Blood Adv 1: 2572-2578.

Ciccarone F, Tagliatesta S, Caiafa P, Zampieri M. 2018. DNA methylation dynamics in aging: How far are we from understanding the mechanisms? Mech Ageing Dev 174: 3-17.

Cilloni D, Renneville A, Hermitte F, Hills RK, Daly S, Jovanovic JV, Gottardi E, Fava M, Schnittger S, Weiss T et al. 2009. Real-time quantitative polymerase chain reaction detection of minimal residual disease by standardized wt1 assay to enhance risk stratification in acute myeloid leukemia: A european leukemianet study. J Clin Oncol 27: 5195-5201.

Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, Hsu PD, Wu X, Jiang W, Marraffini LA et al. 2013. Multiplex genome engineering using crispr/cas systems. Science 339: 819-823.

Coppede F. 2013. The epidemiology of premature aging and associated comorbidities. Clin Interv Aging 8: 1023-1032.

Davies MN, Volta M, Pidsley R, Lunnon K, Dixit A, Lovestone S, Coarfa C, Harris RA, Milosavljevic A, Troakes C et al. 2012. Functional annotation of the human brain methylome identifies tissue-specific epigenetic variation across brain and blood. Genome Biol 13: R43.

Deaton AM, Bird A. 2011. Cpg islands and the regulation of transcription. Genes Dev 25: 1010-1022.

Deng J, Shoemaker R, Xie B, Gore A, LeProust EM, Antosiewicz-Bourget J, Egli D, Maherali N, Park IH, Yu J et al. 2009. Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming. Nat Biotechnol 27: 353-360.

Devine MJ, Ryten M, Vodicka P, Thomson AJ, Burdon T, Houlden H, Cavaleri F, Nagano M, Drummond NJ, Taanman JW et al. 2011. Parkinson's disease induced pluripotent stem cells with triplication of the alpha-synuclein locus. Nat Commun 2: 440.

DiCarlo JE, Mahajan VB, Tsang SH. 2018. Gene therapy and genome surgery in the retina. J Clin Invest 128: 2177-2188.

Diez Roux AV, Ranjit N, Jenny NS, Shea S, Cushman M, Fitzpatrick A, Seeman T. 2009. Race/ethnicity and telomere length in the multi-ethnic study of atherosclerosis. Aging Cell 8: 251-257.

Dokal I. 2011. Dyskeratosis congenita. Hematology Am Soc Hematol Educ Program 2011: 480-486.

Donnelly ML, Luke G, Mehrotra A, Li X, Hughes LE, Gani D, Ryan MD. 2001. Analysis of the aphthovirus 2a/2b polyprotein 'cleavage' mechanism indicates not a proteolytic reaction, but a novel translational effect: A putative ribosomal 'skip'. J Gen Virol 82: 1013-1025.

Edwards RG. 2001. Ivf and the history of stem cells. Nature 413: 349-351.

Eipel M, Mayer F, Arent T, Ferreira MR, Birkhofer C, Gerstenmaier U, Costa IG, Ritz-Timme S, Wagner W. 2016. Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures. Aging (Albany NY) 8: 1034-1048.

Elliott HR, Tillin T, McArdle WL, Ho K, Duggirala A, Frayling TM, Davey Smith G, Hughes AD, Chaturvedi N, Relton CL. 2014. Differences in smoking associated DNA methylation patterns in south asians and europeans. Clin Epigenetics 6: 4.

Eom GH, Kim K, Kim SM, Kee HJ, Kim JY, Jin HM, Kim JR, Kim JH, Choe N, Kim KB et al. 2009. Histone methyltransferase prdm8 regulates mouse testis steroidogenesis. Biochem Biophys Res Commun 388: 131-136.

Esteve PO, Chin HG, Smallwood A, Feehery GR, Gangisetty O, Karpf AR, Carey MF, Pradhan S. 2006. Direct interaction between dnmt1 and g9a coordinates DNA and histone methylation during replication. Genes Dev 20: 3089-3103.

Feil R, Fraga MF. 2012. Epigenetics and the environment: Emerging patterns and implications. Nat Rev Genet 13: 97-109.

Fernandez Garcia MS, Teruya-Feldstein J. 2014. The diagnosis and treatment of dyskeratosis congenita: A review. J Blood Med 5: 157-167.

Fialkow PJ, Gartler SM, Yoshida A. 1967. Clonal origin of chronic myelocytic leukemia in man. Proc Natl Acad Sci U S A 58: 1468-1471.

Figueroa ME, Abdel-Wahab O, Lu C, Ward PS, Patel J, Shih A, Li Y, Bhagwat N, Vasanthakumar A, Fernandez HF et al. 2010a. Leukemic idh1 and idh2 mutations result in a hypermethylation phenotype, disrupt tet2 function, and impair hematopoietic differentiation. Cancer Cell 18: 553-567.

Figueroa ME, Lugthart S, Li Y, Erpelinck-Verschueren C, Deng X, Christos PJ, Schifano E, Booth J, van Putten W, Skrabanek L et al. 2010b. DNA methylation signatures identify biologically distinct subtypes in acute myeloid leukemia. Cancer Cell 17: 13-27.

Florath I, Butterbach K, Muller H, Bewerunge-Hudler M, Brenner H. 2014. Cross-sectional and longitudinal changes in DNA methylation with age: An epigenome-wide analysis revealing over 60 novel age-associated cpg sites. Hum Mol Genet 23: 1186-1201.

Fog CK, Galli GG, Lund AH. 2012. Prdm proteins: Important players in differentiation and disease. Bioessays 34: 50-60.

Fraga MF, Esteller M. 2007. Epigenetics and aging: The targets and the marks. Trends Genet 23: 413-418.

Fumasoni I, Meani N, Rambaldi D, Scafetta G, Alcalay M, Ciccarelli FD. 2007. Family expansion and gene rearrangements contributed to the functional specialization of prdm genes in vertebrates. BMC Evol Biol 7: 187.

Fuster JJ, MacLauchlan S, Zuriaga MA, Polackal MN, Ostriker AC, Chakraborty R, Wu CL, Sano S, Muralidharan S, Rius C et al. 2017. Clonal hematopoiesis associated with tet2 deficiency accelerates atherosclerosis development in mice. Science 355: 842-847.

Garagnani P, Bacalini MG, Pirazzini C, Gori D, Giuliani C, Mari D, Di Blasio AM, Gentilini D, Vitale G, Collino S et al. 2012. Methylation of elovl2 gene as a new epigenetic marker of age. Aging Cell 11: 1132-1134.

Genovese G, Kahler AK, Handsaker RE, Lindberg J, Rose SA, Bakhoum SF, Chambert K, Mick E, Neale BM, Fromer M et al. 2014. Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence. N Engl J Med 371: 2477-2487.

Girchenko P, Lahti J, Czamara D, Knight AK, Jones MJ, Suarez A, Hamalainen E, Kajantie E, Laivuori H, Villa PM et al. 2017. Associations between maternal risk factors of adverse pregnancy and birth outcomes and the offspring epigenetic clock of gestational age at birth. Clin Epigenetics 9: 49.

Gorabi AM, Hajighasemi S, Tafti HA, Atashi A, Soleimani M, Aghdami N, Saeid AK, Khori V, Panahi Y, Sahebkar A. 2019. Tbx18 transcription factor overexpression in human-induced pluripotent stem cells increases their differentiation into pacemaker-like cells. J Cell Physiol 234: 1534-1546.

Gore A, Li Z, Fung HL, Young JE, Agarwal S, Antosiewicz-Bourget J, Canto I, Giorgetti A, Israel MA, Kiskinis E et al. 2011. Somatic coding mutations in human induced pluripotent stem cells. Nature 471: 63-67.

Grimwade D, Freeman SD. 2014. Defining minimal residual disease in acute myeloid leukemia: Which platforms are ready for "prime time"? Blood 124: 3345-3355.

Hackett JA, Greider CW. 2002. Balancing instability: Dual roles for telomerase and telomere dysfunction in tumorigenesis. 21: 619-626.

Han Y, Eipel M, Franzen J, Sakk V, Dethmers-Ausema B, Yndriago L, Izeta A, de Haan G, Geiger H, Wagner W. 2018. Epigenetic age-predictor for mice based on three cpg sites. Elife 7.

Hannum G, Guinney J, Zhao L, Zhang L, Hughes G, Sadda S, Klotzle B, Bibikova M, Fan JB, Gao Y et al. 2013. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol Cell 49: 359-367.

Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, Wen B, Wu H, Liu Y, Diep D et al. 2011. Increased methylation variation in epigenetic domains across cancer types. Nat Genet 43: 768-775.

Hansen TV, Simonsen MK, Nielsen FC, Hundrup YA. 2007. Collection of blood, saliva, and buccal cell samples in a pilot study on the danish nurse cohort: Comparison of the response rate and quality of genomic DNA. Cancer Epidemiol Biomarkers Prev 16: 2072-2076.

Heiss NS, Knight SW, Vulliamy TJ, Klauck SM, Wiemann S, Mason PJ, Poustka A, Dokal I. 1998. X-linked dyskeratosis congenita is caused by mutations in a highly conserved gene with putative nucleolar functions. Nat Genet 19: 32-38.

Hennekam RC. 2006. Hutchinson-gilford progeria syndrome: Review of the phenotype. Am J Med Genet A 140: 2603-2624.

Herbstman JB, Wang S, Perera FP, Lederman SA, Vishnevetsky J, Rundle AG, Hoepner LA, Qu L, Tang D. 2013. Predictors and consequences of global DNA methylation in cord blood and at three years. PLoS One 8: e72824.

Hermann A, Goyal R, Jeltsch A. 2004. The dnmt1 DNA-(cytosine-c5)-methyltransferase methylates DNA processively with high preference for hemimethylated target sites. J Biol Chem 279: 48350-48359.

Heyn H, Li N, Ferreira HJ, Moran S, Pisano DG, Gomez A, Diez J, Sanchez-Mut JV, Setien F, Carmona FJ et al. 2012. Distinct DNA methylomes of newborns and centenarians. Proc Natl Acad Sci U S A 109: 10522-10527.

Hitzeroth HW, Bender K. 1981. Age-dependency of somatic selection in south african negro g-6-pd heterozygotes. Hum Genet 58: 338-343.

Hohenauer T, Moore AW. 2012. The prdm family: Expanding roles in stem cells and development. Development 139: 2267-2282.

Hong SR, Jung SE, Lee EH, Shin KJ, Yang WI, Lee HY. 2017. DNA methylation-based age prediction from saliva: High age predictability by combination of 7 cpg markers. Forensic Sci Int Genet 29: 118-125. 103

Bibliography

Horvath S. 2013. DNA methylation age of human tissues and cell types. Genome Biol 14: R115.

Huang S, Shao G, Liu L. 1998. The pr domain of the rb-binding zinc finger protein riz1 is a protein binding interface and is related to the set domain functioning in chromatin- mediated gene expression. J Biol Chem 273: 15933-15939.

Huang X, Yan J, Zhang M, Wang Y, Chen Y, Fu X, Wei R, Zheng XL, Liu Z, Zhang X et al. 2018. Targeting epigenetic crosstalk as a therapeutic strategy for ezh2-aberrant solid tumors. Cell 175: 186-199 e119.

Humpherys D, Eggan K, Akutsu H, Hochedlinger K, Rideout WM, 3rd, Biniszkiewicz D, Yanagimachi R, Jaenisch R. 2001. Epigenetic instability in es cells and cloned mice. Science 293: 95-97.

Hussein SM, Batada NN, Vuoristo S, Ching RW, Autio R, Narva E, Ng S, Sourour M, Hamalainen R, Olsson C et al. 2011. Copy number variation and selection during reprogramming to pluripotency. Nature 471: 58-62.

Inoue M, Iwai R, Yamanishi E, Yamagata K, Komabayashi-Suzuki M, Honda A, Komai T, Miyachi H, Kitano S, Watanabe C et al. 2015. Deletion of prdm8 impairs development of upper-layer neocortical neurons. Genes Cells 20: 758-770.

Israel MA, Yuan SH, Bardy C, Reyna SM, Mu Y, Herrera C, Hefferan MP, Van Gorp S, Nazor KL, Boscolo FS et al. 2012. Probing sporadic and familial alzheimer's disease using induced pluripotent stem cells. Nature 482: 216-220.

Ito S, Shen L, Dai Q, Wu SC, Collins LB, Swenberg JA, He C, Zhang Y. 2011. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333: 1300-1303.

Ivey A, Hills RK, Simpson MA, Jovanovic JV, Gilkes A, Grech A, Patel Y, Bhudia N, Farah H, Mason J et al. 2016. Assessment of minimal residual disease in standard-risk aml. N Engl J Med 374: 422-433.

Iwai R, Tabata H, Inoue M, Nomura KI, Okamoto T, Ichihashi M, Nagata KI, Mizutani KI. 2018. A prdm8 target gene ebf3 regulates multipolar-to-bipolar transition in migrating neocortical cells. Biochem Biophys Res Commun 495: 388-394.

Jacobs KB Yeager M Zhou W Wacholder S Wang Z Rodriguez-Santiago B Hutchinson A Deng X Liu C Horner MJ et al. 2012. Detectable clonal mosaicism and its relationship to aging and cancer. Nat Genet 44: 651-658.

Jaiswal S, Fontanillas P, Flannick J, Manning A, Grauman PV, Mar BG, Lindsley RC, Mermel CH, Burtt N, Chavez A et al. 2014. Age-related clonal hematopoiesis associated with adverse outcomes. N Engl J Med 371: 2488-2498.

104

Bibliography

Jiang DJ, Xu CL, Tsang SH. 2018. Revolution in gene medicine therapy and genome surgery. Genes (Basel) 9.

Jiang R, Jones MJ, Chen E, Neumann SM, Fraser HB, Miller GE, Kobor MS. 2015. Discordance of DNA methylation variance between two accessible human tissues. Sci Rep 5: 8257.

Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E. 2012. A programmable dual-rna-guided DNA endonuclease in adaptive bacterial immunity. Science 337: 816- 821.

Johansson A, Enroth S, Gyllensten U. 2013. Continuous aging of the human DNA methylome throughout the human lifespan. PLoS One 8: e67378.

Jones MJ, Farre P, McEwen LM, Macisaac JL, Watt K, Neumann SM, Emberly E, Cynader MS, Virji-Babul N, Kobor MS. 2013. Distinct DNA methylation patterns of cognitive impairment and trisomy 21 in down syndrome. BMC Med Genomics 6: 58.

Jones MJ, Goodman SJ, Kobor MS. 2015. DNA methylation and healthy human aging. Aging Cell 14: 924-932.

Jost E, Lin Q, Weidner CI, Wilop S, Hoffmann M, Walenda T, Schemionek M, Herrmann O, Zenke M, Brummendorf TH et al. 2014. Epimutations mimic genomic mutations of dnmt3a in acute myeloid leukemia. Leukemia 28: 1227-1234.

Jung SE, Lim SM, Hong SR, Lee EH, Shin KJ, Lee HY. 2019. DNA methylation of the elovl2, fhl2, klf14, c1orf132/mir29b2c, and trim59 genes for age prediction from blood, saliva, and buccal swab samples. Forensic Sci Int Genet 38: 1-8.

Kim J, Kim K, Kim H, Yoon G, Lee K. 2014. Characterization of age signatures of DNA methylation in normal and cancer tissues from multiple studies. BMC Genomics 15: 997.

Kim M, Costello J. 2017. DNA methylation: An epigenetic mark of cellular memory. Exp Mol Med 49: e322.

Kinameri E, Inoue T, Aruga J, Imayoshi I, Kageyama R, Shimogori T, Moore AW. 2008. Prdm proto-oncogene transcription factor family expression and interaction with the notch- hes pathway in mouse neurogenesis. PLoS One 3: e3859.

Knight SW, Heiss NS, Vulliamy TJ, Aalfs CM, McMahon C, Richmond P, Jones A, Hennekam RC, Poustka A, Mason PJ et al. 1999. Unexplained aplastic anaemia, immunodeficiency, and cerebellar hypoplasia (hoyeraal-hreidarsson syndrome) due to mutations in the dyskeratosis congenita gene, dkc1. Br J Haematol 107: 335-339.

Koch CM, Wagner W. 2011. Epigenetic-aging-signature to determine age in different tissues. Aging (Albany NY) 3: 1018-1027.

105

Bibliography

Kohli RM, Zhang Y. 2013. Tet enzymes, tdg and the dynamics of DNA demethylation. Nature 502: 472-479.

Komai T, Iwanari H, Mochizuki Y, Hamakubo T, Shinkai Y. 2009. Expression of the mouse pr domain protein prdm8 in the developing central nervous system. Gene Expr Patterns 9: 503-514.

Kondo M, Wagers AJ, Manz MG, Prohaska SS, Scherer DC, Beilhack GF, Shizuru JA, Weissman IL. 2003. Biology of hematopoietic stem cells and progenitors: Implications for clinical application. Annu Rev Immunol 21: 759-806.

Kriks S, Shim JW, Piao J, Ganat YM, Wakeman DR, Xie Z, Carrillo-Reid L, Auyeung G, Antonacci C, Buch A et al. 2011. Dopamine neurons derived from human es cells efficiently engraft in animal models of parkinson's disease. Nature 480: 547-551.

Kulasekararaj AG, Jiang J, Smith AE, Mohamedali AM, Mian S, Gandhi S, Gaken J, Czepulkowski B, Marsh JC, Mufti GJ. 2014. Somatic mutations identify a subgroup of aplastic anemia patients who progress to myelodysplastic syndrome. Blood 124: 2698- 2704.

Kunimoto H, Nakajima H. 2017. Epigenetic dysregulation of hematopoietic stem cells and preleukemic state. Int J Hematol 106: 34-44.

Landau DA, Clement K, Ziller MJ, Boyle P, Fan J, Gu H, Stevenson K, Sougnez C, Wang L, Li S et al. 2014. Locally disordered methylation forms the basis of intratumor methylome variation in chronic lymphocytic leukemia. Cancer Cell 26: 813-825.

Lane AA, Odejide O, Kopp N, Kim S, Yoda A, Erlich R, Wagle N, Abel GA, Rodig SJ, Antin JH et al. 2013. Low frequency clonal mutations recoverable by deep sequencing in patients with aplastic anemia. Leukemia 27: 968-971.

Laurent LC, Ulitsky I, Slavin I, Tran H, Schork A, Morey R, Lynch C, Harness JV, Lee S, Barrero MJ et al. 2011. Dynamic changes in the copy number of pluripotency and cell proliferation genes in human escs and ipscs during reprogramming and time in culture. Cell Stem Cell 8: 106-118.

Laurie CC, Laurie CA, Rice K, Doheny KF, Zelnick LR, McHugh CP, Ling H, Hetrick KN, Pugh EW, Amos C et al. 2012. Detectable clonal mosaicism from birth to old age and its relationship to cancer. Nat Genet 44: 642-650.

Lee JY, Kim M, Heo HR, Ha KS, Han ET, Park WS, Yang SR, Hong SH. 2018. Inhibition of microrna-221 and 222 enhances hematopoietic differentiation from human pluripotent stem cells via c-kit upregulation. Mol Cells 41: 971-978.

Ley TJ, Ding L, Walter MJ, McLellan MD, Lamprecht T, Larson DE, Kandoth C, Payton JE, Baty J, Welch J et al. 2010. Dnmt3a mutations in acute myeloid leukemia. N Engl J Med 363: 2424-2433.

106

Bibliography

Li S, Garrett-Bakelman FE, Chung SS, Sanders MA, Hricik T, Rapaport F, Patel J, Dillon R, Vijay P, Brown AL et al. 2016a. Distinct evolution and dynamics of epigenetic and genetic heterogeneity in acute myeloid leukemia. Nat Med 22: 792-799.

Li S, Mason CE, Melnick A. 2016b. Genetic and epigenetic heterogeneity in acute myeloid leukemia. Curr Opin Genet Dev 36: 100-106.

Li Y, Tsai YT, Hsu CW, Erol D, Yang J, Wu WH, Davis RJ, Egli D, Tsang SH. 2012. Long-term safety and efficacy of human-induced pluripotent stem cell (ips) grafts in a preclinical model of retinitis pigmentosa. Mol Med 18: 1312-1319.

Lin Q, Wagner W. 2015. Epigenetic aging signatures are coherently modified in cancer. PLoS Genet 11: e1005334.

Lin Q, Weidner CI, Costa IG, Marioni RE, Ferreira MR, Deary IJ, Wagner W. 2016. DNA methylation levels at individual age-associated cpg sites can be indicative for life expectancy. Aging (Albany NY) 8: 394-401.

Lister R, Mukamel EA, Nery JR, Urich M, Puddifoot CA, Johnson ND, Lucero J, Huang Y, Dwork AJ, Schultz MD et al. 2013. Global epigenomic reconfiguration during mammalian brain development. Science 341: 1237905.

Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo QM et al. 2009. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462: 315-322.

Lister R, Pelizzola M, Kida YS, Hawkins RD, Nery JR, Hon G, Antosiewicz-Bourget J, O'Malley R, Castanon R, Klugman S et al. 2011. Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 471: 68-73.

Liu GH, Barkho BZ, Ruiz S, Diep D, Qu J, Yang SL, Panopoulos AD, Suzuki K, Kurian L, Walsh C et al. 2011a. Recapitulation of premature ageing with ipscs from hutchinson-gilford progeria syndrome. Nature 472: 221-225.

Liu GH, Suzuki K, Qu J, Sancho-Martinez I, Yi F, Li M, Kumar S, Nivet E, Kim J, Soligalla RD et al. 2011b. Targeted gene correction of laminopathy-associated lmna mutations in patient-specific ipscs. Cell Stem Cell 8: 688-694.

Ma S, Viola R, Sui L, Cherubini V, Barbetti F, Egli D. 2018. Beta cell replacement after gene editing of a neonatal diabetes-causing mutation at the insulin locus. Stem Cell Reports 11: 1407-1415.

Maciejewski JP, Risitano A, Sloand EM, Nunez O, Young NS. 2002. Distinct clinical outcomes for cytogenetic abnormalities evolving from aplastic anemia. Blood 99: 3129-3135.

Madrigano J, Baccarelli A, Mittleman MA, Sparrow D, Vokonas PS, Tarantini L, Schwartz J. 2012. Aging and epigenetics: Longitudinal changes in gene-specific DNA methylation. Epigenetics 7: 63-70. 107

Bibliography

Marani C, Clavio M, Grasso R, Colombo N, Guolo F, Kunkl A, Ballerini F, Giannoni L, Ghiggi C, Fugazza G et al. 2013. Integrating post induction wt1 quantification and flow- cytometry results improves minimal residual disease stratification in acute myeloid leukemia. Leuk Res 37: 1606-1611.

Marcucci G, Metzeler KH, Schwind S, Becker H, Maharry K, Mrozek K, Radmacher MD, Kohlschmidt J, Nicolet D, Whitman SP et al. 2012. Age-related prognostic impact of different types of dnmt3a mutations in adults with primary cytogenetically normal acute myeloid leukemia. J Clin Oncol 30: 742-750.

Marinova Z, Maercker A, Kuffer A, Robinson MD, Wojdacz TK, Walitza S, Grunblatt E, Burri A. 2017. DNA methylation profiles of elderly individuals subjected to indentured childhood labor and trauma. BMC Med Genet 18: 21.

Martino DJ, Tulic MK, Gordon L, Hodder M, Richman TR, Metcalfe J, Prescott SL, Saffery R. 2011. Evidence for age-related and individual-specific changes in DNA methylation profile of mononuclear cells during early immune development in humans. Epigenetics 6: 1085-1094.

Matsa E, Rajamohan D, Dick E, Young L, Mellor I, Staniforth A, Denning C. 2011. Drug evaluation in cardiomyocytes derived from human induced pluripotent stem cells carrying a long qt syndrome type 2 mutation. Eur Heart J 32: 952-962.

Mayshar Y, Ben-David U, Lavon N, Biancotti JC, Yakir B, Clark AT, Plath K, Lowry WE, Benvenisty N. 2010. Identification and classification of chromosomal aberrations in human induced pluripotent stem cells. Cell Stem Cell 7: 521-531.

McClay JL, Aberg KA, Clark SL, Nerella S, Kumar G, Xie LY, Hudson AD, Harada A, Hultman CM, Magnusson PK et al. 2014. A methylome-wide study of aging using massively parallel sequencing of the methyl-cpg-enriched genomic fraction from blood in over 700 subjects. Hum Mol Genet 23: 1175-1185.

Meissner A, Mikkelsen TS, Gu H, Wernig M, Hanna J, Sivachenko A, Zhang X, Bernstein BE, Nusbaum C, Jaffe DB et al. 2008. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454: 766-770.

Mitchell JR, Wood E, Collins K. 1999. A telomerase component is defective in the human disease dyskeratosis congenita. Nature 402: 551-555.

Mitne-Neto M, Machado-Costa M, Marchetto MC, Bengtson MH, Joazeiro CA, Tsuda H, Bellen HJ, Silva HC, Oliveira AS, Lazar M et al. 2011. Downregulation of vapb expression in motor neurons derived from induced pluripotent stem cells of als8 patients. Hum Mol Genet 20: 3642-3652.

Montane E, Ibanez L, Vidal X, Ballarin E, Puig R, Garcia N, Laporte JR, Catalan Group for Study of A, Aplastic A. 2008. Epidemiology of aplastic anemia: A prospective multicenter study. Haematologica 93: 518-523. 108

Bibliography

Morishita K. 2007. Leukemogenesis of the evi1/mel1 gene family. Int J Hematol 85: 279-286.

Mullard A. 2012. 2011 fda drug approvals. Nat Rev Drug Discov 11: 91-94.

Niccoli T, Partridge L. 2012. Ageing as a risk factor for disease. Curr Biol 22: R741-752.

Norrman K, Fischer Y, Bonnamy B, Wolfhagen Sand F, Ravassard P, Semb H. 2010. Quantitative comparison of constitutive promoters in human es cells. PLoS One 5: e12413.

Ohi Y, Qin H, Hong C, Blouin L, Polo JM, Guo T, Qi Z, Downey SL, Manos PD, Rossi DJ et al. 2011. Incomplete DNA methylation underlies a transcriptional memory of somatic cells in human ips cells. Nat Cell Biol 13: 541-549.

Okano M, Bell DW, Haber DA, Li E. 1999. DNA methyltransferases dnmt3a and dnmt3b are essential for de novo methylation and mammalian development. Cell 99: 247-257.

Pamukcu B, Lip GY, Devitt A, Griffiths H, Shantsila E. 2010. The role of monocytes in atherosclerotic coronary artery disease. Ann Med 42: 394-403.

Park CY. 2017. Hematopoiesis in aging: Current concepts and challenges. Semin Hematol 54: 1-3.

Park IH, Arora N, Huo H, Maherali N, Ahfeldt T, Shimamura A, Lensch MW, Cowan C, Hochedlinger K, Daley GQ. 2008. Disease-specific induced pluripotent stem cells. Cell 134: 877-886.

Peek SL, Mah KM, Weiner JA. 2017. Regulation of neural circuit formation by protocadherins. Cell Mol Life Sci 74: 4133-4157.

Pera MF. 2011. Stem cells: The dark side of induced pluripotency. Nature 471: 46-47.

Peters MJ Joehanes R Pilling LC Schurmann C Conneely KN Powell J Reinmaa E Sutphin GL Zhernakova A Schramm K et al. 2015. The transcriptional landscape of age in human peripheral blood. Nat Commun 6: 8570.

Pollex RL, Hegele RA. 2004. Hutchinson-gilford progeria syndrome. Clin Genet 66: 375-381.

Polo JM, Liu S, Figueroa ME, Kulalert W, Eminli S, Tan KY, Apostolou E, Stadtfeld M, Li Y, Shioda T et al. 2010. Cell type of origin influences the molecular and functional properties of mouse induced pluripotent stem cells. Nat Biotechnol 28: 848-855.

Portela A, Esteller M. 2010. Epigenetic modifications and human disease. Nat Biotechnol 28: 1057-1068.

Poulsen P, Esteller M, Vaag A, Fraga MF. 2007. The epigenetic basis of twin discordance in age-related diseases. Pediatr Res 61: 38R-42R. 109

Bibliography

Quach A, Levine ME, Tanaka T, Lu AT, Chen BH, Ferrucci L, Ritz B, Bandinelli S, Neuhouser ML, Beasley JM et al. 2017. Epigenetic clock analysis of diet, exercise, education, and lifestyle factors. Aging (Albany NY) 9: 419-446.

Rager JE, Bauer RN, Muller LL, Smeester L, Carson JL, Brighton LE, Fry RC, Jaspers I. 2013. DNA methylation in nasal epithelial cells from smokers: Identification of ulbp3-related effects. Am J Physiol Lung Cell Mol Physiol 305: L432-438.

Ran FA, Hsu PD, Lin CY, Gootenberg JS, Konermann S, Trevino AE, Scott DA, Inoue A, Matoba S, Zhang Y et al. 2013. Double nicking by rna-guided crispr cas9 for enhanced genome editing specificity. Cell 154: 1380-1389.

Ross SE, McCord AE, Jung C, Atan D, Mok SI, Hemberg M, Kim TK, Salogiannis J, Hu L, Cohen S et al. 2012. Bhlhb5 and prdm8 form a repressor complex involved in neuronal circuit assembly. Neuron 73: 292-303.

Russo VEA, Martienssen RA, Riggs AD. 1996. Epigenetic mechanisms of gene regulation. Cold Spring Harbor Laboratory Press, Plainview, N.Y.

Sager R, Sheng S, Pemberton P, Hendrix MJ. 1997. Maspin. A tumor suppressing serpin. Adv Exp Med Biol 425: 77-88.

Salpea P, Russanova VR, Hirai TH, Sourlingas TG, Sekeri-Pataryas KE, Romero R, Epstein J, Howard BH. 2012. Postnatal development- and age-related changes in DNA- methylation patterns in the human genome. Nucleic Acids Res 40: 6477-6494.

Saxonov S, Berg P, Brutlag DL. 2006. A genome-wide analysis of cpg dinucleotides in the human genome distinguishes two distinct classes of promoters. Proc Natl Acad Sci U S A 103: 1412-1417.

Scannell JW, Blanckley A, Boldon H, Warrington B. 2012. Diagnosing the decline in pharmaceutical r&d efficiency. Nat Rev Drug Discov 11: 191-200.

Schneider R, Bannister AJ, Kouzarides T. 2002. Unsafe sets: Histone lysine methyltransferases and cancer. Trends Biochem Sci 27: 396-402.

Selim AG, Moore AS. 2018. Molecular minimal residual disease monitoring in acute myeloid leukemia: Challenges and future directions. J Mol Diagn 20: 389-397.

Shallis RM, Ahmad R, Zeidan AM. 2018. Aplastic anemia: Etiology, molecular pathogenesis and emerging concepts. Eur J Haematol doi:10.1111/ejh.13153.

Shenker NS, Polidoro S, van Veldhoven K, Sacerdote C, Ricceri F, Birrell MA, Belvisi MG, Brown R, Vineis P, Flanagan JM. 2013. Epigenome-wide association study in the european prospective investigation into cancer and nutrition (epic-turin) identifies novel genetic loci associated with smoking. Hum Mol Genet 22: 843-851.

110

Bibliography

Shi Y, Kirwan P, Smith J, Robinson HP, Livesey FJ. 2012. Human cerebral cortex development from pluripotent stem cells to functional excitatory synapses. Nat Neurosci 15: 477- 486, S471.

Shlush LI. 2018. Age-related clonal hematopoiesis. Blood 131: 496-504.

Smith ZD, Meissner A. 2013. DNA methylation: Roles in mammalian development. Nat Rev Genet 14: 204-220.

Stanley N, Olson TS, Babushok DV. 2017. Recent advances in understanding clonal haematopoiesis in aplastic anaemia. Br J Haematol 177: 509-525.

Steensma DP. 2018. Clinical consequences of clonal hematopoiesis of indeterminate potential. Blood Adv 2: 3404-3410.

Sun XJ, Xu PF, Zhou T, Hu M, Fu CT, Zhang Y, Jin Y, Chen Y, Chen SJ, Huang QH et al. 2008. Genome-wide survey and developmental expression mapping of zebrafish set domain-containing genes. PLoS One 3: e1499.

Symmank J, Bayer C, Schmidt C, Hahn A, Pensold D, Zimmer-Bensch G. 2018. Dnmt1 modulates interneuron morphology by regulating pak6 expression through crosstalk with histone modifications. Epigenetics 13: 536-556.

Takahashi K, Tanabe K, Ohnuki M, Narita M, Ichisaka T, Tomoda K, Yamanaka S. 2007. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131: 861-872.

Takahashi K, Yamanaka S. 2006. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126: 663-676.

Tanigawa S, Islam M, Sharmin S, Naganuma H, Yoshimura Y, Haque F, Era T, Nakazato H, Nakanishi K, Sakuma T et al. 2018. Organoids from nephrotic disease-derived ipscs identify impaired nephrin localization and slit diaphragm formation in kidney podocytes. Stem Cell Reports 11: 727-740.

Teschendorff AE, Liu X, Caren H, Pollard SM, Beck S, Widschwendter M, Chen L. 2014. The dynamics of DNA methylation covariation patterns in carcinogenesis. PLoS Comput Biol 10: e1003709.

Teschendorff AE, Menon U, Gentry-Maharaj A, Ramus SJ, Weisenberger DJ, Shen H, Campan M, Noushmehr H, Bell CG, Maxwell AP et al. 2010. Age-dependent DNA methylation of genes that are suppressed in stem cells is a hallmark of cancer. Genome Res 20: 440-446.

Teschendorff AE, West J, Beck S. 2013. Age-associated epigenetic drift: Implications, and a case of epigenetic thrift? Hum Mol Genet 22: R7-R15.

111

Bibliography

Teschendorff AE, Yang Z, Wong A, Pipinikas CP, Jiao Y, Jones A, Anjum S, Hardy R, Salvesen HB, Thirlwell C et al. 2015. Correlation of smoking-associated DNA methylation changes in buccal cells with DNA methylation changes in epithelial cancer. JAMA Oncol 1: 476-485.

Thiede C, Prange-Krex G, Freiberg-Richter J, Bornhauser M, Ehninger G. 2000. Buccal swabs but not mouthwash samples can be used to obtain pretransplant DNA fingerprints from recipients of allogeneic bone marrow transplants. Bone Marrow Transplant 25: 575-577.

Thongngam P, Leewattanapasuk W, Bhoopat T, Sangthong P. 2017. Single nucleotide polymorphisms minisequencing in hypervariable regions for screening of thais. Gene 627: 538-542.

Townsley DM, Dumitriu B, Young NS. 2014. Bone marrow failure and the telomeropathies. Blood 124: 2775-2783.

Tsaprouni LG, Yang TP, Bell J, Dick KJ, Kanoni S, Nisbet J, Vinuela A, Grundberg E, Nelson CP, Meduri E et al. 2014. Cigarette smoking reduces DNA methylation levels at multiple genomic loci but the effect is partially reversible upon cessation. Epigenetics 9: 1382- 1396.

Tucker BA, Park IH, Qi SD, Klassen HJ, Jiang C, Yao J, Redenti S, Daley GQ, Young MJ. 2011. Transplantation of adult mouse ips cell-derived photoreceptor precursors restores retinal structure and function in degenerative mice. PLoS One 6: e18992.

Uhlen M, Oksvold P, Fagerberg L, Lundberg E, Jonasson K, Forsberg M, Zwahlen M, Kampf C, Wester K, Hober S et al. 2010. Towards a knowledge-based human protein atlas. Nat Biotechnol 28: 1248-1250. van Dongen J, Ehli EA, Jansen R, van Beijsterveldt CEM, Willemsen G, Hottenga JJ, Kallsen NA, Peyton SA, Breeze CE, Kluft C et al. 2018. Genome-wide analysis of DNA methylation in buccal cells: A study of monozygotic twins and mqtls. Epigenetics Chromatin 11: 54.

Vidaki A, Giangasparo F, Syndercombe Court D. 2016. Discovery of potential DNA methylation markers for forensic tissue identification using bisulphite pyrosequencing. Electrophoresis 37: 2767-2779.

Vidaki A, Kalamara V, Carnero-Montoro E, Spector TD, Bell JT, Kayser M. 2018. Investigating the epigenetic discrimination of identical twins using buccal swabs, saliva, and cigarette butts in the forensic setting. Genes (Basel) 9.

Vidaki A, Kayser M. 2017. From forensic epigenetics to forensic epigenomics: Broadening DNA investigative intelligence. Genome Biol 18: 238.

Vidaki A, Kayser M. 2018. Recent progress, methods and perspectives in forensic epigenetics. Forensic Sci Int Genet 37: 180-195.

112

Bibliography

Volarevic V, Markovic BS, Gazdic M, Volarevic A, Jovicic N, Arsenijevic N, Armstrong L, Djonov V, Lako M, Stojkovic M. 2018. Ethical and safety issues of stem cell-based therapy. Int J Med Sci 15: 36-45.

Vulliamy T, Marrone A, Goldman F, Dearlove A, Bessler M, Mason PJ, Dokal I. 2001. The rna component of telomerase is mutated in autosomal dominant dyskeratosis congenita. Nature 413: 432-435.

Vulliamy TJ, Kirwan MJ, Beswick R, Hossain U, Baqai C, Ratcliffe A, Marsh J, Walne A, Dokal I. 2011. Differences in disease severity but similar telomere lengths in genetic subgroups of patients with telomerase and shelterin mutations. PLoS One 6: e24383.

Waddington CH. 1942. The epigenotype. Int J Epidemiol 41: 10-13.

Wagner W. 2017. Epigenetic aging clocks in mice and men. Genome Biol 18: 107.

Wang T, Pan Q, Lin L, Szulwach KE, Song CX, He C, Wu H, Warren ST, Jin P, Duan R et al. 2012. Genome-wide DNA hydroxymethylation changes are associated with neurodevelopmental genes in the developing human cerebellum. Hum Mol Genet 21: 5500-5510.

Weidner CI, Lin Q, Birkhofer C, Gerstenmaier U, Kaifie A, Kirschner M, Bruns H, Balabanov S, Trummer A, Stockklausner C et al. 2016. DNA methylation in prdm8 is indicative for dyskeratosis congenita. Oncotarget 7: 10765-10772.

Weidner CI, Lin Q, Koch CM, Eisele L, Beier F, Ziegler P, Bauerschlag DO, Jockel KH, Erbel R, Muhleisen TW et al. 2014. Aging of blood can be tracked by DNA methylation changes at just three cpg sites. Genome Biol 15: R24.

Weissman IL. 2000. Stem cells: Units of development, units of regeneration, and units in evolution. Cell 100: 157-168.

Wu H, Min J, Lunin VV, Antoshenko T, Dombrovski L, Zeng H, Allali-Hassani A, Campagna- Slater V, Vedadi M, Arrowsmith CH et al. 2010. Structural biology of human h3k9 methyltransferases. PLoS One 5: e8570.

Wu X, Zhang Y. 2017. Tet-mediated active DNA demethylation: Mechanism, function and beyond. Nat Rev Genet 18: 517-534.

Xie M, Lu C, Wang J, McLellan MD, Johnson KJ, Wendl MC, McMichael JF, Schmidt HK, Yellapantula V, Miller CA et al. 2014. Age-related mutations associated with clonal hematopoietic expansion and malignancies. Nat Med 20: 1472-1478.

Xu CL, Park KS, Tsang SH. 2018. Crispr/cas9 genome surgery for retinal diseases. Drug Discov Today Technol 28: 23-32.

113

Bibliography

Yan XJ, Xu J, Gu ZH, Pan CM, Lu G, Shen Y, Shi JY, Zhu YM, Tang L, Zhang XW et al. 2011. Exome sequencing identifies somatic mutations of DNA methyltransferase gene dnmt3a in acute monocytic leukemia. Nat Genet 43: 309-315.

Yoshizato T, Dumitriu B, Hosokawa K, Makishima H, Yoshida K, Townsley D, Sato-Otsubo A, Sato Y, Liu D, Suzuki H et al. 2015. Somatic mutations and clonal hematopoiesis in aplastic anemia. N Engl J Med 373: 35-47.

Yu J, Hu K, Smuga-Otto K, Tian S, Stewart R, Slukvin, II, Thomson JA. 2009. Human induced pluripotent stem cells free of vector and transgene sequences. Science 324: 797-801.

Yu J, Vodyanik MA, Smuga-Otto K, Antosiewicz-Bourget J, Frane JL, Tian S, Nie J, Jonsdottir GA, Ruotti V, Stewart R et al. 2007. Induced pluripotent stem cell lines derived from human somatic cells. Science 318: 1917-1920.

Yuan T, Jiao Y, de Jong S, Ophoff RA, Beck S, Teschendorff AE. 2015. An integrative multi- scale analysis of the dynamic DNA methylation landscape in aging. PLoS Genet 11: e1004996.

Yusa K, Rashid ST, Strick-Marchand H, Varela I, Liu PQ, Paschon DE, Miranda E, Ordonez A, Hannan NR, Rouhani FJ et al. 2011. Targeted gene correction of alpha1-antitrypsin deficiency in induced pluripotent stem cells. Nature 478: 391-394.

Zbiec-Piekarska R, Spolnicka M, Kupiec T, Parys-Proszek A, Makowska Z, Paleczka A, Kucharczyk K, Ploski R, Branicki W. 2015. Development of a forensically useful age prediction method based on DNA methylation analysis. Forensic Sci Int Genet 17: 173- 179.

Zeeberg BR, Qin H, Narasimhan S, Sunshine M, Cao H, Kane DW, Reimers M, Stephens RM, Bryant D, Burt SK et al. 2005. High-throughput gominer, an 'industrial-strength' integrative gene ontology tool for interpretation of multiple-microarray experiments, with application to studies of common variable immune deficiency (cvid). BMC Bioinformatics 6: 168.

Zeilinger S, Kuhnel B, Klopp N, Baurecht H, Kleinschmidt A, Gieger C, Weidinger S, Lattka E, Adamski J, Peters A et al. 2013. Tobacco smoking leads to extensive genome-wide changes in DNA methylation. PLoS One 8: e63812.

Zhang FF, Cardarelli R, Carroll J, Fulda KG, Kaur M, Gonzalez K, Vishwanatha JK, Santella RM, Morabia A. 2011. Significant differences in global genomic DNA methylation by gender and race/ethnicity in peripheral blood. Epigenetics 6: 623-629.

Zhao Q, Zhang J, Chen R, Wang L, Li B, Cheng H, Duan X, Zhu H, Wei W, Li J et al. 2016. Dissecting the precise role of h3k9 methylation in crosstalk with DNA maintenance methylation in mammals. Nat Commun 7: 12464.

Zheng SC, Webster AP, Dong D, Feber A, Graham DG, Sullivan R, Jevons S, Lovat LB, Beck S, Widschwendter M et al. 2018. A novel cell-type deconvolution algorithm reveals 114

Bibliography

substantial contamination by immune cells in saliva, buccal and cervix. Epigenomics 10: 925-940.

Ziller MJ, Muller F, Liao J, Zhang Y, Gu H, Bock C, Boyle P, Epstein CB, Bernstein BE, Lengauer T et al. 2011. Genomic distribution and inter-sample variation of non-cpg methylation across human cell types. PLoS Genet 7: e1002389.

Zubakov D, Liu F, Kokmeijer I, Choi Y, van Meurs JBJ, van IWFJ, Uitterlinden AG, Hofman A, Broer L, van Duijn CM et al. 2016. Human age estimation from blood using mrna, DNA methylation, DNA rearrangement, and telomere length. Forensic Sci Int Genet 24: 33- 43.

7 Appendix

7.1 Abbreviations

UTR         untranslated region
5caC        5-carboxylcytosine
5hmC        5-hydroxymethylcytosine
5hmU        5-hydroxymethyluracil
5mC         5-methylcytosine
AA          aplastic anemia
aa          amino acid
ASPA        aspartoacylase
AFP         alpha fetoprotein
AML         acute myeloid leukemia
APOBEC      apolipoprotein B mRNA editing enzyme catalytic polypeptide
approx.     approximately
ARCH        age-related clonal hematopoiesis
ASXL1       additional sex combs like 1
ATP         adenosine triphosphate
aqua dest.  aqua destillata
BAT         brown adipose tissue
BER         base excision repair
BM          bone marrow
bp          base pairs
c-MYC       cellular myelocytomatosis oncogene
Cas         CRISPR-associated
Cas9n       Cas9 nickase
CD          cluster of differentiation
cDNA        complementary DNA
CHIP        clonal hematopoiesis of indeterminate potential
CpG         cytosine-guanine dinucleotide
CRISPR      clustered regularly interspaced short palindromic repeats
Ct          threshold cycle

CTBP        C-terminal-binding protein
DAPI        4',6-diamidino-2-phenylindole
DKC         dyskeratosis congenita
DMSO        dimethyl sulfoxide
DNA         deoxyribonucleic acid
DNAm        DNA methylation
DNMT        DNA methyltransferase
dNTP        deoxyribonucleoside triphosphate
e.g.        for example
EB          embryoid body
EDTA        ethylenediaminetetraacetic acid
ESC         embryonic stem cell
FACS        fluorescence-activated cell sorting
FCS         fetal calf serum
FITC        fluorescein isothiocyanate
FSC         forward scatter
GAPDH       glyceraldehyde 3-phosphate dehydrogenase
gDNA        genomic DNA
GFP         green fluorescent protein
gRNA        guide RNA
h           hour
HEpiDISH    Hierarchical Epigenetic Dissection of Intra-Sample-Heterogeneity
HPL         human platelet lysate
HSC         hematopoietic stem cell
iMSC        iPSC-derived mesenchymal stromal cell
iPSC        induced pluripotent stem cell
ITGA2B      integrin subunit alpha 2b
kb          kilobase
LINE        long interspersed nuclear elements
LTR         long terminal repeats
MAD         mean absolute deviation
MDS         myelodysplastic syndrome

mM          millimolar
MRD         minimal residual disease
mRNA        messenger RNA
MSA         buccal swabs (German: Mundschleimhautabstriche)
MSC         mesenchymal stromal cell
NEFH        neurofilament heavy
NES         nestin
ng          nanogram
NGS         next generation sequencing
NKX2.5      NK2 homeobox 5
OCT4        octamer binding transcription factor 4
PAX6        paired box 6
PC          principal component
PCA         principal component analysis
PDE4C       phosphodiesterase 4C
PFA         paraformaldehyde
PPAR        peroxisome proliferator-activated receptor
PRDM        PRDI-BF1 and RIZ homology domain containing protein
R           Pearson correlation coefficient
RNA         ribonucleic acid
ROCK        Rho-associated protein kinase
RUNX1       Runt-related transcription factor 1
SD          standard deviation
SDS         sodium dodecyl sulfate
Semi-qPCR   semi-quantitative polymerase chain reaction
SERPINB5    serine proteinase inhibitor, clade B, member 5
SINE        short interspersed nuclear elements
SOX1        SRY (sex determining region Y)-box 1
SOX17       SRY (sex determining region Y)-box 17
TAC1        tachykinin precursor 1
TALEN       transcription activator-like effector nuclease
TERC        telomerase RNA component


TERT    telomerase reverse transcriptase
TET    ten-eleven translocation
TGF    transforming growth factor
TSS    transcriptional start site
TSS1500    1500 bp upstream of TSS
TSS200    200 bp upstream of TSS
U    units
V    volt
vs.    versus
WAT    white adipose tissue
WT    wildtype
ZFN    zinc-finger nuclease
μg    microgram
μl    microliter
μM    micromolar


7.2 List of Figures

Figure 1.1 DNA methylation and demethylation pathway ...... 2
Figure 1.2 DNAm patterns in early life and aging ...... 5
Figure 1.3 Epigenetic drift vs Epigenetic clock ...... 6
Figure 1.4 Schematic representation of age-related clonal hematopoiesis ...... 10
Figure 1.5 Clonal hematopoiesis during malignant transformation ...... 11
Figure 1.6 Components of the telomerase complex commonly found mutated in telomeropathies ...... 14
Figure 1.7 Pathophysiology of dyskeratosis congenita ...... 16
Figure 1.8 PRDM family domain structure and relationships ...... 18
Figure 1.9 Human iPSC applications in clinical research ...... 21
Figure 2.1 Bisulfite conversion of genomic DNA ...... 26
Figure 2.2 Pyrosequencing assays for cell type specific CpG sites ...... 27
Figure 2.3 Pyrosequencing reaction ...... 28
Figure 2.4 Library preparation for BBA-Seq ...... 30
Figure 2.5 CRISPR/Cas9 design for the PRDM8 knockout ...... 35
Figure 2.6 Embryoid body assay scheme ...... 42
Figure 2.7 Neural differentiation scheme ...... 44
Figure 3.1 Epigenetic age prediction of swab and blood samples ...... 48
Figure 3.2 Age correlation of DNAm at age-associated CpG sites ...... 49
Figure 3.3 Age prediction models without cell type correction ...... 50
Figure 3.4 Identification of differentially methylated CpG sites ...... 52
Figure 3.5 Analyses of smoking, gender and ethnicity as confounding factors ...... 53
Figure 3.6 Buccal-Cell-Signature to estimate cellular compositions of buccal swab samples ...... 55
Figure 3.7 Combination of age-associated and cell type specific CpG sites ...... 56
Figure 3.8 Comparison of age prediction accuracies in different models ...... 57
Figure 3.9 Aberrant methylation pattern of neighboring CpG sites in AML patients ...... 59
Figure 3.10 Variations in DNAm patterns are patient-specific ...... 60
Figure 3.11 Quantification of DNAm variability during disease progression ...... 61
Figure 3.12 Validation of scoring systems via next generation sequencing ...... 62
Figure 3.13 MDS patients show increased variability scores ...... 63
Figure 3.14 Variability scores of a region in DNMT3A ...... 64
Figure 3.15 Correlation of variability scores in different genomic regions ...... 65
Figure 3.16 DNAm patterns of AML samples as measured by BBA-Seq ...... 66
Figure 3.17 Unsupervised clustering of AML samples by DNAm patterns ...... 67
Figure 3.18 Support vector machine algorithms classify healthy and malignant DNAm pattern ...... 69
Figure 3.19 DNAm frequencies of the PRDM8 gene in patients with telomeropathies ...... 70
Figure 3.20 Screening for genetically engineered iPSC clones ...... 71
Figure 3.21 Staining of pluripotency markers for the genetically engineered iPSC clones ...... 72
Figure 3.22 Multi-lineage differentiation potential of genetically engineered iPSC clones ...... 73
Figure 3.23 Neural differentiation of PRDM8 depleted iPSCs ...... 75
Figure 3.24 Gene expression analysis of PRDM8 in transduced iPSCs ...... 76
Figure 3.25 Confocal microscopy of transduced iPSCs after cell sorting ...... 77
Figure 3.26 Maturation of sensory neurons derived from transduced iPSCs ...... 78
Figure 3.27 Clustering of DNAm data of neurally differentiated cells ...... 80
Figure 3.28 PRDM8-dependent differential DNAm ...... 82
Figure 3.29 Distribution of differentially methylated regions ...... 83


7.3 List of Tables

Table 2.1 PCR reaction mix ...... 24
Table 2.2 Temperature profile PCR ...... 25
Table 2.3 TAE buffer (50x) ...... 25
Table 2.4 Primers for pyrosequencing ...... 29
Table 2.5 Primers for Next Generation Sequencing ...... 30
Table 2.6 cDNA synthesis reaction mix ...... 31
Table 2.7 Temperature profile for cDNA synthesis ...... 31
Table 2.8 semi-qPCR reaction mix ...... 32
Table 2.9 Temperature profile for semi-qPCR ...... 32
Table 2.10 Primer for semi-qPCR ...... 33
Table 2.11 gRNA oligonucleotides ...... 36
Table 2.12 Phosphorylation reaction mix ...... 36
Table 2.13 pX335A vector digestion mix ...... 37
Table 2.14 Primers for the generation of PRDM8 overexpression vectors ...... 37
Table 2.15 Precipitation mix for virus production ...... 40
Table 2.16 2xHBS buffer for transfection ...... 40
Table 2.17 EB culture medium ...... 42
Table 2.18 Neural differentiation medium ...... 43
Table 2.19 Neural maturation medium ...... 43
Table 2.20 Neural differentiation timetable ...... 44


7.4 Publications

Scientific Publications

Part of this work is incorporated in the following publications:

Eipel M, Mayer F, Arent T, Ferreira MR, Birkhofer C, Gerstenmaier U, Costa IG, Ritz-Timme S, Wagner W. 2016. Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures. Aging (Albany NY) 8: 1034-1048.

Eipel M, Božić T, Mies A, Beier F, Jost E, Brümmendorf TH, Platzbecker U, Wagner W. 2018. Tracking of myeloid malignancies by targeted analysis of successive DNA methylation at neighboring CG dinucleotides. Accepted for publication in Haematologica.

Further publications:

Han Y, Eipel M, Franzen J, Sakk V, Dethmers-Ausema B, Yndriago L, Izeta A, de Haan G, Geiger H, Wagner W. 2018. Epigenetic age-predictor for mice based on three CpG sites. Elife 7.

Fernandez-Rebollo E, Eipel M, Seefried L, Hoffmann P, Strathmann K, Jakob F, Wagner W. 2018. Primary Osteoporosis Is Not Reflected by Disease-Specific DNA Methylation or Accelerated Epigenetic Age in Blood. J Bone Miner Res 33: 356-361.


Oral presentations

Investigating the Influence of PRDM8-/- on the Aging Phenotype. International Meeting of the Stem Cell Network NRW, Münster, Germany, 2017.

Poster presentations

Eipel M, Mayer F, Arent T, Costa I, Ritz-Timme S, Wagner W. Epigenetic Age Predictions of Mouth Swab Samples. Annual Meeting of the German Foundation for Aging Research (DGfA), Jena, Germany, 2015.

Eipel M, Mayer F, Arent T, Ferreira MRP, Birkhofer C, Gerstenmaier U, Costa I, Ritz-Timme S, Wagner W. A combination of cell type specific and age associated DNA methylation pattern for epigenetic age predictions of buccal swabs. 6th Clinical Epigenetics International Meeting of the Clinical Epigenetics Society (CLEPSO), Düsseldorf, Germany, 2016, and 2nd Cologne Ageing Conference (CECAD), Cologne, Germany, 2016.

Eipel M, Weidner C, Rösseler C, Lampert A, Wagner W. Investigating the premature aging phenotype of PRDM8 knockout in induced pluripotent stem cells using the CRISPR/Cas9 technology. Annual Meeting of the German Foundation for Aging Research (DGfA), Ulm, Germany, 2016.


7.5 Acknowledgements

To honor the fact that research is rarely an entirely solitary process, I decided to use the first-person plural pronoun “we” in the text.

First of all, I would like to express my gratitude to Prof. Dr. Dr. Wolfgang Wagner, who supervised me during my Ph.D. time. Thank you for your time and for teaching me not only research skills but also a lot about the scientific world and how to deal with setbacks. I highly appreciated your encouraging guidance!

I would also like to thank Prof. Dr. Martin Zenke for co-supervising me. I benefited from your experienced input over the years. Thank you for the opportunity to work with your team and for evaluating this thesis. I also thank Prof. Dr. Panstruga for taking the time to act as my examiner, and for awakening my interest in molecular biology during his motivating lectures.

Many thanks also to our cooperation partners in the group of Prof. Ritz-Timme, PD Dr. Jost, Prof. Brümmendorf and in particular to Dr. Fabian Beier for providing samples and constructive discussions over the years. Special thanks also to Dr. Corinna Rösseler and Prof. Lampert for their expertise in performing neural differentiation. I really enjoyed our collaboration. In addition, I am very thankful for the help with the bioinformatics analysis provided by Julia Franzen and Jan Hapala. Julia in particular was always very patient in explaining the basics of bioinformatic procedures to me. Thanks to Olivia and Florian for carefully reading my thesis and for providing helpful suggestions.

My special thanks go to all former and current colleagues who supported me with valuable scientific contributions. Last but not least, you are all fantastic and positive people, and I am more than happy that I could work in such a friendly environment. At this point I send my special thanks and love to Edu and Marta, who were always there when I needed them.

My very special thanks go to my family and my friends, who have always supported me and on whom I can always rely unconditionally.


7.6 Declaration of Authorship

Declaration of Authorship

I certify that the work presented here is, to the best of my knowledge and belief, original and the result of my own investigations, except as acknowledged, and has not been submitted, either in part or in whole, for a degree at this or any other university.

Aachen, ______ Monika Eipel

Statutory Declaration

I hereby declare in lieu of oath that I have written the present thesis independently and using only the literature and aids indicated. The thesis has not previously been submitted in the same or a similar form to any other examination authority. I have taken note of and complied with the principles for safeguarding good scientific practice of RWTH.

Aachen, ______ Monika Eipel
