University of Connecticut OpenCommons@UConn

Doctoral Dissertations University of Connecticut Graduate School

11-8-2018 Evaluation of the Epigenetic Regulation of Two X- linked, Autism Candidate : X-linked Lymphocyte Regulated 3b and -Like 1 Glenn W. Milton University of Connecticut - Storrs, [email protected]

Follow this and additional works at: https://opencommons.uconn.edu/dissertations

Recommended Citation Milton, Glenn W., "Evaluation of the Epigenetic Regulation of Two X-linked, Autism Candidate Genes: X-linked Lymphocyte Regulated 3b and Transketolase-Like 1" (2018). Doctoral Dissertations. 1983. https://opencommons.uconn.edu/dissertations/1983 Evaluation of the Epigenetic Regulation of Two X-linked, Autism Candidate Genes: X-linked Lymphocyte Regulated 3b and Transketolase-Like 1 Glenn W. Milton, PhD University of Connecticut 2018 Abstract

According to the CDC the prevalence of Autism Spectrum Disorder (ASD) in

males compared to females is approximately 5:1. Skuse et al. hypothesized that

imprinted genes on the X could account for this sex bias. Genomic

imprinting is defined as differential expression dependent upon the parental origin

of the allele. While genomic imprinting on autosomes has been well classified, no

imprinting mechanism on the sex has been discovered. This work

presents an investigation into the regulation of expression governing the imprinted locus

of X-Linked Lymphocyte Regulated 3b/4b/4c (Xlr3/4), and closely linked ASD candidate

gene, Transketolase-Like 1 (TKTL1), in the developing central nervous system of the

. Examination of epigenetic signatures and the chromatin environment

including differential DNA methylation, differential nucleosome positioning, H3.3

deposition and G-quadruplex formation, suggests that epimutations in these loci may

underlie disease association.

Transketolases play a pivotal role in the Pentose Phosphate Pathway (PPP), and

dysfunction in the non-oxidative phase of the PPP has been linked to

neurodegenerative disorders and cancers. This work explores the expression of TKTL1

in human brain sub-regions of ASD patients compared to neurotypical individuals as

well as the functional role TKTL1 plays in the PPP. It is observed that TKTL1 may not Glenn Milton University of Connecticut 2018 function as a transketolase in the pathway, but may play a critical regulatory role as a competitive inhibitor.

Evaluation of the Epigenetic Regulation of Two X-linked, Autism Candidate Genes: X-linked Lymphocyte Regulated 3b and Transketolase-Like 1

Glenn W. Milton

B.S., Quinnipiac University, 2010

A Dissertation

Submitted in Partial Fulfillment of the

Requirements for the Degree of

Doctor of Philosophy

At the

University of Connecticut

2018

i APPROVAL PAGE

Doctor of Philosophy Dissertation

Evaluation of the Epigenetic Regulation of Two X-linked, Autism Candidate Genes: X- linked Lymphocyte Regulated 3b and Transketolase-Like 1

Presented By:

Glenn W. Milton B.S.

Major Adivsor Michael J. O’Neill Ph.D.

Associate Advisor Rachel J. O’Neill Ph.D.

Associate Advisor John Malone Ph.D.

Associate Advisor Leighton Core Ph.D.

Associate Advisor Joseph LoTurco Ph.D.

University of Connecticut 2018

ii Acknowledgements

First, I would like to thank my P.I., Michael O’Neill. It has been incredible working with you in your laboratory for the duration of my research. From our first meeting before I started at UCONN, I knew your lab was a great fit for growth and support as a scientist. It has been a pleasure being a TA for your courses where you allowed me to grow as an instructor and not just as a scientist. Your door was always open to discuss any and all science ideas, or life in general, and for that I will forever be grateful.

To my committee members: Rachel, you have always been an incredible resource in terms of my progress as a scientist. I know if I ever had a question, you would be more than willing to help. You helped shape the course of my research and pointed me in directions I may have otherwise not explored. John, from my very first meeting with you, your enthusiasm for science has been contagious. When we met for my proposal, we spoke for over two hours about my research and you were always happy when we saw each other in the halls to catch up on my progress. Leighton, your NGS and bioinformatics experience have been indispensable to my scientific growth. Joe, your questions pushing me to think critically about neurons and the impact my gene of interest may have has made me a better all-around scientist. To all my committee members, thank you.

To all O’Neill members past and present thank you. Especially Sohaib who trained me in lab when I first started. And Rob for being a great friend and person to discuss science with literally any time, day or night. Thank you to my graduate and undergrad students who were integral to my success in the lab, Hannah, Becca, Jessica and Liz.

To all of my family, your love and support has helped me through my winding path. To my parents especially, thank you for raising me to be the person I am today and supporting me in accomplishing my goals. I spent over ten years in Connecticut away from my family but every time we spoke or I saw you, you made me feel like the best son, brother, uncle, husband and father in the world.

To my friends, particularly Adam and Jeff, thank you for always making me laugh. You have always been there for me and I will never forget how much you mean to my life.

Most important of all, thank you to my wife Sami and my son Jack. You are my rock, my inspiration, and my driving force. Sami, you always push me to be better than I ever thought I could be. Without your love and support, I know I would not have been able to accomplish this goal. Jack, all of this is for you. I hope I can be as inspiring to you as the people in my life have been for me. I love you and thank you for being the best thing I have ever done.

iii Table of Contents

Chapter 1: Introduction ...... 1 Overview ...... 1 Background ...... 4 Evidence of a Social Cognitive Deficit Associated with Parent of Origin ...... 7 The Epigenetic Effect on ASD ...... 9 Evolutionary Origin of Genomic Imprinting ...... 10 Genomic Imprinting in the Context of Disease ...... 12 Genomic Imprinting and Behavior...... 13 Underlying Imprinting Control Mechanisms ...... 14 DNA Methylation and DMRs ...... 16 Mammalian Dosage Compensation ...... 17 Raefski Development of Turner Syndrome Model Mice ...... 19

Chapter 2: The Imprinted X-Linked Lymphocyte Regulated (Xlr) ...... 22 Characterization of the First X-Linked Imprinted Locus (Xlr) ...... 22 Functional Study of the Xlr3b ...... 24 Regulation of the Imprinted Xlr Cluster (Xlr3/4) ...... 25

Chapter 3: Chromatin Architecture at the Imprinted Cluster Xlr3/4 Region ...... 30 Bionano Methylation of XA7.2 ...... 30 Nucleosome Positioning at XA7.2 ...... 33 Assay of Nucleosome Positioning by MNase-Seq ...... 34 Open Chromatin and Nucleosome Occupancy ...... 41 ATAC-Seq DNA Transcription Factor Binding and DNA Footprint Analysis ...... 53 Conclusions ...... 56

Chapter 4: The Transketolase-Like 1 Gene in Neurodevelopment ...... 66 Discovery of the Tktl1 Imprinted Expression in Mouse ...... 66 TKTL1 Expression in Human Brain Sub-Regions ...... 69 TKTL1 Imprinting Status in Human Brain Sub-Regions...... 74 Conclusions ...... 75

Chapter 5: Characterization of the TKTL1 Protein ...... 77 TKTL1 Protein Interactions ...... 79 Conclusions ...... 85

Chapter 6: Synthesis and Future Directions ...... 88

Chapter 7: Materials and Methods ...... 103

iv

Table of Figures

Figure 1: Skuse et al. Social Cognitive Ability 8 Figure 2: Igf2/H19 Imprinting Mechanism 15 Figure 3: X Monosomic Mouse Breeding Scheme 20 Figure 4: Discovery of the Imprinted Xlr3/4 Region 23 Figure 5: X-Linked Lymphocyte Regulated (Xlr3/4) Locus 24 Figure 6: Methylation Status at the Xlr Imprinted Cluster 26 Figure 7: Xlr3b Primary Transcript Analysis 27 Figure 8: HpaII Methylation Assay at F8a 29 Figure 9: Bionano Methylation Assay Alignment 31 Figure 10: MNase-Seq Nucleosome Occupancy Map (NOM) 38 Figure 11: MNase-Seq Data Analysis 39 Figure 12: ATAC-Seq Protocol 42 Figure 13: ATAC-Seq Profile of the Xlr3 Paralogs 44 Figure 14: ATAC-Seq Profile of the Xlr4 Paralogs 45 Figure 15: ATAC-Seq Profile of Ddx3x, Sox3, and H19 46 Figure 16: Nucleosome Occupancy ATAC-Seq 49 Figure 17: ATAC Nucleosome Profile of Biallelically Expressed Genes 50 Figure 18: Survey of Nucleosome Occupancy in Regions of the X Chromosome 52 Figure 19: Protein Interaction Map of Nr2e3 55 Figure 20: Nucleosome Occupancy ATAC-Seq Xlr3b 58 Figure 21: Differential Nucleosome Positioning at the Xlr3/4 Locus 59 Figure 22: DNA TF Meme Suite Output 60 Figure 23: DNA Footprint Analysis of Xlr3 Paralogs 61 Figure 24: DNA Footprint Analysis of X Chromosome Genes 62 Figure 25: MNase Enrichment Comparison to ATAC-Seq Nucleosome Positioning 64 Figure 26: G-Quadruplex Alignment with Nucleosome Occupancy 65 Figure 27: Dr. Nesbitt Real-Time PCR results of Tktl1 Expression 67 Figure 28: 19 Week Fetal Human Brain TKTL1 Expression 67 Figure 29. Non-Oxidative Phase of the Pentose Phosphate Pathway 68 Figure 30. Primer design for qRT-PCR products in H. sapiens TKTL1 69 Figure 31: Relative TKTL1 Expression in Neurotypical vs. ASD Females 71 Figure 32: Sequencing Analysis of TKTL1 SNP Present in Human Brain Sub-regions 72 Figure 33: Tkt and Tktl1 Conservation 76 Figure 34: qRT-PCR HT-22 Transfected Cell Lines 79 Figure 35: Co-IP of Tkt/Akt and TKTL1/Akt 80 Figure 36: Tkt and TKTL1 IP for Phosphorylation Analysis 82 Figure 37: Relative Ratio of Glutathione and NADPH in Cells 83 Figure 38: qRT-PCR of Mouse IUE Brain Samples 83 Figure 39: Mouse IUE IP Western Blot Analysis 84 Figure 40: A Proposed Imprinting Mechanism Model Silencing the Paternal Xlr3b 92 Figure 41: Proposed Alternative Imprinting Mechanism Model for Xlr3b 95 Figure 42: Model of Typical TKTL1 Phosphorylation 101 Figure 43: Competition Model of Over Expressed TKTL1 in the PPP 102

v Chapter 1: Introduction

Overview

The CDC reports there is approximately a 5:1 ratio in the occurrence of Autism

Spectrum Disorder (ASD) in males compared to females, but the reason for this male bias is currently unknown1. Dr. David Skuse’s X-linked imprinting theory proposes that this bias is due to the presence of an imprinted gene(s) on the X chromosome2. In mammals, imprinted genes are distinguished by the fact that they exhibit parent-specific gene expression (PSGE). Currently, approximately 200 imprinted genes have been identified in mammals3, however only six genes showing PSGE have been identified on the mouse X chromosome and none have been reported for the human X chromosome4.

In 1997 Skuse et al. theorized that the presence of imprinted genes on the X chromosome may explain a parent-of-origin associated social cognitive deficiency seen in patients with Turner Syndrome (TS), i.e. X chromosome monosomy. Skuse et al. observed decreased social cognitive ability in TS females who inherited the maternal X

2 chromosome (45XM) compared to 45,XP TS females . From this work a research path emerged to investigate X-linked imprinted genes that may play a role in social cognitive function and neurodevelopment. At the time only a few dozen imprinted genes were known and all were autosomally linked.

In 2005, Raefski and O’Neill reported the discovery of the first X-linked imprinted cluster of genes including Xlr3b, Xlr4b, and Xlr4c on the mouse X chromosome (which will hence be referred to as the Xlr3/4 locus)5. While imprinting at autosomal loci was known to be regulated by parent-specific methylation at critical Differentially Methylated

1 Regions (DMRs), no DMR has been identified at this Xlr3/4 locus, suggesting a novel imprinting mechanism exists to maintain the imprinting status of these genes. Evidence to date suggests the chromatin environment could play a role in the PSGE mechanism seen at the Xlr3b/4 locus.

In addition to DMRs autosomal imprinted genes can be distinguished by the presence of other allele-specific epigenetic marks. The regulation and maintenance of

PSGE at autosomal imprinted loci is accomplished, in general, by well-characterized mechanisms utilizing DMR’s and specific histone modifications6. Imprinting mechanisms for many autosomally imprinted genes have been well studied and are well understood.

However, an imprinting mechanism for sex chromosome linked genes has yet to be identified and characterized.

This work explores the hypothesis that Xlr3b, Xlr4b, and Xlr4c are imprinted by a previously unidentified mechanism that does not involve the classic epigenetic marks known to exist at autosomal imprinted loci. Additionally, this work explores the hypothesis that Tktl1 in mice and TKTL1 in humans modulates the activity of the PPP in a way that could contribute to neurodegeneration in ASD. Further, this work provides insight into the overall chromatin environment of the X chromosome allowing for analysis of previously unknown differences between the maternal and paternal X chromosome in mammals.

Specific questions addressed in this dissertation include:

1) Are nucleosomes positioned differently in a parent specific manner at the

Xlr3/4 locus?

2 2) Are there parent-of-origin specific differences in open chromatin and thus

different Transcription Factor (TF) binding motifs at the Xlr3/4 locus?

3) Is TKTL1 expressed significantly different among Autistic and neurotypical

brain sub-regions?

4) What is the functional relationship between Tkt (the canonical mammalian

transketolase) and its X-linked paralog, Tktl1 and how do they influence the

Pentose Phosphate Pathway (PPP) in neuronal cells?

3

Background

Autism Spectrum Disorder (ASD)

A prevalent gender bias has been observed in several neurodevelopmental disorders and, of particular interest, in ASD. A 5:1 male to female ratio has been reported by the CDC, and in several studies a ratio as high as 7:1 has been seen in high-functioning ASD subjects7. The prevalence of ASD is approximately 1 in 59 children in the US and between 1-2% worldwide according to the CDC data and statistics. From genome-wide association studies (GWAS) and exome and whole- genome sequencing of simplex and multiplex families with autistic individuals, both rare and common variants in over 100 genes have been implicated in the etiology of ASD8.

Along with genetic factors, environmental factors are believed to play a significant role in the development of ASD9. At a cellular level, oxidative stress has been implicated as a contributing factor in ASD development stemming from a link between increased intracellular free radicals and neurodegeneration10.

The most recent description and differential diagnosis of ASD arises from the

Diagnostic and Statistical Manual V (DSM-5), since its release in 201311. The disorder typically manifests in symptoms including impaired communication and social interactions. The Autism Society also lists the following characteristics as signs/symptoms of ASD: lack or delay in spoken language, repetitive use of language and/or motor mannerisms, little or no eye contact, lack of interest in peer relationships, lack of spontaneous or make-believe play, and persistent fixation on parts of objects12.

Currently, early intervention behavioral testing can significantly improve autistic individual outcomes by conducting a developmental screening of infants and working

4 with multidisciplinary teams to identify the needs of the child. However, understanding the underlying genetic component of ASD could lead to earlier intervention and potentially pharmacologic therapy targeted to the disorder.

Neurons due to their high metabolic rate and reduced ability for cellular regeneration, are particularly sensitive to oxidative stress compared to other organs and cell types. This susceptibility to damaging effects of reactive oxygen species (ROS) has linked oxidative stress and neurodegenerative disorders such as Alzheimer’s disease and ASD. Ultimately, an increase in ROS leads to lipid peroxidation and the breakdown of cellular membranes13. Particularly of interest to this work is the function of transketolases during the non-oxidative phase of the PPP. Transketolase catalyzes the reversible conversion of glucose-6-phosphate to ribulose-5-phosphate in the non- oxidative phase of the PPP, and as a consequence, is responsible for the production of

NADPH. NADPH production is essential for ROS protection in neuronal cells.

Transkeolase activity in the PPP, therefore, may play a pivotal role in managing oxidative stress and, by extension, ensuring proper neurodevelopment14,15.

As stated before, potentially contributory variants in over 100 genes have been linked to the occurrence of ASD. Genes that have been linked to ASD include; RELN, important for intercellular connections in the nervous system16, HOXD117, important for brain structure formation, genes involved in the production of Gamma-amino-butryic acid (GABA), an important neurotransmitter18, and SLCC6A4, a serotonin transporter gene important for serotonin equilibrium in synapses18. The gene, HOXA1 has been well studied and is essential to the development of cranial nerves, brain structures and the skeleton of the head and neck. Ingram et al. showed that a variant form of the

5 HOXA1 (A/G) gene occurred at a frequency of approximately 40% in ASD patients compared to the frequency of 10-25% in neurotypical individuals of European and

African origin. Interestingly, this HOXA1 variant gene of interest displayed a gender effect of 11:10 (A/G to A/A) in females and 23:44 in males. It is suggested by Ingram et al. that this HOXA1 variant effect may be more apparent in females than males, where all affected females were A/G variant heterozygotes17. A statistically significantly larger proportion of affected offspring inherited the maternal G allele. It was concluded from segregation data that the parental source of the allele is important in the occurrence of

ASD19.

The sex differential in the HOXA1 variant association with ASD may be an example of what is termed, the Female Protective Effect (FPE) that was formulated to explain the sex bias in occurrence of ASD and other neurodevelopmental diseases20,21.

The underlying theory of the FPE is that females have an inherent defense against certain disorders due to the action of sex hormones, or to the presence of two X chromosomes compared to the single X in males (paired with a Y chromosome). It has been reported that females diagnosed with ASD display numerous de novo DNA mutations compared to males21. In essence, for females to present an ASD phenotype, the threshold for ASD manifestation is higher. This protection may result from X chromosome mosaicism due to mammalian dosage compensation via X chromosome inactivation (XCI). XCI in karyotypically normal females constitutively silences one randomly chosen X chromosome, the resulting mosaicism in females provides a built in defense mechanism for detrimental recessive X-linked mutations21. If a predisposing variant is inherited by a female on a single X chromosome, only approximately 50% of

6 cells would ultimately express the variant, while males carrying the variant would express it in all cells. In this way, the female is protected from potential region specific deleterious effects of variant alleles.

Evidence of a Social Cognitive Deficit Associated with X Chromosome Parent of Origin

The FPE could begin to explain the occurrence of the male bias in ASD. Females would be inherently protected from social cognitive deficits associated with the X chromosome due to random X inactivation. Another theory proposed to explain why there is a high prevalence of ASD in males is the Extreme Male Brain (EMB) theory20.

Baron-Cohen et al. hypothesize that autism is a manifestation of hyper-masculinized brain development. EMB invokes the involvement of Fetal Testosterone (fT), where increased levels of fetal androgens affect sex differences in behavior, cognition, brain structure and function20. This theory would explain how females are protected from higher rates of ASD and why males are susceptible. However, currently no concrete evidence exists connecting increased androgen levels and patients with ASD. With this lack of evidence, it has been hypothesized that differences on the X chromosome, such as the epigenetic environment, may explain the skewed prevalence of males diagnosed with ASD.

This X chromosome hypothesis was first suggested by Dr. David Skuse after studying females with Turner Syndrome2. Turner Syndrome is the result of either total or partial X monosomy and patients share several symptoms with those diagnosed with

ASD22,23. Skuse tested the social cognitive function of TS females first via a first-stage screening survey, then neurological assessments2. These neurological assessments included the Weschsler Intelligence Scales for Children, the Weschsler Adult

7 Intelligence Scales-Revised and the Test of Everyday Attention for Children. These

tests were scored and their results showed that females in the study that inherited the

maternal X chromosome (45,XM) performed significantly worse than those females who

inherited the paternal X chromosome (45,XP), (Figure 1). From these results, he

suggested that an imprinted gene or gene cluster on the X chromosome could account

for the parent-of-origin difference in TS social cognitive impairment, and, furthermore,

could explain the skewed prevalence of ASD in males2. This was the first report in the

literature to invoke an epigenetic explanation for the sex bias in ASD.

Figure 1: Skuse et al. Social Cognitive Ability Graph of social cognitive ability of Turner Syndrome Females inheriting either 45,XM or 45,XP compared to 46,XX and 46,XY individuals. The higher score on this test indicates lower social cognitive ability. 45,XM females displayed the lowest levels of cognitive ability while 45,XP individuals displayed similar social cognitive abilities to that of males. 46,XX females scored best on the social cognitive test. Adapted from Skuse et al, 19972. The Epigenetic Effect and ASD

Epigenetics encompasses the study of hereditable gene expression changes that

do not involve changes to the DNA sequence. Epigenetic effectors in mammals include

methylation of cytosines at CpG dinucleotides, and biochemical modifications to

8 nucleosomal histone such as acetylation, methylation, phosphorylation, and ubiquitination24. Each one of these epigenetic marks can influence gene regulation in the cell. The term “epigenotype” was first coined by C.H. Waddington to describe how external factors, such as the environment, can have an effect on an organism’s phenotype25. In addition to Skuse’s X imprinting theory, evidence of a strong environmental component to the overall susceptibility of ASD was identified in studies of both monozygotic and dizygotic twins26,27. In order to understand the influence of epigenetics on the development of ASD, it is important to consider epigenetic influences on gene expression in the broadest of sense.

One of the earliest observations of epigenetic gene regulation came from studies of position effect variegation in the fruit fly, Drosophila melanogaster in which experiments showed that a gene’s proximity to densely packed regions, or heterochromatic regions, of the chromosome can influence the overall expression levels of the gene28. Heterochromatic DNA is characteristically inaccessible to transcription factors and are generally silent domains due to the conformation of the chromatin in a dense state28. Another epigenetic phenomenon observed in maize involves an interaction of two alleles where one allele causes heritable changes in the second allele, without changing the DNA sequence, this is referred to as a paramutation29. The earliest epigenetic phenomenon studied in mammals focused on the mechanism that regulates

X chromosome dosage compensation between males and females. In this process, a random X chromosome in XX females is silenced to reduce the genomic content to equal the XY male30. The epigenetic marks associated with this silencing event are discussed later in the chapter. Finally, genomic imprinting is known to be regulated by

9 epigenetic marks determining parent-of-origin specific gene expression, which will be examined further in the following section31.

Evolutionary Origin of Genomic Imprinting

Genomic imprinting involves the differential expression of genes depending upon which parent contributed the allele, and is often termed Parent-Specific Gene

Expression (PSGE). PSGE has been most often observed in mammals32 and in seed- bearing plants33. The relevance of genomic imprinting to neurodevelopment and behavior in mammals comes about through understanding how it evolved. The original rationale to explain the evolution of imprinted genes involved a theory of parental investment (PI) in relation to both the parent and the offspring34. This theory addressed the conflict that arises between each parent’s optimal PI into each mating34. In mammals, as in seed-bearing plants, the development of offspring is supported by physical resource input from maternal tissues, while the paternal input is purely genetic.

A prevailing theory in the field, known as the Parent-Offspring Conflict model, or Kinship theory, addresses the issue that although parental genetic contributions are equivalent, the amount of resource investment into the development and growth of the offspring is unequal34. The paternal genome can evolve ways to exploit the maternal input, while the maternal genome counter-adapts. The Parent-Offspring Conflict model explains the observation that most paternally imprinted (silenced) genes are growth suppressors, while most maternally imprinted genes are growth enhancers34. Pomiankowski hypothesized that due to the maternal and paternal X pattern of inheritance, where the paternal X is always passed to the female offspring and the maternal X is passed similar to autosomes, the X chromosome is pre-adapted to be regulated by imprinting. This

10 theory addresses X-linked imprinting as a response to selection pressure differences between males and females. The hypothesis is based on the asymmetry of inheritance of the maternal and paternal X chromosome and provides insight as to how X-linked imprinting may have evolved differently than autosomal imprinted genes following the

Parent-Offspring Conflict model35.

The fact that the parental genomes display functional inequality, creates in essence a roadblock to asexual reproduction in mammals meaning both parental genomes are essential for the developing organism36,37,38. Parental marks established in the germ line on individual alleles are crucial for the proper expression of genes throughout development38. One of the first mammalian epigenetic marks identified was parent-specific differential methylation39. Differential DNA methylation marks are established in the parental germ lines and are reset in a sex specific manner in each subsequent generation40.

In terms of evolution, the Parent-Offspring Conflict model argues that paternal alleles aim to increase the input of maternal resources into developing young to gain a competitive advantage for their offspring from that specific mating (i.e. more resources, bigger, stronger offspring). Maternal alleles however, aim to reduce resource output per litter or offspring to ensure future viable mating. A female is able to contribute her genomic material to all of her offspring through successive matings with several males, whereas the male genome aims to optimize the survival of the offspring from only his mating. In essence, paternal alleles evolved an epigenetic program to maximize maternal output from a single breeding, while maternal alleles regulate to spread resources over several matings41. The result of this evolutionary conflict is the silencing

11 of the maternal alleles of fetal growth enhancing genes and silencing of the paternal alleles for growth suppressing genes42. As discussed below, mutations affecting normal imprinted expression of these genes not only affect growth patterns, but also adversely affect neurodevelopment.

Genomic Imprinting in the Context of Disease

Errors in balancing the genomic conflict inherent between the maternal and paternal genome have been associated with several disorders43,44. Understanding how these disorders arise may provide insight into how imprinted loci can influence ASD development. Loss of genomic imprints leading to inappropriate activation or silencing of both alleles severely alter gene dosages and can result in deleterious effects for the developing organism. Prader-Willi Syndrome45 and Angelman Syndrome44 are often caused by a deletion event on human chromosome 15. This deletion specifically occurs at the 15q11-q13 region and depending upon which parental allele experiences the deletion, the symptoms are very different46. A deletion of genes on the paternal chromosome 15 is responsible for the development of Prader-Willi Syndrome and

Angelman Syndrome is caused by a deletion of a gene on the maternal chromosome

1547. Due to the parent of origin specific deletion in these disorders, it was theorized that an imprinted locus was interrupted on chromosome 1547. Further research revealed that, in a subset of cases, uniparental disomy was responsible for the disorders48. This is consistent with the theory that these diseases involve genomic imprinting because the individuals would inherit identical imprinting marks on both alleles thus disrupting normal

PSGE. The imprinted gene primarily associated with Angelman Syndrome, UBE3A, is

12 an ASD candidate gene; furthermore, patients with Angelman Syndrome exhibit autistic features49.

Genomic Imprinting and Behavior

The influence of imprinted genes on behavior was investigated by Champagne et al. using a Peg3 knockout mouse model50. Peg3 is a maternally imprinted and paternally expressed gene on mouse chromosome 7 that is strongly expressed in the placenta and the developing organism. Behavioral changes were observed in Peg3 mutant female mice as well as a disruption of reproductive success. This result was dependent on the genetic background of the maternal mouse. Characteristics displayed by the Peg3 mutant mice included lower olfactory discrimination, reproductive success, pup retrieval, and postpartum licking/grooming. The mutant offspring displayed impairment in both sucking and growth rate compared to wild type (WT). Interestingly, the group also observed an effect of Peg3 mutant mice on their non-mutant female offspring suggesting there is a mode of transmission of effects without transmission of the mutant allele. It is believed the behavioral differences in the mother caring for her pups is responsible for this mode of transmission50. A recent study led to the identification of a gene in mouse, pleckstrin homology-like domain family A member

2 (Phlda2), that influences the hormones produced by the mother during pregnancy51.

This gene negatively regulates a portion of the placenta and is dependent on dosage of

Phlda2 in the offspring. Expression differences of this gene were observed to cause changes to the maternal hypothalamus and hippocampus during pregnancy and altered maternal care for pups after birth. Paternal silencing of Phlda2 caused increased nurturing of pups suggesting that paternal silencing of this gene evolved to increase

13 maternal care51. This study provides evidence for how genomic imprinting of genes can be directly linked to behavior. Furthermore, it provides evidence that imprinted genes are involved in Parent-Offspring conflict, where the paternal genome influences the maternal output of resources, such as nurturing, to ensure survival of offspring.

With imprinted genes associated with behavior and Pomiankowski’s theory of a pre-adapted X chromosome for genomic imprinting, Skuse et al.’s theory of X-linked imprinted genes underlying the difference in social cognitive ability in TS female patients is supported by data found the literature. Identification of X-linked imprinted genes responsible for X-linked imprinting and the underlying imprinting mechanism, may provide insight into the direct link of X-imprinted genes and neurodevelopmental disorders. The following section describes current understanding of characterized mechanisms of autosomal genomic imprinting.

Underlying Imprinting Control Mechanisms

Since imprinted genes have been implicated in ASD, it is important to understand the underlying mechanisms that control this parent specific expression. The following describes what is currently known in the literature about two autosomal imprinted gene clusters and their regulation. The best characterized of these genes is Insulin like growth factor II (Igf2)52,53. This gene is maternally silenced and paternally expressed on mouse chromosome 7. The imprint was discovered after disruption of Igf2 expression by gene targeting in mice produced differing results depending on whether the knockout was transmitted through the maternal or paternal germ line. When Igf2 expression was altered in the paternal germ line, the resulting offspring displayed adverse characteristics52. However, this was not the case when the gene was altered in the

14 maternal germline52. This observation suggested that a parent specific expression effect may be occurring at the locus. Subsequently, a DMR was observed in the promoter of the gene H1953, which is closely linked to Igf2 on mouse chromosome 753. At this locus, the paternal allele is methylated while the maternal allele is unmethylated, and H19 shows a pattern of PSGE opposite that of Igf253. Igf2 and H19 were observed to utilize the same enhancer elements downstream of H19 to regulate gene expression of opposing parental alleles53. The protein CTCF binds to the unmethylated maternal allele upstream of H19 and acts as an insulator allowing the enhancer elements to interact with the promoter and drive gene expression of H19. On the paternal allele, the region is methylated, CTCF cannot bind, and the enhancer elements are used by Igf2 to express the gene (Figure 2). To determine if this DMR was in fact the imprinting control region (ICR), deletion experiments were conducted to elucidate if expression of the parental alleles were altered54,55. It was concluded that the identified DMR was responsible for regulating proper imprinting at the locus54,55.

Figure 2: Igf2/H19 Imprinting Mechanism Methylation of CpG islands on the paternal allele of the Igf2/H19 locus. Due to this methylation CTCF is unable to localize and downstream enhancers loop 100kb to interact with Igf2. The ICR is unmethylated on the maternal allele allowing CTCF to bind upstream of the H19 promoter. CTCF acts as an insulator, preventing enhancers from looping back to transcribe Igf2 and instead drive expression of the non-coding RNA H19.

15 Differential methylation is one mechanism governing genomic imprinting at autosomal loci. Another mechanism involves the transcription of a long non-coding RNA transcript that can regulate imprinted expression56. One well-studied example of this mechanism is the transcription of a non-coding RNA, Airn, which regulates expression of the imprinted genes Igf2r, Slc22a2, and Slc22a357. Interestingly, a non-coding RNA is produced at the H19 locus, however, evidence shows this non-coding RNA is not responsible for establishing the imprint at the Igf2 locus58. To date, no sex-linked genomic imprinting mechanism has been identified. Understanding the underlying imprinting mechanism of X-linked genes could help explain the findings of Skuse et al. and provide insight into future therapies for X-linked imprinting disorders.

DNA Methylation and DMRs

Defects in DNA methylation have been implicated in disorders including cancer59 and neurodevelopmental disorders such as schizophrenia60 and ASD61. Most commonly in mammals, DNA methylation involves the addition of a methyl group to the 5’ carbon of a cytosine residue by a DNA Methyltransferase (DNMT)62. As stated previously, DNA methylation is an epigenetic mark that can be altered to change the phenotype of an organism without changing the underlying DNA sequence. This mark is known to influence processes including X Chromosome Inactivation (XCI), carcinogenesis, and genomic imprinting by regulating DNA accessibility and regulating expression62. DNA methylation is common and in general occurs in the mammalian genome at approximately 1% of cytosines49. Typically, high levels of methylation are seen at what are called CpG islands. The generally accepted characteristics of a CpG island are regions where the observed to expected ratio of CG nucleotides is greater than 60% in

16 200 bp of sequence63. About 85% of the time, these CpG islands are located near the promoters and transcription start sites (TSSs) of genes. In the case of the Igf2/H19 locus, differentially methylated transcriptional regulatory domains regulate gene expression in a parent-of-origin specific manner64,65. DNA methylation often blocks the binding of transcription factors, thus rendering the region silent66. During mammalian pre-implantation development, there is a wave of genome-wide demethylation followed by lineage-specific methylation that maintains cellular identity and genomic stability67.

However, ICRs manage to remain methylated during this event in the developing embryo68. In this way, ICRs are preserved to regulate genomic imprinted loci and establish the baseline for parental allele differentiation. This process is maintained by

DNA methyltransferase, Dnmt169.

Mammalian Dosage Compensation

The sex chromosomes are of particular interest in neurodegenerative disorders including ASD, which presents a sex bias in diagnosis frequency. Specifically, gene expression on the X chromosome is implicated in social cognitive deficiencies2. In mammals, males are the heterogametic sex with an X and Y chromosome. The evolutionary origin of the mammalian sex chromosomes is thought to be an autosomal pair that acquired a sex-determining trait70,71. When organisms are heterogametic for sex chromosomes with an unequal amount of genomic material, a method must evolve to normalize the transcriptional output between the two sexes72. This method is known as dosage compensation and this process has evolved differently in several organisms.

For example, in Drosophila melanogaster, the male inherits only a single X chromosome. To equalize expression to the female (XX), the male upregulates gene

17 expression on the single X two-fold73,74. The nematode worm, C. elegans, utilizes a different dosage compensation mechanism: the hermaphroditic organisms (XX) reduce the expression of each X by half so as to equal the expression levels of the male with one X chromosome75,76. Important to this work, female mammals, accomplish dosage compensation by randomly silencing a single X chromosome in a process known as X

Chromosome Inactivation (XCI)77. This process is required to balance the uneven genomic material present on two X chromosomes compared to the males X and Y. The

Y chromosome in most cases is small and gene poor compared to the X chromosome78,79,80.

X chromosome silencing actually occurs in two separate phases. First, a preferential silencing of the paternal X chromosome occurs in the preimplantation embryo81. This silencing is reversed in the early blastocyst and both X chromosomes are once again active82. After this point, to accomplish dosage compensation by XCI, the silencing mechanism first begins at a region known as the X chromosome inactivation center83. The number of X chromosomes are counted within the cells by molecular machinery to begin XCI84. The long non-coding RNA transcript Xist then begins to be produced by the X chromosome randomly selected to be silenced. On the active X chromosome, the antagonistic long non-coding RNA Tsix is produced preserving the active status of the allele85,86. The silenced X chromosome is condensed into a heterochromatic state by the Xist transcripts coating the chromosome and recruiting epigenetic modification to render it inactive. The inactive X chromosome is referred to as the Barr Body which remains inactive and locates to the periphery of the cell87,88. Interestingly, the overall methylation between the silenced and active X

18 chromosomes remains the same, however, CpG islands on the silenced X are shown to be hypermethylated89. This hypermethylation of CpG islands should silence transcription at normally active genes. Evidence has shown that histone H2A.1 is essential for XCI and is preferentially loaded into nucleosomes on the inactive X chromosome to ensure a persistent silent state90,91. This is one line of evidence of nucleosomes, and histone proteins influencing gene expression on the X chromosome.

The random inactivation of either of the two parental X’s in female eutherian mammals during XCI means that normal females are mosaics, carrying groups of cells differentially expressing a parental X in all tissues92. However, during fetal development, the placenta preferentially silences the paternal X while the maternal X remains active93.

Interestingly, marsupials preferentially silence the paternal X in all tissues throughout the life of the organism94,95.

While autosomal imprinted genes are expressed equally between sexes, X inactivation results in an additional layer of complexity in terms of genomic imprinting.

Skuse et al. and Raefski et al. explained that X-linked imprinted genes are not believed to be expressed equally between the sexes2,5. Due to the unequal expression of genes on the sex chromosomes, a maternally silenced X-linked gene would not be expressed in males. Conversely, a maternally expressed/paternally silenced X-linked imprinted gene would be expressed in all male cells while only in half of the cells in females due to XCI mosaicism. This would lead to higher expression of maternally expressed and paternally silenced genes in males compared to females. Having this imbalance of gene expression from genomic imprinting could result in sex specific differences96,97,98,99.

19 Raefski Development of Turner Syndrome Model Mice

Raefski and O’Neill utilized a TS mouse model to generate X monosomic mice where the X chromosome parental lineage could be tracked and studied3. The Patchy

Fur (Paf) strain caries a large inversion on the X chromosome leading to a high frequency of sex chromosome non- disjunction during meiosis in spermatogenesis. Male Paf mice mated with wild type C57 female mice produce

39,XM genotype in 25% of progeny. The

InX strain also carries a large X inversion that leads to non-disjunction of the X’s during oogenesis. C3H/InX female mice bred with a wild type C57 males Figure 3: X Monosomic Mouse Breeding Scheme produces a 39,XP genotype Breeding generated 39,XM ~25% of offspring and 39,XP ~20% of offspring approximately 20% of the time (Figure

3) (the C3H/InX hybrid females are used to distinguish XC3H/XC57 or XInX/XC57 females from monosomic XC57 females)5.

Using neonatal brain RNA from these monosomic mice to perform a transcriptomic screen, a gene cluster at XA7.2 was identified to be imprinted, comprising three paralogs of the X-Linked Lymphocyte Regulated (Xlr) family of genes.

Subsequent experiments showed that this imprinted locus was not regulated by a DMR and further experiments were conducted to investigate the underlying imprinting mechanism of the Xlr locus. To date, no sex-linked imprinted genes have been shown

20 to be regulated by a DMR suggesting that a novel imprinting mechanism exists for the

Xlr imprinted cluster(86,89,91,92). The Xlr imprinting cluster and underlying imprinting mechanism will be discussed further in the following chapters.

After the initial discovery of the Xlr imprinted locus, Dr. Inga Nesbitt screened surrounding genes on the mouse X chromosome to identify other potentially X-linked imprinted loci. A candidate gene was identified downstream of the Xlr cluster,

Transketolase-Like 1 (Tktl1), that displayed a partial imprint in both olfactory bulb and neonatal neocortex. Follow-up experiments investigating the imprinting status and function of Tktl1 and its human ortholog TKTL1, its imprinting status, and overall function will be addressed further in the following chapters.

21 Chapter 2: The Imprinted X-Linked Lymphocyte Regulated Locus (Xlr)

Characterization of the First X-Linked Imprinted Locus (Xlr)

The dosage compensation mechanism of the sex chromosomes in mammals presents an additional layer of complexity to parent specific effects. For this reason, in many published GWAS and genetic linkage studies the X chromosome is frequently ignored100, leaving a gap in knowledge surrounding sex-linked imprinted genes. Until

2005, imprinted genes on the X chromosome had yet to be identified in mammals.

Publishing back-to-back in the same issue of Nature Genetics, Raefski and O’Neill, along with Davies et al. published the discovery in mice of the imprinted gene Xlr3b5,99.

While Davies only identified this one gene, Raefski was able to determine imprinting status of two additional closely linked genes, Xlr4b and Xlr45. The Xlr genomic organization at this region consists of three triads, Xlr3a/b/c, Xlr4a/b/c and Xlr5a/b/c.

Only three of the nine genes in this cluster have been identified as imprinted, and no other member of the Xlr superfamily displays evidence of genomic imprinting. As discussed in the previous chapter, Dr. Raefski utilized a TS mouse model to generate X monosomic mice to examine the expression levels on the individual parental X chromosome employing a gene expression microarray screening approach. Using these mice, he determined all three imprinted genes are maternally expressed and paternally silenced in mouse neonatal brain (Figure 4), and later showed this pattern in mouse primary fibroblast cells (data not shown).

22

Figure 4: Discovery of the Imprinted Xlr3/4 Region Real-Time PCR results of 40,XX, 39,XM, 39,XP, and 40XY mouse transcripts identifying genomic imprinting of the genes Xlr3b, Xlr4b, and Xlr4c. Maternal X expression and paternal X silencing is noted. Raefski and O’Neill 20053.

There are multiple clusters of Xlr genes scattered across the rodent X chromosome. All Xlr genes are derived from the single copy autosomal gene

Synaptonemal Complex Protein 3, Sycp3. Sycp3 plays a critical role in meiosis, comprising the axial elements of the synaptonemal complex101. While Xlr-like gene clusters have been identified on most of the sequenced X chromosomes of eutherian mammals, the only Xlr-like genes found in primates are the FAM9 genes located at the human Xp22.3 region. While the Xlr3/4/5 cluster is specific to rodents, the Xlr3/4 genes have homologs on the X chromosomes of elephant, dog, and swine, suggesting that

Xlr3/4 are the most ancient members of the Xlr superfamily. The Xlr3/4/5 triad is present in five closely linked copies with their orientation suggesting segmental duplications are responsible for the multicopy organization of the genes102. The Xlr3, Xlr4, and Xlr5

23 genes share approximately 30% sequence similarity in protein-coding regions. All paralogous Xlr3 copies code for nearly identical proteins and share a coding nucleotide sequence similarity over 90%. The extremely high sequence conservation of the Xlr3 paralogs is also true for the Xlr4 and Xlr5 paralog groups, respectively.

Figure 5: X-Linked Lymphocyte Regulated (Xlr3/4) Locus Illustration of the X‐linked Lymphocyte Regulated (Xlr) locus at XA7.2 (mm10). Red indicates imprinted genes.

Functional Study of the Xlr3b Protein

Most research about the Xlr3/4 region revolved around the characterization of the imprinting mechanism regulating gene expression at the locus. More recently, work conducted by R. Foley in the O’Neill lab aimed to elucidate the function of the protein products of the Xlr3 paralogs103. Utilizing amino acid alignments to visualize protein domains, Xlr3b was observed to contain a so-called COR1 domain. The COR1 domain has an unknown function, but is conserved in the Synaptonemal Complex Protein 3

(Sycp3) protein104,105. During meiosis, homologous chromosomes exchange genetic information during crossing over before segregation and this process is regulated by the

Synaptonemal Complex (SC). Due to the shared COR1 domain between Xlr3b and

24 Sycp3 and the fact that Xlr3 mRNA is highly expressed in testis, Dr. Foley investigated the potential role of Xlr3b in spermatogenesis and meiosis.

While quantitative real-time PCR (qRT-PCR) showed expression of the Xlr3b transcript as early as spermatogonial stages, Western blot analysis using an antibody that recognizes all XLR3 protein products showed expression of the XLR3 protein first detected during early prophase of meiosis I. Spermatocyte and embryonic ovary cell surface spreads were used to view the localization of the XLR3 protein by immunohistochemistry (IHC). In meiotic oocytes, XLR3 was observed localizing to

Nuclear Organizing Regions (NORs). During spermatogenesis, XLR3 was observed localizing to the XY body, a compartment in the meiotic prophase I nucleus containing the inactivated X and Y chromosomes. Compartmentalization and inactivation of the X and Y chromosomes occurs in male meiosis and is referred to as Meiotic Sex

Chromosome Inactivation (MSCI). While the role of XLR3 in MSCI is currently undefined, Sycp3 is known to play an essential role in the establishment of MSCI.

Regulation of the Imprinted Xlr Cluster

Investigating the imprinting mechanism of the Xlr3/4 cluster may provide insight into how parent specific expression on the sex chromosomes is regulated at other as yet unidentified loci. As stated previously, the majority of autosomal imprinted genes are regulated by Imprinting Control Regions (ICRs), typically demarcated by a differentially methylated region (DMR) governing the chromatin state of the parental alleles. Using this as a starting point, a former member, B. Carone, attempted to identify a DMR regulating the Xlr3/4 locus106. DNA methylation was analyzed via bisulfite sequencing of predicted CpG islands in a ~250 kilobase region including the three non-pseudogenous

25 Xlr3/4/5 clusters (i.e. all of the A, B, and C paralogs). The results, as shown in Figure 6, showed hypermethylation at both the non-imprinted and imprinted paralogs of both parental alleles.

Figure 6: Methylation Status at the Xlr Imprinted Cluster Bisulfite sequencing analysis results of CpG island clones associated with the Xlr paralogs. Unmethylated CpGs are unfilled (white) “lollipops” while methylated are filled in black. Figure from Carone, 2008. Following these results, the search expanded to regions further up and downstream of the Xlr3/4 cluster. ICRs are known to regulate distal parental expression of genes such as in the Igf2/H19 imprinted region. In fact, approximately 90kb of sequence separates the ICR of this locus and the Igf2 promoter107. To investigate distal regions, M. Murphy of the O’Neill lab conducted immunoprecipitation of methylated DNA from monosomic neonatal brain samples (MeDIP) followed by hybridization to

Affymetrix tiling arrays of the mouse X chromosome108. While MeDIP identified a

26 potential DMR upstream of Xlr3b this result could not be replicated by bisulfite sequencing (data not shown).

In the absence of a detectable DMR, S. Kasowitz of the O’Neill lab investigated the Xlr3b primary transcript to determine if PSGE was achieved by a co- transcriptional/RNA processing mechanism109. He noted that while Xlr3b equally initiates transcription equally from both parental alleles, by exon 7, the transcript on the

Figure 7: Xlr3b Primary Transcript Analysis qRT-PCR of Xlr3b primary transcripts assayed at varying locations in the gene body. Uniform 39,XM primary transcript compared to lack of transcript on 39,XP allele in neonatal brain at Intron 7. Figure from Kasowitz, 2011.

paternal allele is virtually undetectable via qRT-PCR (Figure 7). This would mean some mechanism interrupts proper transcription elongation on the paternal allele. Conversely, a mechanism may exist that removes an obstacle to transcription elongation on the maternal allele allowing for the full transcript to be produced.

To determine if PSGE of Xlr3b is governed at the level of transcription elongation or transcript processing, former O’Neill lab member S. Qureshi investigated the

27 chromatin architecture and RNA Polymerase II (RNA Pol II) occupancy on the maternal and paternal alleles of Xlr3b by performing Chromatin Immunoprecipitation on neocortex of monosomic neonatal mice. RNA Pol II was equally enriched on both parental alleles at the 5’ end of the Xlr3b transcription unit. However, a peak of enrichment of the Serine 2-phosphorylated RNA Pol II (actively elongating) was observed on the paternal allele at the Intron3/Exon4 boundary compared to the maternal allele. At intron 7, Ser2-RNA Pol II was more enriched on the maternal allele.

This data is in agreement with the primary transcript qRT-PCR, and suggests that transcription is initiated on both alleles, but that elongation stalls somewhere in the middle of transcription on the paternal allele. It is important to note that the paternal allele-specific RNA Pol II stall at Intron3/Exon4 revealed by these ChIP experiments has not been replicated in further experiments. Therefore, the molecular nature of the Xlr3b imprint is still unknown.

Dr. Foley set out to investigate differential methylation in surrounding regions of the imprinted cluster, particularly the methylation status of Factor 8 associated gene A

(F8a). In doing so, he opened the door to a potential DMR residing directly within the imprinted cluster itself. Dr. Foley was first interested in F8a when ENCODE released

ChIP data observing CTCF binding motifs within the 5’ end of F8a. The very same insulating protein present at the most well studied imprinted locus, Igf2/H19, was positioned within the Xlr imprinting cluster of genes. CTCF is known to create chromatin loop structures to orchestrate enhancer/promoter interactions and also functions in important regions as an insulator. Also of note, the ENCODE tracks identified a p300 known enhancer flanking the CTCF site in F8a. These CTCF binding sites reside within

28 the F8a gene body along with a CpG island that was not previously investigated by former lab members. In fact, a portion of the CpG island was indeed analyzed, but did not incorporate the CTCF binding sites.

Dr. Foley decided to investigate the methylation status of this CTCF binding site associated with the CpG island. His results can be found in Figure 8 and depict differential methylation with a methyl sensitive PCR assay. Dr. Foley found that surrounding the CTCF binding site were differentially methylated CpG’s where XM was hypermethylated and XP was hypomethylated. Using bisulfite sequencing he determined a subtle methylation difference of approximately 4% differentiated the maternal and paternal allele. To explore whether this DMR located in F8a regulated

Xlr3/4 expression, he performed a

CRISPR deletion of the CTCF binding site. However, no effect was observed on the overall imprint. Work conducted to further understand this functional mechanism regulating Xlr3/4 will be discussed in the following Figure 8: HpaII Methylation Assay at F8a HpaII sites in neonatal whole brain analyzed. HpaII chapters. cutting at sites would produce no band hypomethylation. No cut would produce positive band and indicate hypermethylation.

29 Chapter 3: Chromatin Architecture at the Imprinted Cluster Xlr3/4 Region

Bionano Methylation of XA7.2

Thus far a novel X chromosome imprinting mechanism has yet to be discovered.

As discussed in the preceding chapter, previous members of the O’Neill lab assayed the

CpG island methylation status of regions surrounding the Xlr3/4 imprinted genes via bisulfite sequencing and MeDIP/microarray and found no differential methylation. Dr.

Murphy’s MeDIP initially identified a differentially methylated region upstream of Xlr3b but subsequent bisulfite sequencing of this site revealed no allele-specific methylation. It has been shown in previous studies such as the investigation of the H19/Igf2 locus that it is not required for a DMR to be proximal to the imprinted gene. The genes Igf2 and

H19 are approximately 70 kb apart and are regulated by the same DMR110.

Concerns in the O’Neill lab that previous methylation screens were insufficient to identify distal DMRs motivated a series of follow up experiments utilizing a new approach. To accomplish a wide range methylation profile of the XA7.2 region and also the entire X chromosome, an assay was developed for the Bionano Genomics Irys platform. The Bionano platform utilizes nanochannel technology in coordination with restriction enzyme cut site fluorophore incorporation to be read by a laser within the instrument. The general purpose of the Irys is to create large molecule (≥150 kb) fragments to be aligned to a restriction enzyme digested genome to create a scaffolding backbone for assembly and sequencing analysis111. Expanding on this concept, the laboratory of Dr. Yunal Ebenstein in Tel Aviv developed a methylation detection protocol to be used on the Bionano Irys system. In collaboration with the Ebenstein lab and the

30 laboratory of Dr. Elmar Weinhold in Germany, I modified the Bionano Genomics protocol to enable visualization of methylated sites with the Irys system111,112.

Figure 9: Bionano Methylation Assay Alignment Bionano methylation sensitive assay aligned to a CMAP file for BspQI (Bionano supplied enzyme). Bionano molecule alignment analysis with IrysView software from Bionano Genomics of genomic region ChrX:73,000,000-73,750,000 containing Xlr imprinted cluster (A.) and genomic region ChrX:100,000,000-103,600,000 (B.) representing a higher quality molecule alignment with the software. Xlr3b is highlighted in the blue boxes on the maternal and paternal alleles. Due to shearing events caused by the additional methylation steps of the protocol, molecule alignment results were lower than expected. The software needed to allow for smaller fragments causing lower mapping rates. XM mapping 28.1% of molecules and XP mapping 14.3% of molecules.

39,XM and 39,XP samples were prepared with the Bionano Prep Blood and Cell

Culture DNA Isolation Bundled Kit following the Bionano Genomics protocol for high molecular weight DNA isolation. Samples were labeled with the cofactor AdoYnCF640R

(gift from Dr. Elmar Weinhold lab)113 with M.TaqI followed by knick labeling with

31 Nt.BspQI enzyme as a part of the Bionano NLRS protocol. M.TaqI is a methyltransferase enzyme with a recognition site, TCGA. M.TaqI methylates the adenine residue within this sequence, and can be used to incorporate a labeled cofactor. However, if the CpG within the recognition sequence is methylated, the reaction is blocked114. Samples were run on the Bionano Irys instrument and images were captured and interpreted by the Bionano Irysview software package. A molecule quality report was produced for high molecular weight molecules and alignment to the digested mm10 reference genome (CMAP) was performed, followed by mapping assembly. Molecules were mapped by matching of restriction digest labeled regions incorporated via the Nt.BspQI restriction enzyme provided by Bionano Genomics. In this way, large DNA molecules are knick labeled at known restriction enzyme digest regions of the genome and fragments are aligned based on distance calculated between incorporated labels viewed in the nanochannel. Molecules depicted in Figure 9 show these alignment to the reference genome. In (A) the Xlr3/4 region is visualized with minimal coverage by large DNA molecules and in (B) a different region of the X chromosome is shown with higher coverage. The tool utilized by the Ebenstein lab to analyze the incorporated methylation fluorophore, Irys Extract, requires deeper coverage, and thus visualization of the methylation status on the X chromosome was unable to be accomplished with this platform.

Images generated by the Irys Extract software would have allowed visualization of the fluorescently labeled DNA molecules for a given chromosomal region. Had this assay produced non-sheared DNA molecules, Irys Extract could have identified which molecules mapped to the reference genome at the Xlr3/4 locus and provided insight into

32 the methylation status between the maternal and paternal alleles. The molecules were stained green for the Nt.BspQI restriction sites and red for methylated cytosines by

M.TaqI. Without high quality large DNA molecules, a sufficient Molecule Quality Report could not be generated for the data to allow for further analysis.

DNA was sheared due to the additional step of labeling the molecules with the methylation sensitive cofactor AdoYnCF640R. Shearing was determined by the average length of DNA fragments being <150 bp during analysis. This resulted in fewer molecules mapped to the reference and minimal coverage across the genome as seen in Figure 9. The methylation status across the X chromosome environment remains elusive and awaits further analysis. DNA handling steps for future methylation sensitive assays should be minimized to reduce potential shearing. Simply adding one additional mixing step was enough to shear DNA molecules. If more pervasive information about the overall methylation environment of the X chromosome is desired, I would not suggest using the Bionano Genomics platform for future studies. The Nanopore MinION could be used to discern differences in DNA methylation across the X chromosome.

This technique utilizes bioinformatics tools to identify differing ionic current signals on methylated versus unmethylated cytosines and adenosines114.

Nucleosome Positioning at XA7.2

Chromosome composition can affect transcription by influencing the ability of transcription initiation and elongation by RNA Pol II115. Nucleosomes present a physical barrier that can cause backtracking/arrest of RNA Pol II during active transcription and are known to regulate the access of cellular transcription machinery to DNA loci116.

Typically, the transcription machinery can overcome this arrest by the incorporation of

33 specific functional histone variants. Nucleosome rich regions are difficult for RNA Pol II to transcribe through and are often associated with RNA Pol II strand dissociation, meaning a dense region of nucleosomes near the Xlr3b intron3/exon4 boundary could explain the stalling event previously seen on the paternal allele117,118.

ChIP analysis by S. Qureshi in the O’Neill lab appeared to reveal an RNA Pol II stalling event during transcription of the Xlr3b gene on the paternal allele. Dr. Qureshi also found enrichment of H3K36me3 at the precise location of the RNA Pol II stalling event on the paternal Xlr3b via ChIP. Dr. Qureshi’s ChIP results formed the rationale for experiments to map nucleosome positioning at the Xlr3/4 locus. MNase-Seq was used to investigate nucleosome positioning in the Xlr3/4 region. However, recently Dr.

Qureshi’s data has not been able to be replicated by current members of the lab. While the stalling event is now in question, the following nucleosome positioning assays were conducted with the RNA Pol II stalling hypothesis in place.

Analysis of Nucleosome Positioning by MNase-Seq

Nucleosome positioning and occupancy has been implicated in influencing gene expression by impeding transcription in various ways. It has been shown that nucleosomes can inhibit activator binding and the formation of the preinitiation complex

(PIC) as well as preventing proper elongation in transcription119. Micrococcal nuclease is an enzyme used to digest double-stranded, single-stranded, circular and linear nucleic acids. In particular micrococcal nuclease is unique due to its ability to cause double stranded breaks and digestion of nucleosome linker regions while leaving the nucleosome region intact120. Utilizing the function of this enzyme, nucleosome positions are able to be determined. Genome-wide nucleosome occupancy maps (NOMs) have

34 been developed121; the X chromosome, however, often presents unique challenges to allele specific experiments and the data tends to be unreliable due to the additional complexity of X inactivation and heterogametic sex chromosomes. Using the 39,XM and

39,XP Mus. musculus TS model mice, I was able to sequence the single X chromosome and compare the nucleosome profile between the maternal and paternal X chromosome.

Neocortex brain tissue was extracted from neonate 39,XM and 39,XP Mus musculus neonates and chromatin was isolated in biological triplicates. Micrococcal nuclease was used to digest double stranded DNA not bound to histone proteins and

DNA fragments were target enriched using cRNA baits specific to the XA7.2 region of the mouse X created from Agilent SureSelect probes122. Library construction was done for the Illumina MiSeq platform and produced paired end 150 bp reads after adapter sequence trimming. Fasta and Fastq files were produced and a FastQC quality report was generated to assess read quality (Appendix Figure 1). Paired end reads were aligned and indexed to the mm10 Mus musculus genome using Bowtie2.

Many short read mappers could be used such as BWA and Bowtie2123. Bowtie2 was used for this experiment because of the use of genome indexing, a computational strategy utilizing a Burrows-Wheeler transformation to speed up the alignment algorithm. This indexing technique can compress a reference genome to save memory while still enabling efficient read alignment. Bowtie aligns each read one at a time to the Burrows-Wheeler transformed genome until alignment of the entire read is accomplished. This mapping algorithm, although much more complicated than others,

35 has been shown to be up to 30 times faster enabling high quality mapping in less time124.

Unique mapping is achieved in Bowtie2 by passing nucleotide mismatches and quality parameters to allow a single read to map to one location in the genome. This unique mapping is part of the default parameters of Bowtie2. Due to the high sequence similarity of the paralogous Xlr family of genes, unique mapping would result in the exclusion of nearly all reads in the region that do not contain a paralog specific single nucleotide polymorphism (SNP). Due to this, a different non-unique mapping approach was required for analysis of the MNase-Seq data. Since paralog specificity could not be achieved by this approach, any differences observed between the parental alleles are assumed to be associated with the imprinted paralog. Biallelically expressed paralogs would likely not display differences between parental alleles whereas monoallelic expression would be accomplished by differences between the parental alleles. All reads were called, mapped and then visualized in IGV125,126 for a profile of nucleosome occupancy across the Xlr3/4 region, creating a nucleosome map across the paralogs.

With the high sequence similarity, normalizing the reads was essential to examine the differential enrichment of nucleosomes between 39,Xm and 39,Xp neocortex brain samples. This was achieved utilizing the R package normR127. Bam alignment files and indexed files were imported into R and treated as control (39,Xm) and treatment (39,Xp).

I hypothesized that 39,Xp would have differential nucleosome positioning interfering with successful RNA Pol II transcription through the gene body.

Mapping was followed by data analysis with R packages including nucleR, PING and normR127,128,129. NucleR compiles all reads and filters with a threshold peak calling

36 of 25% and creates a bed file output that can be visualized on the UCSC Genome

Browser. Additionally, PING was used to display a nucleosome occupancy map (NOM) providing a graphical depiction of the nucleosome positioning data in a designated genomic frame. This analysis processes the reads of a specified size and uses a peak detection script creating a peak score determined by a 25% threshold. Figure 10 depicts the NOM of Xlr3b across the 39,XM and 39,XP gene body generated by PING. Finally, normR was used to normalize reads between the sample sets. Numerically identified neonate samples XM73, XM86, XM91, XP100, XP148 and XP175 were concatenated into large XM and XP BAM files for this portion of the analysis. NormR utilizes a script to normalize reads between two data sets against background. DiffR is a script in the normR package that can be used for enrichment analysis in ChIP-Seq and alike datasets and here was used to determine enrichment of reads between XM and XP samples. The read counts are modeled as a binomial mixture model, meaning the datasets were analyzed for the presence of differing sub-regions and applied to a binomial distribution. Significance was set in the script to P ≤ 0.001 and all remaining reads were depicted in an output BED file and visualized in IGV (Figure 11).

37

Figure 10: MNase-Seq Nucleosome Occupancy Map (NOM) Nucleosome occupancy map for Xlr3b generated from MNase sequencing reads from 39,XM and 39,XP neocortex samples. Reads generated from MNase-Seq treated samples and run on Illumina Mi-Seq Overall profile of nucleosome occupancy peaks similar between XM and XP. R package PING used to generate occupancy maps. Output threshold 25% peak calling per package guide. All graphs from ChrX:73192179−73202930 to depict Xlr3b in mm10.

38

Figure 11: MNase-Seq Data Analysis Nucleosome occupancy enrichment at the imprinted Xlr locus, F8a and Spin2d (both biallelically expressed). Blue bed file bars represent enrichment of nucleosomes on the paternal allele while red represent enrichment of nucleosomes on the maternal allele Enrichment analysis done in R with normR package and diffR script. All enrichment had a p-value of p=0.001.

39 An enrichment of nucleosomes on the 39, XP allele is observed juxtaposed to the stalling event initially seen by Dr. Qureshi. Additionally, this enrichment includes a portion of SINE-B4 element associated with imprinted genes in mouse. SINE-B4 elements have been shown to be enriched in maternally active and paternally silenced imprinted gene loci130. The presence of this known associated repetitive element coupled with enrichment of two nucleosomes positions of the paternal allele within this

SINE-B4 element could be associated with an imprinting control mechanism on the X chromosome.

Where there has been no discernable differential methylation in the region of the

Xlr3/4 imprinted cluster, some other mechanism must be controlling the expression status of these alleles. A potential RNA Pol II obstacle in the form of enriched nucleosomes on the paternal allele could govern the expression of the Xlr imprinted genes. Very few differences between the alleles have been identified previously, and this statistically significant difference highlighted in Figure 11 in the green boxes could very well be associated with an imprinting control mechanism on the X chromosome.

Nucleosomes prevent the progression of RNA Pol II productive elongation on the paternal allele by creating a physical barrier to the transcription machinery. RNA Pol II machinery has been shown to stall and dissociate from genomic regions in previously reported experiments131.

This MNase-Seq experiment was a critical step in identifying differential nucleosome positioning at the Xlr3/4 locus. However, due to the use of Agilent

SureSelect probes prior to this analysis, only a small target region of the X chromosome was evaluated with this analysis. A whole X chromosome survey was suggested using

40 the 39,XM and 39,XP model mice. While MNase-Seq generates quality data, only nucleosome positioning information can be analyzed from the resulting data. A newer technique, Assay for Transposase-Accessible Chromatin using sequencing (ATAC-Seq) allows for nucleosome positioning data, and also open chromatin and DNA footprint analysis132. For this reason, ATAC-Seq was chosen for the survey of the X chromosome and will be discussed in the following section.

Open Chromatin and Nucleosome Occupancy

The method of ATAC-Seq utilizes regions of open chromatin and the binding affinity of transposases to sequence all openly accessible regions of chromatin131.

Utilizing this sequencing technique, a profile of open chromatin can be created to examine all regions that are accessible by external machinery for processes including

DNA replication or transcription. This process can also be used to create a nucleosome positioning profile as DNA that is bound to a histone octamer is not openly accessible and these regions can be assessed with bioinformatic tools by looking for size specific regions where dyads and triads exist due to this inaccessibility. Thus, this data can be used to both map open genomic sequence along with nucleosome positioning. Each library consists of 50,000 nuclei from frozen M. musculus neocortex tissue and 500 D. melanogaster nuclei as a spike-in control. These nuclei are extracted following the

Omni-ATAC-Seq protocol from the Chang lab133 that improved upon the previous

ATAC-Seq protocol from the Buenrostro lab134 eliminating the large portion of mitochondrial DNA captured and sequenced in the original protocol. Nuclei were isolated using an iodixanol gradient solution to eliminate all other cellular components.

41 As depicted in Figure 12, Illumina Nextera transposase is pre-loaded with adapter recognition sequences that attach to DNA when incubated at 37°C and allow for open chromatin to be amplified with barcoding and indexing primers for Illumina sequencing136. Amplification cycles were determined by a Real-Time PCR followed by library quantification using the KAPA Illumina Library Kit. Finally, a Qbit quantification was conducted to analyze the concentration of the libraries being pooled for loading onto the Illumina NextSeq.

Figure 12: ATAC-Seq Protocol Transposase (Tn5 transposome - Nextera) ligate Illumina adapters to regions of open chromatin for downstream amplification. JD Buenrostro et al, Nature Methods, 2013.

For nucleosome positioning assays with ATAC-Seq, it is suggested to sequence approximately fifty million reads per sample137. And for DNA footprint analysis it is suggested to sequence approximately two hundred million reads per sample137. The

42 Illumina NextSeq High Output sequencing kit produces approximately eight hundred million reads per run. ATAC-Seq was performed on two 39,XM mice and two 39,XP mice giving approximately two hundred million reads for each sample individually allowing for

DNA footprint analysis to be completed. The Illumina NextSeq 2x75 paired end 75 bp sequencing kit was used for samples XM291, XM294, XP1044 and XP1053 and sequences were aligned using Bowtie2 according to the ATAC analysis pipeline described by the Kundaje lab at Stanford University135.

Table 1

39,XM 39,XP Mapping quality > q30 (out of total) 84.70% 77.80% Mitochondrial reads (out of total) 1.10% 0.80% Final reads (after all filters) 78.60% 73.00%

Paired-end sequencing reads were mapped using Bowtie2 to the mm10 genome generating a BAM output. Mapping quality can be seen in Table 1. An overall profile of open chromatin domains was viewed with IGV in Figure 13. 39,XM reads were normalized to two hundred million to match the amount of reads from 39,XP sequences before analysis to avoid mapping bias. Regions of differential open chromatin were identified in Xlr3b between 39,XM and 39,XP upstream of the Int3/Ex4 boundary as seen in Figure 13. Some differences are noted in the non-imprinted paralogs of Xlr3 and Xlr4 genes (Figure 14), however, the most substantial differences observed were in the imprinted paralogs at the Xlr3/4 locus. Also analyzed were X-linked biallelically expressed genes Ddx3x, Sox3, and the imprinted autosomal gene H19, see Figure 15.

Overall open chromatin profiles appear to be consistent, including H19 due to both

43 parental alleles being present. While some differences exist in the non-imprinted and biallelically expressed genes, the data does show a greater difference between the imprinted genes, comparatively.

Figure 13: ATAC-Seq Profile of the Xlr3 Paralogs Open chromatin tracks viewed with IGV. Open chromatin mapping of Xlr3a, Xlr3b, Xlr3c. Open chromatin domain differences highlighted by green boxes. 39,XM (Red) and 39,XP (Blue) BAM files displaying open chromatin mapping by ATAC-Seq analysis pipeline. Quantifiable differences of open chromatin observed in the imprinted gene body to be investigated further with DNA footprint analysis. Substantial difference noted at the 5’ end of Xlr3b compared to the non-imprinted paralogs. Differential sequencing reads between alleles for further analysis.

44

Figure 14: ATAC-Seq Profile of the Xlr4 Paralogs Open chromatin tracks viewed with IGV. Open chromatin mapping of Xlr4a, Xlr4b, Xlr4c. Open chromatin domain differences highlighted by green boxes. 39,XM (Red) and 39,XP (Blue) BAM files displaying open chromatin mapping by ATAC-Seq analysis pipeline. Quantifiable differences of open chromatin observed in the imprinted gene body to be investigated further with DNA footprint analysis. Differential sequencing reads between alleles for further analysis.

45

Figure 15: ATAC-Seq Profile of Ddx3x, Sox3, and H19 Open chromatin tracks viewed with IGV. Open chromatin mapping of Ddx3x, Sox3, H19. Open chromatin domain differences highlighted by green boxes. 39,XM (Red) and 39,XP (Blue) BAM files displaying open chromatin mapping by ATAC-Seq analysis pipeline. Ddx3x and Sox3 are biallelically expressed genes at different loci on the X chromosome. Overall profiles of these genes are similar with few if any quantifiable differences. H19 is located on mouse chromosome 7 and is part of the Igf2/H19 imprinting cluster. Included in analysis to view open chromatin domains on autosomal locus. Both maternal and paternal alleles are represented in the data for H19.

ATAC-Seq quality metrics were determined using the ATAC-Seq/DNase-Seq data analysis pipeline developed by the Kundaje laboratory and open sourced from

GitHub136. Illumina Fastq files were normalized to two hundred million reads for DNA

46 footprint analysis. Quality metric outputs displayed in Figures S4&S5. Figures S4&S5 A) depicts an enrichment plot of ATAC-Seq reads relative to a normalized representative sample of Transcription Start Sites (TSS’s) while (Figure S4&S5 B) represents an aggregate of all enrichment of ATAC-Seq reads across all TSS’s in the genome. Overall read fragment length was observed predominantly below ~150 bp correlating to nucleosome free regions of open DNA (Figure S4&S5 C). This is important for ATAC-

Seq data analysis to determine accurate capture of open chromatin regions versus nucleosome bound DNA regions. Sequences obtained from this protocol for both 39,XM and 39,XP samples passed quality metrics for ENCODE library complexity statistics and were determined to be high quality reads for nucleosome positioning and DNA footprint analysis, see Table 2.

Table 2

39,XM # Reads % of Total Fraction of reads in universal DHS regions 79,379,373 50.6 Fraction of reads in blacklist regions 116,773 0.1 Fraction of reads in promoter regions 20,518,428 13.1 Fraction of reads in enhancer regions 58,903,814 37.6 Fraction of reads in called peak regions 37,307,828 23.8

39,XP Fraction of reads in universal DHS regions 60,755,132 46.5 Fraction of reads in blacklist regions 103,276 0.1 Fraction of reads in promoter regions 13,573,697 10.4 Fraction of reads in enhancer regions 47,153,034 36.1 Fraction of reads in called peak regions 23,788,702 18.2

Nucleosome occupancy is also able to be calculated using ATAC-Seq data using open chromatin states as a baseline to detect nucleosome protected regions of DNA

(Omni-ATAC)133. A python package, NucleoATAC, developed by the Greenleaf lab at

Stanford University for the purpose of identifying nucleosome occupancy and

47 positioning from ATAC-Seq data sets was used on Illumina Next-Seq sequencing data files137. Sequencing files were aligned as previously described with Bowtie2 to the mm10 genome. The pipeline configures ATAC-Seq bam alignment files of open chromatin and analyzes nucleosome protected regions of DNA elucidating nucleosome positioning with a resulting bedgraph output file138. Output bedgraph files were sorted with bedtools and then converted to a binary tiled data (.tdf) file for improved IGV performance and processing due to the large size of the files. Nucleosome positioning results can be seen in Figure 16 and 17.

Figure 16: Nucleosome Occupancy ATAC-Seq ATAC-Seq Data analysis with NucleoATAC reveals nucleosome positioning differences between 39,XM and 39,XP mouse neonate brain samples at Xlr3b region (A). NucleoATAC pipeline used to analyze normalized sequencing data followed by visualization in IGV. Overall profile of biallelically expressed F8a (B). Green boxes indicate regions of differential nucleosome predicted on the paternal allele of Xlr3b by NucleoATAC pipeline. Slight difference at the 3’ end of the gene F8a noted in yellow.

48

Figure 17: ATAC Nucleosome Profile of Biallelically Expressed Genes A closer look at two biallelically expressed genes at the XA7.2 region. As seen in the previous image, F8a (A.) displays some level of nucleosome differences in the gene body. However, at the promoter and 5’ end of the gene, the profiles are almost identical. Spin2d (B.) displays only a slight difference of nucleosome positioning at the 5’ end. While there is a slight difference it can be seen that some level of nucleosomes do exist around the promoter region of Spin2d.

Figures 16 and 17 depict the output data from the NucleoATAC pipeline and differential nucleosome occupancy was observed between the maternal and paternal allele of Xlr3b in normalized data. NucleoATAC utilizes the script pyatac to calculate sequence bias and eliminate reads that are not statistically significant nucleosome

139 regions . 39,XP was observed to contain nucleosome peaks where there were none on the maternal allele. Particularly, these peaks highlighted in green are positioned

49 proximal to the int3/ex4 boundary. Previously Dr. Sohaib Qureshi had identified a potential stalling event of RNA Pol II at this region of Xlr3b. Nucleosomes arranged in specific locations in a transcript could inhibit proper active elongation of RNA Pol II thus disrupting completion of transcription. This model could explain the previous findings by former lab members where transcription initiation is seen to begin, but by intron 7, the transcript of the paternal allele is no longer present. Similar differences were seen in

Xlr4b/4c however, it was not as drastic as Xlr3b. Also of note, Xlr3a and Xlr3c do not share this same nucleosome occupancy difference between the maternal and paternal allele. This appears to be an imprinted cluster paternal allele specific effect at the Xlr3/4 locus. The nucleosome positioning maps generated by both MNase-Seq and ATAC-Seq displayed regions around the intron3/exon4 boundary and exon 7 enriched on the paternal allele. This can be seen in Figure 11 highlighted in green boxes from the diffR analysis and also in Figure 16 (A). Another similar region in Xlr3b across the studies is intron 1, where there is nucleosomal enrichment identified on the maternal allele.

Overall, ATAC-seq not only gives a more robust view of nucleosome positioning, open chromatin domains, and DNA footprint analysis, it also produced a sufficient number of reads to allow for unique mapping. This allows for paralog specificity for the region and a more in depth view of Xlr3/4 imprinted cluster.

Generally, biallelically expressed genes in the region display less dynamic nucleosome occupancy across the gene body although it is noted that there are still some differences (Figure 17). In the case of F8a it has previously been postulated by members of the lab that differences within the gene such as CTCF binding and a DMR very well could aide in the imprinting mechanism of the Xlr cluster. The differences seen

50 in nucleosome occupancy of F8a could very well be involved in this mechanism allowing or restricting access of transcription factor binding proteins or potentially, insulator proteins such as CTCF. Spin2d, the biallelically expressed gene upstream on the X chromosome has an almost identical nucleosome positioning profile between the maternal and paternal allele. Only a slight difference is observed at the very beginning of the gene body. Below in Figure 18, a survey of other regions of the X chromosome are depicted starting with the imprinted region of Xist and Tsix and surrounding genes.

Several nucleosome occupancy differences are noted between the maternal and paternal allele. Secondly in Figure 18, Tktl1 displays major differences in nucleosome occupancy between the paternal alleles. There has been a long standing debate in lab whether Tktl1 is in fact imprinted, and this data could potentially be a novel discovery in its imprinting status. The pattern of large nucleosome deserts in one parental allele has remained consistent across several X chromosome imprinted genes (Figure 18). Finally, a region of biallelically expressed genes is examined. Subtle differences are noted, but the presence of large nucleosome deserts are noticeably absent. Five imprinted genes,

Xlr3b, Xlr4b, Xlr4c, Xist, and Tsix all have displayed these nucleosome differences between parental alleles.

51

Figure 18: Survey of Nucleosome Occupancy in Regions of the X Chromosome Nucleosome occupancy maps generated by ATAC-Seq reads and the NucleoATAC analysis pipeline for (A.) Xist/Tsix, (B.) Tktl1, Flna, Tex28, (C.) Stk26, Frmd7, Rap2c, MbnI3 and Hs6st2. Nucleosome occupancy differences noted in blue boxes reflect differences between the maternal and paternal alleles. Nucleosome deserts noted in imprinted genes while absent on biallelically expressed genes. All nucleosome occupancy maps generated with the pipeline with a 95% confidence and normalized by sequence count.

ATAC-Seq DNA Footprint Analysis

Transcription factor binding motifs are interspersed across the entirety of the mammalian genome allowing for proper binding of regulatory proteins to interact with the chromatin and drive or inhibit different stages of transcription140. The ENCODE

52 project compiled a database of sequence specific transcription factors (TFSS’s) via

ChIP-Seq and identified strong DNA binding motifs over a large number of cell types and model organisms141,142,143. ATAC-Seq identifies open chromatin regions of the genome not bound by nucleosomes which are freely available to bind to regulatory proteins. In this way, ATAC-Seq reads enable an investigation of regions for DNA binding protein motif sequences across the entire genome. This can also be achieved by DNase-Seq, however, recently researchers have begun utilizing ATAC-Seq data to conduct DNA footprint analysis134.

The MEME suite is designed to identify DNA binding protein motifs from sequencing data across thirteen different tools. This bioinformatics tool searches genome sequence files for known protein binding consensus sequences and scores them according to an e-value144. The e-value is used to eliminate background noise from the data by determining the number of hits that can be expected by chance when searching a database of a given size. Essentially, the value decreases exponentially as the score of a match increases. The lower the e-value, the less likely the scored motif would occur by chance in the genome.

53 Table 3.

Xlr3b 39,XM Motif e-Value 39,XP Motif e-Value HXB8 1.20E-81 STAT1 1.20E-119 FOXI1 2.10E-10 IRF3 1.70E-102 ZN322 1.60E-09 NR2E3 9.90E-83 MTF1 2.40E-09 HNF4A 2.00E-04 DDIT3 4.00E-08 STF1 7.90E-04 NRF1 1.50E-07 FOXI1 2.60E-03 SMAD4 1.60E-06 VDR 2.60E-03 P63 2.40E-06 SP7 9.70E-03 THA11 1.60E-04 RARA 1.90E-02 DBP 2.00E-04 BACH1 2.60E-02 STA5A 2.40E-04 TGIF1 2.80E-02 CUX2 1.90E-03 ZKSC1 3.30E-02 SMCA5 2.00E-03 CRX 3.60E-03 ZFP42 9.10E-03 HXA1 2.10E-02 NKX28 2.80E-02 F8a

39,XM Motif e-Value 39,XP Motif e-Value ZN281 6.2E-456 ZN281 5.2E-644 TGIF1 2.1E-421 INSM1 9.5E-472 SP3 1.1E-323 MYF6 1.0E-377 BHA15 1.0E-214 KLF4 1.2E-208 SP2 4.8E-198 SP3 2.5E-198 RORA 5.7E-180 SP2 2.5E198 ZFX 2.5E-139 ZN335 7.3E-110 HIC1 8.4E-128 NR1D1 1.1E-104 KLF6 1.8E-103 LEF1 1.4E-77 LEF1 8.4E-103 ZIC3 6.4E-50 INSM1 2.9E-90 EGR1 2.5E-43 MYF6 8.7E-81 MYOD1 3.0E-36 ZNF335 4.0E-69 ZIC3 2.1E-67

Table 3: Meme-ChIP Output for Xlr3b and F8a Motif binding site analysis output from Meme-ChIP suite. Open chromatin reads analyzed for Transcription Factor (TF) binding motifs. Highlighted motifs indicate binding sites similar between the maternal allele (left) and paternal allele (right). Xlr3b observed with only one similar motif between parental alleles. Biallelically expressed gene F8a observed with eight similar motifs.

54 Table 3 shows the results of the MEME suite analysis for reads mapped to Xlr3b and the bialllelically expressed gene F8a from the ATAC-Seq analysis. Although different parental alleles should be sequence similar, a drastic difference of DNA protein binding motifs have been identified. Signal transducer and activator of transcription 1

(STAT1)145 and Interferon regulatory factor 3 (IRF3)146 are implicated in the immune response but have not yet been associated with imprinted genes in the literature.

Nuclear receptor subfamily 2, group E, member 3 (Nr2e3) is a known TF binding protein that interacts with several other proteins to regulate transcription147,148,149,150. Two known interactions are with Nuclear receptor co-repressor 1/2 (Ncor1151 & Ncor2152,153) which have been shown to suppress transcription in mouse. Particularly, these TF proteins are known to repress transcription by chromatin condensation151,152,153.

Figure 19: Protein Interaction Map of Nr2e3 Graphical protein interaction map of TF binding protein Nr2e3 found on the paternal allele of Xlr3b in mouse. Map shows known interactions of Nr2e3 with other TF proteins that influence transcription. Courtesy of String Consortium.

55 IRF3 has been shown to bind to the TF E1A binding protein p300 (Ep300) which functions as a histone acetyltransferase and regulates transcription via chromatin remodeling154. Contrary to the transcriptionally silent state of the paternal allele of Xlr3b,

Ep300 acetylates all four core histones in nucleosomes that gives an epigenetic tag for transcriptional activation. It is thought that acetylation of histone H3 at Lysine-122

(H2K122ac) stimulates transcription by possibly promoting nucleosome instability155.

This instability of nucleosomes could account for the dynamic differences between the maternal and paternal alleles of the imprinted locus. The presence of an Nr2e3 binding motif on the paternally silenced allele could begin to explain how this allele is preferentially silenced compared to the expressed maternal allele. However, the presence of a binding consensus sequence does not necessarily correlate to physical binding of the protein in vivo.

Conclusions

Although Dr. Foley identified a potential region of differential methylation located within the F8a gene between Xlr4b and Xlr4c the role of this DMR has yet to be classified. The previous findings of no differential DNA methylation at the Xlr imprinted cluster in conjunction with Dr. Kasowitz’s primary transcript data and Dr. Qureshi’s RNA

Pol II stalling event strongly suggested a co-transcriptional process regulating the genomic imprint at the Xlr3/4 region. However, with Dr. Qureshi’s findings now in question, the data provided in this work suggests an independent finding. Differential nucleosome positioning could perturb active transcription throughout the paternal allele, or perhaps provide a locus for epigenetic marks allowing active transcription on the maternal allele yet to be identified. While nucleosome positioning has been shown in

56 organisms to affect active transcription, it has not yet been shown to be a part of an imprinting control mechanism in mammals. RNA Pol II stalling events are not unusual and recently it has been shown that pausing proximal to the genes promotor occurs early during active transcription but typically at the 5’ region of the gene156. This pausing is thought to be involved with the process of the transcription machinery adding a 5’ cap important for translation and RNA stability157. This specific pause however, would not account for the observation by Dr. Qureshi. Nucleosome positioning is not observed to be responsible for this stalling mechanism as at the 5’ region of the gene. This would suggest that nucleosome positioning is not directly involved in overcoming the stall during transcription initiation. In Figure 20, a nucleosome peak can be seen at the inton1/exon1 boundary where no peak is detected on the paternal allele. This differential nucleosome positioning could be a region of interest to the imprinting mechanism as will be discussed later in this section. This maternal specific nucleosome peak could allow for specific TF binding proteins to allow active transcription to occur solely on the maternal allele. How this nucleosome has been positioned differently to the paternal allele is yet unknown.

57

Figure 20: Nucleosome Occupancy ATAC-Seq Xlr3b 5’ Analysis of Xlr3b 5’ region nucleosome occupancy by ATAC-Seq. Sparse placement of nucleosomes at the 5’ end of the maternal and paternal allele indicative of accessibility for transcription initiation of both alleles. Peak noted at the intron1/exon1 boundary.

The X chromosome alleles are known to share 100% sequence identity and thus silencing due to sequence specific stalling or perturbation, which has been observed in

A+T-rich regions, can be ruled out as an influence on the imprinting control mechanism158,159. Overall nucleosome occupancy at the imprinting cluster displays differential positioning, seen in Figure 21. Although Dr. Qureshi’s analysis was the starting point of these experiments, this data stands alone in observing differences between the maternal and paternal X chromosome.

58

Figure 21: Differential Nucleosome Positioning at the Xlr3/4/5 Locus NucleoATAC nucleosome peaks visualized by IGV at the XA7.2 region across part of the Xlr paralog superfamily containing the imprinted paralogs (A). Boxes indicate regions of differentially positioned nucleosomes. Non-imprinted genes do not display distinct differences compared to imprinted genes. (B) A zoomed in image of NucleoATAC peaks at the imprinted Xlr genes. Distinct differential positioning of nucleosomes present at the Int3/Ex4 boundary.

While this differential nucleosome positioning does not by itself establish an X chromosome independent imprinting control mechanism, it does provide novel insight to the potential role nucleosome occupancy may play in regulating transcription at imprinted loci such as Xlr3/4. There were stark differences observed in the DNA TF binding motifs identified by the MEME suite from my ATAC-Seq data (Table 3). When

59 comparing binding sites only one (Foxi1)160 is shared between the maternal and paternal Xlr3b alleles. This binding motif difference suggests activating and silencing transcription factors are able to bind to different parental alleles. What remains to be seen is what causes the regions of open chromatin between the alleles to be different. A potential nucleosome rearrangement may occur to allow for transcription factor binding proteins to effectively access the maternal allele. Conversely, the paternal allele could undergo a nucleosome rearrangement to ensure the silencing of that allele.

Figure 22: DNA TF Meme Suite Output Depicted are the top three motifs identified by MEME analysis for the Xlr3b gene sequence for (A) 39,XM and (B) 39,XP. Of the top 13 hits, only one shared motif was identified (Foxi1). Consensus sequences e-value for each were above 1.60E-09 indicating a high confidence.

60 The predominant difference between the two alleles is the parent of origin of the

X chromosome suggests an epigenetic mechanism independent of DNA methylation causing silencing specific to the paternal allele. This is a novel hypothesis for an imprinting control mechanism and could be used to identify potential currently unidentified imprinted genes on both the mouse and human X chromosome in the future by searching the X chromosome for genes with similar regulatory characteristics. Figure

22 displays a graphical depiction of the consensus sequence for the top three TF binding domains identified by the MEME suite analysis program. DNA footprint profiles appear to be very different between parental alleles. In fact out of the top 13 motifs, only one (Foxi1) was similar.

Figure 23: DNA Footprint Analysis of Xlr3 Paralogs DNA Footprint of 39,XM (Red) and 39,XP (Blue) for the genes Xlr3a, Xlr3b, and Xlr3c. Data produced by the python script RGT-HINT. DNA footprints depict model active binding sites at the TSS of Xlr3a and Xlr3b on the maternal allele. No active binding sites are detected on any paternal allele. Xlr3c is absent of any predicted active binding domains. Previously predicted TF binding sites mapped with the Sequence Manipulation Suite.

61 DNA footprint analysis was accomplished using the RGT suite, and specifically the Hmm-based IdeNtification of Transcription Factor Footprints (HINT) script. This suite allows for MACS2 peaks to be manipulated to generate detected regions of increased and decreased histone modification and ATAC signals. It was originally adapted from a suite tailored for DNase-seq experiments to be used with ATAC-Seq data. This pipeline uses a bias-corrected signal before normalization steps making it a preferred method for

DNase/ATAC-Seq analysis. Seen in Figure 23 predicted active binding sites proximal to the TSS can be seen on the maternal allele for Xlr3a and Xlr3b however no active binding sites are predicted on the maternal allele for Xlr3c. Xlr3a does depict active binding sites on the paternal allele, however not at the TSS. Xlr3b does not contain any predicted active sites on the paternal allele.

Figure 24: DNA Footprint Analysis of X Chromosome Genes DNA Footprint of 39,XM (Red) and 39,XP (Blue) for the genes Ddx3x, Sox3, and Rhox5. Data produced by the python script RGT-HINT. DNA footprints depict active binding sites at the TSS of Ddx3x, Sox3, and Rhox5 on both parental alleles. A similar profile is seen across Ddx3x and Sox3, however, at the 3’ end of the maternal allele of Rhox5 there are predicted sites and none on the paternal allele.

62 As seen in Figure 24, biallelically expressed genes Ddx3x and Sox3 display virtually identical DNA Footprint profiles. Rhox5, a known imprinted gene, shares a similar 5’ end between parental alleles. However, the 3’ region displays peaks only on the maternal allele. The similarity observed between parental alleles of biallelically expressed genes and the differences observed between known imprinted genes suggests a potential role of predicted active binding sites at the loci. Further analysis will need to be conducted to elucidate the role of these binding sites and the transcriptional activity at the Xlr3/4 locus.

Combining the data from the MNase-Seq assay for maternal and paternal enrichment with the ATAC-Seq nucleosome positioning prediction shows consistency between predicted loci, see Figure 25. This overlap between the two assays provides insight into differentially positioned nucleosomes on the parental alleles. Several loci as seen in Figure 25 identified originally by MNase-Seq, displayed maternal enrichment and paternal depletion later in the ATAC-Seq assay. These differentially positioned nucleosomes could influence the overall transcriptional activity at the imprinted Xlr locus.

63

Figure 25: MNase-Seq Enrichment Comparison to ATAC-Seq Nucleosome Positioning Nucleosome MNase enrichment of maternal nucleosomes (bottom – red) align with predicted nucleosomes on the maternal allele from NucleoATAC output. Similarly, paternal enrichment of nucleosomes (bottom – blue) align with predicted nucleosomes from NucleoATAC.

A recent publication by Shioda et al. examining a mouse knockout of the chromatin remodeler, Atrx, identified a CpG island in Xlr3b at the intron1/exon1 boundary that appears to be differentially methylated according to the presence or absence of Atrx161. This CpG island corresponds to the region investigated by B.

Carone that was found to be equally methylated on both parental alleles (Figure 6). This region was also predicted to form the non-canonical DNA secondary structure know as a G-quadruplex by the QGRS Mapper162. The knockout of Atrx in hippocampus of adult male mice upregulated Xlr3b expression161. The group reasoned that Atrx is recruited to the G-quadruplex region, and in turn, recruits DNA methyltransferases (Dnmts) to establish a DMR regulating gene expression161,163,164,165. It has previously been reported that ATRX acts within a histone chaperone complex, along with DAXX to deposit histone variant H3.3 in heterochromatic regions enriched for H3K9me3166. In Figure 26,

I identified differential nucleosome occupancy on the maternal allele of Xlr3b aligning to the same region of this reported G-quadruplex from the Shioda group161. H3.3 has been

64 previously shown to be associated with active gene transcription by opening otherwise compact chromatin regions167.

Figure 26: G-Quadruplex Alignment with Nucleosome Occupancy Nucleosome occupancy at Xlr3b (A.) and Xlr4b (B.) in relation to the presence of predicted G-quadruplexes. Both imprinted genes display differential nucleosome occupancy in the 5’ region of the gene body correlating to the location of a high scoring G-quadruplex prediction. These regions are believed to interact with ATRX acting as a chromatin remodeler by means of a histone chaperone complex.

As seen in Figure 26, differential nucleosome occupancy on the maternal allele of the two sense strand imprinted genes in the Xlr locus correspond to predicted G- quadruplex regions. These nucleosomes have the potential to be modified by ATRX with the deposition of H3.3 causing the maternal allele to be actively transcribed while the paternal allele would remain silent. An imprinting mechanism involving differential

65 nucleosome positioning and ATRX deposition of H3.3 has not yet been identified and would be a novel discovery. This mechanism would explain the actively transcribed maternal allele compared to the transcriptionally silent paternal allele.

66 Chapter 4: The Transketolase-Like 1 Gene in Neurodevelopment

Discovery of Tktl1 Imprinted Expression in Mouse

Since imprinted genes often occur in large clusters, former O’Neill lab member

A.M. Nesbitt tested genes for PSGE in a 1 Megabase region surrounding the Xlr3/4/5 genes. PSGE of autosomal imprinted genes is generally strongest during embryonic stages of mouse development168. For this reason, Dr. Nesbitt performed her screen via qRT-PCR using embryonic liver, brain, and placenta from 39,Xm and 39,Xp mice. She first detected maternal dominant expression of Transketolase-like 1 (Tktl1) in embryonic stage 14.5 (E14.5) liver. Dr. Nesbitt then assayed for imprinted expression of Tktl1 in P0 brain sub-regions due to the link of imprinted genes and neurocognitive ability. A five-

Figure 27: Dr. Nesbitt Real-Time PCR results of Tktl1 Expression A statistically significant difference was observed between the different genotypes in all tissues analyzed (p<0.05). Error bars indicate a ninety-five percent confidence interval. Adapted from Dr. Addie Mae Nesbitt.

67 fold higher expression was observed in 39,XM neocortex compared to 39,XP along with an eight-fold difference in olfactory bulb, and two-fold difference in cerebellum (Figure

27). The allelic dominance in expression indicates that Tktl1 is regulated by what is termed a “partial” imprint.

In conjunction with mouse studies, Dr. Nesbitt assayed allele-specific expression of TKTL1 in a 46,XX 19 week human fetal brain sample (Figure 28). She again observed sub-region-specific partial imprinting: noting, for example, a five-fold allelic dominance in thalamus, and two-fold allelic dominance in frontal cortex. However, the lack of parental DNA matching this fetal brain sample made it impossible to determine which parental allele was the more highly expressed.

Figure 28: 19 Week Fetal Human Brain TKTL1 Expression TKTL1 expression levels in human fetal brain tissue sub regions using c.-44dupT allele imprinting primer set. Thalamus displays greatest fold difference in 19w Human brain samples. This data suggested at the time TKTL1 may in fact be imprinted in human brain as well as mouse brain tissue. This data propelled experiments looking at the link between TKTL1 and ASD. Adapted from Dr. Addie Mae Nesbitt. *Only one fetal brain sample was ever obtained for analysis.

68 TKTL1 Expression in Human Brain Sub-regions

Skuse et al.2 showed that Turner Syndrome (TS) females who inherited the maternal X chromosome, 45,XM show lower social cognitive ability than TS females who inherited the paternal X chromosome,

45,XP, implicating X-linked imprinted genes in neurodevelopment of social and cognitive ability. Furthermore, Skuse showed TS females that inherited the paternal X chromosome displayed similar social cognitive ability to 46,XY males Figure 29. Non-Oxidative Phase of the Pentose Phosphate Pathway: Function of and control 46,XX females demonstrated Transketolase in converting Ribulose 5- phosphate and production of NADPH. p. the highest level of social cognitive 863 of Biochemistry, by Voet & Voet, 3rd Edition. ability2. TKTL1 is a potential candidate gene involved in poor social cognitive ability due to the role it may play in the production of NADPH and ribulose-5-phosphate in the Pentose Phosphate Pathway (PPP)169.

Transketolase is the rate limiting step in the non-oxidative phase of the PPP (Figure 29) responsible for the reversible conversion of ribose-5-phosphate to glucose-6-phosphate which results in the reduction of NADP+ to NADPH. NADPH is essential for the reduction of glutathione which in turn is responsible for reducing hydrogen peroxide

170,171 (H2O2) . Reducing H2O2 results in a decrease of oxidative stress and neurodegeneration170,171.

69 Quantitative Real-Time Polymerase Chain Reaction (qRT-PCR) was used to assess the relative abundance of TKTL1 transcripts present in neurotypical and ASD human brain sub-region tissue samples. The goal of this analysis was to determine if there is a correlation between the level of TKTL1 gene expression, or its imprinting status in sub-regions of the brain and diagnosis of ASD. Post-mortem frozen brain tissue samples were obtained from the NIH Neurobiobank and RNA was extracted with

Qiagen RNeasy extraction kits. Sample purity and concentration was quantified by

Nanodrop analysis and samples were normalized to 1000 ng RNA per cDNA synthesis reaction with cDNA Superscript. Samples were analyzed on the BioRad Real Time

Thermocycler using exonic primer pairs designed as depicted in Figure 30.

Figure 30. Primer design for qRT-PCR products in H. sapiens TKTL1 TKTL1-1 (1) forward designed in 5’ UTR/Exon 1, reverse located in Exon 3. TKTL1-3 (3) designed in Exon 9, reverse located in Exon 11. TKTL1-1 located in Thiamine Pyrophosphate (TPP) domain. TKTL1-3 located in Transketolase C-terminal domain.

Cycle thresholds from qRT-PCR were imported into Microsoft Excel and fold change comparing autistic and control samples were calculated using ΔΔC(T) method172. β-Actin and GAPDH mRNA were used as a reference housekeeping gene for normalization. A statistically significant difference in relative gene expression was observed between parietal brain tissues from females diagnosed with ASD compared to neurotypical females (Figure 31). However, no statistically significant difference was observed in female occipital, male occipital or male parietal tissue. With the high prevalence of ASD diagnosed in males, approximately 5:1 ratio compared to females1,

70 we had expected to find high levels of TKTL1 expression in male brain tissue. A possible explanation, if TKTL1 over-expression is indeed related to the development of

ASD, a threshold effect may occur. Females having one inactivated X chromosome may escape harmful effects. Whereas in males, having only one X chromosome may be more vulnerable to these harmful effects. If a female inherits a mutated copy of a gene on the X chromosome, ~50% of cells will express the aberrant allele. A mutation may not present as detrimental if it is expressed only in a sub-population of cells. In brain for instance, if a specific region is affected with an inherited mutation, another region will not be. When the mutation is located in a region necessitating the proper allele, detrimental effects will produce a phenotype. Since X inactivation occurs randomly, there is a protective effect to silencing a potentially deleterious allele. With this in mind it is possible this statistically significant difference in TKTL1 expression could be indicative of the potential development in females with ASD.

A significant difference in TKTL1 expression levels between neurotypical and

ASD samples would support the hypothesis of a role for TKTL1 in the neurodevelopmental defects underlying ASD. Incorrect regulation of TKTL1 could result in increased amounts of ROS causing lipid bilayer proliferation in neurological cell membranes. No significant difference between TKTL1 expression levels may indicate an earlier critical window for TKTL1 in the developing brain. Many genes and transcription factors have been shown to be necessary during certain developmental time points in fetal brain development156. During embryogenesis TKTL1 regulation may be more important for proper neurological development. Quantitative Real Time PCR

71 analysis shows increased expression between neurotypical and autistic female brain samples.

**

Figure 31: Relative TKTL1 expression in control vs. autistic females Relative expression of TKTL1 using qRT-PCR in female brain. Data transformed using Log10 for normal sample distribution. All relative expression normalized to β-Actin. A. TKTL1-1 primer set designed to 5’ end of gene from Exon 1 to Exon 3. Three of four Autistic samples had higher relative expression compared to neurotypical control samples, however they are not statistically significant. Levene’s Test: F=8.063, df=6, p=0.03. T-Test: df=6, p=0.156. B. TKTL1-1 primer set designed to 3’ end of gene from Exon 9 to Exon 11. All Autistic samples had higher relative expression compared to neurotypical samples. Levene’s Test: F=1.628, df1=6, p=0.249. T- Test: df:6, p=0.004.

72 TKTL1 Imprinting Status in Human Brain Sub-regions

The conventional experiment to conclusively determine genomic imprinting is the detection of monoallelic expression of a SNP in a cDNA sample in an individual heterozygous for the SNP in their genomic DNA. This method displays the suppression of one allele in the mRNA production pathway and thus, confirms that genomic imprinting is occurring at the locus. To examine the imprinting status of neurotypical and

ASD human brain sub-regions, RNA was extracted with the RNeasy kit from Qiagen

Figure 32: Sequencing Analysis of TKTL1 SNP Present in Human Brain Sub-regions A comparison of neurotypical female brain samples and ASD brain sub-regions for monoallelic and biallelic expression. Neurotypical females appear to have imprinted expression of TKTL1 in parietal tissue while ASD females appear to lose that imprinted expression in the same tissue.

73 and treated with Qscript mastermix (Quanta) to create cDNA for sequencing analysis on the Sanger 3130 in the Center for Genome Innovation (CGI). Sequences were viewed with Finch TV sequence viewer software and analyzed for the presence of SNPs in the genomic DNA. SNPs were identified and compared to the cDNA sequencing results to determine whether biallelic or monoallelic expression of the SNP in mRNA could be observed. The results determined that in neurotypical female parietal brain, monoallelic expression could be identified signifying genomic imprinting in this sub-region. In contrast, female occipital brain displayed biallelic expression of the SNP across all samples in neurotypical and ASD brain. Interestingly, ASD female parietal brain as shown in Figure 32 displayed a loss of parietal region-specific imprinting compared to their neurotypical counterparts.

Conclusions

Significant differences exist between neurotypical female brain and ASD female brain in a subregion specific manner. The combination of the data showing a statistically significant increase in relative expression of TKTL1 transcripts in ASD female parietal brain and the loss of imprinting status points to a potential correlation between the loss of the imprint and the development of this neurodevelopmental disorder. Moreover, it supports the hypothesis set out by Skuse et al. indicating imprinted genes on the X chromosome influence the cognitive ability of individuals2. To date, no confirmed imprinted genes have been identified on the human X chromosome, however, my work along with the work previously done by Dr. Addie Mae Nesbitt has provided preliminary evidence of Transketolase-like 1 imprinting status not just on the mouse X chromosome, but also on the human X.

74 Several explanations could suggest why TKTL1 is not observed to be over expressed in ASD male brain. First, a high post-mortem interval (PMI) is observed in several brain samples, as seen in Figure S6. RNA degradation due to this high PMI could account for skewed results in expression analysis. Excluding samples 1638 and

1349 as PMI outliers, the average difference between neurotypical and ASD males PMI is higher than that of females in this study. With this fact, it is possible that RNA degradation influenced the results as a source of error for the male brain samples more substantially than the females. Also, when acquiring samples from the Neurobiobank, there were inconsistencies among the labeled brain sections received, meaning a brain subregion may have been incorrectly categorized before shipment. Additionally it is worth noting that the cause of death of many of the subjects, asphyxiation related causes, could be indicative of increased oxidative stress (Figure S6). Lastly, it is possible that TKTL1 is not directly associated with ASD in males, and is female ASD specific. Without being able to obtain precise brain sub-region samples from the

Neurobiobank, increasing the cohort was not possible. While an N=4 is a low cohort of samples for statistical significance, these findings do suggest a pattern in female parietal brain.

75 Chapter 5: Characterization of the TKTL1 Protein

There is a debate in the literature about the function of the protein Tktl1 (or in human TKTL1).

As stated previously, Coy et al.174 and Schenk et al.194 postulate that TKTL1 is the rate limiting enzyme in the PPP controlling the production of Figure 33: Tkt and Tktl1 metabolites in the non-oxidative phase. However, Conservation. Pairwise protein alignment of Tkt and Tktl1 in Meshalkina et al. argues that TKTL1 does not itself mouse. Alignment done with EMBOSS Water alignment tool. even function as a canonical transketolase due to a 38 amino acid residue deletion in the N-terminal region of TKTL1 that encompasses a binding site for the essential co-enzyme, thiamine195. It has been shown that Tkt is phosphorylated at a threonine at amino acid position 382 (Thr382) by protein kinase B

(Akt) and that this phosphorylation event is critical for Tkt function196. The phosphorylation activation site, Thr382, is conserved between mouse and human Tkt and

Tktl1 proteins, as shown in Figure 33. This conservation could be a clue to the overall function Tktl1 serves in the cell and ultimately the role it may play as a regulating enzyme in the PPP. This chapter aims to elucidate the potential interactions of Tktl1 in

HT-22 mouse hippocampal cells and also in utero electroporation (UTE) transfected mice.

Tktl1 maps approximately 500 kb from the Xlr3/4 cluster in mice and the human ortholog TKTL1, is found in the syntenic region of the human X, mapping to Xq28. It is not currently known if TKTL1 functions as a transketolase despite encoding for a paralog of the autosomal transketolase gene, TKT197. In the PPP, transketolase is a rate

76 limiting step in the reversible non-oxidative portion of the pathway, and it is critical for providing the cell with ribose-5-phosphate for nucleotide synthesis and NADPH in conditions of oxidative stress189,190. A disruption in NADPH production could result in increased levels of free radicals, or reactive oxygen species (ROS). NADPH is required to reduce cellular glutathione which in turn converts the ROS peroxide to water. ROS are damaging to cells, nerve cells in particular, reacting with the lipid bilayer causing lipid peroxidation and cellular death198. Oxidative stress has been investigated as a causative factor of cellular degeneration by measuring the relative concentration of detoxifying agents such as reduced glutathione along with lipid peroxidation byproducts199,200. Incorrect regulation of TKTL1 imprinting in specific brain sub-regions may alter transketolase activity, thus altering the ability for neurons to produce an appropriate amount of reducing agents leading to increased oxidative damage and neurodegeneration201.

A cell line derived from mouse hippocampus (HT-22) was used due to the difficulty to culture and grow primary neuronal cell lines202. HT-22 cells grow effectively in culture and can be easily transfected. In these experiments, transfection was conducted with the Neon Transfection System from Thermo Fisher Scientific utilizing electroporation. This method allows for plasmids and vectors to be transfected through the cell membrane by electrical pulse203. A mouse model system to investigate the enzyme reaction of the PPP was chosen due to high level of conservation between human and mouse systems. Additionally, HT-22 cells serving as a model for human cells undergoing oxidative stress and other disorders is well-documented in the literature200,201,202. An expression construct developed by Dr. Nesbitt including the full-

77 length coding DNA sequence for human TKTL1 was ligated into a pCAGGS vector (gift from the J. LoTurco laboratory). The pCAGGS vector is controlled by a chicken β-actin promoter and has been observed to be highly expressed in several cell types205.

TKTL1 Protein Interactions

Tkt is activated by the phosphorylation of Thr382 by Akt as part of the mammalian

Target Of Rapamycin (mTOR) pathway to produce sufficient growth factors, extracellular nutrients and amino acids to allow proper cell proliferation196. As stated above, activated Tkt is an integral and rate limiting enzyme in the non-oxidative phase of the PPP187. Due to the conserved Thr382 phosphorylation activation site present in

TKTL1, I decided to investigate the potential for interactions between the Akt kinase, Tkt and TKTL1. The transfected construct contained human TKTL1 cDNA to allow it to be distinguished from endogenous mouse Tktl1 in gene expression assays. My working hypothesis was that human TKTL1 would work as a regulating factor for the activation of endogenous Tkt in the cell. With the conserved Thr382 binding site, TKTL1 could act as a competitive inhibitor by providing additional Thr382 sites for activation by Akt, decreasing the amount of Tkt phosphorylation events and thus influencing the reversible reactions of the non-oxidative phase of the PPP. In this way, aberrant TKTL1 expression could explain the increased levels of oxidative stress and provide a direct link to the over expression of TKTL1 observed in female parietal brain and the development of ASD.

To elucidate the potential for interactions between Akt/Tkt and Akt/hTKTL1 a pCAGGS-hTKTL1 vector was transfected via electroporation into the HT-22 cell line and transfected cells were assayed for expression of endogenous Tkt, hTKTL1 mRNA, and

78 glucose-6-phosphate 1-dehydrogenase (G6pd). G6pd is another gene on the X chromosome of the mouse that encodes a protein integral to the PPP. This gene was used in expression experiments to demonstrate the transfection of hTKTL1 did not alter the expression of other genes involved in the PPP. As is shown in Figure 34, the pCAGGS promoter was able to drive hTKTL1 expression to high levels in transfected

HT-22, while not increasing expression of other genes involved in the PPP.

Relative Expression of Tkt, hTKTL1, G6pd in HT22 and HT22 Transfected

Cells 90.00 *** p Value = 0.00012 80.00

70.00

60.00 50.00

40.00 30.00

Average FoldAverage Change 20.00

10.00

0.00

Tkt hTKTL1 G6pd HT22 Transfected

Figure 34: qRT-PCR HT-22 Transfected Cell Lines qRT-PCR results of pCAGGs-TKTL1 transfection in HT-22 cells. HT-22 control cells depicted in blue and transfected +hTKTL1 HT-22 cells depicted in red. No statistical difference observed in either Tkt or G6pd expression. Highly significant difference observed between control HT-22 cells and +hTKTL1 transfected cells. T-Test ***p=0.00012 for transfected HT- 22 cells compared to control. All values normalized to β-actin.

To investigate protein-protein interactions, co-immunoprecipitation (co-IP) assays were performed using an antibody to endogenous Akt. A co-IP utilizes an antibody of a known protein, in this case Akt, believed to form a strong association with another

79 protein creating a protein/protein complex. Using this antibody for an IP allows for both proteins, if strongly bound, to be pulled down together. Instead of blotting for the known protein, a second antibody against the potential associated protein is used during the western blot, in this case, Tkt and TKTL1 respectively. It is then possible to assess the protein/protein interactions of interest188. Protein complexes immunoprecipitated with the Akt antibody were separated via polyacrylamide gel-electrophoresis and western blotted. Western blots were treated with antibodies to either Tkt or TKTL1. As shown in

Figure 35, both Tkt and TKTL1 show protein/protein interaction with Akt. A modest but significant decrease in interaction of Tkt with Akt in the transfected cell line was observed compared with the control HT-22. Biological replicates were analyzed with

ImageJ for band concentrations and two tail p-value calculated by t-Test for paired two sample for means with an alpha of 0.05. TKTL1 is detected interacting with Akt in the transfected cell lines and not in the control HT-22 cells. This result suggests that over expression of TKTL1 mRNA, and thus increased TKTL1 protein, may interfere with the canonical interaction between the two proteins altering the efficiency of Tkt to function in the PPP as proposed by my hypothesis.

80

Figure 35: Co-IP of Tkt/Akt and TKTL1/Akt Co-immunoprecipitation depicting the interactions between Akt with Tkt (A) and Akt with TKTL1 (B). Modest decrease in interaction observed in Tkt, *p=0.015 (A) and a qualitative interaction observed with Akt/TKTL1 (B). α-tubulin as loading control, normalized data show protein/protein interactions with Akt. Concentrations of protein bands measured with ImageJ. Histogram of densitometric quantification of band concentrations ratios to background and normalized to loading control (D). With the data suggesting that both Tkt and TKTL1 interact with Akt the next step was to assess whether the phosphorylation status of Tkt was altered when TKTL1 mRNA was over expressed. To elucidate this, a protein immunoprecipitation with a monoclonal Transketolase antibody (Thermo Fisher) and a polyclonal TKTL1 antibody

(Thermo Fisher) were used to pull down the protein of interest in an immunoprecipitation. Following this, the PVDF membrane was blotted with a phospho- threonine antibody to quantify phosphorylation in protein present. My hypothesis was that in control HT-22 cells phosphorylation of Tkt would be higher compared to that of the transfected cells lines, suggesting hTKTL1 is interfering with phosphorylation of Tkt.

The interaction of Akt with hTKTL1 would also predict phosphorylation of hTKTL1 in the transfected cells lines and little to no phosphorylation in the control HT-22 cells. Any

81 detection of phosphorylation of TKTL1 in control HT-22 cells would likely be background endogenous mouse Tktl1 cross reacting with the polyclonal antibody to hTKTL1.

Figure 36: Tkt and TKTL1 IP for Phosphorylation Analysis HT-22 cell lines and hTKTL1 transfected cell lines showing differential protein quantification after immunoprecipitation. IP with Tkt antibody and blotted for phospho-threonine (A) **p=0.0067. IP with TKTL1 antibody and blotted for phospho-threonine (B) ***p=0.00064. Total loading control (C) with α–tubulin primary antibody. After transfection, Tkt phosphorylation (activation) is knocked down. TKTL1 phosphorylation is detected in transfected cells. Histogram of densitometric quantification of band concentrations ratios to background and normalized to loading control (D). Data suggests that the presence of constitutively transcribed hTKTL1 in HT22 cells decreases the phosphorylation of endogenous Tkt.

As seen in Figure 36, phosphorylation is decreased for Tkt in transfected HT-22 cell lines, concurrent with high expression and high levels of phosphorylation of TKTL1.

This data support the hypothesis that TKTL1 acts as a competitive inhibitor of Akt phosphorylation of Tkt, thereby potentially inhibiting the non-oxidative phase of the PPP.

82

Figure 37: Relative Ratio of Glutathione and NADPH in Cells Relative ratio of GSD/GSSH (A) and NADPH (B) in mouse hippocampal cell lines, HT-22, transfected with pCAGGS/TKTL1 vector. A statistically significant difference was observed between control HT-22 cells and transfected cell lines over expressing TKTL1. Error bars indicated a ninety-five percent confidence interval. Assays conducted by Amy Friss.

Interestingly, over expression of transketolase or transaldolase, both involved in the non-oxidative phase of the PPP, has been linked to increased oxidative stress.

Schenk et al. along with Coy et al. have proposed and shown evidence that one of the transketolase enzymes (Tkt or TKTL1) is the rate limiting enzyme in the non-oxidative phase of the PPP175,176. Previous work done in the O’Neill lab by Amy Friss showed that forced over-expression of TKTL1 in the mouse hippocampal cell lines, HT-22, reduced the levels of glutathione and NADPH produced during the non-oxidative phase of the

PPP, see Figure 37. This reduction of glutathione and NADPH inhibits the neural cells’ ability to reduce ROS’s such as hydrogen peroxides and leads to oxidative stress and ultimately neurodegeneration.

83 To investigate the interaction of Akt/Tkt/TKTL1 in vivo, CD1 fetal mice were transfected in utero with both the pCAGGS-hTKTL1 construct and pCAG-mRFP

(monomeric red fluorescent protein) in collaboration with Roman Goz in the laboratory of Dr. Joe LoTurco186. Embryos were allowed to gestate until P0 and neocortex was extracted at this time. qRT-PCR was conducted to confirm the presence of hTKTL1 transcript, as seen in the results in Figure 38. Immunoprecipitation was then performed on protein extractions using antibodies to Tkt, hTKTL1 or Akt as a control in this experiment. IP products were western blotted and stained with the antibody against phospho-threonine. This assay was performed to determine if phosphorylation of Tkt is inhibited by over expression of hTKTL1 in vivo. Figure 37 shows high expression of hTKTL1 transcripts in two mouse IUE brain samples, 11 and 13, compared to the non- transfected control. Moreover, endogenous Tkt expression appears unaffected in the transfected brains.

84

Figure 38: qRT-PCR of Mouse IUE Brain Samples Relative fold change of Tkt and TKTL1 in mouse IUE brain samples. A statistically significant increase in expression was observed in IUE brain 11 and 13 compared to WT. Error bars indicate 95 percent confidence interval.

Immunoprecipitation and subsequent western blots of protein extracts from brain samples 11 and 13 are shown in Figure 39. Again, an increase in phosphorylation of

TKTL1 when over expressed was observed along with a decrease in endogenous Tkt phosphorylation. Band concentrations were measured with ImageJ and an ANOVA single factor test was used to calculate statistical significance between IUE brain samples and WT. This data, combined with previously described protein data, provide a compelling argument that TKTL1 can and does act in competition with endogenous Tkt for phosphorylation. Inadequate levels of Tkt phosphorylation may lead to aberrant production of PPP metabolites and contribute to the levels of oxidative stress present in cells over expressing TKTL1.

85

Figure 39: Mouse IUE IP Western Blot Analysis Western Blot analysis on mouse IUE transfected samples compared to WT brain tissue. IP with Tkt antibody and blotted for phospho-threonine (A). IP with hTKTL1 antibody and blotted for phospho-threonine (B). Akt antibody used for total protein loading control (C). A decrease in phosphorylated Tkt was observed in IUE mouse brains compared to wildtype with a calculated *p-value=0.013. An increase in phosphorylation of hTKTL1 was observed in mouse IUE brains with a calculated *p-value=0.027. Histogram of densitometric quantification of band concentrations ratios to background and normalized to loading control (D). Akt used as total protein loading control. Secondary anti-body was provided by Li-cor.

Conclusions

With the lack of knowledge in the literature about the function of the TKTL1 protein, providing insight into the role the protein plays in the PPP and the role of the conserved phosphorylation site was important. Protein-protein competition altering enzyme function has often been reported in the literature. This competition was clearly defined by the Csn4–Bam–Bgcn competition model where these proteins antagonize each other’s functions to regulate germ cell differentiation in D. melanogaster189. In this system, Bam proteins are upregulated and sequester the Csn4 proteins from the

COP9 complex thus changing the function and allowing the cell to undergo

86 differentiation rather than self-renewal. My data suggests that TKTL1 may very well be acting in a similar fashion regulating the function of endogenous TKT in the PPP.

In normal cells, low level expression of TKTL1 may modulate phosphorylation of TKT to achieve homeostasis. This would explain why TKTL1 is typically expressed at low levels as to not cause adverse effects to the PPP and increase levels of oxidative stress in the cell. However, when TKTL1 is over expressed in cancer cells, it is an important component in mass producing nucleic acids via increased levels of ribose190. Rapidly proliferating cancer cells increase glucose consumption as a means to facilitate production of nucleotides and lipids191. Several studies have shown TKTL1 to be a pivotal enzyme in the PPP in cancer cells and is hypothesized to be upregulated to balance the production pentoses for nucleotide synthesis and fatty acid synthesis, as well as maintaining homeostasis for oxidative stress191,192.

TKTL1 may very well function as a transketolase in the PPP in cancer cells as studies have suggested. However, my experiments suggests a novel role for TKTL1 as a competitive inhibitor of TKT in neural cells. While TKTL1 expression and function has been well studied in cancer cells, its role in neural cells is currently unknown.

Cascante et. al193 observed increased TKTL1 expression in cancer cells as well as increased activity of Akt and the mTOR signaling pathway. Increased Akt activation would allow for phosphorylation of both TKT and TKTL1. Additionally, Cascante et al. proposes TKTL1 functions in an independent one-substrate reaction converting xyulose-5-phosphate to glyceraldehyde-3-phosphate. Since neural cells are not rapidly dividing, TKTL1 may function differently compared to previous studies in cancer cells. In fact, the glutathione assay conducted by A. Friss (Figure 37) directly

87 contradicts TKTL1 functioning as a transketolase in neural cells. When TKTL1 is aberrantly expressed in normal neural cells, such as female parietal brain, oxidative stress and neurodegeneration may occur and lead to neurodevelopmental disorders such as ASD.

Understanding how TKTL1 functions when it is over expressed in neuronal cells such as in these experiments can provide insight to the function it serves in neurotypical cells. This work has shown TKTL1 not only has a conserved Thr382 site with Tkt, but that it can alter the levels of Tkt activation as well. While TKTL1 is a relatively low expressed gene, it may be critical in modulating the function of the non - oxidative phase of the PPP. TKTL1 and Tkt could work together as the rate limiting enzymes in the PPP, based on phosphorylation activation. This data suggests TKTL1 regulates and when over expressed, inhibits, the activation of Tkt. Using this antagonistic relationship between the two alleles, the cell is able to properly balance the reversible reactions of the non-oxidative phase of the PPP to ensure oxidative stress is kept low, while also producing metabolites for downstream functions such as the production of DNA/RNA. As no studies to date have investigated the function of

TKTL1 in neural cells, this body of work serves as a novel insight into how TKTL1 functions within neural cells, and in particular how it can be used to regulate the PPP.

While TKTL1 may in fact not function as a transketolase itself due to the deletion of the co-enzyme binding site, this study suggests it does play a role in overall transketolase function within the cell. Additionally, the data suggests that when

TKTL1 is over expressed within neural cells, oxidative stress is increased and can lead to neurodegeneration, and potentially, ASD.

88 Chapter 6: Synthesis and Future Directions

The goal of past graduate students in the O’Neill lab has been primarily to elucidate an imprinting mechanism regulating PSGE on the X chromosome in mammals. This began with Dr. Raefski and his discovery of the imprinted Xlr3/4 cluster in 20053. Using a Turner Syndrome model mouse, this study identified maternal specific gene expression of the Xlr3b, Xlr4b, Xlr4c paralogs on the XA.7.2 region of the X chromosome. The imprint was observed in different tissue types including brain and fibroblast cell lines. Following this work Dr. Carone proposed that a novel imprinting mechanism must exist regulating the imprinted cluster after providing evidence that a

DMR does not exist within the region of the imprinted paralogs.

Looking at the primary transcripts of Xlr3b Dr. Kasowitz identified a difference in transcript abundance at the 3’ end that was not seen at the 5’ end when comparing the maternal to paternal alleles. This suggested that the maternal copy of Xlr3b completed transcription through initiation, elongation and termination while the paternal transcript was interrupted after initiation. Building off of this work, Dr. Qureshi sought to investigate the potential co-transcriptional regulation mechanism studying RNA Pol II activity at the Xlr3/4/5 locus. Through his experiments he showed a perturbation of RNA

Pol II during active transcription first beginning at Int3/Ex4 and almost completely stalled by intron 7. However, Dr. Qureshi’s results have not been able to be repeated by others in the O’Neill lab. This previous work culminated in the study of various histone modifications at the imprinted Xlr3/4 region to investigate if a differential chromatin environment exists between the maternal and paternal allele potentially influencing

89 active RNA Pol II transcription events. Prior to this dissertation, no robust, repeatable difference in chromatin architecture has been observed.

Dr. Nesbitt established the foundation for the work investigating TKTL1 (Tktl1) in both human and mouse. Dr. Nesbitt searched beyond the imprinted Xlr cluster in an attempt to identify other imprinted genes on the mouse X chromosome. In doing so, she identified the gene Tktl1 to be partially paternally imprinted in specific mouse tissues including subregions of the neonatal brain. The unique finding by Dr. Nesbitt enabled the lab to bridge the gap between the model mouse and the since, unlike the Xlr3/4 genes, Tktl1 has a human ortholog, TKTL1 that is also present in the syntenic region of the human X chromosome174. Her work along with Amy Friss identified Tktl1 to not only be imprinted, but also involved in the pathway of glutathione reduction. This discovery linked an imprinted gene on the X chromosome to potential downstream neurodegeneration and cognitive defects for the first time in our lab. I began my work studying the relative expression of TKTL1 in human brain sub-regions to test the hypothesis that individuals with ASD would aberrantly express TKTL1 and thus would have neurodegenerative and social cognitive defects. I also set out to investigate the function the TKTL1 protein plays in the Pentose Phosphate Pathway to test the hypothesis that over expression of TKTL1 would decrease the phosphorylation and activation of endogenous Tkt altering production of metabolites in the non-oxidative phase of the PPP.

Nucleosome Positioning and the RNA Pol II Stalling

A thorough investigation of nucleosome positioning and occupancy has shown that there are regions on the paternal allele of the imprinted Xlr genes that are highly

90 nucleosome dense compared to the maternal allele, and that regions of the maternal allele contain nucleosomes not present on the paternal allele. Specifically in Xlr3b, nucleosomes are preferentially positioned proximal to the Int3/Ex4 boundary and within intron 7 on the paternal allele. Although this is not a definitive answer to the stalling machinery or imprinting control mechanism of X-linked imprinted genes, it does provide a novel insight into how nucleosome positioning can potentially influence transcription regulation on the X chromosome. Low levels of paternal mRNA transcripts are detectable suggesting that RNA Pol II is able to overcome the nucleosome obstacles on the paternal allele but at a relatively low success rate. Additionally, the presence of a maternal nucleosome located at the G-quadruplex exon1/intron1 boundary suggest a potential novel mechanism where H3.3 may be deposited on the maternal allele and not the paternal allowing proper transcription initiation and elongation. H3.3 would only be able to be deposited on the maternal allele alleviating the secondary G-quadruplex structure and allowing proper transcription throughout the gene body. Conversely, on the paternal allele, H3.3 would not be deposited in the region of G-quadruplex formation and RNA Pol II would not be able to proceed effectively though the gene body.

A future characterization of transcription differences between the alleles could be investigated by conducting a global run on sequencing (GRO-Seq) assay to determine the status of RNA Pol II and the transcripts being produced from the maternal and paternal alleles175. Since the paternal allele is not producing a full transcript, the maternal sequencing results for the imprinted Xlr3/4 region should be statistically more prevalent. Additionally, this assay would indicate the status of RNA Pol II across the gene body for each of the imprinted paralogs providing a more in depth understanding

91 about the stalling mechanism as a whole. Conducting this GRO-Seq experiment with the Illumina Next-Seq would allow for a thorough investigation of the maternal and paternal X chromosome and help identify other genes with a similar profile. This process will map the quantity, orientation and position of actively transcribing RNA Pol

II.

An assay should be developed to destabilize the nucleosomes present on the paternal allele to observe if full active transcription is restored to provide evidence that nucleosome positioning is influencing transcriptional regulation. Nucleosome destabilization has been used to allow nucleosomes to be more easily modified during transcription176. One way to destabilize nucleosomes is to alter the salt concentration in in vitro expression assays. Nucleosome binding affinity can be altered by increasing or decreasing the overall salt concentration loosening or tightening the bound DNA allowing for easier histone octamer sliding events195. If nucleosome binding affinity is lessened and normal levels of transcription are restored, it could be concluded that specific nucleosome positioning perturbs RNA Pol II and in effect, regulates transcription of the imprinted cluster. It is important to thoroughly investigate this nucleosome occupancy and RNA Pol II activity as a potential imprinting mechanism and would require further evidence to conclusively characterize it as a novel imprinting mechanism. GRO-Seq and a nucleosome sliding assay would greatly contribute to this hypothesis.

An additional assay that would aide in deciphering the role of G- quadruplex/ATRX interactions would be a histone variant H3.3 ChIP-Seq. This experiment would determine if there is differential deposition of H3.3 on the maternal

92 allele compared to the paternal allele at the exon1/intron1 boundary. H3.3 is known to be deposited in heterochromatic regions of DNA to allow for DNA accessibility for active transcription177,178. If there is an enrichment of H3.3 at this locus on the maternal allele it would explain how RNA Pol II is able to produce a full transcript of Xlr3b on the maternal allele and explain why the paternal allele remains repressed. To determine if this H3.3 deposition truly is the ICR controlling the imprint of Xlr3b, an assay would need to be conducted to either remove H3.3 from the maternal allele where the expected outcome would be transcription silencing, or add H3.3 to the paternal allele where the expected outcome would be transcription activation of the paternal allele. An

RNAi construct could be created against histone variant H3.3 thus interfering with the deposition on the maternal allele. In this assay, the hypothesis would be the maternal allele would not produce a full transcript and would become silent.

The Model of Xlr3b Transcription Perturbation by Nucleosome Occupancy

For the purposes of this dissertation, the model of transcription perturbation identifies nucleosome occupancy on the paternal X chromosome to regulate expression at the Xlr3/4 imprinted gene cluster. The maternal allele of Xlr3b undergoes proper transcription initiation, elongation and termination. The maternal copy is hypermethylated at the promoter and transcription initiation events are observed similarly to autosomal imprinted loci in the genome. Although the promoter region is heavily methylated the pre-initiation complex (PIC) is able to be established successfully and RNA Pol II begins to synthesize mRNA products198,199. This is followed by proper transcription elongation and eventually termination at the 3’ end of the gene200.

93 The paternal allele does not follow the same pattern of transcription elongation.

Previous data by Dr. Carone, Murphy and Kazowitz have suggested that the PIC is able to successfully establish at the Xlr3b promoter and begin to synthesize mRNA. Early transcription events occur equally on both the maternal and paternal allele up through part of transcription elongation. Following this initiation of transcription, RNA Pol II appears to be interrupted on the paternal allele failing to produce a full length paternal transcript. As of yet, no mechanism or model has been reproducible to explain this paternal silencing. The following model, shown in Figure 40, explains an imprinting mechanism where differential nucleosome positioning may cause RNA Pol II to stop and dissociate during active transcription on the paternal allele. A buildup of nucleosomes results in the perturbation of RNA Pol II on the paternal allele followed by dissociation. The following model provides a schematic of how nucleosome positioning could cause RNA Pol II elongation perturbation and eventual dissociation from the DNA strand. Weber et al.201 explains that nucleosomes are observed to create a barrier to

RNA Pol II transcription in vitro. This study showed that nucleosome occupancy has a direct effect on the extent of RNA Pol II stalling and perturbation201. The ability of nucleosomes to pause RNA Pol II transcription is well defined in the literature, particularly at the promoter region. Gilchrist et al. noted highly regulated genes with nucleosomes at the promoter that display RNA Pol II stalling, disfavor nucleosome occupancy within the gene202. This data suggested a link between RNA Pol II competing with nucleosomes for promoter occupancy and downstream positioning of nucleosomes within the gene body. Nucleosome positioning at highly active gene sites has been observed to create a stalling event just upstream of the transcribed gene region. This

94 barrier is required to be overcome for downstream transcription to occur203. Wilhelm et al. hypothesized that a pile up of nucleosomes at the intron/exon boundary could cause

RNA Pol II to slow down or pause204.

Figure 40: A Proposed Imprinting Mechanism Model Silencing the Paternal Xlr3b Proper transcription initiation occurs on both the maternal and paternal alleles and actively being transcribing mRNA. When RNA Pol II approaches int3/ex4, it encounters specifically placed high affinity nucleosomes and begins to stall. This stall inhibits the ability of the transcription machinery to recruit chromatin remodelers and thus creates an environment difficult for the machinery to transcribe through. Some RNA Pol II is able to overcome the first obstacle, but further downstream RNA Pol II encounters another region of high affinity nucleosomes at intron 7 and the majority of RNA Pol II is stalled and dissociated truncating transcription and causing the silencing of the paternal Xlr3b transcript. An alternative model for the Xlr imprinting cluster could involve a G-quadruplex recruitment of ATRX on the maternal allele leading to the deposition of H3.3 to provide active RNA Pol II transcription at the 5’ end of the Xlr3b gene body. This novel mechanism would explain how the maternal allele is expressed more freely while the paternal allele remains repressed. In this model seen in Figure 41, a G-quadruplex at the exon1/intron1 boundary of the Xlr3b gene body would recruit ATRX to the region where there is a differential nucleosome occupancy between the maternal and paternal

95 alleles. The maternal allele which has an observed nucleosome located at this locus would allow ATRX to deposit H3.3 to promote active transcription of the allele.

Conversely, the paternal allele, would contain no nucleosome placed at this G- quadruplex region and would not allow deposition of H3.3 and would inhibit the ability of proper transcription elongation of the allele in a parent specific manner. This novel mechanism would be the first of its kind on the X chromosome to describe a genomic imprinting region utilizing ATRX at a G-quadruplex to regulate parent specific gene expression. Additionally, the Shioda group identified a small CG rich region that displayed differential methylation in their male ATRX knockout mice. In the study, ATRX

KO mice displayed a decrease in methylation at the region compared to WT178.

Although this region is not a classically identified CpG island, this differential methylation of the region could be involved with the imprinting mechanism. However, previous work by Dr. Carone and Dr. Murphy did not detect differential methylation in this region.

Additionally, the transcription factor complex, FAcilitates Chromatin Transcription

(FACT), is essential for chromatin remodeling and proper transcription through nucleosomes205. When RNA Pol II stalls at the 5’ region of genes, H3K36me3 also piles up with the transcription machinery. Set2, a histone methyltransferase, modifies histones with H3K36me3 in coordination with RNA Pol II during transcription elongation206,207. If H3K36me3 cannot be recruited properly, i.e. due to a lack of a nucleosome, FACT recruitment will also be altered and the ability of active transcription could be drastically diminished. This may be involved with the perturbation of RNA Pol II on the paternal allele along with overcoming a G-quadruplex on the maternal allele.

96

Figure 41: Proposed Alternative Imprinting Mechanism Model for Xlr3b Proper transcription initiation occurs on both the maternal and paternal alleles and actively being transcribing mRNA. However, at the exon1/intron1 boundary a nucleosome is positioned on the maternal allele and not the paternal allele. This positioning difference allows for ATRX, which is recruited to the G-quadruplex region of the gene body and deposits H3.3 to regulate active gene transcription. Conversely, on the paternal allele, this nucleosome does not exist at the exon1/intron1 boundary and thus does not allow for H3.3 deposition and transcript regulation is decreased due to RNA Pol II difficulty transcribing through the region. This is exacerbated further down the gene body by a pile-up of nucleosomes on the paternal allele perturbing any residual RNA Pol II from successfully transcribing the Xlr3b transcript and thus silencing the paternal allele.

TKTL1 Expression in Human ASD Brain Sub-regions

Oxidative stress has been closely linked to the development of neurodegenerative disorders such as ASD208,209,210. The literature has connected over expression of TKTL1 to increased levels of oxidative stress in certain cancers including colorectal211 and several carcinomas212. TKTL1 upregulation has been identified as a necessary event for tumor cells to proliferate and degrade glucose through the

97 anaerobic transketolase-dependent PPP, known as the Warburg effect, for tumor cell proliferation189. Due to this link to the PPP and production of substrates through the degradation of glucose, TKTL1 is also linked to cellular oxidative stress by influencing the reduction of glutathione to interact and reduce reactive oxygen species in the cell208,209,210. This role directly links Transketolase activity to the development of neurodegenerative disorders such as ASD along with cancer9. When I analyzed the relative expression of TKTL1 I expected to find over expression prevalent in autistic males due to the approximately 5:1 ratio of ASD in males to females1. However, the data showed no statistical significance in either parietal or occipital brain tissue for autistic males but did show a statistically significant difference in TKTL1 expression in autistic female parietal brain samples. This observation can be explained by the Female

Protective Effect20 which states that females require a greater etiologic load to manifest autistic behavioral impairment and thus this upregulation of TKTL1 in males may be embryo lethal or so detrimental that the outcome is not ASD, but a more severe neurological disorder. Another possible explanation is that male samples obtained for this study were collected at varying post mortem intervals (PMI), and among them there was a high rate of asphyxiation related causes of death. These factors could have made it more difficult to observe a statistically significant difference in male TKTL1 expression assays. It has been hypothesized in the literature that a single X chromosome locus could mediate the Female Protective Effect and in turn produce the male sex bias observed in ASD21. This also supports the hypothesis of Skuse et al. that there are imprinted genes on the X chromosome responsible for deficits in neurocognitive ability2.

98 An approximate 3-4 fold increase of relative TKTL1 expression was observed in brain samples from autistic females obtained from the Neurobiobank. This data suggests a sub-region specific increase of TKTL1 may contribute to the etiologic load in females resulting in the development of neurodegenerative disorders such as ASD. Dr.

Nesbitt had previously identified TKTL1 to be imprinted in human fetal brain and this imprint could support the Skuse hypothesis of imprinted X chromosome genes influencing neurocognitive acuity. Data of this imprint was also supported by my SNP identification assay where monoallelic expression of TKTL1 was observed in neurotypical female parietal brain samples while biallelic expression was observed in

ASD female parietal brain samples. This data suggests that there is a genomic imprint of TKTL1 on the human X chromosome even in adulthood and a subregion-specific loss of this imprint could lead to aberrant expression of TKTL1 and eventually the development of neurodegenerative disorders.

It would be important to investigate these findings in a larger representative sample of individuals to give higher confidence this effect is taking place. An N=4 for both female neurotypical and ASD brain samples does suggest a pattern of expression for TKTL1. Repeating these experiments with additional cohorts would greatly increase the significance of the finding and add to the overall knowledge of female ASD development and potential therapeutic interventions in the field. However, it would be important to ensure the samples provided were from identical brain sub-regions.

Another further experiment that would provide insight into the development of ASD would be RNA-Seq. Using the Illumina Next-Seq to sequence the RNA profile of ASD brain sub-regions would give an overall view of transcript differences across the entire X

99 chromosome. Comparing neurotypical brain RNA-Seq data to ASD data would provide inferences to differentially expressed regions and genes on the X chromosome and would allow for identification and further analysis of gene targets. Understanding the development of ASD is very important with the pervasive nature of the disorder and a larger sample size along with RNA-Seq would contribute to the collective knowledge in the field of ASD.

TKTL1 Protein Function and Influence on Endogenous Tkt

As previously discussed, transketolase is a rate limiting enzyme in the PPP regulating production of products to ensure proper cell homeostasis during the non- oxidative phase158,171. TKTL1 has been implicated in neurodegenerative disorders such as ASD and schizophrenia along with several forms of cancer including colorectal, breast and other carcinomas189,190. However, the overall function and role of the protein itself is not well classified in the literature. In fact, there is a disagreement whether

TKTL1 even maintains functional transketolase activity152. With this lack of knowledge in the literature, investigating the function of TKTL1 and determining its role in the PPP was important to this body of work. To do this I transfected HT22 hippocampal mouse cell lines with the human TKTL1 cDNA transcript. In doing this I aimed to elucidate the relationship between TKTL1 and endogenous Tkt. Previously in this dissertation TKTL1 and Tkt were explained to share a conserved phosphorylation site, Thr382, which has been shown to be essential for Tkt activation and proper function. Akt as part of the mTOR pathway phosphorylates Tkt at Thr382 activating the enzyme and allowing it to function properly in the PPP catalyzing reactions of substrates in the non-oxidative phase145. Knowing this phosphorylation is essential for Tkt activation, I hypothesized

100 that because of the conserved Thr382 domain, TKTL1 would act in competition with Tkt for phosphorylation and thus reduce the presence of activated Tkt in cells. As the data showed, TKTL1 was in fact phosphorylated by an interaction with Akt and subsequently the total amount of phosphorylated Tkt was reduced.

Additionally, transgenic mice were generated with the help of Roman Goz of the

LoTurco laboratory. CD1 fetal mice expressed TKTL1 transcript within the brain after In

Utero Electroporation (IUE). Extracting the brain tissue from these mice at P0, the interaction of TKTL1 and Tkt was investigated in vivo and the same competitive phosphorylation was observed. A constitutively active TKTL1 construct could be engineered to over express TKTL1 specifically in mouse brain. This experiment would generate transgenic mice with a similar expression profile to female ASD individuals and could be used to further investigate the effects of TKTL1 over expression on the

PPP and overall brain development. Potentially this model could be used for behavioral studies to determine if the mice present an ASD phenotype by decreased learning, repetitive behavior, and social deficiency.

A transgenic Tkt knock-out cell line could be generated to study the function of

Tktl1. To elucidate whether Tktl1 truly maintains the transketolase activity, these cell lines could be treated with an integrating Tktl1 expression vector to determine if over expressing Tktl1 would rescue the cell line, thus proving that Tktl1 can in fact function as a transketolase in the PPP. This could prove to be a difficult experiment as knocking out Tkt would potentially drastically inhibit the cells ability to proliferate213. However, if

Tktl1 is able to rescue the cell line and allow them to proliferate it would provide substantial evidence supporting Tktl1 transketolase activity. With how pervasive TKTL1

101 is in the literature relating to such prevalent disorders such as ASD and cancer, it is important to examine the function of the protein thoroughly. Thus far in the literature, the overall function of the protein has yet to be classified and these transgenic models would contribute to the knowledge of TKTL1 in the field and provide crucial insight to the role the gene plays in both neurodegenerative disorders and cancer.

The following models (Figures 42&43) depict the proposed competition observed in the data examined in this dissertation. The data suggests that TKTL1 actively competes with Tkt for phosphorylation by Akt. This competition inhibits the activation of

Tkt and ultimately leads to higher levels of oxidative stress in the cell due to the inability of Tkt to function properly in the non-oxidative phase of the PPP. This lack of catalyzing reactions results in a decrease in NADPH production lowering the amount of reduced glutathione able to interact and reduce reactive oxygen species like hydrogen peroxide.

High levels of ROS in the cell leads to oxidative stress, lipid peroxidation, and neurodegeneration. This provides the first direct link in the literature between TKTL1 and how it potentially leads to the development of neurodevelopmental disorders such as ASD.

102

Figure 42: Model of Typical TKTL1 Phosphorylation Endogenous Tktl1 is expressed in low levels in mouse. When properly regulated and expressed, it does not interfere with the phosphorylation and activation of Tkt by Akt and activated Tkt is able to function properly in the PPP. Tkt limits the reaction and conversion between Ribulose 5-phophate and Glucose 6-phosphate and regulates the production of NADPH and reduced glutathione.

103

Figure 43: Competition Model of Over Expressed TKTL1 in the PPP Over expression of TKTL1 in the cell creates a competition for Akt phosphorylation. When TKTL1 is over abundant, Akt will phosphorylate TKTL1 instead of endogenous Tkt. Phosphorylated TKTL1 is then responsible for catalyzing the reaction in the non-oxidative phase of the PPP. Even if TKTL1 retains the transketolase active, over expression will alter the precursors and products in the PPP. This can result in deleterious effects to NADPH production and lower the amount of reduced glutathione that is able to function as an antioxidant and reduce ROS lowering oxidative stress. This model displays a potential pathway for the increase of oxidative stress in neurons and eventual neurodegeneration seen in disorders such as ASD.

104 Chapter 7: Materials and Methods Animal Breeding and Tissue Collection

All mouse protocols were approved by the University of Connecticut Institutional

Animal Care and Use Committee. A breeding scheme was used to generate X- monosomic mice with a C57 strain background, (Figure 3). C3H/Paf males were mated to C57BL/6J females to produce 39, XM mice. C3H/In(X) females were mated with

C57Bl/6J males to produce 39,XP mice. Neonates were inspected visually to initially determine sex. A DNA extraction from limb tissue was conducted for PCR genotyping using proteinase K digestion. A Y chromosome specific primer set, Smcy, was used to definitively sex the mice. The DXMit130 marker primer set was used to determine between the C56 and C3H X chromosome.

RNA Extractions

Tissue was extracted from mouse, flash frozen in liquid nitrogen, and stored at -

80 until use. Human brain tissue was obtained already frozen from the Chuahan lab and from the Maryland Brain Bank and stored at -80. Brain tissue from both mouse and human were finely chopped with a scalpel and resuspended in buffer. Tissue and cell pellets were extracted with Qiashredder (Qiagen) and RNeasy spin columns (Qiagen) and suspended in DI water.

Bisulfite Sequencing

Chromatin was extracted from mouse 39,XM and 39,XP brain tissue samples.

Samples were treated with bisulfite conversion reagent (Zymo) at 42°C overnight.

Samples were purified and amplified using bisulfite primer sets designed with Zymo

Bisulfite Primer Seeker (Primer Table). Samples were run on the Sanger 3130 and

105 sequence peaks were viewed with FinchTV. Methylated DNA diagrams were generated using BiQ Analyzer from the Max Planck Institute bioinformatics.

Quantitative Real-Time PCR

SYBR Green Supermix (Quanta was used to perform qRT-PCR. Each sample was conducted in triplicate with a 15 uL reaction volume on an IQ thermocycler from

BIORAD. The following protocol was used on the cycler; initial denature (95°C for 5 min); amplification step (95°C for 10 s, 62°C for 20 s, 72°C for 20 s); and final extension step (72°C for 5 min). All primers used for qRT-PCR are provided in the supplemental table of primers. Melt curves and CT values were analyzed using CFX Manager from

BIORAD for multiple products and primer efficiency. CT values were determined and normalized using the relative expression ratio mathematical model (Pfaffl). Significance was calculated with a T-Test.

Micrococcal Nuclease Treatment

Chromatin was extracted from neocortex of three 39,Xm individuals and three

39,Xp individuals via Phenol Chloroform/Isoamyl extraction. The chromatin was digested with 0.5 units of micrococcal nuclease (NEB, MNase) for one hour at 37°C followed by a proteinase K digestion leaving only previously nucleosome bound DNA. DNA was suspended in DI water and analyzed on the Bioanalyzer DNA chip for library size quality. Expected samples sizes were for 147 bp fragments.

SureSelect Target Enrichment

Custom SureSelect (Agilent) cRNA baits were created using the SureDesign

Tool (Agilent). Baits were created on the Mus Musculus X chromosome from chrX:(70,319,680-70,535,867) from the mm9 build (NCBI 37.1). cRNA baits were

106 hybridized according to Agilent SureSelect protocol binding previously treated MNase digested DNA samples and suspended in DI water.

Illumina Mi-Seq Sequencing

SureSelect target enrichment samples were used to create Illumina Mi-Seq libraries. Six libraries were made for the three 39,Xm, and three 39,Xp brain tissue samples. Custom SureSelectXT for Illumina Mi-Seq was used to generate libraries.

Illumina Mi-Seq adapters were added and sequences were paired end. Libraries were quantified on the Bioanalyzer using a DNA HS chip (Agilent). Samples were loaded onto

Mi-Seq with the help of Dr. Bo Reese of the CGI.

Initial data was analyzed with a FastQC report to assess the quality of the run.

Bowtie2 was used to map Mi-Seq reads to the Mus musculus mm10 build (GRCm38) as a reference index genome. Mapped reads were then analyzed with the R packages

NucleR and PING. PING generated a nucleosome positioning map utilizing peak detection. NucleR generated a bed file of the nucleosome positions that could be uploaded to UCSC and compared to other functional elements in the genome.

Bedsubtract was used to generate uniquely mapped reads for the XM and XP alleles.

NormR was also be used to normalize sequencing data and call enrichment of peaks.

Bionano Genomics Methylation Protocol

DNA was extracted from mouse 39,XM and 39,XP neonatal fibroblast tissue cultures. Agarose plugs were made for high molecular weight extraction using the

CHEF Mammalian Genomic DNA kit (BIORAD) Cells were harvested and pelleted in collection tubes at 1000 RPM for 5 minutes. Following this, the plugs are washed with

Wash Buffer and TE Buffer for iterations of 10 minutes at 43°C. Cells are embedded

107 into agarose in a 15 minute incubation at 4°C. Next, there is a 2 hour proteinase K digestion at 50°C followed by an overnight digestion also at 50°C. Plugs are washed with wish buffer while shaking for a total of 2.5 hours at room temperature. The agarose plugs are then melted during a 2 minute incubation at 70°C. Agarose is digested with agarase during a 45 minute incubation at 43°C followed by a drop dialysis for 45 minutes at room temperature. DNA is homogenized overnight at room temperature to get into solution. A dialysis membrane is used to isolate high molecular weight (HMW)

DNA. A Qbit Broad Range DNA analysis determines the concentration of the HMW

DNA. 300 ng of DNA digested with cofactor Adocf640 and M.TaqI enzyme for five hours at 65°C. Proteinase K is added for two hours at 45°C.

The NLR reaction consists of a nick, label and repair. HMW DNA is nicked with the BspQI enzyme for two hours at 37°C. The reaction then is labeled with the Bionano labeling mix for one hour at 72°C. Finally the reaction is repaired with ligase for thirty minutes at 37°C. The DNA backbone is stained with the Bionano staining mastermix overnight at 4°C. The NLR reaction is quantified using the Qbit DNA HS assay.

The sample is loaded onto the Bionano Irys and a sample specific loading protocol is produced with via assisted loading on the machine. Molecules are identified by the program AutoDetect and data is analyzed on IrysView, Bionano Access and

IrysExtract.

Illumina NextSeq OMNI-ATAC-Seq

Mus musculus 39XM and 39XP neonate neocortex were extracted from mice.

Tissue was homogenized in 1x Homogenization buffer with Dounce homogenizer.

Nuclei were isolated through gradation by layering 35%, 29%, and 25% Iodixanol

108 solution, separating cellular debris and the nuclei band. Nuclei were counted with

Trypan blue staining and 50,000 nuclei (Mus musculus) and 500 (Drosophila) nuclei

(spike in) were treated with ATAC-Seq Resuspension buffer followed by transposase reaction buffer to attach preloaded Illumina adapter sequences. Samples were Pre-

Amplified for 5 cycles with 2x NEBNext Master Mix and Illumina adapter and index primers (Primer Table). Cycling conditions; 72°C for 5 minutes, 98°C for 30 seconds, followed by five cycles of 98°C for 10 seconds, 63°C for 30 seconds and 72°C for 1 minute. A qPCR reaction was conducted to assess additional cycles needed for amplification. Cycling conditions; 98°C for 30 seconds followed by 20 cycles of 98°C for

10 seconds, 63°C for 30 seconds and 72°C for 1 minute. Guidelines followed for amplification found in Buenrostro et al 2015 (PMID: 25559105). Library was quantified with KAPA Library Quantification kit and run on Illumina NextSeq with High-Output kit

2x75 paired end reads.

Initial data was analyzed with a FastQC report to assess the quality of the run.

Bowtie2 was used to map Mi-Seq reads to the Mus musculus mm10 build (GRCm38) as a reference index genome. Mitochondrial reads will be removed before building the genome index. Peaks will be called using MACS2 and compared similarly as ChIP-Seq data for enrichment between 39XM and 39XP samples. DNA footprint analysis will be conducted with ATAC-Seq pipeline according to Kundaje lab script on GitHub.

Nucleosome positioning was analyzed using NucleoATAC developed by the

Greenleaf lab at Stanford University generating nucleosome occupancy maps.

Following that, DNA footprint analysis was conducted with the MEME suite to detect transcription factor binding proteins within the Xlr3b open chromatin DNA.

109 Transfection

Mus musculus hippocampal neuronal cell lines (HT-22) were cultured in complete mouse media. A pCAGGs hTKTL1 plasmid (gift from Dr. Loturco and altered by Dr. Nesbitt for hTKTL1) was transfected into cells with the Neon Transfection System

(ThermoFisher). Cells were viewed under CKX41 (Olympus) florescent microscope to confirm the presence of GFP expressing cells to calculate transfection efficiency. Cells were harvested in cell pellets and divided into tubes for RNA and protein extraction. qRT-PCR was performed on cDNA to confirm upregulated expression of human TKTL1 in transfected mouse HT-22 cells.

Western Blot

Protein was extracted from HT-22 cell pellets with Laemmli lysis buffer and protease inhibitors. Laemmli loading buffer was added to protein samples and incubated at 95°C for 5 minutes before loading in gel. 12% SDS page gels were made with the

SureCast system (ThermoFisher) with a 4% stack for wells. Samples were loaded and ran at 125V for 90 minutes. Samples were then transferred to a PDVF membrane by wet transfer with the ThermoFisher SureCast System at 20V for 60 minutes.

Membranes were allowed to dry for 1 hour at room temperature to allow proteins to properly bind to the membrane. Following this membranes were reactivated in methanol for 1 minute and blocked in 50% Licor Blocking Buffer and 50% 1X PBS for 1 hour.

Membranes were blotted with 1° antibody at a dilution of 1/1000 in a 50% Licor Blocking

Buffer and 50% 1X PBS solution with 0.01% Tween for 1.5 hours at room temperature.

Membranes were then washed 4x 5 minutes in 1X PBS + 0.1% tween. The 2° antibody was added in a 1/10000 dilution same blocking solution with 0.1% tween. Membranes

110 were then washed 4x for 5 minutes in blocking solution with tween followed by a final wash in 1X PBS for 5 minutes. Western blots were imaged and analyzed on the Licor system using a Chameleon Duo protein ladder. Densitometry calculations were conducted with ImageJ following the ImageJ protocol for subtracting background.

111 Appendices A B

C D

Figure S1: MNase-Seq Quality Score: Average read quality for 39XM sample Xm73 (A) and 39XP sample Xp100 (B). Sequence length observed ~150 bp for majority of reads in 39XM sample Xm73 (C) and 39XP sample Xp100 (D).

112

Figure S2: ATAC-Seq Run Quality Report The average read quality for (a) 39,XM and (b) 39,XP samples approximately 35 on the Phred+33 scale. A high quality Phred+33 score exists for both (c) 39, XM and (d) 39, XP pools meaning only adapter trimming will be necessary for these reads. The read length going into mapping for (e) 39, XM and (f) 39, XP samples is distinctly 75bp.

113

Figure S3: pCAGGS Vector Map Vector used for transfections and IUE protocols of hTKTL1 cDNA transcripts. Commercially available. Utilized glycerol stocks from Dr. Nesbitt and Amy Friss.

114

Figure S4: 39,XM ATAC-Seq Quality Metrics (A) Proximal Transcription Start Site (TSS) enrichment. Represents enrichment around individual TSS’s. (B) Represents aggregated enrichment at all TSS’s. (C) Fragment length distribution of ATAC-Seq reads. Majority of sequences within nucleosome free regions (<150 bp) ~75 bp. (D) Signal correlation to roadmap DNase (ENCODE) from several sample types measured by the Spearman’s Correlation. The closer the sample is in signal distribution in the regions to your sample, the higher the correlation. Of note, the highest correlated tissues are all members of the central nervous system (CNS).

115

Figure S5: 39,XP ATAC-Seq Quality Metrics (A) Proximal Transcription Start Site (TSS) enrichment. Represents enrichment around individual TSS’s. (B) Represents aggregated enrichment at all TSS’s. (C) Fragment length distribution of ATAC-Seq reads. Majority of sequences within nucleosome free regions (<150 bp) ~75 bp. (D) Signal correlation to roadmap DNase (ENCODE) from several sample types measured by the Spearman’s Correlation. The closer the sample is in signal distribution in the regions to your sample, the higher the correlation. Again, the highest correlation exists within neurological CNS tissues.

116 Sample ID Age Sex Disorder Cause of Death PMI 1174 7.8 F ASD Multi-system failure 14 1182 9 F ASD Smoke Inhal. 24 1638 20.8 F ASD Seizure-related 50 4671 4.6 F ASD Fall 13 1407 9.1 F Control Asthma 20 1706 8.6 F Control Transplant rejection 20 1708 8.1 F Control Car Accident 20 1846 20.6 F Control Car Accident 9 0797 9.3 M ASD Drowning 13 1349 5.6 M ASD Drowning 39 4231 8.8 M ASD Drowning 12 4849 7.5 M ASD Drowning 20 4899 14.3 M ASD Drowning 20 5027 38 M ASD Bowel obstruct. 26 1185 4.7 M Control Drowning 17 1500 6.9 M Control Car Accident 18 4645 39.2 M Control Heart disease 12 4670 4.6 M Control CC accident 17 4722 14.5 M Control Car Accident 16 4898 7.7 M Control Drowning 12

Figure S6: Table of Neurotypical and ASD Brain Sub-Region Samples Table depicts subject ID’s and related information for samples. Of note for this body of work, the high prevalence of drowning or asphyxiation related causes of death and Post- mortem interval (PMI).

117 Primer Table

Primer Name Sequence Ad1_noMX AATGATACGGCGACCACCGAGATCTACACTCGTCGGCAGCGTCAGATGTG Ad2.1_TAAGGCGA CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGGAGATGT Ad2.2_CGTACTA CAAGCAGAAGACGGCATACGAGATCTAGTACGGTCTCGTGGGCTCGGAGATGT Ad2.3_AGGCAGAA CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTCTCGTGGGCTCGGAGATGT Ad2.4_TCCTGAGC CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTCTCGTGGGCTCGGAGATGT G6pdx_Ex8_F1 GGAGAAGCTGCCAATGGATA G6pdx_Ex9_R1 CCAGGCTTCTTGGTCATCAT Gapdh_Mus_Ex2_F2 ACTCCACTCACGGCAAATTC Gapdh_Mus_Ex3_R2 GTGGTTCACACCCATCACAA Hsa_TKTL1_3'UTR_F5 TTGGCCTCTTTACCCTGTGT Hsa_TKTL1_3'UTR_R5 CCTAACAAGCTTTCGCTGCT Hsa_TKTL1_5'UTR_F4 GAGCCCTTTTGAGATTGCAG Hsa_TKTL1_5'UTR_R4 CCCAGTGCGTTTATGTGATG Mus_Tktl1_Ex1_RT_F1 GTTCCATCAAGGCCACAAAT Mus_Tktl1_Ex12_RT_F3 ACAGGTGGCCGAATTATCAC Mus_Tktl1_Ex13_RT_R3 AGATGCATTTCACAGCCACA Mus_Tktl1_Ex3_RT_R1 TAGTCCTTGTCCAGGCCATC Mus_Tktl1_Ex5_RT_F2 AGCTCAAGTGAGAGGCAAGC Mus_Tktl1_Ex6_RT_R2 TGAGGCGAGTCCTCAATAGG TKT_Hsa_Ex2_F CACACCATGCGCTACAAGTC TKT_Hsa_Ex4_R GTCGAAGTATTTGCCGGTGT Tkt_mus_3'UTR_F1 AGGTCCCACAACTCCTCCTT Tkt_mus_3'UTR_R1 GCCCTAAGATCACCCACTGA TKTL1 3' 1F TTTGGCCTCTTTACCCTGTG TKTL1 3' 1R CAGAAGGCACGTGCTGAATA Tktl1 E1F GCTAGTGCCGGAACTTTTTG Tktl1 E1R TAGTGCCTCGCTGCCATCTA TKTL1_ex11_R GCTTTTGCACTGGAGACGAT TKTL1_ex9_F GACCACCCGACCAGAAACTA Tktl1_F19a AGCTCAAGTGAGAGGCAAGC TKTL1_Hsa_5'SNP_1F GCTAGTGCCGGAACTTTTTG TKTL1_Hsa_5'SNP_1R TGGCCATATCTTGCAACACC Tktl1_mus_F23 CCACCCCTACTGCACTGATT Tktl1_mus_F46 CAACACACATTTGCTTTACT Tktl1_mus_R23 ACACCAGAAGAGGGCATCAC Tktl1_mus_R46 CACCTTTGTTCCACATTCAGA TKTL1_Pro_SNP_F1 GCTAGTGCCGGAACTTTTTG TKTL1_Pro_SNP_R1 CAACCCCTTTGGAGTCTGAA Tktl1_R19 TGAGGCGAGTCCTCAATAGG

118 Antibody Table

Antibody Name Company Catalog AKT Pan Monoclonal Antibody Thermo Fisher MA5-14999 Phospho-Threonine Antibody (P-Thr-Polyclonal) Cell Signaling Technology 9381 TKTL1 Polyclonal Antibody Thermo Fisher PA5-28977 β-Actin Antibody N-21 Santa Cruz sc-130656

119 References

1. Baio, Jon et al. “Prevalence of Autism Spectrum Disorder Among Children Aged 8 Years — Autism and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2014.” MMWR Surveillance Summaries 67.6 (2018): 1–23. PMC. Web. 14 June 2018. 2. Skuse, D. H., et al. “Evidence from Turners Syndrome of an Imprinted X-Linked Locus Affecting Cognitive Function.” Nature, vol. 387, no. 6634, 1997, pp. 705–708., doi:10.1038/42706. 3. Barlow, D. P., Bartolomei, M. S.. “Genomic Imprinting in Mammals.” Cold Spring Harbor Perspectives in Biology, vol. 6, no. 2, 2014, doi:10.1101/cshperspect.a018382. 4. Davies, W. “Genomic Imprinting on the X Chromosome: Implications for Brain and Behavioral Phenotypes.” Annals of the New York Academy of Sciences, vol. 1204, 2010, pp. 14–19., doi:10.1111/j.1749-6632.2010.05567.x. 5. Raefski, Adam S, O’Neill, Michael J. “Identification of a Cluster of X-Linked Imprinted Genes in Mice.” Nature Genetics, vol. 37, no. 6, 2005, oo. 620-224. Doi:10.1038/ng1567. 6. Ke, Xiayi, et al. “The Distinguishing Sequence Characteristics of Mouse Imprinted Genes.” Mammalian Genome, vol. 13, no. 11, Jan. 2002, pp. 639–645., doi:10.1007/s00335-002-3038-x. 7. Fombonne, E, et al. “Epidemiology of Pervasive Developmental Disorders.” Autism Spectrum Disorders. Pediatric Research, 2011, pp 90-111. Doi:10.1093/med/9780195371826.003.0007. 8. El-Fishawy, P, State, Matthew W. “The Genetics of Autism: Key Issues, Recent Findings and Clinical Implications.” Psychiatric Clinics of North America. Vol 33, no. 1, 2010, pp. 83-105. Doi:10.1016/j.psc.2009.12.002. 9. Lawler, C. P., et al. (2004). Identifying environmental contributions to autism: Provocative clues and false leads. Mental Retardation and Developmental Disabilities Research Reviews, 10(4), 292-302. doi:10.1002/mrdd.20043. 10. Essa, M. M., et al. (2013). Role of NAD , Oxidative Stress, and Tryptophan Metabolism in Autism Spectrum Disorders. International Journal of Tryptophan Research, 6s1. doi:10.4137/ijtr.s11355. 11. Diagnostic and statistical manual of mental disorders: DSM-5. (2013). Washington, DC: American Psychiatric Publishing. 12. Lord, C., et al. (2006). Autism From 2 to 9 Years of Age. Archives of General Psychiatry, 63(6), 694. doi:10.1001/archpsyc.63.6.694. 13. Andersen, J. K. (2004). Oxidative stress in neurodegeneration: Cause or consequence? Nature Reviews Neuroscience, 10(7). doi:10.1038/nrn1434. 14. Ng, Irene O, et al. “Transketolase Counteracts Oxidative Stress to Drive Cancer Development.” Proceedings of the Nation Academy of Sciences, vol 111, no. 6, 2016, doi:10.1073/pnas.1508779113. 15. Chauhan, Abha, and Ved Chauhan. “Oxidative Stress in Autism.” Pathophysiology, vol. 13, no. 3, 2006, pp. 171–181., doi:10.1016/j.pathophys.2006.05.007.

120 16. Lammert, D. B., & Howell, B. W. (2016). RELN Mutations in Autism Spectrum Disorder. Frontiers in Cellular Neuroscience, 10, 84. doi.org/10.3389/fncel.2016.00084. 17. Ingram, Jennifer L., et al. “Discovery of Allelic Variants of HOXA1 and HOXB1: Genetic Susceptibility to Autism Spectrum Disorders.” Teratology, vol. 62, no. 6, 2000, pp. 393–405., doi:10.1002/1096-9926(200012)62:63.3.co;2-m. 18. Chao, Sung Han, et al. “Autistic-like Behaviour in Scn1a /− Mice and Rescue by Enhanced GABA-Mediated Neurotransmission.” Nature, vol. 489, no. 7416, 2012, pp. 385–390., doi:10.1038/nature11356. 19. Cantor, Rita M., et al. “Replication of Autism Linkage: Fine-Mapping Peak at 17q21.”The American Journal of Human Genetics, vol. 76, no. 6, 2005, pp. 1050– 1056., doi:10.1086/430278. 20. Eichler, E. E., et al. A Higher Mutational Burden in Females Supports a “Female Protective Model” Neurodevelopmental Disorders. American Journal of Human Genetics, 94(3), 415–425. doi.org/10.1016/j.ajhg.2014.02.001. 21. Gockley, J., et al. “The Female Protective Effect in Autism Spectrum Disorder Is Not Mediated by a Single Genetic Locus.” Molecular Autism, vol. 6, no. 1, 2015, doi:10.1186/s13229-015-0014-3. 22. Jorde, L B et al. “Complex Segregation Analysis of Autism.” American Journal of Human Genetics 49.5 (1991): 932–938. 23. Rovet, Joanne F. “The Psychoeducational Characteristics of Children with Turner Syndrome.” Journal of Learning Disabilities, vol. 26, no. 5, 1993, pp. 333–341., doi:10.1177/002221949302600506. 24. Holliday, R. “Epigenetics: A Historical Overview.” Epigenetics, vol. 1, no. 2, 2006, pp. 76–80., doi:10.4161/epi.1.2.2762. 25. Waddington, C. H. “The Epigenotype.” Internation Journal of Epidemiology, vol 41, no. 1, 2011, pp. 10-13., doi:10.1093/ije/dyr184. 26. Hallmayer J, Cleveland S, Torres A, et al. Genetic heritability and shared environmental factors among twin pairs with autism. ArchGen Psychiatry. 2011;68(11):1095-1102. 27. Deng, W. et al. (2015). The Relationship Among Genetic Heritability, Environmental Effects, and Autism Spectrum Disorders. Journal of Child Neurology, 30(13), 1794- 1799. doi:10.1177/088307381558064. 28. Spofford, J. B. “Parental Control of Position-Effect Variegation: I. Parental Heterochromatin and Expression of the White Locus in Compound-X Drosophila Melanogaster.” Proceedings of the National Academy of Sciences, vol 45, no. 7, Jan. 1959, pp. 1003-1007., doi:10.1073/pnas.45.7.1003. 29. Brink, R. A. “Paramutation at the R Locus in Maize.” Cold Spring Harbor Symposia on Quantitative Biology, vol. 23, 1958, pp. 379–391., doi:10.1101/sqb.1958.023.01.036. 30. Ohno, S., et al. “Formation of the Sex Chromatin by a Single X-Chromosome in Liver Cells of Rattus Norvegicus.” Experimental Cell Research, vol 18, no. 2, 1959, pp. 415-418., doi:10.1016/0014-4827(59)90031-x. 31. Takagi, N., Motomichi, S. “Preferential Inactivation of the Paternally Derived X Chromosome in the Extraembryonic Membranes of the Mouse.” Nature, vol. 256, no. 5519, 1975, pp. 640-642., doi:10.1038/256640a0.

121 32. Scott, R.J., Spielman, M. “Epigenetics: Imprinting in Plants and Mammals – the Same but Different?” Current Biology, vol. 14, no. 5, 2004, doi:10.1016/j.cub.2004.02.022. 33. Inoue, A., et al. “Genomic Imprinting of Xist by Maternal H3K27me3.” Genes & Development, vol. 31, no. 19, 2017, pp. 1927–1932., doi:10.1101/gad.304113.117. 34. Trivers, R L. “Parent-Offspring Conflict.” American Zoologist, vol. 14, no. 1, 1974, pp. 249-264., doi:10.1093/icb/14.1.249. 35. Pomiankowski, A., Iwasa, Y. (1999). Sex Specific X Chromosome Expression Caused by Genomic Imprinting. Journal of Theoretical Biology, 197(4), 487-495. doi:10.1006/jtbi.1998.0888. 36. Surani, M, et al. “Development of Reconstituted Mouse Eggs Suggests Imprinting of the Genome During Gametogensis.” Nature, vol. 308, no. 5959, 1984, pp. 548-550., doi:10.1038/308548a0. 37. Surani, M, Barton, S. “Development of Gynogenetic Eggs in the Mouse: Implications for Parthenogenetic Embryos.” Science, vol 222, no. 4627, 1983, pp. 1034-1036., doi:10.1126/science.6648518. 38. Barton, S. et al. “Role of Paternal and Maternal Genomes in Mouse Development.” Nature, vol. 311, no. 5984, 1984, pp. 374-376., doi:10.1038/311374a0. 39. Reik, W et al. “Genomic Imprinting Determines Methylation of Parental Alleles in Transgenic Mice.” Nature, vol. 328, no. 6127, 1987, pp/ 248-251., doi:10.1038/328248a0. 40. Sapienza, C. et al. “Degree of Methylation of Transgenes is Dependent on Gamete of Origin.” Nature, vol 328, no. 6127, 1987, pp. 251-254., doi:10.1038/328251a0. 41. Haig, D, Westoby, M. “Parent-Specific Gene Expression and the Triploid Endosperm.” The American Naturalist, vol 134, no.1, 1989, pp. 147-155., doi:10.1086/284971. 42. Knoll, J. H. M., et al. “Angelman and Prader-Willi Syndromes Share a Common Chromosome 15 Deletion but Differ in Parental Origin of the Deletion.” American Journal of Medical Genetics, vol. 32, no. 2, 1989, pp. 285–290., doi:10.1002/ajmg.1320320235. 43. Wilkins, J.F. “Genomic Imprinting And Conflict-Induced Decanalization.” Evolution, vol. 65, no. 2, 2010, pp. 537–553., doi:10.1111/j.1558-5646.2010.01147.x. 44. Oliver, C., et al. “Genomic Imprinting and the Expression of Affect in Angelman Syndrome: What's in the Smile?” Journal of Child Psychology and Psychiatry, vol. 48, no. 6, 2007, pp. 571–579., doi:10.1111/j.1469-7610.2007.01736.x. 45. Rabinovitz, S., et al. “Mechanisms of Activation of the Paternally Expressed Genes by the Prader-Willi Imprinting Center in the Prader-Willi/Angelman Syndromes Domains.” Proceedings of the National Academy of Sciences, vol. 109, no. 19, 2012, pp. 7403–7408., doi:10.1073/pnas.1116661109. 46. Butler, Merlin G., et al. “Clinical and Cytogenetic Survey of 39 Individuals with Prader-Labhart-Willi Syndrome.” American Journal of Medical Genetics, vol. 23, no. 3, 1986, pp. 793–809., doi:10.1002/ajmg.1320230307. 47. Nicholls, Robert D., et al. “Genetic Imprinting Suggested by Maternal Heterodisomy in Non-Deletion Prader-Willi Syndrome.” Nature, vol. 342, no. 6247, 1989, pp. 281– 285., doi:10.1038/342281a0.

122 48. Scherer, S W. et al. Whole-genome sequencing of quartet families with autism spectrum disorder. Nature Medicine, 2015, 21(2), 185-191. doi:10.1038/nm.3792. 49. Champagne, F. A., et al. (2009). Paternal influence on female behavior: The role of Peg3 in exploration, olfaction, and neuroendocrine regulation of maternal behavior of female mice. Behavioral Neuroscience, 123(3), 469-480. doi:10.1037/a0015060. 50. Creeth, H. D., et al. (2018). Maternal care boosted by paternal imprinting in mammals. PLOS Biology, 16(7). doi:10.1371/journal.pbio.2006599. 51. DeChiara, T M, et al. “Parental Imprinting of the Mouse Insulin-like Growth Factor II Gene.” Cell., U.S. National Library of Medicine, 22 Feb. 1991. 52. Ferguson-Smith, A.c., et al. “Parental-Origin-Specific Epigenetic Modification of the Mouse H19 Gene.” Trends in Genetics, vol. 9, no. 7, 1993, pp. 234–234., doi:10.1016/0168-9525(93)90084-u. 53. Thorvaldsen, J. L., et al. “Deletion of the H19 Differentially Methylated Domain Results in Loss of Imprinted Expression of H19 and Igf2.” Genes & Development, vol. 12, no. 23, Jan. 1998, pp. 3693–3702., doi:10.1101/gad.12.23.3693. 54. Bouwland-Both MI, van Mil NH, Stolk L, Eilers PH, Verbiest MM, Heijmans BT, Tiemeier H, Hofman A, Steegers EA, Jaddoe VW, Steegers-Theunissen RP. DNA methylation of IGF2DMR and H19 is associated with fetal and infant growth: the generation R study. PLoS One. 2013 Dec 12;8(12). 55. Plagge A. “Non-Coding RNAs at the Gnas and Snrpn-Ube3a Imprinted Gene Loci and Their Involvement in Hereditary Disorders.” Frontiers in Genetics. 2012;3:264. doi:10.3389/fgene.2012.00264. 56. Sleutels, F, et al. “The Non-Coding Airn RNA is Required for Silencing Autosomal Imprinted Genes.” Nature, vol 415, no. 6873, 2002, pp. 810-813., doi:10.1038/415810a. 57. Jones, B. K., et al. “Igf2 Imprinting Does Not Require Its Own DNA Methylation or H19 RNA.” Genes & Development, vol. 12, no. 14, 1998, pp. 2200–2207., doi:10.1101/gad.12.14.2200. 58. Santos, K.f., et al. “The Prima Donna of Epigenetics: the Regulation of Gene Expression by DNA Methylation.” Brazilian Journal of Medical and Biological Research, vol. 38, no. 10, 2005, pp. 1531–1541., doi:10.1590/s0100- 879x2005001000010. 59. Tommasi, S., et al. “Investigating the Epigenetic Effects of a Prototype Smoke- Derived Carcinogen in Human Cells.” PLoS ONE, vol. 5, no. 5, 2010, doi:10.1371/journal.pone.0010594. 60. Gavin, D.P., Sharma, R. “Histone Modifications, DNA Methylation, and Schizophrenia.” Neuroscience & Biobehavioral Reviews, vol. 34, no. 6, 2010, pp. 882–888., doi:10.1016/j.neubiorev.2009.10.010 61. Nardone, S., Elliott E. “The Interaction between the Immune System and Epigenetics in the Etiology of Autism Spectrum Disorders.” Frontiers in Neuroscience, vol. 10, 2016, doi:10.3389/fnins.2016.00329. 62. Kanel, T.V., et al. “Quantitative 1-Step DNA Methylation Analysis with Native Genomic DNA as Template.” Clinical Chemistry, vol. 56, no. 7, 2010, pp. 1098– 1106., doi:10.1373/clinchem.2009.142828. 63. Kim, J. K., et al. “Epigenetic Mechanisms in Mammals.” Cellular and Molecular Life Sciences, vol. 66, no. 4, Mar. 2008, pp. 596–612., doi:10.1007/s00018-008-8432-4.

123 64. Ehrlich, Melanie, et al. “Amount and Distribution of 5-Methylcytosine in Human DNA from Different Types of Tissues or Cells.” Nucleic Acids Research, vol. 10, no. 8, 1982, pp. 2709–2721., doi:10.1093/nar/10.8.2709. 65. Zhang, Michael Q., and Ilya P. Ioshikhes. “Large-Scale Human Promoter Mapping Using CpG Islands.” Nature Genetics, vol. 26, no. 1, Jan. 2000, pp. 61–63., doi:10.1038/79189. 66. Corry, Gareth N., et al. “Epigenetic Regulatory Mechanisms during Preimplantation Development.” Birth Defects Research Part C: Embryo Today: Reviews, vol. 87, no. 4, 2009, pp. 297–313., doi:10.1002/bdrc.20165. 67. Hackett, J. A., and M. A. Surani. “DNA Methylation Dynamics during the Mammalian Life Cycle.” Philosophical Transactions of the Royal Society B: Biological Sciences, vol. 368, no. 1609, 2012, pp. 20110328–20110328., doi:10.1098/rstb.2011.0328. 68. Weaver, Jamie R., et al. “Imprinting and Epigenetic Changes in the Early Embryo.” Mammalian Genome, vol. 20, no. 9-10, 2009, pp. 532–543., doi:10.1007/s00335- 009-9225-2. 69. Howell, Carina Y., et al. “Genomic Imprinting Disrupted by a Maternal Effect Mutation in the Dnmt1 Gene.” Cell, vol. 104, no. 6, 2001, pp. 829–838., doi:10.1016/s0092-8674(01)00280-x. 70. Graves, Jennifer A. Marshall. “Sex Chromosome Specialization and Degeneration in Mammals.” Cell, vol. 124, no. 5, 2006, pp. 901–914., doi:10.1016/j.cell.2006.02.024. 71. Ohno, Susumu. “Sex Chromosomes and Sex-Linked Genes.” Monographs on Endocrinology, 1966, doi:10.1007/978-3-662-35113-0. 72. Vicoso, B., Charlesworth, B. “Evolution on the X Chromosome: Unusual Patterns and Processes.” Nature Reviews Genetics, vol. 7, no. 8, 2006, pp. 645–653., doi:10.1038/nrg1914. 73. Larsson, Jan, and Victoria H. Meller. “Dosage Compensation, the Origin and the Afterlife of Sex Chromosomes.” Chromosome Research, vol. 14, no. 4, 2006, pp. 417–431., doi:10.1007/s10577-006-1064-3. 74. da Cunha PR, et al. “Dosage compensation in sciarids is achieved by hypertranscription of the single X chromosome in males.” Genetics. 1994 Nov;138(3):787-90. 75. Meyer BJ. Sex in the wormcounting and compensating X-chromosome dose. Trends Genet. 2000 Jun;16(6):247-53. 76. Meyer BJ. Targeting X chromosomes for repression. Curr. Opin. Genet. Dev. 2010 Apr;20(2):179-89. 77. Reik, Wolf, and Annabelle Lewis. “Co-Evolution of X-Chromosome Inactivation and Imprinting in Mammals.” Nature Reviews Genetics, vol. 6, no. 5, Aug. 2005, pp. 403–410., doi:10.1038/nrg1602. 78. Monkhorst, K., et al. “X Inactivation Counting and Choice Is a Stochastic Process: Evidence for Involvement of an X-Linked Activator.” Cell, vol. 132, no. 3, 2008, pp. 410–421., doi:10.1016/j.cell.2007.12.036. 79. Wu H, Luo J, Yu H, Rattner A, Mo A, Wang Y, Smallwood PM, Erlanger B, Wheelan SJ, Nathans J. Cellular resolution maps of X chromosome inactivation: implication for neural development, function, and disease. Neuron. 2014 Jan 8:81(1).

124 80. Riggs, AD. X chromosome inactivation, differentiation, and DNA methylation. revisited, with a tribute to Susumu Ohno. Cytogenet Genome Res. 2002;99(1-4):17- 24. 81. Russell, L. et al. “Comparative Studies on X-Autosome Translocations in the Mouse. II. Inactivation of Autosomal Loci, Segregation, and Mapping of Autosomal Breakpoints in Five T(x;1)’s.” Genetics64.2 (1970): 281–312. 82. Huynh, K. Lee, J.T. “Inheritance of a Pre-Inactivated Paternal X Chromosome in Early Mouse Embryos.” Nature, vol. 426, no. 6968, 2003, pp. 857–862., doi:10.1038/nature02222. 83. Lee, J. T. “Regulation of X-Chromosome Counting by Tsix and Xite Sequences.” Science, vol. 309, no. 5735, 2005, pp. 768–771., doi:10.1126/science.1113673. 84. Brockdorff, N., et al. “Conservation of Position and Exclusive Expression of Mouse Xist from the Inactive X Chromosome.” Nature, vol. 351, no. 6324, 1991, pp. 329– 331., doi:10.1038/351329a0. 85. Brown, C. J., et al. “A Gene from the Region of the Human X Inactivation Centre Is Expressed Exclusively from the Inactive X Chromosome.” Nature, vol. 349, no. 6304, 1991, pp. 38–44., doi:10.1038/349038a0. 86. Clemson, C. M. “XIST RNA Paints the Inactive X Chromosome at Interphase: Evidence for a Novel RNA Involved in Nuclear/Chromosome Structure.” The Journal of Cell Biology, vol. 132, no. 3, Jan. 1996, pp. 259–275., doi:10.1083/jcb.132.3.259. 87. Lee, J., et al. “Tsix, a Gene Antisense to Xist at the X-Inactivation Centre.” Nature Genetics, vol. 21, no. 4, Jan. 1999, pp. 400–404., doi:10.1038/7734. 88. Bernardino, J., et al. “DNA Methylation of the X Chromosomes of the Human Female: an in Situ Semi-Quantitative Analysis.” Chromosoma, vol. 104, no. 7, 1996, pp. 528–535., doi:10.1007/bf00352117. 89. Heard, Edith, et al. “Methylation of Histone H3 at Lys-9 Is an Early Mark on the X Chromosome during X Inactivation.” Cell, vol. 107, no. 6, 2001, pp. 727–738., doi:10.1016/s0092-8674(01)00598-0. 90. Payer, B. “Developmental Regulation of X-Chromosome Inactivation.” Seminars in Cell & Developmental Biology, vol. 56, 2016, pp. 88–99., doi:10.1016/j.semcdb.2016.04.014. 91. Vallot, C., et al. “Establishment of X Chromosome Inactivation and Epigenomic Features of the Inactive X Depend on Cellular Contexts.” BioEssays, vol. 38, no. 9, 2016, pp. 869–880., doi:10.1002/bies.201600121. 92. Chandra, R., et al. “Female X-Chromosome Mosaicism for gp91phox Expression Diversifies Leukocyte Responses during Endotoxemia.” Critical Care Medicine, vol. 38, no. 10, 2010, pp. 2003–2010., doi:10.1097/ccm.0b013e3181eb9ed6. 93. Patrat, C., et al. “Dynamic Changes in Paternal X-Chromosome Activity during Imprinted X-Chromosome Inactivation in Mice.” Proceedings of the National Academy of Sciences, vol. 106, no. 13, 2009, pp. 5198–5203., doi:10.1073/pnas.0810683106. 94. Sharman, G. B. “Late DNA Replication in the Paternally Derived X Chromosome of Female Kangaroos.” Nature, vol. 230, no. 5291, 1971, pp. 231–232., doi:10.1038/230231a0.

125 95. Cooper, D. W., et al. “Phosphoglycerate Kinase Polymorphism in Kangaroos Provides Further Evidence for Paternal X Inactivation.” Nature New Biology, vol. 230, no. 13, 1971, pp. 155–157., doi:10.1038/newbio230155a0. 96. Engel, N. “Imprinted X Chromosome Inactivation Offers up a Double Dose of Epigenetics.”Proceedings of the National Academy of Sciences, vol. 112, no. 47, 2015, pp. 14408–14409., doi:10.1073/pnas.1520097112. 97. Prudhomme, J., et al. “A Rapid Passage through a Two-Active-X-Chromosome State Accompanies the Switch of Imprinted X-Inactivation Patterns in Mouse Trophoblast Stem Cells.” Epigenetics & Chromatin, vol. 8, no. 1, 2015, doi:10.1186/s13072-015-0044-2. 98. Wu, H., et al. “Cellular Resolution Maps of X Chromosome Inactivation: Implications for Neural Development, Function, and Disease.” Neuron, vol. 81, no. 1, 2014, pp. 103–119., doi:10.1016/j.neuron.2013.10.051. 99. Davies, William, et al. “Xlr3b Is a New Imprinted Candidate for X-Linked Parent-of- Origin Effects on Cognitive Function in Mice.” Nature Genetics, vol. 37, no. 6, 2005, pp. 625–629., doi:10.1038/ng1577. 100. Chang, Diana, et al. “Accounting for EXentricities: Analysis of the X Chromosome in GWAS Reveals X-Linked Genes Implicated in Autoimmune Diseases.” PLoS ONE, vol. 9, no. 12, 2014, doi:10.1371/journal.pone.0113684. 101. Kutsche, Kerstin, et al. “A New Gene Family (FAM9) of Low-Copy Repeats in Xp22.3 Expressed Exclusively in Testis: Implications for Recombinations in This Region.” Genomics, Academic Press, 29 Aug. 2002. 102. Alkan, Can, et al. “Personalized Copy Number and Segmental Duplication Maps Using next-Generation Sequencing.” Nature Genetics, vol. 41, no. 10, 2009, pp. 1061–1067., doi:10.1038/ng.437. 103. Foley, R.J. Insight into the Function of X-linked Lymphocyte Regulated 3 (XLR3) and the Mechanism Regulating its Imprinted Expression (Dissertation). University of Connecticut. 2015. 104. Shi, Y.‐Q. et al. SYCP3‐like X‐linked 2 is expressed in meiotic germ cells and interacts with synaptonemal complex central element protein 2 and histone acetyltransferase TIP60. Gene (2013). doi:10.1016/j.gene.2013.06.033. 105. Tsutsumi, M. et al. Characterization of a Novel Mouse Gene Encoding an SYCP3‐Like Protein That Relocalizes from the XY Body to the Nucleolus During Prophase of Male Meiosis I. Biol. Reprod. 85, 165 –171 (2011). 106. Carone, B. An X-linked imprinted cluster defies the classical mechanisms of epigenetic regulation (Dissertation). University of Connecticut. 2008. 107. Hark, A. T., and S. M. Tilghman. “Chromatin Conformation of the H19 Epigenetic Mark.” Human Molecular Genetics, vol. 7, no. 12, Jan. 1998, pp. 1979–1985., doi:10.1093/hmg/7.12.1979. 108. Murphy, M. Transcriptional Regulation of the Mammalian X Chromosome (Dissertation). University of Connecticut. 2012. 109. Kasowitz, S. Regulation and Evolution of X-linked Lymphocyte Regulated 3b (Dissertation). University of Connecticut. 2012. 110. Sasaki, H. Allele-specific detection of nascent transcripts by fluorescence in situ hybridization reveals temporal and culture-induced changes in Igf2 imprinting during

126 pre-implantation mouse development. Genes to Cells, 2001. 6(3), 249-259. doi:10.1046/j.1365-2443.2001.00417. 111. Grunwald, Assaf, et al. “Reduced Representation Optical Methylation Mapping (R 2 OM 2 ).” Mar. 2017, doi:10.1101/113522. 112. Ebenstein Y., et al. “Channeling DNA for Optical Mapping.” Nature Biotechnology, vol. 30, no. 8, 2012, pp. 762–763., doi:10.1038/nbt.2324. 113. Weinhold, E., et al. “Design of a New Fluorescent Cofactor for DNA Methyltransferases and Sequence-Specific Labeling of DNA.” Journal of the American Chemical Society, vol. 125, no. 12, 2003, pp. 3486–3492., doi:10.1021/ja021106s. 114. Rand, A. C.,et al. Mapping DNA methylation with high-throughput nanopore sequencing. Nature Methods, 2017. 14(4), 411-413. doi:10.1038/nmeth.4189 115. Tanny, J. C. “Chromatin Modification by the RNA Polymerase II Elongation Complex.” Transcription 5.5 (2014): e988093. PMC. Web. 24 Apr. 2018. 116. Teves, S. S., et al. “Transcribing through the Nucleosome.” Trends in Biochemical Sciences, vol. 39, no. 12, 2014, pp. 577–586., doi:10.1016/j.tibs.2014.10.004. 117. Weber, C., et al. Nucleosomes Are Context-Specific, H2A.Z-Modulated Barriers to RNA Polymerase. Molecular Cell, 2014, 53(5), 819-830. doi:10.1016/j.molcel.2014.02.014. 118. Adelman, K. et al. Pausing of RNA Polymerase II Disrupts DNA-Specified Nucleosome Organization to Enable Precise Gene Regulation. Cell, 2010. 143(4), 540-551. doi:10.1016/j.cell.2010.10.004 119. Khoueiry, Rita, et al. “Abnormal Methylation of KCNQ1OT1 and Differential Methylation of H19 Imprinting Control Regions in Human ICSI Embryos.” Zygote, vol. 21, no. 02, 2012, pp. 129–138., doi:10.1017/s0967199411000694. 120. Li, Bing, et al. “The Role of Chromatin during Transcription.” Cell, vol. 128, no. 4, 2007, pp. 707–719., doi:10.1016/j.cell.2007.01.015. 121. Voong, L. N., et al. “Genome-Wide Mapping of the Nucleosome Landscape by Micrococcal Nuclease and Chemical Mapping.” Trends in Genetics, vol. 33, no. 8, 2017, pp. 495–507., doi:10.1016/j.tig.2017.05.007. 122. Freaney, Jonathan E., et al. “High-Density Nucleosome Occupancy Map of Human Chromosome 9p21-22 Reveals Chromatin Organization of the Type I Interferon Gene Cluster.” Journal of Interferon & Cytokine Research, vol. 34, no. 9, 2014, pp. 676–685., doi:10.1089/jir.2013.0118. 123. Asan, et al. “Comprehensive Comparison of Three Commercial Human Whole- Exome Capture Platforms.” Genome Biology, vol. 12, no. 9, 2011, doi:10.1186/gb- 2011-12-9-r95. 124. Trapnell, C, Salzberg, S. L. “How to Map Billions of Short Reads onto Genomes.” Nature Biotechnology, vol. 27, no. 5, 2009, pp. 455–457., doi:10.1038/nbt0509-455. 125. Langmead, B., et al. “Ultrafast and Memory-Efficient Alignment of Short DNA Sequences to the Human Genome.” Genome Biology, vol. 10, no. 3, 2009, doi:10.1186/gb-2009-10-3-r25. 126. Robinson, J. T., et al. “Integrative Genomics Viewer.” Nature Biotechnology, vol. 29, no. 1, 2011, pp. 24–26., doi:10.1038/nbt.1754.

127 127. Thorvaldsdottir, H., et al. “Integrative Genomics Viewer (IGV): High-Performance Genomics Data Visualization and Exploration.” Briefings in Bioinformatics, vol. 14, no. 2, 2012, pp. 178–192., doi:10.1093/bib/bbs017. 128. Helmuth J, Chung H (2017). “normr: Normalization and difference calling in ChIP- seq data.” R package version 1.6.0. 129. Flores O, Orozco M (2011). “nucleR: a package for non-parametric nucleosome positioning.”Bioinformatics, 27, 2149–2150. doi:1093/bioinformatics/btr345. 130. Zhang X, Robertson G, Woo S, Hoffman B, Gottardo R (2012). “Probabilistic Inference for Nucleosome Positioning with MNase-Based or Sonicated Short-Read Data.” PLoS ONE, 7. DOI:10.1371/journal.pone.0032095. 131. Palermo, Richard D., et al. “RNA Polymerase II Stalling Promotes Nucleosome Occlusion and PTEFb Recruitment to Drive Immortalization by Epstein-Barr Virus.” PLoS Pathogens, vol. 7, no. 10, 2011, doi:10.1371/journal.ppat.1002334. 132. Greenleaf, W. J., et al. “Transposition of Native Chromatin for Fast and Sensitive Mulitmodal Analysis of Chromatin Architecture.” Biophysical Journal, vol. 106, no. 2, 2014, doi:10.1016/j.bpj.2013.11.503. 133. Chang, H. Y., et al. “Omni-ATAC-Seq: Improved ATAC-Seq Protocol.” Protocol Exchange, 2017, doi:10.1038/protex.2017.096. 134. Buenrostro, Jason et al. “ATAC-Seq: A Method for Assaying Chromatin Accessibility Genome-Wide.” Current protocols in molecular biology / edited by Frederick M. Ausubel ... [et al.] 109 (2015): 21.29.1–21.29.9. PMC. Web. 24 Apr. 2018. 135. Zhang, Q., et al. “Impact of Three Illumina Library Construction Methods on GC Bias and HLA Genotype Calling.” Human Immunology, vol. 76, no. 2-3, 2015, pp. 166–175., doi:10.1016/j.humimm.2014.12.016. 136. Jin Lee, Grey Christoforo, Grey Christoforo, CS Foo, Chris Probert, Anshul Kundaje, … Mike Dacre. (2016, September 27). kundajelab/atac_dnase_pipelines: 0.3.0 (Version 0.3.0). Zenodo. http://doi.org/10.5281/zenodo.156534. 137. Daugherty, A. C. et al. “Chromatin Accessibility Dynamics Reveal Novel Functional Enhancers in C. Elegans.” Genome Research 27.12 (2017): 2096–2107. PMC. Web. 24 Apr. 2018. 138. Schep, A. N. et al. “Structured Nucleosome Fingerprints Enable High-Resolution Mapping of Chromatin Architecture within Regulatory Regions.” Genome Research 25.11 (2015): 1757–1770. PMC. Web. 24 Apr. 2018. 139. Hoang, D. H., Sung, W. “CWig: Compressed Representation of Wiggle/BedGraph Format.” Bioinformatics, vol. 30, no. 18, 2014, pp. 2543–2550., doi:10.1093/bioinformatics/btu330. 140. Yue, Feng, et al. “A Comparative Encyclopedia of DNA Elements in the Mouse Genome.” Nature, vol. 515, no. 7527, 2014, pp. 355–364., doi:10.1038/nature13992. 141. Krams, S. M., Bromberg, J. S. “Encode.” Annals of Neurology, vol. 73, no. 5, 2013, doi:10.1002/ana.23920. 142. Ahmed, A. J. “The ENCODE Project: Deciphering the Human Genome’s Rosetta Stone.” Journal of the Islamic Medical Association of North America, vol. 40, no. 3, 2010, doi:10.5915/40-3-5488. 143. Diehl, A. G., Boyle. “Deciphering ENCODE.” Trends in Genetics, vol. 32, no. 4, 2016, pp. 238–249., doi:10.1016/j.tig.2016.02.002.

128 144. Bailey, Timothy L., et al. “The MEME Suite.” Nucleic Acids Research, vol. 43, no. W1, July 2015, doi:10.1093/nar/gkv416. 145. He, C., Li, H., Viollet, B., Zou, M.-H., & Xie, Z. (2015). AMPK Suppresses Vascular Inflammation In Vivo by Inhibiting Signal Transducer and Activator of Transcription-1. Diabetes, 64(12), 4285–4297. doi.org/10.2337/db15-0107. 146. Ysebrant de Lendonck, L., Tonon, S., Nguyen, M., Vandevenne, P., Welsby, I., Martinet, V., … Goriely, S. (2013). Interferon regulatory factor 3 controls interleukin- 17 expression in CD8 T lymphocytes. Proceedings of the National Academy of Sciences of the United States of America, 110(34), E3189–E3197. doi.org/10.1073/pnas.1219221110. 147. Glubrecht, D. D., et al. “Differential CRX and OTX2 Expression in Human Retina and Retinoblastoma.” Journal of Neurochemistry, vol. 111, no. 1, 2009, pp. 250– 263., doi:10.1111/j.1471-4159.2009.06322.x. 148. Swaroop, A, et al. “Transcriptional Regulation of Photoreceptor Development and Homeostasis in the Mammalian Retina.” Nature Reviews Neuroscience, vol. 11, no. 8, 2010, pp. 563–576., doi:10.1038/nrn2880. 149. Hennig, A. K., et al. “Regulation of Photoreceptor Gene Expression by Crx- Associated Transcription Factor Network.” Brain Research, vol. 1192, 2008, pp. 114–133., doi:10.1016/j.brainres.2007.06.036. 150. Peng, G.-H., Chen, S. (2011). “Active opsin loci adopt intrachromosomal loops that depend on the photoreceptor transcription factor network.” Proceedings of the National Academy of Sciences of the United States of America, 108(43), 17821– 17826. http://doi.org/10.1073/pnas.1109209108. 151. Fozzatti, L., L., et al. (2011). “Resistance to thyroid hormone is modulated in vivo by the nuclear receptor corepressor (NCOR1).” Proceedings of the National Academy of Sciences of the United States of America, 108(42), 17462–17467. doi.org/10.1073/pnas.1107474108. 152. Minelli, C., et al. “Association of Forced Vital Capacity with the Developmental Gene NCOR2.” Plos One, vol. 11, no. 2, 2016, doi:10.1371/journal.pone.0147388. 153. Zheleznyakova, G. Y., et al. “Methylation Levels of SLC23A2 and NCOR2 Genes Correlate with Spinal Muscular Atrophy Severity.” Plos One, vol. 10, no. 3, 2015, doi:10.1371/journal.pone.0121964. 154. He, D. M., et al. (2014). “The gene expression patterns of BMPR2, EP300, TGFβ2, and TNFAIP3 in B-Lymphoma cells.” Cancer Biology & Medicine, 11(3), 202–207. doi.org/10.7497/j.issn.2095-3941.2014.03.006. 155. Devaiah, B. N., et al. “BRD4 Is a Histone Acetyltransferase That Evicts Nucleosomes from Chromatin.” Nature Structural & Molecular Biology, vol. 23, no. 6, 2016, pp. 540–548., doi:10.1038/nsmb.3228. 156. Kim, T. H., et al. “A High-Resolution Map of Active Promoters in the Human Genome.” Nature, vol. 436, no. 7052, 2005, pp. 876–880., doi:10.1038/nature03877. 157. Rasmussen, E. B., Lis, J. T. “In Vivo Transcriptional Pausing and Cap Formation on Three Drosophila Heat Shock Genes.” Proceedings of the National Academy of Sciences, vol. 90, no. 17, Jan. 1993, pp. 7923–7927., doi:10.1073/pnas.90.17.7923. 158. Adelman, K., et al. “Single Molecule Analysis of RNA Polymerase Elongation Reveals Uniform Kinetic Behavior.” Proceedings of the National Academy of

129 Sciences, vol. 99, no. 21, July 2002, pp. 13538–13543., doi:10.1073/pnas.212358999. 159. Tadigotla, V. R., et al. “Thermodynamic and Kinetic Modeling of Transcriptional Pausing.” Proceedings of the National Academy of Sciences, vol. 103, no. 12, 2006, pp. 4439–4444., doi:10.1073/pnas.0600508103. 160. Padanad, M. S., Riley, B. B. “Pax2/8 Proteins Coordinate Sequential Induction of Otic and Epibranchial Placodes through Differential Regulation of foxi1, sox3 and fgf24.”Developmental Biology, vol. 351, no. 1, 2011, pp. 90–98., doi:10.1016/j.ydbio.2010.12.036. 161. Shioda, N., et al. “Targeting G-Quadruplex DNA as Cognitive Function Therapy for ATR-X Syndrome.” Nature Medicine, vol. 24, no. 6, 2018, pp. 802–813., doi:10.1038/s41591-018-0018-6. 162. Kikin, O., et al. “QGRS Mapper: a Web-Based Server for Predicting G- Quadruplexes in Nucleotide Sequences.” Nucleic Acids Research, vol. 34, no. Web Server, 2006, doi:10.1093/nar/gkl253. 163. Argentaro, A. et al. Structural consequences of disease-causing mutations in the ATRX–DNMT3–DNMT3L (ADD) domain of the chromatin-associated protein ATRX. Proc. Natl Acad. Sci. USA 104, 11939–11944 (2007). 164. Dhayalan, A. et al. The ATRX–ADD domain binds to H3 tail peptides and reads the combined methylation state of K4 and K9. Hum. Mol. Genet. 20, 2195–2203 (2011). 165. Iwase, S. et al. ATRX ADD domain links an atypical histone methylation recognition mechanism to human mental-retardation syndrome. Nat. Struct. Mol. Biol. 18, 769–776 (2011). 166. Lewis, P. W., et al. (2010). “Daxx is an H3.3-specific histone chaperone and cooperates with ATRX in replication-independent chromatin assembly at telomeres.” Proceedings of the National Academy of Sciences of the United States of America, 107(32), 14075–14080. doi.org/10.1073/pnas.1008850107. 167. Goldberg, A. D. et al. Distinct factors control histone variant H3.3 localization at specific genomic regions. Cell 140, 678–691 (2010). 168. Monk, M. “Genomic Imprinting.” Genes & Development, vol. 2, no. 8, 1988, pp. 921–925., doi:10.1101/gad.2.8.921. 169. Chauhan, A, Chauhan V. “Oxidative Stress in Autism.” Pathophysiology, vol. 13, no. 3, 2006, pp. 171–181., doi:10.1016/j.pathophys.2006.05.007. 170. Dong, Y., Wang, M. “Knockdown of TKTL1 Additively Complements Cisplatin- Induced Cytotoxicity in Nasopharyngeal Carcinoma Cells by Regulating the Levels of NADPH and Ribose-5-Phosphate.” Biomedicine & Pharmacotherapy, vol. 85, 2017, pp. 672–678., doi:10.1016/j.biopha.2016.11.078. 171. Moon, E. J., et al. (2010). “NADPH oxidase-mediated reactive oxygen species production activates hypoxia-inducible factor-1 (HIF-1) via the ERK pathway after hyperthermia treatment.” Proceedings of the National Academy of Sciences of the United States of America, 107(47), 20477–20482. doi.org/10.1073/pnas.1006646107. 172. Pfaffl, Michael W., Graham W. Horgan, and Leo Dempfle. “Relative Expression Software Tool (REST©) for Group-Wise Comparison and Statistical Analysis of

130 Relative Expression Results in Real-Time PCR.” Nucleic Acids Research 30.9 (2002): e36. 173. Lambroso, P. J., & Rubenstein, J. L. Development of the Cerebral Cortex: V. Transcription Factors and Brain Development. Journal of the American Academy of Child & Adolescent Psychiatry, 37(5), 1998, 561-562. doi:10.1097/00004583- 199805000-00020 174. Coy, J., et al. “Expression of Transketolase TKTL1 Predicts Colon and Urothelial Cancer Patient Survival: Warburg Effect Reinterpreted.” British Journal of Cancer, vol. 94, no. 4, 2006, pp. 578–585., doi:10.1038/sj.bjc.6602962. 175. Schenk, G., et al. “Properties and Functions of the Thiamin Diphosphate Dependent Enzyme Transketolase.” The International Journal of Biochemistry & Cell Biology, vol. 30, no. 12, 1998, pp. 1297–1318., doi:10.1016/s1357-2725(98)00095-8. 176. Meshalkina, L. E., et al. “Is Transketolase-like Protein, TKTL1, Transketolase?”Biochimica Et Biophysica Acta (BBA) - Molecular Basis of Disease, vol. 1832, no. 3, 2013, pp. 387–390., doi:10.1016/j.bbadis.2012.12.004. 177. Saha, A, et al. “Akt Phosphorylation and Regulation of Transketolase Is a Nodal Point for Amino Acid Control of Purine Synthesis.” Molecular Cell, vol. 55, no. 2, 2014, pp. 264–276., doi:10.1016/j.molcel.2014.05.028. 178. Schneider, S, et al. “A Δ38 Deletion Variant of Human Transketolase as a Model of Transketolase-Like Protein 1 Exhibits No Enzymatic Activity.” PLoS ONE, vol. 7, no. 10, 2012, doi:10.1371/journal.pone.0048321. 179. Bentz, S., et al. “Hypoxia Induces the Expression of Transketolase-Like 1 in Human Colorectal Cancer.” Digestion, vol. 88, no. 3, 2013, pp. 182–192., doi:10.1159/000355015. 180. Sagara, Y, et al. “Cellular Mechanisms of Resistance to Chronic Oxidative Stress.” Free Radical Biology and Medicine, vol. 24, no. 9, 1998, pp. 1375–1389., doi:10.1016/s0891-5849(97)00457-7. 181. Zitzler, J, et al. “High-Throughput Functional Genomics Identifies Genes That Ameliorate Toxicity Due to Oxidative Stress in Neuronal HT-22 Cells.” Molecular & Cellular Proteomics, vol. 3, no. 8, Apr. 2004, pp. 834–840., doi:10.1074/mcp.m400054-mcp200. 182. Grohm, J, et al. “Bid Mediates Fission, Membrane Permeabilization and Peri- Nuclear Accumulation of Mitochondria as a Prerequisite for Oxidative Neuronal Cell Death.” Brain, Behavior, and Immunity, vol. 24, no. 5, 2010, pp. 831–838., doi:10.1016/j.bbi.2009.11.015. 183. Solanki, Naimesh, et al. “Modulating Oxidative Stress Relieves Stress-Induced Behavioral and Cognitive Impairments in Rats.” International Journal of Neuropsychopharmacology, vol. 20, no. 7, 2017, pp. 550–561., doi:10.1093/ijnp/pyx017. 184. Brees, Chantal, and Marc Fransen. “A Cost-Effective Approach to Microporate Mammalian Cells with the Neon Transfection System.” Analytical Biochemistry, vol. 466, 2014, pp. 49–50., doi:10.1016/j.ab.2014.08.017. 185. Hitoshi, N, et al. “Efficient Selection for High-Expression Transfectants with a Novel Eukaryotic Vector.” Gene, vol. 108, no. 2, 1991, pp. 193–199., doi:10.1016/0378-1119(91)90434-d.

131 186. Loturco, J., et al. “New and Improved Tools for In Utero Electroporation Studies of Developing Cerebral Cortex.” Cerebral Cortex, vol. 19, no. suppl 1, 2009, pp. i120–i125., doi:10.1093/cercor/bhp033. 187. Diaz-Moralli, S., et al. “A Key Role for Transketolase-like 1 in Tumor Metabolic Reprogramming.” Oncotarget, vol. 7, no. 32, 2016, doi:10.18632/oncotarget.10429. 188. Sciuto, M. R., et al. “Two-Step Coimmunoprecipitation (TIP) Enables Efficient and Highly Selective Isolation of Native Protein Complexes.” Molecular & Cellular Proteomics, vol. 17, no. 5, 2017, pp. 993–1009., doi:10.1074/mcp.o116.065920. 189. Pan, L, et al. “Protein Competition Switches the Function of COP9 from Self- Renewal to Differentiation.” Nature, vol. 514, no. 7521, 2014, pp. 233–236., doi:10.1038/nature13562. 190. Coy, J. “EDIM-TKTL1/Apo10 Blood Test: An Innate Immune System Based Liquid Biopsy for the Early Detection, Characterization and Targeted Treatment of Cancer.” International Journal of Molecular Sciences, vol. 18, no. 4, 2017, p. 878., doi:10.3390/ijms18040878. 191. Prudent, X., et al. “Controlling for Phylogenetic Relatedness and Evolutionary Rates Improves the Discovery of Associations Between Species and Phenotypic and Genomic Differences.” Molecular Biology and Evolution, vol. 33, no. 8, 2016, pp. 2135–2150., doi:10.1093/molbev/msw098. 192. Antoniewicz, M. R. et al. Evidence for transketolase-like TKTL1 flux in CHO cells based on parallel labeling experiments and 13 C-metabolic flux analysis. Metabolic Engineering, 37, 2016, 72-78. doi:10.1016/j.ymben.2016.05.005. 193. Cascante, M., et al. A key role for transketolase-like 1 in tumor metabolic reprogramming. Oncotarget,7(32), 2016. doi:10.18632/oncotarget.10429. 194. Lopes, Rui, et al. “GRO-Seq, A Tool for Identification of Transcripts Regulating Gene Expression.” Methods in Molecular Biology Promoter Associated RNA, 2017, pp. 45–55., doi:10.1007/978-1-4939-6716-2_3. 195. Lorch, Yahli, et al. “Selective Removal of Promoter Nucleosomes by the RSC Chromatin-Remodeling Complex.” Nature Structural & Molecular Biology, vol. 18, no. 8, Mar. 2011, pp. 881–885., doi:10.1038/nsmb.2072. 196. Flaus, A., Owen-Hughes, T. “Mechanisms for ATP-Dependent Chromatin Remodelling: Farewell to the Tuna-Can Octamer?” Current Opinion in Genetics & Development, vol. 14, no. 2, 2004, pp. 165–173., doi:10.1016/j.gde.2004.01.007. 197. Michod, D., et al. “Calcium-Dependent Dephosphorylation of the Histone Chaperone DAXX Regulates H3.3 Loading and Transcription upon Neuronal Activation.” Neuron, vol. 74, no. 1, 2012, pp. 122–135., doi:10.1016/j.neuron.2012.02.021. 198. Lauberth, S. M., et al. “H3K4me3 Interactions with TAF3 Regulate Preinitiation Complex Assembly and Selective Gene Activation.” Cell, vol. 152, no. 5, 2013, pp. 1021–1036., doi:10.1016/j.cell.2013.01.052. 199. Afek, A., Lukatsky D. B. “Genome-Wide Organization of Eukaryotic Preinitiation Complex Is Influenced by Nonconsensus Protein-DNA Binding.” Biophysical Journal, vol. 104, no. 5, 2013, pp. 1107–1115., doi:10.1016/j.bpj.2013.01.038. 200. Schweikhard, V. et al. (2014). Transcription factors TFIIF and TFIIS promote transcript elongation by RNA polymerase II by synergistic and independent

132 mechanisms. Proceedings of the National Academy of Sciences of the United States of America, 111(18), 6642–6647. doi.org/10.1073/pnas.1405181111. 201. Weber, C., et al. Nucleosomes Are Context-Specific, H2A.Z-Modulated Barriers to RNA Polymerase. Molecular Cell, 53(5), 2014, 819-830. doi:10.1016/j.molcel.2014.02.014. 202. Gilchrist, D. A., et al. Pausing of RNA Polymerase II Disrupts DNA-Specified Nucleosome Organization to Enable Precise Gene Regulation. Cell, 143(4), 2010. 540-551. doi:10.1016/j.cell.2010.10.004. 203. Kulaeva, O. I., et al. Mechanism of transcription through a nucleosome by RNA polymerase II. Biochimica Et Biophysica Acta (BBA) - Gene Regulatory Mechanisms, 1829(1), 2013. 76-83. doi:10.1016/j.bbagrm.2012.08.015. 204. Wilhelm, B. T., et al. Differential patterns of intronic and exonic DNA regions with respect to RNA polymerase II occupancy, nucleosome density and H3K36me3 marking in fission yeast. Genome Biology, 12(8), 2011. doi:10.1186/gb-2011-12-8- r82. 205. Hsieh, F.K., et al. (2013). Histone chaperone FACT action during transcription through chromatin by RNA polymerase II. Proceedings of the National Academy of Sciences of the United States of America, 110(19), 7654–7659. doi.org/10.1073/pnas.1222198110. 206. De Almeida, S. F., et al. “A link between nuclear RNA surveillance, the human exosome and RNA polymerase II transcriptional termination.” Nucleic Acids Res 38, 8015–8026 (2010). 207. Kolasinska-Zwierz, P. et al. Differential chromatin marking of introns and expressed exons by H3K36me3. Nat Genet 41, 376–381 (2009). 208. Schultz, H., et al. “TKTL1 Is Overexpressed in a Large Portion of Non-Small Cell Lung Cancer Specimens.” Diagnostic Pathology, vol. 3, no. 1, 2008, p. 35., doi:10.1186/1746-1596-3-35. 209. Xu, I. M., et al. “Transketolase Counteracts Oxidative Stress to Drive Cancer Development.” Proceedings of the National Academy of Sciences, vol. 113, no. 6, 2016, doi:10.1073/pnas.1508779113. 210. Langbein, S., et al. “Expression of Transketolase TKTL1 Predicts Colon and Urothelial Cancer Patient Survival: Warburg Effect Reinterpreted.” British Journal of Cancer 94.4 (2006): 578–585. PMC. Web. 25 Apr. 2018. 211. Diaz-Moralli, Santiago, et al. “Transketolase-Like 1 Expression Is Modulated during Colorectal Cancer Progression and Metastasis Formation.” PLoS ONE, vol. 6, no. 9, 2011, doi:10.1371/journal.pone.0025323. 212. Sun, W., et al. “TKTL1 Is Activated by Promoter Hypomethylation and Contributes to Head and Neck Squamous Cell Carcinoma Carcinogenesis through Increased Aerobic Glycolysis and HIF1 Stabilization.” Clinical Cancer Research, vol. 16, no. 3, 2010, pp. 857–866., doi:10.1158/1078-0432.ccr-09-2604. 213. Xu, Z.-P., et al. “Transketolase Haploinsufficiency Reduces Adipose Tissue and Female Fertility in Mice.” Molecular and Cellular Biology, vol. 22, no. 17, 2002, pp. 6142–6147., doi:10.1128/mcb.22.17.6142-6147.2002.

133