CALIFORNIA STATE UNIVERSITY, NORTHRIDGE

Promoter Characterization of the Mantle Cell Lymphoma Associated Gene, NAP1L1

A thesis submitted in partial fulfillment of the requirements

For the degree of

Master of Science in Biology

By

Loni M Hands

August 2016

Copyright by Loni Hands 2016

The thesis of Loni Hands is approved by:

______Kerry K. Cooper, Ph.D. Date

______Rheem D. Medh, Ph.D. Date

______Cheryl L. Van Buskirk, Ph.D. Date

______Cindy S. Malone, Ph.D., Chair Date

California State University, Northridge

ACKNOWLEDGMENTS

I would like to thank my P.I., Dr. Cindy Malone, for her constant, strong support throughout my time as a master's student.

I would like to thank my committee for their help and challenging questions during my proposal and defense.

I would like to thank Dr. Daniel Odom for always being available to talk science, reason through protocols, and help me with my lab techniques.

Finally, I would like to thank my family for always supporting me no matter what.

TABLE OF CONTENTS

Signature Page
Acknowledgments
Abstract

Chapter 1: Introduction
    1.1 Importance of gene regulation
    1.2 Consequences of gene dysregulation
    1.3 Gene regulation – Overview
    1.4 Gene regulation – Core promoters and transcription initiation
    1.5 Gene regulation – Proximal promoter elements (PPEs)
    1.6 Gene regulation – Enhancers and silencers
    1.7 Gene expression in mantle cell lymphoma vs. small lymphocytic lymphoma
    1.8 Nucleosome assembly protein 1-like 1 (NAP1L1)
    1.9 Characterizing the NAP1L1 promoter region and enhancer element

Chapter 2: Methods
    2.1 Cloning the NAP1L1 promoter region into pGL3 Basic
    2.2 Cloning a larger fragment of the NAP1L1 promoter region into pGL3 Basic
    2.3 Cloning a region of NAP1L1’s intron 1 into pGL3N-1273 and pGL3 Basic
    2.4 Boiling lysis plasmid purification
    2.5 Restriction digests
    2.6 Transcription factor consensus sequence identification
    2.7 Designing initial deletion constructs of the NAP1L1 promoter region
    2.8 Designing smaller deletion constructs of the NAP1L1 promoter region
    2.9 Mutating the myc binding site within the first intron of NAP1L1
    2.10 Transfection of promoter constructs into human embryonic kidney (HEK293T) cells
    2.11 Harvesting transiently transfected HEK293T cells
    2.12 Dual-Luciferase Assay of transfected pGL3 constructs

Chapter 3: Results
    3.1 Bioinformatics analyses of the NAP1L1 promoter region
    3.2 Cloning the NAP1L1 promoter region into pGL3 Basic
    3.3 Removing the intron from pGL3 Basic containing the NAP1L1 promoter region
    3.4 Cloning the intron into the enhancer region of pGL3 Basic and pGL3N-1273
    3.5 Transient transfections of NAP1L1 promoter deletion constructs revealed regions between -442 bp and -297 bp that may contain a positive regulatory element and -297 bp and -183 bp that may contain a negative regulatory element
    3.6 Preliminary transient transfection results from the smaller NAP1L1 promoter deletions between -442 bp and -297 bp as well as between -297 bp and -183 bp relative to the TSS
    3.7 Transient transfections of the NAP1L1 promoter and intron regions
    3.8 Transient transfections of NAP1L1 promoter and intron regions with the mutated c-Myc consensus sequence

Chapter 4: Discussion
    4.1 Transient transfections of initial deletion constructs revealed a regulatory element located within the region between -442 bp and -297 bp relative to the transcription start site (+1) and smaller deletions of this region suggest that the element lies between -367 bp and -335 bp relative to the TSS
    4.2 Transient transfections of deletion constructs revealed a regulatory element located within the region between -297 bp and -183 bp relative to the transcription start site (+1) and smaller deletions of this region suggest that there is an element that lies between -297 bp and -254 bp and a second element that lies between -254 bp and -215 bp relative to the TSS
    4.3 Transient transfections of initial deletion constructs revealed a potential element between -183 bp and -56 bp relative to the TSS
    4.4 Transfections of intron constructs revealed a functional c-Myc binding site in the first intron of NAP1L1
    4.5 Future directions

References

ABSTRACT

Promoter Characterization of the Mantle Cell Lymphoma Associated Gene, NAP1L1

By

Loni Hands

Master of Science in Biology

The precise regulation of gene expression within a cell determines the structure and function of a given cell type. Dysregulation of normal gene expression may lead to diseases such as cancer. Differences in gene expression among cancers of similar origin may explain the differing phenotypes between more aggressive cancers and less aggressive cancers. Mantle cell lymphoma (MCL), an aggressive cancer, expresses a higher level of the nucleosome assembly protein 1-like 1 (NAP1L1) than small lymphocytic lymphoma (SLL), a less aggressive cancer of similar origin. NAP1L1 is involved in cell cycle and gene regulation, making it a candidate for involvement in cancer progression.

In order to understand the mechanisms behind NAP1L1 expression, the putative regulatory region of NAP1L1 from -1273 bp to +387 bp relative to the transcription start site (TSS, +1) was cloned into the luciferase reporter vector pGL3 Basic. After transient transfections of this construct into human embryonic kidney (HEK293T) cells confirmed the ability of this region to regulate gene expression, deletions of the promoter region were made and revealed the location of a positive regulatory element between -442 bp and -297 bp and a negative regulatory element between -297 bp and -183 bp. Preliminary data from transfections of smaller deletion constructs within each of these regions narrowed down the location of the positive regulatory element to between -367 bp and -335 bp and potentially two negative regulatory elements between -297 bp and -254 bp and between -254 bp and -215 bp. Further research will be necessary to determine the exact identities of these elements.

In order to determine whether there is an enhancer element within the first intron of NAP1L1, as there appears to be in mice (Wu et al., 2008), a region from +382 bp to +1915 bp containing the beginning of the first intron was cloned into the pGL3 construct containing the full isolated promoter region of NAP1L1 (-1273 bp to +387 bp). Transient transfections of this construct with the intron in the “forward” direction resulted in a more than 2-fold increase in luciferase activity compared to the construct containing only the full NAP1L1 promoter region. Interestingly, and unexpectedly, the construct containing the full promoter region of NAP1L1 and the intron region in the reverse direction resulted in luciferase activity that was comparable to that of the construct with the full promoter and no intron. When the transcription factor binding sequence for c-Myc within the intron region was mutated in the construct containing the full promoter region and intron in the forward direction, the results were also similar to those of the construct containing only the full promoter, suggesting a potential role for c-Myc in regulating NAP1L1 in a location-independent, orientation-dependent manner.

CHAPTER 1: INTRODUCTION

1.1 Importance of gene regulation

Proper gene regulation is essential to the normal functions of the cell. Coordinating the development of an organism and ensuring that genes are upregulated or downregulated at the right time during each stage are the responsibilities of gene regulatory processes. Essentially, gene regulatory programs maintain the normal function of a given cell type. Characteristics like cell shape and cell type-specific functions such as immunoglobulin production and secretion by B lymphocytes are determined by the regulated gene expression of that cell type. For example, in the case of immune cells, differences in cell surface receptors and markers between cell types and cell signaling pathways that lead to regulatory responses are important to an appropriate immune system response to an infection. When this regulation is disrupted or prevented in any way, diseases such as cancer or abnormalities in development can result.

1.2 Consequences of gene dysregulation

Dysregulation of certain genes or classes of genes can lead to diseases or problems in development. Cancer, as an example, is a disease that results from the dysregulation of apoptotic and cell cycle genes such as proto-oncogenes. Proto-oncogenes are drivers of the cell cycle; they are directly involved in pushing the cell cycle forward (Vermeulen et al., 2003). When mutated or dysregulated, these oncogenes will cause a cell to continue inappropriately dividing, eventually resulting in the formation of a tumor (Vermeulen et al., 2003). One such example is the oncogene MYC, which encodes the transcription factor c-Myc, which regulates genes involved in moving the cell cycle forward and whose overexpression has been correlated with multiple cancers (Tu et al., 2015; Wu et al., 2008). This transcription factor binds an enhancer box along with the regulator Max to increase expression of cell cycle genes, such as cyclin D1, and drive proliferation (Tu et al., 2015). Uncontrolled cell divisions will result in an accumulation of mutations that will allow for tumor formation and even for the cancer to become metastatic.

1.3 Gene regulation – Overview

In order to understand why the dysregulation of certain genes leads to diseases such as tumor formation, the normal genetic regulation within various cell types must be studied. The genetic composition within a cell is generally the same between cell types of an organism, but it is the differences in genetic regulation that distinguish one cell type from another. This differential regulation maintains cell structure, shape, and functionality of a given cell type. An epithelial cell may express some genes at higher or lower levels than a B lymphocyte, or genes may be completely turned on or off. Whether a vertebrate limb forms, and with what polarity, is determined by specific gene expression in a dose-dependent and even a location-dependent manner (Gehrke and Shubin, 2016). Mutations involved in diseases such as cancer lead to the dysregulation of many genes that compromise the normal functions of a cell.

Genes are regulated through the recognition of various DNA sequences by regulatory proteins that collectively recruit RNA polymerase II (RNAPII) to transcribe a gene or inhibit RNAPII from transcribing the gene. These DNA sequences are grouped into categories termed promoters, enhancers, and silencers. Promoters are composed of a core promoter region spanning from approximately -40 to +40 base pairs (bp) relative to the transcription start site (TSS), denoted +1, and promoter proximal elements located a few hundred base pairs upstream, and sometimes downstream, of the transcription start site.
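As an illustration of this coordinate convention, the short Python sketch below computes the length of a region from TSS-relative coordinates. It assumes, as is conventional, that the TSS is +1 and the base immediately upstream is -1, so that no position 0 exists. The function name fragment_length is hypothetical and is not part of any tool used in this work. Under that assumption the sketch reproduces the fragment sizes reported in the Methods (for example, -563 bp to +868 bp as 1431 bp).

```python
def fragment_length(start: int, end: int) -> int:
    """Length in bp of a region given in TSS-relative coordinates.

    Assumption: the TSS is +1 and the base just upstream is -1,
    so position 0 does not exist.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist when the TSS is +1")
    if start > end:
        raise ValueError("start must be upstream of (or equal to) end")
    length = end - start + 1
    # A region that crosses the TSS skips the nonexistent position 0.
    if start < 0 < end:
        length -= 1
    return length


if __name__ == "__main__":
    print(fragment_length(-40, 40))     # approximate core promoter span
    print(fragment_length(-563, 868))   # first cloned fragment
    print(fragment_length(-1273, 672))  # larger cloned fragment
```

The same helper can be applied to the deletion intervals discussed later, with the caveat that it only counts bases and says nothing about where a regulatory element lies within an interval.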

1.4 Gene regulation – Core promoters and transcription initiation

Core promoters contain short sequences to which basal transcription factors bind. The basal transcription factor machinery is proving to be more complex than originally thought (Danino et al., 2015). Previously, the basal TFs were thought to be similar between genes and cell types, but more recent work has shown that different combinations of subunits can be cell type-specific (Danino et al., 2015; Goodrich and Tjian, 2010). The most studied basal factors include TFIID, which usually contains the TATA-binding protein (TBP) and 13-14 TBP-associated factors (TAFs), as well as TFIIA, TFIIB, TFIIF, TFIIE, and TFIIH (Goodrich and Tjian, 2010). Subunits within the TFIID complex recognize and bind core promoter elements and then other basal factors associate with TFIID (Goodrich and Tjian, 2010; Maston et al., 2006). The combination of various basal factors and the mediator with RNAPII forms the preinitiation complex (PIC) (Dikstein, 2011; Goodrich and Tjian, 2010). The process of transcription initiation differs between genes, but results when a combination of basal TFs and specific TFs assemble and interact to recruit RNAPII to the promoter (Danino et al., 2015; Dikstein, 2011; Goodrich and Tjian, 2010). Depending on the type of promoter, the pattern of transcription initiation will be different (Danino et al., 2015). There are many ways that promoters have been classified, but one of the main classifications is focused versus dispersed promoters (Danino et al., 2015). Focused promoters are more often cell type-specific and are generally characterized by a single transcription start site with core elements, often with a TATA-box, positioned at specific distances from the TSS (Haberle and Lenhard, 2016; Danino et al., 2015). Focused promoters are less common in mammals, but are more highly conserved (Haberle and Lenhard, 2016). Dispersed promoters, on the other hand, are more characteristic of ubiquitously expressed or housekeeping genes, use multiple TSSs across a range of ~100 bp, and are more likely to contain CpG islands (Haberle and Lenhard, 2016). Dispersed promoters are the more common type of promoter in humans (Haberle and Lenhard, 2016). Another classification of promoters is into three classes: Type I, II, and III (Haberle and Lenhard, 2016). These types incorporate epigenetic information as well. Type I promoters are somewhat similar to focused promoters, but include a less organized nucleosome structure around the promoter region. Type II promoters are similar to dispersed promoters and have specific locations of histones on either side of the TSS. Type III promoters are somewhat more dispersed than type II and are more characterized by structural differences. A fourth promoter type for genes involved in translation has been proposed but is less well studied (Haberle and Lenhard, 2016). Still, the majority of differential regulation between cell types comes from the proximal promoter elements, enhancers, and silencers involved in gene expression.

1.5 Gene regulation – Proximal promoter elements (PPEs)

Proximal promoter elements are composed of short sequences to which specific transcription factors bind to increase or decrease transcription (Maston et al., 2006). These groups of proximal promoter elements and corresponding transcription factors not only differ between genes, but the specific transcription factors expressed in one cell type may not be expressed or may be expressed at a different level in a different cell type, leading to variable expression of genes containing those elements. The general mechanism by which these transcription factors bind and regulate expression through the proximal promoter elements has to do with the transcription factor structure, the interaction between these specific TFs, and the interactions between the specific TFs and the basal factors that bind the core promoter (Maston et al., 2006). The common model for the structure of the specific transcription factors is the presence of a DNA-binding domain and an activating domain. The DNA-binding domains vary in structure and the TFs are generally categorized based on the way they recognize and bind DNA sequences (Maston et al., 2006). Often, the specific transcription factors bind DNA as homodimers, interacting with a second copy of the TF, or heterodimers, interacting with a different transcription factor, to regulate transcription (Maston et al., 2006). Many transcription factors will interact with coactivators, which do not directly bind DNA but interact with other proteins bound to DNA in the promoter region, in order to assist in regulating transcription (Maston et al., 2006). These specific transcription factors and their interactions with each other, with coactivators, and with the basal transcription machinery display synergy with regard to their effects on transcription levels (Maston et al., 2006). Although specific proximal promoter elements may have different functions in gene expression, the main functions of PPEs include assisting in PIC assembly at the core promoter, contributing to transcription initiation, core promoter escape, or re-initiation, and recruiting chromatin remodeling complexes to affect the rate of transcription (Maston et al., 2006). The core promoter and proximal promoter elements are location-dependent, so removal or relocation of this region will significantly decrease or eliminate the expression of the corresponding gene.

1.6 Gene regulation – Enhancers and silencers

Enhancer and silencer sequences are also bound by specific transcription factors, but these sequences are location-independent, meaning they can be located relatively far from the genes they regulate and that moving these regulatory sequences will generally not affect gene expression (Maston et al., 2006). Enhancers are bound by activating transcription factors, such as c-Myc, which increase gene expression, and silencers are bound by repressors, transcription factors that decrease gene expression (Maston et al., 2006). Enhancers, bound by specific transcription factors similar to or the same as those that bind PPEs, will loop around to interact with the promoter of a gene, forming an enhanceosome that increases transcription (Maston et al., 2006). This increase in activity seems to be the result of the enhancer complex having the ability to keep specific transcription factors and the basal factor complexes within the area of the promoter so that re-initiation of transcription occurs more frequently (Engel et al., 2016). Gene regulation also seems to be affected by the position of genes in lamina-associated domains (LADs) or topologically associated domains (TADs) within the nucleus. LADs are near the periphery of the nucleus and are associated with genes that are silenced (Engel et al., 2016). TADs are regions within the nucleus, whose boundaries are mostly determined by CTCF and cohesin proteins, that form a unit where DNA is brought more closely together to make it easier for distal enhancers to interact with the promoters they regulate (Engel et al., 2016). The activity of enhancers is also influenced epigenetically by methylated histones (H3K27me3 and H3K4me1), correlating with decreased activity, and acetylated histones (H3K27ac), correlating with increased activity (Engel et al., 2016).

Enhancers were previously characterized as orientation-independent regulatory elements, but there is evidence of enhancers behaving in an orientation-dependent manner (Engel et al., 2016). For example, in different Drosophila species, a regulatory factor called ABD-B contributes to sexually dimorphic coloring by binding to an enhancer element that regulates the yellow gene (Jeong et al., 2006). In these different species, the sequence of the element is slightly different, especially for more distantly related species (Jeong et al., 2006). In one of the species that lacks this sexually dimorphic coloring, the consensus sequence is reversed, or on the opposite strand, compared to species like D. melanogaster, which contain the normal consensus sequence on the forward strand of the gene (Jeong et al., 2006). A gel shift assay revealed that the transcription factor still bound to both sequences but no longer regulated the yellow gene in the species lacking the distinguishing coloring (Jeong et al., 2006). In a second example, an enhancer located between two divergent promoters loses its ability to upregulate one of the genes when inverted in vitro (Swamynathan and Piatigorsky, 2002).

1.7 Gene expression in mantle cell lymphoma vs. small lymphocytic lymphoma

Identifying which of these regulatory elements are involved in the expression of a particular gene between normal cells and abnormal cells (cancerous or of another disease) of the same cell origin can help us understand the differential expression that may be causative of certain diseases as well as differentiate between diseases, such as a more aggressive cancer and a less aggressive cancer of similar cell origin.

As previously mentioned, dysregulation of cell cycle or apoptotic genes within a cell can lead to cancer. Changes in expression of the regulatory factors that bind promoters, as well as the chromosomal mutations that result in the movement of regulatory elements, may lead to increases or decreases in gene expression. For example, mantle cell lymphoma, an aggressive B cell cancer, has a characteristic translocation that moves a regulatory element near the cyclin D1 gene, increasing cyclin D1 expression (Henson et al., 2011). By understanding the specific causes of dysregulation that lead to a disease such as cancer, possible targets for treatment can be identified and tested for efficacy. Two such cancers that display differential gene regulation are mantle cell lymphoma (MCL) and small lymphocytic lymphoma (SLL). Both are derived from similar B-lymphocyte origins, but MCL has a more aggressive phenotype and a patient survival time around 3-4 years, while SLL is less aggressive and has closer to a 10-year patient survival time (Henson et al., 2011). The differential gene regulation between these two lymphomas results in the phenotypic differences between the more aggressive MCL and the less aggressive SLL, and so studying these differences may determine why MCL is more aggressive than SLL and lead to the identification of potential targets for treatment. Using a suppression subtractive hybridization (SSH) method, Henson and colleagues identified the differential gene expression in MCL versus SLL (Henson et al., 2011). One of the genes expressed differently between these two cancers is the nucleosome assembly protein 1-like 1 (NAP1L1), which has the potential to be, in part, causative of a cancerous phenotype.

1.8 Nucleosome assembly protein 1-like 1 (NAP1L1)

Nucleosome assembly proteins are a class of proteins involved in nucleosome assembly, histone chaperoning, and cell cycle and gene regulation (Attia et al., 2011; Okuwaki et al., 2010; Park and Luger, 2006). Nucleosome assembly protein 1 was first discovered in Xenopus laevis, and five nucleosome assembly protein 1-like proteins, NAP1L1 through NAP1L5, have since been found in humans. Two of these NAP1-like proteins are expressed ubiquitously (NAP1L1 and NAP1L4) and the remaining three are expressed solely in neurons (Attia et al., 2011). NAP1L1 has been shown to have an involvement in cell cycle progression and gene regulation (Schimmack et al., 2014). Knockdown of NAP1L1 in BON cells (a human pancreatic neuroendocrine neoplasm cell line) before transplantation into mice resulted in smaller tumors as well as no visible signs of metastasis compared to the transplanted control BON cells, which showed larger tumors and metastasis to the liver and peritoneum, suggesting a role for NAP1L1 in cell proliferation (Schimmack et al., 2014). NAP1L1 involvement in gene regulation can be seen in the results of a ChIP analysis of NAP1L1 binding the promoter of the tumor suppressor p57Kip2 and a correlation between decreased methylation patterns at the p57Kip2 promoter and the absence of NAP1L1 (Schimmack et al., 2014). In mice, NAP1L1 appears to be regulated by c-Myc, as a ChIP-seq of MYC brings down the promoter region (approximately -2000 bp to +2000 bp) of NAP1L1 and two c-Myc consensus sequences are found within the first intron (Wu et al., 2014). There also appears to be a strong correlation between high expression of NAP1L1 and high expression of MYC in human B cell lymphomas, suggesting a potential interaction or regulatory relationship between these two genes in humans as well (Wu et al., 2014). NAP1L1 is also involved in histone chaperoning and has a higher nucleosome disassembly activity than the other ubiquitously expressed nucleosome assembly protein, NAP1L4 (Okuwaki et al., 2010). NAP1L1 overexpression is correlated with many cancers such as pancreatic neuroendocrine neoplasms, small intestinal carcinoids, and mantle cell lymphoma (Kidd et al., 2006; Schimmack et al., 2014; Henson et al., 2011).

1.9 Characterizing the NAP1L1 promoter region and enhancer element

Collectively, these findings suggest that human NAP1L1 may have a causative role in the aggressive nature of mantle cell lymphoma. Evidence of MYC binding the mouse NAP1L1 promoter region suggests that the human homolog may be regulated by the oncogene MYC through a conserved enhancer element within the first intron of NAP1L1 (Wu et al., 2008). NAP1L1 also appears to have a role in regulating cell cycle genes and so when overexpressed this protein may affect cell cycle or apoptotic genes and lead to an increase in proliferation. Due to its role in cell proliferation and gene regulation, it is possible that overexpression of this gene may be causative of the aggressive nature of mantle cell lymphoma. If NAP1L1 is found to be causative of an aggressive phenotype when overexpressed, then it may be a potential target for treatment. Identifying which DNA elements and transcription factors are involved in NAP1L1 expression will provide insight as to the methods involved in the normal regulation of this gene as well as what factors may contribute to the dysregulation of this gene. The goal of this project is to characterize the promoter region and identify a potential enhancer element within the first intron of NAP1L1 in order to elucidate in part the method of regulation of NAP1L1.

CHAPTER 2: METHODS

2.1 Cloning the NAP1L1 promoter region into pGL3 Basic. A region of 1431 base pairs (bp), spanning from -563 bp to +868 bp relative to the transcription start site (+1), was amplified from human genomic DNA by PCR using the forward primer 5’–AGT CAA TTC ATC ATC ACT TG–3’ and reverse primer 5’–TCC GCC GTT CTT TTC ATC TC–3’. The thermocycler was set for an initial denaturation of 95˚C for 5 min, 35 cycles at 95˚C for 30 s, 46˚C for 60 s, and 72˚C for 90 s, and a final extension of 72˚C for 5 min. The largest two of three fragments were gel extracted and purified using QIAGEN’s Gel Extraction Kit and restriction digested with PstI to identify the correct fragment. The 5’ ends of the correct fragment were phosphorylated using the protocol and reagents from Lucigen’s pGC Blue cloning kit and subsequently purified using QIAGEN’s PCR Purification Kit. The purified, phosphorylated fragment was then ligated into pGC Blue using the buffer and T4 DNA ligase from Lucigen’s pGC Blue cloning kit and protocol. To the ligation reaction, 5 µL of this purified fragment (~13.2 ng/µL), 2.5 µL of the 4X pGC Blue Vector Premix, 1 µL of CloneSmart DNA Ligase (2 U/µL), and autoclaved D.I. water to fill to 10 µL were added, mixed, and allowed to incubate at room temperature for 2 h. After 2 h, the ligation reaction was heat denatured at 70˚C for 15 min and transformed into Zymo Research’s Mix & Go Competent Cells following the manufacturer’s protocol. The cells were transformed with 5 µL of the ligation reaction and, before plating, 4 volumes of plain LB were added to the transformation mixture and the cells were allowed to recover for 1 h at 37˚C in a shaking incubator. From this mixture, 100 µL was plated onto one prewarmed (37˚C) LB-kanamycin (50 µg/mL) plate and 10 µL was plated onto a second prewarmed LB-kanamycin plate. The following day, the remaining 400 µL was plated onto two prewarmed LB-kanamycin plates (200 µL per plate). Seven colonies were placed in liquid LB broth containing kanamycin and left overnight at 37˚C in a shaking incubator. The following morning these liquid cultures were boil prepped (see the boiling lysis plasmid purification protocol below). Each plasmid prep was restriction digested with EcoRI to determine if the plasmid contained an insert and with SacI to determine the orientation of the insert. The plasmid containing the promoter region oriented towards the HindIII restriction site in pGC Blue was used for cloning into the luciferase reporter vector pGL3 Basic (Promega). A sequential digest, first with HindIII and then with XhoI, was used to remove the promoter from pGC Blue, and the promoter was then inserted into pGL3 Basic (see the sequential restriction digest details under the restriction digest protocol below). Promega’s 10X Ligase Buffer (300 mM Tris-HCl pH 7.8 at 25˚C, 100 mM MgCl2, 100 mM DTT, and 10 mM ATP) and T4 DNA Ligase (3 U/µL) were used to set up a 10 µL ligation reaction with 5 µL of purified insert (~15.2 ng/µL) and 3 µL of purified pGL3 Basic (~25 ng/µL). Following the protocol provided by Zymo Research, 5 µL of the ligation mixture was transformed into Mix & Go Competent Cells. These cells were plated directly onto prewarmed (37˚C) LB-ampicillin plates. One LB-ampicillin plate received 10 µL of the transformation mixture and another received 95 µL. All 17 colonies from the transformation were placed in 3 mL liquid LB medium containing ampicillin and grown overnight. These overnight cultures were boil prepped and the uncut plasmids were run on a gel next to a sample of uncut pGL3 Basic. Five plasmid preps were then digested with EcoRI to determine the presence of an insert and XbaI to determine the orientation of the insert. Since the reverse primer annealed within the first intron of NAP1L1, the intron needed to be removed from this recombined plasmid. A restriction digest with SpeI removed the intron and the plasmid was ligated onto itself to complete the removal of the intron. The confirmed recombined plasmid was then purified with QIAGEN’s Spin Miniprep Kit and sent for sequencing by Laragen.
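The amounts of insert and vector used in the pGL3 Basic ligation above can be expressed as an approximate insert:vector molar ratio. The Python sketch below is a rough illustration only. It assumes the insert is close to the 1431 bp amplified region (ignoring the short flanking pGC Blue sequence carried over by the HindIII/XhoI digest), and it assumes a length of 4818 bp for pGL3 Basic, a value taken from the manufacturer's published vector information and not stated in this thesis. The function and constant names are hypothetical.

```python
def molar_ratio(insert_ng: float, insert_bp: int,
                vector_ng: float, vector_bp: int) -> float:
    """Insert:vector molar ratio for double-stranded DNA.

    Moles of dsDNA are proportional to mass / length, so the
    per-bp molecular weight cancels out of the ratio.
    """
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# Assumption: pGL3 Basic length from the manufacturer's vector information.
PGL3_BASIC_BP = 4818
# Assumption: insert length approximated by the amplified region.
INSERT_BP = 1431

insert_ng = 5 * 15.2   # 5 uL at ~15.2 ng/uL
vector_ng = 3 * 25.0   # 3 uL at ~25 ng/uL

ratio = molar_ratio(insert_ng, INSERT_BP, vector_ng, PGL3_BASIC_BP)

if __name__ == "__main__":
    print(f"approximate insert:vector molar ratio = {ratio:.2f}:1")
```

Because the DNA concentrations are themselves approximate, the result should be read as an order-of-magnitude check (roughly 3:1 under these assumptions) rather than a precise value.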

2.2 Cloning a larger fragment of the NAP1L1 promoter region into pGL3 Basic. The larger fragment, 1945 bp, extends from -1273 bp to +672 bp relative to the transcription start site (+1). An amplified region of the NAP1L1 promoter was obtained through PCR using the forward primer 5’–AGA AGG GTG TAC AGG AAT AGG C–3’ and reverse primer 5’–GGT GGA GCT CAG AGA ACC TTA G–3’. A primer kinase reaction was set up for the primers following the pGC Blue Cloning Kit protocol (Lucigen). A PCR reaction was then set up using 1 µL of this primer kinase reaction, 50 ng genomic DNA and Promega’s GoTaq Green Master Mix in a 50 µL reaction. A touchdown PCR reaction was programmed into the thermal cycler so that there was an initial denaturation step at 95˚C for 5 min, 2 cycles of 95˚C for 30s, 61˚C for 40 s, and 72˚C for 2 min, 2 cycles of 95˚C for 30s, 58˚C for 40 s, and 72˚C for 2 min, 2 cycles of 95˚C for 30s, 55˚C for 40 s, and 72˚C for 2 min, 2 cycles of 95˚C for 30s, 53˚C for 40 s, and 72˚C for 2 min,

27 cycles of 95˚C for 30s, 51˚C for 40 s, and 72˚C for 2 min, a final extension of 72˚C for

5 min, and a hold at 4˚C. The reaction was run on a gel and the band was extracted and purified using QIAGEN’s Gel Extraction Kit. Using the protocol in the pGC Blue

Cloning Kit, the phosphorylated NAP1L1 promoter region was ligated into pGC Blue using 6.5 µL of the gel extracted promoter (~17.3 ng/µL) and 4 µL of this ligation was

14 transformed into Lucigen’s chemically competent E. Cloni cells with an outgrowth step that includes the addition of 260 µL of Recovery Medium (Lucigen) and incubation at

37˚C (250 RPM) for 1 h before plating onto prewarmed LB plates containing kanamycin

(50 µg/ml). Overnight cultures were set up from resulting colonies and purified by boiling lysis. Purified plasmids were digested with EcoRI and SpeI to confirm identity and orientation of insert. In order to clone directionally into pGL3 Basic, pGL3 Basic and a pGC Blue plasmid containing the NAP1L1 promoter region pointing towards the

HindIII site in pGC Blue was digested first with HindIII and then with XhoI, increasing the salt concentration from 50 mM to 150 mM in between (see sequential digest protocol). Both the digested pGL3 Basic and NAP1L1 promoter region were gel extracted and purified using QIAGEN’s Gel Extraction Kit. A ligation was then set up using 7 µL of pGL3 Basic (~3.9 ng/µL), 1 µL NAP1L1 (~36.1 ng/µL), 1 µL 10X Ligase Buffer

(Promega), and 1 µL T4 DNA Ligase (3 U/µL) (Promega) for a total volume of 10 µL that was incubated at room temperature for 3 h. The ligation was heat-inactivated at ~74˚C for

15 min and 4 µL was transformed into Lucigen’s chemically competent E. Cloni cells following their protocol with the outgrowth step. Overnight cultures of resulting colonies were purified by boiling lysis and confirmed for identity and orientation of insert using

EcoRI and NotI. The confirmed recombined plasmid was then digested with SpeI and ligated onto itself to remove the intron that was initially amplified with the promoter region, leaving the remaining fragment that spans from -1273 bp to +387 bp relative to the transcription start site. The final pGL3N-1273 was then purified using Thermo

Scientific’s GeneJET Plasmid Miniprep Kit for transfection.
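The touchdown program described above can be written out as data, which makes the cycle count easy to check. The following is a minimal sketch using only the temperatures and times given in the text; the function and variable names are illustrative, and ramp times between steps are ignored.

```python
# Touchdown program from section 2.2, written as (cycles, [(temp_C, seconds), ...]).
DENATURE = (95, 30)   # 95 C for 30 s
EXTEND = (72, 120)    # 72 C for 2 min

def touchdown_program(anneal_steps, anneal_seconds=40):
    """Build (cycles, steps) blocks from (cycles, annealing_temp_C) pairs."""
    return [(cycles, [DENATURE, (temp, anneal_seconds), EXTEND])
            for cycles, temp in anneal_steps]

# Annealing steps down 61 -> 58 -> 55 -> 53 C (2 cycles each), then 27 cycles at 51 C.
promoter_program = touchdown_program([(2, 61), (2, 58), (2, 55), (2, 53), (27, 51)])

total_cycles = sum(cycles for cycles, _ in promoter_program)
cycling_seconds = sum(cycles * sum(sec for _, sec in steps)
                      for cycles, steps in promoter_program)
# Add the 5 min initial denaturation and the 5 min final extension.
total_minutes = (cycling_seconds + 2 * 5 * 60) / 60

print(total_cycles)              # 35
print(round(total_minutes, 1))   # run time in minutes, excluding ramping
```

The same helper describes the intron program in section 2.3 by passing the annealing series 59, 56, 53, 51, and 49˚C.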

2.3 Cloning a region of NAP1L1’s intron 1 into pGL3N-1273 and pGL3 Basic. A

1913 bp region containing part of the first intron of NAP1L1, including the putative myc binding site, was amplified by PCR using the forward primer 5’–AAG ATA TGG TGG

GGT GCT TAA C–3’ and reverse primer 5’–TCT AAA ATA CGG GCT CCT TGA G–

3’. The primers were phosphorylated using the primer kinase reagents and protocol from

Lucigen’s pGC Blue Cloning Kit. A 50 µL PCR reaction was set up using 50 ng genomic

DNA, 1 µL primer kinase reaction, and Promega’s 2X GoTaq Green Master Mix. A touchdown PCR was programmed into the thermal cycler so that there was an initial denaturation at 95˚C for 5 min, 2 cycles of 95˚C for 30 s, 59˚C for 40 s, and 72˚C for 2 min, 2 cycles of 95˚C for 30 s, 56˚C for 40 s, and 72˚C for 2 min, 2 cycles of 95˚C for 30 s, 53˚C for 40 s, and 72˚C for 2 min, 2 cycles of 95˚C for 30 s, 51˚C for 40 s, and 72˚C for 2 min, 27 cycles of 95˚C for 30 s, 49˚C for 40 s, and 72˚C for 2 min, a final extension of 72˚C for 5 min, and a hold at 4˚C. The resulting band was gel extracted and purified using QIAGEN’s Gel Extraction Kit. This purified fragment was then ligated into pGC

Blue following Lucigen’s protocol, adding 6.5 µL of the phosphorylated intron fragment

(~26.1 ng/µL) to the reaction. The ligation was inactivated at 74˚C for 15 min and 4 µL of the ligation was transformed into Lucigen’s chemically competent E. Cloni cells including an outgrowth step and the entire mixture was plated onto prewarmed LB plates containing kanamycin. Overnight cultures were set up from resulting colonies and the plasmids were purified by boiling lysis. Plasmid identities and insert orientations were confirmed by digesting with EcoRI and SpeI. In order to clone into pGL3N-1273 and pGL3 Basic, the two vectors were cut with SalI, blunted, treated with calf intestinal phosphatase (CIP), and gel extracted and purified using QIAGEN’s Gel Extraction Kit. The pGC Blue vector with the intron facing

the SpeI site in pGC Blue was cut with SpeI to release the intron from pGC Blue and at the same time remove the majority of the 5’-UTR that was part of the original amplified fragment. This fragment, now 1534 bp, was also blunted, gel extracted, and purified using QIAGEN’s Gel Extraction Kit. The intron was then ligated into pGL3 Basic and pGL3N-1273 using ~40 ng of intron with between 24 and 30 ng of pGL3 Basic and between

18 and 24 ng of pGL3N-1273. These ligations were transformed into chemically competent E. coli. Overnight cultures of resulting colonies were purified by boiling lysis and digested with EcoRI and SacI to confirm identity and intron orientation. Confirmed constructs were miniprepped using Thermo Scientific’s GeneJET Plasmid Miniprep Kit and a second confirmation was done by digesting the plasmids with BglI, SalI & XbaI, and SacI.
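The DNA masses used in the intron ligation can be converted to an approximate insert:vector molar ratio. The sketch below assumes that moles of double-stranded DNA scale with mass divided by length. The vector length of 4818 bp is an assumption (the nominal size of empty pGL3 Basic) and is not stated in the text; the function name is illustrative.

```python
def molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Insert:vector molar ratio; for dsDNA, moles are proportional to mass / length."""
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)

INSERT_BP = 1534   # blunted intron fragment (from the text)
VECTOR_BP = 4818   # ASSUMPTION: nominal size of empty pGL3 Basic

# The text gives ~40 ng of intron with 24 to 30 ng of pGL3 Basic.
for vector_ng in (24, 30):
    ratio = molar_ratio(40, INSERT_BP, vector_ng, VECTOR_BP)
    print(f"{vector_ng} ng vector -> insert:vector of about {ratio:.1f}:1")
```

Under these assumptions the ligations were set up with a several-fold molar excess of insert. The ratio for pGL3N-1273 would differ because that vector is longer.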

2.4 Boiling lysis plasmid purification. A master mix of boil buffer (0.5% Triton X-100,

8% sucrose, 50 mM EDTA, 10 mM Tris, pH 8) and lysozyme (10 mg/mL) was made in a

13:1 ratio, respectively. The master mix was most often made for the number of samples plus one, with each sample requiring 350 µL of master mix (325 µL boil buffer and 25

µL lysozyme). Overnight cultures were decanted into 1.5 mL microcentrifuge tubes and centrifuged at maximum speed (13,000 RPM) for 15 s, waiting until the centrifuge reached maximum speed and then counting 15 s. The supernatant was decanted and another 1.5 mL of culture was added and centrifuged at max speed for 15 s. The supernatant was decanted and tubes were inverted on paper towels to drain as much of the liquid supernatant as possible. To each bacterial pellet, 350 µL of the master mix was added and each tube was dragged across a peg rack to resuspend the pellet. The tubes

were placed in a boiling water bath for 40 to 45 s. Once the tubes were removed, they were allowed to cool to approximately room temperature and then centrifuged at max speed for 5 min. The viscous cell debris containing genomic DNA in each tube was removed carefully with a toothpick. To each tube, 40 µL of 3 M NaOAc was added, the tube was vortexed briefly, 425 µL of isopropanol was added, the tube was vortexed briefly, and the tube was centrifuged at max speed for 5 min. The supernatant was decanted and tubes were inverted to air dry before the pellets were resuspended in 100 µL of

RNase water (10 µg/mL).
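The master mix volumes scale linearly with the number of preps. A minimal sketch, using the per-sample volumes given above and the "number of samples plus one" rule; the function name is illustrative.

```python
BOIL_BUFFER_UL = 325   # boil buffer per sample
LYSOZYME_UL = 25       # lysozyme (10 mg/mL) per sample, giving the 13:1 ratio

def boil_prep_master_mix(n_samples, extra=1):
    """Volumes (in uL) for n_samples preps plus the stated one-sample excess."""
    n = n_samples + extra
    return {
        "boil_buffer_ul": n * BOIL_BUFFER_UL,
        "lysozyme_ul": n * LYSOZYME_UL,
        "total_ul": n * (BOIL_BUFFER_UL + LYSOZYME_UL),
    }

mix = boil_prep_master_mix(6)   # six overnight cultures
print(mix)
```

For six cultures this gives enough mix for seven 350 µL aliquots.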

2.5 Restriction digests. Restriction digests were done according to either Promega’s or

Thermo Fisher’s protocols. For the sequential digests necessary for cloning the original

NAP1L1 promoter region (both the -563 and -1273 promoter regions) there was an adjustment of salt concentration between digests. The first digest with HindIII was done as Promega’s protocol recommends with 10X Buffer B (1X concentration: pH 7.5 @

37˚C, 6 mM Tris-HCl, 6 mM MgCl2, 50 mM NaCl, 1 mM DTT). After heat-inactivating

HindIII, the digest volume was brought up to a final volume of 40 µL with the addition of

3.33 µL 10X Buffer D (1X concentration: pH 7.9 @ 37˚C, 6 mM Tris-HCl, 6 mM

MgCl2, 150 mM NaCl, 1 mM DTT) in order to increase the salt concentration from 50 mM to 150 mM for the XhoI digest. The digests were then run on a gel and desired bands were cut out and purified.
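The volume of 10X Buffer D follows from a simple mass balance on NaCl. The sketch below reproduces the 3.33 µL figure under the assumption that the first (HindIII) digest had a volume of 20 µL; that starting volume is not stated in the text and is inferred from the other numbers. The function name is illustrative.

```python
def stock_volume_for_target(v_initial_ul, c_initial_mM, c_stock_mM, c_target_mM, v_final_ul):
    """Volume of concentrated stock needed for the final mixture to reach c_target_mM."""
    # Amount of salt still missing, in mM * uL, divided by the stock concentration.
    missing = c_target_mM * v_final_ul - c_initial_mM * v_initial_ul
    return missing / c_stock_mM

buffer_d_ul = stock_volume_for_target(
    v_initial_ul=20,     # ASSUMPTION: volume of the HindIII digest in 1X Buffer B
    c_initial_mM=50,     # NaCl in 1X Buffer B
    c_stock_mM=1500,     # NaCl in 10X Buffer D (10 x 150 mM)
    c_target_mM=150,     # NaCl wanted for the XhoI digest
    v_final_ul=40,       # final digest volume
)
print(round(buffer_d_ul, 2))   # 3.33
```

The remaining volume up to 40 µL would be made up by the second enzyme and water.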

2.6 Transcription factor consensus sequence identification. Potential transcription factor binding sites within the promoter region were identified by means of similarity

between the isolated NAP1L1 promoter region sequence (-1273 bp to +387 bp, relative to the transcription start site) and transcription factor consensus sequences using the following bioinformatics software: Match, Alibaba2, and PROMO. The parameters in

Match were set as follows: “group of matrices” to “vertebrates”, “cut-off selection for matrix group” to “to minimize the sum of both error rates”, and “predefined profiles” to “best_selection.prf” (http://www.gene-regulation.com/cgi-bin/pub/programs/match/bin/match.cgi). The parameters for Alibaba2 were left at their defaults: “pairsim to known sites” set to “50”, “mat. width in bp” set to “10”, “min number of sites” set to “4”, “min mat. conservation” set to “75%”, “sim of seq to mat.” set to “1%”, and “factor class level” set to “4 (e.g. RAR-b’)”

(http://www.gene-regulation.com/pub/programs/alibaba2/index.html). The parameters for

PROMO (hosted by ALGGEN) were set so that under “SelectSpecies” the “factor’s species” and “site’s species” were both set to “human, Homo sapiens” and under “SearchSites” the

“maximum matrix dissimilarity rate” was set to “5” (http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3). After a list of putative transcription factor binding sites (TFBSs) was obtained from each program, the lists were compared and TFBSs predicted by at least two of the programs were noted and organized into

Figure 1.
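The comparison of the three output lists amounts to keeping any site reported by two or more programs. The following is a minimal sketch of that filter; the predictions are hypothetical placeholders, not thesis data, and matching sites by an exact (factor, position) pair is a simplifying assumption. The color mapping follows the box colors described for Figure 1.

```python
from collections import defaultdict

# Box colors as described for Figure 1.
BOX_COLORS = {
    frozenset({"Alibaba2", "Match"}): "red",
    frozenset({"Alibaba2", "PROMO"}): "blue",
    frozenset({"Match", "PROMO"}): "green",
    frozenset({"Alibaba2", "Match", "PROMO"}): "yellow",
}

def shared_sites(predictions):
    """Keep sites reported by at least two programs and assign each a box color."""
    support = defaultdict(set)
    for program, sites in predictions.items():
        for site in sites:
            support[site].add(program)
    return {site: BOX_COLORS[frozenset(programs)]
            for site, programs in support.items() if len(programs) >= 2}

# HYPOTHETICAL predictions: (factor name, position relative to +1).
example = {
    "Alibaba2": {("TF_A", -410), ("TF_B", -120), ("TF_C", -60)},
    "Match":    {("TF_A", -410), ("TF_C", -60), ("TF_D", -900)},
    "PROMO":    {("TF_B", -120), ("TF_C", -60), ("TF_E", -30)},
}
for site, color in sorted(shared_sites(example).items()):
    print(site, color)
```

In practice the programs report slightly different coordinates for the same site, so a real comparison would need a tolerance on position rather than exact equality.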

2.7 Designing initial deletion constructs of the NAP1L1 promoter region. Using the results of the transcription factor bioinformatics programs, deletion constructs were designed to remove segments that each increased in size by approximately 200 base pairs while avoiding cuts directly in the middle of potential transcription factor

binding sites. Using site-directed mutagenesis (QuikChange Site-Directed Mutagenesis

Kit, Agilent), single nucleotide changes were made to introduce restriction sites within the promoter regions at locations -442 bp, -297 bp, -183 bp, and -56 bp relative to the transcription start site (+1). The primers containing the mutations (bolded) for each of the deletion constructs can be found in Table 1. The primers for constructs pGL3N-442, pGL3N-297, and pGL3N-183 were designed by centering the mutation within the primer and encompassing a region that extended ~15 nucleotides on either side of the mutation.

The primers were designed by hand except for the primers for pGL3N-56, which were designed by Agilent’s free online QuikChange Primer Design program

(http://www.genomics.agilent.com/primerDesignProgram.jsp). Using the protocol and reagents provided by Agilent’s QuikChange Site-Directed Mutagenesis Kit, the first set of deletion constructs (pGL3N-442, -297, -183, and -56) was made. The PCR protocol and DpnI digestion were followed exactly as described in the protocol. The PCR products were then transformed into chemically competent E. coli DH5α. Resulting colonies were set up in overnight liquid cultures (LB-Amp) and boil prepped the following day. The purified plasmids were digested using the corresponding restriction enzyme designed to take out a fragment of the promoter. After confirming the removal of a piece of the promoter from each construct on a gel, the larger fragments were gel extracted using

QIAGEN’s Gel Extraction Kit. The fragments were ligated on themselves using

Promega’s 10X Ligase Buffer and T4 DNA Ligase (3 U/µL) in a 10 µL reaction. The ligation mixtures were transformed into chemically competent E. coli DH5α. The resulting colonies were set up in overnight cultures and boil prepped the following day.

These purified plasmids were digested with KpnI and XbaI to confirm their identities.

The plasmids were repurified (from the corresponding colonies) using QIAGEN’s Spin

Miniprep Kit and sent out for sequencing at Laragen.
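The hand-design rule described above (center the single-base change and extend about 15 nucleotides on either side) can be sketched as follows. The template sequence is hypothetical and chosen only so that one base change creates an EcoRI site; the function names are illustrative and this is not the Agilent design program.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def mutagenic_primers(template, index, new_base, flank=15):
    """Return (forward, reverse) primers centered on a single-base change.

    index is the 0-based position of the base to change; the template must
    contain at least `flank` bases on each side of it.
    """
    if index - flank < 0 or index + flank >= len(template):
        raise ValueError("not enough flanking sequence around the mutation")
    mutated = template[:index] + new_base + template[index + 1:]
    forward = mutated[index - flank:index + flank + 1]
    return forward, reverse_complement(forward)

# HYPOTHETICAL template: changing the final G of GAATTG to C creates GAATTC (EcoRI).
template = "TTGACCGTAGCTAGGCA" + "GAATTG" + "CCATGCTAGGATCCAAT"
forward, reverse = mutagenic_primers(template, index=22, new_base="C")
print(forward)
print(reverse)
```

Each primer is 31 nucleotides long with the changed base at its center, and both primers carry the new site because GAATTC is palindromic.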

2.8 Designing smaller deletion constructs of the NAP1L1 promoter region. All primers for site-directed mutagenesis can be found in Table 1 and were designed by hand except for pGL3N-335, which was designed by Agilent’s free online QuikChange Primer

Design program (http://www.genomics.agilent.com/primerDesignProgram.jsp). The second set of deletion constructs was made using a similar method to the first set. The constructs pGL3N-398, pGL3N-335, and pGL3N-215 were made following the

QuikChange protocol with a few small adjustments. The PCR reactions contained a lower annealing temperature of 45˚C and the extension time was increased to 7 min. The amount of template (pGL3N-1273) that was used was 25 ng and the remainder of the

PCR protocol remained the same. The PCRs were digested with DpnI and 5 µL of each

PCR was transformed into Mix & Plate chemically competent E. coli cells (GeneChoice), including an outgrowth with the addition of 220 µL Recovery Medium (Lucigen) and incubation at 37˚C (250 RPM) for 1 h before plating the entire mixture onto prewarmed

LB plates containing ampicillin. Overnight cultures of the resulting colonies were purified by boiling lysis and the purified plasmids were double-digested with SmaI &

BglII (pGL3N-398), SmaI & NdeI (pGL3N-335), and SmaI & EcoRV (pGL3N-215). The digests were run on a gel and the larger band was cut out and purified using QIAGEN’s

Gel Extraction Kit. The linearized plasmids pGL3N-398 and pGL3N-335 were cut with

BglII and NdeI, respectively, and so were blunted with Klenow and repurified with

QIAGEN’s PCR Purification Kit. The other construct, pGL3N-215, was cut with EcoRV

which leaves a blunt end and so did not require these additional steps. The linearized, blunted plasmids were then ligated onto themselves using Promega’s 10X Ligase Buffer and T4 DNA Ligase (3 U/µL). To each reaction 8 µL of plasmid was added (purified plasmid concentrations: pGL3N-398 = ~5.1 ng/µL, pGL3N-335 = ~6.6 ng/µL, and pGL3N-215 = ~3.9 ng/µL). The ligations were heat-inactivated before transforming 5 µL of each into Mix & Plate chemically competent E. coli cells (GeneChoice), including an outgrowth with the addition of 220 µL Recovery Medium (Lucigen) and incubation at

37˚C (250 RPM) for 1 h before plating the entire mixture onto prewarmed LB plates containing ampicillin. Overnight cultures of resulting colonies were purified by boiling lysis and all purified plasmids were digested with BglI and KpnI to confirm the deletion of a region of the promoter. Confirmed deletion constructs were then purified using

Thermo Scientific’s GeneJET Plasmid Miniprep Kit.

For pGL3N-367 and pGL3N-254 a different proofreading polymerase was used, Phusion

DNA polymerase (Thermo Scientific), so some adjustments to the protocol were made.

The PCR reaction for pGL3N-254 contained 10 µL 5X Phusion HF Buffer, 50 ng template DNA (pGL3N-1273), 125 ng of each primer (Table 1), 1 µL dNTP Mix (10 mM each), and 0.5 µL Phusion DNA polymerase (2 U/µL). The thermocycler was set with an initial denaturation step of 95˚C for 30 s, 18 cycles of 95˚C for 30 s, 45˚C for 1 min, 72˚C for 6 min 40 s, and a hold at 4˚C. PCR reactions were then digested with DpnI for ~2 h,

EtOH precipitated, and resuspended in 4 µL Elution Buffer (Thermo Scientific). The entire 4 µL was then transformed into Lucigen’s chemically competent E. Cloni cells.

Overnight cultures of the resulting colonies were purified through boiling lysis. The purified plasmids were digested first with SmaI and then with BglII using Promega’s

buffers, increasing the salt concentration for the BglII digest using 10X Buffer D as previously described (see sequential digest protocol). This digest was run on a gel and the larger band was cut out of the gel and purified using QIAGEN’s Gel Extraction Kit. The purified, linear fragment was blunted using Klenow as BglII produces overhangs.

Klenow was heat-inactivated and the blunted fragment was re-purified using QIAGEN’s

PCR Purification Kit. The vector was then ligated onto itself (8 µL at ~8.3 ng/µL) using

Promega’s 10X Ligase Buffer (1 µL) and T4 DNA Ligase (1 µL at 3 U/µL) for a total reaction volume of 10 µL. The ligation reaction was heat-inactivated at ~70˚C for 15 min and 5 µL was transformed into Mix & Plate chemically competent E. coli cells

(GeneChoice), including an outgrowth with 420 µL of Recovery Medium (Lucigen) and incubation at 37˚C (250 RPM) for 1 h. The entire transformation mixture was then plated onto prewarmed LB plates containing ampicillin. Overnight cultures of the resulting colonies were purified by boiling lysis. The purified plasmids were digested with BglI and KpnI separately to confirm the deletion of a fragment of the promoter region.

Confirmed constructs were then purified using Thermo Scientific’s GeneJET Plasmid

Miniprep Kit.

The construct pGL3N-367 was made similarly to pGL3N-254. The PCR reaction, DpnI digestion, EtOH precipitation, transformation, and boiling lysis plasmid purification were identical to the conditions for pGL3N-254 except for the primers used in the PCR reaction (Table 1). This construct was then digested first with SmaI and then PvuII with no increase in salt concentration as it was not required for PvuII. The digests were run on a gel and the upper band was cut out and purified using QIAGEN’s Gel Extraction Kit.

As SmaI and PvuII are restriction enzymes that result in blunt-ended DNA strands, the

plasmid was directly ligated onto itself (6 µL) using Promega’s 10X Ligase Buffer (1 µL) and T4 DNA Ligase (1 µL at 3 U/µL) for a total reaction volume of 10 µL. The reaction was incubated at room temperature for 3 h, inactivated at ~70˚C for 15 min, and 4 µL was transformed into Mix & Plate chemically competent E. coli cells (GeneChoice), including an outgrowth step with 420 µL Recovery Medium (Lucigen) and incubation at

37˚C (250 RPM) for 1 h, before the entire mixture was plated onto prewarmed LB plates containing ampicillin. Overnight cultures of resulting colonies were purified by boiling lysis and the purified plasmids were digested with BglI and KpnI separately to confirm the deletion of a fragment of the promoter region. The confirmed plasmid was then purified using Thermo Scientific’s GeneJET Plasmid Miniprep Kit.

2.9 Mutating the myc binding site within the first intron of NAP1L1. In order to mutate the consensus sequence for the myc binding site within the first intron of

NAP1L1 the QuikChange site-directed mutagenesis kit was used with some different reagents and adjustments to the protocol, including some influence from the QuikChange

XL site-directed mutagenesis kit as the template exceeded 8000 bp. The primers used to mutate the sequence can be found in Table 1. Thermo Fisher’s Phusion DNA polymerase was used in this mutagenesis instead of the PfuTurbo DNA polymerase from the

QuikChange site-directed mutagenesis kit. Four 50 µL PCR reactions were set up, containing 10 µL of 5X Phusion HF buffer, 50 ng of template (pGL3N-1273 + intron(F)),

125 ng of each primer (see Table 1), 1 µL of dNTP mix (10 mM each), 3 µL DMSO, and

1 U Phusion DNA polymerase. The thermocycler was set with an initial denaturation step at 95˚C for 1 min, 18 cycles of 95˚C for 50 s, 45˚C for 50 s, and 72˚C (Phusion PCR

protocol specified 72˚C for an extension temperature) for 8 min 15 s, a final extension at

72˚C for 7 min, and a hold at 4˚C. These four reactions were first digested with DpnI at

37˚C for 2 h, then the four reactions were combined, EtOH precipitated, and resuspended in 4 µL Elution Buffer (Thermo Scientific). This entire volume was transformed into chemically competent E. coli with an outgrowth step including the addition of 260 µL of recovery medium and incubation at 37˚C (250 RPM) for 1 h before plating the entire mixture onto LB plates containing ampicillin. Overnight cultures were boil prepped and checked for correct plasmid size by digesting with BglI and ApaI. Plasmids of the correct size were miniprepped using Thermo Scientific’s GeneJET Plasmid Miniprep Kit and sent to Laragen for sequencing.

Table 1. Primers used for site-directed mutagenesis of promoter region and intron.

Construct | Promoter region (relative to TSS +1) | Forward primer (5’-3’) | Reverse primer (5’-3’) | Engineered restriction site

pGL3N-442 | -442 to +387 | CCT CAT TAT AAC GCG TCC AGA GAA TGC AC | GTG CAT TCT CTG GAC GCG TTA TAA TGA GG | MluI (ACGCGT)

pGL3N-398 | -398 to +387 | CCA TTT TCA TCC GTC AAA AGA TCT GAA GCC GAA CTG ACG TGG | CGT CAG TTC GGC TTC AGA TCT TTT GAC GGA TGA AAA | BglII (AGATCT)

pGL3N-367 | -367 to +387 | CTG ACG GAC TGT ACA TCA GCT GCG AGG ATG TGG CG | CGC CAC ATC CTC GCA GCT GAT GTA CAG TCC GTC AG | PvuII (CAGCTG)

pGL3N-335 | -335 to +387 | GGG GTT GGG GGG CAT ATG ATG ACG AGC ACC C | GGG TGC TCG TCA TCA TAT GCC CCC CAA CCC C | NdeI (CATATG)

pGL3N-297 | -297 to +387 | GGC GCG CTT GAG CTC CTA TGC TAG CGG GTC AAA AC | GTT TTG ACC CGC TAG CAT AGG AGA AGC TCC GCG CC | NheI (GCTAGC)

pGL3N-254 | -254 to +387 | GAG GCC TTT CCT CTA GAT CTC TTC GGC AAA CTG | CAG TTT GCC GAA GAG ATC TAG AGG AAA GGC CTC | BglII (AGATCT)

pGL3N-215 | -215 to +387 | CTT AGG CGG GGA AGG ATA TCG GTT ACC GGG GGC CGG CGA GTT G | CAA CTC GCC GGC CCC CGG TAA CCG ATA TCC TTC CCC GCC TAA G | EcoRV (GATATC)

pGL3N-183 | -183 to +387 | CCG GCG AGT TGC AGG AAT TCG GCC TTG ACA | TGT CAA GGC CGA ATT CCT GCA ACT CGC CGG | EcoRI (GAATTC)

pGL3N-56 | -56 to +387 | GCC GGG CGC CGG GCA CCC GGG CCA GCC TTG CCC GGC | GGG AAG GCC TGG CCC GGG TGC CCG CGC CCC | SmaI (CCCGGG)

pGL3N-1273 + intron (F) – myc mut | (not relevant) myc binding site was mutated | CGG CGG CTT CGC TCC ATT TGT GGT AAT GGG CTT CAT G | CAT GAA GCC CAT TAC CAC AAA TGG AGC GAA GCC GCC G | Not a restriction site. Mutated the consensus for myc from “CACGTG” to “TTTGTG”
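A quick consistency check on a mutagenic primer pair is that the two primers are reverse complements of one another and that both carry the engineered site. The sketch below applies this to the pGL3N-442 pair; the sequences are as transcribed here from Table 1, and that transcription should be treated as an assumption. The function name is illustrative.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# pGL3N-442 primer pair as transcribed from Table 1 (ASSUMPTION: transcription is correct).
forward = "CCT CAT TAT AAC GCG TCC AGA GAA TGC AC".replace(" ", "")
reverse = "GTG CAT TCT CTG GAC GCG TTA TAA TGA GG".replace(" ", "")
site = "ACGCGT"   # MluI

print(reverse_complement(forward) == reverse)   # True
print(site in forward and site in reverse)      # True
```

A pair that fails the first check is either not fully complementary by design or has been transcribed incorrectly, so the check is a useful guard before ordering or reusing primers.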

2.10 Transfection of promoter constructs into human embryonic kidney (HEK293T) cells. The following protocol is slightly modified from QIAGEN’s Effectene Reagent protocol for transient transfection of adherent cells. HEK293T cells were counted and seeded in 12-well plates at a density of 2 x 10^5 cells per well with 1 mL of DMEM 10%

FBS medium. The number of plates needed was determined by the number of constructs that were to be tested. Each construct was transfected into three separate wells so that the luciferase assay results from the three wells could be averaged for each construct after each transfection. After the cells were seeded, the 12-well plates were placed in the incubator at 37˚C with 5% CO2 for approximately 28-30 h. The plasmid control constructs, pRL-SV40, pGL3 Basic (no promoter), pGL3 Promoter, and pGL3

Control, were transfected alongside all of the NAP1L1 promoter and intron constructs.

All plasmids used in transfections were purified using Thermo Scientific’s GeneJET

Plasmid Miniprep Kit. Plasmid concentrations were determined using a spectrophotometer. On the day of the transfection, master mixes were made and the media on the cells was changed ahead of the transfection procedure. The media was changed after cell confluency was determined. The existing media was removed, cells were washed once with 1 mL of 1X PBS, 800 µL of media was added to each well, and the cells were placed back into the incubator. Master mixes were prepared for the number of wells receiving a pGL3 construct plus one. For a transfection in which one construct was transfected into 3 wells, a master mix was made for 4 wells. The following portion of the protocol describes the procedure for transfection of each pGL3 construct into 3 wells and was scaled up if transfecting each construct into 4 wells. A master mix for each pGL3 construct was prepared in a 2.0 mL microcentrifuge tube by first diluting each

pGL3 construct to 50 ng/µL in Thermo Scientific’s Elution Buffer at a total volume of 16

µL, so that each well received 200 ng of the corresponding pGL3 construct. In a separate microcentrifuge tube, the normalization construct pRL-SV40 was also diluted to 50 ng/µL in Elution Buffer (Thermo Scientific) to a volume large enough to accommodate each well receiving 100 ng of pRL-SV40, so each master mix would receive 8 µL. The diluted pRL-SV40 was then added to each pGL3 master mix tube and the total volume was diluted with Buffer EC (QIAGEN’s Effectene Reagent Kit) to a total volume of 300

µL. As the DNA to Enhancer (QIAGEN) ratio needed to remain constant, 9.6 µL

Enhancer was added to each master mix, the mixtures were vortexed for 1 s, and left to incubate at room temperature for at least 5 min. The tubes were briefly centrifuged to bring the liquid down to the bottom of the tubes and 4 µL of Effectene Reagent

(QIAGEN) was added. The tubes were vortexed for 10 s and incubated at room temperature for at least 10 min. To each tube, 1600 µL warmed media was added and mixed. From each master mix, 375 µL was added to each of 3 wells receiving a specific construct. The plates were swirled and left to incubate at 37˚C with 5% CO2 for approximately 48-56 h.
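The DNA and Enhancer volumes in each master mix follow directly from the per-well amounts. The sketch below reproduces the 16 µL, 8 µL, and 9.6 µL figures for a construct transfected into 3 wells; the Enhancer ratio of 8 µL per µg of DNA is inferred from those numbers (9.6 µL for 1.2 µg) rather than stated in the text, and the function name is illustrative.

```python
def effectene_master_mix(wells, extra_wells=1, stock_ng_per_ul=50.0,
                         pgl3_ng_per_well=200.0, prl_ng_per_well=100.0,
                         enhancer_ul_per_ug=8.0, ec_total_ul=300.0):
    """Component volumes (uL) for one construct's master mix."""
    n = wells + extra_wells                     # wells receiving the construct plus one
    pgl3_ul = n * pgl3_ng_per_well / stock_ng_per_ul
    prl_ul = n * prl_ng_per_well / stock_ng_per_ul
    dna_ug = n * (pgl3_ng_per_well + prl_ng_per_well) / 1000.0
    return {
        "pgl3_ul": pgl3_ul,
        "prl_sv40_ul": prl_ul,
        "buffer_ec_ul": ec_total_ul - pgl3_ul - prl_ul,   # brings DNA to 300 uL
        "enhancer_ul": dna_ug * enhancer_ul_per_ug,       # ASSUMED ratio, inferred from the text
    }

mix = effectene_master_mix(wells=3)
for name, volume in mix.items():
    print(f"{name}: {volume:.1f}")
```

When each construct is transfected into 4 wells, the same function can be called with wells=4, with ec_total_ul scaled up accordingly.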

2.11 Harvesting transiently transfected HEK293T cells. Protocol modified from

Promega’s Dual-Luciferase Reporter Assay System manual. First, the media was removed from each well and the cells were washed with 1 mL 1X PBS and swirled gently. The PBS was then removed and 250 µL of 1X PLB (Passive Lysis Buffer,

Promega) was added. For each well, the cells were removed by pipetting up and down to break up cell clumps and washing the bottom of the wells to make sure all of the cells

came off and were in the liquid. This lysate was then transferred to a microcentrifuge tube and vortexed. All of the tubes, one for each well, were incubated in a dry-ice ethanol bath for 5 min followed by a 5 min incubation in a 37˚C water bath. The tubes were then placed back in the dry-ice ethanol bath and the incubation cycle was repeated two more times, sometimes with the addition of vortexing between the 37˚C incubation and the dry-ice ethanol incubation. The lysates were then assayed for luciferase activity or stored at -20˚C or -80˚C until the luciferase assay was performed.

Note: The procedure was often paused after the lysates had been removed from the plates and vortexed. In that case the tubes were stored in the -20˚C freezer and the dry-ice ethanol and 37˚C incubations were carried out on another day.

2.12 Dual-Luciferase Assay of transfected pGL3 constructs. Luciferase assays were carried out using Promega’s reagents from the Dual-Luciferase Reporter Assay System and the Monolight 2010 luminometer. The luminometer was programmed to calculate the normalization ratio of firefly luciferase activity (resulting from the pGL3 constructs) to Renilla luciferase activity (resulting from pRL-SV40).
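The normalization and the averaging over the three wells transfected with each construct can be sketched as below. The luminometer readings are hypothetical placeholders, not thesis data, and the function name is illustrative.

```python
from statistics import mean

def normalized_activity(wells):
    """Return (mean ratio, per-well ratios) for (firefly, Renilla) readings."""
    ratios = [firefly / renilla for firefly, renilla in wells]
    return mean(ratios), ratios

# HYPOTHETICAL relative light units for one construct in three wells.
readings = [(12000, 4000), (10000, 4000), (14000, 4000)]
average, per_well = normalized_activity(readings)
print(per_well)
print(average)
```

Dividing by the Renilla signal corrects each well for differences in transfection efficiency and cell number before the replicates are averaged.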

CHAPTER 3: RESULTS

In order to understand how the NAP1L1 gene is regulated, bioinformatics analyses of the promoter region were performed before the in vitro experiments. These data were used as a guide to design deletion constructs at intervals of approximately 200 bp. If a putative transcription factor binding site lay near one of the planned cut sites, the cut site was positioned to avoid cutting within that binding site.

3.1 Bioinformatics analyses of the NAP1L1 promoter region.

Bioinformatics software was used to identify consensus sequences of transcription factor binding sites (TFBSs) within the sequence of the NAP1L1 promoter region. The bioinformatics programs Alibaba2, Match, and PROMO were used to identify these consensus sequences. Each program generated a list of transcription factors and their corresponding TFBSs within the region along with the location of each binding site. The lists were compared and matching results from two or more of the programs were noted as potential candidates for regulation of gene expression of NAP1L1 and can be seen in

Figure 1. The colors of the boxes were used to group TFBSs that were similar between two or more of the bioinformatics programs: red – Alibaba2 and Match, blue – Alibaba2 and PROMO, green – Match and PROMO, and yellow – all three programs. The transcription start site is indicated by a bent arrow with the label “+1”. Locations of deletion constructs are denoted by black corners labeled with the nucleotide at the 5’ end of each deletion construct.


Figure 1. Bioinformatics results of transcription factor binding sites within the promoter region of NAP1L1. Boxes surround consensus sequences of transcription factor binding sites (TFBSs), with the transcription factor identity labeled above or below each box. Red boxes indicate TFBSs common between the results of Alibaba2 and Match. Blue boxes indicate TFBSs common between the results of Alibaba2 and PROMO. Green boxes indicate TFBSs common between the results of Match and PROMO. Yellow-orange boxes indicate TFBSs common between the results of all

three bioinformatics programs. The transcription start site is indicated by an arrow and “+1”. Black corners with numbers indicate the positions where the promoter region was cut to make deletion constructs.

3.2 Cloning the NAP1L1 promoter region into pGL3 Basic.

A decision was made to clone a larger region of the promoter region into pGL3

Basic as the originally cloned fragment only contained -563 bp of the sequence upstream of the transcription start site. This new fragment includes upstream sequence up to -1273 bp 5’ of the transcription start site. An expected fragment size of 1945 bp from the initial touchdown PCR reaction was confirmed by gel electrophoresis, which showed a bright band just below the 2000 bp marker band. Other bands were visible on the gel but were smaller and much more faint. After this band was excised, purified, ligated into pGC

Blue, and transformed, six of the resulting colonies were grown overnight in liquid LB supplemented with kanamycin and purified by boiling lysis. These boil preps were digested with EcoRI to determine if the promoter region had been successfully ligated into pGC

Blue, in which case band sizes of 2233 bp and 1963 bp were expected. A second digest with SpeI would determine if the promoter region had been inserted in the “forward” (2505 bp and

1691 bp) or “reverse” (3874 bp and 322 bp) direction. The “forward” direction refers to the promoter region pointing towards the XhoI restriction site in pGC Blue. All six colonies contained the promoter region with two in the reverse direction and four in the forward direction. One of the samples containing the promoter in the reverse direction was then cut with HindIII and XhoI in order to directionally clone the promoter region into pGL3

Basic to drive the expression of luciferase. After pGC Blue with the NAP1L1 promoter region and pGL3 Basic were cut with HindIII and XhoI, the fragments were excised from a gel, purified, and ligated together. Of the resulting colonies, six were grown overnight

in liquid LB supplemented with ampicillin and purified by boiling lysis. These boil preps were digested with EcoRI to determine if the promoter region had been successfully ligated into pGL3 Basic, in which case band sizes of 4879 bp and 1963 bp were expected. A second digest with NotI was used to determine if the promoter region was inserted in the direction pointing towards the luciferase gene (6637 bp and 205 bp) or away from the luciferase gene. Only one of the EcoRI digests resulted in two visible bands near the 5000 bp and 2000 bp marker bands, and the NotI digest did not appear to cut at all and may not have worked.

The remaining bacterial culture containing the plasmid that resulted in the correct bands for the EcoRI digest was used to streak a bacterial plate from which three colonies were then grown overnight in liquid LB containing ampicillin. These cultures were then purified using Thermo Scientific’s GeneJET Plasmid Miniprep Kit and digested with

EcoRI (4879 bp and 1963 bp), SacI (4784 bp, 2005 bp and 53 bp), and SpeI (6520 bp and

322 bp) to confirm that pGL3 Basic contained the promoter region in the correct orientation. All three plasmids produced the expected band sizes for all three digests. The bands in the EcoRI digests were just below the 5000 bp and just below the 2000 bp marker bands. The bands in the SacI digests were slightly below the 5000 bp marker band and just above the 2000 bp marker band. The bands in the SpeI digests were between the 6000 bp and 8000 bp marker bands (closer to the 6000 bp marker band) and between the 250 bp and 500 bp marker bands (closer to the 250 bp marker band). The final recombined plasmid still contained part of the first intron that was amplified in the original PCR reaction and needed to be removed before transfection.
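Expected band sizes for a digest of a circular plasmid are the distances between consecutive cut sites, with the last fragment wrapping around the origin. The sketch below shows the calculation and checks that the expected bands quoted above for each enzyme add up to the same plasmid size. The cut coordinates in the example are hypothetical and chosen only to reproduce the EcoRI band sizes; the function name is illustrative.

```python
def circular_digest(plasmid_bp, cut_positions):
    """Fragment sizes (largest first) from cutting a circular plasmid at the given positions."""
    cuts = sorted(cut_positions)
    if not cuts:
        return [plasmid_bp]                          # uncut plasmid
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    fragments.append(plasmid_bp - cuts[-1] + cuts[0])   # fragment spanning the origin
    return sorted(fragments, reverse=True)

# Expected bands reported in the text for pGL3 Basic carrying the promoter and intron.
expected = {
    "EcoRI": [4879, 1963],
    "SacI": [4784, 2005, 53],
    "SpeI": [6520, 322],
    "NotI": [6637, 205],
}
plasmid_bp = sum(expected["EcoRI"])
print(plasmid_bp)                                                        # 6842
print(all(sum(bands) == plasmid_bp for bands in expected.values()))     # True

# HYPOTHETICAL EcoRI cut coordinates that would give the reported bands.
print(circular_digest(plasmid_bp, [100, 2063]))                          # [4879, 1963]
```

Checking that every digest sums to the same total is a simple way to catch an arithmetic slip in a set of predicted fragment sizes.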

3.3 Removing the intron from pGL3 Basic containing the NAP1L1 promoter region.

In order to remove the intron from the promoter region, the recombined vector containing the intron was digested with SpeI. Two existing restriction sites were already in the recombined vector, one within the small section of pGC Blue that came over when the promoter was removed from pGC Blue and cloned into pGL3 Basic and one at the very end of the first exon ~5 bp before the beginning of the first intron. The SpeI restriction site in pGC Blue was conveniently on the 3’ end of the promoter fragment removed from pGC Blue and so the pGL3 Basic vector containing the NAP1L1 promoter region was simply cut with SpeI and ligated onto itself to remove the intron fragment.

After the ligation was transformed, 6 of the resulting colonies were set to grow overnight in liquid LB containing ampicillin and the plasmids were purified by boiling lysis. The purified plasmids were digested with KpnI and XbaI to confirm the removal of the intron fragment. Expected band sizes from the KpnI digest were 4772 bp and 1748 bp and from the XbaI digest were 4169 bp and 2351 bp. Two of the boil preps were cut correctly by

EcoRI and XbaI. The EcoRI digests resulted in two bands, one between the 4000 bp and

5000 bp marker bands (closer to the 5000 bp band) and one centered between the 1500 bp and 2000 bp marker bands. The XbaI digests resulted in two bands as well, one between the 4000 bp and 5000 bp marker bands (closer to the 4000 bp band) and one between the

2000 bp and 2500 bp marker bands (closer to the 2500 bp band). The final, correct plasmid, now referred to as pGL3N-1273, was then used for transfections, for the intron

(as a potential enhancer) cloning, and for the remaining promoter deletion constructs (-

398, -367, -335, -254, and -215).
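The expected band sizes quoted above follow from the spacing of restriction sites on a circular plasmid, and the bands from any single digest should sum to the full plasmid length. The short Python sketch below illustrates that bookkeeping. The function names and the cut-site coordinates in the toy example are hypothetical (they are not the actual site positions in pGL3N-1273); only the band sizes used in the final consistency check are taken from the text.

```python
def circular_fragments(plasmid_len, cut_sites):
    """Return fragment lengths (largest first) after cutting a circular plasmid."""
    sites = sorted(set(s % plasmid_len for s in cut_sites))
    if not sites:
        return [plasmid_len]  # no cut: report the full plasmid length
    frags = []
    for i, site in enumerate(sites):
        nxt = sites[(i + 1) % len(sites)]
        # Distance to the next site, wrapping around the origin of the circle.
        # A single site gives a distance of 0, which means one linear fragment.
        frags.append((nxt - site) % plasmid_len or plasmid_len)
    return sorted(frags, reverse=True)


def bands_match_plasmid(bands, plasmid_len):
    """True when the bands of one digest add up to the plasmid length."""
    return sum(bands) == plasmid_len


# Band sizes as stated in the text for the intron-free plasmid.
expected_bands = {"KpnI": [4772, 1748], "XbaI": [4169, 2351]}
plasmid_len = sum(expected_bands["KpnI"])  # 6520 bp implied by the KpnI bands

for enzyme, bands in expected_bands.items():
    print(enzyme, bands, "consistent:", bands_match_plasmid(bands, plasmid_len))

# Toy example with HYPOTHETICAL site coordinates spaced 1748 bp apart.
print(circular_fragments(plasmid_len, [100, 1848]))
```

Both digests imply the same total plasmid length, which is a quick sanity check to run on expected band sizes before a gel is interpreted.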

3.4 Cloning the intron into the enhancer region of pGL3 Basic and pGL3N-1273.

In order to determine if there is a functional enhancer within the first intron of NAP1L1, a region of 1913 bp was amplified by touchdown PCR. This fragment, which contained almost the entire first exon in addition to a region of the first intron, appeared just below the 2000 bp marker band as expected. One other faint band was visible. The fragment was excised, purified, ligated into pGC Blue, and transformed. Twelve of the resulting colonies were grown overnight in liquid LB supplemented with kanamycin and the plasmids were purified through boiling lysis. These purified plasmids were digested with EcoRI to confirm that the intron region had been cloned into pGC Blue. The EcoRI digests were expected to result in three bands at 2233 bp, 1045 bp, and 886 bp. Of the 12 colonies, 11 had plasmids that resulted in the correct band sizes. To determine the orientation of the insert in these plasmids, 6 were chosen and digested with KpnI. The KpnI digests were expected to result in two bands at 3396 bp and 768 bp if in the “reverse” direction and two bands at 2915 bp and 1249 bp if in the “forward” direction. Of these, 5 were in the reverse direction with a band between the 3000 bp and 3500 bp marker bands (closer to 3500 bp) and a band at the 750 bp marker band. The other plasmid contained the intron in the forward direction with one band just beneath the 3000 bp marker band and one band midway between the 1000 bp and 1500 bp marker bands. One of the plasmids containing the intron in the reverse direction was then digested with SpeI to remove the intron for cloning into pGL3 Basic and pGL3N-1273 while at the same time removing the majority of the 5’-UTR that was amplified in the original PCR. Both pGL3 Basic and pGL3N-1273 were digested with SalI, which cuts 3’ of the luciferase coding region. All three linearized fragments were blunted and the two vectors were treated with CIP to remove the phosphates. All three were run on a gel, excised, purified, ligated, and transformed. Ten colonies from the plate of pGL3 Basic with the intron and 6 colonies from the plate of pGL3N-1273 with the intron were grown overnight in liquid LB containing ampicillin and the plasmids were purified by boiling lysis. The purified plasmids were digested with EcoRI to determine both whether the intron had been successfully inserted into the vectors and in which direction it had been inserted.

3.5 Transient transfections of NAP1L1 promoter deletion constructs revealed regions between -442 bp and -297 bp that may contain a positive regulatory element and -297 bp and -183 bp that may contain a negative regulatory element.

In order to determine the locations of regulatory elements within the isolated promoter region of NAP1L1, initial deletions of the promoter region in pGL3 Basic were transfected into human embryonic kidney (HEK293T) cells alongside the full promoter region. In order to test these promoter constructs for their ability to drive the expression of luciferase, the constructs were co-transfected with the normalization vector pRL-SV40. After normalization of firefly luciferase expression from the pGL3 constructs to the Renilla luciferase expression from pRL-SV40, a fold value was calculated relative to the minimal luciferase expression of the negative control, pGL3 Basic. The pGL3 Basic vector (Promega) contains no promoter region and so resulted in the lowest luciferase expression. One positive control was used as well, pGL3 Control, which contains the SV40 promoter and enhancer (blue box in Figure 2) (Promega). The controls pGL3 Basic and pGL3 Control are the first two vectors from the top in Figure 2. The luciferase coding region is denoted by a yellow box labeled “luc+”. All NAP1L1 promoter constructs contain a 5’-UTR (purple bar) that extends to +387 relative to the transcription start site upstream of luc+, a transcription start site (bent arrow), and an upstream promoter region (black bar) labeled with the distance from the transcription start site. The construct containing the NAP1L1 promoter region (pGL3N-1273), spanning from -1273 bp to +387 bp relative to the transcription start site (+1), showed high luciferase activity similar to that of the pGL3 Control vector (Figure 2). Little difference in luciferase activity was seen between pGL3N-1273, pGL3N-563, and pGL3N-442 (Figure 2). However, a drop in luciferase activity was seen between the pGL3N-442 and pGL3N-297 constructs (Figure 2). Although less apparent in the figure, there is a small increase in luciferase activity between the pGL3N-297 and pGL3N-183 constructs. The somewhat larger increase in luciferase activity between pGL3N-183 and pGL3N-56 was inconsistent between transfections and so produced large error bars. Although the error bars in this initial experiment are large, there was a consistent pattern between transfections of a drop in luciferase activity between pGL3N-442 and pGL3N-297 and an increase in luciferase activity between pGL3N-297 and pGL3N-183, and this pattern was the basis for designing smaller deletions within these two regions.

Figure 2. Luciferase activity resulting from promoter deletion constructs. Fold values were calculated by normalizing to the expression results of pGL3 Basic, the luciferase reporter vector containing no promoter.

The positive control vector pGL3 Control, which contains an SV40 promoter and enhancer, shows an increase in luciferase activity relative to pGL3 Basic. The largest isolated NAP1L1 promoter region, pGL3N-1273, shows a high amount of luciferase activity similar to the pGL3 Control vector. The first notable change in luciferase activity occurs between the pGL3N-442 and pGL3N-297 vectors, where a large decrease in luciferase activity is seen in the pGL3N-297 construct. There also appears to be an increase in luciferase activity between the pGL3N-183 and pGL3N-56 constructs, although the error bars are large. n=3, Mean ± S.D.
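The normalization behind the fold values in Figure 2 can be sketched as follows: each firefly reading is divided by the Renilla reading from the same well, and the resulting ratio is expressed relative to the mean ratio of the promoterless pGL3 Basic control. The readings below are invented placeholder numbers, not data from this study, and the function name and the choice to average per-replicate fold values are assumptions made only for illustration.

```python
from statistics import mean, stdev


def fold_values(readings, baseline="pGL3 Basic"):
    """Map {construct: [(firefly, renilla), ...]} to {construct: (mean_fold, sd_fold)}."""
    # Step 1: normalize firefly luminescence to the co-transfected Renilla signal.
    ratios = {name: [f / r for f, r in wells] for name, wells in readings.items()}
    # Step 2: express every ratio relative to the mean ratio of the negative control.
    base = mean(ratios[baseline])
    result = {}
    for name, vals in ratios.items():
        folds = [v / base for v in vals]
        sd = stdev(folds) if len(folds) > 1 else 0.0
        result[name] = (mean(folds), sd)
    return result


# HYPOTHETICAL luminometer readings (firefly, Renilla) for three replicates.
example = {
    "pGL3 Basic": [(100, 1000), (120, 1000), (80, 1000)],
    "pGL3N-1273": [(5000, 1000), (6000, 1000), (4000, 1000)],
}

for construct, (m, sd) in fold_values(example).items():
    print(f"{construct}: {m:.1f} +/- {sd:.1f} fold")
```

By construction the negative control has a mean fold value of 1, so every other bar in the figure reads directly as "times pGL3 Basic".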

3.6 Preliminary transient transfection results from the smaller NAP1L1 promoter deletions between -442 bp and -297 bp as well as between -297 bp and -183 bp relative to the TSS.

Smaller deletions within the two regions between -442 bp and -183 bp were made to narrow down the locations of potential regulatory elements. The constructs pGL3N-442, pGL3N-398, pGL3N-367, pGL3N-335, pGL3N-297, pGL3N-254, pGL3N-215, and pGL3N-183 were transfected into HEK293T cells along with the controls pGL3 Basic and pGL3 Control and the full promoter construct pGL3N-1273. Preliminary transfection results are shown in Figure 3. The first three constructs are the control vectors, and the promoter constructs that follow are depicted as described for Figure 2. Given these preliminary data, there appears to be a decline in luciferase activity between the pGL3N-367 and pGL3N-335 constructs and two increases in luciferase activity between constructs pGL3N-297 and pGL3N-215 (Figure 3). The first increase is seen between pGL3N-297 and pGL3N-254 and the second between pGL3N-254 and pGL3N-215 (Figure 3).


Figure 3. Preliminary results from transient transfections of smaller deletions within the region between -442 bp and -183 bp relative to the transcription start site. The controls are the top three constructs in the graph. The first is pGL3 Basic, which contains no promoter. The third is pGL3 Control, which contains the SV40 promoter and enhancer (blue box). In the constructs below, the 5’ upstream sequence of the promoter region is denoted by the black bar and the number of the most 5’ nucleotide of the promoter region. The transcription start site is denoted by the bent arrow and the 5’-UTR is denoted by the purple bar that ends at +387 bp in all constructs. The yellow box indicates the coding region for the firefly luciferase gene in the pGL3 constructs. These preliminary data show a decrease in luciferase activity between -367 bp and -335 bp relative to the transcription start site and an increase in luciferase activity between -297 bp and -254 bp relative to the transcription start site. There may be a second increase in luciferase activity between the pGL3N-254 and pGL3N-215 constructs as well. n=2, Mean ± S.D.

3.7 Transient transfections of the NAP1L1 promoter and intron regions.

In order to determine whether or not an enhancer lies within the first intron of NAP1L1, as was seen in mice (Wu et al., 2008), a section of the first intron spanning from +382 bp to +1915 bp was cloned into the region 3’ of the luciferase coding sequence in pGL3 Basic and pGL3N-1273. Controls pGL3 Basic and pGL3 Control were used in this set of transfections along with a recombined pGL3 Basic (no promoter) containing the intron in the forward and the reverse directions. The vectors pGL3N-1273, pGL3N-1273 + intron(F), and pGL3N-1273 + intron(R) were transfected to determine if the intron contained a functional enhancer. The images in Figure 4 are as previously described, except the included intron region is denoted as a green box with a green arrow indicating its orientation and the putative c-Myc binding site is denoted as a brownish oval. Luciferase activity of the pGL3 Basic constructs (no promoter) with the intron in either direction or without the intron was similar, as expected (Figure 4). These initial results show a large increase in luciferase activity resulting from the pGL3N-1273 construct containing the intron in the forward orientation compared to the promoter construct without the intron, pGL3N-1273 (Figure 4). There seemed to be no difference in luciferase activity resulting from the construct containing the intron in the reverse direction compared to the construct containing only the full promoter region, pGL3N-1273 (Figure 4).

Figure 4. Initial transfections of NAP1L1 intron constructs suggest that a functional enhancer lies within the first intron of NAP1L1. Control constructs are depicted as previously described. Intron constructs contain the intron region (green box) and an arrow (green) representing the orientation of the intron within the construct. The putative c-Myc binding site is illustrated as a brownish oval within the intron. Results show a large increase in luciferase activity between the NAP1L1 promoter construct and the NAP1L1 promoter construct containing the intron region in the forward direction. The NAP1L1 promoter construct containing the intron in the reverse direction did not seem to result in a difference in luciferase activity compared to the NAP1L1 promoter construct without an intron. n=3, Mean ± S.D.

3.8 Transient transfections of NAP1L1 promoter and intron regions with the mutated c-Myc consensus sequence.

To determine if the c-Myc consensus sequence within the first intron of NAP1L1 is involved in the expression of NAP1L1, the consensus sequence was mutated in the promoter construct pGL3N-1273 containing the intron in the forward direction from “CACGTG” to “TTTGTG”. This mutated construct, depicted in Figure 5 as previously described except for a red “X” over the putative c-Myc binding site, was transfected alongside the initial set of intron constructs (see 3.7). The results show that mutating the c-Myc consensus sequence within the intron decreased the amount of luciferase activity to a fold value not significantly different from the NAP1L1 promoter construct without an intron (Figure 5). There was a significant difference in luciferase expression between the promoter construct without the intron region and the one with the intron in the forward direction (Figure 5). There was no significant difference between the promoter construct without the intron and the promoter construct with the intron in the reverse direction (Figure 5). There was a significant difference in luciferase expression between the promoter construct with the intron in the forward direction and the same construct containing the mutation at the putative c-Myc site (Figure 5). The construct containing the promoter of NAP1L1 and the intron in the reverse direction showed a significant difference in luciferase expression from the NAP1L1 promoter construct containing the mutated intron in the forward direction (Figure 5). An unpaired Student’s t-test was used to determine the significance of the difference between the results of two constructs at a time.
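The pairwise comparison described above can be illustrated with a minimal unpaired Student's t-test written with the Python standard library. This sketch assumes the classic equal-variance (pooled) form of the test, which the text does not specify, and the fold values are invented placeholders rather than data from Figure 5. Significance is judged against tabulated two-tailed critical values at p = 0.05 instead of computing an exact p-value.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t values at alpha = 0.05 (standard t table), keyed by df.
T_CRIT_TWO_TAILED_05 = {2: 4.303, 4: 2.776, 6: 2.447}


def unpaired_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def is_significant(a, b):
    """True when |t| exceeds the tabulated critical value for this df."""
    t, df = unpaired_t(a, b)
    return abs(t) > T_CRIT_TWO_TAILED_05[df]


# HYPOTHETICAL fold values for two constructs, three transfections each.
with_intron = [10.0, 12.0, 11.0]
without_intron = [5.0, 6.0, 4.0]

t, df = unpaired_t(with_intron, without_intron)
print(f"t = {t:.3f}, df = {df}, significant: {is_significant(with_intron, without_intron)}")
```

With three replicates per construct the test has only 4 degrees of freedom, so a difference must be large relative to the spread between transfections to reach p < 0.05.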


Figure 5. Transient transfections of the NAP1L1 intron constructs including the c-Myc binding site mutation. The controls are the same as those previously described. showed increases in luciferase activity relative to the pGL3 Basic vector. The controls pGL3 Basic containing the intron in the forward direction and pGL3 Basic containing the intron in the reverse direction resulted in low levels of luciferase activity similar to the empty pGL3 Basic vector. The construct pGL3N-1273 containing the intron in the forward direction resulted in a large increase in luciferase activity, approximately twice that of the construct containing only the NAP1L1 promoter region. The construct containing the NAP1L1 promoter region and the intron in the reverse direction showed similar luciferase activity to that of the pGL3N-1273 construct. The construct containing the NAP1L1 promoter region and the intron in the forward orientation with the mutated sequence “TTTGTG” resulted in a decrease in luciferase activity relative to the non-mutated construct. This mutated construct resulted in similar levels of luciferase activity to the pGL3N-1273 construct that did not contain an intron. n=3, Mean ± S.D., *p<0.05.

CHAPTER 4: DISCUSSION

Gene expression is modulated by specific transcription factors bound to sequences, such as proximal promoter elements, enhancers, and silencers, that interact with basal transcription factors at the core promoter to increase or decrease transcription (Maston et al., 2006). Much of the differential regulation between cell types comes from the binding of specific transcription factors to proximal promoter elements, enhancers, and silencers. In order to better understand how a particular gene is regulated in a normal cell, these differential elements within the promoters and distal elements must be studied.

4.1 Transient transfections of initial deletion constructs revealed a regulatory element located within the region between -442 bp and -297 bp relative to the transcription start site (+1) and smaller deletions of this region suggest that the element lies between -367 bp and -335 bp relative to the TSS.

Transient transfections of larger deletion constructs resulted in a decrease in luciferase activity between the pGL3N-442 and pGL3N-297 constructs, suggesting that a positive regulatory element lies between -442 bp and -297 bp relative to the transcription start site (Figure 2). The preliminary results from the transfections with the smaller deletions within this region suggest that the element lies between -367 bp and -335 bp relative to the transcription start site (Figure 3). Bioinformatics analyses did not suggest a particular transcription factor binding site within this region (Figure 1). Results from individual bioinformatics programs (Alibaba2, Match, and PROMO) suggest the sequence contains binding sites for transcription factors such as C/EBPβ, XBP-1, RREB-1, HNF-4, Sp1, and others (data not shown). These five transcription factors are capable of functioning as activators. XBP-1 has been shown to bind the promoter of GPR43 and upregulate its expression in human monocytes (Ang et al., 2015). RREB-1 has both activating and suppressing capabilities (Jiang et al., 2010; Milon et al., 2010). When the binding site for RREB-1 is deleted in the promoter of TCblR in a luciferase reporter transfected into HEK293T cells, the luciferase activity decreases significantly (Jiang et al., 2010). HNF-4α increases the expression of CLDN7 in Caco-2 cells (Farkas et al., 2015) and, together with other factors, increases the expression of EPHX1 in HepG2 cells (Peng et al., 2015). Sp1 is known to have roles in activating or suppressing transcription (Li et al., 2015; Zaid et al., 2001). For example, as an activator Sp1 was shown to bind and upregulate MALAT1 in human cells (Li et al., 2015). In order to determine which transcription factor is binding this region, more information will need to be obtained through luciferase assays of sequence mutations.
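The reasoning used to interpret a 5’ deletion series can be made explicit with a small sketch: if removing a stretch of upstream sequence lowers reporter activity, the removed stretch probably contained a positive element, and if removing it raises activity, the stretch probably contained a negative element. The Python example below applies that rule to adjacent constructs. The fold values are invented for illustration and are not the measurements shown in Figures 2 or 3, and the 1.5-fold cutoff is an arbitrary assumption rather than a threshold used in this study.

```python
def infer_elements(fold_by_5prime_end, min_ratio=1.5):
    """Classify the region deleted between adjacent 5' deletion constructs.

    fold_by_5prime_end maps the 5' end of each construct (bp relative to the
    TSS, negative numbers) to its fold luciferase activity.
    """
    ends = sorted(fold_by_5prime_end)  # most upstream construct first
    calls = []
    for longer, shorter in zip(ends, ends[1:]):
        before = fold_by_5prime_end[longer]
        after = fold_by_5prime_end[shorter]
        if before / after >= min_ratio:
            # Activity fell when the region was removed.
            calls.append((longer, shorter, "positive"))
        elif after / before >= min_ratio:
            # Activity rose when the region was removed.
            calls.append((longer, shorter, "negative"))
    return calls


# HYPOTHETICAL fold values keyed by construct 5' end (illustration only).
example = {-1273: 100.0, -563: 95.0, -442: 98.0, -297: 40.0, -183: 70.0, -56: 120.0}

for start, end, kind in infer_elements(example):
    print(f"{start} to {end} bp: putative {kind} regulatory element")
```

A rule like this only nominates regions; as noted in the text, point mutations and binding assays are still needed to identify the element and the factor that binds it.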

4.2 Transient transfections of deletion constructs revealed a regulatory element located within the region between -297 bp and -183 bp relative to the transcription start site (+1) and smaller deletions of this region suggest that there is an element that lies between -297 bp and -254 bp and a second element that lies between -254 bp and -215 bp relative to the TSS.

Results from preliminary luciferase assays revealed an increase in luciferase expression between the constructs pGL3N-297 and pGL3N-254, suggesting that a negative regulatory element lies between -297 bp and -254 bp. The results from the bioinformatics analyses (Figure 1) do not show a potential TFBS in this region common between the bioinformatics programs. The individual programs (Alibaba2, Match, and PROMO) suggest transcription factor binding sites such as GRα, C/EBPβ, TGIF, Sp1, Elf-1, and others (data not shown). GR, in mice and humans, may function as a transrepressor by interacting with other factors to inhibit the expression of a gene (e.g. AP-1) instead of binding the DNA sequence directly (De Bosscher et al., 2001; Newton, 2014). Sp1 acts as a repressor of ANT2 in HeLa and NIH3T3 cells (Zaid et al., 2001). Elf-1 may function to upregulate or downregulate gene expression (Xiang et al., 2010; Honda et al., 2003). In rats, the bidirectional promoters for Nth1 and Tsc2 are suppressed by Elf-1 binding to two sequences in between these promoters that are conserved in humans (Honda et al., 2003). Luciferase assays of sequence mutations within this small region of the NAP1L1 promoter will help identify which specific sequences are involved in gene regulation, and binding of those sequences by a specific transcription factor can then be tested via a ChIP analysis.

There also seems to be an increase in luciferase expression between the constructs pGL3N-254 and pGL3N-215, also suggesting the location of a negative regulatory element within the region -254 bp to -215 bp relative to the TSS. The bioinformatics data in Figure 1 show a putative transcription factor binding site for the progesterone receptor (PR) in this region. The two isoforms of PR, PR-A and PR-B, have activating capabilities, but it seems that PR-A may have suppressive capabilities as well (Patel et al., 2015). This is a tentative result that will need support from experimental data by luciferase assays of smaller deletions or sequence mutations within this region.

4.3 Transient transfections of initial deletion constructs revealed a potential element between -183 bp and -56 bp relative to the TSS.

The rather large increase in luciferase expression between constructs pGL3N-183 and pGL3N-56 was variable between transfections, but may need to be investigated further for potential regulatory elements. As this construct encompasses the core promoter and a little upstream sequence, there may be downstream regulatory elements that are involved in this high expression level. Based on the bioinformatics data in Figure 1, this region contains two putative Sp1 binding sites and two AP-2α binding sites, one of which overlaps the junction at -56 bp. AP-2α has activating capabilities as well as silencing capabilities, as it is known to act as a repressor in mice (Jiang et al., 1998; Berlato et al., 2011). These speculations based on the bioinformatics data will need to be tested experimentally through luciferase assays of smaller deletion constructs or sequence mutations within this region.

4.4 Transfections of intron constructs revealed a functional c-Myc binding site in the first intron of NAP1L1.

Wu and colleagues (2008) identified c-Myc binding sites within the first intron of murine NAP1L1 after a ChIP-seq of c-Myc that brought down the promoter region containing part of the first intron with two c-Myc consensus sequences. In a follow-up experiment, a strong correlation was made between high levels of c-Myc and NAP1L1 in human B cell lymphomas (Wu et al., 2008). This led to the idea that there is a conserved c-Myc binding site within the first intron of the human NAP1L1. In order to determine if there is a functional enhancer within the first intron of NAP1L1, a region of the intron was cloned into pGL3N-1273 with the expectation of seeing an increase in gene expression. An increase in luciferase activity seen in the construct containing the intron in the forward direction supports the hypothesis that there is an enhancer within the first intron of NAP1L1. Enhancers are generally known to function independent of their location relative to the promoter region (Maston et al., 2006). They are bound by activating transcription factors and form an enhanceosome complex at the promoter of a gene by looping around to interact with the transcription factors bound to the promoter (Maston et al., 2006; Engel et al., 2016). As enhancers were previously characterized as functioning in both a location- and orientation-independent manner (Maston et al., 2006), the intron was cloned into a region further away from the NAP1L1 promoter (on the other side of the luciferase coding region) and a second construct with the intron in the reverse direction was made. Interestingly, and unexpectedly, an increase was not seen in the construct containing the intron in the reverse direction, suggesting the enhancer may be working in an orientation-dependent manner. The oncogene MYC produces a transcription factor c-Myc that appears to be a regulator of NAP1L1 in mice based on the results of a c-Myc ChIP-seq (Wu et al., 2008). The c-Myc consensus sequence in the first intron of human NAP1L1 may be functionally conserved as there appears to be an increase in luciferase expression with the construct containing the intron region in the forward direction. The displacement of this binding site is minimal between the forward and reverse orientations, shifting the location of the consensus sequence by approximately 300 bp. If found to be a regulator of NAP1L1 expression in humans, c-Myc would seem to function to increase gene expression in a location-independent but orientation-dependent manner. There is some evidence of orientation-dependent enhancer elements. One example, in mice, shows that an enhancer in between two divergent promoters acts in an orientation-dependent manner (Swamynathan and Piatigorsky, 2002). When this enhancer was inverted in the luciferase construct, luciferase expression from the promoter of one of the genes decreased to ~6% that of the wild-type promoter. Another example, also previously mentioned, is the enhancer that regulates the yellow gene in Drosophila melanogaster but not in a closely related species that contains an inverted consensus sequence (Jeong et al., 2006). Although the intron region of NAP1L1 does not appear to function in an orientation-independent manner, as was previously thought to be characteristic of enhancers (Engel et al., 2016), the pGL3N-1273 construct containing the intron in the forward direction did result in increased luciferase activity relative to the pGL3N-1273 construct without the intron region. There should also be some caution in the interpretation of these results, as the location and orientation of the enhancer within these constructs do not reflect those in the living cell and so some results, such as the decreased luciferase activity from the construct containing the inverted intron, may be a consequence of in vitro experiments.

After the intron was shown to increase gene expression within the plasmid containing the full, isolated promoter region of NAP1L1 and the intron in the forward direction, the putative c-Myc binding site within the intron was tested for its potential involvement as an enhancer. The consensus sequence for c-Myc was mutated from “CACGTG” to “TTTGTG” in the full pGL3N-1273 construct containing the intron in the forward direction. The resulting luciferase activity was comparable to that of the full promoter construct without the intron (pGL3N-1273) (Figure 5). This confirmed the hypothesis that the sequence “CACGTG” is important in the expression of NAP1L1. It is known that c-Myc and Max form a heterodimer to bind the sequence, called an E-box (Walhout et al., 1997).
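Locating the E-box and introducing the mutation described above can be sketched in a few lines of Python. The sequence used here is a made-up toy string, not the NAP1L1 intron, and the helper names are hypothetical. Only one strand needs to be searched because the hexamer CACGTG is its own reverse complement, which the sketch also verifies.

```python
EBOX = "CACGTG"    # canonical E-box hexamer named in the text
MUTANT = "TTTGTG"  # mutated sequence named in the text


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, motif=EBOX):
    """Return 0-based start positions of every occurrence of motif in seq."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def mutate_sites(seq, motif=EBOX, replacement=MUTANT):
    """Replace every occurrence of motif with the mutant sequence."""
    return seq.upper().replace(motif, replacement)


# TOY sequence for illustration only (not taken from NAP1L1).
toy = "AAACACGTGTTT"
print(find_sites(toy))
print(mutate_sites(toy))
print(reverse_complement(EBOX) == EBOX)
```

Running the same search on the mutated sequence is a convenient check that a designed mutation removes the site without accidentally creating a new one nearby.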

Regulatory factors in addition to c-Myc, such as the upstream stimulating factor (USF), have been shown to bind the E-box sequence (Walhout et al., 1997). Although there are other factors that bind the same sequence and regulate other genes containing that sequence, c-Myc seems to be the most likely transcription factor to regulate NAP1L1 based on the previous research by Wu and colleagues (2008), which indicated a correlation between high levels of NAP1L1 expression and c-Myc in human lymphomas. c-Myc has been found to regulate other genes via an intronic enhancer, including the rat ornithine decarboxylase (ODC) (Walhout et al., 1997) and rat prothymosin α (Gaubatz et al., 1994). In fact, intronic enhancers are not uncommon. There are examples of functional enhancers within introns of human genes such as FDXR and GCH, and murine genes such as GnRH and NAP1L1 (Imamichi et al., 2014; Liang et al., 2013; Kim et al., 2011; Wu et al., 2008).

If c-Myc is found to be a regulator of NAP1L1 expression, this could reveal a mechanism by which NAP1L1 is overexpressed in certain cancers such as pancreatic neuroendocrine neoplasms and small intestinal carcinoids (Kidd et al., 2006; Schimmack et al., 2014). As c-Myc is a known oncogene and NAP1L1 overexpression has been correlated with cancer, the regulation of NAP1L1 by c-Myc may suggest a mechanism by which c-Myc overexpression results in tumor growth. The overexpression of NAP1L1 has been correlated with the ability of induced pluripotent stem cells to maintain a proliferative and stem-like state instead of differentiating (Gong et al., 2014). A recent study showed that when NAP1L1 expression was knocked down in murine iPSCs, there was an increase in differentiation into cardiomyocytes (Gong et al., 2014). This same study showed that overexpression of NAP1L1 decreased iPSC differentiation into cardiomyocytes. A more recent study by the same lab confirmed NAP1L1’s role in the proliferative ability of murine iPSCs through knockdown and overexpression experiments (Yan et al., 2016). It is noteworthy that c-Myc is one of the four Yamanaka factors involved in inducing pluripotency (Yan et al., 2016), which may, in part, result from direct upregulation of NAP1L1 through its intronic enhancer as is seen in mice (Wu et al., 2008) and humans (this study). NAP1L1 overexpression has also been correlated with multiple cancers such as small intestinal carcinoids (SICs), hepatoblastomas, and pancreatic neuroendocrine neoplasms (Kidd et al., 2006; Nagata et al., 2003; Schimmack et al., 2014). This increased expression may be a result of upregulation by c-Myc, which is known to be expressed at higher levels in many cancers (Stine et al., 2015; Udager et al., 2016). This may suggest a potential method of downregulating NAP1L1 as a cancer treatment or in order to differentiate stem cells into specific tissues for grafting. As methods such as CRISPR-Cas9 are advancing and approaching clinical use, there may arise a method to downregulate NAP1L1 by mutating the c-Myc binding site within the first intron of the human NAP1L1 in order to decrease proliferation in cancer or increase differentiation in iPSCs for treatment (Sánchez-Rivera and Jacks, 2015; Sternberg and Doudna, 2015).

4.5 Future directions.

After confirming that the sequence “CACGTG” within the first intron is involved in the expression of NAP1L1, the next step would be a ChIP assay to confirm whether or not it is the transcription factor c-Myc that is binding this sequence.

Given the preliminary data from the small promoter deletions between -442 bp and -183 bp relative to the TSS, potential transcription factor binding sites within the regions between -367 bp and -335 bp and between -297 bp and -215 bp relative to the TSS will be mutated to determine if there is a similar change in luciferase activity. As the bioinformatics data did not collectively suggest a specific transcription factor binding site within these regions, sequence mutations within each of these regions can be made in order to determine the specific sequences that are regulating gene expression. A ChIP assay will then confirm whether or not the putative transcription factor binds that sequence. Transient transfections of smaller deletions within the region -183 bp to -56 bp relative to the TSS will also be necessary in order to reveal potential elements within this region.

Identifying the mechanisms of gene regulation in normal, healthy cells can help reveal the methods in which dysregulation of certain genes occurs in diseases such as cancer. For example, tumors can form from uncontrollably dividing cells whose cell cycle genes and apoptotic genes are improperly regulated. Developmental disorders can result from improper gene regulation as well and the effects will depend on what genes and at which stage the dysregulation occurs.

The nucleosome assembly protein 1-like 1 (NAP1L1) has been associated with multiple cancers as well as the stem-like nature of undifferentiated cells. When the expression of this gene was knocked down in murine pancreatic neuroendocrine neoplasms, the proliferation of these cells decreased (Schimmack et al., 2014). When this gene’s expression was knocked down in murine iPSCs, the cells proceeded to differentiate into cardiomyocytes (Gong et al., 2014). Studying the factors involved in the expression of this gene is important for understanding why this gene is expressed differently between the two B-cell lymphomas, mantle cell lymphoma and small lymphocytic lymphoma (Henson et al., 2011).

Studying how genes are expressed allows for a better understanding of diseases caused by the dysregulation of genes as well as a better understanding of gene expression in general.

REFERENCES

Ang Z, Er JZ, Ding JL. The short-chain fatty acid receptor GPR43 is transcriptionally regulated by XBP1 in human monocytes. Nature Scientific Reports 2015; doi: 10.1038/srep08134

Attia M, Förster A, Rachez C, Freemont P, Avner P, Rogner UC. Interaction between Nucleosome Assembly Protein 1-like Family Members. Journal of Molecular Biology 2011;407:647-660.

Berlato C, Chan KV, Price AM, Canosa M, Scibetta AG, Hurst HC. Alternative TFAP2A isoforms have distinct activities in breast cancer. Breast Cancer Research 2011;13:R23.

Danino YM, Even D, Ideses D, Juven-Gershon T. The core promoter: At the heart of gene expression. Biochimica et Biophysica Acta 2015;1849:1116-1131.

De Bosscher K, Berghe WV, Haegeman G. Glucocorticoid Repression of AP-1 Is Not Mediated by Competition for Nuclear Coactivators. Molecular Endocrinology 2001;15:219–227.

Dikstein R. The unexpected traits associated with core promoter elements. Transcription 2011;2:201-206.

Engel KL, Mackiewicz M, Hardigan AA, Myers RM, Savic D. Decoding transcriptional enhancers: Evolving from annotation to functional interpretation. Seminars in Cell & Developmental Biology 2016;57:40-50.

Farkas AE, Hilgarth RS, Capaldo CT, Gerner-Smidt C, Powell DR, Vertino PM, Koval M, Parkos CA, Nusrat A. HNF4α Regulates Claudin-7 Protein Expression during Intestinal Epithelial Differentiation. The American Journal of Pathology 2015;185:2206–2218.

Gaubatz S, Meichle A, Eilers M. An E-Box Element Localized in the First Intron Mediates Regulation of the Prothymosin α Gene by c-myc. Molecular and Cellular Biology 1994;14:3853–3862.

Gehrke AR, Shubin NH. Cis-regulatory programs in the development and evolution of vertebrate paired appendages. Seminars in Cell & Developmental Biology 2016;57:31–39.

Gong H, Yan Y, Fang B, Xue Y, Yin P, Li L, Zhang G, Sun X, Chen Z, Ma H, Yang C, Ding Y, Yong Y, Zhu Y, Yang H, Komuro I, Ge J, Zou Y. Knockdown of Nucleosome Assembly Protein 1-Like 1 Induces Mesoderm Formation and Cardiomyogenesis Via Notch Signaling in Murine-Induced Pluripotent Stem Cells. Stem Cells 2014;32:1759–1773.

Goodrich JA, Tjian R. Unexpected Roles for Core Promoter Recognition Factors in Cell-type Specific Transcription and Gene Regulation. Nat Rev Genet. 2010;11:549-558.

Haberle V, Lenhard B. Promoter architectures and developmental gene regulation. Seminars in Cell & Developmental Biology 2016;57:11-23.

Henson SE, Morford T, Stein MP, Wall R, Malone CS. Candidate genes contributing to the aggressive phenotype of mantle cell lymphoma. Acta Histochemica 2011;113:729–742.

Honda S, Kobayashi T, Kajino K, Urakami S, Igawa M, Hino O. Ets Protein Elf-1 Bidirectionally Suppresses Transcriptional Activities of the Tumor Suppressor Tsc2 Gene and the Repair-Related Nth1 Gene. Molecular Carcinogenesis 2003;37:122–129.

Imamichi Y, Mizutani T, Ju Y, Matsumura T, Kawabe S, Kanno M, Yazawa T, Miyamoto K. Transcriptional regulation of human ferredoxin reductase through an intronic enhancer in steroidogenic cells. Biochimica et Biophysica Acta 2014;1839:33–42.

Jeong S, Rokas A, Carroll SB. Regulation of Body Pigmentation by the Abdominal-B Hox Protein and Its Gain and Loss in Drosophila Evolution. Cell 2006;125:1387–1399.

Jiang MS, Tang QQ, McLenithan J, Geiman D, Shillinglaw W, Henzel WJ, Lane MD. Derepression of the C/EBPα gene during adipogenesis: Identification of AP-2α as a repressor. Proc. Natl. Acad. Sci. 1998;95:3467–3471.

Jiang W, Sequeira JM, Nakayama Y, Quadros EV. Characterization of the promoter region of TCblR/CD320 gene, the receptor for cellular uptake of transcobalamin-bound cobalamin. Gene 2010;466:49–55.

Kidd M, Modlin IM, Mane SM, Camp RL, Eick G, Latich I. The role of genetic markers — NAP1L1, MAGE-D2, and MTA1 — in defining small-intestinal carcinoid neoplasia. Annals of Surgical Oncology 2006;13:253–262.

Kim HD, Choe HK, Chung S, Kim M, Seong JY, Son GH, Kim K. Class-C SOX Transcription Factors Control GnRH Gene Expression via the Intronic Transcriptional Enhancer. Mol Endocrinol 2011;25:1184–1196.

Kim MH, Field J. Translationally Regulated C/EBPβ Isoform Expression Upregulates Metastatic Genes in Hormone-Independent Prostate Cancer Cells. The Prostate 2008;68:1362–1371.

Li S, Wang Q, Qiang Q, Shan H, Shi M, Chen B, Zhao S, Yuan L. Sp1-mediated transcriptional regulation of MALAT1 plays a critical role in tumor. J Cancer Res Clin Oncol 2015;141:1909–1920.

Liang Y, Inagaki H, Hao Q, Sakamoto M, Ohye T, Suzuki T, Ichinose H. Identification of an enhancer region for immune activation in the human GTP cyclohydrolase I gene. Biochemical and Biophysical Research Communications 2013;442:72–78.

Maston GA, Evans SK, Green MR. Transcriptional Regulatory Elements in the Human Genome. Annual Review of Genomics and Human Genetics 2006;7:29-59.

Milon BC, Agyapong A, Bautista R, Costello LC, Franklin RB. Ras Responsive Element Binding Protein-1 (RREB-1) Down-Regulates hZIP1 Expression in Prostate Cancer Cells. Prostate 2010;70:288–296.

Mishra S, Lin CL, Huang TH, Bouamar H, Sun LZ. MicroRNA-21 inhibits p57Kip2 expression in prostate cancer. Mol Cancer 2014;13:212.

Nagata T, Takahashi Y, Ishii Y, Asai S, Nishida Y, Murata A, Koshinaga T, Fukuzawa M, Hamazaki M, Asami K, Ito E, Ikeda H, Takamatsu H, Koike K, Kikuta A, Kuroiwa M, Watanabe A, Kosaka Y, Fujita H, Miyake M, Mugishima H. Transcriptional profiling in hepatoblastomas using high-density oligonucleotide DNA array. Cancer Genetics and Cytogenetics 2003;145:152–160.

Newton R. Anti-inflammatory glucocorticoids: Changing concepts. European Journal of Pharmacology 2014;724:231–236.

Okuwaki M, Kato K, Nagata K. Functional characterization of human nucleosome assembly protein 1-like proteins as histone chaperones. Genes to Cells 2010;15:13-27.

Park YJ, Luger K. Structure and function of nucleosome assembly proteins. Biochem. Cell Biol. 2006;84:549–558.

Patel B, Elguero S, Thakore S, Dahoud W, Bedaiwy M, Mesiano S. Role of nuclear progesterone receptor isoforms in uterine pathophysiology. Human Reproduction Update 2015;21:155–173.

Peng H, Zhu QS, Zhong S, Levy D. Transcription of the human microsomal epoxide hydrolase gene (EPHX1) is regulated by an HNF-4α/CAR/RXR/PSF complex. Biochimica et Biophysica Acta 2013;1829:1000–1009.

Sánchez-Rivera FJ, Jacks T. Applications of the CRISPR-Cas9 system in cancer biology. Nat Rev Cancer. 2015;15:387–395.

Schimmack S, Taylor A, Lawrence B, Alaimo D, Schmitz-Winnenthal H, Büchler MW, Modlin IM, Kidd M. A mechanistic role for the chromatin modulator, NAP1L1, in pancreatic neuroendocrine neoplasm proliferation and metastases. Epigenetics & Chromatin 2014;7:15.

Sternberg SH, Doudna JA. Expanding the Biologist’s Toolkit with CRISPR-Cas9. Mol Cell. 2015;58:568–574.

Stine ZE, Walton ZE, Altman BJ, Hsieh AL, Dang CV. MYC, Metabolism, and Cancer. Cancer Discov. 2015;5:1024–1039.

Swamynathan SK, Piatigorsky J. Orientation-dependent Influence of an Intergenic Enhancer on the Promoter Activity of the Divergently Transcribed Mouse Shsp/αB-crystallin and Mkbp/HspB2 Genes. The Journal of Biological Chemistry 2002;277:49700–49706.

Tu WB, Helander S, Pilstål R, Hickman KA, Lourenco C, Jurisica I, Raught B, Wallner B, Sunnerhagen M, Penn LZ. Myc and its interactors take shape. Biochimica et Biophysica Acta 2015;1849:469–483.

Udager AM, De Marzo AM, Shi Y, Hicks JL, Cao X, Siddiqui J, Jiang H, Chinnaiyan AM, Mehra R. Concurrent nuclear ERG and MYC protein overexpression defines a subset of locally advanced prostate cancer: potential opportunities for synergistic targeted therapeutics. Prostate 2016;76:845–853.

Vermeulen K, Van Bockstaele DR, Berneman ZN. The cell cycle: a review of regulation, deregulation and therapeutic targets in cancer. Cell Proliferation 2003;36:131-149.

Walhout AJM, Gubbels JM, Bernards R, van der Vliet PC, Timmers HThM. c-Myc/Max heterodimers bind cooperatively to the E-box sequences located in the first intron of the rat ornithine decarboxylase (ODC) gene. Nucleic Acids Research 1997;25:1493-1501.

Wu CH, Sahoo D, Arvanitis C, Bradon N, Dill DL, Felsher DW. Combined Analysis of Murine and Human Microarrays and ChIP Analysis Reveals Genes Associated with the Ability of MYC To Maintain Tumorigenesis. PLoS Genetics 2008;4:e1000090.

Xiang P, Lo C, Argiropoulos B, Lai CB, Rouhi A, Imren S, Jiang X, Mager D, Humphries RK. Identification of E74-like factor 1 (ELF1) as a transcriptional regulator of the Hox cofactor MEIS1. Exp Hematol. 2010;38:798–808.

Xu XY, Wang WQ, Zhang L, Li YM, Tang M, Jiang N, Cai SL, Wei L, Jin F, Chen B. Clinical implications of p57 KIP2 expression in breast cancer. Asian Pac J Cancer Prev. 2012;13:5033-5036.

Yan Y, Yin P, Gong H, Xue Y, Zhang G, Fang B, Chen Z, Li Y, Yang C, Huang Z, Yang X, Ge J, Zou Y. Nucleosome Assembly Protein 1-Like 1 (Nap1l1) Regulates the Proliferation of Murine Induced Pluripotent Stem Cells. Cellular Physiology and Biochemistry 2016;38:340–350.

Zaid A, Hodny Z, Li R, Nelson BD. Sp1 acts as a repressor of the human adenine nucleotide translocase-2 (ANT2) promoter. Eur. J. Biochem. 2001;268:5497–5503.
