Copyright 2007

By

Alex (Yick-Lun) Steven So


Dedicated to:

My mother (Y.Y.S) and father (M.S.S) for sacrificing so much for our family. I love you with all my heart.

My three sisters (Eliza So, Carina So, Jasica So) and brother (Mark So) for all that they have done for me and loving me unconditionally.


Acknowledgements

Thank you Mom and Dad for leaving everything that you know to embark on such an exhausting journey just so that I can have the opportunity to be at this present moment. You have made more sacrifices than the world could ever imagine, and I can only begin to express my endless gratitude. I admire you both for your vast courage and immense strength in providing us with everything after starting with nothing.

I wish to thank my sisters, brother, and grandparents for all their support and encouragement. You have taught me the meaning of ambition, perseverance, and inspiration. I would not be here without any of you.

I am grateful to all my friends for being there for me, listening to me complain about graduate school, and supporting me every step of the way. You all have shaped me as a person and as a scientist.

Thank you to the scientific committee for providing me with the resources to finish this body of work. My special appreciation goes out to my thesis committee members (Peter Walter, Wendell Lim, and Geeta Narlikar) and other people who have helped me along the way (Kevan Shokat and Michael Stallcup).

I am indebted to Keith Yamamoto for his motivation and for entrusting me with the power to practice my thoughtfulness and creativity through science. Your intelligence, wisdom, charisma, and exceptional talents have inspired this body of work and all my scientific achievements and beyond. Finally, I would like to express my vast appreciation to all the members of the Yamamoto lab for being my role models and my friends.


Determinants of Glucocorticoid Receptor (GR) Transcriptional Specificity and Genomic Occupancy

Alex (Yick-Lun) Steven So

Abstract

The glucocorticoid receptor (GR) associates with glucocorticoid response elements (GREs) and regulates selective transcription in a cell-specific manner. GREs are typically thought to be composite elements that recruit GR as well as other factors into functional complexes. We assessed whether GR occupancy is commonly a limiting determinant of GRE function as well as the extent to which core GR binding sequences (GBSs) and GRE architecture are conserved at functional loci. We found that GR was bound in A549 cells predominantly near genes responsive to glucocorticoids in those cells and not at genes regulated by GR in other cells. The GREs were distributed equally upstream and downstream of the transcription start sites, with 63% of them >10kb from those sites. Strikingly, sequences flanking the GBSs varied among GREs but were conserved at individual GREs. Similarly, although the GBSs across the set of GREs varied extensively around a consensus, the precise sequence at an individual GRE was conserved across species. Thus, we further examined whether sequence conservation of sites resembling GBSs is sufficient to predict GR occupancy of GREs at genes responsive to glucocorticoids. Indeed, we found that the level of conservation of GBSs at genes up-regulated by glucocorticoids in mouse C3H10T1/2 mesenchymal stem-like cells correlated directly with the extent of occupancy by GR. We conclude that GR occupancy is a primary determinant of glucocorticoid responsiveness and that the sequences of GBSs as well as GRE architecture likely harbor gene-specific regulatory information. Moreover, GBS conservation alone is sufficient to predict GR occupancy at induced genes.

In this study, we found that genes important for regulating circadian rhythm were responsive to glucocorticoids in mesenchymal stem cells (MSCs). Thus, we confirmed that GR stimulated rhythmicity in these cells and identified primary GREs at the Per1, Per2, and E4bp4 genes. We investigated whether the circadian clock in MSCs became autonomous of GR function after glucocorticoids had initiated it and found that continual GR activity is required for maintaining rhythmicity. Thus, we conclude that GR initiates circadian rhythm by directly activating the Per1, Per2, and E4bp4 genes, and that GR function is essential for maintenance of rhythmicity.


Table of Contents

Abstract

List of Tables and Figures

Introduction

Chapter 1: Determinants of Cell- and Gene-Specific Transcriptional Regulation by the Glucocorticoid Receptor

Chapter 2: Conservation Analysis Predicts In Vivo Occupancy of Glucocorticoid Receptor Binding Sequences

Chapter 3: Initiation and Maintenance of Circadian Rhythm in Mesenchymal Stem Cells by the Glucocorticoid Receptor

Perspectives

Bibliography


List of Tables and Figures

Chapter 1

Table 1.1 Relative locations of GR Binding Regions

Table 1.2 Dex responsiveness of steroid targets from other cells in A549 cells

Table 1.3 Distances of GREs relative to adjacent gene are conserved in the mouse genome

Table 1.4 Primers used for cloning and mutating GRE reporters

Table 1.5 Primers used for qPCR analysis

Figure 1.1 (A) Identification of GR binding regions (GBRs)

(B) GBRs identified near SCNN1A and SDPR genes

Figure 1.2 Percentage of genes associated with one or more GBRs in A549 cells

Figure 1.3 (A) GR binding regions confer glucocorticoid responsiveness

(B) Identification of enriched motifs within GR binding regions

(C) Conservation analysis of GREs

Figure 1.4 (A) Locations of GREs relative to the transcription start site of target genes

(B) Distribution of GREs relative to target gene start site

Figure 1.5 (A) Dex induces increased DNase I accessibility

(B) Human-mouse sequence conservation within GREs

Figure 1.6 (A) Binding of GR at mouse orthologs of primary GR target genes from human A549 cells

(B) Genes adjacent to GREs are regulated by GR in C3H10T1/2 cells

(C) Core GR binding sequences are highly conserved

(D) Comparative sequence conservation across individual GREs

Figure 1.7 Sequence conservation “signatures” are distinct for each GRE

Figure 1.8 (A) Sequence comparison of human GRE 10.5 with human GRE 6.1

(B) Sequence comparison of human GRE 6.4 and X.2

Figure 1.9 Number of GREs with putative GR binding sites

Chapter 2

Table 2.1 Glucocorticoid responsive genes identified in C3H10T1/2 cells

Table 2.2 Identification of GR targets and GR recognition motifs

Table 2.3 GR occupancy at genes induced, unresponsive or repressed by glucocorticoids

Table 2.4 Luciferase GBS reporters

Table 2.5 Positional weight matrix generating GBSs

Table 2.6 Chromatin immunoprecipitation of GR in C3H10T1/2 cells at dex induced genes

Table 2.7 Chromatin immunoprecipitation of GR in C3H10T1/2 cells at dex repressed genes

Table 2.8 Chromatin immunoprecipitation of GR in C3H10T1/2 cells at dex unresponsive genes

Figure 2.1 Pair-wise sequence conservation of GBSs is sufficient to predict GR occupancy at dex-induced genes

Figure 2.2 Identification of known and novel GR-occupied GBSs

Figure 2.3 GR-occupied GBSs are functional in C3H10T1/2 cells

Chapter 3

Figure 3.1 Glucocorticoids stimulate transcriptional responses of clock genes

Figure 3.2 GR induces circadian rhythm in mouse primary mesenchymal stem cells

Figure 3.3 GR-induced transcriptional oscillation of the circadian clock components is gene-specific

Figure 3.4 Glucocorticoids induce three distinct oscillating phases

Figure 3.5 GR directly regulates transcription of circadian clock genes

Figure 3.6 Regulation of clock genes is conserved in human primary MSCs

Figure 3.7 GR activity is required to maintain glucocorticoid-stimulated circadian rhythm

Figure 3.8 RU486 can block the initiation of transcriptional oscillation stimulated by dex


Introduction

The glucocorticoid receptor (GR) mediates a wide range of physiological processes, including anti-inflammatory responses in immune cells, anti-proliferative effects in osteoblasts, and differentiation of mesenchymal stem cells into adipocytes [1,2]. GR can achieve such diverse biological processes through its ability to regulate cell-specific transcriptional responses, which together produce a function specific to a particular cell type. Comparison of glucocorticoid responsiveness in cells cultured from distinct lineages reveals surprisingly little overlap of GR-regulated genes between cell types [3,4]. Understanding the mechanism of cell-specific regulation by GR has important physiological and pharmacological implications. Further knowledge of this mechanism would allow the dissection of transcriptional networks that govern particular biological processes. Moreover, selective steroid ligands have been designed to modulate GR activity to direct cell-specific processes [5]. Ligands such as these will be important for selective glucocorticoid therapy, such as driving the anti-inflammatory response in immune cells without triggering the anti-proliferative effects in osteoblasts that lead to osteoporosis. Currently, the determinants that drive cell-specific transcriptional regulation by GR are not well characterized.

Within an individual cell type, GR can also direct transcription in a gene-specific manner. The most dramatic examples are readily seen by comparing the glucocorticoid response of different genes: some transcripts are up-regulated while others are down-regulated. Rogatsky et al. [4] demonstrated that GR can exert gene-specific responses even within a set of targets that share the same glucocorticoid response, e.g., activation. Different glucocorticoid targets were shown to require gene-specific patterns of GR domains for transcriptional activation, suggesting that the receptor may use different mechanisms to drive expression at specific promoters. GR also appears to utilize distinct cofactor complexes to mediate gene-specific regulation, as the requirement for the coregulators Med1 and Med14 has been shown to be target specific [6]. Hence, it appears that gene-specific regulation by GR is achieved through the use of different receptor domains, which in turn impose selective coregulator requirements at a particular gene. However, the determinants that specify gene-specific GR transcriptional regulation are unclear.

Transcriptional regulation is achieved by GR through its interaction with “primary” glucocorticoid response elements (GREs), which are typically thought of as composite elements made up of one or more GR binding sequences (GBSs) as well as other non-GR binding sites. It has been suggested that the precise sequence of the GR binding site may harbor regulatory functions in addition to serving as a factor docking site [7]. For instance, GR may interact directly with specific DNA sequences or tether to distinct DNA sequences through protein-protein interaction to confer distinct modes of transcriptional regulation [7]. The concept that the DNA binding site of a factor harbors regulatory information has also been proposed for factors other than GR. For instance, the precise sequence of the NFκB binding site has been shown to direct distinct cofactors for gene-specific transcriptional regulation [8]. Collectively, these data support the idea that the DNA binding sequence can direct cell- and gene-specific transcriptional regulation. The mechanism by which the factor binding site specifies distinct functions of the bound protein may in part be due to allosteric effects of the DNA sequence. For instance, it has been suggested that different estrogen receptor (ER) binding sequences may alter the receptor conformation. Different DNA binding sequences were shown to grossly affect the conformation of the DNA binding domain (DBD) of ER, as measured using antibody gelshift experiments and protease digestion assays [9]. It was also suggested that different ER DNA binding sequences may alter the receptor configuration to potentially form distinct protein-protein interaction surfaces [9]. In these models, the DNA binding sequence can be viewed as a ligand that modulates the conformation of the bound protein upon interaction. For GR, it is unclear whether the DNA binding sequence is a determinant of receptor function, such as directing gene-specific regulation by GR.

At the beginning of this study, most identified regulatory elements had been found upstream of and proximal to the promoter. However, it is well established that genomic elements may reside far from the promoter to drive transcriptional regulation. For example, the regulatory elements that control the globin gene cluster, generally referred to as its locus control region (LCR), are >30kb from the promoter of the target genes [10]. It has been demonstrated that genomic elements can exert long-range transcriptional regulation through ‘looping’ mechanisms, in which the elements are in close proximity to the target promoter in three-dimensional space [10]. The emergence of techniques that allow high-throughput identification of genomic factor binding sites has led to some interesting observations regarding the locations of regulatory elements. Using chromatin immunoprecipitation followed by microarray (ChIP-chip) analyses, Martone et al. demonstrated that more than half of NFκB binding sites are positioned at regions other than upstream promoter proximal locations [11]. Similarly, Hartman et al. found that nearly 40% of STAT binding sites are located more than 50kb from the nearest Ensembl gene [12]. Likewise, several steroid receptor response elements have been found far from, and downstream of, the transcription start site (TSS) of the neighboring responsive gene, implying that long-range regulation may be a general mechanism for regulatory factors and that genomic elements are not restricted to regions upstream of the TSS [12]. The genomic locations of GREs relative to glucocorticoid responsive genes have not been well assessed.

Given the complexity of regulatory element locations, it has been difficult to predict and identify response elements. Computational analyses have been applied to isolate DNA motifs in silico in order to predict sites occupied by regulatory factors [13]. However, this approach for predicting GREs is limited by our knowledge of the GR binding sequence (GBS) consensus. Thus, a better understanding of GBS motifs is required for computational identification of putative GR binding sites. Moreover, given the sequence flexibility of the GBS motif [14], this motif is likely to occur frequently in the large mammalian genome, which further complicates prediction of GREs. Comparative analysis of DNA sequences across genomes of different species has been instrumental in identifying gene-coding regions. Potentially, cross-species conservation analysis may be combined with computational identification of GBS motifs for predicting and isolating genomic elements occupied by GR in vivo. This notion has not been well examined.

Further understanding of the mechanisms that specify GR transcriptional regulation will be pivotal for deciphering the networks and molecular pathways the receptor utilizes for controlling individual biological processes. For example, it has been demonstrated that glucocorticoids can stimulate circadian rhythm in dissociated rat fibroblast cells [15]. In mammalian systems, the circadian clock regulates major aspects of metabolism [16-18] that overlap with those mediated by glucocorticoids, which are also under circadian regulation [19]. This suggests that regulation of metabolism by the circadian clock may be mediated in part through GR. It has been demonstrated that GR can affect the transcriptional oscillation of clock-regulating genes [15]. However, it is unclear which clock components are directly regulated by GR to initiate rhythmicity, and the molecular mechanism by which GR regulates circadian rhythm remains unknown.

A major goal of this thesis is to assess the determinants that govern GR transcriptional specificity. In particular, what are the primary determinants that mediate cell- and gene-specific GR transcriptional regulation? Is cell-specific transcriptional regulation governed at the level of GR occupancy at GREs? Does the architecture of the GREs or the precise sequence of the GBS play a functional role in determining gene-specific regulation? Also, where are GREs positioned relative to their target promoters? Is cross-species sequence conservation of GBSs alone sufficient to predict GR occupancy in vivo? Furthermore, we were interested in using the information gained from studying the mechanism of GR transcriptional regulation to advance the understanding of GR biological processes. To this end, we investigated the molecular pathway by which GR mediates circadian rhythm.


Chapter 1: Determinants of Cell- and Gene-Specific Transcriptional Regulation by the Glucocorticoid Receptor


Chapter 1 Introduction

The great challenge of metazoan transcriptional regulation is to create specialized expression pathways that accommodate and define myriad contexts, i.e., different developmental, physiological and environmental states in distinct organs, tissues and cell types. This is achieved by a network of transcriptional regulatory factors, which receive and integrate signaling information, and transduce that information by binding close to specific target genes to modulate their expression. For example, the glucocorticoid receptor (GR) associates selectively with corticosteroid ligands produced in the adrenal gland in response to neuroendocrine cues; the GR-hormone interaction promotes GR binding to genomic glucocorticoid response elements (GREs), in turn modulating the transcription of genes that affect cell differentiation, inflammatory responses, and metabolism [1,2]. Expression profile analyses have identified glucocorticoid responsive genes in different cell types [3,4], and it is striking that there is only modest overlap in glucocorticoid-regulated gene sets between two cell types. The mechanisms by which GR selectively regulates transcription in cell-specific contexts are not well established.

An intriguing feature of GREs and other metazoan response elements is that their positions relative to their target genes are not tightly constrained [10,20]. Although certain metazoan response elements have been described that operate from long range, most searches for such regulatory sequences have nevertheless focused for technical reasons on restricted zones just upstream of promoters, where prokaryotic and fungal elements reside. Thus, the GRE for the interleukin-8 (IL8) gene is just upstream of the promoter [21], whereas the tyrosine aminotransferase GRE resides at -2.5kb [22]. Recent, more systematic searches for response elements have revealed dramatic examples, such as an estrogen response element 144kb upstream from the promoter of the NRIP gene [23], and an intragenic region 65kb downstream from the Fkbp5 promoter that appears to serve as an androgen response element [24]. It has been suggested that long-range regulatory mechanisms are likely to facilitate and promote regulatory evolution [25]. However, it has not been determined whether the position of a response element relative to its target gene is functionally significant.

Evidence from numerous anecdotal, gene-specific studies indicates that native response elements are typically composite elements that encompass distinct sequence motifs recognized by two or more regulatory factors. In turn, the bound factors recruit non-DNA binding co-regulatory factors, forming functional regulatory complexes that remodel chromatin and modify the activity of the transcription machinery. In this scheme, the structure and activity of the regulatory complex at a given response element would be specified by at least three determinants: the sequence motifs comprising the response element, the availability of those sequences for factor binding, and the availability and activity levels of regulatory factors present in the cell. For example, primary GREs, defined as those at which GR occupancy is required for glucocorticoid-responsive regulation, are a diverse family of elements that bind GR together with an array of additional factors defined by the above three determinants. Such composite response elements provide a powerful driving force for combinatorial regulation [2,26], vastly increasing the capacity of a single factor to assume multiple regulatory roles. Indeed, the mere presence of GR in a regulatory complex is not sufficient for glucocorticoid regulation [21]. It is not known, however, whether such “nonproductive” binding by GR is common, or if instead GR occupancy is a strong indicator of GRE function.

GR binds to a family of related sequences that defines a consensus motif: an imperfect palindrome of hexameric half sites separated by a three base pair spacer [27-29]. Within those 15 base pair core GR binding sequences, a few positions are nearly invariant, whereas a substantial proportion can be altered with little effect on GR binding affinity [30]. However, the functional consequences of such “permitted” sequence variations are unknown. GR can mediate a range of regulatory processes within a single cell type, including activation and repression of specific genes [4,31]. These findings, together with the results of biochemical and structural studies, raise the possibility that the core GR binding sequences might themselves serve as distinct “GR ligands”, allosterically affecting GR structure to produce distinct GR functions [7,32]. Studies of other regulatory factors have led to similar conclusions [33,34]. If different core GR binding sequences indeed produce GRE-specific (and therefore target gene-specific) regulatory activities, we could expect that the core GR binding sequence associated with a given target gene would be strongly conserved through evolution, whereas the collection of core GR binding sequences across different genes would vary substantially. Analogously, if the ‘architecture’ of composite GREs, i.e., the arrangement of additional sequence motifs surrounding the core GR binding site, is also important for gene-specific regulation, we would expect flanking sequences surrounding the core GR binding site to also be evolutionarily conserved in a GRE-specific manner but not across GREs within a single genome. Neither of these notions has been examined.
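The motif geometry described above (two hexameric half sites around a three base pair spacer, 15 base pairs in total) can be illustrated with a minimal scanning sketch. The consensus string `AGAACANNNTGTTCT`, the function names, and the mismatch threshold are assumptions made for illustration; they are not taken from this work, and the sketch scans one strand only.

```python
# Illustrative sketch only: the consensus below is an assumed canonical
# GR binding sequence, not one defined in this thesis.
CONSENSUS = "AGAACANNNTGTTCT"  # hexamer + 3-bp spacer + hexamer = 15 bp


def mismatches(site: str, consensus: str = CONSENSUS) -> int:
    """Count positions where `site` differs from the consensus.

    Spacer positions marked 'N' in the consensus match any base.
    """
    return sum(
        1
        for base, ref in zip(site.upper(), consensus)
        if ref != "N" and base != ref
    )


def scan_gbs(seq: str, max_mismatches: int = 3):
    """Return (offset, site, mismatch_count) for every 15-bp window
    that is within `max_mismatches` of the assumed consensus."""
    width = len(CONSENSUS)
    hits = []
    for offset in range(len(seq) - width + 1):
        site = seq[offset:offset + width].upper()
        count = mismatches(site)
        if count <= max_mismatches:
            hits.append((offset, site, count))
    return hits
```

Because a substantial proportion of positions tolerate substitution, raising `max_mismatches` quickly increases the number of hits in a long sequence, which is one reason motif matching alone is a weak predictor of occupancy.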


In the present work, we sought to define and characterize a set of GREs in A549 human alveolar epithelial cells. Thus, we determined in A549 cells the presence of GR at specific GREs close to genes that are steroid regulated across a range of cell types. We assessed whether the GR-occupied GREs were limited mainly to genes that are GR regulated in A549, and measured within and between species the conservation of GRE sequences, architecture and genomic positions.


Chapter 1 Results

Identification of GR Binding Regions using ChIP-chip

To assess the correlation of GR occupancy with glucocorticoid responsiveness, we examined GR binding at three classes of genes in A549 human lung carcinoma cells: first, genes regulated by GR in A549 cells; second, genes regulated by GR in U2OS human osteosarcoma cells but not in A549; third, genes regulated by GR or the androgen receptor (AR) in cells other than A549 or U2OS. The AR-responsive genes were of interest because AR is closely related to GR and shares similar DNA-binding specificity in vitro [28,30,35]. The first two classes of genes were identified in our lab using expression microarrays, whereas the third class was compiled from our own microarray data and from published reports of others [3,4,36,37]. Both positively and negatively regulated genes were included; together the three classes comprised 548 candidate GR target genes. By examining these genes for GR binding in A549 cells, we could determine whether GR occupancy in vivo is restricted to genomic sites at genes actually regulated by glucocorticoids in A549 cells; alternatively, GR might also bind at genes that are not under glucocorticoid control in A549, but are regulated by GR or AR in other cells.

To identify GR binding regions (GBRs), we used ChIP-chip to interrogate 100kb genomic segments centered on the transcription start sites of our set of 548 genes. This ~55 Mb sample of the genome also included or impinged upon an additional 587 genes not previously reported to be regulated by GR; thus, we assessed GR occupancy in the vicinity of more than 1,000 genes. Immunoprecipitated chromatin samples from A549 cultures treated for one hour with the synthetic glucocorticoid dexamethasone (dex; 100 nM) or ethanol were hybridized onto the ChIP-chip tiling arrays. Independent biological replicates were hybridized onto two separate arrays and GBRs were identified using the SignalMap detection program; we estimated a 3.4% false positive rate for the GBRs detected in both arrays as assessed by conventional ChIP and quantitative PCR (qPCR) analysis. Importantly, we did not detect GR occupancy at 22 regions that showed no GR binding in the arrays (data not shown). The ChIP-chip experiments revealed a total of 73 GBRs adjacent to 61 genes (Table 1.1) that were validated by GR ChIP and qPCR analysis (Figure 1.1A). In addition to identifying GBRs previously detected by conventional ChIP, our experiments revealed novel GBRs in regions not searched in prior studies. For example, two known promoter proximal GBRs at SCNN1A [38] and SDPR [3] were confirmed in the ChIP-chip arrays, and new GBRs were observed +3kb and -20kb from the SCNN1A and SDPR transcription start sites, respectively (Figure 1.1B).
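As a quick check of the sampling figures quoted above, the arithmetic can be written out as follows. It assumes, for illustration, that the 100kb segments are counted without correcting for any overlap between neighboring genes.

```latex
548 \times 100\,\mathrm{kb} = 54{,}800\,\mathrm{kb} = 54.8\,\mathrm{Mb} \approx 55\,\mathrm{Mb},
\qquad
548 + 587 = 1135 > 1000 \ \text{genes}
```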

Table 1.1: Relative locations of GR Binding Regions. The ‘coordinate’ numbers of the corresponding GBR represent the center of the GR binding regions obtained from UCSC Genome Browser. All listed GBRs were validated as bona fide GR occupied regions by conventional ChIP-qPCR (see Figure 1.1A). The GBRs were assigned to the nearest dex responsive gene from the final list of genes that were included or impinged upon by the ChIP-chip arrays. Bold letter genes are those responsive to dex in A549 cells as shown in previous expression microarray profiles [3] or in this study. ‘Position of GBR’ indicates whether the GBR is intronic (“intron”), upstream of the transcription start site (TSS) (“upstream”), or downstream from the coding sequence of the target gene (“downstream of gene”). The longest transcript is listed first if a gene contains multiple alternative transcription start sites.


UTR Intron Intron Intron Intron Intron Intron Intron Intron Intron Intron Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Position of GBR Downstream of gene Downstream of gene Downstream of gene Downstream of gene Downstream of gene -43 831 -849 2273 4879 9301 -1537 -1227 -8684 -8387 -2095 -1972 -1392 46595 10412 29569 41230 15333 22741 47771 27516 29816 26162 34019 57335 -11082 -26427 -18680 -24939 -39285 -22148 -12960 -13046 -21639 -19683 -46566 -25622 -10430 -36007 -96753 -25288 -24708 199645 -332404 to TSS (bp) GBR Distance IRF8 Gene TNS4 TNS4 ERN1 PRG1 MT2A CDH1 CCR7 THBD Target CYGB PYGB PYGB DDIT4 DDIT4 BIRC3 PAMCI SSTR4 SSTR4 KRT6A EPSTI1 ABHD2 ABHD2 NFKBIA AMIGO2 VMD2L1 AKAP13 AKAP13 AKAP13 AKAP13 CCDC40 SCNN1A SCNN1A SEC14L2 TSC22D3 TSC22D3 TSC22D3 TSC22D3 TSC22D3 TSC22D3 OR7E14P ANGPTL4 KIAA1434 ALOX5AP SERPINA3 RefSeq NM_002727 NM_019058 NM_019058 NG_002175 NM_001165 NM_001038 NM_005554 NM_005447 NM_001038 NM_181847 NM_001629 NM_001002264 NM_020529 NM_001085 NM_007011 NM_007011 NM_005953 NM_004360 NM_002163 NM_032865 NM_032865 NM_001838 NM_001433 NM_134268 XM_371082 NM_139314 NM_017682 NM_019593 NM_001052 NM_002862 NM_002862 NM_012429 NM_006738 NM_144767 NM_006738 NM_144767 NM_001052 NM_000361 NM_198057 NM_001015881 NM_004089 NM_198057 NM_001015881 NM_004089 6355825 6352704 8326625 5586238 51174824 70491407 73685003 73678745 16991094 84732829 45713321 30185521 42477337 34942864 94135422 87442841 87461998 55198785 67307057 84531505 35896009 35920025 35952509 59513438 72065060 75652568 12722377 22938499 25166277 25206522 29143649 83688869 83688869 83924520 83924520 22969000 22969000 Coordinate 106791144 106791144 106791144 101682322 106767828 106767828 106767828 X.1 X.2 11.1 11.2 10.3 10.4 10.5 12.1 12.2 12.3 12.4 12.5 13.1 13.2 14.1 14.2 15.4 15.5 16.1 16.2 16.3 
17.1 17.2 17.3 17.4 17.5 17.6 19.1 19.2 20.1 20.2 20.4 20.5 22.1 15.2 15.3 20.3 GBR Intron Intron Intron Intron Intron Intron Intron Intron Intron Intron Intron Intron Intron Intron Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream Upstream/UTR Upstream/UTR Position of GBR Downstream of gene Downstream of gene Downstream of gene Downstream of gene Downstream of gene 12 14 -136 -507 -242 -839 -984 1584 4409 1252 2854 9146 -5602 -1380 -6191 -2885 -3255 12511 12019 13220 42201 20002 86852 47944 76555 49063 34178 21543 71636 35353 35353 -11245 -33791 -19787 -21248 -31610 -17325 -10071 -49313 -256751 -345710 to TSS (bp) GBR Distance IL8 IL6R SGK Gene PAG1 SHC1 SHC1 SDPR SDPR Target VIPR1 STOM ASB13 CCL20 CPLX2 CPLX2 ETNK2 CXCL2 ACSL1 FKBP5 HERC3 ANXA8 IGFBP1 IGFBP1 C6orf81 C6orf81 GPR115 DCBLD2 ZA2OD2 ZA2OD2 AKAP13 AKAP13 DHCR24 TMEM37 IRF2BP2 SLC19A2 CREB3L2 NOSTRIN KIAA0232 RAB11FIP1 RAB11FIP1 RAB11FIP1 ARHGAP26 RefSeq NM_014762 NM_000565 NM_006996 NM_018208 NM_182972 NM_183240 NM_052946 NM_004657 NM_004657 NM_004591 NM_004624 NM_080927 NM_014743 NM_002089 NM_014606 NM_001995 NM_000584 NM_015071 NM_004117 NM_145028 NM_145028 NM_005627 NM_153838 NM_000596 NM_000596 NM_194071 NM_018440 NM_006007 NM_006007 NM_004099 NM_024701 NM_001630 NM_003029 NM_183001 NM_006650 NM_001008220 NM_006738 NM_144767 NM_001002814 NM_025151 NM_001002233 6870929 5714370 82110304 55052924 42561322 75330454 89890827 74971324 35677841 35801593 35802767 47774006 45695008 45702450 72160383 72210285 47238469 83675562 83675562 37840809 37840809 37840809 Coordinate 142113152 119900074 151203962 166187035 200851096 231085797 169227864 192537475 192557274 228508484 100124471 186109853 134539075 137096157 121214984 151757059 151757059 175227852 175227852 1.1 1.2 1.4 1.5 1.6 2.1 2.2 2.3 2.4 2.5 3.1 3.2 4.1 4.2 4.3 4.4 4.5 5.1 6.1 6.2 6.3 6.4 6.5 7.1 7.2 7.3 8.2 9.1 9.2 9.3 1.3 
5.2 8.1 10.1 10.2 15.1 GBR
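The ‘GBR Distance to TSS’ and ‘Position of GBR’ columns of Table 1.1 can be illustrated with a minimal sketch of how a strand-aware distance and a coarse position label might be derived from a GBR center coordinate. The `Gene` record, its field names, and the example coordinates used with it are hypothetical. The sketch cannot distinguish introns from UTRs, which would require exon annotation, so it reports “within gene” for both.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical minimal gene record (not a real annotation format)."""
    name: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site coordinate
    tx_end: int  # coordinate where the transcribed region ends


def distance_to_tss(gbr_center: int, gene: Gene) -> int:
    """Signed distance from the TSS to the GBR center.

    Negative values are upstream and positive values are downstream,
    measured in the direction of transcription.
    """
    offset = gbr_center - gene.tss
    return offset if gene.strand == "+" else -offset


def classify_position(gbr_center: int, gene: Gene) -> str:
    """Coarse position label relative to the gene body."""
    distance = distance_to_tss(gbr_center, gene)
    gene_length = abs(gene.tx_end - gene.tss)
    if distance < 0:
        return "upstream"
    if distance <= gene_length:
        return "within gene"
    return "downstream of gene"
```

For a minus-strand gene the sign is flipped, so a GBR at a higher genomic coordinate than the TSS is reported as upstream.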



Figure 1.1: (A) Identification of GR binding regions (GBRs). The Log2(peak score) of GBRs obtained from the ChIP-chip arrays is plotted versus the dex induced enrichment of GR at the corresponding GBR, which was assessed by GR ChIP-qPCR (averaged over at least three independent experiments. Note: dex-induced GR binding at the GBRs was observed reproducibly in all the individual experiments). Solid diamonds, bona fide GBRs; open diamonds, negative control regions. (B) GBRs identified near SCNN1A and SDPR genes. Exons, vertical bars; introns, horizontal lines; arrows depict direction of transcription. GBRs 12.1 and 2.3 are known promoter proximal GR binding regions associated with SCNN1A and SDPR, respectively. GBRs 12.4 and 2.4 are novel GR binding regions identified in the present work. GBR nomenclature: unique identifiers corresponding to the human chromosome number containing the GR binding region followed by an arbitrary integer tag.


GR Occupancy Correlates with Glucocorticoid Responsiveness

Sixty-four of the 73 A549 GBRs (88%) identified were associated with genes regulated by GR in those cells (Table 1.1). Although the remaining nine GBRs may be nonfunctional, they may mediate responses under different biological conditions.

Notably, 27% of the genes that were glucocorticoid responsive specifically in A549 but not in U2OS cells were associated with a GBR, whereas only 1.9% of the genes responsive to glucocorticoids in U2OS but not in A549 contained A549 GBRs (Figure 1.2). Similarly, only 1.8% of the genes that were glucocorticoid or androgen responsive in other cells, and only 0.3% of the genes that were sampled by the ChIP-chip arrays but were not steroid regulatory targets were associated with A549 GBRs (Figure 1.2, Table 1.2). Thus, GR occupancy in A549 cells is generally restricted to genes that are actually regulated by glucocorticoids in those cells; specifically, GR is rarely bound in A549 cells at genes responsive to glucocorticoids in other cells. We conclude that GR occupancy is a major determinant of glucocorticoid responsiveness in A549 cells at the genes assessed in this study.


Gene category                        Genes   With GBR
A549 Specific; Dex Responsive        150     27% (40 genes)
U2OS Specific; Dex Responsive        101     <2% (2 genes)
Other Cells; Steroid Responsive      274     <2% (5 genes)
Other Genes Included in arrays       587     0.3% (2 genes)

Figure 1.2: Percentage of genes associated with one or more GBRs in A549 cells. ‘A549 specific dex responsive’ genes are regulated by GR in A549 cells but not U2OS cells. ‘U2OS specific dex responsive’ genes are regulated by GR in U2OS cells but not in A549 cells. The 34 genes regulated by GR in both A549 and U2OS cells, 12 of which were associated with an A549 GBR, were excluded from the analysis shown. Genes responsive to glucocorticoids or androgens in cells other than A549 and U2OS are denoted as ‘other cells steroid responsive’. Lastly, additional genes that were wholly or partially included in our ChIP-chip arrays due to the extensive sampling of regions around all the genes mentioned above are represented as ‘genes included in arrays’.
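The percentages in Figure 1.2 follow directly from the gene counts printed in the figure. A minimal sketch of that arithmetic is given below; the dictionary layout and the function name `percent_with_gbr` are illustrative choices, not part of the original analysis.

```python
# Gene counts read from Figure 1.2:
# category -> (genes with at least one A549 GBR, total genes in category)
counts = {
    "A549-specific dex responsive": (40, 150),
    "U2OS-specific dex responsive": (2, 101),
    "Other cells steroid responsive": (5, 274),
    "Other genes included in arrays": (2, 587),
}


def percent_with_gbr(with_gbr: int, total: int) -> float:
    """Percentage of genes in a category associated with one or more GBRs."""
    return 100.0 * with_gbr / total


percentages = {name: percent_with_gbr(*pair) for name, pair in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}% with GBR")
```

The same function applied to the 64 of 73 GBRs that lie near A549-regulated genes gives the 88% quoted in the text.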


Gene        Source              4hr Dex   8hr Dex
IRF8        U2OS                1.0       1.4
PDGFA       U2OS                1.1       0.8
NET1        U2OS                0.8       0.6
MSX2        U2OS                0.9       0.9
VGCNL1      U2OS                0.9       0.8
ZNF30       U2OS                0.8       0.9
CYP4B1      U2OS                1.2       0.8
NSDHL       U2OS                0.9       0.9
EFNA1       U2OS                0.7       1.0
CDKN1B      U2OS                1.2       1.0
MYC         U2OS                0.9       1.2
CYLD        U2OS                0.8       1.0
DPP4        U2OS                1.9       2.2
LADININ 1   U2OS                1.2       1.0
MLYCD       U2OS                0.9       1.1
SERTAD1     U2OS                0.9       1.0
SLC9A2      U2OS                1.0       0.7
ASB9        U2OS                1.0       1.2
HPS4        U2OS                1.0       1.0
F3          U2OS                1.2       0.7
PDE4DIP     U2OS                1.1       1.3
DCBLD2      Other Cells         1.4       1.3
DST         Other Cells         0.8       0.8
SFMBT2      Other Cells         0.9       1.1
CREB3L2     Other Cells         1.4       1.4
THOC3       Other Cells         0.9       1.2
ASB13       Other Cells         1.5       1.0
MMP7        Other Cells         0.8       1.0
LAMB1       Other Cells         0.7       1.0
TMEPAI      Other Cells         1.0       1.1
ANK3        Other Cells         0.9       1.1
SCAP        Other Cells         1.1       1.0
ARHGAP26    Other Cells         1.0       0.9
CDC42SE1    Other Cells         1.2       1.5
ANKRD47     ChIP-chip Spanned   1.3       1.1
SEC22L3     ChIP-chip Spanned   1.1       1.3
MTP18       ChIP-chip Spanned   1.5       1.1
SOX13       ChIP-chip Spanned   1.0       1.2
MCEE        ChIP-chip Spanned   0.8       1.1
PGS1        ChIP-chip Spanned   1.1       1.4
PDE4C       ChIP-chip Spanned   1.0       1.3
NAT6        ChIP-chip Spanned   0.9       0.9
PF4         ChIP-chip Spanned   0.9       0.9
RNF8        ChIP-chip Spanned   1.0       1.2
AP3S1       ChIP-chip Spanned   0.9       1.4
MTHFD2L     ChIP-chip Spanned   0.8       0.9
TJAP1       ChIP-chip Spanned   -2.1      -1.9

Table 1.2: Dex responsiveness of steroid targets from other cells in A549 cells. Quantification of relative mRNA levels by qPCR of a subset of the 587 genes (denoted as ‘ChIP-chip Spanned’) included in the ChIP-chip arrays (Figure 1.2) showed that they were not dex responsive (less than 1.6 fold change) in A549 cells after 4 or 8 hr of treatment; ‘U2OS’ source genes are responsive to dex in U2OS but not in A549 cells; ‘Other Cells’ source genes are potentially steroid responsive in other cells but not in A549 cells. Analysis with qPCR confirms that a majority of these genes were indeed not responsive to dex in A549 cells after 4 or 8 hours of treatment. Values shown are fold changes comparing dex and ethanol treatment averaged over at least two independent experiments. Bold letter genes are those that are dex responsive in A549 cells.


A549 GBRs are Functional Glucocorticoid Response Elements

To test whether the A549 GBRs can confer glucocorticoid-directed transcriptional responses, we cloned 500bp DNA fragments encompassing the GBRs into luciferase reporter plasmids. Of the 20 GBRs randomly selected from the 73 GBRs identified in this study, 19 were dex-responsive in A549 cells as assessed by reporter analysis (Figure 1.3A). We define primary GREs (denoted here simply as GREs) as genomic regions that are occupied in vivo by GR and confer glucocorticoid-regulated transcription in transfected reporters. Although the reporter analyses do not prove that the identified elements are functional in their native contexts (see Discussion), they establish that the 500bp fragments tested harbor sufficient information for GR to regulate transcription. Thus, we shall refer to the GBRs henceforth as GREs.

Motif            p-value (-log)   Transcription Factor
tgagtcaK         6.1              AP-1
atKaNgMaaKMH     3.2              HNF4
tKVgYcat         3.4              C/EBP
caggaa           5.7              ETS Family
NNSNgggagggNN    2.1              SP1

Figure 1.3: (A) GR binding regions confer glucocorticoid responsiveness. A549 cells transfected with luciferase reporter genes linked to 500bp GBRs were treated with EtOH or 100nM dex for five to seven hr, harvested, and measured for luciferase activity. Fold dex-inductions are plotted for wildtype (white) reporters and mutant (black) reporters with singly (mutGR) or doubly mutated (dmutGR) GR binding sites; standard errors of mean over at least three independent experiments are shown. The 13 mutated GR binding sites were randomly chosen. The GREs that harbor these GR binding sites represent a range of enriched GR binding regions, ranging from ~6 to 40 fold dex induced GR occupancy as assessed by ChIP-qPCR (data not shown). (B) Identification of enriched motifs within GR binding regions. Top panel: Sequence logo, generated using WebLogo [39], represents all the compiled sequences resembling GR binding sites identified through computational analysis. Bottom panel shows other enriched motifs (displayed in IUPAC symbols) found in the GRE sequences. Motifs resembling AP-1, HNF4, and C/EBP binding sites were identified using BioProspector whereas motifs similar to ETS and SP1 binding sites were found with MobyDick. The p-values of the enriched motifs represent the random probability of these motifs occurring within the GREs. (C) Conservation analysis of GREs. The identity of the human and mouse sequences was calculated as number of bp matches minus the number of bp deletions or insertions, divided by a 50bp window. Shown are the average identities for each window across 50 GREs. The background level was calculated as the average of all conservation scores across the 4kb region. The abscissa shows bp positions with ‘0’ defined as the center of core GR binding sites for GREs.

GREs are Generally Distal and Evenly Distributed between Upstream and Downstream Regions

We determined the positions of the A549 GREs relative to the transcription start sites (TSSs) of their respective target genes (Figure 1.4A). For this analysis, the GREs were assigned to the nearest gene responsive to dex in A549 cells. Surprisingly, we found that 45% of the GREs were located downstream of the TSSs, suggesting that GR exhibits transcriptional regulation without a significant preference for regions upstream or downstream of TSSs (Table 1.1). Figure 1.4B summarizes the distribution of promoter proximal (within 5kb from the TSS) and distal GREs (farther than 10kb from the TSS).

Strikingly, 63% of the GREs were distal whereas only 31% of them were promoter proximal (Figure 1.4B). Mammalian response elements are commonly thought to reside upstream and proximal to their cognate promoters; thus, identification of GREs, and of response elements in general, has mainly focused on these regions. Importantly, Figure 1.4B demonstrates that only a small fraction of the GREs (17%) identified in this study were positioned within these regions. These results indicate that GREs are just as likely to be located downstream of the TSSs and that the majority operate remotely from their target promoters, at least by linear DNA distance.
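The positional categories used above can be expressed as a small classification rule. The sketch below is illustrative only: it assumes that negative distances denote positions upstream of the TSS (as in Table 1.1), labels the 5kb to 10kb range "intermediate" (a label that does not appear in the text), and uses the human distances of the 12 GREs listed in Table 1.3 as example input rather than the full set of 64.

```python
from collections import Counter

PROXIMAL_BP = 5_000   # promoter proximal: within 5kb of the TSS
DISTAL_BP = 10_000    # distal: farther than 10kb from the TSS


def classify(distance_bp: int) -> tuple[str, str]:
    """Return (side, zone) for a signed GRE-to-TSS distance in bp."""
    side = "upstream" if distance_bp < 0 else "downstream"
    span = abs(distance_bp)
    if span <= PROXIMAL_BP:
        zone = "proximal"
    elif span > DISTAL_BP:
        zone = "distal"
    else:
        zone = "intermediate"  # hypothetical label for the 5kb-10kb range
    return side, zone


# Human GRE-to-TSS distances (bp) taken from Table 1.3
distances = {
    "6.4": -1380, "12.1": -849, "16.1": -1227, "19.2": -2095,
    "5.1": -17325, "6.1": 86852, "6.2": -11245, "10.3": -26427,
    "10.4": -18680, "10.5": -24939, "X.1": 34019, "X.2": 57335,
}

tally = Counter(classify(d) for d in distances.values())
for (side, zone), n in sorted(tally.items()):
    print(f"{side:10s} {zone:12s} {n}")
```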


Figure 1.4: (A) Locations of GREs relative to the transcription start site of target genes. Number of GREs resident in 10kb increments of distance relative to the transcription start site of the target gene. White bars and black bars represent GREs upstream and downstream of the transcription start site, respectively. (B) Distribution of GREs relative to target gene start site. The chart presents percentage of GREs at various positions upstream and downstream of target genes. Note that only 64 of the 73 GREs detected in A549 cells were included in these analyses; the remaining 9 GREs did not associate with a dex responsive gene in these cells. The GREs were assigned to the nearest gene regulated by GR in A549 cells from the final list of genes that were included or impinged upon by the ChIP-chip arrays. Coordinates of TSSs were obtained from UCSC Genome Browser based on RefSeq. Similar results were obtained when we used TSS coordinates that were experimentally determined (DBTSS) through 5’end cloning (data not shown) [40]. The TSS of the longest transcript was used for genes that have multiple alternative transcription start sites. Similar results were obtained if the GREs were assigned to the closest TSS of the associated dex responsive gene: 38% of GREs were located downstream from the TSS, and 58% of GREs were positioned farther than 10kb from the assigned TSSs.


Our finding concerning GRE distribution is supported by two indirect analyses using nuclease sensitivity and sequence conservation. Sabo et al. found that DNAse I hypersensitive sites, indicative of chromatin-bound factors, are broadly distributed with a majority located >10kb from the nearest TSS [41]. Furthermore, Dermitzakis et al. showed that CNGs, ungapped 100bp fragments with at least 70% identity between human and mouse that are presumed factor binding regions, have no significant preference for promoter proximal regions [42,43]. As expected [44], we found that GR occupancy was correlated with DNAse I-hypersensitive cleavage at both promoter proximal (1.3, 1.5, 12.1, 16.1) and distal GREs (2.4, 5.1, 6.1, 6.3, 7.3, 20.2) (Figure 1.5A). In addition, by aligning the human GRE sequences with the corresponding regions in the mouse genome, we found that 23 of the GREs correspond to CNGs (Figure 1.5B). Moreover, GR occupancy and glucocorticoid responsiveness for several of these GREs/CNGs (6.4, 12.1, 5.1, 6.1, 6.2, 10.5, X.1, X.2) were maintained in mouse cells (see Figure 1.6A, B). Thus, by testing the GREs identified in our study, we were able to provide direct support for the notion that DNAse I hypersensitive sites and CNGs serve as regulatory elements [41,43].


GRE     Ungapped length (bp)   % Identity (human-mouse)
1.1     150                    89
1.6     107                    75
2.3     117                    88
2.4     148                    86
2.5     103                    74
3.1     123                    84
3.2     196                    87
5.1     211                    76
5.2     111                    73
6.1     135                    98
6.2     117                    82
6.4     101                    97
6.5     134                    87
7.2     113                    86
10.1    100                    70
10.5    168                    83
12.1    122                    89
12.2    113                    79
12.3    126                    90
17.4    175                    87
14.1    113                    96
X.1     165                    94
X.2     138                    87

Figure 1.5: (A) Dex induces increased DNAse I accessibility. Nuclei from A549 cells treated with EtOH or dex for 1hr were isolated, treated with DNAse I, and harvested for DNA. The relative amounts of DNA at the corresponding regions were assessed by qPCR and presented as % cleavage, with standard error of mean averaged among at least three independent experiments. Controls #1 and #2 correspond to regions near the AMOTL2 and CDH17 genes, respectively, which do not exhibit dex induced GR occupancy (data not shown). (B) Human-mouse sequence conservation within GREs. The mouse sequences aligned with 500bp human GRE sequences were obtained from UCSC Genome Browser. The lengths of the continuous GRE sequences without gaps between human and mouse are shown, and the percent identity of these regions was calculated as number of matched basepairs divided by length of fragment.
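The percent identity defined in the legend above, and the CNG criterion cited in the text (ungapped fragments of 100bp with at least 70% human-mouse identity), can be written out as follows. This is a sketch under stated assumptions: both thresholds are treated as minimums, and the function names `percent_identity` and `is_cng` are illustrative.

```python
def percent_identity(human: str, mouse: str) -> float:
    """Matched basepairs divided by fragment length, for an ungapped alignment."""
    if len(human) != len(mouse):
        raise ValueError("ungapped fragments must have equal length")
    matches = sum(h == m for h, m in zip(human.upper(), mouse.upper()))
    return 100.0 * matches / len(human)


def is_cng(length_bp: int, identity_pct: float,
           min_length: int = 100, min_identity: float = 70.0) -> bool:
    """Assumed CNG test: ungapped length >= 100bp and identity >= 70%."""
    return length_bp >= min_length and identity_pct >= min_identity


# Two rows from Figure 1.5B: (ungapped length in bp, % identity)
print(is_cng(100, 70))   # GRE 10.1
print(is_cng(135, 98))   # GRE 6.1

# Identity of the 15bp core site of GRE 6.4, human versus mouse (Figure 1.6C)
print(percent_identity("AGAACATTTTGTCCG", "AGAACATTCTGTTCT"))
```

The 15bp core site is shown only to exercise `percent_identity`; it is far too short to qualify as a CNG on its own.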

Computation and Conservation Suggest that Native GREs are Composite Elements

Native GREs, defined as naturally evolved genomic elements that confer glucocorticoid regulation on genes in their chromosomal contexts, are likely to be “composite elements”, made up of binding sites for GR together with multiple nonreceptor regulatory factors [2]. To assess whether we could detect such complex architecture, we used computational approaches (BioProspector and MobyDick) to survey the 500bp GRE-containing fragments for sequences related to known regulatory factor binding sites [45-47]. The most prominent motif found, present in 68% of the GRE sequences, was a series of imperfect palindromes similar to known core GR binding sites (Figure 1.3B). Potentially, GR may interact with the remaining 32% of GREs through other recognition motifs or through tethering to other factors [21]. Mutagenesis of computationally predicted core GR binding sites decreased or completely abolished dex stimulation for each of 13 randomly tested sites, validating this approach for identifying functional core GR binding sequences (Figure 1.3A). Some GREs, such as 6.2, 7.2, and 7.3, contained multiple GR binding sites; we found that reporters mutated at only one of those sites retained residual dex inducible activity. These experiments imply that most of the core GR binding sites identified in our computational analysis are functional.

In addition, we found that motifs similar to AP-1, ETS, SP1, C/EBP, and HNF4 binding sequences were enriched in the 500bp GRE fragments (Figure 1.3B). For example, motifs resembling AP-1 and C/EBP binding sites were identified in the GRE of the IL8 gene. Importantly, the AP-1 site is known to be crucial for regulation of IL8 by the AP-1 factor [48]; similarly, C/EBPα enhances transcription of an IL8 reporter spanning the GRE region [49]. Thus, as with GR binding sequences, our computational analysis was capable of discovering functional nonreceptor binding sites. Detection of multiple factor binding sites within the GRE sequences is consistent with the hypothesis that native GREs are typically composite response elements that recruit heterotypic complexes for combinatorial control [2].
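The enriched motifs in Figure 1.3B are written in IUPAC symbols, in which a letter such as K stands for G or T. As an illustration of how such a consensus can be located in a sequence, the sketch below converts an IUPAC motif into a regular expression. It is not the BioProspector or MobyDick procedure; it scans the forward strand only, and the test sequence is invented for the example.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression fragments
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC consensus (any case) into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(motif: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead is used so that overlapping occurrences are all reported.
    pattern = re.compile(f"(?=({iupac_to_regex(motif)}))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


ap1_like = "tgagtcaK"            # AP-1-like motif from Figure 1.3B
example_seq = "ccTGAGTCAGtt"     # made-up sequence, for illustration only
print(find_motif(ap1_like, example_seq))
```

A fuller scan would also search the reverse complement of each fragment.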

To estimate the extent of GRE conservation, we measured sequence identity in human and mouse across 4kb regions centered on the core GR binding sites (see Figure 1.3C legend and Materials and Methods) averaged across 50bp windows; a similar (albeit higher resolution) pattern was obtained with 15bp windows (data not shown). Strikingly, we found that the sequences flanking the core GR binding sites, spanning roughly 1kb, were conserved relative to background (Figure 1.3C). This elevated evolutionary conservation implies that these segments are biologically functional, not only in reporter constructs (Figure 1.3A), but also in their native chromosomal contexts, further supporting the view that native GREs are composite elements.
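The window score described in the Figure 1.3C legend (number of matches minus the number of inserted or deleted bp, divided by the window size) can be sketched as below. Several details are assumptions made for illustration: the two sequences are taken to be pre-aligned strings of equal length with `-` marking gaps, windows are non-overlapping, and the function names are hypothetical. The short strings in the example are invented and use a 5bp window only to keep the output readable.

```python
def window_identity(human_aln: str, mouse_aln: str, window: int = 50) -> list[float]:
    """Score each window as (matches - indel columns) / window size."""
    if len(human_aln) != len(mouse_aln):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    # Step through the alignment in non-overlapping windows (an assumption).
    for start in range(0, len(human_aln) - window + 1, window):
        matches = indels = 0
        pairs = zip(human_aln[start:start + window], mouse_aln[start:start + window])
        for h, m in pairs:
            if h == "-" or m == "-":
                indels += 1          # insertion or deletion in either species
            elif h.upper() == m.upper():
                matches += 1
        scores.append((matches - indels) / window)
    return scores


def background_level(scores: list[float]) -> float:
    """Background as the average of all window scores across the region."""
    return sum(scores) / len(scores)


demo_scores = window_identity("ACGTACGTAC", "ACGTAC-TAA", window=5)
print(demo_scores, background_level(demo_scores))
```

Note that with this definition a window dominated by gaps can score below zero.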

Sequence Conservation of Core GR Binding Sites and GREs

We next sought to examine in detail the extent of sequence conservation of some of the individual core GR binding sequences and GREs that we had identified in our study. We chose a subset of 12 human GREs that are occupied by GR both in another species, mouse, and in another cell type, C3H10T1/2 mesenchymal cells (Figure 1.6A). Consistent with the correlation between GR occupancy and glucocorticoid responsiveness (Table 1.1 and Figure 1.2), we confirmed that several of these genes (Fkbp5, Ddit4, Gilz, Mt2, and Sgk) were indeed dex-inducible in the C3H10T1/2 cells (Figure 1.6B). These 12 GREs resided at very different locations relative to the TSSs of their human target genes (ranging from 0.1 kb to 86 kb; Table 1.3); remarkably, however, each locus was approximately maintained in the mouse genome (Table 1.3). This finding suggests that the positions of individual GREs may be integral to their regulatory functions.

GRE site     Human (hGRE)     Mouse (mGRE)     Rat (rGRE)       Dog (dGRE)
Constrained   * ***    *
6.4          AGAACATTTTGTCCG  AGAACATTCTGTTCT  AGAACATTCTGTCCT  AGAACGTTCTGTCCG
12.1         AGAACAGAATGTCCT  AGAACAGAATGTCCT  AGAACAGAATGTCCT  GGAACAGAATGTCCT
16.1 #1      AGGACAGCCTGTCCT  AGGACAGCCTGTCCT  AGGACAGGCTGTCCT  AGAACAGGCTGTCCT
16.1 #2      AGAACAGGATGTTTA  AGGACACGGTGTTTA  AGAACAGGATGTTTT  AGGACAGGATGTTTA
19.2         GGAACACGGCGTCCC  GGAACACTGAGTCCT  GGAACACTGCGTCCT  GGAACATAGTGTCCT
5.1          GGTACATTCTGTTCA  GGTACGCTCTGTTCC  GGTACACTCTGTTCC  AGTACACTCTGTTCA
6.1 #1       AGAACAGGGTGTTCT  AGAACAGGGTGTTCT  AGAACAGGGTGTTCT  AGAACAGGGTGTTCT
6.1 #2       AGCACATCGAGTTCA  AGCACATCGAGTTCA  AGCACACCGAGTTCA  AGCACATCGAGTTCA
6.2          GGTACAGTTTGTTAC  GGTACAGTGTGTTAC  GGTACAGTGTGTTAC  GGTACAGTTTGTTAC
10.3         GGAACAGAAAGTATT  GGAACAGAAAGTTTT  GGAACAGAAAGTTTT  GTACCAGAAAGTATT
10.4         GGTACAGACCGTTCT  GGTACAGAGCGTTCT  GGTACAGAGAGTTCT  GGTACAAACCGTTCT
10.5         AGAACACAATGTTCT  AGAACACAATGTTCT  AGAACACAATGTTCT  AGAACACAATGTTCT
X.1          AGCACACCCGGAGCA  AGCACACCCAGAGCA  AGCACACCCAGAGCA  AGCACACCCAGAGCA
X.2          AGAACATTGGGTTCC  GGAACATTAGGTTCC  GGAACATTAGGTTCC  GGAACATTGGGTTCC

Figure 1.6: (A) Binding of GR at mouse orthologs of primary GR target genes from human A549 cells. ChIP experiments were performed to monitor GR binding in EtOH and dex treated C3H10T1/2 cells at genes shown, analyzed with qPCR, and normalized to a region near the mouse Hsp70 gene. The nomenclature ‘mGRE’ represents GRE sequences detected in the mouse genome. (B) Genes adjacent to GREs are regulated by GR in C3H10T1/2 cells. Reverse transcribed RNA samples (cDNA) from C3H10T1/2 cells treated with EtOH or 100nM Dex were subjected to qPCR and normalized to mouse Rpl19 transcripts. (C) Core GR binding sequences are highly conserved. GR binding sequences from human (h), mouse (m), rat (r), and dog (d) are shown. Red sequences represent bases that are identical to those of human. Note that GREs 6.1 and 16.1 each contain two GR binding sites. (D) Comparative sequence conservation across individual GREs. Sequence identities between human and mouse of GRE X.1 and X.2 were obtained using the same calculation as Figure 1.3C. The coordinates represent bp positions with ‘0’ defined as the center of core GR binding sites.


Supplemental Table 2: Distance of GREs in mouse genome relative to adjacent genes

GRE     Target Gene   GRE Distance to TSS (Human)   GRE Distance to TSS (Mouse)
6.4     SGK           -1380                         -1214
12.1    SCNN1A        -849                          -1356
16.1    MT2A          -1227                         -1559
19.2    VMD2L1        -2095                         -2449
5.1     ARHGAP26      -17325                        -19592
6.1     FKBP5         86852                         65757
6.2     C6orf81       -11245                        -15183
10.3    PRG1          -26427                        -22809
10.4    DDIT4         -18680                        -18181
10.5    DDIT4         -24939                        -22542
X.1     TSC22D3       34019                         32690
X.2     TSC22D3       57335                         55448

Table 1.3: Distances of GREs relative to adjacent gene are conserved in the mouse genome. The distances were calculated based on coordinates of the mouse aligned GREs (mGREs) and transcription start sites of the GR regulated mouse homolog genes obtained from UCSC Genome Browser. The TSS of the longest transcript was used for this calculation when a gene has multiple variants. Bold letters represent the distance of the GREs relative to the adjacent gene in mouse.
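One simple way to summarize how well GRE positions are maintained between human and mouse is to check that each GRE lies on the same side of the TSS in both genomes and to express the positional change as a fraction of the human distance. The sketch below applies this to the values of Table 1.3; the two measures and the names `same_side` and `relative_shift` are illustrative choices and were not part of the original analysis.

```python
# (GRE, target gene, human distance to TSS in bp, mouse distance to TSS in bp)
TABLE_1_3 = [
    ("6.4", "SGK", -1380, -1214),
    ("12.1", "SCNN1A", -849, -1356),
    ("16.1", "MT2A", -1227, -1559),
    ("19.2", "VMD2L1", -2095, -2449),
    ("5.1", "ARHGAP26", -17325, -19592),
    ("6.1", "FKBP5", 86852, 65757),
    ("6.2", "C6orf81", -11245, -15183),
    ("10.3", "PRG1", -26427, -22809),
    ("10.4", "DDIT4", -18680, -18181),
    ("10.5", "DDIT4", -24939, -22542),
    ("X.1", "TSC22D3", 34019, 32690),
    ("X.2", "TSC22D3", 57335, 55448),
]


def same_side(human_bp: int, mouse_bp: int) -> bool:
    """True if the GRE is upstream (or downstream) of the TSS in both species."""
    return (human_bp < 0) == (mouse_bp < 0)


def relative_shift(human_bp: int, mouse_bp: int) -> float:
    """Absolute change in distance, as a fraction of the human distance."""
    return abs(mouse_bp - human_bp) / abs(human_bp)


all_same_side = all(same_side(h, m) for _gre, _gene, h, m in TABLE_1_3)
shifts = {gre: relative_shift(h, m) for gre, _gene, h, m in TABLE_1_3}
largest = max(shifts, key=shifts.get)

print("same side in both species:", all_same_side)
for gre, shift in shifts.items():
    print(f"GRE {gre}: {shift:.0%} shift")
```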


We then examined the extent of conservation of the 15bp core GR binding sites within the GRE set defined above. As anticipated, the 12 core GR binding sites from the different human GREs (hGREs) differed substantially, with only five invariant positions across the 15bp sequences (Figure 1.6C); for example, the binding sites of hGRE 5.1 and hGRE 10.3 match at only 7 positions. In striking contrast, we found that the core GR binding site sequences within the individual GREs were highly conserved among human, mouse, dog, and rat (Figure 1.6C); for example, the core GR binding sequence at GRE 10.5 is identical in all four evolutionarily distant species.
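The comparisons in this paragraph can be reproduced directly from the sequences in Figure 1.6C. The following sketch counts matching positions between two core sites and lists the positions that are invariant across a set of sites; it uses all 14 human core sites shown in the figure (two of the 12 GREs carry two sites each). The function names are illustrative.

```python
# Human core GR binding sites from Figure 1.6C (15bp each)
HUMAN_SITES = {
    "6.4": "AGAACATTTTGTCCG",
    "12.1": "AGAACAGAATGTCCT",
    "16.1 #1": "AGGACAGCCTGTCCT",
    "16.1 #2": "AGAACAGGATGTTTA",
    "19.2": "GGAACACGGCGTCCC",
    "5.1": "GGTACATTCTGTTCA",
    "6.1 #1": "AGAACAGGGTGTTCT",
    "6.1 #2": "AGCACATCGAGTTCA",
    "6.2": "GGTACAGTTTGTTAC",
    "10.3": "GGAACAGAAAGTATT",
    "10.4": "GGTACAGACCGTTCT",
    "10.5": "AGAACACAATGTTCT",
    "X.1": "AGCACACCCGGAGCA",
    "X.2": "AGAACATTGGGTTCC",
}

# GRE 10.5 core site in the four species shown in Figure 1.6C
GRE_10_5 = {
    "human": "AGAACACAATGTTCT",
    "mouse": "AGAACACAATGTTCT",
    "rat": "AGAACACAATGTTCT",
    "dog": "AGAACACAATGTTCT",
}


def matching_positions(a: str, b: str) -> list[int]:
    """0-based positions at which two equal-length sites carry the same base."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x == y]


def invariant_positions(sites) -> list[int]:
    """0-based positions at which every site in the collection is identical."""
    return [i for i, column in enumerate(zip(*sites)) if len(set(column)) == 1]


pair = matching_positions(HUMAN_SITES["5.1"], HUMAN_SITES["10.3"])
invariant = invariant_positions(HUMAN_SITES.values())

print("hGRE 5.1 vs hGRE 10.3 matches:", len(pair))
print("invariant positions (1-based):", [i + 1 for i in invariant])
print("GRE 10.5 identical in all species:", len(set(GRE_10_5.values())) == 1)
```

With these inputs, the pairwise comparison yields the 7 matching positions cited above, and the invariant positions (2, 4, 5, 6, and 11 of the 15bp site) correspond to the five constrained bases marked in Figure 1.6C.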

Finally, we compared in human and mouse the patterns of conserved sequences flanking the core GR binding sites, which provide “architectural signatures” of individual GREs. We found that the patterns of sequence conservation differed dramatically among the different GREs (Figure 1.6D, Figure 1.7). For example, GRE X.1 contains conserved sequence elements at -900, -500, and +600bp, whereas GRE X.2 displays no conservation at those positions (Figure 1.6D). Although the functional significance of the conserved regions has yet to be tested (for example, we have not ruled out incidental overlaps with conserved noncoding expressed regions), they are likely to correspond to regulatory or structural motifs. As predicted by these findings, pair-wise calculations of sequence identity of different human GREs (using a 15bp window centered on the core GR binding sites) demonstrated that sequences flanking the core GR binding sites varied extensively among human GREs (Figure 1.8). Thus, the overall family of GREs is broadly divergent in sequence and organization, but each individual GRE retains a distinctive signature of conserved sequences, suggesting that each corresponds to a composite GRE that is functionally distinct.


Figure 1.7: Sequence conservation “signatures” are distinct for each GRE. Identity scores were determined for human-mouse aligned sequences and are plotted as in Figure 1.6D; for clarity, data are presented as pair-wise comparisons. Panels A to E represent comparisons of conservation of the specified GREs.


Figure 1.8: (A) Sequence comparison of human GRE 10.5 with human GRE 6.1. (B) Sequence comparison of human GRE 6.4 and X.2. The sequences were pair-wise aligned using ClustalW [50] and similarities were calculated as in Figure 1.3C using a 15bp window. Coordinate ‘0’ represents the center of the core GR binding sites. The red line represents the background level, which was calculated by taking the average of all identity scores.


Chapter 1 Discussion

We set out to examine the organization and function of genomic elements responsible for transcriptional regulation by GR. Our study yielded five conclusions: first, GR occupancy at a GRE is generally a limiting determinant of glucocorticoid response in A549 cells; second, the core GR binding sequences conform to a consensus that displays substantial GRE-to-GRE variation as anticipated, but the precise binding sequences at individual GREs are highly conserved through evolution; third, GREs appear to be evenly distributed upstream and downstream of their target genes; fourth, most GREs are positioned at locations remote from the transcription start sites of their target genes; and fifth, native GREs are commonly composite elements, composed of multiple factor binding sites, and they are individually conserved in position and architecture yet very different from each other. We shall consider the implications of these conclusions in turn.

We began by surveying more than a thousand genes, with half of them candidates for steroid regulation, and a specific subset known to be GR-regulated in A549 cells. We found that GR occupancy of A549 GREs correlated strongly (nearly 90%) with genes that are glucocorticoid responsive in A549, suggesting that GR binding is generally a limiting determinant for response in these cells. In a small number of cases, we observed GR occupancy close to genes that were GR-unresponsive in A549 cells, but were steroid regulated in other cells [4] (ECB, KRY unpublished results). This implies that GR occupancy at these genes likely reflects bona fide response element binding, but that GR binding is not a limiting factor for glucocorticoid regulation of this minority class of genes in A549 cells. Collectively, our data suggest that restriction of GR occupancy in A549 cells may be responsible for much of the cell-specific GR-mediated regulation in these cells. The mechanisms of occupancy restriction could be positive or negative, such as accessory factors that stabilize GR binding, or chromatin packaging that precludes it. Although the strong correlation between GR occupancy and glucocorticoid responsiveness in A549 cells seems likely to hold in other cell types, it is conceivable that responsiveness may be determined differently in other cell types. Thus, it will be interesting to examine cell-specific GR regulation in other cells to complement the observations made in A549 cells. It is intriguing that one component, GR, within such varied and complex machineries would so strongly predominate as a determinant of transcriptional regulation in A549 cells. It will be interesting to examine regulatory complexes that mediate other types of responses (e.g., heat shock, DNA damage) to assess whether response element occupancy by a single factor in each class is a dominant determinant of responsiveness.

We examined sequence conservation of a set of GREs that are occupied by GR both in human lung epithelial cells and in mouse mesenchymal stem cells. We found that the 15bp core GR binding sequences varied greatly among the different GREs (Figure 1.3B), whereas the sequences of the individual binding sites were nearly fully conserved across four mammalian species (Figure 1.6C). Crystallographic studies demonstrate that GR makes specific contacts with only 4 bases of the 15bp core binding sequence [51], yet every position, including the “spacer” between the hexameric half sites, appears to be equivalently conserved. This indicates that the binding sequences serve functions in addition to merely localizing GR to specific genomic loci, and may carry a regulatory ‘code’ that affects GR function. Leung et al. reported similarly strong evolutionary conservation of individual κB binding sequences [8]. Indeed, Luecke and Yamamoto showed that GR directs distinct regulatory effects when tethered to NFκB at two κB response elements that differ by only one base pair [21]. Thus, one interpretation of our findings is that factor binding sites may serve as allosteric effectors [32] in which individual binding sequences convey subtle conformational differences to specify distinct factor functions. Conceivably, this hypothesis might also explain why GR predominates as a limiting determinant of responsiveness, because factors that ‘read’ allosteric regulatory codes might specify the ‘rules’ for assembly of GRE-specific, and thus gene-specific, regulatory complexes.

To characterize the ‘architecture’ of GREs, we took several approaches. In unbiased computational analyses, we identified enriched sequence motifs within 500bp segments encompassing core GR binding sites. Sequence motifs resembling binding sites for GR, AP-1, ETS, SP1, C/EBP, and HNF4 were overrepresented relative to a background of unbound GR regions, consistent with the notion that native GREs are composite elements. For most of these GREs, the role of these factors in GR transcriptional regulation remains to be tested, but it is notable that ETS-1, SP1, and HNF4 have been shown at other genes to augment glucocorticoid responses [52-54]. Moreover, Phuc Le et al. described motifs resembling AP-1 and C/EBP binding sites within certain mouse GREs, and showed that nearly half of the GREs predicted to encompass a C/EBP binding site did indeed bind C/EBPβ [55]. These findings further support the view that our computational analysis can infer factors that potentially interact with GR at GREs. Using a similar approach, Carroll et al. [23] and Laganiere et al. [56] have interrogated estrogen response elements and identified FOXA1 as a factor playing an important role for both estrogen receptor binding and transcriptional activity. Thus, we anticipate that the factors that occupy the GR composite elements may interact physically, functionally or both, thereby affecting binding as well as regulatory activity.

Indeed, an averaged comparison of human and mouse sequences flanking core GR binding sites revealed that a region of approximately a kilobase was conserved above the background level (Figure 1.3C), suggesting that native composite GREs are extensive and typically may contain numerous factor binding sites. Interestingly, individual GREs displayed distinctive patterns of sequence conservation extending from the core GR binding sites (Figure 1.6D, Figure 1.7). These GRE signatures likely reflect conservation of various sequence motifs at different positions within each element, producing GRE-specific (and therefore gene-specific) architecture that likely creates distinct regulatory effects.

To investigate the distribution of regulatory elements relative to their target genes, we monitored GR occupancy across 100kb regions centered on the TSSs of glucocorticoid responsive genes. We found that GREs were evenly distributed upstream and downstream of their target genes with the majority located >10 kb from their target promoters; other metazoan regulatory factors, such as ER and STAT1, have similarly been reported to act from sites remote from their target genes [11,12,23,57,58]. In contrast to these factors, E2F1 was shown to mainly bind promoter proximal regions [57]; others have used computational approaches to infer factor binding sites close to promoters, but these have not been experimentally confirmed [59]. In parallel with our findings, Carroll et al. reported that only 4% of ER binding regions were mapped within -800bp to +200bp from the TSS of known genes from RefSeq [58]. Our data demonstrated that 9% of GR binding regions were positioned at this location. These studies together imply that steroid receptors, which include ER and GR, in general regulate transcription from remote locations. Interestingly, we found that the positions of individual GREs were generally conserved across species (Table 1.3), implying that GRE position may be functionally important for target gene regulation. In any case, our findings differ dramatically from those in bacteria and fungi, where transcriptional regulatory elements are promoter proximal. It has been suggested that these two broad classes of regulatory mechanisms, so-called long range and short range, are mechanistically and evolutionarily related, and that long range control might facilitate regulatory evolution [25]. As predicted by that model, distal elements, far from target genes as measured by linear DNA distance, may operate in close proximity with their target promoters in 3-dimensional space. For example, Carroll et al. detected an interaction between the NRIP-1 promoter and its distal estrogen response element [23]. It will be interesting to determine whether response element location, i.e., promoter proximal versus distal, is somehow related to mechanism or to physiological network.

Remote response element locations can complicate assignment of cognate target genes. An extreme example is olfactory receptor gene expression, which is governed by a regulatory element that can operate on target genes located on different chromosomes [60]. In this study, we assigned the GREs to the nearest RefSeq gene responsive to dex in A549 cells. In other contexts, these GREs may be nonfunctional, or may operate on genes other than those assigned in A549 cells (Table 1.1). Clearly, unequivocal assignment of a GRE to a given target gene will require genetic manipulations not readily accessible in mammalian cells at present. It is encouraging, however, that GR occupancy of GREs correlated strongly with glucocorticoid responsiveness of adjacent genes, supporting the view that these are bona fide direct GR targets (Table 1.1, Figure 1.2). In fact, when these genes were subjected to Gene Ontology (GO) analysis, we found that they were enriched in cell growth and immune responses (data not shown), two biological processes regulated by GR in A549 cells [5,61]. We found GR occupancy at genes up- and down-regulated in response to dex, consistent with GR serving either as activator or repressor in different contexts. At present, we cannot assess the significance of the finding that GR was detected at GREs adjacent to activated genes versus repressed genes at a 6:1 ratio in A549 cells; whether this difference reflects differences in GRE occupancy, epitope accessibility, crosslinking efficiency or other variables has not been determined.

Genomic response elements orchestrate transcriptional networks to mediate cellular processes for single and multi-cellular organisms. The present study advanced our understanding of the organization, evolution and function of GREs, and at the same time raised a series of interesting questions. Among the more intriguing: How is GR occupancy restricted to a small subset of potential GREs in a given cell context? What is driving the strong conservation of virtually every base pair within the core GR binding sequence at individual GREs? Addressing these and other questions raised in our study will contribute additional new insights about gene regulation by GR and by other regulatory factors.


Chapter 1 Materials and Methods

Cell Culture, plasmids, reporter analysis

A549 and C3H10T1/2 cells were grown in Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 5% or 10% fetal bovine serum (FBS), respectively, in a 5% carbon dioxide atmosphere. Before hormone treatment, the medium was replaced with DMEM containing charcoal-stripped FBS, which is depleted of endogenous steroids. Plasmid pGL4.10 E4TATA (generously provided by Yuriy Shostak) was created by insertion of the E4 TATA minimal promoter into the pGL4.10 vector (Promega). The 20 reporters tested (Figure 1.3A) represent randomly chosen GRE fragments. The QuikChange kit (Stratagene) was used for reporter mutagenesis. The 13 core GR binding sites that were mutated in the reporters (Figure 1.3A) were also randomly chosen based on success of mutagenesis. GBR-containing DNA fragments (500 bp) were amplified by PCR and subcloned into pGL4.10 E4TATA using KpnI and XhoI sites (see Table 1.4 for primer sequences). A549 cells were grown in a 48 well plate and cotransfected with 19ng of the reporter constructs, 10ng pRL Luc (Promega), and 38ng pCDNA3 hGR (human GR expression vector) using Lipofectamine 2000 (Invitrogen). After overnight transfection, cells were treated with hormone and harvested, and luciferase activity was measured as described for the dual luciferase reporter system (Promega) using a Tecan Ultra Evolution plate reader (Tecan).


Table 1.4: Primers used for cloning and mutating GRE reporters. Capitalized letters represent the restriction digestion sites used for cloning the constructs into pGL4.10 E4TATA.

GBR Reporters:
1.2:  gctgcaGGTACCacatggccttagatatcttttcctc  tgcagcCTCGAGctcggccactgtccaggaagaggca
1.4:  gctgcaGGTACCgtcacacgacgtgaactggggcggt  tgcagcCTCGAGcggttctgctgttcagtgaccagctt
1.5:  gctgcaGGTACCaatgccgagcacaaccaccccaagcc  gctgcaCTCGAGctcatcaccctgctgtccagagaac
2.3:  gctgcaGGTACCgtgagggtgtaggatgaggcccgct  tgcagcCTCGAGggacacccggtgttctgcaacccat
2.4:  gctgcaGGTACCggtcaaatgctgggtgcgtttcaag  tgcagcCTCGAGactggggtgtgaagaagcacagctt
2.5:  gctgcaGGTACCtgactattcagagagttcttagtgg  tgcagcCTCGAGtcattcctgcctctcgccacattgc
3.1:  gctgcaGGTACCaaccatatgcatcagagctggggct  tgcagcCTCGAGgtccccatctcacagacatcgtctc
4.3:  gctgcaGGTACCctcagagataccgagataaccgcctg  cgctctCTCGAGgttccaagatttcaaagagagaaagg
5.1:  gctgcaGGTACCttatttgagaagggaggatatcact  tgcagcCTCGAGttgtgtcagtgctcggttcctaagc
5.2:  gctgcaGGTACCgtttcctcgggcacacggccaacct  tgcagcCTCGAGctgaatttccacttttatttatgag
6.1:  gctgcaGGTACCttggtagagaaagaaataaacaagtt  gctgcaCTCGAGaaatagacacttaccagagctaatgtc
6.2:  gctgcaGGTACCgtttagaacctaaagacggggagtggg  tgcagcCTCGAGctgggctgctggggctgtctgcaccc
6.3:  gctgcaGGTACCtcatttcaatacacccaatatttattg  tgcagcCTCGAGgaacaaagcatttatatcgcacgtttacc
7.1:  gctgcaGGTACCtattttttcaaactttcccatcagtttg  tgcagcCTCGAGataaagcccctggcctcagtttt
7.2:  gctgcaGGTACCgccaccctgggtcccccaccagtct  tgcagcCTCGAGcagagaatccgcaggaaatccctgt
7.3:  gctgcaGGTACCtcctgtaacctctccaggagggaaaagc  tgcagcCTCGAGctgccccctgcctgggtccaccccta
9.1:  gctgcaGGTACCttacaaaacccctaggcacttaaga  tgcagcCTCGAGtccaacactaacctaactggttaat
9.2:  gctgcaGGTACCtttgtctccagccctgggcagacat  tgcagcCTCGAGcacccatgggggacctgtgcctcgtgcc
19.1:  gctgcaGGTACCcctaaaggtctcagaagctcaattt  tgcagcCTCGAGgactcaggttcaaggaagagcatga
20.3:  gctgcaGGTACCaactaggaaaggaaggggcggggca  tgcagcCTCGAGtcaatactcccattctaggcatttt

Mutant Reporters:
1.5 mutGR:  cccagacttctatgtcctgaccagctgccaatcaccttg  gacatagaagtctgggcaactcccagccctcccctgcc
2.4 mutGR:  ctagattgcattttttcctcaggtctctgaaatatcagg  cctgatatttcagagacctgaggaaaaaatgcaatctag
2.5 mutGR:  cacacataggttcagtactcggtgggcaccgtgccaagtg  cacttggcacggtgcccaccgagtactgaacctatgtgtg
3.1 mutGR:  cttcaaacgagagctgacgacggcgctttcctgcaactggcg  cgccagttgcaggaaagcgccgtcgtcagctctcgtttgaag
4.3 mutGR:  ttcccagtgtgtcaggcagcttagagagttctagcttaggtgtg  cacacctaagctagaactctctaagctgcctgacacactgggaa
5.1 mutGR:  cagagagcttgagcttaacgtgaagctaccggtgctgttggg  cccaacagcaccggtagcttcacgttaagctcaagctctctg
5.2 mutGR:  cagttgtcattcctcaggcatgcattctcgatcctcacacaaag  ctttgtgtgaggatcgagaatgcatgcctgaggaatgacaactg
6.2 mutGR:  cattccggtcccagcgtattaaactagaccagcctttttgag  ctcaaaaaggctggtctagtttaatacgctgggaccggaatg
7.2 mutGR #1:  cgtcctggatacagtagaggtttcccagatgtttac  gtaaacatctgggaaacctctactgtatccaggacg
7.2 mutGR #2:  ctggactgcttgcacgtccatacctgtgccattcttgg  ccaagaatggcacaggtatggacgtgcaagcagtccag
7.3 mutGR:  ccttgagatacaggtccgacgtgtcctgtctacgca  tgcgtagacaggacacgtcggacctgtatctcaagg
9.2 mutGR:  cgtgaactgggaaagcaccaaacgctctactcaaaagcc  ggcttttgagtagagcgtttggtgctttcccagttcacg
19.1 mutGR:  cactggtttcgatgtgccattacatagtcctaacaacc  catcgaaaccagtgactggcctctaccgcccccagc


Chromatin Immunoprecipitation and Array Analysis

ChIP assays were performed as described [21] with the following modifications. The chromatin samples were extracted once with phenol-chloroform and purified using a QIAquick column as recommended by the manufacturer (Qiagen). The ligation-mediated PCR (LMPCR) process was adapted from Oberley et al. [62]. Amplicon (3.5-20ng) was used for real-time qPCR analysis, and data were normalized to Hsp70 (see Table 1.5 for primer sequences). Human and mouse DNA sequences were retrieved from the UCSC Genome Browser (http://genome.ucsc.edu), NCBI Build 35, and qPCR primers were designed using Primer3 [63]. For the array, ~50kb upstream and downstream regions were tiled with isothermal 50mer oligos (spaced on average 54bp apart) relative to the transcription start sites of the investigated target genes. Where 100kb regions overlapped, the surrounding genomic region was tiled further bidirectionally. ChIP samples from vehicle- (final concentration, 0.01% ethanol) or dex-treated A549 cells were labeled with Cy3 or Cy5 and hybridized onto the arrays, and relative signal intensities were measured by NimbleGen Inc. SignalMap (NimbleGen Inc.) was used to find peak enrichments with both window threshold detection (500bp peak window size, 25% peak threshold) and second derivative peak detection (500bp peak window size, 20bp smooth step, 25% peak threshold).


Table 1.5: Primers used for qPCR analysis. FO primer and RE primer represent the forward and reverse primer, respectively, for the corresponding amplified genomic regions or cDNA sequences of the indicated genes.

Primer pairs for amplification of human ChIP Samples (GBR)
1.1  gtgttggaacaggccgtact  atttctgctagcctgccaaa
1.2  tggagccaggtggttatctc  actgctggagagagggtctg
1.3  gcacactgtcaagggtgaga  ctggtgtgaaggatgctgtg
1.4  ccggaatgtccattcagttt  tcctgggcttctgatgtctt
1.5  agtgaatgggatttggcagt  aaggggtagagcagagcaca
1.6  ggccttcctgaaagacacac  tgaagtgtaccgcgtgattg
2.1  tcctggtgcctaactttgct  cggactgccagatcatctct
2.2  caaagtcggaagggaatgtg  gcatttggtgtgaaggacac
2.3  acccagtgggaggagaagtt  ggggaacaggaaagaaaagc
2.4  gccacatatttgtgcgagaa  agttcaaaagcagggcaaga
2.5  cagtggtgggtcacttgatg  gcccagcacatactgaacct
3.1  ctcagcagggtctgaactcc  tgagggacccaagaaacaag
3.2  agaaatgcgttctgggtcac  agctctgcatgacagcattg
4.1  ggatgggtgacaaacggtag  gggcaggtgaacctgaacta
4.2  tagggcaagaactgcagcat  ccaggaaggagacaaaagctc
4.3  cctgtgattcccagaagcat  tgcaaaggtgagaatgaagg
4.4  agctgatgttcagcacagga  cctggtgtggtctcttgctt
4.5  tgtgatgactcaggtttgc  tgtgccttatggagtgctcc
5.1  tggcatggcaacaatagaga  tagaggcaggactggatgga
5.2  acctttggacaggcacctta  tgtgtgaggatcgaggaaca
6.1  taaccacatcaagcgagctg  gcatggtttaggggttcttg
6.2  ggagggaagagagcagaggt  tggttgcaaaaggaacatca
6.3  tgccgtgagaaaggtacaaa  tctgctgctccagctcataa
6.4  gtccgttccgcatgtaattt  ccacagaggaatcgaggatg
6.5  gccaatttcctccaagttca  gactctgcttgcatctgctg
7.1  cagctgctggtgagtaagca  aaagcactgcatgaacagga
7.2  ctcatctggactgcttgcac  cagagaatccgcaggaaatc
7.3  gaggctacctctctgcatgg  atgctgtgggtcatttcaca
8.1  ggaccccacttcacatcact  gggagtggaacactcagctc
8.2  tcaacatgggagattcagca  cacagtccactggagcacat
9.1  gctcctgagccacagaaatc  tgcatgtctcaagcctgttc
9.2  ctgggaaagcaccaaatgtt  tggagcggtcagttgaaaat
9.3  gaacaggctccagtgtgtga  agtgactgcacaagctgcac
10.1  actgtgtgaggtcagcaacg  tctctctcttggccgtgaat
10.2  ttccagagacagtggggaag  tgttaggtgggagggaattg
10.3  tgatgacaaggcacagcag  cctctcctaaccccatagacg
10.4  gtacccgaaggatggtgcta  ctaggcaagggacaacaagc
10.5  cagcaatgcaggttacagga  tacccacaggaaaccctgaa
11.1  cacatatgaggcagggcaat  gtctggacggtggtctgact
11.2  accccaaatgaatggtgaaa  agggtacgcggagtacagag
12.1  aggccaggaatgtgtaatcg  caccttcagtgcctgctttc
12.2  ctggttgcacaatggttgac  tagcatcattcatgccaagc
12.3  aactttgaactgcgggattg  gggatttccagatcgagtga
12.4  caacgaaatgacctggcttt  ggccccttcgtatattccat
12.5  cacagcctctttcacaacca  ggaaccagtgaggaatggaa
13.1  cctgcacggtacaaggaaat  agaccaaacacaccccagac
13.2  aggtaggtttcgtggcacag  tggccagattctgtgtgaat
14.1  ccatggtcagtgccttttct  gccaggaacactcagctcat
14.2  cttgggtcagaagagggtca  cagctcctctgcagtgaaca
15.1  gagctgtggcccagtttaag  gaacaaagccagctgcaaat
15.2  ggtactttgcctgggaacaa  agtcatgggaaaggcatgag
15.3  gttagcgagtccgaaagcac  ttcgttggcctttgttatcc
15.4  ctccttgtgtggcctgaact  actcggcattttccatcatc
15.5  cctgcatcctgatagcacct  gaataccgcccatgtacacc
16.1  gacgattcggctgagctaga  agggccttagatcgtcaacc
16.2  cctggagtcctgctcatctt  tgcaccaagaacgctttatg
16.3  ctttggaacattgcctggtt  gccgcttaacttctctgtgc
17.1  tccacagggtgctagagctt  tgcattccagttcagacagc
17.2  ctcctggacctctcgctgt  gcagcttgggtcattgct
17.3  ccttctgggagcacagtctt  caacgtcagcctggtaacct
17.4  ggcagcaaacaaacaactca  cttcaggccggtgttaaaat
17.5  tttctgtgcttccccagtct  aaccacggctgtcagaaaac
17.6  ggttcagaaaccctcacagc  ctgaaccacagtgcgctaga
19.1  tctgccctgcaatgtacaag  ccaagagcaggacctcaaac
19.2  agggggagtgggtactgttt  attgccatgaacacacaagc
20.1  tcatcgcttcctgttcaatg  ggtgtatcaccaccggttt
20.2  gcagcattctcctctgaacc  ctgtgtgggtgtggactttg
20.3  ggctcaatgtgaccaaaaca  ttaggacacagccctcatcc
20.4  ggtaccactggccttaacga  agcatctgtgggtgtgacag
20.5  tgttgttgcctgacaagtcc  gggtctgccaacagtctcat
22.1  tagcagggtagctggttgct  gaagtcccaggaagctctga
X.1  gtgcctggagaccaactcat  acccttgatgctgagcaagt
X.2  ggctgtggtccaaagaaaga  tccaccaatagagcaccaca
Hsp70  ggatccagtgttccgtttcc  gtcaaacacggtgttctgcg
-control #1  cggagcagtgagtgaagatg  ggttatgccagcctttcttg
-control #2  accagggagtatgaggaaagg  ttaacactgcacaggcattagg
-control #3  gtgagacagccagtgacgac  aacaacaacaaacccctgaac

Primer pairs for amplification of mouse ChIP Samples (mGBR)
5.1  gagctggaacagagcgtacc  cccagccagtttgtttgttt
6.1  gttcagctgtgcaatccaga  agggtgttctgtgctcttcaa
6.2  tcctaagaggtgggaacacg  accaaagggacagggtcttt
6.4  ggcagtgcaatgcaatctt  gtgaggaggtggcgagttag
10.3  acccaccagggctgtagac  tgcaaattccattccattca
10.4  tcgtcatgtccttgggtaca  ggctacaccaccaagggtaa
10.5  tcgtcatgtccttgggtaca  ggctacaccaccaagggtaa
12.1  tgttctcttccagggtctgc  gtgggcaccagaggtaacat
16.1  gggaaaggacacggtgttta  gctgtcctcgcagctcttag
19.2  tacctgctgagcaacagacg  agcccaagctggttcctaat
X.1  gcaaagagctcccatagtgc  gctcacaaccccagctagtc
X.2  ggccaggatgtcagtaccag  ccataaatcagtgctccacaag
Hsp70  tgtgtgttgggagtgagagg  gctgaagactggagggtctg

Primer pairs for amplification of human cDNA
VMD2L1  ggcaaatgtggaatgctctt  gggcacacacaggtctaggt
SSTR4  caccagcgtcttctgtctca  atggggagagtgaccaacag
EPSTl1  actgacctcgaaaagcctca  ggagaagccagtcactcctg
MT2A  gcaaatgcaaagagtgcaaa  atccaggtttgtggaagtcg
VIPR1  attcaaggccgcaatgtaag  aaactcgctgccttgtcatc
ANXA8  gctcattgtggcccttatgt  aggccaggatctcaatgatg
ACSL1  ttgggaaggattctggtctg  gtcagaaggccattgtcgat
KRT6A  tgctgcctacatgaacaagg  tgtctgagatgtgggtctgc
PAMCI  tcatttgagtggcatccaga  tcctttaactggcacccatc
SEC14L2  gcctccagaggtgatccaac  aagcccctcgcagtcataaa
DDIT4  cctggacagcagcaacagt  tcactgagcagctcgaagtc
ZA2OD2  cagctagtggttccaacagtcc  tttgctgagttacaggcaagg
SHC1  gccgagtatgtcgcctatgt  gggtgggttcctgaggtatt
KIAA0232  tgagttgttttcggcagatg  ctctttctgtccctcccaca
NFkBIA  cacctccactccatcctgaag  atcagcacccaaggacacc
CDH1  tggagagacactgccaactg  ttagggctgtgtacgtgctg
TMEM37  ttgtgcgaggacaaacactc  acattagggtgaagccgatg
FLJ25390  ggctacatgctggatgacaa  cccagatggtggagatcagt
HERC3  cctggtatcagcaccaacct  tccagcaggaacacagagtg
PAG1  ctcaccaatggggacattct  cattttggtttccctgtgct
OR7E14P  catgctcctgagtgtgatgg  aattgtgcagctgggagtct
SGK  ctatgcatgcaaacaccctg  gccaaggttgatttgctgag
IRF8  agctgtatgtccggcaactg  aacatccggaagacctggtc
PDGFA  cccgcagtcagatccacag  tgagctctcaggctggtgtc
NET1  gacctgcaggatggagatgt  cttgtggaacacgtcattgg
MSX2  agcttcagtctccctttccc  ggtggtacatgccatatccc
VGCNL1  atgtgaaagatcgctggtgtg  ataatcagtggccgtggaatc
ZNF30  cctttgttaggcatgggaga  acactcctcacacccaaagg
CYP4B1  ggctttctcaagctcatccac  gcccaggacaccactttg
NSDHL  tgctggtgatggtgatcagt  cacggtcctctccatagcat
EFNA1  tacctggtggagcatgagga  aacctcaagcagcggtcttc
CDKN1B  aactctgaggacacgcatttg  tagaagaatcgtcggttgcag
MYC  gtcctcggattctctgctctc  cttgttcctcctcagagtcg
CYLD  tggaaagtgattacgcaggtc  ccaatagggttatccatgtcc
DPP4  aagtggcgtgttcaagtgtg  tcttctggagttgggagacc
LADININ1  agataccacacggccatacg  tgagccttgatgtcacaacc
MLYCD  aaactggatggacatgaagc  acgattgcctggatgttg
SERTAD1  gcggaaacgggaggaggaggag  ggtcaaagagggagctagaggc
CCDC40  tcaggcagacttcgacacac  agcactagggactgcttgga
IRF2BP2  ccaaccaggttcactccact  cactgcacaaaatgggtgtc
KIAA1434  cctggaggaagatgacgatg  tgcaagccataaccacactc
NOSTRIN  cagctcagcagcagactttg  acataagcggcaggaaaatg
CXCL2  gcagggaattcacctcaaga  gacaagctttctgcccattc
IL8  accggaaggaacca  atcaggaaggctgccaagag
ANKRD47  acctctgtgaggcaggaaga  agtagggttgccaccatgtc
SEC22L3  tgctcctaatttccgaatgg  aaatgcaggctacgaaagga
MTP18  tctagcctctgtggccattc  ttccccactgttgggtagag
SOX13  tcctgggactgtagctctcc  gtccagcttctcctggcttc
MCEE  tcaagctcccattccaacag  ctgctatggctacatggttgag
PGS1  tcaccaggagcaagagcag  tcttcacccacagcttcacc
PDE4C  atcatggccgagttcttcc  cactgaggccgtatgcttg
NAT6  ctacacaccggatggagctg  tcatctctggttggcatgtg
PF4  cagtgcctgtgtgtgaagacc  ttcagcgtggctatcagttg
RNF8  agcatcttcagggtttggag  ccttcttgctgcgatttagc
AP3S1  gatcaaggcgatcctaatcttc  ctccctgatgatttgctgttg
MTHFD2L  cactaccagaccacgttgatg  ggcaggtatgagagaatgctg
TJAP1  tgcccgagtgttagagaagc  ccacatgcacaaagacatcc
RPL19  atcgatcgccacatgtatca  gcgtgcttccttggtcttag
SLC9A2  tgatgataaatatctgcggaagc  ttcacgacaatcatttagagatgc
ASB9  cacctgggcactccactcta  accagctccacaggacgttt
HPS4  ggactccaggatggttcagc  cctggcagagttgtgcagtc
F3  ccgacgagattgtgaaggat  ccgaggtttgtctccaggta
PDE4DIP  tccgggatgttggtatgaat  ttattggcaaaggagccatc
DCBLD2  ccttctccatgcctctgttc  tgtaagggttccactctcagg
DST  ttctgctgcgtcacctacc  ctcctgtggctggaataacc
SFMBT2  aatcaacgcagcctacaagc  tgtccgtacgattttgacca
CREB3L2  gtgtcaatggaggtggaacc  ggtggtaatgtgggtgaagg
THOC3  gcaacaaggatgatgtggtg  caaccattgccatttgtcag
ASB13  catgagactgcccttcacca  ccagtggccttcctcaagtt
MMP7  tcacttcgatgaggatgaacg  ggatctccatttccataggttg
LAMB1  ggcaatccatcagaagttgg  cacctcccagtctccttgtc
TMEPAI  cagccatctggagcaaagag  ctaagaagcgcggagtgttc
ANK3  tctgctggaatatggtgctg  ttcgcatttctaccgaggag
SCAP  aggagcttggaggagttgtg  aagcgttcccagtcattctg
ARHGAP26  gtggtgtttggacccactct  tatcgggcacggtgttaaat
CDC42SE1  actgggctgctgtgtggtag  ctcctgaactgcacctgtca

Primer pairs for amplification of mouse cDNA
Ddit4  gtgcccacctttcagttgac  tgtaaccagggaccaaggaa
Gilz  tggggcctagtaacaccaag  gagcacactggcatcacatc
Rpl19  agcctgtgactgtccattcc  ggcagtacccttcctcttcc
Sgk  cgtccgaacgggacaacat  gtccaccgtccggtcatac
Mt2  gcctgcaaatgcaaacaat  acttgtcggaagcctctttg
Fkbp5  aggccgtgattcagtacagg  gaacgactctgaggctttgg

Primers for DNAse I treated samples
-control #1  ttcaatgccagctttcaggt  gacattcttgcggctagctt
-control #2  tccgcgttttgacattatga  ctcctgcctggatttgtttg
1.5  ggatttggcagtggagaaat  ggacatagtgttctgggcaac
all other regions: same as those used for ChIP analysis


RNA Isolation, Reverse Transcription, and Real-time Quantitative PCR (qPCR)

The RNA isolation, reverse transcription, and qPCR steps were performed as previously described [4]. Primers for cDNA amplification are displayed in Table 1.5.

DNAse I Accessibility Assay

The experiments were adapted from a previously described protocol [64] with the following modifications. Briefly, nuclei from A549 cells treated with vehicle or dex for 1 hr were treated with 6.25-200 units/mL of DNAse I (Qiagen) for 5 min at room temperature. The reaction was stopped, and the samples were treated with Proteinase K for 1 hr at 65°C. The DNA samples were extracted once with 1:1 phenol-chloroform and further purified using MiniPrep columns (Qiagen). The samples were subjected to qPCR analysis to determine the relative amount of cleaved product (see Table 1.5 for primer sequences), which was converted to percent DNAse I cleavage.
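The conversion from qPCR signal to percent cleavage is not spelled out in the text. One common way to do it is sketched below, under two assumptions that are not taken from the thesis: the amplicon is quantified in a DNAse I-treated and an untreated sample, and amplification efficiency is a perfect doubling per cycle. The function name and its parameters are hypothetical.

```python
def percent_cleavage(ct_untreated: float, ct_treated: float,
                     efficiency: float = 2.0) -> float:
    """Estimate percent of template cleaved by DNAse I from qPCR Ct values.

    Assumption (not from the source): intact template remaining after
    digestion is efficiency ** (ct_untreated - ct_treated) of the input.
    """
    # Fraction of intact amplicon left relative to the undigested sample.
    remaining = efficiency ** (ct_untreated - ct_treated)
    # Measurement noise can push the ratio slightly above 1; cap it.
    remaining = min(remaining, 1.0)
    return 100.0 * (1.0 - remaining)


# A one-cycle delay means half of the template was lost.
example = percent_cleavage(ct_untreated=20.0, ct_treated=21.0)
```

Under these assumptions, each additional cycle of delay in the treated sample corresponds to a further halving of intact template.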

Computational analysis

For computational analysis of enriched motifs, all repeat-masked DNA sequences were downloaded from the UCSC genome browser (NCBI Human Build 35). BioProspector analysis was initially performed on GREs using nucleotide widths (w) of 14 and 16 to identify GR binding sites, and the top motifs were masked to identify other motifs [47]. For MobyDick analysis, both the human and the human/mouse aligned sequences were used as inputs to identify enriched motifs [45,46]. Similar motifs were clustered using CAST [65-67]. All p-values for enrichment were Bonferroni corrected to identify putative factor binding sites [67]. The top BioProspector w14 position weight matrix (PWM) was used to score GREs for putative GR binding sites with a false positive rate of less than 10%. This upper bound was calculated by randomly sampling regions not bound by GR (Figure 1.9).
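The PWM scoring and the false positive estimate can be illustrated with a minimal sketch. It is not the BioProspector implementation: the log-odds form, the pseudocount, the uniform background, and the forward-strand-only scan are all simplifying assumptions, and the function names are hypothetical. Sequences are assumed to contain only A, C, G, and T.

```python
import math

BASES = "ACGT"


def log_odds_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per column) from aligned, equal-length sites."""
    width = len(sites[0])
    pwm = []
    for i in range(width):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i].upper()] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm


def best_score(seq, pwm):
    """Highest PWM score over all windows of seq (forward strand only)."""
    seq = seq.upper()
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def false_positive_rate(unbound_seqs, pwm, cutoff):
    """Fraction of unbound sequences whose best window reaches the cutoff."""
    hits = sum(best_score(s, pwm) >= cutoff for s in unbound_seqs)
    return hits / len(unbound_seqs)
```

Sweeping `cutoff` over a range of values and plotting both the fraction of bound sequences that pass and `false_positive_rate` would give curves of the kind described for Figure 1.9.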

Figure 1.9: Number of GREs with putative GR binding sites. All 73 GRE sequences were scored for putative GR binding sites using a predicted PWM representative of the GR binding site. Percent of sequences predicted to contain a GR binding site with varying score cutoffs is plotted as red squares. The false positive rate (blue triangles) was calculated by randomly sampling unbound sequences at varying score cutoffs.

We built position weight matrices (PWMs) of those motifs with p-values less than 0.05, which were used to measure similarity to known binding sites in TRANSFAC [68]. We measured the distance between the PWMs and those representing binding sites for known regulatory factors using relative entropy (Kullback-Leibler divergence) with a cutoff of less than 6.0 to associate motifs with putative regulatory factor binding sites. The known binding site matrices were obtained from TRANSFAC Professional v.9.3.
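As an illustration of the relative entropy comparison, the sketch below sums the Kullback-Leibler divergence over matrix columns. The text does not say how matrices of different widths were aligned, whether the measure was symmetrized, or which logarithm base was used, so the equal-width requirement and base-2 logarithm here are assumptions, and the function names are hypothetical.

```python
import math

MATCH_CUTOFF = 6.0  # cutoff quoted in the text


def relative_entropy(pwm_p, pwm_q, eps=1e-9):
    """Sum over columns of KL(p || q), in bits.

    Each PWM is a list of columns; each column maps A/C/G/T to a probability.
    """
    if len(pwm_p) != len(pwm_q):
        raise ValueError("this sketch only compares matrices of equal width")
    total = 0.0
    for col_p, col_q in zip(pwm_p, pwm_q):
        for base in "ACGT":
            p = col_p[base]
            q = max(col_q[base], eps)  # guard against division by zero
            if p > 0:
                total += p * math.log2(p / q)
    return total


def resembles_known_site(pwm_p, pwm_q):
    """Apply the cutoff: a smaller divergence means more similar matrices."""
    return relative_entropy(pwm_p, pwm_q) < MATCH_CUTOFF
```

Identical matrices give a divergence of zero, and the value grows as the discovered motif departs from the known site.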

The human-mouse conservation score was calculated as described [9] using a 50-mer window for 50 sequences containing a putative GR binding site based on our computational and experimental analysis (Figure 1.9 and Figure 1.3A). The conservation score was calculated as the number of bp matches minus the number of bp deletions or insertions, divided by the bp window size. We centered each alignment based on the highest scoring putative GR binding site in human and expanded equally on each side of the binding site to a total length of 4kb. The background level was calculated by taking the average of all conservation scores across the 4kb region. The human(hg)/mouse(mm) genome alignments were downloaded from Vista (http://pipeline.lbl.gov/cgi-bin/gateway2).
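The score defined above can be written out as a short sketch. It assumes the alignment is supplied as two equal-length strings with `-` marking gaps, that the window is counted in alignment columns, and that a column with a gap in either species counts as one insertion or deletion; these representation choices and the function names are illustrative, not taken from reference [9].

```python
def conservation_score(human_aln: str, mouse_aln: str) -> float:
    """(matches - indels) / window size for one aligned window."""
    matches = 0
    indels = 0
    for h, m in zip(human_aln.upper(), mouse_aln.upper()):
        if h == "-" or m == "-":
            indels += 1      # gap in either species
        elif h == m:
            matches += 1     # identical base
        # mismatches contribute nothing to either count
    return (matches - indels) / len(human_aln)


def sliding_scores(human_aln: str, mouse_aln: str, window: int = 50):
    """Score every window of the given size along the alignment."""
    return [
        conservation_score(human_aln[i:i + window], mouse_aln[i:i + window])
        for i in range(len(human_aln) - window + 1)
    ]


def background_level(scores):
    """Average of all window scores across the region."""
    return sum(scores) / len(scores)
```

Applied to a 4kb alignment centered on the binding site, `background_level(sliding_scores(...))` would give the baseline against which the peak at the site is compared.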


Chapter 1 Author Contributions

AYS, ECB, KRY, CC, and HL conceived and designed the experiments. AYS, ECB, and CC performed the experiments. This manuscript was written by AYS, KRY, and CC with guidance from ECB and HL.


Chapter 2: Conservation Analysis Predicts In Vivo Occupancy of Glucocorticoid Receptor Binding Sequences


Chapter 2 Introduction

Genomic response elements coordinate gene transcription networks through the binding of regulatory factors that serve as sensors of physiological and environmental cues. Typically, these elements are composites, comprised of binding sites for multiple distinct factors [58,69,70] clustered within conserved genomic segments (some extending to 2kb), whose effects are integrated to elicit specific transcriptional regulatory outcomes.

In response to corticosteroid signaling, the glucocorticoid receptor (GR) occupies primary glucocorticoid response elements (GREs) and alters expression of genes that mediate a range of essential biological processes, such as development and immune responses [1,71]. Within a given cell type, binding of GR to GREs can either up- or down-regulate transcription in a gene-specific manner. For example, in human A549 alveolar epithelial cells, GR occupancy at the ETNK2 GRE triggers transcriptional induction [70] whereas binding of the receptor at the IL8 GRE drives repression [21]. At a major class of GREs, GR binds directly to 15bp GR binding sequences (GBSs) composed of imperfect palindromic hexamers separated by three base pairs [70]. At a distinct class of GREs, GR tethers indirectly to DNA through protein-protein interactions [32]. The regulatory significance of the different classes of GREs is not well understood.

Previously, we found that GR-occupied GREs are evenly distributed upstream and downstream from the transcription start site of their target genes, with the majority >10kb from the corresponding promoter [70]. Strikingly, although the 15bp GR-occupied GBSs are highly variable among different GREs, the precise sequence of the GBS within an individual GRE is highly conserved across multiple species. Moreover, individual GREs at different genes appear to retain specific ‘architectural signatures,’ including distinct GBSs as well as binding motifs for other factors.

Studies with GR and its close relative AR (androgen receptor) established that receptor occupancy at their corresponding response elements is commonly a determinant of transcriptional responsiveness [69,70] and that the core binding sites at these genomic elements are evolutionarily conserved [70]. Thus, we wished to examine whether the conservation, per se, of the receptor binding sequences at their target genes was sufficient to predict receptor occupancy in vivo. In the present study, we identified glucocorticoid responsive targets, computationally inferred 15bp GBS motifs, and tested whether species conservation of these sites alone was sufficient to predict GR-occupied genomic GREs.


Chapter 2 Results

Identification of GR Target Genes and GBS motifs

We began our study by performing expression microarray profiling to identify genes responsive to glucocorticoids in mouse C3H10T1/2 mesenchymal stem-like cells.

Single stranded cDNA, synthesized from samples obtained from cells treated for 90 min with ethanol vehicle or 1µM dexamethasone (dex), a synthetic glucocorticoid, was labeled and hybridized with the arrayed oligonucleotide probes. Approximately 100 genes were found to be responsive to dex; 69 up-regulated and 17 down-regulated genes were validated in C3H10T1/2 cells by quantitative PCR (>1.5 fold difference compared to ethanol treated samples) (Table 2.1).

To identify GBS motifs at genes responsive to dex in C3H10T1/2 cells, we computationally scanned 32kb upstream and 32kb downstream from the transcription start sites (TSSs) of these genes. We generated a GBS position weight matrix using 79 GBSs previously identified through chromatin immunoprecipitation-microarray (ChIP-chip) tiling experiments [70]. Successive 15bp windows progressing across the 64kb surrounding each target gene TSS were scored against this matrix, and the sites falling within the ninetieth percentile were defined as GBSs. Importantly, the lowest scoring site included in this set had a higher score than previously verified GBSs (data not shown) [3,70], indicating that we likely included only sites that are recognized by GR in vitro. Scanning 69 dex-induced genes across 64kb windows, we identified 325 GBSs, an average of 4.7 sites per gene (Table 2.2). This motif was detected at a similar or perhaps slightly lower frequency (3.5 sites per gene) at genes repressed by dex (Table 2.2). Importantly, the GBS motif also occurred at a similar frequency near randomly chosen genes that were unresponsive to dex in C3H10T1/2 cells, indicating that sequences potentially recognizable by GR are widely distributed in the genome.
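The per-gene averages quoted above follow directly from the site and gene counts. The short sketch below reproduces them and also derives a spacing in kb per site, on the assumption (not stated in the text) that the "Frequency of GBS occurrence" row of Table 2.2 is the 64kb scan window divided by the average number of sites per gene. The function name is hypothetical.

```python
WINDOW_KB = 64  # 32kb upstream plus 32kb downstream of each TSS


def gbs_summary(total_sites: int, n_genes: int, window_kb: float = WINDOW_KB):
    """Return (average sites per gene, approximate kb per site)."""
    per_gene = total_sites / n_genes
    kb_per_site = window_kb / per_gene  # assumed meaning of "frequency"
    return round(per_gene, 1), round(kb_per_site)


induced = gbs_summary(325, 69)    # 325 GBSs at 69 dex-induced genes
repressed = gbs_summary(59, 17)   # 59 GBSs at 17 dex-repressed genes
```

With the counts from Table 2.2, the induced set comes out at 4.7 sites per gene and the repressed set at 3.5, matching the values in the text.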

Table 2.1: Glucocorticoid responsive genes identified in C3H10T1/2 cells. The genes found to be responsive to dex (1µM) in the microarray experiments are listed. Note: only the genes that were validated by qPCR as being dex responsive (at least 1.5 fold change compared to ethanol treated samples, averaged over at least two independent experiments) after 1.5-2hrs of treatment are shown. The relative fold dex-stimulated induction or repression is displayed. Of the genes found to be unresponsive to dex in the microarray experiments, 40 were randomly chosen for this study and listed in the table. Dex responsiveness of nine of these genes was examined by qPCR, and all nine were found to be unresponsive to dex after 1.5-2hrs of treatment (less than 1.5 fold change compared to ethanol treated samples, averaged over at least two independent experiments). The corresponding primers used for qPCR analysis of the respective genes are shown.

Dex-Unresponsive Genes
Accession number  Gene  Fold change (Dex/EtOH)  Forward Primer  Reverse Primer
NM_028408  Cnih3  1.1  gctggccttctacctcctct  catctggtgctcctctagcc
NM_029949  Snapc3  0.8  gcacacaccacagtgaatcc  acgttccatcccgactacag
NM_172760  Elmo3  0.8  cgagagttccgcaagttagg  ttggtgcatgtctggagaag
NM_013663  Sfrs3  0.8  agatttgcaaaggggttcct  tgaaaggacactggcatctg
NM_009004  Kif20a  0.7  cctgaagcccttattgtcca  gtggacagctcctcctcttg
NM_010158  Khdrbs3  1.4  attatgggcatggactcagc  gtctctgtagacgcccttcg
NM_023743  Eif4enif1  0.8  ccaaatggtttggctcagat  tgtcaccacaggtccaggta
NM_133987  Slc6a8  0.9  gttggaggaatccccatttt  aatcaccatggaggcatagc
NM_172665  Pdk1  0.8  ggtccagtggataagcgaaa  gcttctggtcggagttcttg

Dex-Repressed Genes
Accession number  Gene  Fold Repression (Dex/EtOH)  Forward Primer  Reverse Primer
NM_031874  Rab3d  -1.5  atcaccacggcctactatcg  cttgttccccacgaggatta
NM_007678  Cebpa  -1.7  tggacaagaacagcaacgag  tcactggtcaactccagcac
NM_178357  Tieg2  -1.7  atgaccagtgtgatccgtca  caaacgctcctgtccttctc
NM_008390  Irf1  -3.5  tcttgccctcctgagtgagt  gggactatgctttgccatgt
NM_011824  Grem1  -1.5  gacaaggctcagcacaatga  actcaagcacctcctctcca
NM_178382  Flrt3  -2.7  accctcaatcgagagcaaga  acagtgacccgttcctatgc
NM_020581  Fiaf  -2.5  aagatgcacagcatcacagg  atggatgggaaattggagc
NM_009621  Adamts1  -2.5  gtgtccagcccccgttatgt  cgagaacagggttagaaggtaatgc
NM_009062  Rgs4  -2.2  gctgggttggtctagagctg  tctgccctcacctaagcagt
NM_176933  Dusp4  -2.8  gcatgtgtgtgcaggagtct  accctgcgtctgatggtaac
AK020467  9430052C07Rik  -4.6  ggagcaagactgtacttttcctggacaaagcagcacgatttca
NM_011333  Ccl2  -2.4  agcaccagccaactctcact  cgttaactgcatctggctga
AK074068  A930021H16Rik  -2.0  agaggaatgaaggtggctca  ggtgaacatgtttgccacag
NM_008764  Tnfrsf11b  -1.6  gttcctgcacagcttcacaa  aaacagcccagtgaccattc
NM_013654  Ccl7  -2.4  aagtgggtcgaggaggctat  agaaagaacagcggtgagga
NM_029682  1700095N21Rik  -2.0  tctgctgctggctgactcta  tttggcacaaccacatgagt
NM_010501  Ifit3  -1.7  gtggtggattcttggcagtt  gacacacttccggttgtcct


Dex-Induced Genes
Accession number  Gene  Fold Induction (Dex/EtOH)  Forward Primer  Reverse Primer
NM_011361.1  Sgk  4.9  cgtccgaacgggacaacat  gtccaccgtccggtcatac
NM_008630  Mt2  5.8  gcctgcaaatgcaaacaat  acttgtcggaagcctctttg
NM_013602  Mt1  6.6  ctccgtagctccagcttcac  aggagcagcagctcttcttg
NM_009516  Wee1  2.0  agtcgtatgtgctgctggtg  tctgtcacctcctgggaaag
NM_010559  IL6ra  3.1  gatggtttacgcgagtgaca  ttcgcctgaagtcctgagat
NM_007679  Cebpd  7.6  ttcagcgcctacattgactc  tgtggttgctgttgaagagg
NM_019740  Foxo3a  2.3  ttcccatataccgccaagag  tggatagtctgcatgggtga
NM_023324  Peli1  2.5  ttggtccctatgtccctctg  tgggatctgggaccagtaag
NM_199299  Phf15  2.9  actggaagttgaagcggaga  aaaagcttcaggcgtcgata
NM_001024955  Pik3r1  2.0  gagcaaagccaaggaaactg  gtgctggtggatccatttct
NM_010286  Tsc22d3  6.4  tggggcctagtaacaccaag  gagcacactggcatcacatc
NM_020507  Tob2  3.1  ttctcagcctagcaccacct  aaggcaacacccaacttgtc
NM_013642  Dusp1  11.9  cagctgctgcagtttgagtc  gggatggaaacagggaagtt
NM_010638  Bteb1  2.4  gcagtgagctccacatttca  cgctagtgatggctgtcgta
NM_010220  Fkbp5  10.6  aggccgtgattcagtacagg  gaacgactctgaggctttgg
NM_008587  Mertk  4.6  caaatgtatgcgcgtattgg  tgcaaacctgacttgacagc
NM_207205  Igsf3  1.8  tcccagtgttgagctgtctg  tgatgtgggagctctctgtg
NM_010516  Cyr61  6.2  acgaggactgcagcaaaact  gggtctgccttctgactgag
NM_022305  B4galt1  1.5  ccaaatcacagtggacatcg  ccagtcaactggcagagaca
NM_009743  Bcl2l1  1.7  ttcgggatggagtaaactgg  tgcaatccgactcaccaata
NM_130895  Adarb1  3.0  gataccgtgcagttccacct  gtagctgtccccttgctttg
NM_175638  Prkwnk4  2.3  tgccccatctttcctatcac  acctgagaagcactggagga
NM_198092  Usp2  6.3  gctgtttgaaggacagcaca  actctgggaaagggacaggt
NM_009026  Rasd1  2.7  tggtcatttgcggtaacaaa  ggtccaagctgctgttcttc
NM_011521  Sdc4  1.5  ctgatcctgctgctggtgta  ggaggaagcttcatgcgtag
NM_001033453  Gm1024  5.3  agcagttggtgggtacaagg  ctacagcatggcgaatgaga
NM_011567  Tead4  3.9  caacctggaacatcccacgat  gaaagccgagaactccaacat
NM_009228  Snta1  2.9  agctggagtcttctgggtca  gatgaaggaggggagagagg
NM_009061  Rgs2  13.6  acgaaaaccccaagtttcct  cctgcatttagtgcaagcaa
NM_025367  Sphk1  2.9  tggggctatgacttggaaag  ccagggaaggtccctaagag
NM_173733  Suox  2.7  gaggaacagtgtcccaggaa  attggggctacagtgtctgg
NM_008638  Mthfd2  4.7  aggtcccaagcctttgagtt  gtaagggagtgccgttgaaa
NM_010070  Dok1  2.4  ttttctgccttggagatgct  tctcagcttccaccctcagt
NM_011817  Gadd45g  8.2  tgccttggagaagctcagtt  tcaccaagtcgatcagacca
NM_013867  Bcar3  2.2  gatgccatgggagacctcta  atggctgtctgcgtgtagtg
NM_008131  GluL  7.9  tagcaacctttgaccccaag  actggtgcctcttgctcagt
NM_009504  Vdr  2.6  agattgccgcatcaccaagg  atctctcgcttacgctgcac
NM_175445  Rassf2  1.8  cccacaatgtgtatgcttgc  gctgctgggaatttaaccaa
NM_013870  Smtn  2.0  tcaagcagatgttgctggac  tcaaaagcctcagggaagaa
NM_144925  Tnrc6a  1.8  ttgaatcatgcaggccaata  ggaaagaaggaatccgaagg
NM_013703  Vldlr  3.0  tcgggctttgtttactggtc  agtagaggcggcttttgaca
NM_010834  Mstn  >100  ctgtaaccttcccaggacca  gcagtcaagcccaaagtctc
NM_023377  Stard5  2.3  aagtccctctatggccacct  aagacccacacagggacaag
NM_033602  Peli2  1.5  ccagacggtagtggtggagt  gtgatctgggcatcttcgtt
NM_183187  Bc055107  9.9  gacgcacccaagagtggtat  ggccaaggaattctgtgtgt
NM_173440  Nrip1  4.3  tgaggcagacgatactgacg  cctcgcaacttccttagcac
NM_009769  Klf5  3.6  cacgtacaccatgccaagtc  ctgcagcatctcagcttgtc
NM_144808  Slc39a14  2.5  cagaggcttttggcttcaac  ggtgctcgtttttctgcttc
NM_008046  Fst  2.0  ctcttcaagtggatgattttc  cattcgttgcggtaggtttt
NM_009627  Adm  8.8  ttcgcagttccgaaagaagt  tgtcgtctcatcagcgagtc
NM_181444  Rai3  2.0  gagtttcgacagctcccaag  ggctctgttctccacaccat
NM_009667  Ampd3  4.1  ctcccaattttggttgcact  gtggacagtcagggacaggt
NM_010295  Gclc  8.5  aacacagacccaacccagag  tggcacattgatgacaacct
NM_009883  Cebpb  9.1  tggacaagctgagcgacgag  tgtgctgcgtctccaggttg
NM_024406  Fabp4  4.1  tcacctggaagacagctcct  aagcccactcccacttcttt
NM_175175  Plekhf2  4.7  ggctttgttcctgctttttg  ccaagttcccatcagcattt
NM_020257  Clec2i  1.8  tggtcccacagtgctatcaa  acaccatgtggttgctcaga
NM_008871  serpine1  3.9  gtctttccgaccaagagcag  gacaaaggctgtggaggaag
NM_011756  Zfp36  2.8  ccctctgcaactctggtctc  gaccaccggacactgaactt
NM_025404  Arfl4  9.7  tgctacctgccgttttaagg  taggggccatttcagtcaag
NM_177710  Ssh2  3.3  gcagcctttttctcacttgg  tgtgaagagggctggagagt
NM_133919  Mllt2h  2.9  agttggacaggatccgtttg  caataagcagcgcagaacag
NM_007796  Ctla2a  2.5  agggctcagccagagtaaca  gagcctctccagcatcattc
NM_026232  Slc25a30  2.4  ggaagccctttgtgtatgga  ctcggaagttggcatcattt
NM_010276  Gem  4.5  cctgctacgtggatgtctca  aggacctcgaaatcacatgg
AK150917  Ctla2b  2.7  caaattgctgtggaagctca  gcccttccaggtgtcagata
NM_133232  Pfkfb3  2.2  cctgaggaaaccttgagctg  cccagaagacatgtggacct
NM_178076  Mcf2l  3.3  gcagtagaccagcatgcaaa  actaagcaagcccaagcaaa
NM_145076  Trim24  1.7  gcctaagcagaatcctgtcg  tgctgaatatgctggagtcg


                                        Responsive   Induced   Repressed   Unresponsive
# of genes                              86           69        17          40
Total # of GBSs                         384          325       59          181
Average # of GBSs per gene              4.5          4.7       3.5         4.5
Frequency of GBS occurrence             14kb         14kb      18kb        14kb
Range of GBSs per gene                  0-11         0-11      1-7         1-10
Total # of conserved GBSs               107          93        14          37
Average # of conserved GBSs per gene    1.2          1.4       0.8         0.9

Table 2.2: Identification of GR targets and GR recognition motifs. Frequency of GBSs at genes activated, repressed or unresponsive to dex. GBSs were identified by computational scanning of 64kb surrounding the transcription start sites of the mouse genes shown in Table 2.1. Human aligned GBS sequences were obtained from Ensembl Genome Browser. GBSs that contain at least 9 identical bases across the mouse and human sequences are considered conserved, and any GBSs that contain internal gaps at positions 4-12 of the 15 bp motif are considered not conserved.

Sequence Conservation of GBSs is Sufficient to Predict GR-occupancy at Dex-induced Genes

We showed previously that the precise sequences of GR-occupied GBSs within individual GREs at genes induced by dex are highly conserved across four mammalian species (human, mouse, rat, dog) [70]. Based on this finding, we wished to assess whether GBS species conservation at dex-induced genes is sufficient to predict GR occupancy. We began with simple pairwise comparisons, examining mouse-human identity with no weighting bias introduced at any base position across the motif; the mouse-human evolutionary separation is ~100 million years. We then assessed GR occupancy using chromatin immunoprecipitation (ChIP) at these sites in C3H10T1/2 cells. Whereas GR was bound at only 13 of the 39 (33%) weakly conserved sites (9-11 of 15 base pairs identical between mouse and human) (Table 2.3), we detected GR binding at 16 of the 34 (47%) moderately conserved sites (12-13 identical bases), and at 17 of the 20 (85%) highly conserved sites (14-15 identical bases). In contrast, only four of the 76 (5%) tested non-conserved sites (< 9 base identity) were bound by GR (Table 2.3). In summary, we found that mouse-human sequence identity of GBSs correlated directly with the likelihood of these sites being bound by GR (Figure 2.1A). Thus, pairwise (mouse-human) sequence conservation of GBSs at GR-induced genes is sufficient to predict genomic sites occupied by GR in vivo.
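The pooled occupancy rates quoted above can be recomputed from the per-identity counts in Table 2.3. The following is a minimal Python sketch, not the analysis code used in this study; the names `INDUCED`, `BINS`, and `bin_occupancy` are illustrative.

```python
# (occupied, tested) GBSs at dex-induced genes, keyed by the number of
# mouse-human identical bases. Values are copied from Table 2.3.
INDUCED = {
    15: (7, 7), 14: (10, 13), 13: (9, 14), 12: (7, 20),
    11: (6, 15), 10: (5, 16), 9: (2, 8),
}
NONCONSERVED = (4, 76)  # sites with fewer than 9 identical bases

# Conservation bins as defined in the text (inclusive identity ranges).
BINS = {"weak": (9, 11), "moderate": (12, 13), "high": (14, 15)}


def bin_occupancy(counts, low, high):
    """Pool counts over identities low..high; return (occupied, tested, fraction)."""
    occupied = sum(counts[n][0] for n in range(low, high + 1))
    tested = sum(counts[n][1] for n in range(low, high + 1))
    return occupied, tested, occupied / tested


if __name__ == "__main__":
    for name, (low, high) in BINS.items():
        occ, tot, frac = bin_occupancy(INDUCED, low, high)
        print(f"{name:9s} {occ:2d}/{tot:2d} = {frac:.0%}")
    occ, tot = NONCONSERVED
    print(f"{'none':9s} {occ:2d}/{tot:2d} = {occ / tot:.0%}")
```

Pooling the raw counts before dividing, rather than averaging per-identity percentages, weights each tested site equally.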

Number of Mouse-Human   Dex-Induced Genes   Dex-Unresponsive Genes   Dex-Repressed Genes
Identical Bases         Number of           Number of                Number of
                        GR-Occupied GBSs    GR-Occupied GBSs         GR-Occupied GBSs
15                      7/7                 0/1                      0/0
14                      10/13               0/3                      0/4
13                      9/14                1/4                      0/1
12                      7/20                0/5                      0/1
11                      6/15                1/10                     0/1
10                      5/16                0/10                     0/2
9                       2/8                 0/4                      0/5
<9                      4/76                0/13                     0/45

Table 2.3: GR occupancy at genes induced, unresponsive or repressed by glucocorticoids. (A) GR occupancy of GBSs at dex-induced genes assessed by chromatin immunoprecipitation. DNA samples immunoprecipitated from C3H10T1/2 cells treated for 90 min with ethanol or dex were quantified with qPCR using specific primers spanning the corresponding GBS. Relative amplification was normalized to a genomic region near the Hsp70 gene, which does not bind GR. (B) GR binding at genes unresponsive to dex. A set of genes unresponsive to dex in C3H10T1/2 cells (Table 2.1) was chosen at random. (C) GR occupancy at genes repressed by dex.


Figure 2.1: Pair-wise sequence conservation of GBSs is sufficient to predict GR occupancy at dex-induced genes. (A) Direct correlation between sequence conservation of GBSs and GR occupancy at dex-induced genes. The number of identical bases of the GBSs was plotted against the percent within that population that was occupied by GR in C3H10T1/2 cells. All the conserved GBSs identified at genes induced by dex are represented in the graph. (B) GR occupancy of conserved and nonconserved GBSs at dex-induced genes. GR occupancy at all of the conserved GBSs was experimentally tested by chromatin immunoprecipitation. For the nonconserved GBSs, the number of GR-occupied sites was extrapolated from 76 tested sites (Table 2.3).


Of the total 325 GBSs identified at 69 genes up-regulated by dex, we found 232 sites (72%) that were not conserved (<9 bp identity between mouse and human) (Table 2.2, Table 2.3, Figure 2.1B). Extrapolation from the tested sites suggested that only 12 (5%) would be GR bound in C3H10T1/2 cells. Assessing the 93 conserved sites (≥9 bp identity between mouse and human) identified near these genes, we correctly predicted GR occupancy at 46 (50%) (Table 2.3, Figure 2.1B); those 46 occupied sites were associated with 31 (45%) of the 69 genes induced by dex. Notably, the prediction rate is a direct function of sequence conservation stringency: we predicted GR occupancy at ~80% of GBSs with >12 bp mouse-human identity, and at 100% of GBSs with full identity between the two species. Thus, by applying computational and conservation analysis, we could predict precise 15 bp positions bound by GR within the 64,000 bp regions sampled. Our analysis identified both known GBSs, such as those near the promoters of Sgk and Mt2 [70,72], and novel sites, such as those near Peli1, Tob2, Bteb1, and Bcl2l1 (Figure 2.2A, Figure 2.2B).
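The extrapolation for the nonconserved sites amounts to scaling the occupancy rate of the 76 tested sites to all nonconserved sites. The sketch below illustrates that arithmetic with counts from Tables 2.2 and 2.3; the function and constant names are illustrative, and the result is a simple point estimate with no confidence interval.

```python
# Counts taken from Tables 2.2 and 2.3 (dex-induced genes).
TOTAL_GBS = 325
CONSERVED_GBS = 93
CONSERVED_OCCUPIED = 46        # every conserved site was tested by ChIP
NONCONS_TESTED = 76
NONCONS_TESTED_OCCUPIED = 4


def extrapolate_nonconserved():
    """Scale the occupancy rate of tested nonconserved sites to all of them.

    Returns (number of nonconserved sites, estimated number occupied).
    """
    nonconserved = TOTAL_GBS - CONSERVED_GBS
    rate = NONCONS_TESTED_OCCUPIED / NONCONS_TESTED
    return nonconserved, round(nonconserved * rate)


nonconserved, est_occupied = extrapolate_nonconserved()
total_occupied = CONSERVED_OCCUPIED + est_occupied
share_nonconserved = est_occupied / total_occupied

print(f"nonconserved sites: {nonconserved}")
print(f"estimated occupied nonconserved sites: {est_occupied}")
print(f"share of all occupied sites: {share_nonconserved:.0%}")
```

Because only 4 of the 76 tested sites were occupied, the estimate of 12 sites carries considerable sampling uncertainty.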

A

Gene     Mouse             Human
Mt2      AGGACAGCCTGTCCT   AGGACAGCCTGTCCT
Sgk      AGAACAGAATGTTCT   CGGACAAAATGTTCT
Peli1    GGTACAGAAAGTGCT   GGTACAGGAAGTGCT
Tob2     GGCACAGGATGTCCC   AGCACAGGATGTCCC
Bteb1    AGGACAAACTGTTCC   AGGACAAACTGTTCC
Bcl2l1   GGGACACTGTGTTCC   GGGACACTATGTTCC

Figure 2.2: Identification of known and novel GR-occupied GBSs. (A) Examples of conserved GBSs found in C3H10T1/2 cells. GBSs found near the indicated genes are shown, together with corresponding human sequences obtained from the Ensembl Genome Browser. Displayed in red letters are bases that are identical in the mouse and human genomes. (B) Examples of GR occupancy of GBSs in C3H10T1/2 cells. Chromatin immunoprecipitation experiments were performed and analyzed as indicated in Table 2.3. The data represent an average of at least three independent experiments and are plotted with standard error of mean. The genes shown correspond to the dex-induced transcript nearest the corresponding GBS.
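The mouse-human identity counts behind panel (A) follow from a position-by-position comparison of the aligned 15-mers. The Python sketch below applies the conservation rule stated in the Table 2.2 legend (at least 9 identical bases, no gaps at positions 4-12). It assumes that gaps appear as "-" in the aligned human sequence, which is an illustrative convention and not something specified in the text; the function names are likewise illustrative.

```python
MIN_IDENTICAL = 9               # conservation threshold (Table 2.2 legend)
CORE_POSITIONS = range(4, 13)   # 1-based positions 4-12 of the 15 bp motif

# Mouse and human GBS sequences from Figure 2.2A.
PAIRS = {
    "Mt2":    ("AGGACAGCCTGTCCT", "AGGACAGCCTGTCCT"),
    "Sgk":    ("AGAACAGAATGTTCT", "CGGACAAAATGTTCT"),
    "Peli1":  ("GGTACAGAAAGTGCT", "GGTACAGGAAGTGCT"),
    "Tob2":   ("GGCACAGGATGTCCC", "AGCACAGGATGTCCC"),
    "Bteb1":  ("AGGACAAACTGTTCC", "AGGACAAACTGTTCC"),
    "Bcl2l1": ("GGGACACTGTGTTCC", "GGGACACTATGTTCC"),
}


def identical_bases(mouse, human):
    """Count aligned positions where both sequences carry the same base."""
    if len(mouse) != len(human):
        raise ValueError("aligned sequences must have equal length")
    return sum(m == h and m != "-" for m, h in zip(mouse, human))


def is_conserved(mouse, human):
    """Apply the rule: no gap at positions 4-12 and at least 9 identities."""
    has_core_gap = any(human[i - 1] == "-" for i in CORE_POSITIONS)
    return not has_core_gap and identical_bases(mouse, human) >= MIN_IDENTICAL


if __name__ == "__main__":
    for gene, (mouse, human) in PAIRS.items():
        print(gene, identical_bases(mouse, human), is_conserved(mouse, human))
```

All bases are weighted equally here, in line with the unweighted pairwise comparison described in the text.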

GR-occupied GBSs are Functional in Reporter Assays

We subcloned 500 bp DNA fragments encompassing our GR-occupied GBSs into a luciferase reporter vector to assess their potential to function in cells as glucocorticoid response elements. In C3H10T1/2 cells transfected with these reporters, luciferase activity was stimulated greater than two-fold for 9 of the 11 reporters tested upon treatment with dex (Figure 2.3). To confirm that GR was indeed recognizing the 15 bp GBSs predicted in our computational analysis, we mutated highly constrained positions within the GBS motif in the 9 dex-responsive reporters. In each case, the dex-stimulated luciferase response was compromised in the mutants (Figure 2.3), indicating that our computational analysis was indeed identifying GR-bound GBSs that could confer glucocorticoid responsiveness.

Figure 2.3: GR-occupied GBSs are functional in C3H10T1/2 cells. Cells transfected with reporters each bearing a single copy of a 500 bp GBS-containing fragment were treated with ethanol or dex and harvested, and the activity of the firefly luciferase reporter was normalized to Renilla luciferase activity. The figure displays the ratio of relative light units in log scale of the reporters treated with dex or ethanol, averaged over at least four independent experiments. Black bars represent the reporters with wild-type GBSs (showing standard error of mean) whereas the gray bars correspond to mutant reporters, in which the GBSs were mutated to sequences that abrogate GR binding (see Materials and Methods and Table 2.4).
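The normalization described in this legend can be sketched as follows. The readings in `EXPERIMENTS` are hypothetical values chosen for illustration, not data from this study, and the choice to average the fold values before taking the logarithm is an assumption; the legend does not specify the order of those operations.

```python
import math
import statistics


def normalized_activity(firefly, renilla):
    """Firefly luciferase signal normalized to the Renilla control."""
    return firefly / renilla


def fold_induction(dex, ethanol):
    """Ratio of normalized activity, dex over ethanol.

    Each argument is a (firefly, renilla) pair from one experiment.
    """
    return normalized_activity(*dex) / normalized_activity(*ethanol)


def summarize(experiments):
    """Return (mean fold, standard error of the mean, log10 of mean fold)."""
    folds = [fold_induction(dex, etoh) for dex, etoh in experiments]
    mean = statistics.mean(folds)
    sem = statistics.stdev(folds) / math.sqrt(len(folds))
    return mean, sem, math.log10(mean)


# Hypothetical relative light units: ((firefly, renilla) dex, (firefly, renilla) ethanol)
EXPERIMENTS = [
    ((800.0, 100.0), (200.0, 100.0)),
    ((900.0, 100.0), (150.0, 100.0)),
    ((1000.0, 100.0), (200.0, 100.0)),
    ((500.0, 100.0), (100.0, 100.0)),
]

mean_fold, sem_fold, log_fold = summarize(EXPERIMENTS)
print(f"fold induction: {mean_fold:.2f} +/- {sem_fold:.2f} (log10 = {log_fold:.2f})")
```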

GBS Conservation and Occupancy at Genes Unresponsive to Glucocorticoids

We showed previously that GR occupancy in a given cell type appears to be strongly restricted to genes that are actually glucocorticoid responsive within those cells [70]. Consistent with this finding, we detected bound GR at only two of the 37 (5%) conserved GBSs in C3H10T1/2 cells at genes not regulated by dex in those cells, and at none of the 13 non-conserved sites tested (of 144 total) (Table 2.3). Recall, in contrast, that GR occupied conserved sites at 45% of the dex-induced genes. The two occupied GBSs at the unresponsive genes are nevertheless notable, as they appear to represent the relatively rare cases in which events subsequent to receptor binding may serve as primary determinants of glucocorticoid responsiveness.

Distinct GBS Occupancy at GR Induced vs. Repressed Genes

GR can either activate or repress gene transcription. We found that GBSs occurred at similar frequencies at these two classes of genes (Table 2.1, Table 2.2), and that mouse GBSs were conserved in the human genome at only slightly lower rates at repressed relative to induced genes. To our surprise, we failed to detect GR binding at any GBSs associated with GR repressed genes (Table 2.2, Table 2.3), whereas at least 45% of the induced genes contained GR-bound GBSs. Thus, there appear to be at least three constraints on GR occupancy of GBS motifs: first, GR generally fails to bind at GBSs associated with genes that are not GR regulated in a given cell context; second, GR typically does not occupy GBSs that are not well conserved at dex-induced genes; and third, GR binding is not detected even at conserved GBSs linked to GR repressed genes.

We conclude that GBS conservation is a strong predictor of GR occupancy and function at GR induced genes, but that a GBS alone is not sufficient to specify GR occupancy.


Chapter 2 Discussion

In this study, we examined whether sequence conservation of GBSs across two mammalian species is sufficient to predict GR-occupied GREs. We first identified glucocorticoid responsive genes in mouse C3H10T1/2 cells (Table 2.1). We then inferred sites recognizable by GR at these genes using previously defined GBSs to derive a receptor binding motif positional weight matrix. Next, we determined the level of sequence conservation of these sites between mouse and human and tested GR occupancy using chromatin immunoprecipitation. Strikingly, we found that the level of conservation of these sites correlated directly with the predictability of these sites being occupied by GR at dex-induced genes (Figure 2.1A).

Using sequence conservation analysis, we predicted sites of GR occupancy at nearly half of the genes induced by dex. The remaining genes may be secondary targets, may have GR-occupied GBSs outside of the 64kb examined here, or may have GR binding sequences that do not conform to the motif applied in this study. Alternatively, these genes may be bound by GR at relatively poorly conserved GBSs, which comprise only ~20% (12 of 58 GBSs) of GR occupied sites (Figure 2.1B). However, it should be noted that we restricted our analysis to positionally conserved sites, and that these sites found in the mouse genome may actually be conserved in sequence but shifted in location within the human genome. Moreover, GBSs that are not conserved between mouse and human may be conserved in genomes of other species, and extending our study beyond the current pairwise analysis may be informative. Finally, some non-conserved sites may reflect selected changes that enable species evolution.


Among the genes that are glucocorticoid-unresponsive in C3H10T1/2 cells, we expect a subset of GBSs found at these genes to be responsive in other cell contexts, and the remainder to be unresponsive under any conditions. We predict that some of the 37 conserved GBSs at genes that are unresponsive in C3H10T1/2 cells may in fact be occupied and functional in other cell contexts. In contrast, GBS motifs that are not conserved at unresponsive genes are unlikely to be GR-bound and functional in any context.

We found that GBSs occurred at similar frequencies at glucocorticoid induced and repressed genes in C3H10T1/2 cells. However, none of the GBSs at dex-repressed genes were bound by the receptor, indicating that GR occupancy at these sites is specific to dex-induced genes and precluded from repressed genes. GR has been demonstrated to be an active repressor [21]; thus, our study suggests that GR may occupy specific sites with distinct sequences to elicit transcriptional repression. Consistent with this notion, repression of POMC [73,74] and osteocalcin [75] by glucocorticoids is directed by binding of GR to DNA sequences distinct from the GBS motif applied in this study. In addition, GR tethers through protein-protein interactions to DNA-bound nonreceptor factors, such as NFκB and AP-1, to repress transcription of genes such as interleukin-8 [21] and collagenase [76], respectively. Thus, the GBS motif examined here appears to specify transcriptional activation by GR, but not repression; moreover, GR fails to occupy the motif close to GR repressed genes.

In general, we found that GR occupancy is remarkably site-selective. This is despite the fact that GBSs are widespread across the mouse genome, as predicted by the rather modest conservation constraints across the 15-mer. Hence, GBSs occurred at comparable frequencies near genes activated, repressed or unresponsive to dex, yet bound GR was detected at these sites only at activated genes. For example, precisely the same GBS (GGAACAGAATGTTCA) appears at the activated Dusp1 and unresponsive Adamts19 genes, but it is conserved and occupied by GR only at the former gene (Table 2.6, 2.7). Indeed, we observed selective GR occupancy even at GBSs associated with activated genes. The same GBS (AGAACAGTCTGTGCT) is present at Il6ra and Adm genes (data not shown), both of which are induced by dex in C3H10T1/2 cells, but is only occupied by the receptor at the Il6ra gene. These and other findings suggest strongly the presence of additional elements that either promote or inhibit GR binding at GBSs in particular contexts, and in a manner that also relates to GBS conservation. In the simplest case, such elements might be proximal to the GBSs themselves. Indeed, for 50 GREs examined in another study [70], we found conserved regions of up to 2 kb that surround GR-occupied GBSs, and identified enriched motifs resembling non-GR factor binding sites within the GREs; it is tempting to speculate that those sites may be important for directing receptor occupancy at specific GBSs, and it will be interesting to determine if the nonconserved GBSs lack such sequence signatures.

In this study, we were able to predict GR-occupied GREs by exploiting sequence conservation of the GBSs with the constrained and flexible base positions weighted equally, consistent with our previous finding that individual GBSs are strongly conserved across the full 15 bp motif (1). Structural studies have determined that GR makes specific contacts with as few as four bases of the 15 bp GBS [51]. Thus, it appears that the receptor-binding sites are not merely docking sites for GR but rather serve additional functions. As GR regulates transcription in a highly gene-specific manner [4], the precise sequence of the receptor-binding site may harbor a ‘regulatory code’ that confers allosteric effects on GR (7) to direct such gene-specificity.


Chapter 2 Materials and Methods

Cell Culture, plasmids, reporter analysis

C3H10T1/2 cells were grown in Dulbecco’s Modified Eagle’s Medium (DMEM) with 10% fetal bovine serum (FBS), pyruvate, penicillin and streptomycin in a 5% carbon dioxide atmosphere. For all experiments, the cells were supplemented with 1 µg/mL of insulin upon treatment with ethanol or dex (1 µM). For reporter analysis, inserts were cloned into the pGL4.10 E4TATA plasmid using KpnI and XhoI sites, and GBS mutagenesis was carried out using QuikChange (Stratagene) (see Table 2.4 for primer sequences). Approximately 200,000 C3H10T1/2 cells were transfected with 200 ng of the reporter constructs and 200 ng pRL Luc (Promega) using the Amaxa 96-well shuttle with SEC solution and the 96-CM-137 program (Amaxa Biosystems). Three hours after transfection, cells were treated with EtOH vehicle (0.1%) or 100 nM dex for 6-8 hr and harvested, and luciferase activity was measured as described for the dual luciferase reporter system (Promega) using a Tecan Ultra Evolution plate reader (Tecan).

WT Reporters

GBS 4 (GBS sequence: AGGACATTGTGTCCA)
  GCTGCAGGTACCTCCAGGGAAACAGGTAGGAGGTAGAGAG
  CGCTCTCTCGAGCCCAAGAGGCTCCTGGGAGCTTGGCCC
GBS 8 (GBS sequence: GGCACAGGATGTCCC)
  GCTGCAGGTACCTGGGATCTCGGATTCCCCTCTACTCTCTG
  CGCTCTCTCGAGTCGGAATCTCTTGAGGAATGTGACTGGT
GBS 3 (GBS sequence: AGAACAAAATGTACA)
  GCTGCAGGTACCGCTGAAACTTCCTGGCTGCAAAGTGGG
  CGCTCTCTCGAGAGGCTGTTGCAAAAGCGACCAGCAGGCG
GBS 5 (GBS sequence: AGGACAAACTGTTCC)
  GCTGCAGGTACCAGAATGGGTGGGACTAGGGGCTTAAG
  CGCTCTCTCGAGAGCTATGCACACACTCAAGCAGTATT
GBS 6 (GBS sequence: AGGACAGCCTGTCCT)
  GCTGCAGGTACCAGGGAGCACGGGAGCCCAGAAATCAATC
  CGCTCTCTCGAGCCAGGGTGTTTGCCAAGTAGGAAGGGCC
GBS 9 (GBS sequence: GGAACAGGCAGTGCC)
  GCTGCAGGTACCTGGGAGAGGGTGGGCTAGGAGGAG
  CGCTCTCTCGAGGAGAATCGGGGAGGGGTGGAGGTGAC
GBS 10 (GBS sequence: GGTACAGAAAGTGCT)
  GCTGCAGGTACCAGGTGTGAAATATCATTTTTCCACGTGA
  CGCTCTCTCGAGATCTTGAGACAGAAACAAGGTGGCCACAT
GBS 18 (GBS sequence: AGTACACTCTGTTCA)
  GCTGCAGGTACCCAGCACGAAGCAGGGAGTGATGTCAG
  CGCTCTCTCGAGGATGGGATGAGTGTCCTCAGGGAAGG
GBS 27 (GBS sequence: AGAACAGAATGTTCT)
  GCTGCAGGTACCTTGTGAGAGAGGGACCCGGGCTAGGG
  CGCTCTCTCGAGCACAGCGGGCGACGGCGACCGGCGAGC
GBS 28 (GBS sequence: AGTACATTTTGTACT)
  GCTGCAGGTACCGTCCCTCAGCTAACTGTACCCAGAGTGT
  CGCTCTCTCGAGACTTGTGAGGTGACCTCCAGTACACACA
GBS 31 (GBS sequence: GGAACAGAATGTTCA)
  GCTGCAGGTACCAAGGACCTTGGAAGATGAAGAAGGA
  CGCTCTCTCGAGGAACTCACAGGGCCTGTGACTCAGGG

Mutant Reporters

MutGBS 3 (mutated GBS sequence: AGAAGTAAACCTACA)
  GGAAGGAGTTTTGTAGGTTTACTTCTATCAGATTAGCC
  GGCTAATCTGATAGAAGTAAACCTACAAAACTCCTTCC
MutGBS 5 (mutated GBS sequence: AGGAGGAACCCTTCC)
  AGGGGGGTAAAAAGGAGGAACCCTTCCACAACAACCACA
  TGTGGTTGTTGTGGAAGGGTTCCTCCTTTTTACCCCCCT
MutGBS 6 (mutated GBS sequence: AGGAGCGCCCATCCT)
  ACTAAGAGCTGCGAGGAGCGCCCATCCTGTTACAACCCACC
  GGTGGGTTGTAACAGGATGGGCGCTCCTCGCAGCTCTTAGT
MutGBS 9 (mutated GBS sequence: GGAATTGGCCCTGCC)
  GTGGCAGGCCAGGGAATTGGCCCTGCCCAAGGACTGAGG
  CCTCAGTCCTTGGGCAGGGCCAATTCCCTGGCCTGCCAC
MutGBS 10 (mutated GBS sequence: GGTATTGAAACTGCT)
  TATCCCGTTCAGGGTATTGAAACTGCTCTTCAAATGTCC
  GGACATTTGAAGAGCAGTTTCAATACCCTGAACGGGATA
MutGBS 18 (mutated GBS sequence: AGTAGGCTCCCTTCA)
  ATGACCACACAAAGTAGGCTCCCTTCAGTAGAAGGCATA
  TATGCCTTCTACTGAAGGGAGCCTACTTTGTGTGGTCAT
MutGBS 27 (mutated GBS sequence: AGAATTGAAAGTTCT)
  AATTTATGCGGAAAGAATTGAAAGTTCTCGGAGATTAGTTG
  CAACTAATCTCCGAGAACTTTCAATTCTTTCCGCATAAATT
MutGBS 28 (mutated GBS sequence: AGTAAGTTTCATACT)
  GAACCCTCCCGTCTAGTAAGTTTCATACTGTACTTGGAGG
  CCTCCAAGTACAGTATGAAACTTACTAGACGGGAGGGTTC
MutGBS 31 (mutated GBS sequence: GGAATTGAACCTTCA)
  CTCTTCTAGCGCTGAAGGTTCAATTCCCTTTCCCAACAC
  GTGTTGGGAAAGGGAATTGAACCTTCAGCGCTAGAAGAG

Table 2.4 Luciferase GBS reporters. Primers used to clone 500bp inserts into luciferase reporters are shown. The ‘mutant GBS reporter’ primers were used for the QuikChange (Stratagene) reactions to generate reporters that harbor mutations in the GBS sequences.
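Two properties of the mutagenesis primers in Table 2.4 can be checked directly from the listed sequences: the two primers of a pair are reverse complements of each other, and the mutated GBS differs from the wild-type GBS only at a few positions. The Python sketch below illustrates this for GBS 3; the helper names are illustrative and are not part of any protocol described here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mutated_positions(wild_type, mutant):
    """Return the 1-based positions at which two equal-length sequences differ."""
    return [i for i, (w, m) in enumerate(zip(wild_type, mutant), start=1) if w != m]


# Sequences copied from Table 2.4 (GBS 3 and MutGBS 3).
WT_GBS3 = "AGAACAAAATGTACA"
MUT_GBS3 = "AGAAGTAAACCTACA"
MUT_GBS3_PRIMERS = (
    "GGAAGGAGTTTTGTAGGTTTACTTCTATCAGATTAGCC",
    "GGCTAATCTGATAGAAGTAAACCTACAAAACTCCTTCC",
)

first, second = MUT_GBS3_PRIMERS
print("complementary pair:", reverse_complement(first) == second)
print("mutant GBS present in second primer:", MUT_GBS3 in second)
print("mutated positions:", mutated_positions(WT_GBS3, MUT_GBS3))
```

For GBS 3 the substitutions fall at positions 5-6 and 10-11 of the 15-mer, within the two ACA/TGT half-sites of the motif.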


RNA Isolation, Reverse Transcription, Quantitative PCR (qPCR), and Microarray Analysis

The RNA isolation, reverse transcription, and qPCR steps were performed as previously described [4]. For the microarray analysis, we exposed C3H10T1/2 cells to dexamethasone (dex) or DMSO for 90 min in triplicate and hybridized the transcribed cDNA to arrayed MEEBO (whole genome mouse oligo set, Invitrogen) oligonucleotide probes. We analyzed the arrays using the limma package in Bioconductor [77]. Primers for cDNA amplification using qPCR are displayed in Table 2.1.

Computational and Conservation Analysis to Identify Putative GR Binding Sites

The GBS positional weight matrix was derived using GBSs identified in GR ChIP-chip experiments [70]. Briefly, the GBSs were extracted from the GREs using BioProspector: BioProspector analysis was performed using nucleotide widths (w) of 14 and 15. The GBSs found in the top scored motifs and the second top scored motif generated using w14 (Table 2.5) were subsequently used for the positional weight matrix. BioPerl and the TFBS module [78] were used to scan for GBS motifs within 64kb DNA segments surrounding the transcription start site of the closest glucocorticoid responsive gene. All sites that scored within the top ninetieth percentile were considered GBSs. The Ensembl Compara API (Ensembl Release 45) [79] was used to extract the aligned human sequences for GBSs.
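The general idea of building a positional weight matrix from aligned sites and scanning a sequence with it can be sketched in Python as follows. This is not the BioPerl/TFBS pipeline used in this study. The pseudocount, the uniform background, the forward-strand-only scan, and the reading of the cutoff as a relative score of at least 0.9 (score rescaled between the lowest and highest possible matrix scores) are all assumptions made for illustration, and only the first eight sites of Table 2.5 are used to keep the example short.

```python
import math

BASES = "ACGT"
PSEUDOCOUNT = 0.5   # assumption: smoothing for unobserved bases
BACKGROUND = 0.25   # assumption: uniform base composition

# First eight sites listed in Table 2.5.
TRAINING_SITES = [
    "GGAACAGGCCGTACT", "AGAACAGAATGTCCT", "GGTACAAGAAGTACA", "AGAACAGGGTGTTCT",
    "GGGACAGAGAGATCA", "GGAACACGGCGTCCC", "AGAACAGCGCGTTTC", "AGGACAGGACGTCCC",
]


def build_pwm(sites):
    """Return one dict per motif position mapping base -> log2 odds score."""
    total = len(sites) + PSEUDOCOUNT * len(BASES)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + PSEUDOCOUNT) / total / BACKGROUND)
            for base in BASES
        })
    return pwm


def relative_score(pwm, window):
    """Rescale the window score to 0..1 between the worst and best possible scores."""
    score = sum(col[base] for col, base in zip(pwm, window))
    lowest = sum(min(col.values()) for col in pwm)
    highest = sum(max(col.values()) for col in pwm)
    return (score - lowest) / (highest - lowest)


def scan(pwm, sequence, threshold=0.9):
    """Return (start, window, relative score) for forward-strand hits."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if set(window) <= set(BASES):  # skip windows with N or other symbols
            rel = relative_score(pwm, window)
            if rel >= threshold:
                hits.append((start, window, rel))
    return hits


pwm = build_pwm(TRAINING_SITES)
consensus = "".join(max(col, key=col.get) for col in pwm)
print(consensus, scan(pwm, "TTTT" + consensus + "TTTT"))
```

A fuller scan would also score the reverse complement of each window, since the GBS is an imperfect palindrome and can occur on either strand.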


GGAACAGGCCGTACT AGAACAGAATGTCCT GGTACAAGAAGTACA AGAACAGGGTGTTCT GGGACAGAGAGATCA GGAACACGGCGTCCC AGAACAGCGCGTTTC AGGACAGGACGTCCC AGGACATAGTGTTCT TGCACAGAGTGTTCC GGTACAGAATGTGTG GGAACAGCATGTGCA GGCACAGAATGACCA AGTACAAACTGTACC TGAACACAATGTGTC AGTACAGGGCGTTCC AGGACACTTTGTTGC GGGACACAGTGTCCA GGCACAAGGTGATTG AGAACAGAGTGTTCC AGCACATACTGAACC GGAACAAACAGTCCT GGCACAGGCTGTTTC AGAACACAGTGACTT TGTACAGCAAGTGCA GGCACAGGGTGTTCA GGAACATCCTGTCTG AGCACAAAGCGAGTG GGGACAAGGAGTTCT AGGACAGCCTGTCCT GGGACAAGGAGTTCT AGAACAACATGTCTT AGAACATGCTGTCAT TGGACAGACAGATCA AGGACAGGACGTGCA AGCACAGCATGTCCT TGAACAGAATGTACC TGCACAGACTGTGCT GGAACAGAAAGTATT AGCACAGAAAGTTCG AGAACAGGGTGTTCC TGTACACAGTGTCCC GGTACAGTTTGTTAC GGCACATCGTGTCCA GGTACAAACTGAGTG GGTACAAACAGTACC AGAACAGCGTGACCT GGCACAAGGTGTTCC TGGACACAGTGTCTT GGTACAGAATGTTCC AGGACAGAGTGTCTG AGAACATTGGGTTCC AGCACATACTGTATC GGCACAGCGTGTGGC GGTACAACCTGTATC AGAACATTTTGTCCG AGCACATCGAGTTCA AGAACATTTGGTGCT GGTACAGAGTGTGAG GGGACATCTGGTTCC AGCACACTCAGTTTG AGAACATTCTGTGAG TGAACAAACTGTTTT AGGACATGCTGGTCA AGAACAGGATGTTTA AGGACACTGTGGACA GGGACAAACTGTGTT AGGACAGTGTGGACA GGTACAGAGTGATAT AGGACATGGAGGTCT AGTACAGGAAGTACA TGAACACTCTGAAAT GGTACAGACCGTTCT GGAACAATGTGTGAG AGAACACAATGTTCT GGAACATTCTGATTT GGGACAAACTGTGTT AGCACATCCTGAGGG GGTACAGAGTGATAT

Table 2.5: Positional weight matrix generating GBSs. The GBSs displayed were extracted from known GR binding regions as described in the Materials and Methods section. These sequences were used to generate the GBS positional weight matrix.


Chromatin Immunoprecipitation

ChIP assays were performed as previously described [21]. Briefly, cells were formaldehyde crosslinked, the reaction stopped with glycine, chromatin sheared with sonication, and chromatin samples were immunoprecipitated with 8µg of N499 anti-GR antibody per 15cm plate of cells with Protein A/G beads (Santa Cruz). After washing the beads, chromatin samples were extracted once with phenol-chloroform and purified using a Qiaquick column (Qiagen). The relative amplification of dex versus ethanol treated samples at each genomic site was assessed using qPCR (Table 2.6-8). The primers used for amplifying the ChIP samples are displayed in Table 2.6-8.
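The text does not state the formula used to convert qPCR measurements into the relative dex/ethanol values. One common convention is the comparative Ct calculation, in which the signal at each site is first normalized to a control region (here, the region near Hsp70 mentioned in the Table 2.3 legend) and the dex sample is then compared with the ethanol sample. The Python sketch below assumes that convention and perfect doubling per cycle; the Ct values in the example are hypothetical.

```python
def fold_enrichment(ct_site_dex, ct_ctrl_dex, ct_site_etoh, ct_ctrl_etoh,
                    efficiency=2.0):
    """Comparative Ct estimate of dex/ethanol enrichment at one genomic site.

    ct_site_*  : threshold cycle at the GBS-spanning amplicon
    ct_ctrl_*  : threshold cycle at the non-bound control region
    efficiency : amplification factor per cycle (2.0 assumes perfect doubling)
    """
    delta_dex = ct_site_dex - ct_ctrl_dex
    delta_etoh = ct_site_etoh - ct_ctrl_etoh
    return efficiency ** -(delta_dex - delta_etoh)


# Hypothetical Ct values: the site amplifies four cycles earlier after dex,
# while the control region is unchanged.
example = fold_enrichment(ct_site_dex=24.0, ct_ctrl_dex=28.0,
                          ct_site_etoh=28.0, ct_ctrl_etoh=28.0)
print(f"dex/EtOH enrichment: {example:.1f}")
```

If primer efficiencies differ noticeably from 2.0, a standard-curve-based quantification would be the more appropriate choice.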

Table 2.6: Chromatin immunoprecipitation of GR in C3H10T1/2 cells at dex induced genes. All GBSs found at dex induced genes and examined by chromatin immunoprecipitation followed by qPCR analysis are displayed. NBSs represent those GBSs that were not occupied by GR in C3H10T1/2 cells. The chromosome coordinate, the relative fold dex induced GR occupancy, and the sequences of the GBSs are listed. The distance of the GBSs relative to the transcriptional start site (TSS) of the corresponding genes are shown. The qPCR primer sequences are also displayed.

Mouse Feb 2006 (mm8) GBS Identifier Mismatch (bp) Chromosome Coordinates Sequence of GBSs ChIP (Dex/EtOH) Gene Position From TSS Forward Primer Reverse Primer +GBS 1 0 chr2:164147136-164147150 AGGACAGTTTGTCCC 21.1 Sdc4 12745 TGGGGAATGATGTAAGTGACC CACACTGCCAAGTCTGTGGT +GBS 2 0 chr2:128400941-128400955 AGAACAGGCTGTCTT 4.4 Mertk 10503 ACGGGCTCACACTAGCTGAA ACCAGCCACCAGATACAAGC +GBS 3 0 chr13:102819770-102819784 AGAACAAAATGTACA 19.2 Pik3r1 26866 GGGTTGAACAGGAAGTGATAGC GCGGTTGAAAAAGGAGTTGA +GBS 4 0 chr15:81667054-81667068 AGGACATTGTGTCCA 11.6 Tob2 18527 AGAACAAGCCCTTTGCTCAG TCACAGCTTGGCTGACTGAC +GBS 5 0 chr19:23202190-23202204 AGGACAAACTGTTCC 31.1 Bteb1 6662 TGATGAAACGTGAGCGCTAT GTCAGAAGGGCTGTGTTTCC +GBS 6 0 chr8:97060665-97060679 AGGACAGCCTGTCCT 22.4 Mt2 1248 GGGAAAGGACACGGTGTTTA GCTGTCCTCGCAGCTCTTAG +GBS 7 0 chr4:10959827-10959841 AGAACACAATGTATT 20.3 Plekhf2 25120 TGCTCTGCATTGTGTCATCA TTGCTTCGCTGCTTTTCAC +GBS 8 1 chr15:81661712-81661726 GGCACAGGATGTCCC 24.7 Tob2 23869 CTGGCCAAGGTCTGAGAGTT CTGCTATTTCCAGCCTCTGC +GBS 9 1 chr17:28243704-28243718 GGAACAGGCAGTGCC 20.7 Fkbp5 30027 TCCTAAGAGGTGGGAACACG ACCAAAGGGACAGGGTCTTT +GBS 10 1 chr11:20961517-20961531 GGTACAGAAAGTGCT 15.3 Peli1 29810 CTAGTGTGCATGGAGGCTGA GGGCTCTTACAAGCAAGCTG +GBS 11 1 chr11:101047378-101047392 AGGACAAAGTGTCCA 17.7 Prkwnk4 29323 AGTCAGACCCCTCCCTGAAT AGCAACAGAGAGGCTTCCAG +GBS 12 1 chr9:43837537-43837551 AGAACAAAATGTCCC 12.3 Usp2 19375 GAAAGGTCACTCCTGGGACA AGGCCAGGCCAGATATTCTT +GBS 13 1 chr9:43830085-43830099 AGGACAAGCTGTTCC 17.7 Usp2 11923 CCTAGAGCCAGAGAGCCAGA GTGCTCAGTGGGAACAGCTT +GBS 14 1 chr2:152511910-152511924 GGGACACTGTGTTCC 16.1 Bcl2l1 11213 ATGAGGCTGGGCTCCTAAGT CACACATGTGCCTTTGTTCC +GBS 15 1 chr17:28243222-28243236 GGTACAGTGTGTTAC 15.4 Fkbp5 29545 TCCTAAGAGGTGGGAACACG ACCAAAGGGACAGGGTCTTT +GBS 16 1 chr1:155663844-155663858 AGAACAATATGTTCC 20.0 Glul 1681 GGGAGAGGCAGTTGTTTTCA GGATAGCTGGAGAGGGAACA +GBS 17 1 chr11:51717800-51717814 AGTACACTCTGTTCT 18.3 Phf15 16896 TCACTCAGATCTCGGTCCAA 
CCAGCCCTACCAGAACAGAG +GBS 18 2 chr10:21694418-21694432 AGTACACTCTGTTCA 4.1 Sgk 10307 TGGCAAGCAGAAGATGATGT CCCTTGGTCTACCCTTCACA +GBS 19 2 chr10:76884709-76884723 AGAACATTCTGTCCT 21.4 Adarb1 22720 TCAACTCTGGGCTCTTTGCT ATCGAGACTGGAGTGCCTGT +GBS 20 2 chr11:101046391-101046405 GGCACACTGTGTCCA 14.5 Prkwnk4 30310 AGCCACCCTCTCCTCTCTGT GGCCTCCAATTTAAGAGCAA +GBS 21 2 chr11:59777200-59777214 GGAACAGTATGTACC 4.9 Rsd1 3930 GTGCTATCTCAGCCCTGACC AGGAGCTGAGGGAAAGAAGG +GBS 22 2 chr6:82993665-82993679 AGTACAGGGAGTTCC 45.1 Dok1 5456 GTCCTCCTTCTGTGCTCGAC GGCTGAGGGAGAGTGTGTTC +GBS 23 2 chr4:11907250-11907264 AGGACATTCTGTACC 5.6 Gm1024 13683 GGCATCCAGAGGAGAAAGACT CTTTGAAGCCCAGGAACATC +GBS 24 2 chr17:28215926-28215940 GGAACAGTCAGTTCT 3 Fkbp5 2249 CCCTTCTAGGCCTGTGTCTG AAGGTGGATTTGGTGGTCAG +GBS 25 2 chr10:128075663-128075677 GGCACAGTGTGTCTC 6.3 Suox 111 CTGAGAGTGCATGGTCCAAA TGTACCTTTCCATCCCCACT +GBS 26 2 chr14:7103641-7103655 TGGACATGGTGTTCT 12.1 Bc055107 109 GCACTTTTGCACTCCCATTT GCTTGGACATGGTGTTCTGA +GBS 27 3 chr10:21683118-21683132 AGAACAGAATGTTCT 23.4 Sgk 993 GGCAGTGCAATGCAATCTT GTGAGGAGGTGGCGAGTTAG +GBS 28 3 chr7:109908279-109908293 AGTACATTTTGTACT 21.2 Wee1 4957 GGAGAGAACTGGAGATTTGACTTC TAAAATTGCATGCTCCTCCA +GBS 29 3 chr8:97078693-97078707 GGAACACAATGTCTT 8.7 Mt1 10337 GCCATTGCTCCTTGGTAACT GCTGCACGTTAAATGACTGG +GBS 30 3 chr17:28212542-28212556 GGGACAGGGTGTACA 4.5 Fkbp5 1135 CCGCATGCAGAATTTACTGA GCGTTGGAAGGTACAGATCG +GBS 31 3 chr17:26262929-26262943 GGAACAGAATGTTCA 17.4 Dusp1 26868 GGCTTTGAGCTCACTTCCTG CTGGGTCCACTTTCCCACTA +GBS 32 3 chr4:41038353-41038367 AGAACAGTCTGTACT 3.2 B4galt1 4316 ACCCTAAAGCAGAGGGTGCT ACCTTCACCCATTTGTGCAT +GBS 33 3 chr8:97060562-97060576 AGGACACGGTGTTTA 22.4 Mt2 1351 GGGAAAGGACACGGTGTTTA GCTGTCCTCGCAGCTCTTAG +GBS 34 4 chr16:15803412-15803426 GGAACAAAATGTCCA 3.5 Cebpd 2548 CCTTCCCAAGGACGCTCT AGGAAGGCTGCAGCTTATGA +GBS 35 4 chr11:116350355-116350369 GGGACAAAATGTTTC 7.4 Sphk1 918 TCTTCCCCCAAGAAATGATG CACAGGTTGTCTGGGAGTGA +GBS 36 4 chr2:154094475-154094489 
AGTACACAGAGTTCG 6.6 Snta1 5047 TGACCTCAGCAGCTTTGCTA CCTTTCCCTCGAACTCTGTG +GBS 37 4 chr3:101516409-101516423 GGAACAGAACGTTCT 6.6 Igsf3 9032 TTGCAGTCTTTGTCCCACAC GAACGTTCTGTACGGTTCCA +GBS 38 4 chr3:101527211-101527225 AGCACAGATTGTTCC 9.2 Igsf3 19834 TCACTGGTGGAGTTGCTCAG CAGGAGCCAGCATCTTAGGT +GBS 39 4 chr3:89998667-89998681 AGGACAGTGTGTTTT 9.7 Il6ra 422 GAGAGCCAGTGGCATAAAGG GGTGTCTGGTTCCCAGTGTT +GBS 40 5 chr4:11899855-11899869 AGCACAGAGTGTTTC 25.1 Gm1024 6288 TGGTTTCAGGGAACTTCTGG GTGACAGCCAGCAGCAAAT +GBS 41 5 chr8:97068124-97068138 GGGACATGATGTTCC 9.0 Mt1 232 TCAGGAACTCCAGGAAAGGA TATTACGGCCTATCGCTGCT +GBS 42 5 chr8:97078800-97078814 AGAACAGGGTGATTG 13.2 Mt1 10444 CGTGCAGCCTTTTCTTTTTC ACGCCAAAAGCATTCAATTT +GBS 43 5 chr17:28241492-28241506 AGAACACAGTGTCCC 3.9 Fkbp5 27815 AGAAAAAGCTGCCCAGAACA CAGATGGAGCAGAGACACCA +GBS 44 5 chr8:97060824-97060838 GGTACATGGTGTTTC 22.4 Mt2 1089 GGGAAAGGACACGGTGTTTA GCTGTCCTCGCAGCTCTTAG +GBS 45 6 chr1:145759784-145759798 AGAACAGGGAGTACA 7.0 Rgs2 6583 TGGTGCTACTTGGAGAACAGG CATGTGTAAATAATTCCATGTAGCTT


+GBS 46 6 chr10:128075253-128075267 AGCACAGAACGTCCC 4.8 Suox 299 GGGGTTGACTTGACTCTGGA TTCCAATTGCAAACTGGTGA +GBS 47 >6 chrX:135859919-135859933 AGAACACTGTGTGCT 3.5 Tsc22d3 29612 GGGACAGTGATTCACCCAAC TTTCTCTTGGCCTGTTGGTC +GBS 48 >6 chr14:69071730-69071744 AGGACAGCCTGTCCA 2.5 Slc39a14 14774 TTTTGTTCCCATTACGCACA TGTCATCCACTCCCTCTTCC +GBS 49 >6 chr6:128267383-128267397 AGTACAGACTGTCCC 2.9 Tead4 30692 AACAAAGCCTGGTCACATCC TGGTACCAGGTTCAATGAAGG +GBS 50 >6 chr6:83268255-83268269 AGGACACTTTGTTCT 6.8 Mthfd2 15007 CAGGCTGTGACCAGTTCAGT GGCCTTGAATTCCTGATCCT +NBS 1 1 chr4:11889612-11889626 AGTACAAACTGTTCA 1.3 Gm1024 3955 TGCAGTACAAACTGTTCAGCAT AATGGTGATTGGGCAGGTAG +NBS 2 1 chr15:97759398-97759412 AGCACAGCGTGTTAT 1.2 Vdr 23068 AAACCATGGCTTGAATTTCG AGATCCAACCACATCCTTCC +NBS 3 1 chr10:41963309-41963323 GGGACAAAGTGAGCC 0.7 Foxo3a 1848 CTGTCGCCCTTATCCTTGAA CTGTCCTATGCCGACCTGAT +NBS 4 2 chr1:155659337-155659351 GGAACAGGCTGTTCT 0.7 Glul 2826 GGTGGCAATTTAGGGGAGA CTTAACTGGCTTGCCCTCTG +NBS 5 2 chr4:41065960-41065974 AGCACAGGCTGTTTG 1.3 B4galt1 23291 GCAGCCCTTGACAAAAACAT AGCCTGTGCTGTCAAGGAAG +NBS 6 2 chr8:12905870-12905884 AGAACAGAGTGTCCT 1.5 Mcf2l 10065 TACTCTCCTGGCACCAGCTC CTCACCTCACTGCAGGAACA +NBS 7 2 chr3:101520657-101520671 AGAACAAACAGTTCT 1.2 Igsf3 13280 GACCCGCTATAGGGGAGAAG GGGAGTTTCACGGGGACTAT +NBS 8 2 chr11:116351825-116351839 AGGACAGACTGAGCA 1.3 Sphk1 2388 CTGCGGCTCTATTCTGTGCT GCCCACTGTGAAACGAATCT +NBS 9 3 chr13:51852465-51852479 AGAACAAACTGTTTA 0.6 Gadd45g 6726 ATCACCCTGCAACCACATC CCAAGGACAAGATCATCACG +NBS 10 3 chr3:122398368-122398382 GGCACAGACAGTTCA 0.2 Bcar3 13481 GGAGAACCCATTCTGGCATA AAGTGGTCTGGGAAATGTCG +NBS 11 3 chr8:12929574-12929588 GGCACAGAGAGTTTT 0.6 Mcf2l 13639 ACATGTTTGAGGTTGGCACA CACACGCTGGTCTTCACACT +NBS 12 3 chr10:21688012-21688026 AGGACAGGGAGTCCT 1.7 Sgk 3901 GCACTTCGATCCCGAGTTTA GCTTCTGCTGCTTCCTTCAC +NBS 13 3 chr10:21689767-21689781 AGGACAAGCTGTCCA 1.8 Sgk 5656 TCAATAATGTTCCCTGTGTTGA CATATTTGTTAAAAGAATTACCTGTCA +NBS 14 3 chr14:47055485-47055499 GGAACAAGCTGTACT 
1.1 Peli2 12594 TCCCCAGTACAGCTTGTTCC AAAGCCTTGAACCGGAATCT +NBS 15 3 chr13:51860406-51860420 AGAACAAACTGAGCT 1.2 Gadd45g 1215 TGCCTTGGAGAAGCTCAGTT TCACCAAGTCGATCAGACCA +NBS 16 3 chr15:81678080-81678094 AGAACAGGCTGTTAG 0.8 Tob2 7501 AGGGCTGGTTGGGATACTG CAGCAGCAGCTTTGATGTGT +NBS 17 3 chr13:102802613-102802627 AGAACAAAATGTGTC 1.1 Pik3r1 9709 GCGTGATTGGCTACTTCCTC ATGTGTCGAGGGTGGAAGAC +NBS 18 3 chr11:116352030-116352044 GGAACAAGGTGTGTG 1.3 Sphk1 2593 TGCCTTCTCATTGGACTGTG GATGCATAACACCAGCCTCA +NBS 19 3 chr1:53014283-53014297 TGAACAGAGTGATTT 0.7 Mstn 7969 CCACTCCAAATCACTCTGTTCA AGGCCAAAACCGCATAAAC +NBS 20 3 chr5:104094589-104094603 AGAACAGAGTGTGAT 0.9 Mllt2h 1135 GGGGGTGAATCACACTCTGT TGTCCTCTGGCTCCCTCTTA +NBS 21 3 chr3:89995879-89995893 GGGACAGATTGTCTC 1.6 Il6ra 3210 ATGCCAGTCAGATGGGGTAG AATGTGGCCAAAAGCAAATC +NBS 22 4 chr7:122999632-122999646 AGGACAGAGTGACTC 0.9 Tnrc6a 28694 GTGGAGGGACAATGACATCC GGGAATGGCCTATGGTGAGT +NBS 23 4 chr13:115599835-115599849 AGAACAACGTGTATC 1.1 Fst 20371 CCACCAAAGCATCTCCAAAT GCCCAAATTTCACCTCTGTC +NBS 24 4 chr3:101503816-101503830 AGCACATGCTGTCTC 1.0 Igsf3 3561 CAGGTTAGCTGGCCTAGCAC GGGATGGTGAAGGAGGATTT +NBS 25 4 chr1:145793113-145793127 AGGACAAAGTGTTTC 1.3 Rgs2 26746 GCCCTGAAAGAGTTGCAAGA AAGCCATGCATTGCAGAAC +NBS 26 4 chr11:101092520-101092534 AGAACAGAGTGTTTG 1.2 Prkwnk4 15819 GAACCTTCGGCAGAAGTGAG ACTTCCGACAAGTGCCTGAG +NBS 27 4 chr4:11913868-11913882 GGCACAGTATGTCCT 1.7 Gm1024 20301 ATGACAGCTGCTCTCCTGGT GGTTTCCTCCTCGGATCTTC +NBS 28 4 chr7:110583516-110583530 GGAACAGGCTGACCC 1.0 Ampd3 19735 GCATCCTGGCTCTGCTATGT ACAACCTTGGCTCCTGAGTG +NBS 29 4 chr6:83297769-83297783 GGCACAAGCTGTCTT 0.9 Mthfd2 14507 GGGCCCCATCTTCTGTCTA ATCTGCTTCAGAGCCAGCAC +NBS 30 4 chr3:122397212-122397226 GGCACAGGGTGGTCT 0.6 Bcar3 14637 GTCCTCTTGAAGCCCTGTTG CAGAAGCCAGGTCATCATCA +NBS 31 5 chr11:59800223-59800237 GGTACAGACTGTGCG 1.3 Rasd1 19093 ACCCACAGGGTCATTTCAAG TCATTCATACCCGCACAGTC +NBS 32 5 chr4:11890964-11890978 TGGACAGAATGTCTT 0.9 Gm1024 2603 TGACAGATTTGTGTCTGCTTTTG 
AAACAAATCTCTTCTTGGGTAGTC +NBS 33 5 chr14:69107438-69107452 GGAACAAAGTGATTT 1.0 Slc39a14 20934 GCCCCAAATGATTCAAACAC CAAAGTGATTTCACCCTGAGC +NBS 34 5 chr11:21000700-21000714 AGTACAACCTGTACT 1.5 Peli1 9373 GGAGAACAATTGCTATGCAAAA GCTGAACCATCTGGTCATCTC +NBS 35 5 chr2:128392278-128392292 GGCACACAGTGTCCC 0.7 Mertk 1840 CGTGTTTCAGGAGTCCCAGT GGACACTGTGTGCCCCTTAT +NBS 36 5 chr4:41063531-41063545 AGAACAGAGTGATTG 1.0 B4galt1 20862 ATCCTGCCACACCCACATAC ACTCATGTGTGGCCCCTAAG +NBS 37 5 chr14:98170324-98170338 GGCACAGACTGAGCT 0.7 Klf5 11510 GCGGAAACATATACGCACCT CACAATCAGCCACCGTAGAA +NBS 38 5 chr8:97063821-97063835 GGCACAATGTGTCCA 1.6 Mt1 4535 TGCAGCGTGGAGACTAAATG GCTCTGTGGAAGAGGGAAAA +NBS 39 5 chr16:15775335-15775349 AGGACAAGCTGTATT 0.8 Cebpd 25529 GTCCGACAGGCAGAGAGAAC TCACCCAGTTCTCAGGTGTG +NBS 40 5 chr13:51860641-51860655 GGAACAAGGTGTCCC 1.2 Gadd45g 1450 GAGGCTGCTAGCACAGGAAG CTTTCACTGGAAGGCTGCTT +NBS 41 5 chr3:10221919-10221933 AGAACAAACTGTCAT 0.7 Fabp4 30833 TCAGCCAGGATTTGGAAACT ACCTCCCTGACGGATCCTAT +NBS 42 6 chr2:11397863-11397877 AGAACATACCGTTCT 1.0 Pfkfb3 22057 CTCTGCCTGATTCTGCTCAA GAGTGCCTGTGTTCAAGTGG +NBS 43 6 chr13:115585291-115585305 GGGACAGAAAGTACT 1.1 Fst 5827 AAAATGCCAGCATGTGAATG GCAAATGCTGGGTTTGTTTT +NBS 44 6 chr7:110414748-110414762 AGGACAGGATGTATA 0.8 Adm 4098 ACAGTTCAAAACAAAACCTCCAA TGGCACTTCAATGTCTTCCTC +NBS 45 6 chr7:122943446-122943460 GGAACAGGCTGTGTC 0.7 Tnrc6a 27492 CACCTTCAGGATGGGAGAAA CGCACACCATAAAAAGCTCA +NBS 46 6 chr13:102798755-102798769 GGGACAGGCTGAACT 1.6 Pik3r1 5851 GGACAGGCTGAACTCAAAGG ATCACTGGGACGATGAGGAG +NBS 47 6 chr7:83517077-83517091 GGGACAAGGTGTCTT 1.0 Stard5 8875 AAGTCCCTCTATGGCCACCT AAGACCCACACAGGGACAAG +NBS 48 >6 chr17:26259587-26259601 AGAACAAGCTGTGCT 1.4 Dusp1 23526 CTGAAGTCACACAGCCAGGA CTTCCCCCTCCATCTTTTTC +NBS 49 >6 chr11:51713295-51713309 AGAACAGGCTGACCT 0.9 Phf15 12391 AATGCACGTATGGCCAGTCT GGTTCCAGAACAGCCAGCTA +NBS 50 >6 chr10:41942528-41942542 AGGACAGACAGATCT 0.6 Foxo3a 22629 GGCAGTTCAAAGGTCAAAGC TTGGAGAAAGGTTTGCTGCT +NBS 51 >6 
chr8:97053899-97053913 AGAACAGTCAGTGCT 0.6 Mt2 8014 GATCTGATGCCCTCTTCTGG AGCTCCCCCTCTTTAGTGCT +NBS 52 >6 chr2:131750046-131750060 AGAACAGCCAGTGCC 1.0 Rassf2 28795 GGCTGTTGTGGGGCTTAGTA CCCATGGAGGTTAGGAGACA +NBS 53 >6 chr2:128373638-128373652 AGGACAGAGTGGGCT 1.2 Mertk 16800 GCTCTGGCCAGCTAGCTTTA GGGCATAGCCATTTACAGGA +NBS 54 >6 chr2:152529152-152529166 AGCACATGCTGTTTT 0.9 Bcl2l1 6029 AGCTGTGTTGCACCTTCCTT GTGGACATGCAGCACTGAAC +NBS 55 >6 chr9:43834330-43834344 AGGACAGTGTGTCTC 0.9 Usp2 16168 GGGTGAAATGCTTGTGTGTG GACTGGCCAGGACCTTGTTA +NBS 56 >6 chr9:43808132-43808146 AGGACAGCCTGAGCT 0.7 Usp2 10030 TAGGCAGCACATTTGCTCAG GCCTGACTCTGCCTTCTGAC +NBS 57 >6 chr11:59773054-59773068 AGCACAGGGAGTTCA 1.0 Rasd1 8076 TCCCTTCCTGTCTCACTGCT GGCTCTCCTGTACCCACAAG +NBS 58 >6 chr10:76885372-76885386 AGGACAGAGTGTCTC 1.7 Adarb1 23383 CTGTGAGCCCTTCTGGAGTC GGAGCACTGTTAAGGCTTGC +NBS 59 >6 chr10:76832988-76833002 GGTACATAATGTACT 1.0 Adarb1 29001 TGGGTTTGGGTTCTAGATGG TGTCACATCAAAGCCAGGAG +NBS 60 >6 chr10:76890077-76890091 GGGACACAGTGTTTG 0.7 Adarb1 28088 ACCACTTTCATTCCCACCAG GTGCCATCAGGGAAATTCAA +NBS 61 >6 chr4:11641976-11641990 GGGACAGACAGTTTC 0.9 Gem 10372 GACAGTTTCCCCTGCCATTA ATTTGACCCCAAATCCTTCC +NBS 62 >6 chr5:137327677-137327691 GGGACAGAGTGTCAC 0.6 Serpine1 29209 CATTTGAACCCACAATGCAA AATGTGCAGCTTCCTGGAGT +NBS 63 >6 chr4:10902731-10902745 AGAACAGTCAGTGCT 1.3 Plekhf2 31976 TGAAGCTTTAACGGCCTCAG TGGGTCTGTGTCTCAGGCTA +NBS 64 >6 chr7:110553382-110553396 AGTACAGAGAGTACA 0.7 Ampd3 10399 GCCCCCAATCAATAGGACTT GCGTCTCCTTGGTCTGAGAG +NBS 65 >6 chr7:110398200-110398214 AGAACAGTCTGTATT 0.8 Adm 20646 AAGCCAGGTATGGTGGTGAA TCCCTGGCTGTATTGGAACT +NBS 66 >6 chr13:115547863-115547877 GGGACAGTGTGTGTG 0.7 Fst 31601 CCACTCTTGGAAATGGCTTC AGGGGACTGGCTAGATGGTT +NBS 67 >6 chr14:69096205-69096219 AGAACAGAATGTCTC 0.7 Slc39a14 9701 TCCCAGATGAGTGTGCTCAG CAGGGTTTTGCTCTGTAGCC +NBS 68 >6 chr7:122994132-122994146 AGAACAGCATGTACA 0.7 Tnrc6a 23194 TATCACCGAACCACGACTCA CTCCTGCTGTGAGCAGACAC +NBS 69 >6 chr10:128083130-128083144 
AGAACAGTATGTACT 1.0 Suox 7578 CCTCCTGAATGGCAGGATTA TCTGTGTGCACCTAGCCACT +NBS 70 >6 chr5:137329538-137329552 AGCACAGGCTGATCT 1.0 Serpine1 27348 GTGCACTCTAAGGCCTGACC GGGAAGCAGTGAGGACTCTG +NBS 71 >6 chr14:98210760-98210774 AGGACAGACTGACCC 1.1 Klf5 28926 TGCCTCTTGGCTACTTGGAT ATGCTAGCTTGGGCGATTTA +NBS 72 >6 chr15:97716171-97716185 AGAACACGGTGTGCA 0.8 Vdr 20159 AGCCAGTGCTTCACCAGAAT CTCTTTCCCTTCTGCACACC +NBS 73 >6 chr13:51882431-51882445 GGAACAAGATGTGCC 0.7 Gadd45g 23240 GAAGGGGTAGAGGCTTGGAC CAGGGCTTCCAGGTACCATA +NBS 74 >6 chr14:74492403-74492417 GGCACAGACTGTGAC 1.3 Slc25a30 28789 GGGATTAAAGGCATGAGCAA ACCCACTACCCAATGCTTGA +NBS 75 >6 chr11:77023417-77023431 AGGACAAAATGTTTT 1.0 Ssh2 9203 GCCTGAAATCTTTTTGCCAAT TGGCTTCTTGAAAGGCAGAT +NBS 76 >6 chrX:135884296-135884310 AGTACAGAAAGTGCT 1.0 Tsc22d3 5235 CACTAAGTGACTGCGGGACA CATGAGAGAAAGAGGCCTGAG +NBS 77 >6 chr11:21003610-21003624 AGTACAGACTGTCTC 1.1 Peli1 12283 AGTGCACAGCAGTGCTCAAG CATGGAATCTTGAAGCCAGAA +NBS 78 >6 chr17:26258968-26258982 GGGACAGAGAGTGTC 1.0 Dusp1 22907 GCCCCTTGAGACACTCTCTG GTGGGAGGGGACTCATAGTG +NBS 79 >6 chr2:128413051-128413065 AGAACAGAGTGTTAC 0.8 Mertk 22613 GGGCTCAGTGATTCCAGCTA CCCTGGGAGACACAGTAAGG +NBS 80 >6 chr7:83491672-83491686 GGCACAGACTGTCTT 0.9 Stard5 16530 CAGTCTGTGCCAGCTTTCAG GAGTGCACTGGCCTTTTAGC +NBS 81 >6 chr14:47050065-47050079 AGAACAGGCTGTTAA 0.4 Peli2 7174 CCTTAGCCTCTGCCCTTCTT CACAGAGCACGAGGTGTTGT +NBS 82 >6 chr6:37809902-37809916 AGAACAAAATGTTAA 0.8 Trim24 9483 AGCCCCTGTGTATCCTGCTA GCCCAGTGGTTTTGTGTTTT +NBS 83 >6 chr19:27295442-27295456 AGAACAGGGAGTTCA 1.1 Vldlr 10913 ACTTCCTGCTGAGCCATCTC ATGTGATGCATGGCTGTCTC +NBS 84 >6 chr11:116345419-116345433 GGAACACCATGTCCT 1.5 Sphk1 4018 TCTTGTGAGGCCTGTCTGTG ACATGGTGTTCCAATGCTGA +NBS 85 >6 chr2:131746215-131746229 GGCACAGAGTGAACT 1.2 Rassf2 24964 ATTACAGGCCGTTCTCATGC CCTCCATTTGCATTTTGCTT +NBS 86 >6 chr9:77553727-77553741 AGAACAGCCAGTGCT 1.1 Gclc 13233 GTGGGCCTCTTCTCACTGAC CCAGCACTGGCTGTTCTTTT +NBS 87 >6 chr13:102806388-102806402 AGGACAGAGTGTGCC 1.0 Pik3r1 
13484 CACCTGGCAAGCAGTTTGTA TTACCACAGCTCTTGCGTTG +NBS 88 >6 chr15:81699982-81699996 GGCACAGGCTGTTCG 0.8 Tob2 14401 GCAAGCTTGGGTTTAAGCAG CCTTGTCATCACCTTGAGGAA +NBS 89 >6 chr2:128391007-128391021 GGGACAGCCTGTCTG 1.5 Mertk 569 TTGCTTGGAGGCTAGGTGTC CAGGCTGTCCCTACAAAGGA +NBS 90 >6 chr3:101497686-101497700 AGCACAGGCTGTGCA 1.4 Igsf3 9691 CACATCCAGCTTCGAGATCA GGTTCTCCAGGGTATGCTCA +NBS 91 >6 chr13:115594267-115594281 GGTACATGTTGTCCT 0.9 Fst 14803 CACCGACCCCCTTCTCTAAT CCCAGTTTCAACCCTGACAT +NBS 92 >6 chr2:167403175-167403189 AGAACACCTTGTCCA 1.6 Cebpb 23055 TGTGCTGGTGGTCAGCTAAG TGCTCTTTGTGTGCCCATTA +NBS 93 >6 chr13:115558386-115558400 AGTACATAAAGTTCT 1.3 Fst 21078 TTGGTTTCCAAAATGGCTGT ATTTTCATCCCCACGTGAAC +NBS 94 >6 chr2:167363486-167363500 GGGACAGTATGTTCC 1.4 Cebpb 16634 CCTCCCATTCTCGAGTCCAC AGACCTCGCTTTGCATTTTG +NBS 95 >6 chr7:28077961-28077975 GGGACAGAGAGTTCA 1.5 Zfp36 10027 GGAAAGGGGGTGTTGGTAAT GAACTCTCTGTCCCCCTTCC +NBS 96 >6 chr6:128844559-128844573 AGGACAGAAAGTACA 1.1 Clec2l 8674 CAGCTCAGGGCTAATCTGCT TGCCCTTAAATTCTGCCTGA +NBS 97 >6 chr6:37789325-37789339 TGAACAGAAAGTTCA 1.3 Trim24 11094 TCCATAAGGCTGGGTGTCTC CTTGCCTTGCTAGCACACCT +NBS 98 >6 chr7:109945194-109945208 TGGACAAAGTGTCCA 1.2 Wee1 31958 AAGTGGAGGCAGGAAGATCA TGCTAAACGAAGCACACAGG +NBS 99 >6 chr7:109922546-109922560 AGAACAGCCAGTGCT 1.2 Wee1 9310 CAGAAGCCCTGAAACTGGAG TGGCTCAGCAGTTTTAAGCA +NBS 100 >6 chr3:89979974-89979988 GGTACAAGATGTTTT 1.1 Il6ra 19115 CTGCAGCTGAAGAGCGACTA TGAGATGTCCATGGTTGCTG +NBS 101 >6 chr11:21010976-21010990 AGAACAGGTTGTCCT 1.3 Peli1 19649 ATATGGGAGGGAGGGTGAAA GGCACAGAGGGCAAGTTAAG +NBS 102 >6 chrX:135890429-135890443 AGAACAGCCAGTGCT 1.1 Tsc22d3 898 TCCTCTGGAGAACAGCCAGT AGGCCAGCCTGAGCTATGTA +NBS 103 >6 chr7:28097274-28097288 GGAACAGGAAGTCCA 0.9 Zfp36 9286 GAGGTGGAACAGGAAGTCCA ACCACTGCGAAGTCTCAACC +NBS 104 >6 chr10:128073022-128073036 GGTACAGAGTGAGTT 1.3 Suox 2530 GGACAACCAGGGCTACACAG TGTGTGTGGTGTGTGGAATG +NBS 105 >6 chr11:116380780-116380794 AGAACAAACTGTCCC 1.1 Sphk1 31343 GGAGCAAGCCAGTACAGTCC 
TGAGCAGAGCACCTGACATT +NBS 106 >6 chr11:116379077-116379091 AGCACAGGCTGATCT 0.9 Sphk1 29640 CGGGGTTTAACTCCCAAAAT CGCCTGTTTTTGAATTGGAT +NBS 107 >6 chr14:98211187-98211201 AGGACAGGGTGTTAT 1.1 Klf5 29353 CTTCTCCCACGTTGTCAGGT TTTGGTTGGTTGGTTGGTTT +NBS 108 >6 chr16:76237617-76237631 AGAACAGGAAGTCCT 1.0 Nrip1 18112 GTCCCCACCCATAGACACAC CCAGTCGAATCCCTTTGCTA +NBS 109 >6 chr16:76268786-76268800 GGAACATCCAGTTCT 0.9 Nrip1 13057 AATGGTGTGTGGAAGGGAAG CCACTGCCAGACACATTTCA +NBS 110 >6 chr11:116366257-116366271 GGCACAAAGTGTCTT 1.2 Sphk1 16820 GCTTCACCTCAACTGCCTTC AGCTCAAGGCTCTCGTCAAG +NBS 111 >6 chr1:53028975-53028989 GGCACATGCTGTTCA 1.0 Mstn 22661 AAGCAGCAGGGTCACATTCT GCCCAGAAGGGAAGGTGTAT +NBS 112 >6 chr1:52994676-52994690 AGTACAGTGTGTACC 1.1 Mstn 11638 GCCCGACACTCTGTCTAAGG TGTCAAGAAAAGGCAAAGCA +NBS 113 >6 chr8:12934329-12934343 AGCACAAAGTGTCCC 0.7 Mcf2l 18394 GCCAGTCTCTGAACCACCAT CACGAATGGTCAAAGGGATT +NBS 114 >6 chr2:11405981-11405995 AGCACAGCCTGTTCT 0.6 Pfkfb3 13939 AACTTGGGACCTTTGCCTTT CCATCCAGCTGTTCAGGTTT +NBS 115 >6 chr2:11420060-11420074 GGAACAGAGCGTGCT 1.0 Pfkfb3 140 CCAATGTGGTTCGTTGACAG GGCTTGTAAGGCTGAACTGG +NBS 116 >6 chr3:10180480-10180494 AGGACAGCCTGATCT 0.8 Fabp4 10606 TCAAGGATAGGTGGGGACTG GGCGAATTTTCTAGGCACTG +NBS 117 >6 chr1:155647330-155647344 AGAACAGCATGTTTC 1.3 Glul 14833 ACCATGACAACAGGGCTAGG AGCCTCTCAGTCCCCAAACT +NBS 118 >6 chr1:155689027-155689041 TGTACAGAGAGTTCA 0.8 Glul 26864 CAGGGATGCAGAGTTTGACA CCAACATACAATGGCAAGGA +NBS 119 >6 chr15:97748177-97748191 AGTACAGAACGTTCC 1.9 Vdr 11847 GGGCAAAAACCTTGTCTCAA TTCAATTTGGTGTTGCCAGA

Table 2.7: Chromatin immunoprecipitation of GR in C3H10T1/2 cells at dex-repressed genes. Primers used for qPCR analysis of GR ChIP samples at GBSs found at genes repressed by dex are shown. All GBSs identified at GR-repressed genes are shown. The supplemental table is displayed as indicated in Table 2.6.

Mouse Feb 2006 (mm8)

GBS Identifier  Mismatch (bp)  Chromosome Coordinates  Sequence of GBSs  ChIP (Dex/EtOH)  Gene  Position From TSS  Forward Primer  Reverse Primer
-NBS 1  1  chr2:113566773-113566787  AGCACAGGCTGAGCT  1.4  Grem1  7288  AACCAGCCCTAAGTGCACAG  GCTGACGAGACTCACAGCAC
-NBS 2  1  chr17:33381772-33381786  AGAACAAAATGTTCT  1.9  Fiaf  6500  GGAATACGATGTCCGAGGAA  GCACAGCTCACAGCATTCTT
-NBS 3  1  chr16:85668398-85668412  TGAACACAGTGTTTA  1.4  Adamts1  23897  ATGCCCTCTATGCATCGTTT  GGGAATACCAGAATGGCTTG
-NBS 4  1  chr17:33419255-33419269  AGGACAGCCAGTGCC  0.9  Fiaf  30983  TTTCTCGGTCCTGTTGCTTT  AATTGCCAGGGAAAAACTGA
-NBS 5  2  chr11:53627644-53627658  AGGACAGGCTGAGCT  1.1  Irf1  14143  CAAGAACTGGGAGCAAGAGG  ACTGGACCACAGGTGGGATA
-NBS 6  3  chr16:85683741-85683755  GGAACATGGTGATCA  1.8  Adamts1  8554  CACCAGGATACGACTGTTGG  GATCAGGGTCCCATGAGATG
-NBS 7  4  chr19:34639193-34639207  AGAACAGTGAGTTTA  1.5  Ifit3  10358  AATGCCATTTCACCTGGAAC  TGCACATGGTGGCTTTAAACT
-NBS 8  5  chr1:171602766-171602780  AGGACAGGATGATCA  1.5  Rgs4  18471  GACAAGGAGCCTTGAAGCAC  AGCAACAGAGAAGCAAAGCA
-NBS 9  5  chr19:34639163-34639177  AGGACAGGGTGTTTA  0.5  Ifit3  10388  TTGGATGAGTTTGAGGACAGG  TGCAGTGCTGCCTCATTTAG
-NBS 10  6  chr2:113558275-113558289  GGAACAGCCAGTCCA  1.4  Grem1  1210  AGACCAGAGAGCCCCCTAAG  GACTGGCTGTTCCCTCTCTG
-NBS 11  6  chr15:54112238-54112252  TGAACACTGTGTGCT  0.9  Tnfrsf11b  3671  CCTGAACACTGTGTGCTTCC  TTGTCCGGGTTTCAGCTATC
-NBS 12  6  chr9:21679854-21679868  GGTACACCCTGTTCC  1.9  Rab3d  11331  AGGGTTTTGGAGCCTGAGAT  TCACGTCTGAGCTGGAACAG
-NBS 13  6  chr11:53617327-53617341  AGAACAGGCTGTGTG  1.0  Irf1  3826  CCCTCACACAGCCTGTTCTT  TCACTGCCCACCCTACCTAC
-NBS 14  6  chr19:34619336-34619350  AGCACATTGTGTTCC  1.7  Ifit3  30215  AAACCTACCTGTGCAGGAAAA  CCAGGTCCTCTGGAAGAACA
-NBS 15  >6  chr2:140379805-140379819  AGAACAAAGTGTTTC  1.6  Flrt3  16895  GCCATTCTCTTTCTCCTTTGA  AAGCTCTGCCCAGGTGTG
-NBS 16  >6  chr16:85715383-85715397  GGGACAAAGTGAACT  1.4  Adamts1  23088  CTCAGGGTGCTGACAGATGA  ATAGGAGCGCTGGAAGACAA
-NBS 17  >6  chr12:25212705-25212719  AGTACATGGTGTACC  0.7  Tieg2  27648  GCTCTGTCAGTGCAGATGGA  CTCATTAGCAGCCATGCAGA
-NBS 18  >6  chr9:21680879-21680893  AGCACAAGGTGTCCA  0.9  Rab3d  12356  GCTACTGCCTCCCAAGTGTC  CCTAAGATTTGGTCCCAGCA
-NBS 19  >6  chr12:25259555-25259569  GGAACATCTTGTCCA  0.9  Tieg2  19202  ACTCTGACACCCTGCACGTA  CACGAGGCTCGAAAACTAGG
-NBS 20  >6  chr7:34796141-34796155  AGTACATGCTGTGCT  1.1  Cebpa  31922  CAGGCTGGCTTTGAACTCTC  GTTCGATGCCTAGCACAACA
-NBS 21  >6  chr7:34818835-34818849  AGCACAGCCAGTGCT  1.1  Cebpa  9228  CGAGAGCGGCATGAGTAGAT  GGTTGGTGAGATGGCTCAGT
-NBS 22  >6  chr7:34842851-34842865  GGAACAGGGTGTGTA  1.2  Cebpa  14788  TCACTCACACCCACTCTTCTTCAGT  CAAAGCCCAGCAGCAATCTACA
-NBS 23  >6  chr1:171559249-171559263  AGAACAGCGTGTACA  0.6  Rgs4  25046  TGGCATCACTGGCTACTCAC  ATGGTGCAGAGATGGGAGAG
-NBS 24  >6  chr2:140381328-140381342  GGGACACAATGTGCA  0.9  Flrt3  18418  GTTTGGGACACAATGTGCAG  TTGCTCCTTCCAACAACTGA
-NBS 25  >6  chr11:81837798-81837812  AGAACAGAAAGTGCC  1.4  Ccl2  13974  CCCCACCTCTACCACAGAGA  CCACAGCTTCAAGGCACTTT
-NBS 26  >6  chr11:81855390-81855404  GGGACAGGCTGTCTG  0.7  Ccl2  3618  GGAGCTAAAGGGATCTGCAA  AATGGGCCTCTCTTTCCAGT
-NBS 27  >6  chr11:81855390-81855404  GGGACAGGCTGTCTG  0.9  Ccl7  6519  AAGGAAGTGGGGCACAAATA  GCAAGGATGGAGGAACTCAG
-NBS 28  >6  chr11:81837798-81837812  AGAACAGAAAGTGCC  0.5  Ccl7  24111  CCCCACCTCTACCACAGAGA  CCACAGCTTCAAGGCACTTT
-NBS 29  >6  chr11:53612814-53612828  AGGACAAGGTGTTCC  1.5  Irf1  687  GTGTTCCCCCATCACTTCAG  TTGGAAATACGGGGCAGTAG
-NBS 30  >6  chr5:139981116-139981130  AGAACAGAGAGTGTC  1.1  A930021H16Rik  7542  CCTGCAGTTCTCTTGCATTG  TCAGGACGTGTGTGTGTGTG
-NBS 31  >6  chr5:140009174-140009188  AGCACAGCCTGTGCC  1.1  A930021H16Rik  20516  GTCTCCCTGACCCACACACT  TTGACTGTGACAGGCCCATA
-NBS 32  >6  chrX:49287573-49287587  AGAACAGACTGTGCT  1.2  9430052C07Rik  10330  CTGGCTACGCAAACAGTTCA  CTGTGCTTAAGCCTGAGACG
-NBS 33  >6  chrX:49278384-49278398  AGAACAGGCTGAGCA  0.8  9430052C07Rik  19519  CCTTGTGGGTGAGACCATCT  GACTGGAACTCAGGCAGGTC
-NBS 34  >6  chr19:34248392-34248406  TGCACAGACTGTGTT  0.7  1700095N21Rik  9875  GCATCCTGACACCAATGCTA  CTTCCTGCGAGCTTGAAAAT
-NBS 35  >6  chr9:21669685-21669699  AGAACAGTCAGTGCT  1.0  Rab3d  1162  TGGCTCAGCAGTTAAGAGCA  CTGTCTTCAGGCACACCAGA
-NBS 36  >6  chr9:21696342-21696356  AGGACAGCCTGAGCT  1.1  Rab3d  27819  AAGCTCAGGCTGTCCTTGAA  AGGGCACTGATGGGACATAG
-NBS 37  >6  chr7:34821618-34821632  AGGACAAAATGTCTT  1.7  Cebpa  6445  CAGAGCCAGGGTAGCTTTTG  GCCTTGGAAGGAGGAGAAAG
-NBS 38  >6  chr7:34857457-34857471  AGGACATGCTGTTTG  1.1  Cebpa  29394  GTTGCCCAGGCTCTGTAGTC  TGCTGTTTGTCTCCACCAAG
-NBS 39  >6  chr12:25259977-25259991  AGAACAGTCAGTGCT  1.0  Tieg2  19624  ACACACCTGTAATCCCAGCA  ATGGTTGTGAGCCACCATGT
-NBS 40  >6  chr11:53597350-53597364  AGAACAGTCAGTCCT  1.1  Irf1  16151  AAACCTACCTGTGCAGGAAAA  CCAGGTCCTCTGGAAGAACA
-NBS 41  >6  chr2:113542322-113542336  AGGACAAAGAGTTTT  1.4  Grem1  17163  TGTGTCCCAAAGAGTTTGATGA  TCCTCCCTTCCAGATCAAAA
-NBS 42  >6  chr2:113568635-113568649  AGAACAGTCAGTGCT  1.1  Grem1  9150  AGGACCCAGGTTCAATTTCC  GGTGTTTTGCCTTCATGGAT
-NBS 43  >6  chr2:113551639-113551653  AGAACAGAAAGTGTT  0.9  Grem1  7846  GGGCAATAGAACACCCAGAA  GTTGAAAAGTGGGGTCTGGA
-NBS 44  >6  chr2:113576556-113576570  AGCACAGGAAGTCCT  1.0  Grem1  17071  TTCCAGACCAGTGCTTACCA  TCCATGGAGACGAGGACTTC
-NBS 45  >6  chr17:33369357-33369371  AGAACAGAATGTGAT  0.6  Fiaf  18915  AAGGTCATCTCCTGCCACAT  AGCCACCATGTTGGTACTGA
-NBS 46  >6  chr2:113560456-113560470  GGAACAGTGTGTGTG  1.2  Grem1  971  ACACACGCACATACCCAAAC  GGGCTTCGGGACTCATAACT
-NBS 47  >6  chr17:33415229-33415243  AGGACAGCCAGTGCT  1.0  Fiaf  26957  GGCTGTCCTGGAACTCACTC  TGGTGGTGCACACCTTTAAT
-NBS 48  >6  chr17:33415109-33415123  AGGACAGCCTGAGCT  1.0  Fiaf  26837  TCATAACTCCAAATGGGTTAAAGT  GCCAGCCTGGTCTACAGAGT
-NBS 49  >6  chr17:33379629-33379643  AGAACAGCCAGTGCA  1.0  Fiaf  8643  GGGTTCAAGTCTCAGGACCA  ATCAAACATGGCTCCTCTGC
-NBS 50  >6  chr4:36264491-36264505  GGAACAGATTGTTTT  0.9  Dusp4  11636  TCCACCTCACACCAATCAGA  GCTTAAAATCCCACCAGCAA
-NBS 51  >6  chrX:49311863-49311877  AGGACAGTCAGTTCT  1.6  9430052C07Rik  13960  TGTGGAGCTATTGGCAGTTG  TGCAATTGGGAATAATTGAGG
-NBS 52  >6  chr11:81865889-81865903  AGGACATCCAGTTCT  1.1  Ccl2  14117  ATGTTCCAAAGCACCAAAGC  GATGCAGACAGAAGCAACCA
-NBS 53  >6  chr19:34667299-34667313  AGAACAATGTGTCCT  1.1  Ifit3  17748  CCTCTTCTTCCTCCCTGGAC  GAGTGGGAAATCAAGCATGG
-NBS 54  >6  chr19:34625936-34625950  AGGACAGGCAGTTCT  1.0  Ifit3  23615  GCCTGGAGCAAATAACAGGA  AGAACTGCCTGTCCTTCAGC
-NBS 55  >6  chr19:34618582-34618596  AGCACACAAAGTTCC  1.5  Ifit3  30969  CCTTGATGGAGAGGTGCAAT  GGCTGTCTTTGATTGGTGTG
-NBS 56  >6  chr15:54116271-54116285  AGGACAAATTGTGCT  1.2  Tnfrsf11b  7704  GCTTGATCTGGAGGTGGGTA  GGTAGGAAGGCATGCTGGTA
-NBS 57  >6  chr11:81865889-81865903  AGGACATCCAGTTCT  0.7  Ccl7  3980  ATGTTCCAAAGCACCAAAGC  GATGCAGACAGAAGCAACCA
-NBS 58  >6  chr7:34807445-34807459  TGAACAGTGAGTTCT  1.6  Cebpa  20618  CCCAGTGTGGTTGAACAGTG  CCACCAAATATGGCACCTTC
-NBS 59  >6  chr5:139958553-139958567  TGGACACAGTGTTTC  0.7  A930021H16Rik  30105  CTGTAGGCACAGCCAGGAAG  GCCTGCAGACCAACCTAATC

Table 2.8: Chromatin immunoprecipitation of GR in C3H10T1/2 cells at dex-unresponsive genes. The supplemental table is displayed as indicated in Table 2.6.

Mouse Feb 2006 (mm8)

GBS Identifier  Mismatch (bp)  Chromosome Coordinates  Sequence of GBSs  ChIP (Dex/EtOH)  Gene  Position From TSS  Forward Primer  Reverse Primer
rGBS 1  2  chr10:21095269-21095283  GGAACAGCCTGTGCT  3.6  Aldh8a1  28562  GCTGTGTTCATGCTCCTGTG  CACGCTTTCTCACATGGTGT
rGBS 2  4  chr5:142633018-142633032  TAGACAGCTTGTTCT  18.2  Foxk1  20953  CCCTCTCCCTTGGAGCATAC  CAGCCTGTAACAGCCATGTG
rNBS 1  0  chr9:21139018-21139032  AGAACATGGTGTCCC  1.3  Qtrt1  23287  TGGCAAAGCATTCTTCAGTG  CTGTTGCCTTTCTCCTGCTC
rNBS 2  1  chr5:110506321-110506335  GGAACAGGCTGAACT  0.7  Pole  20369  TACACACCCCAAACAGCAGA  TTGCAAGCCTGGGATTAAAG
rNBS 3  1  chr2:124305457-124305471  AGAACAGGAAGTTTC  1.4  Sema6d  3720  CAACATTGTCTGGCTGTGGT  GGACTTCCAACCCTGAATGA
rNBS 4  1  chrX:69927219-69927233  GGGACAGAGTGGCCC  1.9  Slc6a8  1349  GAGTGATTGGCCACTGAGGT  TGTGTCACTTGTGTGCCTGA
rNBS 5  2  chr7:141290058-141290072  GGAACAGAGTGATCG  1.1  Cd151  30129  AGGTGCCAGGCAGAGTAGAA  GGTACAGGCCTTGGAACAGA
rNBS 6  2  chr9:21147941-21147955  AGTACAAGCTGATCT  0.6  Qtrt1  14364  TTGGGAAGTTTGATCCCATC  CCCTGGCTTCAGCTGATTTA
rNBS 7  2  chr11:72796701-72796715  GGAACAGCGTGTGAT  1.0  P2rx1  18636  CCCACAGACAGACCTGACCT  CCTTTGTTGTCCCCTGTGAT
rNBS 8  3  chr10:21095757-21095771  GGAACAGGCTGTACA  1.2  Aldh8a1  29050  GAAAGCCTGCAGCCATGTA  GCCAGAGCTGTAAGCCAAGT
rNBS 9  3  chr15:80479513-80479527  TGCACATCCTGTGCT  1.1  Grap2  28775  CTTCCTTCAGCCTCCAGATG  GGGGTGATGAGAGGAAGTGA
rNBS 10  3  chr6:29486215-29486229  GGAACAGGATGAGCT  0.6  Irf5  9472  TTGATGCAGAGCTCATCCTG  TTCTCTCGGGGTTTGACATC
rNBS 11  3  chr4:62850884-62850898  GGAACAGCCAGTGCC  1.2  Orm1  19957  GAGGTCCTGGAAAGCTTGTG  TGGCATGGAAACTTCAATCA
rNBS 12  3  chr9:102679640-102679654  GGGACAAAGAGTTTT  0.6  Ryk  13430  GGAAGAACACTGCGAAAAGC  TCCCATGACTTGTTGCTTCA
rNBS 13  4  chr16:16227029-16227043  GGAACAGACTGTGTA  0.8  Yars2  10469  TAATGCACCAGGGCAGTGTA  CAAACAAAACTCTTTTGCAGTTTC
rNBS 14  4  chr13:82264598-82264612  AGGACATGGTGTCTT  0.9  Cetn3  3969  CTGTGCGCTGTGACTTTTGT  TGGACTTCATGTCCCCATCT
rNBS 15  4  chr15:80421027-80421041  AGAACATAGTGTCCC  1.0  Grap2  29711  AAGCAAATGCTTTCCCAAGA  GCATGTGTCCCTTGTCTGTG
rNBS 16  4  chr14:46585946-46585960  GGTACAAACTGTTCA  0.9  Ktn1  290  CTCCACTGAATGTGGCAGAA  TTTTGCCTGGGATCTTTGAT
rNBS 17  4  chr15:80468813-80468827  AGGACAGAGTGACTT  1.2  Grap2  18075  AGCCATTGACAGCACCAATA  CTCAGCTCTGGTCTTTCTCCA
rNBS 18  4  chr15:98910535-98910549  GGGACAGGCTGAGCC  0.5  Troap  7523  CAGAACTAAAGCCCCCAGTG  CTGTCCCCGACAGCTAGG
rNBS 19  4  chr8:108204879-108204893  AGCACAGTGTGTCCG  1.1  Elmo3  10141  TAAGATCACGGAGGCCACTC  CTTGATCTCAGTGGCTGCAA
rNBS 20  4  chr9:56815719-56815733  GGGACAGTGTGATCT  1.2  Ptpn9  22614  CTGATGCTGGCCACTCTACA  ATCACACTGTCCCACAACCA
rNBS 21  4  chr5:142655924-142655938  AGGACAGGAAGTGCT  1.2  Foxk1  1953  AGGAAGTGCTGTGCCTCAGT  CCTGGGAATAATGGGAGCTT
rNBS 22  5  chr3:133047477-133047491  AGGACAAGCTGTTTT  1.9  Ints12  18218  CATCCTGGTTGCACATTCTG  GAAGAAGCTGGGTGCAAACT
rNBS 23  5  chr17:25871186-25871200  GGCACAGGGTGTCCA  1.5  Tmem8  30269  GCACAGGGTGTCCATTTTCT  AGCAATTTCAGCTGGGGTTA
rNBS 24  5  chr8:108169149-108169163  GGAACAGACTGACCC  0.7  Elmo3  25589  CAGATTCCTGAAGGCCAGAA  GGAACAGACTGACCCTCCTG
rNBS 25  5  chr2:124308618-124308632  AGGACAGAATGTCTG  0.8  Sema6d  6881  GGGGGAGGAAAGACAGACAT  GAGCCAGGAGCAGGTACAAG
rNBS 26  5  chr5:110530572-110530586  AGGACAGACTGACCT  0.8  Pole  3882  TTGCTCATCACTTGCCACTC  GACCTTTCTCGCTGCAATGT
rNBS 27  5  chr15:80479027-80479041  GGGACAGACCGTCCT  1.3  Grap2  28289  TGCAGGAAGACTTCTCAGGAA  TTTTGACCAGGACGGTCTGT
rNBS 28  5  chr7:107390091-107390105  GGAACAGGTTGTTTT  0.6  Ppfibp2  3648  CTGGCAGGTTTTGCTTCTGT  TGAGCCATTCTTCCTTGAGC
rNBS 29  5  chr7:107374748-107374762  AGCACAGTATGATCC  1.6  Ppfibp2  11695  GGCTCCTGGTAGCTGAGAGA  GAAACGGAGTGCAGGCTTAG
rNBS 30  5  chr4:149848387-149848401  AGAACAAAGTGTTCC  1.0  Uts2  7490  ACAGCAGGAACACGTGTCAG  TTTTTCTGGACCGCAGTTCT
rNBS 31  5  chr10:21086277-21086291  GGAACAATGTGTGCT  0.8  Aldh8a1  19570  ACAGAAGGTATGGGGCACAG  AACAATGTGTGCTGGGATCA
rNBS 32  6  chr7:107392708-107392722  GGGACACCTTGTTCA  0.7  Ppfibp2  6265  CTCTCCAGGGTAAAGCTGTCA  TGGGACACCTTGTTCAGTCA
rNBS 33  6  chr9:122799662-122799676  GGTACAAAGTGTCTG  1.0  Zfp105  27886  GCGGACTTCTGACGTAGACC  CACCAGAATCCTCCCTTCAA
rNBS 34  6  chr15:9081194-9081208  GGTACAGAGTGTCTA  1.5  Lmbrd2  13742  TGCAGAAAAGTGACCTCATTG  TTAGGAAACTTCAGGGGCACT
rNBS 35  6  chr7:107396001-107396015  AGGACATCATGTTTT  0.9  Ppfibp2  9558  GTGGGCTGATGCTCTAAAGG  ACTCTGGGTCTGGAAGAGCA
rNBS 36  >6  chr16:16198510-16198524  AGGACAGAATGTACA  1.0  Yars2  18050  GAGGGATCTGATTTTCCGAAG  GAGGATGAAGGATGCTGGAA
rNBS 37  >6  chr16:16198652-16198666  AGTACAGACAGTGCC  1.0  Yars2  17908  GAGGGATCTGATTTTCCGAAG  GAGGATGAAGGATGCTGGAA
rNBS 38  >6  chr18:58969138-58969152  AGAACAGAAAGTACA  0.7  Adamts19  7005  CGTAGATGGCTTGGAGAACAG  GCCCATGATGGAATTGTCTC
rNBS 39  >6  chr9:102670839-102670853  AGCACACGCTGTCCC  0.9  Ryk  22231  ATCTGCAGCCATTCAGGTCT  CTGTGGTGGGAGAGACATCC
rNBS 40  >6  chr1:183180894-183180908  AGCACAGCGTGTGCA  1.0  Cnih3  8409  GCATCGAAATAGGATGCACA  TTAGCCTTGCCACTCCAGTT
rNBS 41  >6  chr15:98881929-98881943  AGCACAGGCAGTCCC  1.2  Troap  21083  GACACATGGCTCTCAGCAAA  GGCTGGTGTCTGACCTTTGT
rNBS 42  >6  chr6:42864606-42864620  GGAACATTATGTACT  0.9  Olfr446  7766  GCAGAAGACTCTTGCCCTGT  AAGGATTTCCTTCCCAGGAT
rNBS 43  >6  chr11:72812830-72812844  GGAACAAGAAGTTCT  1.7  P2rx1  2507  CTCTTCCATGCACAAGCAAA  TGGCCCTATAAAGTGGATGC
rNBS 44  >6  chr2:124292480-124292494  AGAACAAGCTGTCCT  1.0  Sema6d  9257  CTGCTTTTCCTGTGTGCAAG  TTTCATGCAATCATGCAGGT
rNBS 45  >6  chr4:82905242-82905256  TGAACAGACTGTGCT  0.8  Snapc3  16262  TCAGTTGGGGAAGCTGATTT  TCTGTTCATTCCCACCATGA
rNBS 46  >6  chr3:133042311-133042325  AGCACAGGCTGTTCT  0.7  Ints12  13052  TCTTGCTTAGGGATGGTGCT  CACAACCTGCAGTCACTTGG
rNBS 47  >6  chr17:25827347-25827361  GGTACAAAATGTACA  1.4  Tmem8  13570  AGATATCTGCCTGCCTCTGC  ACAGTGGACCTGGGATGGTA
rNBS 48  >6  chr10:128897182-128897196  AGCACACAGTGTACT  1.1  Olfr787  31555  CTGTGGCCAGATAGGCATTT  GGGAAGGAATTAAGGGGAGA
rNBS 49  >6  chr18:58950472-58950486  GGAACAGAATGTTCA  0.7  Adamts19  11661  GCCATTCCACAAGCCTTAAA  ATCAGTCGGAATCTGCTCGT

Chapter 3: Initiation and Maintenance of Circadian Rhythm in Mesenchymal Stem Cells by the Glucocorticoid Receptor

Introduction

Circadian rhythms are evolutionarily conserved processes that are maintained from bacteria to vertebrates. In mammalian systems, light is a major signal that entrains the master clock in the suprachiasmatic nuclei (SCN) of the hypothalamus to form rhythmic cycles of approximately 24 hours in length. In turn, the SCN sends endocrine cues to other tissues and regulates the local clocks within peripheral tissues. The clocks in SCN and peripheral tissues likely form interplaying feedback loops that govern behavior and physiological processes.

Genetic knockout studies have been instrumental in characterizing the components of the circadian clock. These studies revealed that disruption of individual clock components in mice does not generally produce dramatic alterations in circadian properties under 12 hr light/12 hr dark (LD12:12) cycles, as measured by locomotion, exemplifying the robustness of the circadian clock and implying that the factors have partially redundant functions under conditions of normal, persistent light/dark entrainment. However, under constant darkness (DD), Clock [80], Bmal1 [81], Npas2 [82,83], Per1 [84,85], Per2 [84,86], Cry1 [87], and Cry2 [87,88] mutant mice exhibit altered period length in locomotor rhythmicity, emphasizing their requirement in governing intrinsic circadian properties.

Although genetic disruptions followed by behavioral studies have provided extensive knowledge of clock functions, they do not reflect cell-autonomous properties. Recently, Liu et al. showed that Per1 and Cry1 were dispensable for sustaining Per2 rhythms in SCN explants but not in individually dissociated SCN neurons, demonstrating that clock components have unappreciated cell-autonomous functions [89]. They also showed that while Per1 and Cry1 were not individually required for sustaining Per2 rhythms in SCN explants, they were indispensable in lung explants, indicating that individual components may serve tissue-specific circadian functions [89].

In mammalian systems, the circadian clock regulates metabolic pathways [16-18] in a pattern that overlaps metabolic regulation by glucocorticoids, leading to speculation that GR might link circadian rhythm and metabolism. Indeed, adrenal secretion of glucocorticoids is under strict circadian regulation that becomes impaired when specific clock components (Per1) are disrupted [90,91]. In turn, glucocorticoids can alter the circadian oscillators in peripheral tissues [92]. In cultured rat fibroblasts, glucocorticoids can reset the circadian clock [15]. However, the mechanism by which glucocorticoids initiate circadian rhythm in peripheral tissues is unclear.

The regulation of daily glucocorticoid levels is maintained across evolutionarily distant species, from fish [19] to humans, suggesting that this timed process serves important biological functions. Glucocorticoids govern a wide range of vital physiological processes, including adipogenic differentiation of mesenchymal stem cells (MSCs) [93]. Interestingly, the circadian clock genes Rev-Erbα [94,95] and Bmal1 [96] have been implicated in mediating adipogenesis in cultured cells. Mice with disrupted clock components (Per1 and Clock) have altered body mass compared to wild-type mice [90,97]. Conceivably, GR-regulated circadian rhythmicity in stem cells may be important for coordinating cell differentiation events, such as adipogenesis in MSCs. Thus, it would be interesting to examine whether glucocorticoids regulate circadian rhythm in stem cells.

In this study, we examined the molecular mechanism by which GR regulates circadian rhythm in peripheral cells. Specifically, we investigated whether GR induces rhythmicity in MSCs, examined the mechanism by which GR initiates rhythmicity, and assessed whether continual GR activity is required to maintain periodicity in these cells.

Results

Glucocorticoids Induce Circadian Rhythm in Mouse Primary Mesenchymal Stem Cells

In our previous study (Chapter 2), we characterized glucocorticoid-responsive genes in C3H10T1/2 mesenchymal stem-like cells using microarrays and found that a subset of clock genes was regulated by GR in these cells. To examine whether glucocorticoids regulate the expression of clock genes in primary dissociated mouse stem cells, we isolated mesenchymal stem cells (MSCs) from the bone marrow, stimulated them with dexamethasone (dex) for 12 hr, and monitored the transcript levels of these genes. We found that dex up-regulated expression of both positive (Npas2, Bmal1) and negative (Cry1, Cry2) regulators of circadian rhythm at the assessed time point (Figure 3.1). In contrast, dex down-regulated expression of other clock components, including Per3, Rev-Erbα, Rev-Erbβ, Dbp, and Timeless. The transcript levels of GR, Csnk1d, Csnk1e, and Csnk1g did not change in response to dex, demonstrating that the glucocorticoid responsiveness was gene-specific. Thus, in primary MSCs, glucocorticoids activate and repress transcription of various genes important for controlling circadian rhythm, suggesting that GR plays a functional role in coordinating the circadian clock in MSCs.

Figure 3.1: Glucocorticoids stimulate transcriptional responses of clock genes. Equal amounts of RNA from mouse primary MSCs, treated with ethanol (EtOH) or 1 µM dex for 12 hours, were reverse transcribed and measured by quantitative PCR. The relative transcript levels were normalized to the EtOH-treated samples. The data represent the average of three independent experiments plotted with standard error of the mean.
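The normalization and error estimate described in this legend can be sketched as follows. This is an illustration only: the function name and the input values are hypothetical, and the sketch assumes that each dex value is divided by the EtOH value from the same experiment before averaging, which the legend does not state explicitly.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change_summary(treated, control):
    """Return (mean, SEM) of per-experiment treated/control ratios.

    Assumption: values are paired by experiment, so treated[i] and
    control[i] come from the same independent experiment.
    """
    if len(treated) != len(control) or len(treated) < 2:
        raise ValueError("need paired values from at least two experiments")
    ratios = [t / c for t, c in zip(treated, control)]
    # Standard error of the mean = sample standard deviation / sqrt(n)
    sem = stdev(ratios) / sqrt(len(ratios))
    return mean(ratios), sem


# Hypothetical relative transcript quantities from three experiments
dex = [4.0, 5.0, 6.0]
etoh = [2.0, 2.0, 2.0]
avg, sem = fold_change_summary(dex, etoh)
print(f"dex/EtOH = {avg:.2f} +/- {sem:.2f} (SEM, n={len(dex)})")
```

Averaging the per-experiment ratios, rather than dividing the two averages, keeps each experiment internally controlled; either convention could have been used for the figure.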

To assess whether glucocorticoids can induce circadian rhythm in primary MSCs, we treated these cells with dex and monitored the transcript levels of clock genes at 4 hr intervals for 48 hr. Interestingly, we found that dex stimulated transcriptional oscillation of a subset of clock components (Per2, Per3, Cry1, Cry2, Rev-Erbα, Rev-Erbβ, Dbp, Npas2, Bmal1, and Per1) (Figure 3.2). Surprisingly, the dex-induced transcriptional oscillation was gene-specific; some clock genes, such as E4bp4, Dec1, and Timeless, exhibited dex responsiveness without oscillation, whereas other clock genes did not respond to dex (Csnk1d, Csnk1e, Clock, Bmal2, and GR) (Figure 3.3). Potentially, dex could provoke oscillation of these gene products at the protein or activity level. It is interesting that GR activates some clock genes and represses others; it is possible that glucocorticoids function to orchestrate the stoichiometric levels of these genes to stimulate rhythmicity in MSCs.

Figure 3.2: GR induces circadian rhythm in mouse primary mesenchymal stem cells. Total RNA was isolated from cultured mouse primary MSCs, equal amounts of RNA were reverse transcribed, and the relative amount of each corresponding transcript was measured by quantitative PCR. The transcript levels at the different time points were normalized to the levels at time 0. All graphs represent the average of three independent experiments (plotted with standard error of the mean) using MSCs isolated from different mice. MSCs were amplified in culture for 2-4 weeks before hormone treatment. DMSO- and 1 µM dex-treated samples are graphed in black and blue, respectively. All graphs plot the relative transcript levels (log2 scale) against time (hr).
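The time-0 normalization and log2 scaling used for these graphs can be illustrated with a short sketch. The function name and the example series are hypothetical and are not data from this study; the sketch simply divides each time point by the time-0 level and takes the base-2 logarithm.

```python
import math


def log2_relative_to_time_zero(levels):
    """Map {time_hr: level} to {time_hr: log2(level / level_at_time_0)}."""
    baseline = levels[0]
    if baseline <= 0:
        raise ValueError("time-0 level must be positive")
    return {t: math.log2(v / baseline) for t, v in sorted(levels.items())}


# Hypothetical transcript levels sampled every 4 hr (arbitrary units)
series = {0: 1.0, 4: 4.0, 8: 2.0, 12: 0.5}
result = log2_relative_to_time_zero(series)
for hour, value in result.items():
    print(f"{hour:>2} hr: {value:+.1f}")
```

On this scale a value of +1 is a two-fold increase over time 0 and a value of -1 is a two-fold decrease, so induction and repression of equal magnitude appear symmetric around zero.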

Figure 3.3: GR-induced transcriptional oscillation of the circadian clock components is gene-specific. Samples were obtained and measured, and the data are presented, as described in Figure 3.2.

We observed three distinct oscillatory phases stimulated by dex (Figure 3.2, Figure 3.4), and we found that genes displaying the same phase have similar functions, i.e., they act in either the positive or the negative feedback loop of the circadian rhythm. For instance, the heterodimeric positive regulators Npas2 and Bmal1 have the same dex-induced phasing. The negative regulators Cry1 and Cry2 also displayed overlapping phases (Figure 3.2, Figure 3.4), which were distinct from that displayed by the positive regulators. Although whether the period genes correspond to the positive or negative limb of the feedback loop has not been completely defined [84,86], the phases of all the period genes, Per3, Per2, and Per1, were synchronized with those of Cry1 and Cry2, suggesting that they function as negative regulators in glucocorticoid-mediated rhythmicity (Figure 3.2, Figure 3.4). Furthermore, Rev-Erbα, Rev-Erbβ, and Dbp together formed another distinct phase, suggesting that they possess similar functions. Likewise, the genes that did not exhibit glucocorticoid-regulated transcriptional oscillation also responded in a fashion parallel to genes of similar function. For example, all three casein kinases (Csnk1d, Csnk1e, Csnk1g) were not responsive to dex, whereas both Dec1 and Dec2 were repressed by dex at many of the time points examined throughout the 48 hr time course (Figure 3.3). The clustering of the transcriptional phasing and the synchronous responsiveness of similarly functioning genes are likely important for coordinating glucocorticoid-stimulated rhythmicity in MSCs.

Figure 3.4: Glucocorticoids induce three distinct oscillating phases. The relative transcript levels from Figure 3.2 are shown on a linear scale.
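One simple way to group transcripts by oscillatory phase is to compare the time at which each profile peaks. The sketch below is only an illustration of that idea: how phases were assigned in this study is not specified here, peak time is a crude proxy for phase, and the gene labels and values are placeholders rather than measured data.

```python
from collections import defaultdict


def peak_time(series):
    """Return the sampling time (hr) at which the transcript level is highest."""
    return max(series, key=series.get)


def group_by_peak(profiles):
    """Group gene labels by the time of their maximum transcript level."""
    groups = defaultdict(list)
    for gene, series in profiles.items():
        groups[peak_time(series)].append(gene)
    return {t: sorted(genes) for t, genes in sorted(groups.items())}


# Placeholder profiles: {gene label: {time_hr: relative transcript level}}
profiles = {
    "geneA": {0: 1.0, 4: 3.0, 8: 1.5, 12: 0.8},
    "geneB": {0: 1.0, 4: 2.5, 8: 1.2, 12: 0.9},
    "geneC": {0: 1.0, 4: 0.7, 8: 1.1, 12: 2.2},
}
groups = group_by_peak(profiles)
print(groups)
```

With 4 hr sampling, peak times can only be resolved to the nearest sampled point, so a fitted periodic curve would be needed to estimate phase more precisely.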

GR Directly Regulates Components of the Clock to Induce Circadian Rhythm

To determine how GR initiates circadian rhythmicity in MSCs, we examined which clock genes are early glucocorticoid-responsive targets in mouse MSCs. We found that dex altered the transcript levels of Per1, Per2, E4bp4, and Timeless after 4 hours of treatment, whereas those of the other clock genes remained constant (Figure 3.5A). This suggests that GR may initiate rhythmicity in part by inducing Per1, Per2, and E4bp4 expression while repressing Timeless. To assess whether the early target genes induced by dex are direct primary GR targets, we first scanned computationally for GBS motifs within the 64kb surrounding the transcriptional start site (TSS) of each gene (Chapter 2) and then assessed GR occupancy at these sites using chromatin immunoprecipitation. We found eight GBS motifs at the Per2 gene (Figure 3.5B) and detected GR occupancy at an intronic region, which contains two overlapping GBSs, located 22.8kb downstream of the TSS (Figure 3.5C). At the E4bp4 gene, we identified seven GBS motifs (Figure 3.5B) and detected GR occupancy at a GBS located 5kb upstream of the TSS (Figure 3.5D). At the Per1 locus, we identified seven GBS motifs (Figure 3.5B) and observed GR binding at two distinct genomic regions located +0.5kb and -1.9kb from the RefSeq TSS (Figure 3.5E). Indeed, the expression of Per1 can be driven by two alternative TSSs, and both are dex-responsive in mouse MSCs (data not shown); it is possible that the two GR-occupied GBSs direct TSS-specific Per1 expression or collaborate synergistically to drive transcription at both TSSs. Collectively, our data suggest that GR directly activates the Per1, Per2, and E4bp4 genes to initiate circadian rhythmicity in MSCs.
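The GR-occupied regions quoted above can be matched to individual motifs listed in Figure 3.5B. The sketch below does this lookup with the distances from that table; the 100bp tolerance (to absorb rounding of the quoted kb positions) and the names `GBS_POSITIONS`, `OCCUPIED_KB`, and `motifs_near` are assumptions introduced for illustration.

```python
# Distances (bp, relative to the TSS) copied from Figure 3.5B.
GBS_POSITIONS = {
    "Per1": {"GBS.1": -26728, "GBS.2": -5776, "GBS.3": -1872, "GBS.4": 468,
             "GBS.5": 17159, "GBS.6": 27189, "GBS.7": 29102},
    "Per2": {"GBS.1": -22750, "GBS.2": -22239, "GBS.3": -19942, "GBS.4": -7977,
             "GBS.5": 18344, "GBS.6": 22813, "GBS.7": 22822, "GBS.8": 31606},
    "E4bp4": {"GBS.1": -30763, "GBS.2": -29644, "GBS.3": -25290,
              "GBS.4": -22371, "GBS.5": -5045, "GBS.6": 4946, "GBS.7": 18347},
}

# GR-occupied positions as quoted in the text (kb relative to the TSS).
OCCUPIED_KB = {"Per2": [22.8], "E4bp4": [-5.0], "Per1": [0.5, -1.9]}

def motifs_near(gene, position_kb, window_bp=100):
    """Return the names of GBS motifs within window_bp of a quoted position."""
    target_bp = position_kb * 1000
    return sorted(name for name, pos in GBS_POSITIONS[gene].items()
                  if abs(pos - target_bp) <= window_bp)

matches = {(gene, kb): motifs_near(gene, kb)
           for gene, sites in OCCUPIED_KB.items() for kb in sites}
```

Under these assumptions the Per2 site resolves to GBS.6 and GBS.7, whose start positions are 9bp apart; because each motif is 15bp long, this agrees with the description of two overlapping GBSs.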

B

Gene     RefSeq       GBS            Sequence of GBS     Distance to TSS (bp)
Per1     NM_011065    Per1 GBS.1     TGTACAGAGTGTACA     -26728
         BC039768     Per1 GBS.2     GGGACAGGCTGATCT     -5776
         AK081813     Per1 GBS.3     GGAACATCGTGTTCT     -1872
                      Per1 GBS.4     AGAACAGGATGTTCC     468
                      Per1 GBS.5     AGCACATAGTGTTTA     17159
                      Per1 GBS.6     GGGACAGGCTGTGTC     27189
                      Per1 GBS.7     GGAACAAAGAGTTCT     29102

Per2     NM_011066    Per2 GBS.1     GGTACAGAGTGAGCA     -22750
                      Per2 GBS.2     GGCACAGTCTGTCTT     -22239
                      Per2 GBS.3     AGAACAGCAAGTGCT     -19942
                      Per2 GBS.4     AGAACAGACTGGCCT     -7977
                      Per2 GBS.5     AGTACAGAGTGAGTC     18344
                      Per2 GBS.6     TGTACAGAATGTTCC     22813
                      Per2 GBS.7     TGGACAGAGTGTACA     22822
                      Per2 GBS.8     AGAACAAAGTGTTTA     31606

E4bp4    NM_017373    E4bp4 GBS.1    AGAACAGAGTGACTT     -30763
                      E4bp4 GBS.2    GGAACAGTGTGTCAT     -29644
                      E4bp4 GBS.3    GGGACATGCTGATCT     -25290
                      E4bp4 GBS.4    TGCACAGAGTGTTCT     -22371
                      E4bp4 GBS.5    AGTACAGAATGTTCT     -5045
                      E4bp4 GBS.6    AGAACACGGTGTCTT     4946
                      E4bp4 GBS.7    AGTACAGGATGATCA     18347

Figure 3.5: GR directly regulates transcription of circadian clock genes. (A) Identification of early GR targets. Mouse primary MSCs were treated with DMSO or 1µM dex for 4 hours, and equal amounts of RNA were reverse transcribed and measured by quantitative PCR. The relative transcript level was normalized to the DMSO-treated samples. The data represent the average of three independent experiments plotted with the standard error of the mean. (B) Identification of putative GR binding sequences (GBSs). DNA sequences 32kb upstream and downstream from the transcriptional start sites (TSS) of the corresponding genes were scanned for putative GBSs as previously described (Chapter 2). The RefSeq, arbitrary GBS nomenclature, the respective GBS sequence, and the position of the GBSs relative to the TSS of the genes are shown. GBSs highlighted in red are sites occupied by GR in primary MSCs (Figure 3.5C, D, E). (C), (D), (E) GR-occupied GBSs at the Per1, Per2, and E4bp4 genes. MSCs were treated with DMSO or 1µM dex for 4 hours, chromatin was immunoprecipitated with the anti-GR antibody N499, and the immunoprecipitated samples were measured by quantitative PCR. The samples were normalized to amplification of a genomic region near the Hsp70 gene, which is not occupied by GR.

Direct GR Regulation of Circadian Rhythm Genes is Evolutionarily Conserved

To examine whether GR regulation of circadian genes is evolutionarily conserved, we first examined whether the sequences of the GR-occupied GBSs at the Per1, Per2, and E4bp4 genes were maintained across four different mammalian genomes (rat, human, chimp, and dog). We found high sequence similarity of these GBSs among the genomes (Figure 3.6A), suggesting that GR occupancy and function at these sites may also be conserved. Therefore, we examined GR occupancy of these sites in human primary MSCs using chromatin immunoprecipitation; indeed, we detected GR binding at the GBSs of PER1, PER2, and E4BP4 but not at a negative control region (Figure 3.6B). Importantly, the locations of these sites relative to the transcription start sites of their corresponding target genes are maintained between the mouse and human genomes (data not shown). For instance, GR occupies GBSs 22.8kb and 24.6kb downstream from the transcription start site of Per2 in mouse and human MSCs, respectively (data not shown). As in mouse MSCs, we also observed transcriptional responsiveness of PER1, PER2, DEC1, and PER3 upon dex treatment in human MSCs (Figure 3.6C). Thus, direct regulation of these genes by GR is maintained between mouse and human, suggesting that the mechanism by which GR initiates circadian rhythm in MSCs is evolutionarily conserved and likely plays an important functional role in MSC biology.
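The sequence similarity shown in Figure 3.6A can be expressed as a per-species fraction of identical bases. The sketch below uses the sequences from that panel; treating the first sequence of each block as the mouse reference is an assumption consistent with the main text, and the names `ALIGNMENTS`, `identity`, and `identity_table` are introduced here for illustration.

```python
# Sequences copied from Figure 3.6A; the first entry of each block is taken
# to be the mouse reference (an assumption consistent with the main text).
ALIGNMENTS = {
    "Per1 GBS.I": {"mouse": "AGAACACGATGTTCC", "rat": "AGAACACGATGTTCC",
                   "human": "AGAACATGATGTTCC", "chimp": "AGAACATGATGTTCC",
                   "dog": "AGAACACGATGTTCC"},
    "Per1 GBS.II": {"mouse": "GGAACATCCTGTTCT", "rat": "GGAACATCCTGTTCC",
                    "human": "AGAACATCCCGTTCC", "chimp": "AGAACATCCCGTTCC",
                    "dog": "GGAACATCCCGTTCC"},
    "Per2 GBS.I": {"mouse": "GGAACATTCTGTACA", "rat": "GGAACATCCTGTACA",
                   "human": "GGAACATTCTGTATA", "chimp": "GCAACATTCTGTATA",
                   "dog": "GGAACATTCTGTACC"},
    "E4bp4 GBS.I": {"mouse": "AGTACAGAATGTTCT", "rat": "AGTACAGAGTGTTCT",
                    "human": "GGTACAGAATGCACC", "chimp": "GGTACAGAATGCACT",
                    "dog": "GGTAGAAAATTCTGG"},
}

def identity(reference, other):
    """Fraction of aligned positions at which two equal-length sequences match."""
    if len(reference) != len(other):
        raise ValueError("sequences must be the same length")
    return sum(a == b for a, b in zip(reference, other)) / len(reference)

identity_table = {
    gbs: {species: identity(seqs["mouse"], seq)
          for species, seq in seqs.items() if species != "mouse"}
    for gbs, seqs in ALIGNMENTS.items()
}
```

By this measure Per1 GBS.I is identical in rat and dog and differs from the mouse sequence at one of 15 positions in human and chimp.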

A

Per1 GBS.I     AGAACACGATGTTCC
rat            AGAACACGATGTTCC
human          AGAACATGATGTTCC
chimp          AGAACATGATGTTCC
dog            AGAACACGATGTTCC

Per1 GBS.II    GGAACATCCTGTTCT
rat            GGAACATCCTGTTCC
human          AGAACATCCCGTTCC
chimp          AGAACATCCCGTTCC
dog            GGAACATCCCGTTCC

Per2 GBS.I     GGAACATTCTGTACA
rat            GGAACATCCTGTACA
human          GGAACATTCTGTATA
chimp          GCAACATTCTGTATA
dog            GGAACATTCTGTACC

E4bp4 GBS.I    AGTACAGAATGTTCT
rat            AGTACAGAGTGTTCT
human          GGTACAGAATGCACC
chimp          GGTACAGAATGCACT
dog            GGTAGAAAATTCTGG

Figure 3.6: Regulation of clock genes is conserved in human primary MSCs. (A) GR-occupied GBSs at PER1, PER2, and E4BP4 are conserved in sequence. Aligned rat, human, chimp, and dog sequences of the GBSs were obtained from the UCSC Genome Browser. Conserved bases of the GBSs are highlighted in red. (B) GR occupancy of GBSs at PER1, PER2, and E4BP4 is conserved in human MSCs. Chromatin immunoprecipitation experiments in primary human MSCs were performed and analyzed as indicated in the Figure 3.5C, D, and E legends. Human MSCs were treated with DMSO or 1µM dex for 1.5hrs. (C) Transcriptional regulation of clock genes by GR is conserved in human cells. Human primary MSCs were treated with DMSO or 1µM dex for 6 or 36 hours, and the transcript levels were measured as indicated in the Figure 3.1 legend. The transcript levels were normalized to the amplification of RPL19, and the graph is plotted as the relative fold change stimulated by dex.

Continual GR Activity is Essential to Maintain Glucocorticoid-Induced Circadian Rhythm

In this study, we demonstrated that glucocorticoids initiate circadian rhythm in MSCs, likely through direct regulation of clock components. We next investigated whether continual GR activity is required to maintain the dex-induced oscillation. For this analysis, we took advantage of the GR antagonist RU486, which blocks GR activity. We first demonstrated that addition of RU486 at the same time as dex (time=0hr) abolished the induction of circadian rhythm in MSCs (Figure 3.7), indicating that RU486 can efficiently antagonize the dex-mediated initiation of this process.

Figure 3.7: RU486 can block the initiation of transcriptional oscillation stimulated by dex. Mouse primary MSCs were treated with 100nM dex alone (highlighted in blue) or with 100nM dex and 1µM RU486 at the same time (highlighted in gray). Cells were harvested every 4 hours for 48 hours, total RNA was reverse transcribed, and the relative amount of each corresponding transcript was measured by quantitative PCR. All the graphs represent the average of two independent experiments with MSCs isolated from different mice. All the graphs are plotted as time (hr) against the relative transcript levels in Log2 scale.

Next, we wished to examine whether the maintenance of the GR-initiated circadian oscillation requires continuous GR function. Therefore, we tested whether RU486 could inhibit the circadian rhythm when the antagonist was added after the completion of one dex-induced cycle. We treated MSCs with dex and monitored the transcript levels of clock genes for 48 hrs at 4hr intervals, with RU486 present during the 24-48hr interval (Figure 3.8). Strikingly, we found that addition of RU486 immediately altered the cycling and inhibited the next rhythmic event, indicating that the clock components cannot function independently of GR to maintain glucocorticoid-initiated circadian rhythm. Thus, we conclude that continual GR activity is essential to initiate as well as to maintain the glucocorticoid-induced circadian rhythm.

Figure 3.8: GR activity is required to maintain glucocorticoid-stimulated circadian rhythm. Cultured mouse primary MSCs were treated with 100nM dex as indicated in the Figure 3.2 legend. 1µM RU486 was added to the respective samples (for the 24 to 48 hour interval) after 22 hours of dex treatment. The blue plot corresponds to samples treated with dex alone, and the red plot represents samples with RU486 added. The samples were processed, quantified, and plotted as indicated in the Figure 3.2 legend.

Discussion

In this study, we investigated the molecular mechanisms by which the glucocorticoid receptor governs circadian rhythm in peripheral cells. We first found that the circadian clock machinery is functional in mesenchymal stem cells and can be coordinated by glucocorticoid signaling. Potentially, each single cell within the population of MSCs has an individual intrinsic cycle that can be synchronized by GR activity. In this model, intercellular communication would be required for GR-stimulated circadian rhythm in the MSC population. On the other hand, it is possible that the clock is not operational within the individual cells before glucocorticoid signaling and is initiated and maintained by continuous GR function. In this model, intracellular signaling may be sufficient for glucocorticoid-stimulated periodicity. In the future, it would be interesting to distinguish between these models by performing single-cell experiments.

The stimulation of circadian rhythm is achieved by GR through coordination of gene-specific transcriptional activation as well as repression. For example, Per2 and Cry1 are generally upregulated whereas Per3 and Dbp are downregulated in response to dex. We speculate that the gene-specific mode of response functions to coordinate the relative abundance of the clock components to achieve circadian rhythmicity. We found three distinct phases, with genes of overlapping functions grouped in similar phases. The positive regulators (Npas2 and Bmal1), which activate transcription of the clock genes, are at phases distinct from those of the negative regulators (Cry1 and Cry2), which inhibit the activity of Npas2 and Bmal1 to form a transcriptional feedback loop. We observed that the positive and negative regulators are not directly in anti-phase with each other; rather, Rev-Erbα and Rev-Erbβ are in anti-phase with the positive regulators. It has been shown that Rev-Erbα directly regulates transcription of Bmal1, consistent with the hypothesis that Rev-Erbα links the positive and negative limbs of the circadian clock [95]. Collectively, this suggests that at least three unique and distinguishable transcriptional loops are required to complete the rhythmic periodicity. In this study, we found that GR directly regulates the transcriptional loop that consists of Per1 and Per2 for coordinating circadian rhythm. It will be insightful to examine whether GR also directly regulates the other two transcriptional loops.

We assessed how GR initiates rhythmicity and found that the receptor directly regulates the transcription of Per1, Per2, and E4bp4, suggesting that GR drives the expression of these three genes to initiate the circadian clock. It will be interesting to examine whether these genes are general direct targets in response to other non-steroid signals that stimulate rhythmicity. E4bp4 regulates the molecular clock by directly binding to a response element at the promoter of Per2 to repress transcription [98]. The GR binding site at Per2, located in an intronic region 22.8kb downstream from the transcription start site (Figure 3.5B), is far from the E4bp4 response element of Per2. In contrast, the GR and Bmal1 [96] binding sites at the Per1 gene are proximal to each other (separated by only 500bp); these binding sites may interact to form a composite response element. Potentially, the E4bp4 and GR binding sites at Per2, although distal in linear space, may also associate in three-dimensional space to form a transcriptional response unit that functions to coordinate Per2 oscillation.

Per1 and Per2 were proposed to function in the negative limb of the autoregulatory loop that controls the clock, based on studies of their homolog in Drosophila; however, it was demonstrated that Per2 mutation in mice reduced the expression of clock genes, including Per1, Cry1, and Bmal1, in the SCN, suggesting that it functions as a positive regulator of the transcriptional feedback loop [84,86], at least in the central oscillator. Per1 disruption in mice did not seem to alter the transcript levels of clock genes but rather the protein levels in the SCN, suggesting that Per1 might function to regulate rhythmicity through translation modifications [84,85]. It would be interesting to test whether Per1 and Per2 function similarly or differentially in peripheral cells, such as MSCs.

In addition to investigating the mechanism by which GR initiates rhythmicity, we extended our analysis to assess whether continual GR activity is required to maintain periodicity in MSCs. Using RU486 to inhibit GR activity in a time-specific manner, we found that the antagonist can block rhythmicity after the clock has progressed. This indicates that the clock does not become autonomous after glucocorticoid-induced circadian initiation; rather, continual GR activity is required to preserve the fidelity of the autoregulatory loops for maintaining oscillation. Thus, we conclude that in addition to initiating rhythmicity, persistent GR activity is required to maintain periodicity. Our observation is consistent with the daily oscillation of glucocorticoid levels, which is itself controlled by the master clock within the SCN [90,91]. It appears that the SCN master clock stimulates a daily glucocorticoid fluctuation to ensure a daily systemic concentration of glucocorticoids for initiating and maintaining rhythmicity in peripheral tissues. It would be interesting to examine whether the glucocorticoid-stimulated peripheral clocks feed signals back to the SCN to maintain the timing of the master clock.

We observed that GR occupancy of GBSs at Per1, Per2, and E4bp4 was conserved between mouse and human MSCs (Figure 3.6B). Indeed, the glucocorticoid responsiveness of these genes, as well as of other components of the clock, was also evolutionarily conserved (Figure 3.6C). The maintenance of GR regulation of clock genes in MSCs across different species implies a functional significance. Consistent with this notion, knockdown experiments demonstrated that the clock gene Bmal1 is essential for adipogenic differentiation in mesenchymal stem-like cultured cells [94-96]. In addition to the adipocyte lineage, MSCs can also differentiate into osteoblasts for bone formation; interestingly, it has been shown that disruption of Per1 results in increased bone mass upon leptin infusion [99], suggesting a role for the circadian clock in regulating cell fate decisions in MSCs. It is known that long-term treatment with glucocorticoids induces obesity and osteoporosis; thus, it is tempting to speculate that GR-stimulated circadian rhythm in MSCs may function to induce adipogenesis but inhibit osteoblast differentiation.

Materials and Methods

Isolation of MSCs, Cell Culture, RNA Isolation, Reverse Transcription, Quantitative PCR

Mice were anesthetized and the bone marrow from the femur and tibia was isolated using Iscove's media containing 3% FBS. The extracted mesenchymal stem cells were plated and grown in MesenCult basal media with mesenchymal stem cell stimulatory supplements (StemCell Technologies), pyruvate, penicillin, and streptomycin in a 5% carbon dioxide atmosphere. Human MSCs were purchased from Lonza (PT-2501) and seeded as indicated by the manufacturer or by the human MesenCult proliferation kit (StemCell Technologies). MSCs were plated in 12-well format (Greiner) and allowed to grow to confluency for experiments involving RNA isolation. RNA was extracted using the RNeasy kit (Qiagen), 500ng to 1µg of RNA was used for reverse transcription, the reverse transcribed samples were diluted 10-fold with water, and 3µL of cDNA sample was used for quantitative PCR measurements as previously described [70].
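The figure legends describe transcript levels normalized to a reference transcript (RPL19 in the human experiments) and expressed relative to DMSO-treated samples, but the arithmetic is not spelled out here. A common way to compute such a fold change is the 2^-ΔΔCt calculation sketched below. It assumes roughly 100% amplification efficiency and may differ from the procedure in [70]; the function name and the Ct values are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative transcript level (treated vs. control) by the 2^-ddCt rule.

    Each target Ct is first normalized to the reference transcript measured
    in the same sample; the treated sample is then compared to the control.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)

# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the dex-treated sample, while the reference transcript is unchanged.
example = fold_change(ct_target_treated=22.0, ct_ref_treated=18.0,
                      ct_target_control=24.0, ct_ref_control=18.0)
```

With these illustrative numbers the target is fourfold more abundant in the treated sample, since a two-cycle difference corresponds to 2^2.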

Identification of Putative GR Binding Sequences Through Computational Analysis

GR binding sequences were computationally identified as previously described (Chapter 2). Briefly, a GBS positional weight matrix was generated with GBSs extracted from GR-occupied GREs isolated in ChIP-chip experiments [70]. BioPerl and the TFBS module [78] were used to search for GBS motifs within the 64kb DNA regions surrounding the transcription start sites. The 15bp sequences that fell within the top ninetieth percentile score of the GBSs used to generate the positional weight matrix were considered GBSs. The UCSC genome browser was used to obtain the aligned human, chimp, rat, and dog sequences of the GBSs.
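The general shape of a positional weight matrix scan can be sketched as follows. The original analysis used BioPerl and the TFBS module, so this Python sketch is not that pipeline. Several choices are assumptions: log-odds scoring against a uniform background, a pseudocount of 0.5, scanning both strands, and reading the percentile rule as a cutoff at the 10th percentile of the training-site scores. The training set here is a toy one (the eight Per2 motifs listed in Figure 3.5B), whereas the real matrix was built from ChIP-chip GREs [70].

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Column-wise log-odds matrix from equal-length binding sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({b: math.log2((column.count(b) + pseudocount)
                                 / total / background) for b in BASES})
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds for one window of motif width."""
    return sum(col[base] for col, base in zip(pwm, window))

def percentile_cutoff(scores, fraction):
    """Score found at the given fraction of the ascending-sorted scores."""
    ordered = sorted(scores)
    return ordered[min(len(ordered) - 1, int(fraction * len(ordered)))]

def scan(pwm, sequence, cutoff):
    """Return (offset, strand, motif) for windows scoring at or above cutoff."""
    width, hits = len(pwm), []
    sequence = sequence.upper()
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if set(window) - set(BASES):
            continue  # skip windows containing N or other ambiguity codes
        revcomp = window.translate(COMPLEMENT)[::-1]
        for strand, candidate in (("+", window), ("-", revcomp)):
            if score(pwm, candidate) >= cutoff:
                hits.append((start, strand, candidate))
    return hits

# Toy training set: the Per2 motifs from Figure 3.5B (illustration only).
training = ["GGTACAGAGTGAGCA", "GGCACAGTCTGTCTT", "AGAACAGCAAGTGCT",
            "AGAACAGACTGGCCT", "AGTACAGAGTGAGTC", "TGTACAGAATGTTCC",
            "TGGACAGAGTGTACA", "AGAACAAAGTGTTTA"]
pwm = build_pwm(training)
cutoff = percentile_cutoff([score(pwm, s) for s in training], 0.10)
sequence = "T" * 10 + "TGTACAGAATGTTCC" + "T" * 10
hits = scan(pwm, sequence, cutoff)
```

A stricter reading of the percentile rule, keeping only windows that score above the 90th percentile of the training sites, would correspond to passing 0.90 instead of 0.10.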

Chromatin Immunoprecipitation

Chromatin immunoprecipitation experiments were performed as previously described [70]. Briefly, cells were crosslinked with 1% formaldehyde, the reaction was stopped with 125mM glycine, the cells were sonicated, and the DNA samples were immunoprecipitated with N499 anti-GR antibody (8µg per 15cm plate of cells) and Protein A/G beads (Santa Cruz). The beads were washed once with RIPA buffer (10mM Tris-HCl pH 8.0, 1mM EDTA, 0.5mM EGTA, 140mM NaCl, 5% glycerol, 0.1% Na deoxycholate, 0.1% SDS, 1% Triton X-100), three times with RIPA buffer containing an extra 360mM NaCl, and a final time with RIPA buffer. DNA was extracted once with phenol-chloroform and subsequently purified using a Qiaquick column (Qiagen). The relative amounts of immunoprecipitated DNA were measured using qPCR.

Perspectives

Collectively, the data from this thesis advance the understanding of GR transcriptional regulation and of the biological processes that GR regulates. First, our finding that GR occupancy generally correlates with glucocorticoid responsiveness supports the hypothesis that cell-specific transcriptional regulation is governed by selective GRE occupancy. Thus, it is likely that the capability of GR to regulate diverse physiological responses, e.g., the anti-inflammatory response in immune cells and the promotion of adipogenesis in MSCs, stems from cell-specific GR occupancy, which in turn directs gene-selective regulation to mediate a distinct biological process. The next question then is what directs cell-specific GRE occupancy?

Although we have not discerned the mechanism that directs GR occupancy in a particular cell, our data demonstrate that the GBS is not sufficient to direct receptor binding. In C3H10T1/2 cells, we find that although precisely the same GBS sequence can be present at separate genes, GR occupies some but not others. This indicates that elements other than the GBS are required to direct receptor binding. Indeed, we find enriched motifs (e.g., motifs resembling AP-1 and C/EBP binding sites) in addition to GBSs within A549 GREs, and we observed elevated species conservation of the ~1kb surrounding GBSs, suggesting that GREs are composite elements and that sequences within these regions are important for GRE function. To test this hypothesis, we could delete the enriched motifs surrounding the GBSs to examine whether they are essential for GR occupancy or glucocorticoid responsiveness. Currently, this could be tested experimentally using bacterial artificial chromosomes (BACs) that contain the entire GRE and the gene that it regulates. Using the BAC system, different fragments of the GRE could be mutated to test their functional significance. However, it is important to note that BAC systems may not recapitulate the transcriptional regulation exerted in the endogenous native chromosomal context. Alternatively, we could mutate individual GRE fragments using in vivo recombination technologies. For instance, it is feasible to perform such mutations in mouse embryonic stem (ES) cells and generate transgenic mice from these cells to obtain the desired disruptions. Unfortunately, these experiments are time consuming and restricted to analysis with non-human cells. Thus, future technical advances that allow efficient genetics in human cells will be required for such analyses. In addition, it is possible that the enriched motifs identified within A549 GREs are important for specifying GR occupancy in these cells but not in other cells. This hypothesis would then predict that other enriched motifs might be present at GREs identified in non-A549 cells; therefore, it would be interesting to compare the architecture of the A549 and C3H10T1/2 GREs identified in this study. Recently, it has been suggested that epigenetic marks present at genomic response elements may play a vital role in directing steroid receptor occupancy (Myles Brown, CSH conference 2007). It would be interesting to examine whether epigenetic marks, such as acetylation or methylation, also specify cell-selective GR occupancy. Further, it is possible that selective GBS occupancy is a function of chromatin structure. In other words, chromatin compaction may preclude GR from binding to particular sets of GBSs.

In our ChIP-chip experiments in human carcinoma A549 cells, we were able to assess the genomic location of GREs. Contrary to the dogma that regulatory elements are generally proximal to and upstream from the promoter, our data suggest that GREs are generally distal and equally distributed upstream and downstream from the TSS. This phenomenon is not cell or species specific; we made similar observations in mouse C3H10T1/2 mesenchymal stem-like cells. Further, we confirmed that GR occupancy at Per2 is distal (>20kb from the TSS) in primary cells using mouse and human primary MSCs, supporting the idea that the GRE location is physiological and significant. Hence, our data support the hypothesis that GR drives transcriptional regulation by interacting with elements that are commonly remote from the target TSS. Using the same platform for ChIP-chip analysis, we also found that the androgen receptor (AR), a receptor closely related to GR, mostly occupied genomic sites far away from the neighboring androgen-responsive targets [69]. Other labs have isolated response elements occupied by other steroid receptors, specifically the estrogen receptor, and found similar results [23,58]. Collectively, it appears that steroid receptors in general occupy genomic sites remote from their target promoters to drive transcription. Currently, the functional significance of long-range regulation is unknown. It is possible that GREs are positioned at random locations in the genome and that their positions do not contribute significantly to their regulatory functions. However, our observation that GR-occupied GREs are positionally conserved between the mouse and human genomes suggests that the precise location of the GREs may be a determinant of receptor function. Currently, we are attempting to test directly whether GR can drive long-range transcriptional regulation by deleting endogenous GREs in A549 cells.

It has been demonstrated that GR can regulate transcription in a gene-specific manner. The most dramatic examples are cases in which GR interacts with the GREs of genes to directly activate transcription of some targets while repressing others, such as ETNK2 and IL8, respectively. Thus, the determinant of gene-specific regulation appears to act subsequent to GR occupancy at GREs. We further characterized the GREs identified in this study to examine whether the architecture of these elements may be important for specifying activation versus repression. Interestingly, we found that GR occupancy of the GBS motif appears only at up-regulated but not at down-regulated genes in C3H10T1/2 cells. It is important to note that this observation may be provisional, given that our analysis is limited by the small number of genes down-regulated by GR in these cells. In A549 cells, we found GR occupancy at genes repressed by glucocorticoids, such as IL8 and CXCL2, which indicated that the receptor could function actively as both an activator and a repressor. Importantly, the GREs of both IL8 and CXCL2 do not appear to have a GBS motif. Collectively, these data suggest that GR occupancy at the GBS motif directs transcriptional activation. Further, they suggest that the precise DNA sequence that is occupied by GR may be a determinant that directs the receptor to exert activation or repression. Further characterization of the DNA binding sites at GREs of genes repressed by glucocorticoids will be required to determine the motif recognized by GR for regulating these sets of genes.

In addition, GR regulates transcription in a gene-specific manner even at sets of genes that have the same type of glucocorticoid response, e.g., activation [4]. We found that although the GBS motifs within individual GREs are highly variable in sequence, the precise sequence of an individual GBS is highly conserved across evolutionarily distant species. This suggested that the precise sequence of the GBS might be a determinant for gene-specific transcriptional regulation. This observation is consistent with our analysis comparing GR-occupied DNA sequences at activated and repressed genes, further exemplifying the potential importance of the GR binding sequence for governing GRE and receptor function. Moreover, we compared the sequences spanning the GBS motifs of different GREs at glucocorticoid up-regulated genes and found that the architecture within these elements was highly variable, although individually they displayed distinct conserved sequence ‘signatures.’ These data imply that both the precise sequence of the GBS motif and the architecture of the GREs may be determinants of gene-specific receptor function. It is important to note that ongoing experiments in the Yamamoto lab have demonstrated that the precise sequence of the GBS motif can indeed dictate GR function. Meijsing et al. have cloned 15bp GBSs into luciferase reporters and found that the precise sequence of the GBS can specify the receptor domains required for transcription (unpublished data: Sebastiaan Meijsing, Keith Yamamoto). Also, we have demonstrated that the sequence of the GBS can affect the requirement for the GR cofactors Carm1 and Brm for glucocorticoid induction in reporters (unpublished data: Sebastiaan Meijsing, Alex So, Keith Yamamoto). Collectively, these data indicate that the precise sequence of the GBS harbors regulatory function. This is likely through an allosteric mechanism by which the sequence of the GBS can affect the conformation of the receptor, as shown in X-ray crystallography and NMR studies (unpublished data: Miles Pufall, Lisa Watson, Sebastiaan Meijsing, Keith Yamamoto).

Furthermore, we were impressed by the level of species conservation of the GBSs found in A549 GREs. This prompted us to examine whether species conservation of GBS motifs was sufficient to predict GR occupancy at GR-activated genes. Strikingly, we found that the level of sequence conservation correlated directly with how well GR occupancy of those sites could be predicted in mouse C3H10T1/2 cells at genes induced by glucocorticoids. In this thesis, we only showed sequence conservation of the GBS motifs without regard to functional conservation. When we scored the corresponding aligned 15bp motifs found in the human genome, ~30% did not conform well to the GBS motif (data not shown); we therefore predict that ~30% of the GR-occupied GBSs found in mouse C3H10T1/2 cells would not be maintained in the human genome. Thus, it would be extremely interesting to compare GR occupancy across the mouse and human genomes to map functionally conserved and species-specific GREs. For example, we could compare the glucocorticoid responsiveness and GR occupancy within the same cell type, e.g., MSCs, of the two species (mouse and human). We would predict a large overlap of glucocorticoid-responsive genes in the two species. Within these genes, it is likely that most of the GR-occupied GREs would be functionally conserved. Additionally, we would expect a fraction of GREs to be species-specific, which likely permits the evolution of GR biology. For example, some genes responsive to glucocorticoid in both mouse and human MSCs may have GR-occupied GREs but at different locations relative to the corresponding target gene. On the other hand, some genes may be responsive in both species but only occupied by GR in one of the organisms. These studies will be important for dissecting species similarities and differences in GR biology.

In our analysis to identify glucocorticoid-responsive targets in C3H10T1/2 mesenchymal stem-like cells, we found that several genes important for regulating circadian rhythm were regulated by GR. Indeed, we found that glucocorticoids stimulated circadian rhythm in MSCs. Using our computational and conservation analysis to predict in vivo GR occupancy at GBSs, we identified Per1, Per2, and E4bp4 as early direct targets of glucocorticoids, implying that initiation of rhythmicity by the receptor is likely to involve regulation of these genes. We also found that continual GR activity was required for maintaining circadian rhythmicity. Further, we found that regulation of Per1, Per2, and E4bp4, as well as other clock components, was evolutionarily conserved between mouse and human MSCs, implying a significant biological function associated with this glucocorticoid-mediated process. It will be important to test whether Per1 and Per2 are required for GR-stimulated rhythmicity in MSCs by utilizing genetic disruptions in mouse. Importantly, the Per2 mutant mouse generated by the Bradley lab [86] encompasses a disruption that targets the GR-occupied GBS +20kb from the TSS of this transcript (data not shown). The deletion is ~2kb in length and deletes the entire PAS domain of Per2 [86]. Using this mutant, we experimentally examined whether the +20kb GR-occupied GBS is required for glucocorticoid induction of the Per2 transcript. Indeed, we found that the deleted region harboring the GBS is essential for glucocorticoid responsiveness of Per2 (data not shown). This provides direct evidence that this site drives GR-mediated Per2 transcription and supports the idea that GR can truly exert long-range distal regulation. However, it is important to note that we have not eliminated the possibility that the PAS domain deletion itself may disrupt the glucocorticoid responsiveness. Moreover, glucocorticoids induce adipogenesis in MSCs, and in this study, we found that GR stimulates circadian rhythmicity in these cells. Hence, it would be interesting to examine whether circadian rhythmicity is required for GR-induced adipogenesis.

In this thesis, we have generated information that provides a better understanding of the mechanism of GR transcriptional regulation, which subsequently provided us with the tools to dissect a biological process governed by the receptor (circadian rhythm). Collectively, our data suggest that native chromosomal GREs are composite elements, and the composition of the GBS along with other non-GR factor binding sites within the GREs may be important for directing cell-specific GRE occupancy. In this model, the precise sequence of the GBS within the GREs can direct receptor function to produce gene-specific transcriptional responses upon GR occupancy. Our characterization of glucocorticoid-responsive targets and GREs led us to the discovery that GR directly initiates and maintains circadian rhythm in MSCs.


Bibliography

1. Jobe AH, Ikegami M (2000) Lung development and function in preterm infants in the surfactant treatment era. Annu Rev Physiol 62: 825-846.
2. Yamamoto KR, Darimont BD, Wagner RL, Iniguez-Lluhi JA (1998) Building transcriptional regulatory complexes: signals and surfaces. Cold Spring Harb Symp Quant Biol 63: 587-598.
3. Wang JC, Derynck MK, Nonaka DF, Khodabakhsh DB, Haqq C, et al. (2004) Chromatin immunoprecipitation (ChIP) scanning identifies primary glucocorticoid receptor target genes. Proc Natl Acad Sci U S A 101: 15603-15608.
4. Rogatsky I, Wang JC, Derynck MK, Nonaka DF, Khodabakhsh DB, et al. (2003) Target-specific utilization of transcriptional regulatory surfaces by the glucocorticoid receptor. Proc Natl Acad Sci U S A 100: 13845-13850.
5. Wang JC, Shah N, Pantoja C, Meijsing SH, Ho JD, et al. (2006) Novel arylpyrazole compounds selectively modulate glucocorticoid receptor regulatory activity. Genes Dev 20: 689-699.
6. Chen W, Rogatsky I, Garabedian MJ (2006) MED14 and MED1 differentially regulate target-specific gene activation by the glucocorticoid receptor. Mol Endocrinol 20: 560-572.
7. Lefstin JA, Thomas JR, Yamamoto KR (1994) Influence of a steroid receptor DNA-binding domain on transcriptional regulatory functions. Genes Dev 8: 2842-2856.
8. Leung TH, Hoffmann A, Baltimore D (2004) One nucleotide in a kappaB site can determine cofactor specificity for NF-kappaB dimers. Cell 118: 453-464.
9. Wood JR, Greene GL, Nardulli AM (1998) Estrogen response elements function as allosteric modulators of estrogen receptor conformation. Mol Cell Biol 18: 1927-1934.
10. Bulger M, Groudine M (1999) Looping versus linking: toward a model for long-distance gene activation. Genes Dev 13: 2465-2477.
11. Martone R, Euskirchen G, Bertone P, Hartman S, Royce TE, et al. (2003) Distribution of NF-kappaB-binding sites across human chromosome 22. Proc Natl Acad Sci U S A 100: 12247-12252.
12. Hartman SE, Bertone P, Nath AK, Royce TE, Gerstein M, et al. (2005) Global changes in STAT target selection and transcription regulation upon interferon treatments. Genes Dev 19: 2953-2968.
13. Tavera-Mendoza LE, Mader S, White JH (2006) Genome-wide approaches for identification of nuclear receptor target genes. Nucl Recept Signal 4: e018.
14. Nordeen SK, Kuhnel B, Lawler-Heavner J, Barber DA, Edwards DP (1989) A quantitative comparison of dual control of a hormone response element by progestins and glucocorticoids in the same cell line. Mol Endocrinol 3: 1270-1278.
15. Balsalobre A, Brown SA, Marcacci L, Tronche F, Kellendonk C, et al. (2000) Resetting of circadian time in peripheral tissues by glucocorticoid signaling. Science 289: 2344-2347.
16. Karlsson B, Knutsson A, Lindahl B (2001) Is there an association between shift work and having a metabolic syndrome? Results from a population based study of 27,485 people. Occup Environ Med 58: 747-752.
17. Rudic RD, McNamara P, Curtis AM, Boston RC, Panda S, et al. (2004) BMAL1 and CLOCK, two essential components of the circadian clock, are involved in glucose homeostasis. PLoS Biol 2: e377.
18. Wijnen H, Young MW (2006) Interplay of circadian clocks and metabolic rhythms. Annu Rev Genet 40: 409-448.
19. Srivastava AK, Meier AH (1972) Daily variation in concentration of cortisol in plasma in intact and hypophysectomized gulf killifish. Science 177: 185-187.


20. Chandler VL, Maler BA, Yamamoto KR (1983) DNA sequences bound specifically by glucocorticoid receptor in vitro render a heterologous promoter hormone responsive in vivo. Cell 33: 489-499.
21. Luecke HF, Yamamoto KR (2005) The glucocorticoid receptor blocks P-TEFb recruitment by NFkappaB to effect promoter-specific transcriptional repression. Genes Dev 19: 1116-1127.
22. Jantzen HM, Strahle U, Gloss B, Stewart F, Schmid W, et al. (1987) Cooperativity of glucocorticoid response elements located far upstream of the tyrosine aminotransferase gene. Cell 49: 29-38.
23. Carroll JS, Liu XS, Brodsky AS, Li W, Meyer CA, et al. (2005) Chromosome-wide mapping of estrogen receptor binding reveals long-range regulation requiring the forkhead protein FoxA1. Cell 122: 33-43.
24. Magee JA, Chang LW, Stormo GD, Milbrandt J (2006) Direct, androgen receptor-mediated regulation of the FKBP5 gene via a distal enhancer element. Endocrinology 147: 590-598.
25. Yamamoto KR (1989) A Conceptual View of Transcriptional Regulation. American Zoologist 29: 537-547.
26. Yamamoto KR (1995) Multilayered control of intracellular receptor function. Harvey Lect 91: 1-19.
27. Beato M, Chalepakis G, Schauer M, Slater EP (1989) DNA regulatory elements for steroid hormones. J Steroid Biochem 32: 737-747.
28. Nordeen SK, Suh BJ, Kuhnel B, Hutchison CD (1990) Structural determinants of a glucocorticoid receptor recognition element. Mol Endocrinol 4: 1866-1873.
29. Strahle U, Klock G, Schutz G (1987) A DNA sequence of 15 base pairs is sufficient to mediate both glucocorticoid and progesterone induction of gene expression. Proc Natl Acad Sci U S A 84: 7871-7875.
30. La Baer J, Yamamoto KR (1994) Analysis of the DNA-binding affinity, sequence specificity and context dependence of the glucocorticoid receptor zinc finger region. J Mol Biol 239: 664-688.
31. Rogatsky I, Luecke HF, Leitman DC, Yamamoto KR (2002) Alternate surfaces of transcriptional coregulator GRIP1 function in different glucocorticoid receptor activation and repression contexts. Proc Natl Acad Sci U S A 99: 16701-16706.
32. Lefstin JA, Yamamoto KR (1998) Allosteric effects of DNA on transcriptional regulators. Nature 392: 885-888.
33. Geserick C, Meyer HA, Haendler B (2005) The role of DNA response elements as allosteric modulators of steroid receptor function. Mol Cell Endocrinol 236: 1-7.
34. Scully KM, Jacobson EM, Jepsen K, Lunyak V, Viadiu H, et al. (2000) Allosteric effects of Pit-1 DNA sites on long-term repression in cell type specification. Science 290: 1127-1131.
35. Roche PJ, Hoare SA, Parker MG (1992) A consensus DNA-binding site for the androgen receptor. Mol Endocrinol 6: 2229-2235.
36. DePrimo SE, Diehn M, Nelson JB, Reiter RE, Matese J, et al. (2002) Transcriptional programs activated by exposure of human prostate cancer cells to androgen. Genome Biol 3: RESEARCH0032.
37. Nelson PS, Clegg N, Arnold H, Ferguson C, Bonham M, et al. (2002) The program of androgen-responsive genes in neoplastic prostate epithelium. Proc Natl Acad Sci U S A 99: 11890-11895.
38. Sayegh R, Auerbach SD, Li X, Loftus RW, Husted RF, et al. (1999) Glucocorticoid induction of epithelial sodium channel expression in lung and renal epithelia occurs via trans-activation of a hormone response element in the 5'-flanking region of the human epithelial sodium channel alpha subunit gene. J Biol Chem 274: 12431-12437.

39. Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188-1190.
40. Suzuki Y, Yamashita R, Sugano S, Nakai K (2004) DBTSS, DataBase of Transcriptional Start Sites: progress report 2004. Nucleic Acids Res 32: D78-81.
41. Sabo PJ, Kuehn MS, Thurman R, Johnson BE, Johnson EM, et al. (2006) Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA microarrays. Nat Methods 3: 511-518.
42. Dermitzakis ET, Reymond A, Lyle R, Scamuffa N, Ucla C, et al. (2002) Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature 420: 578-582.
43. Dermitzakis ET, Kirkness E, Schwarz S, Birney E, Reymond A, et al. (2004) Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment. Genome Res 14: 852-859.
44. Zaret KS, Yamamoto KR (1984) Reversible and persistent changes in chromatin structure accompany activation of a glucocorticoid-dependent enhancer element. Cell 38: 29-38.
45. Bussemaker HJ, Li H, Siggia ED (2000) Building a dictionary for genomes: identification of presumptive regulatory sites by statistical analysis. Proc Natl Acad Sci U S A 97: 10096-10100.
46. Bussemaker HJ, Li H, Siggia ED (2000) Regulatory element detection using a probabilistic segmentation model. Proc Int Conf Intell Syst Mol Biol 8: 67-74.
47. Liu X, Brutlag DL, Liu JS (2001) BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pac Symp Biocomput: 127-138.
48. Hoffmann E, Dittrich-Breiholz O, Holtmann H, Kracht M (2002) Multiple control of interleukin-8 gene expression. J Leukoc Biol 72: 847-855.
49. Freund A, Jolivel V, Durand S, Kersual N, Chalbos D, et al. (2004) Mechanisms underlying differential expression of interleukin-8 in breast cancer cells. Oncogene 23: 6105-6114.
50. Higgins DG, Thompson JD, Gibson TJ (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol 266: 383-402.
51. Luisi BF, Xu WX, Otwinowski Z, Freedman LP, Yamamoto KR, et al. (1991) Crystallographic analysis of the interaction of the glucocorticoid receptor with DNA. Nature 352: 497-505.
52. Geng CD, Vedeckis WV (2004) Steroid-responsive sequences in the human glucocorticoid receptor gene 1A promoter. Mol Endocrinol 18: 912-924.
53. Marinovic AC, Zheng B, Mitch WE, Price SR (2002) Ubiquitin (UbC) expression in muscle cells is increased by glucocorticoids through a mechanism involving Sp1 and MEK1. J Biol Chem 277: 16673-16681.
54. Wang JC, Stromstedt PE, Sugiyama T, Granner DK (1999) The phosphoenolpyruvate carboxykinase gene glucocorticoid response unit: identification of the functional domains of accessory factors HNF3 beta (hepatic nuclear factor-3 beta) and HNF4 and the necessity of proper alignment of their cognate binding sites. Mol Endocrinol 13: 604-618.
55. Phuc Le P, Friedman JR, Schug J, Brestelli JE, Parker JB, et al. (2005) Glucocorticoid Receptor-Dependent Gene Regulatory Networks. PLoS Genet 1: e16.
56. Laganiere J, Deblois G, Lefebvre C, Bataille AR, Robert F, et al. (2005) From the Cover: Location analysis of estrogen receptor alpha target promoters reveals that FOXA1 defines a domain of the estrogen response. Proc Natl Acad Sci U S A 102: 11651-11656.
57. Bieda M, Xu X, Singer MA, Green R, Farnham PJ (2006) Unbiased location analysis of E2F1-binding sites suggests a widespread role for E2F1 in the human genome. Genome Res 16: 595-605.
58. Carroll JS, Meyer CA, Song J, Li W, Geistlinger TR, et al. (2006) Genome-wide analysis of estrogen receptor binding sites. Nat Genet 38: 1289-1297.

59. Bajic VB, Tan SL, Christoffels A, Schonbach C, Lipovich L, et al. (2006) Mice and men: their promoter properties. PLoS Genet 2: e54.
60. Lomvardas S, Barnea G, Pisapia DJ, Mendelsohn M, Kirkland J, et al. (2006) Interchromosomal interactions and olfactory receptor choice. Cell 126: 403-413.
61. Rosen J, Miner JN (2005) The search for safer glucocorticoid receptor ligands. Endocr Rev 26: 452-464.
62. Oberley MJ, Tsao J, Yau P, Farnham PJ (2004) High-throughput screening of chromatin immunoprecipitates using CpG-island microarrays. Methods Enzymol 376: 315-334.
63. Rozen S, Skaletsky H (2000) Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol 132: 365-386.
64. Rao S, Procko E, Shannon MF (2001) Chromatin remodeling, measured by a novel real-time polymerase chain reaction assay, across the proximal promoter region of the IL-2 gene. J Immunol 167: 4494-4503.
65. Ben-Dor A, Shamir R, Yakhini Z (1999) Clustering gene expression patterns. J Comput Biol 6: 281-297.
66. Li H, Rhodius V, Gross C, Siggia ED (2002) Identification of the binding sites of regulatory proteins in bacterial genomes. Proc Natl Acad Sci U S A 99: 11772-11777.
67. Patil CK, Li H, Walter P (2004) Gcn4p and novel upstream activating sequences regulate targets of the unfolded protein response. PLoS Biol 2: E246.
68. Matys V, Fricke E, Geffers R, Gossling E, Haubrock M, et al. (2003) TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res 31: 374-378.
69. Bolton EC, So AY, Chaivorapol C, Haqq CM, Li H, et al. (2007) Cell- and gene-specific regulation of primary target genes by the androgen receptor. Genes Dev 21: 2005-2017.
70. So AY, Chaivorapol C, Bolton EC, Li H, Yamamoto KR (2007) Determinants of Cell- and Gene-Specific Transcriptional Regulation by the Glucocorticoid Receptor. PLoS Genet 3: e94.
71. Thompson BT (2003) Glucocorticoids and acute lung injury. Crit Care Med 31: S253-257.
72. Itani OA, Liu KZ, Cornish KL, Campbell JR, Thomas CP (2002) Glucocorticoids stimulate human sgk1 gene expression by activation of a GRE in its 5'-flanking region. Am J Physiol Endocrinol Metab 283: E971-979.
73. Drouin J, Sun YL, Chamberland M, Gauthier Y, De Lean A, et al. (1993) Novel glucocorticoid receptor complex with DNA element of the hormone-repressed POMC gene. Embo J 12: 145-156.
74. Bilodeau S, Vallette-Kasic S, Gauthier Y, Figarella-Branger D, Brue T, et al. (2006) Role of Brg1 and HDAC2 in GR trans-repression of the pituitary POMC gene and misexpression in Cushing disease. Genes Dev 20: 2871-2886.
75. Meyer T, Carlstedt-Duke J, Starr DB (1997) A weak TATA box is a prerequisite for glucocorticoid-dependent repression of the osteocalcin gene. J Biol Chem 272: 30709-30714.
76. Schule R, Rangarajan P, Kliewer S, Ransone LJ, Bolado J, et al. (1990) Functional antagonism between oncoprotein c-Jun and the glucocorticoid receptor. Cell 62: 1217-1226.
77. Smyth GK, Michaud J, Scott HS (2005) Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics 21: 2067-2075.
78. Lenhard B, Wasserman WW (2002) TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics 18: 1135-1136.
79. Hubbard TJ, Aken BL, Beal K, Ballester B, Caccamo M, et al. (2007) Ensembl 2007. Nucleic Acids Res 35: D610-617.
80. Vitaterna MH, King DP, Chang AM, Kornhauser JM, Lowrey PL, et al. (1994) Mutagenesis and mapping of a mouse gene, Clock, essential for circadian behavior. Science 264: 719-725.

81. Bunger MK, Wilsbacher LD, Moran SM, Clendenin C, Radcliffe LA, et al. (2000) Mop3 is an essential component of the master circadian pacemaker in mammals. Cell 103: 1009-1017.
82. DeBruyne JP, Weaver DR, Reppert SM (2007) CLOCK and NPAS2 have overlapping roles in the suprachiasmatic circadian clock. Nat Neurosci 10: 543-545.
83. Dudley CA, Erbel-Sieler C, Estill SJ, Reick M, Franken P, et al. (2003) Altered patterns of sleep and behavioral adaptability in NPAS2-deficient mice. Science 301: 379-383.
84. Bae K, Jin X, Maywood ES, Hastings MH, Reppert SM, et al. (2001) Differential functions of mPer1, mPer2, and mPer3 in the SCN circadian clock. Neuron 30: 525-536.
85. Zheng B, Albrecht U, Kaasik K, Sage M, Lu W, et al. (2001) Nonredundant roles of the mPer1 and mPer2 genes in the mammalian circadian clock. Cell 105: 683-694.
86. Zheng B, Larkin DW, Albrecht U, Sun ZS, Sage M, et al. (1999) The mPer2 gene encodes a functional component of the mammalian circadian clock. Nature 400: 169-173.
87. van der Horst GT, Muijtjens M, Kobayashi K, Takano R, Kanno S, et al. (1999) Mammalian Cry1 and Cry2 are essential for maintenance of circadian rhythms. Nature 398: 627-630.
88. Thresher RJ, Vitaterna MH, Miyamoto Y, Kazantsev A, Hsu DS, et al. (1998) Role of mouse cryptochrome blue-light photoreceptor in circadian photoresponses. Science 282: 1490-1494.
89. Liu AC, Welsh DK, Ko CH, Tran HG, Zhang EE, et al. (2007) Intercellular coupling confers robustness against mutations in the SCN circadian clock network. Cell 129: 605-616.
90. Dallmann R, Touma C, Palme R, Albrecht U, Steinlechner S (2006) Impaired daily glucocorticoid rhythm in Per1 (Brd) mice. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 192: 769-775.
91. Moore RY, Eichler VB (1972) Loss of a circadian adrenal corticosterone rhythm following suprachiasmatic lesions in the rat. Brain Res 42: 201-206.
92. Le Minh N, Damiola F, Tronche F, Schutz G, Schibler U (2001) Glucocorticoid hormones inhibit food-induced phase-shifting of peripheral circadian oscillators. Embo J 20: 7128-7136.
93. Feldman BJ, Streeper RS, Farese RV, Jr., Yamamoto KR (2006) Myostatin modulates adipogenesis to generate adipocytes with favorable metabolic effects. Proc Natl Acad Sci U S A 103: 15675-15680.
94. Fontaine C, Dubois G, Duguay Y, Helledie T, Vu-Dac N, et al. (2003) The orphan nuclear receptor Rev-Erbalpha is a peroxisome proliferator-activated receptor (PPAR) gamma target gene and promotes PPARgamma-induced adipocyte differentiation. J Biol Chem 278: 37672-37680.
95. Preitner N, Damiola F, Lopez-Molina L, Zakany J, Duboule D, et al. (2002) The orphan nuclear receptor REV-ERBalpha controls circadian transcription within the positive limb of the mammalian circadian oscillator. Cell 110: 251-260.
96. Shimba S, Ishii N, Ohta Y, Ohno T, Watabe Y, et al. (2005) Brain and muscle Arnt-like protein-1 (BMAL1), a component of the molecular clock, regulates adipogenesis. Proc Natl Acad Sci U S A 102: 12071-12076.
97. Turek FW, Joshu C, Kohsaka A, Lin E, Ivanova G, et al. (2005) Obesity and metabolic syndrome in circadian Clock mutant mice. Science 308: 1043-1045.
98. Ohno T, Onishi Y, Ishida N (2007) A novel E4BP4 element drives circadian expression of mPeriod2. Nucleic Acids Res 35: 648-655.
99. Fu L, Patel MS, Bradley A, Wagner EF, Karsenty G (2005) The molecular clock mediates leptin-regulated bone formation. Cell 122: 803-815.


It is the policy of the University to encourage the distribution of all theses and dissertations. Copies of all UCSF theses and dissertations will be routed to the library via the Graduate Division. The library will make all theses and dissertations accessible to the public and will preserve these to the best of their abilities, in perpetuity.

I hereby grant permission to the Graduate Division of the University of California, San Francisco to release copies of my thesis or dissertation to the Campus Library to provide access and preservation, in whole or in part, in perpetuity.
