
GENOMIC ANALYSIS OF RIBOSOMAL DNA AND ITS APPLICATION TO THE

INVESTIGATION OF DISEASE PATHOGENESIS

by

GABRIEL ETIENNE ZENTNER

Submitted in partial fulfillment of the requirements

for the degree of Doctor of Philosophy

Dissertation advisor: Peter C. Scacheri, Ph.D.

Department of Genetics

CASE WESTERN RESERVE UNIVERSITY

January 2012

Gabriel Etienne Zentner

Doctor of Philosophy

Guangbin Luo

Peter Scacheri

Helen Salz

Derek Abbott

10-7-2011

To Stephanie, for everything.


Table of Contents

List of tables
List of figures
Acknowledgements
List of abbreviations
Abstract

Chapter 1: Background and Significance
  Overview
  Structure and transcription of rDNA
  Organization of the mammalian rDNA repeat
  Cytological features and chromatin structure of rDNA
  Nucleosome occupancy of rDNA
  Transcription of rRNA
  Cell type-specific regulation of rRNA transcription
  Epigenetic regulation of rDNA
  CpG methylation
  Core histone modifications
  H1: the linker histone weighs in on rRNA transcription
  Histone variants
  H2A.Z
  H3.3
  Nucleosome positioning
  TTF-I sets the stage for epigenetic regulation of rDNA
  Formation of at rDNA
  Noncoding RNA transcripts epigenetically regulate rDNA
  Establishment of active rDNA chromatin
  Replication timing of rDNA repeats
  Consequences of dysregulated ribosome biogenesis
  Crosstalk between ribosome biogenesis and
  Ribopathies: diseases of ribosome biogenesis
  The Minute
  General phenomena related to impaired ribosome biogenesis
  CHD , CHD7, and CHARGE syndrome
  CHD proteins
  CHD7 and CHARGE syndrome
  Summary and research aims

Chapter 2: Integrative genomic analysis of human ribosomal DNA
  Abstract
  Introduction
  Results
  Alignment of high-throughput sequencing data to rDNA
  Distribution of histone modifications at rDNA
  Cell type-specificity of histone marks at rDNA
  Chromatin accessibility and transcription at rDNA
  Nucleosome occupancy of rDNA
  ChIP-seq analysis of Pol I chromatin association
  ChIP-seq analysis of UBF chromatin association
  The insulator-binding protein CTCF associates with rDNA
  Discussion
  Materials and methods

Chapter 3: CHD7 functions in the nucleolus as a positive regulator of rRNA biogenesis
  Abstract
  Introduction
  Results
  CHD7 associates with rDNA
  CHD7 is dually localized to the nucleoplasm and nucleolus
  CHD7 influences the levels of the 45S pre-rRNA transcript
  Depletion of CHD7 reduces cell proliferation and protein synthesis
  CHD7 antagonizes DNA methylation at active rDNA repeats
  CHARGE-relevant tissues from Chd7 gene-trap mice show reduced pre-rRNA levels
  CHD7 promotes rDNA association of the Treacher Collins syndrome protein, treacle
  Discussion
  Materials and methods

Chapter 4: Investigation into dysregulated ribosome biogenesis as a shared pathogenic component of human syndromes
  Abstract
  Introduction
  Results
  Discussion

Chapter 5: Discussion and Future Directions
  Summary
  Genomic analysis of rDNA
  CHD7 positively regulates rRNA synthesis
  Discussion and future directions
  How does CHD7 promote rRNA biogenesis?
  Is dysregulated rRNA transcription a pathogenic component of CHARGE syndrome?
  Relevance of dysregulated rRNA transcription to CHARGE syndrome
  Relevance of dysregulated nucleoplasmic transcription to CHARGE syndrome
  A dual-function model for CHD7 and its relevance to CHARGE syndrome
  rDNA copy number, CpG methylation, and phenotypic variability in CHARGE syndrome
  rDNA copy number variation
  CpG methylation
  Implications of variable rDNA copy number and CpG methylation for phenotypic variability in CHARGE syndrome
  Dissecting nucleoplasmic and nucleolar functions of CHD7
  Testing the requirements for a nucleoplasmic function of CHD7 in the mouse
  Separating the functions of CHD7 using patient-specific iPSCs
  Investigating CHD7 nucleolar targeting via nucleolar protein interactions
  CHD7 and rRNA biogenesis: a connection to ?
  Further applications of rDNA genomics
  Condition-dependent alterations in rDNA chromatin structure
  Distinguishing active and inactive rDNA repeats
  Large-scale analysis of protein occupancy at rDNA

Appendix: Detailed chromatin immunoprecipitation protocol

Bibliography

List of Tables

Chapter 1

Table 1-1. Histone modifications and variants associated with rDNA

Chapter 2

Table 2-1. Correlation coefficients for pairwise comparisons

Table 2-2. CTCF consensus motifs within human and mouse rDNA

Table 2-3. ChIP-PCR primers used in Chapter 2

Chapter 3

Table 3-1. qRT-PCR primers used in Chapter 3

Table 3-2. ChIP-PCR primers used in Chapter 3

Chapter 4

Table 4-1. List of transcription factors and chromatin-associated proteins associated with haploinsufficient congenital anomaly syndromes

Chapter 5

Table 5-1. GO biological processes and mouse associated with CHD7-bound active and poised enhancers in mESCs


List of Figures

Chapter 1

Figure 1-1. Structure of the mammalian rDNA repeat

Figure 1-2. NoRC-dependent silencing of rDNA

Figure 1-3. Structure of CHD7

Chapter 2

Figure 2-1. Comparison of input samples from K562 cells

Figure 2-2. Distribution of histone modifications at rDNA in K562 cells

Figure 2-3. H3K4me1 ChIP-PCR in K562 cells

Figure 2-4. Normalized tag density scores for histone modifications

Figure 2-5. Correlation heatmaps of pairwise comparisons between median signals for histone modifications at rDNA

Figure 2-6. Distribution of histone modifications at rDNA in HUVECs

Figure 2-7. Distribution of histone modifications at rDNA in H1-hESCs

Figure 2-8. Distribution of histone modifications at rDNA in NHEKs

Figure 2-9. Comparison of rDNA histone marks across multiple cell types

Figure 2-10. Chromatin accessibility, transcription, and nucleosome occupancy at rDNA

Figure 2-11. ChIP-seq analysis of Pol I and UBF rDNA association

Figure 2-12. UBF associates with nucleoplasmic chromatin

Figure 2-13. Analysis of nucleoplasmic UBF peaks

Figure 2-14. UBF regulates nucleoplasmic gene transcription

Figure 2-15. CTCF is associated with human rDNA

Figure 2-16. CTCF binds to mouse rDNA

Chapter 3

Figure 3-1. CHD7 binds to rDNA

Figure 3-2. CHD7 localizes to the nucleoplasm and nucleolus

Figure 3-3. CHD7 positively regulates rRNA biogenesis

Figure 3-4. CHD7 knockdown does not affect protein levels of known regulators of rRNA transcription

Figure 3-5. Loss of CHD7 impairs cell proliferation and protein synthesis

Figure 3-6. CHD7 is associated with active rDNA repeats and counteracts rDNA methylation

Figure 3-7. Pre-rRNA levels are reduced in CHARGE-relevant tissues from Chd7 gene-trap embryos

Figure 3-8. CHD7 promotes association of treacle with rDNA

Figure 3-9. CHD7 physically interacts with treacle

Chapter 5

Figure 5-1. CHD7 associates with active and poised mESC enhancers

Figure 5-2. A model for dual functions of CHD7

Figure 5-3. CHD7 expression in HCV-induced HCCs

Figure 5-4. CHD7 expression in ovarian

Figure 5-5. CHD7 expression in gliomas


Acknowledgements

First and foremost, I thank my thesis advisor, Dr. Peter Scacheri, who has given me the freedom to pursue my own scientific interests and provided me with strong conceptual and experimental training with which to pursue my scientific goals. My training has provided me with the skills to ask and answer my own scientific questions, take scientific risks, and face the challenges of science with energy and enthusiasm.

I thank my thesis committee members, Dr. Guangbin Luo, Dr. Helen Salz, Dr. Steven Sanders, and Dr. Derek Abbott, for their patience and support throughout my graduate career. I am indebted to my collaborators, Dr. Donna Martin and Dr. Maria Hatzoglou, without whom substantial portions of this work would not have been possible. I also thank Dr. Paul Tesar for his advice and encouragement in the later phases of my graduate work.

None of this work would have been possible without the members of the Scacheri lab, both past and present. I am particularly indebted to Michael Schnetz for his advice on choosing the right path through graduate school and his patience in helping me get set up in the lab. I am especially grateful for the friendship I have developed with Stephanie Balow, my "lab sister" and fellow Star Wars geek. Every member of the Scacheri lab, past and present, has contributed to my scientific development and I am truly grateful for all they have done for me. I am also indebted to the administrative staff of the Department of Genetics for their assistance throughout my graduate career.


I received invaluable support and encouragement, both scientific and otherwise, from many dear friends near and far, including Jason Heaney, Lorrie Rice, Brian Cobb, Spike Murphy, and Neal Evans.

I thank my family, who have been unwavering in their support during my graduate career. They have always encouraged my educational endeavors and challenged me to reach my fullest potential, and it is in no small part because of them that I have completed this work.

Last, and certainly not least, I am indebted to Stephanie Doerner for her constant support and encouragement throughout my graduate career. It is no exaggeration to say that, without her, none of this would have been possible.


List of abbreviations

ac acetyl

ActD actinomycin D

ATP adenosine triphosphate

bp base pairs

BrdU 5-bromo-2'-deoxyuridine

BSA bovine serum albumin

CHARGE coloboma of the eye, heart malformations, atresia of the choanae, retardation of growth and development, genital hypoplasia, and ear abnormalities including deafness

CHD chromodomain helicase DNA binding protein

ChIP chromatin immunoprecipitation

ChIP-chip chromatin immunoprecipitation with microarray analysis

ChIP-chop chromatin immunoprecipitation with methylation-sensitive restriction digest

ChIP-PCR chromatin immunoprecipitation with quantitative PCR

ChIP-seq chromatin immunoprecipitation with high-throughput sequencing

co-IP co-immunoprecipitation

CPE rDNA core promoter element

CTCF CCCTC-binding factor

DBA Diamond-Blackfan anemia

DNA deoxyribonucleic acid

DNase-seq DNase I digestion with high-throughput sequencing


DNMT DNA methyltransferase

DOC deoxycholate

m/hESC mouse/human embryonic stem cell

ETS rDNA external transcribed spacer

GEO Gene Expression Omnibus

GREAT Genomic Regions Enrichment of Annotations Tool

HDAC histone deacetylase

HUVEC human umbilical vein endothelial cell

IGS rDNA intergenic spacer

iPSC induced pluripotent stem cell

ITS rDNA internal transcribed spacer

kb kilobase pair

M molar

mCi millicurie

me1/2/3 mono-/di-/trimethyl

ml milliliter

mM millimolar

MNase-seq micrococcal nuclease digestion with high-throughput sequencing

N normal

NaCl sodium chloride

NHEK normal human epidermal keratinocyte

ng nanogram


NLS nuclear localization sequence

NoLS nucleolar localization sequence

NOR nucleolar organizer region

NoRC nucleolar remodeling complex

NT nontarget

nt nucleotide

PBS phosphate buffered saline

PCR polymerase chain reaction

p phospho

Pol I/II/III RNA polymerase I/II/III

pRNA rDNA promoter-associated RNA

qRT-PCR quantitative real-time PCR

rDNA ribosomal DNA

RNA ribonucleic acid

RNA-seq high-throughput RNA sequencing

rRNA ribosomal RNA

SD standard deviation

SEM standard error of the mean

siRNA short interfering RNA

SRA Sequence Read Archive

TCA trichloroacetic acid

TCOF1 Treacher Collins Franceschetti syndrome 1

TCS Treacher Collins syndrome


TIP5 TTF-I interacting protein 5

TSS transcription start site

TTF-I transcription termination factor for Pol I

UBF upstream binding factor

UCE rDNA upstream control element

µg microgram

µl microliter

µm micrometer


Genomic analysis of ribosomal DNA and its application to the investigation of disease pathogenesis

by

GABRIEL ETIENNE ZENTNER

Abstract

The synthesis of rRNA is critical to all growing organisms, accounting for well over half of all cellular transcription. Highlighting its central function in the cell, dysregulation of rRNA biogenesis and subsequent ribosome assembly has been implicated in human genetic diseases and cancer. While the transcription of rRNA is highly regulated at the chromatin level, it has not been analyzed by genomic methods due to the exclusion of rDNA from current genome assemblies. This work describes a novel method of analysis that enables the alignment of high-throughput sequencing data to a single copy of rDNA in the context of the full genome. Integrated analysis of genomic datasets reveals that the coding region of rDNA is contained within nucleosome-poor open chromatin with high transcriptional activity. We find that histone modifications are enriched not only at the rDNA promoter but also at novel sites within the noncoding intergenic spacer. The distribution of active modifications is more similar within and between cell types than that of repressive modifications. Using ChIP-seq, we show that the nucleolar protein UBF is bound to sites throughout the genome and may play a role in regulating the transcription of nucleoplasmic genes. Lastly, the insulator-binding protein CTCF is bound to a site proximal to the junction between adjacent rDNA repeats, potentially indicating a role for transcriptional insulation in the regulation of rRNA transcription.

We apply this method to CHD7, a disease-relevant chromatin-remodeling protein mutated in the developmental disorder CHARGE syndrome. CHARGE syndrome shares several clinical features with known disorders of ribosome biogenesis. ChIP-seq analysis reveals robust association of CHD7 across the transcribed region of rDNA. Immunofluorescence and subcellular fractionation confirm the nucleolar localization of a substantial fraction of CHD7. Knockdown experiments show that CHD7 functions to positively regulate rRNA biogenesis by counteracting DNA methylation at the rDNA promoter. Lastly, CHARGE-relevant tissues from Chd7 gene-trap mouse embryos display reduced levels of precursor rRNA.

Taken together, these studies provide a novel means for assessing protein occupancy at rDNA that can be applied to disease-relevant chromatin-associated proteins in order to gain novel insights into disease pathogenesis. Additionally, the work presented herein defines a novel role for CHD7 as a positive regulator of rRNA synthesis and suggests that dysregulated rRNA synthesis is involved in the pathogenesis of CHARGE syndrome.


Chapter 1

Background and Significance


Overview

The synthesis of ribosomal RNA (rRNA) is a critical process for all growing cells, which require a continuous supply of rRNA to ensure sufficient ribosome biogenesis to meet the protein synthesis requirements of the cell. The transcription of rRNA is a massive undertaking, accounting for well over half of all transcription in growing cells [1]. The production of rRNA occurs in a distinct subnuclear compartment, the nucleolus, which forms around tandemly repeated arrays of rRNA genes (rDNA) [2,3]. The number of rDNA repeats varies widely within and between organisms, ranging from fewer than 50 to more than 26,000 [4-10]. Consistent with its central role in cellular life, many signaling pathways regulated by nutrient availability or growth factors regulate rRNA production as a downstream effect [11-27]. Further underscoring the importance of these processes, dysregulated rRNA synthesis and/or ribosome biogenesis have been recognized as pathogenic components of several human diseases [28].

Structure and transcription of rDNA

Organization of the mammalian rDNA repeat

Each cell contains hundreds to thousands of copies of a single rDNA repeat. These repeats are organized into tandemly repeated arrays known as nucleolar organizer regions (NORs), around which nucleoli are formed. NORs are located on the short arms of chromosomes 13, 14, 15, 21, and 22 in human and chromosomes 12, 15, 16, 17, 18, and 19 in mouse [29,30] and are arranged in a telomere-to-centromere orientation [29]. The number of rDNA repeats per NOR is highly variable, even within a single cell [7].


Each rDNA repeat is quite large, spanning 42.9 kb in humans and 45 kb in mice [31,32]. Each repeat consists of two major components: the coding region, containing the sequences encoding the mature 18S, 5.8S, and 28S rRNAs, and the intergenic spacer (IGS), a noncoding region with a high content of simple repeats and other elements including LINEs, SINEs, and ALUs [31] (Figure 1-1).

The coding region of rDNA begins at position 0 of the annotated human (GenBank accession no. U13369) and mouse (GenBank accession no. BK000964) rDNA sequences, which is the transcription start site (TSS) of rRNA. However, the pre-rRNA promoter is annotated as being contained within the 3'-most region of the IGS of the preceding rDNA repeat. The pre-rRNA promoter has a bipartite structure consisting of the core promoter element (CPE), located immediately upstream of the rDNA TSS, and the upstream control element (UCE), located approximately 100 bp upstream of the CPE [33,34]. The coding region of rDNA spans approximately 13-14 kb in mammals and contains the sequences encoding the 18S, 5.8S, and 28S rRNA species. In addition, the coding region contains two external transcribed spacers (ETSs) and two internal transcribed spacers (ITSs), which are removed following rRNA transcription [35].
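The two-component layout described above (coding region plus IGS) can be sketched as a small data structure. This is an illustrative sketch only: the names `RepeatLayout` and `igs_length_kb` are hypothetical, and the 13 kb coding-region length is an assumed value taken from the low end of the approximate 13-14 kb range given in the text, not an annotated coordinate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepeatLayout:
    """Approximate layout of one rDNA repeat (all lengths in kb)."""

    species: str
    repeat_kb: float  # total repeat length, as stated in the text
    coding_kb: float  # coding region length (assumed; text gives ~13-14 kb)

    def igs_length_kb(self) -> float:
        # The IGS is the part of the repeat that is not coding region.
        return self.repeat_kb - self.coding_kb


# Repeat lengths come from the text; coding_kb=13.0 is an assumption.
human = RepeatLayout("human", repeat_kb=42.9, coding_kb=13.0)
mouse = RepeatLayout("mouse", repeat_kb=45.0, coding_kb=13.0)

for layout in (human, mouse):
    print(f"{layout.species}: IGS is roughly {layout.igs_length_kb():.1f} kb")
```

Under these assumptions the IGS accounts for roughly two thirds of each repeat, which is why its repeat content and length polymorphism dominate the overall variability of the rDNA unit.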

A variable number of Sal box transcription terminator repeats are located immediately upstream and downstream of the rDNA coding region. These terminator repeats are bound by transcription termination factor for Pol I (TTF-I). TTF-I mediates Pol I transcription termination and also participates in transcriptional regulation of rDNA [33,36,37].


Figure 1-1. Structure of the mammalian rDNA repeat

(A) A schematic representation of the mouse rDNA repeat based on GenBank accession number BK000964. Within the coding region, external and internal transcribed spacers (ETS/ITS) are depicted as white boxes and the rRNA coding sequences are depicted as blue boxes. The 3' end of the IGS of the preceding rDNA repeat harbors several regulatory elements: the spacer promoter (yellow bar) and its terminator sequence (TSP; green bar), a variable number of enhancer repeats (purple bars), the T0 terminator sequence necessary for TTF-I binding (green bar), and the bipartite pre-rRNA promoter (red bar). Several terminator elements (T1-T10; green bars) lie immediately downstream of the coding region.

(B) Bipartite structure of the mouse rDNA promoter. The core promoter element (CPE) is immediately upstream of the rDNA TSS and the upstream control element (UCE) lies approximately 100 bp upstream of the CPE and rDNA TSS. Also depicted is the T0 terminator sequence required for TTF-I binding.


The remainder of the rDNA repeat is made up of the IGS. The IGS is generally devoid of coding or regulatory DNA and contains a high degree of repetitive sequence, including highly variable numbers of simple repeats, ALUs, LINEs, SINEs, and other transposons [31]. The length of the IGS is highly polymorphic, owing not only to its high repeat content but also to its high susceptibility to recombination [7,38,39]. A notable exception to the absence of regulatory DNA is the spacer promoter and its cognate terminator element, located approximately 2 kb upstream of the rRNA TSS. The spacer promoter is responsible for the production of 150-300 nt noncoding transcripts that are homologous to the rDNA core promoter (promoter-associated RNA, pRNA) [40]. These transcripts appear to be essential for epigenetic regulation of rRNA transcription [41-44]. In addition, a variable number of enhancer repeats lie between the spacer promoter and the pre-rRNA promoter in mouse. These repeats function to stimulate transcription of the pre-rRNA [45-47].

Cytological features and chromatin structure of rDNA

Cytological examination has revealed two distinct classes of NORs. Active NORs are undercondensed during metaphase and are visible as secondary constrictions on metaphase chromosomes. Their constituent rDNA repeats are approximately tenfold less condensed than adjacent satellite DNA, and Pol I and other factors necessary for rRNA transcription such as UBF remain associated with active NORs on metaphase chromosomes, indicating their transcriptional competence [48,49]. In contrast, inactive NORs are indistinguishable from adjacent regions of heterochromatin and are devoid of Pol I and other rRNA transcription factors [49].

The study of chromatin structure at rDNA has been greatly facilitated by the use of psoralen. Psoralen is a drug that intercalates into double-stranded DNA and gives rise to covalent interstrand crosslinks following UV irradiation [50]. Active rDNA repeats, which possess a highly open, euchromatic structure, are highly accessible to psoralen, while the closed, heterochromatic structure of silent rDNA repeats largely prevents psoralen crosslinking. Crosslinked DNA molecules migrate at a slower rate than uncrosslinked DNA, and thus the proportion of active and silent repeats in a cell can be determined. Studies in a variety of model systems have established that roughly equal proportions of active and silent rDNA repeats coexist in the cell. The proportion of active and silent rDNA repeats is stably propagated through multiple cell divisions, indicating epigenetic inheritance of the active or silent state [2].
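Because crosslinked (active) and uncrosslinked (silent) repeats migrate differently, the proportion of active repeats can be estimated from the relative signal of the two bands. The following is a minimal sketch, assuming that the slow- and fast-migrating bands have already been quantified by densitometry; the function name `fraction_active` and the example intensities are hypothetical and do not come from any dataset in this work.

```python
def fraction_active(slow_band: float, fast_band: float) -> float:
    """Estimate the fraction of rDNA repeats in the active state.

    slow_band: signal from the slower-migrating, crosslinked (active) band
    fast_band: signal from the faster-migrating, uncrosslinked (silent) band
    """
    if slow_band < 0 or fast_band < 0:
        raise ValueError("band intensities must be non-negative")
    total = slow_band + fast_band
    if total == 0:
        raise ValueError("no signal in either band")
    return slow_band / total


# Hypothetical intensities in arbitrary units, chosen only for illustration.
active = fraction_active(slow_band=480.0, fast_band=520.0)
print(f"estimated active fraction: {active:.2f}")
```

A value near 0.5 would correspond to the roughly equal proportions of active and silent repeats described above.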

Nucleosome occupancy of rDNA

An area of particular interest pertaining to the chromatin structure of rDNA is its nucleosome occupancy. While it is well established that the coding region of inactive rDNA repeats, as well as the IGS of both active and inactive repeats, contain nucleosomes, the nucleosome occupancy of the coding region of active rDNA is the subject of ongoing controversy. The very high transcriptional output of rDNA led to the hypothesis that the transcribed region of active rDNA repeats might be completely free of nucleosomes; this seems to be a reasonable conclusion given the high elongation rate of Pol I and the high density of Pol I along the transcribed region of rDNA (~1 Pol I molecule/100 bp) [51,52]. A number of studies suggest a complete absence of nucleosomes based on the results of psoralen crosslinking, electron microscopy, and micrococcal nuclease (MNase)/restriction enzyme digests from a variety of model organisms from S. cerevisiae to mouse [50,53-57]. However, more recent experiments have cast doubt on these results. Depletion of histones inhibits rRNA transcription in yeast [58], and ChIP analysis of a yeast strain with a reduced rDNA copy number (hereafter referred to as the 40-copy strain) revealed the association of histones with the transcribed region of rDNA and showed that Pol I transcription termination was dependent on the presence of chromatin remodeling proteins [59]. However, the results of this study relied on the assumption that all rDNA repeats in the 40-copy strain were actively transcribed.

Interestingly, recent psoralen crosslinking experiments have indicated that 10-20% of rDNA repeats in the 40-copy strain are inactive [60]. Additionally, chromatin endogenous cleavage (ChEC) assays suggest a very low nucleosome occupancy within the coding region of active rDNA repeats. In the ChEC assay, proteins of interest are fused to MNase and expressed in yeast. Cells are maintained in the absence of Ca2+ to keep MNase inactive, and upon Ca2+ addition the enzyme is activated, digesting the DNA nearest the fusion protein. In the 40-copy strain, slight cleavage of rDNA was observed when histone-MNase fusion proteins were expressed. This result indicates low association of nucleosomes with rDNA, and may be explicable by the presence of a low number of inactive repeats as detected by psoralen crosslinking [60].
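As a back-of-the-envelope illustration of the Pol I density cited above, the following sketch estimates how many Pol I molecules would be loaded on a single active repeat. It assumes uniform spacing of one Pol I molecule per 100 bp across a coding region of approximately 13-14 kb; the function name `polymerases_per_repeat` is hypothetical.

```python
def polymerases_per_repeat(coding_region_bp: int, bp_per_polymerase: int = 100) -> int:
    """Estimate Pol I molecules on one active repeat, assuming uniform spacing."""
    if coding_region_bp < 0:
        raise ValueError("coding_region_bp must be non-negative")
    if bp_per_polymerase <= 0:
        raise ValueError("bp_per_polymerase must be positive")
    return coding_region_bp // bp_per_polymerase


# Coding-region lengths bracket the approximate 13-14 kb range from the text.
for length_bp in (13_000, 14_000):
    print(f"{length_bp} bp coding region: ~{polymerases_per_repeat(length_bp)} Pol I molecules")
```

On this estimate, each active repeat would carry on the order of 130-140 elongating polymerases, which illustrates why a nucleosome-free transcribed region was initially considered plausible.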


Further confounding these findings, deposition of the histone variant H3.3 into active rDNA arrays in cells has been demonstrated [61]. In addition, several factors, including the facilitates chromatin transcription (FACT) complex and the nucleolar proteins nucleolin and B23, enable Pol I transcription through a chromatin template. Moreover, depletion of various FACT components, nucleolin, or B23 reduces pre-rRNA transcription in vivo, strongly suggesting that the coding region of active rDNA repeats contains an appreciable level of nucleosomes [62-66]. Currently, there is no clear consensus on this issue. However, given the requirement of chromatin remodeling for efficient Pol I transcription in vitro and in vivo, it seems quite likely that the transcribed region of active rDNA repeats does contain an appreciable number of nucleosomes.

Transcription of rRNA

As rDNA is by far the most highly transcribed sequence in the genome, it is perhaps not surprising that the cell has evolved Pol I, the sole function of which is to transcribe rRNA. Analogous to transcription by RNA polymerases II and III (Pol II and Pol III), transcription by Pol I requires additional factors to mediate processes such as promoter recognition, transcriptional elongation, and termination. Formation of the Pol I preinitiation complex (PIC) requires the promoter selectivity factor SL1 (human designation; TIF-IB in mouse) and has also been proposed to require upstream binding factor (UBF). UBF contains several sequence-tolerant HMG DNA-binding motifs and appears to enable the wrapping of DNA to approximate the UCE and CPE at the rDNA promoter [67].


More recently, it has been suggested that UBF stimulates promoter escape and transcriptional elongation by Pol I rather than PIC formation [17,68]. Promoter specificity is determined by SL1, a complex containing the TATA-binding protein (TBP) and several Pol I-specific TBP-associated factors (TAFI12, TAFI41, TAFI48, TAFI68, TAFI95/110) [69-72]. Following promoter recognition, TAFs participate in transcription initiation complex recruitment via interaction with UBF. The interaction of SL1 with TIF-IA, the mammalian homolog of yeast Rrn3p, a factor associated with initiation-competent Pol I, then drives the formation of productive transcription complexes [73,74].

Cell type-specific regulation of rRNA transcription

Different cell types have distinct requirements for rRNA synthesis. Thus, transcription of rRNA is also regulated by cell type-specific factors. In highly proliferative cells such as oocytes and keratinocytes, the transcriptional regulator basonuclin is highly expressed and upregulates rRNA synthesis [75-78]. Other examples of cell type-specific regulators of rRNA transcription include MyoD and Mgn, Runx2, and C/EBPβ. These factors downregulate rRNA synthesis as highly proliferative C2C12 myoblasts undergo myogenic, osteogenic, and adipogenic differentiation, respectively [79,80]. These findings indicate that a common function of lineage-specific transcription factors may be regulation of rRNA transcription. This is consistent with previous observations that rRNA levels are dramatically decreased upon differentiation, reflecting the high protein synthesis requirements of rapidly dividing progenitor populations versus terminally differentiated cell types [81-84].


An additional layer of complexity in the cell type-specificity of rRNA expression involves rDNA variants (v-rDNA). Studies in mouse cells have revealed the existence of seven individual v-rDNAs containing sequence polymorphisms within and at the 5' end of the 28S rRNA coding region. Individual v-rDNAs differ based on copy number, expression level, and CpG methylation. Three v-rDNAs are expressed in all tissues, two in some tissues, and two in no tissues [85]. These findings raise the possibility that within different cell types there is crosstalk between v-rDNAs and cell type-specific factors to modulate the rDNA transcriptional output of the cell.

Epigenetic regulation of rDNA

CpG methylation

Methylation of DNA at CpG dinucleotides, mediated by DNA methyltransferases (DNMTs), is a major mechanism of epigenetic transcriptional repression in mammals [86], and alterations in CpG methylation are widespread in cancer and other diseases [87]. In humans, the rDNA promoter contains in excess of two dozen CpG dinucleotides, and no particular CpG has been shown to be critical for the silencing of rDNA. In contrast, methylation of a single CpG in the mouse rDNA promoter at position -133 relative to the rDNA TSS is sufficient to silence a repeat by impairing promoter association of UBF [88]. This observation suggests that DNA methylation participates in rDNA silencing by disrupting key protein-DNA interactions. The importance of DNA methylation in rDNA regulation is underscored by the finding that the rDNA promoter is hypomethylated in hepatocellular carcinoma versus normal liver tissue [89].


CpG methylation also plays a role in the genomic stability of rDNA, presumably by reinforcing a closed chromatin structure less accessible to the cellular recombination machinery. Maintenance of rDNA stability is critical, given the high susceptibility of rDNA to intra- and inter-chromosomal recombination [7,38,39]. In cells with somatic knockout of DNMT1, decreased CpG methylation and increased rRNA levels occur as expected, along with severe disorganization of the nucleolus [90], indicating a role for CpG methylation not only in rRNA transcriptional regulation but also in nucleolar organization.

Interestingly, a recent study has suggested that CpG methylation of rDNA in fact has a net positive effect on rRNA synthesis. As expected, cells with somatic knockout of DNMT1 and DNMT3b or treated with the DNMT inhibitor 5-aza-2'-deoxycytidine (aza-dC) displayed reduced CpG methylation and reactivation of silent repeats. Surprisingly, loss of CpG methylation led to a decrease in the rate of rRNA synthesis as well as accumulation of unprocessed pre-rRNA. Consistent with previous results, loss of CpG methylation was also associated with nucleolar disruption and increased rDNA recombination. Most notably, loss of CpG methylation led to occupancy of rDNA by Pol II and the production of cryptic Pol II transcripts, which were found to interfere with rRNA processing [91]. In yeast, cryptic Pol II transcription is associated with increased rDNA recombination [92]. Thus, CpG methylation appears to mediate efficient rRNA synthesis by suppressing cryptic Pol II transcription, which impairs rRNA processing and increases rDNA recombination, though further studies are needed to determine if this is the primary function of rDNA CpG methylation.


Core histone modifications

The epigenetic status of an rDNA repeat as transcriptionally active or inactive is correlated with the presence of various histone modifications. Studies of rDNA chromatin in a range of systems have indicated that the promoter of active rDNA repeats contains histone marks generally correlated with active transcription, including methylation of lysine 4 of histone H3 (H3K4me) and acetylation of histones H3 and H4 (H3ac, H4ac). In contrast, the promoter of silent rDNA repeats carries histone marks associated with transcriptional repression, including methylation of lysines 9 and 27 of histone H3 and lysine 20 of histone H4 (H3K9me, H3K27me, H4K20me) and hypoacetylation of histones H3 and H4 [2]. A summary of rDNA-associated histone modifications and the proteins regulating their establishment and removal is presented in Table 1-1.

H1: the linker histone weighs in on rRNA transcription

In addition to the core histones, linker histone H1 and its phosphorylated variants have recently been shown to play a variety of roles in rDNA regulation. Depletion of UBF increases the association of histone H1 with the promoter and coding region of active rDNA repeats to establish a "pseudosilent" chromatin state intermediate to that of active and silent repeats, reducing rRNA transcription without increasing CpG methylation. This process is termed methylation-independent silencing [93]. Notably, the role of histone H1 at rDNA does not appear to be strictly repressive. Serine 173-phosphorylated histone H1.2 (H1.2S173p) and serine 187-phosphorylated histone H1.4 (H1.4S187p) are present in nucleoli. Inhibition of Pol I transcription with actinomycin D (ActD) reduces the association of H1.4S187p with the rDNA promoter but increases rDNA promoter association of unmodified histone H1.4, suggesting opposing roles for modified and unmodified histone H1.4 in rRNA transcription [104]. In S. cerevisiae, linker histone Hho1p serves a dual role in rRNA transcriptional regulation, being required for both maximal Pol I processivity and proper rDNA chromatin compaction [105]. Consistent with its role in rDNA compaction, Hho1p also suppresses rDNA recombination in S. cerevisiae [106].

Residue     Modification  Active/repressive  Regulated by
H3K4        me2           Active             ? (eNoSC) [94]; CTCF [44]
H3K4        me3           Active             FBXL10 [95]
H3K9        ac            Active             SIRT1 (eNoSC) [94]; CTCF [44]
H3K9        me2           Both               SETDB1 (NoRC) [96]; SUV39H1 (eNoSC) [94]; G9a [96]; PHF8 [97-99]
H3K9        me3           Repressive         SETDB1 (NoRC) [96]
H3K27       me3           Repressive         EZH2? (NoRC) [100]
H3K36       me1/2         Active             KDM2A [101]
H4K5        ac            Active             HDAC1 (NoRC) [102]
H4K8        ac            Active             HDAC1 (NoRC) [102]
H4K12       ac            Active             HDAC1 (NoRC) [102]
H4K16       ac            Active             MOF [100]; HDAC1 (NoRC) [102]
H4K20       me3           Repressive         SUV4-20? (NoRC) [100,103]
H1.4        un            Repressive         ? [104]
H1.4S187    p             Active             ? [104]
H2A.Z       un            Active             CTCF [44]
H2A.ZK4     ac            Active             ? [44]
H2A.ZK7     ac            Active             ? [44]
H2A.ZK11    ac            Active             ? [44]
H3.3        un            Active             ? [61]

Table 1-1. Histone modifications and variants associated with rDNA

Listed are the core and variant histone modifications that have been found to associate with rDNA. me2: dimethyl, me3: trimethyl, ac: acetyl, p: phosphoryl, un: unmodified. Proteins known to be involved in the establishment or removal of a particular mark are listed in the "regulated by" column. If a protein is present in a complex, the name of the complex is given in parentheses. If a modification is known to be influenced by a complex but the actual factors regulating deposition/removal of the modification are unknown, the complex name is given.

Histone variants

H2A.Z

H2A.Z is a multifunctional histone H2A variant involved in a variety of processes including Polycomb silencing, ESC lineage commitment, chromosome stability and segregation, inhibition of heterochromatin spreading, and foregut development [107-115]. Consistent with its diverse cellular functions, H2A.Z can be localized to both euchromatin and facultative heterochromatin but tends to be excluded from constitutively heterochromatic regions, such as centromeres [116]. Genome-wide analyses indicate that H2A.Z preferentially localizes to promoters but can also occupy a substantial fraction of enhancers in human cells [117-119].

It appears that, as for the canonical core histones, functional diversity of H2A.Z is modulated by posttranslational modifications. Monoubiquitylation of H2A.Z by the RING1b ubiquitin ligase, a component of the Polycomb silencing machinery, appears to be associated with its targeting to facultative heterochromatin, as the majority of H2A.Z associated with the inactivated X chromosome in female cells is monoubiquitylated [116]. H2A.Z can also be acetylated on several residues, and acetylated H2A.Z tends to be associated with active promoters, suggesting that acetylation is critical for H2A.Z functions in transcriptional activation and anti-silencing [114,120].

Recently, ChIP analysis in mouse cells has shown that H2A.Z and its lysine 4/7/11-acetylated (H2A.ZK4/7/11ac) forms occupy a site just upstream of the rDNA spacer promoter and, to a lesser extent, the T0 terminator immediately upstream of the pre-rRNA promoter [121]. This suggests that the H2A.Z-occupied region may function to demarcate the boundary between the repetitive stretch of the IGS and the functionally important 3' IGS region containing the spacer promoter, enhancer repeats, and pre-rRNA promoter. Interestingly, in this study, the highest occupancy of UBF and Pol I was observed at the spacer promoter, further suggesting that the H2A.Z domain serves as a recruitment platform for important components of the rDNA transcription machinery.

Interestingly, recent data suggest that CCCTC-binding factor (CTCF), well known for its functions in transcriptional insulation [122], regulates the levels of H2A.Z proximal to the rDNA spacer promoter. CTCF binds strongly to the rDNA spacer promoter and its depletion reduces levels of not only H2A.Z but also UBF and Pol I at the spacer promoter. Loss of CTCF reduces pre-rRNA levels by ~20% but decreases pRNA levels by over 50% [44]. This finding is somewhat counterintuitive, given that CTCF is generally considered to be a transcriptional repressor and that previous studies have suggested a role for CTCF in the repression of rDNA transcription [123,124]. Thus, the functions of CTCF in rDNA regulation may be context-dependent, and further studies are needed to clarify this matter.

H3.3

H3.3 is an H3 variant with only four amino acid changes from canonical H3 [125]. Notably, three of these four residues alter the nucleosome assembly behavior of H3.3, allowing it to be incorporated into chromatin independent of DNA replication [61]. H3.3 is deposited into active rDNA arrays in Drosophila cells, suggesting that replacement of canonical H3 with H3.3 can rapidly activate rRNA genes that have been silenced by H3 modifications [61]. However, it is not certain to what extent this mechanism functions in rDNA regulation.

Nucleosome positioning

In the mouse, positioning of a single nucleosome is a critical determinant of the transcriptional activity of an rDNA repeat. The promoter nucleosome of active rDNA repeats covers DNA from positions -157 to -2 relative to the rDNA TSS. This nucleosome positioning sequesters the critical CpG at position -133 within the globular domain of the nucleosome, rendering this dinucleotide inaccessible to DNMTs. In contrast, at silent repeats, this nucleosome is moved 25 nt downstream, covering bases -132 to +22 and exposing the critical CpG at -133, which can then be methylated to silence the repeat [88,126].
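The coordinate logic of this model can be checked with a short sketch. The Python below is purely illustrative and not part of any cited analysis: the names Nucleosome, covers, and cpg_accessible are hypothetical, and the rule that a CpG is accessible to DNMTs only when it lies outside the nucleosome footprint is a deliberate simplification. It uses only the coordinates given above (footprints of -157 to -2 and -132 to +22, and the CpG at -133) to show how a 25 nt shift toward the TSS uncovers the critical CpG.

```python
from typing import NamedTuple

CRITICAL_CPG = -133  # position of the critical CpG relative to the rDNA TSS


class Nucleosome(NamedTuple):
    """Promoter nucleosome footprint in TSS-relative coordinates (hypothetical helper)."""
    start: int  # most upstream covered position
    end: int    # most downstream covered position

    def covers(self, position: int) -> bool:
        # Inclusive interval test on TSS-relative coordinates.
        return self.start <= position <= self.end


ACTIVE = Nucleosome(start=-157, end=-2)  # footprint reported for active repeats
SILENT = Nucleosome(start=-132, end=22)  # footprint reported for silent repeats


def cpg_accessible(nucleosome: Nucleosome, cpg: int = CRITICAL_CPG) -> bool:
    """Simplifying assumption: the CpG is accessible only if it lies outside the footprint."""
    return not nucleosome.covers(cpg)


if __name__ == "__main__":
    for label, nuc in (("active", ACTIVE), ("silent", SILENT)):
        state = "accessible" if cpg_accessible(nuc) else "protected"
        print(f"{label}: footprint {nuc.start}..{nuc.end}, CpG {CRITICAL_CPG} is {state}")
```

Under these assumptions, the CpG at -133 falls inside the footprint at active repeats and one nucleotide outside it at silent repeats, which is the exposure the model requires for methylation.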

TTF-I sets the stage for epigenetic regulation of rDNA

TTF-I is a multifunctional protein that associates with terminator elements immediately downstream of the rDNA coding region to mediate termination of Pol I transcription and also associates with a similar terminator element upstream of the rDNA TSS [33,36,37,127]. The association of a transcription termination protein proximal to the rRNA promoter suggested that it might also function in regulating Pol I transcription. Indeed, TTF-I association with the promoter-proximal terminator element upregulates in vivo Pol I transcription [128,129]. In vitro studies also showed that TTF-I was able to induce changes in the structure of nucleosomal templates which correlated with induction of Pol I transcription [130,131]. These results suggest that TTF-I mediates epigenetic regulation of rDNA by recruiting chromatin remodeling and modifying factors.

Formation of heterochromatin at rDNA

There are several protein complexes that regulate the epigenetic state of rDNA, the best characterized of which is the nucleolar remodeling complex (NoRC). NoRC functions to repress rRNA transcription and is composed of two core subunits, the ATPase SNF2H and transcription termination factor for Pol I interacting protein 5 (TIP5), which contains a bipartite PHD/bromodomain unit [132,133]. TIP5 mediates several key events needed for rDNA silencing, beginning with the recruitment of NoRC to rDNA via interaction with TTF-I [134]. The bromodomain of TIP5 then associates with acetylated lysine 16 of histone H4 (H4K16ac), which is required for subsequent deacetylation of this residue and acetylated lysines 5/8/12 of histone H4 (H4K5/8/12ac) [102]. The bipartite PHD/bromodomain unit is also necessary and sufficient for the recruitment of HDACs and DNMTs to rDNA, as a fusion protein consisting of the TTF-I DNA binding domain and the TIP5 PHD/bromodomain unit is able to form heterochromatin at rDNA via recruitment of these repressive chromatin activities [102]. Chromatin modifiers recruited to rDNA in a TIP5-dependent manner include DNMT1/3b, HDAC1, and the H3K9 methyltransferase SETDB1 [96,133,135].

TIP5 activity is also regulated by acetylation. TIP5 is acetylated at lysine 633 (K633) by males absent on the first (MOF), and this acetylation event is necessary for efficient heterochromatin formation at rDNA. A K633R mutant of TIP5 is able to associate more efficiently with TTF-I, suggesting that acetylation weakens the interaction of TIP5 with TTF-I, impairing its rDNA recruitment. Additionally, depletion of the histone deacetylase sirtuin 1 (SIRT1) reduces TIP5 association with rDNA [100]. Thus, it stands to reason that unacetylated TIP5 is recruited to rDNA, where it becomes acetylated and competent to mediate rDNA silencing. While TIP5 is able to recruit DNMTs to rDNA, the chromatin remodeling activity of SNF2H is also required for appropriate rDNA silencing: overexpression of an ATPase-deficient mutant of SNF2H impairs methylation of the rDNA promoter, whereas overexpression of TIP5, which is required for rDNA recruitment of SNF2H, increases the number of nucleosomes at the inactive position [126].

While NoRC is certainly the best-characterized epigenetic negative regulatory complex for rDNA, a number of other chromatin-modifying factors are known to repress rRNA transcription independent of NoRC. This observation likely reflects the fact that a broad range of cellular stimuli influence rRNA synthesis and that different cell types may employ specific mechanisms to repress rRNA transcription. The histone demethylase FBXL10/KDM2B is localized to the nucleolus and is able to demethylate H3K4me3 in human cells and Drosophila [136,137]. The energy-dependent nucleolar silencing complex (eNoSC), containing SIRT1, the H3K9 methyltransferase SUV39H1, and the novel methyltransferase-like protein nucleomethylin (NML), dimethylates H3K9 in response to glucose deprivation, protecting cells against nutrient deprivation-induced apoptosis [94]. Similarly, the KDM2A histone demethylase represses rRNA transcription via demethylation of H3K36me1/2 in response to glucose and serum starvation [101]. It is currently unclear whether these proteins act in concert with or independently of NoRC.

Noncoding RNA transcripts epigenetically regulate rDNA

It has recently been established that short (150-300 nt) noncoding transcripts are produced from the rDNA spacer promoter, located approximately 2 kb upstream of the rDNA TSS. These transcripts are produced by Pol I transcription and are homologous to the core rDNA promoter; thus, they are designated promoter-associated RNA (pRNA) [41-43,138]. NoRC associates with pRNA through the TIP5/ARBP/MBD (TAM) domain of TIP5, and abrogation of this interaction reduces TIP5 rDNA association and impairs heterochromatin formation [102]. Surprisingly, MOF-dependent acetylation of TIP5 reduces its association with pRNA, a striking result in light of the fact that pRNA is necessary for efficient association of TIP5 with rDNA and that TIP5 acetylation is necessary for heterochromatin formation [100]. However, a model to explain these seemingly contradictory results emerges: unacetylated TIP5 is associated with pRNA and is also efficiently bound to TTF-I, ensuring robust rDNA recruitment. Upon rDNA association of TIP5, MOF acetylates K633 of TIP5, mediating release of pRNA from TIP5 and enabling robust heterochromatin formation. A model of NoRC-dependent rDNA silencing incorporating pRNA is depicted in Figure 1-2. Recent work has also shown that pRNA is able to form a DNA:RNA hybrid at the rDNA promoter which mediates rDNA recruitment of DNMT3b [43]. Thus, pRNA functions both in concert with and independent of NoRC in the establishment of silent chromatin at rDNA. Interestingly, it has recently been demonstrated that CTCF enhances transcription from the spacer promoter [44]. Given that CTCF exerts both positive and negative effects on rRNA transcription, this may indicate that pRNA can also activate rRNA transcription, though further studies are necessary to examine this possibility.

Establishment of active rDNA chromatin

The mechanisms regulating the formation of active chromatin at rDNA are not as well studied as NoRC-dependent establishment of heterochromatin; however, recent studies have implicated Cockayne syndrome protein B (CSB) and the WSTF-ISWI chromatin remodeling complex (WICH) as major players in establishing a transcriptionally permissive state at rDNA.

CSB, mutated in Cockayne syndrome, is a SNF2-like ATPase/helicase with demonstrated chromatin-remodeling capabilities [139]. CSB plays an important role in transcription-coupled DNA repair (TCR), which repairs stalled transcription complexes by removing DNA lesions from the transcribed strand [140]. CSB also interacts with a ternary complex of DNA, Pol II, and RNA, suggesting a role in transcription [141]. Consistent with this, CSB-deficient cells display reduced overall transcription [142]. CSB localizes to the nucleolus and interacts with Pol I and TFIIH, both required for rRNA transcription. CSB is capable of stimulating Pol I transcription from a chromatin template, and CSB patient cells display reduced rRNA synthesis [143]. CSB associates with rDNA in a TTF-I-dependent manner, thereby linking TTF-I to targeting of both activating and repressive chromatin-modifying activities to rDNA. Stimulation of Pol I transcription by CSB is dependent on its ATPase activity, demonstrating that CSB indeed remodels rDNA chromatin to activate rRNA transcription. This study also established a surprising link between H3K9 methylation and activation of rRNA transcription. Knockdown of CSB reduces H3K9 methylation, presumed to be a repressive histone modification, across rDNA. CSB interacts with the H3K9 mono- and dimethyltransferase G9a. G9a mediates dimethylation of H3K9 across the promoter and transcribed region of active and silent rDNA repeats, promoting rDNA association of heterochromatin protein 1γ (HP1γ). Efficient association of H3K9me2 and HP1γ with rDNA is dependent on active Pol I transcription, an intriguing finding given that H3K9 methylation and HP1 have generally been presumed to have repressive transcriptional functions via heterochromatin formation [144-147]. Finally, knockdown of G9a or overexpression of a catalytically inactive mutant reduces rRNA levels [96]. In addition to providing mechanistic insight into the regulation of rRNA chromatin structure by CSB, these data provide additional evidence that the distinction between active and repressive functions of H3K9 methylation is not as clear as previously thought and may depend upon the overall chromatin context of the locus of interest.

Figure 1-2. NoRC-dependent silencing of rDNA

(A) TIP5 is maintained in an unacetylated state by SIRT1 and associates with pRNA. NoRC is recruited to the rDNA promoter via the interaction of TIP5 with TTF-I. At this stage, the promoter nucleosome covers positions -157 to -2, the critical CpG at -133 is nucleosomal, and the repeat is transcriptionally active. (B) MOF acetylates TIP5 at K633, leading to the dissociation of pRNA from NoRC. DNMTs, HDACs, histone methyltransferases (HMTs), and histone demethylases (HDMs) are recruited to rDNA. (C) Demethylation of H3K4, deacetylation of H3 and H4, and methylation of H3K9, H3K27, and H4K20 occur. SNF2H slides the promoter nucleosome 25 bp downstream, covering the rDNA TSS and exposing the critical CpG at -133. (D) The CpG at -133 is methylated (represented by a filled circle), completing the silencing of the repeat.

In addition to CSB, WICH has been implicated in the positive regulation of rRNA transcription. WICH contains the Williams syndrome transcription factor (WSTF), which functions in the DNA damage response via its kinase activity [148]. WSTF also contains a bromodomain, which mediates interaction of WSTF with acetylated lysine 12 of histone H2B (H2BK12ac), acetylated lysine 14 of histone H3 (H3K14ac), and H4K16ac [149,150]. WICH also contains the ATPase SNF2H, which is a component of NoRC [132]. A variant of WICH containing nuclear myosin 1 (NM1), a known positive regulator of rRNA transcription [151], stimulates Pol I transcription [152]. These findings suggest that SNF2H functions in the establishment of both active and repressive rDNA chromatin depending on its interaction partners. Interestingly, a large fraction of WICH is found in a larger complex designated B-WICH, which contains NM1 and several other proteins with nucleolar functions including the RNA helicase II/Guα, myb-binding protein 1a, and, notably, CSB, suggesting that B-WICH coordinates the activities of multiple chromatin remodeling proteins to establish a transcriptionally permissive state at rDNA. B-WICH is also associated with pre-rRNA, and B-WICH assembly is dependent upon ongoing Pol I and Pol III transcription [153]. It is not currently known how B-WICH is targeted to rDNA. MOF-dependent H4K16 acetylation at the rDNA promoter may play a role, as the WSTF bromodomain is able to bind this modification [149,150]. Thus, association of B-WICH with rDNA may depend on MOF activity. Another, non-mutually exclusive possibility is that rDNA recruitment of B-WICH is RNA-dependent, as B-WICH is associated with pre-rRNA and its formation is dependent upon ongoing Pol I transcription [153].

Many other factors are known to have a role in establishing a transcriptionally active chromatin state at rDNA, perhaps reflecting the diversity of cellular processes that affect rRNA transcription. SIRT7, a histone deacetylase, promotes rRNA transcription via its catalytic activity [154,155]. This is a surprising finding, given that other histone deacetylases (e.g., HDAC1, SIRT1) function in the establishment of heterochromatin at rDNA [94,100,135]. The histone demethylase PHD finger protein 8 (PHF8) has also been found to positively regulate rRNA transcription through the removal of H3K9me1/2 [97-99]. This list is likely to expand as the mechanisms of cell type-specific rRNA synthesis are further explored, and it will be interesting to determine whether these factors act independently or in concert.

Several studies have also identified factors that influence the level of CpG methylation at active rDNA repeats by either protecting the rDNA promoter from methylation or by targeting it for demethylation. Methyl CpG binding domain protein 3 (MBD3), a member of the MBD family of methylated DNA binding proteins, has a unique function in that its MBD domain mediates its association with unmethylated, rather than methylated, DNA. MBD3 associates with hypomethylated, active rDNA promoters. Knockdown of MBD3 increases CpG methylation at the rDNA promoter and decreases pre-rRNA levels, while overexpression of MBD3 decreases promoter methylation [156]. These data suggest that MBD3 participates in the maintenance of active rDNA chromatin by protecting the rDNA promoter from CpG methylation. Additionally, a recent study has identified an active DNA demethylase, growth arrest and DNA damage-inducible gene 45 alpha (Gadd45a). Gadd45a is a well-established regulator of the DNA damage response, proliferation, apoptosis, and cell cycle progression, and its overexpression induces a global reduction of CpG methylation [157]. Gadd45a is recruited to the rDNA promoter by TAF12, a common component of the Pol I and Pol II basal transcription complexes. Overexpression of Gadd45a reduces CpG methylation at the rDNA promoter and increases pre-rRNA transcription, apparently by counteracting NoRC-dependent CpG methylation. DNA demethylation by Gadd45a is dependent on the actions of nucleotide excision repair (NER) proteins, including xeroderma pigmentosum proteins A, F, and G (XPA, XPF, XPG) and thymine-DNA glycosylase (TDG) [158].

Replication timing of rDNA repeats

The proportion of active and silent rDNA repeats is stably maintained through multiple cell divisions and must be reestablished following DNA replication. In mammalian cells, chromatin structure is linked to replication timing: euchromatin is replicated in early S-phase, while heterochromatin is replicated in late S-phase [159]. rDNA repeats follow this asynchronous pattern, with active repeats replicated in early and silent repeats replicated in late S-phase. NoRC associates exclusively with repressed, late-replicating rDNA repeats, and ectopic expression of TIP5 shifts overall rDNA replication to late S-phase [160]. Acetylation of TIP5 occurs prior to late S-phase, suggesting that TIP5 acetylation primes NoRC to reestablish heterochromatin at silent rDNA repeats following DNA replication [100]. Interestingly, during early development, rDNA replication is synchronous, implying that all copies of rDNA are active prior to implantation. At about the time of implantation, one allele of each locus is stochastically chosen for inactivation. This allele then becomes late-replicating and is silenced by epigenetic mechanisms [161].

Consequences of dysregulated ribosome biogenesis

Crosstalk between ribosome biogenesis and p53

While the primary function of the nucleolus is ribosome biogenesis, it is becoming increasingly clear that the nucleolus functions as a cellular stress sensor, as the majority of cellular stresses lead to nucleolar disruption and subsequent stabilization of p53 [162]. Given the importance of protein synthesis for proper cell proliferation, it is logical that a major source of cellular stress would be disruption of ribosome biogenesis. Indeed, a number of studies have indicated that disruption of any step in the process leads to p53 stabilization followed by cell cycle arrest and, potentially, apoptosis. For instance, inhibition of rRNA processing by overexpression of a truncated form of block of proliferation 1 (Bop1) leads to a p53-dependent cell cycle arrest [163]. Deletion of the S6 ribosomal protein also causes p53-dependent defects in proliferation, indicating that defects in ribosome assembly can also trigger the ribosome biogenesis checkpoint [164]. Finally, p53-dependent proliferation defects are also observed after inhibition of Pol I transcription with low doses of ActD [165].

The level of cellular p53 is regulated by its interaction with mouse double minute 2 (MDM2 and its human homolog HDM2). Under normal cellular conditions, MDM2 inhibits p53 function in two ways: (1) binding directly to p53, blocking its transcriptional activation domain [166], and (2) serving as an E3 ubiquitin ligase for p53, targeting it for proteasomal degradation [167-169]. In contrast, cellular stresses including DNA damage induced by radiation or genotoxic chemicals, overexpression of oncogenes, and viral infection inhibit MDM2 activity and lead to posttranslational modification of p53 that results in its stabilization [170-175].

An interesting model for the crosstalk between ribosome biogenesis and p53 has emerged, wherein disruption of ribosome biogenesis leads to nucleolar disruption and the release of ribosomal proteins, which function to inhibit the degradation of p53 [176]. As stated above, there are many sources of stress that can disrupt ribosome biogenesis. Interestingly, it was shown over 15 years ago that the ribosomal protein RPL5 interacts with MDM2 [177]. It has now become clear from a number of studies that ribosomal proteins, particularly RPL5, RPL11, and RPL23, act to stabilize p53 by interacting with MDM2 and inhibiting its ubiquitin ligase activity [178-181].

This p53-dependent nucleolar stress response also appears to be relevant in vivo. Knockdown of ribosomal protein L11 in zebrafish activates the p53-dependent nucleolar stress response, leading to increased apoptosis and morphological defects in the brain, heart, head, and eyes [182]. Similar activation of the p53 pathway is seen in the Bap28 mutant zebrafish line, which harbors a mutation in a gene homologous to yeast Utp10, a component of a small nucleolar RNA (snoRNA) complex [183]. Morpholino knockdown of 19 of 21 ribosomal proteins in zebrafish yielded specific, reproducible defects in the brain, trunk, eyes, and ears of embryos 25 hours post-fertilization [184]. In humans, disruption of ribosome biogenesis and subsequent p53-dependent apoptosis have been implicated in the pathogenesis of diseases including Treacher Collins syndrome (TCS) and Diamond-Blackfan anemia (DBA), which are discussed below.

Ribopathies: human diseases of ribosome biogenesis

Dysregulation of rRNA transcription and subsequent steps of ribosome biogenesis has recently been recognized as a pathogenic component of several human diseases [28]: Treacher Collins syndrome [185], Diamond-Blackfan anemia [186], dyskeratosis congenita [187], 5q- syndrome [188], cartilage-hair hypoplasia [189], Shwachman-Diamond syndrome [190], and cancers with c-Myc overexpression [191-193]. These disorders involve pleiotropic defects in diverse cell populations and organ systems, highlighting the importance of proper ribosome biogenesis in normal organismal development and cancer.

One of the most well-studied ribopathies is Treacher Collins syndrome (TCS). TCS is a haploinsufficient congenital anomaly syndrome characterized by a variety of craniofacial abnormalities including hypoplasia of the facial bones, dental anomalies, cleft palate, eyelid coloboma, and external and middle ear abnormalities with conductive hearing loss [185]. TCS is caused by mutations in the TCOF1 gene, encoding the protein treacle, as well as mutations in the POLR1D gene, encoding a Pol I subunit [185,194]. Treacle functions to positively regulate rRNA transcription via its interaction with UBF [195] and also participates in 2'-O-methylation of the pre-rRNA in association with the pNop56 pre-ribosomal ribonucleoprotein (pre-rRNP) complex [196,197]. Mice harboring a heterozygous mutation in the Tcof1 gene develop craniofacial features similar to those of human TCS patients as a consequence of massive neuroepithelial apoptosis [198]. This apoptosis is p53-dependent, indicating induction of the nucleolar stress response by treacle haploinsufficiency. Notably, apoptosis in Tcof1-heterozygous mice can be suppressed by genetic and pharmacological inhibition of p53, leading to restoration of wild-type craniofacial features [199].

Another well-characterized ribopathy is Diamond-Blackfan anemia (DBA). DBA is a congenital disorder of red blood cell development characterized by a marked reduction or absence of erythroid precursors in the bone marrow as well as a range of congenital anomalies affecting the craniofacial region, eyes, thumbs, urogenital system, and heart. DBA is caused by heterozygous mutations in the genes encoding several ribosomal proteins, the most frequently mutated of which is RPS19 [200]; thus, the erythropoietic and congenital anomalies associated with DBA may be considered human Minute phenotypes. Downregulation of RPS19 by siRNA in HeLa cells blocks maturation of the 18S rRNA by impairing pre-rRNA processing, leading to accumulation of 18S precursor molecules in the nucleoplasm and increased apoptosis. DBA patient fibroblasts display impaired pre-rRNA processing and nucleolar disorganization [186]. In zebrafish, deficiencies in rps19 and several other ribosomal proteins result in p53 stabilization, indicating involvement of the p53-dependent nucleolar stress response in the pathogenesis of DBA [201].

The Minute mutations

The Minute mutations of Drosophila are a particularly well-studied group of mutations affecting ribosome biogenesis. Minutes are heterozygous mutations that give rise to an array of phenotypes including developmental delay, small body size, short bristles, and poor fertility and viability [202,203]. Although Minutes were first characterized in 1929, it was not until 1985 that the molecular basis of a Minute mutation was discovered to be mutation of a ribosomal protein [204]. Since that time, nearly all Minutes have been traced back to mutations in ribosomal proteins [203]. The developmental phenotypes observed in Minutes (e.g., small body size, short bristles) are consistent with a high requirement for protein synthesis in the developing fly. Further strengthening the link between deficient ribosome biogenesis and developmental defects in Drosophila, knockdown of nucleolar phosphoprotein 140 (Nopp140), which functions in pre-rRNA modification via interaction with small nucleolar ribonucleoproteins (snoRNPs), to approximately half of its normal levels gives rise to Minute-like phenotypes [205]. Notably, the Minute mutations account for the majority of haploinsufficient mutations in Drosophila [203], suggesting that genes involved in ribosome biogenesis are particularly dosage sensitive.

Interestingly, Minute mutations, while relatively common in Drosophila, have not been widely reported in mammalian systems. In the mouse, a single Minute mutation has been reported: Belly spot and tail (Bst). Bst mice display a range of phenotypic features including decreased pigmentation, kinked tail and other skeletal abnormalities, retinal anomalies, and occasional optic nerve [206-208]. The Bst phenotype results from heterozygous mutation of the RpL24 gene, encoding a ribosomal protein, which impairs rRNA processing, protein synthesis, and cell proliferation [208]. In humans, DBA is the only disorder known to be caused by mutation of ribosomal proteins and is discussed in the previous section [200].

General phenomena related to impaired ribosome biogenesis

Studies of deficient ribosome biogenesis in organisms from yeast to humans have revealed several common features. In particular, genes involved in ribosome biogenesis are almost always haploinsufficient, and, surprisingly, disruption of ribosome biogenesis gives rise to tissue-specific defects.

In S. cerevisiae, screening of a heterozygous deletion library revealed that approximately 3% of the yeast genome is haploinsufficient under conditions favoring rapid growth [209]. Nearly half of the genes identified as haploinsufficient under these conditions (49%) were associated with various stages of ribosome biogenesis. In Drosophila, the Minute mutations, a set of over 50 heterozygous mutations in ribosomal proteins, account for the vast majority of haploinsufficient mutations in the fly [203]. Human diseases caused by deficiencies in ribosome biogenesis proteins, such as TCS and DBA, are haploinsufficiency syndromes [185,200]. Overall, genes involved in ribosome biogenesis tend to be haploinsufficient, likely reflecting heightened sensitivity to impaired protein synthesis in highly proliferative cell types.

The finding that ribosomal protein deficiencies lead to malformations in specific tissues is quite interesting, given that genes involved in ribosome biogenesis are generally considered to be "housekeeping genes." In the case of ribosome biogenesis-related proteins expressed in a spatially restricted pattern, such as treacle, it may be sufficient to postulate that the reduction in ribosome biogenesis seen upon its heterozygous mutation would only affect tissues in which it is highly expressed. More perplexing is the tissue specificity of defects resulting from mutations in ribosomal proteins, which are expected to be ubiquitously expressed, such as those seen in DBA or that result from ribosomal protein knockdown in zebrafish [184,200]. Given that ribosome biogenesis disruption generally causes congenital anomalies and defects in highly proliferative cells, such as lymphocytes, erythroid precursors, and neural crest cells [164,198,200], it may simply be that a one-half reduction in the dosage of a protein involved in ribosome biogenesis is not sufficient to sustain the high protein synthesis requirements of rapidly dividing cells during development and adulthood.

CHD proteins, CHD7, and CHARGE syndrome

CHD proteins

The chromodomain helicase DNA binding protein (CHD) family is a highly conserved group of chromatin remodeling enzymes with representatives in all eukaryotes, from its single member, Chd1, in S. cerevisiae, to the nine CHD proteins found in vertebrates [210,211]. Extensive studies of CHD proteins in a variety of model organisms have established these proteins as chromatin-remodeling transcriptional regulators that influence a wide variety of cellular processes including embryonic stem cell (ESC) pluripotency, development of multiple organ systems, β-catenin signaling, transcriptional elongation, DNA damage response, apoptosis, and cell cycle progression [212-222]. Additionally, dysregulation of CHD proteins has been implicated in a range of human diseases: disruption of and variants in the CHD2 gene are associated with [223,224], CHD5 is a tumor suppressor frequently lost in neuroblastomas [225], and mutations in the CHD7 gene are causative in the majority of cases of CHARGE syndrome, a complex developmental disorder [226].

Members of the CHD family are distinguished by the presence of two domains. They contain a pair of N-terminal chromodomains, responsible for binding methylated histones [227], and a central SNF2-like ATPase/helicase domain with chromatin remodeling activity. The chromodomains of several CHD proteins have been shown to associate with H3K4me [228-230], and the helicase activity of several CHD proteins has been demonstrated by nucleosome remodeling assays [231-234]. While all members of the CHD family contain these domains, the family can be further subdivided based on the presence of additional domains [210,211].

CHD7 and CHARGE syndrome

Of all CHD proteins, CHD7 is of particular interest. Heterozygous mutations in the CHD7 gene are found in approximately two-thirds of cases of a complex developmental disorder, CHARGE syndrome, so named for its cardinal clinical features of Coloboma of the eye, Heart malformations, Atresia of the Choanae, Retardation of growth and/or development, Genital hypoplasia, and Ear abnormalities including deafness [226,235]. Not included in this acronym are other prevalent clinical features including hyposmia/anosmia, tracheoesophageal fistula, limb abnormalities, and cleft lip/palate [226,236-238]. Most CHD7 mutations detected in CHARGE syndrome are nonsense or frameshift and are thus predicted to be loss of function. Accordingly, the pathogenic mechanism of CHARGE syndrome is proposed to be haploinsufficiency. Studies in mice support this hypothesis. Mice heterozygous for a W973X mutation in Chd7, known as Whirligig, develop several features reminiscent of human CHARGE syndrome including heart malformations, inner ear anomalies, olfactory defects, and female genital hypoplasia. Mice homozygous for the same mutation die in utero around embryonic day 11 [239,240]. Independent confirmation of haploinsufficiency as the pathogenic mechanism of CHARGE syndrome comes from studies of mice harboring a gene-trapped allele of Chd7, wherein heterozygotes develop similar features to those seen in Whirligig mice while homozygotes die during mid-gestation [241]. Expression analysis of Chd7 by in situ hybridization and immunofluorescence has demonstrated robust, ubiquitous expression of Chd7 early in development with progressive restriction to CHARGE-relevant tissues as development progresses [239,241,242].

The CHD7 gene is located on human chromosome 8q12.1 and spans approximately 188 kb of genomic sequence. It is made up of 38 exons, the first of which is noncoding. The encoded CHD7 protein is 2997 amino acids in length and contains the chromodomains and helicase domain found in all members of the CHD family (Figure 1-3). In addition, CHD7 contains a C-terminal DNA binding domain with a low degree of homology to the DNA/histone-binding SANT domain [243] as well as two C-terminal BRK domains, the function of which is unknown despite solution of their crystal structure [244]. CHD7 also contains five consensus nuclear localization sequences (NLSs), and nuclear localization of CHD7 has been experimentally validated by subcellular fractionation and western blotting [229].

Substantial insight into the potential functions of CHD7 has come from genomic studies. ChIP-chip and ChIP-seq analysis of CHD7 chromatin binding in human and mouse cell types revealed that CHD7 binds predominantly to regions of the genome with characteristics of enhancer elements: they are enriched for monomethylated lysine 4 of histone H3 (H3K4me1), hypersensitive to DNase I digestion, located distal to TSSs, co-occupied by the enhancer-binding protein p300, and generally show a cell type-specific distribution [229,245]. A number of CHD7 sites, both bound and not bound by p300, are able to activate transcription in transient reporter assays, confirming the regulatory potential of CHD7-bound sequences [229]. In mouse ESCs (mESCs), CHD7 colocalizes with members of the core ESC pluripotency network, including Oct4, Sox2, Stat3, and Smad1, and Oct4 knockdown reduces CHD7 chromatin occupancy. Heterozygous and homozygous loss of CHD7 causes subtle changes in gene expression, primarily in the positive direction, implicating CHD7 as a transcriptional "rheostat" that fine-tunes the transcriptional output associated with various enhancers [245].

Figure 1-3. Structure of CHD7

A schematic representing the domain organization of CHD7. Domains are depicted approximately to scale. Consensus NLSs were determined with PredictNLS (https://rostlab.org/owiki/index.php/PredictNLS).

In bone marrow-derived stromal cells, CHD7 forms a complex with Nemo-like kinase (NLK) and the H3K9 methyltransferase SETDB1 to inhibit transactivation by ligand-bound peroxisome proliferator-activated receptor γ (PPARγ) upon Wnt5a treatment. This in turn promotes osteogenic over adipogenic cell fate by inducing Runx2 expression [246]. Interestingly, this study also demonstrated association of CHD7 with not only H3K4me3 but also H3K9me3. This is the only study to demonstrate association of CHD7 with a histone modification other than H3K4me, and the significance of this finding is currently not clear.

Recent work combining genetic, genomic, and biochemical approaches in human cells and Xenopus laevis tadpoles has provided a glimpse into novel functions of CHD7 potentially underlying CHARGE pathogenesis. Knockdown of CHD7 in human neural crest-like cells (hNCLCs) leads to impaired formation of migratory neural crest cells as measured by TWIST expression levels. Morpholino knockdown of CHD7 or overexpression of an ATPase-deficient mutant of CHD7 in Xenopus tadpoles causes several CHARGE-like phenotypes and is accompanied by reduced neural crest cell migration and altered expression of the neural crest-related genes Sox9, Twist, and Slug. In hNCLCs, proteomic analysis revealed that CHD7 associates with the Polybromo, BRG1-associated factors (PBAF) chromatin remodeling complex. CHD7 and BRG1 also co-occupy a number of sites on chromatin. Morpholino knockdown of Brg1 and Brd7, subunits of PBAF, recapitulated the phenotypes associated with Chd7 knockdown, indicating an important role for the interaction of CHD7 and PBAF in neural crest development and potentially suggesting that mutations in the PBAF complex could also lead to CHARGE syndrome [214].

Studies of CHD7 in the mouse have also yielded interesting insights into its function. Analysis of the developing olfactory system in Chd7Gt/+ mice revealed numerous defects, including olfactory bulb hypoplasia, disorganized olfactory sensory neurons, and altered regeneration of olfactory sensory neurons after chemical ablation. Further analysis revealed reduced neural stem cell proliferation in the olfactory epithelium of these mice, suggesting a mechanism for the olfactory defects seen in humans and mice with CHD7 mutations [238].

Analysis of a conditional-null Chd7 allele also revealed that heterozygous and homozygous loss of Chd7 in the developing inner ear led to misregulation of several genes important for its development. Genetic depletion of CHD7 also reduced cell proliferation in the inner ear [247]. Analysis of independent Chd7 gene-trap mouse lines has also suggested that CHD7 cooperates with the transcription factor Tbx1 in pharyngeal ectoderm development [248]. Thus, the roles of CHD7 in development appear to be quite diverse.

Summary and research aims

The transcription of rRNA is essential to life and accounts for up to 80% of all cellular transcription in growing cells [1]. Transcription of rRNA is limiting for ribosome biogenesis and subsequent protein synthesis and cell proliferation. Nearly all signaling pathways regulated by growth factors or nutrient levels influence rRNA synthesis [1,17]. Dysregulation of rRNA transcription or subsequent stages of ribosome biogenesis leads to cell cycle arrest and apoptosis, and has been implicated in a number of human diseases [28]. Despite the central importance of rRNA transcription in cellular life, rRNA genes have not been analyzed by genomic methods due to their exclusion from current genome builds. The application of genomic technologies to the investigation of rDNA could lead to novel insights into chromatin-level regulation of rRNA transcription and suggest novel avenues of investigation into disease pathogenesis.

To this end, we developed a novel bioinformatic approach, outlined in Chapter 2. We show that short sequence reads can be accurately aligned to a single copy of rDNA in the context of the whole human genome. This approach yields several novel insights regarding the chromatin structure and transcription of rDNA. We also present the first ChIP-seq analysis of two proteins involved in rRNA transcription, Pol I and UBF, and show that UBF is bound throughout the genome and may have a role in general transcriptional regulation. Lastly, the insulator-binding protein CTCF is associated with rDNA near the junction between adjacent repeats, indicating that transcriptional insulation may play a role in rDNA regulation.

We also apply this method to the investigation of CHD7, a chromatin remodeling protein mutated in CHARGE syndrome. CHARGE syndrome is a developmental disorder characterized by a variable constellation of congenital birth defects [226]. Interestingly, CHARGE syndrome shares considerable clinical overlap with established disorders of ribosome biogenesis including TCS and DBA [185,200]. In Chapter 3, we aligned CHD7 ChIP-seq data to a build of the human genome containing a single rDNA repeat and found substantial enrichment of CHD7 throughout the rRNA coding region. Subsequent analysis of CHD7 subcellular localization confirms that a significant amount of cellular CHD7 is localized to the nucleolus. Knockdown studies reveal a role for CHD7 in upregulating rRNA synthesis by counteracting DNA methylation at the rDNA promoter. Lastly, we observe reduced levels of pre-rRNA in CHARGE-relevant tissues from Chd7-heterozygous mice.

Taken together, these studies demonstrate that alignment of high-throughput sequencing data to rDNA can be done accurately and efficiently. This method can not only reveal novel insights into the basic chromatin biology of rDNA but also suggest novel avenues of investigation into the pathogenesis of diseases suspected to involve dysregulation of rRNA transcription. Indeed, we also show that CHD7, mutated in CHARGE syndrome, has a role in rRNA synthesis, and we propose that dysregulation of rRNA synthesis may contribute to the pathogenesis of CHARGE syndrome.

Chapter 2

Integrative genomic analysis of human ribosomal DNA

Gabriel E. Zentner¹, Alina Saiakhova¹, Pavel Manaenkov², Mark D. Adams¹,³,⁴, and Peter C. Scacheri¹,³

¹Department of Genetics, ²Department of Electrical Engineering and Computer Science, ³Case Comprehensive Cancer Center, and ⁴Case Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA

A modified version of this chapter was previously published as:

Zentner, G.E., Saiakhova, A., Manaenkov, P., Adams, M.D., Scacheri, P.C. (2011). Integrative genomic analysis of human ribosomal DNA. Nucleic Acids Res. 39(12):4949-4960.

Abstract

The transcription of rRNA is critical to life. Despite its importance, rDNA is not included in current genome assemblies, and consequently, genomic analyses to date have excluded rDNA. Here, we show that short sequence reads can be aligned to a genome assembly containing a single rDNA repeat. Integrated analysis of ChIP-seq, DNase-seq, MNase-seq, and RNA-seq data reveals several novel findings. First, the coding region of active rDNA is contained within nucleosome-depleted open chromatin that is highly transcriptionally active. Second, histone modifications are located not only at the rDNA promoter but also at novel sites within the intergenic spacer. Third, the distributions of active modifications are more similar within and between different cell types than repressive modifications. Fourth, we show that UBF, a positive regulator of rRNA transcription, binds to sites throughout the genome. Lastly, we show that the insulator-binding protein CTCF associates with the spacer promoter of rDNA, suggesting that transcriptional insulation plays a role in regulating the transcription of rRNA. Taken together, these analyses confirm and expand the results of previous ChIP studies of rDNA and provide novel avenues for exploration of chromatin-mediated regulation of rDNA.

Introduction

The transcription of rRNA is a critical process for all cells, accounting for up to 80% of all cellular RNA production [1]. Deficiencies in rRNA transcription lead to reduced ribosome biogenesis, altered cellular growth, and increased cell death [249]. Highlighting its central importance in cellular function, dysregulation of rRNA biogenesis has been implicated in many human diseases [28]: Treacher Collins syndrome [185], Diamond-Blackfan anemia [186], dyskeratosis congenita [187], 5q- syndrome [188], cartilage-hair hypoplasia [189], Shwachman-Diamond syndrome [190], and cancers with amplification of the c-Myc oncogene, the product of which is a positive regulator of rRNA transcription [191-193]. Accordingly, rRNA transcription is tightly regulated at many levels, including that of chromatin structure.

The diploid human genome contains ~400 copies of a 43 kb rDNA unit tandemly arrayed in nucleolar organizer regions (NORs) on the five acrocentric chromosomes. Each unit contains approximately 13.3 kb of sequence encoding the 28S, 5.8S, and 18S rRNAs (hereafter referred to as the “coding region”) and a noncoding intergenic spacer (IGS) containing an enhancer, spacer promoter, and the core promoter of the adjoining rDNA repeat [2]. Transcriptionally active rDNA is euchromatic, hypomethylated at CpG sites and marked with histone modifications generally associated with transcriptionally active nucleoplasmic genes (e.g. H3K4me3 and H3K9ac). Transcriptionally silent rDNA is heterochromatic, hypermethylated and marked with repressive histone modifications (e.g. and H4K20me3) [2]. The coding sequence of each transcriptionally active rDNA unit is transcribed by RNA polymerase I (Pol I) into a pre-ribosomal RNA (pre-rRNA) containing the sequences of the 18S, 5.8S, and 28S rRNA species [250,251]. The mature rRNA species are generated through a complex series of RNase cleavages and chemical modifications and subsequently assembled into ribosomes [35].

The recent development of ChIP-seq [252] has allowed for rapid assessment of protein occupancy throughout the genome. However, because rDNA is not included in reference genome assemblies to which sequence reads are normally aligned, virtually no ChIP-seq studies reported to date have analyzed rDNA. In this report, we show that short-sequence reads generated from ChIP-seq experiments can be accurately aligned to genome assemblies containing rDNA. We present the locations of nine histone modifications at rDNA in multiple cell types, confirming and expanding the results of previous ChIP studies that have focused primarily on the rDNA promoter.

Results from analysis of MNase-seq, DNase-seq, and RNA-seq data are consistent with a model wherein the coding region of rDNA is contained within nucleosome-depleted open chromatin that is highly transcriptionally active. We also present the first ChIP-seq analyses of human UBF and RPA116 (the second largest subunit of Pol I), validating their distribution along rDNA and demonstrating extensive nucleoplasmic chromatin association of UBF. We report the association of CTCF with the rDNA spacer promoter, suggesting the presence of an insulator element. Taken together, our results provide a high-resolution map of chromatin structure at rDNA and provide a reference for future studies of chromatin-mediated regulation of rDNA.

Results

Alignment of high-throughput sequencing data to rDNA

There are approximately 400 copies of rDNA in the average mammalian genome arranged in variable orientations on several chromosomes [2]. Sequencing of rDNA loci was not performed during the sequencing of the human genome [253,254], and thus current genome assemblies do not contain rDNA. Despite the high copy number of rDNA, each individual unit is similar in repetitiveness to the human genome as a whole (48.88% for rDNA versus 50% for the whole genome). Therefore, in principle, alignment of short sequence reads to rDNA should not be any more problematic than alignment of reads to other regions of the genome. We implemented the following pipeline for analysis of ChIP-seq data. First, the complete sequence of one rDNA unit was added to the human genome assembly (HG18). We reasoned that if reads were aligned to rDNA alone, out of the context of the whole genome assembly, reads derived from elsewhere in the genome with sequences similar to those occurring in rDNA might be forced to align to rDNA, resulting in false positives. The rDNA sequence was added to the proximal tip of chromosome 13, on which rDNA is endogenously located, so that ChIP-seq signals corresponding to rDNA could be easily compared to those on nucleoplasmic chromatin. We call this assembly HG18_plus_rDNA. To further reduce false positives, we discarded non-unique reads and reads aligning to more than one region of the genome. Finally, to mitigate the effects of systematic biases that might be present in the data, ChIP-seq signals were normalized against control input DNA from each cell type. Importantly, input libraries prepared and sequenced by independent labs were analyzed and found to be similar to one another, suggesting that the stability of the rDNA in a given cell type is not likely to affect the signal output (Figure 2-1). We note that this analysis pipeline, which incorporates multiple filtering steps, is highly conservative and inherently designed to retain only robust rDNA signals that are likely genuine. In addition, with the exception of one region located 2-5 kb into the repeat, the sequence of the rDNA locus is of sufficient uniqueness to avoid false negatives. However, the limitation of this approach is that the signal obtained at rDNA is an aggregate of signals at all immunoprecipitated rDNA repeats and does not discriminate between transcriptionally active and repressed rDNA.
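The filtering logic of the pipeline described above can be illustrated with a minimal Python sketch. This is not the code used in this work: the function names, the dictionary keys (`seq`, `chrom`, `pos`, `n_hits`), the choice to keep one copy of each duplicated read, and the clipping of input-subtracted bins at zero are all assumptions made for illustration only.

```python
def add_rdna_to_chromosome(chrom_seq, rdna_seq):
    """Place one rDNA unit at the start of a chromosome sequence.

    Treating the 'proximal tip' as coordinate 0 is an assumption.
    """
    return rdna_seq + chrom_seq


def filter_alignments(alignments):
    """Keep reads that map to exactly one location, collapsing duplicates.

    Each alignment is a dict with hypothetical keys: 'seq' (read sequence),
    'chrom', 'pos' (0-based start), and 'n_hits' (number of genomic
    locations reported for the read by the aligner).
    """
    seen = set()
    kept = []
    for aln in alignments:
        if aln["n_hits"] != 1:      # read aligns to more than one region
            continue
        if aln["seq"] in seen:      # duplicate (non-unique) read
            continue
        seen.add(aln["seq"])
        kept.append(aln)
    return kept


def bin_counts(alignments, chrom, n_bins, bin_size):
    """Count retained reads per fixed-size bin along one chromosome."""
    counts = [0] * n_bins
    for aln in alignments:
        if aln["chrom"] == chrom:
            idx = aln["pos"] // bin_size
            if 0 <= idx < n_bins:
                counts[idx] += 1
    return counts


def subtract_input(chip, control):
    """Subtract input signal from ChIP signal bin by bin (clipped at 0)."""
    return [max(c - i, 0) for c, i in zip(chip, control)]
```

In such a scheme, a read whose sequence also matches a locus elsewhere in the genome is dropped rather than forced onto rDNA, which is the rationale given above for aligning against the whole assembly.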

Distribution of histone modifications at rDNA

We first analyzed publicly available ChIP-seq data from K562 cells generated by the ENCODE consortium. K562, a lymphoblastoid chronic myelogenous leukemia line, was first selected for analysis because these cells are grown under specific guidelines established by the ENCODE consortium, thus minimizing experimental variability due to culture conditions. In total, we analyzed nine histone modifications, six generally associated with transcriptional activation (H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K27ac, and H3K36me3) and three generally associated with transcriptional repression (H3K9me1, H3K27me3, and H4K20me1). Consistent with previous analysis of rDNA by standard ChIP, we detected strong enrichment of the active modifications H3K4me2/3 and H3K9ac at the rDNA promoter, just upstream of the transcription start site (TSS) (Figure 2-2). This region also shows enrichment of H3K4me1 and H3K27ac, marks that have not previously been analyzed at rDNA. Enrichment of active modifications was also detected within the IGS of rDNA, at a site located ~28-29 kb into the rDNA repeat (Figure 2-2). H3K4me1 ChIP-PCR was used to verify enrichment at this region (Figure 2-3). Little or no enrichment of active modifications was detected in the coding region. This is an interesting observation, as it has been previously speculated that the coding region of active rDNA repeats is nucleosome-poor or completely free of nucleosomes [255]. In general, enrichment of active modifications is more punctate than that of repressive modifications, which are broadly distributed along the IGS and sometimes within the coding region of rDNA (e.g. H3K9me1; Figure 2-2). Interestingly, H3K36me3, a mark generally associated with the bodies of transcriptionally elongating genes [256], is virtually absent from the coding region of the rDNA. While the significance of this finding is currently unclear, the results may suggest that the function of H3K36me3 differs between the nucleoplasm and nucleolus.

Figure 2-1. Comparison of input DNA samples from K562 cells

To assess systematic bias in our ChIP-seq analyses, we compared two K562 input DNA samples from two independent laboratories. Sequenced input libraries were aligned to rDNA as described in the text. The input DNA profiles are generally similar, suggesting that variability in input DNA is not likely to distort results upon subtraction of input from ChIP signals. The “Broad” input dataset was obtained from the Broad Histone track and the “Yale” input dataset was obtained from the Yale TFBS track of the UCSC Genome Browser.

To address the significance of the ChIP-seq signals at rDNA, we compared the intensity of peaks at the rDNA to peaks on nucleoplasmic chromatin. One might predict, since sequence reads were aligned to a genome assembly containing only one copy of the rDNA, that signals at rDNA would appear inflated relative to nucleoplasmic chromatin. However, there are a number of unknown factors that could influence the signal at rDNA, including but not limited to ChIP efficiency, the proportion of active versus repressive rDNA loci, the copy number of rDNA in a given cell type, and the scaling of signals at high-copy sequences during peak detection. Nevertheless, we would expect genuine ChIP-seq signals at rDNA to be at least as intense as nucleoplasmic signals. Indeed, the intensities of all histone signals at rDNA were equal to or higher than those at nucleoplasmic genes (Figure 2-4). Similar results were found for ChIP-seq signals analyzed in additional cell types.

Figure 2-2. Distribution of histone modifications at rDNA in K562 cells

Figure 2-3. H3K4me1 ChIP-PCR in K562 cells

ChIP was performed to validate the observation of active histone modification enrichment at ~28-29 kb in the rDNA IGS. The “IGS” primer sets are under the active modification peak and the “coding” primer sets are within the rDNA coding region, where active marks are not enriched. See Table 2-3 for primer details. Error bars represent mean + SD for triplicates.

Visual inspection of the data shows that the active marks tend to show similar profiles, an observation that is also apparent for the repressive marks. To address this observation in an unbiased fashion, the data were median-smoothed in 100 bp windows, and pairwise correlations between all nine histone modifications were performed. The correlation scores were then hierarchically clustered and plotted in a heatmap. The results clearly show that, with the exception of H3K36me3, the active and repressive modifications separate into two distinct groups (Figure 2-5). These data indicate that overall, active histone marks are distributed similarly along the rDNA in K562 cells, and that repressive marks tend to correlate more with one another than with active marks.
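The smoothing, correlation, and clustering steps described above can be sketched as follows. This is an illustrative sketch rather than the implementation used for Figure 2-5: the use of non-overlapping windows, Pearson correlation, a distance of 1 - r, and average linkage are assumptions, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def median_smooth(signal, window=100):
    """Median of each consecutive, non-overlapping window of the signal."""
    return [median(signal[i:i + window]) for i in range(0, len(signal), window)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:        # a flat track has no defined correlation
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_matrix(tracks):
    """Pairwise correlations between all tracks (dict: name -> signal)."""
    names = sorted(tracks)
    corr = {(a, b): pearson(tracks[a], tracks[b]) for a in names for b in names}
    return names, corr


def average_linkage(names, corr):
    """Agglomerative clustering with distance 1 - r; returns merge history."""
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean(1 - corr[(a, b)]
                         for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

With tracks that fall into two anticorrelated groups, the final merge in the returned history joins the two groups, which corresponds to the top-level split seen in a clustered heatmap.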

Cell type-specificity of histone marks at rDNA

We next investigated whether the distribution of histone modifications at rDNA in K562 cells is similar to that in other cell types, including HUVEC (human umbilical vein endothelial cells), H1-hESC (human embryonic stem cells), and NHEK (normal human epidermal keratinocytes). Similar to K562 cells, enrichment of the active marks H3K4me1/2/3 and H3K9ac was detected at the promoter of rDNA and at the 28-29 kb site located in the IGS. HUVEC and NHEK cells also show enrichment of H3K27ac at the TSS (data not available for H1-hESC). H1-hESC shows similar enrichment of H3K4me1/2/3 and H3K9ac at the promoter and IGS, but shows additional peaks within the IGS at ~15 and ~20 kb. With respect to the repressive histone modifications, there is less similarity to K562, and in fact, each cell type shows a fairly specific pattern of enrichment (Figures 2-2, 2-6 through 2-8). Overall, the data indicate that each histone mark shows some level of cell-type specificity, though cluster analyses reveal distinct groupings of active and repressive marks in each cell type that are similar to those in K562 (Figure 2-5).

Figure 2-4. Normalized tag density scores for histone modifications

To compare histone modification ChIP-seq signal intensity at rDNA versus nucleoplasmic chromatin, the intensity of the 3 strongest rDNA peaks was averaged and divided by the average intensity of the 100 strongest peaks on chromosome 13. The result was designated the normalized tag density for that modification in that cell type.

Figure 2-5. Correlation heatmaps of pairwise comparisons between median signals for all histone modifications at rDNA in K562 cells, HUVECs, H1-hESCs, and NHEKs
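The normalized tag density defined in the legend of Figure 2-4 amounts to a ratio of two means. A minimal sketch is given below; the function name and the representation of peaks as plain lists of intensities are assumptions made for illustration.

```python
def normalized_tag_density(rdna_peaks, chr13_peaks, n_rdna=3, n_ref=100):
    """Mean of the strongest rDNA peaks over mean of the strongest chr13 peaks.

    rdna_peaks and chr13_peaks are lists of peak intensities; if fewer than
    n_rdna or n_ref peaks are supplied, all available peaks are used.
    """
    top_rdna = sorted(rdna_peaks, reverse=True)[:n_rdna]
    top_ref = sorted(chr13_peaks, reverse=True)[:n_ref]
    if not top_rdna or not top_ref:
        raise ValueError("both peak lists must be non-empty")
    ref_mean = sum(top_ref) / len(top_ref)
    if ref_mean == 0:
        raise ValueError("reference peaks have zero mean intensity")
    return (sum(top_rdna) / len(top_rdna)) / ref_mean
```

A value of 1 or greater indicates that the strongest rDNA peaks are at least as intense as the strongest peaks on the reference chromosome.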

Visual inspection of the data suggests that the distribution of active marks tends to be more consistent among the four cell types than that of repressive marks. To test this more systematically, we performed pairwise linear regression analyses between all four cell types for each modification and plotted the average R2 values on a color scale (Figure 2-9A; see Table 2-1 for R2 values). The results indicate that the three active modifications associated with the promoter and IGS of rDNA (H3K4me2/3 and H3K9ac) are most conserved between cell types. H3K27ac and H3K4me1, marks that localize to gene enhancer elements on nucleoplasmic chromatin, are the next most conserved. This scenario is reminiscent of that in the nucleoplasm, where marks associated with promoters tend to be invariant between cell types while marks associated with enhancers display cell-type specificity [257,258]. The repressive marks and H3K36me3 show the lowest degree of correlation between cell types. Three-dimensional plots of H3K4me2 and H3K27me3 ChIP-seq data in all cell lines analyzed are presented to illustrate this point (Figure 2-9B).

Figure 2-6. Distribution of histone modifications at rDNA in HUVECs

Figure 2-7. Distribution of histone modifications at rDNA in H1-hESCs

Figure 2-8. Distribution of histone modifications in NHEKs

Figure 2-9. Comparison of rDNA histone marks across multiple cell types

(A) Gradient plot representing the average correlation score for each histone modification at rDNA. (B) Three-dimensional representations of H3K4me2 and H3K27me3 ChIP-seq data, illustrating the general correlation of active modifications and lack of correlation of repressive modifications between cell types.

Modification/protein 1  Modification/protein 2  R2 value
H3K4me1_K562            H3K4me1_HUVEC           0.13
H3K4me1_K562            H3K4me1_hESC            0.28
H3K4me1_K562            H3K4me1_NHEK            0.13
H3K4me1_HUVEC           H3K4me1_hESC            0.03
H3K4me1_HUVEC           H3K4me1_NHEK            0.74
H3K4me1_hESC            H3K4me1_NHEK            0.02
H3K4me2_K562            H3K4me2_HUVEC           0.78
H3K4me2_K562            H3K4me2_hESC            0.68
H3K4me2_K562            H3K4me2_NHEK            0.79
H3K4me2_HUVEC           H3K4me2_hESC            0.44
H3K4me2_HUVEC           H3K4me2_NHEK            0.92
H3K4me2_hESC            H3K4me2_NHEK            0.44
H3K4me3_K562            H3K4me3_HUVEC           0.72
H3K4me3_K562            H3K4me3_hESC            0.63
H3K4me3_K562            H3K4me3_NHEK            0.22
H3K4me3_HUVEC           H3K4me3_hESC            0.53
H3K4me3_HUVEC           H3K4me3_NHEK            0.41
H3K4me3_hESC            H3K4me3_NHEK            0.19
H3K9ac_K562             H3K9ac_HUVEC            0.47
H3K9ac_K562             H3K9ac_hESC             0.6
H3K9ac_K562             H3K9ac_NHEK             0.55
H3K9ac_HUVEC            H3K9ac_hESC             0.1
H3K9ac_HUVEC            H3K9ac_NHEK             0.95
H3K9ac_hESC             H3K9ac_NHEK             0.13
H3K9me1_K562            H3K9me1_HUVEC           0.09
H3K9me1_K562            H3K9me1_NHEK            0.26
H3K9me1_HUVEC           H3K9me1_NHEK            0.28
H3K27ac_K562            H3K27ac_HUVEC           0.09
H3K27ac_K562            H3K27ac_NHEK            0.09
H3K27ac_HUVEC           H3K27ac_NHEK            0.97
H3K27me3_K562           H3K27me3_HUVEC          0.001
H3K27me3_K562           H3K27me3_hESC           0.04
H3K27me3_K562           H3K27me3_NHEK           0.04
H3K27me3_HUVEC          H3K27me3_hESC           0.02
H3K27me3_HUVEC          H3K27me3_NHEK           0.49
H3K27me3_hESC           H3K27me3_NHEK           0.02
H3K36me3_K562           H3K36me3_HUVEC          0.005
H3K36me3_K562           H3K36me3_hESC           0.008
H3K36me3_K562           H3K36me3_NHEK           0.0004
H3K36me3_HUVEC          H3K36me3_hESC           0.01
H3K36me3_HUVEC          H3K36me3_NHEK           0.42
H3K36me3_hESC           H3K36me3_NHEK           0.05
H4K20me1_K562           H4K20me1_HUVEC          0.04
H4K20me1_K562           H4K20me1_hESC           0.00005
H4K20me1_K562           H4K20me1_NHEK           0.0002
H4K20me1_HUVEC          H4K20me1_hESC           0.03
H4K20me1_HUVEC          H4K20me1_NHEK           0.07
H4K20me1_hESC           H4K20me1_NHEK           0.04
DNase_K562              DNase_HUVEC             0.89
DNase_K562              DNase_hESC              0.88
DNase_K562              DNase_NHEK              0.9
DNase_HUVEC             DNase_hESC              0.97
DNase_HUVEC             DNase_NHEK              0.94
DNase_hESC              DNase_NHEK              0.9
CTCF_K562               CTCF_HUVEC              0.97
CTCF_K562               CTCF_hESC               0.88
CTCF_K562               CTCF_NHEK               0.98
CTCF_HUVEC              CTCF_hESC               0.76
CTCF_HUVEC              CTCF_NHEK               0.99
CTCF_hESC               CTCF_NHEK               0.78

Table 2-1. Correlation coefficients for pairwise comparisons

Comparisons were made between the median signals in 100 bp windows at rDNA for individual histone modifications, DNase, and CTCF in different cell types and their R2 values were reported.
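The average R2 score plotted for each modification can be sketched as below. For a simple linear regression of one track on another, R2 equals the squared Pearson correlation, which is what this sketch computes; the function names and the dictionary layout (cell type mapped to a smoothed signal) are assumptions, not the analysis code used here.

```python
from itertools import combinations


def r_squared(x, y):
    """R2 of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:        # flat track: no variance to explain
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def average_pairwise_r2(tracks):
    """Mean R2 over all pairs of cell types for one modification.

    tracks maps a cell-type label to its signal along rDNA.
    """
    pairs = list(combinations(sorted(tracks), 2))
    if not pairs:
        raise ValueError("need at least two cell types")
    return sum(r_squared(tracks[a], tracks[b]) for a, b in pairs) / len(pairs)
```

As a worked example from Table 2-1, the six pairwise values for H3K4me2 (0.78, 0.68, 0.79, 0.44, 0.92, and 0.44) average to approximately 0.68, whereas the six values for H3K27me3 average to approximately 0.10.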

Chromatin accessibility and transcription of rDNA

The rDNA has an exceptionally high transcriptional output, producing up to 80% of all RNA in a cell [1]. This fact suggests that the chromatin of rDNA is readily accessible to the Pol I machinery and therefore likely to be open. The results of DNase-seq analysis, which measures accessibility of chromatin based on sensitivity to DNase I digestion [259], indicate that the coding region, parts of the IGS, and the promoter region of rDNA are in an open state (Figure 2-10A). DNase hypersensitivity was also highly correlated between cell types (R2 = 0.88-0.97; Table 2-1).

We next assessed the levels of transcripts emanating from rDNA using small RNA-seq data from K562 nucleoli. As expected, extremely high levels of RNA corresponding to the coding region of rDNA were detected. It is also known that ~150 nt rDNA promoter-associated RNAs (pRNAs) are involved in rDNA silencing. These transcripts are produced from a spacer promoter located ~2 kb upstream of the core rDNA promoter [40-42,138]. We therefore adjusted the Y-axis scale of the RNA-seq data to look for pRNA and other potentially low-abundance rDNA transcripts. We detected RNA signal corresponding to pRNA at the promoter region (Figure 2-10B), as well as weak signals throughout the IGS. Notably, one such signal is located ~28-29 kb into the repeat, at the same location where high enrichment of active histone marks was detected (Figures 2-2, 2-6 through 2-8). We next assessed the strand specificity of the mapped RNA-seq reads. In total, 75.5% (335,816/444,558) of nucleolar RNA-seq reads mapped uniquely to rDNA. Of these, only 16 mapped to the negative strand. We therefore conclude that, under normal cellular conditions, transcription of rDNA predominantly occurs in the sense direction. These results are consistent with previous studies assessing strandedness of rDNA transcription [91].

Figure 2-10. Chromatin accessibility, transcription, and nucleosome occupancy at rDNA

(A) Profile of DNase hypersensitivity at rDNA in K562, HUVEC, H1-hESC, and NHEK cells. The DNA sequence of the region at 2-5 kb contains large stretches of high identity to other genomic regions, and thus reads matching this region would have been discarded during alignment. Therefore, the reduction in signal at 2-5 kb is likely a false negative. (B) Nucleolar small RNA-seq profile of rDNA in K562 cells. The lower panel shows a zoomed-in view of rDNA from ~14-43 kb with a reduced Y-axis to show low-abundance RNAs. RNA-seq signal corresponding to pRNA is indicated by an arrow. (C) Profile of nucleosome occupancy at rDNA in CD4+ T cells as determined by MNase-seq.

Nucleosome occupancy of rDNA

The IGS of both active and inactive rDNA loci contains nucleosomes.

However, whether the coding region of active rDNA repeats contains nucleosomes or is nucleosome-free is controversial [255]. We assessed nucleosome occupancy of rDNA using data obtained by high-throughput sequencing after micrococcal nuclease digestion (MNase-seq) in CD4+ T cells

[260] (Figure 2-10C). On average, median signal across the coding region was

~2.2-fold lower than that across the IGS of the locus (3.19 vs. 7.01, P = 1.18 × 10⁻⁹). We therefore conclude that, overall, the coding region of rDNA has reduced nucleosome occupancy relative to the IGS. However, because active and inactive rDNA repeats were sampled together in the MNase-seq assay, we cannot discern whether the coding region of active rDNA repeats is completely nucleosome-free and the residual signal is due to nucleosome occupancy only at inactive rDNA loci, or if the coding regions of both active and inactive copies of rDNA are nucleosome-depleted relative to the IGS.

ChIP-seq analysis of Pol I chromatin association


Transcription of rRNA is mediated by Pol I and its associated basal

transcription machinery [250,251]. ChIP-seq of RPA116, the second-largest

subunit of the Pol I holoenzyme, showed high enrichment of Pol I across the

promoter and coding region of rDNA (Figure 2-11A). This result is consistent

with previous studies [261,262] and was validated by ChIP-PCR (Figure 2-

11B). We also detected 31 RPA116 peaks on non-rDNA chromatin. Many of

these nucleoplasmic peaks were located close to centromeres, a known

source of false positives in ChIP-seq experiments [263]. Most of the nucleoplasmic peaks also appeared irregular, with a jagged, discontinuous distribution of tags rather than the smooth gradations in tag density seen at legitimate peaks. We therefore conclude that the nucleoplasmic signals are artifacts, and that Pol I is exclusively nucleolar.

ChIP-seq analysis of UBF chromatin association

UBF (upstream binding factor) maintains the rDNA promoter in an open

state and is necessary for the formation of the Pol I preinitiation complex [67].

ChIP-seq of UBF revealed substantial enrichment in the promoter and coding

region of rDNA (Figure 2-11C), consistent with previous work [261,262] and

confirmatory ChIP-PCR assays (Figure 2-11D). The distributions of UBF and

RPA116 were highly similar to one another in HEK293T cells (R2 = 0.87). UBF

binding in HEK293T and K562 was also strongly correlated (R2 = 0.94), as was

HEK293T UBF and K562 RPA116 binding (R2 = 0.88).

Figure 2-11. ChIP-seq analysis of Pol I and UBF rDNA association

(A) ChIP-seq profile of the RPA116 subunit of Pol I binding to rDNA. (B) ChIP-PCR validation of RPA116 rDNA binding. (C) ChIP-seq profile of UBF binding to rDNA. (D) ChIP-PCR validation of UBF rDNA binding. See Table 2-3 for primer details. Error bars represent mean + SD for triplicates.

UBF has previously been shown to bind the CCND1 promoter and activate β-catenin-responsive reporter genes, suggesting a non-nucleolar function for UBF in β-catenin signaling [264,265]. We detected robust enrichment of UBF on nucleoplasmic chromatin in HEK293T (1,796 peaks) and K562 (43 peaks) cells (Figure 2-12A). The association of UBF with nucleoplasmic chromatin was confirmed by standard ChIP-PCR with two different antibodies (Figure 2-12B). We also verified specificity by performing UBF ChIP-PCR following knockdown of UBF (Figure 2-12C). In HEK293T cells, 70.4% of the UBF peaks were located within 2 kb of a RefSeq TSS, 22.3% were intergenic, and 7.3% were intragenic. A much smaller fraction of UBF sites (44.2%) in K562 were located near TSSs, while the majority (53.5%) of sites were located in intergenic regions (Figure 2-13A). Analysis of UBF binding to all TSSs in the genome revealed that UBF was bound to approximately 10% and 1% of all TSSs in HEK293T and K562, respectively, suggesting that some UBF TSS binding events are slightly below the FDR threshold used (Figure 2-13B). Of the 43 UBF peaks in K562, 33 (76.7%) overlapped with those detected in HEK293T (Figure 2-13C).

We next correlated the UBF ChIP-seq data to gene expression. The results indicate that genes containing high levels of UBF at the TSS are generally expressed at relatively high levels (Figure 2-13D). Transcription factors and genes involved in developmental processes and nucleobase metabolism were significantly over-represented among genes with a significant UBF peak at their TSS in HEK293T (see online supplementary material [266]). Despite the low number of UBF sites in K562, analysis revealed that transcription factor and nucleobase metabolism genes were significantly over-represented (see online supplementary material [266]). Genes encoding nucleolar and ribosomal proteins were not over-represented among UBF-bound genes in either cell type. Interestingly, in HEK293T cells, genes involved in Wnt signaling were significantly over-represented among UBF-enriched genes, in support of previous studies suggesting a role for UBF in β-catenin signaling [265,267]. Motif analysis revealed 150 and 56 transcription factor binding motifs within significant UBF peaks in HEK293T and K562, respectively (see online supplementary material [268]). These results raise the possibility that UBF collaborates with a variety of cofactors to regulate nucleoplasmic transcription.

Figure 2-12. UBF associates with nucleoplasmic chromatin

(A) UCSC Genome Browser view of UBF binding on human in HEK293T and K562 cells. A zoomed-in view is shown in the lower panel. (B) ChIP-PCR at nucleoplasmic regions bound by UBF in HEK293T and K562 cells using antibodies H-300 (Santa Cruz #9131, used for ChIP-seq) and F-9 (Santa Cruz #13125). Four rDNA regions, two bound (-0.3 kb and +2.1 kb) and two not bound (+18.5 kb and +30.6 kb) by UBF, are included as controls for ChIP efficiency and their values are plotted on the right Y-axis. (C) UBF ChIP was performed after siRNA knockdown of UBF (see Figure 2-14A for knockdown western blot). Three pairs of UBF-bound and -unbound nucleoplasmic sites were tested. Two rDNA sites, one bound and the other not bound by UBF, were also assayed. All UBF-bound, but not -unbound, sites tested showed a significant reduction in ChIP signal following knockdown. Error bars represent mean + SD for triplicates. *, P < 0.05; **, P < 0.01 by t-test.

Figure 2-13. Analysis of nucleoplasmic UBF peaks

(A) Distribution of UBF binding sites with respect to RefSeq genes. (B) UBF signal ± 5 kb of all unique TSSs in the human genome in descending order of average signal intensity. The panels on the right show a zoomed-in view of the topmost area of each heatmap, showing details of UBF signal in each cell type. (C) Venn diagram indicating overlap of UBF binding sites in HEK293T and K562 cells. (D) Density histograms of gene expression levels for all genes, genes with significant UBF peaks < 2 kb from their TSS, or genes with low or no UBF binding at their TSS. HEK293T UBF enriched vs. all genes, P = 1.85 × 10⁻²⁸; HEK293T UBF enriched vs. non-enriched genes, P = 3.29 × 10⁻²¹; K562 UBF enriched vs. all genes, P = 0.0013; K562 UBF enriched vs. non-enriched genes, P = 0.0031 by t-test.

To test whether binding of UBF to nucleoplasmic genes has a functional effect, we knocked down UBF with siRNA (Figure 2-14A) and quantified transcript levels of 11 UBF-bound genes. Two of the eleven genes (PRCC and

PSMD14) showed a modest but significant decrease in expression 48 hours

after UBF knockdown in HEK293T cells (Figure 2-14B). No genes were

increased upon UBF knockdown. These data suggest that UBF functions to

modestly increase expression of a subset of nucleoplasmic genes. Further

studies are required to determine whether the effect of UBF on nucleoplasmic

transcription is generally modest, if other targets are more dramatically affected

upon UBF depletion, or if complete loss of UBF has a more dramatic effect.

Figure 2-14. UBF regulates nucleoplasmic gene transcription

(A) Western blot showing depletion of UBF protein 48 hours after siRNA transfection in HEK293T cells. (B) Expression analysis of nucleoplasmic genes 48 hours after UBF siRNA transfection in HEK293T cells (n = 3). Error bars represent mean + SEM for biological replicates. *, P < 0.05 by t-test.

The insulator-binding protein CTCF associates with rDNA

Because rDNA is organized into tandemly repeated arrays (NORs), it stands to reason that there is a mechanism to prevent leaky transcription between repeats. Such a mechanism could be an insulator element. Insulator elements generally function to demarcate discrete transcriptional units and prevent inappropriate transcription [269]. A large proportion of conserved DNase hypersensitive sites overlap with CTCF, a well-characterized insulator-binding protein [270]. Additionally, CTCF has been shown to localize to the nucleolus and repress rRNA transcription [123]. We therefore aligned CTCF ChIP-seq data from K562, HUVEC, H1-hESC, and NHEK cells to HG18_plus_rDNA and found that CTCF was highly enriched at the 3’ end of rDNA, at the spacer promoter (Figure 2-15A). The binding pattern of CTCF was highly consistent across multiple cell types (R2 = 0.76-0.99; Table 2-1), suggesting that this CTCF binding site serves an important, conserved function. Normalized tag density for CTCF was ~11-32, suggesting that CTCF is present at many copies of rDNA (Figure 2-15B). ChIP-PCR confirmed the association of CTCF with rDNA in K562 cells (Figure 2-15C). We also aligned CTCF ChIP-seq data from mESCs to a build of the mouse genome containing a single rDNA repeat and observed CTCF binding at the spacer promoter of rDNA (Figure 2-16). Notably, an independent laboratory has also recently observed CTCF binding to human and mouse rDNA at a site similar to that observed in our analyses, further validating our findings [44].

We searched the sequence under the CTCF peak in human and mouse rDNA for consensus binding motifs using the CTCF Binding Site Database (CTCFBSDB) [271]. Three of four positional weight matrices used by CTCFBSDB yielded scores indicating a suggestive match for a CTCF consensus binding site in human rDNA (Table 2-2A), and all positional weight matrices yielded suggestive scores for a CTCF site in mouse rDNA (Table 2-2B). Combined with our analysis of DNase-seq data demonstrating hypersensitivity at this location (Figure 2-10A), these results suggest that the 3’ end of rDNA, in addition to promoting the transcription of rRNA, also acts as an insulator element to demarcate the boundaries of each rDNA repeat and/or repress rRNA transcription. This model is supported by a previous study demonstrating CTCF-mediated transcriptional repression of rDNA in human cells [123]. However, while this study was under review, a report was published demonstrating that CTCF promotes association of UBF, Pol I, and active histone marks with the rDNA spacer promoter in mouse and human cells [44]. Therefore, CTCF may act in a context-dependent manner to regulate rDNA transcription in both the positive and negative directions.

Figure 2-15. CTCF is associated with human rDNA

(A) ChIP-seq profile of CTCF at rDNA in K562, HUVEC, H1-hESC, and NHEK cells. The putative CTCF motif is indicated by an asterisk. (B) Normalized tag density scores for CTCF at rDNA. (C) ChIP-PCR validation of CTCF rDNA association in K562 cells using two primer sets under the CTCF ChIP-seq peak. Primer details are listed in Table 2-3. Error bars represent mean + SD for triplicates.

Figure 2-16. CTCF binds to mouse rDNA

CTCF ChIP-seq data from mESCs were aligned to a custom build of the MM8 genome assembly containing a single rDNA repeat. The CTCF peak is situated at the spacer promoter, consistent with its location in human cells. The position of the putative CTCF motif is indicated by an asterisk.

Motif PWM   Motif sequence         Motif start   Motif orientation   Score
REN_20      GCGGCCGCCAGATGGAGCCC   42097         -                   17.0737
MIT_LM2     CGGCCGCCAGATGGAGCCC    42097         -                   0.0963824
MIT_LM7     CGGCCGCCAGATGGAGCCCG   42096         -                   6.10893
MIT_LM23    CGGCCGCCAGATGGAGCCCG   42096         -                   8.53602

Table 2-2A. CTCF motifs under the CTCF peak at human rDNA predicted in silico with CTCFBSDB [272]. Nucleotides 41941-42480 were analyzed and the motif start position is given relative to the rDNA TSS. PWM: positional weight matrix. The score is the log-odds ratio of the binding site being generated by the CTCF motif versus the background sequence, with scores > 3 considered to be suggestive matches. The sequence of the putative CTCF motif within human rDNA is 5’-GCGGCCGCCAGATGGAGCCC-3’, on the - strand.

Motif PWM   Motif sequence         Motif start   Motif orientation   Score
REN_20      GTCACCACTAGGTGTCGCCC   43124         -                   12.6699
MIT_LM2     TCACCACTAGGTGTCGCCC    43124         -                   16.4354
MIT_LM7     TCACCACTAGGTGTCGCCCG   43123         -                   13.2717
MIT_LM23    TCACCACTAGGTGTCGCCCG   43123         -                   18.7405

Table 2-2B. CTCF motifs under the CTCF peak at mouse rDNA predicted in silico with CTCFBSDB [272]. Nucleotides 41941-42480 were analyzed and the motif start position is given relative to the rDNA TSS. The sequence of the putative CTCF motif within mouse rDNA is 5’-GTCACCACTAGGTGTCGCCC-3’, on the - strand.

Discussion

The rDNA has posed significant obstacles to genomic analysis and has thus far been analyzed only by standard ChIP and other biochemical techniques. Here, we integrated ChIP-seq, DNase-seq, MNase-seq, and RNA-seq datasets to assemble a high-resolution map of chromatin structure at the human rDNA locus that verifies and expands the results of previous rDNA ChIP studies. Additionally, our results complement those of previous genome-wide ChIP-seq studies by focusing on a critically important region of the genome that has thus far not been analyzed by this method. We present four novel findings. First, histone modifications at rDNA are located not only at the promoters but also at sites within the IGS. Second, the distributions of active modifications are more similar within and between different cell types than those of repressive modifications. Third, UBF, primarily a nucleolar transcription factor, binds to many sites on nucleoplasmic chromatin. Fourth, the insulator-binding protein CTCF associates with rDNA, at a site situated between the spacer and core promoters, suggesting that transcriptional insulation plays a role in regulation of rRNA transcription.

The significance of the peak of active modifications present in multiple cell types at the ~28-29 kb site within the IGS is not yet clear. The sequence composition of this region is unremarkable, with a ~55% GC content and 3.8% of bases masked due to the presence of a simple repeat. This region also contains a stretch of DNA similar to a clone hypomethylated in sperm but not somatic tissues [273]. It is possible that this region harbors an as yet uncharacterized transcription unit. Another possibility is that this region harbors a novel functional element associated with active transcription, such as an enhancer. Enhancers are generally marked with H3K4me1/2, located distal to TSSs, and usually hypersensitive to DNase digestion [257,258]. The 28-29 kb IGS site has these same characteristics. Further studies, such as reporter assays using this region of rDNA, could help to test these possibilities. It is important to note that these two possibilities are not mutually exclusive, as it has recently been shown that active enhancers can produce short transcripts [274].

UBF, well known as a regulator of rRNA transcription, is generally thought to be a nucleolus-specific factor. However, previous studies have suggested that UBF may function outside the nucleolus as a mediator of β-catenin signaling [265,267]. We find that UBF is bound throughout the genome, suggesting that UBF may function in a broader range of cellular processes than previously appreciated. We show that siRNA-mediated depletion of UBF leads to a modest decrease in the expression of two nucleoplasmic genes, PRCC and PSMD14. While the majority of cellular UBF is likely to function in rRNA transcription, we propose that a small amount of UBF may have an extranucleolar transcriptional role.

There are limitations associated with this study. First, because we have included only a single rDNA repeat in our genome assembly, all sequences aligning to rDNA will “pile up” at the single copy of rDNA; thus, the signals we observe are an aggregate of signals at all rDNA copies immunoprecipitated in each ChIP experiment. We are therefore unable to determine how many copies of rDNA contain a particular histone mark or are bound by a given protein. Another limitation of this study is that the datasets we have analyzed likely contain a mixed population of active and inactive rDNA repeats, and thus, the data represent an average of these two populations. Therefore, we cannot definitively conclude whether the distinct patterns of active and repressive histone marks seen occur on independent repeats or if they coexist on the


same repeat. However, previous studies combining ChIP with methylation-

sensitive restriction digest (ChIP-chop) and bisulfite sequencing have

demonstrated that marks associated with transcriptional activation (H3K4me,

H3K9ac, H4ac) tend to associate with DNA-hypomethylated, ostensibly active repeats, while repressive modifications (H3K9me, H4K20me) associate with hypermethylated, silent repeats [2]. We therefore suggest that the distinct

patterns of modifications we observe occur on independent repeats.

Taken together, our analyses provide the first high-resolution picture of

chromatin structure at rDNA. Our results provide novel insight into a region not

previously studied by ChIP-seq and serve as a reference for further studies of

chromatin-mediated regulation of rDNA. Future studies focusing on the regulatory potential of the rDNA CTCF binding site and the peak of active modifications ~28-29 kb into the rDNA repeat will be particularly informative in

delineating novel modes of chromatin-level regulation at this critical region of

the genome.


Materials and Methods

Cell culture, siRNA knockdown, and gene expression analysis

HEK293T cells were cultured in DMEM supplemented with 10% FBS

and 50 µg/ml gentamicin at 37°C, 5% CO2. K562 cells were cultured in RPMI

medium 1640 supplemented with 10% FBS and 50 µg/ml gentamicin at 37°C,

5% CO2. For UBF knockdown, HEK293T cells were transfected with a

control or UBF siRNA SmartPool (Dharmacon). Cells were harvested 48 hours after transfection for analysis. UBF depletion was assayed by western blot (rabbit anti-UBF, Santa Cruz #9131, 1:1000 and rabbit anti-, ICN

BioMedicals, 1:5000). RNA was extracted using TRIzol (Invitrogen) and cDNA was prepared using the High Capacity cDNA Archive Kit (ABI). We selected 11 genes with significant UBF binding at their TSS in both HEK293T and K562 cells (ATG2A, C10ORF140, CNOT4, IPP, KDELR1, MED26,

PCBP2, PRCC, PSMD3, PSMD14, SETD2) to analyze in HEK293T cells following UBF knockdown. TaqMan probes (ABI) were used to assay the expression of each gene on a GeneAmp 7300 real-time thermal cycler (ABI).

GAPDH was used as endogenous control for all reactions.

ChIP

ChIP was performed as described in the Appendix. For PCR, triplicate wells were run for each primer set using SYBR Green (ABI) on an ABI

7300 real-time thermal cycler. Relative enrichment was calculated using the

ΔΔCt method. See Table 2-3 for primer details. Following successful ChIP-

PCR, 30 µl ChIP or 500 ng input DNA was used to prepare sequencing libraries


as described [275]. Sequencing of ChIP and input libraries was performed on

an Illumina Genome Analyzer II at the Case Western Reserve University

Genomics Core. We obtained the following unique read numbers: HEK293T

RPA116: 14,103,573; HEK293T UBF: 15,473,371; HEK293T input: 20,913,306;

K562 UBF: 8,801,725; K562 input: 12,188,805. Antibodies used for ChIP were

rabbit anti-RPA116 (a gift from Ingrid Grummt, 5 µl/ChIP), rabbit anti-UBF (H-

300, Santa Cruz #9131, 5 µg/ChIP), mouse anti-UBF (F-9, Santa Cruz #13125,

5 µg/ChIP), rabbit anti-H3K4me1 (Abcam #8895, 8 µg/ChIP), and rabbit anti-

CTCF (Millipore #07729, 10 µl/ChIP). ChIP-seq datasets generated in this study have been deposited to the SRA (SRA027342).
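The ΔΔCt calculation of relative enrichment mentioned above can be illustrated with a minimal sketch. The text does not state which region or sample served as the reference, so the choice below (ChIP versus input, target region versus an unbound control region) is an assumption, as are the function name, the Ct values, and the assumption of roughly 100% amplification efficiency.

```python
def delta_delta_ct(ct_target_sample, ct_ref_sample,
                   ct_target_calibrator, ct_ref_calibrator):
    """Return relative enrichment as 2^(-ddCt).

    Assumes each PCR cycle doubles the product (about 100% efficiency).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** (-(delta_sample - delta_calibrator))


def mean(values):
    """Average the Ct values of replicate wells."""
    return sum(values) / len(values)


# Hypothetical triplicate Ct values (not data from this study).
chip_bound = [24.1, 24.0, 23.9]     # ChIP DNA, candidate bound region
chip_unbound = [29.0, 29.1, 28.9]   # ChIP DNA, unbound control region
input_bound = [22.0, 22.1, 21.9]    # input DNA, candidate bound region
input_unbound = [22.0, 22.0, 22.0]  # input DNA, unbound control region

enrichment = delta_delta_ct(mean(chip_bound), mean(chip_unbound),
                            mean(input_bound), mean(input_unbound))
print(f"relative enrichment: {enrichment:.1f}-fold")
```

The same function could be applied to expression data by passing the Ct of the gene of interest and of the endogenous control (GAPDH in this study) for knockdown and control samples.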

Datasets

Data for histone modifications, CTCF, and a corresponding input from

K562, HUVEC, H1-hESC, and NHEK cells were obtained from the ENCODE

Broad Histone track of the UCSC Genome Browser. DNase-seq data from

K562, HUVEC, H1-hESC, and NHEK cells were obtained from the ENCODE

Duke/UNC/UT Open Chromatin track of the UCSC Genome Browser. Short

nucleolar RNA-seq data from K562 cells were obtained from the ENCODE

CSHL small RNA-seq track of the UCSC Genome Browser. MNase-seq data from CD4+ T cells [260] and CTCF ChIP-seq data from mESCs [276] were obtained from the SRA (SRX000168 and SRX000540, respectively).

Sequencing data alignment and analysis

Because rDNA is not included in the human genome assembly, we created a custom build of HG18. We removed the unsequenced bases near


the centromere of chromosome 13 and added a full, non-repeat masked

human rDNA repeat (GenBank accession no. U13369), yielding

“rDNA_chr13”. A custom HG18 assembly containing rDNA_chr13 rather

than chromosome 13 was constructed with bowtie-build [277]. We

designated this HG18 build "HG18_plus_rDNA." A similar genome build was

constructed for mouse, wherein a full, non-repeat masked mouse rDNA repeat (GenBank accession no. BK000964) was added to the MM8 assembly. This build was designated “MM8_plus_rDNA.”

Datasets were aligned to our custom assemblies with Bowtie [277],

allowing two mismatches per read. Prior to alignment, non-unique reads were

removed from each FASTQ file. During alignment, reads with more than one

reportable alignment were discarded using the “-m 1” option. Peaks were

detected with F-seq [278]. Fragment size was set to 200 for all analyses

except DNase, for which it was 0. We analyzed sequenced input samples as

described above and subtracted the signal at each base from the

corresponding base of the ChIP data using R. Input subtraction was

performed prior to all analyses of ChIP-seq data in this study.
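Two of the steps above, removal of non-unique reads from a FASTQ file and base-by-base input subtraction, can be sketched as follows. This is an illustration only: the input subtraction in this study was done in R, and the text does not say whether one copy of a duplicated read was retained or whether negative values were clipped. The sketch assumes that the first copy is kept, that FASTQ records span exactly four lines, and that negative differences are left as they are. All function names are hypothetical.

```python
import io


def unique_fastq_records(handle):
    """Yield (header, sequence, quality) for the first read with each sequence."""
    seen = set()
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of file (or trailing blank line)
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        if sequence not in seen:
            seen.add(sequence)
            yield header, sequence, quality


def subtract_input(chip_signal, input_signal):
    """Subtract the input signal from the ChIP signal at each base."""
    if len(chip_signal) != len(input_signal):
        raise ValueError("ChIP and input tracks must cover the same bases")
    return [chip - inp for chip, inp in zip(chip_signal, input_signal)]


# Toy FASTQ with one duplicated sequence (hypothetical reads).
fastq_text = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTTT\n+\nIIII\n"
kept = list(unique_fastq_records(io.StringIO(fastq_text)))
corrected = subtract_input([5.0, 3.0, 0.0], [1.0, 1.0, 1.0])
print([record[0] for record in kept], corrected)
```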

For detection of RPA116 and UBF peaks throughout the whole genome,

we used Sole-Search [279]. ChIP-seq and corresponding input datasets, with

non-unique reads removed, were aligned to HG18 without rDNA using Bowtie,

allowing only unique alignments. Bowtie alignment files were converted to

TagAlign format and uploaded to the SoleSearch web server, using an FDR of

0.001. Detailed information on nucleoplasmic UBF peaks is available online as part of the supplementary material for the publication related to this study [268].

To compare nucleosome occupancy at the coding region to that at the IGS of rDNA, we obtained the median MNase-seq signal in 100 bp windows along the whole rDNA repeat. We averaged the median signal for the coding region (windows 1-133, representing 0-13.3 kb of the rDNA repeat) and IGS (windows 134-430, representing 13.4-43 kb of the rDNA repeat). The median signals of the coding region and IGS were compared by t-test.
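The windowing and comparison just described can be sketched in a few lines. The text does not specify which form of the t-test was used, so the Welch (unequal variance) statistic below is an assumption, and only the test statistic is computed, not the P value. The signal is synthetic and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, median, variance


def window_medians(per_base_signal, window=100):
    """Median signal in consecutive, non-overlapping windows."""
    return [median(per_base_signal[start:start + window])
            for start in range(0, len(per_base_signal), window)]


def welch_t(group_a, group_b):
    """Welch t statistic for two groups of window medians."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def compare_coding_to_igs(per_base_signal, n_coding_windows=133):
    """Average window medians for coding region and IGS, then compare them."""
    medians = window_medians(per_base_signal)
    coding = medians[:n_coding_windows]   # windows 1-133
    igs = medians[n_coding_windows:]      # windows 134-430
    return mean(coding), mean(igs), welch_t(coding, igs)


# Synthetic 43 kb track: lower signal over the first 133 windows.
signal = []
for window_index in range(430):
    level = 3.0 if window_index < 133 else 7.0
    signal.extend([level + (window_index % 2)] * 100)

coding_mean, igs_mean, t_statistic = compare_coding_to_igs(signal)
print(coding_mean, igs_mean, t_statistic)
```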

Correlation analysis

To obtain data points for correlations, the rDNA was divided into 100 bp windows and the median signal for each window was obtained. To compare modifications, DNase hypersensitivity, RPA116, UBF, and CTCF at rDNA between cell types, least-squares regression analysis was performed in R. To generate an average correlation score for each histone modification, we averaged the R2 values for all cell type comparisons done for a given modification. Three-dimensional plots of histone modifications at rDNA were generated with MatLab. See Table 2-1 for a complete list of R2 values determined by these analyses.
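As a sketch of the correlation step, the R2 of a simple least-squares regression between two windowed signal tracks equals the squared Pearson correlation, and the average score for a mark is the mean over all cell type pairs. The analysis in this study was done in R; the Python below is only an illustration, with hypothetical function names and made-up window values.

```python
from itertools import combinations


def r_squared(x, y):
    """R2 of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def mean_pairwise_r2(signal_by_cell_type):
    """Average R2 over all pairwise cell type comparisons for one mark."""
    pairs = list(combinations(sorted(signal_by_cell_type), 2))
    total = sum(r_squared(signal_by_cell_type[a], signal_by_cell_type[b])
                for a, b in pairs)
    return total / len(pairs)


# Hypothetical window medians for one histone mark in three cell types.
tracks = {
    "K562": [1.0, 2.0, 3.0, 4.0],
    "HUVEC": [2.0, 4.0, 6.0, 8.0],
    "NHEK": [1.0, 3.0, 2.0, 4.0],
}
average_r2 = mean_pairwise_r2(tracks)
print(round(average_r2, 2))
```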

To assess rDNA co-occupancy of histone modifications within a cell type, the rDNA was windowed to 100 bp as above and pairwise comparisons between all pairs of histone modifications for a given cell type were performed in R. Matrices of the pairwise correlation scores were then plotted as heatmaps using the gplots R package (http://cran.r-project.org/web/packages/gplots/index.html).


Analysis of nucleoplasmic UBF peaks

To determine the distribution of significant peaks with respect to RefSeq

genes, we used the Location-Analysis feature of the ChIP-seq Tool Set

(http://chipseq.genomecenter.ucdavis.edu/cgi-bin/chipseq.cgi), which uses the

UCSC known genes list to define the start and end of genes. UBF peaks < 2 kb

from a RefSeq TSS were binned into the TSS category. Peaks in exons or introns > 2 kb downstream of the TSS in RefSeq genes were placed in the exon and intron categories, and peaks > 2 kb upstream of a RefSeq TSS or otherwise outside of a RefSeq gene were considered intergenic. Detailed

location analysis results are available online as part of the supplementary

material for the publication related to this study [268].
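The binning rules above can be expressed as a short function. This is not the ChIP-seq Tool Set implementation; it is a sketch under several assumptions: each peak is represented by a single position (for example its midpoint), each gene by its transcript bounds and exon intervals, and when a peak could fall into several categories the order TSS, exon, intron, intergenic is used. The gene records are hypothetical.

```python
from collections import namedtuple

Gene = namedtuple("Gene", "name strand tx_start tx_end exons")

PRIORITY = ["TSS", "exon", "intron", "intergenic"]


def tss_position(gene):
    """The TSS is the transcript start on '+' and the transcript end on '-'."""
    return gene.tx_start if gene.strand == "+" else gene.tx_end


def classify_against_gene(position, gene, tss_window=2000):
    """Category of one peak position relative to one gene."""
    if abs(position - tss_position(gene)) < tss_window:
        return "TSS"
    if gene.tx_start <= position <= gene.tx_end:
        in_exon = any(start <= position <= end for start, end in gene.exons)
        return "exon" if in_exon else "intron"
    return "intergenic"


def classify_peak(position, genes, tss_window=2000):
    """Best category over all genes, using the assumed priority order."""
    labels = [classify_against_gene(position, gene, tss_window) for gene in genes]
    return min(labels, key=PRIORITY.index, default="intergenic")


# Hypothetical gene models on one chromosome.
genes = [
    Gene("GENE_A", "+", 10000, 30000, [(10000, 10500), (20000, 20500)]),
    Gene("GENE_B", "-", 50000, 60000, [(50000, 50200), (59000, 60000)]),
]
for peak in (9000, 20100, 15000, 40000, 61000):
    print(peak, classify_peak(peak, genes))
```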

To determine UBF signal at all TSSs in the human genome, the complete

list of RefSeq human genes was downloaded from the UCSC Genome Browser

and merged to a file containing all UBF binding sites throughout the genome.

The median input-normalized UBF signal ± 5 kb of each TSS in 200 bp windows was determined. TSSs were sorted ascending by gene name and then descending by median UBF signal intensity. The list was filtered to include only unique records, removing duplicate TSSs and retaining the TSS with the highest median signal for a given gene. Signals were then Z-score transformed, sorted descending by average signal intensity for each TSS, and heatmapped with

Java TreeView [280].
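The filtering and ordering of TSSs for the heatmap can be sketched as below. The text does not state whether the Z-score transformation was applied across all values or per row, so the global transformation here is an assumption. The records and function names are hypothetical, and each signal list stands in for the windowed UBF signal around one TSS.

```python
from statistics import mean, median, pstdev


def top_tss_per_gene(records):
    """Keep, for each gene, the TSS with the highest median signal.

    records: iterable of (gene_name, [windowed signal values]).
    """
    ordered = sorted(records, key=lambda rec: (rec[0], -median(rec[1])))
    best = {}
    for gene, signal in ordered:
        best.setdefault(gene, signal)  # first entry per gene has the top median
    return best


def zscore_all(best):
    """Z-score transform using the mean and SD of all retained values."""
    values = [value for signal in best.values() for value in signal]
    center, spread = mean(values), pstdev(values)
    return {gene: [(value - center) / spread for value in signal]
            for gene, signal in best.items()}


def heatmap_order(zscores):
    """Gene names sorted by descending average signal."""
    return sorted(zscores, key=lambda gene: mean(zscores[gene]), reverse=True)


records = [
    ("B", [1.0, 1.0, 1.0]),
    ("A", [0.0, 0.0, 0.0]),
    ("A", [4.0, 5.0, 6.0]),  # alternative TSS of gene A with higher signal
    ("C", [9.0, 9.0, 9.0]),
]
best = top_tss_per_gene(records)
order = heatmap_order(zscore_all(best))
print(order)
```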

Gene ontology analysis was performed using the PANTHER

Classification System [281]. For each cell type, a list of genes with significant


UBF peaks < 2 kb from their TSS was uploaded to PANTHER. Lists were then

analyzed using the Biological Process, Pathways, and PANTHER Protein

Class options. The Homo sapiens gene list was used as the background gene

list. Results of PANTHER analyses are available online as part of the

supplementary material for the publication related to this study. Motif analysis

was performed with the Cis Element Annotation System (CEAS) [282] and

detailed results are available online as part of the supplementary material for

the publication related to this study [268]. Analysis of overlap between

HEK293T and K562 peaks was performed with the GFF-Overlap feature of the

ChIP-seq Tool Set.
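The overlap analysis between peak sets can be illustrated with a simple interval comparison. This is not the GFF-Overlap implementation; the criterion used here (at least one shared base on the same chromosome) is an assumption, and the peak coordinates are made up.

```python
def intervals_overlap(peak_a, peak_b):
    """True if two (chrom, start, end) peaks share at least one base."""
    return (peak_a[0] == peak_b[0]
            and peak_a[1] <= peak_b[2]
            and peak_b[1] <= peak_a[2])


def count_overlapping(query_peaks, reference_peaks):
    """Number and fraction of query peaks overlapping any reference peak."""
    hits = sum(1 for query in query_peaks
               if any(intervals_overlap(query, ref) for ref in reference_peaks))
    return hits, hits / len(query_peaks)


# Hypothetical peaks standing in for the smaller and larger peak sets.
small_set = [("chr1", 100, 200), ("chr1", 900, 950), ("chr2", 100, 200)]
large_set = [("chr1", 150, 400), ("chr2", 500, 700), ("chr1", 940, 1000)]

hits, fraction = count_overlapping(small_set, large_set)
print(hits, round(fraction, 3))
```

With the counts reported in the Results, 33 of 43 K562 peaks, the same calculation gives 33/43, or 76.7%.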

Comparison of UBF ChIP-seq data to expression data

Publicly available expression data were downloaded from GEO.

Accession numbers are as follows: HEK293T, GSE21092 and K562, GSE8832.

Datasets were generated with the Affymetrix GeneChip Human Genome U133

Plus 2.0 array platform. Replicates were normalized to one another using the

RMA method [283] included in the affy R package and averaged. For genes

represented by multiple probes, the probe with the highest average expression

value was retained for analysis. Expression values for three categories of genes were obtained: all genes, genes with a significant UBF peak < 2 kb from their TSS as determined by SoleSearch, and genes without substantial UBF binding at their TSS. Statistical significance between groups was assessed using t-tests. P values were multiplied by two to account for the two comparisons made against the genes with a significant SoleSearch peak.
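Two bookkeeping steps in this comparison, retaining the probe with the highest average expression per gene and multiplying P values by the number of comparisons, can be sketched as follows. The original analysis used R; the function names and values here are hypothetical, and capping the corrected P value at 1 is a common convention rather than something stated in the text.

```python
def collapse_probes(probe_rows):
    """Keep the highest average expression value among probes for each gene.

    probe_rows: iterable of (gene_name, [replicate expression values]).
    """
    best = {}
    for gene, replicates in probe_rows:
        average = sum(replicates) / len(replicates)
        if gene not in best or average > best[gene]:
            best[gene] = average
    return best


def adjust_p_value(p_value, n_comparisons=2):
    """Multiply a P value by the number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)


# Hypothetical normalized expression values (two replicates per probe).
probes = [
    ("GENE_A", [5.0, 7.0]),
    ("GENE_A", [8.0, 8.0]),  # second probe for GENE_A, higher average
    ("GENE_B", [3.0, 3.0]),
]
expression = collapse_probes(probes)
print(expression, adjust_p_value(0.01), adjust_p_value(0.8))
```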


Name         Forward primer (5'-3')  Reverse primer (5'-3')  Coordinates
hrDNA-1.0    CCGTGGGTTGTCTTCTGACT    AAGCGAAACCGTGAGTCG      -1092/-964
hrDNA-0.3    GATCCTTTCTGGCGAGTCC     GGAGCCGGAAGCATTTTC      -410/-272
hrDNA-0      GTGTGTGGCTGCGATGGT      CCAACCTCTCCGACGACAG     -156/+43
hrDNA+0.1    CGACCTGTCGTCGGAGAG      GGACGCGCGAGAGAACAG      +21/+153
hrDNA+0.7    CCTCCAGTGGTTGTCGACTT    GAACGACACACCACCGTTC     +681/+869
hrDNA+1.2    GGTCGTGTGTGGGTTGACTT    GCGGTACGAGGAAACACCT     +1146/+1305
hrDNA+2.1    GACCGCCCTCGTGTCTGT      GGGGGAAGAAGAGGATCG      +2060/+2247
hrDNA+4.0    CGACGACCCATTCGAACGTCT   CTCTCCGGAATCGAACCCTGA   +3990/+4092 [261]
hrDNA+6.7    GCAGGACACATTGATCATCG    GACGCTCAGACAGGCGTAG     +6697/+6775
hrDNA+7.1    CGGAGAGGGAAAGAGAGAGC    TTCCTCCTCCCCCACCAC      +7031/+7250
hrDNA+8.2    AGTCGGGTTGCTTGGGAATGC   CCCTTACGGTACTTGTTGACT   +8204/8300 [261]
hrDNA+12.9   ACCTGGCGCTAAACCATTCGT   GGACAAACCCTTGTGTCGAGG   +12855/+12970 [261]
hrDNA+18.5   TGGTGGGATTGGTCTCTCTC    CAGCCTGCGTACTGTGAAAA    +18449/+18591
hrDNA+30.6   ACTGGCGAGTTGATTTCTGG    CGAGACAGTCGAGGGAGAAG    +30541/+30640
Coding-1     CGTGCCTGAGGTTTCTCC      GGGGGAAGAAGAGGATCG      +2119/+2247
Coding-2     GCTAAATACCGGCACGAGAC    TTCACGCCCTCTTGAACTCT    +8256/+8344
Coding-3     TGGGTTTTAAGCAGGAGGTG    AACCTGTCTCACGACGGTCT    +12258/+12456
IGS-1        CACTACCCACGTCCCTTCAC    GAGAGAAGACGGAGGCACAC    +28163/+28298
IGS-2        GTGTGCCTCCGTCTTCTCTC    GTCAAGGGGCTATGCCATC     +28279/+28456
IGS-3        ATTCTTGCCAGGCTGACATT    AAGCCTCACAACTGCAGACC    +28328/+28495
rDNA-CTCF-1  CCGTGGGTTGTCTTCTGACT    AAGCGAAACCGTGAGTCG      +41907/+42035
rDNA-CTCF-2  GCTTCTCGACTCACGGTTTC    GGAGCTCTGCCTAGCTCACA    +42012/+42202
ARHGAP18+    AGCTTCCGAAGGCTTACCTC    TGTGTCAGGATCGCAGAAAG    chr6:130072845-130073012
ARHGAP18-    TGCACAACTGCTCAACCTTC    AGAGGTGGTTGAGGTCATGG    chr6:130074632-130074717
JAK2+        AGACAACTGTGACGGGCTTC    CCCTTCTGCTCCTCTTCCTC    chr9:4975195-4975278
JAK2-        ACCCCTTGCCTTTGTCTTTT    GGAAACAGGCTCAAACGAAG    chr9:4971724-4791888
IL1RAPL1+    AGTGGCTGAGGAAAAACGAA    AACCAGGTCAGGGGATTTCT    chrX:28848843-28848939
ILR1RAPL1-   GGGAAGCCTACTTTGGAAGG    TCAGTGGCTTCTCATTGCAC    chrX:28846958-28846958
LOC730227+   GCTAAGCACTGTGGGTGTGA    AAGCCTGTTGGAGTGCTGTT    chr1:201525406-201525567
LOC730227-   GGGGACTGTGCTGAGATGTT    ACAGCCAGGGCACTAACCTA    chr1:201523222-201523420
SELT1+       GATGAGGCTTCTGCTGCTTC    CGTGGCGTACTGCATCTTTA    chr3:151803839-151803950
SELT1-       CCATCATATCCCCAGTGACC    GGGTGGGGGAGATTACTGAT    chr3:151802525-151802691
MED26+       GTTGCTTCACAGCCCTTCTC    GGGGAGCTGAGGCTAGAGTT    chr19:16599285-16599443
MED26-       GAGAACCCCGTGATTGAAAG    TCCGTTTCCTCCTCTGTGAC    chr19:16600860-1660965

Table 2-3. Primers used for ChIP-PCR assays

All primers were designed for this study except for those with associated references. For nucleoplasmic UBF binding sites, positive (+) and negative (-) control primer sets are given. For primers amplifying rDNA regions, the coordinates are expressed relative to the rDNA TSS.


Chapter 3

CHD7 functions in the nucleolus as a positive regulator of ribosomal RNA

biogenesis

Gabriel E. Zentner1, Elizabeth A. Hurd5, Michael P. Schnetz1, Lusy Handoko7,

Chuanping Wang2, Zhenghe Wang1,3, Chialin Wei7,

Paul J. Tesar1,4, Maria Hatzoglou2, Donna M. Martin5,6, and Peter C. Scacheri1,3

1Department of Genetics, 2Department of Nutrition, 3Case Comprehensive

Cancer Center, and 4Center for Stem Cell and Regenerative Medicine, Case

Western Reserve University, Cleveland, OH 44106, USA, 5Department of

Pediatrics, and 6Department of Human Genetics, University of Michigan, Ann

Arbor, MI 48109, USA, and 7Genome Technology and Biology Group, Genome

Institute of Singapore, 138672, Singapore, Singapore

A modified version of this chapter was previously published as:

Zentner, G.E., Hurd, E.A., Schnetz, M.P., Handoko, L., Wang, C., Wang, Z., Wei, C., Tesar, P.J., Hatzoglou, M., Martin, D.M., Scacheri, P.C. (2010). CHD7 functions in the nucleolus as a positive regulator of ribosomal RNA biogenesis. Hum. Mol. Genet. 19(18):3491-3501.


Abstract

De novo mutation of the gene encoding chromodomain helicase DNA

binding protein 7 (CHD7) is the primary cause of CHARGE syndrome, a complex

developmental disorder characterized by the co-occurrence of a specific set of

birth defects that has clinical overlap with known disorders of ribosome biogenesis. Recent studies indicate that CHD7 functions as a transcriptional regulator in the nucleoplasm. We used our previously described method of rDNA

ChIP-seq to assess a potential role for CHD7 in the regulation of rRNA

transcription. Indeed, CHD7 is strongly enriched at rDNA in human and mouse

cells. Immunofluorescence and western blotting of subcellular fractions indicate

that a substantial fraction of CHD7 is constitutively localized to the nucleolus, the

site of rRNA transcription. ChIP-chop analyses demonstrate that CHD7

specifically associates with hypomethylated, active rDNA, suggesting a role as a

positive regulator of rRNA transcription. Consistent with this hypothesis, siRNA-

mediated depletion of CHD7 results in hypermethylation of the rDNA promoter

and concomitant reduction of 45S pre-rRNA level, while cells overexpressing

CHD7 show increased levels of 45S pre-rRNA compared to control cells

expressing wild-type levels of CHD7. Depletion of CHD7 also reduces cell

proliferation and protein synthesis. Lastly, compared to wild-type mESCs, the

levels of 45S pre-rRNA are reduced in both Chd7+/- and Chd7-/- mESCs, as well

as Chd7-/- whole mouse embryos and multiple tissues dissected from Chd7+/-

embryos. Together with previously published studies, these results indicate that

CHD7 dually functions as a transcriptional regulator in the nucleoplasm and the

nucleolus and provide a novel avenue for investigation into the pathogenesis of

CHARGE syndrome.

Introduction

The chromodomain helicase DNA binding (CHD) family is a highly

conserved group of nuclear proteins with nine members in vertebrates [210,211].

Although the cellular functions of the CHD proteins are suspected to be quite diverse, roles in transcriptional regulation are emerging as a common theme.

CHD1, for example, was recently shown to regulate ES cell pluripotency genes including Oct4 [212]. CHD3 and CHD4 (also known as Mi-2α and Mi-2β) are integral components of the nucleosome remodeling and deacetylating (NuRD) complex [234], involved in transcriptional repression. CHD5 is a tumor suppressor that controls cell growth and apoptosis by positively regulating p53 target genes including p21 and Bax [225]. CHD8 controls cell cycle progression by regulating the cyclin E2 gene [216] and has also been shown to regulate transcription of β-catenin target genes [233].

Of the nine CHD proteins, CHD7 is of particular interest. De novo mutation of the CHD7 gene gives rise to CHARGE syndrome, a complex genetic condition characterized by Coloboma of the eye, Heart malformations, Atresia of the choanae, Retardation of growth, Genital hypoplasia, and Ear abnormalities and deafness [226,235]. Other clinical features not included in the acronym include

tracheoesophageal fistula, anosmia, and limb anomalies [226,236-238,240].

Approximately two-thirds of cases of CHARGE syndrome are due to mutation of

CHD7 [226]. Most CHD7 mutations are nonsense or frameshift and predicted to

be loss of function, and thus haploinsufficiency is hypothesized to be the

pathogenic mechanism. Studies in mice support the haploinsufficiency model.

Specifically, mice that are heterozygous for a W973X nonsense mutation in the

Chd7 gene, also known as Whirligig, display many of the features of human

CHARGE syndrome, including postnatal growth retardation,

inner ear malformations, female genital hypoplasia, heart defects, and olfactory

defects [239,240]. Gene-trap technology has also been used to generate Chd7

mutant mice, and Chd7Gt/+ mice develop a phenotype similar to Whirligig mice

[241]. Mice that are homozygous for either the Whirligig or gene-trap alleles die

by embryonic day 11 [239,241]. In situ hybridization and immunofluorescence

analyses of mouse embryos at multiple stages of development

indicate that Chd7 expression is spatially and temporally dynamic, suggesting

that the requirements for CHD7 during development are specific to both tissue

and stage [238,239,241,242].

To gain insight into the function of CHD7, we recently mapped the

distribution of CHD7 on chromatin using the approaches of chromatin

immunoprecipitation coupled with microarray analysis (ChIP-chip) and massively

parallel DNA sequencing (ChIP-seq) [229,245]. In multiple cell types, hundreds to thousands of CHD7 sites were identified. Most of the CHD7 sites show features of gene enhancer elements. Specifically, CHD7 sites were predominantly located distal to transcription start sites, shown to correlate with cell-specific gene expression, and found within open regions of chromatin marked with H3K4 monomethylation, the epigenetic signature of enhancers. Moreover, in mESCs,

CHD7 was found to co-localize with p300, a known enhancer-binding protein and

strong predictor of enhancer activity. Despite the strong correlation with

enhancer elements, most genes directly targeted by CHD7 were only subtly

altered (<2-fold) in expression in Chd7+/- and Chd7-/- mESCs. These studies are

consistent with a role for CHD7 as a transcriptional regulator, suggesting that

dysregulated gene expression contributes to the pathogenesis of CHARGE

syndrome [245]. However, it is not yet clear whether subtle changes in gene

expression are sufficient to give rise to CHARGE syndrome, or if the genes

normally targeted by CHD7 are more dramatically affected by CHD7 deficiency at

later stages of development.

Virtually all CHD proteins have been reported to localize to the

nucleoplasm. A search of the Nucleolar Proteome Database (NOPdb) [284]

shows that all nine CHD proteins have also been detected in the nucleolus via

proteomic methods. These studies raise the possibility that CHD proteins, in

addition to functioning as regulators of nuclear gene expression, also function as

regulators of rRNA biogenesis. Consistent with this hypothesis, CHD4, a

repressor of nuclear gene transcription, was shown to associate with rDNA and

activate rRNA transcription [285]. Using the rDNA ChIP-seq method described in

the previous chapter, we show that CHD7 binds throughout the coding region of

rDNA. Immunofluorescence and subcellular fractionation confirm the nucleolar

localization of CHD7. We present evidence that both haploinsufficiency and

complete loss of CHD7 lead to increased DNA methylation of the rRNA promoter,

resulting in decreased rRNA expression. We also show that CHARGE-affected tissues isolated from heterozygous Chd7 mouse embryos have reduced levels of rRNA. The results presented herein delineate a novel nucleolar function for

CHD7 and also raise the possibility that CHARGE syndrome arises through a

combination of dysregulated nucleoplasmic and nucleolar transcription.

Furthermore, these findings demonstrate the utility of the previously described

rDNA ChIP-seq method in exploring novel avenues of investigation for

chromatin-associated proteins relevant to disease.

Results

CHD7 associates with rDNA

Mammalian cells contain several hundred tandemly duplicated rRNA genes, clustered into repeated arrays known as nucleolar organizer regions

(NORs) around which nucleoli form [2]. NORs are located on the p-arms of

chromosomes 13, 14, 15, 21, and 22 in humans and on chromosomes 12, 15, 16,

17, 18, and 19 in mouse [29,30]. rRNA genes are approximately 42.9 kb in

length in humans and 45 kb in mice and contain 13-14 kb of coding sequence

with the remainder made up by the noncoding intergenic spacer region (IGS)

(Figure 3-1A).

To determine potential rDNA association of CHD7, we used previously

generated ChIP-seq data. These studies were carried out in DLD1-A2 cells, in

which both alleles of CHD7 are endogenously FLAG-tagged [229,286], and

mESCs ([245], M.P.S. and P.C.S., unpublished data). Although thousands of

CHD7 sites were detected on nucleoplasmic chromatin, the reference genome

assemblies to which the ChIP-seq data were aligned do not contain the rDNA

loci, and therefore CHD7 occupancy at rDNA was not assessed. In this study,

we realigned the CHD7 ChIP-seq data to our builds of the human and mouse

genomes containing rDNA. The results show high enrichment of CHD7 at rDNA

in both cell types, though the relative pattern of CHD7 occupancy at rDNA loci

differed between the two cell types tested (Figure 3-1A,C). Specifically, while

CHD7 binds to the coding region of the rDNA in DLD1-A2 cells, binding occurs at

the promoter and the 3' end of the coding sequence in mouse ES cell rDNA. The

reason for these differences is currently not clear, but might reflect species or cell

type specific differences. Indeed, UBF occupancy at rDNA has been shown to

vary between the Jurkat cell line and the Kasumi and HEL lines [287]. In

addition, transcription factors involved in cell lineage differentiation such as

Runx2, MyoD, and Mgn have been found to differentially occupy rDNA in

undifferentiated C2C12 cells versus cells differentiating along muscle, bone, and

adipose lineages [80].
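
The realignment described above depends on a reference that contains an rDNA repeat, which the standard assemblies lack. The following is a minimal sketch of one way such a reference could be assembled, assuming the rDNA repeat is simply appended as one extra FASTA record. The record name `rDNA_repeat`, the function names, and the toy sequences are hypothetical and are not taken from this work.

```python
def contig_names(fasta: str) -> list[str]:
    """List the record names (text after '>' up to the first space) in FASTA text."""
    return [line[1:].strip().split(" ")[0]
            for line in fasta.splitlines() if line.startswith(">")]


def append_contig(reference_fasta: str, name: str, sequence: str, width: int = 60) -> str:
    """Return FASTA text with one extra record appended to an existing reference."""
    if name in contig_names(reference_fasta):
        raise ValueError(f"record {name!r} is already present in the reference")
    # Wrap the sequence to a fixed line width, as is conventional for FASTA.
    wrapped = "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))
    body = reference_fasta if reference_fasta.endswith("\n") else reference_fasta + "\n"
    return f"{body}>{name}\n{wrapped}\n"


# Toy stand-ins: a two-chromosome "assembly" and a placeholder rDNA sequence.
toy_reference = ">chr1\nACGTACGT\n>chr2\nGGCCTTAA\n"
toy_rdna = "GCTGACACGCTGTCCTCTGG" * 4  # placeholder, NOT a real rDNA sequence

augmented = append_contig(toy_reference, "rDNA_repeat", toy_rdna)
print(contig_names(augmented))
```

Reads aligned against a reference augmented in this way can map to the added record, so occupancy at rDNA can be examined alongside the rest of the genome.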

CHD7 is dually localized to the nucleoplasm and nucleolus

Having determined that CHD7 associates with rDNA by ChIP-seq and

ChIP-PCR, we wanted to confirm nucleolar localization of CHD7. As expected,

immunofluorescent staining in DLD1-A2 cells with FLAG antibodies showed

CHD7 in the nucleus. However, high levels of CHD7 were also detected in the

nucleolus. Nucleolar localization of CHD7 was validated by costaining with

nucleolin, an abundant eukaryotic nucleolar protein [63] (Figure 3-2A).

Quantification of CHD7 immunofluorescence revealed that, on average, approximately 60% of CHD7 was nucleoplasmic while the remaining 40% was nucleolar, though a substantial degree of cell-to-cell variability was observed (Figure 3-2A). All cells examined (n = 107) showed colocalization of CHD7 and nucleolin, indicating that the nucleolar localization of CHD7 is constitutive. Nucleolar localization of CHD7 was further validated by western blot analysis of subcellular fractions isolated from both epitope-tagged and wild-type DLD1 cells, indicating that the immunofluorescence results are neither artifactual nor due to the presence of the FLAG tag. Densitometric quantification of these blots showed that ~20% of cellular CHD7 is localized to the nucleolus while the remaining ~80% is nucleoplasmic (Figure 3-2B).

Figure 3-1. CHD7 binds to rDNA

(A) ChIP-seq plot of CHD7 binding to rDNA in DLD1-A2 cells. A schematic representation of the mammalian rDNA repeat is shown above the plot, with primers used for human ChIP-PCR in black and mouse ChIP-PCR in grey. (B) ChIP-PCR validation of CHD7 rDNA association in DLD1-A2 cells. (C) ChIP-seq plot of CHD7 binding to rDNA in mESCs. (D) ChIP-PCR validation of CHD7 rDNA association in mESCs.

Figure 3-2. CHD7 localizes to the nucleoplasm and nucleolus

(A) Immunofluorescent staining of DLD1-A2 cells with FLAG and nucleolin antibodies. The arrow indicates nucleolar localization of CHD7 while the asterisk marks nucleoplasmic CHD7. Scale bar = 4 µm. The graph on the right represents the results of fluorescence quantification of CHD7 nucleolar signal. Intensity of the whole nucleus was calculated by multiplying the pixel area of the nucleus by average pixel intensity. Individual nucleoli in each cell were measured and their combined intensity was divided by the total nuclear intensity to yield the nucleolar intensity. The nucleoplasmic intensity was determined by subtracting nucleolar intensity from the total nuclear intensity. Nucleoplasmic signal = 56.76 ± 10.58%, nucleolar signal = 43.24 ± 10.58% (n = 33, mean ± SD). P < 0.0001 by t-test, indicating that significantly more FLAG-CHD7 is located in the nucleoplasm. (B) Western blot analysis of subcellular fractions from DLD1-A2 and DLD1-WT cells. The purity of cytoplasmic (CP), nucleoplasmic (NP), and nucleolar (No) fractions was assessed by blotting for tubulin, NUP62, and UBF, respectively. Densitometric quantification of CHD7 in the nucleoplasmic and nucleolar fractions is also shown and expressed as a percentage of total CHD7 signal.
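
The fluorescence quantification described in the legend of Figure 3-2 can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in this work: the `Region` type, the function name, and all numerical values are hypothetical, and the whole-nucleus measurement is assumed to include the nucleoli.

```python
from dataclasses import dataclass


@dataclass
class Region:
    area_px: float         # region area in pixels
    mean_intensity: float  # average pixel intensity within the region

    @property
    def integrated(self) -> float:
        # Integrated intensity = pixel area x average pixel intensity.
        return self.area_px * self.mean_intensity


def nuclear_partition(nucleus: Region, nucleoli: list[Region]) -> tuple[float, float]:
    """Return (nucleoplasmic %, nucleolar %) of the total nuclear signal."""
    total = nucleus.integrated
    if total <= 0:
        raise ValueError("total nuclear intensity must be positive")
    # Combined signal of all nucleoli in the cell, as a share of the whole nucleus.
    nucleolar_pct = 100.0 * sum(r.integrated for r in nucleoli) / total
    # The nucleoplasmic share is what remains after subtracting the nucleolar share.
    return 100.0 - nucleolar_pct, nucleolar_pct


# Made-up measurements for a single cell with two nucleoli.
nucleus = Region(area_px=1000, mean_intensity=50)
nucleoli = [Region(area_px=100, mean_intensity=120), Region(area_px=60, mean_intensity=100)]

nucleoplasmic_pct, nucleolar_pct = nuclear_partition(nucleus, nucleoli)
print(f"{nucleoplasmic_pct:.1f}% nucleoplasmic, {nucleolar_pct:.1f}% nucleolar")
```

Per-cell percentages computed this way would then be averaged across cells to give summary values of the kind reported in the legend.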

CHD7 influences the levels of the 45S pre-rRNA transcript

rDNA is transcribed by Pol I into a 45S pre-rRNA that undergoes a

complex series of RNase cleavage and chemical modification steps to yield the

mature 28S, 18S, and 5.8S rRNA species that are then assembled into mature

ribosomes [35,250,251]. Having established that CHD7 binds to rDNA, we next

investigated if CHD7 plays a role in regulating rRNA synthesis. Specifically, we

performed siRNA knockdown of CHD7 in DLD1-A2 cells (Figure 3-3A), followed

by qRT-PCR of the human 45S pre-rRNA. Compared to cells transfected with

nonspecific control siRNAs, two independent CHD7 siRNAs reduced pre-rRNA

levels by 20-30% (Figure 3-3B). We also quantified the levels of the 45S pre-rRNA

in Chd7 wild-type, heterozygous, and null mESCs derived from Whirligig mouse

embryos [239], which harbor a nonsense mutation in the Chd7 gene and express

levels of Chd7 consistent with their genotypes (Figure 3-3C). Compared to wild-

type cells, Chd7+/- and Chd7-/- mESCs show a significant reduction in pre-rRNA

levels (Figure 3-3D). The level of the pre-rRNA in Chd7+/- mESCs is halfway between that of the wild-type and null cells, indicating that the effect of Chd7 deficiency on rRNA synthesis is dosage sensitive.

Figure 3-3. CHD7 positively regulates rRNA biogenesis

(A) qRT-PCR and western blot analysis of CHD7 mRNA and protein levels following treatment of DLD1-A2 cells with nontarget siRNA pools (siNT-1/2), a CHD7 siRNA pool (siCHD7-1), and a CHD7 siRNA not present in the pool (siCHD7-2). (B) qRT-PCR for the 45S pre-rRNA in cells treated with control or CHD7 siRNA (n = 2-5). (C) qRT-PCR and western blot analysis of CHD7 mRNA and protein levels in mESCs derived from Whirligig embryos. (D) qRT-PCR for the 45S pre-rRNA in mESCs (n = 4-7). (E) qRT-PCR analysis of 45S pre-rRNA levels in DLD1-A2 cells transfected with empty vector or plasmid encoding untagged human CHD7 protein (n = 2). The FLAG-tagged CHD7 expressed in DLD1-A2 cells is not detectable with the Abcam CHD7 antibody; thus, blotting with this antibody reveals only the untagged CHD7 that was transfected. Error bars represent mean + SEM for biological replicates. *, P < 0.05; **, P < 0.01; ***, P < 0.001 by t-test.
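
The legend of Figure 3-3 reports mean + SEM for biological replicates. As a minimal sketch of how such summary values are obtained, the example below computes the mean, the SEM, and the percent reduction relative to control. The replicate values are invented for illustration and the function name is hypothetical; the variant of t-test used is not stated here, so no test is reproduced.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two replicates")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(values), stdev(values) / sqrt(n)


# Made-up relative pre-rRNA levels, for illustration only.
control = [1.00, 1.10, 0.90, 1.00]
knockdown = [0.70, 0.80, 0.75, 0.75]

control_mean, control_sem = mean_and_sem(control)
kd_mean, kd_sem = mean_and_sem(knockdown)
percent_reduction = 100.0 * (1.0 - kd_mean / control_mean)

print(f"control   {control_mean:.2f} +/- {control_sem:.3f}")
print(f"knockdown {kd_mean:.2f} +/- {kd_sem:.3f}")
print(f"reduction {percent_reduction:.0f}%")
```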

Having established that reduced levels of CHD7 result in decreased levels

of the 45S pre-rRNA, we next tested whether increased expression of CHD7

would have the opposite effect on pre-rRNA levels. Specifically, we transiently

transfected DLD1-A2 cells with a construct encoding full-length CHD7 and

quantified the levels of the 45S pre-rRNA. Compared to cells transfected with

empty vector, cells overexpressing CHD7 showed approximately 33% higher

levels of 45S pre-rRNA (Figure 3-3E). These results indicate that modulation of

CHD7 in either the positive or negative direction results in concomitant changes in the expression of the 45S pre-rRNA. To exclude a role for CHD7 in regulating

the transcription of regulators of rRNA transcription, we assessed the protein

levels of the RPA116 subunit of Pol I, UBF, and TIP5 after CHD7 knockdown.

We observed no change in the levels of these proteins (Figure 3-4). Together with the CHD7 ChIP data indicating association of CHD7 with rDNA and its demonstrated ATPase activity (K. Bouazoune, personal communication),

these results suggest that CHD7 directly regulates transcription of rRNA by

modulating chromatin structure at rDNA; however, we currently cannot rule out

the possibility that CHD7 influences early rRNA processing events.

Figure 3-4. CHD7 knockdown does not affect protein levels of known regulators of rRNA transcription

Western blotting for the indicated regulators of rRNA transcription was performed 72 hours after transfection with control or CHD7 siRNA in DLD1-A2 cells.

Depletion of CHD7 reduces cell proliferation and protein synthesis

Having established that CHD7 functions as a positive regulator of rRNA synthesis, we next investigated the cellular effects of CHD7 knockdown. We first assayed cell proliferation, as alterations in the levels of rRNA regulators are known to influence cell growth and proliferation [95,97,160]. We performed siRNA knockdown of CHD7 in DLD1-A2 cells and counted cells each day for five days following knockdown. A significant reduction in cell number was observed five days post-siRNA transfection (Figure 3-5A). To determine the cause of the reduction in cell number, we quantified BrdU-labeled DLD1-A2 cells treated with control or CHD7 siRNA five days post-transfection. A significant reduction in the number of BrdU-positive cells was observed in the CHD7 siRNA-treated group (Figure 3-5B,C). We therefore conclude that knockdown of CHD7 inhibits cell proliferation, consistent with its function as a positive regulator of rRNA synthesis.

In addition to influencing cell proliferation, rRNA synthesis is rate-limiting for ribosome biogenesis [288]. We therefore tested if depletion of CHD7 affected global protein synthesis. We performed siRNA-mediated knockdown of CHD7 in

DLD1-A2 cells and performed metabolic labeling with [35S]methionine at three, four, and five days after siRNA transfection. A significant reduction in radiolabeled methionine incorporation was seen four days after knockdown

(Figure 3-5D), suggesting that the reduction in rRNA synthesis caused by CHD7 depletion impairs protein synthesis. Global protein synthesis recovers to near control levels five days post-transfection, though the CHD7 siRNA is still effective at this time point (Figure 3-5E). However, at this time, cells show a proliferation defect. It is currently not clear why a defect in protein synthesis occurring four days post-transfection gives rise to a proliferation defect five days post-transfection; however, one potential explanation is that compensatory mechanisms have engaged to restore protein synthesis by day five post-transfection but are not sufficient to restore normal cell proliferation. Further studies are required to test this hypothesis.

We next tested whether the cell proliferation defect observed in CHD7 siRNA-treated cells could be due to stabilization of p53, which has been reported to be induced upon nucleolar stress [176]. Compared to control-treated cells, p53 protein levels remained unchanged three through five days following CHD7 knockdown (Figure 3-5E). Alterations in p53 transcript levels were also not detected (Figure 3-5F). We next tested whether p21 transcript levels were altered, as p21 is a well-known cell cycle inhibitor that can also be induced by cellular stress, either in a p53-dependent or -independent fashion [289-291]. The results of qRT-PCR analyses showed a significant increase in the levels of p21 following CHD7 knockdown (Figure 3-5F). These results suggest that upregulation of p21 may contribute to proliferation defects observed in the CHD7-knockdown cells, although future studies are required to test if upregulation of p21 is due to nucleolar stress or dysregulation of CHD7-mediated transcription in the nucleoplasm.

Figure 3-5. Loss of CHD7 impairs cell proliferation and protein synthesis

(A) Cell counting assay performed in DLD1-A2 cells treated with the indicated siRNA (n = 3). (B) Quantification of BrdU labeling five days post-transfection in DLD1-A2 cells treated with control or CHD7 siRNA (n = 2-3). (C) Representative images of BrdU labeling in control and CHD7 siRNA-treated DLD1-A2 cells, showing fields approximately matched for cell number (160-180 cells/field). Scale bar = 32 µm. (D) Measurement of global protein synthesis in DLD1-A2 cells after CHD7 knockdown by [35S]methionine radiolabeling at the indicated time points after siRNA transfection. Scintillation counts were normalized to total protein. (E) Western blots of FLAG-CHD7 and p53 in DLD1-A2 cells. These blots indicate that the CHD7 siRNA is effective up to five days post-transfection and that p53 protein levels are not altered. (F) qRT-PCR analysis of CHD7, p53, and p21 transcript levels in DLD1-A2 cells four days post-transfection with siRNA (n = 4). Error bars represent mean + SEM for biological replicates. *, P < 0.05; **, P = 0.01; ***, P < 0.001 by t-test.

CHD7 antagonizes DNA methylation at active rDNA repeats

rDNA repeats are maintained in two distinct epigenetic states. Active repeats have a euchromatic structure, characterized by methylation of H3K4, acetylation of H3 and H4, and relatively low levels of CpG methylation. Inactive repeats have a heterochromatic structure, with higher levels of CpG methylation, methylation of H3K9 and H4K20, and hypoacetylation of H4 [2]. Proteins that bind to rDNA may be specific for active or inactive repeats, potentially providing insight into their function at rDNA repeats. Because reduced CHD7 levels

correlate with reduced levels of 45S pre-rRNA, we hypothesized that CHD7 is a

positive regulator of rDNA transcription and as such might associate with active

rDNA repeats. To address this possibility, we used the ChIP-chop assay.

Specifically, ChIP was performed on chromatin from DLD1-A2 cells using antibodies to FLAG-CHD7, H3K4me2 (predominantly found at active rDNA repeats), and H3K9me2 (predominantly found at inactive rDNA repeats).

Chromatin from input and ChIP samples was then digested with HpaII, a methylation-sensitive enzyme that does not cut the internal CpG within the sequence CCGG if it is methylated. Primer pairs flanking HpaII sites within the human rDNA repeat were then used to amplify mock- and HpaII-digested DNA.

DNA immunoprecipitated by FLAG-CHD7 showed a level of HpaII resistance more similar to that of H3K4me2 than H3K9me2, suggesting that CHD7 predominantly associates with unmethylated, active rDNA repeats in DLD1-A2 cells (Figure 3-6A).

We next investigated if loss of CHD7 was associated with changes in the epigenetic state of rDNA repeats. We chose to assay DNA methylation of the rDNA promoter, as it has a well-established role in the silencing of rRNA expression [2]. Genomic DNA was isolated from DLD1-A2 cells treated with either control or CHD7 siRNA, digested with HpaII, and PCR-amplified using primers flanking the rDNA promoter. The rDNA promoter in CHD7-siRNA treated cells was significantly more resistant to HpaII digestion than control promoter

DNA, indicating an increase in promoter methylation upon reduction of CHD7 levels (Figure 3-6B). Similar results were observed in comparisons between

Chd7+/+, Chd7+/-, and Chd7-/- mESCs (Figure 3-6C). To ensure that the increase in HpaII resistance was due to internal CpG methylation rather than methylation of the external C residue of the recognition sequence, the same genomic DNA was also digested with MspI, an isoschizomer of HpaII. MspI cleaves the CCGG sequence regardless of the methylation status of the internal C but will not cut if the external C is methylated. Genomic DNA that was resistant to HpaII showed little resistance to MspI cleavage, indicating that the majority of methylation at the assayed CCGG sites is on the internal CpG (Figure 3-6B,C). These results suggest that CHD7 either initiates or maintains the expression of rRNA by associating with active rDNA loci, and that loss of CHD7 results in the conversion of active rDNA to a more heterochromatic state.
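
The logic of the HpaII/MspI control can be written out explicitly. The sketch below encodes only the cleavage rules stated in the text, plus the assumption (implied by the control experiment) that HpaII is also blocked when the external C is methylated. The function names and result labels are hypothetical.

```python
def is_cut(enzyme: str, internal_c_methylated: bool, external_c_methylated: bool) -> bool:
    """Whether a CCGG site is cleaved under the simplified rules described above."""
    if enzyme == "HpaII":
        # Assumed: blocked by methylation of either cytosine.
        return not (internal_c_methylated or external_c_methylated)
    if enzyme == "MspI":
        # Insensitive to the internal C; blocked only by external C methylation.
        return not external_c_methylated
    raise ValueError(f"unknown enzyme: {enzyme}")


def interpret(hpaii_resistant: bool, mspi_resistant: bool) -> str:
    """Infer the methylation state of a site from resistance to the two enzymes."""
    if hpaii_resistant and not mspi_resistant:
        return "internal CpG methylated"
    if hpaii_resistant and mspi_resistant:
        return "external C methylated"
    if not hpaii_resistant and not mspi_resistant:
        return "unmethylated"
    return "inconsistent with this model"


# Enumerate all four methylation states of a single CCGG site.
for internal in (False, True):
    for external in (False, True):
        hpaii_res = not is_cut("HpaII", internal, external)
        mspi_res = not is_cut("MspI", internal, external)
        print(f"internal={internal}, external={external}: {interpret(hpaii_res, mspi_res)}")
```

Under these rules, DNA that resists HpaII but is cleaved by MspI, as observed here, corresponds to methylation of the internal CpG.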

Figure 3-6. CHD7 is associated with active rDNA repeats and counteracts rDNA promoter methylation

(A) ChIP-chop analysis of FLAG-CHD7, H3K4me2, and H3K9me2 binding to the human rDNA locus in DLD1-A2 cells (n = 2). Primer coordinates are relative to the rDNA TSS. (B) Analysis of rDNA promoter methylation in DLD1-A2 cells by HpaII/MspI digest and qPCR (n = 3). (C) Analysis of rDNA promoter methylation in mESCs by HpaII/MspI digest and qPCR (n = 3). Error bars represent mean + SEM for biological replicates. *, P < 0.05 by t-test.

CHARGE-relevant tissues from Chd7 gene-trap mice show reduced pre-rRNA levels

Having established a role for CHD7 in rRNA biogenesis in two cell culture models, we next sought to examine the relevance of these results to a model of CHARGE syndrome. Similar to Whirligig mice, Chd7 gene-trap (Chd7Gt/+) mice display several features of human CHARGE syndrome including inner ear defects, postnatal growth retardation, and hyposmia. Chd7Gt/+ mice were bred and whole embryos were harvested at E9.5, just prior to the time at which Chd7Gt/Gt embryos die. We then performed qRT-PCR for the 45S pre-rRNA on total RNA isolated from whole embryos. Compared to wild-type embryos, Chd7Gt/Gt embryos showed a significant reduction in the levels of the 45S pre-rRNA. The levels of the 45S pre-rRNA were similar between wild-type and

Chd7Gt/+ embryos (Figure 3-7A). The discrepancy between the results observed

in the homozygous and heterozygous embryos could be related to differences in

the severity of the phenotypes, i.e., Chd7Gt/Gt embryos die in mid-gestation while

Chd7Gt/+ mice are viable. Alternatively, the lack of aberrant pre-rRNA expression

in heterozygotes may be related to the fact that only a subset of organs are

affected in the mouse, and the effect on the pre-rRNA is masked when the whole

embryo is analyzed.

To test whether the role of CHD7 in pre-rRNA synthesis is tissue-specific,

we dissected otocyst, eye, heart, and limb tissue from Chd7+/+ and Chd7Gt/+

embryos at E10.5 and quantified pre-rRNA levels by qRT-PCR. Despite the

inherent limitations of quantifying expression across ostensibly variable samples

(Figure 3-7B), we detected a significant reduction in pre-rRNA expression in

Chd7Gt/+ eye and ear tissues (Figure 3-7C). In contrast, Chd7Gt/+ heart and limb

tissues were comparable to wild-type levels. These results could be indicative of

cell type or developmental stage-specific nucleolar roles for CHD7. Interestingly, the degree of pre-rRNA reduction appears to correlate with the penetrance of tissue-specific malformations. Inner ear defects are observed in all Chd7Gt/+ mice

and keratoconjunctivitis sicca is observed in about 50% of Whirligig mice.

Cardiac malformations have been identified in a small number of Whirligig mice,

and limb defects in Chd7 mutant mice have not yet been characterized [239,241].

Figure 3-7. Pre-rRNA levels are reduced in CHARGE-relevant tissues from Chd7 gene-trap embryos

(A) qRT-PCR for Chd7 and pre-rRNA in whole E10.5 Chd7 gene-trap embryos (n = 2-3). (B) qRT-PCR for the 45S pre-rRNA in Chd7 mutant tissues (n = 3-4). (C) qRT-PCR for the 45S pre-rRNA in Chd7 mutant tissues. Error bars represent mean + SEM for biological replicates. *, P < 0.05, **, P < 0.01 by t-test.

CHD7 promotes rDNA association of the Treacher Collins syndrome protein, treacle

Treacher Collins syndrome (TCS) is a congenital multiple anomaly disorder

characterized predominantly by craniofacial anomalies. TCS is caused by de

novo mutations in TCOF1, encoding treacle, or, in a small number of cases, in a

gene encoding a subunit of Pol I [185,194]. Intriguingly, treacle is a nucleolar protein that

specifically associates with rDNA and functions as a positive regulator of rRNA

synthesis [195]. Although TCS and CHARGE are clearly clinically distinct

syndromes, several organ systems are affected in both conditions, including the

eyes and ears. Based on these observations and the data presented here

indicating that CHD7 also functions in rRNA synthesis, we tested whether CHD7

and treacle co-associate at rDNA. ChIP analysis of treacle in Chd7 wild-type and null cells demonstrated that absence of CHD7 impaired the ability of treacle to bind rDNA (Figure 3-8A). This result was not due to alterations in expression of

Tcof1 or treacle protein stability as assayed by qRT-PCR and western blot

(Figure 3-8B). The connection between CHD7 and treacle was further

investigated using co-immunoprecipitation (Co-IP) assays. Co-IPs were

performed in DLD1-A2 cells with antibodies directed against FLAG and treacle as

well as nonspecific IgG. FLAG-tagged CHD7 was efficiently immunoprecipitated with both FLAG and treacle antibodies; however, treacle could not be detected in

FLAG IPs (Figure 3-9). Taken together, these findings suggest that the association of treacle with rDNA is partially dependent on the presence of CHD7 and raise the possibility of a functional connection between the two proteins and the pathogenesis of CHARGE and TCS. In vivo studies involving breeding of

Chd7 and Tcof1 mutant mice could be used to further investigate this possibility.

Figure 3-8. CHD7 promotes association of treacle with rDNA

(A) ChIP analysis of treacle binding to rDNA in Chd7+/+ and Chd7-/- mESCs (n =

2). Two nucleoplasmic nontarget regions are included as controls for ChIP specificity. (B) qRT-PCR and western blot analysis of Tcof1 mRNA and treacle protein levels, respectively, in Chd7+/+ and Chd7-/- mESCs (n = 3). Error bars represent mean + SEM for biological replicates. *, P < 0.05 by t-test.

Figure 3-9. CHD7 physically interacts with treacle

Western blots showing co-IP results for FLAG-CHD7 and treacle in DLD1-A2 cells. FLAG-CHD7 is efficiently immunoprecipitated with treacle but the reciprocal interaction is not seen.

Discussion

Previous studies indicate that CHD7 binds to gene enhancer elements and functions as a transcriptional regulator in the nucleoplasm. Here, we provide evidence that CHD7 also functions in the nucleolus as a positive regulator of rRNA transcription. Depletion of CHD7 through siRNA or genetic mutation results in a 20-30% reduction in pre-rRNA levels. Though this effect appears relatively small, it is important to consider the sheer volume of cellular transcription accounted for by rRNA. rRNA transcription accounts for at least

50% of overall cellular transcription in most growing eukaryotic cells [1] and it thus stands to reason that what appears to be a relatively modest effect on rRNA synthesis may in fact have large biological impact. Consistent with this notion, we detected reductions in pre-rRNA levels in two frequently affected tissues from

Chd7 heterozygous mice.
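
The scale of this effect can be made concrete with a short calculation. The sketch below assumes that rRNA accounts for exactly 50% of cellular transcription (the lower bound cited above), that the reduction in pre-rRNA level reflects a proportional reduction in rRNA transcription, and that all other transcription is unchanged. These are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def total_output_loss(rrna_share: float, rrna_reduction: float) -> float:
    """Fraction of total transcriptional output lost when only rRNA synthesis drops."""
    if not (0.0 <= rrna_share <= 1.0 and 0.0 <= rrna_reduction <= 1.0):
        raise ValueError("both arguments must be fractions between 0 and 1")
    return rrna_share * rrna_reduction


RRNA_SHARE = 0.50  # assumed lower bound for growing cells, as cited in the text

for reduction in (0.20, 0.30):
    loss = total_output_loss(RRNA_SHARE, reduction)
    print(f"{reduction:.0%} less rRNA synthesis -> {loss:.0%} less total transcription")
```

Under these assumptions, a 20-30% reduction in rRNA synthesis corresponds to a loss of roughly 10-15% of total transcriptional output, and more if rRNA makes up a larger share.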

While further studies are necessary to test if a deficiency in rRNA biogenesis is directly responsible for the malformations of CHARGE syndrome, an attractive hypothesis is that CHARGE is a “ribopathy.” This has been previously suggested based on phenotypic similarities between CHARGE patients and mice with heterozygous mutations in RpL24 [208]. Several human genetic diseases are caused by mutations in genes that affect rRNA synthesis.

Diamond-Blackfan anemia (DBA), characterized by a dramatic reduction of erythroid precursor cells as well as craniofacial, urogenital, cardiac, and ophthalmologic defects, is caused by mutations in several genes that encode ribosomal proteins [186,200]. The most frequently associated gene, RPS19, is

mutated in approximately 25% of DBA cases; its mutation results in decreased production of 18S rRNA and impaired maturation of the 40S ribosomal subunit [186,200].

Most mutations detected in DBA are heterozygous, suggesting that haploinsufficiency is the underlying pathogenic mechanism [200]. Another well-characterized disorder resulting from impaired rRNA synthesis is TCS.

Heterozygous mutations in the TCOF1 gene, encoding the protein treacle, result in craniofacial anomalies, ear abnormalities, and hearing loss due to deficiencies in rRNA biogenesis [185]. As rRNA production is essential to all proliferating cell types, it is interesting that both of these disorders, like CHARGE syndrome, are characterized by cell type specific defects. It is also noteworthy that DBA and

TCS share some clinical features with CHARGE syndrome, including ear abnormalities and optic colobomata. Other notable conditions caused by mutations in nucleolar proteins include Bloom syndrome, Werner syndrome, and

Rothmund-Thomson syndrome. These diseases are caused by mutations in the genes encoding the BLM, WRN, and RECQL4 DNA helicases, respectively, which lead to genomic instability and predisposition to cancer [3]. In contrast to CHD7,

BLM, WRN, and RECQL4 are not constitutively nucleolar; rather, they shuttle in and out of the nucleolus depending on cell cycle phase or conditions of stress [3], which might explain the phenotypic differences between these syndromes and more genuine “ribopathies” such as TCS or DBA.

Interestingly, studies of haploinsufficient growth defects in non-mammalian model organisms also point to a large contribution of mutations in proteins with functions in ribosome biogenesis. A study of S. cerevisiae heterozygous deletion

strains revealed that approximately 3% of the yeast genome shows a haploinsufficient growth defect under conditions that favor rapid growth [209].

Approximately 49% of the haploinsufficient genes identified were involved in ribosome biogenesis, as compared to approximately 4% for the next most-enriched category. This observation suggests that, in yeast, a primary outcome of haploinsufficiency is decreased protein synthesis. In Drosophila melanogaster, the Minute mutations are a group of more than 50 distinct mutations, mostly in ribosomal proteins, that give rise to homozygous lethality and haploinsufficient growth defects such as short bristles, small body size, and developmental delay [203]. Finally, knockdown of 21 ribosomal proteins in zebrafish caused developmental phenotypes in the majority of cases [184]. It therefore seems that genes that function in ribosome biogenesis are particularly dosage sensitive.

How might CHD7 promote transcription of rRNA? CHD7 is thought to have chromatin remodeling activity due to its SNF2-like ATPase/helicase domain. Thus, a likely scenario is that CHD7 either initiates or maintains open chromatin at active rDNA repeats to promote association of factors involved in rRNA transcription. Consistent with this hypothesis, our results indicate that active rDNA becomes hypermethylated and less transcriptionally active upon reduction of CHD7 levels by siRNA-mediated knockdown or conventional knockout. ChIP studies of factors involved in rRNA transcription (such as RNA polymerase I or UBF) in the context of reduced CHD7 could be used to further test this hypothesis. In addition, CHD7 could play a role in antagonizing the

repressive functions of the nucleolar remodeling complex (NoRC), which

remodels chromatin and recruits DNA methyltransferases and histone

deacetylases to rDNA [133]. Additionally, nucleosome positioning at the

promoter of mouse rDNA regulates CpG methylation [126]. At transcriptionally active rDNA repeats, the promoter nucleosome covers nucleotides -157 to -2, confining CpG dinucleotides at positions -143 and -133 to the globular domain of the nucleosome. At inactive repeats, this nucleosome is shifted 25 nucleotides downstream, covering -132 to +22 and shifting the -143 and -133 CpGs to the linker region of the nucleosome, where they can then be methylated. NoRC induces sliding of this nucleosome to promote silencing of rDNA repeats via its

SNF2-like ATPase subunit, SNF2H/Smarca5 [126]. CHD7, via ATP-dependent chromatin remodeling, could position the promoter nucleosome in the active position, physically inhibiting DNA methylation. Studies of NoRC chromatin association and promoter nucleosome position in Chd7 mutant cells are necessary to test this model.
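The positional argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it takes the nucleosome footprints and CpG coordinates quoted in the text as inclusive intervals relative to the transcription start site and reports whether each CpG falls inside the nucleosome or in the linker. The function and variable names are illustrative assumptions.

```python
# Promoter nucleosome footprints (inclusive, relative to the transcription
# start site), using the coordinates quoted in the text for mouse rDNA.
ACTIVE_FOOTPRINT = (-157, -2)    # transcriptionally active repeats
INACTIVE_FOOTPRINT = (-132, 22)  # silent repeats, nucleosome shifted downstream
CPG_POSITIONS = (-143, -133)     # CpG dinucleotides discussed in the text


def cpg_location(position, footprint):
    """Return 'nucleosome' if the position lies within the footprint, else 'linker'."""
    start, end = footprint
    return "nucleosome" if start <= position <= end else "linker"


def classify_cpgs(footprint, positions=CPG_POSITIONS):
    """Map each CpG position to its location for a given nucleosome footprint."""
    return {pos: cpg_location(pos, footprint) for pos in positions}


if __name__ == "__main__":
    print("active repeat:  ", classify_cpgs(ACTIVE_FOOTPRINT))
    print("inactive repeat:", classify_cpgs(INACTIVE_FOOTPRINT))
```

Under these coordinates both CpGs are nucleosomal at active repeats and fall in the linker at silent repeats, which is the configuration proposed to permit their methylation.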

One of the most puzzling issues surrounding conditions that are caused by defects in ribosomal production and protein synthesis is cell type specificity.

In TCS, haploinsufficiency of treacle results in a reduction of rRNA levels that ultimately affects viability of neural crest cells, leading to the characteristic craniofacial anomalies of TCS. It is hypothesized that the basis for tissue specificity in TCS is related to the neural crest. Neural crest cells, which are multipotent and highly proliferative, utilize large amounts of energy for protein synthesis and proliferation, and as such may be particularly sensitive to

perturbations in ribosome biogenesis. Interestingly, recent evidence has made a

strong case for neural crest cells as the cell type of origin for many CHARGE-

affected tissues [214]. Based on these results, studies of rRNA biogenesis in

CHD7-haploinsufficient neural crest cells could be informative with regards to

linking the tissue specific phenotypes of CHARGE to defects in rRNA biogenesis.

Our studies define a novel function for CHD7 in rRNA synthesis. Future

studies will focus on testing whether dysregulated expression of rRNA alone, or

in combination with dysregulated nucleoplasmic gene expression, contributes to

the pathogenesis of CHARGE syndrome. To distinguish between these

possibilities, mice with mutations in known negative regulators of rRNA

transcription such as Tip5 or Fbxl10 [95,133] could be bred to Chd7-mutant mice

to examine if restoration of rRNA synthesis rescues the CHARGE-like phenotype. Genetic crosses between Chd7 mutants and mice harboring mutations in positive regulators of rRNA biogenesis, such as Tcof1, could also be

worthwhile, as an exacerbation of the phenotype could highlight epistatic

interactions. Such studies could also help determine whether the mechanisms

underlying CHARGE syndrome and TCS are related, as suggested by our ChIP

results indicating that the association of treacle with rDNA is dependent on

CHD7. In TCS, haploinsufficiency of treacle induces perturbations of rRNA

biogenesis that result in the stabilization of p53 [199]. This in turn mediates

apoptosis of neural crest cells, which give rise to most of the tissues affected in

TCS [185]. Interestingly, genetic depletion of p53 can rescue the craniofacial

anomalies of Tcof1-mutant mice [199]. If the mechanisms underlying CHARGE

and TCS are related, it may be particularly informative to test whether loss of p53 can also rescue the Chd7 mutant mouse phenotype.

Materials and methods

Cell culture, siRNA knockdown, and CHD7 overexpression

DLD1-A2 and mESCs were cultured as previously described [229].

Nontarget and CHD7 siRNA Smartpools were purchased from Dharmacon. An

additional CHD7 siRNA not present in the Smartpool was purchased from Sigma.

CHD7 mRNA and protein levels were assessed ~72 hours after transfection by

qRT-PCR and western blotting. For overexpression, empty pCI-neo (Promega)

or pCI-neo containing untagged full-length human CHD7 was transfected into

DLD1-A2 cells, which express only FLAG-tagged CHD7. Protein and RNA were

harvested ~72 hours after transfection for analysis of CHD7 protein expression

and 45S pre-rRNA levels. Endogenous FLAG-tagged CHD7 was detected with

mouse anti-FLAG M5 (Sigma #F4042, 1:1000) and transfected untagged CHD7

was detected with rabbit anti-CHD7 (Abcam ab31824, 1:1000), which does not detect the FLAG-tagged protein (see Figure 3-3E).

Indirect immunofluorescence

DLD1-A2 cells were grown overnight on glass coverslips and fixed in cold

4% paraformaldehyde in PBS for 15 min at room temperature. Cells were

washed with PBS and permeabilized with 0.2% Triton X-100 in PBS for 5 min at

room temperature. Cells were washed with PBS and blocked with 10% normal

goat serum in PBS for 30 min at room temperature. Cells were incubated with primary antibodies overnight at 4°C. Cells were washed with PBS and incubated with secondary antibodies for 1 hour at room temperature. Cells were washed, mounted, and visualized on a Leica DM6000 upright microscope with Volocity 4.4

software. Antibodies used for immunofluorescence were mouse anti-FLAG M5

(Sigma #F4042, 1:500), rabbit anti-nucleolin (Abcam ab22758, 1:500),

AlexaFluor goat anti-mouse 488 (Invitrogen, 1:200), and AlexaFluor goat anti-

rabbit 594 (Invitrogen, 1:200).

Nucleolar isolation and western blotting

Nucleoli were isolated using a modified version [292] of an established

protocol [293]. 40 µg of cytoplasmic, nucleoplasmic, and nucleolar protein was

separated by SDS-PAGE and detected by western blotting. Antibodies used for

western blotting were mouse anti-FLAG M5 (Sigma #F4042, 1:1000), rabbit anti-

CHD7 (Abcam #31824, 1:1000), mouse anti-NUP62 (Abcam #56982, 1:1000),

rabbit anti-UBF (Bethyl Labs #859-301, 1:5000), rabbit anti-RPA116 (a gift from

Ingrid Grummt, 1:1000), rabbit anti-TIP5 (Invitrogen #49-1037, 1:1000), rabbit

anti-treacle (Abcam #65212, 1:750), and rabbit anti-tubulin (ICN BioMedicals,

1:5000).

Co-IP

Nucleoli were isolated as described above and lysed with non-denaturing

buffer (50 mM Tris, pH 8.0, 150 mM NaCl, 1% Triton X-100). Antibodies were

bound to 30 µl Protein G DynaBeads (Invitrogen) for 4 hours with rocking at 4°C.

Beads were washed 3 times with 1 ml lysis buffer and added to 300 µg of nucleolar extract. IPs were performed overnight at 4°C with rocking. Following incubation, beads were washed five times with 1 ml lysis buffer. For western blotting, 30 µl of 1x sample buffer were added to the beads and 15 µl were electrophoresed. Antibodies used for co-IP were rabbit IgG (Abcam #27478, 4

µg), mouse anti-FLAG M2 (Sigma #F3165, 4 µg), mouse anti-FLAG M5 (Sigma

#F4042, 4 µg), rabbit anti-RPA116 (a gift from Ingrid Grummt, 3 µg), rabbit anti-

UBF (Santa Cruz #9131, 4 µg), and rabbit anti-treacle (Abcam #65212, 5 µg).

Western blotting for treacle after co-IP was performed with rabbit anti-treacle (H-

90, Santa Cruz #67196, 1:200).

Cell proliferation analysis

DLD1-A2 cells were transfected with control or CHD7 siRNA as described.

For cell counting, cells were plated in triplicate in a 24-well plate. Wells were counted in duplicate every 24 hours using the Countess Automated Cell Counter

(Invitrogen). For BrdU labeling, the In Situ Cell Proliferation Kit (Roche) was used. Briefly, cells grown on coverslips were pulsed with 10 µM BrdU for 30 minutes at 37°C. Cells were fixed for 45 minutes with 7 volumes 100% ethanol:3 volumes 50 mM glycine, pH 2.0 at room temperature. DNA was denatured for 15 minutes with 4M HCl and pH was neutralized with excess incubation buffer at room temperature. Cells were incubated with anti-BrdU-FLUOS antibody for 30 minutes at 37°C in the dark. Cells were processed for immunofluorescence and viewed as described above. For quantitative analysis of proliferation, 4-5 separate fields from each coverslip were analyzed using the Cell Counter plugin for ImageJ. Between 600 and 1200 cells were counted per coverslip.

Metabolic labeling

Cells were transfected with siRNA as indicated above. On the indicated days post-transfection, cells were pulsed for 30 minutes with 0.03 mCi/ml

[35S]methionine (EasyTag Express Protein Labeling Mix, PerkinElmer). The

unincorporated [35S]methionine was chased with 5% TCA/1 mM cold methionine and cells were solubilized with 1 N NaOH/0.5% DOC. Counts were determined by scintillation and normalized to total protein as measured by Lowry assay.

qRT-PCR

RNA was extracted from cells with TRIzol (Invitrogen) or mouse tissues with the RNAqueous-Micro kit (Ambion) and cDNA was synthesized using the

High-Capacity cDNA Archive Kit (ABI). qRT-PCR reactions were performed in triplicate on a GeneAmp 7300 real-time thermal cycler (ABI) using Sybr Green chemistry. GAPDH was used as endogenous control for all qRT-PCR reactions.

Primer sequences are given in Table 3-1.
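As an illustration of how triplicate reactions with an endogenous control can be converted to relative expression, the sketch below applies the widely used comparative Ct (2^-ΔΔCt) calculation. The text does not state which quantification method was used, so the method, the assumption of roughly 100% amplification efficiency, the function name, and the Ct values are all illustrative assumptions rather than a description of the actual analysis.

```python
from statistics import mean


def relative_expression(target_ct, control_ct, ref_target_ct, ref_control_ct):
    """Fold change of a target transcript in a sample relative to a reference sample.

    Uses the comparative Ct (2^-ddCt) method, which assumes ~100% amplification
    efficiency for both primer pairs. Each argument is a list of replicate Ct
    values (e.g. triplicates); control_ct is the endogenous control (e.g. GAPDH).
    """
    d_ct_sample = mean(target_ct) - mean(control_ct)      # normalize sample to control
    d_ct_ref = mean(ref_target_ct) - mean(ref_control_ct)  # normalize reference to control
    return 2 ** -(d_ct_sample - d_ct_ref)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    fold = relative_expression(
        target_ct=[21.0, 21.0, 21.0],       # pre-rRNA, knockdown cells
        control_ct=[18.0, 18.0, 18.0],      # GAPDH, knockdown cells
        ref_target_ct=[20.0, 20.0, 20.0],   # pre-rRNA, control siRNA cells
        ref_control_ct=[18.0, 18.0, 18.0],  # GAPDH, control siRNA cells
    )
    print(f"relative pre-rRNA level: {fold:.2f}")
```

A one-cycle increase in the normalized Ct of the target corresponds to a twofold reduction in transcript level under these assumptions.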

ChIP

ChIP was performed as described in Appendix. Antibodies used for ChIP were mouse anti-FLAG M5 (Sigma #F4042), rabbit anti-CHD7 (Abcam ab31824), goat anti-H3K4me2 (Abcam ab11946), mouse anti-H3K9me2 (Abcam ab1220), and rabbit anti-treacle (Abcam ab65212). All rDNA primers except the pair amplifying the 41 kb region of mouse rDNA have been previously described

[261]. Sequences of the mouse rDNA 41 kb and various nontarget ChIP-PCR primers are given in Table 3-2.

ChIP-seq

ChIP-seq libraries were prepared according to a published protocol [275].

ChIP-seq data were aligned to the rDNA-containing human and mouse genome builds and analyzed as described in the Materials and Methods section of

Chapter 2.

ChIP-chop

ChIP was performed as described, except that before qPCR was

performed, aliquots of input and ChIP DNA were digested with HpaII. ChIPs

were normalized to input and HpaII resistance was determined by normalizing

the digested DNA to mock-digested DNA. Sequences of the primers used to

amplify bases -156/+43 of the human rDNA promoter are given in Table 3-2;

primer sets H4 and H8 [261] were used to amplify the remaining regions of

rDNA.

rDNA promoter methylation analysis

Genomic DNA from DLD1-A2 or mESCs was digested with HpaII and analyzed by qPCR. HpaII resistance was calculated by normalizing digests to mock-digested DNA. Genomic DNA was also digested with MspI to ensure that the results of the HpaII digests could be interpreted in the context of CpG methylation within the CCGG sequence. The human rDNA -156/+43 primer set

(Table 3-2) was used to amplify the human rDNA promoter; primer set M0 [261] was used to amplify the mouse rDNA promoter.
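The normalization of digested to mock-digested DNA can be sketched as follows. This is a hedged illustration only: it assumes that qPCR signal is expressed as Ct values and that amplification efficiency is close to 2 per cycle, neither of which is specified in the text, and the function name and Ct values are hypothetical.

```python
def fraction_resistant(ct_digested, ct_mock, efficiency=2.0):
    """Fraction of template surviving digestion, relative to mock-digested DNA.

    ct_digested: Ct of the amplicon after restriction digestion (HpaII or MspI)
    ct_mock:     Ct of the same amplicon from mock-digested DNA
    efficiency:  assumed per-cycle amplification factor (2.0 = perfect doubling)
    """
    return efficiency ** -(ct_digested - ct_mock)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    hpaii = fraction_resistant(ct_digested=26.0, ct_mock=25.0)
    mspi = fraction_resistant(ct_digested=32.0, ct_mock=25.0)
    print(f"HpaII-resistant fraction: {hpaii:.3f}")
    print(f"MspI-resistant fraction:  {mspi:.3f}")
```

Because MspI cuts CCGG regardless of methylation at the internal cytosine whereas HpaII is blocked by it, a low MspI-resistant fraction alongside a higher HpaII-resistant fraction supports interpreting HpaII resistance as CpG methylation.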

Target                Forward primer (5'-3')     Reverse primer (5'-3')
human pre-rRNA        GAACGGTGGTGTGTCGTTC        GCGTCTCGTCTCGTCTCACT
hCHD7                 AGGTGTGTCAAAAGGTATCTTGG    GCCACTCAGTTACAAATTGCTCT
hp53                  CCTCATTCAGCTCTCGGAAC       TGACTTTGCCTGATACAGATGC
hp21                  GAAGTCCGTGTCCCTGGAG        CATGGGTTCTGACGGACATC
hGAPDH                GACACCATGGGGAAGGTGAAGG     CCTTGACGGTGCCATGGAATTTG
mouse pre-rRNA [85]   TGTGACAACTGGGCGCTGTG       CACTGAGAAAAGTGCGCGCG
mChd7                 GACATGCTAGCAGATGGTGGT      CACCTCCATGGACTCTTCCTT
mTcof1                GGATCCGACACTCGAGACTT       CGTCACCCTTCTGGACATCT
mGapdh                AAGGTCATCCCAGAGCTGAA       AGACAACCTGGTCCTCAGTGTAG

Table 3-1. qRT-PCR primers used in this study

With the exception of the mouse pre-rRNA primer set, all primers were designed for this study.

Target                     Forward primer (5'-3')          Reverse primer (5'-3')
hrDNA promoter -156/+43    GTGTGTGGCTGCGATGGT              CCAACCTCTCCGACGACAG
DLD1 CHD7 NT-1             AACTCCAGATGTCTGGTTTCACATTCAG    CAGGATAGACTTGTGAGGGATGGTG
DLD1 CHD7 NT-2             GTCTGGAGCCAGAGCTCACAAATAG       AGGAGGGTTCCCATCTCAATACCTT
DLD1 CHD7 NT-3             GAAATCAGCCCCTTTTCTTTGGTTGTT     AAAAATGGGGGTAGGTGGTAGGTG
mrDNA-41kb                 CACCAAAAGAATTTAGACTGACCA        CATGTGACCTATTGTTTCCAGGT
mESC CHD7 NT-1             ACTCCGCCCTGCAAAGAT              GCCATCCATCTCATTTCATCTC
mESC CHD7 NT-2             GTTTTCTGTTTCCCCAGATGTTT         CAGGAGATGGGTTCAGAAAATAA

Table 3-2. ChIP-PCR primers used in this study

Chapter 4

Investigation into dysregulated ribosome biogenesis as a shared pathogenic component of human haploinsufficiency syndromes

Abstract

Haploinsufficiency is a form of genetic dominance that manifests in a

diploid organism heterozygous for a loss-of-function allele. In model systems, the majority of haploinsufficient genes are involved in ribosome biogenesis: they

are involved in the transcription and/or processing of rRNA or encode ribosomal

proteins. In humans, haploinsufficiency primarily results from mutations in

transcription factors and is associated with a variety of diseases from cancer to

congenital anomaly syndromes. Interestingly, a number of haploinsufficient

human diseases involve dysregulation of ribosome biogenesis and show

congenital anomalies similar to those often observed in transcription factor

haploinsufficiency syndromes. Given these facts, we hypothesized that

dysregulation of ribosome biogenesis might be a more common pathogenic

component of haploinsufficient congenital anomaly syndromes than is currently

appreciated. We compiled a list of 58 haploinsufficient transcription factors and

chromatin-associated proteins and employed a literature/database mining

approach to find evidence that the mutated proteins might be involved in

ribosome biogenesis, particularly via rRNA transcription. We found that 10/58 queried proteins showed evidence of nucleolar localization and/or function.

Statistical analysis revealed that haploinsufficient transcription factors and chromatin-associated proteins were not more likely than their non-haploinsufficient counterparts to show evidence of nucleolar localization and/or function. We speculate that this result arises because a large number of transcription factors and chromatin-associated proteins have dual roles in nucleoplasmic and nucleolar transcription, consistent with an increasing number of studies.

Introduction

Heterozygous loss-of-function mutations rarely result in a detectable phenotype. This observation is often attributed to the metabolic theory of dominance, which posits that redundancy in cellular physiology is often able to mask the phenotypic consequences of heterozygosity for a loss-of-function allele

[294]. There are, however, exceptions to this rule wherein such heterozygosity leads to discernable phenotypes. This phenomenon is known as haploinsufficiency and occurs in eukaryotes from S. cerevisiae to humans

[203,209,295].

In yeast, haploinsufficiency primarily manifests as a slow growth phenotype under conditions favoring rapid growth. Notably, nearly half of the approximately 3% (184 genes) of the yeast genome that is haploinsufficient under these conditions is involved in ribosome biogenesis [209]. These data suggest that, in yeast, a major outcome of haploinsufficiency is impaired ribosome biogenesis and that genes involved in ribosome biogenesis are particularly dosage sensitive in rapidly dividing cells. In Drosophila, the Minute mutations are a group of more than 50 heterozygous mutations in ribosomal proteins that give rise to phenotypes including developmental delay, small body size, small bristles, and reduced viability [202,203]. Additionally, RNAi knockdown of the nucleolar protein Nopp140, a component of the rRNA processing machinery, to approximately half its normal levels gives rise to

Minute-like phenotypes [205]. The results are consistent with the findings in yeast that rapidly dividing cells are highly sensitive to altered dosage of proteins

involved in ribosome biogenesis.

While the majority of haploinsufficient genes in lower eukaryotes are

involved in ribosome biogenesis, relatively little is known about the extent to

which haploinsufficient human genes are involved in ribosome biogenesis. A few human diseases, which we have termed ribopathies [296], are known to have alterations in ribosome biogenesis as a pathogenic component [28]. Consistent

with studies in model organisms, all known ribopathies are haploinsufficiency

syndromes and also involve highly proliferative cell types, including neural crest

and various blood cell types [186,198].

As stated above, most haploinsufficient human genes are not known to be

involved in ribosome biogenesis; rather, a striking number are transcription

factors [295]. Various hypotheses have been put forth to explain why

transcription factors tend to be haploinsufficient, including that one-half the

normal level of many transcription factors is simply insufficient to activate

hundreds or thousands of target genes or alters the stoichiometry of

macromolecular complexes necessary for transcriptional regulation. The

prevailing theories regarding the molecular mechanisms of haploinsufficiency

have been the subject of several in-depth reviews and will not be discussed

further [209,297,298]. We will instead focus on the exciting fact that many more

transcription factors than previously appreciated may have dual roles in the

regulation of transcription by Pol I and Pol II. Many recent studies have provided

substantial evidence that many transcription factors and chromatin-associated

proteins, in addition to regulating nucleoplasmic gene expression, also modulate

nucleolar transcription [44,62,69,75,76,78,80,84,95-98,123,124,132,136,143,152-154,156,158,191-193,265,267,268,285,287,296,299-305]. For instance, the lineage-specific transcription factors MyoD, Mgn, Runx2, and C/EBPβ, well-known to be essential for myogenic, osteogenic, and adipogenic differentiation, respectively, also interact with rDNA and downregulate rRNA transcription [80]. These studies suggest that a general feature of developmentally important transcription factors is the regulation of rRNA transcription. Interestingly, haploinsufficiency of Runx2 causes cleidocranial dysplasia (CD), a disorder of craniofacial development [305], and a certain CD-related DNA-binding mutation of Runx2 impairs its association with rDNA [79]. These results raise the exciting possibility that CD is due to a combination of dysregulated nucleoplasmic and nucleolar transcription. A similar situation has been proposed for CHARGE syndrome, a complex developmental disorder characterized by a variable constellation of phenotypes. CHARGE is caused by heterozygous mutations in CHD7, a chromatin remodeling protein

[226,235]. CHD7 binds throughout the genome, primarily to enhancer elements, and regulates transcription of developmentally important genes [214,229,245].

Recent work has also shown that CHD7 is partially localized to the nucleolus and regulates rRNA biogenesis [296]. CHARGE is hypothesized to be due in part to defects in the neural crest, a cell population that appears to be highly sensitive to disruption of ribosome biogenesis [214]. These observations suggest that

CHARGE may, in part, be a ribopathy.

Given these facts, we set out to determine if transcription factors and

chromatin-associated proteins haploinsufficient in human congenital anomaly syndromes had nucleolar localization and/or function. We found that 10/58 proteins queried showed some evidence of nucleolar localization and/or function based on results mined from proteomic and imaging databases as well as individual research studies. Statistical analysis revealed that haploinsufficient transcription factors and chromatin-associated proteins were not more likely than non-haploinsufficient factors to display evidence of nucleolar localization and/or function.

Results

We first obtained a nearly comprehensive list of haploinsufficient genes in the human genome [306] and removed all genes haploinsufficient only in cancer syndromes. Secondly, all genes not corresponding to transcription factors or chromatin-associated proteins were discarded. Some additional genes encoding transcription factors or chromatin-associated proteins, MLL2, HDAC4, and

POLR1D, which have recently been implicated as haploinsufficient in human congenital anomaly syndromes [194,307,308], were added to this list. TERT,

encoding telomerase reverse transcriptase, was removed, as it does localize to

the nucleolus and is haploinsufficient in dyskeratosis congenita but does not

regulate rRNA transcription [309]. To further ensure that our list encompassed only congenital anomalies and anomaly syndromes, we determined whether genetic testing was available for each gene using GeneTests

(http://www.ncbi.nlm.nih.gov/sites/GeneTests) and whether mutations in the gene

had been described in multiple unrelated patients or several members of a

family. For consideration, one of these two criteria had to be met. Our final list

comprises 58 genes encoding transcription factors and chromatin-associated

proteins. We then attempted to find evidence for a nucleolar role of the proteins encoded by these genes. We searched two databases, the Nucleolar Proteome

Database (NOPdb), which contains proteomic data from nucleoli for several cell types [284], and the Human Protein Atlas, which contains immunofluorescent imaging results for a large number of proteins in multiple cell types [310]. We also made use of a recent study by Bauer et al [311] that integrated multiple

proteomic datasets for different subcellular compartments and predicted

compartment association of proteins. Lastly, we searched PubMed and Google

for references to “[protein name] and nucleolus” or “[protein name] and rRNA” in

order to find individual studies providing any evidence of their nucleolar

localization and/or involvement in rRNA transcriptional regulation.

This approach revealed that 10/58 (17.2%) queried proteins showed

experimental evidence of nucleolar localization and/or a function in rRNA

biogenesis. We then generated a set of 100 random transcription factors and

chromatin-associated proteins from previously compiled lists [312,313]. None of

the proteins in this list are known to be haploinsufficient. We assessed nucleolar

localization and/or function for these proteins as above. The number of

haploinsufficient nucleolar proteins was not significantly different from the

number of random transcription factors and chromatin-associated proteins with

evidence of nucleolar localization and/or function (P > 0.05). This did not change when we also considered proteins predicted to be nucleolar by Bauer et al [311], wherein 20/58 (34.5%) were nucleolar.

Gene      Syndrome/Anomaly(s)                                         Evidence
ALX4      -Tibial aplasia                                             B(P)
          -Lower extremity mirror image polydactyly
          -Brachyphalangy
          -Craniofacial dysmorphism
          -Genital hypoplasia
CBP       -Rubinstein-Taybi syndrome                                  [302]
CHD5      -Monosomy 1p36 syndrome                                     P
CHD7      -CHARGE syndrome                                            P, [296]
CRX       -Photoreceptor degeneration                                 N
          -Leber congenital amaurosis type III
          -Autosomal dominant cone-rod dystrophy 2
EHMT1     -9q34 subtelomeric deletion syndrome                        N
EYA1      -Branchiootorenal syndrome                                  N
FOXC1     -Axenfeld-Rieger anomaly of the anterior eye chamber        N
FOXC2     -Lymphedema-Distichiasis                                    N
FOXE3     -Anterior segment dysgenesis similar to Peters' anomaly     N
FOXF1     -Defects in formation and branching of primary lung buds    N
FOXL2     -Blepharophimosis syndrome                                  N
FOXP2     -Speech and language impairment                             N
          -Oromotor dysprax
GATA3     -Hypoparathyroidism, deafness and renal dysplasia syndrome  N
GATA4     -Congenital heart disease                                   N
GLI3      -Greig cephalopolysyndactyly syndrome                       B, I
          -Pallister-Hall syndrome
GTF2I     -Williams-Beuren syndrome                                   N
HDAC4     - mental retardation syndrome                               N
HLXB9     -                                                           N
HOXD13    -Foot malformations                                         N
IRF6      -                                                           N
          -Popliteal pterygium syndrome
LMX1B     -Nail-patella syndrome                                      B(P)
MITF      - type 2                                                    N
MLL2      -Kabuki syndrome                                            N
MSX1      -Oligodontia                                                B(P)
MSX2      -Pleiotropic defects in bone growth and ectodermal organ    B(P)
            formation
MYCN      -Reduced brain size and intestinal atresias in              B(P)
NFIA      -CNS malformations                                          B(P)
          -Urinary tract defects
NKX2-1    -Choreoathetosis                                            B(P)
          -Hypothyroidism
          -Pulmonary alterations
          -Neurologic anomalies
          -Secondary hyperthyrotropinemia
NKX2-5    -                                                           N
          -Congenital heart disease
NSD1      -                                                           N
PAX2      -Renal-coloboma syndrome                                    N
PAX3      -Developmental delay                                        N
          -Autism
PAX6      -Eye diseases                                               N
PAX8      -Congenital hypothyroidism                                  P
PAX9      -Posterior tooth agenesis                                   N
POLR1D    -Treacher Collins syndrome                                  [194]
RAI1      -Smith-Magenis syndrome                                     N
PITX2     -Rieger syndrome                                            N
RUNX1     -8p11 myeloproliferative syndrome                           [287]
RUNX2     -Cleidocranial dysplasia                                    B, [79,80]
SALL1     -Townes-Brocks syndrome                                     N
SALL4     -Okihiro syndrome                                           O - image for Abcam antibody ab29112
SHOX      -Congenital growth failure                                  N
          -Growth deficits and skeletal anomalies in Leri Weill,
            Langer and Turner syndromes
SIX6      -Bilateral anophthalmia                                     N
          -Pituitary anomalies
SMADIP1   -Mowat-Wilson syndrome                                      N
SOX2      -Hippocampal malformations                                  N
          -Epilepsy
SOX9      -Skeletal dysplasias                                        B(P)
SOX10     -Waardenburg/Hirschsprung disease                           N
SOX18     -Mental retardation                                         N
TBX1      -22q11 deletion syndrome                                    N
          -Schizophrenia
TBX3      -Ulnar-mammary syndrome                                     N
TBX5      -Holt-Oram syndrome                                         B(P)
TCF4      -Pitt-Hopkins syndrome                                      B(P)
TCOF1     -Treacher Collins syndrome                                  P, [185]
TRPS1     -Tricho-rhino-phalangeal syndrome                           N
WHSC1     -Wolf-Hirschhorn syndrome                                   I
ZIC2      -Neurological anomalies                                     N
          -Behavioral abnormalities

Table 4-1. List of transcription factors and chromatin-associated proteins associated with haploinsufficient congenital anomaly syndromes

Listed are the 58 haploinsufficient transcription factors and chromatin-associated factors and their associated syndromes/anomalies. The rightmost column indicates any evidence of nucleolar localization/function found as described in the text. B: Bauer et al [311]; B(P): predicted to be nucleolar by Bauer et al [311]; N: no evidence; P: proteomic evidence from NOPdb; I: imaging evidence from the Human Protein Atlas; O: other evidence. For evidence of nucleolar localization/function derived from individual studies, the reference number is given.
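The comparison of proportions described above can be illustrated with a small worked example. The text does not name the statistical test used or report the number of randomly selected proteins with nucleolar evidence, so the sketch below makes two explicit assumptions: it uses a two-sided Fisher's exact test, and the count for the random set is a hypothetical placeholder. Only the 10/58 figure comes from the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    hap_nucleolar, hap_total = 10, 58      # reported in the text
    rand_nucleolar, rand_total = 15, 100   # HYPOTHETICAL count, for illustration only
    p = fisher_exact_two_sided(
        hap_nucleolar, hap_total - hap_nucleolar,
        rand_nucleolar, rand_total - rand_nucleolar,
    )
    print(f"two-sided P = {p:.3f}")
```

An exact test is a reasonable choice at these sample sizes; a chi-square test on the same table would be an alternative when all expected cell counts are comfortably large.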

Discussion

In humans, the majority of haploinsufficient disorders are caused by mutations in transcription factors and chromatin-associated proteins [295,314]. This is in contrast to model organisms, in which the majority of haploinsufficient defects result from alterations in ribosome biogenesis [203,209]. Interestingly, it is becoming increasingly clear that many transcription factors and chromatin-associated proteins are involved in the regulation of both nucleoplasmic and nucleolar transcription. We therefore wished to determine if transcription factors and chromatin-associated proteins mutated in haploinsufficient human congenital anomaly syndromes, which often share features similar to those seen in ribopathies, show evidence of nucleolar localization and/or function.

We found evidence of nucleolar localization and/or function for 10/58 haploinsufficient transcription factors and chromatin-associated proteins from the list we compiled. Of these, six (CBP, CHD7, POLR1D, Runx1, Runx2, treacle) are known to have nucleolar functions [79,80,195,197,250,287,296,300,302].

The remaining four (CHD5, GLI3, SALL4, WHSC1) have not previously been studied in the context of the nucleolus and evidence of their nucleolar localization was found in proteomic and/or imaging databases. Interestingly, the diseases associated with haploinsufficiency of some of these factors share clinical features with established ribopathies. Monosomy 1p36 syndrome, involving heterozygous deletion of the CHD5 gene, includes a range of craniofacial features including

hypertelorism and downward slanting palpebral fissures [315]; these features are

also hallmarks of TCS [185]. Wolf-Hirschhorn syndrome, associated with

WHSC1 mutations, presents with variable craniofacial features including hypertelorism and micrognathia, which are also associated with TCS, as well as growth delay [316].

Statistically speaking, our analysis did not reveal an increased likelihood of haploinsufficient transcription factors and chromatin-associated proteins having

nucleolar localization and/or function. This finding may be due to the limitations

associated with the queried studies. The imaging and proteomic databases

comprise studies carried out in a handful of cell types, and individual publications

generally assess protein localization in one or two cell lines. Thus, nucleolar

localization and/or function of a particular protein might not be detected,

especially in the case of proteins important in early development. Alternatively,

this result may simply reflect the fact that a large number of transcription factors and chromatin-associated proteins appear to have dual roles in nucleoplasmic and nucleolar transcription, as evidenced by the increasing number of proteins that have been characterized as such [44,62,69,75,76,78,80,84,95-98,123,124,132,136,143,152-154,156,158,191,265,267,268,285,287,296,299-304]. It will be interesting to see if future studies of haploinsufficient transcription factors and chromatin-associated proteins reveal nucleolar localization and/or function in heretofore uninvestigated cell types or under different conditions.

More such studies will certainly be conducted as it becomes more widely appreciated that many transcription factors and chromatin-associated proteins

have dual nucleoplasmic and nucleolar transcriptional roles and these will undoubtedly reveal unexpected insights into the multifunctional nature of these proteins.

Chapter 5

Discussion and Future Directions

Summary

The synthesis of rRNA is essential to all growing cells and accounts for the vast majority of cellular RNA production [1]. Disruption of rRNA synthesis and subsequent steps of ribosome biogenesis has been implicated in human diseases ranging from congenital anomaly syndromes such as TCS to cancers overexpressing the c-Myc oncogene [28,191-193]. Consistent with its central importance in cellular physiology, rRNA transcription is tightly regulated at the level of chromatin structure.

Despite the central importance of rRNA synthesis in cellular life and the widespread use of genomic technologies such as ChIP-seq, rDNA had not previously been analyzed by genomic methodologies due to its exclusion from current genome assemblies. We wished to use genomic techniques to assemble a high-resolution map of rDNA chromatin structure. By incorporating a complete rDNA repeat into the human genome assembly, we were able to gain novel insights regarding the chromatin features of rDNA and provide new avenues of investigation into the chromatin-level regulation of rRNA transcription.

Furthermore, application of this methodology to the disease-relevant protein CHD7 opened a new line of investigation into its function, leading to the finding that CHD7 functions as a positive regulator of rRNA synthesis.

Genomic analysis of rDNA

Within each cell, several hundred rRNA-coding rDNA repeats are present.

However, only a fraction of these are transcriptionally active at any given time, with the remainder silenced by epigenetic mechanisms. Transcriptionally active


repeats adopt a euchromatic structure characterized by H3K4 methylation,

histone H3/H4 acetylation, and low levels of CpG methylation. In contrast,

transcriptionally quiescent repeats are methylated at H3K9, H3K27, and H4K20

and have high levels of CpG methylation [2]. However, these features were

mostly described using standard ChIP assays focusing on a limited number of

sites within the rDNA repeat. We therefore sought to establish a more global

picture of chromatin structure at rDNA by employing genomic methodologies.

Due to a deliberate bias against sequencing tandemly repeated

sequences, including rDNA, during the Human Genome Project [253,254] and

the high, variable copy number of rDNA, rDNA is not included in current genome

assemblies. We therefore created builds of the human and mouse genomes

containing a single copy of rDNA. We surmised that aligning reads to rDNA

alone, out of the context of the rest of the genome, would cause reads with

sufficient similarity to rDNA to align to rDNA, giving rise to false positives or

artificial inflation of signal. The rDNA sequence was added to the proximal tip of

human chromosome 13, on which rDNA is endogenously located, so that ChIP-

seq signals corresponding to rDNA could be easily compared to those on

nucleoplasmic chromatin. The limitation of this approach is that active and silent

repeats are sampled together in each ChIP experiment; thus, we cannot

definitively conclude whether ChIP signal is originating from solely active or silent

repeats.
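The construction of such a build can be illustrated with a minimal Python sketch. It assumes only what is stated above, namely that a single rDNA repeat is placed at the proximal tip of chromosome 13; the function names, the record name `chr13_rDNA`, and the toy sequences are hypothetical and do not reproduce the actual files or scripts used in this work.

```python
def add_rdna_to_chromosome(chrom_seq, rdna_seq):
    """Prepend one rDNA repeat to the proximal (5') end of a chromosome sequence."""
    return rdna_seq + chrom_seq


def shift_coordinate(pos, rdna_len):
    """Map a 0-based position on the original chromosome to the modified build."""
    return pos + rdna_len


def write_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [">" + name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


# Toy sequences only; a real rDNA repeat is tens of kilobases long.
rdna = "ACGT" * 3
chr13 = "NNNNGGCCTTAA"
modified = add_rdna_to_chromosome(chr13, rdna)
record = write_fasta("chr13_rDNA", modified)
```

Because the repeat is prepended, every existing chromosome 13 annotation must be shifted downstream by the length of the inserted repeat, which is what `shift_coordinate` expresses.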

Analysis of histone modification ChIP-seq datasets in four human cell types revealed striking differences in the distributions of active and repressive


modifications. Enrichment of both active and repressive modifications was noted

proximal to the rDNA promoter, and novel sites of enrichment were noted for

both classes of modifications. Active modifications were distributed more

similarly within and between cell types than repressive marks. The coding region

of rDNA was found to reside within highly open, nucleosome-depleted,

transcriptionally active chromatin. We performed ChIP-seq analysis of two nucleolar proteins, RPA116 and UBF, confirming their distributions on rDNA as previously described by ChIP-PCR. Surprisingly, UBF binding was detected throughout the genome. UBF was associated with highly expressed genes and

UBF knockdown subtly altered the expression of two genes. These results raise the possibility that UBF has a more general transcriptional regulatory role outside of the nucleolus. Lastly, we demonstrate rDNA association of the insulator-

binding protein CTCF at the rDNA spacer promoter, suggesting that

transcriptional insulation plays a role in regulating rRNA transcription. Taken

together, these experiments provide novel insights into rDNA chromatin structure

and provide a viable method for genomic investigation of chromatin-mediated

regulation of rRNA transcription.

CHD7 positively regulates rRNA synthesis

An increasing number of diseases are being recognized as having

dysregulated ribosome biogenesis as a pathogenic component [28]. We

therefore wished to apply our rDNA ChIP-seq method to a chromatin-associated protein with a known disease association to investigate a possible role for dysregulated rRNA synthesis in disease pathogenesis. We chose to use ChIP-seq data for CHD7, mutated in two-thirds of cases of CHARGE syndrome [226],

as this disorder shares a number of clinical features with established ribopathies,

including TCS and DBA [28,226].

Analysis of CHD7 ChIP-seq data confirmed robust association of CHD7

with rDNA in both mouse and human cells, a result that was validated by ChIP-

PCR. Further analysis confirmed partial nucleolar localization of CHD7. We

found that knockdown of CHD7 decreased the levels of pre-rRNA and impaired

cell proliferation and protein synthesis, while CHD7 overexpression increased

pre-rRNA levels. Loss of CHD7 was associated with an increase in CpG

methylation at the rDNA promoter, suggesting that CHD7 initiates or helps to

maintain a euchromatic chromatin state at rDNA. CHARGE-relevant tissues from

Chd7-heterozygous mouse embryos also displayed reduced pre-rRNA levels, suggesting that this function of CHD7 is relevant in vivo. Lastly, loss of CHD7 in mESCs reduced the association of the TCS protein treacle with rDNA, potentially suggesting pathogenic overlap between these two syndromes.

Discussion and future directions

How does CHD7 promote rRNA biogenesis?

While our results clearly demonstrate that CHD7 is a positive regulator of rRNA biogenesis, the mechanism by which CHD7 achieves this effect has yet to be determined. In these studies, we measured rRNA expression by performing

qRT-PCR for the 45S pre-rRNA transcript with primers residing in the 5'-ETS.

This region of the pre-rRNA is removed in the second processing step following

transcription [35]. Thus, our results do not absolutely preclude a role for CHD7 in

regulating early rRNA processing. However, based on its rDNA chromatin association and its ATPase activity, we speculate that CHD7 regulates rRNA transcription.

Given that CHD7 is likely to be an ATP-dependent chromatin remodeling enzyme, a potential scenario is that CHD7 functions to initiate or maintain open chromatin at rDNA repeats, in turn promoting the association of factors necessary for rRNA transcription. In support of this hypothesis, genetic or siRNA-mediated depletion of CHD7 reduces rDNA transcriptional output and increases promoter CpG methylation. This issue could be addressed further by performing ChIP for positive regulators of rRNA transcription (e.g., Pol I, UBF) after CHD7 depletion. It would also be informative to perform ChIP for CHD7 following inhibition of Pol I transcription by ActD treatment. If CHD7 functions in initiating the open chromatin state at rDNA, it is likely that CHD7 rDNA association precedes Pol I transcriptional initiation and would therefore not depend on ongoing Pol I transcription. In contrast, if CHD7 functions in the maintenance of open chromatin at rDNA, it is likely that its rDNA association would be dependent upon ongoing Pol I transcription.

Given that CHD7 may be able to remodel chromatin, it is also possible that CHD7 regulates the position of the rDNA promoter nucleosome, similar to the SNF2H subunit of NoRC, though with an opposing effect on rRNA transcription. As CHD7 promotes rDNA transcription, it seems likely that in this scenario CHD7 would function to slide the promoter nucleosome upstream of the


rDNA TSS to impair promoter CpG methylation. Studies of rDNA nucleosome

positioning in CHD7-deficient cells are necessary to test this hypothesis.

Regulation of chromatin structure at rDNA is accomplished by the actions

of multiprotein complexes such as NoRC and B-WICH that target chromatin

remodeling as well as multiple histone-modifying activities to rDNA [96,133,152].

Thus, it stands to reason that CHD7 might function in complex with additional

proteins to establish a euchromatic state at rDNA. Mass spectrometric

identification of CHD7 binding partners in nucleolar protein preparations would be most informative in this regard. ChIP analysis of histone modifications at rDNA following CHD7 knockdown could also help elucidate rDNA regulatory partners, as the enzymes mediating most histone modifications are known. A change in the level of a particular histone modification at rDNA would suggest a limited number of potentially responsible modifying enzymes that could then be tested for physical interaction with CHD7.

Another interesting line of investigation involves the potential connection between CHD7 and treacle. In Chd7-/- mESCs, the association of treacle with

rDNA is reduced, supporting the role of CHD7 in rRNA transcription and

suggesting functional interaction of the two proteins [226]. Interestingly, co-IP

assays showed that treacle can be immunoprecipitated with antibodies to FLAG

in the DLD1-A2 cell line, though the reciprocal interaction was not observed.

Further studies will analyze if treacle is required for CHD7 rDNA association.

Genetic crosses between Chd7+/- and Tcof1+/- mice will also be performed with

the aim of highlighting a synergistic interaction between the two mutations, which


would provide strong evidence that the role of CHD7 in rRNA biogenesis is

relevant in vivo.

Is dysregulated rRNA transcription a pathogenic component of CHARGE syndrome?

The results presented here, which establish CHD7 as a positive regulator of rRNA biogenesis, pose an interesting question: does CHARGE syndrome arise solely from dysregulation of nucleoplasmic transcription, dysregulation of rRNA transcription, or a combination of both?

Relevance of dysregulated rRNA transcription to CHARGE syndrome

Studies of Chd7+/- mouse embryos suggest that the role of CHD7 in

upregulating rRNA transcription is relevant in vivo. Chd7+/- embryos display

reduced neural stem cell proliferation in the developing olfactory placode, which

may be responsible for the olfactory defects seen in CHARGE patients [238].

Additionally, CHD7 deficiency in the developing mouse inner ear, one of the most

frequently affected tissues in mice and humans heterozygous for CHD7 mutations, reduces

neuroblast number due to reduced cell proliferation [247]. Lastly, the work

presented here shows that heterozygous loss of Chd7 reduces pre-rRNA levels

in the eye and ear, two of the most commonly affected tissues in CHARGE

syndrome [296].

In the hNCLC system established by Wysocka and colleagues, CHD7

depletion reduces the formation of migratory neural crest cells [214]. However,

the basis of this defect was not conclusively determined. Interestingly, it is

known that Tcof1 haploinsufficiency in mice causes a substantial reduction in the


number of migratory neural crest cells due to a proliferation defect and

subsequent apoptosis [198]. Thus, it may be hypothesized that CHD7-deficient hNCLCs do not proliferate properly due to reduced rRNA levels.

Though we cannot rule out the possibility that loss of CHD7 affects cell proliferation via dysregulation of nucleoplasmic gene targets, these studies strongly suggest that the role of CHD7 in upregulating rRNA expression is relevant in vivo and its dysregulation may therefore contribute to the pathogenesis of CHARGE syndrome.

Relevance of dysregulated nucleoplasmic transcription to CHARGE syndrome

There is also strong evidence for a nucleoplasmic transcriptional role of

CHD7, the dysregulation of which is likely to be relevant to CHARGE syndrome pathogenesis. CHD7 binds thousands of sites throughout the genome that show features of enhancer elements [229,245], a number of which are capable of

activating a luciferase reporter gene. Loss of CHD7 causes only subtle effects

on gene expression in mESCs and does not affect pluripotency, self-renewal, or somatic reprogramming, suggesting that CHD7 is more important for transcriptional regulation at later stages of development [245]. Indeed, shRNA-

mediated depletion of CHD7 in hNCLCs causes reduced formation of migratory neural crest cells, and morpholino knockdown of CHD7 in Xenopus leads to

CHARGE-like phenotypes and dysregulation of genes important for neural crest development, such as Sox9, Twist, and Slug [214]. In the mouse inner ear, where loss of CHD7 causes proliferation defects, loss of Chd7 also leads to dysregulated expression of several genes important for inner ear development


including Otx2, Fgf10, and Eya1 [247].

An intriguing potential aspect of nucleoplasmic CHD7 function has come to light thanks to the recent characterization of multiple enhancer classes. Active enhancers are marked with H3K4me1 and acetylation of lysine 27 on histone H3

(H3K27ac), while poised enhancers may be marked with H3K27me3 or no covalent modification of H3K27 [317,318]. Active enhancers are associated with

highly expressed genes with ESC-specific functions in ESCs, while poised

enhancers are associated with genes expressed at later stages of development

[317,318]. Upon differentiation to various cell types, subsets of lineage-specific

poised enhancers are activated, while ESC-specific enhancers and enhancers necessary for specification of other lineages are inactivated. Thus, it appears that differentiation programs are epigenetically encoded at the earliest stages of development. We find that, in mESCs, CHD7 is bound to both active (~25%) and

poised (~75%) enhancers (Figure 5-1). CHD7-bound active mESC enhancers

are associated with general biological processes related to metabolism, gene

expression, and cytoskeletal organization as well as early embryonic phenotypes

(Table 5-1A,B). Intriguingly, CHD7-bound poised enhancers are associated with

developmental processes and CHARGE-like mouse phenotypes in the eye,

brain, and nervous system (Table 5-1C,D). It therefore appears that CHD7

participates in the maintenance of the poised state at enhancers linked to genes required later in development and, at later stages of development, participates in transcriptional activation of the genes associated with the formerly poised enhancers. This hypothesis is supported by the observations that Chd7-/-


Figure 5-1. CHD7 associates with active and poised mESC enhancers

Heatmap representation of CHD7, H3K4me1, and H3K27ac signals ± 5 kb of every CHD7-bound site at least 1 kb away from a known TSS in mESCs. CHD7 can clearly be seen to associate with active enhancers marked with H3K4me1 and H3K27ac as well as poised enhancers marked with H3K4me1 but not H3K27ac. Median signals were z-transformed, K-means clustered with Gene Cluster 3.0 [319], and visualized with Java TreeView [280].
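The z-transformation and K-means steps named in this legend can be illustrated with a small Python sketch. This is not the Gene Cluster 3.0 procedure itself: it is a plain Lloyd's-algorithm stand-in with caller-supplied starting centroids, it assumes that each mark is z-transformed across sites, and the signal values are invented toy numbers rather than data from this work.

```python
from statistics import mean, pstdev


def zscore(values):
    """Z-transform a list of signals (population standard deviation)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]


def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            updated.append([mean(col) for col in zip(*members)] if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


# Hypothetical median signals at six CHD7-bound sites.
h3k4me1 = [5.0, 6.0, 5.5, 6.5, 5.2, 6.1]
h3k27ac = [8.0, 9.0, 0.5, 0.4, 0.6, 8.5]
points = list(zip(zscore(h3k4me1), zscore(h3k27ac)))
labels = kmeans(points, centroids=[points[0], points[2]])
```

In this toy example the sites separate by H3K27ac signal, which mirrors the active versus poised distinction drawn in the legend; the real analysis clustered signal across the full ± 5 kb windows.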


GO Biological Process                                  P-value
Regulation of cellular protein metabolic process       3.37 x 10^-34
Posttranscriptional regulation of gene expression      7.89 x 10^-27
Cytoskeleton organization                              4.54 x 10^-26
Cell junction organization                             7.75 x 10^-25
Stem cell differentiation                              4.62 x 10^-24
Negative regulation of molecular function              2.03 x 10^-23
Cell-cell junction organization                        2.55 x 10^-23
Cell junction assembly                                 2.31 x 10^-22
Regulation of binding                                  1.75 x 10^-19

Table 5-1A. GO biological process annotations associated with genes linked to CHD7-bound active enhancers in mESCs

The 10 most significant GO biological processes are listed.

Mouse phenotype                                               P-value
Embryonic growth retardation                                  2.66 x 10^-39
Decreased embryo size                                         6.56 x 10^-36
Abnormal neural tube closure                                  1.15 x 10^-35
Abnormal cell proliferation                                   1.92 x 10^-31
Embryonic lethality between implantation and placentation     4.64 x 10^-30
Decreased cell proliferation                                  2.30 x 10^-29
Embryonic lethality before somite formation                   2.00 x 10^-28
Abnormal intestinal epithelium morphology                     3.88 x 10^-27
Abnormal placenta morphology                                  8.37 x 10^-27
Exencephaly                                                   3.70 x 10^-25

Table 5-1B. Mouse phenotype annotations associated with genes linked to

CHD7-bound active enhancers in mESCs

The 10 most significant mouse phenotypes are listed.


GO Biological Process               P-value
Stem cell differentiation           6.79 x 10^-18
Embryonic placenta development      1.01 x 10^-13
Axis specification                  1.51 x 10^-12
Developmental induction             1.29 x 10^-10
D/V axis specification              8.64 x 10^-10
Induction of an organ               2.33 x 10^-9
A/P axis specification              5.64 x 10^-8
Body morphogenesis                  1.62 x 10^-7

Table 5-1C. GO biological process annotations associated with genes linked to CHD7-bound poised enhancers in mESCs

All developmental biological processes contained in the 20 most significant GO

biological process annotations are listed.

Mouse phenotype                           P-value
Abnormal optic vesicle formation          9.31 x 10^-20
Anophthalmia                              5.11 x 10^-17
Abnormal neural fold formation            5.59 x 10^-15
Delayed neural tube closure               6.16 x 10^-14
Abnormal hippocampus development          3.83 x 10^-12
Absent mandible                           9.49 x 10^-12
Retina hypoplasia                         1.72 x 10^-10
Abnormal oculomotor nerve morphology      3.91 x 10^-10

Table 5-1D. Mouse phenotype annotations associated with genes linked to

CHD7-bound poised enhancers in mESCs

All CHARGE-relevant phenotypes contained in the 20 most significant mouse

phenotype annotations are listed. For all analyses in Table 5-1, a list of

coordinates representing 1 kb centered on the midpoint of each CHD7-bound

active or poised enhancer was uploaded to GREAT [320].
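The coordinate preparation described in this legend (1 kb centered on the midpoint of each enhancer) can be sketched as follows. The function names and the example interval are hypothetical, and the use of 0-based, half-open BED-style coordinates is an assumption; the legend does not state which convention was used.

```python
def center_window(chrom, start, end, width=1000):
    """Return a window of `width` bp centered on the midpoint of an interval."""
    midpoint = (start + end) // 2
    half = width // 2
    # Clamp at zero so that sites near a chromosome start stay valid.
    return chrom, max(0, midpoint - half), midpoint + half


def to_bed_line(chrom, start, end):
    """Format an interval as a tab-separated BED line."""
    return f"{chrom}\t{start}\t{end}"


# Hypothetical 400 bp CHD7-bound enhancer.
window = center_window("chr1", 10000, 10400)
bed_line = to_bed_line(*window)
```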


mESCs are capable of properly differentiating to all three germ layers and that

CHD7 is dispensable for mESC pluripotency, self-renewal, and somatic

reprogramming [245].

A dual-function model for CHD7

In light of these findings, we propose a model for dual transcriptional

functions of CHD7 (Figure 5-2). In the nucleoplasm, CHD7 participates in the

maintenance of a poised state at enhancer elements in mESCs and functions to

activate transcription of their associated genes at the appropriate time and place

during development. In the nucleolus, CHD7 upregulates rRNA transcription to

ensure proper cell proliferation throughout development. Given that loss of

CHD7 both affects nucleoplasmic transcription and leads to reduced pre-rRNA

levels and cell proliferation during development, we speculate that both functions

of CHD7 are required during development and that it is the dysregulation of both

of these functions that leads to CHARGE syndrome.

rDNA copy number, CpG methylation, and phenotypic variability in

CHARGE syndrome

One of the most notable features of CHARGE syndrome is its phenotypic

variability. Patients with identical mutations often have vastly different clinical

manifestations and even monozygotic (MZ) twins affected with CHARGE

syndrome display phenotypic discordance [321,322]. Additionally, certain tissues

appear to be more susceptible to CHD7 haploinsufficiency. For instance, the

temporal bone is affected in 98% of screened CHD7 mutation-positive CHARGE

patients, while choanal atresia is observed in just 38% [226]. The involvement of


Figure 5-2. A model for dual functions of CHD7

In ESCs, nucleoplasmic CHD7 associates with cell type-specific factors primarily

at poised enhancers to restrict the expression of genes required for later stages

of development. During differentiation, CHD7 cooperates with cell type-specific

factors to promote the expression of these genes as poised enhancers become

active. In the nucleolus, CHD7 functions to upregulate rRNA expression and

promote proper cell proliferation throughout development.


CHD7 in rRNA biogenesis suggests some potential explanations for this

phenotypic variability and tissue-specific susceptibility to CHD7 haploinsufficiency related to the structure and epigenetic status of rDNA, which may guide future studies in this area.

rDNA copy number variation

Each human genome contains several hundred copies of the rDNA repeat, arrayed in a tandemly repeated fashion within NORs. However, the

absolute rDNA copy number as well as the number of rDNA repeats within each

NOR is highly variable [7]. We hypothesize that, if dysregulation of the function

of CHD7 in rRNA transcription is a pathogenic component of CHARGE

syndrome, rDNA copy number variations might account for phenotypic

differences between CHARGE patients. Support for this hypothesis comes from

recent studies of the chromatin remodeling protein ATRX, which is mutated in

alpha-thalassemia with X-linked mental retardation. ATRX was found to bind

tandemly repeated sequences both at telomeres and euchromatin. Most

interestingly, phenotypic severity (measured as the degree of alpha-thalassemia

in this case) was correlated with the length of a certain tandem repeat bound by

ATRX [323]. This suggests that copy number of tandemly repeated elements

can influence transcriptional output. By extension, the transcriptional output of

rRNA may depend on the number of repeats present in a given cell.

CpG methylation of rDNA

CpG methylation plays a key role in the regulation of rRNA transcription as

well as the genomic stability of rDNA [89,91,156]. Alterations in CpG methylation

at rDNA are associated with many diseases, including ATR-X syndrome, systemic lupus erythematosus (SLE), Alzheimer's disease, and cancer [89,324-326].

Hypermethylation of rDNA is also found in the brains of suicide victims [327].

Intriguingly, MZ twins discordant for SLE display differences in CpG methylation at rDNA [325]. Hypomethylation of rDNA in SLE patients may contribute to the disease phenotype via overexpression of rRNA and increased assembly of ribosomes, as auto-antibodies to rRNA and ribosomal proteins have been reported in SLE patients. Thus, differing degrees of rDNA hypomethylation could have variable effects on the SLE phenotype [325].

Implications of variable rDNA copy number and CpG methylation for phenotypic variability in CHARGE syndrome

Given the evidence that changes in rDNA copy number and CpG methylation appear to influence a variety of disease phenotypes, we postulate that variations in rDNA copy number and/or CpG methylation influence the susceptibility of certain tissues to CHARGE-related defects and impart phenotypic variability to CHARGE-affected individuals, particularly MZ twins or patients with identical mutations. In terms of the influence of copy number and

CpG methylation on tissue susceptibility to CHD7 loss, we suggest that the cells that give rise to tissues frequently involved in CHARGE syndrome, such as the inner ear, have lower rDNA copy number and/or higher basal CpG methylation at rDNA than the cells that give rise to less frequently affected tissues, such as the choanae. The lower rDNA copy number and/or higher basal CpG methylation would then synergize with the increased rDNA methylation caused by CHD7 loss


[296] to cause defects in that tissue. Due to the difficulty of obtaining the appropriate tissues from human patients, it would be advantageous to investigate these possibilities in the mouse. CHARGE-relevant tissues (e.g., ear, eye, heart) could be dissected from wild-type mouse embryos and their DNA subjected to qPCR and Southern blotting to estimate rDNA copy number as well as to bisulfite sequencing to assess rDNA CpG methylation.
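The two measurements proposed here reduce to simple arithmetic, sketched below in Python. The sketch assumes a standard comparative Ct calculation against a single-copy reference locus with 100% amplification efficiency, and it scores bisulfite reads at one CpG as C (protected, methylated) or T (converted, unmethylated). The function names and all numbers are hypothetical; no such analysis was performed in this work.

```python
def relative_copy_number(ct_rdna, ct_ref, ct_rdna_cal, ct_ref_cal):
    """Relative rDNA copy number of a sample versus a calibrator tissue.

    Each Ct for rDNA is normalized to a single-copy reference locus, and
    perfect doubling per cycle is assumed (comparative Ct method).
    """
    delta_sample = ct_rdna - ct_ref
    delta_calibrator = ct_rdna_cal - ct_ref_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


def methylation_fraction(calls):
    """Fraction of bisulfite reads retaining C at a CpG site.

    `calls` holds one base per read; bases other than C or T are ignored.
    """
    informative = [base for base in calls if base in ("C", "T")]
    if not informative:
        raise ValueError("no informative reads at this site")
    return informative.count("C") / len(informative)


# Hypothetical values: the sample amplifies rDNA one cycle earlier than the
# calibrator at equal reference Ct, implying twice the rDNA copy number.
fold = relative_copy_number(18.0, 26.0, 19.0, 26.0)
methylated = methylation_fraction("CCTTN")
```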

We propose a similar scenario to explain phenotypic variability between individuals with CHARGE syndrome. It is highly likely that different individuals have different rDNA copy number and CpG methylation levels; thus, an individual severely affected with CHARGE syndrome may have a lower rDNA copy number and/or higher level of CpG methylation at rDNA than a mildly affected patient.

Analysis of rDNA copy number and rDNA methylation, as described above, could be performed on lymphoblast or fibroblast cell lines from CHARGE patients to test these possibilities.

Dissecting nucleoplasmic and nucleolar functions of CHD7

Testing the requirements for a nucleoplasmic function of CHD7 in the mouse

To assess the importance of the nucleoplasmic function of CHD7 during development, it may be possible to create a version of CHD7 that is targeted to the nucleolus via the creation of a chimeric protein. The Chd7 gene could be endogenously fused to a characterized nucleolar targeting sequence [328] in mESCs using the method previously used to generate FLAG-tagged CHD7 alleles in DLD1 cells [286]. Presumably, this fusion would cause an increase in the amount of nucleolar CHD7, leading to impaired nucleoplasmic transcriptional


regulation. Following validation of increased nucleolar localization of CHD7,

mESCs harboring one or two copies of the fusion allele could then be used to

generate transgenic mice to be examined for CHARGE-like phenotypes. Several

results are possible:

(1) Only the nucleoplasmic functions of CHD7 are required for proper

development. Phenotypes observed in this instance would likely be equivalent to

loss of CHD7: mice heterozygous for the fusion allele would presumably develop similarly to Whirligig or Chd7 gene-trap embryos due to insufficient nucleoplasmic CHD7

concentration, while mice homozygous for this allele would likely die around the

time Chd7-null mice die.

(2) Only the nucleolar functions of CHD7 are required for proper

development. In this case, increased nucleolar concentration of CHD7 by

heterozygosity or homozygosity for the fusion allele might not result in overt

developmental defects. It is also possible that upregulation of CHD7 nucleolar

function might cause developmental defects, as it has recently been shown that

microduplication of chromosome 8q12, including CHD7, leads to defects similar

to those seen in CHARGE syndrome [329]. However, the duplicated interval

contains several genes besides CHD7; therefore, further studies are needed to elucidate the consequences of increased CHD7 dosage in the context of development. Interestingly, overexpression of CHD7 is common in small cell

lung cancer (SCLC) [330]. However, it is not known if there is a higher nucleolar

concentration of CHD7 in these cancers. Increased nucleolar localization of

CHD7 via the fusion allele would likely increase rRNA transcription and cell

proliferation, potentially leading to developmental defects and/or cancer, though the likelihood of these outcomes is uncertain.

(3) A combination of the nucleoplasmic and nucleolar functions of CHD7 is required for proper development. In this scenario, cells requiring CHD7 would proliferate properly but, assuming that CHD7 is necessary for lineage specification later in development, would not properly commit to the required lineages.

This would likely also lead to heterozygous and homozygous defects similar to those seen with genetic deficiencies in Chd7.

The relevance of the function of CHD7 in rRNA biogenesis to CHARGE syndrome could also be tested through genetic crosses, as described above using Tcof1+/- mice. Chd7+/- mice could be crossed to mice harboring mutations in either positive or negative regulators of rRNA synthesis. If dysregulated rRNA biogenesis contributes to the CHARGE-like phenotypes of Chd7 mutant mice, it might be expected that crossing Chd7+/- mice to mice mutant for a negative regulator of rRNA synthesis such as TIP5 would rescue these phenotypes to some extent through partial restoration of rRNA synthesis. On the other hand, crossing Chd7+/- mice to mice harboring deficiencies in positive regulators of rRNA synthesis such as UBF would likely result in a worsening of the CHARGE-like phenotypes and potentially embryonic lethality of the double-heterozygous mice.
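For planning such crosses, the expected genotype classes can be enumerated with a short Python sketch. It assumes unlinked loci, Mendelian segregation, and no genotype-dependent lethality, which the possibility of embryonic lethality raised above may violate; observed deviations from these ratios would themselves be informative. The function name and genotype encoding are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected offspring genotype frequencies for independently assorting genes.

    Each parent maps a gene name to its two alleles, e.g. ("+", "-").
    """
    genes = sorted(parent_a)
    # For each gene, every way of drawing one allele from each parent.
    per_gene = [list(product(parent_a[g], parent_b[g])) for g in genes]
    counts = Counter()
    total = 0
    for combo in product(*per_gene):
        genotype = tuple("/".join(sorted(pair)) for pair in combo)
        counts[genotype] += 1
        total += 1
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Chd7+/- crossed to Tcof1+/-; genotypes are reported as (Chd7, Tcof1).
chd7_het = {"Chd7": ("+", "-"), "Tcof1": ("+", "+")}
tcof1_het = {"Chd7": ("+", "+"), "Tcof1": ("+", "-")}
expected = cross(chd7_het, tcof1_het)
```

Under these assumptions each of the four genotype classes, including the double heterozygote, is expected in one quarter of the offspring.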

Separating the functions of CHD7 using patient-specific iPSCs

A recent, powerful advance in the investigation of disease etiology has come in the form of induced pluripotent stem cells (iPSCs). These are terminally

differentiated cells that are reprogrammed back to a pluripotent state by expressing a defined set of pluripotency factors [331-333]. This technology has already been used to generate iPSC lines from fibroblasts from patients with a variety of diseases, from Rett syndrome to retinal degeneration, some of which have already contributed novel insights into the pathogenic mechanisms of various diseases [334-338]. This approach could also be applied to cell lines from CHARGE syndrome patients, particularly to assess the requirements for

CHD7 in the specification of various cell lineages.

Cells from CHARGE patients and normal controls, likely fibroblasts, would be harvested and reprogrammed into a pluripotent state via forced expression of the critical pluripotency factors OCT4, KLF4, SOX2, and MYC [331]. Following the establishment of stable CHARGE and control iPSC lines, a variety of studies would be possible. To test the requirements for CHD7 in the differentiation of

CHARGE-relevant tissues, CHARGE iPSCs could be differentiated along neural or cardiac lineages. Upon differentiation to various cell types, expression profiling, immunofluorescence, and functional assays could be conducted to determine if proper differentiation has occurred. During differentiation, assays to measure rRNA levels and proliferation would also be conducted. Defects in differentiation would implicate dysregulated nucleoplasmic transcription, while proliferation defects would point to impaired rRNA transcription and ribosome biogenesis.

CHARGE iPSC lines also offer a unique advantage to the study of CHD7 missense mutations. While the majority (70%) of CHD7 mutations are nonsense


or frameshift and are predicted to be loss-of-function, a substantial fraction (15%) are missense [226]. These missense mutations can fall within known protein domains of CHD7 but have also been reported outside them [322,339]. It is not

certain how CHD7 protein containing a missense mutation causes CHARGE

syndrome. An obvious explanation for the mutations falling within protein

domains of CHD7 is that they disrupt the domain and therefore a critical function

of the protein, essentially leading to a loss-of-function phenotype. Indeed,

expression of a CHD7 helicase mutant in Xenopus embryos leads to CHARGE-

like phenotypes [214]. However, mutations of the catalytic residues within the

CHD7 helicase domains or the critical residues for methyl-histone binding of the

chromodomains have not been described in CHARGE patients, and it is unknown how mutation of non-critical residues would affect the function of these domains. Even less obvious is how missense mutations in non-domain regions of CHD7 cause CHARGE syndrome. They may disrupt protein-protein interactions, sites of post-translational modification, or subcellular localization. It would be particularly interesting to determine if mutations in particular regions of

CHD7 preferentially affect its nucleolar localization and functions. All of these possibilities have yet to be explored, and patient-specific iPSCs afford the opportunity to study disease-relevant mutations in a powerful system.

Investigating CHD7 nucleolar targeting via nucleolar protein interactions

There is increasing evidence that nucleolar protein localization may be dependent on interactions with certain nucleolar "hub" proteins, namely nucleolin and nucleophosmin (NPM) [328]. These proteins appear to bind various NoLSs


to target proteins to the nucleolus. In a yeast two-hybrid screen, 14 hits between

CHD7 and nucleolin were observed (D.M. Martin, personal communication).

Therefore, the targeting of CHD7 to the nucleolus may be dependent on its

interaction with nucleolin. Co-IP analysis of CHD7 and nucleolin from nucleolar

protein preparations could be performed as a first step to assessing this

mechanism.

Further experiments to address this possibility are complex. Nucleolin appears to bind consensus nucleolar localization sequences (NoLSs) within proteins to target them to the nucleolus. Due to its large size, CHD7 contains a large number of putative NoLSs; a search of the CHD7 protein sequence using the R/K-R/K-X-R/K consensus reveals over two dozen potential

NoLSs. Additionally, several of these sequences are located within consensus

NLSs and their mutation might also disrupt nucleoplasmic localization of CHD7.
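The motif search described above can be illustrated with a short script. This is only a minimal sketch under stated assumptions: the function name find_putative_nols and the toy sequence are hypothetical, and the lookahead is used so that overlapping matches are reported, which may or may not match how the original search counted hits.

```python
import re

# R/K-R/K-X-R/K: two basic residues, any residue, then one basic residue.
# The lookahead makes the scan report overlapping occurrences as well.
NOLS_PATTERN = re.compile(r"(?=([RK][RK].[RK]))")

def find_putative_nols(protein_seq):
    """Return (1-based start position, matched tetrapeptide) for each motif hit."""
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group(1)) for m in NOLS_PATTERN.finditer(seq)]

# Toy sequence for illustration only; this is not the CHD7 sequence.
toy = "MAKRSRGGKKLKAAPRRPKT"
hits = find_putative_nols(toy)
for position, motif in hits:
    print(position, motif)
```

Because the pattern is only four residues long and three of its positions accept two amino acids, it is expected to occur frequently in any large, basic protein, which is why binding assays with protein fragments are proposed to narrow the candidates.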

A first step in addressing this issue is to determine the region(s) of CHD7 necessary for interaction with nucleolin. Epitope-tagged CHD7 fragments could be created and used for in vitro binding assays with recombinant nucleolin.

Constructs encoding region(s) of CHD7 sufficient for nucleolin interaction could then be transfected into cells, followed by immunofluorescence analysis of subcellular localization using antibodies directed against the epitope tag attached to the protein fragment. Once the fragments sufficient for interaction with nucleolin have been identified, detailed sequence analysis of these fragments could be performed to find consensus NoLSs. Following this analysis,

consensus NoLSs within nucleolin-interacting fragments of CHD7 would be


mutated and used for in vitro nucleolin-binding assays and subcellular localization analysis. The ultimate goal of such studies would be to ascertain the minimum number of residues necessary and sufficient for CHD7-nucleolin

interaction and to create a construct encoding full-length CHD7 with a minimal

number of residues mutated so as to impair its nucleolar localization but not its

nucleoplasmic localization or function. It is unlikely that the reciprocal

experiments (i.e. mutation of nucleolin to determine regions necessary and

sufficient for CHD7 interaction) would be fruitful, as knockdown of nucleolin leads

to nucleolar disruption [340,341] and overexpression of mutated forms of

nucleolin would, presumably, lead to activation of the nucleolar stress response.

CHD7 and rRNA biogenesis: a connection to cancer?

One of the hallmarks of cancer is cell proliferation in the absence of

growth signals [342]. Given that rRNA synthesis and ribosome biogenesis are

controlled by a vast number of growth signal-regulated pathways, it stands to

reason that dysregulated rRNA and ribosome biogenesis may contribute to

tumorigenesis. Indeed, an emerging theme in cancer biology is the contribution

of dysregulated rRNA synthesis and ribosome biogenesis to carcinogenesis

[343]. Well-known tumor suppressors including PTEN, p53, Rb, and p19Arf inhibit

rRNA biogenesis [344-348], while oncoproteins such as c-Myc promote rRNA

synthesis [191-193]. Overexpression of kinases that regulate UBF activity, such

as casein kinase II (CK2) [349] and ERK1/2 [14], may also contribute to

abnormal rRNA transcription in cancer. Lastly, genes encoding ribosomal

proteins are overexpressed in a wide range of cancers [350-355].


CHD7 is overexpressed in several diverse cancers: hepatitis C-induced hepatocellular carcinoma (HCC) (Figure 5-3), ovarian cancers (Figure 5-4), and gliomas (Figure 5-5). Recent work has also demonstrated rearrangements and overexpression of the CHD7 gene in small cell lung cancer (SCLC) [330]. These

studies raise the possibility that excess CHD7 might function oncogenically by

upregulating rRNA transcription. It will be interesting to determine if excess

CHD7 protein is localized to the nucleolus. Experiments testing the ability of

excess CHD7 to promote anchorage-independent growth, colony formation, and

invasion would be informative in testing the oncogenic potential of CHD7.

Analysis of rRNA levels and the nucleoplasmic transcriptome following CHD7

overexpression and/or knockdown might also provide insight into the

requirements for the nucleolar and/or nucleoplasmic functions of CHD7 in

oncogenesis.

Further applications of rDNA genomics

The approach to aligning high-throughput genomic data to rDNA presented here

is a simple method to gain novel insights into chromatin structure and protein

occupancy at rDNA. Combining this approach with additional techniques such as

bisulfite sequencing could yield further insights into rDNA chromatin structure and help distinguish the structures of active and silent repeats at high resolution.

Condition-dependent alterations in rDNA chromatin structure

It is well known that the transcription of rRNA is responsive to environmental

conditions such as nutrient levels and growth factor stimulation, and this

responsiveness is mediated by epigenetic mechanisms [17,94,101]. It would be interesting to assay the levels and distributions of histone modifications and nucleosomes as well as chromatin accessibility and transcription from rDNA under various conditions, such as serum or glucose starvation versus normal growth conditions or with and without growth factor stimulation. Combining ChIP-seq, MNase-seq, DNase-seq, and RNA-seq with our method for rDNA alignment makes these studies feasible.

Figure 5-3. CHD7 expression in HCV-induced HCCs

Microarray expression data were obtained from GEO (GSE6764) and normalized using the RMA method [283]. P values were calculated by one-way ANOVA with Bonferroni correction. *, P < 0.0001 vs. normal tissue. +, P < 0.01; ++, P < 0.0001 vs. cirrhotic tissue. †, P < 0.05; ††, P < 0.01; †††, P < 0.0001 vs. low-grade dysplastic tissue. ‡, P < 0.01; ‡‡, P < 0.0001 vs. high-grade dysplastic tissue.

Figure 5-4. CHD7 expression in ovarian cancers

Microarray expression data were obtained from GEO (GSE6008) and normalized using the RMA method [283]. P values were calculated by one-way ANOVA with Bonferroni correction. *, P < 0.0001 vs. normal tissue.

Figure 5-5. CHD7 expression in gliomas

Microarray expression data were obtained from GEO (GSE4290) and normalized using the RMA method [283]. P values were calculated by one-way ANOVA with Bonferroni correction. *, P < 0.05; **, P < 0.0001 vs. normal tissue. +, P < 0.05; ++, P < 0.01; +++, P < 0.0001 vs. glioblastoma stage 4.

Distinguishing active and inactive rDNA repeats

Of the several hundred rDNA repeats in the cell, a fraction are

transcriptionally active and adopt a euchromatic structure. The

remainder are transcriptionally silent and heterochromatic [2]. The major

disadvantage of our rDNA alignment method is that a mixed population of active

and silent rDNA repeats is sampled together in each dataset we aligned. To

distinguish if signals are originating primarily from active or silent repeats, it may

be advantageous to combine high-throughput sequencing approaches with DNA

methylation-sensitive assays, as transcriptionally inactive repeats are DNA

hypermethylated [2]. Several approaches are possible. ChIP is given as the

prototypical experimental technique for these assays, but they could also be

adapted to MNase-seq and DNase-seq approaches.

(1) ChIP with high-throughput bisulfite sequencing (CBS). ChIP would be

performed as usual, but immunoprecipitated DNA would be bisulfite-converted

prior to sequencing. Bisulfite treatment converts unmethylated cytosine residues

to uracil but does not affect 5-methylcytosine (5-mC) [356]. Analytical

approaches taking into account the mismatches to the reference genome have


been developed and could be incorporated into our rDNA alignment pipeline. This methodology would give a single-base resolution readout of the level of DNA methylation at the rDNA repeats bound by a particular protein and, by extension, indicate whether that protein is bound to active, silent, or both types of rDNA repeats.
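The logic of reading methylation from bisulfite-converted reads can be sketched in a few lines. This is a simplified illustration, not the pipeline used in this work: the function names are hypothetical, only the top strand is modeled, and the read is assumed to be already aligned to the reference without indels. Uracil produced by bisulfite treatment is read as thymine after amplification, so unmethylated cytosines appear as C-to-T mismatches.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate the top-strand read obtained after bisulfite treatment and PCR.

    Unmethylated C is read as T; a C whose 0-based index is in `methylated`
    is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )

def call_methylation(reference, read):
    """Compare an aligned, converted read with the unconverted reference.

    Returns {position: bool} for each reference C, where True means the
    cytosine was retained (methylated) and False means it was converted.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base == "C" and read_base in "CT":
            calls[i] = read_base == "C"
    return calls

# Toy reference for illustration only; positions 1 and 5 are set as methylated.
reference = "ACGTCCGGA"
read = bisulfite_convert(reference, methylated={1, 5})
calls = call_methylation(reference, read)
print(read, calls)
```

In practice the C-to-T mismatches also have to be tolerated at the alignment step itself, which is why dedicated bisulfite-aware aligners are used rather than a standard short-read aligner.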

(2) ChIP with methylation-sensitive restriction digest and high-throughput sequencing (ChIP-chop-seq). A similar approach, referred to as ChIP-chop, has been extensively used in combination with PCR-based assays to determine the preference of histone modifications and other proteins for active or silent rDNA repeats [93,100,133,287,296]. Prior to sequencing, ChIP DNA is digested with the HpaII restriction enzyme. HpaII recognizes the sequence

CCGG but will not cut if the internal and/or external C residues are methylated.

Thus, HpaII resistance is used as a measure of CpG methylation at a given

CCGG site. Digestion with MspI, an isoschizomer of HpaII, is used as a control.

MspI cuts the CCGG sequence regardless of the methylation status of the internal C residue, but will not cut if the external C is methylated. Thus, MspI digestion serves to ensure that methylation of CCGG sequences is occurring at

CpG dinucleotides. The drawback of this approach is that it relies upon the presence of CCGG tetranucleotides; not all CpG dinucleotides in rDNA are contained within such sequences. Thus, this technique has limited resolution and provides only a rough estimate of the degree of CpG methylation on DNA associated with a given protein.
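Both the enzyme logic and the resolution limit described above can be made concrete with a small sketch. The function names and the toy sequence are hypothetical, and the cleavage rules are simply those stated in the preceding paragraph; applying the counting function to the actual rDNA reference would show what fraction of its CpGs are interrogated by HpaII.

```python
def cuts(enzyme, internal_c_methylated, external_c_methylated):
    """Cleavage rule for a single CCGG site, as stated in the text."""
    if enzyme == "HpaII":
        # Blocked by methylation of either cytosine.
        return not (internal_c_methylated or external_c_methylated)
    if enzyme == "MspI":
        # Insensitive to the internal C, blocked by the external C.
        return not external_c_methylated
    raise ValueError(f"unknown enzyme: {enzyme}")

def cpg_coverage_by_ccgg(seq):
    """Return (number of CpGs lying inside a CCGG site, total number of CpGs)."""
    seq = seq.upper()
    cpg_positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]
    # The CpG of a CCGG site begins one base after the site begins.
    in_ccgg = [i for i in cpg_positions if i >= 1 and seq[i - 1:i + 3] == "CCGG"]
    return len(in_ccgg), len(cpg_positions)

# Toy sequence for illustration only; this is not rDNA.
toy = "ACGTCCGGTTCGACCGGA"
covered, total = cpg_coverage_by_ccgg(toy)
print(f"{covered} of {total} CpGs fall within CCGG sites")

# A site that resists HpaII but is cut by MspI is methylated at the CpG.
cpg_methylated_site = (cuts("HpaII", True, False), cuts("MspI", True, False))
```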

(3) Sequential immunoprecipitation with high-throughput sequencing

(ChIP-MeDIP-seq). Alternative approaches to analyzing DNA methylation based


on immunoprecipitation have also come into widespread use. Methylated DNA

immunoprecipitation (MeDIP) utilizes an antibody directed against 5-mC to pull

down methylated DNA. This approach could be adapted for use in sequential

ChIP [357]. Sequential ChIP, also called ChIP-reChIP, enables the analysis of

simultaneous occupancy of two factors at a given genomic locus. The assay

would begin as a ChIP with an antibody directed against a protein of interest.

After immunoprecipitation, an aliquot of eluted ChIP DNA would be carried

forward into a MeDIP assay. This would enrich methylated sequences bound by

the protein of interest. Thus, upon alignment of ChIP-MeDIP-seq data to rDNA, if a protein was bound to highly methylated repeats, robust signal would be obtained, while if a protein was bound to hypomethylated repeats, low signal would be observed. A drawback to this approach is that sequential ChIP often gives a low signal in the second immunoprecipitation, as a substantial amount of

DNA must be retained for analysis of the initial ChIP; however, this limitation may be bypassed due to the high copy number of rDNA, which could facilitate efficient immunoprecipitation from small quantities of DNA. Performing ChIP from isolated nucleoli may also be useful in circumventing this issue. A second drawback is that, as with ChIP-chop, MeDIP does not provide single-base resolution.

Thus, the most accurate measure of methylation at rDNA repeats bound by a particular protein is likely to be CBS.

Large-scale analysis of protein occupancy at rDNA

The widespread adoption of ChIP-seq has led to the rapid accumulation of thousands of datasets from many species. In particular, the


ENCODE and modENCODE consortia have undertaken massive efforts to

catalog protein occupancy in organisms from humans to C. elegans [358-360].

Data generated by the ENCODE consortium are readily available through the

UCSC genome browser [361] and include datasets for hundreds of transcription factors, chromatin-associated proteins, and histone modifications in dozens of cell types. Public repositories such as the Gene Expression Omnibus (GEO) and

Sequence Read Archive (SRA) [362] also contain thousands of ChIP-seq datasets generated by independent labs. The availability of such a large amount of data, combined with the simplicity of our rDNA alignment method, could greatly expand our understanding of chromatin-level regulation of rRNA transcription and could yield novel insights into disease pathogenesis, as was described for

CHARGE syndrome in this study. Given that many transcription factors and chromatin-associated proteins are now appreciated to have roles in both Pol I and Pol II transcription, these studies will likely become more commonplace.

Additionally, ChIP-seq analysis of rDNA might also facilitate evolutionary

studies of rDNA regulation. Given that the sequence of the IGS is highly

divergent between species as closely related as human and mouse

[31,32,39,363] and that histone modifications are enriched within the human IGS,

it stands to reason that there are species-specific putative regulatory elements

within the IGS. Using our rDNA alignment method, this hypothesis could be

easily tested in species with available histone modification ChIP-seq data.


Appendix

Detailed chromatin immunoprecipitation protocol

This protocol was adapted from a previously published procedure:

Schmidt, D., Wilson, M.D., Spyrou, C., Brown, G.D., Hadfield, J., Odom, D.T.

(2009). ChIP-seq: Using high-throughput sequencing to discover protein-DNA interactions. Methods 48: 240-248.


A. Pre-blocking and binding of antibodies to beads

Note: the magnetic stand is used to collect beads between all washing steps in this protocol.
Note: it's usually most convenient to start this step the afternoon before you want to perform the ChIP or in the morning of the day you will start the ChIP, if you plan to perform lysis and sonication later in the day.

1. Aliquot 100 µl of Protein G DynaBeads (Invitrogen) per ChIP into microfuge tubes.
2. Wash beads 3 times with 1 ml of 1x PBS/0.5% BSA.
3. Resuspend beads in 250 µl 1x PBS/0.5% BSA.
4. Add antibodies to the resuspended beads. The amount of antibody used depends on the target protein and may need to be determined empirically.
5. Bind antibodies to beads for at least 3 hours at 4°C with rocking.
6. Remove antibody solution and wash beads 3 times with 1 ml 1x PBS/0.5% BSA. Resuspend beads in 100 µl 1x PBS/0.5% BSA.

B. Crosslinking

Suspension cells

1. Add 1/10 volume of 11% formaldehyde solution directly to the culture medium, swirl to mix, and let sit at room temperature for 15 minutes.
2. Add 1/20 volume of 2.5 M glycine to quench formaldehyde.
3. Transfer quenched cells to a 15 or 50 ml conical tube, depending on the volume.
4. Collect cells by centrifugation at 2000 x g for 5 minutes.
5. Wash cells twice with 5 ml cold PBS; repeat centrifugation after each wash.
6. After the second wash, remove PBS and proceed to lysis or freeze cells in liquid nitrogen and store at -80°C until needed.

Adherent cells

1. Add 1/10 volume of 11% formaldehyde solution directly to the culture medium, swirl to mix, and let sit at room temperature for 15 minutes.
2. Add 1/20 volume of 2.5 M glycine to quench formaldehyde.
3. Wash cells three times with 5 ml cold PBS. Remove PBS from the first and second washes but leave the PBS from the third wash.
4. Scrape cells into the PBS from the third wash using a silicone scraper. Transfer to a 15 ml conical tube.
5. Add 5 ml PBS and repeat scraping and collection.
6. Collect cells by centrifugation at 2000 x g for 5 minutes and proceed to lysis or freeze cells in liquid nitrogen and store at -80°C until needed.

C. Cell lysis and sonication

Note: all lysis buffers should be supplemented with protease inhibitors.


1. Resuspend each tube of cells in 10 ml lysis buffer 1 and rock for 10 minutes at 4°C. Collect cells by centrifugation at 4°C at 2000 x g for 2 minutes.
2. Resuspend each tube of cells in 10 ml lysis buffer 2 and rock for 10 minutes at 4°C. Collect cells by centrifugation at 4°C at 2000 x g for 2 minutes.
3. Resuspend each tube of cells in 3 ml lysis buffer 3.
4. Sonicate resuspended cells using a Misonix Sonicator 3000 with the following parameters: output level 7 (power output will be about 27-30 W), 18 cycles of 25 seconds followed by a 1 minute rest. Samples should be kept in ice-water at all times, which will dissipate heat much more efficiently than ice alone.
5. Add 1/10 volume 10% Triton X-100 to each sample and split each sample into three microfuge tubes. Pellet debris by centrifugation at 4°C at maximum speed for 10 minutes in a microfuge.
6. Combine supernatants into a 15 ml tube. If you are doing multiple ChIPs from one sample, add an additional 3 ml lysis buffer 3 (with Triton X-100) per additional ChIP. For instance, if you are doing four ChIPs from one sample, add 9 ml of lysis buffer 3 to the lysate for a total volume of 12 ml and split into four 15 ml tubes.
7. Save a 100 µl aliquot of lysate as input.
8. Add antibody-bound beads (resuspended in 100 µl 1x PBS/0.5% BSA) to each tube of lysate and rock overnight at 4°C.

D. Washing and elution

Note: all washing steps (1-4) should be carried out in the cold room with ice-cold buffers. Wash buffer should be supplemented with protease inhibitors.

1. Split the lysate-antibody mixture from each ChIP into 3 microfuge tubes and collect beads with the magnetic stand.
2. Remove supernatant from each tube and resuspend the first tube in 1 ml of wash buffer. Transfer mixture to second tube, resuspend beads, and transfer mixture to the last tube. This is wash 1.
3. Wash beads 4 additional times with 1 ml of wash buffer.
4. Wash beads once with 1 ml PBS.
5. Remove PBS and add 115 µl of elution buffer to each ChIP. Heat at 65°C overnight to elute proteins and reverse crosslinks. Vortex each sample briefly every 2 minutes for the first 10-15 minutes of the elution.
6. Thaw input sample and add 300 µl elution buffer. Heat at 65°C overnight.

E. Digestion of cellular RNA and protein and purification of DNA

1. Transfer ChIP supernatants to new tubes. Add 1 volume TE buffer, pH 8.0 to ChIP and input samples.
2. Add 10 mg/ml RNase A to each sample to a final concentration of 0.2 mg/ml. Add 4.6 µl to ChIPs and 16 µl to inputs. Incubate 1 hour at 37°C.


3. Add 20 mg/ml Proteinase K to each sample to a final concentration of 0.2 mg/ml. Add 2.3 µl to ChIPs and 8 µl to inputs. Incubate 2 hours at 55°C.
4. Add 1.5 µl 20 mg/ml glycogen (30 µg) and 5 M NaCl to a final concentration of 0.2 M to each sample. Add 10 µl NaCl to ChIPs, 32 µl NaCl to inputs.
5. Transfer samples to Phaselock tubes and add 1 volume phenol/chloroform/isoamyl alcohol. Shake vigorously for 15 seconds and centrifuge at maximum speed in a microfuge at room temperature.
6. Transfer aqueous phase to a new tube and add 2 volumes 100% ethanol. Note: you will need to split the input aqueous phase into 2 microfuge tubes in order to carry out this precipitation. Precipitate for at least 30 minutes at -80°C.
7. Centrifuge samples for 10 minutes at 4°C at maximum speed in a microfuge. The pellets should appear white and fluffy.
8. Remove ethanol and wash pellets twice with 500 µl 100% ethanol. Centrifuge samples for 2 minutes at 4°C at maximum speed in a microfuge after each wash.
9. Briefly air-dry pellets and resuspend in 65 µl 10 mM Tris, pH 8.0.
10. The concentrations of input samples can be measured using the NanoDrop; however, the concentrations of ChIP samples are generally too low for reliable spectrophotometric quantification. More sensitive assays such as PicoGreen can be used if accurate quantification of ChIP DNA is desired.

Notes on ChIP-seq library preparation

We generally perform library preparation for ChIP-seq samples as described [275] with minor modifications, which are detailed below.

1. Rather than resuspending the ChIP DNA in 30 µl of 10 mM Tris, pH 8.0 and using the entire ChIP sample for library preparation, we use approximately half the sample by resuspending in 65 µl Tris and using 30 µl of resuspended sample.
2. Rather than processing 5-50 ng of whole cell extract for input library preparation, we use 500 ng of purified input DNA.
3. We use 40-fold diluted paired-end genomic adapters for adapter ligation (section 2.2.9) and genomic primers 1.0 and 2.0 for library amplification (section 2.2.10), which are for paired-end runs. This ensures that the library can be used for single- and paired-end sequencing.
4. A separate gel is not used to purify each library (section 2.2.11). We have found that as long as libraries on a single gel are separated by one large well there are no problems with cross-contamination.


5. After gel purification of the library, we split the library into five 3 µl aliquots and save 2-3 aliquots in case there is a problem with the sequencing run.
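The reagent volumes quoted in section E of the protocol follow from the relation C1 x V1 = C2 x V2. The sketch below reproduces them under two assumptions that are not stated explicitly in the protocol: the full 115 µl of ChIP eluate is recovered before TE is added, and the small volume of each added reagent is ignored when computing the final concentration. The function name is hypothetical.

```python
def stock_volume_ul(sample_volume_ul, stock_conc, final_conc):
    """Volume of stock to add so that the sample reaches final_conc.

    Uses C1 * V1 = C2 * V2 with V2 approximated by the sample volume.
    stock_conc and final_conc must be in the same units.
    """
    return sample_volume_ul * final_conc / stock_conc

# Assumed sample volumes after adding 1 volume of TE (step E1).
CHIP_UL = 115 + 115           # eluate plus TE
INPUT_UL = (100 + 300) * 2    # input aliquot plus elution buffer, then TE

# (stock concentration, final concentration) in matching units.
additions = {
    "RNase A": (10, 0.2),        # mg/ml
    "Proteinase K": (20, 0.2),   # mg/ml
    "NaCl": (5, 0.2),            # M
}

volumes = {
    name: (stock_volume_ul(CHIP_UL, stock, final),
           stock_volume_ul(INPUT_UL, stock, final))
    for name, (stock, final) in additions.items()
}

for name, (chip_ul, input_ul) in volumes.items():
    print(f"{name}: ChIP {chip_ul:.1f} ul, input {input_ul:.1f} ul")
```

Under these assumptions the RNase A and Proteinase K volumes match the protocol, and the NaCl volume for ChIP samples comes out slightly below the 10 µl given in step 4, which appears to be rounded up.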


Solutions

PBS/BSA (50 ml)
Dissolve 0.25 g BSA in 50 ml 1x PBS

11% formaldehyde solution (50 ml)
14.9 ml 37% formaldehyde
1 ml 5 M NaCl
100 µl 0.5 M EDTA, pH 8.0
50 µl 0.5 M EGTA, pH 8.0
2.5 ml 1 M HEPES, pH 8.0

Lysis buffer 1 (500 ml)
25 ml 1 M HEPES-KOH, pH 7.5
14 ml 5 M NaCl
1 ml 0.5 M EDTA, pH 8.0
50 ml glycerol
25 ml 10% NP-40
1.25 ml Triton X-100

Lysis buffer 2 (500 ml)
20 ml 5 M NaCl
1 ml 0.5 M EDTA, pH 8.0
0.5 ml 0.5 M EGTA, pH 8.0
5 ml 1 M Tris, pH 8.0

Lysis buffer 3 (500 ml)
1 ml 0.5 M EDTA, pH 8.0
0.5 ml 0.5 M EGTA, pH 8.0
5 ml 1 M Tris, pH 8.0
10 ml 5 M NaCl
5 ml 10% Na-Deoxycholate
2.5 g N-lauroyl-sarcosine

Wash buffer (500 ml)
25 ml 1 M HEPES, pH 7.6
1 ml 0.5 M EDTA, pH 8.0
35 ml 10% Na-Deoxycholate
50 ml 10% NP-40
10 ml 5 M LiCl or 2.12 g LiCl powder

Elution buffer (50 ml)
2.5 ml 1 M Tris, pH 8.0
1 ml 0.5 M EDTA, pH 8.0
5 ml 10% SDS
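The recipes above list stock volumes rather than final concentrations. As a worked example, the following sketch converts the lysis buffer 1 recipe into final concentrations and checks that the two ways of adding LiCl to the wash buffer are equivalent. The function name is hypothetical, pure liquids (glycerol, Triton X-100) are treated as 100% stocks, and the molar mass of LiCl is taken as 42.39 g/mol.

```python
def final_concentration(stock_conc, stock_ml, total_ml):
    """Final concentration after diluting stock_ml of stock to total_ml."""
    return stock_conc * stock_ml / total_ml

TOTAL_ML = 500

# Lysis buffer 1: stock concentrations in mM or percent, volumes in ml.
lysis_buffer_1 = {
    "HEPES-KOH (mM)": final_concentration(1000, 25, TOTAL_ML),
    "NaCl (mM)": final_concentration(5000, 14, TOTAL_ML),
    "EDTA (mM)": final_concentration(500, 1, TOTAL_ML),
    "glycerol (%)": final_concentration(100, 50, TOTAL_ML),
    "NP-40 (%)": final_concentration(10, 25, TOTAL_ML),
    "Triton X-100 (%)": final_concentration(100, 1.25, TOTAL_ML),
}

for component, value in lysis_buffer_1.items():
    print(f"{component}: {value:g}")

# Wash buffer: 10 ml of 5 M LiCl should contain the same mass as the powder option.
LICL_G_PER_MOL = 42.39                       # assumed molar mass of LiCl
licl_grams = 5.0 * (10 / 1000) * LICL_G_PER_MOL
print(f"LiCl in 10 ml of 5 M stock: {licl_grams:.2f} g")
```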


Bibliography

1. Moss T, Langlois F, Gagnon-Kugler T, Stefanovsky V (2007) A housekeeper with power of attorney: the rRNA genes in ribosome biogenesis. Cell. Mol. Life Sci. 64:29-49.
2. McStay B, Grummt I (2008) The Epigenetics of rRNA Genes: From Molecular to Chromosome Biology. Annu. Rev. Cell Dev. Biol. 24:131-157.
3. Boisvert F-M, van Koningsbruggen S, Navascues J, Lamond AI (2007) The multifunctional nucleolus. Nat. Rev. Mol. Cell Biol. 8:574-585.
4. Davison J, Tyagi A, Comai L (2007) Large-scale polymorphism of heterochromatic repeats in the DNA of Arabidopsis thaliana. BMC Plant Biol. 7:44.
5. Prokopowich C, Gregory T, Crease T (2003) The correlation between rDNA copy number and genome size in eukaryotes. Genome 46:48-50.
6. Kobayashi T (2011) Regulation of ribosomal RNA gene copy number and its role in modulating genome integrity and evolutionary adaptability in yeast. Cell. Mol. Life Sci.:1-9.
7. Stults DM, Killen MW, Pierce HH, Pierce AJ (2008) Genomic architecture and inheritance of human ribosomal RNA gene clusters. Genome Res. 18:13-18.
8. Belikov SV, Dzherbashyajan AR, Preobrazhenskaya OV, Karpov VL, Mirzabekov AD (1990) Chromatin structure of Drosophila melanogaster ribosomal genes. FEBS Lett. 273:205-207.
9. Mais C, Scheer U (2001) Molecular architecture of the amplified nucleoli of Xenopus oocytes. J. Cell Sci. 114:709-718.
10. Rodland KD, Russell PJ (1983) Ribosomal genes of Neurospora crassa: Constancy of gene number in the conidial and mycelial phases, and homogeneity in length and restriction enzyme cleavage sites within strains. Mol. Gen. Genet. 192:285-287.
11. Kabler RL, Srinivasan A, Taylor LJ, Mowad J, Rothblum LI, et al. (1996) Androgen regulation of ribosomal RNA synthesis in LNCaP cells and rat prostate. J. Steroid Biochem. Mol. Biol. 59:431-439.
12. Sheng Z, Liang Y, Lin C-Y, Comai L, Chirico WJ (2005) Direct Regulation of rRNA Transcription by Fibroblast Growth Factor 2. Mol. Cell. Biol. 25:9419-9426.
13. Zhao J, Yuan X, Frödin M, Grummt I (2003) ERK-Dependent Phosphorylation of the Transcription Initiation Factor TIF-IA Is Required for RNA Polymerase I Transcription and Cell Growth. Mol. Cell 11:405-413.
14. Stefanovsky VY, Pelletier G, Hannan R, Gagnon-Kugler T, Rothblum LI, et al. (2001) An Immediate Response of Ribosomal Transcription to Growth Factor Stimulation in Mammals Is Mediated by ERK Phosphorylation of UBF. Mol. Cell 8:1063-1073.
15. Hoppe S, Bierhoff H, Cado I, Weber A, Tiebe M, et al. (2009) AMP-activated protein kinase adapts rRNA synthesis to cellular energy supply. Proc. Natl. Acad. Sci. U. S. A. 106:17781-17786.


16. Hannan KM, Brandenburger Y, Jenkins A, Sharkey K, Cavanaugh A, et al. (2003) mTOR-Dependent Regulation of Ribosomal Gene Transcription Requires S6K1 and Is Mediated by Phosphorylation of the Carboxy-Terminal Activation Domain of the Nucleolar Transcription Factor UBF. Mol. Cell. Biol. 23:8862-8877.
17. Stefanovsky V, Langlois F, Gagnon-Kugler T, Rothblum LI, Moss T (2006) Growth Factor Signaling Regulates Elongation of RNA Polymerase I Transcription in Mammals via UBF Phosphorylation and r-Chromatin Remodeling. Mol. Cell 21:629-639.
18. Mariappan MM, D'Silva K, Lee MJ, Sataranatarajan K, Barnes JL, et al. (2011) Ribosomal biogenesis induction by high glucose requires activation of upstream binding factor in kidney glomerular epithelial cells. Am. J. Physiol. Renal Physiol. 300:F219-F230.
19. James MJ, Zomerdijk JCBM (2004) Phosphatidylinositol 3-Kinase and mTOR Signaling Pathways Regulate RNA Polymerase I Transcription in Response to IGF-1 and Nutrients. J. Biol. Chem. 279:8911-8918.
20. Sun H, Tu X, Baserga R (2006) A Mechanism for Cell Size Regulation by the Insulin and Insulin-Like Growth Factor-I Receptors. Cancer Res. 66:11106-11109.
21. Kishimoto K, Liu S, Tsuji T, Olson KA, Hu G-f (2004) Endogenous angiogenin in endothelial cells is a general requirement for cell proliferation and angiogenesis. Oncogene 24:445-456.
22. Ko Y-G, Kang Y-S, Kim E-K, Park SG, Kim S (2000) Nucleolar Localization of Human Methionyl-tRNA Synthetase and Its Role in Ribosomal RNA Synthesis. J. Cell Biol. 149:567-574.
23. Lin C-Y, Navarro S, Reddy S, Comai L (2006) CK2-mediated stimulation of Pol I transcription by stabilization of UBF–SL1 interaction. Nucleic Acids Res. 34:4752-4766.
24. Gomes C, Smith SC, Youssef MN, Zheng J-J, Hagg T, et al. (2011) RNA Polymerase 1-driven Transcription as a Mediator of BDNF-induced Neurite Outgrowth. J. Biol. Chem. 286:4357-4363.
25. Luyken J, Hannan RD, Cheung JY, Rothblum LI (1996) Regulation of rDNA transcription during endothelin-1-induced hypertrophy of neonatal cardiomyocytes. Hyperphosphorylation of upstream binding factor, an rDNA transcription factor. Circ. Res. 78:354-361.
26. Ibaragi S, Yoshioka N, Kishikawa H, Hu JK, Sadow PM, et al. (2009) Angiogenin-Stimulated rRNA Transcription Is Essential for Initiation and Survival of AKT-Induced Prostate Intraepithelial Neoplasia. Mol. Cancer Res. 7:415-424.
27. Bouche G, Gas N, Prats H, Baldin V, Tauber J, et al. (1987) Basic fibroblast growth factor enters the nucleolus and stimulates the transcription of ribosomal genes in ABAE cells undergoing G0→G1 transition. Proc. Natl. Acad. Sci. U. S. A. 84:6770-6774.
28. Narla A, Ebert BL (2010) Ribosomopathies: human disorders of ribosome dysfunction. Blood 115:3196-3205.


29. Henderson AS, Warburton D, Atwood KC (1972) Location of Ribosomal DNA in the Human Chromosome Complement. Proc. Natl. Acad. Sci. U. S. A. 69:3394-3398.
30. Dev VG, Tantravahi R, Miller DA, Miller OJ (1977) Nucleolus organizers in Mus musculus subspecies and in the RAG mouse cell line. Genetics 86:389-398.
31. Gonzalez IL, Sylvester JE (1995) Complete Sequence of the 43-kb Human Ribosomal DNA Repeat: Analysis of the Intergenic Spacer. Genomics 27:320-328.
32. Grozdanov P, Georgiev O, Karagyozov L (2003) Complete sequence of the 45-kb mouse ribosomal DNA repeat: analysis of the intergenic spacer. Genomics 82:637-643.
33. Learned RM, Learned TK, Haltiner MM, Tjian RT (1986) Human rRNA transcription is modulated by the coordinate binding of two factors to an upstream control element. Cell 45:847-857.
34. Haltiner MM, Smale ST, Tjian R (1986) Two distinct promoter elements in the human rRNA gene identified by linker scanning mutagenesis. Mol. Cell. Biol. 6:227-235.
35. Henras A, Soudet J, Gérus M, Lebaron S, Caizergues-Ferrer M, et al. (2008) The post-transcriptional steps of eukaryotic ribosome biogenesis. Cell. Mol. Life Sci. 65:2334-2359.
36. Grummt I, Rosenbauer H, Niedermeyer I, Maier U, Öhrlein A (1986) A repeated 18 bp sequence motif in the mouse rDNA spacer mediates binding of a nuclear factor and transcription termination. Cell 45:837-846.
37. Grummt I, Kuhn A, Bartsch I, Rosenbauer H (1986) A transcription terminator located upstream of the mouse rDNA initiation site affects rRNA synthesis. Cell 47:901-911.
38. Arnheim N, Krystal M, Schmickel R, Wilson G, Ryder O, et al. (1980) Molecular evidence for genetic exchanges among ribosomal genes on nonhomologous chromosomes in man and apes. Proc. Natl. Acad. Sci. U. S. A. 77:7323-7327.
39. Gonzalez IL, Sylvester JE (2001) Human rDNA: Evolutionary Patterns within the Genes and Tandem Arrays Derived from Multiple Chromosomes. Genomics 73:255-263.
40. Kuhn A, Grummt I (1987) A novel promoter in the mouse rDNA spacer is active in vivo and in vitro. EMBO J. 6:3487-3492.
41. Mayer C, Schmitz K-M, Li J, Grummt I, Santoro R (2006) Intergenic Transcripts Regulate the Epigenetic State of rRNA Genes. Mol. Cell 22:351-361.
42. Santoro R, Schmitz K-M, Sandoval J, Grummt I (2010) Intergenic transcripts originating from a subclass of ribosomal DNA repeats silence ribosomal RNA genes in trans. EMBO Rep. 11:52-58.
43. Schmitz K-M, Mayer C, Postepska A, Grummt I (2010) Interaction of noncoding RNA with the rDNA promoter mediates recruitment of DNMT3b and silencing of rRNA genes. Genes Dev. 24:2264-2269.


44. van de Nobelen S, Rosa-Garrido M, Leers J, Heath H, Soochit W, et al. (2010) CTCF regulates the local epigenetic state of ribosomal DNA repeats. Epigenetics Chromatin 3:19.
45. Kuhn A, Deppert U, Grummt I (1990) A 140-base-pair repetitive sequence element in the mouse rRNA gene spacer enhances transcription by RNA polymerase I in a cell-free system. Proc. Natl. Acad. Sci. U. S. A. 87:7527-7531.
46. Pikaard CS, Pape LK, Henderson SL, Ryan K, Paalman MH, et al. (1990) Enhancers for RNA polymerase I in mouse ribosomal DNA. Mol. Cell. Biol. 10:4816-4825.
47. Osheim YN, Mougey EB, Windle J, Anderson M, O'Reilly M, et al. (1996) Metazoan rDNA enhancer acts by making more genes transcriptionally active. J. Cell Biol. 133:943-954.
48. Roussel P, Andre C, Masson C, Geraud G, Hernandez-Verdun D (1993) Localization of the RNA polymerase I transcription factor hUBF during the cell cycle. J. Cell Sci. 104:327-337.
49. Roussel P, André C, Comai L, Hernandez-Verdun D (1996) The rDNA transcription machinery is assembled during mitosis in active NORs and absent in inactive NORs. J. Cell Biol. 133:235-246.
50. Sogo JM, Ness PJ, Widmer RM, Parish RW, Koller T (1984) Psoralen-crosslinking of DNA as a probe for the structure of active nucleolar chromatin. J. Mol. Biol. 178:897-919.
51. Sollner-Webb B, Mougey E (1991) News from the nucleolus: rRNA gene expression. Trends Biochem. Sci. 16:58-62.
52. Grummt I (1999) Regulation of mammalian ribosomal gene transcription by RNA polymerase I. Prog. Nucleic Acid Res. Mol. Biol. 62:109-154.
53. Judelson HS, Vogt VM (1982) Accessibility of ribosomal genes to trimethyl psoralen in nuclei of Physarum polycephalum. Mol. Cell. Biol. 2:211-220.
54. Dammann R, Lucchini R, Koller T, Sogo JM (1993) Chromatin structures and transcription of rDNA in yeast Saccharomyces cerevisiae. Nucleic Acids Res. 21:2331-2338.
55. Lucchini R, Pauli U, Braun R, Koller T, Sogo JM (1987) Structure of the extrachromosomal ribosomal RNA chromatin of Physarum polycephalum. J. Mol. Biol. 196:829-843.
56. Tremblay M, Toussaint M, D'Amours A, Conconi A (2009) Nucleotide excision repair and photolyase repair of UV photoproducts in nucleosomes: assessing the existence of nucleosome and non-nucleosome rDNA chromatin in vivo. Biochem. Cell Biol. 87:337-346.
57. Ness P, Labhart P, Banz E, Koller T, Parish R (1983) Chromatin structure along the ribosomal DNA of Dictyostelium. Regional differences and changes accompanying cell differentiation. J. Mol. Biol. 166:361-381.
58. Tongaonkar P, French SL, Oakes ML, Vu L, Schneider DA, et al. (2005) Histones are required for transcription of yeast rRNA genes by RNA polymerase I. Proc. Natl. Acad. Sci. U. S. A. 102:10129-10134.


59. Jones HS, Kawauchi J, Braglia P, Alen CM, Kent NA, et al. (2007) RNA polymerase I in yeast transcribes dynamic nucleosomal rDNA. Nat. Struct. Mol. Biol. 14:123-130. 60. Merz K, Hondele M, Goetze H, Gmelch K, Stoeckl U, et al. (2008) Actively transcribed rRNA genes in S. cerevisiae are organized in a specialized chromatin associated with the high-mobility group protein Hmo1 and are largely devoid of histone molecules. Genes Dev. 22:1190-1204. 61. Ahmad K, Henikoff S (2002) The Histone Variant H3.3 Marks Active Chromatin by Replication-Independent Nucleosome Assembly. Mol. Cell 9:1191-1200. 62. Birch JL, Tan BCM, Panov KI, Panova TB, Andersen JS, et al. (2009) FACT facilitates chromatin transcription by RNA polymerases I and III. EMBO J. 28:854-865. 63. Mongelard F, Bouvet P (2007) Nucleolin: a multiFACeTed protein. Trends Cell Biol. 17:80-86. 64. Murano K, Okuwaki M, Hisaoka M, Nagata K (2008) Transcription Regulation of the rRNA Gene by a Multifunctional Nucleolar Protein, B23/Nucleophosmin, through Its Histone Chaperone Activity. Mol. Cell. Biol. 28:3114-3126. 65. Rickards B, Flint SJ, Cole MD, LeRoy G (2007) Nucleolin Is Required for RNA Polymerase I Transcription In Vivo. Mol. Cell. Biol. 27:937-948. 66. Hisaoka M, Ueshima S, Murano K, Nagata K, Okuwaki M (2010) Regulation of Nucleolar Chromatin by B23/Nucleophosmin Jointly Depends upon Its RNA Binding Activity and Transcription Factor UBF. Mol. Cell. Biol. 30:4952-4964. 67. Sanij E, Hannan RD (2009) The role of UBF in regulating the structure and dynamics of transcriptionally active rDNA chromatin. Epigenetics 4:374- 382. 68. Panov KI, Friedrich JK, Russell J, Zomerdijk JCBM (2006) UBF activates RNA polymerase I transcription by stimulating promoter escape. EMBO J. 25:3310-3322. 69. Comai L, Tanese N, Tjian R (1992) The TATA-binding protein and associated factors are integral components of the RNA polymerase I transcription factor, SL1. Cell 68:965-976. 70. 
Zomerdijk J, Beckmann H, Comai L, Tjian R (1994) Assembly of transcriptionally active RNA polymerase I initiation factor SL1 from recombinant subunits. Science 266:2015-2018. 71. Gorski JJ, Pathak S, Panov K, Kasciukovic T, Panova T, et al. (2007) A novel TBP-associated factor of SL1 functions in RNA polymerase I transcription. EMBO J. 26:1560-1568. 72. Denissov S, van Driel M, Voit R, Hekkelman M, Hulsen T, et al. (2007) Identification of novel functional TBP-binding sites and general factor repertoires. EMBO J. 26:944-954. 73. Yuan X, Zhao J, Zentgraf H, Hoffmann-Rohrer U, Grummt I (2002) Multiple interactions between RNA polymerase I, TIF-IA and TAFI subunits

regulate preinitiation complex assembly at the ribosomal gene promoter. EMBO Rep. 3:1082-1087

74. Miller G, Panov KI, Friedrich JK, Trinkle-Mulcahy L, Lamond AI, et al. (2001) hRRN3 is essential in the SL1-mediated recruitment of RNA Polymerase I to rRNA gene promoters. EMBO J. 20:1373-1382. 75. Iuchi S, Green H (1999) Basonuclin, a zinc finger protein of keratinocytes and reproductive germ cells, binds to the rRNA gene promoter. Proc. Natl. Acad. Sci. U. S. A. 96:9628-9632. 76. Tseng H, Biegel J, Brown R (1999) Basonuclin is associated with the ribosomal RNA genes on human keratinocyte mitotic chromosomes. J. Cell Sci. 112:3039-3047. 77. Tian Q, Kopf GS, Brown RS, Tseng H (2001) Function of basonuclin in increasing transcription of the ribosomal RNA genes during mouse oogenesis. Development 128:407-416. 78. Zhang S, Wang J, Tseng H (2007) Basonuclin Regulates a Subset of Ribosomal RNA Genes in HaCaT Cells. PLoS ONE 2:e902. 79. Young DW, Hassan MQ, Pratap J, Galindo M, Zaidi SK, et al. (2007) Mitotic occupancy and lineage-specific transcriptional control of rRNA genes by Runx2. Nature 445:442-446. 80. Ali SA, Zaidi SK, Dacwag CS, Salma N, Young DW, et al. (2008) Phenotypic transcription factors epigenetically mediate cell growth control. Proc. Natl. Acad. Sci. U. S. A. 105:6632-6637. 81. Bowman LH, Emerson CP (1977) Post-transcriptional regulation of ribosome accumulation during myoblast differentiation. Cell 10:587-596. 82. Krauter KS, Soeiro R, Nadal-Ginard B (1979) Transcriptional regulation of ribosomal RNA accumulation during L6E9 myoblast differentiation. J. Mol. Biol. 134:727-741. 83. Zahradka P, Larson DE, Sells BH (1991) Regulation of ribosome biogenesis in differentiated rat myotubes. Mol. Cell. Biochem. 104:189-194. 84. Poortinga G, Hannan KM, Snelling H, Walkley CR, Jenkins A, et al. (2004) MAD1 and c-MYC regulate UBF and rDNA transcription during granulocyte differentiation. EMBO J. 23:3325-3335. 85. Tseng H, Chou W, Wang J, Zhang X, Zhang S, et al. (2008) Mouse Ribosomal RNA Genes Contain Multiple Differentially Regulated Variants. 
PLoS ONE 3:e1843. 86. Doerfler W (2008) In pursuit of the first recognized epigenetic signal––DNA methylation: A 1976 to 2008 synopsis. Epigenetics 3:125-133. 87. Tost J (2009) DNA Methylation: An Introduction to the Biology and the Disease-Associated Changes of a Promising Biomarker. In: Tost J, editor. DNA Methylation: Humana Press. pp. 3-20. 88. Santoro R, Grummt I (2001) Molecular Mechanisms Mediating Methylation- Dependent Silencing of Ribosomal Gene Transcription. Mol. Cell 8:719- 725. 89. Ghoshal K, Majumder S, Datta J, Motiwala T, Bai S, et al. (2004) Role of Human Ribosomal RNA (rRNA) Promoter Methylation and of Methyl-CpG-

binding Protein MBD2 in the Suppression of rRNA Gene Expression. J. Biol. Chem. 279:6783-6793. 90. Espada J, Ballestar E, Santoro R, Fraga MF, Villar-Garea A, et al. (2007) Epigenetic disruption of ribosomal RNA genes and nucleolar architecture in DNA methyltransferase 1 (Dnmt1) deficient cells. Nucleic Acids Res. 35:2191-2198. 91. Gagnon-Kugler T, Langlois F, Stefanovsky V, Lessard F, Moss T (2009) Loss of Human Ribosomal Gene CpG Methylation Enhances Cryptic RNA Polymerase II Transcription and Disrupts Ribosomal RNA Processing. Mol. Cell 35:414-425. 92. Kobayashi T, Ganley ARD (2005) Recombination Regulation by Transcription-Induced Dissociation in rDNA Repeats. Science 309:1581-1584. 93. Sanij E, Poortinga G, Sharkey K, Hung S, Holloway TP, et al. (2008) UBF levels determine the number of active ribosomal RNA genes in mammals. J. Cell Biol. 183:1259-1274. 94. Murayama A, Ohmori K, Fujimura A, Minami H, Yasuzawa-Tanaka K, et al. (2008) Epigenetic Control of rDNA Loci in Response to Intracellular Energy Status. Cell 133:627-639. 95. Frescas D, Guardavaccaro D, Bassermann F, Koyama-Nasu R, Pagano M (2007) JHDM1B/FBXL10 is a nucleolar protein that represses transcription of ribosomal RNA genes. Nature 450:309-313. 96. Yuan X, Feng W, Imhof A, Grummt I, Zhou Y (2007) Activation of RNA Polymerase I Transcription by Cockayne Syndrome Group B Protein and Histone Methyltransferase G9a. Mol. Cell 27:585-595. 97. Feng W, Yonezawa M, Ye J, Jenuwein T, Grummt I (2010) PHF8 activates transcription of rRNA genes through H3K4me3 binding and H3K9me1/2 demethylation. Nat. Struct. Mol. Biol. 17:445-450. 98. Suganuma T, Workman JL (2010) Features of the PHF8/KIAA1718 histone demethylase. Cell Res. 20:861-862. 99. Zhu Z, Wang Y, Li X, Wang Y, Xu L, et al. (2010) PHF8 is a histone H3K9me2 demethylase regulating rRNA synthesis. Cell Res. 20:794-801. 100. Zhou Y, Schmitz K-M, Mayer C, Yuan X, Akhtar A, et al. 
(2009) Reversible acetylation of the chromatin remodelling complex NoRC is required for non-coding RNA-dependent silencing. Nat. Cell Biol. 11:1010-1016. 101. Tanaka Y, Okamoto K, Teye K, Umata T, Yamagiwa N, et al. (2010) JmjC enzyme KDM2A is a regulator of rRNA transcription in response to starvation. EMBO J. 29:1510-1522. 102. Zhou Y, Grummt I (2005) The PHD Finger/Bromodomain of NoRC Interacts with Acetylated Histone H4K16 and Is Sufficient for rDNA Silencing. Curr. Biol. 15:1434-1438. 103. Bierhoff H, Schmitz K, Maass F, Ye J, Grummt I (2011) Noncoding Transcripts in Sense and Antisense Orientation Regulate the Epigenetic State of Ribosomal RNA Genes. Cold Spring Harbor Symp. Quant. Biol.

104. Zheng Y, John S, Pesavento JJ, Schultz-Norton JR, Schiltz RL, et al. (2010) Histone H1 phosphorylation is associated with transcription by RNA polymerases I and II. J. Cell Biol. 189:407-415. 105. Levy A, Eyal M, Hershkovits G, Salmon-Divon M, Klutstein M, et al. (2008) Yeast linker histone Hho1p is required for efficient RNA polymerase I processivity and transcriptional silencing at the ribosomal DNA. Proc. Natl. Acad. Sci. U. S. A. 105:11703-11708. 106. Li C, Mueller JE, Elfline M, Bryk M (2008) Linker histone H1 represses recombination at the ribosomal DNA locus in the budding yeast Saccharomyces cerevisiae. Mol. Microbiol. 67:906-919. 107. Carr AM (1994) Analysis of a histone H2A variant from fission yeast: evidence for a role in chromosome stability. Mol. Gen. Genet. 245:628- 635. 108. Faast R (2001) Histone variant H2A.Z is required for early mammalian development. Curr. Biol. 11:1183-1187. 109. Rangasamy D, Greaves I, Tremethick DJ (2004) RNA interference demonstrates a novel role for H2A.Z in chromosome segregation. Nat. Struct. Mol. Biol. 11:650-655. 110. Updike DL, Mango SE (2006) Temporal regulation of foregut development by HTZ-1/H2A.Z and PHA-4/FoxA. PLoS Genet. 2:e161. 111. Ahmed S, Dul B, Qiu X, Walworth NC (2008) Msc1 acts through histone H2A.Z to promote chromosome stability in Schizosaccharomyces pombe. Genetics 177:1487-1497. 112. Creyghton MP, Markoulaki S, Levine SS, Hanna J, Lodato MA, et al. (2008) H2AZ Is Enriched at Polycomb Complex Target Genes in ES Cells and Is Necessary for Lineage Commitment. Cell 135:649-661. 113. Kim H-S, Vanoosthuyse V, Fillingham J, Roguev A, Watt S, et al. (2009) An acetylated form of histone H2A.Z regulates chromosome architecture in Schizosaccharomyces pombe. Nat. Struct. Mol. Biol. 16:1286-1293. 114. Guillemette B, Gaudreau L (2006) Reuniting the contrasting functions of H2A.Z. Biochem. Cell Biol. 84:528-535. 115. 
Meneghini MD, Wu M, Madhani HD (2003) Conserved Histone Variant H2A.Z Protects Euchromatin from the Ectopic Spread of Silent Heterochromatin. Cell 112:725-736. 116. Sarcinella E, Zuzarte PC, Lau PNI, Draker R, Cheung P (2007) Monoubiquitylation of H2A.Z Distinguishes Its Association with Euchromatin or Facultative Heterochromatin. Mol. Cell. Biol. 27:6457-6468. 117. Wang Z, Zang C, Rosenfeld JA, Schones DE, Barski A, et al. (2008) Combinatorial patterns of histone acetylations and methylations in the human genome. Nat. Genet. 40:897-903. 118. Jin C, Zang C, Wei G, Cui K, Peng W, et al. (2009) H3.3/H2A.Z double variant-containing nucleosomes mark 'nucleosome-free regions' of active promoters and other regulatory regions. Nat. Genet. 41:941-945. 119. Jin C, Felsenfeld G (2007) Nucleosome stability mediated by histone variants H3.3 and H2A.Z. Genes Dev. 21:1519-1529.

120. Millar CB, Xu F, Zhang K, Grunstein M (2006) Acetylation of H2AZ lysine 14 is associated with genome-wide gene activity in yeast. Genes Dev. 20:711-722. 121. Nemeth A, Guibert S, Tiwari VK, Ohlsson R, Langst G (2008) Epigenetic regulation of TTF-I-mediated promoter-terminator interactions of rRNA genes. EMBO J. 27:1255-1265. 122. Phillips JE, Corces VG (2009) CTCF: Master Weaver of the Genome. Cell 137:1194-1211. 123. Torrano V, Navascues J, Docquier F, Zhang R, Burke LJ, et al. (2006) Targeting of CTCF to the nucleolus inhibits nucleolar transcription through a poly(ADP-ribosyl)ation-dependent mechanism. J. Cell Sci. 119:1746- 1759. 124. Guerrero PA, Maggert KA (2011) The CCCTC-Binding Factor (CTCF) of Drosophila Contributes to the Regulation of the Ribosomal DNA and Nucleolar Stability. PLoS ONE 6:e16401. 125. Henikoff S, Ahmad K (2005) Assembly of variant histones into chromatin. Annu. Rev. Cell Dev. Biol. 21:133-153. 126. Li J, Langst G, Grummt I (2006) NoRC-dependent nucleosome positioning silences rRNA genes. EMBO J. 25:5735-5741. 127. Gerber J-K, Gögel E, Berger C, Wallisch M, Müller F, et al. (1997) Termination of Mammalian rDNA Replication: Polar Arrest of Replication Fork Movement by Transcription Termination Factor TTF-I. Cell 90:559- 567. 128. Henderson S, Sollner-Webb B (1986) A transcriptional terminator is a novel element of the promoter of the mouse ribosomal RNA gene. Cell 47:891- 900. 129. McStay B, Reeder RH (1990) An RNA polymerase I termination site can stimulate the adjacent ribosomal gene promoter by two distinct mechanisms in Xenopus laevis. Genes Dev. 4:1240-1251. 130. Langst G, Becker PB, Grummt I (1998) TTF-I determines the chromatin architecture of the active rDNA promoter. EMBO J. 17:3135-3145. 131. Langst G, Blank TA, Becker PB, Grummt I (1997) RNA polymerase I transcription on nucleosomal templates: the transcription termination factor TTF-I induces chromatin remodeling and relieves transcriptional repression. EMBO J. 16:760-768. 
132. Strohner R, Nemeth A, Jansa P, Hofmann-Rohrer U, Santoro R, et al. (2001) NoRC-a novel member of mammalian ISWI-containing chromatin remodeling machines. EMBO J. 20:4892-4900. 133. Santoro R, Li J, Grummt I (2002) The nucleolar remodeling complex NoRC mediates heterochromatin formation and silencing of ribosomal gene transcription. Nat. Genet. 32:393-396. 134. Németh A, Strohner R, Grummt I, Längst G (2004) The chromatin remodeling complex NoRC and TTF-I cooperate in the regulation of the mammalian rRNA genes in vivo. Nucleic Acids Res. 32:4091-4099.

135. Zhou Y, Santoro R, Grummt I (2002) The chromatin remodeling complex NoRC targets HDAC1 to the ribosomal gene promoter and represses RNA polymerase I transcription. EMBO J. 21:4632-4640. 136. Kavi H, Birchler J (2009) Drosophila KDM2 is a H3K4me3 demethylase regulating nucleolar organization. BMC Res. Notes 2:217. 137. Frescas D, Guardavaccaro D, Bassermann F, Koyama-Nasu R, Pagano M (2007) JHDM1B/FBXL10 is a nucleolar protein that represses transcription of ribosomal RNA genes. Nature 450:309 - 313. 138. Mayer C, Neubert M, Grummt I (2008) The structure of NoRC-associated RNA is crucial for targeting the chromatin remodelling complex NoRC to the nucleolus. EMBO Rep. 9:774-780. 139. Licht CL, Stevnsner T, Bohr VA (2003) Cockayne Syndrome Group B Cellular and Biochemical Functions. Am. J. Hum. Genet. 73:1217-1239. 140. Hanawalt PC, Spivak G (2008) Transcription-coupled DNA repair: two decades of progress and surprises. Nat. Rev. Mol. Cell Biol. 9:958-970. 141. Tantin D, Kansal A, Carey M (1997) Recruitment of the putative transcription-repair coupling factor CSB/ERCC6 to RNA polymerase II elongation complexes. Mol. Cell. Biol. 17:6803-6814. 142. Balajee AS, May A, Dianov GL, Friedberg EC, Bohr VA (1997) Reduced RNA polymerase II transcription in intact and permeabilized Cockayne syndrome group B cells. Proc. Natl. Acad. Sci. U. S. A. 94:4306-4311. 143. Bradsher J, Auriol J, de Santis LP, Iben S, Vonesch J-L, et al. (2002) CSB Is a Component of RNA Pol I Transcription. Mol. Cell 10:819-829. 144. Nielsen AL, Oulad-Abdelghani M, Ortiz JA, Remboutsika E, Chambon P, et al. (2001) Heterochromatin Formation in Mammalian Cells: Interaction between Histones and HP1 Proteins. Mol. Cell 7:729-739. 145. Snowden AW, Gregory PD, Case CC, Pabo CO (2002) Gene-Specific Targeting of H3K9 Methylation Is Sufficient for Initiating Repression In Vivo. Curr. Biol. 12:2159-2166. 146. Bannister AJ, Zegerman P, Partridge JF, Miska EA, Thomas JO, et al. 
(2001) Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain. Nature 410:120-124. 147. Cheutin T, McNairn AJ, Jenuwein T, Gilbert DM, Singh PB, et al. (2003) Maintenance of Stable Heterochromatin Domains by Dynamic HP1 Binding. Science 299:721-725. 148. Xiao A, Li H, Shechter D, Ahn SH, Fabrizio LA, et al. (2009) WSTF regulates the H2A.X DNA damage response via a novel tyrosine kinase activity. Nature 457:57-62. 149. Percipalle P, Farrants A-KÖ (2006) Chromatin remodelling and transcription: be-WICHed by nuclear myosin 1. Curr. Opin. Cell Biol. 18:267-274. 150. Kato S, Fujiki R, Kim M-S, Kitagawa H (2007) Ligand-induced transrepressive function of VDR requires a chromatin remodeling complex, WINAC. J. Steroid Biochem. Mol. Biol. 103:372-380.

151. Philimonenko VV, Zhao J, Iben S, Dingova H, Kysela K, et al. (2004) Nuclear actin and myosin I are required for RNA polymerase I transcription. Nat. Cell Biol. 6:1165-1172. 152. Percipalle P, Fomproix N, Cavellan E, Voit R, Reimer G, et al. (2006) The chromatin remodelling complex WSTF-SNF2h interacts with nuclear myosin 1 and has a role in RNA polymerase I transcription. EMBO Rep. 7:525-530. 153. Cavellán E, Asp P, Percipalle P, Farrants A-KÖ (2006) The WSTF-SNF2h Chromatin Remodeling Complex Interacts with Several Nuclear Proteins in Transcription. J. Biol. Chem. 281:16264-16271. 154. Ford E, Voit R, Liszt G, Magin C, Grummt I, et al. (2006) Mammalian Sir2 homolog SIRT7 is an activator of RNA polymerase I transcription. Genes Dev. 20:1075-1080. 155. Grob A, Roussel P, Wright JE, McStay B, Hernandez-Verdun D, et al. (2009) Involvement of SIRT7 in resumption of rDNA transcription at the exit from mitosis. J. Cell Sci. 122:489-498. 156. Brown SE, Szyf M (2007) Epigenetic Programming of the rRNA Promoter by MBD3. Mol. Cell. Biol. 27:4938-4952. 157. Barreto G, Schafer A, Marhold J, Stach D, Swaminathan SK, et al. (2007) Gadd45a promotes epigenetic gene activation by repair-mediated DNA demethylation. Nature 445:671-675. 158. Schmitz K-M, Schmitt N, Hoffmann-Rohrer U, Schäfer A, Grummt I, et al. (2009) TAF12 Recruits Gadd45a and the Nucleotide Excision Repair Complex to the Promoter of rRNA Genes Leading to Active DNA Demethylation. Mol. Cell 33:344-353. 159. Hiratani I, Gilbert DM (2009) Replication timing as an epigenetic mark. Epigenetics 4:93-97. 160. Li J, Santoro R, Koberna K, Grummt I (2005) The chromatin remodeling complex NoRC controls replication timing of rRNA genes. EMBO J. 24:120-127. 161. Schlesinger S, Selig S, Bergman Y, Cedar H (2009) Allelic inactivation of rDNA loci. Genes Dev. 23:2437-2447. 162. Rubbi CP, Milner J (2003) Disruption of the nucleolus mediates stabilization of p53 in response to DNA damage and other stresses. EMBO J. 
22:6068- 6077. 163. Pestov DG, Strezoska Z, Lau LF (2001) Evidence of p53-Dependent Cross- Talk between Ribosome Biogenesis and the Cell Cycle: Effects of Nucleolar Protein Bop1 on G1/S Transition. Mol. Cell. Biol. 21:4246-4255. 164. Šulić S, Panić L, Barkić M, Merćep M, Uzelac M, et al. (2005) Inactivation of S6 ribosomal protein gene in T lymphocytes activates a p53-dependent checkpoint response. Genes Dev. 19:3070-3082. 165. Gilkes DM, Chen L, Chen J (2006) MDMX regulation of p53 response to ribosomal stress. EMBO J. 25:5614-5625. 166. Oliner JD, Pietenpol JA, Thiagalingam S, Gyuris J, Kinzler KW, et al. (1993) Oncoprotein MDM2 conceals the activation domain of tumour suppressor p53. Nature 362:857-860.

167. Haupt Y, Maya R, Kazaz A, Oren M (1997) Mdm2 promotes the rapid degradation of p53. Nature 387:296-299. 168. Honda R, Tanaka H, Yasuda H (1997) Oncoprotein MDM2 is a ubiquitin ligase E3 for tumor suppressor p53. FEBS Lett. 420:25-27. 169. Kubbutat MHG, Jones SN, Vousden KH (1997) Regulation of p53 stability by Mdm2. Nature 387:299-303. 170. Sakaguchi K, Herrera JE, Saito Si, Miki T, Bustin M, et al. (1998) DNA damage activates p53 through a phosphorylation–acetylation cascade. Genes Dev. 12:2831-2841. 171. Higashimoto Y, Saito Si, Tong X-H, Hong A, Sakaguchi K, et al. (2000) Human p53 Is Phosphorylated on Serines 6 and 9 in Response to DNA Damage-inducing Agents. J. Biol. Chem. 275:23199-23203. 172. Chao C, Saito Si, Anderson CW, Appella E, Xu Y (2000) Phosphorylation of murine p53 at Ser-18 regulates the p53 responses to DNA damage. Proc. Natl. Acad. Sci. U. S. A. 97:11936-11941. 173. Appella E, Anderson CW (2001) Post-translational modifications and activation of p53 by genotoxic stresses. Eur. J. Biochem. 268:2764-2772. 174. Huang J, Berger SL (2008) The emerging field of dynamic lysine methylation of non-histone proteins. Curr. Opin. Genet. Dev. 18:152-158. 175. Sherr CJ (2006) Divorcing ARF and p53: an unsettled case. Nat. Rev. Cancer 6:663-673. 176. Zhang Y, Lu H (2009) Signaling to p53: Ribosomal Proteins Find Their Way. Cancer Cell 16:369-377. 177. Marechal V, Elenbaas B, Piette J, Nicolas JC, Levine AJ (1994) The ribosomal L5 protein is associated with mdm-2 and mdm-2-p53 complexes. Mol. Cell. Biol. 14:7414-7420. 178. Lohrum MAE, Ludwig RL, Kubbutat MHG, Hanlon M, Vousden KH (2003) Regulation of HDM2 activity by the ribosomal protein L11. Cancer Cell 3:577-587. 179. Zhang Y, Wolf GW, Bhat K, Jin A, Allio T, et al. (2003) Ribosomal Protein L11 Negatively Regulates Oncoprotein MDM2 and Mediates a p53-Dependent Ribosomal-Stress Checkpoint Pathway. Mol. Cell. Biol. 23:8902-8912. 180. 
Dai M-S, Lu H (2004) Inhibition of MDM2-mediated p53 Ubiquitination and Degradation by Ribosomal Protein L5. J. Biol. Chem. 279:44475-44482. 181. Dai M-S, Zeng SX, Jin Y, Sun X-X, David L, et al. (2004) Ribosomal Protein L23 Activates p53 by Inhibiting MDM2 Function in Response to Ribosomal Perturbation but Not to Translation Inhibition. Mol. Cell. Biol. 24:7654- 7668. 182. Chakraborty A, Uechi T, Higa S, Torihara H, Kenmochi N (2009) Loss of Ribosomal Protein L11 Affects Zebrafish Embryonic Development through a p53-Dependent Apoptotic Response. PLoS ONE 4:e4152. 183. Azuma M, Toyama R, Laver E, Dawid IB (2006) Perturbation of rRNA Synthesis in the bap28 Mutation Leads to Apoptosis Mediated by p53 in the Zebrafish Central Nervous System. J. Biol. Chem. 281:13309-13316.

184. Uechi T, Nakajima Y, Nakao A, Torihara H, Chakraborty A, et al. (2006) Ribosomal Protein Gene Knockdown Causes Developmental Defects in Zebrafish. PLoS ONE 1:e37. 185. Trainor PA, Dixon J, Dixon MJ (2008) Treacher Collins syndrome: etiology, pathogenesis and prevention. Eur. J. Hum. Genet. 17:275-283. 186. Choesmel V, Bacqueville D, Rouquette J, Noaillac-Depeyre J, Fribourg S, et al. (2007) Impaired ribosome biogenesis in Diamond-Blackfan anemia. Blood 109:1275-1283. 187. Heiss NS, Knight SW, Vulliamy TJ, Klauck SM, Wiemann S, et al. (1998) X- linked dyskeratosis congenita is caused by mutations in a highly conserved gene with putative nucleolar functions. Nat. Genet. 19:32-38. 188. Ebert BL, Pretz J, Bosco J, Chang CY, Tamayo P, et al. (2008) Identification of RPS14 as a 5q- syndrome gene by RNA interference screen. Nature 451:335-339. 189. Ridanpää M, van Eenennaam H, Pelin K, Chadwick R, Johnson C, et al. (2001) Mutations in the RNA Component of RNase MRP Cause a Pleiotropic Human Disease, Cartilage-Hair Hypoplasia. Cell 104:195-203. 190. Ganapathi KA, Austin KM, Lee C-S, Dias A, Malsch MM, et al. (2007) The human Shwachman-Diamond syndrome protein, SBDS, associates with ribosomal RNA. Blood 110:1458-1465. 191. Grandori C, Gomez-Roman N, Felton-Edkins ZA, Ngouenet C, Galloway DA, et al. (2005) c-Myc binds to human ribosomal DNA and stimulates transcription of rRNA genes by RNA polymerase I. Nat. Cell Biol. 7:311- 318. 192. Arabi A, Wu S, Ridderstrale K, Bierhoff H, Shiue C, et al. (2005) c-Myc associates with ribosomal DNA and activates RNA polymerase I transcription. Nat. Cell Biol. 7:303-310. 193. Grewal SS, Li L, Orian A, Eisenman RN, Edgar BA (2005) Myc-dependent regulation of ribosomal RNA synthesis during Drosophila development. Nat. Cell Biol. 7:295-302. 194. Dauwerse JG, Dixon J, Seland S, Ruivenkamp CAL, van Haeringen A, et al. (2011) Mutations in genes encoding subunits of RNA polymerases I and III cause Treacher Collins syndrome. Nat. Genet. 
43:20-22. 195. Valdez BC, Henning D, So RB, Dixon J, Dixon MJ (2004) The Treacher Collins syndrome (TCOF1) gene product is involved in ribosomal DNA gene transcription by interacting with upstream binding factor. Proc. Natl. Acad. Sci. U. S. A. 101:10709-10714. 196. Hayano T, Yanagida M, Yamauchi Y, Shinkawa T, Isobe T, et al. (2003) Proteomic Analysis of Human Nop56p-associated Pre-ribosomal Ribonucleoprotein Complexes. J. Biol. Chem. 278:34309-34319. 197. Gonzales B, Henning D, So RB, Dixon J, Dixon MJ, et al. (2005) The Treacher Collins syndrome (TCOF1) gene product is involved in pre-rRNA methylation. Hum. Mol. Genet. 14:2035-2043. 198. Dixon J, Jones NC, Sandell LL, Jayasinghe SM, Crane J, et al. (2006) Tcof1/Treacle is required for neural crest cell formation and proliferation

deficiencies that cause craniofacial abnormalities. Proc. Natl. Acad. Sci. U. S. A. 103:13403-13408. 199. Jones NC, Lynn ML, Gaudenz K, Sakai D, Aoto K, et al. (2008) Prevention of the neurocristopathy Treacher Collins syndrome through inhibition of p53 function. Nat. Med. 14:125-133. 200. Vlachos A, Ball S, Dahl N, Alter BP, Sheth S, et al. (2008) Diagnosing and treating Diamond Blackfan anaemia: results of an international clinical consensus conference. Br. J. Haematol. 142:859-876. 201. Danilova N, Sakamoto KM, Lin S (2008) Ribosomal protein S19 deficiency in zebrafish leads to developmental abnormalities and defective erythropoiesis through activation of p53 protein family. Blood 112:5228-5237. 202. Schultz J (1929) The Minute reaction in the development of Drosophila melanogaster. Genetics 14:366-419. 203. Marygold S, Roote J, Reuter G, Lambertsson A, Ashburner M, et al. (2007) The ribosomal protein genes and Minute loci of Drosophila melanogaster. Genome Biol. 8:R216. 204. Kongsuwan K, Yu Q, Vincent A, Frisardi MC, Rosbash M, et al. (1985) A Drosophila Minute gene encodes a ribosomal protein. Nature 317:555-558. 205. Cui Z, DiMario PJ (2007) RNAi Knockdown of Nopp140 Induces Minute-like Phenotypes in Drosophila. Mol. Biol. Cell 18:2179-2191. 206. Southard JL, Eicher EM (1977) Belly spot and tail (Bst). Mouse News Lett. 56:40. 207. Smith RS, John SWM, Zabeleta A, Davisson MT, Hawes NL, et al. (2000) The Bst locus on mouse chromosome 16 is associated with age-related subretinal neovascularization. Proc. Natl. Acad. Sci. U. S. A. 97:2191-2195. 208. Oliver ER, Saunders TL, Tarlé SA, Glaser T (2004) Ribosomal protein L24 defect in Belly spot and tail (Bst), a mouse Minute. Development 131:3907-3920. 209. Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, et al. (2005) Mechanisms of Haploinsufficiency Revealed by Genome-Wide Profiling in Yeast. Genetics 169:1915-1925. 210. 
Hall JA, Georgel PT (2007) CHD proteins: a diverse family with strong ties. Biochem. Cell Biol. 85:463-476. 211. Marfella CGA, Imbalzano AN (2007) The Chd family of chromatin remodelers. Mutat. Res. 618:30-40. 212. Gaspar-Maia A, Alajem A, Polesso F, Sridharan R, Mason MJ, et al. (2009) Chd1 regulates open chromatin and pluripotency of embryonic stem cells. Nature 460:863-868. 213. Marfella CGA, Henninger N, LeBlanc SE, Krishnan N, Garlick DS, et al. (2009) A Mutation in the Mouse Chd2 Chromatin Remodeling Enzyme Results in a Complex Renal Phenotype. Kidney Blood Press. Res. 31:421- 432.

214. Bajpai R, Chen DA, Rada-Iglesias A, Zhang J, Xiong Y, et al. (2010) CHD7 cooperates with PBAF to control multipotent neural crest formation. Nature 463:958-962. 215. Shur I, Solomon R, Benayahu D (2006) Dynamic Interactions of Chromatin- Related Mesenchymal Modulator, a Chromodomain Helicase-DNA- Binding Protein, with Promoters in Osteoprogenitors. Stem Cells 24:1288- 1293. 216. Rodríguez-Paredes M, Ceballos-Chávez M, Esteller M, García-Domínguez M, Reyes JC (2009) The chromatin remodeling factor CHD8 interacts with elongating RNA polymerase II and controls expression of the cyclin E2 gene. Nucleic Acids Res. 37:2449-2460. 217. Polo SE, Kaidi A, Baskcomb L, Galanty Y, Jackson SP (2010) Regulation of DNA-damage responses and cell-cycle progression by the chromatin remodelling factor CHD4. EMBO J. 29:3130-3139. 218. Urquhart A, Gatei M, Richard D, Khanna KK (2011) ATM mediated phosphorylation of CHD4 contributes to genome maintenance. Genome Integr. 2:1. 219. Larsen DH, Poinsignon C, Gudjonsson T, Dinant C, Payne MR, et al. (2010) The chromatin-remodeling factor CHD4 coordinates signaling and repair after DNA damage. J. Cell Biol. 190:731-740. 220. Nagarajan P, Onami TM, Rajagopalan S, Kania S, Donnell R, et al. (2009) Role of chromodomain helicase DNA-binding protein 2 in DNA damage response signaling and tumorigenesis. Oncogene 28:1053-1062. 221. Chou DM, Adamson B, Dephoure NE, Tan X, Nottke AC, et al. (2010) A chromatin localization screen reveals poly (ADP ribose)-regulated recruitment of the repressive polycomb and NuRD complexes to sites of DNA damage. Proc. Natl. Acad. Sci. U. S. A. 107:18475-18480. 222. Nishiyama M, Oshikawa K, Tsukada Y-i, Nakagawa T, Iemura S-i, et al. (2009) CHD8 suppresses p53-mediated apoptosis through histone H1 recruitment during early embryogenesis. Nat. Cell Biol. 11:172-182. 223. Gao X, Gordon D, Zhang D, Browne R, Helms C, et al. (2007) CHD7 Gene Polymorphisms Are Associated with Susceptibility to Idiopathic Scoliosis. 
The American Journal of Human Genetics 80:957-965. 224. Kulkarni S, Nagarajan P, Wall J, Donovan DJ, Donell RL, et al. (2008) Disruption of chromodomain helicase DNA binding protein 2 (CHD2) causes scoliosis. Am. J. Med. Genet. A 146A:1117-1127. 225. Bagchi A, Papazoglu C, Wu Y, Capurso D, Brodt M, et al. (2007) CHD5 Is a Tumor Suppressor at Human 1p36. Cell 128:459-475. 226. Zentner GE, Layman WS, Martin DM, Scacheri PC (2010) Molecular and phenotypic aspects of CHD7 mutation in CHARGE syndrome. Am. J. Med. Genet. A 152A:674-686. 227. Brehm A, Tufteland KR, Aasland R, Becker PB (2004) The many colours of chromodomains. BioEssays 26:133-140. 228. Flanagan JF, Mi L-Z, Chruszcz M, Cymborowski M, Clines KL, et al. (2005) Double chromodomains cooperate to recognize the methylated histone H3 tail. Nature 438:1181-1185.

229. Schnetz MP, Bartels CF, Shastri K, Balasubramanian D, Zentner GE, et al. (2009) Genomic distribution of CHD7 on chromatin tracks H3K4 methylation patterns. Genome Res. 19:590-601. 230. Srinivasan S, Dorighi KM, Tamkun JW (2008) Drosophila Kismet Regulates Histone H3 Lysine 27 Methylation and Early Elongation by RNA Polymerase II. PLoS Genet. 4:e1000217. 231. Lutz T, Stöger R, Nieto A (2006) CHD6 is a DNA-dependent ATPase and localizes at nuclear sites of mRNA synthesis. FEBS Lett. 580:5851-5857. 232. Shur I, Benayahu D (2005) Characterization and Functional Analysis of CReMM, a Novel Chromodomain Helicase DNA-binding Protein. J. Mol. Biol. 352:646-655. 233. Thompson BA, Tremblay V, Lin G, Bochar DA (2008) CHD8 Is an ATP- Dependent Chromatin Remodeling Factor That Regulates β-Catenin Target Genes. Mol. Cell. Biol. 28:3894-3904. 234. Denslow SA, Wade PA (2007) The human Mi-2β/NuRD complex and gene regulation. Oncogene 26:5433-5438. 235. Vissers LELM, van Ravenswaaij CMA, Admiraal R, Hurst JA, de Vries BBA, et al. (2004) Mutations in a new member of the chromodomain gene family cause CHARGE syndrome. Nat. Genet. 36:955-957. 236. Alazami AM, Alzahrani F, Alkuraya FS (2008) Expanding the “E” in CHARGE. Am. J. Med. Genet. A 146A:1890-1892. 237. Van de Laar I, Dooijes D, Hoefsloot L, Simon M, Hoogeboom J, et al. (2007) Limb anomalies in patients with CHARGE syndrome: An expansion of the phenotype. Am. J. Med. Genet. A 143A:2712-2715. 238. Layman WS, McEwen DP, Beyer LA, Lalani SR, Fernbach SD, et al. (2009) Defects in neural stem cell proliferation and olfaction in Chd7 deficient mice indicate a mechanism for hyposmia in human CHARGE syndrome. Hum. Mol. Genet. 18:1909-1923. 239. Bosman EA, Penn AC, Ambrose JC, Kettleborough R, Stemple DL, et al. (2005) Multiple mutations in mouse Chd7 provide models for CHARGE syndrome. Hum. Mol. Genet. 14:3463-3476. 240. 
Bergman JEH, Bosman EA, van Ravenswaaij-Arts CMA, Steel KP (2009) Study of smell and reproductive organs in a mouse model for CHARGE syndrome. Eur. J. Hum. Genet. 18:171-177. 241. Hurd EA, Capers PL, Blauwkamp MN, Adams ME, Raphael Y, et al. (2007) Loss of Chd7 function in gene-trapped reporter mice is embryonic lethal and associated with severe defects in multiple developing tissues. Mamm. Genome 18:94-104. 242. Adams ME, Hurd EA, Beyer LA, Swiderski DL, Raphael Y, et al. (2007) Defects in vestibular sensory epithelia and innervation in mice with loss of Chd7 function: Implications for human CHARGE syndrome. J. Comp. Neurol. 504:519-532. 243. de la Cruz X, Lois S, Sánchez-Molina S, Martínez-Balbás MA (2005) Do protein motifs read the histone code? BioEssays 27:164-175. 244. Allen MD, Religa TL, Freund SMV, Bycroft M (2007) Solution Structure of the BRK Domains from CHD7. J. Mol. Biol. 371:1135-1140.

245. Schnetz MP, Handoko L, Akhtar-Zaidi B, Bartels CF, Pereira CF, et al. (2010) CHD7 Targets Active Gene Enhancer Elements to Modulate ES Cell-Specific Gene Expression. PLoS Genet. 6:e1001023.
246. Takada I, Mihara M, Suzawa M, Ohtake F, Kobayashi S, et al. (2007) A histone lysine methyltransferase activated by non-canonical Wnt signalling suppresses PPAR-γ transactivation. Nat. Cell Biol. 9:1273-1285.
247. Hurd EA, Poucher HK, Cheng K, Raphael Y, Martin DM (2010) The ATP-dependent chromatin remodeling enzyme CHD7 regulates pro-neural gene expression and neurogenesis in the inner ear. Development 137:3139-3150.
248. Randall V, McCue K, Roberts C, Kyriakopoulou V, Beddow S, et al. (2009) Great vessel development requires biallelic expression of Chd7 and Tbx1 in pharyngeal ectoderm in mice. J. Clin. Invest. 119:3301-3310.
249. Opferman JT, Zambetti GP (2006) Translational research? Ribosome integrity and a new p53 tumor suppressor checkpoint. Cell Death Differ. 13:898-901.
250. Grummt I (2003) Life on a planet of its own: regulation of RNA polymerase I transcription in the nucleolus. Genes Dev. 17:1691-1702.
251. Russell J, Zomerdijk JCBM (2005) RNA-polymerase-I-directed rDNA transcription, life and works. Trends Biochem. Sci. 30:87-96.
252. Park PJ (2009) ChIP-seq: advantages and challenges of a maturing technology. Nat. Rev. Genet. 10:669-680.
253. International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409:860-921.
254. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, et al. (2001) The Sequence of the Human Genome. Science 291:1304-1351.
255. Németh A, Längst G (2008) Chromatin organization of active ribosomal RNA genes. Epigenetics 3:243-245.
256. Sims III RJ, Reinberg D (2009) Processing the H3K36me3 signature. Nat. Genet. 41:270-271.
257. Heintzman ND, Hon GC, Hawkins RD, Kheradpour P, Stark A, et al. (2009) Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459:108-112.
258. Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, et al. (2007) Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 39:311-318.
259. Crawford GE, Holt IE, Whittle J, Webb BD, Tai D, et al. (2006) Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res. 16:123-131.
260. Schones DE, Cui K, Cuddapah S, Roh T-Y, Barski A, et al. (2008) Dynamic Regulation of Nucleosome Positioning in the Human Genome. Cell 132:887-898.
261. O'Sullivan AC, Sullivan GJ, McStay B (2002) UBF Binding In Vivo Is Not Restricted to Regulatory Sequences within the Vertebrate Ribosomal DNA Repeat. Mol. Cell. Biol. 22:657-668.


262. Pistoni M, Verrecchia A, Doni M, Guccione E, Amati B (2010) Chromatin association and regulation of rDNA transcription by the Ras-family protein RasL11a. EMBO J. 29:1215-1224.
263. Nix D, Courdy S, Boucher K (2008) Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks. BMC Bioinformatics 9:523.
264. Grueneberg DA, Pablo L, Hu K-Q, August P, Weng Z, et al. (2003) A Functional Screen in Human Cells Identifies UBF2 as an RNA Polymerase II Transcription Factor That Enhances the β-Catenin Signaling Pathway. Mol. Cell. Biol. 23:3936-3950.
265. Chen J, Wu A, Sun H, Drakas R, Garofalo C, et al. (2005) Functional Significance of Type 1 Insulin-like Growth Factor-mediated Nuclear Translocation of the Insulin Receptor Substrate-1 and β-Catenin. J. Biol. Chem. 280:29912-29920.
266. Zentner GE, Saiakhova A, Manaenkov P, Adams MD, Scacheri PC (2011) Integrative genomic analysis of human ribosomal DNA. Nucleic Acids Res. 39:4949-4960.
267. Grueneberg DA, Pablo L, Hu K-Q, August P, Weng Z, et al. (2003) A Functional Screen in Human Cells Identifies UBF2 as an RNA Polymerase II Transcription Factor That Enhances the β-Catenin Signaling Pathway. Mol. Cell. Biol. 23:3936-3950.
268. Zentner GE, Saiakhova A, Manaenkov P, Adams MD, Scacheri PC (2011) Integrative genomic analysis of human ribosomal DNA. Nucleic Acids Res. 39:4949-4960.
269. Riethoven JJ (2010) Regulatory regions in DNA: promoters, enhancers, silencers, and insulators. Methods Mol. Biol. 674:33-42.
270. Xi H, Shulha HP, Lin JM, Vales TR, Fu Y, et al. (2007) Identification and Characterization of Cell Type–Specific and Ubiquitous Chromatin Regulatory Structures in the Human Genome. PLoS Genet. 3:e136.
271. Bao L, Zhou M, Cui Y (2008) CTCFBSDB: a CTCF-binding site database for characterization of vertebrate genomic insulators. Nucleic Acids Res. 36:D83-D87.
272. Bao L, Zhou M, Cui Y (2008) CTCFBSDB: a CTCF-binding site database for characterization of vertebrate genomic insulators. Nucleic Acids Res. 36:D83-D87.
273. Zhang X-Y, Loflin PT, Gehrke CW, Andrews PA, Ehrlich M (1987) Hypermethylation of human DNA sequences in embryonal carcinoma cells and somatic tissues but not in sperm. Nucleic Acids Res. 15:9429-9449.
274. Kim T-K, Hemberg M, Gray JM, Costa AM, Bear DM, et al. (2010) Widespread transcription at neuronal activity-regulated enhancers. Nature 465:182-187.
275. Schmidt D, Wilson MD, Spyrou C, Brown GD, Hadfield J, et al. (2009) ChIP-seq: Using high-throughput sequencing to discover protein-DNA interactions. Methods 48:240-248.
276. Chen X, Xu H, Yuan P, Fang F, Huss M, et al. (2008) Integration of External Signaling Pathways with the Core Transcriptional Network in Embryonic Stem Cells. Cell 133:1106-1117.


277. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10:R25.
278. Boyle AP, Guinney J, Crawford GE, Furey TS (2008) F-Seq: a feature density estimator for high-throughput sequence tags. Bioinformatics 24:2537-2538.
279. Blahnik KR, Dou L, O’Geen H, McPhillips T, Xu X, et al. (2010) Sole-Search: an integrated analysis program for peak detection and functional annotation using ChIP-seq data. Nucleic Acids Res. 38:e13.
280. Saldanha AJ (2004) Java Treeview—extensible visualization of microarray data. Bioinformatics 20:3246-3248.
281. Mi H, Dong Q, Muruganujan A, Gaudet P, Lewis S, et al. (2010) PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Nucleic Acids Res. 38:D204-D210.
282. Shin H, Liu T, Manrai AK, Liu XS (2009) CEAS: cis-regulatory element annotation system. Bioinformatics 25:2605-2606.
283. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, et al. (2003) Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 31:e15.
284. Ahmad Y, Boisvert F-M, Gregor P, Cobley A, Lamond AI (2009) NOPdb: Nucleolar Proteome Database—2008 update. Nucleic Acids Res. 37:D181-D184.
285. Shimono K, Shimono Y, Shimokata K, Ishiguro N, Takahashi M (2005) Microspherule Protein 1, Mi-2β, and RET Finger Protein Associate in the Nucleolus and Up-regulate Ribosomal Gene Transcription. J. Biol. Chem. 280:39436-39447.
286. Zhang X, Guo C, Chen Y, Shulha HP, Schnetz MP, et al. (2008) Epitope tagging of endogenous proteins for genome-wide ChIP-chip studies. Nat. Meth. 5:163-165.
287. Bakshi R, Zaidi SK, Pande S, Hassan MQ, Young DW, et al. (2008) The leukemogenic t(8;21) fusion protein AML1-ETO controls rRNA genes and associates with nucleolar-organizing regions at mitotic chromosomes. J. Cell Sci. 121:3981-3990.
288. Chédin S, Laferté A, Hoang T, LaFontaine D, Riva M (2007) Is Ribosome Synthesis Controlled by Pol I Transcription? Cell Cycle 6:11-15.
289. Huo JX, Metz SA, Li GD (2003) p53-independent induction of p21waf1/cip1 contributes to the activation of caspases in GTP-depletion-induced apoptosis of insulin-secreting cells. Cell Death Differ. 11:99-109.
290. Aliouat-Denis C-M, Dendouga N, Van den Wyngaert I, Goehlmann H, Steller U, et al. (2005) p53-Independent Regulation of p21Waf1/Cip1 Expression and Senescence by Chk2. Mol. Cancer Res. 3:627-634.
291. Abbas T, Dutta A (2009) p21 in cancer: intricate networks and multiple activities. Nat. Rev. Cancer 9:400-414.
292. Andersen JS, Lyon CE, Fox AH, Leung AKL, Lam YW, et al. (2002) Directed Proteomic Analysis of the Human Nucleolus. Curr. Biol. 12:1-11.


293. Muramatsu M, Smetana K, Busch H (1963) Quantitative aspects of isolation of nucleoli of the Walker carcinosarcoma and liver of the rat. Cancer Res. 25:693-697.
294. Kacser H, Burns JA (1981) The molecular basis of dominance. Genetics 97:639-666.
295. Seidman JG, Seidman C (2002) Transcription factor haploinsufficiency: when half a loaf is not enough. J. Clin. Invest. 109:451-455.
296. Zentner GE, Hurd EA, Schnetz MP, Handoko L, Wang C, et al. (2010) CHD7 functions in the nucleolus as a positive regulator of ribosomal RNA biogenesis. Hum. Mol. Genet. 19:3491-3501.
297. Veitia RA, Birchler JA (2010) Dominance and gene dosage balance in health and disease: why levels matter! J. Pathol. 220:174-185.
298. Veitia RA (2002) Exploring the etiology of haploinsufficiency. BioEssays 24:175-184.
299. He J, Kallin EM, Tsukada Y-i, Zhang Y (2008) The H3K36 demethylase Jhdm1b/Kdm2b regulates cell proliferation and senescence through p15Ink4b. Nat. Struct. Mol. Biol. 15:1169-1175.
300. Ali SA, Zaidi SK, Dobson JR, Shakoori AR, Lian JB, et al. (2010) Transcriptional corepressor TLE1 functions with Runx2 in epigenetic repression of ribosomal RNA genes. Proc. Natl. Acad. Sci. U. S. A. 107:4165-4169.
301. Guetg C, Lienemann P, Sirri V, Grummt I, Hernandez-Verdun D, et al. (2010) The NoRC complex mediates the heterochromatin formation and stability of silent rRNA genes and centromeric repeats. EMBO J. 29:2135-2146.
302. Pelletier G, Stefanovsky VY, Faubladier M, Hirschler-Laszkiewicz I, Savard J, et al. (2000) Competitive Recruitment of CBP and Rb-HDAC Regulates UBF Acetylation and Ribosomal Transcription. Mol. Cell 6:1059-1066.
303. Zhang X, Tseng H (2007) Basonuclin-Null Mutation Impairs Homeostasis and Wound Repair in Mouse Corneal Epithelium. PLoS ONE 2:e1087.
304. Muller C, Bremer A, Schreiber S, Eichwald S, Calkhoven CF (2010) Nucleolar retention of a translational C/EBPα isoform stimulates rDNA transcription and cell size. EMBO J. 29:897-909.
305. Cohen Jr MM (2009) Perspectives on RUNX genes: An update. Am. J. Med. Genet. A 149A:2629-2646.
306. Dang VT, Kassahn KS, Marcos AE, Ragan MA (2008) Identification of human haploinsufficient genes and their genomic proximity to segmental duplications. Eur. J. Hum. Genet. 16:1350-1357.
307. Ng SB, Bigham AW, Buckingham KJ, Hannibal MC, McMillin MJ, et al. (2010) Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42:790-793.
308. Williams SR, Aldred MA, Der Kaloustian VM, Halal F, Gowans G, et al. (2010) Haploinsufficiency of HDAC4 Causes Brachydactyly Mental Retardation Syndrome, with Brachydactyly Type E, Developmental Delays, and Behavioral Problems. Am. J. Hum. Genet. 87:219-228.


309. Lin J, Jin R, Zhang B, Chen H, Bai YX, et al. (2008) Nucleolar localization of TERT is unrelated to telomerase function in human cells. J. Cell Sci. 121:2169-2176.
310. Barbe L, Lundberg E, Oksvold P, Stenius A, Lewin E, et al. (2008) Toward a Confocal Subcellular Atlas of the Human Proteome. Mol. Cell. Proteomics 7:499-508.
311. Bauer DC, Willadsen K, Buske FA, Lê Cao K-A, Bailey TL, et al. (2011) Sorting the nuclear proteome. Bioinformatics 27:i7-i14.
312. Kong YM, MacDonald RJ, Wen X, Yang P, Barbera VM, et al. (2006) A comprehensive survey of DNA-binding transcription factor gene expression in human fetal and adult organs. Gene Expr. Patterns 6:678-686.
313. Shipra A, Chetan K, Rao MRS (2006) CREMOFAC—a database of chromatin remodeling factors. Bioinformatics 22:2940-2944.
314. Ho L, Crabtree GR (2010) Chromatin remodelling during development. Nature 463:474-484.
315. Gajecka M, Mackay KL, Shaffer LG (2007) Monosomy 1p36 deletion syndrome. Am. J. Med. Genet. Part C 145C:346-356.
316. Bergemann AD, Cole F, Hirschhorn K (2005) The etiology of Wolf-Hirschhorn syndrome. Trends Genet. 21:188-195.
317. Rada-Iglesias A, Bajpai R, Swigut T, Brugmann SA, Flynn RA, et al. (2010) A unique chromatin signature uncovers early developmental enhancers in humans. Nature advance online publication.
318. Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, et al. (2010) Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl. Acad. Sci. U. S. A. 107:21931-21936.
319. de Hoon MJL, Imoto S, Nolan J, Miyano S (2004) Open source clustering software. Bioinformatics 20:1453-1454.
320. McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, et al. (2010) GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotech. 28:495-501.
321. Johnson D, Morrison N, Grant L, Turner T, Fantes J, et al. (2006) Confirmation of CHD7 as a cause of CHARGE association identified by mapping a balanced chromosome translocation in affected monozygotic twins. J. Med. Genet. 43:280-284.
322. Jongmans MCJ, Admiraal RJ, van der Donk KP, Vissers LELM, Baas AF, et al. (2006) CHARGE syndrome: the phenotypic spectrum of mutations in the CHD7 gene. J. Med. Genet. 43:306-314.
323. Law MJ, Lower KM, Voon HPJ, Hughes JR, Garrick D, et al. (2010) ATR-X Syndrome Protein Targets Tandem Repeats and Influences Allele-Specific Expression in a Size-Dependent Manner. Cell 143:367-378.
324. Gibbons RJ, McDowell TL, Raman S, O'Rourke DM, Garrick D, et al. (2000) Mutations in ATRX, encoding a SWI/SNF-like protein, cause diverse changes in the pattern of DNA methylation. Nat. Genet. 24:368-371.


325. Javierre BM, Fernandez AF, Richter J, Al-Shahrour F, Martin-Subero JI, et al. (2010) Changes in the pattern of DNA methylation associate with twin discordance in systemic lupus erythematosus. Genome Res. 20:170-179.
326. de Carvalho CV, Payão SLM, Smith MdAC (2000) DNA methylation, ageing and ribosomal genes activity. Biogerontology 1:357-361.
327. McGowan PO, Sasaki A, Huang TCT, Unterberger A, Suderman M, et al. (2008) Promoter-Wide Hypermethylation of the Ribosomal RNA Gene Promoter in the Suicide Brain. PLoS ONE 3:e2085.
328. Emmott E, Hiscox JA (2009) Nucleolar targeting: the hub of the matter. EMBO Rep. 10:231-238.
329. Lehman AM, Friedman JM, Chai D, Zahir FR, Marra MA, et al. (2009) A characteristic syndrome associated with microduplication of 8q12, inclusive of CHD7. Eur. J. Med. Genet. 52:436-439.
330. Pleasance ED, Stephens PJ, O’Meara S, McBride DJ, Meynert A, et al. (2010) A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463:184-190.
331. Takahashi K, Yamanaka S (2006) Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors. Cell 126:663-676.
332. Meissner A, Wernig M, Jaenisch R (2007) Direct reprogramming of genetically unmodified fibroblasts into pluripotent stem cells. Nat. Biotech. 25:1177-1181.
333. Stadtfeld M, Hochedlinger K (2010) Induced pluripotency: history, mechanisms, and applications. Genes Dev. 24:2239-2263.
334. Zhang N, An MC, Montoro D, Ellerby LM (2010) Characterization of Human Huntington's Disease Cell Model from Induced Pluripotent Stem Cells. PLoS Curr. 2.
335. Marchetto MCN, Carromeu C, Acab A, Yu D, Yeo GW, et al. (2010) A Model for Neural Development and Treatment of Rett Syndrome Using Human Induced Pluripotent Stem Cells. Cell 143:527-539.
336. Jin Z-B, Okamoto S, Osakada F, Homma K, Assawachananont J, et al. (2011) Modeling Retinal Degeneration Using Patient-Specific Induced Pluripotent Stem Cells. PLoS ONE 6:e17084.
337. Inoue H (2010) Neurodegenerative disease-specific induced pluripotent stem cell research. Exp. Cell Res. 316:2560-2564.
338. Kim PG, Daley GQ (2009) Application of induced pluripotent stem cells to hematologic disease. Cytotherapy 11:980-989.
339. Lalani SR, Safiullah AM, Fernbach SD, Harutyunyan KG, Thaller C, et al. (2006) Spectrum of CHD7 Mutations in 110 Individuals with CHARGE Syndrome and Genotype-Phenotype Correlation. Am. J. Hum. Genet. 78:303-314.
340. Ma N, Matsunaga S, Takata H, Ono-Maniwa R, Uchiyama S, et al. (2007) Nucleolin functions in nucleolus formation and chromosome congression. J. Cell Sci. 120:2091-2105.


341. Ugrinova I, Monier K, Ivaldi C, Thiry M, Storck S, et al. (2007) Inactivation of nucleolin leads to nucleolar disruption, cell cycle arrest and defects in centrosome duplication. BMC Mol. Biol. 8:66.
342. Hanahan D, Weinberg RA (2000) The Hallmarks of Cancer. Cell 100:57-70.
343. Ruggero D, Pandolfi PP (2003) Does the ribosome translate cancer? Nat. Rev. Cancer 3:179-192.
344. Lessard F, Morin F, Ivanchuk S, Langlois F, Stefanovsky V, et al. (2010) The ARF Tumor Suppressor Controls Ribosome Biogenesis by Regulating the RNA Polymerase I Transcription Factor TTF-I. Mol. Cell 38:539-550.
345. Zhang C, Comai L, Johnson DL (2005) PTEN Represses RNA Polymerase I Transcription by Disrupting the SL1 Complex. Mol. Cell. Biol. 25:6899-6911.
346. Zhai W, Comai L (2000) Repression of RNA Polymerase I Transcription by the Tumor Suppressor p53. Mol. Cell. Biol. 20:5930-5938.
347. Hannan KM, Hannan RD, Smith SD, Jefferson LS, Mingyue L, et al. (2000) Rb and p130 regulate RNA polymerase I transcription: Rb disrupts the interaction between UBF and SL-1. Oncogene 19:4988-4999.
348. Sugimoto M, Kuo M-L, Roussel MF, Sherr CJ (2003) Nucleolar Arf Tumor Suppressor Inhibits Ribosomal RNA Processing. Mol. Cell 11:415-424.
349. Voit R, Schnapp A, Kuhn A, Rosenbauer H, Hirschmann P, et al. (1992) The nucleolar transcription factor mUBF is phosphorylated by casein kinase II in the C-terminal hyperacidic tail which is essential for transactivation. EMBO J. 11:2211-2218.
350. Pogue-Geile K, Geiser JR, Shu M, Miller C, Wool IG, et al. (1991) Ribosomal protein genes are overexpressed in colorectal cancer: isolation of a cDNA clone encoding the human S3 ribosomal protein. Mol. Cell. Biol. 11:3842-3849.
351. Barnard GF, Staniunas RJ, Mori M, Puder M, Jessup MJ, et al. (1993) Gastric and Hepatocellular Carcinomas Do Not Overexpress the Same Ribosomal Protein Messenger RNAs as Colonic Carcinoma. Cancer Res. 53:4048-4052.
352. Henry JL, Coggin DL, King CR (1993) High-Level Expression of the Ribosomal Protein L19 in Human Breast Tumors That Overexpress erbB-2. Cancer Res. 53:1403-1408.
353. Loging WT, Reisman D (1999) Elevated Expression of Ribosomal Protein Genes L37, RPP-1, and S2 in the Presence of Mutant p53. Cancer Epidemiol. Biomark. Prev. 8:1011-1016.
354. Wang H, Zhao L-N, Li K-Z, Ling R, Li X-J, et al. (2006) Overexpression of ribosomal protein L15 is associated with cell proliferation in gastric cancer. BMC Cancer 6:91.
355. Artero-Castro A, Castellvi J, García A, Hernández J, Cajal SRy, et al. (2011) Expression of the ribosomal proteins Rplp0, Rplp1, and Rplp2 in gynecologic tumors. Hum. Pathol. 42:194-203.
356. Bibikova M, Fan J-B (2010) Genome-wide DNA methylation profiling. Wiley Interdiscip. Rev. Syst. Biol. Med. 2:210-223.


357. Furlan-Magaril M, Rincón-Arano H, Recillas-Targa F (2009) Sequential chromatin immunoprecipitation protocol: ChIP-reChIP. Methods Mol. Biol. 543:253.
358. Gerstein MB, Lu ZJ, Van Nostrand EL, Cheng C, Arshinoff BI, et al. (2010) Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project. Science 330:1775-1787.
359. The ENCODE Project Consortium (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447:799-816.
360. The modENCODE Consortium (2010) Identification of Functional Elements and Regulatory Circuits by Drosophila modENCODE. Science 330:1787-1797.
361. Raney BJ, Cline MS, Rosenbloom KR, Dreszer TR, Learned K, et al. (2011) ENCODE whole-genome data in the UCSC genome browser (2011 update). Nucleic Acids Res. 39:D871-D875.
362. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 39:D38-D51.
363. Sylvester JE, Gonzalez IL, Mougey EB (2003) Structure and organization of vertebrate ribosomal RNA genes. In: Olson M, editor. The Nucleolus. Georgetown, TX: Landes Bioscience. pp. 58-73.
