SUMO-1 mapping in the genome and its implications for control

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Hui-wen Liu

Graduate Program in Molecular, Cellular and Developmental Biology

The Ohio State University

2014

Dissertation Committee:

Jeffrey D. Parvin, M.D., Ph.D., Advisor

Ching-shih Chen, Ph.D.

Mark Parthun, Ph.D.

Amanda Toland, Ph.D.

Copyright by

Hui-wen Liu

2014

Abstract

SUMOylation, a post-translational modification in which SUMO is covalently conjugated to a variety of proteins, regulates a range of cellular processes, including cell proliferation and maintenance of genome stability. In this study, we investigated how SUMO-1 functions as a chromatin mark on the genome during cell cycle progression using a ChIP-seq approach. Surprisingly, despite the known repressive role of SUMOylation on histones, we found that during interphase SUMO-1 localizes to the promoters of constitutively active genes involved in translation and proliferation, for example ribosomal protein genes, and that the SUMO-1 marks on these promoters were absent during mitosis. In addition, SUMO-1 association with the promoters recruits RNAPII, and depletion of SUMO-1 leads to down-regulation of those ribosomal protein genes, suggesting a positive role of SUMO-1 in activation.

To further elucidate how SUMOylation regulates transcription related to protein synthesis, we found that SUMO-1 marks the promoters via the Scaffold Attachment Factor B (SAFB) protein. The results showed that SAFB is SUMOylated, and that depletion of SAFB decreased the SUMO-1 marks on the promoters of those housekeeping genes transcribed by RNAPII. In addition, depletion of SAFB decreased the splicing of the mRNAs and disrupted the organization of Cajal bodies, which are important for snRNP and snoRNP biogenesis. Together, these findings suggest that SUMOylation plays an important role in regulating transcription initiation and mRNA splicing of ribosomal protein genes.


Dedication

This dissertation is dedicated to my family.


Acknowledgements

I would like to thank my advisor, Dr. Jeffrey Parvin for supporting and guiding me throughout the past 5 years. Whenever I feel lost in my research, you are always willing to help. You have set a great example to be an outstanding researcher, mentor, and role model. I am truly grateful to have such an excellent mentor in my life.

I also thank my thesis committee members, Dr. Mandy Toland, Dr. Mark Parthun, and Dr. Ching-shih Chen. Not only your time and patience, but also your invaluable feedback and discussion, have guided me on my path as a researcher.

I am also thankful to my lab-mates, Zeina, Mansi, Muhtadi, Grace, Cindy, Shweta, Derek, Alaina, Eliana, and Ian. You have been very helpful and fun to work with. I will miss the time we spent hanging out (and working in the lab, of course!).

I thank my friends for all the support and suggestions; it is always a great pleasure to hang out with you guys. I am a person who gets homesick a lot, but you guys make me feel at home, and I really appreciate that.

Last but not least, I would like to thank my family and my husband Aaron Chen, for unconditional love and support. There are times when I feel frustrated, and you are always there for me. This journey would not have been possible without all the support from everyone, and I am truly thankful for everything I have.


Vita

2008-Present ...... PhD Candidate, The Ohio State University, Columbus, OH

2005 ...... MS, Microbiology and Biochemistry, National Taiwan University, Taiwan

2003 ...... BS, Agricultural Chemistry, National Taiwan University, Taiwan

Publications

1. Liu HW, Zhang J, Heine GF, Arora M, Ozer HG, Onti-Srinivasan R, Huang K, Parvin JD. Chromatin modification by SUMO-1 stimulates the promoters of translation machinery genes. Nucleic Acids Research. 40(20): 10172-86, 2012.

2. Arora M, Zhang J, Heine GF, Liu HW, Ozer G, Huang K, Parvin JD. Chromatin ubiquitination: a bookmark for transcription and DNA replication through mitosis. Nucleic Acids Research. 40(20): 10187-202, 2012.

3. Zhang J, Lu K, Xiang Y, Islam M, Kotian S, Kais Z, Lee C, Arora M, Liu HW, Parvin JD, Huang K. Weighted Frequent Gene Co-expression Network Mining to Identify Genes Involved in Genome Stability. PLoS Computational Biology. 8(8): e1002656, 2012.

4. Shen YF, Chen YH, Chu SY, Lin MI, Wu PY, Hsu HT, Wu CJ, Liu HW, Lin FY, Lin G, Hsu PH, Yang AS, Cheng SH, Wu YT, Wong CH, Tsai MD. E339…R416 salt bridge of nucleoprotein as a feasible target for influenza virus inhibitors. Proc Natl Acad Sci USA. 108(40): 16515-20, 2011.

5. Chen SC, Liu HW, Lee KT, Yamakawa T. High-efficiency Agrobacterium rhizogenes-mediated transformation of sHSP18.2-GUS in Nicotiana tabacum. Plant Cell Reports. 26: 29-37, 2007.

Fields of Study

Major Field: Molecular, Cellular and Developmental Biology


Table of Contents 

Abstract
Dedication
Acknowledgements
Vita
Table of Contents
List of Tables
List of Figures

Chapter 1: Introduction
1.1 Chromatin functions in eukaryotes
1.1.1 Chromatin structures are well organized in eukaryotes
1.1.2 Epigenetics and histone dynamics
1.1.3 Histone modifications
1.2 SUMO pathway
1.2.1 Enzymes involved in SUMOylation
1.2.2 SUMO proteases
1.2.3 SUMO proteins
1.2.4 SUMO-Interaction Motif
1.3 The role of SUMOylation in chromatin remodeling
1.3.1 SUMO localization on chromatin
1.3.2 SUMO modification of Histones and HDACs
1.3.3 Crosstalk between histone methylation and SUMOylation
1.4 SUMOylation and transcription regulation
1.4.1 Transcription repression
1.4.2 Transcription activation
1.4.3 SUMO, transcription, and chromatin structure
1.5 SUMO function in subnuclear structure
1.5.1 Polycomb bodies
1.5.2 PML bodies
1.5.3 Nucleolus and speckles
1.6 Interactions between SUMO and mRNA biogenesis
1.7 The role of SUMOylation in genome stability and tumorigenesis
1.7.1 SUMOylation regulates cell cycle progression
1.7.2 SUMOylation regulates DNA damage response
1.7.3 Deregulation of SUMO system causes tumorigenesis

Chapter 2: Rationale

Chapter 3: Chromatin modification by SUMO-1 stimulates the promoters of translation machinery genes
3.1 Abstract
3.2 Introduction
3.3 Materials and Methods
3.3.1 Cloning and Cell line generation
3.3.2 and used for Chromatin Immunoprecipitation (ChIP)
3.3.3 Cell culture, cell cycle analysis, and RT-qPCR
3.3.4 Chromatin immunoprecipitation, ChIP-qPCR, and Affinity Purification
3.3.5 ChIP DNA preparation for Solexa Sequencing
3.3.6 Data analysis
3.4 Results
3.4.1 Chromatin affinity purification of SUMO-1 through the cell cycle
3.4.2 Chromatin bound SUMO-1 is concentrated at transcriptional regulatory sites and is dynamic through the cell cycle
3.4.3 SUMO-1 labels the promoters of active genes
3.4.4 Correlation of SUMO-1 with other chromatin marks
3.4.5 SUMO-1 is a transcriptional activator of genes encoding ribosomal subunit proteins and translation initiation factors
3.5 Discussion

Chapter 4: SAFB participates in SUMO-1 binding on the constitutive active promoters
4.1 Abstract
4.2 Introduction
4.3 Materials and methods
4.3.1 Chromatin affinity purification (ChAP) for mass spectrometry analysis
4.3.2 Chromatin immunoprecipitation qPCR (ChIP-qPCR)
4.3.3 Antibody used and Immunofluorescent staining
4.4 Results
4.4.1 SUMOylation facilitates RNAPII recruitment on the constitutive active promoters
4.4.2 SAFB is SUMO-1 modified and associated with RP gene promoters
4.4.3 SAFB participates in SUMO-1 localization to promoters
4.4.4 SAFB depletion caused down regulation of mRNA processing of RP genes
4.4.5 SAFB is needed for proper Cajal bodies organization
4.5 Discussion

Chapter 5: Discussion and Future Direction
5.1 Summary of results
5.2 Issues to be resolved
5.2.1 The SUMOylation sites in SAFB1 that regulates SUMO-1 marks on the promoters
5.2.2 Other SUMO-1 targets that regulate SUMO-1 marks on the promoters
5.2.3 The E3 ligases responsible for SUMO-1 marking on the promoters
5.3 Future directions
5.3.1 SUMOylation regulating RNA processing
5.3.2 Gene regulation of RP biogenesis
5.4 Significance

Bibliography

Appendix: Supplementary information in this study

List of Tables

Table 1-1 Enzymes involved in SUMOylation

Table 4-1 GO term Analysis of chromatin-associated proteins pulled down by SUMO-1

List of Figures

Figure 1-1 Histone modification by writer, reader, and erasers.
Figure 1-2 Summary of histone modifications in human cells.
Figure 1-3 SUMOylation pathway.
Figure 1-4 Non-covalent interaction between SIM and SUMO proteins.
Figure 1-5 SUMOylation of PML is required for PML NB formation.
Figure 1-6 Imbalance of SUMOylation leads to tumorigenesis.
Figure 3-1 Characterization of HeLa-SUMO cell line.
Figure 3-2 Genome wide analysis of SUMO-1 binding.
Figure 3-3 Binding patterns of tagged-SUMO-1 detected by ChAP-seq are highly similar to the binding patterns of native SUMO-1 detected by ChIP-seq.
Figure 3-4 Consistent binding patterns of SUMO-1 detected by ChAP and ChIP-seq.
Figure 3-5 SUMO-1 marks chromatin at active sites on human genome.
Figure 3-6 SUMO-1 binding pattern is associated with active promoters.
Figure 3-7 SUMO-1 binding pattern is associated with transcriptional activation.
Figure 3-8 SUMO-1 binding pattern associates with highly transcribed genes in synchronized cells.
Figure 3-9 SUMO-1 is associated with chromatin of active genes.
Figure 3-10 SUMO-1 marked promoters are associated with genes marked with H3K4me3.
Figure 3-11 RNA-seq showing the high reproducibility of biological replicates.
Figure 3-12 Differential expression of genes following SUMO-1 depletion.
Figure 3-13 Validation of RNA-seq data under SUMO-1 depletion showing an enrichment of genes that encodes protein translation factors.
Figure 3-14 Examples of SUMO-1 tracing on specific promoters.
Figure 3-15 SUMO-1 activates expression of biogenesis genes.
Figure 4-1 SUMOylation facilitates RNAPII recruitment on the active promoters.
Figure 4-2 Confirmation of SUMO-1 enrichment to SAFB binding sites.
Figure 4-3 Identification of SAFB as SUMOylated substrate on the promoters.
Figure 4-4 SAFB participates in SUMO-1 association on promoters of RP genes.
Figure 4-5 SAFB or SUMO-1 depletion caused down-regulation of spliced mRNA in nucleus.
Figure 4-6 The effect of SAFB or SUMO-1 depletion on nuclear structure
Figure 4-7 SAFB or SAFB-1 depletion did not affect the expression of an inducible system.
Figure 5-1 Model of SAFB SUMOylation regulating RP
Figure 5-2 Su-SAFB involved in sensing stress.

Chapter 1: Introduction

1.1 Chromatin functions in eukaryotes

1.1.1 Chromatin structures are well organized in eukaryotes

Chromatin is the complex of DNA and protein located in the nucleus of the cell. The basic structural unit of chromatin is the nucleosome, which contains a DNA fragment of 147 base pairs (bp) tightly associated with a highly conserved histone octamer composed of H2A, H2B, H3, and H4 subunits and their variants. In eukaryotic cell nuclei, chromosomal DNA is wrapped into nucleosomes that form a ‘beads-on-a-string’ fiber, representing the first level of chromatin organization. Although chromatin was once considered a stable structure, accumulating evidence shows that it is highly dynamic. Euchromatin, the active chromatin, is loosely coiled, whereas heterochromatin is tightly packed, so genes within it are poorly expressed. Changes in chromatin structure are carried out by two mechanisms: 1) repositioning of nucleosomes or exchange of histone variants by ATP-dependent chromatin remodeling enzymes such as the Switch/Sucrose Nonfermentable (SWI/SNF) and SWI/SNF-related protein 1 (SWR1) complexes (Nguyen et al., 2013); and 2) covalent histone modification by specific enzymes, for example histone acetyltransferases (HATs), histone methyltransferases (HMTs), and kinases (Segal and Widom, 2009). Finally, these histone variants and modifications, along with other regulatory proteins and nuclear compartmentalization, make up the higher-order organization of the nucleus and further regulate cellular processes in response to environmental stimuli.

1.1.2 Epigenetics and histone dynamics

It is well known that DNA carries the genetic information that is faithfully passed from mother cell to daughter cell. Despite having identical genetic information, differentiated cells of the same multicellular organism display different gene transcription patterns, which contribute to phenotypic heterogeneity. ‘Epigenetics’ is defined as the study of heritable changes in genome function that occur without changes in DNA sequence. Under this definition, how particular genetic information is interpreted is determined by epigenetic marks, including DNA and histone modifications, histone variants, non-histone chromatin proteins, non-coding RNAs, and higher-order chromatin structure (Goldberg et al., 2007).

To fine-tune chromatin structure, the histone tails and globular domains are subject to post-translational modifications (PTMs) that regulate cellular processes such as transcription, replication, repair, and alternative splicing. To date, more than 130 PTMs on histones have been discovered in human cells (Tan et al., 2011). These histone modifications are added or removed by enzymes (writers and erasers), and are recognized by proteins (readers) with domains specific for certain chemical groups, such as methylation and acetylation (Figure 1-1). The crosstalk of histone modifications modulates gene expression by controlling the accessibility of chromatin, or by providing binding docks that recruit complexes to, or block them from, the chromatin. Some modifications, such as histone lysine methylation, are known to recruit specific binding proteins. For instance, histone H3 lysine 9 methylation (H3K9me) is recognized by HP1, and histone H3 lysine 27 trimethylation (H3K27me3) recruits PRC1, to repress gene expression, whereas acetylation of lysine residues is believed to have a more structural role, making the nucleosome structure more accessible to transcription factors (Spivakov and Fisher, 2007).

Writers (examples): methylases, acetylases, kinases

Readers (examples): PHD finger, chromodomain, bromodomain

Erasers (examples): deacetylases, demethylases, phosphatases

Figure 1-1 Histone modification by writer, reader, and erasers. The 'writers' are enzymes that catalyze the PTMs of the histones, and the 'reader' proteins recognize and bind to PTM of histones. 'Erasers' are the enzymes that remove these marks.


1.1.3 Histone modifications

Histone modifications play a central role in regulating transcription, and this ‘histone code’ regulates chromatin structure by recruiting chromatin-remodeling enzymes to reposition nucleosomes. Histone modifications therefore play crucial roles in many biological processes, such as cell-cycle regulation, DNA damage and stress responses, development, and differentiation. Among these modifications, methylation, acetylation, phosphorylation, ubiquitination, and SUMOylation have been extensively studied; they are summarized in Figure 1-2 and discussed in the following sections.

a. Methylation

Histone methylation mainly occurs on the side chains of lysines (K) and arginines (R). Lysines may be mono-, di- or tri-methylated, whereas arginines may be mono-methylated or symmetrically or asymmetrically di-methylated, which adds to the complexity of this modification (Greer and Shi, 2012). Histone methylation is involved in transcriptional regulation through alteration of chromatin structure, recruitment of transcription factors to specific loci, and interaction with initiation and elongation factors. In addition, it has been reported that histone modification affects RNA processing. To date, three families of enzymes have been identified that transfer methyl groups from S-adenosylmethionine (SAM) to histones: 1) the SET-domain-containing proteins and 2) the DOT1-like proteins, which methylate lysines, and 3) the protein arginine N-methyltransferase (PRMT) family, which methylates arginines. Two families of demethylases have been identified that remove methyl groups from lysines: 1) the amine oxidases and 2) the jumonji C (JmjC)-domain-containing iron-dependent dioxygenases. However, the enzymes responsible for demethylation of methyl-arginine remain elusive (Greer and Shi, 2012).

Methylation sites have been detected on all four core histones by proteomic analysis, although the biological functions of many of these sites remain to be determined. The most extensively studied histone methylation sites include H3K4, H3K9, H3K27, H3K36, H3K79 and H4K20, and the effects of histone methylation are context-dependent. For instance, H3K4me3 is generally associated with transcriptional activation or with genes that are poised for activation, whereas H3K27me3 and H3K9me3 are associated with gene repression. H3K4me3 is often associated with active promoters, whereas H3K4me1 is linked with enhancers. Nonetheless, the same histone mark can produce opposite outcomes depending on the ‘readers’ present. For example, while H3K4me2 and H3K4me3 are associated with gene activation, these marks can also lead to gene repression if recognized by inhibitor of growth family member 2 (ING2), through the stabilization of HDACs (Shi et al., 2006).

In addition, histone methylations can act either competitively or cooperatively. For example, H3K4me3 and H3K27me3 are marks associated with opposite effects on transcription; when they are present together, however, they keep genes poised for activation (Bernstein et al., 2006). On the other hand, in mammalian cells, dimethylation of H3R2 by PRMT6 is prevented by the presence of H3K4me3; conversely, H3R2me2 prevents H3K4 methylation (Guccione et al., 2007).
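The mark names used in this chapter follow a compact convention: histone, residue letter, residue position, modification abbreviation, and an optional degree (for example, H3K4me3 denotes histone H3, lysine 4, trimethylation). As a reading aid only, the following Python sketch decodes such names. The function name `parse_mark`, the `HistoneMark` record, and the abbreviation tables are illustrative assumptions and not part of this study, and the pattern covers only the simple mark names that appear in this text.

```python
import re
from typing import NamedTuple, Optional

# Abbreviation tables are assumptions drawn from the marks named in this chapter.
MODIFICATIONS = {
    "me": "methylation",
    "ac": "acetylation",
    "ph": "phosphorylation",
    "ub": "ubiquitination",
    "su": "SUMOylation",
}
RESIDUES = {
    "K": "lysine",
    "R": "arginine",
    "S": "serine",
    "T": "threonine",
    "Y": "tyrosine",
}

# histone, residue letter, position, modification, optional degree (1-3)
MARK_PATTERN = re.compile(
    r"^(H2AX|H2A|H2B|H3|H4)([KRSTY])(\d+)(me|ac|ph|ub|su)([123])?$",
    re.IGNORECASE,
)


class HistoneMark(NamedTuple):
    histone: str
    residue: str
    position: int
    modification: str
    degree: Optional[int]  # e.g. 3 for trimethylation; None if unspecified


def parse_mark(name: str) -> HistoneMark:
    """Decode a mark name such as 'H3K4me3' into its components."""
    match = MARK_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized histone mark name: {name!r}")
    histone, residue, position, modification, degree = match.groups()
    return HistoneMark(
        histone=histone.upper(),
        residue=RESIDUES[residue.upper()],
        position=int(position),
        modification=MODIFICATIONS[modification.lower()],
        degree=int(degree) if degree else None,
    )


if __name__ == "__main__":
    for mark in ("H3K4me3", "H3K27me3", "H3R2me2", "H3S10Ph", "H2AK119ub1"):
        print(mark, "->", parse_mark(mark))
```

The sketch deliberately ignores details that the name alone cannot express, such as symmetric versus asymmetric arginine dimethylation.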

b. Acetylation

Acetylation occurs on lysine residues of histones; it is catalyzed by histone acetyltransferases (HATs) and reversed by histone deacetylases (HDACs). There are two major superfamilies of HATs: GNAT (Gcn5-related N-acetyltransferase) and MYST (MOZ, Ybf2-Sas3, Sas2, Tip60). HDACs can be classified into three classes: class I (HDAC1, 2, 3 and 8), class II (HDAC4, 5, 6, 7, 9, 10) and class III (sirtuins). HATs use acetyl-CoA as a cofactor and transfer an acetyl group to the ε-amino group of a lysine residue, neutralizing the lysine's positive charge and weakening the interactions between histones and DNA, which makes the DNA template more accessible to DNA-binding proteins. Histone acetylation is therefore associated with transcriptional activation, whereas deacetylation leads to transcriptional repression. In addition to transcriptional activation, histone acetylation is also involved in DNA replication and repair. It has recently been shown that histone acetylation is associated with productive origin activation and DNA double-strand breaks.

c. Phosphorylation

Histone phosphorylation takes place on serine, threonine, lysine, and tyrosine residues in the N-terminal tails of histones; the modification is catalyzed by kinases and reversed by phosphatases. Kinases transfer a phosphate group from ATP to the hydroxyl group of the target side chain, adding negative charge to the histone and thereby influencing chromatin structure. Phosphorylation of histone H3 on S10, T3, T11 and S28 is involved in chromosome compaction and segregation during mitosis and meiosis. H3S10ph and H3S28ph are mediated by Aurora kinase B (AURKB) during prophase and are dephosphorylated by PP1 at the end of mitosis (Banerjee and Chakravarti, 2011). In addition, DNA damage induces phosphorylation of the histone variant H2AX at S139 by the ATM or ATR kinases; the phosphorylated form is called γ-H2AX (Rogakou et al., 1998; Ward et al., 2004). γ-H2AX is required for the assembly of DNA repair proteins at sites of damaged chromatin and for the activation of checkpoints that arrest cell cycle progression (Bonner et al., 2008). Although many histone phosphorylation sites have been identified by mass spectrometry, the biological functions of most of them remain largely unknown (Banerjee and Chakravarti, 2011).

d. Ubiquitination

Ubiquitin is a 76-amino acid polypeptide that is attached to histone lysines via the sequential action of three enzymes: E1-activating, E2-conjugating and E3-ligating enzymes. While methylation, acetylation and phosphorylation of histones make small molecular changes to amino acid side chains, ubiquitylation results in a much larger covalent modification. The enzyme complexes determine both substrate specificity and the type of ubiquitylation (mono- or poly-ubiquitylation). Mono-ubiquitylation of H2A and H2B are the best-known modifications associated with gene transcription. In mammalian cells, H2AK119ub1 is involved in gene silencing, whereas H2BK123ub1 plays an important role in transcriptional initiation and elongation. Ubiquitination is removed by isopeptidases called ubiquitin-specific proteases, or deubiquitinase enzymes, and this activity is important for both gene activity and silencing. However, although mono-ubiquitination of histones seems most relevant, the exact modification sites and their biological functions remain largely elusive.

e. SUMOylation

SUMOylation is a PTM similar to ubiquitination; it modifies histone lysines by attaching Small Ubiquitin-like MOdifier (SUMO) proteins through E1, E2 and E3 enzymes. Histone SUMOylation has been identified on all core histones in yeast and is associated with gene repression (Nathan et al., 2006). However, whether SUMOylated proteins (histones and chromatin-associated proteins) are solely involved in gene silencing remains controversial and will be discussed in detail in Chapter 3. More work is clearly needed to elucidate the mechanisms through which SUMOylation exerts its effect on chromatin.


Figure 1-2 Summary of histone modifications in human cells. Adapted from Nature Structural & Molecular Biology 14, 1017-1024 (2007). ac: acetylation; bio: biotinylation; me: methylation; su: SUMOylation; cit: citrullination. An asterisk indicates that either the histone amino acid sequence or the modification is from S. cerevisiae.

1.2 SUMO pathway

1.2.1 Enzymes involved in SUMOylation

SUMOylation, an evolutionarily conserved post-translational modification, proceeds through a 3-step enzymatic cascade similar to that of ubiquitination (Figure 1-3). Mechanistically, it involves activation by an E1 enzyme (SAE1/SAE2) and conjugation of mature SUMO proteins to lysine residues of the substrates by an E2 conjugating enzyme (Ubc9). SUMOylation can be further facilitated by various E3 ligases, which are also believed to contribute to specificity, yet the mechanisms that determine specificity in the SUMO pathway are not fully understood (Flotho and Melchior, 2013). Compared to the very large number of ubiquitin E3 ligases, relatively few SUMO E3 ligases have been found. The PIAS family ligases (PIAS1 to 4 in human) were the first E3 ligases characterized. They contain an SP-RING (Siz/PIAS RING) domain that is essential for Ubc9 interaction. The second well-defined E3 ligase, RanBP2/Nup358, is a large nuclear pore protein that localizes to the cytoplasmic face of the nuclear pore complex (NPC) and contains two 50-amino acid internal repeat domains (IR1 and IR2) that function as the E3 active site. Interestingly, a recent study showed that RanBP2/RanGAP1-SUMO1/Ubc9 forms a multisubunit SUMO E3 ligase throughout the cell cycle, suggesting a connection between SUMOylation and Ran hydrolysis that may affect the efficiency of some nucleoplasmic transport processes (Werner et al., 2012). To date, other E3 ligases, such as Polycomb-2 (Pc2), class II HDACs, topoisomerase I binding RING finger protein (TOPORS), the PHD-containing protein KAP-1, Ras homologue enriched in striatum (RHES), and Krox20, have also been identified (Table 1-1).

Table 1-1 Enzymes involved in SUMOylation

Enzyme class                    Yeast               Human
E1                              Aos1/Uba2           SAE1/SAE2
E2                              Ubc9                Ubc9
SP-RING type E3 ligases         Siz1, Siz2, Mms21   PIAS1, PIAS2, PIAS3, PIAS4
IR E3 ligases                   -                   RanBP2
Other E3 ligases                -                   KAP1, Pc2, TOPORS, Class IIa HDACs, RHES, Krox20
C48 family cysteine proteases   Ulp1, Ulp2          SENP1-3, SENP5-7
Other proteases                 -                   DESI-1, DESI-2, USPL1


1.2.2 SUMO proteases

SUMOylation is highly dynamic and can be reversed by specific cysteine proteases that have been found in eukaryotes. These SUMO proteases cleave the isopeptide bond formed between the carboxy-terminal glycine of SUMO and a lysine of the substrate. In yeast, the SUMO protease, also known as ubiquitin-like protein protease (Ulp), is responsible both for cleaving the SUMO precursor to generate mature SUMO proteins and for removing covalently bound SUMO from substrates. In humans, the first identified Ulp isoforms are called SUMO/sentrin-specific proteases (SENP1 to 3, SENP5 to 7). SUMO proteases differ in protein localization and function. For example, SENP1 and SENP2 accept both SUMO1 and SUMO2/3 as substrates, whereas SENP3 and SENP5 specifically interact with SUMO2/3 (Di Bacco et al., 2006; Gong and Yeh, 2006). In addition, SENP1 and 2 shuttle between the nucleus and the cytoplasm but are not found in the nucleolus, whereas SENP3 and 5 are mainly localized in the nucleolus and are hardly found in the cytoplasm. Both SENP6 and 7 have been shown to localize to the nucleoplasm (Choi et al., 2006; Mukhopadhyay et al., 2006; Shen et al., 2009). Moreover, three new SUMO proteases have been identified in humans: two members of the PPPDE superfamily, desumoylating isopeptidase 1 and 2 (DESI1 and DESI2) (Shin et al., 2012), and ubiquitin-specific protease-like 1 (USPL1) (Schulz et al., 2012). These share little sequence similarity with the SENP proteases (summarized in Table 1-1).


[Diagram labels: SUMO precursor (Su-GG-XXXX) processed by SENP to mature SUMO (Su-GG); E1 (SAE1/SAE2); E2 (UBC9); E3 ligase; SUMOylation of the target lysine (K); DeSUMOylation by SENP.]

Figure 1-3 SUMOylation pathway. SUMOylation starts with activation by an E1 enzyme (SAE1/2), followed by conjugation of mature SUMO proteins to lysine residues of the substrates by an E2 conjugating enzyme (Ubc9). SUMOylation can be facilitated by various E3 ligases. The reaction is reversed by SUMO proteases called SENPs (DeSUMOylation).

1.2.3 SUMO proteins

In Saccharomyces cerevisiae and Drosophila melanogaster, a single SUMO protein, encoded by the SMT3 gene, is found. In contrast, four isoforms are found in vertebrates. SUMO-1 shares 40% identity with SUMO-2/3, whereas SUMO-2 and -3 share 95% identity; SUMO-4 shares around 86% identity with SUMO-2/3, but it is debatable whether it is functional in cells (Geiss-Friedlander and Melchior, 2007). Unlike ubiquitination, SUMOylation does not generally tag proteins for proteasomal degradation, but rather alters their function and/or localization. Conjugation of SUMO proteins affects protein stability and enzyme activity, and thereby plays an important role in a wide range of cellular processes. The available data suggest that SUMOylation is essential for most organisms. Deletion mutants of SUMO display severe growth defects in yeast; in addition, complete loss of SUMOylation is embryonic lethal in Ubc9 knockout mice (Nacerddine et al., 2005). In mammals, SUMO-1 knockout mice are viable, suggesting that its function can be compensated by SUMO-2/3 (Alkuraya et al., 2006; Evdokimov et al., 2008; Zhang et al., 2008a). Moreover, while SUMOylation is required for early development, all three SUMO isoforms are functionally redundant (Yuan et al., 2010). Most SUMO-modified proteins are found in the nucleus, but cytosolic SUMO substrates have also been identified. In mammalian cells, SUMO-2/3 proteins can form polymeric chains, whereas SUMO-1 is generally found in monomeric form, indicating that the different isoforms may play different roles in gene regulation.
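The identity percentages quoted above are, in general terms, the fraction of aligned positions at which two sequences carry the same residue. A minimal Python sketch of that arithmetic follows. The function name `percent_identity` and the toy sequences are hypothetical, the input is assumed to be already aligned (equal length, gaps written as "-"), and excluding gapped columns from the denominator is only one of several conventions in use, so published values may be computed differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" or residue_b == "-":
            continue  # assumed convention: skip columns containing a gap
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy peptides, not real SUMO sequences: 4 of 5 positions match.
    print(percent_identity("MKTAY", "MKSAY"))
```

Applying this to real SUMO isoforms would first require aligning the full protein sequences with an alignment tool, which is outside the scope of this sketch.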

1.2.4 SUMO-Interaction Motif

In addition to the covalent attachment of SUMO proteins to substrates, some SUMOylated proteins are also recognized by proteins harboring a SUMO-interaction motif (SIM). This motif contains a hydrophobic core that binds non-covalently to an interaction surface on SUMO proteins (Hecker et al., 2006). There is emerging evidence for the importance of SIMs in regulating protein function. For instance, the transcription corepressor Daxx is SUMOylated, and the SIM of Daxx plays an important role in regulating the function and localization of the protein; Daxx lacking its SIM fails to be covalently modified by SUMO, suggesting that either targeting through the SIM or the interaction between SIM and SUMO is a prerequisite for proper SUMO modification (Figure 1-4) (Lin et al., 2006). SIMs have also recently been reported to be involved in DNA repair. The DNA repair protein Rad51 associates with SUMO through a SIM, and SUMOylation of Rad51 is necessary for accumulation of Rad51 at DNA damage sites (Shima et al., 2013).

SIM-SUMO interactions promote the assembly of higher-order protein complexes, such as the formation of PML bodies and the recruitment of different repressor complexes to chromatin through SUMOylated Sp3 for gene repression, as described in sections 1.4.1 and 1.5.2.

[Diagram panels: Weak interaction (unmodified substrate and Daxx); Strong interaction (SUMOylated substrate and Daxx); Complex formation (several SUMOylated substrates and Daxx).]

Figure 1-4 Non-covalent interaction between SIM and SUMO proteins. A transcriptional co-repressor, Daxx, contains a hydrophobic SUMO-interacting motif (SIM). SUMOylation of its partners enhances the protein–protein interactions. If several proteins of an assembly also have SIMs, SUMOylation facilitates protein-complex formation.

1.3 The role of SUMOylation in chromatin remodeling

1.3.1 SUMO localization on chromatin

SUMOylation of chromatin-associated proteins plays a conserved role in proper chromatin organization. For example, SUMO proteins bound to mitotic chromosomes are found in S. cerevisiae (Biggins et al., 2001) and D. melanogaster (Lehembre et al., 2000). In mammalian cells, SUMO-2/3 is located at the centromere and on condensed chromosomes, whereas SUMO-1 localizes to the mitotic spindle and spindle midzone during mitosis (Zhang et al., 2008b). It has been demonstrated that SUMO is required for the maintenance of constitutive heterochromatin in fission yeast. SUMO-1 is found on the constitutive heterochromatin in human spermatocytes (Metzler-Guillemain et al., 2008). In addition, SUMO-1 and -2/3 localize to heterochromatin domains enriched in methyl-CpG-binding domain protein 1 (MBD1), and other heterochromatin proteins such as HP1 and MCAF1 are also SUMOylated. Depletion of either SUMO-2/3 or SUMO-1 induced dissociation of MCAF1, H3K9me3, and the HP1 proteins from the MBD1-containing heterochromatin foci, suggesting that formation of heterochromatin is SUMO-dependent (Uchimura et al., 2006).

1.3.2 SUMO modification of Histones and HDACs

Direct modification of histones by SUMO is the simplest model for how SUMOylation modulates chromatin dynamics. In human cells, only histone H4 has been shown to be a SUMO substrate (Shiio and Eisenman, 2003), whereas all four core histones are SUMOylated in S. cerevisiae (Nathan et al., 2006). SUMOylation of H4 increased its interaction with endogenous HDAC1 and with heterochromatin protein 1 (HP1γ), suggesting that histone SUMOylation is involved in gene repression (Shiio and Eisenman, 2003). In addition, SUMOylation of Elk-1 caused the recruitment of HDAC2, which deacetylated the chromatin of bound promoters and repressed Elk-1 target genes (Yang et al., 2003). Interestingly, the NAD+-dependent HDAC SIRT1, as well as HDAC1, 4, 6, and 9, are SUMOylated. However, it is not clear whether SUMOylation of class I HDACs specifically affects their association with transcription factor targets (Brandl et al., 2009; Yang et al., 2007).

Mass spectrometry and mutagenesis analyses have identified SUMOylation sites on histone H2B (H2BK6/7/16/17) in budding yeast. Interestingly, these sites do not match the canonical SUMOylation consensus motif (ΨKXE). In addition, many of the identified SUMOylation sites on H2B are also targeted for acetylation and ubiquitination, which are involved in gene activation, suggesting that SUMOylation competes with those activation signals. Indeed, mutation of the SUMOylated lysines to alanines, which cannot be SUMOylated, increases basal transcription of several genes in yeast. Conversely, SUMOylation of H2B inhibits induction of the GAL1 gene, which further supports a mechanism in which H2B SUMOylation competes with activating modifications (Nathan et al., 2006). However, it should be noted that although SUMOylation and acetylation target overlapping sites, acetylation is more prevalent. Therefore, more structural data are needed to clarify the effect of histone SUMOylation.

Alternatively, histone SUMOylation may act through other histone-modifying proteins, such as the HDACs described above. The crosstalk of histone SUMOylation with other histone modifications suggests that SUMOylation contributes to chromatin dynamics by modulating access to the DNA template. That is, rather than targeting a specific residue, SUMOylation of multiple lysines in histones represses gene transcription by either blocking or recruiting other factors. However, it is not clear whether recruitment of HDACs through SUMOylation involves direct binding of SUMO to the HDAC or other cofactors recruited in a SUMO-dependent fashion.
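The canonical consensus motif mentioned in this section (ΨKXE) lends itself to a simple sequence scan. The sketch below is illustrative only and is not part of the analyses in this work: the function name and the example sequence are hypothetical, and Ψ is assumed here to be one of I, L, V, M or F, since the text gives only the symbolic motif.

```python
import re

# Assumption: Ψ (large hydrophobic residue) is taken to be I, L, V, M or F.
HYDROPHOBIC = "ILVMF"
# A lookahead is used so that overlapping motif instances are all reported.
CONSENSUS = re.compile(rf"(?=([{HYDROPHOBIC}]K[A-Z]E))")

def find_consensus_lysines(sequence):
    """Return (1-based lysine position, matched tetrapeptide) for each ΨKXE hit."""
    seq = sequence.upper()
    # m.start() is the 0-based index of Ψ; the lysine is the next residue.
    return [(m.start() + 2, m.group(1)) for m in CONSENSUS.finditer(seq)]

# Hypothetical peptide, used only to show the call.
hits = find_consensus_lysines("MAIKQEGLVKTEPK")
print(hits)
```

A lysine absent from such a scan is not necessarily unmodified; as noted above, the H2B sites identified in yeast do not match the consensus.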


1.3.3 Crosstalk between histone methylation and SUMOylation

SUMOylation regulates the accessibility of chromatin by modulating the architecture and functions of several chromatin-modifying complexes. Several studies have indicated that SUMOylation is linked to histone methylation. The histone methyltransferase SETDB1 is responsible for a repressive mark, H3K9me3, and it forms a complex with MBD1, linking DNA methylation to histone methylation (Lyst et al., 2006). However, SUMOylation of MBD1 interferes with its interaction with SETDB1, so that SUMOylated MBD1 fails to efficiently repress transcription of a target gene (Lyst et al., 2006), suggesting that the effects of SUMOylation on gene expression and chromatin structure may be both positive and negative.

1.4 SUMOylation and transcription regulation

1.4.1 Transcription repression

Transcription factors and co-regulators are among the most abundant groups of SUMO targets; therefore, SUMOylation plays a crucial role in regulating gene transcription. Transcription activation is positively correlated with histone acetylation, whereas silenced genes have low histone acetylation. Ubc9 depletion caused a global increase in acetylation in yeast (Nathan et al., 2006), suggesting that SUMOylation plays a repressive role in transcription. Two mechanisms, SUMOylation of H2B competing with activation signals and SUMO recruiting HDACs to repress gene expression, have been described in section 1.3.2. In addition, direct modification of transcription factors by SUMO, such as Elk-1, c-Jun, C/EBP, and Sp3, primarily results in transcriptional repression. In a Gal4-based reporter system, the expression of SUMO or Ubc9 represses downstream gene expression (Yang et al., 2003). Conversely, overexpressing SUMO proteases or depleting Ubc9 or SUMO enhances ectopic gene expression (Ouyang et al., 2009a; Poulin et al., 2005; Spektor et al., 2011). Interestingly, SUMOylation of coactivators such as p300 and CREB-binding protein (CBP) converts them from coactivators into repressors (Girdwood et al., 2003).

1.4.2 Transcription activation

Although many cases have indicated that SUMOylation is associated with gene repression, there are also examples in which SUMOylation is required for efficient gene activation. For instance, the ETS-domain transcription factor PEA3 is SUMOylated, and this modification is required for PEA3-mediated transcriptional activation (Guo and Sharrocks, 2009); in addition, SUMOylation of PEA3 is needed for RNF4-directed ubiquitination and degradation of PEA3 (Guo et al., 2011). Because SUMOylation may compete with other PTMs, it may also stabilize transcription factors, for instance Oct4 (Wei et al., 2007). Another proposed mechanism is that SUMOylation interrupts the association between a repressor and a transcription factor. For example, SUMOylation of Ikaros disrupts its interaction with HDACs (Gomez-del Arco et al., 2005). In another example, although SUMOylation of the transcription factor SF-1 repressed its transcriptional activity in cultured cells, mice expressing a non-SUMOylatable SF-1 mutant failed to phenocopy a constitutively active SF-1 (Lee et al., 2011). Therefore, SUMOylation is not simply a repressive or activating signal, but rather governs the fine-tuning of SF-1 activity during embryonic development. Interestingly, SUMO has been reported to associate with constitutively active promoters in both yeast and human cells (Liu et al., 2012; Rosonina et al., 2010). This again indicates that although SUMOylation is involved in gene repression, it is also associated with transcriptional activation of certain genes, such as housekeeping genes, either structurally or through a signaling pathway.

1.4.3 SUMO, transcription, and chromatin structure

Besides regulating transcription by modifying transcription factors, SUMOylation fine-tunes transcription by modulating chromatin modification and chromatin structure. For instance, Sp3, a transcription factor that activates or represses transcription depending on the context, is repressed by SUMO-1 modification. Interestingly, while both DNA and histone methylation are reduced at promoters in cells expressing an unSUMOylatable Sp3 mutant, heterochromatin protein 1 (HP1) and two ATP-dependent chromatin remodelers are also down-regulated (Stielow et al., 2010; Stielow et al., 2008), suggesting that SUMOylation of a single transcription factor may have multiple effects that alter chromatin structure. In other words, SUMOylation governs transcriptional activity through transcription factors, other chromatin-associated proteins, and chromatin-modifying enzymes in a context-dependent manner.


1.5 SUMO function in subnuclear structure

1.5.1 Polycomb bodies

Polycomb group (PcG) proteins are transcription repressors that cluster in nuclear foci called PcG bodies. PcG proteins tightly regulate chromatin structure, especially for long-term heritable silencing of homeotic genes during development. The connection between SUMOylation and the expression of PcG target genes was demonstrated in C. elegans: SUMOylation of a PcG protein, SOP-2, is required for its localization to nuclear bodies in vivo and for silencing Hox genes, and defective SUMOylation causes ectopic expression of Hox genes (Zhang et al., 2004). PcG proteins form Polycomb repressive complexes, PRC1 and PRC2. The PRC2 complex contains the histone methyltransferase EZH2, which catalyzes H3K27me3, an epigenetic mark that acts mainly in gene silencing (Cao et al., 2002). This mark recruits the PRC1 complex and blocks chromatin remodeling by SWI/SNF factors, thus preventing transcription of repressed chromosome regions (Shao et al., 1999). Furthermore, the recruitment of PRC1 to promoters leads to H2A ubiquitination. In addition, Pc2, a human PcG protein found in the PRC1 complex, is a SUMO E3 ligase (Kagey et al., 2003), suggesting that SUMOylation plays a role in the formation or regulation of PcG bodies. In SENP2-deficient mice, SUMOylated Pc2 accumulates, causing enhanced assembly of PRC1 at the promoters of genes essential for development and resulting in embryonic heart defects (Kang et al., 2010).


1.5.2 PML bodies

Promyelocytic leukemia nuclear bodies (PML NBs) are nuclear structures that participate in the DNA damage response, apoptosis induction, angiogenesis, telomere maintenance, cell proliferation, senescence, and the anti-viral response (Bernardi and Pandolfi, 2007; Salomoni and Pandolfi, 2002). The formation of PML NBs is known to be SUMO-dependent. PML, the major structural protein of PML NBs, was one of the first SUMO targets identified in cells. Structurally, PML contains a SIM that is independent of its SUMOylation sites, and this non-covalent interaction plays a critical role in PML nuclear body assembly (Shen et al., 2006). In addition, SUMOylation of PML is necessary for the recruitment of other partners, many of which are SUMOylated themselves or bind via a SIM (Figure 1-5). SENP7 and SENP6 are the SUMO proteases that regulate the recruitment of endogenous PML and SUMO to PML NBs, controlling the size and number of PML NBs (Hattersley et al., 2011; Shen et al., 2009).


[Figure 1-5 diagram: panels labeled PML dimer, PML network, and PML-NB; legend: SIM, SUMO, proteins with SIM, SUMOylated proteins, PML]

Figure 1-5 SUMOylation of PML is required for PML NB formation. PML dimers are covalently SUMOylated and further assemble into a polymer via non-covalent SUMO-SIM interactions. Finally, this PML network recruits partner proteins with SIMs, or SUMOylated partner proteins, to form PML NBs.

1.5.3 Nucleolus and speckles

The nucleolus is the subnuclear organelle responsible for rRNA synthesis and processing and for assembly of the large and small ribosomal subunits. Previous studies have shown that the SUMO system plays an important role in the regulation of rRNA processing and pre-ribosomal particle assembly, as well as in nucleolar rDNA integrity (Darst et al., 2008). For example, depletion of the SUMO E3 ligase Mms21, a component of the Smc5/6 complex, causes abnormal nucleolar morphology (Zhao and Blobel, 2005).


Nuclear speckles are subnuclear structures that coordinate pre-mRNA splicing and mRNA export. Nuclear speckles contain not only mRNA processing factors but also many other factors involved in transcription and in the production of functional mRNAs, such as transcription factors, RNA polymerase II subunits, cleavage and polyadenylation factors, and RNA export proteins. In mouse oocytes, Ubc9 localizes to nuclear speckles and stimulates transcription (Shikina et al., 2008), and the serine/arginine-rich protein SF2/ASF, a factor involved in splicing regulation and other RNA metabolism-related processes, is a regulator of SUMOylation (Pelisch et al., 2010).

1.6 Interactions between SUMO and mRNA biogenesis

The role of SUMO at distinct steps of mRNA metabolism was first indicated by proteomic analyses that identified SUMO targets: multiple factors involved in transcription or mRNA-related processes were found to be SUMO targets in yeast (Denison et al., 2005; Panse et al., 2004; Wohlschlegel et al., 2004) and in mammalian cells (Gocke et al., 2005; Tatham et al., 2011; Vertegaal et al., 2006). A significant overrepresentation of factors involved in mRNA metabolism was observed among proteins modified by SUMO-2/3 (Blomster et al., 2009; Bruderer et al., 2011). In addition, the targets identified for an E3 SUMO ligase, TOPORS, include a number of transcriptional regulators and factors involved in mRNA processing (Pungaliya et al., 2007). Functional analysis of a subset of these SUMOylated proteins, in particular transcriptional regulators, has confirmed the key function of SUMO-dependent regulation in mRNA biogenesis.

1.7 The role of SUMOylation in genome stability and tumorigenesis

1.7.1 SUMOylation regulates cell cycle progression

Chromosome segregation is a crucial step of cell cycle progression that ensures equal amounts of genetic material are distributed to the two daughter cells. Cancer cells are often defective in chromosome segregation, resulting in multipolar spindles, chromosome instability, and aneuploidy (Draviam et al., 2004). SUMOylation plays a critical role in regulating mitotic chromosome structure and segregation, and the connection between SUMOylation and the regulation of mitosis is conserved from yeast to human. In budding yeast, disruption of the SUMO pathway led to a G2/M cell cycle arrest (Seufert et al., 1995). In addition, the septins Cdc3, Cdc11, and Shs1 were found to be transiently SUMOylated from the onset of anaphase to cytokinesis, indicating a role for SUMO in regulating septin ring dynamics during the cell cycle (Johnson and Blobel, 1999). In Drosophila, deletion of SUMO caused nuclear cleavage defects such as chromosome hypercondensation, aberrant segregation, and polyploidy (Nie et al., 2009). In Xenopus egg extracts, SUMO-2/3 regulates topoisomerase II in mitosis, and loss of SUMOylation blocked the dissociation of sister chromatids at the metaphase–anaphase transition (Azuma et al., 2003). In a murine model, Ubc9-deficient mice showed major chromosome condensation and segregation defects in mitosis, including hypercondensation and breakage, as well as high rates of missegregation and polyploidy. These defects resulted in early embryonic lethality (Nacerddine et al., 2005). SUMOylation has been found to regulate many proteins of the kinetochore complex, which is important for accurate chromosome segregation during mitosis. For example, CENP-E, one of the motors responsible for chromosome movement and spindle elongation, was modified by SUMO-2/3 during mitosis, and this poly-SUMOylation is essential for its kinetochore localization (Zhang et al., 2008b). In conclusion, these results suggest that SUMOylation is important in coordinating multiple events for accurate chromosome segregation in mitosis.

1.7.2 SUMOylation regulates DNA damage response

Many DNA damage response proteins are SUMO and/or ubiquitin targets, suggesting the involvement of SUMOylation and ubiquitination in regulating checkpoint responses and DNA repair pathways. For example, BRCA1, an E3 ubiquitin ligase known for maintaining genome stability, is SUMOylated in response to DNA damage and co-localizes with SUMO-1, SUMO-2/3, and Ubc9 at DNA damage sites. SUMOylation of BRCA1 is required for its ubiquitin ligase activity, which in turn recruits downstream repair factors to the damage sites (Morris et al., 2009). Another well-characterized tumor suppressor, p53, is also modified by SUMO in response to UV irradiation (Rodriguez et al., 1999). In addition, UV induces PIAS3-mediated SUMOylation of hnRNP-K, which increases hnRNP-K stability, the interaction between hnRNP-K and p53, and p21 expression in an ATR-dependent manner, leading to cell cycle arrest (Lee et al., 2012).

1.7.3 Deregulation of SUMO system causes tumorigenesis

The balance between SUMOylation and deSUMOylation is strictly regulated in normal cells, and this fine-tuned modification regulates various biological processes, such as nuclear-cytosolic transport, protein stability, apoptosis, transcriptional regulation, DNA repair, cell proliferation, and cell cycle progression (Geiss-Friedlander and Melchior, 2007). Previous studies showed that SUMOylation is a cell cycle-dependent event and that deSUMOylation is critical for DNA repair through homologous recombination. Emerging evidence suggests that deregulation of SUMOylation leads to tumorigenesis.

First, loss of balance between SUMOylation and deSUMOylation has been shown in tumor samples (Lee et al., 2009) (Figure 1-6). For example, Ubc9 is up-regulated in several types of cancer, such as ovarian (Mo et al., 2005), breast, head and neck, and lung cancer (Wu et al., 2009). Furthermore, SUMOylation is up-regulated in tumor cells, for example in leukemia and breast cancer (Zhu et al., 2005; Zhu et al., 2010), whereas factors that control deSUMOylation have been shown to be repressed (Kim et al., 1999). Nevertheless, up-regulation of deSUMOylation has also been shown in several cancer types (Sarge and Park-Sarge, 2009). Second, a number of transcription factors and tumor suppressors are SUMO substrates, such as PML, p53, c-FOS, c-JUN, PTEN, and Akt (Geiss-Friedlander and Melchior, 2007). By changing the localization of these nuclear proteins, the SUMO system can regulate downstream signaling pathways that are critical to cell proliferation. A recent study showed that blocking the SUMO-activating enzyme (SAE1/2) alters a subgroup of targets and disrupts proper mitosis and cell viability in c-MYC-driven cancer cells (Kessler et al., 2012). In addition, a number of receptors and intracellular signaling factors are SUMO substrates, such as IGF-1R (Sehat et al., 2010), the type I TGF-β receptor (Kang et al., 2008), reptin (Kim et al., 2006), and pontin (Kim et al., 2007). Therefore, deregulation of SUMOylation of these receptors may contribute to carcinogenesis.

SUMOylation also plays a critical role in maintaining genome integrity (Muller et al., 2004). For instance, under genotoxic stress, SUMO proteins, Ubc9, and PIAS proteins are recruited to DNA double-strand break sites, and BRCA1 is SUMOylated by PIAS1/4 to enhance its E3 ubiquitin ligase activity (Galanty et al., 2009; Morris et al., 2009). Moreover, SUMOylation of transcription factors, cofactors, or chromatin-remodeling factors modulates transcriptional activity and regulates many signaling pathways known to be related to cancer progression, such as PRCs (Bracken and Helin, 2009), NF-κB (Mabb and Miyamoto, 2007), and steroid pathways (Faus and Haendler, 2006). This shows that the crosstalk between SUMOylation and deSUMOylation is highly orchestrated.


Figure 1-6 Imbalance of SUMOylation leads to tumorigenesis. Normal cell development requires the balance between SUMOylation (red circle) and deSUMOylation (green circle). The deregulation of SUMOylation may result in the development of various types of tumor.

In summary, SUMOylation plays a crucial role in regulating chromatin structure, gene expression, and genome integrity. This post-translational modification affects the stability, activity, and localization of its target proteins. Despite the evidence that altered SUMOylation disrupts cell cycle progression and contributes to human tumorigenesis, how SUMOylation modulates transcription remains controversial. A systematic mapping of SUMO proteins bound to chromatin across the genome during cell cycle progression has not previously been accomplished. The main hypothesis of this research is that SUMO-1 can function as a chromatin remodeling mark on the human genome during cell cycle progression. We found that SUMO-1 is present on the proximal promoter regions of constitutively active, highly transcribed genes, such as ribosomal protein genes, during interphase. We have identified that SUMO-1 marks these promoters through the Scaffold Associated Factor B (SAFB) protein, yielding initial mechanistic insights into this important regulatory process. Deciphering a comprehensive map of the SUMOylation machinery will not only build our basic understanding of the SUMO pathway, but also identify SUMOylation-related targets that are vital to cell growth.


Chapter 2: Rationale

Many studies have focused on protein-protein interactions involving SUMO, and a correlation between SUMOylation and transcriptional control is emerging. However, our current understanding of SUMOylation as a chromatin modification on the human genome is still rudimentary. SUMOylation of histones and many transcription factors is generally considered to be a repressive signal that recruits HDACs to genes and inhibits mRNA synthesis (Gill, 2010). Nonetheless, SUMOylated proteins are detected at active, but not repressed, genes in yeast (Rosonina et al., 2010). Thus, it remains unclear which genes are induced or repressed by SUMOylation and in what manner they are controlled. Although some genome-wide expression profiles related to SUMOylation have been reported using microarrays (Stielow et al., 2008; Zhou et al., 2005), how SUMO proteins coordinate chromatin structure during cell cycle progression is still unknown. Therefore, genome-wide mapping of SUMOylation throughout the cell cycle would lead to a more comprehensive understanding of how SUMOylation regulates transcription. In this study we characterized the distribution of SUMO-1 across the human genome and how the SUMO-1-bound sites change during different cell cycle stages. To detect SUMO-1-bound loci, we used chromatin affinity purification (ChAP) specific to an epitope-tagged SUMO-1. This unbiased method was used to detect changes in SUMOylation of all chromatin proteins regardless of the type of substrate or the function of the modification. Our study provided the first comprehensive view of SUMO-1 on chromatin and unveiled a previously unexpected role in transcriptional activation.


Chapter 3: Chromatin modification by SUMO-1 stimulates the promoters of translation machinery genes

Liu HW, Zhang J, Heine GF, Arora M, Ozer HG, Onti-Srinivasan R, Huang K, Parvin JD. Chromatin modification by SUMO-1 stimulates the promoters of translation machinery genes. Nucleic Acids Research (2012). 40(20): 10172-86. (Feature Article)

Author contributions:

• Liu HW & Parvin JD designed the experiments.

• Zhang J, Ozer HG and Huang K performed bioinformatics analysis of the high-throughput data.

• Arora M contributed to discussions about data interpretation

• Heine G provided preliminary data


3.1 Abstract

SUMOylation of transcription factors and chromatin proteins is in many cases a negative mark that recruits factors that repress gene expression. In this study, we determined the occupancy of SUMO-1 on chromatin in HeLa cells by use of chromatin affinity purification coupled with next generation sequencing. We found SUMO-1 localization on chromatin was dynamic throughout the cell cycle. Surprisingly, we observed that from G1 through late S phase, but not during mitosis, SUMO-1 marks the chromatin just upstream of the transcription start site on many of the most active housekeeping genes, including genes encoding translation factors and ribosomal subunit proteins. Moreover, we found that SUMO-1 distribution on promoters was correlated with H3K4me3, another general chromatin activation mark. Depletion of SUMO-1 resulted in downregulation of the genes that were marked by SUMO-1 at their promoters during interphase, supporting the concept that the marking of promoters by SUMO-1 is associated with transcriptional activation of genes involved in ribosome biosynthesis and in the protein translation process.


3.2 Introduction

SUMOylation, an evolutionarily conserved post-translational modification in eukaryotic cells, involves a three-step process that requires an E1 activating enzyme (SAE1/SAE2 in humans), an E2 conjugating enzyme (Ubc9), and a variety of E3 ligases that covalently attach the Small Ubiquitin-like MOdifier (SUMO) protein to lysine residues of substrate proteins (Gareau and Lima, 2010). SUMO proteins are ubiquitously present in eukaryotic cells; in humans, there are four SUMO isoforms, SUMO-1 to -4, encoded by distinct genes. SUMO-1 is found in vivo conjugated to target proteins as a monomer. SUMO-2/3, which are each 45% identical to SUMO-1 and 96% identical to each other, are conjugated by different E3 enzymes than those that act on SUMO-1, and SUMO-2/3 are often found in poly-SUMO chains (Gareau and Lima, 2010). SUMO-4 is an isoform found in kidney, lymph node, and spleen cells (Guo et al., 2004), but it is not known whether SUMO-4 can be conjugated to cellular proteins. SUMOylation can be reversed by SUMO/sentrin-specific proteases (Ulps in yeast and SENPs in humans) that remove SUMO proteins from target proteins (Geiss-Friedlander and Melchior, 2007). This covalent and reversible biochemical reaction is highly dynamic and tightly orchestrated in cells, and it regulates various biological and physiological processes, such as nuclear-cytosolic transport, protein stability, apoptosis, transcriptional regulation, DNA repair, cell proliferation, and cell cycle progression (Geiss-Friedlander and Melchior, 2007).

SUMO proteins are associated with transcriptional regulation. A wide range of transcription factors have been reported as SUMO substrates, and in most studies this modification results in a repressive signal. For example, SUMOylation of the Polycomb Repressive Complex 1 (PRC1) subunit Pc2 is important for the repressive activity of the complex (Gill, 2010; Kang et al., 2010). Sequence-specific transcription factors subject to SUMO-mediated repression include Elk-1 (Yang and Sharrocks, 2006), IκBα (Desterro et al., 1998), c-Jun (Muller et al., 2000), C/EBP (Kim et al., 2002), Sp3 (Ross et al., 2002), and many others (Garcia-Dominguez and Reyes, 2009; Ouyang et al., 2009b). In addition, p300, a transcription factor with both activating and repressing roles, is modified by SUMO conjugation to repress downstream genes via association with HDAC6 (Girdwood et al., 2003). A variety of chromatin-modifying enzymes have been found to be recruited to promoters in a SUMO-dependent manner (Ouyang and Gill, 2009). It is also known that all four core histones can be SUMOylated and thereby repress gene expression in yeast (Nathan et al., 2006). In human cells, SUMOylation of histone H4 was associated with transcriptional inactivation via the recruitment of HDACs to oppose other activating modifications such as ubiquitination or acetylation (Shiio and Eisenman, 2003). Histones H1 and H3 are SUMO substrates, yet the exact role of the SUMOylation of these proteins is unclear (Matafora et al., 2009b). In addition to SUMO conjugation of sequence-specific transcription factors and of histones, general transcription initiation factors, such as the TFIID subunits hsTAF5 and hsTAF12, can be SUMOylated, resulting in the inhibition of their promoter binding activity (Boyer-Guittaut et al., 2005).

SUMOylation of chromatin-associated factors has also been associated with stimulation of transcription. A set of transcription factors have all been reported to be stimulated by

SUMOylation, including Pax-6 (Yan et al., 2010), GRIP1 (Kotaja et al., 2002), myocardin (Wang et al., 2007), p45/NF-E2 (Shyu et al., 2005), GATA-4 (Wang et al.,


2004), Smad4 (Lin et al., 2003), (Tian et al., 2002), NFAT-1

(Terui et al., 2004), PEA3 (Guo and Sharrocks, 2009), and HSF-1/-2 (Goodson et al.,

2001; Hong et al., 2001). SUMOylation has been reported as both an activator and a repressor of the p53 protein (Gostissa et al., 1999; Wu and Chiang, 2009). One study found that SUMOylation of promoter-associated factors in yeast was clearly associated with transcriptional activation on constitutive gene promoters (Rosonina et al., 2010).

Thus, while the preponderance of evidence has focused on SUMOylation as a repressive signal, there are examples of it activating transcription. However, a general rule for how

SUMO-1 functions as a chromatin mark is still unclear.

Here, we analyzed the genome-wide association of SUMO-1 as a chromatin mark in human cells at stages throughout the cell cycle. To our surprise, we found that SUMO-1 marks many of the most active genes at the proximal promoter region. The SUMO-1 binding profile was dynamic as cells traversed the cell cycle. In particular, we noted that

SUMO-1 binding to the promoter of active genes was decreased during mitosis when transcription generally halts. We found SUMO-1 labeling on the chromatin was highly correlated with the stimulatory H3K4 trimethylation (H3K4me3) mark. Depletion of

SUMO-1 protein resulted in a decrease in mRNA abundance of SUMO-1 marked genes, indicating that SUMO-1 is a transcriptional activator for those genes.


3.3 Materials and Methods

3.3.1 Cloning and Cell line generation

To obtain the HeLa cell line stably expressing His6-biotin-tagged SUMO-1 (protein diagram in Figure 3-1A), full-length human SUMO-1 was PCR-amplified from HeLa cell cDNA using Phusion High Fidelity polymerase (Finnzymes) and cloned into a pQCXIP-derived vector (gift of P. Kaiser, UC Irvine) (Tagwerker et al., 2006). HeLa cells were then stably transfected with the His6-biotin-SUMO1 plasmid using Lipofectamine (Invitrogen) and selected in 2 μg/ml puromycin. Colonies stably expressing recombinant SUMO-1 were screened and confirmed by western blot.

3.3.2 Antibody used for Chromatin Immunoprecipitation (ChIP)

The SUMO-1 polyclonal antibody was raised against a GST-SUMO1 fusion protein as antigen, and the serum was prepared at Cocalico Biologicals, Inc. (Reamstown, PA).

3.3.3 Cell culture, cell cycle analysis, and RT-qPCR

For G1/S synchronization, HeLa or HeLa-SUMO cells were treated with 2 mM thymidine (Sigma) for 17 h; thymidine was then removed for 9 h and added again at the same concentration for 18 h. Cells were then released for the indicated times to obtain early S, mid-S, late S, and G1 phase populations, respectively. Mitotic cells were obtained by treating with 2 mM thymidine for 15 h, releasing for 3 h, and then treating with 100 ng/ml nocodazole for 15 h.
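The timing of the two synchronization schedules described above can be tallied with a short script. This is only an illustrative sketch for planning purposes: the step labels and the `schedule` helper are hypothetical, and only the durations are taken from the protocol text.

```python
# Durations (hours) come from the protocol text; step labels are illustrative.
DOUBLE_THYMIDINE = [
    ("2 mM thymidine block", 17),
    ("release (thymidine removed)", 9),
    ("2 mM thymidine block", 18),
]
MITOTIC_ARREST = [
    ("2 mM thymidine block", 15),
    ("release", 3),
    ("100 ng/ml nocodazole", 15),
]

def schedule(steps):
    """Return a list of (step, start_hour, end_hour) and the total duration."""
    timeline, clock = [], 0
    for name, hours in steps:
        timeline.append((name, clock, clock + hours))
        clock += hours
    return timeline, clock

for label, steps in [("G1/S block", DOUBLE_THYMIDINE),
                     ("Mitotic arrest", MITOTIC_ARREST)]:
    timeline, total = schedule(steps)
    print(f"{label}: {total} h in total")
    for name, start, end in timeline:
        print(f"  {start:>2}-{end:>2} h  {name}")
```

Under these durations the double thymidine block takes 44 h before release, and the mitotic arrest takes 33 h.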

Cell-cycle distribution was determined using a FACSCalibur flow cytometer (Becton Dickinson). The RT-qPCR assays were done 72 h post-transfection with SUMO-1- or Ubc9-specific siRNA using Oligofectamine (Invitrogen); the control oligonucleotide was specific for luciferase. Primer and siRNA sequences are provided in the Appendix. Total RNA was purified using Trizol reagent (Invitrogen); 2 μg of total RNA was reverse-transcribed using the iScript cDNA synthesis kit (Bio-Rad), and qPCR was performed according to the manufacturer’s protocol (iQ SYBR Green Supermix, Bio-Rad). Three biological replicates were performed individually.

3.3.4 Chromatin immunoprecipitation, ChIP-qPCR, and Affinity Purification

Chromatin immunoprecipitation (ChIP) and chromatin affinity purification (ChAP) samples for Illumina GAII sequencing were prepared as follows. The ChIP samples were prepared by standard methods (Di Bacco et al., 2006) using the SUMO-1 antibody. Chromatin affinity purification was based on the same ChIP method, modified to include a two-step affinity purification. A total of 10⁸ HeLa-SUMO cells were cross-linked with 1% formaldehyde (Sigma), and the reaction was stopped by adding 125 mM glycine. The cross-linked chromatin was then sheared to 200-300 bp by sonication and incubated with 375 μl of Ni beads (Qiagen) for 16 h at 4°C. An aliquot of the input DNA was saved prior to immunoprecipitation as a reference sample. After washing in 6 ml of wash buffer I (50 mM Tris pH 8; 0.01% SDS; 1.1% Triton X-100; 150 mM NaCl), chromatin fragments were eluted in 6 ml of elution buffer (wash buffer I with 300 mM imidazole). The nickel eluate was incubated with 375 μl of streptavidin beads (Invitrogen) for 6 h at 4°C. After three stringent washes in 2 ml of wash buffer II (50 mM Tris pH 8; 10 mM EDTA; 1% SDS; 1 M NaCl), the chromatin was eluted by adding 2 ml of elution buffer (50 mM Tris pH 8; 10 mM EDTA; 1% SDS; 200 mM NaCl) to the beads, and crosslink reversal was done by incubating at 65°C for 15 h. The supernatant was collected and diluted 1:1 with TE buffer. The eluate was treated with RNase (0.2 mg/ml; Sigma) for 2 h at 37°C and with Proteinase K (0.2 μg/ml; Sigma) for 2 h at 55°C, and DNA was extracted using phenol/chloroform/isoamyl alcohol and precipitated with 0.1 volumes of 3 M sodium acetate, 2 volumes of 100% ethanol, and 30 μg of glycogen (Invitrogen). ChIPed DNA prepared from 1 × 10⁸ cells was resuspended in 30 μl of Qiagen Elution Buffer. Three biological replicates were prepared per time point. ChIP-qPCR was performed to validate the ChIP-seq data obtained in this research. For the ChIP-qPCR experiments, after 72 h of Ubc9 depletion in the HeLa-SUMO cell line, 2 × 10⁷ cells were harvested and processed by the ChIP method described above. Ct values obtained from each sample were normalized to the input DNA and expressed as % input. qPCR was done by the manufacturer's protocol (iQ SYBR Green Supermix, Bio-Rad). Primer sequences are provided in the supplementary material. At least three independent biological replicates were performed.
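The % input normalization used for ChIP-qPCR can be illustrated with a minimal sketch. It assumes the commonly used percent-input formula with 100% amplification efficiency; the function name and the `input_fraction` parameter (the share of chromatin saved as input) are hypothetical, since the aliquot size is not stated in this section.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Express a ChIP-qPCR signal as a percentage of input chromatin.

    ct_ip          -- Ct of the immunoprecipitated (or affinity-purified) sample
    ct_input       -- Ct of the saved input aliquot
    input_fraction -- hypothetical share of chromatin saved as input (0.01 = 1%)
    """
    # Shift the input Ct to the value expected for 100% of the chromatin.
    ct_input_adjusted = ct_input - math.log2(1.0 / input_fraction)
    # Assumes perfect doubling per cycle (100% PCR efficiency).
    return 100.0 * 2.0 ** (ct_input_adjusted - ct_ip)
```

Under these assumptions, a sample whose Ct equals that of an undiluted input corresponds to 100% input, and each additional cycle halves the recovered fraction.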

3.3.5 ChIP DNA preparation for Solexa Sequencing

ChIP or ChAP DNA samples were then prepared for ChIP-sequencing library

construction following Illumina’s ChIP-seq Sample Prep protocol. Briefly, the DNA

samples were blunt-ended by using End-it DNA End-Repair Kit (Epicentre) according to

the manufacturer's instruction. dA overhangs were then added and Illumina adapters

ligated. Adapter-ligated DNA was subject to 15 cycles of PCR after size selection of 200-

300 bp by agarose gel electrophoresis. 10 nM purified DNA was subjected to sequencing

on Illumina GAII platform to 36-bp reads. The sequencing reads were aligned to the

human genome UCSC build hg18. Only uniquely aligned reads were used for further

analysis, and multiple identical reads were eliminated to reduce PCR-generated artifacts.
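The read filtering described above can be sketched as follows. This is only an illustration: it assumes that "identical reads" means reads sharing chromosome, start position, and strand, and the tuple layout of a read is hypothetical rather than the format produced by the aligner used in this study.

```python
def filter_reads(reads):
    """Keep uniquely aligned reads and collapse identical ones.

    Each read is a tuple (chrom, start, strand, n_alignments); this layout
    is an assumption made for the example.
    """
    seen = set()
    kept = []
    for chrom, start, strand, n_alignments in reads:
        if n_alignments != 1:
            continue  # discard reads that align to more than one location
        key = (chrom, start, strand)
        if key in seen:
            continue  # identical read already kept once (likely PCR duplicate)
        seen.add(key)
        kept.append(key)
    return kept
```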

cDNA sample preparation: The double-stranded cDNA (0.8 μg total RNA input) was subjected to library preparation using the Illumina TruSeq™ RNA sample preparation kit (Low-Throughput protocol) according to the manufacturer's protocol.

RNA-seq analysis: Six cDNA samples comprising three pairs of biological replicates (3 SUMO-1-depleted samples and 3 GL2 control samples) were barcoded, pooled in equal concentrations, and sequenced in one lane of an Illumina GAII. The resulting sequences (5-9 million reads for each sample) were sorted and mapped to the human reference genome hg18 using the open-source software TopHat (Trapnell et al., 2009). The differential gene expression of the two groups of samples (SUMO-1-depleted vs. control) was analyzed by open-source software (Trapnell et al., 2010) using default parameter settings. Genes from all six samples with significantly changed FPKM values, as well as a sub-group of significantly down-regulated genes upon SUMO-1 depletion involved in protein synthesis, were displayed in a heat map with row-wise scaling. The significantly changed genes were also compared with the ChAP-seq results, and the GO enrichment was analyzed using Toppgene (http://toppgene.cchmc.org/) and Ingenuity Pathway Analysis (IPA).

3.3.6 Data analysis

ChAP-seq peak finding: FindPeaks 4.0.10 (Fejes et al., 2008) was used to generate peaks for all the ChAP-seq and ChIP-seq data of SUMO-1 with the options subpeaks 0.5 and trim 0.2. A minimum height threshold was established for each dataset so that the FDR was less than 0.1% based on the Monte Carlo simulation of that dataset.

Histogram of genome-wide tag counts: Raw tags were counted in 1 kb bins for every chromosome in each sample using Matlab code. The histograms for chromosome 1 were used to generate scatterplots for paired ChIP-ChAP samples using the scatterplot function in Matlab.
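The binning step can be sketched in a few lines. The analysis in this study was done in Matlab; the Python version below is only an illustration, and the function name and the 0-based coordinate convention are assumptions.

```python
def bin_tag_counts(tag_starts, chrom_length, bin_size=1000):
    """Count raw tags in fixed-size bins along one chromosome.

    tag_starts   -- iterable of 0-based tag start positions (assumed convention)
    chrom_length -- chromosome length in bp
    bin_size     -- bin width in bp (1 kb, as in the text)
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in tag_starts:
        if 0 <= pos < chrom_length:
            counts[pos // bin_size] += 1
    return counts
```

The resulting per-bin counts from two samples (for example a ChIP and a ChAP sample on chromosome 1) can then be paired bin by bin for a scatterplot.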

Sort peaks into different genomic regions: The RefSeq database was used to define genomic regions, and the promoter region was defined as the 5 kb upstream of a transcription start site (TSS). A peak was assigned to a specific region if it overlapped that region by at least 1 bp. Active and inactive promoters were classified based on GEO datasets GDS885 and GDS2781, which contain gene expression microarray results from asynchronous HeLa cells. Genes were grouped based on their expression levels; active promoters were defined as those of the top 20% of genes by expression, and inactive promoters as those of the bottom 20%. Each group contains about 2400 genes.
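The promoter definition and the 1 bp overlap rule can be expressed compactly. The sketch below is not the BEDtools-based procedure used in this study; it assumes half-open intervals, treats the TSS as an interval boundary, and takes strand into account, all of which are illustrative choices.

```python
def promoter_interval(tss, strand, upstream=5000):
    """Return the half-open interval covering 5 kb upstream of a TSS.

    For '+' genes upstream lies at smaller coordinates, for '-' genes at
    larger coordinates (strand handling is an assumption of this sketch).
    """
    if strand == "+":
        return (max(0, tss - upstream), tss)
    return (tss, tss + upstream)


def overlap_bp(a, b):
    """Number of bases shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def peak_in_region(peak, region, min_bp=1):
    """True if the peak overlaps the region by at least min_bp bases."""
    return overlap_bp(peak, region) >= min_bp
```

With `min_bp=1`, a peak that merely abuts the promoter boundary is not counted, while a peak sharing a single base is.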

Extended TSS region tag density profiling: The RefSeq database was used to obtain start and end coordinates of ±10 kb of the TSS for each gene included in the GDS885 dataset (Carson et al., 2004). A total of 12,013 extended TSS regions were used. Raw SUMO-1 tags were extended according to the average fragment length of each sample. The average tag density was computed in non-overlapping 5 bp bins along the extended TSS region for each of the three biological replicates; the tag density was then normalized by dividing by the total number of reads (in millions) in each sample and averaged among the three replicates. In the heat maps arranged by gene expression percentile, genes were grouped based on their expression percentile in the GDS885 dataset. In the sorted TSSs heat map (Figure 3-9B), the rows of the heat maps for all other cell stages follow the same order as the G1 sample.
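The profiling steps (tag extension, 5 bp binning, per-million normalization, and replicate averaging) can be sketched as below. The original analysis used Matlab; this Python version is an illustration under stated assumptions: each tag is given by the position of its 5' end and its strand, a fragment adds one count to every bin it touches, and gene orientation is ignored.

```python
def tag_density_profile(tags, tss, fragment_len, total_reads,
                        flank=10000, bin_size=5):
    """Normalized tag density in non-overlapping bins around one TSS.

    tags         -- list of (five_prime_pos, strand) on the TSS chromosome
    fragment_len -- average fragment length used to extend each tag
    total_reads  -- total number of reads in the sample
    """
    win_start = tss - flank
    n_bins = (2 * flank) // bin_size
    profile = [0.0] * n_bins
    for pos, strand in tags:
        # Extend the tag in its sequencing direction to the fragment length.
        if strand == "+":
            frag_start, frag_end = pos, pos + fragment_len
        else:
            frag_start, frag_end = pos - fragment_len + 1, pos + 1
        first = max(0, (frag_start - win_start) // bin_size)
        last = min(n_bins - 1, (frag_end - 1 - win_start) // bin_size)
        for b in range(first, last + 1):
            profile[b] += 1.0
    per_million = total_reads / 1e6  # normalize by reads in millions
    return [v / per_million for v in profile]


def average_profiles(profiles):
    """Average equally long profiles bin by bin (e.g. three replicates)."""
    n = len(profiles)
    return [sum(vals) / n for vals in zip(*profiles)]
```

Fragments lying entirely outside the ±10 kb window contribute nothing, because the computed bin range is then empty.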

Comparison of SUMO-1 marked genes in ChAP-seq samples of different cell stages:

G1 and M0 stage ChAP-seq samples were processed for peak-calling using FindPeaks

4.0.10. The resulting peak files were cross-checked with the RefSeq database to extract genes

with peaks present in the promoter region (5kb upstream of TSSs) using BEDtools

(Quinlan and Hall, 2010). The presence of a peak in the promoter region was defined as

at least 1bp overlap between the peak range and the promoter region of a specific gene.

The gene lists were then crosschecked with the gene lists from significantly changed

RNA-seq comparison data.

Normal distribution of the number of randomly selected genes with SUMO-1 promoter peaks and z-score calculation: A specific number (199 or 158) of genes was randomly selected from the RefSeq database and then cross-checked with the ChAP-seq peak files, using BEDtools, to obtain the number of genes with SUMO-1 peaks in their promoter regions. The ChAP-seq datasets used in this analysis were from G1 phase. The whole process was repeated 1000 times, and we found that the number of genes with SUMO-1 peaks follows a normal distribution. The mean and standard deviation of this distribution were calculated. Using the observed number of genes with G1-stage SUMO-1 promoter peaks obtained from the RNA-seq comparison data (NumTrue), the z-score was calculated as: z-score = (NumTrue - mean)/std.
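The random-sampling procedure and the z-score can be sketched as follows. The function and argument names are hypothetical, the sample standard deviation is used because the text does not specify which estimator was applied, and the fixed seed is only there to make the example reproducible.

```python
import random
import statistics


def sampling_z_score(all_genes, genes_with_peak, n_select, observed,
                     n_iter=1000, seed=0):
    """Compare an observed count with counts from random gene sets.

    all_genes       -- list of all gene identifiers (e.g. RefSeq genes)
    genes_with_peak -- set of genes with a SUMO-1 promoter peak
    n_select        -- size of each random gene set (199 or 158 in the text)
    observed        -- observed number of genes with a promoter peak (NumTrue)
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_iter):
        sample = rng.sample(all_genes, n_select)
        counts.append(sum(1 for gene in sample if gene in genes_with_peak))
    mean = statistics.mean(counts)
    std = statistics.stdev(counts)  # sample standard deviation (assumption)
    z = (observed - mean) / std
    return z, mean, std
```

A large positive z-score indicates that the gene list of interest carries SUMO-1 promoter peaks more often than randomly drawn gene sets of the same size.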

Comparison of SUMO-1 ChAP-seq and published chromatin marks: Publicly available HeLa cell ChIP-seq/ChIP-chip datasets, H3K4me3 (GSM566169) and H3K27me3 (GSM566170), were downloaded from the GEO database (www.ncbi.nlm.nih.gov/geo/). For all chromatin mark ChIP-seq datasets, the raw reads were extended to 200 bp. Peaks were generated in the same way as for the SUMO-1 ChAP-seq samples. RefSeq gene promoters and transcribed regions were searched for peaks that had at least 90% of their range overlapping with the annotated regions of a specific gene.

To compare the binding pattern between SUMO-1 and other chromatin marks, tag

density profiles were computed with a Matlab code within the 20 kb extended TSSs of all

the genes (total 12,013 entries) included in the GDS885 dataset. The rows of each tag

density profile were sorted according to the maximum tag density of the ±2 kb of the

TSSs in sample profiles. The mean tag density of this 4 kb region from each dataset was

used to calculate the Pearson correlation coefficient (R).
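For reference, the Pearson correlation coefficient between two vectors of mean tag densities (one value per gene) can be computed as in the minimal sketch below. The function name is hypothetical and the inputs stand in for the per-gene mean densities of the 4 kb region in two datasets.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```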

The peak files from SUMO-1 G1-stage ChAP-seq as well as ChIP-seq from the

chromatin marks (H3K4me3 and H3K27me3) were also used to find the genes that have

both SUMO-1 marks and one of the chromatin marks, then to generate Venn diagram.

BEDtools was used to find peaks from data that overlap at least 90% with the promoter of

each gene (for SUMO-1 G1-stage data) in RefSeq database, or the promoter plus

transcribed region (for H3K4me3 and H3K27me3 data). The Chi-square test p-values

were computed using R function chisq.test.
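The 90% overlap criterion and a chi-square test on a 2×2 table of gene counts can be sketched as below. This is not the BEDtools and R code used in the study. The function names are hypothetical, the table layout (genes with or without SUMO-1 marks against genes with or without a histone mark) is an assumption, and the optional Yates correction is included because R's chisq.test applies a continuity correction to 2×2 tables by default.

```python
import math


def fraction_of_peak_in_region(peak, region):
    """Fraction of a peak (half-open interval) that lies inside a region."""
    overlap = max(0, min(peak[1], region[1]) - max(peak[0], region[0]))
    return overlap / (peak[1] - peak[0])


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic and p-value (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)  # continuity correction
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A peak would be assigned to a gene when `fraction_of_peak_in_region(peak, region) >= 0.9`.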

Comparison of SUMO-1 marked genes with Ubiquitin marked genes in HeLa

ChAP-seq samples: G1 and M0 stage Ubiquitin-tagged ChAP-seq samples (Arora et al., submitted), as well as the G1 stage SUMO-1-tagged ChAP-seq sample, were processed for peak-calling using FindPeaks 4.0.10. Each of the resulting peak files was cross-checked with the RefSeq database to extract genes with peaks present in the promoter region (5 kb

upstream of TSSs) or transcribed region using BEDtools. The presence of a peak in the

promoter/transcribed region was defined as at least 90% of the peak range overlapping

with that region of a specific gene.

Principal Component Analysis (PCA) of ChAP-seq datasets: For each sample, SUMO-1 tag counts on chromosome 1 (excluding the centromeric region to avoid bias due to sequencing artifacts) were used for PCA in Matlab (bin size = 1 kb). The first three principal components were plotted using Matlab.
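A minimal sketch of this kind of PCA is given below, using NumPy rather than the Matlab code used in the study. It assumes a samples × bins matrix of tag counts (for example 15 samples by the number of 1 kb bins on chromosome 1) and computes the scores from the singular value decomposition of the column-centered matrix; any scaling or filtering applied in the original analysis is not reproduced.

```python
import numpy as np


def pca_scores(data, n_components=3):
    """Project samples onto their first principal components.

    data -- array-like of shape (n_samples, n_bins) holding tag counts
    Returns the sample scores and the fraction of variance explained.
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)  # center each bin across samples
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]
```

Plotting the three columns of `scores` against each other gives the kind of three-dimensional view in which replicates from the same time point are expected to cluster.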

3.4 Results

3.4.1 Chromatin affinity purification of SUMO-1 through the cell cycle

A variety of studies have shown that SUMO-1 participates in cell cycle progression

(Bachant et al., 2002; Watts, 2007). To determine the genome-wide SUMO-1 pattern on

chromatin and how it changes during the cell cycle, we employed a HeLa-derived cell

line that stably expressed His6-biotin-tagged SUMO-1 (Figure 3-1A). Western blot

analysis showed that the 26 kD recombinant SUMO-1 was expressed at around 10-fold

higher levels than the endogenous 11.5 kD monomer SUMO-1 in crude whole cell

extracts (Figure 3-1B); however, those tagged SUMO-1 conjugates at higher molecular

weight were present at similar levels as compared to the endogenous SUMO-1 protein

(Figure 3-1B, right). We purified chromatin using standard methods, followed by double

affinity purification via the His6-tag and the biotin-tag. We found that the most abundant proteins conjugated to the tagged SUMO-1 were in the size range of 40 kD and higher

(Figure 3-1C). The most abundant SUMOylated proteins were most likely transcription factors or other nonhistone chromatin proteins.

[Figure 3-1 image not reproduced. Recovered labels: blots probed with WB: Biotin and WB: SUMO-1; flow cytometry axes DNA content (propidium iodide) and pHistone H3 (Ser-28); samples G1, S0, S3, S6, M, Async., and Thy.-Noc. block.]

Figure 3-1 Characterization of HeLa-SUMO cell line. A. Illustration of His6-biotin-tagged SUMO-1 (HBT-SUMO-1) protein used in this study. Protein domains are not drawn to scale. The His6 domain binds to Ni-NTA matrix, and the naturally biotinylated domain binds to streptavidin matrix. B. Western blot analysis showing SUMOylated proteins in whole cell lysates from HeLa-SUMO (lane 1) and HeLa (lane 2) cells. The arrow indicates the recombinant SUMO-1 protein and the asterisk indicates the endogenous SUMO-1 protein. Left panel showed biotinylated conjugates detected using streptavidin linked with HRP, whereas the right panel showed the SUMO-1 conjugates detected using SUMO-1 antibody. C. Western blot showing the fractionation of SUMOylated chromatin from HeLa-SUMO cells. Chromatin was isolated in the presence of high salt and fragmented by sonication (lane 1). The isolated chromatin was then purified using the Ni-NTA metal ion affinity matrix (lane 2) and subsequently using streptavidin-agarose (lane 3) affinity purification. D. Flow cytometry analysis of DNA content of the samples used for ChAP-seq analysis. HeLa-SUMO cells were double blocked in thymidine and released for 13 h (G1), 0 h (S0), 3 h (S3), or 6 h (S6), or the cells were blocked in thymidine and released into nocodazole (M) as described in the Methods section. E. FACS analysis of phospho-H3 and propidium iodide stained cells arrested using the thymidine/nocodazole protocol.

Cells were synchronized in various cell cycle stages using a double thymidine block and release or a thymidine/nocodazole block (Figure 3-2A). Flow cytometry analysis of the DNA content and the mitosis-specific phospho-histone H3 mark indicated that the cells were synchronized in G1, early/mid/late S, and mitosis phases (Figure 3-1D, E). Chromatin was isolated, and the SUMO-tagged chromatin was then double-affinity purified using metal ion affinity chromatography followed by streptavidin-affinity chromatography. The protein bound to the matrix was subjected to stringent wash conditions and cross-link reversal, and the enriched DNA was analyzed by high-throughput sequencing. This approach was directly analogous to ChIP-seq, but since no antibodies were used to purify the chromatin, we call this technique ChAP-seq, for chromatin affinity purification and sequence analysis. Three sets of biological replicates were performed for each time point, and we obtained 18 to 25 million uniquely mapped reads from the Illumina Genome Analyzer II (GAII) for each individual sample. We then compared the datasets pairwise to evaluate the reproducibility of the three biological replicates. Peaks from samples collected during interphase overlapped extensively with those of other samples from the same point in the cell cycle: replicates from S3, S6, and G1 had 77% to 95% of their peaks in common with the respective replicate samples. The early S phase samples (S0) had over 52% of their peaks present in the other replicates, and the samples from mitosis had over 41% of their peaks present in the replicate samples. This was a high level of reproducibility, especially among the interphase samples. The samples from mitosis had lower reproducibility, but as will be shown in the following sections, these samples had SUMO-1 removed from the promoters.

Results for the SUMO-1 binding profiles on human chromosome 3 are shown as an example (Figure 3-2B). We computed the SUMO-1 tag densities (bin size = 1 kb) and plotted them along the length of the chromosome as a histogram (False Discovery Rate, FDR < 0.1%). At the top is the histogram from the HeLa cell line that does not express tagged SUMO-1, followed by results from specific points in the cell cycle (top to bottom): G1, early S (S0), mid S (S3), late S (S6), and mitosis (M). In the cell line that does not express tagged SUMO-1, there was a low background of non-specifically purified sequence tags, evenly distributed throughout the chromosome and without peaks.

[Figure 3-2 image not reproduced. Recovered labels: B, tracks Untagged, G1, S0, S3, S6, M along Chr.3 (198 mb); C, y-axis Fold enrichment in region (log2 [fold]) for CpG island, Promoter, Intron, Gap > 1 mb, Gap < 1 mb, Regulatory element, Transcribed region, Intergenic region; D, axes PC1, PC2, PC3.]

Figure 3-2 Genome wide analysis of SUMO-1 binding. A. Sample collection for mapping the chromatin localization of SUMO-1 throughout the cell cycle. HeLa cells were treated with double-thymidine block and released for 0, 3, 6, 13 h to obtain S0, S3, S6, and G1 samples, respectively, and cells in mitosis (M) were treated with a sequential thymidine-nocodazole block. B. Histogram depicting the locations of SUMO-1 binding sites on chromosome 3 of the human genome using chromatin affinity sequence analysis (ChAP-seq). The frequency of raw reads was plotted along the length of the chromosome with bin-size 1 kb. Samples were: ChAP purified DNAs from HeLa-SUMO cells during G1 (blue), early S phase (S0, red), mid S phase (S3, green), late S phase (S6, purple), mitosis (M, orange), and results from affinity purification using a HeLa cell line that does not express the tagged SUMO-1 protein (black). A diagram of chromosome 3 is shown at the bottom. C. Peak annotation depicts fold change on log2 scale of SUMO-1 binding sites on defined sequence elements on the human genome relative to the expected frequency of the genetic elements distributed in the genome if the binding profile is randomly distributed. G1 (blue), early S phase (S0, red), mid S phase (S3, green), late S phase (S6, purple), and mitosis (M, orange), and error bars are SEM from three biological repeats. D. 15 SUMO-1 datasets of chromosome 1 were represented in a three-dimensional stereoscopic image by using standard PCA (see Methods) to show the reproducibility within each set of biological replicates as well as the separation of data among different cell stages. Each color ball represents individual dataset collected from indicated cell stage. The same color denotes the biological replicates of the same collection point during cell cycle.

When comparing the interphase SUMO-1 localization, at the chromosome scale resolution, the samples had similar patterns to each other. By contrast, during mitosis the

SUMO-1-modified chromatin was largely redistributed, with relatively even distribution and fewer apparent peaks. The SUMO-1 peak at the pericentromere appears in all samples, including the ChIP-seq reaction using pre-immune IgG (Figure 3-3A). Since this peak appears in a sample without specific purification, we interpret this peak as an artifact from the parallel-sequencing technique.

[Figure 3-3 image not reproduced. Recovered labels: A, tracks HBT-SUMO1, ChIP-SUMO1, ChIP-IgG along Chr.1 (246 mb); B, tracks HBT-SUMO S0, ChIP-SUMO S0, ChIP-IgG S0 over chr1:27,128,181-27,889,107 (scale bar 100 kb).]

Figure 3-3 Binding patterns of tagged-SUMO-1 detected by ChAP-seq are highly similar to the binding patterns of native SUMO-1 detected by ChIP-seq. A. Histogram of the tag densities (bin size = 1 kb) of SUMO-1 binding sites on chromosome 1 of the human genome (hg18) in the S0 sample by using two types of purification strategy, ChIP-seq and ChAP-seq. The frequency of raw reads was plotted along the length of the chromosome 1. Samples were: ChAP-SUMO purified DNAs from HeLa-SUMO cells during S0 (red, panel 1, top), ChIP-SUMO purified DNAs using SUMO-1 specific IgG antibody and chromatin from HeLa cells that do not express tagged SUMO-1 (black, panel 2), ChIP using purified IgG from the matched pre-immune IgG (gray, panel 3). A diagram of chromosome 1 is shown at the bottom. B. SUMO-1 binding tracing for a representative stretch of chromosome 1. A ~750 kb multi-gene cluster is shown and detected SUMO-1 tag density is shown for the tagged SUMO-1 (red, top), endogenous SUMO-1 immunoprecipitated using SUMO-1 specific IgG (black, middle), and endogenous chromatin immunoprecipitated using pre-immune IgG (gray, bottom).

To test whether the SUMOylation of chromatin in cells expressing the tagged SUMO-1 is consistent with the labeling of endogenous SUMOylation, we performed ChIP-seq using a SUMO-1 specific antibody in early S phase (S0) as a biological validation of the ChAP technique. We found that the results obtained from the ChIP method were highly consistent with those from ChAP (Figure 3-3). An example, which includes multiple biological replicates, at the promoter of the NOSIP gene is shown in Figure 3-4A. The average peak values obtained by ChIP-seq were comparable to the peak values obtained from ChAP-seq. Furthermore, the peaks detected using ChIP-SUMO-1 (x-axis) correlated well with those of the double-tagged SUMO-1 (y-axis) (R = 0.989) by scatter plot analysis (Figure 3-4B).

[Figure 3-4 image not reproduced. Recovered labels: A, tracks S0-1, S0-2, S0-3, Untagged, ChIP-SUMO, ChIP-IgG at the NOSIP and PRRG2 genes; B, axis labels Log10(ChAP reads count on chr.1) and Log10(ChIP reads count on chr.1), R = 0.9890.]

Figure 3-4 Consistent binding patterns of SUMO-1 detected by ChAP and ChIP-seq. A. NOSIP gene as an example of SUMO-1 tracing in IGV genome browser shows the consistency among the ChAP-seq samples and the ChIP-seq samples. Three replicates are shown of the early S phase (S0) ChAP-seq results (S0-1 to -3), ChAP-seq from an untagged cell line (fourth tier), ChIP-seq using SUMO-1 specific IgG (fifth tier) and ChIP-seq using pre-immune IgG (bottom tier). B. Scatter plot analysis of the peaks of ChIP-SUMO-1 (x-axis) against those of the double-affinity purified tagged-SUMO-1 (y-axis). The Pearson correlation coefficient (R) between these two methods is 0.989.

3.4.2 Chromatin bound SUMO-1 is concentrated at transcriptional regulatory sites and is dynamic through the cell cycle.

We then analyzed the distribution of SUMO-1-tagged chromatin on a genome-wide scale

according to sequence annotations. Compared to the null hypothesis that tags were

randomly distributed in the genome, SUMO-1 was significantly enriched on CpG islands,

promoters, and during interphase (Samples from S0, S3, S6, and G1 phase;

Wilcoxon rank-sum p-value < 0.05), whereas SUMO-1 binding to intron-containing sequences was not significantly different from the random expectation. Ten percent of SUMO-1 marks were around the promoter region (5 kb upstream of a transcription start site, TSS), representing a 2.5-fold enrichment of SUMO-1 at promoter DNA and suggesting that SUMO-1 might play a role in regulating transcription initiation. In addition, during mitosis the SUMO-1 marks at promoters decreased (Figure 3-2C). These results suggested that SUMO-1 is depleted from chromatin during mitosis, consistent with a previous study showing that little SUMO-1 remains localized to condensed chromosomes during mitosis (Zhang et al., 2008b). By contrast, large gene deserts were under-represented in the chromatin marked by SUMO-1. SUMO-1 occupancy in the genome is shown as fold enrichment (log2) normalized to the frequency of the genetic elements in the genome. Interestingly, CpG islands represent 0.7% of the genome, but we observed that 8-10% of the SUMO-1 marks were on CpG islands, consistent with the promoter enrichment in Figure 3-2C. Since many CpG islands are located in promoters, we also analyzed the promoters without CpG islands and found a similar pattern of SUMO-1 association with promoters that do not have CpG islands (Figure 3-5A). In addition, there was a four-fold enrichment of SUMO-1 marks on exons, but this enrichment was not explained by promoter-proximal binding of SUMO-1 to exon 1 (Figure 3-5B). This association of SUMO-1 with exons suggested that SUMO-1 might be associated with splicing at the chromatin level. As many histone marks, such as H3K36 methylation and K9 acetylation, have been shown to play a role in alternative splicing (Luco et al., 2011), it will be of interest to investigate whether SUMO-1 marks participate in pre-mRNA processing through chromatin conformation.
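The fold-enrichment values quoted in this section follow from dividing the fraction of SUMO-1 marks on an element by the fraction of the genome that element occupies, and taking log2. The short sketch below illustrates the arithmetic; the function name is hypothetical, and the 4% promoter share used in the comments is only what a 2.5-fold enrichment at 10% of marks would imply, not a figure stated in the text.

```python
import math


def log2_fold_enrichment(observed_fraction, genome_fraction):
    """log2 of (fraction of marks on an element / fraction of genome it covers)."""
    return math.log2(observed_fraction / genome_fraction)


# CpG islands: 8-10% of SUMO-1 marks on 0.7% of the genome.
cpg_low = log2_fold_enrichment(0.08, 0.007)
cpg_high = log2_fold_enrichment(0.10, 0.007)

# Promoters: 10% of marks with a 2.5-fold enrichment implies that promoters
# cover about 4% of the genome (derived value, an assumption of this example).
promoter = log2_fold_enrichment(0.10, 0.04)
```

On this scale the CpG island enrichment corresponds to roughly 11- to 14-fold, that is, between 3.5 and 3.9 on the log2 axis.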

[Figure 3-5 image not reproduced. Recovered labels: A, y-axis Peak distribution on the genome (%), groups Promoter and CpG island-free promoter, series G1, S0, S3, S6, M, and % in genome; B, y-axis Peak distribution on exons (%), x-axis Cell cycle stage (G1, S0, S3, S6, M0, % in genome), series Exon 1 and Exon 1+2+3.]

Figure 3-5 SUMO-1 marks chromatin at active sites on human genome. A. Peak annotation depicts the distribution of SUMO-1 binding sites on the promoter region (5 kb upstream of TSSs as in Panel A; left) and on promoters that have no CpG islands (right). B. Peak annotation depicts the percentage of SUMO-1 binding on the exon 1 or exon (1+2+3) region (gray) compared to exon distribution in the genome (black).

In order to reduce the complexity of analyzing large datasets, we used principal component analysis (PCA) (Huff et al., 2010) to examine the 15 datasets comprising three replicates for each of the five time points in the cell cycle (Figure 3-2D). Like other high-throughput data, ChAP-seq data contain many features and are thus high-dimensional. With PCA, we focused on the combinations of features with the largest variances and thus identified major dissimilarities among multiple datasets simultaneously. In addition to the pairwise analysis of the biological replicates, which indicated high reproducibility, visualization of the first three principal components showed that replicates from each time point tended to group together, suggesting that the differences among time points are larger than the differences among replicates. Consistent with the chromosome-wide visualization of SUMO-1 labeling, in which the pattern of SUMO-1 on chromatin during mitosis was distinct from that of the interphase samples (Figure 3-2B, C), the SUMO-1 localization during mitosis analyzed by PCA was also well separated from all the interphase samples (Figure 3-2D). These results indicated that the SUMO-1 tagging of chromatin is dynamic through the cell cycle, and that the changes we identified at each time point were meaningful, since they were obtained with biological repeats collected weeks apart.

3.4.3 SUMO-1 labels the promoters of active genes.

Previous studies showed that SUMOylation generally contributes to transcriptional repression (Garcia-Dominguez and Reyes, 2009). However, a recent study suggested that SUMOylation of chromatin could facilitate transcription activation of constitutive genes in yeast (Rosonina et al., 2010). Since we observed that SUMO-1 marks were enriched at regulatory elements in the genome (Figure 3-2C), we asked whether SUMO-1 was associated with the most active or the most inactive genes. Using published microarray data (Carson et al., 2004), we sorted the genes by mRNA level from low to high, selected the 20% most highly and the 20% least expressed genes, and asked what proportion of the most active or least active promoters were labeled by SUMO-1. In striking contrast to the published association of SUMO-1 with repressive elements, there were many more examples of SUMO-1 modified chromatin at highly active promoters. During G1 phase, 49.2% of the high activity and 23.3% of the low activity promoters were labeled by SUMO-1 (Figure 3-6). During mitosis, 15.8% of high activity and 5.9% of low activity promoters were marked by SUMO-1. This reduction of SUMO-1 marks was consistent with the idea that transcription is repressed during mitosis and that this stimulatory SUMO-1 signal rebinds to the chromatin after cell division is completed and active transcription resumes.

[Figure 3-6 image not reproduced. Recovered labels: y-axis Promoters labeled by SUMO-1 (%); x-axis G1, M; series High expression, Low expression.]

Figure 3-6 SUMO-1 binding pattern is associated with active promoters. A histogram is shown of the percentage of active (red) and inactive (gray) promoters labeled by SUMO-1. High activity promoters are defined as those upstream of genes for which the mRNAs were the 20% most abundant, and low activity promoters are defined as those upstream of genes for which the mRNAs were the 20% least abundant in microarray datasets. Results are the means (±SEM) in G1 and mitosis, as indicated.

We further dissected the SUMO-1 localization flanking TSSs of annotated genes. The average SUMO-1 tag density per 10 from the three replicates of each time point were normalized and plotted within ±10 kb of TSSs (Barski et al., 2007). To correlate

SUMO-1 distribution and global mRNA gene expression, we divided the genes from microarray dataset GDS885 into 10 groups; each was a decile composed of approximately 1200 genes according to the mRNA abundance levels from the silent genes to the most highly expressed genes (Figure 3-7A). In all interphase stages of the cell cycle, SUMO-1 was associated with the chromatin surrounding the TSSs of the most active genes. The active genes (90-100% decile; red tracing of Figure 3-7A) had the highest density of SUMO-1 at the TSSs. The inactive genes (10-20% decile in green and

0-10% decile in black in Figure 3-7A) were relatively unlabeled by SUMO-1.
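The grouping of genes into expression deciles can be sketched as follows. The function name and the dictionary input are hypothetical; the sketch simply ranks genes by mRNA abundance and splits the ranking into ten equally sized groups, which for roughly 12,000 genes gives about 1200 genes per decile.

```python
def assign_deciles(expression):
    """Map each gene to an expression decile from 0 (lowest) to 9 (highest).

    expression -- dict of gene identifier -> mRNA abundance (microarray signal)
    """
    ranked = sorted(expression, key=expression.get)  # low to high abundance
    n = len(ranked)
    return {gene: min(9, rank * 10 // n) for rank, gene in enumerate(ranked)}
```

Averaging the normalized tag density profiles of all genes sharing a decile then yields one trace per expression group.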

[Figure 3-7 image not reproduced. Recovered labels: A, panels G1, S0, S3, S6, M; x-axis Position relative to TSS (kb), -10 to +10; y-axis Normalized SUMO-1 tag density; series High, Medium to High, Medium, Low, Silent. B, series G1, S0, S3, S6, M, Inactive genes; x-axis Position relative to TSS (bases); y-axis Normalized SUMO-1 tag density.]

Figure 3-7 SUMO-1 binding pattern is associated with transcriptional activation. A. Normalized tag density plots display SUMO-1 tags distribution ±10 kb surrounding the transcription start sites (TSSs, bent arrow) in different cell cycle stages. Each trace is based on averages of normalized ChAP-seq tag densities results from the three replicates at each point in the cell cycle. From published microarray results using asynchronous HeLa cells, genes were divided into deciles representing inactive genes (0-10%, black), low activity genes (10-20%, green), medium abundance mRNAs (50-60%, pink), medium-high abundance mRNAs (80-90%, blue), and highest abundance mRNAs (90-100%, red). The y-axis is arbitrary normalized tag density unit (see Methods). Results are shown from G1 (top left), early-S (S0, middle left), mid-S (S3, bottom left), late S (S6, top right), and mitosis (M, middle right). B. A zoom-in view is shown of the average of normalized SUMO-1 tag density plots on most highly expressed genes from each cell cycle stage within 2 kb relative to TSSs. The similar trace from inactive genes in G1 phase is shown in black.

The pattern of SUMO-1 labeling revealed two peaks of SUMO-1 binding: a major peak from -400 to 0 bp, and a comparatively minor peak located at +400 to +2500 bp relative to

the TSSs (Figure 3-7). The promoter peak was high during the transcriptionally active

stages of the cell cycle (G1 through late S phase), and then this promoter peak dropped

during mitosis with the decrease of transcriptional activity. Interestingly, there is also a

drop during S0 phase compared to other transcriptionally active stages. Although we do

not have an explanation for this phenomenon, we believe that the beginning of S phase

could be the dividing point between two waves of SUMO-1 stimulated transcription.

We also compared our results to microarray data from synchronized cells (Sadasivam et

al., 2012) to test the correlation between SUMO-1 tag on promoter and gene expression.

Just as was observed with the microarray results from asynchronously growing cells, for

those promoters marked by SUMO-1, gene expression was higher than those without

SUMO-1 marks during the cell cycle progression (Figure 3-8). However, mRNA abundance may reflect synthesis at earlier points in the cell cycle, and during mitosis, when genes are repressed in general, there was still a positive correlation between SUMO-1 and gene expression. The microarray results from both synchronized and unsynchronized cells were most consistent with SUMO-1 having a direct, transcriptional stimulatory role, and this idea was tested in subsequent experiments.

[Figure 3-8 image not reproduced. Recovered labels: y-axis Gene expression (arbitrary units); x-axis Cell cycle stages (G1*, S0, S3**, S6, M); series genes w/ high SUMO-1 tag density, genes w/ low SUMO-1 tag density.]

Figure 3-8 SUMO-1 binding pattern associates with highly transcribed genes in synchronized cells. A histogram is shown of the average gene expression of high SUMO-1-labeled genes (gray) and low SUMO-1-labeled genes (black) from various cell cycle stages using a cell-cycle synchronized HeLa cell mRNA microarray dataset (GSE26922). High SUMO-1-labeled genes are defined as those genes for which the SUMO-1 average tag density at the promoter was the 10% highest, and low SUMO-1-labeled genes are defined as those genes for which the SUMO-1 average tag density at the promoter was the 10% lowest in ChAP-seq datasets. (*, the data of S12 from GSE26922 was used for G1; **, S4 from GSE26922 was used for S3 in this study.)

The patterns of SUMO-1 binding to promoters were determined using averages for

groups of genes (Figure 3-7), but when promoters were analyzed one at a time, we found

that SUMO-1 labeled the promoters of a significant subset of genes (Figure 3-9A). In the

heat map, genes with measured expression levels were arranged from top to bottom

according to increasing expression levels, and we calculated SUMO-1 binding density of

regions surrounding TSSs (±10 kb) for each of the 12,013 genes. We found that in the G1

time point, SUMO-1 was associated with the TSSs, and the highest amount of SUMO-1

label was associated with the most active genes (the rows toward the bottom of the heat

map). By contrast, the heat map from samples taken during mitosis revealed very little

SUMO-1 labeling of promoters (Figure 3-9A).

[Figure 3-9 heat maps: panel A, G1 and M, rows ordered from low to high mRNA abundance; panel B, G1, S0, S3, S6, and M, rows ordered from low to high SUMO-1 tag density; color scale, normalized SUMO-1 tag density (0 to 12).]

Figure 3-9 SUMO-1 is associated with chromatin of active genes. A. The heat maps of normalized SUMO-1 tag densities on genes (±10 kb surrounding the TSSs), sorted by gene expression level from low (top) to high (bottom) in G1 and M phase. Each row is a gene’s SUMO-1 tag density trace using the average of normalized tag density at each stage of the cell cycle. The vertical center (bent arrow) denotes the TSSs. The density of the SUMO-1 tag is indicated by the color; blue indicates a low level, and white and red indicate progressively higher levels of SUMO-1. B. Heat maps similar to those in panel A, but the order of the genes (rows) is according to the density of SUMO-1 near the TSSs from low (top) to high (bottom) during G1. Similar heat maps are shown for each indicated phase of the cell cycle, and the order of genes (rows) is the same as in G1.


We next asked whether SUMO-1 labeled promoters were changing throughout the course

of interphase. We reordered the rows in the heat map according to the density of SUMO-1 in the promoter region in the G1 samples (Figure 3-9B). The order of the rows in all

five heat maps was fixed according to the G1 order. We found that SUMO-1 occupancy

around the TSSs was consistent among different cell cycle stages, and SUMO-1 label at

TSSs on individual genes slightly increased during cell cycle progression. SUMO-1

marks were cleared during mitosis and then replaced in G1. Among these most

abundantly expressed genes, 127 genes were constantly labeled with intense SUMO-1

tags throughout interphase (Appendix). This gene list is remarkable for the enrichment of

housekeeping genes, notably ribosomal proteins and other translation factors (p-value = 6.68 × 10^-8).

3.4.4 Correlation of SUMO-1 with other chromatin marks.

To explore further SUMO-1 association with transcriptionally active chromatin, we

compared the SUMO-1 binding pattern from this study to the published binding profiles

among various chromatin marks, including the activation mark H3K4me3 and the

repression mark H3K27me3 (Barski et al., 2007). We asked how many of the genes with

SUMO-1 enriched promoters also have H3K4me3 peaks falling into the transcribed

region. There are a total of 2893 genes with SUMO-1 peaks in the promoter, out of which

70% (2039 genes) have H3K4me3 overlapping in the promoter region (Figure 3-10A,

left, chi-squared test p = 2.2 × 10^-16). Since H3K4me3 is associated with open chromatin and actively transcribed genes (Justin et al., 2010; Santos-Rosa et al., 2002), these results further supported the concept that SUMO tagging of the promoter marks active gene expression. By contrast, only 9% of the genes with SUMO-1-labeled promoters also carried the repressive H3K27me3 chromatin mark (Figure 3-10A, right; chi-squared test p = 0.0016).

[Figure 3-10, panel A Venn diagrams: H3K4me3 (n = 9834) and SUMO-1 (n = 2893), with 7795 H3K4me3 only, 2039 shared, and 854 SUMO-1 only (p = 2.2 × 10^-16); SUMO-1 (n = 2893) and H3K27me3 (n = 2111), with 2620 SUMO-1 only, 273 shared, and 1838 H3K27me3 only (p = 0.0016). Panel B heat maps: H3K4me3, SUMO-1, and H3K27me3, normalized tag density; R = 0.5122 (H3K4me3) and R = 0.0445 (H3K27me3).]

Figure 3-10 SUMO-1 marked promoters are associated with genes marked with H3K4me3. A. The Venn diagram depicts the degree of overlap between the SUMO-1 marked promoters and H3K4me3-marked promoters (left), as well as SUMO-1 marked promoters and the H3K27me3-marked promoters (right). B. The heat maps of chromatin marks (H3K4me3, SUMO-1, and H3K27me3, respectively) on each gene ±10 kb surrounding the TSSs (bent arrow). Each row represents the corresponding tag density trace for each individual gene; rows are ordered and kept the same in each heat map according to the maximum intensity of SUMO-1 labeling in the peak region (-1600 to 400 bp) of G1-stage sample. The Pearson correlation coefficients between SUMO-1 and H3K4me3 or H3K27me3 are shown (see Methods).

To further investigate whether SUMO-1 correlates with H3K4me3 or H3K27me3, we aligned their binding patterns on genes ±10 kb surrounding the TSSs to determine if the SUMO-1 mark was associated with this measure of gene activation (Figure 3-10B). Interestingly, we found the SUMO-1 tag profile had a positive correlation with H3K4me3 (R = 0.5122), but not H3K27me3 (R = 0.0445). Similar results were obtained for the SUMO-1 profiles on chromatin at other cell cycle stages (data not shown). Since we observed a positive correlation between SUMO-1 and H3K4me3, this further supported our interpretation that SUMO-1 is associated with a transcriptional activation signal.
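As an illustration of the Pearson correlation reported above, the following minimal sketch computes R between two per-gene tag density vectors. It assumes each gene is summarized by a single value per mark (for example, the density in the promoter peak region); how the densities were summarized in this study is described in the Methods, and the toy vectors below are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

# Hypothetical per-gene densities for three marks over the same five genes.
sumo1 = [1.0, 2.0, 3.0, 4.0, 5.0]
mark_a = [2.0, 4.0, 6.0, 8.0, 10.0]   # rises with SUMO-1
mark_b = [2.0, 1.0, 0.0, 1.0, 2.0]    # no linear relation to SUMO-1

r_a = pearson_r(sumo1, mark_a)
r_b = pearson_r(sumo1, mark_b)
print(r_a, r_b)
```

The gene order must be identical in both vectors, which is why the heat maps in Figure 3-10B keep the same row order across marks.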

3.4.5 SUMO-1 is a transcriptional activator of genes encoding ribosomal subunit

proteins and translation initiation factors.

Our results indicated that SUMO-1 marked the promoters of active genes. The timing of the appearance of SUMO-1 marks on promoters during interphase and removal during mitosis suggested that SUMO-1 was involved with the activation process. To test whether SUMO-1 was stimulatory to transcription, we depleted SUMO-1 or its associated E2 factor, Ubc9, by siRNA transfection in HeLa cells. The efficiency of Ubc9 or SUMO-1 siRNA depletion was confirmed by immunoblot analysis (Figure 3-12A). In cells with depleted Ubc9, the monomer form of SUMO-1 had increased abundance since it was not conjugated to other proteins (Figure 3-12A, lane 2). We then performed RNA-seq analysis from control and SUMO-1 depleted cells and collected the data from three biological replicates. Multiplex sequencing of polyA+ enriched cDNA on the Illumina GAII generated 5.7 to 9.7 million reads for each replicate, of which approximately 80% could be mapped. We calculated global gene expression levels using the standard measurement of Fragments Per Kilobase of exon per Million fragments mapped (FPKM) (Trapnell et al., 2010) from all three replicates for each gene, and the replicates were highly correlated (Figure 3-11 and data not shown).
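For reference, the FPKM unit can be written out as a simple calculation. The sketch below uses the plain counting form of the definition and is not the estimator implemented by the software cited above, which fits a statistical model to assign fragments to transcripts; the input numbers are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon per Million fragments mapped.

    fragments: fragments assigned to the gene
    exon_length_bp: summed exon length of the gene in base pairs
    total_mapped_fragments: all mapped fragments in the library
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("lengths and library size must be positive")
    # Dividing by kilobases (bp / 1e3) and millions (total / 1e6)
    # is the same as multiplying the ratio by 1e9.
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)

# Hypothetical gene: 500 fragments, 2 kb of exon, 5 million mapped fragments.
value = fpkm(500, 2000, 5_000_000)
print(value)
```

Normalizing by both exon length and library size is what allows expression values to be compared across genes and across replicates of different sequencing depth.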

[Figure 3-11 scatter plots: panels Control si and SUMO-1 si; R = 0.9763 and R = 0.9843.]

Figure 3-11 RNA-seq showing the high reproducibility of biological replicates. Scatter plot shows the FPKM of each gene from replicate 2 (y-axis) against replicate 1 (x-axis). Pearson correlation coefficient is shown in R. Scatter plot for replicates from the control siRNA (top) and the SUMO-1 specific siRNA (bottom) are shown.

We also determined the significance of changes in mRNA abundance using an FDR < 0.1%. We found 199 down regulated genes and 158 up regulated genes to have statistically significant changes in expression due to depletion of SUMO-1 (Appendix), and the magnitude of the effect ranged from a decrease in mRNA abundance of ~10-fold to an increase in mRNA abundance of ~10-fold. A heat map visualizing the 357 differentially expressed genes is shown in Figure 3-12B, with consistent results observed


among the biological replicates. Strikingly, transcripts repressed by SUMO-1 depletion were significantly enriched for those involved in protein synthesis, such as the Gene Ontology (GO) term “Translation” (p = 6.31 × 10^-10). By contrast, the up regulated genes were associated with GO terms such as “negative regulation of cell communication” (p = 4.87 × 10^-3) and “negative regulation of signal transduction” (p = 8.44 × 10^-3), though these enrichments were weaker (Figure 3-12B). Consistent with this observation, by Ingenuity Pathway Analysis (IPA), similar functional categories, such as protein synthesis, were enriched among those genes down regulated by SUMO-1 depletion (p = 1.4 × 10^-17; Figure 3-13A) but not the up regulated genes. Among

the genes that changed expression, all of those associated with protein synthesis function

were repressed by depletion of SUMO-1 (Figure 3-13B). These results again suggested that SUMO-1 functions as an activator of gene expression. To correlate the SUMO-1 mark in the genome with its effect on gene expression, we asked whether those 357 genes have a SUMO-1 mark in the promoter region (Appendix). We found that 134 out of 199 down regulated genes and 78 out of 158 up regulated genes had a SUMO-1 mark in the promoter region during G1 phase. Interestingly, when sorting the genes according to the

mRNA abundance, we found that SUMO-1 marks at the promoter were more common

with the more highly expressed genes, and these marks were most often stimulatory.

(This trend can be seen with the presence of the stimulatory SUMO-1 mark shown in red

in the top rows – highest expressers – and SUMO-1 mark was more sparsely present in

the lower rows of this table; Appendix). By contrast, SUMO-1 also labeled promoters of less expressed genes, where it acted as a repressor (Appendix, in green), indicating that


SUMO-1 may have a dual effect on regulating gene expression. We further assessed the average SUMO-1 tag density on these 357 genes, and the results reveal that SUMO-1 marks are enriched on the TSSs of both up and down regulated genes, though genes that were activated by SUMO-1 had a higher density of SUMO-1 at the TSSs (Figure 3-12C).

To test whether the transcriptional differences under SUMO-1 depletion are likely to be specific events, versus experimentally or environmentally induced gene expression changes, we tested whether the differentially expressed genes were enriched for SUMO-1 marks at their promoters relative to randomly selected gene sets of the same size. We found that both up and down regulated genes showed highly significant enrichment (Z = 9.41, p < 2.2 × 10^-16 for genes down regulated by SUMO-1 depletion, and Z = 3.43, p = 4.19 × 10^-4 for genes up regulated by SUMO-1 depletion; Figure 3-12D).
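The logic of this enrichment test can be sketched as follows: count how many genes in the set of interest carry the mark, build a null distribution from random gene sets of the same size, and express the observed count as a z-score against that null. This is a minimal illustration, not the code used in this study; the gene universe, the marked set, and the random seed are hypothetical, and the actual procedure is described in the Methods.

```python
import random
from statistics import mean, pstdev

def enrichment_z(gene_set, marked, universe, n_random=1000, seed=0):
    """Z-score of the marked-gene count in gene_set against random gene sets.

    marked: set of genes carrying the promoter mark
    universe: list of all genes that random sets are drawn from
    """
    rng = random.Random(seed)
    size = len(gene_set)
    observed = sum(1 for g in gene_set if g in marked)
    null = []
    for _ in range(n_random):
        sample = rng.sample(universe, size)       # random set, same size
        null.append(sum(1 for g in sample if g in marked))
    sd = pstdev(null)
    if sd == 0:
        raise ValueError("null distribution has zero variance")
    return observed, (observed - mean(null)) / sd

# Hypothetical universe of 2000 genes, 25% of which carry the mark.
universe = [f"g{i}" for i in range(2000)]
marked = {f"g{i}" for i in range(500)}
# Hypothetical gene set of 100 genes, 60 of which carry the mark.
gene_set = [f"g{i}" for i in range(60)] + [f"g{i}" for i in range(1000, 1040)]

observed, z = enrichment_z(gene_set, marked, universe)
print(observed, round(z, 2))
```

Drawing the random sets from the same universe as the test set matters: if the universe contains many genes that could never be scored as marked or expressed, the null counts are deflated and the z-score is inflated.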


[Figure 3-12 panels: A, immunoblot with siRNA lanes 1 to 3 (Ctrl, Ubc9, SUMO-1) probed for Ubc9, SUMO-1, and α-tubulin. B, heat map of scaled expression levels (-2 to 2) for control si replicates 1, 2, 3 and SUMO-1 si replicates 1’, 2’, 3’, annotated with Translation (p = 6.31 × 10^-10) and Negative regulation of cell communication (p = 4.87 × 10^-3). C, average tag density traces for SUMO-1si Down and SUMO-1si Up genes. D, null distributions (y-axis, probability; x-axis, number of genes with SUMO-1 at promoter) for SUMO-1 stimulated genes (Z-score = 9.41) and SUMO-1 repressed genes (Z-score = 3.43).]

Figure 3-12 Differential expression of genes following SUMO-1 depletion. A. Western blot analysis of Ubc9 (top), SUMO-1 (middle), or α-tubulin (bottom) proteins was used to evaluate the depletion by the indicated siRNA transfection. B. Heat map of RNA-seq data showed 357 differentially expressed genes from SUMO-1 depletion compared to control in HeLa cells. Color key on the left shows lower relative expression (green) and higher relative expression (red). The expression intensities were row-wise scaled for the specified genes determined to be significantly changed (adj. p-value <0.05). C. Average SUMO-1 tag distribution ±10 kb surrounding the TSSs (bent arrow) from up (blue) or down (red) regulated genes following SUMO-1 depletion. The y-axis is the normalized tag density unit (see Methods). D. Differentially expressed genes are enriched with SUMO-1 peaks in the promoter region. For each gene set (down- and up-regulated), the null distribution was generated from 1,000 randomly selected gene sets (gene number = 199 and 158, respectively, see Methods). The enrichment score (z-score) for the gene sets obtained from RNA-seq comparisons is indicated in the distribution plot by the vertical line (up regulated: blue, down regulated: red).


[Figure 3-13 panels: A, bar chart of -log(p-value) (0 to 20) for down and up regulated genes; categories shown include Protein Synthesis, Cell Cycle, DNA Replication, Cell Death, Cellular Growth, Cellular Movement, and Gene Expression. B, heat map of protein synthesis genes, scaled expression levels, for control si replicates 1, 2, 3 and SUMO-1 si replicates 1’, 2’, 3’. C, bar chart of Log2(FC) (-0.8 to 0.8) for RPL3, RPL5, RPL7A, RPL10A, RPL17, RPL23, RPL26, and SLC3A1, with an immunoblot of SUMO-1 and β-actin for Ctrlsi and SUMO-1si.]

Figure 3-13 Validation of RNA-seq data under SUMO-1 depletion showing an enrichment of genes that encode protein translation factors. A. Pathway analysis result for genes affected by SUMO-1 depletion is shown. The histogram shows that the down regulated (blue) and up regulated (red) genes secondary to SUMO-1 depletion have statistically significant enrichment for the indicated functions. The x-axis is the -log(p-value). B. Heat map of those genes that significantly changed expression following depletion of SUMO-1 and which are classified as encoding protein translation factors. The expression intensities were row-wise scaled for the specified genes determined to be significantly changed (adj. p-value <0.05). Green color indicates lower relative expression and red indicates higher relative expression. C. A second siRNA specific for SUMO-1 was used for validation of RNA-seq data shown in Figure 6C. Fold change relative to the control siRNA is represented in log2 scale for SUMO-1 depletion. The mRNA expression level for each experiment was normalized to Polr2a (a non-SUMO-1-labeled gene) and to the result with the control siRNA. Four biological replicates were done and error bars reflect the SEM. A western blot is shown to evaluate SUMO-1 protein depletion by the indicated siRNA transfection.


We find it striking that some of the housekeeping genes, for example, ribosome biogenesis proteins (RPL5, RPL7A, RPL10A) and translation factors such as initiation and elongation factors (EIF3D, EIF3E, EIF4G2, EIF5B, and EEF2) were marked by

SUMO-1 at their promoters during interphase and had mRNA expression stimulated by

SUMO-1. Examples of specific genes with SUMO-1 density for G1 and M phases and effects on transcription are shown in Figure 3-14 (top four tracings).

[Figure 3-14 browser tracks: G1 and M tracings for RPL5 (with SNORA66), RPL7A, RPL10A, RPL26, SLC1A3, and PKM2.]

Figure 3-14 Examples of SUMO-1 tracing on specific promoters. SUMO-1 tracing in G1 phase (top) and in mitosis (bottom) are shown above the gene map drawn from the IGV genome browser. Genes shown are (top to bottom) RPL5, RPL7A, RPL10A, RPL26, SLC1A3, and PKM2.

Ubc9 was required for SUMO-1 to associate with these promoters. Depletion of Ubc9 resulted in a decrease in SUMO-1 marks at these promoters (Figure 3-15A). This result suggested that SUMO-1 is coupled to the chromatin at these promoters and is not binding as a monomeric protein. For those genes that were stimulated by SUMO-1 depletion, i.e.

SUMO-1 functioned as a repressor, patterns in the SUMO-1 tag density on the promoter and gene at different points in the cell cycle were not identified. An example of a gene repressed by SUMO-1 with SUMO-1 found at the promoter, SLC1A3, is shown in Figure

3-14. Consistent with an earlier study (Rosonina et al., 2010) we found that the promoter of PKM2 (a homologue of Pyk1 in yeast) is labeled by SUMO-1 in G1 but not M (Figure

3-14, bottom), and its expression is decreased upon SUMO-1 depletion (Figure 3-15B).

In addition, our RNA-seq results showed that several ribosomal protein genes are significantly down regulated under SUMO-1 depletion (Figure 3-15B), and these results were confirmed by RT-qPCR using the same siRNA (Figure 3-15C) and a second siRNA specific to SUMO-1 (Figure 3-13C). For these assays we also tested Ubc9 depleted samples (Figure 3-15C). The results showed that when SUMO-1 was depleted, those genes were all down regulated. Interestingly, the effect of Ubc9 depletion was not in all cases consistent with that of SUMO-1 depletion. We suggest from this result that other SUMO family proteins, such as SUMO-2/3, might be involved in the regulation of these transcripts. Together, the combination of ChIP-seq, RNA-seq, and RT-qPCR results supports the concept that SUMO-1 directly activates specific gene expression and that SUMO-1 is associated with regulation of expression of ribosomal proteins and translation factors.


[Figure 3-15 panels: A, bar chart of % input (0 to 8) for Controlsi and Ubc9si at the IL2, EIF3F, RPS27, RPL26, RPL38, RPL5, RPL7A, RPL10A, RPL3, and RPL23 promoters. B, bar chart of Log2(FC) (-1.5 to 1) for RPL3, RPL5, RPL7A, RPL10A, RPL17, RPL23, RPL26, PKM2, and SLC1A3. C, bar chart of Log2(FC) (-1.5 to 1) for SUMO-1si and Ubc9si at RPL3, RPL5, RPL7A, RPL10A, RPL17, RPL23, RPL26, and SLC1A3.]

Figure 3-15 SUMO-1 activates expression of ribosome biogenesis genes. A. The effect of Ubc9 depletion on SUMOylation of specific promoters. Chromatin was isolated from control siRNA transfected cells (black) or Ubc9 siRNA transfected cells (gray), and recombinant SUMO-1 was detected by ChAP. IL-2 was a negative control based on the gene expression and ChAP-seq data. T-test using the data from four biological replications of ChAP-qPCR was conducted (*, p-value ≤0.05). B. RNA-seq analysis showing the effects of SUMO-1 depletion on mRNA levels of selected genes. The genes with statistically significant changes in RNA level are shown. Values are expressed as log2 fold change [Log2(FC)]; for those genes that depletion of SUMO-1 caused a decrease in mRNA levels the histogram points downward. C. RT-qPCR analysis of gene expression levels for the indicated genes 72 h after transfection using siRNAs specific for control, SUMO-1 or Ubc9. Fold change relative to the control siRNA is represented in log2 scale for SUMO-1 (black) and Ubc9 (gray). The mRNA expression level for each experiment was normalized to Polr2a (a non-SUMO-1-labeled gene) and to the result with the control siRNA. Three biological replicates were done and error bars reflect the SEM. A t-test of equal expression between SUMO-1/Ubc9 and control siRNA using the data from three biological replications of RT-qPCR was conducted (*, p-value ≤0.05).
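The double normalization described in the legend (to a reference gene and to the control siRNA, reported as log2 fold change with SEM) can be illustrated with a minimal sketch. It assumes that expression is read out as threshold cycle (Ct) values and that amplification efficiency is 100%, so that log2 fold change equals the negative of the ΔΔCt; neither assumption is stated in this section, and the Ct values below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def log2_fold_change(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """log2 fold change of a target gene, knockdown versus control siRNA.

    Each sample is first normalized to the reference gene (delta Ct),
    then the knockdown is compared to the control (delta delta Ct).
    A higher Ct means less template, hence the sign flip.
    """
    delta_kd = ct_target_kd - ct_ref_kd
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return -(delta_kd - delta_ctrl)

def mean_and_sem(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical Ct values for three biological replicates:
# (target in knockdown, reference in knockdown, target in control, reference in control)
replicates = [
    (21.0, 20.0, 20.0, 20.0),
    (21.5, 20.0, 20.5, 20.5),
    (20.5, 20.0, 20.0, 20.0),
]
log2_fcs = [log2_fold_change(*r) for r in replicates]
avg, sem = mean_and_sem(log2_fcs)
print(log2_fcs, avg, sem)
```

A negative value corresponds to a bar pointing downward in panels B and C, that is, lower mRNA abundance after depletion.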

3.5 Discussion

In this study, we mapped genome-wide labeling of chromatin by the SUMO-1 protein throughout the human cell cycle and made multiple discoveries. 1) On a chromosome-wide scale, the SUMO-1 binding profile was consistent during interphase, but changes were evident during mitosis with a decrease in SUMO-1 binding events. 2) We found the

ChAP-seq data of SUMO-1 replicates were highly reproducible and the pattern of

SUMO-1 binding to chromatin was dynamic during cell cycle progression. 3) The

SUMO-1 distribution on the chromatin was enriched on active genes, especially the regulatory elements such as CpG islands and promoters. 4) SUMO-1 localization on promoter chromatin was highly correlated with the transcriptional activation signal of

H3K4me3 and had low correlation with the transcriptional repressive signal H3K27me3.

5) The effect of SUMO-1 labeling of promoters on gene expression was in many cases stimulatory. 6) Genes that encode ribosomal protein subunits and translation factors were the most significant subgroup stimulated by SUMO-1.

An initial clue that SUMO-1 was correlated with gene activation was that it was associated with highly active promoters throughout interphase, decreased during mitosis when transcription is generally repressed, and then present again in G1 phase of the cell cycle. It must be recognized with this cell cycle correlation of SUMO-1 marks that absence of a chromatin mark during mitosis can have many causes aside from the regulation of transcription. It has been shown that SUMO-1 is removed from chromatin during mitosis (Zhang et al., 2008b). Our results are consistent with that earlier finding, though we do still observe SUMO-1 marks on specific sites, for example many promoters


(Figure 3-6) including the SLC1A3 gene (Figure 3-14). The results indicate that the signal by SUMO-1 on a promoter is complicated: in many cases it is stimulatory and in others the SUMO-1 tag is repressive (Figure 3-12). The genome-wide analysis presented in this study is a first step towards deciphering how SUMO-1 is regulating gene expression. The striking finding on which we focused was that among very highly expressed genes SUMO-1 is a stimulatory mark (Appendix). From the principal component analysis (Figure 3-2D), it is clear that how SUMO-1 associates with a variety of genetic elements changes through the cell cycle, and future analyses are targeted at deciphering these aspects of the complex chromatin signaling by SUMO-1.

Interestingly, when comparing the labeling of chromatin by SUMO-1 in this study with the labeling of chromatin by ubiquitin during mitosis, we found a high level of concordance. The promoters of many genes whose expression is important in the G1 phase of the cell cycle are bookmarked by ubiquitination during mitosis and then deubiquitinated in G1 (Arora et al., submitted). Of the 3446 promoters found to be bookmarked by ubiquitin during mitosis, 1829 promoters (53%) were labeled by SUMO-1 during interphase. These results are most consistent with SUMO-1 having a stimulatory role in regulating gene expression via the chromatin.

SUMOylation of transcription complexes and/or chromatin-modifying complexes is known to regulate subcellular localization and protein-DNA binding affinity, and to repress gene transcription. For example, SUMOylation of a variety of transcription factors/co-factors fused with a reporter gene inhibits gene expression (Chupreta et al., 2005;

Holmstrom et al., 2003; Shiio and Eisenman, 2003; Yang and Sharrocks, 2004).


Moreover, expression of a dominant-negative E2 Ubc9, which inhibits SUMO

conjugation to substrate proteins, or mutation of the SUMO-targeting sites on

transcription factors resulted in up-regulated transcriptional activity of specific genes

(Boyer-Guittaut et al., 2005; Nathan et al., 2006). SUMO-1/2/3 have all been shown to

recruit histone deacetylases (HDACs) (Ouyang et al., 2009a) and thus repress acetylated

chromatin. For these reasons, we were surprised that our global SUMO-1 binding data

showed SUMO-1 actually marked constitutively expressed genes. From the genome wide

data, SUMO-1 associates with highly expressed genes that encode proteins involved in

protein biogenesis. Whether the SUMO-1 moiety was recruited by specific bound factors

or DNA elements is unclear at this time. It is possible that the transcription activation

process itself recruits the SUMOylation to highly active promoters. On these high activity

promoters, binding by SUMO-1 is stimulatory.

One published study focused on SUMO marking of multiple promoters in yeast. That study suggested that SUMOylation of the promoter bound factors is associated with constitutive transcription and also activation of inducible genes, and inactivation of

SUMOylation in yeast harboring a defective ubc9 gene reduced SUMO at the constitutive promoters and decreased RNAPII binding on the promoters in yeast (Rosonina et al.,

2010). By contrast, in our study, the outcome of Ubc9 depletion is not necessarily consistent with SUMO-1 depletion, and we suggest that this inconsistency is due to

SUMO isoforms (i.e. SUMO-2/3) that might have opposing transcriptional activities. The conjugation of SUMO-1 and SUMO-2/3 on substrates has been shown to have an opposing role with a specific transcription factor (Lyst et al., 2006). In another study,


SUMO-1 was located on both active and repressed photoreceptor-specific genes to regulate rod cell development in a mouse model (Onishi et al., 2009). The results of our study substantially add to the concept that SUMO-1 is a stimulatory mark on chromatin, since we found that genome-wide in the human cell, the preponderance of SUMO-1 chromatin marks on or near promoter regions are associated with active gene expression.

Ribosome biogenesis proteins, such as small nuclear ribonucleoproteins, and ribosomal proteins were identified as novel SUMO targets and were required for nucleolus formation (Matafora et al., 2009b). Moreover, a recent study showed that the SUMO system is critical for nucleolar partitioning by regulating a novel ribosome biogenesis complex (Finkbeiner et al., 2011). The current study finds that not only are the ribosomal proteins SUMOylated, but also the genes encoding ribosomal proteins and translation factors are labeled by SUMO-1 on the chromatin over their promoters. Taken together, we suggest that SUMO-1 regulates nucleolar integrity during cell cycle progression, both transcriptionally and post-translationally.

Since impairing SUMO-1 on these promoters resulted in lower expression, this shows that efficient SUMOylation is critical for optimal gene expression. SUMO-1 marking on these translational machinery genes may function to maintain gene expression and protein stability, perhaps by antagonizing other repressive chromatin marks or regulating the subcellular localization of partner proteins required for repression. In addition, while SUMOylation plays a critical role in gene repression on a subset of genes, SUMO-1 also has other properties, for example, regulating the assembly of transcription machinery (Vethantham et al., 2007); therefore, SUMO-1 marking on those housekeeping genes may be an early modification affecting chromatin remodeling. It is unclear at this time which chromatin proteins at promoters are conjugated to SUMO-1. The position of the peak of the SUMO-1 mark on constitutively active promoters is at -200 relative to the TSS. Such a position could be consistent with the -1 nucleosome or close to where the components of general transcription machinery would be expected to bind.

A previous study has shown that SUMO-1 post-translationally modifies hsTAF5 in

TFIID to modulate TFIID promoter binding activity (Boyer-Guittaut et al., 2005). It is possible that this is the factor SUMOylated at promoters in our studies; however, it would be a complicated mechanism by which SUMOylation of a general transcription factor would be associated with the transcription activation process. Further arguing against

TFIID components causing the promoter peak of SUMO-1 binding, the methods used in this study had sufficient resolution to map the bound domains and TFIID subunits would be expected to be closer to the TSSs.

In summary, in this study we demonstrated how SUMO-1 marks promoters in the human genome and how it changes through the cell cycle. We found that SUMO-1 labeling of chromatin is dynamic through the cell cycle, and it is associated at promoters with the most actively transcribed genes. While SUMO-1 was not generally associated with all active genes, a very high percentage of the most active genes (49%) had their promoters modified with bound SUMO-1. In many of the housekeeping genes, the SUMO-1 mark on the promoter was stimulatory to gene expression, and it is critical for the high expression of genes encoding translation factors.


Chapter 4: SAFB participates in SUMO-1 binding on the constitutive active promoters

Liu HW, Banerjee T, and Parvin JD. Manuscript in preparation

Author contributions:

• Liu HW & Parvin JD designed the experiments.

• Liu HW performed the experiments.

• Banerjee T cloned the HBT-SAFB1 plasmids.


4.1 Abstract

SUMOylation is a post-translational modification in which SUMO is covalently conjugated to a variety of proteins, resulting in the regulation of a range of cellular processes including cell proliferation and maintenance of genome stability. As described in the previous chapter, we found that SUMO-1 marks chromatin at the proximal promoter regions of some of the most active housekeeping genes during interphase. In this chapter, we show our work to identify SUMO substrates on the chromatin bound to these promoters and the effect on transcription regulation. We found that SUMO-1 marks the promoters via the Scaffold Associated Factor B (SAFB) protein. We found that depletion of SAFB disrupted the organization of the Cajal body, which plays an important role in the assembly and modification of the nuclear transcription and RNA-processing machinery. This study found an unexpected role of SAFB in coupling transcription initiation and RNA processing.

4.2 Introduction

Covalent modification of proteins by Small Ubiquitin-related Modifier (SUMO) provides a platform for protein-protein interactions, and it plays a critical role in a variety of cellular signaling pathways including control of cell cycle progression, DNA repair, gene expression, and nuclear architecture (Nacerddine et al., 2005). SUMO proteins are highly conserved among eukaryotes, and there are three isoforms in mammalian cells: SUMO-1,

SUMO-2, and SUMO-3. SUMO-2 and SUMO-3 are often referred to as SUMO-2/3

because of the high sequence similarity (Geiss-Friedlander and Melchior, 2007).

Biochemically, SUMOylation of target proteins involves a three-step process starting with activation by the E1 activating enzyme (SAE1/SAE2), transfer of the SUMO moiety to the E2 conjugating enzyme (Ubc9), and finally, the Ubc9 in association with E3 ligases transfers the

SUMO protein to lysine residues via an isopeptide bond on the target protein.

SUMOylation can be E3 factor independent, or alternatively various E3 ligases may contribute by increasing the substrate specificity. Finally, the completion of the SUMOylation cascade affects the target substrate stability, enzymatic activity, or subcellular localization. Among various SUMO substrates that have been identified, transcription factors and co-regulators comprise one of the largest groups. A large number of studies have provided strong evidence for the involvement of SUMOylation in transcriptional regulation (reviewed in Geiss-Friedlander and Melchior, 2007).

SUMOylation of those transcription factors often is repressive, and the SUMOylation sites on the target proteins are located in negative regulatory domains, indicating the connection between SUMOylation and gene repression. Current models suggest that

SUMOylation leads to the recruitment of transcriptional co-repressor complexes and histone deacetylases (HDACs) to the promoters (Ouyang et al., 2009a; Shiio and

Eisenman, 2003). Conversely, there are also studies showing SUMOylation of transcription factors can lead to gene activation. For example, SUMOylation of MBD1 by

PIAS1 and PIAS3 E3 SUMO ligases interferes with the interaction between MBD1 and the histone H3 lysine 9 (H3K9) methyltransferase SETDB1, leading to loss of the

silencing chromatin modification (H3K9me3) at the promoter of an MBD1 target gene, p53BP2, and derepression of transcription (Lyst et al., 2006).

In a previous study, we have shown that SUMO-1 modifies chromatin-associated proteins located at the promoter regions of highly active genes, including those that encode ribosome subunits (Liu et al., 2012). The role for SUMO in the activation of transcription has also been observed in yeast and in human fibroblasts (Neyret-Kahn et al., 2013;

Rosonina et al., 2010). These studies have suggested that SUMOylation of transcription factors is not merely acting as a switch for gene silencing; rather, it also plays an important role in modulating transcription activation. How SUMOylation modulates chromatin structure and participates in transcriptional control of constitutive genes is largely unknown.

In this study, we first sought to identify the SUMOylated proteins bound to chromatin at active promoters, and we found that one of the SUMO-1 targets is Scaffold Attachment Factor B (SAFB). SAFB is a DNA/RNA binding protein that serves as a molecular base for assembling a transcription initiation complex near actively transcribed genes. Two isoforms (SAFB1 and SAFB2) have been found; they share 74% similarity at the amino acid level and up to 98% similarity in some functional domains, and they display redundant activity (Oesterreich et al., 1997). SAFB contains multiple functional domains, including the RNA recognition motif (RRM), which binds RNA, and the SAF box, a homeodomain-like DNA binding motif with high specificity for AT-rich scaffold/matrix attachment regions (S/MAR) on DNA. S/MAR elements have been proposed to affect gene expression, as they are found close to regulatory loci such as promoters and enhancers, and heterologous promoters flanked by S/MAR elements produced long-term transgene activation in vitro and in vivo (Forrester et al., 1994; Jenke et al., 2004).

Therefore, SAFB may regulate gene expression by mediating chromatin looping to coordinate long-distance chromatin interactions and higher-order chromatin structure. In addition, SAFB has been found to function as a co-repressor of estrogen-dependent transcription (Oesterreich et al., 2000), and it participates in the repression of immune regulators and apoptotic genes (Hammerich-Hille et al., 2010). Recent studies suggest that its role may be more widespread: it functions as a positive regulator of permissive chromatin during myogenic differentiation (Hernandez-Hernandez et al., 2013) and in the response to DNA damage (Altmeyer et al., 2013).

Here we provide evidence that both SAFB1 and SAFB2 (collectively referred to as

SAFB) are SUMO substrates bound to the chromatin during interphase in a region centered on -100 bp relative to the transcription start site. As with depletion of SUMO-1, depletion of

SAFB diminished RNA polymerase II (RNAPII) binding to promoters and decreased

RNA expression of these ribosomal protein genes.

4.3 Materials and methods

4.3.1 Chromatin affinity purification (ChAP) for mass spectrometry analysis

ChAP was based on the ChIP method, modified to include a two-step affinity purification. Cells were synchronized in S phase or mitosis as described in section 3.3.3. A total of 10^8 HeLa cells were lysed in Lysis Buffer I (50 mM HEPES-KOH, pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 0.25% Triton X-100), and the pellet was collected and resuspended in Lysis Buffer II (10 mM Tris, pH 8, 80 mM NaCl, 1 mM EDTA, 0.5 mM EGTA). After centrifugation (1,400 x g), the pellet contained the isolated chromatin. The chromatin was resuspended in Lysis Buffer III (10 mM Tris, pH 8, 80 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% deoxycholate, 0.5% N-lauroylsarcosine). The isolated chromatin was sheared to 200-300 bp by sonication and incubated with 375 μl of Ni beads (Qiagen) for 16 h at 4°C. After washing in 6 ml of wash buffer I (50 mM Tris pH 8; 0.01% SDS; 1.1% Triton X-100; 150 mM NaCl), chromatin fragments were eluted in 6 ml of elution buffer (wash buffer I with 300 mM imidazole). The nickel eluate was incubated with 375 μl of streptavidin beads (Invitrogen) for 6 h at 4°C. After three stringent washes in 2 ml of wash buffer II (50 mM Tris pH 8; 10 mM EDTA; 1% SDS; 1 M NaCl) and three washes in 10 mM Tris buffer, pH 8, the chromatin was trypsinized in solution for 2 h at 37°C at a protein-to-trypsin ratio of 1:120 (wt/wt). The digested solution was lyophilized and dissolved in 50 μl of HPLC-grade water. The digested sample was then sent for MS analysis.

4.3.2 Chromatin immunoprecipitation qPCR (ChIP-qPCR)

Chromatin immunoprecipitation and affinity purification were performed as described in section 3.3.4 of this dissertation.

4.3.3 Antibodies used and immunofluorescent staining

The SUMO-1 antiserum was prepared as described in section 3.3.2. It was used at a dilution of 1:1000 for ChIP-qPCR and 1:100 for immunofluorescent staining. Anti-coilin (sc-56298, Santa Cruz) and anti-SAFB (05-588, Millipore) were each used at a dilution of 1:100 for immunofluorescent staining. The in-house monoclonal antibody against RNAPII (8WG16) and the anti-pSer-5 antibody (ab5131, Abcam) were used at a dilution of 1:500.

4.4 Results

4.4.1 SUMOylation facilitates RNAPII recruitment on constitutively active promoters

We have previously shown that SUMO-1 is enriched on chromatin proteins bound to the promoter regions of some of the most active genes during interphase, such as ribosomal protein (RP) encoding genes (Liu et al., 2012), which are highly abundant and constitutively transcribed by RNA polymerase II (RNAPII). Previous studies in yeast showed that SUMO is required for RNAPII recruitment to constitutive genes during transcription initiation (Rosonina et al., 2010). Since depletion of SUMO-1 caused a decrease in mRNA production, the presence of SUMO-1 on active promoter regions indicated a positive role for SUMOylation in transcriptional regulation. To determine whether this effect occurs at transcription initiation, we performed ChIP-qPCR assays to investigate RNAPII occupancy under defective SUMOylation in HeLa cells. Ubc9, the only SUMO-specific E2 conjugating enzyme found in cells, was depleted by siRNA transfection, and the cells were collected 48 h after transfection. The promoter regions of RPL3, RPL7A, RPL10A, and RPL26, which were found to be enriched with SUMO-1 (Liu et al., 2012), were tested for the effect of SUMOylation on promoter binding by RNAPII (Figure 4-1). We performed ChIP for SUMO-1, for RNAPII with phosphorylated serine-5 of the CTD (pSer-5), and for RNAPII that was either pSer-5 or unphosphorylated (8WG16). SUMO-1 and RNAPII were enriched on the promoters of RP genes under control conditions (cells transfected with the control siRNA). Upon Ubc9 siRNA depletion, the occupancy of SUMO-1 on the promoters of RP genes decreased four- to five-fold (Figure 4-1C). Depletion of Ubc9, and the consequent defect in SUMOylation, also decreased the occupancy of both the unphosphorylated and the phosphorylated forms of RNAPII at these promoters. Binding of the unphosphorylated RNAPII decreased 2.5- to 5-fold, and binding of the pSer-5 form of RNAPII decreased four- to five-fold compared to controls (Figure 4-1A, B). It is noteworthy that the effect on the pSer-5 form of RNAPII was of somewhat greater magnitude than the effect on the unphosphorylated form, suggesting that SUMOylation affected the initiation process.
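The fold decreases quoted above follow directly from a "fold change over control" value for each promoter. A minimal sketch of that conversion is given here; the function names and the % input numbers are hypothetical and are not data from this study.

```python
# Illustrative only: the signal values below are hypothetical, not measurements from this study.

def fold_change_over_control(treated: float, control: float) -> float:
    """ChIP signal in siRNA-treated cells expressed relative to control siRNA cells."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return treated / control


def fold_decrease(fold_change: float) -> float:
    """Convert a fold change below 1 into the equivalent 'x-fold decrease'."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return 1.0 / fold_change


# Hypothetical ChIP signal (% input) at one promoter.
control_signal = 0.60  # control siRNA
ubc9_signal = 0.15     # Ubc9 siRNA

fc = fold_change_over_control(ubc9_signal, control_signal)
print(f"fold change over control: {fc:.2f}")
print(f"fold decrease: {fold_decrease(fc):.1f}")
```

On an axis where the control is set to 1, a four- to five-fold decrease therefore corresponds to values between 0.25 and 0.20.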

Figure 4-1 SUMOylation facilitates RNAPII recruitment on the active promoters. The effect of Ubc9 depletion on SUMOylation of RP promoters. Chromatin was isolated from control siRNA-transfected cells (black) or Ubc9 siRNA-transfected cells (gray), and enrichment of A. unphosphorylated RNAPII (8WG16), B. phosphorylated RNAPII (p-Ser-5), or C. SUMO-1 was detected by ChIP-qPCR at the RPL26, RPL10A, RPL7A, and RPL3 promoters (y-axis: fold change over control siRNA). A t-test was conducted using data from four biological replicates of ChIP-qPCR (the p-value for each sample compared to control was ≤0.05).

4.4.2 SAFB is SUMO-1 modified and associated with RP gene promoters

Since SUMOylated chromatin-associated proteins and transcription factors are low in abundance, and since SUMOylated targets are highly dynamic and rapidly reversed by

SUMO proteases, a cell line that stably expresses SUMO-1 fused with a histidine and biotinylation (HB) tag was used to isolate SUMO-1-labeled chromatin proteins. To identify the SUMO-1 substrates that mark promoters during S phase, we took advantage of the observation that SUMO-1-modified chromatin-associated proteins were present during interphase and absent during mitosis (Liu et al., 2012). We purified the chromatin fraction from interphase cells and from cells in mitosis. The SUMOylated chromatin proteins were isolated by metal ion affinity purification followed by avidin affinity purification, applying stringent washes for each column. Purified proteins were analyzed by mass spectrometry to identify proteins covalently bound to SUMO-1 during S phase and not during mitosis (Figure 4-3A).

We analyzed the list of chromatin-associated proteins SUMOylated during interphase by Gene Ontology (GO) term analysis. The top functions of the proteins purified by virtue of binding to SUMO-1 were RNA metabolism and protein synthesis (Table 4-1).

Table 4-1 GO term Analysis of chromatin-associated proteins pulled down by SUMO-1

Name                                          p-value range          # Molecules
RNA post-transcriptional modification         1.92E-09 – 6.25E-03    11
DNA replication, Recombination and Repair     1.39E-07 – 4.12E-02    13
Post-translational modification               7.66E-06 – 3.88E-02    3
Gene expression                               1.95E-05 – 3.88E-02    19
Cell morphology                               3.56E-05 – 4.36E-02    6
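Table 4-1 reports p-values for each functional category, but the text does not state which tool or statistical test produced them. As a hedged illustration of how an over-representation p-value of this kind is commonly obtained, the sketch below computes a right-tailed hypergeometric probability. The function name and all counts are hypothetical and are not taken from this study.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a hypergeometric draw.

    N: proteins in the background set
    K: background proteins annotated with the category
    n: proteins in the purified list
    k: purified proteins annotated with the category
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    lowest = max(k, 0)
    highest = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(lowest, highest + 1))
    return hits / total


# Hypothetical counts: 11 of 60 purified proteins carry an annotation
# that applies to 300 of 20000 background proteins.
p = hypergeom_upper_tail(k=11, n=60, K=300, N=20000)
print(f"enrichment p-value: {p:.3e}")
```

A small p-value means that the purified list contains more proteins from the category than expected from random sampling of the background.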


In addition, among the chromatin-associated proteins affinity-purified with SUMO-1, many RNA binding proteins, such as non-POU domain-containing octamer-binding protein (NonO), and splicing-related proteins, such as PTB-associated splicing factor (SFPQ), were detected in the MS data, suggesting that transcription initiation and pre-mRNA splicing are coordinated and functionally coupled. Because splicing occurs in close vicinity to genes and is frequently cotranscriptional, SUMOylation may serve as a link between transcription initiation and pre-mRNA splicing (Appendix).

Since SUMOylation of multiple transcription factors or chromatin-associated proteins has been shown to affect their structure and thereby regulate transcription, we looked for factors that may connect transcriptional regulation and nuclear structure. Scaffold Attachment Factor B (SAFB) was identified with very high scores, and both isoforms were detected in the S phase chromatin and not in the mitotic chromatin. Interestingly, the connection between scaffold binding and gene regulation has been demonstrated in human cell lines and a mouse model. SAFB chromatin association had previously been evaluated by ChIP-chip analysis, and of the four sites identified (Hammerich-Hille et al., 2010), two corresponded exactly with SUMO-1 binding sites upstream of promoters (Figure 4-2).

Figure 4-2 Confirmation of SUMO-1 enrichment at SAFB binding sites. SAFB binding sites analyzed by ChIP-chip (Hammerich-Hille et al., 2010) coincide with SUMO-1 binding sites identified by ChIP-seq analysis (Liu et al., 2012). Tracks shown: G1 and late S phase (S6) at the Hsp27/HSBP1 locus; scale bars, 2 kb.

SAFB was initially identified as a protein that binds SAR/MARs. It also binds heterogeneous nuclear ribonucleoprotein A1 (HAP) and acts as a transcription factor at the hsp27 and estrogen receptor-alpha genes (HET). Analysis of our affinity purification of SUMO-1-associated chromatin proteins revealed a prominent SAFB band in the input sample migrating at a position consistent with the unmodified protein. The unbound fraction contained a band of similar mass to the unmodified SAFB. By contrast, the bound, SUMO-1-tagged eluate contained multiple polypeptides that were recognized by the SAFB-specific antibody and that were shifted to slower migration (Figure 4-3B, lane 3), which we interpret to be consistent with multiple SUMOylations of the two isoforms of the SAFB protein.

It had been previously reported that SAFB interacts with chromatin (Hammerich-Hille et al., 2010). We tested whether SAFB localized to the RP gene promoters. Since the SAFB-specific antibody did not work in ChIP, we transfected HeLa cells with the SAFB-1 gene fused to the his6-biotin (HB) tag and performed ChIP-qPCR analysis. An HB tag-only plasmid was used to control for nonspecific binding. SAFB-1 was found to associate with all eight of the RP promoters tested (Figure 4-3C).
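ChIP-qPCR enrichment of this kind is typically expressed as a percentage of input chromatin. The exact calculation used in section 3.3.4 is not reproduced here, so the following is only a sketch of the widely used percent-input formula. It assumes 100% PCR efficiency; the function name, Ct values, and input fraction are hypothetical.

```python
from math import log2


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent of input recovered in a ChIP sample, assuming perfect PCR doubling.

    input_fraction is the share of chromatin saved as input (for example 0.01 for 1%).
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to the value expected for 100% of the chromatin.
    ct_input_adjusted = ct_input - log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (ct_input_adjusted - ct_ip)


# Hypothetical Ct values for one promoter, with 1% of chromatin kept as input.
tagged = percent_input(ct_ip=29.5, ct_input=25.0, input_fraction=0.01)
tag_only = percent_input(ct_ip=32.0, ct_input=25.0, input_fraction=0.01)
print(f"HB-tagged protein: {tagged:.3f}% input")
print(f"HB tag only:       {tag_only:.3f}% input")
```

Comparing the tagged sample with the tag-only control at each promoter separates specific enrichment from background binding to the beads.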

Figure 4-3 Identification of SAFB as a SUMOylated substrate on the promoters. A. Flowchart for identifying SUMO-1 targets on the chromatin during S phase: cell synchronization (M or S phase), chromatin isolation, Ni-NTA purification, avidin purification, in-solution digestion, LC-MS/MS, and protein identification. B. Chromatin affinity purification verifying SAFB as a SUMO-1 target; fractions shown are total chromatin, flow-through (FT), and SUMO-1-tagged eluate. C. Verification of SAFB-1 enrichment on the SUMO-1-bound promoters (% input for HB-SAFB1 and HB tag only at the IL2, EIF3F, RPL5, RPL3, HSP27, RPL26, RPL38, RPL7A, RPL23, and RPL10A promoters). HeLa cells were transfected with the SAFB-1 gene fused to the his6-biotin (HB) tag, followed by ChIP-qPCR analysis. An HB tag-only plasmid was used to control for nonspecific binding.

4.4.3 SAFB participates in SUMO-1 localization to promoters

To determine whether SAFB is responsible for recruiting SUMO-1 to specific promoters, we tested whether depletion of SAFB affected the recruitment of SUMO-1 to promoters that we had previously characterized as SUMO-1 bound (Liu et al., 2012). Since there are two highly related isoforms of SAFB, we depleted both with siRNAs targeting SAFB1 and SAFB2. Cells were collected 48 h after siRNA transfection, and immunoblot analysis showed that SAFB protein was depleted by >90% (Figure 4-4C). Consistent with earlier results, ChIP-qPCR analysis showed that SUMO-1 and RNAPII were enriched on the promoter regions analyzed; however, SAFB depletion caused a significant decrease in the SUMO-1 marks on the promoters, down 40-50% compared to the control (Figure 4-4A). Depletion of SAFB also caused a decrease in RNAPII occupancy on these promoters (Figure 4-4B).

In addition, a second set of siRNAs targeted the 3' UTR of SAFB1 and the 3' UTR of SAFB2. Depletion of SAFB with this second set of siRNAs similarly decreased SUMO-1-specific ChIP at RP gene promoters, and expression of SAFB1 from a cotransfected plasmid rescued SUMO-1 binding to these promoters (Figure 4-4D). These results indicate that SAFB is a functionally relevant SUMO-1 target bound to the promoters, and they support a model whereby the SUMOylation of SAFB promotes RNAPII binding to target gene promoters.

Figure 4-4 SAFB participates in SUMO-1 association on promoters of RP genes. The effect of SAFB depletion on SUMO-1-bound promoters. Chromatin was isolated from control siRNA-transfected cells (black) or SAFB siRNA-transfected cells (gray), and A. SUMO-1 or B. RNAPII (8WG16) binding on the promoters was detected by ChIP-qPCR (% input). IL-2 was a negative control based on the gene expression and ChIP-seq data. A t-test was conducted using data from four biological replicates of ChIP-qPCR (*, p-value ≤0.05). C. Western blot analysis of SAFB and α-tubulin proteins was used to evaluate the depletion by the indicated siRNA transfection. D. ChIP-qPCR showing the effect of a second siRNA targeting SAFB and add-back of a wild-type SAFB-1 plasmid as indicated. The enrichment of IgG (black) or SUMO-1 (gray) is shown.
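The caption above reports a t-test on four biological replicates but does not specify the variant used. The sketch below shows one possibility, an unequal-variance (Welch) two-sample t-test, written with the Python standard library only. The choice of Welch's test, the function names, and the replicate values are assumptions for illustration and are not data from this study.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    norm = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: float, steps: int = 20000) -> float:
    """Two-sided p-value via Simpson integration of the density over [0, |t|].

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central)


def welch_t_test(x, y):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom, two-sided p)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df, two_sided_p(t, df)


# Hypothetical % input values from four biological replicates per condition.
control_sirna = [0.62, 0.55, 0.70, 0.58]
safb_sirna = [0.33, 0.29, 0.38, 0.31]
t_stat, dof, p_value = welch_t_test(control_sirna, safb_sirna)
print(f"t = {t_stat:.2f}, df = {dof:.1f}, p = {p_value:.4f}")
```

A promoter would be marked with an asterisk when the resulting p-value is at or below 0.05.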

4.4.4 SAFB depletion caused down regulation of mRNA processing of RP genes

We have shown previously that SUMO-1 marks on the chromatin coincide with the preinitiation complex (PIC) on the promoters of constitutive housekeeping genes, and that SUMO-1 depletion caused down-regulation of RP gene expression (Liu et al., 2012). Given that SAFB interacts with the carboxy-terminal domain of RNAPII (Nayler et al., 1998), and that SAFB is involved in the recruitment of SUMO-1 and RNAPII to the promoters, we tested whether SAFB depletion affects mRNA expression of the RP genes at transcription initiation or pre-mRNA splicing. We investigated the RNA processing of two RP genes, RPL26 and RPL7A, by quantifying RNA containing the exon-exon junction as a measure of mature mRNA and RNA containing the intron-exon junction as a measure of pre-mRNA. RT-qPCR analysis showed that depletion of either SUMO-1 or SAFB did not affect the nuclear primary transcripts relative to the control (Figure 4-5A), whereas the spliced mRNA purified from the nucleus was down-regulated upon SUMO-1 or SAFB depletion (Figure 4-5B). This result suggested that SUMO-1 and SAFB are involved in mRNA processing.

We also tested the mature mRNA in the cytosol. Although there was a trend toward down-regulation of RP genes, the decrease in mRNA due to SUMO-1 or SAFB depletion was not statistically significant (Figure 4-5C). We suggest that the diminished effect was due to the abundance of RP gene mRNA already present in the cells prior to SAFB or SUMO-1 siRNA transfection.

Figure 4-5 SAFB or SUMO-1 depletion caused down-regulation of spliced mRNA in the nucleus. RT-qPCR analysis of expression levels for the indicated genes 48 h after transfection with control, SUMO-1, or SAFB siRNA. A. Nuclear pre-mRNA. B. Nuclear mRNA. C. Cytosolic mRNA. Values are shown as fold change relative to the control siRNA. The pre-mRNA or mRNA expression level for each experiment was normalized to 18S rRNA (a non-SUMO-1-labeled gene) and to the result with the control siRNA. Three biological replicates were performed, and error bars reflect the SEM. A t-test of equal expression between SUMO-1 or SAFB siRNA and control siRNA was conducted using the data from three biological replicates of RT-qPCR (*, p-value ≤0.05).
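The caption describes a two-step normalization, first to 18S rRNA and then to the control siRNA sample, with error bars showing the SEM of three biological replicates. The quantification formula itself is not stated, so the sketch below uses the common 2^-ΔΔCt approach as an assumption. It assumes 100% PCR efficiency, and the function names and Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target: float, ct_ref: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """2^-ddCt: target normalized to a reference gene, then to the control sample."""
    d_ct_sample = ct_target - ct_ref              # knockdown sample
    d_ct_control = ct_target_ctrl - ct_ref_ctrl   # control siRNA sample
    return 2.0 ** -(d_ct_sample - d_ct_control)


def sem(values) -> float:
    """Standard error of the mean across biological replicates."""
    return stdev(values) / sqrt(len(values))


# Hypothetical Ct values per replicate:
# (target in knockdown, 18S in knockdown, target in control, 18S in control)
replicates = [
    (24.9, 9.8, 24.0, 9.9),
    (25.1, 10.1, 24.2, 10.0),
    (24.8, 9.9, 24.1, 10.0),
]
fold_changes = [relative_expression(*r) for r in replicates]
print(f"mean fold change: {mean(fold_changes):.2f} +/- {sem(fold_changes):.2f} (SEM)")
```

With this normalization the control siRNA sample is 1 by construction, so values below 1 indicate down-regulation.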

4.4.5 SAFB is needed for proper Cajal body organization

Mammalian cells are highly compartmentalized, and previous studies showed that SAFB binds to RNAPII and SR proteins (Nayler et al., 1998), indicating that SAFB is part of the spliceosome and may play an important role in nuclear matrix organization. We asked whether disrupting SAFB expression affects the formation of nuclear structures that participate in transcription and mRNA processing. We investigated distinct subnuclear structures: transcription factories, speckles, and Cajal bodies. For immunofluorescence analysis, we stained transcription factories with an antibody recognizing RNAPII-Ser5p, the phosphorylated form of active RNAPII during transcription initiation, and used SC35 as a marker protein of speckles, where splicing factors are located and pre-mRNA is spliced. We also examined Cajal bodies, subnuclear structures involved in the biogenesis of ribonucleoproteins and the modification of snRNA (Gall, 2003), which are important for splicing. In addition, disruption of Cajal body organization in HeLa cells can reduce cellular proliferation, and this might be due to depleted small nuclear ribonucleoprotein (snRNP) resources (Velma et al., 2010). We used p80-coilin, an 80-kD protein that is predominantly present in Cajal bodies, as a marker for Cajal body structure. SAFB or SUMO-1 was depleted by siRNA transfection in U2OS cells.

The results showed that depletion of SAFB or SUMO-1 did not disrupt the organization of speckles or RNAPII-Ser5p compared to control; that is, both phosphorylated RNAPII and speckles were distributed broadly throughout the nucleoplasm in a meshwork pattern, and the punctate staining of both structures was maintained (Figure 4-6B). Under control conditions, Cajal bodies were spherical, closely associated with nucleoli, and numbered 2-5 per cell. Surprisingly, upon depletion of SAFB, Cajal bodies were irregularly shaped and accumulated outside of the nucleoli, and the coilin signal was also diffusely distributed throughout the nucleoplasm. Depletion of SUMO-1 caused a similar disruption of Cajal bodies, and we found that coilin accumulated in the perinuclear region (Figure 4-6A).

Figure 4-6 The effect of SAFB or SUMO-1 depletion on nuclear structure. A. U2OS cells were transfected with control (GL2), SAFB, or SUMO-1 siRNA, and the organization of RNAPII (p-Ser-5, green) or Cajal bodies (coilin, red) is shown, together with merged and DAPI images. B. Same experimental setting as A, but cells were immunostained for nuclear speckles (SC35, red).

4.5 Discussion

In this study, we discovered that SUMO-1 binds to the chromatin at promoters of ribosomal protein genes via the scaffold attachment factor, SAFB. We found that SAFB is SUMOylated, and depletion of SAFB decreased SUMO-1 association with these promoters. These promoters, which drive genes encoding RPs and translation factors, are among the most active RNAPII promoters in the cell, and SUMO-1-tagged SAFB is stimulatory to their transcription. SAFB is required for the recruitment of RNAPII to the active promoters for transcription initiation of ribosomal protein genes. A previous study showed that the strength of a promoter-bound activator could also affect the efficiency of constitutive splicing and 3′-end cleavage of different reporter pre-mRNAs (Rosonina et al., 2003). This activator-dependent increase in pre-mRNA processing efficiency was found to require the RNAPII CTD, and close interactions between transcription initiation and pre-mRNA processing components have also been found (Monsalve et al., 2000). Interestingly, a number of RNA processing factors have been found to be SUMO targets, suggesting that SUMO marks on the promoter during the transcription cycle can be important for determining the efficiency of the pre-mRNA splicing step. In addition, transcription and mRNA splicing are highly regulated, and a series of highly structured and coordinated events is required to precisely remove each of the introns during the synthesis of nascent transcripts. Therefore, transcriptional coregulators that play an additional nuclear structural role, such as SAFB, may regulate this process.

Nuclear function depends on organizing platforms for establishing structural and functional domains in the nucleus. SAFB1 has been confirmed as a nuclear matrix protein and provides a platform for chromatin by binding to the AT-rich S/MAR regions in the genome. Therefore, it is possible that SAFB mediates localization of constitutive promoters marked by SUMO-1 to chromatin regions that are repressive or active. Emerging studies show the importance of SAFB in the regulation of chromatin architecture. For example, SAFB1 has been shown to participate in chromatin remodeling by interacting with ATP-dependent chromatin modifying proteins such as CHD1 (Tai et al., 2003). SAFB1 might act as an architectural protein that attracts the basal transcription machinery to the nuclear matrix and regulates transcription and RNA processing. Despite its co-repressor role, a recent study reported that SAFB1 is involved in regulating chromatin accessibility in response to genotoxic stress, and that it is transiently recruited to DNA damage sites for efficient signaling and the downstream phosphorylation of chromatin (Altmeyer et al., 2013). In addition, another study showed that SAFB1 is associated with the activation of skeletal muscle gene expression during myogenic differentiation by facilitating the transition of promoter sequences from a repressive chromatin structure to one that is transcriptionally active (Hernandez-Hernandez et al., 2013). Interestingly, we found that the most active genes, such as ribosomal protein genes, require SAFB to initiate transcription; therefore, it is possible that the basal expression of those housekeeping genes uses the same mechanism to modulate gene expression through regulation of chromatin accessibility. In addition, a previous study showed that the localization of SR proteins coincides with the sites of active RNAPII during transcription. Considering the DNA and RNA binding ability of SAFB, and the result in this study that SAFB and SUMO-1 participate in Cajal body formation, where snoRNPs are processed and recycled, they may work cooperatively by serving as a platform to tether the mRNA processing machinery.

Indeed, when we tested whether SAFB is involved in transcriptional activation in an inducible system, we did not see any difference compared to control, and this might be due to the lack of an intron in the reporter system (Figure 4-7), supporting a role for SAFB or SUMO-1 in splicing. Together, the previous literature linking SAFB and RNAPII and the data provided in this study suggest that SAFB may serve as a platform for transcription initiation and pre-mRNA splicing to ensure the efficient expression of highly active genes. It is possible that SAFB associates with the CTD of the initiating RNAPII and travels with RNAPII as it synthesizes nascent pre-mRNA, subsequently facilitates the assembly of splicing complexes and splicing of the first intron to emerge, and thereby coordinates transcription and pre-mRNA processing levels.

Figure 4-7 SAFB or SUMO-1 depletion did not affect the expression of an inducible system. U2OS-Luc cells were left untransfected or were transfected with control, SAFB, or SUMO-1 siRNA, and cells were induced with tetracycline for the indicated time period (0 to 2 h). Cells were then collected and analyzed for luciferase activity (RFU).

Taken together, the results of this study suggest a novel function for SAFB in regulation

of gene expression by coordinating transcriptional initiation and RNA processing.

Interestingly, SAFB interacts with SF2/ASF in vivo, and it has been reported recently

that SF2/ASF functions as a cofactor to enhance SUMOylation through the E3 ligase

PIAS1. Therefore, it is possible that SAFB and SF2/ASF may serve as the functional link

between the RNA processing and SUMOylation machinery.


Chapter 5: Discussion and Future Direction

5.1 Summary of results

In this study, we sought to identify how SUMO-1 modifies the chromatin of the human genome and how SUMOylation changes during cell cycle progression. To detect changes in SUMO-1 labeling of chromatin and chromatin-associated proteins, we used chromatin affinity purification specific to exogenous SUMO-1 to identify the SUMO-1-modified chromatin-associated proteins and DNA loci. This was the first use of a genome-wide approach to characterize how SUMO-1 functions as a chromatin-associated modification in the human genome. Surprisingly, despite the known repressive role of SUMOylation on histones, we found that SUMO-1 localizes to the promoters of constitutively active genes involved in protein translation and proliferation during interphase, and SUMO-1 marks on these promoters were absent during mitosis. In addition, SUMO-1 association with the promoters recruits RNAPII, and depletion of SUMO-1 leads to down-regulation of those ribosomal protein genes, suggesting a positive role for SUMO-1 in gene activation.

By using mass spectrometry, we identified that SUMO-1 marks the promoters via the Scaffold Attachment Factor B (SAFB) protein. The results showed that SAFB is SUMOylated, that depletion of SAFB decreased the SUMO-1 marks on the promoters of those housekeeping genes transcribed by RNAPII, and that SUMO-1-tagged SAFB is stimulatory to their transcription. In addition, depletion of SAFB decreased the splicing of the mRNAs and disrupted the organization of the Cajal body, which is important for snRNP and snoRNP biogenesis. All these findings suggest a role for SUMO-1 in this important regulatory process for transcription initiation and splicing of the mRNA of ribosomal protein genes.

A recently published study indicated that SUMOylation of active TSSs also occurs in human fibroblasts, and that depletion of the only E2 enzyme, Ubc9, leads to atypical cell senescence (Neyret-Kahn et al., 2013). While the authors suggested that the SUMO mark on promoters restrains downstream gene expression, we found that SUMO-1 activates RP gene expression in HeLa cells. This discrepancy might arise because different SUMOylated substrates are enriched on the promoters, resulting in different outcomes. When the SUMOylated substrates at the PIC sites are transcription factors, the SUMO-labeled promoters are repressed; however, when SAFB is SUMOylated and bound to the RP genes, it is involved in the activation of transcription and splicing events.

Figure 5-1 Model of SAFB SUMOylation regulating RP gene expression. SUMOylation of SAFB is required for the coupling of transcription and processing to make mRNA efficiently. SAFB and SUMO-1 recruit RNAPII onto the RP genes to initiate transcription and facilitate splicing of primary transcripts. The mature mRNA is then bound by proteins (mRNPs) and exported from the nucleus to the cytoplasm.

5.2 Issues to be resolved

5.2.1 The SUMOylation sites in SAFB1 that regulate SUMO-1 marks on the promoters

In this study, we found that SAFB is involved in regulating SUMO-1 binding on the promoters. A previous study showed that K231 and K294, the canonical SUMOylation sites of SAFB1, are required for its co-repressor activity in regulating ERα (Garee et al., 2011). Nonetheless, when cells exogenously expressed the K231R/K294R mutant, we still found enrichment of SUMO-1 on the promoters, suggesting that these two SUMOylation sites are not responsible for the SUMO-1 marks on the promoters. Using the SUMOsp software (Ren et al., 2009), we found other predicted non-canonical SUMOylation sites in SAFB1, such as K285, K296, and K500. Therefore, it is possible that these non-consensus sites are the actual sites responsible for SUMO-1 association with the promoters.

5.2.2 Other SUMO-1 targets that regulate SUMO-1 marks on the promoters

While depletion of SAFB caused a significant decrease of SUMO-1 binding on the active promoters, SUMO-1 binding was not eliminated from these promoters. One possibility was that the depletion was not sufficient to decrease SUMOylation to baseline levels.

Alternatively, this result could suggest that other factors are also SUMO-1 modified and bind to those promoters. Indeed, a study using the DNA repair pathway as an example revealed that SUMOylation acts synergistically on a group of proteins rather than on individual substrates (Psakhye and Jentsch, 2012). From the mass spectrometry in this study, we found other factors, such as SAF-A and hnRNPA1, as potential SUMO-1 substrates on the chromatin. Therefore, it is important to identify other components involved in SUMO-1 association with the promoters.

5.2.3 The E3 ligases responsible for SUMO-1 marking on the promoters

SUMO E3 ligases are known to increase the specificity of SUMOylation to tightly regulate cellular processes. Compared to the very large number of ubiquitin E3 ligases, relatively few SUMO E3 ligases have been found. By depleting SUMO E3 ligases, we can examine the SUMOylation of SAFB in order to identify the enzyme responsible for the modification. With an E3 SUMO ligase identified, we could use ChIP-seq to determine whether its depletion changes the SUMOylation of chromatin across the genome, to further elucidate how SUMOylation regulates cell cycle progression. In this study, we found that depletion of SUMO-1 caused down-regulation of RP genes at the pre-mRNA splicing step. A previous study showed that ASF/SF2, a factor involved in splicing regulation and other RNA metabolism-related processes, is a co-activator of the SUMO E3 ligase PIAS1, and PIAS1 has been found in nuclear speckles (Tan et al., 2002). Thus, it will be interesting to determine whether PIAS1 is the E3 ligase responsible for SUMO-1 association with the promoters. Moreover, several hnRNPs, such as hnRNPC1 and hnRNPA1, were pulled down with SUMO-1 in the MS analysis, and hnRNPs are the most abundant cargos transported through nuclear pore complexes (NPCs). Since RanBP2, one of the major SUMO E3 ligases, is located in the NPC, it is also possible that RanBP2 participates in the processing or transport of the mRNA of SUMO-1-labeled genes.

5.3 Future directions

5.3.1 SUMOylation regulating RNA processing

SUMOylation is known to be a critical regulator of nuclear functions, including nuclear transport, transcription, and genome stability. The events governing the processing of mRNA precursors are closely linked with transcription and export. Previous studies, ranging from proteomic analyses to functional studies, have suggested that SUMO is involved in essentially every step of nuclear RNA processing. For example, a number of putative SUMO targets have been found in functional capping, splicing, polyadenylation and mRNA export complexes (Vassileva and Matunis, 2004; Vethantham et al., 2008; Xu et al., 2007). However, studies of the SUMOylation of RNA processing/binding proteins have largely been performed in vitro, and the exact biological outcome remains unclear.

SAFB protein harbors an RRM domain, which is found in most RNA processing proteins; nonetheless, the exact role of SAFB in RNA processing is not clear. Our data suggested that depletion of either SUMO-1 or SAFB disrupted the Cajal body, the nuclear structure responsible for snRNP and snoRNP biogenesis, telomere maintenance and histone mRNA processing, suggesting that SAFB is involved in nuclear architecture. Therefore, it is of interest to use functional mutagenesis assays to identify how SAFB regulates Cajal body organization, and how this relates to the highly active genes transcribed by RNAPII.

5.3.2 Gene regulation of RP biogenesis

Ribosome biogenesis plays a critical role in cell growth; it accounts for up to 60% of total cellular transcription in yeast, and the ratio is even higher in HeLa cells (Warner et al., 2001). It involves three steps that are highly conserved from yeast to human: expression of rRNA and ribosomal proteins, rRNA processing, and assembly of the 40S and 60S ribosomal subunits. Disruption of the ribosomal machinery therefore triggers nucleolar stress, which has been observed in many diseases including cancer. In this study, we found that SUMO-1 and SAFB regulate RP expression by mediating mRNA splicing. Consistent with our finding, a proteomic analysis of nucleolar SUMO-1 targets reported that SUMO-1 functions coordinately with the ubiquitin-proteasome system to regulate ribosome biogenesis and the maintenance of nucleolar integrity (Matafora et al., 2009a).

Interestingly, SAFB is known to respond to stress and forms speckles under heat shock (Denegri et al., 2001), and in this study we found that SAFB is required for proper Cajal body formation in HeLa cells. Stress is known to cause reorganization of nuclear architecture, simultaneous inhibition of major nuclear pathways such as transcription and replication, and activation of stress response pathways such as DNA repair. Therefore, it will be of interest to identify the role of SUMOylation and SAFB in the response to nuclear stress. Ribosomal or nucleolar stress triggers the RP-p53-MDM2 stress-response pathway, which connects p53 and ribosome biogenesis in maintaining cell homeostasis (Deisenroth and Zhang, 2010; Donati et al., 2011). It is possible that SUMOylation of SAFB is also involved in the surveillance mechanism for ribosome biogenesis and nucleolar stress (Figure 5-2). To test this hypothesis, we can challenge cells with various stresses, such as heat shock or UV damage, under SAFB depletion or in the presence of non-SUMOylatable SAFB mutants, and check whether the cells are sensitized to these stressors. In addition, to determine whether this pathway is an adaptation mechanism in aggressive cancer cells, we can test the hypothesis in several cancerous and noncancerous cell lines; cells that have acquired this pathway can be identified using proliferation or survival assays.

[Figure 5-2 schematic. Labels: Su-SAFB and Pol II at RP genes; stress; SAFB stress body formation; down regulation of RP; nucleolar stress; p53 degradation]

Figure 5-2 Su-SAFB involved in sensing stress. Under normal conditions, Su-SAFB binds to RP genes to activate transcription; under stress, SAFB is released from the promoter to form stress bodies, which causes down-regulation of RP genes and further triggers the nucleolar stress response pathway, such as the p53-MDM2 axis.

5.4 Significance

The proper control of cell cycle progression is of the utmost importance to normal cell division; when cell cycle regulation is unbalanced, it may promote cell growth and tumor development. It is known that SUMOylation is tightly orchestrated in normal cells, and previous studies have shown that an imbalance of SUMOylation is linked to tumorigenesis. Nevertheless, how the SUMO network regulates cell proliferation epigenetically remains unclear.

The approach used in this study identified a common biological output and uncovered previously unknown functions for active SUMOylation at chromatin, where it acts as a key regulator that dynamically marks chromatin and coordinates transcriptional regulation of a network of genes vital for cell growth and proliferation.

Bibliography

Alkuraya, F.S., Saadi, I., Lund, J.J., Turbe-Doan, A., Morton, C.C., and Maas, R.L. (2006). SUMO1 haploinsufficiency leads to cleft lip and palate. Science 313, 1751. Altmeyer, M., Toledo, L., Gudjonsson, T., Grofte, M., Rask, M.B., Lukas, C., Akimov, V., Blagoev, B., Bartek, J., and Lukas, J. (2013). The chromatin scaffold protein SAFB1 renders chromatin permissive for DNA damage signaling. Mol Cell 52, 206-220. Azuma, Y., Arnaoutov, A., and Dasso, M. (2003). SUMO-2/3 regulates topoisomerase II in mitosis. The Journal of cell biology 163, 477-487. Bachant, J., Alcasabas, A., Blat, Y., Kleckner, N., and Elledge, S.J. (2002). The SUMO-1 isopeptidase Smt4 is linked to centromeric cohesion through SUMO-1 modification of DNA topoisomerase II. Mol Cell 9, 1169-1182. Banerjee, T., and Chakravarti, D. (2011). A peek into the complex realm of histone phosphorylation. Mol Cell Biol 31, 4858-4873. Barski, A., Cuddapah, S., Cui, K., Roh, T.Y., Schones, D.E., Wang, Z., Wei, G., Chepelev, I., and Zhao, K. (2007). High-resolution profiling of histone methylations in the human genome. Cell 129, 823-837. Bernardi, R., and Pandolfi, P.P. (2007). Structure, dynamics and functions of promyelocytic leukaemia nuclear bodies. Nature reviews Molecular cell biology 8, 1006- 1016. Bernstein, B.E., Mikkelsen, T.S., Xie, X., Kamal, M., Huebert, D.J., Cuff, J., Fry, B., Meissner, A., Wernig, M., Plath, K., et al. (2006). A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315-326. Biggins, S., Bhalla, N., Chang, A., Smith, D.L., and Murray, A.W. (2001). Genes involved in sister chromatid separation and segregation in the budding yeast Saccharomyces cerevisiae. Genetics 159, 453-470. Blomster, H.A., Hietakangas, V., Wu, J., Kouvonen, P., Hautaniemi, S., and Sistonen, L. (2009). Novel proteomics strategy brings insight into the prevalence of SUMO-2 target sites. Molecular & cellular proteomics : MCP 8, 1382-1390. 
Bonner, W.M., Redon, C.E., Dickey, J.S., Nakamura, A.J., Sedelnikova, O.A., Solier, S., and Pommier, Y. (2008). GammaH2AX and cancer. Nature reviews Cancer 8, 957-967. Boyer-Guittaut, M., Birsoy, K., Potel, C., Elliott, G., Jaffray, E., Desterro, J.M., Hay, R.T., and Oelgeschlager, T. (2005). SUMO-1 modification of human transcription factor (TF) IID complex subunits: inhibition of TFIID promoter-binding activity through SUMO-1 modification of hsTAF5. The Journal of biological chemistry 280, 9937-9945. Bracken, A.P., and Helin, K. (2009). Polycomb group proteins: navigators of lineage pathways led astray in cancer. Nat Rev Cancer 9, 773-784.

Brandl, A., Heinzel, T., and Kramer, O.H. (2009). Histone deacetylases: salesmen and customers in the post-translational modification market. Biology of the cell / under the auspices of the European Cell Biology Organization 101, 193-205. Bruderer, R., Tatham, M.H., Plechanovova, A., Matic, I., Garg, A.K., and Hay, R.T. (2011). Purification and identification of endogenous polySUMO conjugates. EMBO reports 12, 142-148. Cao, R., Wang, L., Wang, H., Xia, L., Erdjument-Bromage, H., Tempst, P., Jones, R.S., and Zhang, Y. (2002). Role of histone H3 lysine 27 methylation in Polycomb-group silencing. Science 298, 1039-1043. Carson, J.P., Zhang, N., Frampton, G.M., Gerry, N.P., Lenburg, M.E., and Christman, M.F. (2004). Pharmacogenomic identification of targets for adjuvant therapy with the topoisomerase poison camptothecin. Cancer Res 64, 2096-2104. Choi, S.J., Chung, S.S., Rho, E.J., Lee, H.W., Lee, M.H., Choi, H.S., Seol, J.H., Baek, S.H., Bang, O.S., and Chung, C.H. (2006). Negative modulation of RXRalpha transcriptional activity by small ubiquitin-related modifier (SUMO) modification and its reversal by SUMO-specific protease SUSP1. The Journal of biological chemistry 281, 30669-30677. Chupreta, S., Holmstrom, S., Subramanian, L., and Iniguez-Lluhi, J.A. (2005). A small conserved surface in SUMO is the critical structural determinant of its transcriptional inhibitory properties. Mol Cell Biol 25, 4272-4282. Darst, R.P., Garcia, S.N., Koch, M.R., and Pillus, L. (2008). Slx5 promotes transcriptional silencing and is required for robust growth in the absence of Sir2. Mol Cell Biol 28, 1361-1372. Deisenroth, C., and Zhang, Y. (2010). Ribosome biogenesis surveillance: probing the ribosomal protein-Mdm2-p53 pathway. Oncogene 29, 4253-4260. Denegri, M., Chiodi, I., Corioni, M., Cobianchi, F., Riva, S., and Biamonti, G. (2001). Stress-induced nuclear bodies are sites of accumulation of pre-mRNA processing factors. Molecular biology of the cell 12, 3502-3514. 
Denison, C., Rudner, A.D., Gerber, S.A., Bakalarski, C.E., Moazed, D., and Gygi, S.P. (2005). A proteomic strategy for gaining insights into protein sumoylation in yeast. Molecular & cellular proteomics : MCP 4, 246-254. Desterro, J.M., Rodriguez, M.S., and Hay, R.T. (1998). SUMO-1 modification of IkappaBalpha inhibits NF-kappaB activation. Mol Cell 2, 233-239. Di Bacco, A., Ouyang, J., Lee, H.Y., Catic, A., Ploegh, H., and Gill, G. (2006). The SUMO-specific protease SENP5 is required for cell division. Mol Cell Biol 26, 4489- 4498. Donati, G., Bertoni, S., Brighenti, E., Vici, M., Trere, D., Volarevic, S., Montanaro, L., and Derenzini, M. (2011). The balance between rRNA and ribosomal protein synthesis up- and downregulates the tumour suppressor p53 in mammalian cells. Oncogene 30, 3274-3288. Draviam, V.M., Xie, S., and Sorger, P.K. (2004). Chromosome segregation and genomic stability. Current opinion in genetics & development 14, 120-125.

Evdokimov, E., Sharma, P., Lockett, S.J., Lualdi, M., and Kuehn, M.R. (2008). Loss of SUMO1 in mice affects RanGAP1 localization and formation of PML nuclear bodies, but is not lethal as it can be compensated by SUMO2 or SUMO3. J Cell Sci 121, 4106-4113. Faus, H., and Haendler, B. (2006). Post-translational modifications of steroid receptors. Biomed Pharmacother 60, 520-528. Fejes, A.P., Robertson, G., Bilenky, M., Varhol, R., Bainbridge, M., and Jones, S.J. (2008). FindPeaks 3.1: a tool for identifying areas of enrichment from massively parallel short-read sequencing technology. Bioinformatics 24, 1729-1730. Finkbeiner, E., Haindl, M., and Muller, S. (2011). The SUMO system controls nucleolar partitioning of a novel mammalian ribosome biogenesis complex. Embo J 30, 1067-1078. Flotho, A., and Melchior, F. (2013). Sumoylation: a regulatory protein modification in health and disease. Annual review of biochemistry 82, 357-385. Forrester, W.C., van Genderen, C., Jenuwein, T., and Grosschedl, R. (1994). Dependence of enhancer-mediated transcription of the immunoglobulin mu gene on nuclear matrix attachment regions. Science 265, 1221-1225. Galanty, Y., Belotserkovskaya, R., Coates, J., Polo, S., Miller, K.M., and Jackson, S.P. (2009). Mammalian SUMO E3-ligases PIAS1 and PIAS4 promote responses to DNA double-strand breaks. Nature 462, 935-939. Gall, J.G. (2003). The centennial of the Cajal body. Nature reviews Molecular cell biology 4, 975-980. Garcia-Dominguez, M., and Reyes, J.C. (2009). SUMO association with repressor complexes, emerging routes for transcriptional control. Biochimica et biophysica acta 1789, 451-459. Gareau, J.R., and Lima, C.D. (2010). The SUMO pathway: emerging mechanisms that shape specificity, conjugation and recognition. Nat Rev Mol Cell Biol 11, 861-871. Garee, J.P., Meyer, R., and Oesterreich, S. (2011). Co-repressor activity of scaffold attachment factor B1 requires sumoylation. Biochemical and biophysical research communications 408, 516-522. 
Geiss-Friedlander, R., and Melchior, F. (2007). Concepts in sumoylation: a decade on. Nature reviews Molecular cell biology 8, 947-956. Gill, G. (2010). SUMO weighs in on polycomb-dependent gene repression. Mol Cell 38, 157-159. Girdwood, D., Bumpass, D., Vaughan, O.A., Thain, A., Anderson, L.A., Snowden, A.W., Garcia-Wilson, E., Perkins, N.D., and Hay, R.T. (2003). P300 transcriptional repression is mediated by SUMO modification. Mol Cell 11, 1043-1054. Gocke, C.B., Yu, H., and Kang, J. (2005). Systematic identification and analysis of mammalian small ubiquitin-like modifier substrates. The Journal of biological chemistry 280, 5004-5012. Goldberg, A.D., Allis, C.D., and Bernstein, E. (2007). Epigenetics: a landscape takes shape. Cell 128, 635-638. Gomez-del Arco, P., Koipally, J., and Georgopoulos, K. (2005). Ikaros SUMOylation: switching out of repression. Mol Cell Biol 25, 2688-2697.

Gong, L., and Yeh, E.T. (2006). Characterization of a family of nucleolar SUMO-specific proteases with preference for SUMO-2 or SUMO-3. The Journal of biological chemistry 281, 15869-15877. Goodson, M.L., Hong, Y., Rogers, R., Matunis, M.J., Park-Sarge, O.K., and Sarge, K.D. (2001). Sumo-1 modification regulates the DNA binding activity of heat shock transcription factor 2, a promyelocytic leukemia nuclear body associated transcription factor. J Biol Chem 276, 18513-18518. Gostissa, M., Hengstermann, A., Fogal, V., Sandy, P., Schwarz, S.E., Scheffner, M., and Del Sal, G. (1999). Activation of p53 by conjugation to the ubiquitin-like protein SUMO- 1. Embo J 18, 6462-6471. Greer, E.L., and Shi, Y. (2012). Histone methylation: a dynamic mark in health, disease and inheritance. Nature reviews Genetics 13, 343-357. Guccione, E., Bassi, C., Casadio, F., Martinato, F., Cesaroni, M., Schuchlautz, H., Luscher, B., and Amati, B. (2007). Methylation of histone H3R2 by PRMT6 and H3K4 by an MLL complex are mutually exclusive. Nature 449, 933-937. Guo, B., Panagiotaki, N., Warwood, S., and Sharrocks, A.D. (2011). Dynamic modification of the ETS transcription factor PEA3 by sumoylation and p300-mediated acetylation. Nucleic acids research 39, 6403-6413. Guo, B., and Sharrocks, A.D. (2009). Extracellular signal-regulated kinase mitogen- activated protein kinase signaling initiates a dynamic interplay between sumoylation and ubiquitination to regulate the activity of the transcriptional activator PEA3. Mol Cell Biol 29, 3204-3218. Guo, D., Li, M., Zhang, Y., Yang, P., Eckenrode, S., Hopkins, D., Zheng, W., Purohit, S., Podolsky, R.H., Muir, A., et al. (2004). A functional variant of SUMO4, a new I kappa B alpha modifier, is associated with type 1 diabetes. Nature genetics 36, 837-841. Hammerich-Hille, S., Kaipparettu, B.A., Tsimelzon, A., Creighton, C.J., Jiang, S., Polo, J.M., Melnick, A., Meyer, R., and Oesterreich, S. (2010). 
SAFB1 mediates repression of immune regulators and apoptotic genes in breast cancer cells. The Journal of biological chemistry 285, 3608-3616. Hattersley, N., Shen, L., Jaffray, E.G., and Hay, R.T. (2011). The SUMO protease SENP6 is a direct regulator of PML nuclear bodies. Molecular biology of the cell 22, 78- 90. Hecker, C.M., Rabiller, M., Haglund, K., Bayer, P., and Dikic, I. (2006). Specification of SUMO1- and SUMO2-interacting motifs. The Journal of biological chemistry 281, 16117-16127. Hernandez-Hernandez, J.M., Mallappa, C., Nasipak, B.T., Oesterreich, S., and Imbalzano, A.N. (2013). The Scaffold attachment factor b1 (Safb1) regulates myogenic differentiation by facilitating the transition of myogenic gene chromatin from a repressed to an activated state. Nucleic acids research 41, 5704-5716. Holmstrom, S., Van Antwerp, M.E., and Iniguez-Lluhi, J.A. (2003). Direct and distinguishable inhibitory roles for SUMO isoforms in the control of transcriptional synergy. Proceedings of the National Academy of Sciences of the United States of America 100, 15758-15763.

Hong, Y., Rogers, R., Matunis, M.J., Mayhew, C.N., Goodson, M.L., Park-Sarge, O.K., and Sarge, K.D. (2001). Regulation of heat shock transcription factor 1 by stress-induced SUMO-1 modification. J Biol Chem 276, 40263-40267. Huff, J.T., Plocik, A.M., Guthrie, C., and Yamamoto, K.R. (2010). Reciprocal intronic and exonic histone modification regions in humans. Nat Struct Mol Biol 17, 1495-1499. Jenke, A.C., Stehle, I.M., Herrmann, F., Eisenberger, T., Baiker, A., Bode, J., Fackelmayer, F.O., and Lipps, H.J. (2004). Nuclear scaffold/matrix attached region modules linked to a transcription unit are sufficient for replication and maintenance of a mammalian episome. Proceedings of the National Academy of Sciences of the United States of America 101, 11322-11327. Johnson, E.S., and Blobel, G. (1999). Cell cycle-regulated attachment of the ubiquitin- related protein SUMO to the yeast septins. The Journal of cell biology 147, 981-994. Justin, N., De Marco, V., Aasland, R., and Gamblin, S.J. (2010). Reading, writing and editing methylated lysines on histone tails: new insights from recent structural studies. Current opinion in structural biology 20, 730-738. Kang, J.S., Saunier, E.F., Akhurst, R.J., and Derynck, R. (2008). The type I TGF-beta receptor is covalently modified and regulated by sumoylation. Nature cell biology 10, 654-664. Kang, X., Qi, Y., Zuo, Y., Wang, Q., Zou, Y., Schwartz, R.J., Cheng, J., and Yeh, E.T. (2010). SUMO-specific protease 2 is essential for suppression of polycomb group protein-mediated gene silencing during embryonic development. Mol Cell 38, 191-201. Kessler, J.D., Kahle, K.T., Sun, T., Meerbrey, K.L., Schlabach, M.R., Schmitt, E.M., Skinner, S.O., Xu, Q., Li, M.Z., Hartman, Z.C., et al. (2012). A SUMOylation-dependent transcriptional subprogram is required for Myc-driven tumorigenesis. Science 335, 348- 353. Kim, J., Cantwell, C.A., Johnson, P.F., Pfarr, C.M., and Williams, S.C. (2002). 
Transcriptional activity of CCAAT/enhancer-binding proteins is controlled by a conserved inhibitory domain that is a target for sumoylation. J Biol Chem 277, 38037-38044. Kim, J.H., Choi, H.J., Kim, B., Kim, M.H., Lee, J.M., Kim, I.S., Lee, M.H., Choi, S.J., Kim, K.I., Kim, S.I., et al. (2006). Roles of sumoylation of a reptin chromatin-remodelling complex in cancer metastasis. Nature cell biology 8, 631-639. Kim, J.H., Lee, J.M., Nam, H.J., Choi, H.J., Yang, J.W., Lee, J.S., Kim, M.H., Kim, S.I., Chung, C.H., Kim, K.I., et al. (2007). SUMOylation of pontin chromatin-remodeling complex reveals a signal integration code in prostate cancer cells. Proceedings of the National Academy of Sciences of the United States of America 104, 20793-20798. Kim, Y.H., Choi, C.Y., and Kim, Y. (1999). Covalent modification of the homeodomain-interacting protein kinase 2 (HIPK2) by the ubiquitin-like protein SUMO-1. Proc Natl Acad Sci U S A 96, 12350-12355. Kotaja, N., Karvonen, U., Janne, O.A., and Palvimo, J.J. (2002). The nuclear receptor interaction domain of GRIP1 is modulated by covalent attachment of SUMO-1. J Biol Chem 277, 30283-30288. Lee, F.Y., Faivre, E.J., Suzawa, M., Lontok, E., Ebert, D., Cai, F., Belsham, D.D., and Ingraham, H.A. (2011). Eliminating SF-1 (NR5A1) sumoylation in vivo results in ectopic hedgehog signaling and disruption of endocrine development. Developmental cell 21, 315-327. Lee, S.W., Lee, M.H., Park, J.H., Kang, S.H., Yoo, H.M., Ka, S.H., Oh, Y.M., Jeon, Y.J., and Chung, C.H. (2012). SUMOylation of hnRNP-K is required for p53-mediated cell-cycle arrest in response to DNA damage. The EMBO journal 31, 4441-4452. Lee, J.S., Choi, H.J., and Baek, S.H. (2009). Sumoylation and its contribution to cancer. In SUMO Regulation of Cellular Processes, V.G. Wilson, ed. (Springer Netherlands), pp. 253-272. Lehembre, F., Badenhorst, P., Muller, S., Travers, A., Schweisguth, F., and Dejean, A. (2000). 
Covalent modification of the transcriptional repressor tramtrack by the ubiquitin- related protein Smt3 in Drosophila flies. Mol Cell Biol 20, 1072-1082. Lin, D.Y., Huang, Y.S., Jeng, J.C., Kuo, H.Y., Chang, C.C., Chao, T.T., Ho, C.C., Chen, Y.C., Lin, T.P., Fang, H.I., et al. (2006). Role of SUMO-interacting motif in Daxx SUMO modification, subnuclear localization, and repression of sumoylated transcription factors. Mol Cell 24, 341-354. Lin, X., Liang, M., Liang, Y.Y., Brunicardi, F.C., Melchior, F., and Feng, X.H. (2003). Activation of transforming growth factor-beta signaling by SUMO-1 modification of tumor suppressor Smad4/DPC4. J Biol Chem 278, 18714-18719. Liu, H.W., Zhang, J., Heine, G.F., Arora, M., Gulcin Ozer, H., Onti-Srinivasan, R., Huang, K., and Parvin, J.D. (2012). Chromatin modification by SUMO-1 stimulates the promoters of translation machinery genes. Nucleic acids research 40, 10172-10186. Luco, R.F., Allo, M., Schor, I.E., Kornblihtt, A.R., and Misteli, T. (2011). Epigenetics in Alternative Pre-mRNA Splicing. Cell 144, 16-26. Lyst, M.J., Nan, X., and Stancheva, I. (2006). Regulation of MBD1-mediated transcriptional repression by SUMO and PIAS proteins. The EMBO journal 25, 5317- 5328. Mabb, A.M., and Miyamoto, S. (2007). SUMO and NF-kappaB ties. Cell Mol Life Sci 64, 1979-1996. Matafora, V., D'Amato, A., Mori, S., Blasi, F., and Bachi, A. (2009a). Proteomics analysis of nucleolar SUMO-1 target proteins upon proteasome inhibition. Molecular & cellular proteomics : MCP 8, 2243-2255. Matafora, V., D'Amato, A., Mori, S., Blasi, F., and Bachi, A. (2009b). Proteomics analysis of nucleolar SUMO-1 target proteins upon proteasome inhibition. Mol Cell Proteomics 8, 2243-2255. Metzler-Guillemain, C., Depetris, D., Luciani, J.J., Mignon-Ravix, C., Mitchell, M.J., and Mattei, M.G. (2008). In human pachytene spermatocytes, SUMO protein is restricted to the constitutive heterochromatin. Chromosome Res 16, 761-782. 
Mo, Y.Y., Yu, Y., Theodosiou, E., Ee, P.L., and Beck, W.T. (2005). A role for Ubc9 in tumorigenesis. Oncogene 24, 2677-2683. Monsalve, M., Wu, Z., Adelmant, G., Puigserver, P., Fan, M., and Spiegelman, B.M. (2000). Direct coupling of transcription and mRNA processing through the thermogenic coactivator PGC-1. Mol Cell 6, 307-316.

Morris, J.R., Boutell, C., Keppler, M., Densham, R., Weekes, D., Alamshah, A., Butler, L., Galanty, Y., Pangon, L., Kiuchi, T., et al. (2009). The SUMO modification pathway is involved in the BRCA1 response to genotoxic stress. Nature 462, 886-890. Mukhopadhyay, D., Ayaydin, F., Kolli, N., Tan, S.H., Anan, T., Kametaka, A., Azuma, Y., Wilkinson, K.D., and Dasso, M. (2006). SUSP1 antagonizes formation of highly SUMO2/3-conjugated species. The Journal of cell biology 174, 939-949. Muller, S., Berger, M., Lehembre, F., Seeler, J.S., Haupt, Y., and Dejean, A. (2000). c- Jun and p53 activity is modulated by SUMO-1 modification. J Biol Chem 275, 13321- 13329. Muller, S., Ledl, A., and Schmidt, D. (2004). SUMO: a regulator of gene expression and genome integrity. Oncogene 23, 1998-2008. Nacerddine, K., Lehembre, F., Bhaumik, M., Artus, J., Cohen-Tannoudji, M., Babinet, C., Pandolfi, P.P., and Dejean, A. (2005). The SUMO pathway is essential for nuclear integrity and chromosome segregation in mice. Developmental cell 9, 769-779. Nathan, D., Ingvarsdottir, K., Sterner, D.E., Bylebyl, G.R., Dokmanovic, M., Dorsey, J.A., Whelan, K.A., Krsmanovic, M., Lane, W.S., Meluh, P.B., et al. (2006). Histone sumoylation is a negative regulator in Saccharomyces cerevisiae and shows dynamic interplay with positive-acting histone modifications. Genes & development 20, 966-976. Nayler, O., Stratling, W., Bourquin, J.P., Stagljar, I., Lindemann, L., Jasper, H., Hartmann, A.M., Fackelmayer, F.O., Ullrich, A., and Stamm, S. (1998). SAF-B protein couples transcription and pre-mRNA splicing to SAR/MAR elements. Nucleic acids research 26, 3542-3549. Neyret-Kahn, H., Benhamed, M., Ye, T., Le Gras, S., Cossec, J.C., Lapaquette, P., Bischof, O., Ouspenskaia, M., Dasso, M., Seeler, J., et al. (2013). Sumoylation at chromatin governs coordinated repression of a transcriptional program essential for cell growth and proliferation. Genome research 23, 1563-1579. 
Nguyen, V.Q., Ranjan, A., Stengel, F., Wei, D., Aebersold, R., Wu, C., and Leschziner, A.E. (2013). Molecular architecture of the ATP-dependent chromatin-remodeling complex SWR1. Cell 154, 1220-1231. Nie, M., Xie, Y., Loo, J.A., and Courey, A.J. (2009). Genetic and proteomic evidence for roles of Drosophila SUMO in cell cycle control, Ras signaling, and early pattern formation. PloS one 4, e5905. Oesterreich, S., Lee, A.V., Sullivan, T.M., Samuel, S.K., Davie, J.R., and Fuqua, S.A. (1997). Novel nuclear matrix protein HET binds to and influences activity of the HSP27 promoter in human breast cancer cells. Journal of cellular biochemistry 67, 275-286. Oesterreich, S., Zhang, Q., Hopp, T., Fuqua, S.A., Michaelis, M., Zhao, H.H., Davie, J.R., Osborne, C.K., and Lee, A.V. (2000). Tamoxifen-bound estrogen receptor (ER) strongly interacts with the nuclear matrix protein HET/SAF-B, a novel inhibitor of ER-mediated transactivation. Molecular endocrinology 14, 369-381. Onishi, A., Peng, G.H., Hsu, C., Alexis, U., Chen, S., and Blackshaw, S. (2009). Pias3-dependent SUMOylation directs rod photoreceptor development. Neuron 61, 234-246. Ouyang, J., and Gill, G. (2009). SUMO engages multiple corepressors to regulate chromatin structure and transcription. Epigenetics : official journal of the DNA Methylation Society 4, 440-444.

Ouyang, J., Shi, Y., Valin, A., Xuan, Y., and Gill, G. (2009a). Direct binding of CoREST1 to SUMO-2/3 contributes to gene-specific repression by the LSD1/CoREST1/HDAC complex. Mol Cell 34, 145-154. Ouyang, J., Valin, A., and Gill, G. (2009b). Regulation of transcription factor activity by SUMO modification. Methods Mol Biol 497, 141-152. Panse, V.G., Hardeland, U., Werner, T., Kuster, B., and Hurt, E. (2004). A proteome- wide approach identifies sumoylated substrate proteins in yeast. The Journal of biological chemistry 279, 41346-41351. Pelisch, F., Gerez, J., Druker, J., Schor, I.E., Munoz, M.J., Risso, G., Petrillo, E., Westman, B.J., Lamond, A.I., Arzt, E., et al. (2010). The serine/arginine-rich protein SF2/ASF regulates protein sumoylation. Proceedings of the National Academy of Sciences of the United States of America 107, 16119-16124. Poulin, G., Dong, Y., Fraser, A.G., Hopper, N.A., and Ahringer, J. (2005). Chromatin regulation and sumoylation in the inhibition of Ras-induced vulval development in Caenorhabditis elegans. The EMBO journal 24, 2613-2623. Psakhye, I., and Jentsch, S. (2012). Protein group modification and synergy in the SUMO pathway as exemplified in DNA repair. Cell 151, 807-820. Pungaliya, P., Kulkarni, D., Park, H.J., Marshall, H., Zheng, H., Lackland, H., Saleem, A., and Rubin, E.H. (2007). TOPORS functions as a SUMO-1 E3 ligase for chromatin- modifying proteins. Journal of proteome research 6, 3918-3923. Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841-842. Ren, J., Gao, X., Jin, C., Zhu, M., Wang, X., Shaw, A., Wen, L., Yao, X., and Xue, Y. (2009). Systematic study of protein sumoylation: Development of a site-specific predictor of SUMOsp 2.0. Proteomics 9, 3409-3412. Rodriguez, M.S., Desterro, J.M., Lain, S., Midgley, C.A., Lane, D.P., and Hay, R.T. (1999). SUMO-1 modification activates the transcriptional response of p53. 
The EMBO journal 18, 6455-6461. Rogakou, E.P., Pilch, D.R., Orr, A.H., Ivanova, V.S., and Bonner, W.M. (1998). DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139. The Journal of biological chemistry 273, 5858-5868. Rosonina, E., Bakowski, M.A., McCracken, S., and Blencowe, B.J. (2003). Transcriptional activators control splicing and 3'-end cleavage levels. The Journal of biological chemistry 278, 43034-43040. Rosonina, E., Duncan, S.M., and Manley, J.L. (2010). SUMO functions in constitutive transcription and during activation of inducible genes in yeast. Genes & development 24, 1242-1252. Ross, S., Best, J.L., Zon, L.I., and Gill, G. (2002). SUMO-1 modification represses Sp3 transcriptional activation and modulates its subnuclear localization. Mol Cell 10, 831-842. Sadasivam, S., Duan, S., and DeCaprio, J.A. (2012). The MuvB complex sequentially recruits B-Myb and FoxM1 to promote mitotic gene expression. Genes Dev 26, 474-489. Salomoni, P., and Pandolfi, P.P. (2002). The role of PML in tumor suppression. Cell 108, 165-170.

Santos-Rosa, H., Schneider, R., Bannister, A.J., Sherriff, J., Bernstein, B.E., Emre, N.C., Schreiber, S.L., Mellor, J., and Kouzarides, T. (2002). Active genes are tri-methylated at K4 of histone H3. Nature 419, 407-411. Sarge, K.D., and Park-Sarge, O.K. (2009). Sumoylation and human disease pathogenesis. Trends Biochem Sci 34, 200-205. Schulz, S., Chachami, G., Kozaczkiewicz, L., Winter, U., Stankovic-Valentin, N., Haas, P., Hofmann, K., Urlaub, H., Ovaa, H., Wittbrodt, J., et al. (2012). Ubiquitin-specific protease-like 1 (USPL1) is a SUMO isopeptidase with essential, non-catalytic functions. EMBO reports 13, 930-938. Segal, E., and Widom, J. (2009). From DNA sequence to transcriptional behaviour: a quantitative approach. Nature reviews Genetics 10, 443-456. Sehat, B., Tofigh, A., Lin, Y., Trocme, E., Liljedahl, U., Lagergren, J., and Larsson, O. (2010). SUMOylation mediates the nuclear translocation and signaling of the IGF-1 receptor. Science signaling 3, ra10. Seufert, W., Futcher, B., and Jentsch, S. (1995). Role of a ubiquitin-conjugating enzyme in degradation of S- and M-phase cyclins. Nature 373, 78-81. Shao, Z., Raible, F., Mollaaghababa, R., Guyon, J.R., Wu, C.T., Bender, W., and Kingston, R.E. (1999). Stabilization of chromatin structure by PRC1, a Polycomb complex. Cell 98, 37-46. Shen, L.N., Geoffroy, M.C., Jaffray, E.G., and Hay, R.T. (2009). Characterization of SENP7, a SUMO-2/3-specific isopeptidase. The Biochemical journal 421, 223-230. Shen, T.H., Lin, H.K., Scaglioni, P.P., Yung, T.M., and Pandolfi, P.P. (2006). The mechanisms of PML-nuclear body formation. Mol Cell 24, 331-339. Shi, X., Hong, T., Walter, K.L., Ewalt, M., Michishita, E., Hung, T., Carney, D., Pena, P., Lan, F., Kaadige, M.R., et al. (2006). ING2 PHD domain links histone H3 lysine 4 methylation to active gene repression. Nature 442, 96-99. Shiio, Y., and Eisenman, R.N. (2003). Histone sumoylation is associated with transcriptional repression. 
Proceedings of the National Academy of Sciences of the United States of America 100, 13225-13230. Shikina, S., Ihara, S., and Yoshizaki, G. (2008). Culture conditions for maintaining the survival and mitotic activity of rainbow trout transplantable type A spermatogonia. Molecular reproduction and development 75, 529-537. Shima, H., Suzuki, H., Sun, J., Kono, K., Shi, L., Kinomura, A., Horikoshi, Y., Ikura, T., Ikura, M., Kanaar, R., et al. (2013). Activation of the SUMO modification system is required for the accumulation of RAD51 at sites of DNA damage. J Cell Sci 126, 5284- 5292. Shin, E.J., Shin, H.M., Nam, E., Kim, W.S., Kim, J.H., Oh, B.H., and Yun, Y. (2012). DeSUMOylating isopeptidase: a second class of SUMO protease. EMBO reports 13, 339-346. Shyu, Y.C., Lee, T.L., Ting, C.Y., Wen, S.C., Hsieh, L.J., Li, Y.C., Hwang, J.L., Lin, C.C., and Shen, C.K. (2005). Sumoylation of p45/NF-E2: nuclear positioning and transcriptional activation of the mammalian beta-like globin gene . Mol Cell Biol 25, 10365-10378.

119

Spektor, T.M., Congdon, L.M., Veerappan, C.S., and Rice, J.C. (2011). The UBC9 E2 SUMO conjugating enzyme binds the PR-Set7 histone methyltransferase to facilitate target gene repression. PloS one 6, e22785. Spivakov, M., and Fisher, A.G. (2007). Epigenetic signatures of stem-cell identity. Nature reviews Genetics 8, 263-271. Stielow, B., Kruger, I., Diezko, R., Finkernagel, F., Gillemans, N., Kong-a-San, J., Philipsen, S., and Suske, G. (2010). Epigenetic silencing of spermatocyte-specific and neuronal genes by SUMO modification of the transcription factor Sp3. PLoS genetics 6, e1001203. Stielow, B., Sapetschnig, A., Kruger, I., Kunert, N., Brehm, A., Boutros, M., and Suske, G. (2008). Identification of SUMO-dependent chromatin-associated transcriptional repression components by a genome-wide RNAi screen. Mol Cell 29, 742-754. Tagwerker, C., Flick, K., Cui, M., Guerrero, C., Dou, Y., Auer, B., Baldi, P., Huang, L., and Kaiser, P. (2006). A tandem affinity tag for two-step purification under fully denaturing conditions: application in ubiquitin profiling and protein complex identification combined with in vivocross-linking. Mol Cell Proteomics 5, 737-748. Tai, H.H., Geisterfer, M., Bell, J.C., Moniwa, M., Davie, J.R., Boucher, L., and McBurney, M.W. (2003). CHD1 associates with NCoR and histone deacetylase as well as with RNA splicing proteins. Biochemical and biophysical research communications 308, 170-176. Tan, J.A., Hall, S.H., Hamil, K.G., Grossman, G., Petrusz, P., and French, F.S. (2002). Protein inhibitors of activated STAT resemble scaffold attachment factors and function as interacting nuclear receptor coregulators. The Journal of biological chemistry 277, 16993-17001. Tan, M., Luo, H., Lee, S., Jin, F., Yang, J.S., Montellier, E., Buchou, T., Cheng, Z., Rousseaux, S., Rajagopal, N., et al. (2011). Identification of 67 histone marks and histone lysine crotonylation as a new type of histone modification. Cell 146, 1016-1028. 
Tatham, M.H., Matic, I., Mann, M., and Hay, R.T. (2011). Comparative proteomic analysis identifies a role for SUMO in protein quality control. Science signaling 4, rs4. Terui, Y., Saad, N., Jia, S., McKeon, F., and Yuan, J. (2004). Dual role of sumoylation in the nuclear localization and transcriptional activation of NFAT1. J Biol Chem 279, 28257-28265. Tian, S., Poukka, H., Palvimo, J.J., and Janne, O.A. (2002). Small ubiquitin-related modifier-1 (SUMO-1) modification of the glucocorticoid receptor. The Biochemical journal 367, 907-911. Trapnell, C., Pachter, L., and Salzberg, S.L. (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105-1111. Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M.J., Salzberg, S.L., Wold, B.J., and Pachter, L. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28, 511-515. Uchimura, Y., Ichimura, T., Uwada, J., Tachibana, T., Sugahara, S., Nakao, M., and Saitoh, H. (2006). Involvement of SUMO modification in MBD1- and MCAF1-mediated heterochromatin formation. The Journal of biological chemistry 281, 23180-23190. 120

Vassileva, M.T., and Matunis, M.J. (2004). SUMO modification of heterogeneous nuclear ribonucleoproteins. Mol Cell Biol 24, 3623-3632. Velma, V., Carrero, Z.I., Cosman, A.M., and Hebert, M.D. (2010). Coilin interacts with Ku proteins and inhibits in vitro non-homologous DNA end joining. FEBS letters 584, 4735-4739. Vertegaal, A.C., Andersen, J.S., Ogg, S.C., Hay, R.T., Mann, M., and Lamond, A.I. (2006). Distinct and overlapping sets of SUMO-1 and SUMO-2 target proteins revealed by quantitative proteomics. Molecular & cellular proteomics : MCP 5, 2298-2310. Vethantham, V., Rao, N., and Manley, J.L. (2007). Sumoylation modulates the assembly and activity of the pre-mRNA 3' processing complex. Mol Cell Biol 27, 8848-8858. Vethantham, V., Rao, N., and Manley, J.L. (2008). Sumoylation regulates multiple aspects of mammalian poly(A) polymerase function. Genes & development 22, 499-511. Wang, J., Feng, X.H., and Schwartz, R.J. (2004). SUMO-1 modification activated GATA4-dependent cardiogenic gene activity. J Biol Chem 279, 49091-49098. Wang, J., Li, A., Wang, Z., Feng, X., Olson, E.N., and Schwartz, R.J. (2007). Myocardin sumoylation transactivates cardiogenic genes in pluripotent 10T1/2 fibroblasts. Mol Cell Biol 27, 622-632. Ward, I.M., Minn, K., and Chen, J. (2004). UV-induced ataxia-telangiectasia-mutated and Rad3-related (ATR) activation requires replication stress. The Journal of biological chemistry 279, 9677-9680. Warner, J.R., Vilardell, J., and Sohn, J.H. (2001). Economics of ribosome biosynthesis. Cold Spring Harbor symposia on quantitative biology 66, 567-574. Watts, F.Z. (2007). The role of SUMO in chromosome segregation. Chromosoma 116, 15-20. Wei, F., Scholer, H.R., and Atchison, M.L. (2007). Sumoylation of Oct4 enhances its stability, DNA binding, and transactivation. The Journal of biological chemistry 282, 21551-21560. Werner, A., Flotho, A., and Melchior, F. (2012). The RanBP2/RanGAP1*SUMO1/Ubc9 complex is a multisubunit SUMO E3 ligase. 
Mol Cell 46, 287-298. Wohlschlegel, J.A., Johnson, E.S., Reed, S.I., and Yates, J.R., 3rd (2004). Global analysis of protein sumoylation in Saccharomyces cerevisiae. The Journal of biological chemistry 279, 45662-45668. Wu, F., Zhu, S., Ding, Y., Beck, W.T., and Mo, Y.Y. (2009). MicroRNA-mediated regulation of Ubc9 expression in cancer cells. Clin Cancer Res 15, 1550-1557. Wu, S.Y., and Chiang, C.M. (2009). Crosstalk between sumoylation and acetylation regulates p53-dependent chromatin transcription and DNA binding. Embo J 28, 1246- 1259. Xu, X.M., Rose, A., Muthuswamy, S., Jeong, S.Y., Venkatakrishnan, S., Zhao, Q., and Meier, I. (2007). NUCLEAR PORE ANCHOR, the Arabidopsis homolog of Tpr/Mlp1/Mlp2/megator, is involved in mRNA export and SUMO homeostasis and affects diverse aspects of plant development. The Plant cell 19, 1537-1548. Yan, Q., Gong, L., Deng, M., Zhang, L., Sun, S., Liu, J., Ma, H., Yuan, D., Chen, P.C., Hu, X., et al. (2010). Sumoylation activates the transcriptional activity of Pax-6, an

121 important transcription factor for eye and brain development. Proc Natl Acad Sci U S A 107, 21034-21039. Yang, S.H., Jaffray, E., Hay, R.T., and Sharrocks, A.D. (2003). Dynamic interplay of the SUMO and ERK pathways in regulating Elk-1 transcriptional activity. Mol Cell 12, 63- 74. Yang, S.H., and Sharrocks, A.D. (2004). SUMO promotes HDAC-mediated transcriptional repression. Mol Cell 13, 611-617. Yang, S.H., and Sharrocks, A.D. (2006). PIASxalpha differentially regulates the amplitudes of transcriptional responses following activation of the ERK and p38 MAPK pathways. Mol Cell 22, 477-487. Yang, Y., Fu, W., Chen, J., Olashaw, N., Zhang, X., Nicosia, S.V., Bhalla, K., and Bai, W. (2007). SIRT1 sumoylation regulates its deacetylase activity and cellular response to genotoxic stress. Nature cell biology 9, 1253-1262. Yuan, H., Zhou, J., Deng, M., Liu, X., Le Bras, M., de The, H., Chen, S.J., Chen, Z., Liu, T.X., and Zhu, J. (2010). Small ubiquitin-related modifier paralogs are indispensable but functionally redundant during early development of zebrafish. Cell Res 20, 185-196. Zhang, F.P., Mikkonen, L., Toppari, J., Palvimo, J.J., Thesleff, I., and Janne, O.A. (2008a). Sumo-1 function is dispensable in normal mouse development. Mol Cell Biol 28, 5381-5390. Zhang, H., Smolen, G.A., Palmer, R., Christoforou, A., van den Heuvel, S., and Haber, D.A. (2004). SUMO modification is required for in vivo Hox gene regulation by the Caenorhabditis elegans Polycomb group protein SOP-2. Nature genetics 36, 507-511. Zhang, X.D., Goeres, J., Zhang, H., Yen, T.J., Porter, A.C., and Matunis, M.J. (2008b). SUMO-2/3 modification and binding regulate the association of CENP-E with kinetochores and progression through mitosis. Mol Cell 29, 729-741. Zhao, X., and Blobel, G. (2005). A SUMO ligase is part of a nuclear multiprotein complex that affects DNA repair and chromosomal organization. Proceedings of the National Academy of Sciences of the United States of America 102, 4777-4782. 
Zhou, F., Xue, Y., Lu, H., Chen, G., and Yao, X. (2005). A genome-wide analysis of sumoylation-related biological processes and functions in human nucleus. FEBS Lett 579, 3369-3375. Zhu, J., Zhou, J., Peres, L., Riaucoux, F., Honore, N., Kogan, S., and de The, H. (2005). A sumoylation site in PML/RARA is essential for leukemic transformation. Cancer cell 7, 143-153. Zhu, S., Sachdeva, M., Wu, F., Lu, Z., and Mo, Y.Y. (2010). Ubc9 promotes breast cell invasion and metastasis in a sumoylation-independent manner. Oncogene 29, 1763-1772.

122

Appendix: Supplementary information in this study

* Primers and siRNAs used in this thesis

Primers for RT-qPCR

Gene name (RefSeq)           Sequence (5’ to 3’)
RPL3                         TTCTCTGTGGCACGCGCTGG
                             CCAGCAAGGACTTGCGGAGGG
RPL5                         GGCCCGCAGGCTTCTCAATAGG
                             GTGAAGGCACCTGGCTGACCA
RPL7A                        ACGTGGATCCCATCGAGCTGGT
                             CCACCCCAGTGACGGCGGAT
RPL10A                       TTAGCGCGGCGTGAGAAGCC
                             TCCAGGAACTTGCGGCGCTT
RPL17                        GGTGATCTGTGAAAATGGTTCGC
                             ATGCTGAATTTACTCCCGTGC
RPL23                        GGTGGGCGGGGCGTTAAAGT
                             CCCACCACGTCCTCGCTTCG
RPL26                        AGCGGGAGCGGCCAAAATGA
                             TTCCCGCTGCACCCGTTCAA
SLC1A3                       TGGCAAAACCATTCCACCCAACA
                             ACCGTGTCCGGGATTTGGGTCT
Polr2a                       ATCTCTCCTGCCATGACACC
                             AGACCAGGCAGGGGAGTAAC
Pre-RPL26 (exon1/intron1)    GGCTTTCCGTTCGAGGATCT
                             TTAGGCATCCACCTACCCCA
Pre-RPL7A (exon4/intron4)    GACGTCCCAACGAAGAGACC
                             CCCCCAGTGTTCACCCTAAG

Primers for ChIP-qPCR

Gene name (RefSeq)    Sequence (5’ to 3’)
IL2                   TCTGCCTGCTTTCTGTGAAACTCAA
                      GGACAAGCCTCATCCCAAACTCCA
EIF3F                 GTTTCTCTTCGAACGCCGT
                      GAGCACTGAAATAGTCCCGC
RPS27                 CAGGATTTCCGCTTTCGCTC
                      ACAGAACAGCGAGATCTCCG
RPL26                 GACCTATGTCTCTCGGAGCG
                      GTTTTGCAATCCCTCGCAGT
RPL38                 TGATCCTCGGCAGGCACCGT
                      CAGCAAGCAGCAACCGGGGA
RPL5                  CGTCACTGGCGTGACCGTCC
                      CTGCGGAACAGAGACCGGCG
RPL7A                 CCGCCGCCCAAGATGGTGAG
                      GGCAGCGGATACAGCCGGAA
RPL10A                GGCGCTCAGGACTGCGACAA
                      GCTCCTCGGTCCTACCCGCA
RPL3                  CCTTCGGAGTGCACCAGCGG
                      TTGCTTTAGGGGCACGGGCG
RPL23                 AATCCGCCAGCCACTGCACG
                      TTGCTCCGGCCACGTGAGGA

siRNAs

Gene name    Sequence
SUMO-1       CAAAAAAUCCCGAUGGCAC
Ubc9         CUGGGAAUGGAGGAAGAAG
SAFB1        GUAAUCCUGACGAAAUUGA
SAFB2        GAAUAGCAGUGCUCCAGAU


* Ingenuity Pathway Analysis of 127 genes with constant SUMO-1 marks on active promoters during interphase.

Category: Protein Synthesis
p-Value: 6.68E-08
# Molecules: 19
Molecules: ALG3, DTL, EEF2, MRPL13, MYC, PSMB3, PSMC2, RPL13, RPL18, RPL32, RPL36, RPL38, RPL18A, RPL36A, RPS24, RPS27, RPS29, TNRC6B, WARS

Category: Cancer
p-Value: 1.40E-04
# Molecules: 49
Molecules: ABCD3, AHCYL1, AKAP12, ALG3, ANXA2, BMP6, CD9, CD58, CDKN1C, CNP, CRYZ, CX3CL1, CYR61, DUSP1, ETFA, FTH1, ILF2, JAK1, LASP1, LASS2, LEPR, LRRC8D, MAP1LC3B, MRCL3, MRPL13, MYC, PDCD6, PERP, PHLDA2, PPP1R3C, PRKACB, PRMT1, PSMD4, PTMA, PTPRF, RAE1, S100A2, SLC7A5, SLC9A3R1, STAT3, TMEM106C, TNRC6B, TPD52, TPD52L1, TPM1, TRIP13, USP1, WARS, ZYX

Category: Cell Cycle
p-Value: 4.25E-04
# Molecules: 20
Molecules: AKAP12, ALG3, BMP6, CDKN1C, CREG1, CYR61, DDB1, DTL, DUSP1, GNAI3, MYC, PDCD6, PRMT5, PRPF4, PTPRF, RPS27L, STAT3, TPD52L1, TPM1, TRIP13

Category: Other
p-Value: -
# Molecules: 67
Molecules: ACADM, AP3D1, BASP1, BHLHB2, BRD9, BRMS1, CCDC72, CLPTM1, CS, DDT, DDX23, DDX27, DDX47, DNASE2, DPP8, DYNC1H1, FAM32A, GGA1, GRPEL1, GTF2IRD1, HMGN4, HSPA9, IARS2, KIAA0100, KIAA0947, LTA4H, M6PRBP1, MAPKAPK3, MARCH6, MLF2, MORF4L2, MRPS12, MTX1, NAPA, NPDC1, NPR3, NPTN, NUP85, OLFML2A, OXCT1, PCBP2, PERP, PH-4, POP5, PPM1J, PSMB7, PSMD1, RAB13, RHOD, S100A10, SEC16A, SEP15, TARS, TEGT, TMED2, TMEM161A, TNRC6B, TOR3A, TPD52, TSC22D3, TTC19, U2AF2, UBE2Q1, UROD, USP13, WDR74, WDR79


* Genes with significant expression changes upon SUMO-1 depletion, based on RNA-seq experiments (cutoff: FPKM > 1).

Columns: RefSeq gene; SUMO-1 peak in promoter in G1 phase (+ = present, - = none); Avg FPKM (Ctrlsi); Avg FPKM (SUMO1si); Fold Change; p-value; Locus

GAPDH  +  3165.56  2896.61  -1.09285  0.000554101  chr12:6513917-6517797
ACTB  +  2281.17  1923.1  -1.18619  3.48E-08  chr7:5533304-5536758
TMSB10  -  1617.74  1401.93  -1.15394  8.71E-05  chr2:84986273-84987310
GNB2L1  +  1379.36  1223.64  -1.12726  0.00228591  chr5:180596533-180603512
ENO1  -  1276.72  987.07  -1.29344  1.27E-09  chr1:8843649-8861367
RPS8  -  1251.12  933.284  -1.34056  1.23E-11  chr1:45013832-45016999
PKM2  +  1241.43  996.898  -1.24529  2.49E-07  chr15:70278423-70310738
RPL5  +  1161.8  722.933  -1.60706  0  chr1:93070181-93080069
B2M  -  1154.14  1579.76  1.36878  4.44E-16  chr15:42790976-42797649
HSPA8  +  1090.08  958.194  -1.13764  0.00359087  chr11:122433409-122438054
RPL4  -  1037.73  755.899  -1.37284  3.35E-11  chr15:64578706-64584238
SLC7A5  +  1031.66  1251.49  1.21308  4.36E-06  chr16:86421129-86460601
TPT1  -  1023.06  890.409  -1.14898  0.0024451  chr13:44809303-44813297
EEF2  +  992.521  816.635  -1.21538  3.65E-05  chr19:3927053-3936461
PFN1  +  981.316  731.969  -1.34065  1.94E-09  chr17:4789691-4792570
RPL26  +  971.445  391.778  -2.47958  0  chr17:8221558-8227290
CALR  +  951.483  596.969  -1.59386  0  chr19:12910413-12916304
ATP5B  -  946.618  1171.5  1.23756  1.08E-06  chr12:55318225-55326119
S100A11  -  920.46  707.935  -1.3002  1.51E-07  chr1:150271605-150276135
FTH1  +  911.095  1395.36  1.53152  0  chr11:61473931-61491708
HSP90AB1  +  907.395  779.016  -1.1648  0.00178944  chr6:44322826-44329592
KRT18  +  872.548  484.695  -1.8002  0  chr12:51628921-51632952
RPS16  +  864.082  736.862  -1.17265  0.00149221  chr19:44615686-44618458
RPL3  +  843.773  491.911  -1.7153  0  chr22:38038832-38045616
RPL23  -  826.394  671.805  -1.23011  6.70E-05  chr17:34259846-34263579
TUBA1B  -  776.795  614.691  -1.26372  1.45E-05  chr12:47807832-47811571
VIM  -  751.623  435.162  -1.72723  0  chr10:17310263-17319598
BSG  -  693.236  537.081  -1.29075  9.01E-06  chr19:522324-534493
FTL  -  690.837  1233.29  1.78521  0  chr19:54160377-54161948
KRT7  +  678.338  405.714  -1.67196  2.22E-16  chr12:50913220-50928976
TFRC  -  670.096  568.355  -1.17901  0.00387978  chr3:197260551-197293429
TUBA1C  +  631.553  471.667  -1.33898  1.61E-06  chr12:47945131-47953380
EEF1A1  +  630.907  519.952  -1.21339  0.00109238  chr6:74282193-74287476
CTSZ  -  627.842  511.921  -1.22644  0.000608803  chr20:57003637-57015704
NPM1  +  602.972  388.209  -1.55321  1.33E-11  chr5:170747312-170770492
HSP90AA1  -  594.872  371.225  -1.60246  1.01E-12  chr14:101616827-101675839
MYL6  -  590.916  449.453  -1.31474  1.25E-05  chr12:54838366-54841633
EIF4G2  +  585.943  453.567  -1.29186  4.23E-05  chr11:10775168-10787158
KRT8  +  531.433  374.041  -1.42079  1.95E-07  chr12:51577237-51585135
RPS19  -  528.147  689.392  1.3053  4.08E-06  chr19:47055827-47067324
RPL13A  -  519.182  334.026  -1.55432  3.22E-10  chr19:54682676-54687376
TMSL3  +  502.532  383.565  -1.31016  0.000182186  chrX:12903145-12905267
LGALS1  -  492.092  255.141  -1.92871  0  chr22:36401558-36405755
ASS1  +  490.189  307.008  -1.59667  1.29E-10  chr9:132309914-132366482
RPL7A  +  483.392  375.321  -1.28794  0.000232661  chr9:135204889-135208101
RPL10A  +  482.116  393.341  -1.22569  0.00274277  chr6:35544155-35546536
CD55  +  468.155  361.557  -1.29483  0.000223919  chr1:205561439-205600934
RPS9  -  460.391  376.197  -1.2238  0.00366156  chr19:59396537-59403327
SCD  +  444.112  279.021  -1.59168  1.17E-09  chr10:102096761-102114578
RPS25  -  435.905  337.471  -1.29168  0.000302572  chr11:118374061-118394267
S100P  -  431.097  236.341  -1.82405  1.12E-13  chr4:6746466-6749798
ANXA2P2  -  425.74  314.374  -1.35425  4.54E-05  chr9:33614222-33615532
SLC3A2  +  423.742  546.605  1.28995  8.47E-05  chr11:62380093-62412929
PRDX6  +  416.744  328.344  -1.26923  0.00123412  chr1:171713108-171724569
HNRNPA2B1  -  415.326  321.44  -1.29208  0.000561805  chr7:26196080-26206938
HDGF  +  410.071  281.867  -1.45484  1.29E-06  chr1:154978522-154988864
TAGLN2  -  379.476  205.769  -1.84418  1.55E-12  chr1:158154526-158161908
SPINK4  -  375.621  142.765  -2.63104  0  chr9:33209069-33501047
PSMA5  +  365.415  251.668  -1.45197  5.30E-06  chr1:109745994-109770560
TM4SF1  +  359.869  446.112  1.23965  0.0024295  chr3:150569494-150578258
UQCRH  -  355.619  454.048  1.27678  0.000559426  chr1:46541966-46555034
DBI  -  355.054  263.45  -1.34771  0.000243507  chr2:119840973-119846592
PTMA  +  353.498  265.704  -1.33042  0.000437733  chr2:232281478-232286494
COX4I1  -  348.771  429.39  1.23115  0.00391642  chr16:84390696-84398108
GPX4  -  340.213  252.518  -1.34728  0.000332962  chr19:1054935-1057787
MYL12B  -  334.074  435.61  1.30393  0.000269197  chr18:3252110-3268280
ANXA2  +  333.349  251.976  -1.32294  0.000802544  chr15:58426641-58477477
H19  -  310.893  185.012  -1.68039  2.27E-08  chr11:1972981-1975641
RBBP7  -  307.388  236.869  -1.29771  0.00257628  chrX:16772697-16798455
H2AFZ  -  302.823  212.105  -1.4277  6.99E-05  chr4:101088266-101090535
IFITM3  -  286.446  478.899  1.67186  5.96E-12  chr11:309672-310914
MGST3  +  285.35  216.446  -1.31834  0.00216804  chr1:163867073-163891479
PCBP1  -  284.363  199.675  -1.42413  0.000128487  chr2:70168088-70169836
IFI30  -  284.226  383.45  1.3491  0.000130394  chr19:18145578-18149927
PHB2  +  266.602  187.825  -1.41942  0.000214442  chr12:6944777-6950177
ECEL1  -  261.868  386.232  1.47491  1.21E-06  chr2:233052780-233060776
ANXA1  +  260.209  144.685  -1.79845  1.52E-08  chr9:74956600-74975127
CYBA  -  258.076  179.378  -1.43873  0.000182553  chr16:87237197-87244958
SERPINB6  +  250.202  181.426  -1.37909  0.000980085  chr6:2893391-2917089
HSPB1  -  247.495  164.468  -1.50482  4.86E-05  chr7:75769810-75771549
COX6C  -  246.976  343.455  1.39064  7.73E-05  chr8:100959547-100975071
SQSTM1  -  246.643  189.497  -1.30157  0.00153201  chr5:179157204-179218446
EIF3E  +  246.062  183.465  -1.34119  0.00261675  chr8:109283147-109330135
FKBP10  -  243.292  170.559  -1.42644  0.000375776  chr17:37222487-37232995
ID1  -  240.035  180.711  -1.32828  0.00399233  chr20:29656752-29657974
ILF2  +  237.655  146.588  -1.62124  4.21E-06  chr1:151901137-151910103
TXNRD1  -  236.811  104.429  -2.26767  3.04E-12  chr12:103133688-103268192
EIF2S3  +  232.8  298.778  1.28341  0.00431443  chrX:23982985-24006851
COX6A1  -  232.64  308.797  1.32736  0.00110618  chr12:119360286-119362912
CRIP1  -  229.783  114.976  -1.99853  1.35E-09  chr14:105024301-105026169
CLU  +  226.454  149.459  -1.51516  8.06E-05  chr8:27510367-27528244
PON2  -  225.835  299.728  1.3272  0.00131628  chr7:94872109-94902320
FDPS  +  222.012  157.956  -1.40553  0.000138641  chr1:153545162-153567533
PTMS  +  221.692  137.342  -1.61416  1.04E-05  chr12:6745801-6750379
ACADVL  -  213.496  141.459  -1.50924  0.000115705  chr17:7033933-7069309
LMNA  +  212.428  153.635  -1.38268  0.00269446  chr1:154351084-154376502
PTGES3  +  207.434  141.212  -1.46895  0.000423797  chr12:55343391-55368345
ATP1B3  -  206.665  342.083  1.65525  1.06E-08  chr3:143078159-143128072
EIF3H  -  206.603  98.3222  -2.10129  1.36E-09  chr8:117726235-117837243
ECH1  -  204.159  105.608  -1.93318  3.81E-08  chr19:43997901-44014337
LGALS3  -  203.47  140.024  -1.45311  0.000668302  chr14:54665624-54681901
DSTN  +  203.197  127.685  -1.59139  3.89E-05  chr20:17498598-17536652
S100A10  +  199.891  107.045  -1.86735  1.84E-07  chr1:150222009-150233338
STMN1  -  198.924  121.174  -1.64164  1.70E-05  chr1:26083264-26105955
LAPTM4B  -  198.369  263.426  1.32796  0.00255049  chr8:98856984-98934006
PSMA7  -  196.165  120.468  -1.62836  2.53E-05  chr20:60145185-60151869
AP2M1  -  190.846  269.025  1.40964  0.00028586  chr3:185375327-185384573
EIF3D  +  190.716  120.965  -1.57662  8.97E-05  chr22:35236842-35255223
COTL1  +  190.633  81.226  -2.34695  1.21E-10  chr16:83156704-83209170
ATP6V0C  -  190.103  270.033  1.42046  0.000209575  chr16:2503953-2510220
CBX3  -  187.39  122.597  -1.5285  0.000260417  chr7:26207623-26219501
GSTM1  -  186.907  69.7136  -2.68107  2.10E-12  chr1:110031940-110037889
MYC  +  186.51  135.464  -1.37682  0.00461513  chr8:128817496-128822860
ANP32B  +  185.238  112.767  -1.64266  3.25E-05  chr9:99785309-99818045
PEBP1  -  184.565  266.711  1.44508  0.000120479  chr12:117058252-117067773
JUP  +  182.595  242.119  1.32599  0.00400442  chr17:37164384-37196490
NPTX1  -  180.415  115.154  -1.56673  0.000167017  chr17:76055227-76064999
TMEM106C  +  179.06  98.3703  -1.82026  1.82E-06  chr12:46643596-46648927
FASN  -  177.218  124.116  -1.42784  0.00234278  chr17:77629502-77649395
LAMA1  -  177.009  291.775  1.64836  1.56E-07  chr18:6931885-7107813
PDHA1  -  171.12  89.3322  -1.91555  6.24E-07  chrX:19271931-19410315
GLO1  -  168.203  116.297  -1.44632  0.00221368  chr6:38751679-38778930
LAMP1  +  166.428  251.769  1.51278  3.42E-05  chr13:112999469-113025742
SEPP1  +  166.237  251.2  1.5111  3.39E-05  chr5:42792676-42847781
DEGS1  -  163.54  326.7  1.99768  6.26E-13  chr1:222437550-222447765
BCAP31  -  163.118  228.042  1.39802  0.00110101  chrX:152619140-152643395
CTSC  +  161.843  76.6107  -2.11254  1.30E-07  chr11:87666407-87710589
PRSS23  -  161.249  239.394  1.48462  0.000104956  chr11:86189138-86199921
CSRP1  +  158.203  245.658  1.5528  1.58E-05  chr1:199719282-199743010
RPL17  -  153.241  105.196  -1.45672  0.00307936  chr18:45268853-45272904
SF3B4  -  145.382  99.5702  -1.4601  0.00361797  chr1:148161834-148166326
NOLC1  -  141.839  64.63  -2.19463  1.63E-07  chr10:103901922-103913617
PA2G4  -  140.067  91.3182  -1.53383  0.00147036  chr12:54784369-54793961
TMED9  -  139.846  93.2077  -1.50037  0.00241195  chr5:176951818-176955705
TNFSF9  -  139.447  195.16  1.39953  0.00243392  chr19:6482009-6486939
TOB1  -  138.59  81.5237  -1.7  0.000143724  chr17:46294585-46296412
NDUFA13  -  137.914  222.521  1.61348  1.01E-05  chr19:19488018-19500013
SEC61A1  -  137.644  222.097  1.61356  1.03E-05  chr3:129253901-129273216
CDK4  -  136.978  89.296  -1.53398  0.00113387  chr12:56425050-56432431
EIF4A2  +  136.418  196.482  1.44029  0.000190722  chr3:187984054-188007178
PTP4A2  +  133.386  199.179  1.49325  0.000341518  chr1:32146379-32176575
MYBL2  -  131.828  54.7774  -2.40661  4.68E-08  chr20:41729122-41778536
SF3B2  -  130.052  79.7287  -1.63118  0.000581607  chr11:65576391-65592958
MOSC1  +  128.663  83.7188  -1.53685  0.00221052  chr1:219026661-219054363
PPT1  -  127.465  67.5144  -1.88797  2.48E-05  chr1:40310968-40335729
SLC1A3  +  125.652  209.534  1.66757  6.02E-06  chr5:36642213-36724193
PLOD2  +  123.48  71.3675  -1.7302  0.000227066  chr3:147269917-147361972
PRR13  -  119.754  56.6394  -2.11432  3.73E-06  chr12:52121699-52126694
ATP6V0B  -  119.548  190.993  1.59763  5.91E-05  chr1:44213188-44216559
C19orf53  +  118.008  76.9435  -1.5337  0.00351422  chr19:13746256-13750586
CHCHD10  -  117.264  76.5987  -1.53089  0.00374757  chr22:22435207-22440141
GNG11  -  117.082  49.8139  -2.35039  4.38E-07  chr7:93388951-93393762
NAP1L1  -  116.945  62.7968  -1.86228  7.12E-05  chr12:74724938-74765005
HEXB  -  112.645  168.612  1.49684  5.00E-05  chr5:74016724-74098798
MST4  -  112.526  68.6696  -1.63866  0.00128021  chrX:130984925-131037652
C19orf33  +  111.15  48.4217  -2.29546  0  chr19:43486039-43498446
RBM3  -  110.887  63.4101  -1.74873  0.000385613  chrX:48317779-48321748
ITGB5  -  108.699  59.2342  -1.83507  0.000170556  chr3:125964484-126088834
ADIPOR1  +  107.962  66.8477  -1.61504  0.00206937  chr1:201176583-201194323
CTNNB1  +  105.613  173.743  1.64509  5.53E-05  chr3:41215945-41256943
UBE2T  +  105.121  67.4103  -1.55942  0.00440647  chr1:200567408-200577717
LASS2  +  104.532  161.113  1.54128  0.00057195  chr1:149204272-149214064
CS  -  104.091  161.047  1.54718  0.000520046  chr12:54951749-54980442
C20orf199  -  102.885  128.036  1.24446  0.00283344  chr20:47295845-47339202
TMEM147  -  99.2532  160.971  1.62182  0.000151308  chr19:40728384-40730268
TMEM59  -  98.8622  162.825  1.64699  9.10E-05  chr1:54269936-54291699
GABARAPL1  +  98.8065  172.888  1.74976  9.15E-06  chr12:10256755-10266991
TPX2  -  98.4896  55.8945  -1.76206  0.000717693  chr20:29790564-29853264
PPIF  -  96.5749  160.304  1.65989  8.35E-05  chr10:80777225-80785095
AKR1B1  -  96.5109  197.005  2.04127  9.29E-09  chr7:133777646-133794428
LITAF  -  95.8435  151.113  1.57666  0.000488827  chr16:11549082-11588823
RNASEH2A  -  95.5401  48.3685  -1.97525  0.000114639  chr19:12778427-12785462
WBP11  -  95.4644  56.026  -1.70393  0.00154206  chr12:14830678-14847668
BAG1  -  95.0599  128.435  1.3511  0.000411526  chr9:33209069-33501047
ADAM9  -  94.2488  37.0207  -2.54584  1.45E-06  chr8:38973661-39081936
GSTM3  -  93.7458  42.9839  -2.18095  2.30E-05  chr1:110078076-110085183
CDK2AP1  -  93.3409  55.2333  -1.68994  0.00199624  chr12:122311492-122322640
EID1  -  91.1688  154.638  1.69617  6.01E-05  chr15:46903226-47042933
IMPDH2  -  91.0115  50.3116  -1.80896  0.0007408  chr3:49036765-49041879
MAZ  -  90.1238  54.7056  -1.64743  0.00359638  chr16:29725355-29730005
KIAA0100  +  89.8762  134.494  1.49644  0.00309037  chr17:23965584-23996300
UGP2  -  87.7553  32.9111  -2.66643  1.60E-06  chr2:63921601-63972200
C19orf10  -  87.3189  39.1076  -2.23279  2.98E-05  chr19:4608556-4621415
SEC13  -  86.9742  44.091  -1.97261  0.000238159  chr3:10317614-10337858
BZW1  -  86.1074  163.471  1.89845  3.00E-08  chr2:201384891-201396805
PSIP1  +  85.9316  51.1011  -1.6816  0.00340465  chr9:15454064-15501003
CPA4  -  85.1551  250.81  2.94533  0  chr7:129720209-129751255
VCL  -  84.9801  20.7938  -4.0868  8.72E-09  chr10:75427877-75549920
TOP2A  -  83.1094  49.8408  -1.6675  0.00431588  chr17:35798321-35827695
GALNT2  -  83.1042  264.266  3.17994  0  chr1:228269578-228484498
ITM2B  -  81.4877  126.546  1.55295  0.00194256  chr13:47705274-47734233
MCAM  +  81.0242  155.128  1.91459  2.15E-06  chr11:118684443-118693050
SNX2  -  80.7277  130.759  1.61975  0.000656259  chr5:122138648-122193701
EIF5B  +  80.3231  37.1475  -2.16227  0.000101677  chr2:99320265-99383160
IFITM1  -  79.9484  235.693  2.94806  0  chr11:303990-305272
LXN  -  79.2776  40.7719  -1.94442  2.76E-10  chr3:159845010-159893054
C19orf28  -  79.0613  135.413  1.71276  0.000116645  chr19:3489262-3508571
VAT1  -  78.3046  135.974  1.73648  0.00010022  chr17:38420147-38427985
CUL4A  -  76.3783  118.659  1.55357  0.00267249  chr13:112911086-112967393
TAF10  +  75.1144  40.2543  -1.866  6.85E-12  chr11:6581539-6590021
CD59  -  72.7482  118.198  1.62475  0.00112573  chr11:33681131-33714601
CREG1  -  72.596  112.244  1.54615  0.00381191  chr1:165776874-165789680
SERPINB5  -  72.0739  123.307  1.71084  0.000292741  chr18:59295123-59323298
WDR34  +  71.6006  51.8272  -1.38153  3.22E-07  chr9:130354686-130458950
GSR  -  69.6607  36.2617  -1.92105  0.00143132  chr8:30655976-30704985
INPPL1  -  69.2457  125.663  1.81474  6.84E-05  chr11:71613529-71632868
TGFBR1  +  68.4929  164.316  2.39902  1.17E-09  chr9:100907232-100956294
CHTF8  -  66.0538  32.9347  -2.0056  0.000931795  chr16:67697660-67723985
ENO3  -  65.1649  23.4806  -2.77527  2.23E-05  chr17:4795130-4801148
MGAT4B  -  64.5175  99.976  1.5496  1.37E-08  chr5:179157204-179218446
PLAC8  -  64.3183  34.9234  -1.8417  0.00380109  chr4:84230234-84254935
AIMP2  -  61.9627  45.056  -1.37524  9.70E-06  chr7:6015407-6065386
ACSL4  -  61.8122  30.4337  -2.03104  0.00137657  chrX:108771219-108863277
OSMR  +  61.2465  136.918  2.23552  1.67E-07  chr5:38881892-38970159
DPH5  +  61.1415  27.7375  -2.20429  0.000555182  chr1:101227768-101263950
SDF4  -  61.0433  103.459  1.69485  0.00107957  chr1:1142150-1157310
SLC39A6  -  60.9557  143.498  2.35414  2.72E-08  chr18:31942491-31963355
CCNG1  -  59.5776  133.896  2.24742  2.01E-07  chr5:162797154-162804600
FAM127A  -  58.3637  95.9345  1.64374  0.00275607  chrX:133993998-133995241
PCSK9  -  57.9063  27.754  -2.08641  0.00144467  chr1:55277807-55303111
ACAT2  -  57.2357  43.6153  -1.31228  0.00304108  chr6:160102978-160130725
NUP210  -  56.2593  27.8854  -2.01752  0.00244081  chr3:13332736-13436809
SUPT16H  -  55.8378  28.4818  -1.96047  0.00345997  chr14:20889471-20922265
TMEM2  +  55.558  108.087  1.94548  5.54E-05  chr9:73488101-73573620
NMRAL1  +  55.5213  20.6061  -2.69441  3.55E-13  chr16:4451695-4500349
PDE2A  -  55.3447  117.378  2.12085  4.01E-06  chr11:71964833-72063142
C6orf176  -  54.9793  96.3171  1.75188  0.000877378  chr6:166257525-166323093
RGS2  +  54.8423  140.469  2.56133  3.49E-09  chr1:191044791-191048029
DUSP16  +  54.8157  91.026  1.66058  0.00301194  chr12:12520097-12606584
SUMO1  -  54.5937  15.0985  -3.61584  9.94E-06  chr2:202779147-202811567
TAF6  -  53.9202  28.6359  -1.88296  4.08E-09  chr7:99528339-99554915
HIBADH  -  53.2536  24.1791  -2.20246  0.00128283  chr7:27531585-27669127
CEP55  -  52.4588  25.6568  -2.04464  0.00299841  chr10:95246358-95278839
WARS  +  51.9951  110.668  2.12843  7.14E-06  chr14:99869877-99912433
NNMT  -  51.9196  115.132  2.21751  1.90E-06  chr11:113671744-113688448
SERINC3  -  51.0097  85.0008  1.66637  0.00410761  chr20:42558277-42584140
SLC20A1  +  50.5805  92.2513  1.82385  0.000592955  chr2:113119997-113137871
ALDH3B1  -  50.1812  22.8196  -2.19904  0.00181032  chr11:67532623-67553319
TM7SF3  +  49.7401  123.522  2.48335  6.07E-08  chr12:27015774-27058606
RECQL  -  48.8933  26.4788  -1.84651  0.00299045  chr12:21481804-21545870
SLC12A3  -  48.5445  13.3464  -3.63727  2.94E-05  chr16:55456619-55507263
TWF1  -  47.7225  21.7257  -2.19659  0.00236201  chr12:42473792-42486445
DDAH1  +  47.6282  87.4272  1.83562  0.000744696  chr1:85556756-85816634
CDKN1A  -  47.5873  223.052  4.68722  0  chr6:36754436-36763087
CPNE1  -  46.0701  84.5303  1.83482  2.59E-06  chr20:33677379-33716262
DAB2  +  45.9759  87.5991  1.90533  0.000400434  chr5:39407536-39461092
ORMDL2  -  45.5432  27.0253  -1.68521  0.00138497  chr12:54498072-54509687
GDA  -  44.5292  78.5331  1.76363  0.00249032  chr9:73954112-74056960
TLCD1  +  44.3445  26.4232  -1.67824  1.69E-08  chr17:24071126-24078076
CALD1  -  43.9971  15.6236  -2.81607  0.000441386  chr7:134114703-134306019
ANKRD52  +  43.9409  79.0264  1.79847  0.00181459  chr12:54910086-54938410
LYPD3  +  42.5219  73.9633  1.73942  0.00402359  chr19:48656785-48661671
ITFG3  -  42.3167  18.1372  -2.33314  0.00253842  chr16:224801-256119
EPHX1  -  42.0285  87.8987  2.09141  3.88E-07  chr1:224064419-224136671
SERPINE2  +  41.7538  85.9276  2.05796  0.00013048  chr2:224548008-224612280
C1QTNF6  -  41.7514  106.882  2.55996  2.72E-07  chr22:35906151-35914276
STRA6  +  41.6114  18.252  -2.27983  0.00334441  chr15:72258862-72288424
DARS2  +  40.6674  19.7801  -2.05598  0.00160188  chr1:172035310-172094305
PPP2R1B  -  40.6551  99.7106  2.4526  5.83E-11  chr11:110978379-111142379
PLK2  +  39.7453  79.6388  2.00373  0.00034534  chr5:57785566-57791670
PHLDA3  -  39.4351  87.2682  2.21296  3.48E-05  chr1:199701244-199704922
DPP7  +  39.2716  70.104  1.78511  0.00364576  chr9:139124812-139129016
FBLN2  -  39.0498  16.5137  -2.36469  0.00336794  chr3:13565624-13654923
LSMD1  -  38.4845  24.9111  -1.54487  0.00242442  chr17:7700727-7706325
RPL18AP3  -  37.3588  73.0774  1.9561  5.43E-05  chr19:17831726-17835124
RASSF8  -  37.2068  67.6568  1.8184  0.00340213  chr12:26003230-26124091
PIK3R2  +  34.6606  77.0857  2.22402  9.29E-05  chr19:18125015-18142343
BZW1L1  -  32.7759  68.371  2.08601  1.55E-11  chr2:201384891-201396805
ERCC1  +  32.4645  17.7627  -1.82768  0.00164414  chr19:50574732-50619017
IGFBP3  +  31.9295  71.2642  2.23192  0.000163242  chr7:45918368-45927396
STAT3  +  31.9061  61.7416  1.9351  0.00246729  chr17:37718868-37794039
EPAS1  +  31.2381  87.672  2.80657  7.33E-07  chr2:46378044-46467346
C3  +  29.2586  72.041  2.46222  3.95E-05  chr19:6628845-6671662
DUSP3  -  28.8184  56.3714  1.95609  0.0033902  chr17:39199014-39211894
LRFN4  -  28.0421  43.0792  1.53623  0.00184424  chr11:66372572-66482423
KDELR3  -  27.2026  10.6988  -2.54258  0  chr22:37194028-37232291
SYNC  -  26.7377  4.93784  -5.41486  0  chr1:32889335-32940948
SAMD11  -  26.1827  15.6739  -1.67046  0.00134136  chr1:850983-884542
CUTC  -  24.5774  14.6735  -1.67495  5.58E-06  chr10:101409252-101505884
VWA5A  -  23.9336  67.0193  2.80022  1.61E-05  chr11:123491320-123522828
SCARNA1 2 22.9644 17.3226 -1.32569 1.28E-13 chr12:6944777-6950177 SUMO1P3 22.4883 4.87028 -4.61746 0 chr1:158525000-158595366 RAVER2 + 22.3907 37.2057 1.66166 3.13E-09 chr1:64983365-65204775 GDF15 + 22.1827 240.699 10.8508 0 chr19:18357967-18360986 PEG10 21.8543 46.3892 2.12266 0.00371946 chr7:94123572-94136942 SAT2 + 21.4619 33.9234 1.58063 0.000129845 chr17:7435271-7477425 TGFBR3 20.9234 54.1558 2.58829 0.000220305 chr1:91918489-92124375 C1S + 20.7275 47.8957 2.31073 0.00144645 chr12:7038240-7048594 ACOT13 19.8932 8.1734 -2.4339 8.81E-06 chr6:24775241-24827382 ADCY3 19.2447 6.78757 -2.83529 0.00400527 chr2:24869836-24995559 POLG + 19.0758 44.8 2.34853 7.92E-09 chr15:87588197-87679030 NR1H3 + 18.8129 9.23537 -2.03705 0.00106626 chr11:47217428-47246977 XPR1 18.5026 41.5556 2.24593 0.00379084 chr1:178867768-179126036 FAS + 18.4876 45.5618 2.46445 0.00103333 chr10:90684812-90765522 HNRPA1L- 2 18.3169 2.30226 -7.95605 0 chr12:52960754-52965297 LOC10027 0710 17.764 25.3435 1.42668 0.0019143 chr10:99463470-99467895 CFI 17.7457 47.8752 2.69785 0.000355605 chr4:110881296-110942784 HDDC3 + 16.9076 11.5681 -1.46157 0.00115782 chr15:89274413-89298327 CCDC18 + 16.0444 8.65494 -1.85379 0.00189955 chr1:93389996-93516856 CTU2 15.9992 13.5779 -1.17833 0.00125302 chr16:87300391-87378873 KRT17 15.278 97.9417 6.41064 1.43E-11 chr17:37029217-37034408 GBA2 14.373 35.346 2.45919 0 chr9:35687333-35742871 CHAC2 14.3552 8.53411 -1.6821 0.000100251 chr2:53750621-53940674 TRIM14 13.995 33.4725 2.39175 0.000102236 chr9:99858779-99921309 NME7 13.8901 27.1006 1.95107 4.44E-16 chr1:167342570-167603810 PBXIP1 13.6996 33.9256 2.47639 0.00461449 chr1:153183179-153195191 HTRA2 13.4761 7.66007 -1.75927 0.000741462 chr2:74607282-74634570 FAM120A OS 12.6614 7.08997 -1.78582 7.44E-08 chr9:95248602-95368218 SMAGP 12.4986 31.3078 2.5049 0 chr12:49918774-49950469 ATF5 12.4888 22.22 1.77919 0.000848645 chr19:55084722-55129004 MMACHC + 12.2308 6.6743 -1.83252 1.54E-12 chr1:45738442-45760196 C16orf53 12.0514 
3.96318 -3.04084 4.85E-05 chr16:29735028-29766842 PPAP2A 11.9033 18.9682 1.59352 0.000366191 chr5:54639332-54866630 KDELC1 10.6857 22.4204 2.09817 0.000775213 chr13:102234631-102291888 TMEM183 A + 10.0768 23.5762 2.33965 0.000312304 chr1:201243156-201259820 DDIT3 10.0762 14.6834 1.45724 2.39E-05 chr12:56168117-56200567 PLEKHA9 9.68312 5.31114 -1.82317 1.45E-05 chr12:43853113-44120454 TSPAN31 9.58298 5.06361 -1.89252 1.05E-05 chr12:56425050-56432431 CREB3L4 + 7.22463 4.13405 -1.74759 5.09E-07 chr1:152207020-152217075 HCCA2 5.91069 8.69964 1.47185 2.05E-06 chr11:1447268-1742077 LOC10030 2652 5.6682 9.91736 1.74965 0.000920715 chr2:53750621-53940674 MXD3 5.56056 2.26063 -2.45974 7.79E-08 chr5:176663440-176671898 5.36166 3.54 -1.51459 0.00451339 chr14:23675217-23680637 PIH1D2 5.27113 0.500478 -10.5322 0 chr11:111400747-111450105

YY2 5.26432 9.02292 1.71398 2.41E-05 chrX:21767576-21813461
LENG9 + 5.12877 2.53373 -2.0242 2.10E-05 chr19:59651876-59666706
FTHL3P 4.68882 8.02649 1.71184 1.32E-08 chr2:27457564-27486000
EIF3CL_dup2 3.84277 7.44152 1.9365 1.42E-05 chr16:28630282-28654554
CCDC24 3.19833 1.86549 -1.71447 0.00273831 chr1:44229866-44269721
C5orf45 3.1876 5.25934 1.64994 0 chr5:179157204-179218446
SUGT1P 3.12756 8.63422 2.76069 0 chr9:33209069-33501047
DUSP8 3.08337 6.63625 2.15227 1.23E-07 chr11:1447268-1742077
C18orf56 2.93561 5.53287 1.88474 0 chr18:586997-702662
EFCAB5 2.82826 0.95332 -2.96675 0.00325526 chr17:24977090-25459595
IQCG 2.6985 1.27731 -2.11264 3.39E-12 chr3:199100343-199171283
RPL23AP64 2.34397 3.83666 1.63682 1.11E-09 chr11:118374061-118394267
DNASE1 2.26202 3.65263 1.61476 5.07E-05 chr16:3642940-3707599
PPAN-P2RY11 2.18561 0.816961 -2.67529 1.99E-08 chr19:10077964-10091599
GPR75 2.16913 3.67753 1.69539 0.00229248 chr2:53750621-53940674
PART1 2.07086 4.20706 2.03155 3.72E-06 chr5:58300622-59879241
TYRO3P 1.91505 3.98377 2.08024 3.33E-15 chr15:74295683-74390865
C1orf182 + 1.87327 2.14174 1.14332 0.000388905 chr1:154545375-154583409
PLTP 1.82496 0 -18249.6 0 chr20:43950673-43974193
INE1 1.47898 3.00734 2.03339 0 chrX:46935142-46959471
LNP1 + 1.18628 0.547498 -2.16673 6.52E-07 chr3:101564996-101657860
DGCR11 1.00754 2.01666 2.00157 0.001946 chr22:17403794-17489967
TIAF1 0.989095 1.7083 1.72713 0.00309614 chr17:24424653-24531533
SLC15A3 0.827363 2.26874 2.74213 3.85E-07 chr11:60448488-60475833
HSN2 0.794177 2.17633 2.74036 0 chr12:732485-890879
CCDC152 0.756597 1.28685 1.70084 4.09E-05 chr5:42792676-42847781
LOC29034 0.542746 3.67656 6.774 0 chr2:211050653-211252076
PLAC4 0.147533 3.60753 24.4524 0 chr21:41461597-41570394
LOC338651 0 1.23633 12363.3 0 chr11:1447268-1742077
CCRL1 0 1.27969 12796.9 0 chr3:133759671-133879634
LPAR6 0 2.57967 25796.7 0 chr13:47775883-47954027
KRTAP5-1 0 2.67247 26724.7 0 chr11:1447268-1742077
SCARNA9 0 4.54132 45413.2 0 chr11:93034463-93103170

* MS data of identified chromatin-associated proteins pulled down by SUMO-1 from HeLa cells

Sequence coverage   Protein ID   Gene name
43%   gi 34932414   non-POU domain containing, octamer-binding [Homo sapiens]
20%   gi 38372432   SAFB2_HUMAN Scaffold attachment factor B2
34%   gi 15559354   Ran GTPase activating protein 1 [Homo sapiens]
16%   gi 21264343   scaffold attachment factor B [Homo sapiens]
16%   gi 68509926   DEAH (Asp-Glu-Ala-His) box polypeptide 15 [Homo sapiens]
28%   gi 38014635   SFPQ protein [Homo sapiens]
17%   gi 12644118   TOP1_HUMAN DNA topoisomerase I
61%   gi 61680867   B Chain B, Sumo Modified Ubiquitin Conjugating Enzyme E2-25k
34%   gi 133274   HNRPL_HUMAN Heterogeneous nuclear ribonucleoprotein L (hnRNP L)
19%   gi 24432016   pre-mRNA cleavage factor I, 59 kDa subunit [Homo sapiens]
16%   gi 3183179   KRAB-associated protein 1, KAP-1
5%   gi 34222504   Huntingtin-interacting protein HYPA/FBP11
20%   gi 55958677   OTTHUMP00000016039 [Homo sapiens]
12%   gi 17380155   NOP5_HUMAN Nucleolar protein NOP5 (Nucleolar protein 5) (NOP58)
9%   gi 4688900   sarcolectin [Homo sapiens]
6%   gi 55958672   antigen identified by monoclonal antibody Ki-67 [Homo sapiens]
21%   gi 21361376   splicing factor 3a, subunit 2 [Homo sapiens]
10%   gi 71051630   WIZ protein [Homo sapiens]
6%   gi 17978466   cyclin T1 [Homo sapiens]
7%   gi 5729790   CCCTC-binding factor [Homo sapiens]
44%   gi 47117890   H2AQ_HUMAN Histone H2A.q (H2A/q) (H2A-GL101)
33%   gi 9973351   H2BS_HUMAN Histone H2B.s (H2B/s)
4%   gi 42406352   ASCC3L1 protein [Homo sapiens]
9%   gi 12643409   MATR3_HUMAN Matrin 3
15%   gi 36054194   nuclear matrix transcription factor 4 [Homo sapiens]
43%   gi 12643341   H2AL_HUMAN Histone H2A.l (H2A/l)
6%   gi 74354337   Unknown (protein for MGC:111419) [Homo sapiens]
13%   gi 62897625   beta actin variant [Homo sapiens]
13%   gi 12056465   fibrillarin [Homo sapiens]
6%   gi 6822170   protein [Homo sapiens]
8%   gi 55859528   heterogeneous nuclear ribonucleoprotein U (SAFA) [Homo sapiens]
7%   gi 4505917   exosome component 10 isoform 2 [Homo sapiens]
10%   gi 23241743   Transducin beta-like 3 [Homo sapiens]

5%   gi 56404958   ZN644_HUMAN Zinc finger protein 644 (Zinc finger motif enhancer binding protein 2) (Zep-2)
12%   gi 62897249   heterogeneous nuclear ribonucleoprotein A1 isoform a variant [Homo sapiens]
33%   gi 20357599   H2A histone family, member V isoform 2 [Homo sapiens]
4%   gi 20139105   PKP2_HUMAN Plakophilin-2
38%   gi 75766352   B Chain B, Nmr Structure Of Lys48-Linked Di-Ubiquitin
5%   gi 62897593   squamous cell carcinoma antigen recognized by T cells 1 variant [Homo sapiens]
3%   gi 14670356   general transcription factor II, i isoform 4 [Homo sapiens]
21%   gi 75765428   A Chain A, Solution Structure Of Human Sumo-2 (Smt3b), A Ubiquitin-Like Protein
9%   gi 62898766   Heterogeneous nuclear ribonucleoproteins C1/C2 (hnRNP C1 / hnRNP C2) variant [Homo sapiens]
2%   gi 39753953   hypothetical protein LOC168850 [Homo sapiens]
6%   gi 22450882   Transcription factor AP-2 alpha, isoform b [Homo sapiens]
0%   gi 8134564   MCM3A_HUMAN 80 kda MCM3-associated protein (GANP protein)
8%   gi 20987729   Heterogeneous nuclear ribonucleoprotein A0 [Homo sapiens]
