ROLES OF THE SNF2-LIKE HELLS IN MYC-DRIVEN

by Jane A. Welch

A dissertation submitted to Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy

Baltimore, MD October 2017

© Jane A. Welch 2017 All Rights Reserved

Abstract

Helicase, lymphoid specific (HELLS) is a SNF2-like ATP-dependent chromatin remodeler that participates in transcriptional repression and activation in mammalian cells (von Eyss et al., 2012; Myant and Stancheva, 2008). HELLS has long been described as proliferation-associated (Jarvis et al., 1996; Lee et al., 2000; Raabe et al., 2001), but its role in neoplastic diseases, which are characterized by cellular hyperproliferation, has not been widely investigated.

This dissertation presents work that addresses how the expression and molecular function of HELLS may have pathologic relevance to cancers. Through immunohistochemistry and RNA sequencing (RNA-seq), we found that HELLS is abundantly expressed in proliferative compartments of normal human tissues and broadly overexpressed in a number of cancers. We used chromatin immunoprecipitation followed by sequencing (ChIP-seq) to identify HELLS binding sites in human Burkitt lymphoma, glioblastoma multiforme, and small cell lung carcinoma cell line genomes and found that HELLS predominantly targets the promoters of transcribed genes, leading us to conclude that in human cancers, HELLS function is likely to be associated with transcriptional activation rather than repression. Expressed HELLS-bound genes are significantly enriched for targets of MYC, which has well-described oncogenic effects. Using ChIP-seq and co-immunoprecipitation (co-IP), we found that HELLS and MYC colocalize extensively at transcribed gene promoters and physically interact in human cells, which suggests that the two proteins may functionally interact as well. We modeled partial loss of HELLS function in human cells through CRISPR/Cas9-mediated engineering of a hypomorphic HELLS allele and found that genes previously identified as HELLS targets exhibited altered expression levels. In human osteosarcoma cells with inducible MYC, we tested the effects of RNA interference (RNAi)-induced HELLS knockdown on MYC-driven transcription. While MYC target gene induction was not wholly impaired, a subset of MYC targets exhibited decreased expression. We conclude that HELLS is a bona fide MYC-interacting protein that may contribute to regulating the expression of some MYC targets, which could have implications for human tumor biology.

Thesis advisor: Kathleen H. Burns, M.D., Ph.D.

Thesis readers: Kathleen H. Burns, M.D., Ph.D. and Kirby D. Smith, Ph.D.

Acknowledgements

It is often said that it takes a village to raise a child. In my opinion, it also takes a village to raise a scientist at the doctoral level. I helmed my research projects, designed and conducted experiments with my own two hands, prepared my own presentations for meetings and conferences, and wrote this dissertation, but I could not have accomplished any of these things without the support of many others.

I will begin by thanking the faculty and staff of the Predoctoral Training Program in Human Genetics housed in the McKusick-Nathans Institute of Genetic Medicine, especially Dr. David Valle and Sandy Muscelli. Earning acceptance into this program became my dream early on in my undergraduate studies. I am so very grateful that I was given the opportunity to study and train in this program, and I will always strive to be a positive reflection of it.

I feel that receiving mentorship from my thesis advisor, Dr. Kathleen Burns, has made me one of the luckiest trainees at this institution. Kathy has enthusiastically supported me as I pursued both scientific questions of interest and professional development opportunities. She has always believed in me, even during the times when I found it impossible to believe in myself. She is the most gracious and considerate professional I have ever had the privilege to work with, and I will always look to her as a role model. I will be eternally grateful for the experience of being her mentee.

During my graduate studies, I have also had the privilege to work with a number of talented and generous collaborators on my projects: Dr. Amy Duffield at the Johns Hopkins Hospital’s clinical IHC laboratory; Daniel Ardeljan, Dr. Kathy Burns, David Esopi, Dr. Michael Haffner, Dr. Chunhong Liu, Dr. Paul Schaughency, Alyza Skaist, Reema Sharma, Peilin Shen, Dr. Sarah Wheelan, William Wu, and Dr. Srinivasan Yegnasubramanian here at the Johns Hopkins University School of Medicine; Dr. Brian Altman, Dr. Annie Hsieh, and Dr. Chi Dang, who worked together at the University of Pennsylvania during our time of collaboration; and Dr. David Fenyö, Zuojian Tang, and Xuya Wang at the New York University School of Medicine. The work presented in this dissertation would not have been possible without their collaboration, and it is for that reason that I use “we” in presenting the rationales for experiments, hypotheses, findings, and conclusions throughout this tome.

I would like to thank the members of my thesis committee– Dr. Kathy Burns, Dr. Elana Fertig, Dr. Haig Kazazian (my committee chairman), Dr. Kirby Smith, Dr. Sarah Wheelan, and Dr. Srinivasan Yegnasubramanian– for the thoughtful discussions, constructive criticisms, and steadfast encouragement they have provided over the past several years. I am especially grateful that Kathy and Kirby were willing to read this lengthy thesis and offer suggestions for improvement.

I am very grateful to have worked alongside my colleagues, both former and current, in the Burns lab for all these years. They have always been generous with offering feedback on my experiments and analyses, and I have learned a great deal by working with them. In particular, I would like to thank Dan Ardeljan, Dr. Chunhong Liu, Reema Sharma, Peilin Shen, and William Wu for collaborating with me on several of my projects.

I am also incredibly grateful to Dr. Sarah Wheelan and the members of her lab– particularly Dr. Bracha Avigdor, Lauren Ciotti, Dr. Paul Schaughency, Alyza Skaist, and Heather Wick– for working closely with me during much of my time here at the School of Medicine. Sarah co-mentored me for many years, and it is entirely thanks to her and her lab members that I learned enough programming to complete the bioinformatics components of my thesis work.

I would like to thank everyone at OMIM®, especially Joanna Amberger, Dr. Nara Sobreira, Matt Gross, and Carol Bocchini, for giving me the opportunity to write for their organization. I wholeheartedly believe that reviewing journal articles and drafting gene entries for OMIM® have made an important contribution to my education as a scientist.

I feel that I would be remiss if I did not acknowledge Dr. Kerry Smith and Dr. Cheryl Ingram-Smith, who welcomed me into their laboratory as a completely inexperienced rising freshman at Clemson University. My dearest ambition was– and still is today– to find a way to help others through my work. Kerry and Cheryl taught me that research could provide a way to achieve that ambition, as our efforts to understand the bases of human diseases can ultimately lead us to ways to help people who are suffering. Without their guidance and support, I am not sure that I would ever have taken the path to becoming a biomedical scientist. As such, I will always be grateful to them both.

Last, but unequivocally not least, I am incredibly grateful to my family and friends for their unwavering support throughout my lifetime and, in particular, during the long, challenging process of earning my Ph.D. My parents, Rob and Anne Welch, and their personal experiences with genetic disorders were really what sparked my interest in human genetics at a pretty early age. They have done everything they possibly could have to foster this interest of mine and to help me achieve my dream of becoming a geneticist, and I will always, always be thankful to them. My brother, David, has also been a substantial source of encouragement, along with my wonderful and loveable extended family: Jim and Pam Welch; Rosemary Nellist; Terry and Beth Dismukes; Matt, Sonia, Cannon, Chandler, and Saylor Dismukes; Jim, Ginnie, and Michelle Dismukes; Lauren and Jimmy Marshall; Bobby Dismukes and Jim White; Glenn, Adam, Jessica, and Dr. Joey Dismukes; and Brian and Will Dismukes. I have no doubt that my late grandparents, James and Laureign Welch and John and Doris Dismukes, would have been incredibly supportive and very proud of me for taking on the challenge of earning a Ph.D.– especially my Grandma Welch, who strongly advocated for women’s education. I would also like to offer heartfelt thanks to my dear friends: Randy, Becky, and Zach Hawkins; Rico, Sikithea, Skylar, and Sydney Zackery; and Nichola, Griffin, and Jillie Conze for all their love and support. In particular, Skylar, Sydney, and Nichola have brought me so much joy and encouragement, and they have made my hardest days so much easier to bear. Finally, I want to acknowledge my beloved collie, Toby, for always being able to make me smile– and for always reminding me of what matters most in life.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables

Chapter 1: Introduction
  1.1 SWI/SNF chromatin remodelers and gene expression
  1.2 Discovery and early characterization of HELLS
  1.3 HELLS-associated phenotypes
  1.4 Transcriptional regulation by HELLS
  1.5 Rationale for studying HELLS in cancer
  1.6 Summary of dissertation

Chapter 2: Profiles of HELLS expression in human tissues, malignancies, and cell lines
  2.1 Introduction
  2.2 Results
    2.2.1 HELLS is expressed in proliferating compartments of normal tissues
    2.2.2 HELLS expression is associated with higher grade in lymphoid malignancies
    2.2.3 HELLS is overexpressed in some solid tumor types
    2.2.4 HELLS-expressing human cell lines can serve as model systems for investigating HELLS functions
  2.3 Conclusions and Discussion
  2.4 Materials and Methods
    2.4.1 HELLS immunostaining in formalin-fixed, paraffin-embedded (FFPE) tissues
    2.4.2 Differential expression analysis of HELLS using TCGA datasets
    2.4.3 Cell culture
    2.4.4 HELLS immunostaining in formalin-fixed, paraffin-embedded (FFPE) cell blocks

Chapter 3: Identity and transcriptional status of genomic targets of HELLS in human cancer cells
  3.1 Introduction
  3.2 Results
    3.2.1 HELLS binding is enriched in gene promoters
    3.2.2 HELLS-bound promoters are marked by mRNA expression or POL2 binding
    3.2.3 HELLS binding sites are marked by H3K4me3 and H3K27ac
  3.3 Conclusions and Discussion
  3.4 Materials and Methods
    3.4.1 Cell culture
    3.4.2 HELLS ChIP
    3.4.3 Collection of published ChIP-seq data
    3.4.4 ChIP-seq analysis
    3.4.5 Isolation of total RNA from BL cells
    3.4.6 RNA-seq

Chapter 4: Discovery of HELLS and MYC association in human cells
  4.1 Introduction
  4.2 Results
    4.2.1 Transcribed HELLS-bound genes are enriched for MYC targets
    4.2.2 HELLS and MYC colocalize on chromatin, notably at promoters
    4.2.3 HELLS and MYC share transcribed target genes
    4.2.4 Some HELLS/MYC targets have been implicated in cancer biology
    4.2.5 HELLS and MYC physically interact in HEK 293T cells
  4.3 Conclusions and Discussion
  4.4 Materials and Methods
    4.4.1 Gene set enrichment analysis (GSEA)
    4.4.2 MYC ChIP-seq analysis
    4.4.3 Promoter E box motif analysis
    4.4.4 Cell culture
    4.4.5 Plasmid construction and transfection
    4.4.6 Co-immunoprecipitation (co-IP)

Chapter 5: Engineering loss of HELLS in human embryonic kidney 293T cells
  5.1 Introduction
  5.2 Results
    5.2.1 Genome editing recapitulates Hells model mutations in HEK 293T cells
    5.2.2 Transfection of MYC expression vector increases MYC expression in wild-type and HELLS-deficient cells
    5.2.3 Partial loss of HELLS alters expression of genes, including HELLS targets
  5.3 Conclusions and Discussion
  5.4 Materials and Methods
    5.4.1 sequence alignment
    5.4.2 Cell culture
    5.4.3 Creation of HEK 293T cells with mutations in HELLS
    5.4.4 Synthesis of MYC expression vector pcDNA3-MYC-Puro
    5.4.5 Engineering of MYC-expressing HEK 293T cells
    5.4.6 Isolation of total RNA
    5.4.7 RNA-seq analysis

Chapter 6: Effects of partial loss of HELLS function on MYC-driven transcription in human osteosarcoma cells
  6.1 Introduction
  6.2 Results
    6.2.1 MYC target gene expression is upregulated by 4OHT treatment in U2OS MYC-ERTM cells
    6.2.2 HELLS expression is reduced by DsiRNA transfection
    6.2.3 HELLS knockdown attenuates expression of a hallmark MYC target gene subset
  6.3 Conclusions and Discussion
  6.4 Materials and Methods
    6.4.1 Cell culture
    6.4.2 DsiRNA transfection
    6.4.3 Production of MYC off and MYC on states
    6.4.4 Isolation of total RNA
    6.4.5 qPCR validation
    6.4.6 RNA-seq analysis

Chapter 7: Concluding Remarks

Appendix

References

Curriculum vitae

List of Figures

Figure 1.1 Diagram of human HELLS protein
Figure 2.1 HELLS expression in normal human lymph node
Figure 2.2 HELLS expression in normal human non-hematolymphoid tissues
Figure 2.3 HELLS expression in low- and high-grade human B cell lymphomas
Figure 2.4 Comparison of HELLS expression between human solid tumor and normal tissue control samples from The Cancer Genome Atlas (TCGA) Network
Figure 2.5 Expression and nuclear localization of HELLS in selected human cancer cell lines
Figure 3.1 Overlap between HELLS peaks and selected genomic regions
Figure 3.2 Profiles of HELLS ChIP-seq coverage across length of hg19 RefSeq promoters
Figure 3.3 Overlap between HELLS-bound and expressed promoters in BL cells
Figure 3.4 Overlap between HELLS-bound and POL2-bound promoters in GBM and SCLC cells
Figure 3.5 Profiles of H3K4me3 ChIP-seq coverage across length of hg19 RefSeq promoters
Figure 3.6 Profiles of H3K27ac ChIP-seq coverage across length of hg19 RefSeq promoters
Figure 4.1 Statistical significance of enrichment of transcribed HELLS-bound genes in gene sets comprised of MYC targets
Figure 4.2 Profile of MYC ChIP-seq coverage across length of hg19 RefSeq promoters
Figure 4.3 Overlap between transcribed promoters bound by HELLS and MYC
Figure 4.4 Canonical E box content of transcribed promoters bound by HELLS and MYC
Figure 4.5 Track view of HELLS, MYC, mRNA, H3K4me3, and H3K27ac colocalization at the RANBP1 promoter in BL cells
Figure 4.6 Track view of HELLS, MYC, POL2, H3K4me3, and H3K27ac colocalization at the PSMD3 promoter in SCLC cells
Figure 4.7 Map of pCDNA3.1-3xFLAG-HELLS plasmid
Figure 4.8 Map of pcDNA3.1(+)-Myc-MYC plasmid
Figure 4.9 HELLS/MYC co-immunoprecipitation assay using tagged HELLS as bait
Figure 4.10 HELLS/MYC co-immunoprecipitation assay using tagged MYC as bait
Figure 5.1 Alignment of human and mouse HELLS amino acid sequences
Figure 5.2 Targeting strategies for CRISPR/Cas9-mediated generation of HELLS mutant cell lines
Figure 5.3 RT-PCR confirmation of deletion of HELLS exons 6 and 7 in HELLS∆6-7/∆6-7 clonal cell lines
Figure 5.4 RT-PCR confirmation of deletion of HELLS exons 10, 11, and 12 in HELLS∆10-12/∆10-12 clonal cell lines
Figure 5.5 Western blot confirmation of loss of HELLS expression in HELLS mutant cell lines
Figure 5.6 Map of pcDNA3-MYC-Puro expression vector
Figure 5.7 RT-PCR profiling of MYC expression in wild-type (WT) and HELLS∆10-12/∆10-12 clonal cell lines with and without pcDNA3-MYC-Puro
Figure 5.8 Gene expression differences between HELLS∆10-12/∆10-12 and wild-type clonal cell lines
Figure 5.9 Comparison of HELLS target gene expression between HELLS∆10-12/∆10-12 and wild-type clonal cell lines
Figure 6.1 Schematic representation of MYC off and MYC on states in U2OS MYC-ERTM cells
Figure 6.2 Confirmation of ODC1 upregulation in the MYC on state
Figure 6.3 Association of expression of hallmark MYC target genes with MYC off and MYC on states
Figure 6.4 Schematic representation of experiment to test effects of HELLS knockdown on MYC-ERTM transcription factor activity
Figure 6.5 Confirmation of HELLS downregulation following DsiRNA-mediated knockdown
Figure 6.6 Association of expression of hallmark MYC target genes with control and HELLS knockdowns in the MYC on state
Figure 6.7 Association of expression of hallmark MYC target genes with MYC off and MYC on states in HELLS knockdown cells
Figure 6.8 Comparison of hallmark MYC target gene expression between MYC activity states for control and HELLS knockdown cells
Figure 6.9 List of genes exhibiting attenuated expression following HELLS knockdown in both MYC activity states

List of Tables

Table 1.1 HELLS allelic variants implicated in five cases of ICF syndrome 4
Table 2.1 Statistically significant results of differential HELLS expression analysis using TCGA datasets
Table 3.1 Formulations of buffers used for HELLS ChIP in BL cells
Table 5.1 Guide RNAs (gRNAs) used to engineer HELLS mutant HEK 293T cells
Table 5.2 RT-PCR primers used to confirm targeted deletions in HELLS
Table 6.1 TriFECTa DsiRNAs used for knockdown of gene expression
Table 6.2 Primers used for qPCR validation of gene expression changes

Chapter 1: Introduction

1.1 SWI/SNF chromatin remodelers and gene expression

Chromatin remodeling proteins form an important arm of the epigenetic machinery that dynamically regulates chromatin structure (reviewed in Fahrner and Bjornsson, 2014). These proteins use the energy from ATP hydrolysis to reposition nucleosomes along DNA, altering the accessibility of sites to DNA- or histone-interacting proteins, including enzymes involved in DNA methylation and histone tail modification (reviewed in Hargreaves and Crabtree, 2011). As a result, chromatin remodelers play essential roles in regulating gene expression, giving them the potential to impact a multitude of cellular processes.

SWI/SNF complexes constitute one of the five families of ATP-dependent chromatin remodelers (reviewed in Wilson and Roberts, 2011). Forward genetic screens in yeast yielded the discovery of two genes encoding proteins that are a part of the SWI/SNF complex and led to the name for the family. Genes required for the expression of SUC2, which is essential for sucrose metabolism, were identified by the production of sucrose non-fermenting (SNF) yeast mutants (Neigeborn and Carlson, 1984), and genes required for the activation of HO for mating-type switching were identified by the production of switch (SWI) yeast mutants (Stern et al., 1984). While yeast SWI/SNF proteins were identified based on their roles in activating transcription, evidence suggests that mammalian SWI/SNF complexes can contribute to both transcriptional repression and activation (Chi et al., 2002; Ho et al., 2009; Isakoff et al., 2005; Zhang et al., 2002).

A SNF2-like helicase named HELLS (helicase, lymphoid specific) is the subject of this dissertation. HELLS is the official name for the gene that encodes this helicase, but a number of alternative names have been proposed over time and appear in the literature. These include lymphoid-specific helicase (LSH); proliferation-associated SNF2-like gene (PASG); and SWI/SNF2-related, matrix associated, actin-dependent regulator of chromatin, subfamily 2A-like (SMARCA6). For the sake of simplicity, HELLS will be used throughout this dissertation to refer to this gene and the protein that it encodes.

1.2 Discovery and early characterization of HELLS

HELLS was discovered through cloning from a mouse fetal thymus cDNA library and was characterized as a putative novel helicase on the basis of cDNA sequence analysis (Jarvis et al., 1996). Expression profiling by Northern blot detected the presence of a Hells transcript in mouse thymus and lymphoid cell lines of T cell origin but not in any other tissues or cells, which led to the declaration that HELLS was a lymphoid-specific protein (Jarvis et al., 1996). A subsequent study, however, revealed that Hells expression in mouse was not restricted to lymphoid tissues; rather, it was specifically expressed in proliferating organs, including the thymus, bone marrow, and testis (Raabe et al., 2001). During mouse embryonic development, Hells expression levels varied by tissue type, with the highest expression levels detected in developing face, limbs, skeletal muscle, heart, and tail (Raabe et al., 2001). Additionally, Hells expression was shown to correlate with a shift from a quiescent to a proliferative state in mouse cell lines (Raabe et al., 2001).

Human HELLS was identified through cloning from cDNA prepared from the MO7e human megakaryoblastic leukemia cell line and was subsequently characterized by analysis of its sequence features and expression patterns (Lee et al., 2000). In humans, HELLS localizes to 10q23.33 and encodes an 838-amino acid protein containing a nuclear localization signal and two helicase domains: (1) an N-terminal SNF domain with a DEAD box helicase domain characteristic of SNF2 and SNF2-like proteins and (2) a C-terminal helicase domain that houses a nucleotide-binding domain (Figure 1.1). Each of the two major helicase domains contains an ATP binding site (Figure 1.1). Although the mouse HELLS ortholog contains fewer amino acids than the human protein, the helicase domains are conserved between human and mouse. Consistent with reported expression patterns in mouse, the highest levels of human HELLS mRNA expression were found in proliferative tissues, including the thymus, testis, and bone marrow (Lee et al., 2000).

From these analyses in mouse and human tissues, it became clear that HELLS is not a lymphoid-specific protein. Rather, the published expression profiles demonstrated an association between HELLS and proliferation, prompting the question of whether HELLS function may be involved in regulating or governing proliferation.

1.3 HELLS-associated phenotypes

Two transgenic mouse models have been created to investigate the effects of loss of Hells function. The phenotypes observed in these mouse models, as well as evidence from human molecular genetics, suggest that HELLS function is essential in vivo for normal growth and development.

The first mouse model was generated by targeted deletion of exons 6-7, which leads to a frameshift that introduces an early stop codon and ultimately results in the complete absence of HELLS (Geiman et al., 2001). Mice that are homozygous for this allele, therefore, are true nullomorphs. During gestation, Hells-/- embryos were present in normal Mendelian ratios (Geiman et al., 2001). At birth, the weight of Hells-/- mice was significantly lower than that of their wild-type and heterozygous littermates, indicating that loss of Hells results in growth defects (Geiman et al., 2001). Nearly all Hells-/- mice died within a few hours of birth; a single animal survived up to five days (Geiman et al., 2001). This suggests that in mice, complete loss of HELLS is incompatible with life. The precise cause of death for Hells-/- mice was not definitively established, but the animals were noted to show signs of renal pathology, including increased apoptosis (Geiman et al., 2001). Additionally, T lymphocytes generated from Hells-/- embryos failed to grow in culture (Geiman et al., 2001), suggesting that Hells is required for their proliferation. The single Hells-/- mouse that survived for five days after birth was noted to exhibit severe kidney necrosis, bacterial colonization of the spleen, and severe intestinal inflammation, features suggestive of a defective immune system (Geiman et al., 2001).

The second mouse model was generated by targeted deletion of exons 10-12, which encode a portion of the C-terminal helicase domain (Sun et al., 2004). This deletion is in frame and was expected to result in the production of protein with an interstitial deletion (Sun et al., 2004). Western blot analysis revealed that mice homozygous or heterozygous for this mutant allele expressed low levels of this predicted protein product (Sun et al., 2004), indicating that the mutant allele is actually a hypomorph. Because mice heterozygous for the mutant allele were indistinguishable from their wild-type littermates, I will subsequently refer only to mice homozygous for the mutant allele as Hells-deficient. Although they were morphologically normal, Hells-deficient mice were significantly smaller than their wild-type and heterozygous littermates at all stages of development after birth (Sun et al., 2004). Approximately 40% of Hells-deficient mice survived the perinatal period, living for up to several weeks; however, these mice displayed a premature aging phenotype (Sun et al., 2004). Additional abnormalities noted in Hells-deficient mice included renal pathology and delayed bone development (Sun et al., 2004). Cultured fibroblasts derived from Hells-deficient mice ceased to divide and showed altered expression of senescence-related genes, suggesting that HELLS plays a crucial role in maintaining gene expression patterns that direct normal development and longevity (Sun et al., 2004).
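The contrast between the two targeting strategies (a frameshifting deletion of exons 6-7 versus an in-frame deletion of exons 10-12) comes down to whether the number of deleted coding nucleotides is a multiple of three. The following Python sketch illustrates only that arithmetic. The function name and the exon lengths are hypothetical placeholders chosen for illustration; they are not the actual Hells exon sizes, and the sketch is not part of the published analyses.

```python
def deletion_consequence(deleted_exon_lengths):
    """Classify a deletion of whole coding exons by its effect on the reading frame.

    deleted_exon_lengths: lengths, in nucleotides, of the deleted coding exons.
    """
    total_deleted = sum(deleted_exon_lengths)
    # The downstream reading frame is preserved only when the number of
    # removed coding nucleotides is a multiple of three (whole codons).
    return "in-frame" if total_deleted % 3 == 0 else "frameshift"


# Hypothetical exon lengths (assumed for illustration, NOT real Hells exon sizes).
two_exon_deletion = [100, 85]         # 185 nt removed in total
three_exon_deletion = [120, 90, 150]  # 360 nt removed in total

print(deletion_consequence(two_exon_deletion))
print(deletion_consequence(three_exon_deletion))
```

A frameshifting deletion typically introduces a premature stop codon and can abolish the protein, whereas an in-frame deletion can still yield a shortened protein product, which is consistent with the null and hypomorphic alleles described above.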

In humans, mutations in HELLS that are predicted to cause loss of function have been suggested as the causes of several cases of immunodeficiency-centromeric instability-facial anomalies (ICF) syndrome. ICF syndrome is a genetically heterogeneous autosomal recessive disorder that has been attributed to mutations in the DNA methyltransferase 3B (DNMT3B), zinc-finger and BTB domain containing 24 (ZBTB24), and cell division cycle associated 7 (CDCA7) genes, in addition to HELLS (de Greef et al., 2011; Thijssen et al., 2015; Weemaes et al., 2013). The syndrome is characterized by agammaglobulinemia or hypoimmunoglobulinemia, centromere instability, and variable facial dysmorphisms (Hagleitner et al., 2008; Weemaes et al., 2013). Individuals with ICF syndrome suffer from recurrent, and often fatal, respiratory and gastrointestinal infections and exhibit hypomethylation of certain chromosomal regions (de Greef et al., 2011); in fact, hypomethylation of juxtacentromeric satellite types II and III is diagnostic for the syndrome. Six homozygous or compound heterozygous mutations in HELLS were identified in five patients with ICF syndrome 4 (ICF4) from four unrelated families through a combination of homozygosity mapping and whole exome sequencing (Thijssen et al., 2015). The mutations, listed in Table 1.1, segregated with the disorder in the families and were all predicted to be pathogenic (Thijssen et al., 2015). Functional studies of the mutations were not performed, but the disorder could not be explained by mutations in any of the three other genes implicated in ICF syndrome (Thijssen et al., 2015).

1.4 Transcriptional regulation by HELLS

There has been some debate in the field as to whether, on a molecular level, HELLS acts as a transcriptional repressor or activator. Studies of physical interactions between HELLS and other proteins, identification of HELLS genomic targets, and the molecular effects of loss of HELLS function provide evidence to suggest that, like other SWI/SNF complex proteins, HELLS is capable of performing either role.

The bulk of the available literature provides support for a transcriptional repressor function for HELLS. Physical interactions between HELLS and DNA methyltransferases 1 and 3B (DNMT1 and DNMT3B) as well as histone deacetylases 1 and 2 (HDAC1 and HDAC2) were identified by co-immunoprecipitation (co-IP) (Myant and Stancheva, 2008). These interactions appear to have a cooperative functional effect, as ectopically expressed HELLS demonstrated the ability to silence a reporter gene in cultured human cells, but only in the presence of DNMTs and HDACs (Myant and Stancheva, 2008). In another report, chromatin immunoprecipitation (ChIP) assays performed on mouse embryonic fibroblasts showed that HELLS directly localized and bound to LINE, SINE, and IAP retrotransposons (Huang et al., 2004). In somatic cells, retrotransposons are typically silenced by DNA methylation, which prevents their expression and, therefore, their ability to create a de novo insertion in the genome (reviewed in Smith and Meissner, 2013); that HELLS binds to these loci suggests that the helicase may work to induce their silencing, thereby preserving the integrity of the genome. Finally, the Hells-/- and Hells-deficient mice described in Section 1.3 exhibit global DNA hypomethylation, including at interspersed repeats, protein-coding genes, and large chromosomal domains that are associated with the nuclear lamina (Dennis et al., 2001; Myant et al., 2011; Sun et al., 2004; Yu et al., 2014). This suggests that HELLS may be important in establishing or maintaining normal patterns of DNA methylation. Indeed, embryonic fibroblasts derived from Hells-/- mice were able to reestablish DNA methylation and silencing of misregulated genes after reexpression of wild-type, but not catalytically inactive, HELLS, indicating that HELLS activity is required for those processes to occur (Termanis et al., 2016).

Evidence that supports a role for HELLS in activating transcription comes from a single, yet convincing, publication in which HELLS was identified in a proteomic screen for candidate interactors of E2F3, a member of the E2F family of transcriptional activators (von Eyss et al., 2012). E2F3 is a well-characterized regulator of cell proliferation and cell cycle progression. In addition to physically interacting with one another, as demonstrated by co-IP, Hells and E2f3 co-occupied many of the same promoters in murine embryonic fibroblasts (von Eyss et al., 2012), suggesting that they share target genes and may cooperate to regulate their expression. Depletion of HELLS via shRNA-induced knockdown impaired the induction of E2F3 target genes, suggesting that HELLS is required for transactivating E2F3-mediated transcription (von Eyss et al., 2012). Importantly, HELLS knockdown did not affect the ability of E2F3 to bind to its target gene promoters, indicating that HELLS plays an important role in the activating function of E2F3 (von Eyss et al., 2012). HELLS knockdown also resulted in delayed entry into S-phase and subsequent decreased cell growth rates (von Eyss et al., 2012), suggesting that its putative action as a transcriptional activator is involved in driving growth and proliferation.

1.5 Rationale for studying HELLS in cancer

When I began the work that forms the body of this dissertation, I was intrigued by the idea that HELLS could potentially regulate the epigenetic, and thus transcriptional, status of retrotransposons, protein-coding genes, or both. In particular, I was curious about whether HELLS function may be significant in the context of cancer for several reasons.

First, because loss of HELLS results in decreased cell growth, it would be logical to ask whether HELLS expression could contribute to the hyperproliferation that is a common feature of many cancers (Evan and Vousden, 2001). HELLS had not yet been widely investigated in human cancers, although its expression had been positively correlated with disease stage in some types of solid tumors, including head and neck squamous cell carcinomas and prostate cancers (von Eyss et al., 2012; Waseem et al., 2010). Additionally, the identification of a HELLS isoform specific to acute myelogenous leukemia and acute lymphoblastic leukemias prompted the question of whether HELLS contributes to leukemogenesis (Lee et al., 2000).

Second, cancers can exhibit aberrant expression of protein-coding genes as well as retrotransposons. Hypomethylation of LINE-1 (L1) retrotransposons has been described in many tumor types, including breast, colon, and prostate cancers (Alves et al., 1996; Cho et al., 2007; Choi et al., 2007; Dante et al., 1992; Santourlidis et al., 1999; Schulz et al., 2002; Suter et al., 2004). This can lead to retrotransposon derepression, which may allow for complete cycles of retrotransposition that generate somatic insertions in tumors. Somatic insertions of L1, Alu, and SVA elements have been identified in a number of human cancers, including tumors of the lung, ovary, stomach, colon, pancreas, esophagus, head and neck, endometrium, and liver (Doucet-O’Hare et al., 2015; Ewing et al., 2015; Helman et al., 2014; Iskow et al., 2010; Lee et al., 2012; Miki et al., 1992; Paterson et al., 2015; Pitkänen et al., 2014; Rodić et al., 2015; Shukla et al., 2013; Solyom et al., 2012; Tang et al., 2017). It has also been shown that somatic insertions relevant to cellular transformation can occur; in a case of colon cancer, a somatic L1 insertion interrupted an exon of the adenomatous polyposis coli (APC) tumor suppressor gene (Miki et al., 1992). Because HELLS had been demonstrated to localize to LINE retrotransposons in mouse embryonic fibroblasts, I was curious as to whether it did the same in human cancer cells.

1.6 Summary of dissertation

For this dissertation, I sought to address the molecular basis of HELLS function and to understand how this may be relevant to cancer biology. We profiled HELLS expression in human tissues, cancers, and cell lines and found that HELLS tended to be selectively expressed in proliferative compartments of normal tissues and highly expressed in various tumors and cancer cell lines. We performed ChIP-seq to identify HELLS genomic targets in human cancer-derived cell lines and found that HELLS preferentially binds to promoters of actively transcribed genes. Gene set enrichment analysis revealed that HELLS binding is enriched in promoters of target genes of the MYC transcription factor, which has well-documented oncogenic effects. HELLS and MYC colocalize extensively on chromatin, particularly at the promoters of expressed genes, and co-immunoprecipitation confirmed that the two proteins physically interact in cultured cells. We created a HELLS loss-of-function model by generating a hypomorphic HELLS allele in HEK293T cells with the intent of investigating how HELLS deficiency alters gene expression in the presence or absence of MYC. In another model system, we tested the effects of HELLS knockdown on MYC-driven transcription and found that while the induction of MYC target genes was not impaired, a subset of these targets exhibited decreased expression levels, suggesting that HELLS somehow contributes to regulating their transcription. We therefore concluded that HELLS is a MYC-associated protein, and while HELLS may not be an obligatory cofactor for MYC, it nevertheless seems to be important for the full expression of some MYC target genes.

Figure […]. Diagram of human HELLS protein. Human HELLS isoform (NCBI identifier NP_[…]) is shown with many of its recognizable domains and features. Key: SNF2_N = SNF2 helicase domain; DEXDc = DEAD-box helicase domain; HELICc = C-terminal helicase domain; Purple asterisk = predicted phosphorylation site; Green bar = ATP binding site; Aqua diamond = DEAH-box motif; Red bar = nucleotide binding region. N-terminal nuclear localization signal is not shown.

Table 1.1. HELLS allelic variants implicated in five cases of ICF syndrome 4.

Mutation | Consequence | dbSNP accession number | ClinVar accession number
c.2096A>G | Q699 to R substitution (p.[Gln699Arg]) in conserved C-terminal helicase domain | rs879253733 | RCV000210912
c.370+2T>A | Destruction of splice site in intron 5 resulting in skipping of exon 5, leading to frameshift that creates premature termination codon in exon 6 | rs140316223 | RCV000210918
c.2283_2286delGTCT | Frameshift that creates premature termination codon in exon 20 (p.[Ser762Argfs*4]) | rs879253734 | RCV000210919
c.2400_2402delGTT | In-frame deletion of L801 (p.[Leu801del]) | rs879253737 | RCV000210910
c.610A>T | Transversion in exon 8 resulting in K204 to stop codon substitution (p.[Lys204*]) | rs879253735 | RCV000210911
c.374_381dup | Frameshift resulting in stop codon insertion (p.[Lys128*]) | rs879253736 | RCV000210917

Key: > = substitution; del = deletion; dup = duplication; * = stop codon

Chapter 2: Profiles of HELLS expression in human tissues, malignancies, and cell lines

2.1 Introduction

Cancer is a disease state characterized by increased cellular proliferation (reviewed in Evan and Vousden, 2001). Because loss of HELLS results in decreased cell growth and proliferation in animal models (Geiman et al., 2001; Sun et al., 2004), we considered whether the reverse may also be true. We therefore hypothesized that increased HELLS expression may occur in human cancers and be biologically relevant to disease onset or progression. Identifying tumor types and cell lines that highly express HELLS would justify using cancer as a model system in which to study HELLS function in later experiments, so we sought to determine patterns of HELLS expression in human tumors and cancer cell lines.

Additionally, we were interested in examining HELLS expression in normal human tissues. While it has been demonstrated that mouse HELLS is expressed in both hematolymphoid and non-hematolymphoid tissues (Jarvis et al., 1996; Raabe et al., 2001), human HELLS expression appears to have received less attention in the field. We therefore sought to verify whether human HELLS is expressed in patterns similar to those observed in mouse tissues.

To this end, we used immunohistochemistry (IHC) or RNA-seq to examine patterns of HELLS expression in normal human tissues, a variety of human tumors, and several human cancer cell lines. I describe the results of these analyses in the next section.

2.2 Results

2.2.1 HELLS is expressed in proliferating compartments of normal human tissues

We began our studies of HELLS expression in a variety of normal human tissues. In collaboration with Dr. Amy Duffield at the clinical IHC laboratory at the Johns Hopkins Hospital, we used immunostaining to identify HELLS-positive cells in formalin-fixed, paraffin-embedded (FFPE) diagnostic pathology samples.

In both hematolymphoid and non-hematolymphoid human tissues, HELLS exhibited highly specific patterns of expression. In normal lymph node (LN), HELLS expression was predominantly restricted to large germinal center B cells (Figure 2.1), which are known to constitute a highly proliferative cell population. Similarly, in non-hematolymphoid tissues, we found that HELLS-positive cells were frequent within proliferative compartments (Figure 2.2). In normal small bowel, we observed HELLS expression in a subset of cells in the crypts of Lieberkühn, which contain stem cell populations (Figure 2.2). In normal skin, a subset of epidermal basal cells were immunoreactive for HELLS, while superficial and keratinizing epithelial cells were not (Figure 2.2); stromal cells in the dermis were also negative for HELLS immunoreactivity. In all tissues that we examined, HELLS apparently localized to the nucleus, which was not unexpected, given the protein’s known role as a chromatin remodeler. Overall, these IHC profiles suggest that expression and nuclear localization of HELLS are indeed associated with proliferating human cells and tissues.

2.2.2 HELLS expression is associated with higher grade in human lymphoid malignancies

As I discussed in Chapter 1, many human cancers exhibit deregulated cell proliferation, and we were interested in investigating whether HELLS may contribute to this phenomenon. We therefore next sought to characterize HELLS expression patterns in human malignancies, both to contribute this knowledge to the field and to evaluate whether cancer would be an appropriate model system for investigating functions of HELLS.

Through further collaboration with Dr. Duffield, we evaluated HELLS expression in a number of low- and high-grade human B cell lymphomas by immunohistochemistry. In general, HELLS expression tended to be minimal in low-grade lymphomas. In low-grade extranodal marginal zone lymphoma of mucosa-associated lymphoid tissue (MALT lymphoma), immunostaining showed infrequent immunoreactivity for HELLS (Figure 2.3). In contrast, HELLS expression was commonly detected in high-grade malignant lymphomas. In an unclassifiable high-grade large B cell lymphoma with features intermediate between diffuse large B cell lymphoma (DLBCL) and Burkitt lymphoma, the majority of cells exhibited diffuse nuclear staining for HELLS (Figure 2.3). These data indicate that HELLS expression is a feature of some human B cell lymphomas, notably those that are classified as higher grade malignancies.

2.2.3 HELLS is overexpressed in some solid tumor types

After noting that HELLS was abundantly expressed in high-grade malignant lymphomas, we asked whether HELLS expression is characteristic of additional tumor types and whether HELLS expression patterns differ between normal and cancerous tissues. We obtained RNA-seq data from The Cancer Genome Atlas (TCGA) Research Network and looked for differences in HELLS expression between tumors and normal tissue controls. HELLS was significantly more highly expressed (log2 fold change 2.12-3.23; p-value < 0.05) in carcinomas originating from the esophagus, lung, liver, bladder, and uterus than in normal tissue controls (Figure 2.4, Table 2.1). This suggests that HELLS expression in cancer is not uncommon and may contribute, in some way, to tumor biology.

2.2.4 HELLS-expressing human cell lines can serve as model systems for investigating HELLS function

Having determined that HELLS is expressed in a variety of human cancers, we were satisfied that cancer would be an appropriate model in which to study functions of HELLS. Our next step was to select several HELLS-expressing cell lines to use as our model systems for experimental studies. The first cell line that we identified as a potential candidate model system, Ramos 1596, was established from an American Burkitt lymphoma (Klein et al., 1975). Because we observed robust HELLS expression in high-grade large B cell lymphomas, we expected that HELLS would similarly be expressed in Ramos 1596 cells. Our second potential candidate model system was the U-87 MG cell line, which was derived from glioblastoma multiforme (GBM) (Hirvonen et al., 1994). Although we were unable to analyze HELLS expression in GBM samples from TCGA ourselves, expression data available at FireBrowse (http://firebrowse.org/) suggested that HELLS expression may be higher in GBM tumors than in normal brain tissue (data not shown). In small cell lung carcinomas (SCLCs), HELLS expression has been shown to positively correlate with tumor stage (von Eyss et al., 2012), so we selected H2171 cells, which were derived from a patient’s SCLC (Johnson et al., 1987), as our third candidate model system.

To determine whether the three selected candidate cell lines express HELLS, we used IHC to look for HELLS staining patterns in slices prepared from cell blocks. Indeed, IHC revealed robust nuclear staining for HELLS in all three cell lines (Figure 2.5), suggesting that these cell lines could serve as model systems for investigating how HELLS functions as a transcriptional modulator.

2.3 Conclusions and Discussion

In this chapter, I have described our efforts to characterize patterns of HELLS expression in a number of human-derived specimens: normal tissues; malignancies, including liquid and solid tumors; and tumor-derived cell lines. From these studies, we were able to identify HELLS-expressing cells and to evaluate cancer as a potential model system for studying HELLS function.

First, we used IHC to demonstrate that HELLS is expressed in proliferating compartments of normal human tissues, including germinal centers in the lymph node (Figure 2.1) and stem cell compartments of small bowel and skin (Figure 2.2). Our observations are consistent with published data showing that in mouse, HELLS expression is not lymphoid-specific but is associated with proliferating tissues and organs, including the thymus, bone marrow, and testis (Raabe et al., 2001). The literature does not contain many reports describing human HELLS expression, but in a single study, Northern blot analysis revealed high levels of HELLS in human adult thymus and testis and dramatically lower levels in several other tissues, including small intestine (Lee et al., 2000). Our IHC profiling supports those published observations, and our own results suggest that the reported low-level expression of HELLS in some tissues may result from analysis of total RNA from a whole tissue sample. Tissue architecture is often complex and contains multiple cell types and populations, so HELLS-expressing cells may make up only a small fraction of the sample. Using IHC to examine tissues, or Northern blot, RT-PCR, or qPCR to query sorted cells, would provide a more refined view of HELLS expression and, as such, may be the most appropriate strategies for additional HELLS expression profiling in other tissues.

Next, we evaluated whether HELLS expression is associated with human cancers, which exhibit deregulated cell proliferation (reviewed in Evan and Vousden, 2001). By IHC, we demonstrated that HELLS expression tended to be robust in high-grade B cell lymphomas but not in low-grade lymphoid malignancies (Figure 2.3). Importantly, HELLS expression was clearly restricted to the nucleus (Figure 2.3). To function as a chromatin remodeler, HELLS must be able to access DNA in the nucleus. Our results suggest that HELLS may function as a transcriptional modulator in those cells that were immunoreactive for HELLS. Analysis of TCGA datasets revealed that HELLS is more highly expressed in solid tumors than in normal tissues (Figure 2.4). HELLS expression has also been shown to positively correlate with pre-malignancy as well as tumor progression in head and neck squamous cell carcinoma and with later-stage, more aggressive prostate cancers (von Eyss et al., 2012; Waseem et al., 2010). Taking these observations together, we hypothesized that HELLS may be biologically relevant to tumorigenesis or tumor progression.

To identify cancer cell lines that would be suitable for use as models in which to study HELLS functions, we evaluated HELLS expression by IHC in three human cancer cell lines, Ramos 1596, U-87 MG, and H2171. All three were derived from human tumor types that we expected to express HELLS, based on our IHC profiling and published or publicly available data. We demonstrated the expression and nuclear localization of HELLS by IHC in all three cell lines (Figure 2.5) and concluded that they would be appropriate systems in which to experimentally assess a role for HELLS in transcriptional regulation and chromatin structure. We therefore elected to use all three in experiments that are discussed in Chapter 3.

2.4 Materials and Methods

2.4.1 HELLS immunostaining in formalin-fixed, paraffin-embedded (FFPE) tissues

These stains were performed in the clinical IHC lab at the Johns Hopkins Hospital using a mouse monoclonal antibody that was raised against N-terminal amino acids 1-240 predicted for the human protein (Santa Cruz Biotechnology, CA, catalog number sc-4666). Unstained 4-μm sections of each tissue block were kept at 65°C for 30 minutes before staining on a Bond-Leica autostainer (Leica Microsystems, Bannockburn, IL). Heat-induced antigen retrieval with high pH retrieval solution was followed by a peroxide blocking step and 30 minutes of primary antibody incubation with the described mouse monoclonal anti-HELLS antibody. The reaction was developed using a biotin-free, Bond-polymer detection (Leica Microsystems, Bannockburn, IL), and 3,3’-diaminobenzidine (DAB) chromogen/substrate was used for visualization. Slides were counterstained with hematoxylin, dehydrated, and coverslipped.

2.4.2 Differential expression analysis of HELLS using TCGA datasets

RTCGAToolbox (Samur, 2014) was used to: (1) download unnormalized RNA-seq counts from the "20151101" run date for all TCGA datasets with available tumor and normal samples and (2) perform a differential expression analysis for HELLS using Student’s t-test. The Benjamini-Hochberg correction (Benjamini and Hochberg, 1995) was used to correct for multiple hypothesis testing and determine adjusted p-values. HELLS expression values were plotted using the ggplot2 R package (Wickham, 2009) and Silhouette Studio Designer Edition software (Silhouette America).
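
The multiple-testing correction described above can be illustrated with a minimal Python sketch. This is not the R code used for the analysis; the function names, the pseudocount, and the example p-values are hypothetical and serve only to show how a log2 fold change and Benjamini-Hochberg adjusted p-values may be computed.

```python
import math

def log2_fold_change(tumor_mean, normal_mean, pseudocount=1.0):
    """log2 ratio of mean tumor to mean normal counts.

    The pseudocount is an assumption made here to avoid division by zero.
    """
    return math.log2((tumor_mean + pseudocount) / (normal_mean + pseudocount))

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four datasets (not values from this study).
raw_p_values = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw_p_values))
```

Because the adjusted value depends on the total number of tests, the adjusted p-values in Table 2.1 can only be reproduced by including every TCGA dataset that was tested, not just the significant ones.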

2.4.3 Cell culture

All cell lines were cultured in base medium supplemented with 10% fetal bovine serum (FBS) (Corning) and 1% penicillin/streptomycin (Life Science, Sigma-Aldrich). For Ramos 1596 cells, FBS was heat-inactivated at 55°C for 30 minutes before formulating the complete growth medium. RPMI-1640 (Gibco®, Life Technologies, Thermo Fisher Scientific Inc.) was used as the base medium for Ramos 1596 (BL) and H2171 (SCLC) cells. EMEM was used as the base medium for U-87 MG (GBM) cells. Cells were incubated at 37°C with 5% CO2.

2.4.4 HELLS immunostaining in formalin-fixed, paraffin-embedded (FFPE) cell blocks

Cells were grown in sufficient quantities to be able to make cell blocks, which were then fixed in formalin and embedded in paraffin. Slides with cell pellet sections were baked at 65°C for 20 minutes before staining. Heat-induced antigen retrieval with Dako’s Target Retrieval Solution, pH 9 (Agilent, S2367) was followed by a peroxide blocking step and 60 minutes of primary antibody incubation with mouse monoclonal anti-HELLS antibody (Santa Cruz Biotechnology, sc-46665). The reaction was developed using the Dako EnVision+ System-HRP (DAB) kit (Agilent, K4006). Slides were counterstained with Dako’s Mayer’s Hematoxylin (Agilent, S330930-2), dehydrated, and coverslipped.

Figure 2.1. HELLS expression in normal human lymph node. (A) Hematoxylin and eosin (H&E) stain shows the rounded edge of a secondary lymphoid follicle with germinal center B cells (bottom) separated from the paracortex (top) by a well-demarcated mantle zone. Black scale bar indicates 50 μm. (B) Immunostain shows cells that are immunoreactive for HELLS in brown; cells that are not immunoreactive for HELLS are counterstained in blue.

Figure 2.2. HELLS expression in normal human non-hematolymphoid tissues. (A) Immunostain of normal human small bowel shows cells that are immunoreactive for HELLS in brown; cells that are not immunoreactive for HELLS are counterstained in blue. (B) Immunostain of normal human skin shows cells that are immunoreactive for HELLS in brown; cells that are not immunoreactive for HELLS are counterstained in blue.

Figure 2.3. HELLS expression in low- and high-grade human B cell lymphomas. Top row: Left, H&E stain of a low-grade mucosa-associated lymphoid tissue (MALT) lymphoma shows a monotonous infiltrate of small, round mature B cells with moderate amounts of cytoplasm; right, immunostain of the MALT lymphoma shows cells that are immunoreactive for HELLS in brown; cells that are not immunoreactive for HELLS are counterstained in blue. Bottom row: Left, H&E stain of an unclassifiable high-grade large B cell lymphoma; right, immunostain of the unclassifiable high-grade large B cell lymphoma shows cells that are immunoreactive for HELLS in brown; cells that are not immunoreactive for HELLS are counterstained in blue. Black scale bars indicate 50 μm.

Figure 2.4. Comparison of HELLS expression between human solid tumor and normal tissue control samples from The Cancer Genome Atlas (TCGA) Network. Box plot displays log2-transformed RNA-seq counts from data generated by the TCGA Research Network (http://cancergenome.nih.gov) for head and neck squamous cell carcinoma (HNSC), thyroid carcinoma (THCA), esophageal carcinoma (ESCA), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), breast invasive carcinoma (BRCA), liver hepatocellular carcinoma (LIHC), kidney renal clear cell carcinoma (KIRC), pan-kidney cohort (KIPAN), bladder urothelial carcinoma (BLCA), and uterine corpus endometrial carcinoma (UCEC), along with corresponding, appropriate normal tissue controls. Significant p-values, which were determined by Student’s t-test and subsequently adjusted for multiple hypothesis testing, are indicated on the plot as follows: *** for p-value < 0.001, ** for 0.001 < p-value < 0.01, and * for 0.01 < p-value < 0.05.

Figure 2.5. Expression and nuclear localization of HELLS in selected human cancer cell lines. Immunostains performed on slices of cell blocks show cells that are immunoreactive for HELLS in brown; cells that are not immunoreactive for HELLS are counterstained in blue. (A) Ramos 1596 Burkitt lymphoma cells, (B) U-87 MG glioblastoma multiforme cells, and (C) H2171 small cell lung carcinoma cells. Black scale bars indicate 25 μm.

Table 2.1. Statistically significant results of differential HELLS expression analysis using TCGA datasets.

TCGA Dataset | Tumor/Normal Expression Difference (log2 fold change) | Test statistic (t) | p-value | Adjusted p-value
BLCA | 2.52 | 6.38 | 1.80 × 10^-8 | 5.35 × 10^-7
ESCA | 2.32 | 6.75 | 1.55 × 10^-10 | 5.51 × 10^-9
LIHC | 2.43 | 3.80 | 7.31 × 10^-4 | 3.30 × 10^-3
LUAD | 2.17 | 8.83 | 1.47 × 10^-15 | 1.68 × 10^-14
LUSC | 3.23 | 10.59 | 8.49 × 10^-22 | 1.72 × 10^-20
UCEC | 2.81 | 3.10 | 2.12 × 10^-3 | 1.35 × 10^-2

Key: BLCA: bladder urothelial carcinoma, ESCA: esophageal carcinoma, LIHC: liver hepatocellular carcinoma, LUAD: lung adenocarcinoma, LUSC: lung squamous cell carcinoma, UCEC: uterine corpus endometrial carcinoma

Chapter 3: Identity and transcriptional status of genomic targets of HELLS in human cancer cells

3.1 Introduction

Our work establishing that HELLS is abundantly expressed in some human cancers, described in Chapter 2, led us to hypothesize that, in cancers in which it is expressed, HELLS functions as a transcriptional modulator. To understand how HELLS may regulate transcription, we sought to identify genomic loci that are targeted and bound by HELLS. We collaborated with David Esopi and Dr. Srinivasan Yegnasubramanian at the Johns Hopkins University School of Medicine to perform ChIP followed by high-throughput sequencing (ChIP-seq) to isolate and sequence HELLS-bound DNA from the three cancer cell lines I described in Chapter 2: Ramos 1596 Burkitt lymphoma (BL) cells, U-87 MG glioblastoma multiforme (GBM) cells, and H2171 small cell lung carcinoma (SCLC) cells. Hereafter, I will refer to each of these cell lines by the abbreviation for the tumor type from which it was derived.

As I mentioned in Chapter 1, the literature contains seemingly conflicting reports about the identity and transcriptional status of genomic targets of HELLS. HELLS has been shown to colocalize with E2F3 at active promoters (von Eyss et al., 2012), and it has also been demonstrated to directly bind to mouse LINE, SINE, and IAP retrotransposon loci (Huang et al., 2004), which tend to be transcriptionally silent in normal somatic cells. The use of ChIP-seq, which allows for comprehensive detection of genomic regions that are bound by a protein of interest (Massie and Mills, 2012), would enable us to identify protein-coding genes as well as retrotransposon loci that may be bound by HELLS. Identifying HELLS-bound loci also provided us with a means to study correlations with indicators of transcriptional status, including DNA binding factors, mRNA expression, and histone modifications. In this chapter, I present our results from these studies.

3.2 Results

3.2.1 HELLS binding is enriched in gene promoters

To identify HELLS binding sites in the genomes of BL, GBM, and SCLC cells, we mapped HELLS ChIP-seq and input DNA sequence data and performed peak calling to identify genomic regions that were enriched for HELLS-immunoprecipitated DNA. In total, our analysis identified 2,765 statistically significant (p-value < 1 × 10^-5) HELLS peaks in BL; 10,043 in GBM; and 23,939 in SCLC.

Next, we sought to address the question of whether HELLS binding was associated with genes or retrotransposons. We correlated the positions of HELLS peaks with the following classes of genomic regions: intergenic; promoter, which we defined as a region within 2 kb of a transcription start site (TSS); gene, which we defined as a transcription unit; exons; introns; L1 (LINE) retrotransposons; and Alu (SINE) retrotransposons. In all three cell lines, HELLS peaks exhibited the highest degree of overlap with promoters (Figure 3.1). Importantly, permutation testing revealed that the degree of overlap between HELLS peaks and promoters was highly statistically significant (p-value < 0.001), suggesting that HELLS binding is enriched in promoters. HELLS peaks also overlapped appreciably with exons (Figure 3.1); however, this may be attributable to the way that we defined promoters, which could have resulted in the inclusion of up to a few exons in a promoter region. We also found that HELLS peaks were significantly underrepresented in both L1s and Alus (p-value < 0.001), suggesting that HELLS does not target retrotransposons in human genomes.
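
The logic of the overlap analysis can be sketched in Python as follows. This is an illustrative toy on a single hypothetical chromosome, not the pipeline used for the analyses in this chapter: the function names, the coordinates, and the choice to randomize peaks by uniform repositioning are all assumptions made for the example. Only the 2 kb promoter definition is taken from the text.

```python
import random

PROMOTER_FLANK = 2000  # promoter = region within 2 kb of a TSS, as defined in the text

def promoter_windows(tss_positions, flank=PROMOTER_FLANK):
    """Return (start, end) promoter intervals around each TSS, clipped at 0."""
    return [(max(0, tss - flank), tss + flank) for tss in tss_positions]

def count_overlapping(peaks, windows):
    """Count peaks (start, end) that overlap at least one window."""
    return sum(
        any(p_start <= w_end and w_start <= p_end for w_start, w_end in windows)
        for p_start, p_end in peaks
    )

def permutation_p_value(peaks, windows, chrom_length, n_perm=1000, seed=0):
    """Empirical enrichment p-value: how often randomly repositioned peaks
    overlap the windows at least as often as the observed peaks do."""
    rng = random.Random(seed)
    observed = count_overlapping(peaks, windows)
    at_least = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in peaks:
            width = end - start
            new_start = rng.randint(0, chrom_length - width)
            shuffled.append((new_start, new_start + width))
        if count_overlapping(shuffled, windows) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least + 1) / (n_perm + 1)

# Hypothetical coordinates for illustration only.
windows = promoter_windows([10_000, 50_000])
peaks = [(9_500, 10_200), (49_000, 49_400), (80_000, 80_300)]
print(count_overlapping(peaks, windows))
print(permutation_p_value(peaks, windows, chrom_length=1_000_000))
```

With 1,000 permutations the smallest attainable p-value is 1/1001, which is consistent with reporting such results as p-value < 0.001. Testing for underrepresentation, as for L1 and Alu elements, would count permutations with overlap less than or equal to the observed value instead.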

To refine the spatial orientation of HELLS ChIP-seq peaks within promoters, we determined scores for the coverage ratios of HELLS ChIP to input DNA and plotted these across the length of all annotated hg19 promoters. In all three cell lines, the highest scores mapped right around the TSS, showing that HELLS binds to TSSs (Figure 3.2).
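
The coverage-ratio scoring described above can be illustrated with a short Python sketch. It assumes that coverage has already been binned into equal-length windows centered on each TSS; the function names, the pseudocount, and the toy values are hypothetical, and library-size normalization is omitted for brevity.

```python
def coverage_ratio(chip, input_dna, pseudocount=1.0):
    """Per-bin ratio of ChIP to input coverage.

    The pseudocount is an assumption made here to avoid division by zero.
    """
    return [(c + pseudocount) / (i + pseudocount) for c, i in zip(chip, input_dna)]

def mean_profile(profiles):
    """Average equal-length per-promoter profiles bin by bin."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]

# Two hypothetical promoters, five bins each, with the TSS in the middle bin.
promoter_1 = coverage_ratio([1, 3, 9, 3, 1], [1, 1, 1, 1, 1])
promoter_2 = coverage_ratio([0, 1, 5, 1, 0], [0, 1, 1, 1, 0])
average = mean_profile([promoter_1, promoter_2])
print(average)
print(average.index(max(average)))  # index of the bin with the highest score
```

In this toy example the highest average score falls in the central bin, which is the pattern that would be expected if binding were concentrated at the TSS.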

Taking these data together, we concluded that HELLS predominantly targets promoters of protein-coding genes in human cancer cell genomes. This suggested to us that genes with HELLS-bound promoters may be transcriptionally regulated by HELLS.

3.2.2 HELLS-bound promoters are marked by mRNA expression or POL2 binding

To address whether HELLS binding was associated with transcriptional activation or repression of its apparent target genes, we first investigated whether there was evidence of HELLS target gene expression. We performed RNA-seq to profile the transcriptome of BL cells, generated a list of expressed promoters, and asked how many HELLS-bound promoters fell into that list. The majority (88.7%) of HELLS-bound promoters showed evidence of mRNA expression (Figure 3.3), suggesting that HELLS binding may be predominantly associated with activating expression rather than repressing it.
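
The comparison described above amounts to a set intersection, which the following Python sketch makes explicit. The function name and the gene identifiers are hypothetical placeholders, not data from this study.

```python
def fraction_expressed(bound_promoters, expressed_promoters):
    """Fraction of bound promoters that also appear in the expressed list."""
    bound = set(bound_promoters)
    if not bound:
        raise ValueError("bound_promoters must not be empty")
    return len(bound & set(expressed_promoters)) / len(bound)

# Placeholder identifiers for illustration only.
bound = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
expressed = ["GENE_A", "GENE_B", "GENE_C", "GENE_E"]
print(f"{fraction_expressed(bound, expressed):.1%}")
```

The resulting fraction depends on how "expressed" is defined, so any expression threshold applied to the RNA-seq data should be reported alongside it.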

We did not profile the transcriptomes of GBM or SCLC cells, and previously published RNA-seq datasets were not available. However, we were able to download and reanalyze previously published RNA polymerase II (POL2) ChIP-seq data (Lin et al., 2012). POL2 binding is highly correlated with the presence of other basal transcription regulators (Mokry et al., 2012), so it seemed reasonable to consider POL2 peaks at promoters to be indicators of gene expression. POL2 binding occurred at 78.0% and 77.7% of HELLS-bound promoters in GBM and SCLC, respectively (Figure 3.4), providing additional evidence for a strong association between HELLS binding and gene expression.

3.2.3 HELLS binding sites are marked by H3K4me3 and H3K27ac

We also considered whether HELLS binding was associated with histone modifications that are indicative of transcriptional status. Histone H3 lysine 4 trimethylation (H3K4me3) and histone H3 lysine 27 acetylation (H3K27ac) are both histone modifications that are commonly found in nucleosomes that flank active promoters (reviewed in Shlyueva et al., 2014). We were able to locate, download, and reanalyze previously published ChIP-seq data for these two histone modifications (Lin et al., 2012; Qian et al., 2014). We then looked at H3K4me3 and H3K27ac coverage across promoters stratified by HELLS binding status. For HELLS-bound promoters, H3K4me3 and H3K27ac signals were present in a bimodal distribution (Figures 3.5 and 3.6), a typical pattern for these marks at transcribed loci. Promoters that were not bound by HELLS also showed signs of H3K4me3 and H3K27ac, indicating that HELLS is not broadly required for transcription. Nevertheless, the data supported our conclusion that HELLS predominantly binds to the promoters of transcribed genes.

3.3 Conclusions and Discussion

In this chapter, I have described our work to identify and characterize HELLS binding sites in BL, GBM, and SCLC cell line genomes. To the best of our knowledge, our efforts have resulted in the first catalogs of genomic loci bound by HELLS in the human genome. Using those catalogs, we were able to address the question of whether HELLS directly targets protein-coding genes or retrotransposons in human cells. Further, we looked for associations between HELLS binding and various indicators of transcriptional status to address a possible functional role for HELLS in the expression or silencing of genes in cancer cells.

First, we found that the positions of HELLS ChIP-seq peaks, representative of HELLS binding sites, exhibited a significant correlation with gene promoters (Figure 3.1). This result is consistent with published observations of HELLS binding at protein-coding gene promoters in murine fibroblasts (von Eyss et al., 2012). HELLS binding was centered around TSSs (Figure 3.2), suggesting that HELLS may directly regulate the transcriptional status of its target promoters. We concluded that in human cancer genomes, HELLS binding preferentially occurs at gene promoters.

Although HELLS has been reported to directly interact with LINE, SINE, and IAP retrotransposon loci in mouse embryonic fibroblasts (Huang et al., 2004), we found that in human BL, GBM, and SCLC cells, HELLS binding sites were significantly underrepresented in L1 LINE and Alu SINE retrotransposons. From this, we surmised that HELLS does not directly target retrotransposon loci for transcriptional regulation in human cancer cells. It may be that normal somatic and cancer cells exhibit differences in HELLS function. Another interesting possibility that may be worthy of further exploration in future studies is that HELLS may indirectly influence the epigenetic status of retrotransposons through the downstream action of one of its own target genes.

To address how HELLS binding may be associated with the transcriptional status of its target promoters, we looked for correlations between HELLS ChIP-seq peaks and other factors, including mRNA expression, POL2, and histone modifications, at gene promoters. A notable caveat regarding these analyses is that we did not evaluate a potential association between HELLS binding sites and signatures of heterochromatin, such as DNA methylation (reviewed in Rose and Klose, 2014), histone H3 lysine 9 methylation (Peters et al., 2002), and histone H3 lysine 27 trimethylation (reviewed in Shlyueva et al., 2014), owing to a lack of available published data. However, in BL cells, most HELLS-bound promoters exhibited evidence of mRNA expression (Figure 3.3). Further, reanalysis of previously published data (Lin et al., 2012; Qian et al., 2014) revealed that HELLS-bound promoters tended to be marked by POL2 binding, H3K4me3, and H3K27ac (Figures 3.4, 3.5, and 3.6). Active promoters are known to be enriched for POL2, H3K4me3, and H3K27ac (Bonn et al., 2012; Mokry et al., 2012; reviewed in Shlyueva et al., 2014), leading us to conclude that in the cancer cell lines that we profiled, genes with promoters bound by HELLS are likely to be expressed. In the Appendix, we present a list of 578 high-confidence HELLS target genes; the promoters of the genes in this list were bound by HELLS in all three cell lines.

Importantly, our findings corroborate, to a degree, the single previous report demonstrating that HELLS binds to and co-activates the expression of protein-coding genes (von Eyss et al., 2012). Our work bolsters the hypothesis put forth by the authors of that report: that HELLS does not function exclusively in transcriptional silencing and

instead may participate in activating the expression of the loci to which it binds. We propose that HELLS may play a more prominent role in transcriptional activation than has previously been thought. In the next chapter, I discuss our efforts to further characterize transcribed HELLS target genes in order to understand what biological processes or pathways may be influenced by HELLS as a transcriptional co-activator.

3.4 Materials and Methods

3.4.1 Cell culture

All three cell lines were cultured in base medium supplemented with 10% fetal bovine serum (FBS) (Corning) and 1% penicillin/streptomycin (Life Science, Sigma-

Aldrich). For BL cells, FBS was heat-inactivated at 55°C for 30 minutes before formulating the complete growth medium. RPMI-1640 (Gibco®, Life Technologies,

Thermo Fisher Scientific Inc.) was used as the base medium for BL and SCLC cells.

EMEM was used as the base medium for GBM cells. All cells were incubated at 37°C with 5% CO2.

3.4.2 HELLS ChIP

BL cells were processed with a ChIP protocol developed by Dr. Michael C.

Haffner, David M. Esopi, and Dr. Srinivasan Yegnasubramanian (the Johns Hopkins

University School of Medicine). The assay was performed in collaboration with David

Esopi. The formulations of all buffers used in the protocol are provided in Table 3.1.

Cells were counted, collected, and washed in 1x PBS and then fixed in 1% formaldehyde.

Following quenching of fixation with 2.5 M glycine, cells were pelleted at 4°C. Cells were lysed by sequential resuspension in lysis buffers (LB) 1 and 2 plus protease inhibitors,

15-minute incubation on ice, and pelleting at 4°C. Cells were resuspended in Shearing Buffer plus protease inhibitors, snap frozen in liquid nitrogen, and stored at -80°C. Before sonication, the suspension was thawed, and an equal volume of Dilution Buffer plus protease inhibitors was added. Following sonication, additional Dilution Buffer was supplied, and Dynabeads® Protein G (Thermo Scientific, Thermo Fisher Scientific Inc.), which were first washed twice with Dilution Buffer, and Triton X-100 to a final concentration of 1% were added to the chromatin fragments to capture material that binds the beads non-specifically. A small aliquot of suspension was taken and saved to use as total input, and the remainder was incubated at 4°C with shaking for one hour. After precipitation of the beads, chromatin was aliquoted into DNA LoBind tubes (Eppendorf) and incubated with a HELLS rabbit polyclonal antibody (Novus Biologicals, NB100-278) overnight at 4°C with shaking. Fresh beads were blocked for non-specific interaction by washing with

7.5% BSA in 1x DPBS, resuspending in 7.5% BSA in 1x DPBS with yeast tRNA, and incubating overnight at 4°C with shaking. For immunoprecipitation, the blocked beads were added to the chromatin-antibody mixture and incubated at 4°C with shaking for four hours. Beads were sequentially precipitated by magnet and washed in order with TSE I buffer, TSE II buffer, TSE III buffer, and TE, pH 8. Following the last wash, beads were precipitated by magnet, resuspended in Elution Buffer plus proteinase K, and incubated at 55°C for 15 minutes. The eluted chromatin fragments were collected, and a second elution was performed on the beads. The collected HELLS-bound fragments underwent reverse cross-linking overnight before being treated with RNase A and proteinase K, cleaned up with a QIAquick spin column (Qiagen), and eluted in molecular biology grade water. HELLS ChIP-seq and input DNA libraries were prepared for sequencing by the

Next Generation Sequencing Center at the Johns Hopkins Sidney Kimmel Cancer Center using standard Illumina protocols to generate 100 bp paired-end reads.

GBM and SCLC cells were processed with the iDeal ChIP-seq Kit for

Transcription Factors (Diagenode, Inc.) according to the manufacturer’s instructions.

Briefly, cells were counted, collected, and washed in 1x PBS; fixed in formaldehyde; and lysed. Chromatin was collected and sonicated, and a portion was reserved for total input

DNA to use as a control. Immunoprecipitation was performed with DiagMag Protein-A coated beads and a HELLS rabbit polyclonal antibody (Novus Biologicals, NB100-278).

HELLS-bound DNA fragments underwent reverse cross-linking before being eluted.

HELLS ChIP-seq and input DNA libraries were prepared for sequencing by the Genome

Technology Center at the New York University Langone Medical Center using standard

Illumina protocols to generate 100 bp paired-end reads.

3.4.3 Collection of published ChIP-seq data

Raw FASTQ files from published ChIP-seq studies of H3K4me3 and H3K27ac in

BL cells (Qian et al., 2014, GEO Series Accession GSE62063) and of POL2, H3K4me3, and H3K27ac in GBM and SCLC cells (Lin et al., 2012, GEO Series Accession

GSE36354) were downloaded, along with corresponding input controls, from the NCBI

Sequence Read Archive (SRA) using SRA Toolkit (Leinonen et al., 2011).

3.4.4 ChIP-seq analysis

FASTQ files were trimmed and aligned to an hg19 reference index with Bowtie v.1.1.1 using default parameters (Langmead, 2010). The resulting mapped reads were used for peak calling with MACS v.1.4.2 (Zhang et al., 2008); input served as a control.

Peak sets were filtered by p-value in R to discard potential artifacts (p-value < 1 × 10^-300).
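As a rough illustration of this filtering step, the sketch below drops extreme-significance peaks. The filtering in this work was done in R; the Python here is only an illustrative stand-in. MACS v.1.4 reports peak significance as -10 × log10(p-value), so a p-value below 1 × 10^-300 corresponds to a score above 3000. The `Peak` type, the function name, and the toy peaks are hypothetical, and the sketch assumes that the threshold given in parentheses describes the peaks that were discarded.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int
    end: int
    neg10_log10_p: float  # MACS 1.4 significance score: -10 * log10(p-value)


# p-value < 1e-300 is equivalent to -10 * log10(p) > 3000.
ARTIFACT_SCORE = 3000.0


def drop_artifacts(peaks: List[Peak]) -> List[Peak]:
    """Keep peaks whose p-value is not below 1e-300 (assumed artifact cutoff)."""
    return [p for p in peaks if p.neg10_log10_p <= ARTIFACT_SCORE]


# Toy peaks (hypothetical values, not data from this study).
peaks = [
    Peak("chr1", 100, 400, 85.2),
    Peak("chr1", 5000, 5600, 3100.0),  # p < 1e-300, treated as an artifact
    Peak("chr2", 10, 300, 3000.0),     # p == 1e-300, retained
]
kept = drop_artifacts(peaks)
```

The cutoff is expressed on the score scale to avoid floating-point underflow, since values as small as 1 × 10^-300 sit close to the limit of double precision.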

Resultant peaks were annotated with the GenomicRanges R package (Lawrence et al.,

2013). Annotations for hg19 intergenic regions, promoters, genes, exons, introns, L1s, and Alus were downloaded from the UCSC Table Browser (Karolchik et al., 2004).

HELLS peak correlations with genomic features were evaluated using the GenometriCorr

R package (Favorov et al., 2012); for statistical significance, 1000 permutations were run.

Jaccard indices indicative of these correlations were plotted using GraphPad Prism 7

(GraphPad Software) and Silhouette Studio Designer Edition software (Silhouette

America). Plots depicting the score of the log2 ratio of HELLS ChIP to input coverage across promoters were created with deepTools v.2.0 (Ramírez et al., 2014) and

Silhouette Studio Designer Edition software (Silhouette America). Venn diagrams were created with BioVenn (Hulsen et al., 2008) and InteractiVenn (Heberle et al., 2015).
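The coverage score plotted in these profiles, the log2 ratio of ChIP to input coverage in fixed-width bins, can be sketched as below. This is a simplified illustration and not a reproduction of deepTools, which also handles read extension, scaling, and normalization. The function name, the per-base coverage lists, and the pseudocount of 1 (used here only to avoid division by zero) are assumptions made for the example.

```python
import math
from typing import List


def binned_log2_ratio(chip: List[float], control: List[float],
                      bin_size: int = 50, pseudocount: float = 1.0) -> List[float]:
    """Mean coverage per bin, then log2((ChIP + pc) / (input + pc)) for each bin."""
    if len(chip) != len(control):
        raise ValueError("ChIP and input coverage must have the same length")
    ratios = []
    for i in range(0, len(chip), bin_size):
        chip_bin = chip[i:i + bin_size]
        ctrl_bin = control[i:i + bin_size]
        chip_mean = sum(chip_bin) / len(chip_bin)
        ctrl_mean = sum(ctrl_bin) / len(ctrl_bin)
        ratios.append(math.log2((chip_mean + pseudocount) / (ctrl_mean + pseudocount)))
    return ratios


# Toy 100 bp region: ChIP coverage is elevated in the first 50 bp bin only.
chip_cov = [7.0] * 50 + [1.0] * 50
input_cov = [1.0] * 100
scores = binned_log2_ratio(chip_cov, input_cov)
```

In this toy region the first bin scores log2(8/2) = 2 and the second bin scores log2(2/2) = 0, which is the kind of contrast the promoter profiles summarize as a mean across many promoters.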

HELLS ChIP-seq FASTQ files, peak files, and bigWig files representative of coverage have been prepared, along with the requisite metadata spreadsheet, to submit to

the Gene Expression Omnibus (GEO) genomics data repository maintained at NCBI. Once submitted and accepted by NCBI, these data will receive a GEO Series Accession

Number and will be made available to the public either three years from the date of submission or upon publication of the data in a journal or preprint server, whichever comes first.

3.4.5 Isolation of total RNA from BL cells

Total RNA was isolated from BL cells with QIAzol Lysis Reagent and the miRNeasy Mini Kit (Qiagen) according to the manufacturer’s instructions. Briefly, cells were pelleted by centrifugation for 5 minutes at 300 × g, and the supernatant was aspirated. The cell pellet was disrupted by addition of QIAzol Lysis Reagent, passage of

the lysate several times through an RNase-free pipette tip, and brief vortexing. Phase separation was performed with QIAzol Lysis Reagent and chloroform, and manufacturer-supplied kit reagents and columns were used to bind, wash, and elute RNA. DNase treatment with RNase-free DNase (Qiagen) was performed after the binding step to ensure DNA-free total RNA for the downstream application of sequencing. Following elution, RNA samples were quantified using the NanoDrop 1000 Spectrophotometer

(Thermo Scientific, Thermo Fisher Scientific Inc.).

3.4.6 RNA-seq

Libraries were rRNA-depleted and prepared for sequencing by the Genome

Technology Center at the New York University Langone Medical Center to obtain 100 bp paired-end reads. FASTQ files were aligned to a precompiled hg19 reference index with HISAT (Kim et al., 2015) using default parameters and a .txt file containing known hg19 splice sites. Transcript assembly was performed with StringTie (Pertea et al., 2015) using default parameters and a premade .gtf file of hg19 RefSeq genes. RSEM (Li and

Dewey, 2011) was used to quantify gene expression. The mean number of fragments per kb of transcript per million mapped reads (FPKM) from two biological replicates was calculated for each gene. Genes with a mean FPKM at or above the fifteenth percentile of all non-zero FPKM values for the transcriptome dataset were classified as expressed genes.
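The expression threshold described above can be illustrated with a short sketch. It assumes that the percentile is taken over the non-zero mean FPKM values and uses the nearest-rank percentile convention; other percentile definitions would give slightly different cutoffs, and the text does not specify which was used. The function names and the toy FPKM values are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple


def nearest_rank_percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile (one of several common conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def expressed_genes(rep1: Dict[str, float], rep2: Dict[str, float],
                    pct: float = 15) -> Tuple[Set[str], float]:
    """Average two replicates, then keep genes at or above the percentile cutoff."""
    mean_fpkm = {gene: (rep1[gene] + rep2[gene]) / 2 for gene in rep1}
    nonzero = [v for v in mean_fpkm.values() if v > 0]
    cutoff = nearest_rank_percentile(nonzero, pct)
    return {gene for gene, v in mean_fpkm.items() if v >= cutoff}, cutoff


# Toy data: gene g0 is silent; g1..g10 have mean FPKM 1..10 (hypothetical values).
rep1 = {"g0": 0.0, **{f"g{i}": i - 0.5 for i in range(1, 11)}}
rep2 = {"g0": 0.0, **{f"g{i}": i + 0.5 for i in range(1, 11)}}
expressed, cutoff = expressed_genes(rep1, rep2)
```

With ten non-zero genes, the nearest-rank fifteenth percentile falls on the second-lowest value, so only the silent gene and the single lowest-expressed gene are excluded in this toy case.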

BL RNA-seq FASTQ files and abundance measurements from RSEM (in the form of .txt files) have been prepared, along with the requisite metadata spreadsheet, to submit to the Gene Expression Omnibus (GEO) genomics data repository maintained at

NCBI. Once submitted and accepted by NCBI, these data will receive a GEO series

accession number and will be made available to the public either three years from the date of submission or upon publication of the data in a journal or preprint server, whichever comes first.


Figure 3.2. Profiles of HELLS ChIP-seq coverage across length of hg19 RefSeq promoters. Blue line indicates the mean score for the log2 ratio of HELLS ChIP coverage to input coverage in 50 bp bins.


Figure 3.3. Overlap between HELLS-bound and expressed promoters in BL cells.

The area-proportional Venn diagram depicts the overlap between HELLS-bound promoters and promoters with RNA-seq-based evidence of mRNA expression.

Figure 3.4. Overlap between HELLS-bound and POL2-bound promoters in GBM and SCLC cells. POL2 ChIP-seq data were reanalyzed from Lin et al., Cell (2012). (A)

The area-proportional Venn diagram depicts the overlap between HELLS-bound promoters and POL2-bound promoters in GBM cells. (B) The area-proportional Venn diagram depicts the overlap between HELLS-bound promoters and POL2-bound promoters in SCLC.


Figure 3.5. Profiles of H3K4me3 ChIP-seq coverage across length of hg19 RefSeq promoters. BL data were reanalyzed from Qian et al., Cell (2014); GBM and SCLC data were reanalyzed from Lin et al., Cell (2012). The red line indicates the mean score for the log2 ratio of H3K4me3 ChIP coverage to input coverage in 50 bp bins across HELLS-bound promoters. The gray line indicates the mean score for the log2 ratio of H3K4me3

ChIP coverage to input coverage in 50 bp bins across promoters to which HELLS was not bound.


Figure 3.6. Profiles of H3K27ac ChIP-seq coverage across length of hg19 RefSeq promoters. BL data were reanalyzed from Qian et al., Cell (2014); GBM and SCLC data were reanalyzed from Lin et al., Cell (2012). The orange line indicates the mean score for the log2 ratio of H3K27ac ChIP coverage to input coverage in 50 bp bins across HELLS-bound promoters. The gray line indicates the mean score for the log2 ratio of H3K27ac

ChIP coverage to input coverage in 50 bp bins across promoters to which HELLS was not bound.


Table 3.1. Formulations of buffers used for HELLS ChIP in BL cells.

Buffer Name    Components
LB1            50 mM HEPES, pH 8; 140 mM NaCl; 1 mM EDTA; 10% glycerol; 0.5% Igepal; 0.25% Triton X-100
LB2            10 mM Tris-HCl, pH 8; 200 mM NaCl; 1 mM EDTA; 0.5 mM EGTA
Shearing       50 mM Tris-HCl, pH 8; 0.4% SDS
Dilution       2 mM EDTA; 150 mM NaCl; 20 mM Tris-HCl, pH 8
Elution        1% SDS; 0.75% sodium bicarbonate
TSE I          0.1% SDS; 1% Triton; 2 mM EDTA; 20 mM Tris-HCl, pH 8; 150 mM NaCl
TSE II         0.1% SDS; 1% Triton; 2 mM EDTA; 20 mM Tris-HCl, pH 8; 500 mM NaCl
TSE III        0.25 M LiCl; 1% Igepal C-630; 1 mM EDTA; 10 mM Tris-HCl, pH 8; 1% deoxycholate

Chapter 4: Discovery of HELLS and MYC association in human cells

4.1 Introduction

As I discussed in the previous chapter, the results of our ChIP-seq analyses indicated that in human cancer cell lines, HELLS predominantly targets the promoters of transcribed genes, suggesting that it may contribute to the activation of their expression.

We next wanted to address how HELLS’s participation in regulating the expression of these genes may be relevant to broader biological mechanisms. To do so, we employed an analytical technique called gene set enrichment analysis (GSEA), which assesses whether genes of interest tend to occur more often than one would expect by random chance alone in gene sets. These gene sets are defined by established biological knowledge (Subramanian et al., 2005), such as published investigations of biochemical pathways and cellular processes. In using GSEA, our goal was to identify biological processes mediated by HELLS target genes and, thus, areas for further inquiry.

As is often the case with this type of analysis, GSEA yielded many statistically significant results, but one result stood out as particularly noteworthy: HELLS target genes in all three cell lines were significantly enriched in gene sets related to MYC proto-oncogene, bHLH transcription factor (MYC), suggesting that HELLS and MYC share

common target genes. We became interested in exploring this possibility for several reasons.

First, like HELLS, MYC has a well-established association with cellular proliferation. MYC plays a key role in regulating mammalian cellular proliferation, particularly by inducing the expression of genes encoding cyclin-dependent kinase

(CDK) complex proteins that direct cell cycle progression (Bouchard et al., 1999; Perez-

Roger et al., 1999). Like HELLS, MYC is expressed in germinal center B cells, where it is needed for germinal center formation and maintenance (Calado et al., 2012).

Second, phenotypic similarities between animal models of HELLS and MYC loss of function supported the idea that HELLS and MYC may share functions or belong to the same pathway. Both complete and partial loss of MYC function have been modeled in mice and in fruit flies, which possess a MYC homolog called dMyc that is encoded by the diminutive (dm) gene on the X chromosome. In mice, homozygosity for null mutations in Myc resulted in embryonic lethality approximately midway through gestation (Davis et al., 1993; Trumpp et al., 2001). Myc-null embryos were dramatically reduced in size relative to wild-type embryos, suggesting that the loss of MYC causes growth deficiencies. This hypothesis is additionally supported by persistent body size and growth deficiencies in mice heterozygous for a null Myc mutation (Trumpp et al., 2001).

In Drosophila, male larvae hemizygous for a null dm allele hatched at approximately the same rate as control larvae, suggesting that they undergo normal embryonic development; however, after hatching, dm-null mutants were notably smaller than wild-type larvae and continued to exhibit growth deficiencies until the second instar, when they ceased to

grow altogether and subsequently died (Pierce et al., 2004). These phenotypes are indeed reminiscent of murine Hells nullomorphs, which developed normally as embryos but exhibited growth defects at birth and died in the perinatal period (Geiman et al., 2001).

Partial loss of MYC engineered through hypomorphic mutations resulted in reduced body size in both mice (Trumpp et al., 2001) and fruit flies (Johnston et al.,

1999), but mutant animals were viable, similar to mice homozygous for a partial-loss-of-function Hells allele (Sun et al., 2004).

Finally, elevated MYC expression is known to be oncogenic and commonly occurs in human cancers (reviewed in Dang, 2012). Due to a common t(8;14) translocation that places MYC under the control of the regulatory elements governing immunoglobulin H expression, MYC is overexpressed in many Burkitt lymphomas and

BL cell lines (Manolov and Manolova, 1972; Zech et al., 1976), including the cell line used in our ChIP-seq study (Bemark and Neuberger, 2000). Other mechanisms that result in oncogenic MYC include enhanced expression or amplification, which have been shown to drive MYC overexpression in the GBM and SCLC cell lines used in our ChIP-seq study (Campbell et al., 2008; Hirvonen et al., 1994). It is thought that the oncogenic effects of MYC are exerted by altered transcription that results from excess MYC

(reviewed in Dang, 2012).

Having provided some context to explain why the discovery of an association between HELLS and MYC target genes was particularly interesting, I will return to precisely how we made that discovery. In the sections that follow, I describe how we used GSEA to identify an association between HELLS and MYC. I also present

experiments and analyses that were performed to develop a better understanding of the nature of the association between the two proteins.

4.2 Results

4.2.1 Transcribed HELLS-bound genes are enriched for MYC targets

To search for common biological themes among transcribed HELLS target genes,

I asked whether those target genes were overrepresented in any gene sets from the

Molecular Signatures Database (MSigDB) (Subramanian et al., 2005), which houses an extensive collection of annotated gene sets. Strikingly, in all three cell lines in which we performed HELLS ChIP-seq studies, expressed HELLS targets were significantly enriched in gene sets related to MYC, including sets of high confidence MYC target genes (Liberzon et al., 2015); genes bound directly by MYC (Ben-Porath et al., 2008;

Zeller et al., 2003); genes that exhibit elevated expression in response to MYC binding

(Zeller et al., 2003); and genes with promoters that contain binding sites for MYC and its partner MAX (Xiaohui Xie, Broad Institute) (Figure 4.1). These results suggested that

HELLS and MYC may share target genes in human cancer cell lines.
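The over-representation test behind these enrichment results (described in Section 4.4.1) can be sketched as follows. The analysis in this work was carried out with a custom R script; the Python below is only an illustrative equivalent, not that script. It computes the upper-tail hypergeometric probability of observing at least k overlapping genes and applies the Benjamini-Hochberg correction. The function names and the toy numbers are hypothetical.

```python
from math import comb
from typing import List


def enrichment_p(k: int, K: int, N: int, n: int) -> float:
    """P(X >= k) for X ~ Hypergeometric: N genes in total, K in the gene set,
    n query genes drawn, k of which fall in the gene set."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg FDR q-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        qvals[idx] = running_min
    return qvals


# Toy example: 10 genes in total, a gene set of 4, a query of 3, overlap of 2.
p_toy = enrichment_p(k=2, K=4, N=10, n=3)
q_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Summing from k upward is equivalent to evaluating the cumulative distribution at k - 1 and taking the upper tail, which is why the methods describe the test in terms of k - 1. Exact integer arithmetic keeps the sketch simple; for genome-scale values of N and n, a log-space implementation would be faster.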

4.2.2 HELLS and MYC colocalize on chromatin, notably at promoters

If HELLS and MYC share target genes, then the two proteins should colocalize on chromatin. To establish whether HELLS and MYC do, in fact, bind to the same genomic targets, it was necessary to identify MYC binding sites in BL, GBM, and SCLC cell genomes and compare them to the HELLS binding sites that we identified in the studies described in Chapter 3. I downloaded published MYC ChIP-seq data for BL

(Seitz et al., 2011) as well as GBM and SCLC cells (Lin et al., 2012) and reanalyzed them with my own pipeline, identifying 11,293 statistically significant (p-value < 1 × 10^-5)

MYC peaks in BL; 18,779 in GBM; and 49,036 in SCLC. Permutation testing revealed that across the genome, HELLS peaks overlapped with MYC peaks to a higher degree than would be expected by random chance alone (p-value < 0.001 in all three cell lines), indicating that HELLS binding is enriched in loci that are bound by MYC. Profiles of

MYC ChIP-seq signal density across promoters revealed that MYC binding concentrated rather sharply at the TSS of promoters that were also bound by HELLS (Figure 4.2), indicating that the two proteins do in fact colocalize at promoters. MYC binding was also detected at promoters that were not bound by HELLS (Figure 4.2), suggesting that MYC can bind in the absence of HELLS; this also implied that MYC and HELLS may not be obligate co-factors for regulating all MYC target genes.
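The logic of a permutation test for peak overlap can be illustrated with a deliberately simplified sketch. It is not the GenometriCorr procedure used in this work: it assumes a single chromosome, half-open intervals, and a null model in which query peaks are repositioned uniformly at random while keeping their lengths. All names and toy coordinates are hypothetical.

```python
import random
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def count_overlaps(query: List[Interval], reference: List[Interval]) -> int:
    """Number of query intervals that overlap at least one reference interval."""
    return sum(any(qs < re_ and rs < qe for rs, re_ in reference) for qs, qe in query)


def permutation_p(query: List[Interval], reference: List[Interval], chrom_len: int,
                  n_perm: int = 1000, seed: int = 0) -> Tuple[int, float]:
    """Empirical p-value for observing at least the real number of overlaps."""
    rng = random.Random(seed)
    observed = count_overlaps(query, reference)
    as_extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in query:
            length = end - start
            new_start = rng.randrange(0, chrom_len - length + 1)
            shuffled.append((new_start, new_start + length))
        if count_overlaps(shuffled, reference) >= observed:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Toy data: every query peak sits inside a reference peak on a 1 Mb chromosome.
reference_peaks = [(i * 100_000, i * 100_000 + 500) for i in range(10)]
query_peaks = [(i * 100_000 + 100, i * 100_000 + 300) for i in range(10)]
observed_overlap, p_perm = permutation_p(query_peaks, reference_peaks, 1_000_000)
```

With 1000 permutations the smallest attainable estimate under the add-one correction is 1/1001, which is one reason permutation p-values are usually reported as an upper bound rather than as an exact value.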

4.2.3 HELLS and MYC share transcribed target genes

MYC plays a key role in global regulation of gene expression (reviewed in Dang,

2012). MYC is largely recognized as a transcriptional activator, although it has been documented to participate in transcriptional repression as well (reviewed in Dang, 2012).

To activate transcription, MYC forms heterodimers with MYC associated factor X

(MAX), and together, the heterodimers bind to DNA at sites containing E box (5’-

CANNTG-3’) motifs (reviewed in Dang, 2012). MYC/MAX heterodimers form large interaction complexes with many cofactors, including general transcription initiation factors like ERCC excision repair 2, TFIIH core complex helicase subunit (TFIIH) as well as proteins that promote or carry out chromatin modulation, like

transformation/transcription domain associated protein (TRRAP) and lysine acetyltransferase 2A (GCN5) (reviewed in Dang, 2012).

This well-documented association between MYC and transcriptional activation, as well as our research indicating that HELLS targets transcribed promoters (described in

Chapter 3), led us to ask whether HELLS and MYC tended to bind to the same active promoters. In this context, we used our ChIP-seq analysis from Chapter 3 to define active promoters as those that (1) exhibited evidence of mRNA expression or POL2 binding and

(2) were marked by H3K4me3 alone, H3K27ac alone, or both H3K4me3 and H3K27ac.

We then asked what fraction of active promoters were bound by HELLS and MYC. In all three cell lines, a clear or near majority of HELLS-bound active promoters were also bound by MYC: 65.3% in BL, 48.5% in GBM, and 91.5% in SCLC (Figure 4.3).
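The two-part definition of an active promoter and the resulting co-binding fraction reduce to simple set operations, sketched below. The promoter labels P1 to P6 and the function names are hypothetical; the sets stand in for the promoter lists derived from the RNA-seq and ChIP-seq analyses.

```python
from typing import Set


def active_promoters(expressed: Set[str], pol2_bound: Set[str],
                     h3k4me3: Set[str], h3k27ac: Set[str]) -> Set[str]:
    """(1) mRNA expression or POL2 binding, and (2) H3K4me3, H3K27ac, or both."""
    return (expressed | pol2_bound) & (h3k4me3 | h3k27ac)


def fraction_cobound(hells_bound: Set[str], myc_bound: Set[str],
                     active: Set[str]) -> float:
    """Fraction of HELLS-bound active promoters that are also bound by MYC."""
    hells_active = hells_bound & active
    if not hells_active:
        return 0.0
    return len(hells_active & myc_bound) / len(hells_active)


# Toy promoter sets (hypothetical labels, not data from this study).
active = active_promoters(
    expressed={"P1", "P2", "P3"},
    pol2_bound={"P3", "P4"},
    h3k4me3={"P1", "P2", "P4", "P6"},
    h3k27ac={"P3"},
)
shared_fraction = fraction_cobound(
    hells_bound={"P1", "P2", "P3", "P5"},
    myc_bound={"P1", "P3", "P4"},
    active=active,
)
```

In the toy data, P5 is HELLS-bound but not active and P6 carries H3K4me3 without evidence of transcription, so neither contributes to the fraction.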

While MYC binding favors the canonical E box motif (5’-CACGTG-3’)

(Blackwell et al., 1990), MYC overexpression has been documented to lead to less discriminate binding of MYC at active promoters and even enhancers (Lin et al., 2012).

To establish whether target promoters apparently shared by HELLS and MYC were likely to be genuine MYC targets, we analyzed the sequences of active promoters bound by HELLS, MYC, or both proteins, calculating the fraction of these promoters that contained at least one canonical E box. In all three cell lines, close to half of the active promoters bound by MYC alone contained a canonical E box (Figure 4.4), indicating that perhaps some less discriminate MYC binding occurs; nevertheless, a sizeable fraction of these active promoters are likely to be bona fide MYC targets. Similar proportions of active promoters bound by HELLS only or by both HELLS and MYC contained the canonical E

box (Figure 4.4), suggesting that a large fraction of shared HELLS/MYC target promoters are likely to be true MYC targets.
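A minimal sketch of the motif scan is given below, assuming promoter sequences are available as plain strings. The canonical E box (5'-CACGTG-3') is its own reverse complement, and the general pattern 5'-CANNTG-3' likewise matches its own reverse complement, so scanning one strand is sufficient. The function names and the toy sequences are hypothetical and are not taken from the analysis itself.

```python
import re
from typing import Dict

CANONICAL_EBOX = re.compile(r"CACGTG")
GENERAL_EBOX = re.compile(r"CA[ACGT]{2}TG")  # 5'-CANNTG-3'


def has_motif(sequence: str, motif: "re.Pattern[str]" = CANONICAL_EBOX) -> bool:
    """True if the promoter sequence contains at least one match to the motif."""
    return motif.search(sequence.upper()) is not None


def fraction_with_motif(promoters: Dict[str, str],
                        motif: "re.Pattern[str]" = CANONICAL_EBOX) -> float:
    """Fraction of promoters containing at least one motif occurrence."""
    if not promoters:
        return 0.0
    return sum(has_motif(seq, motif) for seq in promoters.values()) / len(promoters)


# Toy promoter sequences (hypothetical, far shorter than real promoters).
toy_promoters = {
    "p1": "ttgaCACGTGaa",  # canonical E box
    "p2": "GGCAGCTGTT",    # non-canonical E box (CAGCTG)
    "p3": "ATATATAT",      # no E box
}
canonical_fraction = fraction_with_motif(toy_promoters)
general_fraction = fraction_with_motif(toy_promoters, GENERAL_EBOX)
```

Comparing the two fractions on real promoter sets would show how many additional promoters carry only non-canonical E boxes.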

In summary, these analyses revealed that many active genes with promoters bound by HELLS appear to also serve as targets for MYC binding. This suggests that

HELLS and MYC may share a role in activating the expression of these loci.

4.2.4 Some HELLS/MYC targets have been implicated in cancer biology

Closer inspection of the target genes apparently shared by HELLS and MYC, and that may be regulated by both proteins, revealed some well-known MYC targets that have been implicated in cancer biology. For example, in BL cells, we observed that HELLS and MYC localized with mRNA, H3K4me3, and H3K27ac at the RAN binding protein 1

(RANBP1) promoter (Figure 4.5). RANBP1 is a molecular partner of Ran GTPase, which is known to play a role in regulating various cellular processes, including progression of the cell cycle. Elevated expression of RANBP1 has been implicated in driving cancer invasion and metastasis in cellular models (Kurisetty et al., 2008; Rensen et al., 2008). In

SCLC cells, we observed that HELLS and MYC localized with POL2, H3K4me3, and

H3K27ac at the proteasome 26S subunit, non-ATPase 3 (PSMD3) promoter (Figure 4.6).

PSMD3 is a proteasome S3 subunit family member that functions in protein degradation.

In brain tumors, its expression is inversely correlated with patient survival, suggesting that PSMD3 expression may be a marker of, or contributor to, severity of disease (Patel et al., 2013). A cooperative effort by HELLS and MYC to induce the expression of these loci, then, may have potential ramifications for cancer biology.

4.2.5 HELLS and MYC physically interact in HEK 293T cells

The significant degree of HELLS and MYC colocalization at promoters of expressed genes certainly suggested that the two proteins physically interact. To assess whether any other investigators had previously demonstrated a physical interaction between HELLS and MYC, I searched the literature and was excited to find a single study reporting HELLS as a candidate MYC-interacting protein (Koch et al., 2007). The study’s authors used tandem-affinity purification with mass spectral multidimensional protein identification technology to identify 221 candidate proteins that interacted with a

MYC-TAP fusion protein in HEK 293T cells and DLD1-tTA colorectal cancer cells.

They reported HELLS as one of the candidate MYC-interacting proteins in HEK 293T cells but did not select the putative HELLS/MYC interaction for further experimental studies or validation.

Encouraged by the findings from that proteomics screen as well as our own ChIP-seq observations, we decided to test for a HELLS-MYC physical interaction. In collaboration with Creative BioMart, we generated expression vectors encoding tagged versions of human HELLS (Figure 4.8) and MYC (Figure 4.9); transfected them into

HEK 293T cells, along with tag-only vector controls; and performed co-immunoprecipitation (co-IP) assays. Using tagged HELLS as bait, we were able to immunoprecipitate the approximately 65 kDa tagged MYC protein (Figure 4.10). In the reciprocal co-IP assay, the approximately 110 kDa tagged HELLS protein co-immunoprecipitated with tagged MYC (Figure 4.11). Taken together, these data indicated that HELLS and MYC do physically interact with one another in human cells.

4.3 Conclusions and Discussion

In this chapter, I have presented our efforts to characterize the underlying biological mechanisms connecting the transcribed genes that we identified as targets of

HELLS. These efforts led to the discovery that genes bound by HELLS may also be targets of MYC, a major transcriptional regulator that is tied to normal cellular proliferation as well as the pathophysiology of many human cancers. We then performed a series of experiments to validate this putative association between HELLS and MYC.

Through GSEA, we identified significant enrichment of HELLS targets in gene sets comprised of MYC targets (Figure 4.1). While this analysis did reveal other significantly enriched gene sets, which I have not presented here, we elected to prioritize this MYC-related finding for several reasons: (1) HELLS and MYC are both tightly associated with cellular proliferation (Bouchard et al., 1999; Jarvis et al., 1996; Lee et al.,

2000; Perez-Roger et al., 1999; Sun et al., 2004); (2) HELLS and MYC exhibit similar loss-of-function phenotypes (Davis et al., 1993; Geiman et al., 2001; Johnston et al.,

1999; Pierce et al., 2004; Sun et al., 2004; Trumpp et al., 2001); and (3) HELLS and

MYC are both abundantly expressed in cancer (reviewed in Dang, 2012). Nevertheless, the exploration of non-MYC-related gene sets that showed enrichment for HELLS targets would likely be an interesting future direction and may shed additional light on the molecular functions of HELLS.

In order to assess whether HELLS and MYC truly share target genomic loci in the three cancer cell lines included in our ChIP-seq studies, we compared HELLS and MYC binding sites using reanalyzed published MYC ChIP-seq data (Lin et al., 2012; Seitz et al., 2011). Because we had previously identified promoters as the predominant targets for

HELLS binding in BL, GBM, and SCLC cells, we prioritized the characterization of promoters bound by HELLS and MYC. However, HELLS binding sites were significantly enriched in MYC binding sites throughout the genome, and we did identify instances of HELLS/MYC colocalization in intergenic space (data not shown). In addition to E-box containing promoters (reviewed in Dang, 2012), MYC can bind to enhancer sequences with lower affinity E-box sequences. A rigorous characterization of

HELLS and MYC colocalization at these regulatory elements would be an interesting potential future direction that may shed even more light on the relationship between the two proteins and transcriptional activation.

Our comparative ChIP-seq analysis revealed that MYC notably localized to TSSs bound by HELLS (Figure 4.2), and many expressed promoters bound by HELLS were also bound by MYC (Figure 4.3). Importantly, a substantial fraction of expressed promoters bound by HELLS and MYC contained the canonical E box motif favored for

MYC binding, suggesting that they are likely to be true MYC targets (Figure 4.4). MYC can also bind to non-canonical E box motifs (Blackwell et al., 1993), so it is possible that an even higher fraction of promoters bound by HELLS and MYC are truly targeted by both proteins. We concluded that HELLS and MYC share many target genes in human cancer cell lines.

Notably, active genes with promoters bound by HELLS and MYC included several known MYC target genes (Figures 4.5 and 4.6) that, when overexpressed, are positively correlated with tumor progression and negatively correlated with survival

outcomes (Abe et al., 2008; Kurisetty et al., 2008; Patel et al., 2013; Rensen et al., 2008;

Xia et al., 2008). That HELLS and MYC colocalize with positive indicators of transcription at these gene promoters suggests that the proteins may co-regulate their expression. If this were indeed to be the case, there could be significant implications for tumor biology.

HELLS and MYC co-occupancy at gene promoters suggested that the two proteins physically interact, and to test this putative interaction, we performed reciprocal co-IP assays with tagged HELLS (Figure 4.7) and MYC (Figure 4.8) proteins in HEK

293T cells. These assays demonstrated that HELLS and MYC physically interacted with one another (Figures 4.9 and 4.10). Importantly, these experiments validated an early report of a HELLS/MYC interaction that was detected in a proteomic screen (Koch et al.,

2007). We concluded that HELLS is a bona fide MYC-interacting protein.

Our observations that HELLS and MYC physically interact with one another and co-occupy the promoters of many expressed genes were reminiscent of the reported association of HELLS with the E2F3 transcription factor (von Eyss et al., 2012), which I discussed in detail in Chapter 1. Briefly, HELLS and E2F3, a member of the E2F family of transcriptional activators, were identified as interacting proteins through co-IP, and

ChIP-seq revealed that the two colocalized on chromatin, predominantly in regions that were marked by H3K4me3 and therefore were likely to be transcribed (von Eyss et al.,

2012). Importantly, the induction of E2F3 target gene expression was impaired following

HELLS knockdown, suggesting that E2F3 requires fully functional HELLS in order to activate transcription of its targets (von Eyss et al., 2012).

In light of the similarities between the HELLS/E2F3 study and our own observations, we became curious about whether HELLS and MYC functionally interact to drive gene expression, as this could have important implications for human cancer biology. Elevated MYC expression has been estimated to occur in up to 70% of human cancers (reviewed in Dang, 2012), making MYC one of the most pervasive oncogenes and therefore, in theory, an attractive candidate target for cancer therapeutics. In practice, however, pharmacological inhibition of MYC itself has been hindered by its nuclear localization, lack of a binding site for small-molecule inhibitors, and essential role in normal tissue biology (reviewed in Whitfield et al., 2017). These intractable challenges have led to efforts to develop indirect targeting strategies to curtail the effects of oncogenic MYC (reviewed in Dang et al., 2017). One such strategy is to use inhibitors that block the interaction of MYC and its co-factors in order to interfere with MYC function (reviewed in Whitfield et al., 2017). Small-molecule inhibitors of MYC/MAX heterodimerization have been shown to block MYC’s transcriptional activator activity in cells and in mouse models (Chauhan et al., 2014; Hart et al., 2014; Soodgupta et al.,

2015; Stellas et al., 2014; Wang et al., 2013; Yap et al., 2013; Yin et al., 2003). Omomyc, a miniprotein comprised of the bHLH-Zip domain of MYC with mutations that alter dimerization specificity, inhibits MYC function by disrupting MYC/MAX heterodimerization, preventing MYC from binding DNA, and occupying E box motifs at promoters (reviewed in Whitfield et al., 2017). Effective in mouse models of cancer while imposing minimal side effects, Omomyc serves as a proof of principle that interfering with MYC’s protein interaction capabilities can circumvent the effects of oncogenic MYC (Annibali et al., 2014; Galardi et al., 2016; Soucek et al., 2004, 2008,

67 2013). In my mind, this raises the question of whether inhibiting other MYC interactors, particularly enzymes like HELLS, may be able to mitigate the consequences of oncogenic MYC.

We therefore hypothesized that HELLS cooperates with MYC to drive expression of its target genes, similar to the way that HELLS cooperates with the E2F3 transcriptional activator. Testing this hypothesis experimentally required the design of a model that would allow for the induction of HELLS loss-of-function as well as the control of MYC expression or transcriptional activator activity. In the two following chapters, I describe the creation and investigation of two such experimental models.

4.4 Materials and Methods

4.4.1 Gene set enrichment analysis (GSEA)

The lists of expressed HELLS target genes were too large to analyze online using MSigDB, so gene set enrichment analysis was performed using a custom R script designed to implement a nearly identical approach. Hallmark (H), curated (C2), and motif (C3) gene sets were downloaded directly from MSigDB as .gmt files and were read into R using the qusage R package (Yaari et al., 2013). The overlap between query gene lists (HELLS target genes) and each gene set was subsequently evaluated. Statistical significance of the overlap between a query gene list and each gene set was represented in the form of a p-value, which was determined from the hypergeometric distribution for (k-1, K, N-K, n), where k is the number of query genes that overlap the gene set, K is the number of genes in the gene set, N is the total number of genes in the genome, and n is the total number of query genes. Because the number of unique gene symbols and ENTREZ IDs present in the RefSeq hg19 gene annotation is 26,227, that numerical value was employed for N. False discovery rate (FDR) q-values were calculated to correct for multiple hypothesis testing according to the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995). Statistical significance for observed HELLS target gene enrichment in MYC-related gene sets was plotted with GraphPad Prism 7 (GraphPad Software) and Silhouette Studio Designer Edition software (Silhouette America).
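
The overlap test described above can be sketched in a few lines. The following Python example is illustrative only and is not the custom R script used in this work. It assumes that the parameters (k-1, K, N-K, n) were evaluated as an upper-tail probability, P(X ≥ k), which is how those arguments would be read by R's phyper with lower.tail = FALSE. The function names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, N, n):
    """Return P(X >= k): the chance of seeing at least k gene-set members
    among n query genes drawn from N genes, K of which are in the gene set.
    Assumed equivalent to R's phyper(k-1, K, N-K, n, lower.tail=FALSE)."""
    lowest = max(k, 0)
    highest = min(K, n)
    total = comb(N, n)
    # Exact integer arithmetic; convert to float only at the end.
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(lowest, highest + 1))
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg FDR q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


if __name__ == "__main__":
    # Toy numbers only; N = 26,227 is the genome size stated in the text.
    p = hypergeom_upper_tail(k=12, K=200, N=26227, n=500)
    print(p, benjamini_hochberg([p, 0.2, 0.9]))
```

The exact-integer sum is slow for very large gene lists; a statistics library would normally be used instead, but the arithmetic shown is the same.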

4.4.2 MYC ChIP-seq analysis

Raw FASTQ files from published ChIP-seq studies of MYC in BL cells (Seitz et al., 2011, GEO Series Accession GSE30726) and in GBM and SCLC cells (Lin et al., 2012, GEO Series Accession GSE36354) were downloaded, along with corresponding input controls, from the NCBI Sequence Read Archive (SRA) using SRA Toolkit (Leinonen et al., 2011). The FASTQ files were then trimmed and aligned to an hg19 reference index with Bowtie v.1.1.1 using default parameters (Langmead, 2010). The resulting mapped reads were used for peak calling with MACS v.1.4.2 (Zhang et al., 2008); input served as a control. Peak sets were filtered by p-value in R to discard potential artifacts (p-value < 1 × 10⁻³⁰⁰). Resultant MYC peaks were annotated with the GenomicRanges R package (Lawrence et al., 2013). The methods used to create or obtain and to analyze HELLS, POL2, H3K4me3, and H3K27ac ChIP-seq peaks, as well as mRNA expression data, used in comparative analyses with MYC peaks are described in Chapter 3. Correlations of HELLS peak positions with MYC peak positions in different genomic compartments were evaluated using the GenometriCorr R package (Favorov et al., 2012); for statistical significance, 1000 permutations were run. Plots depicting the score of the log2 ratio of MYC ChIP coverage to input coverage across promoters were created with deepTools v.2.0 (Ramírez et al., 2014) and Silhouette Studio Designer Edition software (Silhouette America). Venn diagrams were created with BioVenn (Hulsen et al., 2008), InteractiVenn (Heberle et al., 2015), and Silhouette Studio Designer Edition software (Silhouette America). Gene track images were produced with Integrative Genomics Viewer (IGV) (Robinson et al., 2011) and Silhouette Studio Designer Edition software (Silhouette America).
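
As an illustration of how peaks can be assigned to promoters and then compared between two factors, the following Python sketch marks a promoter as bound when any peak overlaps it and intersects the resulting gene sets. It is not the GenomicRanges-based R code used here. The gene names, coordinates, and promoter windows are invented for the example, and intervals are assumed to be half-open.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def bound_promoters(promoters, peaks):
    """Return the set of genes whose promoter overlaps at least one peak.

    promoters: dict mapping gene -> (chrom, start, end)
    peaks:     list of (chrom, start, end) tuples
    """
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))
    bound = set()
    for gene, (chrom, p_start, p_end) in promoters.items():
        for start, end in peaks_by_chrom.get(chrom, []):
            if intervals_overlap(p_start, p_end, start, end):
                bound.add(gene)
                break
    return bound


# Hypothetical promoters and peaks, for illustration only.
promoters = {
    "geneA": ("chr1", 1000, 2000),
    "geneB": ("chr1", 5000, 6000),
    "geneC": ("chr2", 1000, 2000),
}
hells_peaks = [("chr1", 1500, 1700), ("chr2", 900, 1100)]
myc_peaks = [("chr1", 1900, 2100), ("chr1", 5200, 5300)]

hells_bound = bound_promoters(promoters, hells_peaks)
myc_bound = bound_promoters(promoters, myc_peaks)
co_bound = hells_bound & myc_bound  # promoters bound by both factors
```

The three sets correspond to the regions of a two-way Venn diagram: promoters bound by HELLS only, by MYC only, and by both.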

4.4.3 Promoter E box motif analysis

Sequences corresponding to transcribed promoters that were bound by HELLS and/or MYC were obtained using BEDTools (Quinlan and Hall, 2010). Microsoft Excel (Microsoft Office) was used to detect the presence of the CACGTG canonical E box motif in promoter sequences as well as to quantify the fraction of promoters containing at least 1 motif occurrence. The data were plotted using GraphPad Prism 7 (GraphPad Software) and Silhouette Studio Designer Edition software (Silhouette America).
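
The motif search itself is simple enough to express as a short script. The Python sketch below is offered only as an equivalent illustration of the counting that was performed in Excel; the function names are hypothetical. Because CACGTG is its own reverse complement, scanning one strand is sufficient.

```python
EBOX = "CACGTG"  # canonical E box motif


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq))


def count_ebox(seq):
    """Count occurrences of the canonical E box in one promoter sequence."""
    seq = seq.upper()
    count = 0
    pos = seq.find(EBOX)
    while pos != -1:
        count += 1
        pos = seq.find(EBOX, pos + 1)
    return count


def fraction_with_ebox(sequences):
    """Fraction of promoter sequences containing at least one E box."""
    sequences = list(sequences)
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if count_ebox(seq) >= 1)
    return hits / len(sequences)
```

For example, `fraction_with_ebox` can be applied separately to the promoter sequences bound by MYC alone, by HELLS alone, and by both proteins to obtain the three values that are compared.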

4.4.4 Cell culture

HEK 293T cells were cultured by Creative BioMart (Shirley, NY) in DMEM base medium (HyClone, GE Healthcare) supplemented with 10% fetal bovine serum (FBS) (HyClone, GE Healthcare) and penicillin/streptomycin. For transfections, cells were cultured in DMEM supplemented with 10% FBS only; four hours after transfection, the culture medium was switched for the complete growth medium described above.

4.4.5 Plasmid construction and transfection

Vector construction and transfection were performed by Creative BioMart (Shirley, NY). The HELLS-FLAG construct was created by cloning a 3x-FLAG tag into a plasmid encoding isoform 1 (NM_018063) of human HELLS (GenScript, product ID OHu11262D). The MYC-Myc construct was created by cloning a Myc tag into the pcDNA3-cmyc plasmid, which was a gift from Wafik El-Deiry (Addgene plasmid #16011) and encodes human MYC. Plasmid map images were created using SnapGene software v4.0 (SnapGene). Tag-only vectors were prepared and supplied by Creative BioMart. HEK 293T cells were transfected with 2 µg of plasmid using 10 µL of polyethylenimine (PEI) transfection reagent.

4.4.6 Co-immunoprecipitation (co-IP)

HEK 293T cells were collected and lysed by Creative BioMart (Shirley, NY) 48 hours after transfection of plasmids encoding tagged HELLS and MYC. Nuclear extracts were incubated at 4°C for 1 hour with either FLAG magnetic beads (Sigma-Aldrich) or protein A/G magnetic beads with Myc antibody. The beads were washed three times before collecting eluate for further Western blot analysis with FLAG-HRP (Sigma-Aldrich) and anti-MYC (Santa Cruz, sc40) antibodies.

Figure 4.1. Statistical significance of enrichment of transcribed HELLS-bound genes in gene sets comprised of MYC targets. The dotted black line indicates the threshold accepted for statistical significance (the -log10[FDR q-value] that is equivalent to a q-value of 0.05).

Figure 4.2. Profile of MYC ChIP-seq coverage across length of hg19 RefSeq promoters. BL data were reanalyzed from Seitz et al., PLoS One (2011); GBM and SCLC data were reanalyzed from Lin et al., Cell (2012). The blue lines indicate the mean score for the log2 ratio of MYC ChIP coverage to input coverage in 50 bp bins across HELLS-bound promoters. The gray lines indicate the mean score for the log2 ratio of MYC ChIP coverage to input coverage in 50 bp bins across promoters to which HELLS was not bound.

Figure 4.3. Overlap between transcribed promoters bound by HELLS and MYC. MYC ChIP-seq data were reanalyzed from Seitz et al., PLoS One (2011) and Lin et al., Cell (2012). The area-proportional Venn diagrams depict the overlap between transcribed promoters bound by HELLS and by MYC in BL, GBM, and SCLC cells.

Figure 4.4. Canonical E box content of transcribed promoters bound by HELLS and MYC. MYC ChIP-seq data were reanalyzed from Seitz et al., PLoS One (2011) and Lin et al., Cell (2012). The plot displays the fraction of active promoters containing at least 1 canonical E box motif that are bound by MYC alone, by HELLS alone, and by both proteins.

Figure 4.5. Map of pCDNA[…]-3xFLAG-HELLS plasmid. The […] bp plasmid contains features that include human HELLS with an N-terminal 3xFLAG tag under the control of a CMV promoter, neomycin/kanamycin resistance (NeoR/KanR) under the control of an SV40 promoter, ampicillin resistance (AmpR) under the control of the AmpR promoter for bacterial selection, an SV40 origin of replication (SV40 ori), and a bacterial origin of replication (ori). Restriction sites are not shown.

Figure 4.6. Map of pcDNA[…]-Myc-MYC plasmid. The […] bp plasmid contains features that include human MYC with an N-terminal Myc tag under the control of a CMV promoter, neomycin/kanamycin resistance (NeoR/KanR) under the control of an SV40 promoter, ampicillin resistance (AmpR) under the control of the AmpR promoter for bacterial selection, an SV40 origin of replication (SV40 ori), and a bacterial origin of replication (ori). Restriction sites are not shown.

Figure 4.7. HELLS-MYC co-immunoprecipitation assay using tagged HELLS as bait. FLAG = 3xFLAG tag; Myc = Myc tag; HELLS-FLAG = tagged HELLS protein; MYC-Myc = tagged MYC protein; IP = immunoprecipitation; IB = immunoblot.

Figure 4.8. HELLS-MYC co-immunoprecipitation assay using tagged MYC as bait. Myc = Myc tag; FLAG = 3xFLAG tag; MYC-Myc = tagged MYC protein; HELLS-FLAG = tagged HELLS protein; IP = immunoprecipitation; IB = immunoblot.

Chapter 5: Engineering loss of HELLS in human embryonic kidney 293T cells

5.1 Introduction

In order to investigate a possible functional interaction between human HELLS and MYC, we first attempted to develop an appropriate model system in the human embryonic kidney 293T (HEK 293T) cell line. This cell line was originally established by stable transfection of HEK 293 cells with a plasmid encoding a temperature-sensitive SV40 large T antigen allele (DuBridge et al., 1987). HEK 293T cells are reputed to be a highly genetically tractable system, so we anticipated that we would be able to manipulate both HELLS and MYC in them, allowing us to test our hypothesis that HELLS cooperates with MYC to drive gene expression.

To engineer the loss of HELLS expression and, therefore, function, we collaborated with Dr. Chunhong Liu and William Wu to modify the endogenous HELLS locus in HEK 293T cells. As I discussed in Chapter 1, loss-of-function mutations in Hells have been successfully produced in two different mouse models (Geiman et al., 2001; Sun et al., 2004). At the protein level, HELLS is highly conserved between human and mouse (Figure 5.1), and the structures of the human and mouse genes are also highly similar. The murine null and hypomorphic Hells alleles were well-characterized by the groups that engineered them, and mice homozygous for either mutant allele were viable, at least up to the perinatal period (Geiman et al., 2001; Sun et al., 2004). As such, we decided to recreate the genetic alterations made in those mouse models in HEK 293T cells using CRISPR/Cas9 genome editing technology (Cong et al., 2013).

To control the transcriptional activation activity of MYC, we elected to induce constitutive expression of MYC from a stable plasmid transfection. Endogenous MYC is expressed at relatively low levels in HEK 293T cells (Pan et al., 2015), suggesting that MYC would play little to no role in directing transcriptional programs in this system. We theorized that untransfected HEK 293T cells could serve as controls for MYC-expressing ones.

In the remainder of this chapter, I describe our efforts to create HELLS-null and HELLS-deficient HEK 293T cells as well as induce constitutive ectopic MYC expression. We initially sought to address the question of whether MYC-driven target gene expression was reduced in HELLS mutants relative to wild-type cells, but an unexpected result prompted us to ask several different questions, leading to some interesting observations.

5.2 Results

5.2.1 Genome editing recapitulates Hells mouse model mutations in HEK 293T cells

In order to generate human cell lines with loss-of-function HELLS alleles, we used CRISPR/Cas9 genome editing to engineer the null and hypomorphic alleles of the Hells loss-of-function mouse models in HEK 293T cells. To recapitulate the mouse null Hells allele, we designed guide RNAs (gRNAs) to target introns 5 and 7 of human HELLS, which would lead to the deletion of exons 6 and 7 (Figure 5.2). As in the mouse model, this deletion was expected to cause a frameshift, leading to the introduction of a premature stop codon and ultimately a lack of HELLS expression. We termed this human mutant allele HELLS∆6-7. To recapitulate the mouse hypomorphic Hells allele, we designed gRNAs to target introns 9 and 12 of human HELLS, which would lead to the deletion of exons 10, 11, and 12 (Figure 5.2). While these exons encode a portion of the conserved N-terminal DEXDc helicase domain, their loss leaves an intact reading frame, theoretically allowing a smaller protein to be produced. We termed this human mutant allele HELLS∆10-12.
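
The reading-frame reasoning behind the two designs can be made concrete with a small check: removing whole exons preserves the downstream frame only when the total number of deleted coding nucleotides is a multiple of three. The Python sketch below illustrates this rule. The exon lengths are placeholders chosen solely to reproduce the behavior described in the text; they are not the actual HELLS exon lengths, and the function name is hypothetical.

```python
def deletion_preserves_frame(exon_lengths, deleted_exons):
    """True if removing the listed exons keeps the downstream reading frame.

    exon_lengths:  dict mapping exon number -> coding length in nucleotides
    deleted_exons: iterable of exon numbers absent from the transcript
    """
    deleted_nt = sum(exon_lengths[exon] for exon in deleted_exons)
    return deleted_nt % 3 == 0


# PLACEHOLDER lengths for illustration only (not real HELLS annotations).
placeholder_lengths = {5: 112, 6: 95, 7: 120, 10: 90, 11: 141, 12: 105}

null_design = deletion_preserves_frame(placeholder_lengths, [6, 7])
hypomorph_design = deletion_preserves_frame(placeholder_lengths, [10, 11, 12])
with_exon5_skipped = deletion_preserves_frame(placeholder_lengths, [5, 6, 7])
```

With these placeholder values, the exon 6-7 deletion shifts the frame, the exon 10-12 deletion does not, and additionally skipping exon 5 alongside the exon 6-7 deletion brings the transcript back into frame.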

After genotyping to identify clonal cell lines that were homozygous for HELLS∆6-7 or HELLS∆10-12, we performed RT-PCR and Western blot assays to examine the effect of the mutant alleles at the levels of transcripts and proteins. RT-PCR confirmed the presence of smaller amplicons in HELLS∆6-7/∆6-7 (Figure 5.3) and HELLS∆10-12/∆10-12 (Figure 5.4) cell lines, indicating that the targeted deletions led to changes in HELLS transcripts. Sanger sequencing confirmed the absence of exons 10-12 in HELLS∆10-12/∆10-12 RT-PCR products, but interestingly, exon 5 had been spliced out of some of the HELLS∆6-7/∆6-7 RT-PCR products. This particular exon skipping event, in combination with the exon 6/7 deletion, would restore the reading frame. It was not reported in the Hells-null mouse model (Geiman et al., 2001), but its occurrence suggests that HELLS∆6-7/∆6-7 cells may attempt to compensate for the engineered deletion. By Western blot, wild-type HELLS protein did not appear to be expressed in HELLS∆6-7/∆6-7 or HELLS∆10-12/∆10-12 cells (Figure 5.5). However, HELLS∆10-12/∆10-12 cells appeared to express a smaller protein product at low levels (Figure 5.5). This smaller product, with an approximate molecular weight of 77.4 kDa, is consistent with the expected result of the interstitial deletion caused by the absence of exons 10-12; it was also detected in the Hells-deficient mouse model (Sun et al., 2004).

Taking these results together, we concluded that we were able to recapitulate the same genetic alterations as the Hells-null and Hells-deficient mouse models in human cells. While the RT-PCR and Western blot analyses indicated that HELLS expression is similar in HELLS∆10-12/∆10-12 cells and cells derived from Hells-deficient mice, the detection of a splice variant in HELLS∆6-7/∆6-7 cells but not in cells derived from Hells-null mice made us question whether we had truly created a null HELLS allele. We therefore elected to use only HELLS∆10-12/∆10-12 cells in our subsequent experiments, accepting that these cells constituted a system with a partial, rather than full, loss of HELLS.

5.2.2 Transfection of MYC expression vector increases MYC expression in wild-type and HELLS-deficient cells

Having created a partial-loss-of-function HELLS model in HEK 293T cells, we next set out to establish a way to control MYC expression. HEK 293T cells typically express relatively low levels of endogenous MYC (Pan et al., 2015), so it was necessary to induce MYC expression in our HEK 293T cell lines. To this end, we engineered a MYC expression vector, which we named pcDNA3-MYC-Puro (Figure 5.6), by cloning a gene conferring puromycin resistance into the MYC-expressing pcDNA3-cmyc plasmid (Ricci et al., 2004). After stable transfection of pcDNA3-MYC-Puro into wild-type and HELLS∆10-12/∆10-12 cells, we used RT-PCR to evaluate MYC expression. Regardless of genotype, MYC was expressed in all cells that were successfully transfected with pcDNA3-MYC-Puro (Figure 5.7). However, RT-PCR detected MYC expression in some untransfected clonal cell lines, suggesting that the endogenous locus was expressed to some degree. By RT-PCR, one HELLS∆10-12/∆10-12 clonal cell line appeared to express low levels of MYC mRNA, and one wild-type clonal cell line appeared to express MYC mRNA at similar levels to those in the transfected cell lines (Figure 5.7). This result was surprising to us because we noticed no differences in cell morphology, survival, or growth rates between the pair of wild-type cell lines or between the pair of HELLS-deficient cell lines, although we did not formally quantify these phenotypic features. In light of the RT-PCR result, we became concerned that we no longer had an appropriate set of controls to test the hypothesis that loss of HELLS function would impede MYC-driven gene transcription.

5.2.3 Partial loss of HELLS alters expression of genes, including HELLS targets

Instead of proceeding further with the MYC-expressing HEK 293T cell lines, we decided to examine differences in gene expression between untransfected wild-type and HELLS∆10-12/∆10-12 cells to address a different, and perhaps more basic, question: does the partial loss of HELLS function result in an altered HEK 293T transcriptome? Taking that further, we were curious about which genes may be affected by impaired HELLS function and how the expression of those genes differs between wild-type and HELLS-deficient cells. HELLS has previously been implicated in transcriptional regulation (von Eyss et al., 2012; Myant and Stancheva, 2008; Sun et al., 2004), so we fully anticipated that at least some genes would exhibit altered transcription in the absence of fully functional HELLS protein. The HEK 293 cell line, from which HEK 293T cells were derived, exhibits cancer-like characteristics, including karyotypic abnormalities and tumorigenic capabilities (reviewed in Stepanenko and Dmitrenko, 2015). Because our ChIP-seq studies in human cancer cell lines (discussed in Chapter 3) revealed a striking association between HELLS binding and transcriptional activation, we hypothesized that in HEK 293T cells, the loss of HELLS would lead to decreased expression of genes that we had previously identified as HELLS targets (discussed in Chapters 3 and 4).

To compare the gene expression profiles of wild-type and HELLS∆10-12/∆10-12 HEK 293T cells, we prepared and sequenced total RNA and performed differential expression analysis. As seen in Figure 5.8, partial loss of HELLS resulted in increased expression of some genes and decreased expression of others. When we applied thresholds for both biological and statistical significance (described in Methods section 5.4.7), we found that 332 genes met those criteria.

Next, we took a step back to specifically consider HELLS target genes in order to determine whether their expression levels differed between HELLS-deficient and wild-type cells, as well as whether any differences in expression were biologically or statistically significant. Our hypothesis, which we based on our ChIP-seq analysis described in Chapter 3, was that their expression levels would be lower in HELLS-deficient cells than in wild-type ones. Of the 578 high-confidence HELLS target genes that we identified by our ChIP-seq studies (and which are listed in the Appendix), the majority were expressed at lower levels in HELLS∆10-12/∆10-12 cells than in wild-type cells (Figure 5.9), suggesting that HELLS’s putative transcriptional activator activity may be at work in HEK 293T cells. A fraction of HELLS target genes, though, were upregulated by the partial loss of HELLS (Figure 5.9). While HELLS may not bind to the promoters of these genes in HEK 293T cells, this result also may suggest two other possibilities: (1) that HELLS acts to silence these genes, or (2) that these genes are indirectly regulated by HELLS. However, when we applied our criteria for biological and statistical significance of observed differential expression, we found that only two HELLS target genes, SET binding protein 1 (SETBP1) and glutathione S-transferase alpha 4 (GSTA4), met those criteria, raising the possibility that HELLS is not actually biologically relevant to the expression levels of these genes.

5.3 Conclusions and Discussion

In this chapter, I have presented our first attempt to create a human cell line model that exhibits loss of HELLS function and allows for control of MYC expression. Our aim was to use this model to address whether functional HELLS is required for MYC-driven gene transcription.

Using CRISPR/Cas9 technology, we engineered HEK 293T clonal cell lines that were homozygous for either of two loss-of-function HELLS alleles that were originally produced in mouse models (Geiman et al., 2001; Sun et al., 2004). We named these recapitulated null and hypomorphic alleles HELLS∆6-7 and HELLS∆10-12, respectively, to reflect the genetic alterations made to create them (Figure 5.2).

As expected, Western blot analysis confirmed that HELLS∆6-7/∆6-7 cells did not express wild-type protein (Figure 5.5). However, while gel electrophoresis of RT-PCR products confirmed that HELLS cDNA isolated from HELLS∆6-7/∆6-7 cells contained the expected deletion of exons 6 and 7 (Figure 5.3), sequencing of the products revealed a mixture of fragments, including one that showed evidence of an exon skipping event that would theoretically restore the reading frame. This observation led us to speculate that a total loss of HELLS may be sufficiently detrimental in HEK 293T cells for compensatory mechanisms to be engaged, although we did not investigate this theory. In any case, this exon skipping event was not reported to occur in the Hells-null mouse model (Geiman et al., 2001), leading us to conclude that we may not have created a true null HELLS allele in HEK 293T cells.

In contrast, HELLS∆10-12/∆10-12 cells produced the expected HELLS cDNA (Figure 5.3, Figure 5.4), and Western blot showed little to no expression of either wild-type HELLS or the truncated product expected to result from the engineered deletion (Figure 5.5). This is consistent with observations made in the mouse Hells hypomorph (Sun et al., 2004), which carries the same deletion as HELLS∆10-12/∆10-12 cells. We therefore concluded that we had recapitulated the murine hypomorphic Hells allele in a human cell line.

To induce MYC expression, we generated a MYC expression vector (Figure 5.6) and transfected it into wild-type and HELLS∆10-12/∆10-12 cells. Positive transfectants of both genotypes expressed MYC mRNA; however, so did two clonal untransfected lines, one wild-type and one homozygous for HELLS∆10-12 (Figure 5.7). That these two untransfected cell lines apparently expressed MYC mRNA was surprising, given that HEK 293T cells tend to express little to no endogenous MYC (Pan et al., 2015). It is possible that at some point during culture, these two untransfected cell lines accumulated mutations that led to increased MYC expression, but as we did not investigate this particular observation further, this is simply speculation. We concluded that the untransfected cell lines would not be suitable as controls for comparison with their MYC-expressing counterparts and decided against attempting to test our hypothesis that HELLS cooperates with MYC. In future studies, these two MYC-expressing untransfected cell lines may be able to serve as “low MYC” and “high MYC” experimental systems. It would first be important, however, to evaluate how MYC mRNA levels correlate with protein levels in these cell lines.

Nevertheless, the HELLS-deficient cells that we generated, along with wild-type clonal cell lines, presented us with an opportunity to address whether partial loss of HELLS resulted in altered gene expression. We hypothesized that in HEK 293T cells, which exhibit cancer-like characteristics (reviewed in Stepanenko and Dmitrenko, 2015), HELLS acts as a transcriptional activator and that a deficiency in HELLS would result in decreased transcription of the high-confidence HELLS target genes identified through our ChIP-seq studies, described in Chapter 3.

Differential gene expression analysis of the entire transcriptome revealed that 332 genes were differentially expressed between HELLS∆10-12/∆10-12 and wild-type cells at both biological and statistical significance (Figure 5.8). Because HELLS∆10-12/∆10-12 cells model partial, not full, loss of HELLS function, HELLS likely plays a real role in modulating transcription of these genes.

We also specifically considered whether our 578 high-confidence HELLS target genes were differentially expressed between HELLS-deficient and wild-type cells. Most of these HELLS target genes were downregulated in HELLS∆10-12/∆10-12 cells, but a fraction was upregulated (Figure 5.9). While ChIP-based analysis would be required to know with certainty whether any of these genes are bound by HELLS in this cell line, this observation suggests that HELLS may be able to act in both silencing and activation of gene expression in HEK 293T cells. As I discussed in Chapter 1, most published investigations of the molecular functions of HELLS produced evidence that HELLS acts as a transcriptional repressor for protein-coding genes and for retrotransposon loci (Huang et al., 2004; Myant and Stancheva, 2008). In future studies, it would be interesting to explore whether the loss of HELLS increases retrotransposon expression and whether HELLS directly targets those elements in HEK 293T cells. It must be noted, however, that only 2 of the 578 high-confidence HELLS targets fell into the list of the 332 genes that were differentially expressed at both biological and statistical significance. While it is certainly possible that HELLS is not truly functionally important in regulating the expression of genes in this target gene set, this particular observation may in fact highlight the limits of our model system. As I have stated before, HELLS∆10-12/∆10-12 cells model only a partial loss of HELLS function, as they do express some mutant HELLS protein, albeit at low levels. To better evaluate the biological necessity of HELLS in transcriptional regulation of these, and other, genes, the development of a true HELLS-null cell line is likely necessary. This may not be easy to do, given that our own attempt to create a null HELLS allele yielded ambiguous results. The fact that HELLS-null mice die shortly after birth and exhibit some tissue abnormalities suggests that HELLS function is essential in some cell types (Geiman et al., 2001). This idea is further supported by the observation that HELLS is essential for survival in some human cancer cells (Hart et al., 2015).

In conclusion, we were able to effectively model partial loss of HELLS function in HEK 293T cells by generating the HELLS∆10-12/∆10-12 cell line, but we could not control MYC expression in a way that would allow us to address whether HELLS cooperates with MYC to activate transcription. In the next chapter, I discuss our attempt to develop a different model with which to answer that question.

5.4 Materials and Methods

5.4.1 Amino acid sequence alignment

Amino acid sequences for human HELLS isoform 1 (NCBI Reference Sequence NP_060533.2) and mouse HELLS (NCBI Reference Sequence NP_032260.2) were downloaded in FASTA format from the NCBI Protein database available at https://www.ncbi.nlm.nih.gov/. Sequences were aligned using Clustal Omega with default parameters; Clustal Omega was accessed online at EBI (http://www.ebi.ac.uk/Tools/msa/clustalo/). BOXSHADE (http://sourceforge.net) and Microsoft PowerPoint (Microsoft Office) were used to visualize the alignment.

5.4.2 Cell culture

HEK 293T cells were maintained in 1x high glucose (4.5 g/L D-Glucose) DMEM medium (Gibco®, Life Technologies, Thermo Fisher Scientific Inc.) supplemented with 10% fetal bovine serum (Corning) and 1% penicillin/streptomycin (Life Science, Sigma-Aldrich). Cells were incubated at 37°C with 5% CO2.

5.4.3 Creation of HEK 293T cells with mutations in HELLS

CRISPR/Cas9 targeted genome editing to create mutant HELLS alleles in HEK 293T cells was performed in collaboration with Dr. Chunhong Lu and William Wu (the Johns Hopkins University School of Medicine). Guide RNAs (gRNAs) to engineer targeted deletion of HELLS exons 6 and 7 (HELLS∆6-7) and deletion of HELLS exons 10-12 (HELLS∆10-12) were designed using MIT CRISPR Design (http://crispr.mit.edu/). Table 5.1 lists gRNA genomic target sites and sequences as well as oligo sequences that were synthesized for cloning. Pairs of gRNAs were cloned into the pSpCas9(BB)-2A-GFP vector, which contains Cas9 and a GFP marker to aid in the selection of positive transfectants. Pairs of gRNA oligos were annealed and directly ligated with BbsI sites in the pSpCas9(BB)-2A-GFP vector before transformation into E. coli. Positive constructs were identified by Sanger sequencing. The gRNA-pSpCas9(BB)-2A-GFP vectors were transfected into HEK 293T cells using FuGENE® HD Transfection Reagent (Promega). Two days after transfection, transfected cells were pooled and genotyped to look for the intended HELLS mutant alleles. Positive pools were sorted into single cells, which were plated and grown for 2-3 weeks before another round of genotyping to identify colonies that were homozygous for HELLS∆6-7 or HELLS∆10-12. RT-PCR followed by gel electrophoresis and Sanger sequencing was also used to validate HELLS mutations; primer pairs used for RT-PCR are provided in Table 5.2. To evaluate protein expression, Western blot analysis was performed using a HELLS rabbit polyclonal antibody (Abcam, ab3851), a β-actin mouse monoclonal antibody (Santa Cruz, sc-47778), and fluorescent secondary antibodies (LI-COR Biosciences). The blot was scanned on a LI-COR Biosciences scanner and imaged using LI-COR Odyssey (LI-COR Biosciences) software and Microsoft PowerPoint (Microsoft Office).

5.4.4 Synthesis of MYC expression vector pcDNA3-MYC-Puro

MYC was received in the pcDNA3-cmyc vector, which was gifted by Wafik El-Deiry (Addgene plasmid #16011) and has been previously described (Ricci et al., 2004). The pcDNA3-cmyc plasmid contains a mammalian selectable marker for neomycin/G418, but as HEK 293T cells are already resistant to this agent, switching to an alternative marker was necessary. In preparation for Gibson assembly, the vector was linearized with DraIII-HF and BstZ17I restriction enzymes (New England BioLabs, Inc.), and the resultant 5.1 kb backbone was gel purified with the Zymoclean™ Gel DNA Recovery Kit (Zymo Research) and quantified with the NanoDrop 1000 Spectrophotometer (Thermo Scientific, Thermo Fisher Scientific Inc.). Puro, which confers resistance to puromycin, was subcloned along with an SV40 promoter and an SV40 poly-A from the pDA019 pCEP Puro Tet vector (gifted by Daniel Ardeljan at the Johns Hopkins University School of Medicine). The Puro fragment was PCR-amplified from pDA019 using Q5® Hot Start High Fidelity 2x Master Mix (New England BioLabs Inc.) and custom primers with 40 bp “tails” complementary to pcDNA3-cmyc (5’-CTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACCCAGCTGTGGAATGTGTGTCAGT-3’, 5’-CATGATTACGCCAAGCTCTAGCTAGAGGTCGACGGTAACTATCGGCGAGTGATCCAGACA-3’). The amplicon was subsequently gel purified with the Zymoclean™ Gel DNA Recovery Kit (Zymo Research) and quantified with the NanoDrop 1000 Spectrophotometer (Thermo Scientific, Thermo Fisher Scientific Inc.). Molar quantities of the vector backbone and the Puro fragment were calculated from NanoDrop-measured concentrations using the NEBioCalculator™ (New England BioLabs, Inc.) available at https://nebiocalculator.neb.com/#!/. The fragments were joined using the NEBuilder® Hi-Fi DNA Assembly Cloning Kit (New England BioLabs, Inc.) according to the manufacturer’s instructions and the recommended vector-to-insert molar ratio of 1:2. The assembled vector, henceforth referred to as pcDNA3-MYC-Puro, was transformed into competent E. coli cells, which were subsequently plated and incubated at 37ºC for 17 hours. To screen for the presence of the Puro insert, colony PCR was performed with OneTaq® Quick-Load® 2X Master Mix with Standard Buffer (New England BioLabs, Inc.) according to the manufacturer’s instructions and recommended protocol. Plasmid minipreps were performed on liquid cultures grown from positive colonies using the Zymo ZYPPY™ Plasmid Miniprep Kit (Zymo Research). GENEWIZ, LLC sequence-verified the Gibson assembly junctions and MYC and performed plasmid midi preps. A pcDNA3-MYC-Puro map was created using SnapGene software version 4 (SnapGene).
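
The molar calculation behind the 1:2 vector-to-insert ratio can be sketched as follows. This Python example is illustrative and is not the NEBioCalculator itself. It assumes the common approximation of about 650 g/mol per base pair of double-stranded DNA, so results may differ slightly from those of the online tool. The 5.1 kb backbone length comes from the text, whereas the 100 ng backbone mass and 1,500 bp insert length are hypothetical values chosen for the example.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (approximation)


def ng_to_pmol(mass_ng, length_bp):
    """Convert a mass of dsDNA (ng) to picomoles for a given length (bp)."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def pmol_to_ng(pmol, length_bp):
    """Convert picomoles of dsDNA to mass (ng) for a given length (bp)."""
    return pmol * length_bp * AVG_BP_MASS / 1000.0


def insert_mass_for_ratio(vector_ng, vector_bp, insert_bp, insert_per_vector=2.0):
    """Mass of insert (ng) needed for the requested insert:vector molar ratio."""
    vector_pmol = ng_to_pmol(vector_ng, vector_bp)
    return pmol_to_ng(vector_pmol * insert_per_vector, insert_bp)


# Backbone length from the text; mass and insert length are HYPOTHETICAL.
backbone_bp = 5100
backbone_ng = 100.0
example_insert_bp = 1500

backbone_pmol = ng_to_pmol(backbone_ng, backbone_bp)
insert_ng = insert_mass_for_ratio(backbone_ng, backbone_bp, example_insert_bp)
```

Because the same per-base-pair constant appears in both conversions, the insert mass reduces to the backbone mass multiplied by the molar ratio and by the ratio of insert length to backbone length.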

5.4.5 Engineering of MYC-expressing HEK 293T cells

HEK 293T cells that were homozygous for wild-type HELLS or for HELLS∆10-12 were separately seeded in 6-well culture plates 24 hours before transfection. Plasmid

DNA for pcDNA3-MYC-Puro was diluted in 1x OPTI-MEM® I Reduced Serum Medium

(Gibco®, Life Technologies, Thermo Fisher Scientific Inc.), vortexed, and incubated at room temperature for 20 minutes. After warming to room temperature, FuGENE® HD

Transfection Reagent (Promega Corporation) was added to the diluted plasmid DNA in a

3:1 FuGENE® HD:DNA ratio (0.3 µL: 1 µg) and vortexed. The FuGENE® HD/DNA mixture was incubated at room temperature for 15 minutes to allow for complex formation. Following incubation, the mixture was added drop-wise to cells, with gentle swirling of the culture plate in between drops. Cells were incubated at 37°C with 5% CO2 for 24 hours. Following the 24-hour incubation, cells were passaged to larger vessels and supplemented with 1 µg/mL puromycin to begin selection for positive transfectants.

Thereafter, cells were continuously cultured with puromycin to maintain pcDNA3-MYC-

Puro. RT-PCR was performed to assess MYC expression. RNAs were reverse transcribed to complementary DNA (cDNA) using the iScript™ cDNA Synthesis Kit (Bio-Rad) according to the manufacturer’s instructions. Per synthesis reaction, 1 µg of RNA was used as a template; additionally, a no-template reaction was performed to serve as a negative control. The resultant cDNAs were used as templates for RT-PCR with Q5®

High-Fidelity 2X Master Mix (New England BioLabs, Inc.) using the manufacturer’s recommended protocol and specific primers (forward primer 5’-TACCCTCTCAACGACAGCAG-3’; reverse primer 5’-TCTTGACATTCTCCTCGGTG-3’) to evaluate MYC expression.

5.4.6 Isolation of total RNA

Total RNA was isolated from cells with TRIzol® Reagent and the PureLink®

RNA Mini Kit (Invitrogen, Life Technologies, Thermo Fisher Scientific Inc.) according to the manufacturer’s instructions. Briefly, cells were lysed directly in culture flasks by addition of TRIzol® Reagent and passage of the lysate several times through an RNase-free pipette tip. Phase separation was performed with TRIzol® Reagent and chloroform, and manufacturer-supplied kit reagents and columns were used to bind, wash, and elute

RNA. DNase treatment with PureLink® DNase (Invitrogen, Life Technologies, Thermo

Fisher Scientific Inc.) was performed after the binding step to ensure DNA-free total

RNA for the downstream application of sequencing. Following elution, RNA samples were quantified using the NanoDrop 1000 Spectrophotometer (Thermo Scientific,

Thermo Fisher Scientific Inc.). In total, RNA was prepared from two independently cultured flasks of HELLS∆10-12/∆10-12 and wild-type clonal cell lines, giving rise to two biological replicates per genotype.

5.4.7 RNA-seq analysis

Libraries were rRNA-depleted and prepared for sequencing by the Genome

Technology Center at the New York University Langone Medical Center using standard

Illumina protocols. Libraries were sequenced on an Illumina HiSeq 4000 to generate 150 bp paired-end reads. FASTQ files were created with Bcl2fastq Conversion Software 1.8.4

(Illumina Inc.). RNA-seq data processing and analysis were performed in collaboration with Xuya Wang, Zuojian Tang, and Dr. David Fenyö (New York University School of

Medicine). FastQC v0.11.5 (Andrews, 2010) was utilized to evaluate raw read quality.

TruSeq2 paired-end adapter sequences were trimmed from raw read pairs using

Trimmomatic v0.36 (Bolger et al., 2014). There were 48.4M to 57.3M qualified read pairs generated among samples, and 85.2% to 86.6% aligned to the hg19 human reference genome. Qualified read pairs in FASTQ format were aligned to human genome assembly hg19 using TopHat v2.0.9 (Kim et al., 2013) and Bowtie2 v2.1.0 (Langmead and Salzberg, 2012) with the following settings: --library-type fr-unstranded --fusion-search --fusion-ignore-chromosomes chrM. Gene counts were calculated with HTSeq v0.6.0

(Anders et al., 2015) with the following settings: --stranded=no --order=name --minaqual=10 --type=exon --idattr=gene_id --mode=union. Negative binomial tests were applied to pairwise comparisons of calculated gene counts using edgeR v3.14.0 (Robinson et al., 2010) to detect differentially expressed genes. The ggplot2 R package (Wickham,

2009) was used to create the volcano plot showing gene expression differences between

HELLS mutant and wild-type cells. We considered differential expression of a gene to be biologically significant when the absolute value of its log2 fold change in expression was greater than or equal to 1, and statistically significant when its adjusted p-value was less than 0.05. R was used to calculate a Z-score from edgeR-normalized counts for each HELLS target gene in each sample. The heatmap depicting these Z-scores was created using the gplots package (Warnes et al., 2009) and

Microsoft PowerPoint (Microsoft Office).
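The two significance thresholds described above can be expressed as a simple filter. The following Python sketch is illustrative only: it assumes that each gene has a log2 fold change (mutant relative to wild-type) and an adjusted p-value, and the gene names and values are hypothetical. It is not the R code used for the analysis.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    log2_fc: float   # log2(mutant / wild-type)
    adj_p: float     # multiple-testing-adjusted p-value


def classify(result: GeneResult, lfc_cutoff: float = 1.0,
             p_cutoff: float = 0.05) -> str:
    """Apply both thresholds: |log2 FC| >= 1 and adjusted p < 0.05."""
    passes_fold = abs(result.log2_fc) >= lfc_cutoff
    passes_p = result.adj_p < p_cutoff
    if not (passes_fold and passes_p):
        return "not significant"
    return "higher in mutant" if result.log2_fc > 0 else "lower in mutant"


if __name__ == "__main__":
    # Hypothetical results for illustration only.
    examples = [
        GeneResult("geneA", 1.4, 0.001),
        GeneResult("geneB", -2.0, 0.03),
        GeneResult("geneC", 0.6, 0.0001),   # small fold change
        GeneResult("geneD", 3.0, 0.20),     # adjusted p-value too large
    ]
    for r in examples:
        print(r.gene, classify(r))
```

Requiring both criteria excludes genes with very small but reproducible changes as well as genes with large but statistically unsupported changes.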

HELLS∆10-12/∆10-12 and wild-type HEK 293T RNA-seq FASTQ files and abundance measurements from RSEM (in the form of txt files) have been prepared, along with the requisite metadata spreadsheet, for submission to the Gene Expression Omnibus (GEO) genomics data repository maintained at NCBI. Once submitted and accepted by NCBI, these data will receive a GEO series accession number and will be made available to the public either three years from the date of submission or upon publication of the data in a journal or preprint server, whichever comes first.

Figure 5.1. Alignment of human and mouse HELLS amino acid sequences. Human HELLS isoform 1 (NCBI Reference Sequence NP_060533.2) and mouse HELLS (NCBI Reference Sequence NP_032260.2) were used for sequence alignment. Identical amino acid residues are shaded in black; similar amino acid residues are shaded in gray. Residues comprising the DEAD-like helicase superfamily domain (DEXDc) and helicase family C-terminal domain (HELICc) are boxed in blue and orange, respectively.


    



Figure. RT-PCR confirmation of deletion of HELLS exons in HELLS mutant clonal cell lines. (A) Schematic of HELLS cDNA fragment amplified in the RT-PCR reaction. Exons (gray boxes) are numbered. Purple arrows indicate approximate primer target sites and orientation. (B) Gel image of RT-PCR products. Expected band sizes for wild-type (WT) and HELLS mutant clonal cell lines are indicated on the image. MW = molecular weight marker.



 



Figure. RT-PCR confirmation of deletion of HELLS exons in HELLS mutant clonal cell lines. (A) Schematic of HELLS cDNA fragment amplified in the RT-PCR reaction. Exons (gray boxes) are numbered. Purple arrows indicate approximate primer target sites and orientation. (B) Gel image of RT-PCR products. Expected band sizes for wild-type (WT) and HELLS mutant clonal cell lines are indicated on the image. MW = molecular weight marker.





 



Figure. Western blot confirmation of loss of HELLS expression in HELLS mutant cell lines. Expected molecular weights of wild-type and mutant proteins are indicated on the blot.



















 



Figure. Map of pcDNA3-MYC-Puro expression vector. The plasmid contains features that include MYC under the control of a CMV promoter, puromycin resistance (PuroR) under the control of an SV40 promoter, ampicillin resistance (AmpR) under the control of the AmpR promoter for bacterial selection, an SV40 origin of replication (SV40 ori), and a bacterial origin of replication (ori). Restriction sites are not shown.



 



Figure. RT-PCR profiling of MYC expression in wild-type (WT) and HELLS∆10-12/∆10-12 clonal cell lines with and without pcDNA3-MYC-Puro. Expected band size for MYC amplicon is indicated on the image. MW = molecular weight marker.











  

Figure. Gene expression differences between HELLS∆10-12/∆10-12 and wild-type clonal cell lines. Volcano plot depicts the log2 fold changes in gene expression between mutant and wild-type cells; positive x-axis values indicate higher expression in mutant than in wild-type, and negative x-axis values indicate lower expression in mutant than in wild-type. The horizontal dashed black line indicates the threshold for statistical significance, with significant adjusted p-values being above the line. The vertical dashed black lines indicate the thresholds for biological significance, with significant fold changes being to the left of the left-hand line and to the right of the right-hand line.







 

Figure. Comparison of HELLS target gene expression between HELLS∆10-12/∆10-12 and wild-type clonal cell lines. On the heatmap, expression values are shown as Z-scores. High Z-scores, shown in red, are indicative of high expression. Low Z-scores, shown in blue, are indicative of low expression.















Table 5.1. Guide RNAs (gRNAs) used to engineer HELLS mutant HEK 293T cells.

Target Site        Target Sequence (5' → 3')    gRNA Oligo Sequences (5' → 3')

HELLS intron 5     TATTCTTCAAAACTAGAATAAGG      CACCG-TATTCTTCAAAACTAGAATA
                                                AAAC-TATTCTAGTTTTGAAGAATA-C

HELLS intron 7     TACAAGTGCTCGTAACCATGTGG      CACCG-TACAAGTGCTCGTAACCATG
                                                AAAC-CATGGTTACGAGCACTTGTA-C

HELLS intron 9     GATTCGACTGAGACAGACTT         CACCG-GATTCGACTGAGACAGACTT
                                                AAAC-AAGTCTGTCTCAGTCGAATC-C

HELLS intron 12    GGATGTATACACTACTGATG         CACCG-GGATGTATACACTACTGATG
                                                AAAC-CATCAGTAGTGTATACATCC-C
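The oligo pairs in Table 5.1 follow a regular pattern: the top oligo is the 20-nucleotide guide sequence (without the PAM) prefixed by CACCG, and the bottom oligo is the reverse complement of the guide flanked by AAAC and a trailing C. A minimal Python sketch of that pattern follows; the function names are illustrative and are not taken from any tool used in this work.

```python
# Sketch: derive a gRNA oligo pair from a 20-nt guide sequence, using the
# overhang pattern visible in Table 5.1 (CACCG-guide / AAAC-revcomp-C).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def grna_oligos(guide: str) -> tuple[str, str]:
    """Return (top oligo, bottom oligo) written 5' to 3'."""
    guide = guide.upper()
    top = "CACCG-" + guide
    bottom = "AAAC-" + reverse_complement(guide) + "-C"
    return top, bottom


if __name__ == "__main__":
    # Guide sequences copied from Table 5.1 (PAM removed where listed).
    for guide in ("TATTCTTCAAAACTAGAATA", "TACAAGTGCTCGTAACCATG"):
        top, bottom = grna_oligos(guide)
        print(top)
        print(bottom)
```

Regenerating the bottom oligo from the guide in this way provides a quick check against transcription errors before oligos are ordered.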

Table 5.2. RT-PCR primers used to confirm targeted deletions in HELLS.

Primer Target Site             Sequence (5' → 3')

HELLS exon 2                   AGAGAGAGCGGAAGATGCTG

HELLS exon 9                   ACCCAATCCCATTTCATCTG

HELLS exon 8/9 junction        TGGAATGGCTTAGGATGCTT

HELLS exon 14/15 junction      TCTTTCTCGGTCCACCTCTG

Chapter 6: Effects of partial loss of HELLS on MYC-driven transcription in human osteosarcoma cells

6.1 Introduction

Due to the difficulties in establishing appropriate controls for MYC expression in

HEK 293T cells that I described in Chapter 5, the question of whether HELLS function is required for MYC-driven transcriptional activation remained unanswered but was nevertheless still a key point of interest. We were fortunate to be able to have a productive and enlightening meeting with Dr. Chi Dang, who introduced us to a cell line,

U2OS MYC-ERTM, that was developed in his laboratory and would conveniently allow for drug-based control of MYC’s transcription factor activity. By inducing loss of

HELLS function in this cell line, we could test the hypothesis that HELLS is a required cofactor in MYC-driven gene expression.

U2OS MYC-ERTM cells were created by stably transfecting U2OS human osteosarcoma cells with a plasmid encoding wild-type MYC fused to the hormone-binding domain of a mutant estrogen receptor, ERTM, that is responsive to 4-hydroxytamoxifen (4OHT) but not to estrogen (Altman et al., 2015; Littlewood et al.,

1995). These cells constitutively express the MYC fusion protein, MYC-ERTM, but like

their parental U2OS cell line, express little to no endogenous MYC (Altman et al., 2015).

Under normal physiological conditions, MYC-ERTM is sequestered in the cytoplasm, which renders it inactive and places the cells in a MYC off state (Figure 6.1, Altman et al.,

2015). After treatment with 4OHT, MYC-ERTM translocates to the nucleus, where it activates the expression of MYC target genes, including ornithine decarboxylase 1

(ODC1) (Figure 6.1, Altman et al., 2015). This places the cells in a MYC on state (Figure

6.1, Altman et al., 2015). Because 4OHT must be suspended in ethanol (EtOH), it is appropriate to use an equivalent volume of EtOH as a control treatment; this maintains the cells’ MYC off state while taking into consideration the minor physiological effects that can be imparted by EtOH.

As I discussed in Chapter 5, our own genome editing efforts were able to produce

HEK 293T cells that were homozygous for a hypomorphic HELLS allele but not any cells that were homozygous for a true null allele, which suggested to us that HELLS is essential for survival, in at least some cell types. Therefore, we elected to induce RNA interference (RNAi)-mediated knockdown of HELLS by transfecting cells with Dicer substrate RNAs (DsiRNAs). We expected that this approach would effectively yield a model of partial loss of HELLS.

To this end, we designed and carried out an experiment in which we knocked down HELLS expression, activated MYC-ERTM to produce the MYC on state, and looked for changes in MYC target gene expression by RNA-seq. Our overarching goal was to address the question of whether fully functional HELLS is required for MYC to activate the expression of MYC target genes.

6.2 Results

6.2.1 MYC target gene expression is upregulated by 4OHT treatment in U2OS MYC-ERTM cells

To establish that the U2OS MYC-ERTM model of MYC activation would work in our own hands, we treated cells with either 4OHT or EtOH to produce the MYC on or

MYC off state, respectively, and then isolated total RNA from cells 24 hours after treatment to allow for any changes in MYC target gene expression to be detectable. Using qPCR, we found that the expression of the known MYC target ODC1 was increased

1.71-fold in the MYC on state relative to the MYC off state (Figure 6.2). Because a ChIP-validated set of MYC target genes in U2OS MYC-ERTM cells was not available, we elected to use Hallmark MYC Target gene sets in our analyses. A gene must receive support from multiple published studies in order to be included in a Hallmark gene set

(Liberzon et al., 2015); as such, Hallmark gene sets can be viewed as high-confidence gene sets. Our RNA-seq analysis revealed that 162 hallmark MYC target genes were upregulated in the MYC on state (adjusted p-value < 0.05). In addition, gene set association analysis (GSAA) showed that the MYC on state was significantly associated with hallmark MYC target gene expression (FDR q-value < 0.001) (Figure 6.3). We concluded that treatment with 4OHT was effective in producing a MYC on state, in which

MYC-ERTM upregulates the expression of known MYC target genes.

6.2.2 HELLS expression is reduced by DsiRNA transfection

We transfected cells with either HELLS or non-target control DsiRNAs and, after waiting for 48 hours to allow for efficient knockdown, treated with either 4OHT or EtOH to produce a MYC on or MYC off state, respectively (Figure 6.4). We harvested total RNA from the cells 48 hours after drug treatment to allow for any changes in MYC target gene expression to be detectable. By qPCR, we confirmed that DsiRNA-induced knockdown significantly reduced HELLS expression by 72-85% (Figure 6.5). The difference in

HELLS expression between MYC activity states was not significant (p-value = 0.73 by

Student’s t-test) (Figure 6.5). The results of our RNA-seq analysis were in agreement with the qPCR assay, showing that log2 fold changes in HELLS mRNA between control knockdown and HELLS knockdown cells ranged from 1.77 to 1.92 (adjusted p-value <

0.001). We concluded that we had indeed created a model of partial loss of HELLS function and that HELLS expression was reduced in both the MYC off and MYC on states.
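To compare the RNA-seq log2 fold changes with the percent knockdown measured by qPCR, the two scales can be interconverted. The short Python sketch below assumes that the reported log2 fold changes are magnitudes of decrease (control knockdown relative to HELLS knockdown); the function names are illustrative.

```python
import math


def percent_reduction(log2_fold_decrease: float) -> float:
    """Percent reduction in expression for a given log2 fold decrease."""
    return (1.0 - 2.0 ** (-log2_fold_decrease)) * 100.0


def log2_fold_decrease(percent: float) -> float:
    """Inverse conversion: log2 fold decrease for a given percent reduction."""
    return -math.log2(1.0 - percent / 100.0)


if __name__ == "__main__":
    for lfc in (1.77, 1.92):          # RNA-seq range reported above
        print(f"log2 fold decrease {lfc}: {percent_reduction(lfc):.1f}% reduction")
    for pct in (72.0, 85.0):          # qPCR range reported above
        print(f"{pct:.0f}% reduction: log2 fold decrease {log2_fold_decrease(pct):.2f}")
```

Under this assumption, a log2 fold decrease of 1 corresponds to a 50% reduction, and values approaching 2 correspond to reductions approaching 75%.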

6.2.3 HELLS knockdown attenuates expression of a hallmark MYC target gene subset

Next, we sought to address the question of whether activated MYC-ERTM was able to induce the expression of hallmark MYC target genes following HELLS knockdown. Our hypothesis was that HELLS would be required for MYC-mediated transcriptional activation. If this were to be the case, we would expect the induction of

MYC target genes to be impaired by the knockdown of HELLS.

First, we looked for differences in hallmark MYC target gene expression between control and HELLS knockdown cells in the MYC on state. GSAA revealed that the expression of hallmark MYC target genes was significantly associated with control knockdown cells (FDR q-value < 0.001) (Figure 6.6), suggesting that the partial loss of

HELLS leads to downregulation of these genes. To investigate whether this was caused

by HELLS knockdown-induced impairment of the ability of MYC-ERTM to activate target gene expression, we compared the expression of hallmark MYC target genes in MYC off versus MYC on states for only the HELLS knockdown condition. GSAA indicated that even after inducing partial loss of HELLS, the expression of hallmark MYC target genes was still significantly associated with the MYC on state (FDR q-value < 0.001) (Figure

6.7). This indicated that as a group, MYC target genes remained responsive to activated

MYC-ERTM in spite of RNAi-induced depletion of HELLS mRNA. From this, we concluded that HELLS may not be absolutely required for MYC-driven transcriptional activation.

To try to understand why the expression of hallmark MYC target genes was apparently enriched in the control knockdown condition and yet capable of being induced following HELLS knockdown, we decided to visualize the expression changes between the MYC on and MYC off states in control versus HELLS knockdown cells. We plotted normalized expression values for all 311 hallmark MYC target genes on a heatmap

(Figure 6.8). This revealed that a large subset of these genes (n = 169) exhibited a similar response to activated MYC-ERTM; that is to say, the expression of the genes in this subset was induced in the MYC on state, irrespective of the knockdown that was performed

(Figure 6.8). Interestingly, though, there was also a subset of MYC target genes (n = 99) for which HELLS knockdown resulted in a lesser degree of upregulation (Figure 6.8). It is likely that this subset of genes was the primary driver of the association that we observed when comparing control and HELLS knockdown in the MYC on state. Differential expression analysis with edgeR (Robinson et al., 2010) revealed that the differences in the expression of this subset of genes between control and HELLS knockdown were modest (log2 fold change 0.11-0.91) but statistically significant (adjusted p-value < 0.05), suggesting that their response to activated MYC-ERTM may be attenuated. Moreover, closer examination revealed that in the MYC off state, HELLS knockdown resulted in statistically significant (adjusted p-value < 0.05) downregulation of 90 of these genes

(log2 fold change 0.12-1.01) (Figure 6.8), suggesting instead that HELLS may also be important for establishing the baseline expression levels of these genes.

In the list of genes that exhibited attenuated expression after HELLS knockdown

(Figure 6.9), we noticed the names of several genes whose expression has been associated with MYC and cancer biology, including cell proliferation and oncogenic transformation. Overexpression of cyclin dependent kinase 4 (CDK4) has been demonstrated to induce hyperproliferation of skin cells in mice (Miliani de Marval et al.,

2001). Overexpression of nucleophosmin 1 (NPM1) was shown to stimulate MYC-dependent hyperproliferation and transformation of cultured cells (Li et al., 2008). If

HELLS truly is important for establishing the baseline expression of these genes, this could have potential implications for human tumor biology.

6.3 Conclusions and Discussion

In this chapter, I have presented a second human cell line model that permitted the control of HELLS expression as well as the transcription factor activity of MYC: U2OS

MYC-ERTM. Using this model, I was able to perform an experiment to directly test the hypothesis that fully functional HELLS is required for MYC-driven gene transcription.

After verifying that treatment with 4OHT produced a MYC on state (Figures 6.2 and 6.3) and HELLS knockdown was achieved (Figure 6.5) in U2OS MYC-ERTM cells, we looked for differences in the expression of hallmark MYC target genes between control and HELLS knockdown conditions. Control knockdown cells expressed these

MYC target genes at higher levels than did HELLS knockdown cells (Figures 6.6 and 6.8), indicating that the loss of HELLS led to downregulation of their expression. However, many hallmark MYC target genes remained responsive to activated MYC-ERTM even with a partial loss of HELLS (Figures 6.7 and 6.8). While it is not clear whether MYC-

ERTM is specifically aided by HELLS in activating target gene expression, HELLS may still be required for establishing typical baseline expression patterns for a subset of these targets (Figure 6.8).

A role for HELLS in positively regulating the expression of this hallmark MYC target gene subset could potentially have pathologic relevance to human cancers. The expression of some of the genes in this subset, including CDK4 and NPM1, has been shown to stimulate cell proliferation and/or malignant transformation of cultured cells (Li et al., 2008; Miliani de Marval et al., 2004). In some cases, the proteins encoded by these genes may actually be critical mediators of the effects of MYC overexpression. For example, CDK4 is known to be a key downstream effector of oncogenic MYC; in an animal model, the loss of CDK4 inhibited MYC’s tumorigenic activities (Miliani de

Marval et al., 2004). Our work raises the question of whether inhibiting HELLS function could truly downregulate the expression of these genes and, possibly, lead to diminished effects of oncogenic MYC.

It is important to note that our experimental system reflects only a partial loss of

HELLS, making it difficult to draw conclusions at this point about what the effects of complete loss of HELLS function would be. We elected to use a knockdown approach due to indications that HELLS is required for survival, at least in some cell types. As I mentioned in Chapter 1, complete ablation of HELLS in mice results in perinatal lethality

(Geiman et al., 2001). Additionally, a search for human cancer cell viability genes found that HELLS was essential for cell survival in one of the five screened human cancer cell lines (Hart et al., 2015). For these reasons, any future functional studies of HELLS will require careful consideration of both experimental system and method for disrupting

HELLS function.

6.4 Materials and Methods

6.4.1 Cell culture

U2OS MYC-ERTM cells were graciously shared by Dr. Brian Altman, Dr. Annie

Hsieh, and Dr. Chi Dang (all formerly affiliated with the Perelman School of Medicine at the University of Pennsylvania). The cells were maintained in 1x high glucose (4.5 g/L

D-Glucose) DMEM medium (Gibco®, Life Technologies, Thermo Fisher Scientific Inc.) supplemented with 10% fetal bovine serum (Corning) and 1% penicillin/streptomycin

(Life Science, Sigma-Aldrich). The medium was additionally supplemented with 100

µg/mL Zeocin™ Selection Reagent (Invitrogen, Life Technologies, Thermo Fisher

Scientific Inc.) to maintain expression of MYC-ERTM. Cells were incubated at 37°C with

5% CO2.

6.4.2 DsiRNA transfection

Human HELLS DsiRNAs were purchased along with a TriFECTa Kit from IDT.

Cells were transfected with DsiRNAs using Lipofectamine® RNAiMAX Transfection

Reagent (Invitrogen™, Thermo Fisher Scientific Inc.), following the manufacturer’s instructions. Briefly, cells were seeded 24 hours prior to transfection to achieve 60-80% confluency by the time of transfection. DsiRNAs and Lipofectamine® RNAiMAX were diluted separately in 1x OPTI-MEM® I Reduced Serum Medium (Gibco®, Life Technologies,

Thermo Fisher Scientific Inc.). Following a brief incubation at room temperature, the diluted transfection reagents were mixed together, vortexed, and incubated at room temperature. DsiRNA-lipid complexes were added to cells in a drop-wise fashion, with mixing between drops. Non-targeting DsiRNAs were used as a negative control.

DsiRNAs targeting the HPRT1 gene were used as a positive control. TYE 653 was used as a transfection efficiency control. DsiRNA oligo information is provided in Table 6.1.

For all transfections, the final concentration of DsiRNAs used was 10 nM. After the addition of DsiRNA-lipid complexes, cells were incubated at 37°C with 5% CO2. The transfection efficiency control was examined 24 hours post-transfection.

6.4.3 Production of MYC on and MYC off states

To activate MYC-ERTM translocation to the nucleus and place cells in a MYC on state, cells were cultured with (Z)-4-hydroxytamoxifen (4OHT) (Sigma-Aldrich), suspended in 100% molecular biology grade ethanol (Sigma Life Science) and supplied to cells in a final concentration of 500 nM. Because EtOH itself may have a minor impact

on cellular physiology, MYC off control cells were treated with an equivalent volume of pure molecular biology grade EtOH.
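The volume of 4OHT stock needed for a given final concentration, and therefore the matching volume of EtOH for the MYC off control, follows from C1 × V1 = C2 × V2. The Python sketch below is illustrative only: the stock concentration and culture volume are hypothetical and are not stated in this chapter.

```python
def stock_volume_ul(final_nM: float, culture_volume_mL: float,
                    stock_mM: float) -> float:
    """Volume of stock (µL) to add, from C1 * V1 = C2 * V2.

    Ignores the small increase in total volume caused by the addition.
    """
    final_molar = final_nM * 1e-9
    stock_molar = stock_mM * 1e-3
    culture_liters = culture_volume_mL * 1e-3
    return final_molar * culture_liters / stock_molar * 1e6


if __name__ == "__main__":
    # Hypothetical: 1 mM 4OHT stock in EtOH, 2 mL of medium per well.
    volume = stock_volume_ul(final_nM=500.0, culture_volume_mL=2.0, stock_mM=1.0)
    print(f"Add {volume:.2f} µL of 4OHT stock (MYC on)")
    print(f"Add {volume:.2f} µL of EtOH alone (MYC off control)")
```

Using the same volume of vehicle in the control wells keeps the EtOH exposure identical between the MYC on and MYC off conditions.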

6.4.4 Isolation of total RNA

Total RNA was isolated from cells with TRIzol® Reagent and the PureLink®

RNA Mini Kit (Invitrogen, Life Technologies, Thermo Fisher Scientific Inc.) according to the manufacturer’s instructions. Briefly, cells were lysed directly in 6-well culture plates by addition of TRIzol® Reagent and passage of the lysate several times through an

RNase-free pipette tip. Phase separation was performed with TRIzol® Reagent and chloroform, and manufacturer-supplied kit reagents and columns were used to bind, wash, and elute RNA. DNase treatment with PureLink® DNase (Invitrogen, Life Technologies,

Thermo Fisher Scientific Inc.) was performed after the binding step to ensure DNA-free total RNA for the downstream application of sequencing. Following elution, RNA samples were quantified using the NanoDrop 1000 Spectrophotometer (Thermo

Scientific, Thermo Fisher Scientific Inc.).

6.4.5 qPCR validation

RNAs were reverse transcribed to complementary DNA (cDNA) using the iScript™ cDNA

Synthesis Kit (Bio-Rad) according to the manufacturer’s instructions. Per synthesis reaction, 1 µg of RNA was used as a template; additionally, a no-template reaction was performed to serve as a negative control. The resultant cDNAs were used as templates in qPCR assays with specific human primers for validation of DsiRNA-mediated gene knockdown and MYC-ERTM activation. HELLS primers were purchased from IDT, and all other primers used were designed using Primer-BLAST (Ye et al., 2012). All qPCR primer pairs are listed in Table 6.2. qPCRs were performed with iQ™ SYBR® Green

Supermix (Bio-Rad) and the MyiQ™ Single-Color Real-Time PCR Detection System

(Bio-Rad), both used according to the manufacturer’s instructions. Reactions were performed in triplicate with 50 ng cDNA template and 500 nM forward and reverse primers per well. Relative mRNA expression levels were normalized to RPLP0 and analyzed using a comparative ∆∆Ct calculation method. Statistical analysis with

Student’s t-test was performed using R, and data were visualized using GraphPad Prism 7

(GraphPad Software) and Silhouette Studio Designer Edition software (Silhouette

America).
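The comparative ∆∆Ct calculation can be summarized in a few lines. The Python sketch below is a generic illustration that assumes approximately 100% amplification efficiency for both the target and the reference gene (RPLP0 in this work). The Ct values are hypothetical, and this is not the analysis script used here.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the comparative (delta-delta Ct) method.

    delta Ct       = Ct(target) - Ct(reference), computed per condition
    delta-delta Ct = delta Ct(sample) - delta Ct(control)
    relative level = 2 ** (-delta-delta Ct)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_sample - delta_control))


if __name__ == "__main__":
    # Hypothetical mean Ct values for a knockdown sample and its control.
    level = relative_expression(ct_target_sample=26.0, ct_ref_sample=18.0,
                                ct_target_control=24.0, ct_ref_control=18.0)
    print(f"Relative expression: {level:.2f} "
          f"({(1.0 - level) * 100.0:.0f}% reduction)")
```

In this hypothetical case, the target crosses threshold two cycles later in the knockdown sample while the reference is unchanged, which corresponds to one quarter of the control expression level.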

6.4.6 RNA-seq analysis

Libraries were rRNA-depleted and prepared for sequencing by the Genome

Technology Center at the New York University Langone Medical Center using standard

Illumina protocols. Libraries were sequenced on an Illumina HiSeq 4000 to generate paired-end 150 bp reads. FASTQ files were created with Bcl2fastq Conversion Software

1.8.4 (Illumina Inc.). RNA-seq data processing and analysis were performed in collaboration with Xuya Wang, Zuojian Tang, and Dr. David Fenyö (New York

University School of Medicine). FastQC v0.11.5 (Andrews, 2010) was utilized to evaluate raw read quality. TruSeq2 paired-end adapter sequences were trimmed from raw read pairs using Trimmomatic v0.36 (Bolger et al., 2014). There were 40.7M to 48.6M qualified read pairs generated among samples, and 85.4% to 89.6% aligned to the hg19 human reference genome. Qualified read pairs in FASTQ format were aligned to human genome assembly hg19 using TopHat v2.0.9 (Kim et al., 2013) and Bowtie2 v2.1.0

(Langmead and Salzberg, 2012) with the following settings: --library-type fr-unstranded --fusion-search --fusion-ignore-chromosomes chrM. Gene counts were calculated with HTSeq v0.6.0 (Anders et al., 2015) with the following settings: --stranded=no --order=name --minaqual=10 --type=exon --idattr=gene_id --mode=union. Negative binomial tests were applied to pairwise comparisons of calculated gene counts using edgeR v3.14.0 (Robinson et al., 2010) to detect differentially expressed genes. R was used to calculate a Z-score from edgeR-normalized counts for each hallmark MYC target gene in each sample. The heatmap depicting these Z-scores was created using the gplots package (Warnes et al.,

2009) and Microsoft PowerPoint.
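The per-gene Z-scores shown on the heatmap can be illustrated as follows. This Python sketch assumes that each gene's normalized counts are centered on that gene's mean across samples and scaled by the sample standard deviation (n − 1 denominator). The counts are hypothetical, and the original calculation was performed in R.

```python
from statistics import mean, stdev


def row_z_scores(values: list[float]) -> list[float]:
    """Z-score one gene's normalized counts across samples."""
    mu = mean(values)
    sd = stdev(values)          # sample standard deviation (n - 1)
    if sd == 0:
        return [0.0] * len(values)   # constant row: no variation to scale
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    # Hypothetical normalized counts for one gene in four samples.
    counts = [120.0, 135.0, 80.0, 75.0]
    print([round(z, 2) for z in row_z_scores(counts)])
```

Scaling each gene separately places highly and lowly expressed genes on a common color scale, so the heatmap emphasizes differences between samples rather than differences in absolute abundance.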

Gene Set Association Analysis for RNA-Seq (GSAASeqSP) (Xiong et al., 2014) was used to examine the association of the expression of genes belonging to Hallmark

MYC Target Gene sets (Liberzon et al., 2015) and experimental condition. HTSeq unnormalized counts were supplied as the expression data. Gene sets were downloaded as .gmt files directly from MSigDB (Subramanian et al., 2005). GSAASeqSP software was run in the gene set permutation mode using default parameters, with the exceptions of increasing the Java heap space memory, the maximum allowable gene count per gene set, and the number of plots produced. Association plots were automatically produced by the GSAASeqSP software. The association plot shown in Figure 6.6 was edited for aesthetics using Silhouette Studio Designer Edition software (Silhouette America).
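Gene sets in the .gmt format are plain tab-delimited text with one gene set per line: a set name, a description field, and then the member gene symbols. A minimal Python sketch of reading such a file is shown below. The set name and its membership are a toy example for illustration and do not reproduce the content of any MSigDB gene set.

```python
def parse_gmt(text: str) -> dict[str, list[str]]:
    """Parse .gmt content into a mapping of gene set name to gene symbols."""
    gene_sets: dict[str, list[str]] = {}
    for line in text.splitlines():
        fields = line.rstrip("\r").split("\t")
        if len(fields) < 2 or not fields[0].strip():
            continue                      # skip blank or malformed lines
        name = fields[0].strip()
        # fields[1] is the description; the remaining fields are genes.
        gene_sets[name] = [g.strip() for g in fields[2:] if g.strip()]
    return gene_sets


if __name__ == "__main__":
    # Toy content; the name, description, and genes are hypothetical.
    toy = "TOY_GENE_SET\tillustrative description\tGENE1\tGENE2\tGENE3\n"
    for name, genes in parse_gmt(toy).items():
        print(name, len(genes), genes)
```

Parsing the file directly makes it easy to confirm the number of genes in each set before adjusting a tool's maximum gene count per gene set.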

RNA-seq FASTQ files and abundance measurements from RSEM (in the form of txt files) have been prepared, along with the requisite metadata spreadsheet, to submit to

Gene Expression Omnibus (GEO) genomics data repository maintained at NCBI. Once

submitted and accepted by NCBI, these data will receive a GEO series accession number and will be made available to the public either three years from the date of submission or upon publication of the data in a journal or preprint server, whichever comes first.

Figure 6.1. Schematic representation of MYC off and MYC on states in U2OS MYC-ERTM cells. (A) The MYC off state occurs under normal physiological conditions, that is to say, the absence of 4OHT. (B) The MYC on state is produced by treatment with 4OHT, which causes MYC-ERTM to translocate into the nucleus and activate the expression of MYC target genes.



Figure 6.2. Confirmation of ODC1 upregulation in the MYC on state. Means and standard error from qPCR technical triplicates of two biological replicates are shown.

Statistical significance was assessed using Student’s t-test (*** denotes p-value < 0.001).

Figure 6.3. Association of expression of hallmark MYC target genes with MYC off and MYC on states. (A) Association plot of gene expression profiles for control knockdown cells in MYC off versus MYC on states for the Hallmark MYC Targets V1 gene set. (B) Association plot of gene expression profiles for control knockdown cells in

MYC off versus MYC on states for the Hallmark MYC Targets V2 gene set.

Figure 6.4. Schematic representation of experiment to test effects of HELLS knockdown on MYC-ERTM transcription factor activity. Abbreviations: h = hour; DsiRNA = Dicer substrate RNA; 4OHT = 4-hydroxytamoxifen; EtOH = ethanol.







Figure 6.5. Confirmation of HELLS downregulation following DsiRNA-mediated knockdown. Means and standard error from qPCR technical triplicates of two biological replicates are shown. Statistical significance was assessed using Student’s t-test (*** denotes p-value < 0.001; ** denotes 0.001 < p-value < 0.01).

Figure 6.6. Association of expression of hallmark MYC target genes with control and HELLS knockdowns in the MYC on state. (A) Association plot of gene expression profiles for the MYC on state in control knockdown versus HELLS knockdown cells for the Hallmark MYC Targets V1 gene set. (B) Association plot of gene expression profiles for the MYC on state in control knockdown versus HELLS knockdown cells for the

Hallmark MYC Targets V2 gene set.


 Figure 6.7. Association of expression of hallmark MYC target genes with MYC off and MYC on states in HELLS knockdown cells. (A) Association plot of gene expression profiles for cells subjected to HELLS knockdown in MYC off versus MYC on states for the Hallmark MYC Targets V1 gene set. (B) Association plot of gene expression profiles for cells subjected to HELLS knockdown in MYC off versus MYC on states for the Hallmark MYC Targets V2 gene set.

134 



 

Figure 6.8. Comparison of hallmark MYC target gene expression between MYC activity states for control and HELLS knockdown cells. Expression values are shown on the heatmap as Z-scores. High Z-scores, shown in red, are indicative of high expression. Low Z-scores, shown in blue, are indicative of low expression. Genes exhibiting similar changes in expression from MYC off to MYC on regardless of knockdown are indicated by a bracket and are labeled. Genes exhibiting attenuated expression levels in both MYC off and MYC on states after HELLS knockdown are also indicated by a bracket and are labeled.
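Expression heatmaps are commonly drawn from per-gene Z-scores, in which each gene's values are centered on that gene's mean and scaled by its standard deviation across samples. The following Python sketch illustrates that standardization only; it is not the analysis code used for this dissertation. The function name `row_z_scores`, the gene labels, the numeric values, the sample order, and the use of the sample standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev


def row_z_scores(values):
    """Standardize one gene's expression values across samples.

    Each value becomes (value - gene mean) / gene standard deviation.
    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        # A gene with identical values in every sample has no variation.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical expression values. The assumed sample order is:
# control/MYC off, control/MYC on, knockdown/MYC off, knockdown/MYC on.
toy_expression = {
    "GENE_A": [10.0, 20.0, 8.0, 14.0],
    "GENE_B": [5.0, 5.0, 5.0, 5.0],
}

z_scores = {gene: row_z_scores(vals) for gene, vals in toy_expression.items()}

for gene, scores in z_scores.items():
    print(gene, [round(s, 2) for s in scores])
```

Because the scaling is done gene by gene, a positive Z-score means only that the gene is expressed above its own average in that sample, so colors are comparable within a row of the heatmap but not between rows.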













 



Figure 6.9. List of genes exhibiting attenuated expression following HELLS knockdown in both MYC activity states. Genes are presented in alphabetical order.



 

































Table 6.2. Primers used for qPCR validation of gene expression changes.

Gene Name  Sequence (5' → 3') or Product Number        Source of Primer Design
RPLP0      CCTCGTGGAAGTGACATCGT, CTGTCTTCCCTGGGCATCAC  NCBI (Primer-BLAST)
HPRT1      CCTGGCGTCGTGATTAGTGA, CGAGCAAGACGTTCAGTCCT  NCBI (Primer-BLAST)
HELLS      Hs.PT.58.679663                             IDT
ODC1       GGCGCTCTGAGATTGTCACT, TCCAAATCCCTCTGCGTGTG  NCBI (Primer-BLAST)

Chapter 7: Concluding Remarks

HELLS is a SNF2-like chromatin remodeler that has exhibited both transcriptional repressor and activator activities in mammalian cells (von Eyss et al., 2012; Myant and Stancheva, 2008; Termanis et al., 2016). Although HELLS has a well-documented association with cell proliferation (Geiman et al., 2001; Jarvis et al., 1996; Lee et al., 2000; Raabe et al., 2001; Sun et al., 2004), it has not been thoroughly studied in the context of cancer, a class of diseases characterized by rapid cell proliferation and aberrant gene expression. For this dissertation, I sought to address HELLS expression patterns and possible molecular functions in human cancers. In this final chapter, I briefly review the major findings from this work and discuss the contributions that these findings make to the research community.

First, we sought to characterize patterns of HELLS expression in normal and malignant human tissues. Through immunohistochemical analysis, described in Chapter 2, we confirmed that HELLS expression is associated with proliferation in human tissues. Using immunohistochemistry and differential expression analyses, we demonstrated that HELLS expression is elevated in a number of human cancers, including lymphoid malignancies and carcinomas. These results suggest that HELLS expression may have pathologic relevance in the context of cancer. Importantly, our analyses highlighted the utility of studying the molecular functions of HELLS in model systems of human cancers.

To better understand the molecular functions of HELLS in human cancer cells, we used ChIP-seq to identify the genomic loci targeted by HELLS and characterize the transcriptional status of those loci. We found that HELLS preferentially binds to gene promoters, predominantly to those that exhibit evidence of transcription. When we began this work, there was only a single report demonstrating involvement of HELLS in activating transcription (von Eyss et al., 2012). Our findings lend additional support to this role for HELLS and suggest that HELLS may play a more prominent role in transcriptional activation than has previously been appreciated. Further, our work has resulted in the creation of the first high-confidence catalog of transcribed HELLS target genes in human cells (provided in the Appendix), which may serve as a useful tool for the research community.

Although HELLS has been shown to bind to LINE and SINE retrotransposons in murine cells (Huang et al., 2004), we found no evidence to suggest that this is a widespread phenomenon in human cancer cells, indicating that HELLS does not directly target these loci for epigenetic regulation in cancers. Nevertheless, the product of one or more HELLS target genes may direct the transcriptional status of retrotransposons, allowing HELLS to serve as an indirect regulator of these loci. While we have not explored this possibility in this work, some of the data that we have generated, including the HELLS target gene catalog and transcriptomes of HELLS-deficient cells, may lend themselves to investigating candidate LINE or SINE regulators as well as retrotransposon expression by RNA-seq in the future.

The use of gene set enrichment analysis to identify underlying biological themes connecting HELLS target genes, described in Chapter 4, led us to discover that expressed genes bound by HELLS are enriched for targets of MYC, a transcription factor with a significant role in normal mammalian cellular proliferation as well as a potent oncogenic effect (reviewed in Dang, 2012). By ChIP-seq analysis, we found that HELLS and MYC do in fact share target genes in human cancer cells; the proteins bind to many of the same promoters, a significant fraction of which are apparently transcribed. This observed co-localization of HELLS and MYC within the genome, as well as a published proteomic screen that identified HELLS as a candidate MYC-interacting protein (Koch et al., 2007), is suggestive of a physical interaction between HELLS and MYC. Co-IP assays demonstrated that HELLS and MYC physically interact with one another in HEK 293T cells, confirming that HELLS is indeed a MYC-interacting protein.

That many shared targets of HELLS and MYC are apparently transcribed led us to hypothesize that HELLS and MYC cooperate to drive their expression. Accordingly, we developed two models in which to investigate the effects of the loss of HELLS function on MYC-induced transcription in human cells.

The first model, described in Chapter 5, was created by inducing homozygous loss-of-function HELLS mutations and ectopic MYC expression in HEK 293T cells. We successfully recapitulated a murine Hells mutant allele (Sun et al., 2004) in these cells, resulting in a partial loss of HELLS function caused by low expression levels of a mutant HELLS protein. While stable transfection of a MYC expression vector resulted in robust MYC expression, some of our control cell lines expressed endogenous MYC mRNA, compromising our ability to investigate transcriptome differences between wild-type and HELLS-deficient cells in the presence or absence of MYC. Instead, we compared the gene expression profiles of wild-type and HELLS-deficient cells without any manipulation of MYC expression and found that the partial loss of HELLS function alters the expression of some genes, including those that we previously identified as direct HELLS targets in human cancer cell lines. These genes were not uniformly downregulated in HELLS-deficient cells relative to wild-type, as we would have expected if HELLS were acting exclusively as a transcriptional activator of these genes. In light of this finding, we propose that HELLS-mediated effects on target locus expression may be cell type- or tissue-specific, and that going forward, HELLS function and activity must be evaluated in different contexts to fully understand the protein’s role in regulating transcription. Additionally, the differences in HELLS target gene expression between wild-type and HELLS-deficient cells were in most cases small, meaning that they may not be biologically significant. While it is certainly possible that HELLS is not a key transcriptional regulator of this particular set of genes in HEK 293T cells, this observation may also highlight the limitations of studying HELLS function using a partial-loss-of-function model. In the future, studies of HELLS would greatly benefit from a human cellular model of complete loss of HELLS function. This may be difficult to achieve, however, as our genome editing efforts suggest that a null allele is difficult to engineer in human cells. Additionally, HELLS is essential for survival in mice (Geiman et al., 2001) and in some cancer cell types (Hart et al., 2015).

The second model, described in Chapter 6, was created by inducing RNAi-mediated knockdown of HELLS in U2OS osteosarcoma cells expressing a MYC fusion protein that becomes transcriptionally active following tamoxifen treatment (Altman et al., 2015). Activated MYC was able to increase expression of its target genes even following HELLS knockdown, suggesting that the partial loss of HELLS did not wholly impair MYC function. However, the expression levels of a subset of MYC targets were consistently lower in HELLS knockdown cells than in wild-type cells. We conclude that HELLS is not a required MYC co-factor but nevertheless plays some role in establishing the baseline expression and full MYC-driven transcription levels of some MYC target genes.

Going forward, fully understanding the nature of any functional interaction between HELLS and MYC, or between HELLS and MYC target genes, will require additional studies. Aside from adding to our understanding of MYC biology, such studies may have ramifications for therapeutic strategies aimed at MYC. MYC plays a significant and frequent role in human cancers and in some cases, MYC target gene expression is needed for the full transforming ability of oncogenic MYC (reviewed in Dang, 2012). While the work presented in this dissertation does not conclusively establish that HELLS is necessary for the expression of MYC target genes, it does indicate that there is likely some role for HELLS in this process. Expanding these studies to include models of complete loss of HELLS function as well as additional human cell line model systems would allow for a more complete characterization of HELLS function related to MYC target gene biology; further, such studies would provide a better indication of whether HELLS inhibition could serve as a meaningful strategy to indirectly target the effects of oncogenic MYC.

Appendix

Here, we present an alphabetized list of 578 high-confidence transcribed HELLS target genes, which were identified and described in Chapter 3. The promoters of the genes in this list were bound by HELLS in all three human cancer cell lines that were profiled by ChIP-seq.
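The requirement that a promoter be bound in every profiled cell line amounts to taking the intersection of the per-cell-line sets of bound genes. The Python sketch below illustrates that one step under stated assumptions; it is not the pipeline used to build this catalog and does not include the additional filter for evidence of transcription. The function name `shared_targets`, the cell line labels, and the toy gene assignments (including the placeholder names GENE_X and GENE_Y) are hypothetical.

```python
def shared_targets(bound_genes_by_cell_line):
    """Return the genes bound in every cell line, sorted alphabetically."""
    gene_sets = [set(genes) for genes in bound_genes_by_cell_line.values()]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


# Toy input for illustration only; these assignments are not real data.
toy_bound_genes = {
    "cell_line_1": ["GAPDH", "ACTB", "GENE_X"],
    "cell_line_2": ["ACTB", "GAPDH", "GENE_Y"],
    "cell_line_3": ["GENE_X", "GAPDH", "ACTB"],
}

high_confidence = shared_targets(toy_bound_genes)
print(high_confidence)
```

Requiring presence in all cell lines is a conservative choice: it discards targets that are specific to one cell type in exchange for a list that is less likely to contain spurious peaks from any single experiment.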

A

ABHD11 ANAPC2 ARRDC3-AS1 ACSS2 ANGEL1 ASB6 ACTB ANKRD24 ASH1L ACTG1 AP2B1 ASH1L-AS1 ACYP2 AP4B1 ASPSCR1 AGBL5 AP4B1-AS1 ATF2 AGBL5-AS1 AP4E1 ATR AKT1S1 ARHGAP11B ATXN2L AKT2 ARHGEF7 ATXN7L3 ALG3 ARID2 AURKAIP AMD1 ARMT1

B

B3GALNT2 BAX BOLA2B BANF1 BOLA2 BRD2

C

C16orf58 C21orf2 C7orf49 C17orf62 C21orf62-AS1 C7orf55 C18orf25 C3orf58 C7orf55-LUC7L2 C1orf43 C6orf62 C9orf43 C9orf69 CDK12 COMMD6 CAMK1D CDK13 CORO2B CCAR2 CDK19 COX20 CCDC114 CDKL3 COX4I1 CCDC124 CENPP COX6A1 CCDC59 CENPT CREM CCDC97 CFL1 CSAD CCNT1 CIPC CTCF CCT4 CLASP1 CUL3 CCT5 CLPTM1 CYB561A3 CD2AP CNOT11 CYCS CDC27 COA5 CDC40 COG3

D

DALRD3 DDX51 DNAJB6 DCLRE1B DENND4A DNAJC30 DDIT3 DEPDC5 DPP8 DDX11L1 DHX8 DPP9 DDX11L10 DNAAF2 DUSP5P1 DDX17 DNAJB12 DDX18 DNAJB2

E

ECE2 EIF2B4 ELL EEF1A1 EIF2B5 EMC8 EFCAB2 EIF4A3 EML3 EHD1 EIF4B ERMARD EIF1 EIF4E EXOG EIF1AD EIF4G1

F

FAM133B FAM86B1 FRG1 FAM133DP FBXO31 FUT8 FAM168A FCHSD2 FUT8-AS1 FAM173B FOXA3 FAM3C FRA10AC1

G

GAPDH GMFB GTF2I GBA GSPT1 GTF3C2 GFI1B GSTA4 GTPBP10 GIN1 GTF2A1 GUK1

H

HDLBP HIST2H3A HNRNPU HEXIM2 HIST2H3C HSP90AA1 HIST1H1C HIST2H4A HSPA8 HIST1H2AC HIST2H4B HSPD1 HIST1H2BC HNRNPD HSPE1 HIST2H2AA3 HNRNPK HSPE1-MOB4 HIST2H2AA4 HNRNPL

I

ID4 INTS2 INTS6-AS1 ILF2 INTS4 IRF2 INO80C INTS6 IRGQ

J

JUND

K

KAT5 KIAA1429 KMT2E KAT6A KIAA1919 KMT2E-AS1 KBTBD4 KLF13 KNTC1 KCNIP2-AS1 KLHDC9 KPTN KDM2A KMT2A KRI1

L

LACTB2-AS1 LMF2 LOC642852 LAMP1 LNPEP LOC645513 LDHA LOC100128164 LOC728743 LIMD1-AS1 LOC100288778 LOC730183 LINC-PINT LOC101926933 LONP1 LINC00884 LOC101928068 LRRC29 LINC00938 LOC101928626 LRRC37A3 LINC01003 LOC101928739 LSM8 LINC01138 LOC101929715 LUC7L2 LMBR1 LOC102724596 LYSMD1

M

MAD2L2 MIR1302-11 MIR6087 MADD MIR1302-2 MIR616 MALAT1 MIR1302-9 MIR632 MAP3K11 MIR191 MIR636 MAP3K12 MIR3180-1 MIR6723 MARCH6 MIR3180-2 MIR7641-2 MAZ MIR3180-3 MIR8072 MBD5 MIR3188 MIR8075 MBD6 MIR3610 MIR933 MCOLN1 MIR3648-1 MKLN1 MDM4 MIR3648-2 MOB4 MEF2A MIR3661 MORC3 MEMO1 MIR3687-1 MPLKIP MEPCE MIR3687-2 MRPL24 METTL25 MIR3912 MRPL44 MFSD11 MIR425 MSL2 MFSD3 MIR4638 MST1P2 MGEA5 MIR4651 MTDH MIB2 MIR4738 MUS81 MIER1 MIR5188 MYL12A MIR1302-10 MIR548AW

N

NAMPT NCBP2 NDUFS3 NAPA-AS1 NCBP2-AS2 NDUFS7 NCAPH2 NDUFAF3 NFKBIA NFX1 NOC4L NRD1 NIF3L1 NOL8 NRF1 NIFK-AS1 NOLC1 NUF2 NME1 NPM1 NUTF2 NME1-NME2 NR1H2

O

OCIAD1 ORAOV1 ODC1 ORC4

P

PAFAH2 PIGU PPP1R9B PAXBP1 PIK3CA PPP2CA PBX2 PIP4K2B PPP3R1 PCBP1 PKM PPP4R2 PCBP1-AS1 PLEKHO2 PPP6R1 PCBP2 POFUT2 PRDX1 PCNXL3 POLE3 PRKCI PDE4DIP POLG PRMT5 PDE8A POLR2J2 PRPF31 PDIK1L POLR2J3 PSMD3 PEX5 POLR3E PTBP1 PFKM POP7 PTMA PGS1 POR PTPN11 PHF13 PPIB PTPN4 PHF3 PPIL3 PUM2 PI4KA PPIP5K2 PURB

R

R3HCC1 RAVER1 RHOU RAB11A RBM25 RMI1 RAC1 RBM39 RMND1 RAD21 RBPJ RNU11 RAD21-AS1 RBX1 RNU6-2 RAD23A RCOR3 ROCK1 RAD23B RGS5 ROM1 RAD9B RHOB RPL19 RPL23 RPS20 RRAS RPL27 RPS26 RSBN1 RPRD1A RPS29 RSPH6A RPS19 RPS8 RSRC2

S

SACM1L SLX1B-SULT1A4 SNORD46 SCAF1 SMARCD2 SNORD54 SCAND1 SMEK1 SNORD55 SCARNA2 SMG5 SNORD60 SCNM1 SNAP29 SNRPE SCO2 SNAR-A10 SNUPN SEC14L1 SNAR-A11 SNX17 SEC62 SNAR-A14 SRCAP SECISBP2 SNAR-A3 SRRM2 SENP1 SNAR-A4 SRRM2-AS1 SEPT2 SNAR-A5 SRRT SEPT7 SNAR-A6 SRSF10 SEPT7-AS1 SNAR-A7 SRSF2 SERTAD3 SNAR-A8 SSBP1 SETBP1 SNAR-A9 SSBP2 SETD5 SNHG1 SSNA1 SETDB1 SNORA21 STAG1 SF3A3 SNORA80B STAU1 SIRT6 SNORD12B STX18 SKIL SNORD12C STX18-AS1 SLC25A51 SNORD25 SUGCT SLC35F5 SNORD26 SUPT4H1 SLC3A2 SNORD27 SYMPK SLX1A SNORD28 SYVN1 SLX1A-SULT1A3 SNORD29 SZRD1 SLX1B SNORD30

T

TARBP2 THRAP3 TMEM243 TBC1D17 THUMPD3-AS1 TMEM79 TBC1D19 TIPARP TNPO3 TCF4 TIPARP-AS1 TOR1AIP1 TCTE3 TM2D3 TPI1P2 TFPT TMEM208 TPT1 TPT1-AS1 TRAPPC9 TSC22D1 TRA2B TRIM28 TSSC1 TRAF7 TRIM41 TTLL12 TRAM1 TRIM7 TXNL1 TRAPPC12 TSC1

U

UBAP2L UCHL3 USP17L28 UBC UCKL1 USP17L29 UBE2B UGP2 USP17L30 UBE2D3 UNC50 USP17L5 UBE2F UNK USP22 UBE2F-SCLY UPF2 USP3 UBE2J2 USP17L24 USP32 UBE3C USP17L25 USP34 UBQLN1 USP17L26 UVSSA UBTF USP17L27

V

VDAC2 VPS29 VPS51 VPRBP VPS33A

W

WAC WASL WDR78 WAC-AS1 WBP2 WEE2-AS1 WASF1 WBP4 WIBG WASH7P WBSCR22 WNK1

Y

YTHDF2

Z

ZBTB45 ZNF106 ZNF655 ZC3H18 ZNF207 ZNF668 ZC3H3 ZNF3 ZNF740 ZC3H4 ZNF384 ZNF767P ZCCHC4 ZNF398 ZNF786 ZCWPW1 ZNF425 ZNF787 ZFAS1 ZNF576 ZNF839 ZFP62 ZNF646 ZNFX1 ZMPSTE24 ZNF652

References

Abe, H., Kamai, T., Shirataki, H., Oyama, T., Arai, K., and Yoshida, K.-I. (2008). High expression of Ran GTPase is associated with local invasion and metastasis of human clear cell renal cell carcinoma. Int. J. Cancer 122, 2391–2397.

Altman, B.J., Hsieh, A.L., Sengupta, A., Krishnanaiah, S.Y., Stine, Z.E., Walton, Z.E., Gouw, A.M., Venkataraman, A., Li, B., Goraksha-Hicks, P., et al. (2015). MYC Disrupts the Circadian Clock and Metabolism in Cancer Cells. Cell Metab. 22, 1009–1019.

Alves, G., Tatro, A., and Fanning, T. (1996). Differential methylation of human LINE-1 retrotransposons in malignant cells. Gene 176, 39–44.

Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169.

Andrews, S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Annibali, D., Whitfield, J.R., Favuzzi, E., Jauset, T., Serrano, E., Cuartas, I., Redondo-Campos, S., Folch, G., Gonzàlez-Juncà, A., Sodir, N.M., et al. (2014). Myc inhibition is effective against glioma and reveals a role for Myc in proficient mitosis. Nat. Commun. 5, 4632.

Bemark, M., and Neuberger, M.S. (2000). The c-MYC allele that is translocated into the IgH locus undergoes constitutive hypermutation in a Burkitt’s lymphoma line. Oncogene 19, 3404–3410.

Ben-Porath, I., Thomson, M.W., Carey, V.J., Ge, R., Bell, G.W., Regev, A., and Weinberg, R.A. (2008). An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507.

Benjamini, Y., and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B Methodol. 57, 289–300.

Blackwell, T.K., Huang, J., Ma, A., Kretzner, L., Alt, F.W., Eisenman, R.N., and Weintraub, H. (1993). Binding of myc proteins to canonical and noncanonical DNA sequences. Mol. Cell. Biol. 13, 5216–5224.

Blackwell, T.K., Kretzner, L., Blackwood, E.M., Eisenman, R.N., and Weintraub, H. (1990). Sequence-specific DNA binding by the c-Myc protein. Science 250, 1149–1151.

Bolger, A.M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120.

Bonn, S., Zinzen, R.P., Girardot, C., Gustafson, E.H., Perez-Gonzalez, A., Delhomme, N., Ghavi-Helm, Y., Wilczyński, B., Riddell, A., and Furlong, E.E.M. (2012). Tissue-specific analysis of chromatin state identifies temporal signatures of enhancer activity during embryonic development. Nat. Genet. 44, 148–156.

Bouchard, C., Thieke, K., Maier, A., Saffrich, R., Hanley-Hyde, J., Ansorge, W., Reed, S., Sicinski, P., Bartek, J., and Eilers, M. (1999). Direct induction of cyclin D2 by Myc contributes to cell cycle progression and sequestration of p27. EMBO J. 18, 5321–5333.

Calado, D.P., Sasaki, Y., Godinho, S.A., Pellerin, A., Köchert, K., Sleckman, B.P., de Alborán, I.M., Janz, M., Rodig, S., and Rajewsky, K. (2012). The cell-cycle regulator c-Myc is essential for the formation and maintenance of germinal centers. Nat. Immunol. 13, 1092–1100.

Campbell, P.J., Stephens, P.J., Pleasance, E.D., O’Meara, S., Li, H., Santarius, T., Stebbings, L.A., Leroy, C., Edkins, S., Hardy, C., et al. (2008). Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat. Genet. 40, 722–729.

Chauhan, J., Wang, H., Yap, J.L., Sabato, P.E., Hu, A., Prochownik, E.V., and Fletcher, S. (2014). Discovery of methyl 4’-methyl-5-(7-nitrobenzo[c][1,2,5]oxadiazol-4-yl)-[1,1’-biphenyl]-3-carboxylate, an improved small-molecule inhibitor of c-Myc-max dimerization. ChemMedChem 9, 2274–2285.

Chi, T.H., Wan, M., Zhao, K., Taniuchi, I., Chen, L., Littman, D.R., and Crabtree, G.R. (2002). Reciprocal regulation of CD4/CD8 expression by SWI/SNF-like BAF complexes. Nature 418, 195–199.

Cho, N.-Y., Kim, B.-H., Choi, M., Yoo, E.J., Moon, K.C., Cho, Y.-M., Kim, D., and Kang, G.H. (2007). Hypermethylation of CpG island loci and hypomethylation of LINE-1 and Alu repeats in prostate adenocarcinoma and their relationship to clinicopathological features. J. Pathol. 211, 269–277.

Choi, I.-S., Estecio, M.R.H., Nagano, Y., Kim, D.H., White, J.A., Yao, J.C., Issa, J.-P.J., and Rashid, A. (2007). Hypomethylation of LINE-1 and Alu in well-differentiated neuroendocrine tumors (pancreatic endocrine tumors and carcinoid tumors). Mod. Pathol. 20, 802–810.

Cong, L., Ran, F.A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P.D., Wu, X., Jiang, W., Marraffini, L.A., et al. (2013). Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823.

Dang, C.V. (2012). MYC on the path to cancer. Cell 149, 22–35.

Dang, C.V., Reddy, E.P., Shokat, K.M., and Soucek, L. (2017). Drugging the “undruggable” cancer targets. Nat. Rev. Cancer 17, 502–508.

Dante, R., Dante-Paire, J., Rigal, D., and Roizès, G. (1992). Methylation patterns of long interspersed repeated DNA and alphoid repetitive DNA from human cell lines and tumors. Anticancer Res. 12, 559–563.

Davis, A.C., Wims, M., Spotts, G.D., Hann, S.R., and Bradley, A. (1993). A null c-myc mutation causes lethality before 10.5 days of gestation in homozygotes and reduced fertility in heterozygous female mice. Genes Dev. 7, 671–682.

de Greef, J.C., Wang, J., Balog, J., den Dunnen, J.T., Frants, R.R., Straasheijm, K.R., Aytekin, C., van der Burg, M., Duprez, L., Ferster, A., et al. (2011). Mutations in ZBTB24 are associated with immunodeficiency, centromeric instability, and facial anomalies syndrome type 2. Am. J. Hum. Genet. 88, 796–804.

Dennis, K., Fan, T., Geiman, T., Yan, Q., and Muegge, K. (2001). Lsh, a member of the SNF2 family, is required for genome-wide methylation. Genes Dev. 15, 2940–2944.

Doucet-O’Hare, T.T., Rodić, N., Sharma, R., Darbari, I., Abril, G., Choi, J.A., Young Ahn, J., Cheng, Y., Anders, R.A., Burns, K.H., et al. (2015). LINE-1 expression and retrotransposition in Barrett’s esophagus and esophageal carcinoma. Proc. Natl. Acad. Sci. U. S. A. 112, E4894-4900.

DuBridge, R.B., Tang, P., Hsia, H.C., Leong, P.M., Miller, J.H., and Calos, M.P. (1987). Analysis of mutation in human cells by using an Epstein-Barr virus shuttle system. Mol. Cell. Biol. 7, 379–387.

Evan, G.I., and Vousden, K.H. (2001). Proliferation, cell cycle and apoptosis in cancer. Nature 411, 342–348.

Ewing, A.D., Gacita, A., Wood, L.D., Ma, F., Xing, D., Kim, M.-S., Manda, S.S., Abril, G., Pereira, G., Makohon-Moore, A., et al. (2015). Widespread somatic L1 retrotransposition occurs early during gastrointestinal cancer evolution. Genome Res. 25, 1536–1545.

Fahrner, J.A., and Bjornsson, H.T. (2014). Mendelian Disorders of the Epigenetic Machinery: Tipping the Balance of Chromatin States. Annu. Rev. Genomics Hum. Genet. 15, 269–293.

Favorov, A., Mularoni, L., Cope, L.M., Medvedeva, Y., Mironov, A.A., Makeev, V.J., and Wheelan, S.J. (2012). Exploring massive, genome scale datasets with the GenometriCorr package. PLoS Comput. Biol. 8, e1002529.

Galardi, S., Savino, M., Scagnoli, F., Pellegatta, S., Pisati, F., Zambelli, F., Illi, B., Annibali, D., Beji, S., Orecchini, E., et al. (2016). Resetting cancer stem cell regulatory nodes upon MYC inhibition. EMBO Rep. 17, 1872–1889.

Geiman, T.M., Tessarollo, L., Anver, M.R., Kopp, J.B., Ward, J.M., and Muegge, K. (2001). Lsh, a SNF2 family member, is required for normal murine development. Biochim. Biophys. Acta 1526, 211–220.

Hagleitner, M.M., Lankester, A., Maraschio, P., Hultén, M., Fryns, J.P., Schuetz, C., Gimelli, G., Davies, E.G., Gennery, A., Belohradsky, B.H., et al. (2008). Clinical spectrum of immunodeficiency, centromeric instability and facial dysmorphism (ICF syndrome). J. Med. Genet. 45, 93–99.

Hargreaves, D.C., and Crabtree, G.R. (2011). ATP-dependent chromatin remodeling: genetics, genomics and mechanisms. Cell Res. 21, 396–420.

Hart, J.R., Garner, A.L., Yu, J., Ito, Y., Sun, M., Ueno, L., Rhee, J.-K., Baksh, M.M., Stefan, E., Hartl, M., et al. (2014). Inhibitor of MYC identified in a Kröhnke pyridine library. Proc. Natl. Acad. Sci. U. S. A. 111, 12556–12561.

Hart, T., Chandrashekhar, M., Aregger, M., Steinhart, Z., Brown, K.R., MacLeod, G., Mis, M., Zimmermann, M., Fradet-Turcotte, A., Sun, S., et al. (2015). High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype-Specific Cancer Liabilities. Cell 163, 1515–1526.

Heberle, H., Meirelles, G.V., da Silva, F.R., Telles, G.P., and Minghim, R. (2015). InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams. BMC Bioinformatics 16, 169.

Helman, E., Lawrence, M.S., Stewart, C., Sougnez, C., Getz, G., and Meyerson, M. (2014). Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing. Genome Res. 24, 1053–1063.

Hirvonen, H.E., Salonen, R., Sandberg, M.M., Vuorio, E., Västrik, I., Kotilainen, E., and Kalimo, H. (1994). Differential expression of myc, max and RB1 genes in human gliomas and glioma cell lines. Br. J. Cancer 69, 16–25.

Ho, L., Jothi, R., Ronan, J.L., Cui, K., Zhao, K., and Crabtree, G.R. (2009). An embryonic stem cell chromatin remodeling complex, esBAF, is an essential component of the core pluripotency transcriptional network. Proc. Natl. Acad. Sci. U. S. A. 106, 5187–5191.

Huang, J., Fan, T., Yan, Q., Zhu, H., Fox, S., Issaq, H.J., Best, L., Gangi, L., Munroe, D., and Muegge, K. (2004). Lsh, an epigenetic guardian of repetitive elements. Nucleic Acids Res. 32, 5019–5028.

Hulsen, T., de Vlieg, J., and Alkema, W. (2008). BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics 9, 488.

Isakoff, M.S., Sansam, C.G., Tamayo, P., Subramanian, A., Evans, J.A., Fillmore, C.M., Wang, X., Biegel, J.A., Pomeroy, S.L., Mesirov, J.P., et al. (2005). Inactivation of the Snf5 tumor suppressor stimulates cell cycle progression and cooperates with p53 loss in oncogenic transformation. Proc. Natl. Acad. Sci. U. S. A. 102, 17745–17750.

Iskow, R.C., McCabe, M.T., Mills, R.E., Torene, S., Pittard, W.S., Neuwald, A.F., Van Meir, E.G., Vertino, P.M., and Devine, S.E. (2010). Natural mutagenesis of human genomes by endogenous retrotransposons. Cell 141, 1253–1261.

Jarvis, C.D., Geiman, T., Vila-Storm, M.P., Osipovich, O., Akella, U., Candeias, S., Nathan, I., Durum, S.K., and Muegge, K. (1996). A novel putative helicase produced in early murine lymphocytes. Gene 169, 203–207.

Johnson, B.E., Ihde, D.C., Makuch, R.W., Gazdar, A.F., Carney, D.N., Oie, H., Russell, E., Nau, M.M., and Minna, J.D. (1987). myc family oncogene amplification in tumor cell lines established from small cell lung cancer patients and its relationship to clinical status and course. J. Clin. Invest. 79, 1629–1634.

Johnston, L.A., Prober, D.A., Edgar, B.A., Eisenman, R.N., and Gallant, P. (1999). Drosophila myc regulates cellular growth during development. Cell 98, 779–790.

Karolchik, D., Hinrichs, A.S., Furey, T.S., Roskin, K.M., Sugnet, C.W., Haussler, D., and Kent, W.J. (2004). The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32, D493-496.

Kim, D., Langmead, B., and Salzberg, S.L. (2015). HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360.

Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S.L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36.

Klein, G., Giovanella, B., Westman, A., Stehlin, J.S., and Mumford, D. (1975). An EBV-genome-negative cell line established from an American Burkitt lymphoma; receptor characteristics. EBV infectibility and permanent conversion into EBV-positive sublines by in vitro infection. Intervirology 5, 319–334.

Koch, H.B., Zhang, R., Verdoodt, B., Bailey, A., Zhang, C.-D., Yates, J.R., Menssen, A., and Hermeking, H. (2007). Large-scale identification of c-MYC-associated proteins using a combined TAP/MudPIT approach. Cell Cycle 6, 205–217.

Kurisetty, V.V., Johnston, P.G., Johnston, N., Erwin, P., Crowe, P., Fernig, D.G., Campbell, F.C., Anderson, I.P., Rudland, P.S., and El-Tanani, M.K. (2008). RAN GTPase is an effector of the invasive/metastatic phenotype induced by osteopontin. Oncogene 27, 7139–7149.

Langmead, B. (2010). Aligning short sequencing reads with Bowtie. Curr. Protoc. Bioinformatics Chapter 11, Unit 11.7.

Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359.

Lawrence, M., Huber, W., Pagès, H., Aboyoun, P., Carlson, M., Gentleman, R., Morgan, M.T., and Carey, V.J. (2013). Software for computing and annotating genomic ranges. PLoS Comput. Biol. 9, e1003118.

Lee, D.W., Zhang, K., Ning, Z.Q., Raabe, E.H., Tintner, S., Wieland, R., Wilkins, B.J., Kim, J.M., Blough, R.I., and Arceci, R.J. (2000). Proliferation-associated SNF2-like gene (PASG): a SNF2 family member altered in leukemia. Cancer Res. 60, 3612–3622.

Lee, E., Iskow, R., Yang, L., Gokcumen, O., Haseley, P., Luquette, L.J., Lohr, J.G., Harris, C.C., Ding, L., Wilson, R.K., et al. (2012). Landscape of somatic retrotransposition in human cancers. Science 337, 967–971.

Leinonen, R., Sugawara, H., and Shumway, M. (2011). The Sequence Read Archive. Nucleic Acids Res. 39, D19–D21.

Li, B., and Dewey, C.N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323.

Li, Z., Boone, D., and Hann, S.R. (2008). Nucleophosmin interacts directly with c-Myc and controls c-Myc-induced hyperproliferation and transformation. Proc. Natl. Acad. Sci. U. S. A. 105, 18794–18799.

Liberzon, A., Birger, C., Thorvaldsdóttir, H., Ghandi, M., Mesirov, J.P., and Tamayo, P. (2015). The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 1, 417–425.

Lin, C.Y., Lovén, J., Rahl, P.B., Paranal, R.M., Burge, C.B., Bradner, J.E., Lee, T.I., and Young, R.A. (2012). Transcriptional amplification in tumor cells with elevated c-Myc. Cell 151, 56–67.

Littlewood, T.D., Hancock, D.C., Danielian, P.S., Parker, M.G., and Evan, G.I. (1995). A modified oestrogen receptor ligand-binding domain as an improved switch for the regulation of heterologous proteins. Nucleic Acids Res. 23, 1686–1690.

Manolov, G., and Manolova, Y. (1972). Marker band in one chromosome 14 from Burkitt lymphomas. Nature 237, 33–34.

Massie, C.E., and Mills, I.G. (2012). Mapping protein-DNA interactions using ChIP-sequencing. Methods Mol. Biol. 809, 157–173.

Miki, Y., Nishisho, I., Horii, A., Miyoshi, Y., Utsunomiya, J., Kinzler, K.W., Vogelstein, B., and Nakamura, Y. (1992). Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Res. 52, 643–645.

Miliani de Marval, P.L., Gimenez-Conti, I.B., LaCava, M., Martinez, L.A., Conti, C.J., and Rodriguez-Puebla, M.L. (2001). Transgenic expression of cyclin-dependent kinase 4 results in epidermal hyperplasia, hypertrophy, and severe dermal fibrosis. Am. J. Pathol. 159, 369–379.

Miliani de Marval, P.L., Macias, E., Rounbehler, R., Sicinski, P., Kiyokawa, H., Johnson, D.G., Conti, C.J., and Rodriguez-Puebla, M.L. (2004). Lack of Cyclin-Dependent Kinase 4 Inhibits c-myc Tumorigenic Activities in Epithelial Tissues. Mol. Cell. Biol. 24, 7538–7547.

Mokry, M., Hatzis, P., Schuijers, J., Lansu, N., Ruzius, F.-P., Clevers, H., and Cuppen, E. (2012). Integrated genome-wide analysis of transcription factor occupancy, RNA polymerase II binding and steady-state RNA levels identify differentially regulated functional gene classes. Nucleic Acids Res. 40, 148–158.

Myant, K., and Stancheva, I. (2008). LSH cooperates with DNA methyltransferases to repress transcription. Mol. Cell. Biol. 28, 215–226.

Myant, K., Termanis, A., Sundaram, A.Y.M., Boe, T., Li, C., Merusi, C., Burrage, J., de Las Heras, J.I., and Stancheva, I. (2011). LSH and G9a/GLP complex are required for developmentally programmed DNA methylation. Genome Res. 21, 83–94.

Neigeborn, L., and Carlson, M. (1984). Genes affecting the regulation of SUC2 gene expression by glucose repression in Saccharomyces cerevisiae. Genetics 108, 845–858.

Pan, J., Deng, Q., Jiang, C., Wang, X., Niu, T., Li, H., Chen, T., Jin, J., Pan, W., Cai, X., et al. (2015). USP37 directly deubiquitinates and stabilizes c-Myc in lung cancer. Oncogene 34, 3957–3967.

Patel, V.N., Gokulrangan, G., Chowdhury, S.A., Chen, Y., Sloan, A.E., Koyutürk, M., Barnholtz-Sloan, J., and Chance, M.R. (2013). Network signatures of survival in glioblastoma multiforme. PLoS Comput. Biol. 9, e1003237.

Paterson, A.L., Weaver, J.M.J., Eldridge, M.D., Tavaré, S., Fitzgerald, R.C., Edwards, P.A.W., and OCCAMs Consortium (2015). Mobile element insertions are frequent in oesophageal adenocarcinomas and can mislead paired-end sequencing analysis. BMC Genomics 16, 473.

Perez-Roger, I., Kim, S.H., Griffiths, B., Sewing, A., and Land, H. (1999). Cyclins D1 and D2 mediate myc-induced proliferation via sequestration of p27(Kip1) and p21(Cip1). EMBO J. 18, 5310–5320.

Pertea, M., Pertea, G.M., Antonescu, C.M., Chang, T.-C., Mendell, J.T., and Salzberg, S.L. (2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295.

Peters, A.H.F.M., Mermoud, J.E., O’Carroll, D., Pagani, M., Schweizer, D., Brockdorff, N., and Jenuwein, T. (2002). Histone H3 lysine 9 methylation is an epigenetic imprint of facultative heterochromatin. Nat. Genet. 30, 77–80.

Pierce, S.B., Yost, C., Britton, J.S., Loo, L.W.M., Flynn, E.M., Edgar, B.A., and Eisenman, R.N. (2004). dMyc is required for larval growth and endoreplication in Drosophila. Development 131, 2317–2327.

Pitkänen, E., Cajuso, T., Katainen, R., Kaasinen, E., Välimäki, N., Palin, K., Taipale, J., Aaltonen, L.A., and Kilpivaara, O. (2014). Frequent L1 retrotranspositions originating from TTC28 in colorectal cancer. Oncotarget 5, 853–859.

Qian, J., Wang, Q., Dose, M., Pruett, N., Kieffer-Kwon, K.-R., Resch, W., Liang, G., Tang, Z., Mathé, E., Benner, C., et al. (2014). B cell super-enhancers and regulatory clusters recruit AID tumorigenic activity. Cell 159, 1524–1537.

Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.

Raabe, E.H., Abdurrahman, L., Behbehani, G., and Arceci, R.J. (2001). An SNF2 factor involved in mammalian development and cellular proliferation. Dev. Dyn. 221, 92–105.

Ramírez, F., Dündar, F., Diehl, S., Grüning, B.A., and Manke, T. (2014). deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–191.

Rensen, W.M., Mangiacasale, R., Ciciarello, M., and Lavia, P. (2008). The GTPase Ran: regulation of cell life and potential roles in cell transformation. Front. Biosci. 13, 4097–4121.

Ricci, M.S., Jin, Z., Dews, M., Yu, D., Thomas-Tikhonenko, A., Dicker, D.T., and El-Deiry, W.S. (2004). Direct repression of FLIP expression by c-myc is a major determinant of TRAIL sensitivity. Mol. Cell. Biol. 24, 8541–8555.

Robinson, J.T., Thorvaldsdóttir, H., Winckler, W., Guttman, M., Lander, E.S., Getz, G., and Mesirov, J.P. (2011). Integrative genomics viewer. Nat. Biotechnol. 29, 24–26.

Robinson, M.D., McCarthy, D.J., and Smyth, G.K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140.

Rodić, N., Steranka, J.P., Makohon-Moore, A., Moyer, A., Shen, P., Sharma, R., Kohutek, Z.A., Huang, C.R., Ahn, D., Mita, P., et al. (2015). Retrotransposon insertions in the clonal evolution of pancreatic ductal adenocarcinoma. Nat. Med. 21, 1060–1064.

Rose, N.R., and Klose, R.J. (2014). Understanding the relationship between DNA methylation and histone lysine methylation. Biochim. Biophys. Acta 1839, 1362–1372.

Samur, M.K. (2014). RTCGAToolbox: A New Tool for Exporting TCGA Firehose Data. PLoS ONE 9.

Santourlidis, S., Florl, A., Ackermann, R., Wirtz, H.C., and Schulz, W.A. (1999). High frequency of alterations in DNA methylation in adenocarcinoma of the prostate. The Prostate 39, 166–174.

Schulz, W.A., Elo, J.P., Florl, A.R., Pennanen, S., Santourlidis, S., Engers, R., Buchardt, M., Seifert, H.-H., and Visakorpi, T. (2002). Genomewide DNA hypomethylation is associated with alterations on chromosome 8 in prostate carcinoma. Genes Chromosomes Cancer 35, 58–65.

Seitz, V., Butzhammer, P., Hirsch, B., Hecht, J., Gütgemann, I., Ehlers, A., Lenze, D., Oker, E., Sommerfeld, A., von der Wall, E., et al. (2011). Deep sequencing of MYC DNA-binding sites in Burkitt lymphoma. PLoS One 6, e26837.

Shlyueva, D., Stampfel, G., and Stark, A. (2014). Transcriptional enhancers: from properties to genome-wide predictions. Nat. Rev. Genet. 15, 272–286.

Shukla, R., Upton, K.R., Muñoz-Lopez, M., Gerhardt, D.J., Fisher, M.E., Nguyen, T., Brennan, P.M., Baillie, J.K., Collino, A., Ghisletti, S., et al. (2013). Endogenous retrotransposition activates oncogenic pathways in hepatocellular carcinoma. Cell 153, 101–111.

Smith, Z.D., and Meissner, A. (2013). DNA methylation: roles in mammalian development. Nat. Rev. Genet. 14, 204–220.

Solyom, S., Ewing, A.D., Rahrmann, E.P., Doucet, T., Nelson, H.H., Burns, M.B., Harris, R.S., Sigmon, D.F., Casella, A., Erlanger, B., et al. (2012). Extensive somatic L1 retrotransposition in colorectal tumors. Genome Res. 22, 2328–2338.

Soodgupta, D., Pan, D., Cui, G., Senpan, A., Yang, X., Lu, L., Weilbaecher, K.N., Prochownik, E.V., Lanza, G.M., and Tomasson, M.H. (2015). Small Molecule MYC Inhibitor Conjugated to Integrin-Targeted Nanoparticles Extends Survival in a Mouse Model of Disseminated Multiple Myeloma. Mol. Cancer Ther. 14, 1286–1294.

Soucek, L., Nasi, S., and Evan, G.I. (2004). Omomyc expression in skin prevents Myc-induced papillomatosis. Cell Death Differ. 11, 1038–1045.

Soucek, L., Whitfield, J., Martins, C.P., Finch, A.J., Murphy, D.J., Sodir, N.M., Karnezis, A.N., Swigart, L.B., Nasi, S., and Evan, G.I. (2008). Modelling Myc inhibition as a cancer therapy. Nature 455, 679–683.

Soucek, L., Whitfield, J.R., Sodir, N.M., Massó-Vallés, D., Serrano, E., Karnezis, A.N., Swigart, L.B., and Evan, G.I. (2013). Inhibition of Myc family proteins eradicates KRas-driven lung cancer in mice. Genes Dev. 27, 504–513.

Stellas, D., Szabolcs, M., Koul, S., Li, Z., Polyzos, A., Anagnostopoulos, C., Cournia, Z., Tamvakopoulos, C., Klinakis, A., and Efstratiadis, A. (2014). Therapeutic effects of an anti-Myc drug on mouse pancreatic cancer. J. Natl. Cancer Inst. 106.

Stepanenko, A.A., and Dmitrenko, V.V. (2015). HEK293 in cell biology and cancer research: phenotype, karyotype, tumorigenicity, and stress-induced genome-phenotype evolution. Gene 569, 182–190.

Stern, M., Jensen, R., and Herskowitz, I. (1984). Five SWI genes are required for expression of the HO gene in yeast. J. Mol. Biol. 178, 853–868.

Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., et al. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U. S. A. 102, 15545–15550.

Sun, L.-Q., Lee, D.W., Zhang, Q., Xiao, W., Raabe, E.H., Meeker, A., Miao, D., Huso, D.L., and Arceci, R.J. (2004). Growth retardation and premature aging phenotypes in mice with disruption of the SNF2-like gene, PASG. Genes Dev. 18, 1035–1046.

Suter, C.M., Martin, D.I., and Ward, R.L. (2004). Hypomethylation of L1 retrotransposons in colorectal cancer and adjacent normal tissue. Int. J. Colorectal Dis. 19, 95–101.

Tang, Z., Steranka, J.P., Ma, S., Grivainis, M., Rodić, N., Huang, C.R.L., Shih, I.-M., Wang, T.-L., Boeke, J.D., Fenyö, D., et al. (2017). Human transposon insertion profiling: Analysis, visualization and identification of somatic LINE-1 insertions in ovarian cancer. Proc. Natl. Acad. Sci. U. S. A. 114, E733–E740.

Termanis, A., Torrea, N., Culley, J., Kerr, A., Ramsahoye, B., and Stancheva, I. (2016). The SNF2 family ATPase LSH promotes cell-autonomous de novo DNA methylation in somatic cells. Nucleic Acids Res. 44, 7592–7604.

Thijssen, P.E., Ito, Y., Grillo, G., Wang, J., Velasco, G., Nitta, H., Unoki, M., Yoshihara, M., Suyama, M., Sun, Y., et al. (2015). Mutations in CDCA7 and HELLS cause immunodeficiency-centromeric instability-facial anomalies syndrome. Nat. Commun. 6, 7870.

Trumpp, A., Refaeli, Y., Oskarsson, T., Gasser, S., Murphy, M., Martin, G.R., and Bishop, J.M. (2001). c-Myc regulates mammalian body size by controlling cell number but not cell size. Nature 414, 768–773.

von Eyss, B., Maaskola, J., Memczak, S., Möllmann, K., Schuetz, A., Loddenkemper, C., Tanh, M.-D., Otto, A., Muegge, K., Heinemann, U., et al. (2012). The SNF2-like helicase HELLS mediates E2F3-dependent transcription and cellular transformation. EMBO J. 31, 972–985.

Wang, H., Chauhan, J., Hu, A., Pendleton, K., Yap, J.L., Sabato, P.E., Jones, J.W., Perri, M., Yu, J., Cione, E., et al. (2013). Disruption of Myc-Max heterodimerization with improved cell-penetrating analogs of the small molecule 10074-G5. Oncotarget 4, 936–947.

Warnes, G.R., Bolker, B., Bonebakker, L., Gentleman, R., Huber, W., Liaw, A., Lumley, T., Maechler, M., Magnusson, A., Moeller, S., et al. (2009). gplots: Various R programming tools for plotting data. R Package Version 2, 1.

Waseem, A., Ali, M., Odell, E.W., Fortune, F., and Teh, M.-T. (2010). Downstream targets of FOXM1: CEP55 and HELLS are cancer progression markers of head and neck squamous cell carcinoma. Oral Oncol. 46, 536–542.

Weemaes, C.M.R., van Tol, M.J.D., Wang, J., van Ostaijen-ten Dam, M.M., van Eggermond, M.C.J.A., Thijssen, P.E., Aytekin, C., Brunetti-Pierri, N., van der Burg, M., Graham Davies, E., et al. (2013). Heterogeneous clinical presentation in ICF syndrome: correlation with underlying gene defects. Eur. J. Hum. Genet. 21, 1219–1225.

Whitfield, J.R., Beaulieu, M.-E., and Soucek, L. (2017). Strategies to Inhibit Myc and Their Clinical Applicability. Front. Cell Dev. Biol. 5.

Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag.

Wilson, B.G., and Roberts, C.W.M. (2011). SWI/SNF nucleosome remodellers and cancer. Nat. Rev. Cancer 11, 481–492.

Xia, F., Lee, C.W., and Altieri, D.C. (2008). Tumor cell dependence on Ran-GTP-directed mitosis. Cancer Res. 68, 1826–1833.

Xiong, Q., Mukherjee, S., and Furey, T.S. (2014). GSAASeqSP: a toolset for gene set association analysis of RNA-Seq data. Sci. Rep. 4, 6347.

Yaari, G., Bolen, C.R., Thakar, J., and Kleinstein, S.H. (2013). Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations. Nucleic Acids Res. 41, E170.

Yap, J.L., Wang, H., Hu, A., Chauhan, J., Jung, K.-Y., Gharavi, R.B., Prochownik, E.V., and Fletcher, S. (2013). Pharmacophore identification of c-Myc inhibitor 10074-G5. Bioorg. Med. Chem. Lett. 23, 370–374.

Ye, J., Coulouris, G., Zaretskaya, I., Cutcutache, I., Rozen, S., and Madden, T.L. (2012). Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics 13, 134.

Yin, X., Giap, C., Lazo, J.S., and Prochownik, E.V. (2003). Low molecular weight inhibitors of Myc-Max interaction and function. Oncogene 22, 6151–6159.

Yu, W., McIntosh, C., Lister, R., Zhu, I., Han, Y., Ren, J., Landsman, D., Lee, E., Briones, V., Terashima, M., et al. (2014). Genome-wide DNA methylation patterns in LSH mutant reveals de-repression of repeat elements and redundant epigenetic silencing pathways. Genome Res. 24, 1613–1623.

Zech, L., Haglund, U., Nilsson, K., and Klein, G. (1976). Characteristic chromosomal abnormalities in biopsies and lymphoid-cell lines from patients with Burkitt and non-Burkitt lymphomas. Int. J. Cancer 17, 47–56.

Zeller, K.I., Jegga, A.G., Aronow, B.J., O’Donnell, K.A., and Dang, C.V. (2003). An integrated database of genes responsive to the Myc oncogenic transcription factor: identification of direct genomic targets. Genome Biol. 4, R69.

Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., et al. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137.

Zhang, Z.-K., Davies, K.P., Allen, J., Zhu, L., Pestell, R.G., Zagzag, D., and Kalpana, G.V. (2002). Cell Cycle Arrest and Repression of Cyclin D1 Transcription by INI1/hSNF5. Mol. Cell. Biol. 22, 5975–5988.

Jane A. Welch
404-797-9415 | [email protected] | linkedin.com/in/janeawelch

EDUCATION

The Johns Hopkins University School of Medicine, August 2011 - present
Ph.D. in Human Genetics & Molecular Biology

Clemson University, August 2007 - May 2011
B.S. in Genetics with minor in Biochemistry, Summa cum laude

PROFILE

Born in Decatur, GA, and grew up in the metropolitan Atlanta area. Developed extensive knowledge of human health and disease over the course of formal undergraduate and graduate education, including 10 years of laboratory research experience. Experienced in research project management as well as data collection, tracking, analysis, and interpretation. Additional skills and abilities include technical writing/editing; written and verbal communication; time management; and computer literacy.

RESEARCH EXPERIENCE

Ph.D. Candidate, August 2011 - present
The Johns Hopkins University School of Medicine, Baltimore, MD

• Independently designed and completed multidisciplinary laboratory research projects to characterize the role of the HELLS helicase in human cancers and cancer cell lines. This work led to the discovery of an association between HELLS and the MYC transcription factor.
• Prepared a first-author manuscript for submission to a peer-reviewed journal.
• Collaborated on biomedical research projects and reports on varying topics, including microarray-based profiling of retrotransposon insertions in human cancer cells, a DNA sequencing-based assay for human identity testing in patients who have received bone marrow transplants, retrotransposon expression profiling by RNA-seq, and effects of loss of function of the ARID5B transcription factor in murine B cell development.
• Received co-authorship on 2 peer-reviewed articles and on 2 manuscripts being prepared for publication.
• Edited 4 scientific manuscript drafts for colleagues; 2 were subsequently published in peer-reviewed journals.
• Liaised with science and medical professionals in bimonthly team meetings to discuss data reports.
• Wrote and submitted abstracts for professional conferences, resulting in invitations to present at 2 national symposia.

Undergraduate Student Researcher, August 2007 - May 2011
Clemson University, Clemson, SC

• Cloned and worked on biochemical and kinetic characterization of acetate-activating enzymes of eukaryotic microbes Entamoeba histolytica, Plasmodium falciparum, and Phytophthora ramorum, in the laboratory of Dr. Kerry S. Smith.
• Wrote grant proposals for infectious disease research projects, securing over $2000 in funding from the Clemson University Calhoun Honors College and the SC Life Undergraduate Research Program.

HHMI International Student Researcher, May - July 2010
Institut de Biologie Moléculaire et Cellulaire, Strasbourg, France

• Received $5000 from HHMI to conduct summer-term research in the laboratory of HHMI international investigator Dr. Elena Levashina, which focused on developing malaria transmission control strategies through genetic modification of Anopheles gambiae.
• Investigated effects of RNAi-based knockdown of immune system protein TEP1 on mosquito progeny fitness.
• Contributed to creating and implementing a new protocol for gene silencing in A. gambiae larvae.

WRITING EXPERIENCE

Science Writing Intern, October 2016 - September 2017
Online Mendelian Inheritance in Man (OMIM®), Baltimore, MD

• Reviewed and analyzed scientific literature to write gene entries for OMIM, an online, open-access, manually curated catalog of human genetics information for clinicians and scientists.
• Expertly integrated, synthesized, and summarized data from a variety of publications in a clear, concise manner.
• Reliably met 100% of deadlines, resulting in 25 submitted articles to date (with 24 published on www.omim.org).

Medical Writing Fellow, March - June 2017
Johns Hopkins Professional Development & Career Office, Baltimore, MD

• Applied medical communication expertise to earn the American Medical Writers Association Essential Skills Certificate in 7 weeks’ time.
• Proficient in using style guides, including AMA, CSE, and the Chicago Manual of Style, to compose text and bibliographies as well as create tables and graphs.
• Developed familiarity with structure and content requirements of clinical reports, protocols, investigators’ brochures, and other regulatory documents through an externship with the Johns Hopkins Institute of Clinical and Translational Research.
• Built knowledge of guidances and regulations relevant to medical writing, including ICH E6 Good Clinical Practice, FDA Investigational New Drug guidances, and Code of Federal Regulations Title 21.

TEACHING EXPERIENCE

Teaching Assistant, August 2015
Practical Genomics Workshop, Johns Hopkins University, Baltimore, MD

• Introduced academic researchers to publicly available tools for genomic data analysis with an emphasis on next-generation sequencing.
• Previewed key lectures and offered feedback and edits, improving the overall clarity of their content.
• Provided one-on-one instruction to workshop participants as needed, ensuring understanding of the material and keeping the lecture schedule on track.

Teaching Assistant, August - October 2013
“Pathology for Graduate Students: Basic Mechanisms,” The Johns Hopkins University School of Medicine, Baltimore, MD

• Provided logistical support for course lecturers, including classroom setup and lecture note printouts, to ensure that classes began on time and ran smoothly.
• Routinely responded to students’ queries by email.
• Graded homework and final exams.
• Conducted a statistical analysis of final exam scores and compiled a report for the course coordinator, facilitating efficient assessment of student performance trends.

SCIENCE OUTREACH EXPERIENCE

Mentor, October 2016 - May 2017
STEM Achievement in Baltimore Elementary Schools, Johns Hopkins University, Baltimore, MD

• Collaborated with educators and other professionals to teach scientific concepts and approaches to local 5th grade students.
• Trained students to conceptualize, execute, and present a science project at the program’s year-end showcase.
• Displayed a confident and mature attitude to maintain an orderly classroom environment.
• Managed the team’s fall program schedule, ensuring that volunteer attendance requirements were met.

Visiting Scientist, December 2015
“The Genome Geek is In!” Program, Smithsonian National Museum of Natural History, Washington, DC

• Designed and executed an original exhibit on mobile DNA research with an interactive game, a microscopy station, and visual arts that attracted 50+ visitors ranging from children to adults.
• Composed an exhibit advertisement, targeted to the lay public, that was marketed online and in print media.
• Used friendly disposition to facilitate discussion, achieving the program’s record per-visitor conversation length.

Volunteer, March 2012 & 2014
Community Science Education Program, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD

• Collaborated with fellow Johns Hopkins professionals to develop and lead an interactive cancer research activity for local 5th graders visiting the cancer center for Community Science Day.
• Implemented creative metaphors to introduce biology concepts, including DNA, mutation, cancer, and genome sequencing.
• Received highly positive feedback from students, resulting in an invitation from program organizers to host the activity again at the next Community Science Day event, held in 2014.

SCIENTIFIC PUBLICATIONS & MANUSCRIPTS

Debeljak M, Freed DN, Welch JA, Haley L, Beierl H, Inglehart BS, Pallavajjaia A, Gocke CD, Leffell MS, Lin M, Pevsner J, Wheelan SJ, Eshleman JR. Haplotype Counting by Next-Generation Sequencing for Ultrasensitive Human DNA Detection. J Mol Diagn. 2014;16(5):495-503. doi:10.1016/j.jmoldx.2014.04.003.

Zampella J, Rodić N, Yang WR, Huang CRL, Welch J, Gnanakkan VP, Cornish TC, Boeke JD, Burns KH. A map of mobile DNA insertions in the NCI-60 human cancer cell panel. Mobile DNA. 2016;7(1). doi:10.1186/s13100-016-0078-4.

Welch JA, Wang X, Tang Z, Esopi DM, Shen P, Schaughency P, Duffield AS, Haffner MC, Fenyö D, Yegnasubramanian S, Wheelan SJ, Burns KH. The SNF2-like helicase HELLS colocalizes with MYC at promoters of expressed genes in human cancer cells. Manuscript in preparation for submission to peer-reviewed journal (likely Molecular Cancer Research).

Liu C, Wu W, Shen P, Welch JA, Tang Z, Esopi DM, Ardeljan D, Payer LM, Yang WR, Steranka J, Fenyö D, Desiderio S, Yegnasubramanian S, Burns KH. ARID5B activates genome integrity genes in developing B cells. Manuscript in preparation to submit to Nature Communications.

Yang WR, Schaughency P, Ardeljan D, Welch JA, Payer LM, Liu C, Boeke JD, Wheelan SJ, Levitsky HI, Burns KH. SQuIRE: Software for Quantifying Individual Repeat Elements. Manuscript in preparation to submit to Bioinformatics.

CONFERENCE PRESENTATIONS

“Transcriptional control of cellular genes and retrotransposons in a human Burkitt lymphoma cell line.”
Poster presentation, 65th Meeting of the American Society of Human Genetics, October 2015

“ChIP-seq identifies the target genomic loci of the SMARCA6 helicase in human hematopoietic neoplasms.”
Poster presentation, FASEB Science Research Conference: Mobile DNA in Mammalian Genomes, June 2015

HONORS & AWARDS

2011 - present: Phi Beta Kappa Honor Fraternity
2017: American Medical Writers Association Essential Skills Certificate
2007 - 2011: Clemson University President’s List for Academic Excellence
2007 - 2011: Alpha Lambda Delta Honor Fraternity
2007 - 2011: Phi Kappa Phi Honor Fraternity
2007 - 2011: Clemson University Out-of-State Tuition Scholarship
2009, 2010, 2011: Calhoun Honors College Departmental Honors Research Grant
2009: SC Life Undergraduate Research Grant

PROFESSIONAL AFFILIATIONS

2017 - present: American Medical Writers Association
2015 - 2016: American Society of Human Genetics
2014 - 2015: American Society for Cell Biology

REFERENCES

Kathleen H. Burns, M.D., Ph.D.
Associate Professor, Department of Pathology
The Johns Hopkins University School of Medicine
410-502-7214 | [email protected]

Joanna Amberger
Program Manager, Online Mendelian Inheritance in Man
The Johns Hopkins University School of Medicine
410-955-0313 | [email protected]

Patricia Phelps, Ph.D.
Director, Professional Development & Career Office
The Johns Hopkins University School of Medicine
410-502-2804 | [email protected]
