University of New Mexico UNM Digital Repository

Biomedical Sciences ETDs Electronic Theses and Dissertations

Fall 12-2019

Mechanisms and consequences of MYB gene activation in salivary gland tumors

Candace Frerich University of New Mexico


Part of the Bioinformatics Commons, Cancer Biology Commons, Genetics Commons, Laboratory and Basic Science Research Commons, Medicine and Health Sciences Commons, Molecular Biology Commons, and the Molecular Genetics Commons

Recommended Citation Frerich, Candace. "Mechanisms and consequences of MYB gene activation in salivary gland tumors." (2019). https://digitalrepository.unm.edu/biom_etds/205

This Dissertation is brought to you for free and open access by the Electronic Theses and Dissertations at UNM Digital Repository. It has been accepted for inclusion in Biomedical Sciences ETDs by an authorized administrator of UNM Digital Repository.

Candace Frerich, Candidate

Biomedical Sciences Department

This dissertation is approved, and it is acceptable in quality and form for publication:

Approved by the Dissertation Committee:

Scott Ness Ph.D. , Chairperson

Alan Tomkinson Ph.D.

Eric Prossnitz Ph.D.

Hua-Ying Fan Ph.D.

David Lee M.D./Ph.D.

MECHANISMS AND CONSEQUENCES OF MYB GENE ACTIVATION IN SALIVARY GLAND TUMORS

BY

CANDACE FRERICH B.S., Biochemistry, Angelo State University, 2013

DISSERTATION

Submitted in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY BIOMEDICAL SCIENCES

The University of New Mexico Albuquerque, New Mexico

December 2019

ACKNOWLEDGMENTS

I would like to express my deepest appreciation to my advisor, Dr. Scott Ness, for your mentorship, guidance, and support throughout my graduate career. I am also grateful to my committee members, Dr. Alan Tomkinson, Dr. Eric Prossnitz, Dr. David Lee, and Dr. Hua-Ying Fan, for agreeing to read a dissertation about Myb.

Thank you to the current and former members of the Ness lab: Charlie Brayer, Jen Woods, Jason Byers, Roger Brown, Hailey Sedam, Maggie Cyphery, Brandon Painter, Hideaki Suzuki, Olivia George, and Jamie Padilla.

Special thanks to all my friends and family for your continued support and encouragement.

MECHANISMS AND CONSEQUENCES OF MYB GENE ACTIVATION IN SALIVARY GLAND TUMORS

by

Candace Frerich

B.S., Biochemistry, Angelo State University, 2013
Ph.D., Biomedical Sciences, University of New Mexico, 2019

ABSTRACT

Salivary gland adenoid cystic carcinoma (ACC) is an aggressive tumor with a tendency to infiltrate surrounding nerves and metastasize to distant sites.

The standard treatment often fails to control local tumor recurrence and distant metastases, and no approved targeted therapeutic options exist for these tumors.

The goal of our studies was to reveal the molecular mechanisms driving ACC tumor development and to identify novel drug targets that could reduce patient morbidity and mortality.

We first analyzed clinical and RNA-sequencing (RNA-seq) data for 68 formalin-fixed paraffin-embedded (FFPE) ACC tumor samples and described previously unappreciated molecular heterogeneity that predicts patient outcome.

The poor outcome subgroup had a signature that resembled embryonic stem cells, suggesting these patients had high-grade dedifferentiated tumors. We also used these RNA-seq data to show definitively that MYB and MYBL1 are the oncogenic drivers in the vast majority of ACC tumors.

ACC tumors that expressed Myb were distinct from those that did not, indicating that Myb-driven tumorigenesis in ACC tumors results in a unique gene expression pattern. From these analyses we identified and validated the first two high-confidence Myb-regulated genes in ACC tumors, the first step in unraveling how Myb-driven gene expression changes drive oncogenesis.

The MYB gene must be truncated to unleash its oncogenic potential; in ACC tumors this typically occurs at the C-terminus via chromosomal translocations. However, we only observed gene truncation in half of the MYB-expressing ACC tumors, raising a new question: how is the oncogenic potential of MYB unleashed in those tumors that appear to express the full-length gene?

We found that nearly all of the MYB-expressing ACC tumors had an activated alternative MYB gene promoter, which produces N-terminally truncated Myb. Thus, alternative promoter use may unleash the oncogenic potential of Myb in those ACC tumors that appeared to express the non-oncogenic full-length protein. Further investigation revealed significant differences between the gene expression signatures elicited by N-terminally truncated Myb isoforms and full-length Myb isoforms. Specifically, N-terminally truncated Myb isoforms uniquely activated a pro-tumorigenic neural migration signaling pathway, which is linked to increased perineural invasion, an ACC tumor hallmark associated with poor prognoses. Indeed, we found that stratification of ACC tumors by expression of these genes identified a subgroup with significantly poorer outcome.

The clinical heterogeneity that makes ACC tumors so difficult to treat may be predicted by molecular characterization, thus identifying high-risk patients who are candidates for more aggressive care or personalized therapies. A previously unidentified truncated Myb isoform stimulates perineural invasion, a significant roadblock to curative treatment. These studies have many implications for future studies of the mechanisms of Myb-driven tumorigenesis, development of more effective targeted therapeutics, and ultimately patient treatment.

Table of Contents

List of Figures

Chapter 1- INTRODUCTION
1.1 The c-Myb transcription factor determines cell fate
1.1.1 Normal c-Myb balances proliferation and differentiation
1.1.2 c-Myb protein domains and transactivation
1.2 Transcriptional regulation of the MYB gene
1.2.1 Upstream MYB promoter (TSS1)
1.2.2 Transcriptional regulation via RNA-polymerase pausing
1.2.3 Rare alternative MYB promoter (TSS2)
1.2.4 Distant enhancers regulate MYB gene expression
1.3 c-Myb is the founding member of the Myb transcription factor family
1.4 How normal Myb transcription factors become oncoproteins
1.4.1 Normal full-length c-Myb does not transform cells
1.4.2 N-terminal truncation
1.4.3 C-terminal truncation
1.4.4 Context specific transcription factor code hypothesis
1.5 Myb proteins are the driver oncogenes in adenoid cystic carcinoma
1.5.1 Disease characteristics of adenoid cystic carcinoma (ACC)
1.5.2 MYB is a human proto-oncogene in ACC tumors
1.5.3 Characteristics of chromosomal translocations in ACC tumors

Chapter 2: Transcriptomes define distinct subgroups of salivary gland adenoid cystic carcinoma with different driver mutations and outcomes
2.1 ABSTRACT
2.2 INTRODUCTION
2.3 RESULTS
2.3.1 RNA-seq analysis of ACC tumor samples up to 25 years old
2.3.2 Most ACC tumors express either MYB or MYBL1
2.3.3 Analysis of RNA-seq data for evidence of fusion transcripts
2.3.4 Gene expression signatures identify major subgroups of ACC tumors
2.3.5 EN1 and SOX4 are Myb-regulated target genes in ACC tumors
2.3.6 Identification of a high-risk, poor-outcome subgroup of ACC patients
2.3.7 Gene expression signatures define good- and poor-outcome subgroups of ACC patients
2.3.7 A gene expression signature is associated with poor outcome
2.4 DISCUSSION
2.5 METHODS
2.5.1 Human Salivary Gland ACC FFPE samples
2.5.2 RNA Isolation and Sequencing
2.5.3 Data Analysis
2.5.4 Unsupervised hierarchical clustering
2.5.5 Statistical Analysis
2.5.6 Translocation Verification
2.5.7 EN1 and SOX4 Promoter Fragments
2.5.7 Transfections and reporter gene assays

CHAPTER 3: An Alternative MYB Promoter Activated in Adenoid Cystic Carcinoma Produces N-Terminally truncated Myb Proteins With Unique Transcriptional Activities
3.1 ABSTRACT
3.2 STATEMENT OF SIGNIFICANCE
3.3 INTRODUCTION
3.4 RESULTS
3.4.1 ACC tumors express MYB from an alternative promoter
3.4.2 Myb transcription factors can activate cell-type specific MYB promoters
3.4.3 MYB TSS2 activation is widespread amongst, yet unique to ACC tumors
3.4.4 MYB TSS2 transcripts give rise to N-terminally truncated proteins
3.4.5 ∆N Myb transcription factors have unique activity in cells
3.4.6 ∆N Myb uniquely modulates gene sets implicated in neuronal cell migration
3.4.5 ∆N Myb geneset identified a poor outcome subgroup of ACC tumors
3.5 DISCUSSION
3.6 METHODS
3.2.1 Cell Culture and Luciferase assays
3.6.2 Tumor RNA-seq
3.6.3 SW620 RNA-seq
3.6.4 5’RLM-RACE
3.6.5 Protein and Western blots
3.6.6 Cloning
3.6.7 Promoter motif analyses
3.7 Acknowledgments

CHAPTER 4: CONCLUSIONS, SIGNIFICANCE, FUTURE DIRECTIONS
4.1 How does MYB TSS2 become activated in ACC tumors?
4.2 N-terminal and C-terminal Myb truncations are cooperative
4.3 Why are truncated Myb transcription factors oncogenic?
4.4 Myb in ACC tumor hallmarks: many unknowns

Appendices
Appendix A: Abbreviations
Appendix B: Chapter 2 supplementary tables and figures
Appendix C: Chapter 3 supplementary tables and figures

References

LIST OF FIGURES

Figure 1.1. The c-Myb transcription factor controls cell fate
Figure 1.2. Conserved c-Myb protein domains serve diverse functions
Figure 1.3. Transcriptional regulation of the MYB gene
Figure 1.4. The Myb family of transcription factors
Figure 1.5. Transcription factor code hypothesis explaining the transcriptional specificity of Myb proteins
Figure 1.6. Chromosomal translocations are predicted to produce transcriptionally active, truncated Myb proteins
Figure 2.1. RNA-seq identifies distinct subgroups of ACC tumor samples
Table 2.1. RNA-Seq Statistics
Table 2.2. Observed and Putative MYB gene fusions identified in 68 ACC tumors
Figure 2.2. Differential Gene Expression Analysis: MYB/MYBL1 vs Neither Oncogene
Figure 2.3. EN1 and SOX4 promoter reporter gene assays
Figure 2.4. Identification of a high-risk subgroup of ACC patients
Table 2.3. Genes reported to be linked to poor prognosis in ACC tumors
Table 2.4. Characteristics of 68 ACC tumors
Table 2.5. Multivariate Cox-Regression Analysis
Figure 2.5. Gene Set Enrichment Analysis of MYB Samples
Table 2.6. Gene Set Enrichment Analysis Results – Group 2
Table 2.7. Gene Set Enrichment Analysis Results – Group 1
Figure 3.1: ACC tumors use an alternative MYB gene promoter
Figure 3.2 Cell type specific MYB promoters are activated by Myb transcription factors
Figure 3.3. MYB TSS2 activation is unique to ACC tumors
Figure 3.4. MYB TSS2 mRNA encodes an N-terminally truncated Myb (∆N Myb)
Figure 3.5. Myb and ∆N Myb transcription factors elicit both similar and different gene expression changes in cells
Figure 6. ∆N Myb uniquely activates SEMA4D signaling, which is correlated with ACC patient survival
Figure 4.1. Proposed model of MYB TSS2 activation in ACC tumors
Figure 4.2. The role of truncated Myb in ACC tumor hallmarks

CHAPTER 1- INTRODUCTION

1.1 The c-Myb transcription factor determines cell fate

1.1.1 Normal c-Myb balances proliferation and differentiation

The c-Myb transcription factor, encoded by the MYB gene, regulates the expression of thousands of genes to coordinate proliferative and differentiated cell states in multiple normal cell types (Mucenski et al. 1991; Malaterre et al. 2008; Matsumoto et al. 2016; Zorbas et al. 1999). Indeed, c-Myb expression is linked to the differentiation state of the cell: immature, highly proliferative cells produce high amounts of c-Myb, but as proliferation slows and these cells differentiate, c-Myb expression coordinately decreases until it is essentially zero in terminally differentiated cells (Figure 1.1A). Early studies demonstrated that c-Myb is essential for early haematopoiesis and that MYB gene knockout is lethal at embryonic day 15. Knockout animals lacked all of the differentiated blood cell lineages, a failure incompatible with life (Mucenski et al. 1991). Further studies revealed the dual nature of c-Myb in controlling cell fate; perturbation of c-Myb function led to defects in both proliferation and differentiation (Figure 1.1B). Specifically, MYB knockout cells failed to progress at three different stages of T-cell differentiation (Bender et al. 2004), whereas dominant negative inhibition of c-Myb in immature thymocytes prevented resumed proliferation (Pearson and Weston 2000). More recently, the role of c-Myb in governing proliferation and differentiation has been expanded to include a multitude of normal cell types including colon progenitors, neural progenitors, and normal salivary gland (Mucenski et al. 1991; Malaterre et al. 2008; Matsumoto et al. 2016). As in the hematopoietic system, c-Myb is expressed early in normal salivary gland development but is silenced in fully differentiated cells. During branching morphogenesis the ductal stalk elongates via rapid proliferation of epithelial cells, followed by differentiation as the bud at the end of the duct forms (Matsumoto et al. 2016). Evidence suggests that c-Myb, in conjunction with Wnt signaling, has an integral role in the maintenance and timing of proliferation during this process (Matsumoto et al. 2016). Modulation of MYB expression in ex vivo cultures of mouse salivary gland rudiments significantly altered duct morphology: MYB knockdown resulted in shorter ducts and an overabundance of terminal buds (Matsumoto et al. 2016), indicative of perturbed proliferation. Hence, expression of c-Myb is necessary to maintain ductal cell proliferation and delay bud cell differentiation. Thus, the normal role of the c-Myb transcription factor is to coordinate the timing and balance of proliferation and differentiation in multiple developing tissues.

Figure 1.1. The c-Myb transcription factor controls cell fate. (A) The role of c-Myb in cell fate is best understood in haematopoiesis, where it is necessary for both proliferation of immature progenitor cells and their differentiation into specialized blood cells. c-Myb expression is high in proliferating cells (indicated by the red wedge), and gradually decreases as the cells differentiate, so that terminally differentiated cells do not express c-Myb at all. (B) Disruption of c-Myb function via knockout or inhibition leads to defective proliferation and differentiation.

1.1.2 c-Myb protein domains and transactivation

The c-Myb protein functions as a transcriptional regulator capable of both activating and repressing transcription. The protein is composed of five highly conserved domains, which largely make up the DNA-binding and specificity/regulatory regions (Figure 1.2A). The DNA-binding domain (DBD; amino acids 72-192) is perfectly conserved from humans to chickens and is strictly required for DNA binding. A single point mutation within this domain (at amino acid 167) is capable of completely abolishing DNA binding, producing an inactive protein (Frampton et al. 1991). The DBD consists of two consecutive, imperfect helix-turn-helix domains which form two globular structures that wedge themselves into the flanking sides of the DNA major groove (Ogata et al. 1994).

Together these two domains, forming one DBD, recognize the degenerate hexanucleotide sequence motif Cc/aGTTa/g, termed the Myb binding site. There are more than 6 million potential Myb binding sites in the human genome, but chromatin immunoprecipitation (ChIP) experiments demonstrated that c-Myb binds only ~10,000 of those (Quintana et al. 2011; Drier et al. 2016), hinting that c-Myb specificity is determined by factors beyond the DBD.
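The motif description above can be made concrete with a short sketch. The Python below is an illustration only, not a method used in this work: it scans a sequence for the degenerate Myb binding site on both strands and estimates how often the motif would occur by chance. The function names, the example sequences, the genome size of roughly 3.1 billion base pairs, and the assumption of uniform base composition are all hypothetical choices made for the example.

```python
import re

# Degenerate Myb binding site from the text: C-(C/A)-G-T-T-(A/G).
# A lookahead is used so that overlapping matches are also reported.
MBS_FORWARD = re.compile(r"(?=(C[CA]GTT[AG]))")
# Reverse complement of the same motif, to find sites on the opposite strand.
MBS_REVERSE = re.compile(r"(?=([CT]AAC[GT]G))")


def find_myb_sites(sequence):
    """Return sorted (position, strand, matched_sequence) tuples for a DNA string."""
    seq = sequence.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MBS_FORWARD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in MBS_REVERSE.finditer(seq)]
    return sorted(hits)


def expected_site_count(genome_bp, strands=2):
    """Chance expectation, assuming uniform and independent base composition.

    Four positions are fixed (probability 1/4 each) and two positions
    accept two bases (probability 1/2 each), giving 1/1024 per position.
    """
    per_position = (1 / 4) ** 4 * (1 / 2) ** 2
    return genome_bp * strands * per_position


if __name__ == "__main__":
    print(find_myb_sites("TTCAGTTATT"))   # toy sequence with one forward-strand site
    print(expected_site_count(3.1e9))     # assumed genome size, both strands
```

Under these simplifying assumptions the chance expectation for a genome of about 3.1 billion base pairs is on the order of 6 million sites, which is consistent with the figure quoted above and illustrates why the motif alone cannot explain where c-Myb actually binds.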

Together the DBD and transactivation domain (TAD; amino acids 260-321; Figure 1.2A) are the minimum portion of the protein required for transactivation in reporter assays (Frerich et al. 2018; Cuddihy et al. 1993; Brayer et al. 2016). The c-Myb TAD is a powerful domain capable of directly recruiting the basal transcriptional machinery and initiating transcription of bound target genes (Figure 1.2B). In acute myeloid leukemia cells the TAD directly recruits the TFIID general transcriptional co-factor complex, which in turn nucleates the pre-initiation complex to initiate transcription (Y. Xu et al. 2018). The c-Myb TAD is also able to directly interact with the general co-activator p300, which may perform the same function of recruiting the basal transcriptional machinery to initiate transcription (Sandberg et al. 2005). In both studies, disrupting the c-Myb interaction with the general co-factor (either TFIID or p300) abolished up to 90% of c-Myb transactivation.

Figure 1.2. Conserved c-Myb protein domains serve diverse functions. (A) The c-Myb protein is composed of several conserved domains. Together the DBD and TAD are the minimal portion of the protein needed for transactivation. The C-terminal regulatory domains (FAETL, TPTPF, EVES) are involved in protein-to-protein interactions which contribute to the protein's transcriptional specificity. (B) The c-Myb (cyan) TAD interacts with TAF12 or p300 co-factors (pink) which directly recruit the general transcriptional machinery (transcription initiation complex (TIC), green). c-Myb binding to DNA and interaction with TAF12/p300 are likely stabilized by additional transcription factors and co-factors (purple).

The C-terminus of c-Myb is composed of multiple highly conserved domains that contribute to the specificity and regulation of the protein (Figure 1.2A). The FAETL domain (amino acids 365-411) may stabilize Myb in a functional conformation and is required for transformation in some assays (Fu and Lipsick 1996). The TPTPF domain is highly conserved, yet has no known function. Finally, the EVES domain is hypothesized to be involved in auto-inhibitory intramolecular interactions, where it folds back to interact with the DBD, thereby inactivating the protein (Dash, Orrico, and Ness 1996). It is clear that removal of this portion of the protein leads to dramatically increased transcriptional activity (Fu and Lipsick 1996), qualitatively different transcriptional specificity (F. Liu et al. 2006; Ness 2003), and the ability to transform cells (Dubendorff et al. 1992; Cuddihy et al. 1993).

1.2 Transcriptional regulation of the MYB gene

1.2.1 Upstream MYB promoter (TSS1)

As described above, the correct timing and quantity of c-Myb is essential for proper development. Extensive transcriptional regulation via multiple mechanisms ensures correct MYB gene expression. Transcription of the MYB gene typically begins at the promoter upstream of the first exon (TSS1; Figure 1.3). MYB TSS1 is within an annotated CpG island and is over 90% GC in some regions (Dvorák et al. 1989). TSS1 lacks both a TATA box and a CAAT box; instead, it appears that transcription is initiated by binding of the Sp1 transcription factor (Dvorák et al. 1989). It is likely this promoter is regulated in a cell-type and cell-stage specific manner by diverse signaling pathways, cell-type specific transcription factors, chromatin structure, and DNA methylation. For instance, in T-cells IL-2 activation induced E2F and NF-κB binding to TSS1 to stimulate transcription, which in turn stimulated proliferation and protected cells from apoptosis (Lauder, Castellanos, and Weston 2001). Deletion of either the E2F or the NF-κB binding site individually resulted in reduced MYB transcription, and deletion of both completely abolished MYB transcription in these cells (Lauder, Castellanos, and Weston 2001). Hence, MYB expression is maintained by highly coordinated interactions between multiple tissue-specific transcription factors and signaling pathways.
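As an aside on the GC content described for the TSS1 region, the following Python sketch shows one simple way such a figure could be computed, both for a whole sequence and in sliding windows. It is a minimal illustration under stated assumptions: the function names, window sizes, and toy sequences are hypothetical and are not taken from the cited studies or from the analyses in this dissertation.

```python
def gc_fraction(sequence):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(sequence, window, step=1):
    """GC fraction in successive windows, returned as (start, fraction) pairs."""
    last_start = len(sequence) - window
    return [
        (start, gc_fraction(sequence[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence only: a GC-rich stretch followed by an AT-rich stretch.
    toy = "GGCCGGCCGGAATTAATTAA"
    print(gc_fraction(toy))
    print(sliding_gc(toy, window=10, step=10))
```

A windowed profile of this kind is one way to see that a promoter region can exceed 90% GC locally even when the average over a longer interval is lower.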

Figure 1.3. Transcriptional regulation of the MYB gene. Transcriptional regulation of the MYB gene is centered around the first ~5kb of the gene including exon 1, intron 1, and exon 2. The upstream promoter (MYB TSS1) is extremely GC-rich and lacks a traditional TATA box or CAAT box; transcription is likely initiated by SP1 binding. MYB TSS1 is activated by cell-type specific transcription factors downstream of signaling pathways; for example, IL-2 signaling stimulates binding of E2F and NF-κB. Within the first intron RNA-polymerase encounters the attenuation site and stalls (yellow). Docking of cell-type specific cofactors, like NF-κB in T-cells and estrogen receptor in breast cancer, is necessary to relieve polymerase stalling and allow expression of the gene. Just before the second exon is an AT-rich, rarely used alternative promoter. This promoter has only been described in leukemia cells, where binding of the PBX2 transcription factor has a role in promoter selection.

1.2.2 Transcriptional regulation via RNA-polymerase pausing

MYB expression is regulated in multiple cell types by an RNA-polymerase (RNA-pol) attenuation site within the first intron (hairpin structure, Figure 1.3) (Yuan 2000; Pereira et al. 2015; H. Hugo et al. 2006). Transcription initiates at TSS1 and proceeds into the first intron. Approximately 1.7 kilobases (kb) downstream of MYB TSS1, the newly transcribed RNA forms a hairpin structure that causes RNA polymerase II to stall (Pereira et al. 2015). Polymerase stalling is relieved by docking of regulatory proteins to the hairpin structure, allowing transcriptional elongation to continue and produce full-length MYB transcripts (Pereira et al. 2015). In colorectal cancer and hematopoietic cells, elongation is controlled by subunit-specific binding of NF-κB to the RNA hairpin. Specifically, p50-p50 homodimer binding blocked elongation, whereas p50-p65 heterodimer binding recruited P-TEFb to phosphorylate Ser2 in the C-terminal domain of RNA polymerase II, which stimulated continued transcription of the remainder of the MYB gene (Pereira et al. 2015; Suhasini and Pilz 1999; Perkel, Simon, and Rao 2002). A similar mechanism is employed in breast cancer cells, where estrogen receptor binding to the hairpin structure recruits P-TEFb and relieves transcriptional attenuation (Drabsch et al. 2007). Thus, MYB transcription is tightly controlled at the elongation stage by cell-type specific factors modulating ubiquitous transcriptional machinery.

1.2.3 Rare alternative MYB promoter (TSS2)

Genome-wide studies estimate that half of human genes have multiple promoters, with an average of 4 promoters per gene (X. Wang et al. 2016). It is increasingly clear that alternative promoter use is an important mechanism to increase transcript and protein diversity (X. Wang et al. 2016). This is also the case with the MYB gene, which has an alternative promoter within the first intron directly upstream of exon 2 (MYB TSS2, Figure 1.3). TSS2 use has thus far only been reported in leukemia cell lines (Dassé et al. 2012; Jacobs, Gorse, and Westin 1994). Evidence from embryonic stem cells suggested that GC-rich promoters, like MYB TSS1, are active by default and must be actively silenced, whereas GC-poor promoters, like MYB TSS2, are inactive but can be selectively activated by cell-type specific transcription factors (Mikkelsen et al. 2007). Indeed, the two MYB gene promoters are activated differently in normal hematopoietic progenitors and transformed myeloid cells: TSS1 was active in normal hematopoietic progenitors, but MYB TSS2 was active in transformed myeloid cells (Dassé et al. 2012). It appeared that cell-type specific localization of the Pbx2 transcription factor to MYB TSS2 versus TSS1 was linked to TSS2 activation (Dassé et al. 2012). Thus, MYB promoter selection is likely achieved through differential recruitment of the basal transcriptional machinery to different core-promoter types by cell-type specific factors (Gross et al. 1998; Losick 1998; Mikkelsen et al. 2007).

1.2.4 Distant enhancers regulate MYB gene expression

In addition to the extensive local gene structure described above, the MYB gene can be regulated at long range by multiple distant enhancers. In primary human erythroid progenitors and myeloid progenitor cells, multiple enhancers interacted with the 5’ end of the MYB gene to form a dynamic chromatin hub that encompassed the entire first ~5kb of the gene, including the first exon, first intron, and second exon (Stadhouders et al. 2012; J. Zhang et al. 2016). As with the other regulatory features of MYB, these interactions appear to be cell-type specific. In myeloid cells Hoxa9 and PU.1 were integral in maintaining enhancer-promoter interactions (J. Zhang et al. 2016), but in erythroid cells KLF1 and the GATA1/TAL1/LDB1 complex bound and maintained the enhancer-promoter chromatin hub (Stadhouders et al. 2012).

1.3 c-Myb is the founding member of the Myb transcription factor family

There are three members of the Myb transcription factor family: c-Myb (the founding member), A-Myb, and B-Myb. (Henceforth, “Myb” generally refers to Myb proteins as a whole or unnamed truncated isoforms; specific names will be used when appropriate.) All three related transcription factors are expressed in vertebrates and share a conserved DNA-binding domain (Figure 1.4). However, the proteins have little conservation beyond the DBD and each transcription factor has a unique function. Knockout of A-Myb, encoded by the MYBL1 gene, results in male infertility (Toscani et al. 1997), and further studies identified it as a master regulator of male meiosis (Bolcun-Filas et al. 2011). The B-Myb transcription factor, encoded by the MYBL2 gene, plays an integral role in cell cycle regulation and progression (Sadasivam, Duan, and DeCaprio 2012). Each of these transcription factors regulates unique sets of genes and has different roles in a cell, despite the conserved DBD (Rushton et al. 2003; Rushton and Ness 2001). Domain swap experiments further illustrated that the DBDs of these three proteins were functionally interchangeable and that the unique C-terminal regulatory domains were responsible for maintaining the context-specific gene activation of each transcription factor (Lei et al. 2004; Rushton et al. 2003).

Figure 1.4. The Myb family of transcription factors. c-Myb, encoded by the MYB gene, is the founding member of the Myb transcription factor family. Two other Myb transcription factors are expressed in vertebrates, A-Myb and B-Myb, encoded by MYBL1 and MYBL2 respectively. These homologs share a conserved DBD that was functionally identical in domain swap experiments (Rushton and Ness 2001). The C-termini of the proteins are unique and determine their transcriptional specificity. The oncogenic v-Myb encoded by AMV was the first Myb protein discovered. It is a truncated and mutated version of c-Myb, which causes a disease similar to leukemia when expressed in chickens.

In addition to the vertebrate Myb genes, there are multiple examples of viral Myb homologs, which are all oncogenic. For example, the Avian Myeloblastosis Virus (AMV) encodes the v-Myb protein (Figure 1.4). v-Myb is a truncated, mutated, and oncogenic version of c-Myb that causes myeloblastosis, a disease similar to leukemia, in infected chickens. As illustrated in Figure 1.4, v-Myb is doubly truncated at both the N-terminus and C-terminus, only encoding amino acids 72-442 of the full-length c-Myb protein. The activities of vertebrate Myb proteins and truncated oncogenic v-Myb provided the first clues to Myb transcription factor specificity, discussed in detail later.
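To make the effect of truncation on domain content easier to follow, the sketch below records the domain coordinates given in section 1.1.2 (DBD, TAD, FAETL) and reports which of them lie entirely within a retained stretch of the protein. It is an illustrative Python example only: the function name is hypothetical, the TPTPF and EVES domains are omitted because their coordinates are not stated in the text, and the second truncation shown is an invented example rather than a described isoform.

```python
# Domain coordinates (first and last amino acid) as stated in section 1.1.2.
# TPTPF and EVES are omitted because the text does not give their positions.
C_MYB_DOMAINS = {
    "DBD": (72, 192),
    "TAD": (260, 321),
    "FAETL": (365, 411),
}


def retained_domains(first_aa, last_aa, domains=C_MYB_DOMAINS):
    """Names of domains that lie entirely within the retained residue range."""
    return [
        name
        for name, (start, end) in domains.items()
        if first_aa <= start and end <= last_aa
    ]


if __name__ == "__main__":
    # v-Myb encodes amino acids 72-442 of c-Myb, as stated in the text.
    print("v-Myb:", retained_domains(72, 442))
    # Hypothetical C-terminal truncation ending at residue 350 (illustration only).
    print("hypothetical 1-350:", retained_domains(1, 350))
```

With these coordinates, v-Myb retains the DBD, TAD, and FAETL domains, so its differences from c-Myb lie in the regions removed from either end and in its point mutations.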

1.4 How normal Myb transcription factors become oncoproteins

1.4.1 Normal full-length c-Myb does not transform cells

Early studies demonstrated that v-Myb was able to transform cells, yet full-length c-Myb had little to no transforming ability (Furuta et al. 1993; Grässer, Graf, and Lipsick 1991). The oncogenic and transforming ability of the normal vertebrate c-Myb transcription factor must be unleashed by truncation of one or both termini of the protein (Fu and Lipsick 1996; Hu et al. 1991; Ramsay, Ishii, and Gonda 1991; Grässer, Graf, and Lipsick 1991).

1.4.2 N-terminal truncation

The highly conserved N-terminus of c-Myb is constitutively phosphorylated at S11/S12 (Cures et al. 2001), followed by a helix-turn-helix protein domain (amino acids 47-72) similar to those that make up the DBD. N-terminal truncation of Myb proteins is observed in multiple oncogenic versions, including v-Myb, which is missing the first 72 amino acids. Additionally, integration of the avian leukosis virus within the 5’ end of the Myb gene results in a 20 amino acid N-terminal truncation (Jiang et al. 1997). Both of these N-terminally truncated proteins cause oncogenic transformation of infected cells, and a 20 amino acid N-terminal truncation was the minimal truncation needed to induce rapid onset of multiple tumor types in chickens (Jiang et al. 1997). More recently, recurrent mutations centered around amino acid 14 were described in pediatric T-cell acute lymphoblastic leukemia (Y. Liu et al. 2017), suggesting the N-terminus of c-Myb has an important role in restraining its oncogenic potential.

Due to its proximity to the DBD, past studies have speculated that this region modulates DNA binding. Phosphorylation of residues S11 and S12, which are absent from the truncated versions discussed above, increased the specificity of c-Myb by destabilizing DNA binding (Ramsay, Ishii, and Gonda 1991; Oelgeschläger et al. 1995). However, this effect could be easily overcome by protein-to-protein interactions with co-factors that anchored c-Myb to its binding site on DNA (Oelgeschläger et al. 2001). It is clear that the N-terminus is also involved in protein-to-protein interactions. To activate the endogenous, chromatin-embedded mim-1 gene, c-Myb requires the C/EBPβ (NF-M) co-factor to open the local chromatin structure, an interaction that was mapped to the N-terminus of c-Myb (Oelgeschläger et al. 2001; Introna et al. 1990; Ness et al. 1993). This region is altered in v-Myb, which is consequently unable to open the local chromatin structure to activate the endogenous mim-1 gene (Oelgeschläger et al. 1996; Ness et al. 1993). Thus, multiple lines of evidence suggest the N-terminus of c-Myb proteins has an important function and a role in oncogenicity.

1.4.3 C-terminal truncation

Multiple mechanisms leading to C-terminal truncation of c-Myb have also been described in human tumors, including alternative splicing in leukemias (Ohyashiki et al. 1988; O’Rourke and Ness 2008) and chromosomal translocations in adenoid cystic carcinoma (ACC) and glioma (Bandopadhayay et al. 2016; A. S. Ho et al. 2013). Multiple studies have demonstrated that C-terminal truncation is also sufficient to convert c-Myb into an oncogenic version (Cuddihy et al. 1993; Gonda, Buckmaster, and Ramsay 1989; Fu and Lipsick 1996; Hu et al. 1991). Indeed, only the DBD and TAD, which maintain the protein's core transactivation ability, are required to block terminal differentiation of murine erythroleukemia cells (Cuddihy et al. 1993). Instead, the C-terminus is involved in regulating the activity and specificity of c-Myb. Loss of this region leads to increased protein stability (Corradini et al. 2005), increased transcriptional activity (Fu and Lipsick 1996), and dramatically altered transcriptional specificity (O’Rourke and Ness 2008; Ness 2003; F. Liu et al. 2006). Loss of this regulatory and specificity region is likely key to the oncogenic activity of C-terminally truncated Myb proteins.

1.4.4 Context specific transcription factor code hypothesis

Initially it was hypothesized that truncated, oncogenic Myb was simply an activated form of c-Myb, i.e., it activated the same genes, just better. Instead, further investigations discovered that truncated Myb transcription factors have completely different transcriptional specificity than full-length c-Myb (Lei et al. 2004; F. Liu et al. 2006). The first evidence that Myb transcription factors had qualitatively different activities was provided by the endogenous, chromatin embedded mim-1 gene, which can be activated by c-Myb but not v-Myb (described in section 1.4.2) (Ness et al. 1993; Introna et al. 1990). Extensive microarray and reporter assays demonstrated definitively that v-Myb and c-Myb had different activities in cells, and appeared to act like completely different transcription factors (F. Liu et al. 2006). It was hypothesized that the unstructured C-terminal regulatory domains are involved in extensive protein-to-protein interactions that determine the specificity of Myb transcription factors, which was dubbed the transcription factor code hypothesis (Ness 2003). A simplified illustration of this hypothesis is outlined in Figure 1.5 (adapted from (George and Ness 2014)), where c-Myb interacts with the blue and green transcription factors to activate gene A, whereas v-Myb interacts with the orange and tan transcription factors to activate gene B. The interactions that Myb proteins are able to make are determined by their N-terminal and C-terminal domains, represented by the shape of Myb in Figure 1.5. Still, many aspects of the transcription factor code hypothesis remain unclear, such as which co-factors and transcription factors cooperate with truncated Myb, which genes are activated or silenced, and how these contribute to cell transformation.

Figure 1.5. Transcription factor code hypothesis explaining the transcriptional specificity of Myb proteins. The transcriptional specificity of Myb proteins is determined by their N- and C-terminal regulatory domains, represented here as shape. c-Myb interacts with the blue and green transcription factors to activate gene A, whereas v-Myb interacts with the orange and tan transcription factors to activate gene B. These interactions ultimately determine which genes Myb proteins can activate. Thus c-Myb activates “normal” genes, which are necessary for the proliferation and differentiation of normal cells, whereas truncated, oncogenic Myb activates “oncogenic” genes that lead to uncontrolled proliferation, dedifferentiation, and transformation. The identity of the cooperating transcription factors and activated genes is not clear. Figure is adapted from (George and Ness 2014).

1.5 Myb proteins are the driver oncogenes in adenoid cystic carcinoma

1.5.1 Disease characteristics of adenoid cystic carcinoma (ACC)

With approximately 1200 cases per year in the United States (Allen S. Ho et al. 2019), ACC is one of the most common tumors that arise in the salivary gland (Mitani et al. 2011). Tumors are often intractable to current treatments, resulting in a protracted disease course and poor long-term survival of between 23% and 40% (Hunt 2011; Allen S. Ho et al. 2019). ACC tumors are typified by slow, yet unpredictable and aggressive growth. Estimates of local disease recurrence are as high as 100%, and late occurring, distant metastases are reported in over 60% of patients (Jones et al. 1997; DeAngelis et al. 2011; Spiro 1997; van der Wal et al. 2002). ACC tumors also have extremely high incidences of perineural invasion (Gil et al. 2009), a condition where tumor cells invade the surrounding nerves. Perineural invasion significantly contributes to patient morbidity and is associated with local tumor recurrence (Dantas et al. 2015). To date, efforts to develop targeted therapeutic interventions for any aspect of the disease, including perineural invasion and metastases, have proved unfruitful (Dillon et al. 2016). Consequently, ACC patients face uncertain outcomes even after initial treatment due to local recurrence, distant metastases, and lack of targeted therapeutics.

1.5.2 MYB is a human proto-oncogene in ACC tumors

Chromosomal translocations in ACC tumors both activate expression of the MYB gene and truncate the resulting protein, the two requirements to unleash the oncogenic activity of Myb proteins (Nordkvist et al. 1994; M. Persson et al. 2009). Comprehensive whole genome studies established that ACC tumors have relatively quiet genomes, with MYB by far the most significantly altered gene (A. S. Ho et al. 2013). Early studies estimated that MYB was the driver oncogene in almost 60% of ACC tumors (A. S. Ho et al. 2013), and estimates have only increased since. Thus, this rare salivary gland tumor provided the first definitive evidence that truncated Myb proteins are capable of driving induction and progression of a human tumor.

1.5.3 Characteristics of chromosomal translocations in ACC tumors

The first described, and most common, chromosomal translocation in ACC tumors occurs between chromosomes 6 and 9, fusing the MYB and NFIB genes, respectively (Figure 1.6A). These chromosomal translocations almost always result in interruption and truncation of the MYB gene (Brayer et al. 2016), and produce truncated Myb proteins (Mitani et al. 2011). Chromosomal breakpoints are scattered throughout the MYB gene, ranging from exon 8 to exon 15, with no apparent hotspots. Importantly, all the identified breakpoints occur after the DBD and TAD, and were determined to encode functional Myb transcription factors in reporter assays (Figure 1.6B) (Brayer et al. 2016).

Figure 1.6. Chromosomal translocations are predicted to produce transcriptionally active, truncated Myb proteins. (A) Chromosomal translocations involving normal chromosomes 6 (pink) and 9 (blue) create the fusion (t(6;9)). Usually the first portion of the MYB gene (pink) is fused with the last portion of the NFIB gene (blue). (B) Gene fusions are predicted to produce truncated Myb proteins (protein illustrations). Some cases are predicted to encode fusion proteins (e.g. T349 and T013), but others are predicted to encode frame-shifts that truncate the Myb protein (e.g. T399). Chromosomal translocations involving the MYBL1 gene are also predicted to produce truncated Myb proteins (lower three proteins). All of these truncated or fused Myb proteins were able to activate expression of a synthetic Myb responsive promoter (5xMRE). Data for Figure 1.6B was collected by C. Frerich and published in (Brayer et al. 2016).

Detailed analyses of RNA-seq fusion transcripts showed that ACC tumors produce a variety of transcripts, some of which are predicted to encode fusion proteins while some are predicted to encode premature stop codons. Western blot analyses have shown that ACC tumors produce truncated Myb proteins (Mitani et al. 2016), but it is not clear if Myb-NFIB fusion proteins are in fact present in primary ACC tumors. In 2016 novel chromosomal translocations involving the related MYBL1 gene were described in ~20% of ACC tumors (Brayer et al. 2016; Mitani et al. 2016). Again, these chromosomal translocations most often fused the MYBL1 gene on chromosome 8 to the NFIB gene on chromosome 9. RNA-seq analyses demonstrated that ACC tumors harboring MYB t(6;9) and MYBL1 t(8;9) chromosomal translocations had similar gene expression signatures, indicating these two related transcription factors are interchangeable driver oncogenes (Brayer et al. 2016).

The full role of chromosomal translocations in activating the MYB gene and unleashing the Myb protein is still being revealed. Firstly, chromosomal translocations serve to truncate the MYB gene, converting c-Myb into an oncogenic transcription factor. As detailed in the transcription factor code hypothesis, truncated Myb proteins are predicted to have altered specificity from full-length c-Myb, and are thus able to drive tumorigenesis. Indeed, gene expression analyses have identified thousands of genes expressed differently in ACC tumors, yet the critical effects of truncated Myb transcription factors remain elusive.

CHAPTER 2: TRANSCRIPTOMES DEFINE DISTINCT SUBGROUPS OF SALIVARY GLAND ADENOID CYSTIC CARCINOMA WITH DIFFERENT DRIVER MUTATIONS AND OUTCOMES

Candace A. Frerich1, Kathryn J. Brayer1,2, Brandon M. Painter1, Huining Kang1, Yoshitsugu Mitani3, Adel K. El-Naggar3 and Scott A. Ness1,2,4

1 Department of Internal Medicine, University of New Mexico Health Sciences Center

2 University of New Mexico Comprehensive Cancer Center

3 Head and Neck Pathology, University of Texas MD Anderson Cancer Center

4 Corresponding Author

Oncotarget. 2018 doi:10.18632/oncotarget.23641. PMID: 29484115

2.1 ABSTRACT

The relative rarity of salivary gland adenoid cystic carcinoma (ACC) and its slow growing yet aggressive nature has complicated the development of molecular markers for patient stratification. To analyze molecular differences linked to the protracted disease course of ACC and metastases that form 5 or more years after diagnosis, detailed RNA-sequencing (RNA-seq) analysis was performed on 68 ACC tumor samples, starting with archived, formalin-fixed paraffin-embedded (FFPE) samples up to 25 years old, so that clinical outcomes were available. A statistical peak-finding approach was used to classify the tumors that expressed MYB or MYBL1, which had overlapping gene expression signatures, from a group that expressed neither oncogene and displayed a unique phenotype. Expression of MYB or MYBL1 was closely correlated to the expression of the SOX4 and EN1 genes, suggesting that they are direct targets of Myb proteins in ACC tumors. Unsupervised hierarchical clustering identified a subgroup of approximately 20% of patients with exceptionally poor overall survival (median less than 30 months) and a unique gene expression signature resembling embryonic stem cells. The results provide a strategy for stratifying ACC patients and identifying the high-risk, poor-outcome group that are candidates for personalized therapies.

2.2 INTRODUCTION

In an era of precision medicine, it has become increasingly important to define subgroups of patients likely to respond to specific therapeutic strategies. Adenoid cystic carcinoma (ACC), the second most frequent malignancy of the salivary glands (Mitani et al. 2011), is a slow growing yet aggressive tumor with a protracted disease course typified by local recurrence and/or metastasis, which often occurs 5 or more years after diagnosis (Hunt 2011). The standard treatment is surgical resection, but the effectiveness in preventing local recurrence and distant metastases is variable – survival ranges from less than 3 to more than 15 years, suggesting unexplained phenotypic or molecular heterogeneities (Mitani et al. 2011; D. Bell and Hanna 2012). Efforts to develop targeted treatments have been largely unfruitful (Dillon et al. 2016), highlighting the need for new and more effective therapeutic strategies.

ACC has been closely associated with the MYB oncogene since the discovery of recurrent t(6;9) translocations that fuse the MYB and NFIB genes in many of these tumors (M. Persson et al. 2009; Mitani et al. 2010; West et al. 2011). The MYB proto-oncogene encodes a DNA-binding transcription factor implicated in a variety of human hematopoietic, epithelial and neural malignancies (Ramsay and Gonda 2008; Y. Zhou and Ness 2011; George and Ness 2014). The recurrent t(6;9) translocation fuses the MYB gene on chromosome 6 to the NFIB locus on chromosome 9 and may lead to overexpression of an activated Myb protein or a novel Myb-NFIB fusion oncoprotein. Detailed epigenetic studies have shown that the translocation juxtaposes important enhancers from the NFIB locus to the MYB gene, leading the oncogene to be aberrantly overexpressed (Drier et al. 2016). However, estimates of the fraction of ACC tumors that harbor the t(6;9) translocation or that express Myb proteins or MYB-NFIB fusion transcripts have varied (Mitani et al. 2011; Brill et al. 2011; Fehr et al. 2011; Marta Persson et al. 2012; Pusztaszeri et al. 2014; 2014; Brayer et al. 2016). These discrepancies may be due to numerous factors, including small cohort sizes, the use of frozen vs. archival FFPE material from different institutions, different types of detection methods or even problematic antibodies used in molecular assays. The confusion has led some authors to conclude that MYB is unlikely to be an important driver oncogene in ACC tumors (D. Bell et al. 2016) or even that the fusion partner NFIB plays a more important functional role than expected (Rettig et al. 2016). These issues became even more complex with the discovery of alternative translocations in some ACC tumors. For example, instead of fusions with the MYB gene, a subgroup of ACC tumors display fusions of the MYBL1 gene on chromosome 8, fused to either the NFIB or RAD51B genes (Brayer et al. 2016; Mitani et al. 2016). MYBL1 encodes the A-Myb transcription factor that is highly related to Myb: the two proteins can bind the same DNA sequences and can activate the same target genes (Y. Zhou and Ness 2011; George and Ness 2014). Another subgroup of ACC tumors has been described that have point mutations in NOTCH1 (Ferrarotto et al. 2017). Thus, despite considerable progress, there remains uncertainty about the extent of heterogeneity amongst ACC mutations, the importance of different candidate driver oncogenes in ACC tumor development and progression, and consequently what the appropriate course of action should be for developing targeted therapeutic agents.

Since metastases in ACC tumors may develop after 5 years or more, linking molecular data to outcomes is challenging due to the need to analyze relatively old samples, which may not have been preserved with RNA or DNA analysis in mind. In addition, there have been reports that several supposed ACC cell lines have been misidentified or could be contaminated by other cell types, so studies that have relied primarily on cell line analyses may be compromised (Phuchareon et al. 2009; M. Zhao et al. 2011). Fortunately, recent advances in the analysis of RNA derived from archival FFPE samples (Brayer et al. 2016; Brown et al. 2017) provide a new opportunity to analyze gene expression patterns in rare tumors like ACC, using primary patient samples that are more than a decade old. Here, we describe the unbiased RNA-seq analysis of the largest cohort of ACC tumor samples to date: 68 archival FFPE salivary gland ACC tumors accompanied by retrospective clinical data, collected over a period of 25 years. The analysis revealed unforeseen heterogeneity amongst the ACC patients and provided evidence of diverse molecular signatures amongst ACC tumors as well as genes associated with poor outcome that could serve as novel biomarkers or targets for future therapeutic strategies.

2.3 RESULTS

2.3.1 RNA-seq analysis of ACC tumor samples up to 25 years old

In an earlier study, we compared the RNA-seq profiles of ACC tumors to normal salivary gland, but many of those tumor samples lacked clinical follow-up data (Brayer et al. 2016). Many ACC patients survive more than 5 years after surgery before succumbing to distant metastases, necessitating the analysis of relatively old samples with informative outcome information. We tested improved RNA-seq methods (Brayer et al. 2016; Brown et al. 2017), using a small set of ACC samples collected over a range of dates up to 25 years ago. RNA was isolated from FFPE sections and analyzed using optimized library methods and the Ion Proton instrument, which has the advantage of being able to analyze fragments as short as 25 nt. More than 85% of the initial samples, regardless of their age, yielded RNA-seq data suitable for our study. We expanded our analysis to 77 samples with follow-up periods of at least 5 years, of which 68 (88%) yielded high quality RNA-seq results, with an average of ~15 x 10^6 reads for each sample (Table 2.1). An average of 9% of the reads mapped uniquely to exon features. These are all new ACC samples, not analyzed in our previous study (Brayer et al. 2016). Figure 2.1A summarizes the number of reads mapped to exons obtained for each sample, as a function of the years since sample collection. Although some samples performed better than others, there was not a significant correlation between the number of high quality, exon mapped reads obtained and the age of the FFPE samples (R-squared = 0.02). We also performed several types of quality control checks on the RNA-seq data and used those results to eliminate outlier samples (Brown et al. 2017). For example, we compared the total RNA-seq reads to the exon mapped reads (Figure 2.1B) and the number of reads in the XIST gene, a female-specific non-coding RNA expressed from the silenced X-chromosome, as a check of the reported gender information (Figure 2.1C). These results confirmed that RNA-seq can provide useful gene expression information from FFPE samples, even for archived samples that were collected more than 10 years ago.
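The two quality control checks described above (exon-mapped fraction and XIST-based confirmation of reported sex) can be illustrated with a minimal Python sketch. The sample records, the field names and the XIST read cutoff below are all hypothetical and chosen only for illustration; they are not values or thresholds taken from this study.

```python
# Hypothetical per-sample QC records (illustrative values, not study data).
samples = [
    {"id": "S1", "sex": "F", "total_reads": 15_000_000, "exon_reads": 1_400_000, "xist_reads": 320},
    {"id": "S2", "sex": "M", "total_reads": 12_000_000, "exon_reads": 1_100_000, "xist_reads": 2},
    {"id": "S3", "sex": "M", "total_reads": 20_000_000, "exon_reads": 1_900_000, "xist_reads": 410},
]

# Assumed cutoff for calling XIST "expressed"; a real cutoff would be chosen from the data.
XIST_MIN_READS = 50


def exon_fraction(sample):
    """Fraction of all reads that map to exon features."""
    return sample["exon_reads"] / sample["total_reads"]


def sex_from_xist(sample, min_reads=XIST_MIN_READS):
    """Infer sex from XIST reads: XIST is expressed from the silenced X in female cells."""
    return "F" if sample["xist_reads"] >= min_reads else "M"


def flag_sex_mismatches(samples):
    """Return IDs of samples whose XIST-inferred sex disagrees with the reported sex."""
    return [s["id"] for s in samples if sex_from_xist(s) != s["sex"]]


fractions = {s["id"]: round(exon_fraction(s), 3) for s in samples}
mismatches = flag_sex_mismatches(samples)

# The averages reported in Table 2.1 give the same ~9% exon-mapped fraction.
table_fraction = round(1.41 / 15.17, 2)
```

In this toy data set, sample S3 is reported as male but has high XIST counts, so it would be flagged as possibly mislabeled and reviewed before further analysis.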

2.3.2 Most ACC tumors express either MYB or MYBL1

Although rearrangements of the MYB and MYBL1 genes have been observed in many ACC tumors (Brayer et al. 2016), there has been some controversy about the importance of the oncogenes (D. Bell et al. 2016). In addition, commercially available antibodies to measure Myb protein levels by immunohistochemistry can be problematic (data not shown), which could contribute to some of the reported differences in the fraction of ACC samples that express Myb proteins. To increase sensitivity, we started with the RNA-seq raw aligned read (e.g. ‘.bam’) files and used a peak-calling algorithm to identify the samples that did or did not express the MYB or MYBL1 genes above background. The advantage of the peak-calling algorithm is that it also makes use of the reads that map to intron regions, rather than only the exon-mapped reads we used for quantifying gene expression (Brayer et al. 2016; Brown et al. 2017). The peak-calling results for these and several other genes are shown in Figure 2.1D, where colored lines indicate a region of gene expression defined by the peak-calling algorithm. The results led us to divide the samples into three groups: 49 of the samples expressed MYB (dark blue, upper samples), 7 expressed MYBL1 (cyan) and 12 expressed neither oncogene (orange, bottom). Overall, 56 of the 68 samples or 82% expressed either MYB or MYBL1. Interestingly, none of the samples expressed both MYB and MYBL1, consistent with the hypothesis that these are the interchangeable driver oncogenes for most ACC tumors – there is no need or selection pressure for a tumor to express both. The peak-calling algorithm was able to distinguish many of the samples in which the MYB or MYBL1 genes were truncated due to translocations (indicated by shorter lines that fail to extend across the entire gene). For comparison, Figure 2.1D also shows the peak-calling results for several other genes: NFIB and NOTCH1, which have been implicated as important in ACC tumors and are expressed in most, but not all of the samples from all three groups, and VGLL3, an example of a gene that was expressed only in samples that expressed neither MYB nor MYBL1. The VGLL3 gene encodes a transcription factor implicated in other epithelial tumors (Gambaro et al. 2013; Tufegdzic et al. 2015), but its importance in ACC has not been established. We include it here as an example only to illustrate the striking differences in gene expression profiles in these samples. Two of the samples in our cohort did not express NFIB above background levels, suggesting that NFIB expression is not required for the development of all ACC tumors (Rettig et al. 2016). However, the samples that were positive expressed high levels of transcripts, suggesting that at least one allele of the NFIB gene was very highly expressed in most samples.

Figure 2.1. RNA-seq identifies distinct subgroups of ACC tumor samples. (A) Plot of years since samples were collected vs. RNA-seq reads mapped to exons in the reference genome shows that high-quality results were obtained with samples collected up to 24 years ago, and that quality did not correlate with the age of the samples. (B) Plot of total RNA-seq reads vs. exon mapped reads, one of the quality control measures employed in this study. (C) Plot of reads mapped to the XIST gene as a function of reported gender in the associated clinical data. This quality control step is useful to identify mislabeled samples. (D) Genome browser representation of peak-calling results generated from ACC tumor sample RNA-seq data for the genes indicated. Gene names and exon/intron structures are at top, arrows show the direction of transcription, each horizontal line or track is a different ACC sample, ordered to cluster the samples with similar gene expression patterns and colored bars indicate regions of transcription detected by the peak-calling algorithm. Note that the MYB gene is transcribed left-to-right, but the others are right-to-left. Samples that express MYB (dark blue), MYBL1 (cyan) or neither oncogene (orange) are labeled at left.

Table 2.1. RNA-Seq Statistics

Total Samples, n                               68
  Female                                       30
  Male                                         38
Average Total Reads, x10^-6 (range)            15.17 (4.26-27.98)
Average Exon Mapped Reads, x10^-6 (range)      1.41 (0.34-2.68)
MYB overexpression                             49 (72%)
MYBL1 overexpression                           7 (10%)
No MYB or MYBL1                                12 (18%)
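As a quick arithmetic check, the group sizes reported in this section can be converted to the percentages quoted in the text and in Table 2.1. The sketch below uses only the counts given in the source (49, 7 and 12 of 68 samples, and 68 usable samples out of 77 tested); the variable names are illustrative.

```python
# Counts reported in the text and in Table 2.1.
TOTAL_SAMPLES = 68
SAMPLES_TESTED = 77
groups = {"MYB": 49, "MYBL1": 7, "neither": 12}

# The three groups are mutually exclusive and must account for every sample.
assert sum(groups.values()) == TOTAL_SAMPLES


def percent(count, total):
    """Percentage rounded to the nearest whole number, as reported in the tables."""
    return round(100 * count / total)


group_percent = {name: percent(n, TOTAL_SAMPLES) for name, n in groups.items()}
either_percent = percent(groups["MYB"] + groups["MYBL1"], TOTAL_SAMPLES)
usable_percent = percent(TOTAL_SAMPLES, SAMPLES_TESTED)

print(group_percent)   # {'MYB': 72, 'MYBL1': 10, 'neither': 18}
print(either_percent)  # 82
print(usable_percent)  # 88
```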

2.3.3 Analysis of RNA-seq data for evidence of fusion transcripts

Chromosome translocations and gene fusions are important driver mutations for many types of leukemia and solid tumors, such as ACC, but their detection can be problematic. We used several approaches to attempt to identify potential gene fusions in the ACC tumor RNA-seq data. The peak-calling algorithm described in Figure 2.1 identified some tumors that appeared to have truncated oncogenes. A splice-aware aligning program, STAR (Dobin et al. 2013), was used to identify chimeric reads that aligned to two different genes. Candidates were then verified by visually inspecting the reads using a genome browser. The results of these efforts are summarized in Table 2.2. Despite the relatively poor quality of the starting RNA used for these studies, and the modest read depths obtained, we were able to identify putative chimeric or fusion reads in a large fraction of the samples. We identified MYB-NFIB fusion reads in 11 samples and MYBL1-NFIB fusion reads in 3 samples. We also identified fusions between MYB and the PDCD1LG2 or EFR3A genes in two additional samples, and validated those novel fusions by amplifying them using genomic DNA-based PCR followed by conventional (Sanger) sequencing (for details and sequencing results see Supplementary Table S.2.1). We identified 29 samples that appeared to have truncated MYB gene transcripts where no fusion reads could be found, so the fusion partner remains uncharacterized. Similarly, 4 samples appeared to have truncated MYBL1 genes based on the RNA-seq data, but insufficient fusion reads were found to identify a fusion partner. Although the analysis of RNA-seq data was able to identify many examples of fusion transcripts, this type of analysis cannot identify other types of fusions or gene rearrangements that may not lead to the expression of fusion transcripts, so the percentages of cases with translocations should be considered an underestimate.

Table 2.2. Observed and Putative MYB gene fusions identified in 68 ACC tumors

Partner 1   Partner 2                 Putative Translocation*   No. of Cases
MYB         NFIB                      t(6;9)                    11
MYB         PDCD1LG2                  t(6;9)                    1
MYB         EFR3A                     t(6;8)                    1
MYB         Fusion partner unknown    t(6;?)                    29
MYBL1       NFIB                      t(8;9)                    3
MYBL1       Fusion partner unknown    t(8;?)                    4

Tumors with apparent MYB truncation or translocation*                    42 (62%)
Tumors with apparent MYBL1 truncation or translocation*                  7 (10%)
Total number of tumors with apparent MYB or MYBL1 translocations*        49 (72%)
Tumors that over-express MYB, but no evidence of truncation              7 (10%)
Tumors that express neither MYB nor MYBL1                                12 (18%)

* Based only on RNA-seq results, not confirmed by FISH.
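The way per-sample evidence is sorted into the categories of Table 2.2 can be sketched as a simple decision rule: whether the sample expresses MYB or MYBL1, whether chimeric reads identify a fusion partner, and whether the transcript appears truncated. The record fields and the toy samples below are hypothetical and serve only to illustrate the logic; this is not the pipeline used in the study.

```python
from collections import Counter


def classify(sample):
    """Assign a sample to a Table 2.2-style category from three pieces of evidence.

    sample["expressed"]      -- "MYB", "MYBL1" or None (peak-calling result)
    sample["fusion_partner"] -- partner gene name if chimeric reads were found, else None
    sample["truncated"]      -- True if the transcript appears truncated
    """
    gene = sample["expressed"]
    if gene is None:
        return "neither MYB nor MYBL1"
    if sample["fusion_partner"]:
        return f"{gene}-{sample['fusion_partner']} fusion"
    if sample["truncated"]:
        return f"{gene} truncated, fusion partner unknown"
    return f"{gene} expressed, no evidence of truncation"


# Hypothetical calls for five samples (illustrative only).
calls = [
    {"expressed": "MYB", "fusion_partner": "NFIB", "truncated": True},
    {"expressed": "MYB", "fusion_partner": None, "truncated": True},
    {"expressed": "MYB", "fusion_partner": None, "truncated": False},
    {"expressed": "MYBL1", "fusion_partner": "NFIB", "truncated": True},
    {"expressed": None, "fusion_partner": None, "truncated": False},
]

tally = Counter(classify(s) for s in calls)
for category, count in tally.items():
    print(f"{category}: {count}")
```

Because a fusion partner is only recorded when chimeric reads are found, samples with rearrangements that produce no fusion transcript fall into the "partner unknown" or "no evidence of truncation" categories, which is why the translocation percentages are best read as lower bounds.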

2.3.4 Gene expression signatures identify major subgroups of ACC tumors

In addition to the survival groups described above, our peak-calling analysis established that ACC tumors form at least three groups, based on the expression of MYB, MYBL1 or neither oncogene. We characterized the gene expression signatures in the ACC tumors to investigate the differences or similarities in these groups. As shown in Figure 2.2A, Principal Components Analysis separated the ACC tumors into two major groups. The samples that expressed neither MYB nor MYBL1 clustered at the right side of the plot (orange). The remaining tumors expressing either MYB (blue) or MYBL1 (cyan) were on the left, and completely overlapped, suggesting that the two oncogenes were interchangeable and contributed to similar gene expression profiles. This is consistent with previous studies showing that swapping the DNA binding domains of the c-Myb and A-Myb transcription factors, encoded by the MYB and MYBL1 genes, respectively, resulted in only minimal changes in specificity and activity (Lei et al. 2004). The samples that expressed neither MYB nor MYBL1 were specifically re-checked to make sure that they were diagnosed correctly. Re-examination revealed that all cases were adenoid cystic cancer composed of tubular and cribriform patterns with no solid features. The majority arose from minor salivary gland sites (see Supplementary Table S2.2).

Figure 2.2. Differential Gene Expression Analysis: MYB/MYBL1 vs Neither Oncogene. (A) Principal components analysis of ACC tumor RNA-seq data. The colors indicate the samples that express MYB (dark blue), MYBL1 (cyan) or neither of the oncogenes (orange), as determined by the peak-calling results summarized in Figure 2.1D. Note that the orange samples that express neither MYB nor MYBL1 separate from the others and form their own group on the right side of the plot. (B) Volcano plot summarizing the differential gene expression analysis, showing log2 of fold change vs. log10 of the p-value (BH adjusted). See Materials and Methods for details. (C) The heatmap summarizes the supervised clustering and differential gene expression analysis comparing the samples expressing MYB or MYBL1 (marked blue or cyan at top) to the samples expressing neither oncogene (marked orange at top). The side bar at left indicates genes that are listed in the drug gene interactions database. Several interesting genes specific for the two groups are labeled at right. A larger version of this heatmap with all the genes labeled is provided in the supplementary results (Supplementary Figure S1).

For differential gene expression analysis, we treated the tumors expressing either MYB or MYBL1 (dark blue or cyan in Figure 2.2A) as one group and searched for genes distinguishing them from the tumors expressing neither oncogene (orange in Figure 2.2A). As shown in the Volcano plot in Figure 2.2B, our analysis identified more than 1,500 genes that were at least 2-fold up- or down-regulated, with an adjusted p-value of 0.05 or less. The heatmap shown in Figure 2.2C summarizes the supervised clustering analysis. The dendrogram and the color bar at top identify the tumors that express either MYB or MYBL1 (right, dark blue and cyan) and the tumors that express neither oncogene (left, orange).
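The selection criteria described above (at least 2-fold change, BH-adjusted p-value of 0.05 or less) can be illustrated with a short Python sketch. The Benjamini-Hochberg function below is a generic textbook implementation, not the code used in this study, and the gene names, fold changes and p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices from smallest to largest p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de_genes(results, min_abs_log2fc=1.0, max_padj=0.05):
    """Keep genes that change at least 2-fold (|log2 FC| >= 1) with adjusted p <= 0.05."""
    names = list(results)
    padj = bh_adjust([results[g]["p"] for g in names])
    return [
        g for g, q in zip(names, padj)
        if abs(results[g]["log2fc"]) >= min_abs_log2fc and q <= max_padj
    ]


# Hypothetical test results for four placeholder genes (not study data).
results = {
    "GENE_A": {"log2fc": 2.5, "p": 0.001},
    "GENE_B": {"log2fc": -1.8, "p": 0.01},
    "GENE_C": {"log2fc": 0.4, "p": 0.002},   # significant but less than 2-fold
    "GENE_D": {"log2fc": 3.0, "p": 0.2},     # large change but not significant
}

selected = select_de_genes(results)
print(selected)  # ['GENE_A', 'GENE_B']
```

A threshold of |log2 fold change| >= 1 corresponds to the 2-fold cutoff in either direction, so both up- and down-regulated genes are retained.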

Several important conclusions can be drawn from the heatmap in Figure 2.2C. As described above, the tumors expressing either MYB or MYBL1 do not form their own groups and do not have distinct gene expression signatures. Instead, the samples expressing MYBL1 are scattered amongst the MYB samples, suggesting that the oncogenes are interchangeable and that either can suffice as the key driver for these tumors. However, there is evidence of heterogeneity amongst the tumors expressing MYB or MYBL1. Several subgroups are apparent in the dendrogram at the top, and are especially evident in the top half of the heatmap, which shows clusters of tumors with different patterns of gene expression.

Another conclusion is that there are hundreds or thousands of gene expression differences between the MYB/MYBL1 samples and the tumors that express neither oncogene. This was unexpected, since all of these tumors are classified as ACC, but explains why the samples were so easily distinguished in the principal components analysis (Figure 2.2A). Several interesting genes have been highlighted and labeled in the heatmap in Figure 2.2C (a full-size version with all the genes labeled is provided as Supplementary Figure S.2.1). Some of the most interesting genes up-regulated in the tumors that express neither MYB nor MYBL1 included JUNB, FOXO1, , VGLL3, and FOSB, all of which encode transcription factors and could be potential ‘drivers’ of this ‘non-MYB’ subgroup of ACC tumors. In contrast, the genes correlated most closely with MYB or MYBL1 expression included chemokine CXXC4 and the transcription factors SOX4 and EN1. The latter gene, which encodes the Engrailed 1 transcription factor, has been identified previously as an important biomarker in ACC tumors (Diana Bell et al. 2012). The SOX4 gene was also identified previously as being up-regulated in ACC tumors (Frierson et al. 2002). Our results show that both EN1 and SOX4 are highly correlated with the expression of MYB/MYBL1, suggesting that they could be direct downstream targets regulated by the oncogenes.
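One simple way to quantify how closely a candidate target gene tracks MYB/MYBL1 expression is a Pearson correlation across samples. The sketch below is illustrative only: the expression values are hypothetical log-scale numbers, and combining the two oncogenes by taking the higher of the two values per sample is an assumption made here because each tumor expressed at most one of them.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical log-scale expression values for five tumors (not study data).
myb = [8.0, 7.5, 0.5, 0.2, 0.3]
mybl1 = [0.1, 0.2, 7.0, 0.3, 0.1]
sox4 = [9.0, 8.5, 8.8, 3.0, 2.5]

# Assumption: treat MYB and MYBL1 as one "driver" signal by taking the higher value.
driver = [max(a, b) for a, b in zip(myb, mybl1)]

r_driver = pearson(driver, sox4)
r_myb_only = pearson(myb, sox4)
print(f"driver vs SOX4: r = {r_driver:.2f}")
print(f"MYB alone vs SOX4: r = {r_myb_only:.2f}")
```

In this toy example the combined driver signal correlates more strongly with the target than MYB alone, because the MYBL1-expressing tumor also has high target expression.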

2.3.5 EN1 and SOX4 are Myb-regulated target genes in ACC tumors

Comprehensive chromatin immunoprecipitation-sequencing (ChIP-seq) results for ACC tumor samples have been reported (Drier et al. 2016). We analyzed the publicly available data and confirmed that, although the binding is weak, both the EN1 and SOX4 promoters are occupied by Myb proteins in ACC tumors (data not shown). We used PCR to amplify the promoters of each gene, cloned them into reporter gene plasmids and tested their response to Myb proteins in transfection/reporter gene assays. Diagrams of the promoter regions of the EN1 and SOX4 genes are shown in Figure 2.3A, along with the regions that we cloned into the reporter gene plasmids. Both promoters contain predicted Myb Response Elements (Ness et al. 1993; Ness, Marknell, and Graf 1989), indicated by red marks in the promoter fragments shown in Figure 2.3A. For functional assays, we transfected the reporter plasmids into HEK293T cells, which lack endogenous c-Myb or A-Myb protein expression, along with control (empty vector) plasmid or plasmids expressing wild type c-Myb or MYB-NFIB or MYBL1-RAD51B fusion oncogenes identified previously (Brayer et al. 2016). As shown in Figure 2.3B, both the EN1 (gray) and SOX4 (blue) reporter genes were strongly (3-14 fold) activated by co-transfection of plasmids expressing wild type or oncogenic Myb proteins. Neither promoter was significantly activated by a negative control vector expressing a c-Myb protein with a mutated DNA binding domain that is unable to bind DNA (not shown). These results confirm that both the c-Myb protein encoded by the MYB gene and the A-Myb protein encoded by MYBL1 can activate the EN1 and SOX4 promoters in transfection assays. Based on these results, the published ChIP-seq results showing that these promoters are occupied by Myb proteins in ACC tumors, and the tight correlation between MYB/MYBL1, EN1 and SOX4 RNA levels in ACC tumor samples, we conclude that EN1 and SOX4 are likely to be direct targets of regulation by Myb proteins in ACC tumors. However, additional experiments will be required to determine whether these two Myb-regulated genes play a direct role in the development or pathogenesis of ACC tumors.

Figure 2.3. EN1 and SOX4 promoter reporter gene assays. (A) Structure of EN1 and SOX4 promoters and reporter gene vectors. The diagrams show the 5’-end of each gene with the normal transcription start site indicated with an arrow and the fragment used for the promoter-reporter constructs indicated below. Red marks indicate predicted binding sites for Myb proteins (Myb Response Elements). The full DNA sequence of each cloned fragment is provided in Methods. (B) Transfection-reporter gene results. The EN1 and SOX4 promoter-reporter gene plasmids were co-transfected into HEK293T cells along with control (empty) vector or plasmids expressing the normal, full-length c-Myb or either MYB-NFIB or MYBL1-RAD51B fusion constructs. The diagrams at left show the structures of the fusion proteins that were expressed. The bar graph at right shows luciferase reporter gene activity normalized to the level of control (empty) vector for EN1 (gray) and SOX4 (blue) promoter-reporter plasmids.

2.3.6 Identification of a high-risk, poor-outcome subgroup of ACC patients

ACC is a morphologically and clinically heterogeneous disease, which makes grading and treatment challenging. Our previous analyses using only 20 tumor samples suggested that there was heterogeneity amongst ACC tumors (Brayer et al. 2016). To investigate this further, we performed unsupervised hierarchical clustering of the gene expression data for the 68 new tumors, generating the dendrogram shown in Figure 2.4A. The ACC tumors formed two major groups. Group 1 (red, n=14) was distinct and separated in the dendrogram, indicating that the samples were quite different in terms of major gene expression characteristics. Group 2 (n=54, black and orange) contained the majority of cases and was composed of several smaller subgroups, implying additional genetic heterogeneity amongst ACC tumors that could be biologically important.

All of the samples that expressed neither MYB nor MYBL1 (orange) were in the larger Group 2. We used Kaplan-Meier survival analysis to evaluate all the samples with survival data (Figure 2.4B), which revealed a median survival for all ACC patients of 147 months and a 5-year survival rate of 72% (95% Confidence Interval, C.I.: 0.62-0.84). However, as shown in Figure 2.4C, the 13 patients in Group 1 (red) with survival information displayed a median survival of only 28 months, a survival rate of only 54% after 2 years (95% C.I.: 0.33-0.89) and a dismal 31% survival over 5 years (95% C.I.: 0.14-0.70). There were no patients in Group 1 that survived more than 10 years. The patients

Figure 2.4. Identification of a high-risk subgroup of ACC patients. (A) Unsupervised hierarchical clustering: ACC tumor samples form two major clusters, labeled Group 1 (red) and Group 2 (orange and black). (B) Kaplan-Meier survival plot for all 68 ACC tumor samples with survival information showing median survival (red) as well as 95% confidence intervals (cyan and dark blue). (C) Kaplan-Meier survival plots of ACC tumor samples in Groups 1 and 2 (red and black, respectively). The groups contained 13 and 55 patients, respectively. (D) Principal components analysis of ACC tumor RNA-seq data. The colors indicate the samples that express either MYB or MYBL1 in Group 1 (red) or Group 2 (black) or the samples that express neither of the oncogenes (orange), as determined by the peak-calling results summarized in Figure 1D. Note that the samples that express neither MYB nor MYBL1 (orange) separate from the others and form their own group on the right side of the plot. The poor survival Group 1 samples (red) cluster at the upper left corner of the plot. (E) The heatmap summarizes the results of differential gene expression analysis comparing the poor survival Group 1 (left, red) and better survival Group 2 (right) ACC samples. The color bar at top indicates samples in Group 1 (red) or samples that express MYB (dark blue), MYBL1 (cyan) or neither oncogene (orange). Several interesting genes up-regulated in each group are labeled at right. A larger version of this heatmap with all the genes labeled is provided in supplementary information as Supplemental Figure S2.2.

in Group 2 (black) displayed significantly better survival (log-rank p-value < 1 × 10⁻⁵) with an average 92% survival over 2 years (95% C.I.: 0.85-1.0), more than 81% survival over 5 years (95% C.I.: 0.71-0.93) and 72% survival over 10 years (95% C.I.: 0.61-0.87). Group 2 contained samples that expressed MYB plus all of the samples that expressed MYBL1 and all of the samples that expressed neither oncogene (described in Figure 2.1). When tested separately, all of these subgroups had relatively good survival. The Principal Components Analysis plot in Figure 2.4D combines these survival clusters with the results shown above in Figure 2.2. The ACC samples form three distinct groups: the poor survival Group 1 samples (red) at upper left, the samples that express neither MYB nor MYBL1 at the right (orange), and the remainder of the MYB or MYBL1 expressing samples at lower left (black). Thus, gene expression patterns identified a previously unknown subgroup of ACC patients with significantly worse overall survival and divided the ACC patients into three distinct groups with different driver oncogenes and outcomes.

A number of publications have reported markers for identifying poor survival subgroups of ACC patients (Mitani et al. 2011; Diana Bell et al. 2012; Yi, Li, and Zhou 2016; H. Liu et al. 2015; Phuchareon, van Zante, et al. 2014; Qu et al. 2016; Shao et al. 2011; Zhu et al. 2014; Y. L. Tang et al. 2014; Chang et al. 2016; Dai et al. 2014). Most of the markers were originally tested using antibody staining in immunohistochemistry assays, and some were developed using ACC cell lines whose authenticity has been called into question (Phuchareon et al. 2009; M. Zhao et al. 2011). We tested whether 20 previously identified markers were useful for identifying poor survival subgroups in our cohort, using the RNA-seq data. The results, summarized in Table 2.3, showed that only MYB, PTK2 (FAK) and SNAI2 (Slug) were useful for identifying subgroups of ACC tumors that showed significant differences in survival, based on using a Cox proportional hazard model analysis. None of the other markers yielded significant results, although the failures could reflect a difference between RNA expression data and protein expression as measured by immunohistochemistry assays.

Table 2.3. Genes reported to be linked to poor prognosis in ACC tumors

Gene           Reference                              Correlated to  Methods*   HR**    95% C.I.        p-value
BMI1           (Yi, Li, and Zhou 2016)                Poor Outcome   IHC        0.98    0.59-1.61       0.929
CDH1           (Yi, Li, and Zhou 2016)                Good Outcome   IHC        1.08    0.12-9.43       0.945
EN1            (Diana Bell et al. 2012)               Poor Outcome   IHC        2.82    0.66-12.01      0.160
EPAS1 (HIF2a)  (C. Zhou et al. 2012)                  Poor Outcome   QPCR, IHC  0.16    0.02-1.25       0.081
FABP7          (Phuchareon, Overdevest, et al. 2014)  Poor Outcome   QPCR       2.20    0.83-5.81       0.111
ILK            (H. Liu et al. 2015)                   Poor Outcome   IHC        0.57    0.09-3.73       0.555
KIT            (Phuchareon, van Zante, et al. 2014)   Poor Outcome   QPCR       1.13    0.17-7.42       0.897
MYB            (Mitani et al. 2011)                   Poor Outcome   QPCR       3.59    0.95-13.53      0.059
NOTCH1         (Su et al. 2014)                       Poor Outcome   IHC        2.11    0.08-54.54      0.653
PDCD4          (Qi et al. 2013)                       Good Outcome   IHC        0.68    0.21-2.26       0.531
PIM1           (Zhu et al. 2014)                      Poor Outcome   IHC        0.57    0.20-1.58       0.277
PPM1D (WIP1)   (Y. Tang et al. 2015)                  Poor Outcome   IHC        1.79    0.55-5.84       0.333
PTEN           (H. Liu et al. 2015)                   Good Outcome   IHC        0.19    0.01-3.27       0.252
PTK2 (FAK)     (H. Liu et al. 2015)                   Poor Outcome   IHC        126.26  2.89-5507.29    0.012
SNAI1 (Snail)  (Yi, Li, and Zhou 2016)                Poor Outcome   IHC        0.94    0.60-1.47       0.790
SNAI2 (Slug)   (Yi, Li, and Zhou 2016)                Poor Outcome   IHC        0.34    0.13-0.91       0.032
SOD2           (Chang et al. 2016)                    Poor Outcome   IHC        0.55    0.10-2.98       0.492
               (Dai et al. 2014)                      Poor Outcome   QPCR       0.65    0.36-1.16       0.144
TWIST2         (C. Zhou et al. 2012)                  Poor Outcome   QPCR, IHC  0.85    0.50-1.43       0.529
ZEB2 (SIP1)    (C. Zhou et al. 2012)                  Poor Outcome   QPCR, IHC  0.48    0.07-3.24       0.453

*IHC = Immunohistochemistry; QPCR = Quantitative PCR or other PCR assay.
**HR = Hazard Ratio for one standard deviation increase of gene expression in log scale.

We also tested whether any of the other clinical parameters provided with our cohort of samples could be used to distinguish a poor survival subgroup. As shown in Table 2.4, we found that age at surgery was significantly associated with overall survival, when treated as a continuous variable (Hazard Ratio, HR = 1.42 per 10 years). The patients 50 years of age or more had a higher risk than younger patients (HR = 1.34), although the association was not statistically significant. Gender did not have a significant association with overall survival, though males had a slightly higher risk than females (HR = 1.27). As might be expected, the two clinical parameters that describe outcome, Cancer Status (either NED: No evidence of disease or with tumor) and Metastasis Status (Yes or No) were both significantly linked to poor survival. Patients with known metastases had a significantly higher risk than those without (HR = 4.86, p-value = 0.0008). Patients with tumor had a significantly higher risk than those with no evidence of disease (NED) (HR = 11.27, p-value = 0.0013).
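The "per 10 years" hazard ratio for age can be unpacked with a short worked calculation. The sketch below assumes the standard Cox proportional hazards form with age entered as a single linear term; the text implies this by reporting one HR per 10 years but does not write the model out, so the symbols h_0 and β are introduced here for illustration only.

```latex
\begin{align*}
h(t \mid \text{age}) &= h_0(t)\, e^{\beta \cdot \text{age}} \\
\mathrm{HR}_{10\,\text{yr}} &= e^{10\beta} = 1.42
  \;\Rightarrow\; \beta = \frac{\ln 1.42}{10} \approx 0.0351 \\
\mathrm{HR}_{1\,\text{yr}} &= e^{\beta} \approx 1.036, \qquad
\mathrm{HR}_{20\,\text{yr}} = e^{20\beta} = 1.42^{2} \approx 2.02
\end{align*}
```

Under this assumption, each additional year of age at surgery corresponds to roughly a 3.6% increase in hazard, and a 20-year age difference to roughly a doubling.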

Table 2.4. Characteristics of 68 ACC tumors

Characteristic             Level       Overall       HR*                95% C.I.     p-value
N                                      68
Age at Surgery, mean (SD)              50.1 (14.1)   1.42 (per 10 yrs)  1.07-1.87    0.0143
Age Group (%)              <50         32 (47.1)     1
                           50+         36 (52.9)     1.34               0.63-2.86    0.4513
Gender (%)                 Female      30 (44.1)     1
                           Male        38 (55.9)     1.27               0.58-2.74    0.5516
Metastasis Status (%)      No          36 (57.1)     1
                           Yes         27 (42.9)     4.86               1.93-12.21   0.0008
Cancer Status (%)          NED         30 (56.6)     1
                           With Tumor  23 (43.4)     11.27              2.57-49.51   0.0013
MYB + MYBL1 Expression                               3.56*              0.71-17.91   0.123
EN1 Expression                                       2.82*              0.66-12.01   0.1603

Key: HR = hazard ratio; C.I. = Confidence Interval; p-value = p-value of Wald test based on Cox-regression; NED = No Evidence of Disease
*HR = Hazard Ratio resulting from a one standard deviation increase

We used a multivariate analysis to test whether the gene expression groups provided additional survival information compared to the other variables. As shown in Table 2.5, gene expression cluster Group 1 was significantly associated with poor outcome, even after adjusting for the effects of age at surgery and metastasis status (HR = 4.76, p-value < 0.001). Likewise, age at surgery as a continuous variable was significantly associated with worse survival after adjusting for the effects of metastasis status and gene expression cluster (HR = 1.55 per 10 years, p-value = 0.014), and patients with metastases had a significantly higher risk than those without (HR = 3.18, p-value = 0.027) after adjusting for the effects of age at surgery and gene expression cluster. Thus, these three variables appear to be independently associated with poor survival. Metastasis Status is an outcome marker that is evaluated years after surgery and describes the success of the treatment. In contrast, the gene expression patterns were determined from surgical samples that were collected before the outcomes were known. Information about molecular differences could therefore be useful for predicting overall survival and for identifying patients that need different treatment strategies or who could benefit from more intensive follow-up care.

Table 2.5. Multivariate Cox-Regression Analysis

Characteristic                Level     HR      95% C.I.     p-value
N
Age at Surgery (per 10 yrs)             1.55*   1.10-2.20    0.014
Metastasis Status             No        1
                              Yes       3.18    1.14-8.87    0.027
Gene Expression Cluster       Group 2   1
                              Group 1   4.76    1.93-11.77   <0.001

Key: HR = hazard ratio; C.I. = Confidence Interval; p-value = p-value of Wald test based on Cox-regression
*Hazard Ratio associated with 10-year increase in age.

2.3.7 Gene expression signatures define good- and poor-outcome subgroups of ACC patients

We performed differential gene expression analysis using the groups of ACC tumors identified by hierarchical clustering, and identified over 2,000 genes that were significantly (at least 2-fold up or down, adjusted p-val < 0.05) different between the two survival groups. The heatmap in Figure 2.4E compares the expression of the 100 genes that were most significantly correlated to Group 1 (poor survival, left, color bar: red) or Group 2 (better survival, right, color bar: orange, blue or cyan). Genes up-regulated in the poor survival group included IPO9, ERBB3, SOX4, MYB and GABRP. Genes up-regulated in the better survival group included SETBP1, EGFR, TP63 and PIGR (a full-sized heatmap with all the genes labeled is provided as Supplementary Figure S.2.2). The differential expression of ERBB3 and EGFR in the poor and good survival groups, respectively, suggests a difference in signaling pathways linked to epithelial to mesenchymal transition. Although WNT signature genes were not significantly enriched in our pathway analyses, several genes in the WNT signaling pathway were differentially expressed, suggesting that a WNT specific signature may be important for the differences between the good- and poor-outcome groups. However, this characterization will require additional study.
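The selection rule quoted above (at least 2-fold change and adjusted p-value below 0.05) can be illustrated with a minimal Python sketch. The text does not say which multiple-testing adjustment was applied; the Benjamini-Hochberg procedure used here is an assumption, and the function names and the input layout are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: the source does not name its adjustment method; BH is used
    here only as a common choice for expression studies.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(results, min_abs_log2fc=1.0, alpha=0.05):
    """Keep genes that change at least 2-fold (|log2 FC| >= 1) with adjusted p < alpha.

    results: list of (gene, log2_fold_change, raw_p_value) tuples.
    """
    adjusted = bh_adjust([p for _, _, p in results])
    return [gene for (gene, lfc, _), q in zip(results, adjusted)
            if abs(lfc) >= min_abs_log2fc and q < alpha]
```

A log2 fold change of 1 corresponds to the 2-fold cutoff in either direction, which is why the filter tests the absolute value.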

2.3.8 A stem cell gene expression signature is associated with poor outcome

We used Gene Set Enrichment Analysis (Subramanian et al. 2005) to illuminate the differences between the poor and good outcome subgroups of ACC tumors. A straightforward comparison of the two groups, using the differentially expressed genes for all 68 tumor samples (Figure 2.4E), did not identify any significantly enriched pathways, perhaps because of the dramatic heterogeneity within the good outcome group, which contained samples expressing MYB, MYBL1 and neither oncogene. Therefore, since all of the Group 1 samples expressed MYB, we focused our analysis by comparing them only to the other samples expressing MYB. Principal Components Analysis (Figure 2.5A) cleanly separated the 13 samples in the poor outcome Group 1 (red, left) from the 36 MYB expressing samples in the better outcome Group 2 (blue, right). Using the approximately 1,000 genes that were significantly differentially expressed between the two groups of MYB expressing tumors, Gene Set Enrichment Analysis identified several gene lists that were enriched in the Group 2, good outcome samples (Table 2.6). However, only one gene list was significantly enriched in the poor outcome group (Table 2.7). As shown in Figure

Figure 2.5. Gene Set Enrichment Analysis of MYB Samples (A) Principal Components Analysis of the MYB expressing samples. The colors indicate the samples in Group 1 (red, left) or Group 2 (blue, right). Note that the poor survival Group 1 samples form their own group on the left side of the plot. (B) Enrichment Plot of gene set ‘Benporath_ES_with_H3K27me3’ identified using the genes differentially expressed between the Group 1 and Group 2 samples expressing MYB. (C) Heatmap of differentially expressed genes from gene set ‘Benporath_ES_with_H3K27me3’, identified through Gene Set Enrichment Analysis.

Table 2.6. Gene Set Enrichment Analysis Results – Group 2
ENRICHED IN GROUP 2 (GOOD OUTCOME)

GENESET NAME                                             SIZE  NOM p-val    FDR q-val
LIM_MAMMARY_STEM_CELL_UP                                 41    0            0
LIU_PROSTATE_CANCER_DN                                   42    0            6.08E-04
SENESE_HDAC1_AND_HDAC2_TARGETS_DN                        18    0            0.008256009
WANG_SMARCE1_TARGETS_UP                                  24    0            0.011867385
ONDER_CDH1_TARGETS_2_DN                                  32    0            0.013505637
MCBRYAN_PUBERTAL_TGFB1_TARGETS_UP                        15    0            0.017456258
FORTSCHEGGER_PHF8_TARGETS_DN                             18    0            0.022528453
ACEVEDO_FGFR1_TARGETS_IN_PROSTATE_CANCER_MODEL_DN        23    0.001287001  0.022527734
TURASHVILI_BREAST_DUCTAL_CARCINOMA_VS_DUCTAL_NORMAL_DN   29    0.001305483  0.040243994
CHANDRAN_METASTASIS_DN                                   15    0.001461988  0.02578634
KOINUMA_TARGETS_OF_SMAD2_OR_SMAD3                        33    0.002597403  0.05427582
KINSEY_TARGETS_OF_EWSR1_FLII_FUSION_DN                   22    0.004166667  0.052254837
PASINI_SUZ12_TARGETS_DN                                  17    0.004279601  0.034044717
MARTINEZ_RB1_AND_TP53_TARGETS_UP                         15    0.005763689  0.07122304
BRUINS_UVC_RESPONSE_LATE                                 27    0.006648936  0.06798041
JAEGER_METASTASIS_DN                                     22    0.008310249  0.06001583
CHICAS_RB1_TARGETS_CONFLUENT                             33    0.008782936  0.07140977

Table 2.7. Gene Set Enrichment Analysis Results – Group 1
ENRICHED IN GROUP 1 (POOR OUTCOME)

GENESET NAME                                             SIZE  NOM p-val    FDR q-val
BENPORATH_ES_WITH_H3K27ME3                               22    0.003861004  0.10307397

2.5B, the genelist “BENPORATH_ES_WITH_H3K27ME3” was significantly (pval < 0.004) enriched in Group 1. This genelist is described as “genes possessing the trimethylated H3K27 (H3K27me3) mark in their promoters in human embryonic stem cells, as identified by ChIP on chip” (Ben-Porath et al. 2008). This suggests that the poor outcome tumors display features related to ES cells, similar to high-grade estrogen receptor-negative, basal-like breast tumors, which also have poor clinical outcome (Ben-Porath et al. 2008). The heatmap in Figure 2.5C summarizes the expression of the genes in this list in the Group 1 (left, red) and Group 2 (blue, right) tumors. The poor outcome samples are distinguished by their relatively high expression of SOX8, MYB and TGFA and low expression of SYNE1, TBX3 and PTPRT. These results suggest that a gene expression biomarker could be developed to identify patients in the high-risk, poor-outcome group so they could be stratified for more intensive follow up to improve survival.

2.4 DISCUSSION

ACC is representative of a large class of relatively rare tumors that are difficult to study because of limited samples, lack of validated cell lines and undeveloped tumor models. In the case of ACC, the disease course can be slow, often resulting in the development of metastases 5 or more years after diagnosis and surgery (Hunt 2011). This necessitates the study of relatively old samples so that molecular information can be correlated to clinical outcomes. To address this issue, we developed optimized RNA-seq approaches (Brayer et al. 2016; Brown et al. 2017) so that the gene expression patterns in archived, FFPE samples up to 25 years old can be analyzed in detail. By applying these methods to one of the largest cohorts of ACC samples ever studied, we uncovered several important subsets of ACC patients and illuminated new molecular details about the oncogenic drivers that could be targeted by new types of therapeutic approaches.

We applied a novel use of a statistical peak-calling algorithm to identify the ACC samples that expressed MYB, MYBL1 or neither oncogene. In addition to identifying three subgroups of ACC tumors, this approach showed that no ACC tumors in our cohort expressed both MYB and MYBL1, which is consistent with our model that the two oncogenes are interchangeable drivers of tumorigenesis (Brayer et al. 2016). With the subgroups of ACC tumors cleanly separated, we were able to perform differential gene expression analysis to identify gene expression signatures characteristic of each group. The ~80% of tumors that expressed either MYB or MYBL1 displayed dramatically different gene expression profiles than the tumors that expressed neither of the oncogenes (Figure 2.2). Of particular note were the EN1 (engrailed) and SOX4 genes that were highly correlated to the expression of MYB/MYBL1. We investigated these two genes that encode transcription factors, and found that the promoters of both genes have predicted Myb Response Elements and were activated by ectopically expressed c-Myb or A-Myb proteins in transfection-reporter gene assays. Coupled with the published ChIP-seq data showing that the promoters are occupied by Myb proteins in ACC tumors (Drier et al. 2016), we conclude that both EN1 and SOX4 are likely to be direct target genes activated by the MYB/MYBL1 oncogene products in ACC tumors that express one of the oncogenes.

The other group of ACC tumors express neither MYB nor MYBL1 and also do not express EN1 or SOX4. Instead, they express oncogenic transcription factors KLF4, FOXO1, JUNB and FOSB and the important developmental regulator VGLL3. There is currently no other supporting evidence indicating that the products of these genes act as drivers of tumorigenesis in this subgroup of ACC tumors, but they seem like excellent candidates for further study and perhaps the development of animal models to test their activities.

Because of the heterogeneity we observed in the gene expression patterns in ACC tumors, we applied unsupervised hierarchical clustering to the RNA-seq results from our cohort of 68 ACC samples and unexpectedly identified a subgroup of tumors with significantly worse overall survival (Figure 2.4). The 68 patients that we studied included 25 who failed to survive 10 years, and more than half of those were in the poor survival subgroup identified by hierarchical clustering. The poor survival group over-expressed ERBB3 and under-expressed EGFR, suggesting that poor survival may be linked to epithelial-to-mesenchymal transition or to myoepithelial vs epithelial phenotype. The poor survival samples also over-expressed CTNNB1 and SOX4 and under-expressed PIK3R1 and TP63, all of which have been linked to tumorigenesis in other cancers. We tested 20 different markers that had been reported by others to be able to distinguish poor survival ACC tumors, but only three, MYB, PTK2 (Fak) and SNAI2 (Slug), showed significant differences in our cohort (Table 2.3). This probably reflects the differences in assays used (RNA levels in our assays compared to immunohistochemistry in most of the publications) but should raise a warning flag to others who wish to compare published results with studies using different technologies.

Perhaps the most important finding from these studies is the use of RNA-seq and gene expression patterns to identify a high-risk, poor-survival subgroup of ACC patients that were previously unidentified. These patients are the ones that could benefit most from increased surveillance and the development of new types of therapeutic strategies, such as targeted therapies that inactivate the MYB oncogene (Uttarkar, Dassé, et al. 2016; Uttarkar, Piontek, et al. 2016). The development of improved biomarkers to identify the highest-risk patients could lead to important improvements in the treatment of ACC tumors.

2.5 METHODS

2.5.1 Human Salivary Gland ACC FFPE samples

De-identified salivary gland adenoid cystic carcinoma FFPE tumor samples were obtained from the Salivary Gland Tumor Biorepository (MD Anderson Cancer Center, Houston, TX; Table 2.1). A search of MD Anderson Head and Neck tumor bank for salivary gland adenoid cystic carcinoma led to the identification of 100 patients with primary section at our institution. Hematoxylin and eosin stained slides of all tumors were retrieved and were subjected to an independent, blinded review by two specialized head and neck pathologists. The phenotypic assessment of ACC was strictly made on the histopathologic finding of tubular and cribriform patterns with dual cellular formation and light luminal polysaccharide secretion with and without solid component. Tumors with solid form lacking tubular/cribriform foci were not included. The review confirmed the diagnosis of ACC in all tumors. Due to the long disease course for ACC tumors, samples were chosen that had at least 5 years of follow-up. All samples were collected with informed consent of the donors, and studies were conducted in accordance with the principles of the Declaration of Helsinki. All studies were performed with Institutional Review Board-approved protocols.

2.5.2 RNA Isolation and Sequencing

Total RNA was isolated from one or two 5-micron slide-mounted FFPE sections using the RNeasy FFPE kit (Qiagen). cDNA synthesis and library preparation were performed using the SMARTer Universal Low Input RNA Kit for Sequencing (Clontech) and the Ion Plus Fragment Library Kit (Life Technologies) as previously described (Brayer et al. 2016). Sequencing was performed using the Ion Proton and Ion S5/XL systems (Life Technologies) in the Analytical and Translational Genomics Shared Resource at the University of New Mexico Comprehensive Cancer Center. RNA sequencing data is available for download from the NCBI BioProject database using study accession number PRJNA287156.

2.5.3 Data Analysis

Low quality and non-human RNA-seq reads were identified and removed from the analysis pipeline using the Kraken suite of quality control tools (Davis et al. 2013; Wood and Salzberg 2014). High-quality, trimmed, human RNA-seq reads were aligned to the human genome (GRCh37; hg19) using TMAP (v5.0.7) and gene counts were calculated using HT-Seq as previously described (Brayer et al. 2016). Poor quality samples, defined as those samples which had fewer than 10% of the median number of reads of all samples, were removed from further analyses. Several additional samples were removed based on the quality control measures described in the text. Peak finding to identify samples that expressed MYB or MYBL1 was performed using findPeaks from the HOMER (v4.9) package (Heinz et al. 2010), with settings of -region, -size 1000 and -minDist 10000.
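The read-depth filter described above (samples with fewer than 10% of the median number of reads are dropped) can be sketched in a few lines of Python. This is an illustration only, not the pipeline used in the study; the function name and the dictionary layout are hypothetical.

```python
from statistics import median


def drop_low_depth_samples(read_counts, fraction=0.10):
    """Remove samples whose read total is below `fraction` of the cohort median.

    read_counts: dict mapping a sample identifier to its total read count.
    Returns a new dict containing only the samples that pass the filter.
    """
    cutoff = fraction * median(read_counts.values())
    return {sample: n for sample, n in read_counts.items() if n >= cutoff}
```

Using the median rather than the mean keeps the cutoff stable when a few samples have unusually deep or shallow coverage.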

2.5.4 Unsupervised hierarchical clustering

Analyses were limited to genes that were highly expressed above a threshold level in a number of samples (e.g. 250 reads in at least 10 samples). Hierarchical clustering was performed on an expression matrix of 882 highly expressed genes in 68 ACC tumors with the Euclidean distance as the dissimilarity measure and the default “complete” clustering method from the hclust command in the package stats, part of statistical software R/Bioconductor (Gentleman et al. 2004). Genes that correlated with molecular subgroup were discovered using the samr R package (Jun Li et al. 2012; J. Li and Tibshirani 2013) to identify genes positively and negatively correlated with the first and second dimensions of a PCA plot describing these data (Tusher, Tibshirani, and Chu 2001).
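The expression filter and the complete-linkage clustering described in this section can be illustrated with a small pure-Python sketch. The study used the hclust command in R; the code below is not that implementation but a simplified stand-in that merges clusters until a requested number remain, rather than returning a full dendrogram. The function names, the data layout, and the choice to treat the threshold as inclusive are assumptions made for illustration.

```python
import math


def filter_genes(counts, min_reads=250, min_samples=10):
    """Keep genes with at least `min_reads` reads in at least `min_samples` samples.

    counts: dict mapping gene name to a list of read counts, one per sample.
    """
    return {gene: values for gene, values in counts.items()
            if sum(1 for x in values if x >= min_reads) >= min_samples}


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(profiles, n_clusters=2):
    """Agglomerative clustering with Euclidean distance and complete linkage.

    profiles: dict mapping sample name to its expression vector.
    Returns a list of clusters, each a list of sample names.
    """
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Complete linkage tends to produce compact clusters because two groups are merged only when even their most dissimilar members are close.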

2.5.5 Statistical Analysis

The primary endpoint for the outcome was overall survival (OS), defined as time from the date of diagnosis to the date of death. Subjects who were lost to follow-up or alive within the follow-up period were censored at the date of the last contact. The OS was estimated using the Kaplan-Meier method, and the differences in OS were examined using the log-rank test. Univariate Cox regression was used to assess the associations of clinical characteristics or genomic features with survival outcome, and multivariate Cox regression was used to compare amongst these variables. All the statistical analyses were performed using statistical software R (Tusher, Tibshirani, and Chu 2001).
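The Kaplan-Meier estimate with right-censoring, as defined above, can be sketched as follows. The analyses in the study were run in R; this Python function is a minimal illustration of the product-limit estimator only, and its name and input layout are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times:  follow-up time for each subject (e.g. months from diagnosis).
    events: 1 if death was observed at that time, 0 if the subject was censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing this time; deaths are counted before
        # censored subjects leave the risk set.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Censored subjects contribute to the number at risk up to their last contact but never trigger a drop in the curve, which is how the estimator uses incomplete follow-up.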

2.5.6 Translocation Verification

For detection of putative fusions, samples with apparent MYB or MYBL1 translocations, evidenced by a lack of reads mapped to the 3' end of the gene, were examined for chimeric reads containing sequence matching the MYB or MYBL1 gene and another gene. Chimeric reads were detected using the Integrative Genomics Viewer (version 2.3.79) (Thorvaldsdóttir, Robinson, and Mesirov 2013) with "show soft clips" turned on, and then secondarily aligned to GRCh37/hg19 using BLAT at the UCSC genome browser (Karolchik, Hinrichs, and Kent 2007; Kent et al. 2002) to determine the translocation partner. Samples that had obvious truncations but for which no chimeric reads were identified were categorized as "unknown". Novel translocations were verified by RT-PCR amplification of RNA isolated from FFPE slices as previously described (Brayer et al. 2016). Gene-specific primers used to amplify cDNA and resulting Sanger sequences are provided in Supplementary Table S.2.1.

2.5.7 EN1 and SOX4 Promoter Fragments

Fragments of the human EN1 and SOX4 gene promoters were isolated by PCR amplification from genomic DNA and cloned into the pGL3-basic reporter backbone. A 345 bp fragment of the EN1 promoter was amplified (forward: 5’-GAGGAGGAGCTCGAGAACGTAAACTGTCGACGC, reverse: 5’-GAGGAGGAGAAGCTTAGAGAAATGCAGGATTATGGGTC) and cloned using XhoI and HindIII restriction sites included in the primers. Similarly, a 161 bp fragment of the SOX4 promoter was amplified (forward: 5’-GAGGAGGAGGCTAGCTGCAGCCAAGACTGTGAAAG, reverse: 5’-GAGGAGGAGCTCGAGAGGAGTTCCTCCAGTGCAGA) and cloned using XhoI and NheI restriction sites. Insert sequences were verified via conventional (Sanger) sequencing.
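As a quick consistency check on the cloning strategy, the primer sequences listed above can be scanned for the recognition sites of the three enzymes named in the text (XhoI, CTCGAG; HindIII, AAGCTT; NheI, GCTAGC). The Python sketch below is illustrative only; the dictionary keys such as EN1_forward are labels introduced here and do not appear in the source.

```python
# Recognition sequences of the restriction enzymes named in the text.
SITES = {"XhoI": "CTCGAG", "HindIII": "AAGCTT", "NheI": "GCTAGC"}

# Primer sequences copied from the Methods; the key names are hypothetical labels.
PRIMERS = {
    "EN1_forward": "GAGGAGGAGCTCGAGAACGTAAACTGTCGACGC",
    "EN1_reverse": "GAGGAGGAGAAGCTTAGAGAAATGCAGGATTATGGGTC",
    "SOX4_forward": "GAGGAGGAGGCTAGCTGCAGCCAAGACTGTGAAAG",
    "SOX4_reverse": "GAGGAGGAGCTCGAGAGGAGTTCCTCCAGTGCAGA",
}


def sites_in(primer):
    """Return the sorted names of the enzymes whose site occurs in the primer."""
    return sorted(name for name, site in SITES.items() if site in primer)


if __name__ == "__main__":
    for label, sequence in PRIMERS.items():
        print(label, sites_in(sequence))
```

All three recognition sequences are palindromic, so a plain substring search on the primer as written is sufficient and no reverse-complement step is needed.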


Underlined portions are untranslated regions. Predicted Myb binding motifs are in bold+underlined. Genomic coordinates (hg19) of the inserts are shown. The EN1 promoter fragment that we cloned contains a common polymorphism (rs3731613) compared to the reference hg19 sequence, indicated by a lower case ‘g’.

>SOX4-luc chr6: 21593929-21594089

GCTGCAGCCAAGACTGTGAAAGGATAAAGAGGCGCGAGGCGGAATTGGGGTCTGCTCTAA

GCTGCAGCAAGAG AAACTGTGTGT GAGGGGAAGAGGCCTGTTTCGCTGTCGGGTCTCTAG

TTCTTGCACGCTCTTTAAGAGTCTGCACTGGAGGAACTCCT

>EN1-luc chr2: 119605727-119606071

AACGTAAACGTGCGACGCTAGCTAGGCGCAGCGGGCCTTTCAGATTTTGCTATTTGTGAA

AAACAAATTGCGCCTCTGAAAGTAACCAACTCTAGGTCTATTTCACATCACCGACCTCCC

TGTCTCACTCCCCCTCCCTCCACTACACACACCCAAACCCACACCCACCCACAAACACAC

AAACCGGCAGTGACAACAACCACCCATCCTTCAATAACAGCAACCAGAGACAGAGGAGAA

AATAAAAAGCTGAGTTTCTTAGGCGTGGGGGTGCAAAACAGCCAGGCTCCTGCCTACTGC

CCCTGCTCCCGgAGCTCACAGACCCATAATCCTGCATTTCTCTAA
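The genomic coordinates given in the sequence headers can be checked against the stated fragment sizes (161 bp for SOX4 and 345 bp for EN1). The sketch below assumes the coordinates are 1-based and inclusive, as in the usual genome browser display; that convention is not stated in the source, and the function name is hypothetical.

```python
def region_length(region):
    """Length in bp of a region written as 'chrN: start-end'.

    Assumes 1-based, inclusive coordinates, so length = end - start + 1.
    """
    _chrom, span = region.replace(" ", "").split(":")
    start, end = (int(x) for x in span.split("-"))
    return end - start + 1


if __name__ == "__main__":
    print(region_length("chr6: 21593929-21594089"))    # SOX4-luc insert
    print(region_length("chr2: 119605727-119606071"))  # EN1-luc insert
```

Under the inclusive convention, both intervals match the fragment sizes reported for the cloned promoters.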

2.5.8 Transfections and reporter gene assays

MYB and MYBL1 fusion expression vectors, cloned into pCDNA3, were described previously (Brayer et al. 2016). Specifically, the c-Myb:NFIB fusion contains a cDNA fragment spanning exons 1-8s (O’Rourke and Ness 2008; Y. Zhou et al. 2011) of MYB (NM_001130173) encoding the first 313 aa of c-Myb fused to exons 10-11 of NFIB (NM_001282787), which adds 73 novel amino acid residues to the truncated Myb protein. The A-Myb:RAD51B fusion contains exons 1-9 of MYBL1 (NM_001080416) encoding the first 367 aa of A-Myb fused to an intronic region of RAD51B, which adds 28 novel amino acids to the truncated A-Myb protein. Reporter gene assays were performed in HEK293T cells in triplicate. Cells were seeded at approximately 4 to 6 × 10⁴ cells per well in a 24 well plate and allowed 24 hours growth before transient co-transfection with 50 ng of luciferase reporter plasmid (EN1-luc or SOX4-luc in pGL3-basic) and 5 ng of activator plasmid (MYB or MYBL1 fusions cloned into pCDNA3). Transfections were performed in duplicate using the TransIT-2020 transfection reagent (Mirus) according to the manufacturer’s instructions. Cells were harvested and luminescence was measured after 48 hours using the Luciferase Assay System (Promega). Background subtracted data were normalized to cells transfected with the reporter plasmid and no activator (empty pCDNA3). All experiments were performed at least in triplicate.
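The normalization described above (background subtraction followed by scaling to the empty-vector control) can be sketched as a short Python function. This is a minimal illustration under the assumption that a single background reading is subtracted from every well and that replicate wells are averaged before taking the ratio; the source does not specify these details, and the function name is hypothetical.

```python
from statistics import mean


def fold_activation(activator_wells, control_wells, background):
    """Fold activation of a reporter relative to the empty-vector control.

    activator_wells: raw luminescence readings with the activator plasmid.
    control_wells:   raw readings with the reporter plus empty pCDNA3.
    background:      luminescence reading to subtract from every well (assumed
                     to be a single value for the whole plate).
    """
    control = mean(x - background for x in control_wells)
    return mean(x - background for x in activator_wells) / control
```

With this definition the empty-vector control has a fold activation of 1 by construction, matching the normalization used for the bar graph in Figure 2.3B.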

2.6 ACKNOWLEDGMENTS

The authors acknowledge the outstanding technical support from Jennifer Woods, Maggie Cyphery, Jamie Padilla, Jason Byars and Gavin Pickett. Some experiments used the facilities or services provided by the Analytical and Translational Genomics Shared Resource, the Fluorescence Microscopy and Cell Imaging Shared Resource and the Human Tissue Repository and Cell Analysis Shared Resource, which are supported by the State of New Mexico and the UNM Comprehensive Cancer Center P30CA118100. We are also grateful for support from the Adenoid Cystic Carcinoma Research Foundation and the Salivary Gland Tumor Biorepository. We especially thank Dr. Diana Bell for help with the ACC tumor validation. RNA sequencing data is available for download from the NCBI BioProject database using study accession number PRJNA287156.

Funding: NIH grants 5R01CA170250, 5R01DE023222 and P30CA118100.

CHAPTER 3: AN ALTERNATIVE MYB PROMOTER ACTIVATED IN ADENOID CYSTIC CARCINOMA PRODUCES N-TERMINALLY TRUNCATED MYB PROTEINS WITH UNIQUE TRANSCRIPTIONAL ACTIVITIES

Candace A. Frerich1, Hailey N. Sedam1, Huining Kang2, Yoshitsugu Mitani3, Adel K. El-Naggar3 and Scott A. Ness1,4 *

1 Department of Internal Medicine, Division of Molecular Medicine, University of New Mexico Health Sciences Center, Albuquerque, NM

2 Department of Internal Medicine, Division of Epidemiology, University of New Mexico Health Sciences Center, Albuquerque, NM

3 Head and Neck Pathology, University of Texas MD Anderson Cancer Center

4 University of New Mexico Comprehensive Cancer Center, Albuquerque, NM

3.1 ABSTRACT

Adenoid cystic carcinoma (ACC) is an aggressive salivary gland tumor that tends to infiltrate surrounding nerves. Most ACC tumors overexpress the MYB gene, making it the predicted driver oncogene, but chromosomal translocations only occur in half those cases. We found by RNA-sequencing analyses and mapping the 5’-ends of MYB transcripts that ACC tumors preferentially utilized an alternative MYB promoter, a novel mechanism of MYB gene activation in ACC. Nearly all ACC tumors used the alternative promoter, but other tumor types did not. Alternative promoter use produces N-terminally truncated Myb proteins which activated cell type specific MYB promoters in reporter assays. N-terminally truncated Myb isoforms displayed unique transcriptional activities, regulating many genes differently than full-length Myb. This Myb isoform uniquely regulated genes associated with pro-tumorigenic neural migration signals and perineural invasion, a feature associated with poor prognosis in ACC tumors. Thus, a regulatory pathway unique to ACC activates the alternative MYB promoter, leading to production of a truncated Myb protein with altered transcriptional activities. This study has important implications for future studies targeting MYB gene activation in ACC.

3.2 STATEMENT OF SIGNIFICANCE

The majority of MYB-expressing ACC tumors preferentially use a rarely used alternative MYB promoter, producing Myb proteins with an N-terminal deletion

and significantly altered transcriptional specificity. These results implicate Myb in promoting perineural invasion and have important implications for developing targeted therapeutics aiming to disrupt MYB gene expression in ACC tumors.

3.3 INTRODUCTION

Adenoid cystic carcinoma (ACC) is an unpredictable and aggressive malignancy which most frequently occurs in the salivary gland. Estimates of local recurrence are as high as 100% and late occurring, distant metastases are reported in over 60% of patients (Jones et al. 1997; DeAngelis et al. 2011; Spiro

1997; van der Wal et al. 2002). ACC tumors have one of the highest incidences of perineural invasion (Gil et al. 2009), a condition in which tumor cells invade the surrounding nerves and which is associated with local tumor recurrence (Dantas et al.

2015). Recent studies have revealed previously unappreciated diversity amongst

ACC tumors, with gene expression analyses exposing a poor-outcome patient group with a median survival of little more than 2 years (Frerich et al. 2018).

Consequently, ACC patients face uncertain outcomes even after initial treatment due to local recurrence, distant metastases, lack of targeted therapeutics, and intrinsic tumor diversity. Hallmark chromosomal translocations activate the related MYB or MYBL1 genes, indicating that these are the essential drivers in

ACC tumors (Frerich et al. 2018; Brayer et al. 2016; Gao et al. 2014; Mitani et al.

2010; 2016). Many chromosomal translocations involve the NFIB gene, creating t(6;9) and t(8;9) fusions for the MYB and MYBL1 genes respectively, but less frequent translocations involving other genes occur as well (Frerich et al. 2018).

Together, the activated MYB and MYBL1 oncogenes appear to be responsible for oncogenesis in over 80% of ACC tumors (Frerich et al. 2018). Over 70% of ACC

tumors, including the most aggressive, poor-outcome tumors (Frerich et al.

2018), overexpress MYB and the encoded Myb transcription factor.

The Myb transcription factor regulates the expression of thousands of genes to coordinate proliferative and differentiated cell states in multiple normal cell types (Brayer et al. 2016; Quintana et al. 2011; Mucenski et al. 1991;

Matsumoto et al. 2016). Indeed, Myb is implicated in ensuring normal salivary gland development (Matsumoto et al. 2016), the tissue in which ACC tumors most often arise. Myb has also been implicated in numerous other malignancies

(George and Ness 2014; Y. Zhou and Ness 2011), where C-terminal truncations to the Myb protein are necessary to unleash its oncogenic and transforming ability (Hu et al. 1991; Ramsay, Ishii, and Gonda 1991; Lei et al. 2004; Fu and

Lipsick 1996). For instance, in pediatric leukemias enhanced alternative RNA splicing of MYB gene transcripts appears to produce truncated, activated isoforms of the Myb protein that may contribute to oncogenesis (Y. Zhou et al.

2011). Similarly, in many ACC tumors, chromosomal translocations break the

MYB gene producing C-terminally truncated, oncogenic isoforms of Myb proteins.

Chromosomal translocations are further implicated in activating expression of the

MYB gene by recruiting distant enhancers linked to its translocation partner NFIB to interact with the conventional MYB promoter, thus stimulating expression of the MYB gene (Drier et al. 2016). The recruited NFIB enhancers also appeared to be activated by Myb proteins themselves, creating an oncogenic feedback loop enforcing expression of the MYB oncogene in ACC tumors (Drier et al.

2016). Thus, chromosomal translocations and enhancer hijacking are thought to be the primary mechanism activating the MYB gene. However, the MYB gene is highly over-expressed in most ACC tumors, even those without evidence of chromosomal translocations, raising the possibility that there are additional unknown mechanisms by which these tumors activate expression of MYB

(Frerich et al. 2018).

Transcription of the MYB gene is tightly controlled and highly regulated throughout development in different tissues. The conventional promoter, upstream of exon 1, responds to a variety of stimuli (Lauder, Castellanos, and

Weston 2001; Drabsch et al. 2007; H. J. Hugo et al. 2013; Cesi et al. 2011). For example, in T-cells NF-κB binding of this MYB promoter following IL-2 stimulation allows proliferation and protects from apoptosis (Lauder, Castellanos, and

Weston 2001). In some tissues, MYB expression is also tightly regulated by a well-described transcriptional pause site in the first intron (J. Zhang et al. 2016;

Perkel, Simon, and Rao 2002; Yuan 2000; H. Hugo et al. 2006; Suhasini and Pilz

1999; Pereira et al. 2015). Specifically, RNA polymerase stalling must be relieved to allow MYB expression in breast cancer where binding of estrogen receptor releases attenuation and allows MYB expression (Drabsch et al. 2007). In normal proliferating erythroid cells this entire region, from the promoter through the length of the first intron, interacts with multiple distant enhancer elements forming a dynamic active chromatin hub (Stadhouders et al. 2014).

Additionally, an alternative MYB promoter immediately upstream of the second exon has been implicated in aberrant expression of MYB in some leukemia cell lines (Dassé et al. 2012; Jacobs, Gorse, and Westin 1994).

Genome-wide studies estimate that half of the expressed human genes have multiple promoters, with an average of 4 promoters per gene, and it is increasingly clear that alternative promoter use is an important regulatory mechanism to increase transcript and protein diversity (X. Wang et al. 2016).

Promoter selection is regulated in a developmental-stage, cell-type, and disease-specific manner. Widespread promoter switching is observed during normal cerebellar development, where ~20% of genes display differential promoter usage (P. Zhang et al. 2017), and ~80% of human transcription start sites are classified as cell-type specific in the FANTOM5 promoter atlas (Forrest et al.

2014). Aberrant alternative promoter activation was first implicated in oncogenesis at least 25 years ago (Marcu, Bossone, and Patel 1992), and evidence of its role in tumorigenesis has continued to increase (Davuluri et al.

2008; Northcott et al. 2014; Weischenfeldt et al. 2017). These studies indicate that the alternative MYB promoter may have an unappreciated importance in

MYB driven disease.

Transcriptional regulation of the MYB gene is tightly controlled in developing cells and silenced in mature, differentiated, non-proliferative cells.

Those normal regulatory mechanisms are hijacked or circumvented in tumors to allow overexpression of oncogenic Myb proteins. In ACC, unique tumor-specific interactions between a hijacked enhancer and the MYB gene promoter could provide a novel target for therapeutic intervention. However, MYB is also highly over-expressed in ACC tumors that do not have chromosomal translocations, and the mechanism of MYB activation in these tumors is unclear. Thus, we sought to investigate in detail regulation of the MYB gene in ACC tumors.

Surprisingly, we found that the majority of ACC tumors utilize a normally silent alternative promoter located in the first intron of the MYB gene. These results have important implications for devising possible strategies to disrupt Myb-driven oncogenesis that leads to ACC tumor formation.

3.4 RESULTS

3.4.1 ACC tumors express MYB from an alternative promoter

Many studies have established that the MYB gene is an important oncogene in ACC tumors. Most of these studies have focused on hallmark chromosomal translocations involving the MYB gene and their downstream consequences. In the course of our previous RNA-sequencing (RNA-seq) studies of ACC tumors (Frerich et al. 2018; Brayer et al. 2016) we observed that there are typically very few reads aligned to the first exon of the MYB gene, indicating an anomaly in its transcriptional regulation. However, little regarding the transcriptional regulation of the MYB gene in ACC tumors has been studied. In other tissues and cell types transcriptional regulation of the gene is tightly controlled through multiple mechanisms centered around the extreme 5’ end of the gene (see the gene track below the RNA-seq data; Figure 3.1A)(Lauder,

Castellanos, and Weston 2001; Drabsch et al. 2007; J. Zhang et al. 2016).

Transcription typically begins at the conventional promoter (TSS1) in the region directly upstream of exon 1 (Figure 3.1A). Downstream of TSS1, within intron 1, are a regulatory RNA polymerase II pause site (hairpin structure) and an infrequently used alternative promoter, designated here as TSS2 (Figure 3.1A) (Dassé et al. 2012; Jacobs, Gorse, and Westin 1994).

Figure 3.1: ACC tumors use an alternative MYB gene promoter. (A) RNA-seq from two ACC tumors is displayed in IGV. Aligned reads are displayed as gray peaks over the exons. Spliced reads are displayed as red arcs with the raw number of spliced reads above the arcs. In most cell types, MYB transcription begins at TSS1 upstream of exon 1. The first intron contains the RNA-polymerase attenuation site (hairpin structure) and an alternative promoter (TSS2, second bent arrow). There are few reads aligned to exon 1 (gray peaks above the gene track) in two ACC tumors (T73 and T9), and many more reads aligned to exon 2. (B) 5’RLM-RACE performed on a frozen ACC tumor sample (T73) revealed transcripts from both TSS1 (designated +1 nt, RefSeq NM_005375) and TSS2 (GenBank X52126). Multiple transcription start sites were observed for TSS2, at +4409 nt and +4569 nt, creating a 180 nt and 20 nt 5’UTR respectively.

Transcription then continues to exons 3 and 4 and so on. In light of our previous observations that there may be an anomaly in transcriptional regulation of the

MYB gene in ACC tumors, we visually inspected RNA-seq reads derived from two frozen ACC tumors (T73 & T9, Figure 3.1A). RNA-seq coverage of the first 4 exons of the MYB gene is shown using the Integrative Genomics Viewer (IGV) in Figure 3.1A. The aligned RNA-seq reads are displayed as gray peaks and the number of reads spanning a splice junction is indicated above the corresponding red arc (Figure 3.1A). We chose RNA-seq from frozen tumors to mitigate sequencing artifacts present in low-quality formalin-fixed paraffin-embedded (FFPE) samples. Again we found markedly fewer reads aligned to exon 1 compared to exon 2. Further, the number of reads in both samples that splice from exon 1 to exon 2 is drastically lower than the number that splice from exon 2 to exon 3 (the raw number of reads is indicated above arcs which are displayed proportionally, Figure 3.1A). If transcription in these tumors began at TSS1 and continued through the remainder of the gene, the number of reads aligned to exon 1 and exon 2 should be approximately equal. Alternatively, if transcription began at TSS1 and the RNA polymerase stalled at the regulatory hairpin structure within intron 1, a buildup of reads upstream of the hairpin followed by many fewer reads on exon 2 would be expected. However, we observed many more reads on exon 2 than exon 1, which is most consistent with transcription skipping TSS1 and instead beginning at TSS2 (Figure 3.1A, TSS2).
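The three expectations laid out above can be written as a simple decision rule. The sketch below is illustrative only: the function name `interpret_exon_reads`, the 2-fold threshold, and the use of exon 1 reads as a stand-in for reads accumulating upstream of the hairpin are assumptions made for this example, not part of the analysis performed here.

```python
def interpret_exon_reads(exon1_reads, exon2_reads, fold_threshold=2.0):
    """Label an exon 1 / exon 2 read pattern with the scenario it best fits.

    fold_threshold is a hypothetical cutoff for calling two read counts
    unequal; counts within that factor of each other are treated as
    'approximately equal'.
    """
    if exon1_reads == 0 and exon2_reads == 0:
        return "no coverage"
    if exon2_reads >= fold_threshold * exon1_reads:
        # Many more reads on exon 2 than exon 1: transcription skips TSS1.
        return "TSS2 initiation"
    if exon1_reads >= fold_threshold * exon2_reads:
        # Reads pile up upstream and drop off before exon 2.
        return "TSS1 initiation with stalling"
    # Roughly equal coverage of both exons.
    return "TSS1 initiation with read-through"


# Hypothetical read counts for the three scenarios.
labels = [
    interpret_exon_reads(100, 100),
    interpret_exon_reads(100, 10),
    interpret_exon_reads(5, 200),
]
```

A fixed threshold is a simplification; in practice exon length and mappability would also affect raw read counts.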

To confirm these observations were due to TSS2 use and not simply an artifact of sequencing depth and/or quality we mapped the 5’-ends of transcripts in ACC T73 using RNA-ligase mediated rapid amplification of cDNA ends

(5’RLM-RACE) followed by conventional ‘Sanger’ sequencing (Figure 3.1B).

5’RLM-RACE specifically discovers the extreme 5’ ends of completely processed mRNA transcripts and can be used to infer promoter location and transcription start sites. Using 5’RLM-RACE in ACC T73 we identified transcripts originating from the expected TSS1 (see the corresponding RNA-seq in Figure 3.1A), the promoter directly upstream of exon 1 (Figure 3.1B, MYB TSS1; RefSeq ID

NM_005375). Strikingly, this ACC tumor also expressed multiple transcripts that began in intron 1 directly upstream of exon 2 (Figure 3.1B, MYB TSS2, GenBank

X52126, Supplementary Figure S.3.1 & Supplementary Table S.3.1). These transcripts are unlikely to be read-through from TSS1 since RNA splicing would remove this intronic region. As controls, this assay detected TSS1 but not TSS2 transcripts in CD34+ hematopoietic progenitors and a salivary gland epidermoid carcinoma cell line A253 (data not shown). Thus, we conclude these transcripts in ACC T73 must originate from TSS2, as predicted from the sequencing data in

Figure 3.1A. Detailed investigation of TSS2 transcripts in T73 revealed replicate transcripts from multiple potential transcription start sites that extended either

~180 nt or ~20 nt upstream of exon 2, into intron 1 (Figure 3.1B, +4409 nt and

+4569 nt respectively). Only the ~20 nt TSS2 transcript has been previously described in the literature (Jacobs, Gorse, and Westin 1994). The ~180 nt TSS2

transcript appears to be novel. In mammalian promoters transcription often initiates at multiple, clustered start sites within a ~100 nt region (Carninci et al.

2006). Indeed, these data from an ACC tumor are consistent with multiple transcription start sites upstream of exon 2, one of which has not been previously described. (For the rest of this publication TSS1 and TSS2 will be used to refer to the entire promoter regions, making no distinctions between individual transcription start sites.)

Transcription initiating from TSS2 has been previously reported in leukemia cell lines, where it accounted for a substantial proportion of total MYB transcripts (Dassé et al. 2012; Jacobs, Gorse, and Westin 1994). However, TSS2 transcripts have not been described in human tumors or epithelial cells and only one transcript has been annotated (GenBank Accession X52126). Here, we provide the first evidence that TSS2 of the MYB gene is activated in an ACC tumor, which produced a mixture of MYB transcripts from two different promoters to express this important driver oncogene.

3.4.2 Myb transcription factors can activate cell-type specific MYB promoters

Although MYB TSS2 has been described previously (Dassé et al. 2012; Jacobs, Gorse, and Westin 1994), little is known about its regulation, function, or significance in human tumors. Thus, we utilized reporter assays to perform basic characterization of both MYB promoters. MYB TSS1 and TSS2 were cloned

Figure 3.2 Cell type specific MYB promoters are activated by Myb transcription factors. (A) MYB promoter reporter constructs corresponding to TSS1 (blue) and TSS2 (red) are illustrated. Transcription start sites are designated +1 nt and indicated with bent arrows. Predicted high quality (see methods) Myb binding sites are indicated as open circles. (B) MYB promoters are activated differently in different cell types. The reporters above were transiently transfected into three cell types in the absence of exogenous activator. Fold activation is calculated relative to TSS1. (C) Both MYB promoters are activated by full-length and truncated Myb proteins. Full-length Myb (MYB) and a truncated Myb protein predicted from ACC tumor T349, which encodes

upstream of the luciferase reporter gene (Figure 3.2A, Supplementary Figure S.3.2), then introduced into different cell lines where their basal activities were measured (Figure 3.2B). Both TSS1 and TSS2 functioned as bona fide promoters in these assays, but they showed cell-type-specific differences (Figure 3.2B). The two promoters displayed similar activities in A253 salivary gland epidermoid carcinoma cells, but TSS2 was more active than TSS1 in HEK293 human embryonic kidney cells, and the opposite was true in SW620 colorectal adenocarcinoma cells. In the ACC cell line MDACC-ACC-01(hTERT) both reporters were active (data not shown). These data confirm that the region upstream of exon 2 can act as a promoter in human cells and suggest that MYB promoter selection is regulated in a cell-type-specific manner.

Sequence motif analyses (Thomas-Chollier et al. 2011) revealed known and novel transcription factor binding sites in each of the MYB promoters

(Supplementary Figure S.3.2). Within TSS1 there were predicted binding sites for NF-κB, a known activator of MYB expression in T-cells (Lauder, Castellanos, and Weston 2001), while TSS2 had multiple binding sites for the Fox family of transcription factors, including FOXO3, a transcription factor differentially expressed (Brayer et al. 2016) and mutated (A. S. Ho et al. 2013) in ACC tumors.

These analyses also revealed multiple Myb binding site motifs in both promoter regions (Figure 3.2A, open circles; Supplementary Figure S.3.2) suggesting they could be auto-regulated by Myb proteins. Since gene expression changes elicited by Myb transcription factors are responsible for driving tumorigenesis in over

80% of ACC tumors (Frerich et al. 2018; Brayer et al. 2016) we investigated the interaction of Myb transcription factors with their own promoters. HEK293TN cells were transfected with each of the MYB promoter reporters along with concurrently expressed Myb proteins. Figure 3.2C displays the fold activation of the promoters by ectopically expressed Myb proteins relative to no Myb protein.

We found both MYB promoter regions were activated similarly (4 to 7-fold) by full-length Myb and a Myb-Nfib fusion protein (T349, Figure 3.2C)(Brayer et al.

2016). The two MYB promoters were not significantly activated by a mutated Myb transcription factor harboring a point mutation that disrupts DNA-binding activity

(Frampton et al. 1991)(data not shown). These data demonstrate that Myb transcription factors are capable of activating both TSS1 and TSS2 of the MYB gene.

3.4.3 MYB TSS2 activation is widespread amongst, yet unique to, ACC tumors

Once we established that a single ACC tumor used TSS2 and that this promoter is active in reporter assays, we asked if MYB TSS2 activation was a common feature of ACC tumors. To explore MYB TSS2 activation across ACC tumors we tabulated the RNA-seq reads that mapped to MYB exons in 55 previously sequenced FFPE ACC tumors (Frerich et al. 2018). Ovarian serous cystadenocarcinoma (OVCA) exon count data from the NCI Genomic Data Commons were used for comparison. Figure 3.3A illustrates MYB exon counts

Figure 3.3. MYB TSS2 activation is unique to ACC tumors. (A) Exon counts for the MYB gene (gene track below) were tabulated for ACC (red, n=55) and OVCA (blue, n=55) RNA-seq data. ACC tumors have a significantly lower use of exon 1, T-test *** p< 0.001 (B) ACC tumors have higher TSS2 use than other tumors. TSS2 use in ACC tumors compared to other tumor types. TSS2 use was calculated as a ratio of RNA-seq counts on exon 2 divided by those on exon 1 (exon2/exon1) for each tumor type. ACC RNA-seq data was previously published (7). RNA-seq data for remaining tumor types was obtained from the NCI Genomic Data Commons: Breast invasive carcinoma (BRCA), Prostate adenocarcinoma (PRAD), Head and neck squamous cell carcinoma (HNSC), Glioblastoma multiforme (GB), Acute Myeloid Leukemia (AML), Ovarian serous cystadenocarcinoma (OVCA). Sample sizes were equalized to 55 tumors in each dataset by random sampling. ACC tumors have a significantly different ratio than the other tumors (ANOVA, *** p=0.001). The mean exon ratio for each tumor type is tabulated below in parentheses.

(note the log scale) in ACC (red) and OVCA (blue) tumors for the first four exons of the MYB gene (diagrammed below the dot plot). In OVCA tumors the first four exons of MYB are transcribed at approximately equal levels, as would be expected when transcription begins at TSS1. However, ACC tumors have approximately 10-fold lower exon counts for exon 1 than exon 2, and a significantly different average number of reads on exon 1 than OVCA tumors (Figure 3.3A). In fact, in ACC tumors exon 1 was most often not transcribed at all: approximately 60% (34 of the 55 total ACC tumors) had zero reads on exon 1.

We conclude that the disproportionately low use of exon 1 in ACC tumors is due to TSS2 use described above, which is reflected in these RNA-seq data.

Next, we asked if the MYB TSS2 activation we observed in ACC tumors also occurred in other tumor types. We compared our ACC RNA-seq data to the following MYB expressing tumors from the NCI Genomic Data Commons: Breast invasive carcinoma (BRCA), Prostate adenocarcinoma (PRAD), Head and neck squamous cell carcinoma (HNSC), Glioblastoma multiforme (GB), Ovarian serous cystadenocarcinoma (OVCA), and Acute Myeloid Leukemia (AML). MYB

TSS2 use for each tumor type was quantified as the ratio of counts on exon 2 divided by those on exon 1 (exon2/exon1). An exon2/exon1 ratio close to 1 would be expected from TSS1 use, where an equal number of reads were aligned to exon 1 and exon 2. Conversely, a higher ratio indicates TSS2 use, due to exon 1 skipping and consequently more reads on exon 2. By this metric, TSS2 use was highest in ACC tumors, the majority of which had very large exon2/exon1 ratios,

that averaged 25.5 (Figure 3.3B, means tabulated in parentheses below). In contrast, all the other tumor types had dramatically lower average exon2/exon1 ratios; breast invasive carcinomas (BRCA) had the next highest at 8.4. Head and neck tumors (HNSC = 4.2) had a mean exon2/exon1 ratio more than 6-fold lower than the average of ACC tumors and 27-fold lower than the highest ACC tumor.

Interestingly, only 4 of the 55 total ACC tumors analyzed had an exon2/exon1 ratio below the average for the other head and neck tumors (HNSC = 4.2, Figure 3.3B). Finally, ACC tumors had a significantly different exon2/exon1 ratio than the other tumor types (ANOVA, F(6, 155.61) = 88.98, p < 2.2e-16), with a significantly larger exon2/exon1 ratio than each of the other tumor types individually (p < 2e-16; Figure 3.3B). We conclude that MYB TSS2 use is widespread amongst ACC tumors, but rare in other MYB-expressing tumors.
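The exon2/exon1 metric used in this section is straightforward to compute from per-sample exon counts. The following Python sketch is a hedged illustration, not the code used to generate Figure 3.3B: the function names, the optional `pseudocount` argument, and the example counts are hypothetical. In particular, the text does not state how samples with zero reads on exon 1 were handled, so the pseudocount is only one possible convention.

```python
import math


def exon_ratio(exon1_count, exon2_count, pseudocount=0.0):
    """Return the exon2/exon1 read-count ratio for one sample.

    A ratio near 1 is expected for TSS1 use; larger values indicate TSS2 use.
    pseudocount (an assumption of this sketch) is added to both counts to
    avoid division by zero when exon 1 has no reads.
    """
    numerator = exon2_count + pseudocount
    denominator = exon1_count + pseudocount
    if denominator == 0:
        return math.inf  # undefined ratio: reads on exon 2 but none on exon 1
    return numerator / denominator


def mean_exon_ratio(samples, pseudocount=0.0):
    """Average the exon2/exon1 ratio over (exon1_count, exon2_count) pairs."""
    ratios = [exon_ratio(e1, e2, pseudocount) for e1, e2 in samples]
    return sum(ratios) / len(ratios)


# Hypothetical counts for two samples: one TSS2-like, one TSS1-like.
cohort_mean = mean_exon_ratio([(10, 255), (100, 100)])
```

The same arithmetic underlies the fold comparison quoted above: the ACC mean of 25.5 divided by the HNSC mean of 4.2 is slightly more than 6.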

We utilized ACC tumor RNA-seq data to determine if the specific MYB

TSS2 activation we observed in ACC tumors could be due to chromosomal translocations involving the MYB gene. Chromosomal translocations have been implicated in activating expression of the MYB gene through recruitment of distant enhancer elements to MYB TSS1; however, enhancer interaction with

TSS2 was not addressed (Drier et al. 2016). We compared the exon2/exon1 ratio for ACC tumors with chromosomal translocations versus those without translocation and found approximately half of the tumors that used TSS2 did not have a chromosomal translocation, and there was no difference in the exon2/exon1 ratio in these two classes of ACC tumors (Supplementary Figure

S.3.3). While these data do not address whether NFIB enhancers can interact with MYB TSS2, we conclude that enhancers recruited by chromosomal translocations are not strictly responsible for activating MYB TSS2 in all ACC tumors.

3.4.4 MYB TSS2 transcripts give rise to N-terminally truncated proteins

Thus far, we have described alternative transcriptional regulation of the

MYB gene in ACC tumors via activation of TSS2. This promoter was utilized by a majority of ACC tumors but not other tumor types. In reporter assays TSS2 had cell-type specific activity and could be activated by Myb proteins. We next addressed the effects of TSS2 use on Myb proteins. Figure 3.4A illustrates the mRNA isoforms encoded by TSS1 (top) and TSS2 (bottom) with the corresponding amino acid sequence derived from each (middle). Translation of full-length Myb begins at the first start codon located in exon 1 (M1) and together exons 1 and 2 encode the first 47 amino acids (aa) of the full-length protein

(Figure 3.4A). However, when TSS2 is utilized (MYB TSS2 lower mRNA, Figure 3.4A), transcripts begin within intron 1 and do not include the usual start codon

(M1). Instead, translation is predicted to begin at the first in-frame start codon

(M21) located in exon 2, thereby skipping the first 20 aa of the Myb protein.
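The logic of this prediction, that translation falls back to the first in-frame start codon remaining in the transcript, can be illustrated with a toy example. The sketch below is an assumption-laden illustration: the function name `first_inframe_start` is hypothetical and the short sequence is invented for demonstration; it is not the MYB coding sequence.

```python
def first_inframe_start(cds, first_retained_nt):
    """Find the first in-frame ATG codon retained in a 5'-truncated transcript.

    cds is the full-length coding sequence beginning at the normal start
    codon. first_retained_nt is the 0-based index of the first coding
    nucleotide still present in the truncated transcript. The return value is
    the 1-based residue number of the first complete in-frame ATG at or after
    that position, or None if there is none.
    """
    # Round up to the next codon boundary so only whole codons are examined.
    first_codon = -(-first_retained_nt // 3)
    for codon_index in range(first_codon, len(cds) // 3):
        if cds[3 * codon_index:3 * codon_index + 3] == "ATG":
            return codon_index + 1
    return None


# Invented toy sequence: codon 1 and codon 5 are both ATG.
toy_cds = "ATGGCCCGAAGAATGCCCTAA"

full_length_start = first_inframe_start(toy_cds, 0)   # normal start codon retained
truncated_start = first_inframe_start(toy_cds, 5)     # transcript begins inside codon 2
```

In the toy sequence the truncated transcript initiates at residue 5 and therefore lacks the first 4 residues, in the same way that initiation at M21 omits the first 20 residues of Myb.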

Figure 3.4. MYB TSS2 mRNA encodes an N-terminally truncated Myb protein isoform (∆N Myb). (A) MYB transcripts beginning at TSS1 (top) include exons 1 and 2 which encode amino acids 1-47 of the Myb protein (amino acid sequence below mRNA transcript). MYB TSS2 transcripts begin at +4409 nt or +4569 nt downstream of TSS1 (designated +1), skipping the entire first exon. These transcripts do not include the first start codon (M1) and are instead predicted to begin translation at residue M21, skipping the first 20 amino acids of the full-length Myb protein. (B) The full-length Myb protein (amino acids 1-640) encodes conserved DNA-binding and regulatory regions. TSS2 transcripts are predicted to encode a 20 amino acid N-terminal deletion, producing the ∆N Myb isoform (amino acids 20-640). The oncogenic AMV v-Myb (amino acids 72-442) also has a 72 aa N-terminal truncation.

The resulting proteins, illustrated in Figure 3.4B, have several highly conserved domains comprising the DNA binding and regulatory regions of the protein. As described above, Myb proteins translated from TSS2 are expected to have a 20 aa truncation to the amino terminus, producing the ΔN Myb protein isoform (Figure 3.4B, ΔN Myb). Similarly, the oncogenic v-Myb protein harbors a 72 aa N-terminal truncation (Figure 3.4B, v-Myb). The skipped N-terminal residues are highly conserved (Supplementary Figure S.3.4) and constitutively phosphorylated by Casein kinase II at serine residues 11 and 12 (pS11, pS12) (Figure 3.4A & 3.4B, black lollipop)(Oelgeschlager et al. 1995; Cures et al. 2001; Luscher et al. 1990). The S11 phosphorylated site serves as the epitope for a popular Myb antibody, which we exploited to illustrate full-length Myb and ΔN Myb protein isoform expression in western blot analyses (Supplementary Figure S.3.5).

Engineered Myb proteins expressed from cDNA vectors in HEK293TN cells were submitted to Western blot analyses and probed with two Myb antibodies. An antiserum directed towards the DNA-binding domain (PB84, Myb DBD, (Dash, Orrico, and Ness 1996)), which is an essential domain present in all Myb isoforms, detects both full-length Myb and the ΔN Myb isoform. In contrast, the antibody directed towards the Myb pS11 residue (Myb pS11, ab45150) only detects full-length Myb, not ΔN Myb. (Unfortunately, similar experiments performed on frozen ACC tumor samples were inconclusive due to extensive Myb protein degradation/proteolysis, not shown.)

The ΔN Myb protein isoform has intact DNA-binding and regulatory domains, and is predicted to be a functional transcription factor. Similarly, the oncogenic v-Myb protein encoded by Avian Myeloblastosis Virus (AMV, Figure 3.4B) has an even larger 72 aa N-terminal truncation yet retains its DNA-binding ability and is a functioning oncogenic transcription factor. We confirmed that ΔN Myb was able to activate two known Myb regulated promoters (Frerich et al. 2018) in reporter assays (Supplementary Figure S.3.6). Thus, the first 20 conserved amino acids of c-Myb, including the phosphorylated residues, are not strictly necessary for the DNA binding and transactivation ability of Myb transcription factors (Oelgeschlager et al. 1995; 2001; Dini and Lipsick 1993). Still, the highly conserved nature of this region suggests it serves an important regulatory function, which remains enigmatic. We predict that TSS2 transcripts produce functionally active, N-terminally truncated Myb proteins and thus may represent an additional mechanism by which ACC tumors abnormally activate MYB gene expression.

3.4.5 ∆N Myb transcription factors have unique activity in cells

Using RNA-seq and 5’RLM-RACE we have described activation of MYB

TSS2 in ACC tumors, but the functional consequences of TSS2 use remained unclear. We hypothesized that N-terminal truncation may alter the specificity of

Myb transcription factors in a manner that provided an oncogenic advantage to

ACC tumor cells. To test the first part of this hypothesis we performed RNA-seq experiments on cells engineered to ectopically express either full-length Myb or

∆N Myb transcription factors. Unfortunately, the ACC cell line (MDACC-ACC-01(hTERT)) currently used expressed the related A-Myb transcription factor, which could have an unknown influence on Myb-stimulated gene expression; thus it was not appropriate for these experiments. Instead, SW620 colorectal adenocarcinoma cells were amenable to this experiment because they have very little endogenous Myb expression and both TSS1 and TSS2 were functional in our transfection assays (Figure 3.2B). To perform these RNA-seq experiments we infected cells with lentivirus particles that achieved up to ~95% transduction with moderate over-expression of Myb proteins by Western blot analyses (Supplementary Figure S.3.7). Note that endogenous Myb protein expression is not detected in empty vector treated cells. Protein and RNA were harvested from the cells 48 hours after viral transduction and submitted for protein analyses and next generation sequencing. Principal Components Analysis (PCA) of the gene expression signatures elicited by ectopically expressed Myb transcription factors showed clear separation of the three treatments (Figure 3.5A). Full-length Myb

(red) and ∆N Myb (cyan) were separated from empty vector control (EV, black) along the first component (PC1, horizontal axis), which explained almost 70% of the variation in the dataset. Even these initial analyses distinguished full-length

Myb (red) from ∆N Myb (cyan) along the second component (PC2, vertical axis), which accounted for 18% of the variation in the dataset (Figure 3.5A). The PCA

Figure 3.5. Myb and ∆N Myb transcription factors elicit both similar and different gene expression changes in cells. (A) Myb proteins were ectopically expressed in SW620 cells using lentiviral particles. Total RNA was harvested at 48 hours post infection and prepared for RNA-sequencing. PCA analyses distinguished empty vector control (black) from full-length Myb (red) and ∆N Myb (cyan). (B) MYB isoforms regulated some overlapping but also many unique genes. Significantly differentially expressed genes (2-fold change, BH corrected P-value <0.05) were discovered for both Myb isoforms versus empty vector control and displayed in the Venn diagram. (C) A summary heatmap displays gene expression changes elicited by Myb transcription factors. The sidebars above the heatmap indicate the following: genes differentially expressed in Myb expressing ACC tumors versus Myb negative ACC tumors (orange), genes discussed in the text (gray), genes significantly differentially expressed in Myb versus ∆N Myb (cyan), and genes differentially expressed by both Myb and ∆N Myb relative to empty vector control (black). (D) ∆N Myb significantly differently activates or silences multiple genes. Bar chart displays the log fold change relative to empty vector of the 80 genes significantly differentially expressed between Myb (red) versus ∆N Myb (cyan).

93 plot highlighted an exciting difference in the gene expression signatures elicited by the two Myb protein isoforms.

We performed differential gene expression analysis, comparing each of the Myb samples to empty vector control, and identified 409 genes that were up- or down-regulated at least 2-fold by full-length Myb (adjusted p-value < 0.05; red, Figure 3.5B), and 875 genes differentially regulated by ∆N Myb (cyan, Figure 3.5B). The total number of significantly differentially expressed genes discovered for each comparison is displayed in parentheses below the comparison label in the Venn diagram (Figure 3.5B). Only a small proportion, ~35% (312/875 genes) of ∆N Myb regulated genes, were also regulated by full-length Myb. In contrast, ∆N Myb caused an additional 563 (60% of the total) genes to be up- or down-regulated at least 2-fold. Here it is apparent that full-length Myb and ∆N Myb elicited gene expression changes in both overlapping and unique sets of genes (Figure 3.5B). A summary heatmap of the gene expression analyses performed above is included in Figure 3.5C (larger version Supplementary Figure S.3.8).

Genes commonly regulated by Myb and ∆N Myb are indicated by the black sidebar along the left of the heatmap, whereas genes regulated significantly differently between ∆N Myb and full-length Myb are indicated with cyan.

Additionally, genes discussed in this text are indicated in gray and genes associated with Myb expression in ACC tumors (Frerich et al. 2018) are marked with orange sidebars (Figure 3.5C). Some of the differentially expressed genes are associated with important cell functions like cell cycle regulation (CDK3, COPS2, HSF4), while others have known oncogenic functions (MALAT1, GPC2, LINC-PINT) or are associated with metastases in a variety of tumors (RAB40B, PRSS3, NME1). Again, it is apparent from the heatmap that full-length and ∆N Myb regulated some genes similarly, but also regulated many genes differently.
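The gene counts reported for the Venn diagram in Figure 3.5B can be cross-checked with simple set arithmetic. The sketch below is illustrative only: the function name `venn_summary` and its output keys are hypothetical, and the only values taken from the text are the three counts (409, 875, and 312 genes).

```python
def venn_summary(n_a, n_b, n_shared):
    """Summarize a two-set Venn diagram from the set sizes and their overlap."""
    if n_shared > min(n_a, n_b):
        raise ValueError("overlap cannot exceed either set size")
    only_a = n_a - n_shared
    only_b = n_b - n_shared
    union = only_a + only_b + n_shared
    return {
        "only_a": only_a,
        "only_b": only_b,
        "shared": n_shared,
        "union": union,
        "shared_frac_of_b": n_shared / n_b,
        "only_b_frac_of_union": only_b / union,
    }

# Counts from the text: 409 genes for full-length Myb (set A),
# 875 genes for deltaN Myb (set B), 312 genes shared by both.
summary = venn_summary(n_a=409, n_b=875, n_shared=312)
print("deltaN Myb only:", summary["only_b"])
print("shared fraction of deltaN Myb genes:", round(summary["shared_frac_of_b"], 3))
print("deltaN-only fraction of all genes:", round(summary["only_b_frac_of_union"], 3))
```

Expressing the unique and shared fractions against explicit denominators (one isoform's gene list versus the union of both lists) makes it clear which total each percentage refers to.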

Finally, we asked if ∆N Myb changed any single gene’s expression significantly differently than full-length Myb. We performed differential gene expression analyses with the same parameters used above but instead compared ∆N Myb and full-length Myb directly. The differences in activity between full-length and ∆N Myb are even more dramatic when these genes are presented as a bar chart, which plots the log fold change relative to empty vector control for both full-length Myb (red bars) and ∆N Myb (cyan bars, Figure 3.5D). Further investigation revealed that some genes are activated by both Myb isoforms but are induced more dramatically by one Myb isoform (e.g. at left; these genes have both cyan and gray tick marks in the heatmap; Figure 3.5C), whereas others are regulated in completely opposite directions by ∆N Myb and full-length Myb proteins (e.g. center). For example, in comparison to empty vector control the ESRP1 gene (genes are denoted with black dots in Figure 3.5D) is significantly activated ~3.1 fold by full-length Myb and ~6.4 fold by ∆N Myb. Thus, ∆N Myb significantly activated ESRP1 ~2.0 fold more than full-length Myb, which could be consistent with ∆N Myb being an “unleashed” version of Myb. However, this was not always the case: the NT5E gene was activated 195-fold by full-length Myb but only 50-fold by ∆N Myb (Figure 3.5D). We also observed genes that were regulated by one isoform but not the other, like the oncogenic lincRNA MALAT1, which is silenced by ∆N Myb but is unchanged by full-length Myb.

Finally, full-length Myb and ∆N Myb even regulated a few genes in a directly opposing manner; for instance, the ATP5F1D gene is silenced by full-length Myb but activated by ∆N Myb relative to empty vector (Figure 3.5D). These data suggest that in some situations ∆N Myb may have completely different transcriptional activity and specificity than its full-length counterpart.
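The isoform comparisons quoted above reduce to a ratio of two fold changes, each measured relative to empty vector control. The following is a minimal sketch of that arithmetic; the helper name `compare_isoforms` is hypothetical, and only the ESRP1 and NT5E fold changes are taken from the text.

```python
import math

def compare_isoforms(fc_full, fc_dn):
    """Compare two fold changes, each relative to empty vector control.

    Returns the deltaN/full-length ratio and its log2 value. A positive
    log2 ratio means deltaN Myb induced the gene more strongly than
    full-length Myb; a negative value means the opposite.
    """
    ratio = fc_dn / fc_full
    return ratio, math.log2(ratio)

# Fold changes quoted in the text: (full-length Myb, deltaN Myb).
quoted = {"ESRP1": (3.1, 6.4), "NT5E": (195.0, 50.0)}

for gene, (fc_full, fc_dn) in quoted.items():
    ratio, log2_ratio = compare_isoforms(fc_full, fc_dn)
    print(f"{gene}: deltaN/full-length = {ratio:.2f} (log2 = {log2_ratio:+.2f})")
```

Working on the log2 scale puts genes favored by either isoform on a symmetric axis, which is the same reason the bar chart in Figure 3.5D plots log fold change.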

3.4.6 ∆N Myb uniquely modulates gene sets implicated in neuronal cell migration

The analyses above identified individual genes that were differentially expressed between ∆N Myb and full-length Myb. However, we questioned whether ∆N Myb was able to coordinate the expression of multiple, functionally related genes to elicit unique biological responses in cells. To address this, the significantly up- and down-regulated genes were submitted to gene set over-representation analyses using the clusterProfiler R package to query the Molecular Signatures Database (Yu et al. 2012; Subramanian et al. 2005). The results are summarized in Figure 3.6A, where the left panel displays gene sets discovered in ∆N Myb treated cells and the right panel displays gene sets discovered in Myb treated cells. The color bar to the far right indicates gene sets discovered in both Myb and ∆N Myb treated cells in gray, gene sets unique to ∆N Myb in cyan, and those unique to Myb in red (Figure 3.6A). (Full enrichment results are provided in Supplementary Table S.3.3.) Approximately 35% of the discovered gene sets (165/423 sets) were enriched in both full-length Myb and ∆N Myb expressing cells (Figure 3.6B); this included many of the top-ranking gene sets (categories denoted with gray in the side bar, Figure 3.6A). The commonly regulated genes were significantly enriched in a previously published ACC gene set from the DisGeNET database (Supplementary Figure S.3.9) (Pinero et al. 2015). Further, full-length Myb and ∆N Myb both activated genes which are over-represented in a previously published c-Myb target gene list (LIU_CMYB_TARGETS_UP) (F. Liu et al. 2006), and silenced genes associated with SEMA3B expression (KOYAMA_SEMA3B_TARGETS_UP, Figure 3.6A) (Koyama et al. 2008). Thus, our enrichment results are consistent with previously published findings and our own findings that ∆N Myb and full-length Myb commonly regulated some genes.

Figure 3.6. ∆N Myb uniquely activates SEMA4D signaling, which is correlated with ACC patient survival. (A) Gene set enrichment analyses were performed using the significantly differentially expressed genes discovered in Figure 3.5. The top two gene sets significantly enriched for each category plus selected gene sets unique to each category are displayed. The color bar to the far right indicates the treatment in which the gene set was enriched, with enrichment in both Myb isoforms in gray, enrichment in only full-length Myb in red, and enrichment in only ∆N Myb in cyan. (B) Venn diagram of all the significantly enriched gene sets for each Myb isoform. (C) Gene network for the REACTOME_SEMA4D_IN_SEMAPHORIN_SIGNALING (R-HSA-400685) gene set. 25 of the 32 genes are displayed; genes not included are: CD72, CDC42, LOC642076, MYL8P, MYL12A, PTPRC, RAC2, RHOG, ROCKIP1. Fold change in ∆N Myb treated SW620 cells versus empty vector control was used to color the gene nodes. (D) Unsupervised hierarchical clustering using the 22 genes from the REACTOME_SEMA4D_IN_SEMAPHORIN_SIGNALING gene set that were expressed in ACC tumors. Two major groups were defined using the resulting dendrogram: 21 tumors were in Group 1 (orange), and 34 tumors were in Group 2 (blue). The remainder of the ACC tumors clustered into multiple smaller groups and are colored black. (E) There was a significant difference in ACC patient survival when tumors were grouped based on SEMA4D signaling. Tumors were assigned to Group 1 or Group 2, or excluded from these analyses, based on the dendrogram in panel D.
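Over-representation tests of the kind described in this section are commonly based on the hypergeometric distribution: given a background of genes, they ask how likely it is to see at least the observed overlap between a query list and an annotated gene set by chance. The sketch below illustrates that calculation only. It is not the clusterProfiler implementation, it omits multiple-testing correction, and the function name `ora_pvalue` and the example numbers are assumptions.

```python
from math import comb

def ora_pvalue(universe_size, set_size, n_query, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    universe_size: number of background genes
    set_size:      genes in the annotated gene set (within the background)
    n_query:       significantly regulated genes submitted to the test
    n_overlap:     query genes that fall in the gene set
    """
    total = comb(universe_size, n_query)
    upper = min(set_size, n_query)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical toy example: 10 background genes, a 3-gene set,
# 4 query genes, 2 of which are in the set.
toy_p = ora_pvalue(universe_size=10, set_size=3, n_query=4, n_overlap=2)
print(f"toy over-representation p-value: {toy_p:.4f}")
```

In a real analysis the background size, the query list, and the gene set membership all come from the data, and the resulting p-values are adjusted across every gene set tested.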

However, gene expression analyses also established that full-length Myb and ∆N Myb transcription factors regulated many genes differently (Figure 3.5B). Rather than taking the 80 significantly differentially expressed genes in isolation, we reasoned that the cumulative effects of those genes regulated differently plus those regulated in common could act in concert to effect widespread changes in cells. Further, performing enrichment analyses in this manner would be more representative of what happens in the cell. Thus, enrichment analyses were performed as before, using all the significantly up- and down-regulated genes when compared to empty vector control for each Myb isoform. We found that almost 70% of the enriched gene sets (239/423 sets) were unique to ∆N Myb treated cells (Figure 3.6B). A selected subset of these unique gene sets is included in Figure 3.6A (red and cyan side bar). Specifically, genes activated in cells that expressed ∆N Myb were significantly enriched in SEMA4D associated migratory cues (REACTOME_SEMA4D_INDUCED_CELL_MIGRATION_AND_GROWTH_CONE_COLLAPSE and REACTOME_SEMA4D_IN_SEMAPHORIN_SIGNALING (Garapati 2009b; 2009a)) (Figure 3.6A), while genes silenced in cells expressing ∆N Myb were significantly enriched in gene sets associated with an immature or stem cell phenotype (ZHANG_TLX_TARGETS_36HR_DN and BENPORATH_ES_1 (C. L. Zhang et al. 2008; Ben-Porath et al. 2008)). Full-length Myb (categories indicated in red) up-regulated genes enriched in RUNX1 gene sets (TONKS_TARGETS_OF_RUNX1_RUNX1T1_FUSION_HSC_UP (Tonks et al. 2007)) and silenced genes enriched in p53 downstream targets (PID_P53_DOWNSTREAM_PATHWAY (Schaefer et al. 2009)) (Figure 3.6A).

We found the unique enrichment of SEMA4D signaling in ∆N Myb expressing cells particularly interesting since it has long been associated with neural cell migration and is implicated in a number of tumors (Capparuccia and Tamagnone 2009). The annotated, validated REACTOME_SEMA4D_IN_SEMAPHORIN_SIGNALING gene set, summarized more simply as SEMA4D signaling hereafter, includes 32 genes, 6 of which were significantly differentially expressed in ∆N Myb treated SW620 cells. The SEMA4D signaling gene interaction network is displayed in Figure 3.6C; the gene nodes are colored according to the fold change in ∆N Myb treated SW620 cells versus empty vector control cells. Evidence suggests that interactions between the Sema4D signaling molecule and its receptor Plexin-B1 promoted perineural invasion via chemo-attractive interactions in multiple tumor types (Binmadi et al. 2012). This same pathway was studied in more mechanistic detail in breast carcinoma cells, where SEMA4D signaling via its receptor Plexin-B1 either activated or suppressed cell migration (Swiercz, Worzfeld, and Offermanns 2008). These contrasting effects were linked to intermediate signaling molecules: signaling through ErbB-2 stimulated migration, whereas signaling through Met suppressed migration. Indeed, in our RNA-seq experiments ∆N Myb up-regulated expression of SEMA4D, PLXNB1, and ERBB2 and down-regulated MET (Figure 3.6C). Thus, the unique gene expression changes elicited by ∆N Myb in SW620 cells are consistent with SEMA4D stimulated cell migration.

We conclude that ∆N Myb and Myb transcription factors, which only differ by 20 N-terminal amino acids, regulate many of the same core genes, as would be expected from the shared DNA-binding domain and C-terminal regulatory domains. Yet the 20 aa N-terminal deletion also imparted unique transcriptional activity, allowing ∆N Myb to activate and silence genes that full-length Myb did not. Further, enrichment analyses suggest that ∆N Myb uniquely activated genes involved in cell migration and perineural invasion.

3.4.7 ∆N Myb gene set identified a poor outcome subgroup of ACC tumors

Due to ACC tumors' known predilection for perineural invasion, and the enriched gene sets' potential involvement in this process, we performed unsupervised hierarchical clustering of primary ACC tumors based on their expression of SEMA4D signaling genes. The resulting dendrogram is displayed in Figure 3.6D. ACC tumors sorted into distinct groups, indicating the SEMA4D signaling gene set captured meaningful biological variation in these tumors. The majority of ACC tumors sorted into two large groups, with 21 tumors in Group 1 (orange) and 34 tumors in Group 2 (blue; Figure 3.6D). The remainder of the ACC tumors sorted into many small groups (colored in black to the left side of the dendrogram, Figure 3.6D) and were quite variable in their expression of SEMA4D signaling genes. Due to the small size of these groups (10 individuals in 6 groups total) they were excluded from further analyses. We then performed Kaplan-Meier analyses to evaluate the prognoses of the two main groups of ACC tumors (Figure 3.6E). The median survival for all the ACC patients in this dataset was 147 months with a 5-year survival rate of 72% (previously published (Frerich et al. 2018)). When the ACC tumors were divided according to the dendrogram in Figure 3.6D there was a significant difference in patient survival. Group 2 (blue) was on par with the average survival for ACC patients, with a 5-year survival rate above 80% (blue, Figure 3.6E). In contrast, Group 1 had significantly poorer survival (log-rank p-value = 3.9 × 10^-5), with a median survival of 61.7 months and a much lower 5-year survival of ~55% (orange, Figure 3.6E).
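The median and 5-year survival figures above come from Kaplan-Meier estimates. As a rough illustration of how such an estimate is built, the sketch below computes a survival curve from follow-up times and event indicators. It is a simplified stand-in for the R survival package used in this work, it does not perform the log-rank test, and the function names and toy data are hypothetical (they are not patient data from this study).

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times:  follow-up time per patient (for example, months)
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up

# Hypothetical toy cohort: six patients, two of them censored.
toy_times = [5, 8, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print("median survival:", median_survival(toy_curve))
```

Censored patients leave the risk set without lowering the survival estimate, which is why this approach suits cohorts with incomplete follow-up.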

Thus, we have used RNA-seq to demonstrate that ∆N Myb transcription factors had significant unique activities in cells when compared to full-length Myb. In these experiments ∆N Myb transcription factors uniquely elicited changes in the SEMA4D signaling pathway, which is implicated in cell migration and perineural invasion. Finally, the SEMA4D signaling pathway discovered in ∆N Myb treated SW620 cells is significantly associated with differential survival of ACC patients.

3.5 DISCUSSION

Here we provide the first evidence that ACC tumors used an alternative MYB promoter, which leads to expression of N-terminally truncated Myb proteins. We demonstrated that MYB promoters have cell-type specific activity and MYB TSS2 activation is widespread amongst but unique to ACC tumors, suggesting that the underlying cause of TSS2 activation must be due to something specific to ACC tumors. Chromosomal translocations are an important feature of ACC tumors and have many established roles in activating the MYB gene, one of which is to recruit enhancers downstream of the NFIB gene to interact with MYB TSS1, stimulating its expression (Drier et al. 2016). It is probable that MYB TSS2 interacts with hijacked enhancers in ACC tumors that have chromosomal translocations, although this remains to be tested. However, MYB TSS2 appeared to be activated in most (if not all) MYB expressing ACC tumors, even those without detectable chromosomal translocations. Moreover, we found no difference in TSS2 use between ACC tumors with translocations and those without translocations, indicating that TSS2 is activated in ACC tumors regardless of translocation status. Thus, MYB TSS2 activation may represent a mechanism of MYB activation independent of chromosomal translocation. Indeed, alternative promoter activation may explain how the MYB gene becomes activated in ACC tumors without chromosomal translocations, a critical missing link in ACC tumor biology.

The implications of promoter selection extend far beyond transcriptional regulation. Studies have shown alternative promoter use provides a mechanism to modulate protein expression and activity, where for instance, N-terminally truncated proteins produced from alternative promoters can have distinct functional activities. We hypothesized that ∆N Myb may have altered, even highly oncogenic, activity in ACC tumors and thus used RNA-seq to investigate the transcriptional activity of Myb and ∆N Myb protein isoforms. We found that these transcription factor isoforms, which have identical DNA-binding domains, regulated hundreds of genes differently, indicating ∆N Myb had vastly different activity in these cells, where it not only activated different primary targets but also different downstream pathways altogether. The N-terminal region of Myb is highly conserved, indicating an important function. Early studies implicated N-terminal truncation in oncogenesis: a mere 20 aa N-terminal truncation was sufficient to induce rapid-onset tumors when expressed in chickens (Jiang et al. 1997). While phosphorylation of the S11 and S12 residues increased the specificity of full-length Myb by destabilizing DNA-binding (Ramsay, Ishii, and Gonda 1991; Oelgeschlager et al. 1995; 2001), this effect was easily overcome by protein-to-protein interactions with co-factors that anchored Myb to DNA (Oelgeschlager et al. 1995). And it is clear N-terminally truncated Myb proteins are capable of binding and activating transcription of target genes. Thus, the mechanism responsible for the observed differences in Myb isoform activity is not clear.

From our RNA-seq analyses a picture emerged potentially implicating ∆N Myb in ACC tumor migration and perineural invasion. In both Myb and ∆N Myb expressing cells the largely chemo-repulsive, anti-tumorigenic SEMA3B associated migratory cues were down-regulated (Capparuccia and Tamagnone 2009). In addition, ∆N Myb expressing cells alone displayed activated SEMA4D chemo-attractive, pro-tumorigenic migratory cues. The combined influence of these modulated pathways could have a significant effect on the migratory phenotype of tumor cells. Most ACC tumors express SEMA4D, and its receptor PlexinB1 is up-regulated 5-fold in ACC tumors compared to normal salivary gland (Brayer et al. 2016). We further found that classifying ACC tumors according to SEMA4D signaling identified a significantly poorer outcome subgroup of tumors. SEMA4D signaling has been implicated in the migration and invasiveness of a variety of tumors. In epithelial cells SEMA4D triggered invasive growth (Giordano et al. 2002) and stimulated migration in breast cancer cells in concert with ErbB-2 signaling (Swiercz, Worzfeld, and Offermanns 2008). Overexpression of the SEMA4D receptor, PlexinB1, was correlated with invasiveness and metastasis in prostate tumors (Wong et al. 2007). Finally, SEMA4D signaling is implicated in perineural invasion, a hallmark of ACC tumors (Binmadi et al. 2012). Thus, our results potentially link three disjointed aspects of ACC tumors: expression of a previously unreported ∆N Myb isoform activated SEMA4D signaling, which is in turn implicated in perineural invasion and patient outcome. The results described herein could potentially be an important step towards improving treatment of this disease. In ACC tumors perineural invasion is linked to poor prognosis, local recurrence, and significant patient morbidity. To date there are no specific therapeutic interventions targeting perineural invasion in any tumor, and ∆N Myb may prove an attractive target in mitigating this aspect of the disease.

In total, we conclude that most ACC tumors utilized a rare alternative MYB promoter. Myb proteins derived from this promoter are functional transcription factors and are sure to contribute to ACC oncogenesis as such. Our RNA-seq analyses revealed that ∆N Myb can differently activate or silence hundreds of genes, indicating N-terminal truncation via MYB TSS2 activation qualitatively altered the specificity of Myb transcription factors. Further, ∆N Myb alone was able to silence anti-tumorigenic neuronal migratory signals while also stimulating pro-tumorigenic neuronal migratory cues. Finally, expression of these same pro-tumorigenic neuronal migratory cues in ACC tumors identified a significantly poorer outcome subgroup of ACC tumors. These results potentially implicate ∆N Myb in stimulating perineural invasion in ACC tumors, the mechanisms of which are still largely unknown (Bakst et al. 2019). It will be exciting to see future studies that fully elucidate the role of TSS2 in MYB gene activation and expression, the extent of its interaction with hijacked enhancers, and its full functional consequences in ACC tumors.

3.6 METHODS

3.6.1 Cell Culture and Luciferase assays

Human Embryonic Kidney 293TN Producer cells (HEK293TN; System Biosciences) and SW620 colorectal carcinoma cells (ATCC; CCL-227) were cultured in Dulbecco's Modified Eagle's Medium (DMEM; ATCC) supplemented with 5% (v/v) fetal bovine serum, 5% (v/v) newborn calf serum (Rocky Mountain Biologicals, Inc.), and 1% Antibiotic-Antimycotic. A-253 epidermoid carcinoma cells (ATCC; HTB-41) were cultured in McCoy’s 5A medium, supplemented with 5% (v/v) fetal bovine serum, 5% (v/v) newborn calf serum, and 1% Antibiotic-Antimycotic. All cells were cultured at 37 °C in 5% CO2. Media and supplements were purchased from Life Technologies unless otherwise indicated.

For reporter assays, cells were seeded in 24 well plates with approximately 4-6 × 10^4 cells per well. After 24 hours of growth, cells were transiently co-transfected with 50 ng of luciferase reporter plasmid plus 50 ng of activator plasmid (MYB cDNAs cloned into pcDNA3.0), or just reporter plasmid without activator. Transfections were performed in duplicate using the TransIT-2020 transfection reagent (Mirus) according to manufacturer instructions. Cells were harvested, and firefly luciferase activity was measured after 48 hours using the Luciferase Assay System (Promega). Background subtracted data were normalized as stated in the Results. Reporter gene data are an average of three independent biological replicates, with error bars representing the standard deviation of those replicates.
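The reporter summary described above (background subtraction, normalization, then mean and standard deviation over three biological replicates) can be sketched as follows. This is an illustration under stated assumptions: the function name `summarize_reporter`, the optional `reference_mean` normalizer, and all readings are hypothetical, and the actual normalization scheme is the one given in the Results.

```python
from statistics import mean, stdev

def summarize_reporter(raw_replicates, background, reference_mean=None):
    """Background-subtract replicate luciferase readings, optionally divide
    by a reference value, and return (mean, sample standard deviation)."""
    corrected = [r - background for r in raw_replicates]
    if reference_mean is not None:
        # Assumption: normalization is a simple division by a reference
        # signal, such as the reporter-only condition.
        corrected = [c / reference_mean for c in corrected]
    return mean(corrected), stdev(corrected)

# Hypothetical readings (arbitrary light units) from three biological replicates.
avg, sd = summarize_reporter([1200, 1500, 1350], background=150)
print(f"background-subtracted: {avg:.1f} +/- {sd:.1f}")

fold_avg, fold_sd = summarize_reporter(
    [1200, 1500, 1350], background=150, reference_mean=600
)
print(f"relative to reference: {fold_avg:.2f} +/- {fold_sd:.2f}")
```

The sample standard deviation (n - 1 denominator) is used here because three replicates are a small sample of the underlying biological variation.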

3.6.2 Tumor RNA-seq

ACC RNA-seq data processing and analysis was performed previously (Frerich et al. 2018); data was downloaded from the NCBI BioProject database using study accession number PRJNA287156. We obtained exon count data for breast invasive carcinoma (BRCA), prostate adenocarcinoma (PRAD), head and neck squamous cell carcinoma (HNSC), glioblastoma multiforme (GB), ovarian serous adenocarcinoma (OVCA), and acute myeloid leukemia (AML) from the NCI Genomic Data Commons. Analyses were limited to samples that reasonably expressed MYB and had greater than 3 reads aligned to exon 2; this resulted in 55 ACC tumors in the final dataset. A random sampling of the NCI Genomic Data Commons datasets was used to equalize all datasets to 55 tumors total. Exon counts were tabulated as the number of normalized RNA-seq reads that mapped to each MYB exon in all tumor types; TSS2 use was then calculated as the ratio of counts on exon 2 divided by those on exon 1 (exon2/exon1). RNA-seq data from two frozen ACC tumors (T9 and T73) have been deposited in the NCBI BioProject database (accession PRJNA573669).
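The TSS2 use metric defined above is a per-sample ratio of exon counts with a read-depth filter on exon 2. The sketch below shows one way that calculation could look; the function name `tss2_use`, the dictionary layout, and the sample values are hypothetical, and the handling of samples with no exon 1 reads is an assumption not described in the text.

```python
def tss2_use(exon_counts, min_exon2_reads=3):
    """Return the exon2/exon1 ratio for MYB, or None if the sample is filtered.

    exon_counts: dict mapping exon number -> normalized read count.
    """
    exon1 = exon_counts.get(1, 0)
    exon2 = exon_counts.get(2, 0)
    # The text keeps samples with "greater than 3 reads aligned to exon 2".
    if exon2 <= min_exon2_reads:
        return None
    if exon1 == 0:
        # Assumption: with no exon 1 reads the ratio is reported as infinite.
        return float("inf")
    return exon2 / exon1

# Hypothetical samples; these are not values from the datasets used here.
samples = {
    "tumor_A": {1: 40, 2: 100},
    "tumor_B": {1: 40, 2: 3},
    "tumor_C": {1: 0, 2: 10},
}
for name, counts in samples.items():
    print(name, tss2_use(counts))
```

A ratio above 1 indicates more reads on exon 2 than on exon 1, which is the signature expected when transcripts start downstream of exon 1.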

3.6.3 SW620 RNA-seq

SW620 cells were infected with concentrated lentiviral particles, so that 70-90% of cells were Green Fluorescent Protein (GFP) positive. RNA and total protein were harvested 48 hours post transduction and submitted to Western blotting and RNA-seq. RNA-seq experiments were performed as biological replicates. First, empty vector and Myb elicited gene expression was measured, then a second experiment including empty vector, Myb, and ∆N Myb was performed. Total RNA was extracted from cell pellets using the RNeasy Plus mini kit (Qiagen). Ribosomal RNA was removed with the RiboGone kit (Clontech) followed by cDNA synthesis using the SMARTer Universal Low Input RNA Kit for Sequencing (Clontech). Libraries were prepared using the Ion Plus Fragment Library Kit (Life Technologies) and sequenced using the Ion S5 system (Life Technologies) in the Analytical and Translational Genomics Shared Resource at the University of New Mexico Comprehensive Cancer Center. Resulting RNA-seq reads were aligned to the human genome (GRCh37; hg19) using TMAP (v5.2.25) and gene counts were calculated using HTSeq. Data were analyzed in R v3.5.1 using the edgeR (v3.24.0), DESeq (v1.22.1), ggplot2 (v3.1.0), RUVSeq (v1.16.0), limma (v3.38.2), mSIGDB (v6.1.1), survival (v3.6.1), stats (v3.6.1) and clusterProfiler (v3.10.1) packages (Yu et al. 2012; Robinson, McCarthy, and Smyth 2010; Anders and Huber 2010; Risso et al. 2014; Wickham 2019). RNA-seq data is available for download from NCBI BioProject using accession number PRJNA573669.
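The significance criteria applied to the RNA-seq results in this chapter (at least a 2-fold change and a Benjamini-Hochberg corrected p-value below 0.05) can be illustrated with a short sketch. This is not the edgeR or DESeq pipeline used here; it only shows the thresholding step, and the function names, gene labels, and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def significant_genes(results, min_abs_log2fc=1.0, alpha=0.05):
    """results: list of (gene, log2_fold_change, raw_pvalue) tuples.

    A log2 fold change of 1.0 corresponds to a 2-fold change.
    """
    adj = benjamini_hochberg([p for _, _, p in results])
    return [
        gene
        for (gene, lfc, _), q in zip(results, adj)
        if abs(lfc) >= min_abs_log2fc and q < alpha
    ]

# Hypothetical results table (placeholder gene labels and values).
toy_results = [
    ("GENE_A", 1.6, 0.01),
    ("GENE_B", 0.4, 0.04),
    ("GENE_C", -2.0, 0.03),
    ("GENE_D", 3.0, 0.005),
]
print(significant_genes(toy_results))
```

Filtering on both effect size and adjusted p-value keeps genes that change substantially and reproducibly, at the cost of excluding small but consistent changes.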

3.6.4 5’RLM-RACE

Total RNA was extracted from frozen ACC tumors using the RNeasy total RNA extraction kit (Qiagen) according to manufacturer specifications. 5’RLM-RACE was performed using the GeneRacer RLM-RACE kit (Invitrogen) according to manufacturer instructions. Briefly, total mRNA was dephosphorylated and decapped, then the GeneRacer oligo was ligated to the 5’ ends. Reverse transcription was performed using either the provided oligo-dT primer or a gene specific primer to exon 8. A nested PCR reaction was used to amplify products (Supplementary Table S.3.1). A mixture of PCR products was TOPO cloned (Invitrogen) and Sanger sequenced to verify insert sequence (Supplementary Table S.3.1).

3.6.5 Protein and Western blots

Cell pellets were lysed in RIPA buffer with protease inhibitor for 10 min on ice followed by sonication (Diagenode Bioruptor, high, 30 sec on/30 sec off, 5-10 min). Western blotting was performed using the WES automated system (ProteinSimple); all results were verified via traditional blotting methods. Cell lysates were diluted with 0.1x Sample Buffer (ProteinSimple) to a concentration of ~0.6 µg/µL. Protein separation and quantification was performed using the 12-230 kDa ladder according to manufacturer instructions. Antibodies used were as follows: rabbit anti-Myb (Rabbit antisera PB84, directed against amino acids 72-192 of Myb protein, 1:100, (Dash, Orrico, and Ness 1996)), rabbit anti-Myb pS11 (ab45150; Abcam; 1:200), anti-β-actin (1:100). Ready to use mouse (042-205) and rabbit (042-206) HRP-conjugated secondary antibodies were purchased from ProteinSimple. Myb antibodies were validated using Myb protein expressed from plasmid cDNA (positive control) and paired untransfected HEK293TN cell lysate (negative control), which does not express Myb.

3.6.6 Cloning

All MYB expression vectors were cloned into pcDNA3.0 as previously described (Brayer et al. 2016). Ectopically expressed Myb encoded the full-length protein; ∆N Myb has a 20 amino acid N-terminal truncation described in the text. T349 is a C-terminally truncated Myb protein predicted from an ACC tumor; it includes MYB exons 1-8 fused to NFIB exons 11-12 (Brayer et al. 2016). The uc022bdo UCSC transcript was used as the NFIB reference; exons 11-12 encode 73 amino acids. Reporter plasmids were cloned as follows: an 879 bp fragment of MYB TSS1 was amplified from genomic DNA using MYB TSS1 FW and RV primers (Supplementary Table S.3.1), and inserted into pGL3-basic with NheI and HindIII restriction sites. A 779 bp fragment of MYB TSS2 was amplified from genomic DNA using MYB TSS2 FW and RV primers (Supplementary Table S.3.1) and inserted into pGL3-basic with NheI and XhoI restriction sites. Reporter plasmid inserts were sequence verified via Sanger sequencing (Supplementary Table S.3.1).

3.6.7 Promoter motif analyses

Transcription factor binding motifs were discovered using the transcription factor affinity prediction (TRAP) set of web tools (Thomas-Chollier et al. 2011).

High quality motifs were defined as having a weight score above 2.5.

3.7 ACKNOWLEDGMENTS

The authors acknowledge the outstanding technical support from Jennifer Woods, Maggie Cyphery, Jamie Padilla, Brandon Painter, and Kathryn Brayer. Some experiments used the facilities or services provided by the Analytical and Translational Genomics Shared Resource and the Flow Cytometry Shared Resource, which are supported by the State of New Mexico and the UNM Comprehensive Cancer Center P30CA118100. We are also grateful for the support from the Adenoid Cystic Carcinoma Research Foundation and the Salivary Gland Tumor Biorepository at the University of Texas MD Anderson Cancer Center. ACC tumor RNA sequencing data was downloaded from the NCBI BioProject database using study accession number PRJNA287156. Expression data for other included tumor types was obtained from the NCI Genomic Data Commons.

Disclosure of Potential Conflicts of Interest: No potential conflicts of interest for any author.

Funding: NIH grants R01CA170250, R01DE023222 and P30CA118100

CHAPTER 4: CONCLUSIONS, SIGNIFICANCE, FUTURE DIRECTIONS

4.1 How does MYB TSS2 become activated in ACC tumors?

As described in the third chapter, ACC tumors have activated a rarely used alternative MYB promoter. We found that the protein produced from this promoter is truncated at the N-terminus (∆N Myb) and had unique functional effects in cells. When the same pathways were investigated in ACC tumors we found a significant difference in patient survival, suggesting that ∆N Myb contributes to ACC tumor progression. However, it is still unknown how MYB TSS2 becomes activated in these cells. MYB TSS2 was only activated in ACC tumors, not the other tumor types studied, suggesting that the underlying cause of its activation must be due to something unique to ACC tumors. We rejected our initial hypothesis that distant enhancers hijacked by chromosomal translocations are responsible for activating MYB TSS2 as it was not fully supported by these data. Specifically, if hijacked enhancers were responsible, only tumors with chromosomal translocations would be expected to use MYB TSS2; instead, many ACC tumors that did not appear to harbor C-terminal defects still used MYB TSS2, and tumors with and without chromosomal translocations had equally elevated MYB TSS2 use.

Reporter assays demonstrated that MYB promoters are activated in a cell-type specific manner (Figure 3.2), supporting the hypothesis that each cell type expresses different specific transcription factors to activate MYB promoters to different degrees. The MYB TSS1 promoter is extremely GC-rich, above 90% in some regions, whereas TSS2 is AT-rich. This is consistent with findings that genes commonly have a primary GC-rich promoter and one or more GC-poor alternative promoters, and that in embryonic stem cells GC-rich and GC-poor promoters were regulated differently (Mikkelsen et al. 2007). This study also found GC-rich promoters (like MYB TSS1) are active by default and must be actively silenced, whereas AT-rich alternative promoters (like MYB TSS2) are inactive by default but can be selectively activated by cell-type specific transcription factors. This is likely achieved through differential recruitment of the basal transcriptional machinery to different core-promoter types by cell-type specific factors (Northcott et al. 2014; Gross et al. 1998; Losick 1998). Specific evidence for cell-type specific regulation of the MYB promoters was described in normal myeloid cells and leukemia cells, where localization of the PBX2 transcription factor to TSS2 versus TSS1 had a role in which promoter was activated (Dassé et al. 2012). Indeed, our reporter assays demonstrated that MYB promoters are activated in a cell-type specific manner, most likely due to differential expression of transcription factors (Figure 3.2). However, the activity of our MYB promoter reporters did not reflect endogenous MYB promoter use in the same cell. For instance, HEK293T and HCT116 cells both activated the MYB TSS2 reporter much more than TSS1. Yet HEK293T cells do not express Myb protein at all and HCT116 cells express the endogenous gene from MYB TSS1.
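The GC-rich versus AT-rich distinction between the two promoters rests on base composition, which is straightforward to quantify along a sequence. The sketch below computes GC fraction overall and in sliding windows; the function names and the toy sequences are hypothetical and do not represent the MYB promoter sequences.

```python
def gc_fraction(seq):
    """Fraction of G/C bases among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    return gc / total if total else 0.0

def sliding_gc(seq, window, step=1):
    """GC fraction for each window of length `window` along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]

# Toy sequence only: a GC-rich stretch followed by an AT-rich stretch.
toy_seq = "GGCCAATT"
print(gc_fraction(toy_seq))
print(sliding_gc(toy_seq, window=4, step=2))
```

A windowed profile is what supports a statement such as "above 90% in some regions", since a single whole-promoter average would hide local extremes.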

Figure 4.1. Proposed model of MYB TSS2 activation in ACC tumors. In most ACC tumors TSS2 is activated (larger bent arrow), but it remains unknown how MYB TSS2 becomes activated. TSS1 and TSS2 are activated by cell-type specific transcription factors (randomly colored spheres), and can be activated by both truncated and full-length Myb proteins (red hexagon). However, the identity of these transcription factors in ACC tumors is unknown. Expression of the correct transcription factors alone is not enough to activate either endogenous MYB promoter, leading to my hypothesis that a combination of permissive chromatin structure plus cell-type specific transcription factors is likely required for TSS2 expression. However, ChIP-seq experiments show that the canonical H3K4me3 active promoter mark (cyan) is present at TSS1 but is dramatically reduced at TSS2 (Drier et al. 2016), indicating this mark is probably not responsible for activating TSS2. Thus, I propose that a different, unknown histone mark is present at MYB TSS2 in ACC tumors, allowing its activation.

These results indicated that expression of cell-type specific transcription factors alone is not sufficient to activate either endogenous, chromatin-embedded MYB promoter. Thus, I hypothesize that MYB TSS2 is activated by unique interactions between cell-type specific transcription factors, the general transcriptional machinery, and permissive local chromatin structure (Figure 4.1).

Published ChIP-seq studies of the chromatin landscape in ACC tumors revealed abundant H3K4me3 active promoter marks at MYB TSS1 but not at MYB TSS2 (cyan peaks, Figure 4.1) (Drier et al. 2016). The H3K4me3 marks primarily localized to the MYB TSS1 promoter region and the first intron. The dramatic gap at MYB TSS1 is consistent with the hallmark nucleosome-free region at transcription start sites, and is followed by high H3K4me3 levels throughout the first half of the first intron. However, H3K4me3 levels drop dramatically just before MYB TSS2 and there is no hallmark nucleosome-free region marking the transcription start site, which perhaps indicates that the H3K4me3 promoter mark is not responsible for providing the permissive chromatin structure underlying TSS2 activation. These findings in ACC tumors are consistent with findings in ES cells, where the primary GC-rich gene promoter (like MYB TSS1) was marked with activating H3K4me3 and/or silencing H3K27me3 but most AT-rich alternative promoters (like MYB TSS2) had neither of these marks (Mikkelsen et al. 2007).

However, it remains unclear what specific transcription factors and histone marks could be responsible for MYB TSS2 activation in ACC tumors. Defects in chromatin remodeling genes are noted in 35% of ACC tumors (A. S. Ho et al. 2013) and multiple genes significantly correlated with MYB TSS2 use in ACC tumors are involved in histone modification and/or chromatin remodeling. For example, the MORF4L2 (mortality factor 4 like 2) gene encodes a transcription factor and a subunit of the NuA4 histone acetyltransferase multi-subunit complex (Cai et al. 2003), which can acetylate H4K16, a mark associated with transcriptional activation (R. Zhang, Erler, and Langowski 2017). Similarly, the ATF2 (activating transcription factor 2) gene encodes a transcription factor that can acetylate histones H2B and H4 in vitro (Bruhat et al. 2007). Finally, the H3K4me1 mark has been demonstrated to localize at an active, intergenic alternative promoter in at least one situation (Skvortsova et al. 2016). Thus, I propose ChIP-seq to investigate whether any (or all) of these histone modifications have a role in activating MYB TSS2 in ACC tumors. Follow-up experiments with modified dCas9 could be used to directly modulate histone marks at the endogenous MYB TSS2 in a non-MYB-expressing cell line (like HEK293T) to determine whether the identified histone modifications are sufficient to allow MYB TSS2 expression.

The recently described CRISPR affinity purification in situ of regulatory elements (CAPTURE) protocol (X. Liu et al. 2017) would be an ideal method to identify cell-type specific transcription factors responsible for MYB TSS2 activation. The MYB TSS2 reporter is highly active in HEK293T cells, indicating they have the required transcription factors and co-factors to drive its expression, and making them an ideal cell line in which to perform these assays. Specifically, HEK293T cells would be engineered to express dCas9, the MYB TSS2 reporter described in chapter 3, and a specific guide RNA directed toward the reporter. We could then CAPTURE (X. Liu et al. 2017) the transcription factors and co-factors bound to the TSS2 reporter by immunoprecipitating dCas9 targeted to the reporter plasmid sequence (a slight modification of the original protocol). Simple Western blot analyses would detect CAPTURE of proteins identified as potential MYB TSS2 activators in previous bioinformatic analyses (like ATF2, MORF4L2, SNW1, MYB, FOXO3, and PBX2, with RNA Pol II as a positive control). If required, this method can also be paired with mass spectrometry analyses for high-throughput identification of transcriptional complex composition.

4.2 N-terminal and C-terminal Myb truncations are cooperative

Although C-terminal truncations resulting from chromosomal translocations were the first MYB defect observed in ACC tumors, studies that tracked ACC tumor progression surprisingly found that MYB-NFIB translocations were not always an early event in ACC tumor formation (Costa et al. 2014).

Western blot analyses of Myb protein expression in ACC tumors showed that tumors expected to express full-length c-Myb proteins had no detectable Myb protein at all (Mitani et al. 2016). Interestingly, the antibody used to detect Myb in this blot is unable to detect N-terminally truncated Myb proteins produced by TSS2 transcripts (Supplementary Figure S.3.5). Further, it appeared from our early RNA-seq analyses (Figure 2.1) that half of these ACC tumors expressed full-length MYB transcripts and did not appear to harbor C-terminal truncations.

These findings are incompatible with well-established MYB proto-oncogene biology, i.e., the Myb protein must be truncated to unleash its oncogenic activity, and the driver oncogene must be expressed early in tumor development. Thus, the oncogenic driver in a quarter of ACC tumors is surely Myb, yet the Myb isoform that appeared to be expressed is not oncogenic. At the time we concluded that these ACC tumors must harbor cryptic genetic defects that truncate the C-terminus but are difficult to detect with RNA-seq. However, our later discovery that ACC tumors use MYB TSS2, which leads to the expression of N-terminally truncated Myb proteins, provides an alternative explanation for all these observations. RNA-seq analyses revealed that almost all ACC tumors used TSS2; thus, those tumors that appeared to express full-length MYB transcripts are actually expected to harbor N-terminal truncations from MYB TSS2 use. I therefore hypothesize that N-terminal truncations via MYB TSS2 activation occur early to initiate ACC tumor development, and chromosomal translocations occur later as a second hit to the MYB gene. I predict that high-throughput CAP analysis of gene expression (CAGE-seq) to map the 5’ ends of all transcripts in ACC tumors would confirm this hypothesis and reveal further complexity in MYB gene regulation.

However, if N-terminal truncation is sufficient to initiate and drive ACC oncogenesis, then why do chromosomal translocations occur? I hypothesize that N-terminal plus C-terminal truncation creates a more oncogenic version of Myb compared to truncating either end alone. Thus, selection pressure would ensure that tumor cells with double Myb truncations would be maintained. Indeed, the prototypical oncogenic v-Myb protein is truncated at both termini (Figure 1.4).

Evidence suggests the N-terminus is involved in modulating both DNA-binding affinity and protein-to-protein interactions with co-factors (Oelgeschlager et al. 2001; 1995; Ramsay, Ishii, and Gonda 1991; Burk and Klempnauer 1999). We have provided evidence that N-terminally truncated Myb proteins have altered transcriptional specificity. Extensive studies have shown that C-terminal truncation releases Myb from self-inhibitory interactions (Dash, Orrico, and Ness 1996), affects protein-to-protein interactions (Ness 2003), increases the stability of the protein (Corradini et al. 2005; Kanei-Ishii 2004), and alters its transcriptional specificity (Ness 2003; F. Liu et al. 2006). Further, high-throughput gene expression analyses revealed that both N- and C-terminal truncation had a role in activating Myb proteins. Specifically, gene expression changes elicited by full-length c-Myb, singly truncated Myb isoforms, and finally doubly truncated v-Myb revealed that each alteration additively converted the elicited gene expression signature from c-Myb to v-Myb (F. Liu et al. 2006). Thus, historical evidence supports my hypothesis that double truncation of Myb may be more oncogenic in ACC tumors. RNA-seq experiments similar to those performed in chapter 3 would be the first steps in testing this hypothesis in ACC tumors. N-terminal truncation via MYB TSS2 activation may lead to slow-growing tumors with a locally invasive phenotype that do not metastasize to distant organs. Further truncation of the C-terminus via chromosomal translocation may then result in highly metastatic tumors with increased growth rate. Future studies will be essential to fully determine the role and extent of N-terminal and C-terminal truncation of Myb proteins in ACC tumors.

4.3 Why are truncated Myb transcription factors oncogenic?

Despite identification of the driver oncogene in the majority of ACC tumors, it remains unclear how Myb transcription factors initiate tumor development and drive tumor progression and metastatic disease. As a result, there has been frustratingly little progress towards controlling and eliminating this disease.

Discovery of both N-terminal and C-terminal truncations in ACC tumors is consistent with extensive past studies that demonstrated Myb transcription factors must be truncated to be oncogenic (Hu et al. 1991; Ramsay, Ishii, and Gonda 1991; Lei et al. 2004; Fu and Lipsick 1996). Accumulating evidence suggests that Myb truncation alters which genes it regulates, as proposed by the transcription factor code hypothesis summarized in the introduction (F. Liu et al. 2006; Ness 2003). Still, it remains unclear what transcription factors and co-factors cooperate with truncated Myb versus full-length Myb in ACC tumors and how those altered interactions lead to tumor formation.

Gene expression studies of primary ACC tumors have produced a wealth of information, but efforts to simplify, synthesize, and utilize these data have proved challenging. ACC tumors express ~2,000 genes differently from normal salivary gland (Brayer et al. 2016), and nearly the same number of genes were differentially expressed between Myb positive tumors and Myb negative ACC tumors (Frerich et al. 2018). ChIP-seq experiments in ACC tumors revealed that Myb proteins bound ~13,000 sites in the genome (Drier et al. 2016). Together these studies make it apparent that Myb transcription factors completely reprogram the gene expression signature of ACC tumors, so much so that they barely resemble their normal counterparts or other Myb negative ACC tumors.

Thus, it is becoming clear that Myb transcription factors are likely driving ACC oncogenesis not by activating a single oncogene or pathway, but by completely reprogramming normal cells: they coordinate the expression of hundreds (even thousands) of genes, which then modulate multiple downstream pathways. The combined effects of this extensive reprogramming result in oncogenic transformation. The major weakness of these high-throughput publications is that it remains unclear which of these thousands of genes are important in activating which pathways, and how they might interact to drive oncogenesis. In essence, it is still completely unknown how truncated Myb transcription factors are driving ACC tumor oncogenesis.

4.4 Myb in ACC tumor hallmarks: many unknowns

ACC tumors are typified by slow, indolent growth, late occurring metastases, perineural invasion, dedifferentiation, and increasing genome instability with tumor progression (Figure 4.2) (Dillon et al. 2016; Dantas et al. 2015). As the driver oncogene, Myb is likely to directly contribute to some or all of these characteristics, yet the mechanisms are not well understood. Uncontrolled proliferation is an essential hallmark of all tumors (Hanahan and Weinberg 2011).

ACC tumors specifically exhibit slow growth kinetics, which is likely the reason chemotherapeutic agents are largely ineffective in these tumors (Dillon et al. 2016). Both NOTCH and Wnt signaling pathways have been implicated in driving ACC tumor growth (Daa et al. 2004; Su et al. 2014). Myb is directly implicated in coordinating Wnt signaling to control the timing and degree of cell proliferation during normal salivary gland development (Matsumoto et al. 2016). Whether truncated Myb similarly drives tumor cell proliferation by modulating Wnt signaling has not been investigated in ACC tumors. NOTCH1 is mutated in a subset of ACC tumors (Stephens et al. 2013; A. S. Ho et al. 2013), and in vitro assays have implicated it in stimulating cell growth (Su et al. 2014; Panaccione et al. 2016). As yet, no link between Myb and NOTCH signaling in driving ACC tumor cell proliferation has been established. Thus, it appears that ACC tumor cell proliferation may be driven by both Myb dependent and Myb independent pathways. Further studies are required to fully elucidate the interplay between Myb, NOTCH, and Wnt signaling pathways and determine if these are the predominant pathways driving ACC tumor proliferation.

ACC tumors exhibit several highly invasive characteristics: frequent local recurrence, perineural invasion, and late occurring metastases (Jones et al. 1997; DeAngelis et al. 2011; Spiro 1997; Dantas et al. 2015; van der Wal et al. 2002). These characteristics are the main factors leading to patient morbidity and mortality; hence, they are a priority in improving patient treatment and outcome.

Specifically, perineural invasion often results in partial facial paralysis from tumor removal, local recurrence necessitates multiple invasive surgeries, and distant metastases nearly always lead to death (Spiro 1997). Local recurrence, perineural invasion, and metastases may be related manifestations of the same activated pathways. Alternatively, they could be three distinct, unrelated tumor characteristics. Again, the contribution of the Myb oncogene to driving these characteristics has not been definitively established.

Figure 4.2. The role of truncated Myb in ACC tumor hallmarks. ACC tumors are characterized by slow growth, multiple highly invasive phenotypes, dedifferentiation, and increasing genome instability (outer teal hexagons). As the oncogenic driver, truncated Myb proteins (inner red hexagon) likely have a role in many of these processes. However, only correlative links have been made for some of these characteristics and many remain completely unknown (indicated with ?). The identified pathways involved in each hallmark are indicated in the outer tan hexagons; the main pathways involved in ACC tumor hallmarks are NOTCH and Wnt signaling.

Our studies directly implicate Myb transcription factors in stimulating perineural invasion via the SEMA4D signaling pathway (Chapter 3). Several previous studies have speculated that mutation and aberrant expression of multiple neural associated genes are likely to have a role in perineural invasion (Drier et al. 2016; A. S. Ho et al. 2013; Panaccione et al. 2016; Binmadi et al. 2012). Perineural invasion could indirectly contribute to local tumor recurrence due to the difficulty of completely removing tumor cells that have invaded the nerves (Dillon et al. 2016). One study found local recurrence almost always occurred within the first 5 years of diagnosis (Mahrous 2010). Beyond this, the mechanisms driving local recurrence of these tumors have not been investigated.

Distant metastases are the leading cause of treatment failure in ACC patients, and average survival with metastatic disease as low as 19 months has been reported (Shingaki et al. 2014). ACC tumors most often metastasize to the lungs, but metastases have also been observed in the bone, liver, and brain (Shingaki et al. 2014; Andreasen et al. 2018). While there are many more studies of ACC metastasis, as a whole the mechanisms driving metastatic dissemination in ACC tumors are unknown (Allen S. Ho et al. 2019). Chromosomal translocations involving MYB are often maintained in distant metastases (Andreasen et al. 2018; Allen S. Ho et al. 2019), indicating Myb expression is still required at the metastatic site. Myb protein expression was also correlated with the expression of genes involved in epithelial to mesenchymal transition (EMT) and with increased lung metastases in in vivo mouse experiments (L.-H. Xu et al. 2019). Thus, multiple lines of evidence suggest Myb has a direct role in ACC tumor metastases. Similarly, Wnt signaling has been implicated in ACC tumor EMT (C.-X. Zhou and Gao 2006) and promoted cell invasion in vitro (R. Wang et al. 2015). Given the known intersection of Myb and Wnt signaling and their similar effects in these studies, it is likely that together they are a major pathway promoting distant metastases in ACC tumors. Growing evidence has also implicated NOTCH signaling in ACC tumor metastases, but as yet a direct link to Myb has not been investigated. NOTCH1 expression was elevated in metastases relative to matched primary tumor (Su et al. 2014), and aberrant NOTCH signaling promoted EMT in vitro (Z.-L. Zhao et al. 2015). A subset of ACC tumors harboring NOTCH mutations had a significantly increased rate of metastasis (Ferrarotto et al. 2017), and metastases had acquired significantly more mutations in the NOTCH1 gene relative to primary tumors (Allen S. Ho et al. 2019). Thus, the same NOTCH and Wnt signaling pathways implicated in ACC tumor proliferation are also implicated in metastases and may represent both Myb independent and dependent pathways.

Interestingly, the acquisition of NOTCH mutations in metastases raises the question of whether Myb oncoproteins alone are able to initiate distant metastases, or whether additional cooperating mutations are required. This is directly relevant to patient treatment: early Myb-driven metastases would respond well to therapeutic intervention targeting Myb early in tumor development. Conversely, an increasingly diverse range of acquired mutations driving metastasis later in tumor progression would require personalized treatment for each patient, but that treatment could occur later.

ACC tumor progression is often accompanied by dedifferentiation and increased mutation burden, where low-grade tumors have few genetic lesions but high-grade tumors acquire many more genetic lesions (Costa et al. 2014; Su et al. 2014). Further, highly transformed, dedifferentiated tumors are associated with accelerated tumor progression and poor prognosis. Little is known regarding the molecular mechanisms underlying dedifferentiation and genome instability.

Increasing genome instability is a characteristic of many tumors, including ACC tumors, in which low-grade tumors had an average of only 16 mutations but advanced tumors had acquired up to 36 mutations (Allen S. Ho et al. 2019). ACC tumors with activating NOTCH1 mutations were significantly more likely to have a dedifferentiated phenotype (Ferrarotto et al. 2017), and NOTCH mutations are often acquired as tumors progress (Allen S. Ho et al. 2019). Thus, increased mutational burden due to genome instability, especially in the NOTCH pathway, may lead to tumor progression and a dedifferentiated phenotype. In chapter 2 we described a poor outcome subgroup of ACC patients whose gene expression signature was linked to the differentiation state of the tumor (Frerich et al. 2018).

However, we did not observe an association with Myb expression, NOTCH mutations, or activated NOTCH signaling, which may indicate that additional unknown pathways are involved in this process. Nevertheless, dedifferentiation and genetic instability present a complex problem, likely making tumors more aggressive, harder to target, and highly metastatic.

Studies have revealed many opportunities to develop targeted therapeutics for ACC patients. It is clear that ACC tumors employ a multitude of mechanisms to activate and ensure MYB oncogene expression, evidence of its importance in disease maintenance. Future studies will be required to fully define the mechanisms activating these oncogenes and how to disrupt them. To date, studies indicate that Wnt and NOTCH are the primary pathways driving proliferation, invasion, and metastases of ACC tumors, and both are attractive drug targets. Thus far it appears that in ACC tumors these pathways operate in Myb dependent and Myb independent manners. Importantly, these three pathways are interrelated and dependent on each other in intestinal tumorigenesis (Germann et al. 2014); thus, further investigation is needed to confirm the lack of interplay in ACC tumors. Our studies hint at a previously unappreciated diversity in ACC tumor transcriptomes, indicating activation of pathways in addition to those already described that could be harnessed to develop targeted therapies in the future.

APPENDICES

Appendix A: Abbreviations

RNA-sequencing (RNA-seq)
breast invasive carcinoma (BRCA)
prostate adenocarcinoma (PRAD)
head and neck squamous cell carcinoma (HNSC)
glioblastoma multiforme (GB)
ovarian serous cystadenocarcinoma (OVCA)
acute myeloid leukemia (AML)
RNA-ligase mediated rapid amplification of cDNA ends (5’RLM-RACE)
amino acid (aa)
MYB gene promoter (TSS1)
MYB gene alternative promoter (TSS2)
Avian Myeloblastosis Virus (AMV)
DNA binding domain (DBD)
transactivation domain (TAD)
full-length Myb (FL Myb)
N-terminally truncated Myb protein isoform (ΔN Myb)
Integrative Genomics Viewer (IGV)
formalin-fixed, paraffin-embedded (FFPE)
untranslated region (UTR)
Principal components analyses (PCA)
empty vector (EV)
differentially expressed (DE)
Adenoid cystic carcinoma (ACC)
Fluorescence In Situ Hybridization (FISH)
Chromatin immunoprecipitation (ChIP)
CRISPR affinity purification in situ of regulatory elements (CAPTURE)
CAP analysis of gene expression sequencing (CAGE-seq)
epithelial to mesenchymal transition (EMT)

Appendix B: Chapter 2 supplementary tables and figures

Supplementary Figure S.2.1. Large version of Figure 2.2C

Supplementary Figure S.2.2. Large version of Figure 2.4E

Appendix C: Chapter 3 supplementary tables and figures

Supplementary Table S.3.1. PCR primers and Sanger sequencing. PCR primers used for 5’RLM-RACE amplification of MYB transcripts are listed. 5’RLM-RACE products were TOPO cloned and screened with TSS1 and TSS2 specific primers. Multiple colonies predicted to be TSS1 and TSS2 were Sanger sequenced; good quality sequences are included here. MYB promoter regions were cloned into the pGL3-basic reporter vector as described in the methods. Inserts were Sanger sequenced and are included here.

name                sequence

cloning primers
MYB TSS1 FW         gaggaggagGCTAGCTTGCCGCCCACTTGTATTGA
MYB TSS1 RV         gaggaggagaagcttGGGGTCTTCGGGCtATGG
MYB TSS2 FW         gaggaggagGCTAGCGACTCTGACTAACAAGTGGCCT
MYB TSS2 RV         gaggaggagctcgagCTCATCATCCTCGTCACTGCT

5’RLM RACE primers
RACE MYBex8 RV      TGGTAGCACCTGCTGTCCTTTTAGC
RACE MYBex6 RV      TGTTCGACCTTCCGACGCATTGTAG
RACE MYBex6 FW      CCACTGGAATTCTACAATGCGTCGGA
RACE MYBex7 RV      CAGCTGGCTGAGGGACATTGACTAT
RACE MYBex7/8 FW    GCCGCAGCCATTCAGAGACACTAT

5’RLM RACE Sanger sequencing

MYB TSS1 RACE
CTCTTTCTCCTGAGAAACTTCGCCCCAGCGGTGCGGAGCGCCGCTGCGCAG
CCGGGGAGGGACGCAGGCAGGCGGCGGGCAGCGGGAGGCGGCAGCCCG
GTGCGGTCCCCGCGGCTCTCGGCGGAGCCCCGCGCCCGCCGCGCCATGG
CCCGAAGACCCCGGCACAGCATATATAGCAGTGACGAGGATGATGAGGACTT
TGAGATGTGTGACCATGACTATGATGGGCTGCTTCCCAAGTCTGGAAAGCGT
CACTTGGGGAAAACAAGGTGGACCCGGGAAGAGGATGAAAAACTGAAGAAG
CTGGTGGAACAGAATGGAACAGATGACTGGAAAGTTATTGCCAATTATCTCCC
GAATCGAACAGATGTGCAGTGCCAGCACCGATGGCAGAAAGTACTAAACCCT
GAGCTCATCAAGGGTCCTTGGACCAAAGAAGAAGATCAGAGAGTGATAGAGC
TTGTACAGAAATACGGTCCGAAACGTTGGTCTGTTATTGCCAAGCACTTAAAG
GGGAGAATTGGAAAACAATGTAGGGAGAGGTGGCATAACCACTTGAATCCAG
AAGTTAAGAAAACCTCCTGGACAGAAGAGGAAGACAGAATTATTTACCAGGC
ACACAAGAGACTGGGGAACAGATGGGCAGAAATCGCAAAGCTACTGCCTGGACG

MYB TSS2 +4409 nt RACE
AACCAGTTTACAATACTAGAGCAACAGAATGCAGCAAACAATCTTGTTGTGCA
AGTTTTCAAAGTTTTGTCTTCATAACCTTTGAAAAGATTGTTGAGGAGTTTTGT
GTAAGTTTTGTAATCCAGTAGTAGTCTAAATCCTCTTGTTTCAGCCCACGTCTA
CCCATTCTTATTTCTGCAGCATATATAGCAGTGACGAGGATGATGAGGACTTT
GAGATGTGTGACCATGACTATGATGGGCTGCTTCCCAAGTCTGGAAAGCGTC
ACTTGGGGAAAACAAGGTGGACCCGGGAAGAGGATGAAAAACTGAAGAAGC
TGGTGGAACAGAATGGAACAGATGACTGGAAAGTTATTGCCAATTATCTCCCG
AATCGAACAGATGTGCAGTGCCAGCACCGATGGCAGAAAGTACTAAACCCTG
AGCTCATCAAGGGTCCTTGGACCAAAGAAGAAGATCAGAGAGTGATAGAGCT
TGTACAGAAATACGGTCCGAAACGTTGGTCTGTTATTGCCAAGCACTTAAAGG
GGAGAATTGGAAAACAATGTAGGGAGAGGTGGCATAACCACTTGAATCCAGA
AGTTAAGAAAACCTCCTGGACAGAAGAGGAAGACAGAATTATTTACCAGGCA
CACAAGAGACTGGGGAACAGATGGGCAGAAATCGCAAAGCTACTGCCTGGACG

MYB TSS2 +4405 nt RACE
AGTTTACAATACTAGAGCAACAGAATGCAGCAAACAATCTTGTTGTGCAAGTT
TTCAAAGTTTTGTCTTCATAACCTTTGAAAAGATTGTTGAGGAGTTTTGTGTAA
GTTTTGTAATCCAGTAGTAGTCTAAATCCTCTTGTTTCAGCCCACGTCTACCCA
TTCTTATTTCTGCAGCATATATAGCAGTGACGAGGATGATGAGGACTTTGAGAT
GTGTGACCATGACTATGATGGGCTGCTTCCCAAGTCTGGAAAGCGTCACTTG
GGGAAAACAAGGTGGACCCGGGAAGAGGATGAAAAACTGAAGAAGCTGGTG
GAACAGAATGGAACAGATGACTGGAAAGTTATTGCCAATTATCTCCCGAATCG
AACAGATGTGCAGTGCCAGCACCGATGGCAGAAAGTACTAAACCCTGAGCTC
ATCAAGGGTCCTTGGACCAAAGAAGAAGATCAGAGAGTGATAGAGCTTGTAC
AGAAATACGGTCCGAAACGTTGGTCTGTTATTGCCAAGCACTTAAAGGGGAG
AATTGGAAAACAATGTAGGGAGAGGTGGCATAACCACTTGAATCCAGAAGTTA
AGAAAACCTCCTGGACAGAAGAGGAAGACAGAATTATTTACCAGGCACACAA
GAGACTGGGGAACAGATGGGCAGAAATCGCAAAGCTACTGCCTGGACG

MYB TSS2 +4569 nt RACE
ACCCATTCTTATTTCTGCAGCATATATAGCAGTGACGAGGATGATGAGGACTTT
GAGATGTGTGACCATGACTATGATGGGCTGCTTCCCAAGTCTGGAAAGCGTC
ACTTGGGGAAAACAAGGTGGACCCGGGAAGAGGATGAAAAACTGAAGAAGC
TGGTGGAACAGAATGGAACAGATGACTGGAAAGTTATTGCCAATTATCTCCCG
AATCGAACAGATGTGCAGTGCCAGCACCGATGGCAGAAAGTACTAAACCCTG
AGCTCATCAAGGGTCCTTGGACCAAAGAAGAAGATCAGAGAGTGATAGAGCT
TGTACAGAAATACGGTCCGAAACGTTGGTCTGTTATTGCCAAGCACTTAAAGG
GGAGAATTGGAAAACAATGTAGGGAGAGGTGGCATAACCACTTGAATCCAGA
AGTTAAGAAAACCTCCTGGACAGAAGAGGAAGACAGAATTATTTACCAGGCA
CACAAGAGACTGGGGAACAGATGGGCAGAAATCGCAAAGCTACTGCCTGGACG

MYB TSS2 +4566 nt RACE
ATTCTTATTTCTGCAGCATATATAGCAGTGACGAGGATGATGAGGACTTTGAGA
TGTGTGACCATGACTATGATGGGCTGCTTCCCAAGTCTGGAAAGCGTCACTT
GGGGAAAACAAGGTGGACCCGGGAAGAGGATGAAAAACTGAAGAAGCTGGT
GGAACAGAATGGAACAGATGACTGGAAAGTTATTGCCAATTATCTCCCGAATC
GAACAGATGTGCAGTGCCAGCACCGATGGCAGAAAGTACTAAACCCTGAGCT
CATCAAGGGTCCTTGGACCAAAGAAGAAGATCAGAGAGTGATAGAGCTTGTA
CAGAAATACGGTCCGAAACGTTGGTCTGTTATTGCCAAGCACTTAAAGGGGA
GAATTGGAAAACAATGTAGGGAGAGGTGGCATAACCACTTGAATCCAGAAGT
TAAGAAAACCTCCTGGACAGAAGAGGAAGACAGAATTATTTACCAGGCACAC
AAGAGACTGGGGAACAGATGGGCAGAAATCGCAAAGCTACTGCCTGGACG

Reporter sequences for MYB promoters

MYB TSS1-Luc.
TTGCCGCCCACTTGTATTGAAGCGTCCTTTGTCACTAACAAGTTAAATTAGAG
ATGTTATTTATTTAAGAAGAAGGAAAAAAAACCCTAGCCAAACAGCCTATGAAT
ACATATGCTCACATCCCCTACTCCTCCAACTCCTAATTTCCCCGTCTCCAGAG
GGCACAGTTGTAAACCTTGACGAAAATCCAATCTTCTGTGCGGGAATTTCCC
CCCACCGCTTGCCGCCCCCGCGACAGTGAGTGGGAGCTGGAGGAGCTCTG
GTCCCGCTGCCCGGGAGCACGCGGAGCCGGGCGACCGCGGTGCGGCAGC
CAGGGAGGAGGGGAGGCGGCGGGACTGGGCGCGGGTCGGCGCCGCCGC
GACCCGGGAGCGGGGTTTGCTCAGGAAAAGGCGCCGTCGCGGCCCCCGG
CCACCCCTCCCTGGCCCCGGGCTCCCTGCCCGCGCGCCTCCCGGGCCTCG
CGGCGCGCTAGGCGCACCGCGGCGGCGCGAGCGCCGAATGGGAGCGGCG
ACCCGGCCAGCCCGGCAGCCCCGCGGGCGGCAGCCAGGGCGACCGCGGA
GGCGGCGGGCAGGGCGCGTGCGCACTGCAGGGGCGCCAGATTTGGCGGG
AGGGGGAGTGTCCAAAGCTCTTTGTTTGATGGCATCTCTGTTTACAGAGTTTA
CACTTTAATATCAACCTGTTTCCTCCTCCTCCTTCTCCTCCTCCTCCGTGACC
TCCTCCTCCTCTTTCTCCTGAGAAACTTCGCCCCAGCGGTGCGGAGCGCCG
CTGCGCAGCCGGGGAGGGACGCAGGCAGGCGGCGGGCAGCGGGAGGCG
GCAGCCCGGTGCGGTCCCCGCGGCTCTCGGCGGAGCCCCGCGCCCGCCG
CGCtATGGCCCGAAGACCCC

MYB TSS2-Luc
GACTCTGACTAACAAGTGGCCTAATTATTCACTTAGTTACTCTAGAAACTAAGT
ATTGTAAACATGGGCACAAGTTGGATCAACCAGGCCTGGAGTTGTGAGCAAT
TTGGTATTAATTTTATTTACAAAACATTAAAGCTTGATCACTCAATGTTCTTATCT
TTGCTTTGGTTTTAAAATCCTTTCCTCTTAGATTCTCCTAATCCTCTAGACTTTA
TGGGATCACTATAATTCTGTTTTGCGCTGTACTACTTCTTGATTTTTTTCTTCTT
TTAATAAAACAAAAACCCCATTGGAATAGCATAGTTGAATTGTTTATTATGTTTG
AGAAATATTATTTAAACGATGTGACAGATGCCAAAGATTTTGAGTGTGCACTTA
TATAAAGGACATGGGTTCTTGTTCCTTTTCTTATCCTTAACCTTAAGTTTTCAAC
TTAAACCTTCACTGGTTGGAAGGTGGCCAAATGTGTAACTTGTCCCTGGTCTA
ATAGTAACAGCAGGTTCAGACATGCAGGGGAATAGGAAGGTGCCAGGTCCTT
GGCCGTGTCTGTGGATACCCATAACAGCAGAACCAGTTTACAATACTAGAGCA
ACAGAATGCAGCAAACAATCTTGTTGTGCAAGTTTTCAAAGTTTTGTCTTCATA
ACCTTTGAAAAGATTGTTGAGGAGTTTTGTGTAAGTTTTGTAATCCAGTAGTAG
TCTAAATCCTCTTGTTTCAGCCCACGTCTACCCATTCTTATTTCTGCAGCATAT
ATAGCAGTGACGAGGATGATGAG
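The reporter sequences above lend themselves to two quick sanity checks, sketched here in Python. The first asks whether the gene-specific portion of each MYB TSS2 cloning primer matches the ends of the MYB TSS2-Luc insert; the second compares GC content between one stretch of each insert. This is only an illustration under stated assumptions: the 15-nt split between the primer tail (clamp plus restriction site) and the annealing region is my reading of the primer sequences, and the internal excerpts were picked for brevity, so they are not a substitute for analyzing the full inserts.

```python
# Sketch only: two sanity checks on the reporter inserts listed above.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (case is preserved)."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# MYB TSS2 cloning primers as listed in Supplementary Table S.3.1.
TSS2_FW = "gaggaggagGCTAGCGACTCTGACTAACAAGTGGCCT"
TSS2_RV = "gaggaggagctcgagCTCATCATCCTCGTCACTGCT"

# Assumption: the first 15 nt of each primer (9-nt clamp + 6-nt restriction
# site) do not anneal to the MYB locus; the remainder is gene-specific.
TAIL_LENGTH = 15

# Excerpts copied from the table above (not the full inserts).
TSS2_INSERT_START = "GACTCTGACTAACAAGTGGCCTAATTATTCACTTAGTTACTCTAGAAACTAAGT"
TSS2_INSERT_END = (
    "TCTAAATCCTCTTGTTTCAGCCCACGTCTACCCATTCTTATTTCTGCAGCATAT"
    "ATAGCAGTGACGAGGATGATGAG"
)
TSS1_INTERNAL = "GTCCCGCTGCCCGGGAGCACGCGGAGCCGGGCGACCGCGGTGCGGCAGC"
TSS2_INTERNAL = "TTGGTATTAATTTTATTTACAAAACATTAAAGCTTGATCACTCAATGTTCTTATCT"

# Check 1: primers should flank the insert (reverse primer on the other strand).
fw_matches = TSS2_INSERT_START.startswith(TSS2_FW[TAIL_LENGTH:])
rv_matches = TSS2_INSERT_END.endswith(reverse_complement(TSS2_RV[TAIL_LENGTH:]))
print("forward primer matches insert start:", fw_matches)
print("reverse primer matches insert end:", rv_matches)

# Check 2: GC content of one excerpt from each promoter insert.
tss1_gc = gc_fraction(TSS1_INTERNAL)
tss2_gc = gc_fraction(TSS2_INTERNAL)
print(f"GC fraction, TSS1 excerpt: {tss1_gc:.2f}")
print(f"GC fraction, TSS2 excerpt: {tss2_gc:.2f}")
```

Running the same `gc_fraction` function over the complete inserts, or over a sliding window, would give a fairer picture of the GC-rich versus AT-rich contrast discussed in chapter 4.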

Supplementary Figure S.3.1. Details of 5’ RLM-RACE MYB TSS2 products. MYB TSS2 5’RLM-RACE products were TOPO cloned and several colonies were submitted for Sanger sequencing. Importantly, 5’RLM-RACE, as performed here, is not quantitative and does not have single-base resolution. Upon sequencing we found transcripts that extended 180 nt, 176 nt, 20 nt, and 17 nt upstream of exon 2. We have grouped these into ~180 nt and ~20 nt groups in the main figure. Three colonies from each of the two groups were sequenced.
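The grouping described in this legend can be sketched as a simple proximity clustering. The function name and the 10-nt window below are assumptions made for illustration; the legend does not state the rule that was actually used to form the ~180 nt and ~20 nt groups.

```python
def group_by_proximity(values, max_gap):
    """Group integers so that sorted neighbours within max_gap share a group."""
    groups = []
    for value in sorted(values):
        if groups and value - groups[-1][-1] <= max_gap:
            groups[-1].append(value)
        else:
            groups.append([value])
    return groups


# Extension lengths (nt upstream of exon 2) reported in the legend above.
extensions = [180, 176, 20, 17]

# Assumption: a 10-nt window, chosen only for illustration.
groups = group_by_proximity(extensions, max_gap=10)
for group in groups:
    print(group)
```

Because 5’RLM-RACE as performed here lacks single-base resolution, a tolerance-based grouping of this kind seems more defensible than treating each observed 5’ end as a distinct start site.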

Supplementary Figure S.3.2. Predicted transcription factor binding sites for the MYB promoter reporters. Portions of both MYB gene promoters were cloned upstream of the luciferase reporter. Transcription factor binding motifs were discovered using the transcription factor affinity prediction (TRAP) set of web tools (Thomas-Chollier et al. 2011). High quality motifs were defined as having a weight score above 2.5. Binding sites for Myb were discovered in both sequences (bold underlined). Additional binding sites were discovered (underlined).

Supplementary Figure S.3.3. Exon ratio for ACC tumors with and without chromosomal translocation. ACC tumors were classified as full-length MYB expression, truncated MYB, or unknown by visual inspection of RNA-seq reads across the MYB gene. Chromosomal translocations almost always result in expression of a truncated gene; thus truncation is used as a substitute for translocation. Conversely, full-length MYB expression is derived from an unbroken gene, which has not been translocated. The ratio of reads on exon2/exon1 was plotted for full-length and truncated Myb expressing samples; unknown samples were not included. There was no significant difference in exon ratio, and hence in TSS use, between tumors with truncated and full-length MYB gene expression.
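As I read this legend, the exon2/exon1 ratio serves as a proxy for TSS use because transcripts initiated at TSS2 begin just upstream of exon 2 and therefore contribute reads to exon 2 but not to exon 1. A minimal sketch of the calculation follows. The read counts are hypothetical placeholders, not data from this study, and the names `read_counts` and `exon_ratio` are illustrative only.

```python
from statistics import median

# Hypothetical (exon 1, exon 2) read counts per tumor; NOT data from this study.
read_counts = {
    "full-length": [(120, 480), (90, 405), (200, 700)],
    "truncated": [(150, 600), (80, 336), (110, 429)],
}


def exon_ratio(exon1_reads: int, exon2_reads: int) -> float:
    """Return reads on exon 2 divided by reads on exon 1."""
    if exon1_reads <= 0:
        raise ValueError("exon 1 read count must be positive")
    return exon2_reads / exon1_reads


# One ratio per tumor, kept separate for the two expression classes.
ratios = {
    group: [exon_ratio(exon1, exon2) for exon1, exon2 in counts]
    for group, counts in read_counts.items()
}

for group, values in ratios.items():
    print(f"{group}: median exon2/exon1 = {median(values):.2f}")
```

A formal comparison between the two groups would then need a statistical test; the legend does not name the one used, so none is assumed here.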

Supplementary Figure S.3.4. N-terminal conservation of the Myb protein. An alignment of the first 240 amino acids of the human Myb protein with the chicken, mouse, AMV v-Myb, and ∆N Myb protein sequences. The N-terminus is highly conserved, as is the DBD.

Supplementary Figure S.3.5. ∆N Myb is not detected by a common Myb antibody. Myb proteins were over-expressed in HEK293TN cells and lysates were probed with two anti-Myb antibodies. Rabbit serum that detects the DNA binding domain detects both full-length and ∆N Myb isoforms, whereas the popular anti-Myb pS11 antibody (ab45150) is unable to detect ∆N Myb isoforms.

Supplementary Figure S.3.6. ∆N Myb proteins have similar activities to full-length Myb in reporter assays. Two previously described (Frerich et al. 2018) Myb-responsive reporter genes were used. A portion of the engrailed gene (EN1) promoter and the SOX4 gene promoter were cloned upstream of the firefly luciferase reporter gene. Their response to ∆N Myb and full-length Myb were assayed in HEK293TN cells. ∆N Myb binds and activates expression in a reporter assay and is thus a functional transcription factor.

Supplementary Figure S.3.7. Western blot of ectopically expressed Myb isoforms. SW620 cells were transduced with lentiviral particles to express empty vector, full-length Myb or ∆N Myb. Protein and RNA (used for RNA-seq) were harvested at 48 hours. Western blot analyses were performed using rabbit serum that detects the DNA-binding domain of Myb proteins, and actin was probed as a loading control. Full-length Myb and ∆N Myb were expressed in relatively equal quantities.

Supplementary Figure S.3.8. Large heatmap of SW620 RNA-seq. A summary heatmap displays gene expression changes elicited by Myb transcription factors. Tick marks above the heatmap indicate the following: genes differentially expressed in Myb expressing ACC tumors versus Myb negative ACC tumors (orange), genes discussed in the text (gray), genes differentially expressed in Myb versus ∆N Myb (cyan), and genes differentially expressed by both Myb and ∆N Myb (black).

Supplementary Figure S.3.9. Myb transcription factors elicited a gene signature significantly enriched in an ACC disease set. Genes regulated in common by full-length Myb and ∆N Myb were submitted to DisGeNET (Pinero et al. 2015) using the clusterProfiler R package (Yu et al. 2012). The significantly enriched disease sets, including ACC, are displayed.
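Over-representation analyses of this kind are generally scored with a hypergeometric test, which asks how surprising the overlap between a submitted gene list and a disease gene set is given the size of the background. The sketch below illustrates that calculation only. All numbers are hypothetical, the function name is my own, and the exact background and settings used for this figure are not stated in the legend.

```python
from math import comb


def overrepresentation_p(overlap: int, background: int,
                         set_size: int, list_size: int) -> float:
    """Return P(X >= overlap) when list_size genes are drawn without
    replacement from a background containing set_size disease-set genes."""
    total = comb(background, list_size)
    upper = min(set_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers for illustration only; NOT values from this study.
BACKGROUND_GENES = 20000   # annotated genes in the background
DISEASE_SET_GENES = 100    # genes annotated to the disease set
SUBMITTED_GENES = 500      # genes in the submitted list
OVERLAP = 12               # submitted genes that fall in the disease set

p_value = overrepresentation_p(
    OVERLAP, BACKGROUND_GENES, DISEASE_SET_GENES, SUBMITTED_GENES
)
print(f"p = {p_value:.3g}")
```

In practice the resulting p-values would also be adjusted for the number of disease sets tested, which this sketch omits.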

REFERENCES

Anders, S., and W. Huber. 2010. “Differential Expression Analysis for Sequence Count Data.” Genome Biol 11 (10): R106. https://doi.org/10.1186/gb-2010-11-10-r106.

Andreasen, Simon, Tina Klitmøller Agander, Kristine Bjørndal, Daiva Erentaite, Steffen Heegaard, Stine R. Larsen, Linea Cecilie Melchior, et al. 2018. “Genetic Rearrangements, Hotspot Mutations, and MicroRNA Expression in the Progression of Metastatic Adenoid Cystic Carcinoma of the Salivary Gland.” Oncotarget 9 (28). https://doi.org/10.18632/oncotarget.24800.

Bakst, R. L., C. M. Glastonbury, U. Parvathaneni, N. Katabi, K. S. Hu, and S. S. Yom. 2019. “Perineural Invasion and Perineural Tumor Spread in Head and Neck Cancer.” Int J Radiat Oncol Biol Phys 103 (5): 1109–24. https://doi.org/10.1016/j.ijrobp.2018.12.009.

Bandopadhayay, Pratiti, Lori A Ramkissoon, Payal Jain, Guillaume Bergthold, Jeremiah Wala, Rhamy Zeid, Steven E Schumacher, et al. 2016. “MYB-QKI Rearrangements in Angiocentric Glioma Drive Tumorigenicity through a Tripartite Mechanism.” Nature Genetics 48 (3): 273–82. https://doi.org/10.1038/ng.3500.

Bell, D., A. Bell, E. Hanna, and R. Weber. 2016. “In-Depth Characterization of the Salivary Adenoid Cystic Carcinoma Transcriptome with Emphasis on Dominant Cell Type.” Cancer 122 (10): 1513–22. https://doi.org/10.1002/cncr.29959.

Bell, D., and E. Y. Hanna. 2012. “Salivary Gland Cancers: Biology and Molecular Targets for Therapy.” Curr Oncol Rep 14 (2): 166–74. https://doi.org/10.1007/s11912-012-0220-5.

Bell, Diana, Achim Bell, Dianna Roberts, Randal S. Weber, and Adel K. El-Naggar. 2012. “Developmental Transcription Factor EN1-a Novel Biomarker in Human Salivary Gland Adenoid Cystic Carcinoma.” Cancer 118 (5): 1288–92. https://doi.org/10.1002/cncr.26412.

Bender, Timothy P, Christopher S Kremer, Manfred Kraus, Thorsten Buch, and Klaus Rajewsky. 2004. “Critical Functions for C-Myb at Three Checkpoints during Thymocyte Development.” Nature Immunology 5 (7): 721–29. https://doi.org/10.1038/ni1085.

Ben-Porath, I., M. W. Thomson, V. J. Carey, R. Ge, G. W. Bell, A. Regev, and R. A. Weinberg. 2008. “An Embryonic Stem Cell-like Gene Expression Signature in Poorly Differentiated Aggressive Human Tumors.” Nat Genet 40 (5): 499–507. https://doi.org/10.1038/ng.127.

Binmadi, Nada O., Ying-Hua Yang, Hua Zhou, Patrizia Proia, Yi-Ling Lin, Alfredo M. Batista De Paula, André L. Sena Guimarães, Fabiano O. Poswar, Devaki Sundararajan, and John R. Basile. 2012. “Plexin-B1 and Semaphorin 4D Cooperate to Promote Perineural Invasion in a RhoA/ROK-Dependent Manner.” The American Journal of Pathology 180 (3): 1232–42. https://doi.org/10.1016/j.ajpath.2011.12.009.

Bolcun-Filas, Ewelina, Laura A. Bannister, Alex Barash, Kerry J. Schimenti, Suzanne A. Hartford, John J. Eppig, Mary Ann Handel, Lishuang Shen, and John C. Schimenti. 2011. “A-MYB (MYBL1) Transcription Factor Is a Master Regulator of Male Meiosis.” Development 138 (15): 3319–30. https://doi.org/10.1242/dev.067645.

Brayer, K. J., C. A. Frerich, H. Kang, and S. A. Ness. 2016. “Recurrent Fusions in MYB and MYBL1 Define a Common, Transcription Factor-Driven Oncogenic Pathway in Salivary Gland Adenoid Cystic Carcinoma.” Cancer Discov 6 (2): 176–87. https://doi.org/10.1158/2159-8290.CD-15-0859.

Brill, L. B., W. A. Kanner, A. Fehr, Y. Andrén, C. A. Moskaluk, T. Löning, G. Stenman, and H. F. Frierson. 2011. “Analysis of MYB Expression and MYB-NFIB Gene Fusions in Adenoid Cystic Carcinoma and Other Salivary Neoplasms.” Mod Pathol 24 (9): 1169–76. https://doi.org/10.1038/modpathol.2011.86.

Brown, Roger B., Nathaniel J. Madrid, Hideaki Suzuki, and Scott A. Ness. 2017. “Optimized Approach for Ion Proton RNA Sequencing Reveals Details of RNA Splicing and Editing Features of the Transcriptome.” PloS One 12 (5): e0176675. https://doi.org/10.1371/journal.pone.0176675.

Bruhat, Alain, Yoan Chérasse, Anne-Catherine Maurin, Wolfgang Breitwieser, Laurent Parry, Christiane Deval, Nic Jones, Céline Jousse, and Pierre Fafournoux. 2007. “ATF2 Is Required for Amino Acid-Regulated Transcription by Orchestrating Specific Histone Acetylation.” Nucleic Acids Research 35 (4): 1312–21. https://doi.org/10.1093/nar/gkm038.

Burk, Oliver, and Karl-Heinz Klempnauer. 1999. “Myb and Ets Transcription Factors Cooperate at the Myb-Inducible Promoter of the Tom-1 Gene.” Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression 1446 (3): 243–52. https://doi.org/10.1016/S0167-4781(99)00097-4.

Cai, Yong, Jingji Jin, Chieri Tomomori-Sato, Shigeo Sato, Irina Sorokina, Tari J. Parmely, Ronald C. Conaway, and Joan Weliky Conaway. 2003. “Identification of New Subunits of the Multiprotein Mammalian TRRAP/TIP60-Containing Histone Acetyltransferase Complex.” The Journal of Biological Chemistry 278 (44): 42733–36. https://doi.org/10.1074/jbc.C300389200.

Capparuccia, L., and L. Tamagnone. 2009. “Semaphorin Signaling in Cancer Cells and in Cells of the Tumor Microenvironment–Two Sides of a Coin.” J Cell Sci 122 (Pt 11): 1723–36. https://doi.org/10.1242/jcs.030197.

Carninci, P., A. Sandelin, B. Lenhard, S. Katayama, K. Shimokawa, J. Ponjavic, C. A. Semple, et al. 2006. “Genome-Wide Analysis of Mammalian Promoter Architecture and Evolution.” Nat Genet 38 (6): 626–35. https://doi.org/10.1038/ng1789.

Cesi, V., A. Casciati, F. Sesti, B. Tanno, B. Calabretta, and G. Raschella. 2011. “TGFbeta-Induced c-Myb Affects the Expression of EMT-Associated Genes and Promotes Invasion of ER+ Breast Cancer Cells.” Cell Cycle 10 (23): 4149–61. https://doi.org/10.4161/cc.10.23.18346.

Chang, Boyang, Hang Yang, Yuan Jiao, Kefeng Wang, Zhonghua Liu, Peihong Wu, Su Li, and Anxun Wang. 2016. “SOD2 Deregulation Enhances Migration, Invasion and Has Poor Prognosis in Salivary Adenoid Cystic Carcinoma.” Scientific Reports 6: 25918. https://doi.org/10.1038/srep25918.

Corradini, F., V. Cesi, V. Bartella, E. Pani, R. Bussolari, O. Candini, and B. Calabretta. 2005. “Enhanced Proliferative Potential of Hematopoietic Cells Expressing Degradation-Resistant c-Myb Mutants.” J Biol Chem 280 (34): 30254–62. https://doi.org/10.1074/jbc.M504703200.

Costa, A., A. Altemani, C. Garcia-Inclan, F. Fresno, C. Suarez, J. L. Llorente, and M. Hermsen. 2014. “Analysis of MYB Oncogene in Transformed Adenoid Cystic Carcinomas Reveals Distinct Pathways of Tumor Progression.” Lab Invest 94 (6): 692–702. https://doi.org/10.1038/labinvest.2014.59.

Cuddihy, A E, L A Brents, N Aziz, T P Bender, and W M Kuehl. 1993. “Only the DNA Binding and Transactivation Domains of C-Myb Are Required to Block Terminal Differentiation of Murine Erythroleukemia Cells.” Molecular and Cellular Biology 13 (6): 3505–13. https://doi.org/10.1128/MCB.13.6.3505.

Cures, A., C. House, C. Kanei-Ishii, B. Kemp, and R. G. Ramsay. 2001. “Constitutive C-Myb Amino-Terminal Phosphorylation and DNA Binding Activity Uncoupled during Entry and Passage through the Cell Cycle.” Oncogene 20 (14): 1784–92. https://doi.org/10.1038/sj.onc.1204345.

Daa, Tsutomu, Kenji Kashima, Naomi Kaku, Masashi Suzuki, and Shigeo Yokoyama. 2004. “Mutations in Components of the Wnt Signaling Pathway in Adenoid Cystic Carcinoma.” Modern Pathology 17 (12): 1475–82. https://doi.org/10.1038/modpathol.3800209.

Dai, Wei, Xuexin Tan, Changfu Sun, and Qing Zhou. 2014. “High Expression of SOX2 Is Associated with Poor Prognosis in Patients with Salivary Gland Adenoid Cystic Carcinoma.” International Journal of Molecular Sciences 15 (5): 8393–8406. https://doi.org/10.3390/ijms15058393.

Dantas, A. N., E. F. Morais, R. A. Macedo, J. M. Tinoco, and L. Morais Mde. 2015. “Clinicopathological Characteristics and Perineural Invasion in Adenoid Cystic Carcinoma: A Systematic Review.” Braz J Otorhinolaryngol 81 (3): 329–35. https://doi.org/10.1016/j.bjorl.2014.07.016.

Dash, A. B., F. C. Orrico, and S. A. Ness. 1996. “The EVES Motif Mediates Both Intermolecular and Intramolecular Regulation of C-Myb.” Genes Dev 10 (15): 1858–69.

Dassé, E., G. Volpe, D.S. Walton, N. Wilson, W. Del Pozzo, L.P. O’Neill, R.K. Slany, J. Frampton, and S. Dumon. 2012. “Distinct Regulation of C-Myb Gene Expression by HoxA9, Meis1 and Pbx Proteins in Normal Hematopoietic Progenitors and Transformed Myeloid Cells.” Blood Cancer Journal 2 (6). https://doi.org/10.1038/bcj.2012.20.

Davis, Matthew P. A., Stijn van Dongen, Cei Abreu-Goodger, Nenad Bartonicek, and Anton J. Enright. 2013. “Kraken: A Set of Tools for Quality Control and Analysis of High-Throughput Sequence Data.” Methods (San Diego, Calif.) 63 (1): 41–49. https://doi.org/10.1016/j.ymeth.2013.06.027.

Davuluri, R. V., Y. Suzuki, S. Sugano, C. Plass, and T. H. Huang. 2008. “The Functional Consequences of Alternative Promoter Use in Mammalian Genomes.” Trends Genet 24 (4): 167–77. https://doi.org/10.1016/j.tig.2008.01.008.

DeAngelis, A. F., A. Tsui, D. Wiesenfeld, and A. Chandu. 2011. “Outcomes of Patients with Adenoid Cystic Carcinoma of the Minor Salivary Glands.” Int J Oral Maxillofac Surg 40 (7): 710–14. https://doi.org/10.1016/j.ijom.2011.02.010.

Dillon, P., C. Moskaluk, P. Joshi, and C. Thomas. 2016. “Adenoid Cystic Carcinoma: A Review of Recent Advances, Molecular Targets, and Clinical Trials.” Head & Neck 38 (4): 620–27. https://doi.org/10.1002/hed.23925.

Dini, P. W., and J. S. Lipsick. 1993. “Oncogenic Truncation of the First Repeat of C-Myb Decreases DNA Binding in Vitro and in Vivo.” Mol Cell Biol 13 (12): 7334–48.

Dobin, A., C. A. Davis, F. Schlesinger, J. Drenkow, C. Zaleski, S. Jha, P. Batut, M. Chaisson, and T. R. Gingeras. 2013. “STAR: Ultrafast Universal RNA-Seq Aligner.” Bioinformatics 29 (1): 15–21. https://doi.org/10.1093/bioinformatics/bts635.

Drabsch, Y., H. Hugo, R. Zhang, D. H. Dowhan, Y. R. Miao, A. M. Gewirtz, S. C. Barry, R. G. Ramsay, and T. J. Gonda. 2007. “Mechanism of and Requirement for Estrogen-Regulated MYB Expression in Estrogen-Receptor-Positive Breast Cancer Cells.” Proc Natl Acad Sci USA 104 (34): 13762–67. https://doi.org/10.1073/pnas.0700104104.

Drier, Y., M. J. Cotton, K. E. Williamson, S. M. Gillespie, R. J. Ryan, M. J. Kluk, C. D. Carey, et al. 2016. “An Oncogenic MYB Feedback Loop Drives Alternate Cell Fates in Adenoid Cystic Carcinoma.” Nat Genet 48 (3): 265–72. https://doi.org/10.1038/ng.3502.

Dubendorff, J. W., L. J. Whittaker, J. T. Eltman, and J. S. Lipsick. 1992. “Carboxy-Terminal Elements of c-Myb Negatively Regulate Transcriptional Activation in Cis and in Trans.” Genes Dev 6 (12B): 2524–35.

Dvorák, M., P. Urbánek, P. Bartůnĕk, V. Paces, J. Vlach, V. Pecenka, L. Arnold, M. Trávnicek, and J. Ríman. 1989. “Transcription of the Chicken Myb Proto-Oncogene Starts within a CpG Island.” Nucleic Acids Res 17 (14): 5651–64.

Fehr, A., A. Kovács, T. Löning, H. Frierson, J. van den Oord, and G. Stenman. 2011. “The MYB-NFIB Gene Fusion-a Novel Genetic Link between Adenoid Cystic Carcinoma and Dermal Cylindroma.” J Pathol 224 (3): 322–27. https://doi.org/10.1002/path.2909.

Ferrarotto, Renata, Yoshitsugu Mitani, Lixia Diao, Irene Guijarro, Jing Wang, Patrick Zweidler-McKay, Diana Bell, et al. 2017. “Activating NOTCH1 Mutations Define a Distinct Subgroup of Patients With Adenoid Cystic Carcinoma Who Have Poor Prognosis, Propensity to Bone and Liver Metastasis, and Potential Responsiveness to Notch1 Inhibitors.” Journal of Clinical Oncology: Official Journal of the American Society of Clinical Oncology 35 (3): 352–60. https://doi.org/10.1200/JCO.2016.67.5264.

Forrest, A. R., H. Kawaji, M. Rehli, J. K. Baillie, M. J. de Hoon, V. Haberle, T. Lassmann, et al. 2014. “A Promoter-Level Mammalian Expression Atlas.” Nature 507 (7493): 462–70. https://doi.org/10.1038/nature13182.

Frampton, J., T. J. Gibson, S. A. Ness, G. Doderlein, and T. Graf. 1991. “Proposed Structure for the DNA-Binding Domain of the Myb Oncoprotein Based on Model Building and Mutational Analysis.” Protein Eng 4 (8): 891–901.

Frerich, C. A., K. J. Brayer, B. M. Painter, H. Kang, Y. Mitani, A. K. El-Naggar, and S. A. Ness. 2018. “Transcriptomes Define Distinct Subgroups of Salivary Gland Adenoid Cystic Carcinoma with Different Driver Mutations and Outcomes.” Oncotarget 9 (7): 7341–58. https://doi.org/10.18632/oncotarget.23641.

Frierson, Henry F., Adel K. El-Naggar, John B. Welsh, Lisa M. Sapinoso, Andrew I. Su, Jun Cheng, Takashi Saku, Christopher A. Moskaluk, and Garret M. Hampton. 2002. “Large Scale Molecular Analysis Identifies Genes with Altered Expression in Salivary Adenoid Cystic Carcinoma.” The American Journal of Pathology 161 (4): 1315–23. https://doi.org/10.1016/S0002-9440(10)64408-2.

Fu, S. L., and J. S. Lipsick. 1996. “FAETL Motif Required for Leukemic Transformation by V-Myb.” J Virol 70 (8): 5600–5610.

Furuta, Y., S. Aizawa, Y. Suda, Y. Ikawa, H. Nakasgoshi, Y. Nishina, and S. Ishii. 1993. “Degeneration of Skeletal and Cardiac Muscles in C-Myb Transgenic Mice.” Transgenic Res 2 (4): 199–207.

Gambaro, Karen, Michael C. J. Quinn, Paulina M. Wojnarowicz, Suzanna L. Arcand, Manon de Ladurantaye, Véronique Barrès, Jean-Sébastien Ripeau, et al. 2013. “VGLL3 Expression Is Associated with a Tumor Suppressor Phenotype in Epithelial Ovarian Cancer.” Molecular Oncology 7 (3): 513–30. https://doi.org/10.1016/j.molonc.2012.12.006.

Gao, R., C. Cao, M. Zhang, M. C. Lopez, Y. Yan, Z. Chen, Y. Mitani, et al. 2014. “A Unifying Gene Signature for Adenoid Cystic Cancer Identifies Parallel MYB-Dependent and MYB-Independent Therapeutic Targets.” Oncotarget 5 (24): 12528–42. https://doi.org/10.18632/oncotarget.2985.

Garapati, P. 2009a. “Sema4D in Semaphorin Signaling.” 2009. https://www.reactome.org/content/detail/R-HSA-400685.

Garapati, P. 2009b. “Sema4D Induced Cell Migration and Growth-Cone Collapse.” 2009. https://www.reactome.org/content/detail/R-HSA-416572.

Gentleman, R. C., V. J. Carey, D. M. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. Ellis, et al. 2004. “Bioconductor: Open Software Development for Computational Biology and Bioinformatics.” Genome Biol 5 (10): R80. https://doi.org/10.1186/gb-2004-5-10-r80.

George, Olivia L, and Scott A Ness. 2014. “Situational Awareness: Cell Cycle Regulation of the Myb Transcription Factor,” 21.

Germann, Markus, Huiling Xu, Jordane Malaterre, Shienny Sampurno, Mathilde Huyghe, Dane Cheasley, Silvia Fre, and Robert G. Ramsay. 2014. “Tripartite Interactions between Wnt Signaling, Notch and Myb for Stem/Progenitor Cell Functions during Intestinal Tumorigenesis.” Stem Cell Research 13 (3 Pt A): 355–66. https://doi.org/10.1016/j.scr.2014.08.002.

Gil, Z., D. L. Carlson, A. Gupta, N. Lee, B. Hoppe, J. P. Shah, and D. H. Kraus. 2009. “Patterns and Incidence of Neural Invasion in Patients with Cancers of the Paranasal Sinuses.” Arch Otolaryngol Head Neck Surg 135 (2): 173–79. https://doi.org/10.1001/archoto.2008.525.

Giordano, S., S. Corso, P. Conrotto, S. Artigiani, G. Gilestro, D. Barberis, L. Tamagnone, and P. M. Comoglio. 2002. “The Semaphorin 4D Receptor Controls Invasive Growth by Coupling with Met.” Nat Cell Biol 4 (9): 720–24. https://doi.org/10.1038/ncb843.

Gonda, T., C. Buckmaster, and R. Ramsay. 1989. “Activation of C-Myb by Carboxy-Terminal Truncation: Relationship to Transformation of Murine Haemopoietic Cells in Vitro.” EMBO J 8 (6): 1777–83.

Grässer, F. A., T. Graf, and J. S. Lipsick. 1991. “Protein Truncation Is Required for the Activation of the C-Myb Proto-Oncogene.” Molecular and Cellular Biology 11 (8): 3987–96. https://doi.org/10.1128/MCB.11.8.3987.

Gross, C. A., C. Chan, A. Dombroski, T. Gruber, M. Sharp, J. Tupy, and B. Young. 1998. “The Functional and Regulatory Roles of Sigma Factors in Transcription.” Cold Spring Harb Symp Quant Biol 63: 141–55.

Hanahan, Douglas, and Robert A. Weinberg. 2011. “Hallmarks of Cancer: The Next Generation.” Cell 144 (5): 646–74. https://doi.org/10.1016/j.cell.2011.02.013.

Heinz, Sven, Christopher Benner, Nathanael Spann, Eric Bertolino, Yin C. Lin, Peter Laslo, Jason X. Cheng, Cornelis Murre, Harinder Singh, and Christopher K. Glass. 2010. “Simple Combinations of Lineage-Determining Transcription Factors Prime Cis-Regulatory Elements Required for Macrophage and B Cell Identities.” Molecular Cell 38 (4): 576–89. https://doi.org/10.1016/j.molcel.2010.05.004.

Ho, A. S., K. Kannan, D. M. Roy, L. G. Morris, I. Ganly, N. Katabi, D. Ramaswami, et al. 2013. “The Mutational Landscape of Adenoid Cystic Carcinoma.” Nat Genet 45 (7): 791–98. https://doi.org/10.1038/ng.2643.

Ho, Allen S., Angelica Ochoa, Gowtham Jayakumaran, Ahmet Zehir, Cristina Valero Mayor, Justin Tepe, Vladimir Makarov, et al. 2019. “Genetic Hallmarks of Recurrent/Metastatic Adenoid Cystic Carcinoma.” The Journal of Clinical Investigation 129 (10): 4276–89. https://doi.org/10.1172/JCI128227.

Hu, Y. L., R. G. Ramsay, C. Kanei-Ishii, S. Ishii, and T. J. Gonda. 1991. “Transformation by Carboxyl-Deleted Myb Reflects Increased Transactivating Capacity and Disruption of a Negative Regulatory Domain.” Oncogene 6 (9): 1549–53.

Hugo, H., A. Cures, N. Suraweera, Y. Drabsch, D. Purcell, T. Mantamadiotis, W. Phillips, et al. 2006. “Mutations in the MYB Intron I Regulatory Sequence Increase Transcription in Colon Cancers.” Genes Chromosomes Cancer 45 (12): 1143–54. https://doi.org/10.1002/gcc.20378.

Hugo, H. J., L. Pereira, R. Suryadinata, Y. Drabsch, T. J. Gonda, N. P. Gunasinghe, C. Pinto, et al. 2013. “Direct Repression of MYB by ZEB1 Suppresses Proliferation and Epithelial Gene Expression during Epithelial-to-Mesenchymal Transition of Breast Cancer Cells.” Breast Cancer Res 15 (6): R113. https://doi.org/10.1186/bcr3580.

Hunt, J. L. 2011. “An Update on Molecular Diagnostics of Squamous and Salivary Gland Tumors of the Head and Neck.” Arch Pathol Lab Med 135 (5): 602–9. https://doi.org/10.1043/2010-0655-RAIR.1.

Introna, Martino, Josée Golay, Jon Frampton, Toru Nakano, Scott A. Ness, and Thomas Graf. 1990. “Mutations in V-Myb Alter the Differentiation of Myelomonocytic Cells Transformed by the Oncogene.” Cell 63 (6): 1287–97. https://doi.org/10.1016/0092-8674(90)90424-D.

Jacobs, S. M., K. M. Gorse, and E. H. Westin. 1994. “Identification of a Second Promoter in the Human C-Myb Proto-Oncogene.” Oncogene 9 (1): 227–35.

Jiang, Wanping, Madge R Kanter, Ira Dunkel, Robert G Ramsay, Karen L Beemon, and William S Hayward. 1997. “Minimal Truncation of the C-Myb Gene Product in Rapid-Onset B-Cell Lymphoma.” J Virol 71: 8.

Jones, A. S., J. W. Hamilton, H. Rowley, D. Husband, and T. R. Helliwell. 1997. “Adenoid Cystic Carcinoma of the Head and Neck.” Clin Otolaryngol Allied Sci 22 (5): 434–43.

Kanei-Ishii, C. 2004. “Wnt-1 Signal Induces Phosphorylation and Degradation of c-Myb Protein via TAK1, HIPK2, and NLK.” Genes & Development 18 (7): 816–29. https://doi.org/10.1101/gad.1170604.

Karolchik, Donna, Angie S. Hinrichs, and W. James Kent. 2007. “The UCSC Genome Browser.” Current Protocols in Bioinformatics Chapter 1 (March): Unit 1.4. https://doi.org/10.1002/0471250953.bi0104s17.

Kent, W. James, Charles W. Sugnet, Terrence S. Furey, Krishna M. Roskin, Tom H. Pringle, Alan M. Zahler, and David Haussler. 2002. “The Human Genome Browser at UCSC.” Genome Research 12 (6): 996–1006. https://doi.org/10.1101/gr.229102.

Koyama, N., J. Zhang, Huqun, H. Miyazawa, T. Tanaka, X. Su, and K. Hagiwara. 2008. “Identification of IGFBP-6 as an Effector of the Tumor Suppressor Activity of SEMA3B.” Oncogene 27 (51): 6581–89.

Lauder, A., A. Castellanos, and K. Weston. 2001. “C-Myb Transcription Is Activated by Protein Kinase B (PKB) Following Interleukin 2 Stimulation of T Cells and Is Required for PKB-Mediated Protection from Apoptosis.” Mol Cell Biol 21 (17): 5797–5805.

Lei, W., J. J. Rushton, L. M. Davis, F. Liu, and S. A. Ness. 2004. “Positive and Negative Determinants of Target Gene Specificity in Myb Transcription Factors.” J Biol Chem 279 (28): 29519–27. https://doi.org/10.1074/jbc.M403133200.

Li, J., and R. Tibshirani. 2013. “Finding Consistent Patterns: A Nonparametric Approach for Identifying Differential Expression in RNA-Seq Data.” Stat Methods Med Res 22 (5): 519–36. https://doi.org/10.1177/0962280211428386.

Li, Jun, Daniela M. Witten, Iain M. Johnstone, and Robert Tibshirani. 2012. “Normalization, Testing, and False Discovery Rate Estimation for RNA-Sequencing Data.” Biostatistics (Oxford, England) 13 (3): 523–38. https://doi.org/10.1093/biostatistics/kxr031.

Liu, F., W. Lei, J. P. O’Rourke, and S. A. Ness. 2006. “Oncogenic Mutations Cause Dramatic, Qualitative Changes in the Transcriptional Activity of c-Myb.” Oncogene 25 (5): 795–805. https://doi.org/10.1038/sj.onc.1209105.

Liu, Han, Li Du, Ru Wang, Chao Wei, Bo Liu, Lei Zhu, Pixu Liu, et al. 2015. “High Frequency of Loss of PTEN Expression in Human Solid Salivary Adenoid Cystic Carcinoma and Its Implication for Targeted Therapy.” Oncotarget 6 (13): 11477–91. https://doi.org/10.18632/oncotarget.3411.

Liu, Xin, Yuannyu Zhang, Yong Chen, Mushan Li, Feng Zhou, Kailong Li, Hui Cao, et al. 2017. “In Situ Capture of Chromatin Interactions by Biotinylated DCas9.” Cell 170 (5): 1028-1043.e19. https://doi.org/10.1016/j.cell.2017.08.003.

Liu, Yu, John Easton, Ying Shao, Jamie Maciaszek, Zhaoming Wang, Mark R. Wilkinson, Kelly McCastlain, et al. 2017. “The Genomic Landscape of Pediatric and Young Adult T-Lineage Acute Lymphoblastic Leukemia.” Nature Genetics 49 (8): 1211–18. https://doi.org/10.1038/ng.3909.

Losick, R. 1998. “Summary: Three Decades after Sigma.” Cold Spring Harb Symp Quant Biol 63: 653–66.

Luscher, B., E. Christenson, D. W. Litchfield, E. G. Krebs, and R. N. Eisenman. 1990. “Myb DNA Binding Inhibited by Phosphorylation at a Site Deleted during Oncogenic Activation.” Nature 344 (6266): 517–22. https://doi.org/10.1038/344517a0.

Mahrous, Ali K. 2010. “Adenoid Cystic Carcinoma: Timing of Recurrence” 8 (3): 9.

Malaterre, J., T. Mantamadiotis, S. Dworkin, S. Lightowler, Q. Yang, M. I. Ransome, A. M. Turnley, et al. 2008. “C-Myb Is Required for Neural Progenitor Cell Proliferation and Maintenance of the Neural Stem Cell Niche in Adult Brain.” Stem Cells 26 (1): 173–81. https://doi.org/10.1634/stemcells.2007-0293.

Marcu, K. B., S. A. Bossone, and A. J. Patel. 1992. “Myc Function and Regulation.” Annu Rev Biochem 61: 809–60. https://doi.org/10.1146/annurev.bi.61.070192.004113.

Matsumoto, S., T. Kurimoto, M. M. Taketo, S. Fujii, and A. Kikuchi. 2016. “The WNT/MYB Pathway Suppresses KIT Expression to Control the Timing of Salivary Proacinar Differentiation and Duct Formation.” Development 143 (13): 2311–24. https://doi.org/10.1242/dev.134486.

Mikkelsen, T. S., M. Ku, D. B. Jaffe, B. Issac, E. Lieberman, G. Giannoukos, P. Alvarez, et al. 2007. “Genome-Wide Maps of Chromatin State in Pluripotent and Lineage-Committed Cells.” Nature 448 (7153): 553–60. https://doi.org/10.1038/nature06008.

Mitani, Y., J. Li, P. H. Rao, Y. J. Zhao, D. Bell, S. M. Lippman, R. S. Weber, C. Caulin, and A. K. El-Naggar. 2010. “Comprehensive Analysis of the MYB-NFIB Gene Fusion in Salivary Adenoid Cystic Carcinoma: Incidence, Variability, and Clinicopathologic Significance.” Clin Cancer Res 16 (19): 4722–31. https://doi.org/10.1158/1078-0432.CCR-10-0463.

Mitani, Y., B. Liu, P. Rao, V. Borra, M. Zafereo, R. Weber, M. Kies, et al. 2016. “Novel MYBL1 Gene Rearrangements with Recurrent MYBL1–NFIB Fusions in Salivary Adenoid Cystic Carcinomas Lacking t(6;9) Translocations.” Clinical Cancer Research 22 (3): 725–33. https://doi.org/10.1158/1078-0432.CCR-15-2867-T.

Mitani, Y., P. H. Rao, P. A. Futreal, D. B. Roberts, P. J. Stephens, Y. J. Zhao, L. Zhang, et al. 2011. “Novel Chromosomal Rearrangements and Break Points at the t(6;9) in Salivary Adenoid Cystic Carcinoma: Association with MYB-NFIB Chimeric Fusion, MYB Expression, and Clinical Outcome.” Clin Cancer Res 17 (22): 7003–14. https://doi.org/10.1158/1078-0432.ccr-11-1870.

Mucenski, M. L., K. McLain, A. B. Kier, S. H. Swerdlow, C. M. Schreiner, T. A. Miller, D. W. Pietryga, W. J. Scott Jr., and S. S. Potter. 1991. “A Functional C-Myb Gene Is Required for Normal Murine Fetal Hepatic Hematopoiesis.” Cell 65 (4): 677–89.

Ness, S. 2003. “Myb Protein Specificity: Evidence of a Context-Specific Transcription Factor Code.” Blood Cells, Molecules, and Diseases 31 (2): 192–200. https://doi.org/10.1016/S1079-9796(03)00151-7.

Ness, S., E. Kowenz-Leutz, T. Casini, T. Graf, and A. Leutz. 1993. “Myb and NF-M: Combinatorial Activators of Myeloid Genes in Heterologous Cell Types.” Genes Dev 7 (5): 749–59.

Ness, S., A. Marknell, and T. Graf. 1989. “The V-Myb Oncogene Product Binds to and Activates the Promyelocyte-Specific Mim-1 Gene.” Cell 59 (6): 1115–25. https://doi.org/10.1016/0092-8674(89)90767-8.

Nordkvist, A., J. Mark, H. Gustafsson, G. Bang, and G. Stenman. 1994. “Non-Random Chromosome Rearrangements in Adenoid Cystic Carcinoma of the Salivary Glands.” Genes, Chromosomes & Cancer 10 (2): 115–21.

Northcott, P. A., C. Lee, T. Zichner, A. M. Stutz, S. Erkek, D. Kawauchi, D. J. Shih, et al. 2014. “Enhancer Hijacking Activates GFI1 Family Oncogenes in Medulloblastoma.” Nature 511 (7510): 428–34. https://doi.org/10.1038/nature13379.

Oelgeschläger, M., R. Janknecht, J. Krieg, S. Schreek, and B. Lüscher. 1996. “Interaction of the Co-Activator CBP with Myb Proteins: Effects on Myb-Specific Transactivation and on the Cooperativity with NF-M.” The EMBO Journal 15 (11): 2771–80. https://doi.org/10.1002/j.1460-2075.1996.tb00637.x.

Oelgeschlager, M., E. Kowenz-Leutz, S. Schreek, A. Leutz, and B. Luscher. 2001. “Tumorigenic N-Terminal Deletions of c-Myb Modulate DNA Binding, Transactivation, and Cooperativity with C/EBP.” Oncogene 20 (50): 7420–24. https://doi.org/10.1038/sj.onc.1204922.

Oelgeschlager, M., J. Krieg, J. M. Luscher-Firzlaff, and B. Luscher. 1995. “Casein Kinase II Phosphorylation Site Mutations in C-Myb Affect DNA Binding and Transcriptional Cooperativity with NF-M.” Mol Cell Biol 15 (11): 5966–74.

Ogata, Kazuhiro, Souichi Morikawa, Haruki Nakamura, Ai Sekikawa, Taiko Inoue, Hiroko Kanai, Akinori Sarai, Shunsuke Ishii, and Yoshifumi Nishimura. 1994. “Solution Structure of a Specific DNA Complex of the Myb DNA-Binding Domain with Cooperative Recognition Helices.” Cell 79 (4): 639–48. https://doi.org/10.1016/0092-8674(94)90549-5.

Ohyashiki, Kazuma, Junko H. Ohyashiki, Alan J. Kinniburgh, Keisuke Toyama, Hisao Ito, Jun Minowada, and Avery A. Sandberg. 1988. “Myb Oncogene in Human Hematopoietic Neoplasia with 6q− Anomaly.” Cancer Genetics and Cytogenetics 33 (1): 83–92. https://doi.org/10.1016/0165-4608(88)90053-2.

O’Rourke, J. P., and S. Ness. 2008. “Alternative RNA Splicing Produces Multiple Forms of C-Myb with Unique Transcriptional Activities.” Mol Cell Biol 28 (6): 2091–2101. https://doi.org/10.1128/MCB.01870-07.

Panaccione, A., M. T. Chang, B. E. Carbone, Y. Guo, C. A. Moskaluk, R. K. Virk, L. Chiriboga, et al. 2016. “NOTCH1 and SOX10 Are Essential for Proliferation and Radiation Resistance of Cancer Stem-Like Cells in Adenoid Cystic Carcinoma.” Clinical Cancer Research 22 (8): 2083–95. https://doi.org/10.1158/1078-0432.CCR-15-2208.

Pearson, Richard, and Kathleen Weston. 2000. “C-Myb Regulates the Proliferation of Immature Thymocytes Following β-Selection.” The EMBO Journal 19 (22): 6112–20. https://doi.org/10.1093/emboj/19.22.6112.

Pereira, L. A., H. J. Hugo, J. Malaterre, X. Huiling, S. Sonza, A. Cures, D. F. Purcell, et al. 2015. “MYB Elongation Is Regulated by the Nucleic Acid Binding of NFkappaB P50 to the Intronic Stem-Loop Region.” PLoS One 10 (4): e0122919. https://doi.org/10.1371/journal.pone.0122919.

Perkel, J. M., M. C. Simon, and A. Rao. 2002. “Identification of a C-Myb Attenuator-Binding Factor.” Leuk Res 26 (2): 179–90.

Persson, M., Y. Andrén, J. Mark, H. M. Horlings, F. Persson, and G. Stenman. 2009. “Recurrent Fusion of MYB and NFIB Transcription Factor Genes in Carcinomas of the Breast and Head and Neck.” Proc Natl Acad Sci USA 106 (44): 18740–44. https://doi.org/10.1073/pnas.0909114106.

Persson, Marta, Ywonne Andrén, Christopher A. Moskaluk, Henry F. Frierson, Susanna L. Cooke, Philip Andrew Futreal, Teresia Kling, et al. 2012. “Clinically Significant Copy Number Alterations and Complex Rearrangements of MYB and NFIB in Head and Neck Adenoid Cystic Carcinoma.” Genes, Chromosomes and Cancer 51 (8): 805–17. https://doi.org/10.1002/gcc.21965.

Phuchareon, J., Y. Ohta, J. M. Woo, D. W. Eisele, and O. Tetsu. 2009. “Genetic Profiling Reveals Cross-Contamination and Misidentification of 6 Adenoid Cystic Carcinoma Cell Lines: ACC2, ACC3, ACCM, ACCNS, ACCS and CAC2.” PLoS One 4 (6): e6040. https://doi.org/10.1371/journal.pone.0006040.

Phuchareon, J., J. Overdevest, Frank McCormick, David W. Eisele, Annemieke van Zante, and Osamu Tetsu. 2014. “Fatty Acid Binding Protein 7 Is a Molecular Marker in Adenoid Cystic Carcinoma of the Salivary Glands: Implications for Clinical Significance.” Translational Oncology 7 (6): 780–87. https://doi.org/10.1016/j.tranon.2014.10.003.

Phuchareon, J., Annemieke van Zante, Jonathan B. Overdevest, Frank McCormick, David W. Eisele, and Osamu Tetsu. 2014. “C-Kit Expression Is Rate-Limiting for Stem Cell Factor-Mediated Disease Progression in Adenoid Cystic Carcinoma of the Salivary Glands.” Translational Oncology 7 (5): 537–45. https://doi.org/10.1016/j.tranon.2014.07.006.

Pinero, J., N. Queralt-Rosinach, A. Bravo, J. Deu-Pons, A. Bauer-Mehren, M. Baron, F. Sanz, and L. I. Furlong. 2015. “DisGeNET: A Discovery Platform for the Dynamical Exploration of Human Diseases and Their Genes.” Database (Oxford) 2015. https://doi.org/10.1093/database/bav028.

Pusztaszeri, M. P., P. M. Sadow, A. Ushiku, P. Bordignon, T. A. McKee, and W. C. Faquin. 2014. “MYB Immunostaining Is a Useful Ancillary Test for Distinguishing Adenoid Cystic Carcinoma from Pleomorphic Adenoma in Fine-Needle Aspiration Biopsy Specimens.” Cancer Cytopathol 122 (4): 257–65. https://doi.org/10.1002/cncy.21381.

Qi, Cheng, Yi Shao, Ning Li, Chunyan Zhang, Miaoqing Zhao, and Fei Gao. 2013. “Prognostic Significance of PDCD4 Expression in Human Salivary Adenoid Cystic Carcinoma.” Medical Oncology (Northwood, London, England) 30 (1): 491. https://doi.org/10.1007/s12032-013-0491-1.

Qu, Jing, Min Song, Jian Xie, Xiao-Yu Huang, Xiao-Meng Hu, Rui-Huan Gan, Yong Zhao, et al. 2016. “Notch2 Signaling Contributes to Cell Growth, Invasion, and Migration in Salivary Adenoid Cystic Carcinoma.” Molecular and Cellular Biochemistry 411 (1–2): 135–41. https://doi.org/10.1007/s11010-015-2575-z.

Quintana, A. M., F. Liu, J. P. O’Rourke, and S. Ness. 2011. “Identification and Regulation of C-Myb Target Genes in MCF-7 Cells.” BMC Cancer 11: 30. https://doi.org/10.1186/1471-2407-11-30.

Ramsay, R. G., and T. J. Gonda. 2008. “MYB Function in Normal and Cancer Cells.” Nat Rev Cancer 8 (7): 523–34. https://doi.org/10.1038/nrc2439.

Ramsay, R. G., S. Ishii, and T. J. Gonda. 1991. “Increase in Specific DNA Binding by Carboxyl Truncation Suggests a Mechanism for Activation of Myb.” Oncogene 6 (10): 1875–79.

Rettig, Eleni M., Conover Talbot, Mark Sausen, Sian Jones, Justin A. Bishop, Laura D. Wood, Collin Tokheim, et al. 2016. “Whole-Genome Sequencing of Salivary Gland Adenoid Cystic Carcinoma.” Cancer Prevention Research, February, canprevres.0316.2015. https://doi.org/10.1158/1940-6207.CAPR-15-0316.

Risso, D., J. Ngai, T. P. Speed, and S. Dudoit. 2014. “Normalization of RNA-Seq Data Using Factor Analysis of Control Genes or Samples.” Nat Biotechnol 32 (9): 896–902. https://doi.org/10.1038/nbt.2931.

Robinson, M. D., D. J. McCarthy, and G. K. Smyth. 2010. “EdgeR: A Bioconductor Package for Differential Expression Analysis of Digital Gene Expression Data.” Bioinformatics 26 (1): 139–40. https://doi.org/10.1093/bioinformatics/btp616.

Rushton, J. J., L. M. Davis, W. Lei, X. Mo, A. Leutz, and S. Ness. 2003. “Distinct Changes in Gene Expression Induced by A-Myb, B-Myb and c-Myb Proteins.” Oncogene 22 (2): 308–13. https://doi.org/10.1038/sj.onc.1206131.

Rushton, J. J., and S. Ness. 2001. “The Conserved DNA Binding Domain Mediates Similar Regulatory Interactions for A-Myb, B-Myb, and c-Myb Transcription Factors.” Blood Cells Mol Dis 27 (2): 459–63. https://doi.org/10.1006/bcmd.2001.0405.

Sadasivam, Subhashini, Shenghua Duan, and James A. DeCaprio. 2012. “The MuvB Complex Sequentially Recruits B-Myb and FoxM1 to Promote Mitotic Gene Expression.” Genes & Development 26 (5): 474–89. https://doi.org/10.1101/gad.181933.111.

Sandberg, Mark L., Susan E. Sutton, Mathew T. Pletcher, Tim Wiltshire, Lisa M. Tarantino, John B. Hogenesch, and Michael P. Cooke. 2005. “C-Myb and P300 Regulate Hematopoietic Stem Cell Proliferation and Differentiation.” Developmental Cell 8 (2): 153–66. https://doi.org/10.1016/j.devcel.2004.12.015.
Schaefer, C. F., K. Anthony, S. Krupa, J. Buchoff, M. Day, T. Hannay, and K. H. Buetow. 2009. “PID: The Pathway Interaction Database.” Nucleic Acids Res 37 (Database issue): D674-9. https://doi.org/10.1093/nar/gkn653.
Shao, C., W. Bai, J. C. Junn, M. Uemura, P. T. Hennessey, D. Zaboli, D. Sidransky, J. A. Califano, and P. K. Ha. 2011. “Evaluation of MYB Promoter Methylation in Salivary Adenoid Cystic Carcinoma.” Oral Oncol 47 (4): 251–55. https://doi.org/10.1016/j.oraloncology.2011.01.008.
Shingaki, Susumu, Shohei Kanemaru, Yohei Oda, Kanae Niimi, Toshihiko Mikami, Akinori Funayama, and Chikara Saito. 2014. “Distant Metastasis and Survival of Adenoid Cystic Carcinoma after Definitive Treatment.” Journal of Oral and Maxillofacial Surgery, Medicine, and Pathology 26 (3): 312–16. https://doi.org/10.1016/j.ajoms.2013.09.013.
Skvortsova, Yulia V., Sofia A. Kondratieva, Marina V. Zinovyeva, Lev G. Nikolaev, Tatyana L. Azhikina, and Ildar V. Gainetdinov. 2016. “Intragenic Locus in Human PIWIL2 Gene Shares Promoter and Enhancer Functions.” PLoS ONE 11 (6). https://doi.org/10.1371/journal.pone.0156454.
Spiro, R. H. 1997. “Distant Metastasis in Adenoid Cystic Carcinoma of Salivary Origin.” Am J Surg 174 (5): 495–98.
Stadhouders, R., S. Aktuna, S. Thongjuea, A. Aghajanirefah, F. Pourfarzad, W. van Ijcken, B. Lenhard, et al. 2014. “HBS1L-MYB Intergenic Variants Modulate Fetal Hemoglobin via Long-Range MYB Enhancers.” J Clin Invest 124 (4): 1699–1710. https://doi.org/10.1172/jci71520.
Stadhouders, R., S. Thongjuea, C. Andrieu-Soler, R. J. Palstra, J. C. Bryne, A. van den Heuvel, M. Stevens, et al. 2012. “Dynamic Long-Range Chromatin Interactions Control Myb Proto-Oncogene Transcription during Erythroid Development.” Embo j 31 (4): 986–99. https://doi.org/10.1038/emboj.2011.450.
Stephens, Philip J., Helen R. Davies, Yoshitsugu Mitani, Peter Van Loo, Adam Shlien, Patrick S. Tarpey, Elli Papaemmanuil, et al. 2013. “Whole Exome Sequencing of Adenoid Cystic Carcinoma.” Journal of Clinical Investigation 123 (7): 2965–68. https://doi.org/10.1172/JCI67201.
Su, Bo-Hua, Jing Qu, Min Song, Xiao-Yu Huang, Xiao-Meng Hu, Jian Xie, Yong Zhao, et al. 2014. “NOTCH1 Signaling Contributes to Cell Growth, Anti-Apoptosis and Metastasis in Salivary Adenoid Cystic Carcinoma.” Oncotarget 5 (16). https://doi.org/10.18632/oncotarget.2321.
Subramanian, Aravind, Pablo Tamayo, Vamsi K. Mootha, Sayan Mukherjee, Benjamin L. Ebert, Michael A. Gillette, Amanda Paulovich, et al. 2005. “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles.” PNAS 102 (43). https://doi.org/10.1073/pnas.0506580102.
Suhasini, Modem, and Renate B Pilz. 1999. “Transcriptional Elongation of C-Myb Is Regulated by NF-KB (P50/RelB).” Oncogene 18: 7360–69.
Swiercz, J. M., T. Worzfeld, and S. Offermanns. 2008. “ErbB-2 and Met Reciprocally Regulate Cellular Signaling via Plexin-B1.” J Biol Chem 283 (4): 1893–1901. https://doi.org/10.1074/jbc.M706822200.
Tang, Y. L., Y. L. Fan, J. Jiang, K. D. Li, M. Zheng, W. Chen, X. R. Ma, et al. 2014. “C-Kit Induces Epithelial-Mesenchymal Transition and Contributes to Salivary Adenoid Cystic Cancer Progression.” Oncotarget 5 (6): 1491–1501.
Tang, Ya-ling, Xin Liu, Shi-yu Gao, Hao Feng, Ya-ping Jiang, Sha-sha Wang, Jing Yang, et al. 2015. “WIP1 Stimulates Migration and Invasion of Salivary Adenoid Cystic Carcinoma by Inducing MMP-9 and VEGF-C.” Oncotarget 6 (11): 9031–44. https://doi.org/10.18632/oncotarget.3320.
Thomas-Chollier, Hufton, Heinig, O’Keeffe, El Masri, Roider, Manke, and Vingron. 2011. “Transcription Factor Binding Predictions Using TRAP for the Analysis of ChIP-Seq Data and Regulatory SNPs.” Nature Protocols 6 (12): 1860. https://doi.org/10.1038/nprot.2011.409.
Thorvaldsdóttir, H., J. T. Robinson, and J. P. Mesirov. 2013. “Integrative Genomics Viewer (IGV): High-Performance Genomics Data Visualization and Exploration.” Brief Bioinform 14 (2): 178–92. https://doi.org/10.1093/bib/bbs017.
Tonks, A., L. Pearn, M. Musson, A. Gilkes, K. I. Mills, A. K. Burnett, and R. L. Darley. 2007. “Transcriptional Dysregulation Mediated by RUNX1-RUNX1T1 in Normal Human Progenitor Cells and in Acute Myeloid Leukaemia.” Leukemia 21 (12): 2495–2505. https://doi.org/10.1038/sj.leu.2404961.
Toscani, A., R. V. Mettus, R. Coupland, H. Simpkins, J. Litvin, J. Orth, K. S. Hatton, and E. P. Reddy. 1997. “Arrest of Spermatogenesis and Defective Breast Development in Mice Lacking A-Myb.” Nature 386 (6626): 713–17. https://doi.org/10.1038/386713a0.
Tufegdzic, Vidakovic, Oscar M. Rueda, Stephin J. Vervoort, Ankita Sati Batra, Mae Akilina Goldgraben, Santiago Uribe-Lewis, Wendy Greenwood, Paul J. Coffer, Alejandra Bruna, and Carlos Caldas. 2015. “Context-Specific Effects of TGF-β/SMAD3 in Cancer Are Modulated by the Epigenome.” Cell Reports 13 (11): 2480–90. https://doi.org/10.1016/j.celrep.2015.11.040.
Tusher, V. G., R. Tibshirani, and G. Chu. 2001. “Significance Analysis of Microarrays Applied to the Ionizing Radiation Response.” Proc Natl Acad Sci U S A 98 (9): 5116–21. https://doi.org/10.1073/pnas.091062498.
Uttarkar, Sagar, Emilie Dassé, Anna Coulibaly, Simone Steinmann, Anke Jakobs, Caroline Schomburg, Amke Trentmann, et al. 2016. “Targeting Acute Myeloid Leukemia with a Small Molecule Inhibitor of the Myb/P300 Interaction.” Blood 127 (9): 1173–82. https://doi.org/10.1182/blood-2015-09-668632.
Uttarkar, Sagar, Therese Piontek, Sandeep Dukare, Caroline Schomburg, Peter Schlenke, Wolfgang E. Berdel, Carsten Müller-Tidow, Thomas J. Schmidt, and Karl-Heinz Klempnauer. 2016. “Small-Molecule Disruption of the Myb/P300 Cooperation Targets Acute Myeloid Leukemia Cells.” Molecular Cancer Therapeutics 15 (12): 2905–15. https://doi.org/10.1158/1535-7163.MCT-16-0185.
Wal, J. E. van der, A. G. Becking, G. B. Snow, and I. van der Waal. 2002. “Distant Metastases of Adenoid Cystic Carcinoma of the Salivary Glands and the Value of Diagnostic Examinations during Follow-Up.” Head Neck 24 (8): 779–83. https://doi.org/10.1002/hed.10126.
Wang, Ruinan, Ning Geng, Yuqiao Zhou, Dunfang Zhang, Longjiang Li, Jing Li, Ning Ji, Min Zhou, Yu Chen, and Qianming Chen. 2015. “Aberrant Wnt-1/Beta-Catenin Signaling and WIF-1 Deficiency Are Important Events Which Promote Tumor Cell Invasion and Metastasis in Salivary Gland Adenoid Cystic Carcinoma.” Bio-Medical Materials and Engineering 26 (s1): S2145–53. https://doi.org/10.3233/BME-151520.
Wang, X., J. Hou, C. Quedenau, and W. Chen. 2016. “Pervasive Isoform-Specific Translational Regulation via Alternative Transcription Start Sites in Mammals.” Mol Syst Biol 12 (7). https://doi.org/10.15252/msb.20166941.
Weischenfeldt, J., T. Dubash, A. P. Drainas, B. R. Mardin, Y. Chen, A. M. Stütz, S. M. Waszak, et al. 2017. “Pan-Cancer Analysis of Somatic Copy Number Alterations Implicates IRS4 and IGF2 in Enhancer Hijacking.” Nat Genet 49 (1): 65–74. https://doi.org/10.1038/ng.3722.
West, R. B., C. Kong, N. Clarke, T. Gilks, J. S. Lipsick, H. Cao, S. Kwok, K. D. Montgomery, S. Varma, and Q. T. Le. 2011. “MYB Expression and Translocation in Adenoid Cystic Carcinomas and Other Salivary Gland Tumors with Clinicopathologic Correlation.” Am J Surg Pathol 35 (1): 92–99. https://doi.org/10.1097/PAS.0b013e3182002777.
Wickham, H. 2019. Ggplot2: Elegant Graphics for Data Analysis. Springer. https://www.springer.com/gp/book/9780387981413.
Wong, O. G., T. Nitkunan, I. Oinuma, C. Zhou, V. Blanc, R. S. Brown, S. R. Bott, et al. 2007. “Plexin-B1 Mutations in Prostate Cancer.” Proc Natl Acad Sci U S A 104 (48): 19040–45. https://doi.org/10.1073/pnas.0702544104.
Wood, Derrick E., and Steven L. Salzberg. 2014. “Kraken: Ultrafast Metagenomic Sequence Classification Using Exact Alignments.” Genome Biology 15 (3): R46. https://doi.org/10.1186/gb-2014-15-3-r46.
Xu, Li-Hua, Fei Zhao, Wen-Wen Yang, Chu-Wen Chen, Zhi-Hao Du, Min Fu, Xi-Yuan Ge, and Sheng-Lin Li. 2019. “MYB Promotes the Growth and Metastasis of Salivary Adenoid Cystic Carcinoma.” International Journal of Oncology 54 (5): 1579–90. https://doi.org/10.3892/ijo.2019.4754.
Xu, Yali, Joseph P. Milazzo, Tim D.D. Somerville, Yusuke Tarumoto, Yu-Han Huang, Elizabeth L. Ostrander, John E. Wilkinson, Grant A. Challen, and Christopher R. Vakoc. 2018. “A TFIID-SAGA Perturbation That Targets MYB and Suppresses Acute Myeloid Leukemia.” Cancer Cell 33 (1): 13-28.e8. https://doi.org/10.1016/j.ccell.2017.12.002.
Yi, Chun, Bin-Bin Li, and Chuan-Xiang Zhou. 2016. “Bmi-1 Expression Predicts Prognosis in Salivary Adenoid Cystic Carcinoma and Correlates with Epithelial-Mesenchymal Transition-Related Factors.” Annals of Diagnostic Pathology 22 (June): 38–44. https://doi.org/10.1016/j.anndiagpath.2015.10.015.
Yu, G., L. G. Wang, Y. Han, and Q. Y. He. 2012. “ClusterProfiler: An R Package for Comparing Biological Themes among Gene Clusters.” Omics 16 (5): 284–87. https://doi.org/10.1089/omi.2011.0118.
Yuan, W. 2000. “Intron 1 Rather than 5’ Flanking Sequence Mediates Cell Type-Specific Expression of c-Myb at Level of Transcription Elongation.” Biochim Biophys Acta 1490 (1–2): 74–86.
Zhang, C. L., Y. Zou, W. He, F. H. Gage, and R. M. Evans. 2008. “A Role for Adult TLX-Positive Neural Stem Cells in Learning and Behaviour.” Nature 451 (7181): 1004–7. https://doi.org/10.1038/nature06562.
Zhang, J., B. Han, X. Li, J. Bies, P. Jiang, R. P. Koller, and L. Wolff. 2016. “Distal Regulation of C-Myb Expression during IL-6-Induced Differentiation in Murine Myeloid Progenitor M1 Cells.” Cell Death Dis 7 (9): e2364. https://doi.org/10.1038/cddis.2016.267.
Zhang, P., E. Dimont, T. Ha, D. J. Swanson, W. Hide, and D. Goldowitz. 2017. “Relatively Frequent Switching of Transcription Start Sites during Cerebellar Development.” BMC Genomics 18 (1): 461. https://doi.org/10.1186/s12864-017-3834-z.
Zhang, Ruihan, Jochen Erler, and Jörg Langowski. 2017. “Histone Acetylation Regulates Chromatin Accessibility: Role of H4K16 in Inter-Nucleosome Interaction.” Biophysical Journal 112 (3): 450–59. https://doi.org/10.1016/j.bpj.2016.11.015.
Zhao, M., D. Sano, C. R. Pickering, S. A. Jasser, Y. C. Henderson, G. L. Clayman, E. M. Sturgis, et al. 2011. “Assembly and Initial Characterization of a Panel of 85 Genomically Validated Cell Lines from Diverse Head and Neck Tumor Sites.” Clin Cancer Res 17 (23): 7248–64. https://doi.org/10.1158/1078-0432.CCR-11-0690.
Zhao, Zhi-Li, Si-Rui Ma, Wei-Ming Wang, Cong-Fa Huang, Guang-Tao Yu, Tian-Fu Wu, Lin-Lin Bu, et al. 2015. “Notch Signaling Induces Epithelial-Mesenchymal Transition to Promote Invasion and Metastasis in Adenoid Cystic Carcinoma.” American Journal of Translational Research 7 (1): 162–74.
Zhou, Chenchen, Jeffrey Liu, Yaling Tang, Guiquan Zhu, Min Zheng, Jian Jiang, Jing Yang, and Xinhua Liang. 2012. “Coexpression of Hypoxia-Inducible Factor-2α, TWIST2, and SIP1 May Correlate with Invasion and Metastasis of Salivary Adenoid Cystic Carcinoma.” Journal of Oral Pathology & Medicine: Official Publication of the International Association of Oral Pathologists and the American Academy of Oral Pathology 41 (5): 424–31. https://doi.org/10.1111/j.1600-0714.2011.01114.x.
Zhou, Chuan-Xiang, and Yan Gao. 2006. “Aberrant Expression of β-Catenin, Pin1 and Cylin D1 in Salivary Adenoid Cystic Carcinoma: Relation to Tumor Proliferation and Metastasis.” Oncology Reports 16 (3): 505–11. https://doi.org/10.3892/or.16.3.505.
Zhou, Y., and S. A. Ness. 2011. “Myb Proteins: Angels and Demons in Normal and Transformed Cells.” Front Biosci 16: 1109–31. https://doi.org/3738 [pii].
Zhou, Y., J. P. O’Rourke, J. S. Edwards, and S. A. Ness. 2011. “Single Molecule Analysis of C-Myb Alternative Splicing Reveals Novel Classifiers for Precursor B-ALL.” PLoS One 6 (8): e22880. https://doi.org/10.1371/journal.pone.0022880.
Zhu, Xin, Jia-jie Xu, Si-si Hu, Jian-guo Feng, Lie-hao Jiang, Xiu-xiu Hou, Jun Cao, Jing Han, Zhi-qiang Ling, and Ming-hua Ge. 2014. “Pim-1 Acts as an Oncogene in Human Salivary Gland Adenoid Cystic Carcinoma.” Journal of Experimental & Clinical Cancer Research: CR 33 (December): 114. https://doi.org/10.1186/s13046-014-0114-5.
Zorbas, M., C. Sicurella, I. Bertoncello, D. Venter, S. Ellis, M. Mucenski, and R. Ramsay. 1999. “C-Myb Is Critical for Murine Colon Development.” Oncogene 18: 5821–30. https://doi.org/10.1038/sj.onc.1202971.
