<<

University of Vermont ScholarWorks @ UVM

Graduate College Dissertations and Theses Dissertations and Theses

2019 Maintenance Of Mammary Epithelial Phenotype By Factor Runx1 Through Mitotic Bookmarking Joshua Rose University of Vermont

Follow this and additional works at: https://scholarworks.uvm.edu/graddis Part of the Biochemistry Commons, and the Genetics and Genomics Commons

Recommended Citation Rose, Joshua, "Maintenance Of Mammary Epithelial Phenotype By Runx1 Through Mitotic Gene Bookmarking" (2019). Graduate College Dissertations and Theses. 998. https://scholarworks.uvm.edu/graddis/998

This Thesis is brought to you for free and open access by the Dissertations and Theses at ScholarWorks @ UVM. It has been accepted for inclusion in Graduate College Dissertations and Theses by an authorized administrator of ScholarWorks @ UVM. For more information, please contact [email protected].

MAINTENANCE OF MAMMARY EPITHELIAL PHENOTYPE BY TRANSCRIPTION FACTOR RUNX1 THROUGH MITOTIC GENE BOOKMARKING

A Thesis Presented

by

Joshua Rose

to

The Faculty of the Graduate College

of

The University of Vermont

In Partial Fulfillment of the Requirements for the Degree of Master of Science Specializing in Cellular, Molecular, and Biomedical Sciences

January, 2019

Defense Date: November 12, 2018 Thesis Examination Committee:

Sayyed Kaleem Zaidi, Ph.D., Advisor Gary Stein, Ph.D., Advisor Seth Frietze, Ph.D., Chairperson Janet Stein, Ph.D. Jonathan Gordon, Ph.D. Cynthia J. Forehand, Ph.D. Dean of the Graduate College

ABSTRACT

Breast arises from a series of acquired that disrupt normal mammary epithelial homeostasis and create multi-potent cancer stem cells that can differentiate into clinically distinct subtypes. Despite improved therapies and advances in early detection, breast cancer remains the leading diagnosed cancer in women. A predominant mechanism initiating invasion and migration for a variety of including breast, is epithelial-to-mesenchymal transition (EMT). EMT— a trans-differentiation process through which mammary epithelial cells acquire a more aggressive mesenchymal phenotype—is a regulated process during early development and involves many transcription factors involved in cell lineage commitment, proliferation, and growth. Despite accumulating evidence for a broad understanding of EMT regulation, the mechanism(s) by which mammary epithelial cells maintain their phenotype is unknown. Mitotic gene bookmarking, i.e., transcription factor binding to target during mitosis for post mitotic regulation, is a key epigenetic mechanism to convey regulatory information for cell proliferation, growth, and identity through successive cell divisions. Many phenotypic transcription factors, including the hematopoietic Runt Related Transcription Factor 1 (RUNX1/AML1), bookmark target genes during mitosis. Despite growing evidence, a role for mitotic gene bookmarking in maintaining mammary epithelial phenotype has not been investigated. RUNX1 has been recently identified to play key roles in breast cancer development and progression. Importantly, RUNX1 stabilizes the normal breast epithelial phenotype and prevents EMT through repression of EMT-initiating pathways [9]. Findings reported in this thesis demonstrate that RUNX1 mitotically bookmarks both RNA Pol I and II transcribed genes involved in proliferation, growth, and mammary epithelial phenotype maintenance. Inhibition of RUNX1 DNA binding by a specific small molecule inhibitor led to phenotypic changes, , differences in global synthesis, and differential expression of ribosomal RNA as well as protein coding genes and long non-coding RNA genes involved in cellular phenotype. Together these findings reveal a novel epigenetic regulatory role of RUNX1 in normal- like breast epithelial cells and strongly suggest that mitotic bookmarking of target genes by RUNX1 is required to maintain breast epithelial phenotype. Disruption of RUNX1 bookmarking results in initiation of epithelial to mesenchymal transition, an essential first step in the onset of breast cancer.

ACKNOWLEDGEMENTS

A very sincere and heartfelt thanks is given to Kaleem Zaidi, for his patience, guidance, and help throughout my entire graduate experience under his mentorship. His teachings and experiences have molded me into a better scientist as well as a better person, and I couldn’t imagine having chosen anyone else better to work alongside during my graduate research. A special thanks is also given to Gary Stein, Janet Stein, and Jane

Lian for their continuing guidance and support throughout my time in their lab, for without which my research experience would not have been possible.

I would also like to thank Joseph Boyd for his work performed conducting bioinformatic analysis, as well as for his patience and effort put forth on this project.

Without his expertise, this work would not have been possible. I really appreciate and thank Jonathan Gordon for his weekly guidance and suggestions to further my research in an efficient yet scientifically responsible manner. Sincere thanks also goes to

Andrew Fritz as well, for his wealth of information and ideas on top of the efforts he put forth with this project; it was greatly appreciated. I would also like to acknowledge and thank Mingu Kang, for his help mastering ChIP-Seq protocols and for being a phenomenal lab partner. I’d also like to thank Eliana Moskovitz in advance for her effort and work on this project after my departure from the lab; her timely undertaking on this project helped facilitate an advancement in my career as a scientist for which

I’m greatly appreciative. Additional thanks to all current (Prachi, Kirsten, Coralee,

Nick, Jamie, Jessica, Shelby, Caroline, and Natalie) and past members of the Stein-Lian

ii laboratory. From conducting the research to which my project was based on, to helping me find reagents and answering any of my questions, to having fun talks in the break room; it was tremendously appreciated.

I’d like to thank my thesis examination committee (Kaleem Zaidi, Janet Stein, Gary

Stein, Jonathan Gordon, and Seth Frietze) primarily for their additional time spent with me and my research on top of their already abundant commitments, as well as their guidance and mentorship to helping me conduct successful research.

Lastly, I’d like to thank my fiancé Katherine Pouliot. There have been many late nights and time away from her while working on this research project. Her patience, understanding, and unrelenting support for me has been the single greatest driving factor I’ve ever experienced in my life. Without her, this would not have been possible.

iii

TABLE OF CONTENTS

Page

ACKNOWLEDGEMENTS ...... ii

LIST OF TABLES ...... vii

LIST OF FIGURES ...... viii

CHAPTER 1: BREAST CANCER ...... 1

1.1. Overview...... 1

1.2. Clinical Distinctions of Breast Cancer ...... 1

1.3. Cellular Models for Investigating Breast Cancer ...... 4

1.4. Normal Mammary Gland Development ...... 7

CHAPTER 2: EPITHELIAL-TO-MESENCHYMAL TRANSITION (EMT) ...... 14

2.1. Biological Process of EMT ...... 14

CHAPTER 3: MITOTIC GENE BOOKMARKING ...... 17

3.1. Phenotypic Transcription Factors Maintain Cellular Phenotype ...... 17

3.2. Timeline of Mitotic Gene Bookmarking ...... 18

3.3. Supporting Evidence of Mitotic Gene Bookmarking ...... 21

3.4. Mitotic Gene Bookmarking Maintains Cancer Phenotypes ...... 23

3.5. Models and Techniques for Investigation of Mitotic Gene Bookmarking ...... 25

iv

CHAPTER 4: AML1/RUNX1 ...... 32

4.1. AML1/RUNX1 Overview and Introduction ...... 32

4.2. RUNX1 in Hematologic Malignancies ...... 35

4.3. RUNX1 in Breast Cancer ...... 37

4.4. RUNX and Mitotic Gene Bookmarking ...... 39

CHAPTER 5: RATIONALE FOR INVESTIGATION AND CENTRAL HYPOTHESIS ...... 41

RESULTS ...... 42

RUNX1 associates with mitotic chromatin and occupies target genes ...... 42

RUNX1 mitotically bookmarks RNA Pol I-transcribed genes involved in cell growth ...... 50

RUNX1 mitotically bookmarks RNA Pol II-transcribed genes involved in hormone responsiveness and cell phenotype ...... 54

DISCUSSION ...... 62

RUNX1 associates with mitotic chromatin at RNA Pol I and II-transcribed genes ... 62

Refined Role of RUNX1 in mammary gland development...... 63

RUNX1 regulation of ...... 65

RUNX1 regulation of HES1 ...... 66

RUNX1 regulation of H2AFX...... 67

RUNX1 regulation of lncRNAs NEAT1/NEAT2 (MALAT1) ...... 69 v

Requirement for mitotic bookmarking of genes in maintenance of an epithelial phenotype ...... 70

MATERIALS AND METHODS ...... 72

Cell Culture Techniques ...... 72

Protein Expression and Localization ...... 73

Molecular Techniques ...... 77

SUPPLEMENTAL MATERIAL ...... 81

REFERENCES ...... 89

vi

LIST OF TABLES

TABLE 1. RUNX1 MITOTICALLY BOOKMARKED GENES IN MCF10A BREAST EPITHELIAL CELLS. (N=417 TOTAL) ………………………………………………………………………………………………………………81

TABLE 2. RUNX1 BOOKMARKED GENES THAT ARE BOUND BY ER AND ARE DIFFERENTIALLY EXPRESSED (EITHER UP OR DOWN) UPON TREATMENT. ………………………………………………………………………………………………………………88

vii

LIST OF FIGURES

FIGURE 1. THE DIFFERENT INTRINSIC SUBTYPES OF BREAST CANCER. FROM DAI X, ET AL 2015 [3] ………………………………………………………………………………………………………………..2

FIGURE 2. CELL LINE SUBTYPES FOR INVESTIGATING BREAST CANCER. FIGURE ADAPTED FROM DAI X, ET AL 2017 [5] ……………………………………………….……………………………………………………………….5

FIGURE 3. MAMMARY GLAND DEVELOPMENT. FIGURE FROM INMAN JL, ET AL 2015. [1] ………………………………………………………………………………………………………………..8

FIGURE 4. ER AND PR SIGNALING PATHWAYS IN BREAST EPITHELIAL CELLS. FIGURE FROM TANOS T, ET AL 2012. [6] …………………………………………………………………………………………………………...….11

FIGURE 5. OVERVIEW OF EMT AND MET. FIGURE FROM LAMOUILLE, XU, AND DERYNCK 2014. [2] ………………………………………………………………………………………………………………14

FIGURE 6. MITOTIC RETENTION OF TRANSCRIPTION FACTORS ON CHROMATIN, WITH HIGHLIGHTED DIFFERENCES IN SIGNAL OF RNA POL I VS II TRANSCRIBED GENES. FIGURE ADAPTED FROM ZAIDI SK, ET AL. 2011 [8] ………………………………………………………………………………………………………………17

FIGURE 7. CURRENT ESTABLISHED PROTEINS KNOWN TO MITOTICALLY BOOKMARK TARGET GENES, AS WELL AS THE SIGNALING PATHWAYS AND CHROMATIN MODIFYING PROPERTIES ASSOCIATED WITH MITOTIC BOOKMARKING. FIGURE FROM ZAIDI SK, ET AL. 2018 [7] ………………………………………………………………………………………………………………20

FIGURE 8. A PROPOSED HYPOTHESIS FOR CHEMICAL ARTIFACTS REGARDING EXPERIMENTAL VISUALIZATION OF TRANSCRIPTION FACTOR RETENTION. FIGURE FROM TEVES SS ET AL. 2016 [3] ………………………………………………………………………………………………………………29

FIGURE 9. RUNX1’S DNA- AND PROTEIN-INTERACTING DOMAINS WITH CO-BINDING PARTNERS IDENTIFIED. AD: ACTIVATING DOMAIN, ID: INHIBITORY DOMAIN. FIGURE FROM ITO Y ET AL, 2015. [4] ………………………………………………………………………………………………………………32

FIGURE 10. VALIDATED EXPERIMENTAL APPROACH FOR ANALYSIS OF RUNX1 PROTEIN WITHIN MCF10A BREAST EPITHELIAL CELLS. A) PROTOCOL FOR HARVEST OF THREE MCF10A CELL POPULATIONS: DMSO = ASYNCH (A), MITOTIC = BLOCKED (B)(M), G1 = RELEASED (R)(G1). B) FLUORESCENTLY ACTIVATED CELL SORTING (FACS) ANALYSIS OF PI/RNASE STAINED MCF10A TREATMENT GROUPS. C) WESTERN BLOT OF MCF10A TREATMENT GROUPS FOR -SPECIFIC PROTEINS CYCLIN B1 (TOP), CDT1 (MIDDLE), AND LAMIN B (BOTTOM). ………………………………………………………………………………………………………………42

viii

FIGURE 11. WESTERN BLOT OF ASYNCH (A), MITOTIC (M), AND G1 MCF10A CELL POPULATIONS FOR RUNX1 (TOP) AND TUBULIN (BOTTOM). ………………………………………………………………………………………………………………43

FIGURE 12. LIMIT OF DETECTION FOR RUNX1 WITHIN ACTIVELY PROLIFERATING MCF10A CELLS. BLUE = DAPI, GREEN = RUNX1. RUNX1 FOCI ARE INDICATED BY WHITE ARROWS. A) AN ANTIBODY VALIDATED FOR USE IN WESTERN BLOT AND IMMUNOFLUORESCENCE (CST 4334S) WAS DILUTED TO 1:200 (TOP), 1:50 (MIDDLE), AND 1:10 (BOTTOM). B) A CHIP-GRADE ANTIBODY (CST 4334BF) WAS DILUTED TO 1:200 (TOP), 1:600 (MIDDLE), AND 1:1000 (BOTTOM). ………………………………………………………………………………………………………………44

FIGURE 13. RUNX1 IS PRESENT IN EACH TOPOLOGICALLY IDENTIFIED SUBSTAGE OF MITOSIS IN THE FORM OF MAJOR AND MINOR FOCI. BLUE = DAPI, GREEN = RUNX1. ………………………………………………………………………………………………………………45

FIGURE 14. GEL ELECTROPHORESIS OF SONICATED LYSATES (N=2 BIOLOGICAL REPLICATES) FOR CHIP. R = REPLICATE. A) SONICATED LYSATES FROM DMSO (ASYNCH) HARVESTED MCF10A CELLS. B) SONICATED LYSATES FROM MITOTIC AND G1 HARVESTED MCF10A CELLS. ………………………………………………………………………………………………………………46

FIGURE 15. BIOANALYZER TRACES OF POOLED AND SIZE SELECTED LIBRARIES REVEAL PROPER FRAGMENTATION RANGE FOR ACCURATE SEQUENCING. A) POOLED INPUT LIBRARIES. B) POOLED RUNX1-CHIP LIBRARIES. ………………………………………………………………………………………………………………47

FIGURE 16. RUNX1 OCCUPIES PROTEIN CODING GENES AND LONG NON-CODING RNAS ACROSS ASYNCHRONOUS, MITOTIC, AND G1 POPULATIONS OF MCF10A BREAST EPITHELIAL CELLS. A) HEATMAPS SHOWING PEAKS CALLED BETWEEN A, M, AND G1 MCF10A CELLS (LEFT, MIDDLE, AND RIGHT RESPECTIVELY). B) VENN DIAGRAMS ILLUSTRATING THE NUMBER LNCRNAS (LEFT DIAGRAM) AND PROTEIN CODING GENES (RIGHT DIAGRAM) IDENTIFIED WITHIN AND BETWEEN A, M, AND G1 MCF10A POPULATIONS. C) GENE SET ENRICHMENT ANALYSIS RESULTS FROM INTERROGATING MITOTICALLY BOOKMARKED GENES (I.E. GENES CALLED WITHIN BLOCKED POPULATION WITHIN 5KB OF TSS IN MCF10A POPULATIONS) AGAINST HALLMARK GENE SETS FROM MOLECULAR SIGNATURES DATABASE (MSIGDB). THE TOP 10 MOST SIGNIFICANTLY OVERLAPPING GENE SETS ARE SHOWN FROM TOP TO BOTTOM. ………………………………………………………………………………………………………………48

FIGURE 17. CHIP-SEQ TRACKS FOR EACH POPULATION OF MCF10A CELLS AT HUMAN RIBOSOMAL DNA REGIONS. FE = FOLD ENRICHMENT. ………………………………………………………………………………………………………………50

FIGURE 18. RUNX1 COLOCALIZES WITH UPSTREAM BINDING TRANSCRIPTION FACTOR (UBF) DURING MITOSIS. BLUE = DAPI, GREEN = RUNX1, RED = UBF.

………………………………………………………………………………………………………………51

ix

FIGURE 19. CONFOCAL IMMUNOFLUORESCENCE MICROSCOPY LINE PROFILES VALIDATING RUNX1 AND UBF COLOCALIZATION WITHIN MCF10A BREAST EPITHELIAL CELLS. A) LINE PROFILES REPRESENTATIVE FOR N=15 INTERPHASE CELLS. B) LINE PROFILES REPRESENTATIVE FOR N=15 METAPHASE CELLS. ………………………………………………………………………………………………………………52

FIGURE 20. RUNX1 REGULATES PRE-RRNA EXPRESSION. EXPRESSION LEVELS WERE MADE RELATIVE TO BETA-. AI-4-88 = INACTIVE INHIBITOR, AI-14-91 = ACTIVE INHIBITOR. ………………………………………………………………………………………………………………53

FIGURE 21. RUNX1 REGULATES GLOBAL PROTEIN SYNTHESIS WITHIN MCF10A BREAST EPITHELIAL CELLS. LEFT = PHASE CONTRAST, MIDDLE/GREEN = DNA (SYTOX GREEN), RIGHT/RED = NASCENT PROTEINS. “PROTEIN LABEL” INDICATES THE WELL DESIGNATION IN WHICH CELLS WOULD BE INCUBATED WITH THE PROTEIN LABEL AS OPPOSED TO CYCLOHEXIMIDE OR NO LABEL FOR PROTEINS. A) MCF10A CELLS THAT HAVE BEEN TREATED WITH EITHER INHIBITOR FOR 24HR. B) MCF10A CELLS THAT HAVE BEEN TREATED WITH EITHER INHIBITOR FOR 48HR. ………………………………………………………………………………………………………………54

FIGURE 22. SCATTERPLOT OF ER-BOUND GENES EITHER UP OR DOWN REGULATED IN RESPONSE TO ESTROGEN WITH RUNX1 BOOKMARKED GENES IDENTIFIED. ………………………………………………………………………………………………………………55

FIGURE 23. CHIP-SEQ TRACKS RUNX1 BOOKMARKED RNA POL II TRANSCRIBED GENES IMPORTANT FOR EPITHELIAL PHENOTYPE MAINTENANCE. A) H2AFX GENE CHIP-SEQ TRACKS. B) HES1 GENE CHIP-SEQ TRACKS. ………………………………………………………………………………………………………………56

FIGURE 24. RUNX1 REGULATES H2AFX AND HES1 EXPRESSION IN MCF10A BREAST EPITHELIAL CELLS. A) EXPRESSION OF H2AFX IN THE PRESENCE OF RUNX1 INHIBITORS FROM 6HR, 12HR, 24HR, AND 48HR. B) EXPRESSION OF HES1 IN THE PRESENCE OF RUNX1 INHIBITORS FROM 6HR, 12HR, 24HR, AND 48HR. ………………………………………………………………………………………………………………57

FIGURE 25. RUNX1 INHIBITION LEADS TO MORPHOLOGICAL CHANGES IN MCF10AS VISIBLE THROUGH BOTH PHASE CONTRAST AND IMMUNOFLUORESCENCE MICROSCOPY WITHIN 48HRS. A) RUNX1 INHIBITOR AI-4-88 (INACTIVE, TOP ROW) IN MCF10A CELLS AT 0HR, 24HR, AND 48HR. RUNX1 INHIBITOR AI-14-91 (ACTIVE, BOTTOM ROW) IN MCF10A CELLS AT 0HR, 24HR, AND 48HR. B) IMMUNOFLUORESCENCE MICROSCOPY OF PROMETAPHASE MCF10A CELLS TREATED WITH EITHER INHIBITOR FOR 48HR. BLUE = DAPI, GREEN = RUNX1, RED = UBF, PURPLE = F-ACTIN. ………………………………………………………………………………………………………………58

FIGURE 26. RUNX1 INHIBITION CAUSES CHANGES IN CELLULAR MORPHOLOGY OF MCF10A BREAST EPITHELIAL CELLS TO A MORE MESENCHYMAL-LIKE PHENOTYPE. (TOP ROW, FROM LEFT TO RIGHT) = 0HR, 24HR, AND 48HR TIMEPOINTS. (BOTTOM ROW, FROM LEFT TO RIGHT) = 72HR, 96HR, AND 120HR. ………………………………………………………………………………………………………………59

x

FIGURE 27. RUNX1 MITOTICALLY BOOKMARKS RNA POL I AND II-TRANSCRIBED GENES FOR MAINTENANCE OF NORMAL MAMMARY EPITHELIAL PHENOTYPE. ………………………………………………………………………………………………………………61

FIGURE 28. RUNX1 INHIBITION LEADS TO INITIAL REPRESSION FOLLOWED BY ~10- FOLD ACTIVATION OF SNAIL2/SLUG. ………………………………………………………………………………………………………………68

FIGURE 29. WESTERN BLOT ANALYSIS OF MCF10A CELLS TREATED WITH EITHER INHIBITOR FOR 6, 12, 24, OR 48HR FOR EMT TARGETS CDH1 AND VIMENTIN. PARENTAL AND MDA-MB-231 WELLS WERE LOADED FOR MAXIMUM VOLUME, NOT EQUAL PROTEIN LOADING. 20UG OF PROTEIN WAS RAN FOR EACH INHIBITOR SAMPLE. ………………………………………………………………………………………………………………69

xi

CHAPTER 1: BREAST CANCER

1.1. Overview

Breast cancer arises from a series of acquired mutations that lead to a clonal

expansion of cells that disrupt normal mammary homeostasis. According to recent

statistics, one out of eight women will develop invasive breast cancer over the course of

her lifetime [10]. Breast cancer is the most common cancer among women and the

second most common cause of cancer-related death in women [11, 12]. Incidence of

breast cancer appears to remain unchanged within the past few decades, however, death

rate from breast cancer has actually decreased [13]. This is thought to be from

improved technologies for earlier detection and better practices in breast cancer

screening, rather than from improved therapies. Five-year survival rates between 2007-

2013 reflect this, as patients staged with localized breast cancer at diagnosis have a

99% five-year survival, whereas distant (metastatic) staged breast cancer at diagnosis

only has 27% five-year survival [13]. The total number of expected deaths in the

United States due to breast cancer in 2017 were around 40,000 [12]. To further reduce

the number of breast cancer related deaths in women, the mechanisms of breast cancer

onset and progression need to be better understood.

1.2. Clinical Distinctions of Breast Cancer

Breast cancer tumors are traditionally classified based on morphologic features and further classified by tumor size, tumor grade, lymph node involvement, margin status, and lymph vascular invasion (Fig. 1) [14-16]. Clinically predictive biological

1 markers used in breast cancer diagnosis are estrogen (ER), receptor

(PR), and Her-2/neu receptor status (HER-2) [17]. Further subtype classification can be described by the molecular makeup of the tumor: ER+ (luminal A or B), ER- (Her-2/neu, triple negative, normal breast-like) [18-20].

Figure 1 The different intrinsic subtypes of breast cancer. From Dai X, et al 2015 [3]

Luminal A tumors are the largest constituent of breast cancers, making up about

60% of total breast cancers [15]. They are primarily characterized by an increased expression of ESR-1 ( gene), GATA3, FOXA1, and LIV1 as well as a lack of expression of Her-2/neu [14]. Luminal A ER+ tumors have a higher recurrence- free survival and superior survival outcome compared to other tumor types [15, 20-23].

Luminal B tumors are characterized by higher expression of ESR-1, cytokeratins

(CK8/18), and genes associated with proliferation such as CCNB1, MYBL2, and MKI67

[19]. Luminal B tumors can be further classified into Her-2/neu positive or negative. In comparison to luminal A tumors, luminal B tumors are more heterogeneous in their 2 molecular makeup, have higher proliferation rates, and tend to have worse clinical outcomes and chances of relapse [15, 21, 23, 24].

Constituting about 15% of total breast cancer tumors are HER-2 enriched subtype classifications. Similar in makeup to luminal B tumors, these are ER- and defined by a molecular makeup of increased HER-2/Neu , genes associated with proliferation, and specific genes related to HER-2/Neu: GRB7 and

TRAP100 [14]. About 40% of HER-2/Neu enriched tumor subtypes also possess mutations involving TP53. HER-2/Neu tumors tend to behave more aggressively and therefore respond better to chemotherapeutic treatments [25-27]. These tumors have a higher percent chance of local recurrence in comparison to luminal A subtypes and are commonly seen to metastasize to bone and liver [28, 29].

Basal-like tumors (triple negative, i.e., negative for ER, PR, and HER-2) can be classified by an ER- status, a decrease in gene expression related to ER and HER-2/Neu.

Additionally, these tumors are also classified with an upregulation of KRT5, KRT17,

CX3CL1, ANXA8, and TRIM29, genes pertaining to basal cell phenotype. Additionally, it is estimated that about 75% of tumors classified by basal-like subtype possess mutations in TP53 [14]. Epidermal growth factor receptor (EGFR) levels are also increased in basal-like tumors. In comparison to luminal subtypes, basal-like tumors have a worse prognosis, shorter relapse-free survival, and an increase in metastases to the central nervous system [29].

In addition to the well-established subtypes discussed above, lesser-known subtypes of breast carcinoma have also been defined. Of these, claudin low subtypes are

3 perhaps the most well defined [14]. These subtypes closely resemble basal-like tumor subtypes but also have lower levels of both proliferative genes and the cytokeratins that are commonly expressed in basal-like tumor subtypes. A distinguishing feature of claudin low subtypes is higher expression of genes involved in other cellular processes contributing to cancer phenotypes, such as cell-cell communication, differentiation, and angiogenesis. Interestingly, genes primarily involved in the immune system are also overexpressed, while genes involved in cell-cell adhesion have decreased expression.

This aspect is how the claudin subtype received its name; decreased expression of claudin genes along with E cadherin [14, 30].

Another lesser known molecular apocrine subtype has been previously identified

[31]. This tumor subtype is ER- with overexpression of ERRB2 gene and an increase in signaling. ERRB2 gene encodes a receptor kinase that is thought to both stabilize and alter androgen receptor function as a result from overexpression studies [32]. Lastly, the most recently suggested subtype of breast carcinomas is the interferon-related group subtype. The classification comes from breast tumors that emphasize higher expression of genes involved in proliferation and those related to interferons, specifically STAT1 [33, 34]. Involvement of immune-related processes may play a role in the formation of favorable tumor microenvironments, therefore exacerbating cancer growth.

1.3. Cellular Models for Investigating Breast Cancer

4

There are currently 73 cell lines and hybridoma’s available on ATCC® and documented in Cancer Cell Line Encyclopedia (CCLE – Broad/Novartis) originating from humans that are related to and used in breast cancer research. Depending on the purpose of experimentation, different breast cancer cell lines have been developed to explore targeted areas of breast cancer (Fig. 2). For example, MDA-MB-231 is a breast epithelial cell line derived from human mammary gland/breast with breast adenocarcinoma. This cell line exhibits metastatic behavior and is suitable for exploring molecular mechanisms of in breast epithelial cells [35, 36]. Another model of

Figure 2. Cell line subtypes for investigating breast cancer. Figure adapted from Dai X, et al 2017 [5]

5 tumor subtypes are MCF-7 cells [37]. These cells are also an epithelial cell line derived from human mammary gland/breast tissue that express estrogen receptor as well as the

WNT7B oncogene, making this cell line a suitable model for exploring responses to hormone-based therapies at an in vitro level [38].

One other model of tumor subtypes is the MCF10 series of cell lines. First isolated from a 36 year old Caucasian woman with benign fibrocystic disease, a non- tumorigenic breast epithelial cell line termed MCF10A cells were isolated for culture

[39]. These cells spontaneously immortalized and serve as a suitable model for exploring normal-like breast epithelial phenotype and the beginning of cancer formation.

Transcriptional profiling of various breast cancers revealed subtypes that closely resembled normal epithelial-like breast cells, which suggests that the biology and phenotype of MCF10A cells are ideal for further understanding the initiating events of breast cancer onset and progression [18, 19, 40]. To develop a cell-based model for breast cancer progression, MCF10A cells were overexpressed with oncogenic H-Ras to create a premalignant breast epithelial cell line termed MCF10-AT1 [41]. Repeated subcutaneous injection and harvest of MCF10AT1 cells in mice was used to generate the MCF10-CA1a cell line [42]. Repeated passage and harvest of these cells in mice yielded a new aggressive phenotype due to transformation. MCF10-CA1a cells exhibit the highest degree of metastasis and cell proliferation in comparison to MCF10A and MCF10AT1, making them a suitable model to study the molecular mechanisms underlying the advanced stage breast cancer. Importantly, the three cell lines have near normal karyotype, making them ideal for studies examining changes in higher order genome

6 organization. Taken together, these three cell lines are a powerful cellular model for discovering the molecular mechanisms driving breast cancer progression from a normal, epithelial-like phenotype (MCF10A), premalignant phenotype (MCF10AT1), and a metastatic/aggressive phenotype (MCF10A-CA1a).

1.4. Normal Mammary Gland Development

The mammary gland is a dynamically changing organ dependent upon hormonal cues in women. Changes that arise after proper development of the mammary gland utilize cell-fate specification and cell differentiation pathways that are vital for tissue remodeling. In breast cancer, these pathways are dysregulated and are therefore informative to understand the onset and progression of breast cancer.

Mammary gland development initiates in a developing embryo. Developmental studies in mice indicate the presence of early mammary gland lines starting at embryonic day 10.5 and further progressing into more differentiated mammary by day

13.5 with a fat pad mesenchyme forming at day 14.5 [1, 43-45]. Epithelial cells overlaying the mesenchyme form the nipple and form the beginnings of a ductal tree underneath the nipple at day 18.5 [44, 46]. Following embryonic development, mammary epithelial cells near the nipple remain quiescent. It is during where hormonal cues are sensed by mammary epithelial cells and they invade the fat pad. The invading branch is made up of epithelial cells, where leading cells of the branch are comprised of epithelial cells that exhibit mesenchymal-like behavior, suggesting that regulated epithelial-to-mesenchymal transition (EMT) plays a role in this invasion process [47-50].

7

Inactivation of this branching isn’t entirely understood, although endogenous production of transforming growth factor beta (TGFβ) and mechanical sensing of the tissue microenvironment have been shown to play roles [48]. A more finalized glandular structure consisting of a bilayer of epithelial cells (luminal and myoepithelial cells) upon inactivation of the branching takes shape and is referred to as the virgin mammary gland.

The mammary gland undergoes further changes later during estrus cycles and pregnancy,

Figure 3. Mammary gland development. Figure from Inman JL, et al 2015. [1] eventually leading to the involution of the gland and a return to a pre-pregnant state through tissue remodeling programs [1, 51, 52].

Various types of epithelial cells make up the fully formed and developed adult mammary gland (Fig. 3). Briefly summarized, the ducts throughout the mammary gland 8 are oriented in apical fashion with epithelial bilayer lining either side of a duct. The bilayer has luminal epithelial cells in contact with the lumen of the duct and a myoepithelial cell layer that is in contact with the basement membrane of surrounding breast tissue. Luminal cells are terminally differentiated during pregnancy to produce and secrete milk proteins into the duct and, when coupled with contraction of myoepithelial cells, milk is released from the mammary gland [53, 54]. These cells receive hormonal and other cues that triggers further development and tissue remodeling in a regulated fashion when necessary, i.e. during postnatal development, pregnancy.

There are molecular mechanisms in place that commit epithelial cells in the mammary gland to remain differentiated epithelial cells, yet this is still largely an unanswered and compelling question. During events such as pregnancy, where the mammary gland receives an influx of ovarian hormones such as estrogen and progesterone, the mechanisms that regulate controlled differentiation and commitment to an epithelial lineage are altered in a tightly regulated manner to facilitate a new proliferative phenotype for the cells. During cancer onset, these regulated differentiation processes appear to be perturbed.

Mammary epithelial cells respond to estrogen through estrogen receptor α (ERα) activity and progesterone through (PR) activity. Upon ligand binding, these receptors translocate into the nucleus, where the receptor binds to target genes and can recruit co-activating or co-repressing proteins. These steroid hormone receptors also have effects on one another, as ERα activity also increases PR activity, yet how activities influence one another are still being understood [55-57].

9

Epithelial cells that are ER+ act in a paracrine fashion by acting on nearby/adjacent epithelial cells to initiate a proliferative phenotype which facilitates further expansion and ductal invasion; this paracrine activation is required for ductal expansion [58-60]. As a result of the ERα activity, ER+ cells release epidermal growth factor (EGF), transforming growth factor alpha (TGFα), and amphiregulin, all of which are required for ductal expansion [55, 61-63]. Progesterone-activated mediators that play a role in progesterone- induced proliferation include RANKL, WNT-4, and Cyclin D1 [64-66]. These hormone- driven mechanisms trigger expansion of mammary ducts through an increase in epithelial

10 cell proliferation, however there are additional mechanisms that regulate and control mammary epithelial cell differentiation.

The activities of different transcription factors in mammary epithelial cells play important roles in determining cell lineage commitment and cell fate (Fig. 4). For example, GATA3 is expressed in luminal epithelial cells and is important for directing luminal epithelial cell differentiation [67]. This differentiation is achieved through

GATA3 acting upon FOXA1, a pioneer transcription factor which mediates ERα expression [68]. By altering ERα expression, it also inherently affects the paracrine

Figure 4. ER and PR signaling pathways in breast epithelial cells. Figure from Tanos T, et al 2012. [6]

11 mediators released from ER-responsive epithelial cells, perhaps providing phenotypic transcription factors a more important role for dictating differentiation and cell lineage commitment in mammary tissue. Another transcription factor shown to be important in luminal epithelial cell expansion is CCAAT/ Binding Protein Beta (C/EBPβ).

PR levels in mammary glands of C/EBPβ -/- mice were elevated while lobuloalveolar development was inhibited, indicating PR is parallel or downstream of C/EBPβ [69].

Other transcription factors that act downstream of estrogen and progesterone activity are also important in cell fate determination and lineage commitment. For example, STAT5a is important for glandular development during pregnancy, and the phosphorylated form of STAT5a (p-STAT5a) is required for branching and proliferation of the mammary gland [70-72]. Both PR and ERα regulate STAT5a expression [73].

E74-like factor 5 (Elf5) is yet another transcription factor involved in epithelial cell differentiation (specifically alveolar cells). RANKL is shown to regulate Elf5 levels, and

RANKL is regulated by progesterone receptor activity [74, 75]. These transcription factors play roles in determining cell fate upstream and downstream of ERα and PR activity.

Deregulation of differentiation processes operative in mammary epithelial cells can lead to uncontrolled proliferation and potentially breast cancer formation. The traditional view of breast cancer onset and progression into various subtypes stems from the different epithelial cells present in the mammary gland (luminal and basal). Insults to the DNA in these cells leads to mutations that result in a multi-potent cell capable of initiating tumor growth [76]. These insults could alter known oncogenes to facilitate

12 uncontrolled growth, tumor suppressor genes to prevent the suppression of proliferation/growth, as well as disrupt genes involved in mammary developmental processes. Cells of the primary tumor in certain breast cancer subtypes often mimic the originating epithelial cell lineage where the insult occurred [77]. However, this is not always the case, as more recent studies show that it is possible to generate multiple subtypes from a single lineage [78]. Importantly, epigenetic mechanisms also facilitate various aspects of breast cancer formation and progression, however are not quite as well understood. The fundamental aspect of breast cancer formation is that mammary epithelial cells differentiate from their intended tissue-specific function, i.e. lose epithelial phenotype, and give rise to a multi-potent cell. There are mechanisms in place for mammary epithelial cells to maintain their epithelial phenotype, but how these mechanisms stabilize the epithelial phenotype is not well understood.

13

CHAPTER 2: EPITHELIAL-TO-MESENCHYMAL TRANSITION (EMT)

2.1. Biological Process of EMT

Little is known regarding the precise mechanisms by which normal mammary epithelial cell structure and function are maintained and protected from transitioning into a more malignant and cancer-like phenotype through epithelial-to-mesenchymal

Figure 5. Overview of EMT and MET. Figure from Lamouille, Xu, and Derynck 2014. [2] transition (EMT) (Fig. 5). Pathways and proteins involved in EMT sheds significant insight into the mechanisms that govern mammary epithelial cell maintenance.

14

EMT is an essential physiological process that allows a polarized epithelial cell to undergo a series of biochemical changes and assume a more mesenchymal-like phenotype, i.e. enhanced migratory capacity, invasiveness, resistance to apoptosis, and increased production of extracellular matrix (ECM) components [79, 80]. EMT was first identified as a vital process in embryonic development, specifically during gastrulation

[81, 82]. Subsequent research and increased understanding of EMT now supports a model of three separate types of EMT. The first type of EMT is associated with implantation, embryo formation, and organ development where these EMTs produce cells with mesenchymal-like phenotypes that lack invasive properties [80, 83]. The second type of

EMTs associate with tissue regeneration, wound healing, and organ fibrosis; this type of

EMT is tightly coordinated with inflammatory responses. The third type of EMT is associated with cancer formation. Briefly, oncogenes and tumor suppressors hijack mechanisms underlying EMT to facilitate clonal expansion of a plastic cell that often metastasizes to a new tissue after acquiring more mesenchymal-like phenotypes.

However, this third type of EMT also produces a spectrum of cells with both epithelial and mesenchymal-like traits rather than strictly epithelial or mesenchymal phenotype [80,

84]. For a cell to metastasize (which most likely possesses more mesenchymal traits), invade, and seed secondary tumor growth resembling the primary tumor at a distant site, the cell must undergo mesenchymal-to-epithelial transition (MET). Type 3 EMT, involved in cancer metastasis and tumor formation, has been challenging to study because of the plasticity of affected cells to undergo MET. However, there is considerable evidence illustrating the various factors involved in this process.

15

Under normal circumstances, EMT is a tightly controlled and regulated process involving cross talk between transcription factors, signaling pathways, and miRNAs.

Transcription factors that include twist basic helix-loop-helix transcription factor 1

(TWIST1), snail family 1 (SNAI1), snail family zinc finger 2 (SNAI2/SLUG), zinc finger E-box binding 1 (ZEB1), and fork head box C2 (FOXC2) [85-87] are primarily responsible for inducing EMT. These transcription factors and their activity are mediated by transforming growth factor beta (TGFβ), epidermal growth factor (EGF), fibroblast growth factor (FGF), platelet-derived growth factor (PDGF), Wnt, Sonic

Hedgehog (Shh), Notch, and integrin signaling pathways [84, 88-93]. Loss of epithelial cadherin (E-Cad) and β-catenin (cellular adhesion molecules) are an established hallmark of EMT initiation. Following induction of EMT, it is also possible that signals from tumor microenvironments influence expression of specific EMT transcription factors.

Another layer of regulatory complexity is provided by various epigenetic mechanisms, i.e. heritable alterations in gene expression without changes in DNA sequence, that contribute to EMT and cancer progression. These epigenetic mechanisms include global

DNA hypomethylation, gene hypermethylation, modification, deregulation of chromatin remodeling complexes, and deregulation of important micro

RNAs (miRNAs) (Reviewed in [86]). Despite the breadth of knowledge regarding EMT, a more fundamental understanding of breast epithelial cell maintenance is lacking, i.e., how does a mammary epithelial cell remain an epithelial cell through successive cell divisions, and prevents processes like EMT, tumor formation, and cancer metastasis?

16

CHAPTER 3: MITOTIC GENE BOOKMARKING

3.1. Phenotypic Transcription Factors Maintain Cellular Phenotype

Each cell lineage arises from multipotent stem cell, capable of differentiating into committed cells in a context dependent function. Committed cells must maintain their identity through successive cell divisions. Phenotypic transcription factors are responsible for controlling gene expression and maintaining cellular identity in tissue- specific contexts, however the precise mechanisms are not well understood.

Figure 6. Mitotic retention of transcription factors on chromatin, with highlighted differences in signal of RNA Pol I vs II transcribed genes. Figure adapted from Zaidi SK, et al. 2011 [8]

Mitotic gene bookmarking is defined as the retention of key regulatory proteins that include sequence specific transcription factors, chromatin modifying factors and components of RNA Pol I and II regulatory machineries at gene loci on mitotic (Fig. 6). For cells to remain committed to their differentiated phenotype, cells must maintain this identity through successive cell divisions. By bookmarking target genes during mitosis, regulatory proteins such as phenotypic transcription factors ensure that genes important for the cellular phenotype will be expressed in progeny cells immediately completing mitosis. In addition to phenotypic transcription factors,

17 chromatin modifiers/histone variants and functional components of RNA Pol I and II machineries also bookmark genes during mitosis. With these three classes of proteins mitotically bookmarking genes during mitosis, coordinated control of cell phenotype, growth, and proliferation are ensured. Importantly, mitotic gene bookmarking has recently been implicated in maintenance of tumor/cancer phenotype [7].

3.2. Timeline of Mitotic Gene Bookmarking

Mitotic gene bookmarking was first described in 1997 by Levens et al. [94].

These authors demonstrated that chromatin conformation was distorted specifically at transcription start sites (TSS) of genes poised for reactivation after successful completion of mitosis. They postulated that, in order for proper chromatin reassembly following mitosis, a subset of factors remain bound to mitotic chromosomes at these loci, thus providing a “molecular bookmark”. A review from a separate group (John and Workman) in 1998 supported the bookmarking hypothesis as a “mechanism by which transcriptional competence can be maintained through mitosis” [95]. Challenging this idea, another research group demonstrated that condensed chromatin structure during mitosis led to the displacement of different sequence-specific transcription factors [96]. The first evidence of mitotic bookmarking by a transcription factor was reported in 2003 by our group [97].

The sequence-specific phenotypic transcription factor RUNX2 was shown by in situ immunofluorescence to remain associated with chromatin throughout mitosis. Each progeny nuclei possessed equivalent levels of Runx2 protein following division and was rendered equally competent to carry out proper phenotypic gene expression.

18

Subsequently, our group provided further evidence of mitotic bookmarking in 2007 and

2008 [98-100]. Lineage-specifying factors MyoD, , RUNX2, and C/EBPβ replaced Myc, which occupies and activates rRNA genes during proliferative stage of multipotent mesenchymal stem cells, during mesenchymal progenitor differentiation and suppressed RNA Pol I-mediated transcriptional control of rRNA genes through interaction with UBF-1. Consistently, these lineage-specifying factors also occupied

RNA Pol-II regulated genes involved in cell proliferation and fate determination.

Evidence began to grow for mitotic bookmarking as Kadauke et al. demonstrated in 2012 that GATA1, a hematopoietic-specific transcription factor required for erythroid development, occupied target genes during mitosis that were rapidly reactivated upon completion of cell division [101]. Arampatzi et al demonstrated in 2013 that components of the MHC Class II enhanceosome (MCE), a multi-protein complex necessary for MHC Class II gene transcription, were dynamically associated with chromatin during mitosis [102]. Furthermore, a subunit of the MCE, Nuclear transcription factor Y-A (NFYA), recruits PP2A to a specific DRA gene enhancer region

(LCR/XL4) and causes localized chromatin decompaction for timely reactivation of transcription following mitosis. Caravaca et al. further supported mitotic gene bookmarking by examining pioneer transcription factor FoxA1, an important transcription factor involved in liver development [103]. During mitosis, FoxA1 was shown to occupy a subset of interphase-specific genes important for liver differentiation.

FoxA1 also exhibited non-specific binding to mitotic chromatin in the “vicinity of other target genes”. Regardless of specific or non-specific binding, the group concluded that

19 both means of FoxA1 binding facilitate early gene reactivation following exit from mitosis. Lerner et al. also provided additional evidence for mitotic gene bookmarking a few years later in 2016 [104]. Hepatocyte nuclear factor 1 β (HNF1β), a transcription

Figure 7. Current established proteins known to mitotically bookmark target genes, as well as the signaling pathways and chromatin modifying properties associated with mitotic bookmarking. Figure From Zaidi SK, et al. [7] factor important for early steps of pancreas, kidney, and liver development, was shown to occupy mitotic chromatin. The study also examined the roles of clinically relevant mutations found in HNF1β of patients suffering from renal multicystic dysplasia and diabetes. These mutations had prevented HNF1β to mitotically bookmark DNA.

20

These reports demonstrate mitotic retention of tissue-specific transcription factors as a key epigenetic mechanism for regulation of genes that coordinately control cell growth and identity upon exit from mitosis (Fig. 7). Although significant evidence has accumulated for mitotic gene bookmarking in maintenance of normal tissue-specific cellular identity, a compelling question is how mitotic gene bookmarking maintains a cancer-like phenotype.

3.3. Supporting Evidence of Mitotic Gene Bookmarking

Upon entering mitosis, the cell undergoes significant structural and biologically- relevant changes. The most generally accepted finding was that RNA transcription in the nucleus is ceased when a cell enters mitosis [105-107]. This is mostly a result of RNAP II complexes being displaced from mitotic chromosomes. However, it was also shown that a small portion of genes maintained these complexes during mitosis [108]. Using a sensitive approach to detect transcription during mitosis, Zaret and colleagues recently reported a finding challenging the long-standing paradigm of transcription being halted during mitosis [109]. These authors pulse-labeled nascent transcripts with 5- ethynyluridine, a non-radioactive cell-permeable molecule and demonstrated that transcription of genes required for vital cellular processes such as cellular growth and proliferation occurs in waves, mostly in mid to late mitosis. The genes more important in cellular phenotype and identity were transcribed immediately following mitosis. This finding provides a strong supportive role for mitotic gene bookmarking, especially for genes that are involved in vital cellular processes such as growth and proliferation.

21

Components of RNAP I transcriptional machinery, mainly upstream binding transcription factor 1 (UBF1) have been reported by our group (and others) to occupy ribosomal RNA genes during mitosis [99, 110]. Since protein synthesis by the cell is directly related to cellular growth and proliferation, mitotic retention of these complexes, combined with steady state levels of transcription during mitosis, ensures competency of a cell to maintain its growth potential following cell division. Beyond rDNA transcription, components of RNAP II transcriptional machinery, such as TATA-binding protein (TBP), have also been shown to occupy target genes during mitosis and lead to their immediate reactivation following mitosis through recruitment of RNAP II [111].

RNAP II transcriptional machinery is responsible for transcription of genes involved in cellular identity and phenotype. The discovery of steady state transcription during mitosis agrees with the accumulated evidence for mitotic gene bookmarking through tissue specific transcription factors as well as components of RNAP I and II transcriptional machinery.

Mitotic chromatin is highly condensed, resulting in displacement of numerous transcription factors, as many sequence specific binding sites of transcription factors are hidden or inaccessible. However, if steady state levels of transcription are occurring during mitosis, then a small portion of mitotic chromatin must remain uncondensed for regulatory machinery to bind and perform its function. Reports by a few research groups have addressed accessibility of mitotic chromatin by utilizing state-of-the-art approaches to analyze the three-dimensional conformation of chromatin in interphase and mitotic cells [106, 112, 113]. These studies have identified regions of mitotic chromatin in an

22 open conformation and therefore, accessible for regulatory proteins to bind. These euchromatic regions of mitotic chromatin share properties including nuclease accessibility, enrichment of unique histone variants, and association of cohesion with actively transcribed genes.

Histone variants and post-translational modifications such as H2A.Z, H3.3, and

H3K4me3 are also shown to be enriched in the nucleosomes of mitotically bookmarked genes [114, 115]. Histone modifications often associated with activation of genes (i.e.

H3K4me3 and H3K27Ac) have been frequently found at the promoters of mitotically bookmarked genes [116]. Although further research is necessary, it appears that specific histone variants and post translational modifications are associated with mitotically bookmarked genes and provide a suitable chromatin environment to facilitate and/or enhance mitotic retention of regulatory factors on DNA.

3.4. Mitotic Gene Bookmarking Maintains Cancer Phenotypes

The most fundamental characteristics of tumor phenotypes include uncontrolled cellular growth with a deregulation of differentiation, both of which lead to an accumulation of clonally selected cells in a tissue that disrupts homeostasis. Given the documented examples of mitotic gene bookmarking thus far, it comes as no surprise that this epigenetic mechanism can have significant roles in maintenance of a cancerous phenotype. Examples of mitotic gene bookmarking sustaining cancerous phenotype comes from studies within . AML1-ETO, an oncogenic found in (AML) patients, regulates vital cellular processes such as

23 differentiation, proliferation, apoptosis, and self-renewal to promote leukemogenesis and maintain a leukemic phenotype both in vivo and in vitro (reviewed in [117]). AML1-ETO mitotically bookmarks rRNA genes, genes required for proliferation, as well as genes involved in myeloid cell differentiation. In comparison to normal AML1/RUNX1, a phenotypic transcription factor primarily involved in hematopoiesis and myeloid differentiation, AML1-ETO causes the opposing regulatory effect on mitotically bookmarked genes. Ribosomal RNA and genes involved with cell proliferation are upregulated with AML1-ETO occupancy during mitosis, which causes an enhanced growth and proliferative potential for these cells. In contrast, genes that are important for proper myeloid differentiation are down regulated, causing an arrest in a blast-like stage of differentiation. The lack of myeloid differentiation and enhanced proliferative effects on blast cells that express AML1-ETO promote leukemogenesis and help maintain a leukemic phenotype [118].

Mixed lineage leukemia protein (MLL) is another example of mitotic gene bookmarking maintaining a tumor phenotype. MLL is a chromatin remodeling factor that bookmarks target genes involved in leukemogenesis. Mitotic occupancy of target genes by MLL recruits chromatin remodeling machinery to these genes and poises them for rapid reactivation following mitosis, thus maintaining the leukemic phenotype [119].

Furthermore, another fusion protein demonstrated mitotic gene bookmarking activity.

CBF-β, the obligatory binding partner for RUNX proteins, forms a fusion protein with smooth muscle myosin heavy chain (SMMHC) in inv(16) AMLs [120]. This fusion protein is required for survival of inv(16) AML cells [121]. Like AML1-ETO, the CBFβ-

24

SMMHC fusion protein was also capable of mitotically bookmarking rRNA genes, with a suggested repressive role [122].

The current examples of mitotic gene bookmarking maintaining cancerous phenotype have thus far been demonstrated in leukemia. Given the variety of transcription factors involved in numerous processes such as controlling cell growth, proliferation, and cellular identity/phenotype, it is inevitable that other tumor phenotypes are maintained either partially or substantially through mitotic gene bookmarking of specific target genes. Further studies are required to determine if other oncogenic proteins, similar to AML1-ETO and MLL, mitotically bookmark target genes to maintain cancerous phenotypes in different tissue-specific contexts.

3.5. Models and Techniques for Investigation of Mitotic Gene Bookmarking

With rapid evolution and development of state-of-the-art approaches to study gene expression and genome organization, there are now more robust technical approaches to study mitotic gene bookmarking and investigating its role in different cellular and tissue specific contexts.

A key approach for investigating localization of target proteins within mitotic cells is through immunofluorescence microscopy (IF). This well-established technique can be readily employed to visualize the localization of a protein in a variety of cells regardless of their stage of the cell cycle. One critical component of this technique is the requirement of suitable and validated antibodies targeting the protein of interest. Potential masking of the epitope on the target protein of interest because of normal mitotic-related

25 cellular events such as chromatin compaction and post translational modifications are possible and should be taken into consideration for antibody selection. Typical IF experiments are performed in fixed cells, which also warrants some considerations.

Different fixation reagents could potentially disrupt antibody binding to its intended target [123]. Overexpression of a target protein that is linked to a fluorescent molecule such as GFP allows for live cell imaging and avoids some of these potential issues.

However, overexpression of a protein with a fluorescent tag can introduce different mitotic occupancy patterns and should be taken into consideration. Proteins that mitotically bookmark target genes involved in cell growth, proliferation, and phenotype interact with other proteins or have obligate binding partners whose interactions may be disrupted with the introduction of a fluorescent-tagged molecule. The fluorescent tag on the protein of interest may also affect protein folding and, as a result, can lead to differences in DNA binding affinity (for transcription factors). Various DNA stains allow for visualization of condensed chromatin during mitosis, as well as for identification of substages of mitosis based on the topology of DNA staining. Overall, IF is a valuable tool for providing evidence of a protein in its endogenous form and natural environment being mitotically retained on DNA during mitosis, thus providing the foundation for further investigation.

Chromatin immunoprecipitation (ChIP) is another vital technique to biochemically validate mitotic gene bookmarking. When coupled with high throughput sequencing (ChIP-Seq), specific genes bound by the protein of interest can be identified.

Antibody requirements for ChIP-Seq are more stringent in comparison to IF. The purity,

26 specificity, and concentration of antibodies used in ChIP-Seq are often vital for successful ChIP-Seq experiments. ChIP-Seq is also dependent upon endogenous levels of the target protein in the specific cellular model. Lower levels of endogenous protein will most likely require a larger amount of chromatin, and therefore more antibody for successful ChIP-Seq [124]. The use of more antibody for ChIP-Seq can lead to potential non-specific immunoprecipitation. To overcome low endogenous levels of protein or a low affinity antibody, an overexpression of the target protein labeled with a molecule which has strong antibodies against it (i.e., FLAG, V5, etc.) can be utilized, but also has shortcomings. Similar to the overexpression issues described for IF, overexpression and labeling with a molecule like FLAG can change mitotic occupancy patterns, or even mitotic occupancy dynamics (in the case of non-specific binding). A more suitable alternative would be endogenous tagging using precision genome editing. Regardless, any alterations to the endogenous protein carries the potential of differential mitotic occupancy patterns. ChIP-Seq must be performed in a population of pure mitotic cells.

Depending on cellular model, cells can typically be arrested with a cell-cycle disrupting agent such as nocodazole or colcemid. Following optimized conditions for mitotic arrest, an optimized mitotic shake off procedure should be utilized for harvest of only mitotically arrested cells. Fluorescently activated cell sorting (FACS) protocols involving mitotic-specific markers such as H3Ser10p can be used to accurately evaluate the purity of the mitotic population of cells and also utilized to sort them from non- mitotic cells [125].

27

RNA-Seq is a technique used to investigate the expression of genes through detection of mRNA levels, however it is also capable of detecting miRNAs, siRNAs, and lncRNAs, all of which are non-coding in nature [126]. Pertaining to mitotic bookmarking, levels of mRNA can be detected for bookmarked genes at specific stages of the cell cycle. For example, if cells are synchronized into early G1, RNA-Seq could determine the expression of bookmarked genes upon immediately exiting mitosis. With use of a small molecule inhibitor to disrupt mitotic bookmarking, looking at the expression of bookmarked genes in G1-synchronized cells treated with or without the inhibitor should theoretically reveal differences in expression of only bookmarked genes.

As beneficial as this can be, it has a critical limitation in that it is only capable of detecting total RNA levels present within the cells. Detection of nascent RNAs would truly reflect differences in gene expression. GRO-Seq (Global Run-on Sequencing) is a technique which detects levels of only nascent RNAs [127, 128]. Briefly, cells incorporate 5-bromouridine 5’-triphosphate (BrUTP) to tag nascent RNAs. Following this tagging, controlled time for transcription, and subsequent harvest of RNA, beads coated with 5-bromo-2’-deoxyuridine (BrdU) are used to isolate the tagged nascent RNA.

This nascent RNA can be sequenced to determine gene expression of only the newly transcribed genes. When this technique is coupled with a cell synchronization into G1 and coupled with another technique to disrupt mitotic bookmarking (i.e. small molecule inhibitor disruption, inducible knockout, etc.), a difference in expression of newly transcribed genes which are also mitotically bookmarked can be determined.

28

By taking the necessary considerations into account for IF and ChIP-Seq, these two techniques validate mitotic retention of factors on target genes during mitosis. Some results have been published which show discordant results between IF and ChIP-Seq. In one example, GATA1 was shown to be excluded from DNA during mitosis through IF, yet ChIP-Seq findings for GATA1 revealed a number of genes occupied during mitosis

[101]. This disagreement between techniques could be explained by chemical fixation conditions for both protocols. In a recent report, Teves et al. investigated the artifacts of

Figure 8. A proposed hypothesis for chemical artifacts regarding experimental visualization of transcription factor retention. Figure from Teves SS et al. 2016 [3] chemical fixation for evaluating mitotic gene bookmarking (Fig. 8) [3]. Using live cell imaging, they demonstrated that 2 secs prior to 1% paraformaldehyde treatment, their target protein (H2B-GFP) was present on mitotic chromatin. After 10 secs following addition of 1% paraformaldehyde treatment, the levels of H2B-GFP on the chromosomes

29 was markedly reduced, and was indistinguishable from cytoplasmic signal by 60 sec.

With this data, they proposed a potential mechanism of formaldehyde fixation, in which the chemical fixation prevents transcription factors from binding to mitotic chromatin.

Briefly, fixative entering the cell would first cross-link cytoplasmic protein as it makes its way towards the nucleus. Therefore, the cytoplasmic pool of protein association rate (kon) would be greatly diminished due to the crosslinking. Taking into account the short residency times of transcription factors on DNA, mentioned to be under 20 sec based on a set of recent fluorescent recovery after photobleaching (FRAP) experiments [129, 130], the protein occupying mitotic chromatin may be evicted as a result of formaldehyde fixation. This artifact of chemical fixation would be exacerbated in highly dynamic transcription factors. Other chemicals can be used for fixation including MeOH and glutaraldehyde, although they carry other potential considerations as they are better suited for different techniques such as electron microscopy.

A variety of auxiliary approaches can be taken to provide supporting evidence of a protein occupying target genes during mitosis and interphase. For example, it is known that specific histone marks indicate either up-regulation of nearby genes

(e.g.H3K4me3, H3K27Ac) or down regulation of nearby genes (e.g. H3K27me3) through association with euchromatic or heterochromatic states of DNA. By correlating a specific gene mitotically occupied during mitosis with either an active or repressive histone mark can help parse out a global picture of mitotic gene bookmarking. DNase hypersensitivity assays also reveal genes that are accessible by the nuclease, and therefore in a euchromatic state. Determining the structure of chromatin at gene loci that are mitotically

30 bookmarked can provide additional regulatory information critical to discern expression of these genes following cell division.

Lastly, downstream applications such as reverse transcription- quantitative polymerase chain reaction (RT-qPCR and western blot should be utilized to examine the regulation of specific genes mitotically occupied during mitosis. Use of an inhibitor to disrupt the bookmarking activity yet maintain protein levels is perhaps the most suitable to determine the immediate effects of disrupted mitotic bookmarking activity. An alternative approach can be the use of CRISPR gene editing technologies to introduce mutations into the DNA binding domain of endogenous protein, therefore disrupting mitotic bookmarking while maintaining protein levels. Techniques such as RNA interference can also be used to deplete the cellular model of target protein and therefore mitotic bookmarking, although the characterized differences observed after depletion may not entirely be attributable specifically to disruption of mitotic gene bookmarking activity.

Taken together, approaches to evaluate mitotic occupancy of target genes by various factors require important considerations for valid interpretation of the data.

31

CHAPTER 4: AML1/RUNX1

4.1. AML1/RUNX1 Overview and Introduction

Runt-related transcription factor 1 (RUNX1), also referred as acute myeloid leukemia 1 protein (AML1), is a lineage-specifying transcription factor from the Runx family of proteins (Fig. 9). The RUNX family of proteins consists of RUNX1/AML1,

Runx2/CBF-alpha-1, and Runx3/PEB2Pα. RUNX proteins are lineage-specifying transcription factors that share a highly conserved Runt domain (similar to the conserved domain found in the runt gene of Drosophila), which binds to DNA and mediates for specific protein-protein interactions [131-133]. RUNX DNA-binding capabilities is intimately tied with its obligate binding partner, – Beta (CBF-β).

When RUNX proteins interact with CBF-β, specificity and affinity of Runx proteins for their target genes increases considerably (~6-10 fold) in comparison to RUNX proteins alone [134-139]. This is the result of a favorable conformational change induced by CBF-

β on the DNA binding domain structure to increase binding affinity [140]. RUNX

Figure 9. Runx1’s DNA- and protein-interacting domains with co-binding partners identified. AD: Activating domain, ID: inhibitory domain. Figure from Ito Y et al, 2015. [4] proteins are predominantly localized in the nucleus. RUNX proteins contain a nuclear localization signal on the C-terminal side of the Runt domain which is responsible for nuclear localization of Runx proteins. In addition, RUNX proteins contain a conserved C-

32 terminal sub-nuclear matrix-targeting signal, which targets RUNX proteins to punctate domains [141-145]. The nuclear matrix targeting signal is also responsible for interaction with other co-regulatory factors. RUNX proteins also interact with co-activators and co- through their PY and VWRPY motifs in a tissue-specific manner [146-151] [4,

131, 146-148, 152]. Physiological activities of RUNX proteins are thus dictated by their interactions with target genes and co-regulators.

RUNX1 is a critical transcription factor required for hematopoietic development

[153, 154]. Hematopoiesis is divided into two main stages: primitive and definitive.

Primitive hematopoiesis is the earlier developmental stage which begins at day 7.5 and is limited to the extra-embryonic yolk sac. During this stage, three main precursor cell types arise: erythrocytes, macrophages, and megakaryocytes (reviewed in [155], [156-159]). It isn’t until definitive hematopoiesis when erythrocyte precursors from primitive hematopoiesis differentiate and give rise to the hematopoietic stem cells (HSC), which are multipotent cells capable of differentiating into all hematopoietic sub-lineages including myeloid and lymphoid cells [158, 160, 161]. RUNX1 deficient mice lack the progenitors that normally arise during definitive hematopoiesis, a phenotype that is mimicked by a CBF-β knockout as well, the obligate binding partner of

Runx1 [153, 162-166]. RUNX1 deficient mice are embryonically lethal around embryonic day 12.5 due to the absence of definitive hematopoiesis [153, 165, 167].

RUNX1 is also responsible for differentiation of other cell types during normal hematopoiesis which include megakaryocytes, B-Cells, and T-Cells [168-171].

33

Because RUNX1 is expressed in several human tissue types, its function is not limited to hematopoiesis. RUNX1 has been shown to play regulatory roles in the development of bone, nervous system, mammary gland, muscle, and even hair follicle tissues [172-179]. RUNX1 also has an implicated role in embryogenesis [180-182].

Around day 5.5, human embryonic stem cells (hESCs) undergo mesendodermal differentiation. During this period, a transient spike of RUNX1 expression occurs, and promotes a controlled, physiological EMT event that is vital for the differentiation of early mesendodermal cells through TGFβ signaling. RUNX1 transcription from the proximal P2 promoter is associated with this transient expression during embryonic development. Transcription of RUNX1 from the P2 promoter is predominantly isoform

1b, whereas expression from the distal P1 promoter yields isoforms important for hematopoietic development [153, 181, 183, 184]. The P2 promoter also contains bivalent histone marks (H3K27me3 and H3K4me3) which poises many genes in stem cells for rapid activation or inactivation, further highlighting RUNX1 role in embryogenesis [185-

188].

More recently, RUNX1 has been shown to be involved in different stages of mammary gland development. RUNX1 levels are highest during virgin and early- pregnancy stages of mammary gland development but decrease as the mammary gland progresses into late-pregnancy and lactation [178, 189, 190]. Basal progenitor cells have higher RUNX1 levels in comparison to luminal cells. The stages where alveolar luminal cells arise i.e. late pregnancy and lactation, RUNX1 expression is lost. It is hypothesized that loss of RUNX1 during this developmental stage of the mammary gland is required

34 for milk production and secretion [178]. RUNX1 has also been shown to be vital in facilitating mammary stem cell differentiation into mature lobules and ducts [176].

Lastly, targeted RUNX1 in mouse mammary glands was shown to decrease the number of ER+ luminal cells, through repression of Elf5, a transcription factor important in mammary gland ductal differentiation [178].

Because RUNX1 plays critical roles in physiological development and differentiation in so many different tissues, it is informative to examine how RUNX1 aberrations/mutations/defects can lead to improper differentiation and development.

4.2. RUNX1 in Hematologic Malignancies

The first translocation identified in a cancer was the t(8;21) (also referred to as

AML1-ETO, RUNX1-ETO, RUNX1-MTG8, RUNX1-RUNX1T1) translocation in acute myeloid leukemia in 1973 [191-194]. Since then, a significant amount of research has revealed multiple roles the resulting AML1-ETO fusion protein plays in the onset and progression of leukemia as well as in maintaining a leukemic phenotype. The fusion protein consists of the first 5 of RUNX1 from 21 with the ETO gene from [195-197]. In the unaltered RUNX1 transcript, the C-terminus contains both a subnuclear targeting domain for RUNX1 subnuclear localization as well as a conserved motif that facilitates protein-protein interactions; both these domains are vital for RUNX1 biological activity [195, 198, 199]. The t(8;21) translocation disrupts normal RUNX1 activity. Since the DNA binding domain is unaltered, AML1-ETO is occupies RUNX1 target genes, however their regulation is disrupted [152]. AML1-ETO

35 forms co- complexes with NCOR1, HDAC1, and SIN3A/HDAC [200-205].

Furthermore, because ETO replaces RUNX1 subnuclear targeting domain, the AML1-

ETO fusion protein is localized to different subnuclear domains than the ones where normal RUNX1 resides [199]. The deregulation of RUNX1 target genes and altered subnuclear targeting are considered the primary mechanisms by which AML1-ETO results in the leukemic phenotype.

Another mechanism by which AML1-ETO contributes to leukemic phenotype is through transcriptional regulation of miR-29b-1 [197]. MicroRNAs (miRs) and their role in regulating transcription of virtually every biological pathway is a relatively newer field that is heavily researched. By binding to the 3’ untranslated region (3’ UTR) of transcripts for certain genes, they can inhibit that mRNA’s translation. It was demonstrated that AML1-ETO downregulates miR-29b-1, a mIR that also targets AML1-

ETO for its repression. The study unraveled a unique regulatory circuit miR-29b-1 and

AML1-ETO with their respective downstream targets. AML1-ETO not only directly binds to the promoter region of mIR-29b-1, it also inhibits C/EBPα, a target gene who activates mIR-29b-1, thus causing indirect regulation of mIR-29b-1 by AML1-ETO [197,

206]. mIR-29b-1 downstream targets affecting leukemic phenotype are Myc, Akt2, and

CCND2, all of which have roles in regulation of cell growth, DNA Repair, Apoptosis, and cell cycle [207].

Several other translocations involving RUNX1 have been found in leukemic patients and/or contribute to leukemic phenotype. These include the t(12;21) and the t(3;21) translocations, which TEL-RUNX1 and EVI-RUNX1 (also called MDS1-

36

RUNX1 or EAP-RUNX1), respectively [208-214]. There are currently 55 translocations involving RUNX1, of which only 21 are well-studied [215]. Furthermore, a variety of mutations within the RUNX1 gene have been identified with specific disease states and cancers (reviewed in [209] and [216]). Mutations in the Runt domain of RUNX1 have been associated with AML, (MDS), myeloproliferative neoplasms (MPN), and chronic myelomonocytic leukemia (CMML) [217-221].

Haploinsufficiency of RUNX1 has also been associated with Familial Platelet Disorder with predisposition to AML (FPD/AML) [222]. Additionally, mutations outside of the

Runt domain have been suggested to play roles within the disease state for MDS and

CMML [223-225]. Overall, it has been estimated that a combined estimated frequency of

RUNX1 mutations within AML adult patients is approximately 28% [216].

4.3. RUNX1 in Breast Cancer

Recently, mutations within CBFβ and RUNX1 were identified as novel recurrent mutations found within breast cancer [226]. The mutations in CBFβ were comprised of both truncating mutations as well as missense mutations, both of which disrupt interaction with RUNX1. The alterations in RUNX1 were primarily deletions, also resulting in the disruption of CBFβ-RUNX1 interaction. This study was the first to associate CBFβ and RUNX1 complex inactivation within an epithelial cancer [226-228].

In another study, CBFβ and RUNX1 were identified as novel significantly mutated genes, with CBFβ harboring 9 mutations and RUNX1 harboring 19 mutations [168]. The authors suggested that these mutations in CBFβ and RUNX1 play an important role in ERα

37 signaling, a pathway that is heavily involved in mammary gland development and intersects with RUNX1, i.e. RUNX1 tethers ERα to novel DNA binding sites [229, 230].

Mutations in CBFβ and RUNX1 were identified in luminal A, luminal B, and Her2- enriched cancer subtypes, but not basal-like [168]. Using cBioPortal, a database of gene alterations in cancer patients, 3002 sequenced breast cancers across 3 separate studies were examined for RUNX1 mutations [231-236]. RUNX1 exhibited a somatic frequency of 4.1% with a total of 132 mutations. 87 of these mutations were truncating and found throughout RUNX1, including the Runt domain. 33 mutations were found to be missense mutations residing across RUNX1. 5 mutations were described as in-frame and

7 were described as “other.” Evidence supports that both mutations disrupting CBFβ and

RUNX1 interaction as well as those in the Runt domain of RUNX1, have important roles in tumorigenesis of breast cancers. Consistent with these findings, RUNX1 is not expressed or expressed at very low levels in differentiated/metastatic breast tumor samples compared to normal breast tissue. Importantly, RUNX1 also suppressed tumor growth [237]. Despite the preliminary observations regarding RUNX1 suppressing breast tumor growth, the underlying mechanisms have not yet been established.

More recent studies indicate two opposing roles of RUNX1 in breast cancer.

Reduced RUNX1 expression correlated with Fork head box O (FOXO1) upregulation, leading to the hypothesis that RUNX1 results in subsequent tumor progression [238].

Normal mammary epithelial cells with RUNX1 loss led to increased proliferation and altered cellular morphology within a 3D Matrigel assay which was dependent upon normal FOXO1, indicating that RUNX1 and FOXO1 coupled together inhibit tumor

38 progression [239, 240]. A separate study examining RUNX1 loss in mammary epithelial cells demonstrated a reduction in ER+ luminal cells, indicating that RUNX1 is a positive regulator of the ER+ luminal lineage [178]. Additionally, this study demonstrated that these ER+ luminal cells lost by RUNX1 depletion, could be rescued by loss of either

TP53 () or RB1 (). RUNX1 was also shown to control estrogen-mediated AXIN1 transcriptional repression in ER+ breast cancer cells [241]. By deregulation of β-catenin, a critical protein in Wnt signaling, driven by loss of RUNX1,

ER+ RUNX1-deficient breast cancer cells were associated with compromised mitotic checkpoint control, increased cell proliferation, as well as a more mesenchymal phenotype by increased expression of stem cell markers [241].

Although there are two separate proposed roles of RUNX1 in breast cancer onset/progression, RUNX1 as a tumor suppressor has become the more well established and understood role.

4.4. RUNX Proteins and Mitotic Gene Bookmarking

The first reported instance of mitotic bookmarking described the mitotic occupancy of genes by RUNX2 in Saos-2 cells [97]. RUNX2 was demonstrated to be stable during mitosis and occupy sequence-specific sites on mitotic chromatin. Through use of siRNA-mediated knockdown of RUNX2 and ChIP-Seq, genes involved in cell growth and differentiation were identified to be regulated post-mitotically by RUNX2.

This finding strongly supported the role of mitotic gene bookmarking in maintaining cellular phenotype. Further defining the role of RUNX2 bookmarking within Saos-2

39 cells, our group discovered RUNX2 mitotically bookmarking both RNA Pol I and II transcribed genes for coordinated control of cellular growth, proliferation, and phenotype

[99, 100]. RUNX2 is associated with rDNA and controls rRNA expression, which ultimately leads to differences in global protein synthesis. By mitotically bookmarking both RNA Pol I and II transcribed genes, the strong foundational hypothesis of coordinated cellular control by phenotypic transcription factors such as RUNX2 was established. This makes conceptual sense, as a cell cannot entirely shift its overall phenotype, for example from epithelial to mesenchymal, without changing multiple attributes. A cell’s phenotype will not change by dysregulation of phenotypic-related genes that dictate morphology, size, structure, polarity, etc. without an accompanying dysregulation in genes that dictate cellular growth and proliferation, and vice versa. It makes conceptual sense that transcription factors capable of this coordinated regulation of RNA Pol I and II transcribed genes would be mitotically bookmarked, therefore a parental phenotype is maintained throughout successive cell divisions, primarily by a few vital phenotypic transcription factors. This coordinated control of cellular phenotype by other transcription factors in different cellular lineages [98]. Transcription factors such as

MyoD and CEBPα were also capable of mitotically bookmarking and regulating rRNA expression, on top of their already established roles in regulating RNA Pol II-transcribed genes [242-244]. These findings from our group established mitotic gene bookmarking as a novel epigenetic mechanism for coordinately controlling cellular phenotype.

RUNX1, which shares the conserved Runt (DNA binding) domain with RUNX2 and RUNX3, was investigated for mitotic bookmarking capabilities. Given RUNX1’s

40 role within proper hematopoiesis, mitotic bookmarking was investigated within Jurkat (a

T-lymphocyte lineage which expresses high levels of RUNX1/AML1) and Kasumi-1 cells (a leukemic cell line which possesses the AML1-ETO t(8;21) translocation) [118].

The AML1-ETO translocation is present in a significant portion of patients (~15%) who suffer from AML [245]. This translocation retains AML1/RUNX1’s Runt domain, however the nuclear localization signal and transactivation domain are lost [145, 199]. It was determined that both RUNX1/AML1 and AML1-ETO were capable of mitotically bookmarking rRNA genes, however, they exhibited different regulation. RUNX1/AML1 was shown to repress rRNA genes whereas AML1-ETO was shown to activate rRNA, most likely contributing (in part) to the leukemic phenotype [118].

Taken together, these findings by members of our group demonstrate RUNX proteins capable of mitotically bookmarking genes in both normal and cancer cells which contribute to maintenance of cellular phenotype through both RNA Pol I and II regulated genes. RUNX proteins are phenotypic transcription factors capable of coordinately controlling these large cellular attributes such as cellular growth, proliferation, and phenotype specifically through the epigenetic mechanism of mitotic gene bookmarking.

41

CHAPTER 5: RATIONALE FOR INVESTIGATION AND CENTRAL

HYPOTHESIS

RUNX1 has established roles in physiological tissue development, especially in the mammary gland. Importantly, significant somatic mutations in RUNX1 have been discovered in breast tumors, indicating that RUNX1 plays a consequential role in breast cancer formation/progression. Recently, our group showed that RUNX1 functions as a tumor suppressor in a normal breast epithelial cells [9]. RUNX1 maintains epithelial phenotype and prevents EMT through transcriptional regulation of genes involved in key cellular pathways including TGFβ pathway.

This dissertation tests the hypothesis that RUNX1 is persistently retained on chromatin throughout the cell cycle (including mitosis) at RNA Pol I- and II- transcribed genes in breast epithelial cells to maintain epithelial cell phenotype.

Perturbation of RUNX1 mitotic bookmarking leads to EMT, an essential driver of breast cancer formation and progression.

41

RESULTS

RUNX1 associates with mitotic chromatin and occupies target genes

To investigate RUNX1 protein localization and expression within mitotic

MCF10A breast epithelial cells, a protocol for isolating three distinct cell cycle-specific populations of cells was conducted (Fig. 10). MCF10A cells were treated with either

DMSO as a negative control or 50ng/mL Nocodazole for 16 hr to arrest cells in mitosis

(Fig. 10A). Cells were synchronized into G1 with a media replacement following mitotic arrest (Fig. 10A). Mitotic shake off was used to harvest mitotic populations of cells.

Fluorescently activated cell sorting (FACS) analysis was conducted to determine purity of each harvested population using PI/RNase stain for DNA content, shown by a representative FACS Profile (Fig. 10B). Expression of cell-cycle specific proteins were examined to further validate the purity of each harvested population of MCF10A cells

Figure 10. Validated experimental approach for analysis of RUNX1 protein within MCF10A breast epithelial cells. A) Protocol for harvest of three MCF10A cell populations: DMSO = Asynch (A), Mitotic = Blocked (B)(M), G1 = Released (R)(G1). B) Fluorescently activated cell sorting (FACS) analysis of PI/RNase stained MCF10A treatment groups. C) Western blot of MCF10A treatment groups for cell cycle- specific proteins Cyclin B1 (top), CDT1 (middle), and Lamin B (bottom).

42

(Fig. 10C).

After validating the purity of harvested MCF10A cell populations, we next examined RUNX1 levels within each population to ensure RUNX1 was expressed at detectable levels (Fig 11). Western blot analysis of each cell population revealed detectable levels of RUNX1, indicating RUNX1 expression within each harvested group

(Fig 12).

Figure 11. Western blot of Asynch (A), Mitotic (M), and G1 MCF10A cell populations for RUNX1 (top) and Tubulin (bottom).

Given detectable expression of RUNX1 in each harvested population of MCF10A cells, RUNX1 protein localization and expression was evaluated within actively proliferating MCF10A cells using immunofluorescence microscopy. Limit of detection for RUNX1 within MCF10A cells was conducted to observe the minimum amount of antibody required to observe RUNX1 foci via immunofluorescent microscopy. This was performed using two separate antibodies against RUNX1 (Fig. 12). In an antibody validated for use in Western blot and immunofluorescent applications, the most abundant signal was observed at a 1:10 dilution (Fig. 12A). A chromatin immunoprecipitation

(ChIP) grade antibody was also used, with a RUNX1 signal observed down to 1:600 (Fig.

12B).

43

Figure 12. Limit of detection for RUNX1 within actively proliferating MCF10A cells. Blue = DAPI, Green = RUNX1. RUNX1 foci are indicated by white arrows. A) An antibody validated for use in western blot and Immunofluorescence (CST 4334S) was diluted to 1:200 (top), 1:50 (middle), and 1:10 (bottom). B) A ChIP-grade antibody (CST 4334BF) was diluted to 1:200 (top), 1:600 (middle), and 1:1000 (bottom). Within limit of detection studies, it was clear that RUNX1 signal was comprised of two major forms: major and minor foci. These foci appeared to remain during mitosis.

To better determine RUNX1 chromatin occupancy within specific substages of mitosis

(prophase, prometaphase, metaphase, anaphase, and telophase) we performed immunofluorescence microscopy in actively proliferating MCF10A cells (Fig. 13). We find that RUNX1 is distributed in punctate subnuclear domains throughout interphase nuclei in MCF10A cells (Fig. 13). Interestingly, RUNX1 is localized on mitotic chromatin at all topologically identified substages of mitosis (Fig 13). Presence of two distinct type of foci were validated on mitotic chromosomes: large punctate foci as well as numerous smaller foci were detected. In agreement with our previous findings,

RUNX1 foci are equally distributed into resulting progeny cells [97]. Presence of

RUNX1 in all stages of mitosis indicates that the protein is stable during mitosis and appears to associate with chromatin.

44

Figure 13. RUNX1 is present in each topologically identified substage of mitosis in the form of major and minor foci. Blue = DAPI, Green = RUNX1.

From IF experiments it was evident that RUNX1 was associated with chromatin throughout mitosis, therefore we next addressed if RUNX1 was bound to specific genes

45 related to cell identity or cancer progression during mitosis by conducting ChIP-Seq.

MCF10A cells were synchronized in mitosis by Nocodazole treatment (50ng/mL) for

16hr and isolated by nocodazole treatment followed by release of block by media replacement for 3hr post-block (Fig. 11A). After nuclei isolation, chromatin was sonicated and gel electrophoresis of DNA from each treatment group (Asynch, Mitotic, and G1) reveal an appropriately sized range of fragments (200-500bp) suitable for ChIP using a validated antibody (Fig. 14).

Figure 14. Gel electrophoresis of sonicated lysates (n=2 biological replicates) for ChIP. R = replicate. A) Sonicated lysates from DMSO (Asynch) harvested MCF10A cells. B) Sonicated lysates from Mitotic and G1 harvested MCF10A cells. Following ChIP reactions, the resulting pooled libraries were evaluated by bioanalyzer to determine quality of libraries prior to sequencing (Fig. 15). Bioanalyzer traces revealed that ChIP reactions and library construction had been successful and were of ideal size (maxima of 350-450bp) for reliable and accurate sequencing (Fig 15).

46

Figure 15. Bioanalyzer traces of pooled and size selected libraries reveal proper fragmentation range for accurate sequencing. A) Pooled input libraries. B) Pooled RUNX1-ChIP libraries. Following high throughput sequencing, the resulting datasets were mapped to the latest build (hg38) using Bowtie2 [246]. Enriched regions were determined using Model-Based Analysis of ChIP-Seq [247] and were analyzed at p<10-5 significance level and with an irreproducible discovery rate (IDR) of 0.05. Peak calling revealed 2,020 genes that include both protein coding (1,426) genes and long non-coding

RNAs (lncRNAs) (594) were bound by RUNX1 in Asynch (Fig. 16B). MCF10A cells enriched in G1 showed a total of 1,095 genes with 762 protein coding genes and 333 lncRNAs bound by RUNX1 (Fig. 16B). RUNX1 occupied 551 genes (413 protein coding and 138 lncRNAs) in mitotically enriched MCF10A cells (Fig. 16B). Clustering of

Asynch, Mitotic, and G1 MCF10A cells were generated by seqsetvis (Bioconductor)

(Fig.16A). A comparison of the three cell populations reveal subsets of genes that were either shared (354 genes) across the three groups or were specific for each, indicating dynamic binding of RUNX1 during the cell cycle (Fig. 16B). These findings reveal that

RUNX1 bookmarks several hundred target genes during mitosis. A complete list of

RUNX1 bookmarked genes can be found in supplemental material (S. Table 1).

47

Figure 16. RUNX1 occupies protein coding genes and long non-coding RNAs across asynchronous, mitotic, and G1 populations of MCF10A breast epithelial cells. A) Heatmaps showing peaks called between A, M, and G1 MCF10A cells (left, middle, and right respectively). B) Venn diagrams illustrating the number lncRNAs (left diagram) and protein coding genes (right diagram) identified within and between A, M, and G1 MCF10A populations. C) Gene Set Enrichment Analysis results from interrogating mitotically bookmarked genes (i.e. genes called within blocked population within 5kb of TSS in MCF10A populations) against Hallmark Gene sets from Molecular Signatures Database (MSigDB). The top 10 most significantly overlapping gene sets are shown from top to bottom. Gene set enrichment (GSE) analysis was performed using RUNX1 bookmarked genes to identify regulatory pathways (Fig. 16C). Consistent with established roles of

RUNX1 for cell cycle and DNA damage/repair regulation [248-252], pathways involved with regulation of G2M Checkpoint, p53, and DNA repair, were among the top 10 which had significant overlap with the mitotically bookmarked genes identified from ChIP-Seq

(Fig. 16C). Relevant to normal mammary epithelial phenotype, both early and late

Estrogen response signaling gene sets had significant overlap with RUNX1 mitotically

48 bookmarked genes. Together these findings reveal that RUNX1 bookmarks genes involved in cell proliferation, growth, and phenotype in normal mammary epithelial cells.

49

RUNX1 mitotically bookmarks RNA Pol I-transcribed genes involved in cell

growth

Our ChIP-Seq results revealed that RUNX1 occupies rDNA repeats in MCF10A cells; all three MCF10A cell populations showed enrichment within the promoter region of hrDNA (Fig. 17), suggesting a potential regulatory role for RUNX1 regulation of rRNA in MCF10A cells.

Figure 17. ChIP-Seq tracks for each population of MCF10A cells at human ribosomal DNA regions. FE = Fold Enrichment.

We confirmed this finding in actively proliferating MCF10A cells by immunofluorescence microscopy using antibodies specific against RUNX1 and upstream binding transcription factor (UBF), a transcriptional activator that remains bound to rRNA genes during mitosis. We find that large RUNX1 foci colocalize with UBF throughout each stage of mitosis (Fig. 18). Colocalization between RUNX1 and UBF was validated by confocal microscopy. Line profiles of MCF10A cells show that although

RUNX1 and UBF occupy distinct nuclear microenvironments in interphase (n=15), both

50 proteins significantly colocalize in metaphase (n=15) (Fig. 19). Taken together, these findings establish RUNX1 binding to ribosomal DNA repeat regions, visualized through

ChIP-Seq and validated through confocal microscopy, for the potential role in regulating rRNA expression in MCF10A breast epithelial cells. We experimentally addressed the hypothesis that RUNX1 regulates ribosomal gene expression by using a pharmacological inhibitor of RUNX1. This small molecule inhibitor binds allosterically to RUNX1’s

Figure 18. RUNX1 colocalizes with upstream binding transcription factor (UBF) during mitosis. Blue = DAPI, Green = RUNX1, Red = UBF. obligate binding partner, CBF-β to disrupt the RUNX1-CBF-β interaction [253]. Actively

51 proliferating MCF10A cells were treated with a RUNX1 specific inhibitor (AI-14-91) for

6hr, 12hr, 24hr, and 48hr at 20μM; an inactive inhibitor (AI-4-88) was used as a control under identical conditions. Pre-rRNA expression was significantly increased at 12hr and

48hr timepoints when asynchronous MCF10As were treated with AI-14-91 in comparison to AI-4-88, indicating that RUNX1 suppresses rRNA genes in normal mammary epithelial cells (Fig. 20). Because rRNA expression directly correlates with global protein synthesis, a fluorescent-based detection method was used to determine newly synthesized proteins in asynchronous MCF10A cells treated with either AI-4-88 or

AI-14-91 for both 24hr and 48hr. MCF10A cells treated with AI-14-91 at 20μM for 24hr and 48hr appear to have overall decreased levels of global protein synthesis and altered cell morphology in comparison to AI-4-88 treated MFC10As under identical conditions

(Fig. 21). These findings are further supported by a key observation from our GSE analysis – that mitotically bookmarked RUNX1 target genes are enriched in mTOR signaling, a pathway that is required for cell growth and is a therapeutic target in breast cancers [254, 255]. Together, our results demonstrate that RUNX1 bookmarks RNA Pol I

Figure 19. Confocal immunofluorescence microscopy line profiles validating RUNX1 and UBF colocalization within MCF10A breast epithelial cells. A) Line profiles representative for N=15 interphase cells. B) Line profiles representative for N=15 metaphase cells. 52 regulated rRNA genes during mitosis, transcriptionally represses them and impacts global protein synthesis in normal mammary epithelial cells.

Figure 20. RUNX1 regulates pre-rRNA expression. Expression levels were made relative to Beta-Actin. AI-4-88 = Inactive inhibitor, AI-14-91 = Active inhibitor.

53

Figure 21. RUNX1 regulates global protein synthesis within MCF10A breast epithelial cells. Left = phase contrast, middle/green = DNA (Sytox Green), right/red = nascent proteins. “Protein Label” indicates the well designation in which cells would be incubated with the protein label as opposed to cycloheximide or no label for proteins. A) MCF10A cells that have been treated with either inhibitor for 24hr. B) MCF10A cells that have been treated with either inhibitor for 48hr.

54

RUNX1 mitotically bookmarks RNA Pol II-transcribed genes involved in

hormone responsiveness and cell phenotype

Genes involved in both early and late response to estrogen were among the top identified regulatory pathways from our GSE analyses (Fig. 16C). Estrogen plays vital roles in normal mammary gland development that include promoting proliferative phenotypes of mammary epithelial cells for ductal expansion and invasion into breast tissue [58-60]. We interrogated RUNX1 bookmarked genes with publicly available

Estrogen Receptor Alpha (ERα) ChIP-Seq data sets to identify genes bound by both

Figure 22. Scatterplot of ER-bound genes either up or down regulated in response to estrogen with RUNX1 bookmarked genes identified.

55

RUNX1 and ERα as well as sensitive to treatment (Fig. 22) [230]. Our analysis revealed that a subset of mitotically bookmarked by RUNX1 is also bound by ERα, and either up or down regulated in response to Estradiol, indicating that RUNX1 bookmarks hormone-responsive genes to regulate proliferation of mammary epithelial cells (Fig. 22).

A list of these genes can be found in the supplemental material (S. Table 2).

Interestingly, GSEA of RUNX1 bookmarked genes revealed regulatory pathways involved in cellular proliferation and phenotype including TNFα via NFκB, Apical

Junction, targets, and Notch signaling (Fig 16C). Interestingly, NEAT1 and NEAT2

(MALAT1), lncRNAs, often deregulated in cancers [256, 257], were also mitotically bookmarked by RUNX1. Of the 413 RUNX1 bookmarked protein coding genes, TOP2A,

MYC, HES1, RRAS, H2AFX, and CCND3 are RNA Pol II transcribed genes involved with proliferation and/or phenotype maintenance (See S. Table 1 for complete list). HES1 and H2AFX show significant fold enrichment between the three populations of MCF10A

Figure 23. ChIP-Seq tracks RUNX1 bookmarked RNA Pol II transcribed genes important for epithelial phenotype maintenance. A) H2AFX gene ChIP-Seq tracks. B) HES1 gene ChIP-Seq tracks.

56 cells (Fig. 23). Recently, HES1 and H2AFX have been identified as regulators of breast epithelial phenotype [258-260].

Expression of HES1 nearly doubled (relative to Beta Actin expression) upon inhibition of RUNX1 DNA binding at 48hr, while only increasing slightly (24.7% relative to Beta Actin expression) at 6hr (Fig. 5C), indicating that RUNX1 acts to repress

HES1. In contrast, the H2AFX expression at 24hr and 48hr was decreased (36% and 26% respectively, relative to Beta Actin expression), suggesting RUNX1 activates H2AFX expression (Fig. 24). These results indicate that RUNX1 stabilizes the epithelial phenotype by bookmarking both protein coding and non-coding genes.

Figure 24. RUNX1 regulates H2AFX and HES1 expression in MCF10A breast epithelial cells. A) Expression of H2AFX in the presence of RUNX1 inhibitors from 6hr, 12hr, 24hr, and 48hr. B) Expression of HES1 in the presence of RUNX1 inhibitors from 6hr, 12hr, 24hr, and 48hr.

57

We experimentally addressed whether disruption of RUNX1 bookmarking leads to a change in epithelial phenotype (Fig. 25). After 48hr of treatment with 20μM AI-14-

91, we find MCF10A cells have adopted mesenchymal-like cellular morphology in comparison to the identical treatment with AI-4-88, visible by phase contrast microscopy

(Fig. 25A). AI-14-91 treated MCF10As retain RUNX1 foci that colocalize with UBF, however intensity of RUNX1 signal throughout mitotic chromatin decreased, suggesting an inability of RUNX1 to bind DNA as a result of pharmacologic inhibition.

Figure 25. RUNX1 inhibition leads to morphological changes in MCF10As visible through both phase contrast and immunofluorescence microscopy within 48hrs. A) RUNX1 inhibitor AI-4-88 (Inactive, Top row) in MCF10A cells at 0hr, 24hr, and 48hr. RUNX1 inhibitor AI-14-91 (Active, Bottom row) in MCF10A cells at 0hr, 24hr, and 48hr. B) Immunofluorescence microscopy of prometaphase MCF10A cells treated with either inhibitor for 48hr. Blue = DAPI, Green = RUNX1, Red = UBF, Purple = F-Actin.

58

Immunofluorescence microscopy also reveals MCF10A cells treated with AI-14-91 for

48hrs have differences in F-Actin protein expression and localization within the cell in comparison to AI-4-88 treated MCF10A cells, suggesting RUNX1 inhibition leads to rearrangement of cytoskeletal proteins, an important feature dictating cellular phenotype

(Fig. 25B). Longer term treatment (5 days) of actively proliferating MCF10A cells with

AI-14-91 showed significant cell rounding and loss of adherence, possibly indicating apoptosis, with a small sub-population of cells remaining attached and adhered with a transformed, mesenchymal-like phenotype (Fig. 26). After 5 days of treatment, inhibitor was removed and cells were maintained in normal media after Day 5 of treatment. By day

Figure 26. RUNX1 inhibition causes changes in cellular morphology of MCF10A breast epithelial cells to a more mesenchymal-like phenotype. (Top row, from left to right) = 0hr, 24hr, and 48hr timepoints. (Bottom row, from left to right) = 72hr, 96hr, and 120hr. 3-4 following media replacement, cells clearly maintained a mesenchymal-like phenotype and by day 7 a morphologically heterogeneous population, indicating that MCF10A cells lacking RUNX1 mitotic bookmarking do not sustain normal mammary epithelial

59 phenotype, which mimics the initiating events of tumor formation. These results demonstrate that RUNX1 is involved in maintaining epithelial morphology and cell growth, proliferation and phenotypic gene expression, through direct-DNA binding/gene regulation through the cell cycle, including mitosis. Disruption of RUNX1 gene bookmarking in normal mammary epithelial cells initiates epithelial to mesenchymal transition, a key first event at the onset of breast cancer.

60

Figure 27. RUNX1 mitotically bookmarks RNA Pol I and II-transcribed genes for maintenance of normal mammary epithelial phenotype.

61

DISCUSSION

RUNX1 associates with mitotic chromatin at RNA Pol I and II-transcribed genes

This study establishes RUNX1 mitotic bookmarking as an epigenetic mechanism for coordinate regulation of both RNA Pol I- and II-transcribed genes critical for mammary epithelial proliferation, growth, and phenotype maintenance.

Pharmacological inhibition of RUNX1 DNA-binding in MCF10A cells causes a transformation to a mesenchymal-like phenotype, indicating RUNX1 DNA-binding is required for normal epithelial maintenance.

Consistent with published reports in other cell types [172-179], MCF10A breast epithelial cells contain RUNX1 signal throughout the interphase nucleus and , with a slightly increased localization to the periphery of nucleoli, the sites of ribosomal

RNA biogenesis. . Importantly, RUNX1 localizes to major (bright, punctate) and minor foci on mitotic chromatin in all topologically distinct substages of mitosis. ChIP-Seq of

A, M, and G1 populations revealed mitotically bookmarked RNA Pol I- and II- transcribed genes that regulate cell growth, proliferation, and phenotype.

Pharmacological inhibition of RUNX1 in asynchronous MCF10A cells followed by qPCR of target genes revealed both up- and downregulation depending on the specific gene, indicating a regulatory role of RUNX1 in controlling the expression of these genes.

Recent reports suggest that RUNX1 is a tumor suppressor in breast cancer [9,

241]. Consistent with these results, RUNX1 often acquires loss of function mutations within BrCa [168, 226, 261]. We have also shown that loss of RUNX1 is coupled with

62 activation of TGFβ signaling pathway, although the precise mechanism for maintenance of epithelial phenotype by RUNX1 remained unclear. Here, we have defined for the first time that loss of RUNX1-DNA interaction, specifically through disruption of RUNX1-

CBF-β interaction, results in a distinct morphological change that may be a precursor to longer term transformation in breast cancer-related cells. This association of RUNX1 with chromatin throughout the cell cycle suggests that RUNX1-DNA binding activity is necessary through mitosis to establish proper phenotype upon cell division.

Refined Role of RUNX1 in mammary gland development

Estrogen receptor plays a vital role in directing mammary gland development.

Estrogen is recognized by (ERα) on breast epithelial cells during development to help direct a regulated expansion and ductal invasion into the breast tissue to form the mammary gland [58-60]. Uncontrolled differentiation and proliferation through ERα signaling contributes to breast cancer phenotype, where about 60% of total diagnosed breast cancers are “ER+” (subtype classification as luminal A) and respond to estrogen receptor activity [15]. Therefore, ERα binding sites are among the most important to understand breast cancer progression.

RUNX1 has been shown to interact with ERα, potentially forming a co-regulatory complex, at both enhancer regions and transcriptional start sites (TSSs) to modify the expression of ER-responsive specific genes [230, 241]. Our ChIP-Seq results coupled with publicly available data sets revealed RUNX1 bookmarking of ERα-occupied, hormone-responsive genes, indicating that a subset of RUNX1 bookmarked genes is

63 regulated by ERα signaling and may be critical for maintenance of breast epithelial phenotype.

RUNX1 was discovered to play an important role in directing ERα to a subset of genes which lack an estrogen response element (ERE) [230]. The classical mode of ERα binding to target genes involves the recognition of an ERE or half-ERE sites. The non- classical mode of ERα binding involves a protein-protein interaction, described as

“tethering”, where ERα binds to new loci through interaction with another transcription factor. Evidence supports the physiological significance of this ERα tethering, as selective estrogen receptor modulators (SERMs), such as tamoxifen, regulate ERα through agonist binding and therefore disrupt any potential tethering interactions [262].

These ERE-independent genes are also differentially expressed within breast tumors and respond differently to SERM treatment. [263]. Motif analysis of the ERE-independent,

ERα bound genes revealed RUNX1 as one of the top motifs, thus indicating its role in tethering ERα to these genes. Interestingly, comparing our RUNX1 mitotically bookmarked genes in MCF10A cells with the genes bound by ERα and identified to be either upregulated or downregulated in response to estrogen, there are overlapping genes

(Fig. 5D). GSEA of our RUNX1 mitotically bookmarked genes further supports this role as two of the top 10 hallmark gene sets (the first and fourth most significant) are involved in estrogen response (Fig. 2C). These findings suggest that RUNX1 mitotically bookmarks genes that are occupied by ERα, are sensitive to estrogen treatment and may be vital in proper breast epithelial cell growth and phenotype.

64

RUNX1 regulation of MYC

Our data revealed that RUNX1 binds to an enhancer region of MYC about 67kb upstream of the transcriptional start site (TSS), as well as about 2500bp downstream of the TSS (data not shown). This enhancer region was shown to be critical for MYC activation through ERα and AP-1 mediated transactivation [264]. Furthermore, the MYC enhancer region contains multiple ERE-half sites as well as two AP1 sites, with only 1

ERE half-site and 1 AP-1 site being critical for this transactivation to occur. AP-1 was the other motif identified to be important in tethering ERα to target genes [230]. Although the role for RUNX1 in to the context of ERα transactivation of MYC is not understood, it most likely has a role based on our ChIP-Seq findings. c-Myc is recognized as a key protein in promoting cell growth through control of RNA Pol I, II and III mechanisms responsible for rRNA biogenesis [265]. In addition to RUNX1 directly binding to rDNA promoters and repressing global protein synthesis, our finding that RUNX1 binds to this

Myc enhancer suggest that RUNX1 also indirectly regulates rRNA biogenesis through regulating MYC expression. Both MYC and RUNX DNA binding motifs are present in rDNA repeats [99, 266, 267]. RUNX1 and RUNX2 have been previously shown to repress rDNA expression and affect global protein synthesis in different cellular models other than breast [98-100, 118]. Therefore, RUNX1 likely plays a vital role in regulating protein synthesis to control cellular growth for mammary epithelial cells through direct and indirect means.

65

RUNX1 regulation of HES1

Using GSEA to identify and prioritize RUNX1 mitotically bookmarked genes, one candidate gene of interest is hairy and enhancer of split-1 (HES1). Hes1 is a transcription factor which represses genes involved in cellular development, and is primarily regulated by NOTCH signaling, one of our top ten overlapping hallmark gene sets [268, 269]. HES1 was recently shown to have a role in initiating EMT within mammary cells [259]. Upon 17β-estradiol treatment of MCF7 cells, Hes1 protein levels decrease and a proliferative phenotype is observed. However, overexpression of HES1 in

T47D cells was capable of overcoming this 17β-estradiol specific proliferative effect

[259]. Over expression of HES1 was capable of mediating estrogen’s proliferative effect in MCF7 cells by preventing upregulation of proliferating cell nuclear antigen (PCNA)

[259].

However, the role of HES1 in breast epithelial cells is of current debate. Hes1 is known to interact with transducin like enhancer of split 1 (TLE1) via conserved WRPW motifs and recruit histone deacetylases (HDACs) to target genes for repression [270-272].

Products from NOTCH signaling within bone marrow, including HES1, have been shown to inhibit RUNX2, an osteogenic transcription factor, to maintain a pool a mesenchymal progenitor cells [273]. It was suggested that this model poises bone marrow to either activate NOTCH signaling for increased mesenchymal progenitors, or repress NOTCH signaling for increased bone differentiation. RUNX3 was shown to regulate NOTCH1 signaling through interaction with other co-regulatory proteins at the HES1 promoter, thus repressing HES1 [274]. This repression was achieved with co-repressing proteins,

66 including TLE1, that interact with RUNX3. RUNX1 also contains the conserved

VWRPY motif that mediates interaction with TLE1 for HDAC recruitment and repression of target genes [147, 275]. Our results indicated that inhibition of RUNX1 led to an increase in HES1 levels, suggesting RUNX1 acts in part to repress HES1 expression in MCF10A cells similar to RUNX3 activity in hepatocellular carcinoma cells [274].

More recently, HES1 was shown to have a prominent role in proliferation and invasion of

MCF7 cells [258]. Silenced HES1 led to a downregulation of p-Akt signaling in MDA-

MB-231 breast cancer cells and ultimately prevented EMT. This finding is also in agreement with our hypothesis that RUNX1 mitotically bookmarks HES1 for repression in MCF10A breast epithelial cells.

RUNX1 regulation of H2AFX

Another important RNA Pol II-transcribed gene mitotically bookmarked by

RUNX1 and critical for maintaining mammary epithelial phenotype is histone variant

H2AX (H2AFX). Silencing H2AFX in colon cancer cells led to induction of EMT through activation of SNAIL2/SLUG and ZEB1 [276]. This regulation was also reported in

MCF10A breast epithelial cells, with the key difference that TWIST1 was activated instead of ZEB1 [260]. It is shown that SNAIL2/SLUG activation is minor in comparison to TWIST1 activation in breast epithelial cells, however TWIST1 is activated in both breast epithelial cells and colon cancer cells upon H2AFX silencing. Potentially, the lack of TWIST1 expression in breast epithelial cells from our experiments could be due to the less transformed phenotype of MCF10A cells in comparison to the more

67

aggressive/malignant phenotype of HCT116 and HCT15 cells, resulting in differences in

EMT transcription factor expression, although this requires further investigation. Our

RUNX1 inhibition data suggest that RUNX1 activates H2AFX, as H2AFX expression

decreases at 24hr and 48 hr post AI-14-91 treatment.

Furthermore, our findings, although preliminary, demonstrate that RUNX1

inhibition results in an initial decrease in SNAIL2/SLUG expression within the first 12hrs,

followed by an almost 10-fold increase in SNAIL2/SLUG expression at hour 48 (Fig. 27).

Additionally, in AI-14-91-treated MCF10A cells, TWIST1 or ZEB1 expression

Figure 28. RUNX1 inhibition leads to initial repression followed by ~10-fold activation of SNAIL2/SLUG.

68

did not change significantly at any timepoint of treatment, indicating a critical role for

SNAIL2/SLUG in the initiation of breast epithelial cell EMT. Consistent with

SNAIL2/SLUG upregulation, E cadherin protein expression is decreased at hour 48 in AI-

14-91-treated MCF10A cells (Preliminary) (Fig. 28). However, there is no reciprocal

increase in vimentin signal at this timepoint. We hypothesize that RUNX1 mitotically

bookmarks H2AFX for its activation to prevent SNAIL2/SLUG expression and

Figure 29. Western blot analysis of MCF10A cells treated with either inhibitor for 6, 12, 24, or 48hr for EMT targets CDH1 and Vimentin. Parental and MDA-MB-231 wells were loaded for maximum volume, not equal protein loading. 20ug of protein was ran for each inhibitor sample. subsequently initiate of EMT in breast epithelial cells.

RUNX1 regulation of lncRNAs NEAT1/NEAT2 (MALAT1)

Nuclear enriched abundant transcript 1 (NEAT1) is an important long non-

coding RNA (lncRNA) that was identified in our ChIP-Seq data to be mitotically

bookmarked by RUNX1 (data not shown). There is an established role for NEAT1 in

normal mammary gland development and lactation [277]. More recently, NEAT1 role in 69 breast cancer progression and tumorigenesis has been demonstrated. Upregulation of

NEAT1, caused by either epigenetic silencing or mutations in BRCA1, causes repression of miR-129-5p, a downstream target of NEAT1. Decreased levels of miR-129-5p led to increased WNT4 signaling, which in turn initiates cell proliferation, invasiveness, and stemness [278]. In a separate study utilizing patient tumor samples, NEAT1 was shown to be upregulated in breast tumor tissue in comparison to matched normal breast tissue and this higher expression within tumors correlated with lymph node metastasis and poorer prognoses in patients [279]. In addition to breast cancer, NEAT1 also has roles in the formation and progression in a variety of other cancers, demonstrating its potential importance in tissue development [Reviewed in [257]]. Our ChIP-Seq results show

RUNX1 mitotically bookmarks the TSS and promoter region of NEAT1, potentially implicating a role in regulating its expression in MCF10A epithelial cells. Although further studies are necessary, it is plausible that RUNX1 mitotically bookmarks NEAT1, an oncogene in breast cancer, in MCF10A cells to prevent its expression, thus stabilizing the mammary epithelial phenotype.

Requirement for mitotic bookmarking of genes in maintenance of an epithelial

phenotype

Our findings are the first to demonstrate mitotic gene bookmarking as an important mechanism for maintenance of normal mammary epithelial phenotype. Equally noteworthy, our study shows that disruption of mitotic gene bookmarking specifically elicits a phenotypic change. Another novel finding of current study is the mitotic

70 bookmarking of lncRNAs by a transcription factor. RUNX1 was recently shown to regulate lncRNAs NEAT1 and NEAT2 (MALAT1) [280], but the mechanism by which

RUNX1 regulates these lncRNAs was not established. Our findings provide the mechanism for regulation of these lncRNAs in breast epithelial cells. Occupancy of the

Myc enhancer by RUNX1 is another key finding of the current study, with novel regulatory implications for the onset and progression of breast cancer.

In summary, this work establishes a novel epigenetic mechanism where RUNX1 mitotically bookmarks RNA Pol I and II transcribed genes for coordinate regulation of normal mammary epithelial proliferation, growth, and phenotype maintenance.

Disruption of RUNX1 bookmarking leads to initiation of epithelial to mesenchymal transition, a key event in breast cancer onset, and strongly suggests the requirement of

RUNX1 bookmarking for proper maintenance of mammary epithelial phenotype.

71

MATERIALS AND METHODS

Cell Culture Techniques

Breast epithelial (MCF10A) cells were cultured in DMEM/F-12 50/50 mixture

(Corning™, Corning, NY). Culturing media was also supplemented with horse serum to

5% (GIBCO, Grand Island, NY), human insulin to 10µg/mL (Sigma Aldrich, St. Louis,

MO), human epidermal growth factor to 20ng/mL (PeproTech, Rocky Hill, NJ), cholera toxin to 100ng/mL (Thomas Scientific, Swedesboro, NJ), hydrocortisone to 500ng/mL

(Sigma Aldrich, St. Louis, MO), Penicillin-Streptomycin to 100U/mL (Thermo Fisher

Scientific, Ashville, NC), and L-Glutamine to 2mM (Thermo Fisher Scientific, Ashville,

NC).

For mitotic arrest of parental MCF10A cells, culturing media was supplemented with 50ng/mL of Nocodazole (Sigma Aldrich, St. Louis, MO) and incubated with cells for 16hrs. Supplementing culturing media with equivalent volumes of DMSO (Sigma

Aldrich, St. Louis, MO) served as a negative control. For DMSO-treated and mitotically arrested populations of MCF10A cells, harvests were conducted following the 16hr incubation. For G1 (released from mitotic arrest) populations of MCF10A cells, the nocodazole-supplemented culturing media was replaced with normal culturing media and incubated with cells for 3hrs. Following the 3hr incubation, released populations of cells were harvested for subsequent analysis.

Core binding factor – Beta (CBFβ) inhibitors AI-4-88 and AI-14-91 were given to us from John H. Bushweller (University of Virginia) and used to evaluate RUNX1

72 inhibition in MCF10A cells. Protein synthesis evaluation by immunofluorescence was conducted following manufacturer protocol (K715-100, BioVision, San Francisco, CA).

Protein Expression and Localization

SDS-PAGE was performed to visualize protein expression within MCF10A cells.

8% SDS resolving gels and 4% stacking gels were prepared in-house (National

Diagnostics, Atlanta, GA). Pelleted cells were resuspended in RIPA buffer and incubated on ice for 30min. Following incubation, cell lysates were sonicated using Q700 Sonicator

(QSonica, Newtown, CT). Lysates were sonicated using 7 cycles of 10 seconds on/ 30 seconds off at power setting. Lysates were then clarified by centrifugation at 15,000 rpm for 30min at 4℃. Protein concentration in the remaining supernatant was quantified using a PierceTM BCA Protein Assay Kit (Thermo Fisher Scientific, Ashville, NC).

Electrophoresis was performed at 160V for 15min followed by 200V for 45min.

Overnight wet transfer of protein into PVDF membranes was performed at 30V for 18hr in 4℃. PVDF membranes were blocked at room temperature in 5% BSA in 1X TBST.

For protein detection, primary antibodies raised against UBF (sc-13125, Santa Cruz

Biotechnology, Dallas, TX), RUNX1 (4334S, Technologies, Danvers,

MA), Cyclin B (4138S, Cell Signaling Technologies, Danvers, MA), Beta-Actin (3700S,

Cell Signaling Technologies, Danvers, MA), and CDT1 (ab70829, AbCam, Cambridge,

UK) were used at 1:1000 dilution for western blotting. Lamin B1 (ab16048, AbCam,

Cambridge, UK) was used at 1:2000 dilution for western blotting. Primary antibodies were diluted in 5% BSA in TBS-T and incubated with blots overnight at 4℃. Blots were

73 washed four separate times with PBS-T or TBST. Goat anti-mouse IgG HRP conjugated

(31460, Invitrogen, Carlsbad, CA) secondary antibody was incubated with blots at 1:5000 and incubated for 1hr at room temperature with mild agitation. HRP conjugated goat anti- rabbit IgG (31430, Thermo Fisher Scientific, Ashville, NC) secondary antibody was incubated with blots at 1:1000, 1:2000, or 1:5000 and incubated for 1hr at room temperature with mild agitation. Blots were developed using Clarity Western ECL

Substrate (Bio-Rad, Hercules, CA) following manufacturer’s instructions. Blots were exposed to visualize protein and images were captured using Molecular Imager® Chemi docTM XRS+ Imaging System (Bio-Rad, Hercules, CA). Captured images were processed using Image Lab Software Version 5.1 (Bio-Rad, Hercules, CA).

Immunofluorescent microscopy was performed to observe distribution and localization of protein expression within MCF10A cells throughout all stages of mitosis and interphase. MCF10A cells were plated within a 6 well plate at 175,000 cells/mL on coverslips coated in gelatin (0.5% w/v solution in 1XPBS) and allowed to grow overnight. Coverslips were washed twice with sterile-filtered PBS at 4℃. Coverslips were then placed in room temperature fixative solution (1% MeOH-free Formaldehyde in

PBS) for 10min. After a sterile-filtered PBS wash, coverslips were transferred to permeabilization solution (0.25% Triton X-100 in PBS) for 20min on ice. Following another sterile-filtered PBS wash, coverslips were then blocked in sterile-filtered PBS supplemented with bovine serum albumin (PBSA) at 0.5% w/v (Sigma Aldrich, St.

Louis, MO). Coverslips were then incubated with primary antibody for 1hr at 37℃ in a humidified chamber. Primary antibodies were specific for RUNX1 at a dilution of 1:10

74

(4334S, Cell Signaling Technologies, Danvers, MA) and Upstream Binding Transcription

Factor (UBF) at a dilution of 1:200 (F-9 sc-13125, Santa Cruz Biotechnology, Dallas,

TX). Coverslips were washed four separate times in sterile-filtered PBSA following primary antibody incubation. Coverslips were then placed in secondary antibody for 1hr at 37℃ within a humidified chamber. Secondary antibodies used were goat anti-rabbit

IgG conjugated with Alexa Fluor 488 (A-11070, Life Technologies, Carlsbad, CA) and goat anti-mouse IgG conjugated with Alexa Fluor 594 (A-11005, Life Technologies,

Carlsbad, CA) diluted 1:800. Coverslips were then washed four times in sterile-filtered

PBSA. Staining of the coverslips for DNA was performed with 1.0µg DAPI in 0.1%

Triton X-100 and sterile-filtered PBSA for 5min on ice. Stained coverslips were washed once in 0.1% Triton X-100 in sterile-filtered PBSA, then two times with sterile filtered

PBS. Coverslips were mounted onto slides using ProLong Gold Antifade Mountant

(Thermo Fisher Scientific, Ashville, NC). Images were captured using a Zeiss Axio

Imager.Z2 fluorescent microscope and Hamamatsu ORCA-R2 C10600 digital camera.

Images were processed using ZEN 2012 software.

Confocal microscopy was performed on slides prepared as described above.

MCF10A breast epithelial cells were initially imaged with a Zeiss LSM 510 META confocal laser scanning microscope (Carl Zeiss Microscopy, LLC., Thornwood, NY,

USA) for a preliminary study to assess potential colocalization. At a later time, additional samples were imaged with a Nikon A1R-ER laser scanning confocal microscope (Nikon, Melville, NY, USA) for complete colocalization analysis. Images were acquired with the resonant scanner at a frame size of 1024 X 1024 pixels with 8X

75 averaging. Fluorescently labeled samples were excited by laser lines sequentially imaged in channel series mode. The DAPI signal was excited with a 405 nm laser and collected with a 425-475 nm band pass filter, Alexa 488 was excited with a 488 nm laser and collected with a 500-550 nm band pass filter, and Alexa 568 with a 561 nm laser and collected with a 570-620 nm band pass filter. Images were captured with a Plan-Fluor

40X (1.3 NA) objective lens. The confocal pinhole was initially set to 1.2 Airy Unit diameter for the 561 nm excitation giving an optical section thickness of 0.41 µm. Images were acquired at 12-bit data depth, and all settings, including laser power, amplifier gain, and amplifier offset were established using a look-up table to provide an optimal gray- scale intensity. All images were acquired using matching imaging parameters.

Images were acquired with at 40X objective were subject to colocalization analysis via Volocity version 6.3.0 (Perkin Elmer, Waltham, MA, USA). Images were opened in the colocalization tab. Cell nuclei, indicated by the DAPI signal, were circled via the ROI tool. At least 15 interphase and 15 metaphase cells were identified within captured images and appropriate thresholds were manually determined to eliminate background fluorescence for calculating Pearsons and Manders correlation coefficients between RUNX1 and UBF.

Images were also viewed in NIS Elements version 5.02.01 and analyzed using the line profiling tool. Overlaying DAPI, RUNX1, and UBF fluorescent intensities from individual channels along the line profile revealed overlapping peak intensities between the RUNX1 and UBF channels, thus indicating colocalization.

76

Molecular Techniques and Sequencing/Bioinformatics

Total RNA was isolated from MCF10A cells using TRIzolTM Reagent

(Invitrogen, Carlsbad, CA) and Direct-ZolTM RNA MiniPrep isolation kit (Zymo

Research, Irvine, CA) following manufacturer instructions. cDNA was created using

SuperScript IV® First-Strand Synthesis System for RT-PCR (ThermoFisher, Asheville,

NC). Resulting cDNA were quantified on a Qubit 2.0 Fluorometer (Invitrogen, Carlsbad,

CA) and diluted to 500pg/μL. Equal amounts of DNA template were loaded for samples analyzed by qPCR.

Chromatin Immunoprecipitation was conducted on asynchronous (Asynch), mitotically arrested (M), and released from mitosis (G1) MCF10A breast epithelial cells.

Cells were fixed with 1% v/v MeOH-free Formaldehyde in PBS for 10min at room temperature. Formaldehyde fixation was neutralized using 2.5M Glycine and incubated with cells for 5min at room temperature. Two washes with PBS supplemented with cOmpleteTM EDTA-free Protease Inhibitor Cocktail (Sigma Aldrich, Saint Louis, MO) and MG-132 (Calbiochem-Millipore Sigma, Burlington, MA) were performed. For asynchronous and G1 populations of cells, culture dishes were scraped to collect fixated lysate. Mitotic cells were isolated using a mitotic shake off. Mitotic cells were spun down at 1500rpm x 5min, resuspended in 1% v/v MeOH-free Formaldehyde in PBS, and neutralized with 2.5M Glycine for 5min. Fixed harvests were centrifuged at 1500rpm x

5min (4oC) and the supernatant was discarded. All fixed cell pellets were flash frozen in liquid nitrogen and stored at -80oC until lysis.

77

Fixed cell pellets were thawed on ice. Once thawed, pellets were lysed in a nuclear lysis buffer supplemented with cOmpleteTM EDTA-free Protease Inhibitor

Cocktail (Sigma Aldrich, Saint Louis, MO) and MG-132 (Calbiochem-Millipore Sigma,

Burlington, MA) with a volume that was approximately 5X the volume of pellet. Pellets were incubated in nuclear lysis buffer for 30min before being flash frozen down in liquid nitrogen. Lysates were thawed at room temperature but not allowed to reach room temperature. Sonication of the lysates were performed using a S220 focused ultra- sonicator (Covaris, Matthews, NC). Sonication parameters for each population of cells was as follows: Peak Watt 140W, Duty Factor 10, Cycle/Burst 200. M and G1 populations of cells were sonicated for 28min total whereas asynchronous populations of cells were sonicated for 36min. All samples were sonicated at 6oC. Following sonication, aliquots were spun down at 15,000rpm x 10min and 4oC. Following the spin, the resulting supernatants were pooled together and analyzed.

Sonicated lysate was boiled in 100oC for 15min with NaCl and elution buffer.

Boiled lysate was allowed to cool and treated with RNaseA (10ug/uL) for 10min at 37oC.

DNA was isolated using PureLinkTM PCR Purification Kit (K310001, ThermoFisher,

Ashville, NC) following manufacturer recommendations. Resulting DNA was quantified via nanodrop and 1-2μg was separated on a 1.5% agarose gel to observe relative fragment size distribution prior to generating ChIP reactions. Resulting DNA was also quantified via Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA) and analyzed by using a High

Sensitivity DNA Kit on a Bioanalyzer 2100 (Agilent, Santa Clara, CA).

78

For chromatin immunoprecipitation (ChIP) reactions, 150μg of sonicated chromatin was incubated with 10μg of RUNX1 antibody (4336BF, Cell Signaling

Technologies, Danvers, MA), diluted 1:10 in IP dilution buffer, and incubated overnight

(16-18hrs) at 4oC with mild agitation. Following incubation, 150μL of Protein A/G magnetic beads (Thermo Scientific – Pierce, Waltham, MA) per ug of antibody used were added to each IP reaction and incubated for 2-4hrs at 4oC with mild agitation. Beads were isolated from solution using a neodymium magnet separator and washed two times in two separate IP wash buffers. Lastly, beads were resuspended in an elution buffer and agitated in a thermomixer (Eppendorf, Hamburg, Germany) or vortexer at 1000rpm x

30min at room temperature. This elution step was repeated on the beads, beads discarded, and the resulting supernatant was incubated with NaCl overnight (16-18hrs) at 67oC to reverse formaldehyde crosslinks. DNA from RUNX1 ChIP samples were purified using

PureLinkTM PCR Purification Kit (K310001, ThermoFisher, Ashville, NC) following manufacturer recommendations.

DNA libraries were generated using Accel-NGS® 2S Plus DNA Library kit

(Swift Biosciences, Ann Arbor, MI) following manufacturers protocol. Input and

RUNX1 ChIP samples were normalized to 1ng prior to library generation. Libraries were amplified in an optional PCR step for 12 total cycles. Finalized libraries were double size selected using AMPure XP beads (0.8X and 0.2X volume ratios to sample), resulting in the majority fragments sized between 250-400bp. Next generation sequencing of pooled

ChIP libraries was performed by the University of Vermont Cancer Center - Vermont

Integrated Genomics Resource (VIGR).

79

Because we were specifically investigating rDNA, a customized build of hg38 was constructed that included normally masked regions of rDNA (Gencode U13369).

Since some (although not complete) rDNA sequence is present in the hg38 assembly, we masked all parts of hg38 that would normally be attributed to rDNA sequences (bedtools v2.25.0 maskfasta). Finally, we appended the rDNA sequence as a “unique” chromosome (chrU13369.1) to the masked hg38 fasta resulting in the hg38_rDNA assembly used for analysis.

Single-end, SE50 reads were processed pre-alignment by removing adapter reads

(Cutadapt v1.6) and trimming low quality base calls from both ends (FASTQ Quality

Trimmer 1.0.0; min score >= 20, window of 10, and step size of 1). Resulting reads were aligned to hg38_rDNA (STAR v2.4; splicing disabled with '--alignIntronMax 1'). Peaks were called and fold-enrichment (FE) bedGraph files were generated (MACS2 v2.1.0.20140616; callpeak at p-value e-5; and bdgcmp with FE method) [247].

Irreproducible Discovery Rate (IDR) was conducted using unpooled replicates with all peaks in pooled samples passing an IDR cutoff of 0.5 [281]. To reduce artificial peaks, we calculated strand cross-correlation for all peaks at a shift of 95 bp (the mean observed fragment size of 180 bp minus the read size of 85 bp) and unshifted [124]. We eliminated peaks that exhibited low shifted correlation (shifted correlation <.7) and those that exhibited high unshifted correlation relative to shifted (shifted – unshifted correlation

< .1). This increased retrieval of the RUNX1 motif and improved agreement with other

RUNX1 datasets. Passing peaks were annotated separately to mRNA and lncRNA transcript start sites (TSSs) using GENCODE v27 with a distance cutoff of 5000 bp.

80

Regional distribution of peaks was determined using the same annotation reference limited to the “basic” tag for exons and promoters.

SUPPLEMENTAL MATERIAL

Table 1. RUNX1 mitotically bookmarked genes in MCF10A breast epithelial cells. (N=417 total)

gene_name gene_id gene_type seqnames start end strand RUNX1_blocked_distance HSPB6 ENSG00000004776 protein_coding chr19 35758079 35758079 - 0 HCCS ENSG00000004961 protein_coding chrX 11111301 11111301 + 0 KMT2E ENSG00000005483 protein_coding chr7 105014179 105014179 + 1168 MYCBP2 ENSG00000005810 protein_coding chr13 77327050 77327050 - 0 TNFRSF12A ENSG00000006327 protein_coding chr16 3018445 3018445 + 2492 KDM7A ENSG00000006459 protein_coding chr7 140177035 140177035 - 1277 ETV1 ENSG00000006468 protein_coding chr7 13991425 13991425 - 0 ALDH3B1 ENSG00000006534 protein_coding chr11 68008578 68008578 + 4608 CDK11A ENSG00000008128 protein_coding chr1 1724324 1724324 - 110 SYPL1 ENSG00000008282 protein_coding chr7 106112576 106112576 - 473 SPAG9 ENSG00000008294 protein_coding chr17 51120865 51120865 - 7 CRY1 ENSG00000008405 protein_coding chr12 107093829 107093829 - 0 CSDE1 ENSG00000009307 protein_coding chr1 114758676 114758676 - 386 AKAP8L ENSG00000011243 protein_coding chr19 15419141 15419141 - 0 MATR3 ENSG00000015479 protein_coding chr5 139293648 139293648 + 0 WWTR1 ENSG00000018408 protein_coding chr3 149736714 149736714 - 79 ZNF839 ENSG00000022976 protein_coding chr14 102317377 102317377 + 139 RTEL1-TNFRSF6B ENSG00000026036 protein_coding chr20 63659300 63659300 + 535 SH3YL1 ENSG00000035115 protein_coding chr2 266398 266398 - 2314 MAT2B ENSG00000038274 protein_coding chr5 163503114 163503114 + 2157 CLEC16A ENSG00000038532 protein_coding chr16 10944488 10944488 + 0 HSPA5 ENSG00000044574 protein_coding chr9 125241330 125241330 - 87 TPR ENSG00000047410 protein_coding chr1 186375693 186375693 - 0 POLR3E ENSG00000058600 protein_coding chr16 22297375 22297375 + 767 ATP2B4 ENSG00000058668 protein_coding chr1 203626561 203626561 + 0 PRDM6 ENSG00000061455 protein_coding chr5 123089121 123089121 + 509 SPAG4 ENSG00000061656 protein_coding chr20 35615892 35615892 + 394 MRPS24 ENSG00000062582 protein_coding chr7 43869893 43869893 - 90 WISP2 ENSG00000064205 protein_coding chr20 44714844 44714844 + 0 CNN2 ENSG00000064666 protein_coding chr19 1026581 1026581 + 1785 SBNO2 ENSG00000064932 protein_coding chr19 1174283 1174283 - 3747 KARS ENSG00000065427 protein_coding chr16 75648643 75648643 - 593 PPP2R5A ENSG00000066027 protein_coding chr1 212285537 212285537 + 607 KLF6 ENSG00000067082 protein_coding chr10 3785281 3785281 - 3442 TP53BP1 ENSG00000067369 protein_coding chr15 43510728 43510728 - 0 DHX8 ENSG00000067596 protein_coding chr17 43483865 43483865 + 0 COASY ENSG00000068120 protein_coding chr17 42561467 42561467 + 447 PRR11 ENSG00000068489 protein_coding chr17 59155499 59155499 + 41 ASNS ENSG00000070669 protein_coding chr7 97872542 97872542 - 56 OSBPL3 ENSG00000070882 protein_coding chr7 24981634 24981634 - 784 ALDH3A2 ENSG00000072210 protein_coding chr17 19648136 19648136 + 654 FSCN1 ENSG00000075618 protein_coding chr7 5592823 5592823 + 892 REXO2 ENSG00000076043 protein_coding chr11 114439386 114439386 + 0 ARHGEF1 ENSG00000076928 protein_coding chr19 41883161 41883161 + 4614 NFKB2 ENSG00000077150 protein_coding chr10 102394110 102394110 + 0 FAM76B ENSG00000077458 protein_coding chr11 95790409 95790409 - 184 CARMIL1 ENSG00000079691 protein_coding chr6 25279078 25279078 + 0 CNOT4 ENSG00000080802 protein_coding chr7 135510127 135510127 - 0 EXD2 ENSG00000081177 protein_coding chr14 69191511 69191511 + 0 FAM135A ENSG00000082269 protein_coding chr6 70412941 70412941 + 389

81

GSK3B ENSG00000082701 protein_coding chr3 120094417 120094417 - 648 ZNF446 ENSG00000083838 protein_coding chr19 58474017 58474017 + 1281 RPS5 ENSG00000083845 protein_coding chr19 58386400 58386400 + 4810 AKR1B1 ENSG00000085662 protein_coding chr7 134459284 134459284 - 0 NFX1 ENSG00000086102 protein_coding chr9 33290511 33290511 + 0 TXLNG ENSG00000086712 protein_coding chrX 16786427 16786427 + 113 PPP1R15A ENSG00000087074 protein_coding chr19 48872392 48872392 + 0 FTL ENSG00000087086 protein_coding chr19 48965301 48965301 + 0 AURKA ENSG00000087586 protein_coding chr20 56392337 56392337 - 34 SNX5 ENSG00000089006 protein_coding chr20 17968980 17968980 - 0 CFAP61 ENSG00000089101 protein_coding chr20 20052514 20052514 + 0 ZNF302 ENSG00000089335 protein_coding chr19 34677639 34677639 + 0 HECTD1 ENSG00000092148 protein_coding chr14 31207804 31207804 - 372 SUCO ENSG00000094975 protein_coding chr1 172532349 172532349 + 339 FKBP5 ENSG00000096060 protein_coding chr6 35728583 35728583 - 82 GADD45B ENSG00000099860 protein_coding chr19 2476122 2476122 + 0 LGALS1 ENSG00000100097 protein_coding chr22 37675608 37675608 + 261 DDX17 ENSG00000100201 protein_coding chr22 38507660 38507660 - 917 PACSIN2 ENSG00000100266 protein_coding chr22 43015145 43015145 - 35 TNRC6B ENSG00000100354 protein_coding chr22 40044817 40044817 + 0 HDAC10 ENSG00000100429 protein_coding chr22 50251405 50251405 - 284 AP4S1 ENSG00000100478 protein_coding chr14 31025106 31025106 + 1656 TRIP11 ENSG00000100815 protein_coding chr14 92040896 92040896 - 458 CSTF1 ENSG00000101138 protein_coding chr20 56392371 56392371 + 0 CRNKL1 ENSG00000101343 protein_coding chr20 20056046 20056046 - 3250 MANBAL ENSG00000101363 protein_coding chr20 37289638 37289638 + 0 JAG1 ENSG00000101384 protein_coding chr20 10674107 10674107 - 1072 VAPA ENSG00000101558 protein_coding chr18 9914002 9914002 + 857 MYL12A ENSG00000101608 protein_coding chr18 3247481 3247481 + 0 POLI ENSG00000101751 protein_coding chr18 54269404 54269404 + 0 SETD6 ENSG00000103037 protein_coding chr16 58515479 58515479 + 0 VAC14 ENSG00000103043 protein_coding chr16 70801161 70801161 - 0 ESRP2 ENSG00000103067 protein_coding chr16 68238102 68238102 - 84 HCFC1R1 ENSG00000103145 protein_coding chr16 3024286 3024286 - 2123 ZNF500 ENSG00000103199 protein_coding chr16 4767624 4767624 - 95 GSPT1 ENSG00000103342 protein_coding chr16 11916082 11916082 - 0 STX4 ENSG00000103496 protein_coding chr16 31032889 31032889 + 117 GABPB1 ENSG00000104064 protein_coding chr15 50355408 50355408 - 0 GSR ENSG00000104687 protein_coding chr8 30727926 30727926 - 4604 RSPH6A ENSG00000104941 protein_coding chr19 45815319 45815319 - 1783 CCDC130 ENSG00000104957 protein_coding chr19 13731760 13731760 + 196 PTOV1 ENSG00000104960 protein_coding chr19 49850735 49850735 + 2127 TBCB ENSG00000105254 protein_coding chr19 36114289 36114289 + 563 POLR2I ENSG00000105258 protein_coding chr19 36115346 36115346 - 0 OVOL3 ENSG00000105261 protein_coding chr19 36111151 36111151 + 3701 CCDC9 ENSG00000105321 protein_coding chr19 47255980 47255980 + 269 NKG7 ENSG00000105374 protein_coding chr19 51372715 51372715 - 4262 ETFB ENSG00000105379 protein_coding chr19 51366418 51366418 - 1534 NAPA ENSG00000105402 protein_coding chr19 47515240 47515240 - 0 CYTH2 ENSG00000105443 protein_coding chr19 48469032 48469032 + 0

82

CCDC114 ENSG00000105479 protein_coding chr19 48321894 48321894 - 0 PLEKHA4 ENSG00000105559 protein_coding chr19 48868632 48868632 - 3481 RPL18A ENSG00000105640 protein_coding chr19 17859876 17859876 + 0 GSK3A ENSG00000105723 protein_coding chr19 42242625 42242625 - 703 CFAP69 ENSG00000105792 protein_coding chr7 90245174 90245174 + 0 DNAJC2 ENSG00000105821 protein_coding chr7 103344873 103344873 - 0 HBP1 ENSG00000105856 protein_coding chr7 107168961 107168961 + 595 CBLL1 ENSG00000105879 protein_coding chr7 107743697 107743697 + 331 CPED1 ENSG00000106034 protein_coding chr7 120988677 120988677 + 493 MOSPD3 ENSG00000106330 protein_coding chr7 100612102 100612102 + 0 IMPDH1 ENSG00000106348 protein_coding chr7 128410252 128410252 - 0 CHCHD3 ENSG00000106554 protein_coding chr7 133082088 133082088 - 0 TFAM ENSG00000108064 protein_coding chr10 58385022 58385022 + 0 RPL28 ENSG00000108107 protein_coding chr19 55385345 55385345 + 64 MLX ENSG00000108788 protein_coding chr17 42567068 42567068 + 4653 SC5D ENSG00000109929 protein_coding chr11 121292453 121292453 + 0 HSPA8 ENSG00000109971 protein_coding chr11 123063230 123063230 - 1449 ARHGEF17 ENSG00000110237 protein_coding chr11 73308289 73308289 + 838 BIRC2 ENSG00000110330 protein_coding chr11 102347211 102347211 + 15 HPS5 ENSG00000110756 protein_coding chr11 18322198 18322198 - 0 GTF2H1 ENSG00000110768 protein_coding chr11 18322295 18322295 + 0 ATP5B ENSG00000110955 protein_coding chr12 56646068 56646068 - 0 VPS29 ENSG00000111237 protein_coding chr12 110502117 110502117 - 0 NT5DC3 ENSG00000111696 protein_coding chr12 103841197 103841197 - 620 MED23 ENSG00000112282 protein_coding chr6 131628229 131628229 - 0 CCND3 ENSG00000112576 protein_coding chr6 42050357 42050357 - 4212 PRSS16 ENSG00000112812 protein_coding chr6 27247701 27247701 + 4011 FAF2 ENSG00000113194 protein_coding chr5 176447628 176447628 + 462 LNPEP ENSG00000113441 protein_coding chr5 96935394 96935394 + 0 SKP1 ENSG00000113558 protein_coding chr5 134177038 134177038 - 0 TTC33 ENSG00000113638 protein_coding chr5 40755975 40755975 - 0 OGG1 ENSG00000114026 protein_coding chr3 9749944 9749944 + 0 CEP70 ENSG00000114107 protein_coding chr3 138594538 138594538 - 0 HES1 ENSG00000114315 protein_coding chr3 194136145 194136145 + 1369 EIF1B ENSG00000114784 protein_coding chr3 40309684 40309684 + 0 MPV17 ENSG00000115204 protein_coding chr2 27325680 27325680 - 2429 RAB3GAP1 ENSG00000115839 protein_coding chr2 135052265 135052265 + 110 TMEM9 ENSG00000116857 protein_coding chr1 201171574 201171574 - 431 ESYT2 ENSG00000117868 protein_coding chr7 158830253 158830253 - 0 FASTKD2 ENSG00000118246 protein_coding chr2 206765357 206765357 + 0 DCLRE1B ENSG00000118655 protein_coding chr1 113905141 113905141 + 0 CCNI ENSG00000118816 protein_coding chr4 77076005 77076005 - 23 RBM18 ENSG00000119446 protein_coding chr9 122264839 122264839 - 0 NPC2 ENSG00000119655 protein_coding chr14 74494177 74494177 - 562 DUSP1 ENSG00000120129 protein_coding chr5 172771195 172771195 - 1120 SIL1 ENSG00000120725 protein_coding chr5 139293557 139293557 - 0 FAM126A ENSG00000122591 protein_coding chr7 23014130 23014130 - 0 ARL4A ENSG00000122644 protein_coding chr7 12686856 12686856 + 0 TSFM ENSG00000123297 protein_coding chr12 57782589 57782589 + 0 BAZ2B ENSG00000123636 protein_coding chr2 159616692 159616692 - 400 MC3R ENSG00000124089 protein_coding chr20 56248732 56248732 + 2454 SRSF6 ENSG00000124193 protein_coding chr20 43457928 43457928 + 0 IL17C ENSG00000124391 protein_coding chr16 88638591 88638591 + 3850 USP22 ENSG00000124422 protein_coding chr17 21043760 21043760 - 77 LYPD3 ENSG00000124466 protein_coding chr19 43465660 43465660 - 1905 RIOK1 ENSG00000124784 protein_coding chr6 7389496 7389496 + 402 LRRC29 ENSG00000125122 protein_coding chr16 67227048 67227048 - 0 POLR1B ENSG00000125630 protein_coding chr2 112541915 112541915 + 224 SDCBP2 ENSG00000125775 protein_coding chr20 1329239 1329239 - 3595 GZF1 ENSG00000125812 protein_coding chr20 23362182 23362182 + 108

83

MKKS ENSG00000125863 protein_coding chr20 10434222 10434222 - 0 MGME1 ENSG00000125871 protein_coding chr20 17968913 17968913 + 0 CAPNS1 ENSG00000126247 protein_coding chr19 36139575 36139575 + 0 BCL2L12 ENSG00000126453 protein_coding chr19 49665142 49665142 + 662 IRF3 ENSG00000126456 protein_coding chr19 49665875 49665875 - 0 RRAS ENSG00000126458 protein_coding chr19 49640201 49640201 - 1543 SCAF1 ENSG00000126461 protein_coding chr19 49642125 49642125 + 0 WFIKKN1 ENSG00000127578 protein_coding chr16 629239 629239 + 150 ICE2 ENSG00000128915 protein_coding chr15 60479160 60479160 - 0 MINDY2 ENSG00000128923 protein_coding chr15 58771192 58771192 + 392 KLK10 ENSG00000129451 protein_coding chr19 51020175 51020175 - 781 SGO1 ENSG00000129810 protein_coding chr3 20186292 20186292 - 0 DOHH ENSG00000129932 protein_coding chr19 3500940 3500940 - 45 PHF10 ENSG00000130024 protein_coding chr6 169725566 169725566 - 2592 CYP2E1 ENSG00000130649 protein_coding chr10 133520406 133520406 + 0 ZC3H4 ENSG00000130749 protein_coding chr19 47113752 47113752 - 703 GGT7 ENSG00000131067 protein_coding chr20 34872860 34872860 - 3473 ACSS2 ENSG00000131069 protein_coding chr20 34872146 34872146 + 4187 THOC6 ENSG00000131652 protein_coding chr16 3024027 3024027 + 1864 TRAF7 ENSG00000131653 protein_coding chr16 2155698 2155698 + 4488 TOP2A ENSG00000131747 protein_coding chr17 40417950 40417950 - 16 SNRPA1 ENSG00000131876 protein_coding chr15 101295282 101295282 - 162 SLX1A ENSG00000132207 protein_coding chr16 30193887 30193887 + 0 H3F3B ENSG00000132475 protein_coding chr17 75785893 75785893 - 646 UNK ENSG00000132478 protein_coding chr17 75784771 75784771 + 0 NIP7 ENSG00000132603 protein_coding chr16 69339430 69339430 + 0 FOPNL ENSG00000133393 protein_coding chr16 15888625 15888625 - 0 AGAP3 ENSG00000133612 protein_coding chr7 151085831 151085831 + 3555 THUMPD3 ENSG00000134077 protein_coding chr3 9362842 9362842 + 0 AP4B1 ENSG00000134262 protein_coding chr1 113905201 113905201 - 0 KCTD1 ENSG00000134504 protein_coding chr18 26657401 26657401 - 978 TES ENSG00000135269 protein_coding chr7 116210493 116210493 + 355 ATP5G2 ENSG00000135390 protein_coding chr12 53677408 53677408 - 819 MPHOSPH6 ENSG00000135698 protein_coding chr16 82170226 82170226 - 0 FHOD1 ENSG00000135723 protein_coding chr16 67247658 67247658 - 1256 IGF2BP3 ENSG00000136231 protein_coding chr7 23470467 23470467 - 3992 SLC2A8 ENSG00000136856 protein_coding chr9 127397138 127397138 + 0 MYC ENSG00000136997 protein_coding chr8 127735434 127735434 + 2796 TFAP2A ENSG00000137203 protein_coding chr6 10419659 10419659 - 329 FLOT1 ENSG00000137312 protein_coding chr6 30742733 30742733 - 102 IER3 ENSG00000137331 protein_coding chr6 30744554 30744554 - 1218 TAF8 ENSG00000137413 protein_coding chr6 42050513 42050513 + 4368 FAM8A1 ENSG00000137414 protein_coding chr6 17600355 17600355 + 0 RTCA ENSG00000137996 protein_coding chr1 100266207 100266207 + 0 MDH1B ENSG00000138400 protein_coding chr2 206765547 206765547 - 29 G3BP2 ENSG00000138757 protein_coding chr4 75724525 75724525 - 0 USO1 ENSG00000138768 protein_coding chr4 75724593 75724593 + 0 TMEM19 ENSG00000139291 protein_coding chr12 71686087 71686087 + 140 TARBP2 ENSG00000139546 protein_coding chr12 53500921 53500921 + 0 MAP3K12 ENSG00000139625 protein_coding chr12 53500063 53500063 - 394 GTF2A2 ENSG00000140307 protein_coding chr15 59657541 59657541 - 0 USP3 ENSG00000140455 protein_coding chr15 63504511 63504511 + 855 ABHD2 ENSG00000140526 protein_coding chr15 89087459 89087459 + 395 TOB1 ENSG00000141232 protein_coding chr17 50867978 50867978 - 0 CCDC97 ENSG00000142039 protein_coding chr19 41310189 41310189 + 0 EMP3 ENSG00000142227 protein_coding chr19 48321509 48321509 + 360 SAE1 ENSG00000142230 protein_coding chr19 47113274 47113274 + 225 RNPEPL1 ENSG00000142327 protein_coding chr2 240565804 240565804 + 0 RPL11 ENSG00000142676 protein_coding chr1 23691779 23691779 + 0 PIGK ENSG00000142892 protein_coding chr1 77219430 77219430 - 0

84

TINAGL1 ENSG00000142910 protein_coding chr1 31576485 31576485 + 0 RPS8 ENSG00000142937 protein_coding chr1 44775251 44775251 + 54 ACP1 ENSG00000143727 protein_coding chr2 264140 264140 + 56 ANKMY1 ENSG00000144504 protein_coding chr2 240569209 240569209 - 2962 UBA3 ENSG00000144744 protein_coding chr3 69080408 69080408 - 4344 ARL6IP5 ENSG00000144746 protein_coding chr3 69084944 69084944 + 0 RPS3A ENSG00000145425 protein_coding chr4 151099573 151099573 + 0 RNF145 ENSG00000145860 protein_coding chr5 159210053 159210053 - 2023 TRIM41 ENSG00000146063 protein_coding chr5 181222499 181222499 + 668 ZMYM4 ENSG00000146463 protein_coding chr1 35268967 35268967 + 0 POMZP3 ENSG00000146707 protein_coding chr7 76627261 76627261 - 0 CSGALNACT1 ENSG00000147408 protein_coding chr8 19758029 19758029 - 0 MRRF ENSG00000148187 protein_coding chr9 122264603 122264603 + 0 SYT8 ENSG00000149043 protein_coding chr11 1828307 1828307 + 639 SLX4IP ENSG00000149346 protein_coding chr20 10435303 10435303 + 774 ITGB1 ENSG00000150093 protein_coding chr10 33005792 33005792 - 4404 RAD9B ENSG00000151164 protein_coding chr12 110501655 110501655 + 237 EIF4E ENSG00000151247 protein_coding chr4 98930637 98930637 - 944 TMEM267 ENSG00000151881 protein_coding chr5 43483893 43483893 - 0 TADA1 ENSG00000152382 protein_coding chr1 166876327 166876327 - 0 DCLRE1C ENSG00000152457 protein_coding chr10 14954432 14954432 - 243 ZNF837 ENSG00000152475 protein_coding chr19 58381060 58381060 - 29 CCDC50 ENSG00000152492 protein_coding chr3 191329077 191329077 + 0 ASAP1 ENSG00000153317 protein_coding chr8 130443660 130443660 - 942 ANKRD29 ENSG00000154065 protein_coding chr18 23662885 23662885 - 113 TBRG1 ENSG00000154144 protein_coding chr11 124622836 124622836 + 0 ENPP3 ENSG00000154269 protein_coding chr6 131628442 131628442 + 0 VOPP1 ENSG00000154978 protein_coding chr7 55572988 55572988 - 1082 APOOL ENSG00000155008 protein_coding chrX 85003826 85003826 + 0 ZKSCAN2 ENSG00000155592 protein_coding chr16 25257931 25257931 - 72 TMBIM4 ENSG00000155957 protein_coding chr12 66170072 66170072 - 0 CARNMT1 ENSG00000156017 protein_coding chr9 75028423 75028423 - 0 C8orf37 ENSG00000156172 protein_coding chr8 95269201 95269201 - 149 CCT8 ENSG00000156261 protein_coding chr21 29073797 29073797 - 0 MAP3K7CL ENSG00000156265 protein_coding chr21 29077471 29077471 + 3401 PCDH1 ENSG00000156453 protein_coding chr5 141879246 141879246 - 0 RPL30 ENSG00000156482 protein_coding chr8 98046469 98046469 - 0 HKDC1 ENSG00000156510 protein_coding chr10 69220303 69220303 + 2357 C1orf27 ENSG00000157181 protein_coding chr1 186375838 186375838 + 0 GRHL3 ENSG00000158055 protein_coding chr1 24319322 24319322 + 2271 ZNF230 ENSG00000159882 protein_coding chr19 44002948 44002948 + 0 C21orf58 ENSG00000160298 protein_coding chr21 46323875 46323875 - 71 PCNT ENSG00000160299 protein_coding chr21 46324122 46324122 + 318 CLDND2 ENSG00000160318 protein_coding chr19 51369003 51369003 - 550 PSMC2 ENSG00000161057 protein_coding chr7 103344254 103344254 + 337 WDR90 ENSG00000161996 protein_coding chr16 649311 649311 + 1000 SNX7 ENSG00000162627 protein_coding chr1 98661701 98661701 + 183 PPP1R21 ENSG00000162869 protein_coding chr2 48440598 48440598 + 0 B3GALNT2 ENSG00000162885 protein_coding chr1 235504481 235504481 - 0 DUSP19 ENSG00000162999 protein_coding chr2 183078559 183078559 + 0 LYSMD1 ENSG00000163155 protein_coding chr1 151165948 151165948 - 410 PRKCI ENSG00000163558 protein_coding chr3 170222365 170222365 + 51 HSPA4L ENSG00000164070 protein_coding chr4 127781821 127781821 + 372 SAP30 ENSG00000164105 protein_coding chr4 173369969 173369969 + 1854 RHOBTB3 ENSG00000164292 protein_coding chr5 95713522 95713522 + 676 CAGE1 ENSG00000164304 protein_coding chr6 7389743 7389743 - 155 SLU7 ENSG00000164609 protein_coding chr5 160421711 160421711 - 0 PTTG1 ENSG00000164611 protein_coding chr5 160421822 160421822 + 0 HEY1 ENSG00000164683 protein_coding chr8 79767863 79767863 - 2151 ZNF704 ENSG00000164684 protein_coding chr8 80874781 80874781 - 0

85

PEX2 ENSG00000164751 protein_coding chr8 77001044 77001044 - 522 TMEM184A ENSG00000164855 protein_coding chr7 1560821 1560821 - 4432 CDK5 ENSG00000164885 protein_coding chr7 151058530 151058530 - 0 SLC4A2 ENSG00000164889 protein_coding chr7 151057210 151057210 + 1019 SNAPC3 ENSG00000164975 protein_coding chr9 15422704 15422704 + 0 PSIP1 ENSG00000164985 protein_coding chr9 15511019 15511019 - 0 ISCA2 ENSG00000165898 protein_coding chr14 74493720 74493720 + 105 CEP57 ENSG00000166037 protein_coding chr11 95789965 95789965 + 0 GPT2 ENSG00000166123 protein_coding chr16 46884378 46884378 + 375 BTRC ENSG00000166167 protein_coding chr10 101354033 101354033 + 0 RRAD ENSG00000166592 protein_coding chr16 66925644 66925644 - 1331 FAM96B ENSG00000166595 protein_coding chr16 66934423 66934423 - 0 TERF2IP ENSG00000166848 protein_coding chr16 75647786 75647786 + 0 MAP1A ENSG00000166963 protein_coding chr15 43510958 43510958 + 0 PROSER3 ENSG00000167595 protein_coding chr19 35758143 35758143 + 0 HSD11B1L ENSG00000167733 protein_coding chr19 5680604 5680604 + 0 ZNF83 ENSG00000167766 protein_coding chr19 52690496 52690496 - 0 KRT80 ENSG00000167767 protein_coding chr12 52192000 52192000 - 74 MGAT5B ENSG00000167889 protein_coding chr17 76868456 76868456 + 3904 ECI1 ENSG00000167969 protein_coding chr16 2252300 2252300 - 401 FTH1 ENSG00000167996 protein_coding chr11 61967660 61967660 - 456 SLC3A2 ENSG00000168003 protein_coding chr11 62856102 62856102 + 0 COA6 ENSG00000168275 protein_coding chr1 234373456 234373456 + 0 MPLKIP ENSG00000168303 protein_coding chr7 40134659 40134659 - 0 ARF4 ENSG00000168374 protein_coding chr3 57598220 57598220 - 1363 ADAM9 ENSG00000168615 protein_coding chr8 38996869 38996869 + 102 TMEM208 ENSG00000168701 protein_coding chr16 67227103 67227103 + 0 MFF ENSG00000168958 protein_coding chr2 227325151 227325151 + 0 THBS3 ENSG00000169231 protein_coding chr1 155209051 155209051 - 0 ADRB2 ENSG00000169252 protein_coding chr5 148825245 148825245 + 3130 SIN3A ENSG00000169375 protein_coding chr15 75455842 75455842 - 4084 PTK2 ENSG00000169398 protein_coding chr8 141002216 141002216 - 0 TM2D2 ENSG00000169490 protein_coding chr8 38996824 38996824 - 57 BOLA2B ENSG00000169627 protein_coding chr16 30194306 30194306 - 313 BUB1 ENSG00000169679 protein_coding chr2 110678114 110678114 - 0 PRDM10 ENSG00000170325 protein_coding chr11 130002835 130002835 - 569 ARL6IP1 ENSG00000170540 protein_coding chr16 18801678 18801678 - 0 TRABD ENSG00000170638 protein_coding chr22 50185915 50185915 + 4343 RNF139 ENSG00000170881 protein_coding chr8 124474738 124474738 + 0 MCC ENSG00000171444 protein_coding chr5 113488830 113488830 - 435 CHCHD1 ENSG00000172586 protein_coding chr10 73782047 73782047 + 0 RAD9A ENSG00000172613 protein_coding chr11 67317871 67317871 + 0 CES2 ENSG00000172831 protein_coding chr16 66934444 66934444 + 0 MYEOV ENSG00000172927 protein_coding chr11 69294138 69294138 + 3087 ANKRD13D ENSG00000172932 protein_coding chr11 67288547 67288547 + 0 MTX1 ENSG00000173171 protein_coding chr1 155208699 155208699 + 185 AHSA2 ENSG00000173209 protein_coding chr2 61177418 61177418 + 195 RNASE11 ENSG00000173464 protein_coding chr14 20609884 20609884 - 3484 SMARCC1 ENSG00000173473 protein_coding chr3 47782106 47782106 - 0 GTF2IRD2B ENSG00000174428 protein_coding chr7 75092573 75092573 + 408 RIN1 ENSG00000174791 protein_coding chr11 66336840 66336840 - 0 C19orf70 ENSG00000174917 protein_coding chr19 5680896 5680896 - 0 SUGCT ENSG00000175600 protein_coding chr7 40134977 40134977 + 46 EID2 ENSG00000176396 protein_coding chr19 39540333 39540333 - 0 PRR15 ENSG00000176532 protein_coding chr7 29563811 29563811 + 900 MAP3K19 ENSG00000176601 protein_coding chr2 135047468 135047468 - 4907 FAM109B ENSG00000177096 protein_coding chr22 42074251 42074251 + 3226 COX14 ENSG00000178449 protein_coding chr12 50111979 50111979 + 0 EXOSC4 ENSG00000178896 protein_coding chr8 144078626 144078626 + 0 EXD1 ENSG00000178997 protein_coding chr15 41230743 41230743 - 104

86

PER1 ENSG00000179094 protein_coding chr17 8156506 8156506 - 1758 SPTY2D1 ENSG00000179119 protein_coding chr11 18634791 18634791 - 572 SHARPIN ENSG00000179526 protein_coding chr8 144108124 144108124 - 2714 HTR1D ENSG00000179546 protein_coding chr1 23194729 23194729 - 2124 MAF1 ENSG00000179632 protein_coding chr8 144104499 144104499 + 410 WDR97 ENSG00000179698 protein_coding chr8 144107726 144107726 + 2316 ZNRF2 ENSG00000180233 protein_coding chr7 30284307 30284307 + 1632 KCTD2 ENSG00000180901 protein_coding chr17 75032575 75032575 + 1733 RAP2B ENSG00000181467 protein_coding chr3 153162270 153162270 + 503 MRPS23 ENSG00000181610 protein_coding chr17 57850056 57850056 - 0 DEXI ENSG00000182108 protein_coding chr16 10942460 10942460 - 1661 MOB2 ENSG00000182208 protein_coding chr11 1501247 1501247 - 1644 FBXL6 ENSG00000182325 protein_coding chr8 144359376 144359376 - 271 SKA2 ENSG00000182628 protein_coding chr17 59155269 59155269 - 0 C16orf72 ENSG00000182831 protein_coding chr16 9091648 9091648 + 182 NPIPA1 ENSG00000183426 protein_coding chr16 14922802 14922802 + 1161 UPP1 ENSG00000183696 protein_coding chr7 48088628 48088628 + 335 ELOA3 ENSG00000183791 protein_coding chr18 47030078 47030078 - 1279 GPR39 ENSG00000183840 protein_coding chr2 132416574 132416574 + 0 TMEM173 ENSG00000184584 protein_coding chr5 139482935 139482935 - 314 CLDN6 ENSG00000184697 protein_coding chr16 3020071 3020071 - 866 PRMT3 ENSG00000185238 protein_coding chr11 20387530 20387530 + 0 SLC52A2 ENSG00000185803 protein_coding chr8 144354135 144354135 + 4469 DYNC2H1 ENSG00000187240 protein_coding chr11 103109431 103109431 + 0 CHP1 ENSG00000187446 protein_coding chr15 41230839 41230839 + 8 SCGB1C1 ENSG00000188076 protein_coding chr11 193080 193080 + 3061 H2AFX ENSG00000188486 protein_coding chr11 119095467 119095467 - 0 UTS2B ENSG00000188958 protein_coding chr3 191330536 191330536 - 980 LIN54 ENSG00000189308 protein_coding chr4 83012926 83012926 - 9 SRGAP3 ENSG00000196220 protein_coding chr3 9363053 9363053 - 0 CD55 ENSG00000196352 protein_coding chr1 207321508 207321508 + 0 FAM72A ENSG00000196550 protein_coding chr1 206204414 206204414 - 2371 MKL1 ENSG00000196588 protein_coding chr22 40636702 40636702 - 43 STRN3 ENSG00000196792 protein_coding chr14 31026401 31026401 - 361 LAMB3 ENSG00000196878 protein_coding chr1 209652466 209652466 - 4570 FBXL22 ENSG00000197361 protein_coding chr15 63597353 63597353 + 4054 STMN3 ENSG00000197457 protein_coding chr20 63657682 63657682 - 582 RPE ENSG00000197713 protein_coding chr2 210002565 210002565 + 0 ZNF181 ENSG00000197841 protein_coding chr19 34734155 34734155 + 0 GPAA1 ENSG00000197858 protein_coding chr8 144082590 144082590 + 3780 AC092143.1 ENSG00000198211 protein_coding chr16 89919165 89919165 + 808 HYLS1 ENSG00000198331 protein_coding chr11 125883614 125883614 + 3770 NAGA ENSG00000198951 protein_coding chr22 42070842 42070842 - 0 PJA2 ENSG00000198961 protein_coding chr5 109409994 109409994 - 211 MRPL38 ENSG00000204316 protein_coding chr17 75905413 75905413 - 4034 SCGB2B2 ENSG00000205209 protein_coding chr19 34675699 34675699 - 1642 FGFR1OP ENSG00000213066 protein_coding chr6 166999182 166999182 + 1076 COG8 ENSG00000213380 protein_coding chr16 69339583 69339583 - 0 ZSWIM8 ENSG00000214655 protein_coding chr10 73785582 73785582 + 3231 AC078927.1 ENSG00000228144 protein_coding chr12 66169985 66169985 - 0 TMEM238 ENSG00000233493 protein_coding chr19 55384598 55384598 - 182 TRIM26 ENSG00000234127 protein_coding chr6 30213427 30213427 - 598 JRK ENSG00000234616 protein_coding chr8 142681968 142681968 - 4548 CHMP1B ENSG00000255112 protein_coding chr18 11851396 11851396 + 973 AC106782.1 ENSG00000258130 protein_coding chr16 30209071 30209071 - 3906 RTEL1 ENSG00000258366 protein_coding chr20 63657810 63657810 + 454 TUBB3 ENSG00000258947 protein_coding chr16 89921392 89921392 + 3035 AL163195.3 ENSG00000259060 protein_coding chr14 20609795 20609795 - 3573 AC138811.2 ENSG00000260342 protein_coding chr16 18801519 18801519 - 0 AC010542.3 ENSG00000260851 protein_coding chr16 66517312 66517312 - 73 TMEM249 ENSG00000261587 protein_coding chr8 144354914 144354914 - 3690 GAN ENSG00000261609 protein_coding chr16 81314952 81314952 + 0 OTUD7B ENSG00000264522 protein_coding chr1 150010676 150010676 - 0 ANXA8 ENSG00000265190 protein_coding chr10 47484158 47484158 - 2237 SRGAP2 ENSG00000266028 protein_coding chr1 206203345 206203345 + 1302 AC008403.1 ENSG00000268465 protein_coding chr19 48465837 48465837 + 2831 FAM231B ENSG00000268991 protein_coding chr1 16539066 16539066 + 2871 AC008750.8 ENSG00000269403 protein_coding chr19 51368099 51368099 - 0 HIST2H4A ENSG00000270882 protein_coding chr1 149832659 149832659 + 0 NBPF19 ENSG00000271383 protein_coding chr1 149390623 149390623 + 0 AC233992.2 ENSG00000271698 protein_coding chr8 144355609 144355609 - 2995 KMT2B ENSG00000272333 protein_coding chr19 35718019 35718019 + 506 COG8 ENSG00000272617 protein_coding chr16 69339667 69339667 - 0 NBPF26 ENSG00000273136 protein_coding chr1 120723923 120723923 + 0 HIST1H4E ENSG00000276966 protein_coding chr6 26204552 26204552 + 546 HIST1H4D ENSG00000277157 protein_coding chr6 26189076 26189076 - 0 ELOA3B ENSG00000278674 protein_coding chr18 47023927 47023927 - 990 87

Table 2. RUNX1 bookmarked genes that are bound by ER and are differentially expressed (either up or down) upon estrogen treatment.

gene_id gene_name baseMean log2FoldChange lfcSE stat pvalue padj binding group RUNX1 Bookmarked ENSG00000006534 ALDH3B1 223.764483 -1.486043549 0.292189004 5.085898274 3.66E-07 1.37E-05 ESR1+RUNX1 bound, de Down yes ENSG00000064932 SBNO2 821.5236174 0.418847935 0.157636559 -2.657048196 0.007882817 0.043159725 ESR1+RUNX1 bound, de Up yes ENSG00000104957 CCDC130 225.5108634 -0.47886552 0.178215127 2.68700827 0.007209517 0.040327755 ESR1+RUNX1 bound, de Down yes ENSG00000105856 HBP1 1707.02922 -0.792170507 0.180235367 4.39520012 1.11E-05 0.0002326 ESR1+RUNX1 bound, de Down yes ENSG00000114315 HES1 1118.247038 -1.062455094 0.183071692 5.803491963 6.49E-09 4.68E-07 ESR1+RUNX1 bound, de Down yes ENSG00000126461 SCAF1 594.8761924 0.523342397 0.167935604 -3.116327838 0.001831186 0.014261354 ESR1+RUNX1 bound, de Up yes ENSG00000130649 CYP2E1 23.32770091 -1.011494498 0.315636584 3.204617426 0.001352421 0.011202425 ESR1+RUNX1 bound, de Down yes ENSG00000131652 THOC6 443.1523134 0.327922269 0.125154411 -2.620141511 0.008789329 0.046980735 ESR1+RUNX1 bound, de Up yes ENSG00000137312 FLOT1 891.4108791 -0.542205658 0.207900051 2.608011185 0.009106998 0.048334548 ESR1+RUNX1 bound, de Down yes ENSG00000149043 SYT8 37.01811475 1.381161412 0.345434679 -3.998328756 6.38E-05 0.001007558 ESR1+RUNX1 bound, de Up yes ENSG00000156453 PCDH1 647.6693707 -0.813604616 0.200521252 4.057448322 4.96E-05 0.000815465 ESR1+RUNX1 bound, de Down yes ENSG00000164070 HSPA4L 206.0022589 0.942744258 0.248477038 -3.794090052 0.000148186 0.001948169 ESR1+RUNX1 bound, de Up yes ENSG00000172927 MYEOV 31.07297255 2.202080169 0.346033148 -6.363783884 1.97E-10 2.44E-08 ESR1+RUNX1 bound, de Up yes ENSG00000189308 LIN54 514.9465205 0.520387566 0.186238281 -2.794203008 0.005202781 0.031446101 ESR1+RUNX1 bound, de Up yes ENSG00000276966 HIST1H4E 1563.433919 1.181840259 0.199327077 -5.929150602 3.05E-09 2.39E-07 ESR1+RUNX1 bound, de Up yes

88

REFERENCES

1. Inman, J.L., et al., Mammary gland development: cell fate specification, stem cells and the microenvironment. Development, 2015. 142(6): p. 1028-42. 2. Lamouille, S., J. Xu, and R. Derynck, Molecular mechanisms of epithelial- mesenchymal transition. Nat Rev Mol Cell Biol, 2014. 15(3): p. 178-96. 3. Teves, S.S., et al., A dynamic mode of mitotic bookmarking by transcription factors. Elife, 2016. 5. 4. Ito, Y., S.C. Bae, and L.S. Chuang, The RUNX family: developmental regulators in cancer. Nat Rev Cancer, 2015. 15(2): p. 81-95. 5. Dai, X., et al., Breast Cancer Cell Line Classification and Its Relevance with Breast Tumor Subtyping. J Cancer, 2017. 8(16): p. 3131-3141. 6. Tanos, T., et al., ER and PR signaling nodes during mammary gland development. Breast Cancer Res, 2012. 14(4): p. 210. 7. Zaidi, S.K., et al., Mitotic Gene Bookmarking: An Epigenetic Program to Maintain Normal and Cancer Phenotypes. Mol Cancer Res, 2018. 8. Zaidi, S.K., et al., Bookmarking the genome: maintenance of epigenetic information. J Biol Chem, 2011. 286(21): p. 18355-61. 9. Hong, D., et al., Runx1 stabilizes the mammary epithelial cell phenotype and prevents epithelial to mesenchymal transition. Oncotarget, 2017. 8(11): p. 17610- 17627. 10. BreastCancer.org. U.S. Breast Cancer Statistics. 01.09.2018 [cited 2018 07.21.2018]; Available from: https://www.breastcancer.org/symptoms/understand_bc/statistics. 11. United States Cancer Statistics: 1999-2014 Incidence and Mortality Web-based Report. 2018 07.13.2018 [cited 2018 07.21.2018]; Available from: https://www.cdc.gov/cancer/npcr/uscs/index.htm. 12. Siegel, R.L., K.D. Miller, and A. Jemal, Cancer Statistics, 2017. CA Cancer J Clin, 2017. 67(1): p. 7-30. 13. Noone AM, H.N., Krapcho M, Miller D, Brest A, Yu M, Ruhl J, Tatalovich Z, Mariotto A, Lewis DR, Chen HS, Feuer EJ, Cronin KA (eds). SEER Cancer Statistics Review, 1975-2015. 2018 [cited 2018 07.21.2018]; Available from: https://seer.cancer.gov/csr/1975_2015/. 14. Bandyopadhyay, S., M.H. Bluth, and R. Ali-Fehmi, Breast Carcinoma: Updates in Molecular Profiling 2018. Clin Lab Med, 2018. 38(2): p. 401-420. 15. O'Brien, K.M., et al., Intrinsic breast tumor subtypes, race, and long-term survival in the Carolina Breast Cancer Study. Clin Cancer Res, 2010. 16(24): p. 6100-10. 16. Tavassoli FA, D.P., ed. World Health Organization Classification of Tumours Pathology and Genetics of Tumors of the Breast and Femal Genital Organs. 2003, IARC Press: Lyon (France). 17. Patani, N., L.A. Martin, and M. Dowsett, Biomarkers for the clinical management of breast cancer: international perspective. Int J Cancer, 2013. 133(1): p. 1-13.

89

18. Perou, C.M., et al., Molecular portraits of human breast tumours. Nature, 2000. 406(6797): p. 747-52. 19. Sorlie, T., et al., Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A, 2001. 98(19): p. 10869-74. 20. Sotiriou, C., et al., Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci U S A, 2003. 100(18): p. 10393-8. 21. Park, S., et al., Characteristics and outcomes according to molecular subtypes of breast cancer as classified by a panel of four biomarkers using immunohistochemistry. Breast, 2012. 21(1): p. 50-7. 22. Prat, A., et al., Prognostic significance of progesterone receptor-positive tumor cells within immunohistochemically defined luminal A breast cancer. J Clin Oncol, 2013. 31(2): p. 203-9. 23. Voduc, K.D., et al., Breast cancer subtypes and the risk of local and regional relapse. J Clin Oncol, 2010. 28(10): p. 1684-91. 24. Kim, H.S., et al., Analysis of the potent prognostic factors in luminal-type breast cancer. J Breast Cancer, 2012. 15(4): p. 401-6. 25. Gennari, A., et al., HER2 status and efficacy of adjuvant anthracyclines in early breast cancer: a pooled analysis of randomized trials. J Natl Cancer Inst, 2008. 100(1): p. 14-20. 26. Hayes, D.F., et al., HER2 and response to paclitaxel in node-positive breast cancer. N Engl J Med, 2007. 357(15): p. 1496-506. 27. Pritchard, K.I., et al., HER-2 and topoisomerase II as predictors of response to chemotherapy. J Clin Oncol, 2008. 26(5): p. 736-44. 28. Nguyen, P.L., et al., Breast cancer subtype approximated by estrogen receptor, progesterone receptor, and HER-2 is associated with local and distant recurrence after breast-conserving therapy. J Clin Oncol, 2008. 26(14): p. 2373-8. 29. Smid, M., et al., Subtypes of breast cancer show preferential site of relapse. Cancer Res, 2008. 68(9): p. 3108-14. 30. Prat, A., et al., Phenotypic and molecular characterization of the claudin-low intrinsic subtype of breast cancer. Breast Cancer Res, 2010. 12(5): p. R68. 31. Farmer, P., et al., Identification of molecular apocrine breast tumours by microarray analysis. Oncogene, 2005. 24(29): p. 4660-71. 32. Mellinghoff, I.K., et al., HER2/neu kinase-dependent modulation of androgen receptor function through effects on DNA binding and stability. Cancer Cell, 2004. 6(5): p. 517-27. 33. Bromberg, J.F., et al., Transcriptionally active Stat1 is required for the antiproliferative effects of both interferon alpha and . Proc Natl Acad Sci U S A, 1996. 93(15): p. 7673-8. 34. Hu, Z., et al., The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics, 2006. 7: p. 96. 35. Cailleau, R., et al., Breast tumor cell lines from pleural effusions. J Natl Cancer Inst, 1974. 53(3): p. 661-74.

90

36. Pratap, J., et al., Regulatory roles of Runx2 in metastatic tumor and cancer cell interactions with bone. Cancer Metastasis Rev, 2006. 25(4): p. 589-600. 37. Soule, H.D., et al., A human cell line from a pleural effusion derived from a breast carcinoma. J Natl Cancer Inst, 1973. 51(5): p. 1409-16. 38. Levenson, A.S. and V.C. Jordan, MCF-7: the first hormone-responsive breast cancer cell line. Cancer Res, 1997. 57(15): p. 3071-8. 39. Soule, H.D., et al., Isolation and characterization of a spontaneously immortalized human breast epithelial cell line, MCF-10. Cancer Res, 1990. 50(18): p. 6075-86. 40. Sorlie, T., et al., Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A, 2003. 100(14): p. 8418-23. 41. Dawson, P.J., et al., MCF10AT: a model for the evolution of cancer from proliferative breast disease. Am J Pathol, 1996. 148(1): p. 313-9. 42. Santner, S.J., et al., Malignant MCF10CA1 cell lines derived from premalignant human breast epithelial MCF10AT cells. Breast Cancer Res Treat, 2001. 65(2): p. 101-10. 43. Hens, J.R. and J.J. Wysolmerski, Key stages of mammary gland development: molecular mechanisms involved in the formation of the embryonic mammary gland. Breast Cancer Res, 2005. 7(5): p. 220-4. 44. Sakakura, T., The Mammary Gland; Development, Regulation, and Function. Mammary Embryogenesis. 1987, New York: Plenum Press. 45. Veltmaat, J.M., et al., Mouse embryonic mammogenesis as a model for the molecular regulation of pattern formation. Differentiation, 2003. 71(1): p. 1-17. 46. Hogg, N.A., C.J. Harrison, and C. Tickle, Lumen formation in the developing mouse mammary gland. J Embryol Exp Morphol, 1983. 73: p. 39-57. 47. Kouros-Mehr, H. and Z. Werb, Candidate regulators of mammary branching morphogenesis identified by genome-wide transcript analysis. Dev Dyn, 2006. 235(12): p. 3404-12. 48. Nelson, C.M., et al., Tissue geometry determines sites of mammary branching morphogenesis in organotypic cultures. Science, 2006. 314(5797): p. 298-300. 49. Silberstein, G.B. and C.W. Daniel, Glycosaminoglycans in the basal lamina and extracellular matrix of the developing mouse mammary duct. Dev Biol, 1982. 90(1): p. 215-22. 50. Williams, J.M. and C.W. Daniel, Mammary ductal elongation: differentiation of myoepithelium and basal lamina during branching morphogenesis. Dev Biol, 1983. 97(2): p. 274-90. 51. Fata, J.E., V. Chaudhary, and R. Khokha, Cellular turnover in the mammary gland is correlated with systemic levels of progesterone and not 17beta-estradiol during the estrous cycle. Biol Reprod, 2001. 65(3): p. 680-8. 52. Talhouk, R.S., et al., Proteinases of the mammary gland: developmental regulation in vivo and vectorial secretion in culture. Development, 1991. 112(2): p. 439-49.

91

53. Haaksma, C.J., R.J. Schwartz, and J.J. Tomasek, Myoepithelial cell contraction and milk ejection are impaired in mammary glands of mice lacking smooth muscle alpha-actin. Biol Reprod, 2011. 85(1): p. 13-21. 54. Hennighausen, L., et al., Prolactin signaling in mammary gland development. J Biol Chem, 1997. 272(12): p. 7567-9. 55. Arendt, L.M. and C. Kuperwasser, Form and function: how estrogen and progesterone regulate the mammary epithelial hierarchy. J Mammary Gland Biol Neoplasia, 2015. 20(1-2): p. 9-25. 56. Petz, L.N., et al., Differential regulation of the human progesterone receptor gene through an estrogen response element half site and Sp1 sites. J Steroid Biochem Mol Biol, 2004. 88(2): p. 113-22. 57. Schultz, J.R., L.N. Petz, and A.M. Nardulli, Estrogen receptor alpha and Sp1 regulate progesterone receptor gene expression. Mol Cell Endocrinol, 2003. 201(1-2): p. 165-75. 58. Feng, Y., et al., Estrogen receptor-alpha expression in the mammary epithelium is required for ductal and alveolar morphogenesis in mice. Proc Natl Acad Sci U S A, 2007. 104(37): p. 14718-23. 59. Mallepell, S., et al., Paracrine signaling through the epithelial estrogen receptor alpha is required for proliferation and morphogenesis in the mammary gland. Proc Natl Acad Sci U S A, 2006. 103(7): p. 2196-201. 60. Mueller, S.O., et al., Mammary gland development in adult mice requires epithelial and stromal estrogen receptor alpha. Endocrinology, 2002. 143(6): p. 2357-65. 61. Ciarloni, L., S. Mallepell, and C. Brisken, Amphiregulin is an essential mediator of estrogen receptor alpha function in mammary gland development. Proc Natl Acad Sci U S A, 2007. 104(13): p. 5455-60. 62. Kenney, N.J., et al., Effect of exogenous epidermal-like growth factors on mammary gland development and differentiation in the estrogen receptor-alpha knockout (ERKO) mouse. Breast Cancer Res Treat, 2003. 79(2): p. 161-73. 63. Luetteke, N.C., et al., Targeted inactivation of the EGF and amphiregulin genes reveals distinct roles for EGF receptor ligands in mouse mammary gland development. Development, 1999. 126(12): p. 2739-50. 64. Brisken, C., et al., Essential function of Wnt-4 in mammary gland development downstream of progesterone signaling. Genes Dev, 2000. 14(6): p. 650-4. 65. Fernandez-Valdivia, R., et al., The RANKL signaling axis is sufficient to elicit ductal side-branching and alveologenesis in the mammary gland of the virgin mouse. Dev Biol, 2009. 328(1): p. 127-39. 66. Sicinski, P., et al., Cyclin D1 provides a link between development and oncogenesis in the retina and breast. Cell, 1995. 82(4): p. 621-30. 67. Asselin-Labat, M.L., et al., Gata-3 is an essential regulator of mammary-gland morphogenesis and luminal-cell differentiation. Nat Cell Biol, 2007. 9(2): p. 201- 9. 68. Bernardo, G.M., et al., FOXA1 is an essential determinant of ERalpha expression and mammary ductal morphogenesis. Development, 2010. 137(12): p. 2045-54.

92

69. Seagroves, T.N., et al., C/EBPbeta (CCAAT/enhancer binding protein) controls cell fate determination during mammary gland development. Mol Endocrinol, 2000. 14(3): p. 359-68. 70. Liu, X., et al., Stat5a is mandatory for adult mammary gland development and lactogenesis. Genes Dev, 1997. 11(2): p. 179-86. 71. Miyoshi, K., et al., Signal transducer and activator of transcription (Stat) 5 controls the proliferation and differentiation of mammary alveolar epithelium. J Cell Biol, 2001. 155(4): p. 531-42. 72. Santos, S.J., S.Z. Haslam, and S.E. Conrad, Signal transducer and activator of transcription 5a mediates mammary ductal branching and proliferation in the nulliparous mouse. Endocrinology, 2010. 151(6): p. 2876-85. 73. Santos, S.J., S.Z. Haslam, and S.E. Conrad, Estrogen and progesterone are critical regulators of Stat5a expression in the mouse mammary gland. Endocrinology, 2008. 149(1): p. 329-38. 74. Lee, H.J., et al., Progesterone drives mammary secretory differentiation via RankL-mediated induction of Elf5 in luminal progenitor cells. Development, 2013. 140(7): p. 1397-401. 75. Oakes, S.R., et al., The Ets transcription factor Elf5 specifies mammary alveolar cell fate. Genes Dev, 2008. 22(5): p. 581-6. 76. Smalley, M. and A. Ashworth, Stem cells and breast cancer: A field in transit. Nat Rev Cancer, 2003. 3(11): p. 832-44. 77. Polyak, K., Breast cancer: origins and evolution. J Clin Invest, 2007. 117(11): p. 3155-63. 78. Lim, E., et al., Aberrant luminal progenitors as the candidate target population for basal tumor development in BRCA1 mutation carriers. Nat Med, 2009. 15(8): p. 907-13. 79. Kalluri, R. and E.G. Neilson, Epithelial-mesenchymal transition and its implications for fibrosis. J Clin Invest, 2003. 112(12): p. 1776-84. 80. Kalluri, R.a.W.R., The basics of epithelial-mesenchymal transition. J Clin Invest, 2009. 119(6): p. 9. 81. Hay, E.D., Organization and fine structure of epithelium and mesenchyme in the developing chick embryo. In Epithelial-Mesenchymal Interactions. 1968, Baltimore, MD: Williams & Wilkins Co. 82. Trelstad, R.L., E.D. Hay, and J.D. Revel, Cell contact during early morphogenesis in the chick embryo. Dev Biol, 1967. 16(1): p. 78-106. 83. Zeisberg, M. and E.G. Neilson, Biomarkers for epithelial-mesenchymal transitions. J Clin Invest, 2009. 119(6): p. 1429-37. 84. Gonzalez, D.M. and D. Medici, Signaling mechanisms of the epithelial- mesenchymal transition. Sci Signal, 2014. 7(344): p. re8. 85. Mani, S.A., et al., Mesenchyme Forkhead 1 (FOXC2) plays a key role in metastasis and is associated with aggressive basal-like breast cancers. Proc Natl Acad Sci U S A, 2007. 104(24): p. 10069-74.

93

86. Nickel, A. and S.C. Stadler, Role of epigenetic mechanisms in epithelial-to- mesenchymal transition of breast cancer cells. Transl Res, 2015. 165(1): p. 126- 42. 87. Tam, W.L. and R.A. Weinberg, The of epithelial-mesenchymal plasticity in cancer. Nat Med, 2013. 19(11): p. 1438-49. 88. Al Moustafa, A.E., A. Achkhar, and A. Yasmeen, EGF-receptor signaling and epithelial-mesenchymal transition in human carcinomas. Front Biosci (Schol Ed), 2012. 4: p. 671-84. 89. Espinoza, I. and L. Miele, Deadly crosstalk: Notch signaling at the intersection of EMT and cancer stem cells. Cancer Lett, 2013. 341(1): p. 41-5. 90. Heldin, C.H., M. Vanlandewijck, and A. Moustakas, Regulation of EMT by TGFbeta in cancer. FEBS Lett, 2012. 586(14): p. 1959-70. 91. Katoh, Y. and M. Katoh, FGFR2-related pathogenesis and FGFR2-targeted therapeutics (Review). Int J Mol Med, 2009. 23(3): p. 307-11. 92. McCormack, N. and S. O'Dea, Regulation of epithelial to mesenchymal transition by bone morphogenetic proteins. Cell Signal, 2013. 25(12): p. 2856-62. 93. Taipale, J. and P.A. Beachy, The Hedgehog and Wnt signalling pathways in cancer. Nature, 2001. 411(6835): p. 349-54. 94. Michelotti, E.F., S. Sanford, and D. Levens, Marking of active genes on mitotic chromosomes. Nature, 1997. 388(6645): p. 895-9. 95. John, S. and J.L. Workman, Bookmarking genes for activation in condensed mitotic chromosomes. Bioessays, 1998. 20(4): p. 275-9. 96. Martinez-Balbas, M.A., et al., Displacement of sequence-specific transcription factors from mitotic chromatin. Cell, 1995. 83(1): p. 29-38. 97. Zaidi, S.K., et al., Mitotic partitioning and selective reorganization of tissue- specific transcription factors in progeny cells. Proceedings of the National Academy of Sciences of the United States of America, 2003. 100(25): p. 14852- 14857. 98. Ali, S.A., et al., Phenotypic transcription factors epigenetically mediate cell growth control. Proc Natl Acad Sci U S A, 2008. 105(18): p. 6632-7. 99. Young, D.W., et al., Mitotic occupancy and lineage-specific transcriptional control of rRNA genes by Runx2. Nature, 2007. 445(7126): p. 442-6. 100. Young, D.W., et al., Mitotic retention of gene expression patterns by the cell fate- determining transcription factor Runx2. Proc Natl Acad Sci U S A, 2007. 104(9): p. 3189-94. 101. Kadauke, S., et al., Tissue-specific mitotic bookmarking by hematopoietic transcription factor GATA1. Cell, 2012. 150(4): p. 725-37. 102. Arampatzi, P., et al., Gene-specific factors determine mitotic expression and bookmarking via alternate regulatory elements. Nucleic Acids Res, 2013. 41(4): p. 2202-15. 103. Caravaca, J.M., et al., Bookmarking by specific and nonspecific binding of FoxA1 pioneer factor to mitotic chromosomes. Genes Dev, 2013. 27(3): p. 251-60. 104. Lerner, J., et al., Human mutations affect the epigenetic/bookmarking function of HNF1B. Nucleic Acids Res, 2016. 44(17): p. 8097-111.

94

105. Gottesfeld, J.M. and D.J. Forbes, Mitotic repression of the transcriptional machinery. Trends Biochem Sci, 1997. 22(6): p. 197-202. 106. Naumova, N., et al., Organization of the mitotic chromosome. Science, 2013. 342(6161): p. 948-53. 107. Scholey, J.M., I. Brust-Mascher, and A. Mogilner, Cell division. Nature, 2003. 422(6933): p. 746-52. 108. Parsons, G.G. and C.A. Spencer, Mitotic repression of RNA polymerase II transcription is accompanied by release of transcription elongation complexes. Mol Cell Biol, 1997. 17(10): p. 5791-802. 109. Palozola, K.C., et al., Mitotic transcription and waves of gene reactivation during mitotic exit. Science, 2017. 358(6359): p. 119-122. 110. Roussel, P., et al., The rDNA transcription machinery is assembled during mitosis in active NORs and absent in inactive NORs. J Cell Biol, 1996. 133(2): p. 235-46. 111. Teves, S.S., et al., A stable mode of bookmarking by TBP recruits RNA polymerase II to mitotic chromosomes. Elife, 2018. 7. 112. Dekker, J., et al., Capturing chromosome conformation. Science, 2002. 295(5558): p. 1306-11. 113. Gibcus, J.H. and J. Dekker, The hierarchy of the 3D genome. Mol Cell, 2013. 49(5): p. 773-82. 114. Kelly, T.K., et al., H2A.Z maintenance during mitosis reveals nucleosome shifting on mitotically silenced genes. Mol Cell, 2010. 39(6): p. 901-11. 115. Ng, R.K. and J.B. Gurdon, Epigenetic memory of an active gene state depends on histone H3.3 incorporation into chromatin in the absence of transcription. Nat Cell Biol, 2008. 10(1): p. 102-9. 116. Liu, Y., et al., Widespread Mitotic Bookmarking by Histone Marks and Transcription Factors in Pluripotent Stem Cells. Cell Rep, 2017. 19(7): p. 1283- 1293. 117. Peterson, L.F. and D.E. Zhang, The 8;21 translocation in leukemogenesis. Oncogene, 2004. 23(24): p. 4255-62. 118. Bakshi, R., et al., The leukemogenic t(8;21) fusion protein AML1-ETO controls rRNA genes and associates with nucleolar-organizing regions at mitotic chromosomes. J Cell Sci, 2008. 121(Pt 23): p. 3981-90. 119. Blobel, G.A., et al., A reconfigured pattern of MLL occupancy within mitotic chromatin promotes rapid transcriptional reactivation following mitotic exit. Mol Cell, 2009. 36(6): p. 970-83. 120. Castilla, L.H. and J.H. Bushweller, Molecular Basis and Targeted Inhibition of CBFbeta-SMMHC Acute Myeloid Leukemia. Adv Exp Med Biol, 2017. 962: p. 229-244. 121. Shigesada, K., B. van de Sluis, and P.P. Liu, Mechanism of leukemogenesis by the inv(16) chimeric gene CBFB/PEBP2B-MHY11. Oncogene, 2004. 23(24): p. 4297- 307. 122. Lopez-Camacho, C., et al., CBFbeta and the leukemogenic fusion protein CBFbeta-SMMHC associate with mitotic chromosomes to epigenetically regulate ribosomal genes. J Cell Biochem, 2014. 115(12): p. 2155-64.

95

123. Tokuda, K., et al., Optimization of fixative solution for retinal morphology: a comparison with Davidson's fixative and other fixation solutions. Jpn J Ophthalmol, 2018. 62(4): p. 481-490. 124. Landt, S.G., et al., ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res, 2012. 22(9): p. 1813-31. 125. Follmer, N.E. and N.J. Francis, Preparation of Drosophila tissue culture cells from different stages of the cell cycle for chromatin immunoprecipitation using centrifugal counterflow elutriation and fluorescence-activated cell sorting. Methods Enzymol, 2012. 513: p. 251-69. 126. Wang, Z., M. Gerstein, and M. Snyder, RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet, 2009. 10(1): p. 57-63. 127. Core, L.J., J.J. Waterfall, and J.T. Lis, Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science, 2008. 322(5909): p. 1845-8. 128. Lopes, R., R. Agami, and G. Korkmaz, GRO-seq, A Tool for Identification of Transcripts Regulating Gene Expression. Methods Mol Biol, 2017. 1543: p. 45- 55. 129. Chen, J., et al., Single-molecule dynamics of enhanceosome assembly in embryonic stem cells. Cell, 2014. 156(6): p. 1274-1285. 130. Swinstead, E.E., et al., Steroid Receptors Reprogram FoxA1 Occupancy through Dynamic Chromatin Transitions. Cell, 2016. 165(3): p. 593-605. 131. Coffman, J.A., Runx transcription factors and the developmental balance between cell proliferation and differentiation. Cell Biol Int, 2003. 27(4): p. 315-24. 132. Ito, Y., Oncogenic potential of the RUNX gene family: 'overview'. Oncogene, 2004. 23(24): p. 4198-208. 133. Ogawa, E., et al., PEBP2/PEA2 represents a family of transcription factors homologous to the products of the Drosophila runt gene and the human AML1 gene. Proc Natl Acad Sci U S A, 1993. 90(14): p. 6859-63. 134. Adya, N., L.H. Castilla, and P.P. Liu, Function of CBFbeta/Bro proteins. Semin Cell Dev Biol, 2000. 11(5): p. 361-8. 135. Crute, B.E., et al., Biochemical and biophysical properties of the core-binding factor alpha2 (AML1) DNA-binding domain. J Biol Chem, 1996. 271(42): p. 26251-60. 136. Golling, G., et al., Drosophila homologs of the proto-oncogene product PEBP2/CBF beta regulate the DNA-binding properties of Runt. Mol Cell Biol, 1996. 16(3): p. 932-42. 137. Huang, X., et al., Overexpression, purification, and biophysical characterization of the heterodimerization domain of the core-binding factor beta subunit. J Biol Chem, 1998. 273(4): p. 2480-7. 138. Kagoshima, H., et al., The C. elegans CBFbeta homologue BRO-1 interacts with the Runx factor, RNT-1, to promote stem cell proliferation and self-renewal. Development, 2007. 134(21): p. 3905-15.

96

139. Tang, Y.Y., et al., Biophysical characterization of interactions between the core binding factor alpha and beta subunits and DNA. FEBS Lett, 2000. 470(2): p. 167-72. 140. Tahirov, T.H. and J. Bushweller, Structure and Biophysics of CBFbeta/RUNX and Its Translocation Products. Adv Exp Med Biol, 2017. 962: p. 21-31. 141. Kanno, T., et al., Intrinsic transcriptional activation-inhibition domains of the polyomavirus enhancer binding protein 2/core binding factor alpha subunit revealed in the presence of the beta subunit. Mol Cell Biol, 1998. 18(5): p. 2444- 54. 142. Zaidi, S.K., et al., A specific targeting signal directs Runx2/Cbfa1 to subnuclear domains and contributes to transactivation of the osteocalcin gene. J Cell Sci, 2001. 114(Pt 17): p. 3093-102. 143. Zeng, C., et al., Intranuclear targeting of AML/CBFalpha regulatory factors to nuclear matrix-associated transcriptional domains. Proc Natl Acad Sci U S A, 1998. 95(4): p. 1585-9. 144. Stein, G.S., et al., Organization of transcriptional regulatory machinery in nuclear microenvironments: implications for biological control and cancer. Adv Enzyme Regul, 2007. 47: p. 242-50. 145. Zeng, C., et al., Identification of a nuclear matrix targeting signal in the leukemia and bone-related AML/CBF-alpha transcription factors. Proc Natl Acad Sci U S A, 1997. 94(13): p. 6746-51. 146. Aronson, B.D., et al., Groucho-dependent and -independent repression activities of Runt domain proteins. Mol Cell Biol, 1997. 17(9): p. 5581-7. 147. Chuang, L.S., K. Ito, and Y. Ito, RUNX family: Regulation and diversification of roles through interacting proteins. Int J Cancer, 2013. 132(6): p. 1260-71. 148. Javed, A., et al., Groucho/TLE/R-esp proteins associate with the nuclear matrix and repress RUNX (CBF(alpha)/AML/PEBP2(alpha)) dependent activation of tissue-specific gene transcription. J Cell Sci, 2000. 113 ( Pt 12): p. 2221-31. 149. Lian, J.B., et al., Regulatory controls for osteoblast growth and differentiation: role of Runx/Cbfa/AML factors. Crit Rev Eukaryot Gene Expr, 2004. 14(1-2): p. 1-41. 150. Westendorf, J.J., Transcriptional co-repressors of Runx2. J Cell Biochem, 2006. 98(1): p. 54-64. 151. Zaidi, S.K., et al., The dynamic organization of gene-regulatory machinery in nuclear microenvironments. EMBO Rep, 2005. 6(2): p. 128-33. 152. Durst, K.L. and S.W. Hiebert, Role of RUNX family members in transcriptional repression and gene silencing. Oncogene, 2004. 23(24): p. 4220-4. 153. Okuda, T., et al., AML1, the target of multiple chromosomal translocations in human leukemia, is essential for normal fetal liver hematopoiesis. Cell, 1996. 84(2): p. 321-30. 154. Yzaguirre, A.D., M.F. de Bruijn, and N.A. Speck, The Role of Runx1 in Embryonic Blood Cell Formation. Adv Exp Med Biol, 2017. 962: p. 47-64. 155. Swiers, G., M. de Bruijn, and N.A. Speck, Hematopoietic stem cell emergence in the conceptus and the role of Runx1. Int J Dev Biol, 2010. 54(6-7): p. 1151-63.

97

156. Haar, J.L. and G.A. Ackerman, A phase and electron microscopic study of vasculogenesis and erythropoiesis in the yolk sac of the mouse. Anat Rec, 1971. 170(2): p. 199-223. 157. Moore, M.A. and D. Metcalf, Ontogeny of the haemopoietic system: yolk sac origin of in vivo and in vitro colony forming cells in the developing mouse embryo. Br J Haematol, 1970. 18(3): p. 279-96. 158. Palis, J., et al., Development of erythroid and myeloid progenitors in the yolk sac and embryo proper of the mouse. Development, 1999. 126(22): p. 5073-84. 159. Tober, J., et al., The megakaryocyte lineage originates from hemangioblast precursors and is an integral component both of primitive and of definitive hematopoiesis. Blood, 2007. 109(4): p. 1433-41. 160. Brotherton, T.W., et al., Hemoglobin ontogeny during normal mouse fetal development. Proc Natl Acad Sci U S A, 1979. 76(6): p. 2853-7. 161. Wong, P.M., et al., Adult hemoglobins are synthesized in murine fetal hepatic erythropoietic cells. Blood, 1983. 62(6): p. 1280-8. 162. Cai, Z., et al., Haploinsufficiency of AML1 affects the temporal and spatial generation of hematopoietic stem cells in the mouse embryo. Immunity, 2000. 13(4): p. 423-31. 163. Li, Z., et al., Runx1 function in hematopoiesis is required in cells that express Tek. Blood, 2006. 107(1): p. 106-10. 164. Sasaki, K., et al., Absence of fetal liver hematopoiesis in mice deficient in transcriptional coactivator core binding factor beta. Proc Natl Acad Sci U S A, 1996. 93(22): p. 12359-63. 165. Wang, Q., et al., Disruption of the Cbfa2 gene causes necrosis and hemorrhaging in the central nervous system and blocks definitive hematopoiesis. Proc Natl Acad Sci U S A, 1996. 93(8): p. 3444-9. 166. Wang, Q., et al., The CBFbeta subunit is essential for CBFalpha2 (AML1) function in vivo. Cell, 1996. 87(4): p. 697-708. 167. Ichikawa, M., et al., A role for RUNX1 in hematopoiesis and myeloid leukemia. Int J Hematol, 2013. 97(6): p. 726-34. 168. Comprehensive molecular portraits of human breast tumours. Nature, 2012. 490(7418): p. 61-70. 169. Ichikawa, M., et al., AML-1 is required for megakaryocytic maturation and lymphocytic differentiation, but not for maintenance of hematopoietic stem cells in adult hematopoiesis. Nat Med, 2004. 10(3): p. 299-304. 170. Niebuhr, B., et al., Runx1 is essential at two stages of early murine B-cell development. Blood, 2013. 122(3): p. 413-23. 171. Seo, W., et al., Runx1-Cbfbeta facilitates early B lymphocyte development by regulating expression of Ebf1. J Exp Med, 2012. 209(7): p. 1255-62. 172. Hoi, C.S., et al., Runx1 directly promotes proliferation of hair follicle stem cells and epithelial tumor formation in mouse skin. Mol Cell Biol, 2010. 30(10): p. 2518-36.

98

173. Kanaykina, N., et al., In vitro and in vivo effects on neural crest stem cell differentiation by conditional activation of Runx1 short isoform and its effect on neuropathic pain behavior. Ups J Med Sci, 2010. 115(1): p. 56-64. 174. Lian, J.B., et al., Runx1/AML1 hematopoietic transcription factor contributes to skeletal development in vivo. J Cell Physiol, 2003. 196(2): p. 301-11. 175. Osorio, K.M., et al., Runx1 modulates developmental, but not injury-driven, hair follicle stem cell activation. Development, 2008. 135(6): p. 1059-68. 176. Sokol, E.S., et al., Perturbation-expression analysis identifies RUNX1 as a regulator of human mammary stem cell differentiation. PLoS Comput Biol, 2015. 11(4): p. e1004161. 177. Umansky, K.B., et al., Runx1 Transcription Factor Is Required for Myoblasts Proliferation during Muscle Regeneration. PLoS Genet, 2015. 11(8): p. e1005457. 178. van Bragt, M.P., et al., RUNX1, a transcription factor mutated in breast cancer, controls the fate of ER-positive mammary luminal cells. Elife, 2014. 3: p. e03881. 179. Yamashiro, T., et al., Expression of Runx1, -2 and -3 during tooth, palate and craniofacial bone development. Mech Dev, 2002. 119 Suppl 1: p. S107-10. 180. VanOudenhove, J.J., et al., Transient RUNX1 Expression during Early Mesendodermal Differentiation of hESCs Promotes Epithelial to Mesenchymal Transition through TGFB2 Signaling. Stem Cell Reports, 2016. 7(5): p. 884-896. 181. VanOudenhove, J.J., et al., Precocious Phenotypic Transcription-Factor Expression During Early Development. J Cell Biochem, 2017. 118(5): p. 953- 958. 182. Wang, L. and Y.G. Chen, Signaling Control of Differentiation of Embryonic Stem Cells toward Mesendoderm. J Mol Biol, 2016. 428(7): p. 1409-22. 183. Chen, M.J., et al., Runx1 is required for the endothelial to haematopoietic cell transition but not thereafter. Nature, 2009. 457(7231): p. 887-91. 184. Lacaud, G., et al., Runx1 is essential for hematopoietic commitment at the hemangioblast stage of development in vitro. Blood, 2002. 100(2): p. 458-66. 185. Grandy, R.A., et al., Genome-Wide Studies Reveal that H3K4me3 Modification in Bivalent Genes Is Dynamically Regulated during the Pluripotent Cell Cycle and Stabilized upon Differentiation. Mol Cell Biol, 2016. 36(4): p. 615-27. 186. Mikkelsen, T.S., et al., Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature, 2007. 448(7153): p. 553-60. 187. Pope, B.D., et al., Topologically associating domains are stable units of replication-timing regulation. Nature, 2014. 515(7527): p. 402-5. 188. Thurman, R.E., et al., The accessible chromatin landscape of the human genome. Nature, 2012. 489(7414): p. 75-82. 189. McDonald, L., et al., RUNX2 correlates with subtype-specific breast cancer in a human tissue microarray, and ectopic expression of Runx2 perturbs differentiation in the mouse mammary gland. Dis Model Mech, 2014. 7(5): p. 525-34. 190. Rooney, N., A.I. Riggio, D. Mendoza-Villanueva, P. Shore, E. R. Cameron and K. Blyth, Runx Genes in Breast Cancer and the Mammary Lineage, in RUNX

99

Proteins in Development and Cancer, Y.I. Y. Groner, P. Liu et al., Editor. 2017, Springer Singapore: Singapore. p. 353-368. 191. Metzeler, K.H. and C.D. Bloomfield, Clinical Relevance of RUNX1 and CBFB Alterations in Acute Myeloid Leukemia and Other Hematological Disorders. Adv Exp Med Biol, 2017. 962: p. 175-199. 192. Miyoshi, H., et al., t(8;21) breakpoints on in acute myeloid leukemia are clustered within a limited region of a single gene, AML1. Proc Natl Acad Sci U S A, 1991. 88(23): p. 10431-4. 193. Rowley, J.D., Identificaton of a translocation with quinacrine fluorescence in a patient with acute leukemia. Ann Genet, 1973. 16(2): p. 109-12. 194. Sood, R., Y. Kamikubo, and P. Liu, Role of RUNX1 in hematological malignancies. Blood, 2017. 129(15): p. 2070-2082. 195. Hatlen, M.A., L. Wang, and S.D. Nimer, AML1-ETO driven acute leukemia: insights into pathogenesis and potential therapeutic approaches. Front Med, 2012. 6(3): p. 248-62. 196. Meyers, S., J.R. Downing, and S.W. Hiebert, Identification of AML-1 and the (8;21) translocation protein (AML-1/ETO) as sequence-specific DNA-binding proteins: the runt homology domain is required for DNA binding and protein- protein interactions. Mol Cell Biol, 1993. 13(10): p. 6336-45. 197. Zaidi, S.K., et al., An AML1-ETO/miR-29b-1 regulatory circuit modulates phenotypic properties of acute myeloid leukemia cells. Oncotarget, 2017. 8(25): p. 39994-40005. 198. Barseguian, K., et al., Multiple subnuclear targeting signals of the leukemia- related AML1/ETO and ETO repressor proteins. Proc Natl Acad Sci U S A, 2002. 99(24): p. 15434-9. 199. McNeil, S., et al., The t(8;21) chromosomal translocation in acute myelogenous leukemia modifies intranuclear targeting of the AML1/CBFalpha2 transcription factor. Proc Natl Acad Sci U S A, 1999. 96(26): p. 14882-7. 200. Amann, J.M., et al., ETO, a target of t(8;21) in acute leukemia, makes distinct contacts with multiple histone deacetylases and binds mSin3A through its oligomerization domain. Mol Cell Biol, 2001. 21(19): p. 6470-83. 201. Davis, J.N., L. McGhee, and S. Meyers, The ETO (MTG8) gene family. Gene, 2003. 303: p. 1-10. 202. Gelmetti, V., et al., Aberrant recruitment of the - complex by the acute myeloid leukemia fusion partner ETO. Mol Cell Biol, 1998. 18(12): p. 7185-91. 203. Lin, S., J.C. Mulloy, and S. Goyama, RUNX1-ETO Leukemia. Adv Exp Med Biol, 2017. 962: p. 151-173. 204. Lutterbach, B., et al., ETO, a target of t(8;21) in acute leukemia, interacts with the N-CoR and mSin3 . Mol Cell Biol, 1998. 18(12): p. 7176-84. 205. Wang, J., et al., ETO, fusion partner in t(8;21) acute myeloid leukemia, represses transcription by interaction with the human N-CoR/mSin3/HDAC1 complex. Proc Natl Acad Sci U S A, 1998. 95(18): p. 10860-5.

100

206. Eyholzer, M., et al., The tumour-suppressive miR-29a/b1 cluster is regulated by CEBPA and blocked in human AML. Br J Cancer, 2010. 103(2): p. 275-84. 207. Gong, J.N., et al., The role, mechanism and potentially therapeutic application of microRNA-29 family in acute myeloid leukemia. Cell Death Differ, 2014. 21(1): p. 100-12. 208. Golub, T.R., et al., Fusion of the TEL gene on 12p13 to the AML1 gene on 21q22 in acute lymphoblastic leukemia. Proc Natl Acad Sci U S A, 1995. 92(11): p. 4917-21. 209. Lam, K. and D.E. Zhang, RUNX1 and RUNX1-ETO: roles in hematopoiesis and leukemogenesis. Front Biosci (Landmark Ed), 2012. 17: p. 1120-39. 210. Look, A.T., Oncogenic transcription factors in the human acute . Science, 1997. 278(5340): p. 1059-64. 211. Nucifora, G., et al., Consistent intergenic splicing and production of multiple transcripts between AML1 at 21q22 and unrelated genes at 3q26 in (3;21)(q26;q22) translocations. Proc Natl Acad Sci U S A, 1994. 91(9): p. 4004- 8. 212. Rubin, C.M., et al., t(3;21)(q26;q22): a recurring chromosomal abnormality in therapy-related myelodysplastic syndrome and acute myeloid leukemia. Blood, 1990. 76(12): p. 2594-8. 213. Rubin, C.M., et al., Association of a chromosomal 3;21 translocation with the blast phase of chronic myelogenous leukemia. Blood, 1987. 70(5): p. 1338-42. 214. Zelent, A., M. Greaves, and T. Enver, Role of the TEL-AML1 fusion gene in the molecular pathogenesis of childhood acute lymphoblastic leukaemia. Oncogene, 2004. 23(24): p. 4275-83. 215. De Braekeleer, E., et al., RUNX1 translocations and fusion genes in malignant hemopathies. Future Oncol, 2011. 7(1): p. 77-91. 216. Mangan, J.K. and N.A. Speck, RUNX1 mutations in clonal myeloid disorders: from conventional cytogenetics to next generation sequencing, a story 40 years in the making. Crit Rev Oncog, 2011. 16(1-2): p. 77-91. 217. Ding, Y., et al., AML1/RUNX1 point mutation possibly promotes leukemic transformation in myeloproliferative neoplasms. Blood, 2009. 114(25): p. 5201-5. 218. Gelsi-Boyer, V., et al., Genome profiling of chronic myelomonocytic leukemia: frequent alterations of RAS and RUNX1 genes. BMC Cancer, 2008. 8: p. 299. 219. Imai, Y., et al., Mutations of the AML1 gene in myelodysplastic syndrome and their functional implications in leukemogenesis. Blood, 2000. 96(9): p. 3154-60. 220. Kuo, M.C., et al., RUNX1 mutations are frequent in chronic myelomonocytic leukemia and mutations at the C-terminal region might predict acute myeloid leukemia transformation. Leukemia, 2009. 23(8): p. 1426-31. 221. Osato, M., Point mutations in the RUNX1/AML1 gene: another actor in RUNX leukemia. Oncogene, 2004. 23(24): p. 4284-96. 222. Song, W.J., et al., Haploinsufficiency of CBFA2 causes familial thrombocytopenia with propensity to develop acute myelogenous leukaemia. Nat Genet, 1999. 23(2): p. 166-75.

101

223. Churpek, J.E., et al., Identification and molecular characterization of a novel 3′ mutation in RUNX1 in a family with familial platelet disorder. Leuk Lymphoma, 2010. 51(10): p. 1931-5. 224. Heller, P.G., et al., Low Mpl receptor expression in a pedigree with familial platelet disorder with predisposition to acute myelogenous leukemia and a novel AML1 mutation. Blood, 2005. 105(12): p. 4664-70. 225. Michaud, J., et al., In vitro analyses of known and novel RUNX1/AML1 mutations in dominant familial platelet disorder with predisposition to acute myelogenous leukemia: implications for mechanisms of pathogenesis. Blood, 2002. 99(4): p. 1364-72. 226. Banerji, S., et al., Sequence analysis of mutations and translocations across breast cancer subtypes. Nature, 2012. 486(7403): p. 405-9. 227. Kan, Z., et al., Diverse somatic mutation patterns and pathway alterations in human cancers. Nature, 2010. 466(7308): p. 869-73. 228. Sjoblom, T., et al., The consensus coding sequences of human breast and colorectal cancers. Science, 2006. 314(5797): p. 268-74. 229. Daub, H., et al., Kinase-selective enrichment enables quantitative phosphoproteomics of the kinome across the cell cycle. Mol Cell, 2008. 31(3): p. 438-48. 230. Stender, J.D., et al., Genome-wide analysis of estrogen receptor alpha DNA binding and tethering mechanisms identifies Runx1 as a novel tethering factor in receptor-mediated transcriptional activation. Mol Cell Biol, 2010. 30(16): p. 3943-55. 231. Cerami, E., et al., The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov, 2012. 2(5): p. 401-4. 232. Ciriello, G., et al., Comprehensive Molecular Portraits of Invasive Lobular Breast Cancer. Cell, 2015. 163(2): p. 506-19. 233. Curtis, C., et al., The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature, 2012. 486(7403): p. 346-52. 234. Gao, J., et al., Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal, 2013. 6(269): p. pl1. 235. Griffith, O.L., et al., The prognostic effects of somatic mutations in ER-positive breast cancer. Nat Commun, 2018. 9(1): p. 3476. 236. Razavi, P., et al., The Genomic Landscape of Endocrine-Resistant Advanced Breast Cancers. Cancer Cell, 2018. 34(3): p. 427-438.e6. 237. Kadota, M., et al., Delineating genetic alterations for tumor progression in the MCF10A series of breast cancer cell lines. PLoS One, 2010. 5(2): p. e9201. 238. Wang, L., J.S. Brugge, and K.A. Janes, Intersection of FOXO- and RUNX1- mediated gene expression programs in single breast epithelial cells during morphogenesis and tumor progression. Proc Natl Acad Sci U S A, 2011. 108(40): p. E803-12. 239. Paik, J.H., et al., FoxOs are lineage-restricted redundant tumor suppressors and regulate endothelial cell homeostasis. Cell, 2007. 128(2): p. 309-23.

102

240. Silva, F.P., et al., Identification of RUNX1/AML1 as a classical tumor suppressor gene. Oncogene, 2003. 22(4): p. 538-47. 241. Chimge, N.O., et al., RUNX1 prevents oestrogen-mediated AXIN1 suppression and beta-catenin activation in ER-positive breast cancer. Nat Commun, 2016. 7: p. 10751. 242. Davis, R.L., H. Weintraub, and A.B. Lassar, Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell, 1987. 51(6): p. 987-1000. 243. Edmondson, D.G. and E.N. Olson, A gene with homology to the myc similarity region of MyoD1 is expressed during myogenesis and is sufficient to activate the muscle differentiation program. Genes Dev, 1989. 3(5): p. 628-40. 244. Wright, W.E., D.A. Sassoon, and V.K. Lin, Myogenin, a factor regulating myogenesis, has a domain homologous to MyoD. Cell, 1989. 56(4): p. 607-17. 245. Li, J., et al., New insights into transcriptional and leukemogenic mechanisms of AML1-ETO and E2A fusion proteins. Front Biol (Beijing), 2016. 11(4): p. 285- 304. 246. Langmead, B. and S.L. Salzberg, Fast gapped-read alignment with Bowtie 2. Nat Methods, 2012. 9(4): p. 357-9. 247. Zhang, Y., et al., Model-based analysis of ChIP-Seq (MACS). Genome Biol, 2008. 9(9): p. R137. 248. Kim, W., et al., RUNX1 is essential for mesenchymal stem cell proliferation and myofibroblast differentiation. Proc Natl Acad Sci U S A, 2014. 111(46): p. 16389- 94. 249. Ozaki, T., A. Nakagawara, and H. Nagase, RUNX Family Participates in the Regulation of p53-Dependent DNA Damage Response. Int J Genomics, 2013. 2013: p. 271347. 250. Satoh, Y., et al., C-terminal mutation of RUNX1 attenuates the DNA-damage repair response in hematopoietic stem cells. Leukemia, 2012. 26(2): p. 303-11. 251. Wang, C.Q., et al., Disruption of Runx1 and Runx3 leads to bone marrow failure and leukemia predisposition due to transcriptional and DNA repair defects. Cell Rep, 2014. 8(3): p. 767-82. 252. Wu, D., et al., Runt-related transcription factor 1 (RUNX1) stimulates tumor suppressor p53 protein in response to DNA damage through complex formation and . J Biol Chem, 2013. 288(2): p. 1353-64. 253. Illendula, A., et al., Small Molecule Inhibitor of CBFbeta-RUNX Binding for RUNX Transcription Factor Driven Cancers. EBioMedicine, 2016. 8: p. 117-131. 254. Cargnello, M., J. Tcherkezian, and P.P. Roux, The expanding role of mTOR in cancer cell growth and proliferation. Mutagenesis, 2015. 30(2): p. 169-76. 255. Ciruelos Gil, E.M., Targeting the PI3K/AKT/mTOR pathway in estrogen receptor-positive breast cancer. Cancer Treat Rev, 2014. 40(7): p. 862-71. 256. Gutschner, T., M. Hammerle, and S. Diederichs, MALAT1 -- a paradigm for long noncoding RNA function in cancer. J Mol Med (Berl), 2013. 91(7): p. 791-801. 257. Yu, X., et al., NEAT1: A novel cancer-related long non-coding RNA. Cell Prolif, 2017. 50(2).

103

258. Li, X., et al., Upregulation of HES1 Promotes Cell Proliferation and Invasion in Breast Cancer as a Prognosis Marker and Therapy Target via the AKT Pathway and EMT Process. J Cancer, 2018. 9(4): p. 757-766. 259. Strom, A., et al., The Hairy and Enhancer of Split homologue-1 (HES-1) mediates the proliferative effect of 17beta-estradiol on breast cancer cell lines. Oncogene, 2000. 19(51): p. 5951-3. 260. Weyemi, U., et al., Twist1 and Slug mediate H2AX-regulated epithelial- mesenchymal transition in breast cells. Cell Cycle, 2016. 15(18): p. 2398-404. 261. Ellis, M.J., et al., Whole-genome analysis informs breast cancer response to aromatase inhibition. Nature, 2012. 486(7403): p. 353-60. 262. Clarke, R., J.J. Tyson, and J.M. Dixon, Endocrine resistance in breast cancer – an overview and update. Mol Cell Endocrinol, 2015. 418(0 3): p. 220-34. 263. Glidewell-Kenney, C., et al., ERE-independent ERalpha target genes differentially expressed in human breast tumors. Mol Cell Endocrinol, 2005. 245(1-2): p. 53-9. 264. Wang, C., et al., Estrogen induces c-myc gene expression via an upstream enhancer activated by the estrogen receptor and the AP-1 transcription factor. Mol Endocrinol, 2011. 25(9): p. 1527-38. 265. Campbell, K.J. and R.J. White, MYC Regulation of Cell Growth through Control of Transcription by RNA Polymerases I and III. Cold Spring Harb Perspect Med, 2014. 4(5). 266. Arabi, A., et al., c-Myc associates with ribosomal DNA and activates RNA polymerase I transcription. Nat Cell Biol, 2005. 7(3): p. 303-10. 267. Grandori, C., et al., c-Myc binds to human ribosomal DNA and stimulates transcription of rRNA genes by RNA polymerase I. Nat Cell Biol, 2005. 7(3): p. 311-8. 268. Kageyama, R., T. Ohtsuka, and T. Kobayashi, The Hes gene family: repressors and oscillators that orchestrate embryogenesis. Development, 2007. 134(7): p. 1243-51. 269. Rani, A., et al., HES1 in immunity and cancer. Cytokine Growth Factor Rev, 2016. 30: p. 113-7. 270. Fisher, A.L., S. Ohsako, and M. Caudy, The WRPW motif of the hairy-related basic helix-loop-helix repressor proteins acts as a 4-amino-acid transcription repression and protein-protein interaction domain. Mol Cell Biol, 1996. 16(6): p. 2670-7. 271. Grbavec, D. and S. Stifani, Molecular interaction between TLE1 and the carboxyl-terminal domain of HES-1 containing the WRPW motif. Biochem Biophys Res Commun, 1996. 223(3): p. 701-5. 272. Paroush, Z., et al., Groucho is required for Drosophila neurogenesis, segmentation, and sex determination and interacts directly with hairy-related bHLH proteins. Cell, 1994. 79(5): p. 805-15. 273. Hilton, M.J., et al., Notch signaling maintains bone marrow mesenchymal progenitors by suppressing osteoblast differentiation. Nat Med, 2008. 14(3): p. 306-14.

104

274. Gao, J., et al., RUNX3 directly interacts with intracellular domain of Notch1 and suppresses Notch signaling in hepatocellular carcinoma cells. Exp Cell Res, 2010. 316(2): p. 149-57. 275. Imai, Y., et al., TLE, the human homolog of groucho, interacts with AML1 and acts as a repressor of AML1-induced transactivation. Biochem Biophys Res Commun, 1998. 252(3): p. 582-9. 276. Weyemi, U., et al., The histone variant H2A.X is a regulator of the epithelial- mesenchymal transition. Nat Commun, 2016. 7: p. 10711. 277. Standaert, L., et al., The long noncoding RNA Neat1 is required for mammary gland development and lactation. Rna, 2014. 20(12): p. 1844-9. 278. Lo, P.K., et al., Dysregulation of the BRCA1/long non-coding RNA NEAT1 signaling axis contributes to breast tumorigenesis. Oncotarget, 2016. 7(40): p. 65067-65089. 279. Zhao, D., et al., NEAT1 negatively regulates miR-218 expression and promotes breast cancer progression. Cancer Biomark, 2017. 20(3): p. 247-254. 280. Barutcu, A.R., et al., RUNX1 contributes to higher-order chromatin organization and gene regulation in breast cancer cells. Biochim Biophys Acta, 2016. 1859(11): p. 1389-1397. 281. Li, Q., et al., Measuring reproducibility of high-throughput experiments.

105