Expression of EZH1-Polycomb Repressive Complex 2 and MALAT1 lncRNA and their Combined Role in Epigenetic Adaptive Response

Thesis by

Lamya Al Fuhaid

In Partial Fulfillment of the Requirements

For the Degree of

Master of Science

King Abdullah University of Science and Technology

Thuwal, Kingdom of Saudi Arabia

July, 2019

2

EXAMINATION COMMITTEE PAGE

The thesis of Lamya Al Fuhaid is approved by the examination committee.

Committee Chairperson: Prof. Valerio Orlando Committee Members: Prof. Stefan Arold, Prof. Salim Al Babili

3

COPYRIGHT

© July, 2019

Lamya Al Fuhaid

All Rights Reserved

4

ABSTRACT

Expression of EZH1-Polycomb Repressive Complex 2 and MALAT1

lncRNA and their Combined Role in Epigenetic Adaptive Response

Lamya Al Fuhaid

Living cells maintain stable transcriptional programs while exhibiting plasticity that allows them to respond to environmental stimuli. The Polycomb repressive complex 2

(PRC2) is a key regulator of chromatin structure that maintains silencing through the methylation of histone H3 on lysine 27 (H3K27me), establishing chromatin-based memory. Two variants of PRC2 are present in mammalian cells, PRC2-EZH2 which is predominantly present in differentiating cells, and PRC2-EZH1 that predominates in post-mitotic tissues.

PRC2-EZH1α/β pathway is involved in the response of muscle cells to oxidative stress. Atrophied muscle cells respond to oxidative stress by enabling the nuclear translocation of EED and its assembly with EZH1α and SUZ12. Here we prove that the metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) long noncoding RNA (lncRNA) is required for the assembly of PRC2-EZH1 components.

The absence of MALAT1 significantly decreased the association between EED and

EZH1α proteins. Biochemical analysis shows that the presence of MALAT1 increases the enzymatic activity of PRC2-EZH1 in vitro.

In addition, we show that the simultaneous expression of PRC2 core components is necessary for their solubility. The successful expression of PRC2 proteins enables the execution of several downstream experiments, which will further explain the nature of the interplay between MALAT1 and PRC2. 5

ACKNOWLEDGEMENTS

I would like to thank my supervisor, Prof. Valerio Orlando for providing me with the opportunity to explore the field of epigenetics. I would also like to show great gratitude to Dr. Peng Liu, who was generous with his time and has put immense effort into training and assisting me. I extend my thanks to my committee members, Prof.

Stefan Arold and Prof. Salim Al Babili, for their time and valuable input.

Acknowledgment goes to my labmate, Dr. Nadine El Said, for her insightful data, and

Dr. Muhammad Tehseen from the Laboratory of DNA Replication and

Recombination, for providing SF9 insect cells and pFastBac vector. Most importantly, for his guidance and patience in answering all of my questions.

Special thanks to my dear friend Mohammed Alomari, for his help in the in vitro . Thank you for the support, the company, and for lending me autoclaved flasks when I was too lazy to autoclave my own. My best friend, Arwa Alghuneim, you have been with me through it all, and without you, it would have been dull.

Thank you for your interest in my work and for helping me whenever I asked, or did not. Last but not least, my mom, from whom I got the love of knowledge and the pursuit of perfection, epigenetics I guess, I love you. 6

TABLE OF CONTENTS

EXAMINATION COMMITTEE PAGE ______2

COPYRIGHT ______3

ABSTRACT ______4

ACKNOWLEDGEMENTS ______5

TABLE OF CONTENTS ______6

LIST OF ABBREVIATIONS ______9

LIST OF SYMBOLS ______12

LIST OF ILLUSTRATIONS ______13

LIST OF TABLES ______17

CHAPTER 1: INTRODUCTION ______18

1.1. Polycomb Repressive Complex ______20 1.1.1. Classes of Polycomb repressive complex ______20 1.1.2. PRC2 Components ______21 1.1.2.1. EZH protein ______21 1.1.2.2. EED protein ______21 1.1.2.3. SUZ12 protein ______22 1.1.2.4. RbAp46 and RbAp48 proteins ______23 1.1.2.5. AEBP2 and JARID2 proteins ______23

1.2. PRC2 in development and plasticity ______23 1.2.1. PRC2 in cellular memory ______23 1.2.2. The interchange between PRC2-EZH2 and PRC2-EZH1 ______24 1.2.3. PRC2 and adaptive response in postmitotic cells ______26

1.3. Interplay between PRC2 and MALAT1: ______27 1.3.1. Non-coding RNA: ______27 1.3.2. lncRNA and PRC2 function: ______28 1.3.3. Interaction between MALAT1 and PRC2 ______30 1.3.3.1. MALAT1 and PRC2-EZH2 ______30 7

1.3.4. Investigation of the involvement of MALAT1 in PRC2-EZH1 mediated plasticity 31

1.4. Scope of the thesis: Objectives, contributions, and model system ______34

CHAPTER 2: MATERIALS AND METHODS ______35

2.1. Mammalian cell culture ______35

2.2. Stable cell lines ______35

2.3. Immunofluorescence ______35

2.4. Cell treatments ______36

2.5. RNA isolation and RT-qPCR ______36

2.6. Co-immunoprecipitation ______37

2.7. LC-MS analysis and MS data analysis ______37

2.8. ChIP-seq ______39

2.9. Expression plasmids ______39

2.10. RNA in vitro transcription ______40

2.11. Protein expression and purification in bacteria ______40

2.12. Protein expression in insect cells ______41

2.13. Western blotting ______42

2.14. Histone methyltransferase activity ______42

2.15. Statistical analysis ______43

CHAPTER 3: RESULTS ______44

3.1. Generation of stable C2C12 cell line expressing tagged EZH1α ______44

3.2. Identification of suitable antisense oligos for MALAT1 loss of function experiments ______46

3.3. MALAT1 is required for the proper association of PRC2 core components in the nucleus ______47 8

3.4. MALAT1 is necessary for the binding of EZH1α and EED to muscle-specific promoters under oxidative stress ______50

3.5. Expression of PRC2 core components in E. coli ______51

3.6. PRC2 proteins are insoluble in E. coli except EED ______54

3.7. Expression of PRC2 core components in insect cells ______57

3.8. Simultaneous expression of the three core components of PRC2-EZH1 is necessary for their solubility ______59

3.9. In vitro transcription of MALAT1 ______60

3.10. MALAT1 positively affects the enzymatic activity of PRC2-EZH1 in vitro _ 61

CHAPTER 4: DISCUSSION ______63

CHAPTER 5: CONCLUSION ______68

APPENDIX A: VECTOR MAPS ______69

1. pOZ-FH-C-Puro vector ______69

2. pJET1.2 vector ______70

3. pGEX-5X-1 vector ______71

4. pFastBac HT C vector ______72

APPENDIX B: PRIMER SEQUENCES ______73

1. qPCR primers ______73

2. ChIP ______73

3. Cloning primers ______74

4. Sequencing primers: ______74

APPENDIX C: ANTIBODIES ______76

1. Western blotting ______76

REFERENCES ______77

9

LIST OF ABBREVIATIONS

3D Three dimensional

ASO Antisense oligos

ChIP Chromatin immunoprecipitation

ChIP-qPCR Chromatin immunoprecipitation followed by real-time polymerase chain reaction

CRPC Castration-resistant prostate cancer

Cryo-EM Cryogenic electron microscopy

DMEM Dulbecco's modified Eagle's medium

DNA Deoxyribonucleic acid dSfmbt Scm-related gene containing four malignant brain tumor domains EED Embryonic ectoderm development

ELISA Enzyme-linked immunosorbent assay

EMSA Electrophoretic mobility shift assay

EMT Epithelial to mesenchymal transmission

ERK Extracellular signal-regulated kinase

EZH1 Enhancer of zeste homolog 1

EZH1α-FH EZH1α fused with FLAG and HA tags

EZH2 Enhancer of zeste homolog 2

FISH RNA fluorescent in situ hybridization

H2AK119 Histone H2A at lysine 119

H3K27me Histone H3 lysine 27 methylation

H3K27me3 Histone H3 lysine 27 trimethylation

H3K4me3 Histone H3 lysine 4 trimethylation

H3S28ph Histone H3 at serine 28 phosphorylation

HCD Higher energy collision 10

HMTase Histone methyltransferase

IF Immunofluorescence

IP Immunoprecipitation

JmjC Jumonji C

LNA Antisense locked nucleic acid dissociation lncRNA Long noncoding RNA

MALAT1 Metastasis-associated lung adenocarcinoma transcript 1

MAPK Mitogen-activated protein kinase

MBT Malignant brain tumor mCK Muscle creatine kinase mRNA Messenger RNA

Msk-1 Mitogen-and-stress-activated protein kinase 1

Msk-2 Mitogen-and-stress-activated protein kinase 2

MW Molecular weight

MyoG Myogenin ncRNA Non-coding RNA

OD Optical density

Pc Chromodomain of Polycomb

PcG Polycomb group

PCR Polymerase chain reaction

Ph Polyhomeotic

Pho Pleiohomeotic

Pol II RNA polymerase II

PRC1 Polycomb repressive complex 1

PREs Polycomb group response elements

Psc Posterior sex combs 11 qPCR quantitative polymerase chain reaction

RIP RNA immunoprecipitation

RNA Ribonucleic acid rRNA Ribosomal RNA

SET Su(var) 3–9-enhancer of zeste- trithorax snoRNA Small nucleolar RNA snRNA Small nuclear RNA

SR Serine/arginine

SUZ12 Suppressor of zeste 12

TAP Tandem affinity purification

TREs Trithorax group response elements trxG Trithorax group

VEFS Vrn2-emf2-fis2-suz12

WDB WD-40 binding domain

X-Gal X- 5-Bromo-4-chloro-3-indolyl-β-D-galactopyranoside 12

LIST OF SYMBOLS

D Aspartic acid

H2O2 Hydrogen peroxide

Kd Binding constant

W Tryptophan

13

LIST OF ILLUSTRATIONS

Figure 1 Schematic model of the function of Ezh1α and Ezh1β in differentiated muscle cells. Ezh1β acts as a regulator that prevents PRC2-EZH1 activity, through its binding to EED in the cytosol. The induction of oxidative stress frees EED and enables it nuclear localization, and in turn complex assembly. The effect of oxidative stress is reversible. (Adapted from Bodega et al., 2017.) ...... 27 Figure 2 Association of MALAT1 with PRC2-EZH1 in atrophic skeletal muscle cells. a) C2C12 cells were treated with H2O2 to induce atrophy and native RIP followed by real-time quantitative PCR (nRIP-qPCR) and and formaldehyde cross- linking RIP followed by qPRC (fRIP-qPCR) were performed to map RNA-protein interaction. EED was immunoprecipitated together with its associated RNA molecules and the fold enrichment of MALAT1 over mock was measured upon oxidative stress (pink) compared to normal C2C12 (blue). The results are consistent without formaldehyde crosslinking (nRIP) and with crosslinking (fRIP). Statistical test was carried out as two-way ANOVA where *=P<0.05, **=P<0.01, ***=P<0.001, ****=P<0.0001 n=3, error bars representing mean with SD). b) IF of H3K27me3 (green) and DAPI (blue) in C2C12 cells in normal (control), oxidative stress (H2O2 stress) , and oxidative stress and stress with Malat1-knockdown (Stress + Malat1KD). c) FISH of Malat1 (red), Eed (green), and DAPI (blue), in wild type and MALAT1 knockdown (Malat1-KD) C2C12 in both normal (control) and oxidative stress (stress) conditions. (Nadine El Said, PhD Thesis, KAUST 2019) ...... 33 Figure 3 The generation of EZH1α-FH tagged stable C2C12 cell line. (a) IF EZH1α-FH (green), myosin heavy chain (red), and DAPI (blue) confirms the presence of the tagged protein in both myoblast (left) and myotube (right). (b) Silver staining of immunoprecipitated EZH1α-FLAG-HA (1α-FH) using Flag or HA for the elution confirms the presence of EZH1α. Input samples represents non-eluted cells. (c) MS data showing the mascot score and number of unique peptides corresponding a list of proteins that associate with EZH1α in the cell in three replicates of immunoprecipitated EZH1α-FH samples. The high mascot number is indicative of high presence of EZH1α...... 45 Figure 4 Quantification of the expression of MALAT1 in C2C12 cells transfected with different MALAT1 antisense oligos. Cells were transfected with one of three different MALAT1 antisense oligos: MALAT #1, MALAT1 #2, MALAT1 #3 using two concentrations: 100 nM (a) and 50 nM (b). The expression of MALAT1 was quantified by qPCR and compared to the expression of MALAT1 in cells transfected with Scramble RNA. The shown percentages represent the expression percentage in reference to the scramble. Values are expressed as mean ± SD (n=3). **P<0.01, ***P<0.001, ****P<0.0001 with respect to the scramble. The absence of the asterisks indicates nonsignificant difference...... 47 Figure 5 Co-IP of physically associated PRC2 proteins in the presence and absence of Malat1. (a) Western blots (top) of immunoprecipitated EZH1α-FH (EZH1α IP) from C2C12 cells transfected with scramble RNA (-) and MALAT1-KD 14

C2C12 (+) both treated with H2O2, using anti-EED antibodies and anti-HA for control. Input samples represent non-eluted cells. Ponceau S staining was used as loading control (bottom) (b) The relative intensities of EED and SUZ12 enrichment in immunoprecipitated EZH1α-FH from C2C12 cells transfected with scramble RNA

(black) and MALAT1-KD C2C12 (gray) both treated with H2O2. Values are expressed as mean ± SD (n=3) and P<0.05 is considered for significance...... 49 Figure 6 The effect of MALAT1 on the binding of PRC2 proteins to muscle- specific promoters under oxidative stress. ChIP-qPCR of (a) EED, (b) EZH1α, and (c) H3K27me3 on the promoter regions of muscle-specific : myogenin (MyoG), neurogenin (NeuroG), myosin heavy chain 3 (Myh3), myosin heavy chain (Myh8), and desmin in control, H2O2 stress (Stress), and H2O2 stress + MALAT1 knockdown (Stress + Malat1-KD). Values are expressed as mean ± SD (n=3). **P<0.01, ***P<0.001, ****P<0.0001 with respect to the control. The absence of the asterisks indicates nonsignificant difference. (Adapted from Nadine El Said, PhD Thesis, KAUST 2019) ...... 50 Figure 7 Cloning of PRC2 core components into E. coli expression plasmids. (a) Schematic representation of the expression plasmid constructs. Genes were cut from the backbone vector (1) and ligated into the destination vector (2). (b) Diagnostic digestion of the constructed plasmids to confirm correct ligation in pJET1.2 vector (1). Restriction enzymes used and expected fragment sizes for each digestion are represented in the table. (c) Diagnostic digestion of the constructed plasmids to confirm correct ligation in pGEX-5x-1vector (2). Restriction enzymes used and expected fragment sizes for each digestion are represented in the table.1% gel electrophoresis was run using 1 kb plus ladder and only constructs that showed fragments of expected sized were picked and confirmed by Sanger sequencing (not shown). Vector maps are shown in appendix A...... 53 Figure 8 Expression trials of PRC2 components in bacteria at various conditions. The expression of EZH1α (a), EZH1β (b), EED441 (c), EED500 (d), and Suz12 (e) was carried out at 37ºC, 30ºC, or 20ºC using 0.5mM or 1mM IPTG to induce the expression. (-) represents un-induced samples, and (+) represents samples induced with 0.5mM IPTG. Proteins were loaded on 4-12% gel, and Spectra Multicolor Broad Range Protein Ladder was used as marker. (f) Protein blots using anti-GST antibodies to validate the expression of the protein in the different fractions...... 54 Figure 9 Protein presence in different fraction of the bacterial lysates. Total lysates were collected (total) then centrifuged and the supernatant (soluble) and the debris (pellet) were isolated then loaded to check the presence of the indicated protein. Proteins were expressed and lysed using three different methods (a), (b), and (c) varying in temperature and lysis buffer components as described in the methods. (- ) represents un-induced samples, and (+) represents samples induced with 0.5 mM IPTG. Proteins were loaded on 4-12% gel, and Spectra Multicolor Broad Range Protein Ladder (a, b) and BioRad Precision Plus Protein Kaleidoscope Prestained Protein ladder (c) were used as markers...... 56 Figure 10 Cloning of PRC2 core components into insect expression plasmids. (a) Schematic representation of the expression plasmid constructs. Genes were cut from 15 the backbone vector (1) and ligated into the destination vector (2). (b) Diagnostic digestion of the constructed plasmids to confirm correct ligation in pFastBac-HT C vector (2). Restriction enzymes used and expected fragment sizes for each digestion are represented in the table.1% gel electrophoresis was run using 1 kb plus ladder and only constructs that showed fragments of expected sized were picked and confirmed by Sanger sequencing (not shown). Vector maps are shown in appendix A...... 57 Figure 11 Schematic representation of baculovirus expression main steps. pFastBac vector is transformed into DH10Bac E. coli and the bacmid is isolated and transfected into SF9 insect cells to produce multiple generations of viruses that are used to infect SF9 cells to produce the proteins...... 58 Figure 12 Expression of PRC2 core components in SF9 cells. PRC2 Proteins were expressed in SF9 cells and SUMO was expressed as a positive control. Total lysates were collected (T) or centrifuged and the supernatant was collected as the soluble part (S). The fractions were loaded on 4-12% gels to check the presence of the indicated protein, and BioRad Precision Plus Protein Kaleidoscope Prestained Protein ladder was used as marker...... 59 Figure 13 Co-infection of PRC2 core components in insect cells. (a) SF9 cells were co-infected with variable concentrations of EZH1α, EED, and SUZ12 in 1:1:1 volume ratio. Total lysates were collected (T) or centrifuged and the supernatant was collected as the soluble part (S). The fractions were loaded on 4-12% gel to check the presence of the proteins, and BioRad Precision Plus Protein Kaleidoscope Prestained Protein ladder was used as marker. (b) Western blotting of the co-infected cells using protein- specific antibodies to validate the presence of proteins of a similar MW...... 60 Figure 14 In vitro transcription of MALAT1. (a) MALAT1 full length in vitro transcription plasmid map with restriction sites. (b) Diagnostic digestion of cloned MALAT1 plasmid (10 kb) shows a band of 7 kb corresponding to MALAT1 and another band of 3 kb corresponding to remaining fragment. (c) Gel electrophoresis of in vitro transcribed MALAT1. Samples were run on 1% agarose gel using 1 kb plus marker...... 61 Figure 15 The level of H3K27me3 exerted by PRC2 core components with different concentrations of MALAT1. EZH1α, EED, and SUZ12 were added in a 1:1:1 volume ratio with varying amounts of MALAT1 and the relative levels of H3K27me3 were calorimetrically detected by absorbance measurement (OD). Values are expressed as mean ± SD (n=8). **P<0.01, ***P<0.001, ****P<0.0001 with respect to the scramble. The absence of the asterisks indicates nonsignificant difference. (Experiment performed by Nadine El Said) ...... 62 Figure 16 Schematic suggested model of the involvement of MALAT1 in the adaptive response of skeletal muscle cells to oxidative stress. At normal conditions, MALAT1 is enriched on active genes, and PRC2-EHZ1 forms a noncanonical complex in the nucleus that binds to active genes. Upon stress, MALAT1 facilitates the full assembly of PRC2-EZH1 core components in the nucleus, enhancing gene repression...... 67 Figure 17 pOZ-FH-C-Puro vector map. (Adapted from Addgene)...... 69 Figure 18 pJET1.2 vector map. (Adapted from SnapGene)...... 70 16

Figure 19 pGEX-5X-1 vector map. (Adapted from SnapGene)...... 71 Figure 20 pFastBac HT C vector map. (Adapted from SnapGene)...... 72

17

LIST OF TABLES

Table 1 Sequences of primers used for qPCR. ______73 Table 2 Sequences of primers used for ChIP. ______73 Table 3 Sequences of primers used for cloning. ______74 Table 4 Sequences of primers used for Sanger sequencing. ______74 Table 5 Primary antibodies and the corresponding secondary antibodies. ______76

18

CHAPTER 1: INTRODUCTION

A complex network of epigenetic mechanisms contributes to controlling 1,2. This informational complement to the DNA sequence, the epigenome, is made of chemical modifications and higher order spatial organization of the genome inside the nucleus. The epigenome determines the ability of cells to faithfully follow developmental programs and maintain their fate. In addition to development, epigenome structure confers the necessary level of plasticity that allows cells and adult tissues to respond to environmental stimuli, by modulating gene expression patterns 3–5.

The role of the epigenome is to regulate gene expression by making regulatory elements differentially accessible for transcription factors 6,7. This is achieved by setting different epigenetic modifications, including DNA methylation, chemical modification of histones such as methylation and acetylation, substitution of histone variants, noncoding RNA and changes in chromatin 3D structure 7,8. Importantly, these modifications are actively maintained by chromatin modifiers and other structural components that guarantee homeostatic cell programs even after cell division 9.

The Polycomb group (PcG) of proteins and trithorax group (trxG) were found to be key players in maintaining early-determined gene expression status 2. Early studies in

Drosophila melanogaster revealed that PcG and trxG genes control the expression of a set of transcription factor genes 1,10–12, known as Hox genes, which are responsible for specifying the segment identity along the anteroposterior axis of segmented animals including mammals 13. Interestingly, the spatial expression of Hox genes mirrors their arrangement on the chromosome, with the most 5’ gene expressed in the 19 anterior segment/head and the 3’ in the posterior 13.

Mutations in PcG result in a loss of silencing, leading to a gain-of-function homeotic transformation as a consequence of the hyperactivation of the most posterior genes 14.

Conversely, in double mutant combinations, mutations in trxG suppress the defects caused by PcG mutations by preventing the ectopic expression 14. Therefore, PcG and trxG act antagonistically, where PcG is classified as a transcriptional repressor and trxG is classified as a transcriptional activator 15. In Drosophila, PcG and trxG exert their function by specialized, largely overlapping DNA regulatory elements: PcG response elements (PREs) and trxG response elements (TREs), respectively 1. In combination with target promoters, PREs and TREs are necessary to convey epigenetic inheritance 1. In mammals, PcG and trxG are targeted to CpG islands and promoter regions 2,16.

PcG and trxG act on chromatin, where they chemically modify histones. The PRC2- type of PcG complex has a histone H3 lysine 27 trimethylation activity (H3K27me3).

Similarly, specific trxG complexes also have a histone methyltransferase activity, however, specific to histone H3 lysine 4 trimethylation (H3K4me3) 15. H3K27me3 mark is usually enriched on facultative heterochromatin 17, hence, it is associated with gene silencing 15. Conversely, H3K4me3 is a mark for active genes 15. Therefore, PcG maintains the repressed gene state, whereas trxG maintains the active gene state 1.

Importantly, PcG and trxG are not involved in the initiation and determination of the gene expression state 1,18, instead, they maintain early-determined developmental transcriptional patterns 2,16. Long after the original stimulus is gone, the developmental programs are maintained over many rounds of cell divisions, hence the concept of cell memory 2,16. Accordingly, PcG and TrxG are critical for the 20 maintenance of cell identity and homeostasis, which is relevant for stem cell identity, cancer, and cell reprogramming 15.

Although the main function of PcG is maintaining gene expression status, transcription factors and signaling events can modulate PcG-mediated gene repression

19. Environmental stimuli, such as metabolic or chemical stress, affect cellular microenvironment and tissue homeostasis, which can alter gene repression status. In other words, PcG regulation of gene expression is plastic, which allows the cells to maintain their identity and also respond to environmental stimuli 19.

1.1. Polycomb Repressive Complex

1.1.1. Classes of Polycomb repressive complex

PcG proteins are divided into three classes. The first class is PRC2, which consists of four core components: enhancer of zeste homolog 2 (EZH2), embryonic ectoderm development (EED), suppressor of zeste 12 (SUZ12) and RBAP46 or RBAP48. In addition, PRC2 contains other cofactors: AEBP2 and JARID2. These components work coordinately to imprint histone 3 (H3) K27 methylation 20. The cofactors can assemble with the core components in different combinations, resulting in two different subclasses of PRC2 known as PRC2.1 and PRC2.1 21 The second class is the

Polycomb repressive complex 1 (PRC1), which is larger in size than PRC2. PRC1 contains the chromodomain of Polycomb (Pc), Polyhomeotic (Ph), Posterior sex combs (Psc) and dRing, and other associated factors such as TBP-associated factors.

It functions to ubiquitinate histone H2A at lysine 119 (H2AK119), resulting in the compaction of chromatin 20. The third and last class is PhoRC, which contains a

DNA-binding protein, Pleiohomeotic (Pho), and Scm-related gene containing four 21 malignant brain tumor (MBT) domains (dSfmbt) protein. Through its MBT repeats, dSfmbt protein binds to mono- and dimethylated H4K20 and H3K9 20. The above can work synergistically or independently.

1.1.2. PRC2 Components

1.1.2.1.EZH protein

Two versions of the PRC2 exist in mammals: PRC2-EZH2 and PRC2-EZH1, which have different but similar roles in cell development and differentiation 16. EZH2 is the catalytic subunit of PRC2. It is a histone methyltransferase (HMTase) that contains an evolutionarily conserved carboxy-terminal Su(var) 3–9-enhancer of zeste- trithorax

(SET) domain, which confers the methyltransferase activity. In addition, EZH2 contains five other domains: a WD-40 binding domain (WDB), domains I–II, two

SWI3-ADA2-N-CoR-TFIIIB [SANT] domains, and a cysteine-rich CXC domain 22.

EZH2 protein was found to play a major role in several cancers and other diseases 23–

29. Mammalian cells contain another gene, Ezh1, encoding for the protein EZH1, which has a weaker HMTase activity. While EZH2 is predominant in differentiating cells, EZH1 is exclusively present in embryonic stem cells and post-mitotic tissues

16,30.

1.1.2.2.EED protein

EED is a WD-repeat protein that folds into β-propeller structure 22. WD-repeat proteins are a group of proteins that are characterized by the presence of a repeating motif of ~40 amino acids, of which the last two amino acids are often tryptophan (W) and aspartic acid (D), hence it is known as the WD40 motif 31. EED protein has seven 22 copies of the WD40 motif, each forms four-stranded antiparallel β-sheets, resulting seven-bladed β-propeller structured 22.

Through the binding of its carboxy-terminal domain to N-terminal tail of H3K27me3,

EED functions to recruit PRC to H3K27me3 sites. There, the presence of H3K27me3 acts a positive feedback signal that enhances the HMTase activity of EZH2, maintaining the repressive mark and allowing its transmission to daughter cells 22.

The bottom of the WD40 motif of EED binds to a 30-residue peptide in the EZH2 protein, which is both necessary and sufficient for the interaction 16.

Two interchangeable homologous copies of EED exist in Drosophila, ESC and ESCL

4. In mammals, EED mRNA has four translation starts resulting in four different EED isoforms 32, all of which can facilitate the HMTase activity of PRC2 in vitro and in vivo 22.

1.1.2.3.SUZ12 protein

SUZ12 is a zinc finger protein that is important for the assembly and HMTase activity of the PRC2. It binds to EZH2 protein through its C-terminal Vrn2-Emf2-Fis2-SUZ12

[VEFS] domain, maintaining EZH2 stability. SUZ12(VEFS) domain can also allosterically inhibit the HMTase activity of EZH2, preventing the spread of silenced chromatin. In addition, SUZ12 binds to RbAp46/48 subunit and mediates the interactions with JARID2 cofactor. Among all PRC2 components, SUZ12 has the highest affinity to non-coding (ncRNAs), which makes it an integral subunit in

PRC2-ncRNA interactions 22. 23

1.1.2.4.RbAp46 and RbAp48 proteins

RbAp46 is also known as RBBP7 and RbAp48 is known as RBBP4. These two cofactors are homologous proteins that are not required for the HMTase activity of

PRC222. RbAp48 recruits PRC2 to nucleosomes by binding to histone H3–H4 heterodimers. It is still unclear if RBBP7 and RbAp48 have different roles in the

PRC2. Nevertheless, they both play an integral role in establishing and maintaining chromatin structure 22.

1.1.2.5.AEBP2 and JARID2 proteins

Like SUZ12, AEBP2 is also a zinc-finger protein. It enhances the HMTase activity of

EZH2 by interacting with both EZH2 and SUZ12 22.

JARID2 is a member of the of histone demethylases family, Jumonji C (JmjC) and

ARID domain. However, it is devoid of the enzymatic activity due to the absence of some binding cofactors. Because JARID2 mimics H3K27me3, it leads to the recruitment and activation PRC2. Furthermore, JARID2 has the ability to bind long noncoding RNAs (lncRNAs), facilitating its interaction with EZH2 in vitro 22.

1.2. PRC2 in cell development and plasticity

1.2.1. PRC2 in cellular memory

Both the PcG and trxG are critical in maintaining an already established gene expression status, either silenced or active, respectively 33,34. As one of the PcG classes, PRC2 plays an important role in the protection of cell identity and guarding cellular memory 4. It maintains a repressed gene expression status that can be mitotically and meiotically inherited. 24

Thus, an important aspect is understanding how H3K27me-repressed chromatin states are maintained and passed to daughter cells. There are two proposed models explaining the mechanism of inheritance. The first model suggests that, during DNA replication, methylated H3K27 histones are locally passed to the two daughter chromatids 35. The second model, on the other hand, suggests that PRC2 is passed to daughter cells, where it establishes new H3K27me 36.

In attempt to test these models, a recent study generated and examined

Caenorhabditis elegans embryos containing or lacking PRC2 activity and with or without H3K27me 37. It was shown that cells which lacked PRC2 but contained H3K27me chromosomes were able to maintain the exact pattern for a few rounds of cell division. On the other hand, cells that contained only PRC2 were able to maintain mosaic H3K27me throughout embryogenesis. Concluding that both

H3K27me and PRC2 contribute to the epigenetic transmission of the repression memory, where the first ensures the precision of the pattern and the second provides a long-term memory 37. Because the presence of H3K27me acts as a positive feedback for PRC2 38, this might explain the loss of H3K27me pattern after several rounds of division in cells lacking PRC2.

1.2.2. The interchange between PRC2-EZH2 and PRC2-EZH1

EZH1 and EZH2 play an alternative role during cell development and differentiation.

During mitosis, EZH2 are found in a substantially higher level than EZH116. After that, the levels of EZH1 rise in post-mitotic tissues16, while EZH2 is gradually degraded 4. The alternative expression of Ezh1 and Ezh2 maintains constant levels of

H3K27me3 at PRC2 targets 16. Because EZH2 is dominant during cell division, and has a stronger HMTase activity, it is suggested that PRC2–EZH2 establishes cellular 25

H3K27me2/3 in the cells. While, on the other hand, PRC2–EZH1 functions to maintain the methylation levels by restoring lost H3K27me2/3 4.

Intriguingly, new evidence suggests that the majority of EZH1 binding associates with

H3K4me3 regions, and only a small fraction of the genome is shared by EZH1 and

EZH2 39. Additionally, EZH1 showed an association with regions of RNA polymerase

II (Pol II) binding, suggesting its involvement in mRNA transcription. Further experiments showed that EZH1 interacts with Pol II itself, promoting transcriptional elongation in differentiating skeletal muscle cells 39. These data reveal an interesting role of PRC2, as they associate PcG with a transcriptional activation role, in contrary to its known silencing activity. Later, it was shown that EZH1 forms a complex with

SUZ12 in the nucleus. This complex binds to active domains on the chromatin, and promotes gene expression. However, upon environmental stress, the full complex assembles and turns into a repressor 30.

Moreover, recently, it was shown that PRC2 regulates gene activation during neuronal differentiation, stress response and mitogenic signaling 40,41. Mechanistically, the displacement of PRC2 on chromatin allows for an H3K27/H3S28 methyl/phospho switch, activating otherwise repressed genes 40,41. During myogenesis, mitogen- activated protein kinases (MAPKs), p38, or extracellular signal-regulated kinase

(ERK) come into play and activate several downstream pathways. The activation of p38 or ERK pathways ultimately activates the mitogen-and-stress-activated protein kinases (Msk-1 and Msk-2), which phosphorylate histone H3 at serine 28 (H3S28ph)

42.

Because the activation of muscle-specific genes during myoblast-myotube transition requires transcriptional and post-transcriptional downregulation of Ezh2 16, a similar 26 mechanism was thought to be involved in skeletal muscle cell differentiation. This notion was supported by a later study, which proved that muscle cell differentiation requires the deposition H3S28ph produced by Msk1-dependent signaling 16.

Consequently, PRC2-EZH2 is displaced from the regulatory regions of muscle genes, myogenin (MyoG) and muscle creatine kinase (mCK). Upon differentiation, a PRC2-

EZH2/PRC2-EZH1 switch occurs at MyoG sites, which is critical for controlling the proper timing of its activation 16,43.

1.2.3. PRC2 and adaptive response in postmitotic cells

Epigenetic mechanisms are the key to translate environmental effects into phenotypic changes by altering gene expression. Hence, they allow a plastic, reversible response to environmental stimuli. For example, it was recently shown that the levels of EZH1- mediated H3K27me3 significantly increase at Myog and Myh3 muscle gene promoters, in response to atrophic stress in muscle cells 43. Moreover, very recently, a novel, uncharacterized isoform of the canonical EZH1 was identified 43. This isoform lacks the SET domain and, hence, it is devoid of the HMTase activity. Thereafter, the canonical isoform was referred to as Ezh1α and the non-canonical isoform as Ezh1β.

It was discovered that EZH1β plays an important role in modulating the PRC2–EZH1 adaptive response in postmitotic cells 43.

In muscle cells, EZH1β interacts with EED in the cytosol, which prevents the nuclear translocation of EED. The absence of nuclear EED prevents its association with the rest of PRC2-EZH1 proteins, preventing them from methylating histones. As a result, the expression of target genes remains active. After the stimulation of atrophic conditions by inducing oxidative stress, EZH1β gets ubiquitylated and consequently degraded. The degradation of EZH1β enables the re-localization of EED into the 27 nucleus, where it is associated with the rest of the PRC2-EZH1 components. The assembled complex acts to methylate histones, repressing target muscle genes 43 (Fig.

1).

Figure 1 Schematic model of the function of Ezh1α and Ezh1β in differentiated muscle cells. Ezh1β acts as a regulator that prevents PRC2-EZH1 activity, through its binding to EED in the cytosol. The induction of oxidative stress frees EED and enables it nuclear localization, and in turn complex assembly. The effect of oxidative stress is reversible. (Adapted from Bodega et al., 2017.)

1.3. Interplay between PRC2 and MALAT1:

1.3.1. Non-coding RNA:

The mammalian genome is pervasively transcribed, however, only 5% of the total output consists of messenger RNA (mRNA) coding for proteins 15.The majority of the RNA moieties are ncRNAs 44, which are associated with various functions.

Examples include ribosomal RNA (rRNA) and tRNA which are involved in mRNA processing and translation machinery, small nuclear RNAs (snRNAs) which is involved splicing, and small nucleolar RNAs (snoRNAs) which modifies rRNAs 45.

Other ncRNA molecules were found to be involved in transcriptional regulation through the regulation of chromatin structure 44. ncRNAs are also classified according 28 to their sizes into lncRNAs which defines RNA molecules that are larger than 200 nt

15 and small noncoding RNA molecules, which can be involved in RNA degradation, as well as the repression of gene expression or transcription by RNAi pathways 6,46–52.

To date, about eight thousand lncRNAs were discovered in humans 15. Normally, they are expressed at relatively moderate or low levels, and they are preferentially nuclear and associated with chromatin. They show little conservation; however, due to their extended length, lncRNAs have the ability to fold into different secondary structures, giving them functional versatility 53–55. They can act as scaffolds of multiprotein complexes by providing support to proteins that lack direct protein- protein interactions 15,53,56,57. Interestingly, the complexity of an organism shows a correlation to the number of ncRNAs rather than the number of protein-encoding genes, indicating that they play a role in the evolution of developmental complexity 44

1.3.2. lncRNA and PRC2 function:

It has been widely accepted that the lncRNA molecules control gene expression through the regulation of epigenetic chromatin modulating proteins. One of the main examples of RNA-induced epigenetic regulation is the recruitment of PRC2 proteins by the lncRNA XIST, which is a fundamental player in female X-chromosome inactivation. The recruitment of PRC2 mediates gene inactivation through histone methylation, catalyzed by EZH2 HMTase 58–61.

Analyses of early embryos homozygous for a mutation in the gene Eed, which encodes the PRC2 core component, EED, showed a phenotype that is normally associated with failure of X inactivation 62. Immunofluorescence studies confirmed that EED protein is highly enriched on the inactive X chromosome (Xi), where its localization resembles that of the XSIT lncRNA. Furthermore, the SET domain was 29 also found to be enriched within Xi territories 58,63. These results indicate that Xist might have a direct physical interaction with PRC2.

Extensive studies on XIST RNA facilitated the analysis of other related lncRNAs.

Other ncRNAs appear to recruit PRC2 in a similar manner. In addition to dosage compensation, lncRNAs are involved in the regulation of genome imprinting and eukaryotic developmental gene expression. Specific examples include Kcnqtlot1, which is involved in gene imprinting in mice 64,65, HOTAIR, which is transcribed from the human HOXC locus then directs PRC2 in-trans to act on the HOXD locus in another chromosome 56, and COLDAIR, which recruits PRC2 in Arabidopsis thaliana, regulating vernalization 66.

The affinity of PRC2 for PRC2-binding RNAs has not been thoroughly investigated or quantified. Thus, the binding specificity of RNA to PRC2 is not fully understood.

Genome-wide studies suggest that the binding of ncRNA to PRC2 might need a specific, common RNA secondary structure 67–69. On the other hand, a recent study measured the binding constants (Kd) of various RNAs to, and concluded that PRC2 binds RNA promiscuously, both in vitro and in vivo 70. Interestingly, the same study also showed that although E. coli lacks PRC2, PRC2 was able to bind E. coli maltose- binding protein mRNA with a comparable affinity to that of HOTAIR 70. Until now, the exact binding nature and mechanistic contribution of RNA and PRC2 remains to be understood.

Crosslinking studies indicate that PRC2 also binds nascent mRNA moieties, suggesting that it might have a role in the negative control of transcriptional initiation or elongation 67. According to the junk mail model, PRC2 promiscuously binds nascent mRNA in order to scan chromatin for target genes that have escaped 30 repression 70,71. It is suggested that RNA and chromatin compete for PRC2 binding 72.

At highly expressed genes, the high level of RNA provides decoy for PRC2, stripping it away from chromatin 70,72.

1.3.3. Interaction between MALAT1 and PRC2

1.3.3.1.MALAT1 and PRC2-EZH2

The metastasis-associated lung adenocarcinoma Ttanscript 1 (MALAT1) is a lncRNAs that has also been associated with PRC2. The mature MALAT1 transcript lacks the poly(A) tail and, instead, it is stabilized by a 3’ triple helical structure 73.

MALAT1 is retained in the nucleus 74, where it is involved in various functions such as scaffolding ribonucleoprotein complexes, regulating gene expression by facilitating the recruitment of chromatin modulators to target genes 75. It also regulates by modulating the serine/arginine (SR) splicing factor phosphorylation 74.

Additionally, MALAT1 is involved in the regulation of cell cycle by controlling the expression of oncogenic transcription factor B-MYB, which is involved in the G2/M progression 76.

As its name suggests, it is also associated with the progression and advancement of lung cancer 77, as well as other types of cancers and other diseases 23,24,73,77–82.

MALAT1 was associated with aberrant function of PRC2-EZH2 23,24,78,81, and it has been reported that it can interact with EZH2 23,24 and SUZ12 78 proteins of the PRC2.

A study shows that MALAT1 interacts with the N-terminal of EZH2 in castration- resistant prostate cancer (CRPC) 24. In CRPC cells, It was found that MALAT1 enhances both EZH2-mediated repression of target genes, and EZH2-mediated invasion and migration 24. Similarly, in renal cell carcinoma, MALAT1 interacts with

EZH2 protein, which promotes cancer aggressiveness 23. Moreover, in bladder cancer, 31

MALAT1 was found to be associated with SUZ12 78. MALAT1-SUZ12 association promotes TGF-β-induced an epithelial to mesenchymal transmission (EMT), promoting tumor invasion and metastasis 78.

1.3.4. Investigation of the involvement of MALAT1 in PRC2-EZH1 mediated

plasticity lncRNAs are critical regulatory molecules in cellular stress response in many organisms. For example, NEAT1 lncRNA is upregulated in breast cancer and other solid tumors in response to hypoxia, resulting in the formation of paraspeckles that in turn promote cancer aggressiveness 83. Many lncRNAs have been related to oxidative stress response including MALAT1, which activates the antioxidant pathway as a result of its upregulation in response to H2O2 treatment in human umbilical vein endothelial cells 84. In plants, lncRNA play and integral role in response to stress. A famous example is COLDAIR, which responds to cold by forming repressive heterochromatin through the recruitment of PRC2 66.

Nonetheless, the involvement of lncRNA in PRC2-mediated somatic cell plasticity and response to environment is not well investigated. Based on the evidence that

PRC2-EZH1α/β pathway plays a role in the adaptation to atrophic conditions in muscle cells 43, a study performed in our lab exploited this in vivo system to identify bona fide functional RNAs bound to PRC2-EZH1 (Nadine El Said, PhD Thesis,

KAUST 2019).

As described previously, EZH1β interacts with EED in the cytosol, which prevents the nuclear translocation of EED. Moreover, atrophic stress conditions direct EED protein from the cytosol to the nucleus by inducing the degradation of EZH1β. The 32 investigation of lnRNAs that are possibly involved in that process revealed that

MALAT1 was among the top lncRNAs which interacted with PRC2 in C2C12 muscle cells. RNA immunoprecipitation (RIP) showed that EED protein was highly associated with MALAT1 in C2C12 under oxidative stress conditions (Fig. 2a). Upon oxidative stress, the levels of H3K27me3 increase, and this increase is prevented in the absence of MALAT1 as shown by immunofluorescence (IF) (Fig. 2b). Moreover,

IF coupled with RNA fluorescent in situ hybridization (FISH) also confirmed the co- localization of MALAT1 and EED in the nucleus upon oxidative stress. Interestingly, the knockdown of MALAT1 diminished EED’s nuclear localization, making it more diffused into the cytosol. Furthermore, the induction of oxidative stress increased

MALAT1 levels in C2C12 cells, reinforcing the involvement of MALAT1 in muscle atrophy (Fig. 2c).

In addition to the compartmentalization and association of PRC2 components,

MALAT1 also affected the enzymatic activity of PRC2. The enzymatic activity was investigated by observing the level of H3K27me3 marker. It was previously shown that oxidative stress increases global H3K27me3 43. However, MALAT1 knockdown significantly reduced the level of global H3K27me3 in C2C12. 33

Figure 2 Association of MALAT1 with PRC2-EZH1 in atrophic skeletal muscle cells. a) C2C12 cells were treated with H2O2 to induce atrophy and native RIP followed by real-time quantitative PCR (nRIP-qPCR) and and formaldehyde cross-linking RIP followed by qPRC (fRIP-qPCR) were performed to map RNA-protein interaction. EED was immunoprecipitated together with its associated RNA molecules and the fold enrichment of MALAT1 over mock was measured upon oxidative stress (pink) compared to normal C2C12 (blue). The results are consistent without formaldehyde crosslinking (nRIP) and with crosslinking (fRIP). Statistical test was carried out as two-way ANOVA where *=P<0.05, **=P<0.01, ***=P<0.001, ****=P<0.0001 n=3, error bars representing mean with SD). b) IF of H3K27me3 (green) and DAPI (blue) in C2C12 cells in normal (control), oxidative stress (H2O2 stress) , and oxidative stress and stress with Malat1-knockdown (Stress + Malat1KD). c) FISH of Malat1 (red), Eed (green), and DAPI (blue), in wild type and MALAT1 knockdown (Malat1-KD) C2C12 in both normal (control) and oxidative stress (stress) conditions. (Nadine El Said, PhD Thesis, KAUST 2019) 34

1.4. Scope of the thesis: Objectives, contributions, and model

system

The current work focuses on understanding the nature of the involvement of the lncRNA MALAT1 in PRC2-EZH1-mediated plasticity in C2C12 cells under oxidative stress. In particular we wanted to investigate:

1. The role of MALAT1 in the assembly of PRC2-EZH1 core components in

atrophied muscle cells.

2. The direct effect of MALAT1 on the enzymatic activity of PRC2-EZH1.

To this aim, a stable mouse C2C12 cell line overexpressing EZH1α fused with FLAG and HA tags (EZH1α-FH) protein was generated. Cells were treated with H2O2 to induce atrophy-like conditions. The effect of MALAT1 on the association of PRC2 core components was then revealed by co-immunoprecipitation (CoIP) and mass spectrometry (MS) enrichment studies. Moreover, the expression and purification of

PRC2 core components was attempted in E. coli an SF9 insect cells in order to study the direct involvement MALAT1 in PRC2 assembly and function. The role of

MALAT1 in the HMTase activity was revealed through a colorimetric-detection enzyme-linked immunosorbent assay (ELISA). The in vitro reconstituted complex can be used in the future to study the effect of MALAT1 on the interaction dynamics of the proteins via microscale thermophoresis (MST).

35

CHAPTER 2: MATERIALS AND METHODS

2.1. Mammalian cell culture

Mouse C2C12 skeletal myoblasts (ATCC; CRL-1772) were grown in Dulbecco's modified Eagle's medium (DMEM) (4.5 g/l D-glucose and glutamax) (GIBCO) supplemented with 10% fetal bovine serum (FBS) (Sigma), 100 units/ml penicillin and 100 μg streptomycin (GIBCO), according to standard protocols. C2C12 myoblasts were differentiated at 90–95% confluence in DMEM supplemented with

2% horse serum (GIBCO) and penicillin-streptomycin.

2.2. Stable cell lines

To generate Ezh1α expression construct with HA and FLAG tags, Ezh1α was amplified from C2C12 myotube cDNA. After gel purification with QIAquick Gel

Extraction Kit (Qiagen), according to manufacturer’s instructions, the PCR products were digested and ligated into pOZ-C-FH vector that had been linearized with the same restriction enzymes. The resultant clone and a selectable marker for puromycin resistance were co-transfected into C2C12 cells. Transfected cells were grown in the presence of puromycin (Sigma), and individual clones were isolated and analyzed for

Ezh1α-FH expression by immunoprecipitation (IP) followed by western blotting, MS, and IF. Vector map is available in appendix A (Fig. 17)

2.3. Immunofluorescence

Stable C2C12 cell line constitutively expressing Ezh1α-FH were grown to myoblast or differentiated to myotubes on glass slides. Cells were incubated with 4% paraformaldehyde for 15 min at 23 ºC for cell fixation, permeabilized with 0.1% 36

Triton X-100 in PBS for 10 min, and blocked with 1% BSA solution. Next, cells were stained with primary antibodies: 1:200 HA (Roche; 3F10) and 1:500

MHC/MF-20 (DSHB; 051320) in a 1% BSA solution for 1 hour at room temperature. After that, they were washed three times with 0.1% PBS and stained with secondary antibodies: 1:500 conjugated Alexa Fluor 488 (Invitrogen, A-

11006) or Alexa Fluor 568 (Invitrogen, A-11031) in a 1% BSA solution at room temperature (1:500). Mounting medium containing DAPI (Sigma, F6057) was used to counterstain nuclei localization. Images were obtained with a Leica TCS SP5 confocal microscope with an HCX PL APO 63.0×/1.40-NA oil-immersion objective.

2.4. Cell treatments

To mimic muscle atrophy, C2C12 cells were treated at day-3 myotubes with 100

μM H2O2 (Sigma) for 24 h to induce oxidative stress. To knockdown MALAT1 in the stable C2C12 cell line overexpressing Ezh1α-FH, myotubes were transfected with 100 nM antisense locked nucleic acid (LNA) GapmeRs (300600, Qiagen), using Lipofectamine RNAiMAX reagent (Invitrogen) according to the manufacturer’s instructions.

2.5. RNA isolation and RT-qPCR

Total RNA was isolated using Tri-reagent (Sigma) according to the manufacturer's instructions. cDNA was produced starting at 800 ng of RNA using iScript

Advanced cDNA Synthesis Kit for RT-qPCR (Bio-Rad) and analyzed in SYBR

Green qPCR supermixes for real-time PCR (Bio-rad) using GAPDH for normalization. Real-time PCR analyses were carried out with a CFX96 Touch Deep 37

Well thermal cycler (Bio-rad). Primer sequences are available in appendix B (Table

1).

2.6. Co-immunoprecipitation

For the co-IP of stable C2C12 cell line overexpressing HA-tagged Ezh1α, myotubes were scraped in cold DPBS (GIBCO) containing 1X protease inhibitor, pelleted, suspended in cytosolic extraction buffer (10mM HEPES-KOH, pH 8.5,

400 mM NaCl, 0.1 mM EDTA, 0.1 mM EGTA, 1X proteinase inhibitor), kept at 4

ºC for 30 min with end-over-end rotation. Samples were sonicated to release nuclear part then debris was pelleted at 10000 rpm and 4 °C, and the supernatant was collected for nuclear IP. ANTI-FLAG M2 Agarose Affinity Gel (Sigma) was washed in TGEN buffer (20 mM Tris-HCl, pH 7.8,150 mM NaCl, 3 mM MgCl2,

0.1 mM EDTA, 10% glycerol, 0.01% NP-40). The Agarose beads were added to the IP samples and incubated overnight at 4 °C on the wheel. The immunocomplexes were then recovered with 200 ug/mL Flag peptide (sigma), and incubated overnight at 4 °C on the wheel. The eluted immunoprecipitants were loaded on 4-12% polyacrylamide gel (Invitrogen) and subjected to western blotting analysis.

2.7. LC-MS analysis and MS data analysis

The peptide mixture was measured on a Q Exactive HF mass spectrometer (Thermo

Fisher Scientific) coupled with an UltiMateTM 3000 UHPLC (Thermo Fisher

Scientific). Peptides were separated using an Acclaim PepMap100 C18 column (75 um I.D. X 25 cm, 3 μm particle sizes, 100 Å pore sizes) with a flow rate of 300 nl/min. A 75-minute gradient was established using mobile phase A (0.1% 38

FA) and mobile phase B (0.1% FA in 80% ACN): 5%-40% B for 55 min, 5-min ramping to 90% B, 90% B for 5 min, and 2% B for 10-minute column conditioning.

The sample was introduced into mass spectrometer through a Nanospray Flex

(Thermo Fisher Scientific) with an electrospray potential of 1.5 kV. The ion transfer tube temperature was set at 160°C. The Q Exactive was set to perform data acquisition in DDA mode. A full MS scan (350-1400 m/z range) was acquired in the

Orbitrap at a resolution of 60,000 (at 200 m/z) in a profile mode, a maximum ion accumulation time of 100 milliseconds and a target value of 3 x e6. Charge state screening for precursor ion was activated. The ten most intense ions above a 2e4 threshold and carrying multiple charges were selected for fragmentation using higher energy collision dissociation (HCD). The resolution was set as 15,000. Dynamic exclusion for HCD fragmentation was 20 seconds. Other setting for fragment ions included a maximum ion accumulation time of 100 milliseconds, a target value of

1xe5, a normalized collision energy at 28%, and isolation width of 1.8.

The MS RAW files from Q-Exactive HF were converted to .mgf files using Proteome discoverer (V1.4) and analyzed using Mascot (Version 2.4) against mouse database

(Uniprot Mus musculus database). The Mascot search results were further processed using Scaffold (Version 4.1, Proteomesoftware Inc., Portland, OR, USA) for validation of protein identification and quantitative assessment. For protein identification, it requires a minimal 99% possibility for protein and with at least one peptide having a possibility greater than 95% according to the PeptideProphet and

ProteinProphet. The label-free quantification of proteins and phosphorylation peptides were performed using Maxquant LFQ. 39

2.8. ChIP-seq

C2C12 cells were crosslinked with 1% formaldehyde (F8775, Sigma) for 10 min.

Cells were then lysed using lysis buffer 1 LB1(50 mM Hepes KOH pH 7.5, 10 mM

NaCl, 1mM EDTA, 10% glycerol, 0.5% NP–40, 0.25% Triton X-100), nuclei were pelleted, collected, washed using lysis buffer 2 LB2 (10 mM Tris HCl pH 8, 200 mM

NaCl, 1mM EDTA, 0.5 mM EGTA), then lysed using lysis buffer 3 LB3 (10 mM Tris

HCl pH 8; 100 mM NaCl, 1mM EDTA; 0.5 mM EGTA; 0.1% Na-Deoxycholate,

0.5% N-laurylsarcosine). For chromatin shearing, lysates were sonicated (BRANSON

A250, with tapered microtip 3,2 mm, 4 cycle of 1.5 min at 20 % of amplitude, 50% of

Duty Cycle). For each IP reaction 5 μg of antibodies were added to 100 μg of chromatin DNA equivalents. Protein-antibody immunocomplexes were magnetically recovered (Dynabeads protein G, Invitrogen). Washes were carried out for the beads using low salt wash buffer (0.1% SDS; 2mM EDTA, 1% Triton X-100, 20 mM Tris

HCl pH8, 150 mM NaCl), then high salt wash buffer (0.1% SDS, 2mM EDTA, 1%

Triton X-100, 20 mM Tris HCl pH 8, 500 mM NaCl). Beads were then resuspended in elution buffer (50 mM Tris HCl pH 8, 10 mM EDTA, 1% SDS). For de- crosslinking, RNase A (0.2 mg/ml) and Proteinase K (0.2 mg/ml) were added, and

DNA was then purified for qPCR analysis. ChIP primers are available in appendix B

(Table 2), and qPCR primers are available in appendix B (Table 1).

2.9. Expression plasmids

For the expression of PRC2 core components in E. coli, Ezh1α, Ezh1β, Eed3, Eed4, and SUZ12 were amplified from C2C12 myotube cDNA and cloned into a pJET1.2

(Thermo Scientific) vector. Next, Ezh1α, Ezh1β, Eed3, and Eed4 were cut from pJET1.2 vector by restriction enzymes and purified by gel extraction using 40

QIAquick Gel Extraction Kit (Qiagen), according to manufacturer’s instructions.

Next, Ezh1α, Ezh1β, Eed3, and Eed4 were cloned into pGEX-5X-1, and Suz12 was cloned into a PGEX-P6-1 vector.

For the expression of PRC2 core components in SF9 cells, Ezh1α, Ezh1β, Eed3, and Eed4 were subcloned from PGEX-5X-1 vector by gel extraction using

QIAquick Gel Extraction Kit (Qiagen), according to manufacturer’s instructions and ligated into pFsatBac-HT C baculovirus expression vector.

All recombinant plasmids were confirmed by diagnostic digestion followed by

Sanger sequencing using vector-specific forward and reverse sequencing primers.

All vector maps are available in appendix A (Fig. 18, 19, 20) and primer sequences are available in appendix B (Table 3, 4).

2.10. RNA in vitro transcription

Mouse Malat1 full length plasmid was obtained from Prof. Shinichi Nakagawa

(RNA Biology Laboratory, RIKEN Advanced Research Institut), and linearized using XhoI restriction enzyme, then transcribed using MAXIscript T7

Transcription Kit (Invitrogen), according to manufacturer’s instructions. In vitro transcribed RNA was then treated with DNase and purified using Tri-reagent

(Sigma) according to the manufacturer's instructions.

2.11. Protein expression and purification in bacteria

BL21(D3) E. coli were cultured to OD 0.6-0.7. Expression of EZH1α-GST,

EZH1β-GST, EED3-GST and SUZ12-GST tagged proteins was induced using 0.5 41 mM IPTG. Expression and lysis were performed in three different induction and lysis conditions for optimization.

Condition I: The expression was induced at 37 ºC for 3 hours and cells were lysed in lysis buffer (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 1 mM DTT, 1mg/mL lysozyme, 1X proteinase inhibitor). Condition II: The expression was induced at 30

ºC for 3 hours and cells were lysed in lysis buffer (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 5 mM DTT, 1mg/mL lysozyme, 10% glycerol, 0.3% triton, 1X proteinase inhibitor). Condition III: The expression was induced at 16 ºC for overnight and cells were lysed in lysis buffer (20 mM Tris-HCl, pH 7.5, 150 mM

NaCl, 1% triton, 1X proteinase inhibitor).

For all three conditions, lysates were then sonicated for 5 min (Soniprep 150 plus,

MSE) (1 min on/ 1 min off), then centrifuged at 10000 rpm and 4 °C, and different fractions were collected. The supernatant was collected as the soluble part and the remaining pellets were dissolved in PBS (GIBCO). The total (input) was collected before centrifugation. All fractions were checked by SDS-PAGE for the presence of the proteins.

2.12. Protein expression in insect cells

Insect SF9 Spodoptera frugiperda cells were grown in suspension supplement-free

ESF 921 medium (Expression Systems), according to standard protocols. For expression of PRC2 individual components, the full-length sequences of mouse

Ezh1α, Ezh1β, Eed3, Eed4, and Suz12 were cloned into pFastBac-HT C baculoviral expression vectors. Recombinant plasmids were transformed into DH10Bac E. coli and X- 5-Bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal, Sigma) was used 42 to select positive colonies (blue/white selection). Bacmids were extracted and transformed into SF9, and three generations of baculovirus were collected: P1, P2, and P3. For protein expression, SF9 cells were infected with P3 virus of EZH1α,

EZH1β, EED 3, EED 4, SUZ12, and SUMO proteins, separately. The proteins were expressed for 80 h, and cell pellets were collected and lysed (25 mM HEPES, pH 7.9,

250 mM NaCl, 5% glycerol, 1% triton, 100 mM PMSF, 1 mM DTT, and protease inhibitor cocktail) then sonicated (Soniprep 150 plus, MSE) (1 min on/3 min off, twice). Lysates were centrifuged at 10000 rpm and 4 °C, and different fractions were collected. The supernatant was collected as the soluble part and the total

(input) was collected before centrifugation.

To increase the stability of the proteins, SF-9 cells were co-infected with EZH1α,

EED3, and SUZ12. The proteins were expressed for 80 h and lysed following to the previous procedure.

2.13. Western blotting

Proteins were loaded on 4-12% polyacrylamide gels, resolved by SDS-PAGE, and transferred to a nitrocellulose membrane (wet transfer). The membranes were blocked and incubated for 1 h with primary antibodies. They were then treated with the appropriate secondary antibody coupled to HRP and revealed through Pierce

Enhanced Chemiluminescence Western Blotting Substrate (Thermo Scientific). A list of all antibodies is available in appendix C (Table 5).

2.14. Histone methyltransferase activity

To measure HMTase activity, a commercial colorimetric kit was used (Epigentek), according to the manufacturer’s instructions. Commercial PRC2 core components: 43

EZH1, EED, and SUZ12 were added in a 1:1:1 volume ratio together with various concentrations of in vitro transcribed MALAT1 lncRNA. The histone trimethylation levels were identified by reading the absorbance of the assay (optical density, OD).

2.15. Statistical analysis

For all bar graphs, values are expressed as mean ± SD with n=3 and analyzed using two-tailed, unpaired Student’s t-test. For statistical tests, the 0.05 level of confidence was accepted for statistical significance.

44

CHAPTER 3: RESULTS

3.1. Generation of stable C2C12 cell line expressing tagged EZH1α

Our previous results indicate that MALAT1 lncRNA affects the levels of H3K27me3 mediated by PRC2-EZH1 in C2C12 cells under oxidative stress (Fig. 2b). The proper association of the three core components of PRC2-EZH1, namely EZH1α, SUZ12, and EED is necessary for its activity. To understand the potential molecular mechanism of how MALAT1 modulates the global H3K27me3 level through PRC2-

EZH1α complex, we wanted to check whether MALAT1 is required for the proper assembly of PRC2-EZH1α. It was already shown that MALAT1 is required for the nuclear localization of EED during muscle atrophy (Fig. 2c). Therefore, we verified the effect of MALAT1 on the association of EED and SUZ12 with the catalytic subunit, EZH1α, stable C2C12 cell line constitutively expressing EZH1α-FH was generated and used for Co-IP experiments.

To generate the stable cell lines, C2C12 cells were transfected with a recombinant puromycin-plasmid containing the EZH1α-FH, and selected in the presence of puromycin. Positive cells were analyzed by IF which shows that the fusion protein

EZH1α-FH is mainly located in nucleus, in consistent with the endogenous localization of EZH1α (Fig. 3a). Next, IP was performed using both FLAG- and HA- conjugated agarose beads sequentially. Silver staining assay indicates that many specifically associated protein partners could be pulled down (Fig 3b).

Immunoprecipitated protein partners were analyzed by MS. MS Mascot analysis shows a high Mascot score for EZH1α in the eluted samples, as well as for SUZ12, which is known to be physically associated with EZH1α in the nucleus (Fig. 3c). 45

Together, these results confirm the successful generation of the EZH1α-FH stable

C2C12 cell line.

Figure 3 The generation of EZH1α-FH tagged stable C2C12 cell line. (a) IF EZH1α-FH (green), myosin heavy chain (red), and DAPI (blue) confirms the presence of the tagged protein in both myoblast (left) and myotube (right). (b) Silver staining of immunoprecipitated EZH1α-FLAG-HA (1α-FH) using Flag or HA for the elution confirms the presence of EZH1α. Input samples represents non-eluted cells. (c) MS data showing the mascot score and number of unique peptides corresponding a list of proteins that associate with EZH1α in the cell in three replicates of immunoprecipitated EZH1α-FH samples. The high mascot number is indicative of high presence of EZH1α.

46

3.2. Identification of suitable antisense oligos for MALAT1

loss of function experiments

To induce oxidative stress response mimicking muscle atrophy, C2C12 myotubes overexpressing EZH1α-FH were treated with 100 µM H2O2 for 24 h. This concentration was chosen to mimic muscle atrophy, without affecting the viability of the cells 43. Additionally, to knockdown MALAT1, myotubes were transfected with

MALAT1 antisense locked nucleic acid (LNA) GapmeRs. The efficiency of three different antisense oligos (ASO) and the effective concentration were tested by qPCR using MALAT1 forward and reverse primers. Out of the three tested oligos, oligos #1 and #3 (MALAT #1 and MALAT #3) produced a significant knockdown effect (Fig. 4). Because MALAT #1 had the most significant effect, it was used for further experiments. 47

Figure 4 Quantification of the expression of MALAT1 in C2C12 cells transfected with different MALAT1 antisense oligos. Cells were transfected with one of three different MALAT1 antisense oligos: MALAT #1, MALAT1 #2, MALAT1 #3 using two concentrations: 100 nM (a) and 50 nM (b). The expression of MALAT1 was quantified by qPCR and compared to the expression of MALAT1 in cells transfected with Scramble RNA. The shown percentages represent the expression percentage in reference to the scramble. Values are expressed as mean ± SD (n=3). **P<0.01, ***P<0.001, ****P<0.0001 with respect to the scramble. The absence of the asterisks indicates nonsignificant difference.

3.3. MALAT1 is required for the proper association of PRC2

core components in the nucleus

After the successful establishment of an efficient system with MALAT1 knockdown in EZH1α-FH stable cell lines under oxidative stress condition, we were interested in 48 testing the effect of MALAT1 on the complex assembly. We tested the interaction between EZH1α, SUZ12 and EED in the nucleus under oxidative stress condition with or without MALAT1. EZH1α-FH stable cell line was treated with H2O2 with or without MALAT1 #1 ASO. Then, EZH1α and its interacting protein partners were immunoprecipitated using a similar tandem affinity purification (TAP) strategy.

After a two-step elution, HA eluted samples were collected and loaded on SDS-PAGE together with the input samples, which were used as a negative control. Western blotting analysis indicates that EED interacts with EZH1α only in the presence on

MALAT1, and the knockdown of MALAT1 disrupted the interaction (Fig. 5a). The bands located above the 50 kDa mark correspond to another isoform of EED, EED 4, which has a molecular weight (MW) of 55.5 kDa. The decrease in the band intensity in the absence of MALAT1 suggests that MALAT1 might be involved in the association of this isoform as well.

Furthermore, the immunoprecipitated EZH1α was analyzed by MS to determine the interacting partners. Proteins which are known to interact with EZH1α such as

SUZ12, EED, AEBP2, PHF1 and RBBP4/7 were detected by MS analysis, confirming the quality of the pulldown and the accuracy of the MS results

(Supplementary file 1). MS data illustrate that EZH1α has a physical association with

EED in the nucleus during oxidative stress (Fig. 5b), which agrees with the previously reported model 43. Interestingly, this association is significantly decreased in the absence of MALAT1 (Fig. 5b).

Together, these results indicate that MALAT1 is required for the proper assembly of

EED with PRC2 core components in the nucleus. Additionally, they provide a further 49 insight on the shuttling of EED form the nucleus back to the cytosol in the absence of

MALAT1.

Figure 5 Co-IP of physically associated PRC2 proteins in the presence and absence of Malat1. (a) Western blots (top) of immunoprecipitated EZH1α-FH (EZH1α IP) from C2C12 cells transfected with scramble RNA (-) and MALAT1-KD C2C12 (+) both treated with H2O2, using anti-EED antibodies and anti-HA for control. Input samples represent non-eluted cells. Ponceau S staining was used as loading control (bottom) (b) The relative intensities of EED and SUZ12 enrichment in immunoprecipitated EZH1α-FH from C2C12 cells transfected with scramble RNA (black) and MALAT1-KD C2C12 (gray) both treated with H2O2. Values are expressed as mean ± SD (n=3) and P<0.05 is considered for significance.

.

50

3.4. MALAT1 is necessary for the binding of EZH1α and

EED to muscle-specific promoters under oxidative stress

Additionally, it is interesting to test the effect of MALAT1 knockdown on the association of PRC2-EZH1 on known genomic loci under oxidative stress. Thus,

ChIP followed by qPCR (ChIP-qPCR) was performed to investigate the binding profiles of EZH1α and EED on the promoter regions of muscle-specific genes including myosin heavy chain 3 (Myh3), myosin heavy chain (Myh8), and MyoG. As illustrated, the binding of EZH1α and EED to the promoter regions was increased during oxidative stress, and MALAT1 knockdown significantly reduced the binding.

Consistently, the levels of H3K27me3 on muscle gene promoters followed a similar trend (Fig. 6). These results indicate that MALAT1 in crucial for the binding of

EZH1α and EED to their targets, which is needed for imprinting the H3K27me3 on histones.

Figure 6 The effect of MALAT1 on the binding of PRC2 proteins to muscle-specific promoters under oxidative stress. ChIP-qPCR of (a) EED, (b) EZH1α, and (c) H3K27me3 on the promoter regions of muscle- specific genes: myogenin (MyoG), neurogenin (NeuroG), myosin heavy chain 3 (Myh3), myosin heavy chain (Myh8), and desmin in control, H2O2 stress (Stress), and H2O2 stress + MALAT1 knockdown (Stress + Malat1- KD). Values are expressed as mean ± SD (n=3). **P<0.01, ***P<0.001, ****P<0.0001 with respect to the control. The absence of the asterisks indicates nonsignificant difference. (Adapted from Nadine El Said, PhD Thesis, KAUST 2019) 51

3.5. Expression of PRC2 core components in E. coli

MALAT1 knockdown reduced the global H3K27me3 levels in atrophied C2C12. We have already shown that MALAT1 was required for the assembly and/or stability of

PRC2-EZH1α in the nucleus under oxidative stress. The effect of MALAT1 on

PRC2-EZH1 can be achieved through a direct physical interaction with the complex or through other mediators in the cell. Another important point to consider is the effect of MALAT1 on the enzymatic activity of PRC2-EZH1α. To investigate these points, we were interested in testing the effect of MALAT1 on an in vitro reconstituted PRC2-EZH1. Three core components of PRC2-EZH1 are required for its full activity: EZH1, EED, and SUZ12. Hence, the expression of these proteins was attempted. This would allow us to determine whether MALAT1 affects PRC2-EZH1α directly or indirectly.

To find the optimum expression conditions, BL21 E. coli were transformed with

GST-tagged Ezh1α, Ezh1β, Eed3, Eed4, or Suz12 (Fig. 7). Positive colonies were cultured, and induced with either 0.5 mM or 1 mM IPTG. The induction was carried out at three different temperatures: 37, 30, or 20 ºC. Ezh1α, Ezh1β, Eed3, and Suz12 were expressed at all three temperatures (Fig. 8). Both IPTG concentrations seemed to be equally effective, therefore, 0.5 mM concentration was chosen for further optimization steps. However, Eed 4 was not expressed in BL21 E. coli in any of the conditions (Fig. 8d). Since EED 3 isoform was shown to be involved in the plastic response to atrophy in C2C12 cells 43, it was selected for further studies.

To validate that the observed bands correspond to our GST-tagged proteins of interest, western blotting was performed using anti-GST antibodies (Fig. 8f). In agreement with the SDS-PAGE staining results, western blotting results indicate the 52 presence of GST-tagged proteins in the total and nuclear fractions of the bacterial lysates. These results show that the expression of EZH1α, EZH1β, EED 3, and

SUZ12 can be induced in E. coli. 53

Figure 7 Cloning of PRC2 core components into E. coli expression plasmids. (a) Schematic representation of the expression plasmid constructs. Genes were cut from the backbone vector (1) and ligated into the destination vector (2). (b) Diagnostic digestion of the constructed plasmids to confirm correct ligation in pJET1.2 vector (1). Restriction enzymes used and expected fragment sizes for each digestion are represented in the table. (c) Diagnostic digestion of the constructed plasmids to confirm correct ligation in pGEX-5x-1vector (2). Restriction enzymes used and expected fragment sizes for each digestion are represented in the table.1% gel electrophoresis was run using 1 kb plus ladder and only constructs that showed fragments of expected sized were picked and confirmed by Sanger sequencing (not shown). Vector maps are shown in appendix A. 54

Figure 8 Expression trials of PRC2 components in bacteria at various conditions. The expression of EZH1α (a), EZH1β (b), EED441 (c), EED500 (d), and Suz12 (e) was carried out at 37ºC, 30ºC, or 20ºC using 0.5mM or 1mM IPTG to induce the expression. (-) represents un-induced samples, and (+) represents samples induced with 0.5mM IPTG. Proteins were loaded on 4-12% gel, and Spectra Multicolor Broad Range Protein Ladder was used as marker. (f) Protein blots using anti-GST antibodies to validate the expression of the protein in the different fractions.

3.6. PRC2 proteins are insoluble in E. coli except EED

To enable the purification of the proteins, they must be present in the soluble fraction of the lysate. In an attempt to purify the proteins, their expression levels were checked 55 in the total, soluble, and pellet fractions. Different fractions were visualized on SDS-

PAGE gel to determine the solubility of the proteins (Fig. 9). When the expression was induced at 37 ºC, only EED was present in all fractions, including the soluble one, which would enable its purification (Fig. 9a). However, none of the other proteins were present in the soluble fraction in a considerable quantity (Fig. 9a).

It is reported that lowering the induction temperature could improve the solubility of the proteins 85. Therefore, in an aim to improve their solubility, the proteins were expressed at 30 ºC and the lysis buffer composition was altered by the addition of

10% glycerol, 0.3% triton. The temperature was decreased further to 16 ºC and the triton concentration in the lysis buffer was increased to 1%. Nevertheless, neither temperature improved the solubility of the proteins. (Fig. 9b, c). These results show that although the expression of PRC2 core components in E. coli was feasible.

However, with the exception of EED, these proteins were not soluble.

56

Figure 9 Protein presence in different fraction of the bacterial lysates. Total lysates were collected (total) then centrifuged and the supernatant (soluble) and the debris (pellet) were isolated then loaded to check the presence of the indicated protein. Proteins were expressed and lysed using three different methods (a), (b), and (c) varying in temperature and lysis buffer components as described in the methods. (-) represents un-induced samples, and (+) represents samples induced with 0.5 mM IPTG. Proteins were loaded on 4-12% gel, and Spectra Multicolor Broad Range Protein Ladder (a, b) and BioRad Precision Plus Protein Kaleidoscope Prestained Protein ladder (c) were used as markers.

57

3.7. Expression of PRC2 core components in insect cells

In an aim to obtain soluble PRC2 proteins, a different expression system was chosen. Baculovirus expression system was used to express Ezh1α, Ezh1β, Eed 3,

Eed 4, or Suz12 that had been cloned in pFastBac-HT expression vector (Fig. 10).

Recombinant vectors were used to infect SF9 insect cell (Fig. 11). All proteins were successfully expressed in SF9 cells; however, only EED proteins were soluble (Fig.

12).

Figure 10 Cloning of PRC2 core components into insect expression plasmids. (a) Schematic representation of the expression plasmid constructs. Genes were cut from the backbone vector (1) and ligated into the destination vector (2). (b) Diagnostic digestion of the constructed plasmids to confirm correct ligation in pFastBac-HT C vector (2). Restriction enzymes used and expected fragment sizes for each digestion are represented in the table.1% gel electrophoresis was run using 1 kb plus ladder and only constructs that showed fragments of expected sized were picked and confirmed by Sanger sequencing (not shown). Vector maps are shown in appendix A. 58

a) Transformation of b) Insertion of gene recombinant vector into genomic DNA by into DH10Bac E. coli helper plasmid

c) Bacmid Isolation

d) Transfection e) Obtaining of bacmid to Sf9 baculovirus P1 insect cells

f) P1 baculovirus infection to Sf9 insect cells

h) P2 g) Obtaining baculovirus i) Obtaining baculovirus P2 infection to Sf9 baculovirus P3 insect cells

j) Infection of P3 baculovirus to Sf9

k) Collection of infected cells

Figure 11 Schematic representation of baculovirus expression main steps. pFastBac vector is transformed into DH10Bac E. coli and the bacmid is isolated and transfected into SF9 insect cells to produce multiple generations of viruses that are used to infect SF9 cells to produce the proteins. 59

Figure 12 Expression of PRC2 core components in SF9 cells. PRC2 Proteins were expressed in SF9 cells and SUMO was expressed as a positive control. Total lysates were collected (T) or centrifuged and the supernatant was collected as the soluble part (S). The fractions were loaded on 4-12% gels to check the presence of the indicated protein, and BioRad Precision Plus Protein Kaleidoscope Prestained Protein ladder was used as marker.

3.8. Simultaneous expression of the three core components of

PRC2-EZH1 is necessary for their solubility

To increase the stability of the proteins and, in turn, their solubility, we co-infected

SF9 cell with EZH1α, Eed, and Suz12 simultaneously. The co-infection was done in

1:1:1 volume ratio of virus in varying concentrations (Fig. 13a). After co-infection, the proteins were present in the soluble fraction for all three concentrations used for infection (Fig. 13a). Because EZH1α and SUZ12 are of similar MWs (92.5, 81.4 kDa, respectively), we ran a western blot using anti-EZH1α and anti-SUZ12 specific antibodies to validate the expression of both proteins (Fig. 13b). Both EZH1α and

SUZ12 were present and soluble. These results indicate that the presence of the three proteins together is necessary for their stability and solubility 86,87.

60

Figure 13 Co-infection of PRC2 core components in insect cells. (a) SF9 cells were co-infected with variable concentrations of EZH1α, EED, and SUZ12 in 1:1:1 volume ratio. Total lysates were collected (T) or centrifuged and the supernatant was collected as the soluble part (S). The fractions were loaded on 4-12% gel to check the presence of the proteins, and BioRad Precision Plus Protein Kaleidoscope Prestained Protein ladder was used as marker. (b) Western blotting of the co-infected cells using protein-specific antibodies to validate the presence of proteins of a similar MW.

3.9. In vitro transcription of MALAT1

In order to study the biochemical interaction of MALAT1 with PRC2-EZH1 core components, full-length in vitro transcription MALAT1 plasmid (kindly provided by

Prof. Shinichi Nakagawa) (Fig. 14a) was amplified in E. coli (Fig. 14b). The plasmid was then linearized and transcribed in vitro. Analysis of the transcript by gel electrophoresis shows the successful transcription of MALAT1 (Fig. 14c).

61

Figure 14 In vitro transcription of MALAT1. (a) MALAT1 full length in vitro transcription plasmid map with restriction sites. (b) Diagnostic digestion of cloned MALAT1 plasmid (10 kb) shows a band of 7 kb corresponding to MALAT1 and another band of 3 kb corresponding to remaining fragment. (c) Gel electrophoresis of in vitro transcribed MALAT1. Samples were run on 1% agarose gel using 1 kb plus marker.

3.10. MALAT1 positively affects the enzymatic activity of

PRC2-EZH1 in vitro

After the successful transcription of MALAT1, in vitro transcribed MALAT1 was used to determine its effect of PRC2 enzymatic activity in vitro. At this stage, commercially-available PRC2 core components were used. EZH1α, EED, and SUZ12 62 proteins were added to H3K27 substrates in a 1:1:1 volume ratio with varying concentrations of MALAT1 (0, 0.5, 1, 2, or 5 ng of MALAT1 to 100 ng of individual component). Colorimetric detection was used to detect the relative color change determined by absorbance readings (OD). It was observed that MALAT1 resulted in a general increase in the H3K27me3 as seen when 1, 2, and 5 ng of MALAT1 were added. However, only 1 ng of MALAT1 had a significant effect on the H3K27me3.

These results suggest that MALAT1 increases the enzymatic activity of PRC2-EZH1, and this effect is concentration-dependent. These results provide preliminary insight; however, since the expression of PRC2 core components was successful, they can be used for further validation. Contrary to the reconstitution of the complex in vitro, the formation of the complex in the host insect cells preserves its integrity.

Figure 15 The level of H3K27me3 exerted by PRC2 core components with different concentrations of MALAT1. EZH1α, EED, and SUZ12 were added in a 1:1:1 volume ratio with varying amounts of MALAT1 and the relative levels of H3K27me3 were calorimetrically detected by absorbance measurement (OD). Values are expressed as mean ± SD (n=8). **P<0.01, ***P<0.001, ****P<0.0001 with respect to the scramble. The absence of the asterisks indicates nonsignificant difference. (Experiment performed by Nadine El Said) 63

CHAPTER 4: DISCUSSION

MALAT1 lncRNA is involved in the PRC2-EZH1-mediated response of muscle cells to oxidative stress. The presence of MALAT1 was shown to be important for the translocation of EED to the nucleus and, in turn, the HMTase activity of the complex.

Here we illustrate that MALAT1 is particularly necessary for the proper association of

EED with the other components in the nucleus. In addition, we confirm that

MALAT1 directly affects the levels of H3K27me3.

Our results show MALAT1 is essential for the association between EED and EZH1α in atrophied muscle cells (Fig. 5). This suggests that the previously observed effects of MALAT1 on H3K27me3 levels (Fig. 2b) occurs through disturbing the association between EED and the other proteins. This notion is supported by the decrease in the binding of both EZH1α and EED to the promoters of their target genes, and the consequent decrease of H3K27me3 in the absence of MALAT1 (Fig. 6). Consistently,

MALAT1 enhanced the enzymatic activity of PRC2-EZH1 in vitro (Fig. 15). Because

EZH1 and SUZ12 form a non-canonical PRC2 complex in the nucleus 30, MALAT1 had insignificant effect of on their association (Fig. 5).

A previous study showed that MALAT1 physically interacts with SUZ12 in bladder cancer 78. MALAT1 was also highly enriched with SUZ12 and EZH2 antibodies in T cell lymphoma cells 81. Additionally, our previous RIP data (Fig. 2a) indicate that

MALAT1 was highly enriched with EED antibodies during atrophic stress.

Collectively, these studies indicate that MALAT1 has the ability to be physically associated with PRC2 proteins. 64

Many studies have linked RNA moieties to PRC2 recruitment and activity. Yet, the interplay between RNA and PRC2, and the specificity of their binding are complex and controversial. Most of the lncRNAs which have been involved in PRC2 gene repression activity, such as Xist or HOTAIR, are by themselves repressors 88,89.

However, evidence suggests that at normal conditions, MALAT1 is enriched on active genes 83. Similarly, PRC2-EZH1 forms a noncanonical complex at normal conditions, which also is associated with active genes 39. Interestingly, our results suggest that upon oxidative stress, those two components work together to infer gene repression (Fig. 15, 16)

The in vitro expression of PRC2 enables the investigation of the direct effects of

MALAT1 on the complex. The ability of BL21 E. coli to express PRC2 proteins is attributed to their deficiency in the Lon protease, which normally degrades most of the foreign proteins 85. Nevertheless, the insolubility of the expressed proteins could be a result of them being foreign. The recombinant proteins are expressed in the microenvironment of E. coli, which might lack specific post-translational modifications or other parameters affecting folding mechanisms 85. Consequently, the proteins will be unstable and will cluster in aggregates known as inclusion bodies 85, rendering the proteins insoluble.

As illustrated, when the proteins were expressed at a higher temperature, their amount increased (Fig. 8a, b, c, e). Proteins at high levels tend to form aggregates 85 and, therefore, decreasing the expression temperature is suggested to enhance protein’s solubility. However, in our case, decreasing the temperature did not have an effect on the solubility of the proteins (Fig. 9), which might be a result of either the foreignness of the proteins or their instability when expressed individually. 65

The PRC2 is a native complex in insect cells, which might represent a better candidate for protein expression. A study has compared the expressions of human kinases in SF9 insect cells and E. coli. Out of the 62 proteins tested, 61 were successfully expressed and soluble in SF9 cells. On the other hand, only 87% of the proteins were expressed in E. coli, and only half of the those expressed were soluble

90. The baculovirus expression system in insect cells represents a versatile protein expression platform that has the ability to introduce similar post-translational modifications on the proteins as in mammalian cells 91. It also enables high levels of expression 86, allowing for mass production for downstream experiments and applications that require high quantities of the proteins.

Generally, the expression of proteins in isolation, followed by in vitro reconstitution is advantageous, as it enables the study of the role of each protein in the complex. For example, the protein can be replaced with a non-functional mutant to investigate its role. In our case, the expression of each of the PRC2 proteins separately is preferred, as it would allow the study of the complex association dynamics. For example, the effect of MALAT1 on PRC2 association/dissociation dynamics can be studied by

MST. However, in most cases, individual subunits cannot be expressed without their protein partners 86. Because PRC2 proteins are present in a complex, the co-presence of the three proteins was necessary for their solubility.

There are mainly two approaches to express multiple proteins using baculovirus expression system. One approach is known as co-infection, which involves the expression of multiple monocistronic baculoviruses, each carrying a single gene. The second approach is the co-expression, which uses a single polycistronic baculovirus that carries multiple genes and expresses them simultaneously. If a large number of proteins are to be expressed, a combination of both methods can be used 91. 66

The co-infection of several viruses enables the control of the multiplicity of infection

(MOI), which is defined as the ratio of virus to infected cells 86. As a result, the protein expression ratios can be adjusted. On the other hand, the use of co-expression is preferred for expressing large number of proteins as it guaranties the homogeneity of protein expression in the cells 86. Additionally, cloning all of the protein encoding genes into a singular vector will result in a near-stoichiometric expression, without the need of optimizing the ratios of the viruses 87. Nevertheless, this method requires the design and cloning of a very large plasmid with multiple genes, a process that can be cumbersome.

The components of PRC-EZH2 have been successfully expressed and purified previously for the purpose of structural analysis 87,92. Such analyses require the presence of the proteins in a complex at the correct stoichiometric ratios and, therefore, co-expression represented an ideal option. When co-infected, all of the

PRC2 proteins were soluble (Fig. 13). The fact that EED was soluble even when expressed individually (Fig. 12) might indicate its stability as an individual protein.

This agrees with the reported models mentioned previously that suggest that EZH1α and SUZ12 form a non-canonical PRC2 in the nucleus in the absence of EED, which only associates with EZH1α and SUZ12 during stress 30,43. After the successful expression of soluble PRC2 proteins, the next step would be to purify those His- tagged proteins for downstream experiments. To obtain individual proteins, the proteins of interest can be fused with different tags and purified separately. 67

Figure 16 Schematic suggested model of the involvement of MALAT1 in the adaptive response of skeletal muscle cells to oxidative stress. At normal conditions, MALAT1 is enriched on active genes, and PRC2-EHZ1 forms a noncanonical complex in the nucleus that binds to active genes. Upon stress, MALAT1 facilitates the full assembly of PRC2-EZH1 core components in the nucleus, enhancing gene repression.

68

CHAPTER 5: CONCLUSION lncRNAs exhibit a high level of functional versatility. The involvement of lncRNA in stress response had been widely established. In many cases, lncRNAs are associated with PRC2 recruitment and regulation. With the involvement of MALAT1 lncRNA,

PRC-EZH1α/β pathway mediates the response of atrophied muscle cells to oxidative stress. MALAT1 was found to be essential for the proper association of EED with

EZH1α in the nucleus during muscle atrophy, and for the HMTase activity of the complex. We suggest that MALAT1 facilitates the nuclear translocation of EED, and acts as a physical tethering agent between the components of the complex. Further structural and binding studies are needed to validate this model.

Our PRC2 expression optimization trials revealed that the co-presence of PRC2 components is crucial for their stability. After the successful expression and purification of PRC2 components, the association/dissociation dynamics can be quantitatively measured by MST in the presence and absence of MALAT1. This would allow a quantitative understanding of the role of MALAT1 in PRC2 assembly and/or disassembly. The physical binding between MALAT1 and any of the PRC2 proteins can also be investigated by electrophoretic mobility shift assay (EMSA).

Additionally, cryogenic electron microscopy (cryo-EM) can be used to decipher the structure and interplay of PRC2-EZH1 and MALAT1.

69

APPENDIX A: Vector maps

1. pOZ-FH-C-Puro vector

Figure 17 pOZ-FH-C-Puro vector map. (Adapted from Addgene).

70

2. pJET1.2 vector

Figure 18 pJET1.2 vector map. (Adapted from SnapGene). 71

3. pGEX-5X-1 vector

Figure 19 pGEX-5X-1 vector map. (Adapted from SnapGene). 72

4. pFastBac HT C vector

Figure 20 pFastBac HT C vector map. (Adapted from SnapGene).

73

APPENDIX B: Primer sequences

1. qPCR primers

Table 1 Sequences of primers used for qPCR.

Primer Sequence

MALAT1 forward 5’- GCC TTT TGT CAC CTC ACT -3’

MALAT1 reverse 5’- CAA ACT CAC TGC AAG GTC TC -3’

Myogenin forward 5’- GCA GCG CCA TCC AGT TTG -3’

Myogenin reverse 5’- GCA ACA GAC ATA TCA CCG -3’

Myh3 forward 5’- GCA GAT TCA GAA ACT GGA GAC -3’

Myh3 reverse 5’- CCT TAA CAC GCC GCT CAT AC -3’

Myh8 forward 5’- GAA CTT GAA GGA GAG GTC GA -3’

Myh8 reverse 5’- GAG CAC ATT CTT GCG GTC TT -3’

2. ChIP

Table 2 Sequences of primers used for ChIP.

Primer Sequence

Myogenin forward 5’- TGG CTA TAT TTA TCT CTG GGT TCA -3’

Myogenin reverse 5’- GCT CCC GCA GCC CCT -3’

Myh3 forward 5’- CAG CTG GCC TTT CCT AAT TG -3’

Myh3 reverse 5’- CCC ATG TTC ACA GTG ACA GC -3’

Myh8 forward 5’- TAG TGT GTT GGG AAG GGA ATC T -3’

Myh8 reverse 5’- GCT CCT GTT GGA ACA AAT AAG G -3’

Desmin forward 5’- GCT TCC TAG CTG GGC CTT TCC -3’

Desmin reverse 5’- TTC ATC CCC CTA CTC ACA CC -3’

NeuroG1 forward 5’- CCT CCC GCG ACG ATA AAT -3’

NeuroG1 reverse 5’- CCT CAG GAC CCC TTA AGT ACG -3’

74

3. Cloning primers

Table 3 Sequences of primers used for cloning.

Primer Sequence

Ezh1α-XhoI-forward 5’- ATG AGG AAA ATG GAT ATA GCA -3’

Ezh1α-SalI-forward 5’- GTC GAC ATG AGG AAA ATG GAT ATA GCA AGT CCC CCA AC -3’

Ezh1α-NotI-reverse 5’- GCG GCC GCC TAG AAG ACG TCC GTT TCC CTC TCG ATG CCC ACA TAC -3’

Ezh1β-XhoI- forward 5’- CTC GAG ATG GAT ATA GCA AGT CCC CCA ACT TCC AAA TGC ATC AC-3’

Ezh1β-SalI- forward 5’- GTC GAC ATG GAT ATA GCA AGT CCC CCA ACT TCC AAA TG -3’

Ezh1β-NotI-reverse 5’- GCG GCC GCC TAA GGG GCA GGA GAA AAG AGT CTG GGC ACT CCA AG -3’

Eed3-XhoI-forward 5’- CTC GAG ATG TCC GAG AGG GAA GTG TCG ACT GCG CCG GCG GGA AC -3’

Eed3-EcoRI-forward 5’- GAA TTC AAT GTC CGA GAG GGA AGT GTC GAC TGC GCC GGC GGG AAC -3’

Eed4-XhoI-forward 5’- CTC GAG ATG GTG GCG GGG TCG CAC GCA CGC CCG CCT CGG CGG CTG -3’

Eed4-EcoRI-forward 5’- GAA TCC AAT GGT GGC GGG GTC GCA GCG ACG CCC GCC TCG GCG GCT G -3’

Eed3/4-NotI-reverse 5’- GCG GCC GCT TAT CGA AGT CGA TCC CAT CGC CAA ATG CTG GCA TCA TCG -3’

Suz12-EcoRI-forward 5’- GAA TTC AAT GGC GCC TCA GAA GCA CGG CGG TGG GGG AG -3’

Suz12-XhoI-reverse 5’- CTC GAG TCA GAG TTT TTG TTT CTT GCT CTG TT -3’

4. Sequencing primers:

Table 4 Sequences of primers used for Sanger sequencing.

Primer Sequence

pGEX-5x-1 forward 5’- GGG CTG GCA AGC CAC GTT TGG TG -3’

pGEX-5x-1 reverse 5’- CCG GGA GCT GCA TGT GTC AGA GG -3’ 75

pJET1.2 forward 5’- GTA GCA TCA CGC TGT GAG TAA GTT C -3’

pJET1.2 reverse 5’- ACC TAC AAC GGT TCC TGA TGA GGT G -3’ pFastBac HT C forward 5’- TCT AGT GGT TGG CTA CGT ATA CTC C -3’ pFastBac HT C reverse 5’- TGT GGT ATG GCT GAT TAT GAT CCT C -3’

76

APPENDIX C: Antibodies

1. Western blotting

Table 5 Primary antibodies and the corresponding secondary antibodies.

Primary antibody Secondary antibody

1:500 EED rabbit polyclonal antibody 1:5000 mouse anti-rabbit IgG-HRP (sc- (PA5-72224, Invitrogen) to 5% BSA in 2357, SCBT) in 5% BSA in TBTS TBTS

1:500 SUZ12 rabbit monoclonal 1:5000 mouse anti-rabbit IgG-HRP (sc- antibody (D39F6, CST) to 5% BSA in 2357, SCBT) in 5% BSA in TBTS TBTS

1:200 EZH1α goat antibody (182062) to 1:5000 donkey anti-goat IgG-HRP: (sc- 5% BSA in TBTS 2020, SCBT) in 0.1% TBTS

1:1000 Anti-HA (3f10) monoclonal high 1:5000 goat anti-rat IgG-HRP (A9037, affinity rat antibody (11867423001, Sigma) in 5% milk in TBTS Sigma) to 5% milk in TBTS

1:1000 Anti-GST rabbit antibody 1:5000 mouse anti-rabbit IgG-HRP (sc- (ab9085, abcam) to 5% BSA in TBTS 2357, SCBT) in 5% BSA in TBTS

77

REFERENCES

1. Schuettengruber, B., Chourrout, D., Vervoort, M., Leblanc, B. & Cavalli, G. Genome Regulation by Polycomb and Trithorax Proteins. Cell 128, 735–745 (2007).

2. Ringrose, L. & Paro, R. Epigenetic Regulation of Cellular Memory by the Polycomb and Trithorax Group Proteins. Annu. Rev. Genet. 38, 413–443 (2004).

3. Allis, C. D. & Jenuwein, T. The molecular hallmarks of epigenetic control. Nature Reviews Genetics (2016). doi:10.1038/nrg.2016.59

4. Margueron, R. & Reinberg, D. The Polycomb complex PRC2 and its mark in life. Nature 469, 343–349 (2011).

5. Duncan, E. J., Gluckman, P. D. & Dearden, P. K. Epigenetics, plasticity, and evolution: How do we link epigenetic change to phenotype? Journal of Experimental Zoology Part B: Molecular and Developmental Evolution (2014). doi:10.1002/jez.b.22571

6. Szyf, M. Nongenetic inheritance and transgenerational epigenetics. Trends Mol. Med. 21, 134–144 (2015).

7. Felsenfeld, G. A Brief History of Epigenetics. doi:10.1101/cshperspect.a018200

8. Rivera, C. M. & Ren, B. Leading Edge Review Mapping Human Epigenomes. Cell 155, 39–55 (2013).

9. Almouzni, G. & Cedar, H. Maintenance of epigenetic information. Cold Spring Harb. Perspect. Biol. (2016). doi:10.1101/cshperspect.a019372

10. Lewis, E. B. in Genes, Development and Cancer 205–217 (Springer US, 1978). doi:10.1007/978-1-4419-8981-9_13

11. Jurgens, G. A group of genes controlling the spatial expression of the bithorax complex in Drosophila.

12. Struhl, G. A gene product required for correct initiation of segmental determination in Drosophila. Nature 293, (1981).

13. Mallo, M. & Alonso, C. R. The regulation of Hox gene expression during animal development. Development 140, 3951–3963 (2013).

14. Milne, T. A., Sinclair, D. A. R. & Brock, H. W. The Additional sex combs gene of Drosophila is required for activation and repression of homeotic loci, and interacts specifically with Polycomb and super sex combs. Mol. Gen. Genet. (1999). doi:10.1007/s004380050018

15. Rinn, J. L. & Chang, H. Y. Genome Regulation by Long Noncoding RNAs. 78

doi:10.1146/annurev-biochem-051410-092902

16. Stojic, L. et al. Chromatin regulated interchange between polycomb repressive complex 2 ( PRC2 ) -Ezh2 and PRC2-Ezh1 complexes controls myogenin activation in skeletal muscle cells. Epigenetics Chromatin 4, 16 (2011).

17. Saksouk, N., Simboeck, E. & Déjardin, J. Constitutive heterochromatin formation and transcription in mammals. Epigenetics Chromatin 8, 3 (2015).

18. Struhl, G. & Akam, M. Altered distributions of Ultrabithorax transcripts in extra sex combs mutant embryos of Drosophila. EMBO J. 4, 3259–3264 (1985).

19. Comet, I., Riising, E. M., Leblanc, B. & Helin, K. Maintaining cell identity: PRC2-mediated regulation of transcription and cancer. Nature Reviews Cancer (2016). doi:10.1038/nrc.2016.83

20. Schuettengruber, B., Chourrout, D., Vervoort, M., Leblanc, B. & Cavalli, G. Leading Edge Review Genome Regulation by Polycomb and Trithorax Proteins. doi:10.1016/j.cell.2007.02.009

21. Mechanisms Regulating PRC2 Recruitment and Enzymatic Activity.pdf.

22. Shi, Y. et al. Structure of the PRC2 complex and application to drug discovery. Acta Pharmacol. Sin. 38, 963–976 (2017).

23. Hirata, H. et al. Long noncoding RNA MALAT1 promotes aggressive renal cell carcinoma through Ezh2 and interacts with miR-205. Cancer Res. 75, 1322–1331 (2015).

24. Wang, D. et al. LncRNA MALAT1 enhances oncogenic activities of EZH2 in castration-resistant prostate cancer. Oncotarget 6, 41045–41055 (2015).

25. Gan, L. et al. Epigenetic regulation of cancer progression by EZH2: from biological insights to therapeutic potential. Biomark. Res. 6, 10 (2018).

26. Kim, K. H. & Roberts, C. W. M. Targeting EZH2 in cancer. Nat. Med. 22, 128–34 (2016).

27. Herrera-Merchan, A. et al. Ectopic expression of the histone methyltransferase Ezh2 in haematopoietic stem cells causes myeloproliferative disease. Nat. Commun. 3, 623 (2012).

28. Poirier, J. T. et al. DNA methylation in small cell lung cancer defines distinct disease subtypes and correlates with high expression of EZH2. Oncogene 34, 5869–5878 (2015).

29. Raaphorst, F. M. et al. Coexpression of BMI-1 and EZH2 Polycomb Group Genes in Reed-Sternberg Cells of Hodgkin’s Disease. Am. J. Pathol. 157, 709– 715 (2000).

30. Xu, J. et al. Developmental control of polycomb subunit composition by 79

GATA factors mediates a switch to non-canonical functions. Mol. Cell (2015). doi:10.1016/j.molcel.2014.12.009

31. Li, D. & Roberts, R. WD-repeat proteins: structure characteristics, biological function, and their involvement in human diseases. Cell. Mol. Life Sci. 58, 2085–97 (2001).

32. H, N. H. Different Ezh2-Containing Complexes Target Methylation of Histone H1 or Introducción. 14, 183–193 (2004).

33. Riising, E. M. et al. Gene silencing triggers polycomb repressive complex 2 recruitment to CpG Islands genome wide. Mol. Cell (2014). doi:10.1016/j.molcel.2014.06.005

34. Yuan, W. et al. Dense chromatin activates polycomb repressive complex 2 to regulate H3 lysine 27 methylation. Science (80-. ). (2012). doi:10.1126/science.1225237

35. Lanzuolo, C., Sardo, F. Lo, Diamantini, A. & Orlando, V. PcG complexes set the stage for epigenetic inheritance of gene silencing in early S phase before replication. PLoS Genet. (2011). doi:10.1371/journal.pgen.1002370

36. Petruk, S. et al. TrxG and PcG proteins but not methylated histones remain associated with DNA through replication. Cell (2012). doi:10.1016/j.cell.2012.06.046

37. Gaydos, L. J., Wang, W. & Strome, S. H3K27me and PRC2 transmit a memory of repression across generations and during development. Science (80-. ). (2014). doi:10.1126/science.1255023

38. Margueron, R. et al. Role of the polycomb protein EED in the propagation of repressive histone marks. Nature 461, 762–767 (2009).

39. Mousavi, K., Zare, H., Wang, A. H. & Sartorelli, V. Polycomb Protein Ezh1 Promotes RNA Polymerase II Elongation. Mol. Cell (2012). doi:10.1016/j.molcel.2011.11.019

40. Gehani, S. S. et al. Polycomb Group Protein Displacement and Gene Activation through MSK-Dependent H3K27me3S28 Phosphorylation. Mol. Cell 39, 886–900 (2010).

41. Lau, P. N. I. & Cheung, P. Histone code pathway involving H3 S28 phosphorylation and K27 acetylation activates transcription and antagonizes polycomb silencing. Proc. Natl. Acad. Sci. 108, 2801–2806 (2011).

42. Deak, M., Clifton, A. D., Lucocq, L. M. & Alessi, D. R. Mitogen- and stress- activated protein kinase-1 (MSK1) is directly activated by MAPK and SAPK2/p38, and may mediate activationofCREB. EMBO J. 17, 4426–4441 (1998).

43. Bodega, B. et al. A cytosolic Ezh1 isoform modulates a PRC2-Ezh1 epigenetic adaptive response in postmitotic cells. Nat. Struct. Mol. Biol. 24, 444–452 80

(2017).

44. Fatica, A. & Bozzoni, I. Long non-coding RNAs: new players in cell differentiation and development. Nat. Rev. Genet. 15, 7–21 (2014).

45. R, R., Mattick, J. S. & Makunin, I. V. Non-coding RNA. 15, 17–29 (2006).

46. Reinhart, B. J. & Bartel, D. P. Small RNAs correspond to centromere heterochromatic repeats. Science (80-. ). (2002). doi:10.1126/science.1077183

47. Volpe, T. A. et al. Regulation of heterochromatic silencing and histone H3 lysine-9 methylation by RNAi. Science (80-. ). (2002). doi:10.1126/science.1074973

48. Jones, L. et al. RNA-DNA interactions and DNA methylation in post- transcriptional gene silencing. Plant Cell (1999). doi:10.1105/tpc.11.12.2291

49. Hamilton, A. J. & Baulcombe, D. C. A species of small antisense RNA in posttranscriptional gene silencing in plants. Science (80-. ). (1999). doi:10.1126/science.286.5441.950

50. Meister, G. et al. Human Argonaute2 mediates RNA cleavage targeted by miRNAs and siRNAs. Mol. Cell (2004). doi:10.1016/j.molcel.2004.07.007

51. Hutvágner, G. & Zamore, P. D. A microRNA in a multiple-turnover RNAi enzyme complex. Science (80-. ). (2002). doi:10.1126/science.1073827

52. Hammond, S. M., Boettcher, S., Caudy, A. A., Kobayashi, R. & Hannon, G. J. Argonaute2, a link between genetic and biochemical analyses of RNAi. Science (80-. ). (2001). doi:10.1126/science.1064023

53. Guttman, M. & Rinn, J. L. Modular regulatory principles of large non-coding RNAs. Nature 482, 339–346 (2012).

54. Batista, P. J. & Chang, H. Y. Long Noncoding RNAs: Cellular Address Codes in Development and Disease. Cell 152, 1298–1307 (2013).

55. Rinn, J. L. & Chang, H. Y. Genome Regulation by Long Noncoding RNAs. Annu. Rev. Biochem. 81, 145–166 (2012).

56. Rinn, J. L. et al. Functional Demarcation of Active and Silent Chromatin Domains in Human HOX Loci by Noncoding RNAs. Cell 129, 1311–1323 (2007).

57. Guttman, M. et al. lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature 477, 295–300 (2011).

58. Mak, W. et al. Mitotically stable association of polycomb group proteins eed and enx1 with the inactive x chromosome in trophoblast stem cells. Curr. Biol. 12, 1016–20 (2002).

59. Plath, K. et al. Role of Histone H3 Lysine 27 Methylation in X Inactivation. 81

Science (80-. ). 300, 131–135 (2003).

60. Silva, J. et al. Establishment of histone h3 methylation on the inactive X chromosome requires transient recruitment of Eed-Enx1 polycomb group complexes. Dev. Cell 4, 481–95 (2003).

61. Zhao, J., Sun, B. K., Erwin, J. A., Song, J.-J. & Lee, J. T. Polycomb Proteins Targeted by a Short Repeat RNA to the Mouse X Chromosome. Science (80-. ). 322, 750–756 (2008).

62. Wang, J. et al. Imprinted X inactivation maintained by a mouse Polycomb group gene. Nat. Genet. 28, 371–375 (2001).

63. Duthie, S. et al. Xist RNA exhibits a banded localization on the inactive X chromosome and is excluded from autosomal material in cis. Hum. Mol. Genet. 8, 195–204 (1999).

64. Fitzpatrick, G. V., Soloway, P. D. & Higgins, M. J. Regional loss of imprinting and growth deficiency in mice with a targeted deletion of KvDMR1. Nat. Genet. 32, 426–431 (2002).

65. Pandey, R. R. et al. Kcnq1ot1 Antisense Noncoding RNA Mediates Lineage- Specific Transcriptional Silencing through Chromatin-Level Regulation. Mol. Cell 32, 232–246 (2008).

66. Heo, J. B. & Sung, S. Vernalization-Mediated Epigenetic Silencing by a Long Intronic Noncoding RNA. Science (80-. ). 331, 76–79 (2011).

67. Mercer, T. R. & Mattick, J. S. Structure and function of long noncoding RNAs in epigenetic regulation. Nat. Struct. Mol. Biol. 20, 300–307 (2013).

68. Kanhere, A. et al. Short RNAs Are Transcribed from Repressed Polycomb Target Genes and Interact with Polycomb Repressive Complex-2. Mol. Cell 38, 675–688 (2010).

69. Somarowthu, S. et al. HOTAIR Forms an Intricate and Modular Secondary Structure. Mol. Cell 58, 353–361 (2015).

70. Davidovich, C., Zheng, L., Goodrich, K. J. & Cech, T. R. Promiscuous RNA binding by Polycomb repressive complex 2. Nat. Struct. Mol. Biol. 20, 1250– 1257 (2013).

71. Davidovich, C. & Cech, T. R. The recruitment of chromatin modifiers by long noncoding RNAs: Lessons from PRC2. RNA (2015). doi:10.1261/.053918.115

72. Beltran, M. et al. The interaction of PRC2 with RNA or chromatin s mutually antagonistic. Genome Res. (2016). doi:10.1101/gr.197632.115

73. Wilusz, J. E. et al. A triple helix stabilizes the 3’ ends of long noncoding RNAs that lack poly(A) tails. Genes Dev. 26, 2392–2407 (2012). 82

74. Tripathi, V. et al. The Nuclear-Retained Noncoding RNA MALAT1 Regulates Alternative Splicing by Modulating SR Splicing Factor Phosphorylation. Mol. Cell 39, 925–938 (2010).

75. Sun, Q., Hao, Q. & Prasanth, K. V. Nuclear Long Noncoding RNAs: Key Regulators of Gene Expression. Trends in Genetics (2018). doi:10.1016/j.tig.2017.11.005

76. Tripathi, V. et al. Long Noncoding RNA MALAT1 Controls Cell Cycle Progression by Regulating the Expression of Oncogenic Transcription Factor B-MYB. PLoS Genet. (2013). doi:10.1371/journal.pgen.1003368

77. Amodio, N. et al. MALAT1: A druggable long non-coding RNA for targeted anti-cancer approaches. J. Hematol. Oncol. 11, 1–19 (2018).

78. Fan, Y. et al. TGF- b – Induced Upregulation of malat1 Promotes Bladder Cancer Metastasis by Associating with suz12. 1531–1542 (2014). doi:10.1158/1078-0432.CCR-13-1455

79. Xiaochun, T. et al. Long noncoding RNA MALAT1 regulates autophagy associated chemoresistance via miR-23b-3p sequestration in gastric cancer. Mol. Cancer 16, 1–12 (2017).

80. Zuccalà, V. et al. Drugging the lncRNA MALAT1 via LNA gapmeR ASO inhibits gene expression of proteasome subunits and triggers anti-multiple myeloma activity. Leukemia 32, 1948–1957 (2018).

81. Kim, S. H., Yang, W. I., Kim, S. J., Yoon, S. O. & Kim, S. H. Association of the long non-coding RNA MALAT1 with the polycomb repressive complex pathway in T and NK cell lymphoma. Oncotarget 8, 31305–31317 (2017).

82. Zhang, X., Hamblin, M. H. & Yin, K.-J. The long noncoding RNA Malat1: Its physiological and pathophysiological functions. RNA Biol. 14, 1705–1714 (2017).

83. Shin, V. Y. et al. Long non-coding RNA NEAT1 confers oncogenic role in triple-negative breast cancer through modulating chemoresistance and cancer stemness. Cell Death Dis. (2019). doi:10.1038/s41419-019-1513-5

84. Zeng, R. et al. The long non-coding RNA MALAT1 activates Nrf2 signaling to protect human umbilical vein endothelial cells from hydrogen peroxide. Biochem. Biophys. Res. Commun. (2018). doi:10.1016/j.bbrc.2017.12.105

85. Rosano, G. L. & Ceccarelli, E. A. Recombinant protein expression in Escherichia coli: Advances and challenges. Frontiers in Microbiology (2014). doi:10.3389/fmicb.2014.00172

86. Berger, I. & Poterszman, A. Baculovirus expression: Old dog, new tricks. Bioengineered (2015). doi:10.1080/21655979.2015.1104433

87. Kasinath, V. et al. Structures of human PRC2 with its cofactors AEBP2 and JARID2. Science (80-. ). (2018). doi:10.1126/science.aar5700 83

88. Penny, G. D., Kay, G. F., Sheardown, S. A., Rastan, S. & Brockdorff, N. Requirement for Xist in X chromosome inactivation. Nature 379, 131–137 (1996).

89. Portoso, M. et al. PRC2 is dispensable for HOTAIR ‐mediated transcriptional repression. EMBO J. 36, 981–994 (2017).

90. Chambers, S. P., Austen, D. A., Fulghum, J. R. & Kim, W. M. High- throughput screening for soluble recombinant expressed kinases in Escherichia coli and insect cells. Protein Expr. Purif. (2004). doi:10.1016/j.pep.2004.03.003

91. Sokolenko, S. et al. Co-expression vs. co-infection using baculovirus expression vectors in insect cell culture: Benefits and drawbacks. Biotechnology Advances (2012). doi:10.1016/j.biotechadv.2012.01.009

92. Poepsel, S., Kasinath, V. & Nogales, E. Cryo-EM structures of PRC2 simultaneously engaged with two functionally distinct nucleosomes. Nat. Struct. Mol. Biol. 25, 154–162 (2018).