Universidad de Navarra

Facultad de Medicina

New Epigenetic Targets in Multiple Myeloma

Raquel Ordoñez Ciriza Tesis doctoral

2018

Universidad de Navarra

Facultad de Medicina

New Epigenetic Targets in Multiple Myeloma

Memoria presentada por Dña. Raquel Ordoñez Ciriza para aspirar al grado de Doctor por la Universidad de Navarra.

Dña. Raquel Ordoñez Ciriza

El presente trabajo ha sido realizado bajo nuestra dirección en el Área de Oncohematología y autorizamos su presentación ante el Tribunal que lo ha de juzgar. Pamplona, 15 de octubre de 2018.

Dr. Felipe Prósper Cardoso Dr. Xabier Agirre Ena

This doctoral thesis has been suported by the “Personal Investigador en Formación” program from ADA (Asociacion de Amigos de la Universidad de Navarra), and the “Formación de Profesorado Universitario” program from MINECO (Ministerio de Educación, Cultura y Deporte).

To my family,

to Pablo,

to all of you who have been with me along the way.

~

A mi familia,

a Pablo, a todos los que me habéis acompañado en el camino.

Acknowledgements

ACKNOWLEDGEMENTS

Después de todos estos años, todo el esfuerzo, la ilusión, las ganas, ha llegado el final. Mirando hacia atrás, no puedo sentirme más feliz. Sois muchos los que me habéis acompañado en esta aventura, y si con algo me quedo de esta experiencia es con vuestro cariño y apoyo. Gracias, de corazón, a todos y cada uno de los que hacéis que cada paso adelante merezca la pena. Esta tesis es vuestra.

A mi director de tesis, Felipe Prósper, por confiar en mí desde el primer día que pisé el laboratorio, por tu paciencia, dedicación, motivación, criterio y aliento. Gracias por haberme ayudado a crecer como profesional pero sobre todo como persona, exigiéndome cada día un poco más, sacando siempre lo mejor de mí. Por apoyarme en cada una de mis locuras, por saber escucharme y por transmitirme la importancia del trabajo en equipo. Gracias por todo.

A mi co-director, Xabier Agirre, por ser uno de mis pilares durante todos estos años. Porque por muy difícil que haya sido el camino nunca has dejado que me rinda, siempre has sabido “darle una vuelta”. Por transmitirme tu ilusión por cada resultado, cada experimento, cada pequeño paso adelante. Por enseñarme a pensar y a disfrutar cada día en el laboratorio. Gracias y mil gracias.

A Iñaki, porque trabajar con alguien que disfruta tanto de la ciencia como tú, es una lección de vida. Gracias por enseñarme a mirar siempre más allá, a hacer de cada debilidad un punto fuerte, a disfrutar del trabajo en equipo y sobre todo a no perder nunca la ilusión. Ha sido un verdadero privilegio poder contar con tu confianza, guía y ayuda.

A Marta, porque aunque quisiera no podría imaginarme una mejor compañera en este camino. Gracias por tu buen rollo, por tu paciencia, por tus consejos. Por animarme en los momentos de desesperación, por los cafés en Rumbos, los vinitos a

Acknowledgements media tarde. Por ser mucho más que una compañera, una amiga. Y sobre todo gracias por tu ilusión, sin ti nada habría sido lo mismo.

Gracias a Mariajo, que desde el principio me animó en esta aventura de la tesis doctoral, dándome mi primera oportunidad en la ciencia. Gracias por esos cafés- terapia, las partidas de continental, tus felicitaciones navideñas, y por todo el cariño y arte innato que derrochas día a día.

Mil gracias a todo el grupo de epigenética, porque sin vuestro apoyo y vuestra eterna paciencia, ¡qué sería de mí! A mi pequeña Arantxiburi, porque detrás de esa faceta de “malota” se esconde un corazón enorme. Gracias por cada abrazo, por compartir conmigo los buenos y malos momentos todos estos años y sobre todo por escucharme cuando lo he necesitado. ¡Vas a llegar muy lejos pequeñaja! A Teresa, gracias por tu cariño incondicional, por estar siempre ahí. Sabes que para mí eres un ejemplo a seguir, como científica, luchadora, mami y sobre todo como amiga. A Esti, gracias por haberme enseñado día a día el valor del trabajo bien hecho y a esperar siempre a ver la carga antes de sacar conclusiones (por si acaso). Por tus “politonos”, tu buen rollo, tu energía y tu compañerismo, el laboratorio no sería lo mismo sin ti. A Leyre, ¡la “mami” del grupo! Gracias por no dejarme desesperar, estar siempre dispuesta a echar una mano y alegrarme las horas en poyata. A Edurne, por demostrarme que la pasión por la ciencia nunca se pierde y que todo esfuerzo tiene su recompensa. Gracias también a las nuevas incorporaciones, Naroa, Ane y Luisvi, por vuestra motivación, ganas e ilusión.

A Paula y Javi, mis Zugarramurdis, mis Mellizas, sois sin duda lo mejor que me ha pasado estos años. No hay palabras para poder agradeceros todo lo que habéis hecho por mí. A Paula, por ser mi apoyo incondicional desde el primer día. Gracias por tener tiempo siempre para mi desesperación, mis alegrías o simplemente mis ganas de merendar. A Javi, por tu buen rollo innato, tu risa contagiosa, tus abrazos

Acknowledgements fuera de contexto y tus momentos como “cojo”. Eres desesperante y adorable a la vez. Gracias, gracias y mil gracias a los dos. Porque habéis hecho fácil lo difícil. ¡Dentro de poco os toca a vosotros! Pajara shoes nos espera…

Gracias a Amaia, por ser como una hermana mayor para mí. Por estar a mi lado desde la primera PCR, por tu infinita paciencia, por ayudarme a pensar y a repensar, por animarme en cada momento y no dejar que me rinda. Pero sobre todo por todos los momentazos y risas dentro y fuera del labo. Por ser mucho más que una compañera, gracias.

A David, por confiar en mí más que yo misma, por animarme siempre a ir más allá, por contagiarme tu pasión por la ciencia. Gracias por las cervezas con parmigiano, las palmeras de chocolate y sobre todo por no abandonarme cuando echaba el pulmón en la cuesta de Zizur.

Gracias a todo el laboratorio 1.01, 1.02 y 1.04. Sois mi pequeña gran familia, el mejor equipo que podría imaginar. Gracias a Ana Perez, aguantarme tarde tras tarde ha tenido que ser ¡todo un reto! A Isa, por saber siempre cuando necesitaba un achuchón. A Eva, por tu cariño y tus sonrisas día a día. A Arantxa Baraibar, por tu fortaleza y tu optimismo contagioso. A Saray, por tu buen rollo innato, por conseguir “desatarnos” a todos. A JuanRo, por sus riquísimas paellas, por echarme una mano siempre que lo he necesitado. A Manu, porque muchas veces no sabes si quererlo o matarlo, pero en el fondo sabes que se preocupa por ti. A Silvia, por cuidar siempre de mi salud mental y física, por ser como mi segunda “mami”. A Adri, porque aunque casi te entiendo más en inglés que en castellano, siempre has tenido tiempo para escucharme y darme consejo. A Patxi, por tu dedicación, tu humor y sobre todo tu gran corazón. A Laura, porque a pesar de todas mis liadas sigues apoyándome y estando ahí cuando lo he necesitado. A Juan Pablo, por estar dispuesto a echarme una mano siempre con una sonrisa. A Miriam, Borja, Ana Pardo, Rebeca, Xonia, Xabi

Acknowledgements

López, Giulia, Bea, Froi, Edu, Tania, Elena, Óscar. A todo el laboratorio de la CUN. Gracias porque por todos vosotros, cada día voy a trabajar con una sonrisa.

Y por supuesto a los nuevos fichajes: Leyre, Iñigo, Laura, Ana Cris, Marta, Jose, Pilar, Juan Antonio, Ángel. Habéis sido un soplo de aire fresco para todo el laboratorio. Coged aire, porque el camino que os queda es largo, pero merece la pena.

Tampoco me olvido de todos los que habéis formado alguna vez parte de esta pequeña gran familia, y que ahora seguís otro camino (o quizás el mismo). Gracias a Marien, por contagiarme tu ilusión. A Nico, por todas las tardes que hemos pasado entre experimentos eternos. A Esti Arellano, por ser tan tú y alegrarnos cada día que estuviste en el laboratorio. A Andoni, por acompañarme en mis primeros pasos en la ciencia. A Emma, por derrochar alegría allá donde vas. A Natalia Zapata, por tu risa contagiosa. A Natalia Aguado, por tus locuras de pitxota. Mucha suerte a todos en vuestras nuevas aventuras.

A todo el equipo de citometría, en especial a Diego e Idoia, porque ya sabía que erais unos grandes profesionales, pero además he podido ver que sois unas personas increíbles. Gracias por todo. A la unidad de bioinformática, en especial a Víctor, porque sabe que el 5032 será siempre mi extensión favorita, pero sobre todo por su interminable paciencia. A todo el personal del centro, secretarias, recepción, mantenimiento, limpieza… Porque siempre habéis tenido una sonrisa, una palabra de ánimo o la solución perfecta para dar otro paso más en este largo camino. Gracias al departamento de Bioquímica de la Universidad de Navarra, en especial a Silvia, Carlos e Iñigo, por confiar siempre en mí.

No puedo olvidarme de todos los compañeros del IDIBAPS. Mil gracias por acogerme y tratarme como una más, hicisteis que la experiencia fuera inolvidable. Moltes gràcies a tots!

Acknowledgements

Y por encima de todo, con todo mi cariño, gracias a los míos por estar incondicionalmente conmigo durante estos años. Siempre.

Gracias papá y mamá, por ser los primeros en confiar en mí y acompañarme a lo largo de todo el camino. Por preguntarme una y otra vez qué tal iban los experimentos, las estadísticas, los resultados, y escucharme ilusionados al ver a vuestra pequeña crecer. Porque sin vosotros no lo hubiese logrado. Porque lo mejor de este momento es saber que estáis orgullosos de mí. A Javi, porque aunque las hermanas mayores tenemos que ayudar a los pequeños, tú siempre has sido un apoyo para mí. Gracias a mis abuelos, las razones sobran. Habéis sido, sois y seréis la mejor prueba de amor incondicional. A mis tíos y primas, por su alegría e ilusión. A todos, os quiero mucho.

A Pablo, gracias por darme fuerzas día a día, por confiar en mí, por hacerme crecer a tu lado. Gracias por escuchar mis interminables charlas sobre epigenética mirándome con admiración, haciéndome entender por qué todo esto ha valido la pena. Por tu cariño, tu alegría, por seguirme al fin del mundo. Porque después de todos estos años, mirarme y pensar: tampoco ha sido tan duro ¿no?, demuestra por qué te quiero. A toda tu familia, en especial a Elena, Txema, Lucía y Fran por su apoyo y su cariño desde el primer momento. Y como no, a Manu, Palas, Javi e Iñigo, por cuidar siempre de mí y tratarme como una más.

A Naroa, porque nadie puede entenderme como tú, porque sé que te voy a tener a mi lado siempre. Gracias por compartir conmigo esta aventura de locos. A Maialen y Araiz, gracias por ser mi apoyo “martes” a “martes”, por darme tanto sin esperar nada a cambio. A Olaia, por escucharme y entenderme, siempre con una sonrisa. A Pablo Checa, Fiti, Pimpo, por ser mis liantes favoritos. A Martí, por haberte convertido en alguien tan importante en tan poco tiempo, por saber siempre como animarme. A Iñigo, mi crítico gourmet favorito, compañero de carreras y futuro

Acknowledgements vecino. Gracias por tu motivación, tu superación y todo tu cariño. A Kenia y Jone, gracias por hacerme sentir que no ha pasado el tiempo cuando nos volvemos a juntar. A Dany, porque soy tu fan incondicional, ¡viva la ciencia! A Gabriel San, por seguir apoyándome y vacilándome en la distancia. A todo el grupo de contemporáneo, en especial a Ivan y Sara, gracias por ser mi vía de escape.

Y gracias a los que vienen, y a los que no están. Gracias por todo. Os quiero con todo mi corazón.

“Science means constantly walking a tightrope between blind faith and curiosity, between expertise and creativity, between bias and openness, between experience and epiphany, between ambition and passion, and between arrogance and conviction. In short, between an old today and a .”

(Heinrich Rohrer)

TABLE OF CONTENTS

SELECTED ABBREVIATIONS ...... 7

ABSTRACT ...... 9

INTRODUCTION ...... 13 1. Epigenetics...... 17 1.1. The dynamic nature of epigenetics ...... 17 1.2. DNA methylation ...... 18 1.3. Chromatin structure and histone modifications ...... 20 1.4. The chromatin modifying machinery ...... 28 1.5. Methods to analyze epigenomic modifications ...... 31 1.5.1. Quantification of DNA methylation ...... 31 1.5.1.1. Bisulfite pyrosequencing ...... 32 1.5.1.2. Whole-genome Bisulfite Sequencing ...... 33 1.5.2. Identification and mapping of histone modifications ...... 34 1.5.3. Characterization of chromatin accessibility ...... 35 1.5.4. Analysis of 3D structure of chromatin ...... 36 2. Multiple Myeloma ...... 39 2.1. Introduction to B cell development ...... 39 2.2. MM clinical and biological features ...... 41 2.3. Genetic alterations and mutational landscape of MM ...... 41 2.4. Epigenetic features of MM ...... 43

HYPOTHESIS AND OBJECTIVES ...... 47

MATERIALS AND METHODS ...... 51 1. Primary samples ...... 53

1.1. Isolation of MM patient samples...... 53 1.2. Isolation of normal B cell subpopulations ...... 53 2. Chromatin immunoprecipitation with high-throughput sequencing (ChIP-Seq) ..... 55 3. Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-Seq) ...... 57 4. Whole Genome Bisulfite Sequencing (WGBS) ...... 59 5. RNA sequencing (RNA-Seq) ...... 60 5.1. Strand Specific RNA-Seq ...... 60 5.2. Low input 3’ end RNA-Seq ...... 61 6. 4C-Seq ...... 64 7. Bioinformatics ...... 65 7.1. Read mapping and data processing...... 65 7.2. Detection of differentially methylated CpGs, genomic regions and expressed genes ...... 67 7.3. Detection of de novo activated regions in MM ...... 68 7.4. K-means clustering ...... 69 7.5. Gene ontology and canonical pathways analysis of data sets ...... 70 8. Cell culture ...... 70 8.1. Culture of MM cell lines ...... 70 8.2. Culture of HEK93T cell line ...... 71 9. Inducible shRNA knockdown system ...... 71 9.1. Short hairpin RNA sequences design and cloning ...... 71 9.2. Lentiviral production ...... 73 10. CRISPR-Cas9 knockout system ...... 74 10.1. Cas9 stable MM cell lines ...... 75 10.2. Single gRNA sequences design and cloning ...... 75 10.3. Paired gRNA sequences design and cloning ...... 77 10.4.gRNA transduction...... 78

10.5. Assessment of CRISPR-Cas9 single gRNA editing efficiency ...... 78 10.6. Assessment of CRISPR-Cas9 paired gRNA editing efficiency ...... 80 11. Luciferase reporter assay ...... 81 12. RT-qPCR analysis ...... 82 13. Identification of fusion transcripts by gel-PCR ...... 83 14. DNA methylation analysis by pyrosequencing ...... 85 15. Western blot ...... 86 16. Apoptosis assay ...... 87 17. Cell viability assays by MTS ...... 87 18. Cell proliferation assays by flow cytometry ...... 87 19. Statistical analysis ...... 88 20. Data availability ...... 88

RESULTS ...... 89 1. Multilayer profiling of the MM reference epigenome ...... 93 2. Characterization of the MM epigenome in the context of normal B cell differentiation ...... 94 2.1. MM patients exacerbate chromatin patterns already present in normal plasma cells ...... 97 2.2. Identification of a core epigenomic landscape specific of MM patients ...... 100 2.3. De novo chromatin activation is a hallmark of MM ...... 106 3. Identification of TFs involved in MM chromatin modulation ...... 112 4. De novo chromatin activation affects genes related to key pathogenic mechanisms in MM ...... 114 5. TXN de novo activated enhancer is an essential regulatory element for MM pathogenesis ...... 119 5.1. TXN overexpression might be driven by a de novo activated enhancer region in MM ...... 119

5.2. TXN gene knockout impairs MM cell growth ...... 120 5.3. TXN enhancer deletion affects TXN gene expression, decreasing MM cell proliferation ...... 122 6. Identification of activated chromatin blocks specific of MM ...... 125 7. PRDM5 is a new candidate oncogene in MM ...... 126 7.1. PRDM5 and NDNF are consistently overexpressed and hypomethylated in MM patients ...... 126 7.2. PRDM5 and NDNF are independent transcripts in MM ...... 129 7.3. PRDM5 does not regulate NDNF expression ...... 130 7.4. PRDM5 and NDNF chromatin block is topologically remodeled in MM ...... 132 7.5. PRDM5 knockdown inhibits cell proliferation and induces apoptosis in MM cell lines ...... 133 7.6. NDNF knockout slightly impairs MM cell growth in a cell line-specific manner ...... 136

DISCUSSION ...... 141 1. General aspects ...... 145 2. Chromatin activation as a hallmark for MM patients ...... 146 2.1. Epigenetic memory of MM tumor plasma cells ...... 146 2.2. Potential epigenetic drivers of MM chromatin activation ...... 148 2.3. Functional relevance of the MM enhancer signature associated to oxidative stress ...... 151 2.4. Disruption of chromatin domains and oncogene activation in MM ...... 154 3. Functional implications of heterochromatin modulation in MM cells ...... 155 4. Epigenetic modulating agents as a rational therapeutic approach in MM ...... 156 5. Concluding remarks ...... 159

CONCLUSIONS ...... 161

REFERENCES ...... 165

APPENDIX ...... 193 Discovery of Reversible DNA Methyltransferase and Lysine Methyltransferase G9a Inhibitors with Antitumoral in Vivo Efficacy ...... 197 Epigenomic profiling of myelofibrosis reveals widespread DNA methylation changes in enhancer elements and ZFP36L1 as a potential tumor suppressor gene epigenetically regulated ...... 227

SELECTED ABBREVIATIONS

Selected abbreviations

3C Chromosome Conformation Capture

4C-Seq Circularized Chromosome Conformation Capture Sequencing

5hmC 5-hydorxy-methyl-cytosine

5mC 5-methyl-cytosine

ACB Activated Chromatin Block

AmpR Ampicillin Resistance gene

APS Adenosine 5' phosphosulfate

ASO Antisense oligonucleotide

ATAC-Seq Assay for Transposase-Accessible Chromatin with Sequencing

BFP Fluorescent Protein

BlastR Blasticidin Resistance gene bm- Bone marrow

BMP Bone Morphogenic Protein

BMPR Bone Morphogenic Protein Receptor bp Base pairs

CGI CpG Island

ChIP-Seq Chromatin Immunoprecipitation Sequencing

CpG CG dinucleotide

CRISPR Clustered Regularly Interspaced Short Palindromic Repeats

CSR Class-Switch Recombination

CTCF CCCTC-binding factor

DNMT DNA Methyltransferase

DNMTi DNMT inhibitor

3

Selected abbreviations dNTP Deoxynucleotide triphosphate

Dox Doxycycline

DSB Double Strand Break

EBI European Bioinformatics Institute

EGA European Genome Phenome Archive

ER Endoplasmic Reticulum

ERV Endogenous Retrovirus

ES Enrichment Score f1 ori f1 bacterial origin of replication

FACS Fluorescence Activated Cell Sorting

FBS Fetal Bovine Serum

FC Fold Change

FDA Food and Drug Administration

FDR False Discovery Rate

FOX Forkhead boxes

FPKM Fragments Per Kilobase Million

Fw Forward

GAPDH Glyceraldehyde-3-phosphate dehydrogenase

GCBC Germinal Center B cells

GEO Gene Expression Omnibus

GFP Green Fluorescent Protein

GO Gene Ontology gRNA Guide RNA

4

Selected abbreviations

GSEA Gene Set Enrichment Analysis gTXN gRNA targeting TXN gene

GUSβ Glucuronidase beta

H3K27ac Acetylation of lysine 27 of histone 3

H3K27me3 Trimethylation of lysine 27 of histone 3

H3K36me3 Trimethylation of lysine 36 of histone 3

H3K4me1 Monomethylation of lysine 4 of histone 3

H3K4me3 Trimethylation of lysine 4 of histone 3

H3K9me3 Trimethylation of lysine 9 of histone 3

HAT Histone acetyltransferase

HDAC Histone deacetylase

HDACi HDAC inhibitor

HMD Histone demethylase hPGK Human phosphoglycerate kinase promoter

HSC Hematopoietic Stem Cell

ICGC International Cancer Genome Consortium

IHEC International Human Epigenome Consortium

IL6 Interleukin 6

IL6R Interleukin 6 receptor

Indel Insertion/Deletion

IRF Interferon Regulatory Factors kb Kilobases

KMT Lysine methyltransferase

5

Selected abbreviations

LTR Long Terminal Repeat

Mb Megabase

MBC Memory B cells

MEF2 Monocyte Enhancer Factor 2

MGUS Monoclonal Gammopathy of Undetermined Significance mitRNA Mitochondrial RNA

MM Multiple Myeloma

MSC Mesenchymal Stem Cell

MW Molecular Weight

NBC Naïve B cells

NCBI National Center for Biotechnology Information

NDNF Neuron Derived Neurotrophic Factor

NDNFpr-Mut NDNF promoter mutated

NDNFpr-WT NDNF promoter wild type

NeoR Neomycin Resistance gene

NHEJ Non-Homologous End Joining

Norm Counts Normalized counts

NTC Negative Control pb- Peripheral blood

PC Plasma Cells

PC1-PC2 Principal Component 1-2

PCA Principal Component Analysis

PCL Plasma Cell Leukemia

6

Selected abbreviations

PCR Polymerase Chain Reaction

PPi Pyrophosphate

PRDM5 PR domain zinc finger protein 5

PRMT Arginine methyltransferase

PuroR Puromycin Resistance gene

R Pearson correlation coefficient rRNA Ribosomal RNA

RT-qPCR Reverse Transcription with quantitative Polymerase Chain Reaction

Rv Reverse

SAM S-adenosyl-methionine

Scr Scramble

SD Standard Desviation shGFP shRNA targeting GFP gene

SHM Somatic Hypermutation shNDNF shRNA targeting NDNF gene shPRDM5 shRNA targeting PRDM5 gene shRNA Short hairpin RNA

SMM Smoldering Multiple Myeloma ssRNA-Seq Strand specific RNA sequencing

SV40 Simian Virus 40 promoter t- Tonsil

TAD Topologically Associating Domain

TetR Tetracycline Repressor protein

7

Selected abbreviations

TF Transcription Factor

TIDE Tracking of Indels by DEcomposition

TSS Transcription Start Site

TXN Thioredoxin

UMI Unique Molecular Identifier

UPR Unfolded Protein Response vs Versus

VST Variance-Stabilizing Transformed values

WGBS Whole Genome Bisulfite Sequencing

WT Wild Type

ΔEnh Enhancer deletion

8

ABSTRACT

Abstract

How a quiescent, terminally differentiated plasma cell transforms into an aggressive and lethal disease such as multiple myeloma (MM) remains a scientific puzzle. The abnormal genomic landscape of MM patients has been previously characterized but has not been fully capable of explaining the pathogenesis of the disease. Furthermore, MM is a remarkably heterogeneous disease with no unifying transforming mechanism. Recently, breakthrough discoveries in the epigenetics field have transformed our understanding of neoplastic transformation. However, the chromatin regulatory network underlying MM onset and progression has just started to be characterized. Over the last years, we have been generating and integrating genome-wide maps of multiple epigenetic layers in human primary plasma cells from MM and B cells at different stages of differentiation. These have included six histone modifications with non-overlapping functions (H3K27ac, H3K4me1, H3K4me3, H3K36me3, H2K27me3 and H3K9me3), chromatin accessibility (ATAC-Seq), DNA methylation (WGBS) and gene expression (RNA-Seq). Beyond the description of the complete reference epigenome of the disease, we have identified that MM, although genetically heterogeneous, shares a core epigenomic MM-specific signature characterized by an extensive de novo chromatin activation of regulatory elements which has not been described in other hematological malignancies. When further characterized, we have found that chromatin activation and abnormal accessibility landscape lead to an extensive perturbation of the MM transcriptome, associated with two different general phenomena. On one hand, the target genes of the activated regulatory regions were remarkably enriched in multiple pathways and biological processes previously described to be associated with MM, such as protection from oxidative stress, suggesting a chromatin-basis underlying the pathogenesis of MM. On the other hand, we have observed that blocks of closely linked genes without an apparent functional association showed concurrent chromatin activation and expression in MM. In both cases we have functionally validated our hypotheses inducing not only the silencing of the target genes but also

11

Abstract of the cis-regulatory elements proving that chromatin regulation participates in the malignant behavior of MM plasma cells. Such studies have allowed us to identify new oncogenes and regulatory elements essential for MM cells. Together, our results suggest a new model to understand the pathogenesis of MM with the identification of new mechanisms of the disease and potential new targets, including epigenomic regulatory elements.

12

INTRODUCTION

“The beginning of knowledge is the discovery of something we do not understand.”

(Frank Herbert)

Introduction

1. Epigenetics 1.1. The dynamic nature of epigenetics

The genetic information necessary to generate all types of cells in the human body is coded by the same DNA sequence. Unique cellular phenotypes are therefore accomplished through specific regulatory mechanisms carefully orchestrated to allow or prevent gene expression at specific time points and cells (Holliday, 2006): the epigenomic regulatory landscape. Epigenetics is the study of these heritable changes in gene expression that do not require changes in the genomic DNA sequence (Dunn, Verma and Umar, 2003). While the genetic material is only edited in few exceptional cases, epigenetic patterns are extensively exchanged, modified and remodeled, both in the context of physiological and pathological conditions. A major challenge in biology is the understanding of the dynamical nature of this epigenetic regulation, integrating the information derived from different epigenetic layers, such as DNA methylation, histone modifications, histone variants, nucleosome positioning and chromatin looping (Song and Johnson, 2018) (Figure 1).

In the course of this doctoral thesis, we have focused on the study of DNA methylation, histone modifications and chromatin accessibility to characterize the reference epigenome of multiple myeloma (MM) in the context of normal B cell differentiation.

17

Introduction

Figure 1. The cellular nature of epigenetic information. Schematic representation of the epigenetic landscape, including DNA methylation (brown dots), post-translational histone modifications (green dots for activation marks; red dots for silencing marks) and nucleosome positioning (euchromatin represented by light-blue nucleosomes; heterochromatin as dark-blue nucleosomes) regulating RNA transcription (brown lines). On the left panel, epigenetic modulation under physiological conditions; on the right panel, stochastic epigenetic modulation patterns in cancer. Figure adapted from (Feinberg, 2018).

1.2. DNA methylation

DNA methylation is the most widely studied and best-characterized epigenetic modification, and has been reported to play a key role in cell differentiation and neoplastic transformation (Egger et al., 2004; Cedar and Bergman, 2011; Kulis et al.,

2015). It consists in the addition of a methyl group (-CH3) to the 5-carbon position of cytosine bases in CpG dinucleotides (CpGs) by DNA methyltransferase enzymes (DNMTs), yielding 5-methyl-cytosine (5mC). These CpGs are largely depleted from the genome (1% CpGs, C+G content 42%), except for specific regions with relatively high frequency of this dinucleotide (10% CpGs, C+G content > 55%), named CpG

18

Introduction islands (CGIs). These CGIs are associated with the promoter regions of 60%-70% of all human genes (Saxonov, Berg and Brutlag, 2006; Illingworth and Bird, 2009). DNA methylation of CGIs is robustly associated with transcriptional silencing, the formation of repressive chromatin states as well as chromosome X inactivation and genomic imprinting. Although the mechanisms underlying gene repression are not fully elucidated, increasing evidence suggests that it could be achieved through direct obstruction of transcriptional activators from their binding sites (Maurano et al., 2015; Yin et al., 2017) and the recruitment of co-repressor complexes by methyl-CpG binding proteins (Baubec et al., 2013). On the other hand, the absence of DNA methylation at CGIs does not ensure transcriptional activation in all cases. Other epigenetic mechanisms, such as histone modifications, are therefore responsible for silencing unmethylated promoter genes at specific cell differentiation stages (Bird, 2002; Li et al., 2018). Figure 2. Cytosine methylation. Cytosine is converted to 5′- methyl-cytosine by the action of DNMTs, by the addition of a methyl group (-CH3) from S- adenosyl methionine (SAM). Image reproduced from (Cui et al., 2016).

Remarkably, although the mechanisms are not fully characterized, DNA methylation located in gene bodies or intergenic regions outside CGIs have also been demonstrated to control gene expression through regulation of transcriptional elongation (Hahn et al., 2011), determination of alternative promoters (Maunakea et al., 2010) and regulation of mRNA splicing (Fong, Morison and Dawson, 2014). This non-CGI methylation is more dynamic and more tissue-specific than CGI methylation and has been associated to cancer-causing somatic and germline mutations (Jones, 2012). However, the potential role of methylation of CpG-poor promoters,

19

Introduction enhancers, insulators, repetitive elements and gene bodies needs to be characterized in detail.

1.3. Chromatin structure and histone modifications

In eukaryotic cells, DNA is presented in the context of chromatin: a complex of nucleic acid and associated proteins called histones. The nucleosome is the basic unit of chromatin, with 147 base pairs (bp) of DNA wrapped around a protein octamer, containing two copies of each “core” histone H2A, H2B, H3 and H4. The N- and C- terminal histone tails protrude from the nucleosome core and have the potential to interact with adjacent nucleosomes and the linker Figure 3. Chromatin conformational states. DNA. In addition to the structural role of Schematic representation of A) heterochromatin and B) euchromatin structures. Image courtesy of packing the DNA into the cell nucleus, National Institutes of Health. chromatin behaves as a highly dynamic scaffold essential in regulating gene expression (Maeshima et al., 2014). Traditionally, chromatin can be found in two main conformational states: heterochromatin and euchromatin. Heterochromatin is a highly condensed closed structure with high density of nucleosomes and therefore associated with transcriptional repression (Figure 3A). There have been described two types of heterochromatin: constitutive heterochromatin, which contains highly repetitive sequences implicated in chromosome structure maintenance and remains condensed throughout cell cycle and development; and facultative heterochromatin, which has potential for gene expression at some point in cell development and can be either condensed or decondensed depending on cell type (Wang, Jia and Jia,

20

Introduction

2016). On the other hand, euchromatin displays a lightly packed structure, usually associated with actively transcribed regions (Figure 3B). The transition between both states is an active process of both nucleosome repositioning mediated by ATP- dependent chromatin remodeling machinery and nucleosome disassembly and exchange performed by histone chaperones through a highly dynamic and extensively regulated mechanism (Burgess and Zhang, 2013; Bartholomew, 2014).

Additionally to nucleosome positioning, the chromatin landscape is also regulated by histone modifications. Histone tails can be post-transcriptionally modified, regulating chromatin structure directly and frequently acting as binding site for the recruitment of other non-histone proteins to chromatin, such as transcription factors (TFs) or chromatin remodelers. Over 60 different residues on histone tails have been described to be susceptible of being modified. The most frequent histone modifications are acetylation, phosphorylation, methylation and ubiquitination, although many other modifications have been also reported (Arnaudo and Garcia, 2013). The specific patterns of histone post-transcriptional modifications and their crosstalk has been termed the “histone code” (Jenuwein and David Allis, 2001), which regulates not only gene transcription, DNA repair and replication, but also higher order chromatin structure and stability (Kouzarides, 2007; Bannister and Kouzarides, 2011). Certain histone marks have been conventionally associated with transcriptionally active regions, commonly known as “active marks”, such as acetylation (ac) of lysines (K) 27 and 9 of histone 3 (H3), or H3K27ac and H3K9ac respectively. These histone marks may regulate transcription by creating an open chromatin structure and recruit effectors that mediate a transcriptionally competent state (Zentner, Tesar and Scacheri, 2011; Bonn et al., 2012; Karmodiya et al., 2012; Gates et al., 2017). By contrast, methylation (me) of residues H3K27 and H3K9 are hallmarks of repressive chromatin and are often found at silent gene loci. For example, trimethylation of H3 lysine 27 (H3K27me3) is associated with the formation

21

Introduction of facultative heterochromatin, whereas di/trimethylation of H3 lysine 9 (H3K9me2/me3) displays an important role in the formation of constitutive heterochromatin and gene expression regulation during cell development (Trojer and Reinberg, 2007; Jamieson et al., 2016).

Additionally to the mentioned role in gene activation and repression, a more comprehensive overview of the chromatin structure and its implication in cellular function has been described thanks to the generation of genome-wide maps of numerous histone modifications. Specific chromatin marks and their combinatorial patterns have been associated with particular functional regions of the genome, or the so called “chromatin states” (Ernst et al., 2011). This characterization is able to define the regulatory elements of the genome with greater robustness and precision than the individual histone marks (Ernst and Kellis, 2010). In this way, researchers have identified epigenetic patterns that correlate with the following functional elements in the genome:

1) Promoters

A promoter is a DNA segment where transcription initiates, which includes the transcription start site (TSS) of a gene and functional sequence elements which act together as an assembly platform for the transcriptional machinery complex. Promoter regions are usually nucleosome depleted and commonly marked by trimethylation of H3 lysine 4 (H3K4me3) (Thurman et al., 2012; Benayoun et al., 2014). Within H3K4me3 marked promoters, we can differentiate into active (gaining H3K27ac), weak (an intermediate state defined by the only presence of H3K4me3), and poised (gaining H3K27me3) promoters, which differ in the expression level of the associated transcripts (Heintzman et al., 2007) (Figure 4). Remarkably, recent studies have demonstrated that removal of most H3K4me3 from the TSS lead to very few transcriptional changes, suggesting that rather

22

Introduction

than having a key role in regulating transcription, H3K4me3 may be deposited as a result of active transcription, regulating mechanisms such as splicing, transcription termination, epigenetic memory of previous states and transcriptional consistency between cells in a specific population (Howe et al., 2017).

Figure 4. Chromatin modifications are distributed in specific gene regulatory regions. A) Distribution of DNA methylation (5mC), DNA hydroxymethylation (5hmC) and histone marks in the enhancer, promoter and gene body regions of actively transcribed genes. Actively transcribed genes carry typically have chromatin modifications within the gene body to facilitate transcription initiation and elongation. B) Common chromatin modifications found in the enhancer, promoter and gene body regions of silenced genes. C) Bivalent/poised genes have both activating and silencing chromatin modifications to facilitate rapid changes in gene expression during cell development. Figure reproduced from (Layman and Zuo, 2014).

23

Introduction

2) Enhancers

Enhancers are regulatory elements localized distal to promoters and TSS, which have been demonstrated to be highly dynamic and correlate with cell- specific gene expression profiles, whereas promoter activation is more stable across cell types (Calo and Wysocka, 2013; Lara-Astiaso et al., 2014). Enhancers function as integrated binding platforms for TFs (Local et al., 2018), which are capable of activate transcription of their target genes at great distances by exerting long-range interactions with promoters though DNA looping (Bulger and Groudine, 2011) (Figure 5). Primed or poised enhancers are marked by monomethylation of H3 lysine 4 (H3K4me1), which not only determines the location of a putative enhancer region, but has also been described to prevent DNA methylation, facilitate nucleosome repositioning and promote the binding of the so called “pioneer” factors responsible for enhancer activation (Zaret and Carroll, 2011; Sharifi-Zarchi et al., 2017). Presence of H3K27ac distinguishes active enhancer states from those poised for activation, which are bivalently marked by H3K4me1 and H3K27me3 in specific cell types (Creyghton et al., 2010) (Figure 4). These two histone modifications (H3K4me1 and H3K27ac), in combination with chromatin accessibility, have been widely utilized for enhancer annotation in multiple studies (Heintzman et al., 2009; Fang et al., 2016; Gao et al., 2016).

24

Introduction

Figure 5. Enhancers and their features. Enhancers are distinct functional genomic regions that contain short DNA motifs acting as binding sites for sequence-specific transcription factors (TFs), which can enhance the transcription of a target gene. In a specific cell context, active enhancers (Enhancer B) are bound by activating TFs and brought into proximity of their respective target promoters by DNA looping. Active enhancers are characterized by a depletion of nucleosomes, and the active H3K4me1 and H3K27ac histone marks. On the other hand, inactive enhancers might be silenced by different mechanisms, such as repressive TFs binding (Enhancer A) or the Polycomb-associated H3K27me3 mark (Enhancer C). The combined regulatory patterns of TFs binding and histone modifications determine the cell-specific gene expression profile. Image modified from (Shlyueva, Stampfel and Stark, 2014).

3) Transcribed regions

Recent studies have demonstrated that trimethylation of H3 lysine 36 (H3K36me3) is a histone modification highly correlated with the transcribed regions of active genes. Notably, exons seem to be enriched for H3K36me as compared to intronic regions within the same gene (Figure 4). Moreover, alternatively spliced exons show reduced H3K36me3 levels when compared to constitutively expressed exons, suggesting the implication of this histone modification also in the splicing mechanism (Kolasinska-Zwierz et al., 2009). Additionally, H3K36me3 seems to have a role in key processes like chromatin architecture and stability maintenance (Joshi and Struhl, 2005; Sims III and Reinberg, 2009) and regulation of DNA mismatch repair (Li et al., 2013).

25

Introduction

4) Insulators

Chromatin insulators are regulatory DNA elements that divide the genome into independent chromatin domains to prevent inappropriate interactions between adjacent regulatory elements (such as promoters and enhancers) (Figure 6). CCCTC-binding factor (CTCF) is the most characterized insulator- binding protein, which has been described to organize the genome into epigenetically distinct domains by forming chromatin loops (Cuddapah et al., 2009). Besides the insulator role of CTCF, it has been described to facilitate gene transcription at specific gene loci by linking promoters to enhancers (Handoko et al., 2011; Lu et al., 2016). Therefore, it is considered a key genome organizer, acting as chromatin barrier, enhancer insulator and enhancer linker (Phillips and Corces, 2009; Nakahashi et al., 2013), but also playing important roles in the regulation of V(D)J recombination (Chaumeil and Skok, 2012), topologically associating domain (TAD) boundary demarcation (Dixon et al., 2012), alternative promoter selection (Guo et al., 2012) and alternative splicing regulation (Shukla et al., 2011).

Figure 6. Classical functions of insulators. A fragment of genomic DNA with functional regulatory sequences (promoter, enhancer and insulator) is shown. A) A barrier insulator is able to block the spread of heterochromatin, rendering the activation of the downstream gene. B) An enhancer-blocking insulator complex limits the activity of an enhancer in an orientation- dependent manner. Image reproduced from (Heger and Wiehe, 2014).

26

Introduction

5) Heterochromatin

As aforementioned, heterochromatin can be classified into two categories: constitutive and facultative heterochromatin. On one hand, constitutive heterochromatin is a high stable chromatin structure that represses transcription and recombination of repetitive DNA elements to maintain genome integrity (Grewal and Jia, 2007). It is usually associated to trimethylated H3 lysine 9 (H3K9me3) histone mark and DNA methylation (Figure 4), which are stably inherited though DNA replication (Hall et al., 2002; Audergon et al., 2015). Recent studies have demonstrated that, despite being a highly stable structure, this constitutive heterochromatin domains might display dynamic patterns under specific conditions, like a global loss of H3K9me3 heterochromatin observed during the normal aging process (Zhang et al., 2015).

On the other hand, facultative heterochromatin is often located at developmentally regulated genes and gets modulated throughout the cell maturation process (Trojer & Reinberg, 2007). It is frequently associated with Polycomb repression though H3 lysine 27 trimethylation (H3K27me3) deposition (Figure 4). H3K27me3 has been shown to co-localize with DNA methylation throughout most of the genome, except for CGIs, where these two marks are mutually exclusive (Brinkman et al., 2012). Cross-talk between H3K27me3 histone modification and the Polycomb complex maintains a heterochromatic state by blocking the binding of key TFs to repressed promoter and enhancer regions, altering nucleosome interactions and disrupting the elongation mechanisms (Simon and Kingston, 2009). Several studies demonstrate that deregulation of H3K27 methylation alters cellular differentiation under both physiological and pathological conditions (Kim and Kim, 2012; Conway, Healy and Bracken, 2015).

27

Introduction

Over the last years, genome-wide maps of chromatin modifications in a variety of cell types and organisms have been extensively mined (Epigenomics Consortium et al., 2015) with the development of new bioinformatic analysis tools (Ernst and Kellis, 2017). The challenge now is to figure out how these chromatin states are maintained and modified throughout the genome and to decipher what a particular chromatin state modulation means mechanistically in a specific cellular context.

1.4. The chromatin modifying machinery

Chromatin is a dynamic structure that responds to a wide variety of stimuli to regulate DNA accessibility and gene expression patterns. This dynamic nature is orchestrated by the chromatin modifying machinery, including multiple regulatory proteins that write, read and erase DNA and Figure 7. The chromatin modifying machinery. The histone modifications (Reik, 2007). epigenetic machinery includes chromatin “writers” which introduce specific modifications (circles), Chromatin “writers” are enzymes “erasers” that remove them and “readers” which can that catalyze the epigenetic recognize a specific epigenetic mark and recruit other epigenetic and transcription factors. Image reproduced modifications, which can be then from (Tarakhovsky, 2010). removed by enzymatic “erasers”. Epigenetic “readers” can recognize and interpret such chromatin modifications and recruit, in a context dependent manner, specific protein complexes that modify the chromosomal structure, DNA accessibility and transcriptional landscape of the cell (Nature Structural & Molecular Biology, 2013) (Figure 7).

DNA methylation is mediated by DNMTs, which transfer a methyl group (-CH3) from S-adenosyl methionine (SAM) to the carbon-5 position of cytosine residues.

28

Introduction

These enzymes are involved in both stablishing de novo DNA methylation patterns (DNMT3a and DNMT3b) and their maintenance during cell division (DNMT1) (Chen and Riggs, 2011). Whereas DNA methylation mechanisms are well characterized, the DNA demethylation process is still controversial. On one hand, DNA methylation can be passively lost due to an inefficient maintenance during cell division. On the other hand, active DNA demethylation can occur by deamination of 5mC to thymine, catalyzed by AID enzyme; or by hydroxylation to 5-hydroxy-methyl-cytosine (5hmC), catalyzed by the TET protein family (Bhutani, Burns and Blau, 2011).

Histone acetylation is strongly correlated with chromatin activation and its balance depends on two enzyme types: histone acetyltransferases (HATs) and histone deacetylases (HDACs). Among HATs, we can distinguish between those located in the nucleus (including GNAT, MYST and p300/CBP families) and those that acetylate newly synthesized histones in the cytoplasm before their assembly (such as HAT1 and HAT2). On the other hand, HDAC enzymes can be divided into Cass I, II, III (also called sirtuins) and IV, depending on their nuclear/cytoplasmic localization and cofactor dependency (Bannister and Kouzarides, 2011) (Figure 8).

Histone methylation is a highly complex epigenetic modification that can be involved both in activation and repression of gene transcription, including different levels of modulation (mono-, di- or tri-methylated residues). The enzymes responsible for histone methylation are lysine methyltransferases or KMTs, and arginine methyltransferases (PRMT) which are less characterized. A wide range of protein families have been described to harbor KMT function, including SETD1 and MLL for H3K4 methylation, SETD2 and NSD1-3 for H3K36 methylation, SUV39H and G9a (EHMT2) for H3K9 methylation and EZH2 for H3K27 methylation, among others. On the other hand, histone demethylation has been found to be substrate-specific (mono-, di- or tri-methylated lysine), including a complex network of histone

29

Introduction demethylases or HDM (KMD1-5 and JMJD protein families) (Greer and Shi, 2012) (Figure 8).

Figure 8. Histone modifying enzymes. Post-translational modifications of lysine (K), arginine (R), serine (S) and threonine (T) residues are shown with their respective histone modifying enzymes. Arginine residues can be methylated (M); lysine residues can be monomethylated, dimethylated or trimethylated. Enzymes shown in green are associated with transcriptional activation and enzymes shown in red are associated with transcriptional repression. Figure reproduced from (Huynh and Casaccia, 2013).

Numerous studies have demonstrated that there is a dynamic crosstalk between chromatin modifications and the epigenetic machinery. Many epigenetic “writers” are positively regulated by the marks that they place, as well as other modifications associated with the same chromatin state, reinforcing the formation and maintenance of distinct chromatin patterns through cell division. Moreover, for many epigenetic modifications it is not well stablished whether they directly influence transcription or their placement is simply a consequence of the transcriptional state of a genomic region (Zhang, Cooper and Brockdorff, 2015). Due to the complexity of the regulatory patterns underlying the chromatin modifying machinery, the exact molecular mechanisms of the epigenetic memory and its aberrant deregulation remain to be fully characterized.

30

Introduction

1.5. Methods to analyze epigenomic modifications

Considering the increasing relevance of epigenetics for understanding both physiological and pathological cellular processes, a wide range of techniques to identify and quantify modulations of the epigenetic landscape have been developed in the past decades. In the current era of the high-throughput data, many measurement technologies that were first developed to study locus-specific modifications have been now adapted to microarray platforms and next-generation sequencing to further interrogate epigenetic modifications at a whole-genome scale. From the vast variety of epigenomic analysis tools, in the following section we will introduce the experimental techniques applied in the course of this doctoral thesis.

1.5.1. Quantification of DNA methylation

Some of the most widely used approaches to map 5mC are based on sodium bisulfite modification of the DNA. Bisulfite treatment deaminates unmethylated cytosines resulting in their conversion to uracil, to be further recognized as thymidine during PCR based strategies, while leaving methylated cytosine residues unchanged (Figure 9). Thus, the ratio between methylated and unmethylated signals at each specific CpG is proportional to its methylation level (Kagohara et al., 2018). DNA bisulfite Figure 9. Bisulfite modification modification can be coupled with different principle. Bisulfite conversion of unmethylated cytosines (left) into downstream protocols, like pyrosequencing for uracil residues, recognized as thymidine during PCR amplification. detecting locus-specific DNA methylation changes, or Methylated cytosines (right) remain unchanged. whole-genome bisulfite sequencing (WGBS) to detect genome-wide DNA methylation patterns.

31

Introduction

1.5.1.1. Bisulfite pyrosequencing

Pyrosequencing is a method of DNA sequencing based on the “sequencing by synthesis” principle, in which the sequencing is performed by detecting the nucleotide incorporated in each sequencing cycle. Briefly, bisulfite modified DNA is amplified by PCR using a specific pair of primers flanking the region containing the CpGs of interest, with one of the two primers being biotinylated. In this amplification step, uracil residues are replaced by thymidines. This biotinylated PCR product is bound to streptavidin-coated sepharose beads and after a denaturation step, the unlabeled strand is washed away. Then, the pyrosequencing reaction is carried out using the biotinylated strand as template. Pyrosequencing relies on the coupled reaction of the following enzymes: 1) a DNA polymerase which, from a specific sequencing primer, adds the corresponding nucleotide to the complementary strand. Nucleotides are sequentially added and their incorporation releases pyrophosphate (PPi). 2) ATP sulfurylase converts PPi to ATP in the presence of adenosine 5’ phosphosulfate (APS). 3) The luciferase enzyme uses this ATP to transform luciferin into oxoluciferin, which generates visible light in proportional amounts to the concentration of ATP. This light is detected by the pyrosequencing platform in a real- time manner, and subsequently analyzed for sequence determination. 4) The unincorporated nucleotides and ATP are then degraded by the apyrase enzyme and a new sequencing reaction cycle starts with the addition of a different nucleotide (Bassil, Huang and Murphy, 2013) (Figure 10).

The quantification of DNA methylation levels of a specific CpG is based on the peaks intensity for cytosine and thymine in the first base of the CpG dinucleotide: if a thymine is detected, the initial CpG site was unmethylated; whereas if a cytosine is present, the initial CpG was methylated. The intensity ratio between thymine and cytosine signals at each specific CpG leads to an accurate quantification of its methylation status. The main limitation of the method is that the lengths of

32

Introduction individual reads of DNA sequence are in the range of 150 to 300 bp, being the method of choice for interrogation of specific genomic loci (Izzi, Binder and Michels, 2014).

Figure 10. Pyrosequencing protocol. The DNA fragment is amplified and used as template for sequencing by synthesis. Doxyribonucleotide triphosphate (dNTP) is added to the complementary strand by the DNA polymerase, releasing a pyrophosphate (PPi), which is converted to ATP by the sulfurylase enzyme. This ATP drives the luciferase-mediated conversion of luciferine to generate visible light amounts proportional to the amount of dNTP added. The light peak is detected by the platform and determined as a peak for sequence analysis. The apyrase enzyme degrades the unincorporated nucleotides and ATP before starting the next sequencing cycle. Image courtesy of the Density Design Research Lab, Polytechnic University of Milan.

1.5.1.2. Whole-genome Bisulfite Sequencing

Whole genome bisulfite sequencing (WGBS) is the gold-standard approach to provide single base-pair resolution and quantitative information of methylated cytosines in the complete genome. In this method, genomic DNA is treated with sodium bisulfite, followed by library preparation and next-generation sequencing. The location of the methylated cytosines can then be determined by comparing bisulfite-treated and untreated sequences. Following this approach, not only the methylation status of millions of CpG sites of the genome can be assessed, but also methylation changes of cytosine residues outside the CpG context, allowing for an unbiased whole-genome coverage (Lister et al., 2009). Although it has been demonstrated to be a very powerful research tool, the associated high sequencing costs due to the relatively high coverage necessary to obtain accurate DNA

33

Introduction methylation estimates, together with the large computational power needed for the downstream bioinformatic analyses, continue to limit its widespread application (Ziller et al., 2015).

1.5.2. Identification and mapping of histone modifications

Chromatin immunoprecipitation (ChIP) is a widely used technique for identifying genomic regions associated with specific proteins within their native chromatin context. It has been used to map the genome-wide distribution of histone modifications, histone proteins and variants, histone chaperones, chromatin regulators, transcription factors and cofactors among other DNA binding proteins (Mundade et al., 2014). The standard ChIP protocol starts by formaldehyde crosslinking of intact cells to Figure 11. ChIP protocol. Schematic fix and preserve protein-DNA interactions, representation of the chromatin followed by cell lysis and chromatin immunoprecipitation (ChIP) assay possible methods for analysis. Figure reproduced from isolation. This chromatin is then (Collas, 2010). fragmented into approximately 100 to 300 bp fragments by sonication or nuclease digestion. Next, selective immunoprecipitation of protein-DNA complexes is performed using specific antibodies, followed by reversal of crosslinks and purification of the associated DNA. This immunoprecipitated DNA can be analyzed using various methods: 1) Standard or real time PCR, which allows the validation of locus-specific interactions for known targets of a given protein (ChIP-PCR). 2) Genomic microarray techniques, the so called ChIP-chip, allowing for genome-wide

34

Introduction identification of DNA-protein interactions, with the genome coverage limited by the microarray probe design. 3) High-throughput sequencing, termed ChIP-Seq, which produces whole-genome profiles of DNA-protein interactions with higher spatial resolution, dynamic range and genomic coverage, allowing it to have higher sensitivity and specificity than ChIP-chip (Ho et al., 2011) (Figure 11). The basic analysis of ChIP-Seq includes the reads alignment to the reference genome, followed by peak calling (dentification of genomic regions that have been enriched with aligned reads) and assignment of peaks to putative target genes. Due to its versatility, ChIP-Seq has been one of the key technologies used to generate reference epigenome maps in a wide range of epigenomic projects (Epigenomics Consortium et al., 2015; Yan et al., 2016; Beekman et al., 2018). However, although many different ChIP-Seq variations have been developed in recent years, this approach is still limited by several factors, including the requirement of large amounts of starting material, the lack of internal control needed for cross-sample comparison and the considerable computational effort needed for downstream bioinformatic analyses (Collas, 2010).

1.5.3. Characterization of chromatin accessibility

Whereas inactive genomic loci remain silenced in tightly packaged heterochromatin, active regulatory elements are located in euchromatic regions exposing nucleosome-free DNA loci accessible to the transcriptional regulatory machinery. The Assay for Transposase-Accessible Chromatin Sequencing (ATAC-Seq) is a recently developed technique used to identify accessible DNA regions genome- wide (Buenrostro et al., 2013). ATAC-Seq is based on a process called tagmentation: simultaneous fragmentation and tagging of a specific accessible genomic region. Tagmentation is carried out by a mutant hyperactive Tn5 transposase, which simultaneously cuts DNA loci of increased accessibility and ligates the adaptors used for high-throughput sequencing (Buenrostro et al., 2015) (Figure 12A). Adaptor- ligated DNA fragments are then purified (Figure 12B), sequenced and aligned to the

35

Introduction reference genome (Figure 12C). The number of reads for a region correlates with the accessibility of that chromatin locus at a single nucleotide resolution. This revolutionary technique allows the interrogation of genome-wide chromatin accessibility, but also gives an insight into TF occupancy (Tripodi, Allen and Dowell, 2018) and nucleosome positioning in regulatory sites (Schep et al., 2015).

Figure 12. ATAC-Seq protocol. A) Tagmentation reaction carried out by the Tn5 transposase (showed in purple). B) Adaptor-ligated DNA fragments (adaptors showed in pink and yellow) are purified and amplified by PCR if necessary. C) DNA fragments are then sequenced and aligned to the reference genome for ATAC peak identification. Image modified from (Milani et al., 2016).

1.5.4. Analysis of 3D structure of chromatin

Recent epigenomic studies have identified hundreds of thousands of putative regulatory regions throughout the genome, demonstrating that multiple regulatory elements can modulate the expression of a specific gene, most of them located at long chromosomal distances. Spatial chromatin organization brings these remote regulatory elements in physical contact with target promoters to regulate gene expression (Visel, Rubin and Pennacchio, 2009; Kulaeva et al., 2012). The relative proximity of distant genomic regulatory regions to each other defines the 3D structure of chromatin, which can be interrogated with chromosome conformation capture (3C) based technologies. All 3C-based technologies share the same initial steps. First, chromatin is cross-linked using formaldehyde to fix and preserve interactions between genomic loci. Next, the fixed chromatin is cut with a restriction

36

Introduction enzyme, followed by random ligation of the resulting fragments, preferentially between those that remain cross-linked. DNA fragments that are distant on the linear genome, but co-localize in space, can therefore be ligated to each other. To stablish the spatial conformation of a genomic locus, the number of ligation events between non-neighbor sites is determined by PCR amplification of the selected ligation junctions. In 3C experiments (also known as “one versus one”), interactions between a single pair of pre-chosen genomic partners is quantified by quantitative PCR (Dekker et al., 2002).

Figure 13. 4C-Seq protocol. Cross-linked chromatin is digested by a restriction enzyme and fixed fragments are ligated to each other. After reverse crosslinking, a second step of digestion and subsequent ligation is performed to obtain circularized DNA fragments. 4C primers are designed on the viewpoint fragment (red) allowing the PCR amplification of captured fragments (blue) and further analysis using next generation sequencing. Image modified from (Splinter et al., 2012).

Based on 3C principles, 4C technology (circularized chromosome conformation capture, also known as “one versus all”) captures interactions between one determined locus and all other genomic loci. Briefly, the ligated 3C template is digested with a second restriction enzyme followed by a subsequent ligation step, to create smaller DNA circular structures. A target locus or view point is defined for PCR

37

Introduction amplification of all sequences contacting this genomic region, which can be then analyzed by microarray platforms or next generation sequencing methods (4C-Seq) (Harmen J.G. van de Werken et al., 2012) (Figure 13). 4C-Seq can be used to study inter-chromosomal and remote intra-chromosomal interactions with high accuracy. In contrast with 3C technology, which is the method of choice to identify contacts between nearby regulatory sequences, the 4C protocol is more robust and unbiased, as it does not require the prior knowledge of both interacting genomic partners (Harmen J G van de Werken et al., 2012; Splinter et al., 2012).

High throughput sequencing technologies have also led to the development of the 5C (“many versus many”) and HiC (“all versus all”) technologies. Both techniques have been used genome-wide to reconstruct the average 3D topology of the cell “contactome”, by characterizing specific DNA contacts in the context of other genomic interactions (de Wit and de Laat, 2012; Denker and de Laat, 2016).

38

Introduction

2. Multiple Myeloma 2.1. Introduction to B cell development

B cell development is a complex and tightly regulated process, guided by the coordinated expression of different maturation-specific TFs and environmental signals (Matthias and Rolink, 2005; Kurosaki, Shinohara and Baba, 2010). Hematopoietic stem cells (HSCs) located in the bone marrow niche differentiate into common lymphoid progenitors, which then commit to the B cell lineage and give rise to precursor B cells. After immunoglobulin rearrangement, these precursor cells differentiate into Naïve B cells (NBCs), which leave the bone marrow niche to enter the bloodstream. NBCs migrate to the lymph nodes, where they can undergo T-cell dependent antigen activation, which induces the germinal center reaction. Germinal center B cells (GCBCs) are subjected to repeated rounds of proliferation, somatic hypermutation (SHM) and class-switch recombination (CSR) of their immunoglobulin genes, leading to clonal selection and terminal differentiation to plasma cells (PCs) and memory B cells (MBCs) (Mesin, Ersching and Victora, 2016). Mature PCs specialized in the production and secretion of high-affinity antibodies, migrate to the bone marrow where they can reside for extended periods of time; whereas MBCs recirculate through the blood and other lymphoid organs, setting the basis for humoral immunity (Suan, Sundling and Brink, 2017) (Figure 14).

Cell development is by definition an epigenetic process. In the particular case of B cell differentiation, coordinated regulation of genetic events and modulation of gene expression programs by chromatin changes results in de development of highly specialized cells (Reik, 2007; Epigenomics Consortium et al., 2015; Kulis et al., 2015). Alterations in such genetic events or chromatin patterns could lead to the development of B cell related immune disorders, such as autoimmunity, allergic diseases and B cell neoplasms (Portela and Esteller, 2010; Blumenthal, 2012; Beekman et al., 2018). B cell malignancies can originate from each different cell

39

Introduction maturation stage, giving rise to a large number of different B cell derived neoplasms, such as chronic lymphocytic leukemia, diffuse large B cell lymphoma, follicular lymphoma or MM among others. The biological and clinical features of each malignancy rely on the properties and functions of the different cells of origin, as well as the multiple molecular mechanisms underlying disease development and progression (Schroeder et al., 2015). Out of all B cell neoplasms, this doctoral thesis has been focused on MM, a B cell malignancy derived from terminally differentiated PCs.

Figure 14. B cell maturation process in the germinal center. The production of mature B cells occurs through a series of processes that take place in peripheral lymphoid organs. Naïve B cells migrate to the lymph nodes, where they can undergo T cell dependent antigen activation, which induces the germinal center reaction. These antigen-activated B cells further differentiate into centroblasts that undergo clonal expansion in the dark zone of the germinal center. During proliferation, the process of somatic hypermutation (SHM) introduces base-pair changes into the V(D)J region of the rearranged genes encoding the immunoglobulin variable region of the heavy chain and light chain. Centroblasts then differentiate into centrocytes and move to the light zone, where the modified antigen receptor, with help from T-cells and follicular dendritic cells (FDCs), is selected for improved binding to the immunizing antigen. Newly generated centrocytes that produce an unfavourable antibody undergo apoptosis and are removed. A subset of centrocytes undergoes immunoglobulin class-switch recombination (CSR). Antigen-selected centrocytes eventually differentiate into memory B cells or plasma cells, specialized in the production and secretion of high-affinity antibodies. Figure reproduced from (Klein and Dalla- Favera, 2008).

40

Introduction

2.2. MM clinical and biological features

MM is a clonal B cell neoplasm characterized by the uncontrolled expansion and accumulation of malignant PCs in the bone marrow. It is the second most common hematological malignancy after non-Hodgkin lymphoma, accounting for 13% of all hematological neoplasms (Kumar et al., 2017). MM is characterized by different stages of clonal evolution, with a preceding asymptomatic premalignant PC disorder named monoclonal gammopathy of undetermined significance or MGUS, which can evolve to asymptomatic (or smoldering, SMM) MM with a progression rate of 10% per year for the first 5 years, and finally to symptomatic MM with a risk of progression of 1% per year (Kyle et al., 2007; Landgren et al., 2009; Korde, Kristinsson and Landgren, 2011). Extensive studies exploring the molecular pathogenesis and biological features of MM have revealed a complex and heterogeneous disease, with increasing evidence pointing towards MM not being a single disease, but a collection of genetic diseases with a common clinical phenotype (Kumar et al., 2017). In most patients, MM is characterized by the secretion of a monoclonal immunoglobulin (termed M protein) and different cytokines produced by malignant PCs, which accumulate in peripheral organs driving the main clinical manifestations of this disease (collectively known as CRAB features): hypercalcemia, renal insufficiency, anemia, bone disease and lytic lesions (Rajkumar et al., 2014). Despite intense efforts over recent years and the introduction of new therapies (Maes et al., 2013), MM remains an incurable disease, with an overall survival of approximately 5 years due to a high rate of relapses and the frequent development of drug resistance.

2.3. Genetic alterations and mutational landscape of MM

Malignant transformation of MM is a heterogeneous multistep process associated with the accumulation of varied molecular alterations that impact cellular function within the tumor and its microenvironment. Numerous genetic alterations

41

Introduction have been proposed as driving events in myelomagenesis. The primary cytogenetic aberrations involved in MGUS and MM development are hyperdyploidy (displaying multiple trisomies of odd-numbered chromosomes) and translocations of the immunoglobulin heavy chain (IGH) locus located in chromosome 14. These translocations arise from the errors occurring during CSR and SHM in B cell differentiation within the germinal canter and involve the deregulation of specific oncogenes, such as WHSC1/FGFR3 in t(4;14), CCND1 in t(11;14), MAF in t(14;16), MABF in t(14;20) or CCND3 in t(6;14) (Walker et al., 2013; Kumar et al., 2017). Along with the mentioned cytogenetic aberrations, other secondary genetic events have been demonstrated to contribute to MM pathogenesis. On one hand, copy number abnormalities, including del(1p), amp(1q), del(13q) and del(17p), and MYC oncogene rearrangements (Walker et al., 2010, 2015). On the other hand, a wide variety of genetic mutations occurring in small subsets of patients, like KRAS, NRAS, BRAF, FAM46C, DIS3 and TP53 mutations, with frequencies ranging from 8% to 23% of MM patients (Bolli et al., 2014; Lohr et al., 2014; Walker et al., 2015) (Figure 15). Some of these abnormalities have also an impact on risk stratification, such as hyperdyploidy and t(11;14) associated with standard risk, amp(1q) or t(4;14) with intermediate risk and t(14;16) or del(17p) which are associated with high risk or worse prognosis (Braggio, Kortüm and Stewart, 2015; Walker et al., 2018).

All these genetic alterations ultimately result in deregulation of gene expression patterns in MM. However, this aberrant genetic landscape is insufficient to explain MM biological complexity. Therefore, numerous epigenomic analyses of MM cells have been published over the last years trying to shed light on the epigenetic mechanisms underlying myelomagenesis.

42

Introduction

Figure 15. MM pathogenesis and disease evolution. Schematic representation of B cell development (left) leading to plasma cell differentiated cells. Early genetic events (hyperdiploidy and IGH translocations) drive abnormal plasma cell clonal proliferation in MGUS patients. Secondary genetic events increase genomic instability, contributing to SMM and symptomatic MM progression. Further genetic aberrancies lead to bone marrow microenvironment independence and development of extramedullary MM and plasma cell leukemia (PCL). Figure reproduced from (Braggio, Kortüm and Stewart, 2015).

2.4. Epigenetic features of MM

It is widely accepted that upon genetic alterations, epigenetic abnormalities are key factors for the initiation and progression of cancer, and particularly MM (Pawlyn et al., 2014; Amodio et al., 2017). Regarding DNA methylation, the progression from MGUS to MM seems to be associated with widespread DNA hypomethylation, generally associated with genomic instability and chromosomal rearrangements (Salhia et al., 2010; Walker et al., 2011; Heuck et al., 2013; Kaiser et al., 2013; Agirre et al., 2015). Remarkably, MM patients show very heterogeneous DNA methylomes, which range from globally hypo- to hypermethylated genomes. Moreover, DNA hypermethylation in MGUS and MM is preferentially located outside CGIs, overlapping with intronic regions and enhancer-related chromatin patterns in normal B cells (Agirre et al., 2015). This aberrant DNA methylation landscape in MM has also been associated with transcriptional silencing of specific tumor suppressor genes, which might play a role in MM patient outcome (Kaiser et al., 2013; Paiva et al.,

43

Introduction

2017; Yu et al., 2017). The mechanisms underlying such epigenetic deregulation are still controversial. Although recent studies report an altered expression of DNMTs in MM, it is currently unknown if this could led to the alteration of DNA methylation patterns or is simply a marker of increased cell proliferation (Dimopoulos, Gimsing and Grønbaek, 2014).

Regarding abnormal patterns of histone modifications, mutations of histone modifying enzymes have been reported in MM, including UTX, MLL, MLL2, MLL3 and WHSC1, which in turn result in the upregulation of the oncogene HOXA9 (Chapman et al., 2011; Pawlyn et al., 2016). Remarkably, loss of UTX (KDM6A, which demethylates H3K27) occurs in a variety of cancers (Van der Meulen, Speleman and Van Vlierberghe, 2014) including MM, where inactivating mutations or deletions of UTX are found in ~10% of primary MM patients (van Haaften et al., 2009). Loss of UTX promotes MM progression and dissemination, by altering the transcriptional profile of MM cells contributing to the malignant phenotype. In UTX-defficient MM patients, rebalancing the H3K27me3 modulatory patterns with the use of EZH2 (H3K27 methyltransferase) inhibitors allows to reactivate tumor suppressor programs, ultimately leading to MM cell death (Ezponda et al., 2017). Recent studies have described that not only mutations of the epigenetic machinery, but also its transcriptional deregulation could lead to an aberrant neoplastic phenotype. Indeed, EZH2 overexpression in MM leads to the increased silencing of H3K27me3 targets independently of UTX loss in MM patients at advanced stages of this disease, correlating with poor patient survival (Agarwal et al., 2016; Pawlyn et al., 2017). Moreover, the overexpression of H3K79 methyltransferase DOT1L has also been demonstrated to alter the epigenetic landscape of MM patients and deregulate the expression of IRF4 and MYC genes, required for MM cell survival (Ishiguro et al., 2018). These results highlight the potential of epigenetic therapies in MM as a therapeutic strategy to restore the epigenetic balance in MM cells.

44

Introduction

However, the most characteristic chromatin abnormality in the context of MM is the upregulation of WHSC1 in all cases with t(4;14), accounting for approximately 15% of MM patients, which is associated with intermediate risk phenotype (Keats et al., 2005; Braggio, Kortüm and Stewart, 2015; Walker et al., 2018). WHSC1 is a ubiquitously expressed histone methyltransferase, which catalyzes H3K20 trimethylation, H3K36 dimethylation and also enhances the function of HDAC1, HDAC2 and histone demethylase LSD1 (Marango et al., 2008; Martinez-Garcia et al., 2011). The result of WHSC1 overexpression is a global alteration of histone methylation patterns, with accumulation of H3K36me2 and widespread reduction of H3K27me3, leading to transcriptional activation of oncogenic loci (Mirabella et al., 2013). This aberrant phenotype has been demonstrated to drive proliferation, clonogenicity and invasiveness of MM tumor cells (Ezponda and Licht, 2014), highlighting the relevance of the epigenetic regulation in MM development.

Noteworthy, a genome-wide characterization of the epigenomic regulatory landscape of MM has been recently published (Jin et al., 2018). Independently of their genomic background, MM cells undergo consistent changes in enhancer activity, mostly related to the activation of superenhancers which mediate upregulation of TF genes key for MM development. The analysis of the TF regulatory networks in these cells indicates that IRF4 and FLI1 are key players in the establishment and modulation of the MM enhancer regulatory program.

Taken together, these studies suggest that the characterization of the epigenetic landscape in cancer is currently an area of great interest, as there is increasing appreciation of its relevance in the neoplastic transformation. Nevertheless, the MM epigenome, as defined by the guidelines of the International Human Epigenome Consortium (IHEC, http://ihec- epigenomes.org/research/ reference-epigenome- standards/) remains to be characterized. An integrative multilayered epigenetic study of MM patients could shed light into the mechanisms underlying myelomagenesis

45

Introduction and lead to the identification of specific targets and epigenetic drivers for this disease, both potential advances in the field of personalized therapy and precision medicine.

46

HYPOTHESIS AND OBJECTIVES

Hypothesis and objectives

Over recent years, intense efforts have been focused on unraveling the molecular mechanisms underlying MM onset and progression. However, previous reports describing an abnormal genomic and epigenetic landscape of MM patients have not been fully capable of explaining the pathogenesis of the disease.

An integrative multilayered epigenomic characterization of MM patients could provide a global view of the genome function and regulation, shedding light into the core mechanisms underlying myelomagenesis, and leading to the identification of specific targets and epigenetic drivers for this disease.

In order to elucidate this hypothesis, we have established the following principal objectives:

1. To characterize in detail the reference epigenome of MM patients in the context of B cell differentiation. 2. To detect, characterize and annotate epigenetic regulatory elements that could represent MM-specific epigenetic drivers. 3. To identify potential therapeutic targets and rational therapeutic approaches based on the chromatin landscape of MM cells.

49

MATERIALS AND METHODS

Materials and Methods

1. Primary samples

All patients and donors gave informed consent for their participation in this study following the International Cancer Genome Consortium (ICGC) guidelines and the ICGC Ethics and Policy committee. This study was approved by the clinical research ethics committee of Clínica Universidad de Navarra.

1.1. Isolation of MM patient samples

Purified PCs were selected from bone marrow aspirations from newly diagnosed patients of MM before administration of any treatment. Cells were filtered (40 µm mesh size) to remove bone fragments or cell clumps. In order to select the optimal patient samples for further processing, the number of total leukocytes and percentage of CD138+ cells was determined by flow cytometry, discarding those samples with less than 106 total neoplastic CD138+ cells. Prior to CD138+ selection, the sample was incubated with FACS Lysing Solution 1:10 (BD Biosciences) for 10 minutes, and washed with AutoMACS Buffer solution (PBS, 0.5% BSA and 2 mM EDTA). For positive CD138+ selection, cells were resuspended in AutoMACS Buffer solution (40 μl / 10·106 cells) and labeled with MACSprepMultiple Myeloma CD138 MicroBeads (10 μl / 10·106 cells) (Miltenyi Biotech) for 15 minutes at 4 oC. CD138+ cells were isolated by positive magnetic separation using AutoMACS system (AutoMACS Pro Separator, Miltenyi Biotech). After isolation, the purity of PCs was verified by flow cytometry, obtaining over 90% purity in all cases. Samples were cryopreserved at -80 oC in fetal bovine serum (FBS) with 10% DMSO.

1.2. Isolation of normal B cell subpopulations

Different B cell subpopulations used as control samples in this study were obtained from peripheral blood (pb-), tonsils (t-) or bone marrow (bm-) aspirations

53

Materials and Methods from healthy donors. The purity of each of the isolated B cell subpopulations exceeded 90% in all samples.

1) Isolation of B cell subpopulations from peripheral blood

Naïve B cells (pb-NBC) and memory B cells (pb-MBC) were isolated from buffy coats obtained from adult healthy donors (between 26 and 66 years). After Ficoll- Isopaque density centrifugation, CD19+ cells were isolated by positive magnetic cell separation using AutoMACS system (Milteny Biotec). Purified CD19+ cells were labeled with anti-CD27, anti-IgD, anti-IgM, anti-IgG and anti-IgA for 15 min at room temperature in staining buffer (PBS with 0.5% BSA). pb-NBCs (CD19+ / CD27- / IgD+) and pb-MBCs (CD19+ / CD27+ / IgA+ or IgG+) were obtained by fluorescence activated cell sorting (FACS) on FACSAriaII (BD Biosciences).

2) Isolation of B cell subpopulations from tonsils

Naïve B cells (t-NBC), germinal center B cells (t-GCBC) and plasma cells (t-PC) were isolated from tonsils of children undergoing tonsillectomies (ranging in age between 2 and 13). Tonsils were minced extensively and after Ficoll-Isopaque density centrifugation, enrichment of B cells was performed with the AutoMACS system either by positive selection of CD19+ cells or by the B Cell Isolation Kit II (Milteny Biotec). t-PCs (CD20med / CD38high), t-GCBCs (CD20high / CD38med) and t-NBCs (CD20+ / CD23+) were separated by FACS on FACSAriaII (BD Biosciences).

3) Isolation of B cell subpopulations from bone marrow

Purified PCs from bone marrow (bm-PC) were selected from bone marrow aspirations from healthy donors ranging from 20 to 30 years. After Ficoll-Isopaque density gradient centrifugation, we performed a selective depletion of CD3+, CD14+ and CD15+ cells by immunomagnetic selection (Miltenyi Biotec), followed by FACS of CD45+ / CD138+ / CD38+ cells using a FACSAriaII (BD Biosciences) device.

54

Materials and Methods

The whole set of samples isolated and processed in this study are summarized in Table 1.

Table 1. Samples and procedures involved in the study. Sample H3K4me3 H3K4me1 H3K27ac H3K36me3 H3K9me3 K3K27me3 ATAC-Seq ssRNA-Seq WGBS t-NBC 3 3 3 3 3 3 3 8 - pb-NBC 3 3 3 3 3 3 3 3 3 t-GCBC 3 3 3 3 3 3 3 7 3 pb-MBC 3 3 3 3 3 3 6 8 3 t-PC 3 3 3 3 3 3 3 8 3 bm-PC - - 1 - - - - 3 2

MM 4 10 14 4 4 4 17 40 5

2. Chromatin immunoprecipitation with high-throughput sequencing (ChIP-Seq)

For normal B cells subpopulations, in order to fix the protein-DNA complexes and ensure an appropriate immunoprecipitation, cells were cross-linked with 1% formaldehyde during 8 minutes at room temperature and quenched with 125 mM glycine for 2 minutes at room temperature. The crosslinking was done in between the isolation by AutoMACS and the separation by FACS.

For MM purified plasma cells, samples were defrosted in RPMI medium with 10% FBS, crosslinked with 1% formaldehyde during 16 minutes at 4 oC and quenched with 125 mM glycine for 10 minutes.

Once crosslinked, cells were lysed in lysis buffer (20 mM Hepes [pH 7.6], 1% SDS and 1:100 protease inhibitor cocktail) at a final concentration of 15·106 cells/ml. DNA was fragmented by sonication with the Bioruptor Sonication System (Diagenode) for 3 x 10 minutes [30 seconds ON; 30 seconds OFF] at 4 oC.

55

Materials and Methods

Before starting chromatin immunoprecipitation, we checked sonication efficiency and protocol yield. Chromatin was then de-crosslinked in a buffer containing 33.3 μl of chromatin (from 500.000 cells), 366.7 μl elution buffer (1% SDS, 0.1 M NaHCO3), 5 M NaCl and 4 μl Proteinase K (10 mg/ml) and shaken at 1,000 rpm for 1 hour at 65 oC. DNA was purified using a DNA kit (QIAquick MinElute PCR Purification Kit, Qiagen) following manufacturer’s instructions. The length of the fragments obtained was tested by the Agilent 2100 Bioanalyzer (with an appropriate size of the fragments between 150 bp to 600 bp) and the concentration of the samples was measured with Qubit Fluorometer (Termo Fisher Scientific).

Chromatin immunoprecipitation (ChIP) was then performed from 30 μl of total chromatin from 500.000 cells in a buffer containing 0.05% SDS, 1% Triton, 150 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 10 mM Tris [pH 8], 1x protease inhibitor cocktail, 1% BSA and the amount of antibody detailed in Table 2. Samples were rotated overnight at 4 oC. To immunoprecipitate the chromatin / antibody complexes we used 10 μl of protein A and 10 μl of protein G magnetic beads per sample and rotated them at 4 oC for 90 minutes. Beads were then washed for 5 minutes at 4 oC using the following buffers: Wash1 buffer (0.1% SDS, 0.1% NaDOC, 1% Triton, 0.15 M NaCl, 1 mM EDTA, 0.5 mM EGTA, 10 mM Tris [pH 8]); Wash2 buffer (0.1% SDS, 0.1% NaDOC, 1% Triton, 0.5 M NaCl, 1 mM EDTA, 0.5 mM EGTA, 10 mM Tris [pH 8]) and Wash3 buffer (0.25 M LiCl, 0.5% NaDOC,0.5% NP-40, 0.5 M NaCl, 1 mM EDTA, 0.5 mM EGTA, 10 mM Tris [pH 8]). Samples were eluted from the beads by adding 200 μl of elution buffer (1% SDS, 0.1 M NaHCO3) shaking at 1,400 rpm for 20 minutes at 65 oC. DNA was then de-crosslinked by adding 8 μl of NaCl 5 M and 2 μl of Proteinase K (10 mg/ml) shaking at 1,000 rpm for 4 hours at 65 oC. Finally, DNA was purified using a DNA kit (QIAquick MinElute PCR Purification Kit, Qiagen) following the manufacturer’s instructions.

56

Materials and Methods

Library construction for Next Generation Sequencing (NGS) was performed using the KAPA Hyper Prep Kit (Kapa Biosystems) which provides a versatile, streamlined protocol for the rapid construction of libraries for Illumina sequencing from fragmented, double-stranded DNA (dsDNA). Briefly, the workflow includes an end repair step, followed by A-tailing of the DNA fragments for adapter ligation. If necessary, the library was amplified by PCR using the recommended cycle numbers to minimize over-amplification. Finally, we used E-Gel SizeSelect Agarose Gels (Thermo Fisher Scientific) for the selection of DNA fragments with a fragment length of 300 bp. Libraries were pooled at equimolar ratios and sequenced on a HiSeq2500 (Illumina). Amounts of sequence reads per condition are shown in Table 3.

Table 2. ChIP-Seq antibodies.

Antibody Reference Commercial Source μg / reaction

H3K27ac pAB-196-050 Diagenode Rabbit 1

H3K4me1* A1863-001P Diagenode Rabbit 0.2

H3K4me3 pAB-003-050 Diagenode Rabbit 1

H3K36me3 pAB-192-050 Diagenode Rabbit 0.5

H3K9me3 pAB-193-050 Diagenode Rabbit 0.5

K3K27me3 pAB-195-050 Diagenode Rabbit 1 *Not yet for sale

3. Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-Seq)

ATAC-Seq was performed starting from 100,000 sorted cells from the different B cell subpopulations and MM plasma cells (Table 1). Cells were resuspended in 25 μl of nuclei lysis buffer (10 mM Tris-HCl [pH 7.4], 10 mM MgCl2, 0.1% IGEPAL CA-630) and immediately centrifuged at 500 g for 20 minutes at 4 oC. Supernatant was removed and nuclei were resuspended in 25 μl of Tagmentation Reaction Mix (10

57

Materials and Methods

mM Tris-HCl [pH 8.4] and 25 mM MgCl2) and 2 μl of home-made Tn5. Tagmentation was incubated shaking at 500 rpm for 1 hour at 37 oC. The tagmentation reaction was stopped by adding 5 μl of clean up buffer (900 mM NaCl and 300 mM EDTA), 2 μl of 5% SDS and 2 μl of Proteinase K (10 mg/ml) and incubated for 30 minutes at 40 °C. DNA was isolated with SPRI beads (Agencourt AMPure XP Beads, Beckman Coulter) following a positive 2x SPRI-selection. Tagmented DNA was eluted and sample concentration was determined by Qubit Fluorometer (Termo Fisher Scientific).

PCR amplification was performed by combining the tagmented DNA with 2 μl each of i5 and i7 dual indexing primers (Nextera Indexing Kit, Illumina) or 2 μl of home-made primers (10 μM stock) and KAPA HiFi HotStart ReadyMix (Kapa Biosystems). An initial amplification was performed using the following cycle conditions: 1 cycle of 98 °C for 2 minutes, 4 cycles of 98 °C for 20 seconds, 63 °C for 30 seconds, and 72 °C for 1 minute. PCR product was cleaned with SPRI beads (Agencourt AMPure XP Beads, Beckman Coulter) following a negative 0.65x SPRI- selection. To estimate the number of additional PCR cycles required, samples were quantified by Qubit Fluorometer (Termo Fisher Scientific). The second PCR was set up with the same conditions as the first one. To remove primers and high molecular weight DNA following PCR amplification, a positive 2x SPRI-selection was performed. The quality of the amplified library was determined by Qubit Fluorometer (Termo Fisher Scientific) and TapeStation (Agilent 4200 TapeStation System, Agilent Technologies). Libraries were pooled at equimolar ratios and sequenced on a HiSeq2500 (Illumina). Amounts of sequence reads per condition are shown in Table 3.

58

Materials and Methods

Table 3. Mean of million sequence reads per condition

H3K27ac H3K4me1 H3K4me3 H3K36me3 H3K27me3 H3K9me3 ATAC-Seq

t-NBC 28.0 31.9 24.2 33.4 36.5 25.4 17.2 pb-NBC 27.4 44.7 26.3 44.4 52.7 26.3 20.2 t-GCBC 29.3 35.2 23.5 40.3 37.6 29.4 23.5 pb-MBC 28.1 32.1 20.2 41.0 40.6 24.2 16.4 t-PC 24.0 32.1 31.3 37.5 39.3 23.5 15.4

MM 19.2 40.3 25.9 40.3 40.4 40.0 17.7

4. Whole Genome Bisulfite Sequencing (WGBS)

For WGBS, 1-2 μg of genomic DNA was sheared by sonication to 50-500 bp in size using a Covaris E220 sonicator and fragments of 150–300 bp were selected using SPRI beads (Agencourt AMPure XP Beads, Beckman Coulter). Genomic DNA libraries were constructed using the Illumina TruSeq Sample Preparation kit following Illumina's standard protocol: end repair was performed on the DNA fragments, where an adenine was added to the 3′ end of each fragment and Illumina TruSeq adaptors were ligated to both ends. After adaptor ligation, DNA was treated with sodium bisulfite using the EpiTexy Bisulfite kit (Qiagen), following the manufacturer's instructions. Two rounds of bisulfite conversion were performed to ensure a conversion rate over 99%. Enrichment for adaptor-ligated DNA was carried out though seven PCR cycles using PfuTurboCx Hot-Start DNA polymerase (Stratagene). The quality of the amplified library was determined by Qubit Fluorometer (Termo Fisher Scientific) and TapeStation (Agilent 4200 TapeStation System, Agilent Technologies). Libraries were then pooled at equimolar ratios and sequenced on a HiSeq2000 (Illumina) using 100 bp paired-end reads. Filtering those CpGs with at least 10 reads in all samples, a total of 9,214,561 CpGs were called for further analysis.

59

Materials and Methods

5. RNA sequencing (RNA-Seq)

Transcriptomic analyses were performed following two different approximations:

1. Strand specific RNA sequencing (ssRNA-Seq) for deep transcriptome profiling of samples from the reference epigenome series and the validation series. 2. Low input 3’ end RNA sequencing for characterization of transcriptional changes in samples from PRDM5 knockdown studies.

5.1. Strand Specific RNA-Seq

For ssRNA-Seq, total RNA was isolated using TRIzol (Life Technologies) and libraries were prepared using TruSeq Stranded Total RNA Kit with Ribo-Zero Gold (Illumina) (Figure 16) following the manufacturer’s instructions. Briefly, the workflow starts by the removal of ribosomal RNA (rRNA) and mitochondrial RNA (mitRNA) using biotinylated target-specific oligos combined with Ribo-Zero Gold removal beads. Following purification, the RNA was then fragmented by divalent cations under elevated temperature. From cleaved RNA, first strand cDNA was synthesized by reverse transcription followed by second strand cDNA synthesis using DNA polymerase I and RNase H. The 3’ end of these double stranded cDNA products was adenylated for the following adapter ligation. Finally, the library was amplified by 13 cycles of PCR and the quality was determined by Qubit Fluorometer (Termo Fisher Scientific) and TapeStation (Agilent 4200 TapeStation System, Agilent Technologies). Libraries were pooled at equimolar ratios and sequenced on a HiSeq2500 (Illumina) using 100 bp single-end reads for the samples of the reference epigenome and 50 bp paired-end reads for the samples in the validation series, obtaining a mean of 75 million reads per sample.

60

Materials and Methods

Figure 16. Strand Specific RNA Sequencing. General flow chart for the Illumina TruSeq Stranded Total RNA Kit protocol. Image rreproduced from Abmgood.

5.2. Low input 3’ end RNA-Seq

Low input 3’ end RNA-Seq was performed as previously described (Jaitin et al., 2014) (Figure 17). Briefly, RNA from 100,000 cells from each experimental condition was purified using Dynabeads mRNA DIRECT Purification Kit (Thermo Fisher Scientific). For barcoded cDNA generation, polyadenilated RNA was retrotranscribed using Affinity Script cDNA Synthesis Kit (Agilent Technologies) and barcoded primers harboring also a T7 promoter and a partial Read2 sequence for Illumina sequencing (CGATTGAGGCCGGTAATACGACTCACTATAGGGGCGACGTGTGCTCTTCCGATCTXXXXXXN NNNTTTTTTTTTTTTTTTTTTTT, where XXXXXX is the cell barcode and NNNN is a

61

Materials and Methods unique molecular identifier or UMI, which represents the same initial RNA molecule for further quantification). At this point samples were pooled (up to 6 samples per pool) at equimolar ratios assessed by qPCR.

Figure 17. Low input 3' end RNA-Seq. Schematic flow chart of the experimental steps for RNA tagging, pooling, amplification, fragmentation and final library construction. Colored lines represent RNA (blue) or DNA (black) molecules, oligos and primers. NT20: 20-mer poly-T oligonucleotide; UMI: unique molecular identifier; rd2: Illumina primer sequence Read2; rd1: Illumina primer sequence Read1; rev: reverse; P5 and P7: Illumina adapters P5 and P7. Figure reproduced from (Jaitin et al., 2014).

62

Materials and Methods

Samples wee then treated with Exonuclease I (Thermo Fisher Scientific) following a 1.2x positive SPRI-selection. Second strand cDNA synthesis was then performed using NEBNext Ultra II Directional RNA Second Strand Synthesis Module (New England Biolabs) for 2 hours at 16 oC, followed by a 1.4x positive SPRI clean-up. The samples were linearly amplified by T7 in vitro transcription using a T7 RNA Polymerase (New England Biolabs) for 16 hours at 37 oC. After DNase treatment (TURBO DNse I, Thermo Fisher Scientific) for 15 minutes at 37 oC to remove dsDNA, samples were cleaned with a positive 1.2x SPRI-selection. RNA size, quality and concentration were determined by Qubit Fluorometer (Termo Fisher Scientific) and TapeStation (Agilent 4200 TapeStation System, Agilent Technologies). If necessary, RNA was fragmented in 200 to 300 bp fragments using Zn2+ divalent cations followed by a 2x positive SPRI clean-up. Fragmented RNA was dephosphorylated using FastAP Thermosensitive Alkaline Phosphatase (Thermo Fisher Scientific) for 10 minutes at 37 oC to enhance RNA / ssRNA ligation yield. In this step a partial Read1 (AGATCGGAAGAGCGTCGTGTAG) sequence for Illumina sequencing was ligated to the 3’ end of the fragmented RNA using a T4 RNA Ligase I (New England Biolabs). After a 1.5x positive SPRI clean-up, the RNA product was retrotranscribed using Affinity Script cDNA Synthesis Kit (Agilent Technologies) and a specific retrotranscription primer (TCTAGCCTTCTCGCAGCACATC). The library was then purified by 1.5x positive SPRI clean-up and the yield of the previous steps was determined by qPCR. Finally, a PCR using KAPA HiFi HotStart Ready Mix (Kapa Byosistems) was performed for second strand cDNA synthesis, Illumina primers addition (Read1-P5: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCT CTTCCGATCT and Read2-P7: CAAGCAGAAGACGGCATACGAGATGTGACTGGAGTTCAGACGTGTGCTCTT CCGATCT) and library amplification, using the recommended cycle numbers to minimize over-amplification. The product was purified by a 0.7x positive SPRI clean- up. The quality of the final library was determined by qPCR (based on the threshold cycle, or Ct, obtained for GAPDH (Fw: CCAGCAAGAGCACAAGAGGAA, Rv:

63

Materials and Methods

GATTCAGTGTGGTGGGGG) or GUSβ (Fw: CAAGTGCCTCCTGGACTGTT, Rv: TCCACCTTTAGTGTTCCCTGC) housekeeping genes calibrated per individual set of samples), Qubit Fluorometer (Termo Fisher Scientific) and TapeStation (Agilent 4200 TapeStation System, Agilent Technologies). All libraries were pooled at equimolar ratios and sequenced on a NextSeq500 (Illumina) using 60 bp single-end reads,

obtaining a mean of 11 million reads per sample.

6. 4C-Seq

We performed 4C-Seq for 3 different the NDNF promoter viewpoint (chr4:121,070,660-121,071,025). Chromatin templates were prepared from 107 cells of JVM-2 and U266 cell lines as previously described (Simonis, Kooren and de Laat, 2007; Harmen J G van de Werken et al., 2012). Cells were cross-linked with 2% formaldehyde for 10 min at room temperature, nuclei were isolated and cross-linked DNA was digested with a primary restriction enzyme recognizing a 4-bp restriction site, followed by proximity ligation after which cross-links were removed. A secondary restriction enzyme digestion was performed with a 4-bp restriction enzyme recognizing a different sequence than the primary enzyme, followed again by proximity ligation. For each sample, 200 ng of the resulting 4C template was used for the subsequent PCR reaction. PCR products were purified using the High Pure PCR Product Purification Kit (Roche). Up to 16 samples (total: 3.2 μg of 4C template) were pooled and purified for next-generation sequencing. Restriction enzyme and primers used are listed in Table 4.

64

Materials and Methods

Table 4. 4C-Seq restriction enzymes and primer sequences. Restriction Target region Target coordinates Primer sequences (5´- 3´) Product (bp) enzymes

NDNF promoter chr4:121,070,660-121,071,025 DpnII / BfaI TTGCTTCTCATCTGTCGATC 365

CAGAAAGGTGAACCGAGAG

7. Bioinformatics 7.1. Read mapping and data processing

Fastq files of ChIP-Seq data were aligned to genome build GRCh38 (using bwa 0.7.7, PICARD and SAMTOOLS) and wiggle plots were generated (using PhantomPeakQualTools) as described (http://dcc.blueprintepigenome.eu/#/md/ methods). Peaks of the histone mark data were called as described (http://dcc.blueprintepigenome.eu/#/md/methods) using MACS2 (version 2.0.10.20131216) relative to input control.

ATAC-Seq fastq files were aligned to genome build GRCh38 using bwa 0.7.759 (parameters: -q 5, -P, -a 480) and SAMTOOLS v1.3.160 (default settings). BAM files were sorted and duplicates were marked using PICARD tools v2.8.1 (http://broadinstitute.github.io/picard, default settings). Finally, low quality and duplicate reads were removed using SAMTOOLS v1.3.160 (parameters: -b, -F 4, -q 5, - b, -F 1024). ATAC-Seq peaks were determined using MACS2 (v2.1.1.20160309, parameters: -g hs -q 0.05 —keep-dup all -f BAM –nomodel –shift − 96 –extsize 200) without input control.

For downstream analysis, peaks with p values < 10-5 (H3K36me3, H3K9me3 and H3K27me3) or < 10-9 (H3K27ac, H3K4me1, H3K4me3 and ATAC-Seq) were included. For each mark a set of consensus peaks, only including regions on chromosomes 1– 22, present in the normal B cells (n = 15 biologically independent samples for histone marks and n = 18 biologically independent samples for ATAC-Seq) and in the MM

65

Materials and Methods samples (n = 4 biologically independent samples for the reference epigenomes, n = 14 biologically independent samples for the extended H3K27ac series, n=10 biologically independent samples for H3K4me1 and n = 17 biologically independent samples for the extended ATAC-Seq series) was generated by merging the locations of the separate peaks per individual sample. To generate the consensus peak file, only peaks with input were used if available. For the histone marks the number of reads per sample per consensus peak was calculated using the genomecov function of bedtools. For the ATAC-Seq the number of insertions of the Tn5 transposase per sample per consensus peak was calculated by first determining the estimated insertion sites (shifting the start of the first mate 4 bp downstream), followed by the genomecov function of bedtools. Using DEseq2, variance stabilized transformed (VST) values were calculated for all consensus peaks (H3K27ac and ATAC-Seq data of extended MM series) or for the peaks that were present in > 1 sample (reference epigenome data). The numbers of consensus peaks for the reference epigenome analyses, for which VST values were calculated, were: 125,026 (H3K27ac), 60,494 (H3K4me1), 72,848 (H3K4me3), 33,907 (H3K36me3), 61,142 (H3K9me3), 37,499 (H3K27me3) and 144,535 (ATAC-Seq). Principal component analyses (PCAs) were generated with the prcomp function in R using the (corrected) VST values of all peaks that were present in > 1 sample.

RNA-Seq data of the reference epigenomes and the fastq files of the 59 samples (n = 37 MM patients, n = 22 normal B cells) mined from a previous study were aligned to genome build GRCh38; signal files were produced and gene quantifications (gencode 22) were calculated as described (http://dcc.blueprint- epigenome.eu/#/md/methods) using the GRAPE2 pipeline with the STAR-RSEM profile (adapted from the ENCODE Long RNA-Seq pipeline). The expected counts and fragements per kilobase million (FPKM) estimates were used for downstream

66

Materials and Methods analysis. The PCA of the RNA-Seq data was generated with the prcomp function in R using log10 transformed FPKM (+ 0.01 pseudocount) data.

Mapping and determination of DNA methylation estimates were performed as described (http://dcc.blueprint-epigenome.eu/#/md/methods) using GEM3.0. Per sample, only DNA methylation estimates of CpGs with ten or more reads were used for downstream analysis. The PCA of the DNA methylation data was generated with the prcomp function in R using methylation estimates of 9,214,561 CpGs located in chr1 to 22 with available DNA methylation estimates in all analyzed samples.

7.2. Detection of differentially methylated CpGs, genomic regions and expressed genes

For individual histone marks and ATAC-Seq data, only consensus regions present in at least 1 and in a maximum of all but 1 sample were used; that is, excluding individual specific and constitutive regions. Of the included consensus peaks, those differentials among the six different subgroups (MM and five normal B cell subpopulations) were defined using the likelihood ratio test (false discovery rate; FDR < 0.01) of the DEseq2 package.

For the ssRNA-Seq dataset, Limma analysis was performed in order to obtain differentially expressed protein coding genes (FDR < 0.05, |FC| > 1.5) between MM samples and all B cell subpopulations. The threshold to consider a gene as unexpressed was set to 0.1 FPKM.

For the low input 3’ end RNA-Seq dataset, only genes that were expressed (sum of FPKM values in all samples greater than 10) were included. Differential expressed genes were defined for each time point (2 days and 4 days) using the likelihood ratio test of DEseq2 package (|FC| > 1.5 and p < 0.05 for 2 days; |FC| > 1.5 and p < 0.01 for 4 days).

67

Materials and Methods

7.3. Detection of de novo activated regions in MM

To stringently identify de novo active regions, several filters were applied to the previously detected set of regions with differential gain of H3K27ac in MM versus B cells (n=12,195) (Figure 18). First, the H3K27ac profile of bm-PC from one healthy donor was generated and the peaks shared between MM and its normal counterpart were filtered out. Next, the percentages of H3K27ac peak occupancy within each differential region were calculated and those regions having less than 20% of H3K27ac in normal B cells while showing a consistently present H3K27ac gain in MM (FDR < 0.05) were selected. Finally, in order to identify exclusively the regions with a putative functional impact on MM biology, the previously selected de novo H3K27ac regions were annotated to all the genes lying within the same topological associated domain (TAD, data from lymphoblastoid cell line, GM12878). Gene expression modulation was analyzed by comparing RNA-Seq data from 37 MM cases and 22 normal mature B cell subtypes samples, including 3 healthy bm-PCs. Only the genes overexpressed in MM as compared to all normal B cells (Limma, FDR < 0.05) and with clearly higher expression in bm-PC (|FC| > 1.5) were filtered, as an additional condition. Finally, 1,059 genes corresponding to 1,556 regions with de novo gain of H3K27ac in MM were selected. In parallel, the same type of analysis was performed with all the analogous steps but for the contrary situation, in order to find the regions that are de novo repressed in MM, thus losing H3K27ac completely as comparing to normal B cells. In this case, only 4 repressed regions were identified, corresponding to 6 genes downregulated in MM.

68

Materials and Methods

Figure 18. Detection of de novo activated and repressed regions in MM. Schematic representation of the filtering strategy for identification of regions modulating H3K27ac de novo in MM cells. The left panel shows regions gaining H3K37ac de novo in MM (number of regions indicated in the pink box) and the right panel those loosing H3K27ac de novo in MM cells (number of regions is indicated in the blue box).

7.4. K-means clustering

When performing K-means clustering, the absolute VST levels (which are dependent on the size of the regions/genes) affect the clustering; while we were only interested in relative differences. Therefore, Z-scores are necessary to correct for this phenomenon. Hence, K-means clustering was performed using the Z-scores of the

69

Materials and Methods

VST values of the differential regions/genes. For each, 20 clusters were assigned, which were merged based on pattern similarity.

7.5. Gene ontology and canonical pathways analysis of data sets

Differentially-expressed genes were further analyzed using Gene Ontology (GO) functional enrichment analysis (GO-PANTHER) as previously described (Mi et al., 2017). Enrichment of certain GO terms was determined based on Fisher's exact test. A multiple correction control (permutation to control FDR) was implemented to set up the threshold to obtain the lists of significantly over-represented GO terms (FDR < 0.05).

The molecular processes, molecular functions and signaling networks were further evaluated by analyzing differentially expressed genes using Gene Set Enrichment Analysis (GSEA, http://www.broad.mit.edu/gsea/index.html). Gene sets are available from Molecular Signatures DataBase (http://software.broadinstitute. org/gsea/msigdb/collections.jsp). Briefly, GSEA calculates an enrichment score (ES) that reflects the degree to which a gene set is overrepresented at the extremes (top or bottom) of the entire ranked list of expression data, where genes are ranked according to the expression difference between normal B cells and MM patients, and estimates the statistical significance (p value) of this ES.

8. Cell culture 8.1. Culture of MM cell lines

MM cell lines were maintained in RPMI-1640 medium (Lonza) supplemented with 20% FBS (Gibco), 1% Penicillin / Streptomycin (Lonza) and 2% HEPES (Life

o Technologies). All cells were maintained at 37 C and 5% CO2. Detailed information of MM cell lines is summarized in Table 5.

70

Materials and Methods

Table 5. MM cell lines.

Cell line Translocations TP53 KRAS TRAF3 CDKN2C MMSET BRAF RB1

t(4;14) t(8;14) KMS11 HD HD t(14;16)

KMS12BM t(11;14) Mut (Hom)

MM1R t(14;16) t(8;14) Mut (Ht) Mut (Hom) HD Mut+

MM1S t(14;16) t(8;14) Mut (Ht) Mut (Hom) HD Mut+

RPMI8226 t(14;16) t(8;22) Mut (Hom) HD

U266 t(11;14) Mut (Hom) Mut (Hom) Mut (Ht) Mut (Hom)

HD: Homozygous Deletion, Mut (Ht): Heterozygous Mutation, Mut (Hom): Homozygous Mutation, Mut+: Activating Mutation of MMSET, phenotype similar to t(4;14)

8.2. Culture of HEK93T cell line

HEK293T cells were maintained in DMEM (Lonza) with high glucose and pyruvate, supplemented with 10% FBS (Gibco), 1% Penicillin / Streptomicyn (Lonza) and 2% HEPES (Life Technologies) and grown in a humidified atmosphere at 37 oC and 5%

CO2. Only cells at low passage (below passage 16) were used for lentiviral production.

9. Inducible shRNA knockdown system 9.1. Short hairpin RNA sequences design and cloning

Short hairpin RNAs (shRNAs) targeting each of the genes of interest were designed using public available algorithms (www.broad.mit.edu/science/projects/ rnai-consortium/trc-shrna-design-process and www.med.nagoya-u.ac.jp/ neurogenetics/i_Score/i_score.html). DNA oligonucleotides harboring shRNA sequences were designed according to the diagram shown in Figure 19A. shRNAs were cloned into a TET inducible version of pLKO.1 vector, shown in Figure 19B (Wiederschain et al., 2009) (Tet-pLKO-puro, Addgene #21915). Briefly, single strand sequences of shRNAs were annealed by mixing each DNA oligonucleotide at a final

71

Materials and Methods concentration of 2 μM in 1x NEB2.1 buffer (New England) in 50 μl of final volume, denatured for 5 minutes at 95 °C and cooled to 25 °C at a ramp rate of 0.1 °C per second. This annealed insert was diluted 1:10 and phosphorylated using T4 PNK (New England Biolabs) for 30 minutes at 37 °C and 15 minutes at 85 °C. The ligation reaction was performed with 200 ng of EcoRI / AgeI digested Tet-pLKO-puro plasmid, 5 μl of insert and 0.5 μl of T4 DNA Ligase (New England Biolabs) and incubated overnight at 16 °C. Ligation product was transformed into 50 μl of Stbl3 competent cells (Life Technologies). Transformed clones were tested by Sanger sequencing. See Table 6 for target sequences and sequencing primers. A shRNA targeting GFP gene (shGFP) was used as experimental control.

Figure 19. shRNA design and cloning in Tet-pLKO-puro lentiviral vector. A) Schematic representation of shRNA oligo flanked by sequences compatible with Tet-pLKO-puro cohesive ends after AgeI/EcoRI digestion. Forward and reverse oligos were annealed and ligated into the Tet-pLKO-puro vector, producing a final plasmid where U6 promoter drives RNA Polymerase III transcription for generation of shRNA transcripts (reproduced from Addgene). B) Map of Tet-pLKO-puro lentiviral vector. shRNA oligos are cloned into the AgeI and EcoRI sites in place of the stuffer. Human U6 promoter drives RNA Polymerase III transcription for generation of shRNA transcripts. In the absence of tetracycline / doxycycline, shRNA expression is repressed by constitutively-expressed TetR protein bound to TetR operator. Upon the addition of tetracycline / doxycycline to the growth media, shRNA expression is triggered resulting in target gene knockdown. hPGK: human phosphoglycerate kinase promoter drives the constitutive expression of TetR and PuroR genes. TetR: tetracycline repressor binds to the TetR operator to inhibit transcription. IRES: internal ribosomal entry site. PuroR: puromycin resistance gene for selection in mammalian cells. 3’LTR: 3’ self-inactivating long terminal repeat. f1 ori: f1 bacterial origin of replication. AmpR: ampicillin resistance gene for selection in bacterial cells. 5’LTR: 5’ long terminal repeat.

72

Materials and Methods

Table 6. Short hairpin RNAs sequences. Target gene Name Sequence (5'- 3')

GFP shGFP Fw CCGGACAACAGCCACAACGTCTATACTCGAGTATAGACGTTGTGGCTGTTGTTTTTG

shGFP Rv AATTCAAAAACAACAGCCACAACGTCTATACTCGAGTATAGACGTTGTGGCTGTTGT

PRDM5 shPRDM5 #1 Fw CCGGCGACACAAGATGACTCATATTCTCGAGAATATGAGTCATCTTGTGTCGTTTTTG

shPRDM5 #1 Rv AATTCAAAAACGACACAAGATGACTCATATTCTCGAGAATATGAGTCATCTTGTGTCG

shPRDM5 #2 Fw CCGGGGTGTAGCTGACAGCTAATAGCTCGAGCTATTAGCTGTCAGCTACACCTTTTTG

shPRDM5 #2 Rv AATTCAAAAAGGTGTAGCTGACAGCTAATAGCTCGAGCTATTAGCTGTCAGCTACACC

NDNF shNDNF #1 Fw CCGGGTACTATGGAAGGAAGGATATCTCGAGATATCCTTCCTTCCATAGTACTTTTTG

shNDNF #1 Rv AATTCAAAAAGTACTATGGAAGGAAGGATATCTCGAGATATCCTTCCTTCCATAGTAC

shNDNF #2Fw CCGGGGACATTAAATCACTTTAAATCTCGAGATTTAAAGTGATTTAATGTCCTTTTTG

shNDNF #2 Rv AATTCAAAAAGGACATTAAATCACTTTAAATCTCGAGATTTAAAGTGATTTAATGTCC

shNDNF #3 Fw CCGGGCTTAGAGGAAAACCTAAAATCTCGAGATTTTAGGTTTTCCTCTAAGCTTTTTG

shNDNF #3 Rv AATTCAAAAAGCTTAGAGGAAAACCTAAAATCTCGAGATTTTAGGTTTTCCTCTAAGC

shNDNF #4 Fw CCGGGTATCAGAGTAAGGTTGTGAACTCGAGTTCACAACCTTACTCTGATACTTTTTG

shNDNF #4 Rv AATTCAAAAAGTATCAGAGTAAGGTTGTGAACTCGAGTTCACAACCTTACTCTGATAC

Tet_pLKO Tet_pLKO Seq Fw* GGCAGGGATATTCACCATTATCGTTTCAGA

*Sequencing primer

9.2. Lentiviral production

Lentiviruses were generated by co-transfecting 2.5·106 HEK293T cells with 9 μg of shRNA-encoding plasmid, 6 μg psPAX2 (containing the viral machinery for virus packaging: Addgene, #12260) and 3 μg of pMD2G (containing the viral envelope for virus packaging: Addgene, #12259) plasmids using 40 μl of Lipofectamine2000 (Invitrogen) and 970 μl Opti-MEM (Gibco) in a 10 cm culture dish. Growth media was exchanged after 12 hours by RPMI-1640 medium (Lonza) supplemented with 20% FBS (Gibco), 1% Penicillin / Streptomycin (Lonza) and 2% HEPES (Life Technologies). Lentivirus-containing supernatant was harvested 48 hours later, filtered (0.2 μm) and applied directly to cells for infection with 2 μg/ml polybrene (Sigma). Target cell lines were selected in 1.5 μg/ml puromycin (Sigma) for 72 hours.

73

Materials and Methods

Stable cell lines harboring the silenced shRNA-expressing cassette were grown in Tet-Free FBS (Fetal Bovine Serum (FBS) South America, Tetracycline Free, Biowest) conditions to prevent Tet-promoter basal expression. shRNA expression was induced by adding 1 μg/ml doxycycline (Sigma) to the culturing media and replaced every 48 hours. Target gene knockdown was validated by RT-qPCR and western blot to ensure the correct silencing of the target gene at both RNA and protein level respectively. Cell viability was monitored by MTS assay every 48 hours during an 8-day culture. The percentage of apoptotic cells was determined by Annexin V+ flow cytometry, 8 days after the induction with doxycycline.

10. CRISPR-Cas9 knockout system

Engineered CRISPR systems were generated in MM cell lines to perform knockout experiments of target genes. In these systems, Cas9 and guide RNAs (gRNAs) were encoded on separate plasmids and transfected sequentially into cultured cells. First, we established cell lines constitutively expressing the Cas9 nuclease through lentiviral infection and antibiotic selection. Secondly, these cell lines were infected with lentiviruses carrying specific single or paired gRNAs sequences with a fluorescent reporter.

In the CRISPR-Cas9 single gRNA system, a unique gRNA targets a site within the coding region of the target gene, where the Cas9 endonuclease produces a double- strand break (DSB) within the target DNA. In most cases, the resulting DSB is repaired by non-homologous end joining (NHEJ), giving rise to small nucleotide insertions or deletions (indels) in the target DNA, that result in amino acid deletions, insertions or frameshift mutations in the coding-targeted gene.

CRISPR-Cas9 systems are also useful tools to interrogate the functional mechanisms of non-coding genomic elements, including regulatory regions and non- coding RNAs. A single gRNA system is not practical in this particular case, as small

74

Materials and Methods indels are less likely to ablate their function to the same extent as in a protein-coding sequence. Instead, a deletion strategy is developed: a pair of gRNAs is used to recruit Cas9 to sites flanking the target region, where simultaneous DSBs are induced and the NHEJ activity repairs the lesion, resulting in a genomic deletion of the targeted locus (Aparicio-Prat et al., 2015).

10.1. Cas9 stable MM cell lines

For stable Cas9 MM cell line generation, MM1S, MM1R, KMS11 and KMS12 cells lines were infected with lentiviruses (as previously described) carrying the Cas9-2A- Blasticidin expressing cassette (Hart et al., 2015) (Lenti-Cas9-2A-Blast plasmid, Addgene #73310, Figure 20) and selected with 10 μg/ml blasticidin (Invitrogen) for 7 days. Cas9 expression was checked by RT-qPCR.

Figure 20. Cas9 lentiviral vector. Map of Lenti-Cas9-2A-Blast lentiviral vector. EF-1α promoter drives the constitutive expression of the Cas9 endonuclease and BlasR genes. T2A: self-cleaving peptide for multi-gene expression. BlastR: blasticidin resistance gene for selection in mammalian cells. 3’LTR: 3’ self-inactivating long terminal repeat. f1 ori: f1 bacterial origin of replication. AmpR: ampicillin resistance gene for selection in bacterial cells. 5’LTR: 5’ long terminal repeat.

10.2. Single gRNA sequences design and cloning

Single gRNA sequences targeting the genes of interest were designed using public available algorithms (portals.broadinstitute.org/gpp/public/analysis-tools/sgrna- design) and public gRNA libraries for CRISPR screening. gRNAs were cloned into CRISPseq-BFP-backbone (Jaitin et al., 2016) (gift from Ido Amit laboratory, Addgene

75

Materials and Methods

#85707, Figure 21A). Briefly, CRISPseq-BFP-backbone was digested with BsmbI for 2 hours at 55 oC, dephosphorylated using 2 μl of Fast AP (Life Technologies) for 25 minutes at 37 oC and 20 minutes at 80 oC prior to purification with NucleoSpin Gel and PCR Clean-up (Macherey Nagel). Single strand sequences harboring gRNAs were phosphorylated using T4 PNK (New England Biolabs) for 30 minutes at 37 °C and 15 minutes at 85 °C. Complementary sequences were annealed by mixing each DNA oligonucleotide at a final concentration of 1 μM, denatured for 5 minutes at 95 °C and cooled to 25 °C at a ramp rate of 0.1 °C per second. The ligation reaction was then performed with 50 ng of digested and dephosphorylated CRISPseq-BFP- backbone plasmid, 1 μl of 1:250 diluted insert and 0.5 μl of T4 DNA Ligase (New England Biolabs) and incubated overnight at 16 °C. Ligation product was transformed into 50 μl of Stbl3 competent cells (Life Technologies). Transformed clones were tested by BsmbI digestion and Sanger sequencing. See Table 7 for target sequences and sequencing primers. Empty CRISPseq-BFP-backbone was used as experimental control (scramble, Scr).

Table 7. Single gRNA sequences. Target gene Name Target Sequence (5'- 3')

TXN gTXN#1 Fw Exon2 TTTATCACCTGCAGCGTCCA

gTXN#1 Rv TGGACGCTGCAGGTGATAAA

gTXN#2 Fw Exon2 TAGTTGACTTCTCAGCCACG

gTXN#2 Rv CGTGGCTGAGAAGTCAACTA

U6 promoter U6 Seq Fw* TAATTTCTTGGGTAGTTTGC

*Sequencing primer

76

Materials and Methods

10.3. Paired gRNA sequences design and cloning

For the genomic deletion of TXN enhancer and NDNF promoter regions, paired gRNAs flanking the corresponding regions were designed using CRISPETA paired gRNA design tool (Pulido-Quetglas et al., 2017). One gRNA sequence of each pair was cloned into CRISPseq-BFP-backbone harboring a BFP (blue fluorescent protein) reporter gene (Figure 21A) or pLKO5.sgRNA.EFS.GFP (Addgene #57822) harboring a GFP (green fluorescent protein) reporter (Figure 21B), respectively. Both plasmids were cloned as previously described for CRISPRseq-BFP-backbone. See Table 8 for target sequences and sequencing primers. Empty CRISPseq-BFP-backbone and pLKO5.sgRNA.EFS.GFP were used as experimental controls (scramble, Scr).

Figure 21. Lentiviral constructs for gRNA expression. A) Map of CRISPRseq-BFP-backbone lentiviral vector. B) Map of pLKO.sgRNA.EFS.GFP lentiviral vector. gRNA oligos are cloned into the BsmbI sites. Human U6 promoter drives RNA Polymerase III transcription for generation of gRNAs transcripts. EF-1α promoter drives the constitutive expression of the BFP or GFP reporter genes, respectively. 3’LTR: 3’ self-inactivating long terminal repeat. f1 ori: f1 bacterial origin of replication. AmpR: ampicillin resistance gene for selection in bacterial cells. 5’LTR: 5’ long terminal repeat.

77

Materials and Methods

Table 8. Paired gRNA sequences. Distance Target region Target coordinates gRNA #1 gRNA #2 between gRNAs (bp)

chr9:110,307,293- TXN Enhancer AGTGTACCATTGATGACCAT GGGGTAAAAAATTTAAGGGG 11,142 110,318,453

NDNF Promoter chr4:121071838- GCGTGCGCGCTGGCGCGCAG TTGAGAAGCTTCTTGGGCCT 841 Pair #1 121072697

NDNF Promoter chr4:121071597- CGGTCTGCTTGACAGTGTGA CGGAGCCTTTATCCCGGACT 1,385 Pair #2 121072997

10.4. gRNA transduction

Constitutively Cas9 expressing cells were infected with viral gRNA-expressing viruses produced as previously described. BFP+ (Blue Fluorescent Protein positive) cells in the case of single gRNAs and double BFP+ / GFP+ (Green Fluorescent Protein positive) cells in the case of double gRNAs were determined by flow cytometry (FACS Canto II, BD Biosciences) 72 hours after infection. Percentage of BFP+ cells and BFP+ / GFP+ exceeded 90% in all cases.

10.5. Assessment of CRISPR-Cas9 single gRNA editing efficiency

Genomic DNA was extracted (QIAamp DNA Mini and Blood Mini Kit, Qiagen) from the total pull of cells 72 hours after infection. Targeted loci were amplified (primers listed in Table 9), PCR products were purified (NucleoSpin Gel and PCR Clean-up, Macherey Nagel) and sequenced by Sanger reaction. The spectrum and frequency of targeted mutations generated in the cell pool was determined by the public available software Tracking of Indels by DEcomposition (TIDE) (Brinkman et al., 2018) (Figure 22). Protein levels of target genes were assessed by western blot analyses.

78

Materials and Methods

Figure 22. Assessment of CRISPR-Cas9 genome editing by TIDE web tool. A) Genomic DNA PCR amplification of the targeted region. Expected Cas9 endonuclease cut is located within the coding region of the target gene. Fw: fordward and Rv: reverse primers. B) Sanger sequencing reaction of the PCR product. Due to imperfect repair after Cas9 endonuclease cut, the DNA in the cell pool consists of a mixture of indels, which yields a composite sequence trace after the break site. C) Overview of TIDE algorithm and output, including three main steps: (1) Visualization of aberrant sequence signal in control (black) and edited samples (green), the expected break site (vertical dotted line) and the region used for decomposition (grey bar); (2) Decomposition yielding the spectrum of indels and their frequencies; (3) Inference of the base composition of +1 insertions. Figure modified from (Brinkman et al., 2014).

Table 9. PCR primer sequences for assessment of CRISPR-Cas9 single gRNA editing efficiency. Target region Name Sequence (5'- 3') Product (bp)

TXN Exon2 Seq TXN Ex2 Fw ATTTGGCTTGTTTGGGGATT 345

Seq TXN Ex2 Rv TCCTACCACTGTGACCACCA

79

Materials and Methods

10.6. Assessment of CRISPR-Cas9 paired gRNA editing efficiency

Genomic DNA was extracted (QIAamp DNA Mini and Blood Mini Kit, Qiagen) from the total pull of cells 72 hours after infection. Excision was assessed qualitatively by PCR using primers flanking the genomic target region, such that the edited alleles should produce a shorter PCR product than the uncut alleles (Figure 23). PCR products were resolved in 1% agarose gels. Given differences in PCR amplification efficiencies between products of different lengths, the relative populations of each allele cannot be inferred by analyzing relative band intensities.

Regarding TXN enhancer deletion, for a more quantitative determination of the editing efficiency in the total pull of cells, qPCR was performed using a primer internal to the deleted region, which will only amplify the wild type allele (Figure 23). All quantifications were normalized to a distal non-targeted genomic region. Products were loaded in 2% agarose gels to check for correct size. See Table 10 for primer sequences and PCR product sizes.

Figure 23. Genomic PCR primer strategy used for measuring the deletion efficiency of paired gRNA designs. Outline of the primer design strategy for qualitative and quantitative assessing of deletion efficiency. External primers (orange) flanking the genomic target region were designed to detect the presence of smaller PCR products in edited alleles by gel electrophoresis. Internal primers (green) were designed to assess the relative concentration of wild type and edited sites by qPCR, normalized to a distal non-targeted region.

80

Materials and Methods

In the case of NDNF promoter deletion, quantitative determination of the editing efficiency was not assessed by the previous qPCR-based method, due to unexpected limitations found in the primer design and amplification. Instead, the decay in mRNA levels of NDNF transcript was determined by RT-qPCR, as it should directly correlate with the promoter deletion efficiency.

Table 10. PCR primer sequences for assessment of CRISPR-Cas9 paired guide editing efficiency. WT allele Edited allele Target region Name Primer design Sequence (5'- 3') product (bp) product (bp)*

TXN Enhancer TXN_5’Enh_Fw Internal AGGCGAGATCTTGCTGTGTT 425 NA

TXN_5’Enh_Rv CTCACAACACTTTGGGCAGA

TXN Enhancer TXN_5’Enh_Fw External AGGCGAGATCTTGCTGTGTT 11,467 325

TXN_3’Enh_Rv CCATGATGCCTGGCTAATTT

TXN Exon2 Seq TXN Ex2 Fw Non-targeted ATTTGGCTTGTTTGGGGATT 345 345

Seq TXN Ex2 Rv TCCTACCACTGTGACCACCA

NDNF Promoter NDNF_5’Pr_Fw External TGGGACATGAATGCACAGTT 1,665 824#1 249#2

NDNF_3’Pr_Rv CTAGCTTTTCGCCCTACACG

*NA: No Amplification; #: Two distinct paired gRNAs (pair #1 and #2, Table 8) were designed intended to remove varying-size fragments of NDNF promoter region, yielding a different PCR product size in each case.

11. Luciferase reporter assay

One kilobase (kb) fragment from the NDNF promoter containing the predicted PRDM5 binding site (-679 to +321 bp from the NDNF TSS) was synthesized by GenScript. This fragment was cloned into a pGL4.17[luc2/Neo] vector (Promega #9PIE672, Figure 24) upstream the luciferase reporter gene after NheI / XhoI digestion and named as NDNF-promoter-wild type (NDNFpr-WT). Its counterpart harboring a deletion of PRDM5 binding site (+59 to +72 bp from the NDNF TSS) was synthesized by mutagenesis from the original NDNFpr-WT and named as NDNF- promoter-mutated (NDNFpr-Mut).

81

Materials and Methods

Figure 24. Luciferase reporter plasmid. Map of pGL4.17[luc2/Neo] reporter plasmid. NDNF promoter constructs are into the NheI and XhoI sites of the multi cloning site (MCS), driving the expression of the luciferase gene. SV40 (simian virus 40) promoter drives the constitutive expression of the neomycin resistance gene (NeoR) for selection in mammalian cells. f1 ori: f1 bacterial origin of replication. AmpR: ampicillin resistance gene for selection in bacterial cells.

HEK293T cells expressing two different shRNAs targeting PRDM5 (Table 6) were co-transfected with 10 ng/μl of reporter plasmids containing the NDNFpr-WT, NDNFpr-Mut or empty pGL4.17[luc2/Neo] plasmid and 0.5 ng/μl renilla luciferase vector (pRL-SV40 Renilla Luciferase Control Reporter Vector, Promega). shRNA expression was induced by adding 1 μg/ml doxycycline to the culturing media. Forty- eight hours after transfection PRDM5 knockdown was validated by RT-qPCR and luciferase/renilla activity was measured by a dual luciferase reporter assay (Promega) in an automatic 96-well plate reader according to manufacturer's instructions.

12. RT-qPCR analysis

RNA was isolated using TRIzol reagent (Life Technologies) following the manufacturer’s instructions. RNA concentration was quantified using NanoDrop spectrophotometer (Thermo Fisher Scientific) and 1 μg of RNA was retrotranscribed into cDNA using PrimeScript RTPCR Kit (Clontech, Takara). The reaction was carried out at 37 oC for 15 minutes and 65 oC for 5 seconds. RT-qPCR was performed using SYBR Green PCR Master Mix (Applied Biosystems) under the following conditions: 50 oC for 2 minutes, 95 oC for 10 minutes and 40 cycles of 95 oC for 15 seconds, 60 oC for 1 minute. Melting curve analysis was performed to assess the presence of specific

82

Materials and Methods

PCR products. See Table 11 for primer sequences. All RT-qPCR were performed in a QuantStudio Real Time PCR System (Thermo Fisher Scientific) and expression quantification was performed with the 2-ΔΔCt method, using GUSβ and GAPDH as normalizer genes.

Table 11. RT-qPCR primer sequences. Target gene Name Sequence (5´- 3´) Product (bp)

PRDM5 PRDM5 Fw AAGTGCTCAGAGTGCAGCAA 165

PRDM5 Rv GGGACGATTGGGATTATGAG

NDNF NDNF Fw CCTTTGGAGTGGAAGCTGAG 181

NDNF Rv AACCGGATGGGGAACTAGAC

TXN TXN Fw TTCCAACGTGATATTCCTTGA 154

TXN Rv GGTGGCTTCAAGCTTTTCCT

SpCas9 SpCas9 Fw CCGAAGAGGTCGTGAAGAAG 127

SpCas9 Rv GCCTTATCCAGTTCGCTCAG

GUSβ GUSβ Fw GAAAATATGTGGTTGGAGAGCTCATT 101

GUSβ Rv CCGAGTGAAGATCCCCTTTTTA

GAPDH GAPDH Fw CCAAGGTCATCCATGACAAC 465

GAPDH Rv TGTCATACCAGGAAATGAGC

13. Identification of fusion transcripts by gel-PCR

With the purpose of identifying possible fusion transcripts between PRDM5 and NDNF genes, PCR primers were designed in each of the coding exons of both individual transcripts (Table 12).

RNA was isolated using TRIzol reagent (Life Technologies) following the manufacturer’s instructions. RNA concentration was quantified using NanoDrop spectrophotometer (Thermo Fisher Scientific) and 1 μg of RNA was retrotranscribed into cDNA using PrimeScript RTPCR Kit (Clontech, Takara). PCR reactions from 10 ng of cDNA were carried out with Platinum Taq DNA Polymerase (Thermo Fisher

83

Materials and Methods

Scientific) with all possible combinations among the designed primers, under the following conditions: 94 oC for 2 minutes and 40 cycles of 94 oC for 15 seconds, 58 oC for 30 seconds and 68 oC for 30 seconds. Positive controls for specific PRDM5 and NDNF transcript amplification were included in each experiment (primers listed in Table 11). PCR products were resolved in 1% agarose gels. Amplified products were subsequently purified (NucleoSpin Gel and PCR Clean-up, Macherey Nagel), cloned into pGEM-T Easy Vector (Promega) and sequenced by Sanger reaction.

Table 12. Primer sequences for fusion transcript identification.

Target gene Exon Name Sequence (5´- 3´)

NDNF (Fw) 1 N1 CGTTCCTGGATATTGGTGCT 2 N2 ACTCAGCTCAAGGACCCAGA

3 N3 ACAGTGACGCCCTGTGATG

4 N4 GCCCTCTAGTCAGTTATATTGCTATTG PRDM5 (Rv) 1 P1 GTTCCTGCCATTCCTGCTG 2 P2 CAAAGGGTCCGAACTTTTCA

3 P3 TTCTCCCTTACTCCCACGAA

4 P4 CCAGGTAGCCAATCAGAAGC

5 P5 CGATTCACATTGAGGACAAGC

6 P6 TTCGCTGTGCACTGAAGAAC

7 P7 CTTCAGCCTCTTTCCACAGC

8 P8 TCCTGTAGGCTTGATGCTGA

9 P9 TGGGTGATCATATGACGTTTT

10 P10 TCTCGCAATTATAGGGTCGTTT

11 P11 CATTGGAACGGTCTCTCCTC

12 P12 AACCACCTGGACATGAACATT

13 P13 ACCACTGCTGGCAAATTTCT

14 P14 CGAATGTGCATCTTCAGTCC

15 P15 TTGCTGCACTCTGAGCACTT

16 P16 GGGACGATTGGGATTATGAG Fw: forward primers. Rv: reverse primers

84

Materials and Methods

14. DNA methylation analysis by pyrosequencing

In order to validate the results obtained by WGBS, we determined the DNA methylation levels of the promoter region of PRDM5 and NDNF genes by bisulfite pyrosequencing analysis. DNA was extracted from samples using a DNA kit (NucleoSpin Tissue, Macherey Nagel) following the manufacturer’s instructions. The purified DNA was quantified using a NanoDrop spectrophotometer (Thermo Fisher Scientific). One µg of genomic DNA was bisulfite-modified using the CpGenome DNA modification Kit (Chemicon International). For PCR amplification of the regions of interest, a “hot start” PCR (PyroMark PCR Kit, Qiagen) was used with the following protocol: 95 °C for 15 minutes, 45 cycles of 94 °C for 1 minute, 50 °C (for NDNF primers) or 54 °C (for PRDM5 primers) for 1 minute, and 72 °C for 1 minute, followed by a final 10 min extension at 72 °C. This PCR was performed using 2 µl of modified DNA and a final concentration of 0.2 μM of each specific primer. The quality of the PCR amplicon was verified by gel electrophoresis (see Table 13 for amplification and sequencing primers).

Table 13. Pyrosequencing primer sequences. Target region Name Sequence (5´- 3´) Product (bp)

PRDM5 promoter PRDM5 Piro Fw* AGGTGTTAATTAGATGATTTTTGAGTGTTT 330

PRDM5 Piro Rv (Bio) AAACAAACCTATTCTTACCCAAACC

PRDM5 Piro Seq* AGGTGTTAATTAGATGATTTTTGAGTGTTT

NDNF promoter NDNF Piro Fw TTTAGTTTAAGGAAGTTTTTATAA 162

NDNF Piro Rv (Bio) AAAAACTAAAAATCCAAACAAC

NDNF Piro Seq GTAGAGGATAAATGAGGAGTTAGAG

*Same primer; Bio: Biotinylated

The resulting biotinylated PCR products were bound to Streptavidin Sepharose High Performance Beads (GE Healthcare) and processed to yield high quality ssDNA using the PyroMark Vacuum Prep Workstation (Biotage). The pyrosequencing

85

Materials and Methods reactions were performed using the PyromarkTM ID (Biotage) and sequence analysis was performed using the PyroQ-CpG analysis software (Biotage). All experiments included a human genomic DNA universally methylated for all genes (Intergen Company) as a positive control and water blanks as technical control.

15. Western blot

Cells were harvested and lysed in buffer containing 1% Triton X-100, 150 mM NaCl, 50 mM Tris [pH 8], supplemented with 1x protease inhibitor cocktail (Complete Mini, Roche), 10 mM NaF and 1 mM sodium orthovanadate for 30 minutes at 4 oC. After centrifugation at 13,000 rpm for 30 minutes at 4 oC, supernatant was collected. Protein extracts were quantified using BCA (Pierce BCA Protein Assay Kit, Thermo Fisher Scientific) or Bradford (Protein Assay Dye Reagent Concentrate, BioRad) colorimetric assays. Equal amounts (30 to 50 μg) of protein samples were resolved by SDS-PAGE, transferred to nitrocellulose membranes (BioRad) and blocked in 0.2% I- Block reagent (I-Block Protein-Based Blocking Reagent, Thermo Fisher Scientific), 0.1% Tween-20 in 1x PBS for 1 hour prior to addition of primary antibody. For antibodies used and conditions, see Table 14. Antibody signal was revealed by a chemiluminescent reagent (Tropix, Bedford) and detected by autoradiograph using HyperfilmTM ECL (Amershan Biosciences) after 1 to 10 minutes of exposure. β-actin was used as loading control.

Table 14. Western Blot antibodies.

Antibody Reference Commercial Source Conditions

PRDM5 sc-376277 Santa Cruz Biotechnology Mouse 1:2,000

TXN #2429 Cell Signalling Rabbit 1:1,000

β-actin A5441 Sigma Rabbit 1:5,000

86

Materials and Methods

16. Apoptosis assay

Cells were harvested and stained with FITC Annexin V Apoptosis Detection Kit I (BD Biosciences) and 7AAD (BD Biosciences) following the manufacturer’s protocol at indicated time points. The percentage of Annexin V+ / 7AAD+ cells was analyzed using FACS Canto II (BD Biosciences) and FlowJo flow cytometry analysis software.

17. Cell viability assays by MTS

Cell lines infected with viral shRNA vectors were seeded at 40,000 cells/well in 96 well plates in presence or absence of doxycycline. The number of viable cells in proliferation was determined by MTS assay (CellTiter 96 Aqueous One Solution Cell Proliferation Assay, Promega) following the manufacturer's protocol, at indicated time points. Briefly, plates were centrifuged at 800 g for 10 minutes and culture medium was removed. Cells were incubated with 100 μl of medium and 20 μl of CellTiter 96 Aqueous One Solution Reagent per well at 37 oC. The MTS tetrazolium compound in this solution is bioreduced by cells into a colored formazan product which can be quantified by colorimetry. Absorbance was measured at λ = 490nm in a 96-well plate reader. Cell viability in each time point was calculated as the percentage of total absorbance cells in each condition / absorbance of control cells.

18. Cell proliferation assays by flow cytometry

Cell proliferation rate of MM cell lines infected with CRISPR-Cas9 single or paired gRNA system was monitored by flow cytometry. Seventy-two hours after gRNA transduction, 100,000 BFP+ (single gRNAs) or double BFP+ / GFP+ cells (paired gRNAs) were mixed with 100,000 wild type (WT, BFP- / GFP-) cells. The percentage of BFP+ or BFP+ / GFP+ cells was quantified by flow cytometry (FACS Canto II, BD Biosciences) and monitored every 72 hours during an 18-day culture. Each condition was

87

Materials and Methods normalized to the scramble gRNA, to assess the effect of CRISPR-Cas9 editing on cell proliferation.

19. Statistical analysis

All statistical tests for experimental data were performed using GraphPad Prism software. Data represent the mean or median (± SD) of at least three independent experiments. P values below 0.05 were considered statistically significant.

Normality analysis for each group were performed using Shapiro-Wilk test, followed by Levenne test for homogeneity of variances. For parametric group comparisons Student t-test (2 groups) or one-way ANOVA (more than 2 groups) were used; whereas for non-parametric group comparisons Mann-Whithney test (2 groups) or Kruskall-Wallis test (more than 2 groups) were employed. For multiple comparisons, Tukey correction was used for samples with homogenous variances, Tamhane’s T2 for variance heterogeneity and Fisher’s LSD with Bonferroni correction for comparisons among 3 groups.

20. Data availability

All the raw data included in this study will be further deposited and released, as part of the BLUEPRINT epigenome project, at the European Genome Phenome Archive (EGA, ega-archive.org), which is hosted at the European Bioinformatics Institute (EBI).

The Gene Expression Omnibus (GEO) accession number for the ssRNA-Seq dataset from normal B cells can be accessed by GSE114816 and GSE114803 via the National Center for Biotechnology Information (NCBI). The deposition of gene expression data from the low input 3’ end RNA sequencing is currently in progress.

88

RESULTS

"The great thing about science is that you can get it wrong over and over again because what you are after, call it truth or understanding, waits patiently for you.”

(Dudley Herschbach)

Results

1. Multilayer profiling of the MM reference epigenome

In the present study, we have accurately characterized the reference epigenome of 3 representative MM patients in the context of normal B cell differentiation, including tonsillar naïve B cells (t-NBCs), peripheral blood naïve B cells (pb-NBCs), tonsillar germinal center B cells (t-GCBC), peripheral blood memory B cells (pb-MBC) and terminally differentiated tonsillar plasma cells (t-PC). This analysis includes genome-wide maps of ChIP-Seq data from six histone modifications with non- overlapping functions (H3K27ac, H3K4me1, H3K4me3, H3K36me3, H3K27me3 and H3K9me3), ATAC-Seq for chromatin accessibility, WGBS for DNA methylation and ssRNA-Seq for transcriptome characterization (Figure 25A). The results obtained using these techniques in the above mentioned samples were validated in an extended series of MM patients (detailed number of samples is shown in Figure 25A) by analyzing the chromatin regulatory landscape. Moreover, to further evaluate the regulatory potential of such chromatin modulations in the transcriptomic profile of the cell, we analyzed by ssRNA-Seq an additional cohort of samples that included: 37 MM patients, 5 t-NBCs, 4 t-GCBCs, 5 t-MBCs, 5 t-PCs and 3 bm-PCs. The whole set of samples utilized in this study is summarized in Table 1.

We initially performed unsupervised principal component analysis (PCA) of each of the epigenetic layers (histone marks, chromatin accessibility, DNA methylation and gene expression) individually to identify similarity patterns in our data. Interestingly, we observed that in each of the analyses, MM cells clustered separately from normal B cells, with this later group of cells also showing differences according to their maturation stage (Figure 25B) as previously described (Beekman et al., 2018). Histone marks related to regulatory elements associated with transcriptional activation (H3K27ac, H3K4me1 and H3K4me3) and transcription (H3K36me3) showed a more homogenous profile in MM samples, whereas heterochromatin related marks (such as H3K27me3 and H3K9me3) as well as DNA methylation show higher

93

Results variability in these samples. Taken together, these results suggest that, additionally to the modulation patterns that occur during B cell differentiation, MM cells show a distinct epigenomic configuration than normal B cells, highlighting the crucial role of epigenetic alterations in the pathogenesis of this disease.

Figure 25. MM reference epigenomes overview. A) Summary of the techniques and number of samples (n) analyzed in this study. The number of samples included in the reference epigenome set of MM is shown in brackets. B) Unsupervised principal component analysis (PCA) of ChIP-Seq data for six different histone modifications, chromatin accessibility, DNA methylation and gene expression. We identified a clear segregation between MM samples and all normal B cells subpopulations.

2. Characterization of the MM epigenome in the context of normal B cell differentiation

We next aimed to characterize the dynamics of each independent histone mark of the epigenomic landscape of MM patients in the context of B cell differentiation (Figure 26). Overall, we identified a mean of 14,663 regions (ranging from 3,712 to 30,285 depending on the mark) with differential histone mark profile (FDR < 0.01)

94

Results between MM and normal B cells (Table 15). Considering that normal B cell maturation itself entails an extensive modulation of the epigenome (Beekman et al., 2018), we segregated the differential regions into two region subsets:

1) Chromatin modulation patterns occurring during normal B cell differentiation and sharing similarities with MM plasma cells 2) Regions whose chromatin landscape is stable in normal B cells and gets aberrantly modulated specifically in MM patients

Table 15. Differential histone mark profile in MM

Histone mark Differential regions MM/B cells* Regions modulated in B cells MM specific regions

H3K27ac 30,285 (24%) 13,661 16,624 H3K4me1 23,828 (39%) 14,949 8,879 H3K4me3 17,856 (25%) 6,548 11,308 H3K36me3 3,813 (11%) 2,241 1,572 H3K27me3 8,482 (23%) 4,074 4,408 H3K9me3 3,712 (6%) 1,677 2,035 *Percentage of differential regions from the total number of peaks called for each mark shown in brackets.

95

Results

Figure 26. Selection strategy for identification of H3K27ac regions modulated vs stable through B cell differentiation. From the consensus matrix of peaks, those representing individual specific patterns were excluded. Differential analysis between MM vs all normal B cells was performed to detect a differential histone mark profile in MM patients (FDR < 0.01). Finally, modulatory patterns among normal B cell subpopulations were analyzed by a secondary differential analysis, differentiating those regions modulated in through B cell differentiation (FDR < 0.01) and chromatin changes specific of MM patients (FDR ≥ 0.01). The same filtering strategy was applied for each independent histone mark.

96

Results

2.1. MM patients exacerbate chromatin patterns already present in normal plasma cells

It has been widely described that chromatin modifications play a critical role in B cell differentiation (Schroeder et al., 2015; Beekman et al., 2018; Wu et al., 2018), defining maturation stage specific epigenomic profiles. Although from a global perspective MM patients seem completely different to all B cell subpopulations (Figure 25B), we sought to characterize the chromatin landscape of MM that might share similarities to chromatin modulation patterns occurring in normal B cell differentiation, as these regions may be already implicated in gene deregulation at earlier maturation stages. We performed K-means clustering for each histone mark individually, in order to find subsets of regions showing similar modulation patterns among the different groups of samples analyzed. We observed that, when focusing on histone modifications related to regulatory elements and active transcription (H3K27ac, H3K4me1, H3K4me3 and H3K36me3), MM profiles were expectedly more similar to t-PCs than to other B cells, indicating that MM maintains chromatin features of its cellular origin, although the t-PC patterns become exacerbated in MM. Representative clusters of this phenomenon for H3K27ac histone mark are highlighted in blue in Figure 27A. Unsupervised PCA of these histone marks confirmed that the main source of variability among these regions is associated with the maturation stage of each B cell subpopulation, with MM patients again resembling normal t-PCs (Figure 27B). Moreover, we also identified regions in MM which loose this imprinting from the cell of origin, showing unexpected associations between t-GCBCs and t-PCs (highlighted in yellow for H3K27ac histone mark in Figure 27A). These regions were associated to genes related to regulation of DNA repair and replication, and T cell differentiation (Figure 27C), suggesting that these chromatin modulatory patterns might be influenced by the cell microenvironment.

97

Results

Figure 27. Chromatin similarities between MM and the B cell maturation process. A) K-means clustering of H3K27ac, H3K4me1, H3K4me3 and H3K36me3 genomic regions that show differential changes in MM versus normal B cells and dynamic modulation during normal B cell differentiation. The number of regions included in each cluster is indicated in the right side bar chart. Clusters showing MM

98

Results exacerbation of H3K27ac patterns already present in t-PCs are highlighted in blue. Similarities between t-GCBCs and t-PCs for H3K27ac are highlighted in yellow. B) Principal component analysis (PCA) of regions that show differential changes in MM versus normal B cells and dynamic modulation during normal B cell differentiation. First principal component (PC1) is represented for B cell subpopulations and MM samples, highlighting the proximity between MM and t-PCs. C) GO terms significantly enriched (FDR < 0.05) of genes associated to regions showing simmilarities between t-GCBCs and t-PCs (genes selected from clusters highlighted with yellow arrowheads in “A”).

In contrast, the B cell dynamic fraction of repressive marks (H3K27me3 and H3K9me3) in MM is extensively altered and the global resemblance to t-PCs is lost (Figure 28A). Indeed, we observed frequent similarities between t-GCBCs and t-PCs, highlighted in yellow in Figure 28A, with MM patients resembling the NBCs and pb- MBCs heterochromatin landscape. These similarities in condensed chromatin became apparent in the first component (PC1) of the PCA analysis, as it is shown in orange in the left panels of Figure 28B. However, we also observed heterochromatic regions common to t-GCBCs, t-PCs and MM patients, as it is highlighted in blue in Figure 28A, and is reflected also in the second component (PC2) of the unsupervised analysis (highlighted in blue in the right panels of Figure 28B). Remarkably, when trying to assess the functional implication of these changes, we could rather detect any expression of the genes associated to these regions in B cells or MM samples. Presumably, these modulations in heterochromatin do not imply a functional impact in the cell, as we will further discuss in detail.

99

Results

Figure 28. Heterochromatin similarities between MM and B cell maturation process. A) K-means clustering of H3K9me3 and H3K27me3 genomic regions that show differential changes in MM versus normal B cells and dynamic modulation during normal B cell differentiation. The number of regions included in each cluster is indicated in the right side bar chart. Clusters including common patterns among t-GCBCs, t-PCs and MM are highlighted in blue. Similarities between t-GCBCs and t-PCs are highlighted in yellow. B) Principal component analysis (PCA) of regions that show differential changes in MM versus normal B cells and dynamic modulation during normal B cell differentiation. First (PC1) and second principal components (PC2) are represented for B cell subpopulations and MM samples. In PC1, the proximity between NBCs, pb-MBCs and MM patients is marked in orange; the similarity between t- GCBCs, t-PCs and MM patients is highlighted in blue in PC2.

2.2. Identification of a core epigenomic landscape specific of MM patients

In contrast with the previous analysis, we further aimed to characterize those regions that do not seem to be implicated in B cell differentiation, but get aberrantly modulated specifically in MM patients. We hypothesized that chromatin changes shared by the neoplastic transformation and normal differentiation might be passive observers, whereas those occurring specifically in MM cells may constitute epigenetic drivers with a potential functional impact in this disease. Therefore, we analyzed the

100

Results crosstalk between histone marks, chromatin accessibility, DNA methylation and gene expression in regions undergoing these changes specifically in MM (Table 15).

First, we focused on H3K27ac histone modification, which is strongly associated with chromatin activation. We observed a higher number of regions gaining than loosing H3K27ac mark in MM (Figure 29A), revealing more than 1% of the genome acquiring H3K27ac in these patients (Figure 29B). Approximately, 50% of regions gaining H3K27ac are located in gene loci, overlapping with 3,356 genes; whereas almost 80% of regions loosing H3K27ac are located in genic regions, including 1,951 genes (Figure 29C). As expected, gain of H3K27ac is associated with increased chromatin accessibility in MM (Figure 29D, left panel) and DNA hypomethylation of these regions (Figure 29E, left panel), leading to an open chromatin state. On the other hand, H3K27ac loss leads to a more compact chromatin, determined by decreased number of ATAC peaks (Figure 29D, right panel). However, we could not detect significant differences in DNA methylation levels in these regions between B cells and MM patients (Figure 29E, right panel), which might be a consequence of the epigenetic memory maintained in MM cells from the B cell maturation process. Finally, more than 500 genes overlapping with H3K27ac-gain regions are overexpressed in MM due to the activation of their chromatin landscape, whereas only 200 genes showing H3K27ac loss are silenced in MM cells(Figure 29F). These results show a striking deregulation of the H3K27ac pattern in MM patients, which ultimately leads to an aberrant transcriptional program of the cell.

101

Results

Figure 29. Integrative analysis of MM specific H3K27ac modulation. Interplay between H3K27ac modulation patterns, chromatin accessibility, DNA methylation and gene expression of genomic regions undergoing MM specific changes. A) Number of regions that gain or lose H3K27ac specifically in MM. B) Percentage of the genome covered by the modulated regions. C) Percentage of genic and intergenic genes associated, and number of genes overlapping with these genomic regions. D) Percentage of modulated regions with ATAC peaks in B cells and MM samples. E) Median of DNA methylation of the modulated regions in B cells and MM samples. F) Number of differentially expressed genes (FDR < 0.05, |FC| > 1.5) hosting these regions.

Next, we followed the previous approximation to deeply characterize the modulation of each histone modification in the epigenetic context of MM. Overall, we observed that each histone mark undergoes more gains than losses in MM (Figure 30A and B), but at different degrees, showing H3K27ac, H3K4me1 and H3K4me3 the largest number of gained regions, suggesting a clear deregulation of their chromatin landscape (Figure 30A). In terms of genomic localization, most modulated regions in MM are evenly distributed between intergenic and genic loci (Figure 30C). However, H3K36me3 is preferentially located in gene bodies (Figure 30C, panel 4), as it is strongly associated with transcribed regions of active genes (Kolasinska-Zwierz et al., 2009). As expected, regions that acquire histone

102

Results modifications related to active regulatory elements (H3K27ac), including promoters (H3K4me3) and enhancers (H3K4me1), showed increased chromatin accessibility with a higher number of ATAC peaks (Figure 30D, panels 1-3), loss of DNA methylation (Figure 30E, panels 1-3) and an increased expression of the associated genes in MM (Figure 30F, panels 1-3). On the other hand, regions with loss of these active marks in MM showed decreased chromatin accessibility, (Figure 30D, panels 1- 3) and lower expression of the overlapping genes in these patients (Figure 30F, panels 1-3). Remarkably, DNA methylation of regions losing any of the active marks (H3K27ac, H3K4me1 and H3K4me3) remained unchanged in MM samples (Figure 30E, highlighted in panels 1-3), retaining the epigenetic imprinting from the B cell maturation process. These results support the concept that DNA methylation maintains the epigenetic memory of past activation during B cell maturation (Kulis et al., 2015), whereas chromatin modulation of regulatory elements seems to be more dynamic and highly correlated with transcriptional changes in the cells. Collectively, this analysis highlights the aberrant modulation of regulatory elements specifically in MM patients, involving a coordinated modification of different epigenetic layers.

103

Results

Figure 30. Interplay of chromatin modulatory patterns specific of MM. Summary of the interplay between histone marks, chromatin accessibility, DNA methylation and gene expression of genomic regions undergoing MM specific changes. A) Number of regions specifically gained or lost in MM for each mark. B) Percentage of the genome covered by the modulated regions. C) Percentage of genic and intergenic genes associated, and number of genes overlapping with these modulated genomic regions. D) Percentage of modulated regions with ATAC peaks in B cells and MM samples. E) Median of DNA methylation of the modulated regions in B cells and MM samples. Unchanged DNA methylation patterns between B cells and MM samples due to epigenetic memory are highlighted in red. Passive loss of DNA methylation of heterochromatic regions in MM samples is highlighted in green. F) Number of differentially expressed genes (FDR < 0.05, |FC| > 1.5) hosting these regions. From left to right, the figure panels are named as follows: panel 1 – H3K27ac; panel 2 – H3K4me1; panel 3 – H3K4me3; panel 4 – H3K36me3; panel 5 – H3K27me3; panel 6 – H3K9me3.

104

Results

Regarding the heterochromatin landscape of MM, we identified a high number of heterochromatic regions (H3K27me3 and H3K9me3) that were specifically modulated in MM cells (Figure 30A, panels 5 and 6), with almost 1.6% of the genome acquiring H3K27me3 specifically in the disease, suggesting an extensive heterochromatin modulation in these patients (Figure 30B, panel 5). However, MM- associated changes in H3K27me3 and H3K9me3 neither altered chromatin accessibility (Figure 30D, panels 5 and 6) nor were associated with gene expression modulation (Figure 30F, panels 5 and 6). In terms of DNA methylation, H3K27me3 and H3K9me3 have been classically associated with hypermethylated DNA, except for CG rich regions where H3K27me3 and DNA methylation seem to be mutually exclusive (Viré et al., 2006; Schlesinger et al., 2007; Brinkman et al., 2012). Strikingly, analyzing DNA methylation in further detail, we observed that MM cells show large blocks of hypomethylated regions of low CG content (excluding CGIs) gaining these repressive marks (Figure 30E, highlighted in panels 5 and 6; Figure 31).

Figure 31. DNA demethylation of H3K27me3 Polycomb repressed chromatin in MM. Genome browser snapshot of H3K27me3 ChIP-Seq data (grey) and Whole Genome Bisulfite Sequencing (WGBS, blue) for two independent heterochromatic regions (A and B) in B cell differentiation and MM samples. Genomic loci gaining H3K27me3 in MM patients (grey arrowheads) undergo a passive loss of DNA methylation (blue arrowheads) in regions of low CG content (not overlapping with CGIs). CGI: CpG island.

Such extensive hypomethylation takes place in heterochromatic regions regardless of the particular mode of repression (presence of H3K27me3, presence of H3K9me3 or low signal heterochromatin). Taking these results together, this

105

Results modulation seems just a transition between different types of repressive heterochromatin, with a slight functional effect on transcriptional regulation (Figure 30F, panels 5 and 6).

Finally, H3K36me3 is a histone mark tightly associated with transcription elongation (Sims III and Reinberg, 2009), but has not been related to any regulatory element. Therefore, we did not detect any change in chromatin accessibility (Figure 30D, panel 4) or DNA methylation (Figure 30E, panel 4) in regions with H3K36me3 modulation. However, we observed a high correlation with overexpression of genes associated with regions gaining this histone mark (Figure 30F, panel 4).

Collectively, these findings indicate that MM is characterized by a highly dynamic chromatin landscape, which on the one hand affects heterochromatin without apparent functional impact on the cell, and on the other hand, leads to an emergence of regulatory elements leading to extensive perturbation of the MM transcriptome.

2.3. De novo chromatin activation is a hallmark of MM

To capture the functional interplay between different histone modifications, we integrated ChIP-Seq data of the six histone marks into functional chromatin states, using the ChromHMM package (Figure 32). Chromatin states were learned across 15 normal samples based on genome-wide recurrent combinations of histone modifications. Once generated, this model was applied to MM neoplastic samples.

106

Results

Figure 32. Chromatin state model. A) Schematic representation of the segmentation of the genome into functional chromatin states, according to specific combinations of histone modifications. B) Chromatin states learned across 15 normal samples by ChromHMM multivariate hidden Markov model, based on genome-wide recurrent combinations of histone modifications (Ernst et al., 2011; Ernst and Kellis, 2017). Each entry represents the frequency with which a given mark (columns) is found at the corresponding chromatin state (rows).

Considering the widespread gain of histone modifications in the MM chromatin landscape (Figure 30A), we sought to characterize the functional localization of these epigenetic patterns. Based on the chromatin state model, we observed that MM specific H3K27ac gain was preferentially located in strong enhancer and promoter regions, which lack this activation mark in normal B cells (Figure 33A). As previously described, H3K4me1 is associated with enhancer regions (Figure 33B), H3K4me3 is enriched in promoters (Figure 33C) and H3K36me3 co- localizes with active transcription regions (Figure 33D). Remarkably, these regions were enriched in repressed chromatin marks in normal B cells (represented in gray in the different graphs of Figure 33), suggesting a de novo activation of these regulatory elements in MM from heterochromatic regions in B cells.

107

Results

Figure 33. Chromatin states distribution of genomic regions modulated specifically in MM. Distribution of the different chromatin states in all analyzed samples separately (4 MM and 15 normal B cells) at regions with MM specific gain of A) H3K27ac, B) H3K4me1, C) H3K4me3, D) H3K36me3, E) H3K27me3 and F) H3K9me3. The number of analyzed regions for each histone mark is represented in brackets. Arrowheads highlight the most relevant changes between MM samples and normal B cells.

On the other hand, in terms of heterochromatin-related modifications (H3K27me3 and H3K9me3), MM patients acquire these marks in regions previously annotated as low signal heterochromatin (Figure 33E and F), which is indeed another type of repressive chromatin conformation. These results corroborate that such

108

Results modulation in heterochromatic regions might be just a transition between different types of low accessibility chromatin in MM.

We further aimed to map the specific chromatin state transitions of these regions from normal B cells to MM. We observed that activation of regulatory elements such as enhancers and promoters by the acquisition of H3K27ac in MM was mainly originated from heterochromatin-low signal regions in normal B cells (Figure 34A). Recent epigenomic studies have described that, prior to activation, differentiation-associated promoter and enhancer regions can exist in a primed state, characterized by the presence of H3K4me3 and H3K4me1 respectively (Calo and Wysocka, 2013). Interestingly, these regulatory elements also acquire these “priming” modifications de novo in MM, as we can observe in Figure 34B and C. This means that MM specific chromatin activation does not arise from poised regulatory regions, but is originated de novo in MM cells from repressive chromatin regions. Moreover, active transcription characterized by H3K36me3 acquisition co-locates with heterochromatic regions in normal B cells (Figure 34D), indicating that many deregulated genes are overexpressed in MM patients, as will be further explored in detail. Taking together, these results suggest that MM cells acquire their malignant phenotype through widespread de novo activation of regulatory elements, which leads to deregulation of their transcriptome.

On the other hand, regarding the heterochromatic landscape of MM, we corroborated that there is a transition between different types of repressive chromatin (heterochromatin, Polycomb repressed, and low signal chromatin) (Figure 34E and F), without any apparent transcriptional impact on the cell.

109

Results

110

Results

Figure 34. Chromatin state transitions from B cells to MM. Matrixes representing the transitions among chromatin states from B cells to MM of regions gaining A) H3K27ac, B) H3K4me1, C) H3K4me3, D) H3K36me3, E) H3K27me3 and F) H3K9me3. Each matrix represents the transition percentage of each specific chromatin state in normal B cells (columns, 15 normal B cells) to a different state (chromatin state switch) in MM (rows, 4 MM patients), or the same sate (diagonal, no change). The total matrix represents 100% of the regions. The percentage of regions associated with each chromatin state in MM patients is indicated in the left side bar chart. The most relevant chromatin state transitions are highlighted by arrowheads.

The previous analyses suggest that MM patients acquire widespread chromatin activation, which is specific of these neoplastic cells. In order to detect regulatory elements that could represent MM-specific epigenetic drivers, we designed a more stringent approach to identify the set of what we called de novo activated regulatory regions, arising from heterochromatic domains in normal B cells. These filters included H3K27ac and transcriptional profiles of bone marrow plasma cells (bm-PCs) as normal counterpart for this analysis, from which multiple epigenetic marks could not be generated due to the scarcity of this cell subpopulation in healthy bone marrow. We focused on H3K27ac mark, a bona fide indicator of enhancer and promoter activation (Zentner, Tesar and Scacheri, 2011; Bonn et al., 2012). By calculating the percentage of H3K27ac peak occupancy within each peak region, we identified 12,195 regions that consistently gained H3K27ac de novo in all 14 MM samples as compared to all normal B cells (FDR < 0.05). Approximately 19% of these regions (n = 2,326) arise from repressed chromatin, with less than 20% of H3K27ac peak occupancy in normal B cells, including bm-PC (Figure 35A). Finally, we selected 1,556 of these regions associated to 1,059 genes significantly overexpressed in MM patients compared to normal bm-PC (FDR > 0.05, FC > 1.5) (Figure 35B). In contrast, we identified 4,429 regions with H3K27ac loss in MM, only 8 of them with levels below 20% of H3K27ac peak occupancy in MM cells, filtering 4 regions associated to 6 genes showing the opposite expression pattern (FDR > 0.05, FC < -1.5 compared to bm-PC). These results confirm that de novo chromatin activation is an epigenetic hallmark of MM, with hardly any de novo chromatin inactivation

111

Results

Figure 35. Identification of de novo chromatin activated regions in MM. A) Heatmap of percentage of H3K27ac occupancy in de novo activated regions identified in MM (n = 1,556 regions). B) Heatmap of gene expression levels (indicated as Z-scores) of differentially expressed genes related to de novo activated regions identified in MM (n = 1,059 genes).

3. Identification of TFs involved in MM chromatin modulation

We next aimed to explore whether this de novo chromatin activation is mediated by specific TFs. As occupancy of TFs at regulatory elements is associated with regions of nucleosome depletion and high DNA accessibility (Calo and Wysocka, 2013), from the previously filtered de novo activated regions in MM, we mined the MM-specific ATAC-Seq peaks included in such regions for TFs binding sites. Remarkably, we observed that de novo active chromatin regions in MM were most significantly enriched in binding motifs of IRF, FOX and MEF2 TF families (Figure 36), which have been previously linked to MM pathogenesis (Carvajal-Vergara et al., 2005; Shaffer et al., 2008; Campbell et al., 2010; Bai et al., 2017; Agnarelli, Chevassut and Mancini, 2018).

112

Results

Figure 36. TF binding sequences enriched in MM specific de novo activated regions. TF analysis of the accessible sites of the regions representing de novo gain of H3K27ac in MM. Binding motifs of the most enriched TF families are shown, as well as expression data for selected members of these families in B cells an MM patients, as determined by ssRNA- Seq.

As the binding sites of different members of such TF families are highly conserved and it is difficult to identify which are the main players in MM cells, we further studied their expression levels in B cells and MM patients. Some members of the IRF, FOX and MEF2 families were overexpressed in MM as compared to normal PCs, such as IRF1, FOXO4, FOXP2, MEF2A and MEF2C. Remarkably, FOX family of TF is particularly interesting since they have been described as “pioneer factors” for cell- type-specific transcriptional regulation. As such, they are involved in actively opening the local chromatin to allow other TFs to bind (Wingelhofer et al., 2018). In contrast with MM-specific TFs, IRF4 is closely related to normal PCs differentiation (Ochiai et al., 2013; Shukla and Lu, 2014; Willis et al., 2014) and showed similar mean expression levels in normal and neoplastic PCs. Thus, it seems that extensive chromatin activation in MM may be mediated both by TFs becoming overexpressed

113

Results specifically in the disease and PC-specific TFs. This latter aspect is unexpected because the stringently selected de novo regions characteristic for MM are inactive in normal PCs in spite of the high TF expression, suggesting that PCs exploit and enhance the function of these TFs during myelomagenesis.

4. De novo chromatin activation affects genes related to key pathogenic mechanisms in MM

Beyond the possible causes or potential drivers of such widespread chromatin activation, we sought to characterize the pathogenic relevance of this de novo modulation of epigenetic regulatory elements in MM. In order to do so, we explored the functional categories of target genes associated to these regions by GO and hallmark gene signature enrichments analyses (Figure 37). The results pointed to a variety of functions previously described to play a central role in myelomagenesis (Bommert, Bargou and Stühmer, 2006; Hu and Hu, 2018), but for which the underlying molecular mechanisms were unknown. Such functions included regulation of osteoclast differentiation, oxidative stress responses (Figure 37A), as well multiple signaling pathways, such as NFkB signaling, mTOR1 signaling, p53 pathway and Notch signaling (Figure 37B).

114

Results

Figure 37. A) GO analysis of the genes associated with activated regulatory elements in MM, grouped according to terms similarity (by ReVigo). B) Top 20 hallmark gene signatures obtained by signature enrichments analyses (GSEA).

In the following figures, we show examples of genes with regulatory elements becoming de novo active in MM as compared to normal cells. For instance, in the case of Notch regulatory network, increasing evidence indicates a key role of Notch signaling in MM onset and progression (Jundt et al., 2004; Colombo et al., 2015). However, unlike other Notch-related malignancies, where the majority of patients carry gain-of-function mutations in Notch pathway members, in MM cells Notch signaling is aberrantly activated due to an increased expression of Notch receptors and ligands (Jundt et al., 2004; H. Wang et al., 2014). In line with these results, we identified chromatin activation of genes at different steps of the Notch signaling pathway, starting from de novo active ligands (JAG1), receptor processing machinery (ST3GAL6 and FBXW7), up to downstream targets (HEY1, FZD1 and DEPTOR) (Figure 38), suggesting a chromatin-basis underlying Notch deregulation in MM pathogenesis.

115

Results

Figure 38. De novo activated genes associated to Notch signaling pathway. A) Schematic representation of Notch signaling pathway. B) Genome browser snapshots of gene loci associated to Notch pathway activation. Displayed tracks represent the chromatin state annotation for MM patients and normal B cells (one representative sample for each subpopulation: NBC, t-GCBC, pb-MBC and t-PC). C) Gene expression in normal B cells and MM patients analyzed by ssRNA-Seq.

Additionally, one of the principal cellular mechanisms involved in myelomagenesis is the development of MM-associated bone disease, which is characterized by an increase in osteoclastic bone resorption and a reduction in bone formation (Hameed et al., 2014; O’Donnell and Raje, 2017), creating a osteolytic microenvironment that favors tumor growth. In this context, MM cells develop a unique requirement for the bone marrow microenvironment for their growth and survival (Pinzone et al., 2009). Indeed, we observed de novo activation in MM cells of

116

Results key regulators associated to osteoclastogenesis, including bone morphogenic proteins (BMP6) and their receptors (BMPR2), as well as the main regulator DKK1, or the receptor for the MM survival factor IL6 (IL6R), as shown in Figure 39. As activation of the above mentioned pathways is in part mediated by the crosstalk between MM and microenvironmental cells, our results suggest that such microenvironmental interactions may underlie chromatin activation of downstream genes in MM.

Figure 39. De novo activated genes associated to osteoclastogenesis. A) Gene expression in normal B cells and MM patients analyzed by ssRNA-Seq. B) Genome browser snapshots of gene loci associated to Notch pathway activation. Displayed tracks represent the chromatin state annotation for MM patients and normal B cells (one representative sample for each subpopulation: NBC, t-GCBC, pb-MBC and t-PC). C) Schematic representation of signaling pathways involved in osteoclastogenesis. BMPs: bone morphogenetic protein; MSC: Mesenchymal Stem Cell.

Finally, a remarkable feature of the findings of this doctoral thesis was that MM cells de novo activate a specific regulatory network to be able to survive in spite of elevated oxidative stress conditions. In fact, oncogenic transformation in MM is accompanied by higher endoplasmic reticulum (ER) and oxidative stresses mostly

117

Results because of high immunoglobulin production and high metabolic demands caused by the proliferative activity of neoplastic cells (White-Gilbertson, Hua and Liu, 2013; Lipchick, Fink and Nikiforov, 2016). This regulatory network seems to be related to the emergence of de novo enhancers leading to overexpression of genes involved in major detoxification systems, including GLRX, OXR1, ATOX1, PYROXD2 and TXN (Figure 40). Next, in order to validate the functional implication of such de novo regulatory elements in MM pathogenesis, we focused on TXN-associated enhancer as a representative example of this phenomenon.

Figure 40. De novo activated regulatory network associated to oxidative stress response in MM. A) Venn Diagram highlighting the activation of oxidative stress related genes though the emergence of de novo regulatory regions in MM. B) Genome browser snapshots of gene loci associated to oxidative stress response. Displayed tracks represent the chromatin state annotation for MM patients and normal B cells (one representative sample for each subpopulation: NBC, t-GCBC, pb-MBC and t-PC). C) Gene expression in normal B cells and MM patients analyzed by ssRNA-Seq.

118

Results

5. TXN de novo activated enhancer is an essential regulatory element for MM pathogenesis 5.1. TXN overexpression might be driven by a de novo activated enhancer region in MM

The thioredoxin family is a group of small, ubiquitous, redox-sensitive proteins that, together with glutathione and peroxiredoxins, act as reducing agents to protect cells from oxidative stress. One member of this family is thioredoxin 1, or TXN, which contributes to maintain reactive oxygen species (ROS) homeostasis in the cell to prevent oxidative damage. TXN has been recently shown to enhance cell growth in MM and its inhibition leads to ROS-induced apoptosis in MM cell lines (Raninga et al., 2015; Sebastian et al., 2017).

From the chromatin perspective, the TXN gene body and promoter show activating histone modifications both in MM and normal B cell subpopulations. Remarkably, we identified a de novo active distant enhancer region in MM samples, located approximately 50 kb upstream TXN TSS, which may account for elevated activity of this gene (Figure 41A). Indeed, as previously described in solid tumors (Karlenius and Tonissen, 2010), we found that TXN is overexpressed both in MM patients and cell lines compared to normal B cell differentiation (Figure 41B and C). These findings suggest that TXN overexpression might be driven by de novo activation of a putative enhancer specific of MM patients. We further aimed to characterize the pathogenic implication of this de novo regulatory element in MM cells.

119

Results

Figure 41. TXN enhancer de novo activation in MM. A) Genome browser snapshot of TXN locus and the associated enhancer de novo activated in MM (arrowhead) located at 50 kb from the promoter region. Displayed tracks represent the chromatin state annotation for normal B cells and MM patients. The bottom track represents the H3K27ac profile of KMS11 MM cell line. B) TXN expression in normal B cells, MM patients and cell lines analyzed by ssRNA-Seq and C) RT-qPCR (using GUSβ for gene normalization).

5.2. TXN gene knockout impairs MM cell growth

We started by testing the effects of TXN silencing on MM cell proliferation, using a CRISPR-Cas9 single gRNA system in different MM cell lines. This system has two main components: a constitutively expressing Cas9 MM cell line established from a lentiviral construct carrying the Cas9-2A-Blasticidin expressing cassette, and a single gRNA expressing lentiviral plasmid, harboring a BFP reporter gene. The gRNA targets a site within the coding region of TXN gene, where the Cas9 endonuclease produces a double-strand break (DSB) within the target DNA. In most cases, the resulting DSB is repaired by NHEJ, giving rise to small nucleotide indels in the target DNA, which in turn results in amino acid deletions, insertions, or frameshift mutations in the targeted gene. To discard off-target effects, we designed two independent gRNAs targeting TXN exon 2.

120

Results

Figure 42. CRISPR-Cas9 mediated TXN knockout studies. A) Estimation of the allelic cell population percentage exhibiting indel events in the targeted site, analyzed by Tracking of Indels by DEcomposition (TIDE) web tool in KMS11 and RPMI8226 cell lines. B) Validation of TXN knockout by western blot analysis. C) Cell proliferation assay comparing the growth proliferation rates of scramble cells and cells harboring the gRNA, as determined by flow cytometry analysis. D) Effect of TXN knockout in cell apoptosis, as determined by Annexin V flow cytometry analysis. * p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

Firstly, we transduced the gRNA-BFP lentiviral construct in Cas9-expressing MM cell lines and 72 hours post-infection, we evaluated the levels of BFP+ cells by flow cytometry, achieving over 90% BFP+ cells in all experiments. Editing efficiency was tested by the TIDE web tool (Brinkman et al., 2014), which uses the PCR sequence of the targeted region to identify the major induced mutations in the expected editing site, accurately determining their frequency in a cell population. We observed that the editing efficiency is cell line-dependent (Figure 42A), although efficiently protein knockout was achieved in both cell lines as analyzed by western blot (Figure 42B). In this context, it is important to note that MM cell lines are often aneuploid, carrying more than two copies of the TXN gene locus, although not all of them might render gene transcription. This could lead to variable edition efficiency at a genomic DNA level among different cell lines. Therefore, in this particular case, measuring protein

121

Results expression by western blot is a more precise technique to validate the knockdown of the target gene. Finally, we observed that cells harboring TXN knockout displayed a significantly lower growth rate compared to control cells, with a reduction in proliferation of approximately 70% in cells showing the greatest loss of TXN expression (Figure 42C). Such reduction was coupled with an increased apoptotic phenotype, as measured by Annexin V flow cytometry analysis (Figure 42D). These results corroborate that TXN is an essential gene for MM cell survival and proliferation (Raninga et al., 2015; Sebastian et al., 2017).

5.3. TXN enhancer deletion affects TXN gene expression, decreasing MM cell proliferation

We then wanted to assess if TXN overexpression was driven by the de novo putative enhancer identified in MM patients (Figure 41A). As an experimental tool, we designed a CRISPR-Cas9 paired gRNAs system to completely delete TXN enhancer region in MM cell lines. Similarly to the previous CRISPR-Cas9 strategy, this lentiviral system has two components: a constitutively expressing Cas9 MM cell line and a pair of gRNAs flaking the target enhancer region, each of them cloned in a lentiviral vector carrying a BFP or GFP reporter gene, respectively. Following our chromatin state annotation and the data for H3K27ac, H3K4me1 and ATAC peaks, we designed a pair of gRNAs to delete the entire enhancer region identified in MM patients (~11 kb). Remarkably, in order to confirm the enhancer activation in a MM cell line, we checked the H3K27ac profile from the reference epigenome series published by the Bradley Bernstein lab for the KMS11 MM cell line (ENCSR715JBO). We could observe that KMS11 cell line recapitulates the TXN enhancer activation detected in primary MM samples (Figure 41A), and therefore could be used as in vitro model for further functional studies.

122

Results

Firstly, we co-transduced both plasmids carrying the paired gRNAs in Cas9- expressing MM cell lines and 72 hours post-infection, we evaluated the levels of BFP+ / GFP+ cells by flow cytometry, achieving over 90% of double positive cells in all experiments (Figure 43A). Deletion efficiency was tested qualitatively by genomic DNA PCR, using external primers flanking the genomic targeted region, which detected the presence of a shorter PCR product (325 bp) in the correctly deleted alleles than the wild type uncut alleles (11,467 bp). Although we were not able to amplify the wild type region of more than 11 kb, we could detect the shorter edited allele in both cell lines (Figure 43B). To quantitatively assess the deletion efficiency, we performed genomic DNA qPCR, using one primer internal to the deleted region, which will only amplify the wild-type allele. The relative proportion of wild-type and edited alleles was calculated and normalized to a distal non-edited region, achieving an editing efficiency of approximately 50%, consistent with the loss of the enhancer region in one allele (Figure 43C).

We observed that TXN enhancer deletion in bulk cells significantly reduced target gene expression at RNA level in both cell lines (Figure 43D) and at protein level in one of the cell lines tested (Figure 43E), probably due to differences in the basal TXN protein expression levels in each cell line. These results validate the existence of disease-specific de novo enhancer activation in MM, which leads to an aberrant transcriptional modulation in these cells. Remarkably, when monitoring cell growth of the double BFP+ / GFP+ population by flow cytometry, we observed that TXN enhancer deletion induced a significant decrease in the proliferation rate of these cells (Figure 43F) due to the reduced TXN expression levels.

123

Results

Figure 43. CRISPR-Cas9 mediated deletion of TXN enhancer. A) Fraction of double BFP+ / GFP+ cells analyzed by flow cytometry prior to (grey) and after paired gRNAs transduction (blue). The x-axis represents GFP+ cells, and the y-axis represents BFP+ cells. Upper right quadrant shows double BFP+ / GFP+ cells. The proportion (%) of cell number after paired gRNAs transduction (blue population) is shown in each quadrant. B) Qualitative analysis of TXN enhancer deletion by electrophoresis of genomic DNA PCR products. Wild type genomic DNA (Scr) and water (NTC) templates are used as positive and negative controls, respectively. The green arrowhead indicates the size of PCR products expected from the deleted allele. No amplification was achieved from the wild type allele due to the large size of the amplicon (11 kb) C) Quantitative analysis of TXN enhancer deletion by qPCR of genomic DNA normalized to a distal non-targeted genomic region. D) TXN mRNA expression levels determined by RT-qPCR, using GAPDH for gene normalization. E) TXN protein levels determined by western blot. F) Relative cell proliferation rate of edited cells normalized to the scramble gRNA. MW: Molecular Weight. * p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

Taken together, our results extend previous reports on the essential role of TXN in MM and show that not only TXN is an essential oncogene for MM pathogenesis, but TXN enhancer is an essential oncogenic regulatory element activated de novo in MM, which underlies TXN pathogenic impact in these cells.

124

Results

6. Identification of activated chromatin blocks specific of MM

The previous analyses revealed an extensive deregulation of the MM chromatin landscape, which affects relevant regulatory elements as well as functional pathways for the pathogenesis of this disease. However, beyond the activation of specific genes affecting relevant pathogenic mechanisms in MM, we also identified chromatin stretches or “blocks” containing more than one gene that can become de novo active in MM. Such activated chromatin blocks (ACBs) can coordinate the deregulation of more than one target gene without evident functional association. To capture these ACBs, from the most stringent selection of genes de novo activated in MM, we selected those pairs of genes adjacent in the genome (separated by < 1 Mb) and analyzed if their gene expression was significantly correlated (R > 0.5, p < 0.05), obtaining 41 pairs of genes and one triplet listed in Table 16.

Table 16. Genes included in ACBs

Gene pairs Gene triplets

Gene 1 Gene 2 Gene 1 Gene 2 Gene 1 Gene 2 Gene 1 Gene 2 Gene 3

ADAM10 FAM63B GLRX2 CDC73 PELI1 LGALSL TSEN15 C1orf21 EDEM3

ADAM22 SRI GMPR ATXN1 PERP KIAA1244

ADIPOR1 CYB5R1 HMGN3 LCA5 PRDM5 NDNF

AK3 RCL1 IFRD2 NAT6 PURA IGIP

ANKS6 GALNT12 IQGAP2 F2R RDX FDX1

APOL2 APOL1 ITGAV FAM171B RHOQ CRIPT

ARHGEF12 GRIK4 MARVELD2 OCLN RNF133 RNF148

BRMS1L MBIP METTL5 UBR3 RRAGD ANKRD6

COMMD3 BMI1 MKX ARMC4 SEC63 OSTM1

CSRP1 NAV1 MSANTD4 KBTBD3 SPTLC1 IARS

FAM219B COX5A NDUFC2 ALG8 STAT4 MYO1B

FANCC PTCH1 NIT2 TOMM70A TK2 CKLF

FECH NARS NXPE1 NXPE4 UBA5 NPHP3

GLCCI1 ICA1 OSBPL1A IMPACT

125

Results

Next, in order to analyze the functional implication of this genomic association of ACBs, we focused on two adjacent genes representative of this phenomenon: PRDM5 and NDNF. On one hand, PRDM5 is a TF member of the PRDM family that in contrast to PRDM1 is not expressed in normal PCs but is reported to act as a tumor suppressor gene in solid tumors (Deng and Huang, 2004; Duan et al., 2007; Watanabe et al., 2007; Cheng et al., 2010; Shu et al., 2011; Tan et al., 2014; Liu et al., 2016); on the other hand, NDNF which is a gene involved in neuron biology (Kuang et al., 2010; Ohashi et al., 2014).

7. PRDM5 is a new candidate oncogene in MM 7.1. PRDM5 and NDNF are consistently overexpressed and hypomethylated in MM patients

To properly interpret the pathogenic relevance of ACBs, we sought to characterize the functional implication of the ACB comprising PRDM5 and NDNF in MM pathogenesis. These two adjacent genes are included in a ~500 kb region showing clear de novo activation in MM (Figure 44A). We first analyzed their expression levels by ssRNA-Seq (Figure 44B and C) and validated these results in an additional cohort of MM patients and cell lines by RT-qPCR (Figure 44D and E).

In addition to their co-activation, we observed a consistent upregulation of both transcripts specifically in MM, with negligible expression levels in bone marrow and tonsillar PCs. Moreover, the expression level of both genes was significantly correlated in both sample cohorts analyzed (Figure 44F and G), validating the previous filters applied for the selection of ACBs candidates.

126

Results

Figure 44. PRDM5 and NDNF co-activation in MM. A) Genome browser snapshot of PRDM5 and NDNF loci. Displayed tracks represent the chromatin state annotation for normal B cells and MM patients. The bottom track represents the H3K27ac profile of KMS11 MM cell line. B) PRDM5 and C) NDNF erexpression in MM patients, analyzed ssRNA-Seq. D) PRDM5 and E) NDNF expression in an independent cohort of MM patients and cell lines determined by RT-qPCR, using GUSβ for gene normalization. F) Correlation of PRDM5 and NDNF expression levels from ssRNA-Seq data. G) Correlation of PRDM5 and NDNF expression levels from RT-qPCR data. MM samples vs t-PC: * p < 0.05, ** p < 0.01, *** p < 0.001. MM samples vs bm-PC: # p < 0.05, ## p < 0.01, ### p < 0.001, ns: not significant.

Next, in order to assess the epigenetic status of both gene loci, we checked the H3K27ac profile from the reference epigenome series published by the Bradley Bernstein lab for the KMS11 MM cell line (ENCSR715JBO). We could confirm that the ACB comprising PRDM5 and NDNF genes is also active in KMS11 cell line. Moreover, we determined the DNA methylation levels of the promoter region of both genes by

127

Results

WGBS (Figure 45A and B) and validated these results by pyrosequencing in an additional sample cohort of MM patients and cell lines (Figure 45C and D). As expected, both promoters underwent a progressive gain of DNA methylation throughout the process of B cell maturation to bm-PCs, followed by a significant loss of DNA methylation in MM patients and cell lines. Remarkably, these results corroborate that the MM cell lines analyzed recapitulate this epigenetic co-activation of PRDM5 and NDNF loci, and therefore could be used as in vitro models for further functional studies.

Figure 45. PRDM5 and NDNF promoter DNA methylation analysis. A) Genome browser snapshot of Whole Genome Bisulfite Sequencing (WGBS) data for PRDM5 and B) NDNF promoter regions. Displayed tracks represent the DNA methylation levels (from 0 to 1) of each CpG dinucleotide. Regions analyzed by bisulfite pyrosequencing are highlighted in a blue box. CGI: CpG island. C) Bisulfite pyrosequencing of PRDM5 and D) NDNF promoter regions in B cell differentiation, MM patients and MM cell lines. For each sample, the mean and standard deviation of 3 CpGs dinucleotides in each replicate is represented. MM samples vs bm-PC: * p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

128

Results

7.2. PRDM5 and NDNF are independent transcripts in MM

Considering the strong correlation between PRDM5 and NDNF transcripts (Figure 44F and G), we aimed to verify whether both genes are transcribed independently in MM patients, discarding the presence of a single mRNA fusion transcript, as it was previously reported for the MDS1 and EVI1 Complex Locus (MECOM) on 3q26 (Hinai and Valk, 2016). We mined the ssRNA-Seq data from MM patients using the STAR- Fusion package for identifying fusion transcripts from RNA-Seq reads (Haas et al., 2017; Hsieh et al., 2017), discarding the presence of a fusion transcript between PRDM5 and NDNF in MM samples. Moreover, we further designed PCR primers in each of the coding exons of both PRDM5 and NDNF genes (Figure 46A), and analyzed by gel-PCR all the possible combinations in three different cell lines (MM1S, KMS11 and KMS12), together with positive controls for specific PRDM5 (165 bp) and NDNF (181 bp) transcript amplification. We could only detect product amplification using the forward primer designed in the 3rd exon of NDNF, and the reverse primer localized in the 1st exon of PRDM5 (Figure 46B). When this PCR product was further sequenced and aligned to the genome, it corresponded to unspecific amplification of NSUN4 gene. These results together corroborate that PRDM5 and NDNF are two independent transcripts in MM patients.

129

Results

Figure 46. Analysis of PRDM5 and NDNF fusion transcript. A) Schematic representation of PCR primer design strategy for the identification of PRDM5 and NDNF fusion transcript. Four forward primers were designed in each NDNF coding exon (named N1 to N4), and sixteen reverse primers were designed in each PRDM5 coding exon (named P1 to P16). B) Analysis of PCR product amplification by gel electrophoresis. In this example, the forward primer located in the 3rd exon of NDNF is combined with all sixteen possible reverse primers located in each PRDM5 exon. Only the primer pair N3-P1 yielded a PCR product of ~400 bp (lane 1). Specific primers for PRDM5 and NDNF transcripts are used as positive amplification controls (165 and 181 bp respectively). MW: Molecular Weight.

7.3. PRDM5 does not regulate NDNF expression

PRDM5 is a TF of the PR-domain protein family, which has been described to regulate gene transcription by recruitment of histone modifier enzymes or other TFs (Duan et al., 2007; Shu et al., 2011; Galli et al., 2012; Shu, 2015). Due to their highly correlated expression levels, we hypothesized that PRDM5 might be regulating the expression of its ACB partner NDNF. To characterize this association, we first calculated PRDM5 consensus binding sequence from previously available data (Galli et al., 2012, 2013) (Figure 47A) and mapped it to NDNF gene body and promoter region (-1.5 kb from TSS), identifying one putative PRDM5 binding site (GGAGAGCCGG) in NDNF promoter.

130

Results

Figure 47. Luciferase reporter assay to analyze PRDM5 regulatory potential of NDNF expression. A) Slogo representation of PRDM5 consensus binding sequence. B) Schematic representation of the luciferase reporter assay designed to assess the potential role of PRDM5 transcription factor in NDNF overexpression. Luciferase gene expression is driven by NDNF promoter (NDNFpr-WT). If PRDM5 is regulating NDNF expression, transcription factor knockdown (mediated by shRNAs) or deletion of its putative binding sequence (NDNFpr-mut, deleted region is represented as a white gap) would impair the transcription of luciferase reporter gene. C) Relative renilla/luciferase units (RLU) for empty vector (pGL4.17), NDNF wild type promoter (NDNFpr-WT) and NDNF promoter harboring PRDM5 binding site deletion (NDNFpr-mut). Values are normalized to the pGL4.17 empty vector. D) PRDM5 expression levels after shRNA expression determined by RT-qPCR, using GUSβ for gene normalization. Expression values are normalized to the same condition prior to addition of doxycycline. E) Modulation of renilla/luciferase values (RLU Fold Change) after shRNA mediated PRDM5 knockdown. Values are normalized to the same condition prior to addition of doxycycline. F) Luciferase reporter activity (RLU) for NDNF wild type promoter (NDNFpr-WT) and NDNF promoter harboring PRDM5 binding site deletion (NDNFpr-mut). * p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

Next, we used a luciferase reporter assay to assess the potential role of PRDM5 transcription factor in NDNF overexpression. In this system, luciferase gene expression was driven by NDNF promoter region containing the predicted PRDM5 binding site (NDNFpr-WT), or harboring a deletion of this putative binding site as control (NDNFpr-Mut). If PRDM5 is regulating NDNF expression, knockdown of this TF

131

Results

(mediated by shRNAs) or deletion of its putative binding sequence (represented as a white gap in NDNFpr sequence) would impair the expression of the luciferase reporter gene (Figure 47B). This luciferase reporter assay was performed in HET293T cells due to the low transfection rate achieved in MM cell lines. Firstly, HEK293T cells were transduced with lentiviral vectors harboring inducible shRNAs targeting PRDM5 gene (shPRDM5 #1 and shPRDM5 #2) or a scramble shRNA targeting GFP gene (shGFP). Each condition was then co-transfected with the NDNFpr-WT or NDNFpr- mut luciferase reporter vectors and a renilla luciferase vector as internal control. Simultaneously, shRNA expression was induced by adding doxycycline to the culture medium. We observed that the NDNF promoter fragment cloned in the luciferase reporter vector was sufficient to provide a ~20-fold activation of luciferase/renilla activity (relative luciferase units, or RLUs) above background level from the empty vector control, both for wild type and mutated promoters under basal conditions (Figure 47C). However, PRDM5 transcription factor knockdown after doxycycline addition (Figure 47D and E), or deletion of its putative binding sequence (Figure 47F) did not significantly alter luciferase activity. These results discard the regulatory role of PRDM5 transcription factor in NDNF expression.

7.4. PRDM5 and NDNF chromatin block is topologically remodeled in MM

We next sought to characterize the chromatin topology of this ACB by comparing the linear proximity of both gene loci by 4C-Seq in a MM cell line expressing both transcripts (U266) and a mantle cell lymphoma cell line (JVM-2) lacking this co- activation. We identified three-dimensional interactions between PRDM5 and NDNF gene loci only in U266 MM cells, suggesting that the topological remodeling of this region specifically in MM might be related to their coordinated activation of both genes in these cells (Figure 48).

132

Results

Figure 48. Three-dimensional chromatin interactions between PRDM5 and NDNF gene loci measured by 4C-Seq. Normalized levels of chromatin interaction frequencies from the indicated viewpoint (dark blue arrowhead), as analyzed by 4C-Seq in JVM-2 mantle lymphoma cell line lacking PRDM5 and NDNF expression, and U266 MM cell line expressing both transcripts. Lowest contact frequencies are indicated in turquoise and highest frequencies in red. The black line shows the median contact frequencies in a window of 5 kb.

7.5. PRDM5 knockdown inhibits cell proliferation and induces apoptosis in MM cell lines

Finally, in order to identify whether these physically-related, but functionally- independent genes are involved in MM pathogenesis, we first performed PRDM5 knockdown studies mediated by doxycycline-inducible shRNAs (shPRDM5) in different MM cell lines. PRDM5 knockdown was validated by RT-qPCR (Figure 49A) and western blot (Figure 49B) to ensure the correct silencing of the target gene at both RNA and protein level respectively. We observed that PRDM5 knockdown induced a significant decrease in cell proliferation rate by MTS assay (Figure 49C), with an increase of Annexin V positive cells as measured by flow cytometry (Figure 49D) in all cell lines tested (MM1S, KMS12 and KMS11).

133

Results

Figure 49. PRDM5 knockdown studies. A) Validation of PRDM5 knockdown after shRNA expression determined by RT-qPCR, using GUSβ for gene normalization. Expression values are normalized to the same condition prior to addition of doxycycline. Statistical analysis compares the effect of each shRNA versus the scramble shRNA (shGFP). B) PRDM5 protein levels determined by western blot. C) Relative cell proliferation rate normalized to the same condition prior to addition of doxycycline. Statistical analysis compares the effect of each shRNA versus the scramble shRNA (shGFP). D) Effect of PRDM5 knockdown in cell apoptosis, as determined by Annexin V flow cytometry analysis.* p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

In order to decipher the specific function of PRDM5 in MM pathogenesis, we performed a transcriptomic characterization by low input 3’ end RNA-Seq of KMS11 cells harboring the shPRDM5 before and after the addition of doxycycline, for two different periods of time: two and four days. Both shPRDM5 were analyzed as replicates to minimize off-target effects. We found 91 and 73 differentially up- and downregulated genes, respectively (|FC| > 1.5, p < 0.05) after 2 days of shRNA expression induction (Figure 50A). We could identify differential expression of genes

134

Results previously related to MM cell proliferation and apoptosis (including CXCR3 (Giuliani et al., 2006), USP1 (Das et al., 2017) and CASP7) or cellular transport and protein folding (including RAB1A, MIB1, EPN1 and CLU among others) as detailed in Figure 50B and C. When analyzing transcriptomic changes after 4 days of induction with doxycycline, we found 1,215 differentially expressed genes (|FC| > 1.5; p < 0.01) (Figure 50D), which were significantly associated with DNA replication, ER stress and

unfolded protein response (UPR) (Figure 50E).

Figure 50. Identification of PRDM5 potential mechanisms by transcriptomic analysis by RNA-Seq. A) Heatmap of gene expression levels (indicated as Z-scores) of differentially expressed genes before and after 2 days of the addition of doxycycline for shRNA expression induction (n = 2 for control and Dox groups, 164 genes). B) Examples of down- and C) upregulated genes before and after the addition of doxycycline. D) Heatmap of gene expression levels (indicated as Z-scores) of differentially expressed genes before and after 4 days of the addition of doxycycline (n = 4 for control and Dox groups, 1,215 genes). E) GO terms significantly enriched (FDR < 0.05) of differentially expressed genes 4 days after the addition of doxycyclin. Dox: Doxycycline. * p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

135

Results

As highly secretory antibody-producing cells, MM cells have a physiological induction of the UPR as they require an increased capacity to cope with unfolded proteins, which makes them particularly sensitive to ER stress and deregulation of protein turnover (Vincenz et al., 2013; Wallington-Beddoe and Pitson, 2017). The previous results suggest that PRDM5 is involved in protecting proliferating MM cells against cellular stress created by high protein synthesis, and therefore, when silenced, protein homeostasis is disturbed by deregulation of target components of the protein quality control machinery, leading to an accumulation of misfolded

proteins and induction of UPR in MM cells.

7.6. NDNF knockdown slightly impairs MM cell growth in a cell line-specific manner

We further aimed to characterize the functional implication of PRMD5 neighbor gene, NDNF, using a doxycycline-inducible shRNA system, as previously described. We validated NDNF knockdown only by RT-qPCR (Figure 51A), as we could not analyze protein levels due to the lack of a specific NDNF antibody. We could not get conclusive results with these experiments, as we obtained inconsistent phenotypes regarding cell proliferation (Figure 51B) and apoptosis (Figure 51C) with the different shRNAs designed.

136

Results

Figure 51. NDNF knockdown studies. A) Validation of NDNF knockdown after shRNA expression determined by RT-qPCR. Expression values are normalized to the same condition prior to addition of doxycycline. Statistical analysis compares the effect of each shRNA versus the scramble shRNA (shGFP). B) Relative cell proliferation rate normalized to the same condition prior to addition of doxycycline. Statistical analysis compares the effect of each shRNA versus the scramble shRNA (shGFP). C) Effect of NDNF knockdown in cell apoptosis, as determined by Annexin V flow cytometry analysis.* p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

As an alternative approximation to achieve NDNF knockdown in MM cells, we designed a CRISPR-Cas9 paired gRNAs system to completely delete NDNF genomic promoter region in MM cell lines. We chose the paired gRNAs system over the single gRNA system, due to the lack of NDNF specific antibody to validate gene knockdown in the single gRNA system. Additionally, the deletion of the promoter region has two main advantages over the alternative of removing the entire gene sequence: firstly,

137

Results the region to be deleted can be in the size range of 1 kb, enabling higher knockdown efficiency than those of 10 to 100 kb required for removing the whole gene loci (Canver et al., 2014); secondly, by deleting smaller regions, we can be more confident that the observed phenotypic effects are due to loss of gene transcription and not an unintended consequence of deleting overlapping genomic regulatory elements located within the gene body.

As previously described, this CRISPR-Cas9 paired gRNAs lentiviral system has two components: a constitutively expressing Cas9 MM cell line, and a pair of gRNAs flanking the target region, each of them cloned in a lentiviral vector harboring a BFP or GFP reporter gene, respectively. We designed two distinct paired gRNAs intended to remove varying-size fragments (841 and 1,385 bp respectively) including the promoter and TSS of the target gene. Our selection was guided by both Genecode gene annotation and our data for H3K27ac, H3K4me3 and ATAC peaks located in this region. We co-transduced both plasmids harboring the paired gRNAs in Cas9- expressing MM cell lines and 72 hours post-infection, we evaluated the levels of BFP+ / GFP+ cells by flow cytometry, achieving over 90% double positive cells in all experiments. Deletion efficiency was tested qualitatively by genomic DNA PCR, using external primers flanking the genomic targeted region, which detect the presence of a shorter PCR product (824 bp and 249 bp respectively for each pair of gRNAs) in the correctly deleted alleles than the wild type uncut alleles (1,665 pb) (Figure 52A). NDNF promoter deletion in bulk cells should result in loss of RNA levels of the target gene. Therefore, validation of NDNF knockdown was assessed by RT-qPCR across the cell lines tested (Figure 52B). Finally, by monitoring cell growth of the double BFP+ / GFP+ population by flow cytometry, we observed that NDNF knockdown induced a slight decrease in the proliferation rate only in MM1S cell line of all MM cell lines tested (Figure 52C).

138

Results

Figure 52. CRISPR-Cas9 mediated deletion of NDNF promoter. A) Qualitative analysis of NDNF promoter deletion by electrophoresis of genomic DNA PCR products. Wild type genomic DNA (Scr) and water (NTC) templates are used as positive and negative controls, respectively. Grey and pink arrows indicate the size of PCR products expected from wild type and deleted alleles respectively. B) NDNF mRNA expression levels determined by RT-qPCR after promoter deletion. C) Relative cell proliferation rate of edited cells normalized to the scramble gRNA. MW: Molecular Weight. * p < 0.05, ** p < 0.01, *** p < 0.001, ns: not significant.

139

Results

Taken together, these results suggest that this de novo activation of the ACB comprising PRDM5 and NDNF drives the overexpression of both genes specifically in MM. However, only PRDM5 continued overexpression lead to oncogene dependence (Weinstein, 2002), presumably by maintaining protein homeostasis and preventing ER stress in MM cells.

140

DISCUSSION

“Important thing in science is not so much to obtain new facts as to discover new ways of thinking about them.”

(Sir William Bragg)

Discussion

1. General aspects

Since the early insights into the central role of genomic alterations in cancer development, the study of the mechanisms underlying tumorigenesis has been focused on the genetic landscape of the cell (including mutations, genomic rearrangements and copy number variations) (Stratton, Campbell and Futreal, 2009). However, MM is characterized by a wide heterogeneity in the chromosomal and genetic changes present in tumor plasma cells, without a single and unifying genetic event identified among all patients (Robiou du Pont et al., 2017).

Noteworthy, over the past decades, breakthrough discoveries in the epigenetics field and technological advances in high throughput technologies, have transformed our knowledge of the inheritance of gene expression patterns in both normal and neoplastic cells (David Allis and Jenuwein, 2016). In spite of the multifaceted nature of the epigenome, the cancer epigenomics field has been traditionally focused on the association between DNA methylation and transcription (Esteller, 2008). Numerous studies published so far in MM have identified alterations in the DNA methylome of tumor plasma cells (Walker et al., 2011; Heuck et al., 2013; Kaiser et al., 2013; Agirre et al., 2015). However, this aberrant landscape has only revealed limited insights into gene regulation and stills insufficient to fully characterize MM biological complexity. In this doctoral thesis, we have gone beyond this classical model of cancer epigenetics, and performed an integrative characterization of the MM reference epigenome as defined by the guidelines of the IHEC (http://ihec- epigenomes.org/research/reference-epigenome-standards/). Such integrative multi- epigenomics approach can provide a global view of the genome function and regulation, as it has been recently shown in various cancer types such as chronic lymphocytic leukemia, acute myeloid leukemia or gastric adenocarcinoma (Martens et al., 2010; Ooi et al., 2016; Beekman et al., 2018). Regarding MM, recent efforts have focused on the analysis of Polycomb-mediated gene repression (Agarwal et al.,

145

Discussion

2016) and regulatory networks driven by super-enhancer elements (Jin et al., 2018). However, the chromatin landscape underlying aberrant cellular functions in MM has just started to be characterized. In the present study, we have performed an integrative analysis of multiple epigenetic layers, including whole genome maps of six histone modifications with non-overlapping functions (H3K27ac, H3K4me1, H3K4me3, H3K36me3, H3K27me3 and H3K9me3), chromatin accessibility, DNA methylation and gene transcription of purified primary MM plasma cells. The results derived from these analyses have allowed us to achieve a more profound understanding of the molecular mechanisms underlying myelomagenesis, in the context of normal B cell differentiation. We are aware that MM is a genetically and clinically heterogeneous disease, and that our fully-characterized patient cohort is not informative to define biological variability in MM. However, we could identify common MM-specific signatures for multiple epigenetic marks, revealing the existence of a core epigenomic landscape underlying MM pathogenesis. Some of the common epigenetic principles defined in this doctoral thesis will be discussed in the following sections.

2. Chromatin activation as a hallmark for MM patients 2.1. Epigenetic memory of MM tumor plasma cells

In the course of this doctoral thesis, the strategy of analyzing MM chromatin modulation from the perspective of the entire B cell maturation process has allowed us to better characterize MM pathogenesis. During cellular differentiation, developmental signals induce changes in gene expression and chromatin structure patterns, which define cell identity and can be maintained through subsequent cell divisions (D’Urso and Brickner, 2014). Recent studies have suggested that the malignant transformation of a normal cell into a tumor cell displays some similarities to the reprogramming process of a somatic cell into a pluripotent cell. In this context, the epigenetic landscape is a potent barrier which dramatically changes during

146

Discussion malignant transformation, as it does in cellular reprogramming (Suva, Riggi and Bernstein, 2013; Kim and Costello, 2017). In line with this hypothesis, we have observed that the chromatin landscape as a whole is completely deregulated in MM, affecting both euchromatic and heterochromatic regions as compared to normal B cells. However, we could identify epigenetic imprints of the cellular origin of this neoplasm. In fact, MM cells exacerbate epigenetic patterns of chromatin activation already present in normal PCs. This partial maintenance of their original identity in tumor cells has been already described in solid and hematological tumors (Halley- Stott and Gurdon, 2013; Beekman et al., 2018), giving insight into their cell of origin. This phenomenon has been termed as “epigenetic memory”, which is defined as the retention of gene transcription patterns after the induction of a new transcriptional program regulated by epigenetic mechanisms (Halley-Stott and Gurdon, 2013). Interestingly, these patterns are only evident for active chromatin marks and chromatin accessibility, whereas for heterochromatic regions (H3K27me3 and H3K9me3) the global resemblance to normal PCs is lost, suggesting that not all epigenetic modifications seem to hold epigenetic memory in MM.

This link between normal B cell differentiation and MM tumor PCs further invites us to wonder if there could be alterations in the chromatin regulation of B cell maturation in MM patients. The identification of an aberrant differentiation program in the B cell compartment of MM patients could lead to new insights into the neoplastic transformation.

Besides the epigenetic memory retained from the tumor normal counterpart, epigenetic marks might be also maintained though cancer progression, and represent the history of tumor evolution. Indeed, profiling intratumoral heterogeneity provides a powerful tool to trace back the formation of the malignancy and reconstruct tumor evolution (Mazor et al., 2016). Recently, the study of the evolutionary history of tumors has been extended from genome to epigenome landscape. In fact, in prostate

147

Discussion cancer (Brocks et al., 2014) and brain tumors (Mazor et al., 2015), DNA methylation patterns of tumor evolution are remarkably similar to those obtained by looking at genomic alterations. However, MM onset and progression have always been analyzed from the genomics point of view, yielding to different models: the “static progression model”, where tumor cells evolve in a linear pathway though gradual acquisition of genetic alterations that lead to an increasingly aggressive phenotype; and the “spontaneous model”, based on the presence of multiple clones at baseline, which undergo a branching pattern of clonal evolution with alternating clones and subclones at different time points during the disease course (Fakhri and Vij, 2016; Bolli et al., 2018). Given the key role of epigenetic alterations in MM development, it will be interesting for future research to characterize the evolution of the epigenomic landscape in MM samples, both at diagnose and though disease progression. Moreover, the advancement of single-cell genome-wide profiling techniques will allow us to precisely characterize the genomic and epigenomic landscape of MM samples at a single cell level (Clark et al., 2016; Kelsey, Stegle and Reik, 2017; Lo and Zhou, 2018). As this field continues to develop, it has the potential to transform our understanding of tumor evolution and have a direct impact on therapeutic decisions.

2.2. Potential epigenetic drivers of MM chromatin activation

Additionally to the maintenance of an epigenetic footprint of the cell of origin, the MM chromatin landscape is also remodeled independently of B cell differentiation. We have identified that the extensive de novo chromatin activation of MM cells might be a unifying molecular principle of this hematological disease. Such chromatin remodeling drives the reprogramming of regulatory elements, including de novo activated promoter and enhancer regions. Whereas the transcriptional memory of the cell frequently maintains previously expressed genes primed for reactivation (D’Urso and Brickner, 2014), the majority of these activated regions arise from heterochromatic loci in normal B cells. Therefore, these modulatory patterns

148

Discussion seem to be independent of the B cell maturation program, and could be considered onco-epigenetic events essential for the neoplastic transformation.

A major question in cancer epigenomics involves the identification of potential drivers of chromatin modulation. However, the landscape of genetic alterations related to epigenetic regulators in a small subset of MM patients is insufficient to explain the widespread chromatin activation that we have defined as a hallmark for this disease (Pawlyn et al., 2016; Dupéré-Richer and Licht, 2017). Nevertheless, recent studies have proposed the attractive hypothesis that, in cells in which Tet enzymes are expressed, active DNA demethylation through the 5hmC intermediate is necessary for the initial steps of chromatin activation, with 5hmC being enriched at active enhancer regions (Sérandour et al., 2011; Stroud et al., 2011). The global and progressive hypomethylation observed from MGUS to MM patients (Agirre et al., 2015), together with the overexpression of TET enzymes in both MGUS and MM cells, suggest that an active widespread DNA demethylation could be one of the leading events in the neoplastic transformation. In this context, 5hmC could lead to chromatin activation by counteracting the heterochromatin state or by recruiting unidentified effector proteins. However, this is just one potential mechanism that should be further investigated.

Noteworthy, we have shown that de novo chromatin modulation in MM seems to rely on a relatively small subset of TF families, including IRF, FOX and MEF2, which have been previously reported to be functionally important for MM pathogenesis (Carvajal-Vergara et al., 2005; Shaffer et al., 2008; Campbell et al., 2010; Bai et al., 2017; Agnarelli, Chevassut and Mancini, 2018). Thus, inhibition of these TFs represents a rational therapeutic approach to revert aberrant chromatin activation in MM patients.

149

Discussion

This hypothesis is supported by recent studies suggesting that IRF4, besides being a critical TF for B cell differentiation, also behaves as a master regulator of the aberrant expression program in MM (L. Wang et al., 2014). In fact, IRF4 knockdown is lethal for MM cells, which are addicted to the regulatory network aberrantly modulated by this TF, including the overexpression of MYC oncogene (Shaffer et al., 2008). Moreover IRF4 mRNA expression levels are a prognostic marker for poor survival in MM patients (Heintel et al., 2008). The dependency of MM cells on IRF4 expression is described as a “non-oncogene addiction”, which is the increased dependence of cancer cells on the normal cellular functions of certain genes, not considered classical oncogenes (Solimini, Luo and Elledge, 2007). In fact, IRF4 is not genetically altered in most patients, and its overexpression itself is not sufficient for MM development (Yanai, Negishi and Taniguchi, 2012). In the light of the results of this doctoral thesis, different plausible mechanisms underlying this “non-oncogenic addiction” can be considered. One could be that the open chromatin structure of MM patients allows IRF4 access to loci normally silenced under physiological conditions. Thus, the broadened genetic repertoire of IRF4 targets, together with an increased TF activity due to IRF4 overexpression, could lead to the aberrant deregulation of the transcriptional program of the cell. On the other hand, IRF4 and its cooperative binding partner BATF, have been proposed to act as essential pioneer factors for stablishing chromatin accessibility in T-helper cells (Li et al., 2012; Mahnke et al., 2016; Carr et al., 2017). Whereas most TFs cannot access their target sequences in compacted chromatin, pioneer factors can access their target sequences and actively open up the local chromatin to make it accessible for other factors to bind (Zaret and Carroll, 2011). Consistent with our results, IRF4 could behave has a pioneer factor in MM cells, leading to the widespread remodeling and activation of multiple regulatory elements. However, additional factors are required for MM development, as IRF4 is already upregulated in the premalignant MGUS disorder as compared to normal PCs (Bai et al., 2017). An integrative characterization

150

Discussion of the chromatin landscape modulation through MGUS to MM evolution could shed light into the role of IRF4 in MM development and progression, and the identification of novel epigenetic drivers underlying myelomagenesis.

Both previous hypothesis are not mutually exclusive, and describe an altered IRF4 regulatory landscape closely related to the MM epigenome. This context offers considerable promise for therapies targeting IRF4 inhibition, which may revert chromatin activation and represent a rational therapeutic option for MM. Although TFs have been traditionally considered undruggable therapeutic targets, recent promising advances have been achieved in targeting TP53 (Yu et al., 2014) and BCL6 (Cerchietti et al., 2010) using small molecule inhibitors. Moreover, the design of antisense oligonucleonides (ASOs) for IRF4 selective knockdown is currently emerging as an attractive therapeutic strategy for the treatment of MM and other IRF4-dependent malignancies (Zhou et al., 2017).

2.3. Functional relevance of the MM enhancer signature associated to oxidative stress

The altered chromatin landscape in MM affects genes involved in a variety of signaling pathways and cellular responses previously reported to play a central role into myelomagenesis (Bommert, Bargou and Stühmer, 2006; Hu and Hu, 2018). Noteworthy, recent reports suggest that genes implicated in cell identity, DNA repair or other adaptive processes require complex regulatory circuits, and their expression is regulated by the combination of multiple regulatory elements and TFs (Zabidi et al., 2015). Understanding the mechanisms underlying the chromatin regulatory network, and remarkably the enhancer signature, is currently an area of great interest, as there is increasing evidence of its relevance not only in development but also in evolution and disease (Dawson and Kouzarides, 2012; Lara-Astiaso et al., 2014; Beekman et al., 2018). Although the general features of enhancer regulation

151

Discussion are the same under physiological or neoplastic conditions, tumor cells tend to lose active enhancer near differentiation-associated genes (Agirre et al., 2015), and gain them near genes related to cell growth (Hnisz et al., 2013). In line with these results, we have identified an activated enhancer signature specific for MM cells, which is characterized by the de novo activation of these regulatory regions which arise from heterochromatic domains in normal B cells, and are therefore independent from the B cell maturation process. Remarkably, recent studies have characterized the nucleosome occupancy maps in cell free DNA from plasma and serum samples, identifying cellular footprints from healthy hematopoietic lineages (Snyder et al., 2016). As the enhancer signature of MM differs from the physiological landscape, the analysis of cell free DNA accessibility patterns could be a clinically useful biomarker for the early detection and monitoring of MM disease.

A large number of aberrant patterns of enhancer activation have been observed in tumor cells (Herz, Hu and Shilatifard, 2014; Sur and Taipale, 2016; Sengupta and George, 2017), as well as MM patients and cell lines (Jin et al., 2018). In line with these results, a remarkable feature of the findings of this project is that MM cells de novo activate regulatory elements of genes involved in preventing cell death associated with an adaptive response to oxidative stress. In most of the mentioned studies, the functional role of such regulatory elements has hardly been experimentally tested, and it is still currently unclear to what extent the enhancer defining marks reflect true enhancer activity and its regulatory potential. Noteworthy, we did not only characterize the aberrant activation of the MM specific enhancer landscape, but also validated the functional relevance of one representative case by CRISPR-Cas9 mediated enhancer deletion. Our results suggested that TXN acts as an oncogene that is overexpressed specifically in MM cells due, at least in part, to the aberrant activation of a MM-specific enhancer region. Downregulation of TXN expression either by disrupting the gene itself or by deleting

152

Discussion the distant regulatory element impairs MM cell growth. These results strongly indicate that TXN enhancer activation is an onco-epigenetic event essential for myelomagenesis. Remarkably, the thioredoxin system plays a crucial role in maintaining cellular redox homeostasis in MM and also in different tumor types (Arnér and Holmgren, 2006; Karlenius and Tonissen, 2010), being described as a promising strategy for cancer treatment (Zhang et al., 2012, 2017). The findings from this doctoral thesis might therefore have implications not only for a better understanding of the mechanisms underlying TXN deregulation in MM, but also in other tumor types.

It is also important to note that most of the association studies between enhancer activation and gene expression modulation (including some of the results presented in this study) are based on enhancer proximity to the putative target gene. However, epigenetic mechanisms can no longer be understood as one or two dimensional regulatory patterns, as long-range interactions in the chromatin topology have been widely documented (Nagano et al., 2015; Dixon et al., 2018; Hofmann and Heermann, 2018). Indeed, the Common Fund’s 4D Nucleome program has been recently stablished (Dekker et al., 2017) trying to understand the principles behind the 3D organization of the nucleus in space and time (the 4th dimension), the role nuclear organization plays in gene expression and cellular function, and how changes in the nuclear organization affect normal development as well as various diseases (https://commonfund.nih.gov/4dnucleome). In this context, the recently published genome architecture of MM has revealed a relationship among genomic alterations, 3D genome reorganization, and gene expression regulation in this neoplasm (Wu et al., 2017). These results invite us to further characterize the role of long range chromatin interactions in MM patients and to validate the association of de novo activated regulatory elements in these cells to their putative target genes.

153

Discussion

2.4. Disruption of chromatin domains and oncogene activation in MM

In addition to de novo emergence of enhancer elements we also identified the activation of entire genes, which in part become active as adjacent, but functionally unrelated, co-expressed gene pairs. This finding suggests that chromatin stretches containing more than one gene can become de novo active in MM, a phenomenon that we named ACBs. Under pathological conditions, genes within the identified ACBs exert a significant co-regulation in MM patients. A similar phenomenon has been recently described in glioma cells, where aberrant DNA methylation patterns compromise the binding of CTCF transcriptional regulator, disrupting the chromatin topology and allowing aberrant regulatory interactions that induce oncogene expression (Flavahan et al., 2016). Therefore, it could be plausible that the widespread epigenetic modulation of MM cells could lead to the disruption of the chromatin topology in these cells, leading to an extensive perturbation of the MM transcriptome.

In line with these results, we have demonstrated that the aberrant epigenetic activation of the ACB comprising PRDM5 and NDNF gene loci induced the overexpression of both transcripts specifically in MM. We could demonstrate that PRDM5 transcription factor acts as a potential oncogene in MM and remarkably, we discovered that this TF seems to be involved in ER and UPR stresses. This finding further supports the concept that protection against various cellular stresses is a key element of MM survival, and that neoplastic PCs achieve this resistant phenotype through aberrant de novo chromatin activation. On the other hand, although we could not confirm the oncogenic role of NDNF in MM, we cannot discard its implication in other functional mechanisms previously reported, such as cell adhesion, migration or regulation of angiogenesis (Kuang et al., 2010; Ohashi et al., 2014).

154

Discussion

Considering the wide variety of genomic and epigenomic alterations that characterize the MM landscape, they could affect genome topology, deregulating the transcriptional program and ultimately contributing to myelomagenesis. However we are just starting to understand the nature of chromatin domains, genome topology and chromatin network regulation, remaining many questions to be unanswered.

3. Functional implications of heterochromatin modulation in MM cells

Besides the widespread chromatin activation, the heterochromatin modulatory patterns found in MM patients were one of the most unexpected findings from this project. In line with previous studies (Agarwal et al., 2016) we described an aberrant heterochromatin signature unique to MM cells, with most of the genes affected by this changes being either silencer or expressed at low levels in both normal B cells and MM patients. Remarkably, we characterized this modulation as a transition between different types of repressive chromatin (heterochromatin – H3K9me3, Polycomb repressed – H3K27m3 and low signal chromatin) with no apparent functional impact on the cell, at least in terms of transcriptional regulation.

Although this chromatin inactivation seems to have a minor effect on the deregulation of the cell transcriptome, recent evidence suggest that it is required for the maintenance of MM transcriptional balance. In fact, Polycomb mediated silencing has been demonstrated to inhibit the expression of microRNAs with tumor suppressor functions, predicted to target MM-associated oncogenes such as IRF4, XBP1, PRDM1 or MYC (Alzrigat et al., 2017), or to play a key role in MM prognosis (Pawlyn et al., 2017). In this context, Polycomb genetic and pharmacological targeting have been demonstrated to be potential therapeutic options for MM treatment (Kikuchi et al., 2015; Agarwal et al., 2016; Alzrigat et al., 2017; Ohguchi, Hideshima and Anderson, 2018).

155

Discussion

Besides gene silencing, heterochromatin formation is also crucial for the preservation of genome stability and the regulation of key processes such as DNA damage repair and DNA replication (Becker, Nicetto and Zaret, 2016; Nair, Shoaib and Sørensen, 2017). Our findings on the aberrant heterochromatin signature in MM could further impair the cell ability to repair genomic aberrations, therefore promoting the heterogeneous chromosomal instability that characterizes this disease. Moreover, the heterochromatin landscape is also a key modulator of higher order chromosome structure and nuclear architecture, which can ultimately have profound effects on gene regulation (Schneider and Grosschedl, 2007). This is consistent to recent studies that describe how growth stimuli can promote alterations to nuclear architecture and lamina contacts that render specific loci susceptible to H3K9me3 modification (Zhu et al., 2013). In line with these results, the bone marrow microenvironment signals, which have been demonstrated to play a key role in MM cell growth and survival (Jurczyszyn et al., 2015), could be further stimulating an aberrant MM chromatin remodeling process. In fact, emerging evidence is pointing towards SUV39H1 specific H3K9 methyltransferase as a potential therapeutic target for this neoplasm (Devin et al., 2015).

Although further study is clearly needed to appreciate the significance of heterochromatin modulatory patterns in MM, the identification and initial characterization of these global heterochromatin aberrations provides a starting point for such investigations.

4. Epigenetic modulating agents as a rational therapeutic approach in MM

Most of the known epigenetic modifications are reversible, offering considerable promise for therapies based on the adaptive nature of epigenetic regulation (Arrowsmith et al., 2012; David Allis and Jenuwein, 2016). Considering the relevance

156

Discussion of epigenetic factors in myelomagenesis, targeting epigenetic modifications represent a promising therapeutic approach for MM patients.

Regarding DNA methylation, DNMT inhibitors (DNMTi) 5-azacytidine and 5-aza- deoxycitidine (decitabine) are already Food and Drug Administration (FDA)-approved compounds for the treatment of acute myeloid leukemia and myelodysplastic syndromes (Amodio et al., 2017). In vitro studies have demonstrated anti myeloma activity of both DNMTi (Maes et al., 2014; Cao et al., 2016) and clinical studies combining these compounds with conventional MM therapeutic agents look promising although very preliminary (Maes et al., 2014; Moreaux et al., 2014). The rationale behind demethylation therapies in MM is the ability of DNMTi to restore the expression of silenced tumor suppressor genes. However, the identification of such events in MM patients has been notably difficult, and is limited to a small number of published studies (Lavelle et al., 2003; Hurt et al., 2006; Heller et al., 2008). Indeed, the findings of this doctoral thesis together with previous published results (Agirre et al., 2015) suggest that DNA hypermethylation is a minor epigenetic event in MM cells. Remarkably, recent studies suggest that inhibition of DNA methylation can restore the expression of hypermethylated endogenous retrovirus (ERV) genes in tumor cells, causing a type I interferon response via the viral defense pathway (Chiappinelli et al., 2015). In fact, interferon based therapies have been used in MM treatment for over forty years, due to its anti-proliferative and immunomodulating properties (Khoo et al., 2011), suggesting that the activation of the interferon signaling pathway could be an additional mechanism of action of the hypomethylating therapy in MM. Besides DNA hypomethylation, DNMTi can also induce DNA damage and interfere with RNA translation (Hollenbach et al., 2010), directly impairing MM cell proliferation.

Another class of epigenetic modulating agents thoroughly studied in the treatment of MM are HDAC inhibitors (HDACi), which are responsible for

157

Discussion deacetylation of proteins (both histone and non-histone substrates), leading to a wide variety of responses linked to epigenetic mechanisms and post-transcriptional protein modifications (Federico and Bagella, 2011; Kim and Bae, 2011; Khan and La Thangue, 2012). In fact, HDACi have been demonstrated to induce cell apoptosis through modulation of both intrinsic and extrinsic apoptotic pathways, induction of proteasome stress and DNA damage, or alterations in the tumor microenvironment. Additionally, HDACi can increase the immunogenicity of tumor cells and modulate cytokine signaling, contributing to the anti-cancer effect in vivo (Dickinson, Johnstone and Prince, 2010; Maes et al., 2013). Several HDACi are have been implemented in clinical trials, either as monotherapy or in combination with conventional agents for the treatment of MM (Offidani et al., 2012; Dimopoulos et al., 2013; Richardson et al., 2013), with panobinostat being the first epigenetic inhibitor to have clinically meaningful benefit for patients with relapsed or refractory MM (San-Miguel, Einsele and Moreau, 2016; Laubach et al., 2017; San-Miguel et al., 2017). Although in this doctoral thesis we have described that the MM chromatin landscape is characterized by widespread activation, the true mechanism of HDACi in MM patients is likely to be a mixture of destabilization of the epigenetic landscape, gene expression changes and deregulation of proteins through modulation of their acetylation status.

One of the primary concerns to the use of epigenetic therapies questions in what extent they may affect healthy cells. However, this drawback is overcome in the case of epigenetic targets upregulated within the specific malignant context. Due to its key role in stablishing the MM transcriptional program, the aberrant enhancer signature of MM can also be an attractive target for MM therapy. Although genome- editing technologies such as CRISPR-Cas9 systems are very powerful tools for functional validation of enhancer regions, their application for in vivo therapy is still limited, as it relies on the necessity to target every single tumor cell. Instead, small- molecule inhibitors targeting the bound TFs or epigenetic modifying enzymes are

158

Discussion emerging as promising therapeutic agents (Arrowsmith et al., 2012). One of the strengths of this targeted therapy is that it will presumably disturb the pathological strong enhancer network of MM without destabilizing more robust networks that control crucial physiological processes. In fact, tumors cells have been suggested to have more fragile chromatin and higher “epigenetic noise” (Pujadas and Feinberg, 2012), being more susceptible to selective killing by treatment with epigenetic inhibitors combined with conventional therapy (David Allis and Jenuwein, 2016). In this context, several inhibitors targeting BET bromodomain reader proteins (which recognize acetylated lysine residues on histone tails) (Filippakopoulos et al., 2010) have been proven to exert anti-myeloma activity both in vitro and in vivo, alone or in combination with conventional therapy (Delmore et al., 2011; Chaidos et al., 2014; Matthews et al., 2016; Díaz et al., 2017; Imayoshi et al., 2017), being a promising therapeutic approach for MM patients, and potential advances in the field of personalized therapy and precision medicine.

5. Concluding remarks

As shown through the course of this doctoral thesis, our understanding of the molecular mechanisms underlying cancer development can be clearly improved through an integrative analysis of the tumor epigenome, including different epigenetic layers, such as DNA methylation, histone modifications, gene expression and 3D chromatin structure. However, this is just the starting point for further research. Recent genomic and epigenomic tools in the context of complex bioinformatic and system biology computational approaches are currently allowing researchers to interrogate the cell epigenome from different perspectives. Indeed, there are still many epigenetic facets and mechanisms that need further characterization, such as modeling the progression risk from MGUS to MM from the epigenomics perspective. These approaches should ultimately end up in better diagnosis, prognosis and treatment of cancer patients. In this exciting era for

159

Discussion biomedical research, we are looking forward to seeing how the currently available approaches and future ones will be used to answer the remaining long-standing open questions and challenges.

160

CONCLUSIONS

Conclusions

1. MM cells show an aberrant epigenomic configuration, affecting both euchromatic and heterochromatic regions, as compared to normal B cells. 2. MM cells partially maintain epigenetic memory from the B cell maturation process, exacerbating chromatin activation patterns present in normal PCs, but losing the global resemblance in heterochromatic regions. 3. MM cells extensively modulate their heterochromatin landscape without any apparent transcriptional relevance. 4. De novo chromatin activation is a unifying molecular principle of MM patients. 5. De novo chromatin activation is preferentially located in regulatory elements, including promoter and enhancer regions, which arise from heterochromatic loci in normal B cells. 6. Aberrant chromatin modulation in MM cells relies on a small subset of transcription factors, including IRF and FOX and MEF2 family members. 7. De novo chromatin activation is tightly associated to MM pathogenesis, deregulating key cellular functions for myelomagenesis such as osteoclast differentiation, oxidative stress responses, as well as multiple signaling pathways. 8. MM cells de novo activate regulatory elements of genes involved in preventing cell death associated with an adaptive response to oxidative stress. 9. Downregulation of TXN expression either by disruption of the gene itself or by deletion of an associated distant regulatory element impairs MM cell growth. 10. Chromatin stretches containing more than one gene can become de novo active in MM, forming ACBs specific of MM patients.

163

Conclusions

11. PRDM5 transcription factor acts as a potential oncogene in MM, involved in prevention of ER and UPR stresses, and could represent a rational therapeutic target for MM. 12. Protection against various cellular stresses is a key element of MM survival, suggesting a chromatin-basis underlying the pathogenesis of MM.

164

REFERENCES

References

Agarwal, P. et al. (2016) ‘Genome-wide profiling of histone H3 lysine 27 and lysine 4 trimethylation in multiple myeloma reveals the importance of Polycomb gene targeting and highlights EZH2 as a potential therapeutic target.’, Oncotarget, 7(6), pp. 6809–23. doi: 10.18632/oncotarget.6843.

Agirre, X. et al. (2015) ‘Whole-epigenome analysis in multiple myeloma reveals DNA hypermethylation of B cell-specific enhancers’, Genome Research, 25, pp. 478–487. doi: 10.1101/gr.180240.114.

Agnarelli, A., Chevassut, T. and Mancini, E. J. (2018) ‘IRF4 in multiple myeloma— Biology, disease and therapeutic target’, Leukemia Research, 72, pp. 52–58. doi: 10.1016/j.leukres.2018.07.025.

Alzrigat, M. et al. (2017) ‘EZH2 inhibition in multiple myeloma downregulates myeloma associated oncogenes and upregulates microRNAs with potential tumor suppressor functions.’, Oncotarget, 8(6), pp. 10213–10224. doi: 10.18632/oncotarget.14378.

Amodio, N. et al. (2017) ‘Epigenetic modifications in multiple myeloma: recent advances on the role of DNA and histone methylation.’, Expert opinion on therapeutic targets, 21(1), pp. 91–101. doi: 10.1080/14728222.2016.1266339.

Aparicio-Prat, E. et al. (2015) ‘DECKO: Single-oligo, dual-CRISPR deletion of genomic elements including long non-coding RNAs’, BMC Genomics, 16(1), p. 846. doi: 10.1186/s12864-015-2086-z.

Arnaudo, A. M. and Garcia, B. A. (2013) ‘Proteomic characterization of novel histone post-translational modifications’, Epigenetics & Chromatin. doi: 10.1186/1756-8935- 6-24.

Arnér, E. S. J. and Holmgren, A. (2006) ‘The thioredoxin system in cancer’, Seminars in Cancer Biology, 16(6), pp. 420–426. doi: 10.1016/J.SEMCANCER.2006.10.009.

Arrowsmith, C. H. et al. (2012) ‘Epigenetic protein families: a new frontier for drug discovery’, Nature Reviews Drug Discovery, 11(5), pp. 384–400. doi: 10.1038/nrd3674.

Audergon, P. N. C. B. et al. (2015) ‘Epigenetics. Restricted epigenetic inheritance of H3K9 methylation.’, Science (New York, N.Y.), 348(6230), pp. 132–5. doi: 10.1126/science.1260638.

Bai, H. et al. (2017) ‘Bone marrow IRF4 level in multiple myeloma: an indicator of peripheral blood Th17 and disease.’, Oncotarget, 8(49), pp. 85392–85400. doi: 10.18632/oncotarget.19907.

167

References

Bannister, A. J. and Kouzarides, T. (2011) ‘Regulation of chromatin by histone modifications’, Nature Publishing Group, 21, pp. 381–395. doi: 10.1038/cr.2011.22.

Bartholomew, B. (2014) ‘Regulating the Chromatin Landscape: Structural and Mechanistic Perspectives’, Annual Review of Biochemistry, 83(1), pp. 671–696. doi: 10.1146/annurev-biochem-051810-093157.

Bassil, C. F., Huang, Z. and Murphy, S. K. (2013) ‘Bisulfite Pyrosequencing’, in Methods in molecular biology (Clifton, N.J.), pp. 95–107. doi: 10.1007/978-1-62703-547-7_9.

Baubec, T. et al. (2013) ‘Methylation-dependent and -independent genomic targeting principles of the MBD protein family.’, Cell, 153(2), pp. 480–92. doi: 10.1016/j.cell.2013.03.011.

Becker, J. S., Nicetto, D. and Zaret, K. S. (2016) ‘H3K9me3-Dependent Heterochromatin: Barrier to Cell Fate Changes’, Trends in Genetics, 32(1), pp. 29–41. doi: 10.1016/j.tig.2015.11.001.

Beekman, R. et al. (2018) ‘The reference epigenome and regulatory chromatin landscape of chronic lymphocytic leukemia’, Nature Medicine, 24, pp. 868–880. doi: 10.1038/s41591-018-0028-4.

Benayoun, B. A. et al. (2014) ‘H3K4me3 Breadth Is Linked to Cell Identity and Transcriptional Consistency’, Cell, 158(3), pp. 673–688. doi: 10.1016/j.cell.2014.06.027.

Bhutani, N., Burns, D. M. and Blau, H. M. (2011) ‘DNA demethylation dynamics.’, Cell, 146(6), pp. 866–72. doi: 10.1016/j.cell.2011.08.042.

Bird, A. (2002) ‘DNA methylation patterns and epigenetic memory’, Genes & development, 16(1), pp. 6–21. doi: 10.1101/gad.947102.

Blumenthal, M. N. (2012) ‘Genetic, epigenetic, and environmental factors in asthma and allergy’, Annals of Allergy, Asthma & Immunology, 108(2), pp. 69–73. doi: 10.1016/j.anai.2011.12.003.

Bolli, N. et al. (2014) ‘Heterogeneity of genomic evolution and mutational profiles in multiple myeloma.’, Nature communications, 5(1), p. 2997. doi: 10.1038/ncomms3997.

Bolli, N. et al. (2018) ‘Genomic patterns of progression in smoldering multiple myeloma.’, Nature communications, 9(1), p. 3363. doi: 10.1038/s41467-018-05058-y.

168

References

Bommert, K., Bargou, R. C. and Stühmer, T. (2006) ‘Signalling and survival pathways in multiple myeloma.’, European journal of cancer, 42(11), pp. 1574–80. doi: 10.1016/j.ejca.2005.12.026.

Bonn, S. et al. (2012) ‘Tissue-specific analysis of chromatin state identifies temporal signatures of enhancer activity during embryonic development’, Nature Genetics, 44(2), pp. 148–156. doi: 10.1038/ng.1064.

Braggio, E., Kortüm, K. M. and Stewart, A. K. (2015) ‘SnapShot: Multiple Myeloma.’, Cancer cell, 28(5), p. 678–678.e1. doi: 10.1016/j.ccell.2015.10.014.

Brinkman, A. B. et al. (2012) ‘Sequential ChIP-bisulfite sequencing enables direct genome-scale investigation of chromatin and DNA methylation cross-talk.’, Genome research, 22(6), pp. 1128–38. doi: 10.1101/gr.133728.111.

Brinkman, E. K. et al. (2014) ‘Easy quantitative assessment of genome editing by sequence trace decomposition’, Nucleic Acids Research, 42(22), pp. e168–e168. doi: 10.1093/nar/gku936.

Brinkman, E. K. et al. (2018) ‘Easy quantification of template-directed CRISPR/Cas9 editing’, Nucleic Acids Research, 46(10), p. E58. doi: 10.1093/nar/gky164.

Brocks, D. et al. (2014) ‘Intratumor DNA Methylation Heterogeneity Reflects Clonal Evolution in Aggressive Prostate Cancer’, Cell Reports, 8(3), pp. 798–806. doi: 10.1016/j.celrep.2014.06.053.

Buenrostro, J. D. et al. (2013) ‘Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position’, Nature Methods, 10(12), pp. 1213–1218. doi: 10.1038/nmeth.2688.

Buenrostro, J. D. et al. (2015) ‘ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide.’, Current protocols in molecular biology, 109, p. 21.29.1- 9. doi: 10.1002/0471142727.mb2129s109.

Bulger, M. and Groudine, M. (2011) ‘Leading Edge Review. Functional and Mechanistic Diversity of Distal Transcription Enhancers’, Cell, 144(3), pp. 327–339. doi: 10.1016/j.cell.2011.01.024.

Burgess, R. J. and Zhang, Z. (2013) ‘Histone chaperones in nucleosome assembly and human disease’, Nature Structural & Molecular Biology, 20(1), pp. 14–22. doi: 10.1038/nsmb.2461.

Calo, E. and Wysocka, J. (2013) ‘Modification of Enhancer Chromatin: What, How, and Why?’, Molecular Cell, 49(5), pp. 825–837. doi: 10.1016/J.MOLCEL.2013.01.038.

169

References

Campbell, A. J. et al. (2010) ‘Aberrant expression of the neuronal transcription factor FOXP2 in neoplastic plasma cells’, British Journal of Haematology, 149(2), pp. 221– 230. doi: 10.1111/j.1365-2141.2009.08070.x.

Canver, M. C. et al. (2014) ‘Characterization of genomic deletion efficiency mediated by clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 nuclease system in mammalian cells.’, The Journal of biological chemistry, 289(31), pp. 21312– 24. doi: 10.1074/jbc.M114.564625.

Cao, Y. et al. (2016) ‘Decitabine enhances bortezomib treatment in RPMI 8226 multiple myeloma cells’, Molecular Medicine Reports, 14(4), pp. 3469–3475. doi: 10.3892/mmr.2016.5658.

Carr, T. M. et al. (2017) ‘JunB promotes Th17 cell identity and restrains alternative CD4+ T-cell programs during inflammation’, Nature Communications, 8(1), p. 301. doi: 10.1038/s41467-017-00380-3.

Carvajal-Vergara, X. et al. (2005) ‘Multifunctional role of Erk5 in multiple myeloma’, Blood, 105(11), pp. 4492–4499. doi: 10.1182/blood-2004-08-2985.

Cedar, H. and Bergman, Y. (2011) ‘Epigenetic mechanisms Epigenetics of haematopoietic cell development’, Nature Publishing Group, 11. doi: 10.1038/nri2991.

Cerchietti, L. C. et al. (2010) ‘A Small-Molecule Inhibitor of BCL6 Kills DLBCL Cells In Vitro and In Vivo’, Cancer Cell, 17(4), pp. 400–411. doi: 10.1016/j.ccr.2009.12.050.

Chaidos, A. et al. (2014) ‘Potent antimyeloma activity of the novel bromodomain inhibitors I-BET151 and I-BET762’, Blood, 123(5), pp. 697–705. doi: 10.1182/blood- 2013-01-478420.

Chapman, M. A. et al. (2011) ‘Initial genome sequencing and analysis of multiple myeloma’, Nature, 471(7339), pp. 467–472. doi: 10.1038/nature09837.

Chaumeil, J. and Skok, J. A. (2012) ‘The role of CTCF in regulating V(D)J recombination.’, Current opinion in immunology, 24(2), pp. 153–9. doi: 10.1016/j.coi.2012.01.003.

Chen, Z. and Riggs, A. D. (2011) ‘DNA Methylation and Demethylation in Mammals’, Journal of Biological Chemistry, 286(21), pp. 18347–18353. doi: 10.1074/jbc.R110.205286.

Cheng, H.-Y. et al. (2010) ‘DNA methylation and carcinogenesis of PRDM5 in cervical cancer’, Journal of Cancer Research and Clinical Oncology, 136(12), pp. 1821–1825. doi: 10.1007/s00432-010-0840-9.

170

References

Chiappinelli, K. B. et al. (2015) ‘Inhibiting DNA Methylation Causes an Interferon Response in Cancer via dsRNA Including Endogenous Retroviruses.’, Cell, 162(5), pp. 974–86. doi: 10.1016/j.cell.2015.07.011.

Clark, S. J. et al. (2016) ‘Single-cell epigenomics: powerful new methods for understanding gene regulation and cell identity.’, Genome biology, 17, p. 72. doi: 10.1186/s13059-016-0944-x.

Collas, P. (2010) ‘The Current State of Chromatin Immunoprecipitation’, Molecular Biotechnology, 45(1), pp. 87–100. doi: 10.1007/s12033-009-9239-8.

Colombo, M. et al. (2015) ‘Notch signaling deregulation in multiple myeloma: A rational molecular target.’, Oncotarget, 6(29), pp. 26826–40. doi: 10.18632/oncotarget.5025.

Conway, E., Healy, E. and Bracken, A. P. (2015) ‘PRC2 mediated H3K27 methylations in cellular identity and cancer’, Current Opinion in Cell Biology, 37, pp. 42–48. doi: 10.1016/j.ceb.2015.10.003.

Creyghton, M. P. et al. (2010) ‘Histone H3K27ac separates active from poised enhancers and predicts developmental state.’, Proceedings of the National Academy of Sciences of the United States of America, 107(50), pp. 21931–6. doi: 10.1073/pnas.1016071107.

Cuddapah, S. et al. (2009) ‘Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains.’, Genome research, 19(1), pp. 24–32. doi: 10.1101/gr.082800.108.

Cui, X. et al. (2016) ‘DNA methylation in spermatogenesis and male infertility’, Experimental and Therapeutic Medicine, 12(4), pp. 1973–1979. doi: 10.3892/etm.2016.3569.

D’Urso, A. and Brickner, J. H. (2014) ‘Mechanisms of epigenetic memory’, Trends in Genetics, 30(6), pp. 230–236. doi: 10.1016/j.tig.2014.04.004.

Das, D. S. et al. (2017) ‘Blockade of deubiquitylating enzyme USP1 inhibits DNA repair and triggers apoptosis in multiple myeloma cells’, Clinical Cancer Research, 23(15), pp. 4280–4289. doi: 10.1158/1078-0432.CCR-16-2692.

David Allis, C. and Jenuwein, T. (2016) ‘The molecular hallmarks of epigenetic control’, Nature Reviews Genetics, 17(8), pp. 487–500. doi: 10.1038/nrg.2016.59.

Dawson, M. A. and Kouzarides, T. (2012) ‘Cancer epigenetics: from mechanism to therapy.’, Cell, 150(1), pp. 12–27. doi: 10.1016/j.cell.2012.06.013.

171

References

Dekker, J. et al. (2002) ‘Capturing Chromosome Conformation’, Science, 295(5558), pp. 1306–1311. doi: 10.1126/science.1067799.

Dekker, J. et al. (2017) ‘The 4D nucleome project’, Nature, 549(7671), pp. 219–226. doi: 10.1038/nature23884.

Delmore, J. E. et al. (2011) ‘BET Bromodomain Inhibition as a Therapeutic Strategy to Target c-Myc’, Cell, 146(6), pp. 904–917. doi: 10.1016/j.cell.2011.08.017.

Deng, Q. and Huang, S. (2004) ‘PRDM5 is silenced in human cancers and has growth suppressive activities’, Oncogene, 23(28), pp. 4903–4910. doi: 10.1038/sj.onc.1207615.

Denker, A. and de Laat, W. (2016) ‘The second decade of 3C technologies: detailed insights into nuclear organization.’, Genes & development, 30(12), pp. 1357–82. doi: 10.1101/gad.281964.116.

Devin, J. et al. (2015) ‘Inhibition of SUV39H Methyltransferase As a Potent Therapeutic Target in Multiple Myeloma’, Blood, 126(23).

Díaz, T. et al. (2017) ‘The BET bromodomain inhibitor CPI203 improves lenalidomide and dexamethasone activity in in vitro and in vivo models of multiple myeloma by blockade of Ikaros and MYC signaling.’, Haematologica, 102(10), pp. 1776–1784. doi: 10.3324/haematol.2017.164632.

Dickinson, M., Johnstone, R. W. and Prince, H. M. (2010) ‘Histone deacetylase inhibitors: potential targets responsible for their anti-cancer effect.’, Investigational new drugs. Springer, 28 Suppl 1(Suppl 1), pp. S3-20. doi: 10.1007/s10637-010-9596-y.

Dimopoulos, K., Gimsing, P. and Grønbaek, K. (2014) ‘The role of epigenetics in the biology of multiple myeloma’, Blood Cancer Journal, 429. doi: 10.1038/bcj.2014.29.

Dimopoulos, M. et al. (2013) ‘Vorinostat or placebo in combination with bortezomib in patients with multiple myeloma (VANTAGE 088): a multicentre, randomised, double-blind study.’, The Lancet. Oncology, 14(11), pp. 1129–1140. doi: 10.1016/S1470-2045(13)70398-X.

Dixon, J. R. et al. (2012) ‘Topological domains in mammalian genomes identified by analysis of chromatin interactions.’, Nature, 485(7398), pp. 376–80. doi: 10.1038/nature11082.

Dixon, J. R. et al. (2018) ‘Integrative detection and analysis of structural variation in cancer genomes’, Nature Genetics. doi: 10.1038/s41588-018-0195-8.

172

References

Duan, Z. et al. (2007) ‘Epigenetic Regulation of Protein-Coding and MicroRNA Genes by the Gfi1-Interacting Tumor Suppressor PRDM5’, Molecular and Cellular Biology, 27(19), pp. 6889–6902. doi: 10.1128/MCB.00762-07.

Dunn, B. K., Verma, M. and Umar, A. (2003) ‘Epigenetics in Cancer Prevention: Early Detection and Risk Assessment’, Annals of the New York Academy of Sciences, 983(1), pp. 1–4. doi: 10.1111/j.1749-6632.2003.tb05957.x.

Dupéré-Richer, D. and Licht, J. D. (2017) ‘Epigenetic regulatory mutations and epigenetic therapy for multiple myeloma’, Current Opinion in Hematology, 24(4), pp. 336–344. doi: 10.1097/MOH.0000000000000358.

Egger, G. et al. (2004) ‘Epigenetics in human disease and prospects for epigenetic therapy’, Nature, 429(6990), pp. 457–463. doi: 10.1038/nature02625.

Epigenomics Consortium, R. et al. (2015) ‘Integrative analysis of 111 reference human epigenomes’, Nature, 518, pp. 317–330. doi: 10.1038/nature14248.

Ernst, J. et al. (2011) ‘Mapping and analysis of chromatin state dynamics in nine human cell types’, Nature, 473(7345), pp. 43–49. doi: 10.1038/nature09906.

Ernst, J. and Kellis, M. (2010) ‘Discovery and Characterization of Chromatin States for Systematic Annotation of the Human Genome’, Nature Biotechnology, 28, pp. 817– 825. doi: 10.1007/978-3-642-20036-6_6.

Ernst, J. and Kellis, M. (2017) ‘Chromatin-state discovery and genome annotation with ChromHMM’, Nature Protocols, 12(12), pp. 2478–2492. doi: 10.1038/nprot.2017.124.

Esteller, M. (2008) ‘Epigenetics in Cancer’, New England Journal of Medicine, 358(11), pp. 1148–1159. doi: 10.1056/NEJMra072067.

Ezponda, T. et al. (2017) ‘UTX/KDM6A Loss Enhances the Malignant Phenotype of Multiple Myeloma and Sensitizes Cells to EZH2 inhibition.’, Cell reports, 21(3), pp. 628–640. doi: 10.1016/j.celrep.2017.09.078.

Ezponda, T. and Licht, J. D. (2014) ‘Molecular Pathways: Deregulation of Histone H3 Lysine 27 Methylation in Cancer--Different Paths, Same Destination’, Clinical Cancer Research, 20(19), pp. 5001–5008. doi: 10.1158/1078-0432.CCR-13-2499.

Fakhri, B. and Vij, R. (2016) ‘Clonal Evolution in Multiple Myeloma’, Clinical Lymphoma Myeloma and Leukemia, 16, pp. S130–S134. doi: 10.1016/j.clml.2016.02.025.

173

References

Fang, Y. et al. (2016) ‘In silico identification of enhancers on the basis of a combination of transcription factor binding motif occurrences’, Scientific Reports, 6(1), p. 32476. doi: 10.1038/srep32476.

Federico, M. and Bagella, L. (2011) ‘Histone Deacetylase Inhibitors in the Treatment of Hematological Malignancies and Solid Tumors’, Journal of Biomedicine and Biotechnology, 2011, pp. 1–12. doi: 10.1155/2011/475641.

Feinberg, A. P. (2018) ‘The Key Role of Epigenetics in Human Disease Prevention and Mitigation’, New England Journal of Medicine, 378(14), pp. 1323–1334. doi: 10.1056/NEJMra1402513.

Filippakopoulos, P. et al. (2010) ‘Selective inhibition of BET bromodomains’, Nature, 468(7327), pp. 1067–1073. doi: 10.1038/nature09504.

Flavahan, W. A. et al. (2016) ‘Insulator dysfunction and oncogene activation in IDH mutant gliomas.’, Nature, 529(7584), pp. 110–4. doi: 10.1038/nature16490.

Fong, C. Y., Morison, J. and Dawson, M. A. (2014) ‘Epigenetics in the hematologic malignancies’, Haematologica, 99(12), pp. 1772–1783. doi: 10.3324/haematol.2013.092007.

Galli, G. G. et al. (2012) ‘Prdm5 regulates collagen gene transcription by association with RNA polymerase II in developing bone.’, PLoS genetics, 8(5), p. e1002711. doi: 10.1371/journal.pgen.1002711.

Galli, G. G. et al. (2013) ‘Genomic and Proteomic Analyses of Prdm5 Reveal Interactions with Insulator Binding Proteins in Embryonic Stem Cells’, Molecular and Cellular Biology. A, 33(22), pp. 4504–4516. doi: 10.1128/MCB.00545-13.

Gao, T. et al. (2016) ‘EnhancerAtlas: a resource for enhancer annotation and analysis in 105 human cell/tissue types’, Bioinformatics, 32(23), p. btw495. doi: 10.1093/bioinformatics/btw495.

Gates, L. A. et al. (2017) ‘H3K9ac mediates Pol II pause release Acetylation on Histone H3 Lysine 9 Mediates a Switch from Transcription Initiation to Elongation Downloaded from’, The Journal of biological chemistry, 292(35), pp. 14456–14472. doi: 10.1074/jbc.M117.802074.

Giuliani, N. et al. (2006) ‘CXCR3 and its binding chemokines in myeloma cells: expression of isoforms and potential relationships with myeloma cell proliferation and survival.’, Haematologica, 91(11), pp. 1489–97. Available at: http://www.ncbi.nlm.nih.gov/pubmed/17082008 (Accessed: 24 July 2018).

174

References

Greer, E. L. and Shi, Y. (2012) ‘Histone methylation: a dynamic mark in health, disease and inheritance’, Nature Reviews Genetics, 13(5), pp. 343–357. doi: 10.1038/nrg3173.

Grewal, S. I. S. and Jia, S. (2007) ‘Heterochromatin revisited.’, Nature reviews. Genetics, 8(1), pp. 35–46. doi: 10.1038/nrg2008.

Guo, Y. et al. (2012) ‘CTCF/cohesin-mediated DNA looping is required for protocadherin α promoter choice.’, Proceedings of the National Academy of Sciences of the United States of America, 109(51), pp. 21081–6. doi: 10.1073/pnas.1219280110. van Haaften, G. et al. (2009) ‘Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer’, Nature Genetics, 41(5), pp. 521–523. doi: 10.1038/ng.349.

Haas, B. et al. (2017) ‘STAR-Fusion: Fast and Accurate Fusion Transcript Detection from RNA-Seq’, bioRxiv, p. 120295. doi: 10.1101/120295.

Hahn, M. A. et al. (2011) ‘Relationship between gene body DNA methylation and intragenic H3K9me3 and H3K36me3 chromatin marks.’, PloS one, 6(4), p. e18844. doi: 10.1371/journal.pone.0018844.

Hall, I. M. et al. (2002) ‘Establishment and maintenance of a heterochromatin domain.’, Science, 297(5590), pp. 2232–7. doi: 10.1126/science.1076466.

Halley-Stott, R. P. and Gurdon, J. B. (2013) ‘Epigenetic memory in the context of nuclear reprogramming and cancer’, Briefings in Functional Genomics, 12(3), pp. 164–173. doi: 10.1093/bfgp/elt011.

Hameed, A. et al. (2014) ‘Bone disease in multiple myeloma: pathophysiology and management.’, Cancer growth and metastasis. SAGE Publications, 7, pp. 33–42. doi: 10.4137/CGM.S16817.

Handoko, L. et al. (2011) ‘CTCF-mediated functional chromatin interactome in pluripotent cells.’, Nature genetics, 43(7), pp. 630–8. doi: 10.1038/ng.857.

Hart, T. et al. (2015) ‘High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype-Specific Cancer Liabilities’, Cell, 163(6), pp. 1515–1526. doi: 10.1016/J.CELL.2015.11.015.

Heger, P. and Wiehe, T. (2014) ‘New tools in the box: an evolutionary synopsis of chromatin insulators.’, Trends in genetics : TIG, 30(5), pp. 161–71. doi: 10.1016/j.tig.2014.03.004.

175

References

Heintel, D. et al. (2008) ‘Expression of MUM1/IRF4 mRNA as a prognostic marker in patients with multiple myeloma’, Leukemia, 22(2), pp. 441–445. doi: 10.1038/sj.leu.2404895.

Heintzman, N. D. et al. (2007) ‘Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome’, Nature Genetics, 39(3), pp. 311–318. doi: 10.1038/ng1966.

Heintzman, N. D. et al. (2009) ‘Histone modifications at human enhancers reflect global cell-type-specific gene expression.’, Nature, 459(7243), pp. 108–12. doi: 10.1038/nature07829.

Heller, G. et al. (2008) ‘Genome-Wide Transcriptional Response to 5-Aza-2’- Deoxycytidine and Trichostatin A in Multiple Myeloma Cells’, Cancer Research, 68(1), pp. 44–54. doi: 10.1158/0008-5472.CAN-07-2531.

Herz, H.-M., Hu, D. and Shilatifard, A. (2014) ‘Enhancer Malfunction in Cancer’, Molecular Cell, 53(6), pp. 859–866. doi: 10.1016/j.molcel.2014.02.033.

Heuck, C. J. et al. (2013) ‘Myeloma Is Characterized by Stage-Specific Alterations in DNA Methylation That Occur Early during Myelomagenesis’, The Journal of Immunology, 190(6), pp. 2966–2975. doi: 10.4049/jimmunol.1202493.

Hinai, A. A. and Valk, P. J. M. (2016) ‘Aberrant EVI1 expression in acute myeloid leukaemia’, British Journal of Haematology, 172(6), pp. 870–878. doi: 10.1111/bjh.13898.

Hnisz, D. et al. (2013) ‘Super-Enhancers in the Control of Cell Identity and Disease’, Cell, 155(4), pp. 934–947. doi: 10.1016/j.cell.2013.09.053.

Ho, J. W. K. et al. (2011) ‘ChIP-chip versus ChIP-seq: lessons for experimental design and data analysis.’, BMC genomics, 12(1), p. 134. doi: 10.1186/1471-2164-12-134.

Hofmann, A. and Heermann, D. W. (2018) ‘Deciphering 3D Organization of Chromosomes Using Hi-C Data’, in Methods in molecular biology (Clifton, N.J.), pp. 389–401. doi: 10.1007/978-1-4939-8675-0_19.

Hollenbach, P. W. et al. (2010) ‘A comparison of azacitidine and decitabine activities in acute myeloid leukemia cell lines.’, PloS one, 5(2), p. e9001. doi: 10.1371/journal.pone.0009001.

Holliday, R. (2006) ‘Epigenetics: A Historical Overview’, Epigenetics, 1(2), pp. 76–80. doi: 10.4161/epi.1.2.2762org/10.4161/epi.1.2.2762.

176

References

Howe, F. S. et al. (2017) ‘Is H3K4me3 instructive for transcription activation?’, BioEssays : news and reviews in molecular, cellular and developmental biology, 39(1), pp. 1–12. doi: 10.1002/bies.201600095.

Hsieh, G. et al. (2017) ‘Statistical algorithms improve accuracy of gene fusion detection.’, Nucleic acids research, 45(13), p. e126. doi: 10.1093/nar/gkx453.

Hu, J. and Hu, W.-X. (2018) ‘Targeting signaling pathways in multiple myeloma: Pathogenesis and implication for treatments’, Cancer Letters, 414, pp. 214–221. doi: 10.1016/j.canlet.2017.11.020.

Hurt, E. M. et al. (2006) ‘Reversal of p53 epigenetic silencing in multiple myeloma permits apoptosis by a p53 activator.’, Cancer biology & therapy, 5(9), pp. 1154–60. doi: 10.4161/cbt.5.9.3001.

Huynh, J. L. and Casaccia, P. (2013) ‘Epigenetic mechanisms in multiple sclerosis: implications for pathogenesis and treatment’, The Lancet Neurology, 12(2), pp. 195– 206. doi: 10.1016/S1474-4422(12)70309-5.

Illingworth, R. S. and Bird, A. P. (2009) ‘CpG islands--’a rough guide’.’, FEBS letters, 583(11), pp. 1713–20. doi: 10.1016/j.febslet.2009.04.012.

Imayoshi, N. et al. (2017) ‘CG13250, a novel bromodomain inhibitor, suppresses proliferation of multiple myeloma cells in an orthotopic mouse model’, Biochemical and Biophysical Research Communications, 484(2), pp. 262–268. doi: 10.1016/j.bbrc.2017.01.088.

Ishiguro, K. et al. (2018) ‘DOT1L inhibition blocks multiple myeloma cell proliferation by suppressing IRF4-MYC signaling’, Haematologica. doi: 10.3324/haematol.2018.191262.

Izzi, B., Binder, A. M. and Michels, K. B. (2014) ‘Pyrosequencing Evaluation of Widely Available Bisulfite Conversion Methods: Considerations for Application.’, Medical epigenetics, 2(1), pp. 28–36. doi: 10.1159/000358882.

Jaitin, D. A. et al. (2014) ‘Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types.’, Science, 343(6172), pp. 776–9. doi: 10.1126/science.1247651.

Jaitin, D. A. et al. (2016) ‘Dissecting Immune Circuits by Linking CRISPR-Pooled Screens with Single-Cell RNA-Seq’, Cell, 167(7), p. 1883–1896.e15. doi: 10.1016/J.CELL.2016.11.039.

177

References

Jamieson, K. et al. (2016) ‘Loss of HP1 causes depletion of H3K27me3 from facultative heterochromatin and gain of H3K27me2 at constitutive heterochromatin.’, Genome research, 26(1), pp. 97–107. doi: 10.1101/gr.194555.115.

Jenuwein, T. and David Allis, C. (2001) Translating the Histone Code, Science. doi: 10.1126/science.1063127.

Jin, Y. et al. (2018) ‘Active enhancer and chromatin accessibility landscapes chart the regulatory network of primary multiple myeloma.’, Blood, 131(19), pp. 2138–2150. doi: 10.1182/blood-2017-09-808063.

Jones, P. A. (2012) ‘Functions of DNA methylation: islands, start sites, gene bodies and beyond’, Nature Reviews Genetics, 13(7), pp. 484–92. doi: 10.1038/nrg3230.

Joshi, A. A. and Struhl, K. (2005) ‘Eaf3 chromodomain interaction with methylated H3-K36 links histone deacetylation to Pol II elongation.’, Molecular cell, 20(6), pp. 971–8. doi: 10.1016/j.molcel.2005.11.021.

Jundt, F. et al. (2004) ‘Jagged1-induced Notch signaling drives proliferation of multiple myeloma cells’, Blood, 103(9), pp. 3511–3515. doi: 10.1182/blood-2003-07- 2254.

Jurczyszyn, A. et al. (2015) ‘The Analysis of the Relationship between Multiple Myeloma Cells and Their Microenvironment.’, Journal of Cancer, 6(2), pp. 160–8. doi: 10.7150/jca.10873.

Kagohara, L. T. et al. (2018) ‘Epigenetic regulation of gene expression in cancer: techniques, resources and analysis.’, Briefings in functional genomics, 17(1), pp. 49– 63. doi: 10.1093/bfgp/elx018.

Kaiser, M. F. et al. (2013) ‘Global methylation analysis identifies prognostically important epigenetically inactivated tumor suppressor genes in multiple myeloma’, Blood, 122(2), pp. 219–226. doi: 10.1182/blood-2013-03-487884.

Karlenius, T. C. and Tonissen, K. F. (2010) ‘Thioredoxin and Cancer: A Role for Thioredoxin in all States of Tumor Oxygenation.’, Cancers, 2(2), pp. 209–32. doi: 10.3390/cancers2020209.

Karmodiya, K. et al. (2012) ‘H3K9 and H3K14 acetylation co-occur at many gene regulatory elements, while H3K14ac marks a subset of inactive inducible promoters in mouse embryonic stem cells’, BMC Genomics, 13(1), p. 424. doi: 10.1186/1471- 2164-13-424.

178

References

Keats, J. J. et al. (2005) ‘Overexpression of transcripts originating from the MMSET locus characterizes all t(4;14)(p16;q32)-positive multiple myeloma patients’, Blood, 105(10), pp. 4060–4069. doi: 10.1182/blood-2004-09-3704.

Kelsey, G., Stegle, O. and Reik, W. (2017) ‘Single-cell epigenomics: Recording the past and predicting the future.’, Science, 358(6359), pp. 69–75. doi: 10.1126/science.aan6826.

Khan, O. and La Thangue, N. B. (2012) ‘HDAC inhibitors in cancer biology: emerging mechanisms and clinical applications’, Immunology and Cell Biology, 90(1), pp. 85– 94. doi: 10.1038/icb.2011.100.

Khoo, T. L. et al. (2011) ‘Interferon-alpha in the treatment of multiple myeloma.’, Current drug targets, 12(3), pp. 437–46. doi: 10.2174/138945011794815329.

Kikuchi, J. et al. (2015) ‘Phosphorylation-mediated EZH2 inactivation promotes drug resistance in multiple myeloma.’, The Journal of clinical investigation, 125(12), pp. 4375–90. doi: 10.1172/JCI80325.

Kim, H.-J. and Bae, S.-C. (2011) ‘Histone deacetylase inhibitors: molecular mechanisms of action and clinical trials as anti-cancer drugs.’, American journal of translational research, 3(2), pp. 166–79.

Kim, J. and Kim, H. (2012) ‘Recruitment and biological consequences of histone modification of H3K27me3 and H3K9me3.’, ILAR journal, 53(3–4), pp. 232–9. doi: 10.1093/ilar.53.3-4.232.

Kim, M. and Costello, J. (2017) ‘DNA methylation: an epigenetic mark of cellular memory’, Experimental & Molecular Medicine, 49(4), pp. e322–e322. doi: 10.1038/emm.2017.10.

Klein, U. and Dalla-Favera, R. (2008) ‘Germinal centres: role in B-cell physiology and malignancy.’, Nature reviews. Immunology, 8(1), pp. 22–33. doi: 10.1038/nri2217.

Kolasinska-Zwierz, P. et al. (2009) ‘Differential chromatin marking of introns and expressed exons by H3K36me3.’, Nature genetics, 41(3), pp. 376–81. doi: 10.1038/ng.322.

Korde, N., Kristinsson, S. Y. and Landgren, O. (2011) ‘Monoclonal gammopathy of undetermined significance (MGUS) and smoldering multiple myeloma (SMM): novel biological insights and development of early treatment strategies.’, Blood, 117(21), pp. 5573–81. doi: 10.1182/blood-2011-01-270140.

Kouzarides, T. (2007) ‘Chromatin Modifications and Their Function’, Cell, 128(4), pp. 693–705. doi: 10.1016/j.cell.2007.02.005.

179

References

Kuang, X.-L. et al. (2010) ‘Spatio-temporal expression of a novel neuron-derived neurotrophic factor (NDNF) in mouse brains during development.’, BMC neuroscience, 11(1), p. 137. doi: 10.1186/1471-2202-11-137.

Kulaeva, O. I. et al. (2012) ‘Distant activation of transcription: mechanisms of enhancer action.’, Molecular and cellular biology, 32(24), pp. 4892–7. doi: 10.1128/MCB.01127-12.

Kulis, M. et al. (2015) ‘Whole-genome fingerprint of the DNA methylome during human B cell differentiation’, Nature Genetics, 47(7), pp. 746–756. doi: 10.1038/ng.3291.

Kumar, S. K. et al. (2017) ‘Multiple myeloma.’, Nature reviews. Disease primers, 3, p. 17046. doi: 10.1038/nrdp.2017.46.

Kurosaki, T., Shinohara, H. and Baba, Y. (2010) ‘B cell signaling and fate decision.’, Annual review of immunology, 28(1), pp. 21–55. doi: 10.1146/annurev.immunol.021908.132541.

Kyle, R. A. et al. (2007) ‘Clinical course and prognosis of smoldering (asymptomatic) multiple myeloma.’, The New England journal of medicine, 356(25), pp. 2582–90. doi: 10.1056/NEJMoa070389.

Landgren, O. et al. (2009) ‘Monoclonal gammopathy of undetermined significance (MGUS) consistently precedes multiple myeloma: a prospective study’, Blood, 113(22), pp. 5412–5417. doi: 10.1182/blood-2008-12-194241.

Lara-Astiaso, D. et al. (2014) ‘Chromatin state dynamics during blood formation’, Science, 345(6199), pp. 943–949. doi: 10.1126/science.1255083.

Laubach, J. P. et al. (2017) ‘Deacetylase inhibitors: an advance in myeloma therapy?’, Expert Review of Hematology, 10(3), pp. 229–237. doi: 10.1080/17474086.2017.1280388.

Lavelle, D. et al. (2003) ‘Decitabine induces cell cycle arrest at the G1 phase via p21(WAF1) and the G2/M phase via the p38 MAP kinase pathway.’, Leukemia research, 27(11), pp. 999–1007. doi: 10.1016/S0145-2126(03)00068-7.

Layman, W. S. and Zuo, J. (2014) ‘Epigenetic regulation in the inner ear and its potential roles in development, protection, and regeneration.’, Frontiers in cellular neuroscience, 8, p. 446. doi: 10.3389/fncel.2014.00446.

Li, F. et al. (2013) ‘The Histone Mark H3K36me3 Regulates Human DNA Mismatch Repair through Its Interaction with MutSα’, Cell, 153(3), pp. 590–600. doi: 10.1016/j.cell.2013.03.025.

180

References

Li, P. et al. (2012) ‘BATF-JUN is critical for IRF4-mediated transcription in T cells.’, Nature, 490(7421), pp. 543–6. doi: 10.1038/nature11530.

Li, Y. et al. (2018) ‘Genome-wide analyses reveal a role of Polycomb in promoting hypomethylation of DNA methylation valleys’, Genome Biology, 19(1), p. 18. doi: 10.1186/s13059-018-1390-8.

Lipchick, B. C., Fink, E. E. and Nikiforov, M. A. (2016) ‘Oxidative stress and proteasome inhibitors in multiple myeloma’, Pharmacological Research, 105, pp. 210–215. doi: 10.1016/j.phrs.2016.01.029.

Lister, R. et al. (2009) ‘Human DNA methylomes at base resolution show widespread epigenomic differences’, Nature, 462. doi: 10.1038/nature08514.

Liu, B. et al. (2016) ‘PRDM5 expression correlates with malignant behaviors and indicates favorable prognosis in HCC’, Int J Clin Exp Pathol, 9(8), pp. 8327–8334.

Lo, P.-K. and Zhou, Q. (2018) ‘Emerging techniques in single-cell epigenomics and their applications to cancer research.’, Journal of clinical genomics. NIH Public Access, 1(1). doi: 10.4172/JCG.1000103.

Local, A. et al. (2018) ‘Identification of H3K4me1-associated proteins at mammalian enhancers’, Nature Genetics, 50(1), pp. 73–82. doi: 10.1038/s41588-017-0015-6.

Lohr, J. G. et al. (2014) ‘Widespread genetic heterogeneity in multiple myeloma: implications for targeted therapy.’, Cancer cell, 25(1), pp. 91–101. doi: 10.1016/j.ccr.2013.12.015.

Lu, Y. et al. (2016) ‘Defining the multivalent functions of CTCF from chromatin state and three-dimensional chromatin interactions.’, Nucleic acids research, 44(13), pp. 6200–12. doi: 10.1093/nar/gkw249.

Maes, K. et al. (2013) ‘Epigenetic Modulating Agents as a New Therapeutic Approach in Multiple Myeloma’, Cancers, 5(4), pp. 430–461. doi: 10.3390/cancers5020430.

Maes, K. et al. (2014) ‘The role of DNA damage and repair in decitabine-mediated apoptosis in multiple myeloma’, Oncotarget. Impact Journals, 5(10), pp. 3115–3129. doi: 10.18632/oncotarget.1821.

Maeshima, K. et al. (2014) ‘Chromatin as dynamic 10-nm fibers’, Chromosoma, 123, pp. 225–237. doi: 10.1007/s00412-014-0460-2.

Mahnke, J. et al. (2016) ‘Interferon Regulatory Factor 4 controls TH1 cell effector function and metabolism’, Scientific Reports, 6(1), p. 35521. doi: 10.1038/srep35521.

181

References

Marango, J. et al. (2008) ‘The MMSET protein is a histone methyltransferase with characteristics of a transcriptional corepressor.’, Blood, 111(6), pp. 3145–54. doi: 10.1182/blood-2007-06-092122.

Martens, J. H. A. et al. (2010) ‘PML-RARα/RXR Alters the Epigenetic Landscape in Acute Promyelocytic Leukemia’, Cancer Cell, 17(2), pp. 173–185. doi: 10.1016/j.ccr.2009.12.042.

Martinez-Garcia, E. et al. (2011) ‘The MMSET histone methyl transferase switches global histone methylation and alters gene expression in t(4;14) multiple myeloma cells.’, Blood, 117(1), pp. 211–20. doi: 10.1182/blood-2010-07-298349.

Matthews, G. M. et al. (2016) ‘BET Bromodomain Degradation As a Therapeutic Strategy in Multiple Myeloma’, Blood, 128(22).

Matthias, P. and Rolink, A. G. (2005) ‘Transcriptional networks in developing and mature B cells.’, Nature reviews. Immunology, 5(6), pp. 497–508. doi: 10.1038/nri1633.

Maunakea, A. K. et al. (2010) ‘Conserved role of intragenic DNA methylation in regulating alternative promoters.’, Nature, 466(7303), pp. 253–7. doi: 10.1038/nature09165.

Maurano, M. T. et al. (2015) ‘Role of DNA Methylation in Modulating Transcription Factor Occupancy.’, Cell reports, 12(7), pp. 1184–95. doi: 10.1016/j.celrep.2015.07.024.

Mazor, T. et al. (2015) ‘DNA Methylation and Somatic Mutations Converge on the Cell Cycle and Define Similar Evolutionary Histories in Brain Tumors’, Cancer Cell, 28(3), pp. 307–317. doi: 10.1016/j.ccell.2015.07.012.

Mazor, T. et al. (2016) ‘Intratumoral Heterogeneity of the Epigenome.’, Cancer cell, 29(4), pp. 440–451. doi: 10.1016/j.ccell.2016.03.009.

Mesin, L., Ersching, J. and Victora, G. D. (2016) ‘Germinal Center B Cell Dynamics.’, Immunity, 45(3), pp. 471–482. doi: 10.1016/j.immuni.2016.09.001.

Van der Meulen, J., Speleman, F. and Van Vlierberghe, P. (2014) ‘The H3K27me3 demethylase UTX in normal development and disease’, Epigenetics, 9(5), pp. 658– 668. doi: 10.4161/epi.28298.

Mi, H. et al. (2017) ‘PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements.’, Nucleic acids research, 45(D1), pp. D183–D189. doi: 10.1093/nar/gkw1138.

182

References

Milani, P. et al. (2016) ‘Cell freezing protocol suitable for ATAC-Seq on motor neurons derived from human induced pluripotent stem cells.’, Scientific reports, 6(1), p. 25474. doi: 10.1038/srep25474.

Mirabella, F. et al. (2013) ‘MMSET is the key molecular target in t(4;14) myeloma’, Blood Cancer Journal, 3(5), pp. e114–e114. doi: 10.1038/bcj.2013.9.

Moreaux, J. et al. (2014) ‘DNA methylation score is predictive of myeloma cell sensitivity to 5-azacitidine.’, British journal of haematology, 164(4), pp. 613–6. doi: 10.1111/bjh.12660.

Mundade, R. et al. (2014) ‘Role of ChIP-seq in the discovery of transcription factor binding sites, differential gene regulation mechanism, epigenetic marks and beyond’, Cell Cycle, 13(18), pp. 2847–2852. doi: 10.4161/15384101.2014.949201.

Nagano, T. et al. (2015) ‘Single-cell Hi-C for genome-wide detection of chromatin interactions that occur simultaneously in a single cell’, Nature Protocols, 10(12), pp. 1986–2003. doi: 10.1038/nprot.2015.127.

Nair, N., Shoaib, M. and Sørensen, C. S. (2017) ‘Chromatin Dynamics in Genome Stability: Roles in Suppressing Endogenous DNA Damage and Facilitating DNA Repair.’, International journal of molecular sciences, 18(7). doi: 10.3390/ijms18071486.

Nakahashi, H. et al. (2013) ‘A genome-wide map of CTCF multivalency redefines the CTCF code.’, Cell reports, 3(5), pp. 1678–1689. doi: 10.1016/j.celrep.2013.04.024.

Nature Structural & Molecular Biology, E. (2013) ‘The dynamic epigenome’, Nature structural & molecular biology, 20(3), p. 258. doi: 10.1038/nsmb.2534.

O’Donnell, E. K. and Raje, N. S. (2017) ‘Myeloma bone disease: pathogenesis and treatment.’, Clinical advances in hematology & oncology : H&O, 15(4), pp. 285–295.

Ochiai, K. et al. (2013) ‘Transcriptional Regulation of Germinal Center B and Plasma Cell Fates by Dynamical Control of IRF4’, Immunity, 38(5), pp. 918–929. doi: 10.1016/j.immuni.2013.04.009.

Offidani, M. et al. (2012) ‘Phase II study of melphalan, thalidomide and prednisone combined with oral panobinostat in patients with relapsed/refractory multiple myeloma.’, Leukemia & lymphoma, 53(9), pp. 1722–7. doi: 10.3109/10428194.2012.664844.

183

References

Ohashi, K. et al. (2014) ‘Neuron-derived neurotrophic factor functions as a novel modulator that enhances endothelial cell function and revascularization processes.’, The Journal of biological chemistry, 289(20), pp. 14132–44. doi: 10.1074/jbc.M114.555789.

Ohguchi, H., Hideshima, T. and Anderson, K. C. (2018) ‘The biological significance of histone modifiers in multiple myeloma: clinical applications’, Blood Cancer Journal, 8(9), p. 83. doi: 10.1038/s41408-018-0119-y.

Ooi, W. F. et al. (2016) ‘Epigenomic profiling of primary gastric adenocarcinoma reveals super-enhancer heterogeneity’, Nature Communications, 7, p. 12983. doi: 10.1038/ncomms12983.

Paiva, B. et al. (2017) ‘Differentiation stage of myeloma plasma cells: biological and clinical significance’, Leukemia, 31(2), pp. 382–392. doi: 10.1038/leu.2016.211.

Pawlyn, C. et al. (2014) ‘Current and potential epigenetic targets in multiple myeloma’, Epigenomics, 6(2), pp. 215–228. doi: 10.2217/epi.14.12.

Pawlyn, C. et al. (2016) ‘The Spectrum and Clinical Impact of Epigenetic Modifier Mutations in Myeloma.’, Clinical cancer research : an official journal of the American Association for Cancer Research, 22(23), pp. 5783–5794. doi: 10.1158/1078- 0432.CCR-15-1790.

Pawlyn, C. et al. (2017) ‘Overexpression of EZH2 in multiple myeloma is associated with poor prognosis and dysregulation of cell cycle control’, Blood Cancer Journal, 7(3), pp. e549–e549. doi: 10.1038/bcj.2017.27.

Phillips, J. E. and Corces, V. G. (2009) ‘CTCF: master weaver of the genome.’, Cell, 137(7), pp. 1194–211. doi: 10.1016/j.cell.2009.06.001.

Pinzone, J. J. et al. (2009) ‘The role of Dickkopf-1 in bone development, homeostasis, and disease.’, Blood, 113(3), pp. 517–25. doi: 10.1182/blood-2008-03-145169.

Portela, A. and Esteller, M. (2010) ‘Epigenetic modifications and human disease.’, Nature biotechnology, 28(10), pp. 1057–68. doi: 10.1038/nbt.1685.

Pujadas, E. and Feinberg, A. P. (2012) ‘Regulated Noise in the Epigenetic Landscape of Development and Disease’, Cell, 148(6), pp. 1123–1131. doi: 10.1016/j.cell.2012.02.045.

Pulido-Quetglas, C. et al. (2017) ‘Scalable Design of Paired CRISPR Guide RNAs for Genomic Deletion’, PLoS Computational Biology, 13(3). doi: 10.1371/journal.pcbi.1005341.

184

References

Rajkumar, S. V. et al. (2014) ‘International Myeloma Working Group updated criteria for the diagnosis of multiple myeloma’, The Lancet Oncology, 15(12), pp. e538–e548. doi: 10.1016/S1470-2045(14)70442-5.

Raninga, P. V. et al. (2015) ‘Inhibition of thioredoxin 1 leads to apoptosis in drug- resistant multiple myeloma’, Oncotarget, 6(17), pp. 15410–15424. doi: 10.18632/oncotarget.3795.

Reik, W. (2007) ‘Stability and flexibility of epigenetic gene regulation in mammalian development’, Nature, 447(May), pp. 425–432. doi: 10.1038/nature05918.

Richardson, P. G. et al. (2013) ‘PANORAMA 2: panobinostat in combination with bortezomib and dexamethasone in patients with relapsed and bortezomib-refractory myeloma.’, Blood, 122(14), pp. 2331–7. doi: 10.1182/blood-2013-01-481325.

Robiou du Pont, S. et al. (2017) ‘Genomics of Multiple Myeloma.’, Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 35(9), pp. 963–967. doi: 10.1200/JCO.2016.70.6705.

Salhia, B. et al. (2010) ‘DNA methylation analysis determines the high frequency of genic hypomethylation and low frequency of hypermethylation events in plasma cell tumors.’, Cancer research, 70(17), pp. 6934–44. doi: 10.1158/0008-5472.CAN-10- 0282.

San-Miguel, J. F. et al. (2017) ‘Panobinostat plus bortezomib and dexamethasone: impact of dose intensity and administration frequency on safety in the PANORAMA 1 trial’, British Journal of Haematology, 179(1), pp. 66–74. doi: 10.1111/bjh.14821.

San-Miguel, J. F., Einsele, H. and Moreau, P. (2016) ‘The Role of Panobinostat Plus Bortezomib and Dexamethasone in Treating Relapsed or Relapsed and Refractory Multiple Myeloma: A European Perspective’, Advances in Therapy, 33(11), pp. 1896– 1920. doi: 10.1007/s12325-016-0413-7.

Saxonov, S., Berg, P. and Brutlag, D. L. (2006) ‘A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters’, Proceedings of the National Academy of Sciences, 103(5), pp. 1412–1417. doi: 10.1073/pnas.0510310103.

Schep, A. N. et al. (2015) ‘Structured nucleosome fingerprints enable high-resolution mapping of chromatin architecture within regulatory regions’, Genome Research, 25(11), pp. 1757–1770. doi: 10.1101/gr.192294.115.

185

References

Schlesinger, Y. et al. (2007) ‘Polycomb-mediated methylation on Lys27 of histone H3 pre-marks genes for de novo methylation in cancer.’, Nature genetics, 39(2), pp. 232– 6. doi: 10.1038/ng1950.

Schneider, R. and Grosschedl, R. (2007) ‘Dynamics and interplay of nuclear architecture, genome organization, and gene expression.’, Genes & development, 21(23), pp. 3027–43. doi: 10.1101/gad.1604607.

Schroeder, H. W. et al. (2015) ‘epigenetics of Peripheral B-Cell Differentiation and the Antibody Response’, Frontiers in Immunology, 6(631). doi: 10.3389/fimmu.2015.00631.

Sebastian, S. et al. (2017) Multiple myeloma cells’ capacity to decompose H2O2 determines lenalidomide sensitivity, Blood. doi: 10.1182/blood-2016-09-738872.

Sengupta, S. and George, R. E. (2017) ‘Super-Enhancer-Driven Transcriptional Dependencies in Cancer’, Trends in Cancer, 3(4), pp. 269–281. doi: 10.1016/j.trecan.2017.03.006.

Sérandour, A. A. et al. (2011) ‘Epigenetic switch involved in activation of pioneer factor FOXA1-dependent enhancers.’, Genome research, 21(4), pp. 555–65. doi: 10.1101/gr.111534.110.

Shaffer, A. L. et al. (2008) ‘IRF4 addiction in multiple myeloma.’, Nature, 454(7201), pp. 226–31. doi: 10.1038/nature07064.

Sharifi-Zarchi, A. et al. (2017) ‘DNA methylation regulates discrimination of enhancers from promoters through a H3K4me1-H3K4me3 seesaw mechanism.’, BMC genomics, 18(1), p. 964. doi: 10.1186/s12864-017-4353-7.

Shlyueva, D., Stampfel, G. and Stark, A. (2014) ‘Transcriptional enhancers: from properties to genome-wide predictions’, Nature Publishing Group, 15. doi: 10.1038/nrg3682.

Shu, X. et al. (2011) ‘The epigenetic modifier PRDM5 functions as a tumor suppressor through modulating WNT/β-catenin signaling and is frequently silenced in multiple tumors.’, PloS one. Edited by C. Gottardi, 6(11), p. e27346. doi: 10.1371/journal.pone.0027346.

Shu, X. (2015) ‘Emerging role of PR domain containing 5 (PRDM5) as a broad tumor suppressor in human cancers’, Tumor Biology, 36(1), pp. 1–3. doi: 10.1007/s13277- 014-2916-7.

Shukla, S. et al. (2011) ‘CTCF-promoted RNA polymerase II pausing links DNA methylation to splicing.’, Nature, 479(7371), pp. 74–9. doi: 10.1038/nature10442.

186

References

Shukla, V. and Lu, R. (2014) ‘IRF4 and IRF8: Governing the virtues of B Lymphocytes’, Frontiers in biology, 9(4), p. 269. doi: 10.1007/S11515-014-1318-Y.

Simon, J. A. and Kingston, R. E. (2009) ‘Mechanisms of Polycomb gene silencing: knowns and unknowns’, Nature Reviews Molecular Cell Biology, 10(10), pp. 697–708. doi: 10.1038/nrm2763.

Simonis, M., Kooren, J. and de Laat, W. (2007) ‘An evaluation of 3C-based methods to capture DNA interactions’, Nature Methods, 4(11), pp. 895–901. doi: 10.1038/nmeth1114.

Sims III, R. J. and Reinberg, D. (2009) ‘Processing the H3K36me3 signature’, Nature Genetics, 41(3), pp. 270–271. doi: 10.1038/ng0309-270.

Snyder, M. W. et al. (2016) ‘Cell-free DNA Comprises an In Vivo Nucleosome Footprint that Informs Its Tissues-Of-Origin.’, Cell, 164(1–2), pp. 57–68. doi: 10.1016/j.cell.2015.11.050.

Solimini, N. L., Luo, J. and Elledge, S. J. (2007) ‘Non-Oncogene Addiction and the Stress Phenotype of Cancer Cells’, Cell, 130(6), pp. 986–988. doi: 10.1016/j.cell.2007.09.007.

Song, S. and Johnson, F. B. (2018) ‘Epigenetic Mechanisms Impacting Aging: A Focus on Histone Levels and Telomeres.’, Genes, 9(4). doi: 10.3390/genes9040201.

Splinter, E. et al. (2012) ‘Determining long-range chromatin interactions for selected genomic sites using 4C-seq technology: From fixation to computation’, Methods, 58(3), pp. 221–230. doi: 10.1016/j.ymeth.2012.04.009.

Stratton, M. R., Campbell, P. J. and Futreal, P. A. (2009) ‘The cancer genome.’, Nature, 458(7239), pp. 719–24. doi: 10.1038/nature07943.

Stroud, H. et al. (2011) ‘5-Hydroxymethylcytosine is associated with enhancers and gene bodies in human embryonic stem cells’, Genome Biology, 12(6), p. R54. doi: 10.1186/gb-2011-12-6-r54.

Suan, D., Sundling, C. and Brink, R. (2017) ‘Plasma cell and memory B cell differentiation from the germinal center’, Current Opinion in Immunology, 45, pp. 97–102. doi: 10.1016/j.coi.2017.03.006.

Sur, I. and Taipale, J. (2016) ‘The role of enhancers in cancer’, Nature Reviews Cancer, 16(8), pp. 483–493. doi: 10.1038/nrc.2016.62.

Suva, M. L., Riggi, N. and Bernstein, B. E. (2013) ‘Epigenetic Reprogramming in Cancer’, Science, 339(6127), pp. 1567–1570. doi: 10.1126/science.1230184.

187

References

Tan, S.-X. et al. (2014) ‘Promoter methylation-mediated downregulation of PRDM5 contributes to the development of lung squamous cell carcinoma’, Tumor Biology, 35(5), pp. 4509–4516. doi: 10.1007/s13277-013-1593-2.

Tarakhovsky, A. (2010) ‘Tools and landscapes of epigenetics’, Nature Immunology, 11(7), pp. 565–568. doi: 10.1038/ni0710-565.

Thurman, R. E. et al. (2012) ‘The accessible chromatin landscape of the human genome.’, Nature, 489(7414), pp. 75–82. doi: 10.1038/nature11232.

Tripodi, I. J., Allen, M. A. and Dowell, R. D. (2018) ‘Detecting Differential Transcription Factor Activity from ATAC-Seq Data.’, Molecules, 23(5), p. 1136. doi: 10.3390/molecules23051136.

Trojer, P. and Reinberg, D. (2007) ‘Facultative heterochromatin: is there a distinctive molecular signature?’, Molecular cell, 28(1), pp. 1–13. doi: 10.1016/j.molcel.2007.09.011.

Vincenz, L. et al. (2013) ‘Endoplasmic Reticulum Stress and the Unfolded Protein Response: Targeting the Achilles Heel of Multiple Myeloma’, Molecular Cancer Therapeutics, 12(6), pp. 831–843. doi: 10.1158/1535-7163.MCT-12-0782.

Viré, E. et al. (2006) ‘The Polycomb group protein EZH2 directly controls DNA methylation.’, Nature, 439(7078), pp. 871–4. doi: 10.1038/nature04431.

Visel, A., Rubin, E. M. and Pennacchio, L. A. (2009) ‘Genomic views of distant-acting enhancers.’, Nature, 461(7261), pp. 199–205. doi: 10.1038/nature08451.

Walker, B. A. et al. (2010) ‘A compendium of myeloma-associated chromosomal copy number abnormalities and their prognostic value’, Blood, 116(15), pp. e56–e65. doi: 10.1182/blood-2010-04-279596.

Walker, B. A. et al. (2011) ‘Aberrant global methylation patterns affect the molecular pathogenesis and prognosis of multiple myeloma’, Blood, 117(2), pp. 553–562. doi: 10.1182/blood-2010-04-279539.

Walker, B. A. et al. (2013) ‘Characterization of IGH locus breakpoints in multiple myeloma indicates a subset of translocations appear to occur in pregerminal center B cells’, Blood, 121(17), pp. 3413–3419. doi: 10.1182/blood-2012-12-471888.

Walker, B. A. et al. (2015) ‘Mutational Spectrum, Copy Number Changes, and Outcome: Results of a Sequencing Study of Patients With Newly Diagnosed Myeloma.’, Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 33(33), pp. 3911–20. doi: 10.1200/JCO.2014.59.1503.

188

References

Walker, B. A. et al. (2018) ‘A high-risk, Double-Hit, group of newly diagnosed myeloma identified by genomic analysis’, Leukemia, p. 1. doi: 10.1038/s41375-018- 0196-8.

Wallington-Beddoe, C. T. and Pitson, S. M. (2017) ‘Enhancing ER stress in myeloma.’, Aging, 9(7), pp. 1645–1646. doi: 10.18632/aging.101273.

Wang, H. et al. (2014) ‘NOTCH1–RBPJ complexes drive target gene expression through dynamic interactions with superenhancers’, Proceedings of the National Academy of Sciences, 111(2), pp. 705–710. doi: 10.1073/pnas.1315023111.

Wang, J., Jia, S. T. and Jia, S. (2016) ‘New Insights into the Regulation of Heterochromatin.’, Trends in genetics : TIG, 32(5), pp. 284–294. doi: 10.1016/j.tig.2016.02.005.

Wang, L. et al. (2014) ‘Gene expression profiling identifies IRF4-associated molecular signatures in hematological malignancies.’, PloS one, 9(9), p. e106788. doi: 10.1371/journal.pone.0106788.

Watanabe, Y. et al. (2007) ‘PRDM5 Identified as a Target of Epigenetic Silencing in Colorectal and Gastric Cancer’, Clinical Cancer Research, 13(16), pp. 4786–4794. doi: 10.1158/1078-0432.CCR-07-0305. van de Werken, H. J. G. et al. (2012) ‘4C Technology: Protocols and Data Analysis’, Methods in Enzymology, 513, pp. 89–112. doi: 10.1016/B978-0-12-391938-0.00004- 5. van de Werken, H. J. G. et al. (2012) ‘Robust 4C-seq data analysis to screen for regulatory DNA interactions’, Nature Methods, 9(10), pp. 969–972. doi: 10.1038/nmeth.2173.

White-Gilbertson, S., Hua, Y. and Liu, B. (2013) ‘The role of endoplasmic reticulum stress in maintaining and targeting multiple myeloma: a double-edged sword of adaptation and apoptosis.’, Frontiers in genetics, 4, p. 109. doi: 10.3389/fgene.2013.00109.

Wiederschain, D. et al. (2009) ‘Single-vector inducible lentiviral RNAi system for oncology target validation’, Cell Cycle, 8(3), pp. 498–504. doi: 10.4161/cc.8.3.7701.

Willis, S. N. et al. (2014) ‘Transcription Factor IRF4 Regulates Germinal Center Cell Formation through a B Cell-Intrinsic Mechanism’, The Journal of Immunology, 192(7), pp. 3200–3206. doi: 10.4049/jimmunol.1303216.

189

References

Wingelhofer, B. et al. (2018) ‘Implications of STAT3 and STAT5 signaling on gene regulation and chromatin remodeling in hematopoietic cancer’, Leukemia, 32(8), pp. 1713–1726. doi: 10.1038/s41375-018-0117-x. de Wit, E. and de Laat, W. (2012) ‘A decade of 3C technologies: insights into nuclear organization.’, Genes & development, 26(1), pp. 11–24. doi: 10.1101/gad.179804.111.

Wu, H. et al. (2018) ‘Epigenetic regulation in B-cell maturation and its dysregulation in autoimmunity’, Cellular and molecular immunology, 15(7), pp. 676–684. doi: 10.1038/cmi.2017.133.

Wu, P. et al. (2017) ‘3D genome of multiple myeloma reveals spatial genome disorganization associated with copy number variations’, Nature Communications, 8(1), p. 1937. doi: 10.1038/s41467-017-01793-w.

Yan, H. et al. (2016) ‘ChIP-seq in studying epigenetic mechanisms of disease and promoting precision medicine: progresses and future directions’, Epigenomics, 8(9), pp. 1239–1258. doi: 10.2217/epi-2016-0053.

Yanai, H., Negishi, H. and Taniguchi, T. (2012) ‘The IRF family of transcription factors: Inception, impact and implications in oncogenesis.’, Oncoimmunology, 1(8), pp. 1376–1386. doi: 10.4161/onci.22475.

Yin, Y. et al. (2017) ‘Impact of cytosine methylation on DNA binding specificities of human transcription factors’, Science, 356(6337), p. eaaj2239. doi: 10.1126/science.aaj2239.

Yu, H. et al. (2017) ‘Clinicopathological significance of the <i>p16</i> hypermethylation in multiple myeloma, a systematic review and meta-analysis’, Oncotarget, 8(47), pp. 83270–83279. doi: 10.18632/oncotarget.18742.

Yu, X. et al. (2014) ‘Small molecule compounds targeting the p53 pathway: are we finally making progress?’, Apoptosis, 19(7), pp. 1055–1068. doi: 10.1007/s10495-014- 0990-3.

Zabidi, M. A. et al. (2015) ‘Enhancer–core-promoter specificity separates developmental and housekeeping gene regulation’, Nature, 518(7540), pp. 556–559. doi: 10.1038/nature13994.

Zaret, K. S. and Carroll, J. S. (2011) ‘Pioneer transcription factors: establishing competence for gene expression.’, Genes & development, 25(21), pp. 2227–41. doi: 10.1101/gad.176826.111.

190

References

Zentner, G. E., Tesar, P. J. and Scacheri, P. C. (2011) ‘Epigenetic signatures distinguish multiple classes of enhancers with distinct cellular functions’, Genome Research, 21(8), pp. 1273–1283. doi: 10.1101/gr.122382.111.

Zhang, J. et al. (2017) ‘Targeting the Thioredoxin System for Cancer Therapy’, Trends in Pharmacological Sciences, 38(9), pp. 794–808. doi: 10.1016/j.tips.2017.06.001.

Zhang, T. et al. (2012) ‘Silencing thioredoxin induces liver cancer cell senescence under hypoxia.’, Hepatology research : the official journal of the Japan Society of Hepatology, 42(7), pp. 706–13. doi: 10.1111/j.1872-034X.2012.00973.x.

Zhang, T., Cooper, S. and Brockdorff, N. (2015) ‘The interplay of histone modifications-writers that read’, Embo Reports, 26(22), pp. 1467–1481. doi: 10.15252/embr.201540945.

Zhang, W. et al. (2015) ‘Aging stem cells. A Werner syndrome stem cell model unveils heterochromatin alterations as a driver of human aging.’, Science, 348(6239), pp. 1160–3. doi: 10.1126/science.aaa1356.

Zhou, T. et al. (2017) ‘Therapeutic Targeting of Interferon Regulatory Factor 4 with Next Generation Antisense Oligonucleotides Produces Robust In Vivo Antitumor Activity in Preclinical Models of Multiple Myeloma’, Blood, 130(Suppl 1).

Zhu, J. et al. (2013) ‘Genome-wide chromatin state transitions associated with developmental and environmental cues.’, Cell, 152(3), pp. 642–54. doi: 10.1016/j.cell.2012.12.033.

Ziller, M. J. et al. (2015) ‘Coverage recommendations for methylation analysis by whole-genome bisulfite sequencing’, Nature Methods, 12(3), pp. 230–232. doi: 10.1038/nmeth.3152.

191

APPENDIX

Appendix

Additionally to the work presented in this doctoral thesis, I have collaborated in the following projects:

Appendix 1. Discovery of Reversible DNA Methyltransferase and Lysine Methyltransferase G9a Inhibitors with Antitumoral in Vivo Efficacy

Appendix 2. Epigenomic profiling of myelofibrosis reveals widespread DNA methylation changes in enhancer elements and ZFP36L1 as a potential tumor suppressor gene epigenetically regulated

195

Appendix

Appendix 1

Discovery of Reversible DNA Methyltransferase and Lysine Methyltransferase G9a Inhibitors with Antitumoral in Vivo Efficacy

Obdulia Rabal* , Edurne San José-Enériz*, Xabier Agirre*, Juan Antonio Sánchez- Arias, Amaia Vilas-Zornoza, Ana Ugarte, Irene de Miguel, Estíbaliz Miranda, Leire Garate, Mario Fraga, Pablo Santamarina, Raul Fernandez Perez, Raquel Ordoñez, Elena Sáez, Sergio Roa, María José García-Barchino, José Angel Martínez-Climent, Yingying Liu, Wei Wu, Musheng Xu, Felipe Prosper*, and Julen Oyarzabal*. *These authors contributed equally to this work.

Journal of Medicinal Chemistry, 2018; 61(15):6518-6545

197

Rabal O, et al. Discovery ofReversibleDNAMethyltransferaseandLysine Methyltransferase G9aInhibitorswithAntitumoralinVivoEfficacy. Journal of Chemical Chemistry, 2018, 61(15):6518-6545. http://doi.org/10.1021/acs.jmedchem.7b01926

Appendix

Appendix 2

Epigenomic profiling of myelofibrosis reveals widespread DNA methylation changes in enhancer elements and ZFP36L1 as a potential tumor suppressor gene epigenetically regulated

Nicolás Martínez-Calle*, Marien Pascual*, Raquel Ordoñez*, Edurne San José-Enériz, Marta Kulis, Estíbaliz Miranda, Elisabeth Guruceaga, Víctor Segura, María José Larráyoz, Beatriz Bellosillo, María José Calasanz, Carles Besses, José Rifón, José I. Martín-Subero, Xabier Agirre*, Felipe Prosper*. *These authors contributed equally to this work.

Submitted to: Haematologica (HAEMATOL/2018/204917)

227

Haematologica HAEMATOL/2018/204917 Version 1

Haematologica HAEMATOL/2018/204917 Version 1 Epigenomic profiling of myelofibrosis reveals widespread DNA methylation changes in enhancer elements and ZFP36L1 as a potential tumor suppressor gene epigenetically regulated

Nicolas Martinez-Calle, Marien Pascual, Raquel Ordoñez, Edurne San José-Enériz, Marta Kulis, Estíbaliz Miranda, Elisabeth Guruceaga, Victor Segura, María José Larrayoz, Beatriz Bellosillo, Maria José Calasanz, Carles Besses, José Rifón, José Ignacio Martín-Subero, Xabier Agirre, and Felipe Prosper

Disclosures: The authors declare no competing interests.

Contributions: Conception and design: NMC, MP, RO, XA and FP Performed Research: NMC, MP, RO, ESJE, EM Acquisition of data and assistance with experiments: MK, EG, VS, MJL, BB, MJC, CB, JR, JIMS Analysis and interpretation of data: NMC, MP, RO, ESJE, MK, JIMS, XA and FP Writing, review, and/or revision of the manuscript: NMC, MP, RO, JIMS, XA and FP Study supervision: XA and FP Haematologica HAEMATOL/2018/204917 Version 1

Epigenomic profiling of myelofibrosis reveals widespread DNA methylation changes in enhancer elements and ZFP36L1 as a potential tumor suppressor gene epigenetically regulated

Nicolás Martínez-Calle1,2#, Marien Pascual1,2#, Raquel Ordoñez1,2#, Edurne San José- Enériz1,2, Marta Kulis3, Estíbaliz Miranda1,2, Elisabeth Guruceaga4, Víctor Segura4, María José Larráyoz5, Beatriz Bellosillo6, María José Calasanz2,5, Carles Besses7, José Rifón2,8, José I. Martín-Subero2,9,10, Xabier Agirre1,2*, Felipe Prosper1,2,8*.

1Área de Hemato-Oncología, Centro de Investigación Médica Aplicada, IDISNA, Universidad de Navarra, Avenida Pío XII-55, Pamplona, Spain. 2Centro de Investigación Biomédica en Red de Cáncer (CIBERONC). 3Fundació Clínic per a la Recerca Biomèdica, Barcelona, Spain 4Unidad de Bioinformática, Centro de Investigación Médica Aplicada, Universidad de Navarra, Avenida Pío XII-55, Pamplona, Spain. 5CIMA Lab Diagnostics, Universidad de Navarra, Pamplona, Spain. 6Departmento de Patología, Hospital del Mar, Barcelona, Spain. 7Departmento de Hematología, Hospital del Mar, Barcelona, Spain. 8Departamento de Hematología, Clínica Universidad de Navarra, Universidad de Navarra, Avenida Pío XII-36, Pamplona, Spain. 9Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain. 10Departament de Fonaments Clinics, Facultat de Medicina, Universitat de Barcelona, Barcelona, Spain.

# These authors share first authorship * These authors share senior authorship

Running head ZFP36L1 enhancer DNA methylation in MF.

1

Haematologica HAEMATOL/2018/204917 Version 1

Word Count: • Manuscript: 3486 • Abstract: 193

Figure count: 3 Table count: 0 Supplementary figures: 1 Supplementary tables: 2 Reference count: 46

Correspondence and requests for materials should be addressed to: X.A. ([email protected]) or to F.P ([email protected]).

KEY POINTS: • Epigenetic profiling of myelofibrosis patients reveals that aberrant DNA methylation is enriched in enhancer regions, outside traditional CpG islands. • ZFP36L1 is inactivated through DNA methylation of its enhancer region and represents a potential novel tumour suppressor, with a potential role in myelofibrosis.

2

Haematologica HAEMATOL/2018/204917 Version 1

ABSTRACT In this study we have interrogated the DNA methylome of myelofibrosis (MF) patients using high-density DNA methylation arrays. We detected 35,215 differentially methylated CpGs (DMC) corresponding to 10,253 genes between MF patients and healthy controls. These changes were present both in primary and secondary MF, which showed no differences between them. Remarkably, the majority of DMCs were located outside gene promoter regions and showed a significant association with enhancer regions. This enhancer aberrant hypermethylation showed a negative correlation with the expression of 27 genes in the MF patient cohort. Of these, we focused on ZFP36L1 gene and validated its decreased expression and enhancer DNA hypermethylation in an independent cohort of MF patients and myeloid cell-lines. In vitro reporter assay and 5’ azacitidine treatment confirmed the functional relevance of the enhancer hypermethylation of ZFP36L1. Furthermore, in vitro rescue of ZFP36L1 expression had an impact in cell proliferation and induced apoptosis in SET-2 cell line indicating a possible role of ZFP36L1 as a tumor suppressor gene in MF. Overall, we describe the MF DNA methylation profile, identify extensive changes in enhancer elements and reveal ZFP36L1 as a novel candidate tumor suppressor gene in this disease.

3

Haematologica HAEMATOL/2018/204917 Version 1

INTRODUCTION Philadelphia chromosome-negative myeloproliferative neoplasms (MPN), namely polycythemia vera (PV), essential thrombocythemia (ET) and primary myelofibrosis (MF) are characterized by a clonal transformation of hematopoietic progenitors leading to expansion of fully differentiated myeloid cells1. Primary MF carries the worst prognosis of all MPN, harboring progressive marrow fibrosis, extramedullary hematopoiesis, mild to severe splenomegaly and increased risk of leukemic transformation2. Secondary MF can also arise from PV and ET (post-PV or post-ET MF hereafter) by mechanisms that are still poorly understood and are clinically and morphologically indistinguishable from primary MF3.

MF has been intensively studied from the genetic perspective4,5, in fact, the modified WHO diagnostic criteria for MPN requires the demonstration of a genetic marker of clonal hematopoiesis (JAK2V617F, CALR or MPL mutations)6. The frequency of mutations on relevant epigenetic genes (i.e DNMT3A, EZH2 and ASXL1), suggest that MF might have an epigenetic component that to our knowledge, remains poorly characterized5. Actually, epigenetic changes such as DNA methylation have been scarcely addressed in MF7,8 partly due to the limited changes in promoter DNA methylation compared to other hematological malignancies, as previously published by our group9. DNA methylation of CpG islands (mostly on putative promoter regions) has been traditionally studied in both normal and neoplastic hematopoiesis10,11. However, high throughput platforms offer a wider coverage of the genome, allowing a better understanding of DNA methylation dynamics in regions distant from CpG islands12. In this regard, enhancer regions have been characterized as potentially relevant sites of DNA methylation outside CpG islands13-14,15. ChIP-seq studies permit the reliable mapping of genome-wide active enhancer regions based on histone modifications (i.e. H3K4me1 and H3K27Ac)16,17, allowing the identification of enhancers playing a role in the dynamic transcriptional regulation during hematopoiesis18.

The present work describes a comprehensive genome-wide analysis of DNA methylation in MF patients, which has been coupled with gene expression analysis and information of

4

Haematologica HAEMATOL/2018/204917 Version 1

functional chromatin states and compared with healthy donor samples17. Focusing on potential epigenetic alterations in enhancer regions, we identify ZFP36L1 as potential tumor suppressor gene with relevance for the pathogenesis of MF.

5

Haematologica HAEMATOL/2018/204917 Version 1

MATERIAL AND METHODS Patient samples and clinical data Samples and patient data were provided by the Biobank of the University of Navarra and were processed following standard operating procedures approved by the local Ethics & Scientific Committee. All patients consented prior to sample extraction and for the use of stored material for research purposes. MF patient samples (n=39) were composed of either bone marrow (BM), granulocytes or total peripheral blood (PB) cells. Among the MF cohort there were primary MF (n=22), MF post-ET (n=7) or MF post-PV (n=10). PB cells from healthy donors (n=6) were used as control samples in this study. All patients were diagnosed using the 2008 version of the World Health Organization (WHO) classification system of hematological malignancies19. Data for JAK2V617F mutation was retrospectively available for all patients. Patient’s data are available from the Gene Expression Omnibus (GSE118241).

DNA Methylation profiling DNA methylation was assessed with Human-Methylation 450K Bead-Chip kit (Illumina, Inc., San Diego, CA, USA). The array-based assays for DNA methylation profiling were performed at the National Centre of Oncologic Investigations (CNIO, Madrid, Spain). Briefly, 500ng of genomic DNA were modified with sodium bisulfite (EZ DNA Methylation Kit, Zymo Research) and subsequently whole genome amplified following manufacturer’s recommendations. Samples were then hybridized in the assay chips as previously described12. Data arising from the 450K Human-Methylation array was analyzed by Bioconductor open source software. The analytical pipeline implemented several filters to exclude technical and biological biases (i.e. sex-specific methylation or overlapping CpGs with SNPs) and taking into account the performance characteristics of Infinium I and Infinium II assays20. Differentially Methylated CpGs (DMC) were defined as previously described14,20. DNA methylation data sets are available from the Gene Expression Omnibus (GSE118241).

6

Haematologica HAEMATOL/2018/204917 Version 1

Genomic and functional annotation of CpG sites

The hg19 version of the UCSC Genome Browser database was used to annotate raw data from the DNA methylation array. DMCs were annotated into four categories relative to CpG islands (CGI) as follows: inside CGI, CGI-shore (0-2 Kb from the CGI), CGI-shelf (>2 Kb up to 4 Kb from the CGI) and outside CGI (>4kb from the CGI). All annotations were extracted from Ensembl database (http://www.ensembl.org). DMCs were also annotated according to publicly available functional chromatin states of human CD34+ cells following the ChromHMM algorithm16,17. Chromatin states were categorized in six functional features (0: heterochromatin; 1-3: transcription-start sites; 4-5: enhancer regions (weak and strong); 6: promoters). This final annotation led to the identification of a group of DMC between MF and healthy controls that mapped to enhancer regions. The genes adjacent to these enhancers were then used for Gene Ontology (GO) functional enrichment analysis (GO-PANTHER) as described elsewhere21.

Identification of candidate genes targeted by aberrant DNA methylation in enhancers Data of gene expression profiling (Affymetrix gene expression array) from primary MF (n=9) and healthy peripheral blood samples (n=21) was obtained from the publicly available GEO accession bank number GSE2604922. Data was further processed using R and the open source Limma package23. Genes showing consistent and ample differences in DNA methylation between MF and the control group were included (FDR<0.01; Δβ>0.4) and then were subsequently filtered by the changes in gene expression (logFC values below 0). The final list included 31 probes, encompassing 27 coding genes. Each of these genes was subsequently explored by literature search to identify those with potential implication in the hematopoietic system.

Cell culture The SET-2 cell line (DMSZ # ACC-608; established from the peripheral blood of a patient diagnosed with essential thrombocythemia at megakaryoblastic leukemic transformation) was maintained in RPMI medium supplemented with 20% fetal bovine serum and antibiotics (100 IU/mL penicillin, 50 µg/mL streptomycin). HEL (DSMZ #

7

Haematologica HAEMATOL/2018/204917 Version 1

ACC-11) and HL-60 (DSMZ # ACC-3) cell lines were maintained in RPMI medium supplemented with 10% fetal bovine serum and antibiotics (100 IU/mL penicillin, 50 µg/mL streptomycin). Cells were seeded at 0.5x106 cells/ml and incubated at 37oC and

5% CO2 with 95% humidity.

Bisulfite sequencing DNA methylation levels were interrogated and validated using traditional bisulfite sequencing. Briefly, after bisulfite modification (CpGenome DNA modification Kit, Merck, Darmstadt, Germany), the fragment of interest was amplified, sub-cloned into the pGEM-T easy vector system (Promega, Madison, USA) and transformed in JM109 competent cells (Promega, Madison, USA). Plasmid DNA was extracted (Nucleospin Plasmid, Macherey Nagel, Germany) and for each condition, at least 10 different CFUs (colony forming units) were sequenced by classical Sanger method using Genetic Analyzer 3130XL (Life Technologies, Carlsbad, USA). Universally methylated human DNA (Zymo research, USA) was used as a positive DNA methylation control for bisulfite modification. Primer sequences are available in Supplementary Table 1.

Luciferase reporter assays The CpG-free vector (pCPG-L), gently provided by Dr. Michael Rehli24, was used to clone the ZFP36L1 enhancer region, after amplification with high-fidelity PlatinumTM Taq polymerase (Invitrogen, Walthman, USA). Cloned plasmids were amplified on PIR1 competent E. coli cells (Invitrogen, Carlsbad, USA) and then treated with SssI CpG methyltransferase enzyme (New England Biolabs, Ipswich, USA) following manufacturer´s instructions. HEK293T cells were co-transfected with 10 ng/µl of DNA methylated or unmethylated reporter plasmids and 0.5 ng/µl renilla luciferase vector (pRL-SV40 Renilla Luciferase Control Reporter Vector, Promega), using Lipofectamine- 2000 (Invitrogen, Carlsbad, CA). Forty-eight hours after transfection, luciferase/Renilla activity was analyzed using the Dual Luciferase® reporter assay system (Promega, Madison, USA) in an automatic 96-well plate reader according to manufacturer's instructions. Luciferase experiments were performed in triplicates. Primer sequences are available in Supplementary Table 1.

8

Haematologica HAEMATOL/2018/204917 Version 1

Gene expression by Q-PCR Quantitative PCRs (Q-PCR) were performed with SYBR Green Master Mix (Applied Biosystems, Foster City, USA) and QuantStudio-3 96-well real-time PCR system (Applied Biosystems, Foster City, USA). GUSB gene was used as housekeeping reference gene in all cases. Primer sequences are available in Supplementary Table 1.

ZFP36L1 binding motif search To further validate the potential relevance of ZFP36L1 gene in MF, DREME motif- discovery algorithm25 was used to assess enrichment of genes with the ZFP36L1 consensus binding sequence among those genes differentially expressed in MF (FDR≤ 0.05).

Overexpression of ZFP36L1 A vector containing ZFP36L1 open reading frame was kindly provided by Dr. Murphy and sub-cloned into a PL-SIN-GK vector26. The same vector backbone carrying an EGFP open reading frame was used as experimental control. Lentiviral particles were generated by co-transfecting HEK293T cells with ZFP36L1-encoding plasmid, psPAX2 (Addgene, #12260) and pMD2G (Addgene, #12259) plasmids (3:2:1 ratio) using Lipofectamine- 2000 (Invitrogen, Carlsbad, CA). Viral supernatant was harvested after 72h, filtered (0.2 µm), concentrated by ultra-centrifugation (26,000 g for 2.5 hours at 4 oC) and supplemented to the cells for infection with polybrene (Sigma-Aldrich, Saint Louis, USA) at 4 µg/ml. SET-2 cell line was used as an in vitro model of JAK2V617F mutated MPN. Infection efficiency was assessed after 72h, determining EGFP positive cells by FACScanto II™ flow cytometer (BD Biosciences, San Jose, USA) and CellQuest software™ (Becton Dickinson, Franklin Lakes, USA).

Apoptosis and cell proliferation assays Apoptosis was assessed using the FITC Annexin V Apoptosis Detection Kit I (BD Biosciences, San Jose, USA) following manufacturer´s protocol. Apoptosis was analyzed using FACScanto™ flow cytometer and CellQuest software™ (Becton Dickinson,

9

Haematologica HAEMATOL/2018/204917 Version 1

Franklin Lakes, USA). Cellular proliferation was assessed with standard MTS assays using the CellTiter 96® AQueous MTS Reagent (Promega, Madison, USA). All experiments were performed in triplicates.

Western Blotting After standard protein extraction, 50µg of protein were separated by 10% SDS-PAGE electrophoresis and transferred onto a nitrocellulose membrane (Bio-Rad, Hercules, USA). Membranes were incubated with the primary antibodies as follows: polyclonal rabbit anti ZFP36L1/ZFP36L2 (#2119 Cell Signaling, Leiden, Netherlands), loading control was made with mouse anti β-actin antibody (A5441 Sigma-Aldrich, St. Louis, MO). Anti-rabbit IgG (A3687 Sigma-Aldrich, St. Louis, USA) and anti-mouse IgG (A1418 Sigma-Aldrich, St. Louis, USA) antibodies conjugated with alkaline phosphatase were used as secondary antibodies.

Statistical Analysis For parametric group comparisons one-way ANOVA with Dunnet correction was used, whereas for non-parametric group comparisons Kruskall-Wallis test with Dunn correction was employed. Paired data was analyzed with Friedman non-parametric test with Dunn correction for multiple comparisons, for the data with single measurements. Two-way ANOVA with Tukey correction was used for data with multiple paired measurements. All tests were performed using Prism 7TM software (GraphPad, La Jolla, USA).

10

Haematologica HAEMATOL/2018/204917 Version 1

RESULTS AND DISCUSSION MF is characterized by a specific DNA methylation pattern enriched in enhancer regions In order to provide an exhaustive analysis of the DNA methylation profile in patients with MF, we analyzed the DNA methylome of patients with primary MF, secondary MF (including MF post-ET/post-PV) and healthy donors as controls, using the Human- Methylation 450K array. The first result worth highlighting is the epigenetic similarity, between primary and post-ET/post-PV myelofibrosis. Interestingly, with a FDR<0.05, no differentially methylated CpGs (DMCs) between primary and secondary MF were found. Furthermore, we did not identify any CpG differentially methylated between post-ET and post-PV MF. However, both unsupervised principal component analysis (PCA) (Figure 1A) and hierarchical clustering study (Supplementary Figure 1A) using all CpGs analyzed confirmed an explicit segregation and a clear epigenetic difference between MF patient samples and healthy controls. These results allowed us hereafter to consider MF samples as a single sample cohort. Primary and secondary MFs are known to have very similar biological features, presenting symptoms and clinical course and in fact, both entities are treated indistinctively according to most published guidelines27. Nevertheless, some recent evidence from large retrospective trials has suggested that traditional prognostic factors may not be applicable to secondary MF as post-ET MF seems to have longer survival as compared to post-PV and primary MF3,27,28,29. Our current results support that consistent with the similarities in their biological and clinical characteristics, primary and secondary MF are also remarkably homogenous in terms of their epigenetic profile, supporting a common biological origin30.

Next, we sought to interrogate changes in DNA methylation between MF samples (considering primary and post-ET/PV MF as a single entity) and healthy controls. In this supervised analysis, we detected 35,215 DMCs (FDR≤0.05) corresponding to 10,253 coding genes. Among all of these DMCs, 65.3% were hypomethylated (corresponding to 22,998 CpGs) and the remaining 34.7% were hypermethylated (a total of 12,217 CpGs), suggesting that loss of DNA methylation is the predominant alteration in this disease. Global DNA hypomethylation has also been a common finding in other hematological

11

Haematologica HAEMATOL/2018/204917 Version 1

malignancies such as chronic lymphocytic leukemia, multiple myeloma or acute myeloid leukaemia14,31,32. Changes in DNA methylation levels are known to cooperate with the deposition of chromatin marks, particularly H3K4 methylation, to render the enhancers/promoters accessible/inaccessible for the transcription machinery33-35. Hence, the changes in methylation observed in MF are very likely to impact the transcriptional profile of MF and potentially contribute to the malignant phenotype.

Even though previous studies have already interrogated the DNA methylation landscape of MF7-9, their findings are limited to small numbers of epigenetic abnormalities mainly focused on the study of promoter regions. Our genome-wide approach of DNA methylation analysis using the 450k array allowed us to interrogate regulatory regions outside traditional promoters, allowing us to obtain a deeper insight into the aberrant DNA methylome of MF. Therefore, in order to characterize the functionality of the detected DMCs, we performed a series of analyses. First, we analyzed their genomic location and identified that both hyper and hypomethylated CpGs were underrepresented in classical CGIs and significantly enriched outside CGIs (Figure 1B). This is an interesting finding, because traditionally, neoplasms acquire hypomethylation outside CGIs and hypermethylation in CGIs14,32, and suggests that patterns of methylation gain in MF might differ to other neoplasms. To shed light into the specific function of the DMCs, chromatin state categorization of each CpG was done adapting a publicly available annotation of ChIP-seq data from CD34+ hematopoietic progenitor cells, in which four distinct states were defined: promoter (with H3K4me3), active enhancer (with H3K4me1 and H3K27ac), transcribed regions (showing H3K36me3) and heterochromatin (including H3K9me3 and H3K27me3)17. Both hyper and hypomethylated CpGs showed a significant enrichment in enhancer regions, together with a striking underrepresentation in promoter regions (Figure 1B). Unsupervised clustering of DMCs located exclusively in enhancer regions (Supplementary table 2) displayed a clear segregation of the majority of MF patients from healthy controls (Figure 1C) identifying 4182 hyper and 10935 hypomethylated. This result reveals that patients with MF show an intrinsic aberrant pattern of DNA methylation preferentially located in enhancer regions of the genome.

12

Haematologica HAEMATOL/2018/204917 Version 1

To functionally characterize this aberrant DNA methylation of enhancer regions in MF, GO-PANTHER enrichment analysis was performed separately in differentially methylated genes. GO terms with an adjusted FDR<0.05 were selected, showing in the case of hypermethylated enhancers interesting cellular processes such as cellular defense response or induction of apoptosis (Figure 1D). Enhancer DNA methylation changes have been described to play a more prominent role in transcriptional regulation than promoter DNA methylation, thus governing processes such as hematopoietic differentiation and neoplastic transformation through the regulation of key transcription factors and genes 13,14,32,35,36,. These evidence translated into the context of MPN, might support the implication of aberrant enhancer DNA methylation in the abnormal pattern of differentiation leading to MF.

DNA methylation of enhancer regions is associated with gene expression profile in MF Next, DNA methylation levels of enhancer regions were associated with the expression of host and adjacent coding genes using publicly available gene expression data of an independent cohort of MF patients and healthy donors (GSE26049)22. Fold increase in gene expression values were grouped according to the hypermethylated (Δβ>0.4) or hypomethylated (Δβ<-0.4) enhancer status in MF versus controls. This analysis showed that enhancer DNA hypermethylation was associated with decreased gene expression of host/adjacent coding genes. In contrast, hypomethylated enhancer regions were not related to increased gene expression (Figure 2A). These data suggest that aberrant DNA hypermethylation may be functionally more relevant than hypomethylation in MF. Enhancer hypermethylation has been reported in neutrophils13, B-cells37,AML cells31 and MM14 adding evidence to dynamic enhancer DNA hypermethylation as a relevant regulatory mechanism of gene expression both in normal and neoplastic hematopoietic cells.

Next, we analyzed in detail those genes showing consistent association between enhancer hypermethylation and downregulation of their expression (Figure 2B). After identifying

13

Haematologica HAEMATOL/2018/204917 Version 1

a number of potential candidates we focused on ZFP36L1, a gene coding for a RNA- binding protein that mediates the decay of unstable mRNAs with AU rich elements in the 3’ untranslated region (3’UTR)38,39. ZFP36L1 has been implicated in normal and malignant hematopoiesis40 and has been proven to mediate mRNA decay of relevant genes for cell proliferation, survival and differentiation such as CDK6, TNF-alpha, BCL2, NOTCH1 and STAT5B41-42. Interestingly, the enhancer region associated with this gene was consistently hypermethylated in MF patient cohort and correlated with a downregulation in MF as compared to controls (Figure 2B and Supplementary Figure 1B), which was further confirmed in an independent cohort of MF patients and myeloid cell lines (Figure 2C). Bisulfite sequencing revealed that DNA methylation of the enhancer region of ZFP36L1 was consistently higher in all MF samples and myeloid cell lines as compared to control samples, whereas the promoter region remained unmethylated (Figure 2D-E, and Supplementary figure 1C). Results obtained by luciferase-reporting assays demonstrated that the exogenous DNA methylation reduced significantly the ZFP36L1 enhancer activity (Figure 2F). Moreover, 5´azacytidine hypomethylating treatment was able to reverse the DNA methylation levels of the enhancer region in vitro, restoring the gene expression levels of ZFP36L1 in SET-2 cell line (Figures 2G and 2H). Collectively, our results suggest that ZFP36L1 downregulation in MF is mediated by hypermethylation of its enhancer element.

ZFP36L1 acts as a tumor suppressor gene in MPN ZFP36L1 is known to mediate the degradation of mRNAs with AU rich elements in their 3’UTR. Therefore, we hypothesized that ZFP36L1 downregulation could lead to upregulation of its putative targets in MF. To further validate this hypothesis, we used DREME, a motif discovery algorithm specifically designed to find short, core DNA- binding motifs enriched in the 3′UTR of genes. We found that the motif GTATTTDT (E- value=4.5*10-15) was in fact overrepresented in transcripts upregulated in MF patients (Figure 3A). Subsequently, an analysis of motif enrichment was performed, revealing a significant enrichment of the mentioned motif (p = 7.69*10-20) in transcripts upregulated in MF patients (log FC > 1; p < 0.05).

14

Haematologica HAEMATOL/2018/204917 Version 1

Moreover, as a complementary approach, we searched the database for AU-rich elements AREsite43 to determine if, from the differentially expressed genes (B-value > 10) between MF and controls, we could detect an enrichment of these sequences in the upregulated subset of genes. Of all the possible AU motifs, we focused on the most restricted 9, 11 and 13-mer motifs. Interestingly, we were able to identify an enrichment of 9-mer sequence WTATTTATW (p= 0.01) and the 13-mer sequence WWWTATTTATWWW (p= 0.03) exclusively among the upregulated genes in MF patients (Figure 3A). Remarkably, both AU motifs highly resemble the core binding motif for ZFP36L1 predicted by DREME algorithm, enforcing the regulatory role of this gene in MF. These results suggest that ZFP36L1 downregulation is involved in MF pathogenesis through a dysregulation of its transcriptome.

To further characterize the impact of ZFP36L1 downregulation in MF, we tried to revert this phenotype by ectopic overexpression of the gene using lentiviral infection in the cell line SET-2. 72 hours post-infection, the levels of EGFP positive cells were evaluated to infer the efficacy of infection, together with the expression and protein levels of ZFP36L1 to ensure the correct overexpression of the gene (Fig 3B-D). Cell viability was measured for five consecutive days by MTS assay and a decrease of more than 50% in cell proliferation was observed, with an increase of AnnexinV positive cells as measured by flow cell cytometry (Figures 3E and 3F). Taken together, these results suggest that ZFP36L1 downregulation might lead to a deregulation of the cell transcriptome, including relevant genes for cell proliferation, survival and differentiation, as previously described41,44-46. Consequently, when ZFP36L1 expression levels are restored, SET-2 cells loss their malignant proliferative phenotype, enforcing the tumor suppressor role of this gene in MF.

15

Haematologica HAEMATOL/2018/204917 Version 1

CONCLUSION

The DNA methylation landscape of primary MF and post-ET/post-PV MF compared to healthy individuals show a consistent and differential DNA methylation profile between them. Absence of differences between primary MF and post-ET/post-PV MF suggests that these changes are fundamental epigenetic alterations occurring at the level of MPN stem cells and maintained in differentiated myeloid cells. Aberrant DNA methylation in MF is predominantly located in enhancer regions and has a significant impact on the expression of their target genes. Combining DNA methylation and gene expression data, we have identified ZFP36L1 as an attractive new possible therapeutic target that shows a decrease of gene expression mediated by enhancer hypermethylation. Our results also suggest a direct effect of ZFP36L1 downregulation in the gene expression profile of MF, through upregulation of mRNAs harbouring ARE canonical sites. For instance, in vitro rescue of ZFP36L1 expression had an impact in cell proliferation and induced apoptosis in SET-2 cell line, indicating a possible role of ZFP36L1 as a tumour suppressor gene in MF. Moreover, treatment with 5´azacytidine further evidenced the plausibility of ZFP36L1 pharmacologic manipulation. Taken together, these results constitute an unexplored therapeutic target for MF patients, which remain to be properly evaluated in the pre-clinical scenario.

16

Haematologica HAEMATOL/2018/204917 Version 1

Acknowledgments We particularly acknowledge the patients for their participation and the Biobank of the University of Navarra for its collaboration. We thank John J Murphy and Amor Alcaraz for providing ZFP36L1 expression constructs. This research was funded by grants from Instituto de Salud Carlos III (ISCIII) PI14/01867, PI16/02024 and PI17/00701, TRASCAN (EPICA), CIBERONC (CB16/12/00489; Co-finance with FEDER funds), RTICC (RD12/0036/0068) and Departamento de Salud del Gobierno de Navarra 40/2016. N.M. is supported by a FEHH-Celgene research grant and M.P. was supported by a Sara Borrell fellowship CD12/00540.

Authorship and Conflicts of Interest The authors declare no competing conflicts of interests over the present manuscript.

17

Haematologica HAEMATOL/2018/204917 Version 1

REFERENCES 1. Spivak JL. Myeloproliferative Neoplasms. Longo DL, editor. N Engl J Med. Massachusetts Medical Society; 2017 Jun 1;376(22):2168–81.

2. Kim J, Haddad RY, Atallah E. Myeloproliferative neoplasms. Dis Mon. 2012 Apr;58(4):177–94.

3. Passamonti F, Rumi E, Caramella M, Elena C, Arcaini L, Boveri E, et al. A dynamic prognostic model to predict survival in post-polycythemia vera myelofibrosis. Blood. American Society of Hematology; 2008 Apr 1;111(7):3383– 7.

4. Kim SY, Im K, Park SN, Kwon J, Kim J-A, Lee DS. CALR, JAK2, and MPL mutation profiles in patients with four different subtypes of myeloproliferative neoplasms: primary myelofibrosis, essential thrombocythemia, polycythemia vera, and myeloproliferative neoplasm, unclassifiable. Am J Clin Pathol. 2015 May;143(5):635–44.

5. Vainchenker W, Kralovics R. Genetic basis and molecular pathophysiology of classical myeloproliferative neoplasms. Blood. 2017 Feb 9;129(6):667–79.

6. Arber DA, Orazi A, Hasserjian R, Thiele J, Borowitz MJ, Le Beau MM, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. American Society of Hematology; 2016 May 19;127(20):2391–405.

7. Myrtue Nielsen H, Lykkegaard Andersen C, Westman M, Sommer Kristensen L, Asmar F, Arvid Kruse T, et al. Epigenetic changes in myelofibrosis: Distinct methylation changes in the myeloid compartments and in cases with ASXL1 mutations. Sci Rep. Nature Publishing Group; 2017 Jul 28;7(1):6774.

8. Nischal S, Bhattacharyya S, Christopeit M, Yu Y, Zhou L, Bhagat TD, et al. Methylome profiling reveals distinct alterations in phenotypic and mutational subgroups of myeloproliferative neoplasms. Cancer Res. American Association for Cancer Research; 2013 Feb 1;73(3):1076–85.

9. Pérez C, Pascual M, Martín-Subero JI, Bellosillo B, Segura V, Delabesse E, et al. Aberrant DNA methylation profile of chronic and transformed classic Philadelphia-negative myeloproliferative neoplasms. Haematologica. 2013 Sep;98(9):1414–20.

10. Mendizabal I, Yi SV. Whole-genome bisulfite sequencing maps from multiple human tissues reveal novel CpG islands associated with tissue-specific regulation. Hum Mol Genet. 2016 Jan 1;25(1):69–82.

11. Deaton AM, Webb S, Kerr ARW, Illingworth RS, Guy J, Andrews R, et al. Cell

18

Haematologica HAEMATOL/2018/204917 Version 1

type-specific DNA methylation at intragenic CpG islands in the immune system. Genome Res. 2011 Jul;21(7):1074–86.

12. Sandoval J, Heyn H, Moran S, Serra-Musach J, Pujana MA, Bibikova M, et al. Validation of a DNA methylation microarray for 450,000 CpG sites in the human genome. Epigenetics. 2011 Jun;6(6):692–702.

13. Rönnerblad M, Andersson R, Olofsson T, Douagi I, Karimi M, Lehmann S, et al. Analysis of the DNA methylome and transcriptome in granulopoiesis reveals timed changes and dynamic enhancer methylation. Blood. American Society of Hematology; 2014 Apr 24;123(17):e79–89.

14. Agirre X, Castellano G, Pascual M, Heath S, Kulis M, Segura V, et al. Whole- epigenome analysis in multiple myeloma reveals DNA hypermethylation of B cell- specific enhancers. Genome Res. 2015 Apr;25(4):478–87.

15. Bell RE, Golan T, Sheinboim D, Malcov H, Amar D, Salamon A, et al. Enhancer methylation dynamics contribute to cancer plasticity and patient mortality. Genome Res. 2016 May;26(5):601–11.

16. Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011 May 5;473(7345):43–9.

17. Wijetunga NA, Delahaye F, Zhao YM, Golden A, Mar JC, Einstein FH, et al. The meta-epigenomic structure of purified human stem cell populations is defined at cis-regulatory sequences. Nat Commun. 2014 Oct 20;5:5195.

18. Lara-Astiaso D, Weiner A, Lorenzo-Vivas E, Zaretsky I, Jaitin DA, David E, et al. Immunogenetics. Chromatin state dynamics during blood formation. Science. 2014 Aug 22;345(6199):943–9.

19. Tefferi A, Thiele J, Vardiman JW. The 2008 World Health Organization classification system for myeloproliferative neoplasms. Cancer. 2009 Sep 1;115(17):3842–7.

20. Kulis M, Merkel A, Heath S, Queirós AC, Schuyler RP, Castellano G, et al. Whole-genome fingerprint of the DNA methylome during human B cell differentiation. Nat Genet. 2015 Jul;47(7):746–56.

21. Mi H, Huang X, Muruganujan A, Tang H, Mills C, Kang D, et al. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res. 2017 Jan 4;45(D1):D183–9.

22. Skov V, Larsen TS, Thomassen M, Riley CH, Jensen MK, Bjerrum OW, et al. Whole-blood transcriptional profiling of interferon-inducible genes identifies highly upregulated IFI27 in primary myelofibrosis. European Journal of

19

Haematologica HAEMATOL/2018/204917 Version 1

Haematology. Blackwell Publishing Ltd; 2011 Jul;87(1):54–60.

23. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015 Apr 20;43(7):e47–7.

24. Klug M, Rehli M. Functional analysis of promoter CpG methylation using a CpG- free luciferase reporter vector. Epigenetics. 2006 Jul;1(3):127–30.

25. Bailey TL. DREME: motif discovery in transcription factor ChIP-seq data. Bioinformatics. 2011 Jun 15;27(12):1653–9.

26. Hotta A, Cheung AYL, Farra N, Vijayaragavan K, Séguin CA, Draper JS, et al. Isolation of human iPS cells using EOS lentiviral vectors to select for pluripotency. Nat Methods. 2009 May;6(5):370–6.

27. Reilly JT, McMullin MF, Beer PA, Butt N, Conneally E, Duncombe AS, et al. Use of JAK inhibitors in the management of myelofibrosis: a revision of the British Committee for Standards in Haematology Guidelines for Investigation and Management of Myelofibrosis 2012. British Journal of Haematology. 2014 Nov;167(3):418–20.

28. Hernández-Boluda J-C, Pereira A, Gómez M, Boqué C, Ferrer-Marín F, Raya J-M, et al. The International Prognostic Scoring System does not accurately discriminate different risk categories in patients with post-essential thrombocythemia and post-polycythemia vera myelofibrosis. Haematologica. Haematologica; 2014 Apr;99(4):e55–7.

29. Passamonti F, Giorgino T, Mora B, Guglielmelli P, Rumi E, Maffioli M, et al. A clinical-molecular prognostic model to predict survival in patients with post polycythemia vera and post essential thrombocythemia myelofibrosis. Leukemia. Nature Publishing Group; 2017 Dec;31(12):2726–31.

30. Brecqueville M, Rey J, Devillier R, Guille A, Gillet R, Adélaide J, et al. Array comparative genomic hybridization and sequencing of 23 genes in 80 patients with myelofibrosis at chronic or acute phase. Haematologica. 2014 Jan;99(1):37–45.

31. Qu Y, Siggens L, Cordeddu L, Gaidzik VI, Karlsson K, Bullinger L, et al. Cancer- specific changes in DNA methylation reveal aberrant silencing and activation of enhancers in leukemia. Blood. American Society of Hematology; 2017 Feb 16;129(7):e13–e25.

32. Kulis M, Heath S, Bibikova M, Queirós AC, Navarro A, Clot G, et al. Epigenomic analysis detects widespread gene-body DNA hypomethylation in chronic lymphocytic leukemia. Nat Genet. Nature Publishing Group; 2012 Nov;44(11):1236–42.

33. Sharifi-Zarchi A, Gerovska D, Adachi K, Totonchi M, Pezeshk H, Taft RJ, et al.

20

Haematologica HAEMATOL/2018/204917 Version 1

DNA methylation regulates discrimination of enhancers from promoters through a H3K4me1-H3K4me3 seesaw mechanism. BMC Genomics. BioMed Central; 2017 Dec 12;18(1):964.

34. King AD, Huang K, Rubbi L, Liu S, Wang C-Y, Wang Y, et al. Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Rep. 2016 Sep 27;17(1):289–302.

35. Blattler A, Yao L, Witt H, Guo Y, Nicolet CM, Berman BP, et al. Global loss of DNA methylation uncovers intronic enhancers in genes showing expression changes. Genome Biol. BioMed Central; 2014 Sep 20;15(9):469.

36. Hon GC, Rajagopal N, Shen Y, McCleary DF, Yue F, Dang MD, et al. Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat Genet. Nature Publishing Group; 2013 Oct;45(10):1198–206.

37. Bergmann AK, Castellano G, Alten J, Ammerpohl O, Kolarova J, Nordlund J, et al. DNA methylation profiling of pediatric B-cell lymphoblastic leukemia with KMT2A rearrangement identifies hypomethylation at enhancer sites. Pediatr Blood Cancer. 2017 Mar;64(3):e26251.

38. Hitti E, Bakheet T, Al-Souhibani N, Moghrabi W, Al-Yahya S, Al-Ghamdi M, et al. Systematic Analysis of AU-Rich Element Expression in Cancer Reveals Common Functional Clusters Regulated by Key RNA-Binding Proteins. Cancer Res. 2016 May 17.

39. Baou M, Norton JD, Murphy JJ. AU-rich RNA binding proteins in hematopoiesis and leukemogenesis. Blood. 2011 Nov 24;118(22):5732–40.

40. Galloway A, Saveliev A, Łukasiak S, Hodson DJ, Bolland D, Balmanno K, et al. RNA-binding proteins ZFP36L1 and ZFP36L2 promote cell quiescence. Science. 2016 Apr 22;352(6284):453–9.

41. Zekavati A, Nasir A, Alcaraz A, Aldrovandi M, Marsh P, Norton JD, et al. Post- transcriptional regulation of BCL2 mRNA by the RNA-binding protein ZFP36L1 in malignant B cells. PLoS ONE. 2014;9(7):e102625.

42. Liu J, Lu W, Liu S, Wang Y, Li S, Xu Y, et al. ZFP36L2, a novel AML1 target gene, induces AML cells apoptosis and inhibits cell proliferation. Leukemia Research. 2018 May;68:15–21.

43. Gruber AR, Fallmann J, Kratochvill F, Kovarik P, Hofacker IL. AREsite: a database for the comprehensive investigation of AU-rich elements. Nucleic Acids Res. 2011 Jan;39(Database issue):D66–9.

44. Chen M-T, Dong L, Zhang X-H, Yin X-L, Ning H-M, Shen C, et al. ZFP36L1 promotes monocyte/macrophage differentiation by repressing CDK6. Sci Rep. Nature Publishing Group; 2015 Nov 6;5(1):16229.

21

Haematologica HAEMATOL/2018/204917 Version 1

45. Vignudelli T, Selmi T, Martello A, Parenti S, Grande A, Gemelli C, et al. ZFP36L1 negatively regulates erythroid differentiation of CD34+ hematopoietic stem cells by interfering with the Stat5b pathway. Mol Biol Cell. American Society for Cell Biology; 2010 Oct 1;21(19):3340–51.

46. Hodson DJ, Janas ML, Galloway A, Bell SE, Andrews S, Li CM, et al. Deletion of the RNA-binding proteins ZFP36L1 and ZFP36L2 leads to perturbed thymic development and T lymphoblastic leukemia. Nat Immunol. 2010 Aug;11(8):717– 24.

22

Haematologica HAEMATOL/2018/204917 Version 1

LEGENDS TO THE FIGURES

Figure 1. MF harbours a differential DNA methylation profile compared to control samples, with changes located primarily on enhancer regions. A) Unsupervised principal component analysis (PCA) showing a differential DNA methylation profile of

MF patients and healthy controls with no differences between primary and secondary MF.

B) Distribution of DMCs according to CpG island mapping (left graph) or functional chromatin analysis (right graph) grouped by DNA methylation status of the probes

(legend). *p<0.05 C) Hierarchical clustering of DMC located to enhancer regions in MF patients and healthy controls. D) GO-PANTHER analysis of genes adjacent to enhancer-

DMCs. Analysis of hypermethylated and hypomethylated genes is shown on the left and right panel respectively.

Figure 2. Aberrant enhancer DNA methylation regulates gene expression in MF. A)

Violin density plots of expression of genes with differentially methylated CpGs located to enhancer regions. Vertical axis represents fold change in gene expression. The horizontal width of the plot represents density of data along the y axis. B) Candidate genes with substantial changes in DNA methylation (FDR < 0.01) and differential gene expression.

Red bars represent average of DNA methylation of all enhancer-mapped probes and black bars the average expression of all probes, error bars represent SD. C) ZFP36L1 downregulation validation by RT-qPCR in MF patients and 3 myeloid cell lines

(including SET-2) compared to healthy controls (HC) (n=3). D-E) Bisulfite sequencing of ZFP36L1 D) enhancer region and E) promoter region in healthy controls (HC), cell lines and primary MF samples. For each sample, graph shows mean ± SD of 10 CpG

23

Haematologica HAEMATOL/2018/204917 Version 1

dinucleotides for enhancer region and 15 CpG dinucleotides for promoter region. F) pCpG-L luciferase reporter assay showing the inhibition of luciferase activity after treatment of ZFP36L1 enhancer region with Sss-I methyltransferase. G) DNA methylation levels of the enhancer region (same 10 CpG dinucleotides as in D) after

5 ́azacytidine (AZA) treatment of SET-2. H) ZFP36L1 expression levels after

5 ́azacytidine (AZA) treatment of SET-2. Plots/bars indicate mean ± SD.

Figure 3. ZFP36L1 decreases cell viability in MF. A) Consensus binding motif for

ZFP36L1 obtained by DREME motif discovery among transcripts with putative AU-rich motifs upregulated in MF samples. B) Efficiency of infection measured by percentage of

EGFP positive cells after lentiviral infection. C) Q-PCR validation of ZFP36L1 restoration in SET-2 cell line after lentiviral infection. D) ZFP36L1 protein restoration measured by Western Blot in SET-2 cell line after lentiviral infection. E) ZFP36L1 rescue in SET-2 cell line decreased cell proliferation rate and F) increased AnnexinV positive cells.

24

Haematologica HAEMATOL/2018/204917 Version 1

FIGURE S1. ZFP36L1 is hypermethylated in MF A. Unsupervised dendrogram of

DMC between MF and control samples show a distinctive pattern of DNA methylation of

MF samples and controls (and no differences between primary and secondary MF). B)

Schematic representation of ZFP36L1 locus and the comparative DNA methylation of all

CpG dinucleotides included in the array for MF samples (upper panel) and controls samples (lower panel). Vertical bars represent normalised DNA methylation value as per the scale on the left. Chromatin state annotation is depicted on the color-coded horizontal bar on the top. Green boxes represent predicted CpG islands from UCSC genome browser. RefSeq transcript variants are shown in the bottom. C) Bisulfite sequencing of the enhancer and promoter region of ZFP36L1 in peripheral blood cells, different myeloid cell lines and MF patients. Black dots represent methylated and white dots are unmethylated CpG dinucleotides.

25

Haematologica HAEMATOL/2018/204917 Version 1

A B

CpG islands Chromatin 500 80 60 450k Hypermethylated * Hypomethylated 250 40

* 40 * *

0 % of CpGs 20 (9.70%)

PC2 Controls −250 PMF 0 0 Post PV/MF Post TE/MF −500 0 500 PC1 (14.31%)

C

Controls MF

0.8

0.6

methylation 0.4

0.2 %of DNA %of DNA

D Top Biological Processes identified by Top Biological Processes identified by GO in HyperMethylated DMCs GO in HypoMethylated DMCs

cell surface receptor signaling pathway anatomical structure morphogenesis regulation of phosphate metabolic process regulation of phosphate metabolic process cell-matrix adhesion cell recognition regulation of catalytic activity translation mRNA splicing, via spliceosome blood coagulation response to biotic stimulus cellular calcium ion homeostasis RNA splicing, via transesterification synaptic transmission induction of apoptosis rRNA metabolic process cytoskeleton organization visual perception B cell mediated immunity cell recognition complement activation cellular component morphogenesis mesoderm development MAPK cascade calcium-mediated signaling cytoskeleton organization MAPK cascade cell differentiation transcription from RNA polymerase II promoter B cell mediated immunity sensory perception of smell defense response to bacterium Cellular defense response complement activation 2 4 6 2 4 6 8 -Log10 (p-value) -Log10 (p-value) FIGURE 1 A B

1

2 0

0 (FDR<0.01) -1

FoldChange Expression Methylation (ΔB) Expression (logFC) Log Geneswith enhancer-related DMC

−2 -2 Hyper- Hypo- methylated methylated

C D E

2 p<0.005 100 100

1

0 p<0.001

50 50 EnhancerMethylation -1 PromoterMethylation

p<0.001

FoldChange Expression -2 ZFP36L1 ZFP36L1 Log -3 % of 0 % of 0 HC SET-2 HEL HL-60 MF HC SET-2 HEL HL-60 MF HC SET-2 HEL HL-60 MF samples samples samples

F G H

1.5 100 1.5 p<0.05 p<0.001 p<0.05 p<0.01 p<0.01 p<0.0001 1.0 1.0

50

0.5 EnhancerMethylation 0.5 FoldRLU Enhancervs FoldChange Expression ZFP36L1 0 0 0

pCpGL vector + + + + + + % of SET-2 2.5µM 5µM SET-2 2.5µM 5µM Minimal Promoter - + - - + + AZA treatment AZA treatment ZFP36L1 Enhancer - - + + + + M. SssI - - - + - +

FIGURE 2 AA B DREME motif discovery: GTATTTDT (E-value= 4.5*10-15 ) 2 EGFP

1 Control bits

20 40 60 80 100 % of Positive EGFP cells GT C 0 A 2 3 4 5 6 7 8 G1 TATTT T ZFP36L1 p-value ZFP36L1 ATTTA 265 1.03E-08 WWATTTAWW 139 0.04 C Control WTATTTATW 70 0.010 WWTATTTATWW 37 N.S 3 6 WWWTATTTATWWW 24 0.039 Fold Change Expression

D E D E F p<0.01 40 Control 100 ZFP36L1

p<0.01

30 ZFP36L2

ZFP36L1 50 20 p=0.002

β-ACTIN cell%of proliferantion 10 % of annexin%of positiveV cells

72 hours infection ZFP36L1 0 3 5 7 5 7 Time (days) Time (days)

FIGURE 3