University of Groningen

The aberrant transcriptional program of myeloid malignancies with poor prognosis Gerritsen, Myléne

DOI: 10.33612/diss.113503008

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below. Document Version Publisher's PDF, also known as Version of record

Publication date: 2020

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA): Gerritsen, M. (2020). The aberrant transcriptional program of myeloid malignancies with poor prognosis: the effects of RUNX1 and TP53 mutations in AML. Rijksuniversiteit Groningen. https://doi.org/10.33612/diss.113503008

Copyright Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

The publication may also be distributed here under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license. More information can be found on the University of Groningen website: https://www.rug.nl/library/open-access/self-archiving-pure/taverne- amendment.

Take-down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

Download date: 06-10-2021 5

137853-gerritsen-layout.indd 102 14/01/2020 14:13 Ring sideroblasts in AML are associated with adverse risk characteristics and have a distinct expression pattern

G. Berger, M. Gerritsen, G. Yi, T.N. Koorenhof-Scheele, L.I. Kroeze, M. Stevens-Kroef, K. Yoshida, Y. Shiraishi, E. van den Berg, H. Schepers, G. Huls, A.B. Mulder, S. Ogawa, J.H.A. Martens, J.H. Jansen, E. Vellenga

This research was originally published in Blood Advances. © the American Society of Hematology.

137853-gerritsen-layout.indd 103 14/01/2020 14:13 Chapter 5

ABSTRACT

Ring sideroblasts (RS) emerge as result of aberrant erythroid differentiation leading to excessive mitochondrial iron accumulation, a characteristic feature for myelodysplastic syndromes (MDS) with mutations in the spliceosome gene SF3B1. However, RS can also be observed in patients diagnosed with acute myeloid leukemia (AML). The objective of this study was to characterize RS in AML patients. Clinically, RS-AML is enriched for ELN adverse-risk (55%). In line with this finding, 35% of all cases had complex cytogenetic aberrancies and TP53 was most recurrently mutated in this cohort (37%), followed by DNMT3A (26%), RUNX1 (25%), TET2 (20%) and ASXL1 (19%). In contrast to RS-MDS, the incidence of SF3B1 mutations was low (8%). Whole exome sequencing and SNP array analysis on a subset of patients did not uncover one single genetic defect underlying the RS phenotype. Shared genetic defects between erythroblasts and total mononuclear cell fraction indicate common ancestry for the erythroid lineage and the myeloid blast cells in RS-AML patients. RNA sequencing analysis on CD34+ AML cells revealed differential gene expression between RS-AML and non RS-AML cases, including involved in megakaryocyte and erythroid differentiation. Furthermore, several heme metabolism-related genes were found to be up-regulated in RS- CD34+ AML cells, as was observed in SF3B1mut MDS. These results demonstrate that although the genetic background of RS-AML differs from that of RS-MDS, certain downstream effector pathways are in common.

104

137853-gerritsen-layout.indd 104 14/01/2020 14:13 Ring sideroblasts in AML

INTRODUCTION

Ring sideroblasts (RS) are erythroid precursor cells that accumulate excessive mitochondrial iron and can be observed in bone marrow smears associated with multiple medical conditions1. Presence of RS is a characteristic feature in myelodysplastic syndrome (MDS) subtypes, including MDS with single lineage dysplasia (MDS-RS-SLD), multilineage dysplasia (MDS-RS-MLD) and in combination with the presence of thrombocytosis (MDS/MPN-RS-T)2. Non-malignant causes of RS include several drugs, toxins, alcohol, copper deficiency and congenital sideroblastic anemia3. This latter group comprises conditions caused by inborn defects in genes that operate in several mitochondrial pathways, including ALAS2, ABCB7, SLC25A38 and HSPA94–7. In MDS, the RS phenotype is strongly correlated with mutations in splicing factor 3B subunit 1 (SF3B1), with an incidence higher than 80%8–12. SF3B1 mutations are usually observed in low-risk MDS, which is characterized by a stable clinical course and a low risk of leukemic transformation8,13. As a core component of the U2 small nuclear ribonucleoprotein particle (snRNP), SF3B1 is essential for pre- RNA splicing14. The molecular mechanism by which SF3B1 mutations result in RS formation is 5 not yet fully understood. A proposed mechanism is that specific patterns of missplicing result in altered expression of genes that are essential for correct programming of erythropoiesis15–17. The relationship between genetic defects in SF3B1 and the RS phenotype is not one-to-one; in 10- 20% of the MDS-RS patients no mutation in the SF3B1 gene is detected8–12. Moreover, RS can also be present in a subset of AML patients, while SF3B1 mutations are infrequent in this disease10,18,19. Besides SF3B1, the only other correlation between a gene defect and the RS phenotype was described for PRPF8, for which mutations are reported in ~3% of myeloid neoplasms, including MDS, MDS/MPN and (s)AML20. In the present study we determined the prevalence of RS in various ontogenic AML subtypes and the association of the RS phenotype in AML with adverse risk characteristics. To identify the landscape of genomic defects that underlies the RS phenotype in AML, we performed whole exome sequencing, targeted sequencing and SNP-array analysis. Finally, to identify differential expression of genes associated with the RS phenotype in AML, we performed RNA sequencing on CD34+-selected AML cells.

MATERIALS AND METHODS

Patients and data collection For this study, we collected data from 126 AML and high-risk MDS (≥10% bone marrow blasts) patients who were diagnosed between January 2000 and April 2018 at the University Medical Center Groningen. The inclusion criterion was the presence of ring sideroblasts in the diagnostic bone-marrow smear. Patients with previously reported MDS-RS were excluded. Diagnosis and risk classification was revised based upon World Health Organization classification (2016)2 and European Leukemia Net (ELN) recommendations (2017)21. Bone marrow (BM) and/or peripheral blood (PB) from patients were biobanked after informed consent for investigational use. The

105

137853-gerritsen-layout.indd 105 14/01/2020 14:13 Chapter 5

study was conducted in accordance with the Declaration of Helsinki and institutional guidelines and regulations. Morphological and cytogenetic analyses were based on standard procedures. On iron-stained aspirate smears, 400 red blood cell precursors were counted. Ring sideroblasts were defined by ≥ 5 iron granules encircling one third or more of the nucleus. The ring sideroblasts percentage was determined as percentage of the total red blood cell precursor cells.

Sorting of cell fractions The mononuclear cell (MNC) fraction from BM and/or PB was obtained by density gradient centrifugation using lymphoprep (PAA, Cölbe, Germany) according to standard procedures. Analysis and sorting of various cell fractions was performed on MoFlo XDP or Astrios (Beckman Coulter). A list of antibodies can be found in the Supplementary Methods.

DNA isolation and amplification Genomic DNA from various cell fractions was extracted with the NucleoSpin Tissue kit (Macherey- Nagel, Düren, Germany) according to the manufacturer’s instructions. In case of insufficient yield, a maximum of 70 ng DNA was amplified using the Qiagen REPLI-g kit (Qiagen, Venlo, the Netherlands), according to the manufacturer’s protocol.

Targeted deep sequencing using a myeloid gene panel Targeted sequencing of DNA derived from BM or PB samples obtained at diagnosis was carried out using the myeloid TruSight sequencing panel (Illumina, San Diego, CA, USA) or by in- house sequencing panel containing 27 genes (Supplementary Table 1). Library preparation was performed according to the manufacturer’s protocol (Illumina). Aligning and filtering of sequencing data was performed using NextGENe version 2.3.4.2 (SoftGenetics, Pennsylvania, US). Cartagenia Bench Lab NGS (Agilent, Santa Clara, CA, USA) was used for analysis of the resulting vcf files. Sequencing artifacts were excluded using a threshold of 5%. A minimal variant read depth of 20 reads was set as criterion. Variants that frequently occur in the general healthy population (>2% 1000 Genome Phase 1, ESP6500 and dbSNP, and >5% Genome of the Netherlands) were excluded from further analysis.

Microarray-based genomic profiling Microarray-based genomic profiling on MNC and erythroblast fractions was performed on a CytoScan HD array platform (Affymetrix, Inc., Santa Clara, CA, USA) in agreement with the manufacturer’s reference. Data analysis was performed using Analysis Suite software package (Affymetrix) using annotations of reference genome build GRCh37 (hg19). Comprehensive analysis and interpretation of the obtained microarray genomic profiling data was performed using a previously described filtering pipeline and criteria22. Aberrations meeting these criteria were included for genomic profiling and described in accordance with the

106

137853-gerritsen-layout.indd 106 14/01/2020 14:13 Ring sideroblasts in AML

standardized ISCN 2016 nomenclature system23. Visualization of the resultant genomic profiles was performed using NEXUS software (Nexus Copy Number 8.0, BioDiscovery, El Segundo, CA, USA).

Whole exome sequencing and confirmation of mutations in erythroblasts Whole exome sequencing (WES) to an average depth of 143x was performed on DNA isolated from diagnostic BM-MNC (n=13) or PB-MNC (n=3) samples. The procedure is described in more detail in the supplemental methods. For a subset of patients, the presence of somatic variants in TP53 and SRSF2 identified by (WES) was validated and quantified in the erythroblast fraction as described in the Supplemental Methods.

RNA extraction and Illumina high-throughput sequencing RNA was isolated by separation of the aqueous phase by TRIzol Reagent (Thermo Fisher) according to the manufactures protocol. The aqueous phase was subsequently mixed 1:1 with 70% ethanol and isolation was continued using the RNeasy mini kit (Qiagen) including performing on-column 5 DNaseI treatment. Library preparation, Illumina high-throughput sequencing and RNA-seq data analysis are described in detail in the supplementary materials. Raw RNA-seq data is available via GSE127861.

Statistical analysis Bivariate correlations were made using a Pearson correlation (continuous variables) or Spearman correlation (categorical variables). A p-value <0.05 was used to define statistical significance. Statistical calculations were performed using Prism version 6.0.

RESULTS

Clinical characteristics To study the ring sideroblasts (RS) phenotype in more detail in a comprehensive group of myeloid neoplasms (MNs), we collected clinical data on a cohort of patients (n= 126) consisting of AML and high-risk MDS patients (≥10% BM blasts) hereafter also indicated as ‘AML patients’. These include patients with RS (≥1%) in the diagnostic bone marrow smear, excluding those with a documented prior clinical history of MDS-RS. The median blast percentage in this cohort was 32% (range 10-91%), two-third of the patients were male and the median age at diagnosis was 67 years (range 32-87). The majority of patients were diagnosed with de novo AML (55.6%) and AML with myelodysplasia-related changes was the most common WHO (2016) subtype (37.3%, Table 1). Although highest RS percentages were observed in cases with lower blast counts, blast count did not significantly correlate with RS percentage (Pearson r = 0.16, p = 0.07; Figure 1A). Patients were placed into three subgroups based on their RS percentage; 1-4% RS, 5-14% RS and ≥15% RS (Table 1). These groups were comparable regarding erythroblast percentage and age.

107

137853-gerritsen-layout.indd 107 14/01/2020 14:13 Chapter 5

Table 1 – Clinical characteristics of AML with RS phenotype

All RS cases 1-4% RS 5-14% RS ≥15% RS

Total 126 53 42% 28 22% 45 36% Age (years) Median 67 67 63 70 Range 32-87 40-81 32-78 43-87 Sex Male 84 67% 32 60% 20 71% 32 71% Female 42 33% 21 40% 8 29% 13 29% Type of disease de novo AML 70 56% 35 66% 12 43% 23 51% sAML 20 16% 6 11% 8 29% 6 13% t-MN 17 14% 7 13% 3 11% 7 16% Other 19 15% 5 9% 5 18% 9 20% WHO diagnosis AML with MDS- 47 37% 14 26% 16 57% 17 38% related changes AML NOS 24 19% 13 25% 3 11% 8 18% t-MN 17 14% 7 13% 3 11% 7 16% MDS-EB2 17 14% 5 9% 4 14% 8 18% AML with recurrent 16 13 % 13 25 % 1 4% 2 4% abnormalities Other 5 4% 1 2% 1 4% 3 7% ELN risk score Favorable 12 10% 11 21% 1 4% 0 0% Intermediate 8 6% 4 8% 2 7% 2 1% Intermediate* 29 23% 17 32% 5 18% 9 20 % Adverse 74 59% 21 40% 19 67% 34 76% Unknown 3 2% 0 0% 1 4% 2 4% BM blasts, % Median 32% 34% 24% 32% Range 10%-91% 10-88% 11-88% 10-91% Erythroblasts, % Median 21% 23% 21% 17% Range 1-64% 4-64% 2-50% 1-59% Data denoted as number (percentage), unless otherwise stated. * No evaluation for ASXL1, RUNX1 and TP53. Abbreviations: AML – acute myeloid leukemia, t-MN – therapy-related myeloid neoplasm, NOS – not otherwise specified, sAML- secondary AML, MDS-EB2 – myelodysplastic syndrome with excess blasts type 2, ELN – European Leukemia Net, WHO – World Health Organization, BM – bone marrow, RS - ringsideroblasts

108

137853-gerritsen-layout.indd 108 14/01/2020 14:13 Ring sideroblasts in AML

The group AML patients with an RS phenotype was enriched with patients in the ELN adverse risk category (55%) (Figure 1B). The proportion of adverse risk patients increased with increasing RS percentage (Table 1, Figure 1B). The subgroup with ≥15% RS, which represents the minimum required percentage for WHO MDS-RS diagnosis in absence of SF3B1 mutations2, had no patients in the ELN favorable risk category (Figure 1B).

5

Figure 1. Clinical data. A. Correlation between blast percentage and RS percentage, both determined in diagnostic bone marrow smear. Boxplot represents median and range of blast percentage in the RS cohort. B. Pie charts representing risk classification (ELN 2017) for total RS cohort (‘all cases’) and three subgroups based on RS percentage (Table 1) * Mutational status of ASXL1, RUNX1 and TP53 unknown.

Mutational and chromosomal defects observed in association with RS phenotype Complex cytogenetic aberrancies (defined as 3 or more chromosomal abnormalities) detected by conventional karyotyping were observed in 35% of all RS cases (n=126, Figure 2A). Increasing incidences of chromosomal abnormalities were associated with increasing RS percentages in the three subgroups (21%, 39% and 49%, respectively; data not shown). Screening for mutations in CEBPA, FLT3 and NPM1 was conducted by conventional RT-PCR. In addition, a subset of 60 patients (de novo AML n=36, sAML n=11, t-MN n=6, MDS-EB2 (MDS with excess blasts (<10%)) n=7) was analyzed for mutations in a panel of genes that are recurrently mutated in MNs using NGS methods (either panel-based (Supplementary Table 1) or WES). In this subset, 27 patients had ≥15% RS, 15 patients had 5-14% RS and 18 patients had 1-4% RS . In addition, in accordance with cytogenetic findings, mutations in the TP53 gene (37%), which frequently coincide with genetic instability, were the most recurrent in this subset (Figure 2B). Other frequently mutated genes included RUNX1 (25%), NPM1 (16%) and the epigenetic modifiers DNMT3A (26%), ASXL1 (19%) and TET2 (20%). SF3B1 mutations were detected in 6 cases; four with de novo AML, one with t-MN and one with sAML. In two patients, no mutations in genes frequently affected in myeloid neoplasms were identified. The incidence of certain mutations tended to segregate with

109

137853-gerritsen-layout.indd 109 14/01/2020 14:13 Chapter 5

RS percentages: NPM1 mutations were mainly observed in cases with low RS percentages (30% in 1-4% RS group vs. 3% in >15% RS group (Figure 2B)). In contrast, TP53 and GATA2 mutations were detected especially in patients with higher RS percentages (73% in >15% RS group vs. 12% in 1-4% RS group (Figure 2B)).

Figure 2 - Chromosomal and molecular defects observed in association with RS phenotype. A. Frequency of commonly observed cytogenetic defects in myeloid neoplasms (MNs) in the RS cohort, subdivided by RS percentage at diagnosis, n=126. B. Frequency of mutations in genes commonly mutated in MNs as detected by NGS, subdivided by RS percentage. ASXL1, CALR, CBL, DNMT3A, EZH2, IDH1, IDH2, JAK2, KIT, NRAS, RUNX1, SF3B1, SRSF2, TET2, TP53 and WT1 n=60, BCOR, BCORL, ETV6, GATA2, GNAS, IKZF1, NOTCH1, PTPN11, SETBP1, STAG2, U2AF1 and ZRSR2 n=45. For CEBPA (n=83), FLT3 (n=103) and NPM1 (n=93) results of PCR , followed by Sanger sequencing(CEBPA) or Capillary electrophoresis (FLT3 and NPM1) are shown.

Genome-wide screening for genetic defects To identify genetic defects underlying the RS phenotype that are not covered by panel-based sequencing, 15 patients of the RS cohort were selected for genome-wide analysis based on high RS percentage (≥13%) and material availability. This group was supplemented with one patient (RS022) that had 15% RS at AML relapse following autologous stem cell transplantation, while no RS were observed at initial AML diagnosis. All 16 patients belonged to the adverse risk group according to ELN criteria21. Using WES, samples were first analyzed for the occurrence of germline mutations in genes associated with congenital sideroblastic anemia (Supplementary Table 2). However, no such mutations were detected. Based on WES a median of 19 (range 9-34) somatically acquired mutations were found per patient, of which a median 2.5 (range 0-10) were observed in genes recurrently mutated in myeloid malignancies19 (Figure 3A, 3B and Supplementary Table 3). The total number of mutations did not correlate with age (Pearson r = 0.07, p = 0.78, Figure 3C). The vast majority of observed mutations concerned nonsynonymous point mutations (Figure 3D). TP53 was most frequently affected in this cohort (in 12/16 patients), followed by DNMT3A (4/16) and SRSF2 (3/16) (Figure 3A). In this cohort, one mutation was detected in SF3B1 and no mutations were observed in PRPF8. For patient RS022, who had 15% RS and several cytogenetic abnormalities in the relapse sample while both were absent at initial presentation, no mutations were detected in known leukemia-associated genes (Figure 3A). However, the

110

137853-gerritsen-layout.indd 110 14/01/2020 14:13 Ring sideroblasts in AML

observed mutation in BUB1, a key player in the mitotic spindle checkpoint, most likely explains the chromosomal aberrancies observed in this relapse sample (Supplementary Table 3). In addition to WES, SNP array analysis was used to identify chromosomal defects (Supplementary Figure 1). Scattering of one or more (chromothripsis) was detected in 9/16 cases. Deletions of chromosomes 5q, (parts of) chromosome 7, chromosome 12p, chromosome 17p and (partial) gain of chromosome 8 were frequently observed chromosomal aberrancies (Figure 3A and Figure 3E). In 6/16 cases the 17p deletion involved the TP53 locus, and for 4/16 cases this deletion involved the locus of PRPF8 (marked by asterisks in Figure 3A). Together, these data indicate that no single-gene defect underlies the RS phenotype in this cohort: RS in AML is associated with a variety of adverse-risk genetic defects.

Erythroblasts share genetic defects with leukemic clones To determine whether RS are part of the malignant leukemic clone, TP53 and SRSF2 mutations detected by WES in the total MNC fraction (containing both myeloid blast cells and differentiated cells) were verified in sorted erythroblast populations (see Supplementary methods and 5 Supplementary Figure 2) of 8 patients. VAFs strongly correlated between both cell fractions (Pearson r = 0.94, p = 0.0002) (Figure 4A). No correlation was observed between VAFs of erythroblast mutations and the percentage of RS (Pearson r = -0.43, p = 0.20) (Figure 4B). SNP array analysis on both fractions revealed that most of the cytogenetic defects were shared by the MNC fraction and erythroblast populations. However, differences were also detected in 4/6 patients (Figure 4C). Observed differences consisted of either novel aberrancies, exemplified by the gain of in the EBs of patient RS006, or discrepancies in observed frequency of cytogenetic defects, as observed for the 7q deletion in patient RS009 (0.6 in EBs vs. 0.25 in total MNCs) (Figure 4C). The frequency of cytogenetically aberrant clones within the erythroblast fractions did not correlate to RS percentages in these four patients (data not shown). No hotspot mutations in exons 13-16 of SF3B1 were detected in sorted EB fractions.

As erythroblasts of RS-MN patients were observed as part of the malignant clone, we hypothesized that RS could be eliminated upon treatment. We therefore analyzed follow-up BM smears, including RS percentage, which were available for 17 patients of the RS cohort. Eight patients received intensive chemotherapy24 (median interval between both evaluations 2 months (range 1-5)), six were treated using hypomethylating agents25 (median interval 4.3 months (2.5-12) and three patients received a combination of both (median interval 16 months (5-20) (Supplementary Table 4)). Of this group, 65% responded to therapy (defined as complete or partial response) and 35% did not respond (defined as no response or relapse). Patients who responded showed a marked decrease in RS percentage at follow-up evaluation, whereas the RS percentage in non- responders was stable or increased (Figure 4D).

111

137853-gerritsen-layout.indd 111 14/01/2020 14:13 Chapter 5

112

137853-gerritsen-layout.indd 112 14/01/2020 14:13 Ring sideroblasts in AML

Figure 3. Genetic defects detected by whole exome sequencing and SNP-array analysis. A. Overview; for each patient, all mutations in genes known to be recurrently mutated in myeloid malignancies detected by WES are depicted as well as cytogenetic abnormalities detected by SNP array analysis. 2 = two mutations were detected. B. Number of acquired mutations per patient as determined by WES. In black, the number of mutations in genes that have been previously implicated in pathogenesis of myeloid malignancies; in grey the number of mutations in genes that have not been previously implicated in myeloid malignancies. C. Correlation between age and the number of mutations. D. Distribution of the various types of alterations detected in the total set of patients with RS phenotype. E. SNP array results overview, figure created using Nexus software. Red=loss, blue=gain.

Up-regulated genes in RS-AML are associated with megakaryocyte/erythroid differentiation and mRNA splicing To investigate transcriptional differences that underlie the RS-phenotype in AML, RNA sequencing was performed on CD34+-selected AML cells of six patients with ≥13% RS (Supplementary Table 5). The results were compared to normal bone marrow (NBM) CD34+ and SF3B1-mutated MDS CD34+ samples (GSE6356926; hereafter indicated as SF3B1mut MDS), samples from 36 5 AML patients with a mixed background regarding cytogenetic and genetic defects (with no documented presence of RS) that were part of the Blueprint study27 (hereafter indicated as non- RS-AML) and three AML samples containing SF3B1 mutations (TCGA, Blueprint dataset and 1 of our own samples (RS020); hereafter indicated as SF3B1mut AML). First PCA and t-SNE analysis was performed which consistently separated non-RS-AML from NBM and MDS samples (Figure 5A). RS-AML samples were located in between these populations, as well as the SF3B1mut AMLs. These findings suggest that RS-AML comprises a distinct entity that partially resembles expression patterns in SF3B1mut AML, probably reflecting involvement of overlapping pathways. By direct comparison of RS-AML to NBM CD34+, 1196 genes were identified to be up-regulated in AML with RS-phenotype. Functionally, this gene set was highly enriched for genes involved in epigenetic regulation as well as modifications including ubiquitination (Supplementary Figure 3A). Processes associated with down-regulated genes (n=1309) included cell cycle-related pathways and DNA replication (Supplementary Figure 3B).

To investigate gene expression patterns that are correlated to RS-AML but not to AML in general, we compared gene expression of RS-AML to non-RS-AML patients. Down-regulated genes in RS- AML vs. non-RS-AML are involved in immune-related processes (Supplementary Figure 3C). Gene ontology analysis on the up-regulated genes in RS- AML revealed enrichment for genes involved in megakaryocyte differentiation and platelet function (Figure 5B and Supplementary Table 6), including GATA1, GATA2, KLF1, AHSP and TRIM58, which are genes that are also involved in the regulation of erythroid differentiation. Furthermore, this analysis determined that EPOR, the gene encoding for the erythropoietin receptor, was also up-regulated in CD34+ cells of patients with RS-AML.

113

137853-gerritsen-layout.indd 113 14/01/2020 14:13 Chapter 5

Figure 4 - Genetic defects in erythroblast of patients with RS-phenotype A. VAFs of TP53 and SRSF2 mutations detected in mononuclear cell (MNC) fractions (determined by WES) and erythroblast fraction (determined by amplicon based sequencing). B. Correlation between VAFs in erythroblast fractions (VAF indicated on left y-axis) and observed RS percentages (percentage indicated on right y-axis). Different colors represent individual mutations (as indicated in legend of Figure 4A). C. Differences observed in SNP array results between MNC- and EB fractions, figure shows screenshots taken from Chromosome Analysis Suite software package (Affymetrix). D. Relative changes in RS percentage of patients who received treatment, the RS percentage at follow-up examination (post-treatment) are displayed relative to the RS percentages determined at diagnosis (Diagnosis). Blue lines indicate patients who responded to therapy and red lines indicate patients who did not respond (see Supplementary Table 4).

114

137853-gerritsen-layout.indd 114 14/01/2020 14:13 Ring sideroblasts in AML

To determine whether the increased expression of these genes is due to an increased amount of megakaryocyte–erythroid progenitors (MEPs) in the RS-AML we extracted a MEP signature based on previously published gene expression patterns28,29. As our cohort of non-RS-AMLs comprises both CD33+ and CD34+ selected samples, we first compared the MEP signature between both cell populations to exclude that marker-biased analysis determines this difference. However this comparison did not reveal differences between both cell-fractions with regard to the extent of the MEP signature. (Supplementary Figure 3D). Compared to the non-RS-AML cohort, RS- AML displayed a slightly increased expression of this MEP signature (Figure 5C). Altogether these findings demonstrate that CD34+ RS-AML cells are partially distinct reflected by elevated expression of genes associated to megakaryocyte and erythroid differentiation including the MEP signature.

5

Figure 5 - RS-AML transcription program. A. PCA and t-SNE plots using RNA-seq results of the indicated cell types. Previously published RNA-seq was included for the following cell types: normal bone marrow (NBM) (GSE63569), non-RS-AML (Blueprint study), SF3B1mut MDS (GSE63569) and SF3B1mut AML (combination of TCGA dataset, Blueprint study and own data). B. Biological process enrichment for genes upregulated (1464) in RS-AML as compared to non-RS-AML. C. MEP signature comparison between RS-AML, non-RS-AML and NBM.

Similarities and differences with SF3B1 mutated myeloid neoplasms Next we studied in more detail the similarities and differences between the different MNs with RS. Therefore we determined the most differentially expressed genes followed by clustering in RS-AML CD34+ cells (n=6), NBM CD34+ cells (n=5), CD34+ cells of SF3B1mut AML (n=3) and SF3B1mut MDS (n=8) (Figure 6A). Six clusters were identified whereby clusters 1 and 6, which distinguish the

115

137853-gerritsen-layout.indd 115 14/01/2020 14:13 Chapter 5

RS-AML from MDS and NBM, are enriched for genes associated with cell cycle-related processes and protein modifications, respectively, highlighting the difference in proliferation and expansion between RS-AMLs and SF3B1mut MDS (Supplementary Figure 4A-E). Cluster 4, which is enriched for genes involved in cellular communication and trafficking, is more specific for NBM CD34+ cells (Supplementary Figure 4D). Cluster 5 represents genes that are specifically up-regulated in CD34+ cells of SF3B1mut MDS and show heterogeneous expression in RS-AML. Functionally, these genes are involved in erythroid development including iron metabolism (Figure 6B).

Figure 6 - Comparison between RS-AML and SF3B1mut AML and MDS. A. Heatmap of gene expression by supervised k-means clustering in RS-AML, SF3B1mut AML and SF3B1mut MDS versus control normal bone marrow (NBM) cells. The expression of the 6 clusters identified in for individual groups is shown on the right. B. Biological process enrichment for cluster 5 as identified in A, for other clusters see Supplementary Figure 3. C. Gene overlap of common upregulated genes between RS-AML, SF3B1mut AML and SF3B1mut MDS versus control. Biological process enrichment is shown for commonly upregulated genes.

116

137853-gerritsen-layout.indd 116 14/01/2020 14:13 Ring sideroblasts in AML

To identify gene expression differences that are shared by RS AML,SF3B1 mut AML and SF3B1mut MDS, we compared gene expression of all gene sets to NBM and determined the overlapping genes. A total of 47 genes were consistently up-regulated and 125 genes were down-regulated in all three groups (Figure 6C, Supplementary Figure 5A, Supplementary Table 7). Examples of up-regulated genes include ALAS2, HBB and NFE2, all players in heme-metabolism. Functional annotation revealed erythroid-related processes as highest for this gene set (Figure 6C). This result is in line with the previously described increased expression of the erythroid gene program and increased MEP signature expression in these CD34+ cells. Commonly down-regulated genes strongly enrich for immune pathways (Supplementary Figure 5A). Gene expression of PRPF8, SF3B1 and ABCB7, which were previously described as being involved in RS of SF3B1mut MDS due to deregulated splicing, were not expressed at a lower level in RS-AML compared to NBM (Supplementary Figure 5B). Finally, we analyzed whether genes that have been described as being differentially spliced in the previous datasets of Dolatshad et al.26 were differentially spliced (Supplementary Figure 5C) or differentially expressed in our dataset (Supplementary Figure 5D). We found only limited overlap in genes that are differentially spliced or expressed in 5 SF3B1mut MDS and our dataset of RS-AML. These findings demonstrate that RS-AML shares a gene expression signature with SF3B1mut MDS, but that these AMLs also have unique characteristics that distinguish them from MDS.

DISCUSSION

In this study, we characterized the presence of RS in a cohort of patients diagnosed with acute myeloid leukemia or high-risk MDS. We observed that RS-AML is enriched for ELN adverse risk disease, including the associated genetic and chromosomal defects. Clinically, higher RS percentage at diagnosis is accompanied with higher incidence of poor risk disease characteristics. Unlike MDS, no single-gene defect was found that underlies the RS phenotype in AML. Gene expression analysis indicated up-regulation of genes enriched for megakaryocyte and erythroid differentiation in RS-AML, as compared to a general AML cohort.

Although RS are generally regarded as a specific feature of certain MDS subtypes, the reported incidence of 25% indicates that RS is also a rather common finding in AML18. In contrast to reports on MDS-RS subtypes, mutations in spliceosome gene SF3B1 were rarely observed in our RS-AML cohort. The paucity of this mutation in RS-AML might be explained by the low tendency towards AML transformation from MDS subtypes with SF3B1 mutations8,13. Also, we did not identify genetic mutations in PRPF8, another gene that has been associated with an RS phenotype20. In our cohort, 25% (4/16) of the patients had a deletion of the chromosomal locus of PRPF8, but we did not observe significantly lower gene expression of PRPF8.

We did not identify a single gene defect that underlies the RS phenotype in AML however, panel- based and whole-exome sequencing revealed a high incidence of adverse-risk mutations in the

117

137853-gerritsen-layout.indd 117 14/01/2020 14:13 Chapter 5

RS-AML cohort, including DNMT3A, RUNX1 and ASXL1. Mutations in TP53 were observed most recurrently, particularly in patients with >15% RS at diagnosis. These results are in agreement with a previous study using a restricted gene panel to analyze the RS phenotype in AML patients18. The high incidence of poor-risk cytogenetic aberrancies, including chromothripsis, probably reflects the high frequency of TP53 mutations30.

In the present study we have primarily included patients with de novo AML reflecting the lack of a clinical relevant history of antecedent MDS. However, a previous study has suggested that in particular patients with t-AML and s-AML with TP53 mutations have transited through an unrecognized MDS prodrome based on genome ontogeny alterations in time31. Whether similar findings apply to patients with RS-AML whereby RS development is an early MDS event and AML development a late event will require additional studies.

Interestingly, RS have also been reported in relation to MNs in Li-Fraumeni patients with a congenital TP53 mutation32. Defects in TP53 also occur frequently in acute erythroid leukemia (AEL). Besides having a high frequency of TP53 mutations, AML with RS phenotype resembles AEL regarding higher age at diagnosis (median age 68 years), a male predominance and presence of complex chromosomal aberrancies33. However, unlike AEL, erythroid differentiation in RS-AML does not stop at the pro-erythroblastic stage. Instead, accumulation of immature myeloid blast cells is commonly observed in RS-AML, while this is rare in AEL33.

The presence of RS in our AML cohort is not restricted to a specific ontogenic subtype34, indicating that RS in AML arise from a common mechanism that goes beyond disease ontogeny. Also, it has been reported that the presence of RS on its own is not predictive for overall survival in AML patients18, suggesting that RS-AML is not a distinct disease entity. However, we observed several differences in megakaryocytic and erythroid differentiation gene expression that are specific for RS-AML compared to non-RS-AML suggesting a role for the megakaryocyte-erythrocyte progenitor (MEP) cell35. The transcription factors GATA1 and GATA2 are involved in erythroid lineage restriction, partly via stimulation of erythropoietin receptor expression36, which we also observed as being up-regulated in RS-AML, while both become down-regulated during myeloid differentiation37. These results suggest that the erythroid differentiation program is largely functional, despite the presence of a block in the myeloid differentiation program. These findings were present despite the fact that the majority of cytogenetic and molecular defects were shared by MNC and erythroblast populations, although our SNP array results suggests that cytogenetic evolution can take place.

In MDS, SF3B1 mutations presumably result in RS formation by interfering with mRNA splicing, resulting in differential gene expression8,26. Although RS-AML is genetically distinct from RS- MDS, downstream mechanisms that result in aberrant erythroid differentiation may be similar. However, when comparing differentially spliced genes observed in SF3B1mut MDS samples26 and our RS-AML samples, we observed only limited overlap.

118

137853-gerritsen-layout.indd 118 14/01/2020 14:13 Ring sideroblasts in AML

In conclusion, we have shown that RS are a frequent finding in AML, particularly in relation to adverse risk genetic defects. We revealed that erythroblasts share the mutations that are found in the malignant myeloid blast cells in RS-AML, and thus are part of the malignant population. Although the genetic background of RS-AML differs from that of RS-MDS, downstream effector pathways may be comparable, providing a possible explanation for presence of RS in AML.

Acknowledgments We would like to thank K. Chiba and S. Miyano from the Laboratory of DNA information Analysis at The University of Tokyo for their technical support. S. Ogawa was supported by the following grants: Grant-in-Aid for Scientific Research on Innovative Areas from the Ministry of Health, Labor and Welfare of Japan (15H05909). Grant for Project for Development of Innovative Research on Cancer Therapeutics from the Japan Agency for Medical Research and Development, AMED (JP18ck0106250h0002). Grant for project for cancer research and therapeutics evolution (P-CREATE) from AMED (JP18cm0106501h0003). 5

119

137853-gerritsen-layout.indd 119 14/01/2020 14:13 Chapter 5

REFERENCES

1. Ohba R, Furuyama K, Yoshida K, et al. Clinical and genetic characteristics of congenital sideroblastic anemia: comparison with myelodysplastic syndrome with ring sideroblast (MDS-RS). Ann. Hematol. 2013;92(1):1–9. 2. Arber DA, Orazi A, Hasserjian R, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127(20):2391–2406. 3. Sheftel A, Richardson D, Prchal J, Ponka P. Mitochondrial iron metabolism and sideroblastic anemia. Acta Haematol. 2009;122(2–3):120–33. 4. Cotter P, Baumann M, Bishop D. Enzymatic defect in “X-linked” sideroblastic anemia: molecular evidence for erythroid delta-aminolevulinate synthase deficiency. Proc. Natl. Acad. Sci. United States Am. 1992;89(9):4028–4032. 5. Allikmets R, Raskind W, Hutchinson A, et al. Mutation of a putative mitochondrial iron transporter gene (ABC7) in X-linked sideroblastic anemia and ataxia (XLSA/A). Hum. Mol. Genet. 1999;8(5):743–9. 6. Guernsey D, Jiang H, Campagna D, et al. Mutations in mitochondrial carrier family gene SLC25A38 cause nonsyndromic autosomal recessive congenital sideroblastic anemia. Nat. Genet. 2009;41(6):651–3. 7. Schmitz-Abe K, Ciesielski S, Schmidt P, et al. Congenital sideroblastic anemia due to mutations in the mitochondrial HSP70 homologue HSPA9. Blood. 2015;126(25):2734–8. 8. Papaemmanuil E, Cazzola M, Boultwood J, et al. Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N. Engl. J. Med. 2011;365(15):1384–95. 9. Yoshida K, Sanada M, Shiraishi Y, et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature. 2011;478(7367):64–9. 10. Malcovati L, Papaemmanuil E, Bowen D, et al. Clinical significance of SF3B1 mutations in myelodysplastic syndromes and myelodysplastic/ myeloproliferative neoplasms. Blood. 2011;118(24):6239–46. 11. Patnaik M, Lasho T, Hodnefield J, et al. SF3B1 mutations are prevalent in myelodysplastic syndromes with ring sideroblasts but do not hold independent prognostic value. Blood. 2012;119(2):569–72. 12. Damm F, Kosmider O, Gelsi-boyer V, et al. Mutations affecting mRNA splicing define distinct clinical phenotypes and correlate with patient outcome in myelodysplastic syndromes. Blood. 2019;119(14):3211–3218. 13. Greenberg P, Tuechler H, Schanz J, et al. Revised international prognostic scoring system for myelodysplastic syndromes. Blood. 2012;120(12):2454–65. 14. Chen M, Manley J. Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. Nat. Rev. Mol. Cell Biol. 2009;10(11):741–54. 15. Alsafadi S, Houy A, Battistella A, et al. Cancer-associated SF3B1 mutations affect alternative splicing by promoting alternative branchpoint usage. Nat. Commun. 2016;7:10615. 16. Dolatshad H, Pellagatti A, Fernandez-Mercado M, et al. Disruption of SF3B1 results in deregulated expression and splicing of key genes and pathways in myelodysplastic syndrome hematopoietic stem and progenitor cells. Leukemia. 2015;29(5):1092–1103. 17. Shiozawa Y, Malcovati L, Gallì A, et al. Aberrant splicing and defective mRNA production induced by somatic spliceosome mutations in myelodysplasia. Nat. Commun. 2018;9(1):. 18. Martin-Cabrera P, Jeromin S, Perglerova K, et al. Acute myeloid leukemias with ring sideroblasts show a unique molecular signature straddling secondary acute myeloid leukemia and de novo acute myeloid leukemia. Haematologica. 2017;102(4):e125-128. 19. Papaemmanuil E, Gerstung M, Bullinger L, et al. Genomic Classification and Prognosis in Acute Myeloid Leukemia. N. Engl. J. Med. 2016;374(23):2209–2221. 20. Kurtovic-Kozaric A, Przychodzen B, Singh J, et al. PRPF8 defects cause missplicing in myeloid malignancies. Leukemia. 2015;29(1):126–36. 21. Dohner H, Estey E, Grimwade D, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood. 2017;129(4):424–448. 22. da Silva-Coelho P, Kroeze L, Yoshida K, et al. Clonal evolution in myelodysplastic syndromes. Nat. Commun. 2017;8:15099. 23. McGowan-Jordan J, Simons A, Schmid M. An international system for human cytogenomic nomenclature. Cytogenet. genome Res. 2016;148:1–140. 24. Lowenberg B, Pabst T, Maertens J, et al. Therapeutic value of clofarabine in younger and middle-aged (18-65 years) adults with newly diagnosed AML. Blood. 2017;129(12):1636–1645. 25. van der Helm L, Scheepers E, Veeger N, et al. Azacitidine might be beneficial in a subgroup of older AML patients compared to intensive chemotherapy: a single centre retrospective study of 227 consecutive patients. J. Hematol. Oncol. 2013;6(29):. 26. Dolatshad H, Pellagatti A, Liberante F, et al. Cryptic splicing events in the iron transporter ABCB7 and other key target genes in SF3B1- mutant myelodysplastic syndromes. Leukemia. 2016;30(12):2322–2331. 27. Yi G, Wierenga ATJ, Petraglia F, et al. Chromatin-based classification of genetically heterogeneous AMLs into two distinct subtypes with diverse stemness phenotypes. Cell Rep. 2019;26(4):1059–1069. 28. Chen L, Kostadima M, Martens J, et al. Transcriptional diversity during lineage commitment of human blood progenitors. Science (80-. ). 2014;345(6204):1251033. 29. Aran D, Hu Z, Butte AJ. xCell: Digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017;18(220):. 30. Fontana M, Marconi G, Feenstra J, et al. Chromothripsis in acute myeloid leukemia: Biological features and impact on survival. Leukemia. 2017;32:1609–1620. 31. Lindsley RC, Mar BG, Mazzola E, et al. Acute myeloid leukemia ontogeny is defined by distinct somatic mutations.Blood . 2015;125(9):1367– 1376.

120

137853-gerritsen-layout.indd 120 14/01/2020 14:13 Ring sideroblasts in AML

32. Talwalkar S, Yin C, Naeem R, et al. Myelodysplastic syndromes arising in patients with germline TP53 mutation and Li-Fraumeni syndrome. Arch Pathol Lab Med. 2010;134(7):1010–5. 33. Reinig E, Greipp P, Chui A, Howard M, Reichard K. De novo pure erythroid leukemia: refining the clinicopathologic and cytogenetic characteristics of a rare entity. Mod. Pathol. 2018;31(5):705–717. 34. Lindsley R, Mar B, Mazzola E, et al. Acute myeloid leukemia ontogeny is defined by distinct somatic mutations. Blood. 2015;125(9):1367–76. 35. Akaashi K, Traver D, Miyamoto T, Weissman I. A clonogenic common myeloid progenitor that gives rise to all myeloid lineages. Nature. 2000;404(6774):193–7. 36. Zon L, Youssoufian H, Mather C, Lodish H, Orkin S. Activation of the erythropoietin receptor promoter by transcription factor GATA-1. Proc. Natl. Acad. Sci. U. S. A. 1991;88(23):10638–41. 37. Tenen D, Hromas R, Licht J, Zhang D. Transcription factors, normal myeloid development, and leukemia. Blood. 1997;90(2):489–519.

5

121

137853-gerritsen-layout.indd 121 14/01/2020 14:13 Chapter 5

SUPPLEMENTARY METHODS

Sorting of cell fractions To separate cell fractions, following thawing of viably frozen MNCs, cells were washed and stained with a panel of antibodies and sorted for purification of different cell fractions (antibodies against CD3, CD34, CD71 and CD235a surface markers (conjugates CD3-FITC (cat. 345763), CD34-APC (cat. 555824) or CD34-PE Cy7 (cat. 348811) or CD34-FITC (cat. 345801, BioLegend, Uithoorn, the Netherlands), CD71-BV786 (cat. 563768) or CD71-APC (cat. 551374) or CD71-PE (cat. 555537) and CD235a-BV421 (cat. 562938) or CD235a-APC (cat. 551336) Unless stated otherwise, these antibodies were obtained from BD Bioscience (Breda, the Netherlands). Single viable cells were selected based on forward and side scatter profiles in combination with negativity for DAPI or PI (both Sigma-Aldrich, Saint Louis, MO, USA). Blast fraction was defined as CD34 positive, T cell fraction as CD3 positive and erythroblast fraction was sorted as CD71/CD235a positive. Sorting purity was defined as ≥95% and confirmed by reanalysis.

Whole exome sequencing and confirmation of mutations in erythroblasts Following exome capturing using Human All Exon V5 (Agilent Technologies, Santa Clara, CA, USA), massively parallel sequencing was performed on enriched exome fragments using the HiSeq 2500 platform (Illumina, San Diego, CA, USA). Alignment of sequences and calling of mutations was executed our previously described in-house pipelines1,2, with minor modifications. The resultant data file was analyzed for the presence of germline variants in genes that have been previously implicated in sideroblastic anemia. Subsequently, DNA isolated from sorted autologous T cells was used as a constitutive reference to exclude germline variants. Filtering strategy and variant calling was performed as previously described3. Somatic variants in TP53 and SRSF2 were validated using amplicon-based deep sequencing on an Ion Torrent Personal Genome Machine (Thermo Fisher Scientific, Waltham, MA, USA). The sequencing procedure using an automated robotic workflow was performed as previously described4. The obtained sequencing data was mapped to the reference genome build GRCh37 (hg19). Variant calling was performed using the SeqNext module of the Sequence pilot software, version 4.2.2 (JSI Medical Systems, Ettenheim, Germany).

RNA Extraction and Illumina high-throughput sequencing RNA libraries were prepared using the KAPA RNA HyperPrep Kit with RiboErase (HMR) according to the manufactures protocol (KR1351 – v1.16, Roche Sequencing Solutions). The process is described briefly here: 25ng -1ug input RNA was depleted from ribosomal RNA by oligo hybridization, RNaseH treatment and DNase digestion. rRNA-depleted RNA was fragmented to ~200 bp fragments and first strand synthesis was performed using random primers. Second- strand synthesis was performed using dUTP for strand specificity. After adapter ligation, library amplification was performed and the number of cycles was dependent on the amount of starting

122

137853-gerritsen-layout.indd 122 14/01/2020 14:13 Ring sideroblasts in AML

material. Fragment size and quality was checked on a bioanalyser using a high sensitivity DNA Chip (Agilent). Samples were sequenced on the Illumina HiSeq 2000. Finally, each eligible library was subjected to 243 bp paired-end sequencing (PE43) on an Illumina NextSeq 500 system.

RNA-Seq analysis The hg19 reference genome was first indexed by STAR aligner with UCSC gene annotation. The resulting RNA-seq reads were mapped to the hg19 genome using STAR with two-pass mode, and the gene-level read counts were enumerated at the same time. The DESeq2 tool was used to examine differentially expressed genes by conducting pair-wise comparison between different groups. Only genes with a Benjamini-Hochberg-adjusted p-value -0.1 and a fold change 1.5 were considered significantly deregulated. Principal component analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE) were used to probe the transcriptomic relationships between these groups. To group differential genes with similar expression patterns, the k-means clustering approach was performed based on z-score normalization and then displayed as a heatmap. To assess the enriched annotation for deregulated genes, we performed functional 5 enrichment analysis with the Metascape tool, and only terms showing p-values below 0.01 were considered significantly over-represented.

For alternative splicing analysis we used MISO suites5 with default options to detect alternative splicing events in our study. The MISO annotations contained five types of events: skipped exons (SE), alternative 3’/5’ splice sites (A3SS, A5SS), mutually exclusive exons (MXE) and retained introns (RI). We first merged RNA-seq data from the same cell types then computed percentage splicing index (PSI) value of each event, and then finally performed pairwise comparisons between groups to identify differentially spliced genes.

We downloaded the MEP signature gene list from two previous papers 6,7 to analyze how these genes were expressed in the four groups. Expression quantification (FPKM-scale) for each RefSeq gene was performed with the Cuffnorm function in Cufflinks.

Data retrieval Data from NBM CD34+ and SF3B1 mutated MDS were retrieved from Dolatshad et al.8, while Blueprint data and RS-AML are in-house sequencing data. Similar sequencing technologies based on the Illumina platform are used and strand specific RNA-seq data was analyzed, using comparable sample prep methods. For processing, sequencing reads were trimmed when needed to similar read length and files were normalized for sequencing depth. In addition the analysis pipeline integrated in DEseq2 was used to discover and correct for batch effects.

123

137853-gerritsen-layout.indd 123 14/01/2020 14:13 Chapter 5

REFERENCES

1. Yoshida K, Sanada M, Shiraishi Y, et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478, 64–9 (2011). 2. da Silva-Coelho P, Kroeze LI, Yoshida K, Koorenhof-Scheele TN, Knops R, van de Locht LT, et al. lonal evolution in myelodysplastic syndromes. Nat. Commun. 8, 15099 (2017). 3. Berger G, Kroeze LI, Koorenhof-Scheele TN, de Graaf AO, Yoshida K, Ueno H, et al. Early detection and evolution of preleukemic clones in therapy-related myeloid neoplasms following autologous SCT. Blood 131, 1846–57 (2018). 4. Sandmann, S., de Graaf, A. O., van der Reijden, B. A., Jansen, J. H. & Dugas, M. GLM-based optimization of NGS data analysis: A case study of Roche 454, Ion torrent PGM and Illumia NextSeq sequencing data. PLoS One 12, e0171983 (2017). 5. Katz, Y., Wang, E., Airoldi, E. & Burge, C. Analysis and design of RNA sequencing experimetns for identifying isoform regulation. Nat. Methods 7, 1009–1015 (2010). 6. Chen, L. et al. Transcriptional diversity during lineage commitment of human blood progenitors. Science (80-. ). 345, 1251033 (2014). 7. Aran, D., Hu, Z. & Butte, A. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 18, (2017). 8. Dolatshad H, Pellagatti A, Liberante F, et al. Cryptic splicing events in the iron transporter ABCB7 and other key target genes in SF3B1- mutant myelodysplastic syndromes. Leukemia. 2016;30(12):2322–2331.

124

137853-gerritsen-layout.indd 124 14/01/2020 14:13 Ring sideroblasts in AML

Table S1: Detailed Information of 21 Targeted Genes for Mutation Detection.

Target TruSight smMIPS Target TruSight smMIPS gene sequencing panel panel gene sequencing panel panel

ABL1 x JAK3 x ASXL1 x x KDM6A x ATRX x KIT x x BCOR x KRAS x x BCORL1 x MLL x BRAF x x MPL x x CALR x x MYD88 x x CBL x x NOTCH1 x x CBLB x NPM1 x x CBLC x NRAS x x CDKN2A x PDGFRA x CEBPA x PHF6 x 5 CSF3R x x PTEN x CUX1 x PTPN11 x DNMT3A x x RAD21 x ETV6/TEL x RUNX1 x x EZH2 x x SETBP1 x x FBXW7 x SF3B1 x x FLT3 x x SMC1A x GATA1 x SMC3 x GATA2 x SRSF2 x x GNAS x STAG2 x HRAS x TET2 x x IDH1 x x TP53 x IDH2 x x U2AF1 x x IKZF1 x WT1 x x JAK2 x x ZRSR2 x

125

137853-gerritsen-layout.indd 125 14/01/2020 14:13 Chapter 5

Table S2. Genes implicated in congenital sideroblastic anemia

ABCB7 ISCA1 ALAS2 ISCA2 FECH ISCU GLRX5 MFRN1 HSCB NIFS HSP70 PUS1 HSPA9 SLC25A38 IRBP1 YARS2 IRBP2

126

137853-gerritsen-layout.indd 126 14/01/2020 14:13 Ring sideroblasts in AML 0.21 0.32 0.36 0.33 0.25 0.26 0.43 0.28 0.43 0.45 0.55 0.97 0.58 0.35 0.46 0.38 0.57 0.11 0.45 0.52 0.48 0.48 0.47 0.45 0.44 0.51 VAF 0.47 p.T1014M p.T1014M p.P44Q p.L454F p.D587fs p.F711fs p.R266L p.P391fs p.R77H p.V928M p.L52M p.R148G p.A366V p.A298P p.V358A p.R279Q p.V181M p.G120D p.P1485A p.D1083E p.R1095Q p.H92L p.R291C p.E390K p.G56S p.S840P Amino acid Amino change p.G1491R c.C3041T :c.A150T c.C131A c.C1360T c.1760_1764del c.2132delT c.G797T c.1173delC c.G230A c.G2782A c.C154A c.A442G c.C1097T c.G892C c.T1073C c.G836A c.G541A c.G359A c.C4453G c.C3249A c.G3284A c.A275T c.C871T c.G1168A c.G166A c.T2518C Coding Coding sequence change c.G4471A NM_001067 NM_001287014 NM_001080513 NM_032208 NM_003425 NM_001172638 NM_178470 NM_153759 NM_001167992 NM_178457 NM_152345 NM_001126115 NM_024335 NM_174978 NM_030979 NM_001923 NM_001300727 NM_021245 NM_000093 NM_007356 NM_001127231 NM_138571 NM_000865 NM_001004360 NM_022662 NM_001190274 Transcript ID Transcript NM_001306079 A T T T - - A - A A A C T G C T A T G T A T T A T G Observed sequence A 5 G A G C CACAT A C G G G C T C C T C G C C G G A C G C A Reference Reference sequence G 38556279 36162450 194063301 69420473 44417828 180276363 125685795 25467135 48759747 57768856 27934799 7577100 55362987 60933638 25671409 61091536 58949859 75394385 137711968 107703252 70255558 126288106 87725923 116525948 112638237 48036334 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 27460682 38556279 36162450 194063301 69420473 44417824 180276363 125685795 25467135 48759747 57768856 27934799 7577100 55362987 60933638 25671409 61091536 58949859 75394385 137711968 107703252 70255558 126288106 87725923 116525948 112638237 48036334 Start co-ordinate of variant (GRCh37/hg19) 27460682 17q21.2 9p13.3 3q29 2p13.3 19q13.31 5q35.3 2p23.3 Xq25 Xp11.23 20q13.32 17q11.2 17p13.1 16q12.2 14q23.1 13q12.13 11q12.2 11q12.1 10q22.2 9q34.3 7q31.1 7q11.22 6q22.32 6q14.3 2q14.1 2q13 2p16.3 Chr. 2p23.3 nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift deletion frameshift deletion frameshift deletion nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type SNV nonsynonymous exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene exonic TOP2A GLIPR2 CPN2 ANTXR1 ZNF45 ZFP62 DNMT3A DCAF12L1 PQBP1 ZNF831 ANKRD13B TP53 IRX6 C14orf39 PABPC3 DDB1 DTX4 MYOZ1 COL5A1 LAMB4 AUTS2 HINT3 HTR1E DPP10 ANAPC1 FBXO11 Gene CAD 7 6 5 4 3 2 1 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 No 1 RS002 Patient RS001 Table S3: WES results Table

127

137853-gerritsen-layout.indd 127 14/01/2020 14:13 Chapter 5 0.07 0.44 0.49 0.55 0.09 0.47 0.79 0.43 0.08 0.28 0.51 0.80 0.31 0.44 0.44 0.48 0.17 0.40 0.44 0.48 0.42 0.22 0.45 0.42 0.36 0.31 0.30 VAF p.R951X p.E79K p.Y498C p.G121S p.P4L p.C44F p.R2316Q p.A371V p.S653L p.V144L p.A545V p.S10T p.N91S p.V91G p.F173S p.A16V p.V350L p.G510C p.R1400H p.R291C p.E208A p.R318X p.N182fs p.R77C p.G870S p.P95L Amino acid Amino change c.3497-1G>A c.C2851T c.G235A c.A1493G c.G361A c.C11T c.G131T c.G6947A c.C1112T c.C1958T c.G430T c.C1634T c.G29C c.A272G c.T272G c.T518C c.C47T c.G1048T c.G1528T c.G4199A c.C871T c.A623C c.C952T c.544delA c.C229T c.G2608A c.C284T Coding Coding sequence change NM_001042492 NM_004525 NM_004794 NM_000216 NM_032037 NM_006387 NM_001126115 NM_001005407 NM_022480 NM_001201380 NM_001135694 NM_020223 NM_001032394 NM_001017408 NM_001167579 NM_001079524 NM_000460 NM_001278453 NM_153759 NM_007113 NM_032880 NM_001042664 NM_177404 NM_001429 NM_001291006 NM_015559 NM_001195427 Transcript ID Transcript A A A C T A A A A T T T C C G C A T A T T G T - T A A Observed sequence G G G T C G C G G C G C G T T T G G C C C T C A C G G Reference Reference sequence 29560019 170103945 129306271 8504940 19625876 16653179 7578403 1270897 86311930 43861084 42259409 299825 142630707 117923180 176813234 57314708 184093770 231134614 25463587 152081494 18692047 6533483 30269562 41513640 21401734 42531913 74732959 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 29560019 170103945 129306271 8504940 19625876 16653179 7578403 1270897 86311930 43861084 42259409 299825 142630707 117923180 176813234 57314708 184093770 231134614 25463587 152081494 18692047 6533483 30269562 41513640 21401734 42531913 74732959 Start co-ordinate of variant (GRCh37/hg19) 17q11.2 2q31.1 Xq26.1 Xp22.31 19p13.11 19p13.11 17p13.1 16p13.3 15q25.3 9p11.2 8p11.21 7p22.3 6q24.1 6q22.1 5q35.3 4q12 3q27.1 2q37.1 2p23.3 1q21.3 1p36.13 1p36.31 22q13.2 Xp21.2 22q11.21 18q12.3 17q25.1 Chr.

stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift deletion stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type splicing exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene NF1 LRP2 RAB33A ANOS1 TSSK6 CHERP TP53 CACNA1H KLHL25 CNTNAP3B VDAC3 FAM20C ADGRG6 GOPC SLC34A1 PAICS THPO SP140 DNMT3A TCHH IGSF21 PLEKHG5 EP300 MAGEB1 LRRC74B SETBP1 SRSF2 Gene 25 24 23 22 21 20 19 18 17 16 15 13 12 11 10 9 8 7 6 4 3 2 1 11 10 9 8 No RS003 Patient Table S3: Continued Table

128

137853-gerritsen-layout.indd 128 14/01/2020 14:13 Ring sideroblasts in AML 0.52 0.44 0.54 0.31 0.33 0.32 0.43 0.45 0.29 0.43 0.36 0.34 0.35 0.10 0.12 0.47 0.13 0.32 0.57 0.44 0.84 0.49 0.55 0.87 0.47 VAF 0.53 p.N695S p.N817S p.S55X p.Q370fs p.G217S p.A2V p.T352M p.L697F p.T341I p.V353M p.S647P p.R1263W p.S19F p.N135T p.A300fs p.R211X p.P269fs p.R368C p.P2742L p.G260D p.C106Y p.S1035I p.H47P p.M9T p.Q72R Amino acid Amino change p.D146N c.A2084G c.A2450G c.C164G c.1109_1122del c.G649A c.C5T c.C1055T c.C2089T c.C1022T c.G1057A c.T1939C c.C3787T c.C56T c.A404C c.898dupG c.C631T c.806delC c.C1102T c.C8225T c.G779A c.G317A c.G3104T c.A140C c.T26C c.A215G Coding Coding sequence change c.G436A NM_005921 NM_001170688 NM_001126118 NM_001001890 NM_001207019 NM_001171251 NM_004913 NM_001040610 NM_152447 NM_001003827 NM_025145 NM_001287059 NM_001430 NM_178553 NM_001145662 NM_014989 NM_001282142 NM_001007468 NM_000267 NM_001256265 NM_001126115 NM_004783 NM_173538 NM_001305 NM_001304437 Transcript ID Transcript NM_017625 G G C - T T A A T A G T T G C T - T T T T T C C C Observed sequence T 5 A A G GGGCGAG CTGGCTT C C G G C G A C C T - C G C C C C G A T T Reference Reference sequence C 56174925 1564610 7579406 36164685 7755121 42222572 89777197 82444553 42356850 5664529 105945803 43548590 46574041 27360794 128202821 72889437 48890995 24176338 29687632 18156687 7577568 30002843 87899821 73245557 83763921 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 160851072 56174925 1564610 7579406 36164672 7755121 42222572 89777197 82444553 42356850 5664529 105945803 43548590 46574041 27360794 128202821 72889437 48890995 24176338 29687632 18156687 7577568 30002843 87899821 73245557 83763921 Start co-ordinate of variant (GRCh37/hg19) 160851072 5q11.2 1p36.33 21q22.12 17p13.1 19p13.2 17q21.31 16q24.3 15q25.2 14q21.1 11p15.4 10q25.1 7p13 2p21 2p23.3 3q21.3 Xp11.23 6q13 22q11.23 17q11.2 17p11.2 17p13.1 16p11.2 8q21.3 7q11.23 6q14.1 Chr. 1q23.3 nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift deletion stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift insertion frameshift deletion stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type SNV nonsynonymous exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene exonic MAP3K1 MIB2 RUNX1 TP53 FCER2 C17orf53 VPS9D1 EFTUD1 LRFN5 TRIM34,TRIM6- TRIM34 CFAP43 HECW1 EPAS1 PRR30 GATA2 TFE3 RIMS1 SMARCB1 NF1 FLII TP53 TAOK2 CNBD1 CLDN4 UBE3D Gene ITLN1 3 2 1 13 12 11 10 9 8 7 6 5 4 3 2 1 10 9 8 7 6 5 4 3 2 No 1 RS006 RS005 Patient RS004 Table S3: Continued Table

129

137853-gerritsen-layout.indd 129 14/01/2020 14:13 Chapter 5 0.40 0.45 0.45 0.40 0.44 0.41 0.76 0.46 0.46 0.40 0.43 0.47 0.42 0.38 0.39 0.39 0.46 0.41 0.45 0.37 0.95 0.51 0.48 0.51 0.74 0.43 0.42 VAF p.R1337X p.Q514X p.R354X p.S762G p.R72H p.V963I p.H61R p.T188M p.N155S p.N243S p.I366M p.R198C p.H439Y p.T524K p.R132C p.T1613M p.N3158S p.I420T p.D90fs p.L367S p.Y91C p.P95H p.R10Q p.E315K p.G375C p.E264K Amino acid Amino change c.C4009T c.C1540T c.C1060T c.A2284G c.G215A c.G2887A c.A182G c.C563T c.A464G c.A728G c.C1098G c.C592T c.C1315T c.C1571A c.C394T c.C4838T c.A9473G c.T1259C c.114+1G>A c.269dupA c.T1100C c.A272G c.C284A c.G29A c.G943A c.G1123T c.G790A Coding Coding sequence change NM_005215 NM_005632 NM_001284224 NM_005462 NM_001013736 NM_005215 NM_001126115 NM_001080417 NM_001376 NM_175932 NM_152629 NM_020815 NM_001256614 NM_001267617 NM_001282386 NM_207363 NM_000081 NM_024654 NM_001281471 NM_153759 NM_006306 NM_001161359 NM_001195427 NM_001290114 NM_152638 NM_001109974 NM_001286751 Transcript ID Transcript T T T G A A C A G G C T T A A A C G A T G G T T T T T Observed sequence C C C A G G T G A A G C C C G G T A G - A A G C C G C Reference Reference sequence 51025778 599083 28017905 140995474 37026698 50929215 7578271 30795086 102445775 249005 4117915 134071887 183824347 35779794 209113113 133539546 235884048 6592799 102465408 25470924 53438958 17881319 74732959 90631934 91347577 150028960 71519557 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 51025778 599083 28017905 140995474 37026698 50929215 7578271 30795086 102445775 249005 4117915 134071887 183824347 35779794 209113113 133539546 235884048 6592799 102465408 25470924 53438958 17881319 74732959 90631934 91347577 150028960 71519557 Start co-ordinate of variant (GRCh37/hg19) 18q21.2 16p13.3 8p21.1 Xq27.2 Xp21.1 18q21.2 17p13.1 16p11.2 14q32.31 11p15.5 9p24.2 4q28.3 3q27.1 3p22.3 2q34 2q21.2 1q42.3 1p36.31 2p23.3 5q21.1 Xp11.22 19p13.11 17q25.1 15q26.1 12q21.33 5q33.1 5q13.2 Chr. stopgain stopgain stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift insertion

nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic splicing exonic exonic exonic exonic exonic exonic exonic Location Location of gene DCC CAPN15 ELP3 MAGEC1 FAM47C DCC TP53 ZNF629 DYNC1H1 PSMD13 GLIS3 PCDH10 HTR3E ARPP21 IDH1 NCKAP5 LYST NOL9 DNMT3A PPIP5K2 SMC1A FCHO1 SRSF2 IDH2 CCER1 SYNPO MRPS27 Gene 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 11 10 9 8 7 6 5 4 No RS007 Patient Table S3: Continued Table

130

137853-gerritsen-layout.indd 130 14/01/2020 14:13 Ring sideroblasts in AML 0.62 0.13 0.14 0.45 0.34 0.61 0.27 0.16 0.34 0.35 0.13 0.34 0.35 0.47 0.35 0.07 0.34 0.06 0.07 0.06 0.10 0.06 0.07 0.12 0.11 0.36 0.09 VAF 0.36 p.V220F p.P361A p.S471L p.E105K p.A58V p.I63T p.V353 p.R961H p.V266I p.R26Q p.A65T p.G549D p.K2010N p.R2049C p.C537Y p.D160Y p.V127I p.V11M p.R156C p.R151C p.V29F p.A156P p.L70F p.A1003T p.R271S p.G642fs Amino acid Amino change c.G658T c.C1081G c.C1412T c.G313A c.C173T c.T188C c.G1057A c.G2882A c.G796A c.G77A c.G193A c.G1646A c.G6030T c.C6145T c.G1610A c.G478T c.G379A c.G31A c.C466T c.C451T c.G85T c.G466C c.C208T c.G3007A c.A813T c.3268-1G>C c.1927dupG Coding Coding sequence change c.562+1G>T NM_014799 NM_001146115 NM_000629 NM_001077182 NM_001004334 NM_001126115 NM_001098517 NM_006133 NM_001004743 NM_002233 NM_001134658 NM_080872 NM_001723 NM_015136 NM_001017371 NM_001279 NM_025106 NM_001126115 NM_017570 NM_018100 NM_001005216 NM_001031703 NM_001768 NM_080630 NM_001384 NM_002025 NM_015338 Transcript ID Transcript NM_001127586 T G T A A G T A T T A A A T T T A T A T T G A T T C G Observed sequence A 5 G C C G G A C G C C G G C C C G G C G C G C G C A G - Reference Reference sequence C 65415029 47952055 34726061 79495870 36499500 7578265 115049433 61511714 56230082 30034149 95658342 35584012 56482802 52555698 174783339 12274239 9416329 7578503 145113797 52303267 29079752 47543199 87017646 103405912 44437387 148055000 31022441 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 6760489 65415029 47952055 34726061 79495870 36499500 7578265 115049433 61511714 56230082 30034149 95658342 35584012 56482802 52555698 174783339 12274239 9416329 7578503 145113797 52303267 29079752 47543199 87017646 103405912 44437387 148055000 31022441 Start co-ordinate of variant (GRCh37/hg19) 6760489 Xq12 21q22.3 21q22.11 17q25.3 17q12 17p13.1 11q23.3 11q12.2 11q12.1 11p14.1 10q23.33 8p12 6p12.1 3p21.1 2q31.1 1p36.22 18p11.21 17p13.1 8q24.3 6p12.2 6p22.1 3p21.31 2p11.2 1p21.1 1p34.1 20q11.21 Xq28 Chr. 12p13.31 nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift insertion

Type of variant Type exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic splicing splicing Location Location of gene HEPH DIP2A IFNAR1 FSCN2 GPR179 TP53 CADM1 DAGLA OR5M9 KCNA4 SLC35G1 UNC5D DST STAB1 SP3 SPSB1 CIDEA TP53 OPLAH EFHC1 OR2J3 ELP6 CD8A COL11A1 DPH2 ASXL1 ING4 AFF2 Gene 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 10 9 8 7 6 5 4 3 2 1 20 21 No RS009 RS008 Patient Table S3: Continued Table

131

137853-gerritsen-layout.indd 131 14/01/2020 14:13 Chapter 5 0.06 0.11 0.32 0.09 0.33 0.34 0.52 0.26 0.36 0.25 0.17 0.09 0.29 0.07 0.34 0.21 0.21 0.26 0.38 0.32 0.24 0.28 VAF 0.24 p.R105X p.R1659C p.R353H p.G62E p.V660A p.P33S p.M1L p.R287H p.P244L p.R178C p.R359C p.V450M p.G14D p.K690Q p.A93T p.47_55del p.675_676del p.F369fs p.G642fs p.L957fs p.L198fs p.A356fs Amino acid Amino change p.G312fs c.C313T c.C4975T c.G1058A c.G185A c.T1979C c.C97T c.A1T c.G860A c.C731T c.C532T c.C1075T c.G1348A c.G41A c.A2068C c.G277A c.141_164del c.2025_2027del c.1104_ c.1104_ 1105insCCCG c.1927dupG c.2871dupA c.594delG c.1066delG Coding Coding sequence change c.934delG NM_001987 NM_002972 NM_144773 NM_002975 NM_001282540 NM_001126118 NM_080914 NM_001197181 NM_021214 NM_005164 NM_006028 NM_017551 NM_001135947 NM_024831 NM_001234 NM_001195427 NM_016107 NM_001001890 NM_015338 NM_001127208 NM_001285829 NM_001024218 Transcript ID Transcript NM_005188 T A T A G A A A T A T T A C A - - CGGG G A - - Observed sequence - C G C G A G T G C G C C G A G CGGGACTC CTTGGTGTA GCGGTCC TGT - - - C G Reference Reference sequence G 11992223 50893009 5282783 51227199 15496678 7579473 7017559 90001935 81041994 40012886 113815462 87487797 131140320 56717520 8787374 74733102 32390498 36164689 31022441 106157969 33792370 67555720 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 119146771 11992223 50893009 5282783 51227199 15496678 7579473 7017559 90001935 81041994 40012886 113815462 87487797 131140320 56717520 8787374 74733079 32390496 36164689 31022441 106157969 33792370 67555720 Start co-ordinate of variant (GRCh37/hg19) 119146771 12p13.2 22q13.33 20p12.3 19q13.33 17p12 17p13.1 17p13.1 16q24.3 15q25.1 12q12 11q23.2 10q23.1 9q34.11 8q12.1 3p25.3 17q25.1 5p13.3 21q22.12 20q11.21 4q24 19q13.11 14q23.3 Chr. 11q23.3 stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonframeshift deletion nonframeshift deletion frameshift insertion frameshift insertion frameshift insertion frameshift deletion frameshift deletion Type of variant Type frameshift deletion exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene exonic ETV6 SBF1 PROKR2 CLEC11A CDRT1 TP53 ASGR2 TUBB3 ABHD17C ABCD2 HTR3B GRID1 URM1 TGS1 CAV3 SRSF2 ZFR RUNX1 ASXL1 TET2 CEBPA GPHN Gene CBL 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 No 1 Patient RS010 Table S3: Continued Table

132

137853-gerritsen-layout.indd 132 14/01/2020 14:13 Ring sideroblasts in AML 0.42 0.47 0.36 0.49 0.59 0.13 0.50 0.45 0.42 0.41 0.27 0.42 0.31 0.31 0.40 0.42 0.33 0.45 0.42 0.16 0.45 0.32 0.38 0.34 0.37 0.05 0.44 VAF p.V585I p.N90S p.Q153R p.V533M p.W476X p.M1254V p.M1481L p.H62N p.A29P p.G388D p.N64I p.T92M p.K289M p.G102C p.D224E p.V1192D p.R335H p.H2706Q p.L769Q p.K454X p.D104N p.V56E p.S331T p.F871_ K872delins Amino acid Amino change c.G1753A c.A269G c.A458G c.G1597A c.G1427A c.A3760G c.A4441T c.C184A c.G85C c.G1163A c.A191T c.1328-1G>C c.C275T c.A866T c.G304T c.C672A c.T3575A c.G1004A c.T8118A c.T2306A c.A1360T c.G310A c.T167A c.3955-2A>T c.G992C c.2611_ c.2611_ 2612insAG c.1655-1G>A Coding Coding sequence change NM_001178030 NM_024945 NM_153815 NM_181882 NM_138370 NM_001198834 NM_006031 NM_001005168 NM_003553 NM_012090 NM_181622 NM_003685 NM_001278911 NM_001143828 NM_003495 NM_017709 NM_001006600 NM_003750 NM_173628 NM_022802 NM_020536 NM_001145652 NM_018111 NM_001127208 NM_182557 NM_001123383 NM_005529 Transcript ID Transcript T G C T A C T T G A A G T T T A A T T T T A T T G CT T Observed sequence 5 C A T C G T A G C G T C C A G C T C A A A G A A C - C Reference Reference sequence 64534688 86616170 79291152 40902662 42284773 144881436 47817403 5878749 3301620 39750771 31798040 6416662 178546013 22057483 27107391 118166162 65350721 120825029 76472690 126683132 18143278 49518815 49622113 106182914 118773460 39931987 22207996 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 64534688 86616170 79291152 40902662 42284773 144881436 47817403 5878749 3301620 39750771 31798040 6416662 178546013 22057483 27107391 118166162 65350721 120825029 76472690 126683132 18143278 49518815 49622113 106182914 118773460 39931987 22207996 Start co-ordinate of variant (GRCh37/hg19) 11q13.2 9q21.32 15q24.2 19q13.2 2p21 1q21.2 21q22.3 11p15.4 17p13.3 1p32-p31 21q22.1 19p13.3 3q26.32 18q11.2 6p21.33 1p12 5q12.3 10q26 17q25.3 10q26.13 20p11.23 6p12.3 19q13.33 11q23.3 4q24 1p36.12 Xp11.4 Chr. nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous splicing nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous

stopgain Type of variant Type exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic splicing exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic splicing splicing exonic Location Location of gene SF1 RMI1 RASGRF1 PRX PKDCC PDE4DIP PCNT OR52E8 OR1E1 MACF1 KRTAP13-3 KHSRP KCNMB2 HRH4 HIST1H4I FAM46C ERBB2IP EIF3A DNAH17 CTBP2 CSRP2BP C6orf141 C19orf73 BCL9L HSPG2 TET2 BCOR Gene 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 25 26 24 No RS011 Patient Table S3: Continued Table

133

137853-gerritsen-layout.indd 133 14/01/2020 14:13 Chapter 5 0.25 0.21 0.36 0.15 0.27 0.26 0.23 0.16 0.35 0.26 0.29 0.16 0.38 0.16 0.25 0.08 0.20 0.12 0.41 0.44 0.41 0.39 0.46 0.43 0.32 0.38 VAF p.V140M p.Q1084X p.R153H p.N676D p.P34S p.S375T p.G2402D p.D1815N p.P290S p.C211W p.G53R p.V8M p.W329R p.S1662N p.I3181T p.S325G p.R101H p.E430K p.W52X p.Y73X p.P1617A p.N1616fs p.Y274C p.K431N p.A362T p.G1770R Amino acid Amino change c.G418A c.C3250T c.G458A c.A2026G c.C100T c.T1123A c.G7205A c.G5443A c.C868T c.C633G c.G157A c.G22A c.T985C c.G4985A c.T9542C c.A973G c.G302A c.G1288A c.G155A c.T219G c.C4849G c.4848_ c.4848_ 4849insT c.A821G c.G1293T c.G1084A c.G5308C Coding Coding sequence change NM_001126115 NM_006662 NM_001039958 NM_001145224 NM_001191323 NM_001256405 NM_001145966 NM_001080495 NM_001177598 NM_002213 NM_000844 NM_001001824 NM_001010846 NM_001198834 NM_020765 NM_001128223 NM_014675 NM_001291491 NM_001126118 NM_001126115 NM_001127208 NM_001127208 NM_153271 NM_001013251 NM_001199633 NM_001041 Transcript ID Transcript T T A G T A T T A C A T G T G C A T T C G T G T T G Observed sequence C C G A C T C C G G G C A C A T G C C A C - A G C C Reference Reference sequence 7577124 30732296 90320046 75586760 33022991 67017941 129901819 5391477 184090478 124560377 6903232 248814164 154461566 144873972 19451081 75787801 17250925 44890654 7579415 7578234 106196516 106196515 75942264 62655868 86905134 164700138 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 7577124 30732296 90320046 75586760 33022991 67017941 129901819 5391477 184090478 124560377 6903232 248814164 154461566 144873972 19451081 75787801 17250925 44890654 7579415 7578234 106196516 106196515 75942264 62655868 86905134 164700138 Start co-ordinate of variant (GRCh37/hg19) 17p13.1 16p11.2 15q26.1 15q24.2 15q13.3 11q13.2 10q26.2 7p22.1 3q27.1 3q21.2 3p26.1 1q44 1q21.3 1q21.1 1p36.13 1p36.13 3p12.3 19q13.31 17p13.1 17p13.1 4q24 4q24 15q24.2 11q13.2 9q22.2 3q25.2-q26.2 Chr. nonsynonymous SNV nonsynonymous stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous stopgain stopgain nonsynonymous SNV nonsynonymous frameshift insertion nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene TP53 SRCAP MESP2 GOLGA6D GREM1 KDM2A MKI67 TNRC18 THPO ITGB5 GRM7 OR2T27 SHE PDE4DIP UBR4 CROCC ZNF717 ZNF285 TP53 TP53 TET2 TET2 SNX33 SLC3A2 SLC28A3 SI Gene 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 34 33 32 31 30 29 28 27 26 25 No RS018 Patient Table S3: Continued Table

134

137853-gerritsen-layout.indd 134 14/01/2020 14:13 Ring sideroblasts in AML 0.12 0.39 0.22 0.37 0.24 0.22 0.53 0.29 0.39 0.46 0.32 0.37 0.32 0.06 0.15 0.38 0.23 0.72 0.32 0.41 0.26 0.39 0.35 0.36 0.25 0.39 0.32 VAF p.R244X p.K666N p.T5095S p.S137A p.Q61H p.P402A p.V40F p.R150W p.R288C p.V1280M p.E278X p.V178M p.I23T p.G1094S p.K302R p.G191D p.Q425K p.I674T p.F142fs p.I143fs p.R614G p.R1388Q p.W2X p.H36R Amino acid Amino change c.C730T c.G1998T c.C15284G c.T409G c.296+3G>C c.A183C c.C1204G c.G118T c.C448T c.C862T c.G3838A c.G832T c.G532A c.858-5C>T c.T68C c.G3280A c.A905G c.G572A c.C1273A c.829+5A>T c.T2021C c.425_426insC c.427_430del c.A1840G c.G4163A c.G6A c.A107G Coding Coding sequence change NM_020676 NM_012433 NM_003319 NM_001127710 NM_004518 NM_002524 NM_000479 NM_001126115 NM_001126115 NM_001286430 NM_153046 NM_001100118 NM_032230 NM_001034852 NM_020358 NM_001115 NM_020381 NM_001142603 NM_031955 NM_001100874 NM_001080415 NM_017801 NM_017801 NM_153759 NM_001146706 NM_001288829 NM_001126115 Transcript ID Transcript T A C G G G G A A A A A A T G T C A T A C G - C T A C Observed sequence 5 C C G T C T C C G G G C G C A C T G G T T - CTAT T C G T Reference Reference sequence 58270860 198267359 179498747 186275662 62103518 115256528 2251477 7578416 7577094 52254643 104506654 104165344 82792574 70478197 89537570 131793112 107531746 141309554 172642063 103853276 142754903 32525578 32525577 25462000 21868167 161692468 7578427 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 58270860 198267359 179498747 186275662 62103518 115256528 2251477 7578416 7577094 52254643 104506654 104165344 82792574 70478197 89537570 131793112 107531746 141309554 172642063 103853276 142754903 32525578 32525574 25462000 21868167 161692468 7578427 Start co-ordinate of variant (GRCh37/hg19) 3p14.3 2q33.1 2q31.2 1q31.1 1p13.2 20q13.33 19p13.3 17p13.1 17p13.1 15q21.2 14q32.33 14q32.33 14q24.2 12q21.31 11q14.3 8q24.22 6q21 5q31.3 4q24 3q26.31 3q23 3p22.3 3p22.3 2p23.3 1q23.3 Yq11.222 17p13.1 Chr. stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous

nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift insertion frameshift deletion nonsynonymous SNV nonsynonymous stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type exonic exonic exonic exonic exonic splicing exonic exonic exonic exonic exonic splicing exonic exonic exonic exonic exonic splicing exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene ABHD6 SF3B1 TTN PRG4 NRAS KCNQ2 AMH TP53 TP53 LEO1 TDRD9 SMOC1 XRCC3 METTL25 TRIM49 ADCY8 PDSS2 SLC9B1 KIAA0141 SPATA16 U2SURP CMTM6 CMTM6 DNMT3A FCRLB KDM5D TP53 Gene 5 4 3 2 1 20 19 18 17 16 15 13 14 12 11 10 9 7 8 6 5 4 3 2 1 18 17 No RS020 RS019 Patient Table S3: Continued Table

135

137853-gerritsen-layout.indd 135 14/01/2020 14:13 Chapter 5 0.46 0.47 0.46 0.27 0.32 0.41 0.27 0.27 0.20 0.32 0.36 0.79 0.11 0.28 0.35 0.30 0.35 0.11 0.22 0.24 0.07 0.37 0.25 0.24 0.29 0.29 VAF p.E374K p.E363Q p.K346N p.P65S p.G362E p.A13V p.G3606V p.L747fs p.V399M p.P120L p.R354H p.E370K p.G427V p.N286H p.T331S p.R19H p.I768L p.C1173S p.S271X p.A365fs p.R417C p.E810fs p.K105E p.R590W p.H277Q p.N297S Amino acid Amino change c.G1120A c.G1087C c.G1038C c.C193T c.G1085A c.C38T c.G10817T c.2241delG c.G1195A c.C359T c.G1061A c.G1108A c.G1280T c.A856C c.A991T c.G56A c.A2302C c.G3518C c.C812A c.1093_ c.1093_ 1094insTCGG c.C1249T c.2429delA c.A313G c.C1768T c.C831G c.A890G Coding Coding sequence change NM_000237 NM_000237 NM_000237 NM_001004320 NM_002263 NM_022346 NM_173651 NM_005760 NM_001145122 NM_001144886 NM_000488 NM_152577 NM_019612 NM_152605 NM_001480 NM_025237 NM_002892 NM_000834 NM_001759 NM_000378 NM_000378 NM_001105521 NM_018051 NM_001166161 NM_025258 NM_001145662 Transcript ID Transcript A C C A A T T - T A T A T G T T C G A CCGA A - G T C C Observed sequence G G G G G C G C C G C G G T A C A C C - G A A C G T Reference Reference sequence 19816872 19816839 19816790 15599830 33372957 17812738 186664583 37443529 31414884 71521796 173878782 22292216 44223990 38160194 74980799 41836054 58831109 13716654 4409117 32417907 32414251 133978184 158664076 94827674 31741105 128202830 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 19816872 19816839 19816790 15599830 33372957 17812738 186664583 37443529 31414884 71521796 173878782 22292216 44223990 38160194 74980799 41836054 58831109 13716654 4409117 32417907 32414251 133978184 158664076 94827674 31741105 128202830 Start co-ordinate of variant (GRCh37/hg19) 8p21.3 8p21.3 8p21.3 7p21.2 6p21.32 4p15.31 2q32.1 2p22.2 2p23.1 1q25.1 Xq13.1 Xp22.11 19q13.31 19q13.12 18q23 17q21.31 14q23.1 12p13.1 12p13.32 11p13 11p13 10q26.3 7q36.3 7q21.3 6p21.33 3q21.3 Chr. nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift deletion nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous stopgain frameshift insertion nonsynonymous SNV nonsynonymous frameshift deletion nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene LPL LPL LPL AGMO KIFC1 NCAPG FSIP2 CEBPZ CAPN14 SERPINC1 CITED1 ZNF645 IRGC ZNF781 GALR1 SOST ARID4A GRIN2B CCND2 WT1 WT1 JAKMIP3 WDR60 PPP1R9A VWA7 GATA2 Gene 10 9 8 7 6 5 4 3 2 1 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 No RS021 Patient Table S3: Continued Table

136

137853-gerritsen-layout.indd 136 14/01/2020 14:13 Ring sideroblasts in AML 0.23 0.21 0.29 0.30 0.31 0.28 0.23 0.25 0.44 0.23 0.34 0.44 0.32 0.37 0.29 0.37 0.39 0.39 0.37 0.39 VAF p.R4730Q p.H54Y p.P91A p.M194T p.R250C p.P900S p.Q937X p.L505V p.V188A p.V97M p.S11220R p.R116W p.L321R p.A261V p.A588fs p.P1359S p.G349V p.S714R p.A755V p.K203N Amino acid Amino change c.G14189A c.C160T c.C271G c.T581C c.C748T c.C2698T c.C2809T c.C1513G c.T563C c.G289A c.T33660G c.C346T c.T962G c.C782T c.1763delC c.C4075T c.G1046T c.T2142G c.C2264T c.G609T Coding Coding sequence change NM_001042723 NM_020764 NM_001033028 NM_001001961 NM_021927 NM_001278616 NM_001102426 NM_020247 NM_001204174 NM_001267611 NM_024690 NM_001126115 NM_004691 NM_001098814 NM_001199096 NM_001897 NM_001177984 NM_006165 NM_152643 NM_001252618 Transcript ID Transcript A A G G T A A G C T C A C A - A T C T T Observed sequence 5 G G C A C G G C T C A G A G C G G A C G Reference Reference sequence 39068589 2240158 22956327 107298514 44688540 111398909 101627976 227172583 19632589 168073851 9047971 7577539 67472525 4242794 1395031 75977757 52096610 129744220 135012276 138418270 End co-ordinate End co-ordinate of variant (GRCh37/hg19) 39068589 2240158 22956327 107298514 44688540 111398909 101627976 227172583 19632589 168073851 9047971 7577539 67472525 4242794 1395031 75977757 52096610 129744220 135012276 138418270 Start co-ordinate of variant (GRCh37/hg19) 19q13.2 16p13.3 15q11.2 9q31.1 4p12 2q13 2q11.2 1q42.13 1q24.2 21q21.1 19p13.2 17p13.1 16q22.1 16p13.3 16p13.3 15q24.2 12q13.13 11q24.3 10q26.3 9q34.3 Chr. nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous stopgain nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous frameshift deletion nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous nonsynonymous SNV nonsynonymous Type of variant Type exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic exonic Location Location of gene RYR1 CASKIN1 CYFIP1 OR13C3 GUF1 BUB1 TBC1D8 ADCK3 GPR161 CHODL MUC16 TP53 ATP6V0D1 SRL BAIAP3 CSPG4 SCN8A NFRKB KNDC1 LCN1 Gene 9 8 7 6 5 4 3 2 1 21 20 19 18 17 16 15 14 13 12 11 No RS022 Patient Table S3: Continued Table

137

137853-gerritsen-layout.indd 137 14/01/2020 14:13 Chapter 5 Treatment Azacitidine Idarubicin + cytarabin Idarubicine + cytarabin Daunorubicine + cytarabin Azacitidine Daunorubicine + cytarabin Daunorubicine + cytarabin, mercaptopurine, by followed decitabine azacitidine, Azacitidine Azacitidine Azacitidine Daunorubicin + cytarabin, bij azacitidine 1year followed Azacitidine Idarubicin + cytarabin Idarubicin + cytarabin Idarubicin + cytarabin Azacitidine followed by by Azacitidine followed daunorubicin + cytarabin Daunorubicin + cytarabin Outcome CR NR CR CR PR CR NR NR* PR NR R CR NR CR CR CR CR Interval (months) 12 1.5 3.5 1 4 1 16 2.5 3.5 4.5 20 8 3 2.5 1.5 4 5 Erythro- blasts (%) 39% 19% 30% 10% 21% 30% 58% 25% 43% 65% 36% 47% 64% 28% 32% 21% 35% Sidero- blasts (%) 29% 29% 19% 1% 14% 39% 43% 23% 27% 27% 36% 9% 29% nd 25% 28% 38% Blasts (%) 1% 14% 3% 1% 14% 3% 18% 3% 8% 21% 24% 2% 5% 2% 2% 3% 1% Timepoint 2: Follow-up Timepoint RS (%) 25% 55% 7% 0% 6% 0% 33% 39% 2% 56% 25% 0% 13% 0% 0% 1% 1% Erythro- blasts (%) 59% 33% 40% 54% 15% 35% 17% 12% 30% 29% 29% 16% 16% 50% 24% 25% 33% Sidero- blasts (%) 7% 13% 11% 24% 27% 15% 35% 45% 49% 44% 43% 49% 17% 48% 1% 27% 20% Blasts (%) 10% 15% 12% 24% 47% 28% 69% 15% 12% 37% 36% 20% 33% 22% 21% 40% 31% RS (%) Timepoint 1: Diagnosis Timepoint 85% 77% 74% 61% 50% 50% 44% 34% 25% 25% 22% 17% 15% 12% 7% 9% 8% Diagnosis MDS-EB2 MDS-EB2 t-MN t-MN sAML sAML t-MN de novo AML MDS-EB2 MDS-EB2 de novo AML de novo AML de novo AML de novo AML t-MN sAML sAML de novo AML UPN RS002 RS037 RS012 RS018 RS074 RS021 RS006 RS045 RS061 RS122 RS042 RS028 RS001 RS068 RS108 RS093 RS071 NR* no response based on persistent presence of monosomy 7 upon cytogenetic evaluation. of monosomy presence based on persistent NR* no response and responses. strategies Treatment S4: Table Abbreviations: UPN – unique patient number, RS – ring sideroblasts, NR – no response, CR – complete remission, PR – partial R relapse, nd not determined Abbreviations: UPN – unique patient number, RS ring sideroblasts, NR no response, CR complete

138

137853-gerritsen-layout.indd 138 14/01/2020 14:13 Ring sideroblasts in AML Cluster 5 Cluster down up down down Recurrent Mutations TP53 TP53, GATA2 IDH2, RUNX1, SRSF2 TP53, ASXL1

5 snp array[GRCh37] snp array[GRCh37] 5q14.2q33.3(82797741_157092860)x1[0.9], 6p25.1p24.2(6106822_11589618)x1[0.9], 6p22.3p22.1(15429731_27521891)x1[0.9], 11q23.3q24.1(119562848_121539282)x4 hmz[1.0], 11q24.3(128018010_128611889)x3[0.5], 11q24.3q25(128611918_131046161)x5 hmz[1.0], 12p13.31p12.3(8859609_15137901)x1[0.9], (15)x3[0.4], 16pterp11.2(85881_30731290)cth, hmz, 17pterp12(526_12251758)x2 3q13.33q22.1(120891728_130611475)x1[0.2], 5q13.3qter(74695345_180719789)x1[0.7], 6pterp12.3(156975_49643070)cth, 7q21.3q22.1(93298976_98534420)x1[0.7], 7q22.1qter(99472313_159119708)x1[0.7], 8q21.11q21.3(76031549_87513217)x4, 8q24.13q24.23(125797118_136838801)x3, 8q24.3(143902971_146035919)x1[0.5], 9q21.32q21.33(85778256_88125947)x3[0.5], 9q22.31q22.32(94680957_97096581)x1[0.7], 11p15.5p15.4(2143251_2915571)x1[0.6], 11p15.3p15.1(11019434_18941183)x1[0.6], 11q13.2qter(67845557_134942626)cth, 17p13.2(3612128_5242297)x1[0.6], 17p13.1(6907157_8211242)x1[0.6], (1-22)x2,(X,Y)x1 (6,8,10,13)x3[0.2] hmz 17q11.2qter(31284651_81041938)x2[0.25] ELN 2017 adverse risk adverse risk adverse risk adverse risk diagnosis de novo AML de novo AML de novo AML de novo AML % ring sideroblasts 15% 36% 44% 20% 16% 21% 17% 27% % erythroblasts 33% 37% 69% 35% % blasts 59 78 68 74 age at age at diagnosis F M M M sex RS001 RS005 RS006 RS008 UPN RS-AML patients included for RNA-seq analysis. RNA-seq included for patients S5: RS-AML Table

139

137853-gerritsen-layout.indd 139 14/01/2020 14:13 Chapter 5 Cluster 5 Cluster up up Recurrent Mutations TP53, TET2, SF1 TP53, DNMT3A snp array[GRCh37] snp array[GRCh37] 3pterp24.1(61892_29858520)x1[0.75] 3p24.1p12.1(76175526_84636824)cth 3q21.3(128310915_128502957)x3[0.6] 3q26.2(168797064_169026916)x3[0,7] 3q26.2(169293101_169535888)x3[0.7] 5q11.2q12.3(52636985_66251459)cth 5q12.3q35.3(66251460_180719790)x1[0.7] 5q14.3q33.3(83016330_156896155)x1[0.6], 7pterp11.2(43361_57709441)x1[0.6], 7q11.21qter(62453455_159119707)x1[0.2], (13)cth, 16q12.1qter(49778690_90155062)cth, (19)x3 [0.6], 20pterq11.22(61569_32496494)x3[0.6], 20q11.22q13.33(32497342_59556875)x1 [0.6], [0.6], 20q13.33qter(59561326_62915555)x3 (21)x3 [0.6] ELN 2017 adverse risk n/a diagnosis tAML MDS-EB2 % ring sideroblasts 22% 13% 10% 24% % erythroblasts 20% 15% % blasts 70 72 age at age at diagnosis M F sex RS011 RS019 UPN Table S5: Continued. Table

140

137853-gerritsen-layout.indd 140 14/01/2020 14:13 Ring sideroblasts in AML

Table S6: Upregulated genes within GO:0030219 megakaryocyte differentiation

Input ID Gene ID Description Biological Process (GO)

ZFP36 7538 ZFP36 ring finger protein GO:1901835 positive regulation of deadenylation-independent decapping of nuclear-transcribed mRNA;GO:1901834 regulation of deadenylation- independent decapping of nuclear-transcribed mRNA;GO:0060213 positive regulation of nuclear-transcribed mRNA poly(A) tail shortening RPS19 6223 ribosomal protein S19 GO:0060265 positive regulation of respiratory burst involved in inflammatory response;GO:0060266 negative regulation of respiratory burst involved in inflammatory response;GO:0060264 regulation of respiratory burst involved in inflammatory response ZFP36L1 677 ZFP36 ring finger GO:0045657 positive regulation of monocyte differentiation;GO:0060710 protein like 1 chorio-allantoic fusion;GO:0097403 cellular response to raffinose PCBP2 5094 poly(rC) binding protein 2 GO:0075522 IRES-dependent viral translational initiation;GO:0019081 viral translation;GO:0039694 viral RNA genome replication GP1BA 2811 glycoprotein Ib platelet GO:0007597 blood coagulation, intrinsic pathway;GO:0042730 subunit alpha fibrinolysis;GO:0045652 regulation of megakaryocyte differentiation ITGA2B 3674 integrin subunit alpha 2b GO:0045652 regulation of megakaryocyte differentiation;GO:0070527 platelet aggregation;GO:0002576 platelet degranulation ZFPM1 161882 zinc finger protein, FOG GO:0060377 negative regulation of mast cell differentiation;GO:0002295 family member 1 T-helper cell lineage commitment;GO:0003195 tricuspid valve formation 5 GATA1 2623 GATA binding protein 1 GO:0035854 eosinophil fate commitment;GO:0010725 regulation of primitive erythrocyte differentiation;GO:0030221 basophil differentiation TAL1 6886 TAL bHLH transcription GO:0030221 basophil differentiation;GO:0060375 regulation of mast cell factor 1, erythroid differentiation;GO:0060217 hemangioblast cell differentiation differentiation factor GATA2 2624 GATA binding protein 2 GO:0035854 eosinophil fate commitment;GO:1903589 positive regulation of blood vessel endothelial cell proliferation involved in sprouting angiogenesis;GO:0010725 regulation of primitive erythrocyte differentiation TGFB1 7040 transforming growth GO:0048298 positive regulation of isotype switching to IgA factor beta 1 isotypes;GO:0090190 positive regulation of branching involved in ureteric bud morphogenesis;GO:0048296 regulation of isotype switching to IgA isotypes HIST1H3I 8354 histone cluster 1 H3 GO:0038111 interleukin-7-mediated signaling pathway;GO:0006335 DNA replication- family member i dependent nucleosome assembly;GO:0098761 cellular response to interleukin-7 MYC 4609 MYC proto-oncogene, GO:0090096 positive regulation of metanephric cap mesenchymal cell bHLH transcription factor proliferation;GO:0090095 regulation of metanephric cap mesenchymal cell proliferation;GO:0090094 metanephric cap mesenchymal cell proliferation involved in metanephros development EGR3 1960 early growth response 3 GO:0033089 positive regulation of T cell differentiation in thymus;GO:0045586 regulation of gamma-delta T cell differentiation;GO:0042492 gamma-delta T cell differentiation SETD1A 9739 SET domain containing 1A GO:1902036 regulation of hematopoietic stem cell differentiation;GO:0045652 regulation of megakaryocyte differentiation;GO:1901532 regulation of hematopoietic progenitor cell differentiation SOS1 6654 SOS Ras/Rac GO:2000973 regulation of pro-B cell differentiation;GO:1904693 midbrain guanine nucleotide morphogenesis;GO:1905456 regulation of lymphoid progenitor cell differentiation exchange factor 1 RUNX3 864 runt related GO:0043378 positive regulation of CD8-positive, alpha-beta transcription factor 3 T cell differentiation;GO:0043376 regulation of CD8-positive, alpha-beta T cell differentiation;GO:0043371 negative regulation of CD4-positive, alpha-beta T cell differentiation MIF 4282 macrophage migration GO:0061078 positive regulation of prostaglandin secretion involved in inhibitory factor immune response;GO:0090238 positive regulation of arachidonic acid secretion;GO:0090323 prostaglandin secretion involved in immune response HIST1H3A 8350 histone cluster 1 H3 GO:0038111 interleukin-7-mediated signaling pathway;GO:0006335 DNA replication- family member a dependent nucleosome assembly;GO:0098761 cellular response to interleukin-7

141

137853-gerritsen-layout.indd 141 14/01/2020 14:13 Chapter 5

Table S6: Continued.

Input ID Gene ID Description Biological Process (GO)

TCF3 6929 transcription factor 3 GO:0002326 B cell lineage commitment;GO:0033152 immunoglobulin V(D)J recombination;GO:1902036 regulation of hematopoietic stem cell differentiation SRF 6722 serum response factor GO:0003257 positive regulation of transcription from RNA polymerase II promoter involved in myocardial precursor cell differentiation;GO:0046016 positive regulation of transcription by glucose;GO:0045059 positive thymic T cell selection JUN 3725 Jun proto-oncogene, AP-1 GO:0045657 positive regulation of monocyte differentiation;GO:0043922 transcription factor subunit negative regulation by host of viral transcription;GO:0043923 positive regulation by host of viral transcription ZNF385A 25946 zinc finger protein 385A GO:1902164 positive regulation of DNA damage response, signal transduction by p53 class mediator resulting in transcription of p21 class mediator;GO:0006977 DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest;GO:0072431 signal transduction involved in mitotic G1 DNA damage checkpoint HIST1H3C 8352 histone cluster 1 H3 GO:0038111 interleukin-7-mediated signaling pathway;GO:0006335 DNA replication- family member c dependent nucleosome assembly;GO:0098761 cellular response to interleukin-7 PML 5371 promyelocytic leukemia GO:1902187 negative regulation of viral release from host cell;GO:0006977 DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest;GO:0050713 negative regulation of interleukin-1 beta secretion ASH2L 9070 ASH2 like, histone lysine GO:0045652 regulation of megakaryocyte differentiation;GO:1904837 beta- methyltransferase catenin-TCF complex assembly;GO:0051568 histone H3-K4 methylation complex subunit LDB1 8861 LIM domain binding 1 GO:0021702 cerebellar Purkinje cell differentiation;GO:0043973 histone H3- K4 acetylation;GO:0021694 cerebellar Purkinje cell layer formation NOTCH1 4851 notch 1 GO:0003270 Notch signaling pathway involved in regulation of secondary heart field cardioblast proliferation;GO:0072144 glomerular mesangial cell development;GO:0003252 negative regulation of cell proliferation involved in heart valve morphogenesis HIST1H4K 8362 histone cluster 1 H4 GO:0045653 negative regulation of megakaryocyte differentiation;GO:0034080 family member k CENP-A containing nucleosome assembly;GO:0006335 DNA replication-dependent nucleosome assembly HIST1H4J 8363 histone cluster 1 H4 GO:0045653 negative regulation of megakaryocyte differentiation;GO:0034080 family member j CENP-A containing nucleosome assembly;GO:0006335 DNA replication-dependent nucleosome assembly HIST1H4L 8368 histone cluster 1 H4 GO:0045653 negative regulation of megakaryocyte differentiation;GO:0034080 family member l CENP-A containing nucleosome assembly;GO:0006335 DNA replication-dependent nucleosome assembly HIST1H4B 8366 histone cluster 1 H4 GO:0045653 negative regulation of megakaryocyte differentiation;GO:0034080 family member b CENP-A containing nucleosome assembly;GO:0006335 DNA replication-dependent nucleosome assembly NCOA6 23054 nuclear receptor GO:0042921 glucocorticoid receptor signaling pathway;GO:0031958 coactivator 6 corticosteroid receptor signaling pathway;GO:0030520 intracellular estrogen receptor signaling pathway PRKCQ 5588 protein kinase C theta GO:2000570 positive regulation of T-helper 2 cell activation;GO:2000569 regulation of T-helper 2 cell activation;GO:0035712 T-helper 2 cell activation SOD1 6647 superoxide dismutase 1 GO:0060088 auditory receptor cell stereocilium organization;GO:0045541 negative regulation of cholesterol biosynthetic process;GO:0032287 peripheral nervous system myelin maintenance HIST1H3H 8357 histone cluster 1 H3 GO:0038111 interleukin-7-mediated signaling pathway;GO:0006335 DNA replication- family member h dependent nucleosome assembly;GO:0098761 cellular response to interleukin-7 PRMT6 55170 protein arginine GO:0051572 negative regulation of histone H3-K4 methylation;GO:0034970 methyltransferase 6 histone H3-R2 methylation;GO:0043985 histone H4-R3 methylation

142

137853-gerritsen-layout.indd 142 14/01/2020 14:13 Ring sideroblasts in AML

Table S6: Continued.

Input ID Gene ID Description Biological Process (GO)

TNRC6C 57690 trinucleotide repeat GO:0060213 positive regulation of nuclear-transcribed mRNA poly(A) tail containing 6C shortening;GO:0060211 regulation of nuclear-transcribed mRNA poly(A) tail shortening;GO:1900153 positive regulation of nuclear-transcribed mRNA catabolic process, deadenylation-dependent decay WASF2 10163 WAS protein family GO:0051497 negative regulation of stress fiber assembly;GO:0038096 Fc- member 2 gamma receptor signaling pathway involved in phagocytosis;GO:0032232 negative regulation of actin filament bundle assembly SCRIB 23513 scribbled planar cell GO:0039502 suppression by virus of host type I interferon-mediated polarity protein signaling pathway;GO:0039503 suppression by virus of host innate immune response;GO:0039563 suppression by virus of host STAT1 activity ABL1 25 ABL proto-oncogene GO:0002333 transitional one stage B cell differentiation;GO:0051281 1, non-receptor positive regulation of release of sequestered calcium ion into tyrosine kinase cytosol;GO:1905555 positive regulation blood vessel branching ATPIF1 93974 ATP synthase inhibitory GO:1901030 positive regulation of mitochondrial outer membrane factor subunit 1 permeabilization involved in apoptotic signaling pathway;GO:1901028 regulation of mitochondrial outer membrane permeabilization involved in apoptotic signaling pathway;GO:1904925 positive regulation of autophagy of mitochondrion in response to mitochondrial depolarization 5 PKN1 5585 protein kinase N1 GO:0035407 histone H3-T11 phosphorylation;GO:0007257 activation of JUN kinase activity;GO:0030889 negative regulation of B cell proliferation BCR 613 BCR, RhoGEF and GTPase GO:0043314 negative regulation of neutrophil degranulation;GO:1902564 negative activating protein regulation of neutrophil activation;GO:0043313 regulation of neutrophil degranulation MED1 5469 mediator complex GO:0060750 epithelial cell proliferation involved in mammary gland subunit 1 duct elongation;GO:0070562 regulation of vitamin D receptor signaling pathway;GO:2000347 positive regulation of hepatocyte proliferation HIST1H4C 8364 histone cluster 1 H4 GO:0045653 negative regulation of megakaryocyte differentiation;GO:0034080 family member c CENP-A containing nucleosome assembly;GO:0006335 DNA replication-dependent nucleosome assembly C6orf25 80739 megakaryocyte and GO:0035855 megakaryocyte development;GO:0030220 platelet platelet inhibitory formation;GO:0036344 platelet morphogenesis receptor G6b RHAG 6005 Rh associated glycoprotein GO:0035378 carbon dioxide transmembrane transport;GO:0060586 multicellular organismal iron ion homeostasis;GO:0015670 carbon dioxide transport UBASH3A 53347 ubiquitin associated and GO:0050860 negative regulation of T cell receptor signaling pathway;GO:0050858 SH3 domain containing A negative regulation of antigen receptor-mediated signaling pathway;GO:0050856 regulation of T cell receptor signaling pathway MPL 4352 MPL proto-oncogene, GO:1905221 positive regulation of platelet formation;GO:1905219 thrombopoietin receptor regulation of platelet formation;GO:1990959 eosinophil homeostasis IL7 3574 interleukin 7 GO:0038111 interleukin-7-mediated signaling pathway;GO:0002360 T cell lineage commitment;GO:2001240 negative regulation of extrinsic apoptotic signaling pathway in absence of ligand CSF1 1435 colony stimulating factor 1 GO:1904141 positive regulation of microglial cell migration;GO:1902228 positive regulation of macrophage colony-stimulating factor signaling pathway;GO:1904139 regulation of microglial cell migration KLF1 10661 Kruppel like factor 1 GO:0060135 maternal process involved in female pregnancy;GO:0030218 erythrocyte differentiation;GO:0034101 erythrocyte homeostasis SERPING1 710 serpin family G member 1 GO:0001869 negative regulation of complement activation, lectin pathway;GO:0001868 regulation of complement activation, lectin pathway;GO:0045916 negative regulation of complement activation CBFA2T3 863 CBFA2/RUNX1 GO:0045820 negative regulation of glycolytic process;GO:2001170 negative translocation partner 3 regulation of ATP biosynthetic process;GO:0006110 regulation of glycolytic process

143

137853-gerritsen-layout.indd 143 14/01/2020 14:13 Chapter 5

Table S6: Continued.

Input ID Gene ID Description Biological Process (GO)

FOXJ1 2302 forkhead box J1 GO:0072016 glomerular parietal epithelial cell development;GO:1901248 positive regulation of lung ciliated cell differentiation;GO:0002897 positive regulation of central B cell tolerance induction AHSP 51327 alpha hemoglobin GO:0020027 hemoglobin metabolic process;GO:0030218 erythrocyte stabilizing protein differentiation;GO:0034101 erythrocyte homeostasis PAWR 5074 pro-apoptotic GO:1901300 positive regulation of hydrogen peroxide-mediated programmed cell WT1 regulator death;GO:0050860 negative regulation of T cell receptor signaling pathway;GO:0042986 positive regulation of amyloid precursor protein biosynthetic process FBN1 2200 fibrillin 1 GO:2001205 negative regulation of osteoclast development;GO:0035583 sequestering of TGFbeta in extracellular matrix;GO:0035582 sequestering of BMP in extracellular matrix F2RL1 2150 F2R like trypsin receptor 1 GO:0070963 positive regulation of neutrophil mediated killing of gram-negative bacterium;GO:1900135 positive regulation of renin secretion into blood stream;GO:0070962 positive regulation of neutrophil mediated killing of bacterium NRARP 441478 NOTCH regulated GO:1902367 negative regulation of Notch signaling pathway involved in ankyrin repeat protein somitogenesis;GO:1902366 regulation of Notch signaling pathway involved in somitogenesis;GO:1902359 Notch signaling pathway involved in somitogenesis NR1D1 9572 nuclear receptor subfamily GO:0034144 negative regulation of toll-like receptor 4 signaling 1 group D member 1 pathway;GO:0070859 positive regulation of bile acid biosynthetic process;GO:0034143 regulation of toll-like receptor 4 signaling pathway RNF26 79102 ring finger protein 26 GO:1905719 protein localization to perinuclear region of cytoplasm;GO:0050687 negative regulation of defense response to virus;GO:0070979 protein K11-linked ubiquitination ZFP36L2 678 ZFP36 ring finger GO:1900153 positive regulation of nuclear-transcribed mRNA catabolic protein like 2 process, deadenylation-dependent decay;GO:0061158 3’-UTR-mediated mRNA destabilization;GO:1900151 regulation of nuclear-transcribed mRNA catabolic process, deadenylation-dependent decay ARG2 384 arginase 2 GO:1905403 negative regulation of activated CD8-positive, alpha-beta T cell apoptotic process;GO:2000562 negative regulation of CD4-positive, alpha-beta T cell proliferation;GO:2000666 negative regulation of interleukin-13 secretion KLF2 10365 Kruppel like factor 2 GO:0060509 type I pneumocyte differentiation;GO:0071409 cellular response to cycloheximide;GO:0097533 cellular stress response to acid chemical PTPRS 5802 protein tyrosine GO:0034164 negative regulation of toll-like receptor 9 signaling phosphatase, pathway;GO:0034163 regulation of toll-like receptor 9 signaling receptor type S pathway;GO:0048671 negative regulation of collateral sprouting TRIM58 25893 tripartite motif GO:0061931 positive regulation of erythrocyte enucleation;GO:0061930 containing 58 regulation of erythrocyte enucleation;GO:0043131 erythrocyte enucleation TNFRSF21 27242 TNF receptor superfamily GO:2000663 negative regulation of interleukin-5 secretion;GO:2000666 member 21 negative regulation of interleukin-13 secretion;GO:2001180 negative regulation of interleukin-10 secretion ZMIZ1 57178 zinc finger MIZ-type GO:0007296 vitellogenesis;GO:0045582 positive regulation of T cell containing 1 differentiation;GO:0045621 positive regulation of lymphocyte differentiation TICAM1 148022 toll like receptor GO:0034128 negative regulation of MyD88-independent toll-like receptor signaling adaptor molecule 1 pathway;GO:0034127 regulation of MyD88-independent toll-like receptor signaling pathway;GO:0035666 TRIF-dependent toll-like receptor signaling pathway CD59 966 CD59 molecule (CD59 GO:0001971 negative regulation of activation of membrane attack blood group) complex;GO:0001969 regulation of activation of membrane attack complex;GO:0045916 negative regulation of complement activation DYRK3 8444 dual specificity tyrosine GO:0035617 stress granule disassembly;GO:0043518 negative regulation of phosphorylation DNA damage response, signal transduction by p53 class mediator;GO:0043516 regulated kinase 3 regulation of DNA damage response, signal transduction by p53 class mediator

144

137853-gerritsen-layout.indd 144 14/01/2020 14:13 Ring sideroblasts in AML

Table S6: Continued.

Input ID Gene ID Description Biological Process (GO)

HLA-H 3077 homeostatic iron regulator GO:1904283 negative regulation of antigen processing and presentation of endogenous peptide antigen via MHC class I;GO:0002626 negative regulation of T cell antigen processing and presentation;GO:1904282 regulation of antigen processing and presentation of endogenous peptide antigen via MHC class I JUNB 3726 JunB proto-oncogene, GO:0001829 trophectodermal cell differentiation;GO:0060716 labyrinthine AP-1 transcription layer blood vessel development;GO:0046697 decidualization factor subunit TOB2 10766 transducer of ERBB2, 2 GO:0045671 negative regulation of osteoclast differentiation;GO:0002762 negative regulation of myeloid leukocyte differentiation;GO:0045670 regulation of osteoclast differentiation CLPTM1 1209 CLPTM1, transmembrane GO:0033081 regulation of T cell differentiation in thymus;GO:0033077 T cell protein differentiation in thymus;GO:0045580 regulation of T cell differentiation TGFB3 7043 transforming growth GO:0042704 uterine wall breakdown;GO:0060364 frontal suture factor beta 3 morphogenesis;GO:0010936 negative regulation of macrophage cytokine production EIF2AK1 27102 eukaryotic translation GO:0045993 negative regulation of translational initiation by iron;GO:0010999 initiation factor 2 regulation of eIF2 alpha phosphorylation by heme;GO:0046986 alpha kinase 1 negative regulation of hemoglobin biosynthetic process PCID2 55795 PCI domain containing 2 GO:0090267 positive regulation of mitotic cell cycle spindle assembly 5 checkpoint;GO:0090266 regulation of mitotic cell cycle spindle assembly checkpoint;GO:1903504 regulation of mitotic spindle checkpoint SPN 6693 sialophorin GO:0002296 T-helper 1 cell lineage commitment;GO:0002295 T-helper cell lineage commitment;GO:0045063 T-helper 1 cell differentiation PIP4K2A 5305 phosphatidylinositol- GO:2000786 positive regulation of autophagosome assembly;GO:0035855 5-phosphate 4-kinase megakaryocyte development;GO:2000785 regulation of autophagosome assembly type 2 alpha FBXO7 25793 F-box protein 7 GO:0045736 negative regulation of cyclin-dependent protein serine/threonine kinase activity;GO:1903204 negative regulation of oxidative stress-induced neuron death;GO:1903599 positive regulation of autophagy of mitochondrion FAM3A 60343 family with sequence GO:1905035 negative regulation of antifungal innate immune similarity 3 member A response;GO:1905034 regulation of antifungal innate immune response;GO:0061760 antifungal innate immune response PABPC4 8761 poly(A) binding protein GO:0043488 regulation of mRNA stability;GO:0061515 myeloid cell cytoplasmic 4 development;GO:0061013 regulation of mRNA catabolic process

145

137853-gerritsen-layout.indd 145 14/01/2020 14:13 Chapter 5

Table S7: Common upregulated genes

Input ID Gene ID Description Biological Process (GO)

PRKCQ 5588 protein kinase C theta GO:2000570 positive regulation of T-helper 2 cell activation;GO:2000569 regulation of T-helper 2 cell activation;GO:0035712 T-helper 2 cell activation PARP9 83666 poly(ADP-ribose) GO:0034356 NAD biosynthesis via nicotinamide riboside salvage polymerase family pathway;GO:0060335 positive regulation of interferon-gamma- member 9 mediated signaling pathway;GO:2001034 positive regulation of double-strand break repair via nonhomologous end joining ALAS2 212 5’-aminolevulinate GO:0006782 protoporphyrinogen IX biosynthetic process;GO:0032364 synthase 2 oxygen homeostasis;GO:0033483 gas homeostasis NFE2 4778 nuclear factor, erythroid 2 GO:0045652 regulation of megakaryocyte differentiation;GO:0006337 nucleosome disassembly;GO:0031498 chromatin disassembly NCF4 4689 neutrophil cytosolic GO:0048010 vascular endothelial growth factor receptor signaling factor 4 pathway;GO:0045730 respiratory burst;GO:0006801 superoxide metabolic process ADCY7 113 adenylate cyclase 7 GO:0006171 cAMP biosynthetic process;GO:0071377 cellular response to glucagon stimulus;GO:0003091 renal water homeostasis HBG2 3048 hemoglobin subunit GO:0015671 oxygen transport;GO:0042744 hydrogen peroxide gamma 2 catabolic process;GO:0015669 gas transport SLC4A1 6521 solute carrier family GO:0051453 regulation of intracellular pH;GO:0030641 regulation 4 member 1 (Diego of cellular pH;GO:0015701 bicarbonate transport blood group) HBA2 3040 hemoglobin GO:0015671 oxygen transport;GO:0015701 bicarbonate subunit alpha 2 transport;GO:0042744 hydrogen peroxide catabolic process HBB 3043 hemoglobin subunit beta GO:0030185 nitric oxide transport;GO:0045429 positive regulation of nitric oxide biosynthetic process;GO:0070527 platelet aggregation SLC29A3 55315 solute carrier family GO:1901642 nucleoside transmembrane transport;GO:0015858 29 member 3 nucleoside transport;GO:1901264 carbohydrate derivative transport PFKFB4 5210 6-phosphofructo-2- GO:0045821 positive regulation of glycolytic process;GO:2001171 positive regulation kinase/fructose-2,6- of ATP biosynthetic process;GO:0006110 regulation of glycolytic process biphosphatase 4 SLC43A1 8501 solute carrier family GO:1902475 L-alpha-amino acid transmembrane transport;GO:0015804 43 member 1 neutral amino acid transport;GO:0015807 L-amino acid transport BATF 10538 basic leucine zipper ATF- GO:0072540 T-helper 17 cell lineage commitment;GO:0002295 T-helper like transcription factor cell lineage commitment;GO:0045064 T-helper 2 cell differentiation PML 5371 promyelocytic leukemia GO:1902187 negative regulation of viral release from host cell;GO:0006977 DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest;GO:0050713 negative regulation of interleukin-1 beta secretion IFITM1 8519 interferon induced GO:0046597 negative regulation of viral entry into host cell;GO:0046596 regulation of transmembrane protein 1 viral entry into host cell;GO:0045071 negative regulation of viral genome replication DTX3L 151636 deltex E3 ubiquitin GO:1902966 positive regulation of protein localization to early endosome;GO:2001034 ligase 3L positive regulation of double-strand break repair via nonhomologous end joining;GO:1902965 regulation of protein localization to early endosome IFIT3 3437 interferon induced protein GO:0035457 cellular response to interferon-alpha;GO:0060337 type I with tetratricopeptide interferon signaling pathway;GO:0035455 response to interferon-alpha repeats 3 APOBEC3D 140564 apolipoprotein B GO:0045869 negative regulation of single stranded viral RNA replication via mRNA editing enzyme double stranded DNA intermediate;GO:0045091 regulation of single stranded catalytic subunit 3D viral RNA replication via double stranded DNA intermediate;GO:0039692 single stranded viral RNA replication via double stranded DNA intermediate SLAMF6 114836 SLAM family member 6 GO:0072540 T-helper 17 cell lineage commitment;GO:0002295 T-helper cell lineage commitment;GO:0072539 T-helper 17 cell differentiation

146

137853-gerritsen-layout.indd 146 14/01/2020 14:13 Ring sideroblasts in AML

Table S7: Continued

Input ID Gene ID Description Biological Process (GO)

TESC 54997 tescalcin GO:0032417 positive regulation of sodium:proton antiporter activity;GO:0032415 regulation of sodium:proton antiporter activity;GO:0030854 positive regulation of granulocyte differentiation KBTBD7 84078 kelch repeat and BTB GO:0043687 post-translational protein modification;GO:0000165 domain containing 7 MAPK cascade;GO:0016567 protein ubiquitination HECTD3 79654 HECT domain E3 ubiquitin GO:0043161 proteasome-mediated ubiquitin-dependent protein protein ligase 3 catabolic process;GO:0006511 ubiquitin-dependent protein catabolic process;GO:0010498 proteasomal protein catabolic process DHRS9 10170 dehydrogenase/ GO:0042904 9-cis-retinoic acid biosynthetic process;GO:0042905 9-cis-retinoic reductase 9 acid metabolic process;GO:0002138 retinoic acid biosynthetic process PDE6G 5148 phosphodiesterase 6G GO:0022400 regulation of rhodopsin mediated signaling pathway;GO:0016056 rhodopsin mediated signaling pathway;GO:0045742 positive regulation of epidermal growth factor receptor signaling pathway DFFB 1677 DNA fragmentation GO:0030263 apoptotic chromosome condensation;GO:0006309 apoptotic factor subunit beta DNA fragmentation;GO:0030262 apoptotic nuclear changes OMA1 115209 OMA1 zinc GO:0010637 negative regulation of mitochondrial fusion;GO:0002024 diet metallopeptidase induced thermogenesis;GO:0010635 regulation of mitochondrial fusion 5 NETO2 81831 neuropilin and tolloid like 2 GO:2000312 regulation of kainate selective glutamate receptor activity;GO:1901016 regulation of potassium ion transmembrane transporter activity;GO:2000649 regulation of sodium ion transmembrane transporter activity KCNE3 10008 potassium voltage-gated GO:1905025 negative regulation of membrane repolarization during channel subfamily E ventricular cardiac muscle cell action potential;GO:1903765 negative regulatory subunit 3 regulation of potassium ion export across plasma membrane;GO:1903946 negative regulation of ventricular cardiac muscle cell action potential PTH2R 5746 parathyroid hormone GO:0007186 G-protein coupled receptor signaling pathway;GO:0007166 cell 2 receptor surface receptor signaling pathway;GO:0007165 signal transduction GPA33 10223 glycoprotein A33 GO:0007165 signal transduction;GO:0023052 signaling;GO:0007154 cell communication MAL 4118 mal, T cell differentiation GO:0001766 membrane raft polarization;GO:0031580 membrane protein raft distribution;GO:1902043 positive regulation of extrinsic apoptotic signaling pathway via death domain receptors FRRS1 391059 ferric chelate reductase 1 GO:0055114 oxidation-reduction process;GO:0008152 metabolic process;GO:0008150 biological_process AKR1E2 83592 aldo-keto reductase GO:0055114 oxidation-reduction process;GO:0008152 family 1 member E2 metabolic process;GO:0008150 biological_process ZMAT3 64393 zinc finger matrin-type 3 GO:0043065 positive regulation of apoptotic process;GO:0043068 positive regulation of programmed cell death;GO:0010942 positive regulation of cell death PARS2 25973 prolyl-tRNA synthetase GO:0006433 prolyl-tRNA aminoacylation;GO:0006418 tRNA aminoacylation 2, mitochondrial for protein translation;GO:0043039 tRNA aminoacylation TTLL11 158135 tubulin tyrosine GO:0018095 protein polyglutamylation;GO:0051013 microtubule ligase like 11 severing;GO:0018200 peptidyl-glutamic acid modification SLA 6503 Src like adaptor GO:0038083 peptidyl-tyrosine autophosphorylation;GO:0018108 peptidyl- tyrosine phosphorylation;GO:0046777 protein autophosphorylation TNFAIP8L2 79626 TNF alpha induced GO:0050868 negative regulation of T cell activation;GO:1903038 protein 8 like 2 negative regulation of leukocyte cell-cell adhesion;GO:0051250 negative regulation of lymphocyte activation FRAT2 23401 FRAT2, WNT signaling GO:1904886 beta-catenin destruction complex disassembly;GO:0060070 canonical pathway regulator Wnt signaling pathway;GO:0032984 protein-containing complex disassembly CD82 3732 CD82 molecule GO:0007166 cell surface receptor signaling pathway;GO:0007165 signal transduction;GO:0023052 signaling

147

137853-gerritsen-layout.indd 147 14/01/2020 14:13 Chapter 5

Table S7: Continued

Input ID Gene ID Description Biological Process (GO)

LY6E 4061 lymphocyte antigen GO:0007166 cell surface receptor signaling pathway;GO:0007165 6 family member E signal transduction;GO:0023052 signaling SEMA4B 10509 semaphorin 4B GO:0048843 negative regulation of axon extension involved in axon guidance;GO:0048841 regulation of axon extension involved in axon guidance;GO:1902668 negative regulation of axon guidance CPEB4 80315 cytoplasmic GO:2000766 negative regulation of cytoplasmic translation;GO:0042149 polyadenylation element cellular response to glucose starvation;GO:0035235 binding protein 4 ionotropic glutamate receptor signaling pathway ZFP64 55734 ZFP64 zinc finger protein GO:0031060 regulation of histone methylation;GO:0031056 regulation of histone modification;GO:0016571 histone methylation FANCF 2188 FA complementation GO:0036297 interstrand cross-link repair;GO:0006281 DNA group F repair;GO:0006974 cellular response to DNA damage stimulus DCP1B 196513 decapping mRNA 1B GO:0031087 deadenylation-independent decapping of nuclear- transcribed mRNA;GO:0000290 deadenylation-dependent decapping of nuclear-transcribed mRNA;GO:0031086 nuclear-transcribed mRNA catabolic process, deadenylation-independent decay

148

137853-gerritsen-layout.indd 148 14/01/2020 14:13 Ring sideroblasts in AML

5 . Image represents screenshot taken following analysis using Nexus software. using Nexus analysis taken following screenshot . Image represents of all individual patients with an overview results 1 – SNP array Supplementary Figure

149

137853-gerritsen-layout.indd 149 14/01/2020 14:13 Chapter 5

Supplementary Figure 2: Example of sorting strategy for erythroblasts. Cells are sorted as CD34-/CD33- and CD71+/CD235a+.

Supplementary Figure 3: A. Functional annotation of upregulated genes (n=1196) in RS-AML CD34+ cells vs. NBM CD34+ cells. B. Functional annotation of downregulated genes (n=1309) in RS-AML CD34+ cells vs. NBM CD34+ cells. C. Functional annotation of downregulated genes in RS-AML CD34+ cells vs. non-RS-AML cells. D. MEP signature analysis of AMLs in the non-RS-AML in CD33 vs. CD34 sorted AMLs.

150

137853-gerritsen-layout.indd 150 14/01/2020 14:13 Ring sideroblasts in AML

5

Supplementary Figure 4: A. Functional annotation of genes in cluster 1 shown in Figure 6A. B. Functional annotation of genes in cluster 2 shown in Figure 6A. C. Functional annotation of genes in cluster 3 shown in Figure 6A. D. Functional annotation of genes in cluster 4 shown in Figure 6A. E. Functional annotation of genes in cluster 6 shown in Figure 6A.

151

137853-gerritsen-layout.indd 151 14/01/2020 14:13 Chapter 5

Supplementary Figure 5: A. Gene overlap of common downregulated genes between RS-AML, SF3B1mut AML and SF3B1mut MDS versus control. Biological process enrichment is shown for commonly downregulated genes. B. Expression of SF3B1, PRPF8 and ABCB7 in indicated groups. One-way Anova followed by pairwise t-test was used to determine statistical significance, * p = < 0.05; ** p = < 0.01; *** p = < 0.001; **** p = < 0.0001. C. Overlap of differentially spliced genes in Dolatshad et al. (2016) and our dataset (RS-AML vs. NBM). D. Overlap of differentially spliced genes in Dolatshad et al. (2016) and genes with decreased gene expression (1Log) in our dataset (RS-AML vs. non-RS-AML)

152

137853-gerritsen-layout.indd 152 14/01/2020 14:13 137853-gerritsen-layout.indd 153 14/01/2020 14:13