Supporting Information
Supplementary Methods
Patients for whole genome sequencing and validation cohort. Heparinized bone marrow samples were obtained, with informed consent, from 8 RAEB patients for WGS, as approved by the ethics review board of Shanghai Institute of Hematology. Of these 8 patients, 4 had RAEB-1 and 4 had RAEB-2; 5 were male and 3 female; 1 had a complex karyotype, 1 had +8 and 5 had a normal karyotype; all were classified as intermediate to very high risk. Six patients died 4-23 months after diagnosis from infection, hemorrhage, cerebral infarction or evolution to AML (for complete information, see Table S1). The validation cohort consisted of 188 patients with various subtypes of MDS, diagnosed and treated in Shanghai Ruijin Hospital and Shanghai No.6 People’s Hospital. All patients provided written informed consent. Bone marrow and paired buccal samples were obtained after informed consent.
DNA sample preparation. Mononuclear cells (MNC) were separated by Ficoll density gradient centrifugation from the 8 RAEB patients and the 188 MDS patients of the validation cohort. In the 8 RAEB patients, CD34+ cells were subsequently isolated by magnetic cell separation (Miltenyi Biotec, Bergisch Gladbach, Germany) to a purity of 89-97.7% (average: 93.1%). Flow-through CD34- cells were also collected for analysis. A skin biopsy was obtained for analysis of the normal genome, and its DNA was extracted with the DNeasy Blood & Tissue Kit (Qiagen). Genomic DNA of CD34+ cells was isolated with the QuickGene DNA whole blood kit L (FUJIFILM, Life Science). Genomic DNA of MNC from the validation set was extracted with the Wizard® Genomic DNA Purification Kit (Promega).
DNA library preparation. Genomic DNA was sheared by sonication and adaptors were ligated to the resulting fragments. The adaptor-ligated templates were fractionated by agarose gel electrophoresis and fragments of the desired size were excised. The resulting fragments were amplified by ligation-mediated PCR, purified and subjected to DNA sequencing on the Illumina platform.
Massively parallel sequencing. Cluster generation on the Illumina cluster station proceeded as follows: template hybridization, isothermal amplification, linearization, blocking, denaturation and sequencing primer hybridization. Deep sequencing of the captured libraries was then performed on the Illumina GAIIx and HiSeq2000, producing 2×120 bp (base pairs) paired-end reads according to the manufacturer’s protocols. Image analysis and base calling were performed by Illumina RTA version 1.6 with default parameters.
Alignment, SNV/INDEL calling and quality control. The third-party software BWA (1) was used to align the paired-end reads to the reference human genome (hg19, http://genome.ucsc.edu/) with default parameters. Variations, including SNVs and INDELs, were called with the Samtools software package (2). For cases, calls were filtered with the recommended thresholds (SNV quality ≥20, INDEL quality ≥50 and at least 3 covering reads). To preserve filtering power and minimize the false discovery rate, looser criteria were applied to control variations (SNV quality ≥10, INDEL quality ≥10 and ≥3 covering reads). For plotting figures and for clonality analysis based on SNVs, a strict threshold was adopted: SNV quality ≥100, case depth ≥30, mapping quality ≥55 and control depth ≥30.
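The thresholds above can be written down as simple predicates. The following is a minimal sketch, not the authors' pipeline: the `Variant` record, the function names and the rule that a case call is kept only when no matching control call passes the loose filter are illustrative assumptions; only the numeric thresholds come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical minimal record for one called variant."""
    kind: str       # "SNV" or "INDEL"
    quality: float  # variant quality reported by the caller
    depth: int      # number of reads covering the site


# Quality thresholds quoted in the text.
CASE_THRESHOLDS = {"SNV": 20, "INDEL": 50}     # recommended filter for cases
CONTROL_THRESHOLDS = {"SNV": 10, "INDEL": 10}  # looser filter for controls
MIN_DEPTH = 3                                  # at least 3 covering reads


def passes(variant: Variant, thresholds: dict, min_depth: int = MIN_DEPTH) -> bool:
    """True if the variant meets the quality and depth thresholds."""
    return variant.quality >= thresholds[variant.kind] and variant.depth >= min_depth


def is_candidate_somatic(case: Variant, control: Optional[Variant]) -> bool:
    """Assumed rule: keep a case call unless the control also shows it."""
    if not passes(case, CASE_THRESHOLDS):
        return False
    if control is None:
        return True
    return not passes(control, CONTROL_THRESHOLDS)


def passes_strict(snv_quality: float, case_depth: int,
                  mapping_quality: float, control_depth: int) -> bool:
    """Strict SNV filter used for figures and clonality analysis."""
    return (snv_quality >= 100 and case_depth >= 30
            and mapping_quality >= 55 and control_depth >= 30)
```

Applying a looser filter to the control makes it easier for a germline variant to be recognized as present in the control, which is one way to read the stated aim of minimizing the false discovery rate.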
Targeted gene resequencing. To determine the recurrent mutations in highlighted genes, we designed PCR primers following the Fluidigm guidelines using iPLEX AssayDesigner software. The Fluidigm Access Array microfluidic platform was used to generate highly multiplexed libraries of tagged amplicons from MDS patients. Deep resequencing was performed on the Illumina GAIIx/MiSeq platform.
Somatic copy number variation (CNV) and uniparental disomy (UPD) detection. DNA from each tumor sample and its matched germline control was prepared for hybridization to the Illumina high-density Genome Wide Human 660W Quad_v1 SNP array (657,366 probes) according to the manufacturer’s protocol. The raw intensity data files (*.idat) were analyzed using the Genotyping Module of Illumina Genome Studio software, version 2011.1. With this software, the normalized Log R Ratio (LRR) and B Allele Frequency (BAF) of all available probes in each sample were extracted. OncoSNP (3) (version 1.1) was used to detect somatic genomic alterations in paired samples. To verify the reliability of CNVs and UPDs, all reported alterations were plotted from their LRR and BAF values with R statistical software (www.r-project.org, version 2.15.1) and checked visually. Only somatic alterations that met the criteria proposed by OncoSNP and PennCNV (4) were kept for further study. CNVs were analyzed with regard to their chromosomal positions (Fig. S3), which showed that amplification of chr8 regions and deletions (DELs) or UPD of chr7 regions were the most common events. Cases A2 and A6 had complex chromosomal aberrations (Table S6), harboring DELs of the TP53-containing regions 17p13.3-p13.1 and 17p13.3-p11.2, respectively, in the presence of TP53 mutations on the remaining allele. By contrast, UPD of regions 4q21.22-q35.2 and 7q32.1-q36.3 was found in A7, and amplification of 8p23.3-q24.3 in A8, the two cases with normal karyotype.
Clonality analysis. Clonality analysis was performed according to a previous report (5). The corresponding figures were plotted from high-quality somatic SNVs that met the strict threshold described above.
Statistical analysis. Student’s t-test was used to compare the average number of coding sequence mutations between groups. The proportions of patients with mutations in RAEB and RCMD were compared by the chi-square test. Fisher’s exact test was applied to determine the co-occurrence of highly recurrent gene mutations. Overall survival (OS) was defined as the time from the date of diagnosis to death or to last follow-up if alive (censored). Progression-free survival (PFS) was calculated from diagnosis to disease progression, defined as relapse, progression to acute leukemia or RAEB phase, or death, or to last follow-up if alive (censored). The Kaplan-Meier method was used to estimate survival and time to progression. All p values were based on 2-sided tests. The statistical analyses were performed with the statistical software package SPSS 19.0 (SPSS Science, Chicago, IL, USA). Univariate analyses were performed among 196 MDS patients to assess the impact of age, gender, WHO subtype, percentage of BM blasts, levels of hemoglobin, platelets and neutrophils, chromosomal aberrations and gene mutations on OS and PFS, and to screen for the main prognostic factors. Male gender (p = 0.025), RAEB subtypes (p = 0.002), a high percentage of BM blasts (p < 0.001), lower hemoglobin (p < 0.001) and the occurrence of gene mutations (p = 0.001) were associated with adverse OS and, except for gender (p = 0.081), these factors were also associated with poor PFS (Table S11). Furthermore, when each of the gene mutations was analyzed separately, mutations of STAG2 (p = 0.007), the cohesin complex family (p = 0.004), DNMT3A (p = 0.024), IDH1/IDH2 (p = 0.01), U2AF1 (p = 0.018), RUNX1 (p = 0.013) and TP53 (p = 0.048) predicted adverse PFS (Table S12). We then performed multivariate analyses of clinical parameters and mutated genes separately. Among clinical factors, the percentage of BM blasts (hazard ratio [HR] = 2.30; 95% CI, 1.56-3.37; p < 0.001 for OS; HR = 2.29; 95% CI, 1.62-3.23; p < 0.001 for PFS) and hemoglobin (HR = 2.56; 95% CI, 1.44-4.57; p = 0.001 for OS; HR = 2.63; 95% CI, 1.46-4.73; p = 0.001 for PFS) were independent adverse prognostic factors for OS and PFS. Among mutated genes, STAG2 (HR = 4.21; 95% CI, 1.07-16.2; p = 0.04), IDH1/IDH2 (HR = 5.67; 95% CI, 1.26-10.9; p = 0.017) and RUNX1 (HR = 4.11; 95% CI, 1.03-5.80; p = 0.043) were independent adverse prognostic factors for PFS (Table S13).
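For readers unfamiliar with the Kaplan-Meier method used for OS and PFS, the product-limit estimate can be sketched in a few lines. This is an illustrative pure-Python version, not the SPSS procedure used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Patients censored at t still count as at risk at t.
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Toy data (hypothetical): five patients, two of them censored.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

Censored patients, such as those alive at last follow-up, contribute to the number at risk until their follow-up ends but never trigger a drop in the curve.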
Supplementary Figures
Fig. S1. (A), (B) Impact of age on the number of somatic mutations in the genomes of bone marrow CD34+ cells among 8 RAEB cases. No significant correlation was observed between the age and the number of all mutations (p = 0.403) (A) or between the age and the number of non-silent mutations in coding sequences (p = 0.487) (B). (C) Proportion of nucleotide transitions and transversions (62.3% vs. 37.7%) in the genomes of bone marrow CD34+ cells among 8 RAEB cases analyzed with WGS. (D) Numbers of distinct SNVs in coding sequences among 8 RAEB cases analyzed with WGS.
Fig. S2. (A) Percentage of nucleotide transitions and transversions in two previously reported exome sequencing studies for MDS (65.0% vs. 35.0%) (MDS-exome-1 and MDS-exome-2) (6, 7) and in a recently reported exome sequencing/WGS study for AML (67.7% vs. 32.3%) (8). (B) Genomic mutation spectrum of SNVs in each of six mutation classes in previously reported studies for MDS (6, 7) and AML (8). Note that C→T is the most prevalent change.
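The six mutation classes and the transition/transversion split referred to in Figs. S1 and S2 follow from a standard convention: each SNV is expressed relative to the pyrimidine of the reference base pair. A minimal sketch under that convention is given below; the function names are hypothetical, and the code writes C→T as "C>T".

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}
TRANSITIONS = {"C>T", "T>C"}


def mutation_class(ref: str, alt: str) -> str:
    """Collapse an SNV into one of six classes (C>A, C>G, C>T, T>A, T>C, T>G)."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a valid SNV: {ref}>{alt}")
    if ref in PURINES:
        # Report the change on the opposite strand, e.g. G>A becomes C>T.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def is_transition(ref: str, alt: str) -> bool:
    """Transitions are purine<->purine or pyrimidine<->pyrimidine changes."""
    return mutation_class(ref, alt) in TRANSITIONS


def transition_fraction(snvs) -> float:
    """Fraction of transitions among an iterable of (ref, alt) pairs."""
    snvs = list(snvs)
    if not snvs:
        raise ValueError("no SNVs given")
    return sum(is_transition(r, a) for r, a in snvs) / len(snvs)
```

With this collapsing, G→A and C→T fall into the same class, which is why only six classes are reported rather than twelve.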
Fig. S3. Types and genomic distribution of somatic copy number variations (CNVs) in the genomes of bone marrow CD34+ cells among 7 RAEB cases. UPD: uniparental disomy. CNVs are shown in colored lines.
Fig. S4. Predicted domain structures of two EWSR1-ASXL1 transcripts, with fusions between sequences for N-terminal 483 aa or 431 aa of EWSR1 and the exon 3 or exon 1 of ASXL1, in a tail-to-tail manner, respectively. EAD: Gln/Pro/Thr-rich region; IQ: IQ domain, binds calmodulin; RRM: RNA recognition motif.
Fig. S5. Circos plot showing a landscape view of 287 somatic mutations of 38 genes in seven functional categories among 145 MDS cases. Ribbons connecting distinct categories of gene abnormalities reflect the concurrent mutations of each two categories, whereas mutual exclusivity may exist in areas that are not connected.
Fig. S6. Frequencies of non-silent gene mutations in MDS vs. AML. Red: gene abnormalities in all MDS cases with mutations; Blue: gene mutations in RAEB; Green: gene mutations in RCMD; Purple: integrated data of gene mutations previously reported in three AML series (8-10).
Fig. S7. (A) Kaplan-Meier estimates of OS for five groups according to IPSS-R. The 3-year OS rates for very low, low, intermediate, high and very high risk were 100%, 80.1±12.9%, 87.9±4.7%, 59.2±8.2% and 24.2±12.5%, respectively (p < 0.001). n: number of cases. (B) Survival analysis of MDS patients with distinct status of gene mutations. Kaplan-Meier estimates of OS for three subgroups according to mutations of 21 marker genes (with mutation frequencies ≥2.5%). The 3-year OS rates for absence of mutation (N = 0), one mutation (N = 1) and ≥2 mutations (N ≥ 2) were 93.2±3.3%, 77.3±6.2% and 66.3±6.2%, respectively (p < 0.001).
Supplementary Tables
Table S1. Clinical characteristics of 8 RAEB patients
Values listed by patient:
Patient ID       A8       A7       A6       A5       A4       A3       A2       A1
WHO subtype      RAEB-2   RAEB-2   RAEB-2   RAEB-2   RAEB-1   RAEB-1   RAEB-1   RAEB-1
Age (yr)         74       63       61       48       80       62       53       43
WGS* consent     Yes      Yes      Yes      Yes      Yes      Yes      Yes      Yes
BM blasts (%)    11       14       14       17       8        8        9        7
Values for which the patient assignment is not recoverable from the table layout:
Sex: M (5 patients); F (3 patients)
BM CD34+ (%) after sorting: 97.7, 93.5, 93.8; 89, 93, 92, 95, 91
ANC (×10^9/L): 0.67, 1.23, 0.82, 1.04, 2.01, 1.01, 0.98; 1.2
Hb (g/L): 108, 113; 76, 98, 67, 92, 76, 45
PLT (×10^9/L): 227, 263; 68, 27, 27, 87, 39, 50
Cytogenetics: 48,XY,+8,+8,+9,der(20)t(20;21),-21; 47,XX,+8; 46,XX; 46,XY (4 patients); N/A
IPSS-R: N/A; 6.5, 5.5, 5.5, 4.5; 5, 5, 8
Risk: N/A; Intermediate; Very High (2 patients); High (4 patients)
Therapy: Decitabine+Chemotherapy (1 patient); Decitabine+Low dose Chemotherapy (2 patients); Supportive Care (4 patients); Allo-HSCT (1 patient)
Status: Alive (2 patients); Dead (6 patients)
OS† (months): 10+, 10+; 17, 23, 10, 10, 14; 4
Cigarette: Quit for 30 years (1 patient); Yes (1 patient); No (6 patients)
Hepatitis: HAV (1 patient); No (7 patients)
Past History: Cholecystitis, hypertension, thyroid adenoma; Diabetes, pneumonectasis, gallstones; Hypertension, tuberculous pleuritis; Hypertension, arrhythmia; Hypertension, diabetes; Hypertension; Appendicitis; Urticaria
*WGS: Whole Genome Sequencing †OS: Overall Survival
Table S2. Sequencing depth and coverage
Patient ID  Sample       Depth  Coverage
                                ≥1X    ≥2X    ≥4X    ≥8X    ≥10X   ≥15X   ≥20X
A1          Bone Marrow  33.0   99.1%  98.7%  97.9%  95.7%  94.3%  89.4%  82.0%
A1          Skin         37.4   99.0%  98.6%  97.5%  94.9%  93.1%  87.6%  80.4%
A2          Bone Marrow  48.9   99.9%  99.9%  99.7%  99.5%  99.3%  98.3%  96.0%
A2          Skin         33.8   99.8%  99.6%  99.3%  98.6%  98.0%  95.4%  90.5%
A3          Bone Marrow  36.6   99.7%  99.6%  99.4%  98.9%  98.5%  96.9%  93.8%
A3          Skin         31.1   99.7%  99.5%  99.0%  97.2%  95.6%  88.7%  77.5%
A4          Bone Marrow  23.3   99.7%  99.3%  97.7%  91.9%  87.8%  75.0%  60.9%
A4          Skin         20.2   99.9%  99.8%  99.3%  97.3%  95.6%  89.4%  80.5%
A5          Bone Marrow  32.5   99.8%  99.7%  99.4%  98.4%  97.5%  93.5%  85.9%
A5          Skin         31.1   99.9%  99.8%  99.5%  98.3%  97.3%  92.8%  84.3%
A6          Bone Marrow  31.6   99.4%  98.8%  97.0%  90.9%  87.0%  76.3%  66.1%
A6          Skin         30.9   99.1%  98.0%  94.7%  85.5%  80.6%  69.5%  60.8%
A7          Bone Marrow  33.0   99.9%  99.9%  99.7%  99.0%  98.3%  95.0%  88.1%
A7          Skin         34.7   99.9%  99.8%  99.6%  98.6%  97.6%  92.8%  83.3%
A8          Bone Marrow  31.9   99.8%  99.7%  99.5%  98.9%  98.2%  94.5%  85.7%
A8          Skin         32.8   99.7%  99.6%  99.4%  98.2%  97.0%  90.9%  79.6%
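The two quantities reported in Table S2 can be illustrated as follows: mean depth is the average number of reads per position, and coverage at ≥kX is the fraction of positions covered by at least k reads. The sketch below assumes a per-position list of depths as input; the function name and the toy depths are hypothetical and do not reproduce how the table was generated.

```python
def depth_summary(depths, thresholds=(1, 2, 4, 8, 10, 15, 20)):
    """Return (mean depth, {k: fraction of positions with depth >= k})."""
    n = len(depths)
    if n == 0:
        raise ValueError("no positions given")
    mean_depth = sum(depths) / n
    coverage = {k: sum(d >= k for d in depths) / n for k in thresholds}
    return mean_depth, coverage


# Toy example (hypothetical): per-position depths for ten positions.
mean_depth, coverage = depth_summary([0, 1, 2, 4, 8, 10, 15, 20, 30, 40])
```

Because the thresholds are nested, coverage can only decrease from ≥1X to ≥20X, as it does in every row of the table.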
Table S3. Summary of SNVs in 8 RAEB patients
Patient ID              A1    A2    A3    A4    A5    A6    A7    A8
All SNVs                1965  2329  2605  1290  1166  2972  2075  1437
Coding region
  Missense              7     12    18    10    7     12    10    6
  Nonsense              2     0     2     1     1     1     1     0
  Synonymous            7     4     4     4     1     2     5     2
Noncoding, transcribed
  5' UTR                2     4     2     3     0     1     3     2
  3' UTR                9     12    18    7     4     18    24    6
  Splice site           1     0     1     1     0     2     0     0
  Intronic              655   820   813   459   365   957   656   466
Intergenic              1282  1476  1746  805   788   1977  1375  955
Table S4. Summary of INDELs* in 8 RAEB patients
Patient ID              A1    A2    A3    A4    A5    A6    A7    A8
All INDELs              1994  2377  2935  5905  2555  3226  2570  3060
CDS                     2     5     5     8     4     2     3     1
Noncoding, transcribed
  5' UTR                1     2     3     1     0     5     3     5
  3' UTR                13    15    13    40    14    18    18    20
  Splice site           0     0     0     0     0     1     0     0
  Intronic              737   829   1045  2169  906   1239  952   1153
Intergenic              1241  1526  1869  3687  1631  1961  1594  1881
*INDEL: Short insertion and deletion
Table S5. Genomic rearrangements in 8 RAEB patients
            Intrachromosomal rearrangement                        Interchromosomal
Patient ID  Deletions    Deletions     ITX*   Insertion           rearrangement     Total
            (≥1000bp)    (50-1000bp)
A1          1            3             0      0                   1                 5
A2          10           0             26     11                  23                70
A3          1            0             0      1                   3                 5
A4          1            2             1      0                   1                 5
A5          2            2             10     0                   65                79
A6          8            4             6      3                   9                 30
A7          0            0             0      0                   0                 0
A8          0            0             0      0                   0                 0
Average     2.9          1.4           5.4    1.9                 12.8              24.3
Total       23           11            43     15                  102               194
*ITX: Intra-chromosome translocation
Table S6. Analysis of somatic CNV* and UPD in CD34+ cells from 7 RAEB patients†
Patient ID  Karyotype                             Chromosome  Altered region    Size (Mb)  Gain/loss/UPD  Gene (position)‡
A1          46,XY                                 -           -                 -          -              -
A2          48,XY,+8,+8,+9,der(20)t(20;21),-21    3           p26.3-p12.3       77.7       Loss           -
                                                  3           p12.2-p12.1       6.4        Loss           -
                                                  3           q11.1-q11.2       2          Loss           -
                                                  3           q12.2-q23         42.5       Loss           -
                                                  5           q11.1-q11.1       0.7        Loss           -
                                                  5           q11.2-q11.2       0.9        Loss           -
                                                  5           q11.2-q11.2       0.9        Loss           -
                                                  5           q12.2-q12.3       2.5        Loss           -
                                                  5           q13.1-q13.2       1.4        Loss           -
                                                  5           q13.2-q35.3       110.6      Loss           -
                                                  7           p22.3-q36.3       159.1      Loss           -
                                                  17          p13.3-p13.1       7.6        Loss           TP53 (17p13.1)
                                                  17          p13.1-p13.1       0.2        Loss           -
                                                  17          p13.1-p13.1       0.8        Loss           -
                                                  18          q12.3-q12.3       1.6        Loss           -
                                                  18          q21.1-q23         32.7       Loss           -
                                                  20          q11.21-q11.21     1.5        Gain           -
                                                  20          q11.23-q13.2      16.8       Loss           -
                                                  22          q11.1-q12.2       13.6       Gain           -
A3          47,XX,+8                              8           p23.3-q24.3       146.3      Gain           SULF1 (8q13.2)
A5          46,XY                                 -           -                 -          -              -
A6          fail                                  1           p36.13-p36.13     0.9        Loss           -
                                                  2           q33.1-q33.1       0.8        Loss           -
                                                  3           q13.33-q24        21.7       Loss           -
                                                  7           p22.3-q11.22      69.7       Loss           -
                                                  7           q21.2-q21.3       2.5        UPD            -
                                                  7           q21.3-q31.31      22.8       Loss           LAMB4 (7q31.1)
                                                  7           q31.31-q32.3      13.5       UPD            -
                                                  7           q32.3-q36.1       20.3       Loss           -
                                                  7           q36.1-q36.3       7.5        UPD            -
                                                  8           p23.3-q24.3       146.3      Gain           -
                                                  9           p24.3-q34.3       141.1      Gain           CIZ1 (9q34.11)
                                                  12          p13.31-p12.1      15.9       Loss           -
                                                  12          p12.1-p12.1       0.9        Loss           -
                                                  12          q13.13-q13.2      2.4        Loss           -
                                                  16          p13.3-q24.3       90.2       Gain           -
                                                  17          p13.3-p11.2       17.5       Loss           TP53 (17p13.1)
                                                  20          p13-p13           3.3        Loss           -
                                                  20          p11.23-p11.23     0.7        Loss           -
                                                  20          q11.21-q13.13     19.2       Loss           -
A7          46,XY                                 4           q21.22-q35.2      107.1      UPD            -
                                                  7           q32.1-q36.3       31.5       UPD            -
A8          46,XX                                 8           p23.3-q24.3       146.3      Gain           -
*CNV: Copy Number Variation
†CNV data for patient A4 was not available
‡Gene (position): Validated somatic mutations in CNV regions in 7 RAEB patients
Table S7. The total number of mutated genes in 8 RAEB patients
Patient ID        A1    A2         A3    A4    A5    A6          A7    A8
Cytogenetic Risk  Good  Very Poor  Int.  Good  Good  Very Poor*  Good  Good
Genes, by functional category:
1. Epigenetic modifiers: ASXL1, KAT6B, TET2†, FTSJD2, IDH2
2. Cohesin and cell adhesion: STAG2, USH2A, LAMB4, CDH10
3. Tumor suppressors: TP53, TMPRSS11A, TSSC1
4. Transcription factors: ZNF219, ZFP161, KDM5C, ZKSCAN7, ZNF391, SCML2, STAT4, FOXR1, ZUFSP, CEBPA, NKAP
5. G-protein modulators: OR10J5, PDCL3, OR10X1, ARHGAP28
6. Protein kinases: OSR1, RPS6KA2, PGK2, ERBB4, CDC42BPG, CAMKV, BRSK2
7. Spliceosome and RNA conformation: SRSF2, SNRNP200, HELZ
8. Transporters and ion channel: GRIA2, SLC7A11, GRIN3A, SCN9A, SLC22A4, SLC4A9, SLC6A12, ABCA5, GABRA4, SLC16A10, KCNK1, CACNA1B
9. Transmembrane proteins: TMEM107, TMC8, TMEM79, TMEM132D
10. Ubiquitin / proteasome pathway: UBR2, RLIM, BRCC3, UBR4, HERC1
11. Skeleton and scaffold proteins: CAV3, CNTRL, KRT1, KRT15, MTUS2, PLS3, SHANK1
12. Regulate cell proliferation, differentiation and apoptosis: FAM96B, D4S234E, ASPM, NLRC4, SERPINA3, CARD14, IFRD1, PIK3AP1, CIZ1, PLEKHG5, SAMD9, ALAS2, PTPRD
13. Cell signaling: SCUBE1, SULF1, SH2B3‡, MPL, CRLF1
14. Genome stability: ATAD5, EEA1
15. Others: MYH2, ZFHX2, MTMR4, MRO, PCDHB5, SLAMF7, DNAJC17, CCDC74A, PLEKHF1, PRSS48, CRY2
*Very poor: Verified by CNV
†TET2: One mutation and one INDEL co-occurred in patient A4
‡SH2B3: Two nonsense mutations co-occurred in patient A5
Deepened color: INDELs
Gene names with gray background: gene mutations seen in VAF-defined subclone in CD34+ cells
Table S8. Intra- and inter-chromosomal fusion genes.
Patient ID  Fusion genes         Frame         Rearrangement type  Genomic breakpoint               Confirmed by RNAseq
A2          EWSR1-ASXL1          Out-of-frame  CTX*                chr22:29692515︱chr20:30971684    +
A2          MN1-MICAL3           Out-of-frame  ITX                 chr22:28151485︱chr22:18454310    -
A2          DNAH2-ARL15          Out-of-frame  CTX                 chr17:7649996︱chr5:53226738      -
A2          PER1-DNAH2           Out-of-frame  ITX                 chr17:8048691︱chr17:7650750      +
A2          MYH10-PIK3R1         In-frame      CTX                 chr17:8510250︱chr5:67524878      -
A2          ARL15-NTN1           In-frame      CTX                 chr5:53228901︱chr17:8967321      -
A3          DGKB-CDK15           In-frame      CTX                 chr7:14696397︱chr2:202730745     -
A6          PEMT-TMEM189-UBE2V1  In-frame      CTX                 chr17:17482036︱chr20:48718667    -
A6          KPNA1-ZBTB20         Out-of-frame  ITX                 chr3:122182765︱chr3:114162595    -
A6          CPA3-WWTR1           In-frame      ITX                 chr3:148585842︱chr3:149364465    -
*CTX: Inter-chromosome translocation
Table S9. Fusion transcripts detected by RNA-seq in 6 RAEB patients
Fusion transcripts  Coding change  A1     A2     A3     A5     A6     A7     A7
                                   CD34+  CD34-  CD34-  CD34-  CD34-  CD34+  CD34-
PER1-DNAH2          Out-of-frame
ACOT13-SYN3         Out-of-frame
C15orf57-CBX3       Out-of-frame
EWSR1-ASXL1(1)      Out-of-frame
EWSR1-ASXL1(2)      Out-of-frame
NSUN3-NIT2          Out-of-frame
HIRA-MICAL3         Out-of-frame
FXR1-ATAD2          Out-of-frame
Total fusion n.                    0      7      1      0      0      0      0
Grey color: fusion transcript detected
Table S10. Integrative analysis of recurrently mutated genes in 196 patients with different MDS subtypes and comparison of gene mutations between MDS and AML
Columns: MDS = MDS patients with non-silent mutations, n (n = 196); RCMD = RCMD patients with non-silent mutations, n (n = 95); RAEB = RAEB patients with non-silent mutations, n (n = 89); AML = AML patients with non-silent mutations (8-10)
Function / Genes                   MDS          RCMD         RAEB          AML
1. Cohesin complex
  STAG2                            13 (6.6%)    2 (2.1%)     11 (12.4%)    3.0%
  SMC3                             5 (2.6%)     3 (3.2%)     2 (2.2%)      3.5%
  RAD21                            3 (1.5%)     1 (1.1%)     2 (2.2%)      3.5%
  SMC1A                            2 (1.0%)     2 (2.1%)     0             2.5%
  Total                            23 (11.7%)   8 (8.4%)     15 (16.9%)    12.5%
2. DNA modifiers
  TET2                             27 (13.8%)   12 (12.6%)   13 (14.6%)    9.5%
  DNMT3A                           18 (9.2%)    5 (5.3%)     12 (13.5%)    18.0%
  IDH2/IDH1                        5 (2.6%)     1 (1.1%)     4 (4.5%)      17.0%
  Total                            50 (25.5%)   18 (17.0%)   29 (32.6%)    44.5%
3. Chromatin modifiers
  ASXL1                            28 (14.3%)   11 (11.6%)   17 (19.1%)    3.7%
  BCOR                             12 (6.1%)    2 (2.1%)     9 (10.1%)     1.0%
  EZH2                             9 (4.6%)     5 (5.3%)     3 (3.4%)      1.5%
  Total                            49 (25.0%)   18 (17.0%)   29 (32.6%)    6.2%
4. Spliceosome genes
  U2AF1                            29 (14.8%)   8 (8.4%)     20 (22.5%)    4.0%
  SF3B1                            22 (11.2%)   5 (5.3%)     10 (11.2%)    0.5%
  SRSF2                            11 (5.6%)    0            11 (12.4%)    0.5%
  ZRSR2                            6 (3.1%)     3 (3.2%)     3 (3.4%)      /
  SNRNP200                         1 (0.5%)     0            1 (1.1%)      0.5%
  Total                            69 (35.2%)   16 (16.9%)   45 (50.6%)*   5.5%
5. Transcription factors
  RUNX1                            17 (8.7%)    6 (6.3%)     11 (12.4%)    6.5%
  GATA2                            5 (2.6%)     1 (1.1%)     4 (4.5%)      1.0%
  CEBPA                            4 (2.0%)     1 (1.1%)     3 (3.4%)      14.3%
  ETV6                             2 (1.0%)     0            2 (2.2%)      1.0%
  Total                            28 (14.3%)   8 (8.5%)     20 (22.5%)*   22.8%
6. Activated signaling molecules
  MPL                              6 (3.1%)     1 (1.1%)     3 (3.4%)      0.5%
  NRAS/KRAS                        6 (3%)       1 (1.1%)     5 (5.6%)      8.7%
  SH2B3                            5 (2.6%)     2 (2.1%)     3 (3.4%)      0.5%
  CBL                              3 (1.5%)     2 (2.1%)     1 (1.1%)      1.5%
  PTPN11                           3 (1.5%)     0            3 (3.4%)      4.5%
  CALR                             2 (1.0%)     1 (1.1%)     1 (1.1%)      1.0%
  NF1                              2 (1.0%)     1 (1.1%)     1 (1.1%)      1.5%
  JAK2                             1 (0.5%)     0            1 (1.1%)      4.5%
  FLT3                             1 (0.5%)     1 (1.1%)     0             21.9%
  KIT                              0            0            0             5.2%
  Total                            28 (14.3%)   9 (9.5%)     18 (20.2%)    49.8%
7. Tumor suppressors
  TP53                             20 (10.2%)   5 (5.3%)     12 (13.5%)    4.0%
  WT1                              3 (1.5%)     0            3 (3.4%)      5.3%
  PHF6                             1 (0.5%)     1 (1.1%)     0             3.0%
  Total                            24 (12.2%)   6 (6.3%)     15 (16.9%)*   12.3%
8. NPM1 and other myeloid genes
  SETBP1                           4 (2.0%)     2 (2.1%)     2 (2.2%)      1.0%
  NPM1                             5 (2.6%)     3 (3.2%)     2 (2.2%)      24.2%
  DST                              2 (1.0%)     1 (0.5%)     1 (0.5%)      1.0%
  BOD1L                            1 (0.5%)     1 (0.5%)     0             0.5%
  FAM5C                            3 (1.5%)     2 (1.0%)     1 (0.5%)      2.5%
*Compared with RCMD, chi-square test, p < 0.05
Table S11. Clinical variables related to survival and disease progression
                                             OS                      PFS
Variable                       Patients, n   3-year (%)   p value    3-year (%)   p value
Gender                         196                        0.025                   0.081
  Male                         109           52.3±7.8                49.8±7.8
  Female                       87            75.2±7.9                67.5±8.5
Age (yr)                       196                        0.141                   0.226
  < 60                         113           66.5±7.3                66.4±6.2
  ≥ 60                         83            57.1±8.8                45.8±10.2
WHO subtypes                   196                        0.002                   0.001
  RAEB-1/RAEB-2                89            33.0±9.8                25.5±9.3
  RCMD                         95            82.7±6.1                80.6±6.1
  RARS                         6             100                     100
  RCUD                         6             75.0±21.7               75.0±21.7
BM blasts (%)                  196                        <0.001                  <0.001
  0-2                          65            98.4±1.6                92.7±3.6
  2-5                          42            82.2±6.1                75.0±6.9
  5-10                         44            47.3±13.4               46.8±11.6
  > 10                         45            29.7±10.7               23.6±10.0
Hb (g/L)                       196                        <0.001                  0.005
  < 80                         116           50.4±8.7                47.5±7.9
  80-100                       39            80.8±7.1                70.5±8.3
  ≥ 100                        41            86.0±7.6                83.9±7.7
PLT (×10^9/L)                  196                        0.482                   0.936
  < 50                         91            61.4±7.9                59.1±7.4
  50-100                       46            62.7±12.0               50.6±13.5
  ≥ 100                        59            73.7±6.7                70.1±6.9
ANC (×10^9/L)                  196                        0.192                   0.504
  < 0.8                        58            66.4±7.3                66.1±7.2
  ≥ 0.8                        138           62.3±7.2                56.7±7.3
Mutations*                     196                        0.001                   <0.001
  With mutations               145           52.0±6.9                45.2±6.9
  Without mutations            51            93.8±3.5                91.3±4.1
Number of mutations            196                        <0.001                  <0.001
  0 mutation                   51            93.8±3.5                91.3±4.1
  1 mutation                   69            65.9±9.2                62.7±8.8
  ≥ 2 mutations                76            41.1±9.4                37.5±8.8
Cytogenetic aberrations        192                        0.841                   0.35
  Normal                       113           63.1±6.7                62.1±6.3
  Others                       79            63.4±9.5                52.9±10.3
Mutations + cyt†               196                        0.004                   0.006
  With mutations and cyt       161           55.6±6.6                49.6±6.7
  Without mutations or cyt     35            90.3±5.3                90.3±5.3
RAEB                           89                         0.083                   0.07
  With mutations               81            31.1±9.4                22.5±8.6
  Without mutations            8             100                     100
RAEB                           89                         0.148                   0.099
  With mutations and cyt       81            31.5±9.5                22.9±8.8
  Without mutations or cyt     8             100                     100
RCMD                           95                         0.268                   0.196
  With mutations               53            76.1±10.2               71.1±10.9
  Without mutations            42            92.9±4.0                90.1±4.7
RCMD                           95                         0.351                   0.374
  With mutations and cyt       65            78.4±9.3                74.4±9.9
  Without mutations or cyt     30            89.2±5.9                89.2±5.9
IPSS-R                         192                        <0.001                  <0.001
  Very low                     4             100                     100
  Low                          34            80.1±12.9               78.8±11.6
  Intermediate                 66            87.9±4.7                82.8±5.8
  High                         52            59.2±8.2                58.5±7.9
  Very High                    36            24.2±12.5               22.6±10.2
*Mutations in 38 genes †Cytogenetic aberrations
Table S12. Mutated genes associated with OS and PFS in 196 MDS patients
                                             OS                        PFS
Function / Genes*                            18-months (%)  p value    18-months (%)  p value
1. Cohesin complex
  STAG2                                      29.9±15.4      <0.001     24.9±13.6      0.007
  SMC3                                       75.0±21.7      0.244      50.0±25.0      0.588
  Cohesin                                    46.0±12.6      0.016      44.7±11.4      0.004
2. DNA modifiers
  DNMT3A                                     44.7±14.3      0.076      37.6±13.8      0.024
  TET2                                       68.1±10.2      0.671      60.1±12.0      0.511
  IDH1/IDH2                                  40.0±21.9      0.001      40.0±21.9      0.01
3. Chromatin modifiers
  EZH2                                       44.4±16.6      0.043      44.4±16.6      0.087
  ASXL1                                      67.8±9.5       0.659      63.7±9.8       0.353
  BCOR                                       33.9±18.2      0.277      19.0±16.0      0.051
4. Spliceosome genes
  SRSF2                                      55.6±16.6      0.552      55.6±16.6      0.197
  ZRSR2                                      80.0±17.9      0.348      80.0±17.9      0.624
  SF3B1                                      66.9±12.8      0.842      61.4±12.9      0.755
  U2AF1                                      59.2±12.0      0.148      49.0±11.4      0.018
5. Transcription factors
  RUNX1                                      40.8±18.6      0.026      37.5±17.5      0.013
  GATA2                                      66.7±27.2      0.737      53.3±24.8      0.186
6. Activated signaling molecules
  NRAS/KRAS                                  50.0±25.0      0.338      50.0±25.0      0.494
  MPL                                        60.0±21.9      0.322      60.0±21.9      0.471
  SH2B3                                      80.0±17.9      0.981      80.0±17.9      0.851
7. Tumor suppressors
  TP53                                       54.8±13.2      0.016      51.3±13.8      0.048
8. NPM1 and other myeloid disease genes
  NPM1                                       80.0±17.9      0.752      80.0±17.9      0.882
*Genes mutated in ≥5 patients
Table S13. Multivariate analysis for OS and PFS in 196 MDS patients
                                  OS                             PFS
Variables                         p value  95% CI     HR*        p value  95% CI     HR
Clinical parameters
  BM blasts                       <0.001   1.56-3.37  2.30       <0.001   1.62-3.23  2.29
  Hb                              0.001    1.44-4.57  2.56       0.001    1.46-4.73  2.63
Mutated genes
  STAG2                           NS†                            0.04     1.07-16.2  4.21
  Cohesin                         NS                             NS
  IDH1/2                          0.001    2.28-19.5  6.67       0.017    1.26-10.9  5.67
  U2AF1                           /                              NS
  DNMT3A                          /                              NS
  EZH2                            0.001    2.43-27.9  8.23       /
  RUNX1                           0.001    1.89-11.2  4.59       0.043    1.03-5.80  4.11
  TP53                            0.001    1.85-10.2  4.35       NS
*HR: Hazard ratio †NS: No significance
Table S14. IPSS-R-M prognostic scoring system
Prognostic variable     0          0.5       1        1.5                  2             3      4
Cytogenetics*           Very Good  -         Good     -                    Intermediate  Poor   Very Poor
Bone marrow blast (%)   ≤ 2        -         > 2-< 5  -                    5-10          > 10   -
Hemoglobin (g/dL)       ≥ 10       -         8-< 10   < 8                  -             -      -
Platelets (×10^9/L)     ≥ 100      50-< 100  < 50     -                    -             -      -
ANC (×10^9/L)           ≥ 0.8      < 0.8     -        -                    -             -      -
Number of mutations†    0          1         ≥ 2      High risk mutation‡  -             -      -
*Very Good: -Y, del(11q); Good: normal, del(5q), del(12p), del(20q), double including del(5q); Intermediate: del(7q), +8, +19, i(17q), any other single or double independent clones; Poor: -7, inv(3)/t(3q)/del(3q), double including -7/del(7q), complex: 3 abnormalities; Very Poor: >3 abnormalities
†Mutations in 21 genes: U2AF1, SF3B1, SRSF2, ZRSR2, STAG2, SMC3, TET2, DNMT3A, IDH1/IDH2, ASXL1, BCOR, EZH2, NPM1, RUNX1, GATA2, TP53, MPL, SH2B3 and NRAS/KRAS
‡Any mutation of IDH1/IDH2, EZH2, RUNX1 and TP53.
Table S15. IPSS-R-M prognostic risk categories/scores
Risk category  Patients, n  Risk Score*  3-year survival (%)
Very Low       13           ≤ 2.0        100
Low            33           > 2.0-3.5    89.5±5.9
Intermediate   56           > 3.5-5.0    83.7±5.7
High           42           > 5.0-6.5    64.9±9.7
Very High      48           > 6.5        41.7±8.4
*IPSS-R-M risk score = IPSS-R risk score + 0.5
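The score bands of Table S15 translate directly into a lookup. The sketch below is illustrative: the function name is hypothetical, and the first band is read as "≤ 2.0" on the assumption that the comparison sign is missing from the table as printed.

```python
def ipss_r_m_category(score: float) -> str:
    """Map an IPSS-R-M risk score to its Table S15 risk category."""
    if score < 0:
        raise ValueError("risk score cannot be negative")
    if score <= 2.0:   # assumed "<= 2.0"
        return "Very Low"
    if score <= 3.5:   # > 2.0-3.5
        return "Low"
    if score <= 5.0:   # > 3.5-5.0
        return "Intermediate"
    if score <= 6.5:   # > 5.0-6.5
        return "High"
    return "Very High"  # > 6.5


# Example: a score of 5.5 falls in the "> 5.0-6.5" band.
category = ipss_r_m_category(5.5)
```

Each upper bound is inclusive, so a score exactly on a boundary (for example 3.5) is assigned to the lower-risk category, matching the "> 3.5" lower limit of the next band.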
Table S16. Re-stratification of MDS patients using M-based* and IPSS-R-M based system
Risk category            All MDS patients, n  RAEB-1/RAEB-2, n  RCMD/RCUD/RARS, n  3-year survival (%)
IPSS-R                   192
  Very Low               4                    0                 4                  100
  Low                    34                   0                 34                 80.1±12.9
  Intermediate           66                   12                54                 87.9±4.7
  High                   52                   40                12                 59.2±8.2
  Very High              36                   35                1                  24.2±12.5
M-based system           196
  Low                    63                   12                51                 93.2±3.3
  Intermediate-1         47                   23                24                 69.4±11.2
  Intermediate-2         37                   24                13                 53.0±13.9
  High                   49                   30                19                 49.7±8.3
IPSS-R-M based system    192
  Very Low               13                   0                 13                 100
  Low                    33                   0                 33                 89.5±5.9
  Intermediate           56                   12                44                 83.7±5.7
  High                   42                   30                12                 64.9±9.7
  Very High              48                   45                3                  41.7±8.4
*Molecular-marker based system
References
1. Li H & Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754-1760.
2. Li H, et al. (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078-2079.
3. Yau C, et al. (2010) A statistical approach for detecting genomic aberrations in heterogeneous tumor samples from single nucleotide polymorphism genotyping data. Genome Biol 11(9):R92.
4. Wang K, et al. (2007) PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res 17(11):1665-1674.
5. Welch JS, et al. (2012) The origin and evolution of mutations in acute myeloid leukemia. Cell 150(2):264-278.
6. Papaemmanuil E, et al. (2011) Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med 365(15):1384-1395.
7. Yoshida K, et al. (2011) Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478(7367):64-69.
8. Ley T, et al. (2013) Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med 368(22):2059-2074.
9. Shen Y, et al. (2011) Gene mutation patterns and their prognostic impact in a cohort of 1185 patients with acute myeloid leukemia. Blood 118(20):5593-5603.
10. Patel JP, et al. (2012) Prognostic relevance of integrated genetic profiling in acute myeloid leukemia. N Engl J Med 366(12):1079-1089.
Other Supporting Information Files
Dataset S1 (XLSX)
Dataset S2 (XLSX)