Supporting Information Supplementary Methods Patients for Whole Genome Sequencing and Validation Cohort. Heparinized Bone Marrow

Supporting Information

Supplementary Methods

Patients for whole genome sequencing and validation cohort. Heparinized bone marrow samples were obtained for WGS from 8 RAEB patients with informed consent, as approved by the ethics review board of Shanghai Institute of Hematology. Briefly, the 8 patients comprised 4 with RAEB-1 and 4 with RAEB-2; 5 were male and 3 female; 1 had a complex karyotype, 1 had +8 and 5 had a normal karyotype; all were classified as intermediate to very high risk. Six patients died 4-23 months after diagnosis, from infection, hemorrhage, cerebral infarction or evolution to AML (for complete information see Table S1). The validation cohort consisted of 188 patients with various subtypes of MDS, diagnosed and treated in Shanghai Ruijin Hospital and Shanghai No.6 People’s Hospital. All patients provided written informed consent. Bone marrow and paired buccal samples were obtained after informed consent.

DNA sample preparation. Mononuclear cells (MNC) were separated by Ficoll density gradient centrifugation from the 8 RAEB patients and the 188 MDS patients of the validation cohort. Subsequently, CD34+ cells were isolated from the 8 RAEB patients by magnetic cell separation (Miltenyi Biotec, Bergisch Gladbach, Germany), reaching a purity of 89-97.7% (average: 93.1%). Flow-through CD34- cells were also collected for analysis. A skin biopsy was obtained for analysis of the normal genome, and its DNA was extracted with the DNeasy Blood & Tissue Kit (Qiagen). Genomic DNA of CD34+ cells was isolated with the QuickGene DNA whole blood kit L (FUJIFILM, Life Science). Genomic DNA of MNC from the validation set was extracted with the Wizard® Genomic DNA Purification Kit (Promega).

DNA library preparation. Genomic DNA was sheared by sonication and adaptors were ligated to the resulting fragments. The adaptor-ligated templates were fractionated by agarose gel electrophoresis and fragments of the desired size were excised.
The resulting fragments were amplified by ligation-mediated PCR, purified and subjected to DNA sequencing on the Illumina platform.

Massively parallel sequencing. Cluster generation on the Illumina cluster station proceeded as follows: template hybridization, isothermal amplification, linearization, blocking, denaturation and sequencing primer hybridization. Deep sequencing of the captured libraries was then performed on the Illumina GAIIx and HiSeq2000, and 2×120 bp (base pairs) paired-end reads were generated following the manufacturer’s protocols. Image analysis and base calling were performed with Illumina RTA version 1.6 using default parameters.

Alignment, SNV/INDEL calling and quality control. The third-party software BWA (1) was used to align the paired-end reads to the human reference genome (hg19, http://genome.ucsc.edu/) with default parameters. Variants, including SNVs and INDELs, were called with the Samtools software package (2) and, for cases, filtered with the recommended thresholds (SNV quality ≥20, INDEL quality ≥50 and at least 3 covering reads). To preserve filtering power and minimize the false discovery rate, looser criteria were applied to the control variants (SNV quality ≥10, INDEL quality ≥10 and ≥3 covering reads). For figure plotting and SNV-based clonality analysis, a stricter threshold was adopted: SNV quality ≥100, case depth ≥30, mapping quality ≥55 and control depth ≥30.

Targeted gene resequencing. To identify recurrent mutations in the highlighted genes, we designed PCR primers following the Fluidigm guidelines using iPLEX AssayDesigner software. The Fluidigm Access Array microfluidic platform was adapted to generate highly multiplexed libraries of tagged amplicons from MDS patients. Deep resequencing was performed on the Illumina GAIIx/MiSeq platforms.

Somatic copy number variation (CNV) and uniparental disomy (UPD) detection.
DNA from the case tumor and the matched germline control was prepared for hybridization to the Illumina high-density Genome Wide Human 660W Quad_v1 SNP array (657,366 probes) according to the manufacturer’s protocol. The raw intensity data files (*.idat) were analyzed using the Genotyping Module of Illumina Genome Studio software version 2011.1, from which the normalized Log R Ratio (LRR) and B Allele Frequency (BAF) of all available probes in each sample were extracted. OncoSNP (3) (version 1.1) was used to detect somatic genomic alterations in paired samples. To verify the reliability of CNVs and UPDs, all reported alterations were plotted from their LRR and BAF values with R statistical software (www.r-project.org, version 2.15.1) and visually checked. Only somatic alterations that met the criteria proposed by OncoSNP and PennCNV (4) were kept for further study. CNVs were analyzed with regard to their chromosomal positions (Fig. S3), indicating that amplification of chr8 regions and deletions (DELs) or UPD of chr7 regions were the most common events. Cases A2 and A6 were found to have complex chromosomal aberrations (Table S6), harboring DELs of the TP53-containing regions 17p13.3-p13.1 and 17p13.3-p11.2, respectively, in the presence of TP53 mutations on the remaining allele. By contrast, UPD of regions on 4q21.22-q35.2 and 7q32.1-q36.3 was found in A7, and amplification of p23.3-q24.3 in A8, the two cases with normal karyotype.

Clonality analysis. Clonality analysis was performed according to a previous report (5). The figures were plotted from high-quality somatic SNVs using the strict thresholds described above.

Statistical analysis. Student’s t-test was used to compare the average number of coding sequence mutations between groups. The proportions of patients with mutations in RAEB and RCMD were compared by the chi-square test.
Fisher’s exact test was applied to assess the co-occurrence of mutations in highly recurrent genes. Overall survival (OS) was defined as the time from diagnosis to death, or to last follow-up for patients still alive (censored). Progression-free survival (PFS) was calculated from diagnosis to disease progression (relapse, progression to acute leukemia or RAEB phase) or death, or to last follow-up for patients alive without progression (censored). The Kaplan-Meier method was used to estimate survival and progression over time. All p values were based on 2-sided tests. Statistical analyses were performed with the statistical software package SPSS 19.0 (SPSS Science, Chicago, IL, USA).

Univariate analyses were performed among the 196 MDS patients to assess the impact of age, gender, WHO subtype, percentage of BM blasts, levels of hemoglobin, platelets and neutrophils, chromosomal aberrations and gene mutations on OS and PFS, and to screen for the main prognostic factors. Male gender (p = 0.025), RAEB subtypes (p = 0.002), a high percentage of BM blasts (p < 0.001), lower hemoglobin (p < 0.001) and the presence of gene mutations (p = 0.001) were associated with adverse OS; except for gender (p = 0.081), these factors were also associated with poor PFS (Table S11). Furthermore, when each gene mutation was analyzed separately, mutations of STAG2 (p = 0.007), the cohesin complex family (p = 0.004), DNMT3A (p = 0.024), IDH1/IDH2 (p = 0.01), U2AF1 (p = 0.018), RUNX1 (p = 0.013) and TP53 (p = 0.048) predicted adverse PFS (Table S12).

We then performed multivariate analyses of clinical parameters and mutated genes separately. Among clinical factors, percentage of BM blasts (hazard ratio [HR] = 2.30; 95% CI, 1.56-3.37; p < 0.001 for OS; HR = 2.29; 95% CI, 1.62-3.23; p < 0.001 for PFS) and hemoglobin (HR = 2.56; 95% CI, 1.44-4.57; p = 0.001 for OS; HR = 2.63; 95% CI, 1.46-4.73; p = 0.001 for PFS) were independent adverse prognostic factors for OS and PFS.
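As a minimal sketch of the Kaplan-Meier estimate named in the statistical analysis (the analyses reported here were run in SPSS, not with this code), the product-limit estimator can be written in a few lines of Python. The function name `kaplan_meier` and the toy follow-up data are hypothetical and for illustration only; an event flag of 1 marks death (or progression, for PFS) and 0 marks a patient censored at last follow-up.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per patient (e.g. months from diagnosis)
    events -- 1 if the event (death/progression) occurred, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    removed = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(removed):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t are no longer at risk afterwards.
        at_risk -= removed[t]
    return curve

# Hypothetical toy cohort (not data from this study).
toy_times = [4, 6, 6, 9, 12, 23]
toy_events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t={t:>2} months  S(t)={s:.3f}")
```

Patients censored at the same time as an event are counted as still at risk at that time, which is the usual convention for this estimator.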
Among mutated genes, STAG2 (HR = 4.21; 95% CI, 1.07-16.2; p = 0.04), IDH1/IDH2 (HR = 5.67; 95% CI, 1.26-10.9; p = 0.017) and RUNX1 (HR = 4.11; 95% CI, 1.03-5.80; p = 0.043) were independent adverse prognostic factors for PFS (Table S13).

Supplementary Figures

Fig. S1. (A), (B) Impact of age on the number of somatic mutations in the genomes of bone marrow CD34+ cells among 8 RAEB cases. No significant correlation was observed between age and the number of all mutations (p = 0.403) (A) or between age and the number of non-silent mutations in coding sequences (p = 0.487) (B). (C) Proportion of nucleotide transitions and transversions (62.3% vs. 37.7%) in the genomes of bone marrow CD34+ cells among 8 RAEB cases analyzed with WGS. (D) Numbers of distinct SNVs in coding sequences among 8 RAEB cases analyzed with WGS.

Fig. S2. (A) Percentage of nucleotide transitions and transversions in two previously reported exome sequencing studies for MDS (65.0% vs. 35.0%) (MDS-exome-1 and MDS-exome-2) (6, 7) and in a recently reported exome sequencing/WGS study for AML (67.7% vs. 32.3%) (8). (B) Genomic mutation spectrum of SNVs in each of six mutation classes in previously reported studies for MDS (6, 7) and AML (8). Note that C→T is the most prevalent change.

Fig. S3. Types and genomic distribution of somatic copy number variations (CNVs) in the genomes of bone marrow CD34+ cells among 7 RAEB cases. UPD: uniparental disomy. CNVs are shown in colored lines.

Fig. S4. Predicted domain structures of two EWSR1-ASXL1 transcripts, in which sequences encoding the N-terminal 483 aa or 431 aa of EWSR1 are fused to exon 3 or exon 1 of ASXL1, respectively, in a tail-to-tail manner. EAD: Gln/Pro/Thr-rich region; IQ: IQ domain, binds calmodulin; RRM: RNA recognition motif.

Fig. S5. Circos plot showing a landscape view of 287 somatic mutations of 38 genes in seven functional categories among 145 MDS cases.
Ribbons connecting distinct categories of gene abnormalities reflect concurrent mutations in the two connected categories, whereas mutual exclusivity may exist between categories that are not connected.
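The transition/transversion proportions and the six mutation classes reported in Figs. S1 and S2 follow a standard convention that can be sketched in Python. This is an illustrative sketch, not the authors' pipeline: the helper names `is_transition` and `mutation_class` and the toy SNV list are hypothetical. Each SNV is collapsed onto its pyrimidine (C or T) reference base so that, for example, G→A is counted as C→T.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}

def is_transition(ref, alt):
    """Transitions stay within purines (A<->G) or within pyrimidines (C<->T)."""
    return (ref in PURINES) == (alt in PURINES)

def mutation_class(ref, alt):
    """Collapse a substitution onto its pyrimidine reference base.

    Yields one of six classes: C>A, C>G, C>T, T>A, T>C, T>G.
    """
    if ref == alt:
        raise ValueError("ref and alt must differ")
    if ref in PURINES:  # report the change as seen on the opposite strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"

# Hypothetical toy SNVs as (ref, alt) pairs; not data from this study.
toy_snvs = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "A"), ("T", "G")]
spectrum = Counter(mutation_class(r, a) for r, a in toy_snvs)
n_ti = sum(is_transition(r, a) for r, a in toy_snvs)
print(dict(spectrum))
print(f"transitions: {n_ti / len(toy_snvs):.1%}")
```

Collapsing complementary changes in this way is what reduces the twelve possible single-base substitutions to six classes.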
