Multi-Omics of 34 Colorectal Cancer Cell Lines - a Resource for Biomedical Studies Kaja C
Total Page:16
File Type:pdf, Size:1020Kb
Berg et al. Molecular Cancer (2017) 16:116 DOI 10.1186/s12943-017-0691-y RESEARCH Open Access Multi-omics of 34 colorectal cancer cell lines - a resource for biomedical studies Kaja C. G. Berg1,2†, Peter W. Eide1,2†, Ina A. Eilertsen1,2, Bjarne Johannessen1,2,3, Jarle Bruun1,2, Stine A. Danielsen1,2,3, Merete Bjørnslett1,2, Leonardo A. Meza-Zepeda3,4, Mette Eknæs1,2, Guro E. Lind1,2, Ola Myklebost3,4,5, Rolf I. Skotheim1,2,3, Anita Sveen1,2,3 and Ragnhild A. Lothe1,2,3* Abstract Background: Colorectal cancer (CRC) cell lines are widely used pre-clinical model systems. Comprehensive insights into their molecular characteristics may improve model selection for biomedical studies. Methods: We have performed DNA, RNA and protein profiling of 34 cell lines, including (i) targeted deep sequencing (n = 612 genes) to detect single nucleotide variants and insertions/deletions; (ii) high resolution DNA copy number profiling; (iii) gene expression profiling at exon resolution; (iv) small RNA expression profiling by deep sequencing; and (v) protein expression analysis (n = 297 proteins) by reverse phase protein microarrays. Results: The cell lines were stratified according to the key molecular subtypes of CRC and data were integrated at two or more levels by computational analyses. We confirm that the frequencies and patterns of DNA aberrations are associated with genomic instability phenotypes and that the cell lines recapitulate the genomic profiles of primary carcinomas. Intrinsic expression subgroups are distinct from genomic subtypes, but consistent at the gene-, microRNA- and protein-level and dominated by two distinct clusters; colon-like cell lines characterized by expression of gastro-intestinal differentiation markers and undifferentiated cell lines showing upregulation of epithelial-mesenchymal transition and TGFβ signatures. This sample split was concordant with the gene expression-based consensus molecular subtypes of primary tumors. Approximately ¼ of the genes had consistent regulation at the DNA copy number and gene expression level, while expression of gene-protein pairs in general was strongly correlated. Consistent high-level DNA copy number amplification and outlier gene- and protein- expression was found for several oncogenes in individual cell lines, including MYC and ERBB2. Conclusions: This study expands the view of CRC cell lines as accurate molecular models of primary carcinomas, and we present integrated multi-level molecular data of 34 widely used cell lines in easily accessible formats, providing a resource for preclinical studies in CRC. Keywords: Colorectal cancer cell lines, Consensus molecular subtypes, Copy number aberrations, Gene expression, Genomics, Methylation, Microsatellite instability, miRNA, Mutations, Protein expression * Correspondence: [email protected] †Equal contributors 1Department of Molecular Oncology, Institute for Cancer Research & K.G.Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, P.O.Box 4953 Nydalen, -0424 Oslo, NO, Norway 2Center for Cancer Biomedicine, Institute for Clinical Medicine, University of Oslo, Oslo, Norway Full list of author information is available at the end of the article © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Berg et al. Molecular Cancer (2017) 16:116 Page 2 of 16 Background specific alterations of biological or clinical interest for Colorectal cancers (CRC) are molecularly heterogeneous further experimental studies. and can be divided into clinically relevant subtypes as- sociated with patient prognosis and treatment response. Methods At the DNA level, this includes the genomic instability Cell lines – Culturing, processing and analyses overview phenotypes microsatellite instability (MSI) and chromo- Thirty-four CRC cell lines purchased from cell line reposi- somal instability (CIN), as well as the epigenomic CpG tories or kindly provided by collaborators (Additional file 1: island methylator phenotype (CIMP). About 15% of pri- Table S1), were subjected to DNA, RNA and protein ana- mary CRCs have MSI, while the rest are microsatellite lyses (Fig. 1 a and b). Cell lines were cultured as previously stable (MSS), most of which have the CIN phenotype. described [12] and harvested at approximately 80–90% MSI tumors have errors in the mismatch repair machinery confluency. Genomic DNA was extracted either by a stand- and display numerous single nucleotide variants (SNVs) ard phenol/chloroform procedure or a magnetic beads and insertions/deletions (indels) [1]. CIN tumors typically protocol (Maxwell 16 DNA purification kit, Promega, display aneuploidy with structural and/or numerical aber- Madison, WI, U.S.A.). Cell line authenticity was verified by rations, but the underlying cause(s) remains undetermined DNA profiling based on 15 short tandem repeat (STR) loci, [2]. CIMP tumors overlap to a large extent with MSI and using the AmpFLSTR Identifiler PCR Amplification Kit are characterized by widespread hypermethylation of CpG (Thermo Fisher, Waltham, MA) and matched to the pro- islands [3, 4]. files from supplier (Additional file 1: Table S1) where avail- At the transcriptional level, several classification schemes able. MSI and CIMP status was determined according to have identified biologically distinct subtypes of CRCs [5–7]. previously described procedures [12]. For CL-40 MSI status The recent identification of four consensus molecular was additionally assessed using the MSI Analysis System, subtypes (CMS) has provided evidence that the expres- version 1.2 (Promega, Fitchburg, WI, USA). Total RNA sion subtypes have clinical relevance independent of was extracted using the Qiagen AllPrep DNA/RNA/ cancer stage [8]. Although several genomic aberrations miRNA Universal kit (Qiagen, Hilden, Germany) and qual- associate with individual CMS groups, including MSI ity controlled by the Agilent RNA 6000 nano kit for Agilent and hypermutation in CMS1 and oncogene amplification 2100 Bioanalyzer (Agilent, Santa Clara, CA, U.S.A.). All in CMS2, a potential genomic basis for the expression RIN values were above 9. Protein lysates were produced subtypes remains elusive. Integrative DNA, RNA and pro- from cell pellets at the MD Anderson Cancer Centre RPPA tein level analyses promise to improve our understanding Core Facility. of the biological and clinical importance of the evolving For all cell lines, DNA copy number, mRNA, miRNA molecular classification of CRC. and protein expression profiles were generated. Targeted CMS classification is heavily influenced by the tumor DNA sequencing was performed for 27 cancer cell lines microenvironment, as demonstrated by strong expression in addition to Sanger sequencing of selected genes in 31 of mesenchymal marker genes in the stroma of tumors of cell lines. For frequency counts and statistical tests we the stem-like/mesenchymal subtype CMS4 [9, 10]. How- excluded the neuroendocrine Colo320 and kept only one ever, all four CMS subtypes were recently demonstrated to cell line derived from the same patient, thus excluding be represented in in vitro model systems (Sveen et al., sub- IS3, SW620, DLD-1 and WiDr. mitted), and cancer cell lines may therefore be used to identify the cancer cell intrinsic aberrations characteristic Targeted deep sequencing of the four CMS groups. Furthermore, genomic studies Sequencing libraries for the “kinome” and selected and drug sensitivity screening have demonstrated that cancer-relevant genes (totally n = 612 genes; Additional CRC cell lines in general recapitulate the molecular alter- file 1: Table S2) was generated using the Agilent SureSe- ations and pharmacogenomics of primary tumors [11–15]. lect Human Kinome V1 kit (Agilent), and 2 × 101 basepair Accordingly, improved molecular characterization of these paired-end sequencing was performed on the Illumina in vitro model systems may further increase their value as HiSeq 2500 system (Illumina, San Diego, CA, U.S.A.) at preclinical models of CRC. the Oslo University Hospital Genomics Core Facility to an Here we present a resource of information for 34 CRC average sequencing depth of 161X (range 105-289X). cell lines by multi-level data integration, including tar- Sequencing reads were aligned to the reference genome geted deep sequencing, DNA copy numbers, gene ex- GRCh37 (hg19) with the Burrows-Wheeler Aligner pression, microRNA (miRNA) expression and protein (v0.6.21) [16], converted from sequence alignment map expression. We describe consistent gene/pathway regu- (SAM) files to the binary alignment map (BAM) format lation across data types and associate this with known by Picard, version 1.61 [17], and sorted and indexed CRC subtypes. Each data set and data combination are using SAMtools (v0.1.18) [18]. Duplicate reads were re- presented in accessible tables and figures, emphasizing moved by Picard, and the Genome Analysis Toolkit Berg et al. Molecular Cancer (2017) 16:116 Page 3 of 16 a Molecular phenotype MSI MSS POLE mutated Co115 DLD-1 HCT15 KM12 EBFRI IS3 LS1034