Transcriptome Characterization by RNA Sequencing Identifies a Major Molecular and Clinical Subdivision in Chronic Lymphocytic Leukemia
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from genome.cshlp.org on October 4, 2021 - Published by Cold Spring Harbor Laboratory Press Research Transcriptome characterization by RNA sequencing identifies a major molecular and clinical subdivision in chronic lymphocytic leukemia Pedro G. Ferreira,1,2,12 Pedro Jares,3,13 Daniel Rico,4,13 Gonzalo Go´mez-Lo´pez,4 Alejandra Martı´nez-Trillos,3 Neus Villamor,3 Simone Ecker,4 Abel Gonza´lez-Pe´rez,5 David G. Knowles,1,2 Jean Monlong,1,2 Rory Johnson,1,2 Victor Quesada,6 Sarah Djebali,1,2 Panagiotis Papasaikas,2,7 Mo´nica Lo´pez-Guerra,3 Dolors Colomer,3 Cristina Royo,3 Maite Cazorla,3 Magda Pinyol,3 Guillem Clot,3 Marta Aymerich,3 Maria Rozman,3 Marta Kulis,3 David Tamborero,5 Anaı¨sGouin,1,2 Julie Blanc,8 Marta Gut,8 Ivo Gut,8 Xose S. Puente,6 David G. Pisano,4 Jose´ Ignacio Martin-Subero,9 Nuria Lo´pez-Bigas,5,10 Armando Lo´pez-Guillermo,11 Alfonso Valencia,4 Carlos Lo´pez-Otı´n,6 Elı´as Campo,3 and Roderic Guigo´1,2,14 1Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 08003 Barcelona, Catalonia, Spain; 2Universitat Pompeu Fabra (UPF), 08003 Barcelona, Catalonia, Spain; 3Unitat d’Hematopatologia, Servei d’Anatomia Patolo`gica, Hospital Clı´nic, Universitat de Barcelona, Institut d’Investigacions Biome`diques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain; 4Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Spanish National Bioinformatics Institute, 28029 Madrid, Spain; 5Research Unit on Biomedical Informatics, Department of Experimental and Health Sciences, University Pompeu Fabra, 08003 Barcelona, Catalonia, Spain; 6Departamento de Bioquı´mica y Biologı´a Molecular, Instituto Universitario de Oncologı´a, Universidad de Oviedo, 33006 Oviedo, Spain; 7Gene Regulation Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), 08003 Barcelona, Catalonia, Spain; 8Centro Nacional de Ana´lisis Geno´mico, PCB, 08028 Barcelona, Spain; 9Departamento de Anatomı´a Patolo´gica, Farmacologı´a y Microbiologı´a, Universitat de Barcelona, 08036 Barcelona, Spain; 10Institucio´ Catalana de Recerca i Estudis Avancxats (ICREA), 08010 Barcelona, Spain; 11Servei de Hematologia, Hospital Clı´nic, IDIBAPS, 08036 Barcelona, Spain Chronic lymphocytic leukemia (CLL) has heterogeneous clinical and biological behavior. Whole-genome and -exome sequencing has contributed to the characterization of the mutational spectrum of the disease, but the underlying transcriptional profile is still poorly understood. We have performed deep RNA sequencing in dif- ferent subpopulations of normal B-lymphocytes and CLL cells from a cohort of 98 patients, and characterized the CLL transcriptional landscape with unprecedented resolution. We detected thousands of transcriptional elements dif- ferentially expressed between the CLL and normal B cells, including protein-coding genes, noncoding RNAs, and pseudo- genes. Transposable elements are globally derepressed in CLL cells. In addition, two thousand genes—most of which are not differentially expressed—exhibit CLL-specific splicing patterns. Genes involved in metabolic pathways showed higher ex- pression in CLL, while genes related to spliceosome, proteasome, and ribosome were among the most down-regulated in CLL. Clustering of the CLL samples according to RNA-seq derived gene expression levels unveiled two robust molecular sub- groups, C1 and C2. C1/C2 subgroups and the mutational status of the immunoglobulin heavy variable (IGHV ) region were the only independent variables in predicting time to treatment in a multivariate analysis with main clinico-biological features. This subdivision was validated in an independent cohort of patients monitored through DNA microarrays. Further analysis shows that B-cell receptor (BCR) activation in the microenvironment of the lymph node may be at the origin of the C1/C2 differences. [Supplemental material is available for this article.] Chronic lymphocytic leukemia (CLL) is one of the most common leukemias among adults in the Western world (Zenz et al. 2009). 12Present address: Department of Genetic Medicine and Development, The clinical evolution of the disease is very heterogeneous with University of Geneva Medical School, 1211 Geneva, Switzerland a group of patients following an indolent course with no need for 13 These authors contributed equally to this work. 14Corresponding author E-mail [email protected] Article published online before print. Article, supplemental material, and pub- Ó 2014 Ferreira et al. This article, published in Genome Research, is available lication date are at http://www.genome.org/cgi/doi/10.1101/gr.152132.112. under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), Freely available online through the Genome Research Open Access option. as described at http://creativecommons.org/licenses/by-nc/3.0/. 24:000–000 Published by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/14; www.genome.org Genome Research 1 www.genome.org Downloaded from genome.cshlp.org on October 4, 2021 - Published by Cold Spring Harbor Laboratory Press Ferreira et al. treatment during a long period, whilst others have a rapid ag- dividuals, and sequenced in triplicates. In total, close to 6 billion gressive evolution and short survival. These clinical differences 2 3 76-bp paired-end reads were obtained (see Methods sections have been mainly associated with two major molecular subtypes of ‘‘Patients and Samples’’ and ‘‘RNA Preparation, Sequencing, and the disease characterized respectively by high and low numbers of Microarrays’’; Supplemental Figs. S1, S2; Supplemental File 1), with somatic mutations in the variable region of the immunoglobulin a median number of 45M reads per sample. The reads were pro- genes. Disease progression has been also associated with a number cessed with the Grape pipeline (Knowles et al. 2013), being mapped of genetic alterations that include cytogenetic abnormalities and to the human genome (hg19) with GEM (Marco-Sola et al. 2012), specific gene mutations (TP53, NOTCH1, SF3B1) (Fabbri et al. and used to quantify genes and transcripts annotated in the 2011; Puente et al. 2011; Quesada et al. 2011; Wang et al. 2011; GENCODE gene set (Harrow et al. 2006) with the Flux Capacitor Oscier et al. 2012; Ramsay et al. 2013). However, the heterogeneity (Montgomery et al. 2010; http://flux.sammeth.net/index.html) in the evolution of the patients is largely unexplained by these (see Methods sections ‘‘Read Mapping and Processing’’ and ‘‘Gene, simple genetic events. Transcript, and Exon Quantifications’’). The transcriptome of 219 Genome-wide transcriptome analysis in cancer provides a CLL patients, including 95 of the RNA-seq samples, was also ana- global view of the expressed elements and networks that reshapes lyzed using DNA microarrays. We found very high correlation the biology of the normal cells in their transformation and pro- between RNA-seq and microarray-based quantification of gene gression to aggressive cancer cells. Microarray gene expression expression (CC ranged from 0.81 to 0.88), and between RNA and profiling studies have characterized gene signatures related to the protein levels for ZAP70 and CD38 (known CLL markers), as pre- different molecular subtypes of CLL and have identified individual viously determined for these samples (Quesada et al. 2011) (CC: genes or pathways related to the clinical and biological evolution 0.52 and 0.77, respectively) (Supplemental Fig. S3). of the disease (Klein et al. 2001; Rosenwald et al. 2001; Du¨hren-von Across the whole genome, we found large transcriptional Minden et al. 2012). However, the molecular mechanisms un- differences between normal lymphocytes and CLL cells. On aver- derlying the origin and progression of the disease remain mainly age, we found 13.6% of the human genome covered by RNA-seq unknown. Here, we have characterized the CLL transcriptional reads in CLL samples, while for normal cells the average was 10.5% landscape at high resolution by performing RNA-seq on a large (t-test mean coverage P = 1.8 3 10À8) (Supplemental Fig. S4). For cohort of CLL samples, for the majority of which whole-exome comparison, this average is 12.7% across the cell lines investigated sequencing (Quesada et al. 2011) and high-density DNA methyl- within the ENCODE Project (Djebali et al. 2012)—among which ation microarrays (Kulis et al. 2012) have been previously pro- there are numerous cancer cell lines. At gene level, and under very duced. The transcriptomic architecture of CLL that we uncover stringent criteria (false discovery rate [FDR] <0.01 and median fold refines the molecular characterization of the disease and opens change >3; Methods, section ‘‘Expression Analysis’’), we identified new avenues for the clinical management of patients. 1089 genes differentially expressed between normal and tumoral samples (Table 1; Supplemental File 2). The top differentially expressed genes are dominated by immunoglobulins, as expected Results due to the clonality of the CLL cells (Supplemental Fig. S5a). Path- way analyses showed that genes involved in metabolic pathways The gene expression landscape of CLL had higher expression in CLL, while genes related to spliceosome, Total poly(A)+ RNA was extracted from the CLL samples and se- proteasome, and ribosome were among the most down-regulated quenced using the Illumina HiSeq 2000 instrument. RNA was also in CLL when compared with normal lymphocytes (Supplemental extracted from three subtypes of normal B cells (naı¨ve, memory Fig. S5b). Among the