Genetic Heterogeneity of Diffuse Large B-Cell Lymphoma
Total Page:16
File Type:pdf, Size:1020Kb
Genetic heterogeneity of diffuse large B-cell lymphoma Jenny Zhanga,b,1, Vladimir Grubora,1, Cassandra L. Lovea, Anjishnu Banerjeec, Kristy L. Richardsd, Piotr A. Mieczkowskid, Cherie Dunphyd, William Choie, Wing Yan Aue, Gopesh Srivastavae, Patricia L. Lugarf, David A. Rizzierif, Anand S. Lagoof, Leon Bernal-Mizrachig, Karen P. Manng, Christopher Flowersg, Kikkeri Nareshh, Andrew Evensi, Leo I. Gordonj, Magdalena Czaderk, Javed I. Gilll, Eric D. Hsim, Qingquan Liua, Alice Fana, Katherine Walsha, Dereje Jimaa, Lisa L. Smithn, Amy J. Johnsonn, John C. Byrdn, Micah A. Luftigf, Ting Nio, Jun Zhuo, Amy Chadburnj, Shawn Levyp, David Dunsonc, and Sandeep S. Davea,b,f,2 aDuke Institute for Genome Sciences and Policy, bDuke Cancer Institute and Department of Medicine, and cDepartment of Statistical Science, Duke University, Durham, NC 27710; dUniversity of North Carolina at Chapel Hill, Chapel Hill, NC 27599; eThe University of Hong Kong, Queen Mary Hospital, Hong Kong, China; fDuke University Medical Center, Durham NC 27710; gEmory University, Atlanta GA 30322; hImperial College, London, United Kingdom; iUniversity of Massachusetts, Worcester, MA 01655; jNorthwestern University, Chicago IL 60208; kIndiana University, Indianapolis IN 46202; lBaylor University Medical Center, Dallas TX 75246; mCleveland Clinic, Cleveland, OH 44195; nDivision of Hematology and Comprehensive Cancer Center, Ohio State University, Columbus, OH 43210; oGenetics and Development Biology Center, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD 20892; and pHudson Alpha Institute for Biotechnology, Huntsville, AL 35806 Edited* by Elliott Kieff, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA, and approved November 27, 2012 (received for review April 2, 2012) Diffuse large B-cell lymphoma (DLBCL) is the most common form Results of lymphoma in adults. The disease exhibits a striking heteroge- Sequencing of a Lymphoma Genome Uncovers the Spectrum of neity in gene expression profiles and clinical outcomes, but its Somatic Variation in DLBCL. Lymphoma biopsy tissue and unaf- genetic causes remain to be fully defined. Through whole genome fected bone marrow were obtained from the same patient. Using and exome sequencing, we characterized the genetic diversity of the Illumina platform, we generated at total of 171 Gb of 100 bp DLBCL. In all, we sequenced 73 DLBCL primary tumors (34 with paired-end sequences from the tumor and matched normal ge- matched normal DNA). Separately, we sequenced the exomes of nomes corresponding to an average per-base sequencing cover- fi 21 DLBCL cell lines. We identi ed 322 DLBCL cancer genes that age of 37-fold and 20-fold, respectively. GENETICS were recurrently mutated in primary DLBCLs. We identified recur- We identified 23,214 somatic sequence alterations occurring rent mutations implicating a number of known and not previously throughout the lymphoma genome, summarized in Fig. 1 A and B identified genes and pathways in DLBCL including those related to and SI Appendix,TableS1. C chromatin modification (ARID1A and MEF2B), NF-κB(CARD11 and Transitions accounted for about 60% of these events (Fig. 1 ), TNFAIP3), PI3 kinase (PIK3CD, PIK3R1, and MTOR), B-cell lineage similar to patterns observed in a number of other malignancies – (IRF8, POU2F2, and GNA13), and WNT signaling (WIF1). We also (9 11) and suggest that the majority of these DLBCL mutations experimentally validated a mutation in PIK3CD, a gene not pre- arise from stochastic endogenous processes rather than envi- viously implicated in lymphomas. The patterns of mutation dem- ronmental exposures, for example, in the context of tobacco exposure and lung cancer (6). onstrated a classic long tail distribution with substantial variation Known oncogenes (12) found to be somatically mutated in this of mutated genes from patient to patient and also between pub- DLBCL patient included ARID1A, SETD2, CARD11,and lished studies. Thus, our study reveals the tremendous genetic PIK3R1. Of these genes, only CARD11 (13) has been previously heterogeneity that underlies lymphomas and highlights the need experimentally identified as an oncogene in DLBCL. We also for personalized medicine approaches to treating these patients. identified structural genetic alterations using approaches de- scribed previously (14, 15). In all, we identified seven deletions next-generation sequencing | cancer genetics | cancer heterogeneity and three amplifications. Known oncogenes that were implicated by these copy number alterations include PTEN (chromosome iffuse large B-cell lymphoma (DLBCL) is the most common 10) and CDKN2A (chromosome 9). Dform of lymphoma in adults (1). Although nearly half the patients can be cured with standard regimens, the majority of Exome Sequencing Defines the Spectrum of Coding Region Mutations relapsed patients succumb. Thus, there is an urgent need to in DLBCL. To identify recurrently mutated genes in DLBCLs, we identify the genetic underpinnings of the disease and to identify obtained a total of 73 cases of primary human samples. We di- n = novel treatment strategies. Gene expression profiling (2, 3) has vided the primary human cases into a discovery set ( 34) and n = uncovered distinct molecular signatures for DLBCL subtypes that a prevalence set ( 39). For each of the discovery set cases of have unique biology and prognoses. High-throughput sequencing primary DLBCLs, we also sequenced paired normal tissue. In has provided rich opportunities for the comprehensive identifi- addition, we sequenced the exomes of 21 DLBCL cell lines that cation of the genetic causes of cancer (4–6). Whereas exhaustive are widely used to model the disease. portraits of individual cancer genomes are emerging, the degree to which these genomes represent the disease is unclear. We generated a detailed analysis of a DLBCL genome by se- Author contributions: J. Zhang, V.G., C.L.L., A.J.J., J.C.B., M.A.L., and S.S.D. designed re- search; J. Zhang, V.G., C.L.L., K.L.R., P.A.M., C.D., W.C., W.Y.A., G.S., P.L.L., D.A.R., A.S.L., quencing a primary human tumor and paired normal tissue L.B.-M., K.P.M., C.F., K.N., A.E., L.I.G., M.C., J.I.G., E.D.H., Q.L., K.W., L.L.S., T.N., J. Zhu, A.C., (Dataset S1). We further characterized the genetic diversity of S.L., and S.S.D. performed research; J. Zhang, V.G., A.B., A.F., D.J., M.A.L., D.D., and S.S.D. DLBCL by sequencing the exomes of 73 DLBCL primary tumors analyzed data; and J. Zhang, V.G., C.L.L., and S.S.D. wrote the paper. (34 with matched normal DNA) and 21 DLBCL cell lines for The authors declare no conflict of interest. comparative purposes. This in-depth sequencing identified 322 *This Direct Submission article had a prearranged editor. DLBCL cancer genes that were recurrently mutated in DLBCLs. Data deposition: The data reported in this paper have been deposited in dbGap, http:// We also experimentally validated the effects of genetic alteration www.ncbi.nlm.nih.gov/gap (accession no. phs000573.v1.p1), and the Gene Expression of PIK3CD, an oncogene that we identified in DLBCL. Our work Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE22898). provides one of the largest genetic portraits yet of human 1J. Zhang and V.G. contributed equally to this work. DLBCLs and offers insights into the molecular heterogeneity of 2To whom correspondence should be addressed. E-mail: [email protected]. the disease, especially in the context of other recently published This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. studies in DLBCL (7, 8). 1073/pnas.1205299110/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1205299110 PNAS Early Edition | 1of6 Downloaded by guest on September 26, 2021 A Somatically acquired genetic alterations Copy Number Alterations Intergenic Mutations Regulatory Region Mutations Coding Region Mutations Type of Mutation B Fig. 1. Results from sequencing a lymphoma ge- Nonsense nome. (A) Circos diagram (36) summarizing the Genomic Region Missense somatically acquired genetic variants in a DLBCL Frame Shift genome. The outermost ring depicts the chromo- Intronic Splice Site/Intronic some ideogram oriented clockwise, pter-qter. The Transcribed Region 5'UTR next ring indicates copy number alterations in the Regulatory Region 3'UTR DLCBL genome. The next three rings indicate so- matically acquired mutations in intergenic regions, Intergenic Synonymous potential regulatory regions, and the exome re- Non Coding RNAs spectively. (B) Pie chart depicts the relative number of somatically acquired mutations in the DLBCL C P<10-6 genome, which can be classified by their genomic 30% location as intergenic, intronic, potential regula- Transition tory, or transcribed regions (Left). (Right) Break- Proportion Transversion of alterations 20% down of different mutation types observed in the in transcribed regions. (C) Histogram depicts the mu- DLBCL Genome 10% tation profile of DLBCL. The proportion of muta- 0% tions in each of the six mutational classes is shown. G A A G T G G T T A C G Transitions represent the majority of the somati- − C T T C A C C A A T G C cally acquired mutations (P < 10 6). We performed whole-exome sequencing for all of these We empirically explored the power of our study to identify DLBCL and paired normal cases using the Agilent solution– novel genetic variants by plotting the expected number of new based system of exon capture, which targets the NCBI Consensus variants that would be discovered from each additional case of CDS database (CCDS) (16). We generated more than 500 Gb of DLBCL (SI Appendix). We found a progressively smaller number mappable sequence data and generated sequence data for 94% of unique variants by sequencing each additional sample (Fig. (median) of the targeted bases in each sample. Our average exome 2C).