The Open Chromatin Landscape of Non–Small Cell Lung Carcinoma
Published OnlineFirst June 17, 2019; DOI: 10.1158/0008-5472.CAN-18-3663
Cancer Genome and Epigenome Research

Zhoufeng Wang1, Kailing Tu2, Lin Xia2, Kai Luo2, Wenxin Luo1, Jie Tang2, Keying Lu2, Xinlei Hu2, Yijing He2, Wenliang Qiao3, Yongzhao Zhou1, Jun Zhang2, Feng Cao2, Shuiping Dai1, Panwen Tian1, Ye Wang1, Lunxu Liu4, Guowei Che4, Qinghua Zhou3, Dan Xie2, and Weimin Li1

1Department of Respiratory and Critical Care Medicine, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan, China. 2National Frontier Center of Disease Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan, China. 3Lung Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, China. 4Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

Z. Wang, K. Tu, L. Xia, K. Luo, and W. Luo contributed equally to this article.

Corresponding Authors: Dan Xie, West China Hospital of Sichuan University, Chengdu, Sichuan 610000, China. Phone: 136-9346-2346; Fax: 028-85164165; E-mail: [email protected]; and Weimin Li, Phone: 189-8060-1009; E-mail: [email protected]

Cancer Res 2019;79:4840–54
©2019 American Association for Cancer Research.

Abstract

Non–small cell lung carcinoma (NSCLC) is a major cancer type whose epigenetic alteration remains unclear. We analyzed open chromatin data with matched whole-genome sequencing and RNA-seq data of 50 primary NSCLC cases. We observed high interpatient heterogeneity of open chromatin profiles, and the degree of heterogeneity correlated with several clinical parameters. Lung adenocarcinoma and lung squamous cell carcinoma (LUSC) exhibited distinct open chromatin patterns. Beyond this, we uncovered that the broadest open chromatin peaks indicated key NSCLC genes and led to less stable expression. Furthermore, we found that the open chromatin peaks were gained or lost together with somatic copy number alterations and affected the expression of important NSCLC genes. In addition, we identified 21 joint-quantitative trait loci (joint-QTL) that correlated with both assay for transposase accessible chromatin sequencing peak intensity and gene expression levels. Finally, we identified 87 regulatory risk loci associated with lung cancer–related phenotypes by intersecting the QTLs with genome-wide association study significant loci. In summary, this compendium of multiomics data provides valuable insights and a resource to understand the landscape of open chromatin features and regulatory networks in NSCLC.

Significance: This study utilizes state-of-the-art genomic methods to differentiate lung cancer subtypes.

See related commentary by Bowcock, p. 4808

Introduction

Lung cancer is one of the leading causes of cancer-related death worldwide (1). Non–small cell lung carcinoma (NSCLC) accounts for approximately 85% of lung cancer cases, with lung adenocarcinoma and lung squamous cell carcinoma (LUSC) being the two major histologic types (2). Recent genome sequencing efforts have identified millions of somatic mutations in NSCLC (3), which further led to the discovery of "driver mutations" in key oncogenes. While genetic factors account for only part of the interpersonal variability in NSCLC risk, the epigenetic contributions to this disease are becoming increasingly paramount (4). Epigenetic profiles such as DNA methylation (5), histone modifications (6), and noncoding RNA (7) have been characterized in NSCLC. Until now, however, the open chromatin landscape of NSCLC has remained undetermined.

Recently, the highly efficient assay for transposase accessible chromatin sequencing (ATAC-seq) approach (8) has successfully mapped genome-wide open chromatin patterns in multiple human cell types and provided valuable insights into the underlying regulatory mechanisms (9). Several works have profiled the open chromatin state in lymphocytic leukemia (10, 11). To date, a few studies have characterized the open chromatin state in primary NSCLC samples. A recent work published by Corces and colleagues cataloged the open chromatin states of 23 cancer types (12), including NSCLC. However, none has associated the open chromatin variations with genomic alterations among patients with NSCLC. Profiling open chromatin in primary NSCLC samples is known to be particularly difficult, partly because cancer tissues are confounded by cell-type heterogeneity (13). Recent work has shown that using negative cell isolation to deplete immune cells and fibroblasts could significantly increase the purity of cancer cells from primary tumor samples (14), enabling meaningful epigenomic analysis.

An integrative analysis combining whole-genome sequencing (WGS) and RNA-seq data with the open chromatin state from the same patients could potentially delineate the effects of genomic alterations on the gene regulatory network in NSCLC. Notably, we can explore the effects of genomic mutations and structural variations on open regulatory elements and understand how they are associated with the NSCLC transcriptome. In this study, we generated matched ATAC-seq, WGS, and RNA-seq data from 50 primary NSCLC cases. The comprehensive open chromatin landscape of NSCLC provided important resources that allowed us to identify key regulatory elements in this disease. The integration of multiple-omics datasets revealed novel insights into the gene regulatory mechanism of NSCLC.

Materials and Methods

Patients and clinical information

Patients with NSCLC were staged according to the American Joint Committee on Cancer version 6 and initially diagnosed with lung cancer at West China Hospital of Sichuan University (Chengdu, China) from September 2016 to December 2018. Information including patients' age, gender, ethnicity, pathology, and tumor stage was collected for these 51 patients (Supplementary Table S1). All of them received surgical treatment, and none of them underwent neoadjuvant therapy before surgery. Tumors and matched distal normal lung tissues were obtained during surgery. All samples were evaluated by two pathologists to determine the pathologic diagnosis and tumor cellularity. Only tumor tissues containing at least 80% tumor cells were included. This study was approved by the Institutional Review Board of West China Hospital of Sichuan University (Chengdu, China; project identification code: 2017.114) and all patients provided written informed consent.

ATAC-seq library preparation and sequencing

To profile open chromatin, ATAC-seq was performed as described previously (8). A total of 1 × 10^5 cells were washed once with PBS and pelleted by centrifugation using the previous settings. Cell pellets were resuspended in 50 µL of lysis buffer, and nuclei were pelleted by centrifugation for 10 minutes at 500 × g, 4°C. The supernatant was discarded, and nuclei were resuspended in 50 µL reaction buffer containing 2.5 µL of Tn5 transposase and 22.5 µL of TD buffer (Nextera Illumina). The reaction was incubated at 37°C for 30 minutes. Tagmented DNA was isolated by MinElute PCR Purification Kit (Qiagen). Libraries were amplified for 10 cycles and purified using [...]

[...] to the instructions of the manufacturer. The RNA quality was assessed on the Bioanalyzer 2100 DNA Chip 7500 (Agilent Technologies), and samples with an RNA integrity number of over 7 were further analyzed by RNA-seq. RNA sequencing libraries were generated from the rRNA-depleted RNA using the NEBNext Ultra Directional RNA Library Prep Kit for Illumina following the manufacturer's recommendations. The products were purified (AMPure XP system) and library quality was assessed on the Agilent Bioanalyzer 2100 system. The libraries were sequenced on an Illumina HiSeq 4000 platform, and 150 bp paired-end reads were generated.

WGS data processing

Raw paired-end WGS reads were subjected to adapter and low-quality sequence trimming using Trimmomatic (version 0.36; ref. 15) with default parameters. We mapped trimmed paired-end reads to human reference build hg19 using BWA mem (version 0.7.13-r1126; ref. 16). BAMs were sorted and indexed using SAMtools (version 1.3; ref. 17), and duplicates were marked using Picard (version 2.2.1; http://broadinstitute.github.io/picard.). The Genome Analysis Toolkit (GATK, version 3.6; ref. 18) was used for local realignment and base quality recalibration, processing tumor/normal pairs independently.

Germline mutation detection

GATK HaplotypeCaller (18) was used with default parameters to detect germline single-nucleotide variants (SNV) and indels. The known-sites files used for germline SNV and indel calling were downloaded from ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/.

Somatic mutation detection

Somatic SNVs and indels were predicted using MuTect2 (19) and VarScan2 (20) with default parameters. Somatic SNVs and indels identified by VarScan2 (20) were retained if all of the following criteria were met: (i) P value of the reported somatic SNV 0.05; (ii) the natural frequency of the reported somatic SNV 5%;