Pan-Cancer Multi-Omics Analysis and Orthogonal Experimental Assessment of Epigenetic Driver Genes
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from genome.cshlp.org on October 9, 2021 - Published by Cold Spring Harbor Laboratory Press Resource Pan-cancer multi-omics analysis and orthogonal experimental assessment of epigenetic driver genes Andrea Halaburkova, Vincent Cahais, Alexei Novoloaca, Mariana Gomes da Silva Araujo, Rita Khoueiry,1 Akram Ghantous,1 and Zdenko Herceg1 Epigenetics Group, International Agency for Research on Cancer (IARC), 69008 Lyon, France The recent identification of recurrently mutated epigenetic regulator genes (ERGs) supports their critical role in tumorigen- esis. We conducted a pan-cancer analysis integrating (epi)genome, transcriptome, and DNA methylome alterations in a cu- rated list of 426 ERGs across 33 cancer types, comprising 10,845 tumor and 730 normal tissues. We found that, in addition to mutations, copy number alterations in ERGs were more frequent than previously anticipated and tightly linked to ex- pression aberrations. Novel bioinformatics approaches, integrating the strengths of various driver prediction and multi- omics algorithms, and an orthogonal in vitro screen (CRISPR-Cas9) targeting all ERGs revealed genes with driver roles with- in and across malignancies and shared driver mechanisms operating across multiple cancer types and hallmarks. This is the largest and most comprehensive analysis thus far; it is also the first experimental effort to specifically identify ERG drivers (epidrivers) and characterize their deregulation and functional impact in oncogenic processes. [Supplemental material is available for this article.] Although it has long been known that human cancers harbor both gression, potentially acting as oncogenes or tumor suppressors genetic and epigenetic changes, with an intricate interplay be- (Plass et al. 2013; Vogelstein et al. 2013). Although several distinct tween the two mechanisms underpinning the hallmarks of cancer definitions of “driver gene” exist in the literature (Sawan et al. (Hanahan and Weinberg 2011), it is only with the fruition of large- 2008; Vogelstein et al. 2013), we define “driver genes” as those scale international sequencing efforts that major enigmas of the genes that, when deregulated (through somatic mutations, copy cancer (epi)genome have started to be solved (Jones et al. 2016; number variations, or aberrant expression), assume primary im- Ng et al. 2018). One of the most remarkable findings of the inter- portance in tumor development such as conferring a selective national high-resolution cancer genome sequencing efforts, spear- growth advantage, immortalization, and invasiveness. This defini- headed by The Cancer Genome Atlas (TCGA), is the high tion relies on inference models for driver prediction and functional frequency of genetic alterations in the genes encoding proteins data (based on the impact of the gene on cellular processes) com- that directly regulate the epigenome (referred to here as epigenetic pared to other methods that are mostly based on statistical models regulator genes [ERGs]) (Gonzalez-Perez et al. 2013; Plass et al. (largely driven by the mutation frequency of a gene) (Parmigiani 2013; Shen and Laird 2013; Timp and Feinberg 2013; Vogelstein et al. 2009; Meyerson et al. 2010; Lahouel et al. 2020). In line et al. 2013; Yang et al. 2015). This high rate of ERG genetic dereg- with this physiological definition, we refer to those ERGs that ulation constitutes a “genetic smoking gun,” indicating that epige- make a net contribution to tumorigenesis as “epigenetic driver netic mechanisms lie at the very heart of cancer biology. These genes” (henceforth called “epidrivers”). Our definition is different discoveries have sparked a debate on the role of ERG deregulation from that used by other investigators (Vogelstein et al. 2013), who (either through mutational or nongenetic events) in ERG expres- define epidrivers as the genes (not necessarily among ERGs) that sion and in the mechanisms underlying tumorigenesis and epige- are aberrantly expressed through changes in DNA methylation nome alterations that are rampant in virtually all human and chromatin modifications and confer a selective growth malignancies (Plass et al. 2013; Timp and Feinberg 2013). We advantage. also still lack a systematic understanding of the functional impor- The products of ERGs are involved in processes such as DNA tance of ERG disruption in tumor development and progression, as methylation, histone modification, chromatin remodeling, and well as its impact on cancer cell phenotype. other chromatin-based modifications, and many ERGs may have ERGs are a group of more than 400 coding genes in the hu- both histone and nonhistone substrates. All of these processes, in man genome, most of which encode enzymes that add (“writers”), turn, are involved in the proper control of not only gene expression modify/revert (“editors”), or recognize (“readers”) epigenetic mod- programs, required for the establishment and maintenance of cell ifications (Plass et al. 2013; Vogelstein et al. 2013) controlling a identity and function, but also DNA repair, recombination, and ge- range of critical cellular processes. Based on the observation that nome integrity (Murr et al. 2006; Bell et al. 2011). Because common many ERGs are frequently disrupted across different malignancies, cancers represent the final outcome of a multistep process, epi- they are candidates to be drivers of cancer development and pro- driver-based disruption of cellular processes may not only assume a primary role at different stages of tumorigenesis but also constitute critical mechanisms underpinning cancer cell plasticity and 1These authors contributed equally to this work. Corresponding author: [email protected] Article published online before print. Article, supplemental material, and publi- © 2020 Halaburkova et al. This article, published in Genome Research, is avail- cation date are at http://www.genome.org/cgi/doi/10.1101/gr.268292.120. able under a Creative Commons License (Attribution 4.0 International), as de- Freely available online through the Genome Research Open Access option. scribed at http://creativecommons.org/licenses/by/4.0/. 30:1–16 Published by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/20; www.genome.org Genome Research 1 www.genome.org Downloaded from genome.cshlp.org on October 9, 2021 - Published by Cold Spring Harbor Laboratory Press Halaburkova et al. emergence of cancer resilience. Here, we conducted a systematic and terations, exceeding 40% of samples in LGG and LUSC, respective- comprehensive pan-cancer investigation of (epi)genetics- and tran- ly. Several ERGs had the highest mutation frequency repeatedly in scriptome-based deregulation of all ERGs using in silico data cura- many cancer types, namely the KMT2C/D family (seven cancers), tion in clinical samples and characterization of the driver ARID1A (five cancers), BAP1 (three cancers), and ATRX (three can- potential by different computational tools. We also developed and cers) (Fig. 2E,G; Supplemental Table S3). A similar observation was tested a conceptual framework for experimental identification and made for deep CNAs in ERGs, namely BOP1 (four cancers), ATAD2 functional characterization of the mechanistically important epi- (four cancers), MECOM (three cancers), and PHF20L1 (three can- drivers that reshape the epigenome and contribute to cancer pheno- cers) (Fig. 2F,H). A larger percentage of ERG alterations was also ob- types. This framework builds on the latest knowledge of the cancer served when both deep and shallow CNAs were included (Fig. 2C, (epi)genome and genomic databases and includes powerful new ex- D; Supplemental Fig. S1C,D). Among the top ERGs altered by deep perimental models including state-of-the-art genome-editing CNAs, the majority showed amplifications, with the exception of screens, phenotyping, and functional genomics. HR, PHF11, and SETB2, which were commonly deleted in many cancer types (Fig. 2H). Frequently amplified ERGs often co-oc- curred in the same tumor sample in many cancer types; in partic- Results ular, the aforementioned pan-cancer recurrent genes BOP1, A four-stage strategy was used to identify and characterize ERGs ATAD2, and PHF20L1 highly co-occurred (Supplemental Fig. with cancer driver potential (Fig. 1A). We assembled a comprehen- S3A). These co-occurrences remained prominent even when the sive compendium of ERG genes by literature mining and manual analysis was focused only on deep amplifications/deletions across curation, resulting in a list of 426 genes coding for histone modifi- tumors (Supplemental Fig. S3B,C). Moreover, the family of TDRs ers, DNA methylation regulators, chromatin remodelers, helicases, (TDRKH, TDRD10, and TDRD5) highly co-occurred together. and other epigenetic entities (Fig. 1B; Supplemental Tables S1, S2). Generally little overlap was observed between the genes with a To identify the candidate epidrivers across different cancer types, high frequency of SNAs and those with a high frequency of – we first used comprehensive in silico data mining of genetic and CNAs (except for a few ERGs) (Fig. 2E H). RNA expression alterations of ERGs using data from TCGA (Fig. When ERGs were stratified by functional groups, similar total 1A). Collectively, the data encompassed 33 different cancer types proportions of genetic alterations were seen among ERG classes from 25 tissue types, with sequencing information from 10,845 tu- (Fig. 2I). DNA methylation writers and editors were characterized mor samples and 730 normal tissues, including a total of by a prominent proportion