Author Manuscript Published OnlineFirst on September 28, 2020; DOI: 10.1158/1078-0432.CCR-20-2854 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Clinical Outcome-Related Mutational Signatures Identified by Integrative Genomic Analysis in Nasopharyngeal Carcinoma

Wei Dai1,2*, Dittman Lai-Shun Chung1, Larry Ka-Yue Chow1, Valen Zhuoyou Yu1, Lisa Chan Lei1, Merrin Man-Long Leong1, Candy King-Chi Chan1, Josephine Mun-Yee Ko1, Maria Li Lung1*

1Department of Clinical Oncology, University of Hong Kong, Hong Kong (SAR), P. R. China
2University of Hong Kong-Shenzhen Hospital, Shenzhen, P. R. China

Running title: Clinical Outcome-Related Mutational Signatures in NPC

*Co-corresponding authors: Maria Li Lung, Wei Dai

MLL Address: Department of Clinical Oncology, The University of Hong Kong, Room L6-43, 6/F, Laboratory Block, Faculty of Medicine Building, 21 Sassoon Road, Pokfulam, Hong Kong Email: [email protected] Tel: (852) 3917 9783 Fax: (852) 2816 6279

WD Address: Department of Clinical Oncology, The University of Hong Kong, Room L10-56, 10/F, Laboratory Block, Faculty of Medicine Building, 21 Sassoon Road, Pokfulam, Hong Kong Email: [email protected] Tel: (852) 3917 6930 Fax: (852) 2816 6279

Conflict of interest

The authors declare no potential conflicts of interest.

Translational relevance

In NPC, the general consensus for treatment is to use radiotherapy (RT) alone for stage I disease, RT with or without concurrent chemotherapy for stage II disease, and chemoradiotherapy (CRT) for advanced-stage disease. However, 15-58% of the cases do not respond well to conventional treatment and thus have poor clinical outcomes. There is a need for biomarkers to assist our understanding of the molecular basis of disease pathogenesis and progression and to aid clinical management in NPC. We have systematically examined the mutational signatures and evaluated their prognostic values in association with clinical outcome. The mutational signature relevant to homologous recombination deficiency (BRCAness), which was previously unappreciated in NPC, was discovered. Importantly, independent prognostic values of the BRCAness signature and mismatch repair signature are now revealed. These data show the


Downloaded from clincancerres.aacrjournals.org on September 29, 2021. © 2020 American Association for Cancer Research.

clinical importance of DNA repair pathways in NPC and their potential as prognostic and predictive biomarkers for future clinical studies.

Abstract

Purpose: Investigation of the biological mechanisms underlying genetic alterations in cancer can assist the understanding of etiology and identify potential prognostic biomarkers.

Experimental design: We performed an integrative genomic analysis of a total of 731 NPC cases from five independent NPC cohorts to identify the genetic events associated with clinical outcomes.

Results: In addition to the known mutational signatures associated with aging, APOBEC and mismatch repair (MMR), a new signature for homologous recombination deficiency (BRCAness) was discovered in 64 of 216 (29.6%) cases in the discovery set, which includes three cohorts. This signature appears more frequently in recurrent and metastatic tumors and is significantly correlated with shorter overall survival in primary tumors. The independent prognostic values of the MMR and BRCAness signatures were revealed by multivariable Cox analysis after adjustment for clinical parameters and stratification by study. The cases with both signatures have a much worse clinical outcome than those without these signatures (hazard ratio (HR)=12.4, P=0.002). This correlation was confirmed in the validation set (HR=8.9, P=0.003). The BRCAness signature is highly associated with pathogenic germline or somatic BRCA2 alterations (7.8% vs. 0%, P=0.002). Targeted sequencing results from a prospective NPC cohort (N=402) show that the cases carrying BRCA2 germline rare variants are more likely to have poor overall survival and progression-free survival.

Conclusions: Our study highlights the importance of defects in the DNA repair machinery in NPC pathogenesis and their prognostic value for clinical implications. These signatures will be useful for patient stratification to evaluate conventional and new treatments for precision medicine in NPC.


Introduction

Nasopharyngeal carcinoma (NPC) is a complex disease involving Epstein-Barr virus (EBV) infection and genetic and environmental factors (1). To understand the molecular basis of NPC pathogenesis, we performed genomic studies characterizing the important genetic alterations using whole-exome sequencing (WES) in NPC tissues obtained from cases and xenografts (2,3). Together with the results from other NPC genomic studies, multiple critical genetic alterations, including inactivation of negative regulators in the NF-κB pathway, mutations in the PI3K/MAPK pathway, and alterations of epigenetic regulators, were discovered (2,4,5). Mutational signatures are the footprints of mutational processes relevant to endogenous and exogenous factors contributing to tumorigenesis (6). Analysis of mutational signatures helps to decipher the etiology of NPC. Three mutational signatures were previously reported in NPC (2,5): one relevant to aging, caused by an endogenous mutational process initiated by spontaneous deamination of 5-methylcytosine; one relevant to defective DNA mismatch repair (MMR); and one relevant to the APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like) family of cytidine deaminases. However, our understanding of the molecular basis of NPC is far from complete, as these individual studies typically profiled only between 50 and 110 cases. We interrogated genomic data and clinical information from four published NPC studies and one prospectively collected NPC cohort. By increasing the statistical power to distinguish infrequent driver events from passengers, we aim to identify additional genetic events driving tumorigenesis, gain insights into the mechanisms underlying genetic alterations, and systematically evaluate their prognostic value.

Materials & Methods

Patient cohorts and clinical information

In the discovery stage, we included Asian NPC patients from three cohorts.
The first NPC cohort was obtained from the sequence read archive (SRA) database (accession #SRP035573) from Singapore (SG cohort) and includes 56 Singapore NPC cases (4). Our earlier WES Hong Kong study includes 59 NPC cases (accession #SRA288429) (2), which, together with twelve additional cases (accession #SRP265671), make up the HK-1 cohort. The third NPC cohort (HK-2 cohort) includes 97 independent Hong Kong cases from dbGAP-NHGRI (accession #phs001244.v1.p1) (5). None of the patients with primary NPC had received radiotherapy or


chemotherapy. In total, 178 primary tumors and 38 recurrent or metastatic NPC were included. The male-to-female ratio is 3.7. The independent validation cohort, which includes 113 NPC patients, was obtained from another genomic study by Sun Yat-Sen University Cancer Center (SYSUCC, Guangzhou, China) (7) and was thus named the GZ cohort. To evaluate the prognostic value of BRCA2 germline pathogenic variants in association with clinical outcome, we examined a total of 420 NPC patients (HK-3) prospectively collected by the Area of Excellence of Hong Kong NPC Tissue Bank. After quality evaluation, 402 patients were included in the analysis for the HK-3 cohort. Following the REMARK recommendations (8), we include full details of clinical parameters in Table 1 and Supplementary Table S1 and their relationship to patient outcome in Tables 2 and 3. The study workflow is illustrated in Supplementary Figure S1. We included the Research Resource Identifiers (RRIDs) for the relevant tools and databases used in the genomic analysis.

Genomic data processing

For the discovery cohort, raw sequencing reads were aligned to the human genome reference (hg19) using Burrows-Wheeler Aligner and were processed according to GATK Best Practices recommendations (version 3.8, RRID:SCR_001876) (9). To combine the samples from different platforms, only the overlapping regions (~31Mb) captured by the WES kits for the three cohorts were considered. A total of 216 NPC cases were analyzed after removing eight cases that did not pass quality evaluation. The median coverage of tumor tissues for the HK-1, HK-2 and SG NPC cohorts is 80-fold, 100-fold and 79-fold, respectively. For the GZ cohort, the mutations were obtained directly from the previous study (7) as independent validation.

Identification of germline and somatic single nucleotide variants (SNVs) and indels

For the germline variants, to ensure data quality, two levels of quality control, variant-based and genotype-based, were applied to the called variants.
At the variant level, single nucleotide variants (SNVs) and indels were recalibrated and grouped separately with reference to the HapMap (RRID:SCR_002846) and 1000 Genomes known variants into multiple Variant Quality Score Recalibration (VQSR) sensitivity tranches using GATK (RRID:SCR_001876). A VQSR sensitivity tranche of 99.9% was chosen for SNVs and a tranche of 97.5% was selected for indels. After variant-based quality control, the transition-to-transversion (Ti/Tv) ratio of the resulting exonic known variants is 2.75. We further applied genotype-based quality control, in which genotypes with low genotype quality (GQ<20) were set to missing. Multi-allelic variants and variants with >10% missing genotypes were excluded. The relatedness of the cases was evaluated using the identity-by-descent (IBD) analysis in Plink (v1.90, RRID:SCR_001757) (10). The related samples with IBD score


(>0.25) were removed from the analysis. The somatic SNVs and INDELs were detected by MuTect (RRID:SCR_000559) (11). The SNVs and INDELs with at least five supporting reads and 5% allele frequency in the overlapping regions were included in the analysis. The somatic variants with minor allele frequency (MAF)>1% in the 1000 Genomes project (RRID:SCR_008801), NHLBI Grand Opportunity Exome Sequencing Project (ESP6500), ExAC database (RRID:SCR_004068) and our in-house database for 895 Southern Chinese (12) were removed.

Deciphering mutational signatures

The somatic mutations were converted to a matrix of the 96 possible mutated trinucleotides, and de novo extraction of mutational signatures from this matrix was performed using the non-negative matrix factorization algorithm in the R package NMF (13). The cophenetic correlation coefficient was used as an indicator of the stability and reproducibility of the factorization. The R package MutationalPatterns (14) was applied to deconvolute the mutational data against the 30 Catalogue Of Somatic Mutations In Cancer (COSMIC, RRID:SCR_002260) signatures, where the contribution of each COSMIC signature was estimated.

Targeted sequencing to identify germline rare variants at BRCA2 in the HK-3 cohort

The genomic DNA was extracted from the peripheral blood mononuclear cell (PBMC) fraction from 420 newly diagnosed NPC cases. Blood samples from the cases were prospectively collected by the Area of Excellence of Hong Kong NPC Tissue Bank. Only the cases with follow-up times longer than 24 months were included in the analysis. The study was approved by the Institutional Review Board of the University of Hong Kong (UW 12-239) and conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from each subject. The library preparation and sequencing data processing were performed as described (2).
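As an illustration of the 96-trinucleotide matrix described under "Deciphering mutational signatures", the following minimal Python sketch shows how a single-base substitution and its flanking bases can be mapped to one of the 96 channels. It is not the code used in the study, which relied on the R packages NMF and MutationalPatterns; the function names and the `A[C>T]G` channel notation are assumptions made here for illustration. It follows the common convention of reporting each substitution from the pyrimidine (C or T) strand.

```python
from collections import Counter
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Six substitution classes, written with a pyrimidine reference base.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def channels():
    """Return the 96 channel labels, e.g. 'A[C>T]G' (hypothetical notation)."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product("ACGT", repeat=2)
    ]


def classify(context, alt):
    """Map a reference trinucleotide and the mutated middle base to a channel."""
    context, alt = context.upper(), alt.upper()
    if context[1] in "AG":
        # Purine reference base: switch to the opposite strand so that the
        # mutated base is always reported as C or T.
        context = "".join(COMPLEMENT[base] for base in reversed(context))
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def count_channels(mutations):
    """Build one sample's 96-element count vector from (context, alt) pairs."""
    counts = Counter(classify(context, alt) for context, alt in mutations)
    return [counts.get(channel, 0) for channel in channels()]
```

Stacking one such vector per tumor gives the samples-by-96 matrix that a non-negative matrix factorization would then decompose into signatures and their per-sample contributions.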
The germline rare variants were identified as the loss-of-function mutations, including truncations and frameshift insertions and deletions (INDELs), as well as missense variants with minor allele frequency (MAF)<0.01 in the public database and our in-house database for 895 Southern Chinese (12). A total of 402 cases with good data quality (average coverage >30-fold) were included in the survival analysis.

Sample size and statistical power estimation for survival analysis

We performed the power estimation based on the method described before (15) (Supplementary Table S2). The discovery set includes the genomic data for a total of 178 primary tumors, of which 105 samples had adequate mutations for estimating the mutation signatures. The approximate statistical power for this set was estimated prior to analysis. We


expected to detect a hazard ratio (HR) of 2.9 for a genetic event occurring in 30% of the cases (N=105) with a statistical power of 80%. The GZ validation set consists of 113 primary tumors, of which 71 samples had adequate mutations for estimating the mutation signatures. We expected to have a statistical power over 90% for detecting an HR of 4.5 for a genetic event occurring in 30% of the cases (N=71). This high HR was achievable based on the result in the discovery cohort. For validation of the association between BRCA2 germline variants and clinical outcomes, we assumed that BRCA2 germline rare variants occur in 10% of the cases; a total of 417 cases is then required for detecting an HR of 2.6 with a statistical power of 90%.

Survival Analysis

The survival analysis was performed using IBM SPSS (v25, RRID:SCR_002865). The associations between overall survival and the clinical parameters and selected mutation signatures were examined by the univariate Cox model. The primary endpoint is overall survival (OS) and the secondary endpoint is disease progression-free survival (PFS), if available. PFS is defined as the time from diagnosis to progressive disease or early death due to NPC or other causes in the HK-1 and HK-3 cohorts. Only the patients with adequate mutations for deconvolution of the mutational signatures (>30 mutations in the protein-coding regions, i.e. approximately 1 mutation per Mb) were considered in the analysis. The proportional hazards assumption of the Cox model was examined using the R package survival. The mutation signatures and all the clinical parameters, including overall stage and sex, were used as categorical variables, except that age at diagnosis was used as a continuous variable. In the multivariable Cox analysis for the discovery set, in addition to the clinical parameters, the study cohort (across the three genomic studies) was used as a stratification factor. The significance level was set at P < 0.05.
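To give a feel for how a target hazard ratio, the frequency of the genetic event and the desired power interact in the power estimation above, the sketch below uses Schoenfeld's approximation for the number of events (deaths) needed in a two-group Cox or log-rank comparison. This is an assumption made for illustration only: the manuscript states that power was estimated with the method of reference (15) and does not say that this formula was used. The function name and defaults are hypothetical, and the result is a number of events, not a number of enrolled patients.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, prevalence, alpha=0.05, power=0.80):
    """Approximate number of events needed to detect `hazard_ratio`.

    `prevalence` is the fraction of cases carrying the genetic event,
    `alpha` is the two-sided significance level and `power` is 1 - beta.
    Schoenfeld's approximation:
        d = (z_{1-alpha/2} + z_{power})^2 / (p * (1 - p) * ln(HR)^2)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    return (z_alpha + z_power) ** 2 / (prevalence * (1 - prevalence) * log_hr ** 2)


# Scenario resembling the discovery set: HR of 2.9 for an event seen in 30% of cases.
events_needed = schoenfeld_events(hazard_ratio=2.9, prevalence=0.30, power=0.80)
```

Under this approximation, larger hazard ratios and event frequencies closer to 50% reduce the number of deaths that must be observed, which is why a rarer variant (for example 10% of cases) requires a considerably larger cohort to detect a smaller HR.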
Detection of copy number alterations and consensus clustering

The somatic copy number alterations (SCNAs) were detected by Aberration Detection in Tumor Exome (ADTEx) using the matched normal-tumor pairs, as we previously described (16). This method is tailored to infer SCNAs from WES data (17). Out of the overlapping regions captured by the three studies, we randomly selected 2000 regions to perform the unsupervised hierarchical clustering, and repeated this 1000 times. We considered the samples stably clustered together in 60% of the random sampling runs as one group; otherwise the samples were considered unclassified. Three distinct clusters were identified as SCNA-H-gain, SCNA-M-gain and SCNA-L-gain. To evaluate the stability of the clusters, we permuted the data for both sample and region labels and tested the probability of the samples clustering


together by chance through the same unsupervised hierarchical clustering procedure. From 100 permutations, the estimated probability of the three groups clustering randomly is P<0.01, P<0.01 and P=0.05 for the SCNA-H-gain, SCNA-M-gain and SCNA-L-gain clusters, respectively. The median CNs for the overlapping regions were calculated for the three clusters and illustrated on the chromosome ideograms by Phenogram (18).

Methylation data analysis

To evaluate the difference in host and EBV methylation between the signature-positive and -negative groups, we compared the methylome data between the two groups, using Illumina HumanMethylation450 BeadChip data analyzed by LIMMA (19) for the host genome and bisulfite sequencing data analyzed by the Mann-Whitney U test for the EBV genome. The host methylome data were obtained from our previous study in NPC (Gene Expression Omnibus (GEO), RRID:SCR_005012, accession #GSE62336) (20). Out of the 25 cases available from the previous study, 24 cases have matched WES data in the HK-1 cohort. To examine the promoter methylation at BRCA1, BRCA2 and other selected DNA damage and repair relevant genes, we combined our methylome data for 25 NPC cases (GEO accession #GSE62336) (20) with another publicly available methylome dataset for 24 NPC cases (GEO accession #GSE52068) (21). All the data were generated using the Illumina HumanMethylation450 BeadChip. The average methylation level for the multiple CpG sites at the promoter CGIs was calculated using the normalized β value: β = M/(M + U + 100), where M and U are the signals of the methylated and unmethylated probes, respectively. β values range from 0 (unmethylated) to 1 (fully methylated). The average promoter methylation level of the selected genes was estimated for comparison between groups.
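The β value calculation above can be written out in a few lines. The Python sketch below is illustrative only: the function names and the example probe intensities are hypothetical, and the study itself processed the BeadChip data with dedicated array software. It computes β = M/(M + U + 100) per CpG probe and then averages over the probes in one promoter CpG island.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Normalized methylation level for one CpG probe: M / (M + U + offset)."""
    return methylated / (methylated + unmethylated + offset)


def mean_promoter_beta(probe_signals):
    """Average beta over the (M, U) signal pairs of one promoter CpG island."""
    betas = [beta_value(m, u) for m, u in probe_signals]
    return sum(betas) / len(betas)


# Hypothetical (M, U) intensities for three probes in one promoter.
example_promoter = [(300.0, 4600.0), (150.0, 5250.0), (500.0, 4400.0)]
example_level = mean_promoter_beta(example_promoter)
```

The constant offset of 100 stabilizes the ratio when both probe intensities are low; it also means that β approaches, but never exactly reaches, 1 for a fully methylated site.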
Results

Mutation signature relevant to homologous recombination deficiency (BRCAness) identified in NPC

Low heterogeneity of clinical characteristics was observed among the three primary NPC cohorts in the discovery set, with Higgins and Thompson’s I² < 5% (Table 1). To identify the mutational signatures, we applied non-negative matrix factorization (NMF) (13). A sharp decrease of the cophenetic correlation coefficient was observed at r=5, indicating that substantially less stability was achieved using more than four clusters; the mutation data can therefore be grouped into four robust signatures (Supplementary Figure S2). Besides the known age-related, MMR and APOBEC signatures (i.e. signatures 1, 6, 13) previously reported in NPC (2,5), an additional signature was discovered (Figures 1A and B). It corresponds to COSMIC signature 3, associated with failure of DNA double-strand break-repair (DSBR) by homologous


recombination (6). To rule out the possibility that this signature was random, we deconvoluted the mutational data against the COSMIC signatures; the results show that signature 3 was indeed present in a subset of NPC cases (Supplementary Figure S3). In total, signature 3 was detectable in 29.6% of the total cases (Figure 1C). No obvious difference in age, gender or overall stage was found between the signature-positive and -negative groups in primary NPC (Supplementary Table S3). The MMR signature was found in 75.4% of the cases. Both signatures were detectable in 22.7% of the total cases (N=216).

Signature 3 and MMR signature are independent prognostic factors in NPC

Signature 3 is detectable in 31.6% of recurrent or metastatic tumors from the two Hong Kong (HK) patient cohorts. This frequency is slightly higher than that in the primary tumors from the same cohorts, with marginal significance (31.6% vs. 19.5%, Chi-square test P = 0.058). In the discovery cohort, we explored the association between the mutation signatures in the primary tumors and OS. The cases with detectable signature 3 have shorter OS than those without this signature (hazard ratio (HR)=2.8, 95% CI 1.4-5.8, P=0.005, Table 2 and Figure 2A). The same trend was observed in the three different study cohorts (Figures 2B-D). Among the clinical parameters, age at diagnosis and overall stage are significantly associated with survival. Although the association of the MMR signature with survival is not significant in the univariate analysis, after adjustment for age, stage and gender and stratification by study in the multivariable Cox analysis, independent prognostic values of signature 3 and the MMR signature were discovered (Table 2). Therefore, we categorized the cases based on the signature status. Group 1 comprises the cases with neither signature. Group 2 comprises the MMR-positive and signature 3-negative cases. Group 3 includes the MMR-negative and signature 3-positive cases, while Group 4 comprises the cases carrying both signatures.
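The four-group categorization just described amounts to a simple lookup on the two signature calls. The short Python sketch below makes the mapping explicit; the function name and the boolean inputs are hypothetical, and how a signature is called "detectable" in a given tumor is outside the scope of this illustration.

```python
from collections import Counter


def signature_group(mmr_positive, sig3_positive):
    """Return the group number (1 to 4) for one case."""
    if mmr_positive and sig3_positive:
        return 4  # both signatures
    if sig3_positive:
        return 3  # MMR-negative, signature 3-positive
    if mmr_positive:
        return 2  # MMR-positive, signature 3-negative
    return 1      # neither signature


# Hypothetical signature calls (MMR, signature 3) for five cases.
calls = [(True, False), (True, True), (False, False), (False, True), (True, False)]
group_sizes = Counter(signature_group(mmr, sig3) for mmr, sig3 in calls)
```

In a Cox model, the resulting group label would enter as a categorical covariate with Group 1 as the reference level, so that each hazard ratio compares a group against the cases with neither signature.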
Clinical outcomes of the four groups differ significantly, and the cases with both signatures have a strikingly higher risk of death compared to those without the signatures (adjusted HR=12.4, 95% CI 2.6-60.6, P=0.002, Figure 2E). We further examined the association between these mutational signatures and PFS in our study cohort (HK-1). The result suggests that the cases with both signature 3 and the MMR signature have shorter PFS (log-rank test P=0.011, Supplementary Figure S4). This association was further validated in the GZ cohort, and its prognostic value is independent of overall stage (adjusted HR=8.9, 95% CI 2.1-38, P=0.003, Table 2 and Figure 2F).

Signature 3 is highly associated with BRCA2 pathogenic alterations

Signature 3 is often attributed to pathogenic germline and somatic mutations of BRCA1 and BRCA2, as reported in breast, ovarian and pancreatic cancers (22-24). We examined the germline


and somatic alterations of BRCA1 and BRCA2 in this combined NPC cohort. No germline pathogenic mutation was detected at BRCA1. Interestingly, the cases carrying signature 3 are more likely to have pathogenic germline or somatic alterations at BRCA2, while no BRCA2 pathogenic alteration was detected in the cases without this signature (7.8% vs. 0%, Fisher’s exact test P=0.002, Supplementary Table S4). Although we and others reported the NPC hypermethylation phenotype (20,25), examination of promoter methylation at BRCA1 and BRCA2 in a total of 49 NPC cases assayed by Illumina HumanMethylation450 BeadChip showed that these two genes are unlikely to be inactivated by promoter methylation in NPC (Supplementary Figure S5), and no methylation difference was found at these two genes between the signature-positive and -negative cases (Supplementary Figure S6). Polak et al. reported that promoter methylation at RAD51C is relevant to this signature (22). To explore genetic lesions in other candidate genes, we examined the WES and methylome data for germline and somatic alterations leading to inactivation of other important genes in the DNA damage pathway, including RAD51C, RAD51, RAD50, PALB2, BARD1, BRIP1, NBN, ATM, ATR, CHEK2, FANCA and FANCM (22). Overall, there is no evidence that promoter methylation or mutations of any of these genes are relevant to this signature in NPC (Supplementary Table S5, Figures S5 and S6).

BRCA2 germline rare variants are associated with OS and PFS in the HK-3 cohort

Our integrative analysis emphasizes the importance of BRCA2 genetic alterations in association with the BRCAness signature. We further examined BRCA2 germline variants by a targeted sequencing approach. We identified 27 BRCA2 germline rare variants in 57 cases, accounting for 14.1% of the total cases in the HK-3 NPC cohort (Supplementary Table S6).
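The Fisher's exact comparison of BRCA2 alterations between signature-positive and -negative cases reported above (7.8% vs. 0%) can be illustrated with a small two-sided test written from the hypergeometric distribution. The 2x2 counts used below are an assumption: they are inferred from the reported percentages (64 signature-positive cases out of 216, with 7.8% of 64 being about 5) and are not taken from Supplementary Table S4. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)  # tolerance for float ties
    )


# Assumed counts: rows are signature 3-positive / -negative cases,
# columns are with / without a pathogenic BRCA2 alteration.
p_value = fisher_exact_two_sided(5, 59, 0, 152)
```

With these assumed counts the test gives a P value of about 0.002, in line with the reported figure, although the actual table in the supplement may differ.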
The germline rare variants have no association with age or stage, while the female cases are more likely to carry these germline rare variants than the male cases (20.4% vs. 11.9%, P=0.043, Supplementary Table S7). The univariate analysis shows that the BRCA2 germline rare variants are significantly correlated with both OS (HR=2.0, 95% CI 1.1-3.9, P=0.034) and PFS (HR=2.0, 95% CI 1.1-3.6, P=0.027). This prognostic value remains after adjustment for age, gender and overall stage (P<0.05, Table 3).

Signature 3 and MMR signature associate with genomic instability

Somatic copy number alterations (SCNAs) are frequently reported in NPC (26,27). Given that the homologous recombination pathway plays an important role in governing genomic integrity, we hypothesized that the cases with the BRCAness signature have elevated levels of SCNAs. In the combined cohort, three clusters with distinct SCNA patterns were uncovered (Figure 3A). Cluster 1 (SCNA-L-gain), accounting for 67.8% of cases, is the genomically stable group


and it has consistent CN loss in chromosomes 3, 14 and 16, which are typical SCNAs in NPC. Both Cluster 2 (SCNA-M-gain, 9.2%) and Cluster 3 (SCNA-H-gain, 23%) have relatively higher genomic instability (Figure 3B). Cluster 2 is characterized by a moderate level of CN gain involving chromosomes 17, 19 and 22 and is associated with CCND1 amplifications (Figure 3C). Strikingly, Cluster 3 (SCNA-H-gain) has extensive CN gain across the genome with regional loss near the telomeres of several chromosomes. For example, CN gains in large regions involving both the short and long arms of chromosome 12 are found in this cluster (Figure 3B). Significant elevation of signature 3 activity was found in Cluster 3 (SCNA-H-gain) (Figure 3C), while higher MMR signature activity was found in Cluster 2 (SCNA-M-gain). This result supports our hypothesis that high chromosome instability is relevant to the BRCAness signature. Previously, we and others reported that inactivation of multiple negative regulators in the NF-κB pathway is a driver event in NPC pathogenesis (2,5). The cases in the SCNA-H-gain cluster are mutually exclusive with the cases carrying somatic inactivating alterations in the NF-κB pathway (P=0.018, Figure 3C), suggesting high genomic instability as an independent mechanism underlying etiology in a subset of NPC cases.

Distinct host and EBV methylation in NPC with signature 3

The host methylation profiles of the signature-positive and -negative NPC were compared using the Illumina HumanMethylation450 BeadChip. Differential methylation analysis showed that the signature-positive cases have a unique methylation pattern compared to the signature-negative cases (adjusted P <0.05) (Supplementary Figure S7A and Table S8). The EBV methylation profiles of the two groups were investigated by bisulfite sequencing.
In general, the EBV genome is highly methylated, with several unmethylated regions corresponding to oriP, Qp, and the promoters of RPMS1 and LMP-2A, which are important for regulating expression of EBV latent genes (28). Surprisingly, the signature-positive cases have characteristic hypomethylated regions at the LMP-1 promoters and the LMP-2A gene body (Supplementary Figures S7B-C). RNA sequencing data suggest there is no statistically significant difference in LMP-1 expression between the two groups. Previously, LMP-1 staining was performed in the HK-2 cohort; among 24 cases with positive LMP-1 protein expression, there were only six signature 3 cases, confirming no association between LMP-1 and this signature. However, a significantly reduced ratio of LMP-1-to-LMP-2A expression was detected in the signature-positive cases (P=0.017, Supplementary Figure S7D).

Discussion


In the retrospective analysis for the discovery stage, WES data revealed an etiologically distinct subset of NPC with a new mutational signature relevant to homologous recombination deficiency. The accuracy of calling the somatic mutations and germline variants was systematically evaluated by us previously (2,12). Although the BRCAness signature is frequently reported in breast, ovarian and pancreatic cancers (6,29,30), it has been underappreciated in NPC. Because it is only present in a subset of the NPC cases, previous genomic analyses may have missed this signature due to limited sample sizes. To confirm that this new signature is present in NPC, we used two validation strategies: (1) deconvolution of the mutations against the known COSMIC mutational signatures to evaluate the correlation between the de novo extracted and known signatures; (2) analysis of additional genomic features, including germline variants, copy number alterations and methylation, to determine the presence of this signature. Our results show that this signature is highly associated with BRCA2 germline and somatic alterations and that increased signature activity is associated with high genomic instability and distinct methylation patterns, further supporting that this signature is indeed present in a subset of NPC and has an important functional impact on the molecular profiles during NPC pathogenesis and disease progression. DNA damage repair plays an essential role in maintaining genomic stability and is a key factor determining cancer risk, disease progression and therapeutic response (31). Previously, we reported that multiple single nucleotide polymorphisms (SNPs) involved in homologous recombination repair and non-homologous end joining (NHEJ) repair are significantly associated with NPC risk (32).
Here, we further demonstrate that the somatic and germline genetic alterations relevant to homologous recombination deficiency have a significant clinical impact on patients’ outcomes in NPC. In breast and ovarian cancers, patients with homologous recombination deficiency are generally sensitive to ionizing radiation, as well as to crosslinking agents such as cisplatin and carboplatin that can induce double-strand breaks (33,34), and thus have a good clinical outcome. In contrast, in NPC the BRCAness signature is associated with a worse clinical outcome, and this trend was consistently found in four independent NPC study cohorts. Furthermore, in a prospective cohort the patients carrying germline BRCA2 rare variants also have poor OS and PFS. A similar trend of homologous recombination deficiency correlating with worse clinical outcome was recently reported in esophageal and prostate cancers by a genomic and molecular landscape study (35). One of many possible explanations for this observation is that accumulated genetic and epigenetic alterations caused by

11

Downloaded from clincancerres.aacrjournals.org on September 29, 2021. © 2020 American Association for Cancer Research. Author Manuscript Published OnlineFirst on September 28, 2020; DOI: 10.1158/1078-0432.CCR-20-2854 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

dysfunctional DNA repair pathways may contribute to the aggressive phenotype and resistance to the conventional treatment in NPC. In Hong Kong, almost all NPC patients are EBV-positive (36). Recently we reported that EBV infection modifies bivalent histone marks in the genes involved in multiple DNA repair pathways including base-excision repair, homologous recombination and MMR, leading to their transcription suppression and higher DNA damage in EBV-positive cells (37). In this study, the risks of death and disease progression for the cases carrying both BRCAness and MMR signatures are dramatically increased compared to those without these signatures. We speculate that the cumulative and suppressive effect in multiple genes involved in homologous recombination repair and MMR by EBV infection partially contributes to the relevant mutational signatures and their association with poor survival in NPC. Interestingly, hypomethylation of selected EBV regions and imbalanced expression of LMP-1 and LMP- 2A was observed in the cases carrying the BRCAness signature. These results suggest the potential link between specific EBV latent gene expression and dysfunctional DNA repair pathway through homologous recombination in a subset of NPC patients. In addition, Zhu et al. reported that the EBV microRNAs miR-15a and miR-16 target BRCA1, another key player in the homologous recombination repair pathway, and suppress its expression in NPC (38). This study suggests EBV microRNAs may also have a role to tune down this pathway. All these results raise the possibility that dysregulation of multiple epigenetic mechanisms by EBV may work collectively to suppress the relevant genes and impair the function of this pathway, contributing to the BRCAness signature. 
Systematic investigation, in a larger cohort, of the epigenetic changes that regulate the expression of genes involved in homologous recombination, including not only methylation but also histone modifications, microRNAs and EBV-host interactions, will help to elucidate the detailed mechanisms underlying this phenomenon.
In conclusion, we utilized genomic data together with methylome and RNA-seq data to characterize the molecular profiles in NPC. Our study highlights the importance of the DNA repair pathways involved in homologous recombination repair and MMR in NPC molecular pathogenesis and their clinical prognostic value. Identification of the mutational signatures relevant to DNA repair pathways will be helpful for patient stratification and provides evidence for evaluating conventional treatment and new therapies targeting these pathways in selected NPC patients.

Acknowledgements

We acknowledge the authors who contributed to the WES data from previous genomic studies. The public data were obtained from the SRA database (accession #SRP035573 and #SRA288429) and the dbGAP-NHGRI database (accession #phs001244.v1.p1). This study was funded by the Hong Kong Research Grants Council grant (AoE/M-06/08) to ML Lung, and by the General Research Fund (17103218) from the Hong Kong Research Grants Council and the seed fund for basic research (201611159158) from the University of Hong Kong to W Dai.

References
1. Lung ML. Unlocking the Rosetta Stone enigma for nasopharyngeal carcinoma: genetics, viral infection, and epidemiological factors. Semin Cancer Biol 2012;22(2):77-8 doi 10.1016/j.semcancer.2012.01.006.
2. Zheng H, Dai W, Cheung AK, Ko JM, Kan R, Wong BW, et al. Whole-exome sequencing identifies multiple loss-of-function mutations of NF-kappaB pathway regulators in nasopharyngeal carcinoma. Proc Natl Acad Sci U S A 2016;113(40):11283-8 doi 10.1073/pnas.1607606113.
3. Lin W, Yip YL, Jia L, Deng W, Zheng H, Dai W, et al. Establishment and characterization of new tumor xenografts and cancer cell lines from EBV-positive nasopharyngeal carcinoma. Nat Commun 2018;9(1):4663 doi 10.1038/s41467-018-06889-5.
4. Lin DC, Meng X, Hazawa M, Nagata Y, Varela AM, Xu L, et al. The genomic landscape of nasopharyngeal carcinoma. Nat Genet 2014;46(8):866-71 doi 10.1038/ng.3006.
5. Li YY, Chung GT, Lui VW, To KF, Ma BB, Chow C, et al. Exome and genome sequencing of nasopharynx cancer identifies NF-kappaB pathway activating mutations. Nat Commun 2017;8:14121 doi 10.1038/ncomms14121.
6. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, et al. Signatures of mutational processes in human cancer. Nature 2013;500(7463):415-21 doi 10.1038/nature12477.
7. Zhang L, MacIsaac KD, Zhou T, Huang PY, Xin C, Dobson JR, et al. Genomic Analysis of Nasopharyngeal Carcinoma Reveals TME-Based Subtypes. Mol Cancer Res 2017;15(12):1722-32 doi 10.1158/1541-7786.MCR-17-0134.
8. McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM, et al. Reporting recommendations for tumor marker prognostic studies. J Clin Oncol 2005;23(36):9067-72 doi 10.1200/JCO.2004.01.0454.
9. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 2011;43(5):491-8 doi 10.1038/ng.806.
10. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 2015;4:7 doi 10.1186/s13742-015-0047-8.
11. Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 2013;31(3):213-9 doi 10.1038/nbt.2514.
12. Dai W, Zheng H, Cheung AK, Tang CS, Ko JM, Wong BW, et al. Whole-exome sequencing identifies MST1R as a genetic susceptibility gene in nasopharyngeal carcinoma. Proc Natl Acad Sci U S A 2016;113(12):3317-22 doi 10.1073/pnas.1523436113.

13. Gaujoux R, Seoighe C. A flexible R package for nonnegative matrix factorization. BMC Bioinformatics 2010;11:367 doi 10.1186/1471-2105-11-367.
14. Blokzijl F, Janssen R, van Boxtel R, Cuppen E. MutationalPatterns: comprehensive genome-wide analysis of mutational processes. Genome Med 2018;10(1):33 doi 10.1186/s13073-018-0539-0.
15. Schmoor C, Sauerbrei W, Schumacher M. Sample size considerations for the evaluation of prognostic factors in survival analysis. Stat Med 2000;19:441-52.
16. Dai W, Ko JMY, Choi SSA, Yu Z, Ning L, Zheng H, et al. Whole-exome sequencing reveals critical genes underlying metastasis in oesophageal . J Pathol 2017;242(4):500-10 doi 10.1002/path.4925.
17. Amarasinghe KC, Li J, Hunter SM, Ryland GL, Cowin PA, Campbell IG, et al. Inferring copy number and genotype in tumour exome data. BMC Genomics 2014;15:732 doi 10.1186/1471-2164-15-732.
18. Wolfe D, Dudek S, Ritchie MD, Pendergrass SA. Visualizing genomic information across chromosomes with PhenoGram. BioData Min 2013;6(1):18 doi 10.1186/1756-0381-6-18.
19. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43(7):e47 doi 10.1093/nar/gkv007.
20. Dai W, Cheung AK, Ko JM, Cheng Y, Zheng H, Ngan RK, et al. Comparative methylome analysis in solid tumors reveals aberrant methylation at chromosome 6p in nasopharyngeal carcinoma. Cancer Med 2015;4(7):1079-90 doi 10.1002/cam4.451.
21. Jiang W, Liu N, Chen XZ, Sun Y, Li B, Ren XY, et al. Genome-Wide Identification of a Methylation Gene Panel as a Prognostic Biomarker in Nasopharyngeal Carcinoma. Mol Cancer Ther 2015;14(12):2864-73 doi 10.1158/1535-7163.MCT-15-0260.
22. Polak P, Kim J, Braunstein LZ, Karlic R, Haradhavala NJ, Tiao G, et al. A mutational signature reveals alterations underlying deficient homologous recombination repair in . Nat Genet 2017;49(10):1476-86 doi 10.1038/ng.3934.
23. Vanderstichele A, Busschaert P, Olbrecht S, Lambrechts D, Vergote I. Genomic signatures as predictive biomarkers of homologous recombination deficiency in ovarian cancer. Eur J Cancer 2017;86:5-14 doi 10.1016/j.ejca.2017.08.029.
24. Connor AA, Denroche RE, Jang GH, Timms L, Kalimuthu SN, Selander I, et al. Association of Distinct Mutational Signatures With Correlates of Increased Immune Activity in Pancreatic Ductal Adenocarcinoma. JAMA Oncol 2017;3(6):774-83 doi 10.1001/jamaoncol.2016.3916.
25. Li L, Zhang Y, Fan Y, Sun K, Su X, Du Z, et al. Characterization of the nasopharyngeal carcinoma methylome identifies aberrant disruption of key signaling pathways and methylated tumor suppressor genes. Epigenomics 2015;7(2):155-73 doi 10.2217/epi.14.79.
26. Hui AB, Lo KW, Leung SF, Teo P, Fung MK, To KF, et al. Detection of recurrent chromosomal gains and losses in primary nasopharyngeal carcinoma by comparative genomic hybridisation. Int J Cancer 1999;82(4):498-503 doi 10.1002/(sici)1097-0215(19990812)82:4<498::aid-ijc5>3.0.co;2-s.
27. Lo KW, Teo PM, Hui AB, To KF, Tsang YS, Chan SY, et al. High resolution allelotype of microdissected primary nasopharyngeal carcinoma. Cancer Res 2000;60(13):3348-53.
28. Woellmer A, Hammerschmidt W. Epstein-Barr virus and host cell methylation: regulation of latency, replication and virus reactivation. Curr Opin Virol 2013;3(3):260-5 doi 10.1016/j.coviro.2013.03.005.

29. Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 2016;534(7605):47-54 doi 10.1038/nature17676.
30. Waddell N, Pajic M, Patch AM, Chang DK, Kassahn KS, Bailey P, et al. Whole genomes redefine the mutational landscape of pancreatic cancer. Nature 2015;518(7540):495-501 doi 10.1038/nature14169.
31. Jeggo PA, Pearl LH, Carr AM. DNA repair, genome stability and cancer: a historical perspective. Nat Rev Cancer 2016;16(1):35-42 doi 10.1038/nrc.2015.4.
32. Yee Ko JM, Dai W, Wun Wong EH, Kwong D, Tong Ng W, Lee A, et al. Multigene pathway-based analyses identify nasopharyngeal carcinoma risk associations for cumulative adverse effects of TERT-CLPTM1L and DNA double-strand breaks repair. Int J Cancer 2014;135(7):1634-45 doi 10.1002/ijc.28802.
33. Abbott DW, Freeman ML, Holt JT. Double-strand break repair deficiency and radiation sensitivity in BRCA2 mutant cancer cells. J Natl Cancer Inst 1998;90(13):978-85 doi 10.1093/jnci/90.13.978.
34. Konstantinopoulos PA, Ceccaldi R, Shapiro GI, D'Andrea AD. Homologous Recombination Deficiency: Exploiting the Fundamental Vulnerability of Ovarian Cancer. Cancer Discov 2015;5(11):1137-54 doi 10.1158/2159-8290.CD-15-0714.
35. Knijnenburg TA, Wang L, Zimmermann MT, Chambwe N, Gao GF, Cherniack AD, et al. Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas. Cell Rep 2018;23(1):239-54 e6 doi 10.1016/j.celrep.2018.03.076.
36. Chan KCA, Woo JKS, King A, Zee BCY, Lam WKJ, Chan SL, et al. Analysis of Plasma Epstein-Barr Virus DNA to Screen for Nasopharyngeal Cancer. N Engl J Med 2017;377(6):513-22 doi 10.1056/NEJMoa1701717.
37. Leong MML, Cheung AKL, Dai W, Tsao SW, Tsang CM, Dawson CW, et al. EBV infection is associated with histone bivalent switch modifications in squamous epithelial cells. Proc Natl Acad Sci U S A 2019;116(28):14144-53 doi 10.1073/pnas.1821752116.
38. Zhu JY, Pfuhl T, Motsch N, Barth S, Nicholls J, Grasser F, et al. Identification of novel Epstein-Barr virus microRNA genes from nasopharyngeal carcinomas. J Virol 2009;83(7):3333-41 doi 10.1128/JVI.01689-08.

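Table 1: Clinical parameters in the discovery and validation cohorts from four NPC studies (N=329)

Columns Total, SG, HK-1, HK-2, I2 and P value$ refer to the discovery set (N=216); columns GZ and P value$$ refer to the validation set (N=113).

| Parameter | Category | Total | SG | HK-1 | HK-2 | I2 | P value$ | GZ | P value$$ |
|---|---|---|---|---|---|---|---|---|---|
| Site | Primary | 178 | 50 | 60 | 68 | 90.60% | P < 0.001 | 113 | |
| | Recurrent/Metastasis | 38 | 0 | 11 | 27 | | | 0 | |
| | Total | 216 | 50 | 71 | 95 | | | 113 | |
| Primary NPC | | | | | | | | | |
| Age | mean ± SD | | 53 ± 9 | 53 ± 11 | 51 ± 13 | | P = 0.575 | -- | |
| Gender | Female | 38 | 10 | 11 | 17 | 0% | P = 0.632 | -- | -- |
| | Male | 140 | 40 | 49 | 51 | | | | |
| | Unknown | 0 | 0 | 0 | 0 | | | | |
| Stage | I | 13 | 1 | 6 | 6 | 4.54% | P = 0.392 | 5 | P = 0.005 |
| | II | 31 | 9 | 9 | 13 | | | 2 | |
| | III | 66 | 15 | 27 | 24 | | | 41 | |
| | IV | 64 | 22 | 18 | 24 | | | 25 | |
| | Unknown | 4 | 3 | 0 | 1 | | | 40 | |
| Metastasis (M) | Yes | 12 | 3 | 5 | 4 | 0% | P = 0.86 | -- | |
| | No | 161 | 44 | 55 | 63 | | | | |
| | Unknown | 5 | 3 | 0 | 1 | | | | |
| Lymph node metastasis (N) | Yes | 88 | 38 | 50 | -- | 0% | P = 0.787 | -- | |
| | No | 18 | 9 | 9 | -- | | | | |
| | Unknown | 4 | 3 | 1 | -- | | | | |
| Survival status | Alive | 132 | 36 | 48 | 48 | 0% | P = 0.449 | -- | |
| | Dead | 40 | 9 | 12 | 19 | | | | |
| | Unknown | 6 | 5 | 0 | 1 | | | | |
| Overall survival (month) | Mean (95% CI) | 75 (70, 81) | 50 (44, 57) | 77 (70, 84) | 73 (65, 82) | | P = 0.291 | -- | |
| Disease progression | Yes | 17 | -- | 17 | -- | -- | -- | 16 | P = 0.209 |
| | No | 43 | -- | 43 | -- | | | 72 | |
| | Unknown | 0 | -- | 0 | -- | | | 25 | |
| Progression-free survival (month) | Mean (95% CI) | 70 (61, 78) | -- | 70 (61, 78) | -- | -- | -- | 46 (42, 50) | P = 0.702 |

$ P value was estimated for difference among three cohorts within the discovery set (two-sided); $$ P value was estimated for difference between discovery and validation cohorts (two-sided). Cases with missing or unknown information were not included in either analysis.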

Table 2: Survival analysis in primary NPC for the discovery cohort (N=105) and validation cohort (N=71)

Univariate survival analysis, discovery set

| Variable | Category | HR | 95% CI | P value |
|---|---|---|---|---|
| Age | | 1 | (1.0, 1.1) | 0.002 |
| Gender | Female | 1 (reference) | | |
| | Male | 2.9 | (0.9, 9.6) | 0.081 |
| Stage | I & II | 1 (reference) | | |
| | III | 1.0 | (0.3, 3.6) | 0.951 |
| | IV | 3.1 | (1.0, 9.1) | 0.042 |
| Study | SG | 1 (reference) | | |
| | HK-1 | 0.8 | (0.3, 2.6) | 0.763 |
| | HK-2 | 0.9 | (0.3, 2.4) | 0.78 |
| Signature 3 | Negative | 1 (reference) | | |
| | Positive | 2.8 | (1.4, 5.8) | 0.005 |
| MMR signature | Negative | 1 (reference) | | |
| | Positive | 1.9 | (0.7, 5.0) | 0.181 |

Multivariable survival analysis$, discovery set

| Variable | Category | HR | 95% CI | P value |
|---|---|---|---|---|
| Age | | 1.1 | (1.0, 1.1) | 0.001 |
| Gender | Female | 1 (reference) | | |
| | Male | 2.1 | (0.6, 7.3) | 0.264 |
| Stage | I & II | 1 (reference) | | |
| | III | 0.9 | (0.3, 3.2) | 0.891 |
| | IV | 3.6 | (1.2, 6.2) | 0.026 |
| Signature 3 | Negative | 1 (reference) | | |
| | Positive | 2.9 | (1.4, 6.2) | 0.006 |
| MMR signature | Negative | 1 (reference) | | |
| | Positive | 2.8 | (1.0, 7.5) | 0.046 |

Multivariable survival analysis$, discovery set

| Variable | Category | HR | 95% CI | P value |
|---|---|---|---|---|
| Age | | 1.1 | (1.0, 1.1) | 3.8×10^-4 |
| Gender | | 2.1 | (0.6, 7.3) | 0.261 |
| Stage | I & II | 1 (reference) | | |
| | III | 0.8 | (0.2, 2.8) | 0.704 |
| | IV | 3.5 | (1.1, 11.2) | 0.033 |
| Signature group | MMR-/Sig3- | 1 (reference) | | |
| | MMR+/Sig3- | 5.2 | (1.1, 24.1) | 0.035 |
| | MMR-/Sig3+ | 9.1 | (1.3, 61.8) | 0.023 |
| | MMR+/Sig3+ | 12.4 | (2.6, 60.6) | 0.002 |

Univariate survival analysis, validation set (GZ cohort)

| Variable | Category | HR | 95% CI | P value |
|---|---|---|---|---|
| Stage | I & II & III | 1 (reference) | | |
| | IV | 2.6 | (0.6, 10.3) | 0.186 |
| Signature group | Others | 1 (reference) | | |
| | MMR+/Sig3+ | 9.5 | (2.2, 40.8) | 0.002 |

Multivariable analysis, validation set (GZ cohort)

| Variable | Category | HR | 95% CI | P value |
|---|---|---|---|---|
| Stage | I & II & III | 1 (reference) | | |
| | IV | 2.6 | (0.6, 10.6) | 0.193 |
| Signature group | Others | 1 (reference) | | |
| | MMR+/Sig3+ | 8.9 | (2.1, 38) | 0.003 |

$ The analysis is stratified by three different genomic studies. HR: hazard ratio estimated from Cox proportional hazard regression model; CI: confidence interval of the estimated hazard ratio; P value: p value estimated from score test (two-sided).
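The hazard ratios and confidence intervals in Table 2 can be related to each other with simple arithmetic. Assuming the intervals are Wald-type, that is, symmetric on the log scale (an assumption; the table only states that P values come from a score test), the reported HR should lie close to the geometric mean of its bounds, and the standard error of log(HR) can be recovered from the interval width. The helper names below are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def log_hr_standard_error(lower, upper):
    """Standard error of log(HR) implied by a log-symmetric 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def implied_hr(lower, upper):
    """Point estimate implied by a log-symmetric CI: geometric mean of the bounds."""
    return math.sqrt(lower * upper)


def wald_p_value(hr, lower, upper):
    """Two-sided Wald P value for H0: HR = 1, using the SE implied by the CI."""
    z = math.log(hr) / log_hr_standard_error(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2))


# Signature 3, univariate analysis in the discovery set: HR 2.8 (1.4, 5.8)
hr, lower, upper = 2.8, 1.4, 5.8
print(round(implied_hr(lower, upper), 2))
print(wald_p_value(hr, lower, upper))
```

For this row the geometric mean of the bounds is about 2.85, which agrees with the reported HR of 2.8 after rounding. Because the reported P values come from a score test, a Wald P value computed this way is only expected to be of similar magnitude, not identical.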

Table 3: Survival analysis in primary NPC for the prospectively collected HK-3 cohort (N=402)

Univariate analysis, HK-3 cohort

| Variable | Category | PFS HR | PFS 95% CI | PFS P value | OS HR | OS 95% CI | OS P value |
|---|---|---|---|---|---|---|---|
| Age | | 1 | (0.9, 1.0) | 0.164 | 1. | (1.0, 1.1) | 0.031 |
| Gender | Female | 1 (reference) | | | 1 (reference) | | |
| | Male | 1.8 | (0.9, 3.5) | 0.081 | 1.9 | (0.9, 38) | 0.094 |
| Stage | I & II | 1 (reference) | | | 1 (reference) | | |
| | III | 2.5 | (2.7, 29) | 0.138 | 1.9 | (0.6, 6.2) | 0.32 |
| | IV | 8.8 | (2.7, 29) | 3.3×10^-4 | 7.9 | (2.4, 26) | 0.001 |
| BRCA2 germline variants | Negative | 1 (reference) | | | 1 (reference) | | |
| | Positive | 2.0 | (1.1, 3.6) | 0.027 | 2.0 | (1.1, 3.9) | 0.034 |

Multivariable analysis, HK-3 cohort

| Variable | Category | PFS HR | PFS 95% CI | PFS P value | OS HR | OS 95% CI | OS P value |
|---|---|---|---|---|---|---|---|
| Age | | 1 | (1.0, 1.1) | 0.085 | 1.0 | (1.0, 1.1) | 0.014 |
| Gender | Female | 1 (reference) | | | 1 (reference) | | |
| | Male | 1.4 | (0.7, 2.8) | 0.284 | 1.4 | (0.7, 3.0) | 0.349 |
| Stage | I & II | 1 (reference) | | | 1 (reference) | | |
| | III | 2.3 | (0.7, 7.6) | 0.163 | 1.8 | (0.5, 6.1) | 0.347 |
| | IV | 8.1 | (2.4, 27) | 0.001 | 7.5 | (1.0, 3.7) | 0.001 |
| BRCA2 germline variants | Negative | 1 (reference) | | | 1 (reference) | | |
| | Positive | 1.9 | (1.0, 3.4) | 0.042 | 1.9 | (1.0, 3.7) | 0.046 |

PFS: progression-free survival; OS: overall survival; HR: hazard ratio estimated from the Cox proportional hazard regression model; CI: confidence interval of the estimated hazard ratio; P value: p value estimated from score test (two-sided).

Figure legends

Figure 1: Mutation signatures extracted from NPC. (A) Lego plot for mutation patterns in 216 NPC cases. The single-nucleotide substitutions are divided into six categories (different colors), each with 16 possible flanking contexts surrounding the mutated base. The inset pie chart shows the proportion of the six categories of mutation patterns. (B) Four mutational signatures identified in NPC by non-negative matrix factorization. Mutation signatures are displayed according to the 96-substitution classification defined by the substitution class and the flanking bases surrounding the mutated base. (C) Pie charts for the proportion of signature 3 and MMR signature in NPC (N=216).
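The 96-substitution classification in Figure 1 follows from six substitution classes multiplied by 16 combinations of the 5' and 3' flanking bases. The sketch below enumerates these channels and maps a substitution to its channel using the usual pyrimidine-strand convention; the function names and the bracketed label format are illustrative assumptions, not taken from the study's pipeline.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, written from the pyrimidine of the mutated base pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def channels():
    """Enumerate the 96 channels as labels such as 'A[C>T]G'."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def classify(five, ref, alt, three):
    """Map a substitution and its flanking bases to one of the 96 channels."""
    if ref in "AG":
        # Purine reference: report the change on the complementary strand,
        # which reverses and complements the flanking context.
        ref, alt = ref.translate(COMPLEMENT), alt.translate(COMPLEMENT)
        five, three = three.translate(COMPLEMENT), five.translate(COMPLEMENT)
    return f"{five}[{ref}>{alt}]{three}"


print(len(channels()))
print(classify("C", "G", "A", "T"))
```

Counting mutations per channel for each tumor gives the 96-row matrix that a factorization method such as non-negative matrix factorization takes as input.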

Figure 2: Kaplan-Meier plots for survival in NPC cases in the discovery and validation sets. The patients are categorized based on the signature 3 for all the patients (A), HK-1 set (B), HK-2 set (C) and SG set (D). (E) All the patients in the discovery set categorized based on the signature 3 and MMR signature. (F) All the patients in the validation set categorized based on signature 3 and MMR signature. The HR was estimated by the univariate analysis using the Cox proportional hazard regression model in (A)-(D), while the HR was adjusted by the available clinical parameters in the multivariable Cox analysis in (E) and (F).
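For readers unfamiliar with how the curves in Figure 2 are built, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. The function name and the toy follow-up data are hypothetical and do not come from the study cohorts.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data: five patients, follow-up in months (hypothetical values).
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 2))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at observed event times.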

Figure 3: Three distinct clusters identified from SCNAs. (A) Consensus clustering for SCNAs. Columns represent the selected regions ordered according to their chromosome locations and rows represent samples grouped by the consensus clustering. Two thousand randomly selected regions are included. The samples that remained unclassified after 1000 iterations (see Methods) were not included in the graph. The genes marked with an asterisk (*) are significantly altered in the discovery cohort (q value < 0.05). (B) Chromosome ideograms with the median CN changes for three SCNA clusters. The SCNAs for the overlapping regions were identified by ADTEx. (C) The signature 3 or MMR signature activities in three SCNA clusters. The cases in the SCNA-H-gain cluster have higher signature 3 activity compared to the other cases, while the cases in the SCNA-M-gain cluster have higher MMR signature activity. (D) Genetic alterations in the NF-κB pathway and CCND1 amplifications in three SCNA clusters. The cases with genetic alterations in the NF-κB pathway (CYLD, NFKBIA, TRAF3, and NLRC5) are mutually exclusive with the cases in the SCNA-H-gain cluster, while the cases in the SCNA-M-gain cluster have more frequent CCND1 amplifications (copy number ≥ 4).

Clinical Outcome-Related Mutational Signatures Identified by Integrative Genomic Analysis in Nasopharyngeal Carcinoma

Wei Dai, Dittman Lai-Shun Chung, Larry Ka-Yue Chow, et al.

Clin Cancer Res Published OnlineFirst September 28, 2020.

Updated version: Access the most recent version of this article at doi:10.1158/1078-0432.CCR-20-2854

Supplementary Material: Access the most recent supplemental material at http://clincancerres.aacrjournals.org/content/suppl/2020/09/30/1078-0432.CCR-20-2854.DC1
