Author Manuscript Published OnlineFirst on June 22, 2020; DOI: 10.1158/1541-7786.MCR-20-0108 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.


NRF2-Driven KEAP1 Transcription in Human Lung Cancer

Yijun Tian¹, Qian Liu¹, Shengnan Yu², Qian Chu², Yuan Chen², Kongming Wu*²,³, Liang Wang*¹,⁴

¹Department of Tumor Biology, Moffitt Cancer Center, 12902 Magnolia Drive, Tampa, FL 33612, United States
²Department of Oncology, Tongji Hospital of Tongji Medical College, 1095 Jiefang Avenue, Wuhan 430030, P.R. China
³Department of Oncology, The First Affiliated Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou 450020, P.R. China
⁴Department of Pathology, Medical College of Wisconsin, Milwaukee, WI 53206, United States

Running Title: NRF2 activates KEAP1 expression in human cancers

*Correspondence to: Liang Wang, MD, PhD, Department of Tumor Biology, Moffitt Cancer Center, 12902 Magnolia Drive, Tampa 33612, USA. Tel: 813-745-4955 Fax: 813-745-6606 Email: [email protected]

Kongming Wu, MD, PhD, Department of Oncology, Tongji Hospital of Tongji Medical College, Building 303, 1095 Jiefang Avenue, Wuhan 430030, P.R. China. Tel: 0086-135-1719-6182 Fax: 0086-027-8366-3476; Department of Oncology, The First Affiliated Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou 450020, P.R. China Email: [email protected]

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Downloaded from mcr.aacrjournals.org on September 24, 2021. © 2020 American Association for Cancer Research.


Abstract

Constitutive NRF2 activation caused by disrupted KEAP1-NRF2 interaction has been reported in a variety of human cancers. However, studies focusing on NRF2-driven KEAP1 expression in human cancer contexts are still uncommon. We examined the mRNA expression correlation between NRF2 and KEAP1 in multiple human cancers. We measured KEAP1 mRNA and protein alterations in response to the activation or silencing of NRF2. We queried ChIP-seq datasets to identify NRF2 binding to KEAP1 promoters in human cells. We used reporter assays and CRISPR editing to assess changes in KEAP1 promoter activity and mRNA abundance. To determine the implications of this feedback pattern in clinical specimens, we used expression ratios to predict NRF2 signal disruption as well as patient prognosis. Correlation analysis showed that KEAP1 mRNA expression was positively associated with NRF2 in multiple squamous cell cancers. The positive correlations were consistent across all squamous cell lung cancer cohorts, but not in adenocarcinomas. In human lung cells, NRF2 interventions significantly altered KEAP1 mRNA and protein expression. ChIP-qPCR and sequencing data demonstrated consistent NRF2 occupancy at the KEAP1 promoter. Deleting the NRF2 binding site significantly reduced baseline and inducible KEAP1 promoter activity and KEAP1 mRNA expression. By incorporating tumor tissue KEAP1 mRNA expression in estimating NRF2 signaling disruption, we found an increased TXN/KEAP1 mRNA ratio in cases with NRF2 gain or KEAP1 loss and a decreased NRF2/KEAP1 mRNA ratio in cases with NRF2-KEAP1 somatic mutations. In TCGA PanCancer datasets, we also found that cases with loss-of-function mutations in the NRF2 pathway recurrently appeared above the NRF2-KEAP1 mRNA expression regression lines. Moreover, compared with previous NRF2 signatures, the ratio-based strategy showed better predictive performance in survival analysis, with validation in multiple SQC cohorts.

Implications: NRF2-driven KEAP1 transcription is a crucial component of NRF2 signaling modulation. This hidden circuit will provide in-depth insight into novel cancer prevention and therapeutic strategies.

Keywords: Squamous cell cancer; Lung cancer; responsive element; NRF2; KEAP1; TXN


Introduction

For decades, breakthroughs in cancer prevention and therapy have kept pace with deepening insights into the cancer genome (1-3). Among the efforts to decode cancer genomics, the landmark program The Cancer Genome Atlas (TCGA) generated abundant profiling data from over 20,000 primary tumor tissues spanning 32 cancer types and broadened the boundaries of cancer research with massive discoveries (4,5). Among these discoveries, a pivotal regulator of redox homeostasis, nuclear factor erythroid-2 related factor 2 (NFE2L2, generally referred to as NRF2), is frequently activated in nearly 30% of squamous cell lung cancers (SQC) (6-8). These NRF2 activations are mainly caused by somatic mutations and copy number variations in NRF2 itself and in its gatekeeper gene, Kelch-like ECH-associated protein 1 (KEAP1, also named INRF2) (6). KEAP1 has been reported to anchor NRF2 in the cytoplasm and mediate its degradation (9). By forming a "hinge and latch" structure through homodimerization and interaction with the ETGE and DLG domains of NRF2, KEAP1 restricts NRF2 activation under basal conditions (10). In the presence of oxidative or electrophilic stress, NRF2 evades ubiquitination and translocates into the nucleus. Nuclear NRF2 binds to specific regulatory sequences named antioxidant responsive elements (ARE) to transactivate cytoprotective and antioxidant genes (11). However, two interesting studies have also identified a hidden circuit in NRF2 regulation. In the mouse Keap1 (INrf2) gene, Lee et al. (12) found that an ARE located on the negative strand can subtly connect Nrf2 activation to Keap1 transcription. When examining NRF2 occupancy in human lymphocytes, Chorley et al. found that a ~700 bp region within the KEAP1 promoter was consistently among the top-ranked enriched regions, even at the whole-genome scale (13). These basic findings depict a mutually influenced pattern between NRF2 and KEAP1.
Since NRF2 shows oncogenic functions in human cells (14,15) and displays a high frequency of aberrant activation in tumors (16), these results also suggest a potential regulatory role of the NRF2-KEAP1 axis in human cancers. To identify the presence and extent of this feedback regulation in human cancer contexts, we re-examined transcriptome associations between NRF2 and KEAP1 in TCGA PanCancer and currently available lung cancer RNA profiling datasets. We functionally characterized NRF2-driven KEAP1 expression in lung cancer cells and further leveraged KEAP1 expression to enhance the prediction of NRF2 signaling disruption. This work will give new perspectives on NRF2 regulation and pave the way toward possible pharmaceutical interventions.

Materials and Methods

TCGA PanCancer dataset query and somatic mutation annotation

Expression of NRF2, KEAP1, and NRF2 downstream genes ("mRNA Expression, RSEM (Batch normalized from Illumina HiSeq_RNASeqV2)"), copy number variation ("Copy-number Alterations (OQL is not in effect)"), and mutations ("Mutations (OQL is not in effect)") for 10,967 tumor samples from 32 TCGA PanCancer Atlas studies were downloaded from cBioPortal (https://www.cbioportal.org/) (4,5) on September 26th, 2019. Annotations of NRF2 and KEAP1 somatic mutations were based on previous publications (6,17-25). Truncating mutations in KEAP1 were regarded as bona fide loss-of-function mutations. A summary of the NRF2 and KEAP1 somatic mutation annotations can be found in Supplementary Table S1.

Reagents and cell culture

The antibodies against NRF2 were purchased from Abcam (for western blot; ab62352, Cambridge, MA, USA) and Active Motif (for ChIP-qPCR; 61599, Atlanta, GA). Antibodies against KEAP1 (10503-2-AP) and β-actin (60008-1-Ig) were purchased from Proteintech Group (Rosemont, IL, USA). IgG isotype control antibodies were purchased from Abcam (ab171870, Cambridge, MA, USA). Dimethyl sulfoxide (DMSO) and tert-butylhydroquinone (tBHQ) were purchased from Sigma-Aldrich (St. Louis, MO, USA). The Firefly Luciferase Glow Assay Kit was purchased from ThermoFisher Scientific (Waltham, MA, USA). A549 (RRID: CVCL_0023), H292 (RRID: CVCL_0455), HEK293FT (RRID: CVCL_6911), SK-MES-1 (RRID: CVCL_0630), BEAS-2B (RRID: CVCL_0168), and H460 (RRID: CVCL_0459) cells were obtained from the American Type Culture Collection and verified by short tandem repeat (STR) profiling prior to use. All cell lines were discarded and replaced with low-passage aliquots after 15 subcultures. All lung cells were grown in RPMI-1640, except for SK-MES-1, which was grown in MEM. The HEK293FT cells were cultured in DMEM. The complete media were supplemented with 10% fetal bovine serum (ThermoFisher Scientific, MA, USA) and were all free of sodium pyruvate. All cell lines were examined for mycoplasma contamination with the Venor™ GeM mycoplasma detection kit (Sigma Aldrich, MO, USA).

Plasmid construction and siRNA design

The promoter sequence of the KEAP1 gene was retrieved from The LightSwitch™ Promoter Reporter GoClone® Collection (https://switchgeargenomics.com/products/promoter-reporter-collection, SwitchGear Genomics, Carlsbad, CA, USA). A 329 bp subsequence containing the putative ARE was amplified from the genomic DNA of HEK293FT cells. This promoter was then cloned into a pGL3-basic vector between the KpnI and BglII sites. The Q5 site-directed mutagenesis kit (New England Biolabs, MA, USA) was used to introduce ARE and SP1 binding site deletions into the pGL3-KEAP1 plasmid. The small interfering RNA (siRNA) targeting NRF2 was the same as described in a previous publication (26). The promoter sequences, primers, RNA oligos, and guide RNA sequences are listed in Supplementary Table S2.

Real-time PCR, western blot analysis, and immunofluorescence staining

Total RNA was extracted from cells using the RNeasy Mini Kit (QIAGEN, MD, USA). One microgram of total RNA was reverse transcribed with SuperScript™ IV VILO™ Master Mix (ThermoFisher Scientific, MA, USA). Quantification reactions were performed with PowerUp™ SYBR™ Green Master Mix (ThermoFisher Scientific, MA, USA) on the ABI 7900 HT platform. The primers are listed in Supplementary Table S2. Total protein was extracted and electrophoresed as previously described (27). SuperSignal West Pico Chemiluminescent Substrate (ThermoFisher Scientific, MA, USA) was used to generate luminescent signals on the Syngene fluorescence imaging system. Captured images were aligned in Photoshop and cropped in Microsoft PowerPoint.

Luciferase reporter assay

The cells were seeded into a 24-well plate one day before transfection with 250 ng of the pGL3 plasmid per well using Lipofectamine 2000 (27). tBHQ was added in fresh medium 36 h after transfection, whereas empty vector (200 ng) or NRF2 ORF (290 ng) was co-transfected from the beginning. After 12 h of tBHQ treatment, or a total of 72 h for co-transfection, the cells were lysed for the luciferase assay according to the Firefly Luciferase Glow Assay Kit protocol. Luminescence was measured on an Infinite 200 PRO reader. After subtracting the empty vector value, gene promoter activities were represented as the fold-change in relative luminescence units.
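The normalization described above can be sketched as follows. This is a minimal illustration, assuming that "subtracting the empty vector value" means subtracting the mean empty-vector luminescence measured under the same condition, and that the fold-change is taken relative to the background-corrected control; the function name and all readings are hypothetical.

```python
from statistics import mean


def promoter_fold_change(treated, control, empty_treated, empty_control):
    """Background-corrected fold-change of reporter luminescence.

    Each argument is a list of replicate relative luminescence units (RLU).
    Assumption: the empty-vector mean from the same condition is subtracted
    before the treated/control ratio is taken.
    """
    treated_net = mean(treated) - mean(empty_treated)
    control_net = mean(control) - mean(empty_control)
    if control_net <= 0:
        raise ValueError("control signal does not exceed empty-vector background")
    return treated_net / control_net


# Hypothetical RLU readings, for illustration only.
fc = promoter_fold_change(
    treated=[2100, 2300, 2200],
    control=[1150, 1250, 1200],
    empty_treated=[200, 200, 200],
    empty_control=[200, 200, 200],
)
print(round(fc, 2))
```

Guarding against a non-positive control signal avoids reporting a meaningless ratio when the reporter is indistinguishable from background.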

Functional validation of ARE by CRISPR-Cas9 based genome editing

The flanking sequence (±20 bp) around the putative ARE was examined for sgRNA candidates using the CRISPOR software (28). The guide RNA with the best off-target score was synthesized along with a hU6 promoter and a scaffold as a gBlock fragment (https://benchling.com/rrm38/f/h4fdYFOi-protocols/prt-10T3UWFo-detailed-gblocks-based-crispr-protocol/edit) (29,30) (the guide RNA sequence can be found in Supplementary Table S2). The gBlock fragment was further amplified with Phusion High-Fidelity DNA Polymerase (ThermoFisher Scientific, MA, USA) and cleaned with the MinElute PCR Purification Kit (QIAGEN, MD, USA). For editing, gBlock fragments and the Cas9 plasmid (pSpCas9-2A-GFP, Addgene, #48138) (31) were co-transfected at a molar ratio of 8:1 using Lipofectamine 2000, along with a ΔARE promoter sequence as the repair template donor. At 24 h post-transfection, GFP-positive cells were sorted on a FACSMelody (BD Bioscience, CA, USA). The genomic DNA of the sorted cells was extracted for the T7E1 assay to verify sgRNA-directed cleavage. Meanwhile, the sorted cells were seeded into a 96-well plate to obtain single-cell clones. DirectPCR Lysis Reagent (Viagen Biotech, Los Angeles, CA, USA) was used to prepare the template for Sanger sequencing. The sgRNA sequences and T7E1 primers are listed in Supplementary Table S2.

Confirmation of NRF2 binding to KEAP1 promoter by ChIP analysis

To confirm NRF2 binding to the KEAP1 promoter, we re-analyzed ChIP-seq datasets that evaluated NRF2 occupancy in human cells. Raw FASTQ files were downloaded from the European Nucleotide Archive (ENA) database (13,32,33). Relevant and qualified sequence files were aligned to the hg38 human genome using bowtie2. Unique alignments were sorted and indexed, and then fed to the callpeak function of Model-based Analysis of ChIP-Seq 2 (MACS2) (34) with a false positive rate threshold of <0.01. After obtaining the coordinates of significantly enriched loci, the alignments were visualized in the Integrative Genomics Viewer (35). For ENCODE NRF2 ChIP-seq data, we used MACS2 to call peaks between the NRF2 ChIP and input sequencing alignment files. For sulforaphane (SFN, a commonly used NRF2 activator)-treated ChIP-seq data, we used MACS2 to call peaks between the DMSO and SFN treatment groups. The ChIP-seq datasets used are listed in Supplementary Table S3.

To further verify NRF2 binding in lung cancer cells, we performed ChIP-qPCR in H292, SK-MES-1, H520, and A549 cells. After 8 h of treatment with 0.1% DMSO or 60 μM tBHQ, chromatin was prepared from these cell lines with the ChIP-IT High Sensitivity kit (Active Motif, 53040, Atlanta, GA) and used for four independent ChIP reactions against IgG control and NRF2 antibodies, respectively. To assess NRF2 binding at the NQO1, KEAP1, and TXN genes, we used previously published primers (13) and our own designs to amplify an ARE-containing region as well as a gene-specific negative control (NC) region. We performed quantification reactions with input or ChIPed DNA using PowerUp™ SYBR™ Green Master Mix on a CFX96 qPCR instrument (Bio-Rad, Hercules, CA). For each gene, enrichment at the ARE-containing region was calculated relative to the gene-specific NC region after normalization to the respective input control. The primers used are listed in Supplementary Table S2.
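The ChIP-qPCR enrichment calculation can be illustrated with a short sketch. It assumes the common percent-input method with 100% amplification efficiency, followed by a ratio of the ARE-containing region over the gene-specific NC region; the exact formula is not given in the text, and the input fraction (1%) and Ct values below are hypothetical.

```python
def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent of input recovered in the ChIP sample.

    Assumes perfect doubling per cycle; input_fraction is the share of
    chromatin saved as input (hypothetical default of 1%).
    """
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)


def enrichment_over_nc(ct_ip_are, ct_input_are, ct_ip_nc, ct_input_nc,
                       input_fraction=0.01):
    """Fold enrichment of the ARE region relative to the negative control region."""
    are = percent_input(ct_ip_are, ct_input_are, input_fraction)
    nc = percent_input(ct_ip_nc, ct_input_nc, input_fraction)
    return are / nc


# Hypothetical Ct values for one antibody and one gene.
fold = enrichment_over_nc(ct_ip_are=27.0, ct_input_are=25.0,
                          ct_ip_nc=30.0, ct_input_nc=25.0)
print(fold)
```

Because the input fraction cancels in the final ratio, it affects only the intermediate percent-input values, not the fold enrichment over NC.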

Correlation analysis between KEAP1 and NRF2 expression in lung cancer

To interrogate the RNA expression correlation between KEAP1 and NRF2, we retrieved transcriptomic data from both Gene Expression Omnibus (GEO) and TCGA. For Affymetrix microarrays, we used the JetSet (36) probesets for KEAP1 (202417_at), TXN (216609_at), and NRF2 (201146_at) to represent bona fide RNA abundance. For Illumina or other microarrays, we selected the probeset with the maximum average expression across samples to represent the RNA abundance of each gene. To perform a summary statistics meta-analysis, we calculated the Pearson correlation coefficient from each original gene expression dataset (log2-transformed). We used each coefficient and its respective sample size for pooled analysis (37) in ADC and SQC separately. This strategy has been shown (38) to be suitable for investigating the correlation between the expression profiles of a gene pair across multiple microarray studies. The heterogeneity test (39) in SQC (I² = 0) indicated little variability in correlation between datasets derived from array and sequencing platforms, suggesting that the results are unlikely to be explained by chance.
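As an illustration of pooling per-dataset correlations from coefficients and sample sizes, the sketch below uses a fixed-effect Fisher z-transformation with weights n − 3 and derives I² from Cochran's Q. The specific estimator behind references (37) and (39) is not stated in the text, so this choice is an assumption, and the example coefficients are hypothetical.

```python
import math


def pool_correlations(studies):
    """Pool Pearson correlations given as (r, n) tuples.

    Assumed method: fixed-effect Fisher z pooling with weights n - 3.
    Returns (pooled r, 95% confidence interval, I^2 in percent).
    """
    zs = [math.atanh(r) for r, _ in studies]
    ws = [n - 3 for _, n in studies]
    w_sum = sum(ws)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / w_sum
    # Cochran's Q and the I^2 heterogeneity statistic.
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    se = 1.0 / math.sqrt(w_sum)
    ci = (math.tanh(z_bar - 1.96 * se), math.tanh(z_bar + 1.96 * se))
    return math.tanh(z_bar), ci, i2


# Hypothetical per-dataset correlations and sample sizes.
r_pooled, ci, i2 = pool_correlations([(0.42, 120), (0.38, 85), (0.45, 200)])
print(round(r_pooled, 3), [round(c, 3) for c in ci], round(i2, 1))
```

An I² of 0 in this scheme means the observed spread of coefficients is no larger than expected from sampling error alone.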

Estimation of NRF2 disruption based on KEAP1 expression


If nuclear NRF2 protein indeed regulates KEAP1 mRNA expression, a dynamic equilibrium between NRF2 and KEAP1 can be written as Eq1:

$$(aX - mbY) \times e = Y$$

In Eq1, a and b denote the NRF2 and KEAP1 protein translated per unit of the respective mRNA; e denotes the KEAP1 mRNA transcribed per unit of functional nuclear NRF2 protein; m denotes the NRF2 protein degraded per unit of KEAP1 protein; X and Y denote NRF2 and KEAP1 mRNA abundance, respectively. This equation can be further transformed to:

$$\frac{X}{Y} = \frac{1}{ae} + \frac{mb}{a}$$

In the transformed equation, NRF2-KEAP1 disequilibrium caused by loss-of-function mutations of NRF2 (elevated "a") and KEAP1 (decreased "mb") is reflected by a decrease in the NRF2/KEAP1 mRNA ratio.
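For readers who want the intermediate algebra, the rearrangement of Eq1 can be written out step by step. The only assumption is that a, e, and Y are non-zero.

```latex
\begin{aligned}
(aX - mbY)\,e &= Y \\
aeX - mbeY &= Y \\
aeX &= Y\,(1 + mbe) \\
\frac{X}{Y} &= \frac{1 + mbe}{ae} = \frac{1}{ae} + \frac{mb}{a}
\end{aligned}
```

Both terms on the right-hand side shrink as a increases, and the second term shrinks as mb decreases, which is why either change lowers the X/Y ratio.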

In the above equations, disequilibrium caused by KEAP1 deletion and NRF2 amplification was not included. To cover this situation, we introduced another downstream gene E that is expressed in a pattern similar to KEAP1. Z denotes the mRNA expression of gene E, and f denotes its mRNA transcribed per unit of nuclear NRF2 protein. This hypothesis can be written as Eq2:

$$(aX - mbY) \times f = Z$$

This equation can be transformed together with Eq1 into:

$$\frac{Z}{Y} = \frac{f}{e}.$$

From this equation, NRF2 disequilibrium caused by KEAP1 deletion and NRF2 amplification can be reflected as an increase in the ratio between the mRNA of the downstream gene E and that of KEAP1. We eventually selected a well-known NRF2 target gene, TXN, as the KEAP1 footprint for the following reasons: (i) TXN and KEAP1 have a similar coefficient of variation (CV) for mRNA expression measured by microarray as well as RNA-seq across each dataset (Supplementary Figure S1A and S1B); (ii) TXN and KEAP1 both have similar high-confidence NRF2 occupancy near their respective transcription start sites (Supplementary Figure S1C).
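The two closed-form ratios can be checked numerically with a small sketch. It solves Eq1 for Y and then evaluates Eq2 for Z; the function name and all parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def steady_state(x, a, b, e, m, f):
    """Solve Eq1 and Eq2 for KEAP1 mRNA (y) and downstream gene mRNA (z).

    Eq1: (a*x - m*b*y) * e = y   ->   y = a*e*x / (1 + m*b*e)
    Eq2: (a*x - m*b*y) * f = z
    """
    y = a * e * x / (1.0 + m * b * e)
    z = (a * x - m * b * y) * f
    return y, z


# Hypothetical parameter values, for illustration only.
base = dict(x=10.0, a=1.0, b=1.0, e=0.5, m=1.0, f=2.0)
y0, z0 = steady_state(**base)
print(base["x"] / y0)  # matches 1/(a*e) + m*b/a
print(z0 / y0)         # matches f/e

# Elevated "a", as described for Eq1, lowers the X/Y ratio.
y1, _ = steady_state(**{**base, "a": 2.0})
print(base["x"] / y1)
```

With these values, X/Y equals 1/(ae) + mb/a and Z/Y equals f/e, so the simulation agrees with the transformed equations.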


Based on the above considerations, we used receiver operating characteristic (ROC) curves to test the TXN/KEAP1 and NRF2/KEAP1 ratios for predicting NRF2 signal disruptions (copy number variations and somatic mutations). We also stratified SQC patients by the upper 25% of TXN/KEAP1 or the lower 25% of NRF2/KEAP1 and used Kaplan-Meier curves to test potential prognostic associations. Among previously published NRF2 signatures, Singh et al. (40) indicated 14 genes under NRF2 control, Cescon et al. (41) summarized 27 genes altered in SQC cases with NRF2 activation, and Rodrigo et al. (42) identified 108 high-confidence NRF2 target genes. Notably, none of these gene signatures included KEAP1 as an NRF2 target. To evaluate the prognostic performance of these NRF2 signatures in SQC (GSE3141, GSE37745, and TCGA-LUSC), we clustered the patients of each cohort by the expression of each gene signature (centered and standardized by mean and standard deviation) into high and low NRF2 activity groups. We used time-dependent areas under receiver operating characteristic curves (43) to compare the ratio-based stratification with the stratification based on the above NRF2 signatures. A follow-up limit of 60 months was set for datasets with a high overall survival censoring rate (>40%). No follow-up limits were set for relapse-free survival (RFS), progression-free survival (PFS), disease-specific survival (DSS), and disease-free survival (DFS) data. The characteristics of the datasets used are listed in Supplementary Table S4. Previously published NRF2 signatures are listed in Supplementary Table S6.
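The ratio-based prediction and stratification steps can be sketched in a few lines. This is a simplified illustration: it computes a plain (not time-dependent) ROC AUC by pairwise ranking and flags samples above the upper quartile; the quantile method, the function names, and the example values are assumptions rather than the procedure used in the study.

```python
from statistics import quantiles


def roc_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def upper_quartile_flags(values):
    """Flag samples strictly above the 75th percentile (inclusive method)."""
    q3 = quantiles(values, n=4, method="inclusive")[2]
    return [v > q3 for v in values]


# Hypothetical TXN/KEAP1 ratios and NRF2 disruption status (1 = disrupted).
txn_keap1 = [0.8, 1.0, 1.1, 1.3, 2.4, 2.9, 3.5, 1.2]
disrupted = [0, 0, 0, 0, 1, 1, 1, 0]
print(roc_auc(txn_keap1, disrupted))
print(upper_quartile_flags(txn_keap1))
```

For the NRF2/KEAP1 ratio, where low values are the ones of interest, the same functions can be applied to the negated ratio.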

Statistical analysis

The data were expressed as the mean ± SEM. The unpaired t-test was used to compare differences between groups after determining the homogeneity of variance. Two-way ANOVA was used to compare differences between groups in the time-series data. The log-rank test was used to compare survival between stratified patients. Two-tailed p values less than 0.05 were considered statistically significant.
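As an illustration of the survival comparison, a two-group log-rank test can be written with the standard library alone. This is a minimal sketch of the textbook statistic (observed minus expected events with a hypergeometric variance, one degree of freedom); it is not the software used in the study, and the follow-up times below are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value) with 1 df.

    times_*: follow-up times; events_*: 1 if the event occurred, 0 if censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 df
    return chi2, p


# Hypothetical follow-up times in months; every patient had the event.
chi2, p = logrank_test([2, 3, 4], [1, 1, 1], [10, 11, 12], [1, 1, 1])
print(round(chi2, 3), round(p, 4))
```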

Results

NRF2 and KEAP1 expression are significantly correlated in squamous cell cancers

To evaluate NRF2 expression and its tissue distribution in human cancers, we ranked TCGA PanCancer datasets by their respective median NRF2 expression (Figure 1A). We found that four cancer types with squamous histology expressed substantially higher NRF2 mRNA. This high-level mRNA was positively correlated with NRF2 copy number gain. Furthermore, Pearson correlation analysis showed that NRF2 and KEAP1 were significantly correlated in the four squamous cancer types (Figure 1C-F), but not in lung adenocarcinoma (TCGA-LUAD, Figure 1B). This correlation was further confirmed by meta-analysis in NSCLC datasets (Figure 1G and 1H). Studies in dashed boxes are those that measured RNA profiles for both ADC and SQC.

NRF2 regulates endogenous KEAP1 expression in human lung cells

Previous studies showed NRF2 feedback on KEAP1 activation in mouse hepatoma cells (12) and human lymphoid cells (13). To test whether NRF2 activation also regulates KEAP1 expression in human lung cell lines, we used two well-known NRF2 activators, tBHQ (tertiary butylhydroquinone) (44) and DMF (dimethyl fumarate) (45), to treat H292 cells, which were previously reported to have low NRF2 activity (26,46). We found that 60 μM tBHQ and DMF induced KEAP1 mRNA expression after 12 hours of treatment (Figure 2A). Correspondingly, in A549, a well-known NRF2-addicted cell line, silencing NRF2 by siRNA significantly reduced KEAP1 mRNA expression (Figure 2B). The same responses were detected at the protein level in the above settings (Figure 2C-E), as well as in SK-MES-1 (SQC, Figure 2F), H520 (SQC, Supplementary Figure S5A), and BEAS-2B (normal human bronchial epithelium) cells (Figure 2G). In contrast, in A549 and H460 cells (both with KEAP1 loss-of-function mutations), KEAP1 protein (Figure 2H, Supplementary Figure S5B) did not increase after tBHQ treatment. Western blot bands in Figure 2 and Supplementary Figure S5 were quantified in Supplementary Figure S5C and S5D.

NRF2 directly binds to KEAP1 promoters in human cell lines

To investigate possible NRF2 binding at the human KEAP1 promoter, we retrieved publicly available NRF2 ChIP-seq datasets from ENCODE (https://www.encodeproject.org/) (47). As expected, we found a confined region in the KEAP1 promoter with variable NRF2 binding enrichment across four human cell lines (Figure 3A). Additionally, in another two NRF2 ChIP-seq datasets, SFN treatment increased NRF2 occupancy in the same region in BEAS-2B cells (Figure 3B) and immortalized lymphocytes (Figure 3C). To identify significantly enriched regions in lymphocytes, we used MACS2 to call peaks between the SFN treatment and the DMSO control. We found that the KEAP1 promoter locus ranked third (by FDR q value) among all enriched regions (Figure 3D). To further verify NRF2 binding at the human KEAP1 promoter, we performed ChIP-qPCR with an NRF2 antibody in four lung cancer cell lines, including two SQC lines (SK-MES-1 and H520). We observed reproducible baseline (0.1% DMSO) NRF2 occupancy in the ARE-containing promoter regions of the NQO1, KEAP1, and TXN genes (Figure 3E-H). Interestingly, H292 cells showed mildly increased (~1.7-fold) occupancy at the KEAP1 promoter and steady occupancy at NQO1 and TXN during tBHQ treatment, suggesting that NRF2 activation was tightly restricted by the feedback regulation. This might explain the transient induction and rapid reduction of KEAP1 protein observed by western blot after NRF2 activator treatment. Additionally, in A549 cells, which had the highest baseline NRF2 occupancy at the ARE-containing KEAP1 promoter, tBHQ treatment did not induce a significant increase in NRF2 binding (Figure 3H).

NRF2 binding site is crucial to KEAP1 transcription

To identify the NRF2 binding motif in the KEAP1 promoter region, we screened for TF binding sites in the JASPAR database (48). In accordance with Chorley et al. (13), we verified the existence of an NRF2 binding site (ARE, score: 15.16, relative score: 0.98), as well as an SP1 binding site (score: 13.428, relative score: 0.95) (Figure 4A). After cloning this promoter into a luciferase reporter plasmid, we first tested the activity of this promoter during tBHQ treatment. Interestingly, in KEAP1 wild-type H292 cells, KEAP1 promoter activity increased up to 2-fold within the first 6 hours of treatment, then decreased after 12 hours. However, in KEAP1-mutated A549 cells, the activity remained unaltered (Figure 4B). We further used site-directed mutagenesis to introduce three putative TF site deletions into this promoter and compared the changes in activity. In H292 cells, full-length (FL) and ΔSP1 promoter activities were significantly enhanced by tBHQ treatment or by NRF2 overexpression. However, deletion of the ARE (ΔARE) or double deletion (ΔΔ) abolished this enhancement (Figure 4C). In A549 cells, we only observed a significant reduction in the baseline promoter activity of ΔARE and ΔΔ. The absence of inducible activity suggested saturated KEAP1 promoter activity in A549 cells (Figure 4D). To further test the effect of the ARE locus on endogenous KEAP1 expression, we applied CRISPR-Cas9-based editing to introduce an ARE deletion in KEAP1 promoters. After establishing multiple ARE-deleted single-cell clones verified by Sanger sequencing (Figure 4E), we used qPCR to examine KEAP1 mRNA abundance in these clones. We found a robust reduction in baseline KEAP1 mRNA in both heterozygous and homozygous ARE deletion mutants when compared with the non-edited controls (CTL) (Figure 4F and 4G). Further qPCR showed higher expression of well-known NRF2 downstream genes in subclones with ARE deletion than in CTL in H292 cells (Figure 4H). To confirm the NRF2 and KEAP1 alterations at the protein level, we selected a homozygous ARE deletion subclone (P4C10) and a CTL subclone (P3C12) from H292 for western blot analysis. In P4C10, expression of the KEAP1 protein and its dimer was attenuated, whereas NRF2 protein abundance was increased (Figure 4I; bands were quantified in Supplementary Figure S5E). During tBHQ treatment, KEAP1 mRNA in the P4C10 subclone remained stable for 60 h compared with CTL (Figure 4J, p<0.0001).

KEAP1-based gene expression ratio predicts NRF2 disruption

Based on the dynamic balance between the NRF2 and KEAP1 genes (Figure 5A), we assumed that KEAP1 deletion and NRF2 amplification might lead to an increased TXN/KEAP1 ratio (see Methods, Eq2), whereas NRF2-KEAP1 loss-of-function mutations might lead to a decreased NRF2/KEAP1 ratio (see Methods, Eq1). To test this assumption, we examined the TCGA PanCancer, LUSC, and LUAD datasets and observed the expected increase in the TXN/KEAP1 ratio (Figure 5B and 5D). In the two datasets with the highest NRF2-KEAP1 somatic mutation frequencies, the observations also conformed to our assumptions (Figure 5C and 5E). Using the cBioPortal co-expression module, we fitted linear regression equations between NRF2 and KEAP1 mRNA expression to test which NRF2 and KEAP1 mutations lay above the trend lines (Figure 5F). Interestingly, the NRF2 mutations that recurrently appeared above the trend lines were mostly activating mutations in the DLG and ETGE domains (20,25). The KEAP1 mutations that recurrently appeared above the trend lines are likely to lead to loss of function (17,20,25) (Figure 5I). In brief, the more often a mutation appeared above the NRF2-KEAP1 regression line, the more likely it was to be functional (Chi-square p=1.42e-20, Figure 5G). The same tendencies were observed in two other separate cell line datasets (Figure 5H). The full results of the above-the-line analysis can be accessed in Supplementary Table S5.
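The "above the trend line" test can be illustrated with an ordinary least-squares fit followed by a sign check on the residuals. This sketch assumes NRF2 mRNA on the x-axis and KEAP1 mRNA on the y-axis, which is our reading of the regression rather than a stated detail; the function names and expression values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def above_trend_line(xs, ys):
    """Return True for each sample whose y value lies above the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return [y > slope * x + intercept for x, y in zip(xs, ys)]


# Hypothetical log2 expression values: x = NRF2 mRNA, y = KEAP1 mRNA (assumed axes).
nrf2 = [1.0, 2.0, 3.0, 4.0, 5.0]
keap1 = [1.0, 1.9, 3.0, 4.0, 7.5]
flags = above_trend_line(nrf2, keap1)
print(flags)
```

Counting how often a given mutation is flagged across datasets would then give the recurrence used to rank candidate functional mutations.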

KEAP1-based gene expression ratio is prognostic in SQC datasets NRF2 downstream gene signatures have been used to infer NRF2 activation and tested using unsupervised clustering method in clinical samples to discover potential prognostic markers (38-

Downloaded from mcr.aacrjournals.org on September 24, 2021. © 2020 American Association for Cancer Research. Author Manuscript Published OnlineFirst on June 22, 2020; DOI: 10.1158/1541-7786.MCR-20-0108 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

13

40). However, this strategy does not appear promising in SQC datasets (41). We therefore used KEAP1-based gene expression ratios as an alternative strategy to identify SQC patients with poor prognosis. To define abnormal TXN/KEAP1 and NRF2/KEAP1 ratios, we compared these ratios between SQC and paired normal tissues. The maximum TXN/KEAP1 ratio in normal lung was approximately equal to the upper quartile of the ratio in SQC. Conversely, the minimum NRF2/KEAP1 ratio in normal lung was roughly equal to the lower quartile of the ratio in SQC (Supplementary Figure S2A). To characterize molecular signatures in patients within different ratio quartiles, we compared gene expression profiles of outlier patients with either a higher TXN/KEAP1 or a lower NRF2/KEAP1 ratio with those of the remaining patients (Supplementary Figure S2B). GSEA against the Molecular Signatures Database (MSigDB) demonstrated that the reactive oxygen species hallmark (Supplementary Figure S2C) and the oncogenic NRF2 pathway (Supplementary Figure S2D) were significantly enriched in outlier patients. To evaluate the performance of the ratio-based strategy in predicting patient survival, we used Kaplan-Meier curves and time-dependent ROC curves in each dataset. Compared with previous NRF2 signatures (40-42) (green, red, and purple lines), the ratio-based strategy (black, blue, and yellow lines) showed better Kaplan-Meier curve separation in multiple SQC datasets with various prognostic settings (Figure 6A-D, left panel). The time-dependent ROC analysis demonstrated that the ratio-based strategy predicted the endpoint more accurately than the gene signatures throughout follow-up in each study (higher AUC of the black, blue, and yellow curves) (Figure 6A-D, right panel). In other SQC cohorts, the ratio-based strategy showed a consistent trend in predicting patient survival, indicating that higher TXN/KEAP1 and lower NRF2/KEAP1 ratios were negative prognostic biomarkers when RNA profiling was performed on different platforms (Supplementary Figure S3A-J).
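The quartile-based outlier definition described in this section can be sketched in a few lines. The Python below is an illustrative sketch only, not the authors' pipeline: it assumes log2-scale expression values (so each ratio becomes a difference), uses the cohort's own quartiles as cut-offs, and the function name `flag_ratio_outliers` and all numbers are hypothetical.

```python
import statistics

def flag_ratio_outliers(txn, nrf2, keap1):
    """Flag samples with a high TXN/KEAP1 or a low NRF2/KEAP1 ratio.

    Inputs are per-sample log2 expression values (an assumption), so a
    ratio on the linear scale is a difference on the log2 scale.
    """
    txn_keap1 = [t - k for t, k in zip(txn, keap1)]
    nrf2_keap1 = [n - k for n, k in zip(nrf2, keap1)]
    # statistics.quantiles(..., n=4) returns the three quartile cut points.
    upper_quartile = statistics.quantiles(txn_keap1, n=4)[2]
    lower_quartile = statistics.quantiles(nrf2_keap1, n=4)[0]
    return [a > upper_quartile or b < lower_quartile
            for a, b in zip(txn_keap1, nrf2_keap1)]

# Hypothetical log2 expression values for eight tumors.
keap1 = [10.0] * 8
txn = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 14.0]
nrf2 = [7.0, 10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6]
flags = flag_ratio_outliers(txn, nrf2, keap1)
print(flags)
```

In this toy cohort the last tumor is flagged through its high TXN/KEAP1 ratio and the first through its low NRF2/KEAP1 ratio. In practice the cut-offs could instead be anchored to the paired normal tissue extremes described above.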

Discussion

As a crucial transcription factor modulating the antioxidant response and metabolism, NRF2 is activated in several types of human squamous cell cancers (49). It is well known that KEAP1 protein mediates NRF2 protein degradation and inhibits its nuclear translocation. According to the four TCGA Research Network publications (7,50-52), NRF2 alterations are recurrently present in squamous cancer types and may have resulted from NRF2 and KEAP1 somatic mutations and copy number variations (Supplementary Figure S4). Consistent with a recent study (8), human lung adenocarcinoma was mainly characterized by KEAP1 somatic mutation and copy number deletion, while squamous cell lung carcinoma more commonly carried NRF2 mutations and copy number amplifications. The correlation in SQC suggested that KEAP1 expression responded to increased NRF2 expression, which was mostly driven by NRF2 copy number amplification and further magnified in combination with NRF2 activating mutations. However, the absence of an NRF2-KEAP1 mRNA correlation in lung adenocarcinoma specimens does not mean that the regulation is absent in lung adenocarcinoma cells. This is supported by the observation in lung adenocarcinoma that mutated KEAP1 shows higher mRNA expression (Figure 5F). Notably, KEAP1 truncating cases do not follow this pattern (yellow dots in Figure 5F), which can be partially explained by two studies on truncating mutations, which can lead to nonsense-mediated mRNA decay (NMD) and thus decrease gene expression (53,54).

The NRF2-driven KEAP1 expression at the mRNA and protein levels in multiple human lung cell lines indicated that activation of NRF2 could increase KEAP1 transcription. The highly consistent binding region in the KEAP1 promoter indicates that NRF2 feedback is conserved among human cell types. In practice, we observed three different response patterns in different lung cell lines. The fast-in-fast-out pattern of KEAP1 protein alteration in H292 cells suggested tightly controlled NRF2 activation in this cell line, which was also observed in BEAS-2B cells. The steadily increasing pattern of KEAP1 protein alteration in SQC cell lines (SK-MES-1, Figure 2F; H520, Supplementary Figure S5D) suggested delayed feedback control of NRF2 activation.
Moreover, the KEAP1 western blot, KEAP1 promoter luciferase assay, and ChIP-qPCR results in A549 cells (KEAP1 p.G333C, a loss-of-function mutation) imply a completely broken balance between NRF2 signaling and the loss-of-function KEAP1 mutation, which may be reflected in oversaturated NRF2 occupancy at the KEAP1 promoter region. Interestingly, negative feedback may generally function as a cushion to alleviate the effect of functional mutations (55). Although previous studies demonstrated that NRF2 and KEAP1 somatic mutations were responsible for constitutive NRF2 activation in human cancers (40,41), thorough functional characterization of these candidate variants remains a great challenge (20). Here, by recognizing the nature of KEAP1 mRNA compensation in response to NRF2 activation, we can anticipate more experimental or algorithmic designs for prioritizing mutations that cause NRF2 signaling disequilibrium.

The mRNA expression of another NRF2 downstream gene, TXN (thioredoxin), may help define copy number changes of NRF2 and KEAP1 in the model. TXN protein catalyzes dithiol-disulfide exchange reactions, which are crucial for maintaining intracellular redox balance (56). Similar to KEAP1, the NRF2 binding site of TXN is in the proximal promoter region, close to its first exon (Supplementary Figure S1C). This similarity in ARE pattern might partially explain the parallel expression of the two genes (Supplementary Figure S1A and S1B) and suggests that the TXN/KEAP1 ratio could be a good downstream indicator of NRF2-KEAP1 disequilibrium, which would explain the good separation by TXN/KEAP1 ratio that we observed in multiple SQC datasets (Supplementary Figure S3D-I).

Current RNA profiling technologies detect transcript presence from a snapshot of the whole transcriptome. However, the real transcriptome undergoes continuous change and complicated interactions (57). Although technologies keep improving, standardization between different platforms and batches still challenges the application of gene signatures in clinical settings (58). Under these circumstances, knowledge of transcriptome regulation is important for noise reduction and alteration identification. In the gene-ratio-based strategy, dividing two genes within each sample effectively removes between-sample variation and system noise, which makes the prediction more accurate and reliable (59,60). By applying gene ratios to evaluate NRF2-KEAP1 disequilibrium, we can expect to dissect functionally disruptive somatic mutations in this pathway. More importantly, since NRF2 activation leads to chemotherapy resistance in multiple cancer types (61), this strategy could also provide quantitative and reproducible information for identifying intrinsically non-responding patients.

In summary, we characterized NRF2-driven KEAP1 expression in human cancer contexts and proposed a new perspective for understanding NRF2 signaling regulation. Considering that NRF2 signaling is frequently altered in human cancers, our work will help pinpoint NRF2-KEAP1 disruption and promote drug discovery targeting this pathway.
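The normalization argument made in the Discussion for gene ratios can be illustrated with a small numeric sketch. It assumes a purely multiplicative, sample-wide technical factor (for example, differences in input amount or sequencing depth); the helper names and expression values are hypothetical and are not taken from the study.

```python
import math

def log2_ratio(gene_a, gene_b):
    """log2 of the within-sample ratio gene_a / gene_b (linear-scale inputs)."""
    return math.log2(gene_a / gene_b)

def apply_sample_factor(values, factor):
    """Multiply every gene in one sample by the same technical factor."""
    return {gene: value * factor for gene, value in values.items()}

# Hypothetical linear-scale expression for one tumor.
sample = {"TXN": 800.0, "KEAP1": 200.0}

for factor in (0.5, 1.0, 3.0):  # hypothetical per-sample technical factors
    scaled = apply_sample_factor(sample, factor)
    # Absolute values change with the factor; the within-sample ratio does not.
    print(factor, scaled["TXN"], log2_ratio(scaled["TXN"], scaled["KEAP1"]))
```

The factor cancels in the ratio only when it acts equally on both genes; gene-specific or probe-specific effects would not be removed this way.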

List of abbreviations:

NSCLC: non-small cell lung cancer; SQC: squamous cell lung cancer; ADC: lung adenocarcinoma; NRF2: nuclear factor (erythroid-derived 2)-like 2; KEAP1: kelch-like ECH-associated protein 1; DMF: dimethylformamide; SFN: sulforaphane; tBHQ: tert-butylhydroquinone; TXN: thioredoxin; MACS: Model-based Analysis of ChIP-Seq; GSEA: Gene Set Enrichment Analysis; OS: overall survival; DFS: disease-free survival; DSS: disease-specific survival; CN: copy number; QuEST: Quantitative Enrichment of Sequence Tags; ARE: antioxidant responsive element.

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Availability of data and materials

All data generated or analyzed in this study are included either in this article or in the supplementary information files.

Funding

This study was supported by the Natural Science Foundation of China (Grant No. 81572608, 81874120) to KMW and QC, and by the Advancing a Healthier Wisconsin fund (Project # 5520227) and the Moffitt Cancer Center Faculty Startup fund to LW. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgments

We thank Dr. Muthusamy Kunnimalaiyaan at the Medical College of Wisconsin for his time spent discussing this project.

Authors' Contributions

Conception and design: Y. Tian
Development of methodology: Y. Tian and Y. Chen
Acquisition of data: Y. Tian, Q. Liu, and S. Yu
Analysis and interpretation of data: Y. Tian and Q. Chu
Writing, review, and revision of the manuscript: Y. Tian, L. Wang, and K. Wu
Study supervision: K. Wu and L. Wang

References

1. Herbst RS, Khuri FR, Fossella FV, Glisson BS, Kies MS, Pisters KM, et al. ZD1839 (Iressa) in non-small-cell lung cancer. Clin Lung Cancer 2001;3(1):27-32 doi 10.3816/clc.2001.n.014.
2. Koivunen JP, Mermel C, Zejnullahu K, Murphy C, Lifshits E, Holmes AJ, et al. EML4-ALK fusion gene and efficacy of an ALK kinase inhibitor in lung cancer. Clin Cancer Res 2008;14(13):4275-83 doi 10.1158/1078-0432.CCR-08-0168.
3. Rizvi NA, Hellmann MD, Snyder A, Kvistborg P, Makarov V, Havel JJ, et al. Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 2015;348(6230):124-8 doi 10.1126/science.aaa1348.
4. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2012;2(5):401-4 doi 10.1158/2159-8290.CD-12-0095.
5. Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal 2013;6(269):pl1 doi 10.1126/scisignal.2004088.
6. Ohta T, Iijima K, Miyamoto M, Nakahara I, Tanaka H, Ohtsuji M, et al. Loss of Keap1 function activates Nrf2 and provides advantages for lung cancer cell growth. Cancer Res 2008;68(5):1303-9 doi 10.1158/0008-5472.CAN-07-5003.
7. Cancer Genome Atlas Research N. Comprehensive genomic characterization of squamous cell lung cancers. Nature 2012;489(7417):519-25 doi 10.1038/nature11404.
8. Frank R, Scheffler M, Merkelbach-Bruse S, Ihle MA, Kron A, Rauer M, et al. Clinical and Pathological Characteristics of KEAP1- and NFE2L2-Mutated Non-Small Cell Lung Carcinoma (NSCLC). Clin Cancer Res 2018;24(13):3087-96 doi 10.1158/1078-0432.CCR-17-3416.
9. Tian Y, Liu Q, He X, Yuan X, Chen Y, Chu Q, et al. Emerging roles of Nrf2 signal in non-small cell lung cancer. J Hematol Oncol 2016;9:14 doi 10.1186/s13045-016-0246-5.
10. Tong KI, Padmanabhan B, Kobayashi A, Shang C, Hirotsu Y, Yokoyama S, et al. Different electrostatic potentials define ETGE and DLG motifs as hinge and latch in response. Mol Cell Biol 2007;27(21):7511-21 doi 10.1128/MCB.00753-07.
11. Ma Q. Role of nrf2 in oxidative stress and toxicity. Annu Rev Pharmacol Toxicol 2013;53:401-26 doi 10.1146/annurev-pharmtox-011112-140320.
12. Lee OH, Jain AK, Papusha V, Jaiswal AK. An auto-regulatory loop between stress sensors INrf2 and Nrf2 controls their cellular abundance. J Biol Chem 2007;282(50):36412-20 doi 10.1074/jbc.M706517200.
13. Chorley BN, Campbell MR, Wang X, Karaca M, Sambandan D, Bangura F, et al. Identification of novel NRF2-regulated genes by ChIP-Seq: influence on retinoid X receptor alpha. Nucleic Acids Res 2012;40(15):7416-29 doi 10.1093/nar/gks409.
14. Na HK, Surh YJ. Oncogenic potential of Nrf2 and its principal target protein heme oxygenase-1. Free Radic Biol Med 2014;67:353-65 doi 10.1016/j.freeradbiomed.2013.10.819.
15. Sanghvi VR, Leibold J, Mina M, Mohan P, Berishaj M, Li Z, et al. The Oncogenic Action of NRF2 Depends on De-glycation by Fructosamine-3-Kinase. Cell 2019;178(4):807-19 e21 doi 10.1016/j.cell.2019.07.031.
16. Menegon S, Columbano A, Giordano S. The Dual Roles of NRF2 in Cancer. Trends Mol Med 2016;22(7):578-93 doi 10.1016/j.molmed.2016.05.002.
17. Berger AH, Brooks AN, Wu X, Shrestha Y, Chouinard C, Piccioni F, et al. High-throughput Phenotyping of Lung Cancer Somatic Mutations. Cancer Cell 2016;30(2):214-28 doi 10.1016/j.ccell.2016.06.022.

18. Fukutomi T, Takagi K, Mizushima T, Ohuchi N, Yamamoto M. Kinetic, thermodynamic, and structural characterizations of the association between Nrf2-DLGex degron and Keap1. Mol Cell Biol 2014;34(5):832-46 doi 10.1128/MCB.01191-13.
19. Hast BE, Cloer EW, Goldfarb D, Li H, Siesser PF, Yan F, et al. Cancer-derived mutations in KEAP1 impair NRF2 degradation but not ubiquitination. Cancer Res 2014;74(3):808-17 doi 10.1158/0008-5472.CAN-13-1655.
20. Kerins MJ, Ooi A. A catalogue of somatic NRF2 gain-of-function mutations in cancer. Sci Rep 2018;8(1):12846 doi 10.1038/s41598-018-31281-0.
21. Kim E, Ilic N, Shrestha Y, Zou L, Kamburov A, Zhu C, et al. Systematic Functional Interrogation of Rare Cancer Variants Identifies Oncogenic Alleles. Cancer Discov 2016;6(7):714-26 doi 10.1158/2159-8290.CD-16-0160.
22. Kobayashi M, Itoh K, Suzuki T, Osanai H, Nishikawa K, Katoh Y, et al. Identification of the interactive interface and phylogenic conservation of the Nrf2-Keap1 system. Genes Cells 2002;7(8):807-20 doi 10.1046/j.1365-2443.2002.00561.x.
23. Lee DF, Kuo HP, Liu M, Chou CK, Xia W, Du Y, et al. KEAP1 E3 ligase-mediated downregulation of NF-kappaB signaling by targeting IKKbeta. Mol Cell 2009;36(1):131-40 doi 10.1016/j.molcel.2009.07.025.
24. Shibata T, Kokubu A, Gotoh M, Ojima H, Ohta T, Yamamoto M, et al. Genetic alteration of Keap1 confers constitutive Nrf2 activation and resistance to chemotherapy in gallbladder cancer. Gastroenterology 2008;135(4):1358-68, 68 e1-4 doi 10.1053/j.gastro.2008.06.082.
25. Shibata T, Ohta T, Tong KI, Kokubu A, Odogawa R, Tsuta K, et al. Cancer related mutations in NRF2 impair its recognition by Keap1-Cul3 E3 ligase and promote malignancy. Proc Natl Acad Sci U S A 2008;105(36):13568-73 doi 10.1073/pnas.0806268105.
26. Tian Y, Wu K, Liu Q, Han N, Zhang L, Chu Q, et al. Modification of platinum sensitivity by KEAP1/NRF2 signals in non-small cell lung cancer. J Hematol Oncol 2016;9(1):83 doi 10.1186/s13045-016-0311-0.
27. Liu Q, Li A, Yu S, Qin S, Han N, Pestell RG, et al. DACH1 antagonizes CXCL8 to repress tumorigenesis of lung adenocarcinoma and improve prognosis. J Hematol Oncol 2018;11(1):53 doi 10.1186/s13045-018-0597-1.
28. Haeussler M, Schonig K, Eckert H, Eschstruth A, Mianne J, Renaud JB, et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol 2016;17(1):148 doi 10.1186/s13059-016-1012-2.
29. Arbab M, Srinivasan S, Hashimoto T, Geijsen N, Sherwood RI. Cloning-free CRISPR. Stem Cell Reports 2015;5(5):908-17 doi 10.1016/j.stemcr.2015.09.022.
30. Chen B, Gilbert LA, Cimini BA, Schnitzbauer J, Zhang W, Li GW, et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell 2013;155(7):1479-91 doi 10.1016/j.cell.2013.12.001.
31. Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F. Genome engineering using the CRISPR-Cas9 system. Nat Protoc 2013;8(11):2281-308 doi 10.1038/nprot.2013.143.
32. Wang X, Campbell MR, Lacher SE, Cho HY, Wan M, Crowl CL, et al. A Polymorphic Antioxidant Response Element Links NRF2/sMAF Binding to Enhanced MAPT Expression and Reduced Risk of Parkinsonian Disorders. Cell Rep 2016;15(4):830-42 doi 10.1016/j.celrep.2016.03.068.
33. Levings DC, Wang X, Kohlhase D, Bell DA, Slattery M. A distinct class of antioxidant response elements is consistently activated in tumors with NRF2 mutations. Redox Biol 2018;19:235-49 doi 10.1016/j.redox.2018.07.026.
34. Feng J, Liu T, Qin B, Zhang Y, Liu XS. Identifying ChIP-seq enrichment using MACS. Nat Protoc 2012;7(9):1728-40 doi 10.1038/nprot.2012.101.

35. Robinson JT, Thorvaldsdottir H, Wenger AM, Zehir A, Mesirov JP. Variant Review with the Integrative Genomics Viewer. Cancer Res 2017;77(21):e31-e4 doi 10.1158/0008-5472.CAN-17-0337.
36. Li Q, Birkbak NJ, Gyorffy B, Szallasi Z, Eklund AC. Jetset: selecting the optimal microarray probe set to represent a gene. BMC Bioinformatics 2011;12:474 doi 10.1186/1471-2105-12-474.
37. Balduzzi S, Rucker G, Schwarzer G. How to perform a meta-analysis with R: a practical tutorial. Evid Based Ment Health 2019;22(4):153-60 doi 10.1136/ebmental-2019-300117.
38. Almeida-de-Macedo MM, Ransom N, Feng Y, Hurst J, Wurtele ES. Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data. BMC Bioinformatics 2013;14:214 doi 10.1186/1471-2105-14-214.
39. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003;327(7414):557-60 doi 10.1136/bmj.327.7414.557.
40. Singh A, Misra V, Thimmulappa RK, Lee H, Ames S, Hoque MO, et al. Dysfunctional KEAP1-NRF2 interaction in non-small-cell lung cancer. PLoS Med 2006;3(10):e420 doi 10.1371/journal.pmed.0030420.
41. Cescon DW, She D, Sakashita S, Zhu CQ, Pintilie M, Shepherd FA, et al. NRF2 Pathway Activation and Adjuvant Chemotherapy Benefit in Lung Squamous Cell Carcinoma. Clin Cancer Res 2015;21(11):2499-505 doi 10.1158/1078-0432.CCR-14-2206.
42. Romero R, Sayin VI, Davidson SM, Bauer MR, Singh SX, LeBoeuf SE, et al. Keap1 loss promotes Kras-driven lung cancer and results in dependence on glutaminolysis. Nat Med 2017;23(11):1362-8 doi 10.1038/nm.4407.
43. Blanche P, Dartigues JF, Jacqmin-Gadda H. Estimating and comparing time-dependent areas under receiver operating characteristic curves for censored event times with competing risks. Stat Med 2013;32(30):5381-97 doi 10.1002/sim.5958.
44. Wang H, Liu K, Geng M, Gao P, Wu X, Hai Y, et al. RXRalpha inhibits the NRF2-ARE signaling pathway through a direct interaction with the Neh7 domain of NRF2. Cancer Res 2013;73(10):3097-108 doi 10.1158/0008-5472.CAN-12-3386.
45. Saidu NE, Noe G, Cerles O, Cabel L, Kavian-Tessler N, Chouzenoux S, et al. Dimethyl Fumarate Controls the NRF2/DJ-1 Axis in Cancer Cells: Therapeutic Applications. Mol Cancer Ther 2017;16(3):529-39 doi 10.1158/1535-7163.MCT-16-0405.
46. Homma S, Ishii Y, Morishima Y, Yamadori T, Matsuno Y, Haraguchi N, et al. Nrf2 enhances cell proliferation and resistance to anticancer drugs in human lung cancer. Clin Cancer Res 2009;15(10):3423-32 doi 10.1158/1078-0432.CCR-08-2822.
47. Davis CA, Hitz BC, Sloan CA, Chan ET, Davidson JM, Gabdank I, et al. The Encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res 2018;46(D1):D794-D801 doi 10.1093/nar/gkx1081.
48. Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res 2020;48(D1):D87-D92 doi 10.1093/nar/gkz1001.
49. Dotto GP, Rustgi AK. Squamous Cell Cancers: A Unified Perspective on Biology and Genetics. Cancer Cell 2016;29(5):622-37 doi 10.1016/j.ccell.2016.04.004.
50. Cancer Genome Atlas N. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 2015;517(7536):576-82 doi 10.1038/nature14129.
51. Cancer Genome Atlas Research N, Albert Einstein College of M, Analytical Biological S, Barretos Cancer H, Baylor College of M, Beckman Research Institute of City of H, et al. Integrated genomic and molecular characterization of cervical cancer. Nature 2017;543(7645):378-84 doi 10.1038/nature21386.

52. Cancer Genome Atlas Research N, Analysis Working Group: Asan U, Agency BCC, Brigham, Women's H, Broad I, et al. Integrated genomic characterization of oesophageal carcinoma. Nature 2017;541(7636):169-75 doi 10.1038/nature20805.
53. Jia P, Zhao Z. Impacts of somatic mutations on gene expression: an association perspective. Brief Bioinform 2017;18(3):413-25 doi 10.1093/bib/bbw037.
54. You KT, Li LS, Kim NG, Kang HJ, Koh KH, Chwae YJ, et al. Selective translational repression of truncated from frameshift mutation-derived mRNAs in tumors. PLoS Biol 2007;5(5):e109 doi 10.1371/journal.pbio.0050109.
55. Marciano DC, Lua RC, Herman C, Lichtarge O. Cooperativity of Negative Autoregulation Confers Increased Mutational Robustness. Phys Rev Lett 2016;116(25):258104 doi 10.1103/PhysRevLett.116.258104.
56. Mochizuki A, Saso A, Zhao Q, Kubo S, Nishida N, Shimada I. Balanced Regulation of Redox Status of Intracellular Thioredoxin Revealed by in-Cell NMR. J Am Chem Soc 2018;140(10):3784-90 doi 10.1021/jacs.8b00426.
57. Kim K, Zakharkin SO, Allison DB. Expectations, validity, and reality in gene expression profiling. J Clin Epidemiol 2010;63(9):950-9 doi 10.1016/j.jclinepi.2010.02.018.
58. Michiels S, Ternes N, Rotolo F. Statistical controversies in clinical research: prognostic gene signatures are not (yet) useful in clinical practice. Ann Oncol 2016;27(12):2160-7 doi 10.1093/annonc/mdw307.
59. Reddy A, Growney JD, Wilson NS, Emery CM, Johnson JA, Ward R, et al. Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages. PLoS One 2015;10(9):e0138486 doi 10.1371/journal.pone.0138486.
60. Price ND, Trent J, El-Naggar AK, Cogdell D, Taylor E, Hunt KK, et al. Highly accurate two-gene classifier for differentiating gastrointestinal stromal tumors and leiomyosarcomas. Proc Natl Acad Sci U S A 2007;104(9):3414-9 doi 10.1073/pnas.0611373104.
61. Rojo de la Vega M, Chapman E, Zhang DD. NRF2 and the Hallmarks of Cancer. Cancer Cell 2018;34(1):21-43 doi 10.1016/j.ccell.2018.03.022.

Figure legends

Figure 1. NRF2 and KEAP1 mRNA correlations in human cancers. A. NRF2 mRNA expression across TCGA PanCancer datasets. B-F. Scatter plots of NRF2 and KEAP1 mRNA expression in lung adenocarcinoma (LUAD, B), squamous cell lung cancer (LUSC, C), esophageal squamous cell carcinoma (ESCA: squamous, D), head and neck squamous cell carcinoma (HNSC, E), and cervical squamous cell carcinoma (CESC: squamous, F). G. Pooled correlation coefficient between NRF2 and KEAP1 mRNA expression in lung adenocarcinoma (ADC). *, Three bronchioloalveolar carcinomas and one squamous carcinoma were included in Lu's dataset. H. Pooled correlation coefficient between NRF2 and KEAP1 mRNA expression in squamous cell lung cancer (SQC).

Figure 2. KEAP1 expression alterations under NRF2 intervention in human lung cell lines. A. KEAP1 mRNA alteration in H292 cells after DMF and tBHQ treatment for 24 hours. B. NRF2 and KEAP1 mRNA alterations in A549 cells after NRF2 siRNA transfection for 72 hours. C and D. Changes in KEAP1 protein abundance in H292 cells treated with tert-butylhydroquinone (tBHQ, 60 μM, C) or dimethylformamide (DMF, 60 μM, D) within 24 h. E. Changes in NRF2 and KEAP1 protein abundance in A549 cells transfected with scramble control (NC) or NRF2 siRNA for 72 h. F and G. Changes in KEAP1 protein abundance in SK-MES-1 (F) and BEAS-2B (G) cells treated with tBHQ (60 μM) within 24 h. H. Changes in NRF2 and KEAP1 protein abundance in A549 and H460 cells treated with DMSO or tBHQ (60 μM) for 12 h. The 0-hour time point indicates cells treated with 0.1% DMSO. DMSO and tBHQ treatments were given in 10% FBS medium without sodium pyruvate. The P values were calculated using the two-tailed t-test.

Figure 3. NRF2 occupancies on KEAP1 promoter in human cell lines. A. NRF2 occupancy at the KEAP1 promoter in IMR90, A549, HepG2, and HeLa cells; sequencing data come from the ENCODE NRF2 ChIP-seq collection. B. NRF2 occupancy at the KEAP1 promoter in BEAS-2B cells treated with DMSO or sulforaphane (SFN, 10 μM); sequencing data come from SRA accession PRJNA305423. C. NRF2 occupancy at the KEAP1 promoter in immortalized lymphocytes treated with DMSO or sulforaphane (SFN, 10 μM); sequencing data come from SRA accession PRJNA161889. D. Top-ranked enriched regions with the NRF2 antibody in SFN-treated immortalized lymphocytes. E-H. NRF2 ChIP-qPCR at the NQO1, KEAP1, and TXN promoter regions in lung cancer cells H292 (E), H520 (F), SK-MES-1 (G), and A549 (H). Quantification data were normalized to sample-specific input and then normalized to the gene-specific negative control (NC) at IgG isotype control. P values represent comparisons of fold changes between the DMSO-NRF2 and tBHQ-NRF2 groups at each ARE-containing promoter. Bars represent the average of 4 independent ChIP experiments ± SEM.

Figure 4. KEAP1 promoter reporter assay and KEAP1 mRNA quantification in ARE-edited clones. A. In silico predicted NRF2- (ARE) and SP1-binding sites and KEAP1 promoter reporter variants. B. Alteration of full-length KEAP1 promoter activity in H292 and A549 cells treated with tBHQ (60 μM) within 24 h. C and D. Relative luciferase activity fold changes driven by different mutants of the KEAP1 promoter after chemical (12 h) or genetic (48 h) NRF2 activation in H292 (C) and A549 (D) cells. Data represent the mean ± SEM (n=3). NC, scramble control. NRF2, pcDNA3.1-NRF2-ORF. Vector, pcDNA3.1-empty vector. E. The schematic bar plot shows homozygous ARE-deleted (ΔARE) clones selected from CRISPR-edited HEK293FT and H292 cells. F and G. Baseline KEAP1 mRNA expression in ΔARE and CTL clones from HEK293FT (F) and H292 (G) cells. Error bars represent the mean ± SEM. The unpaired t-test was used to compare between-group differences. H. NRF2 downstream gene expression in ΔARE and CTL clones in H292 cells. I. KEAP1 and NRF2 protein expression levels in one ΔARE (P4C10) and one CTL (P3C12) clone. J. KEAP1 mRNA quantification in ΔARE (P4C10) and CTL (P3C12) H292 clones treated with 60 μM tBHQ within 60 hours. Fresh medium containing 60 μM tBHQ was replenished at the 30-hour time point. Data represent the mean ± SEM (n=3). Two-way ANOVA was used to test between-group differences.

Figure 5. Inferring NRF2 signaling disruption by ratio-based strategy. A. Schematic for inferred NRF2-KEAP1 disequilibrium caused by NRF2-KEAP1 alterations. Dash B. ROC curve of TXN/KEAP1 ratio predicting NRF2 amplification and KEAP1 deletion in TCGA PanCancer (left), LUSC (center), and LUAD (right) datasets. C. ROC curve of NRF2/KEAP1 ratio predicting NRF2-KEAP1 mutations in LUSC (left) and LUAD (right) datasets. D. Distribution of TXN/KEAP1 ratio for cases with or without NRF2 amplification or KEAP1 deletion in TCGA PanCancer (left), LUSC (center), and LUAD (right) datasets. E. Distribution of NRF2/KEAP1 ratio for cases with or without NRF2-KEAP1 mutations in LUSC (left) and LUAD (right) datasets. F. Visualization of "Above the line" analysis in LUSC (left) and LUAD (right) datasets. G. Associations between times of "above the line" in TCGA PanCancer and NRF2-KEAP1 mutation annotations. H. "Above the line" profiling in NSCLC cell line dataset CCLE 2019 (left) and Coldren 2006 (right). I. Top-rank mutations with above line and presence times in TCGA PanCancer datasets.
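The "above the line" idea in panels F to I can be sketched as a residual check against a fitted regression line. The sketch below is a guess at the general approach rather than the authors' implementation: it assumes KEAP1 is regressed on NRF2 by ordinary least squares within one dataset and that a case counts as "above the line" when its KEAP1 expression exceeds the fitted value. Function names and the toy values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def above_the_line(nrf2, keap1):
    """True for each case whose KEAP1 value lies above the fitted line."""
    slope, intercept = fit_line(nrf2, keap1)
    return [k > slope * n + intercept for n, k in zip(nrf2, keap1)]

# Hypothetical log2 expression values for five cases in one dataset.
nrf2 = [8.0, 9.0, 10.0, 11.0, 12.0]
keap1 = [8.0, 8.5, 11.0, 9.5, 10.0]
print(above_the_line(nrf2, keap1))
```

Repeating this per dataset and counting how often a given mutation falls above the line would give a recurrence tally of the kind the legend describes for the PanCancer analysis.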

Figure 6. Evaluation of prognostic performance of previous NRF2 signatures and gene ratios in SQC cohorts. A. Kaplan-Meier curves (left panel) and time-dependent ROC analysis (right panel) in LUSC comparing disease-free survival (DFS) between previously published NRF2 signatures and the gene ratio strategy. B. Kaplan-Meier curves (left panel) and time-dependent ROC analysis (right panel) in LUSC comparing disease-specific survival (DSS) between previously published NRF2 signatures and the gene ratio strategy. C. Kaplan-Meier curves (left panel) and time-dependent ROC analysis (right panel) in Bild's SQC cohort comparing overall survival (OS) between previously published NRF2 signatures and the gene ratio strategy. D. Kaplan-Meier curves (left panel) and time-dependent ROC analysis (right panel) in Botling's SQC cohort comparing OS between previously published NRF2 signatures and the gene ratio strategy.
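For readers unfamiliar with the left-hand panels, the Kaplan-Meier estimate behind each curve is the running product of (1 − events / number at risk) over the observed event times. The following is a minimal, dependency-free sketch of that estimator with hypothetical follow-up data; it is not the code used to produce Figure 6 and does not cover the time-dependent ROC analysis.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient
    events : 1 if the endpoint was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) at each event time.
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    idx = 0
    while idx < len(records):
        t = records[idx][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while idx < len(records) and records[idx][0] == t:
            deaths += records[idx][1]
            leaving += 1
            idx += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up (months) for a ratio-defined outlier group.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Computing one such curve per group (for example, ratio outliers versus the remaining patients) gives the pair of step functions that are compared in each panel.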

Figure 1

[Figure image not reproduced; the recoverable panel information is listed below.]

A. TCGA PanCancer studies. y-axis: log2 NRF2 mRNA expression; x-axis: TCGA cancer types; groups: amplification and gain, diploid, deep and shallow deletion.

B-F. Scatter plots of KEAP1 mRNA expression (y-axis) against NRF2 mRNA expression (x-axis).

Panel  Dataset          r      p        n
B      LUAD             0.009  0.84     510
C      LUSC             0.50   <1e-15   484
D      ESCA: squamous   0.49   7.13e-7  94
E      HNSC             0.33   5e-15    515
F      CESC: squamous   0.20   0.0014   243

G. NRF2-KEAP1 correlation in ADC

Dataset              Sample size  Coefficient  95% CI           Weight
Bild                 58           0.29         [0.04; 0.51]     4.1%
Lee                  63           0.17         [−0.08; 0.40]    4.4%
Kuner                40           −0.03        [−0.34; 0.28]    3.2%
Hou                  45           0.05         [−0.24; 0.34]    3.5%
Zhu                  71           0.03         [−0.21; 0.26]    4.7%
Micke                50           −0.02        [−0.29; 0.26]    3.8%
Fujiwara             9            0.60         [−0.11; 0.90]    0.7%
Rousseaux            85           0.12         [−0.09; 0.33]    5.1%
Botling              106          0.20         [0.01; 0.37]     5.7%
Tarca                77           0.17         [−0.06; 0.38]    4.9%
Tang                 133          0.30         [0.14; 0.45]     6.2%
Der                  128          −0.16        [−0.32; 0.01]    6.1%
TCGA-LUAD            510          0.01         [−0.08; 0.09]    8.5%
Beer                 86           0.25         [0.04; 0.44]     5.1%
Landi                58           −0.02        [−0.27; 0.24]    4.1%
Shedden              443          −0.09        [−0.18; 0.00]    8.3%
Tomida               117          −0.16        [−0.34; 0.02]    5.9%
Lu*                  60           0.05         [−0.21; 0.30]    4.2%
Selamat              58           −0.20        [−0.44; 0.06]    4.1%
Okayama              226          0.10         [−0.03; 0.23]    7.3%
Random effect model  Total 2428   0.06         [−0.01; 0.13]    100.0%
Heterogeneity: I² = 60%, τ² = 0.01

H. NRF2-KEAP1 correlation in SQC

Dataset             Sample size  Coefficient  95% CI          Weight
Bild                53           0.55         [0.33; 0.72]    3.7%
Lee                 75           0.59         [0.42; 0.72]    5.3%
Kuner               18           0.51         [0.06; 0.79]    1.1%
Hou                 27           0.56         [0.23; 0.77]    1.8%
Zhu                 52           0.50         [0.27; 0.68]    3.6%
Micke               28           0.39         [0.02; 0.67]    1.8%
Fujiwara            48           0.47         [0.22; 0.67]    3.3%
Rousseaux           61           0.55         [0.34; 0.70]    4.3%
Botling             66           0.45         [0.24; 0.63]    4.6%
Tarca               73           0.61         [0.45; 0.74]    5.1%
Tang                43           0.56         [0.32; 0.74]    2.9%
Der                 43           0.50         [0.24; 0.70]    2.9%
TCGA-LUSC           484          0.50         [0.43; 0.57]    36.6%
Raponi              130          0.58         [0.46; 0.69]    9.3%
Wilkerson           56           0.44         [0.20; 0.63]    3.9%
Meyerson            135          0.43         [0.28; 0.56]    9.7%
Fixed effect model  Total 1409   0.50         [0.47; 0.55]    100.0%
Heterogeneity: I² = 0%, τ² = 0
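As a rough check on the pooled coefficient in panel H, the sketch below pools the per-cohort SQC correlations with a fixed-effect model on Fisher z-transformed values, weighting each cohort by n − 3. The pooling method is an assumption on our part (the exact settings of the meta-analysis are not stated in this section); the sample sizes and coefficients are copied from the figure, and the function name is hypothetical.

```python
import math

# (dataset, sample size, correlation coefficient) as printed in Figure 1H.
SQC_COHORTS = [
    ("Bild", 53, 0.55), ("Lee", 75, 0.59), ("Kuner", 18, 0.51),
    ("Hou", 27, 0.56), ("Zhu", 52, 0.50), ("Micke", 28, 0.39),
    ("Fujiwara", 48, 0.47), ("Rousseaux", 61, 0.55), ("Botling", 66, 0.45),
    ("Tarca", 73, 0.61), ("Tang", 43, 0.56), ("Der", 43, 0.50),
    ("TCGA-LUSC", 484, 0.50), ("Raponi", 130, 0.58),
    ("Wilkerson", 56, 0.44), ("Meyerson", 135, 0.43),
]

def pool_correlations_fixed(cohorts, z_crit=1.96):
    """Fixed-effect pooled correlation via the Fisher z-transform.

    Each cohort's z = atanh(r) has approximate variance 1 / (n - 3), so the
    inverse-variance weight is n - 3. Returns (pooled r, lower, upper) with
    the confidence limits back-transformed to the correlation scale.
    """
    weights = [n - 3 for _, n, _ in cohorts]
    z_values = [math.atanh(r) for _, _, r in cohorts]
    total_weight = sum(weights)
    z_pooled = sum(w * z for w, z in zip(weights, z_values)) / total_weight
    standard_error = 1.0 / math.sqrt(total_weight)
    lower = math.tanh(z_pooled - z_crit * standard_error)
    upper = math.tanh(z_pooled + z_crit * standard_error)
    return math.tanh(z_pooled), lower, upper

pooled_r, ci_lower, ci_upper = pool_correlations_fixed(SQC_COHORTS)
print(round(pooled_r, 2), round(ci_lower, 2), round(ci_upper, 2))
```

Because the inputs are the two-decimal values shown in the figure, small rounding differences from the published pooled estimate are to be expected.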


Figure 2

A-E. H292 (KEAP1 wt) and A549 (KEAP1 p.G333C): relative mRNA expression of NRF2 and KEAP1 with NC versus siNRF2; relative KEAP1 mRNA expression with DMSO, DMF or tBHQ; blots of NRF2 (100 kd), KEAP1 (68 kd) and β-actin (43 kd) at 0, 1, 3, 6, 12 and 24 h of 60 μM tBHQ or 60 μM DMF.

F, G. SK-MES-1 (KEAP1 wt) and BEAS-2B (KEAP1 wt): KEAP1 (68 kd) and β-actin (43 kd) at 0, 1, 3, 6, 12 and 24 h of 60 μM tBHQ.

H. A549 (KEAP1 p.G333C) and H460 (KEAP1 p.D236H): KEAP1 (68 kd) and β-actin (43 kd) at 0 and 12 h of 60 μM tBHQ.

Figure 3

A ENCODE NRF2 ChIP-seq collection (Santa Cruz Biotech sc-13032)

IMR90 [0-328] human embryonic lung fibroblasts FDR q = 1e-354

A549 [0-328] human lung carcinoma FDR q = 1e-139

HepG2 [0-328] human hepatocellular carcinoma FDR q = 1e-228

HeLa [0-328] human cervical adenocarcinoma FDR q = 1e-47

B PRJNA305423 NRF2 ChIP-seq (Abcam ab62352) in BEAS-2B cells SFN [0-32]

DMSO [0-32]

C PRJNA161889 NRF2 ChIP-seq (Abcam ab62352) in lymphocytes SFN [0-646]

DMSO [0-646]

Gene track: NM_203500, NM_012289

D Top altered NRF2 binding loci during NRF2 activator SFN treatment

Chr  Start        End          Pileup  -Log10(pvalue)  Inducible binding foldchange  -Log10(qvalue)  Regulated gene
3    53,270,085   53,270,474   72      64.91           15.19                         56.78           TKT
8    30,727,704   30,728,097   48      64.25           25.11                         56.33           GSR
19   10,502,601   10,503,002   68      59.80           14.36                         52.54           KEAP1
22   35,371,858   35,372,275   68      59.81           14.36                         52.54           Z82244.2
14   95,668,338   95,668,782   27      50.42           23.73                         43.60           TCL6
16   69,726,802   69,727,205   76      48.55           8.94                          41.79           NQO1
9    110,256,729  110,257,123  68      48.48           10.29                         41.72           TXN
20   653,305      653,695      38      47.56           19.99                         40.83           SRXN1

E-H. Gene-specific NRF2 occupancies (IP/Input) at NQO1-NC, NQO1-ARE, KEAP1-NC, KEAP1-ARE, TXN-NC and TXN-ARE in H292 (E), H520 (F), SK-MES-1 (G) and A549 (H) cells. Groups: DMSO-IgG, tBHQ-IgG, DMSO-NRF2, tBHQ-NRF2.


Figure 4

A. KEAP1 promoter constructs with the SP1 binding site and the ARE (NRF2 binding site): Full-length, ΔSp1, ΔARE, ΔΔ. Predicted sites:

TF name  Score   Relative score  Start     End       Predicted site sequence
NRF2     15.16   0.98            10502793  10502803  ATGACTAAGCA
SP1      13.428  0.95            10502941  10502951  CCCCCGCCCCG

B. Relative luciferase activity in H292 and A549 cells at 0, 1, 3, 6, 12 and 24 h of 60 μM tBHQ.

C, D. Relative luciferase activity of the FL, ΔSP1, ΔARE and ΔΔ constructs in H292 and A549 cells, comparing DMSO with tBHQ and Vector with NRF2.

E. Sequence around the ARE in the non-edited genome and in clones P1C3, P3C12, P4C10 and P3E11: ACTTTTATTGTGACACGGCGGGCGCCCGGCTCTGCTTAGTCATGGTGACCTGC

F, G. Relative KEAP1/ACTB mRNA in ΔARE versus CTL clones in HEK293FT (F) and H292 (G) cells.

H. Relative mRNA expression of TKT, NQO1, GCLC, SRXN1 and TXN in H292 ΔARE versus CTL clones.

I. Relative KEAP1 mRNA expression in H292 clones P3C12 and P4C10 over 0 to 60 h of 60 μM tBHQ treatment (tBHQ renewed at 30 h); p<0.0001.

J. H292 clones P3C12 and P4C10: KEAP1 (140 kd dimer and 68 kd), NRF2 (100 kd) and β-actin (43 kd).

Figure 5

A. Schematic of nuclear NRF2, KEAP1, TXN mRNA and KEAP1 mRNA in three states: NRF2 amplification; equilibrium state; NRF2-KEAP1 mutation or KEAP1 deletion.

B. TXN/KEAP1 ratio predicts KEAP1 deletion and NRF2 amplification. C. NRF2/KEAP1 ratio predicts NRF2-KEAP1 mutation. ROC curves, sensitivity (%) against 1-specificity (%):

Panel  Cohort      AUC   95% CI     p
B      Pan-cancer  0.66  0.65-0.67  <1e-15
B      TCGA-LUSC   0.67  0.62-0.72  1e-10
B      TCGA-LUAD   0.62  0.57-0.67  6.9e-6
C      TCGA-LUSC   0.61  0.56-0.68  1.3e-4
C      TCGA-LUAD   0.70  0.63-0.76  2.3e-10

D, E. Frequency (%) distributions of Log2 (TXN/KEAP1) by KEAP1 deletion or NRF2 amplification status (D) and of Log2 (NRF2/KEAP1) by NRF2-KEAP1 mutation status (E), in the same cohorts as B and C.

F. KEAP1 mRNA expression against NRF2 mRNA expression in TCGA-LUSC and TCGA-LUAD, with cases "above the line" marked. Legend: NRF2 mutation, KEAP1 mutation, KEAP1 truncation, both mutated, neither mutated. Labelled truncations: E41*, Q75*, Q193*, Q201*, K287*, K323*, Y345*, E441*, E449*.

G. Percentage of functional mutations versus neutral or unknown mutations by number of times above the line (never, once, twice, 3, 4, 5, 6+); p=1.42e-20.

H, I. KEAP1 mRNA expression against NRF2 mRNA expression in CCLE 2019 NSCLC cell lines (RNA-seq; r=0.30, p=0.0007, n=126) and Coldren 2006 NSCLC cell lines (Microarray; r=0.37, p=0.0189, n=41). Legend: no mutation, NRF2 mutation, KEAP1 mutation, unknown; KEAP1/NRF2 ratio. Top rank above line mutations:

NRF2 mutation  Times above  Presence    KEAP1 mutation  Times above  Presence
R34G           20           20          R470C           5            5
E82D           14           14          G480W           4            4
D29H           13           16          V271L           4            5
E79Q           13           14          R554Q           3            3
E79K           6            7           S45F            3            3
G31A           6            6           X570_splice     3            5
D29Y           5            5           D236N           2            3
R34P           5            7           E117K           2            2
D29N           4            5           G333C           2            2
G81S           4            5           G333S           2            2
G81V           4            4           G417V           2            2
R34Q           4            6           G423V           2            2
W24R           4            4           L115Q           2            2
L30F           4            6           P278L           2            2

Figure 6

A-D. Survival fraction against time and time-dependent AUC(t) in four squamous cell lung cancer cohorts: TCGA-LUSC DFS, TCGA-LUSC DSS, GSE3141-SQC OS and GSE37745 SQC OS. Each panel compares previous signatures (Rodrigo's signature, Cescon's signature, Singh's NRF2 target) with the TXN/KEAP1 ratio, the NRF2/KEAP1 ratio and the two ratios combined, reporting HR, 95% CI and p for each.

NRF2-Driven KEAP1 Transcription in Human Lung Cancer

Yijun Tian, Qian Liu, Shengnan Yu, et al.

Mol Cancer Res Published OnlineFirst June 22, 2020.

Updated version: Access the most recent version of this article at doi:10.1158/1541-7786.MCR-20-0108

Supplementary Material: Access the most recent supplemental material at http://mcr.aacrjournals.org/content/suppl/2020/06/20/1541-7786.MCR-20-0108.DC1

Author Manuscript: Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.
