Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Research Article

Nongenetic Evolution Drives Lung Adenocarcinoma Spatial Heterogeneity and Progression

Daniele Tavernari1,2,3, Elena Battistello1,2,3,4, Elie Dheilly1,4, Aaron S. Petruzzella1,4, Marco Mina1,2,3, Jessica Sordet-Dessimoz4, Solange Peters1,5, Thorsten Krueger6, David Gfeller1,7, Nicolo Riggi1,8, Elisa Oricchio1,4, Igor Letovanec1,8,9, and Giovanni Ciriello1,2,3 Illustration by Bianca Dunn by Illustration

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

abstract Cancer evolution determines molecular and morphologic intratumor heterogene- ity and challenges the design of effective treatments. In lung adenocarcinoma, disease progression and prognosis are associated with the appearance of morphologically diverse tumor regions, termed histologic patterns. However, the link between molecular and histologic features remains elusive. Here, we generated multiomics and spatially resolved molecular profiles of histologic patterns from primary lung adenocarcinoma, which we integrated with molecular data from >2,000 patients. The transition from indolent to aggressive patterns was not driven by genetic alterations but by epigenetic and transcriptional reprogramming reshaping cancer cell identity. A signature quantify- ing this transition was an independent predictor of patient prognosis in multiple human cohorts. Within individual tumors, highly multiplexed spatial profiling revealed coexistence of immune desert, inflamed, and excluded regions, which matched histologic pattern composition. Our results provide a detailed molecular map of lung adenocarcinoma intratumor spatial heterogeneity, tracing nongenetic routes of cancer evolution.

Significance: Lung adenocarcinomas are classified based on histologic pattern prevalence. However, individual tumors exhibit multiple patterns with unknown molecular features. We characterized nonge- netic mechanisms underlying intratumor patterns and molecular markers predicting patient prognosis. Intratumor patterns determined diverse immune microenvironments, warranting their study in the context of current immunotherapies.

INTRODUCTION exposure (3, 4). Lung adenocarcinoma genetic diversity has been documented both across patients (1, 5), where it can Cancer cells evolve by acquiring novel alterations and determine treatment choices (6–10), and within individual adapting to changing conditions. Genetic, epigenetic, and tumors (11, 12), where it sustains disease evolution and transcriptional changes determine extensive heterogeneity treatment resistance (13, 14). Lung adenocarcinoma inter- among patients and within individual tumors, influencing and intrapatient molecular diversity is, however, not exclu- disease prognosis and therapeutic options. Lung adenocar- sively genetic. Transcriptional and epigenetic heterogeneity cinoma is the most common subtype of lung cancer, and has been reported both among and within patients (15–19). it encompasses molecularly and phenotypically diverse dis- Moreover, molecular diversity can translate in diverse tumor eases (1, 2), either associated with or unrelated to tobacco microenvironments, for example in association with variable tumor mutational burden (TMB; refs. 20–23) or presence of specific oncogenic alterations (24). In the clinic, histopathologic analyses have revealed het- 1Swiss Cancer Center Leman, Lausanne, Switzerland. 2Department of Com- erogeneous tumor tissue morphologies referred to as histo- putational Biology, University of Lausanne (UNIL), Lausanne, Switzerland. 3Swiss Institute of Bioinformatics, Lausanne, Switzerland. 4Swiss Institute logic patterns. The most frequent patterns are classified as for Experimental Cancer Research (ISREC), EPFL, Lausanne, Switzerland. lepidic, papillary, acinar, and solid (Fig. 1A), and 80% of lung 5Department of Oncology, University Hospital of Lausanne (CHUV), Lausanne, adenocarcinoma tumors concurrently exhibit at least two of Switzerland. 6Division of Thoracic Surgery, University Hospital of Lausanne these patterns. According to the latest WHO classification (2), 7 (CHUV), Lausanne, Switzerland. Department of Oncology, Ludwig Institute for histopathologic lung adenocarcinoma classification is based Cancer Research, University of Lausanne (UNIL), Lausanne, Switzerland. 8Institute of Pathology, University Hospital of Lausanne (CHUV), Lausanne, on pattern prevalence, which is a major prognostic indicator Switzerland. 9Department of Pathology, Central Institute, Hôpital du Valais, (25, 26). Indeed, tumors with a prevalent lepidic pattern are Sion, Switzerland. typically considered less aggressive and associated with the Note: Supplementary data for this article are available at Cancer Discovery early phases of the disease, whereas solid-prevalent tumors Online (http://cancerdiscovery.aacrjournals.org/). are indicative of poor prognosis. These prognostic associa- Current address for E. Battistello: Department of Pathology, New York tions define a potential progression of patterns from lepidic University (NYU), New York, New York; current address for E. Dheilly, Ichnos to papillary, acinar, and at last solid (Fig. 1A). Whether this Sciences SA, La Chaux-de-Fonds, Switzerland; and current address for M. Mina, SOPHiA GENETICS, Saint-Sulpice, Switzerland. progression can also be observed at the molecular level is, however, unknown. Indeed, reconciling molecular and histo- Corresponding Authors: Giovanni Ciriello, Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland. Phone: 412- logic heterogeneity is hampered by the difficulty of assaying 1692-5450; E-mail: [email protected]; and Igor Letovanec, Hôpital these features in the same tumor material. Recent digital du Valais, Avenue Grand-Champsec 80, 1951 Sion, Switzerland. Phone: pathology and spatial genomics technologies coupled with 412-7603-4774; E-mail: [email protected] advanced computational approaches provided new strate- Cancer Discov 2021;11:1–18 gies to address this challenge. For example, the combination doi: 10.1158/2159-8290.CD-20-1274 of multiregion molecular profiles and histologic data was ©2021 American Association for Cancer Research. recently used to draw a link between lung adenocarcinoma

june 2021 CANCER DISCOVERY | OF2

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al.

A BC LUAD histologic patterns P = 0.007 P = 0.0004 TCGA-44-7670 3,000 P = 0.0009 2,500 TCGA-44-7670 2,000

1,500

1,000 # Mutations

500

0 01Z-DX slide 01A-TS1 slide Solid Lepidic Acinar Lepidic Papillary Acinar Solid Papillary Tumor aggressiveness/pattern progression

DE F 8 G 5 out of 6 negative 6 ALL 5,835 4,3844 FC negative 5K 2,582 50K pression FC mRNA expression TPM) 4 1,337 2 fold changes (FC) 1K 8,789 10K All negative (n = 1,337 )

(log 2 ALL 500 5K 100 1,753 RAP1GAP fold changes positive mRNA ex 0 5 out of 6 50 FC 1K positive FC 10 500 8 ANLN 5 100 7 All positive 5 out of 6

1 loci rentially methylated rentially expressed genes rentially expressed 50 6 0.5 fold changes negative FC ALL

20 pression 5

TPM) negative 2 4 DNA methylation # Dif fe # Dif fe FDR: 0.1 0.01 0.001 0.1 0.01 0.001 FC FDR: 0.1 0.01 0.001 0.10.010.001 3 fold changes (FC)

(log 5 out of 6 Real Random Real Random 2 positive FC (n = 1,753 probes)

mRNA ex ALL patterns patterns patterns patterns 1 positive FC Solid Lepidic Acinar Papillary H IJ LepidicPapillaryAcinarSolid Enrichment significance [−log (q)] Enrichment significance [−log (q)] 10 10 Lung epithelial cells • 20 10 01020608090510 0510 15 Lung alveolar cells * Mean Endothelial cells * Overexpressed signature

Methylated B cells Neurogenesis Biological adhesion promoters

in lepidic score

in solid Fibroblasts Cellular morphogenesis Regulation of transport Plasma cells (Z -score) Cell differentiation Neurogenesis T-regs Cell projection CD4 T cells Developmental growth Immune score • • Developmental process Cell cycle arrest CD8 T cells • −101 Tissue morphogenesis Dendritic cells • Methylated promoters T cells • in lepidic γδ Cell cycle Overexpressed Immune system NK cells • Cytotoxic cells • • Mitosis Defense response in solid Monocytes • * segregation Neutrophil migration Macrophages M2 • * Post hoc Immune effector process Response to chemokine Macrophages • P-value Macrophages M1 • * Immune system • * • < 0.1 Eosinophils * < 0.01 Defense response q = 0.01 Mast cells • ** * Cytokine production Neutrophils •• * ** < 0.001 T cells exhausted * q = 0.01

Figure 1. Interpatient heterogeneity among lung adenocarcinoma histologic patterns. A, Hematoxylin and eosin (H&E) staining of lung adenocarci- noma (LUAD) histologic patterns (from left to right): lepidic, papillary, acinar, and solid. B, Total number of somatic coding mutations (y-axis) in TCGA samples (colored points) stratified by histologic pattern classificationx ( -axis). The outlier lepidic-annotated TCGA sample is highlighted by its patient ID. P values are computed by Wilcoxon two-tailed test. C, Representative H&E images of two tumor tissue slides for the TCGA-44-7670 sample: an image taken from a slide corresponding to the tumor sample used for histopathology review (left) and an image corresponding to the tumor sample used for molecular analyses (right). Complete images can be accessed at: https://cancer.digitalslidearchive.org/. D and E, Number of (D) significantly differentially expressed genes and (E) significantly differentially methylated probes identified based on different FDR thresholdsx ( -axis) by comparing patients grouped by the real prevalent histologic pattern (black) or after randomizing the histologic pattern labels (gray). Error bars correspond to one standard deviation upon 100 label permutations. F, mRNA expression of two top differentially expressed genes (y-axis) in TCGA samples stratified by prevalent pattern (x-axis). Colored dots on the right indicate the median expression value of each group and arrows represent the direction of the fold change (FC): upward (downward) arrows indicate that the more aggressive patterns have lower (higher) median expression than the less aggressive patterns. Pairwise FCs always compare the more aggressive to the less aggressive pattern; hence, upward arrows correspond to negative FCs and downward arrows to posi- tive FCs. G, Pie chart distributions of the sign of pairwise FCs computed for all differentially expressed genes (top) and differentially methylated probes (bottom). H, Significantly enriched sets among genes overexpressed in lepidic-prevalent samples (blue bars) and in solid-prevalent samples (red and yellow bars). I, Significantly enriched gene sets among promoter probes with lower DNA methylation in lepidic-prevalent samples than solid-prevalent samples (blue bars) or with lower DNA methylation in solid-prevalent samples than in lepidic-prevalent samples (yellow bars). J, Mean mRNA expression scores for multiple cell types (rows) within each pattern subtype (columns). Values are normalized by rows (Z-scores) to show relative differences among patterns. mutational heterogeneity and microenvironment composi- noma histologic patterns and the molding of their tumor tion, with relevant implications for the adoption of current microenvironment are largely unknown. immunotherapies in this disease (21, 22). However, in spite of Here, we performed histopathology-guided multiregion their prognostic relevance, comprehensive molecular profiles sampling from primary human lung adenocarcinoma to dis- of morphologically diverse regions in the same patient are sect tumor regions corresponding to unique histologic pat- missing. Indeed, the molecular features of lung adenocarci- terns. Multiomics and spatially resolved molecular profiles of

OF3 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE these regions allowed us to define tumor-intrinsic and tumor- differentially expressed genes (n = 1,337, adj. P < 0.001) and extrinsic processes that determined lung adenocarcinoma methylated DNA loci (n = 1,753, adj. P < 0.001) among the pattern progression. We validated and evaluated the prog- four histologic subtypes did not highlight features unique to nostic significance of our results in more than 2,000 lung each group but rather progressive changes from lepidic- to adenocarcinoma samples from independent patient cohorts. solid-prevalent samples. To quantify this trend, we com- Importantly, none of these processes could be traced back puted gene expression and DNA methylation fold changes to specific genetic variants, but rather to epigenetic and between each pair of histologic subtypes, always comparing a transcriptional reprogramming. Overall, we identified onco- more aggressive to a less aggressive subtype (Supplementary genic processes and spatial features that support nongenetic Fig. S1G). In this way, progressive changes from lepidic evolution as a driver of lung adenocarcinoma heterogeneity to papillary, acinar, and solid cases would result in all and progression. fold changes having the same sign: all positive for increas- ing expression/methylation or all negative for decreasing RESULTS expression/methylation with pattern progression (e.g., see the top differentially expressed genes RAP1GAP and ANLN; Molecular Interpatient Heterogeneity Fig. 1F). Indeed, concordant positive or negative gene of Histologic Patterns expression (Fig. 1G, top) or DNA methylation (Fig. 1G, bot- We first examined molecular features of 206 lung adeno- tom) fold changes were observed in the majority of the cases, carcinoma samples from The Cancer Genome Atlas (TCGA) suggesting that histologic patterns do not represent four cohort (5), which had been annotated based on their most independent molecular phenotypes, but rather a transition prevalent pattern: lepidic (n = 8), papillary (n = 47), acinar from lepidic to solid, driven by epigenetic and transcrip- (n = 86), and solid (n = 65; Supplementary Table S1). Sam- tional reprogramming. ples with the same most prevalent pattern had a similar rep- Differentially expressed genes and methylated gene pro- resentation of tumor stages (Supplementary Fig. S1A), but moters were enriched for similar functional categories (Sup- solid-prevalent tumors exhibited significantly higher TMB plementary Table S4). Indeed, genes overexpressed in lepidic (Fig. 1B), consistent with recent observations in an independ- compared with solid samples were enriched for cell differ- ent cohort (27), and number of copy-number alterations at entiation, development, and morphogenesis terms, whereas coding genes (Supplementary Fig. S1B). A notable excep- genes overexpressed in solid compared with lepidic cases tion to this trend was a lepidic-annotated tumor sample were highly enriched for cell proliferation and markers of (TCGA-44-7670). However, after reviewing the virtual slides immune infiltration (Fig. 1H). Transcriptional differences provided for this data set, we found highly different histo- were confirmed in the EAS data set (Supplementary Fig. logic patterns in the tumor region submitted for molecular S1H) and in an additional lung adenocarcinoma cohort profiling (01A-TS1) and the one submitted for pathology (ref. 31; Supplementary Fig. S1I). Similarly, promoter probes review (01Z-DX), suggesting that intratumor heterogene- that increased DNA methylation with pattern progression ity could explain this inconsistency (Fig. 1C). Predicted were enriched for genes involved in cell differentiation and neoantigens increased proportionally with the TMB (Sup- morphogenesis, whereas probes that lost methylation with plementary Fig. S1C), potentially predicting diverse immu- pattern progression were enriched for immune cell markers nogenicity among the histologic patterns. No recurrent (Fig. 1I), further suggesting that aggressive patterns are asso- genetic lesion (mutation, copy-number alteration, or gene ciated with changes in the tumor microenvironment. To cor- fusion) was found enriched in a specific pattern, except for a roborate this finding, we estimated the presence of distinct few PIK3CA mutations mostly occurring in lepidic samples nontumor cell populations from transcriptional data (32). (three of eight patients, adj. P = 0.0004) and a trend for a Lepidic samples were enriched for lung alveolar and epithelial higher fraction of TP53 mutations in solid samples (adj. P = markers, supporting a similar cell identity between lepidic 0.096; Supplementary Fig. S1D; Supplementary Table S2). cancer cells and normal lung tissue, whereas both lymphoid The association between solid-prevalent tumors and high and myeloid immune cell types were invariably enriched TMB and TP53 mutations was confirmed in an independent in acinar- and solid-prevalent samples, in both the TCGA data set of patients of East Asian ancestry (EAS) with lung (Fig. 1J) and EAS (Supplementary Fig. S1J) cohorts. Overall, adenocarcinoma (ref. 28; Supplementary Fig. S1E and S1F) these results indicated that lung adenocarcinoma pattern and in a recently analyzed clinical cohort (27). Conversely, progression is associated with a progressive reprogramming no association was found in these cohorts for PIK3CA muta- of both tumor cells and their microenvironment. However, tions. Additional analyses to test candidate weak drivers (29) molecular profiles analyzed so far were generated from single or alterations converging on the same pathway (30) did not tumor samples annotated by predominant pattern; hence, return significant hits that could be confirmed across data it remained unclear whether similar features and plasticity sets (Supplementary Table S3). Overall, our results suggest could be observed within individual tumors. limited associations between histologic patterns and lung adenocarcinoma genetic features. Molecular Intratumor Heterogeneity In contrast, TCGA samples with different prevalent pat- of Histologic Patterns terns exhibited highly diverse transcriptional and epigenetic To determine the molecular features of histologic pattern profiles (Supplementary Table S3), with at least 2-fold more progression within individual tumors, we selected a cohort differentially expressed genes or methylated probes than of 10 early-stage lung adenocarcinoma primary patient sam- expected by chance (Fig. 1D and E). Interestingly, the most ples that each exhibited at least two distinct patterns (CHUV

june 2021 CANCER DISCOVERY | OF4

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al.

A B FFPE tissue slides Normal (n = 10) Patient: 128910 34567 Acinar Pattern: (n = 10) EGFR KRAS STK11 RBM10 Lepidic Solid KEAP1 Histology-based (n = 12) Papillary (n = 3) IDH1 sample selection 1 mm (n = 4) & dissection Lung adenocarcinoma EGFR: H773dup E746_A750del L858R KRAS: G12C patients (n = 10) IDH1: R132L

Tr uncating mutation Missing data Histopathology-guided multiregion sampling Lepidic Papillary Acinar Solid C D

COL1A1 MMP9 MMP12 Enriched categories Pattern 4 4 ECM 4 Collagene degradation 2 2 2 MMP

Proliferation (Ki-67) d to lepidic) 0 0 0

−2 −2 −2 Immune response −4 B cell activation −4 −4 (normali ze mRNA expression differences mRNA expression Solid Solid Solid Lepidic Acinar Lepidic Acinar Lepidic Acinar Papillary Papillary Papillary TNXB EMP2 DLC1 4 4 4 2 2 2 Cell adhesion/ECM

Tissue development d to lepidic) 0 0 0 Tissue morphogenesis −2 −2 −2 −4 −4

(normali ze −4 1F 5B 8E 2B 6B 1E 9B 4B 3E 7B 3B 1B 8B 3C 4C 9C 1D 8D 2D 5C 6C 1C 3D 8C 2C 7C mRNA expression differences mRNA expression 10B 10C 10D −202 Solid Solid Solid Lepidic Acinar Lepidic Acinar Lepidic Acinar Lepidic Papillary Acinar Solid Z–scaled log (TPM) Papillary Papillary Papillary

SampleUpregulated in solid Upreulated in lepidic Samples from and concordant intratumor trend and concordant intratumor trend the same patient Discordant intratumor trend E Patient 1 Patient 2Patient 3Patient 4Patient 5Patient 6Patient 7Patient 8Patient 9Patient 10 Lepidic 0.8 0.7 Papillary 0.6 Acinar 0.5 Solid Immune score B C D 1D 1B 1E 1C 1F 2B 2C 2D 3B 3C 3E 3D 4C 4B 5B 5C 6B 6C 7B 7C 8C 8E 8D 8B 9C 9B 10 10 10

Figure 2. Intratumor heterogeneity among lung adenocarcinoma histologic patterns. A, Schematic representation of histopathology-guided multire- gion sampling: FFPE slides were reviewed for pattern identification, and tumor regions corresponding to a unique pattern were dissected and molecularly profiled (left). We have collected 29 tumor regions +( 10 adjacent normal tissue) from 10 primary lung adenocarcinoma samples (right). B, Occurrence of recurrent lung adenocarcinoma genetic mutations in molecularly profiled regions. Regions from the same patient are grouped together; patients are numbered (top) and histologic patterns are color coded (annotation bar). C, Heat map representation of differentially expressed genes among lung adenocarcinoma histologic patterns (rows, adjusted P < 0.001). Samples (columns) are identified by patient number followed by a letter correspond- ing to individual tumor regions. Histologic patterns are color coded. Cellular processes associated with significantly enriched gene sets are annotated on the right. D, mRNA expression differences among histologic patterns for a selected panel of ECM components and/or regulators (overexpressed in solid regions at the top, overexpressed in lepidic regions at the bottom). Expression values within each patient were normalized to the mean of the corresponding lepidic regions. Samples corresponding to same patient are connected by a dashed line and color coded based on concordance between intratumor differences and pattern progression. E, Immune score predicted within each tumor region for each patient. cohort; Supplementary Table S5) and performed histopa- predominantly clonal, i.e., observed in all regions, and not thology-guided multiregion sampling. For each patient, we associated with a specific pattern (Fig. 2B). In most cases, we reviewed and dissected tumor regions from formalin-fixed confirmed a trend for higher TMB in more advanced pat- paraffin-embedded (FFPE) tissue slides such that each region terns (Supplementary Fig. S2A). After accounting for patient- was composed by a unique pattern (Fig. 2A). In total, we col- specific features, differentially expressed genes and methylated lected 29 tumor regions and 10 normal tissue samples. These probes clustered together samples annotated for the same samples were processed by whole-exome sequencing, RNA pattern (Fig. 2C; Supplementary Fig. S2B; Supplementary sequencing (RNA-seq), and DNA methylation EPIC array Table S6). Transcriptional differences among patterns in our (see Methods). Lung adenocarcinoma driver mutations were cohort were consistent with those observed in the TCGA

OF5 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE and EAS cohorts (Supplementary Fig. S2C and S2D; Sup- whole transition from lepidic to solid (Fig. 3D). To explore plementary Table S7). Indeed, genes overexpressed in lepidic the origin of these transcriptional changes, we algorithmi- samples were enriched for tissue development and morpho- cally predicted which master transcriptional regulators (TR) genesis (Fig. 2C, blue cluster), whereas solid but especially were most likely to modulate differentially expressed genes acinar samples exhibited overexpression of immune infiltra- between lepidic and solid samples (39). Results in the TCGA tion markers, in particular of B cells (Fig. 2C, orange cluster). and our cohorts were extremely concordant (Fig. 3E) and Genes overexpressed in solid samples were more specifically identified, among solid master TRs, cell-cycle regulators such enriched for markers of cell proliferation and overexpression as E2F transcription factors, minichromosome maintenance of matrix metallopeptidase (MMP) genes (Fig. 2C, red clus- (MCM) complex components, which regulate DNA replica- ter). Both lepidic- and solid-associated genes were enriched tion and elongation, and the Forkhead Box M1 (FOXM1) for extracellular matrix (ECM) components and regulators transcription factor, which is a key regulator of cell prolifera- (Fig. 2C), albeit exerting opposite functions (33). Indeed, ECM tion and overexpressed in several cancer types (40). Among genes upregulated in solid samples were mostly enriched for lepidic master TRs, we found genes associated with tumor- ECM degradation (e.g., MMP genes) and collagen proteins suppressive functions, such as the circadian repressor CRY2, (e.g., COL1A1 and COL1A2), whose activation is known to which degrades the MYC oncogene (41), and the zinc-finger alter cell adhesion and promote invasion (34). Vice versa, transcription factor ZBTB4 (42), as well as transcription fac- ECM genes overexpressed in lepidic samples included several tors involved in cell differentiation and development, such as proteins mediating cell adhesion (ref. 35; e.g., TNXB, FBLN5, CASZ1 (43, 44), and the YAP repressor WWC1 (45, 46). In both and MFAP4), and putative tumor suppressors (e.g., DLC1, ref. the TCGA and our cohorts, lepidic master TRs and lepidic 36; and FOXF1, ref. 37). Importantly, expression of these genes cancer cell markers exhibited on average higher promoter was associated with pattern progression within individual DNA methylation in solid samples (Supplementary Fig. S3A), tumors (Fig. 2D). Similarly, immune infiltration predicted suggesting that downregulation of lepidic TRs and markers from gene expression (32) increased from lepidic to solid pat- is at least in part driven by epigenetic silencing. Interestingly, tern (Fig. 2E) within 8 of 10 patients, and intratumor patterns data from a high-throughput CRISPR knockout screening showed a different enrichment for markers of normal lung (47) revealed that, in lung adenocarcinoma cell lines, loss of tissue (enriched in lepidic) and immune cell markers (enriched TRs enriched in the solid pattern was largely deleterious and in acinar and solid; Supplementary Fig. S2E). Altogether, many, though not all, were classified as essential genes, due transcriptional and epigenetic differences observed among to their role on cell proliferation (Fig. 3F). Conversely, in the patients classified by predominant pattern paralleled expres- same cells, knockout of TRs enriched in the lepidic pattern sion and methylation changes observed within individual led to moderate effects on cell viability and sometimes even tumors. Importantly, these differences pointed to both tumor- improved cell fitness (Fig. 3F), consistent with a putative intrinsic (differentiation, migration, proliferation) and tumor- tumor-suppressive function. extrinsic (immune infiltration) processes as key determinants Next, we combined cancer-specific lepidic and solid markers of lung adnocarcinoma histologic patterns. to generate a unique mRNA signature and quantify lepidic- to-solid transition (L2S signature). L2S signature scores in Cancer Cell Plasticity Underlies TCGA and CHUV samples were consistent with patient and Pattern Progression intratumor classifications and pattern progression (Supple- To explore tumor-intrinsic features of pattern progres- mentary Fig. S3B and S3C) and, indeed, normal lung tissues sion, independent of the extent of immune infiltration, we had the lowest scores, followed by lepidic, papillary, acinar, analyzed single-cell RNA-seq data for three lung adenocarci- and finally solid samples, which on average had the highest noma samples (38). Differential expression analysis between scores. Interestingly, L2S scores correctly predicted the pat- tumor and nontumor cells allowed us to extract 2,410 genes tern of the misannotated TCGA sample (TCGA-44-7670; Fig. that were highly expressed only in tumor cells (cancer-specific 1C) and, unlike the classification based on predominant pat- genes, Fig. 3A; see Methods). First, we selected cancer-specific tern, it stratified TCGA samples in classes with significantly genes that were significantly differentially expressed between different prognosis (Fig. 3G; Supplementary Fig. S3D). This lepidic and solid tumor regions in our cohort (adj P < 0.1 signature gave us the possibility of estimating pattern pro- and absolute fold change > 2) to determine lepidic (n = 36) gression and assessing its prognostic value in a much larger and solid (n = 21) cancer cell markers (Fig. 3B). These genes ensemble of lung adenocarcinoma tumors, where transcrip- confirmed the enrichment for cell proliferation (solid) and tional profiles were available, but histopathology annotations differentiation (lepidic) terms (Fig. 3C; Supplementary were not. In total, we analyzed and scored >2,000 lung adeno- Table S8). Next, using these genes as cancer cell markers carcinoma human samples, from 10 patient cohorts (5, 23, of lepidic and solid patterns, we derived a transcriptional 28, 31, 48–51). Multivariate Cox regression confirmed that score for each single cancer cell to quantify their lepidic-like tumor stage and L2S scores were orthogonal and independent or solid-like transcriptional state (see Methods). Single-cell prognostic factors in all except one of the tested cohorts (i.e., transcriptional scores from these patient samples showed a cohorts comprising more than 100 patients; Fig. 3H; Sup- transition of states consistent with plastic reprogramming plementary Fig. S3E; Supplementary Table S1). Furthermore, (Fig. 3D): cells from sample S1 exhibited predominantly across all cohorts, L2S scores were strongly associated with lepidic features, sample S2 instead harbored tumor cells that the predicted activity of lepidic and solid TRs (Supplemen- lost lepidic markers and exhibited variable expression of solid tary Fig. S3F) and microenvironment composition (Fig. 3I). markers, and lastly sample S3 comprised cells spanning the Intriguingly, the highest correlation between L2S scores and

june 2021 CANCER DISCOVERY | OF6

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al.

A BCGene set enrichment Single-cell CHUV data set (−log (p), FDR < 0.05) RNA-seq 10

S1 FC) 024681012 2 Lung cancer good prognosis 3 Embryo development Cell adhesion Epithelium morphogenesis S2 2,410 lepidic genes

36 cancer-specific Tissue morphogenesis tumor-specific 0 genes Lung cancer bad prognosis

G2–M cell cycle S3 −3 Oxidoreductase activity Mitotic cell cycle

solid genes Regulation of cell death 3 LUAD primary3 LUAD human samples 21 cancer-specific Tumor cell change (log Lepidic/solid fold Nontumor cell 024 6

−log10 (P)

D EF) S1 S3 0.5 FOXM1 WWC1 5 CCNA2 CENPK TRIP13 Cor = 0.85 TOP2A NFIX MCM6 ZBTB4

MCM2 Fitness 0.25 MCM4 increase S2 2.5 ZNF217 0 0.20

0 0.15

PTTG1 Fitness WWC1 −0.5 decrease 0.10 −2.5 * * FOXM1 Lepidic-like signature Lepidic-like NFIX * CASZ1 FRY ZNF217 0.05 NR3C2 High activity in lepidic −1 High activity in solid * * ZBTB4CRY2 ZNF75D Lepidic TR −5 * * 0.05 0.10 0.15 0.20 0.25 −5 −2.5 0 2.557.5 * * Solid TR ANA gene dependency score (CRISPR KO anscriptional regulator (TR) activity score (TCGA) 1.5 − * Essential gene Solid-like signature Tr Transcriptional regulator (TR) activity score (CHUV) * AV *

d GH I

100 Lepidic-like score (bottom 25%) Lung Lungalveolar Endothelialepithelial cellsB cells cellsFibroblasts cellsTregsT cellsMonocytes CD4T cellsPlasma CD8Macrophages cellsEosinophilsCytotoxic M2Immune cellsNK cellsscoreMacrophagesDendriticNeturophils cellsT cellsMast gamma–delta MacrophagescellsT cells exhauste M1 Data set # Hazard ratio P LuMu (29) 80 Solid-like score (top 25%) TCGA (513) 0.0011 TCGA (513) Micke (106) 60 Chen (169) 0.47 Micke (106) 0.025 Chen (169) 40 0.021 Yo kota (226) Survival Yo kota (226) Beg (398) Beg (398) 1.5e–05 20 Shedden (443) Shedden (443) 0.00063 Pintilie (181) P-value = 7.9E-6 Pintilie (181) 0.03 0 Ding (38) 0 2,000 4,000 6,000 0.71 1 1.41 2 TRACERx (95) Time (days) Correlation between P < 0.01 TME signatures Not significant and L2S scores −1 0 1

Figure 3. Tumor-intrinsic features of lung adenocarcinoma (LUAD) histologic patterns. A, Schematic representation of single-cell RNA-seq analysis of three tumor samples from three patients. For each sample, differential expression analysis between tumor (red) and nontumor (gray) cells led to identifica- tion of 2,410 genes preferentially expressed in cancer cells. B, Volcano plot showing mRNA expression fold changes of cancer-specific genes between lepidic and solid tumor regions (log2–y-axis) and corresponding P values (−log10–x-axis). Significant genes are color coded (red, overexpressed in solid regions; blue, overexpressed in lepidic regions). C, Significantly enriched gene sets among genes overexpressed in lepidic regions (blue bars) and in solid regions (red bars). D, Scatterplot of single tumor cells from three patients scored by lepidic-like single-cell signature (y-axis) and solid-like single-cell signature (x-axis). Single cells are color coded by combined signature scores (main scatter plot) and separately shown for each patient sample (top right insets). E, TR activity scores obtained with the VIPER algorithm upon comparing lepidic and solid-annotated regions in the CHUV (x-axis) and TCGA (y-axis) cohorts. Significant TRs are color coded (red, solid associated; blue, lepidic associated) and the top scoring are labeled. F, Gene dependency scores obtained from the AVANA CRISPR screening data set. Negative (positive) values indicate fitness decrease (increase) upon gene knockout. Values for lepidic (blue) and solid (red) TRs are the mean obtained upon gene KO in lung adenocarcinoma cell lines. G, Overall survival difference (Kaplan–Meier curve) between TCGA samples within the top (red) and bottom (blue) quartiles of L2S scores. P value was computed by log-rank test. H, Hazard ratios associated with increasing values of the L2S signature score (10% increase) in seven independent lung adenocarcinoma data sets comprising >100 patients each (# column). P values were computed by multivariate Cox regression. The size of the dots is proportional to the number of sample. 95% confidence intervals are reported as horizontal lines. I, Cor- relation values between gene set mRNA expression scores for multiple cell types and L2S scores in 10 independent data sets. immune cell markers was with markers of T-cell exhaustion, the spatial composition of the tumor immune microenvi- suggesting that mechanisms of immune evasion occur in ronment in correspondence of different patterns. First, we tumor samples with solid pattern features. analyzed FFPE tumor tissue slides from our patient cohort and from three additional patients with solid patterns by The Tumor Microenvironment of Lung multicolor immunofluorescence to detect proliferating cells Adenocarcinoma Histologic Patterns (Ki-67+), B cells (CD20+), CD4+ and CD8+ T cells, and mac- The reproducible association between our L2S signature rophages (CD68+). We distinguished lung adenocarcinoma and immune infiltration across independent lung adenocar- patterns and tumor cells by hematoxylin and eosin (H&E) cinoma patient cohorts (Fig. 3I) prompted us to investigate staining (Fig. 4A) and TTF1 staining (Fig. 4B), respectively,

OF7 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE

A CD Patient 1

Pixel size: 10 to 500 µm Pixel size: 50 µm 2 mm Ki-67 1 mm CD8 1 mm E Lepidic Acinar B 3,000 Normal )

2 Lepidic Papillary 2,000 Acinar Solid

1,000

Cell density N/(mm 0 2 mm TTF1 CD20 1 mm CD4 1 mm Ki-67 CD4+ CD8+ B cells Macrophages T cells T cells (CD20) (CD68)

F G H Ki-67 0.8 0.75

CD4 ) P = 2.3E−6 2 CD20 0.5 0.6 0.25 0.4 0

TTF1-colocalization −0.25 . 0.2 −0.5 vs + (spearman correlation) TLS density N/(mm 0 −0.75

100 µm 200 µm TTF1

Solid Solid Normal Lepidic Acinar NormalLepidic Acinar Papillary Papillary

I Patient 8 J Tumor region boundary Tumor 8B (solid region) SOLID boundary

Core

−1 −0.5 100 µm 0 +0.5 2 mm 2 mm Ki-67 CD20 CD68 Distance from CD4 CD8 boundary (mm) Lepidic Papillary Acinar Solid

KL Tumor 8B (solid region) Tumor 7C (solid region) Tumor 8B (solid region)Tumor 11C (solid region) ) 2 CD4+ T cells CD8+ T cells 800 2,000 1,500 600 ) 1,600 500 ) 1,500 2 400 2 1,000 1,200 400 300 1,000 800 200 500 500 200 400 100 (N cells/mm (N cells/mm Mean cell density 0 0 Mean cell density 0 0 0 Mean cell density (N/mm

B cells (CD20) Macrophages (CD68) [0,0.5] [Core] [0,0.5] [Core] [0,0.5] [Core] [−0.5,0] [−0.5,0] [−0.5,0] [−1,−0.5] [−1,−0.5] [−1,−0.5]

) Boundary Core 500 ) Boundary Core Boundary Core 2

1,000 2 400 750

300 ) Tumor 12B (solid region)Tumor 13B (solid region) 200 500 2 100 250 1,000 (N cells/mm 0 (N cells/mm 600 + Mean cell density 0 Mean cell density CD8 T cells 750 + 400 CD4 T cells 500 B cells (CD20) 200 Macrophages 250 (CD68)

Mean cell density (N/mm 0

[0,0.5] [Core] [0,0.5] [Core] [−0.5,0] [−0.5,0] [−1,−0.5] [−1,−0.5] Boundary Core Boundary Core

Figure 4. Spatial immune profiles of lung adenocarcinoma histologic patterns.A and B, H&E (A) and TTF1 (B) staining of lung adenocarcinoma tissue sample (patient 1). Lepidic (blue) and acinar (orange) patterns are contoured. C, Multicolor immunofluorescence staining for an lung adenocarcinoma tissue sample (patient 1). Images show separately fluorescence staining for Ki-67 (top left), CD8 (top right), CD20 (bottom left), and CD4 (bottom right). D, Schematic representation of GridQuant: each image is binned into a grid with bins/pixels of variable sizes. In this study, we tested pixel sizes varying from 10 to 500 mm (left). Fluorescence intensity is then averaged for each bin. An example for CD8 fluorescence is shown (right). E, Box plot distribution of cell densities (number of cells per mm2, N/mm2) for cells that were positive for each of the tested antibodies (x-axis). For each antibody, cell density values are computed for each pixel, and values obtained for pixel from regions with a different histologic pattern are compared. Pixel size = 200 μm. F, Multicolor IF staining of TLS. G, Quantification of TLS density across different regions corresponding to a unique histologic pattern (colored points). H, Correlation between the number of TTF1+ and TTF1− cells within each pixel of a given region corresponding to a unique histologic pattern (colored points). Pixel size = 200 μm. I, H&E (left) and multicolor IF (center) staining of lung adenocarcinoma tissue sample (patient 8) and zoom-in of the IF stain- ing of the solid pattern (right). Histologic patterns are contoured and color coded. J, Schematic representation of spatial quantification based on distance from the tumor boundary (red line). Contoured regions define discrete subsets of pixels within a certain interval of distances from the tumor boundary. K, Spatial quantification based on distance from the tumor boundary for the solid region of patient 8 (8B). Signal intensities were averaged among pixels within a certain interval of distances from the tumor boundary (gray line) and at the tumor core, defined as> 1 mm inside the boundary. Pixel size = 100 μm. L, Spatial quantification based on distance from the tumor boundary for all solid regions. Pixel size= 100 μm.

june 2021 CANCER DISCOVERY | OF8

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al. and quantified fluorescent signal intensities (Fig. 4C) by core ROIs expressed high levels of cancer cell–specific markers designing a spatial grid quantification approach (GridQuant) (PanCK and EPCAM), Ki-67, and the interleukin 7 receptor that averaged fluorescence signals within pixels of variable (IL7R or CD127; Fig. 5C). Although IL7R can be expressed size (Fig. 4D; see Methods). These analyses revealed strik- by both tumor and immune cells, the low content of immune ing differences in the extent and geographic organization of cells in these ROIs suggested that IL7R was here expressed immune cell infiltration across lung adenocarcinoma pat- by cancer cells. Importantly, cancer cell expression of IL7R terns. Solid regions exhibited significantly stronger Ki-67 has been associated with poor prognosis in non–small cell intensity than the other patterns, whereas immune cell mark- lung cancer (54). Conversely, periphery ROIs exhibited high ers increased intensity with pattern progression but were expression of immune cell markers, including the immuno- highest in acinar regions (Fig. 4E). Interestingly, in several suppressive regulatory T-cell (Treg) marker TIM3 and the tumors we observed the formation of tertiary lymphoid struc- immune checkpoint VISTA (Fig. 5C). The highly different tures (TLS; Fig. 4F), sometimes characterized by a Ki-67– extent of immune infiltration at the core and periphery of positive core of proliferating B cells (Fig. 4F, right) resembling the solid tumor region challenged the possibility of com- germinal centers. TLS formation has been associated with paring features specific to either cancer cells or immune improved prognosis and response to immunotherapies (52, cells. To overcome this challenge, we profiled 36 additional 53); hence, we assessed their distribution across patterns in ROIs from three solid tumor regions, and, in each ROI, we our samples. We automatically identified all TLS in our slides separately analyzed immune cells (CD45+) and cancer cells and found that these were absent in normal lung tissue and (PanCK+; Fig. 5D; Supplementary Fig. S5B). By selectively lepidic cancer regions but prevalently observed within acinar retaining the signal coming from either one or the other regions and, less frequently, in papillary and solid regions cell population (see an example of the masking strategy on (Fig. 4G). Altogether, these results suggested that immune Fig. 5D), we first compared cancer cells at the core and infiltration increased with pattern progression but was maxi- periphery of solid tumor regions. Here, we found that cancer mal in acinar and not in solid patterns. cells at the periphery actually exhibited significantly higher Next, we investigated the spatial organization of the tumor levels of the proliferation marker Ki-67 than cancer cells at microenvironment in different patterns by assessing colocaliza- the core of the tumor, consistent with an invasive margin, as tion of tumor and nontumor cells. TTF1+ and TTF1− signals well as Pan-AKT and p53 (Fig. 5E; Supplementary Table S9). were positively correlated in normal lung and lepidic regions, Next, although immune infiltration was low at the core of likely due to the presence of cell-depleted lung alveolar struc- solid ROIs, specific comparison of CD45+ cells at the core tures, lacked correlation in papillary and acinar regions, but and periphery showed that immune cells infiltrated at the were highly anticorrelated in solid regions (Fig. 4H), consistent core of solid regions were significantly enriched for markers with low intermixing of cancer and noncancer cells. Similarly, of immunosuppressive effector Tregs, such as FOXP3, CD25, colocalization of immune cell markers and Ki-67, which here and TIM3, and immune checkpoints, such as CTLA4, VISTA, could be used to mark tumor cells (Supplementary Fig. S4A), and ICOS (Fig. 5F; Supplementary Fig. S5C). Overall, whereas was lowest at solid regions independent of the pixel size (Sup- all solid tumors exhibited features of immune exclusion, the plementary Fig. S4B). Consistent with these trends, we noticed residual immune infiltration was consistent with an immu- that lymphoid cells and macrophages localize at the boundary nosuppressive microenvironment associated in particular of solid regions within individual tumor slides (Fig. 4I). To with the presence of effector Tregs. quantify these observations, we used GridQuant to extract aver- age signal intensities at different distances from the periphery of each solid tumor region toward its core (Fig. 4J). In all cases, DISCUSSION the density of immune cells was higher at the periphery than Cancer heterogeneity across and within patients is appar- at the core of the tumor region (Fig. 4K and L), indicating that ent at both the molecular and histologic levels. However, the spatial distribution of immune cells in solid patterns was how and whether genomic features determine cell morphol- consistent with an immune-excluded phenotype. ogy and spatial organization is largely unexplored. Here, we To corroborate this evidence and explore in more detail reconciled molecular and histologic heterogeneity in lung the molecular profiles and immune microenvironment of the adenocarcinoma by introducing an approach based on histo- core and periphery of lung adenocarcinoma, we performed pathology-guided multiregion sampling. Our results showed digital spatial profiling (DSP; NanoString GeoMX) in five tis- evidence of nongenetic mechanisms of tumor evolution as sue slides from five patients. Briefly, in each slide, we selected determinants of histologic heterogeneity and disease progres- and analyzed with a panel of 58 antibodies 12 regions of sion. Indeed, progression from lepidic to solid histology was interest (ROI; n = 60 in total), located at either the core or associated with plastic reprogramming of differentiation cell periphery of different histologic patterns (Supplementary markers, increased cell proliferation, and a transition from Table S9; Supplementary Fig. S5A). ROI localization was not an immune desert (lepidic) to an inflamed (papillary and associated with immune infiltration, measured by either ratio especially acinar) and eventually excluded and suppressive of CD45-positive cells or protein expression, except for solid (solid) microenvironment (Fig. 6). Importantly, the transi- ROIs (Fig. 5A). Indeed, out of six solid ROIs from patient 8 tion of both cancer cell–intrinsic features and microenviron- (Fig. 5B), two were localized at the core of the tumor (R9 and ment composition was evident within individual tumors and R8) and had the lowest levels of immune infiltration, and matched intratumor pattern heterogeneity. four were selected at the tumor periphery (R6, R7, R10, R12) The concomitant evidence of plastic reprogramming of can- and all exhibited high immune infiltration (Fig. 5A). Solid cer cells and changing microenvironment prompts questions

OF9 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE

A BC Patient 8 0.6 Lepidic Lepidic PanCK 0.5 Papillary Papillary Acinar Acinar CD127 0.4 EPCAM Solid Solid 0.3 Ki-67 TIM3 0.2 VISTA 0.1 BCL2 CD4 0 2 mm Immuno-score (CD45 IF) CD8 Patient 8 Lepidic β2m PanCK CD3 HER2 R2 R4 R7 Papillary CD45 EPCAM R3 R1 R9 Acinar CD45RO CD45 R5 R8 R6 CD4 R10 Solid

CD3 R12 R9 R8 R7 R6 Normal tissue R10 R12 SNR-normalized protein expression R7 DSP-selected SNR-normalized protein (z-score) R9 expression (z-score) −202 R8 region of interest −202 Tumor core ROI Tumor periphery ROI R12 R10 R6 Tumor core ROI Tumor periphery ROI

DEROI PanCK mask CD45 mask F

Patient 12 (solid pattern) riphery Pe regions + regions + CD45 PanCK Core

PanCK CD3 p53 BAD BAD TIM3 Ki-67 ICOS CD40 CD25 CD14 CD68 VISTA B7.H3 DNA CD45 B7.H3 OX40L CTLA4 BCLXL FOXP3

+ + HLA.DR PanCK CD45 Pan.AKT core vs. core vs. SNR-normalized protein Tumor core ROI Patient 11 periphery periphery expression (z-score) Tumor periphery ROI Patient 12 −202 Patient 13

Figure 5. Digital spatial profiling of lung adenocarcinoma histologic patterns. A, Immuno-score of all tumor ROIs for five patients defined as fraction of CD45-positive cells by immunofluorescence analysis (top barplot; bars are color coded based on the pattern of the corresponding region). ROIs are annotated by tumor core or periphery localization (black and white circles, respectively) and by DSP protein expression of the top correlated protein with CD45+ immuno-score (bottom heat map). DSP values are normalized by SNR and by z-score for each patient. B, Top, H&E staining of lung adenocar- cinoma tissue sample (patient 8). Bottom, schematic diagram of patient 8 normal tissue regions and tumor histologic patterns (color coded) and selected tumor ROIs analyzed by DSP. C, Differentially expressed proteins between ROIs at the core of the solid region (R8 and R9) and at the periphery of the solid region (R6, R7, R10, and R12). Values are normalized by SNR and by z-score for each patient. D, Schematic representation of DSP analysis with ROI masks: ROIs at the core or periphery of the tumor (left) were first analyzed by IF for PanCK (green), CD3 (light blue), DNA (dark blue), and CD45 (red); next, CD45 and PanCK fluorescence was used to build two masks (white, selected; black, unselected) to selectively analyze either PanCK+ cells or CD45+ cells. E and F, Differentially expressed proteins between core (black circles) and periphery (white circle) ROIs exclusively comprising (E) PanCK+ cells or (F) CD45+ cells from solid tumor region of three patients (patient of origin is annotated on the left). Values are normalized by SNR and by z-score for each patient. on the origin of such changes. Does tumor cell dedifferentia- bility of genetically and therapeutically manipulating specific tion activate specific immune shaping and responses? Or are TRs driving lung adenocarcinoma progression. Moreover, dynamic changes of the tumor immune microenvironment cross-talks between the tumor microenvironment and changes triggering cancer cell plasticity? To address these questions, in cancer cell epigenetic features have been observed in the models of lung adenocarcinoma that mimic transcriptional, context of neoantigen presentation, cytokine production, epigenetic, and morphologic features of the human disease and epigenetic regulation of PD-L1 expression (59, 60). How- are required. Mouse models of non–small cell lung cancer ever, it is challenging to establish the relative timing and recapitulate some of the histologic patterns observed in the causative interactions between cancer cell reprogramming human disease (55, 56), but the molecular features of these and immune surveillance. Toward this goal, detailed single- patterns remain to be investigated. Interestingly, recent evi- cell spatial characterization of tumor molecular profiles from dence has shown that lung adenocarcinoma progression in a primary patient samples or longitudinal analysis of tumor genetically engineered mouse model (GEMM) is accompanied spatial features could inform and complement functional by plastic reprogramming driving cell dedifferentiation (57, assays in experimental models. 58). A detailed comparison of the cell state transitions that we Intriguingly, even by selecting intratumor regions char- observed in our human cohort with those in the lung adeno- acterized by a unique pattern, we did not find evidence of carcinoma GEMM will be important to investigate the possi- markers discriminating the four patterns into separate and

june 2021 CANCER DISCOVERY | OF10

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al.

Cell differentiation Migration/proliferation putational approaches able to exploit spatial information to ZBTB4 FOXM1 provide novel insight on interactions between tumor and non- WWC1 MMP tumor cells and on how such interactions shape cancer cell Alveolar/epithelial ECM degradation markers poorly differentiated identity. Overall, reconciling molecular and phenotypic het- erogeneity is a critical first step to understand and integrate genetic and nongenetic mechanisms of cancer evolution. Lepidic Papillary Acinar Solid

Cold Hot Excluded METHODS Tertiary Immune lymphoid infiltration Statistical Analyses structures Immune Details of all statistical analyses and tests performed and referred evasion Effector Tregs to in the main text and Methods are outlined in Supplementary Table S2. Standard statistical tests (χ2, Fisher, Kruskal–Wallis, t,

Lepidic Papillary Acinar Solid Wilcoxon rank-sum, correlation coefficients) were performed using the appropriate functions from R “stats” package. Dunn and Tukey post-hoc tests were performed with R packages “FSA” (v 0.8.22; Figure 6. Schematic representation of cancer cell (top) and microen- https://github.com/droglenc/FSA) and “multcomp” (v1.4-13; 69), vironment (bottom) evolution in the progression from lepidic to papillary, respectively. Multiple hypothesis corrections were made using the acinar, and at last solid patterns. Benjamini–Hochberg procedure. All analyses were performed imple- menting custom scripts in bash and R (v3.4 and 3.5) languages. independent classes. Instead, our results suggested a transition between two extreme states, lepidic and solid, with papillary and TCGA Data Set acinar as possible intermediate states. To quantify this transi- Molecular and clinical data for the TCGA lung adenocarcinoma cohort tion, we proposed a transcriptional signature (called L2S signa- (TCGA-LUAD) were downloaded from the Genomic Data Commons ture) derived from the comparison of pure lepidic and pure solid (GDC; ref. 70) and GDAC FireHose (https://gdac.broadinstitute. tumor regions. Importantly, L2S scores were independent pre- org/) repositories. The data set included somatic point mutations dictors of patients’ overall survival and immune infiltration in (whole-exome sequencing, MAF file version: mc3 v0.2.8; ref. 71), multiple independent lung adenocarcinoma cohorts and high- copy-number changes (Illumina SNP6 array, segmentation files, and lighted sample misannotations due to intratumor heterogene- gene-level copy number generated with GISTIC; ref. 72), gene expres- ity. With the expanding use of molecular profiling technologies sion profiled by RNA-seq (Illumina HiSeq, HTSeq raw counts, and for diagnostic purposes, signatures such as the one we identified FPKM-normalized values), DNA methylation data (Illumina Infin- ium HM450k array, beta values), H&E-stained images, and clinical could provide a complement to histopathology. In particular, data. Neoantigen counts were retrieved from https://gdc.cancer.gov/ intratumor pattern heterogeneity is prevalent in early-stage lung about-data/publications/panimmune (ref. 73; metric: “numberOf- adenocarcinoma. With the recent success and possible increased BindingExpressedPMHC”). Only primary and normal samples were adoption of screenings to detect the disease (61, 62), cases diag- considered; in case multiple samples for the same patient were nosed at an early stage are expected to increase. It will be critical available, the sample with the latest plate number was retained as to discriminate those more likely to progress or relapse after recommended by the GDAC guidelines. All data were generated treatment, surgery and/or radiotherapy, and better select those and processed by the TCGA research network (74). Consistent pre- needing adjuvant treatment and type of therapy. dominant histologic pattern annotation for a subset of tumors (N = In advanced/metastatic adenocarcinoma, immunotherapy 206) was obtained by merging clinical tables from GDC, GDAC, and as a single agent or in combination with other drugs is now Supplementary Table S1 from TCGA-LUAD 2014 manuscript (74). Patients ambiguously annotated to different patterns in different the treatment of choice in an important portion of cases (63). clinical tables were excluded from this pattern-annotated subset. Furthermore, immune-checkpoint inhibitors (ICI) in the neo- adjuvant setting are currently of great interest in non–small CHUV Data Set cell lung cancer, as shown by recent results with the PD-1 We retrieved from the database of the Pathology Institute of the inhibitor nivolumab, which led to major pathologic response Lausanne University Hospital resected stage I and II lung adenocar- in 45% of the patients (64). Additional clinical trials on early- cinoma. Thirteen adenocarcinomas were selected (Supplementary stage tumors are ongoing, and results are expected this year Table S5) fulfilling following criteria: (i) presence of at least two of (65, 66). As new data will become available, it will be inter- the main histologic patterns of lung adenocarcinoma (lepidic, acinar, esting to test the association between lung adenocarcinoma papillary, and solid); (ii) sufficient material available with appropriate histologic pattern composition or signature and response to surface and delineation of each pattern allowing microscopic dissec- ICI in both adjuvant and neoadjuvant setting. tion and material harvesting for molecular and imaging analyses. For The emergence of histologic heterogeneity with disease pro- every patient, an FFPE block of normal lung tissue distant from the gression is not a feature exclusively observed in lung adenocar- tumor was selected. Two to five tumor regions/FFPE blocks were also cinoma. Evidence of morphologic changes has been reported selected for every patient each with a specific dissectible pattern. For three patients (1, 3 and 8), two nonadjacent regions of the same pat- in several tumor types such as breast cancer (67) and hepa- tern were also selected for intratumor comparisons. Five- to 20-μm tocellular carcinoma (68). In these tumor types, integrating thick unstained slides were obtained from every selected block. Slides histopathology-guided multiregion sampling with molecular were dried and deparaffinized using Xylol and Ethanol (100% and profiling could prove to be an effective strategy to study 70%) and stained with Toluidine Blue 0.024% isopropanol 30%. Mac- the evolution of the disease. In particular, recently developed roscopic and microscopic dissection was performed, and unwanted spatial genomics technologies need to be coupled with com- tissue (nonpattern-specific and nontumor tissue) was removed from

OF11 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE the slides. Four-micrometer-thick slides were also taken before, after, and retained only if they satisfied all of the following conditions: and in between the dissected slides and stained with H&E to improve (i) GnomAD population frequency < 0.01, (ii) not being tagged morphologic control. Slides taken at the same time were subse- by Mutect2 filters panel_of_normals, artifact_in_normal, germline_ quently used for molecular and imaging analyses. Local Ethical Com- risk, str_contraction, multiallelic and clustered_events, (iii) variant mittee approval was obtained to perform all mentioned analyses, allele frequency in the tumor sample is at least twice greater than under authorization No. 2017-00334. the one in the normal sample, (iv) tumor depth is greater than 6 and Mutect2 filter value is “PASS” or, alternatively, the same variant Other Data Sets is shared by more than one tumor region of the same patient. The All additional data sets were generated and processed as described rationale behind these choices consists in not filtering out variants in the corresponding publications and are summarized in Supplemen- whose evidence is supported by independent regions (point iv), pro- tary Table S1. Chen/EAS (28) data set including somatic point muta- vided that they are not germline (points ii and iii). Known oncogenic tions, copy-number changes, gene expression profiled by RNA-seq, variants were manually verified and recovered using IGV genome H&E-stained images, and clinical data were downloaded from browser (83) and inspecting aligned RNA reads. OncoSG (75, 76). Ding (31) gene-expression data set was downloaded from the UCSC Xena browser (77). TRACERx (12) raw RNA-seq RNA-seq and Data Processing data for each tumor region were downloaded from the European RNA-seq was performed on 29 tumor regions by Genewiz with Genome-Phenome Archive (ID: EGAS00001003458) and processed a protocol optimized for FFPE samples, which involved ribosomal as described below in “RNA-seq and Data Processing”; lymph node RNA depletion and 2 × 150 bases paired-end sequencing with Illu- metastases were excluded; access was provided by the authors upon mina HiSeq. On average, the total number of reads per sample was request. Micke (48), Yokota (49), Beg (50), Shedden (51), and Pintilie around 65 million, with a mean quality score of 38 and 91% of bases (78) gene-expression data sets were downloaded from the Gene with quality ≥ 30. Sequencing reads were checked for quality using Expression Omnibus (accession numbers, respectively: GSE37745, FastQC v0.11.7, trimmed with TrimGalore v0.4.5 to remove Illu- GSE31210, GSE72094, GSE68465, and GSE50081). mina universal adapter contamination (parameters: -q 15 –phred33 –illumina –length 20 –paired –retain_unpaired -r1 24 -r2 24) and Whole-Exome Sequencing and Data Processing processed with RSEM (84) v1.3.0, performing the following steps: Whole-exome sequencing was performed on 27 tumor regions (i) alignment with STAR (85) v2.5.4b to (hg38) using and 9 adjacent normal lung specimens by Genewiz with a protocol GTF annotation file “gencode.v27.primary_assembly.annotation. optimized for FFPE samples. Briefly, genomic DNA samples were gtf” (downloaded from GATK resource bundle) and default RSEM fragmented into 200–500 fragments; libraries were pre- parameters; (ii) removal of reads mapping to tRNA and rRNA regions pared using the Agilent SureSelect Exome library preparation kit (retrieved using the UCSC table browser); (iii) estimation of isoform- and sequenced on Illumina HiSeq in High Output mode with a 2 × level and gene-level expression as transcripts per million (TPM) 150 bases paired-end sequencing configuration. On average, the total and RSEM-expected counts using rsem-calculate-expression. A batch number of reads per sample was around 98 million, with a mean qual- effect detected with principal component analysis was corrected in ity score of 38.42 and 92% of bases with quality ≥ 30. Sequencing reads the TPM expression matrix using the ComBat function implemented were checked for quality using FastQC v0.11.7, trimmed with Trim- in the R package sva (86) v3.30.1. Conversions between Ensembl IDs Galore v0.4.5 to remove Illumina universal adapter contamination and gene symbols were performed using BioMart (87). (parameters: -q 15 –phred33 –illumina –length 20 –paired –retain_ unpaired -r1 24 -r2 24) and aligned to human genome [hg38 build, DNA Methylation Array and Data Processing downloaded from GATK (79) resource bundle] using bwa-mem (80) Pattern-specific tumor samples were extracted as described in the v0.7.17. Duplicates were removed with Picard tool MarkDuplicates section “CHUV data set” and collected in deparafinization solution (v2.18.4). Reads were processed with GATK v4.0.3.0 Best Practices from the GeneRead DNA FFPE Kit (cat. #180134, Qiagen); DNA workflow (81) using the tools AddOrReplaceReadGroups, BaseRe- was extracted according to the manufacturer’s instructions. DNA calibrator, ApplyBQSR, and AnalyzeCovariates (see https://gatk. concentration was assessed using the Qubit High Sensitivity Assay, broadinstitute.org/hc/en-us/articles/360035535912 for details) and and DNA quality was monitored using the Infinium HD FFPE QC tagging known variant sites with the VCF files dbsnp_146.hg38. Assay (cat. #WG-321-1001, Illumina), according to the manufac- vcf, Mills_and_1000G_gold_standard.indels.hg38.vcf, af-only-gnomad. turer’s instructions. Samples for which at least 1 μg was available and hg38.vcf, Homo_sapiens_assembly38.known_indels.vcf (downloaded with good overall quality (Delta Cq from the Infinium HD FFPE QC from GATK resource bundle). The BAM files thus processed were Assay lower than 5) were selected for bisulfite conversion. Bisulfite indexed with samtools v1.6 and used for somatic variant calling. conversion was performed on 1 μg of DNA using the EZ DNA Methylation Kit (cat. # D5001, Zymo Research), using the alternative Variant Calling and Filtering protocol for CT conversion (optimized for the Illumina Infinium Somatic point mutations and short insertions–deletions for each Methylation Assay). FFPE restoration was then performed using the tumor region were called with GATK v4.0.3.0 tool Mutect2 using the Infinium HD FFPE DNA restore kit (cat. #WG-321-1002; Illumina), matched normal lung sample from the same patient (the only excep- and samples were processed using the Infinium MethylationEPIC tions were tumor regions of patient 8, for which variants were called 850k Kit (iGE3, University of Geneva). Raw signal intensities were in tumor-only mode) and a panel of normal samples. Additional processed, quantile-normalized, and converted into beta values using parameters used were “–af-of-alleles-not-in-resource ‘0.0000025’” and the R package minfi (88) v1.28.4. Probes were annotated using the “–disable-read-filter MateOnSameContigOrNoMappedMateReadFilter.” hg38 EPIC manifest file generated by Zhou and colleagues (ref. 89; Variants were filtered using FilterMutectCalls, CollectSequencingAr- version of September 2018) and filtered according to the correspond- tifactMetrics and FilterByOrientationBias (–artifact-modes “G/T” ing masking column. Probes mapping to sex and hav- and “C/T”); the latter was designed specifically to filter out transi- ing a detection P value above 0.01 were also removed, thus obtaining tions likely resulting from FFPE-related deamination of cytosines. a final set of 730,257 probes. A batch effect detected with principal Resulting VCF files were converted into MAF files using vcf2maf component analysis was corrected using the ComBat function imple- (https://github.com/mskcc/vcf2maf) v1.6.16 with default param- mented in the R package sva (86) v3.30.1. Probes were annotated to eters. Variants were then tagged with OncoKB (82) for oncogenicity FANTOM5 (90) gene promoters, and promoter methylation for each

june 2021 CANCER DISCOVERY | OF12

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al. gene was computed as the average beta value of all probes mapping Differential Methylation Analyses to the gene main promoter (p1 or p). Differential methylation analyses were performed on M-values, which were derived from beta values, on all probes and with the Copy-Number Alterations and Fusion Calling same pipeline as for expression data. Pairwise pattern comparisons Gene-level copy-number alterations were called from DNA meth- were performed with limma function decideTests. In the CHUV data ylation data using the R package conumee v1.16.0 and GISTIC (72) set, the patient corresponding to each tumor region was inserted v83 with default parameters. Gene fusions were called from adapter- as covariate in the limma model. For purely graphical purposes, in trimmed RNA-seq FASTQ files using STAR-Fusion (91) v2.6.1d with the heat maps the patient-specific batch effect was removed with default parameters. limma function “removeBatchEffect.” For TCGA, a null model was constructed by randomly permuting the assignment of patterns to Genetic Alteration Differences among Patterns samples and reperforming differential methylation analysis on 100 The fraction of genome altered for each sample was computed as random permutations. the fraction of genes having a gene-level GISTIC value equal to 2 or −2. Differences among patterns in terms of tumor mutational burden Analyses and fraction of genome altered were tested with Kruskal–Wallis test fol- Gene ontology analyses were performed on differentially expressed lowed by post hoc Dunn test. The potential confounding role of purity in genes and genes targeted by differentially methylated promoters the association between mutational burden and patterns was assessed using MSigDB (98) and Gene Ontology gene sets (ref. 99; Biological by performing a main-effects ANCOVA with the log-scaled number of Process and Molecular Function categories), using an FDR cutoff of mutations as dependent variable, purity (CPE values from ref. 92) as 0.01 and retrieving a maximum of 100 categories. continuous covariate, and histologic pattern as categorical independ- ent variable. In the TCGA pattern-annotated data set, two lists of driver Extraction of Lung Adenocarcinoma Tumor genetic alterations were inspected for differences among patterns: (i) a and Nontumor Markers published binary genomic alteration matrix (93), which included point Genes expressed preferentially in lung adenocarcinoma tumor cells mutations, copy-number changes, and gene fusions (results shown were extracted from a previously published lung cancer single-cell in figures); (ii) a list of drivers, which included “weak drivers” (as RNA-seq data set (38), which included three lung adenocarcinoma described in ref. 29) obtained by running FunSeq2 (94) algorithm as patients (two from the “discovery cohort” and one from the “valida- web service and retaining coding and noncoding variants with score tion cohort”), in the following way: (i) For the two “discovery cohort” ≥1.5. Driver alterations occurring in at least three samples were then patients, genes × cell expression matrix and cell IDs for each cell tested for differences among the four patterns with a χ2 test. Residuals type including cancer cells were downloaded from ArrayExpress (ID: were inspected in order to determine which pattern was enriched for E-MTAB-6149) and SCope repositories, and the expression matrix a given genetic event. P values were adjusted for multiple hypotheses was filtered such that only single cells coming from tumor regions with Benjamini–Hochberg procedure. FunSeq2 was also applied to the belonging to the two patients with lung adenocarcinoma were EAS data set, and the same downstream analysis was performed. For retained; (ii) for the tumor regions of the patient with lung adenocar- pathway-level analysis, driver alterations called by FunSeq2 were used cinoma in the “validation cohort,” only raw FASTQ files were avail- and the following steps were performed: (i) relevant pathways in cancer able (ArrayExpress ID: E-MTAB-6653), and thus they were aligned, and their gene components annotated as “oncogenes,” “tumor suppres- filtered, and processed as described in the Lambrecht and colleagues sor genes,” or “unknown” were retrieved from ref. 30; (ii) a pathway was study (38) using CellRanger v3.0.2 and Seurat v3.0.0; clusters were called “altered” in TCGA pattern-annotated data set if at least one gene detected with Seurat function “FindClusters” (resolution = 0.5) and was altered in FunSeq2 calls or the gene harbored GISTIC-based deep 12 of 16 of them could be assigned to the main immune/stromal cell deletion (in case of “tumor suppressor genes” or “unknown”) or ampli- types using the expression of the markers reported in Supplementary fication (in case of “oncogenes” or “unknown”); (iii) pathway altera- Fig. S1 of (38); of the remaining four clusters, two showed increased tions were then tested for differences among the four patterns with a expression of several keratin genes and they were thus assigned to χ2 test, residuals were inspected in order to determine which pattern cancer cells; (iii) for each of the three patients separately, markers for was enriched for a given altered pathway, and P values were adjusted for cells classified as “cancer” were extracted using the function “Find- multiple hypotheses with the Benjamini–Hochberg procedure. Markers” (with min.pct = 0) of Seurat (100); (iv) markers were further

filtered to have adjustedP < 0.05 and average log2(fold change) > Differential Expression Analyses 0.25; (v) in order to better accommodate for interpatient heterogene- Differential expression analyses among patterns were performed ity in tumor cell expression, the union of the three lists of patient- on RNA-seq read counts (HTSeq counts for TCGA and RSEM specific cancer markers was extracted as the final list. This procedure expected counts for Chen and CHUV data sets) using R packages yielded a list of 2,410 cancer-specific genes. Markers of lung alveolar limma v3.38.3 (95) and edgeR (96) v3.24.3 with a standard published and epithelial cells were extracted with FindMarkers applied to the pipeline (97). Only genes expressed (counts per million > 1) in at least lung adenocarcinoma samples of the “discovery cohort” data set 3 (CHUV) or 50% (TCGA and Chen) of samples of any pattern were using more stringent thresholds (adjusted P < 0.0001 and average tested. P values were adjusted using the Benjamini–Hochberg FDR- log2(fold change) > 10) in order to increase specificity. controlling procedure. Pairwise pattern comparisons were performed with limma function decideTests. In the CHUV data set, the patient Quantification of Immune Cell Infiltration corresponding to each tumor region was inserted as covariate in Methylation-based immune cell infiltration fractions for TCGA- the limma model. For purely graphical purposes, in the heat maps LUAD samples were downloaded from a previous study (92). Bulk the patient-specific batch effect was removed with limma function RNA-seq deconvolution was performed with consensusTME (32) “removeBatchEffect.” For TCGA, a null model was constructed by implemented in the corresponding R package (v 0.0.1.9000, parameters: randomly permuting the assignment of patterns to samples and cancer = “LUAD,” statMethod = “ssgsea”). Three cell types were added reperforming differential expression analysis on 100 random permu- to the available ones (T cells exhausted, lung epithelial cells, and lung tations. Differential expression analysis on the Ding data set was per- alveolar cells), and their score in each sample was quantified in the same formed on microarray gene-expression profiles between samples with way as done by consensusTME, namely, by computing single sample ≥50% (solid-like) and <50% (lepidic-like) of solid pattern prevalence. gene set enrichment analysis [ssgsea, implemented in GSVA (101) R

OF13 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE package v1.30.0 with ssgsea.norm = T] using markers of each of these corrected values were not within [0,1] were discarded. The rationale cell types. Markers for T-cell exhaustion used were PD-1, PD-L1, LAG3, behind step (i) was to select probes for which a reliable estimate of TIGIT, PD-L2, B7-H3 (CD276), HAVCR2 (TIM3), CD244, CTLA4, leukocyte methylation could be obtained. The conservative choices CD160 (ref. 102; https://www.rndsystems.com/product-highlights/ at steps (i) and (iii) resulted in a decrease of the number of probes for adoptive-cell-transfer-monitor-t-cell-exhaustion); markers of lung epi- which tumor cell–specific methylation could be computed (571,349 thelial and alveolar cells were extracted as described above in “Extrac- for the CHUV and 91,228 for the TCGA data sets), with the advan- tion of Lung Adenocarcinoma Tumor and Nontumor Markers.” tage, however, of a higher reliability for the purity correction. Dif- ferential methylation analyses (as described above) were performed Construction and Scoring of Lepidic-to-Solid Signature on purity-corrected methylation profiles to assess methylation differ- ences shown in Supplementary Fig. S3A. A differential expression analysis restricted to the 2,410 lung adeno- carcinoma cancer-specific genes was performed between lepidic and solid samples of the CHUV data set. Differentially expressed genes Master Transcriptional Regulator Activity Analysis were extracted by filtering for adjustedP < 0.1 and absolute log2(fold Virtual Inference of Protein-activity by Enriched Regulon analysis change) > 1, thus obtaining 36 cancer-specific lepidic markers and 21 (ref. 39; implemented in the “viper” R package, v1.16.0) was used cancer-specific solid markers. For bulk RNA-seq data sets, sample- to estimate TR activity, following guidelines contained in the user wise enrichment scores for these markers were computed using manual. Briefly, “msviper” function takes as input (i) a coexpression the singscore (103) R package (v1.0.0). Singscore outputs a unified regulatory network estimated with ARACNe methodology (106), score for the complete signature (“TotalScore”) as well as scores for for which an lung adenocarcinoma-specific regulatory network was the upregulated (lepidic) and downregulated (solid) genes separately. downloaded as a Bioconductor package (DOI: 10.18129/B9.bioc. The sign of scores relative to the lepidic marker set was changed for aracne.networks); (ii) a gene-expression signature generated with graphical reasons (such that lepidic-like samples would harbor higher viper function “rowTtest” between two biological conditions; and lepidic marker signature scores). For the single-cell RNA-seq data set, (iii) a null model obtained with viper function “ttestNull” and 1,000 a different strategy was adopted in order to account for the high drop- random permutations. The output is a list of TRs driving the bio- out rate of single-cell profiles: (i) cancer cells from the three patients logical conditions of interest and their corresponding enrichment with lung adenocarcinoma extracted as reported above in “Extraction P value, FDR, and normalized enrichment score. VIPER analyses were of Lung Adenocarcinoma Tumor and Nontumor Markers” were fur- thus performed on lepidic and solid samples of CHUV and TCGA ther filtered to retain only cells with more than 2,000 genes expressed; data sets separately, restricting the gene-expression matrix in input to (ii) a differential expression analysis was performed between the 10 contain only cancer-specific genes (see “Extraction of Lung Adenocar- most lepidic-like TCGA patients (i.e., harboring the lowest “Total­ cinoma Tumor and Nontumor Markers” above). Lastly, TR activity Score” computed as described above) and the 10 most solid-like TCGA was estimated in each single sample of all data sets using the “viper” patients (with highest “TotalScore”), restricting the set of genes tested function again restricted to cancer-specific genes. to the list of 2,410 cancer-related genes; (iii) differentially expressed genes were again extracted by filtering for adjustedP < 0.1 and absolute Survival Analyses log2(fold change) > 1, in this case obtaining 279 cancer-specific lepidic Survival analyses were performed using the R package survival v markers and 215 cancer-specific solid markers, due to the higher sta- 2.44-1.1. Cox regression models included sex, age, and stage (numeric) tistical power achieved with increased sample size (most of the lepidic as covariates. Signature scores were recalibrated such that hazard and solid markers obtained from the CHUV data set as reported above ratios represented the effect of a 10% increase of the signature value were among these augmented lepidic and solid marker lists, and none from its theoretical minimum (−1) to its theoretical maximum (+1). was found in the wrong list); (iv) this augmented signature was used to score single cells obtained at point (i) for lepidic-like or solid-like features using AUCell (ref. 104; with default parameters), a tool specifi- Analyses of Genetic Dependency Screenings cally designed for scRNA-seq data sets. CRISPR-based gene dependency scores and gene expression of cell lines were downloaded from the DepMap portal (https://depmap.org/ Purity Correction of DNA Methylation Profiles portal/download/, CRISPR/Avana: “Achilles_gene_effect.csv,” expres- sion: “CCLE_expression.csv”) along with cell line annotations. Only Tumor purity was estimated from DNA methylation profiles in the cell lines with lineage_sub_subtype = NSCLC_adenocarcinoma were CHUV data set using the Lump algorithm (92). Briefly, we retrieved investigated. Essential genes were also retrieved from the DepMap portal DNA methylation of purified leukocytes profiled with Illumina 850k (union of “common_essentials.csv” and “Achilles_common_essentials. EPIC array (105), and probes with a beta value below 0.1 in all leu- csv”). TRs that were differentially active in lepidic versus solid in the kocyte samples were intersected with probes having a beta value CHUV and TCGA cohorts were tested for dependency. above 0.85 in the top seven TCGA purest samples as estimated by Lump, which yielded a set of 24 probes methylated in tumor cells and unmethylated in immune cells. Finally, purity in heterogeneous H&E, TTF1, and Multicolor Immunofluorescence Staining samples was computed as mean beta value of these 24 probes. Purity H&E staining was performed with standard protocol to retrieve estimates were very correlated with ConsensusTME’s immune scores the histologic patterns on all samples that were used for molecu- (Spearman r = 0.87), and different thresholds yielded consistent lar or imaging assays. TTF1 staining was performed using mouse results. Next, in order to estimate tumor cell–specific methylation, anti-TTF1 antibody, clone 8G7G3/1 (Invitrogen 18-0221) on Roche- beta values observed for a mixture of tumor and immune cells were Ventana Benchmark Ultra using the following protocol: (i) retrieval modeled as a linear combination of tumor and immune cells beta in Ventana Cell Conditioning 1 solution at 95°C for 64 minutes; (ii) values, i.e., for probe i: βobserved[i] = pβtumor[i] + (1 − p) βimmune[i], where incubation for 32 minutes at 37°C, with the 8G7G3/1 being diluted p is the purity computed as described. βtumor was estimated in the 1/15; (iii) application of Ventana ultraView Universal DAB Detection following way: (i) variance of beta values across all leukocyte samples Kit followed by Ventana Hematoxylin as a nuclear counterstain. The was computed for each probe, and probes showing a variance greater multicolor immunofluorescence assay was performed using the Ven- than 0.01 were discarded; (ii) βimmune for the remaining probes was tana Discovery ULTRA automate (Roche Diagnostics). All steps were computed as the mean beta value in leukocyte samples; (iii) βtumor was performed automatically with Ventana solutions except if mentioned. thus computed with the equation above, and probes whose purity- Dewaxed and rehydrated paraffin sections were pretreated with heat

june 2021 CANCER DISCOVERY | OF14

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al. using the CC1 solution for 40 minutes at 95°C. Primary antibodies uncertain transitioning patterns or the external cut delimiting the were applied and revealed sequentially either with a rabbit Immpress end of the slide were removed). The first subregion was constituted by HRP (ready to use; Vector Laboratories) or a mouse Immpress HRP the pixels displaced at a distance between 0 and 0.5 mm, the second (ready to use; Vector Laboratories) followed by incubation with a at a distance between 0.5 and 1 mm, and the third was composed of fluorescent tyramide. A heat denaturation step was performed after all the remaining internal pixels (core). Moreover, an external region every revelation. The primary antibody sequence was: rabbit anti-CD8 with pixels at a distance between 0 and 0.5 mm outside the boundary (clone: Sp57, fluorophore: R6G), rabbit anti-CD4 (clone: Sp35, fluo- was also considered (Fig. 4J). The mean and distribution of densities rophore: DCC), mouse anti-CD68 (clone: KP-1, fluorophore: R610), of each cell type were then computed across all pixels falling within rabbit anti-Ki-67 (clone: 30-9, fluorophore: Cy5) and mouse anti-CD20 each of these four regions. (clone: L26, fluorophore: FAM). All sections (H&E, TTF1, and multi- color immunofluorescence) were mounted with FluoromountG (Bio- In Situ Detection and Quantification of concept) and scanned at 20× magnification using an Olympus VS120 Tertiary Lymphoid Structures (TLS-Finder) whole slide scanner equipped with specific filters. TLSs were modeled as clusters of B cells detected with immunofluo- rescence staining. An automatic detection pipeline (named TLS-finder) Spatially Resolved Cell Quantification was developed to quantify their presence across histologic patterns. TLS- Framework (GridQuant) finder involved the following steps: (i) GridQuant framework was used H&E, TTF1, and multicolor immunofluorescence-stained images to generate high-resolution (pixel size = 20 mm) matrices of B-cell counts; were visualized using QuPath (107) v0.2.0 software. Histologic pat- (ii) each matrix was imported in Fiji (108) image analysis software as text terns were assessed and drawn with QuPath on H&E images and image; (iii) B-cell count matrices were binarized using Fiji “Convert to annotations were then transferred with minor adjustments to the mask” function; (iv) TLSs were modeled as connected components of nearby adjacent TTF1-stained and immunofluorescence-stained the binarized B-cell matrices, which were detected and labeled using Fiji images. Regions that were too small were discarded. Next, quanti- plugin “Find connected regions,” with a minimum number of pixels to fications and downstream statistical analyses were performed by call a connected component set to 25 (corresponding to a minimum TLS developing a gridding framework named GridQuant, which involved size of 10,000 mm2); (v) Labeled TLSs were then processed with R scripts the following steps for each image: (i) automated cell detections on for downstream statistical analyses. The density of TLSs across patterns QuPath with available algorithms; (ii) acquisition of the coordinates was computed as the number of distinct TLSs divided by the area of each of histologic pattern boundaries drawn on the image; (iii) setup of a pattern-annotated region (Fig. 4G). pixel grid spanning the entire image, with tunable grid spacing (pixel size); (iv) for each cell type, summarization of cell detections at the Digital Spatial Profiling pixel level as counts of the number of cells whose centroid fell within Two runs of highly multiplexed and spatially resolved proteomic each pixel of the grid, thus obtaining one matrix for each cell type profiling of tumor and immune cells were performed with the GeoMx and for each pixel size considered; (v) various downstream spatially Digital Spatial Profiler (NanoString) as previously described (52) resolved statistical analyses across cell types and histologic patterns on five (batch1) and three (batch2) FFPE tissue sections. In batch1, (described in the following paragraphs). For TTF1-stained images, immunofluorescence assays were performed using antibodies against cell detection algorithm used on QuPath was “positive cell detection” CD3, CD20, CD45, and DAPI, and the multicolor images were used and cell types investigated were TTF1-positive and TTF1-negative. to guide the selection of 12 ROIs for each slide; for each ROI, digital For multicolor immunofluorescence-stained images, “Watershed cell counts from barcodes corresponding to protein probes (52 immune detection” algorithm was used to detect macrophages (CD63+ cells), and tumor-related proteins; Supplementary Table S9) were obtained CD4 T cells (CD4+ cells), CD8 T cells (CD8+ cells), B cells (CD20+ using nCounter (NanoString). In each ROI, automated cell detection cells), and proliferating cells (Ki-67+ cells). In order to accommodate was performed on the immunofluorescence images to count nucle- for potential variability in signal intensities among cell types and ated cells (DAPI-positive) and, among them, CD45-positive cells. slides, the parameters of these cell detection algorithms were tuned Immune ratio was computed as the ratio between the number of with visual inspection in each slide and for each cell type. Steps (i) CD45-positive cells and DAPI-positive cells. In batch2, Pan-cytokera- and (ii) of GridQuant were implemented in Groovy programming tin (PanCK), CD45, CD3, and DAPI antibodies were used; from each language; steps (iii), (iv), and (v) were implemented in R v3.5. ROI, two areas of interest (AOI) were extracted with image segmenta- tion, one containing pixels positive for PanCK (tumor compartment) GridQuant: Cell Density across Patterns and one for CD45 (immune compartment); digital counts were then Cell densities for each pixel were computed as the pixel counts divided obtained for each AOI separately (73 immune and tumor-related pro- by the pixel area. Distributions of densities across patterns were derived teins; Supplementary Table S9); in PanCK-positive and CD45-positive aggregating all pixels annotated to each pattern across all slides. AOIs, only tumor-related and immune-related proteins, respectively, were tested in downstream analyses. Levels of three housekeeping pro- GridQuant: Cell Type Colocalizations across Patterns teins (GAPDH, histone H3, and S6) and three negative controls (Ms IgG1, Ms IgG2a, and Rb IgG) were also measured. Digital counts for For each pattern-annotated region, the colocalization between cell each protein were normalized with internal spike-in controls (ERCC) types X and Y was computed as the Spearman correlation coefficient and signal-to-noise ratio (SNR), i.e., the ratio between the ERCC- between X and Y densities across all pixels falling within the region normalized counts of the protein and the geometric mean of the nega- boundary. Pixels having total cell densities (summing densities of all tive controls assayed in the ROI/AOI considered. In all downstream available cell types) below the fifth percentile or below 200 N/mm2 were analyses, only proteins having SNR>2 in at least three ROIs/AOIs discarded as likely representing empty regions corresponding to alveoli. were tested. ROIs/AOIs were annotated according to their histologic pattern and location with respect to the pattern boundary (center/ GridQuant: Solid Pattern Boundary Analysis core or periphery) using adjacent H&E-stained tissue sections. Some Regions annotated to the solid pattern were partitioned into three ROIs were also taken from TLS but were not used for the analyses internal subregions according to their distance from the bound- presented in this study. Differences between locations in solid pattern ary delimiting solid-annotated region and other patterns, including regions were tested with t test (batch1, one solid pattern region and six the normal lung (boundaries between the solid-annotated region and ROIs) and with two-way ANOVA controlling for sample and testing

OF15 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE separately AOIs from immune and tumor compartments (batch2, 2. Travis WD, Brambilla E, Nicholson AG, Yatabe Y, Austin JHM, three samples with one solid pattern region in each of them, 26 total Beasley MB, et al. The 2015 World Health Organization Classification AOIs for each tissue compartment). of lung tumors: impact of genetic, clinical and radiologic advances since the 2004 classification. J Thorac Oncol 2015;10:1243–60. Data and Code Availability 3. Subramanian J, Govindan R. Lung cancer in never smokers: a review. J Clin Oncol 2007;25:561–70. All data sets analyzed and the corresponding accession numbers 4. Samet JM, Avila-Tang E, Boffetta P, Hannan LM, Olivo-Marston S, are reported in Supplementary Table S1. Data generated in this study Thun MJ, et al. Lung cancer in never smokers: clinical epidemiology have been deposited in two Zenodo repositories, one containing raw and environmental risk factors. Clin Cancer Res 2009;15:5626–45. images for H&E, TTF1, and immunofluorescence staining and DSP 5. The Cancer Genome Atlas Research Network. Comprehensive molec- data (DOI: 10.5281/zenodo.3941450) and one containing processed ular profiling of lung adenocarcinoma. Nature 2014;511:543–50. molecular data (somatic mutations, gene-expression tables, and DNA 6. Rosell R, Carcereny E, Gervais R, Vergnenegre A, Massuti B, Felip methylation beta values; DOI: 10.5281/zenodo.4443496). Source E, et al. Erlotinib versus standard chemotherapy as first-line treat- code for GridQuant and TLS-finder pipelines is available in two pub- ment for European patients with advanced EGFR mutation-positive lic GitHub repositories (https://github.com/CSOgroup/GridQuant non-small-cell lung cancer (EURTAC): a multicentre, open-label, and https://github.com/CSOgroup/TLS-finder). randomised phase 3 trial. Lancet Oncol 2012;13:239–46. 7. Solomon BJ, Mok T, Kim D-W, Wu Y-L, Nakagawa K, Mekhail T, et al. Authors’ Disclosures First-line crizotinib versus chemotherapy in ALK-positive lung cancer. D. Tavernari reports grants from Fondation Emma Muschamp N Engl J Med 2014;371:2167–77. during the conduct of the study. E. Dheilly reports working for Ichnos 8. Peters S, Camidge DR, Shaw AT, Gadgeel S, Ahn JS, Kim D-W, et al. Sciences, a pharma company, for the last 11 months. The transition Alectinib versus crizotinib in untreated ALK-positive non–small-cell lung cancer. N Engl J Med 2017;377:829–38. from academia to the company occurred after the contribution to this 9. Shaw AT, Ou S-HI, Bang Y-J, Camidge DR, Solomon BJ, Salgia R, work. M. Mina reports working for SOPHiA GENETICS, a biomedi- et al. Crizotinib in ROS1-rearranged non–small-cell lung cancer. cal company, for the last nine months. The transition from academia N Engl J Med 2014;371:1963–71. to the company occurred after the contribution to this work. S. Peters 10. Planchard D, Besse B, Groen HJM, Souquet P-J, Quoix E, Baik CS, reports personal fees from AbbVie, Amgen, AstraZeneca, Bayer, et al. Dabrafenib plus trametinib in patients with previously treated Beigene, Biocartis, Boehringer Ingelheim, BMS, Clovis, Daiichi Sankyo, BRAF(V600E)-mutant metastatic non-small cell lung cancer: an Debiopharm, Eli Lilly, Roche/Genentech, Foundation Medicine, Illu- open-label, multicentre phase 2 trial. Lancet Oncol 2016;17:984–93. mina, Incyte, Janssen, Medscape, MSD, Merck Seono, Merrimack, 11. de Bruin EC, McGranahan N, Mitter R, Salm M, Wedge DC, Yates L, Novartis, Pharmamar, Phosphopatin Therapeutics, Pfizer, Regeneron, et al. Spatial and temporal diversity in genomic instability processes Sanofi, Seattle Genetics, and Takeda outside the submitted work. No defines lung cancer evolution. Science 2014;346:251–6. other disclosures were reported. 12. Jamal-Hanjani M, Wilson GA, McGranahan N, Birkbak NJ, Watkins TBK, Veeriah S, et al. Tracking the evolution of non–small-cell lung Authors’ Contributions cancer. N Engl J Med 2017;376:2109–21. D. Tavernari: Conceptualization, data curation, software, formal 13. Dagogo-Jack I, Shaw AT. Tumour heterogeneity and resistance to analysis, visualization, methodology, writing–original draft, writing– cancer therapies. Nat Rev Clin Oncol 2018;15:81–94. review and editing. N. Riggi: Data curation, funding acquisition, and 14. Turke AB, Zejnullahu K, Wu Y-L, Song Y, Dias-Santagata D, Lifshits E, et al. Pre-existence and clonal selection of MET amplification in investigation. E. Oricchio: Data curation, supervision, and investiga- EGFR mutant NSCLC. Cancer Cell 2010;17:77–88. tion. I. Letovanec: Conceptualization, data curation, supervision, 15. Hayes DN, Monti S, Parmigiani G, Gilks CB, Naoki K, Bhattacharjee funding acquisition, and investigation. G. Ciriello: Conceptualiza- A, et al. Gene expression profiling reveals reproducible human lung tion, formal analysis, supervision, funding acquisition, visualization, adenocarcinoma subtypes in multiple independent patient cohorts. writing–original draft, project administration, writing–review and J Clin Oncol 2006;24:5079–90. editing. E. Battistello: Data curation and investigation. E. Dheilly: 16. Dietz S, Lifshitz A, Kazdal D, Harms A, Endris V, Winter H, et al. Data curation and investigation. A.S. Petruzzella: Data curation Global DNA methylation reflects spatial heterogeneity and molecular and investigation. M. Mina: Data curation and formal analysis. evolution of lung adenocarcinomas. Int J Cancer 2019;144:1061–72. J. Sordet-Dessimoz: Resources. S. Peters: Resources. T. Krueger: 17. Sharma A, Merritt E, Hu X, Cruz A, Jiang C, Sarkodie H, et al. Non- Resources. D. Gfeller: Data curation and formal analysis. genetic intratumor heterogeneity is a major predictor of phenotypic heterogeneity and ongoing evolutionary dynamics in lung tumors. Acknowledgments Cell Rep 2019;29:2164–74. We would like to thank Leila Akkari and Luca Magnani for critical 18. Kim N, Kim HK, Lee K, Hong Y, Cho JH, Choi JW, et al. Single-cell RNA reading of the manuscript and insightful discussions. This work was sequencing demonstrates the molecular and cellular reprogramming supported by the Gabriella Giorgi-Cavaglieri foundation (G. Ciriello), of metastatic lung adenocarcinoma. Nat Commun 2020;11:2285. University of Lausanne Interdisciplinary Research Grant (G. Ciriello, I. 19. Hua X, Zhao W, Pesatori AC, Consonni D, Caporaso NE, Zhang T, et al. Genetic and epigenetic intratumor heterogeneity impacts Letovanec, and N. Riggi), the Swiss National Science Foundation (SNSF; prognosis of lung adenocarcinoma. Nat Commun 2020;11:2459. grant 310030_169519; G. Ciriello and D. Tavernari), and Emma 20. Ghorani E, Reading JL, Henry JY, de Massy MR, Rosenthal R, Turati Muschamp foundation (D. Tavernari). V, et al. The T cell differentiation landscape is shaped by tumour mutations in lung cancer. Nat Cancer 2020;1:546–61. Received September 2, 2020; revised December 21, 2020; accepted 21. AbdulJabbar K, Raza SEA, Rosenthal R, Jamal-Hanjani M, Veeriah S, January 22, 2021; published first February 9, 2021. Akarca A, et al. Geospatial immune variability illuminates differen- tial evolution of lung adenocarcinoma. Nat Med 2020;26:1054–62. 22. Joshi K, de Massy MR, Ismail M, Reading JL, Uddin I, Woolston A, et al. Spatial heterogeneity of the T cell receptor repertoire reflects References the mutational landscape in lung cancer. Nat Med 2019;25:1549–59. . 1 Campbell JD, Alexandrov A, Kim J, Wala J, Berger AH, Pedamallu CS, et al. 23. Rosenthal R, Cadieux EL, Salgado R, Bakir MA, Moore DA, Hiley Distinct patterns of somatic genome alterations in lung adenocarcino- CT, et al. Neoantigen-directed immune escape in lung cancer evolu- mas and squamous cell carcinomas. Nat Genet 2016;48:607–16. tion. Nature 2019;567:479–85.

june 2021 CANCER DISCOVERY | OF16

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

RESEARCH ARTICLE Tavernari et al.

24. Klemm F, Maas RR, Bowman RL, Kornete M, Soukup K, Nassiri S, 45. Knight JF, Sung VYC, Kuzmin E, Couzens AL, de Verteuil DA, et al. Interrogation of the microenvironmental landscape in brain Ratcliffe CDH, et al. KIBRA (WWC1) is a metastasis suppressor gene tumors reveals disease-specific alterations of immune cells. Cell 2020; affected by chromosome 5q loss in triple-negative breast cancer. Cell 181:1643–60. Rep 2018;22:3191–205. 25. Travis WD, Brambilla E, Geisinger KR. Histological grading in lung 46. Wang Z, Katsaros D, Biglia N, Shen Y, Fu Y, Tiirikainen M, et al. Low cancer: one system for all or separate systems for each histological expression of WWC1, a tumor suppressor gene, is associated with type? Eur Respir J 2016;47:720–3. aggressive breast cancer and poor survival outcome. FEBS Open Bio 26. Mäkinen JM, Laitakari K, Johnson S, Mäkitaro R, Bloigu R, Pääkkö P, 2019;9:1270–80. et al. Histological features of malignancy correlate with growth pat- 47. Meyers RM, Bryan JG, McFarland JM, Weir BA, Sizemore AE, Xu H, terns and patient outcome in lung adenocarcinoma. Histopathology et al. Computational correction of copy-number effect improves 2017;71:425–36. specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nat 27. Caso R, Sanchez-Vega F, Tan KS, Mastrogiacomo B, Zhou J, Genet 2017;49:1779–84. Jones GD, et al. The underlying tumor genomics of predomi- 48. Mezheyeuski A, Bergsland CH, Backman M, Djureinovic D, Sjöblom nant histologic subtypes in lung adenocarcinoma. J Thorac Oncol T, Bruun J, et al. Multispectral imaging for quantitative and compart- 2020;15:1844–56. ment-specific immune infiltrates reveals distinct immune profiles 28. Chen J, Yang H, Teo ASM, Amer LB, Sherbaf FG, Tan CQ, et al. that classify lung cancer patients. J Pathol 2018;244:421–31. Genomic landscape of lung adenocarcinoma in East Asians. Nat 49. Okayama H, Kohno T, Ishii Y, Shimada Y, Shiraishi K, Iwakawa R, et al. Genet 2020;52:177–86. Identification of genes upregulated in ALK-positive and EGFR/KRAS/ 29. Kumar S, Warrell J, Li S, McGillivray PD, Meyerson W, Salichos L, ALK-negative lung adenocarcinomas. Cancer Res 2012;72:100–11. et al. Passenger mutations in more than 2,500 cancer genomes: overall 50. Schabath MB, Welsh EA, Fulp WJ, Chen L, Teer JK, Thompson molecular functional impact and consequences. Cell 2020;180:915–27. ZJ, et al. Differential association of STK11 and TP53 with KRAS 30. Sanchez-Vega F, Mina M, Armenia J, Chatila WK, Luna A, La KC, et al. mutation-associated gene expression, proliferation and immune Oncogenic signaling pathways in The Cancer Genome Atlas. Cell surveillance in lung adenocarcinoma. Oncogene 2016;35:3209–16. 2018;173:321–37. 51. Shedden K, Taylor JMG, Enkemann SA, Tsao M-S, Yeatman TJ, 31. Ding L, Getz G, Wheeler DA, Mardis ER, McLellan MD, Cibulskis K, Gerald WL, et al. Gene expression–based survival prediction in lung et al. Somatic mutations affect key pathways in lung adenocarci- adenocarcinoma: a multi-site, blinded validation study. Nat Med noma. Nature 2008;455:1069–75. 2008;14:822–7. 32. Jiménez-Sánchez A, Cast O, Miller ML. Comprehensive benchmark- 52. Helmink BA, Reddy SM, Gao J, Zhang S, Basar R, Thakur R, et al. ing and integration of tumor microenvironment cell estimation B cells and tertiary lymphoid structures promote immunotherapy methods. Cancer Res 2019;79:6238–46. response. Nature 2020;577:549–55. 33. Nallanthighal S, Heiserman JP, Cheon D-J. The role of the extracel- 53. Sautès-Fridman C, Petitprez F, Calderaro J, Fridman WH. Tertiary lular matrix in cancer stemness. Front Cell Dev Biol 2019;7:86. lymphoid structures in the era of cancer immunotherapy. Nat Rev 34. Lu P, Takai K, Weaver VM, Werb Z. Extracellular matrix degrada- Cancer 2019;19:307–25. tion and remodeling in development and disease. Cold Spring Harb 54. Suzuki K, Kachala SS, Kadota K, Shen R, Mo Q, Beer DG, et al. Prog- Perspect Biol 2011;3:a005058. nostic immune markers in non–small cell lung cancer. Clin Cancer 35. Liot S, Aubert A, Hervieu V, Kholti NE, Schalkwijk J, Verrier B, et al. Res 2011;17:5247–56. Loss of Tenascin-X expression during tumor progression: a new 55. Nikitin AY, Alcaraz A, Anver MR, Bronson RT, Cardiff RD, Dixon D, pan-cancer marker. Matrix Biology Plus 2020;6–7:100021. et al. Classification of proliferative pulmonary lesions of the mouse: 36. Yuan B-Z, Jefferson AM, Baldwin KT, Thorgeirsson SS, Popescu NC, recommendations of the mouse models of human cancers consor- Reynolds SH. DLC-1 operates as a tumor suppressor gene in human tium. Cancer Res 2004;64:2307–16. non-small cell lung carcinomas. Oncogene 2004;23:1405–11. 56. Meuwissen R, Berns A. Mouse models for human lung cancer. Genes 37. Tamura M, Sasaki Y, Koyama R, Takeda K, Idogawa M, Tokino T. Dev 2005;19:643–64. Forkhead transcription factor FOXF1 is a novel target gene of the 57. Marjanovic ND, Hofree M, Chan JE, Canner D, Wu K, Trakala M, p53 family and regulates cancer cell migration and invasiveness. et al. Emergence of a high-plasticity cell state during lung cancer Oncogene 2014;33:4837–46. evolution. Cancer Cell 2020;38:229–46. 38. Lambrechts D, Wauters E, Boeckx B, Aibar S, Nittner D, Burton O, 58. LaFave LM, Kartha VK, Ma S, Meli K, Del Priore I, Lareau C, et al. et al. Phenotype molding of stromal cells in the lung tumor micro- Epigenomic state transitions characterize tumor progression in environment. Nat Med 2018;24:1277. mouse lung adenocarcinoma. Cancer Cell 2020;38:212–28. 39. Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH, et al. 59. Cao J, Yan Q. Cancer epigenetics, tumor immunity, and immuno- Functional characterization of somatic mutations in cancer using therapy. Trends Cancer 2020;6:580–92. network-based inference of protein activity. Nat Genet 2016;48: 60. Villanueva L, Álvarez-Errico D, Esteller M. The contribution of epige- 838–47. netics to cancer immunotherapy. Trends Immunol 2020;41:676–91. 40. Myatt SS, Lam EW-F. The emerging roles of forkhead box (Fox) 61. de Koning HJ, van der Aalst CM, de Jong PA, Scholten ET, proteins in cancer. Nat Rev Cancer 2007;7:847–59. Nackaerts K, Heuvelmans MA, et al. Reduced lung-cancer mortal- 41. Huber A-L, Papp SJ, Chan AB, Henriksson E, Jordan SD, Kriebs A, ity with volume CT screening in a randomized trial. N Engl J Med et al. CRY2 and FBXL3 cooperatively degrade c-MYC. Mol Cell 2020;382:503–13. 2016;64:774–89. 62. Kauczor H-U, Baird A-M, Blum TG, Bonomo L, Bostantzoglou C, 42. Roussel-Gervais A, Naciri I, Kirsh O, Kasprzyk L, Velasco G, Grillo G, Burghuber O, et al. ESR/ERS statement paper on lung cancer et al. Loss of the methyl-CpG–binding protein ZBTB4 alters mitotic screening. Eur Radiol 2020;30:3277–94. checkpoint, increases aneuploidy, and promotes tumorigenesis. 63. Blons H, Garinet S, Laurent-Puig P, Oudart J-B. Molecular markers Cancer Res 2017;77:62–73. and prediction of response to immunotherapy in non-small cell 43. Liu Z, Yang X, Li Z, McMahon C, Sizer C, Barenboim-Stapleton L, lung cancer, an update. J Thorac Dis 2019;11:S25–36. et al. CASZ1, a candidate tumor-suppressor gene, suppresses neuro- 64. Forde PM, Chaft JE, Smith KN, Anagnostou V, Cottrell TR, blastoma tumor growth through reprogramming gene expression. Hellmann MD, et al. Neoadjuvant PD-1 blockade in resectable Cell Death Differ 2011;18:1174–83. lung cancer. N Engl J Med 2018;378:1976–86. 44. Charpentier MS, Christine KS, Amin NM, Dorr KM, Kushner EJ, 65. Owen D, Chaft JE. Immunotherapy in surgically resectable non- Bautch VL, et al. CASZ1 promotes vascular assembly and morpho- small cell lung cancer. J Thorac Dis 2018;10:S404–11. genesis through the direct regulation of an EGFL7/RhoA-mediated 66. Vansteenkiste J, Dooms C. MS14.01 basic science rationale of IO in pathway. Dev Cell 2013;25:132–43. early stage NSCLC? J Thorac Oncol 2018;13:S268–9.

OF17 | CANCER DISCOVERY june 2021 AACRJournals.org

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution of Lung Adenocarcinoma Heterogeneity RESEARCH ARTICLE

. 67 Patani N, Barbashina V, Lambros MBK, Gauthier A, Mansour M, 88. Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Mackay A, et al. Direct evidence for concurrent morphological and Hansen KD, et al. Minfi: a flexible and comprehensive Bioconductor genetic heterogeneity in an invasive ductal carcinoma of triple- package for the analysis of Infinium DNA methylation microarrays. negative phenotype. J Clin Pathol 2011;64:822–8. Bioinformatics 2014;30:1363–9. 68. Friemel J, Rechsteiner M, Frick L, Böhm F, Struckmann K, Egger M, 89. Zhou W, Laird PW, Shen H. Comprehensive characterization, anno- et al. Intratumor heterogeneity in hepatocellular carcinoma. Clin tation and innovative use of Infinium DNA methylation BeadChip Cancer Res 2015;21:1951–61. probes. Nucleic Acids Res 2017;45:e22. 69. Hothorn T, Bretz F, Westfall P. Simultaneous inference in general 90. Forrest ARR, Kawaji H, Rehli M, Kenneth Baillie J, de Hoon MJL, parametric models. Biom J 2008;50:346–63. Haberle V, et al. A promoter-level mammalian expression atlas. 70. Grossman RL, Heath AP, Ferretti V, Varmus HE, Lowy DR, Kibbe Nature 2014;507:462–70. WA, et al. Toward a shared vision for cancer genomic data. N Engl J 91. Haas BJ, Dobin A, Li B, Stransky N, Pochet N, Regev A. Accuracy assess- Med 2016;375:1109–12. ment of fusion transcript detection via read-mapping and de novo 71. Ellrott K, Bailey MH, Saksena G, Covington KR, Kandoth C, fusion transcript assembly-based methods. Genome Biol 2019;20:213. Stewart C, et al. Scalable open science approach for mutation call- 92. Aran D, Sirota M, Butte AJ. Systematic pan-cancer analysis of ing of tumor exomes using multiple genomic pipelines. Cell Syst tumour purity. Nat Commun 2015;6:8971. 2018;6:271–81. 93. Mina M, Raynaud F, Tavernari D, Battistello E, Sungalee S, 72. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Saghafinia S, et al. Conditional selection of genomic alterations Getz G. GISTIC2.0 facilitates sensitive and confident localization dictates cancer evolution and Oncogenic dependencies. Cancer Cell of the targets of focal somatic copy-number alteration in human 2017;32:155–68. cancers. Genome Biol 2011;12:R41. 94. Fu Y, Liu Z, Lou S, Bedford J, Mu XJ, Yip KY, et al. FunSeq2: a 73. Thorsson V, Gibbs DL, Brown SD, Wolf D, Bortone DS, Ou Yang T-H, framework for prioritizing noncoding regulatory variants in cancer. et al. The immune landscape of cancer. Immunity 2018;48:812–30. Genome Biol 2014;15:480. 74. Collisson EA, Campbell JD, Brooks AN, Berger AH, Lee W, Chmielecki 95. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma J, et al. Comprehensive molecular profiling of lung adenocarcinoma. powers differential expression analyses for RNA-sequencing and Nature 2014;511:543–50. microarray studies. Nucleic Acids Res 2015;43:e47. 75. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The 96. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor cBio Cancer Genomics Portal: an open platform for exploring multidi- package for differential expression analysis of digital gene expres- mensional cancer genomics data. Cancer Discov 2012;2:401–4. sion data. Bioinformatics 2010;26:139–40. 76. Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, et al. 97. Law CW, Alhamdoosh M, Su S, Dong X, Tian L, Smyth GK, et al. Integrative analysis of complex cancer genomics and clinical profiles RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR. using the cBioPortal. Sci Signal 2013;6:pl1. F1000Res 2018;5:1408. 77. Goldman MJ, Craft B, Hastie M, Repeˇcka K, McDade F, Kamath A, 98. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, et al. Visualizing and interpreting cancer genomics data via the Xena Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. platform. Nat Biotechnol 2020;38:675–8. Bioinformatics 2011;27:1739–40. 78. Der SD, Sykes J, Pintilie M, Zhu C-Q, Strumpf D, Liu N, et al. 99. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Validation of a histology-independent prognostic gene signature for et al. Gene Ontology: tool for the unification of biology. Nat Genet early-stage, non-small-cell lung cancer including stage IA patients. 2000;25:25–9. J Thorac Oncol 2014;9:59–64. 100. Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck 79. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, WM, et al. Comprehensive integration of single-cell data. Cell Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce 2019;177:1888–902. framework for analyzing next-generation DNA sequencing data. 101. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analy- Genome Res 2010;20:1297–303. sis for microarray and RNA-Seq data. BMC Bioinformatics 2013;14:7. 80. Li H. Aligning sequence reads, clone sequences and assembly contigs 102. Danaher P, Warren S, Lu R, Samayoa J, Sullivan A, Pekker I, et al. with BWA-MEM. arXiv:1303.3997. 2013. Pan-cancer adaptive immune resistance as defined by the Tumor 81. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Inflammation Signature (TIS): results from The Cancer Genome Hartl C, et al. A framework for variation discovery and genotyp- Atlas (TCGA). J Immunother Cancer 2018;6:63. ing using next-generation DNA sequencing data. Nat Genet 2011; 103. Foroutan M, Bhuva DD, Lyu R, Horan K, Cursons J, Davis MJ. Single 43:491–8. sample scoring of molecular phenotypes. BMC Bioinformatics 2018; 82. Chakravarty D, Gao J, Phillips S, Kundra R, Zhang H, Wang J, et al. 19:404. OncoKB: a precision oncology knowledge base. JCO Precis Oncol 104. Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova 2017;2017:PO.17.00011. H, Hulselmans G, et al. SCENIC: Single-cell regulatory network 83. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, inference and clustering. Nat Methods 2017;14:1083–6. Getz G, et al. Integrative genomics viewer. Nat Biotechnol 2011; 105. Salas LA, Koestler DC, Butler RA, Hansen HM, Wiencke JK, Kelsey 29:24–6. KT, et al. An optimized library for reference-based deconvolution 84. Li B, Dewey CN. RSEM: accurate transcript quantification from of whole-blood biospecimens assayed using the Illumina Human- RNA-Seq data with or without a reference genome. BMC Bioinfor- MethylationEPIC BeadArray. Genome Biol 2018;19:64. matics 2011;12:323. 106. Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, 85. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Favera RD, et al. ARACNE: an algorithm for the reconstruction of et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics gene regulatory networks in a mammalian cellular context. BMC 2013;29:15–21. Bioinformatics 2006;7:S7. 86. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva pack- 107. Bankhead P, Loughrey MB, Fernández JA, Dombrowski Y, McArt age for removing batch effects and other unwanted variation in DG, Dunne PD, et al. QuPath: open source software for digital high-throughput experiments. Bioinformatics 2012;28:882–3. pathology image analysis. Sci Rep 2017;7:16878. 87. Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, et al. 108. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, The BioMart community portal: an innovative alternative to large, Pietzsch T, et al. Fiji: an open-source platform for biological-image centralized data repositories. Nucleic Acids Res 2015;43:W589–98. analysis. Nat Methods 2012;9:676–82.

june 2021 CANCER DISCOVERY | OF18

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research. Published OnlineFirst February 9, 2021; DOI: 10.1158/2159-8290.CD-20-1274

Nongenetic Evolution Drives Lung Adenocarcinoma Spatial Heterogeneity and Progression

Daniele Tavernari, Elena Battistello, Elie Dheilly, et al.

Cancer Discov Published OnlineFirst February 9, 2021.

Updated version Access the most recent version of this article at: doi:10.1158/2159-8290.CD-20-1274

Supplementary Access the most recent supplemental material at: Material http://cancerdiscovery.aacrjournals.org/content/suppl/2021/01/27/2159-8290.CD-20-1274.DC1

E-mail alerts Sign up to receive free email-alerts related to this article or journal.

Reprints and To order reprints of this article or to subscribe to the journal, contact the AACR Publications Subscriptions Department at [email protected].

Permissions To request permission to re-use all or part of this article, use this link http://cancerdiscovery.aacrjournals.org/content/early/2021/05/14/2159-8290.CD-20-1274. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC) Rightslink site.

Downloaded from cancerdiscovery.aacrjournals.org on September 28, 2021. © 2021 American Association for Cancer Research.