ARTICLE

https://doi.org/10.1038/s41467-021-21795-z OPEN Comprehensive single-cell sequencing reveals the stromal dynamics and tumor-specific characteristics in the microenvironment of nasopharyngeal carcinoma

Lanqi Gong1,2, Dora Lai-Wan Kwong1,2, Wei Dai 1,2, Pingan Wu3, Shanshan Li1, Qian Yan1,2, Yu Zhang2, Baifeng Zhang 2, Xiaona Fang2, Li Liu 4,5,6, Min Luo1, Beilei Liu 2, Larry Ka-Yue Chow2, Qingyun Chen7, Jinlin Huang2, Victor Ho-Fun Lee1,2, Ka-On Lam 1,2, Anthony Wing-Ip Lo 8, Zhiwei Chen 4,5,6, Yan Wang9,

1234567890():,; ✉ Anne Wing-Mui Lee1,2 & Xin-Yuan Guan 1,2,7

The tumor microenvironment (TME) of nasopharyngeal carcinoma (NPC) harbors a het- erogeneous and dynamic stromal population. A comprehensive understanding of this tumor- specific ecosystem is necessary to enhance cancer diagnosis, therapeutics, and prognosis. However, recent advances based on bulk RNA sequencing remain insufficient to construct an in-depth landscape of infiltrating stromal cells in NPC. Here we apply single-cell RNA sequencing to 66,627 cells from 14 patients, integrated with clonotype identification on T and B cells. We identify and characterize five major stromal clusters and 36 distinct sub- populations based on genetic profiling. By comparing with the infiltrating cells in the non- malignant microenvironment, we report highly representative features in the TME, including phenotypic abundance, genetic alternations, immune dynamics, clonal expansion, develop- mental trajectory, and molecular interactions that profoundly influence patient prognosis and therapeutic outcome. The key findings are further independently validated in two single-cell RNA sequencing cohorts and two bulk RNA-sequencing cohorts. In the present study, we reveal the correlation between NPC-specific characteristics and progression-free survival. Together, these data facilitate the understanding of the stromal landscape and immune dynamics in NPC patients and provides deeper insights into the development of prognostic biomarkers and therapeutic targets in the TME.

1 Department of Clinical Oncology, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China. 2 Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China. 3 Department of Surgery, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China. 4 Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China. 5 The AIDS Institute, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China. 6 State Key Laboratory of Emerging Infectious Disease, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China. 7 State Key Laboratory of Oncology in Southern China, Sun Yat-sen University Cancer Center, Guangzhou, China. 8 Department of Pathology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China. 9 Department of Pathology, The University of ✉ Hong Kong-Shenzhen Hospital, Shenzhen, China. email: [email protected]

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z

PC is a squamous cell carcinoma arising from the naso- single-cell suspension and analyzed by unique transcript counting Npharynx epithelium. Although NPC remains uncommon through barcoding each single cell with a unique molecular worldwide, it is endemic in East and Southeast Asia and identifier (UMI) (Fig. 1a). After mapping and North Africa, indicating the incidence is highly influenced by quality filtering, we obtained 66,627 cells where 50,169 cells (75%) epidemiological patterns and genetic susceptibility in certain originated from the NPC microenvironment and 16,458 cells ethical groups1. Despite its geographical distribution, the intense (25%) from the nonmalignant microenvironment, and detected immune infiltration and the commonly low degree of differ- on average 1215 median and 99,739 median reads per cell entiation are the unique characteristics that make NPC sig- (Fig. 1b and Supplementary Ffig. S1a). We used the Seurat R nificantly differ from other cancers1,2. Clinical observations have package (version 3.0) to perform the down-stream analysis15.We revealed that a substantial amount of stromal cells often inter- applied principal component analysis and a graph-based clus- mixes with malignantly transformed nasopharyngeal epithelial tering approach to categorize individual cells into distinct clus- cells. Furthermore, the Epstein–Barr virus (EBV) infection is ters. Then, the Uniform Manifold Approximation and Projection extensively associated with NPC incidence, and significantly (UMAP) was used to reduce the dimensionality and visualize the shapes the NPC microenvironment via chronic immune activa- cell distribution. To build on this, we assigned the clusters into six tion, so that eventually affects tumor progression and therapeutic major cell lineages via well-recognized marker genes (Fig. 1b, c). outcome3. So far, chemoradiotherapy remains the first-line Although myeloid cells and fibroblasts exhibited strong treatment option against NPC, but it often fails to achieve the preference in the TME, most of the clusters were not tumor- optimal effect in patients with either advanced or metastatic specific nor patient-specific, indicating that there were no tumors4. Recently, immunotherapy has emerged as a promising extensive variations in the composition of the major stromal strategy in many cancers5. For instance, the use of PD-1 inhibi- lineages (Fig. 1b, e). The degree of stromal infiltration for each tors, such as pembrolizumab and nivolumab, against NPC has patient was also independently determined by a group of generated promising pre-clinical results6. Nevertheless, the lack of professional pathologists based on hematoxylin and eosin a comprehensive understanding of the NPC microenvironment staining and immunohistochemical staining (Fig. 1f and Supple- constantly hinders the optimization of therapeutic interventions. mentary Fig. S2a). The interpatient analysis showed that T cells Therefore, it is essential to further elucidate the landscape, and B cells were the most predominant cell types in the NPC dynamics, and functional characteristics of tumor-infiltrating microenvironment, which was consistent with previous findings stromal cells in NPC patients and the nonmalignant counterpart. (Fig. 1d)16–18. However, patient 6 with a very rare type of Infiltration of dysfunctional and tumor-promoting stromal differentiated NPC, showed abnormal myeloid and B cell cells is one of the cancer-dependent hallmarks during malignant infiltration, suggesting differentiation grade of NPC might transition and progression. The subpopulations of the tumor- influence the stromal infiltration in the TME. In this study, we associated stromal cells and their interactions are extensively comprehensively identified and characterized the heterogeneous diverse with the evidence for transcriptional alterations and clo- subpopulations within the infiltrating T cells, B cells and nal expansion7. Previous studies of NPC-associated immune cells myeloid cells. have largely focused on EBV-specific chimeric antigen receptor To map the conserved and tumor-specific subpopulations in T cells that can induce a high level of antigen-specific cytotoxic the infiltrating immune cells, we performed second-round UMAP response8,9. Other TME-residing lymphocytes, myeloid cells, and within the T cells, B cells and myeloid cells. Together, the second- fibroblasts remain insufficiently identified and characterized in round UMAP more specifically deciphered the heterogeneous the NPC microenvironment. For example, recent evidence has components in the TME and non-malignant microenvironment, suggested that B cells constitute an emerging factor in immu- – revealing 36 distinct immune subpopulations. Using the random notherapeutic outcome for varied malignancies10 13, but the forest algorithm, we further validated the T and B subpopulations phenotype and function of infiltrating B cells are primarily in an independent cohort containing 37,442 single cells obtained unexplored in NPC. Overall, many tumor-infiltrating sub- from different tissues of patients 9-13, showing a strong clustering populations have been increasingly recognized as promising correlation between the discovery and validation cohorts biomarkers and therapeutic targets that can be used in early (Supplementary Fig. S1b–f). To further define the phenotype, diagnosis, prognostic prediction and reinvigoration of dysfunc- functional state and prognostic value of the major clusters, we tional immunity14. However, due to the lack of high throughput performed more in-depth characterization and validation within single-cell analyzing techniques in the last decade, the stromal these genetically distinct subpopulations. heterogeneity and molecular mechanism of stroma-tumor inter- play in NPC remain poorly explored. Hence, to comprehensively decipher the complexity of the stromal infiltration and reveal Identification and characterization of T-cell diversity in the NPC-specific features, we performed 5’ single-cell RNA sequen- NPC and non-malignant microenvironment. Fourteen sub- cing integrated with V(D)J profiling on 14 patients with either populations were yielded from 32,656 T cells, representing the nasopharyngeal carcinoma (NPC) or nasopharyngeal lymphatic most prevalent and diverse cell type in the nasopharyngeal hyperplasia (NLH). The in-depth analysis of stromal cells derived microenvironment (Fig. 2a). The expression of marker genes and from the NPC microenvironment and the non-malignant coun- functional signatures were used to identify and characterize the terpart revealed the molecular heterogeneity within the TME and subpopulations of tumor-infiltrating T cells (Fig. 2d, Supple- the potential effects on tumor progression and treatment mentary Fig. S3a). For example, three distinct exhausted T cell outcome. subpopulations were identified and characterized by LAG3, HAVCR2, and TOX, which were previously reported in T cell Results dysfunction19–21. Although most of T cell subpopulations Identification of stromal cells in the NPC microenvironment exhibited minor inter-patient heterogeneity and were primarily via single-cell RNA sequencing. Fourteen patients were enrolled shared in malignant and non-malignant microenvironment, the in this study, in which eleven patients were diagnosed with NPC, relative abundance and activated/inhibitory signatures were and the others were diagnosed with NLH. Fresh tissue samples considerably different (Fig. 2c, d). Two subpopulations of from these patients were collected via endoscopic biopsy. Sub- exhausted T cells, characterized by high expression of an immune sequently, the fresh tissues were immediately dissociated into a checkpoint HAVCR2 and an exhaustion-associated transcription

2 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE

a

Nasopharynx TCR and BCR identification Expression profiling CD3+ T cells 5’ 3’ Pathway enrichment CD79A+ B cells Single-cell Single-cell Regulatory network V D J dissociation LYZ + Sequencing myeloid cells Linear model construction GNLY+ NK cells Survival analysis COL1A1+ fibroblasts Developmental trajectory Clonotype analysis KRT19+ tumor cells Cellular communication Primary nasopharyngeal carcinoma and UMAP reduction hyperplastic lymphoid tissues

b

Plasma B cells B cells Myeloid cells UMAP_2 UMAP_2 UMAP_2 NK cells Fibroblasts T cells Tumor cells

Malignant 1 3 5 7 9 11 13 Non-malignant 2 4 6 8 10 12 14 UMAP_1 UMAP_1 UMAP_1 c d CD3D GNLY LYZ COL1A1 Malignant Non-malignant 80

60 6

40 MS4A1 CD79A KRT19 JCHAIN 4

20 Cell fraction (%) Cell fraction (%) UMAP_2 2

0 0 T cell B cell Myeloid NK cell Fibroblast

UMAP_1 e f

100 Fibroblasts T cell Tumor cells B cell Granulocytes Myeloid NK cell Lymphocytes 50 Fibroblast Histocytes Cell fraction (%) 400 μm

Fibroblasts Histocytes Lymphocytes 0

Patient ID 1 234 5 6 7 8 9 10 14 11 12 13 Tumor cells Lymphocytes

Stage Differentiation type EBV status Granulocytes

Normal I II III IVUndifferentiated Differetiated N/A Negative Positive 400 μm

Fig. 1 The landscape of the microenvironment in nasopharyngeal carcinoma and nasopharyngeal lymphoid hyperplasia at a single-cell resolution. a Graphical summary of the experimental design. Stromal cells and cancer cells from the NPC and NLH patients were collected and processed by single-cell 5’ RNA sequencing for transcriptomics analysis and coupled V(D)J sequencing for clonal profiling on T and B cells. b UMAP plot of 66,627 cells showing the components in the NPC and NLH microenvironment, color-coded by the major cell lineage (left), sample type (middle) and corresponding patient (right). c The expression of marker genes for the major lineages defined in a. d The relative abundance of the major cell lineages in the malignant (n = 11 biologically independent samples) and non-malignant (n = 3 biologically independent samples) microenvironment. Data are presented as mean values ± SD. e The fraction of major cell lineages originating from the 14 patients, with their clinical information. Source data are provided as a Source Data file. f The representative H&E staining images (400×) taken from two NPC patients, showing the high stromal infiltration in the NPC microenvironment. Scale bar = 400 µm.

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z

a C12-FOXP3-CD4 b C13-CLTA4-CD4 Resting Treg Suppresive Treg C10-ANXA1-CD4 C1-CCR7-CD8 Central memory Naive C11-CXCL13-CD4 T helper TCF7 CD4 C8- - C9-IFI-CD4 Naive IFN-stimulated C14-MKI67-CD3 Proliferating C4-ZNF683-CD8 Tissue-resident UMAP_2 C5-LAG3-CD8 UMAP_2

Exhausted TRM C2-GZMA-CD8 Activated C6-HAVCR2-CD8 Exhausted

C3-KLRG1-CD8 Effector memory C7-TOX-CD8 Exhausted Malignant Non-malignant UMAP_1 UMAP_1 d c C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 Expression CCR7 ANXA1-CD4 CCR7-CD8 Naive TCF7-CD4 Naive LEF1 Central memory SELL ** ** TCF7 12 40 25 IFNG ** NKG7 CD8+ 20 T 30 GZMA CD4+ T 8 GZMK Proliferating T 15 KLRG1 Naive 20 LAG3 Cytotoxic 10 PDCD1 4 HAVCR2 Exhaustion 10 5 Transcription Fraction of T cells (%) cells T of Fraction Fraction of T cells (%) cells T of Fraction Fraction of T cells (%) cells T of Fraction CTLA4 LAYN 0 0 0 Inflammatory NPC NLH NPC NLH NPC NLH EMOES ZEB2 IFN-induced CTLA4-CD4 TBX21 Co-stimulatory HAVCR2 CD8 ZNF683 - Exhausted GZMA-CD8 Activated Suppressive Treg Proliferation **** HIF1A Treg 40 ** 30 ** 20 TOX CCL4L2

CXCR5 s (%) l ls (%) 30 l 15 CXCL13 cel cells (%) cells 20 CCL5 T T ce 20 f 10 of of ANXA1 n n o o o

i IL7R 10 t cti IFI44L a 10 ac 5 r r RSAD2 F Fraction of T F

CD27 0 0 0 TNFRSF4 NPC NLH NPC NLH NPC NLH TNFRSF9 ICOS CIBERSORT: GSE68799 MKI67 FOXP3-CD4 Resting Treg Malignant Non-malignant FOXP3 40 80 IL2RA **** IKZF2 * * f s (%) s (%) 30 60 ll

e Hallmark pathways enriched in NPC-derived T cells

c ***

T IFN-γ response 20 40 of IFN-α response n ****

io Allograft rejection t 10 c 20 IL2-STAT5 signaling ra Fraction of T cell F Complement 0 0 Oxidative phosphorylation NPC NLH Treg Effector Exhausted Naive Hypoxia Apoptosis MYC target v1 e Inflammatory response NPC patient 9 NLH paitient 11 0 204060 -log (FDR) eus 10 l cleus Nuc Nu Hallmark pathways enriched in NLH-derived T cells 10 μm GC PD1 PD1 TNF-α signaling via NF-κB CD8 CD8 MYC target v1 100 μm 50 μm 10 μm UV response up P53 pathway Hypoxia eus

leus Inflammatory response cl Nu Nuc Allograft rejection 10 μm GC

P3 IL6 JAK STAT3 signaling Fox FoxP3 Apoptosis CD3 CD3 IL2 STAT5 signaling

010203040

-log10 (FDR) factor TOX, respectively, were observed highly enriched in the subtype (Supplementary Fig. S3d, e). Other subtypes of T cells, NPC microenvironment (Fig. 2c). However, unlike HAVCR2- including regulatory T cells and activated T cells, were also pre- exhausted T cells that were constantly distributed in all NPC ferentially infiltrated in the TME, whereas the naïve T cells and patients, TOX-exhausted T cells were found extensively infiltrated central memory T cells were more likely to reside in the non- in patient 9, indicating that they were a highly patient-specific malignant microenvironment. Deconvolution analysis via

4 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE

Fig. 2 Enrichment of exhausted and immunosuppressive T cell phenotype is a tumor-specific characteristic in the NPC microenvironment. a UMAP plot of 32,656 T cells from 14 patients, identifying 14 subpopulations. Each dot represents one single cell, color-coded by the cluster. b UMAP plot of 32,656 T cells from the NPC microenvironment and NLH microenvironment. Each dot represents one single cell, color-coded by the sample type. c The relative abundance of significantly enriched T cell subpopulations in the malignant (For single-cell data, n = 11 biologically independent samples; for CIBERSORTx estimation, n = 42 biologically independent samples) and non-malignant (For single-cell data, n = 3 biologically independent samples; for CIBERSORTx estimation, n = 3 biologically independent samples) microenvironment. Each dot represents one patient. The two-sided Student’s t test was used to determine the statistical significance. p values ≤ 0.05 were represented as *, ≤0.01 as **, ≤0.001 as ***, ≤0.0001 as ****. Data are presented as mean values ± SD. Source data are provided as a Source Data file. d The normalized average expression of functional signatures for T cells subpopulations defined in a. e Infiltration of exhausted CD8+ T cells and Treg in the NPC and NLH microenvironment. Representative images of PD1+ CD8+ T cells and FOXP3+ Treg cells in the NPC patient 9 (white arrows). The left photo shows a low magnification overview (×100); the right photo shows the boxed area in the left photo. Representative images of expression of PD1 (FITC) on CD8+ T cells (TRITC, white arrow) and FOXP3 (TRITC) on CD3+ T cells (FITC) in NLH patient 11 (original magnification, ×200). GC, germinal center. Three independent experiments were performed and generated similar results. f The GSEA hallmark pathways enriched in the NPC-derived and NLH-derived T cells, ordered by -log10 false discovery rate (FDR).

CIBERSORTx from bulk RNA sequencing data corroborates our CXCL13-high and CXCL13-low groups based on the median finding that effector, exhausted and suppressive T cell subtypes expression, and observed that the fraction of the CXCR5+ B cells were highly enriched in NPC patients22 (Fig. 2c). We further was significantly higher in CXCL13-high patients (Fig. 3e). In validated the enrichment of exhausted T cells and Tregs via addition, higher expression of CXCL13 was found correlated to multicolor immunofluorescence of CD3, CD8, FOXP3, and PD-1, better progression-free survival in NPC patients, suggesting the showing a consistent result with our single-cell data (Fig. 2e). CXCL13-high exhausted T cells might remain beneficial to These findings suggested that the severe infiltration of dysfunc- immune modulation (Fig. 3f). tional and immunosuppressive T cells largely affected T-cell LGALS1 has been reported to have the immunomodulatory immunity against NPC. Hence, we later compared the hallmark function in varied immune cells25,26. In this study, LGALS1 was pathways in NPC and non-malignant microenvironment-derived found highly expressed in NPC-associated Tregs (Fig. 3d). We T cells. The pathway analysis revealed that -gamma also discovered that the fraction of immunosuppressive Tregs, (IFN-γ) response and interferon-alpha (IFN-α) response were up- instead of resting Tregs, were significantly higher in LGALS1-high regulated in the NPC-derived T cells, suggesting that the excessive patients, indicating it might play a vital role in Treg activation production of type 1 and type 2 (IFNs) due to chronic (Fig. 3e). A previous study has shown that galectin- activation might significantly influence the composition and 1–homozygous null mutant mice exhibited impeded Treg activity, functional state of the infiltrating T cells (Fig. 2f). The T cells confirming its immunosuppressive role27. Although LGALS1 derived from the non-malignant microenvironment were mostly might serve as a promising therapeutic target to revert the influenced by the nuclear factor kappa--chain-enhancer of immunosuppressive function in Tregs, the molecular mechanism activated B cells (NF-κB) pathway (Fig. 2f). Subsequently, we of how LGALS1 deficiency affects Treg activity remains largely performed z-normalized pathway analysis and identified several unexplored. Based on the gene regulatory network, S100A11 and major pathways that were differently enriched across the CD8+ S100A4 were found upregulated in LGALS1+ Treg and predicted and CD4+ T cell subpopulations (Supplementary Fig. S3b). For to interact with LGALS1, suggesting its regulatory effect might be instance, exhausted T cells showed loss of IFN responses that exerted via a calcium channel-dependent process (Fig. 3b). might explain its dysfunctional state, whereas suppressive Tregs showed increased IFN responses compared with resting Tregs, Functional modules, clonal analysis and developmental tra- indicating that IFN-associated pathways might lead to Treg- jectory revealed the dynamics of exhausted T cells and Tregs in associated immune suppression. Besides, the apoptosis and P53 the NPC microenvironment. The down-stream analysis revealed pathways were found significantly downregulated in exhausted a rich set of functional modules that were associated with naïve, T cells and suppressive Tregs that might facilitate their pro- cytotoxic, exhausted and immunoregulatory programs in the T liferation in the TME. Other pathways, such as glycolysis, allo- cell subpopulations. Thus, we identified the most representative graft rejection, and oxidative phosphorylation showed a high genes in these programs, including CCR7 (naïve), NKG7 (cyto- preference in the TME-resided T cells, suggesting that targeting toxic), LAG3 (exhaustion), and FOXP3 (Treg). We later con- these pathways might enhance immunotherapeutic efficiency. structed gene profiles containing the top ten genes that were Overall, the NPC-derived T cells exhibited apparent functional mostly correlated to the representative genes (Fig. 3g–j). Based on differences compared to the NLH-derived counterpart. these gene profiles, we subsequently established generalized linear To further corroborate the molecular variations between T cell models in naïve T cells, cytotoxic T cells, exhausted T cells and infiltrates in the NPC and non-malignant microenvironment. We Tregs using the glm function in R. The generalized linear models performed MAST analysis to identify differentially expressed assigned a distinct parameter to each gene in the profile based on genes (DEGs) in T cells (Fig. 3a). Among all the DEGs, we its expression weight. Subsequently, the functional scores were predicted upstream transcription factors of the top 30 upregu- calculated based on the parameters estimated from the model, lated genes via RegNetwork and constructed the gene regulatory which represented the relative abundance of certain cell types. network via Cytoscape (Fig. 3b). Intriguingly, we found that We firstly performed the quantitative comparison of the CXCL13 and LGALS1 were significantly upregulated in the NPC- cytotoxic program within the CD8+ T cell clusters. The result derived T cells. CXCL13 was one of the gene signatures for demonstrated that the exhausted T cell clusters (C5, C6, and C7) follicular T helper cells, but it was also found highly expressed on + maintained intermediate-to-high cytotoxic activity instead of exhausted T cells, particularly PD-1 T cells (Fig. 3c). Previous being terminally dysfunctional, suggesting there might be a studies had also reported that CXCL13 was highly expressed in transitional process where effector T cells gradually lost PD-1High exhausted T cells, and it was probably involved in the + cytotoxicity due to chronic activation and tumor-dependent formation of tertiary lymphoid structures (TLSs) via CXCR5 B mechanisms in the TME (Fig. 3g). Also, the analysis of the cell recruitment23,24. Therefore, we divided the NPC patients into exhaustion program showed that exhaustion was not only

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z

abPredicted Co-locolization Up-regulated genes

TNFRSF18 Predicted TFs OAS1 LY6E DUSP4 Predicted targets 60 JUN IFI6 S100A4 OAS2

IFIT1 OASL S100A11 TNFRSF4 FOS IFI44 MX2 IFI44L

FOSB CREB1 RPS6 IFI27 ISG15 RPL3 MT2A EGR1 UBE2L6 MYL6 RSAD2 MX1 40 DUSP1 LY 6 E IFIT3 IRF7 JUN FOXP3 RPL17 CD74 XAF1 MT−ATP8 ETS1 PSMB9 LGALS1 FOS

EEF1G TYMP STAT1 RELA ZFP36L2 (FDR) JUNB GZMB HLA-DMA GADD45B CALM2

10 POU2F1 SH3BGRL3 NFKB1 IER2 CD74 BTG2 GZMK CYBA HLA-DRA MYC TPI1 IFI6 AR -log GAPDH CYBA GZMA RPS20 CTSW HLA-DRB1 HLA-DPA1 TP53 20 ISG15 CTSW CXCL13 EGR1 BST2 HLA−DRA HLA-DMB IGLV2−8 S100A4 CCL4 RPS27L HLA-DRB5 HSPA1B HLA-DPB1 POU2F2 S100A11 E2F2 IFI44L LGALS1 GZMH E2F1 OAS1 HLA−DQA1 CKLF OASL CD8B NKG7 GZMH E2F4 GZMA GZMB NKG7 HLA-DQB1 GZMK XAF1 FOXP3 TNF STMN1 CXCL13 TFDP1 CCL4L2 Co-expression CCL4L2 RPL41 0 HLA-DQA1 Physical interactions -1 1 Log2 fold change CXCL13 ef c 80 ns. 50 * 1.0

s (%) 40 60

ell 0.8

B c 30 + 40 0.6 CR5 20 CX UMAP_2 0.4 of

20

Fraction of B cells (%) cells of B Fraction 10 p on Log- =0.01 0.2 CXCL13-High racti

0 F 0 CXCL13-Low h w g igh ow Hi -Lo H -L Progression-free survival 0.0 3- 3 3- 1 1 13 L L C CL1 0 204060 UMAP_1 X XC X C C C CXCL Month ** 40 20 LGALS1 1.0 ns. egs (%) egs d r 30 15 0.8 e T v ressi

20 p 10 0.6 of sup of

0.4 10 5 Log-rank p=0.13 ction of resting Tregs (%) Tregs of resting ction ction UMAP_2 a a 0.2 LGALS1-High Fr Fr 0 0 LGALS1-Low h h g g

i i ow Progression-free survival H -Low -L 0.0 1- -H 1 LS1 LS ALS A 0204060 GA G LG LGALS1 L L Month UMAP_1 g h 100 25 NKG7 LAG3 CCL5 GZMB

80 e 20

GZMA e HAVCR2

GZMK PTMS cor 60 s 15

CCL4 CXCL13 n o i

CST7 VCAM1 t 40 s 10 GZMH PRF1 u

ITM2C TNFRSF9 ha x Cytotoxic scor Cytotoxic

CCL4L2 20 TIGIT E 5 IFNG PDCD1 0 0 e c x c x 0.0 0.5 1.0 v m e 0.0 0.5 1.0 m m a e ex r Ta T ive e Tr -Tex -Trm-T - - - -T -Trm -T -T -Nai A -Tem -Na 1-T 3 X G3 M Correlation 7 8 MA G3 Correlation TOX Z CR2 TO LA G V GZ LA CCR7 ZNF683 HA KLRG1 CCRKLRGZNF6 HAVCR2

i j 15 25 CCR7 FOXP3 TCF7 IL2RA 20 LEF1 TNFRSF18 e r ACTN1 10 IL1R2 e S1PR1 TNFRSF4 or 15 MAL CCR8 sc IL7R 10 5 LAIR2 reg T PLAC8 Naive sco LAYN SPINT2 CTLA4 5 BACH2 TIGIT 0 0 0.0 0.5 1.0 e g m m iv rm ive m 0.0 0.5 1.0 Tex r -Tac T a c ive 2- -T -Tex -Te - lated -T sting s 3 X 1 estin u -Naive-N 1 e Correlation O 3-Helper 7 AG T -R im Correlation -R L 3 AVCR GZMA -st TCF NXA H KLRG-SuppressXP3 ZNF683 CCR7A XP O XCL1 IFN -Suppres F C FO CTLA4 CTLA4 induced in the exhausted T cells, but also induced in the effector and CD8+ naïve T cells had a high naïve score (Fig. 3i). Analysis T cells, but at a lower degree, stating that the activation-to- of the Treg program suggested that suppressive Tregs possessed exhaustion transition was indeed a dynamic process where T cells more potent immunoregulatory capacity than the resting Tregs could be in either a progenitor, intermediate or late exhausted (Fig. 3j). Such increased capacity could be explained by the higher state (Fig. 3h). Consistent with the clustering, the quantitative expression of stimulatory molecules-encoding genes in the analysis of naïve program across all T cells showed that CD4+ suppressive Tregs, including CD27, TNFRSF4, TNFRSF9,

6 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE

Fig. 3 Identification of NPC-specific T cell signatures and characterization of their correlations to patient prognosis and T cell dynamics. a The −2 differentially expressed genes (log2 fold change ≥0.5, FDR ≤ 5×10 ) in NPC-derived and NLH-derived T cells identified by the MAST analysis. b The gene regulatory network constructed by Cytoscape, colored for the associated gene type (red: top 30 upregulated genes from NPC-derived T cells, blue: the upstream transcription factors predicted by RegNetwork, green: the targeted genes predicted by GeneMANIA). c The expression of CXCL13 in each T cell profiled on the UMAP. d The expression of LGALS1 in each T cell profiled on the UMAP. e The total B cell fraction and CXCR5+ B cell fraction in CXCL13-high (n = 5 biologically independent samples) and CXCL13-low groups (n = 5 biologically independent samples); The resting Treg fraction and suppressive Treg faction in LGALS1-high (n = 6 biologically independent samples) and LGALS1-low groups (n = 5 biologically independent samples). Each dot represents one patient. The two-sided Student’s t-test was used to determine the statistical significance, p-values >0.05 were represented as ns. (not significant), p values ≤ 0.05 were represented as *, ≤0.01 as **. Data are presented as mean values ± SD. f The progression-free survival for 88 NPC patients, stratified for the normalized average expression (binary: high expression vs. low expression) of CXCL13 and LGALS1. p values were determined based on the two-sided log-rank test. g The most correlated genes to NKG7 (the representative gene in the cytotoxic module) across cytotoxic T cells and predicted cytotoxic scores for CD8+ subpopulations based on the linear model constructed by this set of genes. h Similar to g but showing the most correlated genes to LAG3 (the representative gene in the exhaustion module) across exhausted T cells and predicted exhaustion scores for CD8+ subpopulations. i The most correlated genes to CCR7 (the representative gene in the naive module) across naïve T cells and predicted naive scores for all T cell subpopulations. j The most correlated genes to FOXP3 (the representative gene in the Treg module) across Tregs and predicted Treg scores for Treg subpopulations. Source data are provided as a Source Data file.

TNFRSF18, and ICOS. To validate the reliability of our linear To further understand the clonal expansion and its influence models, we performed the correlation analysis between functional on T-cell dynamics, the top three largest clones in each patient scores and corresponding abundances of T cell subpopulations in were projected on the UMAP with their corresponding the discovery cohort and validation cohort. The result exhibited clonotypes (Fig. 5d and Supplementary Fig. S4a, b). The activated that functional scores were highly correlated to the T cell (C2 and C3) and exhausted T cells (C5, C6, and C7) were shared abundance, suggesting the linear model as an alternative identical clonotypes in most of the NPC patients, validating the approach to comprehensively and accurately estimate dynamic presence of activation-to-exhaustion transition. Nevertheless, we functional states of T cells in the NPC microenvironment found a substantial amount of TOX-exhausted T cells were only (Fig. 4a). In addition, we analyzed the T cell subpopulations shared identical clonotypes with HAVCR2-exhausted T cells. The from an independent NPC single-cell cohort done by Chen result inferred that the exhausted T cells in different clusters were et al.28 The functional scores calculated by our model in Chen’s not completely independent, and the state transition could data also showed high correlations to the abundance of selected T happen from one exhausted subtype to another, namely from cell subpopulations, which further demonstrated the reliability of HAVCR2-exhausted to TOX-exhausted or vice versa. It attracted our linear model (Fig. 4b). Meanwhile, we computed the our attention that a significant fraction of proliferative T cells also correlation between the expression of representative genes and shared identical clonotypes with the exhausted T cells, indicating functional scores in two independent bulk RNA sequencing the exhausted program was often accompanied by elevated cohorts, containing 45 patients and 88 patients, respectively29. proliferation. Besides, we conducted TCR motif analysis and The functional scores were significantly correlated to the found that although most of the motifs were consistent in T cells corresponding genes, indicating that the linear model is capable either derived from the NPC or NLH microenvironment, there of more comprehensively and accurately quantifying real still existed a small portion of TCR patterns that were NPC- dynamic T cell functional states from bulk RNA sequencing specific (Supplementary Fig. S4c). Taken together, it was data, compared to the signature-based characterization. (Fig. 4c). noteworthy to mention that C5-LAG3, C6-HAVCR2 and C7- We later linked the functional scores of 88 NPC patients with TOX exhausted T cells with different transcriptional states their clinical information and found higher naïve score was exhibited distinct cytotoxic and exhausted activities. It demon- significantly correlated to better progression-free survival (Sup- strated that the exhausted T cells that had been previously plementary Fig. S3f). reported as a relatively homogenous population, were possibly To characterize the clonal diversity of infiltrated T cells in the highly heterogeneous and could be originated from varied NPC and non-malignant microenvironment, we systematically sources, including activation-to-exhaustion transition and clonal analyzed the identity of TCRs on individual T cells obtained from expansion. the coupled V(D)J profiling. In total, we detected on average The pseudotime developmental trajectory was independently 2,071 unique clonotypes per patient. As expected, the clonotypes performed on CD8+ T cells and CD4+ T cells via Monocle were highly variable across the patients, but the composition of v230,31. The developmental trajectory provided a unique visua- clonal sizes remained relatively consistent in NPC and NLH lization that inferred the lineage structure of T cells in the NPC patients. Therefore, we divided the clonotypes into three and non-malignant microenvironment. The trajectory CD8+ categories based on the number of cells assigned to each clone T cells showed that the HAVCR2-exhausted, TOX-exhausted, ≥ = = (refer to clonal size 3, 2 and 1) (Fig. 5a). The fraction of LAG3-exhausted T cells, and TRM were located at the end of larger clone sizes (clonal size ≥3 and =2) were found significantly different branches, indicating their distinct developmental enriched in the NPC patients, indicating a strong T-cell clonal processes (Fig. 5e and Supplementary Fig. S3c). The effector expansion in the TME (Fig. 5b). In contrast, the T-cell expansion and memory effector T cells were positioned in between, in the non-malignant microenvironment was relatively static suggesting that they were at an intermediate developmental state. since it was largely dominated by T cells with small clonality The result further demonstrated that exhaustion was the terminal (clonal size = 1) (Fig. 5b). We later computed the exhaustion and stage during CD8+ T cells development, and reprogramming Treg scores of patients 4–13, and found that both scores were exhausted T cells via TME modulation might become a feasible positively correlated to large clonality. This result illustrated that therapeutics to re-activate anti-tumor immunity. The trajectory the exhausted T cells and Tregs in the TME underwent a rapid CD4+ T cells also exhibited the transitional process from naïve clonal expansion and eventually resulted in a higher abundance of T cells to Tregs, central memory T cells, and T follicular helper these T cell subtypes in NPC patients (Fig. 5c). cells (Fig. 5f and Supplementary Fig. S3c). The pseudotime

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z

a Discovery cohort Validation cohort c GSE102349 GSE68799 Patient ID 1 3 5 7 9 11 13 Patient ID 1 2 3 4 5 2 4 6 8 10 12 14 80 80 15 15 r=0.849 r=0.769 r=0.758 r=0.927 p<1×10-15 p<1×10-15 p=0.002 p=0.024 n

60 60 o i Tcells(%) Tcells(%) + + ress 7 p

G 40 40 10 10 ex expression NKG7 NK 7 20 20 NKG NKG7

0 0 5 5 Fraction of Fraction of 0 2040608010 15 20 25 12 14 16 18 20 22 12 14 16 18 20 22 24 Cytotoxic score Cytotoxic score Cytotoxic score Cytotoxic score 80 60 15 15 r=0.778 r=0.914 p p=0.001 =0.030 r=0.951 r=0.897 p -15 60 <1×10 p<1×10-15 40

40 10 10 xpression expression e 20 20 LAG3 LAG3

0 0 5 5 0 5 10 15 0246814 16 18 20 22

Fraction of exhausted T cells (%) Fraction of exhausted T cells (%) 14 16 18 20 22 Exhaustion score Exhaustion score Exhaustion score Exhaustion score 50 60 15 15

r=0.937 r=0.918 r=0.908 r=0.865 40 -7 p -15 p=8×10 p=0.028 <1×10 p<1×10-15 on ion i

40 s 30

press 10 10 xpres e 20 ex 20 7 10 CCR7 CCR Fraction of naive T cells (%) Fraction of naive T cells (%) 0 0 5 5 0246810 0 5 10 15 24 26 28 30 32 34 36 22 24 26 28 30 32 34 Naive score Naive score Naive score Naive score 50 25 15 15 r=0.932 r=0.949 r=0.951 r=0.940 -6 -15 -15 40 p=1×10 20 p=0.013 n p<1×10 p<1×10 o

30 15 ressi ression p 10 p 10 ex 20 10 ex 3 P 10 5 FOX FOXP3 Fraction of Tregs (%) Fraction of Tregs (%)

0 0 5 5 0246810 01234516 18 20 22 24 26 28 16 18 20 22 24 26 28 Treg score Treg score Treg score Treg score

b Chen, Y.P. et al. NPC single-cell cohort

Sample ID 1 3 5 7 9 11 13 2 4 6 8 10 12

) 80 60 80 r=0.773 60 % r=0.809 r=0.844 r=0.908 p=0.002 p=0.001 p=0.0003 p=2×10-5 lls (

e 60 60 40 Tc 40 + 7 40 40 NKG 20 20 20 20 tion of Fraction of Tregs (%) ac 0 0

Fraction of naive T cells (%) 0

Fr 0 0 1020304050 0 5 10 15 20 25 02468 Fraction of exhausted T cells (%) 0246 Cytotoxic score Exhaustion score Naive score Treg score

Fig. 4 Independent validation of the functional modules in bulk RNA sequencing and single-cell RNA sequencing cohorts. a The Pearson correlation analysis of functional scores (cytotoxic, exhaustion, naïve, and Treg) and the abundance of T cell subpopulations (NKG7+ cytotoxic T cells, exhausted T cells, naïve T cells, and Tregs) in the discovery cohort and validation cohort. b The Pearson’s correlation analysis of functional scores (cytotoxic, exhaustion, naïve, and Treg) and the abundance of T cell subpopulations (NKG7+ cytotoxic T cells, exhausted T cells, naïve T cells, and Tregs) in PY Chen, et al. NPC single-cell cohort. c The Pearson correlation analysis of functional scores and the corresponding genes in two GEO cohorts (GSE102349 and GSE68799), respectively. The two-sided p value for each functional module in GSE102349 and GSE68799 was extremely small (<1 × 10−15), therefore the exact p values cannot be shown in the figure. Source data are provided as a Source Data file. estimation showed that the resting Tregs were considered as an previously considered as a relatively homogenous population in earlier developed subtype, which further supported our previous the TME. Nevertheless, we detected 27,506 B cells in 11 distinct finding that the suppressive Tregs, as a more mature form, had subtypes, representing the second-largest and second-most the more potent immunoregulatory capability and underwent diverse stromal cell type in the TME (Fig. 6a, b and Supple- more robust clonal expansion. mentary Fig. S5a). The unique clusters of innate-like and germinal center B cells were firsttobefoundintheintratumor fi Deciphering the prognostic value of B cell-specific signatures mass of NPC. Plasma B cells were speci cally enriched in patient and NPC-associated functions of B cell subtypes.Bcellswere 2, patient 6 and patient 7 and primarily contributed to the inter-

8 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE

a b p=0.069 * 60 15 100 * 100 80 ze (%) 40 i 10

esize(%) 60 ne size (%) n o o

fcl 40 of cl 20 5 o nofclones n n o o i

tio 20 c a r Fracti Fract 0 0 F 0 NPC NLH NPC NLH NPC NLH 50 Clone size>=3 Clone size=2 Clone size=1 c 15 r=0.612 10 r=0.770 e p=0.067 p=0.005 8 10 6 Fraction of clone size (%) 4 5 Treg score

Patient ID Patient ID

Exhaustion scor 2 0 4 7 9 11 13 4 7 9 11 13 6 8 10 12 14 6 8 10 12 14 Patient ID 46 7 8 9 10 1411 12 13 0 0 0204060 Shannon index 6.2 7.2 6.7 5.5 5.8 7.67.3 7.8 7.3 7.0 0204060 Clone size ≥3 (%) Clone size ≥3 (%) Clone size ≥3 =2 =1

d Patient 4 clonotype 1 Patient 9 clonotype 1 Patient 12 clonotype 1 Patient 13 clonotype 1 VDJC VDJC VDJC V DJ C α TRAV12-3 TRAJ31 TRAC α TRAV8-3 TRAJ47 TRAC α TRAV21 TRAJ4 TRAC α TRAV29DV5 TRAJ35 TRAC β TRBV28 TRBD1 TRBJ1-1 TRBC1 β TRBV6-5 TRBD1 TRBJ2-1 TRBC2 β TRBV2 TRBD1 TRBJ1-2 TRBC1 β TRBV5-6 TRBD1 TRBJ2-3 TRBC2

Patient 4 clonotype 2 Patient 9 clonotype 2 Patient 12 clonotype 2 Patient 13 clonotype 2 V DJC V DJC V DJC V DJC α TRAV14DV4 TRAJ39 TRAC α TRAV38-1 TRAJ45 TRAC α TRAV12-1 TRAJ49 TRAC α TRAV19 TRAJ38 TRAC β TRBV5-6 TRBD1 TRBJ2-7 TRBC2 β TRBV27 TRBD2 TRBJ1-4 TRBC1 β TRBV10-3 TRBD2 TRBJ2-3 TRBC2 β TRBV24-1 TRBJ1-5 TRBC1

Patient 4 clonotype 3 Patient 9 clonotype 3 Patient 12 clonotype 3 Patient 13 clonotype 3 V DJC V DJC V DJC V DJC α TRAV8-4 TRAJ53 TRAC α TRAV14DV4 TRAJ49 TRAC α TRAV29DV5 TRAJ28 TRAC α TRAV29DV5 TRAJ49 TRAC β TRBV19 TRBD1 TRBJ1-5 TRBC1 β TRBV28 TRBD1 TRBJ2-4 TRBC2 β TRBV30 TRBD2 TRBJ2-3 TRBC2 β TRBV14 TRBD1 TRBJ1-1 TRBC1

e Developmental trajectory of CD8+ T cellsf Developmental trajectory of CD4+ T cells

Naive Exhausted tissue-resident Effector Naive IFN-induced T helper Central memory 2 HAVCR2-exhausted TOX-exhausted 2 Suppressive Treg Resting Treg Tissue-resident Effector memory

1 1

0 0 Component 2 Component 2

-1

-1

-2

-2 -3 -2 -1 0 1 2 -5 -2.5 0 2.5 Component 1 Component 1

Fig. 5 Single-cell V(D)J sequencing reveals the clonal expansion and activation-to-exhaustion transition in exhausted and immunosuppressive T cell subtypes. a Distribution of T cell clones by size, with different Shannon index showing the clonal diversity for each patient. b The relative abundance of different clone sizes in the NPC-derived (n = 7 biologically independent samples) and NLH-derived (n = 3 biologically independent samples) T cells. Each dot represents one patient. The two-sided Student’s t test was used to determine the statistical significance, p values ≤ 0.05 were represented as *, ≤0.01 as **, ≤0.001 as ***, ≤0.0001 as ****. Data are presented as mean values ± SD. c The Spearman’s correlation between the calculated exhaustion score/ Treg score and the fraction of clones with ≥3 cells in patients. Source data are provided as a Source Data file. d UMAP plot of the top three largest clones in patients 4, 9 (NPC) and 12, 13 (NLH), with their associated clonotypes. (e and f) The developmental trajectory of CD8+ and CD4+ T cells, colored-coded by the associated cell subpopulations and numbered by different developmental branches.

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z patient heterogeneity of B cells in the NPC microenvironment Heterogeneity and dynamics of myeloid subpopulations in the (Fig. 6c and supplementary fig. 5b). Although the number of NPC microenvironment. With 3671 myeloid cells clustered plasma B cells and switched memory B cells was significantly into 11 subpopulations, we found that 95% of the cells were increased in the TME, it did not reach statistical significance due derived from NPC patients, indicating the myeloid accumula- to considerable interpatient variation (Fig. 6c and Supplemen- tion was an NPC-dependent process (Fig. 7a, b). Our single-cell tary Fig. S5c). Total naïve B cells and innate-like B cells were data showed that there existed three major myeloid-lineage found enriched in the non-malignant microenvironment, components, including myeloid-derived suppressor cells whereas the IFN-induced B cells (including the naïve form) and (MDSCs), macrophages and dendritic cells (DCs) (Fig. 7c). double-negative B cells were particularly NPC-enriched Previous studies have also reported that the MDSCs, macro- (Fig. 6d). From the bulk RNA sequencing data, deconvolution phages and DCs were the most abundant myeloid-lineage analysis validated that plasma B cells and double-negative B cells subtypes in the NPC microenvironment36–38,andwefurther were more likely to infiltrate in the NPC microenvironment. classified these cells into finer subpopulations (Fig. 7a, d). Whereas pan B cells, including naïve B cells, unactivated B cells Compared to T cell and B cell subpopulations, the myeloid and innate-like B cells, were preferentially enriched in the cells, especially TAMs, displayed extensive heterogeneity driven nonmalignant microenvironment (Fig. 6e). The genetic differ- by interpatient variation and tissue specificity (Fig. 7eand ences between NPC-derived and NLH-derived B cells were rare, Supplementary Fig. S7a). For instance, TAMs primarily derived suggesting that the tumor-associated function of B cells might from patient 6 (C1, C2, and C3) and patient 8 (C4) exhibited largely depend on the abundance instead of the differences in highly distinct genetic profiles (Fig. 7c and Supplementary genetic regulations or secretion. The differentially Fig. S7a). In addition, deconvolution analysis revealed that expressed genes in NPC-derived B cells were primarily MDSC were highly enriched in NPC patients (Fig. 6f). DCs interferon-induced and immunoglobins-encoding genes, were found not only resided in the TME, but also frequently including IFITM1, IFI44L, IGHA1, IGKC,andIGHG3 (Supple- infiltrated in the non-malignant microenvironment so that did mentary Fig. S5d). Consistent with the results shown in T cell not yield a statistical significance. Pathway enrichment of NPC- subpopulations, the pathway analysis of B cells revealed that the derived and NLH-derived myeloid cells confirmed that the NPC-derived B cells were also largely influenced by the excessive presence of IFNs also severely shaped the myeloid landscape in release of IFN-γ and IFN-α, whereas the NLH-derived B cells the NPC patients (Fig. 7g). The MAST analysis and gene reg- were altered due to inflammatory disorder (Fig. 6f). Previous ulatory network revealed the co-expression of M2 and M1 studies have reported that the effects of IFNs on B cells were polarization-associated genes in the TAMs, including FCGR3A, associated with the enhancement of maturation, differentiation, FCGR2A, TREM2, and APOC1, indicating that the TAM sub- immune activation and survival32,33.However,thefunctionof populations might be more complicated than the conventional those IFN-induced B cells in tumor progression was mainly M1/M2 classification (Supplementary Fig. S7b, c)39,40.The unexplored. Emerging evidence has suggested that IFNs might developmental trajectory validated the dynamics of myeloid- trigger immunosuppressive mechanisms34,butwhetherthe lineage cells. We found C7-LAMP3,C8-IDO1, and C9-LGALS2 enrichment of IFN-induced B cells was prognostic in NPC DCs were grouped on branch 4, indicating that they were patients remains unclear. The double-negative B cells were developmentally similar (Supplementary Fig. S7d). However, previously found expanded in non-small-cell lung carcinoma C6-Langerhans cells, a type of specialized tissue-resident DCs, (NSCLC) and negatively correlated to the number of activated B were distinctly grouped on branch 3. Macrophages were mainly cells, which was prognostic for better overall survival in NSCLC locatedattheendofbranches1and2.C1-FOLR2 and C2- patients35. Since the presence of tumor-infiltrating B cells was CCL20 TAMs exhibited distinct developmental programs, and usually correlated to better overall survival and treatment out- C3-FCGR2B TAMs were mainly located in the middle, indi- comes, we selected several representative B-cell signatures cating its intermediate developmental state. MDSCs, including (CD79A, MS4A1, IGHD,andFCRL4) and examined their C10-S100A9 and C11-NFKB1, were primarily located on the prognostic values in NPC patients. The survival analysis end of branch 4, which suggested that they were developmental revealed that the signature genes were significantly correlated to similar but transcriptionally different subtypes. better progression-free survival in NPC patients, suggesting an active role for B cells in anti-NPC immunity (Supplementary Fig. S5e). Fibroblasts and NK cells as microenvironment-based targets in To further gain insights into the function of infiltrating B NPC treatment. Fibroblasts and NK cells represented two kinds cells, we performed the clonal analysis to quantify the of minor subpopulations in the NPC microenvironment18,41,42. distribution and diversity of clonotypes in each patient (Fig. 6g). Unlike NK cells that were found in both TME and non- In total, we detected on average 2419 unique clonotypes per malignant microenvironment, fibroblasts were overwhelmingly patient. Similar to the result of T cells, larger B cell clones were enriched in NPC patients (Fig. 1d, e). MAST analysis revealed significantly enriched in NPC patients, whereas the smaller the distinct profiles in the fibroblasts and clones were more abundant in NLH patients (Fig. 6h). We later NK cells (Supplementary Fig. S7e). Genes that encoded extra- examined the frequently enriched B-cell clonotypes in NPC and cellular matrix (ECM) components, such as COL1A1, COL1A2, NLH patients, respectively (Supplementary Fig. S6a). We LUM and FN1, were specifically expressed by fibroblasts, sug- conducted the motif analysis and found that all of the BCR gesting that the ECM in the NPC microenvironment was highly motifs were highly consistent in the composition but remained complex and might interact with surrounding tumor and distinct in frequency (Supplementary fig. 6b). We subsequently stromal cells via integrin signaling. Thus, inhibition of parti- performed the developmental trajectory of B cells and revealed cular integrin receptors on tumor and immune cells might serve the developmental lineage from naïve B cells to different types as a potential therapeutic target to disrupt ECM-dependent of mature B cells, including plasma B cells, switched memory B tumor progression and suppressive immunomodulation. Simi- cells and IFN-induced B cells. We also found that many B cell lar to effector T cells, NK cells commonly expressed a high level subtypes were at the intermediate development state, such as of cytotoxic genes, including NKG7, GZMA, GZMB, and unactivated B cells, germinal center B cells, and double-negative GZMH. However, GNLY and KLRD1 were preferentially B cells (Supplementary Fig. S6c). expressed by NK cells (Supplementary Fig. S7e). Recent

10 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE

a b C3-Naive

C8-IFN-induced C2-Plasma B Naive C7- C11- Double-negative EBV-induced C4-Unactivated C1- Innate-like

C5- UMAP_2 IFN-induced UMAP_2 C6-Switched memory

C9-Germinal center

Malignant C10-Proliferating Non-malignant UMAP_1 UMAP_1

c f Hallmark pathways enriched in NPC-derived B cells

IFN-γ response IFN-α response Patient 1 Patient 2 Patient 3 Patient 4 Patient 5 Inflammatory response Allograft rejection IL2 STAT5 signaling Complement system

0 204060 Patient 6 Patient 7 Patient 8 Patient 9 Patient 10 -log10(FDR) C1-Innate-like C2-Plasma B C3-Naïve C4-Unactivated Hallmark pathways enriched in NLH-derived B cells C5-IFN-induced C6-Switched memory C7-Double-negative C8-IFN-induced naïve TNF-α signaling via NF-κB C9-Germinal-center C10-Proliferating MYC target v1 Patient 14 Patient 11 Patient 12 Patient 13 C11-EBV-induced P53 pathway mTORC signaling Hypoxia d UV response up IL2 STAT5 signaling Naive B Total naive B Innate-like B Allograft rejection 50 50 40 Apoptosis ** * * Unfolded expression 40 40 ls (%) l 30 TGF-β signaling cells (%) cells ce

30 30 (%) cells B IFN-γ response

f B f B

o 20 o Apical junction 20 20 on i

act 10 051015 10 10 Fraction Fraction of Fr -log10(FDR) 0 0 0 NPC NLH NPC NLH NPC NLH

IFN-induced naive B IFN-induced B Double-negative B g * 100 20 30 80 p * =0.056 s (%) s (%) s (%) l l l l l l 15 60 e e e

20 c c B B

f %) 10 o 40 ( of tion tion 10 ize c c

5 20 s Fra Fraction of B c Fraction Fra e 50

0 0 0 lon NPC NLH NPC NLH NPC NLH fc o e n CIBERSORT: GSE68799 tio

Malignant Non-malignant Frac 100 ** 0 p=0.09 Patient ID 46 7 8 9 10 1411 12 13 80 p=0.07 Shannon index 6.3 4.7 6.0 5.1 6.9 7.45.9 7.2 5.1 7.7 Clone size 60 >=3 =2 =1 h p=0.069 * 80 10 100 ) * ) (% 40 (%) 8 e 60 e(% z ze iz i i s s es e 6 n n 20 ne o o o Fraction of B cells (%)

l 50 l l 40 c fc fc 4 o n no nof o o 20 o i 0 i 2 ct ct cti a Plasma Double- Pan-B ra F Fra 0 Fr 0 0 negative NPC NLH NPC NLH NPC NLH Clone size>=3 Clone size=2 Clone size=1 advances suggested that NK cells could also become exhibited microenvironment remained immune activated and might after chronic activation, characterized by the upregulation of serve as a positive regulator in the immune response KIR, NKG2A, LAG3,andHAVCR243,44, but we did not observe against NPC. the up-regulation of these exhaustion markers in NK cells. Furthermore, NK cells differentially expressed chemokine- Correlation between immune dynamics and patient survival, encoding genes XCL1 and XCL2, which were previously – fi + and cell cell communications in the in ltrating immune cells. reported to recruit CCR5 DCs. Thus, NK cells in the NPC Based on our single-cell sequencing data, we generated an

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z

Fig. 6 Clonal expansion and enrichment of immune-activated and IFN-associated B cells is a tumor-specific characteristic in the NPC microenvironment. a UMAP plot of 27,506 B cells, color-coded by the associated cluster. b UMAP plot of 27,506 B cells, color-coded by the sample type. c The inter-patient distribution of B cell subpopulations, shown by the percentage of total B cells. d The relative abundance of significantly enriched subpopulations in the NPC (n = 11 biologically independent samples) and NLH (n = 3 biologically independent samples) microenvironment. Each dot represents one patient. The two-sided Student’s t test was used to determine the statistical significance, p values ≤0.05 were represented as *, ≤0.01 as **. Data are presented as mean values ±SD. e The estimated abundance of selected B cell subpopulations via CIBERSORTx in the NPC patients (n = 42 biologically independent samples) and normal patients (n = 3 biologically independent samples). The two-sided Student’s t test was used to determine the statistical significance, p values ≤ 0.05 were represented as *, ≤0.01 as **. f The GSEA hallmark pathways enriched in NPC-derived and NLH-derived B cells, ordered by -log10 FDR. (g) Distribution of B cell clones by size, with different Shannon index showing the clonal diversity for each patient. h The relative abundance of different clone sizes in the NPC-derived (n = 7 biologically independent samples) and NLH-derived (n = 3 biologically independent samples) B cells. Each dot represents one patient. The two-sided Student’s t test was used to determine the statistical significance, p values ≤ 0.05 were represented as *. Data are presented as mean values ± SD. Source data are provided as a Source Data file.

expression matrix containing 733 representative genes for selec- exhausted T cells in the TME exhibited a strong interaction with ted major subpopulations in T cells, B cells and myeloid cells, varied B cell subpopulations, primarily via the CXCL13–CXCR5 respectively. By using the annotated signature matrix, the RNA- axis (Fig. 9c). Notably, the NPC-derived exhausted T cells and sequencing data of 88 NPC patients was deconvoluted to estimate suppressive Tregs possessed a higher number of ligand-receptor the immune abundance via CIBERSORTx (Fig. 8a). The patients pairs, whereas the resting Tregs, double-negative B cells and were further clustered into three groups based on immune tissue-resident T cells possessed fewer pairs, compared to their abundance, with either activated tumor immune microenviron- non-malignant counterparts (Fig. 9d). ment (TIME, cluster 1 and 2) or inactivated TIME (cluster 3). The functional score for each patient was also calculated and Discussion normalized, which was later used to characterize the dynamic Together, we comprehensively deciphered the stromal composi- status of T cells (Fig. 8b). Subsequently, we evaluated the hazard tion of the NPC microenvironment at single-cell resolution. More ratio for each immune subtype and patient cluster. The result importantly, our in-depth analysis has successfully revealed exhibited that exhausted T cells were correlated with better phenotypic and molecular differences between stromal cells progression-free survival, whereas other T cell subtypes did not derived from the TME and those from the nonmalignant yield a significant correlation with survival (Fig. 8c). Moreover, a microenvironment. The results exhibited a substantial number of higher abundance of plasma B cells, DCs and macrophages was unique features in the NPC microenvironment that had not been also associated with a better prognosis. Consistent with the pre- previously reported, and highlight key characteristics for further vious studies, a higher percentage of double-negative B cells and implications in TME-targeted therapy and immunotherapy. MDSCs in NPC patients was predictive of worse progression-free Importantly, the NPC-specific characteristics and their relation- survival35,45. Thus, targeting double-negative B cells and MDSCs ship with patient stratification and survival significantly con- might serve as a promising therapeutic approach in NPC, but tribute to the development of personalized diagnostics and further investigations are needed to fully understand the mole- therapeutics. It can also serve as a public-available cohort where cular mechanism of these subtypes in the NPC microenviron- hypothesis from other cancers, especially EBV-associated malig- ment. In addition, patients with low T-immune scores or nancies, can be independently validated, and used to explore inactivated TIME were shown to have a worse prognosis. Cur- promising therapeutic targets and prognostic biomarkers. Our rently, clinical models for patient stratification, survival predic- findings were carefully examined in our single-cell validation tion and treatment evaluation in NPC remain lacking. The single- cohort, NPC single-cell cohort from another group, the two NPC cell-based devolution method incorporated with functional bulk RNA sequencing cohorts, and by staining of formalin-Fixed modules enhanced its reliability as an integrated and feasible paraffin-embedded tissues. Additionally, our findings, such as model to classify disease, predict survival and therapeutic out- stromal compositions and immune signatures, are highly con- come, but it remains necessary to validate our findings in larger sistent with the results from recent single-cell studies on NPC, clinical cohorts. indicating the reliability and reproducibility of our single-cell To depict the differences in molecular interaction between the data28,47. cells derived from the NPC microenvironment and those from In the present study, we have proven that the NPC micro- the NLH microenvironment, we applied CellPhoneDB to environment is indeed more complex and heterogeneous than construct a cell–cell communication network via known ligand- previously reported. In total, we analyzed 104,069 cells from 14 receptor pairs within the 38 identified subpopulations46. Since patients (66,627 in the detecting cohort and 37,442 in the vali- T cells and B cells were highly infiltrated in the microenviron- dation cohort) and revealed 38 distinct stromal populations, ment, we visualized the cellular communications within the major including fibroblasts, NK cells and different subtypes of T cells, B T and B subpopulations (Fig. 9a). The result showed that three cells and myeloid cells. Although each cell subpopulation exhib- clusters of exhausted T cells (HAVCR2, TOX, and LAG3) in the ited divergent pathway activities, IFN-γ and IFN-α responses TME preferentially interacted with memory B cells, innate-like B profoundly influenced the composition and function of the TME- cells, unactivated B cells, IFN-induced B cells, but not with residing cells. Nevertheless, the molecular mechanism of IFN- plasma B cells, naïve B cells and double-negative B cells, due to associated alterations remains largely unexplored. Further lack of interacting pairs in these subtypes. Furthermore, we understanding of the IFN- and immune-enriched NPC micro- identified specific ligand-receptor pairs that were constantly environment might fuel the advances in treatment strategies involved in T cell and B cell communication. We found that against NPC and other immunologically activated cancers. resting Tregs, compared to suppressive Tregs, had much fewer Subsequently, many NPC-specific signatures were found cor- interacting pairs with exhausted T cells, indicating that they were related to patient survival, indicating that the relative abundance a less dynamic and immunosuppressive subtype (Fig. 9b). of various stromal subpopulations and immune activation status Consistent with the findings above, the HAVCR2 and TOX in NPC patients directly influenced tumor development and

12 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE

a b C11-NFKB1 C4-MMP9 MDSCs TAMs C2-CCL20 TAMs

C10-S100A9 C1-FOLR2 MDSCs TAMs

C7-LAMP3 DCs C3-FCGR2B TAMS UMAP_2 UMAP_2 C9-LGALS2 DCs

C6-FCER1A Langerhans cells C5-STMN1 C8-IDO1 Proliferating DCs Malignant Non-malignant UMAP_1 UMAP_1 c d 25 50 75 Average 2 1 0 −1 expression C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 2 FOLR2 Average expression Percent Expressed 1 0 SLC40A1 -1 LGALS2 C11

S100A8 C10 CCL20 C1QC C1QA C1QB C9

C8

IDO1 C7 FCER1A

C6 S100A9 FCN1 C5

MMP9 STMN1 C4

C3 APOE

C2

TAMs DCs MDSCs Proliferating C1 e FA IL10 IDO1 CD14 CD1E VEG TGFB1 LGMN CD163 LAMP3 CD1C S100A9 S100A8FCGR3A MS4A4A LGALS2

Immunoregulatory MDSC TAM DC

Patient 1 Patient 2 Patient 3 Patient 4 Patient 5 g

Hallmark pathways enriched in NPC-derived myeloid cells

Patient 6 Patient 7 Patient 8 Patient 9 Patient 10 Complement system IFN-γ response DC MDSC KRAS signaling up TAM Coagulation Proliferating TNF-α signaling via NF-κB Patient 14 Patient 11 Patient 12 Patient 13 Inflammatory response IFN-α response Apoptosis f P53 pathway CIBERSORT: GSE68799 Hypoxia Malignant Non-malignant 0 5 10 15 20 25 ns. -log10(FDR) 100 Hallmark pathways enriched in NLH-derived myeloid cells * TNF-α signaling via NF-κB MYC target v1 Allograft rejection IL6 JAK STAT3 signaling 50 ns. MTORC1 signaling IFN-γ response Inflammatory response IL2 STAT5 signaling UV response up Fraction of myeloid cells (%) Hypoxia

0 0 5 10 15 20 25 -log (FDR) MDSC DC Macrophage 10

Fig. 7 The NPC microenvironment harbors suppressive myeloid subtypes with high patient-specific heterogeneity. a UMAP plot of 3,671 myeloid cells, color-coded by the associated cluster. b UMAP plot of 3,671 myeloid cells, color-coded by the sample type. c The expression matrix containing the top 50 enriched genes in each subpopulation defined in a, with a clustering tree categorizing related subpopulations into major subtypes. d The expression of functional signatures that were used to identified and characterized the subpopulations. e The inter-patient distribution of major myeloid subpopulations, shown by the percentage of total myeloid cells. f The estimated abundance of major myeloid subpopulations via CIBERSORTx (malignant n = 42 biologically independent samples, non-malignant n = 3 biologically independent samples). The two-sided Student’s t test was used to determine the statistical significance, p values > 0.05 were represented as ns. (not significant), p values ≤ 0.05 were represented as *. g The GSEA hallmark pathways enriched in NPC-derived and NLH-derived myeloid cells, ordered by −log10 FDR. Source data are provided as a Source Data file.

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 13 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z treatment outcome. The finding provides emerging opportunities Single-cell RNA and V(D)J sequencing. The single-cell encapsulation and library for using these signatures as biomarkers and targets for early preparation was done at the Centre for PanorOmic Sciences of the University of diagnosis and prognosis. Validating our findings in larger cohorts Hong Kong. The single-cell suspension was converted to uniquely barcoded RNA, TCR and BCR libraries by using the Chromium Single Cell 5′ Library, Gel Bead & and clinical trials remains necessary to further elucidate the Multiplex Kit, Chromium Single Cell V(D)J Enrichment Kit (Human T Cell), correlation between stromal composition and tumor progression. Chromium Single Cell V(D)J Enrichment Kit (Human B Cell), and Chromium However, some of the stromal composition remains highly vari- Single Cell Chip Kit (10× Genomics), as per manufacturer’s instructions. The able across the patients, and we can only confidently characterize libraries were sequenced on a NovaSeq 6000, and mapped to the human genome (hg38) using CellRanger (10× Genomics). Gene positions were annotated as per the subtypes that are representative and enriched in the TME. Ensembl build 85 and filtered for biotype (only protein-coding, long intergenic The minor subpopulations residing in the NPC microenviron- noncoding RNA, antisense, immunoglobulin, T-cell receptor, and B-cell receptor). ment might also have effects on altering the clinical outcome fi dramatically. Therefore, the in-depth identi cation and char- Sample aggregation and determination of the batch effect. The processed data acterization of minor subpopulations in the TME remains of 14 samples were aggregated by cellranger aggregation function with default necessary so that the single-cell sequencing technique should parameters, where the sequencing reads were normalized across samples. Since all proceed to become capable of analyzing finer parts of the libraries were generated by the same chemistry version and the data for all the samples were generated and processed at a similar time (within six months), no tumor mass. significant batch effect was observed in the aggregated data (Supplementary Most notably, we confirmed that the presence of a dynamic Figs. S8–S10). transitional process from activated T cells to exhausted T cells in the TME. The exhausted T cells, instead of being completely Quality control of the single-cell data and determination of major cell linea- dysfunctional, exhibited features that were associated with TLS ges. The raw gene expression matrix generated from each sample were aggregated formation via the B cell recruitment. Besides, we focused on the using CellRanger (version 3.1.0) and converted to a Seurat object using Seurat R package (version 3.0). The cells that had unique feature counts over 4000 or <200, increased abundance of suppressive Tregs in the NPC micro- fi environment and found promising signatures associated with the or >15% mitochondrial counts, were ltered. From the remaining 66,627 cells, the fl gene expression dataset was normalized and subsequently dimensionally reduced Treg activation. The gene regulatory network re ected the based on 3,000 differentially expressed genes and principal components (n = 25). importance of calcium channels in shaping the TME and affect- Major cell clusters projected in the two-dimensional UMAP representation were ing the immune response in NPC patients. Previous studies have annotated to known cell lineages using well-recognized marker genes. reported that calcium channel served as a switch that could turn immune cells into the activated mode48,49. We also confirmed the Subclustering of the T cell, B cell, and myeloid populations. To identify sub- prevalence of B cell subpopulations in the NPC microenviron- populations within these major cell clusters, we performed the second-round ment that has not been previously recognized. The presence of B- UMAP reduction with the resolution of 1 on cells belonging to each of the major cell clusters separately. The number of principal components in each major subtype cell signatures and the abundance of B cell subpopulations were was independently determined by the Elbowplot function implemented in Seurat significantly correlated to the disease progression of NPC v3. In addition, the doublets in each cell cluster were identified by graph-based patients. We hypothesized that B cells might also play a vital role clustering and R package DoubletFinder50 and filtered out from the analysis. The in regulating the outcome of radio-chemotherapy and immu- second-round UMAP reduction revealed 36 distinct subpopulations in total in the T cells, B cells, and myeloid cells. notherapy. Moreover, the myeloid cells were mainly derived from the TME, indicating the recruitment of myeloid cells was an NPC-specific event. The MDSCs and TAMs in the TME served as Clustering validation of T cell and B cell subpopulations via the random forest. To validate the presence of the T cell and B cell subpopulations in the a connecting bridge to facilitate immune suppression, since many microenvironment, we collected additional tissues from patients 9-13 and analyzed signature genes in these subtypes have been previously found them via the aforementioned single-cell sequencing protocol. We established a associated with exhaustion, cell cycle arrest of immune cells. In 37,442-cell cohort and clustered the dataset based on the same method used in the summary, the single-cell data from our clinical cohort and the discovery cohort. To assess to which of the T and B cell subpopulations these cells correspond, we generated a Random Forrest classifier using modified BuildRF- bulk RNA data from the GEO cohort provide a vital and unique Classifier for Seurat (version 3.0) in the discovery cohort and classify the cells based insight into the identification and characterization of stromal on this classifier by ClassifyCells in the validation cohort. The correlations between landscape in the NPC microenvironment and are likely to sti- clusters in the discovery and validation cohort were determined. The random forest mulate promising research in the future. classifier assigned the 37,442 cells from the validation set of five patients to the 25T and B cell subpopulations identified in the discovery cohort of 14 patients.

Method Identification of differentially expressed genes between the NPC and NLH Enrollment of NPC and NLH patients. The study was approved by the ethics microenvironment. To identify the differentially expressed genes, we applied committee at the University of Hong Kong and the University of Hong Kong- MAST analysis to assess genetic alterations between NPC-derived cells and NLH- Shenzhen Hospital. We complied with all related ethical regulations. Written derived cells, which can effectively eliminate the large deviation between cells in informed consent was obtained from all patients with primary NPC and non- heterogenous single-cell sequencing data. The genes with false-discovery rate ≤ fi malignant NLH for their tissues to be used in this study. (FDR) 0.05 were considered signi cantly different in the NPC and NLH microenvironment. Significantly up-regulated genes were required to have an ≥ average expression in the NPC microenvironment that was at least 0.5 log2-fold higher than the average expression in the NLH microenvironment. Preparation of the single-cell suspension. After the endoscopic biopsy, the fresh tissue samples (2–3mm3) from NPC and NLH patients were rinsed with phosphate-buffered saline (PBS) on ice. Subsequently, each sample was placed into Survival correlation to the bulk RNA sequencing data in GEO. To further the 500 µL dissociation medium containing 0.5 mg/mL collagenase IV (Sigma) and investigate the survival outcomes of identified signature-genes in our single-cell 1 mg/mL DNAse I (Sigma) in RPMI-1640 (ThermoFisher Scientific). Samples were sequencing data, we assessed their expression in bulk RNA sequencing data from minced in the dissociation medium on ice and then incubated for 30 min at 37 °C, GEO (GSE102349). There are 113 NPC patients enrolled in the study. Subse- with manual vortexing every 10 min. Then, 1 mL cold RPMI-1640 containing 10% quently, the read counts were generated by HTSeq (version 0.9.1) and normalized fetal bovine serum (FBS, ThermoFisher Scientific) was added and each sample was by DESeq2 (version 1.22.2). The quality of the RNAseq data was evaluated by filtered using 70-µm nylon mesh (ThermoFisher Scientific), and subsequently fil- Picard metrics (version 2.17.4) and RSeQC (version 2.6.4). Only the samples with tered again using 40-µm nylon mesh (ThermoFisher Scientific). The samples were at least 60% of reads mapped to coding regions were included. One sample which centrifuged at 300×g for 5 min at 4 °C, and the supernatant was discarded. The failed to pass the quality control and 24 samples without complete follow-up data single cells were resuspended in 1 mL ACK lysis buffer (ThermoFisher Scientific) were removed from the downstream analysis. The remaining 88 patients were and incubated for 5 min. 5 mL cold RPMI-1640 containing 10% FBS was added included for the subsequent survival analysis. The significance of progression-free and the cell mixture was centrifuged at 300×g for 5 min at 4 °C. The single-cell survival based on signature genes and functional scores (binary: high vs. low) was pellet was resuspended in PBS without calcium and magnesium ions to reach the evaluated by the two-sided log-rank test. To assess the correlations between density ≤1000 cells/µL. functional scores and immune abundance with the prognosis of the NPC patients,

14 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE

a Cluster 1,2- Activated TIME 3- Inactivated TIME

EBV reads Grade Stage 4 Cluster Normalized abundance 3 Naïve T 2 1 Double negative B 0

MDSC −1 −2 Conventional B

DC EBV reads 35000 Plasma B

5000 Treg Grade Macrophage 0 1 2 Effector T 3

Exhausted T Stage 0 1 2 Cluster 1-High T-immune score 2-Low T-immune score b 3 4

EBV reads 2 Grade Stage Normalized score Cluster 1

Cytotoxic score 0

Exhaustion score −1

Naïve score −2

Treg score

c GSE102349 Better prognosis Worse prognosis Hazard ratio (95% CI) p-value Regulatory T cell 0.93 ( 0.34-2.48) 0.886

Effector T cell 0.60 ( 0.22-1.66) 0.325

Naïve T cell 1.05 ( 0.40-2.81) 0.918

Exhausted T cell 0.34 ( 0.12-0.93) 0.035* Plasma B cell 0.30 ( 0.11-0.87) 0.027*

Double negative B cell 2.79 ( 0.97-8.00) 0.058+ Conventional B cell 1.07 ( 0.40-2.85) 0.893

Myeloid-derived suppressor cell 2.45 ( 0.85-7.09) 0.097+ Dendridic cell 0.36 ( 0.13-0.98) 0.046*

Macrophage 0.20 ( 0.06-0.72) 0.013* Low T-immune score 4.09 ( 1.45-11.50) 0.008* Inactivated TIME 2.16 ( 0.80-5.81 0.130+

-5 0 5

Log2 (hazard ratio)

Fig. 8 Deconvolution analysis and functional modules reveal dynamic immune signatures that are associated with patient prognosis. a Heatmap of the normalized immune abundance and clinical parameters in 88 NPC patients estimated by CIBERSORTx. Patients were clustered into three groups, representing those with activated TIME (cluster 1 and 2) or inactivated TIME (cluster 3). b Heatmap of the normalized functional scores and clinical parameters in 88 NPC patients calculated by our linear model. Patients were clustered into two groups, representing those with high T-immune score (cluster) or low T-immune score (cluster 2). c The correlation between estimated subpopulations, patient clusters and progression-free survival in 88 NPC patients. The log2 hazard ratio and the associated p-value was evaluated by the Cox proportional hazards model with 95% CI. p values ≤ 0.05 were considered statistically significant in prognosis, whereas p value > 0.05 and ≤ 0.15 were considered marginally significant in prognosis and represented as +. Source data are provided as a Source Data file.

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 15 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z

a HAVCR2 LAG3 TOX Helper T -exhausted T -exhausted TRM -exhausted T Suppressive Treg Resting Treg Naive B IFN-naive B Innate-like B Unactivated B Double-negative B IFN-B Memory B Plasma B

Major T-B communication in the NPC microenvironment Major T-B communication in the NLH microenvironment

b c Resting Treg Suppressive Treg HAVCR2-exhausted T TOX-exhausted T IL2 receptor I-IL2 IL2 receptor I-IL2 -log10 (P-value) -log10 (P-value) IL10 receptor-IL10 0 IL10 receptor-IL10 0 CXCL13-ACKR4 1 1 2 CXCL13-ACKR4 2 CXCL13-CXCR5 3 CXCL13-CXCR5 3 CCR6-CCL20 CCR6-CCL20 CXCR3-CCL20 Log mean CXCR3-CCL20 2 Log2 mean IFNA5-Type I IFNR (Molecule 1, Molecule 2) IFNA5-Type I IFNR (Molecule 1, Molecule 2) IFNA2-Type I IFNR 0 IFNA2-Type I IFNR 2 IFNG-Type II IFNR -2 IFNG-Type II IFNR 0 LGALS9-HAVCR2 -4 LGALS9-HAVCR2 -2 -4 PDCD1-CD274 -6 PDCD1-CD274 -6 EGFR-TGFB1 EGFR-TGFB1 TGFB1-TGFBR2 TGFB1-TGFBR2 TGFB1-TGFBR1 TGFB1-TGFBR1 TNFSF9-TNFRSF9 TNFSF9-TNFRSF9 TNFSF4-TNFRSF4 TNFSF4-TNFRSF4

NPC microenvironment TNF-ICOS TNF-ICOS CD27-CD70 CD27-CD70 CTLA4-CD80 CTLA4-CD80 CTLA4-CD86 CTLA4-CD86

IL2 receptor I-IL2 IL2 receptor I-IL2 IL10 receptor-IL10 IL10 receptor-IL10 CXCL13-ACKR4 CXCL13-ACKR4 CXCL13-CXCR5 CXCL13-CXCR5 CCR6-CCL20 CCR6-CCL20 CXCR3-CCL20 CXCR3-CCL20 IFNA5-Type I IFNR IFNA5-Type I IFNR IFNA2-Type I IFNR IFNA2-Type I IFNR IFNG-Type II IFNR IFNG-Type II IFNR LGALS9-HAVCR2 LGALS9-HAVCR2 PDCD1-CD274 PDCD1-CD274 EGFR-TGFB1 EGFR-TGFB1 TGFB1-TGFBR2 TGFB1-TGFBR2 TGFB1-TGFBR1 TGFB1-TGFBR1 TNFSF9-TNFRSF9 TNFSF9-TNFRSF9 TNFSF4-TNFRSF4 TNFSF4-TNFRSF4

NLH microenvironment TNF-ICOS TNF-ICOS CD27-CD70 CD27-CD70 CTLA4-CD80 CTLA4-CD80 CTLA4-CD86 CTLA4-CD86 M M T T Ex Ex RM EM C R - -B -T Ex Ex -B RM EM RM CM -T reg reg T T T T d -T -T N - T T T er- T -T -T T 2 X d p IFN IFN IFN g R EBV-B IFN EBV-B OX n vate Naive B i i T Naive B -naive B -naive B Innate B Innate B Hel TOX Helper-T VC Plasma B Plasma B ct Memory B Memory B auste est A A IFN IFN h Activated-T H R x HAVCR2 Resting Treg Proliferating B Unactivated B Proliferating B Unactivated B E Exhausted-T uppressive Doubel-negative Doubel-negative S Germinal-center B Germinal-center B d Treg Suppressive NPC microenvironment NLH microenvironment 2500

2000

1500

1000

500

Total No. of ligand-receptor pairs ligand-receptor of No. Total 0 TOX HAVCR2 LAG3 Suppressive Resting Double Tissue-resident T Exhausted Exhausted Exhausted Treg Treg Negative B

Fig. 9 Single-cell referenced communication analysis exhibits tumor-specific ligand–receptor interactions within T and B cell subpopulations. a The cell-cell communications within major T and B cell subpopulations in the NPC microenvironment and NLH microenvironment, shown by chord diagram, colored by cell subtypes. The size of the arc represented the total number of ligand–receptor pairs, and the size of the flow represented the number of ligand-receptor pairs between two cell subtypes. b Overview of selected ligand-receptor interactions between resting Treg/suppressive Treg and functional T cell subpopulations, c between HAVCR2 exhausted T cells/TOX exhausted T cells and all B cell subpopulations. p values were calculated using the two-sided Student’s t test with 95% CI and refer to the enrichment of the interacting ligand-receptor pair in each of the interacting pairs of cell types.

Log2 mean referred to the strength of a specific ligand-receptor connection in the corresponding cell types. d The total number of ligand-receptor pairs in the NPC-derived and NLH-derived T and B subpopulations. Source data are provided as a Source Data file.

16 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z ARTICLE we applied a Cox proportional hazards model in IBM SPSS Statistics 22 to evaluate ≤0.01 as **, ≤0.001 as ***, ≤0.0001 as ****. p values were adjusted (FDR) for the hazard ratio between these signatures and patient prognosis. multiple hypothesis testing in the MAST analysis (implemented in R version 3.6) and GSEA analysis (GSEA v4.0.3). In survival analysis, p values were evaluated by the two-sided log-rank test (survival analysis based on signature genes and func- Characterization of functional modules in T cells. We identified the most tional scores) and the Cox proportional hazards model (hazard ratio). representative genes associated with naïve (CCR7), cytotoxic (NKG7), exhausted (LAG3), and Treg (FOXP3) modules. We later constructed gene profiles containing the top ten genes that were mostly correlated to these representative signatures in Data availability fi naïve, cytotoxic, exhausted and Treg subpopulations. Based on these gene pro les, The raw and processed single-cell sequencing data are publicly available in Gene we subsequently established generalized linear models in naïve T cells, cytotoxic Expression Omnibus (GEO) with the accession number GSE150825. T cells, exhausted T cells, and Tregs using glm function in R version 3.6. During the (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?&acc=GSE150825) linear model construction, we used the binomial family function because the The publicly available NPC bulk RNA sequencing data used in the study are available response variable is categorical, indicating that the cells belong to certain clusters or in GEO with the accession number GSE68799 and GSE102349. not (1 stands for yes and 0 stands for no). The generalized linear models assigned a = fi (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?&acc GSE68799, https://www.ncbi. distinct parameter to each gene in the pro le based on its expression weight. nlm.nih.gov/geo/query/acc.cgi?&acc=GSE102349) Subsequently, the functional scores were calculated based on the parameters esti- The publicly available NPC single-cell RNA sequencing data used in the study are mated from the model, which represented the relative abundance of certain available in GEO with the accession number GSE150430. cell types. (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE150430) The patient information for single-cell sequencing and bulk RNA sequencing is Construction of gene regulatory network. Top 30 upregulated genes in the NPC- available in Supplementary Tables S1–S3. All data supporting the findings of the study derived T cells and myeloid cells were extracted by MAST based on the afore- are available within the Article, Supplementary Information or available from the mentioned protocol. Their up-stream transcription factors were predicted in corresponding author upon request. Source data are provided with this paper. RegNetwork: Regulatory Network Repository (http://regnetworkweb.org/). Only the transcription factors with high confidence were collected. The genetic and transcriptional interaction was predicted by GeneMANIA, implemented in Code availability Cytoscape. The gene regulatory network was constructed using Cytoscape. The codes used to perform clustering and functional analysis via Seurat is available on Github at https://github.com/LanqiGong/NPC-and-NLH-single-cell-analysis. Clonotype diversity and motif analysis. We characterized the clonotypes of the TCR β chain and the BCR IGH chain in each patient. We used Shannon Index and Received: 11 May 2020; Accepted: 8 February 2021; dominance index (count of the maximum clone/sum count of all clones) to eval- uate the diversity of clonotype in each patient. To find the motifs for CDR3 amino acid sequence, we applied MoDec tool to perform motif deconvolution in the T cells ad B cells derived from the NPC and NLH microenvironment, respectively. The optimal number of motifs (6) was selected based on the Akaike Information Criterion (AIC). References 1. Chen, Y. P. et al. Nasopharyngeal carcinoma. Lancet 394,64–80 (2019). Pseudotime developmental trajectory of T cell, B cell, and myeloid popula- 2. Jain, A., Chia, W. K. & Toh, H. C. Immunotherapy for nasopharyngeal cancer- tions. We applied the Monocle v2 to determine the potential development lineages a review. Chin. Clin. Oncol. 5, 22 (2016). between the T cell, B cell, and myeloid subpopulations. The differentially expressed 3. Chan, K. C. A. et al. Analysis of plasma Epstein-Barr Virus DNA to screen for genes across the clusters were identified by FindMarkers in Seurat for T cell, B cell, nasopharyngeal cancer. N. Engl. J. Med. 377, 513–522 (2017). and myeloid subpopulations, respectively. The cells were ordered in pseudotime, 4. Zhang, L., Chen, Q. Y., Liu, H., Tang, L. Q. & Mai, H. Q. Emerging treatment fi where the best trajectory tree was t after reduction of dimentionality of the data by options for nasopharyngeal carcinoma. Drug Des. Devel. Ther. 7,37–52 Reversed Graph Embedding algorithm. The proliferating cluster was excluded from (2013). the analysis in three subpopulations because it is a heterogenous cluster containing 5. Lee, A. W. M. et al. A multicenter, phase 3, randomized trial of concurrent varied subtypes of proliferating cells. chemoradiotherapy plus adjuvant chemotherapy versus radiotherapy alone in patients with regionally advanced nasopharyngeal carcinoma: 10-year Estimation of immune abundances from RNA-sequencing data via CIBER- outcomes for efficacy and toxicity. Cancer 123, 4147–4157 (2017). SORTx. Based on our single-cell sequencing data, we selected several representative 6. Hsu, C. et al. Safety and antitumor activity of pembrolizumab in patients with clusters in T cells (Naïve T, effector T, exhausted T and Treg), B cells (Plasma B, programmed death-ligand 1-positive nasopharyngeal carcinoma: results of the double-negative B and pan-B), and myeloid cells (MDSCs, DCs, and macrophages), keynote-028 study. J. Clin. Oncol. 35, 4050–4056 (2017). respectively. We subsequently generate a gene expression matrix containing 733 7. Binnewies, M. et al. Understanding the tumor immune microenvironment genes that characterized these clusters. Using the signature matrix as the reference, (TIME) for effective therapy. Nat. Med. 24, 541–550 (2018). CIBERSORTx (https://cibersortx.stanford.edu/) deconvoluted bulk RNA- 8. Basso, S. et al. T cell therapy for nasopharyngeal carcinoma. J. Cancer 2, sequencing data obtained from two GEO cohorts into the subpopulation abun- 341–346 (2011). dances in each patient. Since the signature patterns of effector T cells and 9. Cao, Y. EBV based cancer prevention and therapy in nasopharyngeal exhausted T cells were highly similar, CIBERSORTx could not adequately distin- carcinoma. NPJ Precis. Oncol. 1, 10 (2017). guish these two subtypes. Thus, we applied an unsupervised linear regression to 10. Helmink, B. A. et al. B cells and tertiary lymphoid structures promote calculate the relative percentage of effector and exhausted subpopulations based on immunotherapy response. Nature 577, 549–555 (2020). CIBERSORTx data and functional scores. The deconvoluted information of each 11. Mantovani, A., Schioppa, T., Porta, C., Allavena, P. & Sica, A. Role of tumor- fi patient was used to validate our ndings from single-cell sequencing and further associated macrophages in tumor progression and invasion. Cancer Metastasis correlated with progression-free survival in NPC patients. Rev. 25, 315–322 (2006). 12. Petitprez, F. et al. B cells are associated with survival and immunotherapy Constructing the molecular interaction network via CellPhoneDB. CellPho- response in sarcoma. Nature 577, 556–560 (2020). neDB (version 2.0.0, https://github.com/Teichlab/cellphonedb) used the cluster 13. Tripathi, M., Billet, S. & Bhowmick, N. A. Understanding the role annotation and counts from our single-cell transcriptomics data to compute cell- of stromal fibroblasts in cancer progression. Cell Adh. Migr. 6, 231–235 cell communication within the identified cell subtypes. The default ligand-receptor (2012). pair information was used in this process. The p value ≤ 0.05 indicated significant 14. Giraldo, N. A. et al. The clinical role of the TME in solid cancer. Br. J. Cancer enrichment of the interacting ligand-receptor pair in each of the interacting pairs of 120,45–53 (2019). cell subpopulations. Log2 mean referred to the log2-transformed total mean of the 15. Stuart, T. et al. Comprehensive Integration of Single-. Cell Data. Cell 177, individual partner average expression values in the corresponding interacting pairs 1888–1902 e21 (2019). of cell subpopulations. 16. Huang, Y. T. et al. Profile of cytokine expression in nasopharyngeal carcinomas: a distinct expression of 1 in tumor and CD4+ T cells. – Statistical analysis. Statistical analysis was performed and plots were generated Cancer Res. 59, 1599 1605 (1999). using GraphPad Prism 8 and IBM SPSS Statistics 22, and mentioned in the figure 17. Zhang, Y. L. et al. Different subsets of tumor infiltrating lymphocytes legends. All data points are shown for bar plots with a sample size <10. For larger correlate with NPC progression in different ways. Mol. Cancer 9,4 sample sizes, violin plots are used to visualize the data distribution. Dara are (2010). presented as the mean values ± SD. For normally distributed data, p values were 18. Lu, J. et al. Detailed analysis of inflammatory cell infiltration and the evaluated by the two-sided Student’s t test. p values > 0.05 were considered not prognostic impact on nasopharyngeal carcinoma. Head. Neck 40, 1245–1253 statistically significant and represented as ns., p-values ≤0.05 were represented as *, (2018).

NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications 17 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-21795-z

19. Yang, Z. Z. et al. Expression of LAG-3 defines exhaustion of intratumoral PD- 45. Li, Z. L. et al. COX-2 promotes metastasis in nasopharyngeal carcinoma by 1(+) T cells and correlates with poor outcome in follicular lymphoma. mediating interactions between cancer cells and myeloid-derived suppressor Oncotarget 8, 61425–61439 (2017). cells. Oncoimmunology 4, e1044712 (2015). 20. Gorman, J. V. & Colgan, J. D. Regulation of T cell responses by the receptor 46. Efremova, M., Vento-Tormo, M., Teichmann, S. A. & Vento-Tormo, R. molecule Tim-3. Immunol. Res. 59,56–65 (2014). CellPhoneDB: inferring cell-cell communication from combined expression 21. Khan, O. et al. TOX transcriptionally and epigenetically programs CD8(+)T of multi-subunit ligand-receptor complexes. Nat. Protoc. 15, 1484–1506 cell exhaustion. Nature 571, 211–218 (2019). (2020). 22. Newman, A. M. et al. Determining cell type abundance and expression 47. Zhao, J. et al. Single cell RNA-seq reveals the landscape of tumor and from bulk tissues with digital cytometry. Nat. Biotechnol. 37, 773–782 infiltrating immune cells in nasopharyngeal carcinoma. Cancer Lett. 477, (2019). 131–143 (2020). 23. Thommen, D. S. et al. A transcriptionally and functionally distinct PD-1(+) 48. Vig, M. & Kinet, J. P. Calcium signaling in immune cells. Nat. Immunol. 10, CD8(+) T cell pool with predictive potential in non-small-cell lung cancer 21–27 (2009). treated with PD-1 blockade. Nat. Med. 24, 994–1004 (2018). 49. Prakriya, M. & Lewis, R. S. Store-operated calcium channels. Physiol. Rev. 95, 24. Workel, H. H. et al. A transcriptionally distinct CXCL13(+)CD103(+)CD8 1383–1436 (2015). (+) T-cell population is associated with B-cell recruitment and neoantigen 50. McGinnis, C. S., Murrow, L. M. & Gartner, Z. J. DoubletFinder: doublet load in human cancer. Cancer Immunol. Res. 7, 784–796 (2019). detection in single-cell RNA sequencing data using artificial nearest neighbors. 25. Alhabbab, R. et al. Galectin-1 is required for the regulatory function of B cells. Cell Syst. 8, 329–337 e4 (2019). Sci. Rep. 8, 2725 (2018). 26. Camby, I., Le Mercier, M., Lefranc, F. & Kiss, R. Galectin-1: a small protein with major functions. Glycobiology 16, 137R–157R (2006). Acknowledgements 27. Garin, M. I. et al. Galectin-1: a key effector of regulation mediated by CD4+ This work was supported by grants from the Hong Kong Research Grant Council (RGC) CD25+ T cells. Blood 109, 2058–2065 (2007). grants including GRF (17143716), Collaborative Research Funds (C7065-18GF and 28. Chen, Y. P. et al. Single-cell transcriptomics reveals regulators underlying C7026-18GF), Theme-based Research Scheme (T12-704/16-R), National Key Sci-Tech immune cell diversity and immune subtypes associated with prognosis in Special Project of Infectious Diseases (2013ZX10002-011-005), and The Shenzhen Pea- nasopharyngeal carcinoma. Cell Res. 30, 1024–1042 (2020). cock team project (KQTD2015033117210153 and KQTD2018041118502879), X.Y.G. is 29. Zhang, L. et al. Genomic analysis of nasopharyngeal carcinoma reveals TME- the Sophie YM Chan Professor in Cancer Research. based subtypes. Mol. Cancer Res. 15, 1722–1732 (2017). 30. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Author contributions Census. Nat. Methods 14, 309–315 (2017). X.G. designed and supervised the study. L.G., W.D., and X.G. wrote the manuscript. D.L.K., 31. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are P.W., and S.L. supervised the sample collection. L.G., Q.Y., Y.Z., X.F., L.L., M.L., and B.L. revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, performed the related experiments. L.G., W.D., B.Z., L.K.C., and Q.C. contributed to the 381–386 (2014). data analysis. A.W.L. and Y.W. collected the clinical information and performed the 32. de Groen, R. A., Groothuismink, Z. M., Liu, B. S. & Boonstra, A. IFN-lambda pathological examination. J.H., V.H.L., K.L., Z.C. and A.W. Lee contributed to the inter- is able to augment TLR-mediated activation and subsequent function of pretation of clinical information. All of the authors have read and approved the manuscript. primary human B cells. J. Leukoc. Biol. 98, 623–630 (2015). 33. Hervas-Stubbs, S. et al. Direct effects of type I interferons on cells of the immune system. Clin. Cancer Res. 17, 2619–2627 (2011). Competing interests 34. Ivashkiv, L. B. & Donlin, L. T. Regulation of type I interferon responses. Nat. The authors declare no competing interests. Rev. Immunol. 14,36–49 (2014). 35. Centuori, S. M. et al. Double-negative (CD27(-)IgD(-)) B cells are expanded in Additional information NSCLC and inversely correlate with affinity-matured B cell populations. J. Supplementary information Transl. Med 16, 30 (2018). The online version contains supplementary material 36. Tang, K. F. et al. A distinct expression of CC by macrophages available at https://doi.org/10.1038/s41467-021-21795-z. in nasopharyngeal carcinoma: implication for the intense tumor infiltration by Correspondence T lymphocytes and macrophages. Hum. Pathol. 32,42–49 (2001). and requests for materials should be addressed to X.-Y.G. 37. Yu, Y. et al. Elevated levels of TNF-alpha and decreased levels of CD68- Peer review information Nature Communications thanks Terence Speed, Jianjun Zhang positive macrophages in primary tumor tissues are unfavorable for the and the other, anonymous, reviewer(s) for their contribution to the peer review of this survival of patients with nasopharyngeal carcinoma. Technol. Cancer Res. work. Peer reviewer reports are available. Treat. 18, 1533033819874807 (2019). 38. Nilsson, J. S., Abolhalaj, M., Lundberg, K., Lindstedt, M. & Greiff, L. Dendritic Reprints and permission information is available at http://www.nature.com/reprints cell subpopulations in nasopharyngeal cancer. Oncol. Lett. 17, 2557–2561 (2019). Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in 39. Chavez-Galan, L., Olleros, M. L., Vesin, D. & Garcia, I. Much More than M1 published maps and institutional affiliations. and M2 Macrophages, there are also CD169(+) and TCR(+) macrophages. Front. Immunol. 6, 263 (2015). 40. Van Overmeire, E., Laoui, D., Keirsse, J., Van Ginderachter, J. A. & Sarukhan, A. Mechanisms driving macrophage diversity and specialization in distinct Open Access This article is licensed under a Creative Commons tumor microenvironments and parallelisms with other tissues. Front. Attribution 4.0 International License, which permits use, sharing, Immunol. 5, 127 (2014). adaptation, distribution and reproduction in any medium or format, as long as you give 41. Chen, J. et al. Overexpression of alpha-sma-positive fibroblasts (CAFs) in appropriate credit to the original author(s) and the source, provide a link to the Creative nasopharyngeal carcinoma predicts poor prognosis. J. Cancer 8, 3897–3902 Commons license, and indicate if changes were made. The images or other third party (2017). material in this article are included in the article’s Creative Commons license, unless 42. Wang, S. et al. Relationships of alpha-SMA-positive fibroblasts and SDF-1- indicated otherwise in a credit line to the material. If material is not included in the positive tumor cells with neoangiogenesis in nasopharyngeal carcinoma. article’s Creative Commons license and your intended use is not permitted by statutory Biomed. Res. Int. 2014, 507353 (2014). regulation or exceeds the permitted use, you will need to obtain permission directly from 43. Tarazona, R. et al. Current progress in NK cell biology and NK cell- the copyright holder. To view a copy of this license, visit http://creativecommons.org/ based cancer immunotherapy. Cancer Immunol. Immunother. 69, 879–899 licenses/by/4.0/. (2020). 44. Zhang, C. et al. NKG2A is a NK cell exhaustion checkpoint for HCV persistence. Nat. Commun. 10, 1507 (2019). © The Author(s) 2021

18 NATURE COMMUNICATIONS | (2021) 12:1540 | https://doi.org/10.1038/s41467-021-21795-z | www.nature.com/naturecommunications