Resolving the Spatial and Cellular Architecture of Lung Adenocarcinoma by Multi-Region Single-Cell Sequencing
SUPPLEMENTARY DATA FILE 2
Supplementary Figures S1-S7

Supplementary Fig. S1
[Figure: flowchart of the bioinformatics workflow. Boxes in reading order: Count matrix; Filtering (removal of doublets, low-quality cells, and cells with low-complexity libraries/debris); Batch effects evaluation and correction; Unsupervised clustering, UMAP visualization, and major cell lineage annotation; Hierarchical relationship analysis. The workflow then splits into EPCAM- fractions and EPCAM+ fractions, each followed by subclustering and cell subset annotation and by confirmation of subclustering robustness (one box reads "for specific EPCAM- subsets"). Remaining boxes: Stromal; Immune (Lymphoid, Myeloid); inferCNV analysis and clustering; Differential gene expression analysis; Single-cell trajectory analysis of specific subpopulations; Analysis of cell states and signatures; Cell-cell communication analysis.]
Supplementary Fig. S1. Schematic view of the bioinformatics analysis workflow. An overview of the quality control steps and analyses performed.

Supplementary Fig. S2
[Figure: panel A, three per-sample plots: # of cells (log scale, 10 to 10k), fraction of mito-genes (0.00 to 0.15), and # of detected genes (0 to 6k). Samples are grouped by patient (P1 to P5), spatial location (1, LUAD; 2, Adj; 3, Int; 4, Dis), and fraction (Ep+ or Ep-). Panels B and C, UMAP plots (n = 186,916) colored by patient (P1 to P5) and by batch (PM1127, PM1178, PM1157_1158, SC153, SC172, SC174, PM1137). Panel D, bubble plot across the Epithelial, Lymphoid, Myeloid, and Stromal lineages; circle size, % expressed (0 to 100); color, scaled expression (-1.0 to 1.0). Genes shown: LAMP3, KRT8, NKX2-1, EPCAM, KRT17, TP63, KRT5, PTPRC, GZMB, GZMA, MZB1, XCL2, CD8B, CD8A, GZMK, XCL1, CD40LG, SLAMF7, TRGV8, NCAM1, KLRC1, NCR1, CD19, TRDC, MS4A1, LILRB4, ITGAX, S100A8, S100A9, CD14, CD4, HLA-DPB1, HLA-DRB1, CD68, FCER1G, CD1C, TPSB2, TPSAB1, MS4A2, LILRA4, CLEC9A, CLDN5, PECAM1, PDGFRA, SERPINF1, ACTA2, COL1A2, PDGFRB, COL1A1.]
Supplementary Fig. S2. Quality control metrics and expression of major cell lineage markers across the spatial LUAD scRNA-seq dataset. A, Statistical summary of cells passing quality control (QC), showing cell number (left), fraction of mitochondrial genes (middle), and number of detected genes (right) per sample. Dis, distant normal; Int, intermediate normal; Adj, adjacent normal; LUAD, tumor tissue. B-C, UMAP plots showing cells colored by patient ID (B) and library/sequencing batch (C).
D, Bubble plot showing the percentage of cells expressing lineage markers (indicated by the size of the circle) as well as their scaled expression levels (indicated by the color of the circle) across all cells (related to main Fig. 1D and Fig. 1E).

Supplementary Fig. S3
[Figure: panel A, UMAP plots of cells colored by major lineage (Endothelial, Epithelial, Lymphoid, Myeloid, Stromal) for Harmony and rPCA, and a cluster-overlap heatmap colored by Jaccard index (0.00 to 1.00). Ratio of overlap: 98.4%. Cell counts printed in the heatmap:]

rPCA cluster              Endothelial  Epithelial  Lymphoid  Myeloid  Stromal
Stromal_C17.C14                     8          11         3        4     5997
Myeloid_C2.C6..C22                  0         177       837    34020        0
Lymphoid_C1.C3...C10               12           1     39510      319        9
Epithelial_C0.C4..C18               0       52325         0       19        0
Endothelial_C9.C21               6932           0         0        0        2

[Figure: panel B, UMAP plots and cluster-overlap heatmaps (Jaccard index, 0 to 1) for 25%, 50%, and 75% of cells sampled; axes labeled "Cell type (100% cells)" and "Cell type (random sampling)". Cell counts printed in the heatmaps:]

25% cells sampled (ratio of overlap: 99.5%)
Cluster                               Endothelial  Epithelial  Lymphoid  Myeloid  Stromal
Stromal_C13.C15                                 1           0         0        0     1964
Myeloid_C1.C6.C11.C12.C18.C20                   0           0         0    10978        1
Lymphoid_C3.C5.C7.C19.C14.C9                    4           1     13202       22        2
Epithelial_C0.C2.C10.C4.C16                     9       17467         0        0       18
Endothelial_C8.C21                           2410           0         0        0        2

50% cells sampled (ratio of overlap: 99.8%)
Cluster                               Endothelial  Epithelial  Lymphoid  Myeloid  Stromal
Stromal_C14.C16                                10           0         0        0     3937
Myeloid_C10.C13.C3.C6.C20.C12                   0           0         1    22020        1
Lymphoid_C2.C4.C7.C21.C17.C9                    2           3     26573       10        1
Epithelial_C0.C1.C11.C5.C15.C18                 5       34880         0        1        3
Endothelial_C8.C22                           4599           0         0        0        4

75% cells sampled (ratio of overlap: 99.7%)
Cluster                               Endothelial  Epithelial  Lymphoid  Myeloid  Stromal
Stromal_C14.C17                                12           0         0        0     6003
Myeloid_C4.C13.C6.C10.C20.C12.C24               0           0         1    33237        1
Lymphoid_C2.C3.C7.C22.C16.C8                    1           7     39536        7        1
Epithelial_C0.C1.C5.C11.C15.C18.C25             4       52355         0        1        1
Endothelial_C9.C21.C23                       6935           0         0        0        2

[Figure: panel C, cluster-overlap heatmap (Jaccard index, 0 to 1); axes labeled "Cell type (100% cells)" and "Cell type (K-means clustering)". Ratio of overlap: 97.0%. Cell counts printed in the heatmap:]

Cluster     Endothelial  Epithelial  Lymphoid  Myeloid  Stromal
C2                    1           0         0        0      782
C1                    0           0         2     4002        0
C7                   36          12      5666      896       12
C3.C4.C6              1        7345         0        4        0
C5                  869           0         0        0        0

Supplementary Fig. S3. Analysis of clustering robustness of major cellular lineages. A, UMAP plots showing cells colored by major lineages and identified with Harmony (left) or rPCA (middle). The right part of panel A shows a heatmap depicting the extent of cluster assignment overlap between rPCA (rows) and Harmony results (columns) as quantified by the Jaccard index. B, UMAP plots (top) and the corresponding cluster overlap indices (bottom) when using 25% (left), 50% (middle), and 75% (right) of randomly sampled cells. The heatmaps show cluster overlap indices in randomly sampled results (rows) versus when using all 186,916 cells (columns). C, Heatmap depicting the extent of cluster assignment overlap between k-means clustering (columns) and Harmony (rows) as quantified by the Jaccard index.
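The Jaccard index that the Supplementary Fig. S3 heatmaps use to score cluster overlap can be illustrated with a minimal sketch. For two groups of cells A and B it is the number of shared cells divided by the number of cells in either group. The function name `jaccard_matrix` and the toy labels below are assumptions made for illustration; they are not the authors' code or data, and the "ratio of overlap" printed in the figure is not defined in this file and is not computed here.

```python
from collections import Counter


def jaccard_matrix(labels_a, labels_b):
    """Jaccard index for every pair (label from method A, label from method B).

    labels_a[i] and labels_b[i] are the labels two methods gave to cell i.
    Returns a dict mapping (label_a, label_b) -> |A and B| / |A or B|.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same cells")

    shared = Counter(zip(labels_a, labels_b))  # cells carrying both labels
    size_a = Counter(labels_a)                 # cells per label, method A
    size_b = Counter(labels_b)                 # cells per label, method B

    scores = {}
    for a, n_a in size_a.items():
        for b, n_b in size_b.items():
            inter = shared[(a, b)]
            union = n_a + n_b - inter          # inclusion-exclusion
            scores[(a, b)] = inter / union
    return scores


# Hypothetical toy data: six cells labeled by two clustering runs.
method_1 = ["Epithelial", "Epithelial", "Epithelial", "Myeloid", "Myeloid", "Stromal"]
method_2 = ["Epithelial", "Epithelial", "Myeloid", "Myeloid", "Myeloid", "Stromal"]

scores = jaccard_matrix(method_1, method_2)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

A value of 1 means the two methods assign exactly the same cells to that pair of labels, and 0 means the two groups share no cells, which matches the 0 to 1 color scale shown in the heatmaps.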
Supplementary Fig. S4
[Figure: panel A, stacked bar plots of lineage fractions (Endothelial, Epithelial, Lymphoid, Myeloid, Stromal) across spatial samples 1 to 4 (LUAD, Adj, Int, Dis) for non-proliferating and proliferating cells, and a box plot of the fraction of proliferating epithelial cells in LUAD versus other samples (P = 0.07). Panels B-D, box plots of # transcripts and # genes in Ep+ versus Ep- cells: B, all cells (Ep+, n = 623; Ep-, n = 14,509; P < 0.001 for both plots); C, proliferating cells (Ep+, n = 7; Ep-, n = 139; P < 0.01 for both plots); D, non-proliferating cells (Ep+, n = 616; Ep-, n = 14,370; P < 0.001 for both plots). Panel E, bubble plot of lineage marker genes across cell types including NK, B, T, Mac, DC, and EC; circle size, % of cells expressed (0 to 100); color, relative expression (0 to 3).]
Supplementary Fig. S4. Analysis of expression markers among epithelial and immune cell fractions. A, Stacked bar plots showing the relative cell fraction of spatial samples for major lineages, considering non-proliferating cells (left) and proliferating cells (middle). Box plot showing the fraction of proliferating epithelial cells in LUAD tissues versus normal spatial samples (right). The P value was calculated using the Wilcoxon rank sum test. B, Box plots showing the number of detected transcripts (left) and the number of detected genes (right) in epithelial cells versus all non-epithelial lineages, including lymphoid, myeloid, and stromal cells. C, Box plots showing the number of detected transcripts (left) and the number of detected genes (right) in proliferating epithelial cells versus proliferating cells of non-epithelial lineages, including lymphoid, myeloid, and stromal cells. D, Box plots showing the number of detected transcripts (left) and genes (right) in non-proliferating epithelial cells versus non-proliferating cells of non-epithelial lineages, including lymphoid, myeloid, and stromal cells. E, Bubble plot showing the percentage of cells expressing lineage markers (indicated by the size of the circle) as well as their scaled expression levels (indicated by the color of the circle) across selected cell types (related to main Fig. 1G and Fig. 1H). NK, natural killer cell; DC, dendritic cell; EC, endothelial cell.

Supplementary Fig. S5
[Figure: panels A and B, plots labeled by patient (P1 to P5) and by batch (PM1127, PM1178, PM1157_PM1158, SC153, SC172, SC174), with cell counts printed in this order: n = 14,279; n = 5,118; n = 24,541; n = 623; n = 18,761; n = 6,708. Panel C, plots of cell subsets (Ionocyte, Proliferating basal, Basal, Malignant enriched, Ciliated, Club and secretory, AT2, AT1, Bronchioalveolar, Alveolar progenitor) with 25%, 50%, and 75% of cells sampled, each with a cluster-overlap heatmap. Ratio of overlap: 93.3% (25%), 95.8% (50%), 97.6% (75%). Heatmap cell counts, with rows from the three heatmaps interleaved:]
C2_2.Ionocyte 0 0 0 0 0 0 0 0 0 18 C9.Ionocyte 0 0 0 0 0 0 0 0 0 32 C9.Ionocyte 0 0 0 0 0 0 0 0 0 45 C7.Proliferating.basal 0 0 0 0 0 2 0 0 96 0 C8.Proliferating.basal 0 0 0 0 0 0 3 0 209 0 C8.Proliferating.basal 1 0 3 0 0 3 2 0 326 0 C6.Ciliated 0 0 0 0 0 0 0 763 0 0 C6.Ciliated 0 0 0 0 0 0 0 1612 0 0 C6.Ciliated 0 0 0