Supplementary Tables

Supplementary Table 1: List of that are profiled by the seqFISH experiment. 4931431F19Rik Ctla4 Hnf1a Obsl1 Cpne5

4932429P05Rik Cyp2c70 Hoxb3 Olr1 Nes

Abca15 Cyp2j5 Hoxb8 Osr2 Acta2

Abca9 Dbx1 Hyal5 Pld1 Gja1

Adcy4 Dcstamp Kif16b Pld5 Omg

Aldh3b2 Ddb2 Laptm5 Poln Nov

Ankle1 Egln3 Lefty2 Ppp1r3b Col5a1

Ano7 Fam69c Lhx3 Psmd5 Dcx

Anxa9 Fbll1 Lhx4 Rbm31y Itpr2

Arhgef26 Foxa1 Lmod1 Rrm2 Rhob

B3gat2 Foxa2 Mertk Scml2

Barhl1 Foxd1 Mgam Senp1 Cldn5

Bcl2l14 Foxd4 Mmgt1 Serpinb11 Mrc1

Blzf1 Galnt3 Mmp8 Sis Tbr1

Bmpr1b Gata6 Mrgprb1 Slc4a8 Pax6

Capn13 Gdf2 Murc Slc6a16 Calb1

Cdc5l Gdf5 Nell1 Spag6 Gda

Cdc6 Gm15688 Neurod4 Sumf2 Slc5a7

Cdh1 Gm6377 Neurog1 Tnfrsf1b Sema3e

Cecr2 Gm805 Nfkb2 Vmn1r65 Mfge8

Cilp Gpc4 Nfkbiz Vps13c Lyve1

Clec5a Gpr114 Nhlh1 Wrn Loxl1

Creb1 Gykl1 Nkd2 Zfp182 Slco1c1

Creb3l1 Hdx Nlrp12 Zfp715 Amigo2

Nature Biotechnology: doi:10.1038/nbt.4260 Csf2rb2 Hn1l Npy2r Zfp90 Kcnip2

Supplementary Table 2: List of 43 genes used for cell type mapping. Fbll1 Itpr2 Vps13c Tnfrsf1b Sox2

Hdx Wrn Sumf2 Vmn1r65 Rhob

Mrgprb1 Calb1 Pld1 Laptm5 Tbr1

Slc5a7 Abca9 Ankle1 Olr1

Cecr2 Cpne5 Blzf1 Mertk

Nell1 Npy2r Cdc5l Slco1c1

Pax6 Cldn5 Cyp2j5 Mfge8

Col5a1 Bmpr1b Rrm2 Gja1

Dcx Spag6 Csf2rb2 Gda

Arhgef26 Slc4a8 Gm805 Omg

Supplementary Table 3: Astrocyte prediction accuracy, evaluated using fluorescent staining images. We examined and contrasted DAPI, Nissl staining on astrocyte cells. Percentage of cells with no Nissl, and with DAPI staining are recorded (these are accurate instances) % predicted % predicted Total predicted astrocytes with weak astrocytes with astrocytes or no Nissl stain present Nissl stain

Cortex Column 1 100% 0% 11

Cortex Column 2 88% 12% 25

Cortex Column 3 87% 13% 24

Cortex Column 4 87% 13% 24

Supplementary Table 4: List of 69 genes used for HMRF. Calb1 Gdf5 Rbm31y Zfp182 Pld5

Kcnip2 Nkd2 Creb3l1 Gpc4 Gm805

Tbr1 Fam69c Nfkbiz Obsl1 Clec5a

Nes Sema3e Hoxb8 Dcstamp Serpinb11

Gda Dcx Adcy4 Poln Zfp715

Nature Biotechnology: doi:10.1038/nbt.4260 Col5a1 Lhx3 Mmgt1 Cdh1 Blzf1

Loxl1 Omg Lhx4 Ppp1r3b Bm6377

Sox2 Wrn Lefty2 Mgam Zfp90

Slc5a7 Aldh3b2 Gm15688 Hn1l

Nov Foxd1 Kif16b Hdx

Cpne5 Cyp2c70 Vmn1r65 Psmd5

Mrc1 Hnf1a Foxa2 Bcl2l14

Acta2 Slc4a8 Gpr114 Osr2

Cyp2j5 Barhl1 Gata6 4931431F19Rik

Lyve1 Egln3 Nhlh1 Lmod1

Supplementary Table 5: Expanded domain specific signatures used for Tasic et al reanalysis O1 aldh3b2 kcnips gm805 b3gat2 amigo2 cdc6

O2 serpinb11 fam69c capn13 ankle1 mmp8 cecr2 foxd4 barhl1

O3 lmod1 mmgt1 lefty2 hn1l foxd1 pld1 olr1

O4 omg arhgef26

IS adcy4 nkd2 cyp2j5 zfp715 mgam senp1 ddb2

I2 hn1l cdc5l murc nhlh1

I3 gata6 vmn1r65 sema3e obsl1

I1a loxl1 clec5a cpne5 creb3l1 gpc4 calb1 sox2 nes vps13c

I1b mrc1 nkd2 sema3e col5a1 dcx slc5a7 nov rhob

Nature Biotechnology: doi:10.1038/nbt.4260 Supplementary Figures

Supplementary Fig 1: Quantile-quantile plot per for 113 genes shared between seqFISH and scRNAseq data sets. Q-Q plot shows the comparison of quantiles between scRNAseq (y- axis, across n=1723 cells) and seqFISH (x-axis, across n=1597 cells). Quantiles refer to expression z-score quantiles after each scRNAseq and seqFISH has been transformed by row- and column-wise z-score normalization.

Supplementary Fig 2: Robustness analysis of domain identifications by HMRF model with respect to cell-type marker filtering. (a) Different scenarios. “ct” denotes “cell-type”. “Plus.11.ct.genes” denotes unfiltered scenario which added all 11 ct genes. (b, c) Agreement between domain assignment using different cell-type filtering settings. Agreement quantified by (b) overall clustering consistency and (c) contingency table. (d) Spatial patterns found by HMRF in each scenario, in comparison with 69-gene HMRF. For each scenario, the HMRF model used 1,000 initial centroids to select the best one to initiate HMRF. HMRF clustering was repeated for two more times with similar results.

Supplementary Fig 3: Comparison between the (a) spatial domain annotation and (b) major cell type annotation for all 1597 cells in the seqFISH data set. tSNE constructed from the 69 genes used in HMRF for the seqFISH dataset.

Supplementary Fig 4: Enrichment of cell types in spatial domains. Enrichment is calculated as (num. cells) / sqrt(colsum x rowsum), where number of cells refers to cells that overlap between a domain and a cell type. Col- and row-sums refer to the overlap table.

Supplementary Fig 5: Spatial expression of domain marker genes. Representative genes of the 9 domains include Barhl1 (O2), Nfkb2 (IS), Col5a1 (I1a), Aldh3b2 (O1), Calb1 (I1b), Omg (O4), Nhlh1 (I2), Gata6 (I3), and Pld1 (O3). Black outlines around some cells highlight the domain annotations. One mouse brain was assayed by seqFISH due to experimental cost.

Supplementary Fig 6: General domain signatures transcend cell types. For cells annotated to each cell type : spatial domain pair; only those containing at least 4 cells are represented. Top row: domain type; 2nd row: cell type. The heatmap below shows the average level of domain associated genes.

Supplementary Fig 7: Distribution of expression levels representative of domain- and cell-type specific marker genes. (a) Representative domain-specific marker genes. The corresponding domains are indicated in parentheses in title. (b) Representative cell-type specific marker genes. The corresponding cell types are indicated in parentheses in title. Each box represents the distribution associated with the intersection of a specific cell type and spatial domain pair. A dot inside each box represents a cell. Only pairs which contain >5 cells are analyzed, and other pairs are discarded. Boxes show the quartiles of data points with whiskers extended to show the rest of distribution. Sample sizes from left to right: n=6, 60, 11, 86, 21, 13, 71, 74, 9, 11, 8, 6, 71, 29, 93, 30, 8, 77, 78, 83, 22, 13, 171, 202, 20, 134, and 111 cells.

Nature Biotechnology: doi:10.1038/nbt.4260

Supplementary Fig 8: Comparison of seqFISH’s domain specific genes with Allen Brain Atlas. Genes compared are: Calb1, Sema3e, Gda, Nell1, Tbr1, Cpne5, Gdf5, Nov, Aldh3b2, Nkd2, Serpinb11. For each gene, the left image is the seqFISH profile (n=1597 cells); the top right image shows the Allen Brain Atlas ISH staining. Bottom right shows the z-scored expression in microdissection cell clusters from Zeisel et al and Tasic et al scRNAseq data sets. SeqFISH profiles agree well with existing resources.

Supplementary Fig 9: Metagene expression levels. (a) 9 metagenes each marking a different spatial domain. (b) the spatial domain annotations. Metagene expression is defined as the average of general domain signature genes (Fig 2d). One mouse brain was assayed by seqFISH due to experimental cost.

Supplementary Fig 10: Applications of domain metagene signatures on mapping the domain patterns of glutamatergic cells. Shown are glutamatergic cells only. Red indicates metagene expression. For mapping glutamatergic cells, the expanded domain specific signatures (Supplementary Table 5) were used. This has been indicated in the legend.

Supplementary Fig 11: A zoom-in view of comparison between spatial domain annotations (left) and metagene expression levels (right) at domain boundaries indicated in Fig 3b. Five snapshot regions were selected to show that (meta) switching is frequently observed at domain boundaries.

Supplementary Fig 12: Comparison of morphological features across different spatial domains. 6 of the 15 morphological features extracted from Nissl staining images are significantly different across domains. Significance is judged by both of the following criteria, whereby 1) each domain is compared to an individual domain and is P<0.05 in at least 7 of 8 one-vs-one K-S tests, and 2) each domain is compared to all cells in remaining domains (P<0.00001, one-vs-rest test). (a) The distribution of the 6 significant features across domains. (b) The P-values obtained from the one-vs-rest tests. Domains being compared have the following cell numbers: O2 (n=109), I1a (n = 389), O4 (n =120), I1b (n = 79), O1 (n = 135), I2 (n = 117), I3 (n = 205), O3 (n = 270), and IS (n = 173).

Supplementary Fig 13: ISH validation of additional spatial domain markers identified from integrative analysis. ISH images were obtained from the Allen Brain Atlas. Each gene wasa selected based on co-expression with the domain specific signatures. The ISH images shown were representative sagittal brain slices cropped to the visual cortex region. There are at least two replicates in each case.

Supplementary Fig 14: Overlap between spatial domain and glutamatergic subtype annotations for theTasic et al dataset. (a) Overlap between domain- and cell subtype-specific genes focused within the 125-gene fraction. (b) Overlap between cells annotated as a specific domain vs subtype. P-values were obtained by using the hypergeometric tests corrected for

Nature Biotechnology: doi:10.1038/nbt.4260 multiple hypotheses. Cell numbers contained in each glutamatergic cell subtype are: L2/3_Ptgs2 (n=96), L2_Ngb (n=21), L4_Arf5 (n=44), L4_Ctxn3 (n=79), L4_Scnn1a (n=99), L5_Chrna6 (n=11), L5_Ucma (n=16), L5a_Batf3 (n=66), L5a_Hsd11b1 (n=53), L5a_Tcerg1l (n=35), L5b_Cdh13 (n=42), L5b_Tph2 (n=32), L5a_Car12 (n=22), L6a_Mgp (n=53), L6a_Sla (n=74), L6a_Syt17 (n=15), L6b_Rgs12 (n=15), and L6b_Serpinb11 (n=20). Cell numbers contained in each metagene-derived cluster are: O2 (n=132), I1a (n=98), O4 (n=92), I1b (n=131), O1 (n=84), I2 (n=22), I3 (n=97), O3 (n=100), IS (n=56).

Supplementary Fig 15: Comparison of cell subtype annotations and metagene cluster annotations on Tasic et al glutamatergic cells (n=812). t-SNE plot has been generated based on metagene expression (same configuration as in Fig 4a). (a) Cell subtype annotations as provided by Tasic et al. (b) Metagene-cluster annotations (domains).

Supplementary Fig 16: Comparison of metagene-recovered spatial populations with cell subtypes in Tasic et al data set (n=812 glutamatergic cells). (a) Overlay of expression patterns of representative cell subtype markers (col5a1, cpne5, sema3e, serpinb11) on t-SNE plot that was generated in the same way as Fig 4a (obtained from metagene signatures). Red demarcates cells expressing the gene. It can be seen that these genes served as markers of both cell subtypes and domains. (b) Some distinct populations recovered by metagenes do not correspond to any specific cell subtypes identified by Tasic et al. Shown are three examples (O1, I2, and O4). The second row shows the subtype annotations.

Supplementary Fig 17: Expression of astrocyte-associated domain genes in an external astrocyte expression database (Zhang et al 2016). Low-level expression is defined as <10 FPKM. High level is between 10 and 160 FPKM.

Supplementary Fig 18: Spatial organization of the mouse olfactory bulb as revealed by the HMRF model. (a) This spatial transcriptome data set (n=282 wells) was generated by Stahl et al 2016. and obtained by using tissue microarray to barcode spatial locations at a 100um resolution followed by RNA sequencing. (b) Spatial domains identified by using a 5-state HMRF model. Settings of HMRF: 10,000 initial centroids were used and the best centroid setting was selected to initiate HMRF. Beta (60.0) selected in comparison with randomly shuffled case, and 500 spatially coherent genes. HMRF clustering was repeated for two more times with similar results. (c) Expression patterns of representative marker genes. (d) The anatomical structure identified by hematoxylin and eosin staining.

Supplementary Fig 19: Spatial organization of the mouse dentate gyrus as revealed by the HMRF model. This is a 249-gene seqFISH data set (n=696 cells) (Shah et al 2016) with single- cell spatial and transcriptomic information. (a) Spatial domains identified by the 5-state HMRF model. Settings of HMRF: 10,000 initial centroids were used and the best centroid was selected to initiate HMRF. Beta (28.0) selected in comparison with randomly shuffled case, and 91 spatially coherent genes. HMRF clustering was repeated two more times with similar results. (b) Schematic of the anatomical structure at dentate gyrus. Rectangular boxes indicate the imaging fields shown in (a). (c) The expression pattern of a number of representative marker genes.

Nature Biotechnology: doi:10.1038/nbt.4260

Methods: Supplementary Fig 20: Estimates of imaging bias in the seqFISH data. Cortex (n=1597 cells) is divided into 20 adjacent fields of view (shortened as fields). We divide each field into a 50 bin- by-50 bin grid. Then for each bin, we compute the average expression level at that bin over all 20 fields. Together this forms an overall bias map. Principal component analysis was applied to the bias map to model the most significant bias patterns among genes. The top PCs are shown below representing the major bias patterns.

Supplementary Fig 21: Dependency of cell-type mapping accuracy on the number of genes used. (a) Relationship between the 8 major, 22 finer, and 49 minor cell types determined by Tasic et al. (b,c,d) Cross-validation estimated mapping accuracy associated with (b) 8 major, (c) 22 finer, and (d) 49 minor classes.

Supplementary Fig 22: Cross-validation estimated mapping accuracy for individual cell types. (a) Receiver operating curves (ROC) for each of the eight major cell types. (b) Cross validation accuracies of individual cell types: astrocytes (n=97), endothelial (n=11), GABA-ergic neurons (n=556), glutamatergic neurons (n=859), microglia (n=22), OPC (n=8), oligodendrocyte.1 (n=21), oligodendrocyte.2 (n=23).

Supplementary Fig 23: Robustness analysis of our HMRF model. Agreement between domain assignment using different numbers of top spatially coherent genes. Agreement is quantified by (a) contingency table and (b) adjusted mutual information (left) and the proportion of cells clustered consistently (i.e. the diagonal of matrix in (a)) (right).

Supplementary Fig 24: Robustness analysis of beta in our HMRF model. Agreement between domain assignment using different betas. Agreement is quantified in (a) by the adjusted mutual information (left) and the proportion of cells clustered consistently (right) (i.e. the diagonal of matrix in (b)). (b) Overlap matrix in each case. (c) Spatial pattern comparison. For each beta, 10,000 initial centroid configurations were used to select the best one to initiate HMRF, and HMRF clustering was repeated two more times with similar results.

Supplementary Fig 25: Robustness analysis of the HMRF model (n=1597 cells) against disruption of spatial patterns. A subset of cells at various fractions (p=0.1, 0.2, 0.4, and 0.99) were randomly selected and switched spatial locations. 100 different random samples were selected for each parameter setting, and the distribution was plotted. For each perturbed dataset, the HMRF model was applied to identify spatial domains. Goodness-of-fit was quantified by using the log-likelihood. Blue vertical line indicates the log-likelihood for the original data set, and is significant compared to shuffling rates of 0.99 (one sided, empirical P<0.01), 0.4 (P<0.01), and 0.2 (P<0.03).

Supplementary Fig 26: SeqFISH expression z-scores (n=199,625 values) approximately follow a normal distribution.

Nature Biotechnology: doi:10.1038/nbt.4260 Supplementary Fig 27: Domain signatures obtained by non-parametric Mann-Whitney U test. (a) general domain signature, complementary to Fig 2d. Cell numbers contained in each domain: O2 (n=109), I1a (n=389), O4 (n=120), I1b (n=79), O1 (n=135), I2 (n=117), I3 (n=205), O3 (n=270), and IS (n=173); (b) glutamatergic restricted signature, complementary to Fig 3b, middle panel. Cell numbers in each domain: O2 (n=79), I1a (n=187), O4 (n=88), I1b (n=58), O1 (n=93), I2 (n=60), I3 (n=73), O3 (n=129), and IS (n=92).

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 1 scRNAseq

Expr. z-score Expr. seqFISH Expr. z-score

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 2 Compared with 69-gene HMRF c Overlap matrices d

a Minus.5.ct.genes HMRF 277 5 3 0 0 020 020 Baseline: 8 83 1 0 0 0 9 0 7 69-gene HMRF (already removed 11 cell type (ct) genes) 8 374 026 0 0 0 4 Minus.5.ct.genes: 5 0 811323 9 0 1 0 remove 5 additional ct genes (64-gene HMRF) 13 113 3130 6 0 0 0 0 0 3 1 5 67 0 0 0 Minus.5.ct.genes Minus.3.ct.genes: HMRF 69-gene 23 7 0 0 0 0124 1 2 remove 3 additional ct genes (66-gene HMRF) 23 6 6 4 2 0 011912 Plus.5.ct.genes: 41 6 12 0 1 0 12 14233

add back 5 ct genes (74-gene HMRF) Minus.3.ct.genes HMRF Plus.11.ct.genes (unfiltered): 81 511 217 0 0 2 3 add back all 11 ct genes (80-gene HMRF) 3233 45 7 1 1 0 15 5 2 23291 15 3 0 1 23 8 8 8 6104 1 0 0 0 2 15 111 2142 4 3 2 1 b Compared with 69-gene HMRF 2 0 1 0 5 72 2 0 0

Overlap statistics 69-gene HMRF 9 0 6 118 5115 0 0 0 117 3 0 0 0114 3 Minus.3.ct.genes 1 0.7 0 710 1 0 0 0 989

0.8 0.6 Plus.5.ct.genes HMRF 0.5 8628 3 3 011 0 1 0 0.6 0.4 5144 5 5 0 4 0 0 1 01099 2 0 2 0 0 1 0.4 0.3 7 62 17347 0 10 1 3 2 0.2 0 0 0 0 73 0 1 5 1 0.2 313 1 3 0130 1 0 0 0.1 0 1 0 3 7 1171 12 26 69-gene 69-gene HMRF 0 0 0 3 1 7 2 0 4100 3 10 17 9 28 0 7 9 0 86 Baseline AdjustedMI Plus.11.ct.genes HMRF (unfiltered) 74 5 8 0 0 1 1 0 0 39916 0 1 3 4 0 3 5 12154 2 1 38 1 1 1 Proportionclustered consistently 0 0 011315 0 5 5 1

Plus.5.ct.genes 0 0 1 5110 414 3 2 Plus.5.ct.genes Plus.11.ct.genes Minus.5.ct.genes Minus.3.ct.genes Plus.11.ct.genes

Minus.5.ct.genes Minus.3.ct.genes 0 2 6 9 25 68 18 7 11

69-gene 69-gene HMRF 0 2 2 22 85 2341 8 17 0 0 01231 0 784 4 0 1 0 210 4 7 396 (unfiltered) (unfiltered) Plus.5.ct.genes

Plus.11.ct.genes (unfiltered)

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 3

a b

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 4 Glut-N GABA-N Astro Micro OPC Oligo.1 Oligo.2 Endo Glut-N GABA-N Astro Micro OPC.1 OPC.2 Oligo Endo 0 O2 0.1 I1a 0.2 O4 0.3 0.4 I1b 0.5 O1 I2 I3 O3 IS

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 5

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 6

O1 O2 O3 O4 IS I2 I3 I1a I1b Microglia Astrocyte Cell type Glutamatergic Neuron GABA-ergic Neuron OPC Oligo.1 Oligo.2 Endothelial Cell

Nature Biotechnology: doi:10.1038/nbt.4260 Nes (I1a, I1b) Mmgt1 (O3) SFig 7 a

Creb1 (I2) Calb1 (I1a, I1b)

z-score of log2 expression log2 z-score of Serpinb11 (O1) Ddb2 (IS) I2; Astro I2; Astro I3; I2; Astro I2; Astro I3; O1; Astro O1; Astro O3; Astro O4; I1a; Astro I1a; I2;Glut-N I3; Glut-N O1; Astro O1; Astro O3; Astro O4; O4; Micro I1a; Astro I1a; I2;Glut-N I3; Glut-N IS; Glut-N O4; Micro IS; Glut-N O1; Glut-N O2; Glut-N O3; Glut-N O4; Glut-N I1a; Glut-N I1b; Glut-N O1; Glut-N O2; Glut-N O3; Glut-N O4; Glut-N I1a; Glut-N I1b; Glut-N O3;Oligo.1 O3; Oligo.2 O3;Oligo.1 O3; Oligo.2 I2; GABA-N I3; GABA-N IS; GABA-N I2; GABA-N I3; GABA-N IS; GABA-N O1; GABA-N O2; GABA-N O3; GABA-N O4; GABA-N I1a; GABA-N I1b; GABA-N O1; GABA-N O2; GABA-N O3; GABA-N O4; GABA-N I1a; GABA-N I1b; GABA-N

O1 O2 O3 O4 IS I1a I1b I2 I3 O1 O2 O3 O4 IS I1a I1b I2 I3 b Mertk (astrocyte) Gja1 (astrocyte)

Slc4a8 (GABA-ergic) Gda (Glutamatergic) z-score of log2 expression z-score of log2

I2; Astro I2; Astro I3; I2; Astro I2; I3; Astro I3; O1; Astro O1; Astro O3; Astro O4; O1; Astro O1; Astro O3; Astro O4; I1a; Astro I1a; I1a; Astro I1a; I2; Glut-N I3;Glut-N I2; Glut-N I3;Glut-N O4; Micro O4; Micro IS; Glut-N IS; Glut-N O1; Glut-N O2; Glut-N O3; Glut-N O4; Glut-N O1; Glut-N O2; Glut-N O3; Glut-N O4; Glut-N I1a; Glut-N I1b; Glut-N I1a; Glut-N I1b; Glut-N O3; Oligo.1 O3; Oligo.2 O3; Oligo.1 O3; Oligo.2 I2;GABA-N I3; GABA-N I2; GABA-N I3; GABA-N IS; GABA-N IS;GABA-N O1; GABA-N O2; GABA-N O3; GABA-N O4; GABA-N O1; GABA-N O2; GABA-N O3; GABA-N O4; GABA-N I1a; GABA-N I1b; GABA-N I1a; GABA-N I1b; GABA-N

Nature Biotechnology: doi:10.1038/nbt.4260O1 O2 O3 O4 IS I1a I1b I2 I3 O1 O2 O3 O4 IS I1a I1b I2 I3 SFig 8

Calb1 (sagittal) ABA ABA 70919860 Gdf5 (sagittal) L2,3 | | | | | | | 81654111 | | | | | | | L6b

Zeisel et al Tasic et al Tasic et al Zeisel et al

| | | | | | | | | | | | | | L6b L2,3

ABA Sema3e (sagittal) ABA Nov (sagittal) | | | | | | | 69015117 1242 L6a | | | | | | |

Tasic et al Zeisel et al

Tasic et al Zeisel et al

| | | | | | | | | | | | | | L6a

Gda (sagittal) ABA aldh3b2 (sagittal) ABA | | | | | | | 70276867 L1 70201462 L5 | | | | | | |

Tasic et al Zeisel et al Tasic et al Zeisel et al

| | | | | | | | | | | | | | L5 L1

ABA Nell1 (sagittal) L4 71495885 Nkd2 (sagittal) ABA | | | | | | | | | | | | | | 587433

Tasic et al Zeisel et al Zeisel et al Tasic et al

| | | | | | | | | | | | | | L4

Tbr1 (sagittal) ABA | | | | | | | 77455019 Serpinb11 (sagittal) ABA L6a | | | | | | | 68269538

Tasic et al Zeisel et al Tasic et al Zeisel et al

| | | | | | | L6a | | | | | | |

Cpne5 (sagittal) ABA 385757 | | | | | | |

Tasic et al Zeisel et al

| | | | | | |

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 9 a b O2 I1a O4

O1 I1a I1b I1b O1 I2 IS I2 I3 O2 O3 O4

I3 O3 IS

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 10 Glutamatergic cells O2 I1a O4

I1b O1 I2

I3 O3 IS

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 11 Spatial domain L6a L6b O1 I1a I2, O2 I1b IS I2 I3 O2 L2/3 O3 O4 I1a, I1b

L6a L6b EC

O2, O3, O4

L4 L5

I2, I3

L1 L2/3

O1, I1a

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 12 a

Glutamatergic Cells b Perimeter Width Angle MinFeret Circularity Major O2 ------I1a 1.78E-05 3.60E-08 8.17E-06 - - - O4 - - - - 7.00E-12 - I1b - 1.75E-10 7.90E-08 - - - O1 - - - 0.00288 4.46E-05 - I2 - - - - - 3.89E-05 I3 - 0.0158 - - - - O3 4.46E-10 1.26E-19 2.53E-10 8.67E-14 - - IS ------

All cells Perimeter Width Angle MinFeret Circularity Major O2 ------I1a 1.07E-14 4.34E-17 - 3.02E-11 - - O4 - - - - 6.18E-12 - I1b 1.24E-05 1.60E-14 6.28E-11 - - 2.74E-06 O1 - - - 0.002773 - - I2 - - - - - 1.30E-08 I3 ------O3 2.32E-20 6.80E-37 6.70E-18 5.27E-26 - 1.50E-10 IS ------

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 13 Ugp2 Luzp2 Sfxn5

Meta.O2

Morf4l2 Ube2v1 Tubb2a

Meta.I1a

Ywhae Syngr3 Ndrg3

Meta.O4

Csmd3 Arpp21 Zbtb16

Meta.I1b

Lancl2 Gpr137c Mta1

Meta.O1

Sprn Wsb2 Uba2

Meta.I2

Nrp1 Sorcs1 Sorcs3

Meta.I3

Cycs Ndufv3 Wdr37

Meta.O3

Fgf12 Ncoa7 Arl2

Meta.IS

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 14

Gene overlap (within 125 genes) Cell overlap

-log10(P) I1a I1b I2 I3 IS O1 O2 O3 O4 I1a I1b I2 I3 IS O1 O2 O3 O4 L2/3_Ptgs2 10.2 2.3 0.5 1.1 3.7 5.7 2.5 0.8 3.7 L2/3_Ptgs2 10.1 0.0 0.0 0.0 0.8 3.3 0.0 0.0 0.1 0 L2_Ngb 9.4 2.8 0.7 1.4 2.7 2.8 1.3 1.0 4.2 L2_Ngb 1.5 1.0 0.1 0.0 0.1 1.0 0.1 0.2 2.7 2 L4_Arf5 3.9 2.3 1.2 2.7 2.2 3.9 2.5 1.9 1.5 L4_Arf5 0.2 0.0 0.9 0.1 0.6 1.8 0.2 1.1 0.1 4 L4_Ctxn3 1.9 0.8 10.0 1.0 0.7 0.8 3.8 7.4 1.4 L4_Ctxn3 0.2 0.1 0.8 0.1 0.3 0.0 2.8 2.1 0.0 6 L4_Scnn1a 1.5 1.5 10.0 0.8 0.5 4.0 1.7 7.5 1.1 L4_Scnn1a 0.0 0.0 0.9 0.0 0.3 0.4 0.9 4.4 0.1 8 L5_Chrna6 1.5 1.5 1.0 1.7 1.5 6.2 3.9 3.2 2.1 L5_Chrna6 0.1 0.1 0.4 0.5 0.4 3.3 0.3 0.6 0.6 10 L5_Ucma 1.4 5.6 2.2 6.3 1.4 1.4 1.5 1.2 2.0 L5_Ucma 0.1 0.2 0.6 1.1 0.5 1.5 0.2 0.6 0.4 L5a_Batf3 2.8 2.8 2.9 1.4 2.7 2.8 1.3 1.0 1.8 L5a_Batf3 0.4 0.0 3.8 0.0 2.1 0.3 0.1 0.2 0.6 L5a_Hsd11b1 1.3 3.1 2.0 1.5 5.0 1.3 1.4 2.7 1.9 L5a_Hsd11b1 0.0 0.0 0.1 0.3 2.7 0.0 0.8 2.7 1.3 L5a_Pde1c 1.5 3.6 2.5 4.0 1.5 1.5 1.7 1.4 2.1 L5a_Pde1c 1.2 0.0 1.8 0.5 0.4 0.1 0.1 0.4 1.9 L5a_Tcerg1l 1.0 2.0 1.5 2.2 2.0 2.0 2.1 1.8 2.6 L5a_Tcerg1l 0.2 0.0 0.4 0.1 3.5 0.5 0.0 1.4 0.2 L5b_Cdh13 4.5 6.6 1.5 5.2 4.3 1.1 1.2 0.9 1.0 L5b_Cdh13 2.1 0.9 0.5 1.6 0.0 0.0 0.0 0.1 5.3 L5b_Tph2 6.1 10.8 0.5 4.9 2.3 1.0 1.1 2.1 3.9 L5b_Tph2 0.7 3.1 0.1 1.0 0.3 0.0 0.0 0.2 1.3 L6a_Car12 3.6 3.6 1.0 4.0 1.5 1.5 1.7 1.4 2.1 L6a_Car12 0.6 0.1 0.6 6.7 0.0 0.0 0.7 0.0 0.3 L6a_Mgp 0.8 8.9 10.0 4.0 0.7 0.8 2.2 2.7 1.4 L6a_Mgp 0.1 2.5 0.2 3.3 0.0 0.2 0.1 0.1 0.1 L6a_Sla 1.1 9.0 1.5 3.1 1.1 2.6 2.9 2.3 1.7 L6a_Sla 0.0 12.9 0.0 0.9 0.0 1.4 0.0 0.0 0.1 L6a_Syt17 5.6 5.6 0.9 6.3 1.4 1.4 1.5 1.2 2.0 L6a_Syt17 0.3 1.6 0.6 1.2 0.3 0.3 0.2 0.1 1.2 L6b_Rgs12 1.7 1.0 1.2 4.5 1.7 1.7 1.8 1.5 2.3 L6b_Rgs12 0.1 3.0 0.3 0.7 0.1 0.7 1.8 0.0 0.5 L6b_Serpinb11 1.3 1.3 2.0 1.5 3.0 3.1 8.4 4.6 1.9 L6b_Serpinb11 0.0 0.0 0.0 0.1 0.4 0.4 14.2 0.1 0.4

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 15 a b

O1 I1a I1b IS I2 I3 O2 O3 O4

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 16 a

L6b_Serpinb11

L6a_Syt17, L6a_Car12, L6a_Sla, L6_Mgp L2/3_Ptgs2 L6a_Mgp b Meta.O1 Meta.I2 Meta.O4

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 17

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 18 b c a d

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 19 a b

cortex

hippocampus

dentate gyrus

c Tbr1 Mzf1 Mfge8 Alldh1l1 Nr3c2 Slc17a7

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 20

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 21 a 8 major classes Oligo.1 Oligo.2 OPC Astro Micro Endo Glut-N GABA-N

22 finer

classes Oligo.1 Oligo.2 OPC Astro Micro Endo L2/3 L6a L4_Arf5 L4_Ctxn3/ Scnn1a / L5_Ucma Chrna6 L5a L5b L6a L6b Vip Sncg Ndnf Igtp/Smad3 Sst Pvalb_1 Pvalb_2

49 minor classes Oligo_96_Rik Oligo_Opalin OPC_Pdgfra Astro_Gja1 Micro_Ctss Endo_Myl9 Endo_Tbc1d4 L2/3_Ptgs2 L2_Ngb L6a_Car12 L6a_Syt17 L4_Arf5 L4_Ctxn3 L4_Scnn1a L5_Ucma L5_Chrna6 L5a_Pde1c L5a_Hsd11b1 L5a_Batf3 L5a_Tcerg1l L5b_Cdh13 L5b_Tph2 L6a_Mgp L6a_Sla L6b_Rgs12 L6b_Serpinb11 Vip_Mybpc1 Vip_Parm1 Vip_Chat Vip_Gpc3 Sncg Vip_Sncg Ndnf_Car4 Ndnf_Cxcl14 Igtp Smad3 Sst_Chodl Sst_Tacstd2 Sst_Th Sst_Cbln4 Sst_Cdk6 Sst_Myh8 Pvalb_Obox3 Pvalb_Rspo2 Pvalb_Tacr3 Pvalb_Wt1 Pvalb_Cpne5 Pvalb_Gpx3 Pvalb_Tpbg

b 8 major classes c 22 finer classes d 49 minor classes 200 60 80 100 1600 1 1 400 800 0.9 1600 3000 40 200 800 0.9 0.9 0.8 400 20 100 0.8 10 0.8 0.7 200 0.7 100 0.7 0.6 0.6 0.6 0.5 0.5 0.5 20 0.4 0.4 0.4 20 10 0.3 0.3 0.3 0.2 0.2 10 0.2 0.1 0.1 Cross-validation accuracy Cross-validation 0.1 0 0 0 10 30 50 70 90 110 130 150 170 190 210 10 210 410 610 810 10101210141016101810 10 510 1010 1510 2010 2510 3010 3510 # random DE genes used for multiclass SVM

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 22 a OPC Oligo.1 Endothelial cell GABA-ergic Neuron b

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

GABA-ergic Neuron

Microglia

Astrocyte

Glutamatergic Neuron

Endothelial Cell Microglia Oligo.2 Astrocyte Glutamatergic Neuron OPC

Oligo.2

True positive rate True Oligo.1

False positive rate

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 23

a Clustering overlap with respect to 69-gene HMRF 10-gene 20-gene 40-gene 60-gene 65-gene 75-gene

85 1 0 55 73 2 13 20 50 16330 5 7 0 035 2 6 48 34 1 17 14 23 1 0 2 347 4 13 4 0 5 12 0 10 119 2 2 0 0 234 0 1 131 1 7 4 0 0 2 222 135 4 2 02810 8 0 68 88 47 16 0 0 13 1 70 18249 7 622 0 0 0 24116 1 1 0 0 1 0 2 0416 0 3 6 9 313 6 0109 0 1 0 0 3 0 4 0 1 45 0 0 2 0 0 0 32 346 6 0 013 143 0 138 1 572 0 0 0 85 31109 22 0 1 14 0 4 0 0 46 0 0 0 0 0 0 2 4123 0 0 2 0 013 8 0 0214 16 0 10 10 5 22 5 170 1 1 12013 6 12 5110 5111 5 0 19 10 3 7110 0 0 2 0 1 0 8 0111 3 0 011 0 6 0 5127 0 1 7 3 9 48 0 0 23 27 1 18 8 27 3 0 01739 3 533 0 1 18 19 23 75 28 1 0 0 2 6 0 0109 4 0 0 0 0 9 0 0129 1 010 6 0 0 0 0 50 1 0 0 0 0 23 0 4 012041 2 0 0 0 0 0 1 46 0 0 0 15 0 7 238237 5 0 0 2 1 0 012154 4 0 0 0 4 0 0 013625 2 2 0 0 0 1 0118 0 8 1 7 3 1 0 01349 1 0 48 7 6 856 096 211 82 2 0 0 0 8119 015 3 3 0 3 022129 0 0 3 7 0 0 0 6124 0 4 2 3 6 3 0 0138 0 6 8 64 0 35 6 13 9 67 5 5 0 0 37 24 0 1130 0 0 0 0 0 0 0 0 47 1 0 0 0 0 2 0 0 48 0 0 44 0 1 12 9 1108 11 0 0 119 0 3 0174 1 8 0 0110 7 8 26 44 73 152 717 0 0 0 2 111 7 0 0 1 0 261 185 26 2 13 10 0 0 1 0102 315 0 4 0 0 0 0126 6 113 5 0 0 1 1447 69-gene HMRF 0.9 b 6875 1 68 80 0.8 75 1400 65 1000.9 60 70 65 80 0.7 70 100 0.8 60 1200 0.6 0.7 1000 40 0.6 0.5 40 800 10 20 0.5 0.4 10 20 600 0.4 0.3 0.3 400 0.2 0.2 # cells clustered consistently clustered cells # Adjusted mutual information mutual Adjusted 0.1 200 0.1 consistently clustered proportion 0 0 0 10 20 30 40 50 60 70 80 90 100 10 20 30 40 50 60 70 80 90 100 Number of genes Number of genes

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 24 b c Compared with beta=9 HMRF Overlap matrices

Beta=5 beta=5 116 0 0 0 0 2 0 7 2 a 0135 0 2 0 36 14 0 1 Beta=5 Compared with beta=9 HMRF 1 2106 0 0 15 13 12 3 0 7 1104 0 14 32 1 1 Overlap statistics 1 0 0 0 82 0 0 5 0 0 10 1 1 0270 47 0 2 0 7 3 2 023156 1 4 1 0.9 3 1 7 1 028 4161 3 0 3 2 1 0 10 12 0119 0.9 0.8 Beta=7 beta=7 0.8 0.7 116 0 0 0 0 1 0 5 0 0136 0 2 0 3 7 0 0 0.7 1 2108 0 0 2 5 5 0 0.6 0 6 0102 0 420 1 0 0.6 1 0 0 0 82 0 0 4 0 Beta=7 0.5 2 12 0 5 0369 37 0 17 0.5 0 6 2 0 012201 0 1 0.4 1 0 8 1 0 3 0172 0 0.4 0 3 2 1 0 4 8 0117

adjustedMI 0.3 0.3 Beta=10 beta=10 114 0 0 0 0 0 0 1 0 0.2 0149 0 6 0 5 3 0 0 0.2 0 0112 0 0 0 0 7 0 0.1 0 1 0 97 0 1 6 0 1 0.1 0 0 0 0 79 0 0 0 0 Proportionclustered consistently 0 9 2 5 0342 5 1 1 0 0 0 4 0 2 012262 0 4 Beta=10 5 7 10 12 15 5 7 10 12 15 4 1 3 0 319 0176 0 3 1 3 1 019 2 2129

beta beta Beta=12 beta=12 111 0 2 0 3 0 0 4 0 0146 2 12 0 12 7 3 1 2 0104 1 0 2 016 1 1 2 277 0 018 0 0 0 0 0 0 77 0 0 0 0 0 8 3 10 0352 23 2 4 0 5 0 3 0 13229 1 11 4 1 5 0 2 2 0158 0 3 3 2 8 017 1 3118 Beta=12 Beta=15 beta=15 101 0 5 0 6 0 0 6 0 0137 111 0 6 7 2 1 3 191 3 0 2 016 2 1 6 368 0 425 0 1 0 0 0 0 72 0 0 0 0 3 13 3 15 0351 34 2 18 0 5 0 6 023203 1 6 10 112 0 4 1 0157 0 3 2 5 8 011 9 3107 Beta=15

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 25

30

25

20

15 0.99 10 0.4

Frequency 0.2 5 0.1

0 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 -9 -9 -9 -9 -9 -9 -9 -9 -8 -8 -8 -8 -8 -8 -8 -8

Log likelihood

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 26

Expression z-scores Frequency

Nature Biotechnology: doi:10.1038/nbt.4260 SFig 27 Mann Whitney U test a b

O1 O2 O3 O4 IS I2 I3 I1a I1b O1 O2 O3 O4 IS I2 I3 I1a I1b

Nature Biotechnology: doi:10.1038/nbt.4260