comment | FOCUS comment | FOCUS A community-based transcriptomics classifcation and nomenclature of neocortical cell types To understand the function of cortical circuits, it is necessary to catalog their cellular diversity. Past attempts to do so using anatomical, physiological or molecular features of cortical cells have not resulted in a unifed taxonomy of neuronal or glial cell types, partly due to limited data. Single-cell transcriptomics is enabling, for the frst time, systematic high-throughput measurements of cortical cells and generation of datasets that hold the promise of being complete, accurate and permanent. Statistical analyses of these data reveal clusters that often correspond to cell types previously defned by morphological or physiological criteria and that appear conserved across cortical areas and species. To capitalize on these new methods, we propose the adoption of a transcriptome-based taxonomy of cell types for mammalian neocortex. This classifcation should be hierarchical and use a standardized nomenclature. It should be based on a probabilistic defnition of a cell type and incorporate data from diferent approaches, developmental stages and species. A community-based classifcation and data aggregation model, such as a knowledge graph, could provide a common foundation for the study of cortical circuits. This community-based classifcation, nomenclature and data aggregation could serve as an example for cell type atlases in other parts of the body. Rafael Yuste, Michael Hawrylycz, Nadia Aalling, Argel Aguilar-Valles, Detlev Arendt, Ruben Armañanzas, Giorgio A. Ascoli, Concha Bielza, Vahid Bokharaie, Tobias Borgtoft Bergmann, Irina Bystron, Marco Capogna, YoonJeung Chang, Ann Clemens, Christiaan P. J. de Kock, Javier DeFelipe, Sandra Esmeralda Dos Santos, Keagan Dunville, Dirk Feldmeyer, Richárd Fiáth, Gordon James Fishell, Angelica Foggetti, Xuefan Gao, Parviz Ghaderi, Natalia A. Goriounova, Onur Güntürkün, Kenta Hagihara, Vanessa Jane Hall, Moritz Helmstaedter, Suzana Herculano-Houzel, Markus M. Hilscher, Hajime Hirase, Jens Hjerling-Lefer, Rebecca Hodge, Josh Huang, Rafq Huda, Konstantin Khodosevich, Ole Kiehn, Henner Koch, Eric S. Kuebler, Malte Kühnemund, Pedro Larrañaga, Boudewijn Lelieveldt, Emma Louise Louth, Jan H. Lui, Huibert D. Mansvelder, Oscar Marin, Julio Martinez-Trujillo, Homeira Moradi Chameh, Alok Nath Mohapatra, Hermany Munguba, Maiken Nedergaard, Pavel Němec, Netanel Ofer, Ulrich Gottfried Pfsterer, Samuel Pontes, William Redmond, Jean Rossier, Joshua R. Sanes, Richard H. Scheuermann, Esther Serrano-Saiz, Jochen F. Staiger, Peter Somogyi, Gábor Tamás, Andreas Savas Tolias, Maria Antonietta Tosches, Miguel Turrero García, Christian Wozny, Thomas V. Wuttke, Yong Liu, Juan Yuan, Hongkui Zeng and Ed Lein

Classifcations of cortical cell types: classifying groups of cells based on investigators have described hundreds of cell from Cajal to the Petilla Convention shared characteristics and grouping them types in nervous systems of different species. The conceptual foundation of modern into taxa with ranks and a hierarchy. This effort has been particularly arduous biology is the cell theory of Virchow, Taxonomies are important: they provide in the (or neocortex), the which described the cell as the basic unit a conceptual foundation for a field and largest part of the brain in mammals and the of structure, reproduction and pathology also enable the systematic accumulation primary site of higher cognitive functions. of biological organisms1. This idea, which of knowledge. Essential to this effort is The mammalian neocortex has a thin arose from the use of microscopes by the clear definition of cell type, normally layered structure, composed of mixtures of Leeuwenhoek, Hooke, Schleiden and understood as cells with shared phenotypic excitatory and inhibitory arranged Schwann, among others, generated the need characteristics. in circuits of a forbidding complexity, called to build catalogs of the cellular components Virchow’s cell theory was introduced “impenetrable jungles” by Cajal3. This basic of tissues as the first step toward studying to neuroscience by Cajal, whose ‘ structure is very similar in different cortical their structure and function. As with doctrine’ postulated that the structural unit areas and in different species, which has species, these cell catalogs, or atlases, can be of the nervous system was the individual given rise to the possibility that there is a ideally systematized into ‘cell taxonomies’, neuron2. Since then, generations of ‘canonical’ cortical microcircuit4–7, replicated

1456 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment during evolution, which underlies all optimal basis for classification. In principle, effective new methods have emerged for cortical function. many criteria can be used, including profiling thousands of cells or nuclei at a After more than a hundred years of (i) anatomical or connectivity-based time44–48. With simultaneous computational sustained progress, it is clear that neocortical features19,20, (ii) parametrization of intrinsic advances for analyzing large sequence-based neurons and glial cells, like cells in any electrophysiological properties21, data49,50, it is now possible to systematically tissue, belong to many distinct types. (iii) combination of structural and classify and characterize the diversity of Different cell types likely play discrete physiological criteria22,23, (iv) molecular neural cells in any tissue, including the roles in cortical function and computation, markers14,24,25, (v) developmental origins26,27, neocortex (Fig. 2). making it important to characterize and (vi) epigenetic attractor states28 or Conceptually, as much as the genome describe them accurately and in their (vii) evolutionary approaches identifying is the internal genetic description for absolute and relative numbers. Towering homology across species29,30. Ideally, these each species, the transcriptome, as the historical figures like Cajal, Lorente de Nó classifications should converge and agree, complete set of genes being expressed, and Szentágothai, among others, proposed or at least substantially overlap. Indeed, provides an internal code that can classifications of cortical cells based on their there is substantial concordance among describe each cell within an organism in morphologies as visualized with histological categories based on anatomical, molecular a spatiotemporal context. Practically, the stains4,8,9 (Fig. 1a–c). These anatomical and physiological criteria13,22,31–34, but it has scale of scRNAseq promises near-saturating classifications described several dozen types not been easy to combine these approaches analysis of complex cellular brain regions of pyramidal neurons, short- cells into a unified taxonomy. There are like the neocortex, providing, for the first and glial cells, and they were subsequently substantial differences between researchers time, a comprehensive and quantitative complemented by morphological accounts in assigning neurons to particular types description of cellular diversity and of additional cortical cell types by many in the literature19, and even experts often the prospect of simplifying tissue cell researchers10–12, but without arriving at a disagree on what constitutes ground truth. composition to a finite number of cell clear consensus as to the number or even the For example, while most publications agree types and states defined by statistical definition of a cortical cell type. on what a chandelier cell is, the concept of clustering. Importantly, however, these Over the last few decades, the basket cells, a major subtype of inhibitory transcriptionally defined clusters represent introduction of new morphological, neuron, is much less clear19. a probabilistic description of cell types ultrastructural, immunohistochemical This uncertainty is explained and in a high-dimensional landscape of gene and electrophysiological methods, new exacerbated by technical challenges: expression across all cells in a tissue, rather molecular markers, and a growing conventional approaches have been than a definition based on a small set of appreciation of the developmental origins laborious, low-throughput, frequently necessary and sufficient cellular markers or of distinct neuronal subtypes (Fig. 1d–h), non-quantitative and generally plagued other features (see below). have provided increasingly finer by an inability to sample cells in The scale, precision and information phenotypic measurements of cortical cells standardized and systematic ways. Thus, content of these current methods now far and enabled new efforts to classify them setting aside debates about the importance outpace other classical methods of cellular more quantitatively, using supervised or of various criteria and the nature or even phenotyping in neuroscience and have the unsupervised methods such as cluster existence of discrete cell types, it is not potential to approach the complete, accurate analysis13–16. A community effort to classify surprising that the cell-type problem has and permanent (CAP) criteria cited by neocortical inhibitory cells was attempted at remained challenging. Brenner as the gold standard in biological the 2005 Petilla Convention, held in Cajal’s science51. Indeed, major efforts now aim hometown in Spain, and led to a common Transcriptomics: a new framework for to generate a complete description of cell standardized terminology describing the classifying cortical cell types types based on molecular criteria across anatomical, physiological and molecular Recent advances in high-throughput the neocortex (Allen Institute for Brain features of neocortical interneurons17. single-cell transcriptomics (scRNAseq) Science36,40), the whole brain (the National While useful, this fell short of providing have changed the paradigm of cellular Institutes of Health (NIH) BRAIN Initiative a classification and working framework classification, offering a new quantitative Cell Census Network52) and even the whole that investigators could incorporate into genetic framework35–40. These approaches body (the Human Cell Atlas53). Also, as the their research. One reason why this early measure the expression profiles of thousands Human Genome Project offered a means for effort failed was because the datasets for of genes from individual cells in large comparative analysis of orthologous genes phenotypically characterizing cortical numbers, at relatively high speed and low across species, these efforts could define neurons were small. Indeed, many of the cost. Related methods in epigenomics can all or most cell types and states in humans early studies are based on characterizing identify sites of methylation and putative and model organisms, with the possibility dozens or at most hundreds of neurons, gene transcriptional regulation, essential to of extending them to a variety of species small samples from the nearly 20 billion in cell function and state. These new methods to understand the evolution of cell-type human neocortex18. are an outcome from the methodological, diversity. These large investments have An outcome of the Petilla Convention conceptual and economic revolution the potential for a transformative effect on was the realization that there was not yet a created by the Human Genome Project41 neuroscience, which will be accelerated by single method that captured the inherently and have flourished with support from a formalization of a molecular classification multimodal nature of cell phenotypes and the BRAIN Initiative42,43. With genomes in and its adoption by the community. They could serve as a standard for classification. hand, it is now feasible to generate entire also hold promise for the development While most researchers accepted the transcriptomes (which include the sequence of methods for querying circuit function existence of cell types that could be and structure of transcripts) from tissues by providing tools for the targeting and measured and defined independently and to scale these methods for amplifying manipulation of particular subtypes. by different methods, there was no RNA in single cells. Initially limited to Transcriptomic classification offers agreement as to which would form an only a few hundred cells per experiment, the following advantages as a framework

Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience 1457 comment | FOCUS comment | FOCUS

a d e Stereotypy of neuron morphology Interneuron subclass molecular markers Molecular marker/physiology correspondence Parvalbumin (PV) Fast spiking Number Regular spiking non-pyramidal of cells Irregular spiking 20 20

16 16

12 12 Somatostatin (SST) 8 8

4 4

0 0 – Calretinin (CR) + Calbindin-calretinin (CB-CR) Vasoctive intestinal peptide (VIP) – + Calbindin (CB) Neuropeptides – Calbindin-parvalbumin (CB-PV) + Parvalbumin (PV)

f g Classification by physiological features Promiscuous phenotype relationships

Burst Continuous Delayed Stuttering Molecular Physiology marker type

Fast spiking b-AC 20 mV Chandelier cell ChC CB 500 msec

c-AC b Non-adapting Large basket cell LBC PV Stereotypy of glial morphology non-fast spiking b-NAC NBC CR Adapting Nest basket cell c-NAC Small basket cell SBC Irregular d-NAC spiking Double bouquet cell DBC NPY Intrinsic burst c-STUT 30 mV firing 400 ms Bipolar cell BPC VIP d-STUT Accelerating 20 mV BTC SOM 200 ms Bitufted cell b-IS

MC CCK c-IS h Martinotti cell Systematic morpho-electric neuron classification

DAC NGC-DA NGC-SA HAC LAC SAC c Stererotyped connectivity motifs 200 μm MC BTC DBC BP NGC LBC NBC SBC ChC 1 Small basket cell Axon tuft 2 cell GABAergic interneurons Chandelier cell DAC Descending axon cell 3 NGC-DA Neurogliaform cell with dense axonal arborization NGC-SA Neurogliaform cell with slender axonal arborization HAC Horizontal axon cell LAC Large axon cell Large 4 basket SAC Small axon cell cell MC Martinotti cell BTC Bitufted cell 5 DBC Double bouquet cell BP Bipolar cell NGC Neurogliaform cell 6 LBC Large basket cell NBC Nest basket cell SBC Small basket cell ChC Chandelier cell

Fig. 1 | Non-transcriptomics cortical cell-type classifications. a,b, Morphological characterization and classification of neurons (a) and glial cells (b) by Ramón y Cajal (1904)4. c, Diagram showing the connections of different types of interneurons with pyramidal cells. Adapted from Szentágothai (1975)9. d, Definition of GABAergic interneuron classes based on non-overlapping and combinatorial marker gene expression. e, Correlation of firing properties with class markers. f, Cortical cell type classification based on intrinsic firing properties (Petilla convention). g, Complex relationships between cellular morphology, marker-gene expression and intrinsic firing properties based on multimodal analysis. h, Comprehensive morphological and physiological classifications of cortical cell types. Images in a,b reprinted with permission from ref. 4, Cajal Institute; in c, adapted with permission from ref. 9, Elsevier; in d, adapted with permission from ref. 25, Oxford Univ. Press; in e, adapted with permission from ref. 14, Society for Neuroscience; in f and g, adapted with permission from refs. 17,21, respectively, Springer Nature; in h, adapted with permission from ref. 23, Cell Press.

1458 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment for bounding the problem of cellular Indeed, initial transcriptomic studies synaptic connectivity is challenging unless diversity53–56: of cortical tissue are already providing such a classification correlates strongly with many biological insights. For example, those features. Recent work in the retina is 1. High-throughput transcriptomics is scRNAseq analysis of mouse and human promising in this regard, where a large body very efective at allowing a systematic, cortex identified a complex but finite set of work has established a highly diverse comprehensive analysis of cellular di- of ~100 molecularly defined cell types per set of anatomically, physiologically and versity in complex tissues. Its quantita- cortical region that generally agree with prior functionally discrete cell types69 and where tive and high-throughput nature enables literature on cytoarchitectural organization, transcriptomic clusters strongly correlate the adoption of rigorous defnitions developmental origins, functional properties with this prior knowledge35,69,70. For example, and criteria using datasets from tens of and long-range projections65. Moreover, the for mouse bipolar cells, a class comprising thousands to millions of cells. hierarchical (agglomerative) taxonomy of 15 types of excitatory interneurons, there 2. Te genes expressed by a cell during its transcriptomic cell types66, based on relative is essentially perfect correspondence development and maturity ultimately similarity between clusters, reflects these between types defined by scRNAseq, underlie its structure and function, and organizational principles. Viewed as a tree high-throughput optical imaging of so the transcriptome ofers predictive or dendrogram, the initial branches reflect electrical activity, and serial section electron power based on interpreting gene func- major classes (neuronal vs non-neuronal; microscopy35. The spinal cord provides tion. Other cellular phenotypes, includ- excitatory vs inhibitory), with finer splits another good example of correspondence ing morphology, are in part encoded by reflecting more subtle variants of each between scRNAseq and other cellular genes, rather than completely independ- class that reflect different developmental characteristics, including developmental ent defning criteria57. programs; for example, neocortical neurons origins and connectivity profiles71,72. 3. A molecular defnition of cell types are split into excitatory glutamatergic vs Similarly, scRNAseq of mammalian allows the identifcation of cell-type inhibitory GABAergic classes reflecting their hippocampus identifies neuronal cell types markers and the creation of genetic different developmental origins in embryonic that were already described by anatomy and tools to target, label and manipulate pallium vs subpallial proliferative regions, electrophysiology73,74. specifc cell types58,59, thereby provid- while the next splits in the GABAergic Strong evidence for cross-modal ing the means to standardize datasets branch contain neurons generated by correspondence in neocortical cell types is obtained by diferent researchers. medial and caudal subdivisions of the accumulating as well. An early application 4. Transcriptomic data can also provide ganglionic eminence and the preoptic area of cluster analysis of mouse layer 5 neurons information about human diseases, by (Fig. 2a). These transcriptomic divisions are showed correspondence between synaptic allowing a potential linkage between consistent with a long literature on cell fate connectivity, morphology and even laminar genes associated with disease and their specification of different GABAergic classes position13. Almost perfect correlations cellular locus of action. By combining and the transcription factors involved in that were seen between major interneuron with genome-wide association studies process62–64,67 (Fig. 2b). Transcriptomics also subclasses for molecular markers, axonal (GWAS) that identify genes causally allows quantitative analysis of developmental morphology and kinetics of synaptic inputs31 involved in the pathophysiology of a trajectories involved in this specification (Fig. 3a). Within somatostatin-positive disease, cell-type transcriptomics-based and maturation62–64 (Fig. 2c). Genes that interneurons, morphological and data might lead to identifcation of differentiate neuronal classes are enriched electrophysiological subgroups were mechanistically unresolved diseases as for those involved in neuronal connectivity correlated22. Other more specific neuron detected changes in expression levels of and synaptic communication, indicating types show concordance between scRNAseq, genes from key cell types60. they are predictive of selective cellular physiology and morphology, such as the 5. Expression profles allow quantita- and circuit function37 (Fig. 2d). Finally, ‘rosehip’ cell, a layer 1 inhibitory neuron tive comparison of cell types across the same major transcriptomic classes of type in human cortex75 (Fig. 3b). Similarly, evolutionary or developmental times, cortical GABAergic neurons are found strong correspondence between scRNA-seq, enabling the alignment of cell types in mammals and reptiles68 (Fig. 2e), electrophysiology and morphology was across species (based on conserved suggesting deep conservation of cellular shown for mouse layer 1 neurogliaform and expression of homologous genes)61 and architecture and underlying mechanisms of single bouquet neurons, using the patch–seq developmental stages (based on gradual molecular specification. technique, which combines patch-clamp developmental trajectories)62–64. physiology and scRNA-seq76 (Fig. 3c). 6. Transcriptomics also enables comparing Correspondence of cell-type Finally, RNA-seq analysis of retrogradely cell types across organs, as diferent or- classifcations across modalities labeled neurons in mouse primary visual gans use similar genes. Tus, it could be Proposing a transcriptomic-based cortex shows distinctive projections used to classify all the cells in the body classification for a field traditionally of transcriptionally defined excitatory with a single method and framework53. centered on cellular anatomy, physiology and subclasses40 (Fig. 3d). Experimental tools are

Fig. 2 | Transcriptomics classifications of cortical cell types. a, Single-cell transcriptome analysis reveals a molecular diversity of mouse cell types, with relatively invariant interneuron and non-neuronal types across cortical areas but significant variation in excitatory neurons. b, Major interneuron classes are specified by distinct transcription factor codes. c, Single-cell transcriptomics of mouse GABAergic interneuron development demonstrates gradual changes in gene expression underlying developmental maturation and fate bifurcations as cells become postmitotic. d, Gene families shaping cardinal GABAergic neuron type include neuronal connectivity, ligand receptors, electrical signaling, intracellular signal transduction, synaptic transmission and gene transcription. These gene families assemble membrane-proximal molecular machines that customize input–output connectivity and properties in different GABAergic types. e, Single-cell transcriptomics allows cross-species comparisons and shows conservation of major cell classes from reptiles to mammals, with conserved transcription factors but some species-specific effectors (turtle data). TF, transcription factor. Images in a and c adapted with permission from refs. 40,63, respectively, Springer Nature; in b, adapted with permission from ref. 27, Elsevier; in d, adapted with permission from ref. 37, Cell Press; in e, adapted with permission from refs. 30,68, Elsevier and AAAS, respectively.

Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience 1459 comment | FOCUS comment | FOCUS increasingly available to aid in phenotypic methods54,77. While major consortium efforts Challenges for cortical cell type characterization of transcriptionally defined will generate the transcriptomic framework, classifcation cell types in model animals and even human, linking different types of data to it will Although strong cross-modal such as specific Cre lines and viruses, likely be most effective as a distributed correspondence has been observed at the as well as novel spatial transcriptomics community effort. major subclass level, such correspondence

a Transcriptomic cell type classification Dissected area Cortical area Cell class Cluster ALM GABA ALM Glut VISp VISp Non-Neu. Non-neuronal Dissected layer(s) L1 L1–L2/3 Astrocytes & oligodendrocytes Neuroectoderm Neuronal L2/3 L1–L4 t-SNE 2 L4 L2/3–L4 L5 L4–L5 VLMC SMC Peri/ Micro t-SNE 1 L6 L4–L6 Endo L6b L5–L6 Neuron subtypes Glutamatergic All layers GABAergic Serpinf1 VISp Sncg ALM VISp Lamp5 L6 IT L5/L6 IT ALM Pvalb Sst Vip L5 IT L6b L5/6 L5 PT L2/3 IT VISp L6 CT NP L2/3 IT Dissected area (% of cluster) 0 50 100 0 Dissected layer (% of cluster) 50 100 1,404 1,04 9 cells 106 183 110 583 624 810 200 112 129 165 131 103 126 208 203 115 555 238 178 179 207 232 242 289 116 211 115 366 128 289 137 118 213 148 879 534 287 115 118 107 318 278 210 259 241 180 300 183 122 203 220 109 102 172 380 200 174 216 230 122 224 704 184 387 412 123 435 215 119 125 673 212 108 n 30 36 40 80 12 37 83 13 52 18 56 89 63 65 72 69 36 71 80 61 79 89 63 74 59 67 61 80 56 49 86 48 71 68 76 76 68 16 46 59 51 46 60 65 82 98 43 22 91 75 72 54 49 70 70 7 7 4 Microglia Siglech PVM Mrc1 Endo Cytl1 Endo Ctla2a SMC Acta2 Peri Kcnj8 VLMC Spp1 Col15a1 VLMC Osr1 Mc5r VLMC Spp1 Hs3st6 VLMC Osr1 Cd74 Oligo Synpr Oligo Serpinb1a Oligo Rassf10 OPC Pdgfra Ccnb1 OPC Pdgfra Grm5 CR Lhx5 Meis2 Adamts19 Pvalb Vipr2 Pvalb Tpbg Pvalb Reln Tac1 Pvalb Reln Itm2a Pvalb Gpr149 Islr Pvalb Sema3e Kank4 Pvalb Akr1c18 Ntf3 Pvalb Calb1 Sst Pvalb Th Sst Pvalb Gabrg1 Sst Nts Sst Rxfp1 Prdm8 Sst Rxfp1 Eya1 Sst Tac2 Tacstd2 Sst Esm1 Sst Crh 4930553C11Rik Sst Crhr2 Efemp1 Sst Hpse Cbln4 Sst Hpse Sema3c Sst Tac2 Myh4 Sst Chrna2 Ptgdr Sst Myh8 Fibin Sst Chrna2 Glra3 Sst Myh8 Etv1 Sst Nr2f2 Necab1 Sst Calb2 Pdlim5 Sst Calb2 Necab1 Sst Tac1 Tacr3 Sst Tac1 Htr1d Sst Mme Fam114a1 Sst Chodl Vip Col15a1 Pde1a Vip Crispld2 Kcne4 Vip Crispld2 Htr2c Vip Pygm C1ql1 Vip Chat Htr1f Vip Rspo1 Itga4 Vip Lect1 Oxtr Vip Rspo4 Rxfp1 Chat Vip Ptprt Pkp2 Vip Gpc3 Slc18a3 Vip Arhgap36 Hmcn1 Vip Igfbp4 Mab21l1 Vip Lmo1 Myl1 Vip Lmo1 Fam159b Vip Igfbp6 Pltp Vip Igfbp6 Car10 Serpinf1 Aqp5 Vip Serpinf1 Clrn1 Sncg Vip Itih5 Sncg Gpr50 Sncg Vip Nptx2 Sncg Slc17a8 Lamp5 Lhx6 Lamp5 Lsp1 Lamp5 Plch2 Dock5 Lamp5 Ntn1 Npy2r Lamp5 Fam19a1 Tmem182 Lamp5 Fam19a1 Pax6 Lamp5 Krt73 L6b Hsd17b2 L6b VISp Crh L6b P2ry12 L6b ALM Olfr111 Nxph1 L6b ALM Olfr111 Spon1 L6b VISp Col8a1 Rxfp1 L6b VISp Mup5 L6b Col8a1 Rprm L6 CT VISp Gpr139 L6 CT VISp Ctxn3 Sla L6 CT VISp Ctxn3 Brinp3 L6 CT VISp Nxph2 Wls L6 CT VISp Krt80 Sla L6 CT ALM Cpa6 L6 CT Nxph2 Sla L6 NP ALM Trh L5 NP VISp Trhr Met L5 NP ALM Trhr Nefl L5 NP VISp Trhr Cpne7 L5 PT ALM Hpgd L5 PT ALM Npsr1 L5 PT ALM Slco2a1 L5 PT VISp Krt80 L5 PT VISp C1ql2 Cdh13 L5 PT VISp C1ql2 Ptgfr L5 PT VISp Lgr5 L5 PT VISp Chrna6 L6 IT VISp Car3 L6 IT VISp Col18a1 L6 IT VISp Col23a1 Adamts2 L6 IT VISp Penk Fst L6 IT VISp Penk Col27a1 L6 IT ALM Oprk1 L6 IT ALM Tgfb1 L5 IT ALM Gkn1 Pcdh19 L5 IT ALM Cpa6 Gpr88 L5 IT ALM Tmem163 Arhgap25 L5 IT ALM Tmem163 Dmrtb1 L5 IT ALM Tnc L5 IT ALM Lypd1 Gpr88 L5 IT ALM Cbln4 Fezf2 L5 IT ALM Pld5 L5 IT ALM Npw L5 IT VISp Col27a1 L5 IT VISp Col6a1 Fezf2 L5 IT VISp Batf3 L5 IT VISp Whrn Tox2 L5 IT VISp Hsd11b1 Endou L4 IT VISp Rspo1 L2/3 IT ALM Macc1 Lrg1 L2/3 IT ALM Ptrf L2/3 IT ALM Sla L2/3 IT VISp Agmat L2/3 IT VISp Adamts2 L2/3 IT VISp Rrad Astro Aqp4

b c e Developmental TF codes Developmental trajectories Evolutionary conservation of cell types

VZ SGZ/mantle Cortical plate Adult cortex Maturation Reptile (Turtle) Neurogenesis, Cell fate Migration, Mature score cell fate commitement, synaptogenesis, interneurons 0.8 CGE-derived MGE-derived commitment tangential migration maturation 0.6 VIP-like Reln SST PV-like dCGE RLN (SST) Pvalb Obox3 Dlx1/2/5/6, 0.4 Nr2f1?, Nr2f2, Pvalb Wt1 Arx?,Zeb2?, 0.2 Pvalb Rspo3 Dlx1/2 Prox1, Sp8 PV

Diffusion map coordinate 2 Diffusion Pvalb Tacr3 Ascl1 Diffusion map coordinate 1 Gsx1/2 (high) Dlx1/2/5/6, Dlx1/2/5/6, VIP CR Pvalb Cpne5 Nr2f1, Nr2f2, Nr2f1, Nr2f2, Dlx1/2/5/6, Pvalb Tpbg Pax6 (Low) Nr2f1?, Nr2f2, CGE LGE MGE Distance Nr2f1 Arx, Zeb2?, Arx, Zeb2?, Prox1, Sp8 Prox1, Sp8 Arx,Zeb2?, from root Pvalb Gpx3 Nr2f2 Lhx6, Sox6, 1.00 Sst Tacstd2 0 .4 Satb1 Sst Cdk6 0.75 SST CR Sst Myh8 MGE-derived dMGE SST Dlx1/2/5/6, 0.50 Sst Th Nr2f1, Arx?, 0.25 Sst Chodl

Nkx6-2 Dlx1/2/5/6, Dlx1/2/5/6, Zeb2?,Lhx6, 0 Gli1 Nr2f1, Nr2f2, Nr2f1, Nr2f2, Sox6, Satb1 MDS coordinate 2 0 Sst Cbln4 Nr2f1 Arx, Zeb2, Arx, Zeb2?, Igtp SST C orrelatoin Nr2f2 Lhx6, Sox6 Lhx6, Sox6, Satb1 MDS coordinate 1 Smad3 Dlx1/2/5/6, MGE Nr2f1, Nr2f2, CGE LGE MGE Ndnf Cxcl14 Arx, Zeb2?, Reln Ndnf Car4 – 0.26 Lhx6, Sox6, Satb1 Branch 1 Mammal (Mouse) Dlx1/2 Scng Dlx1/2/5/6, Dlx1/2/5/6, Branch 2 Ascl1 Nr2f1, PV Vip Gpc3 Gsx2 (low) Nr2f1, Arx, Zeb2?, Dlx5/6, Arx, Branch 3 Arx, Zeb2, Lhx6, Sox6, Satb1 Vip Chat Nr2f1 Lhx6, Sox6 Zeb2?, Lhx6, Trunk VIP Vip Scng Nkx2-1 Sox6, Satb1

CGE-derived Vip Parm1 MDS coordinate 2 Vip Mybpc1 MDS coordinate 1 d i14 i15 i16 i17 i18 i07 i08 i09 i10 i12 i11 i13 Molecular determinants of cell type function DCV Conserved TF codes CaBP KV Transcription factors CGE-derived CGE-derived MGE-derived MGE-derived SV Endogenous GPCR Adarb2+ Adarb2+ Reln+ SST+ non-SST ligands Syt Connectivity Cell adhesion molecules GABA Syt Nr2f2, Npas1, Nr2f2, Npas1, Lhx6, Sox6, Arx, Lhx6, Sox6, Arx, Transcription Nr2e1, Zbtb16, Nr2e1, Zbtb16, Sp9, Satb1(High), Sp9, Satb1(Low), Receptors CaV factors Sp8*, Prox1 Sp8*, Id2 Mafb Etv1, Mef2c(High), Input Prox1

Signaling KV GPCR CaV Htr3a, Adarb2, Htr3a, Adarb2, SST, NPY, Reln* Plau, Bcan, AC Effector Cnr1, VIP, Cck*, Reln, Lamp5, Pvalb, genes Grp, Lamp5 Cnr1*, Ndnf* Lamp* Ion channels Output GABAR GluR RGS RTK PTP PDE Rho-GEF Synaptic release Mouse Turtle * = subset of cells CaBP RAS Rho

1460 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment

a c Transgenic interneuron subclass correspondence Phenotypic correspondence by patch-seq

Patch-seq recording Trained Standard recording classifier 100 Modified Firing pattern Inferred morphology Firing pattern intracellular Train solution 80 + classifier RNA cDNA synthesis Detailed morphology + QC

Phase I extraction Phase II RNA-seq 60

40 Elongated neurogliaform cells (eNGCs) Single bouquet cells (SBCs) 20 PV1 PV4 PV11 PV3 PV8 PV12 NPY10 NPY2 NPY9 NPY17 NPY8 NPY4 NPY3 SOM7 SOM10 SOM8 SOM9 SOM6

(Dlink/Dmax8)*100 L1

L2/3 L1

L2/3

L4

L5

L6 w.m. –60mV

PV NPY SOM

+10mV

–71 mV –64 mV –65 mV

b d Multimodal cell type correspondence Transcriptome/connectivity relationships Rosehip transcriptomic cluster 50 μm Human n cells i1 1 i2 i1 i5 i3 i4 i7 i1 0 i6 i8 i9 1 *GAD1 Injection 5 *SST targets 10 1 50 2/3 PVALB VIP 100 *CXCL14 200 *CALB2 CTX STR TH TEC P ***CCK ***CNR1 ***CPLX3 CTX STR TH TEC P VISp cell types **NDNF IT L2/3 IT VISp Rrad **SV2C L2/3 IT VISp Adamts2 **TRPC3 L2/3 IT VISp Agmat **LAMP5 L4 IT VISp Rspo1 1 NTNG1 L5 IT VISp Hsd11b1 Endou 2/3 PRSS12 L5 IT VISp Whrn Tox2 *PDGFRA L5 IT VISp Batf3 TOX L5 IT VISp Col6a1 Fezf2 100 μm Rosehip ARHGAP31 L5 IT VISp Col27a1 L6 IT VISp Penk Col27a1 CDCA7 L6 IT VISp Penk Fst *EYA4 1 HRH1 2/3 Neurogliaform L6 IT VISp Col18a1 KIRREL L6 IT VISp Car3 PMEPA1 L5 PT VISp Chrna6 ROR2 L5 PT VISp Lgr5 *SOX13 L5 PT VISp C1ql2 Ptgfr Basket SSTR2 PT L5 PT VISp C1ql2 Cdh13 TXLNB L5 PT VISp Krt80 NP L5 NP VISp Trhr Cpne7 L5 NP VISp Trhr Met /Hz) 8 2 Rosehip CT L6 CT Nxph2 Sla V Neurogliaform L6 CT VISp Krt80 Sla –9 6 Other L6 CT VISp Nxph2 Wls L6 CT VISp Ctxn3 Brinp3 4 L6 CT VISp Ctxn3 Sla L6 CT VISp Gpr139 2 L6b L6b Col8a1 Rprm L6b VISp Mup5 L6b VISp Col8a1 Rxfp1 0 20 mV L6b P2ry12

Average power (×1 0 100 ms L6b VISp Crh 02 04 06 08 0 L6b Hsd17b2 Frequency (Hz) n cells 491 8 406 52 92

Fig. 3 | Correspondence across phenotypes of cortical neuron types. a, Quantitative morphological clustering and electrophysiological feature variation between major inhibitory neuron classes using transgenic mouse lines (modified from Figs. 1 and 2 from ref. 31). b, Convergent physiological, anatomical and transcriptomic evidence for a distinctive rosehip layer 1 inhibitory neuron type in human cortex that differs from neighboring neurogliaform cells. c, Morphological and physiological differences between layer 1 neurogliaform and single bouquet neurons shown by patch-seq analysis. Scale bars as in b. d, RNA-seq analysis of retrogradely labeled neurons in mouse primary visual cortex show distinctive projections of excitatory subclasses, but overlapping projections for finer transcriptomic cell types. Images in a adapted with permission from ref. 31, Oxford Univ. Press; in b–d, adapted with permission from refs. 75,76 and 40, respectively, Springer Nature. at the more refined branches of the mentioned RNA-seq study of retrogradely at the major branches of the transcriptomic transcriptomic classification remains largely labeled neurons in mouse primary visual taxonomy, there were overlapping to be validated. One example is the already cortex40. Despite distinct projection targets projections for finer transcriptomic cell

Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience 1461 comment | FOCUS comment | FOCUS types (Fig. 3d). One possible explanation including astrocyte diversity and divergent methods) and would include a description is that long-range connectivity patterns are molecular phenotypes between mouse of quantitative metrics such as resolution, set up early in development and may not be and human that correlate with known complexity, variability, uniqueness and strongly reflected in adult gene expression. morphological specializations in primate association of variables with other attributes. However, such mismatches do not negate the astrocytes36,74,84. Such similarities and There are two approaches to find and test value of a core transcriptomic classification differences between cell types across species, cluster validity. One is ‘hard’ clustering, with as described above. Rather, this information as well as challenges created by graded or clearly defined borders between clusters and about developmental trajectories needs to developmental variations in features, could with each cell strictly assigned to a particular be incorporated into the transcriptomic cell also be better captured by a probabilistically type. Alternatively, in ‘soft’ (or ‘fuzzy’) type classification28. defined and hierarchically organized clustering, any given cell has a particular Another challenge to transcriptomic cell-type taxonomy. probability of belonging to a particular classifications (and, in fact, to any cluster. Despite the probabilistic nature, classification of cell types) is the presence A probabilistic and hierarchical inter- and intra-cluster distance may still be of phenotypic variation within a given cell defnition of cortical cell types defined for outcome validation. Ultimately, type. One facet of this is the possibility of Examining the current transcriptomic the consensus description of cell types may variation in gene expression due to cell evidence, in some cases we find highly form a continuum, beginning with hard and state, differentiation and other dynamic distinct cell types based on robust ending with soft distinctions among cell processes within a single cell type. Some similarities of the transcriptome and other types, with an ambiguous transition between studies have suggested that cell types are measurable cell attributes, as exemplified these extremes. possibly not defined, discrete entities and by the phenotypic homogeneity of One natural approach to represent a may be better described as components of neocortical chandelier cells40,85–87 or the transcriptomic taxonomy is to adopt a a complex landscape of possible states78–80, above-mentioned rosehip cells. On the other hierarchical framework. Cluster analysis is and, indeed, some of that heterogeneity hand, the existence of cell states, spatial well suited to this, as its connectivity-based can be mapped with omics data81. Some gradients of phenotypes and mixtures of methods generate a tree-like representation continuous variation could be functionally differences and similarities in cross-species of clusters99. This approach follows the relevant. For example, basal dendritic comparisons present challenges to a historical tradition of using cladistics to lengths and morphological complexity of discrete and categorical perspective on classify organisms, assuming common layer 2/3 pyramidal cells appears to vary defining cell types. Prematurely adopting ancestors in their evolution and smoothly across a rostrocaudal axis in an inflexible definition of types will obscure synapomorphies (shared derived traits) mouse cortex82 (Fig. 4a). Further evidence the significance of observed phenotypic among related clades. While statistical for spatial gradients can be found in the variability and its biological interpretation. clusters do not presume any hierarchy in the graded transcriptomic variation across Rather, a plausible way forward is to employ structure of the data, biological systems have the human cortex83, perhaps reflecting the a practical or operational quantitative a temporal evolution as one of their essential expression of transcription factor gradients definition of a cell type. features and makes temporally based in the ventricular zone during development Cluster analysis has been used to hierarchies natural100. The evolutionary or (Fig. 4b). These phenotypic or spatial classify cortical neurons according to their developmental history of a neural circuit gradients create challenges for thresholding structural or physiological phenotypes or implies earlier stages, which are often less in clustering, and they fuel debates between expression of molecular markers13,14,22,31,82,87–90 specialized and represent common ancestors lumpers and splitters in determining the and, more recently, transcriptomics36,40,91,92. of later states101. Indeed, a hierarchical right level of granularity in defining Many unsupervised and supervised organization of existing transcriptomic cell cell types. methods can be used, including multilayer types data appears to mirror developmental A particular advantage of a perceptrons16, logistic regression16, k-nearest principles and spatiotemporal organization transcriptomic classification is that it neighbors16, affinity propagation93, in the neocortex (see above). Another provides a direct avenue for quantitative Bayesian classifiers34, naïve Bayes16, topic advantage of casting the cell type comparative analysis by aligning cell modelling94, t-distributed stochastic classification as a cladistic one is that the types across species based on shared gene neighbor embedding (t-SNE)95,96, graph lumping–splitting tension maps itself covariation, enabling an ‘Ur-classification’ theory97 and autoencoders98. These methods, naturally as a distinction between different as a common denominator of basic cell building on the existence of statistically levels of the hierarchical tree, since one can types. For example, a recent study of human defined groups or clusters over a set of split a group into subgroups at a lower level cortex61 demonstrated that the overall measurable attributes, naturally lead to an of the hierarchy to reflect data obtained in cellular organization of the human cortex evidence-based probabilistic definition different physiological or developmental is highly conserved with that of the mouse, of cell types. conditions. This provides an effective and allowing identification of homologous cell A probabilistic definition of cell types is objective framework to quantitatively types (Fig. 4c). However, this study also particularly applicable to transcriptomics, evaluate lumper-vs-splitter discussions. revealed a challenge for the future, in that, where the dimension of the underlying But hierarchical transcriptomic in many cases, it was not possible to align space is large, the variance comparatively relationships may not be easily represented cell types across species at the finest levels high and competing approaches give similar as a simple tree-like structure. Rather, they of granularity but rather at a higher level in results. However, one requires community may have complex inclusion–exclusion the hierarchical taxonomy. Furthermore, consensus on a rigorous statistical definition and class relationships and may be many differences were seen in homologous of transcriptomic types and the description more amenable to graph-based or other types, including their proportions, of intra- and inter-type variability. Ideally, set-theoretic constructions. Indeed, the laminar distributions, gene expression and this quantitative definition of a cell type space of the transcriptomes for cortical morphology. Finally, prominent differences would be independent of the statistical cell types could be visualized as a complex, were found in non-neuronal cells as well, method used (i.e., robust to different high-dimensional landscape with isolated

1462 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment

a b Graded areal variation in morphology Graded areal variation in transcriptome

Total dendritic nodes

Parietal 0.2 25 SPL PoG lobe SFG opIFG 20 PrG MFG PrG SMG PCLA 0.1 SFG 15 MFG AnG Lateral view trIFG SPL opIFG orIFG PCLP fro SMG 10 AnG FP IFGIFG trIFGtrIFG TG PLT SOG FP Pcu HG LOrG PLP 0.0 STG PoG Number of nodes 5 STG MTG SOG orIFG PLT POrG ITG OTG IOGIOG Occipital AOrG MTG PC2 IRoG 0 lobe SRoGMOrG V2L/ S2 M2 Frontal TP ITGITG GRe HG TG TeA lobe –0.1 PaOG PLP PCL P Cortical region Temporal FuG PCL A lobe Peristriate Basal dendritic field area SFG PCu TP 80 –0.2

60 FP Peristriate Striate (area 17) –0.3 SRoG PaOG 40 SRoG PaOG Peristriate Medial view GRe OTG Primary visual cortex TP FuG IOGIOG (area(Area 17) 17) 20 –0.4 Area (x 10^3 um^2) 0 Neocortical spatial topography V2L/ S2 M2 –0.2 0.0 0.2 0.4 TeA Cortical region Neocortical genetic topography c Cross-species transcriptome alignment

Pax6 Inh L1−2 PAX6 CDH12 Inh L1−2 PAX6 TNFAIP8L3 LAMP5/ Lamp5 1 Inh L1 LAMP5 NMBR Lamp5 Rosehip *Lamp5 Rosehip Inh L1−6 LAMP5 LCP2 PAX6 * Lamp5 Lhx6 Inh L1−2 LAMP5 DBP * *Lamp5 Lhx6 Inh L2−6 LAMP5 CA1 Pax6 Inh L1 SST CHRNA4 Vip Sncg Lamp5 2 Inh L1−2 ADARB2 MC4R Lamp5 1 Inh L1−2 SST BAGE2 Lamp5 2 Inh L1−3 VIP SYT6 Vip 1 Vip Sncg Inh L1−2 VIP TSPAN12 Inh L1−4 VIP CHRNA6 Vip 2 Cross-speciesVip 1 transcriptome alignment Inh L1−3 VIP ADAMTSL1 Vip 3 c Inh L1−4 VIP PENK Vip 4 Vip 2 Inh L2−6 VIP QPCT Vip 5 Inh L3−6 VIP HS3ST3A1 Sst Chodl Vip 3 Inh L1−2 VIP PCDH20 * Inh L2−5 VIP SERPINF1 Sst 2 Inh L2−5 VIP TYR VIP Sst 3 Vip 4 Inh L1−3 VIP CHRM2 Sst 1 Inh L2−4 VIP CBLN1 Sst 4 Inh L1−3 VIP CCDC184 Sst 5 Inh L1−3 VIP GGH Vip 5 Inh L1−2 VIP LBH Pvalb 1 Inh L2−3 VIP CASC6 Pvalb 2 Inh L2−4 VIP SPAG17 *Chandelier Sst Chodl Inh L1−4 VIP OPRM1 Exc L5–6 IT 1 * Inh L3−6 SST NPY * Exc L3–5 IT Sst 1 Inh L3−6 SST HPGD * Inh L4−6 SST B3GAT2 Exc L5–6 IT 2 Sst 2 Inh L5−6 SST KLHDC8A Exc L4–5 IT Sst 3 Inh L5−6 SST NPM1P10 Exc L5–6 IT 3 Sst 4 Inh L4−6 SST GXYLT2 SST Exc L2–3 IT Inh L4−5 SST STK32A Exc L6 IT 1 Sst 5 Inh L1−3 SST CALB1 Inh L3−5 SST ADGRG6 Exc L6 IT 2 Inh L2−4 SST FRZB Exc L5 ET Inh L5−6 SST TH Exc L5–6 NP Pvalb 1 Inh L5−6 LHX6 GLP1R Exc L6 CT Inh L5−6 PVALB LGR5 Exc L6b Inh L4−5 PVALB MEPE OPC Inh L2−4 PVALB WFDC2 PVALB Pvalb 2 Inh L4−6 PVALB SULF1 Astrocy* te Inh L5−6 SST MIR548F2 Oligo Chandelier Inh L2−5 PVALB SCUBE3 Endothelial * Microglia/PVM acr3 Gl ra3 Ptgd r

1 Cluster acstd2 2 Sst Nts 2 Human MTG Mouse V1 a a

Sst Esm1 overlap n ac1 T ac2 Myh4 ac1 Htr1d Sst Chodl n 0.8 One-to-one Pvalb Vipr2 Pvalb Tpbg * ac2 T Sncg Gpr50 Lamp5 Lhx6 Lamp5 Lsp1 Pvalb Th Sst Number of clusters Lamp5 Krt73 Pvalb Gabrg1 Sncg Vip Itih5 Sncg Slc17a8 0.6 Vip Chat Htr1f Vip Lect1 Oxtr Serpinf1 Clrn1 Vip Ptprt Pkp2 Sst T Sst T Vip Igfbp6 Pltp Sst Myh8 Fibin Vip Lmo1 Myl1 Sst Myh8 Etv1 Sst Rxfp1 Eya1 Sst Hpse Cbln4 Sncg Vip Nptx2 Pvalb Calb1 Sst Vip Rspo1 Itga4 Vip Pygm C1ql1 Pvalb Reln Tac1 Meis2 Adamts19 Pvalb Reln Itm2a Vip Igfbp6 Car10 Sst T Sst Rxfp1 Prdm8 Pvalb Gpr149 Islr Sst Ch r Sst Calb2 Pdlim5 Sst Ch r Sst Crhr2 Efemp1 Sst Nr2f2 Necab1 Sst Hpse Sema3c Vip Gpc3 Slc18a3 Sst Calb2 Necab1

Serpinf1 Aqp5 Vip 0.4 Vip Crispld2 Htr2c Lamp5 Ntn1 Npy2r Vip Col15a1 Pde1a Vip Crispld2 Kcne4 Pvalb Akr1c18 Ntf3 Vip Lmo1 Fam159b Vip Igfbp4 Mab21l1 Sst Mme Fam114a1 Lamp5 Plch2 Dock5 Pvalb Sema3e Kank4 Vip Arhgap36 Hmcn1

Vip Rspo4 Rxfp1 Chat 0.2 Lamp5 Fam19a1 Pax6 Neurogliaform Sst Crh 4930553C11Rik 0

Lamp5 Fam19a1 Tmem182 Single-bouquet Bipolar or Long-range Layer 5 Upper layer Upper layer Chandelier Mouse Multipolar projecting Martinotti Martinotti Basket morphology Mouse V1 cluster

Fig. 4 | Challenges for transcriptomic classification. a, Gradients in morphological size and complexity across the rostrocaudal extent of the cortex. b, Graded transcriptomic variation across the human cortex encodes rostrocaudal position on the cortical sheet. c, Transcriptomic cell types can be aligned across species based on shared molecular specification, but often at a lower level of resolution than the finest types observed in a given species. Images in a adapted with permission from ref. 82, Oxford Univ. Press; in b and c, adapted with permission from refs. 83 and 61, respectively, Springer Nature.

peaks of expression for a given cell type but a cell type as a continuous trajectory in a classification system per se, but to create also valleys and gradients between more transcriptomic space102. A robust statistical a comprehensive description of cellular weakly defined classes, which could be framework that enables a quantitative diversity in the neocortex. One needs to described alternatively as types or states. definition of cell type (or tendency to be a ensure that the experimental method will Such complexity can be described using, for type) is clearly needed. indeed capture all of the cell types present, example, the concept of cell-type attractors28, A final, and key, question is how to that the classification is complete and that or using the distinction between core and ensure that any given classification or the types are defined correctly. For any intermediate cells40 or the description of taxonomy is valid. The goal is not defining classification to be valid, it is critical to

Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience 1463 comment | FOCUS comment | FOCUS ensure accuracy and correctness. First, it types, it seems likely that many of these in one species are not conserved in other is imperative to seek internal statistical types will vary in a somewhat continuous species. The traditional way of naming cell robustness for identified clusters, using fashion across cortical areas and possibly types is by their anatomical features (such different statistical methods22,103. Second, also across species (Fig. 4a,b). Likewise, as chandelier, double-bouquet, basket, external validation with orthogonal the classification system should also have Martinotti, pyramidal cells), and it would datasets is critical. Multimodal datasets a temporal component to capture the be desirable to incorporate these short and are particularly important in this regard, developmental trajectory from progenitor widely-used names into a nomenclature as they enable cross-comparisons between cell division to a terminally differentiated when possible, to seek consistency with classifications based on different types of state. Cells can be quantitatively defined the vast literature on neocortical cell data, for example, molecular, physiological by their position on that developmental types. However, anatomical features, such or anatomical22,31, patch-seq76, or spatial or spatial gradient. Finally, aligning across as horsetail , may also vary across transcriptomics methods54 (Fig. 3a–c) can species is quantitatively possible now, but species17. Also, for newly identified cell enable this, defining functionally relevant this alignment may only be possible at types, anatomical information is often not levels of granularity. Finally, a probabilistic different levels of granularity with increasing available and naming them by marker genes definition, particularly with a Bayesian evolutionary distance. The benefits of will be more practical. framework, can be tested by generatively creating a unified reference ontology across Adopting a more abstract nomenclature building computational models of each cell these biological axes will be large, but it will not based on anatomical features or type and comparting them with the real be a serious community effort to design a individual marker genes could make it data, thus providing some performance system that can accommodate them. more flexible, more easily applicable across metrics on the algorithms. Using these Following the genetic classification species and more compatible with other criteria, robustness, reproducibility and paradigm proposed here, there are many tissues outside the cortex or the brain. One predictive power can be measured and lessons to learn from genomics. For idea for a cell-type nomenclature system different approaches compared, as is example, the reference classification could is to build on gene nomenclature, treating normally done in machine learning16. be iteratively updated and refined with transcriptomic cell clusters as sequence subsequent accumulation of data108 like data (partially implemented for Allen A unifed ontology and nomenclature genome builds, which changed in the early Institute datasets; https://portal.brain-map. of cortical cell types years but have become increasingly stable. org/explore/classes/nomenclature). Every To truly gain community adoption, the As in current gene nomenclature, an official cell cluster from a dataset or analysis data-driven transcriptomic classification of symbol with multiple aliases can link cell would get a unique accession ID. Robust cortical cell types requires a formal unified types to commonly used terminology and reproducible clusters would have cell type classification, a taxonomy and a relating to cellular anatomy or other official cell type names or symbols, as nomenclature system17,20,90 whose principles phenotypes. This nomenclature should be well as any number of aliases that could are generalizable to other systems. Names portable across species, with orthologous represent different existing nomenclatures are important: as an old Basque proverb cell types having common names, much as or historical names. In addition to cell states, ‘izena duen guzia omen da’ or ‘that current gene symbols refer to orthologous types, higher-order classes (for example, which has a name exists’, and a similar genes. For the cell type classification caudal ganglionic eminence (CGE)-derived Chinese one says ‘the beginning of wisdom to be useful like the genome has been, GABAergic interneurons, GABAergic is to call things by their right names’. This computational tools conceptually similar interneurons, neurons) could be named as classification should aim to be a consensus to BLAST alignment tools109 for mapping well, and both types and classes would be one that incorporates the richness of data sequence data, need to be developed to matched across species at the level (type, accumulated by different groups and be allow researchers to quantitatively map their class) at which they can be aligned. presented in a curated output that is public, data to this reference classification. Finally, easily accessible and has revisions managed continuing the analogy with genomics, just A cell-type knowledge graph for by a curation committee of experts. Creation as there are different versions of genome community data aggregation of such an ontology is a serious project in builds for different purposes (for example, Defining the cell types of the cortex data organization that can build on prior with more or less manual curation), one (or other brain structures) serves as a efforts in cell ontologies104–106, as well as could consider different versions of cortical foundation for aggregating information best practices established by the ontology cell type taxonomies, with varying levels of about their function. By analogy to the development community107 (see Open splitting or lumping; spatial, temporal or genome, the definition of genes has allowed Biomedical Ontology Foundry, http://www. evolutionary criteria; or even some manually a massive integration of information about obofoundry.org). curated by experts, but under a unified their usage, function and disease relevance A true, data-driven transcriptomic framework of probabilistic definition with a wide range of databases. On the other taxonomy poses a series of challenges of cell types. hand, probabilistically defined cell types are that have not yet been taken on by the Nomenclature also poses a challenge. not the same as deterministically defined cell ontology community, but that are Currently, the lack of standardized protein-coding regions of the genome, and surmountable. One challenge is that nomenclature makes it difficult to track we can expect that our understanding of transcriptomically similar cell types can and relate cell types across different studies. cell types and their functional relevance exist in multiple anatomical locations. Thus, One natural idea with a genetically based will change as more information becomes transcriptomic types need to be related to paradigm is to name cell types on the basis available. A more flexible way to organize proper levels of the anatomical structure. of the best defining genes for each cell our knowledge and understanding of Prominent gradients across cortical areas type, as is currently commonly done36,61,110. cell types would be as a living, updatable pose another challenge to define in a However, the most specific genes are not framework, one allowing reference, taxonomy. While any given cortical region always detected in every cell of a cluster, and query and inference. An online-based contains some number of transcriptomic often the genes that best define a cell type data aggregation platform could also

1464 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment

a scRNA-seq Spatial anatomy d Glutamatergic GABAergic Cell type knowledge graph

50 Genes 0 mV –50 b 0 10 20 Prediction Probabilistic cell type definition ms Electrophysiology Core 3 10 50 100 250 300 Semantic Morphology Intermediate integration Inference 3 10 50 100 Stable Linked data Multimodal assignment c Transcriptome-based taxonomy

Astro/Oligo Neuroectoderm SST ALM Supporting Neurons L2/L3 IT 100 Dissected area modalities Layered H3K27Ac Glutamatergic VISp 0 ALM GABAergic ALM VISp L6 IT ALM VISp Pvalb Serpinf1 L5/L6 IT L5 IT VISP DNase Clusters Sst Vip Lamp5 L2/L3 IT 4.88 L2/L3 IT Dissected layer(s) Cons 100 Verts L1 L1-L2/3 Structured data L2/3 L1-L4 L4 L2/3-L4 Annotation and L5 L4-L5 L6 L4-L6 Ontology L6b L5-L6 All layer(s)

Fig. 5 | Transcriptome based taxonomy, probabilistic cell types, and cell-type knowledge graphs. a, A transcriptome-based cell-type taxonomy is constructed from scRNA-seq data, related epigenomic datasets and , b, Cell types are initially defined based on transcriptomic signatures in a probabilistic manner with multiresolution clustering and statistical analysis to identify robustness and variability. c, Reproducible gene expression patterns identify hierarchies of putative cell types that are subject to further analyses and validation. d, Transcriptomic cell-type taxonomies form a basis for constructing cell-type knowledge graphs that summarize the present state of definable cell types. Multimodal assignment of data, such as morphology, electrophysiology and connectivity, is associated and reported with statistical variability over assigned types. A knowledge graph contains relevant and essential supporting information, such as supporting data for further analysis and mapping, descriptive annotation and ontology, and literature citations. have a significant sociological impact in at present, but could be measured in and nomenclature, providing a common neuroscience by encouraging collaborative future CAP datasets, which could then be denominator for the research in the field, participation. added to the knowledge graph. In such integrating quantitative and qualitative One example of an appropriate data a knowledge graph, there are two basic cell-type classification, and allowing for structure for such a community platform use-cases as new data becomes available. updates, subject to review and validation. is a ‘knowledge graph’, a widely used tool in First, one can use it to identify known cell Computational engines would allow new the tech industry and computer science types and their properties in new datasets. data to be compared and allow users as a platform for data aggregation (Fig. 5). With a probabilistic or Bayesian definition, to query the current state of cell-type A knowledge graph is a relational data each new cell will be assigned a probability understanding from the perspective of structure in which nodes represent entities of belonging to a particular type in the their new data, assigning the most likely (such as cell types and their attributes) and graph. Second, the graph can be manually type to multi- or unimodal datasets based the links, or edges, between them represent or automatically updated, following on similarities to the current framework’s their relational and statistical associations. conventional optimization algorithms, as knowledge. In addition to supporting There is a measurable graph-theoretic new data can change node identities and literature reference, the dynamic framework distance between nodes based on probable distances with respect to one another. might include online forums for scientific associations and known relationships. The The proposed cell-type knowledge discussion and education. Ultimately, a cortical cell knowledge graph could be framework would represent a living and cell-type community knowledge framework initialized with standardized transcriptomics updatable resource that maintains an would be a dynamic and living resource that data, after which other data modalities and actively derived and flexible ontology of researchers, clinicians and educators could related taxonomies could be readily mapped cortical cell types, benefitting from present refer to as the benchmark resource for cell onto the graph to capture anatomical, active ontology efforts. This standardized types in the cortex, promoting collaborative electrophysiological, developmental and database could be powered by open-source participation in the field. other cell properties. For example, important algorithms and managed and curated by contributors to cell identity, determined database administrators. It would be a Maintaining and updating the by cellular interactions, splicing, local dynamic database with query capability, classifcation translation, protein phosphorylation, etc., but would only accept peer-reviewed The classification, nomenclature and may not be readily captured by scRNA-seq published data in a standardized fashion associated knowledge graph could be

Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience 1465 comment | FOCUS comment | FOCUS managed by a committee of experts and anchored on quantitative criteria that If successful, this community-based representing the breadth of approaches and operationally define cell clusters based on classification effort, joined by a common disciplines in the field. Such a committee their statistical and probabilistic grouping. nomenclature and nourished by the would be charged with designing the Although molecularly driven initially, this knowledge graph, could be extended and statistical classification model to sustain taxonomy should be revised and modified as generalized to other parts of the brain or a basic taxonomy; the type of open additional CAP datasets become available, of the body. In this sense, the classification platform to use for the knowledge graph; becoming a true multimodal classification of neocortical cell types, a field with a long the rules by which this taxonomy can be of cortical cell types. We view this core tradition and multidimensional approach updated and revised; the quality control or classification as potentially valid for all to a central problem in neuroscience, could peer-reviewed criteria; and the metadata mammalian species and also as likely be an ideal test case to explore this novel to be added. While the knowledge graph applicable to homologous structures in organization of knowledge in neuroscience could continually update itself automatically, other vertebrates, as a broad framework to and, more generally, in biology. ❐ as new data is imported, different curated encapsulate evolutionary conservation with versions of the graph might be released in species specialization. Indeed, only with Rafael Yuste 1 ✉ , Michael Hawrylycz 2 ✉ , regular updates. This committee, arising such a systematic approach to comparing Nadia Aalling 3, Argel Aguilar-Valles4, from expert volunteers, could also help with cell types across species will it be possible to Detlev Arendt5, Ruben Armañanzas 6,7, vetting of a unified nomenclature of cortical understand how cell type diversity evolved Giorgio A. Ascoli6, Concha Bielza8, cells that is succinct, useful and informative, in the cerebral cortex. Vahid Bokharaie9, as well as methods by which community This taxonomy will only be useful and Tobias Borgtoft Bergmann 3, input would be incorporated in a fair and successful if adopted by the community. Irina Bystron10, Marco Capogna 11, efficient fashion. So, in addition to the nomenclature, a series YoonJeung Chang12, Ann Clemens 13, Potentially, such a committee might of research tools should be developed, Christiaan P. J. de Kock14, Javier DeFelipe15, be established and supported through ideally by a community consortium, to Sandra Esmeralda Dos Santos16, existing organizations or consortia facilitate similar experimental access to Keagan Dunville17, Dirk Feldmeyer18, with interest in cell type classification, these cell types by the broader range of Richárd Fiáth19, Gordon James Fishell20, such as the NIH BRAIN Initiative Cell investigators. We envision molecular and Angelica Foggetti21, Xuefan Gao22, Census Network (BICCN; https://www. genetic tools, such as standard sets of Parviz Ghaderi 23, Natalia A. Goriounova 14, biccn.org), the NeuroLex–International antibodies and RNA probes to identify Onur Güntürkün24, Kenta Hagihara25, Neuroinformatics Coordinating Facility key molecular markers for each cell type, Vanessa Jane Hall 3, (INCF; http://130.229.26.15/news/activities/ as well as cell or mouse lines that are used Moritz Helmstaedter 26, our-programs/pons/neurolex-wiki. as resources for the entire community. Suzana Herculano-Houzel16, html), the Neuroscience Information Statistical tools to enable direct comparisons Markus M. Hilscher 27,28, Framework (NIF; https://neuinfo.org), among datasets, and to enable mapping Hajime Hirase3, Jens Hjerling-Lefer 27, the Human Brain Project (HBP; https:// new datasets to reference datasets, are Rebecca Hodge 2, Josh Huang 29, www.humanbrainproject.eu/), the Human essential. An open informatics backbone Rafq Huda 30, Konstantin Khodosevich3, BioMolecular Atlas Program (HuBMAP; needs to be developed as an essential part of Ole Kiehn 31, Henner Koch32, https://commonfund.nih.gov/hubmap) or the taxonomy, as well as visualization and Eric S. Kuebler 33, the Human Cell Atlas (HCA; https://www. analysis tools that take advantage of this Malte Kühnemund34, Pedro Larrañaga8, humancellatlas.org/). Some of these groups taxonomy and allow scientists to explore Boudewijn Lelieveldt35, Emma Louise Louth11, are already chartered with mapping the cell the data, add to the knowledge base and Jan H. Lui36, Huibert D. Mansvelder 14, types of the nervous system or other organs achieve new knowledge. Oscar Marin 37, Julio Martinez-Trujillo 38, in the body and may have resources to build In addition, we propose that the Homeira Moradi Chameh39, the backend technological infrastructure community input to support this taxonomy Alok Nath Mohapatra 40, needed for the knowledge graph. and enable its future revisions be channeled Hermany Munguba27, Regardless of who supports and into an open platform, a knowledge graph, Maiken Nedergaard 41, maintains this key infrastructure, it is as is becoming increasingly common in Pavel Němec42, Netanel Ofer 43, critical that the efforts be managed through community-led data science. Aggregation Ulrich Gottfried Pfsterer3, Samuel Pontes 1, open communication with the community. of knowledge through data graphs, now William Redmond 44, Jean Rossier45, A public consortium will be a logical a common practice in the tech industry, Joshua R. Sanes 46, Richard H. organizational structure for channeling will accelerate the dissemination of Scheuermann 47,48, Esther Serrano-Saiz49, diverse inputs and will also adequately knowledge and could avoid the ‘publication Jochen F. Staiger50, Peter Somogyi10, represent the wider community, reflecting graveyard’, where data are stored away Gábor Tamás 51, Andreas Savas Tolias 52, cultural, geographic, ethnic and gender in siloed journal articles disconnected Maria Antonietta Tosches 1, diversity. Strong community engagement from the rest of the field. Anchoring this Miguel Turrero García53, will ensure wide acceptance and ensure that taxonomy and knowledge graph, a unified Christian Wozny 54,55, these standards are adopted widely, within new nomenclature of cortical cell types Thomas V. Wuttke 56, Yong Liu 3, and outside of the neocortex specialist field. valid across species is needed to centralize Juan Yuan27, Hongkui Zeng 2 ✉ and efforts in the field, with a generalizable Ed Lein 2 ✉ A community-based taxonomy and framework to integrate with other cell-type 1Columbia University, New York City, NY, USA. nomenclature of cortical cell types classifications. We view the establishment 2Allen Institute for Brain Science, Seattle, WA, To conclude, we think that the field of of a common nomenclature as an essential USA. 3University of Copenhagen, Copenhagen, neocortical studies is ready for a synthetic, step to provide a standardized language that Denmark. 4Department of Neuroscience, Carleton principled classification of cortical cell types, enables the meaningful aggregation and University, Ottawa, Ontario, Canada. 5European based on single-cell transcriptomic data sharing of data. Molecular Biology Laboratory, Heidelberg, Germany.

1466 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment

6George Mason University, Fairfax, VA, USA. [email protected]; 37. Paul, A. et al. Cell 171, 522–539.e20 (2017). 7BrainScope Company Inc., Bethesda, MD, USA. [email protected]; [email protected] e520. 8 38. Fishell, G. & Heintz, N. Neuron 80, Universidad Politécnica de Madrid, Madrid, Spain. 602–612 (2013). 9 Max Planck Institute for Biological Cybernetics, Published online: 24 August 2020 39. Nelson, S. B., Sugino, K. & Hempel, C. M. Tübingen, Germany. 10University of Oxford, https://doi.org/10.1038/s41593-020-0685-8 Trends Neurosci. 29, 339–345 (2006). Oxford, UK. 11Department of Biomedicine, Aarhus 40. Tasic, B. et al. Nature 563, 72–78 (2018). University, Aarhus, Denmark. 12Department of References 41. Yager, T. D., Nickerson, D. A. & Hood, L. E. 1. Magner, L.N. A History of the Life Sciences Trends Biochem. Sci. 16, 454 (1991). 456, 458 Genetics, Harvard Medical School, Boston, MA, (Marcel Dekker, 1979). passim. 13 USA. Te University of Edinburgh, Edinburgh, 2. Ramón y Cajal, S. Rev Ciencias Méd. Barcelona 42. Alivisatos, A. P. et al. Neuron 74, UK. 14Vrije Universiteit Amsterdam, Amsterdam, 18, 361–376 (1892). 457–476, 505–520, 529–541. 970–974 (2012). Netherlands. 15Cajal Institute, Madrid, Spain. 3. Ramón y Cajal, S. Recuerdos de Mi Vida: Vol.2. 43. Bargmann, C. I. & Newsome, W. T. JAMA Neurol. 71, 675–676 (2014). 16Vanderbilt University, Nashville, TN, USA. Historia de Mi Labor Científca (Imprenta y 17 18 librería de Nicolás Moya, 1917). 44. Macosko, E. Z. et al. Cell 161, 1202–1214 (2015). Scuola Normale Superior, Pisa, Italy. Research 4. Ramón y Cajal, S. La Textura del Sistema 45. Klein, A. M. et al. Cell 161, 1187–1201 (2015). 19 Centre Jülich, Jülich, Germany. Research Centre Nerviosa del Hombre y los Vertebrados (Imprenta 46. Zheng, G. X. et al. Nat. Commun. 8, for Natural Sciences, Budapest, Hungary. 20Harvard y librería de Nicolás Moya, 1904). 14049 (2017). Medical School, Cambridge, MA, USA. 21Chris 5. Hubel, D. H. & Wiesel, T. N. Proc. R. Soc. Lond. 47. Habib, N. et al. Nat. Methods 14, 955–958 (2017). tian-Albrechts-University Kiel, Kiel, Germany. B Biol. Sci. 198, 1–59 (1977). 48. Bush, E. C. et al. Nat. Commun. 8, 105 (2017). 22 6. Douglas, R. J., Martin, K. A. C. & Whitteridge, European Molecular Biology Laboratory, Hamburg, D. Neural Comput. 1, 480–488 (1989). 49. Garber, M., Grabherr, M. G., Guttman, M. & Germany. 23École Polytechnique Fédérale de 7. Mountcastle, V.B. Perceptual Neuroscience: Te Trapnell, C. Nat. Methods 8, 469–477 (2011). Lausanne, Lausanne, Switzerland. 24Ruhr University Cerebral Cortex (Harvard Univ. Press, 1998). 50. Stuart, T. & Satija, R. Nat. Rev. Genet. 20, Bochum, Bochum, Germany. 25Friedrich Miescher 8. Lorente de Nó, R. Trab. Lab. Invest. Bio. 257–272 (2019). 51. White, J. G., Southgate, E., Tomson, J. N. & Institute for Biological Research, Basel, Switzerland. (Madrid) 20, 41–78 (1922). Brenner, S. Philos. Trans. R. Soc. Lond., B 314, 26 9. Szentágothai, J. Brain Res. 95, 475–496 (1975). Max Planck Institute for Brain Research, Frankfurt, 10. Peters, A. & Jones, E.G. Cerebral Cortex 1–340 (1986). Germany. 27Karolinska Institutet, Stockholm, (Plenum, New York, 1984). 52. Ecker, J. R. et al. Te BRAIN Initiative Cell Sweden. 28Science for Life Laboratory, Department of 11. Lund, J. S. Annu. Rev. Neurosci. 11, Census Consortium. Neuron 96, 542–557 (2017). Biochemistry and Biophysics, Stockholm University, 253–288 (1988). 53. Regev, A. et al. eLife 6, e27041 (2017). 12. Jones, E.G. & Diamond, I.T. (eds.). Te Barrel 54. Lein, E., Borm, L. E. & Linnarsson, S. Science Solna, Sweden. 29Cold Spring Harbor Laboratory, Cortex of Rodents (Plenum, 1995). 358, 64–69 (2017). 30 Laurel Hollow, NY, USA. WM Keck Center for 13. Kozloski, J., Hamzei-Sichani, F. & Yuste, R. 55. Zeng, H. & Sanes, J. R. Nat. Rev. Neurosci. 18, Collaborative Neuroscience, Department of Cell Science 293, 868–872 (2001). 530–546 (2017). Biology and Neuroscience, Rutgers University - New 14. Cauli, B. et al. J. Neurosci. 17, 3894–3906 (1997). 56. Huang, Z. J. & Paul, A. Nat. Rev. Neurosci. 20, Brunswick, Piscataway, NJ, USA. 31Department 15. Tsiola, A., Hamzei-Sichani, F., Peterlin, Z. & 563–572 (2019). Yuste, R. J. Comp. Neurol. 461, 415–428 (2003). 57. Fu, M. & Zuo, Y. Trends Neurosci. 34, of Neuroscience, University of Copenhagen, 16. Guerra, L. et al. Dev. Neurobiol. 71, 177–187 (2011). 32 Copenhagen, Denmark. RWTH Aachen University, 71–82 (2011). 58. Gerfen, C. R., Paletzki, R. & Heintz, N. Neuron Aachen, Germany. 33Robarts Research Institute, 17. Ascoli, G. A. et al. Nat. Rev. Neurosci. 9, 80, 1368–1383 (2013). Western University, London, Ontario, Canada. 557–568 (2008). 59. He, M. et al. Neuron 91, 1228–1243 (2016). 34CARTANA, Stockholm, Sweden. 35Leiden 18. Pelvig, D. P., Pakkenberg, H., Stark, A. K. & 60. Roselli, C. et al. Nat. Genet. 50, Pakkenberg, B. Neurobiol. Aging 29, 1225–1233 (2018). University Medical Center, Leiden, the Netherlands. 1754–1762 (2008). 61. Hodge, R. D. et al. Nature 573, 61–68 (2019). 36 37 Stanford University, Stanford, CA, USA. King’s 19. DeFelipe, J. et al. Nat. Rev. Neurosci. 14, 62. Nowakowski, T. J. et al. Science 358, College London, London, UK. 38Schulich School of 202–216 (2013). 1318–1323 (2017). Medicine and Dentistry, Departments of Physiology, 20. Shepherd, G. M. et al. Front. Neuroanat. 13, 63. Mayer, C. et al. Nature 555, 457–462 (2018). 64. Mi, D. et al. Science 360, 81–85 (2018). Pharmacology and Psychiatry, University of Western 25 (2019). 39 21. Markram, H. et al. Nat. Rev. Neurosci. 5, 65. Winnubst, J. et al. Cell 179, 268–281.e13 (2019). Ontario, London, Ontario, Canada. Krembil 793–807 (2004). 66. Sugino, K. et al. Nat. Neurosci. 9, 99–107 (2006). Research Institute, Toronto, Ontario, Canada. 22. McGarry, L. M. et al. Front. Neural Circuits 4, 67. Anderson, S. A., Eisenstat, D. D., Shi, L. & 40University of Haifa, Haifa, Israel. 41Univeristy of 12 (2010). Rubenstein, J. L. Science 278, 474–476 (1997). Rochester, Rochester, NY, USA. 42Charles University, 23. Markram, H. et al. Cell 163, 456–492 (2015). 68. Tosches, M. A. et al. Science 360, 881–888 (2018). Prague, Czech Republic. 43Bar Ilan University, Ramat 24. Butt, S. J. B. et al. Neuron 48, 591–604 (2005). 69. Peng, Y. R. et al. Cell 176, 1222–1237.e22 (2019). 44 25. Kawaguchi, Y. & Kubota, Y. Cereb. Cortex 7, Gan, Israel. Macquarie University, Sydney, New 476–486 (1997). 70. Martersteck, E. M. et al. Cell Rep. 18, South Wales, Australia. 45Sorbonne University, Paris, 26. Yuste, R. Neuron 48, 524–527 (2005). 2058–2072 (2017). France. 46Harvard University, Cambridge, MA, 27. Kessaris, N., Magno, L., Rubin, A. N. & 71. Häring, M. et al. Nat. Neurosci. 21, USA. 47J. Craig Venter Institute, La Jolla, CA, USA. Oliveira, M. G. Curr. Opin. Neurobiol. 26, 869–880 (2018). 72. Rosenberg, A. B. et al. Science 360, 48Department of Pathology, University of California, 79–87 (2014). 176–182 (2018). 49 28. Fishell, G. & Kepecs, A. Annu. Rev. Neurosci. 43, San Diego, CA, USA. Centro de Biologia Molecular 1–30 (2019). 73. Harris, K. D. et al. PLoS Biol. 16, Severo Ochoa (CSIC), Madrid, Spain. 50Institute for 29. Arendt, D. et al. Nat. Rev. Genet. 17, e2006387 (2018). Neuroanatomy, University of Göttingen, Göttingen, 744–757 (2016). 74. Zeisel, A. et al. Science 347, 1138–1142 (2015). Germany. 51University of Szeged, Szeged, Hungary. 30. Tosches, M. A. & Laurent, G. Curr. Opin. 75. Boldog, E. et al. Nat. Neurosci. 21, Neurobiol. 56, 199–208 (2019). 1185–1195 (2018). 52Baylor College of Medicine, Houston, TX, USA. 31. Dumitriu, D., Cossart, R., Huang, J. & Yuste, R. 76. Cadwell, C. R. et al. Nat. Biotechnol. 34, 53 Department of Neurobiology, Harvard Medical Cereb. Cortex 17, 81–91 (2007). 199–203 (2016). School, Boston, MA, USA. 54University of Strathclyde, 32. Jiang, X. et al. Science 350, aac9462 (2015). 77. Chen, K. H., Boettiger, A. N., Moftt, J. R., Glasgow, UK. 55MSH Medical School, Hamburg, 33. Wheeler, D. W. et al. eLife 4, e09960 (2015). Wang, S. & Zhuang, X. Science 348, Germany. 56Departments of Neurosurgery and of 34. Mihaljević, B. et al. BMC Bioinformatics 19, aaa6090 (2015). 511 (2018). 78. Durruthy-Durruthy, R. et al. Cell 157, Neurology and Epileptology, Hertie-Institute for 35. Shekhar, K. et al. Cell 166, 1308–1323.e30 964–978 (2014). Clinical Brain Research, University of Tübingen, (2016). e1330. 79. Trapnell, C. et al. Nat. Biotechnol. 32, Tübingen, Germany. 36. Tasic, B. et al. Nat. Neurosci. 19, 335–346 381–386 (2014). ✉e-mail: [email protected]; (2016). 80. Shalek, A. K. et al. Nature 510, 363–369 (2014).

Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience 1467 comment | FOCUS comment | FOCUS

81. Fiers, M. W. E. J. et al. Brief. Funct. Genomics 97. Tang, K., Ruozzi, N., Belanger, D. & Jebara, T. 110. Bakken, T. et al. BMC Bioinformatics 18, 17, 246–254 (2018). Bethe learning of graphical models via MAP 559 (2017). Suppl 17. 82. Benavides-Piccione, R., Hamzei-Sichani, F., decoding. Artifcial Intelligence and Statistics Ballesteros-Yáñez, I., DeFelipe, J. & Yuste, R. (AISTATS). Proc. Mach. Learn. Res. 51, Acknowledgements Cereb. Cortex 16, 990–1001 (2006). 1096–1104 (2016). This document resulted from group discussions at the 83. Hawrylycz, M. J. et al. Nature 489, 98. Sümbül, U., Zlateski, A., Vishwanathan, A., FENS/Brain Prize meeting, ‘The necessity of cell types for 391–399 (2012). Masland, R. H. & Seung, H. S. Front. Neuroanat. brain function’, that took place in Copenhagen, Denmark, 84. Bakken, T.E. et al. PLoS ONE 13, 8, 139 (2014). on 7–10 October 2018. We thank the FENS and Brain e0209648 (2018). 99. Romesburg, H.C. Cluster Analysis for Researchers Prize Foundation and staff for help and the Lundbeck 85. Somogyi, P. Brain Res. 136, 345–350 (1977). (Lifetime Learning, 1984). Foundation for support. This paper is dedicated to the 86. Fairén, A. & Valverde, F. J. Comp. Neurol. 194, 100. Arendt, D., Bertucci, P. Y., Achim, K. & memory of Sydney Brenner. 761–779 (1980). Musser, J. M. Curr. Opin. Neurobiol. 56, 87. Woodruf, A. R. et al. J. Neurosci. 31, 144–152 (2019). 17872–17886 (2011). 101. Wiley, E.O. & Liberman, B.S. Phylogenetics: Competing interests 88. Cauli, B. et al. Proc. Natl. Acad. Sci. USA 97, Teory and Practice of Phylogenetic Systematics The authors declare no competing interests. 6144–6149 (2000). (Wiley-Blackwell, 2011). 89. Krimer, L. S. et al. J. Neurophysiol. 94, 102. Siebert, S. et al. Science 365, eaav9314 Open Access Tis article is licensed 3009–3022 (2005). (2019). under a Creative Commons 90. Armañanzas, R. & Ascoli, G. A. Trends Neurosci. 103. Crow, M., Paul, A., Ballouz, S., Huang, Z. J. & Attribution 4.0 International License, 38, 307–318 (2015). Gillis, J. Nat. Commun. 9, 884 (2018). which permits use, sharing, adaptation, distribution 91. Andrews, T. S. & Hemberg, M. Mol. Aspects 104. Bard, J., Rhee, S. Y. & Ashburner, M. Genome and reproduction in any medium or format, as long Med. 59, 114–122 (2018). Biol. 6, R21 (2005). as you give appropriate credit to the original author(s) and 92. Kiselev, V. Y. et al. Nat. Methods 14, 105. Osumi-Sutherland, D. BMC Bioinformatics 18, the source, provide a link to the Creative Commons 483–486 (2017). 558 (2017). Suppl 17. license, and indicate if changes were made. Te images 93. Santana, R., McGarry, L. M., Bielza, C., 106. Masci, A. M. et al. BMC Bioinformatics 10, or other third party material in this article are included Larrañaga, P. & Yuste, R. Front. Neural Circuits 70 (2009). in the article’s Creative Commons license, unless 7, 185 (2013). 107. Smith, B. et al. Nat. Biotechnol. 25, indicated otherwise in a credit line to the material. 94. Liu, L., Tang, L., Dong, W., Yao, S. & Zhou, W. 1251–1255 (2007). If material is not included in the article’s Creative Springerplus 5, 1608 (2016). 108. Pereira, P., Gama, J. & Pedroso, J. IEEE Trans. Commons license and your intended use is not permitted 95. van der Maaten, L. & Hinton, G. J. Mach. Learn. Knowl. Data Eng. 20, 615–627 (2008). by statutory regulation or exceeds the permitted Res. 9, 2579–2605 (2008). 109. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. use, you will need to obtain permission directly 96. Mansergh, F. C., Carrigan, M., Hokamp, K. & & Lipman, D. J. J. Mol. Biol. 215, from the copyright holder. To view a copy of this license, Farrar, G. J. Mol. Vis. 21, 61–87 (2015). 403–410 (1990). visit http://creativecommons.org/licenses/by/4.0/.

1468 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience