comment | FOCUS

A community-based transcriptomics classification and nomenclature of neocortical cell types

To understand the function of cortical circuits, it is necessary to catalog their cellular diversity. Past attempts to do so using anatomical, physiological or molecular features of cortical cells have not resulted in a unified taxonomy of neuronal or glial cell types, partly due to limited data. Single-cell transcriptomics is enabling, for the first time, systematic high-throughput measurements of cortical cells and generation of datasets that hold the promise of being complete, accurate and permanent. Statistical analyses of these data reveal clusters that often correspond to cell types previously defined by morphological or physiological criteria and that appear conserved across cortical areas and species. To capitalize on these new methods, we propose the adoption of a transcriptome-based taxonomy of cell types for mammalian neocortex. This classification should be hierarchical and use a standardized nomenclature. It should be based on a probabilistic definition of a cell type and incorporate data from different approaches, developmental stages and species. A community-based classification and data aggregation model, such as a knowledge graph, could provide a common foundation for the study of cortical circuits. This community-based classification, nomenclature and data aggregation could serve as an example for cell type atlases in other parts of the body.

Rafael Yuste, Michael Hawrylycz, Nadia Aalling, Argel Aguilar-Valles, Detlev Arendt, Ruben Armananzas Arnedillo, Giorgio A. Ascoli, Concha Bielza, Vahid Bokharaie, Tobias Borgtoft Bergmann, Irina Bystron, Marco Capogna, Yoonjeung Chang, Ann Clemens, Christiaan P. J. de Kock, Javier DeFelipe, Sandra Esmeralda Dos Santos, Keagan Dunville, Dirk Feldmeyer, Richárd Fiáth, Gordon James Fishell, Angelica Foggetti, Xuefan Gao, Parviz Ghaderi, Natalia A. Goriounova, Onur Güntürkün, Kenta Hagihara, Vanessa Jane Hall, Moritz Helmstaedter, Suzana Herculano, Markus M. Hilscher, Hajime Hirase, Jens Hjerling-Leffler, Rebecca Hodge, Josh Huang, Rafiq Huda, Konstantin Khodosevich, Ole Kiehn, Henner Koch, Eric S. Kuebler, Malte Kühnemund, Pedro Larrañaga, Boudewijn Lelieveldt, Emma Louise Louth, Jan H. Lui, Huibert D. Mansvelder, Oscar Marin, Julio Martinez-Trujillo, Homeira Moradi Chameh, Alok Nath, Maiken Nedergaard, Pavel Němec, Netanel Ofer, Ulrich Gottfried Pfisterer, Samuel Pontes, William Redmond, Jean Rossier, Joshua R. Sanes, Richard Scheuermann, Esther Serrano-Saiz, Jochen F. Steiger, Peter Somogyi, Gábor Tamás, Andreas Savas Tolias, Maria Antonietta Tosches, Miguel Turrero García, Hermany Munguba Vieira, Christian Wozny, Thomas V. Wuttke, Liu Yong, Juan Yuan, Hongkui Zeng and Ed Lein

Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience

Classifications of cortical cell types: from Cajal to the Petilla Convention

The conceptual foundation of modern biology is the cell theory of Virchow, which described the cell as the basic unit of structure, reproduction and pathology of biological organisms1. This idea, which arose from the use of microscopes by Leeuwenhoek, Hooke, Schleiden and Schwann, among others, generated the need to build catalogs of the cellular components of tissues as the first step toward studying their structure and function. As with species, these cell catalogs, or atlases, can be ideally systematized into ‘cell taxonomies’, classifying groups of cells based on shared characteristics and grouping them into taxa with ranks and a hierarchy. Taxonomies are important: they provide a conceptual foundation for a field and also enable the systematic accumulation of knowledge. Essential to this effort is the clear definition of cell type, normally understood as cells with shared phenotypic characteristics.

Virchow’s cell theory was introduced into neuroscience by Cajal, whose ‘neuron doctrine’ postulated that the structural unit of the nervous system was the individual neuron2. Since then, generations of investigators have described hundreds of cell types in nervous systems of different species. This effort has been particularly arduous in the cerebral cortex (or neocortex), the largest part of the brain in mammals and the primary site of higher cognitive functions. The mammalian neocortex has a thin layered structure, composed of mixtures of excitatory and inhibitory neurons arranged in circuits of a forbidding complexity, called “impenetrable jungles” by Cajal3. This basic structure is very similar in different cortical areas and in different species, which has given rise to the possibility that there is a ‘canonical’ cortical microcircuit4–7, replicated during evolution, which underlies all cortical function.

After more than a hundred years of sustained progress, it is clear that neocortical neurons and glial cells, like cells in any tissue, belong to many distinct types. Different cell types likely play discrete roles in cortical function and computation, making it important to characterize and describe them accurately and in their absolute and relative numbers. Towering historical figures like Cajal, Lorente de Nó and Szentágothai, among others, proposed classifications of cortical cells based on their morphologies as visualized with histological stains4,8,9 (Fig. 1a–c). These anatomical classifications described several dozen types of pyramidal neurons, short-axon cells and glial cells, and they were subsequently complemented by morphological accounts of additional cortical cell types by many researchers10–12, but without arriving at a clear consensus as to the number or even the definition of a cortical cell type.

Over the last few decades, the introduction of new morphological, ultrastructural, immunohistochemical and electrophysiological methods, new molecular markers, and a growing appreciation of the developmental origins of distinct neuronal subtypes (Fig. 1d–h), have provided increasingly finer phenotypic measurements of cortical cells and enabled new efforts to classify them more quantitatively, using supervised or unsupervised methods such as cluster analysis13–16. A community effort to classify neocortical inhibitory cells was attempted at the 2005 Petilla Convention, held in Cajal’s hometown in Spain, and led to a common standardized terminology describing the anatomical, physiological and molecular features of neocortical interneurons17. While useful, this fell short of providing a classification and working framework that investigators could incorporate into their research. One reason this early effort failed was that the datasets for phenotypically characterizing cortical neurons were small. Indeed, many of the early studies are based on characterizing dozens or at most hundreds of neurons, small samples from the nearly 20 billion in human neocortex18.

An outcome of the Petilla Convention was the realization that there was not yet a single method that captured the inherently multimodal nature of cell phenotypes and could serve as a standard for classification. While most researchers accepted the existence of cell types that could be measured and defined independently by different methods, there was no agreement as to which would form an optimal basis for classification. In principle, many criteria can be used, including (i) anatomical or connectivity-based features19,20, (ii) parametrization of intrinsic electrophysiological properties21, (iii) combination of structural and physiological criteria22,23, (iv) molecular markers14,24,25, (v) developmental origins26,27, (vi) epigenetic attractor states28 or (vii) evolutionary approaches identifying homology across species29,30. Ideally, these classifications should converge and agree, or at least substantially overlap. Indeed, there is substantial concordance among categories based on anatomical, molecular and physiological criteria13,22,31–34, but it has not been easy to combine these approaches into a unified taxonomy. There are substantial differences between researchers in assigning neurons to particular types in the literature19, and even experts often disagree on what constitutes ground truth. For example, while most publications agree on what a chandelier cell is, the concept of basket cells, a major subtype of inhibitory neuron, is much less clear19.

This uncertainty is explained and exacerbated by technical challenges: conventional approaches have been laborious, low-throughput, frequently non-quantitative and generally plagued by an inability to sample cells in standardized and systematic ways. Thus, setting aside debates about the importance of various criteria and the nature or even existence of discrete cell types, it is not surprising that the cell-type problem has remained challenging.

Transcriptomics: a new framework for classifying cortical cell types

Recent advances in high-throughput single-cell transcriptomics (scRNAseq) have changed the paradigm of cellular classification, offering a new quantitative genetic framework35–40. These approaches measure the expression profiles of thousands of genes from individual cells in large numbers, at relatively high speed and low cost. Related methods in epigenomics can identify sites of methylation and putative gene transcriptional regulation, essential to cell function and state. These new methods are an outcome of the methodological, conceptual and economic revolution created by the Human Genome Project41 and have flourished with support from the BRAIN Initiative42,43. With genomes in hand, it is now feasible to generate entire transcriptomes (which include the sequence and structure of transcripts) from tissues and to scale these methods for amplifying RNA in single cells. Initially limited to only a few hundred cells per experiment, effective new methods have emerged for profiling thousands of cells or nuclei at a time44–48. With simultaneous computational advances for analyzing large sequence-based data49,50, it is now possible to systematically classify and characterize the diversity of neural cells in any tissue, including the neocortex (Fig. 2).

Conceptually, as much as the genome is the internal genetic description for each species, the transcriptome, as the complete set of genes being expressed, provides an internal code that can describe each cell within an organism in a spatiotemporal context. Practically, the scale of scRNAseq promises near-saturating analysis of complex cellular brain regions like the neocortex, providing, for the first time, a comprehensive and quantitative description of cellular diversity and the prospect of simplifying tissue cell composition to a finite number of cell types and states defined by statistical clustering. Importantly, however, these transcriptionally defined clusters represent a probabilistic description of cell types in a high-dimensional landscape of gene expression across all cells in a tissue, rather than a definition based on a small set of necessary and sufficient cellular markers or other features (see below).

The scale, precision and information content of these current methods now far outpace other classical methods of cellular phenotyping in neuroscience and have the potential to approach the complete, accurate and permanent (CAP) criteria cited by Brenner as the gold standard in biological science51. Indeed, major efforts now aim to generate a complete description of cell types based on molecular criteria across the neocortex (Allen Institute for Brain Science36,40), the whole brain (the National Institutes of Health (NIH) BRAIN Initiative Cell Census Network52) and even the whole body (the Human Cell Atlas53). Also, as the Human Genome Project offered a means for comparative analysis of orthologous genes across species, these efforts could define all or most cell types and states in humans and model organisms, with the possibility of extending them to a variety of species to understand the evolution of cell-type diversity. These large investments have the potential for a transformative effect on neuroscience, which will be accelerated by a formalization of a molecular classification and its adoption by the community. They also hold promise for the development of methods for querying circuit function by providing tools for the targeting and manipulation of particular subtypes.

Transcriptomic classification offers the following advantages as a framework


[Figure 1 image. Panel titles: a, Stereotypy of neuron morphology; b, Stereotypy of glial morphology; c, Stereotyped connectivity motifs; d, Interneuron subclass molecular markers; e, Molecular marker/physiology correspondence; f, Classification by physiological features; g, Promiscuous phenotype relationships; h, Systematic morpho-electric neuron classification. Abbreviations in h: DAC, descending axon cell; NGC-DA, neurogliaform cell with dense axonal arborization; NGC-SA, neurogliaform cell with slender axonal arborization; HAC, horizontal axon cell; LAC, large axon cell; SAC, small axon cell; MC, Martinotti cell; BTC, bitufted cell; DBC, double bouquet cell; BP, bipolar cell; NGC, neurogliaform cell; LBC, large basket cell; NBC, nest basket cell; SBC, small basket cell; ChC, chandelier cell.]

Fig. 1 | Non-transcriptomics cortical cell-type classifications. a,b, Morphological characterization and classification of neurons (a) and glial cells (b) by Ramón y Cajal (1904)4. c, Diagram showing the connections of different types of interneurons with pyramidal cells. Adapted from Szentágothai (1975)9. d, Definition of GABAergic interneuron classes based on non-overlapping and combinatorial marker gene expression. e, Correlation of firing properties with class markers. f, Cortical cell type classification based on intrinsic firing properties (Petilla convention). g, Complex relationships between cellular morphology, marker-gene expression and intrinsic firing properties based on multimodal analysis. h, Comprehensive morphological and physiological classifications of cortical cell types. Images in a,b reprinted with permission from ref. 4, Cajal Institute; in c, adapted with permission from ref. 9, Elsevier; in d, adapted with permission from ref. 25, Oxford Univ. Press; in e, adapted with permission from ref. 14, Society for Neuroscience; in f and g, adapted with permission from refs. 17,21, respectively, Springer Nature; in h, adapted with permission from ref. 23, Cell Press.

for bounding the problem of cellular diversity53–56:

1. High-throughput transcriptomics is very effective at allowing a systematic, comprehensive analysis of cellular diversity in complex tissues. Its quantitative and high-throughput nature enables the adoption of rigorous definitions and criteria using datasets from tens of thousands to millions of cells.
2. The genes expressed by a cell during its development and maturity ultimately underlie its structure and function, and so the transcriptome offers predictive power based on interpreting gene function. Other cellular phenotypes, including morphology, are in part encoded by genes, rather than completely independent defining criteria57.
3. A molecular definition of cell types allows the identification of cell-type markers and the creation of genetic tools to target, label and manipulate specific cell types58,59, thereby providing the means to standardize datasets obtained by different researchers.
4. Transcriptomic data can also provide information about human diseases, by allowing a potential linkage between genes associated with disease and their cellular locus of action. By combining with genome-wide association studies (GWAS) that identify genes causally involved in the pathophysiology of a disease, cell-type transcriptomics-based data might lead to identification of mechanistically unresolved diseases as detected changes in expression levels of genes from key cell types60.
5. Expression profiles allow quantitative comparison of cell types across evolutionary or developmental times, enabling the alignment of cell types across species (based on conserved expression of homologous genes)61 and developmental stages (based on gradual developmental trajectories)62–64.
6. Transcriptomics also enables comparing cell types across organs, as different organs use similar genes. Thus, it could be used to classify all the cells in the body with a single method and framework53.

Indeed, initial transcriptomic studies of cortical tissue are already providing many biological insights. For example, scRNAseq analysis of mouse and human cortex identified a complex but finite set of ~100 molecularly defined cell types per cortical region that generally agree with prior literature on cytoarchitectural organization, developmental origins, functional properties and long-range projections65. Moreover, the hierarchical (agglomerative) taxonomy of transcriptomic cell types66, based on relative similarity between clusters, reflects these organizational principles. Viewed as a tree or dendrogram, the initial branches reflect major classes (neuronal vs non-neuronal; excitatory vs inhibitory), with finer splits reflecting more subtle variants of each class that reflect different developmental programs; for example, neocortical neurons are split into excitatory glutamatergic vs inhibitory GABAergic classes reflecting their different developmental origins in embryonic pallium vs subpallial proliferative regions, while the next splits in the GABAergic branch contain neurons generated by medial and caudal subdivisions of the ganglionic eminence and the preoptic area (Fig. 2a). These transcriptomic divisions are consistent with a long literature on cell fate specification of different GABAergic classes and the transcription factors involved in that process62–64,67 (Fig. 2b). Transcriptomics also allows quantitative analysis of developmental trajectories involved in this specification and maturation62–64 (Fig. 2c). Genes that differentiate neuronal classes are enriched for those involved in neuronal connectivity and synaptic communication, indicating they are predictive of selective cellular and circuit function37 (Fig. 2d). Finally, the same major transcriptomic classes of cortical GABAergic neurons are found in mammals and reptiles68 (Fig. 2e), suggesting deep conservation of cellular architecture and underlying mechanisms of molecular specification.

Correspondence of cell-type classifications across modalities

Proposing a transcriptomic-based classification for a field traditionally centered on cellular anatomy, physiology and synaptic connectivity is challenging unless such a classification correlates strongly with those features. Recent work in the retina is promising in this regard, where a large body of work has established a highly diverse set of anatomically, physiologically and functionally discrete cell types69 and where transcriptomic clusters strongly correlate with this prior knowledge35,69,70. For example, for mouse bipolar cells, a class comprising 15 types of excitatory interneurons, there is essentially perfect correspondence between types defined by scRNAseq, high-throughput optical imaging of electrical activity, and serial section electron microscopy35. The spinal cord provides another good example of correspondence between scRNAseq and other cellular characteristics, including developmental origins and connectivity profiles71,72. Similarly, scRNAseq of mammalian identifies neuronal cell types that were already described by anatomy and electrophysiology73,74.

Strong evidence for cross-modal correspondence in neocortical cell types is accumulating as well. An early application of cluster analysis of mouse layer 5 neurons showed correspondence between synaptic connectivity, morphology and even laminar position13. Almost perfect correlations were seen between major interneuron subclasses for molecular markers, axonal morphology and kinetics of synaptic inputs31 (Fig. 3a). Within somatostatin-positive interneurons, morphological and electrophysiological subgroups were correlated22. Other more specific neuron types show concordance between scRNAseq, physiology and morphology, such as the ‘rosehip’ cell, a layer 1 inhibitory neuron type in human cortex75 (Fig. 3b). Similarly, strong correspondence between scRNAseq, electrophysiology and morphology was shown for mouse layer 1 neurogliaform and single bouquet neurons, using the patch-seq technique, which combines patch-clamp physiology and scRNAseq76 (Fig. 3c). Finally, RNA-seq analysis of retrogradely labeled neurons in mouse primary visual cortex shows distinctive projections of transcriptionally defined excitatory subclasses40 (Fig. 3d).
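One way to make the idea of cross-modal correspondence concrete is to cross-tabulate two labelings of the same cells. The sketch below is only an illustration under stated assumptions: each cell is assumed to carry one transcriptomic label and one electrophysiological label, the labels and counts are invented toy data (not results from the cited studies), and the function names `contingency` and `purity` are hypothetical. Purity is one simple agreement score among many; the article does not prescribe a metric.

```python
from collections import Counter

def contingency(labels_a, labels_b):
    """Count cells for every (label in A, label in B) pair."""
    return Counter(zip(labels_a, labels_b))

def purity(labels_a, labels_b):
    """Fraction of cells whose B label is the most common B label of their A type.

    1.0 means every A type maps onto a single B type; lower values mean
    that some A types are split across several B types.
    """
    table = contingency(labels_a, labels_b)
    best_per_a = {}
    for (a, _b), n in table.items():
        best_per_a[a] = max(best_per_a.get(a, 0), n)
    return sum(best_per_a.values()) / len(labels_a)

# Hypothetical toy data: eight cells, each labeled in two modalities.
transcriptomic = ["Pvalb", "Pvalb", "Pvalb", "Sst", "Sst", "Vip", "Vip", "Vip"]
electrophysiology = [
    "fast-spiking", "fast-spiking", "fast-spiking",
    "adapting", "adapting",
    "irregular", "irregular", "adapting",
]

table = contingency(transcriptomic, electrophysiology)
score = purity(transcriptomic, electrophysiology)

for (t_type, e_type), n in sorted(table.items()):
    print(f"{t_type:6s} {e_type:13s} {n}")
print(f"purity = {score:.3f}")
```

In this toy example one of the three Vip cells carries a different electrophysiological label, so the score falls below 1. A real analysis would also need to account for sampling depth and for how finely each modality is split into types.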

Fig. 2 | Transcriptomics classifications of cortical cell types. a, Single-cell transcriptome analysis reveals a molecular diversity of mouse cell types, with relatively invariant interneuron and non-neuronal types across cortical areas but significant variation in excitatory neurons. b, Major interneuron classes are specified by distinct transcription factor codes. c, Single-cell transcriptomics of mouse GABAergic interneuron development demonstrates gradual changes in gene expression underlying developmental maturation and fate bifurcations as cells become postmitotic. d, Gene families shaping cardinal GABAergic neuron type include neuronal connectivity, ligand receptors, electrical signaling, intracellular signal transduction, synaptic transmission and gene transcription. These gene families assemble membrane-proximal molecular machines that customize input–output connectivity and properties in different GABAergic types. e, Single-cell transcriptomics allows cross-species comparisons and shows conservation of major cell classes from reptiles to mammals, with conserved transcription factors but some species-specific effectors (turtle data). TF, transcription factor. Images in a and c adapted with permission from refs. 40,63, respectively, Springer Nature; in b, adapted with permission from ref. 27, Elsevier; in d, adapted with permission from ref. 37, Cell Press; in e, adapted with permission from refs. 30,68, Elsevier and AAAS, respectively.


Experimental tools are increasingly available to aid in phenotypic characterization of transcriptionally defined cell types in model animals and even human, such as specific Cre lines and viruses, as well as novel spatial transcriptomics methods54,77. While major consortium efforts will generate the transcriptomic framework, linking different types of data to it will likely be most effective as a distributed community effort.

Challenges for cortical cell type classification

Although strong cross-modal correspondence has been observed at the major subclass level, such correspondence

[Figure 2 image. Panel titles: a, Transcriptomic cell type classification; b, Developmental TF codes; c, Developmental trajectories; d, Molecular determinants of cell type function; e, Evolutionary conservation of cell types.]


[Figure 3 image. Panel titles: a, Transgenic interneuron subclass correspondence; b, Multimodal cell type correspondence; c, Phenotypic correspondence by patch-seq; d, Transcriptome/connectivity relationships.]

Fig. 3 | Correspondence across phenotypes of cortical neuron types. a, Quantitative morphological clustering and electrophysiological feature variation between major inhibitory neuron classes using transgenic mouse lines (modified from Figs. 1 and 2 from ref. 31). b, Convergent physiological, anatomical and transcriptomic evidence for a distinctive rosehip layer 1 inhibitory neuron type in human cortex that differs from neighboring neurogliaform cells. c, Morphological and physiological differences between layer 1 neurogliaform and single bouquet neurons shown by patch-seq analysis. Scale bars as in b. d, RNA-seq analysis of retrogradely labeled neurons in mouse primary visual cortex shows distinctive projections of excitatory subclasses, but overlapping projections for finer transcriptomic cell types. Images in a adapted with permission from ref. 31, Oxford Univ. Press; in b–d, adapted with permission from refs. 75,76 and 40, respectively, Springer Nature.

at the more refined branches of the transcriptomic classification remains largely to be validated. One example is the already mentioned RNA-seq study of retrogradely labeled neurons in mouse primary visual cortex40. Despite distinct projection targets at the major branches of the transcriptomic taxonomy, there were overlapping projections for finer transcriptomic cell

types (Fig. 3d). One possible explanation is that long-range connectivity patterns are set up early in development and may not be strongly reflected in adult gene expression. However, such mismatches do not negate the value of a core transcriptomic classification as described above. Rather, this information about developmental trajectories needs to be incorporated into the transcriptomic cell type classification28.

Another challenge to transcriptomic classifications (and, in fact, to any classification of cell types) is the presence of phenotypic variation within a given cell type. One facet of this is the possibility of variation in gene expression due to cell state, differentiation and other dynamic processes within a single cell type. Some studies have suggested that cell types are possibly not defined, discrete entities and may be better described as components of a complex landscape of possible states78–80, and, indeed, some of that heterogeneity can be mapped with omics data81. Some continuous variation could be functionally relevant. For example, basal dendritic lengths and morphological complexity of layer 2/3 pyramidal cells appear to vary smoothly across a rostrocaudal axis in mouse cortex82 (Fig. 4a). Further evidence for spatial gradients can be found in the graded transcriptomic variation across the human cortex83, perhaps reflecting the expression of transcription factor gradients in the ventricular zone during development (Fig. 4b). These phenotypic or spatial gradients create challenges for thresholding in clustering, and they fuel debates between lumpers and splitters in determining the right level of granularity in defining cell types.

A particular advantage of a transcriptomic classification is that it provides a direct avenue for quantitative comparative analysis by aligning cell types across species based on shared gene covariation, enabling an ‘Ur-classification’ as a common denominator of basic cell types. For example, a recent study of human cortex61 demonstrated that the overall cellular organization of the human cortex is highly conserved with that of the mouse, allowing identification of homologous cell types (Fig. 4c). However, this study also revealed a challenge for the future, in that, in many cases, it was not possible to align cell types across species at the finest levels of granularity but rather at a higher level in the hierarchical taxonomy. Furthermore, many differences were seen in homologous types, including their proportions, laminar distributions, gene expression and morphology. Finally, prominent differences were found in non-neuronal cells as well, including diversity and divergent molecular phenotypes between mouse and human that correlate with known morphological specializations in primate astrocytes36,74,84. Such similarities and differences between cell types across species, as well as challenges created by graded or developmental variations in features, could also be better captured by a probabilistically defined and hierarchically organized cell-type taxonomy.

A probabilistic and hierarchical definition of cortical cell types

Examining the current transcriptomic evidence, in some cases we find highly distinct cell types based on robust similarities of the transcriptome and other measurable cell attributes, as exemplified by the phenotypic homogeneity of neocortical chandelier cells40,85–87 or the above-mentioned rosehip cells. On the other hand, the existence of cell states, spatial gradients of phenotypes and mixtures of differences and similarities in cross-species comparisons present challenges to a discrete and categorical perspective on defining cell types. Prematurely adopting an inflexible definition of types will obscure the significance of observed phenotypic variability and its biological interpretation. Rather, a plausible way forward is to employ a practical or operational quantitative definition of a cell type.

Cluster analysis has been used to classify cortical neurons according to their structural or physiological phenotypes or expression of molecular markers13,14,22,31,82,87–90 and, more recently, transcriptomics36,40,91,92. Many unsupervised and supervised methods can be used, including multilayer perceptrons16, logistic regression16, k-nearest neighbors16, affinity propagation93, Bayesian classifiers34, naïve Bayes16, topic modelling94, t-distributed stochastic neighbor embedding (t-SNE)95,96, graph theory97 and autoencoders98. These methods, building on the existence of statistically defined groups or clusters over a set of measurable attributes, naturally lead to an evidence-based probabilistic definition of cell types.

A probabilistic definition of cell types is particularly applicable to transcriptomics, where the dimension of the underlying space is large, the variance comparatively high and competing approaches give similar results. However, one requires community consensus on a rigorous statistical definition of transcriptomic types and the description of intra- and inter-type variability. Ideally, this quantitative definition of a cell type would be independent of the statistical method used (i.e., robust to different methods) and would include a description of quantitative metrics such as resolution, complexity, variability, uniqueness and association of variables with other attributes. There are two approaches to find and test cluster validity. One is ‘hard’ clustering, with clearly defined borders between clusters and with each cell strictly assigned to a particular type. Alternatively, in ‘soft’ (or ‘fuzzy’) clustering, any given cell has a particular probability of belonging to a particular cluster. Despite the probabilistic nature, inter- and intra-cluster distance may still be defined for outcome validation. Ultimately, the consensus description of cell types may form a continuum, beginning with hard and ending with soft distinctions among cell types, with an ambiguous transition between these extremes.

One natural approach to represent a transcriptomic taxonomy is to adopt a hierarchical framework. Cluster analysis is well suited to this, as its connectivity-based methods generate a tree-like representation of clusters99. This approach follows the historical tradition of using cladistics to classify organisms, assuming common ancestors in their evolution and synapomorphies (shared derived traits) among related clades. While statistical clusters do not presume any hierarchy in the structure of the data, biological systems have a temporal evolution as one of their essential features, which makes temporally based hierarchies natural100. The evolutionary or developmental history of a neural circuit implies earlier stages, which are often less specialized and represent common ancestors of later states101. Indeed, a hierarchical organization of existing transcriptomic cell type data appears to mirror developmental principles and spatiotemporal organization in the neocortex (see above). Another advantage of casting the cell type classification as a cladistic one is that the lumping–splitting tension maps itself naturally as a distinction between different levels of the hierarchical tree, since one can split a group into subgroups at a lower level of the hierarchy to reflect data obtained in different physiological or developmental conditions. This provides an effective and objective framework to quantitatively evaluate lumper-vs-splitter discussions.

But hierarchical transcriptomic relationships may not be easily represented as a simple tree-like structure. Rather, they may have complex inclusion–exclusion and class relationships and may be more amenable to graph-based or other set-theoretic constructions. Indeed, the space of the transcriptomes for cortical cell types could be visualized as a complex, high-dimensional landscape with isolated

1462 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment

[Figure 4, panels a–c. a, Graded areal variation in morphology: total dendritic nodes (number of nodes) and basal dendritic field area (×10^3 μm^2) by cortical region. b, Graded areal variation in transcriptome: neocortical spatial topography versus neocortical genetic topography across human cortical regions. c, Cross-species transcriptome alignment: human MTG clusters versus mouse V1 clusters, with cluster overlap and mouse morphology classes.]

Fig. 4 | Challenges for transcriptomic classification. a, Gradients in morphological size and complexity across the rostrocaudal extent of the cortex. b, Graded transcriptomic variation across the human cortex encodes rostrocaudal position on the cortical sheet. c, Transcriptomic cell types can be aligned across species based on shared molecular specification, but often at a lower level of resolution than the finest types observed in a given species. Images in a adapted with permission from ref. 82, Oxford Univ. Press; in b and c, adapted with permission from refs. 83 and 61, respectively, Springer Nature.
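The distinction drawn above between ‘hard’ and ‘soft’ clustering can be made concrete with a small sketch. The example below is only an illustration under stated assumptions, not the method of any cited study: it assumes two hypothetical cell types, each summarized by an axis-aligned Gaussian over the log-expression of two invented marker genes, and applies Bayes' rule to obtain membership probabilities. The names `CELL_TYPES`, `soft_assign`, `hard_assign` and `classify`, all numerical values and the 0.9 threshold used to flag ambiguous cells are assumptions introduced here.

```python
import math

# Hypothetical cell types: mean and standard deviation of log-expression for
# two invented marker genes, plus a prior probability. All values are made up.
CELL_TYPES = {
    "type_A": {"mean": (5.0, 1.0), "sd": (1.0, 1.0), "prior": 0.5},
    "type_B": {"mean": (1.0, 5.0), "sd": (1.0, 1.0), "prior": 0.5},
}


def log_gaussian(x, mean, sd):
    """Log density of an axis-aligned (diagonal-covariance) Gaussian."""
    return sum(
        -0.5 * math.log(2 * math.pi * s * s) - (xi - m) ** 2 / (2 * s * s)
        for xi, m, s in zip(x, mean, sd)
    )


def soft_assign(x, cell_types=CELL_TYPES):
    """Soft clustering: posterior probability of each type given cell x."""
    log_post = {
        name: math.log(p["prior"]) + log_gaussian(x, p["mean"], p["sd"])
        for name, p in cell_types.items()
    }
    top = max(log_post.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(v - top) for name, v in log_post.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def hard_assign(x, cell_types=CELL_TYPES):
    """Hard clustering: keep only the most probable type."""
    probs = soft_assign(x, cell_types)
    return max(probs, key=probs.get)


def classify(x, threshold=0.9, cell_types=CELL_TYPES):
    """Label a cell 'core' if its best posterior reaches the (assumed)
    threshold, otherwise 'intermediate'."""
    probs = soft_assign(x, cell_types)
    best = max(probs, key=probs.get)
    status = "core" if probs[best] >= threshold else "intermediate"
    return best, status


if __name__ == "__main__":
    for cell in [(5.0, 1.0), (3.2, 3.0)]:
        print(cell, soft_assign(cell), classify(cell))
```

A cell lying close to one type's mean receives a posterior near 1 and is labelled the same way by both approaches, whereas a cell between the two means keeps a graded membership that a hard assignment would discard. In practice the per-type distributions would be estimated from data and would span thousands of genes rather than two.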

peaks of expression for a given cell type but also valleys and gradients between more weakly defined classes, which could be described alternatively as types or states. Such complexity can be described using, for example, the concept of cell-type attractors28, or using the distinction between core and intermediate cells40 or the description of a cell type as a continuous trajectory in transcriptomic space102. A robust statistical framework that enables a quantitative definition of cell type (or tendency to be a type) is clearly needed.

A final, and key, question is how to ensure that any given classification or taxonomy is valid. The goal is not to define a classification system per se, but to create a comprehensive description of cellular diversity in the neocortex. One needs to ensure that the experimental method will indeed capture all of the cell types present, that the classification is complete and that the types are defined correctly. For any classification to be valid, it is critical to

ensure accuracy and correctness. First, it is imperative to seek internal statistical robustness for identified clusters, using different statistical methods22,103. Second, external validation with orthogonal datasets is critical. Multimodal datasets are particularly important in this regard, as they enable cross-comparisons between classifications based on different types of data. For example, molecular, physiological or anatomical22,31, patch-seq76 or spatial transcriptomics methods54 (Fig. 3a–c) can enable this, defining functionally relevant levels of granularity. Finally, a probabilistic definition, particularly with a Bayesian framework, can be tested by generatively building computational models of each cell type and comparing them with the real data, thus providing some performance metrics on the algorithms. Using these criteria, robustness, reproducibility and predictive power can be measured and different approaches compared, as is normally done in machine learning16.

A unified ontology and nomenclature of cortical cell types

To truly gain community adoption, the data-driven transcriptomic classification of cortical cell types requires a formal unified cell type classification, a taxonomy and a nomenclature system17,20,90 whose principles are generalizable to other systems. Names are important: as an old Basque proverb states, ‘izena duen guzia omen da’ or ‘that which has a name exists’, and a similar Chinese one says ‘the beginning of wisdom is to call things by their right names’. This classification should aim to be a consensus one that incorporates the richness of data accumulated by different groups and be presented in a curated output that is public, easily accessible and has revisions managed by a curation committee of experts. Creation of such an ontology is a serious project in data organization that can build on prior efforts in cell ontologies104–106, as well as best practices established by the ontology development community107 (see Open Biomedical Ontology Foundry, http://www.obofoundry.org).

A true, data-driven transcriptomic taxonomy poses a series of challenges that have not yet been taken on by the cell ontology community, but that are surmountable. One challenge is that transcriptomically similar cell types can exist in multiple anatomical locations. Thus, transcriptomic types need to be related to proper levels of the anatomical structure. Prominent gradients across cortical areas pose another challenge to define in a taxonomy. While any given cortical region contains some number of transcriptomic types, it seems likely that many of these types will vary in a somewhat continuous fashion across cortical areas and possibly also across species (Fig. 4a,b). Likewise, the classification system should also have a temporal component to capture the developmental trajectory from progenitor cell division to a terminally differentiated state. Cells can be quantitatively defined by their position on that developmental or spatial gradient. Finally, aligning across species is quantitatively possible now, but this alignment may only be possible at different levels of granularity with increasing evolutionary distance. The benefits of creating a unified reference ontology across these biological axes will be large, but it will be a serious community effort to design a system that can accommodate them.

Following the genetic classification paradigm proposed here, there are many lessons to learn from genomics. For example, the reference classification could be iteratively updated and refined with subsequent accumulation of data108, like genome builds, which changed in the early years but have become increasingly stable. As in current gene nomenclature, an official symbol with multiple aliases can link cell types to commonly used terminology relating to cellular anatomy or other phenotypes. This nomenclature should be portable across species, with orthologous cell types having common names, much as current gene symbols refer to orthologous genes. For the cell type classification to be as useful as the genome has been, computational tools conceptually similar to BLAST alignment tools109 for mapping sequence data need to be developed to allow researchers to quantitatively map their data to this reference classification. Finally, continuing the analogy with genomics, just as there are different versions of genome builds for different purposes (for example, with more or less manual curation), one could consider different versions of cortical cell type taxonomies, with varying levels of splitting or lumping; spatial, temporal or evolutionary criteria; or even some manually curated by experts, but under a unified framework of probabilistic definition of cell types.

Nomenclature also poses a challenge. Currently, the lack of standardized nomenclature makes it difficult to track and relate cell types across different studies. One natural idea with a genetically based paradigm is to name cell types on the basis of the best defining genes for each cell type, as is currently commonly done36,61,110. However, the most specific genes are not always detected in every cell of a cluster, and often the genes that best define a cell type in one species are not conserved in other species. The traditional way of naming cell types is by their anatomical features (such as chandelier, double-bouquet, basket, Martinotti, pyramidal cells), and it would be desirable to incorporate these short and widely used names into a nomenclature when possible, to seek consistency with the vast literature on neocortical cell types. However, anatomical features, such as horsetail axons, may also vary across species17. Also, for newly identified cell types, anatomical information is often not available and naming them by marker genes will be more practical.

Adopting a more abstract nomenclature not based on anatomical features or individual marker genes could make it more flexible, more easily applicable across species and more compatible with other tissues outside the cortex or the brain. One idea for a cell-type nomenclature system is to build on , treating transcriptomic cell clusters as sequence data (partially implemented for Allen Institute datasets; https://portal.brain-map.org/explore/classes/nomenclature). Every cell cluster from a dataset or analysis would get a unique accession ID. Robust and reproducible clusters would have official cell type names or symbols, as well as any number of aliases that could represent different existing nomenclatures or historical names. In addition to cell types, higher-order classes (for example, caudal ganglionic eminence (CGE)-derived GABAergic interneurons, GABAergic interneurons, neurons) could be named as well, and both types and classes would be matched across species at the level (type, class) at which they can be aligned.

A cell-type knowledge graph for community data aggregation

Defining the cell types of the cortex (or other brain structures) serves as a foundation for aggregating information about their function. By analogy to the genome, the definition of genes has allowed a massive integration of information about their usage, function and disease relevance with a wide range of databases. On the other hand, probabilistically defined cell types are not the same as deterministically defined protein-coding regions of the genome, and we can expect that our understanding of cell types and their functional relevance will change as more information becomes available. A more flexible way to organize our knowledge and understanding of cell types would be as a living, updatable framework, one allowing reference, query and inference. An online-based data aggregation platform could also

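The nomenclature scheme outlined in the text (a unique accession ID per cluster, an official symbol, any number of aliases, and named higher-order classes) maps naturally onto a small graph in which nodes are types or classes and edges link each node to its parent class. The sketch below is a hypothetical illustration, not the Allen Institute implementation or any existing ontology tool: the `CellTypeGraph` class, the `CT0001`-style accession format and the two placeholder leaf types are assumptions made here, and only the three class names are taken from the text. It uses the lowest common ancestor as a simplified stand-in for "the level at which types can be aligned", and counts parent-child edges as a crude graph-theoretic distance.

```python
from collections import deque


class CellTypeGraph:
    """Toy knowledge graph: nodes are cell types or classes, edges are
    child-to-parent ('is a') links. Real edges could also carry statistics."""

    def __init__(self):
        self.symbol = {}       # accession ID -> official symbol
        self.parent = {}       # accession ID -> parent accession ID (None at root)
        self.alias_to_id = {}  # lower-cased symbol or alias -> accession ID

    def add(self, accession, symbol, parent=None, aliases=()):
        if accession in self.symbol:
            raise ValueError(f"duplicate accession ID: {accession}")
        if parent is not None and parent not in self.symbol:
            raise KeyError(f"unknown parent: {parent}")
        self.symbol[accession] = symbol
        self.parent[accession] = parent
        for name in (symbol, *aliases):
            self.alias_to_id[name.lower()] = accession

    def resolve(self, name):
        """Map an accession ID, official symbol or alias to the accession ID."""
        if name in self.symbol:
            return name
        return self.alias_to_id[name.lower()]

    def lineage(self, name):
        """Accession IDs from the node up to the root of the hierarchy."""
        node, path = self.resolve(name), []
        while node is not None:
            path.append(node)
            node = self.parent[node]
        return path

    def alignment_level(self, a, b):
        """Most specific class containing both nodes (lowest common ancestor)."""
        ancestors_of_b = set(self.lineage(b))
        for node in self.lineage(a):
            if node in ancestors_of_b:
                return node
        return None

    def distance(self, a, b):
        """Number of edges on the shortest path between two nodes (BFS)."""
        start, goal = self.resolve(a), self.resolve(b)
        neighbours = {acc: set() for acc in self.symbol}
        for child, par in self.parent.items():
            if par is not None:
                neighbours[child].add(par)
                neighbours[par].add(child)
        queue, seen = deque([(start, 0)]), {start}
        while queue:
            node, steps = queue.popleft()
            if node == goal:
                return steps
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
        return None


# Class names follow the example in the text; accession IDs, aliases and the
# two leaf types are placeholders invented for this sketch.
graph = CellTypeGraph()
graph.add("CT0001", "neurons")
graph.add("CT0002", "GABAergic interneurons", parent="CT0001")
graph.add("CT0003", "CGE-derived GABAergic interneurons", parent="CT0002",
          aliases=("CGE interneurons",))
graph.add("CT0004", "mouse example type", parent="CT0003", aliases=("M1",))
graph.add("CT0005", "human example type", parent="CT0003", aliases=("H1",))
```

With this toy graph, `graph.resolve("cge interneurons")` returns the accession ID of the class regardless of which name a study used, and `graph.alignment_level("M1", "H1")` returns the class under which the two species-specific placeholder types meet. A production system would need versioning, provenance and probabilistic edge weights, none of which are modelled here.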

[Figure 5, panels a–d. a, scRNA-seq, spatial anatomy and supporting modalities. b, Probabilistic cell type definition (core and intermediate cells; stable assignment). c, Transcriptome-based taxonomy (glutamatergic, GABAergic and non-neuronal clusters by dissected area and layer). d, Cell type knowledge graph (multimodal assignment of electrophysiology and morphology; semantic integration, linked data, structured data, annotation and ontology; prediction and inference).]

Fig. 5 | Transcriptome-based taxonomy, probabilistic cell types, and cell-type knowledge graphs. a, A transcriptome-based cell-type taxonomy is constructed from scRNA-seq data, related epigenomic datasets and neuroanatomy. b, Cell types are initially defined based on transcriptomic signatures in a probabilistic manner with multiresolution clustering and statistical analysis to identify robustness and variability. c, Reproducible gene expression patterns identify hierarchies of putative cell types that are subject to further analyses and validation. d, Transcriptomic cell-type taxonomies form a basis for constructing cell-type knowledge graphs that summarize the present state of definable cell types. Multimodal assignment of data, such as morphology, electrophysiology and connectivity, is associated and reported with statistical variability over assigned types. A knowledge graph contains relevant and essential supporting information, such as supporting data for further analysis and mapping, descriptive annotation and ontology, and literature citations.

have a significant sociological impact in neuroscience by encouraging collaborative participation.

One example of an appropriate data structure for such a community platform is a ‘knowledge graph’, a widely used tool in the tech industry and computer science as a platform for data aggregation (Fig. 5). A knowledge graph is a relational data structure in which nodes represent entities (such as cell types and their attributes) and the links, or edges, between them represent their relational and statistical associations. There is a measurable graph-theoretic distance between nodes based on probable associations and known relationships. The cortical cell knowledge graph could be initialized with standardized transcriptomics data, after which other data modalities and related taxonomies could be readily mapped onto the graph to capture anatomical, electrophysiological, developmental and other cell properties. For example, important contributors to cell identity, determined by cellular interactions, splicing, local translation, protein phosphorylation, etc., may not be readily captured by scRNA-seq at present, but could be measured in future CAP datasets, which could then be added to the knowledge graph. In such a knowledge graph, there are two basic use-cases as new data become available. First, one can use it to identify known cell types and their properties in new datasets. With a probabilistic or Bayesian definition, each new cell will be assigned a probability of belonging to a particular type in the graph. Second, the graph can be manually or automatically updated, following conventional optimization algorithms, as new data can change node identities and distances with respect to one another.

The proposed cell-type knowledge framework would represent a living and updatable resource that maintains an actively derived and flexible ontology of cortical cell types, benefitting from present active ontology efforts. This standardized database could be powered by open-source algorithms and managed and curated by database administrators. It would be a dynamic database with query capability, but would only accept peer-reviewed published data in a standardized fashion and nomenclature, providing a common denominator for the research in the field, integrating quantitative and qualitative cell-type classification, and allowing for updates, subject to review and validation. Computational engines would allow new data to be compared and allow users to query the current state of cell-type understanding from the perspective of their new data, assigning the most likely type to multi- or unimodal datasets based on similarities to the current framework’s knowledge. In addition to supporting literature reference, the dynamic framework might include online forums for scientific discussion and education. Ultimately, a cell-type community knowledge framework would be a dynamic and living resource that researchers, clinicians and educators could refer to as the benchmark resource for cell types in the cortex, promoting collaborative participation in the field.

Maintaining and updating the classification

The classification, nomenclature and associated knowledge graph could be

managed by a committee of experts representing the breadth of approaches and disciplines in the field. Such a committee would be charged with designing the statistical classification model to sustain a basic taxonomy; the type of open platform to use for the knowledge graph; the rules by which this taxonomy can be updated and revised; the quality control or peer-review criteria; and the metadata to be added. While the knowledge graph could continually update itself automatically as new data are imported, different curated versions of the graph might be released in regular updates. This committee, arising from expert volunteers, could also help with vetting of a unified nomenclature of cortical cells that is succinct, useful and informative, as well as methods by which community input would be incorporated in a fair and efficient fashion.

Potentially, such a committee might be established and supported through existing organizations or consortia with interest in cell type classification, such as the NIH BRAIN Initiative Cell Census Network (BICCN; https://www.biccn.org), the NeuroLex–International Neuroinformatics Coordinating Facility (INCF; http://130.229.26.15/news/activities/our-programs/pons/neurolex-wiki.html), the Neuroscience Information Framework (NIF; https://neuinfo.org), the Human Brain Project (HBP; https://www.humanbrainproject.eu/), the Human BioMolecular Atlas Program (HuBMAP; https://commonfund.nih.gov/hubmap) or the Human Cell Atlas (HCA; https://www.humancellatlas.org/). Some of these groups are already chartered with mapping the cell types of the nervous system or other organs in the body and may have resources to build the backend technological infrastructure needed for the knowledge graph.

Regardless of who supports and maintains this key infrastructure, it is critical that the efforts be managed through open communication with the community. A public consortium will be a logical organizational structure for channeling diverse inputs and will also adequately represent the wider community, reflecting cultural, geographic, ethnic and gender diversity. Strong community engagement will ensure wide acceptance and ensure that these standards are adopted widely, within and outside of the neocortex specialist field.

A community-based taxonomy and nomenclature of cortical cell types

To conclude, we think that the field of neocortical studies is ready for a synthetic, principled classification of cortical cell types, based on single-cell transcriptomic data and anchored on quantitative criteria that operationally define cell clusters based on their statistical and probabilistic grouping. Although molecularly driven initially, this taxonomy should be revised and modified as additional CAP datasets become available, becoming a true multimodal classification of cortical cell types. We view this core classification as potentially valid for all mammalian species and also as likely applicable to homologous structures in other vertebrates, as a broad framework to encapsulate evolutionary conservation with species specialization. Indeed, only with such a systematic approach to comparing cell types across species will it be possible to understand how cell type diversity evolved in the cerebral cortex.

This taxonomy will only be useful and successful if adopted by the community. So, in addition to the nomenclature, a series of research tools should be developed, ideally by a community consortium, to facilitate similar experimental access to these cell types by the broader range of investigators. We envision molecular and genetic tools, such as standard sets of antibodies and RNA probes to identify key molecular markers for each cell type, as well as cell or mouse lines that are used as resources for the entire community. Statistical tools to enable direct comparisons among datasets, and to enable mapping new datasets to reference datasets, are essential. An open informatics backbone needs to be developed as an essential part of the taxonomy, as well as visualization and analysis tools that take advantage of this taxonomy and allow scientists to explore the data, add to the knowledge base and achieve new knowledge.

In addition, we propose that the community input to support this taxonomy and enable its future revisions be channeled into an open platform, a knowledge graph, as is becoming increasingly common in community-led data science. Aggregation of knowledge through data graphs, now a common practice in the tech industry, will accelerate the dissemination of knowledge and could avoid the ‘publication graveyard’, where data are stored away in siloed journal articles disconnected from the rest of the field. Anchoring this taxonomy and knowledge graph, a unified new nomenclature of cortical cell types valid across species is needed to centralize efforts in the field, with a generalizable framework to integrate with other cell-type classifications. We view the establishment of a common nomenclature as an essential step to provide a standardized language that enables the meaningful aggregation and sharing of data.

If successful, this community-based classification effort, joined by a common nomenclature and nourished by the knowledge graph, could be extended and generalized to other parts of the brain or of the body. In this sense, the classification of neocortical cell types, a field with a long tradition and multidimensional approach to a central problem in neuroscience, could be an ideal test case to explore this novel organization of knowledge in neuroscience and, more generally, in biology.

Rafael Yuste1 ✉, Michael Hawrylycz2 ✉, Nadia Aalling3, Argel Aguilar-Valles4, Detlev Arendt5, Ruben Armananzas Arnedillo6, Giorgio A. Ascoli6, Concha Bielza7, Vahid Bokharaie8, Tobias Borgtoft Bergmann3, Irina Bystron9, Marco Capogna10, Yoonjeung Chang11, Ann Clemens12, Christiaan P. J. de Kock13, Javier DeFelipe14, Sandra Esmeralda Dos Santos15, Keagan Dunville16, Dirk Feldmeyer17, Richárd Fiáth18, Gordon James Fishell11, Angelica Foggetti19, Xuefan Gao20, Parviz Ghaderi21, Natalia A. Goriounova13, Onur Güntürkün22, Kenta Hagihara23, Vanessa Jane Hall3, Moritz Helmstaedter24, Suzana Herculano15, Markus M. Hilscher25, Hajime Hirase26, Jens Hjerling-Leffler25, Rebecca Hodge2, Josh Huang27, Rafiq Huda28, Konstantin Khodosevich3, Ole Kiehn3, Henner Koch29, Eric S. Kuebler30, Malte Kühnemund31, Pedro Larrañaga7, Boudewijn Lelieveldt32, Emma Louise Louth10, Jan H. Lui33, Huibert D. Mansvelder13, Oscar Marin34, Julio Martinez-Trujillo35, Homeira Moradi Chameh36, Alok Nath37, Maiken Nedergaard38, Pavel Němec39, Netanel Ofer40, Ulrich Gottfried Pfisterer3, Samuel Pontes1, William Redmond41, Jean Rossier42, Joshua R. Sanes11, Richard Scheuermann43, Esther Serrano-Saiz44, Jochen F. Steiger45, Peter Somogyi9, Gábor Tamás46, Andreas Savas Tolias47, Maria Antonietta Tosches1, Miguel Turrero García11, Hermany Munguba Vieira25, Christian Wozny48, Thomas V. Wuttke49, Liu Yong50, Juan Yuan25, Hongkui Zeng2 ✉ and Ed Lein2 ✉

1Columbia University, New York City, NY, USA. 2Allen Institute for Brain Science, Seattle, WA, USA. 3University of Copenhagen, Copenhagen, Denmark. 4Carleton University, Ottawa, Ontario, Canada. 5European Molecular Biology Laboratory Heidelberg, Heidelberg, Germany. 6George Mason University, Fairfax, VA, USA. 7Technical University of ,

1466 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience FOCUS | comment FOCUS | comment

Madrid, Spain. 8Max Planck Institute, Tübingen, 7. Mountcastle, V.B. Perceptual Neuroscience: Te 52. Ecker, J. R. et al. Te BRAIN Initiative Cell Germany. 9University of Oxford, Oxford, UK. Cerebral Cortex (Harvard Univ. Press, 1998). Census Consortium. Neuron 96, 542–557 (2017). 10 11 8. Lorente de Nó, R. Trab. Lab. Invest. Bio. 53. Regev, A. et al. eLife 6, e27041 (2017). Aarhus University, Aarhus, Denmark. Harvard (Madrid) 20, 41–78 (1922). 54. Lein, E., Borm, L. E. & Linnarsson, S. Science 12 Medical School, Cambridge, MA, USA. Te 9. Szentágothai, J. Brain Res. 95, 475–496 (1975). 358, 64–69 (2017). University of Edinburgh, Edinburgh, UK. 13Vrije 10. Peters, A. & Jones, E.G. Cerebral Cortex 55. Zeng, H. & Sanes, J. R. Nat. Rev. Neurosci. 18, Universiteit Amsterdam, Amsterdam, Netherlands. (Plenum, New York, 1984). 530–546 (2017). 14Cajal Institute, Madrid, Spain. 15Vanderbilt 11. Lund, J. S. Annu. Rev. Neurosci. 11, 56. Huang, Z. J. & Paul, A. Nat. Rev. Neurosci. 20, 16 253–288 (1988). 563–572 (2019). University, Nashville, TN, USA. Scuola Normale 12. Jones, E.G. & Diamond, I.T. (eds.). Te Barrel 57. Fu, M. & Zuo, Y. Trends Neurosci. 34, Superior, Pisa, Italy. 17JARA-Brain Institute of Cortex of Rodents (Plenum, 1995). 177–187 (2011). Neuroscience and Medicine, Julich, Germany. 13. Kozloski, J., Hamzei-Sichani, F. & Yuste, R. 58. Gerfen, C. R., Paletzki, R. & Heintz, N. Neuron 18Research Centre for Natural Sciences, Budapest, Science 293, 868–872 (2001). 80, 1368–1383 (2013). 14. Cauli, B. et al. J. Neurosci. 17, 3894–3906 (1997). 59. He, M. et al. Neuron 91, 1228–1243 (2016). Hungary. 19Christian-Albrechts-University Kiel, Kiel, 20 15. Tsiola, A., Hamzei-Sichani, F., Peterlin, Z. & 60. Roselli, C. et al. Nat. Genet. 50, 1225–1233 (2018). Germany. European Molecular Biology Laboratory, Yuste, R. J. Comp. Neurol. 461, 415–428 (2003). 61. Hodge, R. D. et al. Nature 573, 61–68 (2019). Hamburg, Germany. 21École Polytechnique Fédérale 16. Guerra, L. et al. Dev. 
Neurobiol. 71, 71–82 (2011). 62. Nowakowski, T. J. et al. Science 358, de Lausanne, Lausanne, Switzerland. 22Ruhr 17. Ascoli, G. A. et al. Nat. Rev. Neurosci. 9, 1318–1323 (2017). University Bochum, Bochum, Germany. 23Friedrich 557–568 (2008). 63. Mayer, C. et al. Nature 555, 457–462 (2018). 18. Pelvig, D. P., Pakkenberg, H., Stark, A. K. & 64. Mi, D. et al. Science 360, 81–85 (2018). Miescher Institute for Biological Research, Basel, Pakkenberg, B. Neurobiol. Aging 29, 65. Winnubst, J. et al. Cell 179, 268–281.e13 24 Switzerland. Max Planck Institute for Brain 1754–1762 (2008). (2019). Research, Frankfurt, Germany. 25Karolinska 19. DeFelipe, J. et al. Nat. Rev. Neurosci. 14, 66. Sugino, K. et al. Nat. Neurosci. 9, 99–107 (2006). Institutet, Stockholm, Sweden. 26RIKEN Center for 202–216 (2013). 67. Anderson, S. A., Eisenstat, D. D., Shi, L. & Brain Science, Saitama, Japan. 27Cold Spring Harbor 20. Shepherd, G. M. et al. Front. Neuroanat. 13, Rubenstein, J. L. Science 278, 474–476 (1997). 25 (2019). 68. Tosches, M. A. et al. Science 360, 881–888 (2018). Laboratory, Laurel Hollow, NY, USA. 28Massachusetts 21. Markram, H. et al. Nat. Rev. Neurosci. 5, 69. Peng, Y. R. et al. Cell 176, 1222–1237.e22 (2019). Institute of Technology, Cambridge, MA, USA. 793–807 (2004). 70. Martersteck, E. M. et al. Cell Rep. 18, 29RWTH Aachen University, Aachen, Germany. 22. McGarry, L. M. et al. Front. Neural Circuits 4, 2058–2072 (2017). 30Robarts Research Institute, London, Ontario, 12 (2010). 71. Häring, M. et al. Nat. Neurosci. 21, Canada. 31CARTANA, Stockholm, Sweden. 32Leiden 23. Markram, H. et al. Cell 163, 456–492 (2015). 869–880 (2018). 33 24. Butt, S. J. B. et al. Neuron 48, 591–604 (2005). 72. Rosenberg, A. B. et al. Science 360, University, Leiden, Netherlands. Stanford University, 25. Kawaguchi, Y. & Kubota, Y. Cereb. Cortex 7, 176–182 (2018). 34 Stanford, CA, USA. King’s College London, London, 476–486 (1997). 73. Harris, K. D. et al. PLoS Biol. 16, e2006387 (2018). UK. 
35University of Western Ontario, London, 26. Yuste, R. Neuron 48, 524–527 (2005). 74. Zeisel, A. et al. Science 347, 1138–1142 (2015). Ontario, Canada. 36Krembil Research Institute, 27. Kessaris, N., Magno, L., Rubin, A. N. & Oliveira, 75. Boldog, E. et al. Nat. Neurosci. 21, Toronto, Ontario, Canada. 37University of Haifa, M. G. Curr. Opin. Neurobiol. 26, 79–87 (2014). 1185–1195 (2018). 38 28. Fishell, G. & Kepecs, A. Annu. Rev. Neurosci. 43, 76. Cadwell, C. R. et al. Nat. Biotechnol. 34, Haifa, Israel. Univeristy of Rochester, Rochester, 1–30 (2019). 199–203 (2016). 39 NY, USA. Charles University, Prague, Czech 29. Arendt, D. et al. Nat. Rev. Genet. 17, 744–757 77. Chen, K. H., Boettiger, A. N., Moftt, J. R., Republic. 40Bar Ilan University, Ramat Gan, Israel. (2016). Wang, S. & Zhuang, X. Science 348, 41Macquarie University, Sydney, New South Wales, 30. Tosches, M. A. & Laurent, G. Curr. Opin. aaa6090 (2015). Australia. 42Sarbonne University, Paris, France. 43J. Neurobiol. 56, 199–208 (2019). 78. Durruthy-Durruthy, R. et al. Cell 157, 44 31. Dumitriu, D., Cossart, R., Huang, J. & Yuste, R. 964–978 (2014). Craig Venter Institute, La Jolla, CA, USA. Severo Cereb. Cortex 17, 81–91 (2007). 79. Trapnell, C. et al. Nat. Biotechnol. 32, Ochoa Center for Molecular Biology, Madrid, Spain. 32. Jiang, X. et al. Science 350, aac9462 (2015). 381–386 (2014). 45University of Göttingen, Göttingen, Germany. 33. Wheeler, D. W. et al. eLife 4, e09960 (2015). 80. Shalek, A. K. et al. Nature 510, 363–369 (2014). 46University of Szeged, Szeged, Hungary. 47Baylor 34. Mihaljević, B. et al. BMC Bioinformatics 19, 81. Fiers, M. W. E. J. et al. Brief. Funct. Genomics 17, 511 (2018). 246–254 (2018). College of Medicine, Houston, TX, USA. 48University 49 35. Shekhar, K. et al. Cell 166, 1308–1323.e30 82. Benavides-Piccione, R., Hamzei-Sichani, F., of Strathclyde, Glasgow, UK. University of Tübingen, (2016). e1330. Ballesteros-Yáñez, I., DeFelipe, J. & Yuste, R. Tübingen, Germany. 
50School of Engineering, New 36. Tasic, B. et al. Nat. Neurosci. 19, 335–346 (2016). Cereb. Cortex 16, 990–1001 (2006). York University, New York, NY, USA. 37. Paul, A. et al. Cell 171, 522–539.e20 (2017). e520. 83. Hawrylycz, M. J. et al. Nature 489, 391–399 ✉e-mail: [email protected]; 38. Fishell, G. & Heintz, N. Neuron 80, 602–612 (2012). (2013). 84. Bakken, T.E. et al. PLoS ONE 13, e0209648 [email protected]; 39. Nelson, S. B., Sugino, K. & Hempel, C. M. (2018). [email protected]; [email protected] Trends Neurosci. 29, 339–345 (2006). 85. Somogyi, P. Brain Res. 136, 345–350 (1977). 40. Tasic, B. et al. Nature 563, 72–78 (2018). 86. Fairén, A. & Valverde, F. J. Comp. Neurol. 194, Published online: 24 August 2020 41. Yager, T. D., Nickerson, D. A. & Hood, L. E. 761–779 (1980). https://doi.org/10.1038/s41593-020-0685-8 Trends Biochem. Sci. 16, 454 (1991). 456, 458 87. Woodruf, A. R. et al. J. Neurosci. 31, passim. 17872–17886 (2011). References 42. Alivisatos, A. P. et al. Neuron 74, 970–974 (2012). 88. Cauli, B. et al. Proc. Natl. Acad. Sci. USA 97, 1. Magner, L.N. A History of the Life Sciences 43. Bargmann, C. I. & Newsome, W. T. JAMA 6144–6149 (2000). (Marcel Dekker, 1979). Neurol. 71, 675–676 (2014). 89. Krimer, L. S. et al. J. Neurophysiol. 94, 2. Ramón y Cajal, S. Rev Ciencias Méd. Barcelona 44. Macosko, E. Z. et al. Cell 161, 1202–1214 (2015). 3009–3022 (2005). 18, 361–376 (1892). 457–476, 505–520, 45. Klein, A. M. et al. Cell 161, 1187–1201 (2015). 90. Armañanzas, R. & Ascoli, G. A. Trends Neurosci. 529–541. 46. Zheng, G. X. et al. Nat. Commun. 8, 14049 38, 307–318 (2015). 3. Ramón y Cajal, S. Recuerdos de Mi Vida: Vol.2. (2017). 91. Andrews, T. S. & Hemberg, M. Mol. Aspects Historia de Mi Labor Científca (Imprenta y 47. Habib, N. et al. Nat. Methods 14, 955–958 Med. 59, 114–122 (2018). librería de Nicolás Moya, 1917). (2017). 92. Kiselev, V. Y. et al. Nat. Methods 14, 4. Ramón y Cajal, S. La Textura del Sistema 48. Bush, E. C. et al. Nat. Commun. 
8, 105 (2017). 483–486 (2017). Nerviosa del Hombre y los Vertebrados (Imprenta 49. Garber, M., Grabherr, M. G., Guttman, M. & 93. Santana, R., McGarry, L. M., Bielza, C., y librería de Nicolás Moya, 1904). Trapnell, C. Nat. Methods 8, 469–477 (2011). Larrañaga, P. & Yuste, R. Front. Neural Circuits 5. Hubel, D. H. & Wiesel, T. N. Proc. R. Soc. Lond. 50. Stuart, T. & Satija, R. Nat. Rev. Genet. 20, 7, 185 (2013). B Biol. Sci. 198, 1–59 (1977). 257–272 (2019). 94. Liu, L., Tang, L., Dong, W., Yao, S. & Zhou, W. 6. Douglas, R. J., Martin, K. A. C. & 51. White, J. G., Southgate, E., Tomson, J. N. & Springerplus 5, 1608 (2016). Whitteridge, D. Neural Comput. 1, 480–488 Brenner, S. Philos. Trans. R. Soc. Lond., B 314, 95. van der Maaten, L. & Hinton, G. J. Mach. Learn. (1989). 1–340 (1986). Res. 9, 2579–2605 (2008). Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience 1467 comment | FOCUS comment | FOCUS

Acknowledgements
This document resulted from group discussions at the FENS/Brain Prize meeting, ‘The necessity of cell types for brain function’, that took place in Copenhagen, Denmark, on 7–10 October 2018. We thank the FENS and Brain Prize Foundation and staff for help and the Lundbeck Foundation for support. This paper is dedicated to the memory of .

Competing interests
The authors declare no competing interests.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

1468 Nature Neuroscience | VOL 23 | December 2020 | 1456–1468 | www.nature.com/natureneuroscience