DEGREE PROJECT IN MATHEMATICS, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2020
On the Existence of Functional Clusters in the Dorsomedial Striatum
THEODOR OHLSSON
KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ENGINEERING SCIENCES
Degree Projects in Optimization and Systems Theory (30 ECTS credits) Master’s Programme in Industrial Engineering and Management KTH Royal Institute of Technology year 2020 Supervisor at Meletis Laboratory (KI): Emil Wärnberg Supervisor at KTH: Xiaoming Hu Examiner at KTH: Xiaoming Hu
TRITA-SCI-GRU 2020:301 MAT-E 2020:074
Royal Institute of Technology School of Engineering Sciences KTH SCI SE-100 44 Stockholm, Sweden URL: www.kth.se/sci
Abstract
In recent years, the understanding of the brain has progressed immensely thanks to advanced data gathering methods that can track the activity of individual neurons. This has enabled researchers to investigate the function and dynamics of different parts of the brain in detail. Using data gathered from mice engaged in a two-alternative choice task, this thesis sought to answer whether neurons of the dorsomedial striatum are clustered with regard to their activity profiles by using four fundamentally different mathematical approaches. This analysis could not find any reliable evidence of functional clusters in the dorsomedial striatum, but their existence cannot be excluded.
Sammanfattning (Summary)
With the help of advanced data gathering methods, researchers in neuroscience have been able to follow the activity of individual neurons in real time. In this way, the function of parts of the brain has been mapped. This work explored whether neurons in the dorsomedial striatum are clustered with respect to their function. This was done by applying four different methods to data from laboratory mice. This analysis cannot find evidence that functional clusters exist in the dorsomedial striatum, but neither can it rule out their existence.
Acknowledgments
The foundation of this thesis is the experiments done by researchers at the Dinos Meletis Lab, KI, to whom I am grateful for sharing their data with me. Any faults in this thesis are my own. Lastly, special thanks to Emil Wärnberg for his guidance and for sharing endless insights into neuroscience and mathematics alike.
Contents
1 Introduction
  1.1 Advances in Neuroscience
  1.2 Previous Research
  1.3 Research Question
2 Theoretical Background
  2.1 Introduction to Neuroscience
    2.1.1 The Neuron
    2.1.2 The Striatum
  2.2 Cluster Analysis
  2.3 Cluster Validity Indices
    2.3.1 The Gap Statistic
  2.4 Modality
    2.4.1 The Dip Test
    2.4.2 Principal Curve
    2.4.3 The Folding Test
  2.5 ePAIRS
  2.6 Community Detection
    2.6.1 Correlation Clustering
3 Methodology
  3.1 Data Gathering
  3.2 Artificial Data Sets
    3.2.1 Skew Normal
    3.2.2 Three Blobs
    3.2.3 15 Blobs
    3.2.4 Cross Distribution
  3.3 Procedure
4 Results
  4.1 Folding Test
  4.2 ePAIRS
  4.3 Principal Curve
  4.4 Gap Statistic
5 Discussion
  5.1 Interpretation of Results
  5.2 Limitations
    5.2.1 Data Gathering
    5.2.2 Uniform as the Null Distribution
  5.3 Future Research
    5.3.1 Experiments with Other Dimensions
    5.3.2 Algorithms for Categorical Mixed Selectivity
    5.3.3 Validation Techniques for Correlation Clustering
6 Conclusion
References
7 Appendix
  7.1 Principal Curve
  7.2 Gap Statistic
  7.3 ePAIRS
1 Introduction
1.1 Advances in Neuroscience
The function of the brain has long been shrouded in mystery. Indeed, Aristotle posited that its main purpose was to cool the circulating blood [1]. Since the introduction of modern medicine and science, the function of the brain as the central controlling organ which houses thoughts, planning and emotions has been well established. Whereas the function of and dynamics between different parts of the brain have historically been understood through observations of the effects of head trauma or pathologies, recent advances in technology have allowed researchers to monitor the live activity of the brain in vivo. Through the use of test animals, researchers now have the ability to record a large number of individual neurons at the same time, which has allowed the role of different parts of the brain to be studied in detail [2]. In practice, this is done either by simultaneously recording the behaviour of a test animal and the activity of its neurons, which allows correlations to be inferred, or by controlling the activity of individual neurons through optogenetic techniques and noting the resulting behaviour, which allows for conclusions regarding causality.
The ability to record the activity of individual neurons at a high temporal resolution has made any conclusions contingent on the analysis of large troves of data. This has increasingly made neuroscientists reliant on machine learning techniques and advanced mathematical methods as a part of their research. Moreover, it has in itself generated new inquiries into mathematical models of neural circuitry and even established computational neuroscience as a field of its own [3].
One such inquiry has been to establish or disprove the existence of neuronal clusters. Suppose an experiment has recorded N neurons while a test subject engages in d tasks, and let subsequent analysis give each neuron a z-score signifying how active said neuron is on each task compared to the mean activity of neurons. One is then left with N points in a d-dimensional space. It is then of interest to know whether there are k distinct groups of neurons in this d-dimensional space. If there are distinct groups, this would imply that those neurons can be understood as being of different types, which could lead researchers to infer a more detailed description of the inner workings of a certain part of the brain. This concept is often referred to as non-random or categorical mixed selectivity, as opposed to random mixed selectivity, in which neurons are located in this feature space without any clear pattern.
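As a minimal sketch of how such a feature space might be assembled, the snippet below turns a small table of activity values into z-scores, one column per task. The input table `rates`, the function name `zscore_features` and the choice to standardise each task across neurons are illustrative assumptions; the scoring actually used for the recordings in this thesis may differ.

```python
import statistics

def zscore_features(rates):
    """Standardise each task column across neurons.

    rates[i][j] is a hypothetical mean activity of neuron i during task j.
    Returns N points in a d-dimensional feature space.
    """
    n_tasks = len(rates[0])
    columns = [[row[j] for row in rates] for j in range(n_tasks)]
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        [(row[j] - means[j]) / stds[j] for j in range(n_tasks)]
        for row in rates
    ]

# Hypothetical example: N = 4 neurons, d = 2 tasks.
rates = [[1.0, 4.0], [2.0, 4.5], [3.0, 8.0], [6.0, 3.5]]
points = zscore_features(rates)
```

Each row of `points` is then one neuron, and the question of clusters becomes a question about how these rows are distributed in the feature space.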
1.2 Previous Research
There has been a longstanding effort to find clusters of neurons in many different parts of the brain, with mixed results. This has roughly manifested itself in inventing new methods for generating clusters and/or using old methods of validating those clusters. With the use of a mathematical analysis of the angles between data points (see section 2.5), neurons in the posterior parietal and orbitofrontal cortex have been found to be clustered and unclustered respectively [4, 5]. However, this method has not been rigorously tested on artificial data sets, and so its robustness is still unclear.
Gründemann et al. [6] grouped neurons of the amygdala on two occasions in a paper published in Science in 2019. In the first instance they employed the k-means algorithm to produce 3 clusters, but did not elaborate on the validity or separation of those 3 clusters. In the second instance they generated clusters after excluding neurons deemed not active enough, i.e. neurons whose absolute score is lower than some threshold. The practice of excluding neurons from a data set was also used by Jennings et al. [7] in the hypothalamus and by Barbera et al. [8] to form clusters in the dorsal striatum using neuronal correlation (see section 2.6). Excluding data points is problematic since it might, for some cut-off, produce a clustered data set from an unclustered one (see Figure 1).
Figure 1: A figure illustrating how the exclusion of data points may result in the impression of a clustered data set.
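The effect shown in Figure 1 can be reproduced with a small simulation. The sketch below is an illustration under assumed values, not the procedure of any cited paper: it draws one-dimensional scores from a single normal distribution (unclustered by construction) and discards those whose absolute value falls below a hypothetical cut-off `threshold`.

```python
import random

random.seed(0)

# Unclustered data: scores drawn from one standard normal distribution.
scores = [random.gauss(0.0, 1.0) for _ in range(2000)]

threshold = 1.0  # hypothetical activity cut-off
kept = [s for s in scores if abs(s) >= threshold]

def largest_gap(values):
    """Largest distance between two neighbouring values after sorting."""
    ordered = sorted(values)
    return max(b - a for a, b in zip(ordered, ordered[1:]))

gap_all = largest_gap(scores)
gap_kept = largest_gap(kept)  # at least 2 * threshold wide
```

After filtering, the retained scores form two groups separated by an empty interval of width at least `2 * threshold`, even though the generating distribution has a single mode.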
The approach of validating generated clusters relies on some measure of cluster fitness to find the appropriate number of clusters. Alder et al. [9] use the Silhouette coefficient (see section 2.2.2) to this end on striatal neurons. However, the Silhouette coefficient implicitly assumes that a data set is clustered, and cannot be reliably used to differentiate between clustered and unclustered data. Moreover, Alder et al. appear to have misinterpreted the results of the Silhouette by choosing a different k from the one the method proposes. Sales-Carbonell et al. [10] also used the Silhouette on striatal neurons, with which they concluded that there were no clusters.
In summary, there are three main pitfalls that one commonly encounters in previous inquiries into the existence of clusters in the brain. The methods:
1. have not been properly tested, so their results cannot be accurately interpreted.
2. filter out data points, which makes them biased towards finding clusters.
3. implicitly or explicitly assume the existence of clusters a priori.
1.3 Research Question
This paper will examine the existence of functional clusters in the dorsomedial striatum. This is a similar, yet distinct inquiry from that of the number of clusters in the striatum, which will not be extensively elaborated on. The question of clusters will be examined with regard to the pitfalls identified in the research field. That is, the methods used in this paper will have to be tested on artificial data sets, be applied to data sets in their entirety, and not assume that there are any clusters to begin with.
2 Theoretical Background
2.1 Introduction to Neuroscience
2.1.1 The Neuron
The brain consists of a vast number of cells. The morphological and functional heterogeneity of these cells is great, although the primary type of cell is the neuron. These cells are specialized in transmitting information to other cells, not exclusively other neurons. Neurons can roughly be described as consisting of a cell body, or soma, dendrites and an axon.
Figure 2: A schematic of the rough structure of a neuron. Input signals will be received at the dendrites and output will be transmitted through the axon. Credit: Edvin Wester
The cell membrane, which encloses the cell, is made of phospholipids and is as such in itself impermeable to much of the surrounding environment. However, channels and pumps cover the cell membrane and allow for the exchange of ions and nutrients between the neuron and its surroundings. In a resting cell, these pumps and channels maintain a higher concentration of K+ inside the cell than outside. Conversely, they also maintain a lower concentration of Na+ and Ca2+ inside the cell than outside. This results in a voltage over the cell membrane called the resting membrane potential. When a neuron sends a signal (known as an action potential), a chain reaction of highly coordinated influx and efflux of ions results in a depolarization of the cell membrane. This depolarization propagates along the length of the neuron until it reaches the axon terminal, which connects to the dendrites of another neuron. This connection between two neurons is known as a synapse. Although the exact mechanisms of neural signaling depend on the receptors of the post-synaptic neuron, neurotransmitters are released to the post-synaptic neuron, which either increase or decrease the probability of inducing an action potential [2].
2.1.2 The Striatum
The striatum is a centrally located part of the brain, below the cortex, which in humans consists of the nucleus accumbens, the olfactory tubercle, the caudate nucleus and the putamen. Post mortem findings in patients diagnosed with Parkinson's disease or Huntington's disease have prompted researchers to investigate the role of the striatum through animal experiments, which have revealed that the striatum enforces smooth motor movements [11]. It has been observed that the striatum receives contextual information from the cortex, which is then processed and relayed to the relevant areas to initiate some behaviour. The three main output pathways are the direct pathway, the indirect pathway and the patch pathway. The expression of genes differs between neurons of the direct, indirect and patch pathways, which has enabled researchers to selectively study these neurons. These gene expressions are denoted D1, A2A and OPRM1 for the three pathways respectively. The classic rate model describes how the direct pathway selectively excites a movement while the indirect pathway inhibits competing movements. Under this theory, an imbalance between the selection of wanted movements and the inhibition of unwanted movements causes the rigid and jerking movements of patients with Parkinson's [12]. Recent research has however shown that the role and dynamics of the striatum are more complex than what can be fitted into the classic rate model [11, 12, 13]. It has for instance been established that the patch pathway encodes the expected value of an action [14].
Figure 3: A coronal section of a mouse brain showing the dorsomedial striatum, taken from [15].
The striatum itself is divided into subregions that are functionally slightly different. The dorsomedial striatum, corresponding to the caudate nucleus, is thought to regulate behaviours that are contingent on the outcome of an action. That is, the dorsomedial striatum promotes behaviour that is associated with a positive value, while suppressing actions that are no longer associated with a positive value [16].
2.2 Cluster Analysis
In a data set X consisting of vectors $x_i$ for $i = 1, \ldots, N$, a cluster is defined as a group of data points which are more similar to one another than to points of other groups [17]. In general, finding clusters is done in order to shed light on the underlying process that generates a data set, and so what should and shouldn't be considered clustered varies depending on the context.
The problem of finding clusters in a data set, known as cluster analysis, has generated much research and has resulted in a plethora of algorithms devoted to this task [18]. However, the majority of the algorithms in use assume that the number of clusters, k, is known a priori. This is for instance the case with k-means, hierarchical clustering and the EM algorithm [19]. For some applications this is a completely reasonable limitation, but for others the exact value of k is the point of inquiry.
In light of this, a number of algorithms that cope with the problem of an unknown k have been developed. Some notable examples are DBSCAN [20] and OPTICS [21]. These all, however, rely on some sort of hyperparameter selection, which in turn dictates the output. Such an algorithm outputs some number k corresponding to the most appropriate choice of clusters according to the algorithm, but the user is left to question the validity of said clusters, as k will be a function of some hyperparameter space.
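The dependence of k on a hyperparameter can be made concrete with a toy example. The sketch below is deliberately not DBSCAN or OPTICS. It is a simplified one-dimensional grouping rule, assumed here purely for illustration, in which neighbouring points belong to the same group whenever their distance is at most a hypothetical threshold `eps`.

```python
def count_groups(values, eps):
    """Number of groups when sorted neighbours closer than eps are merged."""
    ordered = sorted(values)
    groups = 1 if ordered else 0
    for a, b in zip(ordered, ordered[1:]):
        if b - a > eps:
            groups += 1  # a gap wider than eps starts a new group
    return groups

data = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, 3.0, 3.1]

# The same data yields a different k for each choice of eps.
ks = {eps: count_groups(data, eps) for eps in (0.05, 0.5, 1.0, 2.0)}
# ks == {0.05: 8, 0.5: 3, 1.0: 2, 2.0: 1}
```

The same eight points are reported as eight, three, two or one group depending solely on `eps`, which is why the output k cannot be taken at face value without a separate validation step.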
2.3 Cluster Validity Indices
A common method to judge and compare different clustering outputs is to use a cluster validity index (CVI). A clustering algorithm is applied to the data set for every reasonable k. In the absence of any intuition about the data set, this means k = 1, ..., N, which can be infeasible for large data sets. For every run, the fitness of the clusters is evaluated using some CVI. The number of CVIs is ever growing, and the selection of which one to use is done fairly arbitrarily. A common one is the Silhouette method [22], which was used by [9, 10]. The Silhouette method can, however, only discern between k = 2, ..., N clusters and can therefore not really be used to understand whether a data set is clustered or not. Moreover, a comparative study by Arbelaitz et al. [23] shows that the performance of many CVIs, including the Silhouette, is poor. When used to find the optimal k on data sets with a known k, the best CVIs only have an accuracy of roughly 50%.
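To illustrate why the Silhouette cannot evaluate k = 1, the sketch below implements the standard definition of the Silhouette coefficient, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the other members of its own cluster and b(i) is the smallest mean distance to any other cluster. The function name, the example data and the convention of scoring singleton clusters as 0 are assumptions made for this illustration.

```python
import math

def silhouette(points, labels):
    """Mean Silhouette coefficient over all points (Euclidean distance)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        # b(i) needs at least one *other* cluster, so k = 1 is undefined.
        raise ValueError("Silhouette is undefined for fewer than two clusters")
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
two_clusters = silhouette(points, [0, 0, 1, 1])  # close to 1: well separated
```

Calling `silhouette(points, [0, 0, 0, 0])` raises an error, since with a single cluster there is no neighbouring cluster from which b(i) could be computed.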
2.3.1 The Gap Statistic
One method that elaborates on the notion of comparing cluster fitness is the Gap statistic, formulated by Tibshirani et al. [24]. Unlike other CVIs, the Gap statistic was also proposed as a way to discern between unclustered and clustered data. Given data that has been clustered using some arbitrary method, let $d_{ij}$ denote the squared Euclidean distance between observations $x_i$ and $x_j$. Then $D_r$ is the sum of all pairwise distances in cluster $r$:
\[ D_r = \sum_{i,j \in C_r} d_{ij} \tag{1} \]
$W_k$ is then defined as
\[ W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} D_r \tag{2} \]
where $n_r$ is the number of observations in cluster $r$. The Gap statistic then compares $\log(W_k)$ to its expectation under a null distribution. Given that the uniform distribution is the distribution most likely to generate spurious clusters, the null distribution is taken to be uniform over the same range as the original data set. M samples of the uniform distribution are generated, and $W_k^*$ is computed for each. One then gets the estimated gap as
\[ \mathrm{Gap}_n(k) = E_n\left(\log(W_k^*) - \log(W_k)\right) \tag{3} \]
From this sample of $\mathrm{Gap}_n(k)$, the mean and the standard error can be evaluated. $\mathrm{Gap}_n(k)$ is computed for some appropriate range of k.
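The computation in equations (1) to (3) can be sketched as follows. This is a minimal illustration rather than the implementation used for the results in this thesis: the plain Lloyd-style `kmeans_labels`, the example data, the number of reference samples `n_refs` and the scaling of the spread by sqrt(1 + 1/M), which follows the original Gap statistic paper, are all assumptions of the sketch.

```python
import math
import random

def within_dispersion(points, labels):
    """W_k of equation (2): sum over clusters of D_r / (2 n_r)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    w = 0.0
    for members in clusters.values():
        # D_r of equation (1): squared distances over all ordered pairs.
        d_r = sum(math.dist(p, q) ** 2 for p in members for q in members)
        w += d_r / (2 * len(members))
    return w

def kmeans_labels(points, k, rng, iters=50):
    """Plain Lloyd iterations; stands in for any clustering method."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def gap_statistic(points, k, n_refs=20, seed=0):
    """Return (Gap(k), s_k) using uniform reference data over the data range."""
    rng = random.Random(seed)
    log_w = math.log(within_dispersion(points, kmeans_labels(points, k, rng)))
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    ref_logs = []
    for _ in range(n_refs):
        ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
               for _ in points]
        ref_logs.append(
            math.log(within_dispersion(ref, kmeans_labels(ref, k, rng))))
    mean_ref = sum(ref_logs) / n_refs
    sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
    return mean_ref - log_w, sd * math.sqrt(1 + 1 / n_refs)

# Hypothetical data: two well-separated blobs in two dimensions.
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(30)]
data += [(data_rng.gauss(5, 0.3), data_rng.gauss(5, 0.3)) for _ in range(30)]

gap1, s1 = gap_statistic(data, 1)
gap2, s2 = gap_statistic(data, 2)
```

For data of this kind one would expect `gap2` to exceed `gap1`, since two clusters reduce the within-cluster dispersion of the blobs far more than they do for uniform reference data.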
Given that $\mathrm{Gap}_n(k)$ and the corresponding $s_k$ have been evaluated, there are a few ways of then finding the optimal k. What was originally proposed by Tibshirani et al. was to find the smallest k such that
\[ \mathrm{Gap}_n(k) \geq \mathrm{Gap}_n(k+1) - s_{k+1} \tag{4} \]
2.4 Modality
One approach to assess the presence of clusters is to evaluate multimodality. A real function F is said to be unimodal if and only if F is convex on $(-\infty, m]$ and concave on $(m, \infty)$. Note that m does not necessarily need to be unique. One should note that the uniform distribution is then said to be unimodal, an implication which will be important later in this section. The reasoning is that if the underlying distribution only has one mode, it can be considered to only consist of one cluster. Conversely, a multimodal underlying distribution will generate clustered data [25]. Reliable and tested methods have been developed for this, but mostly for the one-dimensional case.
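For a distribution with a density, the definition above is equivalent to the density being non-decreasing up to m and non-increasing afterwards. The sketch below counts the local maxima of a density evaluated on a grid. The equal-weight mixture of two normal distributions with common standard deviation, the grid bounds and the function names are assumptions chosen for illustration. Such a mixture is known to be unimodal when the means are at most two standard deviations apart and bimodal otherwise.

```python
import math

def mixture_pdf(x, mu, sigma=1.0):
    """Equal-weight mixture of N(-mu, sigma^2) and N(+mu, sigma^2)."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    left = math.exp(-0.5 * ((x + mu) / sigma) ** 2)
    right = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
    return 0.5 * norm * (left + right)

def count_modes(mu, lo=-8.0, hi=8.0, steps=3201):
    """Count strict local maxima of the mixture density on a regular grid."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    ys = [mixture_pdf(x, mu) for x in xs]
    return sum(1 for i in range(1, steps - 1) if ys[i - 1] < ys[i] > ys[i + 1])

modes_close = count_modes(0.5)  # means 1 sigma apart: a single mode
modes_far = count_modes(2.0)    # means 4 sigma apart: two modes
```

Note that the first case is a mixture of two components yet has only one mode, so a modality-based approach treats it as a single cluster.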
2.4.1 The Dip Test
The dip test by Hartigan and Hartigan [26] evaluates the probability of unimodality in the one-dimensional case by using the dip statistic. The dip statistic, $D(F, G)$, is the maximum difference between the empirical distribution F and the unimodal distribution G that minimizes this difference. That is,
\[ D(F, G) = \sup_x |F(x) - G(x)| \tag{5} \]
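The supremum in equation (5) can be evaluated exactly for a given candidate G, because the empirical distribution only changes at the sample points. The sketch below does this for one fixed G, the uniform distribution on [0, 1], which is an assumption made for illustration. It does not perform the minimization over all unimodal distributions that the dip statistic requires, so the value it returns is only an upper bound on the dip.

```python
def sup_distance(sample, cdf):
    """sup_x |F(x) - G(x)| for the empirical F of `sample` and a given cdf G."""
    ordered = sorted(sample)
    n = len(ordered)
    worst = 0.0
    for i, x in enumerate(ordered):
        g = cdf(x)
        # F jumps from i/n to (i+1)/n at x, so both sides are checked.
        worst = max(worst, abs((i + 1) / n - g), abs(i / n - g))
    return worst

def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))

spread = sup_distance([0.25, 0.75], uniform_cdf)
clustered = sup_distance([0.0, 0.01, 0.02, 0.98, 0.99, 1.0], uniform_cdf)
```

The sample with two tight groups at either end of the interval lies much further from the uniform distribution than the evenly spread sample does.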
The unimodal distribution G is found in the following way: let x1, x2,..., x be numbers that are ordered so that x x for every i