Hindawi Computational Intelligence and Neuroscience, Volume 2018, Article ID 8234734, 13 pages, https://doi.org/10.1155/2018/8234734

Review Article

Mining Big Neuron Morphological Data

Maryamossadat Aghili and Ruogu Fang

School of Computing and Information Sciences, Florida International University, Miami, FL 33174, USA

Correspondence should be addressed to Ruogu Fang; [email protected]

Received 24 February 2018; Revised 9 May 2018; Accepted 24 May 2018; Published 24 June 2018

Academic Editor: Michael Schmuker

Copyright © 2018 Maryamossadat Aghili and Ruogu Fang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The advent of automatic tracing and reconstruction technology has led to a surge in the amount of 3D reconstruction data and, consequently, in neuromorphology research. However, the lack of a machine-driven annotation schema to automatically detect neuron types from their morphology still hinders the development of this branch of science. Neuromorphology is important because of the interplay between the shape and functionality of neurons and its far-reaching impact on diagnostics and therapeutics in neurological disorders. This survey paper provides a comprehensive review of the field of automatic neuron classification and presents the existing challenges, methods, tools, and future directions for automatic neuromorphology analytics. We summarize the major automatic techniques applicable in the field and propose a systematic data processing pipeline for automatic neuron classification, covering data capture, preprocessing, analysis, classification, and retrieval. Various machine learning techniques and algorithms are illustrated and compared on the same dataset to facilitate ongoing research in the field.

1. Introduction

1.1. Motivation. Neurons are the building blocks of the nervous system. They exchange information to control the entire body. Therefore, deciphering the complex functions of neurons is fundamental to our ultimate understanding of memorization, logical thinking, and learning abilities. It is reported that there are 86 billion neurons in the human brain [1]. Every neuron is composed of three basic parts: the dendrites, the cell body, and the axon, but neurons vary in the number of dendrite branches, size, and shape. These variations lead to the different functionalities of particular neuron types. All neurons belong to at least one of three basic types: (1) sensory neurons, which receive external stimuli and convert them to internal impulses that are transmitted to the brain; (2) interneurons, which convey these signals between neurons; and (3) motor neurons, which pass the signals from the brain to different organs. Despite this general classification, there are neurons which have not been well defined in neuroscience [2].

Researchers have verified that there is a relationship between the form and structure of neurons, their functionality, and their underlying connectivity [3–5]. Neuromorphology is a multidisciplinary research field which involves various scientific domains including biology, chemistry, computer science, and machine learning. This field studies the neural system's form, function, connectivity, and molecular properties [5–7]. Studies have also corroborated that neuron morphology differs across species, regions in the living body, cell functions, and developmental stages [6]. Despite the extensive research in this field, a general agreement about all the neuron types has not yet been reached.

The convoluted shape of neurons, coupled with the subtle structural differences between types, poses a new challenge for researchers. The advent of new technologies such as bright field microscopy, confocal 2-photon microscopy, and automatic and semiautomatic neuron tracing has facilitated and accelerated 3D neuron image reconstruction and has allowed the number of neural images to grow exponentially. Therefore, to understand and explore these complicated data, it is necessary to automate the neuron classification process to keep up with the increasing amount of accumulated data.

Most of the efforts in the past decades have depended on human endeavors to manually classify the neurons. However, in recent years neuroscientists have started using artificial intelligence and machine learning techniques to automatically subdivide the neuromorphological space [8–15].

1.2. Transition to Automated Neuron Classification. The importance of neuron morphology, together with the laboriousness and considerable cost of the current manual process, has led to different lines of research that exploit computational approaches for automatic neuron classification and to the development of a repository named Neuromorpho.org. This is a public dataset that lumps together much of the available neuron morphology research data along with the corresponding publications. It was established online to provide an easy-access platform for sharing the valuable results of labor-intensive research on neurons from various sources. The database is growing steadily toward becoming a complete reference of neuronal morphology studies. It is composed of tens of thousands of 3D cell images with corresponding metadata and related papers from 140 laboratories worldwide [24]. The full procedure of the dataset establishment is provided in [16, 25].

1.3. Challenges of Automated Neuron Classification. Although the emergence of Neuromorpho.org is promising for further breakthroughs in the field, the disparate sources of data, different experimental conditions, diverse levels of reconstruction completeness, and lack of metadata information lead to discrepancies in the results. However, a new course of action has started to address these issues by providing adequate standards for reporting metadata and details of digital reconstruction. This research initiative has prompted efforts to refine and complete the metadata information on the Neuromorpho website [26, 27]. Additionally, because a consistent terminology for data sharing and an effective communication standard had been lacking, the Neuroscience Information Framework recently assembled a comprehensive lexicon covering the neuroscience domain and proposed a unified terminology to help unify experimental results [28, 29].

As explained in the following section, the neuron image acquisition and reconstruction process is not only prolonged but also vulnerable to human bias and judgment [29]. Therefore, defining a stringent guideline and sharing the acquired data with sufficiently comprehensive metadata will significantly help this line of research. In addition, part of the metadata sometimes acts as a confounding variable that should be taken into account when analyzing information from different sources.

The 3D nature of the neuron images also hampers the application of many popular methods of pattern recognition, image detection, and classification. For instance, deep learning, an emerging field of research, can be readily exploited for 2D images, but additional challenges must be addressed to apply it effectively to 3D images.

1.4. Contribution and Organization of the Paper. This survey paper provides an extensive organized overview of computational methods in neuromorphology. Most papers on neuromorphology are written by neuroscientists and lack a comprehensive explanation of the data processing steps; they are filled with technical expressions and definitions from that field. Vasques et al. recently reviewed most of the morphological classification research and briefly surveyed methods, materials, and machine learning algorithms in neuromorphology [30]. In contrast, this survey approaches neuromorphology from a new point of view with a broader spectrum and attempts to provide a user-friendly review for scientists in different fields to understand the ongoing research, opportunities, and challenges. It explains the entire process of neuron classification from scratch and elaborates on image retrieval. A comprehensive pipeline that presents the steps of neuron classification, from capturing raw data to defining the final neuron type, is provided in Section 2. Section 3 explains state-of-the-art neuron retrieval algorithms. Section 4 presents and compares computational results, and the remaining sections provide conclusions, discussion, and future directions.

2. Neuron's Mining Pipeline

In this section, a pipeline for neuron mining is proposed. The steps are shown in Figure 1, and every step is explained in depth below.

2.1. Data Acquisition. The advent of Golgi's staining technique in the late 19th century revolutionized the understanding of the brain. This technique uses light microscopy to envision neuronal tissues. Since then, a number of new and promising methods have been invented which help scientists understand brain functions.

Constructing a well-defined 3D image of a neuron is a time-consuming and labor-intensive process. Neuron staining and labeling, the first steps of this process, can be conducted via different methods depending on the experiment design and preparation forms. Immunolabeling of cellular proteins, bulk extracellular loading, tracer injection, and genetic labeling, which marks neurons intrinsically and intracellularly, are the most well-known staining techniques [31].

Visualization, as the next step, is carried out via optical techniques to acquire high-resolution neuronal images. Bright field microscopy and confocal 2-photon microscopy are the most popular visualization techniques used to prepare the neuron images for tracing.

Due to the neurons' complex morphology and the convoluted cell preparation process, captured images have some degree of noise, corruption, and obscurity. Tracing, the intensive process of reconstructing the digitized image, has evolved over the years to address these problems. In the past it was performed by hand and camera lucida, but nowadays it is mostly done semiautomatically. However, the tracing results are still incomplete because of imperfect staining, tissue sectioning, and low image resolution. Many research groups are currently working on visualization and reconstruction techniques to provide higher quality 3D images via automatic methods, but human intervention is still an inseparable part of the process [21, 32]. A number of popular tracing software packages and toolkits are listed in Table 1. Comprehensive details on visualization techniques and tools can be found in [17, 33, 34].

Table 1: A short list of tracing software and toolkits.

Software/Toolkit | Web Address | Availability | Reference
Neurolucida | http://www.mbfioscience.com | Commercial | Halavi et al. [16]
NeuronJ* | https://imagej.net/NeuronJ | Open Source | Meijering [17]
Simple Neurite Tracer* | https://imagej.net/Simple Neurite Tracer | Open Source | Longair et al. [18]
Sholl Analysis* | https://imagej.net/Sholl Analysis | Open Source | Ferreira et al. [19]
NeuronStudio | http://research.mssm.edu/cnic | Open Source | Rodriguez et al. [20]
Vaa3D | http://www.alleninstitute.org/what-we-do/brain-science/research/products-tools/vaa3d/ | Commercial | Peng et al. [21]
FARSIGHT | http://farsight-toolkit.org | Open Source | Luisi et al. [22]
NeuronCyto | http://neuroncyto.bii.a-star.edu.sg/ | Open Source | Yu et al. [23]
Aivia | https://www.drvtechnologies.com | Commercial | N/A
Imaris | http://www.bitplane.com/imaris-for-neuroscientists | Commercial | N/A
* ImageJ plugin.


Figure 1: Neuron’s mining pipeline. higher quality 3D images via automatic methods but human intervention is still an inseparable part of the process [21, 32]. A number of diferent types of popular tracing sofware and tools are provided in Table 1. A comprehensive detail on the visualization techniques and tools can be found in [17, 33, 34].

2.2. Feature Extraction. For processing and quantitative analysis of reconstructed images, the neurons' features should be extracted. L-Measure is one of the most recent software packages that executes this morphometric calculation. It is free software designed to compute more than 30 morphometric features from a neuronal reconstruction file in a wide range of formats [35]. There are other alternative packages for feature extraction, such as Cvapp [36], Neurolucida Explorer [37], NeurphologyJ [38], and NeuronLand, which can be used based on the need [39]. Some important neural features which can be extracted by L-Measure are shown in Figure 2 [40], and a detailed description of the features is presented in [35].

Figure 2: Neuron's features.

2.3. Data Preprocessing. Since real-world data tend to be noisy, incomplete, and inconsistent, data preprocessing is necessary prior to further analysis. To achieve reliable results in the quantitative analysis, validation, curation, and standardization steps should be performed, which together constitute preprocessing. Preprocessing is especially essential when the dataset is an amalgamation from different research laboratories. The different preprocessing steps are briefly explained below.

2.3.1. Normalization. Data preprocessing is a fundamental building block of data mining. The core of preprocessing is standardization and statistical adjustment of the data. Normalization can considerably improve the classification accuracy of a number of models. The data are not always expressed in similar units and magnitudes, so data pretreatment considering certain criteria is essential prior to data analysis [41]. One of the popular standardization methods is z-score scaling, which involves subtracting the mean from all values and dividing by the standard deviation [42–46].
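To make the normalization step concrete, the following is a minimal sketch of z-score scaling applied to a small table of morphometric features; the column names and values are hypothetical placeholders rather than actual L-Measure output.

```python
import pandas as pd

# Toy table standing in for L-Measure-style morphometric features (hypothetical column names).
features = pd.DataFrame({
    "soma_surface": [310.2, 150.7, 480.9, 220.4],
    "total_length": [1200.5, 830.1, 2100.8, 990.3],
    "n_branches":   [14, 6, 31, 11],
})

# z-score scaling: subtract each column's mean and divide by its standard deviation.
zscored = (features - features.mean()) / features.std(ddof=0)

print(zscored.round(3))
print(zscored.mean().round(6))          # each column now has mean ~0
print(zscored.std(ddof=0).round(6))     # and unit standard deviation
```

In practice the same transformation is applied column by column to the full feature matrix before clustering or classification, so that large-magnitude features (e.g., total length in micrometers) do not dominate small-magnitude ones.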
2.3.2. Missing Value Treatment. Missing values occur when there is no data value for some variables or features in the dataset. This is a common and complicated phenomenon that should be addressed with an appropriate approach prior to classification or clustering, as many algorithms cannot handle data with missing values. Most statistical packages simply ignore incomplete samples; however, invalid statistical results may follow from the elimination of critical information. This method is called Listwise or Case Deletion.

A similar approach is Pairwise Deletion, which deletes or keeps a data point depending on which pairs of features are needed in a calculation. For instance, if a sample has values for features x1, x3, x4, and y but is missing x2 and x5, the sample is kept when the pair (x1, y) is needed in a calculation but discarded when (y, x5) or (y, x2) is needed.

Replacing the missing value simply with zero or with the attribute mean is another way of handling unavailable values. It can be more precise if, instead of the attribute mean, the missing value is replaced by the mean of all samples belonging to the same class. However, this method can be misleading when the variable has large variance [47].

A more rigorous approach suggests the following procedure: calculate the percentage of missing values; if it is less than 5% of all values, it can be neglected; otherwise, tests of MAR (missing at random) and MCAR (missing completely at random) should be performed to establish whether the missing data occurred randomly or arose from some specific situation that made those data hard to gather. If these tests fail, the missingness is systematic and an inference method can be used to estimate the missing values; otherwise, unavailable values cannot be predicted easily, so Listwise Deletion or Mean Substitution can be applied depending on the condition [48–50].

A better way to obtain less biased estimates is to apply data mining algorithms to predict the probable values. Regression, Bayesian inference, decision trees, and clustering algorithms can be used for inferring missing values. Regressing the missing variable on the other independent variables is a simple solution and works well when there is a strong relationship between the missing variable and the other independent variables.

Single and multiple imputation using expectation maximization are modern techniques of missing value completion. Expectation maximization, a kind of maximum likelihood approach, iteratively imputes the missing value based on the relationships among the whole set of variables and adds some degree of random error to reflect the uncertainty of the imputation. The algorithm stops when the imputed variable is stabilized [51]. The propensity score method, regression modeling, and a collection of techniques called Markov chain Monte Carlo are also used for data imputation. A group of well-established imputation methods such as Matrix Factorization, Singular-Value Decomposition (SVD), and K-Nearest Neighbors (KNN) have been implemented in statistical and analytical software packages such as MVA in R and Fancyimpute in Python.
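As an illustration of the imputation strategies above, the sketch below fills missing feature values with mean substitution and with a KNN-based imputer from scikit-learn; this is only one of several packages implementing such methods (the text mentions MVA in R and Fancyimpute in Python), and the array here is a toy example.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

# Toy feature matrix with missing entries (np.nan), e.g., incomplete morphometrics.
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, np.nan, 6.0],
    [7.0, 8.0, 9.0],
    [np.nan, 5.0, 4.0],
])

# Simple baseline: replace each missing entry with the column (attribute) mean.
X_mean = SimpleImputer(strategy="mean").fit_transform(X)

# KNN imputation: each missing entry is estimated from the k most similar samples.
X_knn = KNNImputer(n_neighbors=2).fit_transform(X)

print(X_mean)
print(X_knn)
```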
2.3.3. Data Unification and Consolidation. As mentioned earlier, the Neuromorpho dataset is an amalgamation of data from many neuroscience research labs. Therefore, it is not surprising that there are discrepancies in the naming of the same value in the dataset. For example, neonate developmental stages are referred to by different names like "embryonic", "infant", "neonatal", and "fetal" in different datasets. These data must be consolidated into a common nomenclature prior to processing.

2.3.4. Addressing Imbalanced Datasets. Imbalanced classes are those in which the majority of values belong to one type (more than 90%) and the remainder to the others. Accuracy is not an appropriate metric of classification performance on such datasets. In order to obtain a correct understanding of the classification performance, it is recommended to report the F-measure, precision, and recall in future research. Oversampling and undersampling are also popular techniques that are sometimes used to address imbalanced data [52].

2.3.5. Exclusion of Confounding Variables. Because experimental conditions, staining model, imaging resolution, and other empirical details dramatically affect some of the parameter calculations, feature selection should not be done blindly. Polavaram et al. claimed that axonal morphological features and branch diameter are extremely dependent on the experimental conditions, so they excluded them from the pool of features prior to analysis [46]. Different preparation mechanisms, shrinkage during tissue processing, and slicing artifacts all impact the neuron images. Scrupulous attention to these details improves classification accuracy.

2.3.6. Dimensionality Reduction. Reducing the feature space by obtaining a set of uncorrelated variables is a popular technique in big data analysis. The remaining features should be selected carefully to exhibit the original data variability. The main goals of this technique are to enhance classification performance and reduce processing time. When the dimension is reducible to two, data visualization becomes possible, which provides a better understanding of the data distribution [53, 54].

Principal Component Analysis (PCA) is the most well-known dimensionality reduction technique in neuroscience studies. This method transforms a large set of features into a smaller set in such a way that the remaining set represents the original variance in the data. In other words, PCA identifies the most dominant directions in the feature space and alleviates the curse of dimensionality, which improves classification accuracy [55].
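The following is a minimal sketch, assuming a generic feature matrix, of how PCA-based dimensionality reduction is typically applied before clustering or classification; standardizing first matters because PCA is sensitive to feature scale.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))               # stand-in for 30 morphometric features of 200 neurons

X_std = StandardScaler().fit_transform(X)    # z-score each feature first
pca = PCA(n_components=2)                    # keep the two most dominant components
X_2d = pca.fit_transform(X_std)

# Fraction of the original variance retained by the two components.
print(pca.explained_variance_ratio_, pca.explained_variance_ratio_.sum())
print(X_2d.shape)                            # (200, 2): now amenable to 2D visualization
```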

Costa et al. applied PCA and canonical analysis to a massive dataset of 6000 neurons from the Neuromorpho database to decrease the 20 extracted features to only two, which explained most of the data variability. Their results show that cells of similar type, region, and species tend to cluster together and that these clusters become more substantial after applying PCA. But there is not enough evidence to answer comparative questions such as "whether the neurons of interrelated species have the same morphological traits because of the same habits or coexistence" or "how a neuron's morphology evolved in a species" [40].

Polavaram et al. used the L-Measure software to extract more than 100 features from data captured from Neuromorpho.org. They applied PCA to identify the most important morphological parameters for structural classification. Their results corroborate that, while direct assessment of a large-scale heterogeneous dataset cannot uncover meaningful patterns, applying PCA as an effective feature space reduction makes it feasible to capture the relationship between metadata and clusters. They also reported that among specific cell types and animal species there are some morphological differences that are not sensitive to the laboratory of origin [46]. Multiple other pieces of research have adopted PCA on different datasets in a similar style [56, 57].

In contrast to the aforementioned studies, one study was not able to find a meaningful improvement in classification by applying PCA. In this experiment, 67 morphological features of mouse frontal, visual, and somatosensory cortex were measured, and cluster analysis was performed twice, once without applying PCA and once after applying it, to compare the effect of dimensionality reduction on the data. This experiment displayed no obvious difference between the two runs other than some rearrangement of cells [44]. It suggests that while PCA has rendered promising results in several bioscience studies, it is not always practical [40, 46, 56–58].

PCA attempts to reduce the reconstruction error through the best linear transformation, so for data with nonlinear dependencies it cannot capture higher-order relations. Furthermore, PCA compresses the attributes into new combined attributes; thus the contribution of the original features to classification is not easily interpretable. These two major drawbacks highlight the need for other dimensionality reduction methods [59].

2.3.7. Feature Selection. As mentioned in the previous part, some extracted features are not necessarily informative for model creation. Redundant or irrelevant features decrease processing speed and mislead the algorithm [60]. A proper collection of the most significant attributes can boost classification performance. Feature selection is mostly done prior to applying machine learning in order to speed up model training, yield a simpler and more interpretable model, and enhance the generalization power of the model [61, 62].

There is a variety of feature selection techniques, such as the filter, wrapper, and embedded methods. They rank the features based on their importance and then pick an appropriate subset of features based on different approaches. The filter method selects variables regardless of the model and only evaluates the intrinsic importance of the individual features. It ignores potential interactions between the elements of the subset and suppresses the least interesting variables. This method is computationally efficient and robust to overfitting (a modeling error that occurs when a function fits the training data points so closely that the overly complex model fails to generalize beyond the training set). However, because it does not consider relationships between variables, it tends to select redundant variables [63]. In contrast to the filter method, the wrapper method considers probable interactions between the subset's features, but it carries a risk of overfitting and takes significant computation time [64]. The embedded method is a combination of the two aforementioned methods: the algorithm takes advantage of its own variable selection process and performs feature selection and classification repeatedly until it reaches the best performance [65].

The authors of [66] sought to rank the contribution of features to the classification of axonal projection neurons. For this purpose, they repeated the classification process multiple times, leaving one feature out in each run; the amount of error growth indicates the importance of the left-out feature. Repeating this for each feature, all features were ranked by importance (a minimal sketch of this procedure is given after this subsection).

To consider feature interactions and correlations, Sun et al. demonstrated the performance of their novel feature selection algorithm on the neuron morphology repository. In their Binary Matrix Shuffling Filter (BMSF) algorithm, a matrix with the same number of columns as the original feature set and a subset of rows (a sample set) is selected. Cells in each row are randomly assigned zero or one, representing the absence or presence of that feature, with an equal number of zeros and ones in each row. Using the reduced training set obtained from this shuffling, the accuracy of an SVM on the selected subset is measured via tenfold cross-validation. Classification accuracy is calculated multiple times on the shuffled subset: each time, one column's cell contents are toggled while the remaining part of the matrix is left untouched, to see whether that feature improves or degrades the accuracy. In the former case, the feature is kept in the final selected subset; otherwise it is excluded. This step is repeated until no more changes happen in the final subset. They applied the algorithm coupled with a Support Vector Machine (SVM), a Back Propagation Neural Network, and Naive Bayes, and reported the highest performance for their proposed feature selection technique [67].
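Below is a minimal, hedged sketch of the leave-one-feature-out ranking idea described for [66]: a classifier is retrained with each feature removed in turn, and the drop in cross-validated accuracy serves as that feature's importance score. The classifier choice (a random forest) and the synthetic data are illustrative assumptions, not the original authors' setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=10, n_informative=4, random_state=0)

def cv_accuracy(X, y):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    return cross_val_score(clf, X, y, cv=5).mean()

baseline = cv_accuracy(X, y)
importance = {}
for j in range(X.shape[1]):
    X_without_j = np.delete(X, j, axis=1)                     # leave feature j out
    importance[j] = baseline - cv_accuracy(X_without_j, y)    # accuracy drop = importance

# Features whose removal hurts accuracy the most rank as the most important.
ranking = sorted(importance, key=importance.get, reverse=True)
print("baseline accuracy:", round(baseline, 3))
print("feature ranking (most to least important):", ranking)
```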
2.4. Unsupervised Learning. Discovering the hidden structure of data without any prior knowledge is called unsupervised learning. A large amount of unlabeled data is fed to the algorithm, and clustered data is obtained as output. Clustering is the most popular unsupervised learning method and has been widely used to discriminate unlabeled data. The majority of published research in automatic neuron classification has applied Ward's method, K-means, affinity propagation clustering, or a combination of these. Several examples are briefly reviewed in this section.
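As a concrete reference point for the clustering algorithms named above, here is a minimal sketch that runs Ward's agglomerative clustering and K-means on the same synthetic feature matrix; the feature matrix and the choice of five clusters are assumptions for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = StandardScaler().fit_transform(rng.normal(size=(150, 20)))  # stand-in for neuron features

# Ward's method: bottom-up agglomerative clustering that minimizes within-cluster variance.
Z = linkage(X, method="ward", metric="euclidean")
ward_labels = fcluster(Z, t=5, criterion="maxclust")            # cut the tree into 5 clusters

# K-means: the number of clusters is fixed in advance and assignments are refined iteratively.
kmeans_labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

print(np.bincount(ward_labels)[1:])   # cluster sizes from Ward
print(np.bincount(kmeans_labels))     # cluster sizes from K-means
```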

One of the most conventional and widespread clustering methods in neuron classification is Ward's method. The main properties of the algorithm that make it popular are: (1) most of the members of a group have common features, (2) each feature is visible in a large number of members, and (3) there is no need for all members to have all the features [68]. The algorithm follows a bottom-up strategy: it starts from the leaves, groups the closest items together based on the overall largest similarity, and forms a new cluster, gradually repeating this grouping at different levels of the tree until a common root is reached at the top. Tsiola et al., after applying PCA, employed Ward's method with the Euclidean distance metric on their own dataset of 158 images of primary visual cortex neurons in mouse. They focused on dendritic and somatic shape. Five classes of cells emerged, including large pyramidal neurons, polarized nonpyramidal neurons, and short pyramidal neurons [45].

Despite the popularity of this approach among many neuroscientists, Ward's method has a major drawback: once a data point has been assigned to a cluster, there is no chance of changing the assignment later, even if a better cluster for that point is recognized higher up the tree [43, 44, 56, 69–76].

Researchers have exploited the K-means algorithm to address this deficiency of Ward's method; comparing their results with Ward's, they showed that K-means is superior for neuron discrimination. K-means clustering takes the reverse, top-down approach: the number of desired clusters is defined in advance, and the algorithm dynamically corrects the assignment of neurons to the different groups by calculating the inner similarity of the members [77, 78].

Badea et al. separated adult mouse retinal cells in a multidimensional feature space using K-means, applying Ward's method as well to compare the results. The authors coupled molecular composition and physiological properties with morphological features; for example, they linked receptive field size and connectivity to the ON and OFF pathways of the neurons, together with morphological features like arbor area and stratification within the inner plexiform layer, branching, density, and radiality of the dendrites, to make a diverse set of features. One disadvantage of their research, as they also explicitly mentioned, is that by considering a conservative cut-off for defining the number of clusters they may have suppressed some crucial clusters [79].

The main concern with the studies that used Ward and K-means is that all features have the same weight and importance in the classification process. This neglects the fact that some features matter more than others in clustering. For example, the stratification level in the IPL has significant importance in comparison with other features, so treating it in the same way as the others may cause unfit clustering [79].
Other researchers, like Kong et al., have also used the K-means algorithm as an appropriate clustering tool. They characterized the shape of a series of 219 retinal ganglion cells in the mouse. In contrast to the same neuron types in the monkey, cat, and rabbit, mouse ganglion cells are less distinctive and pose a serious challenge for identifying subtypes [80]. Each 3D neuron image was mapped to 42 quantitative features. By eliminating redundant and uninformative features in a first screening, 26 features were retained. A correlation matrix for the feature set was then created to identify highly related features, since keeping such a group of features leads to a high-dimensional space with no extra information for classification. The authors avoided human intervention in weighting the features. After an empirical experiment based on the correlation matrix analysis, the three most significant features were produced: branching density, stratification depth, and the extent of the dendritic field [3]. Chunwen et al. also used clustering coupled with PCA. The main difference of this work from similar experiments is that they used the extracted dataset from the Neuromorpho.org website and defined a naming schema based on the morphologies of each type of neuron [81].

Whereas most neurons have a strict membership to a particular group, some neurons pertain to more than one type. Battaglia et al. referred to the former cell type as archetypal and to the latter as atypical [82]. They proposed a fuzzy clustering algorithm to effectively identify the membership degree of atypical neurons to the main archetypes. Fuzzy set theory was invented by [83] in order to describe indefinite phenomena with a precise alphabet. Battaglia et al. used the same dataset and overall approach as [43]; however, they focused on clustering the atypical cells. In their fuzzy clustering scheme, one neuron can belong to more than one class type with different degrees of membership, and the membership degrees of a neuron, in normalized form, sum to 1. When a neuron is completely matched with one class its membership degree is 1, and when it does not belong to a class its degree is 0.
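To illustrate the idea of graded membership used in fuzzy clustering, the sketch below computes soft membership degrees from distances to fixed cluster centroids (an inverse-distance weighting in the style of fuzzy c-means, not Battaglia et al.'s actual algorithm); each neuron's memberships are normalized so they sum to 1.

```python
import numpy as np

def soft_memberships(X, centroids, m=2.0):
    """Fuzzy-c-means-style membership degrees: each row sums to 1.

    m > 1 is the fuzzifier; larger m gives softer assignments.
    """
    # Pairwise distances between samples and centroids, shape (n_samples, n_clusters).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)                        # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)     # normalize each row to sum to 1

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 4))                # toy feature vectors for six neurons
centroids = rng.normal(size=(3, 4))        # three hypothetical archetype centroids

U = soft_memberships(X, centroids)
print(np.round(U, 3))    # archetypal cells have one degree near 1; atypical cells are spread out
print(U.sum(axis=1))     # each row sums to 1
```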
In a study by Ristanovic et al., large sets of dentate nucleus neurons were analyzed qualitatively and quantitatively in terms of morphology and topology. Seven features were extracted from 2D images of the neurons, which were classified manually into four different groups. To verify the findings and show the consistency of the proposed classification schema, they applied t-tests and ANOVA tests [84].

The authors of [85] explored affinity propagation clustering on a dataset of 337 interneurons and compared it with the Ward algorithm. The results indicate a slightly better performance for affinity propagation than for Ward. The dataset comprised 20 electrophysiological and 67 morphological features. Considering only the shape of the neurons, 10 clusters appeared; considering only the physiological features, 36 clusters appeared. By combining all the features and applying the affinity propagation algorithm, 8 clusters with an accuracy of 77% were achieved.

A study that approaches the problem from a relatively dissimilar perspective was done by DeFelipe, Lopez et al. [86, 87]. A taxonomic solution based on axonal arborization patterns was presented: six axonal morphological features were defined to categorize GABAergic neurons, which are less controversial cell types. After defining the six features clearly, an interactive web-based system was created to allow 42 neuroscientists to ascribe categories to the neurons in 320 images based on those features. The image repository was a collection of interneuron images from different parts of the cerebral cortex of human, monkey, cat, rabbit, rat, and mouse. A Bayesian network model was created to analyze the different experts' answers, and Fleiss' pi and Cohen's kappa indices were calculated to ensure that the agreements were not accidental. Eventually, an automatic clustering algorithm separated the dataset, and the output clusters corroborated the correctness of the community consensus.

There are multiple other research findings on automatic classification and clustering of neurons based on combinations of neuromorphological, electrophysiological, and molecular features, which have used similar approaches [6, 70].

2.5. Supervised Learning. In spite of the sheer amount of labeled data, most of the research on classification of the Neuromorpho space has so far used unsupervised machine learning techniques. However, publicly available databases like Neuromorpho.org are now encouraging the application of supervised methods, and since supervised techniques are often found to perform better than unsupervised ones, this line of study has progressed toward using labeled datasets [88].

Guerra et al. utilized supervised classification instead of unsupervised clustering in order to reap the benefits of prior knowledge in the field. They attempted to distinguish neocortical pyramidal cells from interneurons in a total of 327 samples, comparing the accuracy of Decision Tree, Naive Bayes, Multilayer Perceptron, Logistic Regression, and k-Nearest Neighbors algorithms with an unsupervised method. Additionally, they applied dimensionality reduction techniques, like PCA, and feature subset selection to reduce the number of features [88]. Their final dataset consists of 128 samples of pyramidal cells and 199 samples of interneurons from mouse, with 64 extracted features and apical dendrites as the label set. A comparison of the outcomes proves the superiority of the supervised classification approach and the effectiveness of the dimensionality reduction and feature extraction methods in this specific morphological task.

After creating 400 pairs of image stacks from a pool of motor neurons in Drosophila larvae and converting them to 2D images in the lab, Chang et al. partitioned neurons based on their three main morphological parts: soma, axon, and dendrite. Neurons were annotated manually into five separate subtypes. Chang et al. applied their proposed algorithm, named "hNNRF-T: Hidden Neural Network Random Field", to classify the dataset. The input of the neural network is the morphology features from the different neuron parts, the hidden layer is a sigmoid nonlinear function, and the output is the energy which controls the interactions in the hidden conditional random field. They tested a Support Vector Machine (SVM) with Gaussian kernel, a Logistic Regression model, and a Gaussian Mixture Model (GMM) on the dataset and compared outcomes to demonstrate the superiority of their model; the reported accuracies show the higher performance of hNNRF in classifying their specific sample set of neurons. While the method of converting 3D images to 2D while preserving the whole content of the image is inspiring, there is not enough evidence to prove that this method outperforms the state-of-the-art algorithms on a more general dataset like Neuromorpho.org [89].
Zhao and Plaza proposed a method in which electron microscopy images of the Drosophila optic medulla are fed into a segmentation stage whose labeled output becomes the input of a skeletonization stage. The skeletonization stage converts the binary image into a skeletonized, 3D neuron-shaped model using the TEASAR method. Different skeletons based on different inputs were produced and compared with a set of predefined skeletons, and features were then calculated. After pairwise matching between the different feature sets, used as a signature of each image, a similarity matrix was built. In the final step, they applied the affinity propagation clustering algorithm and K-Nearest Neighbor classification on the normalized similarity matrix, and different classes of the most similar neurons emerged as the desired result. The authors proposed the idea that the location of the branches determines the type of the neuron and also tried to implement a detection algorithm based on branch density. Although the accuracy of their method is high on the provided dataset, it is not comparable with algorithms that have been applied to public datasets: the dataset was created manually in their laboratory, and a specific alignment was needed to achieve an acceptable result, which is a major drawback of their proposed method [90].

Recently, Sun et al. exploited a Support Vector Machine (SVM) paired with their proposed Binary Matrix Shuffling Filter (BMSF) for feature selection. BMSF is a feature selection technique (described in the feature selection section) which is coupled with a classifier to define the neural space boundaries. They also coupled BMSF with other state-of-the-art classification algorithms, including a Back Propagation Neural Network (BPNN), SVM with recursive feature elimination (SVM-RFE), and Naive Bayes, and compared the classification accuracy of those methods with and without their proposed feature selection technique to prove its effectiveness [67].

Jiang et al. classified the neuron space based on morphological features. Acquiring the neuron images from Neuromorpho.org and extracting 20 features per neuron image, they applied PCA to reduce the feature space to only four features. By employing a back propagation algorithm, they classified the space into various subtypes with different functionalities, including Purkinje, motor, pyramidal, and sensory neurons, and interneurons [91].
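The next sketch shows the generic pattern of coupling a feature selector with an SVM, using scikit-learn's recursive feature elimination (the SVM-RFE baseline mentioned above) rather than BMSF itself, for which no off-the-shelf implementation is assumed here; the data are synthetic placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=60, n_informative=8, random_state=0)

# Recursive feature elimination with a linear SVM: repeatedly drop the weakest features.
selector = RFE(estimator=SVC(kernel="linear"), n_features_to_select=10, step=5)
model = make_pipeline(StandardScaler(), selector, SVC(kernel="rbf", C=1.0))

scores = cross_val_score(model, X, y, cv=10)   # tenfold cross-validation, as in the reviewed work
print("mean accuracy with feature selection: %.3f" % scores.mean())
```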
2.6. Multilabel and Multiclass Classification. In some classification problems, multiple classes must be predicted rather than a binary division of the space. Furthermore, not only are there more than two classes, but a sample may also carry more than one label. This kind of problem is called multiclass, multilabel classification, and neuron morphology classification can fall into this category [92].

Fernandez et al. compared several state-of-the-art multilabel classifiers on the Neuromorpho.org dataset in order to predict gender, species (rat, human, mouse, and elephant), developmental stage, area of the neocortex (fronto-insula, anterior cingulate, motor, somatosensory, entorhinal, occipital lobe, frontopolar, multiple, frontal lobe, insular cortex, precentral gyri, postcentral gyri, medial prefrontal cortex, and perirhinal), cell type level one (interneuron or principal cell), and cell type level two (stellate, pyramidal, basket or bitufted, neurogliaform, and containing cell). They formulated this multilabel classification problem and introduced the Class Bridge decomposable Multidimensional Gaussian Classifier (CB-MGC). The model is a variation of Bayesian network classifiers and outperforms the state-of-the-art multiclass, multilabel algorithms. They reported their results with performance measures such as hamming score and exact match, and based on these metrics their proposed algorithm shows higher performance [93].

The presented experiments and research are some of the most significant efforts toward automatic classification of the neuron space. Although most of them apply the methods to a locally created dataset, the ideas can be exploited and applied to a sufficiently large public dataset to derive a more general conclusion about the performance and feasibility of automatic classification.
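As a generic illustration of multilabel classification (not the CB-MGC model itself, for which no public reference implementation is assumed), the sketch below trains one random forest per label with scikit-learn's MultiOutputClassifier on synthetic data and reports the exact-match and hamming metrics mentioned above.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Synthetic stand-in: each sample carries several binary labels (e.g., species, cell type, stage).
X, Y = make_multilabel_classification(n_samples=500, n_features=30, n_classes=5, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)

clf = MultiOutputClassifier(RandomForestClassifier(n_estimators=200, random_state=0))
clf.fit(X_tr, Y_tr)
Y_hat = clf.predict(X_te)

print("exact match (all labels correct):", accuracy_score(Y_te, Y_hat))   # strict subset accuracy
print("hamming loss (per-label error rate):", hamming_loss(Y_te, Y_hat))
```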
3. Neuron's Retrieval Methods

The deluge of online neuron 3D images has led to an increasing demand for exploring, identifying, analyzing, and retrieving image data. This trend has stimulated research seeking to develop efficient, high-speed retrieval algorithms for querying these datasets and satisfying neuroscientists' questions and needs.

Considering this demand, Costa et al. implemented software named NBLAST. This software provides various functionalities to search and retrieve neuron images in a database; it is capable of defining the best-matched type for a neuron based on its image, of detecting two different images of the same neuron, and of differentiating neurons of two highly similar types. NBLAST decomposes the neuron images into small segments and, by applying log-likelihood score matrices, defines the matches in the dataset. The algorithm builds a hierarchical clustering by applying Ward's and affinity propagation methods and groups the neuron images based on feature similarity. It provides a ranked list of possibly similar neurons, which makes the search faster and more efficient [94]. However, assigning a huge feature vector to each image of a tremendous database leads to considerable response time.

Because search speed is an important issue, a group of researchers has aimed at tackling this problem by exploiting the hashing concept. In image hashing search, each image is converted to a binary code, which needs far less space in memory and is easier to explore. Although Weiss et al. proved that finding the best codewords is an NP-hard problem, they proposed a novel formulation for redressing the issue, called Spectral Hashing (SH) [95]. Several other efficient encoding schemes have been proposed to improve search performance in big image databases such as Neuromorpho.org; considering that the neuron image repository is a giant dataset, researchers apply the hashing concept to provide fast searching capability [96, 97].

Traditional metrics like the Euclidean distance are inefficient for calculating the similarity between high volumes of 3D images, so Li et al. implemented a new idea for rapid searching and retrieval in the large-scale 3D space of neuronal databases. They proposed an efficient asymmetric binary coding function to implement a high-speed maximum inner product search (MIPS). This algorithm not only saves space, thanks to the compact 32-bit representation of each image, but also speeds up the search time over huge databases like Neuromorpho.org by a factor of about 30. They compared their algorithm with different state-of-the-art algorithms, like Anchor Graph Hashing (AGH) [98], Spectral Hashing (SH) [95], and Iterative Quantization (ITQ) [99], to demonstrate its superiority. In brief, their algorithm first extracts the image features and then applies the maximum inner product search to find the best binary coding function. After obtaining the best function, it converts each image into a 32-bit binary code. At retrieval time, it converts the query image to a binary code with the same function and uses an inner product search to find the most similar images in the dataset. In their later work, they added two extra steps to improve the efficiency of the algorithm: after feature extraction, they grouped features into different hierarchies to create the similarity matrix, and after the maximum inner product search an asymmetric optimization is applied and two coding functions are generated [100, 101].

Several retrieval techniques have been adopted for large-scale medical image analytics that also have the potential to be applied to mining the neuromorphological space. For instance, Zhang et al. represented histopathological image data by various features such as image histograms, bags of words, local texture, and shape, and used these huge feature vectors to find similarities among different clinical cases. To improve search speed in the high-dimensional feature space, they applied kernelized and supervised hashing methods as a scalable query method, and validated their approach on cell-level analysis of thousands of breast tissue images [102, 103]. Although histopathological image data are relatively different from neuromorphological image data, the idea of applying kernels and changing the space can be applied to the neuronal space. A summary of the machine learning techniques applied in neuron morphology research is provided in Table 2.
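Before turning to Table 2 below, here is a minimal sketch of the binary-hashing idea (random-projection hashing with Hamming-distance lookup, a simpler scheme than SH, ITQ, or the asymmetric MIPS coding discussed above) that turns each feature vector into a 32-bit code and retrieves the most similar stored neurons for a query; all data are synthetic placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
n_bits, n_features = 32, 100
database = rng.normal(size=(10_000, n_features))       # stored neuron feature vectors
projections = rng.normal(size=(n_features, n_bits))    # random hyperplanes defining the hash

def hash_codes(X):
    # The sign of each random projection gives one bit of the compact code.
    return (X @ projections) > 0

db_codes = hash_codes(database)

def retrieve(query, k=5):
    q_code = hash_codes(query[None, :])
    hamming = (db_codes != q_code).sum(axis=1)          # bit differences to every stored code
    return np.argsort(hamming)[:k]                      # indices of the k most similar images

query = database[42] + 0.05 * rng.normal(size=n_features)   # slightly perturbed known neuron
print(retrieve(query))    # index 42 should rank near the top
```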

Table 2: Machine learning techniques for neuron classification.

ML Technique | Algorithm
Unsupervised Techniques | Ward; K-means; PCA; Affinity Propagation; Fuzzy Set Clustering; Feature Selection
Supervised Techniques | Neural Network; Hidden Neural Network Random Field; SVM + Binary Matrix Shuffling Filters; Multiclass Classification
Retrieval Techniques | Ward + Affinity Propagation; Binary Hashing Search; Maximum Inner Product

4. Morph Is Not Enough

Neuroscientists have recently agreed that neuron classification cannot be satisfactory unless multimodal information about neurons is considered rather than only morphological features [28]. Following this, Karagiannis et al. classified a group of neurons based on physiological, molecular, and morphological features. They selected the Neuropeptide Y (NPY) neurons, which have three distinctive subtypes with different sets of properties, and extracted morphological properties such as the orientation of the major axis relative to the radial axis and the laminar location. They also took into account electrophysiological and molecular properties, such as the first spike amplitude and the NPY marker, to build a comprehensive pool of neuron features. The authors aimed at clustering the neurons into three main types, bipolar, tufted, and multipolar, based on the mentioned attributes. They applied the Ward clustering technique and demonstrated that a more robust clustering schema is achievable by considering a wide variety of neuron features rather than morphology alone [43]. Several other studies have classified neurons based on different combinations of morphological, physiological, and molecular properties using unsupervised clustering algorithms with a relatively similar approach [6, 56, 66, 70, 104–106]. All of these studies attempt to demonstrate that a comprehensive, diverse feature set leads to more accurate classification results.

5. Computational Methods Validation and Comparison

Up to this point, most of the distinguished studies and research in the Neuromorpho space have been reviewed. Following the proposed pipeline, in this section we classify the public Neuromorpho database and present the outcomes.

As the first step, we downloaded a large portion of the neuron images from different categories on the Neuromorpho.org site. A pool of neurons composed of 16647 drosophila, 173 human, 1181 mouse, 6426 rat, 184 monkey, 300 giraffe, 302 C. elegans, 360 chimpanzee, 127 cricket, 143 humpback whale, and samples from several additional species was collected. The features of each image were extracted with the help of L-Measure. Afterward the data were cleaned, missing values were replaced by zero, and naming was standardized and preprocessed; finally, a group of classifiers including Random Forest, Decision Tree (J48), K-Nearest Neighbors, Logistic Regression, and Naive Bayes was applied to classify the data. The accuracy of each algorithm for the different label sets is provided in Table 3.
Consistent with previous studies [107], Random Forest demonstrates outstanding performance compared with the other classifiers.

Random Forest is a supervised machine learning algorithm that is an ensemble of multiple decision trees. For each tree in the forest, a bootstrap sample of the data is taken so that each tree is fit on a different set of samples. The data are then split on a random selection of variables, with the best split chosen iteratively based on an impurity measure. The whole process is repeated to build several decision trees that complete the Random Forest model. Each new data point is fed into all the generated trees, and their outputs are aggregated to form the final prediction of the Random Forest. This algorithm achieves the highest accuracy, over 85%, among the tested classifiers for predicting the developmental stage, species, primary cell type, and gender. However, for predicting secondary and tertiary brain cell types the performance is still too low, due to the great diversity of the subtypes and the shortage of data in each subtype.

The rat, mouse, monkey, human, chimpanzee, and drosophila sample sets were also classified separately with Random Forest classifiers, and the results are shown in Table 4.
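The following is a minimal sketch of the kind of baseline comparison reported in Tables 3 and 4, run here on synthetic data; the feature matrix and label vector are placeholders, not the actual Neuromorpho-derived experimental setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in for L-Measure features labeled with, e.g., species or primary cell type.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=12,
                           n_classes=5, random_state=0)

classifiers = {
    "Random Forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
}

for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=5).mean()   # mean cross-validated accuracy
    print(f"{name:20s} {acc:.3f}")
```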

Table 3: Baseline accuracy of different algorithms based on different classes.

Machine Learning Alg | Random Forest | Decision Tree | KNN | Logistic Regression | Naive Bayes
Species | 98.19% | 96.65% | 93.6% | 90.42% | 78.2%
Gender | 85.12% | 81.11% | 82.57% | 80.19% | 78.2%
Primary Cell Type | 86.08% | 83.67% | 79.24% | 73.44% | 71.7%
Primary Brain Region | 68.43% | 61.21% | 56.29% | 48.07% | 24.69%
Development | 97.47% | 96.53% | 94.89% | 91% | 83.08%

Table 4: Random Forest accuracy based on the different species.

Species Name | Rat | Mouse | Monkey | Human | Chimpanzee | Drosophila
Development | 96.2% | 97.3% | 92.9% | 100% | 77.8% | 99%
Gender | 99.3% | 99.4% | 94% | 91.3% | 80.8% | 79.1%
Primary Cell Type | 97.7% | 99.3% | 98.4% | 98.3% | 99.8% | 83%
Primary Brain Region | 97% | 96.9% | missing | 98.8% | 1 | 59%

6. Future Directions

Despite the different approaches to neuromorphology presented in this work, there are other promising directions that should be explored and several issues that should be addressed to obtain more reliable results. Here we briefly mention the open challenges.

6.1. 3D Image Classification. The majority of the algorithms applied to neuron classification have been developed on features extracted by software such as L-Measure, Cvapp, NeuronLand, and Neurolucida Explorer [39]. To the best of our knowledge, direct processing of the 3D neuron images has never been explored in this field. Applying convolutional neural networks and deep learning techniques to directly process the reconstructed 3D images and extract their features is a promising direction.

6.2. Enormous Database Retrieval. Although several hashing ideas have been applied in the neuronal space, there are still many potentially effective methods that have not been examined. Jiang et al. exploited a joint kernel supervised hashing method with multiple features for the retrieval of histopathological breast cancer images. To preserve the data similarity in the Hamming distance, kernel functions were introduced. After feature extraction, they applied different kernels to the individual features and defined a joint kernel function which linearly combines those kernels. Afterward, a series of hashing functions is constructed based on the joint kernel, and a supervised algorithm is applied to optimize the weights and hashing functions based on the provided image metadata. The hashing functions reduce the high dimensionality of the feature space and produce a compact hash code which boosts the search speed [108]. This research idea has great potential to be applied in the neuronal space.

6.3. Heterogeneity. Traditional neuromorphology research usually lacks standardization and is often fragmented across multiple formats, prepared with different standards and various naming schemas. Therefore, developing a common data standard is an absolute need for achieving more homogeneity and consistent results. Confounding variables should be defined and removed from the dataset. In addition, there are problems of missing values and noise that should be addressed prior to classification of these huge datasets.

6.4. Generalization. Most of the aforementioned studies focused on a locally created dataset, and some of them only studied a part of a public dataset. They applied and tested their proposed algorithms and methods on limited, incomplete data and reported results independently. These separate efforts, while valuable, lack the generality needed to induce facts or reach conclusions based on them. A consistent study on a comprehensive dataset is needed to obviate the semantic dependencies and hidden rules in a more generalized and complete form. Finding a high-performance, accurate model for the classification of neurons will help complete the chain of automatic neuron type detection.

7. Conclusion

This article presents a comprehensive overview of the techniques, pipeline, future directions, and challenges for neuromorphology in the big data age, by presenting a structured analysis of the neuromorphology methods in nearly 100 papers and web articles. We have summarized most of the important machine learning techniques that have been used for this purpose so far, and we have provided a systematic data processing pipeline for automatic neuron morphology analysis. Automatic classification of neuron images in the big data age, with the growing speed of reconstructed neuron images, is an emerging and highly important research field with a potentially significant impact on the diagnosis of neurological disorders. The future of this field will benefit from the exponentially increasing amount of digital neuron images.
Data Availability

The public datasets of neuromorphology are available at http://neuromorpho.org/. Source code (Python and R scripts) implementing the methods and the analyses described in this paper can be requested from Maryamossadat Aghili at [email protected].

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank Dr. Steven Grieco for his help in providing the neuron drawings.

References

[1] F. A. C. Azevedo, L. R. B. Carvalho, L. T. Grinberg et al., "Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain," Journal of Comparative Neurology, vol. 513, no. 5, pp. 532–541, 2009.
[2] J. B. Furness, "Types of neurons in the enteric nervous system," Autonomic Neuroscience: Basic and Clinical, vol. 81, no. 1-3, pp. 87–96, 2000.
[3] J. H. Kong, D. R. Fish, R. L. Rockhill, and R. H. Masland, "Diversity of ganglion cells in the mouse retina: unsupervised morphological classification and its limits," Journal of Comparative Neurology, vol. 489, no. 3, pp. 293–310, 2005.
[4] G. M. Villegas, "Electron microscopic study of the vertebrate retina," The Journal of General Physiology, vol. 43, no. 6, pp. 15–43, 1960.
[5] R. H. Masland, "Neuronal cell types," Current Biology, vol. 14, no. 13, pp. R497–R500, 2004.
[6] R. Armananzas and G. A. Ascoli, "Towards the automatic classification of neurons," Trends in Neurosciences, vol. 38, no. 5, pp. 307–318, 2015.
[7] T. O. Sharpee, "Toward functional classification of neuronal types," Neuron, vol. 83, no. 6, pp. 1329–1334, 2014.
[8] H. Kolb, R. Nelson, and A. Mariani, "Amacrine cells, bipolar cells and ganglion cells of the cat retina: A Golgi study," Vision Research, vol. 21, no. 7, pp. 1081–1114, 1981.
[9] W. E. Renehan, Z. Jin, X. Zhang, and L. Schweitzer, "Structure and function of gustatory neurons in the nucleus of the solitary tract: II. Relationships between neuronal morphology and physiology," Journal of Comparative Neurology, vol. 367, no. 2, pp. 205–221, 1996.
[10] R. Fogel, X. Zhang, and W. E. Renehan, "Relationships between the morphology and function of gastric and intestinal

