Biologically Realistic Artificial Neural Networks
Matthijs Kok
University of Twente
P.O. Box 217, 7500AE Enschede, The Netherlands
[email protected]

ABSTRACT
Artificial Neural Networks (ANNs) play an important role in machine learning nowadays. A typical ANN is traditionally trained using stochastic gradient descent (SGD) with backpropagation (BP). However, it is unlikely that a real biological neural network is trained similarly. Neuroscience theories, such as Hebbian theory, could inspire an adaptation of the traditional training method SGD to make ANNs more biologically plausible. Two mathematical descriptions of Hebbian theory will be suggested based on the limitations of the mathematical framework of Hebbian theory: a competitive Hebbian learning rule and an imply Hebbian learning rule. Finally, this research will propose a method, called the Hebbian Plasticity Term (HPT) method, that incorporates the mathematical description of Hebbian theory to modify the traditional training method SGD. Two variants of the HPT method are proposed: HPT-Competitive and HPT-Imply. The influence of the Hebbian plasticity term ϕ on SGD shows more biologically realistic plasticity of a synaptic connection between neurons in the ANN, at the cost of performance. The results of this research can be used as a basis for developing more biologically realistic artificial neural networks.

Keywords
Artificial Neural Network, Stochastic Gradient Descent with Backpropagation, Hebbian Theory, Hebbian Plasticity Term method

1. INTRODUCTION
Deep learning is a very important area in the study of machine learning nowadays, with wide applications [21, 18, 10] such as speech recognition [9, 15], natural language processing [7], and image recognition [11, 32]. It describes a powerful set of techniques for learning in artificial neural networks (ANNs) [4, 25, 31]. A traditional method to train ANNs is stochastic gradient descent (SGD), which makes use of backpropagation (BP) [13, 25].

In the biological sciences, understanding the biological basis of learning is considered to be one of the ultimate challenges [17]. Theoretical or computational neuroscientists strive to make a link between observed biological processes and biologically plausible theoretical mechanisms (computational models) for neural systems, to gain an understanding of how biological systems, such as the brain, work. Furthermore, neuroscience also aims to make contributions towards progress in artificial intelligence, such as inspiration and validation for new computer algorithms and architectures [5]. To improve the understanding in both fields, models that better resemble biological neural networks are of utmost importance.

ANNs are vaguely inspired by biological neural networks. However, the method SGD-BP that is used to train ANNs does not resemble the way in which the human brain learns. This research aims to contribute to closing the gap between artificial and biological neural networks: it investigates a neuroscience theory called Hebbian theory [16] as a starting point to modify the traditional training method SGD of ANNs, aiming to make the learning process more biologically plausible. Within this research, two mathematical formulations are proposed for different Hebbian learning rules, which are incorporated into traditional SGD by a newly developed method.

This paper will investigate the following research questions:

RQ 1 How could Hebbian theory be used to mathematically describe a biologically realistic evolution of a weight?
RQ 1.1 What are the limitations to Hebbian theory when using it to formulate a mathematical description of a weight update?
RQ 1.2 Which types of mathematical descriptions of Hebbian theory could be used, as a result of these limitations?
RQ 2 Could the training algorithm BP-SGD of a typical standard ANN be modified using Hebbian theory to become a more biologically plausible training method?
RQ 2.1 How could a mathematical description of Hebbian theory be incorporated into the standard training algorithm BP-SGD?
RQ 2.2 Does a connection weight between two neurons of an artificial neural network evolve more biologically plausibly as a result of incorporating the new method?
RQ 3 To what extent does the performance of an ANN trained using the newly developed, more biologically realistic training method differ from the performance of an ANN trained using the traditional training method BP-SGD?

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
33rd Twente Student Conference on IT, July 3rd, 2020, Enschede, The Netherlands.
Copyright 2020, University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science.

2. ANALOGY BETWEEN THE ANN AND THE BRAIN

2.1 Neurons in the brain
Figure 1. Schematic image of two neurons in a brain

Figure 1 depicts two neurons in a brain that are separated by a synapse. When the leftmost neuron, the pre-synaptic neuron, is excited by an action potential, the vesicles in its axon terminals fuse with its membrane and release their contents, neurotransmitters, into the synaptic cleft. Once released, the neurotransmitters can bind to the receptors on the dendrite of the rightmost neuron, the post-synaptic neuron, acting as a chemical signal. That binding opens ion channels on its dendrite that allow charged ions to flow in and out of the cell, converting the chemical signal into an electrical signal. If the combined effect of multiple dendrites (which can be connected to other neurons) of the post-synaptic neuron changes the overall charge of the cell enough, then it triggers an action potential, also called a spike [24, 27].

2.2 The relation between an ANN and neurons in the brain
A typical feedforward ANN could be described as a structured collection of nodes with edges between them, as depicted in Figure 2.

Figure 2. Example feedforward ANN with 2 neurons in the input layer, 3 neurons in the hidden layer, and 2 neurons in the output layer

The ANN that is depicted in Figure 2 represents connected neurons in a brain, as connected nodes. The spike in a neuron is considered a binary phenomenon: it is there or it is not. Therefore, the activation in an ANN represents the firing rate of the neuron. The more often a pre-synaptic neuron fires, the higher the chance that the post-synaptic neuron will also increase its firing rate. The activation a_j^l (of a node) in an ANN of a post-synaptic neuron is dependent on the activations a_i^{l-1} of the pre-synaptic neurons, according to equation (1):

    a_j^l = σ( Σ_i (a_i^{l-1} · w_ji^l) + b_j^l )    (1)

Here, σ denotes a sigmoid activation function; the bias is unique to a neuron and is denoted as b_j^l. The weighted input sum Σ_i (a_i^{l-1} · w_ji^l) + b_j^l will be abbreviated as z_j^l throughout the rest of this paper. Furthermore, the superscript is written as l to denote any layer, as an integer to denote a specific layer, or as L, L-1, L-2, ..., where L denotes the last layer. Both i and j denote the position of a neuron within a layer. The use of subscript letters will be alphabetically coherent with the order of layers (e.g. ..., a_h^{l-2}, a_i^{l-1}, a_j^l, a_k^{l+1}, ...). The dependence of the post-synaptic neuron a_j^l on multiple pre-synaptic neurons a_i^{l-1} is also present in real biological post-synaptic neurons, which can be connected to multiple (pre-synaptic) neurons through their (multiple) dendrites, which together determine whether or not an action potential is triggered.

An ANN shows connections as edges between nodes. The edge or connection is analogous to the synapse between two neurons in a brain. In an ANN, a weight w_ji^l is associated to a connection and is considered to represent the strength of the respective connection. In other words, a large weight in an ANN implies a large synaptic efficacy of the analogous biological synapse. The subscript ji denotes the position of the synapse within a layer. Here, j denotes the post- and i the pre-synaptic neuron, which together form the synaptic connection w_ji^l.

3. TRAINING AN ANN
A common ANN training method is stochastic gradient descent (SGD), which requires backpropagation (BP). This section discusses how traditional SGD is performed on an ANN, to serve as a basis for sections 6 and 7.

In an ANN, one feedforward is performed by computing equation (1) sequentially for each layer. The weights w_ji^l and biases b_j^l used to compute equation (1) are initialized as random variables drawn from a standard normal distribution. Since the computation of each separate component is cumbersome, vector-wise multiplication is used. Below, the matrix multiplication for a feedforward is shown (in this example layer l has 2 units and layer l+1 has 4 units):

    [z_1^{l+1}]   [w_11^{l+1}  w_12^{l+1}]              [b_1^{l+1}]
    [z_2^{l+1}] = [w_21^{l+1}  w_22^{l+1}] · [a_1^l] +  [b_2^{l+1}]    (2)
    [z_3^{l+1}]   [w_31^{l+1}  w_32^{l+1}]   [a_2^l]    [b_3^{l+1}]
    [z_4^{l+1}]   [w_41^{l+1}  w_42^{l+1}]              [b_4^{l+1}]

where

    [a_1^{l+1}]      [z_1^{l+1}]
    [a_2^{l+1}] = σ( [z_2^{l+1}] )    (3)
    [a_3^{l+1}]      [z_3^{l+1}]
    [a_4^{l+1}]      [z_4^{l+1}]

In short, this can be notated as:

    z^{l+1} = W^{l+1} · a^l + b^{l+1}    (4)

and

    a^{l+1} = σ(z^{l+1})    (5)

Here, W^l is a matrix and a^l, b^l, z^l are vectors.

The prediction or classification of a digit by the network is determined to be the highest activation a_j^L in the output layer. To be able to improve the performance, the cost function in equation (6) is calculated.
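As a concrete illustration, equations (4) and (5) can be sketched with NumPy. The sizes mirror the 2-unit to 4-unit example above; all variable names are hypothetical and the values are random:

```python
import numpy as np

def sigmoid(z):
    # σ from equation (1), applied element-wise
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Layer l has 2 units, layer l+1 has 4 units, as in equations (2)-(3)
W = rng.standard_normal((4, 2))   # weight matrix W^{l+1}
b = rng.standard_normal((4, 1))   # bias vector b^{l+1}
a = rng.random((2, 1))            # activations a^l of layer l

z = W @ a + b        # equation (4)
a_next = sigmoid(z)  # equation (5)

print(a_next.shape)  # (4, 1): one activation per unit in layer l+1
```

Repeating this computation layer by layer yields one full feedforward of the network.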

    C = (1/2) · Σ_j (a_j^L − y_j)²    (6)

Here, y_j is a component of the vector of desired output activations y, also called the label: y_j = 1 when the digit corresponding to output neuron j is displayed on the image, and y_j = 0 otherwise. The value of the loss function (6) is large for a big difference and small for a small difference between the output activation a^L and the label y.

To be able to train the network, the weights and biases are modified to minimize the cost function, which is equivalent to maximizing the performance. Since the cost function is complex and multivariate, it is very difficult to calculate the global minimum. Therefore, a (possibly local) minimum is approached in steps by adjusting all weights and biases in the direction of the negative gradient of the cost function. This method is called gradient descent [8, 25]. The gradient ∇C of a function is the direction of steepest ascent at a point P. A point P means a value for each of the variables w_ji^l and b_j^l. ∇C is a (column) vector whose components are the partial derivatives of the cost function with respect to all weights and biases in the network, shown in equation (7):

    ∇C(w, b) = [ ∂C/∂w_11^1  ∂C/∂b_1^1  ···  ∂C/∂w_ji^L  ∂C/∂b_j^L ]^T    (7)

At each step towards a minimum of the cost function, the value of every weight and bias in the network is adjusted according to equations (8) and (9):

    w_ji^l ← w_ji^l − η · ∂C/∂w_ji^l    (8)

    b_j^l ← b_j^l − η · ∂C/∂b_j^l    (9)

Here, the step size or weight update Δw_ji^l = η · ∂C/∂w_ji^l is determined by the partial derivative of the cost function with respect to that particular weight, ∂C/∂w_ji^l, and the learning rate η. The partial derivative ∂C/∂w_ji^l is the corresponding component of the gradient of the cost function ∇C. The learning rate η can be chosen to control the magnitude of the step size.

To be able to compute ∇C, every component ∂C/∂w_ji^l and ∂C/∂b_j^l of ∇C must be computed. A component can be computed using the chain rule [35], according to equations (10) and (11):

    ∂C/∂w_ji^l = ∂z_j^l/∂w_ji^l · ∂a_j^l/∂z_j^l · ∂C/∂a_j^l = a_i^{l−1} · σ′(z_j^l) · ∂C/∂a_j^l    (10)

and the partial derivative with respect to a bias in layer l of the network becomes:

    ∂C/∂b_j^l = ∂z_j^l/∂b_j^l · ∂a_j^l/∂z_j^l · ∂C/∂a_j^l = 1 · σ′(z_j^l) · ∂C/∂a_j^l    (11)

where both in equation (10) and (11) it holds that:

    ∂C/∂a_j^l = Σ_k ∂z_k^{l+1}/∂a_j^l · ∂a_k^{l+1}/∂z_k^{l+1} · ∂C/∂a_k^{l+1} = Σ_k w_kj^{l+1} · σ′(z_k^{l+1}) · ∂C/∂a_k^{l+1}    (12)

Also, note that ∂z_j^l/∂w_ji^l = a_i^{l−1}, ∂z_j^l/∂b_j^l = 1, ∂a_j^l/∂z_j^l = σ′(z_j^l), and ∂C/∂a_j^L of the last layer L equals a_j^L − y_j. To see how the chain rule expressions (10) and (11) are determined, the dependence diagram [1] of Figure 3 is useful.

Figure 3. Dependence in the ANN

While propagating backwards through the network, the terms of partial derivatives get longer in a consistent way. Generally, downstream (closer to the output layer) weight updates have shorter terms, whereas upstream (closer to the input layer) weight updates have longer terms. When backpropagating, the partial derivatives of the cost function with respect to each weight and bias are calculated for each layer as a vector, based on the partial derivatives of more downstream vectors, as follows:

    ∂C/∂b_i^{L−1} = 1 · σ′(z_i^{L−1}) · Σ_j (w_ji^L · ∂C/∂b_j^L)    (13)

In (per-layer) vectors, this can be translated into equation (14):

    ∂C/∂b^{l−1} = σ′(z^{l−1}) ⊙ (W^l)^T · ∂C/∂b^l    (14)

Here, the Hadamard product, or element-wise product, is denoted as ⊙. The partial derivative with respect to the weight can then be computed based upon the following relation:

    ∂C/∂W^l = ∂C/∂b^l · (a^{l−1})^T    (15)

3.1 Mini-Batch SGD
A good middle ground between the stochastic gradient descent algorithm (which picks one training example and updates immediately after it) and the standard gradient descent algorithm (which updates only once per epoch, after calculating the average gradient over all training examples) is a form of stochastic gradient descent called 'mini-batch' stochastic gradient descent [28, 25]. In mini-batch stochastic gradient descent, you pick a random set or 'batch' of training examples. Subsequently, you compute the average gradient over that batch of training examples and update using that gradient. This combines fast convergence to a minimum with a smooth line if you plot the cost function over time, because you update more often (after calculating the average gradient of the batch) while still averaging over a few training examples (those of the batch). It is also computationally lighter, because you only update all weights and biases after computing the average gradient over the training examples in the batch.

4. NEUROSCIENCE THEORIES

4.1 Hebbian Theory
The Hebbian theory, or Hebb's rule, is a neuroscientific theory that attempts to explain synaptic plasticity [16]. The theory claims that if the pre-synaptic neuron repeatedly fires and stimulates a post-synaptic neuron, some growth process or metabolic change takes place in one or both cells that increases the efficiency of the pre-synaptic neuron in firing the post-synaptic neuron.
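To make the update equations of section 3 concrete, the sketch below computes the output-layer gradients of equations (10), (11) and (15) for a tiny hypothetical layer and checks one component of ∇C against a finite-difference estimate; all sizes, names, and values are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Tiny hypothetical output layer: 3 inputs -> 2 outputs
W = rng.standard_normal((2, 3))
b = rng.standard_normal((2, 1))
x = rng.random((3, 1))          # activations a^{L-1}
y = np.array([[1.0], [0.0]])    # label vector

z = W @ x + b
a = sigmoid(z)
cost = 0.5 * np.sum((a - y) ** 2)       # equation (6)

# Equation (11) with dC/da^L = a^L - y, then equation (15):
dC_db = a * (1 - a) * (a - y)           # sigma'(z^L) * (a^L - y)
dC_dW = dC_db @ x.T

# Check one component against a finite-difference estimate
eps = 1e-6
W_pert = W.copy()
W_pert[0, 0] += eps
cost_pert = 0.5 * np.sum((sigmoid(W_pert @ x + b) - y) ** 2)
numeric = (cost_pert - cost) / eps

print(bool(abs(numeric - dC_dW[0, 0]) < 1e-4))  # True: analytic matches numeric
```

The same check extends layer by layer via equation (14), which is how the implementation in section 8 propagates the error backwards.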

The theory is often summarized as "Cells that fire together, wire together" [22]. However, Hebb implicitly describes that the pre-synaptic cell should fire before, and not at the same time as, the post-synaptic cell.

This firing causation mentioned by Hebb is described in what is currently known as spike-timing-dependent plasticity (STDP), where temporal precedence is required [6]. This is a biological process that adjusts the strength of connections between neurons in the brain based on the relative timing of a neuron's output and input action potentials. As a result, pre-synaptic neurons that cause the post-synaptic neuron's excitation become more likely to contribute in the future, whereas inputs that are not the cause of the post-synaptic spikes are made less likely to contribute in the future. Eventually, a subset of the connections between the pre- and post-synaptic neurons remains.

According to equation (10), the "fire together, wire together" postulate is already partly implemented in a traditional ANN. This can be summarized in two points [25]. First, the weight w_ji^l of a connection between two neurons in an ANN will learn fast if the input neuron is high-activation (a_i^{l−1} ≈ 1) and slowly if it is low-activation (a_i^{l−1} ≈ 0), because the magnitude of a weight update ∂C/∂w_ji^l depends on the magnitude of the activation of the input neuron a_i^{l−1}. Second, if the output neuron has already "learned", increasing the weights to that output neuron is not very useful anymore, because the neuron has already reached its target value (e.g. a_j^l ≈ 1 or a_j^l ≈ 0). It appears the weight of a connection between two neurons will learn slowly if the output neuron has saturated (i.e. is either high- or low-activation), because then σ′(z_j^l) ≈ 0 and the weight update ∂C/∂w_ji^l depends on the size of σ′(z_j^l).

4.2 BCM Theory
The BCM theory, or BCM synaptic modification, is a physical theory of learning in the visual cortex [3]. Part of the development of the BCM theory is based on limits of the Hebbian theory. Hence, Hebbian theory was normalized by introducing a loss in connection strength when there is no or unsynchronized activity, allowing for a decay of synapses.

The BCM model states that synaptic plasticity is stabilized by a dynamic adaptation of the time-averaged post-synaptic activity. According to the BCM model, a post-synaptic neuron will tend to undergo long-term potentiation (LTP) if it is in a high activity state, and long-term depression (LTD) otherwise; LTP and LTD are a lasting increase or decrease, respectively, in signal transmission between two neurons.

5. LIMITATIONS TO THE BIOLOGICAL PLAUSIBILITY OF AN ANN
Nowadays, there are still elements in an ANN that are not biologically plausible, such as the weight decay and temporal precedence described in BCM theory. There are implementations of algorithms that propose to take this into account, such as [34], which includes STDP in ANNs.

Another limitation of ANNs is the biologically implausible backpropagation algorithm [20]. Whittington et al. [36] explore this in a review article with a main focus on the lack of local error representation. What is pointed out in [36] is that it appears unclear how the synaptic plasticity afforded by the backpropagation algorithm could be achieved in the brain, since biological synapses change their connection strength based solely on local signals, that is, the activity of the neurons they connect. According to equations (10) and (12), this is currently not the case, since a weight update depends on multiple non-local and more downstream terms. There are several studies that try to overcome this [23, 26]. In [36], alternatives are suggested such as temporal-error models like contrastive learning [30] and an energy-based continuous update model [2], or explicit error models like a predictive coding model [37] and a dendritic error model [29, 14]. However, all of these models introduce new biologically implausible ideas, such as control signals, phases, unrealistic architectures, and extra connections.

Additionally, a biologically problematic aspect described by [36] is the unrealistic model of a neuron in an ANN with regard to its continuous output. Real neurons use spikes. Sporea et al. [33] introduce a supervised learning algorithm for multilayer spiking neural networks (SNNs), which do not fire at each propagation cycle (as in a typical ANN), but rather fire only when a membrane potential reaches a specific value (the threshold). Hence, SNNs use a binary output instead of the continuous output of ANNs. Unfavorably, the current activation level of a neuron is modeled as a differential equation in an SNN, which eliminates training methods like BP-SGD. Furthermore, SNNs have much larger computational costs than ANNs.

6. QUANTIFYING HEBBIAN THEORY
In order to measure the influence of a Hebbian learning rule, the theory must be quantified mathematically. Hebb's rule says that the synaptic efficiency depends on the firing of the pre-synaptic neuron and the post-synaptic neuron. In other words, the weight w_ji^l of a connection depends on the activation of the pre-synaptic neuron a_i^{l−1} and the activation of the post-synaptic neuron a_j^l. Since Hebbian theory only describes conditions for connection strengthening (w_ji^l increase), conditions should be added that allow for weight decay.

To devise a formula for a weight update rule that adheres to Hebbian theory, it is easiest to think of activations in a binary way. A possible set of conditions covering all 2² possible scenarios (when activations are binary) is the following:

1. a_i^{l−1} = 0 ∧ a_j^l = 0  ⟹  w_ji^l(t+1) = w_ji^l(t)
2. a_i^{l−1} = 0 ∧ a_j^l = 1  ⟹  w_ji^l(t+1) < w_ji^l(t)
3. a_i^{l−1} = 1 ∧ a_j^l = 0  ⟹  w_ji^l(t+1) < w_ji^l(t)
4. a_i^{l−1} = 1 ∧ a_j^l = 1  ⟹  w_ji^l(t+1) > w_ji^l(t)

In this paper, equation (16) is proposed, which adheres to these conditions:

    w_ji^l(t+1) = (a_i^{l−1} · a_j^l − |a_i^{l−1} − a_j^l|) · c + w_ji^l(t)    (16)

where c is a real constant that can take on any (positive) value, and time t is indicated between brackets. It is easily observed that equation (16) is not limited to binary activations, but also works for real-valued activations. Although this is a way to quantify Hebbian learning for a weight, it is not the only way: many equations could adhere to the conditions stated above. Furthermore, the conditions themselves are rather arbitrary. Only condition 4 is certain when talking about Hebbian learning, because Hebbian theory does not mention anything about weight decay.
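A few lines of Python suffice to check that equation (16) indeed satisfies the four conditions above; the constant c = 0.1 and the starting weight 0.5 are arbitrary illustrative values:

```python
# Equation (16): w(t+1) = (a_pre * a_post - |a_pre - a_post|) * c + w(t)
def hebb_competitive(a_pre, a_post, w, c=0.1):
    return (a_pre * a_post - abs(a_pre - a_post)) * c + w

w = 0.5
print(hebb_competitive(0, 0, w))  # 0.5 : unchanged (condition 1)
print(hebb_competitive(0, 1, w))  # 0.4 : decay     (condition 2)
print(hebb_competitive(1, 0, w))  # 0.4 : decay     (condition 3)
print(hebb_competitive(1, 1, w))  # 0.6 : growth    (condition 4)
```

Because the rule is a plain algebraic expression, the same function also accepts real-valued activations between 0 and 1.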

Hence, the resulting relation between w_ji^l(t+1) and w_ji^l(t) could be diversified for conditions 1-3. For example, another configuration of the conditions could be one where conditions 1, 3 and 4 remain the same and condition 2 is changed: a_i^{l−1} = 0 ∧ a_j^l = 1 ⟹ w_ji^l(t+1) = w_ji^l(t). This is inspired by the imply operator in logical truth tables. In Table 1, the truth table for an imply operator is shown. Imagine p = a_i^{l−1} and q = a_j^l, where (p ⟹ q) = 0 means a weight decay and any other case means no weight decay.

Table 1. Truth table of logical implication
p  q  p ⟹ q
0  0  1
0  1  1
1  0  0
1  1  1

In this paper, equation (17) is proposed, which adheres to this configuration of conditions:

    w_ji^l(t+1) = (2 · a_i^{l−1} · a_j^l − a_i^{l−1}) · c + w_ji^l(t)    (17)

For the sake of clarity, the configurations for the conditions of equations (16) and (17) can be translated to Table 2.

Table 2. Conditions for equations (16) and (17)
pre a_i^{l−1}   post a_j^l   eq. (16): w_ji^l(t+1) ~ w_ji^l(t)   eq. (17): w_ji^l(t+1) ~ w_ji^l(t)
0               0            =                                   =
0               1            <                                   =
1               0            <                                   <
1               1            >                                   >

Fortunately, more of such configurations have been extensively explored in earlier work [12]. For example, another equation that adheres to the condition configuration of the last column of Table 2, proposed in [12], is w_ji^l(t+1) = (a_j^l − c) · a_i^{l−1} + w_ji^l(t) for 0. Such a rule makes the synapses converging on a neuron compete with each other, which is also called "spatial competition between convergent afferents". BCM theory [3] tries to overcome this by proposing a synaptic modification that results in temporal competition between input patterns rather than a spatial competition between different synapses. Whether the synaptic strength increases or decreases depends on the magnitude of the post-synaptic response as compared with a variable modification threshold:

    ẇ_ji^l = φ(a_j^l) · a_i^{l−1} − ε · w_ji^l    (18)

where

    φ(a_j^l) = sign(a_j^l − θ_w)    (19)

Here, φ(a_j^l) is a scalar function of the post-synaptic activity a_j^l that changes sign at a value θ_w of the output called 'the modification threshold':

    φ(a_j^l) < 0 for a_j^l < θ_w,
    φ(a_j^l) > 0 for a_j^l > θ_w    (20)

The term −ε · w_ji^l produces a uniform decay over all junctions. This does not affect the behaviour of the system if ε is small enough. In [3] it was proposed to use the average output activation ā_j^l = Σ_i w_ji^l · ā_i^{l−1}, which averages over a distribution of inputs a_i^{l−1}, to determine the threshold θ_w:

    θ_w = (ā_j^l / a_0)^p · ā_j^l    (21)

where a_0 and p are hyper-parameters.

Although the quantifications of Hebbian theory and BCM theory can now be used to mathematically describe the lifetime of a connection strength w_ji^l, they cannot replace SGD directly. Just implementing the Hebbian rules of equations (16) and (17) in an ANN will cause the weights to update accordingly, but the network will not learn. After training, the imply Hebbian learning rule gains an accuracy of 0.09872 with a corresponding loss of
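Equation (17) can be checked against the imply-style column of Table 2 with a short sketch; as before, the constant c = 0.1 and the starting weight 0.5 are arbitrary illustrative values:

```python
# Equation (17): w(t+1) = (2 * a_pre * a_post - a_pre) * c + w(t)
# Decay happens only in the (pre=1, post=0) case, mirroring (p => q) = 0.
def hebb_imply(a_pre, a_post, w, c=0.1):
    return (2 * a_pre * a_post - a_pre) * c + w

w = 0.5
for pre, post in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pre, post, hebb_imply(pre, post, w))
# (0,0) and (0,1) leave w unchanged; (1,0) decays it; (1,1) grows it
```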

7. THE HEBBIAN PLASTICITY TERM (HPT) METHOD
In the HPT method, a Hebbian plasticity term (HPT) ϕ_ji^l is added to each weight during the feedforward, so that the effective weight becomes w_new = w_ji^l + ϕ_ji^l and equation (1) becomes:

    z_j^l = Σ_i ((w_ji^l + ϕ_ji^l) · a_i^{l−1}) + b_j^l    (22)

A literal quote from Hebb's postulate is: "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased" [16]. Important here is the word repeatedly. This must mean that the synaptic efficiency is based on a multitude of pre- and post-activations from the past: the synapse grows based on the history of the two cells that it connects. So, somehow, the weight of a connection w_ji^l should depend on some Hebbian function of the history of both the pre- and post-synaptic activation:

    ϕ_ji^l = f((a_i^{l−1})_avg, (a_j^l)_avg)    (23)

Here, (a_i^{l−1})_avg and (a_j^l)_avg are both a moving average over all previous training examples that were already fed forward.

To explain this idea, consider the current feedforward of a training example to be at time t; the feedforward of the previous training example would then be at time t−1. Using the post-synaptic activation a_j^l(t) to calculate the HPT ϕ_ji^l would not be possible at time t in the HPT method (22), because it is impossible for the HPT ϕ_ji^l to depend on the post-synaptic activation a_j^l: information about a_j^l is not yet available when calculating z_j^l. This is a circular argument. The weight cannot depend on the post-synaptic activation, as the post-synaptic activation is in turn dependent on that weight, resulting in an infinite circular dependence. However, the post-synaptic activation of a previously fed-forward training example can be used to compute the HPT ϕ_ji^l, since the feedforward of the training example at time t−1, t−2, ..., or t−n (where n is the total number of completed feedforwards) is already completed and therefore already known at time t.

From this, an average post-synaptic activation (a_j^l)_avg = (a_j^l(t−1) + a_j^l(t−2) + ... + a_j^l(t−n)) / n over the previous training examples can be computed. After every feedforward of a new training example, the average (a_j^l)_avg changes slightly, making it a moving average over all previously fed-forward training examples.

As explained in section 6, there are several Hebbian rules that could be chosen for f in equation (23), based on the configuration of conditions you choose. Therefore, two variants of the HPT method will be proposed. In the first variant, the HPT ϕ is based on a competitive Hebbian learning rule and adheres to the conditions of equation (16) displayed in Table 2. This version will be called HPT-Competitive:

    f = ((a_i^{l−1})_avg · (a_j^l)_avg − |(a_i^{l−1})_avg − (a_j^l)_avg|) · c    (24)

In the second variant, the HPT ϕ is based on an imply Hebbian learning rule and adheres to the conditions of equation (17) displayed in Table 2. This version will be called HPT-Imply:

    f = (2 · (a_i^{l−1})_avg · (a_j^l)_avg − (a_i^{l−1})_avg) · c    (25)

In both equations (24) and (25), c can take on any real nonnegative value. Increasing c means increasing the influence of the Hebbian plasticity term ϕ on w_new in equation (22); in other words, it means increasing the influence of the Hebbian plasticity term on the feedforward. An increase in c also means an increase of the influence of the HPT ϕ on the weight update ∂C/∂w_ji^l and bias update ∂C/∂b_j^l during backpropagation. According to equations (10), (11) and (12), both updates depend on ∂C/∂a_j^l, which in turn depends on ∂z_k^{l+1}/∂a_j^l, and for the HPT method:

    ∂z_k^{l+1}/∂a_j^l = w_kj^{l+1} + ϕ_kj^{l+1}

It is important that the HPT ϕ is located exactly where it is in the feedforward equation (22). Locating it elsewhere would result in undesired behaviour. Consider z_j^l = Σ_i w_ji^l · (a_i^{l−1} + ϕ) + b_j^l. If w_ji^l < 0 ∧ ϕ_ji^l < 0 (where the weight should become smaller according to the Hebbian function f), then the resulting z_j^l (which, according to equation (1), in turn affects the output activation a_j^l) becomes larger, since multiplying two negatives yields a positive number: (w_ji^l · ϕ_ji^l) > 0. However, the reason why ϕ obtained a negative value in the first place was to decrease the efficiency of the synapse, so that the output activation a_j^l would decrease. Also, when w_ji^l < 0 ∧ ϕ_ji^l > 0, then (w_ji^l · ϕ_ji^l) < 0 will make z_j^l smaller, which is again not desired, since z_j^l should become bigger when ϕ > 0.

8. IMPLEMENTATION
Algorithm 1 shows the pseudocode for the algorithm that was created as the implementation for backpropagating in the ANN. It was created using Python 3.6.2, with the libraries Numpy for vector calculations and Matplotlib.pyplot for displaying results graphically. Equation (14) is incorporated here on line 6. This code is used for the HPT method, where modifications to standard SGD are marked in cyan.

Algorithm 1: Backpropagation with HPT method
   // Initialize ∂C/∂b^L and ∂C/∂w^L for the last layer
   1  ∂C/∂b^L ← σ′(z^L) ⊙ (a^L − y)
   2  ∂C/∂w^L ← ∂C/∂b^L · (a^{L−1})^T
   // Add both to the lists
   3  gradient_bias ← [∂C/∂b^L]
   4  gradient_weights ← [∂C/∂w^L]
   // Backpropagate through all layers, starting from the one-but-last
   // layer, since the last layer was already added at initialization
   5  for k ← 1 to N do
   6      ∂C/∂b^L ← σ′(z^{L−k}) ⊙ (W^{L−k+1} + ϕ^{L−k+1})^T · ∂C/∂b^L
   7      ∂C/∂w^L ← ∂C/∂b^L · (a^{L−k−1})^T
   8      gradient_bias.append(∂C/∂b^L)
   9      gradient_weights.append(∂C/∂w^L)
   10 end
   // Reverse the lists so they contain vectors from first to last
   // layer instead of from last to first layer
   11 gradient_bias.reverse()
   12 gradient_weights.reverse()

Important things to note are the following. All terms are considered matrices. A vector is a column-vector (a matrix with one column and multiple rows) unless referred to otherwise, like a row-vector.
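A minimal NumPy rendering of Algorithm 1's backward pass could look as follows; the [4, 3, 2] layer sizes are hypothetical, and the plasticity terms ϕ are zero-initialized here for simplicity (so this sketch reduces to standard BP gradients):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1 - s)

rng = np.random.default_rng(2)
sizes = [4, 3, 2]                                   # hypothetical layer sizes
Ws = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [rng.standard_normal((m, 1)) for m in sizes[1:]]
phis = [np.zeros_like(W) for W in Ws]               # HPT terms, equation (22)

# Feedforward with the HPT: z^l = (W^l + phi^l) a^{l-1} + b^l
a = rng.random((sizes[0], 1))
acts, zs = [a], []
for W, b, phi in zip(Ws, bs, phis):
    z = (W + phi) @ a + b
    a = sigmoid(z)
    acts.append(a)
    zs.append(z)

# Backward pass of Algorithm 1
y = np.array([[1.0], [0.0]])
delta = sigmoid_prime(zs[-1]) * (acts[-1] - y)      # line 1
grad_b = [delta]
grad_W = [delta @ acts[-2].T]                       # line 2
for k in range(2, len(sizes)):                      # lines 5-9
    delta = sigmoid_prime(zs[-k]) * ((Ws[-k + 1] + phis[-k + 1]).T @ delta)
    grad_b.append(delta)
    grad_W.append(delta @ acts[-k - 1].T)
grad_b.reverse()                                    # line 11
grad_W.reverse()                                    # line 12

print([g.shape for g in grad_W])  # [(3, 4), (2, 3)]: one gradient per layer
```

In the actual HPT method the phi matrices would be filled via equation (24) or (25) before each feedforward, rather than left at zero.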

ϕ^l in Algorithm 1 is calculated according to equation (24) or (25) for HPT-Competitive and HPT-Imply, respectively. Both variants of the HPT method need the moving averages of activations (a_i^{l−1})_avg and (a_j^l)_avg for every layer in the network. This list of per-layer moving-average activation vectors is initialized as a list avg_acts. After every mini-batch, the average activation of the mini-batch for every (per-layer) vector, located in the list batch_avg_acts, is added to every (per-layer) vector located in avg_acts. This is displayed in Algorithm 2.

Algorithm 2: avg_acts calculation
   // Update moving average avg_acts after a batch
   1 for act_vec ∈ batch_avg_acts do
   2     avg_acts[act_vec] += batch_avg_acts[act_vec]
   3 end
   4 avg_acts = [history / 2 for history in avg_acts]

Interesting to note is that the moving average of all previous activations counts as much as the newly added batch-average of activations, since the sum is divided by 2.

Figure 4. HPT-Competitive method for different values of c: (a) loss function, (b) accuracy
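Algorithm 2's update can be sketched directly in Python; the layer sizes and batch averages below are made-up values, chosen so that the halving behaviour is visible:

```python
import numpy as np

# Equal weighting of history and newest batch, as in Algorithm 2:
# new_avg = (history + batch_average) / 2
def update_avg_acts(avg_acts, batch_avg_acts):
    return [(hist + batch) / 2 for hist, batch in zip(avg_acts, batch_avg_acts)]

# Hypothetical per-layer moving averages for a 3-unit and a 2-unit layer
avg_acts = [np.zeros((3, 1)), np.zeros((2, 1))]

batch1 = [np.full((3, 1), 1.0), np.full((2, 1), 0.5)]
batch2 = [np.full((3, 1), 0.0), np.full((2, 1), 0.5)]

avg_acts = update_avg_acts(avg_acts, batch1)   # -> 0.5 and 0.25
avg_acts = update_avg_acts(avg_acts, batch2)   # -> 0.25 and 0.375

print(avg_acts[0][0, 0], avg_acts[1][0, 0])    # 0.25 0.375
```

Because every update halves the old history, each mini-batch's contribution decays geometrically, which is what gives the most recent activations the most influence.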

(a) Loss function (b) Accuracy that the moving average of all previous activations counts as much as the newly added batch-average of activations, since it is divided by 2. This way, the most recent activa- Figure 5. HPT-Imply method for different values of c tions in the network have the most influence on (a)avg and thus on ϕ. The source code for the complete project can be found on: https://github.com/matthijsruben/NeuralNets Table 3. Losses and Accuracy’s of the HPT method for different values of c 9. RESULTS HPT-Competitive HPT-Imply c Loss Accuracy Loss Accuracy 9.1 HPT method 0 0.01053 0.93537 0.01034 0.93675 In this section, the performances of novel methods pro- 0.1 0.01276 0.92770 0.01495 0.91547 posed in this paper will be shown and compared with the 0.5 0.04201 0.85137 0.49220 0.06995 performance of a standard ANN. All networks are trained 1 0.09267 0.48920 0.43276 0.07898 on the MNIST dataset [19], because of its large amount 5 0.10206 0.08398 0.37599 0.09192 of training examples. This dataset contains 28x28 pixel 10 0.33938 0.09112 0.38906 0.09083 (greyscale valued: 0-255) images of handwritten digits. The benchmark ANN from which the different methods will be evaluated is a four-layer deep feed-forward ANN from the HPT ϕ. with 784, 30, 20, and 10 neurons in each respective layer. Since neurons in this network are activated with a sigmoid It can be concluded that the overall performance of stan- function, the 784 greyscale pixel values of each image are dard SGD is better after 20 epochs than the overall perfor- downscaled to values between 0 and 1, and form the input mance of the HPT method. Perhaps more epochs will even 0 0 0 the performance between standard SGD and the HPT layer a1,a2,...,a784 of the network. The output layer con- tains ten neurons, since there are ten classes (10 different method. digits). 
The weights and biases of the network are initialized as random variables drawn from a standard normal distribution. The loss is calculated using a mean squared error function, but for backpropagation equation (6) is used. The network is trained for 20 epochs using mini-batch stochastic gradient descent with a batch size of 28 training examples and a learning rate of 0.5.

In Figure 4, the results of the HPT-Competitive method are shown. For larger values of c in equation (24), the HPT ϕ gains a larger influence over w_new, but this is observed to slow down learning.

In Figure 5, the results of the HPT-Imply method are shown. Again, larger values of c are observed to slow down learning. The slowdown of learning in the HPT-Imply method even shows to be far more susceptible to an increase of c than the slowdown of the HPT-Competitive method.

9.2 Biologically plausible synaptic behaviour
From the results, it can be observed that the HPT method trains more slowly than standard SGD. However, the synaptic change of a (randomly chosen¹) connection between two neurons in the network behaves more biologically naturally when using the HPT method (Competitive), as can be seen by comparing Figures 6a and 6b. Both graphs show a scatter plot, where each dot represents the value of an observed weight update of a connection between two neurons (pre- and post-synaptic) in the ANN after training for 5 epochs. The red dots show the positive weight updates and the blue dots the negative weight updates. In other words, a red dot represents a connection reinforcement and a blue dot represents a connection weakening.

In Figure 6a, there is no clear relation visible between the activation of the pre-synaptic neuron, the activation of the post-synaptic neuron, and the synaptic change (the observed weight update).
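The bookkeeping behind these scatter plots can be sketched in Python. The `record` helper and the sample numbers below are purely illustrative, not taken from the project's code; during training, the pre-activation, post-activation, and weight update of the tracked connection would be logged after every update:

```python
# Collect (pre-activation, post-activation, weight update) triples for one
# tracked connection; positive updates are plotted red (reinforcement),
# negative updates blue (weakening).
updates = []

def record(a_pre, a_post, delta_w):
    colour = "red" if delta_w > 0 else "blue"
    updates.append((a_pre, a_post, delta_w, colour))

# Hypothetical observations for the tracked connection:
record(0.9, 0.8, +0.02)   # related activations: connection reinforced
record(0.7, 0.1, -0.01)   # unrelated activations: connection weakened
print([u[3] for u in updates])  # ['red', 'blue']
```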

The results are summarized in Table 3 for different values of c. For c = 0, the HPT ϕ = 0 and the new weight equals the standard weight: w_new = w^l_ji + 0. Therefore, standard SGD is applied when c = 0, because there is no influence from the HPT ϕ.

¹To generate these results, the activation of the first neuron of the second layer and the activation of the first neuron of the third layer were chosen, as well as the weight updates of the connection between those neurons.
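How c gates the Hebbian contribution can be illustrated with a toy update. The function below is only a sketch, not the project's code: `hebbian_term` stands in for the competitive or imply rule of equations (24)/(25), and the way it is combined with the SGD step is deliberately simplified. It shows only the property discussed above, that c = 0 collapses the rule to plain SGD:

```python
def hpt_weight_update(w, sgd_step, hebbian_term, c):
    """Sketch of the HPT idea: the standard SGD update plus the HPT phi.

    `hebbian_term` is a stand-in for the competitive or imply rule of
    equations (24)/(25), which are not reproduced here; phi is that term
    scaled by c, so for c = 0 the update reduces to standard SGD.
    """
    phi = c * hebbian_term          # c gates the Hebbian plasticity term
    return (w - sgd_step) + phi     # simplified combination, for illustration

print(hpt_weight_update(w=1.0, sgd_step=0.25, hebbian_term=0.5, c=0.0))  # 0.75, identical to the plain SGD step
print(hpt_weight_update(w=1.0, sgd_step=0.25, hebbian_term=0.5, c=0.5))  # 1.0, phi pulls the weight further
```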

Figure 6. Synaptic change of the connection between two neurons in the ANN: (a) using standard SGD, (b) using the HPT method.

However, in Figure 6b, a pattern is observed. There are two distinctive groups: the group circled in red and the group circled in blue. The group circled in blue contains mostly blue dots, and the post-synaptic activations are low whatever the activation of the pre-synaptic neuron: no influence is observed from the pre- on the post-synaptic activation. So in most of the cases, the synaptic connection weakens if the pre- and post-activations are unrelated. The group circled in red contains mostly red dots. Also, some kind of linear dependence between the pre-synaptic activation and the post-synaptic activation can be recognized visually, observed by the diagonal shape of the red group in the 'pre' and 'post' dimensions of Figure 6b. This means that in most of the cases the synaptic connection strengthens if the pre- and post-activations are related. This is a desired behaviour from a Hebbian (and therefore biological) point of view, especially relating to its postulate "Cells that fire together, wire together".

These conclusions are based solely on visual observation of the graphs in Figure 6, since no statistical significance test has been performed on these results. However, they do provide an intuition as to why the HPT method causes synapses to behave more biologically plausibly.

10. CONCLUSIONS
In order to close the gap between artificial and biological neural networks, this paper aimed to incorporate mathematical translations of Hebbian theory into SGD. Just like the authors of [3] found, a limitation of Hebbian theory is that it only describes conditions for connection strengthening. Therefore, conditions should be added that allow for weight decay. Consequently, two variants of a quantification of Hebbian theory are proposed: the competitive Hebbian learning rule, equation (16), and the imply Hebbian learning rule, equation (17). Even though they are more biologically plausible, both variants do not show good learning behaviour on their own.

The HPT method that is proposed in this paper modifies the standard training algorithm SGD that is commonly used for ANNs. Based on the proposed variants of Hebbian learning, two variants of the HPT method are proposed: HPT-Competitive and HPT-Imply. By including the HPT ϕ in the feedforward equation (1), both the feedforward and the backpropagation are shown to be influenced by Hebbian learning. Because of this, not only the behaviour but also the way connections evolve (plasticity) is affected by Hebbian learning. This influence can be extended and controlled to a certain degree by the parameter c, as shown in equations (24) and (25).

Increasing the parameter c of the HPT method is observed to slow down learning. However, the performance of the HPT-Competitive method is less susceptible to an increase of c than that of the HPT-Imply method, because the HPT-Competitive method shows better performance than the HPT-Imply method for similar values of c. From this it can be concluded that the type of Hebbian learning rule used in the HPT method has a major influence on the performance of the ANN.

The HPT method makes connections behave and evolve more biologically realistically, at a slight loss in performance. When the activations of the pre- and post-synaptic neurons are visually observed to be related, the connection between those neurons is reinforced more often than weakened. When the activations of the pre- and post-synaptic neurons are visually observed to be unrelated, the connection is weakened more often than reinforced. Hence, there is a trade-off between performance and biologically realistic behaviour in the deep feedforward ANN that was used.

11. FUTURE WORK
As explored in [12], there are many more variants of Hebbian learning, and mathematical descriptions can vary endlessly. Future research could compare multiple different Hebbian learning rules and their effect on the performance of the HPT method, since the rules show to be varyingly sensitive to the size of the parameter c.

Also, avg_acts from section 8 is currently calculated based on an unevenly weighted moving average, where the most recent activations have the most influence. It would be interesting to see an implementation that compares the results of other types of moving averages, such as one that is evenly weighted or exponentially weighted.

12. ACKNOWLEDGMENTS
I want to thank my supervisor Elena Mocanu, who guided me during my research, provided me with much helpful and extensive feedback, and always wanted to get the best out of me.
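As a starting point for such a comparison, the three moving-average schemes mentioned in the future-work discussion can be sketched side by side. The function names and the smoothing factor `alpha` are illustrative choices, not part of the project:

```python
def halving_average(history, new):
    """The current scheme: history and newest value weighted equally."""
    return (history + new) / 2

def even_average(history, new, n):
    """Evenly weighted: all n observations so far count the same (running mean)."""
    return history + (new - history) / n

def exponential_average(history, new, alpha=0.1):
    """Exponentially weighted with smoothing factor alpha."""
    return alpha * new + (1 - alpha) * history

# Feed the same sequence into each scheme: a single high activation
# followed by zeros, to see how quickly the spike is forgotten.
values = [1.0, 0.0, 0.0, 0.0]
m_half = m_even = m_exp = values[0]
for n, v in enumerate(values[1:], start=2):
    m_half = halving_average(m_half, v)
    m_even = even_average(m_even, v, n)
    m_exp = exponential_average(m_exp, v)
print(round(m_half, 4), round(m_even, 4), round(m_exp, 4))  # 0.125 0.25 0.729
```

The halving scheme forgets the spike fastest, the running mean slower, and a mildly smoothed exponential average slowest, which is exactly the kind of difference such a comparison would measure.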

13. REFERENCES
[1] 3BLUE1BROWN. Backpropagation calculus | Deep learning, chapter 4 - YouTube, 11 2017.
[2] Yoshua Bengio, Thomas Mesnard, Asja Fischer, Saizheng Zhang, and Yuhuai Wu. STDP-compatible approximation of backpropagation in an energy-based model. Neural Computation, 29(3):555–577, 3 2017.
[3] Elie L. Bienenstock, Leon N. Cooper, and Paul W. Munro. Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex. Journal of Neuroscience, 2(1):32–48, 1982.
[4] Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer (India) Private Limited, New York, 2006.
[5] Rodney Brooks, Demis Hassabis, Dennis Bray, and Amnon Shashua. Turing centenary: Is the brain a good model for machine intelligence? Nature, 482(7386):462–463, 2 2012.
[6] Natalia Caporale and Yang Dan. Spike timing-dependent plasticity: A Hebbian learning rule. Annual Review of Neuroscience, 31(1):25–46, 7 2008.
[7] Ronan Collobert and Jason Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, pages 160–167, 2008.
[8] Haskell B. Curry. The method of steepest descent for non-linear minimization problems. Quarterly of Applied Mathematics, 2(3):258–261, 10 1944.
[9] Li Deng, Geoffrey Hinton, and Brian Kingsbury. New types of deep neural network learning for speech recognition and related applications: An overview. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pages 8599–8603, 10 2013.
[10] Li Deng and Dong Yu. Deep learning: Methods and applications. Foundations and Trends in Signal Processing, 7(3-4):197–387, 2014.
[11] Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, and Trevor Darrell. Long-term recurrent convolutional networks for visual recognition and description. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4):677–691, 4 2017.
[12] Wulfram Gerstner and Werner M. Kistler. Mathematical formulations of Hebbian learning. Biological Cybernetics, 87(5-6):404–415, 2002.
[13] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
[14] Jordan Guerguiev, Timothy P. Lillicrap, and Blake A. Richards. Towards deep learning with segregated dendrites. eLife, 6, 12 2017.
[15] Prashanth Gurunath Shivakumar and Panayiotis Georgiou. Transfer learning from adult to children for speech recognition: Evaluation, analysis and recommendations. Computer Speech and Language, 63, 9 2020.
[16] Donald Olding Hebb. The Organization of Behavior. Wiley & Sons, New York, 1949.
[17] Eric R. Kandel. Principles of Neural Science. McGraw-Hill Education, 5 edition, 2012.
[18] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 5 2015.
[19] Yann LeCun, Corinna Cortes, and Chris Burges. MNIST handwritten digit database.
[20] Timothy P. Lillicrap, Adam Santoro, Luke Marris, Colin J. Akerman, and Geoffrey Hinton. Backpropagation and the brain. Nature Reviews Neuroscience, pages 1–12, 4 2020.
[21] Weibo Liu, Zidong Wang, Xiaohui Liu, Nianyin Zeng, Yurong Liu, and Fuad E. Alsaadi. A survey of deep neural network architectures and their applications. Neurocomputing, 234:11–26, 4 2017.
[22] Siegrid Löwel and Wolf Singer. Selection of intrinsic horizontal connections in the visual cortex by correlated neuronal activity. Science, 255(5041):209–212, 1 1992.
[23] Hesham Mostafa, Vishwajith Ramesh, and Gert Cauwenberghs. Deep supervised learning using local errors. Frontiers in Neuroscience, 12, 11 2017.
[24] Neuroscientifically Challenged. 2-Minute Neuroscience: Synaptic Transmission - YouTube, 7 2014.
[25] Michael A. Nielsen. Neural Networks and Deep Learning, 2015.
[26] Randall C. O'Reilly. Biologically plausible error-driven learning using local activation differences: The generalized recirculation algorithm. Neural Computation, 8(5):895–938, 7 1996.
[27] Osmosis. Neuron action potential - physiology - YouTube, 12 2016.
[28] Sushant Patrikar. Batch, Mini Batch & Stochastic Gradient Descent - Towards Data Science, 10 2019.
[29] João Sacramento, Rui Ponte Costa, Yoshua Bengio, and Walter Senn. Dendritic cortical microcircuits approximate the backpropagation algorithm. Technical report, 2018.
[30] Benjamin Scellier and Yoshua Bengio. Equilibrium propagation: Bridging the gap between energy-based models and backpropagation. Frontiers in Computational Neuroscience, 11:24, 5 2017.
[31] Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 1 2015.
[32] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 2015.
[33] Ioana Sporea and André Grüning. Supervised learning in multilayer spiking neural networks. Neural Computation, 25(2):473–509, 2013.
[34] Roland A. Suri. A computational framework for cortical learning. Biological Cybernetics, 90:400–409, 2004.
[35] George B. Thomas, Joel Hass, and Maurice D. Weir. Thomas' Calculus. Pearson Addison Wesley, 13 edition, 2 2014.
[36] James C. R. Whittington and Rafal Bogacz. Theories of error back-propagation in the brain. Trends in Cognitive Sciences, 23(3), 2019.
[37] James C. R. Whittington and Rafal Bogacz. An approximation of the error backpropagation algorithm in a predictive coding network with local Hebbian synaptic plasticity. Neural Computation, 29(5):1229–1262, 5 2017.
