PRACTICAL APPLICATIONS OF BIOLOGICAL REALISM IN ARTIFICIAL NEURAL NETWORKS

by

TEGAN MAHARAJ

A thesis submitted to the

Department of Computer Science

in conformity with the requirements for

the degree of Master of Science

Bishop’s University

Sherbrooke, Quebec, Canada

November 2015

Copyright © Tegan Maharaj, 2015

Abstract

Over the last few decades, developments in structure and training have made artificial neural networks (ANNs) state-of-the-art for many applications, such as self-driving cars, image and facial recognition, and speech recognition. Some developments (such as error backpropagation) have no obvious biological motivation, while others (such as the topology of convolutional neural networks) are directly modeled on biological systems. We review biological neural networks with a phenomenological approach to establish a framework for intuitively comparing artificial and biological neural networks based on ‘information’ and ‘actions’ which can be either positive or negative.

Based on this framework, we suggest interpreting the activation of a neuron as an amount of neurotransmitter, and weights as a number of receptors, and draw a connection between Spike-Timing-Dependent Plasticity (STDP) and backpropagation. We apply our framework to explain the biological plausibility of several empirically successful innovations in ANNs, and to propose a novel activation function which employs negative rectification. We demonstrate that a network with such negative rectifier units (NegLUs) can represent temporally-inverted STDP (ti-STDP). NegLUs are tested in a convolutional architecture on the MNIST and CIFAR benchmark datasets, where they are found to help prevent overfitting, consistent with the hypothesized role of ti-STDP in deep neural networks. We suggest applications of these networks, and extensions of our model to form Min-Maxout networks and learn activation functions.


Acknowledgements

I would first like to thank my supervisor, L. Bentabet, for his good-natured humour, encouragement, and invaluable intuitive explanations. His pragmatic guidance fostered independence and critical thinking, and kept me grounded in a field of exciting and sometimes overwhelming complexity. I also owe many thanks to N. Khouzam and S. Bruda, for their excellent teaching, open doors, and years of practical and academic advice.

During my thesis, I had the wonderful opportunity of being offered a Mitacs research internship at iPerceptions Inc., which enabled me to explore data science and machine learning in the real world. I am very grateful to M. Butler, L. Cochrane, and M. Tremblay for making this valuable experience rewarding and enjoyable. I would also like to thank J. Fredette, E. Pinsonnault, and S. Cote at the Research Office, for helping make this and many other great opportunities accessible.

I am forever grateful to S. Stoddard and my friends and co-workers at the ITS Helpdesk, without whom I might never have discovered computer science, and equally to my friends and fellow students in J9 and J113, for making it feel like home.

Finally, I could never offer sufficient thanks to K. Gill, G. Guest, S. Matheson, and my parents, for the years of love and support that made everything possible.


Contents

Abstract
Acknowledgements
Contents
List of Tables
List of Figures and Illustrations
Chapter 1: Introduction
1.1 Motivation
1.2 Contributions of this work
1.3 Publications and software
1.4 Structure of the thesis
Chapter 2: Background
2.1 Intelligence, learning, and memory
2.2 Neural network basics
2.3 Biological neural networks
2.3.1 Biological neurons
2.3.2 Biological neural networks
2.3.3 Biochemical systems capable of learning
2.3.4 Biological learning and memory
2.3.5 Types of learning
2.4 Artificial neural networks
2.4.1 Artificial Intelligence
2.4.2 Machine learning
2.4.3 Artificial neurons
2.4.4 Artificial neural networks
2.4.5 Deep learning
2.4.6 Deep belief networks
2.4.7 Convolutional neural networks (convnets)
2.4.8 Rectified linear units
2.4.9 Dropout and DropConnect
2.4.10 Maxout
Chapter 3: Comparing biological and artificial neural networks
3.1 Introduction
3.1.1 Traditional intuition
3.1.2 Inhibition and excitation
3.1.3 Accidental bias
3.2 Proposed biological intuition for artificial neurons
3.3 Spike-timing dependent plasticity terminology
3.4 Related work
3.4.1 Biological influence in artificial neural networks
3.4.2 Other interpretations of the activation function
3.5 Application of proposed biological intuition
3.5.1 Hyperbolic tangent and 0-mean centred activation functions
3.5.2 Boltzmann machines and deep belief networks (DBNs)
3.5.3 Convolutional neural networks
3.5.4 ReLUs
3.5.5 Dropout, DropConnect, and Maxout
3.5.6 Input normalization
3.5.7 Unsupervised pre-training
Chapter 4: Negative rectifiers
4.1 Introduction
4.2 Formal definition
4.2.1 Universal approximation
4.3 Related work
4.3.1 Dynamic field model of vision
4.3.2 Modified ReLUs
4.3.3 Other non-monotonic activation functions
4.4 Materials and methods
4.4.1 The MNIST dataset
4.4.2 The CIFAR-10 dataset
4.4.3 Network architectures for MNIST
4.4.4 Network architectures for CIFAR-10
4.5 Results and discussion
4.5.1 MNIST
4.5.2 CIFAR-10
4.6 Discussion
Chapter 5: Conclusions
5.1 Summary discussion
5.2 Future directions
References


List of Tables

Table 1: Summary of possible causes and effects in spike-timing dependent plasticity (STDP)
Table 2: Conventional view of interpretations of neural network properties
Table 3: Suggested biological interpretations of neural network properties
Table 4: Summary of possible effects at a synapse when weights and activations can be negative
Table 5: Proposed terminology for causes and effects in spike-timing dependent plasticity (STDP)
Table 6: Performance (error) of MixLU-LeNet networks with increasing numbers of NegLUs
Table 7: Test and training error results for various MixLU-LeNets, showing that a network with 4% NegLUs achieves the largest increase in training error for the smallest increase in test error
Table 8: Error of the ReLU-LeNet and MixLU-LeNet with 1% NegLUs for the first 9 training epochs, highlighting that the MixLU net finds low-error regions faster, but then leaves them
Table 9: Performance (accuracy) of MixLU-LeNet5 networks with increasing numbers of NegLUs
Table 10: Test, training, and difference of test and training accuracies for various MixLU-LeNet5s, showing that the network with 0.5% NegLUs was able to achieve a reduction in training accuracy with no drop in test accuracy


List of Figures and Illustrations

Figure 1: A cartoon biological neuron. Modified from (Health 2015)
Figure 2: The categorization of neurons based on axon/dendrite configuration is shown at left, modified from (HowStuffWorks 2001), while a more realistic continuum of characteristics, and therefore, connectivity, is shown at right, modified from (Stuffelbeam 2008)
Figure 3: Chemical transmission of an action potential. From (Cummings 2006)
Figure 4: Hypothesized evolution of the modern synapse, from (Ryan and Grant 2009)
Figure 5: A ‘network’ of two input perceptrons, x1 and x2, and one output, y, can perform a linear separation between 0s and 1s (an AND gate) if it has learned appropriate weights
Figure 6: An artificial neural network with 1 hidden layer set up to classify the digits 0-9 in a 784-pixel image
Figure 7: From (Goodfellow, et al. 2013), demonstrating how a maxout unit can act as a rectifier or absolute value function, or approximate a quadratic function. The authors note this is only a 2-D case for illustration; the concept extends equally well to arbitrary convex functions.
Figure 8: The logistic sigmoid, hyperbolic tangent, rectified linear, and negative-linear activation functions
Figure 9: Derivative of the negative-linear rectifier
Figure 10: A random sample of MNIST training examples
Figure 11: CIFAR-10 classes with a random sample of images for each
Figure 12: From left to right, top to bottom; performance on MNIST of MixLU-LeNet networks with 1, 2, 3, 4, 5 and 6% NegLUs in the fully-connected layer. Note that training error increases while validation error does not until ~6%.
Figure 13: From left to right, top to bottom; performance on CIFAR-10 of MixLU-LeNet5 networks with 0, 0.25, 0.5, 1, 2, and 3% NegLUs in the fully-connected layers. Note that training error approaches test error quite closely without a large decrease in error, up to 2% NegLUs


Chapter 1: Introduction

1.1 Motivation

What is intelligence, and how does it work? The answer to this question is long, multifaceted, and as yet incomplete, built over the years from research in philosophy, biology, chemistry, psychology, mathematics, statistics, computer science, information theory, and many other disciplines.

Insects and invertebrates perform complex motor movements, search for and remember paths to food resources, communicate socially, and build elaborate structures. Single-celled organisms can navigate, find food, learn, and remember. The average human can read many textbooks worth of information in a lifetime, create beautiful things, maintain complex social relationships, and (sometimes) even do their own taxes.

Our understanding of the physical processes which give rise to these behaviours is growing every day, and has already allowed us to imitate many intelligent behaviors in useful technologies. From self-driving cars and drones to disease therapies and ecological conservation, it is an understatement to say there are many applications for artificial intelligence. Understanding robust problem- solving, selective filtering, long-term memory, self-directed learning, and creative pattern recognition in a practical, concrete way may seem an insurmountable task, but we have many of the building blocks in place.


This work reviews our current scientific understanding of intelligence, learning, and memory in biological systems as an aeronautical engineer looks at birds, bats, and flying insects – not with the intention of duplicating them exactly, but of taking functional inspiration and intuition to build better, more useful artificially intelligent systems.

1.2 Contributions of this work

Most of the empirical work on which the classic artificial neuron abstraction is based examines excitatory neurons. We summarize research on the role of inhibition, which we demonstrate is particularly relevant for deep, hierarchical networks.

It is generally stated that the activation of a neuron in an artificial neural network represents its firing rate. Although this idea is not without merit, we suggest that it is more accurate to think of the activation as a quantity, specifically of neurotransmitter, which as an input value is scaled by the number of receptors available for binding that neurotransmitter (i.e., the effective synaptic strength), and whose effect can be modified by the sign (inhibitory/excitatory). We demonstrate that thinking of the activation in this way allows for a more biologically accurate modeling of inhibitory and excitatory behaviours, which increases the representational capacity of artificial neurons.


We apply this line of thinking to introduce negative-rectified linear units (which we call NegLUs, in the fashion of ReLU for rectified linear unit), whose behaviour we use to investigate the utility of inhibitory, non-monotonic activation functions in deep networks. We show that they have the potential to reduce overfitting in convolutional architectures, and show how they allow the network to perform temporally-inverted anti-Hebbian learning. We suggest that this method, or generalizations of it that we call Min-Maxout networks and ConvAct networks, could be useful for exploring deep architectures.

1.3 Publications and software

A preliminary version of a neural network with activation-function modifiers imitating the behaviour dynamics of four neurotransmitters was tested on MNIST, and these results were presented in a poster:

Tegan Maharaj. Introducing “neurotransmitters” to an artificial neural network for modular concept learning and more accurate classification. Bishop’s University Research Week, 2014.

A new multi-type layer was created for the MATLAB package MatConvNet (Vedaldi and Lenc 2015), along with the files necessary for implementing negative-linear activations, and made publicly available at https://github.com/teganmaharaj/neglu


Two papers based on this work have been prepared for submission to peer-reviewed venues:

Tegan Maharaj. Activation as a quantity of neurotransmitter: Lessons learned from comparing biological and artificial neural networks. International Conference on Learning Representations (ICLR), 2016.

Tegan Maharaj. Negative rectifiers reduce overfitting by modeling temporally-inverted anti-Hebbian learning. Advances in Neural Information Processing Systems (NIPS), 2016.

1.4 Structure of the thesis

The background section first considers the nature of intelligence, and formal work which links intelligence to statistical physics and information theory. We present this section first, along with some basic terminology for both biological and artificial neural networks, in order to provide a context for thinking about the following sections.

We cover biological neurons and neural networks, with a focus on their evolution, leading us to consider the biochemical reactions which permit intelligent behaviours. We discuss intelligence, learning, and memory in biological systems, and the processes hypothesized to underlie these phenomena: long-term potentiation and long-term depression, via various forms of spike-timing-dependent plasticity (STDP).


We then look at artificial neural networks, in the context of machine learning and artificial intelligence, and examine recent, empirically successful innovations in deep networks such as rectified-linear activations, Maxout networks, Network-in-Network architectures, and DropConnect.

The next section contrasts biological and artificial neural networks, and suggests some modifications to the common biology-based intuition for describing artificial neural networks. Based on the framework developed by this comparison, we examine the biological relevance of the artificial neural network innovations discussed earlier. We describe how backpropagation in artificial neural networks relates to learning in biological neural networks, and, based on spike-timing-dependent plasticity research in biological networks, suggest that non-monotonic and monotonically-decreasing activation functions may be relevant for learning in deep artificial neural networks.

The last section tests this hypothesis by implementing a novel activation function employing negative-linear rectification in a deep convolutional network. This network is tested on two common image recognition benchmark datasets, MNIST and CIFAR. Results are presented, discussed, and compared with relevant other work. A generalization of NegLUs to Min-Maxout networks, and to learning activation functions in ConvAct networks, is suggested. We conclude with implications for useful deep neural networks and learning theory, with avenues for future research.


Chapter 2: Background

2.1 Intelligence, learning, and memory

The concept of intelligence is an intuitive one for most people, but has proved difficult to rigorously define without a full understanding of its mechanisms. The Oxford dictionary defines intelligence as “the ability to acquire and apply knowledge and skills” (2015), and Merriam-Webster as “the ability to learn or understand or to deal with new or trying situations: reason; also: the skilled use of reason: the ability to apply knowledge to manipulate one's environment or to think abstractly as measured by objective criteria (as tests) … the act of understanding: comprehension” (Merriam-Webster 2015). Many psychological and artificial intelligence definitions include aspects of planning and goal-directed behaviour, and also emphasize the definition of intelligence as dependent on its measurement, in the form of tests of some kind. Legg and Hutter review a variety of definitions in (Legg and Hutter 2006), and conclude by adopting the following definition: “Intelligence measures an agent’s ability to achieve goals in a wide range of environments”. The idea of “intelligence as compression”, or the ability to encode information in a cost-efficient way, without (or with minimal) loss of information, has also been suggested as a definition of artificial intelligence (Legg and Hutter 2006) (Dowe, Hernandez-Orallo and Das 2011). Relating these two definitions would suggest that the ability of a system to achieve goals in a wide range of environments can be measured by the extent to which it is able to efficiently compress information.

As noted by Legg and Hutter, these definitions leave learning, memory, consciousness, and adaptation as implicit components (or mechanisms) of intelligence. This doesn’t seem to help us much in actually making useful intelligent systems, but this notion of intelligence as involving agents acting to minimize energy is useful because it gives us a natural connection to the powerful language of statistical physics and thermodynamics. This language describes overall or emergent phenomena of a system without needing to know too much about the states of individual particles in the system, simply based on the concept that entropy increases over time. A full review of these connections is outside the scope of this work, but it can be instructive to keep in mind the idea of intelligence as a way to minimize energy expended while maximizing internal order. This has a nice connection to the evolution of biological networks, and also to the optimization-based concept of minimizing a cost function, and sets the stage for viewing biological and artificial neural networks as different implementations of the same behaviours. For an introduction to the concepts of statistical physics in the context of artificial neural networks, see (Engel and Broeck 2001) (Hopfield 1982), and for some possible applications to biological systems, see (England 2013).

Before proceeding with a discussion of learning and memory, we will first look at the structures which appear to make it possible – neural networks.


2.2 Neural network basics

Over the years many criticisms have been leveled at researchers in artificial intelligence for using biological terms loosely or incorrectly. Artificial intelligence research has also weathered several “AI Winters” 1 with subsequent revivals, each complete with new terminology to shed negative perceptions of the past, and somewhat independently of AI research in general, artificial neural networks more specifically have fallen in and out of favour and practical importance.

Research on biological intelligence comes from a mix of psychology, computational neuroscience, sociology, evolutionary biology, and other fields, and often considers many factors other than neurons. These factors include the influence of other cell types such as glia; time; temperature; chemical gradients and other physiological phenomena; the macroscopic organization of the brain into regions and hemispheres; genetics; and much more.

Thus the term ‘neural network’ is outdated and somewhat uncomfortable for both disciplines. However, its use is entrenched, particularly in computer science, and it does lend itself to intuitive explanations and connections with network theory. With this in mind, this work defines a neural network as a network whose interconnected nodes (called neurons or units) each perform some small computation on an input or inputs, which allows the network as a whole to perform some emergent function (such as classification, or memory).

1 “AI Winters” are periods of time where artificial intelligence research was unpopular after falling short of the often-fantastical predictions of its capabilities. For a brief history of artificial intelligence research, see (Buchanan 2006)

For a more detailed treatment of complex networks, see (Albert and Barabasi 2002). This work distinguishes biological neural networks, which are networks of biological neurons that have evolved in living creatures, from artificial neural networks, a machine learning tool built on an abstracted neuron. We favour the somewhat archaic use of “artificial neural network” over the more common “neural network” in order to make this distinction, and maintain the term “neural network” as an origin- and application-agnostic term which encompasses both.

2.3 Biological neural networks

2.3.1 Biological neurons

The nervous system is composed of primarily two cell types: neurons, which are specialized for establishing and passing on electrochemical signals, and glial cells. Glial cells have classically been considered as a passive substrate for the more active work of neurons. This view is changing, with recent work recognizing the importance of glial cells in modulating neuronal behaviour – see (Fields and Stevens-Graham 2002) (Magistretti 2006) (Verkhratsky 2006). Glial cells and neurons differentiate from a common cell type early in development but share a lot of morphological characteristics. Glia are specialized for modulating the neuronal environment, providing nutrients, cleaning up intracellular particles, insulating axons, changing ion gradients, and connecting far-apart neurons (Kandel, et al. 2012). For simplicity, this work focuses on neurons as the main unit of biological neural networks, but aims to capture some of the behaviour of glia in the behaviour of abstracted biological neurons.

Like other cells, neurons have a cell body (the main part, which performs the functions that keep the cell alive) with a nucleus (where DNA, the genetic information, is stored). Although neurons vary widely in their morphology, most have an axon (a long, insulated part for propagating signals over longer distances) and dendrites (branched protrusions) which meet the branches of other cells to form synapses – see Figure 1.

Figure 1: A cartoon biological neuron, with labelled dendrites, nucleus, cell body, axon, axon terminals, and synapse, along with two glial cells (an astrocyte and an oligodendrocyte). Modified from (Health 2015)

This means that the connectivity of neurons in a biological neural network is heavily determined by their size and structure, as well as their interaction with glial cells. As seen in Figure 2, neurons are often grouped by their morphology, but in reality these categorizations are only a guide, and in the brain we see a continuum of morphological characteristics.

Figure 2: The categorization of neurons based on axon/dendrite configuration is shown at left modified from (HowStuffWorks 2001), while a more realistic continuum of characteristics, and therefore, connectivity, is shown at right modified from (Stuffelbeam 2008)

Neurons communicate primarily in two ways: electrically and chemically. The cell membranes of neurons have proteins which can selectively transport ions. The unequal distribution of negative and positive ions on either side of the membrane creates a voltaic potential. When a neuron is electrically stimulated, the membrane of the axon releases ions, which causes nearby release of ions, and this release proceeds down the axon in a “wave” of depolarization. Some neurons are connected to others directly by membrane proteins at so-called gap junctions, so that the wave of depolarization proceeds directly to those neurons as well, forming an electrical synapse. At the axon terminals, the ion release triggers small capsules of chemicals called neurotransmitters to be released into the synapse, where they may interact with receptors on the next neuron, which could trigger something like ion transport, which could in turn trigger a wave of depolarization in the next neuron, and so on. See Figure 3 for a summary of the processes involved at these chemical synapses.

Figure 3: Chemical transmission of an action potential. From (Cummings 2006)

Note that at gap junctions the electrical signal is propagated directly, with no “computation” by the neuron, while the nature of neurotransmitters and receptors at the synapse allows for complex interactions between neurons. There are over a hundred neurotransmitters in the human brain, and they will have different effects depending on the receptor that receives them, so that the same neurotransmitter can have different effects at different synapses. Neurons may release more than one neurotransmitter, even at one synapse. Neurotransmitters can also affect one another’s structure, binding action, or reuptake rate; the availability and number of receptors at other neurons; and much more. This creates a system with enormous representational capacity and very complex dynamics. We only skim the surface, looking at broad functional consequences of this diversity. See (Kandel, et al. 2012) or (Purves, et al. 2001) for good introductory textbooks.

2.3.2 Biological neural networks

We begin this section looking at a human-like nervous system, and trace its evolution in an attempt to discover what it is about this system that seems to make for such unique behaviour, and we find that the mechanisms for intelligent behaviour may not be as unique as they appear.

The nervous system consists of the brain and spinal cord (the central nervous system, or CNS), as well as all the neurons that go from there to various other parts of the body - motor neurons which send impulses to muscles causing them to move, and sensory neurons which are stimulated by touch/light/chemicals etc. and relay those stimuli to the brain. The brain has two connected hemispheres, and sensory information is generally ‘crossed’ – that is, sensory information from the left side of the body goes to the right side of the brain, and vice-versa, with an overlap in the middle. Although both sides process and combine similar information, in different species different hemispheres and regions can show dominance for certain tasks, for example, in humans, language comprehension tends to occur mostly in the left hemisphere (Kandel, et al. 2012).


This nervous system layout is a consequence of a number of evolutionary developments. In single-celled organisms, signalling proteins embedded in the cell membrane provide the cell with a mediated interaction with the external environment, and many of these proteins perform synapse-like functions, as well as being essential elements of animal synapses – for example, calcium transporters and protein kinases found in yeast and amoeba are vital to synaptic transmission in animals (Ryan and Grant 2009). The hypothesized evolution of the synapse is shown in Figure 4.

Figure 4: Hypothesized evolution of the modern synapse from (Ryan and Grant 2009)

In sponges (Porifera), a type of epithelial cell which sends electrical impulses helps to coordinate the movement of cilia to facilitate filter-feeding (Leys, Mackie and Meech 1999). Some sponge-like animals may have lived their lives floating and moving freely, and it has been hypothesized that movement constitutes a selection pressure for cephalization, as it is more advantageous to have sensory structures at the end which first encounters things to sense. In terms of detecting gradients (e.g. it’s warmer over there, more food over here), it has been proposed that having the same sensory structures bilaterally, perpendicular to the direction of movement, would provide an advantage. These two selection pressures together could have influenced the evolution of cephalization and bilaterianism (Peterson and Davidson 2000).

As the sensory cells specialized further, some worm-like creatures developed a hard structure which protected those sensory cells (called the notochord, basically a proto-spinal cord), and we call this group the vertebrates. Over time, in different groups, the nerve cord developed concentrations of nerve cells connected in small networks called ganglia, with longer nerve connections down the middle to connect to other tissues/organs in the body. This arrangement is seen in most animals today.

A species of paper wasp was recently shown to accurately and consistently learn wasp faces and associate them with individual recognition (Sheehan and Tibbetts 2011); Caenorhabditis elegans, a nematode worm, can learn and remember mechanical and chemical stimuli which predict aversive chemicals or the presence or absence of food (Ardiel and Rankin 2010); and Drosophila fruit flies and Aplysia sea slugs (a lophotrochozoan) are used to study many learning behaviours because their nervous system structure (involving neurons and synapses) is extremely similar to our own. However, recent work has also demonstrated intelligent behaviours in living creatures without synapses or neurons: (Stock and Zhang 2013) review work on the Escherichia coli bacterial nanobrain, a collection of coiled sensory fibres which is capable of tracking long-term changes, encoding distinct memories, and associating different stimuli via methylation. They note that this methylation system is highly conserved in motile bacteria, but not found in sedentary bacteria, and suggest an important connection between sensory-motor information and the development of intelligence.

Similarly, it was recently demonstrated that the Mimosa pudica plant (called ‘the sensitive plant’ because it closes its leaves in response to touch) can learn and maintain a memory for over a month (Gagliano, et al. 2014). Even more recently, researchers demonstrated electrical communication via potassium-ion waves in bacterial communities of a biofilm (Prindle, et al. 2015). In a similar vein, amoebae slime molds have also been shown to anticipate periodic events by rhythmic timing (Saigusa, et al. 2008). These findings emphasize the importance of time, causality, and prediction to intelligent behaviour.

2.3.3 Biochemical systems capable of learning

Chemical kinetics are Turing-complete (Magnasco 1997), and are therefore capable of representing an arbitrary set of operations. Magnasco demonstrates that a flip-flop between a minimum of two states is necessary and sufficient for encoding a chemical memory which, with appropriate energy input, is infinitely long-lived. Coupled with a ‘chemical repeater’ which copies input to output, he demonstrates the theoretical existence of the chemical prerequisites for communication and memory. (Fernando, et al. 2009) demonstrate that these prerequisites can be satisfied by phosphorylation of plasmids, specifically showing a circuit which implements Hebbian learning (see Section 2.3.5, Types of learning, below, for a description of Hebbian learning). It has also been suggested that the cyclic reactions of calcium could form the basis of these required bio-oscillators (Eccles 1983) (De La Fuente, et al. 2015).

These findings suggest that the mechanisms which underlie learning and memory are shared by most, perhaps all living creatures (possibly even abiotic chemical networks as well), but are costly to maintain, and in the vast majority of cases this energy cost exceeds the benefit of most intelligent behaviours. This brings us back to the relationship between intelligence and efficient coding; that what is special about neurons and synapses of biological networks is not the presence of the ability to learn and remember, but rather the efficiency with which information can be stored.

2.3.4 Biological learning and memory

Learning and memory are closely related. A neuroscientific symposium defined learning as a process for acquiring memory, and memory as a behavioural change caused by an experience (Okano, Hirano and Balaban 2000). Using this definition, if a system learns, it is redundant to say that it remembers, so we will concern ourselves primarily with describing learning. The method of storing this change is generally accepted to be modification of the way neurons are connected at their synapses. This idea is called synaptic plasticity – the connections between neurons can be modified, and it is these modifications to the synapses which constitute learning. In particular, neuron A and neuron B can form more synapses with each other; at a given synapse the amount of neurotransmitter that A releases can be changed by changing its reuptake rate of neurotransmitter from the synaptic cleft, or by increasing the number of synaptic vesicles available for releasing neurotransmitter; neuron B can change the number and type of receptors available for binding neurotransmitters; and so on.

Collectively, these properties which allow two neurons to have more or less of an effect on each other are called the synaptic strength, synaptic weight, or synaptic efficacy. If modifications to the synapse increase the synaptic strength between two neurons, this is called long-term potentiation, while a decrease in synaptic strength is called long-term depression.

The exact mechanisms of long-term potentiation and long-term depression are not fully understood. It appears that a rapid or sustained change in calcium concentration affects the behaviour of a calcium-dependent kinase, which can insert/remove neurotransmitter receptors and also change their activity level. At some point during this early phase, a calcium-independent kinase also becomes active, which can begin a cycle of protein synthesis and maintenance which is thought to be the basis of long-term memory. Long-term depression is even less well understood; it can be triggered by a prolonged lack of activity, or also by certain other patterns of activity. The precise molecular mechanisms of long-term potentiation and depression are a very active area of research. See (Serrano, Yao and Sacktor 2005), (Bliss, Collingridge and Morris 2004), (Lynch 2004) for active research, and (Kandel, et al. 2012) for review.

2.3.5 Types of learning

Non-associative learning is a change in the strength or probability of a response to a single stimulus, as a result of repeated exposure to that stimulus, often as a reflex response. Habituation is non-associative learning where the change is for the response to become less strong, or less probable, after repeated exposure, while in sensitization the response becomes stronger or more probable after repeated exposure. Generally when studying learning we are more interested in associative learning, wherein an association between two stimuli, or a behaviour and a stimulus, is learned. Studies beginning in the 1960s, mostly on the mollusc Aplysia, showed that the mechanisms underlying simple habituation and sensitization were also responsible for more complex associative memory (Duerr and Quinn 1982) (Roberts and Glanzman 2003) (Kandel, et al. 2012). Describing the mechanisms of learning, however, does not necessarily tell us exactly why, or when, learning occurs. The most widely accepted theory explaining the cause of learning is Spike-Timing-Dependent Plasticity (STDP), based on Hebb’s postulate.


Hebbian learning is often paraphrased as “neurons that fire together, wire together”, but was stated by Hebb in 1949 as “when an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased”. This phrase covers long-term potentiation, and was amended by others in the 1970s to include long-term depression as well (Feldman 2013). In other words, if neuron A frequently fires just before neuron B, a long-term potentiation takes place such that neuron B is more likely to fire when stimulated by neuron A in the future, while anti-Hebbian learning says long-term depression will occur if neuron A frequently fires just after neuron B. In some situations, the effect of the time interval is reversed: long-term potentiation is triggered if neuron A frequently fires just after neuron B, and long-term depression is induced if neuron A frequently fires just before neuron B (Choe 2014) (Letzkus, Kampa and Stuart 2006) (Fino, et al. 2010) (Bell, Caputi and Grant 1997).

These four types of learning can be encompassed by the idea of spike-timing dependent plasticity, which in its most general definition simply says learning is dependent on spike timing (Feldman 2013). However, in important academic works STDP is sometimes used to refer only to Hebbian learning, or only to Hebbian and anti-Hebbian learning (not including the time-interval-reversal situations), and, to add to the confusion, Hebbian learning is often taken to mean only long-term potentiation – see the varying definitions in, for example, (Bi and Poo 2001) (Markram, Gerstner and Sjostrom 2012) (Feldman 2013) (Bengio, et al. 2015). And although temporally inverted effects are observed, they are not generally given a name.

To write this more precisely, if during some time interval $t_0$ to $t_m$, presynaptic neuron A fires at times $t_{A[1,2,\ldots,n]}$ and postsynaptic neuron B fires at times $t_{B[1,2,\ldots,n]}$, the summation of differences in timing can be written as the integral of $\Delta t_n = t_{A_n} - t_{B_n}$ over the $n$ firing-pairs of neuron A and neuron B. The effects this can have in terms of learning, with the terms commonly used in the literature, are summarized in Table 1.

Table 1: Summary of possible causes and effects in spike-timing dependent plasticity (STDP)

Condition | Meaning | Consequence: long-term potentiation | Consequence: long-term depression
$\int_{t_0}^{t_m} \Delta t_n \leq 0$ | Neuron A repeatedly fired first/at the same time | Hebbian learning | Anti-Hebbian learning
$\int_{t_0}^{t_m} \Delta t_n > 0$ | Neuron B repeatedly fired first | Temporally-inverted Hebbian learning | Temporally-inverted anti-Hebbian learning
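To make the bookkeeping in Table 1 concrete, the short Python sketch below pairs up spike times for a presynaptic neuron A and a postsynaptic neuron B, sums the timing differences (a discrete stand-in for the integral above), and names the resulting learning regime according to Table 1. This is an illustrative sketch only; the function and variable names are not taken from the thesis software.

def stdp_regime(t_A, t_B, consequence):
    """Name the STDP learning regime for paired spike times, following Table 1.

    t_A, t_B    : equal-length lists of spike times for presynaptic neuron A
                  and postsynaptic neuron B (one entry per firing-pair)
    consequence : 'potentiation' or 'depression', the observed synaptic change
    """
    # Discrete analogue of the integral of delta_t_n = t_An - t_Bn over firing-pairs
    total_dt = sum(ta - tb for ta, tb in zip(t_A, t_B))
    a_fired_first = total_dt <= 0  # A repeatedly fired first or at the same time
    if consequence == 'potentiation':
        return 'Hebbian learning' if a_fired_first else 'temporally-inverted Hebbian learning'
    else:
        return 'anti-Hebbian learning' if a_fired_first else 'temporally-inverted anti-Hebbian learning'

# Example: A consistently fires 5 ms before B, and the synapse is potentiated
print(stdp_regime([0, 20, 40], [5, 25, 45], 'potentiation'))   # -> Hebbian learning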

Spike-timing dependent plasticity likely does not represent the full picture of learning in biological synapses (Bi and Poo 2001). Learning can also be non-local, mediated by glial cells or neurotransmitters such as nitric oxide (NO), which quickly diffuse to neighbouring synapses. (Letzkus, Kampa and Stuart 2006) found that temporal inversion of STDP depended on the relative location of synapses within the dendritic arbor. Long-term depression can be induced by non-repeated excitatory impulses, as well as at inactive synapses when calcium concentration is raised by glia or nearby neural activity, and can also occur at electrical synapses (Haas, Zavala and Landisman 2011). And although we normally view the synapse as ‘belonging’ to neuron B (the post-synaptic neuron), (Costa, et al. 2015) model STDP with decay-trace spike trains and compare to an in-vivo model, finding that plasticity is more accurately portrayed by employing both pre-synaptic factor modification and post-synaptic factor modification.

2.4 Artificial neural networks

2.4.1 Artificial Intelligence

Much of what we have just presented about learning systems was not known in the 1930s and 1940s, when the field of artificial intelligence grew out of an increased understanding of psychology, statistics, physics, probability, and logic.

The goal of artificial intelligence research is to understand and simulate human-like intelligent behaviour (Russell and Norvig 2003).

Human thought and formal reasoning have been examined since at least 1000 BC by Chinese, Indian, and Greek philosophers. These studies grew into the fields of philosophy and mathematics, and the duplication of intelligent behaviours by machines was suggested as early as the 1200s by Llull, whose ideas were developed mathematically by Leibniz in the 1600s. Through this time, the physiology of the nervous system became increasingly well understood, from Galen’s research on injured gladiators around 150 AD, amended with precise experimentation by Vesalius in the 1500s. By the late 1800s, the cartoon neuron described in the first section of this work was almost elucidated (see work by Gerlach, Purkinje, Valentin, Remak, Schwann, Schleiden, Golgi, Ramon y Cajal), while the Industrial Revolution motivated studies of mechanical physics and conservation of energy (see Carnot, von Mayer, Joule, Helmholtz, Clausius, Kelvin, and others), and Laplace’s rediscovery of Bayes’ work laid the foundations of probability theory. The formalization of logic and the development of calculating machines in work by Boole, Frege, Babbage, Lovelace, Russell and Whitehead, and Hilbert culminated in the 1930s with work by Gödel, Turing, and Church demonstrating that a mechanical device could imitate any conceivable mathematical process. Through the 1940s, it was additionally demonstrated by Turing and von Neumann that such a mechanical device could be constructed and used to solve problems, such as breaking cryptographic codes. The creation of these computers, combined with an increasing foundation of psychological and neurological research in physiology, and applications of the laws of thermodynamics to information theory (Shannon, Minsky, and others), contributed to interest in the creation of an electro-mechanical artificial brain. In 1956 the Dartmouth Summer Research Project on Artificial Intelligence was held, widely considered to be the founding of the field as an academic discipline. For histories of artificial intelligence research, see (Buchanan 2006) (Russell and Norvig 2003).


Today, ‘artificial intelligence’ is addressed through research in a number of fields including knowledge discovery, data mining, robotics, economics, statistics, neuroscience, and many others.

2.4.2 Machine learning

Machine learning seeks to design algorithms which change in response to data. This allows the algorithms to act in ways that have not been explicitly programmed. Machine learning is often used almost as a modern synonym for artificial intelligence, but tends to focus more on the computational (mathematical and software) aspects than on robotics, logic, psychology, or biology.

Machine learning methods have proved successful in a variety of domains, particularly in classification and prediction – anywhere the underlying statistics of patterns in data can be recognized and used.

Supervised learning methods require a dataset which contains “correct answers” – for example, a medical dataset of blood pressure, weight, height, age, and other factors for 10 000 people for whom the diagnosis of a disease is known. The algorithm will find patterns in this data so that it could determine that people with high blood pressure, above a certain age, and with a certain genetic factor, tend to be diagnosed with the disease (a clustering problem); or, if presented with a new case, it could say based on its previous information whether or not the person would have the disease (a classification problem). Semi-supervised learning incorporates some examples for which the “correct answer” is not known, usually in situations where there is only a small set of such labelled examples, and uses some kind of similarity metric to relate unlabelled examples to labelled ones. In unsupervised learning, where no labels are known, the algorithm simply searches for structure and patterns in the data. Reinforcement learning is like an active form of semi-supervised or unsupervised learning; there may or may not be a period of training on known examples at the beginning, and then new examples keep coming in, and the algorithm changes with the new information.

Note that machine learning does not equate to just neural networks – many other tools and methods exist. See (Bishop 2006) or Andrew Ng’s Coursera course (Ng 2015) for a good introduction. However, artificial neural networks provide the closest parallel to biological neural networks, and this is the focus of this work.

2.4.3 Artificial neurons

The classic artificial neuron is the McCulloch-Pitts neuron, wherein all inputs are binary (0 or 1), all connections have the same fixed weight, and units calculate their output as a weighted sum and ‘fire forward’ according to the Heaviside step function (McCulloch and Pitts 1943):

$$n_i(t+1) = \theta\left[\sum_j w_{ij} n_j(t) - \mu_i\right]$$

where: $\theta$ is the Heaviside step function, with $\theta(x) = 0$ for $x < 0$ and $\theta(x) = 1$ for $x > 0$
$w_{ij}$ is the weight on the connection from unit $j$ to unit $i$
$n_j(t)$ is the $j$th unit's output at time $t$
$\mu_i$ is the threshold of unit $i$

Based on studies of fly visual systems, neurobiologist Rosenblatt expanded on this model to make the weights real-valued, and the perceptron was born.

Networks of perceptrons, as shown in Figure 5, are binary classifiers; that is, they learn a linear separation of the input data.


Figure 5: A ‘network’ of two input perceptrons, $x_1$ and $x_2$, and one output, $y$, can perform a linear separation between 0s and 1s (an AND gate) if it has learned appropriate weights
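As a concrete illustration of Figure 5, the following Python sketch implements a single perceptron-style unit with real-valued weights and a Heaviside step output. The particular weights (1, 1) and threshold (1.5) are hand-picked illustrative values that realize the AND gate; they are not taken from the thesis.

def step(x):
    # Heaviside step function: 0 for negative input, 1 otherwise
    return 1 if x >= 0 else 0

def perceptron(inputs, weights, threshold):
    # Weighted sum of the inputs minus the unit's threshold, passed through the step function
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return step(weighted_sum - threshold)

# Weights and threshold chosen so the unit behaves as an AND gate
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, '->', perceptron([x1, x2], weights=[1.0, 1.0], threshold=1.5))
# Prints 1 only for the input pair (1, 1)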

Generalizing the McCulloch-Pitts neuron above, we have:

$$a_i = f\left[\sum_j w_{ij} a_j - b_i\right]$$

where: $f$ is the activation function
$w_{ij}$ is the weight on the connection from unit $j$ to unit $i$
$a_i$ is the $i$th unit's output at time $t$
$a_j$ is the $j$th unit's output (from the previous layer) at time $t$
$b_i$ is the bias of unit $i$


If we consider the activations of an entire layer, we can also write the much more compact vector form:

$$a_l = f(w_l a_{l-1} + b_l)$$

where: $f$ is the activation function
$w_l$ is the matrix of weights incoming to layer $l$
$a_l$ is the vector of activations (outputs) for layer $l$
$a_{l-1}$ is the vector of activations (inputs) for layer $l$
$b_l$ is the bias vector for layer $l$

This is the basic neuron of most modern artificial neural networks.
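In code, the vector form above is a single matrix-vector product followed by an element-wise activation function. The NumPy sketch below computes one layer's activations $a_l = f(w_l a_{l-1} + b_l)$; the layer sizes, random weights, and choice of tanh are arbitrary illustrative values, not details of any particular network in this thesis.

import numpy as np

def layer_forward(a_prev, W, b, f=np.tanh):
    """One layer of a feed-forward network: a_l = f(W a_{l-1} + b)."""
    return f(W @ a_prev + b)

rng = np.random.default_rng(0)
a_prev = rng.standard_normal(784)           # e.g. a flattened 28x28 input image
W = rng.standard_normal((30, 784)) * 0.01   # weights incoming to a 30-unit hidden layer
b = np.zeros(30)                            # bias vector for the hidden layer
a_hidden = layer_forward(a_prev, W, b)      # vector of 30 hidden-layer activations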

The terms “activation function”, “transfer function”, and “output function” can be a source of confusion between disciplines. In computational neuroscience literature, the term activation function refers to how the neuron calculates its own state, or internal activation value, by taking account of influences from other neurons. In artificial neural networks, this function is almost universally the sum of inputs each weighted by a connection strength, and this weighted sum is not given a particular name. In computational neuroscience, the output function is applied to the activation value to determine the output value that that neuron will send to its connections. In artificial neural networks, this is the function referred to as the activation function, ‘non-linearity’, or ‘non-linear function’, referring to its role in function approximation, and the output of the neuron is variously called the output, activity, activity level, or activation. In computational neuroscience, the activation and output functions taken together are called the transfer function. In artificial neural networks, this term is sometimes used interchangeably with activation function (again, to refer to what a neuroscientist would call an output function), while the term ‘output function’ is rarely used.

This work is primarily concerned with documenting biological influence in artificial neural networks, and does not attempt to be a reference textbook for terminology, so we will somewhat reluctantly follow the very entrenched terminology of artificial networks, referring to the activation function as the function which determines a neuron’s output. Popular choices for the activation function include the logistic sigmoid, hyperbolic tangent, and rectified-linear, whose relative benefits are discussed in the next section. There is occasionally also some naming confusion between the logistic sigmoid and hyperbolic tangent – both curves are sigmoidal, however the logistic sigmoid is sometimes referred to simply as the ‘sigmoid’.
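As an illustrative aside (not code from the thesis implementation), the two sigmoidal activation functions just mentioned can be written in a few lines of Python; the identity tanh(x) = 2·σ(2x) − 1 makes it easy to see why the logistic sigmoid and hyperbolic tangent are often confused.

import numpy as np

def logistic_sigmoid(x):
    # The 'sigmoid' of most ANN literature, with outputs in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def hyperbolic_tangent(x):
    # Also sigmoidal in shape, but with outputs in (-1, 1)
    return np.tanh(x)

x = np.linspace(-3, 3, 7)
# tanh is a rescaled, recentred logistic sigmoid: tanh(x) = 2*sigmoid(2x) - 1
assert np.allclose(hyperbolic_tangent(x), 2 * logistic_sigmoid(2 * x) - 1)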

2.4.4 Artificial neural networks

Particularly when designed for classification or prediction, artificial neural networks are generally arranged in layers. The input layer is the data or ‘stimulus’ to be presented to the network, and the output is the transformed data or ‘response’ to the data, while the hidden layers form the network proper, performing the transformation of the input data via the weight and bias matrices and the action of the activation function.


On the ‘forward pass’ through the network, each input node calculates its activation based on the data presented (e.g. the pixel colour in an image), and passes this number on to all of its connections in the next layer (typically, layers are fully connected). Neurons in the first hidden layer calculate the weighted sum of their inputs, subtract a bias term, and pass this number through an activation function to determine their outputs. This process proceeds through each hidden layer until the output layer, where the outputs are interpreted relative to a cost function or target value – for example, for a network attempting to classify the digits 0-9, as shown in Figure 6 we might have 10 outputs, and designate each to be one of the numbers. So if the image presented was a picture of the number 9, the cost function would describe [0,0,0,0,0,0,0,0,0,1] as being the ideal output.


Figure 6: An artificial neural network with 1 hidden layer set up to classify the digits 0-9 in a 784-pixel image

Learning then comes down to credit assignment – how can we take this ideal value as compared to the output of the network, and change the weights in the hidden layers to make the network more likely to have a response closer to the target in the future? In most feed-forward networks this question is answered by backpropagation – the output nodes calculate how far they were from the target, measured as some kind of error or distance. In order to reduce that error next time, they look at whether they should increase or decrease their output. This direction of increase/decrease is called the gradient, which ‘points’ (is larger) in the direction of increasing error. The only method the unit has of changing its output is to change the weights of connections coming in to it – to change how much it ‘listens’ to various neurons in the previous layer. To figure out how to change, the neuron moves each of its incoming weights against the direction of the gradient – so if the neuron needed to output a lower number next time, it increases the weight with neurons that sent negative numbers, and decreases the weight with neurons that sent positive numbers. The exact amount of this update is $-\eta \frac{\partial E}{\partial w}$: the partial derivative of error with respect to the weight, multiplied by a small learning rate (so that the weights don’t change wildly with each new piece of information), and subtracted from the previous weight to move in the direction of steepest minimization of error.
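The sketch below applies this update rule to the output layer of a network like the one in Figure 6, using a squared-error cost and a sigmoid output so that the gradient has a simple closed form. The layer sizes, learning rate, cost, and activation are illustrative assumptions, not details taken from the thesis.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.random(784)                       # activations arriving from the previous layer
t = np.zeros(10)
t[9] = 1.0                                # ideal output for an image of the digit 9
W = rng.standard_normal((10, 784)) * 0.01
b = np.zeros(10)
eta = 0.1                                 # small learning rate

a = sigmoid(W @ x + b)                    # forward pass through the output layer
delta = (a - t) * a * (1 - a)             # dE/dz for the squared-error cost E = 0.5*||a - t||^2
dE_dW = np.outer(delta, x)                # dE/dW: how each weight affects the error
dE_db = delta

W -= eta * dE_dW                          # move each weight by -eta * dE/dw
b -= eta * dE_db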

There are many variations of gradient-based backpropagation, but virtually all performant methods for training artificial neural networks are based on some kind of gradient descent, as described above. Variations include: adding momentum (a weighted average of the current and previous weight updates); or, in unsupervised settings, using contrastive divergence (which minimizes a particular kind of distance between new examples and an equilibrium distribution for visible variables). See (Rojas 1996) (Y. LeCun, et al. 1998) (Nielsen 2015) for a thorough explanation of backpropagation and variations.

2.4.5 Deep learning

In deep learning, back-propagation follows exactly the same steps as outlined above, however the network structure generally contains more hidden layers – i.e. is deeper. This arrangement means that higher layers form patterns of patterns; transformations occur not only on the input data but on the internal representation of the input data, to form hierarchies of representation. Deep networks are not a recent innovation in concept, but technological advances in the last 5-10 years, particularly relatively low-cost GPUs, have made them orders of magnitude faster to train, and commercial successes with companies like Google, Facebook, Apple, Microsoft, and Bing have made large multi-layer neural networks one of the most popular areas of machine learning research. See (LeCun, Bengio and Hinton 2015) and (Schmidhuber 2015) for thorough reviews of deep learning.

2.4.6 Deep belief networks

A Boltzmann machine (Ackley, Hinton and Sejnowski 1985) is a stochastic recurrent neural network of binary units, wherein each unit takes the ‘state’ of being either 0 or 1 with a certain probability based on the energy function of the network. Weights between neurons are symmetric (there is no concept of layers), with a positive weight indicating that the ‘hypotheses’ of the units support one another, while a negative weight suggests the two hypotheses should not both be accepted. Weights are adjusted by seeing which neurons are in the same state at the same time, and strengthening those connections, as well as sometimes decreasing the weights of those simultaneously active neurons so that weights don’t keep increasing until all units become saturated.


Restricted Boltzmann machines constrain a Boltzmann machine’s connections to form a bipartite graph. Stacking these machines to create a layer structure forms a Deep Belief Network (DBN), introduced by (Hinton and Salakhutdinov 2006). DBNs were one of the first deep architectures to be of practical significance, due to a greedy, layer-by-layer unsupervised training method developed by (Hinton, Osindero and Teh 2006).

2.4.7 Convolutional neural networks (convnets)

A convolution multiplies local patches of the input by a particular small matrix, often called a filter or kernel; the output it produces is called a feature map. For example, if you convolved a vertical-line-detection filter with the matrix of pixel values of an image of the number 1, the output of that filter would likely have most of the central values activated. Filters can also perform translations, rotations, and other transformations. Convolutional neural networks usually alternate layers of filter banks with pooling or other layers, to create a hierarchy of representations, and learn weights not only for normal layers but also weights within the filter (i.e. learning the filter/transformation itself). Convnets are based on biological research in the visual cortex of cats and monkeys (Hubel and Wiesel 1968). With recent architectures achieving near-human performance or better on many benchmark datasets, they are something of a poster child for successful deep neural networks.
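The following NumPy sketch shows the filtering operation just described: a small vertical-line filter is slid over a toy 5x5 'image' of the digit 1, producing a feature map whose central column is strongly activated. The toy image, the filter values, and the 'valid' sliding scheme are all illustrative assumptions, not the filters learned by any network in this thesis.

import numpy as np

def convolve2d_valid(image, kernel):
    """Slide a small filter over the image (cross-correlation, 'valid' region only)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy 5x5 image of the digit 1: a single bright vertical stroke
image = np.zeros((5, 5))
image[:, 2] = 1.0

# A 3x3 vertical-line-detection filter: responds strongly to bright columns
kernel = np.array([[-1.0, 2.0, -1.0],
                   [-1.0, 2.0, -1.0],
                   [-1.0, 2.0, -1.0]])

feature_map = convolve2d_valid(image, kernel)
print(feature_map)   # the central column of the feature map has the largest values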

2.4.8 Rectified linear units


The rectified-linear activation function is $f(x) = \max(0, x)$; in other words, if your weighted sum is positive, output that weighted sum, otherwise output 0. In 2010, Nair and Hinton introduced rectified linear units to deep Boltzmann machines, and showed that despite being unbounded and not differentiable at 0, they improved results (Nair and Hinton 2010). Subsequently, (Glorot, Bordes and Bengio 2011) demonstrated their usefulness in dramatically speeding up training in very large, deep networks, and (Krizhevsky, Sutskever and Hinton 2012) used them in a convolutional neural network to reduce the state-of-the-art image recognition error on ImageNet by more than half. ReLUs are now the unit of choice for deep architectures (LeCun, Bengio and Hinton 2015).
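In code, the rectified-linear activation and the gradient that is usually backpropagated through it look as follows; treating the gradient as 0 at exactly $x = 0$ (where the true derivative is undefined) is a common practical convention and an assumption here, not a detail taken from the papers above.

import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0.0, x)

def relu_grad(x):
    # Subgradient used in practice: 1 where x > 0, else 0 (including at x == 0)
    return (x > 0).astype(float)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(z))  # [0.  0.  0.  1.  1. ]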

2.4.9 Dropout and DropConnect

Dropout was introduced by (Hinton, Srivastava, et al. 2014) to address a common problem in deep neural networks: overfitting. As the name suggests, dropout simply sets the outputs for a random percentage of units to zero, with the aim of preventing feature co-adaptation. Training uses sampling during backpropagation, effectively averaging what the output would have been if many sparser networks had been run on the input. DropConnect, developed by (Wan, Zeiler, et al. 2013), is very similar to dropout, but sets weights equal to zero instead of activations. Both of these methods have proven successful for preventing overfitting, although on real-world data which is sparse and noisy, (McMahan, et al. 2013) found dropout to negatively impact representations by removing training data information.
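A minimal sketch of dropout at training time follows: each unit's output is zeroed with probability p, and the surviving activations are rescaled (so-called 'inverted dropout') so that no rescaling is needed at test time. The rescaling convention and the drop probability are common practical choices assumed here, not details taken from the dropout paper; as noted in the comment, DropConnect would apply an analogous mask to the weights instead.

import numpy as np

def dropout_forward(a, p=0.5, rng=None):
    """Zero each activation with probability p; rescale survivors by 1/(1-p)."""
    if rng is None:
        rng = np.random.default_rng()
    mask = (rng.random(a.shape) >= p).astype(float)
    # DropConnect applies an analogous mask to the weight matrix rather than to the activations
    return a * mask / (1.0 - p), mask   # keep the mask to reuse during backpropagation

a = np.ones(10)                          # activations of a hidden layer
a_dropped, mask = dropout_forward(a, p=0.5)
print(a_dropped)                         # roughly half the entries are 0, the rest are 2.0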

2.4.10 Maxout

Maxout activation functions simply pass on the maximum weighted input they receive, without summation. Maxout networks were introduced by (Goodfellow, et al. 2013) to work in concert with dropout. The maximum weighted input is calculated before dropping out, so active units are never dropped, making the network effectively average only the best-performing models. The authors demonstrate that maxout units can act as rectifiers, absolute-value, or approximately quadratic functions, depending on their inputs, as shown in Figure 7 below.

Figure 7: From (Goodfellow, et al. 2013), demonstrating how a maxout unit can act as a rectifier or absolute value function, or approximate a quadratic function. h_i(x) is the hidden unit activation function for input x. The authors note this is only a 2-D case for illustration; the concept extends equally well to arbitrary convex functions.
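Following the formulation of (Goodfellow, et al. 2013), a maxout unit computes several weighted sums of its input and passes on only the largest. A minimal Python sketch (our own, with arbitrary sizes) of one maxout unit with k = 4 linear pieces:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)          # input vector
k = 4                           # number of linear pieces per maxout unit
W = rng.normal(size=(k, 3))     # one weight vector per piece
b = rng.normal(size=k)          # one bias per piece

pieces = W @ x + b              # k weighted inputs
h = pieces.max()                # maxout activation: the largest piece wins
print(pieces, h)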


Chapter 3: Comparing biological and artificial neural networks

3.1 Introduction

The ubiquity of powerful personal computers has led to an accumulation of truly massive amounts of data about users, connections, news, and other information, as well as widely available means of analyzing this data. Large companies such as Google, Facebook, Amazon, Apple, and Microsoft are in a financial position to offer substantial rewards for innovative research using this data in some way.

Among other factors, this has contributed to a highly productive, empirical-results-driven landscape for artificial neural network research (and machine learning more generally). With so much concrete evidence of their success, it is a good time to revisit some of the intuition motivating artificial neural networks, and relate their performance to modern research in biological neural networks.

We hope that these insights will provide a context for practitioners to connect empirical results with underlying concepts in modern work, and also to re-examine the rich neural network literature of the past 30 years.

3.1.1 Traditional intuition

Although the literature on artificial neural networks tends to shy away from direct comparisons with biological neural networks, it is often stated in an introductory fashion that the weights of an artificial neural network represent the synapses of a biological neural network, with the sign indicating whether that synapse is excitatory or inhibitory; the bias represents the excitability; and that the activation of an artificial neuron represents the firing rate (Karpathy 2015)

(Nielsen 2015) (Rojas 1996). Recent work has also suggested parallels between spike-timing-dependent plasticity (STDP) and gradient-descent backpropagation

(Bengio, et al. 2015). Most authors emphasize that this comparison makes broad generalizations and is not to be taken literally, but the intuition is helpful in many ways. This work considers this intuition critically, and suggests some modifications.

3.1.2 Inhibition and excitation

Most of the neuron experiments which have motivated artificial neuron models have been performed on excitatory neurons. This is because motor and sensory neurons are much more exposed and easy to study than neurons of the brain, and vertebrate motor and sensory neurons are inherently excitatory. It is interesting to note that this is not true of invertebrates, wherein the effect of the impulse is determined by the type of neurotransmitter released and the receptor it binds (Kandel, et al. 2012). Thinking of neurons as primarily excitatory informs the idea of neural networks in general as being constructive rather than discriminative. This is a subtle but not unimportant distinction. With the success of engineered features in computer vision tasks, we generally think of neural networks as doing the same thing – constructing and combining patterns

from input information to arrive at a solution. But the nature of biological neural networks suggests otherwise. The central nervous system receives a constant stream of excitatory information from many sources, and most of it is irrelevant at any given time. Studies in selective visual attention suggest that we cut down visual information to a minimum, combine the most relevant parts through disinhibition (inhibition of inhibitory connections), and fill in most of the details with imagined or assumed information (Hartline, Wagner and Ratliff 1956)

(Handy, Jha and Mangun 1999) (Wilson 2003) (Kandel, et al. 2012).

Inhibition is important for learning and memory. As (Hulbert and Anderson 2008) suggest, consider the problem of remembering where you parked your car, remembering a friend’s new address, or speaking a language you have recently learned. The interference from all of the places you have ever parked, all of your friend’s previous addresses, or all the ways of saying something in another language is a difficult problem to overcome; a body of experimental evidence demonstrates that the greater the number of memory competitors, the more difficult is the retrieval of a specific memory. With excitatory-only models of memory, the relatively weak stimulus of a new parking spot, address, or word would need to be repeated a number of times exceeding the reinforcement of the old memory, but studies indicate that old or conflicting memories are not just supplanted, but actually suppressed (Jonides, et al. 1998). (Johansson, et al. 2006) recorded event-related potentials from participants engaging in a memory retrieval task, finding that while excitatory potentials dominated during the activity, inhibition dominated afterwards, and the ability to retrieve that memory later was proportional to the amount of inhibitory activity. (Hulbert and Anderson 2008) interpret this and other similar experiments to indicate that inhibition ‘prunes’ new information to fit it in with pre-existing memories, and thus stores new memories more efficiently. Inhibition also plays an important role in sleep and memory formation (Siegel 2001) (Rasch and Born 2013).

The abstract of the seminal paper by (McCulloch and Pitts 1943) begins ‘Because of the “all-or-none” character of nervous activity, neural events and the relations among them can be treated by means of propositional logic’. Today, no neurologist would describe nervous activity as being “all or none”, and networks of binary threshold neurons are not empirically useful for solving most problems. (Rojas 1996) presents a proof that networks of McCulloch-Pitts-type neurons with no inhibition can only implement monotonic logical functions – that is, negative weights are required to implement functions involving the ‘NOT’ operator. This is well known, and virtually all artificial neural network implementations allow negative weights.

One of the first studies of inhibitory influence in biological neural networks was

(Hartline, Wagner and Ratliff 1956), demonstrating the importance of lateral competitive inhibition in the retina of the horseshoe crab. This finding hints at the importance of inhibition in complex networks, and in vision in particular.

Despite this early result, most artificial neural network texts mention inhibition only in passing, to say that it is represented by negative weights. However, this does not represent the full picture of effects that negative values can have in neural networks. We suggest that their primary importance is to give the possibility of equal weight to ‘information for a hypothesis’ and ‘information against a hypothesis’.

3.1.3 Accidental bias

Let us trace negative and positive values in a neural network in more detail. In the forward pass through an artificial neural network, if inputs are normalized to a mean of 0, and weights are initialized randomly, the output of the first layer can be positive or negative with equal probability. At an arbitrary unit in the next layer, a summation of the positive and negative numbers from each of the units in the first layer will be made; that is, we can describe the weight as the effect neuron A has on neuron B’s summation. Neuron B then calculates its output value – if it employs an activation function which can take negative values, this number will be positive if the majority of the input was excitatory, or negative if the majority of the input was inhibitory. Incoming to the next layer, this value can then be multiplied by either a positive or negative number, depending on the weight of the connection; if negative, the ‘intended’ effect of the output of the neuron in the previous layer is reversed. At the output layer, values are compared to the target, and the distance/cost/error of the output is calculated. From this value, at each neuron the gradient with respect to the output is calculated, pointing in the direction of greatest error, and multiplied by the derivative of the activation function at the unit’s weighted sum to obtain the gradient with respect to each weight. If the activation function is always increasing, its derivative is always positive, so the sign of the update depends only on the sign of the gradient.

It may seem pedantic to go through these operations by hand, but they have implications for how the network functions which are not often considered. At each step – input layer, first weights, first hidden layer units, second layer weights, second layer units (etc. for all hidden layers), output units, weight update – information changes and the network has an opportunity to react to this information. Squashing the range of any of this information as compared to the other information in the network will make the network dynamics unstable and slow, as the network attempts to learn this artificial (and accidental) bias.
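To illustrate the last point, the following small Python sketch (our own toy example: one hidden tanh unit, one linear output, squared-error cost) traces the sign bookkeeping described above; because the derivative of tanh is positive everywhere, the sign of the hidden unit’s error term – and hence of its weight updates – is fixed entirely by the sign of the gradient arriving from the layer above:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # zero-mean inputs
W1 = rng.normal(size=3)           # weights into the hidden unit (can be +/-)
w2 = rng.normal()                 # weight from hidden unit to the single output

z = W1 @ x                        # weighted sum at the hidden unit
a = np.tanh(z)                    # activation (can be negative)
y = w2 * a                        # linear output
dC_dy = y - 1.0                   # gradient of squared error w.r.t. output (target = 1)

# Backpropagated error at the hidden unit; tanh' = 1 - tanh^2 is always positive.
delta = dC_dy * w2 * (1 - np.tanh(z) ** 2)

# The updates dC/dW1 = delta * x therefore share the sign of the upstream gradient.
print(np.sign(delta) == np.sign(dC_dy * w2))   # True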

The implications of these accidental biases for different networks are the topic of section 3.5 Application of proposed biological intuition.


3.2 Proposed biological intuition for artificial neurons

Motivated by the discussion above, the core intuition of these proposed changes is to consider negative values in neural networks not as a lack of positive information, but as the presence of negative information. Table 2 summarizes current thinking about the activation, weight, partial derivative of cost with respect to activation, and derivative of the activation function at a given weighted sum, including what it means for them to be negative. Table 3 summarizes the suggested modifications to the classical intuition.

Table 2: Conventional view of interpretations of neural network properties

Property | Magnitude | Sign + | Sign −
Activation | Firing rate | Firing rate | n/a
Weight | Synaptic strength | Excitatory connection | Inhibitory connection
Partial derivative of cost w.r.t. activation | Contribution of unit to cost | Increase synaptic weight | Decrease synaptic weight
Derivative of activation function at weighted sum | Contribution of weight to unit’s activation | Contribution of weight to unit’s activation | n/a

Table 3: Suggested biological interpretations of neural network properties

Property | Magnitude | Sign + | Sign −
Activation | Amount of neurotransmitter | Excitatory neurotransmitter | Inhibitory neurotransmitter
Weight | Number of receptors | Normal-effect receptor | Opposite-effect receptor
Partial derivative of cost w.r.t. activation | Correlation of spike-timing | A fired first | B fired first
Derivative of activation function at weighted sum | Response to correlation of spike-timing | Normal-effect STDP | Temporally-inverted STDP


We treat the activation as the amount of neurotransmitter released from a neuron, interpretable as the integration of the firing rate for a fixed window of time, with the sign representing the type (excitatory or inhibitory). The weight represents the effect of the neurotransmitter on the next (post-synaptic) neuron, determined by the receptor type on the post-synaptic neuron – causing the neurotransmitter to have either the typical or opposite effect that the neurotransmitter normally does. That is, a positive activation at a positive receptor type is the typical excitatory synapse, dominated by glutamate in the animal central nervous system (CNS). A positive activation at a negative receptor type is an excitatory neurotransmitter having an inhibitory effect, for example as occurs in bipolar cells in the retina. A negative activation at a positive receptor type is the typical inhibitory neurotransmitter, dominated by

GABA in the animal CNS. A negative activation at a negative receptor is an inhibitory neurotransmitter having an excitatory effect, as can occur in early

CNS development. These effects are summarized in Table 4.

Table 4: Summary of possible effects at a synapse when weights and activations can be negative

Weight (receptor type) \ Activation (neurotransmitter type) | + (excitatory) | − (inhibitory)
+ (normal-effect) | Excitatory effect | Inhibitory effect
− (opposite-effect) | Inhibitory effect | Excitatory effect


These first two properties (activation and weight) are quite intuitive. The activation is the output of a neuron, and a biological neuron quite literally outputs neurotransmitter when activated. The interpretation of weight as a receptor type establishes the correct intuition that a large-magnitude weight between neurons A and B means that that synapse has a large number of receptors for B to experience effects from A, and an increase to the magnitude of the weight represents an increase in the number of those receptors.

The second two properties require some further explanation. In following the detailed account of positive and negative values in the section on backpropagation, note that in forward propagation, a network whose weights and activation functions can both take on negative values has the ability to reverse the ‘intended effect’ of information, but in back-propagation, this ability is only present if the derivative of the activation function can take on negative values. For the majority of common activation functions, such as logistic sigmoid or hyperbolic tangent, or rectified-linear activation, this is not the case. Indeed, it has even been stated in the past that it is a requirement for back-propagation that the activation function should be monotonically increasing (having its derivative everywhere positive).

As explained in the section on backpropagation, the partial derivative of cost with respect to activation and the derivative of activation for a given weight- summation are multiplied to obtain the error of the output layer– in other words, if we think of backpropagation in forward-propagation terms, the cost

derivative represents how the cost function ‘wants’ the output layer to change, and the derivative of the activation function determines how the unit reacts to this change. In monotonically increasing activation functions, this derivative will always be positive, so it simply scales the reaction. However, analogously to allowing both negative weights and negative activations, we suggest that the derivative should be allowed to be negative, allowing the unit the possibility of reacting opposite to the cost function. We further suggest that this proposal is biologically motivated by the existence of temporally-inverted Hebbian and anti-Hebbian learning.

Although this proposal may appear strange to those familiar with artificial neural networks, we believe that this interpretation of neuron properties is well- motivated. The main value of this interpretation is that it motivates us to consider effects rather than their substrates – the message is not tied to the medium. Just because a synapse has one negative number associated with it does not mean that that synapse is an ‘inhibitory synapse’ – it is clearer (though regrettably longer) to say that neuron A produces a quantity of inhibitory neurotransmitter, which will inhibit neuron B if neuron B has appropriate receptors for that effect. Similarly, the apparent utility of temporally inverted anti-Hebbian learning in some biological systems should motivate us to say not that ‘backpropagation moves weights in the direction of minimizing error’, but rather that ‘backpropagation moves weights in the direction of minimizing error if the unit’s function is to do that’.


3.3 Spike-timing dependent plasticity terminology

Although in this work we have attempted to use terminology consistent with major works of the respective fields, we find the nomenclature of STDP to be confusing, and detrimental to cross-field communication. We therefore propose the following terminology to describe all spike-timing dependent plasticity

(Table 1 of traditional terminology is copied below for comparison.) Recall that Δt_n = t_An − t_Bn is the time interval between the spiking of Neurons A and B, and integrating over this time window gives the amount of neurotransmitter accumulated during that time. Long-term potentiation means strengthening of a synapse, while long-term depression means weakening of a synapse.

Table 5: Proposed terminology for causes and effects in spike-timing dependent plasticity (STDP)

Δt_n = t_An − t_Bn | Meaning | Consequence: long-term potentiation | Consequence: long-term depression
∫ Δt_n over [t_0, t_m] ≤ 0 | Neuron A repeatedly fired first/same time | Pre-post potentiation | Pre-post depression
∫ Δt_n over [t_0, t_m] > 0 | Neuron B repeatedly fired first | Post-pre potentiation | Post-pre depression

Table 1: Summary of possible causes and effects in spike-timing dependent plasticity (STDP)

Δt_n = t_An − t_Bn | Meaning | Consequence: long-term potentiation | Consequence: long-term depression
∫ Δt_n over [t_0, t_m] ≤ 0 | Neuron A repeatedly fired first/same time | Hebbian learning | Anti-Hebbian learning
∫ Δt_n over [t_0, t_m] > 0 | Neuron B repeatedly fired first | Temporally-inverted Hebbian learning | Temporally-inverted anti-Hebbian learning


3.4 Related work

3.4.1 Biological influence in artificial neural networks

(Bengio, et al. 2015) show that spike-timing dependent plasticity is equivalent to gradient descent as long as the change in the neurons’ firing rates can be expressed as an objective function – this is similar to what we state about the relationship between long-term potentiation and Hebbian and anti-Hebbian

STDP. However, we further note that in order to let artificial neural networks model temporally-inverted STDP, monotonically-decreasing or non-monotonic activation functions are required.

3.4.2 Other interpretations of the activation function

(Williams 1986) discussed the implications of interpreting the output of the activation function as a degree of confidence, or, if constrained to the closed interval [0,1], as a conditional probability, and explored how this idea can be extended to describe the logic functions that a network can compute. This was an influential idea, and contributed to the work of (Hornik, Stinchcombe and

White 1989) and others in proving many results about the representational capacity of neural networks. This idea also allows for an intuitive connection to Bayesian inference (Golden 1996).

Williams notes that this concept is equivalent to treating the activation as an amount of a feature present, which has advantages for applications to spatial

filtering and spatial Fourier analysis (which is what occurs in the visual system).

However, with conventional dogma indicating its biological implausibility, and the success of interpreting activations as probabilities, this interpretation does not appear to have been picked up in any significant literature. In this context, we propose ‘positive information’ and ‘negative information’ (i.e. information for or against) as the features on which the network operates.

3.5 Application of proposed biological intuition

3.5.1 Hyperbolic tangent and 0-mean centred activation functions

The logistic sigmoid function is centred around 0.5, and the output of a sigmoidal unit is therefore always positive. Using the intuition that the activation is a probability, this makes sense, but as noted by (Nielsen 2015), this constrains output values to [0,1], and can be problematic for convergence using backpropagation, because a neuron which has only positive inputs will receive weight updates that are either all-positive or all-negative, which can cause weights to zig-zag and take a long time to converge. This issue can be partly addressed by batch updating, but is more properly addressed by using a function which is zero-centred, such as the hyperbolic tangent. Prior to the great empirical successes of rectified-linear activation functions, the hyperbolic tangent was the most commonly recommended activation function

(Bishop 2006) (Y. LeCun, et al. 1998), and remains a good option for non-deep architectures (Nielsen 2015).
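A quick numerical check of this zero-centring argument (illustrative Python only; the random pre-activations are invented):

import numpy as np

z = np.random.default_rng(1).normal(size=100000)   # zero-mean pre-activations
sigmoid = 1.0 / (1.0 + np.exp(-z))
print(round(float(sigmoid.mean()), 2))    # ~0.5: sigmoid feeds only positive values forward
print(round(float(np.tanh(z).mean()), 2)) # ~0.0: tanh outputs are zero-centred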


We suggest that 0-mean centred activations (i.e. activations which can output negative numbers) have a biological justification: neurotransmitters which are normally inhibitory or excitatory can have their effects reversed depending on the type of receptor which binds them, or by the presence or absence of other compounds (Kandel, et al. 2012). To take some examples: low concentrations of dopamine enhance excitatory activity, while high concentrations of dopamine cause normally-excitatory activity to have an inhibitory effect (Zheng, et al. 1999); in invertebrate motor movement, the effect of an impulse is entirely neurotransmitter-dependent (Pearson 1993); the APB receptor in various cells in the retina binds normally-excitatory neurotransmitters to produce an inhibitory effect (Bloomfield and Dacheux 2001); and in the developing cortex, and also at special receptors in adults, normally-inhibitory neurotransmitters can have excitatory effects (Gulledge and Stuart 2003).

3.5.2 Boltzmann machines and deep belief networks (DBNs)

(G. Hinton, The Ups and Downs of Hebb Synapses 2003) describes the update of simultaneously-active weights in Boltzmann machines as imitating Hebbian learning (if the weight increases) and anti-Hebbian learning (if the weight decreases). However, this regime does not take into account the causality that is a crucial part of Hebbian learning. To be properly Hebbian, it should be the neurons which are in the same states just before or after one another whose weights are correspondingly increased or decreased. Given the lack of causality in the relationships between units in a Boltzmann machine, it is a testament to the robust nature of energy minimization in networks that they converge at all.

There is no obvious way in a canonical Boltzmann machine to introduce this causality, because there is no concept of a time-step. We propose that the reason for the success of Deep Belief Networks is that they introduce this causality by introducing an ‘arrow of time’, layer-by-layer, which biological networks have shown is important for the development of learning and intelligent behaviours.

3.5.3 Convolutional neural networks

As previously mentioned in section 2.4.7, convnets are modeled on biological visual systems. They provide a practical argument for looking to biological networks for inspiration and intuition regarding the design of artificial neural networks.

3.5.4 ReLUs

As demonstrated by (Nair and Hinton 2010), using a rectified linear unit is equivalent to taking the summation of a set of sigmoid units with progressively increasing positive biases. This fits nicely with the intuition of an activation function as an integration of the rate of firing of a neuron – if we view the activation of a sigmoid as the amount of neurotransmitter released at one firing (integration over a time step of 1), a ReLU performs integration over the rate of firing for a larger window of time. Thus in Table 3, where we discuss interpreting the partial derivative of cost with respect to activation as the correlation of spike-timing, with ReLU units we can further say that the sign represents neurons firing repeatedly, which corresponds to the biological reality of STDP. Note that this is slightly different from the interpretation of (Nair and Hinton 2010), who suggest that this summation is spatial. The nature of spike-timing dependent plasticity suggests post-synaptic response is correlated with repeated (not one-time) stimulation, making ReLUs therefore more biologically realistic than hyperbolic tangent or logistic sigmoid units. However, we note that the summation performed by ReLUs disregards the summation of sigmoids with negative biases, and this forms part of the motivation for the negative-rectification units described in the next chapter.
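Nair and Hinton’s construction can be checked numerically with a few lines of Python (our own sketch; the number of sigmoid copies is arbitrary): summing sigmoid copies whose thresholds increase in steps of 1 reproduces the softplus function, which in turn follows max(0, x) closely away from 0.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.linspace(-5, 5, 11)
stacked = sum(sigmoid(x - i + 0.5) for i in range(1, 1000))   # shifted sigmoid copies
softplus = np.log1p(np.exp(x))
relu = np.maximum(0, x)

print(np.max(np.abs(stacked - softplus)))   # ~0.01: the sum tracks softplus closely
print(np.max(np.abs(softplus - relu)))      # ln 2 ~ 0.69, attained at x = 0; the gap shrinks away from 0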

3.5.5 Dropout, DropConnect, and Maxout

The effects of dropout and DropConnect are at surface level similar to what we see in biological networks, in that the networks using them should be robust to some firing/weights dropping out at any given time (i.e., robust to noise). As we fall asleep, excitatory neurotransmitters which encourage wakefulness and activity are inhibited, while the thalamic reticular nucleus is disinhibited (i.e. begins to be excited) and induces slow waves through the cortex. Additional inhibition of excitatory synapses brings the brain into full REM sleep. Sleep research over the past several decades focused on the importance of REM sleep in particular for memory consolidation; however, recent research has suggested that it is the inhibition leading to overall reduced activity levels, combined with slow-wave oscillations, which most contributes to learning and memory (Siegel 2001) (Rasch and Born 2013). We suggest that this situation can be compared to the combination of dropout, maxout, and ReLU units – by acting as a summation over a greater time-step (i.e. waiting for a larger accumulation of neurotransmitter before ‘deciding’ how to respond), ReLUs imitate slow-wave oscillations, while dropout and maxout combine to look at only the most active units. Based on the influence of inhibition in this process, we further suggest that this combination could be improved by dropping units not stochastically, but via a system of learned inhibition.

3.5.6 Input normalization

Similarly to the problem observed with non-zero-centred activation functions, artificial neural networks can have trouble if input values are not normalized to a mean of 0. If inputs are only positive, whereas the representation domain of a network having negative weights and/or activations includes negative numbers, the updates for all the weights into a given first-layer unit will be of the same sign, and the weight updates will again zig-zag in an attempt to converge.

We can explain this in terms of the cost function – as noted by (Nielsen 2015), one of the assumptions which backpropagation requires is that the cost function for the network can be written as an average of the cost functions for each training example – in other words, that all training examples are equally important. In computing the partial derivative of error with respect to weights, we are holding all other variables constant, so we do not have the option in traditional backpropagation of weighting different input examples differently – this must be handled at the input. If we do not normalize the input to the same domain as our representations, we are telling the network that those parts of the representation are not as important, which will force it to compensate elsewhere.
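A tiny Python illustration of this point (with invented numbers): for a unit whose incoming values are all positive, every weight gradient delta * x_i shares the sign of the backpropagated error delta, whereas zero-mean inputs allow mixed signs.

import numpy as np

x_pos = np.array([0.2, 0.7, 1.3])       # un-normalized, all-positive inputs
delta = -0.4                            # backpropagated error at the unit
print(np.sign(delta * x_pos))           # all -1: same-sign updates -> zig-zag

x_norm = x_pos - x_pos.mean()           # inputs normalized to a mean of 0
print(np.sign(delta * x_norm))          # mixed signs: per-weight updates can differ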

3.5.7 Unsupervised pre-training

It is worth noting that in an artificial neural network which has the possibility of negative weights and negative activations, if weights are randomly initialized and input normalized to a mean of 0, the four effects described in Table 4 (excitation at normally-excitatory synapses; excitation at normally-inhibitory synapses; inhibition at normally-inhibitory synapses; inhibition at normally-excitatory synapses) will initially occur with equal probability, while in biological neural networks this is not the case, and varies widely depending on the network examined (Kandel, et al. 2012). We suggest that this can be seen as a biological motivation for unsupervised pre-training as seen in (Erhan, et al. 2010).

For highly structured networks, such as convnets, when the amount of training data is large (tens of thousands of labelled examples, or more), when the training data is very consistent (e.g. if normalized), and/or when classes are easily separable, the network may easily learn the appropriate proportions of effects – we suspect this is why only a small benefit is seen from unsupervised pre-training in very deep setups working with standardized benchmark data such as (Abdel-

Hamid, Deng and Yu 2013).


Chapter 4: Negative rectifiers

4.1 Introduction

The preceding section prompts us to consider inhibition in two areas in particular, which as noted by the ‘n/a’ in Table 2, are not really addressed by conventional biological interpretations of artificial neural networks. The first subject is activation functions which can take on negative values, and the second is activation functions whose derivatives can take on negative values. We discussed the implication of non-zero-centred activations as a motivating argument for choosing the hyperbolic tangent over the logistic sigmoid, but go on to note that rectified-linear units – which never produce negative activations – are empirically very successful. We suggest that this success is due foremost to

ReLUs integrating activations over a more biologically-realistic time interval, and that the lack of negative activations should therefore still pose a problem in these networks.

In fact, as noted by (Memisevic, Konda and Kreuger 2014), biases in large ReLU and logistic sigmoid networks tend to be large. This can contribute to the ‘dying ReLU’ problem, wherein units get ‘stuck’ at 0-valued activations and do not learn any further. A heuristic solution to this problem is to allow ReLUs to have large negative biases, and keep the learning rate small (to avoid units reaching activation 0), but this can make learning slower (Karpathy 2015). (Memisevic,


Konda and Kreuger 2014) further show that the consequence of large negative biases for the final model can be to favour the development of point attractors over multi-unit (and therefore multi-dimensional) internal representations – the opposite of what is desired in deep learning.

We are also interested in functions whose derivatives take on negative values, in order to allow the network the possibility of temporally-inverted STDP.

We therefore propose the negative-linear rectification function. Units which employ negative rectification will pass on their weighted sum if that sum is negative, and pass on 0 otherwise. The derivative of this function is monotonically decreasing.

For comparison, the logistic sigmoid, hyperbolic tangent, ReLU, and NegLU functions are shown in Figure 8 below.

Figure 8: The logistic sigmoid, hyperbolic tangent, rectified linear, and negative-linear activation functions


4.2 Formal definition

The negative-linear rectification function is defined as

η(x) = min(0, x)

and the negative-linear rectifier unit (NegLU), or negative-linear unit, as a unit in a neural network which applies this function as its activation function. A smooth approximation to this function is

N(x) = −ln(1 + e^(−x))

The derivative of this smooth approximation is

dN/dx = 1 / (e^x + 1)

which is the reflection of the logistic sigmoid function (equal to 1 − σ(x)), shown in Figure 9 below:

Figure 9: Derivative of the negative-linear rectifier
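For concreteness, a short Python sketch (our own) of the negative-linear rectifier, its smooth approximation, and the derivative given above:

import numpy as np

def neglu(x):
    return np.minimum(0.0, x)                 # eta(x) = min(0, x)

def neglu_smooth(x):
    return -np.log1p(np.exp(-x))              # N(x) = -ln(1 + e^(-x))

def neglu_smooth_grad(x):
    return 1.0 / (np.exp(x) + 1.0)            # dN/dx = 1 / (e^x + 1) = 1 - sigmoid(x)

x = np.linspace(-6, 6, 13)
print(neglu(x))                                      # passes negative sums through, 0 otherwise
print(np.max(np.abs(neglu(x) - neglu_smooth(x))))    # approximation error peaks at ln 2 near x = 0
print(neglu_smooth_grad(x))                          # decreases monotonically from ~1 to ~0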

4.2.1 Universal approximation

Given a sufficient number of hidden units and certain conditions on the activation function, a two-layer network with linear outputs can uniformly

approximate a function to arbitrary accuracy. One of the first results proving this concept for Borel-measurable functions was (Hornik, Stinchcombe and White 1989). Through the 90s, researchers were able to relax some of the constraints on the activation function, summarized by (Pinkus 1999), including a result by (Maskar and Micchelli 1992) demonstrating that the property of universal approximation holds for any non-polynomial activation function. The negative-linear rectifier is not a polynomial, and thus networks with negative-linear activations remain universal approximators.

4.3 Related work

4.3.1 Dynamic field model of vision

Evaluating the visual system in a dynamic field theory approach, (Johnson,

Spencer and Schoner 2008) model neuron spike recordings from a spatial recall experiment, and propose a model of functional layers which integrate pulse trains over short term and long term time scales to form short term, working spatial memories and long term memory respectively. They find that an inhibitory layer is essential to unify long-term memory with working spatial memory. We suggest that the ‘working spatial memory’ of this model could correspond to something like the filters learned by a convolutional neural network. To unify this short-term spatial information with longer-term patterns recognized further into the network, a summation over different time scales was required, and we suggest this could be modeled by rectified-linear units and pooling layers.


Finally, we suggest that negative-linear activations in the fully-connected layers could represent the inhibitory influence that the authors found necessary in their model.

4.3.2 Modified ReLUs

Noisy ReLUs (Nair and Hinton 2010) attempt to reduce overfitting by introducing a small amount of Gaussian noise γ, so that f(x) = max(0, x + γ). This produces a similar effect to using NegLUs, but introduces the assumption that noise is Gaussian-distributed instead of allowing the network to learn which

NegLUs to pay attention to. For datasets where images are very normalized, this assumption is likely to work well, and so we would expect NegLUs and Noisy

ReLUs to achieve similar results.

Leaky ReLUs (Maas, Hannun and Ng 2013) attempt to address the problem of dying ReLUs by passing a small, fixed portion of the negative activations instead of 0, so that the unit actually computes f(x) = max(x, αx). (He, et al. 2015) introduce the parametric ReLU, or PReLU, in which the parameter α is learned during backpropagation rather than fixed. Both methods have shown small empirical successes, and interestingly, the proportion α found to be useful in these studies is close to the proportion of NegLUs we found to be optimal in this work. However, these innovations provide different benefits – leaky ReLUs and PReLUs amount to letting the neuron fire a small amount of inhibitory pressure. This technique is not an effective method for implementing temporally-inverted STDP but is effective for preventing the death of units. It could be used complementarily with NegLUs, although we do not observe the problem of dying ReLUs in our networks, we believe because we do not have the problem of large negative biases.
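For comparison, a small Python sketch (our own; the value of α is arbitrary, not a tuned setting) of the outputs of a ReLU, a leaky ReLU, and a NegLU on the same inputs:

import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
alpha = 0.05                                   # illustrative leak factor

relu = np.maximum(0, x)                        # f(x) = max(0, x)
leaky = np.maximum(x, alpha * x)               # f(x) = max(x, alpha*x), 0 < alpha < 1
neglu = np.minimum(0, x)                       # negative-linear rectifier

print(relu)    # [0.0, 0.0, 0.0, 0.5, 2.0]
print(leaky)   # [-0.1, -0.025, 0.0, 0.5, 2.0]
print(neglu)   # [-2.0, -0.5, 0.0, 0.0, 0.0]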

4.3.3 Other non-monotonic activation functions

(Kobayashi 1991) and (de Felice, et al. 1993) studied the capacity of neurons with non-monotonic activation functions, and found the absolute storage capacity to be larger than for monotone functions; this topic was also examined by (Nie and Cao 2015) and (Wang and Chen 2012) in memristive neural networks. Kobayashi specifically identifies this type of activation function as being a departure from biological behaviour, based on the common interpretation that a neuron can only have a positive output. As explained by (Trappenberg 2009), non-monotonic activation functions create attractor basins surrounded by chaotic areas. (Yoshizawa, Morita and Amari 1993) suggest that this “sea of chaos” is actually advantageous in moving the network from one minimum to another, and they demonstrate that networks of neurons with non-monotonic activation functions can possess 2-4 times greater capacity as compared to typical sigmoidal units (Yoshizawa, Morita and Amari 1993). (Trappenberg 2009) states that these units do not appear to be biologically plausible, but notes that their effects may be realized by combinations of biological neurons and other processes. We suggest that this is the case, as evidenced by temporally-inverted STDP (Fino, et al. 2010).


4.4 Materials and methods

These experiments are designed to test the simple viability of NegLUs and examine the dynamics they create in a stable, well-studied control environment.

The control environment selected is the LeNet-type convolutional architecture.

The original LeNet architecture was designed by (Y. LeCun, et al. 1998) for optical character recognition in handwritten and printed documents. LeNet convolutional neural networks alternate convolutional and pooling layers, with one or more ‘fully-connected’ (non-convolutional) layers between conv-pool blocks. The control networks we test against use ReLUs in their fully-connected layers, while in our networks, we designate a certain percentage of the units in these fully-connected layers to be NegLUs. We call the resulting networks ‘MixLU’ networks (for mixed linear units). The percentage of NegLUs to include in each network was found by trial and error: the proportion was simply increased by small amounts until the network no longer converged.

The experiments conducted in this work use the MATLAB package MatConvNet

(Vedaldi and Lenc 2015), for which we developed a new multi-type layer to allow the use of different activation functions in the same layer. The precise architectures are described in sections 4.4.3 and 4.4.4.
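The multi-type layer itself was written for MatConvNet; the following NumPy sketch (illustrative only – the function name, the fixed assignment of the leading units, and the 4% figure are ours, and this is not the MatConvNet code used in the experiments) conveys the idea of a MixLU layer’s forward pass:

import numpy as np

def mixlu_forward(z, neg_fraction=0.04):
    """Apply ReLU to most units and NegLU to a designated leading fraction of them."""
    z = np.asarray(z, dtype=float)
    n_neg = int(round(neg_fraction * z.shape[-1]))        # e.g. 4% of the layer's units
    out = np.maximum(0.0, z)                               # ReLU for all units...
    out[..., :n_neg] = np.minimum(0.0, z[..., :n_neg])     # ...except the designated NegLUs
    return out

z = np.array([-1.5, 2.0, -0.3, 0.7] * 25)                  # a toy 100-unit pre-activation vector
print(mixlu_forward(z, neg_fraction=0.04)[:8])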

We test these networks on two common benchmark datasets in image recognition: MNIST (Y. LeCun, et al. 1998) and CIFAR-10 (Krizhevsky 2009),

described below in sections 4.4.1 and 4.4.2. The task in these datasets is to correctly classify images.

4.4.1 The MNIST dataset

MNIST (Y. LeCun, et al. 1998) is a set of 70 000 28x28 pixel grayscale images of the digits 0-9, handwritten by Census Bureau employees and high-school students – a random sample of training examples is shown in Figure 10.

The dataset is separated in training and test sets, with 60 000 images in the training set and 10 000 images in the test set. The best published result on

MNIST is an error of 0.21% (Wan, Zeiler, et al., Regularization of Neural

Networks using DropConnect 2013), using a deep convolutional neural network with DropConnect.

Figure 10: A random sample of MNIST training examples

4.4.2 The CIFAR-10 dataset

CIFAR-10 (Krizhevsky 2009) is a set of 60 000 32x32 pixel colour natural images belonging to 10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck) of 6000 images each, as shown in Figure 11. The dataset is divided into a training set of 50 000 images (5000 from each class) and a test set of 10


000 images (1000 from each class). The best published result on CIFAR-10 is

91.78% accuracy (Lee, et al. 2015), using a deep convolutional network with unique objective functions at each convolutional layer.

Figure 11: CIFAR-10 classes with a random sample of images for each

It is conventional to report MNIST results as errors and CIFAR-10 results as accuracies. We follow this convention for ease of referring to benchmark results.

Unlike real-world data, the pictures in CIFAR-10 and MNIST are ideally suited to optimized learning – images are centred, cropped, normalized, and do not contain much, if any, information other than the digit or object to be classified.

Thus we do not expect MixLU networks to have ground-breaking performance on the benchmark datasets. We are, after all, introducing counter-productive information to the optimization procedure. However, we believe these

preliminary experiments with NegLUs will lead to a better understanding of the properties and behaviour of non-monotonic activation functions in the context of learning, to inform further studies in less structured datasets where NegLUs may be of more practical importance.

4.4.3 Network architectures for MNIST

A LeNet-style convolutional neural network (Y. LeCun, et al. 1998) was implemented consisting of the following layers: convolution, pooling, convolution, pooling, convolution, ReLU, convolution, softmaxloss. This is the architecture used by (Vedaldi and Lenc 2015), and provides a convenient comparison baseline. We call this network ReLU-LeNet in discussions. To test NegLUs in this type of network, ten networks were created with the same architecture, but with the ReLU layer replaced by a MixLU layer containing between 1% and 10%

NegLUs.

4.4.4 Network architectures for CIFAR-10

CIFAR-10 images are colour, and more complex than the MNIST images, despite being only 4px larger per side – they contain objects at varying angles, in different lighting conditions, and sometimes with artifacts or distracting backgrounds. Thus the convolutional architecture applied to CIFAR-10 is slightly more advanced than the one for MNIST. Again, we implement our LeNet-style architecture after (Vedaldi and Lenc 2015), this time consisting of the following layers: convolution, pooling, ReLU; convolution, ReLU, pooling; convolution,


ReLU, pooling; convolution, ReLU; convolution, softmaxloss. We call this network ReLU-LeNet5 in discussions. To compare to this baseline, we created five MixLU-LeNet5s with 0.25, 0.5, 1, 2, and 3% NegLUs.

4.5 Results and discussion

4.5.1 MNIST

ReLU-LeNet, our comparison baseline, achieves a classification error of 0.92%

(99.08% accuracy) on MNIST. Classification error is calculated as the number of incorrectly classified images in the test set, divided by the number of examples in the test set. Table 6 presents the classification error of ReLU-LeNet and

MixLU-LeNet networks with 1-10% NegLUs.

Table 6: Performance (error) of MixLU-LeNet networks with increasing numbers of NegLUs

% NegLU | % error
0 | 0.92
1 | 0.95
2 | 1.02
3 | 1.22
4 | 1.14
5 | 1.26
6 | 1.43
7 | 1.45
8 | 2.53
9 | 2.68
10 | 90.20


These results may not seem initially promising – it appears we’ve simply succeeded in making successively worse versions of the network. However, as mentioned, these results should not be entirely unexpected, and a small increase in error should not be the only measure of the network’s performance. To examine behaviour further, the test and training errors are plotted over training epochs, shown in Figure 12. The solid line shows the training error, and the dotted line shows the test error. These plots reveal an interesting trend: as the number of NegLUs increases, the training error goes up much more quickly than the test error, which remains quite low until about 7% NegLUs. A higher training error with a similar test error says that while the network is doing worse on images it has seen, it still manages to do almost as well on images it has not seen – in other words, it is not overfitting to the training data.

These observations are quantified in Table 7 below. The test and training errors are shown for each network (0% NegLU being ReLU-LeNet), with the difference between test and training error indicating how much the network is fitting to the training data compared to test. Note particularly low values of this difference (i.e. closeness of test and training error) for low test errors at 4%, and also 1% and 7% NegLUs.


Table 7: Test and training error results for various MixLU-LeNets, showing that a network with 4% NegLUs achieves the largest increase in training error for the smallest increase in test error

% NegLUs | Test error | Training error | Test − Training | Test error increase | Training error increase | Test error increase / Training error increase
0 | 0.92 | 0.00 | 0.92 | 0 | 0 | ∞
1 | 0.95 | 0.12 | 0.83 | 0.03 | 0.12 | 0.30
2 | 1.02 | 0.15 | 0.87 | 0.07 | 0.03 | 2.68
3 | 1.22 | 0.34 | 0.88 | 0.20 | 0.19 | 1.2
4 | 1.14 | 0.36 | 0.78 | -0.08 | 0.02 | -5.13
5 | 1.26 | 0.48 | 0.78 | 0.12 | 0.12 | 1.28
6 | 1.43 | 0.61 | 0.82 | 0.17 | 0.13 | 1.59
7 | 1.45 | 0.85 | 0.60 | 0.02 | 0.24 | 0.14
8 | 2.53 | 1.29 | 1.24 | 1.08 | 0.44 | 1.98
9 | 2.68 | 2.09 | 0.59 | 0.15 | 0.80 | 0.32
10 | 90.20 | 90.13 | 0.07 | 0.07 | 88.04 | 14.20

The plots in Figure 12 demonstrate increased stochasticity around their convergence as %NegLUs increases. Again, this result should not be entirely unexpected, as part of the motivation for NegLUs was to help move out of local minima. This recalls the point made by (Konda, Memisevic and Kreuger 2015) about negative biases being useful in training but detrimental at test-time;

NegLUs appear to do their job too well.

Investigating further into the steps towards convergence shows another interesting result, as seen in Table 8. In comparing the error at each epoch between ReLU-LeNet and a MixLU-LeNet with 1% NegLUs, we see that the

MixLU net finds low-error regions before the ReLU network does, but then steps out of these regions while ReLU-LeNet converges more smoothly.


Figure 12: From left to right, top to bottom; performance on MNIST of MixLU-LeNet networks with 1, 2, 3, 4, 5 and 6% NegLUs in the fully-connected layer. Note that training error increases while validation error does not until ~6%.


Table 8: Error of the ReLU-LeNet and MixLU-LeNet with 1% NegLUs for the first 9 training epochs, highlighting that the MixLU net finds low-error regions faster, but then leaves them

Epoch | ReLU-LeNet error | MixLU-LeNet (1% NegLU) error
1 | 0.0207 | 0.0222
2 | 0.0154 | 0.0145
3 | 0.0146 | 0.0152
4 | 0.0130 | 0.0128
5 | 0.0118 | 0.0112
6 | 0.0109 | 0.0129
7 | 0.0113 | 0.0128
8 | 0.0131 | 0.0115
9 | 0.0103 | 0.0109

4.5.2 CIFAR-10

ReLU-LeNet5 achieves a classification accuracy of 79.81% on CIFAR-10. Table 9 presents the classification accuracy of ReLU-LeNet5 and MixLU-LeNet5 networks with 0.25-3% NegLUs.

Interestingly, compared to MNIST, a much lower percentage of NegLUs per layer caused the network to diverge, and this happened suddenly (between 2% and 3%), while the classification accuracy barely dropped for the 0.25% - 2% trials. This is likely due to the increased number of fully-connected layers in the LeNet5 architecture (4 as compared with just 1 in LeNet). This suggests that it is the total number of NegLUs in a network which determines its behaviour, not the proportion per layer.


Table 9: Performance (accuracy) of MixLU-LeNet5 networks with increasing numbers of NegLUs

% NegLU | % accuracy
0 | 79.81
0.25 | 79.72
0.50 | 79.72
1 | 79.64
2 | 78.85
3 | 10

We observe the same overfitting-reduction effect as seen on MNIST, to an even greater degree, with CIFAR-10, shown in Table 10 and in Figure 13 – as mentioned, test accuracy barely decreases, while training accuracy decreases more quickly. Table 10 shows in particular that the network with 0.5% NegLUs was able to achieve a large decrease in training accuracy with no decrease at all in test accuracy. However, we do not see the MixLU-LeNet5 networks finding low-error regions before the ReLU-LeNet5 – there does not appear to be any pattern in this case.

Table 10: Test, training, and difference of test and training accuracies for various MixLU-LeNet5s, showing that the network with 0.5% NegLUs was able to achieve a reduction in training accuracy with no drop in test accuracy.

% NegLUs | Test accuracy | Training accuracy | Training − Test | Test accuracy decrease | Training accuracy decrease | Test accuracy decrease / Training accuracy decrease
0 | 79.81 | 88.64 | 8.83 | 0 | 0 | ∞
0.25 | 79.72 | 87.05 | 7.33 | 0.09 | 1.59 | 0.06
0.50 | 79.72 | 85.57 | 5.85 | 0 | 1.4 | 0
1 | 79.64 | 84.44 | 4.80 | 0.08 | 1.13 | 0.07
2 | 78.85 | 82.43 | 3.58 | 0.79 | 2.01 | 0.39
3 | 10.00 | 10.00 | 0 | 68.85 | 72.43 | 0.95


Figure 13: From left to right, top to bottom; performance on CIFAR-10 of MixLU-LeNet5 networks with 0, 0.25, 0.5, 1, 2, and 3% NegLUs in the fully-connected layers. Note that training accuracy approaches test accuracy quite closely without a large decrease in accuracy, up to 2% NegLUs


4.6 Discussion

As we expected, we do not see a gain in performance from using NegLUs, but we do note some interesting effects. We don’t see the problem of dying ReLUs, presumably because the negative values were taken on by NegLUs. However, we note that this problem is difficult to quantify, and is encountered mostly in practical settings rather than studied theoretically, so this result would require further study to verify rigorously.

We also don’t see the problem of large biases in our networks. We suggest that the role of the biases gets tangled with the role of learning – to introduce variance to the activations of the network, we should want some units to go against the grain, not necessarily a large bias against some units ‘talking’ at all

(increasing only inhibition, instead of opposite-effect information). These units listen to all the wrong advice, and are mostly crazy. But by sharing their wild ideas with the next layer, they introduce a kind of regularization against pack mentality, or more formally, reduce the likelihood of overfitting.

We also see this reduction in overfitting in the difference between training and test errors; in MixLU networks we observe a large increase in training error

(decrease in training accuracy) relative to a small increase in test error (decrease in test accuracy). This suggests that the internal representations learned by networks containing NegLUs are at least somewhat different from those learned by ReLU-only networks, and that these internal representations are less overfit

to the training data, yet still manage to achieve a similar error/accuracy.

Coupled with the rougher convergence, we can infer that NegLUs are behaving as we expected by stepping in and out of local minima, or between saddle points.

(LeCun, Bengio and Hinton 2015) state that recent research suggests error landscapes with many local minima are not as large a problem as previously thought, but rather that many landscapes of practical importance have an abundance of saddle-points with minima of approximately equal value. We conjecture that MixLU networks find alternative convergences to conventional methods, which belong to a group of local minima all very close to being globally optimal. This conjecture is supported by the fact that most of the results on common benchmark datasets outdo one another by tenths of a percentage point in accuracy/error.


Chapter 5: Conclusions

5.1 Summary discussion

This work has presented up-to-date research in the evolution and function of biological neural networks and learning, and applied abstractions of this research to better understand state-of-the-art artificial neural networks. In doing so, we have linked specific, empirical phenomena in both fields with information theory and statistical physics. The aim of this work has been to establish intuition and language for talking about general neural networks and general intelligence which is solidly founded on empirical research, and which allows researchers from both fields to draw from each other’s work.

To this end, we have provided a framework for comparing biological and artificial neural networks, which in particular emphasizes the importance of inhibition, which allows biological neural networks to weigh “information for” and “information against” and to make “decisions for” and “decisions against”. We note that biological neural networks appear to display a full range of activity and learning behaviours which take advantage of both negative and positive information and actions, and note some areas where artificial neural networks appear not to. In particular, we suggest interpreting the activation of a neuron as an amount of neurotransmitter, and the weight as a number of receptors, and show that this leads to better intuition about the dynamics between neurons. We discuss the importance of allowing negative (zero-centred) activation functions, and the possibility of zero-centred activation-function derivatives. We discuss the implications of our framework and modified intuition for several recent developments in deep neural networks, providing reference material for researchers in deep learning to understand these developments in a biological context.

To investigate the properties of zero-centred activations and activation function derivatives in deep neural networks, we introduced the negative-linear rectifier

(NegLU), and tested this unit in convolutional architectures. The negative-linear rectifier outputs its weighted sum only if this value is negative, and outputs 0 otherwise. Its derivative is monotonically decreasing, which means units (and downstream units) can receive a weight update which is opposite to the direction of steepest descent. This might not sound like a good thing, but at a very high level, we can interpret this as sometimes telling the network to not remember the things it remembers, even when the goal of the network is to remember things. In other words, we hypothesized that, consistent with the role of temporally-inverted

STDP in biological systems, this could allow the network to better model exceptions-to-the-rules and not overfit to outliers.

While we do not obtain better-than-benchmark results, the models with small percentages of NegLUs in their fully-connected layers achieve almost the same test accuracy, while the training accuracy is lower – indicating that these models do almost as well on data they have not seen, while fitting the training data less closely. In view of recent research indicating that the search

landscapes for most practical problems contain many near-equivalent minima

(LeCun, Bengio and Hinton 2015), we conjecture that our networks find alternative internal representations which do almost as well in an optimality sense, while not overfitting on training data.

This brings us to an important point about training data and benchmarks. When models compete with each other over very small improvements and benchmark data is as well-studied and well understood as, for example, many image recognition datasets, the ‘practical problems’ to which (LeCun, Bengio and

Hinton 2015) refer become part of the bias in the input – in a way, we are training researchers to create better network architectures for the datasets we know about, more than we are training network architectures to better understand data.

5.2 Future directions

To further investigate the role of inhibition in learning, as we suggest in section

3.5.5 on dropout, it would be interesting to look at the dynamics of dropout in

MixLU networks. Following this, we propose a generalization of negative-linear rectification to form what we call Min-Maxout networks. Min units in a Min-

Maxout network output their largest negative weighted input, while Max units output the largest positive. This would allow the units to approximate any piece-

wise linear function, convex or concave. It would be interesting to compare these networks with MixLU networks.
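As a rough sketch of the idea (our own speculative notation; no such networks were implemented in this work), a Max unit and a Min unit over k weighted inputs could be written as follows, giving one convex and one concave piecewise-linear piece respectively:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)
k = 4
W_max = rng.normal(size=(k, 3))
W_min = rng.normal(size=(k, 3))

max_unit = (W_max @ x).max()    # Max unit: largest weighted input (convex piece)
min_unit = (W_min @ x).min()    # Min unit: most negative weighted input (concave piece)
print(max_unit, min_unit)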

An even further generalization of the ideas of inhibition and excitation introduced here would be to learn the degree and sign of activation functions in the same way weights are learned, instead of dedicating some units as

‘positive’ or ‘negative’. This idea would extend to convolutional filters for activation functions as well, which would have the potential of imitating the spatially-relevant dynamics of biological neurons. We call these networks

ConvAct networks.

Finally, an important area for further research is to test MixLU, Min-Max, and

ConvAct networks on larger, more realistic datasets. Beyond benchmark performance and optimizing an architecture for a particular task, as visualizations get better for deep neural networks, stepping between similar minima may be a helpful way to explore the feature space of new or noisy datasets, and we believe our networks could be useful for this. These networks could also be used generatively, as a teacher network which generates examples, or as a form of artificial neural network art. We would also like to investigate the possibilities of using MixLU, Min-Max, and ConvAct networks to simulate behaviour observed in real-world biological experiments.


References

Abdel-Hamid, O., L. Deng, and D. Yu. "Exploring convolutional neural network structures and optimization techniques for speech recognition." INTERSPEECH, 2013: 3366-3370.

Ackley, D., G. Hinton, and T. Sejnowski. "A learning algorithm for Boltzmann Machines." Cognitive Science, 1985.

Albert, Reka, and Albert-Laszlo Barabasi. "Statistical mechanics of complex networks." Reviews of Modern Physics, 2002.

Ardiel, E.L., and C.H. Rankin. "An elegant mind: Learning and memory in Caenorhabditis elegans." Learning and Memory, 2010: 191-201.

Ball, P. "Cellular memory hints at the origins of intelligence." Nature, 2008: 385-394.

Bell, C.C, A. Caputi, and K. Grant. "Physiology and plasticity of morphologically identified cells in the mormyrid electric fish." Journal of Neuroscience, 1997: 1840.

Bengio, Y., D.-H. Lee, J. J. Bornschein, and Z. Lin. Towards Biologically Plausible Deep Learning. Montreal Institute for Learning Algorithms, 2015.

Bi, Guo-quiang, and Mu-ming Poo. "Synaptic modification by correlated activity: Hebb's Postulate Revisited." Annual Review of Neuroscience, 2001: 139-166.

Bishop, C. Pattern Recognition and Machine Learning. Springer, 2006.

Bliss, T, G Collingridge, and R Morris. Long-term potentiation: enhancing neuroscience for 30 years. Oxford: Oxford University Press, 2004.

Bloomfield, S.A., and R.F. Dacheux. "Rod vision: pathways and processing in the mammalian retina." Progress in retinal and eye research, 2001: 351-384.

Buchanan, B.G. "A (Very) Brief History of Artificial Intelligence." AI Magazine, 2006: 53-58.

Carew, T.J. Behavioural Neurobiology. Sinauer Associates, 2000.

Choe, Yoonsuck. "Anti-Hebbian Learning." Edited by Dieter Jaeger and Ranu Jung. Encyclopedia of Computational Neuroscience. New York: Springer, 2014. 1-4.

Coates, A., H. Lee, and A.Y. Ng. "An analysis of single-layer networks in unsupervised feature learning." Artificial Intelligence and Statistics, 2011.

Cole, K.S., and H.J. Curtis. "Electrical impedance of the squid giant axon during activity." Journal of Physiology, 1939: 1-21.

Costa, R.P., R.C. Froemke, P.J. Sjostrom, and M.C.W. van Rossum. "Unified pre- and postsynaptic long-term plasticity enables reliable and flexible learning." eLife, 2015.

Cummings, B. Synapses. Pearson Education, 2006.

de Felice, P., C. Marangi, G. Nardulli, G. Pasquariello, and L. Tedesco. "Dynamics of neural networks with non-monotone activation functions." Network: Computation in Neural Systems, 1993.

De La Fuente, I.M., et al. "Information and long term memory in calcium signals." Quantitative Methods, 2015.

Dowe, D.L., J. Hernandez-Orallo, and K.P. Das. "Compression and intelligence: social environments and communication." Proceedings of the 4th international conference on artificial general intelligence. 2011.

Duerr, J.S., and W.G. Quinn. "Three Drosophila mutations that block associative learning also affect habituation and sensitization." Proceedings of the National Academy of Sciences, 1982.

Eccles, J.C. "Calcium in long-term potentiation as a model for memory." Neuroscience, 1983: 1071-1081.

Engel, A, and C Broeck. Statistical Mechanics of Learning. Cambridge University Press, 2001.

England, J.L. "Statistical physics of self-replication." Journal of chemical physics 139 (2013).

Erhan, D., Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio. "Why does unsupervised pre-training help deep learning?" Journal of Machine Learning Research, 2010: 625-660.

Feldman, D.E. "The spike timing dependence of plasticity." Neuron, 2013: 556-571.

Fernando, C.T., et al. "Molecular circuits for associative learning in single-celled organisms." Journal of the Royal Society Interface, 2009: 463-469.

Fields, R.D, and B. Stevens-Graham. "New insights into neuron-glia communication." Science, 2002: 556-562.

Fino, E., V. Paille, Y. Cui, T. Morera-Herreras, J.-M. Deniau, and L. Venance. "Distinct coincidence detectors govern the corticostriatal spike timing-dependent plasticity." Journal of Physiology, 2010: 3045-3062.

Gagliano, M., M. Renton, M. Depczynski, and S. Mancuso. "Experience teaches plants to learn faster and forget slower in environments where it matters." Oecologia, 2014: 63-72.

Glorot, X., A. Bordes, and Y. Bengio. "Deep sparse rectifier neural networks." Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. 2011.

Golden, R.M. Mathematical Models for Neural Network Analysis and Design. Cambridge, Mass.: MIT Press, 1996.

Goodfellow, I. J., D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. "Maxout networks." Proceedings of the 30th International Conference on Machine Learning, 2013.

Gulledge, A.T., and G.J. Stuart. "Excitatory actions of GABA in the cortex." Neuron, 2003.

Haas, J.S., B. Zavala, and C.E. Landisman. "Activity-dependent long-term depression of electrical synapses." Science 334 (2011): 389-393.

Handy, T.C., A.P. Jha, and G.R. Mangun. "Promoting novelty in vision: Inhibition of return modulates perceptual-level processing." Psychological Science, 1999: 157-161.

Hartline, H.K., H.G. Wagner, and F. Ratliff. "Inhibition in the eye of Limulus." Journal of General Physiology, 1956.

He, K., X. Zhang, S. Ren, and J. Sun. "Delving deep into rectifiers: surpassing human-level performance on ImageNet classification." Computer Vision and Pattern Recognition, 2015.

Health, National Institutes of. The Life and Death of a Neuron. 2015. https://www.ninds.nih.gov/disorders/brain_basics/ninds_neuron.htm.

Hinton, G. "The Ups and Downs of Hebb Synapses." Canadian Psychology, 2003: 44-47.

Hinton, G., N. Srivastava, A Krizhevsky, I. Sutskever, and R. Salakhutdinov. "Improving neural networks by preventing co-adaptation of features." Technical report, 2014, 1929-1958.

Hinton, G.E., and R.R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science, 2006: 504-507.

Hinton, G.E., S. Osindero, and Y.W. Teh. "A fast learning algorithm for deep belief nets." Neural Computation, 2006: 1527-1554.

Hodgkin, A.L., and A.F. Huxley. "A quantitative description of membrane current and its application to conduction and excitation in nerve." Journal of Physiology, 1952: 649-670.

Hopfield, J. "Neural networks and physical systems with emergent collective computational abilities." Proceedings of the National Academy of Sciences, 1982: 2554-2558.

Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. "Multilayer feedforward networks are universal approximators." Neural Networks, 1989: 359-366.

HowStuffWorks. Neurons. 2001. http://science.howstuffworks.com/life/inside-the-mind/human-brain/brain1.htm.

Hubel, D., and T. Wiesel. "Receptive fields and functional architecture of monkey striate cortex." Journal of Physiology, 1968: 215-243.

Hulbert, J.C., and M.C. Anderson. "The role of inhibition in learning." In Human Learning, by A.S.Benjamin, J.S. de Belle, B. Etnyre and T.A. Polk, 7-11. Elsevier, 2008.

Johansson, M., A. Aslan, K. Bauml, A. Gabel, and A. Mecklinger. "When remembering causes forgetting: Electrophysiological correlates of retrieval-induced forgetting." Cerebral Cortex 17 (2006): 1335-1341.

Johnson, J., J. Spencer, and G. Schoner. "Moving to higher ground: The dynamic field theory and the dynamics of visual cognition." New Ideas in Psychology, 2008: 227-251.

Jonides, J., E.E. Smith, C. Marsheutz, R.A. Koeppe, and P.A. Reuter-Lorenz. "Inhibition in verbal working memory revealed by brain activation." Proceedings of the National Academy of Sciences 95 (1998): 8410-8413.

Kandel, E.R., J.H. Schwartz, T.M. Jessell, S.A. Siegelbaum, and A.J. Hudspeth. Principles of Neural Science, 5th Ed. McGraw Hill, 2012.

Karpathy, A. Convolutional Neural Networks for Visual Recognition. 2015. http://cs231n.github.io/neural-networks-1/.

Kobayashi, Keichiro. "On the capacity of a neuron with a non-monotone output function." Network: Computation in Neural Systems, 1991.

Konda, K., R. Memisevic, and D. Krueger. "Zero-bias autoencoders and the benefits of co-adapting features." International Conference on Learning Representations. 2015.

Krizhevsky, A. Learning multiple layers of features from tiny images. University of Toronto, 2009.

Krizhevsky, A., I. Sutskever, and G. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems, 2012.

LeCun, Y., Y. Bengio, and G. Hinton. "Deep learning." Nature 521 (2015).

LeCun, Y., L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, 1998: 2278-2324.

LeCun, Yann, Leon Bottou, Genevieve B. Orr, and Klaus-Robert Muller. "Efficient BackProp." In Neural Networks: Tricks of the trade, by G. Orr and K.-R. Muller. Springer, 1998.

Lee, Chen-Yu, Saining Xie, P. Gallagher, Z. Zhang, and Z. Tu. "Deeply-Supervised Nets." 19th International Conference on Artificial Intelligence and Statistics. 2015.

Legg, Shane, and Marcus Hutter. "A Formal Measure of Machine Intelligence." Proceedings of the 15th Annual Machine Learning Conference of Belgium and the Netherlands. 2006.

Letzkus, J.J., B.M. Kampa, and G.J. Stuart. "Learning rules for spike timing-dependent plasticity depend on dendritic synapse location." Journal of Neuroscience, 2006.

Leys, S.P., G.O. Mackie, and R.W. Meech. "Impulse Conduction in a Sponge." The Journal of Experimental Biology 202 (1999): 1139-1144.

Lynch, M. "Long-term potentiation and memory." Physiological Reviews 84 (2004): 87-136.

Maas, A.L., A.Y. Hannun, and A.Y. Ng. "Rectifier nonlinearities improve neural network acoustic models." Journal of Machine Learning Research, 2013.

Magistretti, P.J. "Neuron-glia metabolic coupling and plasticity." Journal of Experimental Biology, 2006.

Magnasco, M. O. "Chemical Kinetics is Turing Universal." Physical Review Letters, 1997: 1190-1193.

Markram, H., W. Gerstner, and P.J. Sjostrom. "Spike-Timing-Dependent Plasticity: A Comprehensive overview." Frontiers in Synaptic Neuroscience, 2012.

Mhaskar, H.N., and C.A. Micchelli. "Approximation by superposition of sigmoidal and radial basis functions." Advances in Applied Mathematics, 1992: 350-373.

McCulloch, W.S., and W. Pitts. "A logical calculus of the ideas immanent in nervous activity." Bulletin of Mathematical Biophysics, 1943.

McMahan, H.B., et al. "Ad click prediction: A view from the trenches." Proceedings of the 19th ACM International Conference on Knowledge Discovery and Data Mining. 2013.

Merriam-Webster. Intelligence. 2015.

Nair, V., and G. Hinton. "Rectified linear units improve restricted Boltzmann machines." Proceedings of the 27th International Conference on Machine Learning. 2010.

Ng, Andrew. Machine Learning. 2015. https://class.coursera.org/ml-003/lecture.

Nie, Xiaobing, and Jinde Cao. "Multistability of memristive neural networks with non-monotonic piecewise linear activation functions." Advances in Neural Networks. Springer International Publishing, 2015. 182-191.

Nielsen, M. Neural Networks and Deep Learning. Determination Press, 2015.

Okano, Hideyuki, Tomoo Hirano, and Evan Balaban. "Learning and memory." Proceedings of the National Academy of Sciences 97 (2000).

Oxford English Dictionary, The. Intelligence. 2015.

Pearson, K.G. "Common principles of motor control in vertebrates and invertebrates." Annual Review of Neuroscience, 1993: 265-297.

Peterson, K.J., and E.H. Davidson. "Regulatory evolution and the origin of the bilaterians." Proceedings of the National Academy of Sciences. 2000. 4430-4433.

Pinkus, Allan. "Approximation theory of the MLP model in neural networks." Acta Numerica, 1999: 143-195.

Prindle, A., J. Liu, M. Asally, S. Ly, J. Garcia-Ojalvo, and G.M. Suel. "Ion channels enable electrical communication in bacterial communities." Nature 527 (2015): 59-63.

Purves, D., et al. Neuroscience, 2nd Ed. Sinauer Associates, 2001.

Rasch, B., and J. Born. "About sleep's role in memory." Physiological Reviews, 2013: 681-766.

Roberts, A.C., and D.L. Glanzman. "Learning in Aplysia: looking at synaptic plasticity from both sides." Trends in Neurosciences 26 (2003): 662-670.

Rojas, R. Neural Networks - A Systematic Introduction. Berlin: Springer-Verlag, 1996.

Russell, S.J., and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2003.

Ryan, T.J., and S.G.N. Grant. "The origin and evolution of synapses." Nature Reviews Neuroscience 10 (2009): 701-712.

Saigusa, T., A. Tero, T. Nakagaki, and Y. Kuramoto. "Amoebae anticipate periodic events." Physical Review Letters, 2008: 108-111.

Saxe, A., P.W. Koh, Z. Chen, M. Bhand, B. Suresh, and A. Ng. "On random weights and unsupervised feature learning." Proceedings of the 28th International Conference on Machine Learning. 2011.

Schmidhuber, J. "Deep learning in neural networks: An overview." Neural Networks, 2015: 85-117.

Serrano, P., Y Yao, and T Sacktor. "Persistent phosphorylation by protein kinase Mzeta maintains late-phase long-term potentiation." Journal of Neuroscience, 2005: 1979-84.

Sheehan, M.J., and E.A.Tibbetts. "Specialized face learning is associated with individual recognition in paper wasps." Science, 2011: 1272-1275.

Siegel, J.M. "The REM sleep-memory consolidation hypothesis." Science, 2001: 1058-1063.

Sonoda, Sho, and Noboru Murata. "Neural Network with Unbounded Activations is Universal Approximator." 2015.

Stock, J.B., and S. Zhang. "The biochemistry of memory." Current Biology, 2013: 741-745.

Stuffelbeam, R. Neurons, Synapses, Action Potentials, and Neurotransmission. 2008. http://www.mind.ilstu.edu/curriculum/neurons_intro/neurons_intro.php.

Trappenberg, T. Fundamentals of Computational Neuroscience, 2nd Ed. Oxford University Press, 2009.

Vedaldi, A., and K. Lenc. "MatConvNet - Convolutional Neural Networks for MATLAB." Proceedings of the ACM International Conference on Multimedia. 2015.

Verkhratsky, A. "Glial calcium signalling in physiology and pathophysiology." Nature, 2006.

Wan, L., M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus. "Regularization of Neural Networks using DropConnect." International Conference on Machine Learning (ICML). 2013.

Wang, L., and T. Chen. "Multistability of neural networks with mexican-hat-type activation functions." IEEE Transactions on Neural Networks and Learning Systems, 2012: 1816-1826.

Williams, Ronald J. "The Logic of activation functions." Parallel distributed processing: Explorations in the microstructure of cognition, 1986: 423-443.

Wilson, H.R. "Computational evidence for a rivalry hierarchy in vision." Proceedings of the National Academy of Sciences, 2003: 14499-14503.

Xie, X., and H.S. Seung. "Equivalence of backpropagation and contrastive Hebbian learning in a layered network." Neural Computation, 2003: 441-454.

Yoshizawa, S., M. Morita, and S. Amari. "Capacity of associative memory using a non- monotonic neuron model." Neural Networks, 1993: 167-176.

Zheng, P., X.-X. Zhang, B.S. Bunney, and W.-X. Shi. "Opposite modulation of cortical N-methyl-D-aspartate receptor-mediated responses by low and high concentrations of dopamine." Neuroscience, 1999: 527-535.
