
MEMRISTIVE CROSSBAR ARRAYS FOR SYSTEMS

A thesis submitted to the University of Manchester for the degree of Master of Philosophy in the Faculty of Science and Engineering by MANU V NAIR

2015

School of Electrical and Electronic Engineering

Contents

Abstract 7

Declaration 8

Copyright 9

Acknowledgements 10

The Author 11

1 Introduction 12

2 Hardware neural networks 15
2.1 Introduction...... 15
2.2 Neural Networks...... 15
2.3 Digital versus Analogue Neural Networks...... 18
2.4 Hardware Neural Network Architectures...... 20
2.4.1 Pulse-stream arithmetic based networks...... 20
2.4.2 TrueNorth...... 24
2.4.3 SpiNNaker...... 26
2.4.3.1 Node architecture...... 28
2.4.3.2 Event driven operation...... 29
2.4.3.3 Network communication...... 29
2.4.3.4 Neuron and Synapse model...... 30
2.4.4 Neurogrid...... 31
2.4.4.1 Shared dendrite structure...... 31
2.4.4.2 Neuron and Synapse...... 32


2.4.4.3 Communication...... 33
2.5 Programming neuromorphic hardware...... 35
2.6 Other noteworthy neuromorphic systems...... 35
2.7 Concluding Remarks...... 37

3 Memristive Learning 39
3.1 Memristors...... 39
3.1.1 Boundary condition model...... 41
3.2 Crossbar Arrays...... 43
3.3 Memristive circuits and systems...... 45
3.3.1 Crossbars...... 45
3.3.2 Memristor programming schemes...... 46
3.3.2.1 Unregulated write...... 46
3.3.2.2 Regulated write...... 48
3.3.3 STDP-based learning in memristive crossbar arrays...... 51
3.3.4 Back-propagation...... 53
3.3.5 Dynamical systems and other circuit applications...... 54

4 Gradient-descent in crossbar arrays 55
4.1 Introduction...... 55
4.2 Gradient descent algorithm...... 56
4.3 Gradient descent for linear classifiers...... 58
4.4 Gradient descent in crossbar arrays...... 59
4.5 Unregulated step descent...... 61
4.6 USD versus other methods...... 62
4.7 Implementation...... 63
4.8 Simulations...... 65
4.8.1 Simulation Setup...... 65
4.8.2 Initial condition analysis...... 67
4.8.3 Effect of device parameters on performance...... 69
4.8.4 Effect of variability on performance...... 72
4.8.5 Comparison against floating point implementation...... 73
4.8.6 Performance on the MNIST database...... 74

4.9 USD for other algorithms...... 75
4.9.1 Matrix Inversion...... 75
4.9.2 Auto-encoders...... 76
4.9.3 Restricted Boltzmann Machines...... 77
4.10 Concluding remarks...... 78

5 Conclusion 80

Bibliography 83

A Neural Networks and related algorithms 93
A.1 Introduction...... 93
A.2 Rosenblatt's Perceptron...... 93
A.3 Hopfield network...... 95
A.4 Boltzmann Machines...... 96
A.5 The self-organizing map (SOM)...... 98
A.6 Spiking Neural Networks (SNN)...... 100

List of Figures

2.1 Pulse stream neuron [13] © IEEE...... 22
2.2 A transconductance multiplier [8] © IEEE...... 22
2.3 Self-timed asynchronous communication scheme [8] © IEEE...... 23
2.4 Input pulse probability versus output pulse probability...... 24
2.5 TrueNorth architecture [15] © IEEE...... 25
2.6 A SpiNNaker node [10] © IEEE...... 28
2.7 The SpiNNaker machine [10] © IEEE...... 30
2.8 A network of 4 neurons and 16 synapses with 1-to-1 connectivity [11] © IEEE...... 31
2.9 A shared dendrite network of 4 neurons and 4 synapses [11] © IEEE...... 32
2.10 A Neurogrid neuron [11] © IEEE...... 33
2.11 Block diagram of a Neurogrid tree [11] © IEEE...... 34

3.1 The new circuit element: Memristor [3] © IEEE...... 40
3.2 Tree and Crossbar architecture used in Teramac [43] © IEEE...... 43
3.3 Crossbar array of memristors: 3-D and 2-D representation...... 44
3.4 A simple circuit schematic for (a) Unregulated writing into memristor and (b) Reading the state of the memristor...... 47
3.5 Schematic of a continuous feedback write scheme...... 48
3.6 Sneak paths in a crossbar array...... 49
3.7 Memristor-based analogue memory/computing unit [49] © IEEE...... 50
3.8 A 1T1M crossbar array access to the top left element [33] © IEEE...... 50
3.9 Segmented crossbar architecture [50] © IEEE...... 51
3.10 Voltage across the synapse ξ(∆T) for various action potential shapes [51] © Frontiers of Neuroscience...... 52


4.1 Gradient descent for a 2-dimensional objective function. F(w0,w1) in the figure is the same as F(w,x,y)...... 57
4.2 Block diagram of training module [68]...... 59
4.3 Convergence of USD algorithm when finding the minima of a paraboloid function...... 62
4.4 4-phase training scheme. The voltage levels in the pulsing scheme can only take three levels: +Vdrive, 0, and −Vdrive [68]...... 64
4.5 Convergence behaviour for different weight initializations...... 67
4.6 Weight updates versus iterations when initialized close to 0...... 68
4.7 Weight updates versus iterations when initialized to Gratio/2...... 68
4.8 Weight updates versus iterations when initialized close to 1...... 69
4.9 Weight updates versus iterations when initialized to uniformly distributed random values between 0 and 1...... 70
4.10 Evolution of classification error with iterations [68]...... 71
4.11 Classification error versus Gratio. The bands straddle the maximum and minimum classification error because of settling error at convergence [68]...... 71
4.12 Settling time (Niters) versus Gratio for different values of α (Alpha) [68]...... 72
4.13 Classification Error (Error) versus σ (sigma)...... 73
4.14 Number of training iterations (Niters) versus σ (Sigma). Training time increases as variability increases [68] © IEEE...... 73
4.15 Effect of variability on the spread of Gon (y-axis) and Goff (x-axis) values, shown using 1000 memristive device samples with mean Goff = 0.003 S and mean Gratio = 100. Note that both the x and y-axis are in log-scale [68] © IEEE...... 74
4.16 Performance of the stochastic USD implementation in comparison to floating-point for various values of pe [68] © IEEE...... 75

A.1 The perceptron...... 94
A.2 A Hopfield network...... 95
A.3 Boltzmann machines and Restricted Boltzmann machines...... 97
A.4 A self-organizing map...... 99
A.5 A synapse. The blocks shown are state-dependent and characterized by various dynamical effects leading to different types of spikes...... 100
A.6 IPSP, EPSP, and action potential...... 101

A.7 Data-types and symbols used in the LIF neuron model [9] © IEEE...... 102
A.8 STDP waveform for the action potential shown in Figure A.6...... 103

Abstract

This thesis is a study of specialized circuits and systems targeted towards machine learning algorithms. These systems operate on a computing paradigm that is different from traditional Von Neumann architectures and can potentially reduce power consumption and improve performance over traditional computers when running specialized tasks. To study them, case studies covering implementations such as TrueNorth, SpiNNaker, Neurogrid, pulse-stream-based neural networks, and memristor-based systems were carried out. The use of memristive crossbar arrays for machine learning was found particularly interesting and was chosen as the primary focus of this work. This thesis presents an Unregulated Step Descent (USD) algorithm that can be used to train memristive crossbar arrays to run algorithms based on gradient-descent learning. It describes how the USD algorithm can address hardware limitations such as variability, poor device models, and the complexity of training architectures. The linear classifier algorithm was primarily used in the experiments designed to study these features. This algorithm was chosen because its crossbar architecture can easily be extended to larger networks. More importantly, using a simple algorithm makes it easier to draw inferences from experimental results. Datasets used for these experiments included randomly generated data and the MNIST digits dataset. The results indicate that the performance of crossbar arrays trained using the USD algorithm is reasonably close to that of the corresponding floating-point implementation. These experimental observations also provide a blueprint of how training and device parameters affect the performance of a crossbar array and how it might be improved. The thesis also covers how other machine learning algorithms, such as logistic regression, multi-layer perceptrons, and restricted Boltzmann machines, may be implemented on crossbar arrays using the USD algorithm.

Declaration

No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Copyright

1. The author of this thesis (including any appendices and/or schedules to this thesis) owns any copyright in it (the Copyright) and he has given The University of Manchester the right to use such Copyright for any administrative, promotional, educational and/or teaching purposes.

2. Copies of this thesis, either in full or in extracts, may be made only in accordance with the regulations of the John Rylands University Library of Manchester. Details of these regulations may be obtained from the Librarian. This page must form part of any such copies made.

3. The ownership of any patents, designs, trade marks and any and all other intellectual property rights except for the Copyright (the Intellectual Property Rights) and any reproductions of copyright works, for example graphs and tables (Reproductions), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property Rights and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property Rights and/or Reproductions.

4. Further information on the conditions under which disclosure, publication and exploitation of this thesis, the Copyright and any Intellectual Property Rights and/or Reproductions described in it may take place is available from the Head of School of Electrical and Electronic Engineering (or the Vice-President) and the Dean of the Faculty of Life Sciences, for Faculty of Life Sciences candidates.

Acknowledgements

When I started this course, I came to Dr Piotr Dudek with a vague idea of what I would like to do. He gave me the freedom and resources to explore problems that interested me. His guidance and openness to my (often eccentric) ideas were and continue to be extremely motivating. I have never been as consistently driven as I have been in the past one and a half years. I would like to thank him for having me as his student and for being such a great supervisor. My super-cool parents shielded me from worrying about funding and the myriad other small things that come up when pursuing a program away from the motherland. For this and countless other things, they will have my eternal gratitude. I also have to mention my girlfriend, who was understanding and patient while I gave up a job and stayed away to do arcane research. Thank you, and I promise to make it up to you.

The Author

I completed my Bachelor of Technology in Electronics and Communication Engineering at the National Institute of Technology Karnataka, India in 2010. Subsequently, I worked as a Design Engineer at Analog Devices Inc (ADI) at their Bangalore design centre for 3.5 years. At ADI, I was involved in the development of mixed-signal custom SoCs for low-power and bio-medical applications. I worked on the design and development of both analogue and digital blocks, such as a reconfigurable cryptography module and a low-power band-gap reference, as well as on mixed-signal verification. I started the MPhil program at the University of Manchester in January 2014. My area of research is primarily the development of novel computational hardware. In particular, I am interested in designing systems for machine learning algorithms. My work explores the use of mixed-signal circuitry and memristors for tasks that have generally been the domain of general-purpose digital computers.

Chapter 1

Introduction

The story of human civilization is also the story of computation. It took humanity hundreds of thousands of years to invent zero (formalized in the 7th century AD by the Indian mathematician Brahmagupta [1]). However, in the roughly 1500 years that followed, humanity progressed at a pace incomparable to any known age before it. This time also saw the development of increasingly sophisticated computational tools such as the abacus, slide rules, mechanical calculators, and difference engines. For a computer engineer, the evolution of these "computers" is no less fascinating than human evolution itself. The machines built until the 1940s, such as abacuses, arithmometers, and Curta calculators, were dedicated computational devices meant to perform large calculations, incapable of conditional statements or branching. Starting from the 1940s, Turing-complete [2] machines started being built. These were general-purpose digital computers capable of running any algorithm given enough time and resources. In the next 70-odd years, they would take over and transform the world. However, researchers realized that being Turing-complete did not make these computers the most efficient or fastest computational architecture for many tasks. This realization spawned new architectures and designs such as graphics processors, FPGAs, and DSP cores containing FFT and MAC units. These systems were significantly more efficient and faster than a general-purpose computer for their targeted applications. However, in terms of learning capability and efficiency, they are all inferior to the human brain. Even before the age of computing, the philosophical question of consciousness and intelligence has instigated many thinkers. The development of digital computers gave this field of study a set of sophisticated tools that it lacked earlier. Advances in learning algorithms

such as neural networks coincided with this development. More recently, the proliferation of data-intensive statistical models and algorithms has given a new push to this area of research. While the study of neural networks has not had the glamorous impact that digital computers have had, they are becoming increasingly potent tools for developing intelligent machines. We have barely scratched the surface of the problem, and yet it has revealed a plethora of techniques that can be used to build better machines. The goal of the work presented in this thesis was to explore computational architectures that are better tuned for machine learning algorithms belonging to the neural network family. These algorithms are used in many real-world inference, prediction, and classification applications. Current implementations use millions of artificial neurons and synapses that run on computer farms spanning several thousand cores. This approach is highly inefficient and does not scale well. It also precludes low-power portable devices from harnessing the power of these algorithms. It is therefore important that we explore computational architectures and techniques that are at least an order of magnitude better in terms of computational power and efficiency. Several attempts have been made by various research groups to build such machines, such as SpiNNaker, TrueNorth, and Neurogrid. Understanding these systems requires a good understanding of circuit design as well as machine learning algorithms. The thesis does not attempt to revisit the algorithms in detail. However, a brief overview of a few important algorithms is provided in the appendix for reference. This document does cover a fairly detailed analysis of the hardware aspects of some important neuromorphic systems. These systems are essentially massive hardware neural networks whose primary goal is to model the brain.
It must be noted that some of these systems are not capable of running non-spiking neural networks such as multi-layer perceptrons, restricted Boltzmann machines, and convolutional neural networks. Nevertheless, they incorporate design ideas that are also relevant for any type of neural network implementation. This work also dealt with an exciting new family of devices called memristors. This is a new high-density non-volatile memory technology that was first identified and demonstrated as a memristor by HP Labs [3]. In addition to studying the properties of these devices, we also surveyed the work done by various research groups in utilizing these devices for different applications. Finally, the original contribution of this thesis is a method to use the gradient-descent optimization algorithm to train memristive crossbar arrays to run machine learning algorithms efficiently. It proposes novel approximations and implementation techniques in order to achieve that goal. In addition to discussing the algorithmic modifications and their effect on the learning performance of the array, the study also covers a thorough analysis of the effect of device non-linearities and variability on computational performance. This work was published in a paper titled "Gradient-descent-based learning in memristive crossbar arrays" and presented at the International Joint Conference on Neural Networks 2015 (IJCNN-15). Another journal paper is currently being written. This thesis is structured as follows:

1. Introduction - Contains a summary of the work presented in this thesis.

2. Hardware neural networks - Describes various ideas and problems associated with designing and building some of the most advanced hardware neural networks.

3. Memristive learning - Covers an introduction to memristors and crossbar arrays followed by a description of how these devices are being used for machine learning and neuromorphic applications.

4. Gradient-descent in crossbar arrays - This chapter contains a description of the Unregulated Step Descent algorithm, which is an original contribution of this work. The algorithm allows designers to program memristive crossbar arrays to run machine learning and neural network algorithms. We study how well the algorithm performs and how device variations affect computational performance.

5. Conclusion - Summarizes all the findings and ideas, and muses over the future direction of research in this field.

6. Appendix - The appendix contains a brief description of the fundamental ideas underlying algorithms such as Rosenblatt's Perceptron, Restricted Boltzmann Machines, Self-organizing maps, and Spiking neural networks.

Chapter 2

Hardware neural networks

2.1 Introduction

This chapter is a survey of ideas used in some of the most successful hardware neural networks available today. It does not attempt to be an exhaustive collection of all existing systems and only dwells on ideas that are of relevance to the work presented here. It starts with a general discussion of neural networks. The rest of the chapter is an overview of various challenges and considerations in designing hardware neural networks, using a few selected systems, namely pulse-stream networks, TrueNorth, SpiNNaker, BrainScaleS, and Neurogrid, for reference.

2.2 Neural Networks

The human brain is unparalleled in computational efficiency and capability. The field of neural networks owes its origins to our desire to build computers that are as capable as the human brain. However, even after decades of study, there is only a rough understanding of how the brain actually does any computation. While we do have rough models and theories based on observations and experimental data, none of them explain the computational architecture of a brain satisfactorily. From a computational point of view, some of the interesting observations are as follows:

1. The human brain is made of around 100 billion neurons, each of them connected to about 10000 other neurons via synapses.


2. It is difficult to assign a specific role to a particular neuron or synapse. However, it is observed that neuronal activity increases in a particular region for a particular stimulus. This observation led to the famous Hebbian learning principle [4].

3. The brain is highly non-linear and dynamic.

4. There appears to be some redundancy in the brain architecture and the network can recover from erratic neurons.

5. The communication appears to be in the form of spikes. It is widely believed that the information is encoded in the temporal properties of the spike trains and that the precise voltage levels are unimportant.

6. The brain is significantly more power efficient than the best computer we have today for pattern recognition and inference tasks.

7. The brain is highly plastic, with new connections being formed and old connections broken continually.
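Observation 2 above alludes to Hebbian learning [4], whose simplest rate-based form strengthens a synapse in proportion to the product of pre- and post-synaptic activity. The sketch below is a textbook caricature of that rule; the learning rate and activity values are arbitrary, not drawn from this thesis:

```python
def hebbian_update(w, pre, post, eta=0.1):
    """Basic rate-based Hebbian rule: dw = eta * (pre activity) * (post activity)."""
    return w + eta * pre * post

# Correlated pre- and post-synaptic activity strengthens the connection.
w = 0.0
for _ in range(10):
    w = hebbian_update(w, pre=1.0, post=1.0)
# After 10 correlated events, w = 10 * 0.1 * 1.0 * 1.0 = 1.0
```

Note that this plain rule only ever grows the weight; practical variants add decay or normalization to keep weights bounded.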

It is quite clear that the brain is an excellent pattern recognition machine. But, in spite of all our work, a unifying theory that explains exactly what happens in the brain is still far out of reach. However, even with a crude understanding of the brain, designers have been able to build some very impressive networks and systems. The explosion in Internet usage has generated data in astronomical quantities. This phenomenon is called Big Data. The inhuman effort required to work with such data has resulted in a huge demand for intelligent systems whose goal is to find and identify patterns in raw data. These systems run on algorithms from the domains of statistical inference and machine learning. As it stands today, the most successful techniques being used on these problems are based on neural networks. These techniques have revolutionized virtually every business and research activity that deals with Big Data by endowing them with excellent tools for prediction and analysis. A simplistic definition of a neural network would be a network of neurons and synapses. But there is much more to it. The neurons and synapses themselves can be treated as simple pulses and connections or as complex function generators and channels. Within a single layer, neurons can be connected in several configurations, and each configuration has its own unique set of properties. The interactions between different layers also depend heavily on how the connections are made. One of the most pertinent questions for a scientist studying the brain is how nature has made these connections. One of the most influential ideas in the early days of neural networks was that of the perceptron. It modelled the brain as a linear combination of inputs followed by a non-linear operation. Several such layers connected to form multi-layer perceptrons or Madaline [5]. Additionally, several other interesting ideas emerged.
One of them was the auto-associative memory based on Hebbian learning, such as Hopfield networks [6]. There were others too, such as Boltzmann machines, auto-encoders, and convolutional neural networks. While these algorithms borrowed liberally from neuroscience and its understanding of the brain, they also incorporated a large set of ideas from mathematics and physics. Some of these neural network algorithms have deviated significantly from our understanding of the brain and are an interesting class of networks in their own right. The computational power of neural networks did not escape the attention of circuit designers. As a general rule, most circuit designers grapple with the inevitability of the device scaling wall. The problem is that as devices scale down, they become increasingly erratic. The other issue arising from scaling is higher power density. Alternatives to silicon are extremely unreliable or hard to work with. The observation that the brain can compute with highly unreliable devices at very low power levels is, therefore, extremely intriguing. If one could incorporate these features in artificial neural networks, it might result in computational devices that are more power efficient and fast. Neural networks have been well known for a while now. However, hardware neural networks never became mainstream because they could never beat general-purpose processors running neural network algorithms. MOS technology has been more resilient than originally expected, and circuit designers have been able to push the scaling wall all the way to 10 nm. However, the limits are near, as can be seen from the slowdown in the development of faster and denser processors. Current microprocessor designs attempt to overcome these limits by increasing the number of processor cores. Alternative computational architectures are also being explored. Interest in hardware neural networks is therefore picking up again.
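The perceptron mentioned earlier (a linear combination of inputs followed by a non-linear operation) is compact enough to sketch directly. The hard threshold used here as the non-linearity and the hand-picked weights implementing logical AND are illustrative choices, not taken from [5]:

```python
import numpy as np

def perceptron(x, w, b):
    """One perceptron unit: a weighted sum of the inputs plus a bias,
    passed through a hard-threshold non-linearity."""
    return 1 if np.dot(w, x) + b > 0 else 0

# A perceptron implementing logical AND (weights chosen by hand):
# the output fires only when both inputs are active.
w, b = np.array([1.0, 1.0]), -1.5
out = perceptron(np.array([1, 1]), w, b)  # 1
```

In a multi-layer perceptron, the outputs of one such layer of units become the inputs of the next.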
There is an abundance of literature on hardware neural networks (HNNs) written over the last few decades. [7] provides a comprehensive review of many successful implementations. In general, all HNNs grapple with similar issues:

1. Inter-connect density: A fully-connected neural network involving physical connections between neurons can only be built in 3-D space. It becomes prohibitively expensive to create a topologically equivalent connectivity using current 2-D fabrication methods. Instead, circuit designers incorporate multiplexing schemes.

2. Memory: Unlike the general-purpose Von Neumann architecture, HNNs work best when each synapse has access to its weights locally. Storing them in a separate dedicated memory is not ideal because these weights have to be accessed frequently. This increases the power and memory bandwidth requirements dramatically as the network scales up.

3. Device variability: Device variability is an important issue, particularly when working with analogue HNNs. While the algorithms for training neural networks are partially tolerant to it, good design is essential to achieve better performance.

The reason memristive crossbar arrays have grabbed a lot of attention is that they might solve the first two problems listed above. Before discussing such systems, it is worthwhile to study some of the existing hardware neural networks. Designing HNNs is a complex task with several factors to consider. The first major decision to be made is the choice of learning algorithm or neural network itself. There are several choices, such as MLPs, RBMs, CNNs, auto-encoders, and SOMs. This choice sets constraints on the architecture and performance of the various building blocks. A brief summary of these algorithms is covered in the appendix. The designer then has to choose between a digital or analogue implementation for the building blocks. The interconnection architecture is another factor to consider. There are broadly two choices here: a synchronous or an asynchronous event-driven model. In many applications, it is not feasible to put all the blocks in a single chip. In such cases, inter-chip connectivity schemes have to be designed, and timing considerations become more critical. The discussion in this chapter covers these issues by using some of the successful state-of-the-art implementations as examples.

2.3 Digital versus Analogue Neural Networks

The arguments for a partial or fully-digital VLSI implementation tend to be quite similar to the arguments for why digital signal processing is more advantageous than analogue signal processing. A list of advantages of adopting the digital method is as follows:

1. Design methodology is standardized and abstracted. Once the network is designed, the actual synthesis and fabrication steps will be quite similar to any well-established digital VLSI design flow.

2. The noise margins are high because digital signals operate at only two levels.

3. It is easier to interface digital blocks to standard digital systems such as FPGAs or SoCs. It does not require any converters or voltage scaling operations which makes the system cheaper.

4. Programmable weights are easy to implement using digital memories. This was perhaps the strongest advantage that the digital option had over analogue. Perhaps the best option for analogue designers before the advent of memristors was the floating-gate MOSFET, which was (and still is) expensive.

On the other hand, analogue computation also has some advantages over digital computation.

1. Digital NNs tend to suffer from a performance bottleneck arising from frequent memory accesses. Every weight or parameter in the neural network has to be fetched from memory, processed, and updated if necessary. This costs time and power. On the other hand, the memory elements in analogue NNs double up as computational units, eliminating the need to move the data around, improving computational speed and reducing power consumption.

2. Digital circuits take up more transistors than their analogue counterparts. However, this is a questionable advantage because digital CMOS circuits scale much better.

The difficulty in designing analogue NNs has traditionally outweighed their benefits. The building blocks used in analogue NNs are generally more complicated to build than their digital counterparts. These designs do not lend themselves well to large-scale automated design and synthesis. One also needs to consider factors such as noise, parasitics, and device variability. In order to be competitive with digital NN implementations, the devices used in analogue NNs should be smaller, which is difficult. In CMOS or bipolar technologies, these issues make it intractable to build large analogue hardware neural networks consisting of millions or even thousands of neurons and synapses. The most compelling reason memristive crossbar arrays are of interest is the potential ease of design and fabrication they bring to analogue neural networks. However, these devices come with a baggage of design problems that have not yet been fully addressed. One of the outcomes of the work presented in this thesis is a method to overcome some of those issues.

2.4 Hardware Neural Network Architectures

In this section, we discuss the following:

1. VLSI neural networks using pulse-stream arithmetic [8]

2. TrueNorth [9]

3. SpiNNaker [10]

4. Neurogrid [11]

This is by no means an exhaustive list of all existing hardware neural networks, but they form a good representative set to illustrate the various problems associated with their design, and they have also been highly influential. A detailed discussion of the low-level circuit descriptions of the various blocks in these networks is beyond the scope of this document. Only those aspects of the design that highlight issues of relevance to this thesis are discussed here; references are provided to the original works. A good reference for the fundamental building blocks of these circuits is [12].

2.4.1 Pulse-stream arithmetic based networks

Pulse stream arithmetic based networks try to combine the best properties of analogue and digital VLSI systems using clever circuit and statistical techniques. Specifically, these techniques try to overcome analogue circuit shortcomings using digital circuits and vice versa.

1. Analogue circuits are susceptible to noise and variability, while digital signals have large noise margins and are much less affected by noise.

2. Analogue signals are sensitive to errors during transmission of data unlike digital on- off signals.

3. Digital multiplication and addition are area-intensive and power-hungry, unlike their analogue counterparts. On the other hand, digital signal processing circuits provide much better accuracy than their analogue counterparts.

4. Unlike digital memories, analogue memory technologies such as charge-coupled devices (CCD), metal-nitride-oxide-silicon (MNOS) memories, flash memories, and floating-gate memories are difficult to program and handle. However, this is one of the shortcomings that might go away with the development of memristive crossbar arrays.

The method that pulse-stream networks adopt is to combine digital memory storage with analogue computation. Instead of using converters to translate between analogue and digital signals, the information is encoded as pulses. Different types of encoding schemes have been proposed and implemented such as:

1. Pulse amplitude modulation

2. Pulse width modulation

3. Pulse delay modulation

4. Pulse frequency modulation

A pulse-stream neuron is essentially a voltage-controlled oscillator that collects pulse-stream inputs and uses them to modulate the frequency of its output pulses. As shown in Figure 2.1, inputs arriving from other connected neurons are integrated and used to model the neuronal activity of the node. Note that part or all of the information is encoded in the timing of the pulses. This means that, to obtain good results, the system has to monitor a sequence of pulses. The other point of interest is that although there are some structural similarities between spiking neural networks (Appendix) and pulse-stream techniques, pulse-stream networks are based on different operating principles and pre-date them. Implementing a pulse-stream synapse is trickier because the signals have to be weighted according to the strength of the connection. One of the most interesting circuits here is the transconductance multiplier. In the circuit shown in Figure 2.2, the two transistors,

Figure 2.1: Pulse-stream neuron [13] © IEEE

M1 and M2, are biased in the linear (triode) region. The idea is to size the devices such that the 2nd-order term in the output current I3 is cancelled. It can be shown that the net output current is given by:

I3 = µ Cox (W1/L1) (VGS1 − VGS2) VDS1    (2.1)

The output of the circuit is a stream of current pulses whose magnitude is proportional to Tij and whose frequency is proportional to Sj. The circuit shown in Figure 2.2 is highly susceptible to variability and mismatch, and schemes have been suggested to overcome these [8].
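Equation (2.1) can be evaluated numerically. The sketch below is illustrative only: the device parameters (mobility, oxide capacitance, geometry) are assumptions, not values from the original paper.

```python
# Sketch: evaluating the transconductance-multiplier output of Eq. (2.1),
# I3 = mu * Cox * (W1 / L1) * (V_GS1 - V_GS2) * V_DS1.
# All device parameters below are illustrative assumptions.

def multiplier_current(v_gs1, v_gs2, v_ds1,
                       mu=350e-4,       # carrier mobility, m^2/Vs (assumed)
                       cox=3.45e-3,     # oxide capacitance per area, F/m^2 (assumed)
                       w1=10e-6, l1=2e-6):  # device geometry, m (assumed)
    """Ideal output current of the triode-region multiplier."""
    return mu * cox * (w1 / l1) * (v_gs1 - v_gs2) * v_ds1

# The output is bilinear: doubling the gate-voltage difference doubles I3.
i_a = multiplier_current(1.2, 1.0, 0.1)
i_b = multiplier_current(1.4, 1.0, 0.1)
print(i_b / i_a)  # ratio of the two input differences, 0.4 / 0.2
```

The bilinearity is the whole point of cancelling the 2nd-order term: the pulse magnitude then scales cleanly with the stored weight.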

Figure 2.2: A transconductance multiplier [8] © IEEE

The next structure required for completing the connections between neurons and synapses is the connection fabric. Given the requirement for a large number of neurons, it is not always feasible to connect distant neurons directly. To solve that problem, the designers [8] proposed communication schemes that generally fall into two categories:

1. Synchronous time-division multiplexed (TDM) transmission

2. Asynchronous self-timed transmission

The synchronous TDM scheme is fairly self-explanatory. It assigns a predefined slot to each neuron within a given window. If a pulse is generated by a neuron, a 1 is transmitted in the neuron's time slot; if no pulse is generated, a 0 is transmitted. Implementing such a scheme requires a global clock. However, designing the clock tree can be tricky when the number of neurons is large. One of the solutions proposed to overcome this problem was the self-timed scheme illustrated in Figure 2.3. In this scheme, the receiver and the transmitter use the ready-to-transmit (RTT) and ready-to-receive (RTR) lines to establish an asynchronous handshaking mechanism. Once the communication link is established, the transmitter cycles through the incoming streams of data and encodes pulse information, such as the time between pulses, using a pulse-width modulation scheme. The receiver knows the address of the source neuron because of the preliminary handshake. The similarity between this scheme and the address-event representation (AER) scheme, described in the next section, may be noted.
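The synchronous TDM scheme above amounts to giving each neuron a fixed bit-slot per frame. The following sketch shows that idea; the frame layout is an illustrative assumption, not the exact hardware format.

```python
# Sketch of the synchronous TDM scheme: each neuron owns one bit-slot per
# frame; a 1 in a slot means that neuron fired in this window.

def encode_frame(fired, n_neurons):
    """Pack the set of firing neurons into one TDM frame (list of bits)."""
    return [1 if i in fired else 0 for i in range(n_neurons)]

def decode_frame(frame):
    """Recover which neurons fired from their fixed slot positions."""
    return {i for i, bit in enumerate(frame) if bit == 1}

frame = encode_frame({0, 3}, n_neurons=8)
print(frame)                 # [1, 0, 0, 1, 0, 0, 0, 0]
print(decode_frame(frame))   # {0, 3}
```

Because the slot position itself identifies the source neuron, no address needs to be transmitted, but every neuron consumes a slot in every frame whether it fires or not, which is why a global clock and a fixed window are unavoidable here.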

Figure 2.3: Self-timed asynchronous communication scheme [8] © IEEE

Pulse signals incoming from multiple sources are summed up using a summer circuit. There are two distinct implementation choices for this block:

1. Voltage pulse addition

2. Current pulse addition

One of the methods proposed to sum multiple pulse streams was the OR logic operation. If pi is the probability of a pulse arriving at the i-th input of an OR gate, then the probability of an output pulse, pout, can be computed as:

pout = 1 − (1 − p0)(1 − p1)...(1 − pN)    (2.2)

However, if we were to use a normal adder instead, the output would be Σ pi. As the number of inputs or the magnitudes of the probabilities increase, pout increasingly deviates from this value. This means that the addition accuracy worsens as the frequency of pulses increases. While this can be thought of as a disadvantage, it is not necessarily one. In most neural network implementations, the summer is followed by a sigmoid non-linearity. As shown in Figure 2.4, the probability of an output pulse from the OR adder looks a lot like a sigmoid, which eliminates the need for a dedicated sigmoid generator. However, there are two problems with this approach. One is that the shape of the sigmoid depends on the number of inputs. The other is that as the number of inputs increases, the output saturates faster.
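The behaviour described above is easy to check numerically: for sparse pulse streams the OR output tracks the linear sum, while for dense streams it saturates. A minimal sketch, with the probabilities chosen purely for illustration:

```python
# Comparing OR-gate pulse summation, Eq. (2.2), with an ideal adder.
# pout = 1 - prod(1 - pi) saturates (sigmoid-like) as sum(pi) grows.

from functools import reduce

def or_sum(probs):
    """Probability of an output pulse from an OR gate with independent inputs."""
    return 1.0 - reduce(lambda acc, p: acc * (1.0 - p), probs, 1.0)

def ideal_sum(probs):
    """Output of an ideal adder (no saturation)."""
    return sum(probs)

low = [0.01] * 4    # sparse pulses: OR output is close to the linear sum
high = [0.4] * 8    # dense pulses: OR output saturates well below sum(pi)

print(or_sum(low), ideal_sum(low))    # ~0.0394 vs 0.04
print(or_sum(high), ideal_sum(high))  # < 1.0 vs 3.2
```

This also makes the two drawbacks visible: the saturation curve shifts with the number of inputs, and with many inputs the output pins near 1 for even moderate input rates.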

Figure 2.4: Input pulse probability versus output pulse probability

While pulse-based neural network solutions are easier to design and implement than purely analogue implementations, they are still expensive and do not scale well for large neural networks consisting of thousands or millions of neurons and synapses. They eventually run into issues such as area, connectivity density, mismatch, design complexity, etc. Timing and connectivity are other aspects that need to be considered. Using the encoding schemes described in this section does help alleviate the issue, but as the number of neurons increases, communication speed becomes a bottleneck. Lowering the pulse widths makes the design more complex, but that is probably not a critical issue with current technology.

2.4.2 TrueNorth

TrueNorth is a purely digital neuromorphic processor architecture built by IBM [14]. As shown in Figure 2.5, the design is based on a crossbar array of SRAM memory cells. As in most neuromorphic systems, the basic building blocks are neurons, axons, and synapses. The chip consists of 4096 cores, each containing 256 integrate-and-fire neurons and a 1024 × 256 SRAM crossbar memory for synapses. The axons and neurons are connected by crossbar synapses, of which there are about a million. Each of the axons is assigned a type Gj, which is stored in memory as shown in Figure 2.5. Gj is the synaptic weight and can take different values, corresponding to inhibitory or excitatory types with different efficacies. The crossbar synapses merely indicate the presence or absence of a connection between an axon and a neuron.

Figure 2.5: TrueNorth architecture [15] © IEEE

The operation of a TrueNorth neuron is based on the leaky integrate-and-fire model described in the Appendix. The use of fixed-point computation to model the neuron is where this design stands out from most other approaches. An important thing to note is that, while the computation at each node involves multi-bit calculations, the spike itself is modelled by a single bit. Every spike event is encoded by an output encoder into an address-event representation (AER), which encodes binary activity by sending the address of the spiking neuron over a multiplexed channel [16]. The advantage of the AER method is that it enables asynchronous transmission of spiking activity, which results in power savings. Processing within a core occurs in two time steps: an integration phase and a time-step synchronization phase. In the integration phase, the decoder calculates the source addresses of incoming spikes from the incoming AER-encoded pulses. All the axonal inputs are gathered by the input decoders and sent to their respective neurons. In the synchronization phase, a Sync event is sent to all the neurons in the system. The neurons use the information that arrived in the integration phase to update their membrane potentials. In this phase, the neurons may also fire if their membrane potentials are higher than the programmed threshold values. If a spike is produced, the membrane potential is reset to 0. While the transmission of spikes is asynchronous, the design ensures that there is some crude global synchronization. This is important to ensure that the hardware and the algorithm run in tandem. Because of the statistical nature of the spiking neurons and the variability in delays, it is possible that a given spike will miss its time window. This problem is addressed at the algorithmic level using techniques that are tolerant of a few misfiring neurons. Finally, it has to be noted that the system is not capable of any training.
The parameters are computed off-line and programmed onto the chip. The creators have been able to successfully port some simple applications to this system, such as an autonomous virtual-robot driver, a Pong player, MNIST digit recognition, and auto-association [15]. It must be noted that current state-of-the-art spiking neural networks use a much larger number of neurons and synapses than is supported by a single TrueNorth chip. For example, [17] uses more than 7000 neurons and more than 5 million synapses to implement a two-layer network. Hierarchical multilayer networks that use a much larger number of neurons and synapses have also been reported [18]. A single TrueNorth chip already has 5.4 billion transistors; therefore, it might be impractical to build larger networks on a single die. One potential avenue of research on this front is to use the same chip to run different chunks of a network sequentially. Another is to use several chips and integrate them at the board level. Some approaches, such as the HICANN project, do this via wafer-level integration (HICANN is briefly discussed later). However, these approaches re-introduce the problems of memory bandwidth and power consumption whose reduction was the objective of the TrueNorth chip. The current implementation of the design is a chip that consumes only 45 pJ per spike [14].
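The two-phase core operation described above (integrate, then synchronize-and-fire) can be sketched as follows. This is a behavioural illustration only: the fixed-point arithmetic of the real chip is omitted, and the connectivity, weights, and threshold are made-up values.

```python
# Sketch of TrueNorth-style two-phase processing: an integration phase that
# accumulates axon inputs into membrane potentials, and a Sync phase that
# thresholds, fires, and resets. All numbers are illustrative assumptions.

def integrate(potentials, weights, active_axons, connected):
    """Integration phase: add the weight of every connected, active axon."""
    for axon in active_axons:
        for neuron, _ in enumerate(potentials):
            if connected[axon][neuron]:          # crossbar bit: connection present?
                potentials[neuron] += weights[axon]
    return potentials

def sync(potentials, threshold):
    """Sync phase: fire neurons at/above threshold and reset them to 0."""
    spikes = [v >= threshold for v in potentials]
    return [0 if s else v for v, s in zip(potentials, spikes)], spikes

# 2 axons, 3 neurons; axon 0 excitatory (+3), axon 1 inhibitory (-1).
connected = [[True, True, False],
             [False, True, True]]
weights = {0: 3, 1: -1}
pots = integrate([0, 0, 0], weights, active_axons=[0, 1], connected=connected)
pots, spikes = sync(pots, threshold=3)
print(pots, spikes)  # [0, 2, -1] [True, False, False]
```

Note how the crossbar entry is a single connection bit, while the per-axon type selects the weight, mirroring the separation described in the text.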

2.4.3 SpiNNaker

Like IBM’s TrueNorth, SpiNNaker is a digital neuromorphic system. The purpose of the SpiNNaker project is to build a supercomputer capable of simulating large (one billion neurons) spiking neural networks operating on the biological time-scale (packet latency can be up to 1 ms) [10]. SpiNNaker is a massively parallel multi-core computing platform consisting of about 57,000 compute nodes. Each node contains 18 ARM9 cores, bringing the total number of ARM processors in the complete system to more than a million. Each core is provided with 32 kB of instruction memory and 64 kB of data memory, bringing the total RAM in the system to 7 TBytes. For simple neuron models, each ARM9 core is capable of modelling about 1000 neurons, which means that the total number of neurons that can be simulated simultaneously by this system is approximately a billion. While SpiNNaker and TrueNorth share some similarities, their design objectives and capabilities are vastly different. However, a contrastive study might help in understanding the design trade-offs. IBM's TrueNorth uses dedicated fixed-function digital blocks to operate as neurons and synapses. Although there is some level of programmability in the neuronal functionality, the programmer is stuck with the leaky integrate-and-fire model that is hard-coded in the system. While this was essential in order to lower power consumption and chip area, it also makes it harder to radically alter the functionality of the built-in neurons. Unlike IBM's chip, SpiNNaker uses ARM9 cores to model neurons. This provides the programmer with complete flexibility over the models used in their algorithms, limited only by the 32 kB instruction memory provided to each core. The data memory associated with each processor also implies that, unlike in TrueNorth, SpiNNaker synaptic weights can be trained on-line. Another difference between SpiNNaker and TrueNorth is in the routing mechanism.
The crossbar architecture provides each neuron in TrueNorth with the ability to connect directly to a target node. SpiNNaker, on the other hand, transfers information packets over a routing fabric to communicate between the different nodes of the system, which makes connectivity in SpiNNaker completely reconfigurable. It is of interest to note the similarities between a computer network and SpiNNaker. However, the cost of this reconfigurability is higher power consumption. This can be appreciated from the fact that a single TrueNorth chip consumes about 70 mW on a 4.3 cm2 die, while a single SpiNNaker node consumes 1 W on a 101.64 mm2 die. However, when we consider the fact that a single TrueNorth chip models 256 neurons and 1024 axons, versus a SpiNNaker node that models 16 × 1000 neurons, the difference is not so stark.

Figure 2.6: A SpiNNaker node [10] © IEEE

2.4.3.1 Node architecture

A block diagram of a SpiNNaker node is provided in Figure 2.6. Each node in SpiNNaker is a network of 18 ARM9 cores and is part of a larger network of similar nodes. On start-up, one of the 18 processors is selected as the Monitor Processor and used for housekeeping tasks on the chip. Sixteen processors are used to model neurons and are independently configurable. One processor is reserved for fault tolerance and manufacturing yield enhancement. SpiNNaker uses two Networks-on-Chip (NoCs): a Communications NoC to establish inter-processor communication, and a System NoC to handle processor-peripheral communication. The interesting thing to note here is that even though two processors might share the same node, the Communications NoC handles any communication between them using the same protocol it would use for a transfer between two processors on different nodes. The System NoC, on the other hand, allows the 18 processor cores in a node to access peripherals on the node such as SRAM, ROM, timers, DMA, etc.

2.4.3.2 Event driven operation

For a system as large as this, it is impractical to design a synchronous communication scheme. An elegant feature of spiking neural networks is their tolerance to delays, which allowed the architects of the system to do away with synchronization. The general principle here is that a packet is launched in the earliest available window. However, this means that packets may arrive out of order, or delayed because of congestion. While the algorithms are tolerant of reasonable delays, the delays cannot be indefinitely large. To handle this, the system cycles through global time-phase values that must be consistent across the system. If a node receives a packet that is older than two time phases, the packet is taken out of circulation and sent to the Monitor Processor. Note that the latency in packet arrival can be of the order of 1 ms in biological systems. It can be shown that a router operating at 100 MHz can handle the traffic in SpiNNaker under normal operating conditions [10]. The unpredictability in packet arrival time makes interrupt-based mechanisms the logical way to communicate. In normal operation, the arrival of a packet triggers an interrupt to the designated core. The core accepts the packet, does the requisite processing, transmits new packets if necessary, and returns to its previous state. As a power-saving feature, the cores go to a sleep state when not in operation.
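The time-phase check described above can be sketched as follows. The phase arithmetic (a small cyclic counter, modulo comparison) is an illustrative assumption, not the exact SpiNNaker mechanism.

```python
# Sketch of the global time-phase staleness check: a packet stamped with a
# phase more than 2 steps behind the node's current phase is pulled from
# normal routing and handed to the Monitor Processor.

N_PHASES = 4  # assumed size of the cyclic time-phase counter

def is_stale(packet_phase, node_phase, max_age=2):
    """True if the packet is older than max_age phases (modulo wrap-around)."""
    age = (node_phase - packet_phase) % N_PHASES
    return age > max_age

def route(packet_phase, node_phase):
    return "monitor" if is_stale(packet_phase, node_phase) else "deliver"

print(route(packet_phase=2, node_phase=3))  # age 1 -> deliver
print(route(packet_phase=0, node_phase=3))  # age 3 -> monitor
```

The point of the cyclic counter is that only relative age matters: no global absolute time needs to be distributed, only a coarse phase that all nodes agree on.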

2.4.3.3 Network communication

Much like the Internet, communication in the SpiNNaker system is packet-based. However, unlike Internet packets, SpiNNaker packets consist of only 40 to 72 bits of data. Every packet contains a control byte that specifies routing information such as the packet type, parity, payload indicator (length of the packet), etc. The system supports four packet types, corresponding to four different communication methods:

1. Nearest neighbour - communication between adjacent nodes. The packet is always delivered to the monitor core of a neighbouring node.

2. Point-to-point - a packet launched by any core and delivered to the monitor core on the node containing the target core.

3. Multicast - the only type that permits core-to-core communication. Packets are duplicated as necessary to ensure that all target cores receive them.

Figure 2.7: The SpiNNaker machine [10] © IEEE

4. Fixed route - carries information such as error-detection and time-stamp data, emergency routing, etc.

One of the biggest advantages of a packet scheme is the ability to configure the network in any topology. For example, the designers have proposed the connectivity scheme shown in Figure 2.7. In order to build such network topologies, it is essential that the routers on each node contain information about the addresses of, and routing paths to, the other cores in the system. This is done with routing tables in each node. Computing the optimal routing tables is a non-trivial problem, and they need to be programmed before the system can start operation [10].
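The idea of table-driven routing with packet duplication (as in the multicast type above) can be sketched as follows. The keys, link names, and table contents are illustrative assumptions, not the real SpiNNaker table format.

```python
# Sketch of table-driven multicast routing: each node holds a routing table
# mapping a packet's routing key to the set of outgoing links (or local
# cores) the packet should be copied onto.

def route_packet(key, tables, node):
    """Return the links/cores a packet with this key is duplicated onto."""
    table = tables[node]
    return table.get(key, ["default_link"])  # unmatched keys take a default route

tables = {
    "node_A": {0x10: ["east", "core_3"],   # duplicate: onward hop + a local core
               0x11: ["north"]},
}
print(route_packet(0x10, tables, "node_A"))  # ['east', 'core_3']
print(route_packet(0x99, tables, "node_A"))  # ['default_link']
```

This also shows why computing the tables off-line is non-trivial: every multicast tree in the network has to be decomposed into per-node entries like these before operation starts.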

2.4.3.4 Neuron and Synapse model

SpiNNaker is best used with algorithms that use the point-neuron model [19]. The point-neuron model ignores the effect of dendritic structure on the input pulses; incoming spikes are treated as if they were applied directly to the body of the neuron. While there is a chance that this is too simplistic a model of the brain, most other neuromorphic systems make similar assumptions. However, the programmer does have some freedom in controlling the data content of each packet, so it might be theoretically possible to model properties such as dendritic delays to some extent. At this point it is easy to appreciate that, while SpiNNaker is a neuromorphic supercomputer, it liberally borrows ideas from VLSI and computer networks. The event-driven model used here is somewhat similar to the brain, but the communication and computation infrastructure is completely different. One of the biggest challenges facing scientists in this field is the inability to simulate large neural networks and architectures in real time, and that is exactly the problem SpiNNaker is best positioned to solve.

2.4.4 Neurogrid

Neurogrid is a mixed-signal multi-chip neuromorphic system designed by the Brains in Silicon lab at Stanford University. Like SpiNNaker, described in the previous section, Neurogrid aims to provide a neuromorphic simulation platform capable of handling large-scale brain simulations in real time. While the problems faced by the designers of Neurogrid were similar to those of TrueNorth, they made several interesting and divergent design choices in its implementation. The following are of particular interest [11]:

1. Use of analogue CMOS circuits to model the neurons and synapses.

2. Adopting a shared dendrite structure to reduce the implementation cost.

3. A binary tree-based network to establish communication between the various nodes.

2.4.4.1 Shared dendrite structure

One of the biggest challenges in building neuronal circuitry is the quadratically increasing number of connections. For example, a fully-connected network of N neurons with one-to-one connectivity requires N2 synapse elements, as shown in Figure 2.8.

Figure 2.8: A network of 4 neurons and 16 synapses with 1-to-1 connectivity [11] © IEEE

Clearly, this architecture does not scale well. One of the methods proposed to reduce the number of synapses to N is to use a time-multiplexing scheme in which dedicated time slots are assigned to each neuron to transmit its pulse. The synaptic weights are stored in a memory. When a neuron generates a spike, the respective synaptic weight is fetched and applied to the spike before passing it on to the target neuron. A block diagram of the scheme is presented in Figure 2.9.

Figure 2.9: A shared dendrite network of 4 neurons and 4 synapses [11] © IEEE

The blocks marked transmitter and receiver encode the source and destination information using an asynchronous AER scheme to transmit the spike to the correct destination. The figure also illustrates the shared-dendrite structure, represented by resistors connecting adjacent neurons. This mechanism mimics the structure of the visual cortex, where the neurons in a "minicolumn" within the same "hypercolumn" share connections and are connected to each other by lateral inhibitory connections [20]. One limitation of this structure is that it does not allow the programmer to implement a synaptic plasticity scheme. If such a feature is to be supported, it would be necessary to reprogram the synaptic weights to model the appropriate behaviour.

2.4.4.2 Neuron and Synapse

A single Neurogrid chip, called a Neurocore, consists of 256 × 256 neurons, a transmitter, a receiver, and two RAMs. The neuronal circuits are made of different functional blocks, which perform the roles of soma, dendrite, synapse-population, and ion-channel-population circuits. The blocks are implemented using dimensionless models, which simplifies design without any loss of accuracy. For example, consider a circuit equation:

C V̇ = −G1 (V − Vconstant) + Iconstant

Constants such as G1, Vconstant, and Iconstant can be absorbed into normalized variables τ and ν, resulting in equations of the form:

τ ν̇ = −ν + u
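The normalization can be made explicit. Assuming a voltage scale $V_0$ (my notation; the text does not name one), dividing the circuit equation by $G_1$ and substituting dimensionless variables gives:

```latex
C\dot{V} = -G_1\,(V - V_{\mathrm{constant}}) + I_{\mathrm{constant}}
\quad\Longrightarrow\quad
\frac{C}{G_1}\,\dot{V} = -(V - V_{\mathrm{constant}}) + \frac{I_{\mathrm{constant}}}{G_1}.

% Defining the dimensionless quantities
\tau = \frac{C}{G_1}, \qquad
\nu  = \frac{V - V_{\mathrm{constant}}}{V_0}, \qquad
u    = \frac{I_{\mathrm{constant}}}{G_1 V_0},

% the equation collapses to the stated form:
\tau\,\dot{\nu} = -\nu + u .
```

All device-specific constants are thus folded into a single time constant and a single dimensionless input, which is what makes the circuit blocks reusable across parameter regimes.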

A dimensionless equation of this form is relatively easy to implement if care is taken to ensure that the values are scaled suitably.

Figure 2.10: A Neurogrid neuron [11] © IEEE

A block diagram of the various components in a Neurogrid neuron is shown in Figure 2.10. Each neuron consists of a soma block, a dendrite block, four synapse-population blocks, and four gating-variable blocks. The synapse-population block refers to the resistive grid connection illustrated in Figure 2.9. The gating-variable block is used to model the ion-channel behaviour observed in neurons; a simple explanation of the role of gating in the behaviour of the neuron is provided by the Hodgkin-Huxley model. Circuit-level details of the different building blocks are given in [11].

2.4.4.3 Communication

As mentioned earlier, whenever a neuron generates a spike, the information is AER-encoded and transmitted to the appropriate neuron. The target neuron does not necessarily lie in the same Neurocore. Therefore, it was essential to design a scheme that was fast enough for real-time processing, not area-intensive, and highly scalable. The choice of transporting information digitally was made for these reasons. The transmitter consists of two 256-input arbiters (similar in functionality to a multiplexer), one associated with the row index and the other with the column index. When a neuron fires, each of the arbiters generates an 8-bit address. The receiver is a 2048 × 256 arbiter where 8 lines are allotted to each row in order to select one (out of four) shared synapses, to sample one (out of three) analogue signals, and one to disable a neuron. While this scheme is fairly ordinary, the interesting idea in Neurogrid communication is the arrangement of the Neurocores in a binary tree, which makes communication highly bandwidth-efficient. A block diagram of the scheme is shown in Figure 2.11.

Figure 2.11: Block diagram of a Neurogrid tree [11] © IEEE

Each of the triangular blocks shown in the figure corresponds to a single Neurocore. Packets are transmitted in two phases: a point-to-point phase, shown by black arrows, and a branching phase, shown by purple arrows. In the point-to-point phase, the tree uses the path information encoded in the packet to decide the direction of the packet's movement (U: up, D: down, R: right, L: left). When the stop code (S) is reached, the point-to-point phase is complete. In the next phase, the branching phase, the packets are sent to every child of the parent node all the way down to the leaf nodes. In Figure 2.11, transmission from node 4 to node 3 is point-to-point, and that from node 3 is branching. It is of interest to note that the SpiNNaker system could easily be configured to adopt the same communication structure. Additionally, it should not be too difficult to model a similar shared-dendrite structure on SpiNNaker. Taking only simulation capability into consideration, SpiNNaker appears to be more reconfigurable, flexible, and precise. However, the real advantage of Neurogrid could be its power efficiency and lower cost of implementation.
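The two-phase delivery above can be sketched as follows. The tree here is a tiny made-up example, not a real Neurogrid map; only the mechanism (follow the route string until S, then flood the subtree) follows the description in the text.

```python
# Sketch of Neurogrid-style two-phase tree delivery: a point-to-point phase
# that follows the packet's route codes until the stop code S, then a
# branching phase that floods every descendant of the stop node.

def deliver(route, tree, start):
    """Follow U/L/R codes from `start`; on 'S', branch to all leaves below."""
    node = start
    for step in route:
        if step == "S":                      # stop code: branching phase begins
            return sorted(collect_leaves(tree, node))
        node = tree[node][step]              # point-to-point phase
    return [node]

def collect_leaves(tree, node):
    children = [c for c in (tree[node].get("L"), tree[node].get("R")) if c]
    if not children:
        return [node]
    leaves = []
    for c in children:
        leaves.extend(collect_leaves(tree, c))
    return leaves

# Parent 'p' with two leaf children 'a' and 'b'.
tree = {
    "p": {"L": "a", "R": "b"},
    "a": {"U": "p", "L": None, "R": None},
    "b": {"U": "p", "L": None, "R": None},
}
print(deliver("US", tree, start="a"))  # up to 'p', then branch: ['a', 'b']
```

The bandwidth efficiency comes from the fact that a packet traverses each shared link only once before fanning out, rather than being sent separately to every destination.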

2.5 Programming neuromorphic hardware

Programming large neuromorphic systems such as SpiNNaker, Neurogrid, or BrainScaleS is a non-trivial problem. The complexity is higher when the specifications have to be translated into bias values for analogue circuitry. All these systems endeavour to present the user with a relatively simple interface that can translate a neural network specification into low-level circuit parameters. One of the popular high-level languages prescribed for this task is PyNN [21]. This thesis does not dwell much on the different programming methodologies adopted by the various systems, since the focus is on hardware architecture. However, the importance of a simple and adaptable interface cannot be overstated if neuromorphic computation is to be brought to the mainstream.

2.6 Other noteworthy neuromorphic systems

Two challenges that need to be addressed when building any large-scale neuromorphic system are as follows:

1. Designing high-density circuits for neurons and synapses that are power-efficient and re-programmable.

2. Communication infrastructure to transmit billions of spikes in real time between the neurons while maintaining some level of global synchronization.

The four systems reviewed in this chapter were chosen because they form a good representative set of the different approaches that have been used to build large hardware neural networks. However, this list is by no means exhaustive. There are several other neuromorphic systems, such as the European BrainScaleS project [22], BioRC [23], MIT's Silicon Synapse [24], Intel's neuromorphic hardware [25], the spiking neuromorphic processor at INI Zurich [26], etc.

The most interesting observation one might make after studying these systems is that all of them have adopted distinctly different approaches to solving similar problems. For example, the BrainScaleS project led to the creation of the HICANN (High Input Count Analogue Neural Network) system, where interconnections are implemented at the wafer level. Instead of cutting the neural network dies out of the wafer, the designers created mechanisms to build interconnects directly on the wafer. This approach led to a new set of issues. One interesting issue was that using an AER scheme increased the power consumption of the system to unacceptable levels, so the designers developed a novel asynchronous low-voltage signalling scheme for the communication task [22]. The BioRC project at the University of Southern California aims to build analogue circuits using carbon-nanotube transistors that emulate synapses, neurons, and neural networks. The MIT Silicon Synapse is a fully analogue system that models ion channels in neurons to a high level of precision. Intel is attempting to use spin devices, where lateral spin valves act as neurons and memristors as synapses. Neural networks have been around for a while, but they have never been as attractive as they are today. The current interest in building large hardware neural networks is probably due to two main reasons:

1. We have much greater processing power at our disposal today than a few decades ago. This has led to the development of increasingly large neural networks (primarily in software) that have pushed the boundaries of artificial intelligence. However, these networks run on hardware that is not optimized for their patterns of memory access or computation. Building larger and larger networks using traditional Von Neumann style computers is increasingly non-viable because of power and performance considerations. Therefore, it makes economic sense to explore alternative computational styles.

2. The tremendous interest in, and benefit of, understanding how the human brain works: the brain has remained one of the most enduring mysteries in human history. Powerful simulators to test new theories are absolutely essential to make any progress on this front. As described in the previous item, traditional computers are ill-suited to handle this load.

2.7 Concluding Remarks

A point to be highlighted at this stage is that an overwhelming majority of VLSI implementations available today tend to focus on spiking-neural-network-based systems. In general, one of the primary objectives of these systems is to model the brain. The other is to use spiking neural networks to solve multi-modal pattern recognition problems in real time, such as multiple-face recognition in streaming video. Spiking neural networks appear to have immense potential to solve these problems. They have theoretically been shown to be capable of encoding more information than other neural networks for some tasks [27]. They have been called the third generation of neural networks, and rightly so. One of the reasons that hardware based on spiking neurons is so attractive is probably that it is relatively easier to implement in VLSI than RBMs, CNNs, auto-encoders, etc. This is because of the difference in the nature of information flow in these systems. Unlike other neural networks, spiking neural networks only transmit pulses between neurons, which is a big simplification when designing massive networks. However, it needs to be understood that, in many instances, the designs do not necessarily try to exploit the theoretical advantages held by spiking neural systems over other artificial neural networks. Building an analogue VLSI system for neural networks such as RBMs would require the accurate transmission of multi-bit signals, which is extremely difficult. A digital VLSI implementation could potentially be limited by memory bandwidth requirements; the computations involved in each node are also more complicated. Transmitting more data between the neuronal computation block and the memory increases the power consumption significantly and can erase any power advantage to be had by moving away from traditional computers. As will be elaborated in the subsequent chapter, memristive crossbar arrays have opened new avenues that might address some of these issues.
While the research community has shown tremendous interest in using memristive crossbar arrays for various applications, including spiking neural networks, there has been surprisingly little work on networks such as MLPs, RBMs, etc. There might be several reasons for this. These devices exhibit extreme variability in device parameters and properties. Their internal mechanics are also not completely understood. There are many candidate memristive devices, and it is difficult to build a precise model that fits them all. This makes circuit design all the more difficult, particularly in applications where precision is key.

The biggest contribution made in this work is the discovery that, by a smart choice of algorithm, it is possible to obtain higher precision with memristive devices even when they are updated very imprecisely. The algorithm used in this study, gradient descent, happens to be at the heart of the training procedures used in almost all neural networks. By using this algorithm, we demonstrate that it is possible to design analogue hardware that can be programmed to run a much wider class of neural networks efficiently. This is a highly relevant result considering the explosion of neural network applications today. A subsequent chapter covers this topic in more detail.

Chapter 3

Memristive Learning

3.1 Memristors

Memristors, first proposed conceptually as the missing circuit element by Chua [28], have seen a dramatic upsurge in interest from the research and industrial community since the memristive effect was observed in a device fabricated at HP Labs in 2008 [3]. The theoretical memristor connects the flux and the charge via the relation:

dφ = M(q)dq

It was further shown in [28] that:

1. A flux-charge memristor is passive if and only if its incremental resistance M(q) is non-negative;

2. A one-port containing only flux-charge memristors is equivalent to a flux-charge memristor.

3. Any network containing only flux-charge memristors with positive incremental memristances has one, and only one, solution.

The memristor equation and the above theorems establish the inter-relationships between the quartet of circuit elements as shown in Figure 3.1.


In 1976, Chua and Kang [29] proposed that any nth order system that followed the behaviour prescribed by the following equations was memristive:

v = R(w,i,t)i

w˙ = f (w,i,t)

where w is an nth order state variable and ẇ is its time-derivative. They also noted that

Figure 3.1: The new circuit element: the memristor [3] © IEEE

such systems were capable of modelling thermistors, Hodgkin-Huxley nerve axon membranes, etc. While the dynamics of a memristive system are well-defined by these mathematical equations, they do not explain the physical phenomenon that gives rise to memristance. This is evidenced by the wide variety of materials that display the memristive property. While they all exhibit the memristive behaviour, the physical constants and device properties are different. After HP Labs demonstrated the practical memristor, several systems that satisfy the requirements of a memristive system were identified by research groups across the world, such as phase change memories [30], neuronal axon membranes [31], etc. For many of these devices, good models that can be used for precise circuit design do not exist. The complex dependence on internal state parameters makes it extremely difficult to precisely program them to a desired state without resorting to feedback or read-write schemes [32][33]. Memristive devices are also seen to exhibit stochastic behaviour [34][35]. Currently, the most common commercial application for memristive devices is as a replacement for flash memory, where they are referred to as resistive random-access memory (RRAM). RRAMs have a significantly smaller footprint when compared to flash memories and are potentially the way forward for non-volatile memories. These devices have other potential applications in designing systems for spiking neural networks [17], dynamical circuits [36], neuromorphic modelling [37], etc. One of the most important structures used in most of these applications is the crossbar array, which will be discussed in a later section.

3.1.1 Boundary condition model

Since the discovery of TiO2-based memristive devices, several memristive devices have been proposed, such as molecular and ionic thin film memristive systems, spin-based and magnetic memristive systems, phase change memristive systems, etc. All these devices share the memristive behaviour although the underlying physics is different. Therefore, in order to study the behaviour of memristors in various circuits, it is useful to have a device-agnostic model that captures the electrical behaviour of memristors. Such models are generally ill-suited for high-precision designs. However, by using techniques that compensate for model deficiencies during circuit design, such generic models could be used for design purposes. One of the outcomes of this work is one such training algorithm, which will be discussed in the next chapter. Several device models have been reported in the literature [38][39][40][41][42]. In this work, we chose the boundary condition model (BCM) [41] for our simulations for the following reasons:

1. Although the BCM model is relatively simple, it provides closed-form equations for all the device parameters, making simulations faster. When simulating networks containing thousands of devices, this is a very useful property.

2. It was computationally easy to calculate various device parameters using this model. This allowed us to simulate larger networks and to interpret the simulation results clearly without being bogged down by the physics or inter-relationships between various device parameters.

3. With the current manufacturing capability, memristive devices display a large variability. This makes it difficult for any model to precisely capture the dynamics of all the devices used in the circuit. Therefore, using more complex models does not necessarily result in more accurate simulations.

In the BCM model, the memristor is modelled as a thin oxide film that consists of a highly-conducting layer of dimensionless length x (obtained by dividing the length of the highly-conductive layer by the total length of the device). The remaining 1 − x length of the memristor has a low conductivity. The hysteretic behaviour of the memristor is assumed to arise from the movement of the conducting layer under the influence of an applied electric field. For a given value of x, Ohm's law applied to a memristor gives:

i = M(x) · v (3.1)

where M(x) is the memristor's conductance when the conducting film is of length x. The memristor equations that control x are given by the following two equations:

dx(t)/dt = (η/i0) · M(x(t)) · v(t) · F(x(t), ηv(t), p) (3.2)

i(t) = M(x(t)) · v(t) (3.3)

where the state variable x(t) models the fact that x changes with time in the presence of an electric stimulus, ηv(t) is the normalized input voltage, η is the polarity term and nominally has magnitude 1, F(...) is a window function that models the boundary conditions of device behaviour, and p controls the extent of non-linearity in the window function.

Since M(x(t)) is modelled as a series combination of an insulating and a conducting material, it can be expressed as:

M(x(t)) = Gon · Goff / (Gon − ∆G · x(t)) (3.4)

where ∆G = Gon − Goff, and Gon and Goff denote the conductance of the device when x = 1 (conducting layer covers the entire length of the memristor) and x = 0 (insulating layer covers the entire length of the memristor), respectively.

The equations described so far are generally applicable to most of the memristor models referenced in this section. The BCM model gets its name from the method by which it defines the window function F(...). Three conditions are defined:

C1 := x(t) ∈ (0,1), or (x(t) = 0 and ηv(t) > vth,0), or (x(t) = 1 and ηv(t) < −vth,1) (3.5)

C2 := x(t) = 0 and ηv(t) ≤ vth,0 (3.6)

C3 := x(t) = 1 and ηv(t) ≥ −vth,1 (3.7)

where vth,0 and vth,1 are the high and low threshold voltage magnitudes, respectively. The window function F(...) is then computed as:

F(x, ηv, p) = Fℓ(x, ηv) = 1 if C1; 0 if C2 or C3 (3.8)

In this window function, the parameter p is unused. It is now possible to analytically integrate the memristor equations to obtain the following closed-form expression for x(t):

x(t) = Gon/∆G − √[(Gon/∆G)² + x(ti)² − 2(Gon/∆G)x(ti) − 2·Gon·Goff·(φ(t) − φ(ti))/(∆G·i0)] if C1 (3.9)

x(t) = 0 ∀ t if C2 (3.10)

x(t) = 1 ∀ t if C3 (3.11)

where φ(t) − φ(ti) = ∫ from ti to t of v(θ)dθ, and ti is the time when programming began.
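The BCM equations above can be exercised numerically. The following sketch integrates Eq. (3.2) with a simple Euler step; all parameter values (Gon, Goff, i0, thresholds) are hypothetical, chosen only to make the example run, and are not fitted to any real device:

```python
# Numerical sketch of the BCM model described above. Parameter values are
# hypothetical, for illustration only.
G_on, G_off = 1e-3, 1e-6   # conductances at x = 1 and x = 0 (siemens)
dG = G_on - G_off
i0 = 1e-4                  # normalising current in Eq. (3.2)
v_th0, v_th1 = 0.2, 0.2    # threshold voltage magnitudes
eta = 1.0                  # polarity term, |eta| = 1

def M(x):
    """Memristor conductance for state x (Eq. 3.4)."""
    return G_on * G_off / (G_on - dG * x)

def F(x, v):
    """Binary window function (Eq. 3.8): 1 under C1, 0 under C2 or C3."""
    if x <= 0.0 and eta * v <= v_th0:    # C2: pinned at x = 0
        return 0.0
    if x >= 1.0 and eta * v >= -v_th1:   # C3: pinned at x = 1
        return 0.0
    return 1.0                           # C1: state free to move

def step(x, v, dt):
    """One Euler step of Eq. (3.2): dx/dt = (eta/i0) * M(x) * v * F."""
    x += dt * (eta / i0) * M(x) * v * F(x, v)
    return min(max(x, 0.0), 1.0)         # enforce the boundary conditions

# A sustained positive voltage drives x toward 1 and M(x) toward G_on.
x = 0.5
for _ in range(2000):
    x = step(x, v=0.5, dt=1e-2)
print(x, M(x))
```

Note how the binary window function makes the state sticky at the boundaries: once x reaches 0 or 1, only a voltage beyond the relevant threshold can move it again.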

3.2 Crossbar Arrays

The architectural benefits of the crossbar structure were first demonstrated by HP Labs [43] using a computer called the Teramac. The architectural design of this computer was driven by the observation that, as devices shrink to the nano-scale regime, it becomes increasingly difficult to attain high fabrication yields. Additionally, as devices shrink and circuits become denser, cost and complexity shift to wiring and interconnections. It was therefore essential to rethink the network connectivity and computational schemes used in traditional computing systems.

Figure 3.2: Tree and crossbar architecture used in Teramac [43] © IEEE

The solution they proposed was to use the so-called fat tree architecture, wherein a large amount of redundancy was added to the interconnections between different nodes. Each of these interconnects can be programmed depending on the algorithmic requirements and the defect-status of the device at each junction. The physical manifestation of a single layer is shown in Figure 3.2. The block marked as memory is envisaged to be a three-terminal nano-electronic device that controls the interconnections between the green and red dots depending on the input combinations. In the Teramac system, this was implemented as a 6-input, 1-output look-up table. The architectural idea was to stack several such layers of crossbar arrays. It was shown that such networks are highly tolerant to defects. For example, in the Teramac system, it was observed that the system worked perfectly well even when 10% of the LUTs were non-functional and 10% of the interconnect signals were unreliable.

Figure 3.3: Crossbar array of memristors: 3-D and 2-D representation

Clearly, systems based on these ideas become economically viable only under the circumstances assumed by the developers, i.e., cheap, dense, and highly-unreliable devices. Additionally, the architecture assumed the existence of three-terminal nano-devices, which is still fodder for research. However, memristive devices have partially created a scenario wherein it might be attractive to consider building crossbar arrays inspired by the original work by Heath [43]. Although memristors are only two-terminal devices, they are mostly very compact and extremely prone to variability [3]. Researchers have proposed building simpler nano-wire crossbars as shown in Figure 3.3. Using some clever layering techniques discussed by researchers such as Likharev and Strukov [44], it is now possible to create fairly dense memristive memories using existing fabrication technologies. There is significant interest in such arrays in the industry and some companies are also close to commercialization [45]. While building general purpose computers modelled like the Teramac is beyond the scope of such arrays, they are attractive for use in several applications such as neural networks, dynamical circuits, etc. The rest of the chapter is a discussion of some of the systems that use memristors.

3.3 Memristor circuits and systems

This section discusses how state-of-the-art designs take advantage of memristors' unique properties and try to address the issues discussed in this chapter. The topics discussed in this section include memristor programming schemes, STDP-based training, the back-propagation algorithm, dynamical systems, and other circuit ideas.

3.3.1 Crossbars

Memristive crossbar arrays are made of layers of orthogonal wires sandwiching nano-scale memristor devices lying at the cross-over points between the layers [3][46]. Crossbar arrays have several attractive features. They can be made extremely small and dense, which is useful for the development of high-capacity memories and computational hardware. Additionally, they can be laid out on top of traditional CMOS ICs, where the CMOS circuits handle the complex and precise signal conditioning and processing tasks, and the crossbar array performs simpler, but parallel, operations [47]. This configuration has a lot of potential for use in hardware acceleration of computation. It can lead to the development of ultra-fast, low-power neural networks and other machine learning algorithms containing thousands or millions of parameters. Memristive crossbar arrays have the following advantages that make them attractive for use in various systems:

1. Dense programmable memory - Memristive devices have one of the lowest footprints amongst all the memory devices available in the market today. They are non-volatile, and programming them is significantly easier than programming Flash memories.

2. They can be fabricated as arrays and layered on top of CMOS circuits [44]. This makes adoption of the technology easier because they do not have to be incorporated into the CMOS technology flow.

3. Several types of nano-devices exhibit the memristive behaviour. This is exciting because it hints at the possibility of designing memristor-based circuits for non-CMOS applications too.

However, designing with memristors can also be challenging for the following reasons:

1. Variability - The current fabrication technologies result in devices that exhibit variation in parameters spread over orders of magnitude. Designing with such devices is challenging.

2. Device Models - Memristive behaviour is exhibited by several types of devices. In many instances, the underlying principles of their operation are not fully understood or modelled.

3. CMOS integration - Although some memristors are CMOS compatible, integrating CMOS circuits with memristors is still quite difficult on a commercial scale.
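The "simpler, but parallel, operations" performed by the array amount, in the ideal case, to an analogue vector-matrix multiplication: each column current is the sum of row voltages weighted by the device conductances. Below is a sketch under the assumption of an ideal array, with no sneak paths, wire resistance, or variability; the array dimensions and values are hypothetical:

```python
import numpy as np

# Idealised crossbar read-out: with row voltages V applied and columns held
# at virtual ground, each column current is the dot product of V with that
# column's conductances, I = G^T V. Values are hypothetical.
rng = np.random.default_rng(0)

rows, cols = 4, 3
G = rng.uniform(1e-6, 1e-3, size=(rows, cols))  # device conductances (siemens)
V = np.array([0.10, 0.00, 0.20, 0.05])          # read voltages on the rows (volts)

I = G.T @ V          # column currents via Kirchhoff's current law
print(I)             # one analogue multiply-accumulate per column, in parallel
```

This is why crossbars are attractive for neural networks: a whole layer's weighted sum is computed in one analogue step, with the weights stored in place as conductances.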

3.3.2 Memristor programming schemes

Broadly speaking, there are two categories of memristor programming schemes:

1. Unregulated write - In this method the programming pulse is driven into the device without monitoring the change in conductance of the device. This technique is generally suitable when using memristors as two-state digital memories. High variability and poor modelling make this a challenging approach when dealing with multi-bit or analogue memristors.

2. Regulated write - The state of the device is updated using a sequence of programming pulses. Each programming pulse is followed by a read to check if the conductance has attained the desired value. This approach generally requires a reference and a distinct circuit configuration for each memristor. Implementing this on crossbar arrays is tricky and not economically viable in the current state of technology.

3.3.2.1 Unregulated write

The circuit in Figure 3.4 shows that when a vbias is applied to turn on a transistor, vwrite causes a current to flow into the memristor. In order to program the memristor to the desired

Figure 3.4: A simple circuit schematic for (a) unregulated writing into a memristor and (b) reading the state of the memristor

on or off state, the polarity of vwrite is modulated. The simple circuit shown in Figure 3.4 is not capable of high-precision writes for several reasons:

1. Strength of the programming pulse is affected by MOSFET variability.

2. The high variability shown by memristors implies that similar programming pulses can produce very different results for different devices from the same batch.

3. For most memristor devices, high-quality device models, which are essential to decide the programming pulse, are lacking.

4. Even if precise models are available, there are practical issues to contend with. The change in conductance for a given pulse is affected by the starting state of the device. In order to program precisely, the correct sequence would be a read, followed by a

complex calculation to determine the level and width of the programming pulse vwrite, and finally, application of the computed programming pulse. Even this might not be possible for some devices, where the conductance of the device does not uniquely define its internal state.

The issues listed above are not critical when the goal is to drive the memristance to its Gon or Goff boundary states. This is because applying a high voltage pulse with the appropriate polarity would take the device to the desired state.
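The boundary-driving behaviour described above can be sketched as follows. The state model, rate constant, and function name are hypothetical, chosen only to illustrate that a sufficiently strong pulse leaves the device at the same boundary state regardless of its unknown starting point:

```python
# Behavioural sketch of an unregulated binary write: a strong pulse of the
# right polarity saturates the device at a boundary state. The linear state
# model and rate constant are hypothetical.
def unregulated_write(x, v_pulse, width, rate=5.0):
    """Move normalised state x toward 1 (v > 0) or 0 (v < 0), with no read-back."""
    x += rate * v_pulse * width          # crude open-loop state update
    return min(max(x, 0.0), 1.0)         # device saturates at its boundary

# Any initial state ends at the same boundary when the pulse is strong enough.
for x0 in (0.0, 0.3, 0.9):
    assert unregulated_write(x0, v_pulse=+1.0, width=1.0) == 1.0
    assert unregulated_write(x0, v_pulse=-1.0, width=1.0) == 0.0
```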

3.3.2.2 Regulated write

In regulated write schemes, the memristor state is actively monitored while it is being programmed to ensure that the desired state is attained. Several schemes have been proposed in order to do this [36][33][32][48]. In general, the schemes fall into two categories:

1. Continuous feedback scheme: A circuit diagram representing this method is illustrated in Figure 3.5.

Figure 3.5: Schematic of a continuous feedback write scheme

The circuit implements the idea that a programming pulse Vprog is applied to a memristor till it attains the desired state. The comparator monitors whether the target state is attained by using a reference resistance. Note that the block diagram is a simple figure illustrating the idea behind the continuous write; other methods are viable and might even be more suitable [32], for example, using a current comparator and a DAC to generate the reference voltage levels. The continuous monitoring scheme is clearly an expensive circuit requiring tight integration of the memristors and the programming circuitry. This is not feasible with the current state of technology. The two-phase monitoring scheme simplifies the circuitry significantly.

2. Two-phase monitoring scheme: The idea here is to separate the programming and the monitoring phase. Such an approach simplifies the circuitry necessary to do this task significantly. Several designs have been proposed to do this such as [33][36][48]. In general, the approach is as follows:

(a) Check if the current state of the memristor is the desired state. If yes, stop, else continue.

(b) Drive a programming pulse into the device based on the error in the device state.

(c) Stop the pulse and read new state.
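The three steps above can be sketched as a program-and-verify loop. The device model, the `read_conductance`/`apply_pulse` interface, the step gain, and the tolerance are all hypothetical stand-ins for the circuitry described in the cited designs:

```python
# Sketch of the two-phase (program-then-read) regulated write loop. The toy
# device's pulse response is deliberately imprecise; the loop converges by
# feedback rather than by modelling the device.
def regulated_write(device, g_target, tol=1e-6, max_pulses=100):
    for _ in range(max_pulses):
        g = device.read_conductance()               # (a)/(c) monitor phase
        error = g_target - g
        if abs(error) <= tol:                       # (a) already at target?
            return g
        device.apply_pulse(polarity=1 if error > 0 else -1,
                           width=abs(error))        # (b) error-scaled pulse
    return device.read_conductance()

class ToyMemristor:
    """Toy device: each pulse moves the conductance by an unmodelled gain."""
    def __init__(self, g=1e-4):
        self.g = g
    def read_conductance(self):
        return self.g
    def apply_pulse(self, polarity, width):
        self.g += polarity * 0.5 * width            # imprecise device response

dev = ToyMemristor()
final = regulated_write(dev, g_target=5e-4)
print(final)
```

Even though the toy device only realises half of each requested update, the read-back loop still lands within tolerance, which is the essential advantage of regulated over unregulated writes.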

It can be seen that all the devices in the crossbar array are a part of the network connecting any given input and output in the array. These devices create undesirable sneak paths in the array. They make it very difficult to precisely program a device without affecting the devices lying on the sneak paths. A diagram representing the sneak paths is shown in Figure 3.6 which illustrates the effect using the blue coloured traces. The larger the array, the stronger the effect of sneak paths on the output values.

Figure 3.6: Sneak paths in a crossbar array

A partial solution to this problem was presented in [49], where a technique called cyclical programming is employed. In this method, a high impedance is applied across the untargeted devices (Figure 3.7). The other switches shown in the figure are to control the op-amp configuration during the programming and read phases.

The untargeted devices are disabled by applying a high impedance Z across the corresponding nodes (V1,...,V6). Such an arrangement also permits simple arithmetic operations such as addition or subtraction, as detailed in the paper. However, it must be noted that this scheme is not suitable for memristor arrays with multiple columns because of the parallel sneak paths that are created in the presence of high-impedance contacts in a crossbar arrangement.

In order to solve this problem, [33] proposes a design which involves the use of a transistor in crossbar arrays as shown in Figure 3.8. Each memory element is now made of

Figure 3.7: Memristor-based analogue memory/computing unit [49] © IEEE

Figure 3.8: A 1T1M crossbar array access to the top left element [33] © IEEE

a transistor and a memristor. The voltage level Vdrive is chosen as a small value (less than the threshold) during the read phase and a large value during the programming phase. While this simplifies the architecture, it still needs a transistor embedded in the crossbar array. Another problem with this approach is that the devices have to be trained sequentially. This will slow down the processing when dealing with large memory arrays.

Another interesting approach to solve this problem has been described in [50], where the authors propose a segmented crossbar architecture as shown in Figure 3.9. The AC blocks shown in the figure are used to block currents emerging from the unselected crossbar segments and only enable the desired segments. While the segmented architecture does not eliminate the sneak paths completely, it limits the number of paths to a single segment. This can be a useful technique when large crossbar memory arrays

are required.

Figure 3.9: Segmented crossbar architecture [50] © IEEE

3.3.3 STDP-based algorithms in memristive crossbar arrays

STDP is a learning algorithm for neural networks based on the principle of Hebbian learning. The appendix contains a brief overview of the fundamental principle behind STDP-based training algorithms. The fundamental idea is that the synaptic weights change as a strong function of the temporal relationship between the pre-synaptic and post-synaptic spikes or voltage levels. One can express the change in synaptic weights as:

∆w = F(ξ(∆T)) (3.12) where, ∆T is the time difference between the pre- and post-synaptic spikes. The exact magnitude of ∆w depends on two factors:

1. ξ(∆T), which is the potential across the synapse as a function of the time difference between the pre- and post-synaptic pulses. This is affected by the shape of the action potential generated by the neuron.

2. F(ξ(∆T)), the response of the memristor to a given voltage pulse. This is affected by device fabrication technology and variability.
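As an illustration of Eq. (3.12), a common choice in the STDP literature, though not necessarily the specific F or spike shapes used in the cited designs, is the exponential pair-based window; the amplitudes and time constant below are generic examples:

```python
import math

# Illustrative exponential pair-based STDP window for Eq. (3.12).
# Constants are generic examples, not taken from the cited designs.
def stdp_dw(dT, a_plus=0.10, a_minus=0.12, tau=20e-3):
    """Weight change for spike-time difference dT = t_post - t_pre (seconds)."""
    if dT > 0:       # pre before post: potentiation
        return a_plus * math.exp(-dT / tau)
    elif dT < 0:     # post before pre: depression
        return -a_minus * math.exp(dT / tau)
    return 0.0

# Sign follows causality, and closely spaced pairs change the weight more.
print(stdp_dw(5e-3), stdp_dw(-5e-3), stdp_dw(40e-3))
```

In a memristive implementation, this window emerges physically: overlapping pre- and post-synaptic pulses produce a net voltage ξ(∆T) across the device, and the device's own threshold and non-linearity realise F.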

The effect of spike shapes on the potential generated across a synapse (ξ(∆T)) is shown in Figure 3.10.

Figure 3.10: Voltage across the synapse ξ(∆T) for various action potential shapes [51] © Frontiers in Neuroscience

The parallel between STDP and the memristor is that when a large positive (negative) voltage is applied across a memristor, its conductance increases (decreases). This observation has led researchers to believe that the memristive behaviour of the brain is key to understanding its operation. It also led to the development of a plethora of memristive crossbar arrays trained using STDP-based training rules. One of the most interesting applications of this observation is in using memristive crossbar arrays to mimic the orientation-selective property of the V1 layer of the visual cortex [52]. In this work, each column of a crossbar array terminates with a neuron and the rows correspond to the input features. The training data comes from a special image sensor that generates an AER-encoded output whenever the change in incident photo-intensity levels exceeds a threshold. At the end of the training process, orientation maps similar to those in the visual cortex were observed in the crossbar array. This result, although useful, is not surprising since the ability of STDP learning rules to capture orientation maps was already well-known. The interesting take-away is that memristor crossbar arrays are exceptionally well-suited to model synapses. Several other researchers have built systems that use STDP-based training rules, such as [51][53][54][55]. All these designs use a crossbar array with memristors as synaptic elements. The difference lies in the complexity of pulse generation, adaptation of the thresholds used in the neurons, encoding schemes for the input data, etc. These systems are still actively being explored by a large number of researchers world-wide. It must also be noted that STDP-based learning methods have not yet attained the performance metrics exhibited by networks trained using back-propagation or the gradient-descent algorithm.
The paper [17] has a table illustrating the performance of various STDP-based networks. Neural networks based on gradient-descent learning have been able to attain much superior performance (for example, LeNet [56]). Today's large-scale neural networks based on gradient-descent, such as GoogLeNet, AlexNet [57], etc., are being used for larger and more complex tasks. In fact, these networks have attained near-human performance on datasets provided in contests such as ImageNet. This difference in performance is the primary motivation behind the efforts to use back-propagation based learning on memristive crossbar arrays.

3.3.4 Back-propagation algorithm

A detailed description of the gradient-descent or back-propagation implementation issues is provided in the next chapter. This section merely reflects upon the existing ideas in order to complete the topic of memristive crossbar applications. While STDP-based learning schemes are an extremely popular area of research, back-propagation based networks have drawn much less attention [58][59][60]. This is incongruous because most of the commercially used neural networks today are trained using this algorithm. (A brief overview of the back-propagation algorithm is provided in the Appendix.) In [60], Soudry et al. discuss an implementation that involves a grid of synapses. However, each synapse is made of two transistors and a memristor. The work presented in [58] also implements the gradient-descent algorithm on a transistor-free crossbar array. Both these papers demonstrate the performance of their networks on the Wisconsin Breast Cancer Dataset. In both these works, the problem of imprecise programming is unanswered. The designs generally attempt to drive training pulses that induce changes in the device conductances that are proportional to the update magnitude. For reasons described earlier, this is not practically feasible. In [59], Fabien et al. train a simple linear classifier. While they do not explicitly refer to the training technique as the back-propagation algorithm, the approaches are equivalent. This work is interesting because they worked with real devices. Secondly, the approach does not use any transistors within the crossbar array. Finally, they use an efficient 4-phase training scheme which allows for parallel programming of the devices. In addition to these, [61][62][63] discuss gradient-descent algorithmic ideas for implementation on crossbar arrays that are similar to the ideas presented in the next chapter of this thesis.
The authors of [64] argue that implementing the back-propagation pulses might be infeasible owing to the micro-voltage pulse accuracy required for the training process. However, the paper is not clear about the effect of modulating the width of the training pulses, which is the approach presented in this thesis. We use a pulse-width modulation based training algorithm to train purely memristive crossbar arrays. These ideas are covered in the next chapter.

3.3.5 Dynamical systems and other circuit applications

While memristors have been a subject of significant interest to the neuromorphic research community, they have also been used in a variety of other applications. In fact, even before the connection to STDP was found, these devices were used in dynamical circuits to design noise generators, oscillators, etc. [65][66]. Unlike resistors, inductors, or capacitors, memristors exhibit a hysteretic loop which results in non-linear electrical behaviour. Another interesting idea is to use memristors for programmable analogue circuits [67]. These designs take advantage of the re-programmability of memristors in order to correct or modify the behaviour of analogue circuits. The authors recognize that there are issues such as poor models, variability, etc. and also propose ideas to mitigate them. Unfortunately, these designs are currently not feasible for production on an industrial scale because of manufacturing difficulties. They are, however, of interest because they demonstrate how circuits and systems stand to benefit by incorporating memristive behaviour. This thesis does not delve much into these designs because the focus is on hardware neural networks. Chapter 4

Gradient-descent in crossbar arrays

This chapter covers techniques to implement gradient-descent based learning on memristive crossbar arrays. The key idea behind the implementation techniques discussed in this chapter is an approximation to the standard gradient-descent algorithm, called the Unregulated Step Descent (USD) algorithm. This algorithm is an original and key outcome of this research work. This chapter describes the reasoning behind the USD approximation and how it addresses various hardware implementation issues such as the effect of device parameters and variability. The performance of the algorithm is analysed by testing it on artificially generated and real-world datasets. The chapter also discusses how the USD algorithm results in a simpler training architecture for crossbar arrays by enabling a simple 4-phase training procedure. This procedure makes it possible to update all the crossbar memory elements in parallel, reducing the hardware cost, time, and complexity of the training architecture significantly. Some of the material presented in this chapter has been published by the author of this thesis in [68].

4.1 Introduction

The discussion in the previous chapter highlighted some of the limitations of memristors, one of which was high device variability [69]. Given the incessant focus on higher device densities and the physical limitations of small devices, this problem is unlikely to vanish. The chapter also described how the lack of precise device models affects memristive circuit design.


In spite of these difficulties, the potential benefits of memristors have generated a lot of interest in developing memristor-based computational circuits [46][70]. Neuromorphic systems consisting of dynamic circuits for neurons and using memristors to mimic the plasticity rules seen in biological synapses are especially popular [52][53][54][71]. The training rules for these systems are generally based on Hebbian STDP (Spike Timing Dependent Plasticity) learning, as described in the previous chapter. These systems have been demonstrated to work reasonably and are also tolerant of high device variability. However, the learning performance of these systems is rather limited when pitted against the state-of-the-art systems that use deep learning techniques. The deep learning systems use neural networks such as Restricted Boltzmann Machines (RBMs), Convolutional Neural Networks (CNNs), auto-encoders, etc., and are trained using the gradient-descent algorithm. A concise summary of these networks is provided in the Appendix. Simple memristive hardware based on some of these neural network architectures, performing simple binary pattern classification tasks, has been demonstrated [59]. As it stands today, these neural networks perform better and use fewer training parameters than spiking systems [17]. Therefore, an investigation of how memristive crossbar arrays can implement gradient-descent based learning is of practical relevance.

4.2 Gradient descent algorithm

Gradient descent is a first-order optimization method for finding a local minimum of a function. For convex functions, it converges to the global minimum. The method is used for a variety of optimization tasks such as matrix inversion and objective function minimization in several statistical inference problems. Gradient descent is also the principle underlying back-propagation and similar training algorithms used to train a wide variety of artificial neural networks such as RBMs, CNNs, auto-encoders, etc. In order to train a network using the gradient-descent algorithm for a given training dataset (X (input vectors), Y (outputs)), an objective or cost function F(w,X,Y) with parameters w is defined. The goal of the training procedure is generally to find the parameters that minimize or maximize the objective function. The gradient-descent algorithm uses the gradient of the objective function with respect to w in order to do this. The technique is guaranteed to converge to the global minimum if the objective function is convex. In normal gradient descent, the gradient is approximated by averaging over all the training samples. The update rule for the parameters is as follows:

w(t + 1) := w(t) − (α/M) Σ_{i=1}^{M} ∇w F(w(t), xi, yi)    (4.1)

Figure 4.1: Gradient descent for a 2-dimensional objective function. F(w0,w1) in the figure is the same as F(w,x,y).

where xi is the i-th training example, M is the number of training samples, and α is a parameter that controls the update magnitude. Some of the common objective functions used in machine learning are the least-square error, the Kullback–Leibler divergence, and the log-likelihood estimate. The back-propagation algorithm for multi-layer neural networks involves computing the gradient with respect to every weight in the network. This calculation is greatly simplified by using the property that the gradient with respect to a weight in a particular layer can reuse the gradient values computed for the higher layers [72]. Almost every popular neural network implementation uses this technique. There are several variants of gradient descent. One variant, called stochastic gradient descent [73], is particularly interesting. In this variant, the weights are updated for every training sample, and the update rule simplifies to:

w(t + 1) := w(t) − α · ∇w F(w(t), xi, yi)    (4.2)

Note that in all the following equations the step indices are omitted for brevity.

The idea is that when the training samples are chosen at random, the net effect of all the weight updates averages out to give the same result as the non-stochastic gradient-descent algorithm. The total time taken for full convergence using stochastic gradient descent is higher than for the non-stochastic version. On the other hand, the stochastic version takes much less time to reach a predefined error measure [73], which is of interest when the training set is large. Updating the weights for every training sample also makes the weight updates noisy. One way to mitigate this effect is to gradually decrease the learning rate α with iterations. Another technique used in some neural networks is called mini-batch stochastic gradient descent, where the error is averaged over a small batch of training samples. This approach tries to reduce the noise created by single-sample stochastic descent while retaining the benefits accruing from stochastic descent. However, as we discuss in this chapter, device limitations make it difficult to directly implement steepest gradient descent in memristive crossbar arrays. Therefore, we propose an approximate gradient-descent rule that simplifies the hardware used to train memristive crossbar arrays. We also investigate the effect of device parameters and variability on the performance of the algorithm.
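As an illustration of the two update rules, the following sketch compares the batch update of Equation 4.1 with the stochastic update of Equation 4.2 on a small least-squares problem. The quadratic cost and the synthetic dataset here are illustrative assumptions, not part of the thesis experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative least-squares problem: y = w_true . x + noise
N, M = 3, 200
w_true = np.array([0.5, -1.0, 2.0])
X = rng.uniform(-1.0, 1.0, size=(M, N))
y = X @ w_true + 0.01 * rng.standard_normal(M)

def grad(w, x, t):
    """Gradient of the squared error 0.5 * (t - w.x)^2 with respect to w."""
    return -(t - w @ x) * x

# Batch gradient descent (Equation 4.1): average the gradient over all
# M training samples before taking a step.
w_batch = np.zeros(N)
alpha = 0.1
for _ in range(500):
    g = sum(grad(w_batch, X[i], y[i]) for i in range(M)) / M
    w_batch -= alpha * g

# Stochastic gradient descent (Equation 4.2): one random sample per step.
w_sgd = np.zeros(N)
for _ in range(5000):
    i = rng.integers(M)
    w_sgd -= alpha * grad(w_sgd, X[i], y[i])

print(w_batch, w_sgd)   # both approach w_true = [0.5, -1.0, 2.0]
```

The stochastic trace is noisier from step to step, but it makes many more parameter updates per pass over the data, which is what makes it attractive for large training sets.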

4.3 Gradient descent for linear classifiers

One of the simplest networks that can be trained by gradient descent is a linear classifier, or perceptron. These networks are generally trained using the method of least-square minimization. In this method, the objective function quantifies the mean square error generated by a predictive model hw(x) when it is used to fit the training samples (yi, xi), i ∈ 1,...,M. Each training sample consists of a two-tuple of an observed output yi and an input feature vector xi = [xi,1, xi,2, ..., xi,N]^T. The objective function can be expressed as:

F(w) = (1/M) Σ_{i=1}^{M} (yi − hw(xi))²    (4.3)

For linear predictive models of the form hw(x) = w^T x, the update rule for w = [w1, w2, ..., wN] in each iteration is given by the method of steepest descent as:

wk := wk − (α/M) Σ_{i=1}^{M} (yi − hw(xi)) xi,k    (4.4)

The update rule for the stochastic version of this equation is expressed as:

wk := wk − α · δ · xi,k    (4.5)

where the prediction error is

δ = yi − hw(xi)    (4.6)

This stochastic rule introduces more noise into the convergence behaviour, but it converges faster than the steepest-descent technique for large datasets [73].

Figure 4.2: Block diagram of training module [68]

4.4 Gradient descent in crossbar arrays

From an implementation perspective, the gradient-descent algorithm consists of two passes per iteration: a forward predictive pass and a reverse update pass. The forward pass is equivalent to a matrix multiplication operation. One method to implement this on crossbar arrays is shown in Figure 4.2. Here, by appropriate scaling, the weights w are treated as conductances g, the inputs x as voltages v, and the outputs hw(x) as currents i. During the scaling operation, care must be taken to ensure that the rescaled values are within the dynamic range of the relevant circuit blocks. Additionally, it must be ensured that the output currents ik are sunk into virtual grounds to eliminate the effect of loading on the current outputs. This is a common feature of neurons designed in analogue VLSI.

The parameters of the predictive model are updated in the reverse pass. Factors such as the memristive device characteristics, the dynamic range of the input, and variability must be considered when designing such a system. The operating principle is fairly simple: the conductance of a memristive device changes when it is subjected to a non-zero flux. Therefore, the goal is to effect the desired change in memristor conductance by modulating the amplitude and/or duration of the training pulse β applied to the device. The weight update rule can now be expressed as:

wk := wk − λk(β, ωk)    (4.7)

where λk(β, ωk) is the change in device conductance when a training pulse β is applied to a device with state parameters ωk. In theory, any weight update can be implemented in this manner. For example, Equation 4.7 is equivalent to Equation 4.5 if λk(β, ωk) is made equal to α · δ · xi,k. However, such precise control over the memristor states is impractical for reasons discussed earlier. The designer can only control the direction of the updates.

The function λk is typically monotonically increasing with respect to β, and positive (negative) pulses result in an increase (decrease) of memristor conductance. However, the exact magnitude of the change in conductance is a complex function of the device state. For example, a larger change in conductance is observed at a higher conductance state than at a lower conductance state. Additionally, the internal state parameters ωk of the device might not be directly observable. In most instances, the device models are either too simplistic or too computationally expensive to compute λk to the desired precision. The memristor behaviour may also be stochastic. This makes it very difficult to stimulate a precisely controlled change in device conductance by simple schemes. Some of the methods to precisely program memristors have already been discussed in an earlier chapter. These methods are based on look-up tables (LUTs) or feedback schemes. However, they have several implementation issues. Memristive devices are characterized by high variability (spread across orders of magnitude), making it impossible to use a shared LUT for the entire array. Using a dedicated LUT for each device is clearly impractical because it would increase the cost and require a lot of memory. The feedback schemes typically incorporate CMOS devices within the crossbar array, which makes fabrication more complex and expensive. Such schemes also require dedicated control circuitry and do not permit the devices to be trained in parallel, because each device has to be precisely regulated. Sequential training is unattractive because high-dimensional inputs result in a large number of devices, making the training process infeasibly slow. Therefore, in order to enable the use of memristive crossbar arrays in practical computational hardware, better training mechanisms are essential.
It must be noted that the algorithms used for pattern recognition tasks do not try to precisely compute a particular set of parameter values for the prediction model; that task is impractical for larger datasets. Instead, these algorithms try to minimize or maximize the objective function. Inspired by this feature of the training algorithms, we propose an approximate gradient-descent rule that allows the devices to settle to a state that minimizes the least-square error on the training data without actively trying to regulate each device state to a predetermined value at every step.

4.5 Unregulated step descent

In order to address the issues discussed in the previous section, we propose to approximate the training pulse as:

β ∝ α · sign(δ · xi,k)    (4.8)

Using β from Equation 4.8 in Equation 4.7 is the primary idea behind the Unregulated Step Descent (USD) algorithm. The motivation is similar to that of the Manhattan rule [74]. The programmer chooses α ignoring the exact magnitude of δ · xi,k. The intuition behind Equation 4.8 is not to regulate the update step size, which is impractical in memristive crossbar arrays for reasons discussed earlier. Instead, the training rule shifts the weights in the general direction that lowers the objective function. The learning algorithm corrects the error introduced by this approximation by taking small steps and recalibrating the direction of descent in each iteration. One limitation of the USD approximation is that it is not always easy to choose a learning rate that ensures convergence. As shown in Figure 4.3, a good choice can improve the rate of convergence for weights whose gradients are large. It can also be seen in Figure 4.3 that the direction of descent differs from that of the steepest-descent method. In addition, when the solution approaches a local minimum, the ability of the training rule to converge is limited by α. This effect is seen as noise, or settling error, that can be reduced by adaptively shrinking α as the solution converges. Note that this figure was generated with update step sizes that are the same for all the input weights; this condition does not typically hold when training a memristor crossbar array.
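A software sketch helps make Equations 4.7 and 4.8 concrete. The `pulse_response` function below is a hypothetical stand-in for the state-dependent device response λk (larger changes at higher conductance, as described above), not a calibrated memristor model; negative effective weights are realized with a differential pair, as in the simulation setup described later in this chapter. The pulse direction follows sign(δ · xi,k), applied so as to reduce the prediction error.

```python
import numpy as np

rng = np.random.default_rng(1)

def pulse_response(w, step):
    """Hypothetical state-dependent response (lambda_k): a pulse changes
    the conductance more when the device is in a higher-conductance
    state. This is an illustrative law, not a calibrated device model."""
    return step * (0.1 + w)

# Illustrative linearly separable task. Negative effective weights come
# from a differential pair (wplus - wminus) of conductances in [0, 1].
N, M = 5, 400
w_true = rng.uniform(-1, 1, N)
X = rng.uniform(-1, 1, (M, N))
y = np.sign(X @ w_true)

wplus = np.full(N, 0.5)          # conductances start mid-range
wminus = np.full(N, 0.5)
alpha = 0.05

for _ in range(2000):
    i = rng.integers(M)
    h = np.sign(X[i] @ (wplus - wminus))     # forward (read) pass
    delta = y[i] - h                         # prediction error
    # USD: only the sign of delta * x picks the pulse polarity; the
    # magnitude of the conductance change is left to the device.
    direction = np.sign(delta * X[i])
    wplus = np.clip(wplus + direction * pulse_response(wplus, alpha), 0, 1)
    wminus = np.clip(wminus - direction * pulse_response(wminus, alpha), 0, 1)

error = np.mean(np.sign(X @ (wplus - wminus)) != y)
print(error)
```

Note that the controller never reads or regulates the individual device states; it only chooses the pulse polarity, which is the essence of the USD approximation.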

Figure 4.3: Convergence of the USD algorithm when finding the minimum of a paraboloid function

It should be noted that other inexact gradient-descent methods, such as stochastic gradient descent, are based on a similar idea to USD: an approximation of the actual gradient of the objective function is used to update the weights.

4.6 USD versus other methods

The parallel weight perturbation (PWP) algorithm [75][76] has been proposed as an approximation of the gradient-descent algorithm suitable for analogue hardware. However, we argue that the USD approximation is better suited to crossbar arrays. In the PWP algorithm, the weights are perturbed such that the magnitude of the change is uniformly distributed. The resulting change in the objective function is computed and used to approximate the gradient, which is then used to update the weights. Note that a uniformly distributed perturbation is important for the PWP algorithm to work well [76]. Multiple perturbations are sometimes performed to improve the convergence behaviour. Implementing PWP on crossbar arrays is difficult for the following reasons:

1. Every update involves two device programming steps: a perturbation step and a correction step. If multiple perturbations are performed, the number of programming steps increases further. On the other hand, the USD implementation involves only a single programming step in each iteration.

2. It is impossible to perturb all the devices in parallel in a manner that ensures that the perturbations are uniformly distributed (because of the state dependence in memristors). One alternative is to perturb the devices sequentially, with the pulse widths controlled to effect the desired changes. However, as discussed earlier, it is practically impossible to control the state of each device precisely. Even if it were possible to precisely control the updates, cycling through each device individually makes training time scale with the number of devices. The USD algorithm is not subject to such limitations.

3. Finally, the USD algorithm does not need a uniform perturbation signal generator. This simplifies the design of the circuit.

In parallel with this work, a few other research groups have been working on similar techniques. An approach similar to the USD algorithm was used to train small but real memristor arrays in [62]. In [61], the authors propose two methods of crossbar training: a fixed-amplitude training method similar to USD, and a variable-amplitude training method. In the variable-amplitude method, logarithmically scaled voltage pulses are applied to the crossbar array. A voltage proportional to the update magnitude in Equation 4.5 is applied across a device by applying voltages of log(δ) and log(xi,k) to its two terminals. When applied to a memristor characterized by an exponential relationship between change in conductance and applied voltage, this results in the desired net change in conductance. The authors indicate that the variable-amplitude method can out-perform the fixed-amplitude training procedure. However, this mechanism assumes an exponential relationship in the memristor equations, which is not necessarily always true. The approach proposed in this thesis to improve accuracy is to shrink the training pulses as the classifier converges. It would be useful to experimentally compare the two approaches and evaluate their performance.
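The variable-amplitude scheme rests on the identity exp(log δ + log xi,k) = δ · xi,k: if the change in conductance is exponential in the applied voltage, then driving the two terminals with log-scaled voltages produces a net update proportional to δ · xi,k. A minimal numerical sketch, in which the exponential device law is an assumed illustration:

```python
import numpy as np

def conductance_change(v, k=1.0):
    """Assumed device law: the change in conductance grows exponentially
    with the voltage across the device (illustrative, not fitted)."""
    return k * np.exp(v)

delta, x = 0.3, 0.7     # prediction-error and input-feature magnitudes

# Drive log(delta) into one terminal and -log(x) into the other, so the
# device sees log(delta) + log(x) across it.
v_across = np.log(delta) + np.log(x)

dg = conductance_change(v_across)
print(dg, delta * x)    # both are 0.21: exp(log d + log x) = d * x
```

Signs are handled separately by the pulse polarity; the logarithms only encode magnitudes.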

4.7 Implementation

Most of the commonly used memristive device models [35][41][77] define a threshold voltage vth for memristors. This is the voltage across the device below which the conductance does not change, or changes very little. This property is used in the forward pass of each iteration, where it is ensured that the voltages used for reading the array are much lower than vth. This restriction minimizes changes in the device state caused by read operations. The gradient of the objective function is approximately computed using the sign of the input and the resultant prediction error, as suggested by the USD approximation. The computed values are then used to train the devices in the update pass using the 4-phase scheme shown in Figure 4.4.

Figure 4.4: 4-phase training scheme. The voltage levels in the pulsing scheme can take only three values: +Vdrive, 0, and −Vdrive [68]

A similar scheme was used in [59] to train a binary perceptron. As discussed in the memristive learning chapter, [59] experimentally demonstrates simple pattern recognition tasks using a single-layer perceptron network in a memristive crossbar array. In this scheme, as illustrated in Figure 4.4, four-phase pulse patterns are driven into the inputs (rows) and outputs (columns) of the crossbar array. The pattern is chosen based on the sign of the input feature and the output error, as shown in Figure 4.4. In each phase, only a subset of the devices sees a voltage difference that is large enough to effect a change in conductance. For example, in the first phase, memristors that are driven by positive inputs and belong to columns that generate positive output errors see a voltage of 2Vdrive, while the other devices see only 0 V or Vdrive. Vdrive should be chosen such that Vdrive < vth < 2Vdrive.

This guarantees that the untargeted devices do not see a voltage difference greater than vth.

Additionally, keeping the pulse widths the same in all the phases is necessary to eliminate 2Vdrive sneak paths that can corrupt the state of the untargeted devices. Note that the simplification given by Equation 4.8 is essential for the 4-phase scheme to work; otherwise, each device would require a unique training pulse corresponding to its weight update.
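The phase-selection logic can be sketched as follows. The assignment of (input sign, error sign) pairs to phases is our reading of Figure 4.4 and should be checked against the actual pulse diagram; the sketch only verifies the property stated in the text, namely that targeted devices see 2·Vdrive across them while untargeted devices never exceed Vdrive.

```python
import numpy as np

V = 1.0  # Vdrive, chosen so that Vdrive < v_th < 2 * Vdrive

def phase_voltages(sign_x, sign_err, phase):
    """Row/column drive voltages for one of the four phases. Each phase
    targets one (input sign, error sign) combination; only the targeted
    devices see the full 2*Vdrive across them, with the polarity set by
    the desired update direction."""
    tx, te = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)][phase]
    d = tx * te                              # update direction for targets
    v_row = np.where(sign_x == tx, d * V, 0.0)
    v_col = np.where(sign_err == te, -d * V, 0.0)
    return v_row, v_col

sign_x = np.array([+1, -1, +1])      # signs of the input features (rows)
sign_err = np.array([+1, -1])        # signs of the column errors

for phase in range(4):
    v_row, v_col = phase_voltages(sign_x, sign_err, phase)
    v_dev = v_row[:, None] - v_col[None, :]  # voltage across device (r, c)
    tx, te = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)][phase]
    targeted = (sign_x[:, None] == tx) & (sign_err[None, :] == te)
    assert np.all(np.abs(v_dev[targeted]) == 2 * V)   # above threshold
    assert np.all(np.abs(v_dev[~targeted]) <= V)      # below threshold
```

After all four phases, each device has seen exactly one super-threshold pulse whose polarity matches its own sign(δ · xi,k), which is what allows the whole array to be updated in parallel.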

One potential optimization is to scale the pulse amplitudes in proportion to δ and xi,k.

Such a scheme can help reduce the settling error and settling time. This study does not pursue the idea because we expect significant variability in vth, which is not modelled by the BCM model used in our simulations. Experimental evidence and device models also suggest that the threshold voltage is a soft boundary, below which small changes in device conductance do occur [3]. This can result in drift in the weights of the non-targeted devices. It should be noted that the 4-phase scheme partially corrects for this drift by ensuring that all the devices effectively see a net flux only in the desired update direction. The training algorithm is also self-correcting, which makes the effect of drift during training considerably less serious.

4.8 Simulations

4.8.1 Simulation Setup

Simulations to test the algorithm were run using the Boundary Condition Model (BCM) for memristors that was described in an earlier chapter. The device parameters are listed in Table 4.1.

Table 4.1: Device parameters for the BCM model used in the simulations

Property   Description                              Value
Goff       Mean off-state conductance               6.25e-4 S
Gon        Mean on-state conductance                Goff × Gratio
µv         Mobility of oxygen ions                  1e-14 m² s⁻¹ V⁻¹
D          Length of thin-oxide film                10 nm
η          Mean value of polarity coefficient       1
vmax       Maximum voltage applied to the devices   1 V
t0         Time normalization factor                10 ms
|vth|      Threshold voltage magnitude              1 V

The BCM model provides a fair approximation of the electrical properties of a memristor [77] even though it does not model the threshold voltage. Note that the vth described in the BCM model is different from the threshold voltage being discussed here. The lack of threshold modelling is not an issue when the array is trained using the USD rule, because the amplitudes of the training pulses are fixed and much larger than the voltage threshold vth of the devices in the array. The effect of variability was tested by randomly generating the on-state conductance Gon, off-state conductance Goff, and polarity coefficient η values.

Variability was controlled using the control parameter σ: for any device property with mean value p, sample values were generated such that the standard deviation of the distribution was σ · p. Based on empirical observations [69][51], the log-normal distribution was used for Gon and Goff; η was sampled from a normal distribution. Idealized blocks were used to model the peripheral circuit elements such as the sigmoid function, scaling, logic, and pulse generation. Note that the model implicitly assumes that the memristive conductance is continuous; therefore, this study does not concern itself with the effect of finite resolution. This is a realistic assumption, as can be seen in [78]. To study the effect of device parameters on the performance of the array, tests were run on several datasets of size M = 1000 samples and dimensionality N = 20. The linear classifier was chosen to test the USD rule because it uses gradient descent and is simple enough to develop intuition for the performance of the technique. Additionally, the crossbar architecture for the linear classifier is similar to that of more complicated machine learning algorithms such as polynomial regression or classification, MLPs, RBMs, etc. Negative weights are implemented by driving input features of the same magnitude but opposite polarity into the array. A bias term is learnt by using a constant input as an added dimension of the input vector. For example, a column with 22 memristors can be used to train a linear classifier on a ten-dimensional training set. Alternatively, two columns of 11 devices each, with a summation circuit, can be used for the same training set. In this work, the simulations assume ideal peripheral circuitry; under these conditions, both structures are equivalent.
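The device-sampling procedure can be sketched as follows. The moment-matched parameterization of the log-normal (so that the samples have mean p and standard deviation σ · p) is our assumption about how σ was applied; the parameter values are taken from Table 4.1.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_lognormal(mean, sigma_rel, size):
    """Draw log-normal samples with the given mean and a standard
    deviation of sigma_rel * mean (moment-matched parameterization)."""
    var = (sigma_rel * mean) ** 2
    mu = np.log(mean**2 / np.sqrt(var + mean**2))
    s = np.sqrt(np.log(1 + var / mean**2))
    return rng.lognormal(mu, s, size)

Goff_mean, Gratio, sigma = 6.25e-4, 100, 0.2
n_devices = 1000

Goff = sample_lognormal(Goff_mean, sigma, n_devices)
Gon = sample_lognormal(Goff_mean * Gratio, sigma, n_devices)
eta = rng.normal(1.0, sigma * 1.0, n_devices)   # polarity coefficient

print(Goff.mean(), Goff.std())   # close to 6.25e-4 and 0.2 * 6.25e-4
```

Because the log-normal is skewed, even σ = 0.2 produces a long tail of high-conductance devices, which is consistent with the wide spread shown in Figure 4.15.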

The roles of the initial conditions and the device parameters Gratio (= Gon/Goff), σ, and α in the performance of the crossbar array are presented in the following sections. The training pulses used in all the simulations have an amplitude Vdrive = 1 V and duration α · t0. The learning capability of the array is analysed in terms of settling time and settling error. Classification error is computed from the performance of the array on the entire training dataset in each iteration. Settling error is defined as the asymptotic mean error rate observed on the training dataset. Since the USD algorithm is noisy in nature, the mean classification error rate was computed by averaging the classification error over the preceding 10000 iterations. Settling time is defined as the number of iterations taken until no improvement is seen in the averaged settling error. The training sets used to obtain the plots in Figures 4.10–4.12 and 4.13–4.14 were generated by sampling from a uniformly distributed N-dimensional random variable and splitting the sample space by a randomly chosen hyperplane. A random subset of the training samples was misclassified (by assigning the label corresponding to the wrong side of the dividing hyperplane) with probability pe, which unless stated otherwise is set at 5%.
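The synthetic training sets just described can be generated as in the sketch below; the sampling range and the hyperplane-through-the-origin parameterization are assumptions where the text leaves the details open.

```python
import numpy as np

rng = np.random.default_rng(7)

def make_dataset(M=1000, N=20, p_e=0.05):
    """Uniform N-dimensional samples split by a random hyperplane, with
    a fraction p_e of the labels flipped to simulate label noise."""
    X = rng.uniform(-1, 1, (M, N))
    normal = rng.standard_normal(N)   # normal vector of the hyperplane
    y = np.sign(X @ normal)
    flip = rng.random(M) < p_e        # mislabel with probability p_e
    y[flip] *= -1
    return X, y

X, y = make_dataset()
print(X.shape, np.unique(y))
```

With p_e = 0.05, about 5% of the samples lie on the wrong side of the true boundary, so even a perfect linear classifier cannot reach zero training error on these sets.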

4.8.2 Initial condition analysis

Purely from an algorithmic perspective, gradient descent treats all the training parameters equally. However, on crossbar arrays, initialization affects the convergence behaviour and performance of the algorithm, as described in this section. This arises because of the peculiarities of the device dynamics. Figure 4.5 shows that initializing the weights of the array close to 0 results in very slow convergence. In fact, the improvement in classification error in the first few iterations is almost imperceptible. However, the noise in the corresponding settling error trace is much smaller. Unfortunately, this might not be of much use practically, because when the weights are initialized to small values, they settle to values that are also very small and similar, as shown in Figure 4.6.

Figure 4.5: Convergence behaviour for different weight initializations

This is not a problem when we have ideal peripheral circuitry with no resolution or dynamic-range limitations. However, with real circuits, it will result in an increase in settling time, and such a system is also more likely to be affected by noise. The algorithm will nevertheless converge because of the tendency of gradient descent to find a minimum. It can be seen in Figure 4.5 that the settling time improves as the initialization weights are made higher. The random initialization test-case simulates the situation where the weights are initialized to values uniformly distributed between 0 and 1. In this case, the settling time lies between the low- and high-initialization cases. The settling-time behaviour of the mid- and high-initialization cases looks similar because, even at mid-conductance values, the change in memristive conductance is much higher than when initialized to the lowest-conductance state (for the BCM memristor model).

Figure 4.6: Weight updates versus iterations when initialized close to 0.

Figures 4.6 and 4.8 display a behaviour that can potentially be a cause for concern: the values attained by the weights are not uniformly spread between 0 and 1 at convergence. This can result in a situation similar to the low-initialization case described earlier.

Figure 4.7: Weight updates versus iterations when initialized to Gratio/2.

From a circuit-design perspective, it is better to have a higher dynamic range in the conductances of the devices. The likelihood of this increases if the weights are spread across the entire range [0, 1]. Not having a good spread might result in poor performance because of the dynamic-range limitations of the peripheral circuitry. This will worsen the settling time and may sometimes affect the settling error.

Figure 4.8: Weight updates versus iterations when initialized close to 1.

Interestingly, the random initialization case resulted in a good spread of weights over the entire range between 0 and 1 at convergence. Although interesting, such an initialization scheme might not be feasible in practical applications. Another observation from Figures 4.6–4.9 is that, as the iterations increase, smaller weights tend to remain unchanged while there is a significant amount of activity in the larger weights. This is because pulses applied to devices in low-conductance states trigger only small conductance changes, and vice versa. This is seen as settling error in Figure 4.5. One good method to address this is to gradually lower the learning rate; as illustrated in Figure 4.3, such a procedure results in lower settling error. It can also be seen that when the weights are initialized close to 1, a large number of the devices tend to saturate. This is compensated by the negative- or positive-weight counterpart of the array. Saturation of the weights does not appear to be a significant problem in our experiments.

4.8.3 Effect of device parameters on performance

The effect of the device parameters was studied by initializing the memristor devices to Gratio/2. Each data point was generated by averaging over 40 runs with suitably constrained random datasets and device properties. The effects of Gratio and α on settling error and settling time are shown in Figures 4.10, 4.11, and 4.12. Figure 4.10 shows the change in classification error as the algorithm converges; Figures 4.11 and 4.12 plot the effect of varying Gratio and α on the settling behaviour.

Figure 4.9: Weight updates versus iterations when initialized to uniformly distributed random values between 0 and 1.

Gratio affects the settling time, the settling error, and the design specifications for the peripheral circuitry. As Gratio increases, the algorithm converges faster but displays higher settling error. This can be seen in Figure 4.11, where the coloured bands straddle the maximum and minimum settling error for the corresponding values of Gratio and α. As Gratio and α increase, the bands get wider (Figure 4.11). This effect is caused by the dynamics of memristive behaviour: a simple analysis shows that larger Gratio and α values effect a greater change in device conductance. On the other hand, from a design point of view, it is beneficial to have a greater Gratio because that results in a greater dynamic range in the output current (the difference between the maximum- and minimum-state currents is larger). This makes the design of signal-processing blocks such as the threshold function simpler and less susceptible to variability and noise. This trade-off between larger Gratio or α values and learning performance is a constant feature in our simulations. It should also be noted that the mean error of the algorithm increases only modestly with Gratio and α. The noisy behaviour resulting from larger Gratio values can be compensated by averaging over multiple predictors or by lowering α.

Figure 4.10: Evolution of classification error with iterations [68]

Figure 4.11: Classification error versus Gratio. The bands straddle the maximum and minimum classification error because of settling error at convergence [68]

There are several studies on how the learning rate α affects the performance of a floating-point gradient-descent implementation, and most of those results are directly applicable here too. Our experiments show that a larger α results in faster convergence (Figure 4.12). However, larger values of α also make settling noisier (Figure 4.11). A simple technique to improve settling time and error is to start with a large α and progressively shrink it; recall that this effect was also highlighted earlier when discussing the effect of initialization. It should be noted that even though more iterations are required when α is reduced, the time taken by each iteration decreases too (because α determines the duration of the training pulse). Therefore, the total training time does not necessarily increase for smaller α in an actual hardware implementation. The optimal choice of α is difficult to determine analytically; it is easier to determine empirically, taking into account factors such as the distribution of weights, the training time, Gratio, and the dataset.

Figure 4.12: Settling time (Niters) versus Gratio for different values of α (Alpha). [68]
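The start-large-then-shrink schedule for α can be sketched as a simple geometric decay; the decay factor and interval below are illustrative choices, not values used in the thesis. Since α also sets the training-pulse duration (α · t0), shrinking it shortens each iteration as well.

```python
def decayed_alpha(alpha0, iteration, decay=0.5, interval=5000, alpha_min=1e-3):
    """Halve the learning rate every `interval` iterations, with a floor
    at alpha_min so updates never vanish entirely."""
    return max(alpha0 * decay ** (iteration // interval), alpha_min)

print(decayed_alpha(0.1, 0))        # 0.1
print(decayed_alpha(0.1, 12000))    # 0.025 (two halvings)
print(decayed_alpha(0.1, 10**6))    # clamped at 0.001
```

A schedule keyed to the measured settling error (shrink α only when the averaged error stops improving) would track convergence more closely, at the cost of extra monitoring circuitry.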

4.8.4 Effect of variability on performance

The effect of variability was also studied by initializing the memristor devices to Gratio/2. Figures 4.13 and 4.14 show the effect of device variability on settling error and settling time. The log-normal distribution results in a wide variation in the device properties (Figure 4.15). Higher device variability results in a higher deviation of the input and output currents from the expected mean value, which can potentially result in saturation of the peripheral circuitry. However, the gradient-descent rule partially compensates for this by adjusting the unsaturated components of the system. The compensation mechanism does not appear to affect the mean convergence time significantly, but the variance in settling time increases with increasing device-parameter variability. The effect of device variability can be mitigated by increasing the dynamic range of the peripheral circuits. Arrays with larger Gratio values find it easier to compensate for variability because there is more room to adjust. This can be seen in the smaller spread of settling time (Niters) in Figure 4.14. The effect of variability can also be reduced by training several columns with the same dataset and different initialization conditions; a weighted average of all the predictions can then be used to make the final decision. This technique is similar to the boosting schemes that have been successfully used in several machine learning algorithms. An important requirement for boosting is to have uncorrelated predictors [103]. This is satisfied by the implementation described in this section because the columns are trained independently of each other and are also initialized differently.

Figure 4.13: Classification error (Error) versus σ (sigma).

Figure 4.14: Number of training iterations (Niters) versus σ (Sigma). Training time increases as variability increases. [68] © IEEE

4.8.5 Comparison against floating point implementation

Figure 4.16 shows how the performance of the crossbar array trained by the USD algorithm compares against a floating-point implementation of logistic regression for various values of pe. It can be seen that smaller values of α improve the mean error rate, an effect seen in many of the experiments described earlier. It arises because the devices are updated in every iteration even when they are very close to their ideal states (unless the error is precisely zero, which is practically impossible). Using smaller values of α therefore prevents the final state from changing radically.

Figure 4.15: Effect of σ on the spread of Gon (y-axis) and Goff (x-axis) values, shown using 1000 memristive device samples with mean Goff = 0.003 S and mean Gratio = 100. Note that both the x- and y-axes are in log scale. [68] © IEEE

4.8.6 Performance on the MNIST database

To test the USD rule on a more realistic and complex problem, it was used to train a (784 × 2) rows × 10 columns crossbar array on the complete MNIST handwritten digits dataset [79]. This experiment used soft-max (multinomial) logistic regression [80] as the classification algorithm. Implementing this algorithm on a crossbar array is straightforward and only requires changes in the peripheral circuitry. The crossbar array requires as many column outputs as there are classes; in this example, a crossbar array with ten column outputs was chosen. The output nodes are connected to a winner-take-all circuit that selects the column with the highest output. Other aspects of the design are the same as in the linear classifier case. While the mathematical form of the gradient-descent rule applicable here differs from that of a linear classifier, the final weight update equation is very similar. Therefore, the array can be trained by the USD algorithm using a 4-phase training procedure, effectively the same as that discussed earlier.
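The soft-max arrangement can be modelled in software as follows. The toy three-class dataset below is a hypothetical stand-in for MNIST; one weight column per class is trained with a USD-style sign update, and the winner-take-all readout is simply an argmax over the column outputs.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 3-class toy problem standing in for MNIST.
n_feat, n_class = 6, 3
W_true = rng.normal(size=(n_feat, n_class))
X = rng.normal(size=(400, n_feat))
y = np.argmax(X @ W_true, axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# One crossbar column per class, trained with USD-style sign updates.
W = np.zeros((n_feat, n_class))
alpha = 0.01
for _ in range(5000):
    i = rng.integers(len(X))
    p = softmax(X[i] @ W)            # column outputs after soft-max
    t = np.eye(n_class)[y[i]]        # one-hot target
    grad = np.outer(X[i], p - t)     # soft-max regression gradient
    W -= alpha * np.sign(grad)       # fixed-magnitude (USD) step

# Winner-take-all readout: pick the column with the largest output.
pred = np.argmax(X @ W, axis=1)
accuracy = (pred == y).mean()
```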

We used a wide range of devices, with Gratio ranging between 32 and 1000 and a σ of 0.2. As seen in Figure 4.15, a σ of 0.2 results in very large device variations. A floating-point software implementation (soft-max regression using steepest-gradient descent) gives a test-set error of about 5%.

Figure 4.16: Performance of the stochastic USD implementation in comparison to floating-point logistic regression for various values of pe. [68] © IEEE

An interesting observation made in these experiments was that the validation-set error and test-set error hovered around 7.5% and 10%, respectively, for all the devices. One possible explanation is that the number of input features was larger in the MNIST training set, so the USD algorithm had much more flexibility to correct for the bad devices. This observation is useful because it shows that as the training feature size grows, the algorithm may find it easier to correct the inaccuracies arising from device variability, non-linearities, and algorithmic approximations.

4.9 USD for other algorithms

In this section, we discuss how the USD approximation may be applied to a variety of tasks and networks, such as matrix inversion, Principal Component Analysis (PCA), and Restricted Boltzmann Machines (RBMs). We also discuss how the crossbar array can be set up for these tasks.

4.9.1 Matrix Inversion

One can use the gradient-descent algorithm to compute the inverse of a matrix. Consider that the matrix to be inverted is A. While the procedure outlined here is applicable to a matrix of any shape, the method is illustrated using a square matrix of rank N. For rectangular matrices, the procedure results in left- or right-inverses depending on how the training is set up. The matrix A can be written as [a1, a2, ..., aN]^T, where ai is the i-th row of A. Clearly, if an inverse exists, then A^-1 is also N × N. The training uses an N × N crossbar array. The motivation behind this procedure is the observation that the i-th column of the final array defines a hyperplane that outputs 1 when multiplied by the i-th row of A and 0 otherwise. The training procedure is as follows:

1. Apply the i-th row of the matrix A to the crossbar array. This results in an output Y.

2. The goal is to minimize the output error using the least-squares minimization procedure.

3. Compute the error in output Y, using the target output H = [0, 0, ..., 1, ..., 0], where H[i] = 1 and all other entries are 0.

4. Compute the gradient of the squared-error function.

5. Apply the USD approximation.

6. Repeat the procedure for all the rows of A until convergence.

The final weight matrix of the array is A^-1. Interestingly, the resulting update equation is very similar to Equation 4.5. In this training procedure, the matrix inversion problem is simply converted into an optimization problem. For well-behaved matrices, the local optimum also corresponds to the exact inverse. For matrices with rank less than the number of columns or rows, the method finds one of the best possible solutions. Using a large learning rate at the start of the training procedure and progressively shrinking it helps compute better solutions to the problem.
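A software sketch of steps 1 to 6 is shown below using the exact least-squares gradient; in the crossbar implementation, the USD approximation would replace this gradient with fixed-magnitude steps in the same directions. The test matrix and hyper-parameters are hypothetical, and the learning rate is shrunk progressively as suggested above.

```python
import numpy as np

rng = np.random.default_rng(3)

N = 4
A = rng.normal(size=(N, N)) + N * np.eye(N)  # well-conditioned test matrix
I = np.eye(N)

W = np.zeros((N, N))          # crossbar state; converges towards inv(A)
alpha = 0.02
for epoch in range(2000):
    for i in range(N):
        yout = A[i] @ W                   # 1. drive the i-th row of A
        e = yout - I[i]                   # 3. error against the i-th unit vector
        W -= alpha * np.outer(A[i], e)    # 4. squared-error gradient step
    alpha *= 0.999                        # progressively shrink the learning rate

max_err = np.abs(W - np.linalg.inv(A)).max()
```

Because a zero-error solution exists for an invertible A, the cyclic row-by-row gradient steps contract the error towards the exact inverse.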

4.9.2 Auto-encoders

Auto-encoders are neural networks that learn an internal representation of the training samples in such a manner that the input can be reconstructed from this internal representation. The number of hidden layers is not fixed; however, at training time, the output layer generally has the same number of nodes as the input. Auto-encoders share some similarities with RBMs and are sometimes used instead of RBMs in deep learning networks because they are simpler to train [81]. It can be shown that a three-layered auto-encoder network with a hidden layer of k linear nodes, trained using the least-squares error minimization objective, learns to project the data onto the first k principal components of the training data [82]. This property is contingent on the absence of non-linearities in the hidden layer. This is good news if the goal is to compute the principal components of some data, because the absence of non-linearities significantly reduces the cost of the hardware. Using non-linear hidden nodes adds the ability to capture multi-modal aspects of the input distribution. The training procedure is simply the back-propagation algorithm, and the gradient back-propagation is further simplified when the hidden layers are linear. For example, in order to perform PCA, two layers of crossbar arrays are required: the first with N rows and K columns, and the second with K rows and N columns. When K < N and the hidden layers are linear, the crossbar array is set up for Singular Value Decomposition (SVD). When K > N, the array learns to capture useful internal representations of the training data that can be used in building deep networks.
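The PCA property of a linear auto-encoder is easy to check numerically. The sketch below trains a two-layer linear network (an N × K encoder followed by a K × N decoder) with least-squares back-propagation on hypothetical low-rank data, then compares its reconstruction error against the optimal rank-K (PCA) reconstruction; all sizes and learning rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data lying close to a 2-D subspace (hypothetical example).
n, N, K = 500, 6, 2
Z = rng.normal(size=(n, K)) * np.array([2.0, 1.0])   # latent factors
M = rng.normal(scale=0.5, size=(K, N))               # mixing matrix
X = Z @ M + 0.05 * rng.normal(size=(n, N))
X -= X.mean(axis=0)                                  # centre the data

# Two crossbar layers: encoder (N x K) and decoder (K x N); hidden layer is linear.
W1 = 0.1 * rng.normal(size=(N, K))
W2 = 0.1 * rng.normal(size=(K, N))
alpha = 0.01
for _ in range(5000):
    H = X @ W1                    # linear hidden layer (encoder output)
    E = H @ W2 - X                # reconstruction error
    gW2 = H.T @ E / n             # back-propagated least-squares gradients
    gW1 = X.T @ (E @ W2.T) / n
    W2 -= alpha * gW2
    W1 -= alpha * gW1
ae_err = np.mean((X @ W1 @ W2 - X) ** 2)

# Optimal rank-K linear reconstruction (PCA, via SVD) for comparison.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
pca_err = np.mean((X - X @ Vt[:K].T @ Vt[:K]) ** 2)
```

Since the trained network applies a rank-K linear map, its error can never beat the PCA optimum; after training it should approach it.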

4.9.3 Restricted Boltzmann Machines

Boltzmann machines are the stochastic counterpart of Hopfield networks. Restricted Boltzmann machines are Boltzmann machines with some connectivity restrictions. A brief overview of these networks is provided in the Appendix. One of the most popular methods to train an RBM is the contrastive divergence procedure [83]. The contrastive divergence algorithm proceeds in two phases. For a two-layer network with N input nodes and K hidden-layer nodes, it proceeds as follows:

1. Sampling phase - The training data X = [x1, x2, ..., xN] is driven into the array and the outputs from the hidden layer are sampled: H = [h1, h2, ..., hK].

2. Free-running phase - H is driven back from the hidden layer towards the input layer and the free-running value X′ = [x′1, x′2, ..., x′N] is computed. Then X′ is used to generate the free-running hidden-layer output H′ = [h′1, h′2, ..., h′K].

Updating the weights by contrastive divergence can be expressed by the following equa- tions:

wij ← wij + α(hjxi − h′jx′i)   (4.9)

where wij is the weight of the connection between the i-th input xi and the j-th hidden term, and hj is the sampled output from the j-th hidden term in the sampling phase. x′i and h′j are the corresponding values in the free-running phase. The best aspect of this training rule is that all the terms except α are already 0 or 1. The training-architecture simplification garnered by such an approximation is essentially the same hardware benefit that the USD approximation provides. The added cost in training a memristive crossbar array using contrastive divergence is the need to sample the crossbar outputs. This requires building blocks such as sampling and noise-generator circuits; several memristor-based and CMOS noise-generator circuits are available in the literature. The other aspect that needs to be studied is the effect of different magnitudes of α on different weights. If we use the 4-phase scheme described earlier in this chapter, different weights of the array would be updated by different magnitudes, albeit in the right direction. The effect of this behaviour needs to be analysed thoroughly. A good metric to demonstrate functionality is to study the reconstruction ability of the network, as done in [84].
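A minimal software model of this procedure is given below for a tiny RBM without bias terms, trained by CD-1 on two hypothetical binary prototypes. In a crossbar implementation, the same outer-product update would be applied through the sampled 0/1 signals; here x, h, and x′ are 0/1 samples, while h′ keeps its probabilities, a common CD-1 choice.

```python
import numpy as np

rng = np.random.default_rng(5)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

N, K = 8, 4                          # visible and hidden units
W = 0.1 * rng.normal(size=(N, K))    # crossbar weights (biases omitted)
alpha = 0.05

# Two hypothetical binary prototypes to memorize.
patterns = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                     [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)

for _ in range(2000):
    x = patterns[rng.integers(2)]
    # Sampling phase: sample the hidden units given the data.
    h = (sigmoid(x @ W) > rng.random(K)).astype(float)
    # Free-running phase: reconstruct the visible layer, then the hidden layer.
    x1 = (sigmoid(W @ h) > rng.random(N)).astype(float)
    h1 = sigmoid(x1 @ W)
    # Contrastive-divergence update (Equation 4.9).
    W += alpha * (np.outer(x, h) - np.outer(x1, h1))

# Reconstruction check, as suggested in [84]: average error over prototypes.
recon_err = 0.0
for x in patterns:
    h = (sigmoid(x @ W) > 0.5).astype(float)
    recon_err += np.mean((sigmoid(W @ h) - x) ** 2) / len(patterns)
```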

4.10 Concluding remarks

In this chapter, we presented the USD approximation, an approximation to the standard gradient-descent algorithm. It can be applied to several gradient-descent-based algorithms, such as linear/polynomial classification, soft-max regression, the back-propagation algorithm for MLPs, contrastive divergence for RBMs, matrix inversion, Principal Component Analysis (PCA), and auto-encoders. Training a crossbar array using the standard gradient-descent technique requires fine control over the internal states of all the memristive devices. This normally requires complex feedback schemes and sequential training of the individual devices, which is unattractive for larger arrays. It also increases the training time for large datasets with high-dimensional input features. The USD approximation solves these design problems, making it feasible to use memristive crossbar arrays for applications where power and speed are important considerations. The USD gradient-descent rule compensates for errors arising from inaccurate weight changes, non-linearities, variability, etc. These benefits accrue from the self-correcting nature of the gradient-descent rule. Additionally, using the USD rule on a crossbar array permits the training architecture to use a simple 4-phase scheme to train all the devices in the array simultaneously. Such schemes are hardware-efficient and reduce the complexity of generating the array-training signals. They also simplify fabrication of crossbar-array-based machine learning systems because no CMOS devices need to be embedded within the crossbar array. In order to understand the behaviour of the USD rule, we used it to train a crossbar array as a classifier using the logistic regression and soft-max algorithms. These algorithms were chosen because they share several algorithmic and structural similarities with more complex neural networks.
Our experiments used artificially constructed datasets and the MNIST database. The results are promising: it might even be argued that training memristive crossbar arrays using techniques based on the USD gradient-descent approximation could be more attractive than STDP-, LUT-, feedback-, or PWP-based mechanisms. Even with a simple linear classifier, the performance obtained using the proposed USD approximation is comparable to that of spiking neural networks [53][17]. Another interesting observation is that these algorithms use fewer devices than spiking neural network implementations [17]. The USD approximation also enables training architectures that are faster and simpler than LUT- or feedback-based schemes. Finally, our experiments provide an understanding of how crossbar array device and training parameters such as weight initialization, α, Gratio, and σ affect the learning capability of the crossbar array. These insights can also be useful when using memristive crossbar arrays for non-learning applications such as memories. The experimental results presented in this chapter indicate that using the USD approximation to implement machine learning algorithms in memristive crossbar arrays may result in practical, low-power, and fast designs. This is encouraging and motivates further analysis to test the performance of the USD training rule in larger and deeper crossbar array networks based on auto-encoders, RBMs, CNNs, etc.

Chapter 5

Conclusion

The purpose of this work was to explore techniques to improve computational efficiency for statistical inference and machine learning algorithms. This is of interest because these algorithms are increasingly being deployed in consumer, industrial, and research applications. With the increasing proliferation of smart devices, in the form of portable gadgets and sensor networks, the need to compute more for less power will only grow. Traditional von Neumann architectures have a memory-access bottleneck that puts an artificial limit on their computational capability. Eliminating this bottleneck can lead to great performance gains, as has been demonstrated by some of the neuromorphic and other systems built by various research groups. We studied some of the most successful systems designed for neuromorphic simulations and neural networks, such as pulse-stream computation based networks, TrueNorth, SpiNNaker, and Neurogrid. Several aspects of these systems were analysed, including design challenges, architectural innovations, strengths, and weaknesses. It was noted that while all of them address similar problems, such as synapse design (lack of suitable analogue memory), communication bottlenecks (when multi-bit signals are transmitted during gradient descent), and noise margins in analogue circuitry, they do so with radically different approaches. It was also noted that the bulk of the designs are based on spiking neural networks. There are relatively fewer systems dedicated to neural networks such as RBMs and CNNs. The reasons for this were identified as the increased complexity of the training algorithm in comparison to STDP-based learning and the resultant complexity of the hardware infrastructure. We then noted that some of the challenges in building large neural networks may be surmounted by using memristive crossbar arrays as synapses. Memristive crossbar arrays are highly dense, provide analogue memory storage, and obey Ohm's law for small potential differences. Memristive behaviour also resembles synaptic behaviour. Used intelligently, these devices can yield lower power consumption, better synaptic modelling, and fewer data-transfer operations. These properties have resulted in a proliferation of STDP-based learning systems that use memristors. However, there are relatively fewer designs that use memristive crossbar arrays for neural networks such as RBMs and CNNs. Considering that such neural networks are used in a wide variety of practical applications and are known to work well, we identified this topic as a useful area of research. The most popular training method for neural networks is the back-propagation algorithm, which uses the gradient-descent method to iteratively adjust the network parameters. This approach works well in software because of the high precision offered by floating-point computation. However, it is intractable when building hardware neural networks because of the lack of precision offered by analogue memories (other than memristors) as synapses, while digital synapses run into performance bottlenecks because of memory transfers. Using memristors as analogue weights and the crossbar to perform fast multiplication is potentially an excellent method to address these issues. The problem, however, is that memristors exhibit extreme variability and precise device models are lacking. One approach to solving this problem is to design algorithms that can correct for device and model imprecisions. We propose the Unregulated Step Descent (USD) algorithm to do this. As discussed in this thesis, the USD algorithm is the result of making approximations to the steepest-gradient-descent algorithm that make circuit design easier.
We studied the effect of these approximations on the learning capabilities of neural networks implemented on memristive crossbar arrays. Our analysis included the effect of device variability and other training parameters relevant to circuit design. The tests were conducted using several artificial and real-world test cases. An extensive study of the USD algorithm was done for simple linear classifiers and regression models, and some experiments were also conducted with neural networks such as RBMs and MLPs. This provided several useful insights on how memristive crossbar arrays may be used for machine learning tasks. However, the analysis was done for relatively small networks (hundreds of neurons). It will be worthwhile to test whether the learnings from these experiments scale to larger and deeper networks (thousands or millions of neurons). These simulations are expensive and will require the development of better simulation techniques. Currently, the simulations are run on custom-designed simulators that use Nvidia GPUs and Python scripts. Additionally, experiments on real memristors should be conducted in collaboration with fabrication labs and other groups to validate the simulation results. Finally, methods to implement other machine learning algorithms, such as Bayesian inference and random forests, need to be explored.

Bibliography

[1] Ifrah, Georges. ”The Universal History of Numbers: From Prehistory to the Invention of the Computer, translated by David Vellos, EF Harding, Sophie Wood and Ian Monk.” (2000).

[2] Turing, Alan M. ”Computing machinery and intelligence.” Mind (1950): 433-460.

[3] D B Strukov, et al. ”The missing memristor found.” Nature 453.7191 (2008)

[4] Hebb, Donald Olding. The organization of behavior: A neuropsychological theory. Psychology Press, 2005.

[5] Widrow, Bernard, and Michael A. Lehr. ”30 years of adaptive neural networks: perceptron, madaline, and backpropagation.” Proceedings of the IEEE 78.9 (1990): 1415-1442.

[6] Hopfield, John J. ”Neurons with graded response have collective computational properties like those of two-state neurons.” Proceedings of the National Academy of Sciences 81.10 (1984): 3088-3092.

[7] Misra, Janardan, and Indranil Saha. ”Artificial neural networks in hardware: A survey of two decades of progress.” Neurocomputing 74.1 (2010): 239-255.

[8] Hamilton, Alister, et al. ”Integrated pulse stream neural networks: results, issues, and pointers.” Neural Networks, IEEE Transactions on 3.3 (1992): 385-393.

[9] Cassidy, Andrew S., et al. ”Cognitive Computing Building Block: A Versatile and Efficient Digital Neuron Model for Neurosynaptic Cores”, IJCNN 2013. IEEE.

[10] Furber, Steve B., et al. ”Overview of the SpiNNaker system architecture.” Computers, IEEE Transactions on 62.12 (2013): 2454-2467


[11] Benjamin, Ben Varkey, et al. ”Neurogrid: A mixed-analog-digital multichip system for large-scale neural simulations.” Proceedings of the IEEE 102.5 (2014): 699-716.

[12] Liu, Shih-Chii, ed. Analog VLSI: circuits and principles. MIT press, 2002.

[13] Murray, Alan F., and Anthony VW Smith. ”Asynchronous VLSI neural networks using pulse-stream arithmetic.” Solid-State Circuits, IEEE Journal of 23.3 (1988): 688-697.

[14] Merolla, Paul A., et al. ”A million spiking-neuron integrated circuit with a scalable communication network and interface.” Science 345.6197 (2014): 668-673.

[15] Merolla, Paul, et al. ”A digital neurosynaptic core using embedded crossbar memory with 45pJ per spike in 45nm.” Custom Integrated Circuits Conference (CICC), 2011 IEEE. IEEE, 2011.

[16] Imam, Nabil, and Rajit Manohar. ”Address-event communication using token-ring mutual exclusion.” Asynchronous Circuits and Systems (ASYNC), 2011 17th IEEE International Symposium on. IEEE, 2011.

[17] Diehl, Peter U., and Matthew Cook. ”Unsupervised Learning of Digit Recognition Using Spike-Timing-Dependent Plasticity.” Frontiers in computational neuroscience 9 (2015).

[18] Beyeler, Michael, Nikil D. Dutt, and Jeffrey L. Krichmar. ”Categorization and decision-making in a neurobiologically plausible spiking network using a STDP-like learning rule.” Neural Networks 48 (2013): 109-124.

[19] Furber, Steve, and Steve Temple. ”Neural systems engineering.” Journal of the Royal Society interface 4.13 (2007): 193-206.

[20] Hashmi, Atif, et al. ”A case for neuromorphic ISAs.” ACM SIGPLAN Notices 47.4 (2012): 145-158.

[21] Davison, Andrew P., et al. ”PyNN: a common interface for neuronal network simulators.” Frontiers in neuroinformatics 2 (2008).

[22] Schemmel, Johannes, Johannes Fieres, and Karlheinz Meier. ”Wafer-scale integration of analog neural networks.” Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on. IEEE, 2008.

[23] Zhang, Jialu, et al. ”A biomimetic fabricated carbon nanotube synapse for prosthetic applications.” 2011 IEEE/NIH Life Science Systems and Applications Workshop (LiSSA). 2011.

[24] Rachmuth, Guy, et al. ”A biophysically-based neuromorphic model of spike rate- and timing-dependent plasticity.” Proceedings of the National Academy of Sciences 108.49 (2011): E1266-E1274.

[25] Sharad, Mrigank, et al. ”Proposal for neuromorphic hardware using spin devices.” arXiv preprint arXiv:1206.3227 (2012).

[26] Qiao, Ning, et al. ”A reconfigurable on-line learning spiking neuromorphic processor comprising 256 neurons and 128K synapses.” Frontiers in neuroscience 9 (2015).

[27] Maass, W. (1995b). On the computational complexity of networks of spiking neurons. In Advances in neural information processing systems, Vol. 7 (183-190). Cambridge: MIT Press.

[28] Chua, Leon O. ”Memristor-the missing circuit element.” Circuit Theory, IEEE Transactions on 18.5 (1971): 507-519.

[29] Chua, Leon O., and Sung Mo Kang. ”Memristive devices and systems.” Proceedings of the IEEE 64.2 (1976): 209-223.

[30] Lacaita, A. L. ”Phase change memories: State-of-the-art, challenges and perspectives.” Solid-State Electronics 50.1 (2006): 24-31.

[31] Chua, Leon, Valery Sbitnev, and Hyongsuk Kim. ”Hodgkin-Huxley axon is made of memristors.” International Journal of Bifurcation and Chaos 22.03 (2012).

[32] Yi, Wei, et al. ”Feedback write scheme for memristive switching devices.” Applied Physics A 102.4 (2011): 973-982.

[33] Manem, Harika, and Garrett S. Rose. ”A read-monitored write circuit for 1T1M multi-level memristor memories.” Circuits and Systems (ISCAS), 2011 IEEE International Symposium on. IEEE, 2011.

[34] R. Berdan, T. Prodromakis, F. P. Diaz, E. Vasilaki, A. Khiat, I. Salaoru and C. Toumazou, Temporal Processing with Volatile Memristors, IEEE International Symposium on Circuits and Systems, May 2013.

[35] Guan, Ximeng, Shimeng Yu, and H-S. P. Wong. ”On the switching parameter variation of metal-oxide RRAM, Part I: Physical modeling and simulation methodology.” Electron Devices, IEEE Transactions on 59.4 (2012): 1172-1182.

[36] Lehtonen, Eero. ”Memristive Computing.” University of Turku (2013): 1-157.

[37] Zamarreño-Ramos, Carlos, et al. ”On spike-timing-dependent-plasticity, memristive devices, and building a self-learning visual cortex.” Frontiers in neuroscience 5 (2011).

[38] Joglekar, Yogesh N., and Stephen J. Wolf. ”The elusive memristor: properties of basic electrical circuits.” European Journal of Physics 30.4 (2009): 661.

[39] Biolek, Zdenek, Dalibor Biolek, and Viera Biolkova. ”SPICE model of memristor with nonlinear dopant drift.” Radioengineering 18.2 (2009): 210-214.

[40] Prodromakis, Themistoklis, et al. ”A versatile memristor model with nonlinear dopant kinetics.” Electron Devices, IEEE Transactions on 58.9 (2011): 3099-3105.

[41] Corinto, Fernando, and Alon Ascoli. ”A boundary condition-based approach to the modeling of memristor nanostructures.” Circuits and Systems I: Regular Papers, IEEE Transactions on 59.11 (2012): 2713-2726.

[42] Kvatinsky, Shahar, et al. ”TEAM: ThrEshold adaptive memristor model.” Circuits and Systems I: Regular Papers, IEEE Transactions on 60.1 (2013): 211-221.

[43] Heath, James R., et al. ”A defect-tolerant computer architecture: Opportunities for nanotechnology.” Science 280.5370 (1998): 1716-1721.

[44] Strukov, Dmitri B., and Konstantin K. Likharev. ”CMOL FPGA: a reconfigurable architecture for hybrid digital circuits with two-terminal nanodevices.” Nanotechnology 16.6 (2005): 888.

[45] Lee, Hyung Dong, et al. ”Integration of 4F2 selector-less crossbar array 2Mb ReRAM based on transition metal oxides for high density memory applications.” VLSI Technology (VLSIT), 2012 Symposium on. IEEE, 2012.

[46] J. Joshua Yang, Dmitri B. Strukov, and Duncan R. Stewart. ”Memristive devices for computing.” Nature nanotechnology 8.1 (2013): 13-24.

[47] E Lehtonen, J. H. Poikonen, and Mika Laiho. ”Two memristors suffice to compute all Boolean functions.” Electronics letters 46.3 (2010): 230.

[48] Kim, Hyongsuk, et al. ”Memristor-based multilevel memory.” Cellular Nanoscale Networks and Their Applications (CNNA), 2010 12th International Workshop on. IEEE, 2010.

[49] Laiho, Mika, and Eero Lehtonen. ”Arithmetic operations within memristor-based analog memory.” Cellular Nanoscale Networks and Their Applications (CNNA), 2010 12th International Workshop on. IEEE, 2010.

[50] Yakopcic, Chris, and Tarek M. Taha. ”Energy efficient perceptron pattern recognition using segmented memristor crossbar arrays.” Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 2013.

[51] Serrano-Gotarredona, Teresa, et al. ”STDP and STDP variations with memristors for spiking neuromorphic learning systems.” Frontiers in neuroscience 7 (2013).

[52] C. Zamarreno-Ramos et al. ”On spike-timing-dependent-plasticity, memristive devices, and building a self-learning visual cortex”, Front. Neurosci., vol. 5, no. 26, pp. 1-36, 2011.

[53] Querlioz, Damien, Olivier Bichler, and Christian Gamrat. ”Simulation of a memristor-based spiking neural network immune to device variations.” Neural Networks (IJCNN), The 2011 International Joint Conference on. IEEE, 2011.

[54] Snider, Greg S. ”Spike-timing-dependent learning in memristive nanodevices.” Nanoscale Architectures, 2008. NANOARCH 2008. IEEE International Symposium on. IEEE, 2008.

[55] Afifi, A., A. Ayatollahi, and F. Raissi. ”Implementation of biologically plausible spiking neural network models on the memristor crossbar-based CMOS/nano circuits.” Circuit Theory and Design, 2009. ECCTD 2009. European Conference on. IEEE, 2009.

[56] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner: Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998.

[57] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. ”Imagenet classification with deep convolutional neural networks.” Advances in neural information processing systems. 2012.

[58] Hasan, Raqibul, and Tarek M. Taha. ”Enabling back propagation training of memristor crossbar neuromorphic processors.” Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE, 2014.

[59] Alibart, Fabien, Elham Zamanidoost, and Dmitri B. Strukov. ”Pattern classification by memristive crossbar circuits using ex situ and in situ training.” Nature communications 4 (2013).

[60] Soudry, Daniel, et al. ”Memristor-based multilayer neural networks with online gra- dient descent training.” (2015).

[61] Kataeva, Irina, et al. Efficient Training Algorithms for Neural Networks Based on Memristive Crossbar Circuits, IJCNN (2015)

[62] Prezioso, Mirko, et al. ”Training and operation of an integrated neuromorphic network based on metal-oxide memristors.” Nature 521.7550 (2015): 61-64.

[63] Hu, Miao, et al. ”Memristor crossbar-based neuromorphic computing system: A case study.” Neural Networks and Learning Systems, IEEE Transactions on 25.10 (2014): 1864-1878.

[64] Rothganger, Fred, et al. ”Training neural hardware with noisy components.” Neural Networks (IJCNN), 2015 International Joint Conference on. IEEE, 2015.

[65] Muthuswamy, Bharathwaj. ”Implementing memristor based chaotic circuits.” International Journal of Bifurcation and Chaos 20.05 (2010): 1335-1350.

[66] Itoh, Makoto, and Leon O. Chua. ”Memristor oscillators.” International Journal of Bifurcation and Chaos 18.11 (2008): 3183-3206.

[67] Pershin, Yuriy V., and Massimiliano Di Ventra. ”Practical approach to programmable analog circuits with memristors.” Circuits and Systems I: Regular Papers, IEEE Transactions on 57.8 (2010): 1857-1864.

[68] Nair, Manu V., and Dudek, Piotr. ”Gradient-descent based learning in memristive crossbar arrays.” Neural Networks (IJCNN), 2015 International Joint Conference on. IEEE, 2015.

[69] Medeiros-Ribeiro G et al. Lognormal switching times for titanium dioxide bipolar memristors: origin and resolution, Nanotechnology. 2011 Mar 4

[70] G. Snider, ”Self-organized computation with unreliable, memristive nanodevices,” Nanotechnology 18, 10 August 2007

[71] Linares-Barranco, Bernabé, and Teresa Serrano-Gotarredona. ”Memristance can explain spike-time-dependent-plasticity in neural synapses.” Nature precedings (2009): 1-4.

[72] Rojas, Raúl. Neural networks: a systematic introduction. Springer Science & Business Media, 1996.

[73] Bottou, Léon. ”Stochastic gradient descent tricks.” Neural Networks: Tricks of the Trade. Springer Berlin Heidelberg, 2012. 421-436.

[74] Leen, Todd K. and Moody, John E., Stochastic Manhattan learning: Time-evolution operator for the ensemble dynamics, Physical Review E, Vol 56, Num 1, July 1997

[75] Cauwenberghs, Gert. ”A fast stochastic error-descent algorithm for supervised learning and optimization.” Advances in neural information processing systems (1993): 244-244.

[76] Alspector, Joshua, et al. ”A parallel gradient descent method for learning in analog VLSI neural networks.” Advances in neural information processing systems. 1993.

[77] A Ascoli et al. ”Memristor model comparison.” Circuits and Systems Magazine, IEEE 13.2 (2013): 89-105

[78] Abu Sebastian, Manuel Le Gallo, and Daniel Krebs. ”Crystal growth within a phase change memory cell.” Nature communications 5 (2014).

[79] MNIST handwritten digits database. http://yann.lecun.com/exdb/mnist/

[80] So, Y. ”A tutorial on logistic regression.” SAS White Papers (1995).

[81] Bengio, Yoshua. ”Learning deep architectures for AI.” Foundations and trends in Machine Learning 2.1 (2009): 1-127.

[82] H. Bourlard and Y. Kamp, Auto-association by multilayer perceptrons and singular value decomposition, Biological Cybernetics, vol. 59, pp. 291-294, 1988.

[83] Hinton, Geoffrey. ”Training products of experts by minimizing contrastive divergence.” Neural computation 14.8 (2002): 1771-1800.

[84] Fischer, Asja, and Christian Igel. ”Training restricted Boltzmann machines: An introduction.” Pattern Recognition 47.1 (2014): 25-39.

[85] Rosenblatt, Frank. ”The perceptron: a probabilistic model for information storage and organization in the brain.” Psychological review 65.6 (1958): 386.

[86] McCulloch, Warren S., and Walter Pitts. ”A logical calculus of the ideas immanent in nervous activity.” The bulletin of mathematical biophysics 5.4 (1943): 115-133.

[87] Minsky, M., and S. Papert. ”Perceptrons (expanded edition 1988).” (1968).

[88] Hopfield, John J. ”Neural networks and physical systems with emergent collective computational abilities.” Proceedings of the National Academy of Sciences 79.8 (1982): 2554-2558.

[89] Wen, Ue-Pyng, Kuen-Ming Lan, and Hsu-Shih Shih. ”A review of Hopfield neural networks for solving mathematical programming problems.” European Journal of Operational Research 198.3 (2009): 675-687.

[90] Hinton, Geoffrey E., Terrence J. Sejnowski, and David H. Ackley. Boltzmann machines: Constraint satisfaction networks that learn. Pittsburgh, PA: Carnegie-Mellon University, Department of Computer Science, 1984.

[91] Kirkpatrick, Scott. ”Optimization by simulated annealing: Quantitative studies.” Journal of statistical physics 34.5-6 (1984): 975-986.

[92] LeCun, Yann, et al. ”A tutorial on energy-based learning.” Predicting structured data 1 (2006): 0.

[93] Kohonen, Teuvo. ”Self-organized formation of topologically correct feature maps.” Biological cybernetics 43.1 (1982): 59-69.

[94] Maass, Wolfgang. ”Analog computations on networks of spiking neurons.” Proc. of the 7th Italian Workshop on Neural Nets, World Scientific Press. 1995.

[95] Maass, Wolfgang. ”Lower bounds for the computational power of networks of spiking neurons.” Neural computation 8.1 (1996): 1-40.

[96] Maass, Wolfgang. ”Networks of spiking neurons: the third generation of neural network models.” Neural networks 10.9 (1997): 1659-1671.

[97] Song, G., V. Chaudhry, and C. Batur. ”Precision tracking control of shape memory alloy actuators using neural networks and a sliding-mode based robust controller.” Smart materials and structures 12.2 (2003): 223.

[98] Rao, Rajesh PN, and Terrence J. Sejnowski. ”Spike-timing-dependent Hebbian plasticity as temporal difference learning.” Neural computation 13.10 (2001): 2221-2237.

[99] Ebong, Idongesit E., and Pinaki Mazumder. ”CMOS and memristor-based neural network design for position detection.” Proceedings of the IEEE 100.6 (2012): 2050-2060.

[100] Jo, Sung Hyun, Kuk-Hwan Kim, and Wei Lu. ”Programmable resistance switching in nanoscale two-terminal devices.” Nano Letters 9.1 (2008): 496-500.

[101] SH Jo et al. ”Nanoscale memristor device as synapse in neuromorphic systems.” Nano letters 10.4 (2010): 1297-1301.

[102] Alibart, Fabien, Elham Zamanidoost, and Dmitri B. Strukov. ”Pattern classification by memristive crossbar circuits using ex situ and in situ training.” Nature communications 4 (2013).

[103] Freund, Yoav, Robert Schapire, and N. Abe. ”A short introduction to boosting.” Journal-Japanese Society For Artificial Intelligence 14.771-780 (1999): 1612.

[104] Merkel, Cory E., et al. ”Reconfigurable n-level memristor memory design.” Neural Networks (IJCNN), The 2011 International Joint Conference on. IEEE, 2011.

Appendix A

Neural Networks and related algorithms

A.1 Introduction

This appendix summarizes the key algorithms referred to in this thesis.

A.2 Rosenblatt’s perceptron

Rosenblatt's seminal work on the perceptron [85] set the trend for neural network research well into the 90s and remains influential to date. His model was inspired by Hebb's physiological theory of learning [4] and other quantitative theories of the time. In his own words, his model went beyond the existing theories of learning on three counts: parsimony, verifiability, and generalization. Rosenblatt's perceptron is essentially an algorithm for training a McCulloch-Pitts neuron [86] as a linear classifier. The training rule is inspired by the principle of Hebbian learning, often paraphrased as 'neurons that fire together, wire together'. Mathematically, the perceptron is modelled as:

y = g(Σ_{i=1}^{N} w_i x_i) = g(w · x)    (A.1)


Figure A.1: The perceptron

where,

g(θ) = activation function = 1 if θ ≥ 0; 0 if θ < 0

y = output of the perceptron

w = [w_0, w_1, ..., w_N]^T = weight vector

x = [x_0, x_1, ..., x_N]^T = input vector

Note that the inputs and output of Rosenblatt's perceptron were assumed to be two-state, providing an all-or-nothing response. It is also assumed that x_0 = 1. The learning rule for the perceptron, based on the principle of Hebbian learning, is formulated as:

w_i := w_i + η(o − y)x_i   ∀ i ∈ {0, 1, ..., N}    (A.2)

where,

η = learning rate

o = desired output

y = observed output
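Equations A.1 and A.2 translate directly into code. The following is a minimal NumPy sketch; the function name, learning rate, and the AND task used for illustration are assumptions of this example, not taken from the thesis:

```python
import numpy as np

def perceptron_train(X, targets, eta=0.1, epochs=20):
    """Train a Rosenblatt perceptron using the update rule of Eq. A.2."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend bias input x0 = 1
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, o in zip(X, targets):
            y = 1 if np.dot(w, x) >= 0 else 0      # g(w . x), Eq. A.1
            w += eta * (o - y) * x                 # Hebbian-style update, Eq. A.2
    return w

# Learn the linearly separable AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = np.array([0, 0, 0, 1])
w = perceptron_train(X, targets)
pred = [1 if np.dot(w, np.r_[1, x]) >= 0 else 0 for x in X]
```

Since AND is linearly separable, the perceptron convergence theorem guarantees that the loop settles on a separating weight vector within these iteration counts.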

A good discussion of the shortcomings of the perceptron is available in [87].

A.3 Hopfield network

The Hopfield network [88], a landmark in the field of neural networks, is an associative memory model. It is a recurrent network consisting of fully-connected McCulloch-Pitts type neurons in which the connection matrix, T, is constrained to have only zero diagonal entries. Sometimes the condition that T be symmetric is also enforced. While this condition is not essential for learning, it makes analysis simpler. For example, in order to store

Figure A.2: A Hopfield network

the patterns V^s ∀ s ∈ {1, ..., n}, the following connection matrix can be used:

T_ij = Σ_s (2V_i^s − 1)(2V_j^s − 1)    (A.3)

T_ii = 0    (A.4)

Hopfield also discusses a continuous version of the model in [6]. The defining aspect of both models is the association of an energy metric with the network states. When permitted to evolve over time, the network moves along the energy landscape (over the network states) and eventually settles to the state corresponding to an energy minimum. Using principles based on Hebbian learning, it is possible to sculpt the energy landscape such that the patterns to be memorized are associated with the energy minima. The training mechanism is as follows. Consider a Hopfield network with k nodes. The patterns to be learnt, V^s = [V_1^s, ..., V_k^s] ∀ s ∈ {1, ..., n}, are also k-bit vectors. During the training process, the goal is to tune the weights of the connection matrix such that

T_ij = (1/n) Σ_{s=1}^{n} V_i^s V_j^s    (A.5)

A common technique for doing this is to drive the patterns sequentially into the Hopfield network. For each input pattern, the weights are updated according to the learning rule:

T_ij := T_ij + η V_i^s V_j^s    (A.6)

Over the course of several iterations, the weights of the connection matrix approach the desired values. The advantage of the Hopfield network is that once training is complete, the stored patterns are associated with the minima of the energy function. This means that when a trained Hopfield network is stimulated by a new pattern, it settles to the state corresponding to the memorized pattern that is closest to the input in its energy landscape. Several implementation questions arise when building a Hopfield network, in particular the existence of cycles, the sequencing of node updates, convergence criteria, etc. These have been well studied and documented by Hopfield and others [89].
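As a concrete illustration, the storage rule of Eq. A.5 and asynchronous recall can be sketched in a few lines of NumPy. This sketch assumes ±1-coded patterns (the 2V − 1 coding of Eq. A.3); the helper names and the 8-bit example patterns are illustrative:

```python
import numpy as np

def train_hopfield(patterns):
    """Build the connection matrix of Eq. A.5 from rows of +/-1 patterns."""
    T = patterns.T @ patterns / patterns.shape[0]
    np.fill_diagonal(T, 0)                  # enforce T_ii = 0 (Eq. A.4)
    return T

def recall(T, state, n_sweeps=10):
    """Asynchronous node updates; the state descends the energy landscape."""
    state = state.copy()
    for _ in range(n_sweeps):
        for i in range(len(state)):
            state[i] = 1 if T[i] @ state >= 0 else -1
    return state

# Store two orthogonal 8-bit patterns, then recover one from a corrupted probe
patterns = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
                     [1, -1, 1, -1, 1, -1, 1, -1]])
T = train_hopfield(patterns)
probe = np.array([1, 1, 1, 1, -1, -1, -1, 1])      # last bit flipped
out = recall(T, probe)
```

The probe differs from the first stored pattern in a single bit; recall settles back to the stored pattern, illustrating the associative-memory behaviour described above.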

A.4 Boltzmann Machines

Boltzmann machines [90], invented by Geoffrey Hinton and Terry Sejnowski, are the stochastic counterparts to Hopfield networks. Hopfield networks exploit the presence of local minima in the energy landscape to learn various patterns; they count on being able to settle to local optima. Training a Boltzmann machine, however, is a global optimization problem. The goal is to build a network that best captures the probability distribution of the training patterns. The training algorithm searches for the set of weights that minimizes the difference between the statistical properties of the training patterns and the patterns generated by the network. To do this, a variation of the simulated annealing algorithm [91] is used. Simulated annealing is an optimization technique inspired by a problem in condensed matter physics: finding the ground states of matter. Like the Hopfield network, Boltzmann machines assign a scalar energy value to each state of the system.

E = −Σ_{i<j} T_ij V_i^s V_j^s    (A.7)

Switching the value of the k-th component of V^s results in an energy change given by:

∆E_k = Σ_i T_ki V_i^s    (A.8)

Any change that reduces the energy is accepted. If the change increases the energy of the system, it is accepted with a probability p_k determined by the Boltzmann function:

p_k = 1 / (1 + e^{−∆E_k / T})    (A.9)

where T, the temperature, is simply a scaling factor applied to the energy term. It is kept high initially and gradually lowered to simulate the annealing process used in metallurgy. Using this distribution results in the property that the ratio between the probabilities of attaining two states, say A and B, is given by:

p_A / p_B = e^{−(E_A − E_B)/T}    (A.10)

This implies that the log-likelihood ratio between two states is directly proportional to their energy difference.
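The acceptance rule of Eqs. A.8 and A.9 can be sketched as one sweep of stochastic unit updates over binary (0/1) units. The function below is a hypothetical helper, not code from the thesis; in practice it would be called repeatedly while the temperature is lowered according to an annealing schedule:

```python
import numpy as np

rng = np.random.default_rng(0)

def anneal_step(T_mat, state, temperature):
    """One sweep of stochastic updates for binary (0/1) units.

    Each unit k is set to the on state with probability
    p_k = 1 / (1 + exp(-delta_E_k / temperature))   (Eq. A.9),
    where delta_E_k = sum_i T_ki * V_i              (Eq. A.8).
    """
    for k in range(len(state)):
        delta_e = T_mat[k] @ state
        p_on = 1.0 / (1.0 + np.exp(-delta_e / temperature))
        state[k] = 1 if rng.random() < p_on else 0
    return state

# Two strongly coupled units at a very low temperature stay on together
T_mat = np.array([[0.0, 5.0], [5.0, 0.0]])
state = anneal_step(T_mat, np.array([1, 1]), temperature=0.001)
```

At high temperature the same rule accepts many energy-increasing flips, which is what lets the annealing process escape poor local optima before the temperature is lowered.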

Figure A.3: Boltzmann machines and Restricted Boltzmann machines

A common feature of Boltzmann machines is the presence of several hidden units that are not driven by the training patterns. A popular variant, the Restricted Boltzmann Machine (RBM), is a network in which connections are not permitted between units in the same layer, as shown in Figure A.3. The hidden units give the network the flexibility to adapt to and model various distributions. Several energy-based learning algorithms exist [92]. The optimization technique used in [90] was minimization of the Kullback-Leibler divergence, G, which provides a measure of how well the network models the target distribution:

G = Σ_s p(V^s) ln [ p(V^s) / p'(V^s) ]    (A.11)

where

p(V^s) = probability of generation of V^s by the target distribution

p'(V^s) = probability of generation of V^s by the current state of the Boltzmann machine

The goal of the optimization problem is to minimise G. This is equivalent to maximization of the log-likelihood function [90]. Consider the equation:

G = Σ_s p(V^s) ln [ p(V^s) / p'(V^s) ] = Σ_s p(V^s) ln(p(V^s)) − Σ_s p(V^s) ln(p'(V^s))    (A.12)

The first term in this equation is determined by the clamped probability distribution over the visible units and is therefore independent of T_ij. The only term that depends on T_ij is Σ_s p(V^s) ln(p'(V^s)), the log-likelihood function. This implies that the problem of minimizing G can also be treated as a log-likelihood maximization problem. It can be shown that [90]:

δG/δT_ij = −(1/T)(p_ij − p'_ij)    (A.13)

Where,

p_ij = probability of units i and j both being in the on state when the visible units are clamped

p'_ij = probability of units i and j both being in the on state when the visible units are unclamped

Computing the summation terms in these equations is computationally intensive and becomes infeasible as the dimensionality increases. Therefore, in almost all applications, p'_ij is estimated by sampling from the Boltzmann machine using a Markov chain Monte Carlo algorithm called Gibbs sampling. Even running Gibbs sampling to convergence can be expensive, so a faster approximation, called contrastive divergence [83], was developed; it has been shown to achieve similar performance.
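A single contrastive-divergence (CD-1) update can be sketched as follows. This is a simplification of the algorithm in [83]: the RBM here has no bias terms, and the function and variable names are illustrative assumptions of this example:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v0, eta=0.1):
    """One contrastive-divergence (CD-1) step for a bias-free RBM.

    The positive phase estimates the clamped statistics p_ij; one Gibbs
    step provides a cheap stand-in for the unclamped statistics p'_ij
    (cf. the gradient in Eq. A.13).
    """
    h0_prob = sigmoid(v0 @ W)                         # P(h = 1 | v0)
    h0 = (rng.random(h0_prob.shape) < h0_prob) * 1.0  # sample hidden units
    v1_prob = sigmoid(h0 @ W.T)                       # one Gibbs step back
    v1 = (rng.random(v1_prob.shape) < v1_prob) * 1.0  # sample reconstruction
    h1_prob = sigmoid(v1 @ W)
    W += eta * (np.outer(v0, h0_prob) - np.outer(v1, h1_prob))
    return W

W = rng.normal(0.0, 0.1, (4, 3))     # 4 visible units, 3 hidden units
v = np.array([1.0, 0.0, 1.0, 0.0])
W = cd1_update(W, v)
```

Replacing the converged Gibbs chain with a single reconstruction step is exactly the approximation that makes CD-1 so much cheaper than the sampling procedure described above.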

A.5 The self-organizing map (SOM)

The self-organizing map [93] is a competitive learning algorithm that mimics the brain in that the internal representation of information generated by the algorithm has a spatial, or topographical, organization. Kohonen describes two essential effects that result in spatially organized maps [93]:

1. Spatial concentration of the neuronal response to a stimulus

2. Sensitization of the best-matching cell and its topological neighbours to the stimulus

Figure A.4: A self-organizing map

The map consists of a two-dimensional arrangement of nodes (N × M neurons). If the input vector to be trained is x = [x_1, x_2, ..., x_K]^T, then each node (i, j) is associated with a K-dimensional weight vector, w_ij = [w_ij1, w_ij2, ..., w_ijK]^T ∀ (i, j) ∈ {1, 2, ..., N} × {1, 2, ..., M}, that is randomly initialized before training. For each training sample, a measure of the similarity between the weights at each node and the input sample is computed. This can be done using different measures, such as the inner product, Euclidean distance, etc. The neuron with the weight vector closest to the input training sample is selected as the winner. The neurons connected to the winning neuron, c, that lie within a pre-determined distance D are considered to belong to the neighbourhood set N_c:

i ∈ N_c   if ‖r_i − r_c‖ < D    (A.14)

During the learning phase, the winning neuron and its topological neighbours are updated according to the following rule:

w_i(t+1) = w_i(t) + α(t)[x(t) − w_i(t)]   if i ∈ N_c
w_i(t+1) = w_i(t)                         if i ∉ N_c    (A.15)

An alternative method is to introduce a scalar kernel function h_ci and update all the weights according to the equation:

w_i(t+1) = w_i(t) + h_ci(t)[x(t) − w_i(t)]    (A.16)

One of the most popular kernels is the Gaussian function:

h_ci = h_0 · exp(−‖r_i − r_c‖² / σ²)    (A.17)

where r_i is the vector pointing to the i-th neuron in the map and r_c is the vector pointing to the winning neuron c. When training is complete, the statistical properties of the neural weights resemble those of the training dataset. A topographical ordering will also be observed, as shown in Figure A.4, where the values of the weights are represented as colours. Weight vectors that are close to each other tend to agglomerate. Several variations of the self-organizing map have been proposed, but the fundamental idea remains the same as first described by Kohonen in 1981.
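The training loop of Eqs. A.16 and A.17 can be sketched as follows. The map size, learning parameters, and function name are illustrative choices, and for simplicity the kernel width σ is held fixed rather than shrunk over time as is common in practice:

```python
import numpy as np

rng = np.random.default_rng(2)

def train_som(data, n=8, m=8, epochs=200, h0=0.5, sigma=2.0):
    """Train an n x m self-organizing map with the Gaussian kernel of Eq. A.17."""
    K = data.shape[1]
    w = rng.random((n, m, K))                    # random initial weights
    grid = np.stack(np.meshgrid(np.arange(n), np.arange(m),
                                indexing="ij"), axis=-1)
    for _ in range(epochs):
        x = data[rng.integers(len(data))]        # pick a training sample
        dist = np.linalg.norm(w - x, axis=-1)    # Euclidean similarity measure
        c = np.unravel_index(np.argmin(dist), dist.shape)  # winning neuron
        r2 = ((grid - np.array(c)) ** 2).sum(axis=-1)      # map distance to winner
        h = h0 * np.exp(-r2 / sigma ** 2)        # kernel h_ci, Eq. A.17
        w += h[..., None] * (x - w)              # update, Eq. A.16
    return w

w = train_som(np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]))
```

After training, nodes near each winning position carry weights close to one of the two training vectors, giving the agglomeration of similar weight vectors described above.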

A.6 Spiking Neural Networks (SNN)

In comparison to MLPs or Boltzmann machines, spiking neural networks [94][95] are a relatively new class of algorithms that originated from the study of the mammalian brain and its ability to learn highly complex patterns. Theoretically, they can approximate any continuous function and have more computational power than the networks discussed so far [96]. Like the brain, these networks consist of spiking neurons connected through synapses. Information is encoded in the timing of the spikes generated by the neurons. The structure of an artificial synapse is also heavily inspired by the biological neuron. A simple depiction is shown in Figure A.5.

Figure A.5: A synapse. The blocks shown are state-dependent and characterized by various dynamical effects leading to different types of spikes

There are two types of synaptic junctions, inhibitory and excitatory, producing inhibitory and excitatory post-synaptic potentials (IPSPs and EPSPs) respectively. A stimulus arriving at an inhibitory junction lowers the post-synaptic potential, and vice versa. Each input is weighed by a weight function that modulates the strength of the connection, much like in the McCulloch-Pitts neuron. The neuron fires an action potential when the net post-synaptic potential exceeds a threshold θ_v. The action potential is a short and dramatic increase in the membrane potential that travels along the axon of the neuron to the next neuron or the target muscle. The action potential triggers an IPSP or an EPSP depending on the synaptic junction. After a neuron fires, it experiences a refractory period during which it does not fire; sometimes this is modelled as an increase in the threshold. Artificial spiking neurons model this behaviour by defining approximate action potential and post-synaptic potential waveforms. Typical response functions for the IPSP, the EPSP, and the action potential are shown in Figure A.6.

Figure A.6: IPSP, EPSP, and action potential

A widely used model of the spike-generation mechanism is the integrate-and-fire model, in which a spike is a dramatic increase in the membrane potential emitted when the post-synaptic potential exceeds a threshold value. The integrate-and-fire neuron used in TrueNorth is a versatile example of this model, described by the following equations (a list of the variables and terms used is provided in Figure A.7):

SYNAPTIC INTEGRATION

V_j(t) = V_j(t−1) + Σ_{i=0}^{255} A_i(t) w_{i,j} [(1 − b_j^{G_i}) s_j^{G_i} + b_j^{G_i} · F(s_j^{G_i}, ρ_{i,j}) · sign(s_j^{G_i})]    (A.18)

LEAK INTEGRATION

Ω = (1 − ε_j) + ε_j · sign(V_j(t))    (A.19)
V_j(t) = V_j(t) + Ω · [(1 − c_j^λ) λ_j + c_j^λ · F(λ_j, ρ_j^λ) · sign(λ_j)]

Figure A.7: Data-types and symbols used in the LIF neuron model [9] © IEEE

THRESHOLD, FIRE, RESET

η_j = ρ_j^T & M_j

if V_j(t) ≥ α_j + η_j:
    V_j(t) = δ(γ_j) R_j + δ(γ_j − 1)(V_j(t) − (α_j + η_j)) + δ(γ_j − 2) V_j(t)    (A.20)
else if V_j(t) < −[β_j κ_j + (β_j + η_j)(1 − κ_j)]:
    V_j(t) = −β_j κ_j + [−δ(γ_j) R_j + δ(γ_j − 1)(V_j(t) + (β_j + η_j)) + δ(γ_j − 2) V_j(t)](1 − κ_j)
end if
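The TrueNorth equations above are heavily parameterized. The sketch below strips the model down to its essential loop: integrate inputs, apply a constant leak, and fire-and-reset at a threshold. It is a deliberate simplification, not the TrueNorth model itself, and all constants are illustrative:

```python
def lif_simulate(input_current, threshold=1.0, leak=0.25, v_reset=0.0):
    """Simulate a minimal leaky integrate-and-fire neuron."""
    v, spikes, trace = 0.0, [], []
    for i_t in input_current:
        v = max(v + i_t - leak, 0.0)   # synaptic integration + leak
        if v >= threshold:             # threshold, fire, reset
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
        trace.append(v)
    return spikes, trace

# A constant drive of 0.5 per tick crosses the threshold every fourth tick
spikes, trace = lif_simulate([0.5] * 10)
```

A constant input thus produces a regular spike train whose rate grows with the drive, which is the basis of the rate-based encoding discussed below in this section.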

There are generally two categories of training algorithms for spiking neural networks:

1. STDP-based unsupervised learning

2. Rate-based learning

Spike-timing-dependent plasticity (STDP) training algorithms for SNNs are unsupervised techniques based on Hebbian learning [97][98]. STDP refers to the biological process by which the synaptic strength increases or decreases depending on the relative timing of the input and output action potentials at the synapse. This was illustrated for different spike shapes in Figure 3.10. For the action potential shown in Figure A.6, the resulting STDP update function is shown in Figure A.8.

Figure A.8: STDP waveform for the action potential shown in Figure A.6

Before training begins, all synaptic weights are initialized to random values. During training, each input sample is transformed into a collection of impulses that are driven into the inputs of the network. The input neurons are connected to other neurons of the network according to pre-set rules, decided by the designer based on criteria such as the complexity of the training data and the simulation or hardware cost. When an input training sample is driven into the network, it causes some of the neurons to trigger action potentials, which are critical to the learning process. Many mathematical variants of the exact update rule exist in the literature [17][97]. In general, the effect of the update mechanism is that synaptic junctions activated slightly before the neuron fires are strengthened, and those activated slightly after the neuron fires are weakened.
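One common pair-based variant of the STDP rule, matching the general shape of the update function in Figure A.8, can be sketched as follows. The exponential form and all parameter values are illustrative assumptions; as noted above, many variants exist:

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.05, a_minus=0.05,
                tau=20.0, w_max=1.0):
    """Pair-based STDP: potentiate when the pre-spike precedes the
    post-spike, depress otherwise, decaying with the timing difference."""
    dt = t_post - t_pre
    if dt >= 0:
        w += a_plus * np.exp(-dt / tau)    # pre before post: strengthen
    else:
        w -= a_minus * np.exp(dt / tau)    # post before pre: weaken
    return float(np.clip(w, 0.0, w_max))   # keep the weight bounded

w_pot = stdp_update(0.5, t_pre=10.0, t_post=15.0)  # potentiation
w_dep = stdp_update(0.5, t_pre=15.0, t_post=10.0)  # depression
```

Clipping the weight to [0, w_max] is a practical detail; without a bound, repeated potentiation would let weights grow without limit.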

Over the course of several iterations, different sets of neurons learn to respond to different classes of inputs. After training is complete, they are assigned class labels depending on their firing behaviour. There is no clear way to define convergence in these networks; it is generally determined by the designer based on the problem. Rate-based SNN algorithms encode the input vector as spiking rates and use gradient-descent-based algorithms to alter those rates. The important difference between rate-based SNN training algorithms and a traditional neural network with the same architecture lies in the encoding scheme; the other mathematical details remain the same.