Memristor-Based Approximated Computation
Boxun Li¹, Yi Shan¹, Miao Hu², Yu Wang¹, Yiran Chen², Huazhong Yang¹
¹Dept. of E.E., TNList, Tsinghua University, Beijing, China
²Dept. of E.C.E., University of Pittsburgh, Pittsburgh, USA
¹Email: [email protected]

Abstract: The cessation of Moore's Law has limited further improvements in power efficiency. In recent years, the physical realization of the memristor has demonstrated a promising solution to ultra-integrated hardware realization of neural networks, which can be leveraged for better performance and power efficiency gains. In this work, we introduce a power efficient framework for approximated computations by taking advantage of memristor-based multilayer neural networks. A programmable memristor approximated computation unit (Memristor ACU) is introduced first to accelerate approximated computation, and a scalable memristor-based approximated computation framework is proposed on top of the Memristor ACU. We also introduce a parameter configuration algorithm for the Memristor ACU and a feedback state tuning circuit to program the Memristor ACU effectively. Our simulation results show that the maximum error of the Memristor ACU for 6 common complex functions is only 1.87%, while the state tuning circuit can achieve 12-bit precision. The implementation of the HMAX model atop our proposed memristor-based approximated computation framework demonstrates a 22× power efficiency improvement over its pure digital implementation counterpart.

Index Terms: memristor, approximated computation, power efficiency, neuromorphic

I. INTRODUCTION

Power efficiency is a major concern in computing systems. Limited battery capacity demands a power efficiency of hundreds of giga floating-point operations per second per watt (GFLOPS/W) for mobile embedded systems to achieve desirable portability and performance [1]. However, the highest power efficiency of contemporary CPU or GPU systems is only ~20 GFLOPS/W, which is not expected to improve substantially at predictably scaled technology nodes [2]. Therefore, researchers are looking for an alternative computing architecture to conventional digital computing systems to achieve performance and power efficiency gains [3].

The brain is such a power efficient platform, with incredible computational capability. Compared to biological systems, today's programmable computing systems are more than 6 orders of magnitude less efficient [4]. This result drives more and more people to work on neuromorphic computational paradigms beyond digital logic by replicating the brain's extraordinary computational abilities. However, contemporary research on neuromorphic systems is mostly confined to the algorithm field, while the hardware implementations are still built upon traditional CPU/GPU/FPGA systems, which usually consume a large volume of memory and computing resources [5], [6]. In general, algorithm enhancement with conventional CMOS implementation only alleviates the power efficiency issue but does not fundamentally resolve it.

In recent years, device technology innovations have enabled new computational paradigms beyond von Neumann architectures, which not only provide a promising hardware solution to neuromorphic systems but also help drastically close the gap in power efficiency between computing systems and the brain. The memristor is one of those promising devices. The memristor is able to support a large number of signal connections within a small footprint by taking advantage of its ultra-integration density [7]. Most importantly, the nonvolatile feature that the state of the memristor can be tuned by the current passing through it makes the memristor a potential, perhaps even the best, device to realize neuromorphic computing systems with picojoule-level energy consumption [8], [9].

Our objective is to use memristors to design a power efficient neuromorphic framework for approximated computation with both programmability and computation generality. The framework is inspired by the theory that a multilayer neural network can work as a universal approximator, and by the wide observation of approximated computation in a series of applications, ranging from signal processing and pattern recognition to computer vision and natural language processing. Both the neuromorphic computing architecture and the tolerance of inexact computation can be leveraged for substantial performance and power efficiency gains [9], [10].

To realize this goal, the following challenges must be overcome. Firstly, a power efficient Memristor ACU that can complete general algebraic calculus is demanded to accomplish different computing tasks. Secondly, Memristor ACUs must be organized effectively to form a computation framework in order to accomplish different processing tasks. Thirdly, a parameter configuration algorithm is needed to work out the effective parameters of the Memristor ACUs according to the function requirement. Finally, the memristor cannot be effectively programmed, and thus a state tuning scheme for memristors that is both efficient and accurate is demanded.

In this work, we for the first time propose a complete framework for memristor-based approximated computation. The contributions of this paper are:
1) We propose a power efficient memristor-based approximated computation framework. The framework is integrated with our programmable Memristor ACUs. Simulation results show that our Memristor ACU offers less than 1.87% error for 6 common complex functions and works reliably under a noisy condition;
2) We provide a parameter configuration algorithm to convert the weights of the neural network to the appropriate resistance of the memristors in Memristor ACUs. The algorithm can provide both the maximum dynamic range and the validity of the memristance by taking the impact of fabrication variation into consideration;
3) We propose a feedback circuit to program the memristor accurately. The circuit can provide a linear relationship between the input voltage and the final value of the memristor. The accuracy increases with the frequency of the reference voltage, and an analog storage with 12-bit precision is realized under 2 MHz.

The rest of our paper is organized as follows: Section II provides background information and related work. Section III proposes the details of the memristor-based approximated computation framework. Section IV depicts the detailed programming method of the framework, including the parameter configuration algorithm and the high-precision state tuning circuit for memristors. Section V presents a case study based on the HMAX model. Section VI concludes this work and shows some future directions.

II. BACKGROUND KNOWLEDGE AND RELATED WORK

A. Memristor

The memristor was first physically realized by HP Labs in 2008 [11]. Fig. 1 shows the physical model of the HP memristor [12], which is a two-layer thin film of $\mathrm{TiO}_2$. One layer consists of intact $\mathrm{TiO}_2$ while the other layer lacks a small amount of oxygen ($\mathrm{TiO}_{2-x}$). These two layers demonstrate different conductivity, and the overall resistance is the sum of the two layers. When a current is applied to the device, the boundary of the layers (doping front) will move so as to change the resistance of the memristor.

Fig. 1. Physical model of HP memristor

B. Universal Approximators by Multilayer Feedforward Networks

It has been proven that a universal approximator can be implemented by a multilayer feedforward network with only one hidden layer and sigmoid activation function [13], [14]. Table I gives the maximum errors of the approximations of six common functions by this approximated method based on MATLAB simulation. The mean square errors (MSE) of approximations are less than $10^{-7}$ after the back propagation training algorithm [...] of the global one, and thus sometimes provides a worse result as shown in Table I. Such a precision level (MSE ≈ $10^{-7}$) is sufficient for most approximated computation, e.g., the HMAX model included in our case study.

III. MEMRISTOR-BASED APPROXIMATED COMPUTATION FRAMEWORK

A. Memristor ACU

Fig. 2 shows the overview of the Memristor ACU. The Memristor ACU is based on a memristor-based hardware implementation of a multilayer network (with one hidden layer) to work as a universal approximator. The mechanism is as follows.

Fig. 2. Memristor ACU (voltage inputs Vi1 ... Vim; inverter; positive and negative memristor crossbar arrays; program unit; sigmoid hidden layer; voltage outputs Vo1 ... Von)

The basic operation of a feedforward network is matrix-vector multiplication, which can be mapped to the memristor crossbar array illustrated in Fig. 3. The output of the crossbar array can be expressed as:

$$V_{oj} = R \cdot \sum_{k} (V_{ik} \cdot g_{kj}), \qquad g_{kj} = \frac{1}{M_{kj}} \tag{1}$$

where $V_{ik}$ and $V_{oj}$ are the input and output voltages of each row and column. $k$ and $j$ are the index numbers of the input and output voltages, respectively; they are also the index numbers of the rows and columns of the crossbar array, respectively. $M_{kj}$ and $g_{kj}$ represent the resistance and admittance of each memristor in the crossbar array, respectively; $R$ is the resistor at the end of the crossbar array.

Op Amps are used to enhance the output accuracy and isolate the two layers. All Op Amps work as current feedback amplifiers so that the function of such a crossbar array can be expressed as a vector-matrix multiplication.

Since both $R$ and $M$ can only be positive, two crossbars are needed to represent the positive and negative weights of