Memristor-Based Neural Networks with Weight Simultaneous Perturbation Training

Nonlinear Dyn https://doi.org/10.1007/s11071-018-4730-z REVIEW Memristor-based neural networks with weight simultaneous perturbation training Chunhua Wang · Lin Xiong · Jingru Sun · Wei Yao Received: 24 May 2018 / Accepted: 11 December 2018 © Springer Nature B.V. 2018 Abstract The training of neural networks involves pler and easier implementation of the neural network numerous operations on the weight matrix. If neural in hardware. The practicability, utility and simplicity of networks are implemented in hardware, all weights this scheme are demonstrated by the supervised learn- will be updated in parallel. However, neural networks ing tasks. based on CMOS technology face many challenges in the updating phase of weights. For example, deriva- Keywords Memristor · Synaptic unit · Hardware tion of the activation function and error back prop- implementation · Multilayer neural networks (MNNs) · agation make it difficult to be realized at the circuit Weight simultaneous perturbation (WSP) level, even though the back propagation algorithm is rather efficient and popular in neural networks. In this paper, a novel synaptic unit based on double identical memristors is designed, on the basis of which a new 1 Introduction neural network circuit architecture is proposed. The whole network is trained by a hardware-friendly weight With the rapid development of artificial intelligence, simultaneous perturbation (WSP) algorithm. The hard- neural networks are attracting more and more atten- ware implementation of neural networks based on WSP tions and play a significantly important role in the field algorithm only involves the feedforward circuit and of image recognition, speech recognition [1] and auto- does not require the bidirectional circuit. Furthermore, matic control, etc. Up to now, most of neural networks two forward calculations are merely needed to update have been realized by software, which lacks the inher- all weight matrices for each pattern, which significantly ent parallelism of neural networks. However, circuit simplifies the weight update circuit and allows sim- operations are inherently parallel and capable of pro- viding high speed computation. Therefore, it is neces- B · · · C. Wang ( ) L. Xiong J. Sun W. Yao sary to research hardware implementations of neural College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China networks. e-mail: [email protected] Regarding hardware implementations of neural net- L. Xiong works, many studies have already been conducted [2– e-mail: [email protected] 8], most of which implemented neural networks via J. Sun CMOS technology. However, the implementation of e-mail: [email protected] nonvolatile weight storage is a major bottleneck for W. Yao them. On the other hand, when the area and consump- e-mail: [email protected] tion taken into account, it is a hard task to efficiently 123 C. Wang et al. realize the circuit-based neural networks using CMOS all single-layer networks [30]. Their design makes the technology. training process simplified and indeed reduces the com- Recently, a novel device, the memristor [9–13], pro- munication overhead and the circuit complexity. How- vides a fire-new approach for hardware implementa- ever, their initial training and the weight update calcu- tions of neural networks. The memristor, a nonvolatile lation were performed in software. In 2015, Soudry et programmable resistor, was first physically realized by al. designed an artificial synapse constituting of a single HP Labs in 2008 [14]. Current through the memristor memristor and two transistors, and BP algorithm was or voltage across the memristor enables to tune mem- used to train the whole network [31]. Indeed, it is fairly ristor resistance. When its excitation is turned off, the novel for their design that the errors are encoded as cor- device keeps its most recent resistance until the excita- responding weight update time and the errors and inputs tion is turned on again. Memristor is widely applied are introduced into weight updating process by means to chaotic circuits [15,16], memory and neural net- of the state variables of memristors. However, BP algo- works. In memristor-based neural networks, memris- rithm needs to propagate the error backward from the tor’s nonvolatility is significantly similar to the biolog- output layer to the hidden layers, and the update weight ical synapse, which spurs it to be a promising candidate is obtained by calculating the derivative of the activa- for weight storage. And the fact that memristor belongs tion function. These complex calculations are not easy to nanodevices enables it easier to be integrated in the to implement at the circuit level. analog circuit. Because of these, memristor was widely Focusing on the difficulties in efficiently imple- used in hardware implementations of neural networks. menting learning rule like BP algorithm in hard- Many memristor-based neural networks were previ- ware, some hardware-friendly algorithms were pro- ously trained by the so-called spike-timing-dependent posed and applied to the circuit implementation of plasticity (STDP) [17–21], which was intended for neural networks. For example, in 2015, the random explaining the biological phenomenon about tuning weight change (RWC) algorithm was used to train the synaptic junction strength of the neurons. Besides, memristor-based neural network where the synaptic Sheri et al. realized neuromorphic character recogni- weights were randomly updated by a small constant in tion system with two PCMO memristors as a synapse each iteration [32]. Their design simplifies the circuit [22]. Nishitani et al. modified an existing three-terminal structure and does not involve complex circuit oper- ferroelectric memristor (3T-FeMEM) model, and the ations. However, in RWC’s weight updating process, STDP was exploited to perform supervised learning just the sign but not the value of the error variation is [19]. However, the convergence of STDP-based learn- considered, and the direction of weight change is ran- ing is not guaranteed for general inputs [23]. dom every time, which results in more iterations being Other memristor-based learning systems recently needed. attract people’s attentions. For example, memristor- In order to solve the problems in the memristor- based single-layer neural networks (SNNs) trained based neural network trained by RWC and BP algo- by the perceptron algorithm were designed to com- rithm, in this paper, a new memristor-based neural net- plete some linear separable tasks [24,25]. Yet, it is work circuit trained by weight simultaneous perturba- the single-layer structure and binary outputs that limit tion (WSP) algorithm is proposed. WSP algorithm was its wide-ranging applications. Therefore, constructing introduced in [33]. Perturbations whose signs are ran- memristor-based multilayer neural networks (MNNs) domly selected with equal probability are simultane- [26–29] that are rather more practical and efficient than ously added to all weights, and the difference between SSNs is requisite. In 2012, Adhikari et al. put forward a the first error function without perturbations and the modified chip-in-the-loop learning rule for MNNs with second one with perturbations is used to control the memristor bridge synapses that consisted of three tran- change of weights in WSP. Unlike RWC algorithm, sistors and four identical memristors and realized zero, both the sign and value of the variation of the error negative and positive synaptic weights [30]. Unlike the function are exploited to update all weights. Hence, conventional chip-in-the-loop learning, a complex mul- fewer iterations are needed compared with RWC algo- tilayer network learning was conducted by decompos- rithm. Additionally, in WSP,complex calculations such ing it into multiple simple single-layer network learn- as derivation and error back propagation are not also ing, and the gradient descent algorithm was used to train involved compared with BP algorithm. Though the 123 Memristor-based neural networks ∗ ∗ ∗ algorithm is relatively efficient and simple, so far, G(s ) G (s ) ∗ G (s ) ∗ G(s) = + (s − s ) + (s − s )2 the memristor-based neural network circuit trained by 0! 1! 2! (n) ∗ WSP algorithm is not available. The hardware architec- G (s ) ∗ ∗ +··· + (s − s )n + O (s − s )n ture of the neural network trained by WSP algorithm is n! therefore proposed in this paper. In addition, in order (3) to solve the problem of storing and updating weights, a (n)( ∗) novel synaptic unit based on double identical memris- where n is a positive integer. G s represents nth ( ) = ∗ tors is designed, and taking full advantage of the char- order derivative of the function G s at s s , and [( − ∗)n] acteristics of WSP algorithm, this design only needs O s s represents higher-order infinitesimal of ( − ∗)n two forward computations for each pattern to update s s . all weight matrices, which greatly simplifies the cir- Soudry et al. demonstrated that if the variations in the ( ) ( ( )) cuit structure. This paper indicates that it is possible value of s t are restricted to be small, G s t is able to be approximated by first-order Taylor series around for neural networks to be actualized in a simpler, more ∗ compact and reliable form at the circuit level. certain point s [31]. The memductance is hence given The remainder of this paper is organized as follows. by ∗ ∗ In Sect. 2, the basic information about memristor, RWC G(s ) G (s ) ∗ G(s) = + (s − s ) (4) algorithm and WSP algorithm is introduced. In Sect. 3, 0! 1! the circuit implementations of SNNs and MNNs are simplified form: described in detail. In Sect. 4, the proposed circuit ∗ G(s(t)) = g +ˆgs(t) (5) architectures are numerically evaluated by supervised ∗ ∗ ∗ tasks to demonstrate its practicability. Finally, Sect. 6 where gˆ =[dG(s)/ds]s=s∗ and g = G(s ) −ˆgs . presents conclusions. Based on this memristor model, SSNs and MNNs are efficiently implemented in hardware. 2 Preliminaries 2.2 Random weight change In this section, the basic information about memristor, RWC algorithm and WSP algorithm is described.

Load more