REVIEW Communicated by Terrence Sejnowski A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures Yong Yu
[email protected] Department of Automation, Xi’an Institute of High-Technology, Xi’an 710025, China, and Institute No. 25, Second Academy of China, Aerospace Science and Industry Corporation, Beijing 100854, China Xiaosheng Si
[email protected] Changhua Hu
[email protected] Jianxun Zhang
[email protected] Department of Automation, Xi’an Institute of High-Technology, Xi’an 710025, China Recurrent neural networks (RNNs) have been widely adopted in research areas concerned with sequential data, such as text, audio, and video. How- ever, RNNs consisting of sigma cells or tanh cells are unable to learn the relevant information of input data when the input gap is large. By intro- ducing gate functions into the cell structure, the long short-term memory (LSTM) could handle the problem of long-term dependencies well. Since its introduction, almost all the exciting results based on RNNs have been achieved by the LSTM. The LSTM has become the focus of deep learning. We review the LSTM cell and its variants to explore the learning capac- ity of the LSTM cell. Furthermore, the LSTM networks are divided into two broad categories: LSTM-dominated networks and integrated LSTM networks. In addition, their various applications are discussed. Finally, future research directions are presented for LSTM networks. 1 Introduction Over the past few years, deep learning techniques have been well devel- oped and widely adopted to extract information from various kinds of data (Ivakhnenko & Lapa, 1965; Ivakhnenko, 1971; Bengio, 2009; Carrio, Sampe- dro, Rodriguez-Ramos, & Campoy, 2017; Khan & Yairi, 2018).