Improving Stock Closing Price Prediction Using Recurrent Neural Network and Technical Indicators
LETTER
Communicated by Jinjung Liang

Tingwei Gao
Yueting Chai
Department of Automation, Tsinghua University, Beijing 100084, China

This study focuses on predicting stock closing prices by using recurrent neural networks (RNNs). A long short-term memory (LSTM) model, a type of RNN, coupled with basic stock trading data and technical indicators, is introduced as a novel method to predict the closing price of the stock market. We reduce the dimensionality of the technical indicators using principal component analysis (PCA). To train the model, we adopt optimization strategies including adaptive moment estimation (Adam) and Glorot uniform initialization. Case studies are conducted on Standard & Poor's 500, NASDAQ, and Apple (AAPL). Extensive comparison experiments are performed, using a series of evaluation criteria to assess the model. Accurate prediction of the stock market is considered an extremely challenging task because of the noisy environment and the high volatility associated with external factors. We hope the methodology we propose advances research on analyzing and predicting stock time series. As the experimental results suggest, the proposed model achieves a good level of fitness.

1 Introduction

Stock prediction is the act of determining the future price value or movement of a company stock (Hegazy, Soliman, & Salam, 2014). Mining stock market patterns is generally perceived as a challenging task because stock data are noisy and nonstationary (Abu-Mostafa & Atiya, 1996). Stock data are a type of time series. Financial time series forecasting generally predicts future values using charting or modeling techniques, including candlestick patterns and machine learning (ML) algorithms. The prominent ML model for time series analysis is the artificial neural network (ANN).
At present, ANNs are regarded as the state-of-the-art theory and technique for regression and classification applications. A recurrent neural network (RNN) is a special kind of ANN, designed to learn sequential or time-varying patterns (Medsker & Jain, 2001). Long short-term memory (LSTM) is an important kind of RNN that excels at remembering values for either long or short periods of time (Gers, Schmidhuber, & Cummins, 2000). It has been shown to outperform other RNNs on tasks involving long time lags (Gers, Schraudolph, & Schmidhuber, 2002).

This study proposes and validates a novel stock prediction model on the basis of LSTM, stock basic trading data, stock technical indicators, and principal component analysis (PCA). The model is designed to predict the closing price of the next day. Our first major contribution is that we effectively design a stock prediction system using LSTM. Second, we propose a method of combining basic stock trading data and technical indicators as the input variables by PCA. Only technical indicators are associated with the PCA unit. Third, our comparison experiments evaluate the model.

The remainder of this letter is organized as follows. Section 2 provides a brief overview of related work. In section 3, we describe our prediction model and its design details. Section 4 shows and analyzes the comparison experiment results. Finally, conclusions are presented in section 5.

Neural Computation 30, 2833–2854 (2018) © 2018 Massachusetts Institute of Technology doi:10.1162/neco_a_01124
2 Literature Review

In the past decade, the use of ML for stock market behavior analysis has been an active research topic, including trading strategy (Sirignano & Cont, 2018; Samarakoon & Athukorala, 2017; Dash & Dash, 2016; Chourmouziadis & Chatzoglou, 2016; Zhu, Yin, & Li, 2014; Takeuchi & Lee, 2013) and stock price prediction. There is plenty of pioneering work on the applications of ANNs for predicting stock prices (Aghakhani & Karimi, 2016; Göçken, Özçalıcı, Boru, & Dosdoğru, 2016; Yaqub & Al-Ahmadi, 2016; Chen, 2015; Yetis, Kaplan, & Jamshidi, 2014; Sun, Che, & Wang, 2014; Li, Wu, Liu, & Luo, 2014; Das & Uddin, 2013; Oliveira, Nobre, & Zárate, 2013). Recently, RNN models have been introduced as methods to predict stock prices. Jia (2016) investigated the effectiveness of LSTM for stock market prediction, Xie and Wang (2016) found that RNNs are effective in forecasting stock prices, and Chen, Zhou, and Dai (2015) predicted stock returns using LSTM. A model based on RNN was proposed for predicting stock returns (Rather, Agarwal, & Sastry, 2015).

Besides ANNs, a variety of other ML methods have been used for stock market forecasting, for instance, support vector machines (SVM; Chen & Hao, 2017; Wen, Xiao, He, & Gong, 2014; Ni, Ni, & Gao, 2011), decision trees (Hu, Feng, Zhang, Ngai, & Liu, 2015; Wu, Lin, & Lin, 2006), genetic algorithms (Ye & Wei, 2015; Fang, Fataliyev, Wang, Fu, & Wang, 2014; Sheta, Faris, & Alkasassbeh, 2013), and Markov chains (Gupta & Dhingra, 2012; Wang, Cheng, & Hsu, 2010).
Some research has focused on traditional time series analysis for stock market prediction, such as autoregressive models (Mathew, Sola, Oladrin, & Amos, 2013; Chou & Wang, 2007), autoregressive moving average (ARMA) models (Anaghi & Norouzi, 2013; Feng & Cao, 2011), autoregressive integrated moving average (ARIMA) models (Lin & Pai, 2010; Al-Shiab, 2006), and generalized autoregressive conditional heteroskedasticity (GARCH) models (Zhang, 2014; Dong, 2012; Wang, Guo, Niu, & Cao, 2010).

PCA has also played an important role in stock market prediction. Zhong and Enke (2017) presented a data mining process for stock prediction using ANN and PCA. Chang and Wu (2015) presented a kernel-based PCA to extract critical features and increase the performance of a stock trading model. The statistical behaviors of Chinese stock market fluctuations were investigated by Liu and Wang (2011) using PCA. Tsai and Hsiao (2010) used PCA as a feature selection method for predicting the stock market.

3 Proposed Predicting Model

3.1 Input Method

3.1.1 Input Variables Selection. Our model's input variables include basic historical trading data and technical indicators. There are six variables in the basic trading data set. The open price (OP) is the price at which a security first trades when the exchange opens on a given trading day. The closing price (CL) is the final price at which a security is traded on a given trading day. The high price (HI) is the highest price at which a stock trades over the course of a trading day. The low price (LO) is the lowest price at which a stock trades over the course of a trading day. The adjusted price (AD) is a stock's CL on any given day of trading that has been amended to include any distributions and corporate actions that occurred at any time prior to the next day's open.
The volume (VO) is the total quantity of shares or contracts traded for a specified security.

Stock technical indicators can be adopted to predict the performance of a company's stock price (Thomas, 2001). Technical indicators are mathematical calculations based on basic stock trading data. They have been shown to carry abundant latent information about stock markets. Employing technical indicators has yielded better results because some indicators contain information from both the current day and previous days. For instance, PROC provides closing price information for the past 12 days, SO-%K provides price information for the past 5 days, MACD provides information for the past 26 days, and VROC provides volume information for the past 12 days.

Table 1: Technical Indicators and Their Formulas.

Technical Indicator    Formula
ACD        $\mathrm{ACD} = \mathrm{ACD}_{\text{previous-day}} + \mathrm{VO} \times \frac{(\mathrm{CL} - \mathrm{LO}) - (\mathrm{HI} - \mathrm{CL})}{\mathrm{HI} - \mathrm{LO}}$
MACD       $\mathrm{MACD} = \mathrm{EMA}(\mathrm{CL}, 12) - \mathrm{EMA}(\mathrm{CL}, 26)$
CHO        $\mathrm{CHO} = \mathrm{EMA}(\mathrm{AD}, 3) - \mathrm{EMA}(\mathrm{AD}, 10)$
Highest    $\mathrm{Highest}(t) = \max_{i=1}^{t} \mathrm{CL}_i$
Lowest     $\mathrm{Lowest}(t) = \min_{i=1}^{t} \mathrm{CL}_i$
SO-%K      $\text{SO-\%K} = \frac{\mathrm{CL} - \mathrm{Lowest}(5)}{\mathrm{Highest}(5) - \mathrm{Lowest}(5)} \times 100\%$
SO-%D      $\text{SO-\%D} = \mathrm{MA}(\text{SO-\%K}, 3)$
VPT        $\mathrm{VPT} = \mathrm{VPT}_{\text{previous-day}} + \mathrm{VO} \times \frac{\mathrm{CL} - \mathrm{CL}_{\text{previous-day}}}{\mathrm{CL}_{\text{previous-day}}}$
W-R%       $\text{W-R\%} = \frac{\mathrm{Highest}(n) - \mathrm{CL}}{\mathrm{Highest}(n) - \mathrm{Lowest}(n)} \times 100\%$
RSI        $\mathrm{RSI} = 100 - \frac{100}{1 + \mathrm{RS}}$
MOME       $\mathrm{MOME}(n) = \mathrm{CL}_t - \mathrm{CL}_{t-n}$
AC         $\mathrm{AC}(t) = \mathrm{AO} - \mathrm{MA}(\mathrm{AO}, t)$
PROC       $\mathrm{PROC} = \frac{\mathrm{CL}_t - \mathrm{CL}_{t-12}}{\mathrm{CL}_{t-12}} \times 100\%$
VROC       $\mathrm{VROC} = \frac{\mathrm{VO}_t - \mathrm{VO}_{t-12}}{\mathrm{VO}_{t-12}} \times 100\%$
OBV        If $\mathrm{CL} \geq \mathrm{CL}_{\text{previous-day}}$, $\mathrm{OBV} = \mathrm{OBV}_{\text{previous-day}} + \mathrm{VO}$
           If $\mathrm{CL} < \mathrm{CL}_{\text{previous-day}}$, $\mathrm{OBV} = \mathrm{OBV}_{\text{previous-day}} - \mathrm{VO}$

Notes: Moving average (MA) and exponential moving average (EMA):
$\mathrm{MA}(x, t) = \sum_{i=1}^{t} x_i / t$
$\mathrm{EMA}(x, t) = \alpha \times x + (1 - \alpha) \times \mathrm{EMA}_{\text{previous-day}}, \quad \alpha = 2/(t + 1)$
$\mathrm{RS} = \frac{\text{Average of upward price change}}{\text{Average of downward price change}}$
$\mathrm{MedianPrice} = (\mathrm{HI} + \mathrm{LO})/2$
$\mathrm{AO}(t_1, t_2) = \mathrm{MA}(\mathrm{MedianPrice}, t_1) - \mathrm{MA}(\mathrm{MedianPrice}, t_2)$

In our study, 15 stock market technical indicators are selected as the input variables (the calculation formulas of the technical indicators are listed in Table 1):
• Accumulation distribution (ACD) attempts to relate price and volume in the stock market.
• The moving average convergence divergence (MACD) shows the relationship between two moving averages of prices and reveals changes in strength, direction, and duration of a trend in a stock's price.
• The Chaikin oscillator (CHO) is used to measure the AD line of the MACD.
• Highest(t) is the highest closing price value during the past t trading days.
• Lowest(t) is the lowest closing price value within the past t trading days.
• The stochastic oscillator (SO) attempts to compare the closing price of a security to the range of its prices over a certain period of time.
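As a rough illustration of how several of the formulas in Table 1 might be computed, the following Python sketch implements MA, EMA, MACD, SO-%K, PROC, MOME, and OBV from plain lists of closing prices and volumes. The function names, the EMA seed (the first observation), the OBV starting value of zero, and the handling of a flat price window are assumptions made for this sketch; the letter does not specify them.

```python
from typing import List, Sequence


def moving_average(x: Sequence[float], t: int) -> float:
    """MA(x, t): mean of the most recent t values of x."""
    if t <= 0 or len(x) < t:
        raise ValueError("need at least t observations and t > 0")
    return sum(x[-t:]) / t


def ema_series(x: Sequence[float], t: int) -> List[float]:
    """EMA(x, t) for every day, with alpha = 2 / (t + 1).

    Assumption: the recursion is seeded with the first observation.
    """
    alpha = 2.0 / (t + 1)
    out = [x[0]]
    for value in x[1:]:
        out.append(alpha * value + (1.0 - alpha) * out[-1])
    return out


def macd_series(cl: Sequence[float]) -> List[float]:
    """MACD = EMA(CL, 12) - EMA(CL, 26), computed day by day."""
    fast, slow = ema_series(cl, 12), ema_series(cl, 26)
    return [f - s for f, s in zip(fast, slow)]


def so_percent_k(cl: Sequence[float], n: int = 5) -> float:
    """SO-%K over the past n closing prices (Table 1 uses n = 5)."""
    window = cl[-n:]
    highest, lowest = max(window), min(window)
    if highest == lowest:
        return 0.0  # assumption: a flat window maps to 0 to avoid division by zero
    return (cl[-1] - lowest) / (highest - lowest) * 100.0


def proc(cl: Sequence[float], n: int = 12) -> float:
    """PROC = (CL_t - CL_{t-n}) / CL_{t-n} * 100 (Table 1 uses n = 12)."""
    return (cl[-1] - cl[-1 - n]) / cl[-1 - n] * 100.0


def momentum(cl: Sequence[float], n: int) -> float:
    """MOME(n) = CL_t - CL_{t-n}."""
    return cl[-1] - cl[-1 - n]


def obv_series(cl: Sequence[float], vo: Sequence[float]) -> List[float]:
    """OBV: add VO when CL >= previous CL, otherwise subtract VO.

    Assumption: OBV starts at 0 on the first day.
    """
    out = [0.0]
    for i in range(1, len(cl)):
        step = vo[i] if cl[i] >= cl[i - 1] else -vo[i]
        out.append(out[-1] + step)
    return out


if __name__ == "__main__":
    # Synthetic, steadily rising prices; not real market data.
    closes = [float(i) for i in range(1, 31)]
    volumes = [100.0] * 30
    print(macd_series(closes)[-1], so_percent_k(closes), proc(closes))
    print(momentum(closes, 12), obv_series(closes, volumes)[-1])
```

The remaining indicators in Table 1 (ACD, CHO, VPT, W-R%, RSI, AC, VROC) follow the same pattern of a running sum, a windowed extreme, or a ratio against a lagged value, so they could be added in the same style.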