
JOURNAL OF CRITICAL REVIEWS

ISSN- 2394-5125 VOL 7, ISSUE 16, 2020

LONG SHORT-TERM MEMORY DEEP NETWORK AND MACHINE LEARNING APPROACH IN ONE-DAY AHEAD STOCK MARKET PREDICTION

1P.V. Chandrika, 2K. Sakthi Srinivasan

1Research Scholar, VIT Business School, Vellore Institute of Technology, Vellore.

2Professor, VIT Business School, Vellore Institute of Technology, Vellore.

Abstract

The emergence of Industry 4.0, with machine learning, deep learning and artificial intelligence, has motivated many researchers to apply these techniques in various domains. Many studies have provided strong evidence for predicting stock market prices using traditional techniques such as regression, ARIMA, ARCH and GARCH. The present paper applies Machine Learning (ML) techniques using ARIMA and Support Vector Machine (SVM) algorithms, and Deep Learning techniques using ANN, RNN and LSTM, to predict price movement. The stock indices DJIA, NIFTY 50, S&P 500, KOSPI and SSE are used for prediction. Selected technical indicators are used in the models to forecast the direction of the indices and the closing price of each index for the next one-day period. The models are evaluated with appropriate performance metrics and compared with the traditional ARIMA approach. The results show that among the ML techniques SVM outperforms ARIMA, and among the deep learning models RNN-LSTM is more accurate than ANN.

Keywords: Time Series Forecasting, Machine Learning, Deep Learning, ARIMA, SVM, ANN, RNN, LSTM

1. Introduction

The stock market is volatile and dynamic in nature, and applying machine learning and deep learning techniques to such an unstable market is challenging. Evaluating whether these techniques can help predict stock market movements has attracted the interest of many researchers. Many traditional theories have tried to explain stock market movements, but no single theory accounts for the volatility and the reasons behind changes in stock prices.

Stock price movements are usually studied using three methods:

1. Fundamental Analysis 2. Technical Analysis and 3. Times series Forecasting.

1.1 Fundamental Analysis:

Fundamental analysis uses a company's financial position and competitive strength to predict the future movements of an asset. It measures a security's intrinsic value by examining related economic and financial factors, which helps an investor compare that value with the security's current price and check whether the security is undervalued or overvalued. It is therefore a way of determining the fair market value of a stock.

1.2 Technical Analysis:

Technical analysis represents the price and volume movements of stocks using charts and graphs. It helps identify trends and patterns in the stocks and thereby forecast price movements. The charts are built on certain statistical analyses.

The theory that is not in agreement with fundamental and technical analysis is the Efficient Market Hypothesis (EMH). EMH proposes that it is impossible to beat the market through fundamental or technical analysis, because the market efficiently prices all stocks on an ongoing basis and any opportunity for excess returns is quickly taken by market participants, so there is no chance of outperforming the market in the long run.


1.3 Time Series Forecasting:

Time series forecasting techniques use historical data and take different forecasting approaches into consideration. Popular techniques in this category include:

1. Moving Average 2. Exponential Smoothing 3. Auto Regressive Moving Average (ARMA) 4. Auto Regressive Integrated Moving Average (ARIMA)

Prediction accuracy for stock index direction, however, differs across models. When machine learning and deep learning techniques are used for prediction, many studies propose that neural networks with hybrid models perform better in predicting stock market direction[ ]. Studies[ ] also propose that hybrid models integrating ML and DNN are superior when technical indicators are taken as input features.

This study attempts to predict the direction of stock indices using ML and deep learning techniques. In ML, Support Vector Machines (SVM), Logistic Regression (LR) and Auto Regressive Integrated Moving Average (ARIMA) are the preferred algorithms, and in deep learning, Multi-Layer Perceptron (MLP), Artificial Neural Network (ANN), Deep Neural Network (DNN) and Recurrent Neural Network (RNN) are preferred. In addition, this research examines the performance of RNN-LSTM as a hybrid tool in determining index movement. Finally, the accuracy of the hybrid RNN-LSTM model is compared with that of the ML models.

The remainder of the paper is organized as follows. Past related works are described in Section 2, and the objectives and limitations of the study are presented in Section 3. Data description, feature engineering, pre-processing methods and modelling are discussed in Section 4, along with a brief description of the different algorithms. Section 5 summarizes the results of the models used, and Section 6 presents the concluding remarks.

2. Related Works

2.1 Econometric Model

The earlier approach adopted in stock market forecasting was confined to basic statistical application and econometric modelling. Later, econometric methods were adopted as the basic algorithms in ML techniques to improve accuracy. Still, the effectiveness of this approach has proven short of predictive power, as highlighted in this section from a survey of the available literature. ARIMA models work well on stationary time series data; non-stationary series can be transformed to stationary form and the results converted back to the original series using the inverse transformations [1].

Adebiyi, Adewumi and Ayo [2014] used ARIMA for predicting short- and long-term movement in the New York Stock Exchange and the Nigerian Stock Exchange; the model performed better for short-term movement than for long-term. In another work, Ali et al. [2014] find a neural network outperforming ARIMA. When Indian stock market time series are modelled, a hybrid ARIMA-GARCH model outperforms the traditional ARIMA technique [4]. Decisions on buy and sell strategies proved profitable when ANN is used as the predicting model [Banerjee, 2014].

Steel [2014] finds a hybrid ARIMA model to be a better predictor than traditional ARIMA for the stock index movements of the German, UK, French, US, Japanese and Australian stock exchanges; his study proved effective for short-term prediction.

2.2 Support Vector Machine

Predictive power improves when SVM is used to find the direction of stock prices and indices [Patel, Shah, Thakkar and Kotecha, 2015]; applying ANN, SVM, Random Forest and Naïve Bayes with technical indicators showed the superior performance of SVM in predicting the direction of CNX Nifty and S&P 500.

An error below 1 per cent is reported when SVR is applied for long-term prediction over a ten-year period for NSE-listed companies and NIFTY [Hore, Vipani, Das and Dutta, 2018]. Swell [2017] finds SVM a better predictor than other traditional algorithms for forecasting DJIA daily returns.


Buy and sell decisions are made easier when MLP and LR classification is applied, and the approach proves its worth with a high accuracy rate, with SVM outperforming [Dingli and Fournier, 2017].

2.3 Neural Network:

A neural network with backpropagation shows improved accuracy, but it is much more effective in a hybrid model embedded with a Genetic Algorithm: GA-ANN outperforms plain ANN in predicting index movement for monthly observations [Qiu, 2014].

MLP is similar to ANN in its forward propagation and to DNN, where it takes the form of a Convolutional Neural Network (CNN); MLP shows better accuracy when used along with LR and Naïve Bayes, but slightly underperforms DNN [Gilberto, 2015].

Krollner, Vanstone and Finnie [2010] find ANN to be the dominant ML algorithm for stock market forecasting. ANN achieves an accuracy of 89.65 per cent on the Indian S&P CNX Nifty 50 in predicting direction, but accuracy falls to 69.72 per cent when a longer period is taken for forecasting [Majumder and Hussain, 2015].

Kumar and Sharma [2016] find a high accuracy of 99 per cent using ANN on Nifty 50. Alotaibi et al. [2018] find ANN with backpropagation capable of predicting the Saudi stock market and oil prices.

Jerzy Korczak [2017] developed a hybrid model of an Artificial Neural Network with Principal Component Analysis, called the Agent Trader (A-Trader) system, to forecast the prices of the Talit and NASDAQ stock indices. Naeini [2010] used MLP and an Elman recurrent network and finds MLP underperforming.

Taking technical indicators as input features and training an ANN gives an accuracy of 60.87 per cent in predicting the Nikkei 225 index [Qiu and Song, 2016]. Panda and Narasimhan [2006] show the superior performance of ANN over linear autoregressive models when applied to BSE stocks.

In one study, Ticardo and Murillo [2016] used a DNN to obtain a prediction accuracy of 65 per cent; similarly low error metrics are observed when a Convolutional Neural Network is used along with LR and SVM [Dingli and Fournier, 2017], where it turns out to be superior in accuracy.

Deep learning models such as MLP, RNN, LSTM and CNN are outstanding when compared with ARIMA, when applied to NSE and NYSE data [Tang, 2013].

A lower loss is observed when the CNN's final activation function is replaced with an SVM; improved metrics are found for CNN-SVM when compared with SVM alone [Korczak and Hernes, 2017].

A peak accuracy of 59.5 per cent in predicting the movement of six US stocks, with an overall average of 54.83 per cent, is observed when an RNN is used [Gao].

Backpropagation in a hybrid setup with a random optimization method gives good forecasts for the Japanese stock market [Baba and Kozaki, 1992].

A hybrid RNN-LSTM model shows 58 per cent accuracy when modelled on six stocks and stocks in the NASDAQ 100 [Acunto, 2016].

3. Objective

From the literature survey it is understood that both econometric tools and machine learning methods are important in forecasting. Still, these methods must be applied carefully, as both can produce large errors if the data are not pre-processed correctly. Hence, the first objective is to determine whether ARIMA and SVM can serve as predictive models. Recent findings show that deep neural networks are more accurate in prediction, so the next objective is framed to test this on the data. A hybrid RNN-LSTM is trained and tested to find the next-day closing price of the selected indices, in addition to the ANN trained as the deep learning model. Comparing the performance metrics is a further objective, as it reveals the better model among the three and guides the choice of model for future prediction.


4. Methods

To predict the direction of stock indices, the study proposes a data-driven sequence. The data set includes the opening/closing values and trade volumes of six stock market indices: S&P 500, DJIA, Nifty 50, NYSE, KOSPI and SSE. Technical indicators reflecting the price movements are further input variables.

Pre-processing and variable selection methods are used to select the appropriate predictors that provide the best metrics and predictive accuracy.

We use two models in this research. The first is a Machine Learning model using ARIMA and Support Vector Machine (SVM) as learning algorithms. In the other model we propose deep neural networks, in which an Artificial Neural Network (ANN) and a Recurrent Neural Network (RNN) with Long Short Term Memory (LSTM) are taken as the learning algorithms. These models are then compared on performance metrics, and based on the evaluation a suitable model is selected for predicting the stock market. We describe the details of the models in the subsections that follow.

4.1 Machine Learning Algorithms

The machine learning algorithms considered for the study are

(a) Auto Regressive Integrated Moving Average (ARIMA)

(b) Support Vector Machine (SVM)

(a) Determining the order of ARIMA (p, d, q): we use 1362 observations covering the period January 1, 2015 to May 31, 2020. A correlogram, together with the ADF statistic, is used to test for stationarity, trend and seasonality. We find the data set is non-stationary with no trend and seasonality. Auto Regressive Integrated Moving Average (ARIMA) integrates an Auto Regressive (AR) component of order p and a Moving Average (MA) component of order q, so it is important to determine the order of the ARIMA model (p, d, q), where d is the degree of differencing. The best ARIMA model is selected on the minimum value of the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The ARIMA model is represented as follows:

$$Y'_t = c + \phi_1 Y'_{t-1} + \dots + \phi_p Y'_{t-p} + \theta_1 \xi_{t-1} + \dots + \theta_q \xi_{t-q} + \xi_t$$

where $Y'_t$ is the series after differencing d times and $\xi_t$ is the error, the difference between $Y_t$ and $\hat{Y}_t$.
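As an illustration of this selection procedure, the following is a minimal sketch (not the authors' code) that searches a small grid of (p, d, q) orders and keeps the fit with the lowest AIC, assuming the statsmodels library; the random-walk series merely stands in for an index's 1362 daily closing values.

```python
# Minimal sketch: choosing an ARIMA(p, d, q) order by minimum AIC (illustrative data).
import itertools
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

close = pd.Series(1000 + np.cumsum(np.random.normal(size=1362)))  # stand-in closing values

best_aic, best_order = float("inf"), None
for p, d, q in itertools.product(range(3), range(2), range(3)):
    try:
        fit = ARIMA(close, order=(p, d, q)).fit()
    except Exception:
        continue  # skip orders that fail to estimate
    if fit.aic < best_aic:
        best_aic, best_order = fit.aic, (p, d, q)

print(f"Selected ARIMA order {best_order} with AIC {best_aic:.2f}")
```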

(b) Support Vector Machine

Support Vector Machines (SVM) are a powerful classification technique with a distinctive implementation that handles multiple continuous and categorical variables. The aim of SVM is to divide the data into classes by finding the maximum-margin hyperplane; the hyperplane is generated iteratively so as to minimise the error. SVM uses a kernel technique, which converts a low-dimensional input space into a higher-dimensional space; in other words, it converts non-separable problems into separable problems by adding dimensions.

Fig. 1. Support Vector Classifier (Source: tutorialspoint.com)

There are different kinds of kernels used in SVM;

a. Linear Kernel b. Polynomial Kernel c. Radial Basis Function Kernel

We adopt the Radial Basis Function (RBF) kernel, which maps the input into an infinite-dimensional space. The RBF kernel uses the following formula:

$$K(x, x_i) = \exp\left(-\gamma \sum (x - x_i)^2\right)$$

Gamma ranges from 0 to 1; a good default value of gamma is 0.1.
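A minimal sketch of an RBF-kernel support vector classifier of this kind, assuming scikit-learn, is shown below; the feature matrix, labels and hyper-parameter values are illustrative placeholders rather than the study's actual inputs.

```python
# Minimal sketch of an RBF-kernel SVC for next-day direction (illustrative data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.random.rand(500, 6)                    # placeholder technical-indicator features
y = (np.random.rand(500) > 0.5).astype(int)   # placeholder up/down labels

X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False, test_size=0.2)

scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf", gamma=0.1, C=1.0)     # gamma=0.1 as the default noted above
clf.fit(scaler.transform(X_train), y_train)
print("Test accuracy:", clf.score(scaler.transform(X_test), y_test))
```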

4.2 Deep Learning Algorithms

Neural networks have proven robust in approximating real-valued, discrete-valued and vector-valued target functions. They have wide application in object recognition, face recognition and handwriting recognition, and are well suited to training data containing noise and to complex sensor data.

Fig. 2. Neural Network Layers

A typical Neural network model consists of three parts:

a. Input layer b. Hidden layer and c. Output layer

a. Input Layer:

Input layer takes the data that we need to feed in to the network to get the target output.

b. Hidden Layer:

The hidden layer contains nodes that are fed by the input neurons. Weights are applied to the inputs and processed with an activation function at each node; the processed data are then fed to the output layer.

c. Output Layer:

The output layer generates the output based on the input and hidden layers.

Deep Learning algorithms considered for the study are:

1. Artificial Neural Network (ANN) 2. Recurrent Neural Network (RNN) with Long Short Term Memory (LSTM)

1. Artificial Neural Network:


ANN is a feed-forward neural network in which multiple neurons are connected to each other. An ANN has three layers:

a) Input Layer b) Hidden Layer c) Output Layer

a) Input Layer: The input layer is where the input data or historical data are fed in.

b) Hidden Layer: A hidden layer consists of a number of neurons to be defined for the model. A finite number of neurons can be defined in each layer, and there can be multiple hidden layers; there is no requirement that the number of neurons in each hidden layer be the same. All neurons in a hidden layer are connected to all the neurons in the next layer.

There are several possible patterns of connection between the input layer and the hidden layer. The basic principle is that the states of neurons in each layer are probabilistically related to the states of neurons in the adjacent layer, depending on the sign and strength of the connections.

If connections are weak, the connected neurons in the next layer may have an equal or almost equal probability of taking any state. With strong connections, the sign of the connected neurons in the next layer will be similar or opposite.

ANNs are trained with the backpropagation algorithm, which is based on differential calculus and the chain rule of differentiation. The algorithm takes the cost function, activation function, weights and inputs into consideration.

c) Output Nodes: After the input variables are processed at the input and hidden nodes with the requisite weights, the outputs are produced at the output layer.

Fig. 3. Deep Learning Network

d) Activation Functions

Activation functions are the functions used to activate a neuron in neural network. These activation functions verify whether the information received by the neuron is relevant or to be ignored. Activation function transforms the input signal and the output received is fed as input signal in the next layer of neurons.

There are many types of activation functions used in neural networks. The basic form takes a weighted sum of the inputs, $\sum_i w_i x_i$, so that

$$Y = \mathrm{Activation}\left(\sum_i (w_i \cdot x_i) + b\right)$$

where $b$ is the bias.

The most common activation functions used are:

a) Identity function:

$$f(a) = a$$

This function lets the activation value pass through unchanged.

b) Threshold activation function:

$$f(a) = \begin{cases} 1, & \text{if } a \geq 0 \\ 0, & \text{if } a < 0 \end{cases}$$

This function activates the neuron if the activation is above a certain value.

c) Sigmoid function:

$$f(a) = \frac{1}{1 + e^{-a}}$$

This function is used when the output is bounded between 0 and 1 and is interpreted as the probability of the neuron activating. It is commonly called the logistic function.

d) Hyperbolic tangent:

$$f(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}} = \frac{1 - e^{-2a}}{1 + e^{-2a}}$$

e) Bipolar sigmoid:

$$f(a) = \frac{2}{1 + e^{-a}} - 1 = \frac{1 - e^{-a}}{1 + e^{-a}}$$

used when the output range is (-1, 1).

f) ReLU / Rectified Linear Unit:

$$f(a) = \begin{cases} a, & \text{if } a \geq 0 \\ 0, & \text{if } a < 0 \end{cases}$$

This function is a mix of the identity and threshold functions and is called the 'Rectifier'.

ANNs do, however, have several limitations. Setting the weights and biases that enable the ANN to correctly approximate the target data is difficult, as the process is really an optimization. ANNs may also be over-trained, leading to over-fitting of the model, i.e. the network memorizes the data and fails to generalize. Another reason for over-fitting is that neural networks have a very large number of parameters, and when the number of parameters is large, over-fitting is more likely.
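To make the description concrete, the sketch below assembles a small feed-forward ANN in Keras, trained by backpropagation, with ReLU hidden activations and a sigmoid output; the layer sizes, epochs and synthetic data are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal feed-forward ANN sketch (illustrative assumptions, not the paper's exact setup).
import numpy as np
from tensorflow import keras

# Placeholder data: 6 technical-indicator features, scaled next-day closing value in [0, 1].
X = np.random.rand(1000, 6)
y = np.random.rand(1000, 1)

model = keras.Sequential([
    keras.layers.Input(shape=(6,)),
    keras.layers.Dense(32, activation="relu"),    # hidden layer 1 (ReLU activation)
    keras.layers.Dense(16, activation="relu"),    # hidden layer 2
    keras.layers.Dense(1, activation="sigmoid"),  # output bounded to (0, 1)
])
# Backpropagation minimises the mean squared error via gradient descent (Adam).
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2, verbose=0)
```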

2. Recurrent Neural Networks:

Recurrent Neural Network deals with time series data by defining the recurrence relation over the sequence.

St = f(St-1, Xt)

Where St = State at step ‘t’

St-1= State at previous step.

Xt = Input at the current step t.

In an RNN each state depends on all previous computations. RNNs have memory over time because the state 'S' contains information from the previous steps. In principle RNNs can remember information for arbitrarily long periods, but in practice they are limited to looking back only a few steps.

RNNs can take different input-output configurations: one-to-one, one-to-many, many-to-one and many-to-many. The basic RNN has only two parameter matrices, an input weight 'U' and a recurrence weight 'W', with the output 'Y' taken from the last state. The recurrence relation defined by this network is:

$$S_t = S_{t-1} \cdot W + X_t \cdot U$$
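A minimal NumPy sketch of this recurrence, using tanh as the function f, is given below; the dimensions and random inputs are purely illustrative.

```python
# Minimal sketch of the RNN recurrence S_t = f(S_{t-1} W + X_t U) with f = tanh.
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, input_dim, steps = 8, 3, 5

W = rng.normal(size=(hidden_dim, hidden_dim))  # recurrence weights
U = rng.normal(size=(input_dim, hidden_dim))   # input weights
S = np.zeros(hidden_dim)                       # initial state S_0

for t in range(steps):
    x_t = rng.normal(size=input_dim)           # current input X_t
    S = np.tanh(S @ W + x_t @ U)               # the state carries information from all past steps

print("Final state S_T:", S)
```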

The RNN uses the same weight matrices to compute all the state updates. However, RNNs suffer from the vanishing gradient and exploding gradient problems. When training an RNN, gradient values can grow beyond the useful range of the parameters, leading to exploding gradients; on the other hand, long-term components can decay to zero exponentially, which is the vanishing gradient problem.

The exploding and vanishing gradient is expressed as:


$$\frac{\partial L}{\partial w_i} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z_n} \cdot \frac{\partial z_n}{\partial z_{n-1}} \cdots \frac{\partial z_{i+1}}{\partial z_i} \cdot \frac{\partial z_i}{\partial w_i}$$

Source: "The Vanishing Gradient Problem" by Harini Suresh

If |w| > 1 the gradient grows exponentially. If |w|<1 the gradient shrinks exponentially.
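A quick numerical illustration of this effect (the weight values below are chosen only for illustration):

```python
# Repeated multiplication by a recurrent weight over 50 time steps.
w_small, w_large, steps = 0.9, 1.1, 50
print(w_small ** steps)  # ~0.005 -> the gradient vanishes
print(w_large ** steps)  # ~117   -> the gradient explodes
```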

The latter case is known as the vanishing gradient. To handle long-term dependencies, RNNs are used with Long Short Term Memory (LSTM), which has a memory cell. An LSTM consists of three gates: a. an input gate, b. a forget gate and c. an output gate.

Fig.4 Recurrent Neural Network Model

Source: bouvet-deler explaining-recurrent-neural-networks

Long Short Term Memory (LSTM) uses the cell state at each time step to predict the value; the cell state can be altered only through specific gates. LSTM gates use the logistic sigmoid function, whose output lies between 0 and 1, and element-wise multiplication, which scales the values as they pass through the gates.

Forget Gate: It takes the previous output and the current input, squashes them with the logistic function and applies element-wise multiplication with the cell state, keeping the relevant information and erasing the irrelevant information.

Input Gate: This decides what new information is to be added. The candidate values are formed from the previous output and the current input and transformed through a tanh function.

Output Gate: This gate produces the final output after transforming the cell memory, exposing specific blocks of information.
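The sketch below shows a minimal Keras RNN-LSTM for one-day-ahead closing-value prediction, with the gate logic described above handled internally by the LSTM cell; the window length, layer size and synthetic data are illustrative assumptions rather than the configuration reported here.

```python
# Minimal RNN-LSTM sketch for one-day-ahead prediction (illustrative assumptions).
import numpy as np
from tensorflow import keras

window, n_features = 30, 1
X = np.random.rand(500, window, n_features)  # placeholder sliding windows of scaled prices
y = np.random.rand(500, 1)                   # placeholder next-day scaled closing value

model = keras.Sequential([
    keras.layers.Input(shape=(window, n_features)),
    keras.layers.LSTM(50),    # input, forget and output gates are internal to this cell
    keras.layers.Dense(1),    # next-day closing value
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2, verbose=0)
```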

Fig. 5. Analytical Framework: data collection → data pre-processing → machine learning and deep learning models (MLP, SVC, DNN, RNN) → results

4.3 Performance Metrics:


This section describes the different error metrics used to evaluate accuracy. The accuracy metrics considered for the study include:

Mean Absolute Percentage Error (MAPE):

MAPE is the average of the absolute percentage errors and is calculated using the formula

$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|y_t - F_t|}{y_t} \times 100\%$$

It is one of the popular forecasting measures, is expressed in percentage terms and is easily interpretable.

Mean Squared Error (MSE):

Mean Squared Error is the average of the squared errors calculated on the test dataset. MSE is given by

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} (y_t - f_t)^2$$

A lower MSE implies a better prediction model.

Sharpe Index:

The Sharpe index represents how good the strategy's performance is for the risk taken to achieve it. The Sharpe index is calculated as

$$\mathrm{Sharpe} = \sqrt{n} \times \frac{\overline{R_{s} - R_{f}}}{\sigma\left(R_{s} - R_{f}\right)}$$

where $R_s$ denotes the strategy returns and $R_f$ the risk-free return. The higher the Sharpe index, the better the strategy (a 5 per cent risk-free rate is assumed).
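The three measures can be computed as in the minimal sketch below; the array contents and the daily risk-free rate derived from the 5 per cent assumption are illustrative.

```python
# Minimal sketch of the evaluation measures defined above (illustrative data).
import numpy as np

def mape(y_true, y_pred):
    # Mean Absolute Percentage Error, in per cent.
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0

def mse(y_true, y_pred):
    # Mean Squared Error on the test set.
    return np.mean((y_true - y_pred) ** 2)

def sharpe(strategy_returns, risk_free=0.05 / 252, periods=252):
    # Annualised Sharpe ratio of daily excess returns (5% annual risk-free rate assumed).
    excess = strategy_returns - risk_free
    return np.sqrt(periods) * excess.mean() / excess.std()

y_true = np.array([100.0, 101.5, 99.8])
y_pred = np.array([100.4, 101.0, 100.1])
daily_returns = np.array([0.002, -0.001, 0.004, 0.001])
print(mape(y_true, y_pred), mse(y_true, y_pred), sharpe(daily_returns))
```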

5. Results and Discussion

This section deals with analysis and interpretation of various Machine learning and deep learning models.

5.1 Auto Regressive Integrated Moving Average (ARIMA)

The data are tested for stationarity using the Augmented Dickey-Fuller test, and AIC and BIC values are calculated for the candidate ARIMA (p, d, q) orders. The hypotheses for testing the stationarity of the data are:

H0: Time series data is not stationary

H1: Time series data is stationary

Table 1. Augmented Dickey-Fuller (ADF) Test

Sl. No  Stock Index  Test Statistic  P-Value
1       S&P          -1.4915         0.5376
2       NYSE         -2.3807         0.1472
3       NIFTY 50     -1.7246         0.4183
4       Dow Jones    -1.3721         0.5955
5       KOSPI        -2.4523         0.1274
6       SSE          -2.5075         0.1136

From Table 1 we can see that the p-values obtained for the ADF test are greater than the required level of significance (0.05); hence we cannot reject the null hypothesis, which means the data are not stationary.
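A minimal sketch of how such an ADF test can be run with statsmodels is given below; the random-walk series is only a placeholder for an index's closing values.

```python
# Minimal sketch of the Augmented Dickey-Fuller test (placeholder series).
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

close = pd.Series(1000 + np.cumsum(np.random.normal(size=1362)))  # stand-in closing values

stat, pvalue, *_ = adfuller(close)
print(f"ADF statistic: {stat:.4f}, p-value: {pvalue:.4f}")
# p-value > 0.05 -> fail to reject H0: the series is non-stationary
```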

Table 2. Order of ARIMA and model fit diagnostics

Sl. No  Stock Index  Order of ARIMA  AIC        BIC        Ljung-Box
1       S&P          (0,1,0)         -7535.102  -7520.012  3770.551
2       NYSE         (0,1,0)         -8283.502  -8273.289  4143.751
3       NIFTY 50     (0,1,0)         -8169.623  -8159.460  4086.812
4       Dow Jones    (0,1,0)         -8135.830  -8125.617  4069.915
5       KOSPI        (0,1,0)         -8169.623  -8159.460  4086.812
6       SSE          (0,1,0)         -6534.281  -6498.786  3274.141

Table 2 reports the AIC and BIC values of the fitted models together with the Ljung-Box statistic, which tests for white noise. All six indices pass the white noise test, as reflected in the Ljung-Box values, which are found to be greater than the requisite level of significance.

Table 3. Accuracy of the ARIMA model

Sl. No  Stock Index  MAPE
1       S&P          0.01034
2       NYSE         0.010259
3       NIFTY 50     0.01069
4       Dow Jones    0.01084
5       KOSPI        0.010720
6       SSE          0.003344

From Table 3 we can observe the performance metrics for the ARIMA model. The error, as shown by MAPE, is lowest for the Shanghai Stock Exchange at 0.3 per cent, indicating a small difference between the actual and the predicted closing price for the next day.

The table also indicates the one-day-ahead predictive power of the ARIMA model, with an average error of 8.6 per cent observed across all six indices taken together.

5.2 Support Vector Classifier

The Support Vector Classifier is used to determine whether the returns of the trading strategy and the actual market returns are the same.

Table 4. Support Vector Machine: Sharpe ratio and accuracy

Sl. No  Stock Index  Sharpe     Accuracy
1       S&P          38.7535    0.33967
2       NYSE         -5.19622   0.34014
3       NIFTY 50     62.3084    0.34285
4       Dow Jones    -12.8013   0.33981
5       KOSPI        -27.5869   0.34375
6       SSE          46.90358   0.34246

As shown in Table 4, the returns generated reflect a better trading strategy where the Sharpe ratio is high, namely for the S&P 500, Nifty 50 and the Shanghai Stock Exchange. This conclusion rests on the understanding that the higher the Sharpe index, the better the returns of the stock index. The average accuracy with which the model classifies stocks as oversold or overbought is 34.09 per cent.

5.3 Deep Learning Algorithms

Neural networks are found to be robust to noisy data. The neural networks considered for the study are discussed below.

Table 5. Artificial Neural Network (ANN)

Sl. No  Stock Index  Accuracy  MSE
1       S&P          0.8688    0.0973
2       NYSE         0.8590    0.0978
3       NIFTY 50     0.8809    0.0906
4       Dow Jones    0.8947    0.0767
5       KOSPI        0.8089    0.1379
6       SSE          0.9200    0.0611

Table 5 presents the results of the ANN used as the deep learning algorithm. It is clear from the results that the next-day closing price prediction accuracy stands at 87.2 per cent on average, with an average MSE of 0.093. The model is built with various weights, and selected parameters are used to obtain the result.

Table 6. Recurrent Neural Network - Long Short Term Memory (RNN-LSTM)

Sl. No  Stock Index  MSE      Accuracy
1       S&P          0.01506  0.9849
2       NYSE         0.00576  0.9942
3       NIFTY 50     0.0062   0.9938
4       Dow Jones    0.0144   0.9856
5       KOSPI        0.0026   0.9974
6       SSE          0.0519   0.9481

Table 6 shows that the model predicts the one-day-ahead index closing value with an average accuracy of 98.4 per cent and an average error of 0.015.

6. Conclusion and Future Scope of Work

In this paper, a comprehensive process for forecasting the one-day-ahead index closing value is presented. The process starts with data cleaning and pre-processing, followed by model analysis and prediction metrics. We propose two model families, Machine Learning and Deep Learning, with four algorithms – ARIMA, SVM, ANN and the hybrid RNN-LSTM – to predict the closing value for the next day.

The experiment clearly reveals that ARIMA is able to predict the next-day movement but falls short when compared with SVM. Within the short-term forecasting scope of this study, RNN-LSTM proves its strength, outclassing the other models with a high accuracy rate and a very low error rate. For future work, the same models can be applied to higher-frequency data.

References:

1. Duke, "Stationary & Differencing" [online]. Available: http://people.duke.edu/~rnau/411diff.htm [Accessed January 2017].
2. Adebiyi, Ayodele, Adewumi, Aderemi & Ayo, Charles (2014). Stock price prediction using the ARIMA model. Proceedings of the UKSim-AMSS 16th International Conference on Computer Modelling and Simulation, UKSim 2014. doi: 10.1109/UKSim.2014.67.
3. Ali, M. Montaz, Adebiyi, Ayodele Ariyo, Adewumi, Aderemi Oluyinka & Ayo, Charles Korede (2014). Comparison of ARIMA and Artificial Neural Networks Models for Stock Price Prediction. Journal of Applied Mathematics, Hindawi Publishing Corporation. doi: 10.1155/2014/614342.
4. Babu, C.N. & Reddy, B.E. (2014). Selected Indian stock predictions using a hybrid ARIMA-GARCH model. 2014 International Conference on Advances in Electronics, Computers and Communications, 1-6.
5. Banerjee, D., "Forecasting of Indian stock market using time-series ARIMA model," 2014 2nd International Conference on Business and Information Management (ICBIM), Durgapur, 2014, pp. 131-135. doi: 10.1109/ICBIM.2014.6970973.
6. Steel, A., "Prediction in Financial Time Series Data," Master of Science in Computing, Institute of Technology, Blanchardstown, 2014.
7. Patel, Jigar, Shah, Sahil, Thakkar, Priyank & Kotecha, K. (2015). Predicting stock and stock price index movement using Trend Deterministic Data Preparation and machine learning techniques. Expert Systems with Applications, 42, 259-268.
8. Hore, Sambit, Vipani, Raj, Das, Prithwiraj & Dutta, Saibal, "Prediction of Stock Value using NIFTY 50 market index and RBF-Kernel based Support Vector Regressor," International Journal of Advanced Research in Science and Engineering, April 2018, Vol. 07, Issue 03, ISSN: 2319-8324.
9. Swell, M.V., "Application of Machine Learning to Financial Time Series Analysis," Doctor of Philosophy, University College London, University of London, 2017.
10. Dingli, Alexiei & Sant Fournier, Karl, "Financial Time Series Forecasting – A Machine Learning Approach," Machine Learning and Applications: An International Journal, Vol. 4, No. 1/2/3, September 2017.
11. Mingyne, Q., "A Study on Prediction of Stock Market Index and Portfolio Selection," Doctorate of Engineering, Fukuoka Institute of Technology, Fukuoka University, Japan, 2014.
12. Batres-Estrada, Gilberto, "Deep Learning for Multivariate Financial Time Series," Master of Science thesis, KTH Institute of Technology, Stockholm, 2015.
13. Krollner, Bjoern, Vanstone, Bruce & Finnie, Gavin, "Financial Time Series Forecasting with Machine Learning Techniques: A Survey," European Symposium on Artificial Neural Networks – Computational Intelligence and Machine Learning, Bruges, 28-30 April 2010, ISBN 2-930307-10-2.
14. Majumder, Manna & Hussain, MD Anwar, "Forecasting of Indian Stock Market Index using Artificial Neural Network (ANN)," research at National Stock Exchange of India Ltd.
15. Kumar, Gourav & Sharma, Vinod, "Stock Market Forecasting of Nifty 50 using Machine Learning Techniques with Artificial Neural Network Approach," International Journal of Modern Computer Science (IJMCS), Vol. 4, Issue 3, 2016, ISSN: 2370-7868.
16. Alotaibi, Talal, Nazir, Amril, Alroobaea, Roobae, Abtibi, Moteb, Alsubeai, Fasal, Alghamde, Abdullah & Alsulimani, Thamer, "Saudi Arabia Stock Market Prediction using Neural Network," International Journal on Computer Science and Engineering, February 2018, ISSN: 0975-3397.
17. Grigoryan, Hakob, "Stock Market Prediction using ANN: Case Study of Talit, Nasdaq OMX, Baltic Stock," Database Systems Journal, Vol. 6, No. 2, 2015.
18. Pakdaman Naeini, Mahdi, Taremian, Hamidreza & Hashemi, Homa B. (2010). Stock market value prediction using neural networks, pp. 132-136. doi: 10.1109/CISIM.2010.5643675.
19. Qiu, M. & Song, Y., "Predicting the direction of stock market index movement using an optimized ANN," PLOS ONE, 2016.
20. Panda, C. & Narasimhan, V. (2006), "Predicting stock returns: An experiment of the ANN in Indian Stock Market," South Asia Economic Journal, Vol. 7, No. 2, 205-218.
21. Ticardo, Andres & Murillo, Arevalo, "Short-term Forecasting of Financial Time Series with Deep Neural Networks," Master's thesis in Systems and Computer Engineering, Universidad Nacional de Colombia, 2016.
22. Dingli, Alexiei & Sant Fournier, Karl, "Financial Time Series Forecasting – A Deep Learning Approach," International Journal of Machine Learning and Computing, Vol. 7, 2017.
23. Korczak, Jerzy & Hernes, Marcin, "Deep Learning for Financial Time Series Forecasting in A-Trader System," Proceedings of the Federated Conference on Computer Science and Information Systems, ACSIS, Vol. 11, pp. 905-912, 2017.
24. Tang, Y., "Deep Learning using Linear Support Vector Machines," Master of Science, Department of Computer Science, University of Toronto, Toronto, Ontario, Canada, 2013.
25. Hiransha, M., Gopala Krishnan, E.A., Menon, Vijay Krishna & Soman, K.P., "NSE Stock Market Prediction using Deep Learning Models," International Conference on Computational Intelligence and Data Science, 2018, pp. 1351-1362.
26. Gao, Qiyan, "Stock Market Forecasting using RNN," Master of Science thesis, University of Missouri – Columbia, 2016.
27. Baba, N. & Kozaki, M., "An intelligent forecasting system of stock price using neural networks," Proceedings of the 1992 IJCNN International Joint Conference on Neural Networks, Baltimore, MD, USA, 1992, pp. 371-377, Vol. 1. doi: 10.1109/IJCNN.1992.287183.
28. Acunto, Gabriel D., "A Deep Learning Model to Forecast Financial Time Series," Master of Science dissertation, Università degli Studi di Torino, 2016.
29. ML – Support Vector Machine [online]. Available: www.tutorialspoint.com/machine_learning_with_python/machine_learning_with_python_classification_algorithms_support_vector_machine.htm [Accessed July 2020].
30. Mitchell, Tom M., "Machine Learning," McGraw-Hill Education, 2017.
31. Kumar, U. Dinesh, "Business Analytics: The Science of Data-Driven Decision Making," Wiley, 2017.
32. Zhou, Xingyu, Pan, Zhisong, Hu, Guyu, Tang, Siqi & Zhao, Cheng, "Stock Market Prediction on High-Frequency Data using Generative Adversarial Nets," Mathematical Problems in Engineering, Hindawi, Article ID 4907423, 2018.
