
USING MACHINE LEARNING TECHNIQUES IN THE STOCK MARKET

A Project Presented to the Faculty of California State Polytechnic University, Pomona

In Partial Fulfillment Of the Requirements for the Degree Master of Science In Economics

By David Licerio

2018

SIGNATURE PAGE

PROJECT: USING MACHINE LEARNING TECHNIQUES IN THE STOCK MARKET

AUTHOR: David Licerio

DATE SUBMITTED: Spring 2018

Economics Department

Dr. Craig Kerr Project Committee Chair Economics

Dr. Carsten Lange Economics

Dr. Bruce Brown Economics

ACKNOWLEDGMENTS

I would like to thank my professor, Dr. Craig Kerr, for all of his help, and my family for all of their love and support.

ABSTRACT

The objective of this paper is to forecast stock market prices using machine learning techniques. First, a statistical analysis of the Amazon stock is carried out. For further analysis, an autoregressive integrated moving average (ARIMA) model is fitted to the data and a forecast is made. After the ARIMA and statistical analysis, machine learning techniques are applied to the stock to make another forecast of the returns to the Amazon stock. The returns are compared to a buy and hold strategy. The machine learning model is a multi-layer perceptron (MLP). The model uses the following technical indicators: relative strength index, moving average convergence divergence, commodity channel index, stochastic oscillator, and Williams' accumulation distribution.

Contents

1.1 Introduction

2.1 Review of Literature

3.1 Statistical Analysis

4.1 The Model

4.2 Conclusion

Bibliography

List of Tables

3.1 ARIMA Summary

3.2 Forecasted ARIMA Values

List of Figures

2.1 Support Vector Machine Regression

3.1 Amazon Close Prices

3.2 Histogram of Daily Returns

3.3 Fitted ARIMA Model (0,1,0)

3.4 Forecasted ARIMA

3.5 ARIMA ACF

3.6 ARIMA PACF

4.1 Multi-Layer Perceptron

4.2 Relative Strength Index and Price

4.3 Moving Average Convergence Divergence

4.4 Neural Network

Chapter 1

1.1 Introduction

Stock market prices follow what is known as a "random walk," meaning that prices from one day to the next are random, which makes it difficult to forecast the next value in the trend.

The efficient market hypothesis states that if all prices reflect current information, then it is not possible to beat the market for a profit (Investopedia, 2018c). However, many statistical techniques are applied to financial data to identify trends and test whether the markets are predictable.

Three authors are covered in the literature review. One commonality between two of the authors is that they use the same form of machine learning, known as a multi-layer perceptron (MLP), a technique that is also used in this paper. The MLP technique performed better than the other machine learning algorithms used, including the single layer perceptron (SLP), radial basis function (RBF), and support vector machine (SVM).

For this paper, Amazon daily adjusted closing prices were gathered from Yahoo Finance from July 1, 2017 to March 28, 2018. An analysis of the returns to the stock is carried out. Next, an ARIMA model is fitted to the data for further analysis of the Amazon stock. Finally, the machine learning technique known as MLP is implemented and a comparison is made between a buy and hold strategy and the machine learning technique.

Chapter 2

2.1 Review of Literature

Patel et al. (2015) make predictions on stock prices at forecast horizons of 1 to 10, 15, and 30 days. In order to accomplish this, the authors use a two stage fusion technique.

A two stage fusion technique combines two different machine learning techniques in order to make a single forecast. The reason for the two stage technique is that the authors felt previous uses of machine learning techniques relied on older statistical parameters as the forecast horizon increased. That is, as time went on, the predicted values were based on older information and became less useful.

The first stage is support vector machine regression (SVMR). According to Mathworks.com (2018), the goal of this technique is to find a linear function that is as flat as possible (minimizing beta), subject to constrained residuals. In other words, the goal is to find a line that separates the data into two sections based on specified properties of the data.

SVMR is used to organize and classify data in regression analysis. For example, if you had a cluster of data along the x-axis of a graph and a cluster of data aligned along the y-axis, you would want a line coming from the point of origin, (0,0), at a 45° angle. That line would be the linear function we are looking for. An example can be seen in Figure 2.1.

The line separating the two groups of dots at an angle is the linear function that would be created by the SVMR, which separates the dots based on their color properties.

Figure 2.1: Support Vector Machine Regression
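To make the SVMR idea concrete, the following is a minimal sketch using scikit-learn's SVR on synthetic data; the data, kernel choice, and epsilon value are illustrative assumptions rather than the exact setup of Patel et al. (2015).

    import numpy as np
    from sklearn.svm import SVR

    # Synthetic example: a noisy linear relationship between x and y.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=(100, 1))
    y = 2.0 * X.ravel() + rng.normal(0, 1.0, size=100)

    # Epsilon-insensitive SVR: fit a function that is as flat as possible
    # while keeping most residuals inside the epsilon tube.
    model = SVR(kernel="linear", C=1.0, epsilon=0.5)
    model.fit(X, y)

    print("slope:", model.coef_)              # the fitted linear function
    print("intercept:", model.intercept_)
    print("prediction at x=5:", model.predict([[5.0]]))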

The second stage of the technique used by Patel et al. (2015) is a hybrid model of an artificial neural network (ANN), random forest (RF), and SVMR. According to Van Greven and Bohte (2017), an artificial neural network is a computer system modeled on biological brains that is meant to learn something without having been explicitly designed to do so. ANNs generally consist of three layers. The first is an input layer of neurons which sends data to a second layer. The second layer then relays that data to the third layer for output. An ANN learns based on the inputs it is given, and the output achieved is based on what it has learned from those inputs. ANNs are considered non-linear statistical modeling tools and are used to find complex patterns in data. The advantage of ANNs compared to other statistical techniques is their ability to learn.

A random forest is used in regression by using what are known as decision trees. Investopedia (2018b) states that decision trees consist of branches, which evaluate data, and leaves, which hold the conclusions about the data. According to Vidyha (2018), a decision tree splits data into homogeneous sets based on the input that differentiates the variables the most. For example, say we have 10 students, we want to know which students are most likely to play handball during recess, and we have data on age, gender, and height. The decision tree would organize the data and determine which of the variables (age, gender, height) is the greatest help in predicting which students are most likely to play handball during recess; a sketch of this example follows below. According to the site BML (2016), decision trees and ordinary least squares (OLS) can both be used for regression; however, decision trees can also be used for classification. Another difference between the two is that OLS assumes certain conditions on the data hold, while decision trees make no such assumptions. These assumptions include that the data should be stationary, with no severe multicollinearity, no heteroskedasticity, no serial correlation, and an expected error of 0. A stationary variable is one whose mean, variance, and error term do not change over time. No severe multicollinearity means the x variables should not be significantly correlated with one another. Heteroskedasticity means the variance of the error differs across the data. Serial correlation is when the price goes up (or down) and the next price goes up (or down), continuing to follow this pattern.
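The handball example could be coded as below; the ten students and their labels are invented purely for illustration.

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Toy data: [age, gender (0/1), height in cm] for 10 students,
    # labeled 1 if the student plays handball during recess.
    X = [[10, 0, 140], [11, 1, 150], [10, 1, 145], [12, 0, 155], [11, 0, 148],
         [10, 1, 142], [12, 1, 158], [11, 1, 152], [10, 0, 139], [12, 0, 160]]
    y = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0]

    # The tree splits on whichever variable differentiates players
    # from non-players the most (in this toy data, gender).
    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(tree, feature_names=["age", "gender", "height"]))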

The hybrid model is then compared to single stage machine learning models where ANN, RF, and SVMR are each used alone. Just as in this project, Patel et al. (2015) used technical indicators as inputs to their neural networks, including the relative strength index, the accumulation/distribution oscillator, the moving average, and the commodity channel index. In order to identify which of the algorithms performed best, the authors used mean absolute percentage error (MAPE), mean absolute error (MAE), relative root mean square error (rRMSE), and mean squared error (MSE). Each of these measurements is used to gauge the accuracy of a forecasting model, specifically in trend estimation. MAPE gives the error in terms of a percentage. MAE is calculated by taking the average of the absolute difference between the predicted and actual values. rRMSE is useful for comparing models whose errors are measured in different units. Finally, MSE takes the average of the squared error between the forecasted and actual values. The authors found that the two stage fusion prediction models consisting of ANN and SVMR performed better than the single stage predictions, since those techniques had the lowest MAPE, MAE, rRMSE, and MSE. Patel et al. (2015) found that the best prediction model was the combination of ANN and SVMR.
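A sketch of the four accuracy measures follows; note that rRMSE has several competing definitions in the literature, and RMSE divided by the mean of the actual values is assumed here.

    import numpy as np

    def forecast_errors(actual, predicted):
        actual = np.asarray(actual, dtype=float)
        predicted = np.asarray(predicted, dtype=float)
        err = actual - predicted
        mae = np.mean(np.abs(err))                  # mean absolute error
        mse = np.mean(err ** 2)                     # mean squared error
        mape = np.mean(np.abs(err / actual)) * 100  # error as a percentage
        rrmse = np.sqrt(mse) / np.mean(actual)      # RMSE relative to the data's scale
        return {"MAPE": mape, "MAE": mae, "rRMSE": rrmse, "MSE": mse}

    # Example: a 3-day forecast versus what actually happened.
    print(forecast_errors([100.0, 102.0, 105.0], [101.0, 101.0, 106.0]))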

In another article, Usmani et al. (2016) use machine learning techniques to make predictions on the Karachi Stock Exchange. In order to predict the entire market, their model uses oil, gold, silver, interest, and foreign exchange rates as inputs. These inputs are then combined with simple moving average and ARIMA models. The techniques used for comparison include a single layer perceptron (SLP), multi-layer perceptron (MLP), radial basis function (RBF), and SVM. The Usmani et al. (2016) article is similar to this paper in that a multi-layer perceptron is used.

To further explain the techniques used: a single layer perceptron consists of a single layer of input neurons and an output neuron. A neuron, in the context of machine learning, is modeled after the neurons in our brain and acts as a cell which gathers data and sends data out. In this case, an input neuron would gather data, for example the price of an apple along with a randomly generated number. The neuron would then assign each input a weight between 0 and 1. The weights represent the importance of the inputs; in this case, a 1 would be assigned to the price of the apple and a 0 to the randomly generated number. A 0 is assigned because the random number carries no information, while the price of the apple is relevant to the data, so it is given a 1. This data is then sent to the output layer, which receives the weighted sum of the input neurons. In this case, a single layer perceptron made up of 2 input neurons and 1 output neuron would take the following mathematical form: output = (randomly generated input × weight0) + (price of apple × weight1). For this example, the output would be the price of the apple.

The multi-layer perceptron differs from the single layer perceptron in that it contains one or more hidden layers of perceptrons, while a single layer perceptron has no hidden layers.
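The apple example translates almost directly into code; the price and weights below are the ones from the text, and this is a minimal sketch rather than a full perceptron with a bias term and activation function.

    import random

    def single_layer_perceptron(inputs, weights):
        # The output neuron receives the weighted sum of the input neurons.
        return sum(x * w for x, w in zip(inputs, weights))

    apple_price = 1.50
    noise = random.random()      # the irrelevant randomly generated input
    weights = [0.0, 1.0]         # weight 0 for the noise, weight 1 for the price

    output = single_layer_perceptron([noise, apple_price], weights)
    print(output)                # equals the apple price: the noise is zeroed out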

According to Publishing (2018), a hidden layer gets its name from the fact that it is not visible the way the input and output layers are. The hidden layer is where further learning of the algorithm takes place. For example, say the inputs are images of apples. The neurons in the hidden layer would read the images, building up an idea of what an apple looks like, so that the network can more accurately recognize an apple when it comes across one again in the future. Increasing the number of neurons in the hidden layer makes the algorithm better at identifying the apple image. However, if there are too many neurons in the hidden layer, the algorithm may take longer to run.

An RBF is another neural network which also has input, output, and hidden layers. The difference is that in the RBF, the hidden layer consists of a function whose value depends on the distance from a center; for example, the function could be centered at the center of a circle.

The four machine learning algorithms are run separately on the data. It was found that oil prices correlated the most with the market, while the foreign exchange rate had the lowest correlation, meaning it could be eliminated from the model. The researchers concluded that the best performing model was the MLP.

Lastly, Moghaddam, Moghaddam and Esfandyari (2016) use daily stock data to make predictions on the NASDAQ index. An MLP model was used for the forecast. 100 days of price history are used: the first 70 days to train the model and the last 30 days to judge the reliability of the algorithm. Training the model is done by running the algorithm on the selected inputs, reducing the error, and then achieving the desired output. Training the data is what allows the model to learn. Once the patterns are learned, the algorithm has an idea of what the data is doing, and the desired output is achieved, a forecast can be made. In the case of Moghaddam, Moghaddam and Esfandyari (2016), the forecast is made after the first 70 days, on the last 30 days. The actual values of the 30 days are compared with the forecasted values for those 30 days. By comparing the forecasted and actual values, the reliability of the model can be seen. If the forecasted values differ greatly from the actual values, then the model may not be a good one to use. The model is evaluated via the R-squared and RMSE. The R-squared demonstrates how much of the variation in the dependent variable is explained by the independent variables. The authors ran 7 different experiments by training the neural networks with either Levenberg-Marquardt (LM) or one step secant (OSS). Wolfram.com states that LM finds the parameters of the model so that the sum of squared deviations is minimized for non-linear functions. According to Gavin (2017), LM is a combination of the Gauss-Newton and gradient descent methods. Gavin (2017) states that in the gradient descent method "... the sum of the squared errors is reduced by updating the parameters in the steepest-descent direction." Furthermore, Gavin (2017) asserts that in the Gauss-Newton method "... the sum of the squared errors is reduced by assuming the least squares function is locally quadratic, and finding the minimum of the quadratic." According to mathworks.com, OSS is "... a function that updates weight and bias values according to the one-step secant method." With the OSS method, you have a non-linear function and use secant lines to approximate the parameters at which that function is minimized. The researchers found that the highest performing model was the MLP trained with LM, with an R-squared of .974, meaning that 97.4% of the changes in price could be explained by the model.
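Training a network with LM, as MATLAB's trainlm does, is not directly exposed in the common Python libraries, but the LM idea itself can be sketched on a small curve-fitting problem with SciPy; the exponential model and the data here are illustrative assumptions.

    import numpy as np
    from scipy.optimize import least_squares

    # Noisy data generated from y = 2 * exp(1.5 * t).
    rng = np.random.default_rng(1)
    t = np.linspace(0, 1, 50)
    y = 2.0 * np.exp(1.5 * t) + rng.normal(0, 0.05, t.size)

    def residuals(params):
        a, b = params
        return a * np.exp(b * t) - y  # LM minimizes the sum of squared residuals

    # method="lm" selects Levenberg-Marquardt, which blends gradient descent
    # (far from the minimum) with Gauss-Newton (near it).
    fit = least_squares(residuals, x0=[1.0, 1.0], method="lm")
    print("estimated a, b:", fit.x)   # should be close to (2.0, 1.5)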

Chapter 3

3.1 Statistical Analysis

In this section, a statistical analysis of the Amazon stock is carried out, along with an ARIMA model. The statistical analysis and ARIMA model are followed, in the next section, by a comparison of the buy and hold strategy versus the machine learning algorithm. The Amazon adjusted close prices can be seen in Figure 3.1. The adjusted close price accounts for stock splits, dividends, and rights offerings. According to Investopedia (2018a), a stock split is when a company decides to break down its current stock; in a two-for-one split, each share is split in half and there are twice as many shares. For example, if the price of a stock is currently $200, it would be $100 after the split. Dividend payments are made to shareholders for each share owned every quarter. An example of a dividend payment would be a stock trading at $11 with a $1 dividend; the adjusted price would be $10. Rights offerings give shareholders the right to purchase additional shares.

The returns can be seen in Figure 3.2. The y-axis represents the number of days in the data on which a given level of returns came up, the highest count being 20. The x-axis represents the percentage change in returns, which was calculated with the following formula: (P1 − P0)/P0, where P1 is the next day's price and P0 is the original price. We can see that the returns are normally distributed. The peak of the histogram occurs at about .02% returns. There is one outlier located on the right side of the graph. That outlier came from October 27, 2017, when the price jumped $130, one day after Amazon announced that third quarter sales were up 34%.

Figure 3.1: Amazon Close Prices

Figure 3.2: Histogram of Daily Returns
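The daily returns in Figure 3.2 can be reproduced with a few lines of pandas; the simulated prices below are a placeholder standing in for the actual AMZN adjusted close series pulled from Yahoo Finance.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # Placeholder series standing in for the AMZN adjusted close prices.
    rng = np.random.default_rng(2)
    prices = pd.Series(1000 + np.cumsum(rng.normal(0, 10, 180)))

    # pct_change() implements (P1 - P0) / P0 for each consecutive pair of days.
    returns = prices.pct_change().dropna()

    returns.hist(bins=50)   # the histogram in Figure 3.2
    plt.xlabel("daily return")
    plt.ylabel("number of days")
    plt.show()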

An ARIMA model was made to conduct further analysis of the Amazon stock. The model that was selected was (0,1,0) and can be seen in Figure 3.3. The (0,1,0) stands for (p,d,q), where p represents the autoregression order, d represents the degree of first differencing, and q represents the moving average order. Since the d value is 1, the model takes one first difference of the data. This signifies that the data follow what is known as a random walk. A random walk means that the data have a random pattern that cannot be foreseen. The random walk formula takes the form Xt = Xt−1 + et, where Xt is the current price, Xt−1 is the previous price, or lag, and et is the error term. The red line on the graph represents the forecast values, while the black line shows the AMZN adjusted close prices. By looking at the ARIMA model, it is evident that the model does follow the trend of the actual price values. Figure 3.4 demonstrates the forecast values given by the ARIMA model as the blue line. As the forecast extends over time, the model becomes less accurate. This loss of accuracy can be seen in the grey area surrounding the blue forecast values. The grey area represents the size of the confidence interval and widens over the course of the forecast. The confidence interval provides us with other possible values that the forecast can take, even though they may be less likely than the central forecasted values.

Figure 3.3: Fitted ARIMA Model (0,1,0)

Figure 3.4: Forecasted ARIMA
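A sketch of the ARIMA(0,1,0) fit and the ten-day forecast behind Figure 3.4 and Table 3.2, using statsmodels; the simulated random walk is again a stand-in for the AMZN adjusted close prices.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    # Placeholder random walk standing in for the AMZN adjusted close series.
    rng = np.random.default_rng(2)
    prices = pd.Series(1000 + np.cumsum(rng.normal(0, 10, 180)))

    # order=(p, d, q); (0, 1, 0) is a random walk: one first difference,
    # no AR or MA terms.
    result = ARIMA(prices, order=(0, 1, 0)).fit()
    print(result.summary())               # includes the AIC and BIC

    forecast = result.get_forecast(steps=10)
    print(forecast.predicted_mean)        # constant, as in Table 3.2
    print(forecast.conf_int(alpha=0.05))  # 95% bounds that widen over time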

Figures 3.5 and 3.6 present the ACF and PACF of the ARIMA model. According to Nau (2018), ACF stands for autocorrelation function, which tells us whether points in the data are serially correlated by looking at the correlation between a time series and its lags. In the graph shown, the ACF plot shows one significant spike, then drops off. The PACF also tells us whether the data are serially correlated, but instead uses the partial correlations between the residuals and the lags of the time series. The PACF in this case has no significant spikes. Together, the two graphs demonstrate that the model in fact follows a random walk, or (0,1,0). Next, a Ljung-Box test was done, which tests the randomness of the data. The null hypothesis for the Ljung-Box test is that there is no serial correlation, while the alternative is that there is serial correlation. Serial correlation is when prices are a function of past prices; in other words, if prices are going up, they will continue to go up. In this case, the Ljung-Box test shows a high p-value of .8. Thus, we fail to reject the null hypothesis that the data are free of serial correlation.

Figure 3.5: ARIMA ACF
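The ACF/PACF plots and the Ljung-Box test can be reproduced as follows, again on a placeholder random-walk series.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from statsmodels.tsa.arima.model import ARIMA
    from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
    from statsmodels.stats.diagnostic import acorr_ljungbox

    rng = np.random.default_rng(2)
    prices = pd.Series(1000 + np.cumsum(rng.normal(0, 10, 180)))  # placeholder
    resid = ARIMA(prices, order=(0, 1, 0)).fit().resid

    plot_acf(resid)    # random-walk residuals: no significant lags beyond zero
    plot_pacf(resid)
    plt.show()

    # Ljung-Box: null hypothesis of no serial correlation. A high p-value
    # means we fail to reject the null, consistent with a random walk.
    print(acorr_ljungbox(resid, lags=[10]))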

Table 3.1 gives the AIC and BIC values. The AIC is the Akaike information criterion and takes the form AIC = 2k − 2ln(L), where, according to K Burnham (2002), k is the number of estimated parameters and L is the maximized value of the likelihood function. Ernst Wit (2012) states that the BIC is the Bayesian information criterion and takes the form BIC = ln(n)k − 2ln(L), where k and L have the same meaning and n is the number of observations. Both the AIC and BIC estimate the quality of a model; the lower the AIC and BIC, the better the forecasting model. The AIC and BIC values were lowest for the (0,1,0) model. With other models, such as the (1,0,0), the AIC and BIC values were closer to 1,700; a higher AIC or BIC means the forecast is less reliable. An ARIMA model of (1,0,0) takes the form Xt = ρXt−1 + et, where ρXt−1 is the autoregressive term. (A first difference, by contrast, is calculated by subtracting the previous price from the current price.) Since the other models' AIC and BIC values were higher, the (0,1,0) model is the best fitting model.

Figure 3.6: ARIMA PACF
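The model selection step can be sketched by fitting the candidate orders and comparing their AIC and BIC directly (lower is better); the placeholder series is reused from the sketches above.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(2)
    prices = pd.Series(1000 + np.cumsum(rng.normal(0, 10, 180)))  # placeholder

    # Lower AIC/BIC indicates a better model; (0,1,0) won for the AMZN data.
    for order in [(0, 1, 0), (1, 0, 0)]:
        fit = ARIMA(prices, order=order).fit()
        print(order, "AIC:", round(fit.aic, 2), "BIC:", round(fit.bic, 2))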

Table 3.2 shows the actual forecast values. From the table, the forecasted values stay constant at 1431.42. This could be because daily data are being used and not enough time is given for movement. Looking at the high and low values at the 95% level, the difference between them increases over time. This reflects the fact that as the forecast extends over time, the accuracy of the ARIMA model decreases.

Table 3.1: ARIMA Summary

Coefficients      AIC        BIC
Values          1630.34    1630.36

Table 3.2: Forecasted ARIMA Values

Forecasted Value    Low 95      High 95
1431.42             1390.810    1472.030
1431.42             1373.989    1488.851
1431.42             1361.082    1501.758
1431.42             1350.201    1512.639
1431.42             1340.614    1522.226
1431.42             1331.947    1530.893
1431.42             1323.977    1538.863
1431.42             1316.559    1546.282
1431.42             1309.591    1553.249
1431.42             1303.001    1559.839

Chapter 4

4.1 The Model

The model used in this paper is a multi-layer perceptron. This neural network will be compared to a buy and hold strategy. According to Berlinger (2015), the MLP is in the family of feed-forward, deep neural networks. As mentioned earlier, ANNs are computer networks that vaguely imitate animal brains in that they have similar components. ANNs have nodes, or artificial neurons, which hold and pass information through links similar to the synapses found in biological brains. A neuron computes ∑i xiwi, the weighted sum of its inputs, where x represents the inputs (in this paper, the 6 technical indicators explained later on) and w represents the weights, which take values between 0 and 1. In assigning the weights, 0 is given to data that is less relevant and 1 to data that is most relevant. After the weights are assigned, just as in the apple example described earlier, the output goes through what is known as back propagation. Berlinger (2015) states that in back propagation, the forecasted values are compared with the actual values; if the values do not match, the weights are adjusted to reduce the error, and if they do match, the weights do not change. The weights are adjusted through the weight change formula Δ = η ∗ w ∗ x, where w is the weight, η is the learning rate, and x represents the inputs. The learning rate is how fast the MLP learns and is assigned at the discretion of the researcher. The smaller the learning rate, the longer the algorithm will take to run; however, if the learning rate is too large, the resulting forecasts may not be accurate enough. This feed-forward and back propagation cycle continues until the error term can go no lower. The error term is how much the model's forecasted values differ from the actual data; in this case, it is the difference between the values learned by the MLP and the actual values built from the 6 technical indicators. Once the error is minimized, the output can be produced. Figure 4.1 is a diagram of an MLP gathered from Oriani (2018).

Figure 4.1: Multi-Layer Perceptron
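The feed-forward and back propagation loop can be sketched for the simplest possible case, a single linear output neuron fed by 6 inputs; the values are invented, and a real MLP would add hidden layers and an activation function.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.normal(size=6)   # stand-ins for the 6 technical indicator values
    w = rng.normal(size=6)   # weights on the links
    target = 0.01            # the actual return we want the network to predict
    eta = 0.02               # the learning rate, chosen by the researcher

    for step in range(100):
        output = x @ w           # feed-forward: weighted sum of the inputs
        error = target - output  # compare the forecasted value with the actual value
        w += eta * error * x     # back propagation: adjust weights to reduce the error

    print("error after training:", target - x @ w)  # shrinks toward zero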

For this paper, 6 neurons are used as inputs. The neurons consist of the 6 technical indicators used. The first of the indicators is the relative strength index (RSI). Figure 4.2 shows the relative strength index and prices. According to Berlinger (2015), the relative strength index is able to tell investors if an asset is oversold or overbought. It does this using the following formula: RSI = 100 − 100/(1 + RS), where RS = (average of up-closes)/(average of down-closes). The RSI moves between 0 and 100; an asset is overbought when the RSI is above 70 and oversold when it is below 30. Whenever the green line (prices) is above the blue line (RSI), there is an upward trend in the price. From early October to the end of October 2017, the RSI trended upward the entire time, coming from the low end. Since the RSI was low, too many investors were overselling, and they should have held on to the stock as it was rising.

Figure 4.2: Relative Strength Index and Price
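A sketch of the RSI computation follows; simple rolling means are used for the up- and down-close averages (Wilder's original smoothing differs slightly), and the price series is a placeholder.

    import numpy as np
    import pandas as pd

    def rsi(prices, period=14):
        # RSI = 100 - 100 / (1 + RS),
        # RS = average of up-closes / average of down-closes.
        change = prices.diff()
        avg_up = change.clip(lower=0).rolling(period).mean()
        avg_down = (-change.clip(upper=0)).rolling(period).mean()
        return 100 - 100 / (1 + avg_up / avg_down)

    rng = np.random.default_rng(4)
    prices = pd.Series(1000 + np.cumsum(rng.normal(0, 10, 60)))  # placeholder
    print(rsi(prices).tail())   # above 70 = overbought, below 30 = oversold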

The next indicator used is the moving average convergence divergence (MACD) indicator, which can be seen in Figure 4.3. The MACD shows the relationship between two moving averages and is calculated by subtracting the 26-day EMA from the 12-day EMA. When the MACD line (red) is below the price (green), it is an indication that the price will rise in the future, because the EMA is below the actual price. The wider the gap between the two, the stronger the indication that the price is going to rise. Throughout January 2018, the MACD stayed below the price line, indicating that throughout that month it would have been best to hold Amazon stock instead of selling. When the MACD crosses the price line from above, it is a buy signal; when it crosses from below, it is a sell signal. According to Berlinger (2015), the MACD can give false alarms, so it is best used only during strong trends, when the gap between the price and the MACD is large.

Figure 4.3: Moving Average Convergence Divergence
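The MACD line is two exponential moving averages and a subtraction; the 9-day signal line added below is the conventional companion and an assumption here, since the text plots the MACD against the price instead.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(5)
    close = pd.Series(1000 + np.cumsum(rng.normal(0, 10, 120)))  # placeholder

    # MACD = 12-day EMA minus 26-day EMA of the closing price.
    ema12 = close.ewm(span=12, adjust=False).mean()
    ema26 = close.ewm(span=26, adjust=False).mean()
    macd = ema12 - ema26
    signal = macd.ewm(span=9, adjust=False).mean()  # conventional 9-day signal line

    print(pd.DataFrame({"MACD": macd, "signal": signal}).tail())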

The other technical indicators used are the commodity channel index (CCI), the stochastic oscillator (SO), and Williams' accumulation distribution (WAD). The CCI lets investors know if an asset is being overbought or oversold. The formula for the CCI is (Price − Moving Average)/(.015 × Mean Deviation). Knowing when an asset is being overbought or oversold can give investors a good idea of where the asset is heading. For example, if you can tell with high probability that an asset is being oversold, you can buy it at a discount, knowing that the price will go up in the future. The stochastic oscillator is a momentum indicator which tells us how fast the asset is picking up in price. The stochastic oscillator is constructed using the following formula: S = 100 × (Low − Low14)/(High14 − Low14), where Low represents the low of that day, Low14 represents the low of the past 14 days, and High14 is the high price of the last 14 days. Finally, the WAD is another momentum indicator which lets us know whether investors are more often buying or selling in the market, thus identifying whether it is a buyers' or sellers' market. The WAD takes the form WAD = (Close Price − Open Price)/(High Price − Low Price).
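The remaining three indicators can be coded as the text defines them; the OHLC series is simulated, the 20-day CCI window is a common convention assumed here, and the stochastic oscillator below follows the text in using the day's low (standard versions use the close).

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(6)
    n = 60
    close = pd.Series(1000 + np.cumsum(rng.normal(0, 10, n)))  # placeholder OHLC
    high = close + rng.uniform(1, 5, n)
    low = close - rng.uniform(1, 5, n)
    open_ = close.shift(1).fillna(close.iloc[0])

    # CCI: price's distance from its moving average, scaled by 0.015
    # times the mean deviation (20-day window assumed).
    ma = close.rolling(20).mean()
    mean_dev = (close - ma).abs().rolling(20).mean()
    cci = (close - ma) / (0.015 * mean_dev)

    # Stochastic oscillator over 14 days, as defined in the text.
    low14, high14 = low.rolling(14).min(), high.rolling(14).max()
    so = 100 * (low - low14) / (high14 - low14)

    # WAD-style ratio, as given in the text.
    wad = (close - open_) / (high - low)

    print(pd.DataFrame({"CCI": cci, "SO": so, "WAD": wad}).tail())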

July 3, 2017 to February 2018 was used for training the algorithm. The training data set is what the program uses to learn the next change in price; it serves as a reference so that the predicted values can be compared to the actual values. The input layer consists of 6 neurons made up of the 6 technical indicators. Different networks were tested, with hidden layers of 6, 12, and 18 neurons. The neural network gives one output, which is the return to the Amazon stock. Low learning rates of .01, .02, and .03 were used. A rule was made so that the network would stop once the 1000th iteration was reached, so that the algorithm would not run forever while the error term is minimized. An iteration is one pass of the model through the input, hidden, and output layers. The network with the lowest RMSE was chosen as the final model. The month of March was used to forecast the returns to the Amazon stock, and the forecast was graphed alongside the buy and hold strategy. The graph can be seen below in Figure 4.4. Looking at the graph, the neural network and the buy and hold strategy performed similarly. The neural network made returns at a slightly quicker rate, but in about the same amount. The buy and hold curve is almost identical to the neural network's; one could slide the neural network's curve to the right and it would fit almost exactly over the buy and hold curve. The difference is that the neural network is much more complex. This is a shining example of the efficient market hypothesis at work: even with advanced computer technology, beating the market is very difficult. This could persuade investors to simply invest in index funds and follow a buy and hold strategy to reduce trading fees.
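A sketch of the model-selection loop described above, using scikit-learn's MLPRegressor; the indicator matrix and returns are random placeholders, and the train/test split stands in for the July-February training window and the March forecast month.

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.metrics import mean_squared_error

    # Placeholder data: rows = days, columns = the 6 technical indicators;
    # the target is the next-day return to the stock.
    rng = np.random.default_rng(7)
    X = rng.normal(size=(160, 6))
    y = rng.normal(scale=0.02, size=160)
    X_train, X_test, y_train, y_test = X[:140], X[140:], y[:140], y[140:]

    best = None
    for hidden in [(6,), (12,), (18,)]:     # hidden layers of 6, 12, 18 neurons
        for lr in [0.01, 0.02, 0.03]:       # the low learning rates tested
            net = MLPRegressor(hidden_layer_sizes=hidden, learning_rate_init=lr,
                               max_iter=1000,  # stop at the 1000th iteration
                               random_state=0)
            net.fit(X_train, y_train)
            rmse = mean_squared_error(y_test, net.predict(X_test)) ** 0.5
            if best is None or rmse < best[0]:
                best = (rmse, hidden, lr)

    print("chosen network (lowest RMSE):", best)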

Figure 4.4: Neural Network

4.2 Conclusion

Patel et al. (2015) find that the best performing machine learning techniques are fusions of techniques, all of which performed better than the single stage techniques; the best algorithm of all was the combination of ANN and SVMR. Usmani et al. (2016) studied RBF, SLP, MLP, and SVMR and found that the best of those techniques was the MLP, which is used in this paper. Finally, Moghaddam, Moghaddam and Esfandyari (2016) also found that the MLP outperformed other machine learning techniques. What we can conclude from this is that the MLP comes out as one of the more accurate machine learning techniques when it comes to stock market forecasting.

It was found in this paper that the machine learning technique used did not outperform a buy and hold strategy. This could be because many other people in the market are using machine learning techniques, making the markets even more competitively and accurately priced. In the future, it would be interesting to try fusion techniques to see if a better result can be obtained.

Bibliography

Berlinger, Edina, Ferenc Illes, Milan Banai, and Gergely Daroczi. 2015. Mastering R for Quantitative Finance. Packt Publishing.

BML. 2016. "Logistic Regression vs. Decision Trees." https://blog.bigml.com/2016/09/28/logistic-regression-versus-decision-trees/, accessed 2018-04-27.

Ernst Wit, Edwin van den Heuvel, and Jan-Willem Romeijn. 2012. "All models are wrong...": An Introduction to Model Uncertainty. Wiley Publishing.

Gavin, Henri. 2017. "The Levenberg-Marquardt Method for Least Squares Curve-Fitting Problems." Department of Civil and Environmental Engineering, Duke University.

Investopedia. 2018a. "Accumulation/Distribution." https://www.investopedia.com/terms/a/accumulationdistribution.asp, accessed 2018-04-07.

Investopedia. 2018b. "Decision Trees." https://www.investopedia.com/terms/d/decision-tree.asp, accessed 2018-04-20.

Investopedia. 2018c. "Efficient Market Hypothesis." https://www.investopedia.com/terms/e/efficientmarkethypothesis.asp, accessed 2018-04-07.

K Burnham, D Anderson. 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer-Verlag Publishing.

Mathworks.com. 2018. "Understanding Support Vector Machine Regression." https://www.mathworks.com/help/stats/understanding-support-vector-machine-regression.html, accessed 2018-04-21.

Moghaddam, Amin Hedayati, Moein Hedayati Moghaddam, and Morteza Esfandyari. 2016. "Stock market index prediction using artificial neural network." Journal of Economics, Finance and Administrative Science, 21(41): 89-93.

Nau, Robert. 2018. "Identifying AR or MA terms in an ARIMA model." http://people.duke.edu/~rnau/411arim3.htm#plots, accessed 2018-05-01.

Oriani, Felipe. 2018. "How should nodes be connected in a neural network?" Stack Overflow. https://stackoverflow.com/questions/33649645/how-should-nodes-be-connected-in-a-neural-network/, accessed 2018-04-11.

Patel, Jigar, Sahil Shah, Priyank Thakkar, and K Kotecha. 2015. "Predicting stock market index using fusion of machine learning techniques." Expert Systems with Applications, 42(4): 2162-2172.

Publishing, Standout. 2018. "Hidden Layers." http://standoutpublishing.com/g/hidden-layer.html, accessed 2018-04-21.

Usmani, Mehak, Syed Hasan Adil, Kamran Raza, and Syed Saad Azhar Ali. 2016. "Stock market prediction using machine learning techniques." 322-327. IEEE.

Van Greven, Marcel, and Sander Bohte. 2017. Artificial Neural Networks as Models of Neural Information Processing. Frontiers.

Vidyha, Analytics. 2018. "Decision Trees Simplified." https://www.analyticsvidhya.com/blog/2016/04/complete-tutorial-tree-based-modeling-scratch-in-python/, accessed 2018-03-01.