Stock Market Prediction Using Artificial Neural Networks
Stock market prediction using artificial neural networks
A quantitative study on time delays

AROSHINE MUNASINGHE
DAJANA VLAJIĆ

Royal Institute of Technology
DD143X, Bachelor's Thesis in Computer Science
Thesis Supervisor: Pawel Herman
Title: Researcher, PhD, Department of Computational Biology, KTH
Examiner: Örjan Ekeberg
1 June 2015

By Aroshine Munasinghe and Dajana Vlajić. Submitted to KTH Computer Science and Communication on June 1, 2015, in partial fulfillment of the requirements for the degree of Bachelor of Computer Science.

Abstract

This report investigates how prediction of stock markets with Artificial Neural Networks (ANN) is affected by altering aspects of data quantities. A short-term and a long-term perspective, considered in relation to time delays, are examined. Inspired by neuroscience, ANNs have shown great potential in recognising patterns in nonlinear systems. Existing research suggests that the ANN is an eminent model for predicting stock markets due to its dynamical characteristics. Closing prices of large-caps within the IT and Telecommunication sectors of the Swedish index OMX30 Stockholm (OMXS30) have been used as data. The ANNs are implemented as multilayer feedforward networks, trained using supervised learning. To identify specific configurations, the models have undergone extensive testing by mean squared errors and statistical analysis. The results obtained suggest that the short-term perspective is optimally predicted for significantly small numbers of time delays, and that optimal configurations do not change for increasing quantities of data. No significant conclusions could be drawn from the results for the long-term perspective.
Key words: ANOVA, Backpropagation, Configurations, Stock Prediction, Artificial Neural Networks

Summary (Referat)

This report examines how predictions of stock markets using artificial neural networks (ANN) are affected by changing aspects of data quantities. The study was carried out from both a short-term and a long-term perspective, related to the number of time delays. Inspired by neuroscience, ANNs have shown great potential for finding patterns in nonlinear systems. Existing studies show that the ANN is an excellent model for predicting the stock market owing to its dynamic character. Closing prices of large listed companies within the IT and Telecom sectors, represented by the Swedish market within the index OMX30 Stockholm, constitute the data set. The ANN models are implemented as multilayer feedforward networks, trained using supervised learning. To identify specific configurations, the networks were tested through extensive testing of mean squared error and statistical analysis. The results show that the optimal configurations for the short-term perspective included smaller time delays and that these did not change with increased quantities of data. No significant conclusions could be drawn from the results for the long-term perspective.

Keywords: ANOVA, Stock Market, Backpropagation, Artificial Neural Networks, Configurations

Contents

1 Introduction
  1.1 Problem Statement and Hypothesis
  1.2 Scope
  1.3 Motivation
2 Background
  2.1 Stock Market Prediction
    2.1.1 Company Risks
    2.1.2 Random Walk Hypothesis
  2.2 Technical Analysis
  2.3 Neural Networks
  2.4 Artificial Neural Networks
    2.4.1 Architecture of Artificial Neural Networks
    2.4.2 Backpropagation Algorithm
    2.4.3 Overfitting
  2.5 ANN Applied on Stock Data
3 Methods
  3.1 Data Collection
  3.2 Implementation
    3.2.1 Identifying Configurations
    3.2.2 Bayesian Regularisation
  3.3 Performance Measures
    3.3.1 Mean Squared Error
  3.4 Statistical Tests
    3.4.1 Assessment of Normality
    3.4.2 Analysis of Variance
    3.4.3 Assumptions
    3.4.4 Null Hypothesis
    3.4.5 Post Hoc Tests
4 Results
5 Discussion
  5.1 Analysis of Results
  5.2 Limitations
6 Conclusion
  6.1 Future Research
Bibliography

Chapter 1 Introduction

Prediction of stock market data is a prominent problem for stock traders. Stock market data is highly dynamic because it is shaped by many, often conflicting, influential factors. For business purposes, the problem has been approached by observing market forces, making assumptions and studying historical data. Only limited conditions for success have been established for these methodologies. Increased attention from academia in the field of prediction has driven the development of technical and statistical models. In contrast to the conventional methodologies practiced by traders, technical methods are restricted to observations of historical data.

One of the most recently examined technical methods is Artificial Neural Networks (ANN). ANNs are mathematical structures originally designed to mimic the architecture, fault tolerance, and learning capability of the neural networks in the human brain [1]. ANNs have been successfully applied for prediction in fields related to medicine, engineering and physics [1]. The methods have been suggested for modeling financial time series due to their capability of mapping nonlinear connections [2], as proposed for stock market data [3]. The performance of ANNs has been compared to statistical approaches, showing significantly improved results as the complexity of the time series increases [4]. Further factors affecting the performance of ANNs are the choice of architecture, input data, and quantity of data.
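To make "input data" and "quantity of data" concrete, the following is a minimal sketch of how a series of closing prices could be turned into one-day-ahead training samples. It is an illustration under stated assumptions, not the implementation used in this report: the function names `make_delay_samples` and `mean_squared_error`, the parameters `n_delays` and `n_points`, the example prices, and the naive baseline are all hypothetical.

```python
from typing import List, Sequence, Tuple


def make_delay_samples(
    prices: Sequence[float], n_delays: int, n_points: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) pairs for one-day-ahead prediction.

    Only the most recent `n_points` prices are kept (the quantity of data).
    Each input holds `n_delays` consecutive closing prices (the time delays),
    and its target is the closing price of the following day.
    """
    if n_delays < 1 or n_points < 1:
        raise ValueError("n_delays and n_points must be at least 1")
    recent = list(prices)[-n_points:]
    if len(recent) <= n_delays:
        raise ValueError("need more data points than time delays")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_delays, len(recent)):
        inputs.append(recent[t - n_delays:t])  # the n_delays days before day t
        targets.append(recent[t])              # the price on day t
    return inputs, targets


def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of the squared differences between predictions and targets."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Made-up closing prices, for illustration only.
closing = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0]
X, y = make_delay_samples(closing, n_delays=2, n_points=5)

# Assumed naive baseline (not an ANN): tomorrow's price equals today's price.
naive = [window[-1] for window in X]
baseline_mse = mean_squared_error(naive, y)
```

In such a setup, increasing `n_points` corresponds to moving further back in the past, while increasing `n_delays` widens the window of past prices each prediction can see. A trained network would replace the naive baseline, and its mean squared error on the targets could be compared across configurations.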
In this report, ANNs are deployed to model nonlinear connections in stock market data under varying quantities of data. This is tested for Nordic large-caps regarding short-term and long-term perspectives. Bayesian regularisation backpropagation has been used to train the network, supported by statistical testing methods to validate the results.

1.1 Problem Statement and Hypothesis

The goal of this report is to analyse the quantity of data sufficient for the prediction of stock market prices. Increasing the quantity of data is equivalent to moving further back in the past. The report focuses on two aspects:

• Short-term perspective: a maximum of two (2) years of data points is used to predict a day ahead.
• Long-term perspective: a maximum of four (4) years of data points is used to predict a day ahead.

The hypothesis for the performance evaluation is that a small number of time delays, combined with small quantities of data, produces optimal results for the short-term perspective. For the long-term perspective, large quantities of data, in combination with large numbers of time delays, are expected to give optimal results.

1.2 Scope

This research does not measure the performance of ANNs against other models, as existing comparative research already supports the use of this technical model. The main focus of this report is the extent of historical data, in order to reach well-founded conclusions about optimal configurations. The data used comprise five large-caps within the IT and Telecommunication sectors of the Swedish index OMXS30.

1.3 Motivation

Despite extensive research in the area of stock markets, no significant guidelines for estimating or predicting the market have been established. Various methodologies within technical and statistical analysis have been used in attempts to predict prices. However, research on the quantity of data needed to predict stock markets is very limited.
Therefore, this report intends to investigate how this factor impacts the performance of ANNs. Because prices are predicted one day ahead, a maximum of two (2) and four (4) years of data has been chosen for the short-term and long-term perspectives, respectively.

Chapter 2 Background

The background provides insight into the issue of predicting stock market data with technical methods. Initially, aspects that influence stock market behavior are covered and conventional theories about data fluctuations are brought up. Technical analysis is broadly described in Section 2.2, followed by a detailed investigation of the distinctive properties and internal architecture of Artificial Neural Networks (ANN). Lastly, in Section 2.5, the performance of ANNs is investigated in comparison to other methods.

2.1 Stock Market Prediction

The movement of stock prices is triggered by changes in supply and demand, generally referred to as market forces. These result from a combination of factors, such as earnings and the impact of social media, that are strongly related to a company's internal and external properties [5]. From another perspective, data is affected by informed traders and 'noise traders' [6]. Therefore, to generate profit in stock markets, a range of influential factors must be considered.

For the purpose of accurate predictions, traders apply methods of analysis derived from either a fundamental or a technical perspective [5]. The traditional approach is fundamental analysis, which involves factors related to the company such as market position, growth rates and revenue generation [7]. The method used in this report is technical analysis, which is based on historical fluctuations. Technical analysis is further discussed in Section 2.2.

2.1.1 Company Risks

An additional influential factor for risk assessments is the size of companies. The trading community refers to large and small companies as large-cap and small-cap, where size is determined directly by market capitalisation.
Large-cap have a market