Technological University Dublin
ARROW@TU Dublin

Dissertations, School of Computer Sciences

2015-03-10

Can Wikipedia Article Traffic Statistics be Used to Verify a Technical Indicator? An Exploration into the Correlation Between Wikipedia Article Traffic Statistics and the Coppock Technical Indicator.

Cormac O'Connor
Technological University Dublin

Follow this and additional works at: https://arrow.tudublin.ie/scschcomdis
Part of the Computer Engineering Commons

Recommended Citation
O'Connor, C. (2015) Can Wikipedia Article Traffic Statistics be Used to Verify a Technical Indicator? An Exploration into the Correlation Between Wikipedia Article Traffic Statistics and the Coppock Technical Indicator. Masters Dissertation, Technological University Dublin, 2015.

This Master's thesis is brought to you for free and open access by the School of Computer Sciences at ARROW@TU Dublin. It has been accepted for inclusion in Dissertations by an authorized administrator of ARROW@TU Dublin. For more information, please contact [email protected], [email protected].

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.

Can Wikipedia Article Traffic Statistics be used to verify a Technical Indicator? An exploration into the correlation between Wikipedia Article Traffic Statistics and the Coppock Technical Indicator.

Cormac O’Connor

A dissertation submitted in partial fulfilment of the requirements of Dublin Institute of Technology for the degree of M.Sc. in Computing (Data Analytics)

March 2015

DECLARATION

I certify that this dissertation, which I now submit for examination for the award of MSc in Computing (Data Analytics), is entirely my own work and has not been taken from the work of others save and to the extent that such work has been cited and acknowledged within the text of my work.
This dissertation was prepared according to the regulations for postgraduate study of the Dublin Institute of Technology and has not been submitted in whole or part for an award in any other Institute or University. The work reported on in this dissertation conforms to the principles and requirements of the Institute’s guidelines for ethics in research.

Signed: _________________________________
Cormac O’Connor

Date: 6th March 2015

ABSTRACT

Recent studies have shown that stock market moves can be predicted by quantifying Wikipedia usage patterns that arise from information gathering (Moat et al. 2013). Research has also examined whether Wikipedia data can predict movie box office success (Mestyan et al. 2013). The goal of any investor, in order to maximize the return on their investments, is to have an edge over other participants in the markets. Several tools and techniques have been used over the years to achieve this, some proving to generate a consistent stream of income (Gillen 2012). With the improvement of technology and communication links, what was once considered a closed-door, gentlemen’s club operation can now be tapped into by anybody who has access to a PC and a communications link. It is said that only approximately 20% of investors are consistently successful in their investments (Terzo 2013). In order to be successful, there needs to be a strategy in place that is strictly adhered to. The objective of these trading systems is to minimize, or ideally cut out, the human emotion factor and, as a consequence, allow the strategy to operate at its optimum. An example of this is the use of a technical analysis indicator which, when used correctly, can net the investor considerable, consistent returns (Gillen 2012).
Technical indicators, such as Coppock, are widely used in the field of stock market investment to give traders and investors an insight into the direction in which a stock or index is moving, and so help them choose the optimum time to enter or exit the market. This project investigates whether Wikipedia article traffic statistics can be used to verify trading signals given by the Coppock technical indicator through the use of a suitable correlation technique.

Keywords: Technical Analysis, Wikipedia, Coppock Indicator, Momentum, Correlation.

ACKNOWLEDGEMENTS

This dissertation would not have been possible without the help of a number of people, whom I would like to take this opportunity to thank.

I would like to express my sincere thanks to my supervisor Luca Longo for his help, guidance and assistance throughout the course of this dissertation.

I would also like to thank Damian Gordon for reviewing my dissertation as the deadline approached and helping to put things into perspective.

I would like to thank Mirko Kaempf, who through a series of discussions helped me to solidify my idea and use the Wikipedia data source as a research topic.

Thanks to my parents, who were always on hand to assist at home with my three boys Cian, Neil and Rory when I was unavailable due to the demands of the dissertation.

Finally, I would not have been able to complete this work without the love and support of my wife, Maria. Without her help, I would not have even considered going the extra mile to achieve this.

CONTENTS

DECLARATION
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF FIGURES
TABLE OF TABLES
1 INTRODUCTION
    1.1 Background
    1.2 Research problem
    1.3 Research aim and objectives
    1.4 Research methodology
    1.5 Scope and limitations
    1.6 Organisation of dissertation
2. LITERATURE REVIEW
    2.1 What is technical analysis?
    2.2 The Coppock indicator
    2.3 Wikipedia article view statistics
    2.4 Suitable correlation techniques
    2.5 Discussion
3. EXPERIMENTAL DESIGN
    3.1 Introduction
    3.2 Focus of the experiment
    3.3 Data
        3.3.1 Financial data structure
        3.3.2 Wikipedia data structure
    3.4 Data Cleansing
    3.5 Transformation of data
    3.6 Summary
4. EXPERIMENTATION AND EVALUATION
    4.1 Data pre-processing and initial characteristic analysis
        4.1.1 Missing stock price data
        4.1.2 Missing weekend stock market price data
        4.1.3 Missing Wikipedia article traffic statistics data
        4.1.4 Coppock value derivations
        4.1.5 Correlation checks
        4.1.6 Strengths and Limitations
    4.2 Summary
5. RESULTS AND DISCUSSION
    5.1 Results
        5.1.1 Shapiro-Wilk