Modeling and Analyzing Stock Trends
A Major Qualifying Project submitted to the Faculty of Worcester Polytechnic Institute in partial fulfillment of the requirements for the Degree of Bachelor of Science in Mathematical Sciences

By Laura Cintron Garcia
Date: 5/6/2021
Advisor: Dr. Mayer Humi

This report represents work of WPI undergraduate students submitted to the faculty as evidence of a degree requirement. WPI routinely publishes these reports on its web site without editorial or peer review. For more information about the projects program at WPI, see http://www.wpi.edu/Academics/Projects

1 Abstract

The goal of this project is to create and compare several different stock prediction models and to find a correlation between the predictions and the volatility of each stock. The models were built from each stock's historical data, the DJI index, and moving averages. For the most accurate prediction model, stocks spent an average of 5.3 days within the prediction band. A correlation of -0.0438 was found between that model and a measure of volatility, suggesting that more days within the prediction band are associated with lower volatility.

2 Acknowledgments

Without the help of several people, completing this project without a group would have been significantly more difficult. I want to extend my gratitude to Worcester Polytechnic Institute and the WPI Math Department for their great efforts and success this year regarding school and projects during the pandemic. They did everything they could to ensure these projects were still a rich experience for the students despite everything.

I would also like to thank my MQP advisor, Professor Mayer Humi, for his assistance and guidance on this project, for allowing me to work independently while always being willing to meet with me or answer any questions, and for continuously encouraging me to do what I thought was best for the project.

3 Executive Summary

The stock market is a complicated system that financial analysts have been tracking and attempting to predict for decades.
While many machine learning models have been created to predict future values of stocks and investments, many people still believe that the stock market cannot be predicted because its values are random. While I agree that it would be very difficult to account for every factor that could play a part in the movement of stock prices, there are ways to approximate the data and predict several future values using historical data and the market indices while taking into account the volatility of a stock.

The goal of this project was to develop multiple stock prediction models, based on the stocks' historical data and incorporating the influence of the Dow Jones Industrial (DJI) Index, that can be used to predict the future values of stocks. Some of the models also incorporate the 'moving average' of each stock to approximate the future values. The project also includes various methods of measuring the volatility of each stock, including Bollinger Bands and a scale created from the number of values within 1 standard deviation of the mean for each stock. The models were created using stocks in the Industrial sector of the stock market and the DJI index but can, theoretically, be used for any stock and index, so long as the appropriate index is applied for that stock. At the time of selection, the selected stocks were all priced between $20 and $150.

This project was divided into 4 main models: (1) a model using only the historical data of each stock, (2) a model using historical data and DJI influence, (3) a model using the moving average of historical data, and (4) a model using the moving average of historical data and DJI influence. An auto-correlation was taken for each stock to find the time period that is relevant for finding future values (the relevant period). For each stock, each model measured how many days the stock's true closing values spent within a prediction band.
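As a rough illustration of the band count described above, the following Python sketch counts the days on which a true closing value falls inside a prediction band. The band is assumed here to be the predicted value plus or minus a fixed half-width; this summary does not state how the report constructs its bands, so `days_within_band`, `half_width`, and the sample prices are all hypothetical.

```python
from typing import Sequence


def days_within_band(actual: Sequence[float],
                     predicted: Sequence[float],
                     half_width: float) -> int:
    """Count days on which the actual close lies inside predicted +/- half_width.

    Assumption: a symmetric band of constant half-width around each predicted
    value. The report's own band definition may differ.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must cover the same days")
    return sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= half_width)


# Hypothetical 9-day prediction window (made-up prices, not data from the report).
actual_closes = [150.2, 151.0, 149.8, 152.8, 155.1, 153.0, 150.9, 148.7, 151.5]
predicted_closes = [150.0, 150.5, 151.0, 151.5, 152.0, 152.5, 153.0, 153.5, 154.0]
print(days_within_band(actual_closes, predicted_closes, half_width=1.5))
```

Averaging this count over all stocks for one model would give a figure comparable in kind to the per-model averages reported in this summary.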
Here are the 4 models applied to the stock Alamo Group Corp. (ALG). ALG spent 4, 5, 4, and 4 days within the prediction bands of models 1 through 4, respectively.

Figure 1: ALG Fourier Approximation
Figure 2: ALG DJI Influence Approximation
Figure 3: Moving Average ALG Fourier Approximation
Figure 4: Moving Average ALG DJI Influence Approximation

Then, volatility was measured using 2 different methods. The first is a scale that uses the percentage of values within 1 standard deviation of the mean, for each stock and for the DJI, to find the volatility of each stock in relation to the DJI's volatility. This was done by finding the difference between each stock's percentage of values within 1 standard deviation of its mean (%stock) and the corresponding percentage for the DJI (%DJI), and then finding the standard deviation (sd) of those differences. Percentages within 1 sd of the %DJI were assigned a 1 on the scale, those within 2 sd a 2, those within 3 sd a 3, and those 4 sd away a 4. The lower the number on the scale, the less volatile the stock is.

The second measure uses Bollinger Bands built on the moving average of each stock: bands are plotted 1 standard deviation above and below the moving average, and the number of times the closing prices touch or cross the bands is counted. The fewer the touches, the less volatile the stock is, and vice versa.

A correlation between each model and each measure of volatility was taken to see whether a pattern could be found between a higher number of days within the prediction band and a lower volatility measure. The days that each stock spent within its prediction band were counted for each model, and the average was taken.

Figure 5

According to the average number of days each stock spent within its prediction band, the best model was the one using the moving average and the DJI influence, with an average of 5.3 out of 9 days.
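The two volatility measures described above can be sketched in Python as follows. This is a minimal illustration, not the report's implementation: the function names, the use of the population standard deviation, the rounding rule that maps a difference to a scale value of 1 to 4, the rolling `window` length, and the choice to count every day whose close lies on or outside a band as one touch are all assumptions.

```python
import math
from statistics import mean, pstdev
from typing import Dict, Sequence


def pct_within_one_sd(values: Sequence[float]) -> float:
    """Percentage of values lying within 1 standard deviation of their mean."""
    m, s = mean(values), pstdev(values)
    inside = sum(1 for v in values if abs(v - m) <= s)
    return 100.0 * inside / len(values)


def volatility_scale(stock_pcts: Dict[str, float], dji_pct: float) -> Dict[str, int]:
    """Map each stock to 1..4 by how many sd its percentage is from the DJI's.

    The sd is the standard deviation of the differences (%stock - %DJI).
    Assumption: a difference of at most 1 sd scores 1, at most 2 sd scores 2,
    and so on, capped at 4.
    """
    diffs = {ticker: pct - dji_pct for ticker, pct in stock_pcts.items()}
    sd = pstdev(list(diffs.values()))
    if sd == 0:
        return {ticker: 1 for ticker in diffs}
    return {ticker: min(4, max(1, math.ceil(abs(d) / sd)))
            for ticker, d in diffs.items()}


def bollinger_touches(closes: Sequence[float], window: int = 20, k: float = 1.0) -> int:
    """Count days whose close is on or outside the moving average +/- k sd bands.

    Assumption: the sd is computed over the same rolling window as the
    moving average; flat windows (sd of zero) are skipped.
    """
    touches = 0
    for i in range(window - 1, len(closes)):
        recent = closes[i - window + 1:i + 1]
        ma, sd = mean(recent), pstdev(recent)
        if sd > 0 and (closes[i] >= ma + k * sd or closes[i] <= ma - k * sd):
            touches += 1
    return touches


# Hypothetical inputs (made-up numbers, not data from the report).
print(volatility_scale({"A": 70.0, "B": 60.0, "C": 50.0}, dji_pct=68.0))
print(bollinger_touches([10, 10, 10, 10, 13], window=3))
```

Under these assumptions, a lower scale value or a lower touch count marks a less volatile stock, matching the direction of both measures in the text.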
The DJI volatility comparison scale (1-4) and the number of Bollinger touches for each stock are shown below.

Figure 6

On the volatility scale, no stocks were within one standard deviation of the %DJI, but these values change if the 'relevant' period changes. The scale measures volatility relative to the volatility of the DJI. A 1 on the scale indicates that the stock is roughly as volatile as the DJI, a 2 means it is only slightly more volatile, a 3 means it is even more volatile, and a 4 indicates that it is significantly more volatile than the DJI. For the Bollinger Bands, most of the stocks had few touches, so they are not overly volatile.

The correlations between the number of days each stock spent within the prediction bands for each model and the 2 measures of volatility were taken.

Figure 7

A stronger negative correlation would indicate a pattern between low volatility and a higher number of days within the prediction band. The MA DJI model and the Bollinger measure have a negative correlation, which may indicate that it is the better prediction model and that the Bollinger Bands are the better measure of volatility.

Contents
1 Abstract
2 Acknowledgments
3 Executive Summary
4 Introduction
4.1 Motivation and Goals
4.2 Financial Modeling and Model Trading Today
5 Background
5.1 Stock Selection
5.2 Auto-correlation
5.3 Trend-Line
5.4 Fourier Series
5.5 Market Indices
5.6 Moving Average
6 Models and Predictions
6.1 Fourier Approximation
6.2 DJI Influence Approximation
6.3 Moving Average: Fourier Approximation
6.4 Moving Average: DJI Influence Approximation
7 Volatility
7.1 Scale A
7.2 Bollinger Bands
7.3 Volatility Correlations
8 Conclusion

List of Figures
1 ALG Fourier Approximation
2 ALG DJI Influence Approximation
3 Moving Average ALG Fourier Approximation
4 Moving Average ALG DJI Influence Approximation
5 (no caption)
6 (no caption)
7 (no caption)
8 The auto-correlation determined that the data useful for predicting ALG are the 87 days prior to the start of the prediction. The relevant period for ALG is 11/3/2020 - 3/11/2021, represented by 213:300.
9 The auto-correlation determined that the data useful for predicting AOS are the 109 days prior to the start of the prediction. The relevant period for AOS is 10/2/2020 - 3/11/2021, represented by 191:300.
10 The auto-correlation determined that the data useful for predicting AYI are the 58 days prior to the start of the prediction. The relevant period for AYI is 12/15/2020 - 3/11/2021, represented by 242:300.
11 The auto-correlation determined that the data useful for predicting BDC are the 52 days prior to the start of the prediction. The relevant period for BDC is 12/23/2020 - 3/11/2021, represented by 248:300.
12 The auto-correlation determined that the data useful for predicting CARR are the 87 days prior to the start of the prediction. The relevant period for CARR is 11/3/2020 - 3/11/2021, represented by 213:300.
13 The auto-correlation determined that the data useful for predicting CYRX are the 111 days prior to the start of the prediction.