An Application of Linear Regression & Artificial Neural Network Model In
Total Page:16
File Type:pdf, Size:1020Kb
International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181 Vol. 4 Issue 01,January-2015 An Application of Linear Regression & Artificial Neural Network Model in the NFL Result Prediction Anyama, Oscar Uzoma Igiri, Chinwe Peace Department of Computer Science Department of Computer Science University of Port Harcourt University of Port Harcourt Nigeria Nigeria Abstract—Football results prediction in has gained II. LITERATURE REVIEW popularity in recent years. Non hybrid approaches have shown Today, various mathematical games prediction models complex and low prediction results. Data mining tools with exist, ranging from result prediction models to number of insufficient features, however, have also yielded low predictions. goals prediction models to injury prediction model and even to In our research, machine learning has been used to develop a half time prediction models. These models and computer hybrid football match result predictive model for NFL. We programs for games predictions have long been developed and constructed a more comprehensive system with improved still in development. Most of them employ stochastic methods prediction accuracy by using a hybridized approach. Our to describe uncertainty. prediction system for football match results was implemented using a hybrid of artificial neural network (ANN) and linear One of the most recent challenges especially to researchers regression (LR) techniques with Rapid Miner as a data mining is the deployment of an efficient Hybrid Prediction System, tool. The technique yielded 90.32% prediction accuracy. With which must be a system that takes into account the this output, it is observed that the prediction accuracy is higher performances of players and ranking of teams. The system than those of existing systems. must have very high prediction accuracy. Keywords—ANN; Hybrid; Machine learning; Models; Participation in games activity is perhaps important to all Prediction of us; therefore it is not surprising that there has been a substantial amount of research work done on prediction of I. INTRODUCTION games. Some of the related works are discussed. Sports prediction is gradually becoming a huge business venture, sports entertainment is not just about the competitors A large number of literatures have been dedicated to the (teams, officials) of the game and the fans showing their development of goal modeling, result modeling, ratings and support week in week out to their various sports outfit. Betting rankings for games prediction. The methods proposed in these markets have gradually churned out huge gaming sentiments papers and articles can be evaluated by their ability to predict into a multi-billion dollar venture. You can barely watch the outcomes of future games. Many papers have considered games these days without being reminded that you can bet on methods based on various forms of mathematical models the results and make reasonable financial gains. Therefore, where predictions and forecasting are made for games predicting game results has become an area of interest for outcome [8] different sports organizations [5]. a) Related Works In this research however, the interest of using machine A Quantitative Stock Prediction System based on learning in Artificial Intelligence as an approach in the Financial News was done by [2]. In their work the discrete prediction of the outcome of games is thoroughly investigated. stock price prediction using a synthesis of linguistic, financial Machine Learning approach provides an advantage of having and statistical techniques to create the Arizona Financial Text an unbiased/objective analysis with respect to games statistics System (AZFinText) was done. The major objective of the using select techniques and methods [7]. In turn, this provides project was to provide predictions for stock market using and ensures that games results are predicted in a much statistical data gathered from financial news. The lines of effective manner using appropriate models. research approach used were Mean Squared Error (MSE), The overall benefits of developing such a system are: visualization tools and Machine Learning Techniques. Prediction accuracy of 71.2% was obtained with a Simulated To build a system that can help bettors beat the bookies Trading return of 8.50% [6] [1] developed an Ocean Model, Analysis and Prediction To help mangers with team strategies and decision System using the root mean square error for sea surface height making anomaly. The approach introduced a daily forecast with a new To contribute to knowledge and learning four-cycle design was introduced where four independent forecast cycles in each time-lag. The system is composed of a To study statistics obtained from games’ data. real-time ocean observing system, a quality control system, the Ocean Forecast Australia Model (OFAM), the BLUElink Ocean Data Assimilation System (BODAS), an adaptive IJERTV4IS010426 www.ijert.org 457 ( This work is licensed under a Creative Commons Attribution 4.0 International License.) International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181 Vol. 4 Issue 01,January-2015 initialisation scheme and air-sea fluxes from the Australian The existing system develops a model using a Back Community Climate and Earth System Simulator (ACCESS). Propagation approach which implies that more than one predictor variable is available. An artificial neural network which is a non-linear data modeling tool that can be used to find hidden patterns and The algorithm below represents the multi linear regression relationships within data was developed by [4]. The findings approach. by the author show good solution to the prediction problem. The major pitfalls include: high complexity, lowly 1) Algorithm (ANN) parameterized poor training of datasets. The week 15 Input prediction rate was 75% using the season average and only Attributes X1, X2…. Xn 37.5% of the games using the three week average. Main process algorithm [3] examined nine college football ranking systems, including several used by the BCS and considered them in initialize network weights (often small random values) addition to an indicator of home field advantage and betting do spreads as predictors in regression models predicting the outcomes (point spreads) of 1,582 games from 1998 to 2001. forEach training example The approach was a very robust one with lots of parameters prediction = neural-net-output(network, ex) captured. The major pitfall is its linear dependencies. A prediction accuracy of 74.7% was obtained. actual = teacher-output(example) compute error (prediction - actual) III. ANALYSIS OF EXISTING SYSTEM compute all weights from hidden layer to output The present system is the research work done by [4]. In layer [4], an artificial neural network which is a non-linear statistical model that can be used to find hidden patterns, compute all weights from input layer to hidden layer approximate functions and find relationships within data using update network weights // input layer not modified by the concept of neurons in the human brain. The tools used are error estimate matlab and perl script. until all ex classified correctly or another terminating criterion satisfied FEATURE EXTRACTION return the network PREPROCESSING DATA SOURCE 2) Output Y= Predicted Result used for rating ARTIFICIAL Pitfalls NEURAL NETWORK PREDICTED High complexity OUTPUT Low parameterized Data cannot be retrained 3) Prediction Accuracy Fig 1: Joshua K., (2003) (Existing System) Predictions were made using both prediction sets and were tested for weeks 14 and 15 of the 2003 NFL season. In both In [4], the Neural Network Prediction of NFL Football cases, the season average prediction set was more effective in Games was divided into the following stages: predicting the outcome of the games. For week 14, the season average prediction set generated 75% correct outcomes, Data Collection whereas the three week average set correctly predicted 62.5% Data Extraction and of the games. The week 15 prediction rate was 75% using the season average and only 37.5% Data mining IV. PROPOSED HYBRIDIZED PREDICTION SYSTEM Features used are: In the proposed system, the use of machine learning was Total yardage differential developed to out-perform the existing system. RapidMiner modeling tool is used to investigate the hybridized methods. Rushing yardage differential The proposed model framework is a hybrid of Linear Time of possession differential (in seconds) Regression Technique and Artificial Neural Network, which employs a supervised learning. Turnover differential and Home or away scores. A Artificial Neural Network IJERTV4IS010426 www.ijert.org 458 ( This work is licensed under a Creative Commons Attribution 4.0 International License.) International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181 Vol. 4 Issue 01,January-2015 The resulting value of range -1 to +1 from the Linear Regression Technique will provide attributes that affects or FEATURE DATA EXTRACTION contributes to the prediction results. The results will be used SOURCE PREPROCESSING as input for the ANN model. B. Linear Regression (Attribute Weighing) The relationship between the dependent and independent KNOWLEDGE variables is seen. Multiple regression models contain one BASE LINEAR RULES measurement variable in multiple forms. The response REGRESSION variable is influenced by more than one predictor label. Unlike linear regression, where the response is a straight line, the response may