A Dynamic Bayesian Network to Predict the Total Points Scored in National Basketball Association Games
Enrique Marcos Alameda-Basora
Iowa State University
Iowa State University Capstones, Theses and Dissertations
Graduate Theses and Dissertations
2019

A dynamic Bayesian network to predict the total points scored in national basketball association games
Enrique Marcos Alameda-Basora
Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd
Part of the Engineering Commons, and the Statistics and Probability Commons

Recommended Citation
Alameda-Basora, Enrique Marcos, "A dynamic Bayesian network to predict the total points scored in national basketball association games" (2019). Graduate Theses and Dissertations. 16955.
https://lib.dr.iastate.edu/etd/16955

This Thesis is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected].

A dynamic Bayesian network to predict the total points scored in national basketball association games
by
Enrique M. Alameda-Basora

A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of
MASTER OF SCIENCE

Major: Industrial Engineering

Program of Study Committee:
Sarah Ryan, Major Professor
Dan Nettleton
Sigurdur Olafsson

The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this thesis. The Graduate College will ensure this thesis is globally accessible and will not permit alterations after a degree is conferred.

Iowa State University
Ames, Iowa
2019

Copyright © Enrique M. Alameda-Basora, 2019. All rights reserved.

DEDICATION

I dedicate this work to my beloved parents, Eloiris Basora-Cintron and Rafael A. Alameda-Rojas.
Thank you for always loving me unconditionally and instilling within me the values of hard work and dedication. Before embarking on my journey to Iowa State University, I had to cope with the decision not to see you as often, and that was probably the hardest decision of my life. I could imagine how difficult it was to take care of four children. I vividly recall times when you would spend a whole day of your busy lives making sure everyone completed their science fair project due that week. I want you to know that all the sacrifices you made for us were not in vain. Incredibly, we have all grown up and become adults with promising careers and/or graduate degrees. Our success is a testament to all the hours of hard work you put into raising us and making sure we believed in our abilities to succeed. Thank you for teaching me the value of education and life-long learning. Without you, I would not be where I am today. I will forever be grateful and indebted to you.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
ABSTRACT

CHAPTER I INTRODUCTION
  1.1 Background and Motivation
  1.2 Research Problem
  1.3 Proposed Solution
  1.4 Organization of Thesis

CHAPTER II LITERATURE REVIEW
  2.1 Introduction
  2.2 Data Mining and its Role in Sports Predictive Modeling
    2.2.1 Data Mining Concept
    2.2.2 Sports Predictive Modeling
  2.3 Data Mining Techniques Applied to Basketball Predictive Modeling
    2.3.1 Naïve Bayes Classifier
    2.3.2 Logistic Regression
    2.3.3 Neural Networks
    2.3.4 Review of Bayesian Networks Applied to Sports
    2.3.5 Conclusion and Research Gap

CHAPTER III BAYESIAN NETWORK DETAILS AND JUSTIFICATION
  3.1 Introduction to Bayesian Networks
  3.2 The Importance of the Directed Acyclic Graph Structure
  3.3 Bayesian Network Learning
  3.4 Why a Bayesian Network?
  3.5 Bayesian Network Limitations
  3.6 Simple Bayesian Network Example
  3.7 Guide on Computing Probabilities in this Study

CHAPTER IV DATA COLLECTION AND PREPARATION
  4.1 Collection of Data Sets and Additional Features Constructed
    4.1.1 Training Data Set
    4.1.2 Test Data Set
    4.1.3 Features Scraped and Additional Constructed Features
  4.2 Discretization of Data Sets
  4.3 Feature Selection
    4.3.1 Information Gain Ratio
    4.3.2 Chi-Square Test of Independence for Feature Selection
    4.3.3 Final Learning Features and Validation

CHAPTER V EXPERIMENTAL DESIGN
  5.1 Introduction
  5.2 Cramer’s V Measure of Association
  5.3 Chi-Square Test for Conditional Independence
  5.4 Expert Bayesian Network
    5.4.1 Methodology
    5.4.2 Initial Comparison to Non-Expert Bayesian Network with Feature Selection
  5.5 Calculating the Probabilities

CHAPTER VI MODEL EVALUATION
  6.1 Results
    6.1.1 Accuracy Results
    6.1.2 Profitability Results
    6.1.3 Time Results
  6.2 Discussion of Results

CHAPTER VII CONCLUSIONS

BIBLIOGRAPHY