Sports Data Mining
Robert P. Schumaker, Iona College, New Rochelle, New York
Osama K. Solieman, Tucson, Arizona
Hsinchun Chen, University of Arizona, Tucson, Arizona

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
PREFACE

CHAPTER 1. SPORTS DATA MINING
Chapter Overview
1. Definition
2. History
3. Societal Dimensions
4. The International Landscape
5. Criticisms
6. Questions for Discussion

CHAPTER 2. SPORTS DATA MINING METHODOLOGY
Chapter Overview
1. Scientific Foundation
2. Traditional Data Mining Applications
3. Deriving Knowledge
4. Questions for Discussion

CHAPTER 3. DATA SOURCES FOR SPORTS
Chapter Overview
1. Introduction
2. Professional Societies
    2.1 The Society for American Baseball Research (SABR)
    2.2 Association for Professional Basketball Research (APBR)
    2.3 Professional Football Researchers Association (PFRA)
3. Sport-related Associations
    3.1 The International Association on Computer Science in Sport (IACSS)
    3.2 The International Association for Sports Information (IASI)
4. Special Interest Sources
    4.1 Baseball
    4.2 Basketball
    4.3 Football
    4.4 Cricket
    4.5 Soccer
    4.6 Multiple Sports
5. Conclusions
6. Questions for Discussion

CHAPTER 4. RESEARCH IN SPORTS STATISTICS
Chapter Overview
1. Introduction
2. Sports Statistics
    2.1 History and Inherent Problems of Statistics in Sports
    2.2 Bill James
    2.3 Dean Oliver
3. Baseball Research
    3.1 Building Blocks
    3.2 Runs Created
    3.3 Win Shares
    3.4 Linear Weights and Total Player Rating
    3.5 Pitching Measures
4. Basketball Research
    4.1 Shot Zones
    4.2 Player Efficiency Rating
    4.3 Plus / Minus Rating
    4.4 Measuring Player Contribution to Winning
    4.5 Rating Clutch Performances
5. Football Research
    5.1 Defense-Adjusted Value Over Average
    5.2 Defense-Adjusted Points Above Replacement
    5.3 Adjusted Line Yards
6. Emerging Research in Other Sports
    6.1 NCAA Bowl Championship Series (BCS)
    6.2 NCAA Men’s Basketball Tournament
    6.3 Soccer
    6.4 Cricket
    6.5 Olympic Curling
7. Conclusions
8. Questions for Discussion

CHAPTER 5. TOOLS AND SYSTEMS FOR SPORTS DATA ANALYSIS
Chapter Overview
1. Introduction
2. Sports Data Mining Tools
    2.1 Advanced Scout
    2.2 Synergy Online
    2.3 SportsVis
    2.4 Sports Data Hub
3. Scouting Tools
    3.1 Digital Scout
    3.2 Inside Edge
4. Sports Fraud Detection
    4.1 Las Vegas Sports Consultants (LVSC)
    4.2 Offshore Gaming
5. Conclusions
6. Questions for Discussion

CHAPTER 6. PREDICTIVE MODELING FOR SPORTS AND GAMING
Chapter Overview
1. Introduction
2. Statistical Simulations
    2.1 Baseball