Probabilistic Match Importance in Professional Sports
Total Page:16
File Type:pdf, Size:1020Kb
Probabilistic Match Importance in Professional Sports A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy Michael de Lorenzo B. Sci. (Stats) (Hons) RMIT University School of Science College of Science, Engineering and Health RMIT University February, 2018 Declaration I certify that except where due acknowledgement has been made, the work is that of the author alone; the work has not been submitted previously, in whole or in part, to qualify for any other academic award; the content of the thesis is the result of work which has been carried out since the official commencement date of the approach research program; any editorial work, paid or unpaid, carried out by a third party is acknowledged; and, ethics procedures and guidelines have been followed. I acknowledge the support that I have for my research received through the provision of an Australian Government Research Training Program Scholarship. Michael de Lorenzo February, 2018 i Acknowledgements I wish to acknowledge the following people whose valuable assistance made this dissertation possible. Dr. Stella Stylianou – Thank you for providing constant guidance and encouragement throughout my candidature. This dissertation would not have been possible without your incredible supervision. Dr. Ian Grundy – Thank you for your constant input and guidance throughout my candidature. Your supervision helped shape this dissertation into what it is today and I am grateful for your guidance throughout my candidature. RMIT Sports Statistics Research students – Thank you to both current and past sports statistics research students for your comradeship and helpful advice throughout my candidature. You guys helped make the difficult days all worth it. Examiners – Thank you for your kind words and helpful suggestions which refined the overall quality of this dissertation. The Australian Government Research Training Program Scholarship – Thank you for providing support and the opportunity to complete this dissertation. Kevin, Dawn and Cathy – Thank you to my family for your constant encouragement and support throughout my candidature. None of this would have been possible without it and I am truly grateful. ii Summary Quantifying the importance of a match in professional sports can be beneficial in a variety of circumstances, including in the statistical modelling of match outcomes and attendance figures. Current literature on probabilistic match importance measures have overlooked key information, including the significance of a draw outcome in football (soccer), and the multiple end-of-season outcomes that a team can achieve in a season. The aim of this research was to develop a probabilistic measure of match importance that accounts for different end-of-season outcomes and is adaptable to both a two-result (win/loss) and a three-result (win/draw/loss) sport. By first furthering an existing probabilistic measure of match importance, a new model was developed that calculates the importance of achieving in different end-of-season outcomes using a Markov Chain model for both NBA basketball and Bundesliga football. Furthermore, the new model allows for the significance of a draw outcome to be quantified in the latter; which has not been fully achieved in past literature. The new model was compared to a complete Monte Carlo simulation model, where it was observed that the results between the two modelling techniques were similar. The results from the new model were then applied to an Elo ratings system and a stepwise regression model, where an effect on each model’s predictive performance was observed when matches were classified by their level of importance to the competing teams. The completion of this research helps further the knowledge on calculating and applying match importance within sports modelling. iii Contents Declaration .................................................................................................................................. i Acknowledgements .................................................................................................................... ii Summary .................................................................................................................................. iii List of Tables ............................................................................................................................ ix List of Figures .......................................................................................................................... xii 1. Introduction ........................................................................................................................ 1 1.1. Bundesliga and the National Basketball Association .................................................. 2 1.2. Literature review ......................................................................................................... 4 1.2.1. Measuring match importance ............................................................................... 4 1.2.2. Match importance in statistical modelling ........................................................... 7 1.2.3. In-play importance ............................................................................................... 9 1.2.4. Performance analysis ......................................................................................... 10 1.3. Research questions and publications ......................................................................... 12 1.3.1. Research questions ............................................................................................. 12 1.3.2. Publications ........................................................................................................ 14 2. History of featured sports................................................................................................. 16 2.1. Bundesliga ................................................................................................................. 17 2.1.1. History................................................................................................................ 17 2.1.2. The game ............................................................................................................ 22 2.2. National Basketball Association ............................................................................... 30 iv 2.2.1. History................................................................................................................ 30 2.2.2. The game ............................................................................................................ 34 3. Methods............................................................................................................................ 42 3.1. Data ........................................................................................................................... 43 3.1.1. Bundesliga.......................................................................................................... 43 3.1.2. NBA ................................................................................................................... 44 3.2. VBA programming .................................................................................................... 46 3.3. Binomial distribution................................................................................................. 46 3.4. Markov Chain ............................................................................................................ 47 3.5. Monte Carlo simulation ............................................................................................. 48 3.6. Elo ratings system ..................................................................................................... 48 3.7. Linear regression ....................................................................................................... 49 4. ParWins importance......................................................................................................... 51 4.1. Introduction ............................................................................................................... 52 4.2. Methods ..................................................................................................................... 54 4.2.1. Projected wins .................................................................................................... 54 4.2.2. Conditional probabilities .................................................................................... 59 4.2.3. Match importance .............................................................................................. 61 4.2.4. Adaption to three-result sport ............................................................................ 65 4.3. Results ....................................................................................................................... 67 4.3.1. NBA ................................................................................................................... 67 v 4.3.2. Bundesliga.......................................................................................................... 73 4.4. Discussion ................................................................................................................. 77 4.5. Summary ................................................................................................................... 80 5. Overtake importance .......................................................................................................