Not All Goals Are Created Equal – Evaluating Hockey Players in the NHL Using Q-Learning with a Contextual Reward Function
Linköping University | Department of Computer and Information Science
Master’s thesis, 30 ECTS | Datateknik
2021 | LIU-IDA/LITH-EX-A--21/008--SE

Swedish title: Värdering av hockeyspelare i NHL med hjälp av Q-learning med en kontextuell belöningsfunktion

Jon Vik

Supervisor: Niklas Carlsson
Examiner: Patrick Lambrix

Linköpings universitet, SE–581 83 Linköping, +46 13 28 10 00, www.liu.se

Copyright

The publishers will keep this document online on the Internet, or its possible replacement, for a period of 25 years starting from the date of publication, barring exceptional circumstances.
The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law, the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its home page: http://www.ep.liu.se/.

© Jon Vik

Abstract

Not all goals in the game of ice hockey are created equal: some goals increase the chances of winning more than others. This thesis investigates the result of constructing and using a reward function that takes this fact into consideration, instead of the common binary reward function. The two reward functions are used in a Markov Game model with value iteration. The data used to evaluate the hockey players is play-by-play data from the 2013-2014 season of the National Hockey League (NHL). Overtime events, goalkeepers, and playoff games are excluded from the dataset. This study finds that the constructed reward is, in general, less correlated than the binary reward with three metrics: points, time on ice, and star points. However, an increased correlation was found between the evaluated impact and time on ice for center players. Much of the discussion is devoted to the difficulty of validating the results of a player evaluation due to the lack of ground truth.
One conclusion from this discussion is that future efforts must be made to establish consensus regarding how the success of a hockey player should be defined.

Acknowledgments

I want to thank my supervisor, Niklas Carlsson, and my examiner, Patrick Lambrix, for their unending enthusiasm for this work. I also want to express my gratitude to Nahid Shahmehri for her warm welcome to ADIT. Thank you to Linn Mattson for being the opponent for this thesis after all this time.

This thesis would not have been possible without the help of many people in my life. My endless gratitude goes to Ulla Högstadius and Rickard Wedin. Thank you to Sofia Edshammar. Thank you to my mother Lena, my father Gunnar, my brother Olov, and my partner Rebecca. I will be forever grateful to all of you.

Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Motivation
  1.2 Aim
  1.3 Research Questions
  1.4 Delimitations
  1.5 Research Method
  1.6 Outline
2 Background
  2.1 Hockey Rules
  2.2 Hockey Concepts
3 Theory
  3.1 Markov Game Model
  3.2 Alternating Decision Tree
  3.3 Value Iteration
4 Related Work
  4.1 Related Approaches and Problems
5 Method
  5.1 The Work of Routley and Schulte
  5.2 Data Pre-Processing
  5.3 Reward Function
  5.4 Merging Existing Code Base with the Reward Function
  5.5 Three Stars Metric
6 Results
7 Discussion
  7.1 Results
  7.2 Method
  7.3 The Work in a Wider Context
8 Conclusion
Bibliography
9 Appendix
  9.1 Weighted Metrics Correlations
  9.2 Top 30 Weighted Points
  9.3 Weighted Points versus Traditional Points
  9.4 Top 25 Players

List of Figures

1.1 Goal frequency for each minute of the first three periods in the NHL during the 2013-2014 season.
1.2 CCDF of the time between all goals scored in period one, two and three during the 2013-2014 NHL season.
2.1 Hockey rink with the three different zones. The zone perspective is from the team with the goaltender in the left goal.
5.1 Subset of events in a tree.
5.2 Reward distribution over time and goal difference. Each bin is two minutes long. Less than three observations for each bin are left out.
5.3 Negative reward distribution over time and goal difference. Each bin is two minutes long. Less than three observations for each bin are left out.
5.4 Reward distribution over time and manpower difference. Each bin is two minutes long. Less than three observations for each bin are left out.
5.5 Cumulative distribution function of the reward.
5.6 Venn diagram over the relevant classes concerning the three stars metric.
6.1 Impact vs. points for different skater positions and different reward functions.
6.2 Impact vs. time on ice in hours.
6.3 Impact vs. star points. Players that have never been on the three stars list are excluded.
9.1 Weighted points versus traditional points.

List of Tables

2.1 NHL teams during the 2013-2014 season.
2.2 Top 30 players by star points.
2.3 Conventional metrics and their abbreviations.
2.4 Selection of NHL awards.
5.1 Action events committed by players in the play-by-play table.
5.2 Start and end events in the play-by-play table.
5.3 The implemented definition of the missing metricTable.
6.1 Three stars precision and recall.
6.2 Maximal information-based correlation (MIC) between the two impacts, star points and traditional metrics. Recall from Section 2.2 that GP is games played, P is points, +/- is the plus-minus metric, and TOI is time on ice.
6.3 Selection of recipients and runners-up of seasonal trophy awards with impacts and traditional metrics. The awards are noted with the following symbols: most points *, best first-year player †, best defensive forward ‡, most valuable player of the regular season §, best defenseman ||, most goals ¶.
9.1 Maximal information-based correlation (MIC) between the two impacts, star points, weighted plus minus, weighted points and traditional metrics.
9.2 Weighted points and ranking difference to traditional points.
9.3 Top 25 players using contextual reward.

1 Introduction

This chapter explains why this work is useful. Next, the aim of the thesis is specified together with the research questions, delimitations, and research method. Last, an outline of the chapters is provided.

1.1 Motivation

When evaluating the performance of ice hockey players in the National Hockey League (NHL), it is most common to use metrics like the number of goals and assists over a season¹. One weakness of such metrics is that they do not take into consideration the context in which a goal was scored. For instance, a goal scored by a team that already leads 9–2 at the end of the game is most likely not crucial for winning. In contrast, a goal scored when the score is 2–2 with fifteen seconds left of the game matters more for winning. In addition, goals are not evenly distributed over time. This becomes apparent when looking at all goals scored during regular time in Figure 1.1. It appears that fewer goals are scored at the beginning of each period and more goals are scored towards the end of the first and third periods.

Figure 1.1: Goal frequency for each minute of the first three periods in the NHL during the 2013-2014 season.

Figure 1.2: CCDF of the time between all goals scored in period one, two and three during the 2013-2014 NHL season.

¹ https://www.eliteprospects.com/league/nhl

To further illustrate this, the complementary cumulative distribution function (CCDF) of the time between goals is shown in Figure 1.2.
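
As a rough illustration of how distributions like those in Figures 1.1 and 1.2 could be computed, the following Python sketch counts goals per minute of a period and builds an empirical CCDF of the time between consecutive goals. It is not the code used in this thesis: the function names, the input format (goal times in seconds), and the sample values are assumptions made for the example.

```python
from bisect import bisect_right
from collections import Counter

PERIOD_SECONDS = 20 * 60  # a regulation NHL period lasts 20 minutes


def goals_per_minute(seconds_into_period):
    """Count goals in each of the 20 minutes of a period.

    seconds_into_period: hypothetical input, the seconds elapsed in the
    period at the moment each goal was scored.
    """
    minutes = PERIOD_SECONDS // 60
    # A goal at exactly 20:00 is clamped into the last minute.
    counts = Counter(min(int(t) // 60, minutes - 1) for t in seconds_into_period)
    return [counts.get(m, 0) for m in range(minutes)]


def inter_goal_times(goal_times):
    """Time between consecutive goals, given goal times on one shared clock."""
    ordered = sorted(goal_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def empirical_ccdf(samples):
    """Return (xs, ps) where ps[i] is the fraction of samples greater than xs[i]."""
    ordered = sorted(samples)
    n = len(ordered)
    xs = sorted(set(ordered))
    ps = [(n - bisect_right(ordered, x)) / n for x in xs]
    return xs, ps


# Made-up goal times in seconds, not taken from the NHL data set.
gaps = inter_goal_times([100, 400, 1000, 1900])
xs, ps = empirical_ccdf(gaps)
print(gaps)
print(list(zip(xs, ps)))
print(goals_per_minute([30, 65, 70, 1199]))
```

The CCDF value at x is the share of inter-goal gaps that are longer than x, so the curve starts near one and falls to zero at the longest observed gap.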