Arxiv:1604.01455V1 [Stat.OT] 6 Apr 2016
Total Page:16
File Type:pdf, Size:1020Kb
Winning Daily Fantasy Sports Hockey Contests Using Integer Programming David Scott Hunter Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139, [email protected] Juan Pablo Vielma Department of Operations Research, Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139, [email protected] Tauhid Zaman Department of Operations Management, Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139, [email protected] We present an integer programming approach to winning daily fantasy sports hockey contests which have top heavy payoff structures (i.e. most of the winnings go to the top ranked entries). Our approach incorporates publicly available predictions on player and team performance into a series of integer programming problems that compute optimal lineups to enter into the contests. We find that the lineups produced by our approach perform well in practice and are even able to come in first place in contests with thousands of entries. We also show through simulations how the profit margin varies as a function of the number of lineups entered and the points distribution of the entered lineups. Studies on real contest data show how different versions of our integer programming approach perform in practice. We find two major insights for winning top heavy contests. First, each lineup entered should be constructed in a way to maximize its individual mean, and more importantly, variance. Second, multiple lineups should be entered and they should be constructed in a way to limit their correlation with each other. Our approach can easily be extended to other sports besides hockey, such as American football and baseball. Key words: sports analytics, daily fantasy sports, statistics, integer programming History: arXiv:1604.01455v1 [stat.OT] 6 Apr 2016 1. Introduction Daily fantasy sports allow users to enter lineups of players in a sport (such as hockey or American football) into daily contests. If the players on a user’s lineup perform well, the user can win a substantial amount of money in the contests. Daily fantasy sports have recently become a huge industry. For instance, DraftKings, the leading company in this area, reported $30 million in revenues and $304 million in entry fees in 2014 (Heitner 2015). While there is ongoing legal debate as to whether or not daily fantasy sports contests are gambling (Board 2015), there is a strong element of skill required to continuously win contests as evidenced by the fact that the top one percent of players pay 40 percent of the entry fees and gain 91 percent of the 1 Hunter, Vielma, and Zaman: Winning 2 profits (Harwell 2015). Daily fantasy sports therefore provide a great opportunity for users to compete and earn profits with analytics. Several sites such as Rotogrinders have been created which solely focus on daily fantasy sports analytics (Rotogrinders 2015). Other sites such as Daily Fantasy Nerd even offer to provide optimized contest lineups for a fee (DailyFantasyNerd 2015). There are a variety of contests in daily fantasy sports with different payoff structures. Some contests distribute all the entry fees equally among all lineups that score above the median score of the contest. These are known as double-up or 50/50 contests. They are relatively easy to win, but do not pay out very much. Other contests give a disproportionately high fraction of the entry fees to the top scoring lineups. We refer to these as top heavy contests. Winning a top heavy contest is much more difficult than winning a double-up or a 50/50 contest, but the payoff is also much greater. Typical entry fees range from $3 to $30, while the potential winnings range from thousands to a million dollars (DraftKings 2015). A daily fantasy sports company sets a price (in virtual dollars) for each player in a sport. These prices vary day to day and are based upon past and predicted performance (Steinberg 2015). Lineups must then be constructed within a fixed budget. For instance, in DraftKings hockey contests, there is a $50,000 budget for each lineup. There are further constraints on the maximum number of each type of player allowed and the number of different teams represented in a lineup. Finding an optimal lineup is challenging because the performance of each player is unknown and there are hundreds of players to choose from. However, predictions for player points can be found on a host of sites such as Rotogrinders (Rotogrinders 2015) or Daily Fantasy Nerd (DailyFantasyNerd 2015) which are fairly accurate, and many fantasy sports players have a good sense about how players will perform. This suggests that the main challenge is not in predicting player performance, but in using predictions and other information to construct optimal lineups. The strategy for constructing lineups depends upon the type of contest being entered. For double-up and 50/50 contests all that is needed is a lineup that beats the median. In this case, one would want a lineup that has a high average predicted score but also low variance. In contrast, top heavy contests provide extreme profits if a lineup can achieve a very high rank. If the predictions were perfect, then this could be done with a single lineup, similar to what one would do for a double-up or 50/50. The problem with this approach is that the predictions are noisy, so one could not consistently achieve a high rank with a single lineup. However, two things can be done to increase the chance of victory. First, one can use lineups which have an inherently large variance, which would increase the chance of the lineup having an extremely good performance and achieving a high rank. Second, one can enter multiple lineups, thereby increasing the chance that at least one of them will rank high. Because of the payoff structure of a top heavy contest, having at least one lineup achieve a high rank will be enough to make a substantial profit. Therefore, a good approach to winning top heavy contests is to enter many lineups with high variance. In this work we use integer programming to win top heavy hockey contests in DraftKings. Our integer programming approach allows us to systematically construct multiple lineups which have high variance. We Hunter, Vielma, and Zaman: Winning 3 November 15, 2015 November 16, 2015 November 17, 2015 November 23, 2015 Figure 1 Screenshots of the performance of our top integer programming lineups in real DraftKings hockey contests with 200 linueps entered. All profits are being donated to charity. chose hockey because we are not familiar with it. This way, any success we have is strictly due to our use of integer programming and not any of our own expert knowledge. The contests we focused on had thousands of entrants and top payoffs of several thousand dollars. During the period when this work was conducted, DraftKings allowed a single user to enter 200 lineups into a contest. We began competing with our integer programming approach and performed very well several days in a row. We even placed third or higher in three consecutive contests. We show screenshots of our performance with 200 lineups in Figure 1. A week after we began winning with our integer programming approach, DraftKings reduced the number of multiple entries to 100. This reduced our performance, but we were still able to perform well in the contests. This demonstrates the power of integer programming for winning daily fantasy sports contests. The remainder of the paper is structured as follows. We begin by reviewing the literature on sports ana- lytics with a particular focus on hockey in Section 1.1. We present an empirical analysis of NHL player performance and the prediction accuracy of sites which forecast player performance in Section 2. We present a simulation study of the multiple lineups strategy in Section 3 which shows how much of an edge a set of lineups needs in order to win top heavy contests. Section 4 presents our integer programming approach for constructing multiple lineups. We evaluate the performance of our integer programming approach in actual DraftKings hockey contests in Section 5. We conclude in Section 6 with a discussion on extending our integer programming approach to other sports. 1.1. Previous Work Much of the extant work in hockey analytics has focused on statistical models of player performance. The traditional measure of a player’s ability is the plus-minus statistic, which measures the number of goals Hunter, Vielma, and Zaman: Winning 4 scored by a player’s team minus the number of goals scored by the opposing team when the player in on the ice. Many modifications have been suggested for this simple metric in order to provide a more clear assessment of a player’s quality. The adjusted minus/plus probability approach of Schuckers et al. (2011) attempts to add in other data such as hits or face-offs. Controlling for team strength by using the team- average plus-minus was proposed in Awad (2009). Other approaches attempt to statistically infer the effects of individual factors on team performance (Thomas et al. 2013). An approach using a regression on goals per hour to assess player performance in proposed in Macdonald (2011). To deal with the large number of features present in the high dimensional regressions that occur for player performance, a regularized logistic regression approach is used in Gramacy et al. (2013). The Added Goal Value metric is presented in Pettigrew (2015) which incorporates information on what are known as power plays in hockey. A methodology known as total hockey rating is proposed in Schuckers and Curro (2013) for performance evaluation of skaters which incorporates data on factors such as zone starts and non-shooting events such as turnovers.