An Elo-Based Approach to Model Team Players and Predict the Outcome of Games

Total Page:16

File Type:pdf, Size:1020Kb

An Elo-Based Approach to Model Team Players and Predict the Outcome of Games CORE Metadata, citation and similar papers at core.ac.uk Provided by Texas A&M Repository AN ELO-BASED APPROACH TO MODEL TEAM PLAYERS AND PREDICT THE OUTCOME OF GAMES A Thesis by RICHARD EINSTEIN DOSS MARVELDOSS Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE Chair of Committee, Jean Francios Chamberland Committee Members, Gregory H. Huff Thomas R. Ioerger Head of Department, Scott Miller August 2018 Major Subject: Computer Engineering Copyright 2018 Richard Einstein Doss Marveldoss ABSTRACT Sports data analytics has become a popular research area in recent years, with the advent of dif- ferent ways to capture information about a game or a player. Different statistical metrics have been created to quantify the performance of a player/team. A popular application of sport data anaytics is to generate a rating system for all the team/players involved in a tournament. The resulting rating system can be used to predict the outcome of future games, assess player performances, or come up with a tournament brackets. A popular rating system is the Elo rating system. It started as a rating system for chess tour- naments. It’s known for its simple yet elegant way to assign a rating to a particular individual. Over the last decade, several variations of the original Elo rating system have come into existence, collectively know as Elo-based rating systems. This has been applied in a variety of sports like baseball, basketball, football, etc. In this thesis, an Elo-based approach is employed to model an individual basketball player strength based on the plus-minus score of the player. The plus-minus score is a powerful metric because it quantifies the contribution of a player like good defense, setting up screens, or sledging the opposite team, which are not reflected by metrics that are pri- marily based on points. Then, the individual player ratings are combined to obtain a team rating, Team rating are compared pairwise to obtain the probability of a win by each of the teams during a matchup. This method not only predicts wins/losses, but offers more information than the Elo rating system as ratings are assigned to each individual player instead of just considering teams. This information includes for example, the effect of mid-season transfers or the impact of injuries to team strengths; these items are overlooked by the standard Elo algorithm. The performance of the proposed Elo-based rating system is compared to that of the standard Elo rating system for basketball by using sythetic data. The rating systems are also compared by running them over real-life data from past NBA seasons. ii DEDICATION This dissertation is dedicated to my family, professors and friends for their guidance and encouragement which made this work possible. iii ACKNOWLEDGMENTS I would like to thank Dr. Jean Francois Chamberland for his support and guidance throughout the course of my thesis. This phase has helped me understand what research really is and I have learnt life-long lessons. I would also like to thank Dr. Gregory Huff and Dr. Thomas Iorger who offered me the freedom to venture into the domain of my interest. iv CONTRIBUTORS AND FUNDING SOURCES Contributors This work was supported by a thesis committee consisting of Professor Jean Francios Cham- berland and Professor Gregory H. Huff of the Department of Electrical and Computer Engineering and Professor Thomas R. Ioerger of the Department of Computer Science. All other work conducted for the thesis was completed by the student independently. v NOMENCLATURE +/- Plus-minus score NBA National Basketball Association NCAA National Collegiate Athletic Association FIDE Fed´ eration´ Internationale des Echecs´ USCF United States Chess Federation ATL Atlanta Hawks BOS Boston Celtics BKN Broklyn Nets CHA Charlotte Hornets CHI Chicago Bulls CLE Cleveland Cavaliers DAL Dallas Mavericks DEN Denver Nuggets DET Detroit Pistons GSW Golden State Warriors HOU Houston Rockets IND Indiana Pacers LAC Los Angeles Clippers LAL Los Angeles Lakers MEM Memphis Grizzlies MIA Miami Heats MIL Millwakee Bucks vi MIN Minnesota Timberwolves NOP New Orleans Pelicans NYK New York Knicks OKC Okhalohoma Thunder ORL Orlando Magic PHI Philadelphia 76ers PHX Phoneix Suns POR Portland Trailblazers SAC Sacramento Kings SAS San Antonio Spurs TOR Toronto Raptors UTA Utah Jazz WAS Washington Wizards vii TABLE OF CONTENTS Page ABSTRACT ......................................................................................... ii DEDICATION....................................................................................... iii ACKNOWLEDGMENTS .......................................................................... iv CONTRIBUTORS AND FUNDING SOURCES ................................................. v NOMENCLATURE ................................................................................. vi TABLE OF CONTENTS ........................................................................... viii LIST OF FIGURES ................................................................................. x LIST OF TABLES................................................................................... xii 1. INTRODUCTION AND LITERATURE REVIEW ........................................... 1 1.1 Rating system ............................................................................. 1 1.1.1 Rating system for basketball tournaments....................................... 1 1.2 Elo algorithm .............................................................................. 3 1.3 Mathematical Model ...................................................................... 3 1.3.1 K-factor ........................................................................... 4 1.3.2 Actual Score (Sij) ................................................................ 4 1.3.3 Expected Score (µij) ............................................................. 4 1.4 Home advantage........................................................................... 7 1.5 Update Algorithm ......................................................................... 8 2. PROPOSED HYPOTHESIS .................................................................... 9 2.1 Motivation ................................................................................. 9 2.1.1 Plus-Minus Score (+/-) ........................................................... 9 2.2 Elo-Based Approach ...................................................................... 10 2.2.1 Strength of a Player .............................................................. 11 2.2.2 Expected Value ................................................................... 11 2.2.3 K-factor ........................................................................... 12 2.2.4 F(x)................................................................................ 13 2.2.5 Point Conservation ............................................................... 13 2.2.6 Outcome of a game ............................................................... 16 2.3 Update Algorithm ......................................................................... 17 viii 3. DATASET ....................................................................................... 18 3.1 Real-life Data.............................................................................. 18 3.1.1 Data Scrapping ................................................................... 18 3.2 Synthetic Data ............................................................................. 18 3.2.1 Initialization ...................................................................... 20 3.2.2 Metric Calculation................................................................ 21 4. NUMERICAL SIMULATIONS................................................................ 22 4.1 Synthetic Data ............................................................................. 22 4.1.1 Elo Algorithm .................................................................... 22 4.1.2 Proposed Algorithm .............................................................. 22 4.2 Real-life Data.............................................................................. 23 4.2.1 Prediction Discrepancy........................................................... 24 4.2.2 Testing on the NBA 2017-18 season ............................................ 26 4.2.3 Testing using bracket scores ..................................................... 27 4.2.3.1 March Madness NCAA brackets ..................................... 28 4.2.3.2 Bracket score prediction using the standard Elo algorithm.......... 30 4.2.3.3 Bracket score prediction using the proposed algorithm ............. 30 4.3 Player Performance ....................................................................... 30 5. SUMMARY AND CONCLUSIONS .......................................................... 36 REFERENCES ...................................................................................... 37 ix LIST OF FIGURES FIGURE Page 1.1 Example of the statistics collected for an NBA game, reprinted from [1]. ............. 2 1.2 This figure is the distribution of player strengths with player 1 having an average rating of 1500; and player 2, a rating 1900............................................... 5 1.3 This figure represents the distribution of difference in player strengths with player 1 having an Elo rating of 1500 and player 2 with an Elo rating 1900 respectively. .... 6 1.4 Comparsion
Recommended publications
  • Here Should That It Was Not a Cheap Check Or Spite Check Be Better Collaboration Amongst the Federations to Gain Seconds
    Chess Contents Founding Editor: B.H. Wood, OBE. M.Sc † Executive Editor: Malcolm Pein Editorial....................................................................................................................4 Editors: Richard Palliser, Matt Read Malcolm Pein on the latest developments in the game Associate Editor: John Saunders Subscriptions Manager: Paul Harrington 60 Seconds with...Chris Ross.........................................................................7 Twitter: @CHESS_Magazine We catch up with Britain’s leading visually-impaired player Twitter: @TelegraphChess - Malcolm Pein The King of Wijk...................................................................................................8 Website: www.chess.co.uk Yochanan Afek watched Magnus Carlsen win Wijk for a sixth time Subscription Rates: United Kingdom A Good Start......................................................................................................18 1 year (12 issues) £49.95 Gawain Jones was pleased to begin well at Wijk aan Zee 2 year (24 issues) £89.95 3 year (36 issues) £125 How Good is Your Chess?..............................................................................20 Europe Daniel King pays tribute to the legendary Viswanathan Anand 1 year (12 issues) £60 All Tied Up on the Rock..................................................................................24 2 year (24 issues) £112.50 3 year (36 issues) £165 John Saunders had to work hard, but once again enjoyed Gibraltar USA & Canada Rocking the Rock ..............................................................................................26
    [Show full text]
  • The USCF Elo Rating System by Steven Craig Miller
    The USCF Elo Rating System By Steven Craig Miller The United States Chess Federation’s Elo rating system assigns to every tournament chess player a numerical rating based on his or her past performance in USCF rated tournaments. A rating is a number between 0 and 3000, which estimates the player’s chess ability. The Elo system took the number 2000 as the upper level for the strong amateur or club player and arranged the other categories above and below, as shown in the table below. Mr. Miller’s USCF rating is slightly World Championship Contenders 2600 above 2000. He will need to in- Most Grandmasters & International Masters crease his rating by 200 points (or 2400 so) if he is to ever reach the level Most National Masters of chess “master.” Nonetheless, 2200 his meager 2000 rating puts him in Candidate Masters / “Experts” the top 6% of adult USCF mem- 2000 bers. Amateurs Class A / Category 1 1800 In the year 2002, the average rating Amateurs Class B / Category 2 for adult USCF members was 1429 1600 (the average adult rating usually Amateurs Class C / Category 3 fluctuates somewhere around 1400 1500). The average rating for scho- Amateurs Class D / Category 4 lastic USCF members was 608. 1200 Novices Class E / Category 5 Most USCF scholastic chess play- 1000 ers will have a rating between 100 Novices Class F / Category 6 and 1200, although it is not un- 800 common for a few of the better Novices Class G / Category 7 players to have even higher ratings. 600 Novices Class H / Category 8 Ratings can be provisional or estab- 400 lished.
    [Show full text]
  • Elo World, a Framework for Benchmarking Weak Chess Engines
    Elo World, a framework for with a rating of 2000). If the true outcome (of e.g. a benchmarking weak chess tournament) doesn’t match the expected outcome, then both player’s scores are adjusted towards values engines that would have produced the expected result. Over time, scores thus become a more accurate reflection DR. TOM MURPHY VII PH.D. of players’ skill, while also allowing for players to change skill level. This system is carefully described CCS Concepts: • Evaluation methodologies → Tour- elsewhere, so we can just leave it at that. naments; • Chess → Being bad at it; The players need not be human, and in fact this can facilitate running many games and thereby getting Additional Key Words and Phrases: pawn, horse, bishop, arbitrarily accurate ratings. castle, queen, king The problem this paper addresses is that basically ACH Reference Format: all chess tournaments (whether with humans or com- Dr. Tom Murphy VII Ph.D.. 2019. Elo World, a framework puters or both) are between players who know how for benchmarking weak chess engines. 1, 1 (March 2019), to play chess, are interested in winning their games, 13 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn and have some reasonable level of skill. This makes 1 INTRODUCTION it hard to give a rating to weak players: They just lose every single game and so tend towards a rating Fiddly bits aside, it is a solved problem to maintain of −∞.1 Even if other comparatively weak players a numeric skill rating of players for some game (for existed to participate in the tournament and occasion- example chess, but also sports, e-sports, probably ally lose to the player under study, it may still be dif- also z-sports if that’s a thing).
    [Show full text]
  • When NBA Teams Don't Want To
    GAMES TO LOSE When NBA teams don’t want to win Team X Stefano Bertani Federico Fabbri Jorge Machado Scott Shapiro MBA 211 Game Theory, Spring 2010 Games to Lose – MBA 211 Game Theory Games to lose – When NBA teams don’t want to win 1. Introduction ................................................................................................................................................. 3 1.1 Situation ................................................................................................................................................ 3 1.2 NBA Structure ........................................................................................................................................ 3 1.3 NBA Playoff Seeding ............................................................................................................................... 4 1.4 NBA Playoff Tournament ........................................................................................................................ 4 1.5 Home Court Advantage .......................................................................................................................... 5 1.6 Structure of the paper ............................................................................................................................ 5 2. Situation analysis ......................................................................................................................................... 6 2.1 Scenario analysis ...................................................................................................................................
    [Show full text]
  • A Comparison of Rating Systems for Competitive Women's Beach Volleyball
    Statistica Applicata - Italian Journal of Applied Statistics Vol. 30 (2) 233 doi.org/10.26398/IJAS.0030-010 A COMPARISON OF RATING SYSTEMS FOR COMPETITIVE WOMEN’S BEACH VOLLEYBALL Mark E. Glickman1 Department of Statistics, Harvard University, Cambridge, MA, USA Jonathan Hennessy Google, Mountain View, CA, USA Alister Bent Trillium Trading, New York, NY, USA Abstract Women’s beach volleyball became an official Olympic sport in 1996 and continues to attract the participation of amateur and professional female athletes. The most well-known ranking system for women’s beach volleyball is a non-probabilistic method used by the Fédération Internationale de Volleyball (FIVB) in which points are accumulated based on results in designated competitions. This system produces rankings which, in part, determine qualification to elite events including the Olympics. We investigated the application of several alternative probabilistic rating systems for head-to-head games as an approach to ranking women’s beach volleyball teams. These include the Elo (1978) system, the Glicko (Glickman, 1999) and Glicko-2 (Glickman, 2001) systems, and the Stephenson (Stephenson and Sonas, 2016) system, all of which have close connections to the Bradley- Terry (Bradley and Terry, 1952) model for paired comparisons. Based on the full set of FIVB volleyball competition results over the years 2007-2014, we optimized the parameters for these rating systems based on a predictive validation approach. The probabilistic rating systems produce 2014 end-of-year rankings that lack consistency with the FIVB 2014 rankings. Based on the 2014 rankings for both probabilistic and FIVB systems, we found that match results in 2015 were less predictable using the FIVB system compared to any of the probabilistic systems.
    [Show full text]
  • Bayesian Analysis of Home Advantage in North American Professional Sports Before and During COVID‑19 Nico Higgs & Ian Stavness*
    www.nature.com/scientificreports OPEN Bayesian analysis of home advantage in North American professional sports before and during COVID‑19 Nico Higgs & Ian Stavness* Home advantage in professional sports is a widely accepted phenomenon despite the lack of any controlled experiments at the professional level. The return to play of professional sports during the COVID‑19 pandemic presents a unique opportunity to analyze the hypothesized efect of home advantage in neutral settings. While recent work has examined the efect of COVID‑19 restrictions on home advantage in European football, comparatively few studies have examined the efect of restrictions in the North American professional sports leagues. In this work, we infer the efect of and changes in home advantage prior to and during COVID‑19 in the professional North American leagues for hockey, basketball, baseball, and American football. We propose a Bayesian multi‑level regression model that infers the efect of home advantage while accounting for relative team strengths. We also demonstrate that the Negative Binomial distribution is the most appropriate likelihood to use in modelling North American sports leagues as they are prone to overdispersion in their points scored. Our model gives strong evidence that home advantage was negatively impacted in the NHL and NBA during their strongly restricted COVID‑19 playofs, while the MLB and NFL showed little to no change during their weakly restricted COVID‑19 seasons. In professional sports, home teams tend to win more on average than visiting teams1–3. Tis phenomenon has been widely studied across several felds including psychology4,5, economics6,7, and statistics8,9 among others10.
    [Show full text]
  • How Much of the NBA Home Court Advantage Is Explained by Rest?
    How Much of the NBA Home Court Advantage Is Explained by Rest? Oliver Entine Dylan Small Department of Statistics, The Wharton School, University of Pennsylvania Home Court Advantage in Different Pro Sports Home Team Winning % Basketball (NBA) 0.608 Football (NFL) 0.581 Hockey (NHL) 0.550 Baseball (Major Leagues) 0.535 Data from NBA 2001-2002 through 2005-2006 seasons; NFL 2001 through 2005 seasons; NHL 1998-1999 through 2002- 2003 seasons; baseball 1991-2002 seasons. Summary: The home advantage in basketball is the biggest of the four major American pro sports. Possible Sources of Home Court Advantage in Basketball • Psychological support of the crowd. • Comfort of being at home, rather than traveling. • Referees give home teams the benefit of the doubt? • Teams are familiar with particulars/eccentricities of their home court. • Different distributions of rest between home and away teams (we will focus on this). Previous Literature • Martin Manley and Dean Oliver have studied how the home court advantage differs between the regular season and the playoffs: They found no evidence of a big difference between the home court advantage in the playoffs vs. regular season. Oliver estimated that the home court advantage is about 1% less in the playoffs. • We focus on the regular season. Distribution of Rest for Home vs. Away Teams Days of Rest Home Team Away Team 0 0.14 0.33 1 0.59 0.47 2 0.18 0.13 3 0.06 0.05 4+ 0.04 0.03 Source: 1999-2000 season. Summary: Away teams are much more likely to play back to back games, and are less likely to have two or more days of rest.
    [Show full text]
  • Elo and Glicko Standardised Rating Systems by Nicolás Wiedersheim Sendagorta Introduction “Manchester United Is a Great Team” Argues My Friend
    Elo and Glicko Standardised Rating Systems by Nicolás Wiedersheim Sendagorta Introduction “Manchester United is a great team” argues my friend. “But it will never be better than Manchester City”. People rate everything, from films and fashion to parks and politicians. It’s usually an effortless task. A pool of judges quantify somethings value, and an average is determined. However, what if it’s value is unknown. In this scenario, economists turn to auction theory or statistical rating systems. “Skill” is one of these cases. The Elo and Glicko rating systems tend to be the go-to when quantifying skill in Boolean games. Elo Rating System Figure 1: Arpad Elo As I started playing competitive chess, I came across the Figure 2: Top Elo Ratings for 2014 FIFA World Cup Elo rating system. It is a method of calculating a player’s relative skill in zero-sum games. Quantifying ability is helpful to determine how much better some players are, as well as for skill-based matchmaking. It was devised in 1960 by Arpad Elo, a professor of Physics and chess master. The International Chess Federation would soon adopt this rating system to divide the skill of its players. The algorithm’s success in categorising people on ability would start to gain traction in other win-lose games; Association football, NBA, and “esports” being just some of the many examples. Esports are highly competitive videogames. The Elo rating system consists of a formula that can determine your quantified skill based on the outcome of your games. The “skill” trend follows a gaussian bell curve.
    [Show full text]
  • Extensions of the Elo Rating System for Margin of Victory
    Extensions of the Elo Rating System for Margin of Victory MathSport International 2019 - Athens, Greece Stephanie Kovalchik 1 / 46 2 / 46 3 / 46 4 / 46 5 / 46 6 / 46 7 / 46 How Do We Make Match Forecasts? 8 / 46 It Starts with Player Ratings Assume the the ith player has some true ability θi. Models of player abilities assume game outcomes are a function of the difference in abilities Prob(Wij = 1) = F(θi − θj) 9 / 46 Paired Comparison Models Bradley-Terry models are a general class of paired comparison models of latent abilities with a logistic function for win probabilities. 1 F(θi − θj) = 1 + α−(θi−θj) With BT, player abilities are treated as Hxed in time which is unrealistic in most cases. 10 / 46 Bobby Fischer 11 / 46 Fischer's Meteoric Rise 12 / 46 Arpad Elo 13 / 46 In His Own Words 14 / 46 Ability is a Moving Target 15 / 46 Standard Elo Can be broken down into two steps: 1. Estimate (E-Step) 2. Update (U-Step) 16 / 46 Standard Elo E-Step For tth match of player i against player j, the chance that player i wins is estimated as, 1 W^ ijt = 1 + 10−(Rit−Rjt)/σ 17 / 46 Elo Derivation Elo supposed that the ratings of any two competitors were independent and normal with shared standard deviation δ. Given this, he likened the chance of a win to the chance of observing a difference in ratings for ratings drawn from the same distribution, 2 Rit − Rjt ∼ N(0, 2δ ) which leads to, Rit − Rjt P(Rit − Rjt > 0) = Φ( ) √2δ and ≈ 1/(1 + 10−(Rit−Rjt)/2δ) Elo's formula was just a hack for the cumulative normal density function.
    [Show full text]
  • Package 'Playerratings'
    Package ‘PlayerRatings’ March 1, 2020 Version 1.1-0 Date 2020-02-28 Title Dynamic Updating Methods for Player Ratings Estimation Author Alec Stephenson and Jeff Sonas. Maintainer Alec Stephenson <[email protected]> Depends R (>= 3.5.0) Description Implements schemes for estimating player or team skill based on dynamic updating. Implemented methods include Elo, Glicko, Glicko-2 and Stephenson. Contains pdf documentation of a reproducible analysis using approximately two million chess matches. Also contains an Elo based method for multi-player games where the result is a placing or a score. This includes zero-sum games such as poker and mahjong. LazyData yes License GPL-3 NeedsCompilation yes Repository CRAN Date/Publication 2020-03-01 15:50:06 UTC R topics documented: aflodds . .2 elo..............................................3 elom.............................................5 fide .............................................7 glicko . .9 glicko2 . 11 hist.rating . 13 kfide............................................. 14 kgames . 15 krating . 16 kriichi . 17 1 2 aflodds metrics . 17 plot.rating . 19 predict.rating . 20 riichi . 22 steph . 23 Index 26 aflodds Australian Football Game Results and Odds Description The aflodds data frame has 675 rows and 9 variables. It shows the results and betting odds for 675 Australian football games played by 18 teams from 26th March 2009 until 24th June 2012. Usage aflodds Format This data frame contains the following columns: Date A date object showing the date of the game. Week The number of weeks since 25th March 2009. HomeTeam The home team name. AwayTeam The away team name. HomeScore The home team score. AwayScore The home team score. Score A numeric vector giving the value one, zero or one half for a home win, an away win or a draw respectively.
    [Show full text]
  • Intrinsic Chess Ratings 1 Introduction
    CORE Metadata, citation and similar papers at core.ac.uk Provided by Central Archive at the University of Reading Intrinsic Chess Ratings Kenneth W. Regan ∗ Guy McC. Haworth University at Buffalo University of Reading, UK February 12, 2011 Abstract This paper develops and tests formulas for representing playing strength at chess by the quality of moves played, rather than the results of games. Intrinsic quality is estimated via evaluations given by computer chess programs run to high depth, ideally whose playing strength is sufficiently far ahead of the best human players as to be a “relatively omniscient” guide. Several formulas, each having intrinsic skill parameters s for “sensitivity” and c for “competence,” are argued theoretically and tested by regression on large sets of tournament games played by humans of varying strength as measured by the internationally standard Elo rating system. This establishes a correspondence between Elo rating and the parame- ters. A smooth correspondence is shown between statistical results and the century points on the Elo scale, and ratings are shown to have stayed quite constant over time (i.e., little or no “inflation”). By modeling human players of various strengths, the model also enables distributional prediction to detect cheating by getting com- puter advice during games. The theory and empirical results are in principle trans- ferable to other rational-choice settings in which the alternatives have well-defined utilities, but bounded information and complexity constrain the perception of the utilitiy values. 1 Introduction Player strength ratings in chess and other games of strategy are based on the results of games, and are subject to both luck when opponents blunder and drift in the player pool.
    [Show full text]
  • Home Advantage and Tied Games in Soccer
    HOME ADVANTAGE AND TIED GAMES IN SOCCER P.C. van der Kruit 0. Introduction. Home advantage and the occurence of tied matches are an important feature of soccer. In the national competition in the Netherlands about half the games end in a victory for the home team, one quarter end up tied, while only one quarter results in a win for the visiting team. In addition, the number of goals scored per game (one average about 3) is rather low. The home advantage is supposedly evened out between the teams by playing a full competition, where each two teams play two matches with both as home team in turn. I have wondered about the matter of the considerable home advantage in soccer ever since I first became interested in it.1 This was the result of the book “Speel nooit een uitwedstrijd – Topprestaties in sport en management”2 by Pieter Winsemius (1987). In this book he discusses various aspects of business management and illustrates these with facts and anecdotes from sports. In the first chapter he notes that in the German Bundesliga the top teams get their high positions in the standings on the basis of regularly winning away games. Teams on lower positions have often lost only a few more home matches, but it is the away games where they have gained much fewer points. The message is clear: never play an away game! In management terms this lesson means, according to Winsemius, for example that you arrange difficult meetings to take place in your own office. Clearly, home advantage plays an important role in soccer competitions.
    [Show full text]