tWAR: Introducing a Method to Actually Calculate Wins Above Replacement
Daniel J. Eck
March 21, 2019

1 Introduction

Wins above replacement (WAR) is meant to be a one-number summary of the total contribution made by a player for his team in any particular season. As stated by Steve Slowinski of Fangraphs, WAR offers an estimate to answer the question, "If this player got injured and their team had to replace them with a freely available player of lower quality from their bench, how much value would the team be losing," where this value is expressed in number of wins [Slowinski, 2010]. That being said, nobody actually calculates WAR in a manner that properly answers the above question as posed. This is not by any explicit fault of the metric or of those who calculate it. One problem is that it is impossible to simultaneously quantify the value of a player when the player is available and the value of a replacement for that player when the player is unavailable. The player in question is either available to play or unavailable to play, never both. Instead of confronting the problems raised in this factual-counterfactual world, people have attempted to calculate a hypothetical replacement player, against which every player is implicitly compared, using the machinery of a proprietary black box [Baumer et al., 2015]. Three widely used versions of WAR that are calculated in this manner are Baseball Reference's bWAR [Reference, 2010], Fangraphs's fWAR [Slowinski, 2010], and Baseball Prospectus's bWARP [Prospectus, 2019].

Through the incorporation of ideas from causal inference, we propose methodology to directly estimate wins above replacement. Our estimate of WAR confronts the difficulty of the factual-counterfactual real world posed in Slowinski's quote above. Note that there are numerous examples of seasons in which a player is available and unavailable for a substantial amount of time.
When this is so, we can directly compare how the team performs when the player is available to how the team performs when the player is unavailable. This framework allows for a direct estimation of the wins that a player adds above a replacement player. This direct estimator is relatively simple to compute, available, and easy to understand, and its interpretation is flexible to the narrative of a season.

We will refer to this direct calculation of WAR as tWAR, short for team WAR, that is, the direct calculation of team wins above replacement. The tWAR estimator has the potential to yield a much more natural and appropriate estimate of WAR than those which involve the calculation of a hypothetical replacement player via black box methodology. The validity of the simplest estimate of tWAR depends on the team and the competition faced by the team being similar during both player states. Justifiable extensions of tWAR to more complicated settings are proposed. Estimates of cross-team wins above replacement and other "above replacement" metrics are also proposed.

We primarily focus on the 2014 Yadier Molina season to show the discrepancies between conventional calculations of WAR and tWAR. Our version of WAR gives much more value to Yadier Molina's 2014 season than conventional versions do. This result is far from surprising. Many note that conventional versions of WAR do not properly account for leadership, game management, pitch framing, and catcher defense, which are all aspects of baseball at which Molina excels [Fagan, 2015, Posnanski, 2015, Schwarz, 2015, Fleming, 2017, Womack, 2017]. That being said, a tangible numeric value of the additional Cardinals wins attributable to Molina as a result of these intangible traits has not existed until now. We caution against generalizing our findings beyond the 2014 season with certainty, but we hope that the point is taken and can be used to strengthen Molina's case for the Hall of Fame.
The point is that conventional versions of WAR have likely underestimated the number of Cardinals wins attributable to Yadier Molina by a substantial amount.

Additional analyses are provided for Miguel Cabrera's 2015 season with the Detroit Tigers and Mike Trout's 2017 season with the Los Angeles Angels. These specific players are chosen because the 2012 most valuable player (MVP) race between them is symbolic of the fight between those who favor new sophisticated analytics to value a player's production and those who favor traditional analytics to value a player's production. As noted in Baumer et al. [2015], sabermetricians from the new school advocated strongly for Trout while those who preferred traditional statistics advocated strongly for Cabrera. To the adherents of sabermetrics, the decision for who should win the 2012 MVP award was clear: point estimates showed Trout leading Cabrera by 3.2 fWAR and 3.6 bWAR. The openWAR metric in Baumer et al. [2015] provided far more sophistication to this debate. According to openWAR, the estimated difference between Trout and Cabrera is only 1.05 WAR in Trout's favor. Moreover, there is substantial overlap of the interval estimates of Trout's and Cabrera's openWAR.

We do not provide tWAR estimates for these players in 2012 because neither player missed a significant portion of the 2012 season, which voids comparisons to a suitable replacement player under the tWAR framework. However, Cabrera in 2015 and Trout in 2017 were archetypically similar players to their 2012 selves. Our tWAR estimates for 2015 Miguel Cabrera and 2017 Mike Trout give the opposite impression from the one that conventional WAR and openWAR give of these players' respective values to their teams. In 2015, the Detroit Tigers were far worse when Miguel Cabrera did not play or was injured. However, the 2017 Los Angeles Angels were not terribly hindered by the absence of Mike Trout's production.
These findings are striking (especially for Trout), and they come with natural caveats arising from the context of those seasons. These caveats are explored.

2 Mathematical Details

We cast the estimation of WAR as a causal inference problem. Causal inference details closely follow the presentation in Aronow and Miller [2019, Chapter 7]. Let $Y_i(0)$ denote the binary potential outcome corresponding to a team win (1 if the team wins, 0 if the team loses) in game $i$ when the player is unavailable for play. Let $Y_i(1)$ denote the binary potential outcome corresponding to a team win in game $i$ when the player is available to play. Note here that availability includes situations where the player is healthy but does not enter the game. Let $D_i$ denote the availability status of the player in game $i$ (1 if available, 0 if unavailable) and define the binary observed outcome,

\[ Y_i = Y_i(1) D_i + Y_i(0)(1 - D_i), \]

as in the potential outcome model of Aronow and Miller [2019, Definition 7.1.1]. Under this potential outcome model, we denote the average treatment effect (ATE) for game $i$ as

\[ E[\tau_i] = E[Y_i(1) - Y_i(0)] = E[Y_i(1)] - E[Y_i(0)]. \]

Under the assumption of random assignment [Aronow and Miller, 2019, Definition 7.1.9] we can conclude that the ATE for game $i$ is point identified and

\[ E[\tau_i] = E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0], \]

as seen in Aronow and Miller [2019, Theorem 7.1.10]. We estimate the ATE with the difference in sample means estimator,

\[ \widehat{E}[\tau_i] = \widehat{E}[Y_i(1)] - \widehat{E}[Y_i(0)] = \frac{\sum_{i=1}^{n} Y_i D_i}{\sum_{i=1}^{n} D_i} - \frac{\sum_{i=1}^{n} Y_i (1 - D_i)}{\sum_{i=1}^{n} (1 - D_i)}. \]

In the context of our baseball application, we respectively denote

\[ \hat{p}_{\text{avail}} = \widehat{E}[Y_i(1)] = \frac{\sum_{i=1}^{n} Y_i D_i}{\sum_{i=1}^{n} D_i}, \qquad \hat{p}_{\text{unavail}} = \widehat{E}[Y_i(0)] = \frac{\sum_{i=1}^{n} Y_i (1 - D_i)}{\sum_{i=1}^{n} (1 - D_i)}, \]

as the estimated probability of team victory when the player is available and the estimated probability of team victory when the player is unavailable. In our baseball application, the estimated ATE is therefore a difference of estimated team victory proportions taken across the availability status of the player.
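The difference in sample means estimator above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's own code: the function name `difference_in_means` and the eight-game win/availability record are hypothetical and are not taken from any real season.

```python
from typing import Sequence, Tuple


def difference_in_means(
    wins: Sequence[int], available: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (p_avail, p_unavail, ate_hat) from game-level records.

    wins[i] is Y_i (1 if the team won game i, else 0).
    available[i] is D_i (1 if the player was available for game i, else 0).
    """
    if len(wins) != len(available):
        raise ValueError("wins and available must have the same length")

    n_avail = sum(available)                 # sum of D_i
    n_unavail = len(available) - n_avail     # sum of (1 - D_i)
    if n_avail == 0 or n_unavail == 0:
        # The estimator is undefined unless both player states are observed.
        raise ValueError("need games in both availability states")

    wins_avail = sum(y * d for y, d in zip(wins, available))
    wins_unavail = sum(y * (1 - d) for y, d in zip(wins, available))

    p_avail = wins_avail / n_avail
    p_unavail = wins_unavail / n_unavail
    return p_avail, p_unavail, p_avail - p_unavail


# Hypothetical record: 4 games with the player available, 4 without.
toy_wins = [1, 1, 0, 1, 0, 0, 1, 0]
toy_available = [1, 1, 1, 1, 0, 0, 0, 0]
p_avail, p_unavail, ate_hat = difference_in_means(toy_wins, toy_available)
print(p_avail, p_unavail, ate_hat)
```

In this toy record the team wins 3 of 4 games with the player available and 1 of 4 without, so the estimated ATE is the difference of those two proportions. The function raises an error when one of the two player states is never observed, since the corresponding sample mean would have a zero denominator.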
Our resulting estimator of wins above replacement, conditional on games that the player was available and assuming random assignment, is then

\[ \widehat{\mathrm{tWAR}}_{\text{avail-unavail}} = (\hat{p}_{\text{avail}} - \hat{p}_{\text{unavail}}) \times G, \]

where $G$ is the number of games that the player was available. We can similarly define wins above replacement estimators for comparing team performance when the player was on the field of play vs. when the player was not on the field of play (denoted $\widehat{\mathrm{tWAR}}_{\text{on-off}}$) and when the player was on the field of play vs. when the player was unavailable (denoted $\widehat{\mathrm{tWAR}}_{\text{on-unavail}}$).

We conditioned on the number of games in which the player was available and we assumed that random assignment of injury status holds in order for $\widehat{\mathrm{tWAR}}_{\text{avail-unavail}}$, $\widehat{\mathrm{tWAR}}_{\text{on-off}}$, and $\widehat{\mathrm{tWAR}}_{\text{on-unavail}}$ to be useful estimators of an identifiable number of wins above replacement. The assumption of random assignment is equivalent to the strength of schedule and team composition being the same when the player is available and when the player is not available.

When the assumption of random assignment does not hold and we have additional information on each unit, we can proceed to estimate the ATE under strong ignorability [Aronow and Miller, 2019, Definition 7.1.12]. Strong ignorability is equivalent to assuming random assignment conditional on additional covariates. Under strong ignorability of $D_i$ conditional on $X_i$, the ATE is point identified and is written as

\[ E[\tau_i] = \sum_{x} E[Y_i \mid D_i = 1, X_i = x] \, P(X_i = x) - \sum_{x} E[Y_i \mid D_i = 0, X_i = x] \, P(X_i = x). \]

With discrete covariates, we can estimate the ATE using plug-in as in Aronow and Miller [2019, Section 7.2.1].
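The plug-in idea for a single discrete covariate, followed by the scaling by $G$, might look like the following Python sketch. It is an illustration under stated assumptions rather than the paper's implementation: the function names `plug_in_ate` and `twar`, the home/away covariate, and the eight-game record are all hypothetical. The plug-in replaces each conditional expectation with a within-stratum win proportion and each $P(X_i = x)$ with the stratum's share of games.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def plug_in_ate(
    wins: Sequence[int],
    available: Sequence[int],
    strata: Sequence[Hashable],
) -> float:
    """Plug-in ATE estimate with one discrete covariate X_i (the stratum)."""
    if not (len(wins) == len(available) == len(strata)):
        raise ValueError("all inputs must have the same length")
    n = len(wins)

    # cells[x][d] = [number of games, number of wins] for stratum x, status d.
    cells = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
    for y, d, x in zip(wins, available, strata):
        cells[x][d][0] += 1
        cells[x][d][1] += y

    ate_hat = 0.0
    for x, by_status in cells.items():
        n0, w0 = by_status[0]
        n1, w1 = by_status[1]
        if n0 == 0 or n1 == 0:
            # Both player states must be observed within every stratum.
            raise ValueError(f"stratum {x!r} lacks games in one player state")
        weight = (n0 + n1) / n          # empirical P(X_i = x)
        ate_hat += weight * (w1 / n1 - w0 / n0)
    return ate_hat


def twar(ate_hat: float, games_available: int) -> float:
    """Scale an estimated ATE by G, the number of games the player was available."""
    return ate_hat * games_available


# Hypothetical record: four home games and four away games.
toy_wins = [1, 1, 0, 1, 1, 0, 0, 0]
toy_available = [1, 1, 0, 0, 1, 1, 0, 0]
toy_strata = ["home"] * 4 + ["away"] * 4

ate_hat = plug_in_ate(toy_wins, toy_available, toy_strata)
print(ate_hat, twar(ate_hat, sum(toy_available)))
```

In this toy record the within-stratum differences in win proportion are 0.5 at home and 0.5 away, each stratum holds half of the games, and the player was available for 4 games. The sketch raises an error when a stratum has no games in one player state, because the corresponding conditional mean cannot be estimated from the data.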