Predicting Baseball Win/Loss Records from Player Projections

Predicting Baseball Win/Loss Records from Player Projections Connor Daly [email protected] November 29, 2017 1 Introduction When forecasting future results in major league baseball (MLB), there are essentially two sources from which you can derive your predictions: teams and players. How do players perform individually, and how do their collaborative actions coalesce to form a team's results? Several methods of both types exist, but often are shrouded with proprietary formulas. Currently, several mature and highly sophisticated player projection systems are used to forecast season results. None are abundantly transparent about their methodology. Here I set out to develop the simplest possible player-based team projection system and try to add one basic improvement. 2 Predicting Wins and Losses 2.1 Team-Based Projections One approach to such forecasting is to analyze team performance in head to head matchups. A common implementation of this approach is known as an elo rating and prediction system. 1 Elo systems start by assigning teams an average rating. After games are played, winning teams' ratings increase and losing teams' ratings decrease relative to the expected outcome of the matchup. Expected outcomes are determined by the difference in rating between the two teams. If a very good team almost loses to a really bad team, its rating will only increase slightly. If an underdog pulls of an upset, however, it will earn relatively more points. As more games are played, older games become progressively less meaningful. Essentially, this prediction method considers only who you played, what the margin of victory was, and where the match was played (home team advantage is adjusted for). Using Monte Carlo simulations, one can predict the outcomes of individual seasons for each team. In between seasons, teams are regressed towards the mean. For a detailed explanation of a baseball elo model, see FiveThirtyEight [Boi]. The main advantage of this kind of team-based aproach is that it can capture some of the hard to pin down factors that make teams more than just the sum of their parts. Without figuring out what the secret sauce is, this method estimates the sum total contributions of ownership, coaches, team philosophy, and an uncountable number of other factors. The method does have a significant downside, however, in that it can't take advantage of known changes in team dynamics, such as changes in players and coaches. If I know Babe Ruth is leaving the Yankess after a particular season, I probably want to project them differently than I would have otherwise. This model fails to capture that. 2.2 Player-Based Projections Baseball enjoys a unique advantage over other major American sports in that it is sig- nificantly easier to decouple the performance of individual players to determine who was ultimately responsible for creating a certain result. If a batter hits a home run, we can say with a high degree of certainty that the batter and the pitcher combined to cause this event. By looking at the large number of combinations of batter/pitcher matchups, we can gauge the relative skill of each by their performance against a wide variety of opponents. On the 2 other hand, a sport such as football presents significant challenges to gauging the true skill of individual players. Looking at the running game, how can one intelligently and objectively pass out credit and blame? If a running back runs for a seven yard gain on a toss sweep to the right, how much credit should the left guard receive? Decoupling in baseball isn't perfect, but compared to other sports, it's much easier. 2.2.1 Wins Above Replacement A foundational pillar of sabermetrics, the empirical, quantitative analysis of baseball, is the concept of wins above replacement (WAR). Essentially, the idea is that all meaningful baseball statistics must measure how events on the field help or hurt a team's chances of winning in expectation. The way games are won is by teams scoring runs and preventing runs from being scored. Thus, every event can thus be understood in the context of runs allowed or runs created. This idea can be hard to grasp at first. How many runs does a home run create? One? Rather counter-intuitively, the generally accepted value is around 1.4 runs. How is this? Well, not only did the batter score himself, but he will have also batted in any potential runners on base. You must also consider the possibility that had the batter made an out instead of scoring these base runners, following batters could have driven them in. Using real playing data, we can determine the expected run creating or subtracting value of every event in baseball. See Table 1 for a complete breakdown of the run value of such events. By looking at the total contributions of a player over the course of a season, we can sum up the expected run contributions of every event the player caused. Now we need to compare our player against a baseline. A first intuition might be to compare the player to league average. Well, defining league average to be a baseline of zero runs created sells league average players short. A league average player is better than approximately half of the players in the league. That's valuable production! Instead, we scale our player's contribution against the idea of a replacement level player. The production of a replacement level player 3 is intended to be equivalent to the contributions of an infinitely replaceable minimum salary veteran or minor league free agent. For reference, a team of replacement players is defined by Fangraphs to win approximately 48 games over the course of a 162 game season. Using this replacement level, we determine the original player's runs above replacement player. Next, we scale the runs above replacement by the amount of runs per win in an average game. Finally, we scale this calculated WAR to the the number of possible wins in a season, so that the sum of all WAR and replacement runs equals the total number of wins in season. On average, the player's context free stats would have resulted in a team winning an extra number of games corresponding to his WAR than if the same player had been replaced with a replacement level player. There is a finite pool of WAR for all players. When one player performs better, that means less WAR will be allocated to the rest of the players. Unfortunately for the reader, there are several variants of WAR, and all define things slightly differently. Several rely on inexplicably chosen constants or proprietary formulas. The main basis of my calculations relies on Fangraphs WAR, but I did make some alterations, which will be explained later. For more in depth explanations of WAR and its underpinnings, see [Joh85], [Tom06], and [Fanb]. 2.2.2 WAR to Wins By projecting a season's worth of players' expected WAR contributions, we can group players by target year team and take the sum total of their contributions. The combined total of their WAR should be able to help predict the team's actual number of wins. This relationship isn't necessarily one-to-one, as will be discussed in 4.5. This method benefits from being able to track players as they change teams. 4 Table 1: Run Values by Event Event Run Value Event Run Value Home Run 1.397 Balk 0.264 Triple 1.070 Intentional Walk 0.179 Double 0.776 Stolen Base 0.175 Error 0.508 Defensive Indifference 0.120 Single 0.475 Bunt 0.042 Interference 0.392 Sacrifice Bunt -0.096 Hit By Pitch 0.352 Pickoff -0.281 Non-intetional Walk 0.323 Out -0.299 Passed Ball 0.269 Strikeout -0.301 Wild Pitch 0.266 Caught Stealing -0.467 Empirical measurements of the run value of events from 1999 - 2002 season. Data from [Tom06] 3 Projecting Players To create a player projection based season long team projecting system, the first step is to project players. Essentially, you need to look at player's past performance and predict how he will perform in the future. Some methods of doing this are highly sophisticated, others quite simple. Systems like Baseball Prospectus's PECOTA, Dan Szymborski's ZiPS, and Chris Mitchell's KATOH all combine bunches of variables and various calculations to com- pute projected outcomes. PECOTA in particular is based primarily around player similarity scores. Mainly, it uses various metrics to find comparable players for a given to-be-projected player and uses the performance of those comparables to infer a trajectory for the targeted player's future performance. Although its general methodology has been discussed, its spe- cific implementation is proprietary. On the other end of the sophistication system is perhaps the simplest possible projection system: Marcel the Monkey. 3.1 Marcel the Monkey Marcel the Monkey, or simply Marcel, is a player projection system invented by Tom Tango [Tan]. It sets out to be the simplest possible player projection system. Essentially, it takes 5 a weighted average of a player's last three years (5/4/3 for batters and 3/2/1 for pitchers), regresses the player toward the mean by 1200 plate appearances, and applies an aging curve to increase player's skills until age 29 after which point they begin to decline. These projections make no attempt to differentiate for team, league, or position, with the exception that some different constants are used for starting pitchers and relief pitchers. Rather than calculating counting stats such as hits or home runs specifically, Marcel projects rate stats like hits or home runs per plate attempt.

Predicting Baseball Win/Loss Records from Player Projections

A Statistical Study Nicholas Lambrianou 13' Dr. Nicko

Machine Learning Applications in Baseball: a Systematic Literature Review

A Giant Whiff: Why the New CBA Fails Baseball's Smartest Small Market Franchises

Improving the FIP Model

Determining the Value of a Baseball Player

Markov League Baseball

Gonzo Honored to Join Latino Baseball Hall of Fame by Josh Rawitch

Bayesian Statistics and Baseball

SF Giants Press Clips Friday, 21, 2018

Framing the Game Through a Sabermetric Lens: Major League

Major Qualifying Project: Advanced Baseball Statistics

Measuring Pitchers' Performance Using Data Envelopment Analysis