WAR and the Hall of Fame

Joshua Bayzick

May 5, 2015

Abstract

Sabermetrics is a term coined by Bill James, and it is the search for objective knowledge about baseball. This mathematical approach to baseball produced a statistic called Wins Above Replacement (WAR), which takes into account batting, baserunning, and fielding to determine how many ‘wins’ a position player was worth. This project looks at how players’ WAR values change as they age. We compare two groups of players: first-ballot Hall of Famers (HOF) and the average player. We know that the HOF group will have higher WAR values because they were well-above-average players, but the basic question is whether they age differently than the average player.

1 Introduction

A relatively new statistic called WAR (Wins Above Replacement) was developed by sabermetricians to answer questions such as “Who had the better career?” or “Who had the better year?” For example, in 2014 Alex Gordon had a batting average of .266 with 19 home runs and 74 RBI. In 2014 Yoenis Cespedes had a batting average of .260 with 22 home runs and 100 RBI. Based on these conventional statistics, they appear to have had fairly similar years. One might even argue that Cespedes was slightly better because of the RBI advantage. However, these conventional statistics do not measure everything a player does to help his team win a game. WAR, which takes into account batting, baserunning, and fielding, says that in 2014 Alex Gordon was worth 6.6 WAR for his team, while Cespedes was worth 3.3 WAR. Based on this WAR statistic, we can easily conclude that Alex Gordon had the better year overall. This paper shows how parts of the WAR calculation are done and describes other uses of the statistic.

2 WAR Calculation

The sabermetrics website FanGraphs describes the statistic as follows: “WAR offers an estimate to answer the question, ‘If this player got injured, and their [sic] team had to replace them with a freely available minor leaguer or a AAAA player from their bench, how much value would the team be losing?’” The statistic summarizes a player’s contribution to his team’s win total in a single number. One characteristic of WAR is that it is independent of ballpark, league, and season. It is also important to note that WAR is an approximation. For example, a 6 WAR player may be worth between 5 and 7 WAR, but we can certainly say this player performed at a high level. A WAR value of 7 is considered to be an MVP season, according to FanGraphs. Therefore, WAR values tend to be situated between 0 and 7. The table below details WAR values even further. Lastly, WAR values are calculated differently for position players than they are for pitchers. This paper will focus only on position players.

WAR    Level of Play
0–1    Scrub
1–2    Role Player
2–3    Solid Starter
3–4    Good Player
4–5    All Star
5–6    Superstar
6–7    MVP

To determine how many wins a player is worth to his team, we must first find out how many runs that player is worth to his team. For position players, these runs come from three phases of the game: batting, baserunning, and fielding. We add these runs together and then adjust the values based on his position, league, and the value of a replacement player. This produces the total amount of runs this player is worth. Lastly, we divide the total runs by the runs per win to produce wins.

2.1 Batting Runs

To determine how many runs a batter creates by batting, we must first calculate Weighted On Base Average (wOBA). Using wOBA, we can calculate weighted Runs Above Average (wRAA), and then finally we can calculate batting runs. The statistic wOBA is calculated by taking league data and using multiple regression to generate the optimum weight for each hitting event. In other words, wOBA is designed to generate the optimum relationship between the hitting events and the number of runs scored. The formula for wOBA is

\[ \mathrm{wOBA} = \frac{w_1 \cdot \mathrm{uBB} + w_2 \cdot \mathrm{HBP} + w_3 \cdot \mathrm{1B} + w_4 \cdot \mathrm{2B} + w_5 \cdot \mathrm{3B} + w_6 \cdot \mathrm{HR}}{\mathrm{AB} + \mathrm{BB} - \mathrm{IBB} + \mathrm{SF} + \mathrm{HBP}} \]

where \(w_1, w_2, \ldots, w_6\) are the weights of each hitting event, uBB represents unintentional walks, HBP represents the number of times a batter was hit by a pitch, and 1B, 2B, 3B, and HR represent singles, doubles, triples, and home runs respectively. These weights change every year, but in 2013 we have

\[ \mathrm{wOBA} = \frac{0.690 \cdot \mathrm{uBB} + 0.722 \cdot \mathrm{HBP} + 0.888 \cdot \mathrm{1B} + 1.271 \cdot \mathrm{2B} + 1.616 \cdot \mathrm{3B} + 2.101 \cdot \mathrm{HR}}{\mathrm{AB} + \mathrm{BB} - \mathrm{IBB} + \mathrm{SF} + \mathrm{HBP}} \]

After using a player’s statistics to calculate his wOBA, we can calculate wRAA. The formula for wRAA is

\[ \mathrm{wRAA} = \frac{\mathrm{wOBA} - \mathrm{lg\,wOBA}}{\mathrm{wOBA\ scale}} \cdot \mathrm{PA} \]

This formula finds the difference between a player’s wOBA and the league’s (lg wOBA), divides by the wOBA scale to put that difference on a runs-per-plate-appearance footing, and then multiplies by the player’s plate appearances (PA).
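The wOBA and wRAA formulas above can be sketched in a few lines of Python. This is only an illustration: the 2013 weights are the ones quoted in the text, but the stat line, the league wOBA, and the wOBA scale below are made-up values chosen for the example, not real figures.

```python
# 2013 linear weights as quoted in the text.
WEIGHTS_2013 = {"uBB": 0.690, "HBP": 0.722, "1B": 0.888,
                "2B": 1.271, "3B": 1.616, "HR": 2.101}


def woba(stats, weights=WEIGHTS_2013):
    """Weighted On Base Average from a dict of counting stats."""
    ubb = stats["BB"] - stats["IBB"]  # unintentional walks
    numerator = (weights["uBB"] * ubb
                 + weights["HBP"] * stats["HBP"]
                 + weights["1B"] * stats["1B"]
                 + weights["2B"] * stats["2B"]
                 + weights["3B"] * stats["3B"]
                 + weights["HR"] * stats["HR"])
    denominator = (stats["AB"] + stats["BB"] - stats["IBB"]
                   + stats["SF"] + stats["HBP"])
    return numerator / denominator


def wraa(player_woba, league_woba, woba_scale, pa):
    """Weighted Runs Above Average for a given number of plate appearances."""
    return (player_woba - league_woba) / woba_scale * pa


# Hypothetical stat line (not a real player's season).
season = {"AB": 550, "BB": 60, "IBB": 5, "HBP": 5, "SF": 5,
          "1B": 100, "2B": 30, "3B": 3, "HR": 20}
# Assumed league constants, for illustration only.
LEAGUE_WOBA = 0.315
WOBA_SCALE = 1.25
PLATE_APPEARANCES = 620  # AB + BB + HBP + SF for the line above

player_woba = woba(season)
player_wraa = wraa(player_woba, LEAGUE_WOBA, WOBA_SCALE, PLATE_APPEARANCES)
print(round(player_woba, 3), round(player_wraa, 1))
```

A player whose wOBA equals the league wOBA gets a wRAA of zero, which is what "above average" means here.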

Finally, we can calculate batting runs (BR) by using the formula

\[ \mathrm{BR} = \mathrm{wRAA} + \left( \frac{\mathrm{league\ R}}{\mathrm{league\ PA}} - \frac{\mathrm{PF} \cdot \mathrm{league\ R}}{\mathrm{league\ PA}} \right) \cdot \mathrm{PA} + \left( \frac{\mathrm{league\ R}}{\mathrm{league\ PA}} - \frac{\mathrm{AL/NL\ nonpitcher\ wRC}}{\mathrm{AL/NL\ PA}} \right) \cdot \mathrm{PA} \]

where PF is the park factor of the player’s home field and wRC is weighted Runs Created. Essentially this formula takes wRAA and then adjusts that value for the park effect of a player’s home field, and also for the league. The park adjustment is necessary because baseball fields have different dimensions and characteristics. Some parks suppress runs and others encourage runs. This adjustment makes sure that no player has an advantage based on the park he played in. The last term simply balances the effect of the league. For example, the American League (AL) usually produces more runs since there is a Designated Hitter to bat for the pitcher. Thus, more runs are scored in the AL and it takes more runs to account for a win.
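As a rough sketch of how the park and league terms in the batting runs formula behave, the function below follows the formula term by term. All numeric inputs are hypothetical, and the parameter names (`park_factor`, `sub_league_wrc`, and so on) are chosen here for readability rather than taken from FanGraphs.

```python
def batting_runs(wraa, pa, league_runs, league_pa, park_factor,
                 sub_league_wrc, sub_league_pa):
    """Park- and league-adjusted batting runs.

    sub_league_wrc and sub_league_pa are the AL-only or NL-only
    non-pitcher totals (wRC and plate appearances).
    """
    league_r_per_pa = league_runs / league_pa
    # Park term: negative when the home park inflates scoring (factor > 1).
    park_adjustment = (league_r_per_pa - park_factor * league_r_per_pa) * pa
    # League term: negative when the player's league scores more than MLB overall.
    league_adjustment = (league_r_per_pa - sub_league_wrc / sub_league_pa) * pa
    return wraa + park_adjustment + league_adjustment


# Hypothetical inputs, for illustration only.
neutral = batting_runs(wraa=10.0, pa=600, league_runs=20000, league_pa=185000,
                       park_factor=1.00, sub_league_wrc=20000, sub_league_pa=185000)
hitter_park = batting_runs(wraa=10.0, pa=600, league_runs=20000, league_pa=185000,
                           park_factor=1.05, sub_league_wrc=20000, sub_league_pa=185000)
print(round(neutral, 2), round(hitter_park, 2))
```

With a neutral park and a league that matches the overall run rate, both adjustments vanish and batting runs equal wRAA; a hitter-friendly park lowers the total.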

2.2 Baserunning Runs

How many runs a player adds through his baserunning is determined by two statistics: Ultimate Base Running (UBR) and Weighted Stolen Base Runs (wSB). These statistics are mostly proprietary, but they measure how effective a baserunner is. They account for the number of steals, the success rate of steals, and even how often a player advances extra bases on a hit. Video review is used to find values for these statistics.

2.3 Fielding Runs

Fielding Runs are also largely based on proprietary data. There are a few metrics available, such as Ultimate Zone Rating (UZR), Total Zone, and Defensive Runs Saved (DRS). There are different kinds of WAR depending on the fielding metric used. These WAR values are similar, but the versions must be distinguished from one another because their formulas differ. We are using FanGraphs’ version of WAR (referred to as fWAR). Video is employed here as well to better estimate the range of a fielder.

2.4 Adjustments

After summing the Batting Runs, Baserunning Runs, and Fielding Runs, we must adjust this value. There are three adjustments. The first is a positional adjustment, based on the position the player played. Certain positions are more difficult to play than others, and the positional adjustment tries to account for this. The second is a league adjustment. This just makes certain that the runs above average balance out to zero, and it is often a minor adjustment. The last is the replacement player adjustment. So far, all of our calculations have been based around the average player. A replacement player is below the league average, so we must add some runs to the player to compare him to a replacement player. These adjustments are typically very small.

2.5 Final Calculation

To calculate WAR we take the sum of the six components: batting runs, baserunning runs, fielding runs, and the three adjustments. This total is the number of runs a player contributes. To find the number of wins the player was worth, we must divide the total number of runs by a Runs per Win value. This value changes year to year, but typically is somewhere around 10. Thus, the full calculation for WAR is

\[ \mathrm{WAR} = \frac{\mathrm{Batting} + \mathrm{Baserunning} + \mathrm{Fielding} + \mathrm{Positional\ Adj.} + \mathrm{League\ Adj.} + \mathrm{Replacement}}{\mathrm{Runs\ per\ Win}} \]
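The final step is simple arithmetic, sketched below. The run values are hypothetical and the default of 10 runs per win is only the rough figure mentioned in the text; the actual value changes by season.

```python
def war(batting, baserunning, fielding, positional, league, replacement,
        runs_per_win=10.0):
    """Wins Above Replacement: six run components summed, then converted to wins."""
    total_runs = (batting + baserunning + fielding
                  + positional + league + replacement)
    return total_runs / runs_per_win


# Hypothetical season: strong bat, good glove, slightly negative positional adjustment.
example_war = war(batting=25.0, baserunning=3.0, fielding=8.0,
                  positional=-5.0, league=2.0, replacement=20.0)
print(round(example_war, 1))
```

Because the components are all in runs, any one of them can be swapped for a different metric (for example, a different fielding measure) without changing the final conversion.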

3 Uses of WAR

Now that we know how WAR is calculated, we must understand what it offers to the baseball community. The main goal of the statistic is to generate one number to account for player performance. It is very easy for anyone to understand the phrase “He was a 6 win player.” As the sabermetric approach takes hold, WAR is increasingly used in deciding individual awards, such as the MVP and Cy Young awards. Another application is calculating the dollar value of a player for contract negotiations.

4 The Basic Question

This brings us to our research question: “Do Inner Circle Hall of Fame players age differently than the average player?” It is important to note that the Hall of Fame players will have higher WAR values, but we are concerned with the change in the values as the player ages. Further, we define Inner Circle Hall of Famers as players elected to the Hall of Fame in their first year on the ballot. Players must have been retired for five years to be considered, and to be elected they must appear on 75% of the ballots of the sportswriters who vote. There are 32 First Ballot Hall of Fame position players, including Babe Ruth and Ted Williams (see Appendix A).

4.1 Definition of Data Sets

Due to the nature of WAR and the development of the game, we will only use players from the live-ball era, which began in 1920. We only consider the live-ball era because 1920 marked a major shift in equipment and began to shape a game similar to what we see today. Before 1920, the ball used by teams was soft and did not travel as far. When Babe Ruth began to hit, everyone noticed how exciting home runs were. This came right after baseball experienced the Black Sox Scandal in 1919, in which players were paid to lose the World Series. To encourage interest in the game, the league changed the ball to one like that used today, which is harder, flies farther, and would increase home runs.

Also, we know that WAR accumulates. Therefore, players with very few at-bats will not have enough time to gain or lose any significant WAR. We do not want players who only played a few games to factor into the average WAR, so we will only consider seasons in which the player had at least 400 at-bats. Even with this minimum, the data set for all position players from 1920 to 2014 is huge. Scott Lindholm, a sabermetrician for the website ‘Beyond the Box Score’, was very helpful in accessing this large amount of data.

5 Results

5.1 Procedure

To conduct the tests we first needed to gather the relevant data. All the data were found online at the FanGraphs website, a baseball website that includes many elements of sabermetrics. To gather data for the average player, I compiled all seasons in which the player had at least 400 at-bats, and grouped those seasons by the age of the player. Then for each age I found the average WAR value. For the HOF group, I used the same criteria and process, but compiled the data from only first-ballot Hall of Famers. We can then graph WAR vs. Age for the two sets of data, and we get the aging graph seen below.
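The grouping step described above can be sketched as follows. The input format (a list of age, at-bats, WAR tuples) and the sample records are assumptions made for this illustration; the actual FanGraphs export has a different layout. The 400 at-bat threshold follows the minimum stated in Section 4.1.

```python
from collections import defaultdict

MIN_AT_BATS = 400  # qualifying threshold from Section 4.1


def mean_war_by_age(seasons, min_at_bats=MIN_AT_BATS):
    """Average WAR per age over qualifying player-seasons.

    seasons: iterable of (age, at_bats, war) tuples, one per player-season.
    Returns a dict mapping age to mean WAR, in ascending age order.
    """
    war_by_age = defaultdict(list)
    for age, at_bats, war in seasons:
        if at_bats >= min_at_bats:  # drop short seasons
            war_by_age[age].append(war)
    return {age: sum(values) / len(values)
            for age, values in sorted(war_by_age.items())}


# Hypothetical player-seasons; the 120 at-bat season is filtered out.
sample_seasons = [(27, 550, 6.0), (27, 610, 4.0), (27, 120, -0.3), (28, 480, 2.5)]
aging_curve = mean_war_by_age(sample_seasons)
print(aging_curve)
```

Running the same function once on all position players and once on the first-ballot Hall of Famers only would give the two curves being compared.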

[Plot: WAR by Age: HOF vs. Average. Vertical axis: WAR (0 to 8); horizontal axis: Age (18 to 44); series: HOF and AVG.]

Figure 1: This graph shows the aging curve for the HOF set in blue and the Average Player set in red.

5.2 Qualitative Analysis

By simply looking at the graph, we can see that the HOF curve has significantly higher WAR values between ages 20 and 37 than the average curve. This is to be expected because HOF players were better than average players. The HOF curve also displays some peaks and valleys, while the average curve is much flatter. All of this is consistent with the perception that Hall of Famers have a sustained stretch of peak years in the middle of their careers. The sharp falloff in Hall of Fame WAR late in the career suggests that they play the game longer than is optimal.

5.3 Quantitative Analysis

[Plot: WAR by Age: HOF vs. Average, with quadratic fits. Vertical axis: WAR (-2 to 8); horizontal axis: Age (18 to 44); series: HOF, AVG, and the scaled AVG quadratic.]

Best-fit quadratics shown in the plot:

HOF: \( y = -0.0299x^2 + 1.6895x - 17.999 \), \( R^2 = 0.8508 \)
AVG: \( y = -0.007x^2 + 0.405x - 3.2499 \), \( R^2 = 0.6991 \)
Scaled AVG quadratic: \( y = -0.0135x^2 + 0.7808x - 6.2655 \), \( R^2 = 1 \)

Figure 2: This is the same graph as Figure 1, but with the best fit quadratic equations included.

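Looking at the best-fitting parabolas, we can see that the HOF values peak a little sooner than the average curve. We can also compare the curvature of the HOF curve to that of the average curves. The best-fitting quadratic curves are of the form

\[ f(x) = ax^2 + bx + c \]

The curvature of a graph is \( \kappa = |f''(x)| / (1 + f'(x)^2)^{3/2} \). At the vertex of a parabola \( f'(x) = 0 \) and \( f''(x) = 2a \), so the curvature (\( \kappa \)) there is:

\[ \kappa = 2|a| \]

Thus, given the equations in the figure above, we have:

\[ \kappa_{\mathrm{HOF}} = 0.0598 \]

\[ \kappa_{\mathrm{AVG}} = 0.014 \]

\[ \kappa_{\mathrm{AVG\ Adj.}} = 0.027 \]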
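As a quick check on the peak ages and curvatures, the sketch below evaluates the vertex location \(-b/(2a)\) and the vertex curvature \(2|a|\) for the three fitted quadratics in Figure 2. The pairing of each equation with its series is read off the figure and is an assumption here: the equation with \(R^2 = 1\) is taken to be the scaled average quadratic, since its coefficients are a constant multiple of the average fit.

```python
# (a, b, c) coefficients of y = a*x^2 + b*x + c, as printed in Figure 2.
# The series labels are inferred from the figure (an assumption).
FITS = {
    "HOF": (-0.0299, 1.6895, -17.999),
    "AVG": (-0.007, 0.405, -3.2499),
    "Scaled AVG": (-0.0135, 0.7808, -6.2655),
}


def vertex_age(a, b):
    """Age at which the fitted parabola peaks."""
    return -b / (2 * a)


def vertex_curvature(a):
    """Curvature of a parabola at its vertex."""
    return 2 * abs(a)


for name, (a, b, c) in FITS.items():
    print(name, round(vertex_age(a, b), 2), round(vertex_curvature(a), 4))
```

Under that pairing, all three fits peak between ages 28 and 29, with the HOF fit peaking slightly earlier and showing the largest curvature.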

Note that the HOF curve has a much larger curvature at its vertex than either the average or the adjusted average curve. This implies that WAR values for HOF players rise more rapidly to the peak and then fall more rapidly from it than those of the average player.

We will statistically compare the two curves using two different methods: root mean square error (RMSE) and the runs test. First, we compare using RMSE. We have the actual WAR for each age and the predicted WAR for each age from the quadratics. We take the difference between actual and predicted at each age, square these differences, and average the squares. Finally, we take the square root of this mean, which is the RMSE.

Since we know the HOF curve has higher values, we scale the average curve up. First, we find the ratio of total HOF WAR to total average-player WAR:

\[ \mathrm{WAR\ Ratio} = \frac{\mathrm{Total\ WAR}_{\mathrm{HOF}}}{\mathrm{Total\ WAR}_{\mathrm{AVG}}} \]

Then we multiply the best-fitting average parabola by this ratio to make the curves more comparable. The RMSE between the actual HOF data and the best-fitting HOF curve is 0.76. When we compare the adjusted average curve to the HOF data in the same manner, we get an RMSE of 1.34. The RMSE is minimal for the best-fitting HOF curve by construction, and the RMSE for the adjusted average curve is almost twice that minimum. If the curves were of a similar shape, indicating that HOF players aged the same as the average player, we would be able to scale the best-fitting average curve to fit closely with the HOF data. However, even after the adjustment, the average parabola does not fit the HOF data closely.

The second test we will use to compare the two data sets is the runs test (“runs” here does not denote baseball scoring). This is a non-parametric test that can be used to determine whether sample data in a sequence appear in a random order.
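As an aside before the runs test is developed further, the RMSE comparison described above can be sketched in Python. The sketch assumes the conventional definition of RMSE (the square root of the mean squared difference), uses the mean HOF WAR values for ages 19 to 41 rounded from the table later in this section, and takes the quadratic coefficients from Figure 2. Because the coefficients are rounded and the original fit may have covered ages not shown in that table, the output need not match the reported 0.76 and 1.34 exactly.

```python
import math

AGES = list(range(19, 42))  # ages 19 through 41
# Mean HOF WAR at each age, rounded to four decimals from the table in this section.
HOF_MEAN_WAR = [
    0.7500, 4.6875, 4.1529, 5.6611, 5.8040, 5.7609, 5.4862, 5.7724,
    6.9655, 6.1067, 5.8484, 6.0645, 5.8000, 5.3871, 4.8074, 4.2143,
    3.7143, 3.5519, 3.6826, 2.0909, 2.2842, 1.5077, 0.8200,
]


def quadratic(a, b, c):
    """Return the function x -> a*x^2 + b*x + c."""
    return lambda x: a * x * x + b * x + c


def rmse(actual, predicted):
    """Root of the mean squared difference between two equal-length sequences."""
    squared_errors = [(y - p) ** 2 for y, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


hof_fit = quadratic(-0.0299, 1.6895, -17.999)        # best-fit HOF curve
scaled_avg_fit = quadratic(-0.0135, 0.7808, -6.2655)  # average curve scaled by the WAR ratio

rmse_hof = rmse(HOF_MEAN_WAR, [hof_fit(x) for x in AGES])
rmse_scaled_avg = rmse(HOF_MEAN_WAR, [scaled_avg_fit(x) for x in AGES])
print(round(rmse_hof, 2), round(rmse_scaled_avg, 2))
```

The point of the comparison is the gap between the two numbers: the scaled average curve misses the HOF data by noticeably more than the HOF fit does.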
According to [Triola], “This test is based on sample data that have two characteristics and it analyzes runs of those two characteristics to determine whether the runs appear to result from some random process or whether the runs suggest that the order of the data is not random.” The data set consists of the ratio of HOF WAR to average WAR at each age. Using a 0.05 significance level, we performed a runs test for randomness above and below the median of this data set. Our null hypothesis is:

H0 : The data are in a random sequence.

The median WAR ratio is 1.81. Each age is then assigned a 1 if the WAR ratio is above the median ratio, or a 0 if it is less than or equal to the median ratio. The table below details the process:

Player Age   Mean WAR (HOF)   Mean WAR (AVG)   WAR Ratio     Ratio > Median?   0 or 1   Run
19           0.750000000      1.535714286      0.488372093   No                0        1
20           4.687500000      2.824561404      1.659549689   No                0        1
21           4.152941176      2.325581395      1.785764706   No                0        1
22           5.661111111      2.359842520      2.398935973   Yes               1        2
23           5.804000000      2.462839879      2.356629048   Yes               1        2
24           5.760869565      2.467138194      2.335041296   Yes               1        2
25           5.486206897      2.507213930      2.188168640   Yes               1        2
26           5.772413793      2.638624339      2.187660331   Yes               1        2
27           6.965517241      2.592857143      2.686425382   Yes               1        2
28           6.106666667      2.563506396      2.382153864   Yes               1        2
29           5.848387097      2.615517241      2.236034618   Yes               1        2
30           6.064516129      2.516580311      2.409824198   Yes               1        2
31           5.800000000      2.583097166      2.245366561   Yes               1        2
32           5.387096774      2.511672684      2.144824367   Yes               1        2
33           4.807407407      2.418865031      1.987464098   Yes               1        2
34           4.214285714      2.323828125      1.813510074   No                0        3
35           3.714285714      2.302088773      1.613441566   No                0        3
36           3.551851852      2.248905109      1.579369373   No                0        3
37           3.682608696      2.054074074      1.792831496   No                0        3
38           2.090909091      1.837735849      1.137763674   No                0        3
39           2.284210526      2.023809524      1.128668731   No                0        3
40           1.507692308      1.600000000      0.942307692   No                0        3
41           0.820000000      1.033333333      0.793548387   No                0        3

The table above shows that we obtain 11 zeroes and 12 ones. For these counts, a random sequence would be expected to produce a number of runs between 7 and 18 [Triola p. 764]. Since our ordered sample contains only three runs, we reject the null hypothesis and conclude that the ratio between the two aging patterns is not constant over time. Thus, we conclude that HOF players age differently than the average player.
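The run count itself is easy to reproduce. The sketch below takes the 0/1 column directly from the table and counts runs with `itertools.groupby`. The bounds 7 and 18 are the values cited from Triola in the text. The decision rule (reject when the count is at or beyond either bound) is assumed here as the usual convention for this table-based test.

```python
from itertools import groupby

# 0/1 indicator for ages 19 to 41, copied from the table:
# three 0s (ages 19-21), twelve 1s (ages 22-33), eight 0s (ages 34-41).
ABOVE_MEDIAN = [0] * 3 + [1] * 12 + [0] * 8


def count_runs(sequence):
    """Number of maximal blocks of identical consecutive values."""
    return sum(1 for _ in groupby(sequence))


def rejects_randomness(sequence, lower_critical, upper_critical):
    """True when the run count is at or beyond either critical value."""
    runs = count_runs(sequence)
    return runs <= lower_critical or runs >= upper_critical


n_zeros = ABOVE_MEDIAN.count(0)
n_ones = ABOVE_MEDIAN.count(1)
n_runs = count_runs(ABOVE_MEDIAN)
# Critical values 7 and 18 are those cited from Triola in the text.
reject_h0 = rejects_randomness(ABOVE_MEDIAN, 7, 18)
print(n_zeros, n_ones, n_runs, reject_h0)
```

A strictly alternating sequence would sit at the other extreme, with as many runs as observations, and would also be flagged as non-random by the upper bound.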

6 Other Uses

We have just shown a comparison of average players to HOF players. In this section we discuss how current Major League front offices make use of WAR in their day-to-day activities. The first application is that teams use WAR to determine how much money to offer a free-agent player. The market determines how much one WAR is worth, and we can use projections and inflation to determine how much a player will be worth in future years. WAR is also used in arbitration, where players and organizations must convince an arbitrator how much a player is worth.

For example, consider Hanley Ramirez. He signed this past off-season for $88 Million over four years. To determine if this was a fair price, we first need to determine how many WAR we expect him to generate. Ramirez produced 2.9 WAR in 2012, 5 WAR in 2013, and 3.4 WAR in 2014. Now we calculate a weighted average, where we weight 2012 with a 1, 2013 with a 2, and 2014 with a 3, to determine his projected 2015 WAR. Thus we have:

\[ \mathrm{2015\ WAR} = \frac{(1 \cdot 2.9) + (2 \cdot 5) + (3 \cdot 3.4)}{6} = 3.85 \approx 3.9\ \mathrm{WAR} \]

Now we can use the aging curve to predict how his WAR value will change over the contract term. We use percent changes from age to age to determine the WAR values in the upcoming years. Thus, we predict Hanley will contribute 3.9 WAR in 2015, 3.79 in 2016, 3.65 in 2017, and 3.51 in 2018. In 2015 one WAR was worth about $6 Million. Assuming 5% inflation, one WAR is worth about $6.3 Million in 2016, $6.6 Million in 2017, and $6.9 Million in 2018. By summing the products of each year’s WAR and dollars per WAR, we estimate that Hanley Ramirez is worth about $95.586 Million. This means he may be a little underpaid if he plays as we expect. This calculation is summarized in the table below.

Year   Age   WAR    Dollars per WAR   Salary
2015   31    3.9    6 M               23.4 M
2016   32    3.79   6.3 M             23.877 M
2017   33    3.65   6.6 M             24.09 M
2018   34    3.51   6.9 M             24.219 M

Table 1: 4 years, 95.586 Million total
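The arithmetic behind Table 1 can be reproduced with a short script. The projected WAR values and the dollars-per-WAR figures are taken as given from the text (the percent changes from the aging curve are not listed, so they are not recomputed here), and the helper name `weighted_projection` is just a label chosen for this sketch.

```python
def weighted_projection(values, weights):
    """Weighted average of past seasons, with later seasons weighted more."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# 2012, 2013, 2014 WAR with weights 1, 2, 3, as in the text.
projected_2015 = weighted_projection([2.9, 5.0, 3.4], [1, 2, 3])

# Projected WAR and market price per WAR (in $ millions), as given in the text.
projected_war = {2015: 3.9, 2016: 3.79, 2017: 3.65, 2018: 3.51}
dollars_per_war = {2015: 6.0, 2016: 6.3, 2017: 6.6, 2018: 6.9}

yearly_value = {year: projected_war[year] * dollars_per_war[year]
                for year in projected_war}
total_value = sum(yearly_value.values())
CONTRACT = 88.0  # actual contract, $ millions over four years

print(round(projected_2015, 2), round(total_value, 3), total_value > CONTRACT)
```

Changing the inflation assumption or the aging adjustments only requires editing the two dictionaries, which makes it easy to see how sensitive the valuation is to each input.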

Another application is the prediction of standings. We can estimate how many games a team will win by taking the total WAR of every player on their roster. We would expect a team of replacement players to win about 54 games in a season, so adding the team WAR to 54 gives us a predicted win total. Then we can simply order the win totals in each division to predict the standings. These values are optimistic and must be scaled, but they offer a solid basis for predictions. (See Appendix B). These predictions are important to front offices. For example, it takes about 88 to 90 wins for a team to make the playoffs. A team projected to win 75 games would not significantly increase their playoff chances by signing a 4 WAR player. On the other hand, a team projected to win 86 games would probably aggressively pursue a 4 WAR player. These are far from the only applications of WAR, but they are two that an organization may pay a lot of attention to when making decisions.

7 WAR in the Future

WAR is a cutting-edge statistic that is constantly being tweaked by sabermetricians. Future iterations of WAR will attempt to make the statistic more accurate by considering things currently neglected, e.g., catcher framing or propensity to hit into double plays. Every team now has an analytics department that collects and develops proprietary statistics, and each team likely works with a slightly different WAR formula. This work may well lead to future WAR formulas that look very different from what was presented here.

8 Conclusion

In conclusion, WAR is a statistic that takes into account all aspects of a player’s performance. The statistic also adjusts for league, park, and year, which allows us to compare players on different teams, in different leagues, and in different seasons. WAR determines how many runs a player creates and saves for his team, and then converts these runs into wins. The main goal of WAR is to show at a glance how much a player contributed to his team. While it is easy to understand the result, the computation can be complex. We also showed that HOF players age differently than the average player.

References

[1] Albert, Jim. “Teaching Statistics Using Baseball”. The Mathematical Association of America, 2003. Print.

[2] “FanGraphs: Baseball Statistics and Analysis”. 2014–2015. Web.

[3] Triola, Mario F. “Elementary Statistics”. 11th ed. Boston: Addison-Wesley, 2010. Print.

A First Ballot Hall of Famers

Position Players                Pitchers
Frank Thomas                    Greg Maddux
Cal Ripken                      Tom Glavine
Tony Gwynn                      Dennis Eckersley
Eddie Murray                    Nolan Ryan
Ozzie Smith                     Steve Carlton
Dave Winfield                   Tom Seaver
Robin Yount                     Jim Palmer
Mike Schmidt                    Bob Gibson
Reggie Jackson                  Warren Spahn
Rod Carew                       Bob Feller
Willie Stargell                 Christy Mathewson
Willie McCovey                  Walter Johnson
Brooks Robinson
Hank Aaron
Frank Robinson
Willie Mays
Ernie Banks
Mickey Mantle
Stan Musial
Ted Williams
Jackie Robinson
Babe Ruth
Honus Wagner
Ty Cobb (Dead-Ball Era)

B Predicted 2015 Standings

AL East     Record   AL Central   Record   AL West     Record
Red Sox     88-74    Tigers       87-75    Angels      87-75
Blue Jays   83-79    Royals       82-80    Mariners    83-79
Rays        83-79    Indians      80-82    Athletics   78-84
Orioles     81-81    White Sox    77-85    Rangers     76-86
Yankees     79-83    Twins        69-93    Astros      74-88

Table 2: Predicted American League Standings

NL East     Record   NL Central   Record   NL West     Record
Nationals   93-69    Cardinals    92-70    Dodgers     93-69
Mets        80-82    Pirates      84-78    Giants      80-82
Braves      80-82    Brewers      83-79    Padres      79-83
Marlins     77-85    Reds         83-79    Rockies     78-84
Phillies    69-93    Cubs         81-81    D’backs     71-91

Table 3: Predicted National League Standings
