<<

Locus: The Seton Hall Journal of Undergraduate Research

Volume 3 Article 7

October 2020

Sabermetric Analysis: Wins-Above-Replacement

Karl Hendela

Follow this and additional works at: https://scholarship.shu.edu/locus

Recommended Citation Hendela, Karl (2020) "Sabermetric Analysis: Wins-Above-Replacement," Locus: The Seton Hall Journal of Undergraduate Research: Vol. 3 , Article 7. Available at: https://scholarship.shu.edu/locus/vol3/iss1/7 Hendela: Sabermetric Analysis: Wins-Above-Replacement

Sabermetric Analysis: Wins-Above-Replacement

Karl Hendela Seton Hall University

Abstract 1. Introduction

Baseball has increasingly turned to The game of has been America’s pas- in order to evaluate players and teams within the time for well a . The objective of the league. The latest statistic to become popular game—score more runs than you allow and you is Wins-Above-Replacement, abbreviated WAR. win—has led to the development of two axioms, WAR takes into account all factors of a player’s “A ball player’s purpose in playing baseball is to contribution to a team, including pitching, bat- do those things which create wins for his team, ting, base-running, and fielding, and combines while avoiding those things which create losses them to produce a number which approxi- for his team,” and “Wins result from runs scored mates how many games the team has won by uti- while losses result from runs allowed” (James, lizing this player as compared to a replacement- 1984). Predicting which team will win a base- level or minor league level player at the same po- ball game and measuring how players contribute sition. The WAR statistic has become widely used to those wins has become an American pastime in as a way to compare players, even players play- itself with , the search for objective ing different positions or in different time peri- knowledge about baseball (James, 1984). Record- ods. However, current versions of WAR all have ing players’ statistics have been an integral part of one serious drawback: there is no publicly avail- since the foundation of the able, standardized formula or process to calcu- game. It started with the simplest statistics: num- late it. Instead, separate organizations have for- ber of hits and walks; average; RBIs (runs mulated their own versions of the statistic using batted in) and home runs for position players; in- proprietary data, rendering the statistic both dif- nings pitched, , bases on balls (walks), ficult to understand and difficult to replicate. In earned average (ERA), and win/loss record for this project we create a standardized WAR statis- . Statistics such as these have been care- tic that accurately indicates the value of a player fully compiled since the dawn of the game for using only publicly available data, and show that managers and fans alike to look at and analyze our new WAR statistic strongly correlates with themselves. WAR statistics which use proprietary data. This While no statistics can tell the full story of the approach has the potential to create a standard players and the games they played, over time the WAR formula that an average baseball fan can question has arisen: Is it possible to have a sin- use and understand in the future as they evaluate gle statistic that measures the quality of a player? a baseball player’s value. We call it SHU-WAR. This question has led to the development of statis- tics such as runs-created, win-shares, and most

Published by eRepository @ Seton Hall, 2020 1 Locus: The Seton Hall Journal of Undergraduate Research, Vol. 3, Iss. 1 [2020], Art. 7

recently wins-above-replacement (WAR). Wins- therefore is left to take the values generated from above-replacement attempts to answer this ques- these methods for granted since there is no way tion by taking into account all aspects of a player’s to reproduce the numbers these organizations cre- performance and translating it into the most im- ate themselves. Baseball Reference has named its portant part of the game: winning. The idea is version bWAR, is fWAR and Baseball if you had a team composed of replacement-level Prospectus is WARP, none of which are identical players, statistically they would be expected to fin- or uniform across all players. When the statistic ish a typical 162-game season with a record of is typically talked about though, it is referred to as roughly 48 wins and 114 losses. The WAR statis- WAR even though there are multiple versions of tic determines how many wins above or below 48 it. would result by replacing one of those replace- Given the amount of publicly available base- ment players with the player whose WAR is being ball statistics, however, and the various mathe- measured (WAR Explained, 2019). In the creation matical ideas and techniques already available in of this number, the unique factors of the game are sabermetrics, it seems reasonable to believe that normalized to be context, league, and park neutral a WAR statistic can be constructed that would so that every player is valued on an even playing be practical, understandable, and accessible to the field. Baseball is one of the few sports that has sig- general baseball-loving fan. Our objective in this nificant differences between fields and leagues as paper is to create that statistic, a new WAR statistic the sport has never standardized outfield or foul we call SHU-WAR, which can function as a stan- territory dimensions. This makes it difficult to dardized wins-above replacement statistic while compare players, as these differences cause prob- avoiding using any proprietary data that is not lems in determining how one player would fare on available on a typical baseball statistical database. a typical field. WAR takes this all into considera- tion, as adjustments are made to even the playing 2. The Linear Weights System field between players, making this a statistic com- parable across all leagues, fields, and time periods. The basis of the SHU-WAR statistic is the Normally, this is the part of the paper where Linear Weights System developed by I would describe how WAR is calculated. Un- and Pete Palmer (Thorn & Palmer, 1985). In fortunately, however, this is impossible because this system various baseball events were indepen- WAR is a non-standardized metric: there are in dently evaluated and given a value that expresses fact multiple versions of the WAR statistic. To each event in terms of runs, specifically runs pro- compound matters, all of these WAR statistics, duced or prevented. The events evaluated in Thorn published by companies like Baseball Reference, and Palmer’s linear weights system encompass all FanGraphs and , are based at physical plays, including even non-scoring plays. least in part on proprietary data not available to Examples of batting events, for example, include the public. Some of this data requires special- events like singles, doubles, triples but also non- ized technology and, while it is always nice to events like walks, bases on balls, and when a gain new angles on parts of the game, this can re- player is hit by a pitch (and thus advances to first sult in confusion about just what aspects are be- base automatically). Using a , ing measured. There is also little public informa- Thorn and Palmer determined the linear weight of tion available about these statistics, which leads each event, as well as adjustments taking into ac- to confusion on how to actually calculate the fi- count the league that the player is in. Hitless at- nal value, as it may be constantly adjusted be- bats are given a negative score, so that the sum of hind closed doors. The typical baseball analyst all linear weights of all events within a league sum

https://scholarship.shu.edu/locus/vol3/iss1/7 2 Hendela: Sabermetric Analysis: Wins-Above-Replacement

up to zero, and so the walks, singles, and so on the final value of how many wins are accredited can be measured by the number of runs they pro- to the player above that of a replacement player. duce above an average at-bat. To give an example, An early version of this method for calculating while a in a baseball game can directly what we now call SHU-WAR first appears in the score between one to four runs (depending on book Understanding Sabermetrics (Costa, Huber how many people were on base) it was determined & Saccoman, 2nd ed.). through Thorn & Palmer’s work that a home run produces approximately 1.40 runs greater than the 3.1 Batting Runs average at-bat (Thorn & Palmer, 1985). Batting runs (BR) is a statistic that takes the For pitching statistics, the linear weights sys- important portions of the offensive player’s game tem is calibrated to the league’s earned-run- and translates each event into . The average (ERA) and the amount of pitched. statistic used is as follows: This means a with the league average ERA would have a presumptive record of 81-81 if they BR = (0.47 × H) + (0.38 × 2B) + (0.55 × 3B) pitched every game of the 162 game season, while + (0.93 × HR) + [0.33 × (BB + HBP)] any deviation from the league average will either − [ABF ∗ (AB − H)] increase or decrease the number of wins for the season (Thorn & Palmer, 1985). This is parallel to The BR formula for an individual player is cal- batting where the system takes into account pos- culated using the total number of at-bats a player itive and negative events in terms of the league records (AB), as well as the various positive events averages to determine how many runs the player that can result: hits (H), doubles (2B), triples (3B), produced. The same type of analysis was done home runs (HR), (BB), or hit by for base stealing, fielding, and the various other pitch (HBP). A final adjustment is made using a events that are tracked by other baseball statis- factor based on the league (ABF), tics. These linear weight system values obtained which we will describe shortly. The ABF calcu- by Thorn & Palmer help weight the various base- lation uses only league totals and is multiplied by ball events that are measured in our calculation of the number of outs, which is at-bats minus hits. the SHU-WAR statistic. In order to simplify the formula above, a player’s singles, that is hits that only advance the 3. The Formulas batter to first base, are not distinguished from hits. Since all doubles, triples and home runs are also Our SHU-WAR statistic has six components, hits, the weights given each type of hit gives infor- respectively covering batting, base stealing, field- mation about the hit quality. A single, for exam- ing, positional adjustment, pitching, and a re- ple, is weighted the same as a hit at 0.47, but dou- placement player adjustment. While the formu- bles, triples, and home runs are given additional las that were used to calculate SHU-WAR are all weight values because these contribute more to the weighted using the linear weights system devel- production of runs. To illustrate with an earlier ex- oped by Thorn and Palmer, a number of adjust- ample a home run in this case is given a weight of ments are also made in order to further strengthen 1.40, just as by Thorn and Palmer, since it accu- the performance of the final value. Once each of mulates the coefficient of 0.47 for being a hit and these six components is calculated, the runs cred- 0.93 for being a home run, and 0.47+0.93 = 1.40. ited to the player for each component are summed This formula also accounts for walks (BB) and up and divided by two times the average runs hits by pitch (HBP) with a coefficient of 0.33. per game per team within the league. This gives These are events that allow a player to get on base,

Published by eRepository @ Seton Hall, 2020 3 Locus: The Seton Hall Journal of Undergraduate Research, Vol. 3, Iss. 1 [2020], Art. 7

but are weighted less than a hit because a typi- and is used to adjust the batting runs statistic as cal baseball statistic database does not differenti- ( BR×(1−PF) ate between intentional and unintentional walks. BR + 2 if BR ≥ 0 BRPF Ad j = BR×(PF−1) Lastly, a portion is subtracted to take into account BR + 2 if BR < 0 the ABF, a league batting term factor. The ABF is calculated using only league total statistics, and is The park factor takes up to a five year average of the amount of runs scored at home (ARSH) [(0.47 × H) + (0.38 × 2B) + (0.55 × 3B) and adds the amount of runs allowed at home +(0.93 × HR) + (0.33 × (BB + HBP))] ABF = (ARAH), which is all divided by the amount of AB − (LGF × H) games played at home (GPH). This is divided by the same fraction except calculated now in terms This ABF statistic uses the positive portion of BR of games played away from home. In this way a as its numerator, but with league inputs now, and number greater than one indicates an easier sta- scales it by league at-bats minus the league factor dium to score runs in as compared to the games times hits. The league factor (LGF) is simply 1 for played away. If five years’ data does not exist, both and , and is such as teams having recently changed stadiums, adjusted when calculating for other leagues. Two then the BR PF Adj is calculated using up to a five examples include the historical Union Associa- year average for these statistics. tion League having an LGF of 0.8 and the Federal League having a 0.9. An LGF of 1 simply makes 3.2 Base Stealing the denominator the number of outs, but allows for adjustment depending on the league (Costa, Hu- Base-running is an underestimated portion of ber, & Saccoman, Understanding Sabermetrics, the game, with faster players able to create a com- 2008). petitive advantage on the base paths by advanc- A pitcher’s ABF value in their BR value is di- ing on the bases without hit support, simply based vided by two, but for pitchers who have played a on their speed. A is beneficial to position other than pitcher, or have been the des- a team, but being stealing often ruins a ignated hitter for multiple games, their ABF value team’s chances of scoring in that . The lin- is not divided by two. A modern day case is the ear weights system gives that a stolen base is con- player who in 2018 was a pitcher sidered to create approximately 0.3 runs per stolen for the , but who is known for base, and being subtracts approxi- his hitting ability and was often used as a desig- mately 0.6 runs per instance. This leaves us with nated hitter. This means in the calculation of his the simple formula of: batting runs, his ABF was not divided by two. To complete the batting portion of this WAR SBR = 0.3 × SB − 0.6 ×CS statistic, there needs to be a park factor adjust- 3.3 ment. Ballparks like Coors Field are notoriously known as hitter’s parks since the thin air in Denver The next portion of SHU-WAR covers the as- makes it easier for the ball to travel. Some of the pect of fielding, typically the most difficult aspect longest home runs in history have been hit there of the game to measure because each position and for this reason, so this needs to be considered. The situation is unique in baseball. Calculations here park factor is calculated as will differ somewhat by position, with some of the weights for various events altered depending on ARSH +ARAH PF = GPH its difficulty and importance to the position. To ARSA+ARAA be explicit, , pitchers, and first basemen GPA

https://scholarship.shu.edu/locus/vol3/iss1/7 4 Hendela: Sabermetric Analysis: Wins-Above-Replacement

will have unique formulas, while second basemen There is an adjustment to (PO*) for catch- share the same formula with third basemen and ers, which is why there is an asterisk in the for- , and left, right, and center outfielders mula. Catchers are credited with putouts for most will have the same formula as each other. strikeouts, so PO∗ are the putouts for catchers mi- A players fielding runs (FR) calculation is nus the putouts from strikeouts. done in two stages. First a general fielding run For first basemen, second basemen and - calculation is done for each position on a player’s fielders we use respectively team. The individual player’s FR value is obtained by then taking these values and making adjust- 1BFR = (0.2 × ((2 × A) − E)) ments based on the number of fielding outs that − 1B APL × (Total PO − Total SO) the player obtained at that position. (0.2 × ((2 × A) − E)) 1B APL = First we give the fielding run calculations for Total PO − Total SO each position on the team. The formula for pitcher fielding runs (P FR) is as follows: 2BFR = (0.2 × (PO + (2 × A) − E + DP)) PFR = (0.1 × (PO + (2 × A) − E + DP)) − 2B APL × (Total PO − Total SO) (0.2 × (PO + (2 × A) − E + DP)) − P APL × (Total PO − Total SO) 2B APL = Total PO − Total SO where PO is put-outs, A assists, E errors commit- ted, DP plays, and SO strikeouts. Again, LF FR = (0.2 × (PO + (4 × A) − E − (2 × DP))) all the inputs for this calculation are team statis- − LF APL × (Total PO − Total SO) tics for the pitcher position. The P APL factor is (0.2 × (PO + (4 × A) − E + (2 × DP))) a league factor, analogous to the ABF factor used LF APL = in the batting runs BR calculation. Like the ABF Total PO − Total SO factor previously, league value are used for the in- As mentioned, the previous FR formulas are team puts. values for each position, which are next used to calculate each individual player’s actual fielding (0.1 × (PO + (2 × A) − E + DP)) P APL = runs. The final calculation to determine an indi- Total PO − Total SO vidual player’s fielding runs FR value is as fol- lows: In the APL calculation, the inputs for the numera- tor are position-specific (for example, E counts the  Player Putouts  FR = × FRpos (1) total number of errors committed by league pitch- Total Team Putouts ers) while the denominator uses the total putouts and strikeouts for the entire league across all posi- In the formula above, the team positional value tions. The APL values for other positions, which FRpos (for example, P FR for a pitcher) is appear below, have a similar form and are calcu- weighted by how many putouts the individual lated in the same way. player has at the position compared to the total putouts for the team at the same position. The fig- The corresponding formulas for catchers is ure in parentheses, which is the amount of putouts CFR = (0.2 × (PO∗ + (2 × A) − E + DP)) the player has at a position compared to the total for the team at the position, represents the percent- −C APL × (Total PO − Total SO) age of the team’s fielding runs for that position that (0.2 × (PO∗ + (2 × A) − E + DP)) C APL = the player is awarded. This is then multiplied by Total PO − Total SO the team’s fielding runs for the position.

Published by eRepository @ Seton Hall, 2020 5 Locus: The Seton Hall Journal of Undergraduate Research, Vol. 3, Iss. 1 [2020], Art. 7

Over the course of a season a player may play Once again, as with the fielding runs FR calcula- more than one position. In this case the final FR tion, a player may play more than one position in calculation is weighted according to games played a season so the final positional adjustment is the at each position. For example, if a player played weighted sum of each position for the respective 100 games total, with 90 games as a and 10 player. games as a , then their final FR score would be 90% of their FR C score from equation 3.5 Pitching (1) plus 10% of their FR 2B score from (1). It is As important as creating runs is, preventing possible a player has more games played at posi- runs is just as important in winning a game. tions than total games played, as it is possible to Thorn and Palmer’s original formula—take the in- play multiple positions within a single game, but nings pitched, multiply it by the number of games this is a rare occurrence and thus there would only played times the league ERA, divide by nine and be a negligible difference. The current simplic- subtract out the player’s earned runs—was league- ity of the calculation makes it more manageable specific and created before the advent of inter- to calculate with available data. league play in Major League Baseball in 1997. 3.4 Positional Adjustment The formula used in our calculations is very sim- ilar but with an interleague adjustment and an ad- Certain positions are clearly more difficult or justment for park factor: more involved than others, a factor that is not IP GP − TotIG taken into account in the fielding runs formula. PR = × × LeagueERA So SHU-WAR also introduces a positional adjust- 9 GP ment factor. Each position is assigned a coeffi- TotIG  + × MLB ERA − (Player ER) cient to account for this factor, with higher num- GP bers indicating more difficult positions and lower and negative numbers easier positions. The coef- where IP stands for by the player, ficients are weighted by the amount of games a GP for games played, TotIG for total number player has played at the position compared to the of interleague games played, LeagueERA is the total games in the season. The formula and the average for the player’s league (Amer- corresponding coefficients used in this version of ican league or National league), and MLB ERA WAR is as follows: stands for the earned for both leagues combined. GP at Pos Similar to batting runs, pitching is inversely Pos Ad j = × PosAd jCoe f f TotGP affected by the park so there is a park factor ad- justment because any park that is easier to hit in, where GP at Pos stands for games played at a is therefore more difficult to pitch in. particular position, TotGP stands for total games played, and the positional adjustment coefficient ( PR×(PF−1) PR + 2 if PR ≥ 0 PosAdjCoeff for the various positions is given in PR PF Ad j = PR×(1−PF) PR + if PR < 0 the following table. 2 It also needs to be noted that positive values are C 1B 2B 3B SS better than negative values where the final value 12.5 -12.5 2.5 2.5 7.5 is in terms of runs prevented instead of runs al- LF CF RF DH P lowed. A negative value indicates the pitcher al- -7.5 2.5 -7.5 -17.5 5 lowed more runs per game than the average pitcher

https://scholarship.shu.edu/locus/vol3/iss1/7 6 Hendela: Sabermetric Analysis: Wins-Above-Replacement

while a positive value indicates that the pitcher SHU-WAR is prevented that amount of runs above the average pitcher. (BR PF Ad j + SBR + FR + Pos Ad j +PR PF Ad j + Repl Ad j) 3.6 Replacement Player Adjustment 2 × RPG where RPG is the average runs per game for the The last component of SHU-WAR is the re- league. placement player adjustment which takes into ac- Each of the six components, and the RPG count the amount the player actually played. Here value, can be constructed using freely available we make a distinction between position play- data, so SHU-WAR can be calculated ers and pitchers, where the pitchers’ replacement for any player from any league using these for- player adjustment is based on batters faced (BF) mulas, and the SHU-WAR value can be broken and position players’ adjustment is based on plate down into the six components so good and bad appearances (PA). aspects of a player’s SHU-WAR value can be di- rectly attributed to specific parts of it. Two peo- BF (P) Replacement Player Ad justment = ple can calculate the wins-above-replacement for 50 any player and come up with the same value using these formulas. There’s undoubtedly room to fur- PA ther perfect the SHU-WAR statistic, which can be (Pos) Replacement Player Ad justment = the study for future research, but as we will show 30 in the rest of the paper in this current state the for- This aspect of SHU-WAR works as a longevity mula already works well. measure where a player is considerably more valu- Before we compare SHU-WAR to other WAR able to a team if they play more. This is in compar- statistics, though, we briefly recap. In con- ison to that of a replacement player who would not structing SHU-WAR, Thorn and Palmer’s formu- be expected to play as much as a quality starter. It las were primarily used, but a few adjustments also should be noted that the National League cur- were made in an attempt to reinforce the qual- rently has pitchers batting, but a pitcher’s replace- ity of the numbers produced. Four primary ad- ment player adjustment is based purely on batters justments were made to different sections of the faced. This distinction is made if a player has formulas that closed the gap between our initial multiple games as a pitcher, as their replacement SHU-WAR values and those of proprietary WAR player adjustment is made using batters faced di- statistics like, for example, Baseball Reference’s vided by 50; otherwise the positional player ad- bWAR. Changes to batting and pitching included justment is used with plate appearances divided by the modifications to Thorn & Palmer’s one year 30. This accounts for cases like Shohei Ohtani in park factor. Using only a one year park factor 2018 where his replacement player adjustment is presents a problem for those years that are above based on batters faced as a pitcher. or below average, where more or less runs may be scored at the home park than normal. Since 3.7 SHU-WAR 2010, Baseball Reference has reportedly used a three-year park factor while FanGraphs has used a The final calculation of SHU-WAR is the sum- five year regressed park factor (WAR Comparison mation of each of the preceding six components, Chart, 2019). We therefore added the adjustment scaled by two times the runs per game average of using up to a five-year park factor adjustment to for the league. For reference the final formula for have a greater sense of the true impact the park has

Published by eRepository @ Seton Hall, 2020 7 Locus: The Seton Hall Journal of Undergraduate Research, Vol. 3, Iss. 1 [2020], Art. 7

on both batting and pitching. Another adjustment ten more involved and are generally more diffi- made was the pitcher’s ABF value was divided cult than other positions. Pitchers have to deal by two. Pitchers’ batting seemed to have a major with bunts and hard ground balls within a short negative impact, specifically for National League distance, so it was deemed at least approximately pitchers on their final SHU-WAR value, so an ad- fair to give a 5.0 positional coefficient to pitchers. justment was made to the ABF value by dividing Other positions over time have had their coeffi- it by two. The ABF is the comparison value to the cient values adjusted as more information is gath- average hitter in the league and since pitchers are ered, so this may be necessary for this value as typically not the best hitters, their batting statistics well, but with the current information available a are much worse than the average major league hit- 5.0 seems appropriate. ter. Also, pitchers’ batting performances are often sacrificed in terms of winning a game: pitchers are 4. Comparison to Proprietary WAR Statistics often pinch-hit for if an important situation comes up in the later innings of a game, as pitchers are In order to justify the accuracy of our SHU- regularly expected to either get out quickly in a WAR statistic, it is important to compare our cal- game or try for a sacrifice play such as bunting culations to some of the large selection of pro- a player over to the next base. A pitcher’s main prietary WAR statistics available, such as Base- purpose is to pitch, so their typical negative bat- ball Reference’s bWAR. The bWAR statistic is ting runs values were adjusted to more accurately one of the most popularly used as it has been represent their actual impact on the game. developed with advanced technology for a long Two other adjustments made to the SHU-WAR time, but it uses metrics that are unavailable to statistic were adjustments to pitching runs with the the public. For example, in their batting portion introduction of interleague play, and the decision of WAR, they use the statistic wOBA (weighted to use a 5.0 coefficient for the pitcher’s positional on-base-average). This value takes into account adjustment. Interleague play was introduced into exclusive data, like the fact that from 2003 on, Major League Baseball in 1997 where teams from they can discern between infield and outfield sin- the American League began playing a few series gles, which have a 0.06 run difference (WAR Ex- with teams from the National League. Since pitch- plained, 2019). Also, Baseball Reference primar- ing runs was originally conceived as a league- ily uses (DRS) as their de- specific statistic only dealing with the league’s fensive metric in calculating bWAR. In the cre- ERA, it was important to take into account the ap- ation of DRS, they measure the player’s fielding proximately twenty interleague games (depending range, the catcher’s throwing ability, the success on the season) that each team played. For any sea- of good fielding plays, and the failure of mis- son in which there were no interleague games, the plays in different situations. Stats for missing the original formula from Thorn and Palmer is sim- cutoff-man in a relay throw and measurements for ply used as the numerator for interleague games a catcher’s ability to frame and block are unavail- would be zero. The last modification to the list of able to the public, but they are all used in Baseball formulas is the positional adjustment coefficient Reference’s calculations behind the scenes (Posi- for pitchers. Initially there was no available coef- tion Player WAR Calculations and Details, 2019). ficient for a pitcher’s fielding so it was determined These are just a few examples of the statistics used that the difficulty of the position fielding-wise was in formulating their final numbers where there are approximately 5.0. This is between and many pages of information about all of the ad- second base which respectively have coefficients justments and calculations used in order to finally of 7.5 and 2.5, where infield positions are of- reach a player’s .

https://scholarship.shu.edu/locus/vol3/iss1/7 8 Hendela: Sabermetric Analysis: Wins-Above-Replacement

One of our secondary goals in developing 2010 season were looked at, as the most available SHU-WAR was to test the hypothesis that many information about Baseball Reference’s bWAR the advanced measures used by organizations like statistic is available from 2010. The 2010 season Baseball Reference and FanGraphs that seem use- was also representative of an average MLB sea- ful may in fact be much beyond what is neces- son, in terms of historical norms for runs scored sary to accurately evaluate a player. A statistic too per game. Each team was purposefully chosen complicated to state its actual calculations, may be to cover different aspects that are accounted for a statistic too complicated to be useful in an accu- in SHU-WAR; for example, the rate analysis. This study therefore challenges the are notoriously known for having a hitter’s park. need for these advanced calculations used in pro- Coors Field is in Denver Colorado where the air prietary WAR statistics by questioning if there is is much thinner than the average ballpark so the a considerable difference between that of a statis- ball is known to travel further, so it was impor- tic that can be understood and calculated by the tant to see how a large park adjustment impacted public to that of an advanced metric from a major the final values. Along the same lines, the 1930 organization. Phillies team was chosen since 1930 was known as “the year of the hitter,” while the championship- 5. Results winning 1968 played in “the year of the pitcher.” The 2018 Los Angeles Angels were To test SHU-WAR versus the proprietary chosen see how arguably the best player in the WAR statistics, we calculated SHU-WAR for 234 present day, , was valued along with players from different teams and different eras and the special player Shohei Ohtani, who is the rare compared the results to Baseball Reference’s final pitcher that is also a quality batter. Lastly, the bWAR values for these players. WAR in general is 1927 are famous for the home a non-standardized statistic but if the final results run battle between and , and are close to that of a major organization, one using we thought it would be valuable to see how leg- along with proprietary data, then ends in the past are valued compared to legends in it shows that WAR could become standardized. the making, such as Mike Trout. Competing organizations have minimal agreement We calculated SHU-WAR for all players from on using any of the same methods in their calcu- these six teams, with positional players (POS) lations, as the sabermetric community makes it al- separated categorically from pitchers (P). Pearson most a game in itself to come up with better ways correlation values between SHU-WAR and Base- to determine the value of each player. However ball Reference’s bWAR values were then calcu- simplicity is often underrated, and we believe it is lated for each team and POS and P segment. The likely possible the value of a player should be able results showed that for each team and segment to be determined with the vast amount of data al- there was greater than a 0.90 Pearson correlation, ready available online. a remarkably high correlation value (see Table 1). The time periods that were studied ranged Moreover, a number of segments had correlation from 1927 to 2018 in order to demonstrate that values greater than 0.98, including the pitchers for SHU-WAR is consistent across different eras. The the 2010 New York Yankees, the pitchers for the full list of teams that were studied are as follows: 1968 Detroit Tigers, the position players for the the 2010 New York Yankees, the 2010 Colorado 2018 Los Angeles Angels, and the position play- Rockies, 2018 Los Angeles Angels, the 1968 De- ers for the 1927 New York Yankees. This indicates troit Tigers, the 1930 , and a strong consistency between SHU-WAR values the 1927 New York Yankees. Two teams from the and Baseball Reference’s bWAR values across po-

Published by eRepository @ Seton Hall, 2020 9 Locus: The Seton Hall Journal of Undergraduate Research, Vol. 3, Iss. 1 [2020], Art. 7

Year/Team POS Correlation P Correlation fense in the SHU-WAR calculations did not re- 2010 NYY 0.91 0.98 ward him as much. of the 1930 2010 COL 0.94 0.94 Philadelphia Phillies was the only player to appear 2018 LAA 0.98 0.92 on both the SHU-WAR versus bWAR list and the 1968 DET 0.95 0.99 fWAR versus bWAR list of players that were out- 1930 PHI 0.95 0.93 side the plus or minus two margin. Collins was 1927 NYY 0.99 0.92 calculated to have a SHU-WAR value of 2.9, a bWAR value of 5.5, and an fWAR value of 3.3. Table 1. Correlations between SHU-WAR and This shows that the true value of Phil Collins is bWAR statistics for position players (POS) and pitchers (P) for the teams considered. generally not known, other than that he was an above average player in 1930. The only WAR value here that can be justified is the SHU-WAR sitions and time periods for entire teams. value though, since it is possible to see the cal- Full teams showed significant correlations to culations done to obtain this statistic. In general, that of Baseball Reference’s bWAR values, but almost all 234 values are within the same reason- SHU-WAR is a statistic for individual players. able range of valuing this subset of players. That leads to the question, were there any ma- jor differences between a player’s SHU-WAR and bWAR? A chart was made comparing the paired differences bWAR minus SHU-WAR and bWAR minus fWAR as a control. That chart appears as Figure 1, with the x-axis ordered by bWAR mi- nus SHU-WAR value. When comparing the WAR statistics, we chose to focus on differences of 2 or more, since 2 wins-above-replacement can signif- icantly impact an evaluation of a player; for exam- ple, someone with a 0 WAR is considered replace- Figure 1. Differences in WAR statistics. ment level, while someone with a 2 WAR is gen- erally considered a good player and players hav- A second set of graphs, shown in Figures 2 ing WAR values above 4 are considered all-stars and 2, display the same information in a differ- (Slowinski, 2010). We found that, of the 234 play- ent way. In these graphs values off the x = y line ers calculated, 13 differed by more than plus or indicate differences of final SHU-WAR values. minus two wins when comparing SHU-WAR and The final result showed a R2 value of 0.8801 dis- bWAR values, while a total 9 players were out- playing a strong correlation between SHU-WAR side of this margin between bWAR and fWAR, a and bWAR. In comparison, there was a R2 value fairly small discrepancy. Examples of players val- of 0.9063 between bWAR and fWAR which are ued differently by SHU-WAR and bWAR included two established statistics that use proprietary mea- Brett Gardner of the 2010 New York Yankees and sures. There is slightly larger variability between Phil Collins of the 1930 Philadelphia Phillies. A SHU-WAR and bWAR, but the difference between look at both WAR calculations reveals that fWAR and bWAR clearly shows that there is still Brett Gardner’s discrepancy can be explained as a inconsistencies between two statistics from ma- discrepancy in defensive value. Advanced calcu- jor organizations that use advanced technology at- lations in the defensive portion of the bWAR sig- tempting to measure the same thing. The dif- nificantly benefited his final value, while his de- ferences between bWAR and fWAR also remind

https://scholarship.shu.edu/locus/vol3/iss1/7 10 Hendela: Sabermetric Analysis: Wins-Above-Replacement

us that the goal of a standardized WAR statistic Range Percent of Players is not to produce the exact same values as one Less Than -2 0.9% of these other wins-above-replacement statistics, -2 ≤ x ≤ 0 30.8% but to measure the same thing in a comparable 0 ≤ x ≤ 2 44.4% way. And at least for SHU-WAR, a transparent 2 ≤ x ≤ 4 11.5% and replicable way. 4 ≤ x ≤ 6 6.0% 6 ≤ x ≤ 8 2.6% 8 ≤ x ≤ 10 1.7% Greater Than 10 2.1%

Table 2. Percent of players studied in various SHU- WAR ranges.

above this value. Of the 234 players whose SHU- WAR values were calculated, 68.3% of players were greater than 0 with 12.4% of players over 4. Figure 2. SHU-WAR versus bWAR values. This last number is a fairly large value, but is un- derstandable given the teams we elected to calcu- late, which included a number of elite players like Babe Ruth, Lou Gehrig, Mike Trout, and Robin- son Cano, and also primarily had an above aver- age number of quality players on each team. The scale for SHU-WAR therefore is similar to that of Baseball Reference and Fangraph’s scale, where below-average to average players are between -2 and 2, quality players are between 2 and 4, and great players up to hall of famers are above 4.

Figure 3. bWAR versus fWAR values.

Lastly, a comparison between SHU-WAR and fWAR was made by looking at data from the 2010 season of FanGraphs. The FanGraphs fWAR statistic for all players in the 2010 season saw 35% of all players fall between WAR values of -1 and 0, 38% of players fall between 0 and 1, 19% be- tween 1 and 4, and 6% above 4 (Slowinski, 2010). Since SHU-WAR values were measured for play- ers from different eras, and for individual teams and not full leagues, it was more important to get a sense of the general shape of the WAR statistic. The replacement level player is indicated as hav- Figure 4. Graph of all 234 SHU-WAR values. ing a 0 WAR, so the majority of players should be

Published by eRepository @ Seton Hall, 2020 11 Locus: The Seton Hall Journal of Undergraduate Research, Vol. 3, Iss. 1 [2020], Art. 7

6. Conclusion a big difference in WAR values? The answer is, there is no way to tell. The calculations are not are only as useful as the public information and these numbers must sim- ability of the user to understand what they are ply be taken at face value for what they are since saying. With advances in technology and data you cannot determine how these numbers are cre- gathering methods, there is a strong temptation to ated. say “more is always better” and to try to use all These principles—simplicity, replicability, of the data possible to its greatest extent. How- and transparency—are why the standardization of ever, more is not always better, and in sabermet- the statistic is necessary and why we set about rics there is a certain point where what a number is to create a WAR statistic that adhered to them. trying to say can get lost, and one of the ways this The steps given in this paper for calculating SHU- can happen is when the value is constructed using WAR are at the very least a proof of concept, and techniques beyond anyone’s understanding. Wins- a starting point for a better WAR statistic. Esmil above-replacement is one place where the battle of Rogers’ SHU-WAR value can be calculated using who can use the best and most advanced measures the methods of this paper to be -0.99, and this cal- has become a major discussion, to the point cur- culation can be replicated by anyone using this set rently that the principal reason for the statistic is of formulas here. This value falls right between lost in all of its calculations. Simplifying, stan- the bWAR value of -1.7 and the FanGraphs value dardizing, and making transparent the calculation of 1.2, but the SHU-WAR value comes with the offers a great benefit to baseball fans and analysts additional information, for example, that Rogers’ alike, who can then use and appreciate the statistic low WAR is due to his poor pitching runs value while staying on the same page. of -15.06 for the 2010 season. All this is done In the current state of the game, wins-above- without using any proprietary data that the aver- replacement is often referred only as WAR, which age person does not have access to. Baseball has often obscures the fact that the source of these always been a game of numbers that fans and an- values, and the values themselves, can vary. As alysts have loved to use, and our hope is that with we have seen there is clearly a difference between SHU-WAR it may finally be the case that WAR major organizational WAR values, where the fi- becomes a standardized statistic that everyone can nal number of wins for an individual player can appreciate and understand for themselves. even differ by even greater than 2 WAR. This fact References is remarkable, especially given the stature of the WAR statistic: as noted by FanGraphs, “we think teams are paying about $8 million per every WAR Costa, G. B., Huber, M. R., & Saccoman, J. T. they add to their roster” (Weinberg, 2016). Take (2008). Practicing Sabermetrics. Jefferson: Mc- a player like in the 2010 season for Farland. the Colorado Rockies: Rogers had a -1.7 bWAR Costa, G. B., Huber, M. R., & Saccoman, J. T. value according to Baseball Reference but a 1.2 (2008). Understanding Sabermetrics. Jefferson: fWAR value according to FanGraphs. Both of McFarland. these statistics are often referred to simply just Costa, G. B., Huber, M. R., & Saccoman, J. T. “WAR”, so Esmil Rogers would likely be a huge (2012). Reasoning with Sabermetrics. Jefferson: advocate of using FanGraph’s WAR over Baseball McFarland. Reference’s WAR, since it would mean he should be worth more than $16,000,000 more. A natu- James, B. (1984). The Baseball Ab- ral question, of course, is why can there be such stract. New York: Ballantine.

https://scholarship.shu.edu/locus/vol3/iss1/7 12 Hendela: Sabermetric Analysis: Wins-Above-Replacement

Position Player WAR Calculations and Details. (2019, May 10). Retrieved from Baseball Reference: https://www.baseball-reference.com/ about/war explained position.shtml Slowinski, S. (2010, February 15). What is WAR? Retrieved from FanGraphs: https://library.fangraphs.com/misc/war/ Thorn, J., & Palmer, P. (1985). The Hidden Game of Baseball. Chicago: The University of Chicago Press. WAR Comparison Chart. (2019, May 10). Retrieved from Baseball Refer- ence: https://www.baseball-reference.com/ about/war explained comparison.shtml WAR Explained. (2019, May 6). Retrieved from Baseball Reference: https://www.baseball- reference.com/about/war explained.shtml Weinberg, N. (2016, January 14). Ba- sic Principles of Free Agent Contract Evaluation. Retrieved from FanGraphs: https://library.fangraphs.com/basic-principles-of- free-agent-contract-evaluation/

Published by eRepository @ Seton Hall, 2020 13