Package ‘Sabermetrics’
February 7, 2015
Type Package
Title Sabermetrics Functions For Baseball Analytics
Version 1.0
Date 2015-02-06
Author Peter Xenopoulos
R topics documented:
sabermetrics-package
dice
eqa
fip
iso
log5
obp
ops
pyth
rcBasic
rcBasicSB
rcPX
rcTech
secA
slg
wOBA
sabermetrics-package Sabermetrics Functions For Baseball Analytics
Description
A collection of baseball analytics functions for sabermetrics. The functions cover popular metrics such as OBP and wOBA, several runs created formulas, and field-independent pitching metrics.
Details
Package: Sabermetrics
Type: Package
Version: 1.0
Date: 2015-02-06
License: GPL-3
Author(s)
Peter Xenopoulos
References
Wikipedia: http://en.wikipedia.org/wiki/Sabermetrics#Examples
Reddit: http://www.reddit.com/r/Sabermetrics
dice Defense-Independent Component ERA (DICE)
Description
Computes Defense-Independent Component ERA (DICE), a number that predicts a pitcher’s ERA in the following year better than the pitcher’s actual ERA in the current year does.
Usage
dice(HR, BB, HBP, K, IP)
Arguments
HR    Home Runs Allowed
BB    Walks Allowed
HBP   Batters Hit
K     Strikeouts
IP    Innings Pitched
Value
Returns 3 + ((13*HR+3*BB+3*HBP-2*K)/IP)
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Defense_independent_pitching_statistics
Examples
## The Defense-Independent Component ERA (dice) function is currently defined as
function (HR, BB, HBP, K, IP)
{
    defenseERA <- 3 + ((13 * HR + 3 * BB + 3 * HBP - 2 * K)/IP)
    return(defenseERA)
}

## Let's take the 2014 NL MVP, Clayton Kershaw, and find his DICE
## Stats for Clayton Kershaw are available at
## http://www.baseball-reference.com/players/k/kershcl01-pitch.shtml
## For 2014, Kershaw allowed 9 HR, 31 BB, 2 HBP, 239 K, and 198.1 IP
## His DICE, computed with the dice function, is below
## Output should be 1.677436
dice(9,31,2,239,198.1)
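Box scores write partial innings in outs, so the 198.1 IP quoted above means 198 and one third innings, while the example call passes 198.1 as a plain decimal. The sketch below shows how little that changes the result here. `ip_to_decimal` is a hypothetical helper that is not part of the package, and `dice` is copied from the definition printed in this entry.

```r
# dice() as printed in this manual entry
dice <- function(HR, BB, HBP, K, IP) {
  3 + ((13 * HR + 3 * BB + 3 * HBP - 2 * K) / IP)
}

# Hypothetical helper (not in the package): convert box-score innings such as
# 198.1 (198 innings plus 1 out) to true decimal innings (198 + 1/3).
ip_to_decimal <- function(ip) {
  whole <- floor(ip)
  outs <- round((ip - whole) * 10)  # digit after the point counts outs: 0, 1 or 2
  whole + outs / 3
}

dice_literal <- dice(9, 31, 2, 239, 198.1)                 # as in the manual example
dice_thirds  <- dice(9, 31, 2, 239, ip_to_decimal(198.1))  # innings read as 198 1/3
print(c(literal = dice_literal, thirds = dice_thirds))
```

For this stat line the two readings differ by less than 0.002, but converting first keeps the input consistent with how innings are actually counted.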
eqa Equivalent Average
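Description
A baseball metric invented by Clay Davenport, intended to express the production of hitters in a context independent of park and league effects. EQA represents a hitter’s productivity on the same scale as batting average. The function itself takes no park or league inputs and returns the unadjusted ratio shown under Value, so its output (about 0.95 in the example) is not on the batting-average scale.
Usage
eqa(H, TB, BB, HBP, SB, SAC, SF, AB, CS)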
Arguments
H     Hits
TB    Total Bases
BB    Walks
HBP   Hit by pitch
SB    Stolen bases
SAC   Sacrifice hit/bunt
SF    Sacrifice flies
AB    At bats
CS    Caught stealing
Value
Returns (H+TB+1.5*(BB+HBP)+SB+SAC+SF)/(AB+BB+HBP+SAC+SF+CS+(SB/3))
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Equivalent_average
Examples
## The equivalent average (eqa) function is currently defined as
function (H, TB, BB, HBP, SB, SAC, SF, AB, CS)
{
    eqa <- (H + TB + 1.5 * (BB + HBP) + SB + SAC + SF)/(AB + BB + HBP + SAC + SF + CS + (SB/3))
    return(eqa)
}

## Let's take the 2014 AL MVP, Mike Trout, and find his EQA
## Stats for Mike Trout are available at
## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
## For 2014, Trout had 173 H, 338 TB, 83 BB, 10 HBP, 16 SB, 0 SAC, 10 SF, 602 AB, 2 CS
## His EQA, computed with the eqa function, is below
## Output should be .9496958
eqa(173,338,83,10,16,0,10,602,2)
fip Field Independent Pitching
Description
Field Independent Pitching (FIP) is similar to DICE (see dice). As defined here, it omits hit batters (HBP) and adds the constant C in place of the fixed 3 used by dice.
Usage
fip(HR, BB, K, IP, C)
Arguments
HR    Home Runs allowed
BB    Walks
K     Strikeouts
IP    Innings Pitched
C     League average ERA
Value
Returns ((13*HR+3*BB-2*K)/IP) + C
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Defense_independent_pitching_statistics
See Also
DICE dice
Examples
## The Field Independent Pitching (fip) function is currently defined as
function (HR, BB, K, IP, C)
{
    fieldIndPitch <- ((13 * HR + 3 * BB - 2 * K)/IP) + C
    return(fieldIndPitch)
}

## Let's take the 2014 NL MVP, Clayton Kershaw, and find his FIP
## Stats for Clayton Kershaw are available at
## http://www.baseball-reference.com/players/k/kershcl01-pitch.shtml
## For 2014, Kershaw allowed 9 HR, 31 BB, 239 K, 198.1 IP and a league ERA (C) of 3.66
## His FIP, computed with the fip function, is below
## Output should be 2.307148
fip(9,31,239,198.1,3.66)
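To make the relationship between the two pitching formulas concrete, the sketch below copies `dice` and `fip` from the definitions printed in this manual and checks two things: with no hit batters and C = 3 they give the same number, and in general they differ by (3 * HBP / IP) + (3 - C). The variable names `same`, `gap` and `gap_ok` are illustrative only.

```r
# Both functions as printed in this manual
dice <- function(HR, BB, HBP, K, IP) {
  3 + ((13 * HR + 3 * BB + 3 * HBP - 2 * K) / IP)
}
fip <- function(HR, BB, K, IP, C) {
  ((13 * HR + 3 * BB - 2 * K) / IP) + C
}

# Case 1: no hit batters and C = 3, so the formulas coincide
same <- isTRUE(all.equal(dice(9, 31, 0, 239, 198.1), fip(9, 31, 239, 198.1, 3)))

# Case 2: Kershaw's 2014 line from the examples; the gap follows from the formulas
gap <- dice(9, 31, 2, 239, 198.1) - fip(9, 31, 239, 198.1, 3.66)
gap_ok <- isTRUE(all.equal(gap, (3 * 2 / 198.1) + (3 - 3.66)))

print(c(same = same, gap_ok = gap_ok))
```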
iso Isolated Power
Description
Isolated power is a statistic that measures a hitter’s raw power
Usage
iso(slg, avg)
Arguments
slg   Slugging Percentage. Can be computed with slg
avg   Batting Average
Value
Returns Slugging Percentage - Batting Average
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Isolated_Power
See Also
Slugging Percentage slg
Examples
## The isolated power (iso) function is currently defined as
function (slg, avg)
{
    iso <- slg - avg
    return(iso)
}

## Let's take the 2014 AL MVP, Mike Trout, and find his Isolated Power
## Stats for Mike Trout are available at
## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
## For 2014, Trout had a SLG of .561 and an AVG of .287
## His Isolated Power, computed with the iso function, is below
## Output should be .274
iso(0.561,0.287)
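The example above starts from rounded SLG and AVG values. A minimal sketch of the same calculation from raw counts follows, using `slg` and `iso` as printed in this manual and computing batting average as hits divided by at bats. The counts are the Trout figures quoted in the other examples; `trout_avg` and `trout_iso` are illustrative names.

```r
# slg() and iso() as printed in this manual
slg <- function(TB, AB) {
  TB / AB
}
iso <- function(slg, avg) {
  slg - avg
}

# Batting average from raw counts: hits / at bats (173 H, 602 AB)
trout_avg <- 173 / 602

# Isolated power from unrounded inputs (338 TB, 602 AB)
trout_iso <- iso(slg(338, 602), trout_avg)
print(round(trout_iso, 3))
```

Working from counts avoids carrying rounding error from the three-decimal SLG and AVG into the result.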
log5 Log5 Sabermetric formula
Description
Log5 is a formula invented by Bill James to estimate the probability that team A will win a game against team B, based on the true winning percentages of the two teams. It is equivalent to the Bradley-Terry-Luce model used for paired comparisons, the Elo rating system used in chess and the Rasch model used in the analysis of categorical data.
Usage
log5(probA, probB, order)
Arguments
probA   Win probability of team A
probB   Win probability of team B
order   Determines which team's winning probability is returned. 0 means the win probability of A over B, and 1 means the win probability of B over A
Value
Returns (probA - (probA*probB)) / (probA + probB - (2 * probA * probB))
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Log5
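This entry has no example, and the manual does not print the body of `log5`, so the following is only a sketch built from the Value formula. `log5_sketch` is a hypothetical stand-in, and its handling of `order = 1` (swapping the roles of A and B) is an assumption about what the package does. The .600 and .400 teams are made-up inputs.

```r
# Hypothetical re-implementation of the documented Value formula.
# order = 0: probability that A beats B (formula from this manual)
# order = 1: probability that B beats A (assumed to be the mirrored formula)
log5_sketch <- function(probA, probB, order = 0) {
  denom <- probA + probB - (2 * probA * probB)
  if (order == 0) {
    (probA - (probA * probB)) / denom
  } else {
    (probB - (probA * probB)) / denom
  }
}

# A .600 team against a .400 team
p_a_over_b <- log5_sketch(0.6, 0.4, 0)
p_b_over_a <- log5_sketch(0.6, 0.4, 1)
print(c(A = p_a_over_b, B = p_b_over_a, total = p_a_over_b + p_b_over_a))
```

Under this reading the two probabilities sum to 1, and two evenly matched teams each get 0.5.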
obp On-Base Percentage
Description
Function to calculate the on-base percentage of a player/team
Usage
obp(H, BB, HBP, AB, SF)
Arguments
H     Hits
BB    Unintentional Walks
HBP   Hit by pitch
AB    At bats
SF    Sacrifice flies
Details
On-base percentage is used to figure out how often an entity gets on-base
Value
Returns the following: ((H+BB+HBP)/(AB+BB+SF+HBP))
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/On-base_percentage
See Also
Slugging Percentage slg, OPS ops and Isolated Power iso
Examples
## The on-base percentage (obp) function is currently defined as
function (H, BB, HBP, AB, SF)
{
    onbase <- ((H+BB+HBP)/(AB+BB+SF+HBP))
    return(onbase)
}

## Let's take the 2014 AL MVP, Mike Trout, and find his on-base percentage
## Stats for Mike Trout are available at
## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
## For 2014, Trout had 173 H, 83 BB, 10 HBP, 602 AB, 10 SF
## His on-base percentage, computed with the obp function, is below
## Output should be 0.377305
obp(173,83,10,602,10)
ops On-base plus Slugging
Description
Function to calculate on-base percentage plus slugging percentage. This is a measure of a hitter’s ability to hit for power and get on base.
Usage
ops(slg, obp)
Arguments
slg   Slugging percentage. Can be computed with slg
obp   On-base percentage. Can be computed with obp
Value
Returns On-Base Percentage + Slugging Percentage
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/On-base_plus_slugging
See Also
On-base Percentage obp and Slugging Percentage slg
Examples
## The on-base percentage plus slugging (ops) function is currently defined as
function (slg, obp)
{
    ops <- slg + obp
    return(ops)
}

## Let's take the 2014 AL MVP, Mike Trout, and find his OPS
## Stats for Mike Trout are available at
## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
## For 2014, Trout had a SLG of .561 and an OBP of .377
## His OPS, computed with the ops function, is below
## Output should be .938
ops(0.561,0.377)
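Since both arguments of `ops` can come from other functions in this package, the pieces can be chained. The sketch below copies `slg`, `obp` and `ops` from the definitions printed in this manual and feeds them the Trout counts used in the other examples; `trout_ops` is an illustrative name.

```r
# slg(), obp() and ops() as printed in this manual
slg <- function(TB, AB) {
  TB / AB
}
obp <- function(H, BB, HBP, AB, SF) {
  (H + BB + HBP) / (AB + BB + SF + HBP)
}
ops <- function(slg, obp) {
  slg + obp
}

# 2014 Trout counts from the other examples:
# 338 TB, 602 AB, 173 H, 83 BB, 10 HBP, 10 SF
trout_ops <- ops(slg(338, 602), obp(173, 83, 10, 602, 10))
print(trout_ops)
```

The result differs slightly from the .938 above because that example adds values already rounded to three decimals.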
pyth Pythagorean Expectation
Description
Pythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team "should" have won based on the number of runs they scored and allowed.
Usage
pyth(RS, RA)
Arguments
RS    Runs Scored
RA    Runs Allowed
Value
Returns (RS*RS)/((RS*RS)+(RA*RA))
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Pythagorean_expectation
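This entry has no example, and the manual does not print the body of `pyth`. The sketch below is built only from the Value formula: `pyth_sketch` is a hypothetical stand-in, and the 800 runs scored, 700 runs allowed and 162-game schedule are made-up inputs for illustration.

```r
# Hypothetical re-implementation of the documented Value formula
pyth_sketch <- function(RS, RA) {
  (RS * RS) / ((RS * RS) + (RA * RA))
}

# Illustrative team: 800 runs scored, 700 runs allowed
expected_pct <- pyth_sketch(800, 700)

# The formula returns a winning percentage; multiply by games played
# to turn it into an expected win total (162 games assumed here)
expected_wins <- expected_pct * 162
print(c(pct = expected_pct, wins = expected_wins))
```

Note that the value is a winning percentage, so the "how many games" in the description comes from multiplying by the number of games played.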
rcBasic Runs Created (Basic)
Description
Basic estimate of how many runs a hitter contributes to his team
Usage
rcBasic(H, BB, TB, AB)
Arguments
H     Hits
BB    Walks
TB    Total Bases
AB    At Bats
Value
Returns ((H+BB)*TB)/(AB+BB)
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Runs_created
See Also
Runs Created (with stolen bases) rcBasicSB and Runs Created (Technical) rcTech
Examples
## This is a generic runs created formula, currently defined as
function (H, BB, TB, AB)
{
    rc <- ((H + BB) * TB)/(AB + BB)
    return(rc)
}

## Let's see how many runs (keep in mind this is an estimate) a batter creates with
## 100 hits, 7 walks (BB), 80 total bases, and 300 at bats
## (illustrative inputs only; a real stat line has at least as many total bases as hits)
rcBasic(100,7,80,300) # Should output 27.88274 runs
rcBasicSB Runs Created (with Stolen Bases)
Description
Basic description of how many runs a hitter contributes to his team (with stolen bases included)
Usage
rcBasicSB(H, BB, TB, AB, CS, SB)
Arguments
H     Hits
BB    Walks
TB    Total Bases
AB    At Bats
CS    Caught Stealing
SB    Stolen Bases
Value
Returns ((H+BB-CS)*(TB+(0.55*SB)))/(AB+BB)
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Runs_created
See Also
Runs Created (Basic) rcBasic and Runs Created (Technical) rcTech
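This entry has no example. As a rough illustration of what the stolen-base terms add, the sketch below compares the basic formula with the stolen-base version on the Trout counts quoted elsewhere in this manual. `rcBasic` is copied from its printed definition; `rcBasicSB_sketch` is a hypothetical stand-in written from the Value formula, since the manual does not print the body of `rcBasicSB`.

```r
# rcBasic() as printed in this manual
rcBasic <- function(H, BB, TB, AB) {
  ((H + BB) * TB) / (AB + BB)
}

# Hypothetical re-implementation of the documented rcBasicSB Value formula
rcBasicSB_sketch <- function(H, BB, TB, AB, CS, SB) {
  ((H + BB - CS) * (TB + (0.55 * SB))) / (AB + BB)
}

# 2014 Trout counts from the other examples:
# 173 H, 83 BB, 338 TB, 602 AB, 2 CS, 16 SB
rc_plain <- rcBasic(173, 83, 338, 602)
rc_sb    <- rcBasicSB_sketch(173, 83, 338, 602, 2, 16)
print(c(basic = rc_plain, with_sb = rc_sb, difference = rc_sb - rc_plain))
```

Times caught stealing reduce the on-base factor, while each stolen base adds 0.55 to the advancement factor.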
rcPX Runs Created (PX Model)
Description
Runs created model using linear weights from 2012, 2013 and 2014 MLB league data
Usage
rcPX(SINGLES, DOUBLES, TRIPLES, HR, BB, SB)
Arguments
SINGLES   Number of singles
DOUBLES   Number of doubles
TRIPLES   Number of triples
HR        Number of home runs
BB        Number of unintentional walks
SB        Number of stolen bases
Value
Returns -391.39753 + 0.44953*(SINGLES) + 0.85285*(DOUBLES) + 1.05912*(TRIPLES) + 1.36359*(HR) + 0.31761*(BB) + 0.21599*(SB)
Author(s)
Peter Xenopoulos
References
http://peterxeno.com/linear-weights-in-baseball-sabermetrics/
See Also
rcBasic, rcBasicSB, rcTech
Examples
## Let's say the LA Dodgers had
## 952 singles, 302 doubles, 38 triples, 134 home runs, 519 walks,
## 138 stolen bases
rcPX(952,302,38,134,519,138) # Outputs 711.7296
rcTech Runs Created (Technical)
Description
How many runs a hitter contributes to his team (technical version)
Usage
rcTech(H, BB, CS, HBP, GIDP, TB, IBB, SAC, SF, SB)
Arguments
H      Hits
BB     Unintentional walks
CS     Caught stealing
HBP    Hit by pitch
GIDP   Grounded into double play
TB     Total bases
IBB    Intentional walks
SAC    Sacrifice hits/bunts
SF     Sacrifice flies
SB     Stolen bases
Value
Returns (H+BB-CS+HBP-GIDP)*(TB + (0.26*(BB-IBB+HBP)) + (0.52*(SAC+SF+SB)))
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Runs_created
See Also
Runs Created (Basic) rcBasic and Runs Created (with Stolen Bases) rcBasicSB
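This entry has no example. The sketch below evaluates the Value formula as documented; `rcTech_sketch` is a hypothetical stand-in, since the manual does not print the function body. The inputs reuse the Trout counts quoted elsewhere in this manual, except GIDP = 6 and IBB = 6, which are placeholders chosen for illustration. As documented, the value is a product of two count-based factors with no division by opportunities, so it is much larger than a run total. The technical formula in the referenced Wikipedia article divides that product by AB + BB + HBP + SAC + SF; the last lines apply that division under the assumption of 602 at bats, which is not an argument of `rcTech`.

```r
# Hypothetical re-implementation of the documented Value formula
rcTech_sketch <- function(H, BB, CS, HBP, GIDP, TB, IBB, SAC, SF, SB) {
  (H + BB - CS + HBP - GIDP) *
    (TB + (0.26 * (BB - IBB + HBP)) + (0.52 * (SAC + SF + SB)))
}

# Trout counts from the other examples; GIDP and IBB are placeholder values
raw_product <- rcTech_sketch(H = 173, BB = 83, CS = 2, HBP = 10, GIDP = 6,
                             TB = 338, IBB = 6, SAC = 0, SF = 10, SB = 16)

# Assumption: scale by opportunities as in the Wikipedia technical formula
# (AB is supplied separately because rcTech has no AB argument)
AB <- 602
opportunities <- AB + 83 + 10 + 0 + 10   # AB + BB + HBP + SAC + SF
runs_created <- raw_product / opportunities
print(c(raw = raw_product, scaled = runs_created))
```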
secA Secondary Average
Description
Secondary average, or SecA, is a baseball statistic that measures the sum of extra bases gained on hits, walks, and stolen bases (less times caught stealing), expressed per at bat.
Usage
secA(BB, TB, H, SB, CS, AB)
Arguments
BB    Walks
TB    Total Bases
H     Hits
SB    Stolen Bases
CS    Caught Stealing
AB    At Bats
Value
Returns (BB+(TB-H)+(SB-CS))/AB
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Secondary_average
Examples
## The secondary average (secA) function is currently defined as
function (BB, TB, H, SB, CS, AB)
{
    avg <- (BB + (TB - H) + (SB - CS))/AB
    return(avg)
}

## Let's take the 2014 AL MVP, Mike Trout, and find his SecA
## Stats for Mike Trout are available at
## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
## For 2014, Trout had 83 BB, 338 TB, 173 H, 16 SB, 2 CS, 602 AB
## His SecA, computed with the secA function, is below
## Output should be .4352159
secA(83,338,173,16,2,602)
slg Slugging Percentage
Description
Function to calculate the slugging percentage (total hitting power) of a player/team
Usage
slg(TB, AB)
Arguments
TB    Total bases
AB    At bats
Details
Slugging percentage is a popular measure of the hitting power of a player or team
Value
Returns (TB/AB)
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/Slugging_percentage
Examples
## The slugging percentage (slg) function is currently defined as
function (TB, AB)
{
    slugging <- (TB/AB)
    return(slugging)
}

## Let's take the 2014 AL MVP, Mike Trout, and find his slugging percentage
## Stats for Mike Trout are available at
## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
## For 2014, Trout had 602 AB and 338 TB
## His slugging percentage, computed with the slg function, is below
## Output should be 0.5614618
slg(338, 602)
wOBA Weighted On Base Average
Description
Finds the weighted on-base average, a statistic based on linear weights of events
Usage
wOBA(BB, HBP, SINGLE, RBOE, DOUBLE, TRIPLE, HR, PA)
Arguments
BB       Walks
HBP      Hit by pitch
SINGLE   Number of singles
RBOE     Number of times reached base on error
DOUBLE   Number of doubles
TRIPLE   Number of triples
HR       Number of home runs
PA       Number of plate appearances
Value
Returns ((0.72*BB)+(0.75*HBP)+(0.90*SINGLE)+(0.92*RBOE)+(1.24*DOUBLE)+(1.56*TRIPLE)+(1.95*HR))/PA, which is based on linear weights
Author(s)
Peter Xenopoulos
References
http://en.wikipedia.org/wiki/WOBA
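This entry has no example, and the manual does not print the body of `wOBA`. The sketch below is written from the Value formula only: `wOBA_sketch` is a hypothetical stand-in, and the season line passed to it is made up for illustration rather than taken from a real player.

```r
# Hypothetical re-implementation of the documented Value formula
wOBA_sketch <- function(BB, HBP, SINGLE, RBOE, DOUBLE, TRIPLE, HR, PA) {
  ((0.72 * BB) + (0.75 * HBP) + (0.90 * SINGLE) + (0.92 * RBOE) +
     (1.24 * DOUBLE) + (1.56 * TRIPLE) + (1.95 * HR)) / PA
}

# Made-up season line: 60 BB, 5 HBP, 100 singles, 4 times reached on error,
# 30 doubles, 3 triples, 20 home runs, 600 plate appearances
example_woba <- wOBA_sketch(BB = 60, HBP = 5, SINGLE = 100, RBOE = 4,
                            DOUBLE = 30, TRIPLE = 3, HR = 20, PA = 600)
print(round(example_woba, 3))
```

Each event is weighted by its fixed coefficient and the total is divided by plate appearances, so the result reads on roughly the same scale as on-base percentage.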