Package 'Sabermetrics'
February 19, 2015

Type             Package
Title            Sabermetrics Functions For Baseball Analytics
Version          1.0
Date             2015-02-06
Author           Peter Xenopoulos <www.peterxeno.com>
Maintainer       Peter Xenopoulos <[email protected]>
Description      A collection of baseball analytics functions for sabermetrics purposes. These functions include popular metrics such as OBP, wOBA, and runs created, as well as field-independent pitching metrics.
License          GPL-3
NeedsCompilation no
Repository       CRAN
Date/Publication 2015-02-07 00:55:03

R topics documented:

    sabermetrics-package
    dice
    eqa
    fip
    iso
    log5
    obp
    ops
    pyth
    rcBasic
    rcBasicSB
    rcPX
    rcTech
    secA
    slg
    wOBA
    Index


sabermetrics-package    Sabermetrics Functions For Baseball Analytics

Description

A collection of baseball analytics functions for sabermetrics purposes. These functions include popular metrics such as OBP, wOBA, and runs created, as well as field-independent pitching metrics.

Details

    Package:  Sabermetrics
    Type:     Package
    Version:  1.0
    Date:     2015-02-06
    License:  GPL-3

Author(s)

Peter Xenopoulos

References

Wikipedia: http://en.wikipedia.org/wiki/Sabermetrics#Examples
Reddit: http://www.reddit.com/r/Sabermetrics


dice    Defense-Independent Component ERA (DICE)

Description

Returns a number that predicts a pitcher's ERA in the following year better than the pitcher's actual ERA in the current year does.

Usage

    dice(HR, BB, HBP, K, IP)

Arguments

    HR     Home Runs Allowed
    BB     Walks Allowed
    HBP    Batters Hit
    K      Strikeouts
    IP     Innings Pitched

Value

Returns 3 + ((13*HR+3*BB+3*HBP-2*K)/IP)

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/Defense_independent_pitching_statistics

Examples

    ## Defense-Independent Component ERA (dice) function is currently defined as
    function (HR, BB, HBP, K, IP)
    {
        defenseERA <- 3 + ((13 * HR + 3 * BB + 3 * HBP - 2 * K)/IP)
        return(defenseERA)
    }

    ## Let's take 2014's NL MVP, Clayton Kershaw, and find his DICE
    ## Stats for Clayton Kershaw available on
    ## http://www.baseball-reference.com/players/k/kershcl01-pitch.shtml
    ## For 2014, Kershaw allowed 9 HR, 31 BB, 2 HBP, 239 K, and 198.1 IP
    ## The formula for his DICE using the dice function is below
    ## Output should be 1.677436
    dice(9,31,2,239,198.1)


eqa    Equivalent Average

Description

A baseball metric invented by Clay Davenport, intended to express the production of hitters in a context independent of park and league effects. EQA represents a hitter's productivity on the same scale as batting average. The function itself returns the raw ratio given under Value, which contains no league or park adjustment.

Usage

    eqa(H, TB, BB, HBP, SB, SAC, SF, AB, CS)

Arguments

    H      Hits
    TB     Total Bases
    BB     Walks
    HBP    Hit by pitch
    SB     Stolen bases
    SAC    Sacrifice hit/bunt
    SF     Sacrifice flies
    AB     At bats
    CS     Caught stealing

Value

Returns (H+TB+1.5*(BB+HBP)+SB+SAC+SF)/(AB+BB+HBP+SAC+SF+CS+(SB/3))

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/Equivalent_average

Examples

    ## The equivalent average (eqa) function is currently defined as
    function (H, TB, BB, HBP, SB, SAC, SF, AB, CS)
    {
        eqa <- (H + TB + 1.5 * (BB + HBP) + SB + SAC + SF)/(AB + BB + HBP + SAC + SF + CS + (SB/3))
        return(eqa)
    }

    ## Let's take 2014's AL MVP, Mike Trout, and find his EQA
    ## Stats for Mike Trout available on
    ## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
    ## For 2014, Trout had 173 H, 338 TB, 83 BB, 10 HBP, 16 SB, 0 SAC, 10 SF, 602 AB, 2 CS
    ## The formula for his EQA using the eqa function is below
    ## Output should be .9496958
    eqa(173,338,83,10,16,0,10,602,2)


fip    Field Independent Pitching

Description

Similar to DICE (see dice), but without the hit-batter term and with the constant 3 replaced by C.

Usage

    fip(HR, BB, K, IP, C)

Arguments

    HR     Home Runs allowed
    BB     Walks
    K      Strikeouts
    IP     Innings Pitched
    C      League average ERA

Value

Returns ((13*HR+3*BB-2*K)/IP) + C

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/Defense_independent_pitching_statistics

See Also

DICE: dice

Examples

    ## Field Independent Pitching (fip) function is currently defined as
    function (HR, BB, K, IP, C)
    {
        fieldIndPitch <- ((13 * HR + 3 * BB - 2 * K)/IP) + C
        return(fieldIndPitch)
    }

    ## Let's take 2014's NL MVP, Clayton Kershaw, and find his FIP
    ## Stats for Clayton Kershaw available on
    ## http://www.baseball-reference.com/players/k/kershcl01-pitch.shtml
    ## For 2014, Kershaw allowed 9 HR, 31 BB, 239 K, 198.1 IP and league ERA (C) of 3.66
    ## The formula for his FIP using the fip function is below
    ## Output should be 2.307148
    fip(9,31,239,198.1,3.66)


iso    Isolated Power

Description

Isolated power is a statistic that measures a hitter's raw power.

Usage

    iso(slg, avg)
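
As a hedged illustration of where the two inputs to iso come from: assuming the standard definitions SLG = TB/AB and AVG = H/AB (this entry does not spell them out), isolated power reduces to (TB - H)/AB. The sketch below uses a local stand-in, iso_sketch, that mirrors the documented formula so that it runs without the package installed. The helper name and the derived inputs are illustrative assumptions, not part of the package.

```r
# Local stand-in mirroring the documented formula (slg - avg).
# iso_sketch is a hypothetical name, not a function from the package.
iso_sketch <- function(slg, avg) {
  slg - avg
}

# Mike Trout's 2014 counting stats, as quoted in this manual's examples
H  <- 173   # hits
TB <- 338   # total bases
AB <- 602   # at bats

# Assumed standard definitions of the two inputs
slg <- TB / AB
avg <- H / AB

iso_from_rates  <- iso_sketch(slg, avg)
iso_from_counts <- (TB - H) / AB   # algebraically the same quantity

print(round(iso_from_rates, 3))    # 0.274
```

Working from the unrounded counts gives roughly 0.2741, which agrees to three decimals with the value obtained from the rounded rates .561 and .287.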

Arguments

    slg    Slugging Percentage. Found from slg
    avg    Batting Average

Value

Returns Slugging Percentage - Batting Average

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/Isolated_Power

See Also

Slugging Percentage: slg

Examples

    ## The isolated power (iso) function is currently defined as
    function (slg, avg)
    {
        iso <- slg - avg
        return(iso)
    }

    ## Let's take 2014's AL MVP, Mike Trout, and find his Isolated Power
    ## Stats for Mike Trout available on
    ## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
    ## For 2014, Trout had a SLG of .561 and an AVG of .287
    ## The formula for his Isolated Power using the iso function is below
    ## Output should be .274
    iso(0.561,0.287)


log5    Log5 Sabermetric formula

Description

Log5 is a formula invented by Bill James to estimate the probability that team A will win a game, based on the true winning percentages of team A and team B. It is equivalent to the Bradley-Terry-Luce model used for paired comparisons, the Elo rating system used in chess, and the Rasch model used in the analysis of categorical data.

Usage

    log5(probA, probB, order)

Arguments

    probA  Win probability of team A
    probB  Win probability of team B
    order  Determines which team's winning probability is returned.
           0 means the win probability of A over B, and 1 means the win probability of B over A.

Value

Returns (probA - (probA*probB)) / (probA + probB - (2 * probA * probB))

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/Log5


obp    On-Base Percentage

Description

Function to calculate the on-base percentage of a player or team.

Usage

    obp(H, BB, HBP, AB, SF)

Arguments

    H      Hits
    BB     Unintentional Walks
    HBP    Hit by pitch
    AB     At bats
    SF     Sacrifice flies

Details

On-base percentage measures how often a player or team reaches base.

Value

Returns the following: ((H+BB+HBP)/(AB+BB+SF+HBP))

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/On-base_percentage

See Also

Slugging Percentage: slg, OPS: ops, and Isolated Power: iso

Examples

    ## The on-base percentage (obp) function is currently defined as
    function (H, BB, HBP, AB, SF)
    {
        onbase <- ((H+BB+HBP)/(AB+BB+SF+HBP))
        return(onbase)
    }

    ## Let's take 2014's AL MVP, Mike Trout, and find his on-base percentage
    ## Stats for Mike Trout available on
    ## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
    ## For 2014, Trout had 173 H, 83 BB, 10 HBP, 602 AB, 10 SF
    ## The formula for his on-base percentage using the obp function is below
    ## Output should be 0.377305
    obp(173,83,10,602,10)


ops    On-base plus Slugging

Description

Function to calculate on-base percentage plus slugging percentage. This is a measure of a hitter's ability to hit for power and get on base.

Usage

    ops(slg, obp)

Arguments

    slg    Slugging percentage. Found from slg
    obp    On-base percentage.
           Found from obp

Value

Returns On-Base Percentage + Slugging Percentage

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/On-base_plus_slugging

See Also

On-base Percentage: obp and Slugging Percentage: slg

Examples

    ## The on-base percentage plus slugging (ops) function is currently defined as
    function (slg, obp)
    {
        ops <- slg + obp
        return(ops)
    }

    ## Let's take 2014's AL MVP, Mike Trout, and find his OPS
    ## Stats for Mike Trout available on
    ## http://www.baseball-reference.com/players/t/troutmi01-bat.shtml
    ## For 2014, Trout had a SLG of .561 and an OBP of .377
    ## The formula for his OPS using the ops function is below
    ## Output should be .938
    ops(0.561,0.377)


pyth    Pythagorean Expectation

Description

Pythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team "should" have won, based on the number of runs it scored and allowed.

Usage

    pyth(RS, RA)

Arguments

    RS     Runs Scored
    RA     Runs Allowed

Value

Returns (RS*RS)/((RS*RS)+(RA*RA))

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/Pythagorean_expectation


rcBasic    Runs Created (Basic)

Description

A basic estimate of how many runs a hitter contributes to his team.

Usage

    rcBasic(H, BB, TB, AB)

Arguments

    H      Hits
    BB     Walks
    TB     Total Bases
    AB     At Bats

Value

Returns ((H+BB)*TB)/(AB+BB)

Author(s)

Peter Xenopoulos

References

http://en.wikipedia.org/wiki/Runs_created

See Also

Runs Created (with stolen bases): rcBasicSB and Runs Created (Technical): rcTech

Examples

    ## This is a generic runs created formula
    ## Let's see how many runs created (keep in mind this is an estimate)
    ## a batter will make with
    ## 100 hits, 7 walks (BB), 80 total bases, and 300 at bats
    function (H, BB, TB, AB)
    {
        rc <- ((H + BB) * TB)/(AB