
More on Defensive Regression (or Runs) Analysis

This appendix has three primary objectives: first, to disclose aspects of DRA not disclosed in chapter two; second, to address aspects of the model that raise issues related less to baseball per se than to statistical modeling in general; and third, to drive home the fundamental point that DRA is not an answer, but a method. Included in this appendix are certain alternative models I tried, and suggestions for further improvements, which should provide some sense of the range of alternative approaches that are possible.

DRA POST-1951

Overview

There are essentially two DRA models: post-1951 and pre-1952. The post-1951 model uses a subset of Retrosheet play-by-play data currently available for seasons after 1951, and was almost completely described in chapter two. The pre-1952 model must make do with considerably less data, which renders it more primitive for infielders and unavoidably more complicated for outfielders. When we first began explaining DRA, we took a ‘bottom-up’ approach, starting from the shortstop position and gradually building up until we had a team model. Here we’ll take a ‘top-down’ approach, revealing the entire post-1951 team model all at once, and then discussing its components. Likewise, we’ll start with a top-down discussion of the pre-1952 model. The following page presents the entire post-1951 model on one page, with a glossary of defined terms on the facing page.


1952–2009 DRA Model

Team defensive runs above or below the league rate, given innings pitched, DR.ip, is estimated as the sum of pitching, catching, infield, and outfield defensive runs:

Pitching = .27 * SO.bfp - .34 * BB.bfp - 1.49 * HR.bh + .42 * A1.bip + .44 * IFO.bip - .56 * WP.ip.
Catching = .59 * CS.sba + .59 * GO2.bip.
Infield = .52 * rGO3 + .53 * rA4 + .45 * rA5 + .44 * rA6.
Outfield = .53 * rPO7 + .46 * rPO8 + .44 * rPO9 + .61 * A7.ip + .61 * A8.ip + .61 * A9.ip.

All ‘plain’ variables are team seasonal totals. See definitions on facing page. All variables with a ‘dot’, for example, A6.bip, are calculated in the same way: A6.bip = A6 - [league A6 * (BIP / league BIP)]. A6.bip equals total A6 recorded by the team above (if negative, below) the league-average rate that year, given total team BIP ‘opportunities’. The ‘opportunities’ variable following the ‘dot’ is always in lower case letters. All variables beginning with an “r” are residual team plays that year; that is, estimated net plays taking into account available predictors, using regression analysis.
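As a minimal sketch of the ‘dot’ calculation, assuming the expected total is the league total scaled by the team’s share of league opportunities: the function name `net_given` and all the numbers below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def net_given(team_plays, team_opps, league_plays, league_opps):
    """Team plays above (+) or below (-) the league-average rate,
    given the team's own opportunities."""
    expected = league_plays * (team_opps / league_opps)
    return team_plays - expected

# Hypothetical season totals (illustrative only, not real data).
# The league averages 7000 / 61600 A6 per BIP, so a team with 4400 BIP
# would be expected to record 500 A6; this team recorded 520.
a6_bip = net_given(team_plays=520, team_opps=4400,
                   league_plays=7000, league_opps=61600)
print(round(a6_bip, 1))
```

The same helper would serve for any ‘dot’ variable by swapping in the appropriate ‘denominator’ (BFP, BH, BIP, SBA, or IP).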

rGO3 = GO3.bip + .09 * RBIP.bip.
rA4 = A4.bip + .08 * RBIP.bip + .15 * RFO.rbip + .32 * LFO.lbip + .18 * HR.bh + .20 * WP.ip + .19 * SH.bip.
rA6 = A6.bip - .06 * RBIP.bip + .29 * RFO.rbip + .15 * LFO.lbip + .12 * HR.bh + .56 * WP.ip + .43 * SH.bip.
rA5 = A5.bip - .10 * RBIP.bip + .21 * RFO.rbip + .10 * LFO.lbip + .15 * A1.bip + .13 * rGO3 + .13 * IBB.pa.
rPO7 = PO7.bip + .03 * RBIP.bip + .21 * RGO.rbip + .10 * LGO.lbip.
rPO8 = PO8.bip - .01 * RBIP.bip + .27 * RGO.rbip + .24 * LGO.lbip + .07 * IFO.bip + .20 * SH.bip.
rPO9 = PO9.bip - .03 * RBIP.bip + .22 * RGO.rbip + .22 * LGO.lbip + .12 * IFO.bip.
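To make the residual formulas concrete, here is a small sketch that evaluates the rA4 line for one team-season and converts it to runs with the .53 weight from the Infield formula. The coefficients are copied from the formulas above; the dictionary layout, the function name `residual_a4`, and the input values are hypothetical.

```python
# Coefficients copied from the rA4 formula; keys name centered team totals.
RA4_COEFS = {
    "RBIP.bip": 0.08, "RFO.rbip": 0.15, "LFO.lbip": 0.32,
    "HR.bh": 0.18, "WP.ip": 0.20, "SH.bip": 0.19,
}

def residual_a4(centered):
    """rA4 = A4.bip plus the weighted context adjustments."""
    return centered["A4.bip"] + sum(
        weight * centered[name] for name, weight in RA4_COEFS.items())

# Hypothetical centered values for one team-season (not real data).
team = {"A4.bip": 10.0, "RBIP.bip": -100.0, "RFO.rbip": 20.0,
        "LFO.lbip": 10.0, "HR.bh": 0.0, "WP.ip": 5.0, "SH.bip": 0.0}

r_a4 = residual_a4(team)   # 10 - 8 + 3 + 3.2 + 0 + 1 + 0
runs_a4 = 0.53 * r_a4      # run weight for rA4 from the Infield formula
print(round(r_a4, 2), round(runs_a4, 3))
```

In this made-up case the team faced many left-handed BIP (negative RBIP.bip), which inflates second base chances, so part of the raw surplus of ten assists is removed before runs are credited.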

Example of allocation of team fielding runs to individual (lower-case “i”) fielders:

iA6 runs = .44 * rA6 * (iIP / IP) + .44 * [iA6 - A6 * (iIP / IP)].
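The allocation formula can be sketched as follows. The function name and the two-shortstop example are hypothetical; the .44 weight is the rA6 run weight from the Infield formula. The example also illustrates that individual runs sum to the team total, because the second term nets to zero across a team’s players.

```python
def individual_a6_runs(r_a6_team, a6_team, ip_team, a6_player, ip_player,
                       run_weight=0.44):
    """Pro-rata share of team residual runs, plus the player's assists
    above or below the team rate given his share of innings."""
    share = ip_player / ip_team
    pro_rata = run_weight * r_a6_team * share
    vs_team = run_weight * (a6_player - a6_team * share)
    return pro_rata + vs_team

# Hypothetical team: rA6 = +20, 500 A6 in 1450 innings at shortstop.
regular = individual_a6_runs(20, 500, 1450, a6_player=420, ip_player=1160)
backup = individual_a6_runs(20, 500, 1450, a6_player=80, ip_player=290)
print(round(regular, 2), round(backup, 2), round(regular + backup, 2))
```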


Definitions of Team-Level Variables for DRA Model (1952–2009)

Abbrev.   Definition                                      Formula or Source
1 … 9     Pitcher ... Right Fielder
A         Assists (total, if not followed by a number)
BB        Unintentional BB + HBP                          UBB + HBP
BFP       Batters Faced by Pitcher                        PA - IBB
BH        Balls Hit                                       BFP - SO - BB
BIP       Balls In Play                                   BH - HR
CS        Caught Stealing
FO        Fly Outs (total)                                RFO + LFO
GO        Ground Outs (total)                             RGO + LGO
GO2       GO at catcher                                   A2 - CS
GO3       GO at first base                                A3 + UGO3
HBP       Hit By Pitch
HR        Home Runs
IA        Infielder-only Assists                          sum(A1, A2, ..., A6)
IBB       Intentional Bases on Balls                      BB (traditional) - UBB
IFO       Infielder-only FO                               FO - OPO
IP        Innings Pitched (or Played)
IPO       Infielder-only PO                               sum(PO1, PO2, ..., PO6)
LFO       Left-handed batter FO                           play-by-play data
LGO       Left-handed batter GO                           play-by-play data
OA        Outfielder-only A                               sum(A7, A8, A9)
OPO       Outfielder-only PO                              sum(PO7, PO8, PO9)
PA        Plate Appearances
PB        Passed Balls
PO        Putouts (total, if not followed by a number)
RBIP      Right-handed batter BIP                         play-by-play data
RFO       Right-handed batter FO                          play-by-play data
RGO       Right-handed batter GO                          play-by-play data
SBA       Stolen Base (“SB”) Attempts                     SB + CS
SH        Sacrifice Hits
SO        Strikeouts
UBB       Unintentional BB                                BB (traditional) - IBB
UGO3      Unassisted GO3                                  avg(UGO3e1, UGO3e2)
UGO3e1    UGO3 estimate #1                                IPO - A - IFO
UGO3e2    UGO3 estimate #2                                GO - IA - CS - GIDP
WP        Wild Pitches (includes PB)                      WP (traditional) + PB

The previous two pages are a bit much to take in all at once. But I do not believe that any other comprehensive system for team and individual fielding remotely as accurate as DRA can be summarized as concisely. Before addressing the new points, let’s quickly recap in a few pages the basic approach under DRA as described in chapter two. You might find it helpful to flip back to the preceding two pages as you read both the recap and the discussion of new issues.

DRA is essentially a forced-zero-intercept, two-stage multivariable least-squares regression analysis model. I’m using the “two-stage” terminology informally; as we shall see, the DRA model is not an “instrumental variables” model, otherwise known as a “two-stage least-squares” model. The forced zero intercept merely means that we ‘center’ the ultimate outcome being predicted (team runs allowed), each ‘play made’ outcome (each pitching and fielding ‘play’ that is made) used to predict expected team runs allowed, and each variable used to ‘predict’ expected pitching and fielding plays, so that all outcomes and their respective ‘predictors’ are net numbers, above or below the league-average rate. Furthermore, each outcome or predictor is centered by reference to its appropriate ‘denominator’ of opportunities (the ‘denominators’ are not literally used as denominators in the arithmetical sense; hence the quotation marks).

The first stage of regression analysis involves regressing ‘centered’ fielding variables ‘onto’ centered variables not under the control of the fielding position being evaluated (and ideally not influenced by the quality of other fielders) that tend to be associated with more or fewer fielding plays at that position. The residual left over from each first-stage regression at each position is treated as an estimate of the ‘skill’ plays made at that position above or below expectation.
The second-stage regression involves regressing net team runs allowed onto net pitching and (first-stage-regression-adjusted) fielding plays in order to reveal the number of runs associated with each net pitching and (first-stage-regression-adjusted) fielding outcome. To rate a team at a position, you simply apply the weight determined in the second-stage regression to the net plays (which, again, are negative half the time) to determine defensive runs at that position. Finally, you allocate team defensive runs at that position to each player in two steps: first pro rata, based on his innings played at that position, and then by his net plays compared to the team rate, given his percentage of team innings played. Each net play is credited with the same run weight used for the team rating at that position.
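The two stages described above can be sketched on synthetic data. This is a deliberately simplified illustration, not the DRA fitting procedure: it uses one context variable and one fielding variable, every number is simulated, and the outcome is expressed as runs saved (the negative of net runs allowed) so that the fitted weight comes out positive. The helper name `slope_through_origin` is hypothetical.

```python
import random

def slope_through_origin(x, y):
    """Least-squares slope with the intercept forced to zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

random.seed(0)
n = 2000  # synthetic team-seasons

# Simulated centered data (assumptions for illustration only).
context = [random.gauss(0, 30) for _ in range(n)]   # e.g. a stand-in for RFO.rbip
skill = [random.gauss(0, 20) for _ in range(n)]     # unobserved 'skill' plays
plays = [-0.15 * c + s for c, s in zip(context, skill)]        # centered plays
runs_saved = [0.45 * s + random.gauss(0, 10) for s in skill]   # centered outcome

# Stage 1: regress plays onto context; the residual estimates skill plays.
b1 = slope_through_origin(context, plays)
residual_plays = [p - b1 * c for p, c in zip(plays, context)]

# Stage 2: regress the outcome onto residual plays to get a run weight.
run_weight = slope_through_origin(residual_plays, runs_saved)
print(round(b1, 3), round(run_weight, 3))
```

Because every series is centered, no intercept is fitted in either stage, which is what the forced zero intercept amounts to in practice.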


Centering The Variables By Their Respective ‘Denominators’

We center all the team variables by their respective ‘denominators’ of opportunities. Centering in this way is the first step towards making each variable less correlated with the others, so that its ‘independent’ net impact in runs may be better estimated. The little quotation marks are there because we will not achieve true independence in a mathematically precise sense.

The best ‘denominator’ of opportunities for the ultimate outcome we’re trying to model (actual total team runs allowed per season) is innings pitched, so we calculate team runs allowed above or below the league-average rate given the team’s innings played, that is, net runs allowed given innings played, or RA.ip. In some sense this is just denominating net runs allowed by total outs, as innings are defined by outs. This is correct, because the ultimate limit on the number of runs a team can score in an inning is defined by outs.

The best ‘denominator’ of opportunities for pitchers to record strikeouts (“SO”) or unintentional walks (including batters hit by pitch, “BB”) is the number of batters they face, or batters facing pitcher (“BFP”); hence net strikeouts given batters facing the team’s pitchers (“SO.bfp”) and net unintentional walks and batters hit by pitch (“BB.bfp”).¹ The best ‘denominator’ for home runs allowed (“HR”) is any BFP not ending in an SO or BB, or “balls hit” (“BH”); hence HR.bh, which tracks home runs allowed, given that the batter has made contact.

The number of balls in play (BH minus HR, or “BIP”) is the primary ‘denominator’ of opportunities for plays involving getting the batter out on a batted ball not hit out of the park. By initially ‘denominating’ outcomes by BIP, we begin the process of measuring net plays independent of the pitching staff’s SO.bfp, BB.bfp, and HR.bh.

Infield fly outs, that is, fly balls caught by infielders (“IFO”), are almost always weakly hit balls that could be caught by two or more fielders. Since they are nearly automatic outs, analogous to SO, we credit the pitchers with IFO relative to the league, given total BIP, resulting in the IFO.bip variable appearing among pitching runs. Likewise, we credit the pitcher if he records an assist (“A1”), which will almost always be on a ground ball he has fielded (A1.bip). BIP is also the best

1. In this version of DRA, I tried treating intentional walks separately; for reasons discussed shortly below it didn’t make any difference, though it ‘should’ have. The BFP ‘denominator’ for SO.bfp and BB.bfp excludes plate appearances ending in an intentional walk.


‘denominator’ for ground out fielding plays at catcher (“GO2”) and first (“GO3”), assists at second, third, and short, and putouts at each outfield position. The simplest ‘denominator’ for runners caught stealing (“CS”) is the number of stolen base attempts (“SBA”), hence CS.sba. Finally, wild pitches (defined here to include passed balls, “WP”) and outfielder assists (A7, A8, and A9) are ‘denominated’ by innings played (“IP”), not because that is optimal, but because it is simple. An alternative approach is addressed further below.

For the pitching, catching, and outfielder assists variables, ‘centering’ is the only adjustment that has to be made. (The coefficients for A7.ip, A8.ip, and A9.ip are the same because I combined all three into one variable, A789.ip, when running the second-stage regression.) Furthermore, with the exception of IFO.bip and GO2.bip, we have the exact counts of ‘denominators’ per pitcher (their BFP, BH, and BIP) and catcher (their SBA), so the individual formulas are the same as the team formula, and the sum of individual results equals the team results.

There is one variable that is truly a combination of a pitching and catching variable: WP.ip, and not just because it includes passed balls. We credit or debit the pitchers with total WP.ip, because by far the largest source of variance in both wild pitches and passed balls is pitching and sheer pitcher wildness. However, to give catchers some credit for being better or worse at preventing wild pitches and passed balls, we credit each catcher with the number of his net passed balls, given innings played, relative to his team (which would ‘control’ somewhat for the effect of pitchers), and multiplied by three, because there have been roughly two wild pitches per passed ball throughout major league history. Thus, we credit the catcher with effectively two wild pitches saved and one passed ball saved for every passed ball he records in a season below his team’s rate, and debit him likewise for every passed ball above it.

It’s an admittedly crude measure of the impact catchers have on passed balls and wild pitches, but it is probably reasonable, because catchers miss so much playing time that the set of their catching teammates, at least over the course of a career, probably approaches league-average performance. And, as emphasized in our catcher chapter, all of the traditional methods for evaluating catchers are very suspect, because the biggest impact catchers may have is on pitcher effectiveness, more specifically, SO.bfp and BB.bfp, rather than on base runner defense.
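A short sketch of the catcher passed-ball credit described above. The function name and the numbers are hypothetical, and the sketch stops at the count of effective wild pitches and passed balls saved; it does not convert that count to runs.

```python
def catcher_effective_wp_saved(pb_player, ip_player, pb_team, ip_team):
    """Three times the catcher's passed balls below (+) or above (-)
    his team's rate, given his innings caught."""
    expected_pb = pb_team * (ip_player / ip_team)
    net_pb = pb_player - expected_pb   # positive = more PB than the team rate
    return -3.0 * net_pb

# Hypothetical: team allows 15 PB in 1440 innings; this catcher allows
# 6 PB in 960 innings, where the team rate would predict 10.
saved = catcher_effective_wp_saved(pb_player=6, ip_player=960,
                                   pb_team=15, ip_team=1440)
print(saved)
```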

Adjusting Net Plays Made Using Proxy BIP Distribution Variables

Second, we refine the estimate of true ‘skill’ plays made on BIP by ‘backing out’, using regression analysis, the estimated effects pitchers and batters have on the distribution of BIP throughout the field. The key items of information gleaned from Retrosheet used to make these adjustments are the number of


total BIP hit by opponent right-handed batters (Right-handed opponent batter BIP, or “RBIP”), the number of fly outs (“FO”) and ground outs (“GO”) recorded against opponent right-handed batters (Right-handed opponent batter FO and GO, or “RFO” and “RGO”), and the number of FO and GO recorded against opponent left-handed batters (Left-handed opponent batter FO and GO, or “LFO” and “LGO”).

The ‘denominator’ for RBIP is total BIP, yielding RBIP.bip (you have to have a BIP to have an RBIP), which is negative when the team has faced more left-handed opponent batter BIP than average. The ‘denominator’ for RFO and RGO is RBIP (you have to have an RBIP to have either an RFO or an RGO), yielding RFO.rbip and RGO.rbip. The ‘denominator’ for LFO and LGO is total BIP hit by opponent left-handed batters, which is merely BIP minus RBIP, or LBIP, yielding LFO.lbip and LGO.lbip. Notice that these variables have all been constructed so that they are at least arithmetically ‘independent’ of each other.

These five key variables (RBIP.bip, RFO.rbip, RGO.rbip, LFO.lbip, and LGO.lbip) are the “Proxy BIP Distribution Variables.” They are good, if imperfect, proxies for whatever ‘perfect’ information could theoretically be obtained regarding the actual distribution of expected BIP fielding plays. As we showed in our Bill Mazeroski, Buddy Bell, and Mickey Mantle examples in chapter two, regression analysis reveals that they have the kind of statistical relationships with net second base assists (“A4”) given total BIP (“A4.bip”), net third base assists (“A5”) given total BIP (“A5.bip”), and net center field putouts (“PO8”) given total BIP (“PO8.bip”) that one would expect.

When RBIP.bip is positive (that is, when there is an above-average number of BIP hit by opponent right-handed batters, given total BIP), there are more ground outs recorded on the left side of the infield (third and short) and more fly outs recorded on the right side of the outfield (center and right). When RBIP.bip is negative (in other words, when there is an above-average number of BIP hit by opponent left-handed batters, given total BIP), there are more ground outs recorded on the right side of the infield (first and second) and more fly outs on the left side of the outfield (left field). In both cases, that’s because hitters tend to pull the ball when they ground out and tend to be behind the ball when they fly out. (Fly balls and line drives to the outfield that are pulled tend to be hit harder and drop in as clean hits.)

The coefficients for RBIP.bip are much bigger (positive or negative) in the infield than in the outfield. That’s because batter-handedness has a much greater effect on the direction of ground outs than fly outs. You can see this by watching how infields and outfields ‘shift’. For the several left-handed batters these days for whom a ‘Williams’-type shift is put on, especially Ryan Howard, you’ll frequently see the third baseman playing between third and second, and the shortstop playing behind second, but the outfielders playing practically straightaway.
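The five Proxy BIP Distribution Variables can be computed from seven team and league counts, as in the sketch below. The function names, the dictionary layout, and all totals are hypothetical; each variable is a count centered against the league rate for its own ‘denominator’.

```python
def net_given(plays, opps, league_plays, league_opps):
    """Plays above (+) or below (-) the league rate, given opportunities."""
    return plays - league_plays * (opps / league_opps)

# Each proxy variable pairs a count with its 'denominator' of opportunities.
SPECS = {
    "RBIP.bip": ("RBIP", "BIP"),
    "RFO.rbip": ("RFO", "RBIP"), "RGO.rbip": ("RGO", "RBIP"),
    "LFO.lbip": ("LFO", "LBIP"), "LGO.lbip": ("LGO", "LBIP"),
}

def proxy_bip_variables(team, league):
    t = dict(team, LBIP=team["BIP"] - team["RBIP"])       # LBIP = BIP - RBIP
    lg = dict(league, LBIP=league["BIP"] - league["RBIP"])
    return {name: net_given(t[x], t[d], lg[x], lg[d])
            for name, (x, d) in SPECS.items()}

# Hypothetical totals (not real data).
league = {"BIP": 60000, "RBIP": 36000, "RFO": 10800, "RGO": 14400,
          "LFO": 7200, "LGO": 9600}
team = {"BIP": 4000, "RBIP": 2200, "RFO": 700, "RGO": 850,
        "LFO": 500, "LGO": 750}

proxies = proxy_bip_variables(team, league)
print({name: round(value, 1) for name, value in proxies.items()})
```

In this made-up example the team faced 200 fewer right-handed BIP than the league rate implies, so RBIP.bip is negative.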


RFO.rbip and LFO.lbip are used to adjust ground out plays in the infield for fly ball and ground ball pitching. By using relative FO to estimate relative opportunities to record infield assists, we avoid using the assists made by the fielder being evaluated to estimate his own relative opportunities to make plays. By ‘splitting’ fly outs by opponent batter-handedness, we capture to a significant extent cases in which (i) a team’s left-handed pitchers (who would face proportionately more right-handed batters) tend to induce RGO or RFO and (ii) a team’s right-handed pitchers (who would face proportionately more left-handed batters) tend to induce LGO and LFO. Right- and left-handed opponent batters also have their own impact on whether BIP are hit on the ground or in the air, which is also reflected in RFO.rbip and LFO.lbip. However, RFO.rbip and LFO.lbip are controlled more by a team’s pitchers, who would tend to have much more extreme ground ball or fly ball tendencies than the league’s batters as a whole (excluding, of course, the team’s own hitters), though this is less true for more recent seasons, which feature less-balanced schedules.

If RFO.rbip is positive, that suggests there will be fewer GO recorded against those right-handed batters, and particularly fewer GO on the left side of the infield. (If RFO.rbip is negative, there will be more GO, particularly on the left side.) If LFO.lbip is positive, that suggests there will be fewer GO, and particularly fewer GO on the right side of the infield. (If LFO.lbip is negative, there will be more GO, particularly on the right side.) The coefficients at second, third, and shortstop in the chart at the beginning of this appendix all reflect that expectation. We’ll address first base further below.

RGO.rbip and LGO.lbip are used to adjust fly out plays in the outfield for fly ball and ground ball pitching by left- and right-handed pitchers, respectively. By using relative GO to estimate relative outfield opportunities, we avoid having the actual putouts recorded by each outfielder being used to estimate how many putouts he ‘should’ have made. If RGO.rbip is positive, that suggests there will be fewer FO recorded against those right-handed batters, and particularly fewer FO on the right side of the outfield (and vice versa). If LGO.lbip is positive, that suggests there will be fewer FO, and particularly fewer fly outs on the left side of the outfield (and again, vice versa).

Notice again that batter-handedness has less of an impact in the outfield than in the infield, as shown by the fact that the coefficients for RGO.rbip and LGO.lbip are nearly equal in the outfield, whereas the coefficients for RFO.rbip and LFO.lbip are significantly different at each infield position. The obvious case, mentioned in the Mantle example, is center field, which is, well, in the center of the field, where the impact of left- and right-handed batters (and pitchers) is approximately equal. But in right field, the coefficients for RGO.rbip and LGO.lbip are also nearly the same. Only in left field is there a meaningful difference between the RGO.rbip and LGO.lbip coefficients,


but even so, the difference is not as great as the differences between the coefficients for RFO.rbip and LFO.lbip at second, third, and short. The bottom line seems to be that opponent batter handedness, and the interaction between opponent batter handedness and pitcher handedness, has a much, much greater impact on the direction of ground balls than fly balls.

Adjusting Net Plays For The Impact Of Base Runners

The Proxy BIP Distribution Variables attempt to account for where batted balls are hit, that is, whether they are hit on the ground or in the air, and on the left or right side of the field. But there are other factors that were not discussed in chapter two that impact the likelihood that fielders at each position will make plays.

One obvious factor for infielders is the presence of runners at first base. This increases double play assist opportunities for middle infielders but also forces the first baseman to play close to the bag, which reduces his chance of fielding ground balls in the hole between first and second. Taking this into account using regression analysis is a little tricky. If you create a variable for estimated runners at first, this would include not only walks but also hits. But hits allowed are partly a function of net plays made at first, second, and short. Any statistical association revealed by regression analysis between, say, shortstop assists and runners on first could reflect either the shortstop’s impact on the number of runners at first (by allowing or preventing hits) or the impact of the runners at first on shortstop assists (by increasing or decreasing assist opportunities).

There are a few candidates for variables that get around this ‘circularity’ problem, at least for middle infielder double play assists, because they are not influenced by infielder fielding: SO.bfp, BB.bfp, HR.bh, WP.ip, and perhaps SH.bip (net sacrifice hits given BIP). The more SO.bfp, the fewer hits and runners at first. The more BB.bfp, the more runners on first. HR clear the base paths, which obviously prevents double plays. WP and SH allow runners on first to reach second, thus preventing a double play. At both shortstop and second base these variables have, at least directionally, the impact one would expect, though the particular coefficients are not very stable from sample to sample, and since WP.ip and SH.bip have relatively little variation from team to team, they are probably not practically significant and could have been dropped from the model. In addition, SH.bip might also belong more with the category of Proxy BIP Distribution Variables, because by definition sacrifice hits are ground balls that can only be fielded in a particular area of the infield (say, approximately anywhere within sixty feet of home plate).


Net intentional walks (“IBB”) given total plate appearances (“PA”), IBB.pa (note that PA equals IBB plus BFP in the post-1951 model), have a negative impact on third base plays, probably because they reduce sacrifice bunts that ‘should’ be added back. In any event, that variable has little practical impact and could have been dropped from the model.

Adjusting Net Plays For The Impact Of ‘Ball Hogging’

A fielder might make more plays not by preventing more BIP from going through for hits, but by taking more ‘easy’ chances that could have been fielded by other fielders and were more or less guaranteed outs anyway. By far the most important example of this is FO fieldable by infielders. Ninety to ninety-five percent of fly balls and pop-ups caught by infielders can usually be taken by at least two, and sometimes three, different fielders (two infielders and an outfielder). Centerfielders who have played very shallow have tended to ‘hog’ some of these chances.

Regressions of PO8.bip onto IFO.bip throughout history consistently show that the more IFO.bip, the fewer PO8.bip, and vice versa. Therefore, if IFO.bip has been reduced by centerfielder ball hogging, a portion of those negative ‘hogged plays’ is added to expected PO8.bip, thus reducing the centerfielder rating, and vice versa. At times there is an impact for corner outfielders as well. I was somewhat surprised that IFO.bip was so important in right field. Perhaps the fact that most pop-ups are hit to the right side of the field (as most batters are right-handed, and most pop-ups are hit to the opposite side of the field, for reasons we’ve already discussed) explains this result. Right fielders may take more discretionary pop flies from first basemen (some of whom are the slowest players in baseball) than left fielders take from third basemen.

A batted ball category similar to infield fly outs is SH. The fielders who field SH are the pitcher, first baseman, third baseman, and, to a very small extent, catcher. There is probably some ‘ball hogging’, depending on the fielding quality of pitchers. A great fielding pitcher, such as Greg Maddux, probably fielded some bunts that might otherwise have been fielded by Chipper Jones or Fred McGriff. In contrast, someone like Randy Johnson probably relied more on others to handle sacrifice bunts. The third baseman formula above reflects this factor by ‘backing out’ a portion of A1.bip when calculating rA5. (So, if the pitcher is ‘taking’ bunt opportunities from the third baseman, estimated ‘hogged’ bunts are added back to the third baseman, and vice versa.) Third basemen and first basemen don’t ‘fight’ over bunt opportunities; rather, bunt opportunities are gifts from the batter. Presumably, hitters playing against Brooks Robinson aimed their


bunts toward Boog Powell, and hitters playing against Keith Hernandez aimed their bunts toward Howard Johnson. Regression analysis indicates that the more rGO3 (which is already adjusted for batter-handedness), the fewer rA5, and vice versa.

Another similarity between SH.bip and IFO.bip is that both are essentially guaranteed outs. All that is at stake with a sacrifice hit attempt is whether the lead runner advances, and the value of that is only about .20 runs. As a practical matter, no fielder should be getting any credit for fielding a sacrifice bunt and getting the runner out at first. Given the total number of SH attempts fielded, the fielder should be given credit for the net number of lead runners taken out relative to the league rate, given those total opportunities, multiplied by .20 runs. I doubt any contemporary third or first baseman would earn more than a couple of runs a season for any such skill. Given total SH attempts fielded, the fielder should be charged for the net number of times he went for the out at second and failed to get either the lead runner or the batter out, multiplied by the ‘free’ out lost and the hit given up, or about 0.75 runs.

Any new DRA model I develop will take more complete advantage of play-by-play data, will exclude SH from BIP altogether, and will subtract SH assists from each fielder’s total. This will also make it unnecessary to ‘back out’ SH.bip from positions that never have the opportunity to field SH, such as middle infielders and outfielders. Therefore, any future DRA model would not have the SH.bip factor in the rPO8 formula (it wasn’t statistically significant in left or right) or in the rA4 or rA6 formulas (except if significant in limiting double play opportunities).

First Base

About ninety-eight to ninety-nine percent of ground outs result in an assist for the fielder who fields the ball, with one exception: first base. First basemen record assists for only about half of the ground balls they convert into outs; the rest of the time they just run to the bag to record the putout unassisted. Traditional statistics don’t differentiate between ground ball putouts and fly ball putouts, but Retrosheet play-by-play data after 1951 does, so it is possible to count the exact number of ground balls a first baseman fields. Unfortunately, I had neither individual nor team totals of unassisted ground outs at first base (“UGO3”) when I first developed the post-1951 model. However, a reasonably good estimate of the team total can be obtained indirectly, as shown in the charts at the beginning of the appendix:

UGO3      Unassisted GO3                 avg(UGO3e1, UGO3e2)
UGO3e1    Unassisted GO3 estimate #1     IPO - A - IFO
UGO3e2    Unassisted GO3 estimate #2     GO - IA - CS - GIDP

In English, the three rows above say that estimated UGO3 is simply the average of two estimates.

The first estimate (UGO3e1) is the estimated number of infield putouts that were not due to catching fly balls: total infield putouts, minus total team assists (including outfield assists, which always result in an infielder putout), minus FO recorded by infielders. I had the ‘exact’ count for the latter variable, because my data provider gave me the Retrosheet count for total fly outs; all you need to do is subtract outfield putouts from that total to arrive at infielder fly outs. This estimate will overestimate UGO3 by the number of unassisted ground ball putouts at infield positions other than first, which are, in total, only about one-third the total at first.

The second estimate is the estimated number of GO that were not in the form of infield assists. I had a total Retrosheet count of GO (at all infield positions); infield assists from fielding ground balls are estimated as total infield assists less CS and double play assists. This estimate underestimates total infield unassisted ground outs by the number of infield assists on relays.

The ‘noise’ in the above estimates is not inconsiderable, but it is probably not biased either. We are not ultimately concerned with getting the exact total of UGO3, but net UGO3, given BIP, or UGO3.bip. Unassisted ground ball putouts at second and third, as well as infielder relay assists, are rare and random events that should merely create random ‘noise’, whereas first base unassisted ground ball putouts are routine and reflect to a large degree the systematic preference of the first baseman to run to the bag or to toss to the pitcher covering the bag.

The sum of first base assists (A3) and UGO3 is estimated GO at first base (“GO3”). Here is the formula for residual, or regression-adjusted, GO3:

rGO3 = GO3.bip + .09 *RBIP.bip .
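The chain from the two UGO3 estimates to rGO3 can be sketched as below. The function name and all inputs are hypothetical, and the two estimates are taken as already computed rather than derived from raw totals.

```python
def residual_go3(a3, ugo3_e1, ugo3_e2, bip, league_go3, league_bip, rbip_bip):
    """Average the two UGO3 estimates, add A3, center by BIP,
    then apply the RBIP.bip adjustment from the rGO3 formula."""
    ugo3 = (ugo3_e1 + ugo3_e2) / 2
    go3 = a3 + ugo3
    go3_bip = go3 - league_go3 * (bip / league_bip)
    return go3_bip + 0.09 * rbip_bip

# Hypothetical team-season (not real data): the estimates of 160 and 140
# average to 150 UGO3, so GO3 = 110 + 150 = 260 against an expected 240.
r_go3 = residual_go3(a3=110, ugo3_e1=160, ugo3_e2=140, bip=4000,
                     league_go3=3600, league_bip=60000, rbip_bip=-200)
print(round(r_go3, 2))
```

Here the team faced an unusually left-handed mix of BIP, so most of the raw surplus of twenty ground outs at first is attributed to opportunity rather than skill.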

Regression analysis would also include + .06 * RFO.rbip and + .13 * LFO.lbip, but we need to sacrifice some accuracy at first base by deleting these variables to ensure that the ‘global’ regression of RA.ip onto our fully-adjusted pitching, fielding, and base-running variables generates correct run weights for infield and outfield plays. Here’s why.

The Proxy BIP Distribution Variables have a couple of important limitations. One is that in order to obtain in the second-step regression run weights in the infield and outfield that make ‘sense’ (are approximately equal or slightly higher in the outfield), it is usually desirable that the sum of


RFO.rbip and LFO.lbip regression weights for adjusting infielder positions be approximately equal to the sum of RGO.rbip and LGO.lbip regression weights, respectively, for adjusting outfielder positions. In other words, we do not want each infielder assist to be ‘discounting’ each outfield putout more than each outfielder putout is ‘discounting’ each infielder assist.

The sum of RFO.rbip coefficients is .71 with an adjustment included at first base (rGO3) (.65 without); the sum of RGO.rbip coefficients is .70. The sum of LFO.lbip coefficients is .70 with an adjustment at first base (.57 without). But the sum of LGO.lbip coefficients is only .56:

rGO3 = GO3.bip [ + .06*RFO.rbip + .13*LFO.lbip ]
rA4 = A4.bip + ( … ) + .15*RFO.rbip + .32*LFO.lbip ( … )
rA6ss = A6.bip + ( … ) + .29*RFO.rbip + .15*LFO.lbip ( … )
rA5 = A5.bip + ( … ) + .21*RFO.rbip + .10*LFO.lbip ( … )

rPO7 = PO7.bip + ( … ) + .21*RGO.rbip + .10*LGO.lbip
rPO8 = PO8.bip + ( … ) + .27*RGO.rbip + .24*LGO.lbip ( … )
rPO9 = PO9.bip + ( … ) + .22*RGO.rbip + .22*LGO.lbip ( … )
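The coefficient sums quoted above can be reproduced from the printed formulas. The following is only a small Python sketch of that arithmetic; the dictionary layout and the function name `coefficient_sums` are illustrative assumptions, not part of the DRA model itself.

```python
# First-stage adjustment coefficients as printed in the formulas above.
# The rGO3 entries are the bracketed first-base adjustments that the final
# model leaves out.
infield_fo = {  # position -> (RFO.rbip coefficient, LFO.lbip coefficient)
    "rGO3": (0.06, 0.13),
    "rA4": (0.15, 0.32),
    "rA6": (0.29, 0.15),
    "rA5": (0.21, 0.10),
}
outfield_go = {  # position -> (RGO.rbip coefficient, LGO.lbip coefficient)
    "rPO7": (0.21, 0.10),
    "rPO8": (0.27, 0.24),
    "rPO9": (0.22, 0.22),
}

def coefficient_sums(table, skip=()):
    """Return (right-handed sum, left-handed sum), rounded to two places."""
    rows = [pair for name, pair in table.items() if name not in skip]
    right = round(sum(r for r, _ in rows), 2)
    left = round(sum(l for _, l in rows), 2)
    return right, left

print(coefficient_sums(infield_fo))                  # with first base
print(coefficient_sums(infield_fo, skip=("rGO3",)))  # without first base
print(coefficient_sums(outfield_go))
```

Dropping the first base row moves the left-handed infield sum from .70 to .57, which is close to the outfield's .56.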

If the first base adjustments were included, the RFO.rbip and RGO.rbip coefficients would be balanced, but the LFO.lbip and LGO.lbip coefficients would not. That imbalance leads to run-weight coefficients for the outfield positions being lower than for the infield positions, because the marginal outfield plays are associated with a reduction in ground out plays greater than the reduction in outfield plays that is associated with marginal infield plays. We’ll address issues related to this further below, when we discuss modeling issues based on statistical theory, apart from baseball.

Examples Of First Stage Regression And Diagnostics

There would be little point to showing every regression analysis and its output, but a couple of illustrative examples should convey the issues involved in variable selection.

If one regresses A4.bip ‘onto’ the Proxy BIP Distribution Variables applicable to infielders (RBIP.bip, RFO.rbip, LFO.lbip, and SH.bip) and variables that may impact the number of runners on first base and thus double play pivot opportunities (IBB.pa, SO.bfp, BB.bfp, HR.bh, and WP.ip), we obtain the following output (I imported my Excel spreadsheet of centered variables into the statistical software package S-PLUS in order to run the regressions):


Call: lm(formula = A4.bip ~ IBB.pa + SO.bfp + BB.bfp + HR.bh + WP.ip + RBIP.jbip + RFO.rbip + LFO.lbip + SH.bip, data = DRAsept07sansNL69, na.action = na.exclude)

Residuals:
    Min      1Q  Median    3Q    Max
 -88.18  -16.88  0.6016  15.7  81.84

Coefficients:
               Value  Std. Error   t value  Pr(>|t|)
(Intercept)   0.0000      0.7196    0.0000    1.0000
IBB.pa        0.0284      0.0515    0.5509    0.5818
SO.bfp       -0.0061      0.0081   -0.7524    0.4519
BB.bfp        0.0285      0.0147    1.9351    0.0532
HR.bh        -0.2003      0.0417   -4.8000    0.0000
WP.ip        -0.2460      0.0590   -4.1672    0.0000
RBIP.bip     -0.0785      0.0042  -18.6234    0.0000
RFO.rbip     -0.1498      0.0156   -9.6086    0.0000
LFO.lbip     -0.3140      0.0229  -13.7335    0.0000
SH.bip       -0.2164      0.0795   -2.7220    0.0066

Residual standard error: 25.22 on 1218 degrees of freedom
Multiple R-Squared: 0.4874

Generally, we will eliminate from consideration variables with a Pr(>|t|) greater than .05. It is quite common for statisticians to restrict model variables to those with “p values” of less than .05. When we eliminate variables with p values greater than .05 from the above regression we obtain the following result:

Call: lm(formula = A4.bip ~ HR.bh + WP.ip + RBIP.bip + RFO.rbip + LFO.lbip + SH.bip, data = DRAsept07sansNL69, na.action = na.exclude)

Residuals:
    Min      1Q  Median     3Q    Max
 -88.52  -16.75  0.1957  15.41  82.76

Coefficients:
               Value  Std. Error   t value  Pr(>|t|)
(Intercept)   0.0000      0.7201    0.0000    1.0000
HR.bh        -0.1804      0.0405   -4.4523    0.0000
WP.ip        -0.2003      0.0537   -3.7325    0.0002
RBIP.bip     -0.0793      0.0042  -18.8644    0.0000
RFO.rbip     -0.1494      0.0155   -9.6195    0.0000
LFO.lbip     -0.3159      0.0228  -13.8672    0.0000
SH.bip       -0.1943      0.0781   -2.4876    0.0130

Residual standard error: 25.23 on 1221 degrees of freedom
Multiple R-Squared: 0.4855
F-statistic: 192 on 6 and 1221 degrees of freedom, the p-value is 0
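The screening step between the two regressions can be illustrated with a few lines of Python. This is a sketch of the .05 cut only: it filters the p-values printed in the first output in a single pass and does not refit the regression, which S-PLUS did to produce the second output.

```python
# Pr(>|t|) values copied from the first A4.bip regression output above.
p_values = {
    "IBB.pa": 0.5818, "SO.bfp": 0.4519, "BB.bfp": 0.0532,
    "HR.bh": 0.0000, "WP.ip": 0.0000, "RBIP.bip": 0.0000,
    "RFO.rbip": 0.0000, "LFO.lbip": 0.0000, "SH.bip": 0.0066,
}
THRESHOLD = 0.05  # the cut-off named in the text

kept = [name for name, p in p_values.items() if p <= THRESHOLD]
dropped = [name for name, p in p_values.items() if p > THRESHOLD]
print("keep:", kept)
print("drop:", dropped)
```

The surviving variables are the six that appear in the second regression call. BB.bfp, at .0532, just misses the cut.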


The above output, rearranged and rounded, says that a good estimate of

Expected A4.bip = –.08*RBIP.bip – .15*RFO.rbip – .32*LFO.lbip – .18*HR.bh – .20*WP.ip – .19*SH.bip.

Since we are looking for net plays, we subtract expected A4.bip from actual A4.bip to obtain the following formula for residual (or regression-adjusted) plays at second:

rA4 = A4.bip + .08*RBIP.bip + .15*RFO.rbip + .32*LFO.lbip + .18*HR.bh + .20*WP.ip + .19*SH.bip.
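As a worked illustration, the rA4 formula can be applied to one team season in Python. The function name `residual_a4` and the input values are hypothetical; only the weights come from the formula above, and all inputs are assumed to be already centered (net of the league rate).

```python
# Rounded coefficients from the rA4 formula above (signs already flipped).
RA4_WEIGHTS = {
    "RBIP.bip": 0.08, "RFO.rbip": 0.15, "LFO.lbip": 0.32,
    "HR.bh": 0.18, "WP.ip": 0.20, "SH.bip": 0.19,
}

def residual_a4(team):
    """Actual net A4.bip minus expected net A4.bip for one team season."""
    adjustment = sum(w * team.get(name, 0.0) for name, w in RA4_WEIGHTS.items())
    return team["A4.bip"] + adjustment

# Hypothetical centered (net of league rate) inputs, for illustration only.
example_team = {
    "A4.bip": 10, "RBIP.bip": 50, "RFO.rbip": -20, "LFO.lbip": 10,
    "HR.bh": 5, "WP.ip": -3, "SH.bip": 2,
}
print(round(residual_a4(example_team), 2))
```

For these made-up inputs the ten raw net assists become 14.88 regression-adjusted plays, mostly because the team faced more right-handed balls in play than average.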

We round to two decimal places not only for the sake of readability, but also because the standard errors in the estimates of the coefficients (see the “Std. Error” column in the regression output) are generally greater than .01 and actually tend to be about .05. Reporting extra decimal places would be a classic case of false precision.

There are some interesting additional details in the final output. Notice that the “data” is “DRAsept07sansNL69.” I developed the model in September 2007 from Retrosheet data then only available from 1957 through 2006. Also, because of some data anomalies at the time in the 1969 National League data set, I excluded that year and league from the sample. Having developed the model from 1957–2006 data, I applied it ‘out of sample’ to 1952–56, 1969 (National League), and 2007–09 when finalizing this book. We’ll discuss the ‘out of sample’ output shortly below.

The “Multiple R-Squared” of .4855 indicates that approximately 49 percent, or about half, of the variance in A4.bip can be ‘explained’ by the model. The remaining residual is what we call rA4 and treat as reflecting the true ‘skill’ of the team’s second baseman. The ‘distribution’ of rA4 is still too wide: the worst team at second base had –89 rA4; the best, +83 rA4. The quartiles are fairly reasonable: –17 rA4 and +15 rA4. The “Residual standard error” is the standard deviation in rA4, which is 25. Though the rA4 do not follow a so-called ‘normal’ distribution exactly, due to an excessive number of extreme outcomes, it is still approximately correct to say that the middle half of teams have between –17 and +15 rA4, and the middle two-thirds have approximately –25 to +25 rA4.

This ‘spread’ is probably too high, based on batted ball data, which indicates that the model is not perfectly capturing all the factors that can ‘give’ or ‘take away’ chances from second basemen. But the second-stage regression will ‘discount’ rA4 (and other such residual estimated skill plays at other positions) to adjust for this.

I have not included the usual diagnostic plots of residuals. There is absolutely no non-linearity in the residuals, at any position. The scatter plots of residuals against fitted values show no change in the spread of residuals. While the residuals in both the first and second stage regressions were unimodal and symmetric, it must be said that the tails were ‘fatter’ than one would like, thus falling short of the ideal in regression modeling of normally distributed residuals.

Recall that the presence of runners at first base ‘should’ reduce GO3.bip, because the first baseman has to play closer to the bag. Regression analysis suggests that the typical impact is either not statistically significant or not practically significant over the course of a season.

Call: lm(formula = GO3.bip ~ IBB.pa + SO.bfp + BB.bfp + HR.bh + WP.ip + RBIP.jbip + RFO.rbip + LFO.lbip + SH.bip, data = DRAsept07sansNL69, na.action = na.exclude)

Residuals:
   Min      1Q  Median     3Q    Max
 -86.4  -15.39  0.0661  14.99  104.8

Coefficients:
               Value  Std. Error   t value  Pr(>|t|)  [1-std impact]
(Intercept)   0.0259      0.6764    0.0383    0.9694
IBB.pa       -0.0131      0.0484   -0.2695    0.7876
SO.bfp       -0.0053      0.0076   -0.7052    0.4808
BB.bfp        0.0261      0.0138    1.8874    0.0593  1.5 runs
HR.bh         0.0183      0.0392    0.4659    0.6413
WP.ip        -0.2065      0.0555   -3.7219    0.0002  3 runs
RBIP.jbip    -0.0900      0.0040  -22.7182    0.0000
RFO.rbip     -0.0701      0.0147   -4.7858    0.0000
LFO.lbip     -0.1347      0.0215   -6.2675    0.0000
SH.bip       -0.1725      0.0747   -2.3081    0.0212  1.5 runs

Residual standard error: 23.7 on 1218 degrees of freedom
Multiple R-Squared: 0.3675
F-statistic: 78.64 on 9 and 1218 degrees of freedom, the p-value is 0

The variables not under the control of fielders that would impact the number of runners at first base are IBB.pa, SO.bfp, BB.bfp, HR.bh, and WP.ip.
The only one with a p-value below .05 was WP.ip, and, given the standard deviation of WP.ip, that impact in runs per season would typically be only plus or minus three runs. For reasons explained shortly above, we excluded RFO.rbip and LFO.lbip from the model for rGO3.
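The earlier statements about the spread of rA4 (quartiles near –17 and +15, and two-thirds of teams within about 25 plays) can be compared with what an exactly normal distribution would imply. The sketch below uses Python's standard `statistics.NormalDist`; treating the residuals as normal is an assumption made only for the comparison, since the text notes that the actual tails are fatter.

```python
from statistics import NormalDist

RESIDUAL_SD = 25.23  # "Residual standard error" from the reduced A4.bip regression

normal = NormalDist(mu=0.0, sigma=RESIDUAL_SD)

# Quartiles a normal distribution with this spread would have.
q1, q3 = normal.inv_cdf(0.25), normal.inv_cdf(0.75)
# Share of teams a normal distribution puts within one standard deviation.
within_one_sd = normal.cdf(RESIDUAL_SD) - normal.cdf(-RESIDUAL_SD)

print(round(q1, 1), round(q3, 1))   # compare with the reported -16.75 and 15.41
print(round(within_one_sd, 3))      # roughly two-thirds
```

The normal quartiles come out near plus or minus 17, so the reported quartiles are close to normal even though the extremes are not.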


Second-Stage Regression And Diagnostics

Set forth below is the regression output from the second stage, ‘global’ regression, in which we regress actual team runs allowed above or below the league rate that year, RA.ip, ‘onto’ all of the estimated net skill plays at all positions, including net pitcher ‘plays’ such as BB.bfp, SO.bfp, HR.bh, IFO.bip, A1.bip, WP.ip, and net ‘residual’ fielder plays such as rA4, rA6, rPO8, etc.

Call: lm(formula = R.ip ~ IBB.pa + SO.bfp + BB.bfp + HR.bh + IFO.bip + A1.bip + WP.ip + CS.sba + GO2.bip + A789.ip + rGO3 + rA4 + rA5 + rA6 + rPO7 + rPO8 + rPO9, data = DRA, na.action = na.exclude)

Residuals:
    Min      1Q    Median     3Q    Max
 -63.44  -14.46  -0.09485  15.02  67.22

Coefficients:
               Value  Std. Error   t value  Pr(>|t|)
(Intercept)  -0.0097      0.6368   -0.0152    0.9879
IBB.pa        0.3074      0.0455    6.7631    0.0000
SO.bfp       -0.2777      0.0075  -37.0611    0.0000
BB.bfp        0.3375      0.0131   25.7802    0.0000
HR.bh         1.4918      0.0376   39.6665    0.0000
IFO.bip      -0.4413      0.0175  -25.2079    0.0000
A1.bip       -0.4238      0.0305  -13.8906    0.0000
WP.ip         0.5646      0.0546   10.3424    0.0000
CS.sba       -0.5898      0.0713   -8.2727    0.0000
GO2.bip      -0.5943      0.0682   -8.7095    0.0000
OA.ip        -0.6081      0.0956   -6.3621    0.0000
rGO3         -0.5171      0.0288  -17.9480    0.0000
rA4          -0.5265      0.0288  -18.2667    0.0000
rA5          -0.4469      0.0271  -16.4746    0.0000
rA6          -0.4439      0.0266  -16.6587    0.0000
rPO7         -0.5349      0.0316  -16.9241    0.0000
rPO8         -0.4583      0.0278  -16.5126    0.0000
rPO9         -0.4954      0.0300  -16.5378    0.0000

Residual standard error: 22.32 on 1210 degrees of freedom
Multiple R-Squared: 0.889
F-statistic: 570.1 on 17 and 1210 degrees of freedom, the p-value is 0

The standard error of a little over 22 runs is similar to the standard errors for the twenty or so well-known formulas for estimating team runs scored, as demonstrated by John Jarvis on his website. Generally this means that the DRA estimate of runs allowed per team is within plus or minus 22 runs about two-thirds of the time. The worst matches, with the greatest errors, are –63 runs and +67 runs. I would imagine that almost all of the many well-known offensive models would have similar outliers in a fifty- or sixty-year sample.

The “Multiple R-Squared” is not as high as I would like. When separate DRA models are developed for the Modern Era (1969–1992) and Contemporary Era (1993–present), such models tend to have multiple r-squareds of approximately ninety-five percent, which is approximately the same as is found in the better models of team offense, as reported by John Jarvis at his website (three were as high as ninety-six percent). Part of the art of developing regression models is balancing accuracy and simplicity. In this case I felt it would dramatically simplify this book to have one model for all seasons since the early 1950s.

I have not included the usual diagnostic plots of residuals. There is absolutely no non-linearity in the residuals. We’ve dealt with multi-collinearity among the predictor variables via centering and first-stage regressions, so the variables all have correlations with each other between –.1 and +.1, down from –.6 and +.6 for the simple seasonal totals. The scatter plot of residuals against fitted values shows no change in the spread of residuals. The Durbin–Watson statistic did not indicate any meaningful correlation in team residuals over time. While the residuals in both the first and second stage regressions were unimodal and symmetric, it must be said again that the tails were ‘fatter’ than one would like, though closer to a normal distribution than in the case of the first-stage regressions. However, due to the large sample sizes no residual in the first-stage regression was remotely large enough to impact the coefficient estimates in the second-stage regression.
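To make the second-stage output concrete, here is a minimal Python sketch that applies the printed coefficients to one team's centered inputs. The function name `predicted_runs_allowed` and the example team are hypothetical. The rounded weights in the model summary at the start of this appendix appear to be these coefficients with the sign reversed, so that runs saved are positive.

```python
# Coefficients copied from the second-stage output above (runs allowed per
# unit of each centered variable). OA.ip is labeled A789.ip in the formula.
SECOND_STAGE = {
    "(Intercept)": -0.0097, "IBB.pa": 0.3074, "SO.bfp": -0.2777,
    "BB.bfp": 0.3375, "HR.bh": 1.4918, "IFO.bip": -0.4413,
    "A1.bip": -0.4238, "WP.ip": 0.5646, "CS.sba": -0.5898,
    "GO2.bip": -0.5943, "OA.ip": -0.6081, "rGO3": -0.5171,
    "rA4": -0.5265, "rA5": -0.4469, "rA6": -0.4439,
    "rPO7": -0.5349, "rPO8": -0.4583, "rPO9": -0.4954,
}

def predicted_runs_allowed(team):
    """Predicted RA.ip (runs allowed versus the league rate) for one team."""
    total = SECOND_STAGE["(Intercept)"]
    for name, value in team.items():
        total += SECOND_STAGE[name] * value
    return total

# Hypothetical team: 100 strikeouts above the league rate and +20 rA6.
example = {"SO.bfp": 100, "rA6": 20}
print(round(predicted_runs_allowed(example), 1))
```

For this made-up team the model predicts about 37 fewer runs allowed than the league rate, roughly 28 from the strikeouts and 9 from shortstop play.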
One of the typical diagnostic tests for a regression model is to apply it ‘out of sample’ to see how well it works. When I was finishing this book and had to apply the model to the 1969 National League and 2007–09 seasons for both leagues, the standard error was 23 runs and the r-squared was .90, virtually identical to the in-sample values. Unfortunately, the 1952–56 standard error was 36 runs, with a .89 r-squared. However, that is easily explained. First, the play-by-play data for the early-to-mid 1950s is not nearly as complete as it is for the late 1950s; some teams are missing up to 40 games of data per season. This results in significant data errors in the Proxy BIP Distribution Variables, CS.sba, GO3.bip, and IFO.bip. Second, as we will see in our discussion of the pre-1952 model(s), there was a dramatic change during the 1950s in the impact of pitchers on batted ball outcomes.

The run weights for the so-called Three True Outcomes (BB.bfp, SO.bfp, and HR.bh) are remarkably consistent with those determined under a variety of rigorous offensive models, though the weight for HR.bh is about one-tenth of a run too high.

The run weight for CS.sba is almost precisely right, for it equals the sum of the typical increase in run expectation if a base is stolen (approximately .15 to .20 runs) and the typical decrease in run expectation if a runner on base is taken out (approximately .45 to .40 runs). Similarly, the run weight for A789.ip is almost precisely right, for it equals the sum of the typical increase in run expectation if a base runner gains the extra base (approximately .15 to .20 runs) and the typical decrease in run expectation if a runner on base is taken out by the outfielder (approximately .45 to .40 runs).

The run weight for WP.ip should be .27 runs, not .56, the excess being due to the fact that WP.ip ‘carries’ the higher run expectation of the ‘state’ of there already being one or more runners on base. In other words, positive WP.ip is strongly correlated with runs allowed not only because a WP increases runs allowed, but also because having runners on base already is obviously even more correlated with allowing runs; the WP.ip variable cannot separate out these two effects. We’ll get to an imperfect fix in one of our alternative DRA models. IBB.pa is also too high, for the same reason; the average intentional walk increases expected runs by only .16, rather than .33.

Jim Albert and Jay Bennett’s Curve Ball: Baseball, Statistics, and the Role of Chance in the Game (see pages 187 through 189 of the current paperback edition) has an excellent discussion about how regression variables can ‘carry’ information of omitted variables (here, the existence of base runners) in both a good and a bad way. WP.ip and IBB.pa are examples where the omitted variables (the fact that there are runners on base already, which correlates with allowing runs) have a ‘bad’ effect on the estimates. We will shortly see examples of variables ‘carrying’ useful information.
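The run-weight comparisons in the preceding paragraphs are simple arithmetic, sketched here in Python using only figures quoted in the text. The variable names are illustrative, and pairing the low steal value with the high out value is an assumption about how the two ranges are meant to be read.

```python
# Figures quoted in the paragraphs above.
cs_weight = 0.59                      # regression weight for CS.sba
steal_gain = (0.15, 0.20)             # run expectation gained on a stolen base
runner_out_loss = (0.45, 0.40)        # run expectation lost when the runner is out

# Pair the low steal value with the high out value and vice versa.
implied = [round(g + l, 2) for g, l in zip(steal_gain, runner_out_loss)]
print(implied, "versus regression weight", cs_weight)

# Excess weight 'carried' by variables that signal runners already on base.
wp_excess = round(0.56 - 0.27, 2)     # WP.ip: regression weight minus expected
ibb_excess = round(0.33 - 0.16, 2)    # IBB.pa: regression weight minus expected
print(wp_excess, ibb_excess)
```

Either pairing gives about .60 runs, within a hundredth of the .59 regression weight, while WP.ip and IBB.pa carry roughly .29 and .17 runs of excess.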

THEORETICAL QUESTIONS REGARDING THE PROXY BIP DISTRIBUTION VARIABLES

We now come to perhaps the most interesting issue in the DRA model from the standpoint of general statistical modeling: the role of the Proxy BIP Distribution Variables and the run weights for the residual fielding plays (rA4, rA6, rPO7, etc.).

The Proxy BIP Distribution Variables are good proxy variables under standard multivariable regression theory, for two reasons. First, they are strongly correlated with the ‘true’ distribution of ground balls and fly balls hit by right- and left-handed batters. As explained in chapter two, the –.80 correlations between RFO.rbip and RGO.rbip, and between LFO.lbip and LGO.lbip, suggest they explain about two-thirds the variance in true ground balls versus fly balls generated by right- and left-handed batters respectively. That’s because the correlations ‘should’ be zero, since the quality of team outfields and infields should be uncorrelated over large samples. The fact that the correlations are nevertheless approximately –.80 suggests that the square of that number (64%) is the amount of variation between FO given BIP and GO given BIP that must be controlled by the pitchers. Second, the chosen proxies are not correlated (or are very weakly correlated) with the theoretical error term in a perfectly specified model (in other words, the true skill plays of the position being evaluated), and are uncorrelated with any other predictors used to predict skill plays at such position, such as the base-runner variables and the ball-hogging variables.

When we get to the second, ‘global’ regression, the residuals from the first set of regressions (rGO3, rA4, rA5, rA6, rPO7, rPO8, and rPO9) can be viewed as explanatory variables in predicting RA.ip that are either proxy variables for true skill plays or estimates of true skill plays that are subject to “measurement error.” If we view them as proxy variables, they have the problem that they are correlated somewhat with the error term in modeling RA.ip. For example, rA6 is too high (overestimates true skill net A6 (“tsA6”)) if the team’s outfielders are above average in true skill, which would be associated with more runs prevented. Seen instead as simply measurement error in explanatory variables, this results in “classical errors-in-variables,” which can be shown to result in “attenuation bias,” [2] which causes the coefficients to be too small. This is exactly what happens in the DRA model, where the true run value of a true skill play (about .75 to .85 runs, depending on the position) is attenuated to something closer to .50 runs.
Though that results in a mis-estimation of the true run value of a true net skill play, it is ultimately helpful in the DRA model because we are interested more in estimating the total defensive runs per position per team. Attenuation is an appropriate ‘haircut’ for an estimate of skill plays with too much noise in it.

When I first published an article in 2003 about the basic approach of DRA, one of the readers suggested that it was an instrumental variables regression model, also known as a “two-stage least-squares” model. I do not believe that is the case. In the first-stage regressions for each position, the Proxy BIP Distribution Variables are serving simply as good (because they are ‘independent’ of the position being evaluated) if imperfect predictors in an ordinary least-squares estimate of net plays at each position, given total BIP, for example, A6.bip, A4.bip, PO8.bip, A5.bip, etc.
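The attenuation described above, where a true run value near .80 shrinks to about .50, can be illustrated with a small simulation. This is a sketch under the textbook classical errors-in-variables setup with a single predictor, in which the estimated slope tends toward the true weight times the share of measured variance that is real skill. Every number in it (the standard deviations, sample size, and seed) is an assumption chosen for illustration, not an estimate from DRA data.

```python
import random

def simulate_attenuation(true_weight=0.8, skill_sd=20.0, noise_sd=15.5,
                         run_noise_sd=22.0, n_teams=20000, seed=1):
    """Regress simulated runs prevented on a noisy skill measure; return the OLS slope."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_teams):
        skill = rng.gauss(0.0, skill_sd)              # true net skill plays
        measured = skill + rng.gauss(0.0, noise_sd)   # residual plays = skill + noise
        runs = true_weight * skill + rng.gauss(0.0, run_noise_sd)
        xs.append(measured)
        ys.append(runs)
    mean_x, mean_y = sum(xs) / n_teams, sum(ys) / n_teams
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Share of the measured variance that is true skill (the reliability ratio).
reliability = 20.0 ** 2 / (20.0 ** 2 + 15.5 ** 2)
print(round(0.8 * reliability, 2))        # slope predicted by the attenuation formula
print(round(simulate_attenuation(), 2))   # slope recovered from simulated data
```

With these assumed spreads about 62 percent of the measured variance is true skill, so a .80 weight is estimated at roughly .50.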

2. See Jeffrey M. Wooldridge, Introductory Econometrics: A Modern Approach, 318–22 (South-Western, 2009).


Perhaps the reader was thinking of the first-stage, ‘per position’ regressions as the first stage in a formal two-stage (that is, instrumental variables) model. Seen in that light, the Proxy BIP Distribution Variables are attempting to function in some sense like “instrumental variables,” but without satisfying all the requirements that an instrumental variable should, most importantly “exogeneity,” or independence from the error term in the second-stage regression.

For example, RFO.rbip is in some sense acting as an instrumental variable to ‘purge’ estimates of net skill plays at each infield position of the effect of fly ball versus ground ball pitching to right-handed batters. And about two-thirds of RFO.rbip probably does reflect the tendency of opponent right-handed batters to hit the ball on the ground or in the air, which has a very minor impact on ultimate runs allowed. (The expected run value of a ground ball is close to that of a ball hit in the air; more ground balls go through for hits, but more balls hit in the air go for extra bases.) However, RFO.rbip also reflects to some extent the skill of the outfielders in preventing hits, which does have an impact on runs allowed and would impact the error term in the second-stage regression.

Another way in which the two-stage DRA model is inconsistent with the two-stage instrumental variables regressions that I have seen is that the number of instrumental variables is less than the number of predictor (pitching and fielding) variables, and that a different set of instrumental variables is used for each predictor. Finally, in the examples of two-stage instrumental variables regression that I have seen, the fitted variables from the first stage are included in the second-stage regression; here, the residuals from the first-stage regression are included in the second-stage regression.
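The last distinction, fitted values versus residuals, can be shown with a one-predictor least-squares fit in plain Python. The data are made up for illustration and the helper `simple_ols` is hypothetical; the point is only which first-stage quantity would be carried forward under each approach.

```python
def simple_ols(xs, ys):
    """Slope and intercept of a one-predictor least-squares fit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Hypothetical centered data: a proxy variable and net plays at one position.
proxy = [-2, -1, 0, 1, 2]
plays = [3, 1, 0, -2, -2]

slope, intercept = simple_ols(proxy, plays)
fitted = [intercept + slope * x for x in proxy]
residuals = [y - f for y, f in zip(plays, fitted)]

# Two-stage least squares would carry `fitted` into the second stage;
# the DRA approach described above carries `residuals` instead.
print([round(f, 1) for f in fitted])
print([round(r, 1) for r in residuals])
```

By construction the residuals are uncorrelated with the proxy, which is why they can stand in for plays made after removing the proxy's influence.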
Though the Proxy BIP Distribution Variables used in DRA are not ideal, they make the model much better than it would be without them. Furthermore, the ultimate validation of DRA is less whether it passes all the standard diagnostic tests for a regression model than whether it generates (i) fielder defensive runs estimates that match well with batted ball data systems, and (ii) team defensive runs estimates that match actual team runs allowed. On the basis of many tests I’ve conducted over the years, DRA defensive runs estimates for individual fielders match almost or about as well with estimates derived from batted ball data as the latter do with each other. And DRA defensive runs estimates for teams match nearly or about as well with actual team runs allowed as the best offensive runs models based on team seasonal totals of various offensive events match actual team runs scored.

Most importantly, the Proxy BIP Distribution Variables can be replaced with better Proxy BIP Distribution Variables in future versions of DRA that can be developed by exploiting the Retrosheet play-by-play database (currently available after 1951) to its maximum extent. When I say ‘better’, I mean Proxy BIP Distribution Variables closer to being independent of fielder quality (to derive better estimates of expected plays at each position) and of the ultimate team outcome being modeled (expected team runs allowed).

For example, Sean Smith has used play-by-play data to calculate each pitcher’s career ratio of ground outs to total outs, that is, the pitcher’s ground out ratio (“pgor”). That number, which would generally have a low correlation with the fielding performance of the pitcher’s teammates in the current year, could be aggregated with those of his pitching teammates to derive some sort of estimate of the likely distribution of GO versus FO in the current year, if the current team were essentially average in fielding quality at all positions, largely independent of the quality of the current year’s fielders. Perhaps the data could be further split by batter-handedness. And so forth. For seasons since 1989, we have exact counts of total ground balls and batted balls that are not ground balls; if split by opponent batter-handedness, they would serve as excellent variables for isolating the true ‘skill’ plays at each position.
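One way the pgor idea might be aggregated to the team level is sketched below in Python. The text does not say how pitchers should be combined, so weighting each pitcher's career ratio by his current-season balls in play is purely an assumption, as are the function name and the sample staff.

```python
def team_expected_go_share(pitchers):
    """Weight each pitcher's career ground-out ratio by current-season BIP."""
    total_bip = sum(p["bip"] for p in pitchers)
    if total_bip == 0:
        raise ValueError("no balls in play to weight by")
    return sum(p["pgor"] * p["bip"] for p in pitchers) / total_bip

# Hypothetical staff: career pgor and current-season balls in play allowed.
staff = [
    {"name": "A", "pgor": 0.50, "bip": 600},
    {"name": "B", "pgor": 0.40, "bip": 300},
    {"name": "C", "pgor": 0.30, "bip": 100},
]
print(round(team_expected_go_share(staff), 3))
```

Because the career ratios are fixed before the season, such a team figure would depend on who pitched rather than on how well the current fielders played.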

ALTERNATIVE POST-1951 DRA MODELS

It cannot be said often enough that in developing regression models for any given period of baseball history, and deciding which periods of baseball history to include in the same regression model, all sorts of trade-offs must be considered.

For any given period of baseball history, one must make judgments about which variables to include. Even if a variable has so-called statistical significance, one has to consider whether it makes sense. Even if a variable has statistical significance and makes sense, if it will only have an impact of a couple of runs in most cases, and rarely more than five runs of impact, it is worth considering whether to leave it out, because the overall noise in any single-season estimate is at least that large.

Then one must consider how much to ‘chop up’ the models across baseball history to capture changes in the game. The problem is that such changes rarely happen all at once, so drawing a line is difficult, and the smaller the sample of seasons included in a model, the less stable the estimates of expected run weights per event.

Especially because this book presents the first complete disclosure of a comprehensive regression model for team defense, I have tried to err on the side of relative simplicity. But, to give you some sense of the effect of including different variables and splitting models into separate historical periods, I generated two alternative models covering almost all of the post-1951 period after obtaining some additional data from my data provider: team unassisted ground ball putouts at each position (most important by far at first base), and home runs allowed, split by opponent batter-handedness. Based on tests against batted ball data, the updated models did not appear to be better than the one described above, but I have consulted their output as a ‘sanity’ check for certain historically important fielders.

Shown on the next two pages are separate alternative DRA models for 1977–2008 and 1954–1976. These are formatted consistently with the main model at the beginning of the chapter, to make it easier to compare all three. Defined terms are the same as before, except that A4, A5, and A6 include unassisted ground ball putouts, and the following new terms are added: ROB means estimated opponent Runners On Base (the sum of walks, batters hit by pitch, and hits minus home runs), ROF means estimated opponent Runners On First base (same as ROB, less doubles and triples), RHR and LHR mean opponent home runs hit by right- and left-handed batters, respectively, and RBH and LBH mean balls hit (at-bats minus strikeouts) by right- and left-handed opponent batters.

In general, the two models are remarkably similar to the primary post-1951 model. The coefficients for the most important variables are remarkably stable, though we start to see some slightly higher ones for fielding plays in the 1954–1976 model, even though run scoring was on average lower than during the 1977–2008 period. We’ll see more of that in the pre-1952 model.

Splitting HR.bh into RHR.rbh and LHR.lbh does little to improve things. However, ‘denominating’ WP by ROB, that is, as WP.rob, rather than IP, that is, as WP.ip, does bring the run weight down from .56 runs to about .40 runs, though this is still higher than the .26 runs it ‘ought’ to be. However, denominating outfield assists by ROB resulted in run weights that are too low in the 1954–1976 model.
Perhaps the most important findings from running the alternative models with exact counts of team unassisted ground ball putouts at first base were that (i) even provisional first-stage regressions to estimate rGO3 that took every possible base runner factor into account showed little impact over the course of a season, and (ii) the standard deviation in residuals for the 1977–2008 model was 21 net plays.

The first discovery does not mean that the presence of base runners at first base does not impact first base fielding. It does, if you look at the play-by-play or batted ball data. But over the course of a season, the difference in the number of base runners at first base per team is small enough not to have a statistically significant impact on ground outs at first, at least over the course of the past fifty-five years. Perhaps the impact has been greater in very recent seasons for which we have batted ball data.


Alternative 1977-2008 DRA Model

Terms are the same as in the 1952-2009 model, except as explained in the immediately preceding text.

Pitching = .30*SO.bfp – .40*BB.bfp – 1.50*HR.bh + .45*A1.bip + .41*IFO.bip – .42*WP.rob.
Catching = .52*CS.sba + .49*GO2.bip.
Infield = .54*rGO3 + .55*rA4 + .41*rA5 + .43*rA6.
Outfield = .57*rPO7 + .45*rPO8 + .48*rPO9 + .68*A7.rob + .68*A8.rob + .68*A9.rob.

rGO3 = GO3.bip + .07*RBIP.bip + .07*RFO.rbip + .14*LFO.lbip.
rA4 = A4.bip + .08*RBIP.bip + .16*RFO.rbip + .33*LFO.lbip + .21*RHR.rbh + .12*LHR.lbh + .20*SH.bip.
rA6 = A6.bip – .05*RBIP.bip + .33*RFO.rbip + .15*LFO.lbip + .43*SH.bip.
rA5 = A5.bip – .10*RBIP.bip + .24*RFO.rbip – .15*RHR.rbh + .13*A1.bip.
rPO7 = PO7.bip + .04*RBIP.bip + .20*RGO.rbip + .16*LGO.lbip.
rPO8 = PO8.bip + .28*RGO.rbip + .25*LGO.lbip + .07*IFO.bip + .19*SH.bip.
rPO9 = PO9.bip – .04*RBIP.bip + .25*RGO.rbip + .20*LGO.lbip + .13*IFO.bip.


Alternative 1954-1976 DRA Model

Terms are the same as in the 1952-2009 model, except as explained in the immediately preceding text.

Pitching = .31*SO.bfp – .43*BB.bfp – 1.64*HR.bh + .52*A1.bip + .40*IFO.bip – .40*WP.rob.
Catching = 1.00*CS.sba + .49*GO2.bip.
Infield = .52*rGO3 + .57*rA4 + .53*rA5 + .53*rA6.
Outfield = .49*rPO7 + .53*rPO8 + .62*rPO9 + .41*A7.rob + .41*A8.rob + .41*A9.rob.

rGO3 = GO3.bip + .08*RBIP.bip + .03*RFO.rbip + .13*LFO.lbip + .20*RHR.rbh + .15*LHR.lbh + .20*SH.bip.
rA4 = A4.bip + .08*RBIP.bip + .13*RFO.rbip + .30*LFO.lbip + .30*RHR.rbh + .36*LHR.lbh + .31*SH.bip – .04*BB.bfp.
rA6 = A6.bip – .08*RBIP.bip + .23*RFO.rbip + .11*LFO.lbip + .71*LHR.lbh + .53*SH.bip + .04*BB.bfp + .15*SBA.rof.
rA5 = A5.bip – .10*RBIP.bip + .16*RFO.rbip + .20*LGO.lbip – .24*RHR.rbh + .13*A1.bip.
rPO7 = PO7.bip + .03*RBIP.bip + .19*RGO.rbip.
rPO8 = PO8.bip – .02*RBIP.bip + .24*RGO.rbip + .22*LGO.lbip + .32*LHR.lbh + .05*IFO.bip + .05*BB.bfp.
rPO9 = PO9.bip – .02*RBIP.bip + .15*RGO.rbip + .25*LGO.lbip + .07*IFO.bip + .30*SH.bip.


Here are the standard deviations in regression-adjusted plays at the ‘fielding’ positions where ball-in-play fielding is most important (that is, excluding pitcher and catcher) under the 1977–2008 model:

Standard Deviations in Residual (Regression-Adjusted) Net Plays per Team

rGO3   rA4   rA5   rA6   rPO7   rPO8   rPO9
  21    26    25    28     22     25     23

The standard deviation for net plays at first, 21, is only 16 percent less than the standard deviation in net plays at the other positions (about 25). This is part of the reason I argue in the first base chapter that the DRA estimates of career value for the top first basemen of all time, which are close to the levels found at other positions, might be correct, even though estimates under Sean Smith’s three systems, which are based on exact counts of GO3, are much more severely compressed compared to other positions.
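The 16 percent figure can be checked directly against the table. The Python sketch below is only that arithmetic; the variable names are illustrative.

```python
# Standard deviations from the table above.
sd = {"rGO3": 21, "rA4": 26, "rA5": 25, "rA6": 28,
      "rPO7": 22, "rPO8": 25, "rPO9": 23}

others = [v for name, v in sd.items() if name != "rGO3"]
mean_other = sum(others) / len(others)

# The text rounds the other positions to "about 25" before comparing.
shortfall_vs_25 = 1 - sd["rGO3"] / 25

print(round(mean_other, 1))            # average of the six other positions
print(round(shortfall_vs_25 * 100))    # percent by which first base falls short
```

The six other positions average 24.8, which the text rounds to 25, and 21 is 16 percent below 25.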

THE DRA MODEL FROM 1893 TO 1951

Offense Mirrors Defense

One of the important ways of evaluating DRA is by comparing how well it ‘predicts’ actual team runs allowed to how well offensive models ‘predict’ actual team runs scored. The DRA model for 1952 through 2009 has standard errors right in line with offensive models, though with a slightly weaker correlation. Splitting the model into two historical periods brings the correlations up to those for offensive models in the same period.

If you try to develop a linear regression model of offense for 1893 through 1951, using all the offensive team variables that are available for all seasons in the sample, you will be initially surprised by its output (see next page). Although the r-squared, with rounding, is 1.00, indicating a virtually perfect correlation, the standard error is 39 runs, close to twice what it is for models of the entire second half of major league history. Note as well that the ‘run-weights’ for all the offensive events are ‘too high’, even though the average level of run-scoring throughout the entire 1893–1951 period was approximately the same as from 1952 through 2009.


Regression Model of Offense 1893–1951

SUMMARY OUTPUT

Regression Statistics
Multiple R              1.00
R Square                1.00
Adjusted R Square       1.00
Standard Error         39.11
Observations          908.00

ANOVA
               df        SS              MS             F
Regression     7.00      445772922.69    63681846.10    41636.46
Residual       901.00    1378055.31      1529.47
Total          908.00    447150978.00

                 Coefficients   Standard Error   t Stat    P-value
Intercept           .00            #N/A            #N/A     #N/A
BO (AB – H)        -.15            .00            -34.92    .00
BB (inc. HBP)       .33            .02             20.89    .00
1B                  .63            .02             35.19    .00
2B                  .78            .04             19.71    .00
3B                 1.22            .08             14.73    .00
HR                 1.65            .05             32.84    .00
SB                  .57            .03             21.77    .00
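To make the output concrete, here is a minimal Python sketch of how a zero-intercept model of this kind turns a team's event counts into ‘predicted’ runs. The weights are the coefficients reported above; the team line is invented for illustration and does not describe any real club, and the function name is hypothetical.

```python
# Run weights from the 1893-1951 offensive regression output above (zero intercept).
WEIGHTS = {"BO": -0.15, "BB": 0.33, "1B": 0.63, "2B": 0.78, "3B": 1.22, "HR": 1.65, "SB": 0.57}

def predicted_runs(team: dict) -> float:
    """Sum of event counts times their regression run weights."""
    return sum(WEIGHTS[event] * team[event] for event in WEIGHTS)

# ASSUMPTION: hypothetical team season (invented counts, for illustration only).
# BO is batting outs (AB - H); BB includes HBP, as in the regression.
team = {"BO": 3800, "BB": 500, "1B": 1000, "2B": 220, "3B": 70, "HR": 60, "SB": 100}

runs = predicted_runs(team)
print(round(runs))
```

With a standard error of about 39 runs, an estimate like this should be read as a rough central value, not a precise forecast.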

The standard error is high in part because there was a tremendous variation in run scoring levels throughout the period: the highest ever in the 1890s, then the lowest ever from 1901 through 1919, then very high from 1920 through 1939, and somewhat lower in the 1940s. But even if you try to ‘chop up’ the years to get a tighter fit, the standard errors remain around 40 for the 1890s and around 35 from 1903 through 1919, though they do fall to more manageable levels of between 25 and 30 runs for the remainder of the period. Furthermore, the coefficients for each event when you use smaller samples are unstable and seemingly too high.

I surmise the reason for the excessive run weights is that there is an important omitted variable that is unavailable for most of those seasons: runners reaching base safely on errors (“ROE”). Errors were far more frequent in the Dead Ball Era and even during the first part of the Live Ball Era. Often, when important variables are omitted, the remaining variables ‘attract’ higher coefficients.

So the lesson learned from the above exercise is to expect (i) a higher standard error in ‘predicted’ runs allowed by teams, and (ii) higher run-weights for defensive events that are the ‘mirror image’ of offensive events, such as regression-adjusted hits prevented (as defined in chapter two, “rHP”).
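The omitted-variable effect described above can be illustrated with a toy example. The sketch below is not the book's data or model: the ‘true’ run values and the team counts are invented, and the single-predictor, zero-intercept fit is just the simplest setting in which the effect shows up.

```python
# ASSUMPTION: illustrative 'true' run values, chosen only to show the mechanism.
TRUE_HIT_WEIGHT = 0.50
TRUE_ROE_WEIGHT = 0.50

# Hypothetical team seasons: hits and runners reaching on errors (invented counts).
hits = [1300, 1350, 1400, 1450, 1500, 1550]
roe = [150, 120, 160, 130, 170, 140]
runs = [TRUE_HIT_WEIGHT * h + TRUE_ROE_WEIGHT * e for h, e in zip(hits, roe)]

def slope_through_origin(x, y):
    """Zero-intercept least squares with a single predictor: sum(xy) / sum(xx)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# ROE is left out of the model, so the hit weight absorbs its contribution.
fitted_hit_weight = slope_through_origin(hits, runs)
print(round(fitted_hit_weight, 3))
```

Because every team has a positive number of runners reaching on errors, the fitted weight for hits comes out above the ‘true’ value of .50: the hits variable picks up runs that belong to the missing variable.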


We closed chapter two showing how the basic offensive runs model was ‘mirrored’ by the DRA team equation. Leaving aside defense, for the post-1951 seasons, with “BO” standing for Batting Outs (AB – H) and H standing for non-home-run hits, we obtained the following nice match:

Offensive Runs = – .27*BO + .33*BB + .55*H + 1.40*HR.
Defensive Runs = + .28*SO.bfp – .34*BB.bfp + .47*rHP – 1.49*HR.bh.

If we ‘translate’ the 1893–1951 regression model into offensive runs by adding the average run value per out (league runs divided by league outs) during the 1893–1951 period, in the manner of Michael Schell in his book, Baseball’s All-Time Best Sluggers, and then use the relative frequency of singles, doubles, and triples in the major leagues from 1893 through 1951 to obtain the weighted average run value of H, we obtain the following for 1893–1951:

Offensive Runs = – .31*BO + .33*BB + .69*H + 1.65*HR.
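The weighted average run value of H can be sketched in a few lines of Python. The run weights are the regression coefficients for singles, doubles, and triples; the shares of each hit type are an assumption made up for this illustration, since the actual 1893–1951 league frequencies are not given here.

```python
# Run weights for singles, doubles and triples from the 1893-1951 regression output.
RUN_WEIGHT = {"1B": 0.63, "2B": 0.78, "3B": 1.22}

# ASSUMPTION: illustrative shares of non-home-run hits, not the actual league figures.
SHARE = {"1B": 0.78, "2B": 0.16, "3B": 0.06}

# Weighted average run value of a non-home-run hit.
h_weight = sum(RUN_WEIGHT[k] * SHARE[k] for k in RUN_WEIGHT)
print(round(h_weight, 2))
```

With these assumed shares the result rounds to .69, in line with the H weight in the equation above; different shares would move the figure slightly.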

The corresponding DRA formula for 1893–1951 is:

Defensive Runs = + .40*SO.bfp – .46*BB.bfp + .61*rHP – 1.56*HR.bh.

The coefficients for hits prevented and home runs look reasonably good, but the coefficients for strikeouts and walks look alarmingly high. They are actually too low.

The Strikeout and DIPS Revolutions

The Transitional Era (1947 through 1968) was transitional in so many ways: the most important change was integration, but in addition most of the parks turned over, which helped pitchers, and the running game re-emerged, which challenged catchers. From the standpoint of baseball researchers, it’s the transition from ‘plain’ to play-by-play statistics, and even within the play-by-play statistics era, beginning in 1952, there is a transition from seasons in which a lot of data is missing (in the early to mid-1950s some teams have up to 40 games of data missing), to some in which almost all the data is available (generally the 1960s). But perhaps the biggest statistical change, and one that I have not heard commented on before, was the sharp increase in strikeouts in both leagues. In 1947, strikeouts per nine innings were 3.6 in the National and 3.7 in the American. By 1968, strikeouts had risen by over two per game, to 5.8 and 5.9, respectively. And this change did not occur during the so-called ‘Second Dead Ball Era’ of 1964–68, but almost entirely during the 1950s.


Essentially two outs per game were shifted from fielders to pitchers during the 1950s. Walks had risen dramatically by the late 1940s, as a bunch of light-hitting Eddies (Joost, Stanky, and Yost) simply gave up on batting and waited out the pitcher. Here is what I think happened. The leagues realized this was slowing down the game, and got the umpires to call a bigger strike zone. More called strikes resulted in a decline in walks (about one fewer walk per game in the American League; about half a walk fewer in the National) and an increase in strikeouts. Nor do I believe strikeouts rose because batters were swinging for the fences more than they had been. Home runs in the American League did rise somewhat more, from about .70 per game circa 1950 to more like .90 in the early 1960s, but home runs in the National League rose only slightly, from .71 per game to .75 in 1963, before plummeting in the Second Dead Ball Era, 1964–68.

This strikeout (and, to a lesser degree, walk) revolution had an additional effect: it made Defense Independent Pitching Statistics (“DIPS”) a more reliable model when Voros McCracken discovered it about ten years ago. DIPS models pitching only on the basis of walks, strikeouts, and home runs, the items that are independent of fielders. McCracken discovered, using pitching data from recent seasons, that pitchers have little impact on out conversion rates for balls hit in play. The post-1951 DRA model modifies DIPS somewhat: it assumes pitchers don’t control batted ball out conversion rates, except to the extent they induce ground balls they themselves field or high flies or pop-ups (infield fly outs) that essentially anyone could field.

DIPS is usually a good working assumption. But it stops working as well once you travel back in time through and past the Transitional Era. The evidence? Starting in the middle of the 1950s and moving backward in time, the correlations between SO.bfp and BB.bfp and the rate of outs on batted balls in play begin to rise sharply.
In other words, in prior periods of baseball history, when overall levels of strikeouts were sharply lower than now, differences in strikeout rates had a much stronger correlation with preventing hits when batters were not striking out. Similarly, when overall levels of walks were sharply higher than they are now, differences in walk rates had a much stronger correlation with hit prevention. When the umpires of the 1950s ‘gave’ pitchers more strikeouts, it appears pitchers ‘lost’ the ability to impact BIP outcomes.

When you regress team outs on batted balls in play onto SO.bfp for the entire 1893–1951 sample, the regression analysis indicates that each SO.bfp is statistically associated, with a p-value of 0.00, with .36 fewer hits allowed per batted ball in play over the course of a season. In other words, each marginal strikeout, which obviously is controlled by the pitcher, ‘predicts’ that there will be .36 fewer hits allowed on BIP, which is powerful evidence that pitchers before the 1950s ‘strikeout revolution’ who were either above or below average in striking batters out had a strong impact on batted balls
in play. Pitchers with more strikeouts got more outs even when they weren’t striking people out, which has not been true (except to a minuscule extent) during the post-1950s DIPS era. The run value of .36 hits prevented is worth close to .27 runs; if you add the average value of a batting out (.31 runs) under the offensive model for 1893–1951, that implies the value of a strikeout should be + .31 runs plus .27 runs, or nearly .60 runs. The .40 run weight is, therefore, probably too low, though at least it is ‘carrying’ some of the weight of batted ball out impact in an appropriate way.

Similarly, if you regress outs given batted balls in play onto BB.bfp, you find that each BB.bfp ‘predicts’ that .16 more hits will be allowed on balls in play. Adding the negative impact of .12 runs to the ‘normal’ weight for walks brings the total weight almost precisely to what is shown above for BB.bfp.

If SO.bfp and BB.bfp impact out conversion rates on batted balls in play, shouldn’t we be adding them as predictor variables for each fielding play, such as A6.bip and PO8.bip? One could, but I decided it would be better not to. Though the cumulative impact of SO.bfp and BB.bfp across all positions is quite large, the impact per position is not (or not in a reliable way across time), and including those adjustments would vastly complicate a model that is going to get very complicated due to the challenge of dealing with the outfield. That is not to say we shouldn’t explore this in future versions of DRA; I just felt that there were more than enough challenges for both writer and reader in the first published version of DRA.

One of the benefits of this discovery is a remarkably simple rule of thumb that works reasonably well for pitchers before the mid-1950s. Since preventing walks and inducing strikeouts is associated with approximately the same impact (about half a run) and the league-average ratio of walks to strikeouts was about one-to-one in those days, you can quickly estimate the value of a pitcher, independent of his team, by simply subtracting his walks from his strikeouts, and multiplying by half a run, to derive defensive runs for that pitcher. By no means am I suggesting this as a definitive rating tool. But it does enable one to pick through a baseball encyclopedia and quickly grasp which pitchers from the first half of baseball history were great and which were not; career runs allowed by pitchers, relative to their leagues, are surprisingly similar to this simple estimate.

Though we’ve thus discovered a surprisingly simple formula for pitchers, the DRA model for team defense before 1952 will be challenging. Let’s start with the simplest part of the pre-1952 DRA model: the run weights for all the variables, which were determined under the second stage regression, in which we regressed RA.ip ‘onto’ the ‘centered’ defensive variables from 1893 through 1951.
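Before turning to the run weights, the arithmetic behind the implied strikeout and walk values, and the half-run rule of thumb, can be sketched in Python. The constants are the figures quoted in the text; the function name and the sample pitcher are hypothetical.

```python
# Figures quoted in the text for 1893-1951 (runs per event).
OUT_VALUE = 0.31        # average value of a batting out in the offensive model
SO_BIP_EFFECT = 0.27    # value of the .36 hits prevented on balls in play per strikeout
BB_NORMAL = 0.33        # 'normal' offensive weight for a walk
BB_BIP_EFFECT = 0.12    # value of the .16 extra hits allowed on balls in play per walk

implied_so_weight = OUT_VALUE + SO_BIP_EFFECT   # compare with the fitted .40
implied_bb_weight = BB_NORMAL + BB_BIP_EFFECT   # compare with the fitted .46

def rule_of_thumb_runs(strikeouts: int, walks: int, runs_per_event: float = 0.5) -> float:
    """Rough defensive runs for a pre-1950s pitcher: (SO - BB) times about half a run."""
    return (strikeouts - walks) * runs_per_event

# ASSUMPTION: hypothetical pitcher season (invented numbers).
print(round(implied_so_weight, 2), round(implied_bb_weight, 2), rule_of_thumb_runs(200, 80))
```

The implied strikeout value of .58 is well above the fitted .40, while the implied walk value of .45 nearly matches the fitted .46, which is the pattern the text describes.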


1893–1951 DRA Model (Run Weights Only)

Team defensive runs saved above or below the league rate, given innings pitched, DR.ip, is estimated as the sum of pitching, catching, infield, and outfield defensive runs:

Pitching = + .40*SO.bfp – .46*BB.bfp – 1.56*HR.bh + .61*A1.bip + .61*IFO.bip – .61*WP.ip.
Catching = + .42*rA2.
Infield = + .61*rGO3 + .61*rA4 + .61*rA5 + .61*rA6.
Outfield = + .61*rPO789 + .63*A789.ip.
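As a rough illustration of how the run weights combine, the following Python sketch sums weighted ‘centered’ variables into team defensive runs. The weights are those shown above; the function name and the sample team values are hypothetical.

```python
# Run weights from the 1893-1951 DRA model shown above.
RUN_WEIGHTS = {
    "SO.bfp": 0.40, "BB.bfp": -0.46, "HR.bh": -1.56,
    "A1.bip": 0.61, "IFO.bip": 0.61, "WP.ip": -0.61,       # pitching
    "rA2": 0.42,                                            # catching
    "rGO3": 0.61, "rA4": 0.61, "rA5": 0.61, "rA6": 0.61,    # infield
    "rPO789": 0.61, "A789.ip": 0.63,                        # outfield
}

def team_defensive_runs(net_plays: dict) -> float:
    """Defensive runs above or below the league rate; missing variables count as zero."""
    return sum(w * net_plays.get(name, 0.0) for name, w in RUN_WEIGHTS.items())

# ASSUMPTION: hypothetical team; each entry is already 'centered'
# (events above or below the league rate), with invented values.
example = {"SO.bfp": 50, "BB.bfp": -20, "rA6": 15, "rPO789": -10}
dr = team_defensive_runs(example)
print(round(dr, 2))
```

In this invented case the team gains about 29 runs from its pitchers' strikeouts and walks and about 3 net runs from fielding, for roughly 32 defensive runs in total.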

Notice that there is only one outfield variable, rPO789, because there wasn’t separate putout data for 1893–1919 and 1940–51 when I was building this model. We’ll see below how we manage to calculate the combined team figure of rPO789 when we have, and when we haven’t, separate putout data. Since there is no Retrosheet data for stolen bases allowed and caught stealing, it is impossible to create separate rGO2 and CS.sba variables. The lower run weight for regression-adjusted catcher assists, .42, reflects the fact that many, if not most, catcher assists are on runners caught stealing and bunts, which don’t have the same impact on run prevention as preventing hits.

The .61 regression coefficients for A1.bip, IFO.bip, rGO3, rA4, rA5, rA6, and rPO789 (total regression-adjusted putouts at all outfield positions combined) did not magically match that way. One of the technical reviewers for the 1952–2009 DRA model suggested that, since the coefficients in the infield and outfield were so close in value anyway, the model could be simplified by combining all the regression-adjusted plays made at each position into one regression-adjusted hits prevented variable that would attract a single regression weight. Rather than re-run all the numbers for the 1952–2009 model, which would not have resulted in a material change anyway, I just applied that principle to the 1893–1951 model. But lest anyone think I am holding back, when I did calculate separate run weights per position I found that the middle-infield positions were .10 runs higher than the average and the outfield putout run weight about .07 runs below the average. The latter distortion is easily attributable to the complexities we are about to encounter in estimating rPO789. As more Retrosheet data comes in to simplify the latter calculation, I am reasonably confident that all the run weights will coalesce around .6.

Because of inconsistencies in the outfielder data, it was necessary to split the components of the first stage in the regression analysis into six periods: 1940–51 for both leagues combined, 1920–39 for each league separately,
1901–19 for the National League, 1903–19 for the ‘foul strike’ American League, and 1893–1900 National and 1901–02 ‘no foul strike’ American. Though we are burdened with effectively six models within the ‘one’ pre-1952 model, the number of variables used in each one to predict net fielding plays is much smaller, because there are fewer to choose from, due to the lack of data.

The calculations for the Proxy BIP Distribution Variables follow the same pattern.

First, we have to calculate estimated unassisted ground outs at first base. In each case this is determined by regressing PO3.bip onto A1.bip, A4.bip, A5.bip, and A6.bip. The residual is estimated total unassisted putouts at first base. As we discussed in the example in chapter two, one-third of those are treated as unassisted ground ball putouts at first base (“UGO3.bip”). This is added to A3.bip to create GO3.bip. And, again as a simplifying matter, GO3.bip is not regression-adjusted: there was never more than a minuscule benefit to doing so because the r-squareds were so modest.

Second, we calculate the FO.bip in exactly the way we discussed in the Ozzie Smith example in chapter two, except that we also include outfield errors because, aside from outfield throwing errors after fielding ground ball hits, each outfield error is the result of a ball hit in the air to the outfield, and there were so many more errors then. This results in FOE.bip. For the corresponding variable to adjust outfield plays, we use GOE.bip, which is infield assists given batted balls in play, minus half of catcher assists (because approximately half of league catcher assists throughout 1893–1951 were runners caught stealing, when reported by Retrosheet for offense) given batted balls in play, minus first base double plays (as a proxy for GIDP, which were not officially recorded) given batted balls in play, plus infield errors given batted balls in play, plus UGO3.bip.

Third, as we discussed in chapter two, we use batted balls in play allowed by left-handed pitchers given total balls in play, or LpBIP.bip, as a proxy for relative levels of opponent right-handed hitters, or RBIP.bip.

Fourth, as before, we have IFO.bip as a ‘ball hogging’ variable for outfielders.

Fifth, we don’t have SH.bip to adjust infield plays. But we have a neat proxy, which I discovered courtesy of Ed Walsh, the spit-balling pitcher who recorded more assists per season than any other pitcher in history: A1.bip. Yes, pitcher assists were enormously higher throughout most of the 1893–1951 period, reflecting a lot more bunts being fielded by pitchers. The more bunts fielded by pitchers, the fewer double play opportunities for middle infielders and the fewer bunt opportunities for third basemen. So A1.bip serves, as SH.bip did in the post-1951 model, as a Proxy BIP Distribution Variable as well as a ‘base runner’ variable impacting double plays.


Sixth, the only factor that had a consistently strong impact on catcher assists was the total number of BIP outs (“BIPO”) recorded by the rest of the team, given batted balls in play: BIPO.bip. The fewer BIPO.bip, the more hits allowed by the other fielders, the more men on base, the more bunting and base stealing, and the more catcher assists; the more BIPO.bip, the fewer hits allowed by the other fielders, the fewer men on base, with fewer sacrifice and stolen base opportunities, and catcher assists go down.

Now we can show each of the six ‘first-stage’ regression models. The simplest of these, because we have separate putout information for each outfield position, are the two 1920–39 models for the first-stage regression, which we’ll set out first. I’ve tried to conform the organization of the equations as much as possible to their counterparts for the 1952–2009 model.
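Before the regression results, the construction of the first two proxy variables described above can be sketched in Python. This is a minimal sketch under stated assumptions: the `centered` helper assumes that a ‘.bip’ variable means the team's count minus the league rate per ball in play times the team's balls in play, which is how the text describes plays ‘above or below the league-average rate, given total balls in play’. All function and argument names are hypothetical, and the sample numbers are invented.

```python
def centered(team_count: float, team_bip: float, league_count: float, league_bip: float) -> float:
    """ASSUMED meaning of '.bip': plays above or below the league rate, given team BIP."""
    return team_count - league_count / league_bip * team_bip

def go3_bip(a3_bip: float, unassisted_po3_residual: float):
    """First base ground outs: assists plus one-third of estimated unassisted putouts.

    Returns (GO3.bip, UGO3.bip).
    """
    ugo3 = unassisted_po3_residual / 3.0
    return a3_bip + ugo3, ugo3

def goe_bip(inf_assists_bip, catcher_assists_bip, dp3_bip, inf_errors_bip, ugo3_bip):
    """Ground-out-and-error proxy; every input is already a centered '.bip' value."""
    return inf_assists_bip - 0.5 * catcher_assists_bip - dp3_bip + inf_errors_bip + ugo3_bip

# Invented numbers, for illustration only.
inf_assists = centered(1900, 4500, 30000, 72000)   # team infield assists vs. league rate
go3, ugo3 = go3_bip(6.0, 9.0)
goe = goe_bip(inf_assists, 8.0, 3.0, 5.0, ugo3)
print(round(inf_assists, 1), go3, round(goe, 1))
```

FOE.bip is not sketched here because its base calculation, FO.bip, is defined in chapter two.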

1920–39 National League First-Stage Regression Results

rA2 = A2.bip + .07*BIPO.bip.
rA4 = A4.bip + .28*FOE.bip.
rA6 = A6.bip – .01*LpBIP.bip + .18*FOE.bip.
rA5 = A5.bip – .01*LpBIP.bip + .16*FOE.bip.
rPO7 = PO7.bip + .21*GOE.bip + .10*IFO.bip.
rPO8 = PO8.bip + .23*GOE.bip.
rPO9 = PO9.bip – .01*LpBIP.bip + .14*GOE.bip + .14*IFO.bip.

For this model, rPO789 for the second-stage regression is simply the sum of rPO7, rPO8, and rPO9. For reasons discussed above, rGO3 is simply equal to GO3.bip for the remaining models discussed below.
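As an illustration, the 1920–39 National League equations above can be applied mechanically once the centered variables are known. The sketch below encodes the coefficients as data; the `first_stage` function and the sample team values are hypothetical.

```python
# Coefficients from the 1920-39 National League first-stage results above:
# each entry is (base variable, {adjusting variable: coefficient}).
NL_1920_39 = {
    "rA2":  ("A2.bip",  {"BIPO.bip": 0.07}),
    "rA4":  ("A4.bip",  {"FOE.bip": 0.28}),
    "rA6":  ("A6.bip",  {"LpBIP.bip": -0.01, "FOE.bip": 0.18}),
    "rA5":  ("A5.bip",  {"LpBIP.bip": -0.01, "FOE.bip": 0.16}),
    "rPO7": ("PO7.bip", {"GOE.bip": 0.21, "IFO.bip": 0.10}),
    "rPO8": ("PO8.bip", {"GOE.bip": 0.23}),
    "rPO9": ("PO9.bip", {"LpBIP.bip": -0.01, "GOE.bip": 0.14, "IFO.bip": 0.14}),
}

def first_stage(team: dict, model: dict = NL_1920_39) -> dict:
    """Regression-adjusted net plays per position, plus the combined outfield total."""
    out = {}
    for name, (base, adjustments) in model.items():
        out[name] = team[base] + sum(c * team[v] for v, c in adjustments.items())
    out["rPO789"] = out["rPO7"] + out["rPO8"] + out["rPO9"]
    return out

# ASSUMPTION: invented centered values for a ground-ball staff (high GOE, low FOE).
team = {"A2.bip": 0, "A4.bip": 10, "A5.bip": -5, "A6.bip": 20,
        "PO7.bip": 4, "PO8.bip": -6, "PO9.bip": 2,
        "BIPO.bip": 0, "FOE.bip": -50, "GOE.bip": 50, "IFO.bip": 10, "LpBIP.bip": 100}
adjusted = first_stage(team)
print({k: round(v, 1) for k, v in adjusted.items()})
```

In this invented example the infielders' raw totals are marked down, because a ground-ball staff gave them extra chances, while the outfielders' totals are marked up.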

1920–39 American League First-Stage Regression Results

rA2 = A2.bip + .13*BIPO.bip.
rA4 = A4.bip + .01*LpBIP.bip + .11*FOE.bip.
rA6 = A6.bip + .11*FOE.bip.
rA5 = A5.bip – .01*LpBIP.bip + .18*FOE.bip + .18*A1.bip.
rPO7 = PO7.bip + .21*GOE.bip + .17*IFO.bip.
rPO8 = PO8.bip + .01*LpBIP.bip + .23*GOE.bip.
rPO9 = PO9.bip – .01*LpBIP.bip + .14*GOE.bip + .14*IFO.bip.

For this model, rPO789 for the second-stage regression is simply the sum of rPO7, rPO8, and rPO9.


For the remaining first-stage models, we have the trick of estimating net plays anywhere in the outfield by each outfielder on the team, then aggregating those estimates into one rPO789 number for the team. Once we explain that, we can set out the remaining first-stage models, and we’ll be done.

Dead Ball Era Outfielder Evaluation

When fielding statistics were invented in the nineteenth century, outfield putouts made at the separate positions of left, center, and right were not recorded separately. It is only due to the score sheet analyses of Retrosheet volunteers that we have separate PO7, PO8, and PO9 data (and innings played at each position) per player for seasons since the mid-1950s and now for 1920–39 (though the innings played data is patchy for 1920–39).

The only fielding statistics we generally have for outfielders in 1893–1919 and 1940–51 are (i) the total number of games each player played at each separate outfield position for a given team and year (“G7,” “G8,” “G9”), (ii) the total number of games each player played anywhere in the outfield for a given team and year (Outfield Games, or “OG”), and (iii) the total number of putouts and assists he recorded anywhere in the outfield for such team and year (“PO789,” and assists recorded anywhere in the outfield, “A789”). And, just to make things really interesting, some initial work by Retrosheet clearly indicated that the proportion of putouts in left, center, and right was not the current approximate thirty/forty/thirty percent split, but a much different one that varied greatly over time and across leagues throughout the period!

The following section is the most challenging in this book. It solves a problem that hasn’t even been approached before.

To break this problem down into manageable pieces, let us first assume that we can estimate not just the number of games the player played at each outfield position, but his approximate percentage of the team’s total innings played at that position: the percentage of total team innings that year played in left (“P7”), center (“P8”), and right (“P9”). We will show how this is done further below.

Next, we calculate the number of total outfield putouts the team would have recorded, given their total batted balls in play, if they were league average. That is simply the league total of outfield putouts multiplied by the team’s percentage of league batted balls in play. We’ll call this variable estimated Outfield Put Outs, or “eOPO.”

For every outfielder for every team in the league that year, we multiply his P7, P8, and/or P9 by his team’s eOPO. Then we do a forced zero intercept regression of all of the individual outfielders’ actual PO789 ‘onto’ their
respective P7*eOPO, P8*eOPO, and P9*eOPO. Yes, that means running a regression for each league and season. The regression coefficients reveal the likely percentage distribution of putouts at each position during that year. They result in an almost perfect match for 1911, the one year for the Dead Ball Era for which we do have the separate putout totals.

After this book went into production, Retrosheet published team PO7, PO8, and PO9 totals for 1940–49 (excluding 1941–42 National League). The standard error in the DRA estimate of league-average team putouts per outfield position was two putouts per season. The worst match was the 1949 National League, where league-average team PO7 were understated by six putouts and league-average team PO8 and PO9 were overstated by three putouts each. I think we can therefore be reasonably confident that the shifting outfield putout totals from 1893 through 1919, which may never be tabulated by Retrosheet, have been well estimated by DRA.

The residuals of the regression are the individual outfielders’ estimated outfield putouts above or below the league-average rate, taking into account only their percentage of playing time at each outfield position and the total number of BIP allowed by their team’s pitchers: we call this iPO789.bip. If you add up the iPO789.bip for all the team’s players, it does in fact (I’ve checked) equal the actual outfield putouts for the team, above or below the league-average rate, given total balls in play, which can be explicitly calculated, or PO789.bip.

Estimated Percentage of Putouts in Left, Center, and Right

Year  L   PO7   PO8   PO9      Year  L   PO7   PO8   PO9
1893  N  .351  .392  .257      1894  N  .332  .392  .276
1895  N  .357  .388  .256      1896  N  .352  .382  .266
1897  N  .344  .394  .262      1898  N  .352  .376  .273
1899  N  .362  .372  .266      1900  N  .377  .373  .250
1901  N  .353  .371  .276      1901  A  .351  .375  .274
1902  N  .371  .381  .249      1902  A  .336  .391  .273
1903  N  .361  .379  .260      1903  A  .331  .396  .272
1904  N  .343  .374  .283      1904  A  .333  .385  .282
1905  N  .338  .386  .276      1905  A  .345  .398  .258
1906  N  .327  .395  .278      1906  A  .341  .405  .254
1907  N  .342  .393  .265      1907  A  .331  .399  .269
1908  N  .340  .402  .258      1908  A  .352  .398  .250
1909  N  .357  .371  .272      1909  A  .338  .411  .252
1910  N  .349  .385  .266      1910  A  .325  .405  .270
1911  N  .329  .387  .283      1911  A  .342  .421  .237
1912  N  .363  .379  .258      1912  A  .350  .399  .251
1913  N  .353  .383  .264      1913  A  .330  .399  .272
1914  N  .335  .389  .276      1914  A  .342  .412  .245
1915  N  .345  .399  .256      1915  A  .331  .411  .258
1916  N  .336  .401  .263      1916  A  .351  .393  .256
1917  N  .338  .403  .258      1917  A  .331  .401  .268
1918  N  .330  .408  .262      1918  A  .333  .379  .288
1919  N  .325  .394  .281      1919  A  .321  .394  .285
1940  N  .330  .385  .285      1940  A  .316  .392  .291
1941  N  .353  .377  .270      1941  A  .322  .387  .291
1942  N  .326  .381  .293      1942  A  .318  .394  .288
1943  N  .310  .388  .303      1943  A  .324  .377  .299
1944  N  .321  .396  .283      1944  A  .310  .414  .275
1945  N  .311  .396  .293      1945  A  .313  .385  .302
1946  N  .319  .405  .276      1946  A  .317  .393  .291
1947  N  .330  .381  .289      1947  A  .319  .401  .280
1948  N  .331  .393  .276      1948  A  .301  .413  .286
1949  N  .311  .406  .283      1949  A  .329  .389  .282
1950  N  .326  .397  .277      1950  A  .322  .391  .286
1951  N  .311  .410  .279      1951  A  .334  .394  .272

So now we have a provisional ‘net’ number of putouts per outfielder, given his estimated total playing time at each outfield position and the total number of team balls in play. This is analogous to knowing A6.bip for a team. In the first-stage regressions thus far we have regressed team A6.bip, for example, onto Proxy BIP Distribution Variables over many seasons. Here, we regress the iPO789.bip for each individual outfielder onto ‘his’ estimated percentage of GOE.bip and IFO.bip when he was playing in each respective position. This regression is not done year-by-year, but for a period of years. The table below shows the output for the 1940–51 sample.

The 1,761 observations were all the individual outfielder–season combinations in 1940–51. The coefficients reflect the impact that GOE.bip and IFO.bip have on expected plays in left, center, and right on average throughout 1940–1951. The residuals, which we call riPO789, reflect the estimated net outfield putouts made by each outfielder, after taking into account the impact of ground ball (or fly ball) pitching, as indicated by positive (or negative) GOE.bip, and IFO.bip ‘ball hogging’. The sum of riPO789 for all the outfielders on a team equals rPO789 for the team, and that number is what is included in the ‘second-stage’ regression analysis that began this section.


Outfielder Regression Model 1940–51

SUMMARY OUTPUT

Regression Statistics
Multiple R               .31
R Square                 .10
Adjusted R Square        .09
Standard Error         15.68
Observations         1761.00

ANOVA
               df         SS           MS         F
Regression     6.00       46940.31     7823.39    31.81
Residual       1755.00    431645.65    245.95
Total          1761.00    478585.96

                 Coefficients   Standard Error   t Stat   P-value
Intercept           .00            #N/A           #N/A     #N/A
P7 * GOE.bip       -.17            .03            -6.40    .00
P8 * GOE.bip       -.25            .03            -9.64    .00
P9 * GOE.bip       -.08            .03            -2.83    .00
P7 * IFO.bip       -.06            .04            -1.37    .17
P8 * IFO.bip       -.15            .04            -3.93    .00
P9 * IFO.bip       -.06            .04            -1.30    .19
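The first of the two outfield regressions described in this section (actual PO789 onto P7*eOPO, P8*eOPO, and P9*eOPO, with the intercept forced to zero) can be sketched in plain Python. This is a minimal sketch, not the author's implementation: the helper names are hypothetical, the league totals and player lines are invented, and the small solver stands in for whatever regression software was actually used. The same `zero_intercept_ols` helper would apply to the second regression, with iPO789.bip as the dependent variable and the six P*GOE.bip and P*IFO.bip products as predictors.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def zero_intercept_ols(rows, y):
    """Least squares without an intercept, via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def e_opo(league_outfield_po, league_bip, team_bip):
    """League-average outfield putouts expected from the team's balls in play."""
    return league_outfield_po * team_bip / league_bip

# ASSUMPTION: invented league totals and outfielder lines:
# (P7, P8, P9, team BIP, actual PO789).
LEAGUE_PO, LEAGUE_BIP = 8000.0, 36000.0
players = [
    (0.95, 0.00, 0.00, 4500, 330),
    (0.00, 0.90, 0.05, 4500, 390),
    (0.05, 0.10, 0.80, 4500, 270),
    (0.50, 0.00, 0.30, 4400, 250),
    (0.10, 0.85, 0.00, 4600, 395),
    (0.20, 0.05, 0.60, 4600, 260),
]
rows = [[p * e_opo(LEAGUE_PO, LEAGUE_BIP, bip) for p in (p7, p8, p9)]
        for p7, p8, p9, bip, _ in players]
actual = [po for *_, po in players]

shares = zero_intercept_ols(rows, actual)   # estimated LF/CF/RF putout shares
residuals = [a - sum(s * x for s, x in zip(shares, r))   # iPO789.bip per player
             for a, r in zip(actual, rows)]
print([round(s, 3) for s in shares])
```

The fitted coefficients play the role of the yearly left/center/right percentages, and each residual is that outfielder's provisional net putouts.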

In conclusion, we use the estimated playing time at each position by each player of the league to help us discover indirectly (i) the average distribution of outs at each position, year-by-year and league-by-league, and (ii) the typical impact of ground ball pitching and infield fly out ball hogging on each position over the time period of the model. In the process of making these discoveries, we derive an estimate of the net plays made by each outfielder, riPO789, taking into account the total PO789 we know he recorded, his estimated percentage playing time at all three positions, and his team’s GOE.bip and IFO.bip, which are also known. The team’s rPO789 is then just the sum of its players’ riPO789.

This raises the question of how we estimate such playing time at each outfield position. I do not claim to have invented the best way to do this, and I would welcome any suggestions.

First, we estimate the player’s percentage of the team’s ‘batting innings’, which is his estimated plate appearances (at-bats plus walks) divided by one-ninth of the team’s plate appearances. Next, we estimate the player’s percentage of total team innings in which he was fielding at non-outfield positions, based on his percentage of team plays at those positions. A player’s estimated percentage of his team’s innings played anywhere in the outfield is just the lesser of (i) his total games in the outfield, divided by the team’s games (i.e., as if he played every inning of every outfield game),
and (ii) the excess of his ‘batting innings’ over his ‘non-outfield innings’. That way, if a fourth outfielder doesn’t get a lot of at bats, his estimated innings go down. If a player plays another position, his estimated outfield innings won’t be overestimated.

Now we have to allocate his total estimated outfield playing time among the three outfield positions. If he was a full-time player (greater than seventy percent of estimated outfield innings) who played almost all of his games at one position (no more than three percent at other positions), we ‘fix’ those percentages for him, discounted by two percent, the average percentage of innings missed per game by 2003–05 outfielders who played full time. If the outfielder was the only one at that position for his team, he is not discounted, and gets the full 100 percent.

All other outfielders go in the pool of part-time players. Their total estimated outfield innings are initially allocated pro rata to each player on the basis of his percentage of games played at each position. Usually this causes the sum of team estimated innings to exceed the team total for one or more of the outfield positions. In such case we ‘shrink’ the part-time players’ portions in proportion, so the team estimated innings at each position sums to the total.

This approach will cause a full-time outfielder who splits his time among all three positions to have his innings underestimated and therefore his net plays overestimated. The most significant example of this is the young , who is significantly overrated due to this data limitation. But at least we know why. We are now ready to show the ‘first-stage’ 1940–51 regressions:
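Before those results, the playing-time estimate described above can be sketched in Python. This is a simplified sketch, not the full procedure: it covers the ‘lesser of’ rule, the pro rata split by games, and the proportional shrinking of part-time shares, but leaves out the special handling of full-time players. All names are hypothetical, the sample numbers are invented, and the floor at zero is an added assumption for players whose non-outfield share exceeds their batting share.

```python
def outfield_share(ab, bb, team_pa, of_games, team_games, non_of_share=0.0):
    """Estimated share of team innings played anywhere in the outfield.

    Lesser of (i) outfield games / team games and (ii) batting-inning share
    minus non-outfield share. The floor at zero is an assumption added here.
    """
    batting_share = (ab + bb) / (team_pa / 9.0)
    return max(0.0, min(of_games / team_games, batting_share - non_of_share))

def split_by_games(of_share, g7, g8, g9):
    """Allocate the outfield share to LF, CF, RF pro rata by games at each position."""
    total = g7 + g8 + g9
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(of_share * g / total for g in (g7, g8, g9))

def shrink_to_total(shares, available=1.0):
    """Scale part-time shares at one position so they sum to at most `available`."""
    s = sum(shares)
    if s <= available or s == 0:
        return list(shares)
    return [x * available / s for x in shares]

# ASSUMPTION: hypothetical player with 540 AB and 60 BB on a team with 6,000 PA.
share = outfield_share(540, 60, 6000, of_games=150, team_games=154)
p7, p8, p9 = split_by_games(share, g7=100, g8=50, g9=0)
print(round(share, 3), round(p7, 3), round(p8, 3), round(p9, 3))
```

In `shrink_to_total`, `available` would be whatever portion of a position's innings is left after the full-time players' fixed shares are set aside.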

1940–51 Major League First-Stage Regression Results

rA2 = A2.bip + .06*BIPO.bip.
rA4 = A4.bip + .01*LpBIP.bip + .16*FOE.bip.
rA6 = A6.bip – .01*LpBIP.bip + .13*FOE.bip.
rA5 = A5.bip – .02*LpBIP.bip + .21*FOE.bip + .32*A1.bip.

riPO789 = iPO789.bip adjusted as follows:

+ .14*P7*GOE.bip + .05*P7*IFO.bip
+ .01*P8*LpBIP.bip + .26*P8*GOE.bip + .15*P8*IFO.bip
+ .09*P9*GOE.bip + .04*P9*IFO.bip.

For this and all remaining models, the riPO789 are summed up for a team’s outfielders to yield the team rPO789. The coefficients look reasonable. The coefficient for the impact of GOE.bip on PO8, which is shown above as + .26*P8*GOE.bip, is almost precisely the average of the RGO.rbip
and LGO.lbip coefficients in center field in the post-1951 model. The coefficient for left field is only slightly lower than the average of RGO.rbip and LGO.lbip in left field for the post-1951 model. Only right field seems to be weakly modeled, but, as we’ve seen throughout this book, right field is usually a problem for almost any system, including those based on batted ball data. All in all, we’ve teased a surprising amount of information out of the raw statistics bequeathed to us.

More Missing Data

Before 1920, we lack consistent BFP data, which is very serious, given that BFP is the denominator of opportunities for SO and BB and part of the defi nition of the two other important ‘denominators’: BH = BFP – SO – BB ; BIP = BFP – SO – BB – HR. As Retrosheet continues its eff orts to glean data from before the 1950s using boxscores and newspaper articles, the single most important variable to get right, other than separate counts of outfi elder putouts in left , center, and right, is correct BFP for each and every team. For seasons before 1904 (as well as 1905–06 for the American League), the BFP data is obviously incomplete (for example, the 1905 Washington Senators are reported to have faced zero opposing batters). Some further investigation revealed that most American League seasons and many National League sea- sons during this so-called Dead Ball Era have incomplete BFP totals. How do we know this? Because many team seasons with BFP totals that look reason- able are in fact less than the absolute logical minimum. BFP cannot be less than the sum of plate appearances not resulting in an out (BB and HBP ), total hits, and batting outs (which are basically total outs ( IP* 3) reduced by estimated base-running outs (infi eld DP , outfi eld assists, and CS )). Th e obvious missing piece are batters who Reach Base on an Error (“ ROE ”), which was a much more important part of the game before 1920, and particularly before 1901. We have total E at each position, but not ROE for seasons before the mid-1950s. How many E at each position should be treated as ROE ? If you sense another regression analysis coming on, you are correct. Th e trick is to fi nd a sub-sample of Dead Ball Era team seasons which appears to have an ‘appropriate’ excess of BFP over the lower bound estimate of BFP ( BB (including HBP ) + H + IP * 3 – [IDP + A789 + CS ] ( BFP without E , or “ BFPwoe”)). 
There are only 168 Dead Ball Era team seasons in which PA is reported as a number larger than BFPwoe. An initial regression of their excess of BFP over BFPwoe (marginal BFP, or “mBFP”) ‘onto’ errors at each position revealed some bizarre results at individual positions, but the weight for aggregated infield errors was .51, and the weight for outfield errors was not statistically significant. Taking the sub-sub-sample of ninety-eight team seasons in which the ratio of total infield errors divided by mBFP was between 119 percent and 264 percent (the teams within the 168-team sample within one standard deviation of the average), and regressing the mBFP of that sample onto infield errors (excluding catcher assists, which are often CS) and outfield errors, revealed a weight of essentially 1.00 per first base error and .67 per infield error (excluding catcher and first base errors) and a not remotely statistically significant .16 weight per outfield error. The first two weights make some sense to me, as first basemen rarely make errors throwing out baserunners without fielding a ball (i.e., they don’t make double play pivot throws), and approximately 70 percent of other infield errors are ROE in contemporary baseball. Certainly more than 16 percent of outfield errors resulted in ROE, but proportionately far more outfield than infield errors are throwing errors that impact base-runner advancement but not ROE. Anyway, here is the estimate for BFP that is used not only in 1901–1920 but also for 1893–1900 (which does not have any team seasons of positive mBFP):

BFP = BB + H + BO + ROE.
BFP = BB + H + [IP * 3 – (DP3 + A789 + CS)] + [E3 + .67 * E1456].
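For readers who want to apply the estimate to a team season, here is a minimal Python sketch of the formula above. The function and argument names are my own, and the sample inputs are invented round numbers rather than any real team's totals. BB is assumed to already include HBP, as in the definition of BFPwoe.

```python
def batting_outs(ip, dp3, a789, cs):
    """BO: total outs (IP * 3) less estimated base-running outs."""
    return ip * 3 - (dp3 + a789 + cs)


def bfp_without_errors(bb, h, ip, dp3, a789, cs):
    """BFPwoe: the logical lower bound on BFP, ignoring ROE."""
    return bb + h + batting_outs(ip, dp3, a789, cs)


def estimated_roe(e3, e1456):
    """ROE estimate: 1.00 per first base error, .67 per other infield error."""
    return e3 + 0.67 * e1456


def estimated_bfp(bb, h, ip, dp3, a789, cs, e3, e1456):
    """BFP = BB + H + BO + ROE."""
    return bfp_without_errors(bb, h, ip, dp3, a789, cs) + estimated_roe(e3, e1456)


def marginal_bfp(reported_bfp, bb, h, ip, dp3, a789, cs):
    """mBFP: excess of a reported BFP total over BFPwoe.
    A negative value flags a reported total below the logical minimum."""
    return reported_bfp - bfp_without_errors(bb, h, ip, dp3, a789, cs)


# Invented inputs, for illustration only.
team = dict(bb=400, h=1300, ip=1350, dp3=80, a789=60, cs=100)
print(bfp_without_errors(**team))
print(estimated_bfp(**team, e3=20, e1456=200))
print(marginal_bfp(5400, **team))
```

A reported BFP that produces a negative mBFP is one of the "look reasonable but impossible" totals described above, and is replaced by the estimate.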

Now we can present the Dead Ball Era first-stage regression results.

1901–19 National League First-Stage Regression Results

rA2 = A2.bip + .12 * BIPO.bip.
rA4 = A4.bip + .18 * FOE.bip + .18 * A1.bip.
rA6 = A6.bip + .17 * FOE.bip + .20 * A1.bip.
rA5 = A5.bip – .01 * LpBIP.bip + .19 * FOE.bip.

riPO789 = iPO789.bip adjusted as follows:

+ .01 * P7 * LpBIP.bip + .26 * P7 * GOE.bip + .05 * P7 * IFO.bip
+ .30 * P8 * GOE.bip + .19 * P8 * IFO.bip
– .03 * P9 * LpBIP.bip + .10 * P9 * GOE.bip – .08 * P9 * IFO.bip.

LpBIP.bip is set to zero for seasons before 1915, as platooning didn’t become significant until then. Notice how close the GOE.bip weight is in left to what it is in center. This makes sense, because the number of PO7 was much closer to the number of PO8 during the Dead Ball Era; see the Estimated Percentage of Putouts in Left, Center, and Right chart above.
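The 1901–19 National League adjustments can be sketched in Python as follows. This is only an illustration of the arithmetic. The dictionary keys mirror the variable names in the equations, but the function name, the argument layout, and the reading of P7, P8, and P9 as the left, center, and right field shares defined earlier in this appendix are my assumptions.

```python
def nl_1901_19_first_stage(x, p7, p8, p9, season):
    """Apply the 1901-19 NL first-stage adjustments.

    x maps names such as 'A4.bip' to the team's values; p7, p8, p9 are
    assumed to be the left, center, and right field shares.
    """
    # LpBIP.bip is set to zero for seasons before 1915.
    lp = x["LpBIP.bip"] if season >= 1915 else 0.0
    foe, goe = x["FOE.bip"], x["GOE.bip"]
    ifo, a1 = x["IFO.bip"], x["A1.bip"]
    return {
        "rA2": x["A2.bip"] + 0.12 * x["BIPO.bip"],
        "rA4": x["A4.bip"] + 0.18 * foe + 0.18 * a1,
        "rA6": x["A6.bip"] + 0.17 * foe + 0.20 * a1,
        "rA5": x["A5.bip"] - 0.01 * lp + 0.19 * foe,
        "riPO789": x["iPO789.bip"]
        + p7 * (0.01 * lp + 0.26 * goe + 0.05 * ifo)
        + p8 * (0.30 * goe + 0.19 * ifo)
        + p9 * (-0.03 * lp + 0.10 * goe - 0.08 * ifo),
    }
```

The other Dead Ball Era models have the same shape, so in practice one would probably store the weights in a table keyed by league and period rather than write one function per model.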


1903–1919 American League First-Stage Regression Results

rA2 = A2.bip + .21 * BIPO.bip.
rA4 = A4.bip + .17 * FOE.bip.
rA6 = A6.bip – .02 * LpBIP.bip + .29 * FOE.bip + .14 * A1.bip.
rA5 = A5.bip – .02 * LpBIP.bip + .17 * FOE.bip.

riPO789 = iPO789.bip adjusted as follows:

+ .22 * P7 * GOE.bip + .11 * P7 * IFO.bip
– .01 * P8 * LpBIP.bip + .21 * P8 * GOE.bip + .24 * P8 * IFO.bip
– .01 * P9 * LpBIP.bip + .10 * P9 * GOE.bip.

Again, LpBIP.bip is set to zero for seasons before 1915. Notice how large the IFO.bip ‘ball-hogging’ adjustment is for the 1903–19 American League. Tris Speaker says hello.

1901–1903 American League and 1893–1900 National League First-Stage Regression Results

rA2 = A2.bip + .14 * BIPO.bip.
rA4 = A4.bip + .07 * FOE.bip.
rA6 = A6.bip + .14 * FOE.bip + .26 * A1.bip.
rA5 = A5.bip + .09 * FOE.bip – .02 * A1.bip.

riPO789 = iPO789.bip adjusted as follows:

+ .14 * P7 * GOE.bip + .10 * P7 * IFO.bip
+ .18 * P8 * GOE.bip + .02 * P8 * IFO.bip
+ .09 * P9 * GOE.bip – .12 * P9 * IFO.bip.

There is no LpBIP.bip adjustment because there was effectively no platooning. Notice as well that the coefficients for FOE.bip and GOE.bip tend to be weaker. This is consistent with the trend that goes back throughout major league history: the further back we go, the weaker the coefficients. The reason would appear to be that there was much greater variability in fielder quality, so gradually more of the variance in FOE.bip and GOE.bip reflected fielder quality than the tendency of the team’s pitchers to generate fly outs or ground outs.
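One rough way to see the weakening is to average the FOE.bip and GOE.bip weights reported in the three sets of first-stage results. The short Python sketch below does only that. The unweighted average across positions is my own simplification for illustration, not a statistic used anywhere in the model.

```python
from statistics import mean

# FOE.bip weights for rA4, rA6, rA5, copied from the equations above.
FOE = {
    "1901-19 NL": [0.18, 0.17, 0.19],
    "1903-19 AL": [0.17, 0.29, 0.17],
    "1901-03 AL / 1893-1900 NL": [0.07, 0.14, 0.09],
}

# GOE.bip weights for P7, P8, P9, copied from the equations above.
GOE = {
    "1901-19 NL": [0.26, 0.30, 0.10],
    "1903-19 AL": [0.22, 0.21, 0.10],
    "1901-03 AL / 1893-1900 NL": [0.14, 0.18, 0.09],
}

foe_means = {model: mean(w) for model, w in FOE.items()}
goe_means = {model: mean(w) for model, w in GOE.items()}

for model in FOE:
    print(f"{model}: FOE {foe_means[model]:.2f}, GOE {goe_means[model]:.2f}")
```

On this crude measure the earliest model has the smallest average weight for both variables, which matches the trend described above.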

THE DEAD BALL DRA MODEL APPLIED TO 2003–05 OUTFIELDERS

Towards the end of chapter three we presented Dead Ball DRA results for the outfielders who were in the test sample for comparing various systems to Test Runs. Here, as promised there, is the version of the Dead Ball Era model that was applied. The same methodology was followed: estimating innings at each position, estimating the distribution of putouts per year and per league, then estimating the impact of Proxy BIP Distribution Variables and ball-hogging variables on net plays. Here is the regression output for that model:

Dead Ball DRA Regression for 2003–05 Outfielders

SUMMARY OUTPUT

Regression Statistics
Multiple R              .996
R Square                .993
Adjusted R Square       .985
Standard Error        26.953
Observations         150.000

ANOVA
                 df              SS             MS          F
Regression   13.000    13704170.967    1054166.997   1451.086
Residual    137.000       99526.033        726.467
Total       150.000    13803697.000

                  Coefficients   Standard Error    t Stat   P-value
Intercept                 .000             #N/A      #N/A      #N/A
ePO789                   1.006             .008   130.979      .000
p7 * GOE.bip             -.110             .065    -1.708      .090
p8 * GOE.bip             -.221             .054    -4.051      .000
p9 * GOE.bip             -.191             .067    -2.859      .005
p7 * IFO.bip              .071             .142      .497      .620
p8 * IFO.bip             -.329             .152    -2.166      .032
p9 * IFO.bip             -.148             .119    -1.238      .218
p7 * LpBIP.bip            .006             .011      .575      .566
p8 * LpBIP.bip            .000             .011     -.002      .999
p9 * LpBIP.bip            .009             .013      .681      .497
p7 * SH.bip               .143             .553      .259      .796
p8 * SH.bip               .206             .514      .400      .690
p9 * SH.bip              -.022             .605     -.036      .971
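For readers less used to this kind of output, the summary figures can be recomputed from the ANOVA rows. The Python sketch below shows how they hang together. It uses only numbers printed in the output and the standard relationships MS = SS / df, F = MS(regression) / MS(residual), R Square = 1 – SS(residual) / SS(total), and t Stat = coefficient / standard error. Because the printed coefficients are rounded to three decimals, a t Stat recomputed this way only approximates the reported one.

```python
import math

# Values copied from the ANOVA table above.
ss_regression, df_regression = 13704170.967, 13
ss_residual, df_residual = 99526.033, 137
ss_total = 13803697.000

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual
r_square = 1 - ss_residual / ss_total
multiple_r = math.sqrt(r_square)
standard_error = math.sqrt(ms_residual)

# t Stat for p8 * GOE.bip from the rounded coefficient and standard error.
t_p8_goe = -0.221 / 0.054

print(f"F = {f_stat:.3f}")
print(f"R Square = {r_square:.3f}, Multiple R = {multiple_r:.3f}")
print(f"Standard Error = {standard_error:.3f}")
print(f"t (p8 * GOE.bip) = {t_p8_goe:.2f}")
```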

MISSING TPAR FORMULAS

I also promised in chapter four to disclose the ugly equations with long strings of significant digits that were developed for the TPAR model for center field and left field. The center field TPAR formula is just the sum of the intercept below and the product of the center fielder’s Mid-Career Season (to the following powers) and the following coefficients, expressed in scientific notation (the ‘E’ numbers at the end give the number of places to move the decimal point).


Center Field TPAR Intercept and Coefficients

             Coefficients
Intercept    2.2369009951E+08
MCS         -4.6076389823E+05
MCS^2        3.5585983532E+02
MCS^3       -1.2213304947E-01
MCS^4        1.5716423571E-05

Similarly, in left field:

Left Field TPAR Intercept and Coefficients

             Coefficients
Intercept    2.0664272953E+08
MCS         -4.2308805234E+05
MCS^2        3.2483095979E+02
MCS^3       -1.1083709358E-01
MCS^4        1.4181644457E-05
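Here is a minimal Python sketch for evaluating either polynomial. It assumes that the Mid-Career Season is entered as a calendar year, which is what the size of the coefficients suggests. The function and constant names are mine. Because the individual terms are on the order of hundreds of millions and largely cancel one another, the sketch uses decimal arithmetic so that no digits of the published coefficients are lost along the way.

```python
from decimal import Decimal

# Intercept first, then the MCS, MCS^2, MCS^3, MCS^4 coefficients.
CF_TPAR = ["2.2369009951E+08", "-4.6076389823E+05", "3.5585983532E+02",
           "-1.2213304947E-01", "1.5716423571E-05"]
LF_TPAR = ["2.0664272953E+08", "-4.2308805234E+05", "3.2483095979E+02",
           "-1.1083709358E-01", "1.4181644457E-05"]


def tpar(coefficients, mcs):
    """Evaluate intercept + c1*MCS + ... + c4*MCS^4 by Horner's rule.

    mcs is assumed to be an integer season such as 1950.
    """
    x = Decimal(mcs)
    total = Decimal(0)
    for c in reversed(coefficients):
        total = total * x + Decimal(c)
    return total


print(tpar(CF_TPAR, 1950))
print(tpar(LF_TPAR, 1950))
```

Keep in mind that the coefficients are printed to eleven significant digits, so the result is only as precise as those digits allow once the large terms cancel.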

RUN WEIGHTS AND ‘WIN’ WEIGHTS

Some of you might be wondering why we use the same runs estimates for fielders in the 1893–1951 model (without adjustment for the varying ratio of runs to wins during that period) and the same runs estimates for fielders under the 1952–2009 model (without adjustment for the varying ratio of runs to wins during that period). It’s simple, and I’m reassured that Sean Smith arrives at the same conclusion in reporting his ratings throughout history. The average number of runs scored in both models (1893 through 1951, and again in 1952 through 2009) was essentially equal to the average throughout all of major league history. If we calculate one set of run weights for each model, the run weight will be too ‘low’ for the high scoring portions of each sample, but too ‘high’ for the low scoring portions. But that’s all right, because the value in wins per run in the high scoring eras is in fact lower, and the value in wins per run in the low scoring eras is higher. The effects balance out, if not perfectly, then to an extent well within the model errors that unavoidably remain.

THE PHILOSOPHER TO BE NAMED LATER

Bertrand Russell.
