Were the 1996{2000 Yankees the Best Baseball Team Ever?
Total Page:16
File Type:pdf, Size:1020Kb
Were the Yankees the Best Baseball Team Ever Gary A Simon and Jerey S Simono Intro duction In the aftermath of the New York Yankees World Series victory over the New York Mets games to many fans sp ortswriters and baseball insiders fo cused attention on whether the Yankees achievementoffourWorld Series championships in veyears was the most impressive team accomplishment in the history of the sp ort In particular how did it compare to the ve championships in six years including four consecutivechampionships of the Yankees or the six championships in seven years including ve consecutivechampionships of the Yankees A commonly expressed opinion was that of MSNBC sp orts columnist Bob Herzog who wrote The greatest team in Yankees history is playing right now The reason commonly given for this opinion is that with two p ostseason playo rounds b efore the World Series there is the added diculty of just getting to the World Series let alone and then winning it The simple act of rep eating is more dicult than ever opined Fox Sp orts commentator Keith Olb ermann and Atlanta Braves manager BobbyCox agreed saying Its so hard to go all the wayevery year now Its amazing to me what the Yankees have done While it is certainly true that it is harder to win three series one b est out of games two b est out of games than to win one b est out of games that is only half the story What Herzog Olb ermann and Cox are ignoring is that it is apparently easier nowtogetinto the p ostseason when eight teams are eligible than it was years ago when only two teams A shortened version of this pap er will app ear in the Spring issue of Chance were eligible In this pap er we examine b oth sides of the question and try to establish whether it really is harder to win the World Series now The rst step is to acknowledge that we are not going to actually answer the question p osed in the title While it might b e a more impressiveachieve ment to win three series rather than one in order to win a championship that hardly implies that a team that only had to win one series is less go o d After all it isnt the fault of the Yankee team whichwon more than of its regular season games and then swept the World Series from the Pittsburgh Pirates that it only had to win four p ostseason games to win it all Rather we will attempt to quantify the impressiveness of a given teams achievement as the answer to the question How surprising would it b e for a team to achieve that level of regular season and p ostseason success Measuring impressiveness Answering the question that closed the previous section involves several steps First wemust dene what we mean by that level of regular season and p ost season success Since the numb er of games played in a season has varied over time regular season success should b e assessed using winning p ercent age rather than numb er of wins If a team wins out of games for example that corresp onds to a winning p ercentage of and we will treat that eventashaving a winning p ercentage of at least The intuition b e hind this is that when a baseball fan refers to a team winning games they actually mean at least games that is a team that wins games also won games Postseason success corresp onds to a twopart pro cess First a team must earn a p osition in p ostseason play Before only one team from each league American and National earned a sp ot in the World Series the only p ostseason comp etition From through eachleaguewas divided into two divisions with the division winners in eachleaguerstplaying each other in a League Championship Series and the league champions then meeting in the World Series Since each league has b een split into three divisions and the three division winners plus the team with the next b est record the wild card team play in the Division Series with the winners playing in the League Championship Series for the righttoplayintheWorld Series There was no p ostseason play in b ecause of a players strike These dierent systems mean that the impressiveness of a teams achievement in making the p ostseason was very dierent in than it was in with four times as many slots available The second part of p ostseason success is simply howmany series a team won While it could b e argued that a gamesto World Series triumph is more impressive than gamesto ultimately all that is rememb ered is which team won Putting these three steps together gives our impressiveness measure Say a team has winning p ercentage x x and ultimately wins the World Series The impressiveness of that feat is P Team having winning p ercentage x and Team making the p ostseason and Team winning the World Series Let X b e the random variable corresp onding to the winning p ercentage for arandomlychosen team with underlying density function f Let S rep resenttheevent of making p ostseason play and let W representtheevent winning the World Series By the denition of conditional probability the impressiveness measure equals P X x and S and W P X x and S P W jS X x Z P S jX w f w dw P W jS X x x Each of the terms in this formula can b e estimated from the data The density f can b e estimated using any density estimate since we are particularly interested in density estimates in the tails we will use a lo cal quadratic likeliho o d density estimate which is known to havefavorable tail b ehavior see Hjort and Jones and Loader The estimate f x is dened as the maximizer of Z n X x x x u i K log f x n K f udu i h h i where K is the kernel weight function h is a smo othing parameter that con trols the smo othness of the estimate and log f is mo deled to b e lo cally quadratic in xWe use the corrected AI C criterion to cho ose the smo othing parameter Simono although the estimates turn out to b e relatively insensitive to smo othing parameter choice for these data In order to reect the changing pattern of winning p ercentages over time separate density es timates are constructed for the three eras of the socalled mo dern age of baseball and We could consider using the empirical cumulative distribution function to estimate P X x and S therebyavoiding smo othing Since our primary interest is in teams that did well during the regular season however there is not enough data in this upp er tail for eective estimation The conditional probability P S jX is estimated from the data based on a logistic regression mo del P S log X P S Clearly the co ecients of this mo del will b e dierent for the dierent eras examined and the mo dels will b e t separately The nal probability P W jS X x p otentially allows for great sim plication If the two teams in any p ostseason series are evenly matched the probability of winning a series is simply assuming indep endence of the p ostseason series a cointossing mo del the probabilityofwinningthe World Series once the team has qualied for the p ostseason is pre and resp ectively It is p ossible to rene these gures by allowing the probability of winning a series to b e a function of the quality of the two teams playing in the series we will examine this p ossibility later Empirical results The analyses presented here are based on data for all teams for the ma jor league seasons through excepting and We b egin with since that is generally considered as the b eginning of the mo dern era of baseball succeeding the dead ball era Both and were dis rupted byplayer strikes that aected the p ostseason was played as a split season with rsthalf champions playing against secondhalf champi ons in the p ostseason a oneofakind arrangementandthus not suited for inclusion in our analysis while the p ostseason was canceled This time p erio d is split into three eras Era through when one team p er league played in one p ostseason playo series the World Series Era through when two teams p er league played in two p ostseason playo series the League Championship Series and the World Series and Era through when four teams p er league haveplayed in three playo series the Division Series League Championship Series and World Series Figure gives density estimates for the winning p ercentages for all teams for the three eras era solid line era dotted line and era dashed line The average winning p ercentage in anyyear is of course but it is noticeable that in the earliest era the distribution of winning p ercentages is decidedly asymmetric with mo de at and a long left tail and exhibits higher variability the standard deviation of winning p ercentages for this era is The winning p ercentage distributions for the latter twoerasare very similar b oth b eing reasonably symmetric p eaking at with less variability than for era standard deviation of winning p ercentages for b oth These densities imply that comp etitive balance has b een much greater in the last years than in the previous years While almost of all ma jor league baseball teams won at least of their games during era only ab out did in eras and the p ercentages winning at least twothirds of their games are and resp ectively Unfortunately this surfeit of apparent excellence in the earlier part of the century is balanced by excess incomp etence with of ma jor league teams in era winning fewer than of their games compared with and in eras and resp ectively and winning no more than onethird of their games compared with and for eras and resp ectively This means that a winning p ercentage of in era the b est of p erformances is equivalent to a winning p ercentage of in the latter eras which