BASEBALL AND THE LEFT HANDED HITTER
A THESIS
Presented to
The Faculty of the Department of Economics and Business
The Colorado College
In Partial Fulfillment of the Requirements for the Degree
Bachelor of Arts
By
Addison Alan DeBoer
May 2010
Mathematical Economics
Abstract
This thesis is designed to explain the unusually high number of left-handed hitters found in Major League Baseball (MLB). The focus of this study is to determine the appropriate proportion of left-handed hitters an MLB team should employ in order to maximize its success. The driving force behind this study is that the average proportion of lefties in MLB is substantially higher than the proportion of lefties found in everyday society. The hypothesis is that between 33% and 55% of a team's hitters should be left-handed in order to achieve the team's optimal rate of success. This study includes all 30 MLB teams over a span of ten years, covering more than 4,100 hitters. Two models are used to link left-handed hitters to the total number of runs a team scores and to a team's season-long winning percentage. The regressions produced R-squared values of .91 and .45, respectively. While the models showed that several variables do significantly affect runs scored and winning percentage, the results were inconclusive in relating left-handed hitting to either dependent variable. For that reason the research could not support the hypothesis that MLB teams should employ between 33% and 55% left-handed hitters.
KEYWORDS: (Major League Baseball, Left-handed Hitter, Runs Scored, Winning Percentage)
ON MY HONOR, I HAVE NEITHER GIVEN NOR RECEIVED UNAUTHORIZED AID ON THIS THESIS
Signature
TABLE OF CONTENTS
I INTRODUCTION
    Literature Review
    Player Value
    Player Productivity
    Natural Handedness
    Conclusion
II THEORY
    The Left-Handed Hitter
    Team Success in MLB
    Sample and Time Frame
    Model 1
    Dependent Variable
    Hitters
    Independent Variables
    Model 2
    Dependent Variable
    Team
    Independent Variables
    Method
    Conclusion
III DATA
    Data Resources
    Data Analysis
    Variable Explanation
    Conclusion
IV RESULTS
    Regression Results
    Model 1
    Model 2
    Conclusion
V CONCLUSION
    Regression Conclusions
    Further Research
    Final Discussion
SOURCES CONSULTED
LIST OF TABLES
2.1 New York Yankees Batting Order and OPS for 2009 Season
2.2 Washington Nationals Batting Order and OPS for 2009 Season
2.3 Expected Signs for Independent Variables for Model 1
2.4 Expected Signs for Independent Variables for Model 2
3.1 Descriptive Statistics
4.1 Regression Results: Model 1
4.2 Regression Results: Model 2
ACKNOWLEDGEMENTS
I would like to thank my teacher, advisor, and friend, Kristina Lybecker, for all her guidance and encouragement throughout this thesis. Your support throughout was greatly appreciated and will not be forgotten. I would also like to thank my family for their never-ending support not only throughout my college career, but also throughout my whole life. Lastly, I would like to thank my parents for everything they have sacrificed over the years in order to support me. Without you guys I can't imagine where I would be today. Everything I've ever accomplished I owe to you.
CHAPTER I
INTRODUCTION
Introduction
Major League Baseball (MLB) has been commonly referred to as America's pastime for over a century. It has been a focal point of the sporting world in the United States since its expansion in 1869.1 This study has two purposes. The first is to answer whether left-handed hitters increase a team's run production or winning percentage. The second is to uncover just how many lefties a team should use to maximize its run production and winning percentage.
The focus of this study is MLB performance based upon the batting handedness of each team's players. This topic is of interest to players, managers, general managers, and fans alike, each for their own reasons. For instance, a player would like to know how many lefties a team is looking to employ when signing his contract, to better understand the demand for him on that team. Knowing the number of lefties a team is looking to employ could help a player decide which team is the best fit for him when signing a new contract. The manager could use this information in deciding which players he wants to use on a daily basis, and from this research determine how many left-handers he should use throughout the season. Along the same lines, a general manager could use this study to help determine how many lefties to employ from season to season. Knowing how many lefties to hire in order to maximize statistics and winning percentage would be extremely valuable in selecting a roster year in and year out. Finally, a fan could appreciate this knowledge simply from a general interest in the game. Since fans are essentially the revenue generators for this industry, their right to this knowledge is as strong as that of any of the others mentioned. These reasons, among others, speak to the importance of this study. Fundamentally, this research can help determine the effect lefties have in today's game.
1 "Baseball Almanac - The Official Baseball History Site," [cited 2010]. Available from http://www.baseball-almanac.com/.
The role of the left-handed hitter has gradually evolved, just as the game of baseball has, over the past century. Having left-handed hitters in the lineup has come to be seen in Major League Baseball as necessary for success. This was seen very clearly when Chicago Cubs manager Lou Piniella said, "The only thing I talked about last season was a need for a left-hand bat... We didn't bring Edmonds back and Edmonds hit quite a few homeruns, so we needed a left-handed bat. That's it."2 Without question the role of a left-hander in MLB is one that cannot be overlooked, and this study will shed light on that role.
In a game where the livelihood of everyone involved relies on production, many studies have been done to explain it. There have been countless studies on professional sports and the production of players relating to their age, race, contract, and other characteristics. As explained above, a left-handed hitter plays a very large role in baseball, yet none of these studies have identified exactly what that role is. These studies, done in baseball, basketball, cricket, football, hockey, and soccer, will provide beneficial background for this research.
2 "Chicago Tribune Online - Chicago Cubs," [cited 2010]. Available from http://www.chicagotribune.com/.
Many factors must be accounted for in order to place a value on the left-handed hitter. First, factors that relate to player, or even employee, productivity must be accounted for. Variables affecting production that are never seen on the score sheet, such as experience, physical attributes, position played, age, and contract, among a host of others, must be considered. Team production variables, alongside the number of left-handed hitters, include winning percentage, home runs, hits, strikeouts, stolen bases, and walks, among others. It must also be noted that the statistics used will not track each individual event, or even each player's season totals. This study will use team averages and totals to get a wider view of the role lefties play in MLB.
The goal of this study is to determine the appropriate number of left-handed hitting players a team should utilize to maximize its run production and winning percentage. The regressions in this study will test the hypothesis that each team should employ between three and five left-handed hitters to maximize success. This section has illustrated the importance of this study and has laid the framework for determining the value of lefties. The rest of this chapter will discuss previous research that pertains to player value, player productivity, and handedness. The following chapter will review relevant economic theory and explain the methodology and data used in this study. Chapter III will discuss the data set for this study in detail, while Chapter IV will cover regression results and their meanings. Finally, Chapter V will provide conclusions based upon the results of this study while providing suggestions for possible future research on this topic.
Literature Review
The purpose of this section is to review the literature on player and team performance in Major League Baseball. There have been many studies done on player productivity and players' value to the team, but few studies have incorporated physical attributes into the equation. Economists James Gwartney and Charles Haworth (1974) studied the impact of a player's ethnicity on a team's winning percentage, but the study of a player's natural handedness is relatively untouched by the research world.3 Many economists, such as Bradbury (2007)4 and Berri (1999)5, have found ways to evaluate an athlete's value, such as attributing monetary or statistical values to a player's worth, based on the effect he personally has on a team's winning percentage. However, their research was based strictly on on-field and on-court performance, never accounting for a player's natural hand preference.
Other studies, such as the one done by Oorlog (1995) on marginal revenue and a player's salary, tried to relate a player's value to his marginal revenue, generated through ticket sales and broadcast revenue.6 This study tried to put a strict monetary value on each player. In doing so, Oorlog (1995) expanded on the method used by many economists of calculating each player's marginal revenue product (MRP) by adding the spectator aspect, generating a new model he called marginal spectator revenue product (MSRP).7 Without question, in a business such as Major League Baseball, this can be very beneficial.
3 James Gwartney and Charles Haworth, "Employer Costs and Discrimination: The Case of Baseball," The Journal of Political Economy 82, no. 4 (07 1974): 873-881.
4 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y; Dutton, 2007), 336.
5 David J. Berri, "Who is 'Most Valuable'? Measuring a Player's Production of Wins in the National Basketball Association," Managerial and Decision Economics 20, no. 8 (12 1999): 411-427.
6 Dale R. Oorlog, "Marginal Revenue and Labor Strife in Major League Baseball," Journal of Labor Research XVI, no. 1 (Winter 1995): 25-42.
The first portion of this literature review will cover studies done on player value, relating a player's value to his monetary value and also to his team's winning percentage. Both measures have their advantages because, as much as professional baseball is a business, many would argue that winning is more important than money to many owners around the league. The second section of the literature review will touch upon studies done on player productivity. It will bring to the forefront the studies that generated and revolutionized the models that measure a player's productivity on the field of play. The third section will look into studies done on handedness and the natural advantages and disadvantages that come with favoring one hand over the other. These are medical studies that examine topics such as natural handedness, handedness in sports, and hand vs. ocular dominance, and identify the advantages and disadvantages of each. The final section of this review of scholarly works will analyze ways to improve on these studies and relate them to the role left-handed batters play in MLB, to determine whether the optimal proportion of left-handers a team should employ is indeed between 33% and 55%.
Player Value
Player value is something that will never be recorded with complete accuracy. Intangibles such as leadership, chemistry, and work ethic are just a few attributes that are very difficult to put a mathematical value on. Taking that under consideration, a player's value can be assessed on a different level: a statistical one. In sports, statistics are more often than not responsible for the well-being of the athlete. Using these statistics, many researchers have found ways to value a player accurately, and by analyzing them we can continue to evaluate sports in the economic arena.
7 Ibid.
J.C. Bradbury (2007) completed one of the most recent and successful studies relating a player's value to his team's winning percentage in the book The Baseball Economist.8 Bradbury states that a player's worth lies in his ability to generate wins for his team. In short, he states that a player's worth is directly related to his contribution to his club's winning percentage. DeBrock, Hendricks, and Koenker (2004) support this claim. They state that a team with better athletes will, in turn, generate a better winning percentage and better attendance records.9 This will force owners to pay more for these players, thus increasing the overall value of said players. These studies are useful because they generate models that relate an individual player to a team's winning percentage and can assess that player's value to his team.
Other studies have related a player's value to the winning percentage of his team while adding an external "x-factor" to the equation. Kahn (1993) measured the influence of a manager on his ball club's winning percentage and player production.10 Using a simple regression function, Kahn was able to incorporate an external variable, managerial quality, into models of winning percentage and player productivity. This study found that an external factor such as a manager has a positive and significant effect on a team's winning percentage and player production. This information is useful in creating a model with other x-factors such as handedness.
8 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y; Dutton, 2007), 336.
9 Lawrence DeBrock, Wallace Hendricks, and Roger Koenker, "Pay and Performance: The Impact of Salary Distribution on Firm-Level Outcomes in Baseball," Journal of Sports Economics 5, no. 3 (08 2004): 243-261.
10 Lawrence M. Kahn, "Managerial Quality, Team Success, and Individual Player Performance in Major League Baseball," Industrial and Labor Relations Review 46, no. 3 (04 1993): 531-547.
Gwartney and Haworth's (1974) study on discrimination in Major League Baseball, mentioned earlier, used a model to find the impact that employing black baseball players had on a team's winning percentage and overall attendance.11 As expected, it found that teams which employed more black players obtained a competitive advantage. In fact, the study showed that teams willing to employ these players saw an increase in games won and annual revenue. This is significant because Gwartney and Haworth were able to specify equations which, like Kahn's study, included an external factor measured to have a significant impact on winning percentage and player production.
Another, much more direct approach to assessing a player's value was introduced by Scully (1974). He was able to measure a player's worth to his club as a specific dollar amount using the theory of marginal revenue product (MRP).12 This study concluded that a team's winning percentage is determined by the performance of players on the team, and its revenue is in turn determined by winning percentage. Making these connections, Scully was able to express a player's worth as a specific dollar amount, based upon his on-field performance and the revenue accrued by the organization. This is particularly important because player value related to player productivity will be discussed in the section to follow.
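The chain described for Scully's approach, from player performance to wins to revenue, can be sketched as a pair of equations. This is an illustrative formulation only: the symbols W (team winning percentage), R (team revenue), and p_i (a performance measure for player i), along with the functions f and g, are notation assumed here and are not taken from Scully's paper.

```latex
\begin{align}
  W &= f(p_1, \dots, p_n), & R &= g(W), \\
  \mathrm{MRP}_i &= \frac{\partial R}{\partial p_i}
                  = g'(W)\,\frac{\partial f}{\partial p_i}.
\end{align}
```

Under these assumptions, a player's MRP is the extra revenue per additional win multiplied by the extra wins his performance adds, which is how a dollar figure can be attached to on-field statistics.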
11 James Gwartney and Charles Haworth, "Employer Costs and Discrimination: The Case of Baseball," The Journal of Political Economy 82, no. 4 (07 1974): 873-881.
12 Gerald W. Scully, "Player Salary Share and the Distribution of Player Earnings," Managerial and Decision Economics 25, no. 2 (03 1974): 77-86.
One issue with Scully's model was that it was developed before free agency, and it has since been shown that players as of 1980 were overpaid by about 20 percent.13 Since the birth of free agency, players have gained the upper hand in negotiations and thus may be overpaid even more now. With this information, Oorlog (1995) designed a study to determine whether players were paid a portion of the marginal broadcast revenue product (MBRP).14 Oorlog found that only free-agent and arbitration-eligible players received some or all of their MBRP, indicating that unless a player has bargaining power, owners are unwilling to share this revenue with their players. This is important because salary in the free-agent era cannot be predicted as accurately using Scully's MRP method, and thus other factors must be included to increase the model's accuracy. One such factor, studied by Maxcy (2004), was position played.15 This is particularly important in setting up models to account for intangibles, such as handedness.
Player Productivity
Employee productivity is something that every business wants to maximize in order to succeed. This is no different in Major League Baseball. In baseball, productivity is everything and a player's stats are his resume. The advantage an owner of a professional ball club has over his counterparts in other fields is that the productivity of his employees is very accurately recorded.
13 J. Cassing and R. Douglas, "Implications of the Auction Mechanism in Baseball's Free Agents Draft," Southern Economic Journal 47 (07 1980): 110-121.
14 Dale R. Oorlog, "Marginal Revenue and Labor Strife in Major League Baseball," Journal of Labor Research XVI, no. 1 (Winter 1995): 25-42.
15 Joel Maxcy, "Motivating Long-Term Employment Contracts: Risk Management in Major League Baseball," Managerial and Decision Economics 25, no. 2 (03 2004): 109-120.
Krautmann (1990) studied player production and the effects of shirking and stochastic production in Major League Baseball.16 In this study he used the slugging average (SA) of major league hitters to model production. He noted that player production has a lot of variability, and thus he also recorded each player's best (max) and worst (min) years. These labor services without question involve a random component, and for this reason the study was done to find whether players do indeed shirk after receiving long-term job security in the form of a long-term contract.17 Krautmann found no evidence of shirking, but the study is useful in identifying a player's production, the main focus of this section.
Krautmann (1993) furthered his study in this field, noting that "allegations of shirking due to long-term job security have been associated with seniority rights, professional athletes, and academic institutions of tenure."18 In this reply Krautmann conveyed that it is indeed hard to conclude whether a player is shirking based upon his contract situation. He implies instead that a player's reduced productivity could be blamed on the stochastic nature of production; that is, the player could simply be having a down year statistically.19 This is important in recognizing the variability in an athlete's production.
Nonetheless, Krautmann believes that, for whatever reason, a player's statistics decrease as the player's contract length increases. This was shown in his study with Oppenheimer (2002) on contract length and performance in Major League Baseball.20 In this study the authors used a wage regression with salary as the dependent variable and predicted performance, team characteristics, and player characteristics (e.g., seniority and race) as independent variables. This is worth noting when measuring player production over long periods of time: production is unpredictable, and external characteristics (such as handedness) can enter a model as independent variables.
16 Anthony C. Krautmann, "Shirking or Stochastic Productivity in Major League Baseball?" Southern Economic Journal 56, no. 4 (04 1990): 961-968.
17 Ibid.
18 Anthony C. Krautmann, "Shirking or Stochastic Productivity in Major League Baseball: Reply," Southern Economic Journal 60, no. 1 (07 1993): 241.
19 Ibid.
In another study, Maxcy, Fort, and Krautmann (2002) look at the effectiveness of incentive mechanisms in Major League Baseball and their effects on production.21 In this study the authors looked into incentive-based contracts and natural incentives to find the impact they have on ball players. They considered playing time, time spent on the disabled list, and skill as the three main measures of production. To measure skill they used slugging average (SA) for position players and strikeout-to-walk ratio for pitchers. They found it impossible to measure a player's true potential exactly, so it is very difficult to know whether a player is truly underachieving. Evidence was also found that players spend less time on the disabled list during contract years. The authors also showed that players tend to have increased playing time after signing long-term contracts, which aligns with Krautmann's prior finding that players are not shirking in post-contract years. An external reason, which simply cannot be accounted for mathematically, is that managers may receive pressure from the front office (i.e., owners and general managers) to play newly signed players in order to make their acquisitions look successful.22 Again, this underscores the fluctuation found in production when measuring athletes, an issue which will undoubtedly be faced in later sections.
20 A. C. Krautmann and M. Oppenheimer, "Contract Length and the Return to Performance in Major League Baseball," Journal of Sports Economics 3, no. 1 (02 2002): 6-17.
21 J. G. Maxcy, R. D. Fort, and A. C. Krautmann, "The Effectiveness of Incentive Mechanisms in Major League Baseball," Journal of Sports Economics 3, no. 3 (08 2002): 246-255.
In contrast to the Krautmann studies, Marburger (2003) believes that there is sufficient evidence of shirking in MLB.23 Marburger focuses on how the assignment of a player's property rights affects shirking. His results lead him to believe that players have the incentive to shirk regardless of how property rights are assigned at the time. If a player is still in his first six years in the league and is therefore bound by a reserve contract, the team essentially owns that player. For this reason he could feel he is underpaid and consequently underproduce during that span. Marburger also has reason to believe that even after a player owns his property rights (year seven and later in MLB), he may have reason to shirk after signing a long-term contract. Marburger believes shirking is difficult to prove concretely, and that there is ultimately no net impact of shirking on Major League Baseball. While shirking may be impossible to detect, that is not because it does not exist in baseball. Reiterating his belief, Marburger notes that some experts inside MLB do believe that some shirking exists, which he supports by pointing out the weight-management stipulations found in some contracts.24 This study shows that even with much of the same data, multiple conclusions can be reached, which raises the idea that external forces unaccounted for in these studies play a significant role in the end results.
Miceli (2004) studied the reserve clause, touched upon in the preceding study, in depth. He explored reasons why both the player and the organization find the reserve clause viable.25 He claims that players accept the clause because they receive quality job training from their employer, and in turn the organization has an incentive to offer such training because it is ensured the player's work for the remainder of the reserve clause contract. The organization also accepts the clause because it is able to pay many players less than market value. Miceli states that the players accept the clause because they need to develop their skills in order to secure a future for themselves in the industry.26 The study also raises the issue of shirking, and makes it apparent that during any particular season shirking may skew results in other studies where statistics are used as the main source of data.
22 Ibid.
23 D. R. Marburger, "Does the Assignment of Property Rights Encourage Or Discourage Shirking? Evidence from Major League Baseball," Journal of Sports Economics 4, no. 1 (02 2003): 19-34.
24 Ibid.
As discovered in earlier studies, some external forces have a material impact on the data studied. One of these, the manager, was studied by Porter and Scully (1982), who tried to measure the efficiency of managers in MLB.27 Using a Cobb-Douglas production function with a team's winning percentage as the dependent variable, Porter and Scully tried to evaluate the role of a manager in a team's winning percentage. Not surprisingly, the authors found that teams with higher managerial efficiency numbers were among the top teams in the league. In a previous study, Scully (1974) found that in 1969 superstar player Sandy Koufax had a marginal revenue product (MRP) estimated at $725,000. In the same season Scully estimated manager Earl Weaver's MRP at $675,000.28 From this he concluded that superb managers could contribute as much to a team's revenue as high-caliber players. While little is known about their salaries, it is very unlikely that talented managers earn more than a modest fraction of what superstar MLB players are paid.29 These studies establish that the manager plays a significant role in the productivity and success of his players and team.
25 T. J. Miceli, "A Principal-Agent Model of Contracting in Major League Baseball," Journal of Sports Economics 5, no. 2 (05 2004): 213-220.
26 Ibid.
27 Phillip K. Porter and Gerald W. Scully, "Measuring Managerial Efficiency: The Case of Baseball," Southern Economic Journal 48, no. 3 (01 1982): 642-650.
28 Gerald W. Scully, "Pay and Performance in Major League Baseball," American Economic Review 64 (12 1974): 915-930.
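For reference, a Cobb-Douglas production function with winning percentage as the output has the general form sketched below. The inputs x_j, the elasticities \beta_j, and the constant A are generic notation assumed for illustration; the specific inputs Porter and Scully used are not listed in this review.

```latex
\begin{equation}
  W = A \prod_{j=1}^{k} x_j^{\beta_j}
  \quad\Longleftrightarrow\quad
  \ln W = \ln A + \sum_{j=1}^{k} \beta_j \ln x_j
\end{equation}
```

Taking logarithms makes the function linear in its parameters, which is why a specification of this kind can be estimated with an ordinary regression on logged team statistics.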
A study on basketball player production and efficiency in the National Basketball Association was done by Zak, Huang, and Siegfried (1979). The study gives insight into a team's potential and its ability to turn that potential into wins.30 It found a correlation between inputs such as shooting percentage, rebounding, and assists and the end result of a win or a loss. The study used a Cobb-Douglas production function and the Richmond technique to come to these conclusions and found very significant results. It also looked at the effect of home-court advantage and found that while the home team won a higher percentage of games than the away team, this was the result of superior performance by the home team and not of preferential treatment by officials, an idea that has been discussed for some time. While this study provides insight into the effect player production has on winning, it found no non-statistical inputs to have a significant impact.31 This points to things that cannot be accounted for, such as the comfort a player feels while playing at home versus playing on the road. This may not be measurable, but it may very well play a part in home-court advantage. The same effect likely exists in baseball.
Another relevant NBA study was performed by Berri (1999). This study measured the value of player production in terms of the number of wins a team achieves.32 Berri utilized statistics kept by the NBA and presented an economic model that linked a player's statistics to team wins. With the results generated, he found evidence that the value of an NBA player can be determined objectively and accurately. Much as in the earlier studies discussed, player value was easily related to a team's winning percentage, so this study's primary focus was the production of each individual player. Berri mentions that further study could include factors such as experience, coaching, and team chemistry, which undoubtedly play a role in team success.33 This study gives good insight into player production and will be valuable in generating a model to replicate its results in other sports.
30 Thomas A. Zak, Cliff J. Huang, and John J. Siegfried, "Production Efficiency: The Case of Professional Basketball," The Journal of Business 52, no. 3 (07 1979): 379-392.
31 Ibid.
Natural Handedness
In sports there are many different things on which a numerical value can be placed in order to generate some very interesting conclusions about the performance of athletes. Nevertheless, some things are out of the control of an athlete, such as height, age, and, as many people believe, natural ability. One of these variables is an athlete's natural handedness. In some sports (e.g., basketball) the handedness of a player is far less important, but in sports such as baseball it plays a major role. Some studies have been done on natural handedness, and a few of those will be discussed in this section.
32 David J. Berri, "Who is 'Most Valuable'? Measuring a Player's Production of Wins in the National Basketball Association," Managerial and Decision Economics 20, no. 8 (12 1999): 411-427.
33 Ibid.
Wood and Aggleton (1989) studied the advantage of left-handers in "fast ball" sports.34 The study was done to uncover whether "left-handers have an intrinsic advantage over right-handers due to superior spatio motor skills, and that the relatively high proportion of top left-handed sportsmen and sportswomen is, in part, a reflection of this innate superiority."35 In this article Wood and Aggleton studied three sports: tennis, cricket, and soccer. The study built on work by Annett suggesting that left-handers have a higher capacity for visuo-spatial thinking, which involves fine control of both hands and the ability to make fast reactions, and which could possibly explain the intrinsic advantage lefties have in baseball.36 The study was performed to determine whether there was a higher than normal proportion of left-handers in sports which demand rapid and accurate visuo-spatial coordination. It uncovered very little data that could validate this theory. In cricket it found a noteworthy number of left-handed bowlers, which Wood and Aggleton believe reflects more of a strategic advantage than anything else. They also found evidence that many left-handed batsmen are, in fact, right-handed by almost any other account. In tennis they found an inordinate number of left-handed players at the professional level, but again found that if the advantage is not solely strategic, the effect is "slight" at best. Finally, in soccer goalkeeping, where there should be no strategic advantage, no evidence was found to show an excess of left-handers. This study shows that while there may not be an intrinsic advantage to being left-handed, in some sports the strategic advantage of being left-handed can indeed be quite beneficial.
34 C. J. Wood and J. P. Aggleton, "Handedness in 'fast ball' sports: Do left-handers have an innate advantage?" British Journal of Psychology 80 (1989): 227-240.
35 Ibid.
36 M. Annett, "Left, Right, Hand and Brain: The Right Shift Theory," London: Erlbaum (1985).
A study of handedness more directly related to baseball is Goldstein and Young (1996).37 The goal of this study was to determine whether evolutionary stable strategy (ESS) theory holds true for the handedness of pitchers and hitters in MLB. The study found that ESS theory does hold in Major League Baseball, but that the intrinsic advantage baseball offers lefties forces the ratio of left-handers to be much higher than in everyday society. The study cites the advantage pitchers receive in a left vs. left or right vs. right matchup with the hitter. On this basis, the study was able to explain why the number of left-handed hitters initially increased before the number of left-handed pitchers.
"Initially, RH players are more common than LH players. But the initial predominance of RH players in the sport ensures that LH batters will evolve faster than LH pitchers because, as we have seen, LH batters have an advantage against the more common RH pitcher. However, as LH batters become more common in the population, LH pitchers, because of their effectiveness against LH batters, will also become more common. Eventually, the race should stabilize when the relative proportion of LH pitchers coincides with the relative proportion of LH batters."38
Goldstein and Young's result is central to the hypothesis of this thesis: that the optimal share of left-handed hitters on a team's roster lies between 33% and 55%, based upon the number of left-handed pitchers.
Grondin, Guiard, Ivry, and Koren (1999) also discuss the advantage of hitting left-handed in MLB.39 Their primary focus is on baseball's asymmetry and on bimanual movements. The paper found several reasons to believe that the baseball environment induces players to bat left. Using throwing hand as the determining factor of handedness, Grondin (1999) found that 90% of left-handed players batted left, while 60% of the players who batted left were right-handed people.40 This work brings forward the strikingly inordinate number of left-handed hitters in baseball and sheds some light on the reasons that may be behind it. Where that study explains why lefties are so prominent, this thesis takes those ideas into consideration in determining the appropriate number of lefties a team should carry.
37 Stephen R. Goldstein and Charlotte A. Young, "'Evolutionary' Stable Strategy of Handedness in Major League Baseball," Journal of Comparative Psychology 110, no. 2 (06 1996): 164-169.
38 Ibid.
39 Simon Grondin, Yves Guiard, Richard Ivry, and Stan Koren, "Manual Laterality and Hitting Performance in Major League Baseball," Journal of Experimental Psychology: Human Perception and Performance 25, no. 3 (06 1999): 747-754.
The final study to be mentioned in this section is Laby, Kirschen, Rosenbaum,
and Mellman (1998).41 This research looked at the relationship between hand and ocular
dominance in MLB players. This study was done to determine if "crossed" (i.e., left eye and right hand, or right eye and left hand) or "same" (left eye and hand, or right eye and hand) dominance is advantageous in baseball. The study took 410 professional baseball players and cross-referenced their hand-ocular dominance patterns against their batting average (BA) or earned run average (ERA). Players' hand-ocular dominance was determined using batting-handedness, throwing hand, questionnaires, and visual tests. Many people have
commonly believed that "crossed" dominance should benefit a batter because their
dominant eye is naturally positioned toward the pitcher, but the result from this study
showed that hand-ocular dominance had no effect on BA or ERA.42 Although the hand-ocular study found no significant results, it does help create a foundation for studying left-handers in the MLB.
40 Ibid.
41 Daniel M. Laby, MD, David G. Kirschen, OD, PhD, Arthur L. Rosenbaum, MD, and Michael F. Mellman, MD, "The Effect of Ocular Dominance on the Performance of Professional Baseball Players," Ophthalmology 105, no. 5 (1998): 864-866.
42 Ibid.
Conclusion
The literature review in this chapter shows that the number of left-handed hitters on an MLB team is an economic topic that can and should be researched. This chapter laid out the fundamentals behind the study of lefties and identified room for further work in a few areas, including the subject of this thesis. Many of the studies touched upon in this section analyzed player value, player production, and handedness, but none combined all three to find the optimal number of lefties a team should employ. The following chapter will present the models used for this study, discuss the variables used, and present the methodology for this study.

CHAPTER II
Theory
The purpose of this chapter is to lay out the framework for models that explain statistical success and winning percentage in Major League Baseball, while interpreting the various factors pertaining to each. The first section of this chapter discusses the unusual ratio of right-handers to left-handers and raises questions as to why it exists. The second section examines team production and develops two models to capture the effect left-handed hitters have in that respect. The two regression equations are then described; they test the hypothesis that a share of left-handed hitters between 33% and 55% is the optimal ratio for a team to employ. This paper extends models used by Bradbury (2007)1 and Lewis (2003)2 in order to find the effect a left-handed batter has on his team's success. Those studies were general in scope and are refined here to generate results more directly related to the hypothesis described above. The models and variables presented in this chapter will be tested, and the results will be presented in Chapter IV.
1 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.
2 Michael M. Lewis, Moneyball (New York, N.Y.: W.W. Norton & Company Inc., 2003), 208.
The Left-Handed Hitter
Inside the baseball world, the idea that hitting left-handed is an innate advantage has been debated for decades without resolution. The observation that left-handers are far rarer on a child's little league field than on a major league diamond can lead some to the conclusion that left-handers are "naturally superior", "fortunate", or even "luckier," an idea studied by Wood and Aggleton (1989).3 Grondin (1999) notes that in the sport of baseball, batting preference is often inconsistent with hand preference.4 He lists a few reasons for this: a left-handed hitter stands closer to the immediate goal (first base); after swinging, a left-hander is already heading in the correct direction; and a left-hander has an intrinsic advantage against a right-handed pitcher.5 (The majority of MLB pitchers are right-handed.) All of these principles align with baseball's inherent asymmetry.
As discussed in Chapter I and above, the world of baseball has a much different ratio of right-handers to left-handers than is seen in everyday society. Grondin (1999) found that in a sample of more than 7,000 players only 13.7% were left-handed, using throwing hand as the determining factor of handedness, yet over 37% of these same players could bat from the left side.6 This striking discrepancy continues to fuel the belief that left-handed hitters have an intrinsic advantage.
3 C. J. Wood and J. P. Aggleton, "Handedness in 'fast ball' sports: Do left-handers have an innate advantage?" British Journal of Psychology 80 (1989): 227-240.
4 Simon Grondin, Yves Guiard, Richard B. Ivry, and Stan Koren, "Manual Laterality and Hitting Performance in Major League Baseball," Journal of Experimental Psychology: Human Perception and Performance 25, no. 3 (1999): 747-754.
5 Ibid.
6 Ibid.
These ideas are the main factors behind the hypothesis of this paper and lead many to believe that lefties do have a natural advantage over their counterparts. Whether or not this advantage is reflected in team success is the main focus of this study, and the results in Chapter IV may shed light on the handedness discrepancy found in Major League Baseball.
Team Success in MLB
When a single game is played in Major League Baseball there are two certainties everyone can count on: one team will win, and the other will lose. The unquestioned goal of all professional teams in the field of play is to win the game. The natural measure of a team's success over a season is therefore its winning percentage. A team can also gauge its success in a more detailed manner by analyzing its overall team statistics. Using these two approaches, two models can be generated to analyze the effect left-handed hitters have on each measure.
Two regressions were formulated to put this thesis to the test. The first model was generated to study a team's statistical success, noting specifically the number of left-handed hitters it employs. The second model measures team success by winning percentage. Taking into consideration only team statistics, the two regressions are expressed in equations 2.1 and 2.2.
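$$ \text{Runs Scored} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{HR} + \beta_3\,\text{Hits} + \beta_4\,\text{SO} + \beta_5\,\text{SBR} + \beta_6\,\text{BB} + \beta_7\,\text{HBP} + \beta_8\,\text{DP} + \beta_9\,\text{Year} + \varepsilon \qquad (2.1) $$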
Where Runs Scored is the total number of runs scored by a team in a season, Left-handed Ratio is a dummy variable accounting for teams that meet the pre-determined LH ratio, HR is the total number of homeruns hit by a team in a season, Hits is the total number of hits a team compiled in that season, SO is the total number of strikeouts a team accumulated while hitting in one season, SBR is the total number of stolen bases in a season minus the total number of times a team was caught stealing in that season, BB is the number of walks a team compiled in one season, HBP is the number of times a team's batters were hit by a pitch, DP is the total number of double plays a team hit into, Year is a variable of time, and ε is the error term.
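As a rough illustration of how a regression of the form of equation 2.1 could be estimated, the sketch below fits an ordinary least squares model with NumPy. It assumes OLS is the estimator, uses only four of the nine regressors for brevity, and runs on synthetic numbers; the `fit_ols` helper, the variable names, and the coefficient values are hypothetical and are not the thesis data or its estimates.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients [intercept, slope_1, ...] for y = a + X b + e."""
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Synthetic team-seasons (hypothetical values, NOT the thesis data).
rng = np.random.default_rng(0)
n = 300                                   # 30 teams x 10 seasons
lh_dummy = rng.integers(0, 2, n)          # 1 if the team meets the LH ratio
hr = rng.integers(100, 260, n)            # homeruns
hits = rng.integers(1300, 1650, n)        # hits
sb = rng.integers(40, 200, n)             # stolen bases
cs = rng.integers(20, 70, n)              # caught stealing
sbr = sb - cs                             # SBR = SB - CS

X = np.column_stack([lh_dummy, hr, hits, sbr]).astype(float)
true_coef = np.array([-500.0, 5.0, 1.4, 0.5, 0.2])  # made-up a, b1..b4
runs = true_coef[0] + X @ true_coef[1:]   # noiseless, so OLS recovers true_coef

estimated = fit_ols(X, runs)
print(np.round(estimated, 3))
```

With real data the response would carry noise, and the sign and significance of the coefficient on the left-handed dummy would be the quantity of interest.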
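$$ \text{Winning Percentage} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{OPS} + \beta_3\,\text{Run Ratio} + \beta_4\,\text{Range Factor} + \beta_5\,\text{Year} + \varepsilon \qquad (2.2) $$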
In equation 2.2 Winning Percentage is a team's winning percentage in a season, Left-handed Ratio is a dummy variable accounting for teams that meet the pre-determined LH ratio, OPS is a team's OPS for the season, Run Ratio is a team's run ratio score in that season, Range Factor is a team's average Range Factor in that season, Year is a variable of time, and ε is the error term.
Sample and Time Frame
This study will focus on the most recent decade, 2000 to 2009, in the MLB. It will examine nine offensive team statistics, one defensive statistic, and one pitching statistic for all 30 Major League Baseball teams. This data set was compiled using Sean Forman's Baseball-Reference site. Since baseball is indeed a team sport, each player's statistics will not be examined individually; instead they will be accounted for when composing the team's overall performance. The remainder of this chapter will describe each variable in detail and look at the methodology used to test the model.
Model 1
The first model, equation 2.1, is created to determine the effect that left-handed
hitters have on the runs generated by their team throughout a season. This model will
analyze and determine if a team should employ a ratio of lefties that qualifies within the
pre-determined left-handed ratio in order to maximize how many runs they can score. In
this regression there is one dependent variable, Runs Scored and nine independent
variables to be described below. Using these variables plus an error term, the regression
will be run to find the appropriate number of lefties a MLB team should use to maximize
the number of runs they score in a single season.
Dependent Variable
In baseball, as in most sports, a team tries to accumulate more points than the
opposing team in order to win the game. In baseball these points are commonly referred
to as runs (R). A run is counted when a player successfully navigates his way around the
bases without being forced out by the opposing team. If a player accomplishes this, his
team will be credited with one point. The team that has scored the most runs at the end
of the game is then deemed to be the winner.
Hitters
In baseball, more than in most sports, statistics are the center of attention when dissecting performance. In the MLB different stats are kept for hitters and pitchers. This model will take into account only statistics obtained by hitters. The exogenous variables listed above are considered to be some of the most important offensive statistics kept and for that reason they are included into this model. The coveted triple-crown, widely believed to be the most impressive feat in baseball, is achieved when a player leads his league in runs-batted-in (RBI), homeruns (HR), and batting average (BA). According to
Amspacher (2010)7 only 14 players in the history of the MLB have accomplished this feat. The last person to obtain this feat was Carl Yastrzemski in 1967; that season he accumulated 121 RBI, 44 HR, and a .326 BA.8 For this reason it must be noted that two
statistics that were removed from this model, BA and RBI, will be accounted for in
different ways. RBI is essentially the same as runs scored when looking at overall team
statistics, and BA is accounted for when calculating OPS. OPS, which is On-base
percentage plus Slugging percentage, will later be described in detail for its use in model
two. For these reasons the models used in this study should capture essentially every
important offensive statistic in baseball.
Independent Variables
The first independent variable in this model is hits (H). A player is credited with a hit when they hit the baseball into the field of play and reach base safely. We also must note that a player can hit the ball into play and reach base safely without getting credit for a hit. For this to happen the opposing team must either force out one of the hitter's teammates (fielder's choice (FC)) or be charged with an error (E) by the scorekeeper. Hits are a direct measure of the success a team has while hitting, and for that reason will be one of the most important variables in this study.
7 "The Triple Crown - A Brief Look at One of Baseball's most Coveted and Elusive Feats," [cited 2010]. Available from http://www.psacard.com/articles/article3547.chtml.
8 Ibid.
A player can get four different types of hits, one of which is a homerun (HR). A homerun is achieved when the batter is able to hit the pitch over the fence, or "out of the park" in fair territory. A hitter can also be credited with a homerun if he is able to hit the ball in play and navigate his way all the way around the bases without being forced out.
This is known as an "inside the park" homerun. This variable is very important in measuring the power of a team and thus will be a very important variable as well.
The third independent variable in this regression is strikeouts (SO). A batter is said to have struck out when the pitcher gets three strikes on them before they can reach first base or hit the ball into the field of play. Strikeouts are an effective measure of a team's inability to make contact at the plate. Since making contact is, for the most part, a main goal when a hitter gets to the plate, this variable captures a major flaw and will most likely have a negative effect on runs scored.
Base-on-balls (BB) is the next exogenous variable used in this regression. This is also often referred to as a walk, and it is achieved when a player successfully takes four balls (pitches that the umpire deems to be outside the strike zone) without being forced out. When this is done the player is awarded first base, and for this reason some consider this "free-pass" a very underrated play in baseball. This variable will help measure positive results for hitters that do not show up in other statistics such as hits and homeruns.
A player can also receive a "free-pass" to first base if they are hit by the pitcher's throw (HBP). This is sometimes believed to be out of the hitter's control, but it does put a hitter on first base and give them an opportunity to score a run for their team. For this reason it must be included in a model that uses runs as the dependent variable, and its inclusion should make the model more accurate.
The next variable, stolen base ratio (SBR), is a combination of two variables, stolen bases (SB) and caught stealing (CS). A SB is something that a player achieves not at the plate, but while on the base paths after successfully reaching base. To steal a base a runner must successfully move up one base without the hitter putting the ball in play and without being put out by the opposing team. CS is the counterpart of SB: it is recorded when a player attempts to steal a base but is put out by the opposing team before he can do so. At this point the player is out and is no longer able to score a run for his team. Since SB and CS are so closely related, subtracting a team's total CS from its total SB gives the team's SBR score. Since base stealing can play such a large role in run production, SBR must be included in the model.
The next independent variable based upon productivity in this model is hit into a double play (DP). This happens when a batter hits the ball into play but the defense is able to put out not only the hitter but also another player on his team. This is very detrimental to a team's ability to score runs, and for that reason very important to the model at hand.
Year (Y) will also be an independent variable used in this model. It will give a running measurement of time and will help identify any discrepancies in the data based upon possible yearly irregularities.
The final exogenous variable in this model is Left-handed Ratio (LH). This variable is not itself a performance statistic, but it is the main focus of this thesis. A hitter can be classified in three ways: left-handed, right-handed (RH), or a switch-hitter (SH). For the purpose of this model only players who have 130 at-bats (AB) or more will be taken under consideration. The number 130 is chosen based on the precedent that a hitter is considered a rookie until they accumulate, in one season, 130 AB, or remain on the active roster for 45 days prior to September 1st.9 If a player achieves either of these plateaus in a season they will be eligible for the rookie of the year award and will no longer be considered a rookie the following year. This threshold was set forth in 1971 in regard to rookie of the year voting,10 and will be an acceptable cut-off for this study as well.
Hitters who reach the 130 AB plateau in the season of note will be classified in one of the three categories listed above. The idea behind this thesis is that a team will want between 33% and 55% of its hitters to be lefties to better match the asymmetry of baseball. LH is a dummy variable: teams whose share of left-handed hitters falls within the aforementioned range are recorded as one, while teams that do not are recorded as zero. This variable should give us the results desired for this study, and it will also be present in model two.
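The construction of the dummy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lh_ratio_dummy` and the sample roster are hypothetical, switch-hitters are assumed not to count as left-handed, and the 33% and 55% boundaries are assumed to be inclusive, none of which the text specifies.

```python
def lh_ratio_dummy(hitters, min_ab=130, low=0.33, high=0.55):
    """Return (share, dummy) for a list of (bats, at_bats) tuples.

    bats is "L", "R", or "S" (switch). Only hitters with at least min_ab
    at-bats are counted. Assumption: switch-hitters are not counted as
    left-handed, and the range boundaries are inclusive.
    """
    qualified = [bats for bats, ab in hitters if ab >= min_ab]
    if not qualified:
        raise ValueError("no hitters meet the at-bat cut-off")
    share = sum(1 for bats in qualified if bats == "L") / len(qualified)
    dummy = 1 if low <= share <= high else 0
    return share, dummy

# Hypothetical roster: 12 qualified hitters, 5 of them left-handed,
# plus one left-handed hitter below the 130 AB cut-off who is ignored.
roster = [("L", 550)] * 5 + [("R", 480)] * 6 + [("S", 300)] + [("L", 40)]
share, dummy = lh_ratio_dummy(roster)
print(f"{share:.3f} {dummy}")  # prints "0.417 1"
```

A different treatment of switch-hitters or of the boundary cases would change which team seasons are coded as one.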
9 "Mlb.Com - MLB Miscellany: Rules, regulations, and statistics," [cited 2010]. Available from http://www.mlb.com/.
10 "Britannica.Com - Facts about Rookie of the Year: baseball," [cited 2010]. Available from http://www.britannica.com/.

Model 2
The second model used in this study, equation 2.2, will examine the effect left-handed hitters have on a team's winning percentage and, as in model one, will be used to determine the optimal number of lefties a team should use to maximize its success. In this model the dependent variable is winning percentage, while the independent variables are On-Base Percentage plus Slugging Percentage (OPS), Run Ratio, Range Factor, Left-handed Ratio, and Year. Using these variables plus an error term, a regression will be run to determine what role left-handed hitters have in a team's winning percentage.
Dependent Variable
The dependent variable at hand, winning percentage (W%) is a fairly simple
variable to compute. As noted earlier, in each game played one team will win the game,
and one team will lose the game. At that point the team deemed the winner will
be credited with a win (W), while the other is credited with a loss (L). To determine
winning percentage, take the total number of wins a team accumulates in a season and
divide that number by the total number of games played (GP). This is shown in equation
2.3.
$$ W\% = W / GP \qquad (2.3) $$
where W% results in a decimal recording the percent of games a team wins in one
season.
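As a quick worked example of equation 2.3, the snippet below computes W% for two 2009 clubs. The function name is hypothetical, and the win-loss records (103-59 for the New York Yankees and 59-103 for the Washington Nationals) are supplied here for illustration.

```python
def winning_percentage(wins, games_played):
    """W% = W / GP, returned as a decimal."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return wins / games_played

yankees_2009 = winning_percentage(103, 162)    # 103-59 record
nationals_2009 = winning_percentage(59, 162)   # 59-103 record
print(f"{yankees_2009:.3f} {nationals_2009:.3f}")  # prints "0.636 0.364"
```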
Team
As stated previously, many different statistics are kept for players in the MLB. Using these stats, team statistics can easily be compiled. This is valuable since team success is directly correlated with player productivity. Some team statistics, such as OPS, are team averages. Other team statistics, such as runs, homeruns, and stolen bases, are team totals. Of the three statistics used in this model, two, Range Factor and OPS, are widely believed to be the most successful in portraying success in their respective categories.11 The final variable, Run Ratio, is a statistic generated for this study. It is not formally used by the MLB, but is based on the Sabermetric statistic DICE, which is widely regarded as a very efficient way of measuring pitching efficiency in Major League Baseball. All three will be explained in detail below.
Independent Variables
The first independent variable in this regression is On-Base Percentage plus
Slugging Percentage (OPS). This statistic is part of the Sabermetrics analysis of baseball,
which was developed by John Thorn and Pete Palmer.12 Bradbury (2007) finds that OPS explains 90 percent of the variance in run production, which is an exceptionally high share.13 Calculating OPS is a three-step process.
First On-Base-Percentage (OBP) must be calculated. OBP is a fairly
straightforward calculation and it is very similar to a player's batting average (BA). OBP
is the total number of times a player successfully reaches first base divided by their total
number of plate appearances (PA). This stat differs from BA in that it includes not only
hits, but also BB and HBP. As explained earlier, BB and HBP can play an important role in team success; therefore the use of BA in this study is unnecessary even though it is widely perceived in the baseball world to be of utmost importance. The calculation for OBP is shown in equation 2.4.
$$ OBP = (H + BB + HBP) / PA \qquad (2.4) $$
This calculation will generate a decimal number reflecting the percentage of times a player reaches first base safely, excluding errors.
11 J. C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.
12 John Thorn and Pete Palmer, The Hidden Game of Baseball: A Revolutionary Approach to Baseball Statistics (New York: Dolphin, 1984).
13 Ibid.
The second part of OPS is Slugging Percentage (SLG), which is usually used to measure a player's power. A power hitter's primary duty to the team is to hit homeruns and get extra-base hits. These players are often found in the middle of the batting order, and more often than not they will also be toward the top of the team in SLG. To calculate SLG, the total number of bases (TB) accumulated by the hitter is divided by the player's number of at-bats (AB). (Note that a BB or HBP is not counted as an AB, which explains the difference between AB and PA.) A hitter accumulates four bases for a HR, three bases for a triple, two bases for a double, and one base for a single. The SLG calculation is found in equation 2.5 below.
$$ SLG = TB / AB \qquad (2.5) $$
Finally, to calculate OPS, SLG and OBP are added together. The calculation is laid out in equation 2.6.
$$ OPS = OBP + SLG \qquad (2.6) $$
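The three-step calculation in equations 2.4 through 2.6 can be sketched as follows. The function names and the sample season line are hypothetical, and OBP is computed exactly as this chapter defines it, with plate appearances in the denominator.

```python
def obp(h, bb, hbp, pa):
    """On-base percentage per equation 2.4: (H + BB + HBP) / PA."""
    return (h + bb + hbp) / pa

def slg(singles, doubles, triples, hr, ab):
    """Slugging percentage per equation 2.5: total bases / at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab

def ops(singles, doubles, triples, hr, bb, hbp, ab, pa):
    """OPS per equation 2.6: OBP + SLG."""
    hits = singles + doubles + triples + hr
    return obp(hits, bb, hbp, pa) + slg(singles, doubles, triples, hr, ab)

# Hypothetical season line: 520 AB + 70 BB + 10 HBP = 600 PA,
# with 150 hits (100 singles, 30 doubles, 5 triples, 15 homeruns).
example = ops(singles=100, doubles=30, triples=5, hr=15,
              bb=70, hbp=10, ab=520, pa=600)
print(f"{example:.3f}")  # prints "0.835"
```

Here OBP is 230/600 and SLG is 235/520, so the two components can be checked by hand before they are summed.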
Table 2.1 and table 2.2 present the OPS numbers for the best and worst teams in the 2009 regular season, respectively. These tables provide examples of OPS for each position in the batting order, using each team's most common starting line-up that season. This illustrates Bradbury's claim that OPS, more than any other offensive statistic, helps model team success. These tables will show that OPS can be very important in distinguishing a good team from a bad team.
Table 2.1
New York Yankees Batting Order and OPS for 2009 Season
POSITION  PLAYER             STATS
1.        Derek Jeter        .874 OPS as #1 batter (693 PA)
2.        Johnny Damon       .854 OPS as #2 batter (591 PA)
3.        Mark Teixeira      .950 OPS as #3 batter (702 PA)
4.        Alex Rodriguez     .935 OPS as #4 batter (533 PA)
5.        Hideki Matsui      .849 OPS as #5 batter (269 PA)
6.        Jorge Posada       .965 OPS as #6 batter (246 PA)
7.        Robinson Cano      .891 OPS as #7 batter (250 PA)
8.        Melky Cabrera      .659 OPS as #8 batter (227 PA)
9.        Melky Cabrera      .796 OPS as #9 batter (132 PA)
Source: see note 14.

Table 2.2
Washington Nationals Batting Order and OPS for 2009 Season
POSITION  PLAYER             STATS
1.        Cristian Guzman    .752 OPS as #1 batter (232 PA)
2.        Nick Johnson       .858 OPS as #2 batter (336 PA)
3.        Ryan Zimmerman     .882 OPS as #3 batter (634 PA)
4.        Adam Dunn          .935 OPS as #4 batter (606 PA)
5.        Josh Willingham    .793 OPS as #5 batter (351 PA)
6.        Elijah Dukes       .809 OPS as #6 batter (154 PA)
7.        Josh Bard          .537 OPS as #7 batter (170 PA)
8.        Wil Nieves         .629 OPS as #8 batter (232 PA)
9.        John Lannan        .273 OPS as #9 batter (65 PA)
Source: see note 15.
14 "ESPN - New York Yankees News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball," [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=nyy.
15 "ESPN - Washington Nationals News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball," [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=was.
Tables 2.1 and 2.2 reveal not only the difference in OPS between the number nine hitter and the number four hitter; they also demonstrate the importance of the team as a whole. While the number three and four hitters for both clubs all achieved a good OPS score, the discrepancy between the less talented hitters, the seven through nine hitters, and the more talented hitters, three through five, is strikingly larger for Washington than for New York. This indicates that a team cannot rely solely on its best few players, and instead needs to have a quality lineup throughout in order to be successful. This shows the value of looking at overall team statistics in determining team success, as opposed to looking at individual statistics. The example supports Bradbury's (2007) claim that a team's OPS is highly predictive of team success.16 New York had an average OPS of .864 throughout their lineup, while Washington achieved only a .719 OPS average (the simple mean of the nine values in each table). This discrepancy in average OPS was accompanied by a vast difference in winning percentage. In 2009 the New York Yankees had a winning percentage of .636 while the Washington Nationals had a winning percentage of only .364.17
The second independent variable in Model 2 is run ratio (RR). This variable is used to value a team's pitching in a positive manner. In 2001 Sabermetrician Clay Dreslough developed a statistic that is able to measure a team's pitching success independently of its defense. This approach was generated using the Defense Independent Pitching Statistics model, commonly known as DIPS. Dreslough's adaptation is called Defense-Independent Component ERA (DICE). (ERA, or earned run average, is the number of earned runs a team or pitcher gives up per nine innings pitched, nine being the number of innings in a standard game.) DICE has a somewhat more complex formula than any other variable presented in this chapter.
16 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.
17 "MLB.com - The Official Site of Major League Baseball," [cited 2010]. Available from http://mlb.mlb.com/mlb/standings/?ymd=20091104&tcid=mm_mlb_standings.
To calculate a team's DICE score the following statistics are needed: HR allowed, BB given up, opposing batters hit by pitch (HBP), SO recorded, and innings pitched (IP). Equation 2.7 expresses the formula associated with the DICE method.
$$ DICE = 3 + \left[ (13\,HR + 3\,(BB + HBP) - 2\,SO) / IP \right] \qquad (2.7) $$
This formula provides a representation of a team's ERA that excludes the defensive component of the typical ERA. The lower this number is, the better a team's pitching staff. For this reason this study introduces a new formula, a team's run ratio (RR), in order to credit a team with good pitching in a positive manner.
In order to turn a low (good) DICE score into a high score, the highest earned run average (ERA) surrendered by any team within the time period of this study must be found. (Earned runs are runs given up that were not scored as the result of an error.) This is the 5.71 ERA of the 2001 Texas Rangers.18 Since no team had a higher (worse) ERA during this span, this number represents the worst pitching team in the sample period. From here each team's RR can be calculated so as to reward the teams with the lowest DICE. Doing so credits a team that achieves a lower (better) DICE value with a higher (better) RR value. Equation 2.8 shows the simple math needed to compute RR.
$$ RR = 5.71 - DICE \qquad (2.8) $$
18 "Baseball-Reference.Com - Major League Baseball Statistics and History," [cited 2010]. Available from http://www.baseball-reference.com/.
This value represents the difference between the worst team ERA in the sample and a team's DICE score, giving teams with the lowest DICE rating the highest RR value. This will help identify the positive contribution of pitching to winning percentage.
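Equations 2.7 and 2.8 translate directly into code. The sketch below is illustrative only: the function names and the pitching line are hypothetical, and innings pitched are assumed to be supplied as a true decimal rather than in the ".1/.2" thirds notation used on stat sheets.

```python
WORST_ERA = 5.71  # 2001 Texas Rangers, the highest team ERA cited for 2000-2009

def dice(hr, bb, hbp, so, ip):
    """DICE per equation 2.7: 3 + (13*HR + 3*(BB + HBP) - 2*SO) / IP."""
    return 3.0 + (13 * hr + 3 * (bb + hbp) - 2 * so) / ip

def run_ratio(dice_value, worst_era=WORST_ERA):
    """RR per equation 2.8: worst ERA in the sample minus DICE."""
    return worst_era - dice_value

# Hypothetical staff: 180 HR, 550 BB, 50 HBP, 1100 SO over 1450 innings.
d = dice(hr=180, bb=550, hbp=50, so=1100, ip=1450.0)
rr = run_ratio(d)
print(f"DICE={d:.3f} RR={rr:.3f}")  # prints "DICE=4.338 RR=1.372"
```

Because RR is a constant minus DICE, it reverses the ordering of teams without changing the spacing between them.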
The third independent variable in this model is range factor (RF). According to Sabermetric supporter Bradbury, RF gives a better valuation of a player's defensive value than their fielding percentage.19 To compute this value one takes a player's number of putouts (PO) and assists (A) at a given position and divides their sum by the number of games played (GP) at that position. Equation 2.9 demonstrates this computation.
$$ RF = (PO + A) / GP \qquad (2.9) $$
Since this stat is recorded on an individual basis, the team average must be calculated. To keep this study balanced, only the position players who qualify for the hitting statistics, with 130 AB or more, will be used in the formulation of a team's average RF. From there, the player with the most games played at each position is selected, and these players' RF values are averaged to determine a team RF. This value is intended to give an accurate representation of team defense and the role it plays in team success.
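The team-level averaging described above can be sketched as follows. The record layout (dictionaries with `pos`, `po`, `a`, `gp`, and `ab` keys), the function names, and the sample players are assumptions made for illustration.

```python
def range_factor(putouts, assists, games_played):
    """RF per equation 2.9: (PO + A) / GP."""
    return (putouts + assists) / games_played

def team_range_factor(players, min_ab=130):
    """Average RF over the most-used qualifying player at each position."""
    starters = {}
    for p in players:
        if p["ab"] < min_ab:
            continue  # does not meet the 130 AB cut-off
        best = starters.get(p["pos"])
        if best is None or p["gp"] > best["gp"]:
            starters[p["pos"]] = p  # keep the player with the most games
    if not starters:
        raise ValueError("no qualifying players")
    rfs = [range_factor(p["po"], p["a"], p["gp"]) for p in starters.values()]
    return sum(rfs) / len(rfs)

# Hypothetical players (not real data).
players = [
    {"pos": "SS", "po": 220, "a": 380, "gp": 150, "ab": 600},   # RF 4.0
    {"pos": "SS", "po": 20, "a": 40, "gp": 15, "ab": 140},      # fewer games
    {"pos": "1B", "po": 1200, "a": 90, "gp": 150, "ab": 560},   # RF 8.6
    {"pos": "LF", "po": 250, "a": 5, "gp": 120, "ab": 100},     # under 130 AB
]
team_rf = team_range_factor(players)
print(f"{team_rf:.2f}")  # prints "6.30"
```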
The last two independent variables in this model are Left-handed Ratio (LH), and
Year (Y). These variables will be calculated and incorporated into the model in the exact
same way they were in model one. LH will again be the primary focus of the model in
terms of identifying any impact a left-handed hitter has on a team's winning percentage.
Method
The results presented in Chapter IV will be obtained by running two regressions. In order to determine if having a ratio of lefties between 33% and 55% corresponds with the highest winning percentage and the highest number of runs scored, teams will be run through the regressions using the LH dummy variable described above. This will be done to assess the thesis of this paper that a team should employ between 33% and 55% of its hitters as left-handed in order to maximize its runs scored and winning percentage.
19 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.
Table 2.3
Expected Signs for Independent Variables for Model 1
Independent Variable      Predicted Sign
Hits                      Positive
Homeruns                  Positive
Strike Outs               Negative
Base on Balls             Positive
Hit By Pitch              Positive
Stolen Bases Ratio        Positive
Hit into Double Play      Negative
Year                      Unknown
Left-handed Hitters       Negative if a team has less than 33%
                          Unknown if a team has 33%
                          Positive if a team is between 33% and 55%
                          Unknown if a team has 55%
                          Negative if a team has more than 55%
Table 2.4
Expected Signs for Independent Variables for Model 2
Independent Variable      Predicted Sign
OPS                       Positive
Run-Ratio                 Positive
Range Factor              Positive
Year                      Unknown
Left-Handed Hitters       Negative if a team has less than 33%
                          Unknown if a team has 33%
                          Positive if a team is between 33% and 55%
                          Unknown if a team has 55%
                          Negative if a team has more than 55%
Conclusion
This chapter developed the theory and mathematics behind player and team production. The goal of this section was to bring forward the framework of this study and to analyze what factors into a team's success. The theory discussed in this chapter was used to develop two models, which will be used to determine how many left-handed hitters a team should have in order to maximize its success. This chapter provided an in-depth description of all the relevant variables in this study. It also explained the methodology used to analyze the models. The following chapter will describe the data collected for this study, while Chapter IV will provide results for the models above. Finally, Chapter V will draw conclusions from the results concerning the theory of left-handed hitting and discuss its impact on Major League Baseball.

CHAPTER III
DATA
This chapter will explain the data set collected to test the hypothesis described in the previous theory chapter. It will begin by describing the sources consulted in the collection of data and explain the method used to collect it. The second section will present summary statistics for each variable: range, mean, median, and other statistics.
Data Resources
The majority of the data in this study was collected from the website baseball-reference.com.1 This online database accounted for all but one statistic needed to test the hypothesis of this paper, Range Factor. The range factor statistic used in model two was found using the online Baseball Almanac.2 From these sources, statistical information on all 30 teams was recorded using the team statistics explained previously, over ten years, resulting in 300 team seasons. These 300 team seasons provided 21 different variables, for a total of 6300 data points collected for this study.
Although 6300 different points were collected, some were combined to formulate variables discussed in the previous chapter, such as Run Ratio.
1 "Baseball-reference.com - Teams," [cited 2010]. Available from http://www.baseball-reference.com/teams/batteam/shtml.
2 "Baseball-almanac.com - The Official Baseball History Site," [cited 2010]. Available from http://www.baseball-almanac.com/.
The sample used for the collection of data was Major League Baseball. Players and teams in professional leagues outside of MLB were not accounted for in this study. Statistics were collected for every MLB team for the years 2000 through 2009. Although each player who took part in a Major League Baseball game was accounted for, the main focus of this study (left-handed hitting) was determined only by players who accumulated at least 130 at bats in that particular season. At this point a player was deemed to be a qualified hitter; on average, this resulted in approximately 13.4 qualified hitters per team. All pitching statistics were found using team totals, while the sole defensive statistic, range factor, was collected using a team average described in detail previously.
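The qualification rule above can be illustrated with a short sketch. The 130 at-bat threshold comes from the text; the `Hitter` record, the `left_handed_share` function, the sample roster, and the choice to count switch hitters ("S") as not left-handed are all assumptions made for illustration.

```python
from dataclasses import dataclass

MIN_AT_BATS = 130  # qualification threshold stated in the text


@dataclass
class Hitter:
    name: str
    at_bats: int
    bats: str  # "L", "R", or "S" (switch); hypothetical coding


def left_handed_share(roster):
    """Share of qualified hitters (at least MIN_AT_BATS at bats) who bat left."""
    qualified = [h for h in roster if h.at_bats >= MIN_AT_BATS]
    if not qualified:
        return 0.0
    lefties = sum(1 for h in qualified if h.bats == "L")
    return lefties / len(qualified)


# Hypothetical roster: player C falls one at bat short and is excluded.
roster = [
    Hitter("A", 540, "L"),
    Hitter("B", 410, "R"),
    Hitter("C", 129, "L"),
    Hitter("D", 300, "S"),
    Hitter("E", 130, "L"),
]
share = left_handed_share(roster)
print(share)
```

Here four hitters qualify and two of them bat left-handed, so the team's share is one half.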
Data Analysis
Table 3.1 presents summary statistics for each variable collected. These include
minimum, maximum, mean, median, and standard deviation.
Table 3.1
Descriptive Statistics
Variable                                         Abbreviation  Minimum  Maximum  Mean     Median  Standard Deviation
Runs Scored                                      R             574      978      770.61   768     77.93
Hits                                             H             1300     1667     1475.04  1469    74.31
Homeruns                                         HR            94       260      173.83   171     32.62
Stolen Base Ratio                                SBR           0        154      54.55    52      25.44
Base on Balls                                    BB            363      775      541.63   535.5   71.62
Strikeouts                                       SO            805      1399     1062.49  1054.5  108.92
Double Plays                                     DP            113      204      152.74   153     17.06
Hit by Pitch                                     HBP           29       103      58.46    57      13.22
Hit by Pitch (Pitchers)                          P.HBP         27       95       58.46    58      12.94
Innings Pitched                                  P.IP          1412.2   1484.2   1443.30  1443.2  12.60
Homeruns Allowed                                 P.HR          116      239      173.83   171.5   24.12
Base on Balls (Pitcher)                          P.BB          348      728      541.63   538     68.30
Strikeouts (Pitcher)                             P.SO          764      1404     1062.49  1057    105.63
Range Factor                                     RF            4.22     4.71     4.46     4.45    .092
On-Base-Percentage plus Slugging Percentage      OPS           .671     .851     .758     .756    .034
Winning Percentage                               Winning %     .27      .72      .50      .51     .07
Left-handed Hitting Ratio                        LH Ratio      0        1        .4       0       .49
Variable Explanation
Table 3.1 presents the breakdown of each variable used for this study, and one can see that there were no unexpected irregularities apart from Left-handed Ratio. Because it is a dummy variable, its mean is the share of team seasons that fell within the ratio, and that mean was noticeably lower than it would be if this variable were evenly distributed. With a mean of .4, it is clear that far more teams fell outside the LH Ratio put forth in the previous chapter than fell within it. Before ever running regressions, it appears that most teams do not believe that they should have between 33% and 55% left-handed hitters in order to optimize success. Whether or not this leads to a lower success rate on the baseball field will be discussed in detail in the chapter to follow.
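As a rough consistency check on the LH Ratio row of Table 3.1, the summary statistics of a 0/1 dummy follow directly from its mean. The sketch below assumes the reported mean of .4 is exact (it is rounded in the table) and uses the 300 team seasons stated earlier; the variable names are hypothetical.

```python
import math

n_team_seasons = 300  # team seasons stated in the text
mean_lh = 0.4         # mean of the LH dummy from Table 3.1 (rounded)

# For a 0/1 variable the mean is the share of observations coded 1.
in_band = round(mean_lh * n_team_seasons)
out_of_band = n_team_seasons - in_band

# Standard deviation of a 0/1 variable with mean p is sqrt(p * (1 - p));
# the sample version applies the usual n / (n - 1) correction.
sd_population = math.sqrt(mean_lh * (1 - mean_lh))
sd_sample = sd_population * math.sqrt(n_team_seasons / (n_team_seasons - 1))

# With fewer than half of the observations coded 1, the median is 0.
median_lh = 1 if mean_lh > 0.5 else 0

print(in_band, out_of_band, round(sd_sample, 2), median_lh)
```

Under these assumptions roughly 120 team seasons fall inside the band and 180 outside it, and the implied standard deviation and median agree with the .49 and 0 reported in the table.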
Conclusion
This chapter briefly explained how the data was collected for this study while documenting the relevant patterns found within the data itself. The variable of note, LH Ratio, gave an eye-opening result that was much different than expected. The following results chapter will explain the results obtained from this study and use them to test the hypothesis that having between 33% and 55% left-handed hitters is the optimal ratio a team should employ.
CHAPTER IV
RESULTS
This chapter will examine the results produced by the regressions described in the theory chapter. As touched on previously, two regressions were run, with total runs scored and team winning percentage as the dependent variables. The two regressions were run using a combination of the twelve independent variables described in the theory chapter. The regression models were performed precisely as described in Chapter II, using ten years of data from Major League Baseball. This chapter will first analyze the results obtained from running the two regressions, while also discussing any issues with, or tests run on, these models.
Regression Results
There are two different models to be touched upon in this section. The first model used Runs Scored as the dependent variable and had nine independent variables, one of which was left-handed hitting ratio, the focal point of this study. The second model used team winning percentage as its dependent variable and had five independent variables, again including left-handed hitting ratio. Each regression was run using a data set of 6300 different data points collected over the last decade of MLB.
The following results for each regression demonstrate the impact each variable had in accounting for the two different dependent variables. The regression results are presented in the tables below with their coefficients and corresponding t-statistics. Any independent variable with a t-statistic below 1.833 in absolute value for model 1, or 2.015 for model 2, at a 5% significance level was deemed to be insignificant for that regression.
Model 1
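\[
\text{Runs Scored} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{HR} + \beta_3\,\text{Hits} + \beta_4\,\text{SO} + \beta_5\,\text{SBR} + \beta_6\,\text{BB} + \beta_7\,\text{HBP} + \beta_8\,\text{DP} + \beta_9\,\text{Year} + \varepsilon \tag{4.1}
\]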
Equation 4.1 above illustrates the formula used in model 1. Five variables in the Runs Scored regression are deemed to be significant at the 5% level. The first, Homeruns, is shown to have a very strong effect in this model and was also significant at the 1% level. Just as with Homeruns, Hits was found to have a very strong impact on the model and was significant at the 1% level. Stolen Base Ratio was also a significantly positive variable in model 1. The final two significant variables in this model are Base-On-Balls and Hit By Pitch, both of which were found to have a positive impact on runs scored even at the 1% level. It should be noted that of the four variables found to be insignificant at the 5% level, two of them, Hit into Double Play and Year, were significant at the 10% level. The variable of note, Left-handed Ratio, was found to be insignificant in this model, as was Strikeouts.
Table 4.1
Regression Results: Runs Scored
Variable Coefficient t-Statistic
Constant -532.5778 -12.80129***
Left-handed Ratio -2.227669 -0.771776
Homeruns 0.844847 15.87223***
Hits 0.641768 27.56803***
Strikeouts -0.017299 -1.108172
Stolen Base Ratio 0.104606 1.834615**
Base-on-balls 0.330899 14.56911***
Hit by Pitch 0.440804 3.966272***
Hit into Double Play 0.143210 1.692823*
Year -0.779496 -1.537898*
R-squared 0.906078
Adjusted R-squared 0.903153
Observations 300
Adjusted Observations 299
* Significant at 10%, ** Significant at 5%, *** Significant at 1%
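The 5% decision rule stated earlier in this chapter can be applied to the t-statistics in Table 4.1 with a short sketch. The threshold of 1.833 and the t-statistics are taken from the text and table; comparing absolute values, and the variable names used below, are assumptions made for illustration.

```python
T_CRITICAL_5PCT = 1.833  # model 1 threshold stated in the text

# t-statistics for the nine independent variables, copied from Table 4.1.
t_stats = {
    "Left-handed Ratio": -0.771776,
    "Homeruns": 15.87223,
    "Hits": 27.56803,
    "Strikeouts": -1.108172,
    "Stolen Base Ratio": 1.834615,
    "Base-on-balls": 14.56911,
    "Hit by Pitch": 3.966272,
    "Hit into Double Play": 1.692823,
    "Year": -1.537898,
}

# A variable is treated as significant when |t| reaches the threshold.
significant = [v for v, t in t_stats.items() if abs(t) >= T_CRITICAL_5PCT]
insignificant = [v for v, t in t_stats.items() if abs(t) < T_CRITICAL_5PCT]

print("Significant at 5%:", significant)
print("Not significant at 5%:", insignificant)
```

Under this rule Stolen Base Ratio clears the threshold only narrowly, while Left-handed Ratio falls well short of it.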
The independent variables in the runs scored regression account for approximately 91% of the variation in runs scored by a team in one season. The correlation matrix generated from this regression showed no signs of multicollinearity. This regression also showed no normality, serial correlation, or heteroskedasticity problems. The Jarque-Bera (JB) test produced a statistic below the critical chi-square value of 16.92, using 9 degrees of freedom at a 5% significance level.1
1 Chi-squared for data in runs scored regression is 1.470286.
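For readers unfamiliar with the Jarque-Bera statistic, the sketch below shows the standard formula, JB = (n/6)(S^2 + (K - 3)^2 / 4), where S is the skewness and K the kurtosis of the residuals. The residuals used here are made up for illustration and are not the thesis data; only the reported statistic and the 16.92 threshold come from the text.

```python
def jarque_bera(residuals):
    """Jarque-Bera statistic computed from sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in residuals) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in residuals) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)


# Hypothetical residuals, for illustration only.
example_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
example_jb = jarque_bera(example_residuals)

# Value reported in the footnote and the threshold used in the text.
reported_jb = 1.470286
threshold_used = 16.92
passes = reported_jb < threshold_used
print(example_jb, passes)
```

The statistic is conventionally compared with a chi-square distribution with 2 degrees of freedom, whose 5% critical value is about 5.99; the reported value of 1.470286 lies below both that figure and the 16.92 threshold used here, so the conclusion of no normality problem is the same either way.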
The results found in this model did reveal a relatively high R-squared value, but the variable of note, Left-handed Ratio, was found to be insignificant at all three levels listed above. The coefficients in the model all met the predicted signs from Chapter II except for Left-handed Ratio and Hit into Double Play. The main conclusions to be made from this model will be discussed in detail in the chapter to follow.
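To show how the fitted equation for model 1 can be read, the sketch below evaluates it at the sample means reported in Table 3.1, using the coefficients from Table 4.1. The coding of the Year variable is not given in this chapter, so the value 4.5 (the midpoint of a hypothetical 0 to 9 index) is purely an assumption, as are the names `COEFFICIENTS` and `predict_runs`.

```python
# Coefficients copied from Table 4.1.
COEFFICIENTS = {
    "Constant": -532.5778,
    "Left-handed Ratio": -2.227669,
    "Homeruns": 0.844847,
    "Hits": 0.641768,
    "Strikeouts": -0.017299,
    "Stolen Base Ratio": 0.104606,
    "Base-on-balls": 0.330899,
    "Hit by Pitch": 0.440804,
    "Hit into Double Play": 0.143210,
    "Year": -0.779496,
}


def predict_runs(values):
    """Evaluate the fitted model 1 equation for one team season."""
    total = COEFFICIENTS["Constant"]
    for name, value in values.items():
        total += COEFFICIENTS[name] * value
    return total


# Sample means from Table 3.1; the Year value is a hypothetical coding.
sample_means = {
    "Left-handed Ratio": 0.4,
    "Homeruns": 173.83,
    "Hits": 1475.04,
    "Strikeouts": 1062.49,
    "Stolen Base Ratio": 54.55,
    "Base-on-balls": 541.63,
    "Hit by Pitch": 58.46,
    "Hit into Double Play": 152.74,
    "Year": 4.5,
}
predicted = predict_runs(sample_means)
print(round(predicted, 1))
```

An ordinary least squares fit with a constant term passes through the sample means, so the prediction should land close to the mean of 770.61 runs reported in Table 3.1, with any gap attributable to rounding and to the assumed Year coding. The Left-handed Ratio term contributes less than one run at its mean, which illustrates how small the estimated effect is.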
Model 2
\[
\text{Winning Percentage} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{OPS} + \beta_3\,\text{Run Ratio} + \beta_4\,\text{Range Factor} + \beta_5\,\text{Year} + \varepsilon \tag{4.2}
\]
Equation 4.2 above lays out the formula used in model 2. Three variables in the winning percentage model were deemed to be significant. Range Factor (RF) was found to be significant at the 5% level, but contrary to predictions its effect on a team's winning percentage was negative. On-Base-Percentage plus Slugging Percentage (OPS) also had a significant impact on the model, but unlike RF its effect was positive. Finally, Run Ratio was shown to have a very strong and also positive impact in this model. It is also worth noting that the variable of concern, Left-handed Ratio, was again found to be insignificant.
Table 4.2
Regression Results: Winning Percentage
Variable Coefficient t-Statistic
Constant -0.129082 -0.738384
Left-handed Ratio -0.001297 -0.202259
OPS 1.131611 12.06884***
Run Ratio 0.062792 7.102465***
Range Factor -0.070991 -2.074597**
Year 0.000507 0.453629
R-squared 0.447857
Adjusted R-squared 0.438435
Observations 300
Adjusted Observations 299
* Significant at 10%, ** Significant at 5%, *** Significant at 1%
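The adjusted R-squared values in Tables 4.1 and 4.2 can be reproduced from the reported R-squared values with the usual formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1). The sketch below assumes that n is the 299 adjusted observations listed in the tables and that k is the number of independent variables (nine and five); the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R-squared for a regression that includes a constant term."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Model 1: runs scored, nine independent variables.
adj_model_1 = adjusted_r_squared(0.906078, 299, 9)
# Model 2: winning percentage, five independent variables.
adj_model_2 = adjusted_r_squared(0.447857, 299, 5)

print(round(adj_model_1, 6), round(adj_model_2, 6))
```

Under these assumptions the results agree with the tabulated 0.903153 and 0.438435, which suggests that the regressions were estimated on 299 rather than 300 observations.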
The variables used in this regression account for approximately 45% of the variation in a team's winning percentage. As in the first model, no problems were found with serial correlation, multicollinearity, or heteroskedasticity. Just as in model 1, a Jarque-Bera test was run to test for normality. For this model the test was run using 5 degrees of freedom; therefore the JB test needed to reveal a value below 11.07.2 Just as in the previous model, no normality issues were discovered.
The results from model 2 showed an R-squared value of approximately 45%. This is a mediocre result, which may suggest that many "outside sources," such as managers' decisions, play a much larger role in winning. Both Left-handed Ratio and Range Factor resulted in a different sign than was predicted in Chapter II. All three statistics used to represent hitting, pitching, and defense were found to be significant to at least the 5% level, which may mean that although those are the three parts of the game of baseball, much more needs to be accounted for in order to relate left-handed hitting to winning percentage more successfully. As with model 1, a detailed conclusion of the results found in this model will be presented in Chapter V.
2 Chi-squared for data in winning percentage regression is 0.461489.
Conclusion
This brief chapter laid out the results uncovered while running the two regressions used in this study. It also discussed the tests run on the data to make sure no underlying problems exist with the results before they are interpreted. The next and final chapter will go in depth into conclusions drawn from this research, while also providing ideas for further research that may extend this thesis.
CHAPTER V
CONCLUSION
This chapter will discuss any conclusions that can be taken from the results, while also touching upon possible extensions and alterations that could be made to these models in order to extend this research. Finally, this chapter will conclude with a discussion summing up what can be taken away from this study itself.
Regression Conclusions
The primary focus of this study was to delve into the reasoning behind the asymmetry of baseball. The aim was to determine what impact left-handed hitters have on team production, based upon what portion of the team they account for. As touched upon in the Literature Review in Chapter I, baseball has an odd balance of lefties vs. right-handers not seen in everyday society. For many reasons baseball is a sport of balance, and the thesis behind this study was to determine if having between 33% and 55% of a team's hitters be left-handed would be beneficial to its success. Two models were generated in order to see if this balance would benefit teams. The first was in a solely offensive category, runs scored, while the second determined if it would play a part in increasing a team's winning percentage. From this point ten years' worth of data was collected and two separate regressions were run. Although the variable of note, left-handed ratio, was deemed to be insignificant in both models, some concrete conclusions can be taken from this study.
In model 1 an R-squared value of 91% was found. While left-handed ratio was deemed to be insignificant, the nine independent variables combined are very accurate in explaining runs scored. This means that if a manager felt he needed to improve his run production, he could see that some variables, such as homeruns, are very important in making that improvement.
That may be a somewhat commonly accepted answer, but this study does show that when looking for homerun production he may not need to worry as much about a player's tendency to strike out or hit into double plays. Contrary to common thought, strikeouts are not significant in hindering run production, and furthermore their negative coefficient was minuscule.1 Another common belief, that power hitters' tendencies to hit into double plays can be very detrimental to team success, was not supported: the variable was not only insignificant in this study, but was also associated with a positive coefficient.2 This means that hitting into more double plays is associated with a higher number of runs scored. A possible explanation for this could be that the significance of homeruns in this model is so high that the occasional double play caused by "swinging for the fences" is a risk a manager should learn to accept.
The second model, based upon winning percentage, saw a much smaller R-squared value than the previous model but was still acceptable, explaining 45% of the variability in winning percentage. Again, as in the first model, Left-handed Ratio was deemed to be insignificant, but the three significant variables are all of enough importance to managers that some interesting conclusions can be drawn. As predicted, both On-Base-Percentage plus Slugging Percentage (OPS) and Run Ratio were deemed to be significant positive variables. This means that, as many would assume, the better pitching and hitting a team has, the more wins it will earn. The variable that did not result as predicted was Range Factor (RF). As explained earlier, RF is one of the most accurate ways to evaluate team defense, and not only was it found to have a negative impact on winning, it was also deemed to be significantly negative.3 This is a very intriguing result, because most people in the "baseball world" would without question say that defense is one very important piece of winning.
1 Strikeout coefficient from the runs scored model is -0.017299.
2 Hit into Double Play coefficient from the runs scored model is 0.143210.
One reason for this could be the small variability seen in defense league-wide. For instance, the range in RF throughout ten years of data was only 0.49. This can be explained by realizing that every MLB player is already so good at playing defense that the difference between the best fielder and the worst fielder is very small. This could mean that the difference between good and bad is so small that it is almost meaningless in understanding winning percentage in MLB, and therefore the variable could just as easily have been significantly positive or insignificant.
A second possible explanation for this anomaly could lie in the well-known saying, "Defense wins Championships." It has often been speculated that defense plays a much more important role in winning during the playoffs, when everyone is playing with more focus than ever. This could mean that during a 162-game season players may get lackadaisical and let their defense slip, making it less influential to winning. Regardless of the real reason for this result, knowing it can help a manager in making personnel choices. For instance, suppose a manager is debating between Player X and Player Y to bring in for next year. Player X has excellent hitting statistics but has been known to be a liability on defense, while Player Y has adequate hitting statistics but a stellar defensive reputation. In this situation the results from this study would influence the manager to go with Player X, because offense is seen to have a much more significant, and at the same time positive, influence on winning. He may then be able to choose a player based more upon his hitting statistics than his defensive abilities without as much hesitation as he may have previously had.
3 t-Statistic for Range Factor is -2.074597.
Further Research
Although this study did not find a significant answer to its thesis, it should be noted that further research could find more successful results. One possible change that could be made to this study is to try to incorporate the ability of a switch-hitter to hit left-handed. If every at bat throughout a season could be broken down and recorded according to whether the batter hit from the left side or the right side, the results for this study could be much more complete.
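The at-bat breakdown proposed above might look like the following sketch, which measures a team's left-handed hitting by the share of at bats taken from the left side rather than by a count of left-handed players. The record layout, player labels, and function name are hypothetical; no such data was collected for this study.

```python
def left_side_share(at_bats):
    """Share of at bats taken from the left side of the plate.

    `at_bats` is a list of (player, side) pairs, with side "L" or "R".
    """
    if not at_bats:
        return 0.0
    left = sum(1 for _, side in at_bats if side == "L")
    return left / len(at_bats)


# Hypothetical log: a switch-hitter ("S1") bats left three times and right
# once, a left-handed hitter bats twice, and a right-handed hitter four times.
log = (
    [("S1", "L")] * 3 + [("S1", "R")]
    + [("L1", "L")] * 2
    + [("R1", "R")] * 4
)
share_left = left_side_share(log)
print(share_left)
```

In this example one of three hitters is purely left-handed, yet half of all at bats come from the left side once the switch-hitter's at bats are attributed by side.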
A second extension to this study would be to somehow incorporate player productivity into this model. It is apparent that the production of each individual player plays a large part in team production (the focus of this study), so if a model could be devised to also include what goes into individual player production (i.e., natural ability, age, contract situation, etc.), it would lead to a more accurate model.
A final extension to this study could be to simply gather more years of data. This study was done looking at ten years' worth of data. If even 20 years of data were collected and included in this study, it would lead to a more complete picture.
Final Discussion
The purpose of this study was to provide evidence that having between 33% and 55% of the hitters on a team be left-handed would benefit a team in its success. In both models the results for the variable Left-handed Ratio were found to be inconclusive. Therefore this study can neither prove nor disprove its underlying thesis. This thesis is the most up-to-date research on a topic that has been left relatively untouched until this point in time. This study did find some important factors in what leads to team success, and it can be used by general managers and coaches alike to try to identify what type of players they need to target in order to be successful. Although this study did not find a positive correlation between a balanced hitting attack and team success, that does not mean there is not a significant correlation. If there were no benefit in having a balanced number of right-handers and left-handers, we would never have seen such a unique asymmetry in baseball to begin with.
SOURCES CONSULTED
Annett, M. "Left, Right, Hand and Brain: The Right Shift Theory." London: Erlbaum (1985).
"Baseball Almanac - The Official Baseball History Site." [cited 2010]. Available from http://www.baseball-almanac.com/.
"Baseball-Reference.Com - Major League Baseball Statistics and History." [cited 2010]. Available from http://www.baseball-reference.com/.
"Baseball-reference.com - Teams." [cited 2010]. Available from http://www.baseball-reference.com/teams/batteam/shtml.
Berri, David J. "Who is 'Most Valuable'? Measuring a Player's Production of Wins in the National Basketball Association." Managerial and Decision Economics 20, no. 8 (12 1999): 411-427.
Bradbury, J.C. The Baseball Economist: The Real Game Exposed. New York, N.Y: Dutton, 2007.
"Britannica.Com - Facts about Rookie of the Year: baseball." [cited 2010]. Available from http://www.britannica.com/.
Cassing, J., and R. Douglas. "Implications of the Auction Mechanism in Baseball's Free Agents Draft." Southern Economic Journal 47 (07 1980): 110-121.
"Chicago Tribune Online - Chicago Cubs." [cited 2010]. Available from http://www.chicagotribune.com/.
DeBrock, Lawrence, Wallace Hendricks, and Roger Koenker. "Pay and Performance: The Impact of Salary Distribution on Firm-Level Outcomes in Baseball." Journal of Sports Economics 5, no. 3 (08 2004): 243-261.
"ESPN - New York Yankees News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball." [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=nyy.
"ESPN - Washington Nationals News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball." [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=was.
Goldstein, Stephan R., and Charlotte A. Young. "'Evolutionary' Stable Strategy of Handedness in Major League Baseball." Journal of Comparative Psychology 110, no. 2 (06 1996): 164-169.
Grondin, Simon, Yves Guiard, Richard Ivry, and Stan Koren. "Manual Laterality and Hitting Performance in Major League Baseball." Journal of Experimental Psychology 25, no. 3 (06 1999): 747-754.
Gwartney, James, and Charles Haworth. "Employer Cost and Discrimination: The Case of Baseball." The Journal of Political Economy 82, no. 4 (07 1974): 873-881.
Kahn, Lawrence M. "Managerial Quality, Team Success, and Individual Player Performance in Major League Baseball." Industrial and Labor Relations Review 46, no. 3 (04 1993): 531-547.
Krautmann, A. C., and M. Oppenheimer. "Contract Length and the Return to Performance in Major League Baseball." Journal of Sports Economics 3, no. 1 (02 2002): 6-17.
Krautmann, Anthony C. "Shirking or Stochastic Productivity in Major League Baseball?" Southern Economic Journal 56, no. 4 (04 1990): 961-968.
Krautmann, Anthony C. "Shirking or Stochastic Productivity in Major League Baseball: Reply." Southern Economic Journal 60, no. 1 (07 1993): 241.
Laby, Daniel M., MD, David G. Kirschen, OD, PhD, Arthur L. Rosenbaum, MD, and Michael F. Mellman, MD. "The Effect of Ocular Dominance on the Performance of Professional Baseball Players." Ophthalmology 105, no. 5 (1998): 864-866.
Lewis, Michael M. Money Ball. New York, N.Y: W.W. Norton & Company Inc, 2003.
Marburger, D. R. "Does the Assignment of Property Rights Encourage Or Discourage Shirking? Evidence from Major League Baseball." Journal of Sports Economics 4, no. 1 (02 2003): 19-34.
Maxcy, J. G., R. D. Fort, and A. C. Krautmann. "The Effectiveness of Incentive Mechanisms in Major League Baseball." Journal of Sports Economics 3, no. 3 (08 2002): 246-255.
Maxcy, Joel. "Motivating Long-Term Employment Contracts: Risk Management in Major League Baseball." Managerial & Decision Economics 25, no. 2 (03 2004): 109-120.
Miceli, T. J. "A Principal-Agent Model of Contracting in Major League Baseball." Journal of Sports Economics 5, no. 2 (05 2004): 213-220.
"Mlb.Com - MLB Miscellany: Rules, regulations, and statistics." [cited 2010]. Available from http://www.mlb.com/.
Nistler, Tony, and David Walton. Sporting News Books Baseball Register 2007. 2005 ed. St. Louis, MO: Sporting News Books, 2006.
Oorlog, Dale R. "Marginal Revenue and Labor Strife in Major League Baseball." Journal of Labor Research XVI, no. 1 (Winter 1995): 25-42.
Porter, Phillip K., and Gerald W. Scully. "Measuring Managerial Efficiency: The Case of Baseball." Southern Economic Journal 48, no. 3 (01 1982): 642-650.
Scully, Gerald W. "Player Salary Share and the Distribution of Player Earning." Managerial and Decision Economics 25, no. 2 (03 1974): 77-86.
Scully, Gerald W. "Pay and Performance in Major League Baseball." American Economic Review 64 (12 1974): 915-930.
"The Triple Crown - A Brief Look at One of Baseball's most Coveted and Elusive Feats." [cited 2010]. Available from http://www.psacard.com/articles/article3547.chtml.
Thorn, John, and Pete Palmer. The Hidden Game of Baseball: A Revolutionary Approach to Baseball Statistics. New York: Dolphin, 1984.
Wood, C. J., and J. P. Aggleton. "Handedness in 'fastball' sports: Do left-handers have an innate advantage?" British Journal of Psychology 80 (1989): 227-240.
Zak, Thomas A., Cliff J. Huang, and John J. Siegfried. "Production Efficiency: The Case of Professional Basketball." The Journal of Business 52, no. 3 (07 1979): 379-392.