BASEBALL AND THE LEFT HANDED HITTER

A THESIS

Presented to

The Faculty of the Department of Economics and Business

The Colorado College

In Partial Fulfillment of the Requirements for the Degree

Bachelor of Arts

By

Addison Alan DeBoer

May/2010

BASEBALL AND THE LEFT HANDED HITTER

Addison Alan DeBoer

May, 2010

Mathematical Economics

Abstract

This thesis seeks to explain the unusually large number of left-handed hitters found in Major League Baseball (MLB). The focus of this study is to determine the share of left-handed hitters an MLB team should employ in order to maximize its success. The driving force behind this study is that the proportion of lefties in MLB is substantially higher than the proportion found in everyday society. The hypothesis is that, to achieve its optimal rate of success, a team should have between 33% and 55% of its hitters bat left-handed. The study includes all 30 MLB teams over a span of ten years, covering more than 4100 hitters. Two models are used to estimate the effect of left-handed hitters on the total number of runs a team scores and on a team's season-long winning percentage. The regressions produced R-squared values of .91 and .45, respectively. While the models showed that several variables significantly affect runs scored and winning percentage, the results were inconclusive in relating left-handed hitting to either dependent variable. For that reason, the research could not support the hypothesis that MLB teams should employ between 33% and 55% left-handed hitters.

KEYWORDS: (Left-handed Hitter, Runs Scored, Winning Percentage)

ON MY HONOR, I HAVE NEITHER GIVEN NOR RECEIVED UNAUTHORIZED AID ON THIS THESIS

Signature

TABLE OF CONTENTS

I INTRODUCTION
  Literature Review
  Player Value
  Player Productivity
  Natural Handedness
  Conclusion

II THEORY
  The Left-Handed Hitter
  Team Success in MLB
  Sample and Time Frame
  Model 1
  Dependent Variable
  Hitters
  Independent Variables
  Model 2
  Dependent Variable
  Team
  Independent Variables
  Method
  Conclusion

III DATA
  Data Resources
  Data Analysis
  Variable Explanation
  Conclusion

IV RESULTS
  Regression Results
  Model 1
  Model 2
  Conclusion

V CONCLUSION
  Regression Conclusions
  Further Research
  Final Discussion

SOURCES CONSULTED

LIST OF TABLES

2.1 Order and OPS for 2009 Season

2.2 Batting Order and OPS for 2009 Season

2.3 Expected Signs for Independent Variables for Model 1

2.4 Expected Signs for Independent Variables for Model 2

3.1 Descriptive Statistics

4.1 Regression Results: Model 1

4.2 Regression Results: Model 2

ACKNOWLEDGEMENTS

I would like to thank my teacher, advisor, and friend, Kristina Lybecker, for all her guidance and encouragement throughout this thesis. Your support throughout was greatly appreciated and will not be forgotten. I would also like to thank my family for their never-ending support not only throughout my college career, but also throughout my whole life. Lastly, I would like to thank my parents for everything they have sacrificed over the years in order to support me. Without you guys I can't imagine where I would be today. Everything I've ever accomplished I owe to you.

CHAPTER I

INTRODUCTION

Introduction

Major League Baseball (MLB) has been commonly referred to as America's pastime for over a century. It has been a focal point of the sporting world in the United States since its expansion in 1869.1 This study has two purposes. The first is to determine whether left-handed hitters increase a team's run production or winning percentage. The second is to estimate how many lefties a team should use to maximize its run production and winning percentage.

The focus of this study is MLB performance as it relates to the batting handedness of each team's players. This topic is of interest to players, managers, general managers, and fans alike, each for their own reasons. For instance, a player would like to know how many lefties a team is looking to employ when signing his contract, to better understand the demand for him on that team. Knowing the appropriate number of lefties a team is looking to employ could help a player decide which team is the best fit for him when signing a new contract. The manager could use this information in deciding which players he wants to use on a daily basis, and from this research determine how many left-handers he should use throughout the season. Along the same lines, a general manager could use this study to help determine how many lefties to employ from season to season. Knowing how many lefties to hire in order to maximize statistics and winning percentage would be extremely valuable in selecting a roster year in and year out. Finally, a fan could appreciate this knowledge simply out of general interest in the game. Since fans are essentially the revenue generators for this industry, their right to this knowledge is as strong as that of any of the others mentioned. These reasons, among others, speak to the importance of this study. Fundamentally, this research can help determine the effect lefties have in today's game.

1 "Baseball Almanac - The Official Baseball History Site," [cited 2010]. Available from http://www.baseball-almanac.com/.

The role of the left-handed hitter has gradually evolved, just as the game of baseball has, over the past century. It has become accepted in Major League Baseball that a team needs left-handed hitters in its lineup in order to succeed. This was seen very clearly when Chicago Cubs manager Lou Piniella said, "The only thing I talked about last season was a need for a left-hand bat... We didn't bring Edmonds back and Edmonds [hit] quite a few homeruns, so we needed a left-handed bat. That's it."2 Without question, the role of a left-hander in MLB cannot be overlooked, and this study will shed light on it.

In a game where the livelihood of everyone involved relies on production, many studies have been done to explain it. There have been countless studies on professional sports and the production of players relating to their age, race, contract, and other characteristics. As explained above, the left-handed hitter plays a very large role in baseball, yet none of these studies has identified exactly what that role is. These studies, done in baseball, basketball, cricket, football, hockey, and soccer, will provide beneficial background for this research.

2 "Chicago Tribune Online - Chicago Cubs," [cited 2010]. Available from http://www.chicagotribune.com/.

Many factors must be accounted for in order to place a value on the left-handed hitter. First, factors that relate to player, or even employee, productivity must be accounted for. Variables affecting production that never appear on the score sheet, such as experience, physical attributes, position played, age, and contract, among a host of others, must be considered. Team production variables, along with the number of left-handed hitters, include winning percentage, home runs, hits, stolen bases, and walks, among others. It also must be noted that the statistics used will not capture each individual event, or even each player's season totals. This study will use team averages and totals to get a wider view of the role lefties play in MLB.

The goal of this study is to determine the appropriate number of left-handed hitting players a team should utilize to maximize their run production and winning percentage. The regressions in this study will test the hypothesis that each team should employ between three and five left-handed hitters to maximize success. This section has illustrated the importance of this study and has laid the framework for determining the value of lefties. The rest of this chapter will discuss previous research that pertains to player value, player productivity, and handedness. The following chapter will review relevant economic theory and explain the methodology and data used in this study.
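As a quick check on how the two statements of the hypothesis line up (three to five left-handed hitters here, 33% to 55% elsewhere in the thesis), the sketch below converts a count of left-handed hitters into a share of the lineup. It assumes a nine-slot batting order, which is an assumption made here for illustration only; the names `LINEUP_SLOTS` and `lefty_share` are hypothetical and are not part of the thesis's models.

```python
# Assumption: a starting lineup has nine batting slots. This framing is
# hypothetical and is used only to connect "three to five hitters"
# with the "33% to 55%" range.
LINEUP_SLOTS = 9


def lefty_share(lefties: int, slots: int = LINEUP_SLOTS) -> float:
    """Return the fraction of lineup slots filled by left-handed hitters."""
    if not 0 <= lefties <= slots:
        raise ValueError("lefties must be between 0 and the number of slots")
    return lefties / slots


low = lefty_share(3)   # three lefties out of nine slots
high = lefty_share(5)  # five lefties out of nine slots
print(f"{low:.1%} to {high:.1%}")  # prints "33.3% to 55.6%"
```

Under that assumption, three to five hitters corresponds to roughly one third to a little over one half of the lineup, which matches the percentage range used for the hypothesis.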

Chapter III will discuss the data set for this study in detail, while Chapter IV will cover the regression results and their meaning. Finally, Chapter V will provide conclusions based upon the results of this study and suggest possible future research on this topic.

Literature Review

The purpose of this section is to review the literature on player and team performance in Major League Baseball. There have been many studies of player productivity and player value to the team, but few have incorporated physical attributes into the equation. Economists James Gwartney and Charles Haworth (1974) studied the impact of a player's ethnicity on a team's winning percentage, but a player's natural handedness remains relatively untouched by the research world.3 Many economists, such as Bradbury (2007)4 and Berri (1999)5, have found ways to evaluate an athlete's value, such as attributing monetary or statistical values to a player's worth based on the effect he personally has on a team's winning percentage. However, this research was based strictly on on-field and on-court performance, never accounting for a player's natural hand preference.

Other studies, such as the one by Oorlog (1995) on marginal revenue and player salary, tried to relate a player's value to his marginal revenue, generated through ticket sales and broadcast revenue.6 This study tried to put a fiscal value on each player. In doing so, Oorlog (1995) expanded on the method used by many economists of calculating each player's marginal revenue product (MRP) by adding the spectator aspect, generating a new model he called marginal spectator revenue product (MSRP).7 In a business such as Major League Baseball, this can be very beneficial.

3 James Gwartney and Charles Haworth, "Employer Costs and Discrimination: The Case of Baseball," The Journal of Political Economy 82, no. 4 (07 1974): 873-881.

4 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.

5 David J. Berri, "Who is 'Most Valuable'? Measuring a Player's Production of Wins in the National Basketball Association," Managerial and Decision Economics 20, no. 8 (12 1999): 411-427.

6 Dale R. Oorlog, "Marginal Revenue and Labor Strife in Major League Baseball," Journal of Labor Research XVI, no. 1 (Winter 1995): 25-42.

The first portion of this literature review covers studies of player value, relating a player's value both to his monetary worth and to his team's winning percentage. Both measures have their advantages: as much as professional baseball is a business, many would argue that winning is more important than money to many owners around the league. The second section touches upon studies of player productivity. It brings to the forefront the studies that generated and revolutionized the models used to measure a player's productivity on the field of play. The third section looks into studies of handedness and the natural advantages and disadvantages that come with favoring one hand over the other. These are medical studies that examine intricacies such as natural handedness, handedness in sports, and hand vs. ocular dominance, and identify the advantages and disadvantages of each. The final section of this review of scholarly works considers ways to improve on these studies and relates them to the role left-handed batters play in MLB, to determine whether the optimal share of left-handers a team should employ is indeed between 33% and 55%.

Player Value

Player value is something that will never be recorded with complete accuracy. Intangibles such as leadership, chemistry, and work ethic are just a few attributes that are very difficult to put a mathematical value on. Taking that under consideration, a player's value can be assessed on a different level: a statistical one. In sports, statistics are more often than not responsible for the well-being of the athlete. Using these statistics, many researchers have found ways to value a player accurately, and by analyzing them we can continue to evaluate sports in the economic arena.

7 Ibid.

J.C. Bradbury (2007) completed one of the most recent and successful studies relating a player's value to his team's winning percentage in the book The Baseball Economist.8 Bradbury states that a player's worth lies in his ability to generate wins for his team. In short, a player's worth is directly related to his contribution to his club's winning percentage. DeBrock, Hendricks, and Koenker (2004) support this claim. They state that a team with better athletes will, in turn, generate a better winning percentage and better attendance records.9 This forces owners to pay more for these players, thus increasing their overall value. These studies are useful because they generate models that relate an individual player to a team's winning percentage and can assess that player's value to his team.

Other studies have related a player's value to the winning percentage of his team while adding an external "x-factor" to the equation. Kahn (1993) measured the influence of a manager on his ball club's winning percentage and player production.10 Using a simple regression function, Kahn was able to estimate the effect of an external variable, managerial quality, on winning percentage and player productivity. This study found that an external factor such as a manager has a positive and significant effect on a team's winning percentage and player production. This information is useful in creating a model with other x-factors such as handedness.

8 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.

9 Lawrence DeBrock, Wallace Hendricks, and Roger Koenker, "Pay and Performance: The Impact of Salary Distribution on Firm-Level Outcomes in Baseball," Journal of Sports Economics 5, no. 3 (08 2004): 243-261.

10 Lawrence M. Kahn, "Managerial Quality, Team Success, and Individual Player Performance in Major League Baseball," Industrial and Labor Relations Review 46, no. 3 (04 1993): 531-547.

Gwartney and Haworth's (1974) study of discrimination in Major League Baseball, mentioned earlier, used a model to find the impact that employing black baseball players had on a team's winning percentage and overall attendance. As expected, teams that employed more black players obtained a competitive advantage. In fact, the study found that teams willing to employ these players saw an increase in games won and annual revenue. This is significant because Gwartney and Haworth were able to specify equations which, like Kahn's study, included an external factor measured to have a significant impact on winning percentage and player production.

Another, much more direct approach to assessing a player's value was introduced by Scully (1974). He was able to measure a player's worth to his club as a specific dollar amount using the theory of marginal revenue product (MRP).12 This study concluded that a team's winning percentage is determined by the performance of players on the team, and that its revenue is in turn determined by winning percentage. Making these connections, Scully was able to express a player's worth as a specific dollar amount, based upon his on-field performance and the revenue accrued by the organization. This is particularly important because player value as it relates to player productivity will be discussed in the section to follow.
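The chain of reasoning described above can be written compactly. The notation below is a stylized sketch, not Scully's actual specification: $W$ (team winning percentage), $q_i$ (player $i$'s on-field performance), $R$ (team revenue), and the functions $f$ and $g$ are symbols assumed here for illustration.

```latex
\begin{align}
W &= f(q_1, \dots, q_n) \\
R &= g(W) \\
\mathrm{MRP}_i &= \frac{\partial R}{\partial q_i}
  = \frac{dg}{dW}\,\frac{\partial f}{\partial q_i}
\end{align}
```

Read this way, a player's dollar value is the extra revenue generated by the extra wins his performance produces.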

11 James Gwartney and Charles Haworth, "Employer Costs and Discrimination: The Case of Baseball," The Journal of Political Economy 82, no. 4 (07 1974): 873-881.

12 Gerald W. Scully, "Player Salary Share and the Distribution of Player Earnings," Managerial and Decision Economics 25, no. 2 (03 1974): 77-86.

One issue with Scully's model is that it was developed before free agency, and it has since been shown that players as of 1980 were overpaid by about 20 percent.13 Since the birth of free agency, players have gained the upper hand in negotiations and thus may be overpaid even more now. With this in mind, Oorlog (1995) sought to determine whether players were paid a portion of the marginal broadcast revenue product (MBRP).14 Oorlog found that only free agents and arbitration-eligible players received some or all of their MBRP, indicating that unless a player has bargaining power, owners are unwilling to share this revenue with their players. This matters because salary in the free-agent era cannot be predicted as accurately using Scully's MRP method, and thus other factors must be included to increase the model's accuracy. One such factor, studied by Maxcy (2004), was position played. This is particularly important in setting up models to account for intangibles such as handedness.

Player Productivity

Employee productivity is something that every business wants to maximize in order to succeed. This is no different in Major League Baseball. In baseball, productivity is everything and a player's stats are his resume. The advantage an owner of a professional ball club has over his counterparts in other fields is that the productivity of his employees is very accurately recorded.

13 J. Cassing and R. Douglas, "Implications of the Auction Mechanism in Baseball's Free Agents Draft," Southern Economic Journal 47 (07 1980): 110-121.

14 Dale R. Oorlog, "Marginal Revenue and Labor Strife in Major League Baseball," Journal of Labor Research XVI, no. 1 (Winter 1995): 25-42.

15 Joel Maxcy, "Motivating Long-Term Employment Contracts: Risk Management in Major League Baseball," Managerial and Decision Economics 25, no. 2 (03 2004): 109-120.

Krautmann (1990) studied player production and the effects of shirking and stochastic production in Major League Baseball.16 In this study he used the slugging average (SA) of major league hitters to model production. He noted that player production varies a great deal, and thus he also recorded players' best (max) and worst (min) years. These labor services without question involve a random component, and for this reason the study was done to find out whether players do indeed shirk after receiving job security in the form of a long-term contract.17 Krautmann found no evidence of shirking, but the study is pertinent in identifying a player's production, the main focus of this section.

Krautmann (1993) furthered his study in this field, noting that "allegations of shirking due to long-term job security have been associated with seniority rights, professional athletes, and academic institutions of tenure."18 In this reply Krautmann conveyed that it is indeed hard to conclude whether a player is shirking based upon his contract situation. He instead suggests that a player's reduced productivity could be attributed to the stochastic nature of production, or that the player could simply be having a down year statistically.19 This is important in recognizing the variability in an athlete's production.

Nonetheless, Krautmann believes that, for whatever reason, a player's statistics decrease as his contract length increases. This was found in his study with Oppenheimer (2002) on contract length and performance in Major League Baseball.20 In this study the authors used a wage regression function with salary as the dependent variable and predicted performance, team, and player characteristics (i.e. seniority and race) as independent variables. This is useful in assessing player production over long periods of time, and in noting its unpredictability, while using external variables (i.e. handedness) as independent variables.

16 Anthony C. Krautmann, "Shirking or Stochastic Productivity in Major League Baseball?" Southern Economic Journal 56, no. 4 (04 1990): 961-968.

17 Ibid.

18 Anthony C. Krautmann, "Shirking or Stochastic Productivity in Major League Baseball: Reply," Southern Economic Journal 60, no. 1 (07 1993): 241.

19 Ibid.

In another study, Krautmann, Maxcy and Fort (2002) look at the effectiveness of incentive mechanisms in Major League Baseball and their effects on production. The authors looked into incentive-based contracts and natural incentives to find the impact they have on ball players. They considered playing time, time spent on the disabled list, and skill as the three main measures of production. To measure skill they used slugging average (SA) for position players and strikeout-to-walk ratio for pitchers. They found it impossible to measure a player's true potential exactly, and therefore very difficult to know whether a player is truly underachieving. Evidence was also found that players spend less time on the disabled list during contract years. The authors also found that players tend to have increased playing time after signing long-term contracts, which aligns with Krautmann's prior finding that players are not shirking in post-contract years. An external reason, which simply cannot be accounted for mathematically, is that managers may receive pressure from the front office (i.e. owners and general managers) to play newly signed players in order to make their acquisitions look successful.22 Again this reiterates the fluctuation found in production when measuring athletes, an issue which will undoubtedly be faced in later sections.

20 A. C. Krautmann and M. Oppenheimer, "Contract Length and the Return to Performance in Major League Baseball," Journal of Sports Economics 3, no. 1 (02 2002): 6-17.

21 J. G. Maxcy, R. D. Fort, and A. C. Krautmann, "The Effectiveness of Incentive Mechanisms in Major League Baseball," Journal of Sports Economics 3, no. 3 (08 2002): 246-255.

In contrast to the Krautmann studies, Marburger (2003) believes that there is sufficient evidence of shirking in MLB.23 In this study Marburger focuses on the impact of shirking based upon the property rights of the baseball player. His results lead him to believe that players have the incentive to shirk regardless of how property rights are assigned at the time. If a player is still in his first six years in the league, and as a result bound by a reserve contract, the team essentially owns that player. For this reason he could feel he is underpaid and consequently underproduce during that span. Marburger also has reason to believe that even after a player owns his property rights (year seven and later in MLB), he may have reason to shirk after signing a long-term contract. Marburger believes shirking is difficult to prove concretely, and that there is ultimately no net impact of shirking on Major League Baseball. Shirking may be impossible to detect, but that is not because it does not exist in baseball. Reiterating his belief, Marburger notes that some experts inside MLB do believe that some shirking exists, which he supports by pointing out the weight management stipulations found in some contracts.24 This study shows that even using much of the same data, multiple conclusions can be reached, which raises the idea that external forces unaccounted for in these studies play a significant role in the end results.

Miceli (2004) studied the reserve clause, touched upon in the preceding study, in depth. He explored reasons why both the player and the organization find the reserve clause viable.25 He claims that players accept the clause because they receive quality job training from their employer, and in turn the organization has an incentive to offer such training because it is ensured the player's work for the remainder of the reserve clause contract. The organization accepts the clause because it is able to pay many players less than market value. Miceli states that the players accept the clause because they need to develop their skills in order to secure a future for themselves in the industry.26 The study also raises the issue of shirking, and makes it apparent that during any particular season shirking may skew results in other studies where statistics are used as the main source of data.

22 Ibid.

23 D. R. Marburger, "Does the Assignment of Property Rights Encourage Or Discourage Shirking? Evidence from Major League Baseball," Journal of Sports Economics 4, no. 1 (02 2003): 19-34.

24 Ibid.

As discovered in earlier studies, some external forces have a pertinent impact on the data studied. One of these, the manager, was studied by Porter and Scully (1982), who tried to measure the efficiency of managers in MLB.27 Using a Cobb-Douglas production function with a team's winning percentage as the dependent variable, Porter and Scully tried to evaluate the role of a manager in a team's winning percentage. The authors found that teams with higher managerial efficiency numbers were, not surprisingly, among the top teams in the league. In a previous study, Scully (1974) found that in 1969 superstar player Sandy Koufax had a marginal revenue product (MRP) estimated at $725,000. In the same season Scully estimated manager Earl Weaver's MRP at $675,000.28 From this he concluded that superb managers could contribute as much to a team's revenue as high-caliber players. While little is known about their salaries, it is very unlikely that talented managers earn more than a modest fraction of what superstar MLB players are paid.29 These studies establish that the manager plays a significant role in the productivity and success of his players and team.

25 T. J. Miceli, "A Principal-Agent Model of Contracting in Major League Baseball," Journal of Sports Economics 5, no. 2 (05 2004): 213-220.

26 Ibid.

27 Phillip K. Porter and Gerald W. Scully, "Measuring Managerial Efficiency: The Case of Baseball," Southern Economic Journal 48, no. 3 (01 1982): 642-650.

28 Gerald W. Scully, "Pay and Performance in Major League Baseball," American Economic Review 64 (12 1974): 915-930.
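For readers unfamiliar with the Cobb-Douglas form mentioned above, its generic textbook shape is sketched below. This is not Porter and Scully's actual specification: $W$ (winning percentage), the team inputs $x_k$, the exponents $\beta_k$, and the scale term $A$ are symbols assumed here for illustration.

```latex
\begin{align}
W &= A \prod_{k=1}^{K} x_k^{\beta_k} \\
\ln W &= \ln A + \sum_{k=1}^{K} \beta_k \ln x_k
\end{align}
```

The second line shows why the form is convenient: taking logs makes the relationship linear in the coefficients, so it can be estimated by ordinary regression. One natural reading of managerial efficiency in such a setup is how close a team's actual winning percentage comes to the level its inputs would predict.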

A study of player production and efficiency in the National Basketball Association was done by Zak, Huang, and Siegfried (1979). The study gives insight into a team's potential and its ability to turn that potential into wins.30 It found a correlation between inputs such as shooting percentage, rebounding, and assists and the end result of a win or a loss. The study used a Cobb-Douglas production function and the Richmond technique to reach these conclusions and found very significant results. It also looked at the effect of home-court advantage and found that while the home team won a higher percentage of games than the away team, this was a result of superior performance by the home team and not of preferential treatment by officials, an idea that has been discussed for some time. While this study provides insight into the effect player production has on winning, it found no non-statistical inputs to have a significant impact.31 This points to the things that cannot be accounted for, such as the comfort a player feels while playing at home versus playing on the road. This comfort may not be measurable, but it may very well play a part in home-court advantage. The same effect plausibly exists in baseball.

Another relevant NBA study was performed by Berri (1999). This study related the value of player production to the number of wins a team achieves.32 Berri utilized statistics kept by the NBA and presented an economic model that linked a player's statistics to team wins. With the results generated, he found evidence that the value of an NBA player can be determined objectively and accurately. Much like the earlier studies discussed, player value was very easily related to a team's winning percentage, so this study's primary focus was the production of each individual player. Berri mentions that further study could include factors such as experience, coaching, and team chemistry, which undoubtedly play a role in team success.33 This study gives good insight into player production and will be valuable in building a comparable model for other sports.

30 Thomas A. Zak, Cliff J. Huang, and John J. Siegfried, "Production Efficiency: The Case of Professional Basketball," The Journal of Business 52, no. 3 (07 1979): 379-392.

31 Ibid.

Natural Handedness

In sports there are many different things a numerical value can be placed upon in order to generate some very interesting conclusions about the performance of athletes. Nevertheless, some things are out of an athlete's control, such as height, age, and, as many people believe, natural ability. One of these variables is an athlete's natural handedness. In some sports (e.g. basketball) the handedness of a player is far less important, but in sports such as baseball it plays a major role. Some studies have been done on natural handedness, and a few of those will be discussed in this section.

32 David J. Berri, "Who is 'Most Valuable'? Measuring a Player's Production of Wins in the National Basketball Association," Managerial and Decision Economics 20, no. 8 (12 1999): 411-427.

33 Ibid.

Wood and Aggleton (1989) studied the advantage of left-handers in "fast ball" sports.34 The study was done to uncover whether "left-handers have an intrinsic advantage over right-handers due to superior spatio motor skills, and that the relatively high proportion of top left-handed sportsmen and sportswomen is, in part, a reflection of this innate superiority."35 In this article Wood and Aggleton studied three sports: tennis, cricket, and soccer. Their study built on work by Annett suggesting that left-handers have a higher capacity for visuo-spatial thinking, which involves fine control of both hands and the ability to make fast reactions, and which could explain the intrinsic advantage lefties have in baseball.36 The study was performed to determine whether there was a higher than normal proportion of left-handers in sports that demand rapid and accurate visuo-spatial coordination. It found very little data that could validate this theory. In cricket there is a noteworthy number of left-handed bowlers, but Wood and Aggleton believe that this is more of a strategic advantage than anything else. They also found evidence that many left-handed batsmen are, in fact, right-handed by almost any other account. In tennis they found an inordinate number of left-handed players at the professional level, but again found that if the advantage is not solely strategic, the effect is "slight" at best. Finally, in soccer goalkeeping, where there should be no strategic advantage, no evidence was found of an excess of left-handers. This study shows that while there may not be an intrinsic advantage to being left-handed, in some sports the strategic advantage of being left-handed can indeed be quite beneficial.

34 C. J. Wood and J. P. Aggleton, "Handedness in 'fast ball' sports: Do left-handers have an innate advantage?" British Journal of Psychology 80 (1989): 227-240.

35 Ibid.

36 M. Annett, Left, Right, Hand and Brain: The Right Shift Theory (London: Erlbaum, 1985).

A study of handedness more directly related to baseball is found in Goldstein and Young (1996).37 The goal of this study was to determine whether or not evolutionary stable strategy (ESS) theory holds true for the handedness of pitchers and hitters in MLB. The study found that ESS theory does hold in Major League Baseball, but that the intrinsic advantage baseball offers lefties pushes the ratio of left-handers much higher than in everyday society. The study cites the advantage pitchers receive in a left vs. left or right vs. right matchup with the hitter. For this reason, the study was able to explain why the number of left-handed hitters initially increased before the number of left-handed pitchers.

"Initially, RH players are more common than LH players. But the initial predominance of RH players in the sport ensures that LH batters will evolve faster than LH pitchers because, as we have seen, LH batters have an advantage against the more common RH . However, as LH batters become more common in the population, LH pitchers, because of their effectiveness against LH batters, will also become more common. Eventually, the race should stabilize when the relative proportion of LH pitchers coincides with the relative proportion of LH batters."38

This study is central to the hypothesis of this thesis that the optimal number of left-handed hitters should be between 33% and 55% of a team's roster, based upon the number of left-handed pitchers.

Grondin, Guiard, Ivry, and Koren (1999) also discuss the advantage of hitting left-handed in MLB.39 This study's primary focus is on baseball's asymmetry and bimanual movements. The paper found several reasons to believe that the baseball environment induces players to bat left. Using throwing hand as the determining factor of handedness, Grondin (1999) found that 90% of left-handed players batted left, while 60% of the players who batted left were right-handed people.40 This work brings forward the strikingly inordinate number of left-handed hitters in baseball and sheds some light on the reasons that may be behind it. This study explains ideas behind why lefties are so prominent, and this thesis will take these ideas under consideration while determining the appropriate number of lefties a team should stick with.

37 Stephen R. Goldstein and Charlotte A. Young, "'Evolutionary' Stable Strategy of Handedness in Major League Baseball," Journal of Comparative Psychology 110, no. 2 (06 1996): 164-169.
38 Ibid.
39 Simon Grondin, Yves Guiard, Richard Ivry, and Stan Koren, "Manual Laterality and Hitting Performance in Major League Baseball," Journal of Experimental Psychology 25, no. 3 (06 1999): 747-754.

The final study to be mentioned in this section is Laby, Kirschen, Rosenbaum,

and Mellman (1998).41 This research looked at the relationship between hand and ocular

dominance in MLB players. This study was done to determine if "crossed" (i.e., left eye

and right hand or right eye left hand) or "same" (left eye and hand or right eye and hand)

dominance is advantageous in baseball. The study took 410 players

and cross-referenced their hand-ocular dominance patterns against their batting average (BA)

or earned run average (ERA). Players' hand-ocular dominance was determined using

batting-handedness, throwing hand, questionnaires, and visual tests. Many people have

commonly believed that "crossed" dominance should benefit a batter because their

dominant eye is naturally positioned toward the pitcher, but the result from this study

showed that hand-ocular dominance had no effect on BA or ERA.42 The hand-ocular

study found no significant results although it does help create a foundation for studying

left-handers in the MLB.

40 Ibid.
41 Daniel M. Laby, MD, David G. Kirschen, OD, PhD, Arthur L. Rosenbaum, MD, and Michael F. Mellman, MD, "The Effect of Ocular Dominance on the Performance of Professional Baseball Players," Ophthalmology 105, no. 5 (1998): 864-866.
42 Ibid.

Conclusion

The literature review in this chapter reiterates that the number of left-handed hitters on a MLB team is an economic topic that can and should be researched. This chapter laid out the fundamentals behind the study of lefties and leaves room for

improvement in a few areas including the thesis of this paper. Many studies touched

upon in this section analyzed player value, player production, and handedness, but none

correlated all three into a study to find the optimal number of lefties a team should

employ. The following chapter will present the models used for this study, discuss the

variables used, and present the methodology for this study.

CHAPTER II

Theory

The purpose of this chapter is to lay out the framework for models that value statistical success and winning percentage in Major League Baseball, while interpreting various factors pertaining to it. The first section of this chapter discusses the unusual ratio of right-handers to left-handers and raises questions as to why that is. The second section examines team production while developing two models to convey the role left-handed hitters play in that respect. The two regression equations are described in order to test the hypothesis that having between 33% and 55% left-handed hitters is the optimal ratio a team should employ. This paper extends models used by Bradbury (2007)1 and Lewis (2003)2 in order to find the effect a left-handed batter has on his team's success. These studies were conducted on a general basis and will be refined to generate results more directly related to the hypothesis described above. The models and variables presented in this chapter will be tested, and the results will be presented in Chapter IV.

1 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.
2 Michael M. Lewis, Moneyball (New York, N.Y.: W.W. Norton & Company Inc., 2003), 208.


The Left-Handed Hitter

Inside the baseball world the idea that hitting left-handed is an innate advantage has been debated without much success for decades. The belief that a left-hander is far rarer on a child's little league field than on a major league diamond can lead some to the conclusion that left-handers are "naturally superior", "fortunate", or even "luckier," an idea studied by Wood and Aggleton (1989).3 Grondin (1999) notes that in the sport of baseball, batting preference is often inconsistent with hand preference.4 He lists a few reasons for this: a left-hander is closer to the immediate goal (first base), a left-hander after swinging will likely already be heading in the correct direction, and a left-hander has an intrinsic advantage against a right-handed pitcher.5 (The majority of

MLB pitchers are right-handed.) All of these principles align with baseball's inherent

asymmetry.

As discussed in Chapter I and above, the world of baseball has a much different ratio between right-handers and left-handers than seen in everyday society. Grondin

(1999) found that in a sample of more than 7,000 players only 13.7% were left-handed, using throwing hand as the determining factor of handedness, yet over 37% of these same players could bat from the left side.6 This discrepancy continues to fuel the belief that left-handed hitters have an intrinsic advantage.

3 C. J. Wood and J. P. Aggleton, "Handedness in 'fast ball' sports: Do left-handers have an innate advantage?" British Journal of Psychology 80 (1989): 227-240.
4 Simon Grondin, Yves Guiard, Richard B. Ivry, and Stan Koren, "Manual Laterality and Hitting Performance in Major League Baseball," Journal of Experimental Psychology: Human Perception and Performance 25, no. 3 (1999): 747-754.
5 Ibid.
6 Ibid.

These ideas are the main factors behind the hypothesis of this paper and lead many to believe lefties do have a natural advantage over their counterparts. Whether or

not this advantage is portrayed in team success is the main focus of this study and the

results in Chapter IV may shed light on the handedness discrepancy found in Major

League Baseball.

Team Success in MLB

When a game is played in Major League Baseball there are two certainties

everyone can count on: one team will win, and the other will lose. The unquestioned goal

of all professional teams in the field of play is to win the game. The next step in

examining the success of a MLB team can be found in the team's winning percentage. A

team can also decipher their success in a more precise manner by analyzing their overall

team statistics. Using these two approaches, two models can be generated to analyze the

effect left-handed hitters have on these two measures of success.

Two simple regressions were formulated to put this thesis to the test. The first model was

generated to study the statistical success a team incurs noting specifically the number of

left-handed hitters they employ. The second model will calculate team success

specifically recognizing winning percentage. Taking into consideration only team

statistics two regressions can be formulated, and are expressed in equations 2.1 and 2.2.

\[
\text{Runs Scored} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{HR} + \beta_3\,\text{Hits} + \beta_4\,\text{SO} + \beta_5\,\text{SBR} + \beta_6\,\text{BB} + \beta_7\,\text{HBP} + \beta_8\,\text{DP} + \beta_9\,\text{Year} + \varepsilon \tag{2.1}
\]

Where Runs Scored is the total number of runs scored by a team in a season, Left-handed Ratio is a dummy variable accounting for teams that meet the pre-determined LH ratio, HR is the total number of homeruns hit by a team in a season, Hits is the total number of hits a team compiled in that season, SO is the total number of strikeouts a team accumulated while hitting in one season, SBR is the total number of stolen bases in a season minus the total number of times a team was caught stealing in a season, BB is the number of walks a team compiled in one season, HBP is the number of times a team was hit by a pitch, DP is the total number of double plays a team hit into, Year is a variable of time, and ε is the error term.

\[
\text{Winning Percentage} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{OPS} + \beta_3\,\text{Run Ratio} + \beta_4\,\text{Range Factor} + \beta_5\,\text{Year} + \varepsilon \tag{2.2}
\]

In equation 2.2, Winning Percentage is a team's winning percentage in a season, Left-handed Ratio is a dummy variable accounting for teams that meet the pre-determined LH ratio, OPS is a team's OPS for the season, Run Ratio is a team's run ratio score in that season, Range Factor is a team's average Range Factor in that season, Year is a variable of time, and ε is the error term.

Sample and Time Frame

This study will focus on the most recent decade, 2000 to 2009 in the MLB. It will

examine 9 offensive team statistics, one defensive statistic, and one pitching statistic for

all 30 Major League Baseball teams. This data set was compiled using Sean Forman's baseball reference site. Since baseball is indeed a team sport each player's statistics will

not be examined individually, but instead they will be accounted for when composing the

team's overall performance. The remainder of this chapter will describe each variable in

detail, and look at the methodology used to test the model.

Model 1

The first model, equation 2.1, is created to determine the effect that left-handed

hitters have on the runs generated by their team throughout a season. This model will

analyze and determine if a team should employ a ratio of lefties that qualifies within the

pre-determined left-handed ratio in order to maximize how many runs they can score. In

this regression there is one dependent variable, Runs Scored and nine independent

variables to be described below. Using these variables plus an error term, the regression

will be run to find the appropriate number of lefties a MLB team should use to maximize

the number of runs they score in a single season.

Dependent Variable

In baseball, as in most sports, a team tries to accumulate more points than the

opposing team in order to win the game. In baseball these points are commonly referred

to as runs (R). A run is counted when a player successfully navigates his way around the

bases without being forced out by the opposing team. If a player accomplishes this, his

team will be credited with one point. The team who has scored the most runs at the end

of the game is then deemed to be the winner.

Hitters

In baseball, more than in most sports, statistics are the center of attention when dissecting performance. In the MLB different stats are kept for hitters and pitchers. This model will take into account only statistics obtained by hitters. The exogenous variables listed above are considered to be some of the most important offensive statistics kept, and for that reason they are included in this model. The coveted triple crown, widely believed to be the most impressive feat in baseball, is achieved when a player leads his league in runs-batted-in (RBI), homeruns (HR), and batting average (BA). According to

Amspacher (2010)7 only 14 players in the history of the MLB have accomplished this feat. The last person to obtain this feat was Carl Yastrzemski in 1967; that season he accumulated 121 RBI, 44 HR, and a .326 BA.8 For this reason it must be noted that two

statistics that were removed from this model, BA and RBI, will be accounted for in

different ways. RBI is essentially the same as runs scored when looking at overall team

statistics, and BA is accounted for when calculating OPS. OPS, which is On-Base

Percentage plus Slugging Percentage, will later be described in detail for its use in model

two. For these reasons the models used in this study should capture essentially every

important offensive statistic in baseball.

Independent Variables

The first independent variable in this model is hits (H). A player is credited with a hit when they hit the baseball into the field of play and reach base safely. We also must note that a player can hit the ball into play and reach base safely without getting credit for a hit. For this to happen the opposing team must force out another one of the hitter's teammates (fielder's choice (FC)) or be charged with an error (E) by the scorekeeper. Hits are a very useful measure of the success a team has while hitting, and for that reason will be one of the most important variables in this study.

7 "The Triple Crown - A Brief Look at One of Baseball's most Coveted and Elusive Feats," [cited 2010]. Available from http://www.psacard.com/articles/article3547.chtml.
8 Ibid.

A player can get four different types of hits, one of which is a homerun (HR). A homerun is achieved when the batter is able to hit the pitch over the fence, or "out of the park" in fair territory. A hitter can also be credited with a homerun if he is able to hit the ball in play and navigate his way all the way around the bases without being forced out.

This is known as an "inside the park" homerun. This variable is very important in measuring the power of a team and thus will be a very important variable as well.

The third independent variable in this regression is strikeouts (SO). If a pitcher is able to get three strikes on a batter before the batter can reach first base or hit the ball into the field of play, the batter is said to have struck out. This is very effective in measuring the inability of a team to make contact at the plate. Since making contact is, for the most part, a main goal when a hitter gets to the plate, strikeouts help measure a major flaw, and thus will most likely have a negative effect on runs scored.

Base-on-balls (BB) is the next exogenous variable used in this regression. This is

also often referred to as a walk, and it is achieved when a player can successfully take

four balls (a pitch that the umpire deems to be outside the strike zone) without being

forced out. When this is done a player is awarded first base and for this reason some

consider this "free-pass" a very underrated play in baseball. This variable will help

measure positive results for hitters that do not show up in other statistics such as hits and

homeruns.

A player can also receive a "free-pass" to first base if they are hit by a pitch (HBP). This is sometimes believed to be out of the hitter's control, but it does put a hitter on first base and give them an opportunity to score a run for their team. For this reason it must be included in a model that uses runs as the dependent variable. This variable in turn should make the model much more accurate.

The next variable, stolen bases ratio (SBR), is a combination of two variables, stolen bases (SB) and caught stealing (CS). A SB is something that a player achieves not at the plate, but instead while they are on the base paths after successfully reaching first base. To steal a base a runner must successfully move up one base without the hitter putting the ball in play and without being forced out by the opposing team. CS is the counterpart of SB: it occurs when a player attempts to steal a base but is put out by the opposing team before they can do so. At this point the player is considered out and is no longer able to score a run for his team. Since SB and CS are so closely related, subtracting a team's total number of CS from its total number of SB yields the team's SBR score. Since base stealing can play such a large role in run production, SBR must be included in the model.
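The subtraction described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample season totals are hypothetical and are not drawn from the data set used in this study.

```python
def stolen_bases_ratio(stolen_bases: int, caught_stealing: int) -> int:
    """Return SBR as the text defines it: SB minus CS (a difference, not a quotient)."""
    if stolen_bases < 0 or caught_stealing < 0:
        raise ValueError("season totals cannot be negative")
    return stolen_bases - caught_stealing


# Hypothetical season totals, for illustration only (not from the data set).
example_sbr = stolen_bases_ratio(stolen_bases=110, caught_stealing=35)
print(example_sbr)
```

A team that is caught more often than it succeeds would receive a negative SBR under this definition.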

The next independent variable based upon productivity in this model is hit into a double play (DP). This happens when a batter hits the ball into play but the defense is able to not only force out the hitter, but also another player on his team. This is very detrimental to a team's ability to score runs, and for that reason very important to the model at hand.

Year (Y) will also be an independent variable used in this model. It will give a running measurement of time and will help identify any discrepancies in the data based upon possible yearly irregularities.

The final exogenous variable in this model is Left-handed Ratio (LH). This variable has no statistical value to the team's success, but is the main focus of this thesis.

A hitter can be classified in three ways, left-handed, right-handed (RH), or a switch-hitter

(SH). For the purpose of this model only players who have 130 at-bats (AB) or more will be taken under consideration. The number 130 is chosen based on the precedent that a hitter be considered a rookie until they accumulate, in one season, 130 AB, or remain on the active roster for 45 days prior to September 1st.9 If a player achieves either of these plateaus in a season they will be eligible for the rookie of the year award and no longer be considered a rookie the following year. This determinant was set forth in 1971 in regard to rookie of the year voting,10 and will be an acceptable cut-off for this study as well.

Hitters who reach the 130 AB plateau in the season of note will be classified in one of the three categories listed above. The idea behind this thesis is that a team will want to employ between 33% and 55% of their hitters as lefties to better match the asymmetry of baseball. This is a dummy variable, and teams who have the aforementioned ratio will be

recorded as one, while teams who do not will be recorded as zero. This variable should

give us the results desired for this study and it will also be present in model two.
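A minimal sketch of how the Left-handed Ratio dummy could be coded is given below. The roster is hypothetical, and two choices are assumptions rather than rules stated in the text: switch-hitters are counted as non-left-handed, and teams sitting exactly at 33% or 55% are treated as inside the band.

```python
from typing import Iterable, Tuple

MIN_AT_BATS = 130            # qualification cut-off stated in the text
LOWER, UPPER = 0.33, 0.55    # hypothesized optimal band for left-handed hitters


def left_handed_dummy(hitters: Iterable[Tuple[str, int]]) -> int:
    """Return 1 if the share of qualified left-handed hitters is in the band, else 0.

    Each hitter is a (bats, at_bats) pair, where bats is "L", "R", or "S".
    Assumptions: switch-hitters count as non-left-handed, and the band
    endpoints are treated as inclusive.
    """
    qualified = [bats for bats, at_bats in hitters if at_bats >= MIN_AT_BATS]
    if not qualified:
        raise ValueError("no hitter reached the at-bat cut-off")
    share = sum(1 for bats in qualified if bats == "L") / len(qualified)
    return 1 if LOWER <= share <= UPPER else 0


# Hypothetical roster, for illustration only.
roster = [("L", 520), ("R", 480), ("L", 310), ("S", 205),
          ("R", 129), ("R", 610), ("L", 90)]
print(left_handed_dummy(roster))
```

In this roster five hitters qualify and two of them bat left, a 40% share, so the team would be recorded as one.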

9 "Mlb.Com - MLB Miscellany: Rules, regulations, and statistics," [cited 2010]. Available from http://www.mlb.com/.
10 "Britannica.Com - Facts about Rookie of the Year: baseball," [cited 2010]. Available from http://www.britannica.com/.

Model 2

The second model used in this study, equation 2.2, will examine the effect left-handed hitters have on a team's winning percentage, and as in model one will be used to determine the optimal number of lefties a team should use to maximize their success. In this model the dependent variable is winning percentage, while the independent variables are On-Base Percentage plus Slugging Percentage (OPS), Run Ratio, Range Factor, Left-handed Ratio and Year. Using these variables plus an error term a regression will be run to

determine what role left-handed hitters have on a team's winning percentage.

Dependent Variable

The dependent variable at hand, winning percentage (W%) is a fairly simple

variable to compute. As noted earlier, in each game played one team will win the game,

and one team will lose the game. At that point the team who is deemed the winner will

be credited with a win (W), while the other is credited with a loss (L). To determine

winning percentage, take the total number of wins a team accumulates in a season and

divide that number by the total number of games played (GP). This is shown in equation

2.3.

W% = W/GP 2.3

where W% results in a decimal recording the percent of games a team wins in one

season.
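As a small worked example of equation 2.3, the sketch below computes W% for two teams. The win totals of 103 and 59 over a 162-game season are assumptions, chosen because they are consistent with the .636 and .364 winning percentages this chapter reports for the 2009 New York Yankees and Washington Nationals; the function name is hypothetical.

```python
def winning_percentage(wins: int, games_played: int) -> float:
    """Equation 2.3: wins divided by games played."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return wins / games_played


# Assumed win totals over a 162-game season (not stated in this chapter).
yankees_2009 = winning_percentage(103, 162)
nationals_2009 = winning_percentage(59, 162)
print(f"{yankees_2009:.3f} {nationals_2009:.3f}")
```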

Team

As stated previously, many different statistics are kept for players in the MLB. Using

these stats, team statistics can easily be compiled. This is valuable since team success is directly correlated with player productivity. Some team statistics are team averages, such as OPS. Other team statistics such as runs, homeruns, and stolen bases are team totals. The three statistics used in this model include two statistics, Range Factor and

OPS, widely believed to be the most successful in portraying success in their respective

categories.11 The final variable Run Ratio is a statistic generated for this study, not

formally used by the MLB, but is based on Sabermetric statistic DICE, which is widely

regarded as a very efficient way of computing pitching efficiency in Major League

Baseball. All three will be explained in detail below.

Independent Variables

The first independent variable in this regression is On-Base Percentage plus

Slugging Percentage (OPS). This statistic is part of the analysis of baseball,

which was developed by John Thorn and Pete Palmer.12 Bradbury (2007) finds that the

explained variance in run production of OPS is 90 percent, which is an unprecedentedly

high correlation.13 Calculating OPS is a three-step process.

First On-Base-Percentage (OBP) must be calculated. OBP is a fairly

straightforward calculation and it is very similar to a player's batting average (BA). OBP

is the total number of times a player successfully reaches first base divided by their total

number of plate appearances (PA). This stat differs from BA in that it includes not only

hits, but also BB and HBP. As explained earlier, BB and HBP can play an important role in team success; therefore the use of BA in this study is unnecessary even though it is widely perceived in the baseball world to be of utmost importance. The calculation for OBP is shown in equation 2.4.

11 J. C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.
12 John Thorn and Pete Palmer, The Hidden Game of Baseball: A Revolutionary Approach to Baseball and Its Statistics (New York: Dolphin, 1984).
13 Ibid.

OBP= (H+BB+HBP)/PA 2.4

This calculation will generate a decimal number reflecting the percentage of times a player reaches first base safely excluding errors.

The second part of OPS is Slugging Percentage (SLG), which is usually used to measure a player's power. A power hitter's primary duty to the team is to hit homeruns and get extra base hits. These players are often found in the middle of the batting order, and more often than not they will also be toward the top of the team in SLG. To calculate SLG, the total number of bases (TB) accumulated by the hitter is divided by the player's number of at-bats (AB). (Note that a BB or HBP is not considered an AB, which explains the difference between AB and PA.) A hitter accumulates four bases for a HR, three bases for a triple, two bases for a double, and one base for a single. The SLG calculation is found in equation 2.5 below.

SLG= TB/AB 2.5

Finally, to calculate OPS, SLG and OBP are added together. The calculation is laid out in equation 2.6.

OPS= OBP + SLG 2.6
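The three steps in equations 2.4 through 2.6 can be sketched in Python as follows. The function names and the sample batting line are hypothetical, and OBP is computed exactly as equation 2.4 states it (with plate appearances in the denominator), which may differ slightly from how official sources compute it.

```python
def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       plate_appearances: int) -> float:
    """Equation 2.4: (H + BB + HBP) / PA."""
    return (hits + walks + hit_by_pitch) / plate_appearances


def slugging_percentage(singles: int, doubles: int, triples: int,
                        home_runs: int, at_bats: int) -> float:
    """Equation 2.5: total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp: float, slg: float) -> float:
    """Equation 2.6: OBP plus SLG."""
    return obp + slg


# Hypothetical batting line: 150 hits (95 1B, 30 2B, 5 3B, 20 HR),
# 60 BB, 5 HBP, 500 AB, 570 PA.
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5, plate_appearances=570)
slg = slugging_percentage(singles=95, doubles=30, triples=5, home_runs=20, at_bats=500)
print(f"OBP={obp:.3f} SLG={slg:.3f} OPS={ops(obp, slg):.3f}")
```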

Table 2.1 and table 2.2 present the regular season OPS numbers for the best and worst teams in the 2009 regular season respectively. These tables provide examples of OPS for each spot in the batting order using each team's most common starting line-up that season. This reiterates Bradbury's claim that OPS, more than any other offensive statistic, helps model team success. These tables will show that OPS can be very important in distinguishing a good team from a bad team.

Table 2.114

New York Yankees Batting Order and OPS for 2009 Season

POSITION  PLAYER            STATS
1.                          .874 OPS as #1 batter (693 PA)
2.        Johnny Damon      .854 OPS as #2 batter (591 PA)
3.                          .950 OPS as #3 batter (702 PA)
4.        Alex Rodriguez    .935 OPS as #4 batter (533 PA)
5.        Hideki Matsui     .849 OPS as #5 batter (269 PA)
6.        Jorge Posada      .965 OPS as #6 batter (246 PA)
7.        Robinson Cano     .891 OPS as #7 batter (250 PA)
8.        Melky Cabrera     .659 OPS as #8 batter (227 PA)
9.        Melky Cabrera     .796 OPS as #9 batter (132 PA)

Table 2.215

Washington Nationals Batting Order and OPS for 2009 Season

POSITION  PLAYER            STATS
1.        Cristian Guzman   .752 OPS as #1 batter (232 PA)
2.                          .858 OPS as #2 batter (336 PA)
3.        Ryan Zimmerman    .882 OPS as #3 batter (634 PA)
4.                          .935 OPS as #4 batter (606 PA)
5.        Josh Willingham   .793 OPS as #5 batter (351 PA)
6.        Elijah Dukes      .809 OPS as #6 batter (154 PA)
7.        Josh Bard         .537 OPS as #7 batter (170 PA)
8.        Wil Nieves        .629 OPS as #8 batter (232 PA)
9.        John Lannan       .273 OPS as #9 batter (65 PA)

14 "ESPN - New York Yankees News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball," [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=nyy.
15 "ESPN - Washington Nationals News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball," [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=was.

Tables 2.1 and 2.2 reveal not only the difference in OPS between the number nine hitter and the number four hitter, but also the importance of the team as a whole. While the number three and four hitters for both clubs all achieved a good OPS score, the discrepancy between the less talented hitters, the seven through nine hitters, and the more talented hitters, three through five, is strikingly larger for Washington than New York. This indicates that a team cannot rely solely on its best few players, and instead needs to have a quality team throughout in order to be successful. This shows the value of looking at overall team statistics in determining team success, as opposed to looking at individual statistics. Bradbury (2007) is shown to be very accurate in saying that a team's OPS is highly predictive of team success.16 Consider the example at hand: averaging the nine lineup values in each table, New York had an average OPS of .864 throughout their lineup, while Washington achieved only a .719 OPS average. This discrepancy in average OPS accompanied a vast difference in winning percentage. In 2009 the New York Yankees had a winning percentage of .636 while the Washington Nationals had a winning percentage of only .364.
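The lineup comparison can be reproduced with a short Python sketch that uses the nine OPS values printed in Tables 2.1 and 2.2. The helper names are hypothetical, and the averages are simple unweighted means over the nine lineup spots (plate appearances are ignored), which is an assumption about how the comparison is made.

```python
# OPS by lineup spot (1 through 9), as printed in Tables 2.1 and 2.2.
yankees_ops = [0.874, 0.854, 0.950, 0.935, 0.849, 0.965, 0.891, 0.659, 0.796]
nationals_ops = [0.752, 0.858, 0.882, 0.935, 0.793, 0.809, 0.537, 0.629, 0.273]


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


def middle_vs_bottom_gap(lineup):
    """Average OPS of hitters 3-5 minus average OPS of hitters 7-9."""
    return mean(lineup[2:5]) - mean(lineup[6:9])


for name, lineup in (("New York", yankees_ops), ("Washington", nationals_ops)):
    print(f"{name}: mean OPS {mean(lineup):.3f}, "
          f"3-5 vs 7-9 gap {middle_vs_bottom_gap(lineup):.3f}")
```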

The second independent variable in Model 2 is run ratio (RR). This variable is used to value a team's pitching in a positive manner. In 2001 Sabermetrician Clay Dreslough developed a statistic that has the ability to decipher a team's pitching success independently from their defense. This approach was generated using the Defense Independent Pitching Statistics model commonly known as DIPS. Dreslough's new adaptation to this field is called Defense-Independent Component ERA (DICE). (ERA is the number of earned runs a team or pitcher gives up per nine innings, nine being the number of innings in a standard game.) DICE has a somewhat more complex formula than any other variable presented in this chapter.

16 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.
17 "MLB.com - The Official Site of Major League Baseball," [cited 2010]. Available from http://mlb.mlb.com/mlb/standings/?ymd=20091104&tcid=mm_mlb_standings.

To calculate a team's DICE score the following statistics are needed: HR allowed, BB given up, opposing batters HBP, SO recorded, and innings pitched (IP). Equation 2.7 expresses the formula associated with the DICE method.

DICE = 3 + [(13HR + 3(BB + HBP) - 2SO)/ IP] 2.7

This formula provides a representation of a team's ERA excluding the defensive aspect of the typical ERA. The lower this number is the better a team's pitching staff. For this reason this study will institute a new formula to generate a team's run ratio (RR) in order to credit a team with good pitching in a positive manner.

In order to turn a low score, a good DICE score, into a high score, the highest earned run average (ERA) ever surrendered by any team within the time period of this study must be found. (Earned runs are runs given up that were not deemed to be scored due to any error made.) This is found with the 2001 team whose ERA for that season was 5.71.18 Knowing that no team has had a higher (worse) ERA during this span, it is then known that this number will always represent the worst pitching team in our era.

From here the calculation of each team's hypothetical RR can be found in order to reward teams with the lowest DICE. Doing so will credit a team who achieves a lower (better)

DICE value with a better RR value. Equation 2.8 shows the simple math needed to

compute RR.

RR= 5.71-DICE 2.8

18 "Baseball-Reference.Com - Major League Baseball Statistics and History," [cited 2010]. Available from http://www.baseball-reference.com/.

This value will represent the difference between the worst possible ERA and a team's

DICE score giving teams with the lowest DICE rating the highest RR value. This will help identify the positive contribution of pitching to winning percentage.
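Equations 2.7 and 2.8 can be sketched together in Python. The function names and the pitching totals below are hypothetical, and innings pitched is assumed to be supplied as a true decimal number of innings rather than in the box-score notation where ".1" means one third of an inning.

```python
WORST_ERA = 5.71  # highest team ERA in the 2000-2009 sample, as stated in the text


def dice(home_runs: int, walks: int, hit_by_pitch: int,
         strikeouts: int, innings_pitched: float) -> float:
    """Equation 2.7: DICE = 3 + (13*HR + 3*(BB + HBP) - 2*SO) / IP."""
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return 3.0 + numerator / innings_pitched


def run_ratio(dice_value: float, worst_era: float = WORST_ERA) -> float:
    """Equation 2.8: RR = 5.71 - DICE, so better pitching gives a larger value."""
    return worst_era - dice_value


# Hypothetical team pitching totals, for illustration only.
team_dice = dice(home_runs=170, walks=520, hit_by_pitch=50,
                 strikeouts=1100, innings_pitched=1450.0)
team_rr = run_ratio(team_dice)
print(f"DICE={team_dice:.3f} RR={team_rr:.3f}")
```

Because RR is a constant minus DICE, using RR instead of DICE in the regression only flips the sign of the estimated coefficient; it does not change the fit.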

The third independent variable in this model is range factor (RF). According to

Sabermetric supporter Bradbury, RF gives a better valuation of a player's defensive value

than their .19 To compute this value one must take a player's number

of putouts (PO), assists (A), and number of games played (GP) at a given position.

Equation 2.9 demonstrates this computation.

RF = (PO + A)/ GP 2.9

Since this stat is recorded on an individual basis, the team average must be calculated.

To keep this study balanced only the position players who qualify for the hitting

statistics, 130 AB or more, will be used in the formulation of a team's average RF. From

there the player with the most games played at each position will be counted and then

averaged to determine a team RF. This value will give the most accurate representation

of team defense and the role it plays in team success.
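The team Range Factor procedure described above (filter on 130 AB, keep the player with the most games at each position, then average) can be sketched as follows. The `Fielder` record, the function names, and the sample players are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Fielder(NamedTuple):
    name: str
    position: str
    putouts: int
    assists: int
    games: int
    at_bats: int


def range_factor(putouts: int, assists: int, games: int) -> float:
    """Equation 2.9: (PO + A) / GP."""
    return (putouts + assists) / games


def team_range_factor(fielders: List[Fielder], min_at_bats: int = 130) -> float:
    """Average RF over the qualifying player with the most games at each position."""
    regulars: Dict[str, Fielder] = {}
    for f in fielders:
        if f.at_bats < min_at_bats:
            continue  # does not meet the hitting cut-off
        current = regulars.get(f.position)
        if current is None or f.games > current.games:
            regulars[f.position] = f
    if not regulars:
        raise ValueError("no qualifying fielders")
    factors = [range_factor(f.putouts, f.assists, f.games) for f in regulars.values()]
    return sum(factors) / len(factors)


# Hypothetical fielders, for illustration only.
sample = [
    Fielder("A", "SS", 220, 420, 150, 600),
    Fielder("B", "SS", 40, 80, 30, 140),     # fewer games at SS than A
    Fielder("C", "1B", 1200, 90, 140, 550),
    Fielder("D", "CF", 350, 8, 145, 580),
    Fielder("E", "CF", 400, 10, 150, 100),   # below the 130 AB cut-off
]
team_rf = team_range_factor(sample)
print(f"{team_rf:.3f}")
```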

The last two independent variables in this model are Left-handed Ratio (LH), and

Year (Y). These variables will be calculated and incorporated into the model in the exact

same way they were in model one. LH will again be the primary focus of the model in

terms of identifying any impact a left-handed hitter has on a team's winning percentage.

19 J.C. Bradbury, The Baseball Economist: The Real Game Exposed (New York, N.Y.: Dutton, 2007), 336.

Method

The results presented in Chapter IV will be obtained by running two regressions. In order to determine if having a ratio of lefties between 33% and 55% corresponds with the highest winning percentage and the highest number of runs scored, teams will be run through the regressions using LH as the dummy variable described above. This will be done to assess the thesis of this paper that a team should employ between 33% and 55% of their hitters as left-handers in order to maximize their runs scored and winning percentage.
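As a rough illustration of the regression step, the sketch below fits an equation of the same form as equation 2.2 by ordinary least squares, using only the Python standard library. Everything numeric here is made up: the eight team seasons, the assumed coefficients, and the winning percentages generated from them are hypothetical, and the estimation software actually used for this study is not specified in this chapter. The example only checks that the fitting routine recovers the coefficients that generated the synthetic data.

```python
from typing import List, Sequence


def ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0, *row] for row in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * target for x, target in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("regressors are (nearly) collinear")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical team seasons: (LH dummy, OPS, Run Ratio, Range Factor, year index).
seasons = [
    (1, 0.780, 1.50, 4.40, 0), (0, 0.720, 0.90, 4.25, 1),
    (1, 0.750, 1.20, 4.55, 2), (0, 0.800, 1.70, 4.30, 3),
    (1, 0.700, 0.60, 4.60, 4), (0, 0.760, 1.10, 4.50, 5),
    (1, 0.810, 1.40, 4.35, 6), (0, 0.735, 1.30, 4.45, 7),
]
# Made-up coefficients (intercept, LH, OPS, RR, RF, Year) used to build synthetic W%.
assumed = [-0.60, 0.010, 1.00, 0.08, 0.05, 0.001]
win_pct = [assumed[0] + sum(b * v for b, v in zip(assumed[1:], s)) for s in seasons]

fitted = ols(seasons, win_pct)
print([round(b, 4) for b in fitted])
```

With real data the dependent variable would carry noise, so the fitted coefficients would be estimates with standard errors rather than exact recoveries; this sketch does not compute those.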

Table 2.3

Expected Signs for Independent Variables for Model 1

Independent Variable Predicted Sign

Hits Positive

Homeruns Positive

Strike Outs Negative

Base on Balls Positive

Hit By Pitch Positive

Stolen Bases Ratio Positive

Hit into Double Play Negative

Year Unknown

Left-handed Hitters Negative if a team has less than 33%
                    Unknown if a team has 33%
                    Positive if a team is between 33% and 55%
                    Unknown if a team has 55%
                    Negative if a team has more than 55%

Table 2.4

Expected Signs for Independent Variables for Model 2

Independent Variable Predicted Sign

OPS Positive

Run-Ratio Positive

Range Factor Positive

Year Unknown

Left-Handed Hitters Negative if a team has less than 33%
                    Unknown if a team has 33%
                    Positive if a team is between 33% and 55%
                    Unknown if a team has 55%
                    Negative if a team has more than 55%

Conclusion

This chapter developed the theory and mathematics behind player and team production. The goal of this section was to bring forward the framework of this study, and to analyze what factors into a team's success. The theory discussed in this chapter was used to develop two models which will be used to determine how many left-handed hitters a team should have in order to maximize their team success. This chapter provided an in-depth description of all the relevant variables in this study. It also explained the methodology used to analyze the model. The following chapter will describe the data collected for this study while Chapter IV will provide results for the models above. Finally, Chapter V will draw conclusions from the results concerning the theory of left-handed hitting, and discuss its impact on Major League Baseball.

CHAPTER III

DATA

This chapter will explain the data set collected to test the hypothesis described in the previous theory chapter. It will begin by describing the sources consulted in the collection of data and explain the method used to collect it. The second section will present summary statistics for each variable: range, mean, median, and other statistics.

Data Resources

The majority of the data in this study was collected from the website baseball-reference.com.1 This online database accounted for all but one statistic, Range Factor, needed to test the hypothesis of this paper. The Range Factor statistic used in model two was found using the online Baseball Almanac.2 Statistical information on all 30 teams was recorded using the team statistics explained previously, over ten years, resulting in 300 team seasons. These 300 team seasons provided 21 different variables, for a total of 6300 data points collected for this study.

Although 6300 data points were collected, some were combined to formulate variables discussed in the previous chapter, such as Run Ratio.

1 "Baseball-reference.com - Teams," [cited 2010]. Available from http://www.baseball-reference.com/teams/batteam/shtml.

2 "Baseball-almanac.com - The Official Baseball History Site," [cited 2010]. Available from http://www.baseball-almanac.com/.


The sample used for the collection of data was Major League Baseball. Players and teams in professional leagues outside of MLB were not accounted for in this study. Statistics were collected for every MLB team from the 2000 season through the 2009 season. Although each player who took part in a Major League Baseball game was accounted for, the main focus of this study (left-handed hitting) was determined only by players who accumulated at least 130 at bats in that particular season. A player meeting this threshold was deemed a qualified hitter; on average, this resulted in approximately 13.4 qualified hitters per team. All pitching statistics were found using team totals, while the sole defensive statistic, Range Factor, was collected using the team average described in detail previously.
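The qualification rule above can be sketched as follows. The roster, the `Hitter` record, and the function name are hypothetical, and counting switch hitters as not left-handed is an assumption of this sketch rather than a rule stated in the text.

```python
from dataclasses import dataclass

MIN_AT_BATS = 130  # qualification threshold stated in the text


@dataclass
class Hitter:
    name: str
    bats: str      # "L", "R", or "S" (switch); this coding is an assumption
    at_bats: int


def left_handed_share(roster):
    """Return (number of qualified hitters, share of them batting left-handed)."""
    qualified = [h for h in roster if h.at_bats >= MIN_AT_BATS]
    if not qualified:
        return 0, 0.0
    lefties = sum(1 for h in qualified if h.bats == "L")
    return len(qualified), lefties / len(qualified)


# Hypothetical five-man roster; hitter C falls one at bat short of qualifying.
roster = [
    Hitter("A", "L", 512),
    Hitter("B", "R", 430),
    Hitter("C", "L", 129),
    Hitter("D", "S", 300),
    Hitter("E", "L", 130),
]
n_qualified, lh_share = left_handed_share(roster)
print(n_qualified, lh_share)
```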

Data Analysis

Table 3.1 presents summary statistics for each variable collected. These include minimum, maximum, mean, median, and standard deviation.

Table 3.1

Descriptive Statistics

Variable                                       Abbreviation  Minimum  Maximum  Mean     Median   Standard Deviation

Runs Scored                                    R             574      978      770.61   768      77.93
Hits                                           H             1300     1667     1475.04  1469     74.31
Homeruns                                       HR            94       260      173.83   171      32.62
Stolen Base Ratio                              SBR           0        154      54.55    52       25.44
Base on Balls                                  BB            363      775      541.63   535.5    71.62
Strikeouts                                     SO            805      1399     1062.49  1054.5   108.92
Double Plays                                   DP            113      204      152.74   153      17.06
Hit by Pitch                                   HBP           29       103      58.46    57       13.22
Hit by Pitch (Pitchers)                        P. HBP        27       95       58.46    58       12.94
Innings Pitched                                P. IP         1412.2   1484.2   1443.30  1443.2   12.60
Homeruns Allowed                               P. HR         116      239      173.83   171.5    24.12
Base on Balls (Pitcher)                        P. BB         348      728      541.63   538      68.30
Strikeouts (Pitcher)                           P. SO         764      1404     1062.49  1057     105.63
Range Factor                                   RF            4.22     4.71     4.46     4.45     .092
On-Base-Percentage plus Slugging Percentage    OPS           .671     .851     .758     .756     .034
Winning Percentage                             Winning %     .27      .72      .50      .51      .07
Left-handed Hitting Ratio                      LH Ratio      0        1        .4       0        .49
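The five statistics reported in Table 3.1 can be computed for any one variable as in the sketch below. The five season totals are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was reported.

```python
import statistics


def describe(values):
    """Return the five summary statistics reported for each variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumption)
    }


# Hypothetical runs-scored totals for five team seasons.
runs = [650, 700, 750, 800, 850]
summary = describe(runs)
print(summary)
```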

Variable Explanation

Table 3.1 presents the breakdown of each variable used in this study, and one can see that there were no unexpected irregularities apart from Left-handed Ratio. Because it is a dummy variable, its mean is the share of team seasons that fell within the range, and that mean is noticeably lower than it would be if team seasons were evenly split between the two categories. With a mean of .4, it is clear that more team seasons (roughly 60%) fell outside the LH Ratio range put forth in the previous chapter than fell within it. Before ever running regressions, it appears that most teams do not believe that they should have between 33% and 55% left-handed hitters in order to optimize success. Whether or not this leads to a lower success rate on the baseball field will be discussed in detail in the chapter to follow.
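The LH Ratio row of Table 3.1 can be checked for internal consistency. A 0/1 variable with mean p has a standard deviation of about the square root of p(1 - p), so a mean of .4 implies a value near .49 and a median of 0. The sketch below assumes the reported mean is exactly .4, which may hide rounding, so the count of 120 team seasons is approximate.

```python
import math
import statistics

n_seasons = 300   # team seasons in the sample
mean_lh = 0.4     # mean of the LH dummy as reported in Table 3.1 (rounded)

# Implied number of team seasons inside the 33%-55% range.
in_range = round(n_seasons * mean_lh)
lh = [1] * in_range + [0] * (n_seasons - in_range)

std_formula = math.sqrt(mean_lh * (1 - mean_lh))  # population form
std_sample = statistics.stdev(lh)                 # sample (n - 1) form
median_lh = statistics.median(lh)

print(in_range, round(std_formula, 2), round(std_sample, 2), median_lh)
```

Both forms of the standard deviation round to .49, which agrees with the table.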

Conclusion

This chapter briefly explained how the data was collected for this study while documenting the patterns found within the data itself. The variable of note, LH Ratio, gave an eye-opening result that was much different than expected. The following chapter will explain the results obtained from this study and use them to test the hypothesis that having between 33% and 55% left-handed hitters is the optimal ratio a team should employ.

CHAPTER IV

RESULTS

This chapter will examine the results produced by the regressions described in Chapter II. Two regressions were run, with total runs scored and team winning percentage as the dependent variables. The two regressions were run using a combination of the twelve independent variables described in the theory chapter. The regression models were performed precisely as described in Chapter II, using ten years of data from Major League Baseball. This chapter will first analyze the results obtained from running the two regressions while also discussing any issues found or tests run on these models.

Regression Results

There are two models to be discussed in this section. The first model used Runs Scored as the dependent variable and had nine independent variables, one of which was left-handed hitting ratio, the focal point of this study. The second model used team winning percentage as its dependent variable and had five independent variables, again including left-handed hitting ratio. Each regression was run using the data set of 6300 data points collected over the last decade of MLB play.


The following results for each regression demonstrate the impact each variable had in accounting for the two dependent variables. The regression results are presented in the tables below with their coefficients and corresponding t-statistics. Any independent variable with a t-statistic below 1.833 in absolute value for model 1, or below 2.015 for model 2, was deemed to be insignificant at the 5% significance level for that regression.

Model 1

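$$\text{Runs Scored} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{HR} + \beta_3\,\text{Hits} + \beta_4\,\text{SO} + \beta_5\,\text{SBR} + \beta_6\,\text{BB} + \beta_7\,\text{HBP} + \beta_8\,\text{DP} + \beta_9\,\text{Year} + \varepsilon \tag{4.1}$$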

Equation 4.1 above illustrates the formula used in model 1. Six coefficients in the Runs Scored regression, the constant and five independent variables, are deemed to be significant at the 5% level. The first, Homeruns, is shown to have a very strong effect in this model and was also significant at the 1% level. Just as with Homeruns, Hits was found to have a very strong impact in the model and was significant at the 1% level. Stolen Base Ratio was also a significantly positive variable in model 1. The final two significant variables in this model are Base-on-Balls and Hit by Pitch, both of which were found to have a positive impact on runs scored even at the 1% level. It should be noted that of the four variables found to be insignificant at the 5% level, two of them, Hit into Double Play and Year, were deemed to be significant at the 10% level. The variable of note, Left-handed Ratio, was found to be insignificant in this model, as was Strikeouts.

Table 4.1

Regression Results: Runs Scored

Variable Coefficient t-Statistic

Constant -532.5778 -12.80129***

Left-handed Ratio -2.227669 -0.771776

Homeruns 0.844847 15.87223***

Hits 0.641768 27.56803***

Strikeouts -0.017299 -1.108172

Stolen Base Ratio 0.104606 1.834615**

Base-on-balls 0.330899 14.56911***

Hit by Pitch 0.440804 3.966272***

Hit into Double Play 0.143210 1.692823*

Year -0.779496 -1.537898*

R-squared             0.906078    Observations            300
Adjusted R-squared    0.903153    Adjusted Observations   299

* Significant at 10%, ** Significant at 5%, *** Significant at 1%
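As an illustration of how Equation 4.1 is applied, the coefficients of Table 4.1 can be evaluated at the sample means of Table 3.1. The coding of the Year variable is not stated in the text; the sketch assumes an index running from 0 to 9, so its mean of 4.5 is an assumption. Under that assumption the fitted value lands close to the mean of 770.61 runs reported in Table 3.1.

```python
# Coefficients copied from Table 4.1.
COEF = {
    "constant": -532.5778,
    "lh_ratio": -2.227669,
    "hr": 0.844847,
    "hits": 0.641768,
    "so": -0.017299,
    "sbr": 0.104606,
    "bb": 0.330899,
    "hbp": 0.440804,
    "dp": 0.143210,
    "year": -0.779496,
}

# Sample means from Table 3.1; the year index of 4.5 is an assumption.
MEANS = {
    "lh_ratio": 0.4,
    "hr": 173.83,
    "hits": 1475.04,
    "so": 1062.49,
    "sbr": 54.55,
    "bb": 541.63,
    "hbp": 58.46,
    "dp": 152.74,
    "year": 4.5,
}


def predict_runs(x):
    """Evaluate Equation 4.1 for one team season described by the dict x."""
    return COEF["constant"] + sum(COEF[name] * value for name, value in x.items())


predicted = predict_runs(MEANS)
print(round(predicted, 1))
```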

The independent variables in the runs scored regression account for approximately 91% of the variation in runs scored by a team in one season. The correlation matrix generated from this regression showed no signs of multicollinearity. This regression also showed no normality, serial correlation, or heteroskedasticity problems. The Jarque-Bera (JB) test produced a chi-square statistic below the critical value of 16.92 with 9 degrees of freedom at a 5% significance level.1

1 Chi-squared for data in runs scored regression is 1.470286.

The results found in this model did reveal a relatively high R-squared value, but the variable of note, Left-handed Ratio, was found to be insignificant at all three levels listed above. The coefficients in the model all met the predicted signs from Chapter II except for Left-handed Ratio and Hit into Double Play. The main conclusions to be drawn from this model will be discussed in detail in the chapter to follow.
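The 5% decision rule stated at the start of this section can be applied to the t-statistics of Table 4.1 as in the sketch below. The cutoff of 1.833 is the value given in the text for model 1; the function name and the use of absolute values are choices made for this sketch.

```python
# t-statistics for the independent variables, copied from Table 4.1.
T_STATS_MODEL_1 = {
    "Left-handed Ratio": -0.771776,
    "Homeruns": 15.87223,
    "Hits": 27.56803,
    "Strikeouts": -1.108172,
    "Stolen Base Ratio": 1.834615,
    "Base-on-balls": 14.56911,
    "Hit by Pitch": 3.966272,
    "Hit into Double Play": 1.692823,
    "Year": -1.537898,
}

CRITICAL_5PCT_MODEL_1 = 1.833  # cutoff stated in the text for model 1


def significant(t_stats, critical):
    """Return the variables whose |t| reaches the stated critical value."""
    return [name for name, t in t_stats.items() if abs(t) >= critical]


flagged = significant(T_STATS_MODEL_1, CRITICAL_5PCT_MODEL_1)
print(flagged)
```

The list contains Homeruns, Hits, Stolen Base Ratio, Base-on-balls, and Hit by Pitch. Stolen Base Ratio clears the cutoff by less than 0.002.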

Model 2

$$\text{Winning Percentage} = \alpha + \beta_1\,\text{Left-handed Ratio} + \beta_2\,\text{OPS} + \beta_3\,\text{Run Ratio} + \beta_4\,\text{Range Factor} + \beta_5\,\text{Year} + \varepsilon \tag{4.2}$$

Equation 4.2 above lays out the formula used in model 2. Three variables in the winning percentage model were deemed to be significant. Range Factor (RF) was found to be significant at the 5% level, but contrary to predictions its effect on a team's winning percentage was negative. On-Base-Percentage plus Slugging Percentage (OPS) also had a significant impact in the model, but unlike RF its effect was positive. Finally, Run Ratio was shown to have a very strong and also positive impact in this model. It is also worth noting that the variable of concern, Left-handed Ratio, was again found to be insignificant.

Table 4.2

Regression Results: Winning Percentage

Variable Coefficient t-Statistic

Constant -0.129082 -0.738384

Left-handed Ratio -0.001297 -0.202259

OPS 1.131611 12.06884***

Run Ratio 0.062792 7.102465***

Range Factor -0.070991 -2.074597**

Year 0.000507 0.453629

R-squared             0.447857    Observations            300
Adjusted R-squared    0.438435    Adjusted Observations   299

* Significant at 10%, ** Significant at 5%, *** Significant at 1%
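The adjusted R-squared values in Tables 4.1 and 4.2 can be reproduced from the reported R-squared values with the usual formula, 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below takes n to be the 299 adjusted observations and k to be the number of independent variables in each model; that pairing is inferred from the tables rather than stated in the text.

```python
def adjusted_r_squared(r2, n_obs, n_regressors):
    """Standard adjusted R-squared for a regression with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Model 1: nine independent variables; model 2: five independent variables.
model_1 = adjusted_r_squared(0.906078, 299, 9)
model_2 = adjusted_r_squared(0.447857, 299, 5)
print(round(model_1, 6), round(model_2, 6))
```

Both results agree with the reported values of 0.903153 and 0.438435 to six decimal places.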

The variables used in this regression account for approximately 45% of the variation in a team's winning percentage. As in the first model, no problems were found with serial correlation, multicollinearity, or heteroskedasticity. Just as in model 1, a Jarque-Bera test was run to test for normality. For this model the test was run using 5 degrees of freedom; therefore the JB test needed to reveal a value below 11.07.2 Just as in the previous model, no normality issues were discovered.
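For readers unfamiliar with the Jarque-Bera statistic, the sketch below computes it from a list of regression residuals using the textbook formula JB = (n/6)(S² + (K - 3)²/4), where S is the skewness and K the kurtosis. The residuals shown are hypothetical, the text does not say how its software computed the statistic, and the comparison value of 11.07 is simply the cutoff stated above for model 2.

```python
def jarque_bera(residuals):
    """Textbook Jarque-Bera statistic from sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in residuals) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in residuals) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Hypothetical residuals, symmetric around zero.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
jb = jarque_bera(residuals)
below_cutoff = jb < 11.07  # cutoff stated in the text for model 2
print(round(jb, 4), below_cutoff)
```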

The results from model 2 showed an R-squared value of approximately 45%. This is a mediocre result, which may suggest that many "outside sources," such as managers' decisions, play a very large role in winning. Both Left-handed Ratio and Range Factor resulted in a different sign than what was predicted in Chapter II. All three statistics used to represent hitting, pitching, and defense were found to be significant to at least the 5% level, which may mean that although those are the three parts of the game of baseball, much more needs to be accounted for in order to relate left-handed hitting to winning percentage more successfully. As with model 1, a detailed conclusion of the results found in this model will be presented in Chapter V.

2 Chi-squared for data in winning percentage regression is 0.461489.

Conclusion

This brief chapter laid out the results uncovered while running the two regressions used in this study. It also discussed tests run on the data to make sure no underlying problems exist with the results before they are interpreted. The next and final chapter will go in depth into conclusions drawn from this research while also providing ideas for further research that may extend this thesis.

CHAPTER V

CONCLUSION

This chapter will discuss the conclusions that can be taken from the results while also touching upon possible extensions and alterations that could be made to these models in order to extend this research. Finally, this chapter will conclude with a discussion summing up what can be taken away from this study itself.

Regression Conclusions

The primary focus of this study was to delve into the reasoning behind the asymmetry of baseball. The study attempted to determine what impact left-handed hitters have on team production based upon what portion of the team they account for. As touched upon in the Literature Review in Chapter I, baseball has an odd balance of lefties vs. right-handers not seen in everyday society. For many reasons baseball is a sport of balance, and the thesis behind this study was to determine if having between 33% and 55% of a team's hitters be left-handed would be beneficial to its success. Two models were generated in order to see if this balance would benefit teams. The first was in a solely offensive category, runs scored, while the second determined if it would play a part in increasing a team's winning percentage. Ten years' worth of data was collected and two separate regressions were run. Although the variable of note, left-handed ratio, was deemed to be insignificant in both models, some concrete conclusions can be taken from this study.

In model 1 an R-squared value of 91% was found. While left-handed ratio was deemed to be insignificant, the nine independent variables combined are very accurate in explaining runs scored. This means that if a manager felt he needed to improve his run production, he could see that some variables, such as homeruns, are very important in making that improvement.

That may be a somewhat commonly accepted answer, but this study does show that when looking for homerun production he may not need to worry as much about a player's tendency to strike out or hit into double plays. Contrary to common thought, strikeouts are not significant in hindering run production, and furthermore their negative coefficient was minuscule.1 Another common belief, that power hitters' tendency to hit into double plays can be very detrimental to team success, was not supported: the variable was insignificant at the 5% level in this study and was associated with a positive coefficient.2 This means that hitting into more double plays was associated with a higher number of runs scored. A possible explanation for this could be that the significance of homeruns in this model is so high that the occasional double play caused by "swinging for the fences" is a risk a manager should learn to accept.
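The trade-off described above can be put in numbers with the coefficients of the runs scored model. The sketch below assumes the linear model holds with all other variables unchanged, and the team-level changes are hypothetical. The Strikeouts and Hit into Double Play coefficients were not significant at the 5% level, so their contributions are illustrative only.

```python
# Coefficients from the runs scored model (Table 4.1).
COEF = {"hr": 0.844847, "so": -0.017299, "dp": 0.143210}


def run_change(changes):
    """Predicted change in season runs for the given changes in team totals."""
    return sum(COEF[name] * delta for name, delta in changes.items())


# Hypothetical trade-off: 10 more homeruns at the cost of
# 50 more strikeouts and 5 more double plays over a season.
delta_runs = run_change({"hr": 10, "so": 50, "dp": 5})
print(round(delta_runs, 2))
```

Under these assumptions the ten additional homeruns account for roughly 8.4 runs, while the fifty additional strikeouts remove fewer than one.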

The second model, based upon winning percentage, saw a much smaller R-squared value than the previous model but was still acceptable, explaining 45% of the variability in winning percentage. Again, as in the first model, Left-handed Ratio was deemed to be insignificant, but the three significant variables are all of enough importance to managers that some interesting conclusions can be drawn. As predicted, both On-Base-Percentage plus Slugging Percentage (OPS) and Run Ratio were deemed to be significant positive variables. This means that, as many would assume, the better pitching and hitting a team has, the more wins it will earn. The variable that did not result as predicted was Range Factor (RF). As explained earlier, RF is one of the most accurate ways to evaluate team defense, and not only was it found to have a negative impact on winning, it was also deemed to be significantly negative.3 This is a very intriguing result because most people in the "baseball world" would without question say that defense is one very important piece of winning.

1 Strikeout coefficient from the runs scored model is -0.017299.

2 Hit into Double Play coefficient from the runs scored model is 0.143210.

One reason for this could be the small variability seen in defense league-wide. For instance, the range in RF throughout ten years of data was only 0.49. This can be explained by realizing that every MLB player is already so good at playing defense that the difference between the best fielder and the worst fielder is very small. This could mean that the difference between good and bad is so small that it is almost meaningless in understanding winning percentage in MLB, and therefore the coefficient could just as easily have been significantly positive or insignificant.

A second possible explanation for this anomaly could lie in the well-known saying, "Defense wins championships." It has often been speculated that defense plays a much more important role in winning during the playoffs, when everyone is playing with more focus than ever. This could mean that during a 162-game season players may get lackadaisical, letting their defense slip and making it less influential to winning. Regardless of the real reason for this result, knowing it can help a manager in making personnel choices. For instance, suppose a manager is debating between Player X and Player Y to bring in for next year. Player X has excellent hitting statistics but has been known to be a liability on defense, while Player Y has adequate hitting statistics but a stellar defensive reputation. In this situation the results from this study would influence the manager to go with Player X, because offense is seen to have a much more significant, and at the same time positive, influence on winning. He may then be able to choose a player based more upon his hitting statistics than his defensive abilities without as much hesitation as he may have previously had.

3 t-Statistic for Range Factor is -2.074597.

Further Research

Although this study did not find a significant answer to its thesis, it should be noted that further research could find more successful results. One possible change that could be made to this study is to incorporate the ability of a switch-hitter to hit left-handed. If every at bat could be broken down throughout a season, noting whether the batter hit from the left side or the right side, the results for this study could be much more complete.

A second extension to this study would be to somehow incorporate player productivity into this model. It is apparent that the production of each individual player plays a large part in team production (the focus of this study), so if a model could be devised to also include what goes into individual player production (i.e., natural ability, age, contract situation, etc.), it would lead to a more accurate model.

A final extension to this study could be to simply gather more years of data. This study looked at ten years' worth of data. If even 20 years of data were collected and included in this study, it would lead to a more complete picture.

Final Discussion

The purpose of this study was to provide evidence that having between 33% and 55% of the hitters on a team be left-handed would benefit a team in its success. In both models the results for the variable Left-handed Ratio were found to be inconclusive. Therefore this study can neither prove nor disprove its underlying thesis.

This thesis is the most up-to-date research on a topic that has been left relatively untouched until this point in time. This study did find some important factors in what leads to team success, and it can be used by general managers and coaches alike to identify what type of players they need to target in order to be successful. Although this study did not find a positive correlation between a balanced hitting attack and team success, it does not mean there is not a significant correlation. If there were no benefit in having a balanced number of right-handers and left-handers, we would never have seen such a unique asymmetry in baseball to begin with.

SOURCES CONSULTED

Annett, M. Left, Right, Hand and Brain: The Right Shift Theory. London: Erlbaum, 1985.

"Baseball Almanac - The Official Baseball History Site." [cited 2010]. Available from http://www.baseball-almanac.com/.

"Baseball-Reference.Com - Major League Baseball Statistics and History." [cited 2010]. Available from http://www.baseball-reference.com/.

"Baseball-reference.com - Teams." [cited 2010]. Available from http://www.baseball-reference.com/teams/batteam/shtml.

Berri, David J. "Who is 'Most Valuable'? Measuring a Player's Production of Wins in the National Basketball Association." Managerial and Decision Economics 20, no. 8 (12 1999): 411-427.

Bradbury, J.C. The Baseball Economist: The Real Game Exposed. New York, N.Y: Dutton, 2007.

"Britannica.Com - Facts about Rookie of the Year: baseball." [cited 2010]. Available from http://www.britannica.com/.

Cassing, J., and R. Douglas. "Implications of the Auction Mechanism in Baseball's Free Agents Draft." Southern Economic Journal 47 (07 1980): 110-121.

"Chicago Tribune Online - Chicago Cubs." [cited 2010]. Available from http://www.chicagotribune.com/.

DeBrock, Lawrence, Wallace Hendricks, and Roger Koenker. "Pay and Performance: The Impact of Salary Distribution on Firm-Level Outcomes in Baseball." Journal of Sports Economics 5, no. 3 (08 2004): 243-261.

"ESPN - New York Yankees News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball." [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=nyy.

"ESPN - Washington Nationals News, Schedule, Players, Scores, Stats, Photos, Rumors - MLB Baseball." [cited 2010]. Available from http://sports.espn.go.com/mlb/teams/lineup?team=was.

Goldstein, Stephan R., and Charlotte A. Young. "'Evolutionary' Stable Strategy of Handedness in Major League Baseball." Journal of Comparative Psychology 110, no. 2 (06 1996): 164-169.

Grondin, Simon, Yves Guiard, Richard Ivry, and Stan Koren. "Manual Laterality and Hitting Performance in Major League Baseball." Journal of Experimental Psychology 25, no. 3 (06 1999): 747-754.

Gwartney, James, and Charles Haworth. "Employer Cost and Discrimination: The Case of Baseball." The Journal of Political Economy 82, no. 4 (07 1974): 873-881.

Kahn, Lawrence M. "Managerial Quality, Team Success, and Individual Player Performance in Major League Baseball." Industrial and Labor Relations Review 46, no. 3 (04 1993): 531-547.

Krautmann, A. C., and M. Oppenheimer. "Contract Length and the Return to Performance in Major League Baseball." Journal of Sports Economics 3, no. 1 (02 2002): 6-17.

Krautmann, Anthony C. "Shirking or Stochastic Productivity in Major League Baseball?" Southern Economic Journal 56, no. 4 (04 1990): 961-968.

Krautmann, Anthony C. "Shirking or Stochastic Productivity in Major League Baseball: Reply." Southern Economic Journal 60, no. 1 (07 1993): 241.

Laby, Daniel M., MD, David G. Kirschen, OD, PhD, Arthur L. Rosenbaum, MD, and Michael F. Mellman, MD. "The Effect of Ocular Dominance on the Performance of Professional Baseball Players." Ophthalmology 105, no. 5 (1998): 864-866.

Lewis, Michael M. Moneyball. New York, N.Y: W.W. Norton & Company Inc, 2003.

Marburger, D. R. "Does the Assignment of Property Rights Encourage Or Discourage Shirking? Evidence from Major League Baseball." Journal of Sports Economics 4, no. 1 (02 2003): 19-34.

Maxcy, J. G., R. D. Fort, and A. C. Krautmann. "The Effectiveness of Incentive Mechanisms in Major League Baseball." Journal of Sports Economics 3, no. 3 (08 2002): 246-255.

Maxcy, Joel. "Motivating Long-Term Employment Contracts: Risk Management in Major League Baseball." Managerial & Decision Economics 25, no. 2 (03 2004): 109-120.

Miceli, T. J. "A Principal-Agent Model of Contracting in Major League Baseball." Journal of Sports Economics 5, no. 2 (05 2004): 213-220.

"Mlb.Com - MLB Miscellany: Rules, regulations, and statistics." [cited 2010]. Available from http://www.mlb.com/.

Nistler, Tony, and David Walton. Sporting News Books Baseball Register 2007. 2005 ed. St. Louis, MO: Sporting News Books, 2006.

Oorlog, Dale R. "Marginal Revenue and Labor Strife in Major League Baseball." Journal of Labor Research XVI, no. 1 (Winter 1995): 25-42.

Porter, Phillip K., and Gerald W. Scully. "Measuring Managerial Efficiency: The Case of Baseball." Southern Economic Journal 48, no. 3 (01 1982): 642-650.

Scully, Gerald W. "Player Salary Share and the Distribution of Player Earning." Managerial and Decision Economics 25, no. 2 (03 2004): 77-86.

Scully, Gerald W. "Pay and Performance in Major League Baseball." American Economic Review 64 (12 1974): 915-930.

"The Triple Crown - A Brief Look at One of Baseball's most Coveted and Elusive Feats." [cited 2010]. Available from http://www.psacard.com/articles/article3547.chtml.

Thorn, John, and Pete Palmer. The Hidden Game of Baseball: A Revolutionary Approach to Baseball Statistics. New York: Dolphin, 1984.

Wood, C. J., and J. P. Aggleton. "Handedness in 'fastball' sports: Do left-handers have an innate advantage?" British Journal of Psychology 80 (1989): 227-240.

Zak, Thomas A., Cliff J. Huang, and John J. Siegfried. "Production Efficiency: The Case of Professional Basketball." The Journal of Business 52, no. 3 (07 1979): 379-392.