Predicting Baseball Win/Loss Records from Player Projections


Connor Daly
[email protected]
November 29, 2017

1 Introduction

When forecasting future results in Major League Baseball (MLB), there are essentially two sources from which you can derive predictions: teams and players. How do players perform individually, and how do their collective actions combine to form a team's results? Several methods of both types exist, but they are often shrouded in proprietary formulas. Currently, several mature and highly sophisticated player projection systems are used to forecast season results. None is fully transparent about its methodology. Here I set out to develop the simplest possible player-based team projection system and try to add one basic improvement.

2 Predicting Wins and Losses

2.1 Team-Based Projections

One approach to such forecasting is to analyze team performance in head-to-head matchups. A common implementation of this approach is known as an Elo rating and prediction system. Elo systems start by assigning teams an average rating. After games are played, winning teams' ratings increase and losing teams' ratings decrease relative to the expected outcome of the matchup. Expected outcomes are determined by the difference in rating between the two teams. If a very good team narrowly beats a very bad team, its rating will increase only slightly. If an underdog pulls off an upset, however, it will earn relatively more points. As more games are played, older games become progressively less meaningful. Essentially, this prediction method considers only who you played, what the margin of victory was, and where the match was played (home team advantage is adjusted for). Using Monte Carlo simulations, one can predict the outcomes of individual seasons for each team. In between seasons, teams are regressed towards the mean.
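The rating update described above can be sketched in a few lines. This is a minimal illustration of a generic Elo update, not the FiveThirtyEight model: the 400-point logistic scale, the K-factor `k`, and the `home_advantage` parameter are assumed values chosen for illustration, and the margin-of-victory adjustment is left out.

```python
def expected_score(rating_a, rating_b, home_advantage=0.0):
    """Probability that team A beats team B under a generic Elo model.

    The 400-point scale and the home_advantage bonus (added to A's rating)
    are illustrative assumptions, not values taken from the text.
    """
    diff = rating_b - (rating_a + home_advantage)
    return 1.0 / (1.0 + 10 ** (diff / 400.0))


def update_ratings(rating_a, rating_b, a_won, k=4.0, home_advantage=0.0):
    """Return the new (rating_a, rating_b) after one game.

    k is a hypothetical K-factor; a larger k makes recent games count more.
    """
    exp_a = expected_score(rating_a, rating_b, home_advantage)
    actual_a = 1.0 if a_won else 0.0
    delta = k * (actual_a - exp_a)  # surprise relative to expectation
    return rating_a + delta, rating_b - delta


# A favorite gains less for an expected win than an underdog gains for an upset.
favorite_gain = update_ratings(1600, 1400, a_won=True)[0] - 1600
underdog_gain = update_ratings(1400, 1600, a_won=True)[0] - 1400
```

Because the update is proportional to the difference between the actual and expected result, the points exchanged are zero-sum between the two teams, which keeps the league-wide average rating constant.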
For a detailed explanation of a baseball Elo model, see FiveThirtyEight [Boi].

The main advantage of this kind of team-based approach is that it can capture some of the hard-to-pin-down factors that make teams more than just the sum of their parts. Without figuring out what the secret sauce is, this method estimates the sum total contributions of ownership, coaches, team philosophy, and countless other factors. The method does have a significant downside, however: it can't take advantage of known changes in team dynamics, such as changes in players and coaches. If I know Babe Ruth is leaving the Yankees after a particular season, I probably want to project them differently than I would have otherwise. This model fails to capture that.

2.2 Player-Based Projections

Baseball enjoys a unique advantage over other major American sports in that it is significantly easier to decouple the performance of individual players to determine who was ultimately responsible for creating a certain result. If a batter hits a home run, we can say with a high degree of certainty that the batter and the pitcher combined to cause this event. By looking at the large number of batter/pitcher matchups, we can gauge the relative skill of each by their performance against a wide variety of opponents. On the other hand, a sport such as football presents significant challenges to gauging the true skill of individual players. Looking at the running game, how can one intelligently and objectively pass out credit and blame? If a running back runs for a seven-yard gain on a toss sweep to the right, how much credit should the left guard receive? Decoupling in baseball isn't perfect, but compared to other sports, it's much easier.

2.2.1 Wins Above Replacement

A foundational pillar of sabermetrics, the empirical, quantitative analysis of baseball, is the concept of wins above replacement (WAR).
Essentially, the idea is that all meaningful baseball statistics must measure how events on the field help or hurt a team's chances of winning in expectation. Games are won by scoring runs and preventing runs from being scored. Thus, every event can be understood in the context of runs allowed or runs created.

This idea can be hard to grasp at first. How many runs does a home run create? One? Rather counter-intuitively, the generally accepted value is around 1.4 runs. How is this? Well, not only did the batter score himself, but he will have also batted in any runners on base. You must also consider the possibility that, had the batter made an out instead of scoring these base runners, following batters could have driven them in. Using real playing data, we can determine the expected run-creating or run-subtracting value of every event in baseball. See Table 1 for a complete breakdown of the run value of such events. By looking at the total contributions of a player over the course of a season, we can sum up the expected run contributions of every event the player caused.

Now we need to compare our player against a baseline. A first intuition might be to compare the player to league average. However, defining league average as a baseline of zero runs created sells league-average players short. A league-average player is better than approximately half of the players in the league. That's valuable production! Instead, we scale our player's contribution against the idea of a replacement-level player. The production of a replacement-level player is intended to be equivalent to the contributions of an infinitely replaceable minimum-salary veteran or minor league free agent. For reference, a team of replacement players is defined by Fangraphs to win approximately 48 games over the course of a 162-game season. Using this replacement level, we determine the original player's runs above replacement.
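As a rough illustration of the run-value accounting described above, the sketch below sums the expected run contribution of a set of events. The per-event values are copied from Table 1; the function name, the dictionary keys, and the sample batting line are hypothetical and only meant to show the arithmetic.

```python
# Run values per event, taken from Table 1 (a subset of the batting events).
RUN_VALUES = {
    "home_run": 1.397,
    "triple": 1.070,
    "double": 0.776,
    "single": 0.475,
    "hit_by_pitch": 0.352,
    "walk": 0.323,       # non-intentional walk
    "out": -0.299,
    "strikeout": -0.301,
}


def expected_runs(event_counts):
    """Sum the expected run value of every event a player caused.

    event_counts maps an event name (a key of RUN_VALUES) to how many
    times the player produced that event.
    """
    return sum(RUN_VALUES[event] * count for event, count in event_counts.items())


# Hypothetical season line, made up purely for illustration.
sample_line = {"home_run": 20, "double": 30, "single": 100,
               "walk": 50, "out": 300, "strikeout": 100}
total = expected_runs(sample_line)
```

The total is measured relative to an average event; turning it into runs above replacement additionally requires a replacement-level baseline, which the text does not state numerically and which is therefore not modeled here.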
Next, we scale the runs above replacement by the number of runs per win in an average game. Finally, we scale this calculated WAR to the number of possible wins in a season, so that the sum of all WAR and replacement-level wins equals the total number of wins in a season. On average, the player's context-free stats would have resulted in a team winning a number of extra games equal to his WAR, compared with the same team fielding a replacement-level player in his place. There is a finite pool of WAR for all players. When one player performs better, less WAR is allocated to the rest of the players. Unfortunately for the reader, there are several variants of WAR, and all define things slightly differently. Several rely on inexplicably chosen constants or proprietary formulas. The main basis of my calculations is Fangraphs WAR, but I did make some alterations, which will be explained later. For more in-depth explanations of WAR and its underpinnings, see [Joh85], [Tom06], and [Fanb].

2.2.2 WAR to Wins

By projecting a season's worth of players' expected WAR contributions, we can group players by target-year team and take the sum total of their contributions. The combined total of their WAR should help predict the team's actual number of wins. This relationship isn't necessarily one-to-one, as will be discussed in 4.5. This method benefits from being able to track players as they change teams.

Table 1: Run Values by Event

Event                  Run Value    Event                    Run Value
Home Run                   1.397    Balk                         0.264
Triple                     1.070    Intentional Walk             0.179
Double                     0.776    Stolen Base                  0.175
Error                      0.508    Defensive Indifference       0.120
Single                     0.475    Bunt                         0.042
Interference               0.392    Sacrifice Bunt              -0.096
Hit By Pitch               0.352    Pickoff                     -0.281
Non-intentional Walk       0.323    Out                         -0.299
Passed Ball                0.269    Strikeout                   -0.301
Wild Pitch                 0.266    Caught Stealing             -0.467

Empirical measurements of the run value of events from the 1999-2002 seasons.
Data from [Tom06].

3 Projecting Players

To create a player-projection-based, season-long team projection system, the first step is to project players. Essentially, you need to look at a player's past performance and predict how he will perform in the future. Some methods of doing this are highly sophisticated, others quite simple. Systems like Baseball Prospectus's PECOTA, Dan Szymborski's ZiPS, and Chris Mitchell's KATOH all combine many variables and various calculations to compute projected outcomes. PECOTA in particular is based primarily on player similarity scores. It uses various metrics to find comparable players for a given to-be-projected player and uses the performance of those comparables to infer a trajectory for the targeted player's future performance. Although its general methodology has been discussed, its specific implementation is proprietary. On the other end of the sophistication spectrum is perhaps the simplest possible projection system: Marcel the Monkey.

3.1 Marcel the Monkey

Marcel the Monkey, or simply Marcel, is a player projection system invented by Tom Tango [Tan]. It sets out to be the simplest possible player projection system. Essentially, it takes a weighted average of a player's last three years (5/4/3 for batters and 3/2/1 for pitchers), regresses the player toward the mean by 1200 plate appearances, and applies an aging curve to increase a player's skills until age 29, after which point they begin to decline. These projections make no attempt to differentiate by team, league, or position, with the exception that some different constants are used for starting pitchers and relief pitchers. Rather than calculating counting stats such as hits or home runs directly, Marcel projects rate stats like hits or home runs per plate appearance.
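The three Marcel steps for a batter can be sketched as follows. This is one plausible reading of the description above, not Tango's reference implementation: "regressing by 1200 plate appearances" is interpreted here as blending in 1200 plate appearances of league-average performance, and the aging-curve slopes (`young_step`, `old_step`) are placeholder values that do not come from the text. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Season:
    plate_appearances: int
    events: int  # count of the stat being projected, e.g. home runs


def marcel_rate(seasons, league_rate, age,
                weights=(5, 4, 3),      # most recent season first (batters)
                regression_pa=1200,
                peak_age=29,
                young_step=0.006,       # assumed slope before the peak age
                old_step=0.003):        # assumed slope after the peak age
    """Project a per-plate-appearance rate from the last three seasons."""
    # Step 1: weighted totals over the last three seasons.
    weighted_events = sum(w * s.events for w, s in zip(weights, seasons))
    weighted_pa = sum(w * s.plate_appearances for w, s in zip(weights, seasons))

    # Step 2: regress toward the mean by adding league-average plate appearances.
    regressed = ((weighted_events + regression_pa * league_rate)
                 / (weighted_pa + regression_pa))

    # Step 3: aging curve, improving before peak_age and declining after it.
    step = young_step if age < peak_age else old_step
    return regressed * (1.0 + (peak_age - age) * step)


# Hypothetical batter: three seasons of home-run totals, most recent first.
history = [Season(600, 30), Season(550, 25), Season(500, 20)]
projected_hr_rate = marcel_rate(history, league_rate=0.03, age=27)
```

A player with few plate appearances is pulled strongly toward `league_rate`, while a full-time player keeps most of his own track record, which is the behavior the regression step is meant to produce.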