Predicting Major League Baseball Game Outcomes
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Pitch Quantification Part 1: Between Pitcher Comparisons of QOP with Conventional Statistics" (2016)
Biola University, Digital Commons @ Biola, Faculty Articles & Research, 2016. Pitch quantification Part 1: between pitcher comparisons of QOP with conventional statistics. Jason Wilson, Biola University. Follow this and additional works at: https://digitalcommons.biola.edu/faculty-articles Part of the Sports Studies Commons, and the Statistics and Probability Commons. Recommended Citation: Wilson, Jason, "Pitch quantification Part 1: between pitcher comparisons of QOP with conventional statistics" (2016). Faculty Articles & Research. 393. https://digitalcommons.biola.edu/faculty-articles/393 This Article is brought to you for free and open access by Digital Commons @ Biola. It has been accepted for inclusion in Faculty Articles & Research by an authorized administrator of Digital Commons @ Biola. For more information, please contact [email protected].
Pitch Quantification Part 1: Between-Pitcher Comparisons of QOP with Conventional Statistics. Jason Wilson. 1. Introduction. The Quality of Pitch (QOP) statistic uses PITCHf/x data to extract the trajectory, location, and speed of a single pitch and maps them onto a -10 to 10 scale. A value of 5 or higher represents a quality MLB pitch. In March 2015 we presented an LA Dodgers case study at the SABR Analytics conference using QOP that included the following results:
1. Clayton Kershaw’s no-hitter on June 18, 2014 vs. Colorado was an objectively better pitching performance than Josh Beckett’s no-hitter on May 25th vs. Philadelphia.
2. Josh Beckett’s 2014 injury followed a statistically significant decline in his QOP that was not accompanied by a significant decline in MPH.
These, and the others made in the presentation, are big claims. -
Testing the Minimax Theorem in the Field
Testing the Minimax Theorem in the Field: The Interaction between Pitcher and Batter in Baseball. Christopher Rowe. Advisor: Professor William Rogerson. May 2013, MMSS Senior Thesis, Northwestern University.
Abstract: John von Neumann’s Minimax Theorem is a central result in game theory, but its practical applicability is questionable. While laboratory studies have often rejected its conclusions, recent field studies have achieved more favorable results. This thesis adds to the growing body of field studies by turning to the game of baseball. Two models are presented and developed, one based on pitch location and the other based on pitch type. Hypotheses are formed from assumptions on each model and then tested with data from Major League Baseball, yielding evidence in favor of the Minimax Theorem.
Table of Contents:
Acknowledgements
Introduction: The Minimax Theorem; Central Question and Structure
Literature Review: Laboratory Experiments; Field Experiments; Summary
Models and Assumptions: The Game; Pitch Location Model; Pitch Type Model
Hypotheses: Pitch Location Model; Pitch Type Model
Data Analysis: Data; Pitch Location Model; Pitch Type Model
Conclusion: Summary of Results; Future Research
References
Appendix A
Appendix B
Acknowledgements: I would like to thank everyone who had a role in this paper’s completion. This begins with the Office of Undergraduate Research, who provided me with the funds necessary to complete this project, and everyone at Baseball Info Solutions, in particular Ben Jedlovec and Jeff Spoljaric, who provided me with data. -
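To make the minimax idea in this abstract concrete, here is a minimal sketch of the textbook mixed-strategy solution for a 2x2 zero-sum game. It is not the thesis's pitch-location or pitch-type model: the function name, the framing (batter guesses a pitch, pitcher chooses one), and the payoff numbers in the usage note are all illustrative assumptions.

```python
def solve_2x2_zero_sum(a, b, c, d):
    """Mixed-strategy minimax solution of a 2x2 zero-sum game.

    Payoffs to the row player are [[a, b], [c, d]]; the column player
    receives the negative. Assumes the game has no saddle point, i.e.
    no solution in pure strategies (an assumption of this sketch).
    Returns (p, q, value): p is the probability the row player picks
    row 1, q the probability the column player picks column 1.
    """
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("degenerate payoffs: no interior mixed solution")
    p = (d - c) / denom  # makes the column player indifferent
    q = (d - b) / denom  # makes the row player indifferent
    if not (0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("game has a saddle point; use pure strategies")
    value = (a * d - b * c) / denom
    return p, q, value


def expected_payoff(a, b, c, d, p, q):
    """Row player's expected payoff when rows mix with p and columns with q."""
    return (p * q * a + p * (1 - q) * b
            + (1 - p) * q * c + (1 - p) * (1 - q) * d)
```

With hypothetical payoffs to the batter of 0.40, 0.20, 0.10, and 0.30, `solve_2x2_zero_sum(0.40, 0.20, 0.10, 0.30)` gives the batter's and pitcher's equilibrium mixing probabilities. At those probabilities, neither side can improve its expected payoff by switching to a pure strategy, which is the kind of indifference condition that field tests of minimax examine.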
Name of the Game: Do Statistics Confirm the Labels of Professional Baseball Eras?
NAME OF THE GAME: DO STATISTICS CONFIRM THE LABELS OF PROFESSIONAL BASEBALL ERAS? by Mitchell T. Woltring. A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Leisure and Sport Management, Middle Tennessee State University, May 2013. Thesis Committee: Dr. Colby Jubenville, Dr. Steven Estes.
ACKNOWLEDGEMENTS: I would not be where I am if not for the support I have received from many important people. First and foremost, I would like to thank my wife, Sarah Woltring, for believing in me and supporting me in an incalculable manner. I would like to thank my parents, Tom and Julie Woltring, for always supporting and encouraging me to make myself a better person. I would be remiss not to personally thank Dr. Colby Jubenville and the entire Department at Middle Tennessee State University. Without Dr. Jubenville convincing me that MTSU was the place where I needed to come in order to thrive, I would not be in the position I am now. Furthermore, thank you to Dr. Elroy Sullivan for helping me run and understand the statistical analyses. Without your help I would not have been able to undertake the study at hand. Last, but certainly not least, thank you to all my family and friends, who are far too many to name. You have all helped shape me into the person I am and have played an integral role in my life.
ABSTRACT: A game defined and measured by hitting and pitching performances, baseball exists as the most statistical of all sports (Albert, 2003, p. -
"What Raw Statistics Have the Greatest Effect on Wrc+ in Major League Baseball in 2017?" Gavin D
"What raw statistics have the greatest effect on wRC+ in Major League Baseball in 2017?" Gavin D. Sanford, University of Minnesota Duluth, Honors Capstone Project.
Abstract: Major League Baseball has different statistics for hitters, fielders, and pitchers. The game has followed the same rules for over a century, and this has allowed for statistical comparison. As technology grows, so does the game of baseball, as there are more areas of the game that people can monitor and track, including pitch speed, spin rates, launch angle, exit velocity, and directional break. The website QOPBaseball.com is a newer website that attempts to correctly track every pitch’s horizontal and vertical break and grade it based on these factors (Wilson, 2016). Fangraphs has statistics on the direction in which players hit the ball and what percentage of the time. The game of baseball is all about quantifying players and being able to give a value to their contributions. Sabermetrics has given us the ability to do this in far more depth. Weighted Runs Created Plus (wRC+) is an offensive stat that attempts to quantify a player’s total offensive value (wRC and wRC+, Fangraphs). It is era and park adjusted, meaning that values from different parks and years can be compared without altering the statistic further. In this paper, we look at what 2018 statistics have the greatest effect on an individual player’s wRC+.
Keywords: Sabermetrics, Econometrics, Spin Rates, Baseball.
Introduction: Major League Baseball has been around for over a century and has given out awards for almost 100 years. The way that these awards are given out is based on statistics accumulated over the season. -
Combining Radar and Optical Sensor Data to Measure Player Value in Baseball
Sensors, Article. Combining Radar and Optical Sensor Data to Measure Player Value in Baseball. Glenn Healey, Department of Electrical Engineering and Computer Science, University of California, Irvine, CA 92617, USA; [email protected]
Abstract: Evaluating a player’s talent level based on batted balls is one of the most important and difficult tasks facing baseball analysts. An array of sensors has been installed in Major League Baseball stadiums that capture seven terabytes of data during each game. These data increase interest among spectators, but also can be used to quantify the performances of players on the field. The weighted on base average cube model has been used to generate reliable estimates of batter performance using measured batted-ball parameters, but research has shown that running speed is also a determinant of batted-ball performance. In this work, we used machine learning methods to combine a three-dimensional batted-ball vector measured by Doppler radar with running speed measurements generated by stereoscopic optical sensors. We show that this process leads to an improved model for the batted-ball performances of players.
Keywords: Bayesian; baseball analytics; machine learning; radar; intrinsic values; forecasting; sensors; batted ball; statistics; wOBA cube
1. Introduction. The expanded presence of sensor systems at sporting events has enhanced the enjoyment of fans and supported a number of new applications [1–4]. Measuring skill on batted balls is of fundamental importance in quantifying player value in baseball. Traditional measures for batted-ball skill have been based on outcomes, but these measures have a low repeatability due to the dependence of outcomes on variables such as the defense, the ballpark dimensions, and the atmospheric conditions [5,6]. -
Batter Handedness Project - Herb Wilson
Batter Handedness Project - Herb Wilson
Contents: Introduction; Data Upload; Join with Lahman database; Change in Proportion of RHP PA by Year; MLB-wide differences in BA against LHP vs. RHP; Equilibration of Batting Average; Individual variation in splits; Logistic regression using Batting Average splits; Logistic regressions using weighted On-base Average (wOBA); Summary of Results
Introduction: This project is an exploration of batter performance against like-handed and opposite-handed pitchers. We have long known that, collectively, batters have higher batting averages against opposite-handed pitchers. Differences in performance against left-handed versus right-handed pitchers will be referred to as splits. The generality of splits favoring opposite-handed pitchers masks variability in the magnitude of batting splits among batters and variability in splits for a single player among seasons. In this contribution, I test the adequacy of split values in predicting batter handedness and then examine individual variability to explore some of the nuances of the relationships. The data used primarily come from Retrosheet events data, with the Lahman dataset being used for some biographical information such as full name. I used the R programming language for all statistical testing and for the creation of the graphics. A copy of the code is available on request by contacting me at [email protected]
Data Upload: The Retrosheet events data are given by year. The first step in the analysis is to upload dataframes for each year, then use the rbind function to stitch datasets together to make a dataframe and use the function colnames to add column names. -
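The stitching step and the notion of a split described in this excerpt can be sketched as follows. The author worked in R with rbind and colnames; this is a standard-library Python analogue, not the author's code. The field names (`batter`, `pitcher_hand`, `at_bat`, `hit`) are hypothetical and are not Retrosheet's actual column names.

```python
from collections import defaultdict


def stack_years(tables_by_year):
    """Stack per-year event tables into one list of rows (like rbind),
    tagging each row with its season."""
    rows = []
    for year in sorted(tables_by_year):
        for row in tables_by_year[year]:
            rows.append({"year": year, **row})
    return rows


def batting_average_splits(rows):
    """Return {batter: {"L": BA vs left-handers, "R": BA vs right-handers}}.

    Each row is one plate appearance with hypothetical fields:
    batter, pitcher_hand ("L" or "R"), at_bat (0/1), hit (0/1).
    A side with no at-bats is reported as None.
    """
    totals = defaultdict(lambda: {"L": [0, 0], "R": [0, 0]})
    for row in rows:
        tally = totals[row["batter"]][row["pitcher_hand"]]
        tally[0] += row["hit"]
        tally[1] += row["at_bat"]
    return {
        batter: {
            hand: (hits / at_bats if at_bats else None)
            for hand, (hits, at_bats) in by_hand.items()
        }
        for batter, by_hand in totals.items()
    }
```

A batter's split is then the difference between the two averages. Plate appearances that are not at-bats, such as walks, enter with `at_bat` set to 0, so they do not change the batting average.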
Does the Defensive Shift Employed by an Opposing Team Affect an MLB
Does the Defensive Shift Employed by an Opposing Team Affect an MLB Team’s Batted Ball Quality and Offensive Performance? 11/20/2019
Abstract: This project studies proportions of batted ball quality across the 2019 MLB season when facing two different types of defensive alignment. It also attempts to answer whether run production is affected by shifts. Batted ball quality is split into six groups (barrel, solid contact, flare, poor (topped), poor (under), and poor (weak)), while defensive alignments are split into two (no shift and shift). Relative statistics come from all balls put in play, excluding sacrifice bunts, in the 2019 MLB season. The study shows differences in the proportions of batted ball quality relative to defensive alignment. Specifically, the proportion of barrels (balls barreled) against the shift was greater than the proportion of barrels against no shift. Barrels also resulted in the highest BABIP (batting average on balls in play) + SLG (slugging percentage), and BABIP + SLG in turn proved to be a good predictor of overall offensive performance as measured by wOBA (weighted on-base average): there appeared to be a strong positive correlation between BABIP + SLG and wOBA. MLB teams may consider this data when deciding which defensive alignment to play over the course of a game. However, they will most likely want to extend this research by evaluating each player on a case-by-case basis.
1 Background and Significance: Do MLB teams hit the ball better when facing a certain type of defensive alignment? As the shift becomes increasingly employed in Major League Baseball, these types of questions become more and more important. -
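The quantities named in this abstract can be sketched with the standard definitions of BABIP and slugging percentage and a plain Pearson correlation. The excerpt does not show how the study computed them, so the formulas below are the conventional ones rather than the author's code, and the function names are illustrative.

```python
import math


def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


def slg(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

To examine the relationship the abstract reports, one would compute `babip(...) + slg(...)` for each team or batted-ball group, collect the matching wOBA values, and pass the two lists to `pearson`. A result close to 1 would correspond to the strong positive correlation described.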
Height, Weight, and Durability in Major League Baseball Joshua Yeager Claremont Mckenna College
Claremont Colleges, Scholarship @ Claremont, CMC Senior Theses, CMC Student Scholarship, 2017. Height, Weight, and Durability in Major League Baseball. Joshua Yeager, Claremont McKenna College. Recommended Citation: Yeager, Joshua, "Height, Weight, and Durability in Major League Baseball" (2017). CMC Senior Theses. 1684. http://scholarship.claremont.edu/cmc_theses/1684 This Open Access Senior Thesis is brought to you by Scholarship@Claremont. It has been accepted for inclusion in this collection by an authorized administrator. For more information, please contact [email protected].
Claremont McKenna College. HEIGHT, WEIGHT AND DURABILITY IN MAJOR LEAGUE BASEBALL. SUBMITTED TO PROFESSOR HEATHER ANTECOL BY JOSHUA YEAGER FOR SENIOR THESIS, SPRING 2017, APRIL 24th, 2017
Table of Contents: Abstract; Acknowledgements; I. Introduction; II. Literature Review; III. Data -
WOBA Tier 2 Minor
WOBA Tier 2 Minor Rookie (ages 6 & 7) League Specific Rules, Revised 2017
1. Major Rookie players may not play on a Minor Rookie team. All overage players must be approved by the WOBA before they are allowed to play for any season or league game on a Minor Rookie team.
2. Major Rookie age players are not permitted to play on a Minor Rookie team, overage approvals excepted.
3. Umpires and the opposing team coaches must be informed of all BOD approved overage players (8 year olds) prior to the start of the game.
4. Overage players may not play at the pitching position.
5. Closed toe shoes must be worn by all players as well as the operator of the pitching machine.
6. Either an “Atec Tuffy Supersoft” ball or regulation "Incrediball" will be used for all league play.
7. With the use of either the "Incrediball" or the “Atec Tuffy” it is optional for the coach operating the pitching machine to wear a regulation batting helmet. If they do not wear a helmet they do so at their own risk.
8. Additional defensive field coaches may be present on the field beyond the bases and base paths. They cannot interfere with or obstruct the play.
9. Games are as many innings as possible with no new inning starting after 75 minutes of play and finishing within the 90 minute (1.5 hour) maximum time limit.
10. Batter will receive up to a maximum of 5 strikes as called by the umpire. OBA Rule Q2.3(h)
11. -
A STATISTICAL ANALYSIS of RETALIATION PITCHES in MAJOR LEAGUE BASEBALL Peter Jurewicz Clemson University, [email protected]
Clemson University TigerPrints All Theses Theses 12-2013 CHIN MUSIC: A STATISTICAL ANALYSIS OF RETALIATION PITCHES IN MAJOR LEAGUE BASEBALL Peter Jurewicz Clemson University, [email protected] Follow this and additional works at: https://tigerprints.clemson.edu/all_theses Part of the Economics Commons Recommended Citation Jurewicz, Peter, "CHIN MUSIC: A STATISTICAL ANALYSIS OF RETALIATION PITCHES IN MAJOR LEAGUE BASEBALL" (2013). All Theses. 1793. https://tigerprints.clemson.edu/all_theses/1793 This Thesis is brought to you for free and open access by the Theses at TigerPrints. It has been accepted for inclusion in All Theses by an authorized administrator of TigerPrints. For more information, please contact [email protected]. CHIN MUSIC: A STATISTICAL ANALYSIS OF RETALIATION PITCHES IN MAJOR LEAGUE BASEBALL A Thesis Presented to the Graduate School of Clemson University In Partial Fulfillment of the Requirements for the Degree Master of Arts Economics By Peter Jurewicz December 2013 Accepted by: Dr. Raymond Sauer, Committee Chair Dr. Scott Baier Dr. Robert Tollison ABSTRACT This paper is focused on hit batsmen in Major League Baseball from the 2008 season through August 20th of the 2013 season. More specifically, this paper examines the characteristics of retaliation pitches and attempts to determine the intent of the pitcher. The paper also takes into account moral hazard and cost-benefit analysis of hitting an opposing batsman. There has been a vast amount of literature in economics with regard to hit batsmen in Major League Baseball. However, very few of these papers have been able to evaluate economic theories in Major League Baseball using Pitchf/x data. -
Leveraging Big Data in Baseball
An article by EMC Proven Professional Knowledge Sharing Elite Author. A WHOLE NEW BALLGAME: LEVERAGING BIG DATA IN BASEBALL. Bruce Yellin, Advisory Systems Engineer, EMC Corporation, [email protected]
Table of Contents:
Introduction – Baseball, Big Data, and Advanced Analytics
Big Data and Baseball - Players, Coaches, Trainers, and Managers
Sportvision
    PITCHf/x
    HITf/x
    FIELDf/x
Big Data and the Business of Baseball
Player Development
Revenue From Fans
Revenue From Media
Big Data Helps Create Algorithmic Baseball Journalism
Listen To Your Data - Grady the Goat - The Curse of the Bambino -
1. Intro to Scorekeeping
1. Intro To Scorekeeping
The following terms will be used on this site:
Cell: The term cell refers to the square in which the player’s at-bat is recorded. In this illustration, the cell is the box where the diagram is drawn.
Scorecard, Scorebook: Will be used interchangeably and refer to the sheet that records the player and scoring information during a baseball game.
Scorekeeper: Refers to someone on a team who keeps the score for the purposes of the team.
Official Scorer: The designated person whose scorekeeping is considered the official record of the game. The Official Scorer is not a member of either team.
Baseball’s Defensive Positions
To “keep score” of a baseball game it is essential to know the defensive positions and their shorthand representation. For example, the number “1” is used to refer to the pitcher (P).
NOTE: In the younger levels of youth baseball leagues, 10 defensive players are used. This 10th position is known as the Short Center Fielder (SC) and is positioned between second base and the second baseman, at the beginning of the outfield grass. The Short Center Fielder bats and can be placed anywhere in the batting lineup.
Defensive Positions, Numbers & Abbreviations
Position Number   Defensive Position     Position Abbrev.
1                 Pitcher                P
2                 Catcher                C
3                 First Baseman          1B
4                 Second Baseman         2B
5                 Third Baseman          3B
6                 Short Stop             SS
7                 Left Fielder           LF
8                 Center Fielder         CF
9                 Right Fielder          RF
10                Short Center Fielder   SC
The illustration below shows the positions of the defensive players. Notice that the short center fielder is illustrated for those who are scoring youth league games.
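As a small illustration of how the position numbers are used, the table above can be held in a lookup and used to expand a play written as a sequence of fielder numbers. The hyphen-separated notation (for example "6-4-3") is common scorekeeping shorthand but is not shown in this excerpt, so it is an assumption here, as are the names `POSITIONS` and `describe_play`.

```python
# Position number -> (defensive position, abbreviation), from the table above.
POSITIONS = {
    1: ("Pitcher", "P"),
    2: ("Catcher", "C"),
    3: ("First Baseman", "1B"),
    4: ("Second Baseman", "2B"),
    5: ("Third Baseman", "3B"),
    6: ("Short Stop", "SS"),
    7: ("Left Fielder", "LF"),
    8: ("Center Fielder", "CF"),
    9: ("Right Fielder", "RF"),
    10: ("Short Center Fielder", "SC"),  # youth leagues only
}


def describe_play(notation):
    """Expand hyphen-separated fielder numbers into position abbreviations.

    The hyphenated form is assumed shorthand, not taken from the excerpt.
    Raises KeyError for a number that is not a defensive position.
    """
    return [POSITIONS[int(number)][1] for number in notation.split("-")]
```

For example, `describe_play("6-4-3")` lists the fielders who handled the ball in order: shortstop, second baseman, first baseman.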