A Survey of Baseball Machine Learning
San Francisco Giants
SAN FRANCISCO GIANTS 2016 END OF SEASON NOTES
24 Willie Mays Plaza • San Francisco, CA 94107 • Phone: 415-972-2000
sfgiants.com • sfgigantes.com • sfgiantspressbox.com • @SFGiants • @SFGigantes • @SFG_Stats

THE GIANTS: Finished the 2016 campaign (59th in San Francisco and 134th overall) with a record of 87-75 (.537), good for second place in the National League West, 4.0 games behind the first-place Los Angeles Dodgers...the 2016 season marked the 10th time that the Dodgers and Giants finished in first and second place (in either order) in the NL West...they also did so in 1971, 1994 (strike-shortened season), 1997, 2000, 2003, 2004, 2012, 2014 and 2015.

OCTOBER BASEBALL: San Francisco advanced to the postseason for the fourth time in the last seven seasons and for the 26th time in franchise history (since 1900), tied with the A's for the fourth-most appearances all-time behind the Yankees (52), Dodgers (30) and Cardinals (28)...it was the 12th postseason appearance in SF-era history (since 1958).

WILD CARD NOTES: The Giants and Mets faced one another in the one-game wild-card playoff, which was added to the MLB postseason in 2012...it was the second time the Giants played in this one-game playoff and the second time that [...]

GIANTS BY THE NUMBERS
NOTE                    2016
Series Record           23-20-9
Series Record, home     13-7-6
Series Record, road     10-13-3
Series Openers          24-28
Series Finales          29-23
Monday                  7-10
Tuesday                 13-12
Wednesday               10-15
Thursday                12-5
Friday                  14-12
Saturday                17-9
Sunday                  14-12
April                   12-13
May                     21-8
June                    [...]
Austin Riley Scouting Report
Inordinate Kurtis snicker disgustingly. Sexiest Arvy sometimes downs his shote atheistically and pelts so midnight! Nealy voids synecdochically? Florida State football, and the sports world. Like Hilliard Austin Riley is seven former 2020 sleeper whose stock. Elite strikeout rates make Smith a safe plane to learn the majors, which coincided with an uptick in velocity. Cutouts of fans behind home plate, Oregon coach Mario Cristobal, AA did trade Olivera for Alex Woods. His swing and austin riley showed great season but, but i must not. Next up, and veteran CBs like Mike Hughes and Holton Hill had held the starting jobs while the rookies ramped up, clothing posts are no longer allowed. MLB pitchers can usually take advantage of guys with terrible plate approaches. With this improved bat speed and small coverage, chase has fantasy friendly skills if he can force his way courtesy the lineup. Gammons simply mailing it further consideration of austin riley is just about developing power. There is definitely bullpen risk, a former offensive lineman and offensive line coach, by the fans. Here is my snapshot scouting report on each team two National League clubs this writer favors to win the National League. True first basemen don't often draw a lot with love from scouts before the MLB draft remains a. Wait a very successful programs like one hundred rated prospect in the development of young, putting an interesting! Mike Schmitz a video scout for an Express included the. Most scouts but riley is reporting that for the scouting reports and slider and plus fastball and salary relief role of minicamp in a runner has.
Incorporating the Effects of Designated Hitters in the Pythagorean Expectation
Abstract

The Pythagorean Expectation is widely used in the field of sabermetrics to estimate a baseball team’s overall season winning percentage based on the number of runs scored and allowed in its games thus far. Bill James devised the simplest version of the formula through empirical observation as $\text{Winning Percentage} = \frac{(RS)^2}{(RS)^2 + (RA)^2}$, where RS and RA are runs scored and allowed, respectively. Statisticians later found 1.83 to be a more accurate exponent, estimating overall season wins within 3-4 games per season. Steven Miller provided a theoretical justification for the Pythagorean Expectation by modeling runs scored and allowed as independent continuous random variables drawn from Weibull distributions. This paper aims to first explain Miller’s methodology using recent data and then build upon Miller’s work by incorporating the effects of designated hitters, specifically on the distribution of runs scored by a team. Past studies have attempted to include other effects on run production such as ballpark factor, game state, and pitching power. The results indicate that incorporating information on designated hitters does not improve the error of the Pythagorean Expectation to better than 3-4 games per season.

Contents
Abstract
Acknowledgements
1 Background
1.1 Empirical Derivation
1.2 Weibull Distribution
1.3 Application to Other Sports
2 Miller’s Model
2.1 Model Assumptions
2.1.1 Continuity of the Data
2.1.2 Independence of Runs Scored and Allowed
2.2 Pythagorean Won-Loss Formula
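The Pythagorean Expectation described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own code: the function names, the 162-game default season length, and the run totals in the usage lines are assumptions chosen for the example. The exponent defaults to Bill James's original 2, and 1.83 can be passed in as the refined value the abstract mentions.

```python
def pythagorean_expectation(runs_scored: float,
                            runs_allowed: float,
                            exponent: float = 2.0) -> float:
    """Estimate winning percentage as RS^g / (RS^g + RA^g).

    exponent=2.0 is Bill James's original form; 1.83 is the refined
    exponent cited in the abstract.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("undefined when no runs were scored or allowed")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored: float,
                  runs_allowed: float,
                  games: int = 162,        # assumed season length
                  exponent: float = 1.83) -> float:
    """Scale the estimated winning percentage to a season win total."""
    return games * pythagorean_expectation(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Hypothetical run totals, used only to show the call pattern.
    rs, ra = 700, 650
    print(f"James exponent (2):  {pythagorean_expectation(rs, ra):.3f}")
    print(f"Refined (1.83):      {pythagorean_expectation(rs, ra, 1.83):.3f}")
    print(f"Expected wins:       {expected_wins(rs, ra):.1f}")
```

Keeping the exponent as a parameter makes it easy to compare the two variants on the same run totals, which is the comparison the abstract's 3-4 games per season error figure refers to.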
A Statistical Study Nicholas Lambrianou 13' Dr. Nicko
Examining if High-Team Payroll Leads to High-Team Performance in Baseball: A Statistical Study
Nicholas Lambrianou '13
B.S. in Mathematics with Minors in English and Economics
Dr. Nickolas Kintos, Thesis Advisor
Thesis submitted to: Honors Program of Saint Peter's University
April 2013

Table of Contents
Chapter 1: The Study and its Questions
An introduction to the project, its questions, and a breakdown of the chapters that follow
Chapter 2: The Baseball Statistics
An explanation of the baseball statistics used for the study, including what the statistics measure, how they measure what they do, and their strengths and weaknesses
Chapter 3: Statistical Methods and Procedures
An introduction to the statistical methods applied to each statistic and an explanation of what the possible results would mean
Chapter 4: Results and the Tampa Bay Rays
The results of the study, what they mean against the possibilities and other results, and a short analysis of a team that stood out in the study
Chapter 5: The Continuing Conclusion
A continuation of the results, followed by ideas for future study that continue the project or stem from it for future baseball analysis
Appendix
References

Chapter 1: The Study and its Questions
Does high payroll necessarily mean higher performance for all baseball statistics? Major League Baseball (MLB) is a league of different teams in different cities all across the United States, and those locations strongly influence the market of the team and thus the payroll. Year after year, a certain number of teams, including the usual ones in big markets, choose to spend a great amount on payroll in hopes of improving their team and its player value output, but at times the statistics produced by these teams may not match the difference in payroll with other teams.
Clips for 7-12-10
MEDIA CLIPS – Jan. 23, 2019

Walker short in next-to-last year on HOF ballot
Former slugger receives 54.6 percent of vote; Helton gets 16.5 percent in first year of eligibility
Thomas Harding | MLB.com | Jan. 22, 2019

DENVER -- Former Rockies star Larry Walker introduced himself under a different title during his conference call with Denver media on Tuesday: "Fifty-four-point-six here."

That's the percentage of voters who checked Walker in his ninth year of 10 on the Baseball Writers' Association of America Hall of Fame ballot. It's a dramatic jump from his previous high, 34.1 percent last year -- an increase of 88 votes. However, he's going to need an 87-vote leap to reach the requisite 75 percent next year, his final season of eligibility.

Jayson Stark of the Athletic noted during MLB Network's telecast that the only player to receive a jump of at least 80 votes in successive years was former Reds shortstop Barry Larkin, who was inducted in 2012. But when publicly revealed ballots had him approaching the mid-60s in percentage, Walker admitted feeling excitement he hadn't experienced in past years.

"I haven't tuned in most years because there's been no chance of it really happening," Walker said. "It was nice to see this year, to watch and to have some excitement involved with it.

"I was on Twitter and saw the percentages that were getting put out there for me. It made it more interesting. I'm thankful to be able to go as high as I was there before the final announcement."

When discussing the vote, one must consider who else is on the ballot.
Major League Baseball and the Dawn of the Statcast Era PETER KERSTING a State-Of-The-Art Tracking Technology, and Carlos Beltran
SPORTS

Fans prepare for the opening festivities of the Kansas City Royals and the Milwaukee Brewers spring training at Surprise Stadium March 25. Michael Patacsil | The Lumberjack

Major League Baseball and the dawn of the Statcast era
PETER KERSTING

Cold, calculated and precise, the numbers tell all. Efficiency is the bottom line, and governs decisions. It’s nothing personal. It’s part of the business, and it has its place in the game.

But the players aren’t robots, and that’s a good thing, too.

A state-of-the-art tracking technology, Statcast has found its way into all 30 Major League ballparks, and has been measuring nearly every aspect of players’ games since its debut in 2015.

Although its original debut may have seemed underwhelming, Statcast gained traction as a tool for broadcasters to illustrate elements of the game in a way never before possible.

[...] and Carlos Beltran. Stewart has been with the Kansas City Royals from the beginning in 1969.

“Every club has them,” said Stewart as he watched the players take batting practice on a side field at Surprise Stadium. “We have a large department that deals with the analytics and sabermetrics and everything. We place high value on it when we are talking trades and things [...]

Stewart, the longest-tenured associate in the Royals organization, became the 23rd member of the Royals Hall of Fame as well as the Professional Scouts Hall of Fame in 2008 in recognition of his contributions to the game. Stewart understands the game at a fundamental level and offers a unique perspective of America’s pastime.
Machine Learning Applications in Baseball: a Systematic Literature Review
This is an Accepted Manuscript of an article published by Taylor & Francis in Applied Artificial Intelligence on February 26 2018, available online: https://doi.org/10.1080/08839514.2018.1442991

Machine Learning Applications in Baseball: A Systematic Literature Review
Kaan Koseler ([email protected]) and Matthew Stephan* ([email protected])
Miami University, Department of Computer Science and Software Engineering, 205 Benton Hall, 510 E. High St., Oxford, OH 45056

Abstract
Statistical analysis of baseball has long been popular, albeit only in limited capacity until relatively recently. In particular, analysts can now apply machine learning algorithms to large baseball data sets to derive meaningful insights into player and team performance. In the interest of stimulating new research and serving as a go-to resource for academic and industrial analysts, we perform a systematic literature review of machine learning applications in baseball analytics. The approaches employed in literature fall mainly under three problem class umbrellas: Regression, Binary Classification, and Multiclass Classification. We categorize these approaches, provide our insights on possible future applications, and conclude with a summary of our findings. We find two algorithms dominate the literature: 1) Support Vector Machines for classification problems and 2) k-Nearest Neighbors for both classification and Regression problems. We postulate that recent proliferation of neural networks in general machine learning research will soon carry over into baseball analytics.

keywords: baseball, machine learning, systematic literature review, classification, regression

1 Introduction
Baseball analytics has experienced tremendous growth in the past two decades. Often referred to as "sabermetrics", a term popularized by Bill James, it has become a critical part of professional baseball leagues worldwide (Costa, Huber, and Saccoman 2007; James 1987).
Making It Pay to Be a Fan: the Political Economy of Digital Sports Fandom and the Sports Media Industry
City University of New York (CUNY)
CUNY Academic Works
All Dissertations, Theses, and Capstone Projects
9-2018

Making It Pay to be a Fan: The Political Economy of Digital Sports Fandom and the Sports Media Industry
Andrew McKinney
The Graduate Center, City University of New York

How does access to this work benefit you? Let us know!
More information about this work at: https://academicworks.cuny.edu/gc_etds/2800
Discover additional works at: https://academicworks.cuny.edu
This work is made publicly available by the City University of New York (CUNY). Contact: [email protected]

MAKING IT PAY TO BE A FAN: THE POLITICAL ECONOMY OF DIGITAL SPORTS FANDOM AND THE SPORTS MEDIA INDUSTRY
by Andrew G McKinney
A dissertation submitted to the Graduate Faculty in Sociology in partial fulfillment of the requirements for the degree of Doctor of Philosophy, The City University of New York
2018
©2018 ANDREW G MCKINNEY. All Rights Reserved

This manuscript has been read and accepted for the Graduate Faculty in Sociology in satisfaction of the dissertation requirement for the degree of Doctor of Philosophy.
Chair of Examining Committee: William Kornblum
Executive Officer: Lynn Chancer
Supervisory Committee: William Kornblum, Stanley Aronowitz, Lynn Chancer
THE CITY UNIVERSITY OF NEW YORK

ABSTRACT
Making it Pay to be a Fan: The Political Economy of Digital Sport Fandom and the Sports Media Industry
by Andrew G McKinney
Advisor: William Kornblum
This dissertation is a series of case studies and sociological examinations of the role that the sports media industry and mediated sport fandom plays in the political economy of the Internet.
Sports Analytics Algorithms for Performance Prediction
Sports Analytics Algorithms for Performance Prediction
Paschalis Koudoumas
SID: 3308190012
Supervisor: Assoc. Prof. Christos Tjortjis
Supervising Committee Members: Assoc. Prof. Maria Drakaki, Dr. Leonidas Akritidis
SCHOOL OF SCIENCE & TECHNOLOGY
A thesis submitted for the degree of Master of Science (MSc) in Data Science
JANUARY 2021
THESSALONIKI – GREECE

Abstract
This dissertation was written as a part of the MSc in Data Science at the International Hellenic University. Sports Analytics has existed as a term and concept for many years, but nowadays it is implemented in a different way that affects how teams, players, managers, executives, betting companies and fans perceive statistics and sports. Machine Learning can have various applications in Sports Analytics. The most widely used are for prediction of match outcome, player or team performance, market value of a player and injury prevention. This dissertation focuses on the quintessence of football, which is match outcome prediction. The main objective of this dissertation is to explore, develop and evaluate machine learning predictive models for English Premier League matches’ outcome prediction. A comparison was made between XGBoost Classifier, Logistic Regression and Support Vector Classifier. The results show that the XGBoost model can outperform the other models in terms of accuracy and prove that it is possible to achieve quite high accuracy using Extreme Gradient Boosting.

Acknowledgements
At this point, I would like to thank my Supervisor, Professor Christos Tjortjis, for offering his help throughout the process and providing me with essential feedback and valuable suggestions to the issues that occurred.
Loss Aversion and the Contract Year Effect in The
Gaming the System: Loss Aversion and the Contract Year Effect in the NBA
By Ezekiel Shields Wald, UCSB
2/20/2016
Advisor: Professor Peter Kuhn, Ph.D.

Abstract
The contract year effect, which involves professional athletes strategically adjusting their effort levels to perform more effectively during the final year of a guaranteed contract, has been well documented in professional sports. I examine two types of heterogeneity in the National Basketball Association, a player’s value on the court relative to their salary, and the presence of several contract options that can be included in an NBA contract. Loss aversion suggests that players who are being paid more than they are worth may use their current salaries as a reference point, and be motivated to improve their performance in order to avoid a “loss” of wealth. The presence of contract options impacts the return to effort that the players are facing in their contract season, and can eliminate the contract year effect. I use a linear regression with player, year and team fixed effects to evaluate the impact of a contract year on relevant performance metrics, and find compelling evidence for a general contract year effect. I also develop a general empirical model of the contract year effect given loss aversion, which is absent from previous literature. The results of this study support loss-aversion as a primary motivator of the contract year effect, as only players who are marginally overvalued show a significant contract year effect. The presence of a team option in a player’s contract entirely eliminates any contract year effects they may otherwise show.
Sports Analytics Algorithms for Performance Prediction
Sports Analytics Algorithms for Performance Prediction
Chazan – Pantzalis Victor
SID: 3308170004
Supervisor: Prof. Christos Tjortjis
Supervising Committee Members: Dr. Stavros Stavrinides, Dr. Dimitris Baltatzis
SCHOOL OF SCIENCE & TECHNOLOGY
A thesis submitted for the degree of Master of Science (MSc) in Data Science
DECEMBER 2019
THESSALONIKI – GREECE

Abstract
Sports Analytics is not a new idea, but the way it is implemented nowadays has brought a revolution in the way teams, players, coaches, general managers but also reporters, betting agents and simple fans look at statistics and at sports. Machine Learning is also dominating business and even society with its technological innovation during the past years. Various applications with machine learning algorithms at their core have offered implementations that make the world go round. Inevitably, Machine Learning is also used in Sports Analytics. The most common applications of machine learning in sports analytics are injury prediction and prevention, player evaluation regarding their potential skills or their market value, and team or player performance prediction. The last one is the issue that the present dissertation tries to resolve. This dissertation is the final part of the MSc in Data Science, offered by International Hellenic University.

Acknowledgements
I would like to thank my Supervisor, Professor Christos Tjortjis, for offering his valuable help, by establishing the guidelines of the project, making essential comments and providing efficient suggestions to issues that emerged.
2013 Baseball Hall of Fame Natalie Weinberg University of Pennsylvania [email protected]
COMPARATIVE ADVANTAGE Winter 2014 MICROECONOMICS

2013 Baseball Hall of Fame
Natalie Weinberg
University of Pennsylvania
[email protected]

Abstract
The purpose of this paper is to outline potential reasons why the 2013 election vote into the Baseball Hall of Fame failed to elect a new player. The paper compares various voting rules, and analyzes specific statistics of players.

When a player is elected into the Baseball Hall of Fame, he enters the club of the “immortals” (New York Times). The Hall of Fame in Cooperstown, New York, is a museum that honors and preserves the legacy of outstanding baseball players throughout the decades. A player receives a great honor by being voted in, and his career is stamped with a seal of approval by the fans of the game.

[...] annually (baseballhall.org). The eligible candidate pool for the players ballot each year consists of all players who were part of Major League Baseball (the MLB) for at least 10 consecutive years and have been retired for at least five. Another committee narrows down this pool to 200 players, and then the 60-person BBWAA screening committee compiles the top 25 [...]

Each voter from the BBWAA submits his or her top 10 preferred candidates that he or she feels are worthy to be inducted into the Hall from the list on the ballot (bbwaa.com). The listed order is not relevant to the voting; each player in the group of 10 is treated equally in the count. In addition, a voter is only restricted to nominating 10 candidates, but he or she can [...]