Pythagorean Formula in Baseball: Creation of a Better Model Using Backward Elimination Stepwise Regression Analysis
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
India's Take on Sports Analytics
PSYCHOLOGY AND EDUCATION (2020) 57(9): 5817-5827 ISSN: 00333077 India’s Take on Sports Analytics Rohan Mehta1, Dr.Shilpa Parkhi2 Student, Symbiosis Institute of Operations Mangement, Nashik, India Deputy Director, Symbiosis Institute of Operations Mangement, Nashik, India Email Id: [email protected] ABSTRACT Purpose – The aim of this paper is to study what is sports analytics, what are the different roles in this field, which sports are prominently using this, how big data has impacted this field, how this field is shaping up in Indian context. Also, the aim is to study the growth of job opportunities in this field, how B-schools are shaping up in this aspect and what are the interests and expectations of the B-school grads from this sector. Keywords Sports analytics, Sabermetrics, Moneyball, Technologies, Team sports, IOT, Cloud Article Received: 10 August 2020, Revised: 25 October 2020, Accepted: 18 November 2020 Design Approach analysis, he had done on approximately 10000 deliveries. Another writer, for one of the US The paper starts by explaining about the origin of magazines, F.C Lane was of the opinion that the sports analytics, the most naïve form of it, then batting average of the individual doesn’t reflect moves towards explaining the evolution of it over the complete picture of the individual’s the years (from emergence of sabermetrics to the performance. There were other significant efforts most advanced applications), how it has spread made by other statisticians or writers such as across different sports and how the applications of George Lindsey, Allan Roth, Earnshaw Cook till it has increased with the advent of different 1969. -
Trevor Bauer
TREVOR BAUER’S CAREER APPEARANCES Trevor Bauer (47) 2009 – Freshman (9-3, 2.99 ERA, 20 games, 10 starts) JUNIOR – RHP – 6-2, 185 – R/R Date Opponent IP H R ER BB SO W/L SV ERA Valencia, Calif. (Hart HS) 2/21 UC Davis* 1.0 0 0 0 0 2 --- 1 0.00 2/22 UC Davis* 4.1 7 3 3 2 6 L 0 5.06 CAREER ACCOLADES 2/27 vs. Rice* 2.2 3 2 1 4 3 L 0 4.50 • 2011 National Player of the Year, Collegiate Baseball • 2011 Pac-10 Pitcher of the Year 3/1 UC Irvine* 2.1 1 0 0 0 0 --- 0 3.48 • 2011, 2010, 2009 All-Pac-10 selection 3/3 Pepperdine* 1.1 1 1 1 1 2 L 0 3.86 • 2010 Baseball America All-America (second team) 3/7 at Oklahoma* 0.2 1 0 0 0 0 --- 0 3.65 • 2010 Collegiate Baseball All-America (second team) 3/11 San Diego State 6.0 2 1 1 3 4 --- 0 2.95 • 2009 Louisville Slugger Freshman Pitcher of the Year 3/11 at East Carolina* 3.2 2 0 0 0 5 W 0 2.45 • 2009 Collegiate Baseball Freshman All-America 3/21 at USC* 4.0 4 2 1 0 3 --- 1 2.42 • 2009 NCBWA Freshman All-America (first team) 3/25 at Pepperdine 8.0 6 2 2 1 8 W 0 2.38 • 2009 Pac-10 Freshman of the Year 3/29 Arizona* 5.1 4 0 0 1 4 W 0 2.06 • Posted a 34-8 career record (32-5 as a starter) 4/3 at Washington State* 0.1 1 2 1 0 0 --- 0 2.27 • 1st on UCLA’s career strikeouts list (460) 4/5 at Washington State 6.2 9 4 4 0 7 W 0 2.72 • 1st on UCLA’s career wins list (34) 4/10 at Stanford 6.0 8 5 4 0 5 W 0 3.10 • 1st on UCLA’s career innings list (373.1) 4/18 Washington 9.0 1 0 0 2 9 W 0 2.64 • 2nd on Pac-10’s career strikeouts list (460) 4/25 Oregon State 8.0 7 2 2 1 7 W 0 2.60 • 2nd on UCLA’s career complete games list (15) 5/2 at Oregon 9.0 6 2 2 4 4 W 0 2.53 • 8th on UCLA’s career ERA list (2.36) • 1st on Pac-10’s single-season strikeouts list (203 in 2011) 5/9 California 9.0 8 4 4 1 10 W 0 2.68 • 8th on Pac-10’s single-season strikeouts list (165 in 2010) 5/16 Cal State Fullerton 9.0 8 5 5 2 8 --- 0 2.90 • 1st on UCLA’s single-season strikeouts list (203 in 2011) 5/23 at Arizona State 9.0 6 4 4 5 5 W 0 2.99 • 2nd on UCLA’s single-season strikeouts list (165 in 2010) TOTAL 20 app. -
The Numbers Game: Baseball's Lifelong Fascination with Statistics (Book Review)
The Numbers Game: Baseball's Lifelong Fascination with Statistics (Book Review) The American Statistician May 1, 2005 | Cochran, James J. The Numbers Game: Baseball's Lifelong Fascination with Statistics. Alan SCHWARZ. New York: Thomas Dunne Books, 2004, xv + 270 pp. $24.95 (H), ISBN: 0‐312‐322222‐4. I am amazed at how frequently I walk into a colleague's office for the first time and spot a copy of The Baseball Encyclopedia or Total Baseball on a bookshelf. How many statisticians developed and nurtured their interest in probability and statistics by playing Strat‐O‐ Matic[R] and/or APBA[R] baseball board games; reading publications like the annual The Bill James Baseball Abstract; or scanning countless box scores and current summary statistics in the Sporting News or in the sports section of the local newspaper, looking for an edge in a rotisserie baseball league? For many of us, baseball is what initially opened our eyes to the power of probability and statistics, and the sport continues to be an integral part of our lives (for evidence of this, attend the sessions or business meeting of the ASA's Statistics in Sports section at the next Joint Statistical Meetings). This is why so many probabilists and statisticians will enjoy reading Alan Schwarz's The Numbers Game: Baseball's Lifelong Fascination with Statistics. In this book, Schwarz effectively chronicles the ongoing association between baseball and statistics by describing the evolution of the sport from the mid‐nineteenth century through the beginning of the twenty‐first century, and looking at several of its most influential characters. -
Major League Rules
Major League Rules All Managers, Assistant Managers, Team Members, Families, and their guests are subject to the Wasco Baseball Codes of Conduct. All players must meet the minimum age requirement of 11 years and maximum of 12 years of age as of April 30th of that year for the Spring season. Any players not meeting these minimum age requirements must attend a tryout conducted by the league and if they meet the league’s designated safety and ability standards for play in this league, the league will permit them to move up into this next league level of play and pay all fees associated with this league provided the player is no more than one year removed from that season's minimum age requirement. All Wasco Baseball Rules follow the NFHS Rule Book unless stated differently below. Field Dimensions Distance between bases: 70 feet from back of home plate (point) to outfield side of bases at 1st and 3rd, from foul line side of bases at 1st and 3rd to center of 2nd base. All bases are inside the 70-foot square except 2nd base. Pitching rubber distance: 48-50 feet from the back of home plate (point) to the front of the rubber. Uniforms Uniforms are conventional, with long-legged pants (no shorts) and sneakers or compositional baseball shoes ( no steel spike shoes permitted), that the player provides. The league provides replica shirts, pants, socks, belts, and baseball hats. Each player is responsible for wearing their entire uniform on game-day (with all shirts “tucked in”). It is expected that hats will be worn to both practices and games (the only exception being for the smallest players where their hat does not fit properly), however, shirts are to be worn to games only. -
Machine Learning Applications in Baseball: a Systematic Literature Review
This is an Accepted Manuscript of an article published by Taylor & Francis in Applied Artificial Intelligence on February 26 2018, available online: https://doi.org/10.1080/08839514.2018.1442991 Machine Learning Applications in Baseball: A Systematic Literature Review Kaan Koseler ([email protected]) and Matthew Stephan* ([email protected]) Miami University Department of Computer Science and Software Engineering 205 Benton Hall 510 E. High St. Oxford, OH 45056 Abstract Statistical analysis of baseball has long been popular, albeit only in limited capacity until relatively recently. In particular, analysts can now apply machine learning algorithms to large baseball data sets to derive meaningful insights into player and team performance. In the interest of stimulating new research and serving as a go-to resource for academic and industrial analysts, we perform a systematic literature review of machine learning applications in baseball analytics. The approaches employed in literature fall mainly under three problem class umbrellas: Regression, Binary Classification, and Multiclass Classification. We categorize these approaches, provide our insights on possible future ap- plications, and conclude with a summary our findings. We find two algorithms dominate the literature: 1) Support Vector Machines for classification problems and 2) k-Nearest Neighbors for both classification and Regression problems. We postulate that recent pro- liferation of neural networks in general machine learning research will soon carry over into baseball analytics. keywords: baseball, machine learning, systematic literature review, classification, regres- sion 1 Introduction Baseball analytics has experienced tremendous growth in the past two decades. Often referred to as \sabermetrics", a term popularized by Bill James, it has become a critical part of professional baseball leagues worldwide (Costa, Huber, and Saccoman 2007; James 1987). -
A Giant Whiff: Why the New CBA Fails Baseball's Smartest Small Market Franchises
DePaul Journal of Sports Law Volume 4 Issue 1 Summer 2007: Symposium - Regulation of Coaches' and Athletes' Behavior and Related Article 3 Contemporary Considerations A Giant Whiff: Why the New CBA Fails Baseball's Smartest Small Market Franchises Jon Berkon Follow this and additional works at: https://via.library.depaul.edu/jslcp Recommended Citation Jon Berkon, A Giant Whiff: Why the New CBA Fails Baseball's Smartest Small Market Franchises, 4 DePaul J. Sports L. & Contemp. Probs. 9 (2007) Available at: https://via.library.depaul.edu/jslcp/vol4/iss1/3 This Notes and Comments is brought to you for free and open access by the College of Law at Via Sapientiae. It has been accepted for inclusion in DePaul Journal of Sports Law by an authorized editor of Via Sapientiae. For more information, please contact [email protected]. A GIANT WHIFF: WHY THE NEW CBA FAILS BASEBALL'S SMARTEST SMALL MARKET FRANCHISES INTRODUCTION Just before Game 3 of the World Series, viewers saw something en- tirely unexpected. No, it wasn't the sight of the Cardinals and Tigers playing baseball in late October. Instead, it was Commissioner Bud Selig and Donald Fehr, the head of Major League Baseball Players' Association (MLBPA), gleefully announcing a new Collective Bar- gaining Agreement (CBA), thereby guaranteeing labor peace through 2011.1 The deal was struck a full two months before the 2002 CBA had expired, an occurrence once thought as likely as George Bush and Nancy Pelosi campaigning for each other in an election year.2 Baseball insiders attributed the deal to the sport's economic health. -
Improving the FIP Model
Project Number: MQP-SDO-204 Improving the FIP Model A Major Qualifying Project Report Submitted to The Faculty of Worcester Polytechnic Institute In partial fulfillment of the requirements for the Degree of Bachelor of Science by Joseph Flanagan April 2014 Approved: Professor Sarah Olson Abstract The goal of this project is to improve the Fielding Independent Pitching (FIP) model for evaluating Major League Baseball starting pitchers. FIP attempts to separate a pitcher's controllable performance from random variation and the performance of his defense. Data from the 2002-2013 seasons will be analyzed and the results will be incorporated into a new metric. The new proposed model will be called jFIP. jFIP adds popups and hit by pitch to the fielding independent stats and also includes adjustments for a pitcher's defense and his efficiency in completing innings. Initial results suggest that the new metric is better than FIP at predicting pitcher ERA. Executive Summary Fielding Independent Pitching (FIP) is a metric created to measure pitcher performance. FIP can trace its roots back to research done by Voros McCracken in pursuit of winning his fantasy baseball league. McCracken discovered that there was little difference in the abilities of pitchers to prevent balls in play from becoming hits. Since individual pitchers can have greatly varying levels of effectiveness, this led him to wonder what pitchers did have control over. He found three that stood apart from the rest: strikeouts, walks, and home runs. Because these events involve only the batter and the pitcher, they are referred to as “fielding independent." FIP takes only strikeouts, walks, home runs, and innings pitched as inputs and it is scaled to earned run average (ERA) to allow for easier and more useful comparisons, as ERA has traditionally been one of the most important statistics for evaluating pitchers. -
More on Defensive Regression (Or Runs) Analysis 7
More on Defensive A Regression (or Runs) Analysis Th is appendix has three primary objectives: fi rst, to disclose aspects of DRA not disclosed in chapter two; second, to address aspects of the model that raise issues related less to baseball per se than to statistical modeling in gen- eral; and third, to drive home the fundamental point that DRA is not an answer, but a method. Included in this appendix are certain alternative models I tried, and suggestions for further improvements, which should provide some sense of the range of alternative approaches that are possible. DRA POST-1951 Overview Th ere are essentially two DRA models: post-1951 and pre-1952. Th e post- 1951 model uses a subset of Retrosheet play-by-play data currently available for seasons aft er 1951, and was almost completely described in chapter two. Th e pre-1952 model must make do with considerably less data, which ren- ders it more primitive for infi elders and unavoidably more complicated for outfi elders. When we fi rst began explaining DRA, we took a ‘bottom-up’ approach, starting from the shortstop position and gradually building up until we had a team model. Here we’ll take a ‘top-down’ approach, revealing the entire post-1951 team model all at once, and then discussing its components. Likewise, we’ll start with a top-down discussion of the pre-1952 model. Th e following page presents the entire post-1951 model on one page, with a glos- sary of defi ned terms on the facing page. 3 AAppendix-A.inddppendix-A.indd 3 22/1/2011/1/2011 22:27:53:27:53 PPMM AAppendix-A.indd 4 p p e n d i x - A . -
Determining the Value of a Baseball Player
the Valu a Samuel Kaufman and Matthew Tennenhouse Samuel Kaufman Matthew Tennenhouse lllinois Mathematics and Science Academy: lllinois Mathematics and Science Academy: Junior (11) Junior (11) 61112012 Samuel Kaufman and Matthew Tennenhouse June 1,2012 Baseball is a game of numbers, and there are many factors that impact how much an individual player contributes to his team's success. Using various statistical databases such as Lahman's Baseball Database (Lahman, 2011) and FanGraphs' publicly available resources, we compiled data and manipulated it to form an overall formula to determine the value of a player for his individual team. To analyze the data, we researched formulas to determine an individual player's hitting, fielding, and pitching production during games. We examined statistics such as hits, walks, and innings played to establish how many runs each player added to their teams' total runs scored, and then used that value to figure how they performed relative to other players. Using these values, we utilized the Pythagorean Expected Wins formula to calculate a coefficient reflecting the number of runs each team in the Major Leagues scored per win. Using our statistic, baseball teams would be able to compare the impact of their players on the team when evaluating talent and determining salary. Our investigation's original focusing question was "How much is an individual player worth to his team?" Over the course of the year, we modified our focusing question to: "What impact does each individual player have on his team's performance over the course of a season?" Though both ask very similar questions, there are significant differences between them. -
Signature Redacted Author
1 Using Machine Learning to Derive Insights from Sports Location Data by Joel Brooks Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY June 2018 @ Massachusetts Institute of Technology 2018. All rights reserved. Signature redacted Author.. - -- 'A 1- - .... ..................... A5 epartment of Electrical Engineering and Computer Science April 2, 2018 CertifiedCetiid byySignature redacted ....... ... ... .. ..... John Guttag Dugald C. Jackson Professor Thesis Supervisor Signature redacted Accepted by ........... I ((J Leslie A. Kolodziejski Professor of Electrical Engineering and Computer Science Chair, Department Committee on Graduate Students MASSACHUSETTS INSTITUTE OF TECHNOLOGY JUN 18 2018 LIBRARIES ARCHIVES 77 Massachusetts Avenue Cambridge, MA 02139 MITLibraries http://Iibraries.mit.edu/ask DISCLAIMER NOTICE Due to the condition of the original material, there are unavoidable flaws in this reproduction. We have made every effort possible to provide you with the best copy available. Thank you. The images contained in this document are of the best quality available. 2 Using Machine Learning to Derive Insights from Sports Location Data by Joel Brooks Submitted to the Department of Electrical Engineering and Computer Science on April 2, 2018, in partial fulfillment of the requirements for the degree of Doctor of Philosophy Abstract Historically, much of sports analytics has aimed to find relationships between discrete events and outcomes. The availability of high-resolution event location and tracking data has led to many new opportunities in sports research. However, it is often challenging to apply machine learning to understand a particular aspect of a sport. -
Sabermetrics Over Time: Persuasion and Symbolic Convergence Across a Diffusion of Innovations
SABERMETRICS OVER TIME: PERSUASION AND SYMBOLIC CONVERGENCE ACROSS A DIFFUSION OF INNOVATIONS BY NATHANIEL H. STOLTZ A Thesis Submitted to the Graduate Faculty of WAKE FOREST UNIVERSITY GRADUATE SCHOOL OF ARTS AND SCIENCES in Partial Fulfillment of the Requirements for the Degree of MASTER OF ARTS Communication May 2014 Winston-Salem, North Carolina Approved By: Michael D. Hazen, Ph. D., Advisor John T. Llewellyn, Ph. D., Chair Todd A. McFall, Ph. D. ii Acknowledgments First and foremost, I would like to thank everyone who has assisted me along the way in what has not always been the smoothest of academic journeys. It begins with the wonderful group of faculty I encountered as an undergraduate in the James Madison Writing, Rhetoric, and Technical Communication department, especially my advisor, Cindy Allen. Without them, I would never have been prepared to complete my undergraduate studies, let alone take on the challenges of graduate work. I also want to thank the admissions committee at Wake Forest for giving me the opportunity to have a graduate school experience at a leading program. Further, I have unending gratitude for the guidance and patience of my thesis committee: Dr. Michael Hazen, who guided me from sitting in his office with no ideas all the way up to achieving a completed thesis, Dr. John Llewellyn, whose attention to detail helped me push myself and my writing to greater heights, and Dr. Todd McFall, who agreed to assist the project on short notice and contributed a number of interesting ideas. Finally, I have many to thank on a personal level. -
Game Information @Cakesbaseball
game information www.cakesbaseball.com @cakesbaseball Wednesday, May 9 new orleans baby cakes (17-15) at round rock express (11-22) 7:05 p.m. CT Dell Diamond Game #33 Round Rock, TX RHP Zac Gallen (3-1, 2.50) vs. LHP David Hurlbut (0-2, 4.86) Road Game #18 THE BABY CAKES – e New Orleans Baby Cakes continue an eight ROAD WARRIORS – A er losing each of their rst nine road games of ‘cakes corner game road trip tonight with the second of a fourgame set against the the season, the ‘Cakes have won seven of the last eight, including ve in Round Rock Express. e ‘Cakes have claimed four of ve meetings this a for the rst time since July 1227, 2015. e last sixgame road winning Current Streak ...........................W6 Home ........................................105 season, including three in their nal atbat last week at the Shrine. Last streak was a sevengame run vs. Round Rock and Memphis, June 1226, Streak ..........................................W1 night matched the secondlargest margin of victory for New Orleans in 2007. New Orleans has won the rst ve games of a road trip for the rst Road ......................................... 710 105 alltime matchups at Dell Diamond (ninerun wins four times). time since its seasonending trek through Oklahoma City and Round Streak ..........................................W5 Rock from August 27-31, 2007 which clinched the club’s last division title. Last 10 ........................................ 82 vs. RHP (starter) ....................1313 YESTERDAY’S NEWS – e Baby Cakes used a 14hit attack to defeat vs. LHP (starter) ....................... 42 Round Rock, 102 for their sixth consecutive victory. Eric Campbell had HOT SOUP – Eric Campbell extended his hitting streak to 11 games, the vs.