Incorporating Spatiotemporal Machine Learning into Major League Baseball and the National Football League

by

Jeremy H. Hochstedler

B.S. Electrical Engineering, Rose-Hulman Institute of Technology, 2006

M.S. Electrical Engineering, University of Notre Dame, 2008

M.S. Management Science and Engineering, Stanford University, 2012

Submitted to the Systems Design and Management Program In Partial Fulfillment of the Requirements for the Degree of

Master of Science in Engineering and Management at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2016

© 2016 Jeremy H. Hochstedler. All rights reserved. The author hereby grants MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Signature of Author: [Signature redacted]
Jeremy H. Hochstedler
System Design and Management Program
June 2016

Certified by: [Signature redacted]
Dr. Sanjay Sarma
Dean, Digital Learning & Professor, Mechanical Engineering
Thesis Supervisor

Accepted by: [Signature redacted]
Patrick Hale
Director, System Design and Management Program

Incorporating Spatiotemporal Machine Learning into Major League Baseball and the National Football League

by

Jeremy H. Hochstedler

Submitted to the MIT System Design and Management Program on May 11, 2016 in partial fulfillment of the requirements for the Degree of Master of Science in Engineering and Management

Abstract

Rich data sets exist in Major League Baseball (MLB) and the National Football League (NFL) that track players and equipment (i.e. the ball) in space and time. Using machine learning and other analytical techniques, this research explores the various data sets in each sport, providing advanced insights for team decision makers. Additionally, a framework will be presented on how the results can impact organizational decision-making.

Qualitative research methods (e.g. interviews with front office personnel) provide the analysis with context and breadth, whereas various quantitative analyses supply depth. For example, the reader will be exposed to mathematical/computer science terms such as Kohonen Networks and Voronoi Tessellations. However, they are presented with great care to simplify the concepts, making them accessible to most readers. As this research is jointly supported by the engineering and management schools, certain topics are kept at a higher level for readability. For any questions, contact the author for further discussion.

Part I will address the distinction between performance and production, followed briefly by a decomposition of a typical MLB organizational structure, and finally display how the results of these analyses can directly impact areas such as player evaluation, advance scouting, and in-game strategy. Part II will similarly present how machine learning analyses can impact opponent scouting and personnel evaluation in the NFL.

Thesis Supervisors:

Dr. Sanjay Sarma, Dean, Digital Learning & Professor, Mechanical Engineering

Dr. Abel Sanchez, Executive Director, Geospatial Data Center & Lecturer, Computer Science



Contents

Chapter 1. Introduction ...... 8
1.1. Motivation ...... 8
1.2. Performance vs. Production ...... 8
1.3. Problem Statement ...... 9
Chapter 2. Analysis & Decision Making in Major League Baseball ...... 10
2.1. Measuring Hitter Performance ...... 10
2.2. Evaluating Pitchers Using Neural Networks ...... 14
2.2.2. Identification of Similar Pitchers ...... 16
2.2.3. Predicting Future Production in Unproven Pitchers ...... 19
2.3. Incorporating Analyses Into MLB Decision-Making Processes ...... 22
2.3.1. Model Verification ...... 22
2.3.1.1. Mean Absolute Error ...... 26
2.3.1.2. Root Mean Square Error ...... 27
2.3.1.3. Competition Testing ...... 27
2.3.1.4. Modified Receiver Operating Characteristic ...... 28
2.3.2. MLB Organizational Structure ...... 30
Chapter 3. Analysis & Decision Making in the National Football League ...... 33
3.1. Winning and Avoiding Injuries ...... 33
3.2. Data Collection ...... 34
3.3. Receiver Openness and QB Decision-Making ...... 35
3.3.1. Zone Size ...... 37
3.3.2. Zone Integrity ...... 37
3.3.3. Openness Classification ...... 38
3.3.4. Expected Gain ...... 39
3.3.5. Player Elusiveness ...... 40
3.3.6. QB Decision Analysis ...... 41
3.4. Play Identification Using Supervised Learning ...... 43
3.4.1. Formation Classification ...... 43
3.4.2. Action Classification ...... 43
3.4.3. Play Concept Classification and Similarity ...... 48
3.5. Incorporating Analyses Into NFL Decision-Making Processes ...... 50
3.5.1. NFL Organizational Structure ...... 51
Chapter 4. Conclusion ...... 52
Chapter 5. References ...... 53
Chapter 6. Acknowledgements ...... 55


Figures

Figure 1. Hitter Pitch FX Coordinate System Displaying Launch and Spray Angles ...... 10
Figure 2. Distributions of Exit Speed and Launch Angle ...... 11
Figure 3. Scatter Plot of Exit Speed and Launch Angle ...... 12
Figure 4. Measuring Performance of Exit Speed and Launch Angle ...... 13
Figure 5. Training Progress Over 100 Iterations ...... 16
Figure 6. Kohonen Map of 415 RHP's from the 2015 MLB Season ...... 17
Figure 7. Scatter Plot of Model A Predicted vs. Actual Hitter Values ...... 23
Figure 8. Distributions of Hitter Projection Models ...... 24
Figure 9. Gaussian Distribution of Four Player Projection Models with Actual Values ...... 25
Figure 10. Gaussian Distribution After Removal of Small Sample Players ...... 25
Figure 11. Performance Evaluation of Each Prediction Model ...... 28
Figure 12. Zoomed Evaluation of Each Prediction Model ...... 29
Figure 13. Simplified MLB Team Baseball Operations Organizational Structure ...... 30
Figure 14. NFL QB Passer Rating vs. Receiver Concussions and Team Losses ...... 34
Figure 15. and Distance Utility Function ...... 35
Figure 16. Traditional Voronoi Tessellation ...... 35
Figure 17. "Predictive" Voronoi Tessellation ...... 35
Figure 18. Distribution of Predicted Voronoi Zone Size for Eligible Receivers ...... 37
Figure 19. First 2014 Colts Play from Scrimmage ...... 39
Figure 20. Expected Play Gain on 148 Completions ...... 40
Figure 21. Segmented Euclidean Regression of Two Distinct #8 (Post) Routes ...... 44
Figure 22. Three Segment Euclidean Regression of Multiple Drag Routes ...... 45
Figure 23. Full Play Displaying Actual and Estimated Receiver Routes ...... 46
Figure 24. Confusion Matrix of Receiver Route Prediction Model ...... 47
Figure 25. Prediction Distribution of True #3 (Out) Routes ...... 48
Figure 26. Kohonen Map of 231 Offensive Plays ...... 49
Figure 27. Kohonen Map Displaying Seven (qty. 7) Clusters & Associated Tags ...... 49
Figure 28. Simplified NFL Team Football Operations Organizational Structure ...... 51


Tables

Table 1. Feature Set of Three RHP's Displaying Two Pitch Types ...... 16
Table 2. Similarity of Each Team's RHP Trio ...... 18
Table 3. Top 10 Starting RHP's from the 2014 - 2015 Seasons ...... 20
Table 4. Pitcher Kohonen Clusters (Nos. 4, 8, 19) Displaying Pitch Velocities ...... 21
Table 5. Mean Absolute Error Results ...... 26
Table 6. Root Mean Square Error Results ...... 27
Table 7. Competition Test Results ...... 28
Table 8. A. Luck 2014 Completion Percentage as Function of Receiver Openness ...... 38
Table 9. Receiver Elusiveness for 148 Completions from 2014 Colts Base Offense ...... 41
Table 10. QB (A. Luck) Decision Analysis Results ...... 42
Table 11. Outcome Analysis of Seven Play Types ...... 50


Chapter 1. Introduction

"A man can be as great as he wants to be. Ifyou believe in yourself and have the courage, the determination, the dedication, the competitive drive, and ifyou are willing to sacrifice the little things in life, it can be done." Vince Lombardi

1.1. Motivation

I love to compete. To me, as a former college player and coach, sports are the ultimate form of competition. I also find fulfillment in using math and science to answer interesting questions. Combine these and one can clearly see that my passion lies at the intersection of sports and data science.

Through this research, including the many conversations with coaches and front office personnel, I have gained a deeper understanding of the "inside" of professional sports organizations. It is my mission to be a leader in the sports industry, and it is my goal to be a part of two championships: one in baseball, one in football. This goal is driven out of a quest to experience the extraordinary. As this is a lifelong mission, this research is simply another step in the process.

1.2. Performance vs. Production

Growing up, when at the plate, I was taught by my Dad and other coaches to "find a way to get it done." If a bloop single scored the run, then I succeeded. It wasn't until I became a coach myself that my perspective changed. I began coaching in 2009 and was introduced to the concept of "Performance vs. Production," defined as:

Performance - the execution of an action. In baseball, a hitter's performance is the sum of all actions within his control. It's the process, or how he performs.

Production - output. In sports, it's the outcome of his performance, or what he produces.


The "Performance vs. Production" philosophy teaches us to focus on what we can control. A hitter who "squared up" a ball with a 100+ mph exit velocity at a +5 degree angle from horizontal may not get out of the box as the third baseman snares the line drive. In the old days, this scenario would have me believe that I did nothing but fail. However, I now realize how the hitter with the 100+ mph exit velocity actually performed at an extraordinarily high level, yet his production fell to some bad luck (or possibly a strategically well placed defender). The overarching philosophy of this thesis is to measure performance in lieu of production.

1.3. Problem Statement

Spatiotemporal data possess space and time characteristics. Examples include raw flight radar data, stored GPS data, and lunar tracking data. Here, the focus remains on data in sports, specifically baseball and American football. Thus, the question:

How can professional sports organizations gain stronger leverage through the analysis of spatiotemporal data?

This research applies new methods to multiple spatiotemporal data sets in MLB and the NFL. Specifically, MLB's PitchFX data and geospatial NFL data are analyzed in new ways to create new insights for decision-makers within their respective organizations.


Chapter 2. Analysis & Decision Making in Major League Baseball

While the analyses presented here are a sample of the power machine learning can bring to front offices, it is ultimately the decision of organizational leadership where to apply machine learning and other advanced techniques. For example, it is possible resources could be applied to better understanding biomechanics, stress, fatigue, psychology, camaraderie, or other existing elements that remain vastly unknown. Additionally, machine learning could be applied to better understand the most valuable unknown commodity in sports: the pitching arm (Passan, 2016). Regardless, the analyses that follow are merely provided as examples of what can be done.

2.1. Measuring Hitter Performance

When analyzing how a hitter performs, the outcome must be temporarily removed, allowing the analysis to focus on what the hitter can control. Fundamentally, ball exit velocity (speed and trajectory) off the hitter's bat is used to characterize his performance. In the sabermetric community, the exit velocity comprises three components: ball exit speed off the bat, vertical launch angle, and horizontal spray angle. Figure 1 illustrates the distinction between launch and spray angles.

Figure 1. Hitter Pitch FX Coordinate System Displaying Launch and Spray Angles


In developing this philosophy, an assumption is made that a hitter cannot control his spray angle at the "micro" level. Yes, spray charts prove that certain players (e.g. David Ortiz, Barry Bonds) predominantly pull the baseball at the macro level. However, on the micro level, hitters are taught to keep their bat through the hitting zone longer. That is, once a hitter's hands have started and his bat travels through the zone, his expected spray angle continuously changes with no ability to make an adjustment without severely sacrificing bat speed. Therefore, this discussion focuses on the two factors a hitter most certainly can control: ball exit speed and launch angle. Traditionally these are recognized as - and tied to - "bat speed" and "timing," respectively. Figure 2 uses the April 2009 publicly released HITf/x data to show a distribution of each of these factors.

Figure 2. Distributions of Exit Speed and Launch Angle

1 This analysis has been modified from the author's original work (Hochstedler, 2013).


Now plotting every batted ball's launch angle/exit velocity combination, the scatter plot in Figure 3 allows the two factors within a hitter's control to be observed in one image.


Figure 3. Scatter Plot of Exit Speed and Launch Angle

Although interesting, these two relationships must be quantified in order to maximize understanding. Long term, it is assumed that increased performance will lead to increased production; therefore, the focus on measuring performance is maintained. A hitter is unable to control how an opposing player fields his batted ball. Additionally, a hitter has virtually no control over reaching on errors. Even singles, doubles, and triples possess a significant defensive bias.

Therefore, luck, chance, ballpark effects, and defensive alignment/skill are removed to evaluate a hitter's performance. To do so, exit speed and launch angle combinations are measured against expected outcomes. Each factor is divided into 40 "bins" (exit speed into 4 mph increments; launch angle into 3 degree increments), yielding a 40x40 matrix of 1,600 bins (many are empty, shown grey). Within each bin, a certain number of batted balls are tracked and marked with a specific outcome (e.g. out, single, HR, etc.). Here linear weights (i.e. wOBA) are used, then averaged within each 4 mph x 3 degree bin. [Note: any outcome measurement such as batting average, on-base percentage (OBP), slugging percentage, etc. may be substituted.] Figure 4 provides a heat map displaying the outcome of the analysis. Red and yellow denote high wOBA (HRs and XBH's); teal/green depict medium wOBA (e.g. singles); blue represents poor wOBA (e.g. outs). Optimal hitting occurs approximately between 18-40 degrees at speeds in excess of 95 mph.


Figure 4. Measuring Performance of Exit Speed and Launch Angle Using Linear Weights

By aggregating all balls in play, the performance of each plate appearance can be measured on a continuous performance scale in lieu of the discrete production scale. Using OBP as the outcome measurement, for example, the outcome-based method would permit only a binary assignment of skill for each individual plate appearance ("1" for reaching base successfully, "0" for producing an out). By using the performance measures of ball exit speed and vertical launch angle, however, the likelihood of reaching base given those two parameters is used instead (a number between 0 and 1, depending on the bin associated with the two velocity characteristics measured for that specific plate appearance).
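As an illustrative sketch of the binning scheme described above, the following snippet averages the linear weight of batted balls within each 4 mph x 3 degree bin and scores a plate appearance by its bin's expected wOBA. The batted-ball records and linear-weight values are hypothetical:

```python
# Hypothetical batted-ball records: (exit speed in mph, launch angle in deg,
# linear-weight value of the actual outcome). Values are made up for illustration.
batted_balls = [
    (101.0, 22.0, 2.10),  # home run
    (99.5, 20.5, 1.27),   # double
    (88.0, 5.0, 0.00),    # lineout
    (87.5, 6.5, 0.90),    # single
]

def bin_index(speed, angle, speed_step=4.0, angle_step=3.0):
    """Map a (speed, angle) pair to its 4 mph x 3 degree bin."""
    return (int(speed // speed_step), int(angle // angle_step))

# Average the linear weight of every batted ball landing in each bin.
sums, counts = {}, {}
for speed, angle, woba in batted_balls:
    key = bin_index(speed, angle)
    sums[key] = sums.get(key, 0.0) + woba
    counts[key] = counts.get(key, 0) + 1
expected_woba = {k: sums[k] / counts[k] for k in sums}

def performance(speed, angle):
    """Score a plate appearance by its bin's expected wOBA, not its actual outcome."""
    return expected_woba.get(bin_index(speed, angle), 0.0)
```

With real data, each bin would contain many batted balls, so the per-bin average approximates the expected outcome for that speed/angle combination.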


Since the growth of quantitative analysis in professional baseball, team analytics staffs have been expanding (Lindbergh and Arthur, 2016). As such, the need for third-party data analysis is decreasing. For this reason, the author has entered into an exclusive agreement with a professional baseball organization that includes a strict confidentiality clause restricting the public disclosure of work performed. Therefore, this thesis is unfortunately unable to detail how portions of the author's analysis have evolved since their original publication in 2013.

However, measuring performance in lieu of production enables more precise player evaluation by eliminating factors outside the control of the hitter. Subsequently, this method will lead to more accurate player projections. This approach can also be used to evaluate pitchers and remains a core principle to any analysis performed in this research. In general, the author utilizes this philosophy in any research or analysis he performs.

2.2. Evaluating Pitchers Using Neural Networks

Understanding pitchers' pitch types, velocities, movement, and usage is vital to player evaluation and to developing a strategy to defeat the opposition. This analysis uses an artificial neural network to group pitchers by similarity. Specifically, using a Kohonen Network, pitchers are grouped by the performance of their pitches using 2015 Pitch FX data.

2.2.1. Data Collection

The process below identifies the steps taken to extract, transform, and prepare the data for analysis. First, data is scraped from MLBAM's 2015 Pitch FX database, returning information on more than 700,000 pitches. Features are then extracted for each pitch.


Specifically, the following pitch characteristics are retained:
1. Pitcher (name and ID)
2. Pitcher handedness
3. Pitch type (as classified by MLB Advanced Media's algorithm)
4. Speed (mph at 50' from plate)
5. Release point (x0, z0 at 50' from plate)
6. Break (pfx_x, pfx_z when ball crosses plate)
7. Play outcome (specifically to gain BB/9 as a measure of pitcher control)

The data is consolidated by averaging across individual pitchers. Pitch types are consolidated into five types: FF (four seam fastball), CH (change-up), CU/KC (curveball), SL/FC (slider/cutter), and SI/FT (sinker/two seam fastball). Known as the "feature engineering" step, the decisions here to group sliders with cut fastballs and sinkers with two seam fastballs are open to critique. However, aside from the traditional fastball, change-up, and breaking ball, characteristics of pitches that "cut" (i.e. SL/FC) are differentiated from those that "run/sink" (i.e. SI/FT). To gain a deeper level of fidelity in this analysis, left-handed pitchers (LHP) and knuckleballers (e.g. R. Dickey and S. Wright) are omitted. That is, knuckleballers are known to be unique, and comparing a LHP to his right-handed (RHP) counterparts from a physical standpoint would be odd, at best. Filtering for RHP's with more than 50 batters faced leaves 415 total pitchers.
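The consolidation step above can be sketched as follows. The grouping rules come from the text; the sample pitch rows are hypothetical, and only speed is shown (release point and break would be averaged the same way):

```python
from collections import defaultdict

# Consolidation rules from the text: curveballs together, pitches that "cut"
# together, pitches that "run/sink" together.
CONSOLIDATE = {
    "FF": "FF", "CH": "CH",
    "CU": "CU/KC", "KC": "CU/KC",
    "SL": "SL/FC", "FC": "SL/FC",
    "SI": "SI/FT", "FT": "SI/FT",
}

# Hypothetical per-pitch rows scraped from the Pitch FX database.
pitches = [
    {"pitcher": "Arrieta, Jake", "type": "SI", "speed": 94.7},
    {"pitcher": "Arrieta, Jake", "type": "FT", "speed": 94.5},
    {"pitcher": "Arrieta, Jake", "type": "FF", "speed": 94.6},
]

# Collect speeds per (pitcher, consolidated pitch type) pair.
acc = defaultdict(list)
for p in pitches:
    acc[(p["pitcher"], CONSOLIDATE[p["type"]])].append(p["speed"])

# One averaged feature per pitcher per consolidated pitch type.
avg_speed = {k: sum(v) / len(v) for k, v in acc.items()}
```

Here the SI and FT rows collapse into a single SI/FT feature, mirroring the sinker/two-seam grouping described above.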

This analysis focuses on the aspects that a pitcher can control, namely his "stuff": pitch release points, velocities, and breaks. The only feature element (of the 26) that partially violates this methodology is BB/PA; this is used for simplicity and further work could improve this analysis (e.g. consider pitchers' abilities to avoid belt high, center cut pitches, balls out of the strike zone when behind in the count, etc.). Table 1 displays three samples of the prepared feature set (due to space constraints, only FF and SL pitch types are presented).


Name              FF Speed  SL Speed  FF Rel.X  SL Rel.X  FF Rel.Z  SL Rel.Z  FF H.Brk  SL H.Brk  FF V.Brk  SL V.Brk  BB/PA
Arrieta, Jake     94.6      90.2      -2.8      -3.1      6.2       6.1       -5.2      3.0       8.7       2.3       0.048
Harvey, Matt      95.8      89.5      -1.0      -1.1      6.1       6.1       -6.1      0.7       9.1       2.7       0.046
Hernandez, Felix  92.2      84.1      -2.0      -2.2      5.9       5.9       -3.7      2.1       6.5       -3.9      0.060
(Speeds in mph; release points in ft; breaks in in)
Table 1. Feature Set of Three RHP's Displaying Two Pitch Types (FF & SL)

2.2.2. Identification of Similar Pitchers

Kohonen Networks use unsupervised neural-network learning to compare high-dimensional data sets. After scaling the 26 feature inputs that describe the performance of each pitcher, the feature set is inserted into the Kohonen competitive learning algorithm. After n iterations (here, n = 100; refer to Figure 5 below) of internal competition within the algorithm, the data self-organizes by minimizing the mean Euclidean distance between nearest units. Basically, this enables the most similar pitchers to cluster (group) together.
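As an illustrative sketch (not the thesis's actual implementation, which is not detailed), a minimal Kohonen self-organizing map can be trained in a few lines. The grid size, learning-rate schedule, and neighborhood schedule below are arbitrary assumed choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(5, 5), iters=100, lr0=0.5, sigma0=2.0):
    """Train a minimal Kohonen self-organizing map on scaled feature rows."""
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])],
                      dtype=float)
    w = rng.normal(size=(grid[0] * grid[1], data.shape[1]))
    for t in range(iters):
        lr = lr0 * (1 - t / iters)              # decaying learning rate
        sigma = sigma0 * (1 - t / iters) + 0.5  # decaying neighborhood radius
        for x in data:
            bmu = np.argmin(((w - x) ** 2).sum(axis=1))      # best-matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)   # squared grid distance
            h = np.exp(-d2 / (2 * sigma ** 2))               # neighborhood kernel
            w += lr * h[:, None] * (x - w)                   # pull units toward x
    return w, coords

def bmu_of(x, w):
    """Index of the map unit nearest to feature vector x."""
    return int(np.argmin(((w - x) ** 2).sum(axis=1)))
```

Pitchers mapped to the same or adjacent units would then be treated as most similar, as in the map in Figure 6.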


Figure 5. Training Progress Over 100 Iterations


Figure 6 displays the Kohonen map of the 415 pitchers analyzed. Each node (circle) encompasses 0-7 pitchers (refer to the scale at left). In a Kohonen map, the closer two nodes lie, the more similar the pitchers they contain (as described by their 26 features).


Figure 6. Kohonen Map of 415 RHP's from the 2015 MLB Season

Performing this analysis across the top 2015 RHP trio for each team (as identified by team depth charts on MLB.com), the New York Mets' trio is observed to be the most similar when evaluating physical performance. Table 2 displays the similarity results for all 30 teams. Note: where a third starting RHP did not qualify, the team's top reliever (closer) was used.


Rank  Team  Pitchers                          Euclidean Separation
1     NYM   Harvey/deGrom/Syndergaard         7.63
2     BOS   Porcello/Kelly/Buchholz           8.00
3     ATL   Miller/Teheran/Wisler             8.49
4     CLE   Kluber/Carrasco/Salazar           9.21
5     TEX   Gallardo/Lewis/Gonzalez           10.31
6     SD    Shields/Ross/Kennedy              10.37
7     SF    Cain/Leake/Heston                 10.51
8     ARI   De La Rosa/Anderson/Hellickson    10.78
9     OAK   Gray/Chavez/Bassitt               10.96
10    MIN   Hughes/Santana/Gibson             11.09
11    MIA   Fernandez/Koehler/Phelps          11.20
12    LAA   Richards/Weaver/Shoemaker         11.65
13    KC    Cueto/Volquez/Ventura             11.66
14    CHC   Arrieta/Hendricks/Hammel          12.24
15    WAS   Scherzer/Zimmermann/Strasburg     12.27
16    PHI   Harang/Nola/Eickhoff              12.75
17    SEA   Hernandez/Iwakuma/Walker          13.28
18    MIL   Nelson/Peralta/Jungmann           15.08
19    HOU   McHugh/Fiers/McCullers            15.29
20    NYY   Tanaka/Pineda/Warren              15.45
21    BAL   Tillman/Gonzalez/Jimenez          16.16
22    TB    Archer/Odorizzi/Ramirez           16.90
23    DET   Verlander/Simon/Sanchez           17.23
24    PIT   Cole/Burnett/Morton               17.29
25    TOR   Stroman/Estrada/Osuna             17.37
26    STL   Wainwright/Lackey/Wacha           17.61
27    CIN   DeSclafani/Iglesias/Smith         17.98
28    LAD   Greinke/Bolsinger/Jansen          18.14
29    CWS   Samardzija/Johnson/Robertson      22.13
30    COL   Kendrick/Bettis/Gray              23.90
Table 2. Similarity of Each Team's RHP Trio (Lower = More Similar)
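The thesis does not spell out exactly how the per-trio "Euclidean Separation" in Table 2 is aggregated; one plausible reading, the sum of pairwise Euclidean distances among the three pitchers' scaled feature vectors, can be sketched as:

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def trio_separation(vectors):
    """Sum of pairwise Euclidean distances among a trio's feature vectors;
    a lower total means a more homogeneous trio."""
    return sum(euclidean(a, b) for a, b in combinations(vectors, 2))

# Hypothetical 2-D feature vectors standing in for the 26-feature vectors.
trio = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0)]
```

Under this reading, ranking the 30 trios is simply a sort on `trio_separation` over each team's three vectors.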


While interesting, this analysis alone provides little impact to organizational decision-making. In order to benefit a ball club, it must be expanded to analyze pitcher transformation from season to season. For example, this approach can identify opportunities and predict how changes in mechanics such as release point, velocity, and spin could improve a pitcher. Communication of such mechanical adjustments to the player must be handled carefully, through a trusted source (Johnson, 2013). Other applications include predicting hitter performance based on the pitcher "type" he is facing (e.g. creating new segmentations of pitcher types, or "splits," beyond simple handedness). Finally, predicting the future performance of unproven pitchers based on their similarity to established pitchers could benefit organizational decision-making.

2.2.3. Predicting Future Production in Unproven Pitchers

Many attempts are made to predict player performance, particularly for young, unproven pitchers. This method identifies the styles, movement, and combinations that "work," then finds pitchers with similar characteristics. First, successful pitching styles must be identified. Using Fangraphs.com, the top 10 RHP's (min. 50 GS) from the 2014-2015 seasons are identified by their value per game started and presented in Table 3. Here, wins above replacement (WAR) is used to establish value, while normalizing by games started (GS) reduces the negative consequences of injuries.


MLB Rank  Name                Cluster  GS  IP     K/9    WAR   WAR/GS
1         Arrieta, Jake       19       58  385.2  9.40   12.3  0.21
2         Kluber, Corey       18       66  457.2  10.11  12.9  0.20
3         Scherzer, Max       4        66  449.0  10.58  11.6  0.18
4         deGrom, Jacob       8        52  331.1  9.48   8.7   0.17
5         Greinke, Zack       24       64  425.0  8.62   10.2  0.16
6         Cole, Gerrit        4        54  346.0  8.84   7.8   0.14
7         Strasburg, Stephen  19       57  342.1  10.44  7.8   0.14
8         Hernandez, Felix    23       65  437.2  9.03   8.8   0.14
9         Cueto, Johnny       8        66  455.2  8.26   8.7   0.13
10        Archer, Chris       2        66  406.2  9.41   8.5   0.13

Table 3. Top 10 Starting RHP's from the 2014 - 2015 Seasons

After filtering the Kohonen map derived in Section 2.2, three groups (out of 25) emerge as the most interesting: Cluster Nos. 4, 8, and 19 are the only groups with at least two top 10 RHP's (as determined by WAR/GS above). Table 4 displays the three clusters along with the velocities of their pitch repertoire (due to space constraints, their entire release, movement, and spin metrics were not included).


Cluster  Name                 CH_mph  SI_mph  FF_mph  SL_mph  CU_mph
4        Verlander, Justin    86.8    -       93.0    85.7    79.4
4        Scherzer, Max        84.9    -       94.4    86.3    79.5
4        Cole, Gerrit         88.1    -       95.6    87.2    81.6
4        Mitchell, Bryan      88.9    -       96.1    91.7    82.5
4        Velasquez, Vincent   86.4    -       94.5    84.1    81.5
8        Harvey, Matt         88.4    95.9    95.8    89.5    83.8
8        Cueto, Johnny        83.4    92.6    92.6    86.5    76.3
8        Frias, Carlos        87.0    94.4    94.3    89.6    79.6
8        Zimmermann, Jordan   86.6    93.8    93.1    87.8    80.6
8        Lorenzen, Michael    85.1    94.4    94.0    86.0    80.3
8        deGrom, Jacob        85.6    94.8    95.2    89.8    81.6
19       Arrieta, Jake        89.1    94.7    94.6    90.2    80.8
19       Salazar, Danny       85.3    94.3    94.8    85.0    79.0
19       Strasburg, Stephen   88.8    95.7    95.7    88.4    82.3
19       Foltynewicz, Mike    84.4    94.7    95.1    84.0    75.3
19       Sanchez, Aaron       89.2    95.6    94.8    91.0    79.3
Table 4. Pitcher Kohonen Clusters (Nos. 4, 8, 19) Displaying Pitch Velocities

Pitchers with "household" names such as Scherzer, Cole, Harvey, deGrom, Arrieta, and Strasburg are known to be successful based off their production; however relatively young, unknown pitchers such as Mitchell, Velasquez, Frias, Lorenzen, Foltynewicz, and Sanchez don't have significant major league experience and thus, scouts are limited in their evaluation. The pitchers within each group possess the most similar talent (i.e. performance as defined by their Pitch FX characteristics) to the other pitchers within their respective cluster. By measuring their talent/performance and matching them with other, known successful pitchers, we can predict that given the opportunity, their production will follow. That is, with mechanical or psychological adjustments (Mayne, 2008; Johnson, 2013) such as pitch command/execution improvement, look for pitchers like Mitchell, Velasquez, Frias, Lorenzen, Foltynewicz, and Sanchez to become successful, productive pitchers in 2016 or beyond.


2.3. Incorporating Analyses Into MLB Decision-Making Processes

The importance of communicating with data cannot be overstated. A pristine analysis will fail if it is not communicated clearly and effectively throughout the organization. This section outlines the verification and communication processes required to effectively transfer the knowledge created via these analyses.

2.3.1. Model Verification

Prior to providing input to organizational decision-makers, any prediction, model, or recommendation shall be independently verified. Verification may be reviewed qualitatively via peer review; however, quantitative test methods provide an objective, unbiased review of the recommendation.

For example, a quantitative method of testing player projection models shall be established prior to development of the actual model. Here, four methods are presented which test and verify four distinct player projection models from the 2015 MLB season. Because test methods should be developed prior to examination of the models themselves, each shall be anonymized to prevent analyst bias.

First, the data must be characterized and scaled to ensure alignment across models. As these models are used to predict individual performances, and less concern is present about systemic evaluation (i.e. this is not used to evaluate league averages year over year), each model was scaled by adding a fixed offset to each individual hitter's value such that each model's mean equated to the actual mean. Characterization of the models is displayed below in plate appearances and scaled hitting value.
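The mean-matching offset can be sketched with hypothetical values; the real inputs would be each model's per-hitter projections and the actual season values:

```python
# Hypothetical hitter values for the actual season and one projection model.
actual = [0.250, 0.270, 0.264]
model = [0.240, 0.268, 0.250]

# Shift every projection by one constant so the model mean equals the actual mean.
offset = sum(actual) / len(actual) - sum(model) / len(model)
scaled = [v + offset for v in model]
```

Because a single constant is added to every hitter, the spread and ordering of each model's projections are untouched; only the mean is aligned.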


Model    Plate Appearances  Mean Hitting Value (scaled)
Actual   169,577            0.2612
Model A  192,857            0.2612
Model B  173,470            0.2612
Model C  162,171            0.2612
Model D  183,498            0.2612

For further characterization, Figure 7 presents the predicted vs. actual hitter values of a single model, Model A.


Figure 7. Scatter Plot of Model A Predicted vs. Actual Hitter Values


Similarly for characterization purposes, Figure 8 presents the distributions of each hitter projection model.

[Figure: histograms of hitter value for each projection model; panel titles give standard deviations: Model A sd = 0.0244, Model B sd = 0.022, Model C sd = 0.0222, Model D sd = 0.0204. x-axis: Hitter Value.]

Figure 8. Distributions of Hitter Projection Models

Gaussian fits provide further comparison between the models and the actual values. Figure 9 presents this characterization. Note: because each model projects the identical player set (the areas under the curves are the same), the height of each Gaussian distribution allows the researcher to quickly understand the comparative differences in standard deviation.



Figure 9. Gaussian Distribution of Four Player Projection Models with Actual Values

The long tails that skew the Actual distribution fit are magnified by atypical hitter values caused by low PA sample sizes. Therefore, the data has been filtered to exclude hitters with fewer than 50 PA's, yielding the presentation in Figure 10.


Figure 10. Gaussian Distribution After Removal of Small Sample Players


Test method selection is critical. A review of available test methods and industry precedent shall be performed prior to determining the appropriate test method(s). Here, four distinct test methods are used: mean absolute error (MAE), root mean square error (RMSE), competition, and receiver operating characteristic (ROC) measurements. MAE and RMSE are straightforward metrics used within the industry (Wyers, 2011; Tango, 2011; Swartz, 2012; Paine, 2015), whereas the competition and ROC analyses are more sensitive to methodological choices.

2.3.1.1. Mean Absolute Error

MAE measures the mean magnitude of errors from a set of forecasts. Specifically, MAE averages the absolute differences between each forecast and its corresponding observation (actual value).

Symbolically, MAE is defined as:

MAE = (1/n) * Σ_{i=1..n} |P_i - A_i|

where:
P_i = predicted value
A_i = actual value
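As a concrete illustration, the MAE computation can be sketched in Python (the thesis's analyses used Python per the references); the projected and observed hitter values below are hypothetical, not drawn from the thesis data set.

```python
def mean_absolute_error(predicted, actual):
    """Average magnitude of the forecast errors |P_i - A_i|."""
    assert len(predicted) == len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical projected vs. actual hitter values for four players
projected = [0.320, 0.295, 0.340, 0.280]
observed = [0.310, 0.300, 0.325, 0.290]
print(mean_absolute_error(projected, observed))  # (0.010 + 0.005 + 0.015 + 0.010) / 4 = 0.010
```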

Performing this analysis yields the following results.

Model     MAE
Model A   0.0288
Model B   0.0283
Model C   0.0279
Model D   0.0269

Table 5. Mean Absolute Error Results

Here, Model D is observed to be the most accurate prediction algorithm as it maintains the lowest MAE.


2.3.1.2. Root Mean Square Error

RMSE averages the square of the difference between individual forecast-observation pairs; the square root of that average is then taken. Because errors are squared before averaging, RMSE places more emphasis on (is more negatively impacted by) outliers and riskier (higher standard deviation) projection systems.

Mathematically, RMSE is represented as:

RMSE = sqrt( (1/n) * Σ_{i=1..n} (P_i - A_i)² )

where:
P_i = predicted value
A_i = actual value
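A short Python sketch, with hypothetical residuals rather than thesis data, makes the outlier penalty concrete: two projection systems with identical MAE can differ sharply in RMSE.

```python
import math

def root_mean_square_error(predicted, actual):
    """Square each error, average, then take the square root."""
    n = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

actual = [0.0, 0.0, 0.0, 0.0]
steady = [0.02, 0.02, 0.02, 0.02]    # four modest misses; MAE = 0.02
volatile = [0.00, 0.00, 0.00, 0.08]  # one large miss;    MAE = 0.02
print(root_mean_square_error(steady, actual))    # 0.02
print(root_mean_square_error(volatile, actual))  # 0.04, despite the identical MAE
```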

Model     RMSE
Model A   0.0372
Model B   0.0363
Model C   0.0363
Model D   0.0351

Table 6. Root Mean Square Error Results

As Table 6 displays, Model D once again emerges as the most accurate projection model.

2.3.1.3. Competition Testing

In this test method, all models "compete" to determine which method "wins" most often. Here, the closest prediction model is awarded a "W" (win) and the remaining three models are given an "L" (loss). Optional cutoffs may be used to distinguish wins, losses, and ties; if employed, a sensitivity analysis must follow. The results are presented in Table 7 and display that Model D is accurate more often than its three competitors.
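The competition method can be sketched in Python. This version awards one win per hitter to the closest model and omits the optional win/loss/tie cutoffs discussed above; all values are hypothetical.

```python
from collections import Counter

def competition_record(predictions, actuals):
    """predictions: {model name: list of predicted values}. Awards one 'W' per
    hitter to the model with the smallest absolute error on that hitter."""
    wins = Counter()
    for i, a in enumerate(actuals):
        closest = min(predictions, key=lambda m: abs(predictions[m][i] - a))
        wins[closest] += 1
    return wins

# Hypothetical projections for three hitters from two models
preds = {"Model A": [0.300, 0.280, 0.350],
         "Model B": [0.320, 0.272, 0.345]}
actual = [0.305, 0.275, 0.340]
wins = competition_record(preds, actual)
print(wins["Model A"], wins["Model B"])  # 1 2
```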

Page 27 of 55 J. Hochstedler I MIT 2016

Model     Win %
Model A   0.227
Model B   0.257
Model C   0.244
Model D   0.272

Table 7. Competition Test Results

Note: careful attention must be paid to this test method. Similar to a political election, two models may be extremely similar and "split" votes/competitions, allowing an inferior candidate/model to emerge victorious. Nevertheless, this test method was included, if for no other reason than to make this point.

2.3.1.4. Modified Receiver Operating Characteristic

ROC curves present the accuracy of a classification model by varying the discrimination threshold. Once generated, the overall performance is measured using the area under the curve (AUC). To ease communication with those unfamiliar with standard ROC curves, which traditionally possess true-positive and false-positive rates on the axes, the x-axis has been replaced with the threshold limits. Figure 11 displays the ROC curves for the four models.


Figure 11. Performance Evaluation of Each Prediction Model

Page 28 of 55 J. Hochstedler I MIT 2016

Zooming the curve to a more interesting window yields Figure 12. To aid in understanding the significance of these figures, observe the lower x-axis extreme. A threshold limit of 0 would require a projection model to predict a hitter's value exactly. As this is extremely unlikely, the accuracy rate is observed to be at or near zero. Conversely, if the threshold limits were expanded to the largest bound (say, +/- 0.200), each model would "win" on its projection of every hitter.
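The modified curve can be sketched in Python: for each threshold limit, compute the fraction of hitters whose projection falls within that threshold of the actual value, then integrate for an AUC-style summary. The five player values are hypothetical.

```python
import numpy as np

def modified_roc(predicted, actual, thresholds):
    """Accuracy rate (share of projections within the threshold) per threshold limit."""
    errors = np.abs(np.asarray(predicted) - np.asarray(actual))
    return np.array([(errors <= t).mean() for t in thresholds])

predicted = np.array([0.260, 0.305, 0.292, 0.328, 0.345])
actual = np.array([0.260, 0.300, 0.280, 0.300, 0.300])
thresholds = np.linspace(0.0, 0.05, 6)
curve = modified_roc(predicted, actual, thresholds)       # rises from 0.2 toward 1.0
auc = float(np.sum((curve[1:] + curve[:-1]) / 2 * np.diff(thresholds)))  # trapezoidal AUC
```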


Figure 12. Zoomed Evaluation of Each Prediction Model

In observing the results, it is clear that the four models are closely contested; however, Model D emerges as the victor in all four test methods. These models can be further analyzed across subgroups (e.g. separation by age, experience, performance types, etc.). After improving the best model (or using a hybrid model), these projections must be incorporated into the day-to-day operations of the MLB organization.


2.3.2. MLB Organizational Structure

Once appropriate verification of models and analysis has occurred, the results must be incorporated into the organization's decision-making process. Naturally, organizational structures vary considerably across teams. For example, an emerging trend among teams includes an additional link in the executive decision-making chain; specifically, some teams employ a President of Baseball Operations who operates between the Owner/Chairman and the General Manager. While many other titles exist, such as Vice President, Special Assistant, and Coordinator, the simplified structure presented in Figure 13 serves as a framework with which to integrate machine learning analyses into the decision-making process.

Owner / Chairman
  General Manager
    Assistant General Manager
      Director, Professional Scouting
      Director, Amateur Scouting
      Director, Research & Development
        Systems
        Quantitative Analysis
      Director, Baseball Operations

Figure 13. Simplified MLB Team Baseball Operations Organizational Structure

While reporting structures differ across organizations, each team shall maintain a common, organized database to house the analyses of individual players and teams. These can include quantitative data, player value, scouting information, and more. Any machine learning analysis performed shall be simplified and presented within the team database enabling efficient

consumption by team decision makers. Additionally, extreme care and security measures (such as IP address restrictions, VPN requirements, and two-factor authentication) shall be utilized to ensure the data is accessible to team employees only.




Chapter 3. Analysis & Decision Making in the National Football League

While spatiotemporal analysis can impact media (Hochstedler & Hurst, 2016), the focus here remains on competitive advantage and strategy. The importance of gameday preparation and player evaluation cannot be overstated (Polian, 2014). As the main objective of scouting is to obtain as much useful information about a future opponent as possible (Belichick, 1962), the analyses presented here are additional methods of securing useful information. That is, these methods are supplemental and shall not replace vital methods such as video scouting. It is important to note, however, that proper use of spatiotemporal football data will also enable efficiency gains in existing processes such as player evaluation video review.

3.1. Winning and Avoiding Injuries

Within football and media organizations, significant resources are consumed through the analysis of player performance and decision-making, especially at the quarterback (QB) position. Since 2014, radio-frequency identification (RFID) tracking technology has been used to continuously monitor the on-field locations of NFL players (Zebra Technologies, 2015). Using geospatial American football data, this research quantitatively evaluates receiver openness, player elusiveness, and QB decision-making.2

In addition to enhancing win probability, using NFL injury data, we have discovered how QB decisions and passing ability impact the likelihood of receiver concussions. Specifically, Figure 14 demonstrates how better passers reduce team losses and receiver concussions (Mrkvicka and Hochstedler, 2016). By making better decisions and finding the open receiver, QBs can put their receivers and their teams in better positions to succeed.

2 This section on receiver "openness" and QB decision-making has been published in the 2016 MIT Sloan Sports Analytics Conference (Hochstedler, 2016).


[Figure omitted: bar chart comparing WR/TE concussions and team losses per year (y-axis: 0.0-10.0) across passer-rating buckets < 80, 80-90, and > 90.]

Figure 14. NFL QB Passer Rating vs. Receiver Concussions and Team Losses from 2012 - 2015

3.2. Data Collection

NFL RFID data is not publicly available for analysis at this time. In order to perform similar analysis, spatial coordinates of all twenty-two on-field players were captured from game video at three frames per second, to the nearest 0.25 yard, for every base offensive pass play of the 2014 Indianapolis Colts (231 total plays). Here, the "base offense" is defined as 1st & 2nd down, less than a 15-point differential, greater than five minutes remaining in a half, and between the 20 yard lines.

In order to evaluate decision-making, utility must be defined. The success outcomes that this analysis considers are completions and yards gained. The base offensive pass plays for the 2014 Indianapolis Colts provide a significant sample that allows for an analysis of Andrew Luck's decision-making. Because strategy changes towards the end of a half and near the end zones, these play types are removed from consideration. Additionally, this dataset focuses on 1st and 2nd down plays since the marginal team benefit of additional yardage is more direct, which doesn't hold true on 3rd and 4th downs. In short, a significant benefit is earned when an offensive team achieves more yardage than the yards to gain (i.e. when the offensive team "picks up the first down"3) on 3rd and 4th downs. Figure 15 provides a visual representation of team utility.

3 While this utility function is assumed, in truth, there are slight benefits in achieving a first down on 1st and 2nd down plays; however, they are minimal compared to picking up a first down on 3rd or 4th down.


[Figure omitted: team utility as a function of yards gained for 1st & 2nd down versus 3rd & 4th down, with a jump at the distance to gain a first down.]

Figure 15. Down and Distance Utility Function

3.3. Receiver Openness and QB Decision-Making

Player velocities significantly impact receiver routes and the defense of those routes. Because player velocity is important in addition to positional data, a "predictive" Voronoi tessellation has been developed to quantify receiver "openness." That is, because the game relies on "where a player will be," not necessarily "where he currently stands," predictive methods more accurately reflect player decision-making as it relates to geospatial analysis. Figure 16 and Figure 17 outline the distinctions between a traditional and a "predictive" tessellation.


Figure 16. A traditional Voronoi tessellation where both players are immobile and assumed to possess identical acceleration profiles. The blue cell maintains the shortest Euclidean distance to every point shaded blue, whereas the orange player maintains the shortest distance to every point shaded orange.

Figure 17. A "predictive" Voronoi tessellation where the blue player possesses a non-zero velocity toward an immobile orange player. The blue player is moving fast enough that he now "owns" the ground behind the orange player, since he can reach those points more quickly (in time) even though the orange player is currently the closest player (in distance) to the points in that cell.


To assist in illustrating this concept, the first five frames from the Colts' first play of the 2014 season are illustrated below. Taken at 3 Hz, these figures commence 1.33 seconds after the snap, which coincides with the top of Quarterback Andrew Luck's dropback.

Frame 5. With this play designed to the offense's right side of the field (Wayne on the "over" or deep crossing route), Luck looks left to hold off the free safety.

Frame 6. With two receivers maintaining small zones (i.e. "covered"), Luck remains patient, allowing the play to develop.

Frame 7. With Hilton's inside release go-route, he successfully occupies the defender responsible for the right one-third of the field.

Frame 8. With the corner staying true to his assignment, Luck observes Wayne's zone starting to open.

Frame 9. Luck makes his decision to throw to Wayne just as his zone reaches its maximum (i.e. when he becomes "wide-open"). Play result: good decision, good outcome. That is, Luck threw to a wide-open Wayne (good decision) for a reception and gain of 21 yards (good outcome).4

4 Images obtained from NFL.com and reproduced here under 17 U.S. Code 107 (Fair Use)


3.3.1. Zone Size

By analyzing all twenty-two players' instantaneous positions and velocities, the Voronoi tessellation is performed after projecting each player's position forward two frames (2/3 of a second) and finding the respective zone area (yards²) for each player. Zone areas are then established for each eligible receiver for each frame throughout the play. Figure 18 displays the zone area distribution for every eligible receiver in 2014 Colts' pass plays at the moment of QB release. To further understand the "predictive" tessellation, refer to Appendix I, which displays the evolution of the Indianapolis Colts' first offensive play from scrimmage from the 2014 season. Using the geospatial coordinates captured, this analysis was performed on each base offensive pass play from the 2014 season.
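The projection-plus-tessellation step can be sketched in Python. Rather than constructing exact Voronoi polygons, this sketch discretizes the field into a grid and assigns each grid point to the nearest projected player, which approximates the same zone areas; the positions, velocities, and grid resolution are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def predictive_zone_areas(positions, velocities, dt=2 / 3.0,
                          width=53.3, length=120.0, res=0.5):
    """Approximate 'predictive' Voronoi zone areas (yards^2).

    Each player's position (yards) is projected dt seconds forward along his
    velocity (yards/sec); every grid point is then owned by the nearest
    projected player, and a zone area is that player's cell count times the
    cell area."""
    projected = np.asarray(positions) + dt * np.asarray(velocities)
    xs = np.arange(0.0, width, res)
    ys = np.arange(0.0, length, res)
    gx, gy = np.meshgrid(xs, ys)
    points = np.stack([gx.ravel(), gy.ravel()], axis=1)
    sq_dist = ((points[:, None, :] - projected[None, :, :]) ** 2).sum(axis=-1)
    owner = sq_dist.argmin(axis=1)  # nearest projected player per grid point
    return np.bincount(owner, minlength=len(projected)) * res * res

# Two players 10 yards apart (cf. Figures 16-17): both static vs. one closing fast
positions = [[20.0, 50.0], [20.0, 60.0]]
static = predictive_zone_areas(positions, [[0.0, 0.0], [0.0, 0.0]])
moving = predictive_zone_areas(positions, [[0.0, 9.0], [0.0, 0.0]])
print(moving[0] > static[0])  # True: the moving player 'owns' more ground
```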


Figure 18. Distribution of Predicted Voronoi Zone Size for All Eligible Receivers at QB Decision Point

3.3.2. Zone Integrity

While zone size describes how much of the field a receiver "owns," a defender may still be lurking nearby to ultimately break up the intended pass. Therefore, projected zone integrities are calculated for each eligible receiver and play frame. Zone integrity is measured as the projected distance from an eligible receiver to the nearest defender.
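A minimal Python sketch of the metric follows, with both the receiver and the defenders projected forward by the same two-frame (2/3-second) horizon used for zone size; the coordinates and velocities are hypothetical.

```python
import numpy as np

def zone_integrity(receiver_xy, receiver_vel, defenders_xy, defenders_vel, dt=2 / 3.0):
    """Projected distance (yards) from the receiver to the nearest defender."""
    receiver = np.asarray(receiver_xy, float) + dt * np.asarray(receiver_vel, float)
    defenders = np.asarray(defenders_xy, float) + dt * np.asarray(defenders_vel, float)
    return float(np.min(np.linalg.norm(defenders - receiver, axis=1)))

# Receiver sprinting downfield with a trailing safety and a flat-footed corner
integrity = zone_integrity([25.0, 40.0], [0.0, 9.0],
                           [[25.0, 50.0], [30.0, 38.0]],
                           [[0.0, 3.0], [0.0, 0.0]])
print(integrity)  # 6.0 yards: the trailing safety is the nearest projected defender
```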


3.3.3. Openness Classification

To enable classification of the openness of an eligible receiver, zone size and integrity are combined to simplify the analysis and explain the spirit of what can be measured using the geospatial data. Receiver openness is classified as wide-open, open, defended, or well-defended. Specifically:

- Wide-open = zone area > 200 yards² or integrity > 8 yards
- Open = zone area between 100-200 yards² or integrity between 4-8 yards
- Defended = zone area between 50-100 yards² or integrity between 2-4 yards
- Well-defended = zone area < 50 yards² or integrity < 2 yards
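A Python sketch of the classification follows. Because the rules combine zone area and integrity with "or," a receiver can satisfy two categories at once; this sketch resolves that by taking the more open of the two individual ratings, which is an interpretive assumption rather than the thesis's stated rule.

```python
LABELS = ["wide-open", "open", "defended", "well-defended"]

def _rating(value, cutoffs):
    """Index 0 (most open) through 3, given descending cutoffs."""
    for i, cutoff in enumerate(cutoffs):
        if value > cutoff:
            return i
    return len(cutoffs)

def classify_openness(zone_area, integrity):
    """Openness from zone area (yards^2) and integrity (yards); the more open
    of the two individual ratings wins (an assumption, see above)."""
    by_area = _rating(zone_area, [200, 100, 50])
    by_integrity = _rating(integrity, [8, 4, 2])
    return LABELS[min(by_area, by_integrity)]

print(classify_openness(250, 3.0))   # wide-open  (zone area > 200)
print(classify_openness(80, 5.0))    # open       (integrity between 4-8)
print(classify_openness(40, 1.5))    # well-defended
```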

Table 8 displays Andrew Luck's 2014 completion percentage for each targeted receiver's openness. Note: four plays were not included in the data set, as they did not have a clearly defined targeted receiver.

Openness of Targeted Receiver   Plays   Completions   Completion %
Wide-open                         69        49            71%
Open                              70        49            70%
Defended                          67        39            58%
Well-defended                     21        11            52%
N/A                                4         0             0%
Total                            231       148            64%

Table 8. Andrew Luck's 2014 Completion Percentage as a Function of Receiver Openness

As observed, Luck completed a higher percentage of passes to open and wide-open receivers than those who were defended or well-defended.


3.3.4. Expected Gain

For each frame throughout the play, expected yardage is also derived by observing the maximum y-value of that receiver's predicted zone. In a theoretical "pure play," where an ideal throw (on-time and on-target) is matched with an ideal catch and a defender making an ideal tackle (including reaction and pursuit angle), a receiver's maximum gain would occur at the point in his zone which is furthest downfield.

Figure 19 displays Luck's decision point from the Colts first offensive play from scrimmage from the 2014 season. Hilton possesses a smaller zone (red) while Wayne maintains a large zone (blue). If Luck were to make an ideal pass at this moment (with an ideal catch and a pure tackle), Wayne should be expected to obtain 26 yards on the play (northern-most point in his zone). On this particular play Luck slightly underthrows Wayne, causing Wayne to flatten out his route and carrying him out of bounds with a 21-yard gain.


Figure 19. First 2014 Colts Play from Scrimmage. Receiver Wayne Classified "Wide Open" (Blue Zone)5

5 Image obtained from NFL.com and reproduced here under 17 U.S. Code 107 (Fair Use)


Figure 20 compares the expected gain for every completed pass (148 in total) against the play's actual gain. Points above the regression line identify plays where the actual gain exceeded the expected gain.

[Figure omitted: scatter of actual gain versus expected gain (yards) at the moment of QB release, with regression line.]

Figure 20. Expected Play Gain on 148 Completions

3.3.5. Player Elusiveness

While each completion possesses an expected gain in yardage (whether positive or negative), the actions of the QB, receiver, and defenders ultimately define the actual yardage achieved. For example, a poorly executed pass that forces a receiver to adjust for the catch will likely reduce the actual yardage gained. Alternatively, several broken tackles after the catch will likely lead to a higher than expected actual gain. For each play, the difference between actual and expected yardage is measured for each individual receiver. In sum, the additional yardage gained beyond the expected yardage defines a player's "elusiveness." That is, the more yards a player gains than expected, the more elusive he is. Table 9 presents the player elusiveness for each position group on the 148 completions. Notice how running backs (RB) are more elusive than tight ends (TE) and wide receivers (WR).

Position         Additional Gain from Expectation (Yards)   Catches   Mean
All Colts RB's                   120                           29      4.14
All Colts TE's                    52                           34      1.53
All Colts WR's                    88                           85      1.04

Table 9. Receiver Elusiveness for 148 Completions from 2014 Colts Base Offense

3.3.6. QB Decision Analysis

Although plays are typically designed to have primary, secondary, and tertiary targets, the decision of which receiver to target ultimately comes down to the QB. A QB can check down his options and decide whom to target with his pass. This analysis attempts to model how those decisions are made based on geospatial elements created through zone size and integrity of eligible receivers. Here, the zone size and integrity are combined to quantify a receiver's openness along with his expected gain.

Using these factors, Andrew Luck's decision-making can be analyzed. For a given play, receiver options are skill players who do not block. Considering every eligible receiver at each frame (taken every 1/3 of a second during play development) prior to the pass release frame (final QB decision point), each receiver frame is assigned an openness (wide-open, open, defended, and well-defended) and an expected yardage of gain. These two factors are then combined to produce an expected utility if that option was chosen at that play frame.

The expected payoff (EP) is calculated as:

EP = P(C|O) × E

where:

P(C|O) = probability of pass completion given current state of receiver openness
E = expected gain (yards)
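Using the completion probabilities from Table 8, the expected payoff computation reduces to a one-line Python function. The 26-yard example mirrors the Wayne play discussed earlier, while the conditional probabilities are the empirical completion rates from Table 8.

```python
# Empirical P(C|O) values taken from Table 8
P_COMPLETION = {"wide-open": 0.71, "open": 0.70,
                "defended": 0.58, "well-defended": 0.52}

def expected_payoff(openness, expected_gain):
    """EP = P(C|O) x E, in expected yards."""
    return P_COMPLETION[openness] * expected_gain

print(expected_payoff("wide-open", 26.0))  # 0.71 * 26 = 18.46 expected yards
print(expected_payoff("defended", 26.0))   # 0.58 * 26 = 15.08: same gain, worse option
```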


The expected payoff yardage is then captured for all options prior to the target selection, and the target's payoff is compared to the play's population of options to determine the percentile of the target receiver's expected payoff.

Decision types as a percentile of optional targets for a given play:

- A percentile of 80 or above was classified as an ideal target decision.
- A percentile of 50-80 was classified as a preferred target decision.
- A percentile of 20-50 was classified as a neutral target decision.
- A percentile below 20 was classified as an undesirable target decision.
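The percentile mapping can be sketched in Python. The percentile convention (share of options at or below the target's expected payoff) and the inclusive cutoffs are assumptions, and the payoff values are hypothetical.

```python
def classify_decision(target_ep, option_eps):
    """Map the targeted receiver's expected payoff to a decision type by its
    percentile among all receiver-frame options on the play."""
    percentile = 100.0 * sum(ep <= target_ep for ep in option_eps) / len(option_eps)
    if percentile >= 80:
        return "ideal"
    if percentile >= 50:
        return "preferred"
    if percentile >= 20:
        return "neutral"
    return "undesirable"

# Nine hypothetical receiver-frame options (expected yards) on one play
options = [1.1, 2.0, 2.4, 3.5, 4.8, 5.0, 6.1, 7.2, 9.0]
print(classify_decision(7.2, options))  # ideal: 8 of 9 options sit at or below 7.2
print(classify_decision(1.1, options))  # undesirable: only 1 of 9 at or below
```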

In order to isolate QB decision-making, plays in which the intended receiver is unidentifiable (e.g. the QB is hit as he throws; four plays in total) are removed from this analysis. Table 10 displays the results of the decision analysis. Note how QB Andrew Luck made an ideal or preferred decision on more than 75% of the pass plays analyzed.

Decision Type   Plays   Percentage of Total Plays
Ideal             92       40.5%  } 75.8%
Preferred         80       35.2%
Neutral           45       19.8%  } 24.2%
Undesirable       10        4.4%
N/A                4
Total            231

Table 10. QB (A. Luck) Decision Analysis Results

Player tracking data that possess time and location characteristics can be leveraged for quantitative, unbiased analyses of on-field decisions; analyses such as the one presented here can undoubtedly impact decisions made off the field by organizational leaders.


3.4. Play Identification Using Supervised Learning

Machine-learning classifiers are used to identify formations, routes, and blocking strategies for each of a team's eleven offensive players. Key features are distilled from the analysis of player positioning, proximities, and trajectories. All machine learning algorithms have been trained on a synthetic data set provided by EA Sports and tested against the collected data for the 2014 Indianapolis Colts.

3.4.1. Formation Classification

Individual player starting locations are defined by proximity to teammates, opponents, various fixed field points (e.g. hashes, boundaries, and numbers), as well as floating points (e.g. line of scrimmage). Logic has been established to classify player alignment as one of ten distinct positions. These position classifications and additional data are used as features in a supervised machine-learning algorithm to categorize the overall offensive formation.
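A few of the proximity features described above can be sketched in Python. The field-width constant is a standard NFL dimension, while the function name and reduced feature set are illustrative rather than the thesis's full alignment logic.

```python
import math

FIELD_WIDTH = 53.3  # yards, sideline to sideline

def alignment_features(player_xy, ball_xy, teammates_xy):
    """Starting-alignment features for one player: distance to the line of
    scrimmage (through the ball), the nearest sideline, and the nearest
    teammate. x runs across the field, y runs downfield."""
    x, y = player_xy
    return {
        "dist_los": abs(y - ball_xy[1]),
        "dist_sideline": min(x, FIELD_WIDTH - x),
        "dist_teammate": min(math.hypot(tx - x, ty - y) for tx, ty in teammates_xy),
    }

# Hypothetical receiver split wide left, on the line of scrimmage
features = alignment_features((8.0, 30.0), (26.65, 30.0),
                              [(20.0, 30.0), (26.0, 25.0)])
print(features)  # {'dist_los': 0.0, 'dist_sideline': 8.0, 'dist_teammate': 12.0}
```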

3.4.2. Action Classification

Upon the start of play execution (the snap), player trajectories are analyzed for every position on the field in order to classify the player action. Described here, movements by players at the wide receiver (WR) position are classified.

Basic receiver routes are comprised of three essential components: a stem, a pivot, and a branch. The stem is the initial segment from the starting point (snap, t = 0) to the pivot (re-direction) point. The branch is the final component of the route, which the WR runs. Using a segmented Euclidean regression, features are distilled from each WR trajectory. That is, in order to describe the route with specific features, it must first be estimated in mathematical terms. Figure 21 displays a basic segmented Euclidean regression of two separate post routes (one from the left, one from the right) by Colts receiver T.Y. Hilton.
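A two-segment fit can be sketched in Python: try each interior frame as the pivot, score the split by total point-to-segment distance (the precision error), and derive the stem/branch features from the winning split. This brute-force search is an illustrative stand-in for the thesis's regression procedure, and the route coordinates are synthetic.

```python
import math

def _point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    t = 0.0 if length_sq == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def fit_route(points):
    """Two-segment route fit. Tries every interior frame as the pivot and keeps
    the split with the lowest precision error; returns stem/branch headings
    (degrees), frame counts, distances (yards), and the precision error."""
    def error(k):
        stem = sum(_point_to_segment(p, points[0], points[k]) for p in points[:k + 1])
        branch = sum(_point_to_segment(p, points[k], points[-1]) for p in points[k:])
        return stem + branch
    pivot = min(range(1, len(points) - 1), key=error)
    def segment(a, b, frames):
        return {"heading": math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])),
                "frames": frames,
                "distance": math.hypot(b[0] - a[0], b[1] - a[1])}
    return {"stem": segment(points[0], points[pivot], pivot),
            "branch": segment(points[pivot], points[-1], len(points) - 1 - pivot),
            "precision_error": error(pivot)}

# Synthetic post-like route: straight upfield, then a 45-degree break inside
route = [(0, 0), (0, 3), (0, 6), (0, 9), (0, 12), (-2, 14), (-4, 16), (-6, 18)]
fit = fit_route(route)
print(fit["stem"]["heading"], fit["branch"]["heading"])  # approx 90.0 and 135.0
```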


[Figure omitted: two panels plotting observed route points and fitted segments (distances in yards) for two #8 (post) routes by player #13; play ids 2014110300_0334 and 2014100900_3341.]

Figure 21. Segmented Euclidean Regression of Two Distinct #8 (Post) Routes

Each route is explicitly classified by its stem angle (heading), stem length, branch angle, and branch length. Euclidean "precision" error is defined as:

PE = Σ_{i=1..n} sqrt( (x2_i - x1_i)² + (y2_i - y1_i)² )

where:
(x1, y1) = observed coordinate (represented by a black dot)
(x2, y2) = estimated coordinate (represented by the nearest point on the red line)

Associated play features for each example route are outlined below.

play id = 2014100900_3341          play id = 2014110300_0334
stem_heading = 94.4                stem_heading = 91.9
stem_frames = 9                    stem_frames = 9
stem_distance = 20.59              stem_distance = 22.99
branch_heading = 59.06             branch_heading = 119.05
branch_frames = 4                  branch_frames = 3
branch_distance = 12.79            branch_distance = 9.75
precision_error = 4.54             precision_error = 2.11


Routes with greater complexity can be classified using additional techniques. For example, these methods can be expanded to a three-segment Euclidean regression. Figure 22 displays drag routes that would be difficult to classify using a simple two-segment regression. That is, because these routes possess a high precision error, an additional spline segment is needed.


Figure 22. Three Segment Euclidean Regression of Multiple Drag Routes

As every route run is classified using these methods, Figure 23 displays a full play with four receiver routes. Note: dashed black lines depict the observed route paths, while red lines display the fitted routes.


[Figure omitted: full play diagram (play id 20141228110404) with observed and fitted receiver routes.]

Figure 23. Full Play Displaying Actual and Estimated Receiver Routes

These features are then imported into a supervised learning algorithm to identify route type. Specifically, several classifiers (Random forest, SVM, K-NN, and Naive Bayes) were trained using synthetic data created by the EA Sports Madden engine and tested against the actual Colts 2014 data. Figure 24 displays the confusion matrix of the most accurate model.
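As an illustration of the supervised step, a minimal k-nearest-neighbors classifier (one of the four model families named above) can be written directly over the four route features; the training rows are toy values in the spirit of the Madden-derived set, not actual data.

```python
import numpy as np

def knn_route(train_features, train_labels, query, k=3):
    """Predict a route label as the majority vote of the k nearest training
    routes in feature space: [stem heading, stem distance, branch heading,
    branch distance]."""
    X = np.asarray(train_features, dtype=float)
    distances = np.linalg.norm(X - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(np.asarray(train_labels)[nearest], return_counts=True)
    return str(labels[counts.argmax()])

# Toy training routes: [stem_heading, stem_distance, branch_heading, branch_distance]
X = [[90, 20, 120, 10], [88, 19, 110, 9],   # post: break toward the middle
     [90, 28, 90, 8],   [88, 30, 91, 6],    # go: continue upfield
     [90, 10, 0, 6],    [91, 12, 180, 5]]   # out: break toward the boundary
y = ["post", "post", "go", "go", "out", "out"]

print(knn_route(X, y, [91, 21, 115, 11]))  # post
```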


[Figure omitted: normalized confusion matrix (accuracy = 68.0%) over true versus predicted route labels arrow, slant, out, dig, corner, post, and go; color scale 0.0-1.0.]

Figure 24. Confusion Matrix of Receiver Route Prediction Model

While a 68.0% accurate classifier leaves room for improvement, one important observation emerges from the confusion matrix: while the classifier is deemed "somewhat accurate" (68.0%) in its predictions, when it misses, it misses to the most similar route type. For example, the classifier predicts an "out" route as one of three types: arrow, out, or corner, which are all similar in that they are receiver routes with a heading toward the near boundary (away from the QB). Refer to Figure 25 for this distribution. Furthermore, notice the significant prediction error between #8 (post) and #9 (go) routes. As many deep concepts possess an option between the #8 and #9 routes (depending on the defensive alignment), this prediction error is not as critical. For example, against a "MOFO" (middle of field open) coverage type such as Cover 0 or Cover 2, a receiver may option his route to the middle of the field as a post, as opposed to continuing on his go route. (Brown, 2015)


Figure 25. Prediction Distribution of True #3 (Out) Routes

Admittedly, there are several flaws with this approach: first, the training was performed using a synthetic data set; second, this model used the simplified two-segment Euclidean regression classification method; finally, the training and testing sample sizes are relatively small, thus increasing the margin for error. However, the root methodology provides much promise on the road to machine classification of play types.

3.4.3. Play Concept Classification and Similarity

Features for each play are distilled from the spatiotemporal data. For example, formation, route types, QB drop-back depth, and blocking schemes are identified and quantitatively characterized. Utilizing a Kohonen network (a technique similar to that found in Sec. 2.2 of this thesis), the features were collected and inserted into the neural network to group plays by similarity. Figure 26 displays the clustering count results of the 2014 Indianapolis Colts base offense.
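A compact Kohonen (self-organizing map) sketch in Python shows the mechanics: grid nodes hold weight vectors, and the best-matching node and its neighbors are pulled toward each play's feature vector as the learning rate and neighborhood shrink. The grid size, decay schedule, and the two toy play features are illustrative assumptions; the thesis does not specify its network configuration.

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small self-organizing map; returns node weights and grid coords."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    rows, cols = grid
    weights = rng.random((rows * cols, data.shape[1]))
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr, sigma = lr0 * decay, sigma0 * decay + 1e-3
        for x in data[rng.permutation(len(data))]:
            bmu = int(np.argmin(((weights - x) ** 2).sum(axis=1)))  # best-matching unit
            grid_d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            pull = np.exp(-grid_d2 / (2.0 * sigma ** 2))            # neighborhood weight
            weights += lr * pull[:, None] * (x - weights)
    return weights, coords

def map_to_nodes(features, weights):
    """Best-matching node index for each play's feature vector."""
    f = np.asarray(features, dtype=float)
    return ((f[:, None, :] - weights[None, :, :]) ** 2).sum(axis=-1).argmin(axis=1)

# Toy play features, e.g. [normalized QB drop depth, normalized mean route depth]
plays = np.array([[0.10, 0.10], [0.15, 0.12], [0.90, 0.85], [0.88, 0.90]])
weights, coords = train_som(plays)
nodes = map_to_nodes(plays, weights)
# Similar plays land on the same or nearby nodes; dissimilar plays land far apart
```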



Figure 26. Kohonen Map of 231 Offensive Plays

Through the development process of the Kohonen network, the author chose to cluster play types into seven distinct groups (defined by the seven colors). This is accomplished by minimizing the Euclidean distances between all nodes (individual circles). Figure 27 displays the grouped plays with a short description of their play type.

[Figure content: cluster tags include short/intermediate routes w/ run action; screens; intermediate & deep route concepts; spread, short routes; multiple routes, deep; and max protect, deep routes.]

Figure 27. Kohonen Map Displaying Seven (qty. 7) Clusters & Associated Tags


After clustering, play outcomes within each cluster are analyzed to understand successful versus non-effective play types. From Table 11, it is evident that certain play types are more effective than others. For example, max protection deep (i.e. green) play concepts yield the highest yards per completion with a lower than average completion percentage whereas screens (black) possess high completion percentages with the lowest expected yardage. While these two examples depict the extremes (short vs. deep concepts), this analysis will most benefit the intermediate play types. That is, blue-type plays should be further investigated via video to understand what leads to their success.

Cluster Type            Comp.   Plays   Yards   Yds/Play   Comp. %   Yds/Comp   INT
Max Protect, Deep         23      45      494     11.0       51.1      21.5      1
Intermediate              24      34      236      6.9       70.6       9.8      0
Multiple Routes, Deep     15      21      263     12.5       71.4      17.5      1
Short/Intermediate        32      52      391      7.5       61.5      12.2      0
Intermediate/Deep         22      36      254      7.1       61.1      11.5      3
Short/Intermediate        12      20      180      9.0       60.0      15.0      2
Screens                   20      23       93      4.0       87.0       4.7      0
All                      148     231     1911      8.3       64.1      12.9      7

Table 11. Outcome Analysis of Seven Play Types

3.5. Incorporating Analyses Into NFL Decision-Making Processes

As technology in the NFL continues to evolve, the author recommends a similar central system to convey the output of these technically deep analyses. Significant care must be taken to ensure that A) appropriate analyses are conducted and B) the results of each analysis are conveyed clearly and efficiently to team decision-makers. Just as in an MLB organization, NFL coaches and front office personnel operate with very distinct responsibilities.


3.5.1. NFL Organizational Structure

Similar to the MLB organizational structures, NFL teams possess scouting and operations verticals beneath the General Manager. However, as of the date of this research, only 14 of the NFL's 32 teams possess an "analytics" department, with many consisting of a single employee. Thus, with limited technical (mathematical, software) expertise, appropriate verification of any quantitative analysis is critical. While many other titles exist, such as Vice President, Scout, and Coordinator, the structure presented in Figure 28 provides a simplified structure with which to integrate machine learning and other geospatial analyses into the organization's decision-making process.

Owner / Chairman
  General Manager
    Director, Pro Personnel
    Director, College Scouting
    Director, Research & Development
      Systems
      Quantitative Analysis
    Director, Football Operations

Figure 28. Simplified NFL Team Football Operations Organizational Structure

Members within the research department must maintain the analytical skills to conduct the required quantitative analysis while also possessing the soft skills necessary to effectively communicate to broad football audiences. While specific knowledge of minute details is not necessary, a solid football foundation shall include knowledge of play concepts, coverage types, blocking schemes, and more. If both hard and soft skills exist within the R&D department, the organization will maximize value and gain significant leverage from spatiotemporal data and other important information sets.


Chapter 4. Conclusion

In summary, this research has provided several examples of machine learning analyses of spatiotemporal data in Major League Baseball and the National Football League. Although many MLB organizations are a decade deep into using analytical and quantitative methods to improve decision-making, NFL teams lag behind. While all teams should look to maximize value from data, this shall be carried out responsibly. Specifically, primary takeaways shall be distilled from any research conducted and presented to organizational decision makers in an efficient and straightforward manner. If conducted and communicated properly, NFL organizations will begin to incorporate more analytical decision-making processes, just as their older MLB brethren did more than a decade ago.


Chapter 5. References

Belichick, Steve. 1962. Football Scouting Methods. Ronald Sports Library.

Brown, Chris B. 2015. The Art of Smart Football. Chris B. Brown.

Hochstedler, Jeremy. "Performance vs. Production." DiamondChartsLLC.com. 24 April 2013. Web. Accessed 1 March 2016.

Hochstedler, Jeremy, and Kellen Hurst. "Telemetry Sports: Bringing Power to Data." TechCrunch & NFL's 1st and Future Startup Competition. 6 Feb 2016. Stanford University.

Hochstedler, Jeremy. 2016. "Finding the Open Receiver: A Quantitative Geospatial Analysis of Quarterback Decision-Making." MIT Sloan Sports Analytics Conference.

Johnson, Derek. 2013. The Complete Guide to Pitching. Human Kinetics.

Lindbergh, Ben, and Rob Arthur. "Statheads Are The Best Free Agent Bargains In Baseball." FiveThirtyEight.com. 26 April 2016. Web. Accessed 27 April 2016.

Mayne, Brent. 2008. The Art of Catching. Cleanline Books.

Mrkvicka, Neil, and J. Hochstedler. 2016. A subset of research first published in "Finding the Open Receiver: A Quantitative Geospatial Analysis of Quarterback Decision-Making." MIT Sloan Sports Analytics Conference.

Paine, Neil, and Rob Arthur. "Is 2015 The Year Baseball's Projections Failed?" FiveThirtyEight.com. 7 August 2015. Web. Accessed 1 March 2016.

Passan, Jeff. 2016. The Arm: Inside the Billion-Dollar Mystery of the Most Valuable Commodity in Sports. Harper-Collins Publishers.

Polian, Bill, and Vic Carucci. 2014. The Game Plan. Triumph Books.

Python Software Foundation. Python Language Reference, version 2.7. Available at http://www.python.org. Packages include: sklearn, matplotlib, numpy, scipy, & pymongo.

Swartz, Matt. "Testing Projections for 2011." Fangraphs.com. 9 February 2012. Web. Accessed 1 March 2016.

Tango, Tom. "Testing the 2007-2010 Forecasting Systems - Official Results." InsideTheBook.com. 15 February 2011. Web. Accessed 1 March 2016.


Wyers, Colin. "Reintroducing PECOTA." BaseballProspectus.com. 7 February 2011. Web. Accessed 1 March 2016.

Zebra Technologies. "Partners in Innovation." Zebra.com. Web. Accessed December 2015.


Chapter 6. Acknowledgements

This research is derived from the instruction and inspiration of many, but first I commend those who made portions of this research possible. Thank you to MIT, TechCrunch, the NFL, Stanford University, EA Sports, Katy Mrkvicka, Neil Mrkvicka, Troy Mrkvicka, and Craig Mrkvicka. Additionally, I would like to give special recognition to those closest to me for their guidance and assistance to date.

To those who have instructed: thank you.

* My first coaches, Dad and Jerry Hampton, my high school coach Dennis Kas, and my college coaches Sean Bendel and Jeff Jenkins taught me to compete.
* From Steve Sotir and many college coaches at the ABCA, I learned how to teach.
* Rick Espeset and Dan Sprunger welcomed me into - and taught me the meaning of - a program.
* Through many late-night discussions with colleagues Paul Gagnon, Zach Cardwell, Caleb Eiler, and Tyler Foxworthy, I was able to expand my technical abilities.
* To all the former players I have coached: I assure you I learned more from you than you did from me.
* Various front office personnel in both MLB and the NFL have guided portions of my development as an analyst.

Now, a hat tip to those who have inspired.

* To my college roommates and teammates Alex Decker, Kellen Hurst, Adam Knaack, Jimmy Murray, Matt Salisbury, and others: without you, my passion for sports wouldn't exist.
* The early believers in our ability to produce quality data analyses in sports, specifically Sean Bendel, Rob Smith, Adam Revellete, and Stu Fritz, provided the optimism to continue on this path.
* For those few doubters (whether by actions or words), your fuel played a small part in keeping this storm rolling.
* To my parents, sister, Sam, and others whose actions I observe: I cherish your teachings, both spoken and non-verbal.
* Finally, to my wife Amy and Grady, Tyler, and Molly: thank you, for everything; I love you.
