A Reinforcement Learning Based Approach to Play Calling in Football

Preston Biro and Stephen G. Walker

March 15, 2021

Abstract

With the vast amount of data collected on football and the growth of computing abilities, many games involving decision choices can be optimized. The underlying rule is the maximization of an expected utility of outcomes, together with the law of large numbers. The available data allow us to compute the probabilities of the outcomes of decisions with high accuracy, and the well-defined points system in the game provides the necessary terminal utilities. With some well-established theory we can then optimize choices at the level of a single play.

Keywords: Football; Reinforcement Learning; Markov Decision Process; Expected Points; Optimal Decisions.

Corresponding Author: Preston Biro, University of Texas at Austin, Department of Statistics and Data Sciences, Austin, TX 78712-1823, USA, e-mail:
[email protected]

Stephen G. Walker, University of Texas at Austin, Department of Mathematics, Austin, TX 78712-1823, USA, e-mail:
[email protected]

arXiv:2103.06939v1 [cs.LG] 11 Mar 2021

1 Introduction

With the advances in computer power and the ability to both acquire and store huge quantities of data comes a corresponding advance of the machine (that is, the algorithm) in replacing the human as a primary source of decision making. The number of successful applications is increasing at a rapid pace: from games such as chess and Go, through medical imaging and the diagnosis of tumours, to automated driving and even the selection of candidates for jobs. Reinforcement learning is one key principle, whereby a game or set of decisions is studied and the rewards are recorded so that a machine can learn the long-term benefits of local decisions, often while negotiating a sequence of complex decisions.