Outcome Prediction and Hierarchical Models in Real-Time Strategy Games

by Adrian Marius Stanescu

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Department of Computing Science
University of Alberta

© Adrian Marius Stanescu, 2019

Abstract

For many years, traditional board games such as Chess, Checkers, or Go have been the standard environments for testing new Artificial Intelligence (AI) algorithms aimed at building robust game-playing agents capable of defeating the best human players. More recently, the focus has shifted towards games that offer even larger action and state spaces, such as Atari and other video games. With a unique combination of strategic thinking and fine-grained tactical combat management, Real-Time Strategy (RTS) games have emerged as one of the most popular and challenging research environments. Beyond state space complexity, RTS properties such as simultaneous actions, partial observability, and real-time computing constraints make them an excellent testbed for decision-making algorithms under dynamic conditions.

This thesis makes contributions towards achieving human-level AI in these complex games. Specifically, we focus on learning, using abstractions, and performing adversarial search in real-time domains with extremely large action and state spaces, for which forward models might not be available.

We present two abstract models for combat outcome prediction that are accurate while remaining computationally inexpensive. These models can inform high-level strategic decisions, such as when to force or avoid fighting, or serve as evaluation functions for look-ahead search algorithms. In both cases we obtained stronger results than the then state-of-the-art heuristics.

We introduce two approaches to designing adversarial look-ahead search algorithms that rely on abstractions to reduce search complexity. First, Hierarchical Adversarial Search uses multiple search layers that work at different abstraction levels to decompose the original problem. Second, Puppet Search methods use configurable scripts as an action abstraction mechanism and offer more design flexibility and control. Both methods perform on par with top scripted and state-of-the-art search-based agents on small maps, while outperforming them on larger ones. We show how to use Convolutional Neural Networks (CNNs) to improve spatial awareness and evaluate game outcomes more accurately than our previous combat models. When incorporated into adversarial look-ahead search algorithms, this evaluation function increased their playing strength considerably.

In these complex domains forward models might be very slow or even unavailable, which makes search methods more difficult to use. We show how policy networks can be used to mimic our Puppet Search algorithm and to bypass the need for a forward model during gameplay. We combine the resulting, much faster method with other search-based tactical algorithms to produce RTS game-playing agents that are stronger than state-of-the-art algorithms. We then describe how to eliminate the need for simulators or forward models entirely by using Reinforcement Learning (RL) to learn autonomous, self-improving behaviors. The resulting agents defeated the built-in AI convincingly and showed complex cooperative behaviors in small-scale scenarios of a fully fledged RTS game.

Finally, learning becomes more difficult when controlling increasing numbers of agents. We introduce a new approach that uses CNNs to produce a spatial decomposition mechanism and makes credit assignment from a single team reward signal more tractable. Applied to a standard Q-learning method, this approach resulted in increased performance over the original algorithm in both small and large scale scenarios.
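To make the last idea concrete, the sketch below shows, in generic PyTorch, what such a spatial decomposition readout could look like: a CNN encodes the map once, and each unit's Q-values are read out at that unit's own map location. This is a minimal illustrative sketch only, not the thesis's actual architecture; the class name SpatialQNet, the action set, and all layer sizes are invented for the example.

```python
# Minimal, illustrative sketch of a spatial Q-value decomposition.
# A CNN encodes the game map; per-unit Q-values are read out at each
# unit's map cell, so a single team reward can be attributed more
# locally. All names and sizes are assumptions for illustration.
import torch
import torch.nn as nn

N_ACTIONS = 5  # e.g., move N/S/E/W + attack (assumed action set)

class SpatialQNet(nn.Module):
    def __init__(self, in_channels: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # One Q-value per action at every map cell.
        self.q_head = nn.Conv2d(32, N_ACTIONS, kernel_size=1)

    def forward(self, state, unit_positions):
        # state: (B, C, H, W) map feature planes
        # unit_positions: list of (y, x) cells occupied by our units
        q_map = self.q_head(self.encoder(state))  # (B, N_ACTIONS, H, W)
        # Spatial decomposition: each unit reads Q-values at its own cell.
        return [q_map[:, :, y, x] for (y, x) in unit_positions]

net = SpatialQNet()
state = torch.randn(1, 8, 16, 16)  # one 16x16 map with 8 feature planes
per_unit_q = net(state, unit_positions=[(2, 3), (10, 12)])
greedy_actions = [q.argmax(dim=1) for q in per_unit_q]
```

The design point the sketch tries to capture is that one shared convolutional encoder serves all units, and localizing each unit's value estimate at its own map position is what makes attributing a single team reward to individual units more tractable.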
Preface

This article-based dissertation contains previously published material. The author of this dissertation is the first author of these papers, unless otherwise stated.

Chapter 3 is joint work with Michal Čertický. It was previously published [207] in the IEEE Transactions on Computational Intelligence and AI in Games (TCIAIG), 2016.

Chapter 4 is joint work with Sergio Poo Hernandez, Graham Erickson, Russell Greiner and Michael Buro. It was previously published [208] at the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2013.

Chapter 5 is joint work with Nicolas A. Barriga and Michael Buro. It was previously published [204] at the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2015.

Chapter 6 is joint work with Nicolas A. Barriga and Michael Buro. It was previously published [202] at the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2014.

Chapter 7 is joint work with Nicolas A. Barriga, Andy Hess and Michael Buro. It was previously published [206] at the IEEE Conference on Computational Intelligence and Games (CIG), 2016.

Chapter 8 summarizes work led by Nicolas A. Barriga, in which I, Adrian Marius Stanescu, participated as second author. Michael Buro supervised the projects. The chapter describes and contains parts of work previously published [11, 14, 13] at the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2015 and 2017, and in the IEEE Transactions on Computational Intelligence and AI in Games (TCIAIG), 2017.

Chapter 9 describes work that was done by me, Adrian Marius Stanescu, during an internship at Creative Assembly, UK. Guidance with respect to the game engine, the existing codebase, and overall project management via weekly planning meetings was provided by Andre Arsenault, Scott Pitkethly and Tim Gosling of Creative Assembly. Advice on machine learning design was offered by Michael Buro.

Chapter 10 describes work done by Adrian Marius Stanescu and supervised by Michael Buro. It was accepted for publication at the Workshop on Artificial Intelligence for Strategy Games, part of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2018.

In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of the University of Alberta's products or services. Internal or personal use of this material is permitted. If interested in reprinting/republishing IEEE copyrighted material for advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to http://www.ieee.org/publications_standards/publications/rights/rights_link.html to learn how to obtain a License from RightsLink.

Acknowledgements

I would first like to thank my supervisor Michael Buro for his excellent guidance and support throughout my doctoral studies, and for his help with less academic ventures such as becoming a prize-winning Skat player and a passable skier.
I would also like to thank my committee members Vadim Bulitko, Jonathan Schaeffer, Levi Lelis and James Wright for their comments and feedback that helped improve my work. Thanks to Dave Churchill for paving the way and helping me get started. Special thanks to Nicolas Barriga for all the coffee and moral support shared before deadlines, as well as the pizza, movies and scotch afterwards. Thanks to my roommates Sergio, Mona and Graham for letting me win at board games and for sharing pancakes and countless other delicious meals. Thanks to Aristotle for offering invaluable assistance in staying positive (while overspending on biking items) and to Stuart for helping me stay fit.

I would especially like to thank my family. My wife, Raida, has been extremely supportive throughout this entire adventure and has made countless sacrifices to help me get to this point. My mom deserves many thanks for her infinite patience and implicit faith in my capabilities.

Table of Contents

1 Introduction
  1.1 Real-Time Strategy Games
    1.1.1 Multiplayer Online Battle Arena (MOBA) Games
    1.1.2 eSports and RTS Game AI
  1.2 Challenges
  1.3 Contributions
  1.4 Publications

2 Literature Survey
  2.1 Reactive Control
  2.2 Tactics
  2.3 Strategy
  2.4 Holistic Approaches
  2.5 Deep Reinforcement Learning in RTS Games
    2.5.1 Reinforcement Learning Approaches
    2.5.2 Overview of Deep RL Methods Used in RTS Games

3 Predicting Opponent's Production in Real-Time Strategy Games with Answer Set Programming
  3.1 Introduction
  3.2 Answer Set Programming
  3.3 Predicting Unit Production
    3.3.1 Generating Unit Combinations
    3.3.2 Removing Invalid Combinations
    3.3.3 Choosing the Most Probable Combination
  3.4 Experiments and Results
    3.4.1 Experiment Design
    3.4.2 Evaluation Metrics
    3.4.3 Results
    3.4.4 Running Times
  3.5 Conclusion
  3.6 Contributions Breakdown and Updates Since Publication

4 Predicting Army Combat Outcomes in StarCraft
  4.1 Introduction
    4.1.1 Purpose
    4.1.2 Motivation
    4.1.3 Objectives
  4.2 Background
    4.2.1 Bayesian Networks
    4.2.2 Rating Systems