Automated Playtesting of Platformer Games Using Reinforcement Learning

A thesis presented to the academic faculty in partial fulfillment of the requirements for the degree of Master of Science in Game Science and Design in the College of Arts, Media and Design

by Varun Sriram

Thesis Committee
Thesis Advisor: Dr. Giovanni Troiano
Committee Members: Dr. Christoffer Holmgård, Dr. Casper Harteveld, Dr. Magy Seif El-Nasr

Northeastern University
Boston, Massachusetts
December 2019

Abstract

Platformer games are popular in the video game industry, and their design requires considerable effort from game companies. As part of the design process, playtesting is key to improving the gameplay before a game is released. Playtesting is the quality assurance phase of the game development cycle, in which people are hired to play the game, report bugs, and provide feedback on its playability. This feedback can be used for game balancing (the process of tuning game rules so that they are not ineffective and do not produce undesirable results). However, playtesting can be expensive if done manually and may require several iterations, resulting in high budget and time requirements for game companies. In this thesis, we investigate a way to automatically playtest 2D platformer levels (automated playtesting, APT) using a combination of deep reinforcement learning and curriculum learning, for both quality assurance and game balancing. Deep reinforcement learning has achieved strong results in playing games (Atari and Dota 2), and in this thesis we try to carry those results over to playtesting. Curriculum learning has also shown promising results, so we explore it as part of our approach. We develop our APT tool by training an artificial intelligence (AI) agent on several different platformer levels following a curriculum, and use the trained agent to playtest newly created levels. Our APT tool is able to identify areas of a level that need design improvements and further gameplay balancing.
We contribute a reliable APT tool for designers who wish to easily design 2D platformer games, and a discussion of how our results extend to APT at large.

Keywords: 2D Platformer Games, Quality Assurance, Automated Playtesting, Deep Reinforcement Learning, Curriculum Learning

Table of Contents

List of Tables
List of Figures
List of Equations
Acknowledgements
1. Introduction
2. Background
3. Methodology
4. Results
5. Discussion
6. Conclusion
References
Appendix

List of Tables

Table 1. Hyperparameters for Neural Network
Table 2. Reward Structure for the agent
Table 3. Damage taken by agent
Table 4. Agent Death

List of Figures

Figure 1. Huge map system in Ori and the Blind Forest
Figure 2. Turrets and Spikes Pits in Ori and the Blind Forest
Figure 3. Markov Reward Process
Figure 4. A Neuron
Figure 5. A Neural Network
Figure 6. Raycasting
Figure 7. Level with spike pits
Figure 8. Level with enemy AI
Figure 9. Level with both spike pits and enemy
Figure 10. Unseen level for testing
Figure 11. Unseen level with player heatmaps
Figure 12. Heatmaps for random agent on unseen level
Figure 13. Coins in Sonic the Hedgehog
Figure 14. “Spirit light” is the equivalent of coins and acts as a currency to unlock new abilities
Figure 15. Special moves in Ori and the Blind Forest
Figure 16. Ability tree in Ori and the Blind Forest

List of Equations

Equation 1. Equation for Markov Reward Process
Equation 2. Bellman Equation

ACKNOWLEDGEMENTS

First, I would like to thank my committee members, Dr. Christoffer Holmgård, Dr. Casper Harteveld, Dr. Giovanni Troiano, and Dr. Magy Seif El-Nasr, for their guidance and help, which eased the thesis writing process. Thank you, Christoffer, for introducing me to artificial intelligence, guiding me, answering my endless list of queries, and providing constructive criticism wherever needed.
I would like to thank my family for believing in me, for believing that game science is an important field regardless of its application, and for helping me pursue my dreams. Special thanks to all my friends living here (USA) and in India, who helped me throughout my Master's journey with their constant support.

INTRODUCTION

Playtesting is an important part of the game development cycle. It provides feedback regarding the “playability” of the game. In the game industry, the quality assurance (QA) process involves hiring human playtesters to play the game, report bugs, and provide feedback regarding the playability of the game. Game development is an iterative process, and a game is released when it is balanced and nearly free of bugs (it is unrealistic to assume that a game will have no bugs at all). For this iterative process to function, there is a constant need for human playtesters, which costs money and time. One way to cut QA costs is to automate the playtesting process, leaving only a minimal need for human playtesters. Machine learning (a branch of artificial intelligence) is used in play-testing applications (PTA; Gudmundsson et al., 2018); such an approach is often referred to as automated play-testing (APT). In APT, pre-trained artificial intelligence (AI) agents “play” the game, test for bugs, give game designers feedback on the balance of game mechanics, and provide QA (Pfau, Smeddinck, & Malaka, 2017).

Platformer games (PGs) are among the most popular types of video games (Galyonkin, 2019) and include games like Mario Bros, Sonic the Hedgehog, and Crash Bandicoot. In PGs, the main task of players is usually to jump over obstacles, move and jump from one platform to another, and avoid and/or shoot enemies. A PG is a combination of various design patterns, including collectibles, mechanics, and power-ups (Khalifa, de Mesentier Silva, & Togelius, 2019), which developers can use to create a vast number of unique levels.
However, analyzing design patterns (Smith, Cha, & Whitehead, 2008) and their combinations for balancing the gameplay (Spencer, 1977) is hard. PGs can be 2D or 3D, depending on the dimensionality of the game objects used: 3D games have x, y, and z axes, and ignoring the z axis results in a 2D PG. In this thesis, we focus on 2D PGs.

This thesis uses APT in the context of PGs, with the end goal of helping game designers playtest and design their PGs effectively and rapidly. We develop a PTA that automatically plays and tests premade PG levels. The PTA developed for this thesis provides other developers with feedback about the difficulty and the degree of game balance in their game design, helping them playtest their games with respect to these attributes. Additionally, this system could be useful to students who aspire to develop platformer games, to independent game developers, and to researchers in game artificial intelligence.

BACKGROUND

This thesis investigates APT in PGs using machine learning, specifically deep reinforcement learning. There has been extensive research on APT: King (Gudmundsson et al., 2018) used supervised learning, EA (Zhao et al., 2019) used inverse reinforcement learning, Mugrai, Silva, Holmgård, and Togelius (2019) used Monte Carlo tree search with genetic evolution, and García-Sánchez, Tonda, Mora, Squillero, and Merelo (2018) used genetic evolution. All of these works have contributed useful results to the field of APT.

Curriculum learning (CL) (Bengio, Louradour, Collobert, & Weston, 2009) is a way of training a machine learning model in which the more difficult aspects of a problem are introduced gradually, so that the model or agent is always challenged at an appropriate level.
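As an illustration of the curriculum learning idea defined above, the following is a minimal Python sketch and not the training code used in this thesis. The stage names loosely mirror the training levels named in the List of Figures (spike pits, enemy AI, both), but their ordering, the promotion threshold of 0.8, and the names `Stage`, `run_curriculum`, and `toy_train_and_evaluate` are all assumptions introduced for the example. The toy evaluator simply stands in for "train the agent for a while, then measure its success rate".

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str          # hypothetical level identifier
    promote_at: float  # assumed success rate needed to move to the next stage


def toy_train_and_evaluate(stage_name, round_index):
    """Stand-in for one round of training followed by evaluation.

    Returns a fake success rate that grows with the number of rounds and
    grows more slowly on harder stages. A real setup would train an RL
    agent here and measure how often it completes the level.
    """
    difficulty = {"spike_pits": 1, "enemy": 2, "spike_pits_and_enemy": 3}[stage_name]
    return min(1.0, round_index / (4 * difficulty))


def run_curriculum(stages, train_and_evaluate, max_rounds_per_stage=50):
    """Visit stages from easiest to hardest, advancing once the agent is good enough."""
    history = []
    for stage in stages:
        for round_index in range(1, max_rounds_per_stage + 1):
            score = train_and_evaluate(stage.name, round_index)
            if score >= stage.promote_at:
                break  # agent meets the threshold: introduce the next, harder stage
        history.append((stage.name, round_index, score))
    return history


# Assumed ordering: single hazards first, then their combination.
STAGES = [
    Stage("spike_pits", promote_at=0.8),
    Stage("enemy", promote_at=0.8),
    Stage("spike_pits_and_enemy", promote_at=0.8),
]

history = run_curriculum(STAGES, toy_train_and_evaluate)
for name, rounds, score in history:
    print(f"{name}: promoted after {rounds} rounds (success rate {score:.2f})")
```

The point of the sketch is the control flow: the agent only meets the harder combination of hazards after it has reached the threshold on each simpler stage, which is the sense in which difficulty is introduced gradually.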
With such a curriculum, I can train AI models that are well versed in different aspects of their environment, because problems are presented to the agent along a proper difficulty curve (Aponte, Levieux, & Natkin, 2009).

Deep reinforcement learning has been used to play different games, such as Atari games (Mnih et al., 2013) by DeepMind and Dota 2 (OpenAI, 2018) by OpenAI. DeepMind’s AI agents have successfully played Atari games, and OpenAI’s Dota 2 agents have defeated the reigning world champions (Peng & Sarazen, 2019). Therefore, I could use deep reinforcement learning not just to play, but to playtest games as well.

To explore new possibilities and set baselines, I position this thesis in the area of APT, using machine learning as the approach, and in particular deep reinforcement learning combined with curriculum learning. Next, I briefly review previous work in both areas, discuss challenges that earlier work encountered when developing PG levels, and explain how APT can help tackle such challenges.

Level Design in Platformer Games

Consider a 2D PG like Ori and the Blind Forest (Moon Studios, 2015). It consists of a complex map system, which includes PG mechanics such as jumping to and from platforms, as well as solving puzzles. Ori, the protagonist, faces enemies (e.g., turrets, melee frogs, porcupines with long-ranged projectiles) and can collect health shards, energy shards, and many other special items (e.g., snow orb, keys for doors). The presence of these many elements gives rise to many mechanics, so there is effectively no limit to the number of unique levels one can design. Thus, by automating the testing process, I am able to test a vast number of levels in a short period of time, and make playtesting more scalable, less expensive, and dependent on only minimal human supervision.

Fig 1. An example of the map system from Ori and the Blind Forest

Fig 2. Turrets, spikes, platforms and blue crystal pickups in Ori and the Blind Forest

Automated Play-Testing in Platformer Games

APT is used in games to replace human testers with pre-trained AI agents. The agent provides QA regarding the playability of the level. The advantage of APT is that it reduces time and monetary costs compared to human testers. In this thesis, we explore the use of APT for the design of PGs.
