Crawl-O-Matic: Automated Optimization of Expert Systems With Machine Learning

Eli Ramthun, Department of Computer Science, Earlham College, Richmond, Indiana. [email protected]

ABSTRACT
This project involves utilizing machine learning techniques to attempt to optimize an already existing artificial intelligence (AI) agent that plays the game Dungeon Crawl Stone Soup (DCSS). An expert system based AI agent has been developed that is capable of winning a game of DCSS without human assistance with a low frequency of success. This project aims to utilize machine learning to automatically tune the agent to improve its performance by optimizing values stored in its knowledge base concerning monster threat levels.

KEYWORDS
Machine Learning, Artificial Intelligence, Software Optimization, Expert Systems

1 INTRODUCTION
Handcrafted, heuristically developed artificial intelligence (AI) algorithms currently exist for numerous applications. From gaming software to automated drone flight to assembly-line robotics, these algorithms are more and more becoming driving forces of modernity, enabling and enhancing industrial society from the production of goods to their consumption, and everywhere in between. [12] [5] [20]

These algorithms are often effective in completing their tasks, but are not necessarily optimized. This lack of optimization can have a broad swath of effects - from negative effects as minimal as slightly reduced movement efficiency to outright catastrophic failure. Manually tuning algorithms for peak performance can be a time intensive procedure, with the risk that the potentially marginal gains from optimization will not be worth the time invested to do so. However, if the process of optimization can be automated, then the time required to effectively optimize code can be greatly reduced. [27]

This project investigates the utilization of machine learning techniques to optimize an already existing AI agent, or expert system, that plays the roguelike game Dungeon Crawl Stone Soup (DCSS). [15] Roguelike games feature high levels of randomness and unpredictability in each playthrough, creating a search space much larger than that of static games such as Chess or Go, where the game is the same each time it is played. [7]

Despite the enormous complexity and stochasticity of DCSS, an AI named qw has been developed that is capable of winning a game of DCSS. [18] qw's frequency of success at completing the entire game is low, though still notably higher than the average success rate of human players playing the game - 17% versus 0.73%. [18][25] This project aims to optimize qw's behavior for a specific section of the game to improve its performance. The optimization will be performed by using machine learning techniques to find the most appropriate values for a section of qw's knowledge base. If successful, this project will showcase a technique for refining AIs for more efficient and/or successful operation in stochastic or unpredictable contexts.

Next, related works will be discussed. Following that, the design and implementation of the project will be documented, followed by results and analysis.

2 RELATED WORK
This paper focuses upon the optimization of a game playing agent for the roguelike game Dungeon Crawl Stone Soup. In accordance, it is important to examine both the history and development of gameplay agents for roguelike games. Attempts have been made to optimize gameplay agents for games both roguelike and otherwise - often utilizing machine learning - and these attempts at optimization are worth investigating as well. Finally, works based on applications of expert systems similar to roguelike gameplay agents will be discussed.

2.1 Gameplay Agents for Roguelike Games
Gameplay agents exist that can conquer many of the games within the roguelike genre, despite their renowned difficulty, complexity, and inherent stochasticity. This paper catalogs an attempt to optimize qw, an agent created to beat DCSS. Following are brief descriptions of agents that are successful at beating the roguelike games Rogue, Angband, and NetHack.

• Rogue and Rog-O-Matic
In 1980, Rogue, the titular roguelike, was released. In 1981, Rog-O-Matic followed. The source code for Rog-O-Matic has been preserved by various archivists and is currently hosted on GitHub, among other places. [26] Designed as an AI tooled to utilize an "expert system" so as to conquer Rogue, Rog-O-Matic excelled at the task. As documented by popular media representations such as "An Expert System Outperforms Mere Mortals As It Conquers The Feared Dungeons Of Doom" found in an issue of Scientific American from 1985, "Rog-O-Matic had a higher median score than any of the 15 top Rogue players at" Carnegie Mellon University. [16] The expert system organized the decision making processes needed to defeat Rogue into a "hierarchy of various subsystems," depicted in figure 1. [16] Rogue in the early 1980s was a much simpler game to attempt to solve than DCSS is today. Regardless, the success of the developers of Rog-O-Matic shows that expert systems adept at winning games that undergo such permutations as DCSS are well within reach.

Figure 1: A schematic depicting the layout of the "expert system" of Rog-O-Matic, the first gameplay agent to defeat the game Rogue. Knowledge sources feed into expert subsystems which ultimately handle the decision making processes of the agent.

• Angband and Borg
Angband is a roguelike game that has been in development since 1994, though it has inherited code from older roguelike games such as Moria and UMoria. It has since spawned numerous variants featuring different dungeon styles and playable character archetypes. [1] One of Angband's defining features is non-persistent dungeons; leaving a floor of the game's dungeon will effectively destroy it. While a character can, say, enter the first floor of the dungeon, leave, and then enter the first floor again, a new layout with new monsters, items, and traps will be generated each time a given depth is entered. This allows for an effectively unlimited amount of resources in the game, if the player follows a strategy known as "level-scumming." [6] This strategy is easily exploitable by automated gameplay agents, and is largely responsible for the success of Borg, the premier Angband bot. By abusing such a strategy, Borg is able to proceed through the game safely and effectively. Borg functions by following an internal list of goals, evaluated sequentially until the game is beaten. These goals range from very simple, such as destroying worthless items in the character's inventory, to very complex, such as determining the optimal set of equipment to wear given the current contents of the character's inventory. [2] This goal listing functions similarly to the expert systems of Rog-O-Matic - handing off information about inventory contents to functions designed to sort and prioritize them, for instance - and ultimately, this goal setting and following paradigm is what makes qw, the DCSS bot, tick.

• NetHack and BotHack
NetHack is a famous and incredibly complicated roguelike under development since 1987. [11] Renowned for its immense amounts of items, creatures, features, and interactions between all of the above, the phrase "The devs think of everything" has become enshrined in the canon of the game's culture. [3] With that complexity in mind, creating a bot able to "ascend" (beat the game) unassisted seemed like a pipe dream for many years, until Reddit user duke-nh showed their success with BotHack in 2015. [17] Similarly to the Angband Borg tactic of level-scumming, BotHack's success is at least partially reliant on the arguably exploitative tactic of "pudding farming" to generate an infinite amount of items for the character to utilize, trivializing many aspects of the game. This bot was programmed utilizing some knowledge sources from previous NetHack bots TAEB and Saiph, such as monster and item dictionaries, but the expert subsystems those sources fed into were hand coded by duke-nh. Thus, this bot again follows a similar paradigm to previous expert systems designed to beat roguelike games - encyclopedic knowledge of possible threats and tools, combined with hand-coded procedures for solving the various game states the bot is bound to encounter. This bot is often successful at achieving its task, but in the words of its developer, "all the bot logic is hand-coded." [17] This shows that this bot as well has room for application of machine-learning based optimization.

2.2 Optimization of Gameplay Agents
Creation of a gameplay agent for a given game - particularly a game featuring procedurally generated content - is often an impressive feat. Optimization of these agents goes further, allowing for improvement of performance over a number of different possible axes, from speed to score to success rate.

• Rogue-Like Games as a Playground for Artificial Intelligence - Evolutionary Approach
This paper by Cerny et al. [14] specifically deals with issues like procedural content generation and their effects on algorithmic attempts to beat roguelikes. They describe their procedure towards creating an AI that has reached a 72% winrate at the roguelike game Desktop Dungeons - a simpler game than DCSS, but one featuring complex rule sets and procedural content generation nonetheless. The authors document their usage of recombination of greedy strategies using evolutionary algorithms and demonstrate a remarkable improvement over their initial hand made greedy algorithm - which was only able to achieve victory 1% of the time. [14] This shows that machine learning approaches can have enormous potential in optimization of even rudimentary hand designed algorithms.

• Maximizing Flow as a Metacontrol in Angband
In their 2015 paper, Mariusdottir et al. utilize hierarchical reinforcement learning to match different decision making strategies with different levels of difficulty that arise naturally from the many different possible game states of the roguelike Angband. [24] They demonstrate the applicability of machine learning strategies towards optimization within the genre of roguelikes when applied not only to single decisions, but overall strategy decisions. They sought to optimize player score as a metric, attempting to push the Angband Borg to complete content as quickly as possible while still surviving. While their new metacontrol strategies failed to produce success rates at clearing the game any better than the default configuration of the Borg, they showed a clear improvement between a handcrafted metacontrol strategy selection algorithm and a similar algorithm that was tuned utilizing machine learning techniques. This again demonstrates the capability of machine learning to optimize gameplay strategies for roguelike games.

• Learning Combat in NetHack
In this article by Campbell et al., the authors describe "a machine learning approach for a subset of combat in the game of NetHack". [13] They utilize a "dueling double deep-Q network" to attempt to optimize what they term "low level decision making" in the complicated roguelike NetHack. By pairing a player agent against a varied combination of different significantly threatening monsters and using a machine learning toolkit to analyze the results, the authors were able to demonstrate a broad improvement in weapon choice and micro-strategy selection against a broad array of mid-game threats. The authors note that the hard coded combat strategies of BotHack are "not well tuned to [their] limited, single-room test environment," [13] but that indeed combining the proven success of more complicated bots with their optimization strategy is a key domain of future work.

2.3 Applications of Expert Systems
Understanding the process necessary to optimize game agents and expert systems has benefits outside the domain of video game AI. The following works are examples of this applicability.

• Dynamic Asset Protection & Risk Management Abstraction Study
Rog-O-Matic has been further analyzed by scholars of other fields, such as Henderson et al.'s appraisal of Rog-O-Matic as an expert system in the article "Dynamic Asset Protection & Risk Management Abstraction Study". [21] The authors analyze the ways in which Rog-O-Matic was designed so as to "create an expert system with the ability to solve an exploration problem", going on to describe the contours of "exploration problems" in mathematical terms, as systems containing planar graphs which are explored from starting points or nodes and which share observable characteristics and the ability to transition between observable nodes. [21] This sort of theoretical work on AI for procedurally generated games shows the applicability of this sort of work to more generalized and mathematical problem sets, and to real world problem spaces such as automated network security as described by Henderson et al.

• An expert system for detection of breast cancer based on association rules and neural network
In this paper by Karabatak and Ince, an expert system is described which is capable of detecting breast cancer with a correct classification rate of 95.6%. [23] This work shows not just the applicability of expert systems towards real world problem sets, but additionally the potent combination represented by expert systems and machine learning. Originally based solely on neural network derived classification, the authors' system performed better when an expert system based on association rules was incorporated into their classifier. This shows that real world applications of expert system schemas can greatly benefit from machine learning based optimization, and machine learning based classifiers can likewise be improved by the addition of heuristically designed expert systems.

3 DESIGN & FRAMEWORK
The design of this project has three major focuses. The software components of the project - DCSS, qw, and scikit-learn - are the first focus to be discussed. The second is the framework of the hardware utilization scheme used to collect data and generate results based on said data. Finally, the specific design choices of what exactly is being optimized will be discussed.

3.1 Software Components
This project focuses upon optimizing the expert system AI agent qw for the game DCSS, utilizing the open source machine learning toolkit scikit-learn. The justifications for the selection of these pieces of software shall be discussed in this section.

3.1.1 DCSS. Dungeon Crawl Stone Soup (DCSS), a game in the roguelike genre, represents a facet of the problem space for this work. "Roguelike" is a term for a genre of computer game featuring a number of distinct elements. Frequently, these elements are taken to be procedural content generation (PCG), "permadeath", grid-based gameplay, and turn-based gameplay. [19] These attributes present a design space for AI systems that is at once challenging, interesting, and suitable.

The combination of "random environment generation" (an element of PCG) and permadeath create an experience where "[n]ot only with every new game does the player enter a different location, but also s/he always does it as a new character with only one life" [19]. This serves to enhance re-playability for the consumers of such games, and to create an interesting space to explore in terms of AI development. Where games like Chess or Go feature static content, predictable beginnings and a static board, games with procedural content must be adapted to for any given attempt to beat them. Lessons learned in failing must be generalized, as specific contexts and encounters have minute chances of occurring again, diminishing the usefulness of a "book" of board states such as in chess. Many roguelikes do employ "seeds" in the algorithms which generate procedural content [4], so it is theoretically possible to encounter the same dungeon multiple times, which can be useful for occasions such as debugging or challenges between players competing for high scores or fast victories. Figure 2 depicts a procedurally placed vault at the start of a game of DCSS; while the layout of trees and water is a fixed element, it was randomly selected to be present in the instance of the game featured, and the items and monster appearing in it were randomly placed.
DCSS has a vast amount of such vaults, allowing them to be interspersed with more stochastic seeming content so that unique instances of the game still have chunks of structure. These vaults reward familiarity with the game, as some feature specific valuable items or dangerous enemies that a seasoned player can know are worth searching for or avoiding when they encounter the vault. [22]

Figure 2: A screen capture of a game of DCSS. The player character is wearing green boots and wielding a sword; three exits are visible, and a fierce gnoll wielding a halberd threatens the player's progression further into the dungeon. The game can alternately be played in ASCII mode in a Unix-like console environment, where symbols take the place of tiles and sprites in representing gameplay information.

While the randomness of roguelike games provides an interesting twist to the approach needed to develop effective AI, the grid and turn based nature of the genre facilitates such work. A grid based game provides a discrete amount of movement options an order of magnitude or two smaller than a game with full 360° movement and rotation. Additionally, the turn-based nature of the genre enables "thoughtful" AI that is not dependent upon quick decisions. With no reflexes to account for, the AI can take as long as is necessary to ponder individual choices, turns, and courses of action. These factors are all present in DCSS, and make it a compelling problem space to optimize expert systems within.

Additional features specific to the game also make it an interesting and worthwhile testbed for automated gameplay agent optimization. These features are rooted in the structure of the game itself - DCSS utilizes "lua bindings" to allow for easy extension of the game and creation of macro functions to allow players greater efficiency on input. It is with these lua bindings that qw is written. Additionally, DCSS contains a few automation features for reduction of gameplay tedium that make it particularly suitable for developing AI systems to play it. These are "autoexplore" and "autofight," which respectively allow the player to automatically explore the dungeon until they reach a designated stopping point (e.g. seeing a new monster, finding a particular kind of item) and allow the player to either take a single step towards an enemy or to attack the nearest enemy. These features make implementation of an AI agent to play the game much easier.

Finally, it is worth noting that the version of DCSS this project uses is 22.0. This version was selected as it was the most recent mainline version released at the start of the Fall Semester 2018.

3.1.2 qw. qw was the first AI agent created that is capable of winning a game of DCSS. It reportedly wins games using the character configuration of Deep Dwarf Berserker at a rate of approximately 17%. [18]

The bot is structured like an expert system; it contains knowledge sources such as a dictionary associating monsters with their "scariness" and functions that are hard coded to return value weights for item comparison. qw features goal seeking behavior similar to that of the Angband "Borg", where smaller goals, such as finding a downward staircase or escaping from a dangerous monster, take place sequentially as necessary on the way to larger, more overarching goals, such as finding the three runes of Zot necessary to win the game.

3.1.3 scikit-learn. Scikit-learn is an open source, Python based machine learning framework that allows users to easily implement machine learning. [8] It was selected due to its ease of implementation and the wide availability of guides on getting it up and running. Two kinds of machine learning were attempted with scikit-learn for this project: Stochastic Gradient Descent (SGD) and Logistic Regression with Cross Validation (LogisticRegressionCV). SGD was selected for its speed and efficiency at processing large sets of data [9], and LogisticRegressionCV for its accuracy and ease of tuning hyperparameters due to built in cross validation. [10] The two applied machine learning algorithms will be performing classification: classifying whether given permutations of monster scariness weights are likely to lead to a successful run of the first floor of the dungeon or not.

3.2 Optimization Focus
Instead of attempting to optimize all of qw for all of the game of DCSS, this project aims to reduce the scope of experimentation. In experimental trials, it took several minutes for qw to complete a playthrough of DCSS, so instead of optimizing qw's performance for the whole game, the project aims to optimize the success of the bot at traversing the first floor of the dungeon. DCSS allows for a wide variety of starting character permutations; the trials will all be conducted with a single character configuration, discussed below. Additionally, the focus of the optimization itself will be on a particular knowledge source within qw: the monster scariness function, which returns a value corresponding to the threat level of a given monster.

3.2.1 D:1. D:1 is the in-game name for the first floor of the dungeon. By focusing on optimizing qw's success at traversing D:1, a few things are made easier. Runs of qw on the first floor take approximately 3 seconds to complete on the hardware utilized for this project; this allows for much faster data collection and result validation than if the entire game was the focus. Additionally, there is a small subset of monsters that will appear on the first floor of the dungeon in the course of normal gameplay. This small subset acts as a sort of natural feature selection so as to simplify the amount of monsters under consideration when attempting different scariness weight permutations.

3.2.2 Character Selection. The first floor of the dungeon has a drawback for these purposes. qw is so effective with standard character configurations that the most touted qw character, the Deep Dwarf Berserker (DDBe), sees a 98% success rate at traversing the first floor of the dungeon. As the Berserker (Be) class is necessary to enable the berserk ability for the character from the beginning of the game, this project will attempt to optimize a different race of Be for the first floor. The character race of "human" (Hu) was selected, due to its relatively lower proficiencies compared to the DD. Additionally, DD features a powerful self-healing ability that Hu lacks, which could also contribute to its incredibly high success rate.

Figure 3: This figure depicts the process by which data was collected. A head node sent out simultaneous instructions to run DCSS on the 12 compute nodes of the Al-Salam cluster. When a game finished the first floor of the dungeon, a morgue file was created and the results were sent to the head node. In some trials, the head node ran DCSS processes as well.

3.2.3 Monster scariness. Optimizing for a proper permutation of monster scariness was selected as it makes for an interesting decision within the confines of the first floor of the dungeon. qw, when evaluating its strategy against a given threat, appraises the "scariness" rating of a monster, an integer between 1 and 27. If this integer is higher than the player's level, qw will attempt to initiate combat by berserking. This powerful ability gives the player character extra damage in combat, increases their attack and movement speed, and prevents them from taking actions outside of movement, attacking, butchering corpses, and eating.
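The threshold rule just described can be sketched in Python. This is an illustration, not qw's actual code (qw is written in Lua); the function and dictionary names are hypothetical, and the weights shown are the defaults reported in section 4.1 (gnolls at 5, worms at 4, all other D:1 monsters unset).

```python
# Hypothetical sketch of qw's berserk decision; qw itself is Lua.
# Default weights: gnoll = 5, worm = 4; unlisted monsters default
# to 0 and therefore never trigger a berserk.
SCARINESS = {"gnoll": 5, "worm": 4}

def should_berserk(monster: str, player_level: int) -> bool:
    """Berserk only when the monster's scariness exceeds the player's level."""
    return SCARINESS.get(monster, 0) > player_level

print(should_berserk("gnoll", 1))  # level-1 character berserks: 5 > 1
print(should_berserk("gnoll", 5))  # level-5 character does not: 5 > 5 is false
```

Under this rule a weight of 2 only triggers berserking at character level 1, and a weight of 3 at levels 1 and 2, which matches the trial groupings used in section 4.2.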
In addition to restricted actions while berserking, there are costs associated with it: the character loses satiation, pushing them closer to starvation, and after berserking the character often becomes paralyzed for 4-7 turns, preventing them from taking actions while monsters can still attack them. Additionally, the character is slowed after berserking, and is unable to berserk again for a period of time. While the combat boon of berserking is necessary to survive some encounters, especially early in the dungeon when other resources are limited, it is not a one sided endeavor, and berserking too frequently can lead to premature death for an adventurer delving the depths of the DCSS dungeon. As such, the decision of whether or not to berserk is an interesting one, especially within the context of the first floor of the dungeon.

3.3 Hardware Utilization & Parallelization
To collect enough data to properly utilize machine learning, parallel hardware was used to run many trials of crawl simultaneously. qw is embodied in a configuration file known as an "rc file", i.e. qw.rc. Different permutations of qw were stored on a central head node of the Al-Salam cluster. These permutations varied by the scariness weights assigned to the different encounterable monsters on the first floor. When a game of crawl ends or saves, a morgue file is created. These morgue files have different file-name structures based on whether the character in question was alive or dead at their creation. The cluster's head node sent out simultaneous instructions to run DCSS on the 12 compute nodes of the Al-Salam cluster, with each node assigned a different rc file with which to run qw. Seeds were utilized to ensure reproducibility; seeds in sequence from 1 to n were utilized, where n was the given amount of runs for a batch of trials. The morgue files created were then scanned by a script run from the head node, and data was collected in the form of a binary success/failure ratio for each permutation of scariness weights. See figure 3 for a diagram detailing the parallelization process. This data was used to train the machine learning models.

3.4 Scikit-learn and Python
The scikit-learn pipeline used to generate classification models is rather straightforward. A Python dictionary was created associating run tags (unique identifiers given to each set of trials) with lists. These lists contain further lists, which contain integer values of wins and losses, alongside the permutation of scariness weights for the given node on the given run. This dictionary was then iterated through to produce two NumPy arrays: one of shape (x, y), where x is the total number of runs completed and y is the permutation of weights for each given run, to serve as a samples array, and the other of shape (x, 1), to serve as a target array, with each element being a 1 or 0, corresponding with success or failure for the equivalent row in the samples array. These arrays were then used to train a classifier, either using SGD or LogisticRegressionCV.

After training, an additional list of every possible permutation of the 10 scariness weights in question was created. This list was iterated over, and each possible permutation was stored alongside the probability of the permutation being a 1 (success), determined by the trained classifier. The permutation with the highest probability of success was then extracted from the final list, and a new rc file was created with those values. These new rc files were used to run more trials of qw to compare relative rates of success between the machine learning selected models and the training data.

Since SGD operates stochastically, different permutations were noted to have the highest probability of success after training the model on the same data with the same hyperparameters at different times. Three different models were generated and tested. LogisticRegressionCV lacks this variation, and produced the same model when run on the training data multiple times. This model was also tested for success.
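The train-then-enumerate pipeline described in section 3.4 can be sketched as follows. This is a minimal stand-in rather than the project's actual script: the synthetic samples and target arrays replace the morgue-file data, the candidate weight values are restricted to {0, 4} purely to keep the enumeration small, and all variable names are illustrative.

```python
import itertools

import numpy as np
from sklearn.linear_model import LogisticRegressionCV

MONSTERS = ["adder", "bat", "giant rat", "gnoll", "goblin",
            "hobgoblin", "jackal", "kobold", "leopard gecko", "worm"]

# Stand-in for the arrays built from morgue files: each samples row is a
# ten-integer scariness permutation; each target entry is 1 (cleared D:1)
# or 0 (died). Outcomes here are synthetic, biased so that fearing gnolls
# tends to correlate with success.
rng = np.random.default_rng(0)
samples = rng.choice([0, 4], size=(200, len(MONSTERS)))
target = ((samples[:, 3] == 4) | (rng.random(200) < 0.3)).astype(int)

clf = LogisticRegressionCV(cv=5, max_iter=1000).fit(samples, target)

# Enumerate every candidate permutation, score each one's probability of
# success with the trained classifier, and keep the most promising one.
candidates = np.array(list(itertools.product([0, 4], repeat=len(MONSTERS))))
success_prob = clf.predict_proba(candidates)[:, 1]
best = candidates[success_prob.argmax()]
print(dict(zip(MONSTERS, best.tolist())))
```

The winning permutation would then be written into a fresh rc file and re-run on the cluster to compare against the training data, as described above.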
The following section will catalog both the results of the training runs and the results of the permutations created with scikit-learn.

4 RESULTS
First, the data collected in training will be documented. Note that while nodes were instructed to run round numbers of trials (e.g. seeds 1-10000), there was some level of unpredictable failure stemming from bugs in the code.

4.1 Scariness Encoding
Scariness ratings for 10 monsters on the first floor of the dungeon were encoded in a string of ten integers, each corresponding to a monster. These strings were used to create the models on the training data. Monster names and their corresponding index are shown below.

Monster Name    Index
Adder           0
Bat             1
Giant Rat       2
Gnoll           3
Goblin          4
Hobgoblin       5
Jackal          6
Kobold          7
Leopard Gecko   8
Worm            9

As an example, the default scariness encoding of qw for the monsters of the first floor of the dungeon puts gnolls at 5, worms at 4, and no other monsters have any values. This is encoded thusly:

Default Scariness Values
Monster Name    Scariness Value
Adder           0
Bat             0
Giant Rat       0
Gnoll           5
Goblin          0
Hobgoblin       0
Jackal          0
Kobold          0
Leopard Gecko   0
Worm            4

4.2 Training Data
With scariness weight set to 2, qw will use the berserking strategy against the selected monster only if the character level is at 1.

Scariness Weight Set To 2
Node    Trials   Deaths   Scariness Emphasis   Rate of Success
as1     8871     457      Adder                94.85%
as2     8851     500      Bat                  94.35%
as3     8870     461      Giant Rat            94.80%
as4     8894     405      Gnoll                95.45%
as5     8866     484      Goblin               94.54%
as6     8875     449      Hobgoblin            94.94%
as7     8867     461      Jackal               94.80%
as8     8898     408      Kobold               95.41%
as9     8876     452      Leopard Gecko        94.91%
as10    8875     452      Worm                 94.91%
as11    8868     475      All Monsters         94.64%
Total   97611    5004     Average              94.87%

With scariness weight set to 3, qw will use the berserking strategy against the selected monster if the character level is at 1 or 2. Slightly stronger characters will be willing to berserk against threats in these trials.

Scariness Weight Set To 3
Node    Trials   Deaths   Scariness Emphasis   Rate of Success
as1     4932     250      Adder                94.93%
as2     487      35       Bat                  92.81%
as3     489      27       Giant Rat            94.48%
as4     494      16       Gnoll                96.76%
as5     72       1        Goblin               98.61%
as6     494      25       Hobgoblin            94.94%
as7     4922     273      Jackal               94.45%
as8     4945     220      Kobold               95.55%
as9     491      26       Leopard Gecko        94.70%
as10    4923     249      Worm                 94.94%
as11    4920     286      All Monsters         94.19%
Total   27169    1408     Average              95.12%

The maximum level a character can reach on the first floor of the dungeon is 3. Therefore, any scariness weight set to 4 or higher will have equivalent results for a character piloted by qw on the first floor of the dungeon.

Scariness Weight Set To 5

Node    Trials    Deaths    Scariness Emphasis    Rate of Success
as0     29717     1174      Default Behavior      96.05%
as1     8871      446       Adder                 94.97%
as2     8851      590       Bat                   93.33%
as3     8870      461       Giant Rat             94.80%
as4     8895      354       Gnoll                 96.02%
as5     8865      488       Goblin                94.50%
as6     8875      457       Hobgoblin             94.85%
as7     8867      493       Jackal                94.44%
as8     8897      410       Kobold                95.39%
as9     8876      446       Leopard Gecko         94.98%
as10    8875      446       Worm                  94.97%
as11    8868      510       All Monsters          94.25%
as12    8870      460       Never berserk         94.81%
Total   136197                                    Average 94.87%

4.3 SGD Derived Scariness Values
Three versions of SGD-derived scariness weights were tested. Following are those weight assignments and their results.

4.3.1 SGD_1. Following are the derived weights from the first utilization of SGD.

SGD_1 Scariness Values

Monster Name     Scariness Value
Adder            4
Bat              0
Giant Rat        4
Gnoll            4
Goblin           0
Hobgoblin        0
Jackal           4
Kobold           4
Leopard Gecko    0
Worm             4

The results from trials run with these weights are as follows. These trials were run with random seeds so as to confirm the models were not over-fitting to the input data based on the first n seeds.

SGD_1 Success Rates

Node    Trials    Deaths    Rate of Success
as2     494       21        95.75%
as3     492       24        95.12%
as4     496       15        96.98%
as5     496       23        95.36%
as6     496       20        95.97%
Total   2474      103       Average 95.84%

4.3.2 SGD_2. Following are the derived weights from the second utilization of SGD.

SGD_2 Scariness Values

Monster Name     Scariness Value
Adder            4
Bat              0
Giant Rat        4
Gnoll            4
Goblin           0
Hobgoblin        4
Jackal           4
Kobold           4
Leopard Gecko    0
Worm             0

The results from trials run with these weights are as follows. These trials were run with random seeds so as to confirm the models were not over-fitting to the input data based on the first n seeds.

SGD_2 Success Rates

Node    Trials    Deaths    Rate of Success
as2     9875      483       95.11%
as3     9858      539       94.53%
as4     9893      464       95.31%
as5     9893      432       95.63%
as6     9875      470       95.24%
Total   49394     2388      Average 95.17%

4.3.3 SGD_3. Following are the derived weights from the third and final utilization of SGD.

SGD_3 Scariness Values

Monster Name     Scariness Value
Adder            0
Bat              0
Giant Rat        0
Gnoll            4
Goblin           4
Hobgoblin        0
Jackal           0
Kobold           0
Leopard Gecko    0
Worm             0

The results from trials run with these weights are as follows. These trials were run with random seeds so as to confirm the models were not over-fitting to the input data based on the first n seeds.

SGD_3 Success Rates

Node    Trials    Deaths    Rate of Success
as2     9874      449       95.45%
as3     9880      435       95.60%
as4     9901      382       96.14%
as5     9899      410       95.83%
as6     9897      363       96.33%
Total   49451     2039      Average 95.88%

4.4 LogisticRegressionCV Derived Scariness Values
Following are the derived weights from LogisticRegressionCV.

LogisticRegressionCV Scariness Values

Monster Name     Scariness Value
Adder            0
Bat              0
Giant Rat        0
Gnoll            4
Goblin           0
Hobgoblin        0
Jackal           0
Kobold           4
Leopard Gecko    0
Worm             4

The results from trials run with these weights are as follows. These trials were run with random seeds so as to confirm the models were not over-fitting to the input data based on the first n seeds.

LogisticRegressionCV Success Rates

Node    Trials    Deaths    Rate of Success
as1     2379      90        96.22%
as2     2383      86        96.39%
as3     2374      105       95.58%
as4     2367      113       95.23%
as5     2376      93        96.09%
as6     2378      96        95.96%
as7     2375      99        95.83%
as8     2370      85        96.41%
as9     2374      91        96.17%
as10    2379      79        96.68%
as11    2383      88        96.31%
as12    2372      79        96.67%
Total   28510     1104      Average 96.13%

4.5 Comparison and Analysis
Figure 4 depicts the success rates of the different scariness distributions derived by the machine learning models, with the default results included for comparison's sake.

Figure 4: This chart shows the difference in success rates between the trials run on the machine-learning-optimized scariness weights. Error bars shown are the calculated standard error for each series.

Statistical analysis of the results was conducted with chi-squared testing at a significance level of 0.05. The first SGD model was 1.01% worse than the default behavior, with P = 0.0141. The second SGD model was 0.88% worse than the default behavior, with P < 0.0001. The third SGD model is similar to the default behavior; its success rate is 0.17% lower, but with P = 0.2404 we cannot reject the null hypothesis. The LogisticRegressionCV model performed 0.08% higher than the default behavior, but with P = 0.6186, that model does not show a statistically significant difference in performance from the default behavior.

5 CONCLUSION
Hundreds of thousands of data points were collected using various monster scariness weights. These data points were fed into Scikit-learn, and models were created and utilized to attempt to find more optimal distributions of scariness weights to improve one of qw's knowledge sources, thus improving its performance when that knowledge was handed off to the subsystem in charge of combat strategy.

The attempted optimizations of qw's berserk logic failed to improve the success of HuBe's in traversing the first floor of the dungeon. Initial testing with a weaker character race, Deep Elves (DE's), shows promise that the LogisticRegressionCV model can indeed create significant performance improvements for the focused optimization. Under default behavior, Deep Elf Berserkers (DEBe's) succeeded at a rate of 86.86%, much less than the Human rate of 96.05%. DEBe testing showed a 2.16% improvement (up to 89.02%) in success rate with P = 0.0028, a statistically significant improvement. This indicates that such optimization methods may be more impactful when initial rates of success are lower.

5.1 Future Work
Further tuning of hyperparameters could improve the results of SGD modeling, which produced substandard results. Additionally, some features could be pruned to improve performance; monsters like bats, for instance, could simply always be set to null, as no model suggested they should be feared.

Creating models from data based on weaker character races shows promise based on the initial DEBe analysis. By lowering the initial success rate of the bot, further room for improvement becomes available.

This same sort of knowledge source tuning could be performed for different parts of qw, optimizing progression through more of the game than the first floor. For instance, item valuation weights could be examined to judge which items are more effective overall. Other, larger strategic choices, such as optional rune branch selection, could also be areas of interest.

These techniques also have potential applications for software optimization in fields unrelated to gaming, such as search and rescue drone navigation. The promise shown by the initial tests of DEBe on the LogisticRegressionCV-sourced weights indicates that this form of machine learning can lead to significant improvements in the performance of expert systems under various metrics.
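The chi-squared comparisons reported in Section 4.5 can be reproduced directly from the trial counts in the tables above. The stdlib-only sketch below applies the standard Pearson 2x2 test (no continuity correction) to the default-behavior and LogisticRegressionCV counts; the paper does not say which implementation it used, so treat this as an illustration rather than the authors' code.

```python
# Pearson chi-squared test on a 2x2 survival table, stdlib only.
from math import erfc, sqrt

def chi2_2x2(deaths_a, trials_a, deaths_b, trials_b):
    """Chi-squared statistic and df=1 p-value, no continuity correction."""
    observed = [
        [deaths_a, trials_a - deaths_a],
        [deaths_b, trials_b - deaths_b],
    ]
    total = trials_a + trials_b
    col_sums = [deaths_a + deaths_b, total - deaths_a - deaths_b]
    chi2 = 0.0
    for row, row_sum in zip(observed, (trials_a, trials_b)):
        for cell, col_sum in zip(row, col_sums):
            expected = row_sum * col_sum / total
            chi2 += (cell - expected) ** 2 / expected
    # With one degree of freedom the p-value reduces to erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

# Default behavior vs. the LogisticRegressionCV-derived weights.
chi2, p = chi2_2x2(1174, 29717, 1104, 28510)
# p comes out near 0.63, close to the reported P = 0.6186, so the
# difference is not significant at the 0.05 level.
```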

6 ACKNOWLEDGEMENT
Thanks to Dr. David Barbella for his motivational and informative advising, Dr. Xunfei Jiang for her constant guidance and leadership, and Dr. Charles Peck for his inspiration and encouragement. Additional thanks to all of my peers in the Earlham Department of Computer Science for their feedback and support. Thanks to the DCSS community, including the ##crawl and ##crawl-dev IRC channels, for their valuable knowledge of the inner workings of DCSS. Special thanks to Sigmund, for staying out of the first floor of the dungeon.

REFERENCES
[1] [n. d.]. About - rephial.org. http://rephial.org/develop
[2] [n. d.]. Angband Borg – News. http://www.phial.com/angborg/news.html
[3] [n. d.]. NetHack / Developers Foresight - TvTropes. https://tvtropes.org/pmwiki/pmwiki.php/DevelopersForesight/NetHack
[4] [n. d.]. Random number generator - RogueBasin. http://www.roguebasin.com/index.php?title=Random_number_generator
[5] 1998. A heuristic-based genetic algorithm for workload smoothing in assembly lines. 25 (Feb 1998), 99–111. https://doi.org/10.1016/S0305-0548(97)00046-4
[6] 2016. Cheating - Roguebasin. http://www.roguebasin.com/index.php?title=Cheating
[7] 2018. http://roguebasin.roguelikedevelopment.org/index.php?title=What_a_roguelike_is
[8] 2018. https://scikit-learn.org/stable/
[9] 2018. https://scikit-learn.org/stable/modules/sgd.html#classification
[10] 2018. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html
[11] 2018. NetHack - NetHack Wiki. https://nethackwiki.com/wiki/NetHack
[12] A. Afshar, O. Bozorg Haddad, M. A. Mariño, and B. J. Adams. 2007. Honey-bee mating optimization (HBMO) algorithm for optimal reservoir operation. Journal of the Franklin Institute 344, 5 (Aug 2007), 452–462. https://doi.org/10.1016/j.jfranklin.2006.06.001
[13] Jonathan Campbell and Clark Verbrugge. 2017. Learning Combat in NetHack. https://aaai.org/ocs/index.php/AIIDE/AIIDE17/paper/view/15815
[14] Vojtech Cerny and Filip Dechterenko. 2015. Rogue-Like Games as a Playground for Artificial Intelligence – Evolutionary Approach. In Entertainment Computing - ICEC 2015 (Lecture Notes in Computer Science). Springer, Cham, 261–271. https://doi.org/10.1007/978-3-319-24589-8_20
[15] Crawl Devteam. 2018. Dungeon Crawl Stone Soup. https://crawl.develz.org/
[16] A. K. Dewdney. 1985. An expert system outperforms mere mortals as it conquers the feared dungeons of doom. Scientific American 252, 2 (1985), 18–21. https://web.archive.org/web/20160625035902/http://science-fiction.fch.ir/rogue/doc/An_expert_system_outperforms_mere_mortals_as_it_conquers_the_feared_Dungeons_of_Doom.html
[17] duke nh. 2015. YAAP: full-auto bot ascension (BotHack). https://www.reddit.com/r/nethack/comments/2tluxv/yaap_fullauto_bot_ascension_bothack/
[18] elliptic. 2018. The DCSS-playing bot qw. https://github.com/elliptic/qw original-date: 2014-09-21T22:32:57Z.
[19] Maria B Garda. 2013. Neo-rogue and the essence of roguelikeness. (2013), 15.
[20] Antonio González-Pardo, Fernando Palero, and David Camacho. 2015. An empirical study on collective intelligence algorithms for video games problem-solving. (2015). https://repositorio.uam.es/handle/10486/674486
[21] G Henderson, E Bacic, and M Froh. 2005. Dynamic Asset Protection & Risk Management Abstraction Study. (2005), 50.
[22] Dr Mark R Johnson. 2015. Handmade Detail in a Procedural World. http://twvideo01.ubm-us.net/o1/vault/gdceurope2015/Johnson_Mark_HandmadeDetailIn.pdf
[23] Murat Karabatak and M. Cevdet Ince. [n. d.]. An expert system for detection of breast cancer based on association rules and neural network. 36, 2 ([n. d.]), 3465–3469. https://doi.org/10.1016/j.eswa.2008.02.064
[24] Thorey Maria Mariusdottir, Vadim Bulitko, and Matthew Brown. [n. d.]. Maximizing Flow as a Metacontrol in Angband. In AIIDE (2015).
[25] Colin Morris. 2018. Analyzing completed games of Dungeon Crawl Stone Soup: colinmorris/crawl-coroner. https://github.com/colinmorris/crawl-coroner
[26] retrobits. [n. d.]. retrobits/rogomatic - github. https://github.com/retrobits/rogomatic original-date: 2016-01-22T18:33:47Z.
[27] P. B. Thanedar, J. S. Arora, G. Y. Li, and T. C. Lin. 1990. Robustness, generality and efficiency of optimization algorithms for practical applications. Structural Optimization 2, 4 (Dec 1990), 203–212. https://doi.org/10.1007/BF01748225
