The Real-Time Strategy Game Multi-Objective Build Order Problem
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the Tenth Annual AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE 2014) The Real-Time Strategy Game Multi-Objective Build Order Problem Jason Blackford and Gary Lamont Department of Electrical and Computer Engineering, Air Force Institute of Technology Wright-Patterson Air Force Base, Dayton, Ohio, United States of America Abstract next action based on an observed strategy the opponent is executing. Adaptation corresponds to an agent’s planning In this paper we examine the build order problem in mechanism in which an agent adapts its plans in real-time real-time strategy (RTS) games in which the objective to cope with changes in an uncertain environment and op- is to optimize execution of a strategy by scheduling actions with respect to a set of subgoals. We model ponent. These three capabilities enable an AI to learn from the build order problem as a multi-objective problem and reason about its world. Together the competencies and (MOP), and solutions are generated utilizing a multi- capabilities describe an AI framework that enables an agent objective evolutionary algorithm (MOEA). A three di- to manage competing goals in an uncertain and dynamic en- mensional solution space is presented providing a de- vironment. piction of a Pareto front for the build order MOP. Re- To summarize the high level requirements for the devel- sults of the online strategic planning tool are provided which demonstrate that our planner out-performs an opment of an expert RTS agent, we generated the expert expert scripted player. This is demonstrated for an AI RTS agent pyramid in Fig. 1. The foundation of the pyra- agent in the Spring Engine Balanced Annihilation RTS mid is the RTS player competencies that are implemented game. with various AI algorithmic techniques. These techniques can range from complex search algorithms (Sailer, Buro, and Lanctot 2007) to scripted managers that encapsulate each Introduction competency. Above the player competencies are the expert RTS player capabilities. The capabilities rely on the algo- A goal of current real-time strategy (RTS) game AI research rithmic techniques utilized to implement the competencies. is to develop an agent that performs at the level of an expert Collectively the competencies and capabilities enable an AI human player. A multitude of researchers explain that there agent to handle uncertainty and manage competing goals in exist a set of RTS player competencies an expert must mas- an RTS environment. ter (Cunha and Chaimowicz 2010) (Coy and Mateas 2008) (Sailer, Buro, and Lanctot 2007). These competencies in- clude strategy execution, tactical maneuvering of units in battle, resource collection, production of units, buildings and upgrades, and scouting enemy positions. Each of these are intertwined with one another in that they introduce compet- ing goals and resource allocation needs. The competencies may be divided into several managers including: strategy, resource collection, production, tactics, and finally a reconnaissance manager. Each of these man- agers possess their own goals that compete for resources and execution. These managers can be integrated into a frame- work that enables these managers to interoperate as a func- tioning RTS game AI. One specific framework is known as a multi-scale agent (Weber et al. 2010). We adopt this ap- proach for our investigation. In addition to the RTS player competencies, Weber (We- ber 2012) presents three capabilities unique to expert players in RTS games: estimation, adaptation, and anticipation. Es- timation and anticipation relate to predicting an opponents Figure 1: The expert RTS agent pyramid organizes from the Copyright c 2014, Association for the Advancement of Artificial bottom up what is necessary for constructing a multi-scale Intelligence (www.aaai.org). All rights reserved. agent capable of playing an RTS game at an expert level. 105 The research problem we address in this paper is on op- based upon the three objective functions of the build or- timizing the performance of the strategic decision-making der MOP. To generate build orders, an MOEA framework process or build orders of an RTS agent. This investigation in conjunction with the RTS strategic decision simulator is includes the development and analysis of an RTS agent that utilized to score a population of randomly generated build optimizes strategy execution in order to perform at an expert order plans. To enhance the discovery of near-optimal so- human level (Blackford 2014)1; therefore, we make the as- lutions, expert solutions are injected into the population of sumption that the build orders generated by an expert player the MOEA framework utilizing CBR (Blackford 2014). This are near-optimal. The objective of our planning tool is to results in an RTS strategic planning capability that is online produce build orders that are as good as or better than expert and provides an agent with near-optimal build orders. The build orders or near-optimal. Our approach is real-time and capabilities of the strategic planner are demonstrated in the is comparably as fast as the approach presented by Churchill Spring game engine with the Balanced Annihilation RTS and Buro in (Churchill and Buro 2012) whose depth-first game. Results for the ability of our model to approximate search (DFS) technique utilizes around five seconds of plan- Starcraft build orders are presented in (Blackford 2014), but ning time to produce build orders in Starcraft. the focus of this paper is the RTS Spring Engine 2 Balanced Annihilation (BA) game. Background Build Order as Producer/Consumer Problem Players of an RTS game attempt to advance the skills and technology of their cities and armies in order to over power In formulating the mathematical model to approximate the their opponent. This technology advancement is defined in build order problem, a goal programming approach was the technology tree of an RTS game. A technology tree is adopted (Coello Coello, Lamont, and Veldhuizen 2007). a large directed decision tree that determines the execution Goal programming is defining objective functions that min- of a player’s selected strategy. It establishes an ordering of imize a distance to a goal. Distance is defined loosely as any actions a player must take in order to build and advance their quantitative metric that can be utilized to measure equiva- army. This leads to the concept of build orders. lence of a feasible solution from a multi-objective problem A build order is an ordered sequence of actions a player population to a target goal. takes to execute their chosen strategy. Strategy execution is a The conjecture that the BOO problem is a planning and planning problem requiring two components: first, determin- scheduling problem with producer/consumer constraints is ing a good goal-ordering to ensure expert level execution of first addressed in (Wei and Sun 2009). The scheduling prob- a strategy with respect to resource allocations and time to ex- lem consists of two parts: Scheduling highly cumulative ecute the strategy; and second, issuing and executing actions activities and scheduling highly disjunctive activities. Both in a build order to ensure expert or near-optimal player per- pieces are required to minimize the makespan for executing formance measured in resource allocations and time to reach and completing actions to reach a desired goal state from an a goal. With respect to providing an AI agent with the ability initial state. Cumulative actions are actions that can be exe- to rationalize across this planning problem, the first compo- cuted or issued concurrently (overlap) on the same resource. nent involving goal-ordering can be determined by deriving This applies to actions requiring volumetric resources. Dis- a goal-ordering from the replays of an expert RTS player junctive scheduling consists of pairs of actions that cannot (Weber and Mateas 2009) (Ontanon et al. 2007) and then be executed concurrently on the same resource. This applies provide this to the agent via case-based reasoning (CBR). to actions requiring unary resources. It is possible for actions Resolving the second component is known in the RTS re- to require both unary and volumetric resources; however, in search community as the build order optimization (Churchill the RTS domain an action most likely requires one unary and Buro 2011) problem and is the focus of this investiga- resource and no more than two volumetric resources. For tion. example, in the popular Age of Empires (AOE) games a sol- dier requires a barracks (unary) and food or wood and gold Build Order Methodology (volumetric resources). This investigation frames the RTS strategic decision-making Cumulative Scheduling process as a build order optimization problem. The build or- der problem is modeled as a multi-objective problem (MOP) In cumulative scheduling, actions or activities are con- (Coello Coello, Lamont, and Veldhuizen 2007) with three strained by volumetric resources. The constraint is the cu- objective functions and three constraints. The mathematical mulative producer/consumer constraint (Wei and Sun 2009). model of the strategic decision-making process is then in- The cumulative constraint requires that the sum of the tegrated into an RTS game simulator capable of executing amounts of consumable resources required by a set of ac- strategic decisions common to generic RTS games including tions scheduled at a time t does not exceed the amount of the Starcraft: Broodwars and Balanced Annihilation. The simu- resource available at time t. For example, if it costs 100 food lator is written in C++ and utilizes seven XML schema files and 15 gold to produce a foot solider in an RTS game, and it to define any RTS game to be simulated (Blackford 2014). costs 200 food and 100 gold to research advanced armor for As build orders are passed into the simulator, it scores them the foot soldier, then to execute these actions concurrently 1https://code.google.com/p/online-planning-rts-agents/ 2https://www.springrts.com/ 106 requires a total of 300 food and 115 gold at the time of re- Exist Constraint questing execution of these activities.