Automatic Construction of a Heuristic Search Function for General Game Playing
Total Page:16
File Type:pdf, Size:1020Kb
Automatic Construction of a Heuristic Search Function for General Game Playing Stephan Schiffel and Michael Thielscher Department of Computer Science Dresden University of Technology {stephan.schiffel,mit}@inf.tu-dresden.de Student Paper Abstract tic search. Our focus is on techniques for constructing a search heuristics by the automated analysis of game speci- General Game Playing (GGP) is the art of design- fications. More specifically, we describe the following func- ing programs that are capable of playing previously tionalities of a complete General Game Player: unknown games of a wide variety by being told nothing but the rules of the game. This is in con- 1. Determining legal moves and their effects from a formal trast to traditional computer game players like Deep game description requires reasoning about actions. We Blue, which are designed for a particular game and use the Fluent Calculus and its Prolog-based implemen- can’t adapt automatically to modifications of the tation FLUX [Thielscher, 2005]. rules, let alone play completely different games. General Game Playing is intended to foster the de- 2. To search a game tree, we use non-uniform depth- velopment of integrated cognitive information pro- first search with iterative deepening and general pruning cessing technology. In this article we present an techniques. approach to General Game Playing using a novel 3. Games which cannot be fully searched require auto- way of automatically constructing a search heuris- matically constructing evaluation functions from formal tics from a formal game description. Our system is game specifications. We give the details of a method being tested with a wide range of different games. that uses Fuzzy Logic to determine the degree to which Most notably, it is the winner of the AAAI GGP a position satisfies the logical description of a winning Competition 2006. position. 4. Strategic, goal-oriented play requires to automatically 1 Introduction derive game-specific knowledge from the game rules. General Game Playing is concerned with the development of We present an approach to recognizing structures in systems that can play well an arbitrary game solely by being game descriptions. given the rules of the game. This raises a number of issues different from traditional research in game playing, where it All of these techniques are independent of a particular lan- is assumed that the rules of a game are known to the pro- guage for defining games. However, for the examples given grammer. Writing a player for a particular game allows to fo- in this paper we use the Game Description Language devel- cus on the design of elaborate, game-specific evaluation func- oped by Michael Genesereth and his Stanford Logic Group; tions (e.g., [Morris, 1997]) and libraries (e.g, [Schaeffer et al., the full specification of syntax and semantics can be found 2005]). But these game computers can’t adapt automatically at games.stanford.edu. We also refer to a number of to modifications of the rules, let alone play completely differ- different games whose formal rules are available on this web- ent games. site, too. Since 2005 the Stanford Logic Group holds an an- Systems able to play arbitrary, unknown games can’t be nual competition for General Game Players. The approach given game-specific knowledge. They rather need to be described in this paper has been implemented in the system endowed with high-level cognitive abilities such as general FLUXPLAYER, which has has won the second AAAI Gen- strategic thinking and abstract reasoning. This makes General eral Game Playing competition. Game Playing a good example of a challenge problem which encompasses a variety of AI research areas including knowl- 2 Theorem Proving/Reasoning edge representation and reasoning, heuristic search, planning, and learning. In this way, General Game Playing also revives Games can be formally described by giving an axiomatization some of the hopes that were initially raised for game playing of their rules. A symbolic representation of the positions and computers as a key to human-level AI [Shannon, 1950]. moves of a game is needed. For the purpose of a modular and In this paper, we present an approach to General Game compact encoding, positions need to be composed of features Playing which combines reasoning about actions with heuris- like, for example, (cell ?x ?y ?p) representing that ?p is the contents of square (?x,?y) on a chessboard.1 The where the domain-dependent feature (control ?p) moves are represented by symbols, too, like (move ?u ?v means that it’s player’s ?p turn. ?x ?y) (?u,?v) denoting to move the piece on square to In order to be able to derive legal moves, to simulate game (?x,?y) (promote ?u ?x ?p) , or denoting the pro- play, and to determine the end of a game, a general game ?p ?u motion of a pawn to piece by moving it from file on playing system needs an automated theorem prover. The first ?x the penultimate row to file on the last row. thing our player does when it receives a game description is In the following, we give a brief overview of a specific to translate the GDL representation to Prolog. This allows for Game Description Language, GDL, developed by the Stan- efficient theorem proving. More specifically, we use the Pro- ford Logic Group and used for the AAAI competitions. For log based Flux system [Thielscher, 2005] for reasoning. Flux games.stanford.edu details, we refer to . This language is a system for programmingreasoning agents and for reason- is suitable for describing finite and deterministic n-player ing about actions in dynamic domains. It is based on the Flu- games (n ≥ 1) with complete information. GDL is purely ent Calculus and uses an explicit state representation and so axiomatic, so that no prior knowledge (e.g., of geometry or called state-update axioms for calculating the successor state arithmetics) is assumed. The language is based on a small set after executing an action. Positions correspond to states in of keywords, that is, symbols which have a predefined mean- the Fluent Calculus and features of a game, i.e. atomic parts ing. A general game description consists of the following of a position, are described by fluents. For the time being, elements. the state update axioms we obtain don’t take full advantage • The players are described using the keyword (role of the efficient solution provided by the Fluent Calculus and ?p), e.g., (role white). Flux for the inferential frame problem. Improvements to the • The initial position is axiomatized using the key- translation are work in progress. word (init ?f), for example, (init (cell a 1 white rook)). 3 Search Algorithm • Legal moves are axiomatized with the help of the key- The search method used by our player is a modified iterative- words (legal ?p ?m) and (true ?f), where the deepening depth first search with two well-known enhance- latter is used to describe features of the current position. ments: transposition tables[Schaeffer, 1989] for caching the An example is given by the following implication: value of states already visited during the search; and his- (<= (legal white (promote ?u ?x ?p)) tory heuristics[Schaeffer, 1989] for reordering the moves ac- (true (cell ?u 7 white_pawn)) cording to the values found in lower-depth searches. For ... ) higher depths we apply non-uniformdepth first search, i.e. the • Position updates are axiomatized by a set of formulas depth limit is higher for moves that got a high value in a which entail all features that hold after a move, rela- lower-depth search. This method allows for higher maximal tive to the current position and the moves taken by the depth of the search, especially with big branching factors. It players. The axioms use the keywords (next ?f) and seems to increase the probability that the search reaches ter- (does ?p ?m), e.g., minal states earlier in the match because all games considered end after a finite number of steps. Thus non-uniform depth (<= (next (cell ?x ?y ?p)) first search helps to reduce the horizon problem[Russell and (does ?player (move ?u ?v ?x ?y)) Norvig, 2003]. The horizon problem arises often towards the (true (cell ?u ?v ?p))) end of a match and particularly in games ending after a lim- • The end of a game is axiomatized using the keyword ited number of steps. In those games the heuristic evaluation terminal, for example, function can return a good value but the goal is in fact not reachable anymore because there are not enough steps left. A (<= terminal checkmate) higher search depth for the moves considered best at the mo- (<= terminal stalemate) ment helps to avoid bad terminal states in that case. There is where checkmate and stalemate are auxiliary, another rational behind non-uniform depth-limited search: It domain-dependentpredicates which are defined in terms helps filling the transposition table with states we will look of the current position, that is, with the help of predicate at again and therefore speeds up search in the future. If we true. search the state space behind a move we do not take, it is • Finally, the result of a game is described using the key- unlikely that we will encounter these states again in the next word (goal ?p ?v), e.g., steps of the match. Depending on the type of the game (single-player vs. (<= (goal white 100) multi-player, zero-sum game vs. non-zero-sum game) we checkmate (true (control black))) use additional pruning techniques e.g., alpha-beta-pruning (<= (goal black 0) for two-player games. Our player also decides between checkmate (true (control black))) turn-taking and simultaneous-moves games.