Generating and Solving Imperfect Information Games

Daphne Koller
University of California
Berkeley, CA 94720

Avi Pfeffer
University of California
Berkeley, CA 94720

Abstract

Work on game playing in AI has typically ignored games of imperfect information such as poker. In this paper, we present a framework for dealing with such games. We point out several important issues that arise only in the context of imperfect information games, particularly the insufficiency of a simple game tree model to represent the players' information state and the need for randomization in the players' optimal strategies. We describe Gala, an implemented system that provides the user with a very natural and expressive language for describing games. From a game description, Gala creates an augmented game tree with information sets which can be used by various algorithms in order to find optimal strategies for that game. In particular, Gala implements the first practical algorithm for finding optimal randomized strategies in two-player imperfect information competitive games [Koller et al., 1994]. The running time of this algorithm is polynomial in the size of the game tree, whereas previous algorithms were exponential. We present experimental results showing that this algorithm is also efficient in practice and can therefore form the basis for a game playing system.

1 Introduction

The idea of getting a computer to play a game has been around since the earliest days of computing. The fundamental idea is as follows: When it is the computer's turn to move, it creates some part of the game tree starting at the current position, evaluates the 'leaves' of this partial tree using a heuristic evaluation function, and then does a minimax search of this tree to determine the optimal move at the root. This same simple idea is still the core of most game-playing programs.

This paradigm has been successfully applied to a large class of games, in particular chess, checkers, othello, backgammon, and go [Russell and Norvig, 1994, Ch. 5]. There have been far fewer successful programs that play games such as poker or bridge. We claim that this is not an accident. These games fall into two fundamentally different classes, and the techniques that apply to one do not usually apply to the other.

The essential difference lies in the information that is available to the players. In games such as chess or even backgammon, the current state of the game is fully accessible to both players. The only uncertainty is about future moves. In games such as poker, the players have imperfect information: they have only partial knowledge about the current state of the game. This can result in complex chains of reasoning such as: "Since I have two aces showing, but she raised, then she is either bluffing or she has a good hand; but then if I raise a lot, she may realize that I have at least a third ace, so she might fold; so maybe I should underbid, but ..." It should be fairly obvious that the standard techniques are inadequate for solving such games: no variant of the minimax algorithm duplicates the type of complex reasoning we just described.

In game theory [von Neumann and Morgenstern, 1947], on the other hand, virtually all of the work has focused on games with imperfect information. Game theory is mostly intended to deal with games derived from "real life," and particularly from economic applications. In real life one rarely has perfect information. The insights developed by game theorists for such games also apply to the imperfect information games encountered in AI applications.

It is well-known in game theory that the notion of a strategy is necessarily different for games with imperfect information. In perfect information games, the optimal move for each player is clearly defined: at every stage there is a "right" move that is at least as good as any other move. But in imperfect information games, the situation is not as straightforward. In the simple game of "scissors-paper-stone," any deterministic strategy is a losing one as soon as it is revealed to the other players. Intuitively, in games where there is an information gap, it is usually to my advantage to keep my opponent in the dark. The only way to do that is by using randomized strategies. Once randomized strategies are allowed, the existence of "optimal strategies" in imperfect information games can be proved. In particular, this means that there exists an optimal randomized strategy for poker, in much the same way as there exists an optimal deterministic strategy for chess. Kuhn [1950] has shown for a simplified poker game that the optimal strategy does, indeed, use randomization.

The optimality of a strategy has two consequences: the player cannot do better than this strategy if playing against a good opponent, and furthermore the player does not do worse even if his strategy is revealed to his opponent, i.e., the opponent gains no advantage from figuring out the player's strategy. This last feature is particularly important in the context of game-playing programs, since they are vulnerable to this form of attack: sometimes the code is accessible, and in general, since they always play the same way, their strategy can be deduced by intensive testing. Given these important benefits of randomized strategies in imperfect information games, it is somewhat surprising that none of the AI papers that deal with these games (e.g., [Blair et al., 1993; Gordon, 1993; Smith and Nau, 1993]) utilize such strategies.

In this work, we attempt to solve the computational problem associated with imperfect information games: Given a concise description of a game, compute optimal strategies for that game. Two issues in particular must be addressed. First, how do we specify imperfect information games? Describing the dynamics of the players' information states in a concise fashion is a nontrivial knowledge representation task. Second, given a game tree with the appropriate structure, how do we find optimal strategies for it?

We present an implemented system, called Gala, that addresses both these computational issues. Gala consists of four components. The first is a knowledge representation language that allows a clear and concise specification of imperfect information games. As our examples show, the description of a game in Gala is very similar to, and not much longer than, a natural language description of the rules of the game. The second component of the system generates game trees from a game description in the language. These game trees are augmented with information sets, a standard concept from game theory that captures the information states of the players.

The third component of the system addresses the issue of finding good strategies for such games. Obviously, the standard minimax-type algorithms cannot produce randomized strategies. The game theoretic paradigm for solving games is based on taking the entire game tree, and transforming it into a matrix (called the normal or strategic form of the game). Various techniques, such as linear programming, can then be applied to this matrix in order to construct optimal strategies. Unfortunately, this matrix is typically exponential in the size of the game tree, making the entire approach impractical for most games.

In recent work, Koller, Megiddo, and von Stengel [1994] present an alternative approach to dealing with imperfect in- [...] playing system for imperfect information games.

2 Some basic game theory

Game theory is the strategic analysis of interactive situations. Several aspects of a situation are modeled explicitly: the players involved, the alternative actions that can be taken by each player at various times, the dynamics of the situation, the information available to players, and the outcomes at the end. Given such a model, game theory provides the tools to formally analyze the strategic interaction and recommend 'rational' strategies to the players.

The standard representation of a game in computer science is a tree, in which each node is a possible state of the game, and each edge is an action available to a player that takes the game to a new state. At each node there is a single player whose turn it is to choose an action. The set of edges leading out of a node are the choices available to that player. The player may be chance or 'nature', in which case the edges represent random events. The leaves of the tree specify a payoff for each player.

This representation is inadequate for games with imperfect information, because it does not specify the information states of the players. A player cannot distinguish between states of the game in which she has the same information. Thus, any decision taken by the player must be the same at all such nodes. To encode this constraint, the game tree is augmented with information sets. An information set contains a set of nodes that are indistinguishable to a player at the time she has to make a decision.

Figure 1 presents part of the game tree for a simplified variant of poker described by Kuhn [1950]. The game has two players and a deck containing the three cards 1, 2, and 3. Each player antes one dollar and is dealt one card. The figure shows the part of the game tree corresponding to the deals (2, 1), (2, 3), and (1, 3). The game has three rounds. In the first round, the first player can either bet an additional dollar or pass. After hearing the first player's bet, the second player decides whether to bet or pass.
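
The notion of an information set can be made concrete with a small sketch of the three-card game described above. The Python below is an illustration only: it is not Gala code, and the names `deals`, `info_set_key`, and `second_player_info_sets` are hypothetical. It enumerates the second player's decision nodes after the first player's move and groups them by what the second player can observe, which is assumed here to be her own card and the first player's action.

```python
from collections import defaultdict
from itertools import permutations

CARDS = (1, 2, 3)
FIRST_ROUND_ACTIONS = ("bet", "pass")


def deals():
    """Return all ordered deals (player 1's card, player 2's card)."""
    return list(permutations(CARDS, 2))


def info_set_key(own_card, history):
    """What the player to move observes: her own card and the public actions.

    This key is an assumption of the sketch, not a definition from the paper.
    """
    return (own_card, history)


def second_player_info_sets():
    """Group player 2's decision nodes into information sets.

    A node is the full game state (deal, action history). Two nodes fall in
    the same information set when player 2 cannot tell them apart.
    """
    sets = defaultdict(list)
    for c1, c2 in deals():
        for a1 in FIRST_ROUND_ACTIONS:
            history = (a1,)
            node = ((c1, c2), history)
            sets[info_set_key(c2, history)].append(node)
    return dict(sets)


if __name__ == "__main__":
    for key, nodes in sorted(second_player_info_sets().items()):
        print(key, "->", nodes)
```

Under these assumptions the twelve decision nodes collapse into six information sets of two nodes each. For example, the nodes reached by the deals (1, 3) and (2, 3) after a bet share one set, so any strategy for the second player must prescribe the same (possibly randomized) choice at both.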
