
Why study games?

Game playing: an idealized world of hostile agents attempting to diminish one's well-being.

Game Playing

Reasons to study games:
• Modeling strategic and adversarial problems is of general interest (e.g. economic situations).
• Handling opponents introduces uncertainty and requires contingency plans.
• Problems are usually complex and very often viewed as an indicator of intelligence.

Characteristics:
• Well-formalized problems: clear description of the environment.
• Common-sense knowledge is not required.
• Rules are fixed.
• Number of nodes in the tree might be high, but memorizing the past is not needed.

Why Study Games?

Games offer:
• Intellectual Engagement
• Abstraction
• Representability
• Performance Measure

Not all games are suitable for AI research. We will restrict ourselves to 2-person board games.

Game Playing as Search

Playing a game involves searching for the best move.

Board games clearly involve notions like start state, goal state, operators, etc. We can thus usefully import problem-solving techniques that we have already met.

There are nevertheless important differences from standard search problems.

Special Characteristics of Game Playing Search

Till now we assumed the situation is not going to change while we search. However, the main differences are uncertainties introduced by:
• Presence of an opponent. One does not know what the opponent will do until he/she does it. Game playing programs must solve the contingency problem.
• Complexity. Most interesting games are simply too complex to solve by exhaustive means. Chess, for example, has an average branching factor of 35.
• Uncertainty also arises from not having the resources to compute a move which is guaranteed to be the best.

AI and game playing

Game playing (especially chess and checkers) was the first test application of AI.

It involves a different type of search problem than we have considered up to now: a solution is not a path, but simply the next move.

The best move depends on what the opponent might do (adversary search).

Two-player games: motivation

Previous heuristics and search procedures are only useful for single-player games
• no notion of turns: one or more cooperative agents
• does not take into account adversarial moves

Games are ideal to explore adversarial strategies
• well-defined, abstract rules
• most formulated as search problems
• really hard combinatorial problems -- chess!!

Two-player games

Search tree for each player remains the same
• Even levels i are moves of player A
• Odd levels i+1 are moves of player B

Each player searches for a goal (different for each) at their level

Each player evaluates the states according to their heuristic function

A's best move brings B to the worst state

A searches for its best move assuming B will also search for its best move

MinMax search

Search for A's best next move, so that no matter what B does (in particular, choosing its best move) A will be better off

At each step, evaluate the value of all descendants: take the maximum if it is A's turn, or the minimum if it is B's turn

We need the estimated values d moves ahead
• generate all nodes to level d (BFS)
• propagate Min-Max values up from the leaves

Typical case

2-person game
Players alternate moves
Zero-sum: one player's loss is the other's gain
Perfect information: both players have access to complete information about the state of the game. No information is hidden from either player.
No chance (e.g., using dice) involved
Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim, Othello
Not: Bridge, Solitaire, Backgammon, ...

How to play a game

A way to play such a game is to:
• Consider all the legal moves you can make
• Compute the new position resulting from each move
• Evaluate each resulting position and determine which is best
• Make that move
• Wait for your opponent to move and repeat

Key problems are:
• Representing the "board"
• Generating all legal next boards
• Evaluating a position

Ingredients of 2-Person Games

Players: We call them Max and Min.
Initial State: Includes board position and whose turn it is.
Operators: These correspond to legal moves.
Terminal Test: A test applied to a board position which determines whether the game is over. In chess, for example, this would be a checkmate or stalemate situation.
Utility Function: A function which assigns a numeric value to a terminal state. For example, in chess the outcome is win (+1), lose (-1) or draw (0). Note that by convention, we always measure utility relative to Max.
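The ingredients of a 2-person game can be written down as a handful of functions. The following is a minimal sketch, assuming tic-tac-toe as the game and Max playing "X"; all names (`initial_state`, `legal_moves`, `apply_move`, `is_terminal`, `utility`) are illustrative choices, not taken from the slides.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]    # 9 cells, each "X", "O" or " "
State = Tuple[Board, str]  # (board position, player to move)

# The eight winning lines of a 3x3 board (rows, columns, diagonals).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def initial_state() -> State:
    """Initial state: empty board, and Max ("X") moves first (assumption)."""
    return (" ",) * 9, "X"


def legal_moves(state: State) -> List[int]:
    """Operators: every empty cell is a legal move."""
    board, _ = state
    return [i for i, cell in enumerate(board) if cell == " "]


def apply_move(state: State, move: int) -> State:
    """Compute the new position resulting from a move and pass the turn."""
    board, player = state
    new_board = board[:move] + (player,) + board[move + 1:]
    return new_board, ("O" if player == "X" else "X")


def winner(board: Board) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_terminal(state: State) -> bool:
    """Terminal test: somebody has won or the board is full."""
    board, _ = state
    return winner(board) is not None or " " not in board


def utility(state: State) -> int:
    """Utility relative to Max: win (+1), lose (-1) or draw (0)."""
    won_by = winner(state[0])
    return 1 if won_by == "X" else -1 if won_by == "O" else 0
```

With these five functions in place, the "how to play" loop above only needs a way of evaluating positions, which is what the rest of the slides are about.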

Normal and Game Search Problem

Normal search problem: Max searches for a sequence of moves yielding a winning position and then makes the first move in the sequence.

Game search problem: Clearly, this is not feasible in a game situation where Min's moves must be taken into consideration. Max must devise a strategy which leads to a winning position no matter what moves Min makes.

Chess as a First Choice

It provides proof that a machine can actually do something that was thought to require intelligence.
It has simple rules.
The world state is fully accessible to the program.
The computer representation can be correct in every relevant detail.

Some games

• tic-tac-toe
• checkers
• Go
• Othello
• chess
• poker
• bridge

Complexity of Searching

The presence of an opponent makes the decision problem more complicated.
Games are usually much too hard to solve.
Games penalize inefficiency very severely.

Things to Come…

• Perfect Decisions in Two-Player Games
• Imperfect Decisions
• Alpha-Beta Pruning
• Games That Include an Element of Chance

Perfect Decisions in Two-Person Games

Games as Search Problem

Some games can normally be defined in the form of a tree.

Branching factor is usually an average of the possible number of moves at each node.

This is a simple search problem: a player must search this search tree and reach a leaf node with a favorable outcome.

Two Player Game

Two players: Max and Min

Objective of both Max and Min is to optimize winnings
• Max must reach a terminal state with the highest utility
• Min must reach a terminal state with the lowest utility

Game ends when either Max or Min has reached a terminal state
• upon reaching a terminal state points may be awarded or sometimes deducted

Search Problem Revisited

Simple problem is to reach a favorable terminal state

Problem: not so simple...
• Max must reach a terminal state with as high a utility as possible regardless of Min's moves

Max must develop a strategy that determines the best possible move for each move Min makes.

Game Playing

An opponent tries to thwart your every move

1944: a search method (Minimax) was outlined that maximised your position whilst minimising your opponent's

Game Playing – Example

Nim (a simple game)
Start with a single pile of tokens
At each move the player must select a pile and divide the tokens into two non-empty, non-equal piles
The player who cannot make a move loses

Game Playing - Minimax

Starting with 7 tokens, the game is small enough that we can draw the entire game tree

The "game tree" to describe all possible games follows:

7
6-1   5-2   4-3
5-1-1   4-2-1   3-2-2   3-3-1
4-1-1-1   3-2-1-1   2-2-2-1
3-1-1-1-1   2-2-1-1-1
2-1-1-1-1-1

Game Playing - Minimax

Conventionally, in discussion of minimax, have two players "MAX" and "MIN"

The utility function is taken to be the utility for MAX

Larger values are better for MAX

Game Playing – Nim

Remember that larger values are taken to be better for MAX

Assume that we use a utility function of
• 1 = a win for MAX
• 0 = a win for MIN

We only compare values, "larger or smaller", so the actual sizes do not matter
• in other games might use {+1,0,-1} for {win,draw,lose}.

Game Playing – Minimax

Basic idea of minimax:

Player MAX is going to take the best move available

Will select the next state to be the one with the highest utility

Hence, value of a MAX node is the MAXIMUM of the values of the next possible states
• i.e. the maximum of its children in the search tree

Game Playing – Minimax

Player MIN is going to take the best move available for MIN
• i.e. the worst available for MAX

Will select the next state to be the one with the lowest utility
• recall, higher utility values are better for MAX and so worse for MIN

Hence, value of a MIN node is the MINIMUM of the values of the next possible states
• i.e. the minimum of its children in the search tree

Game Playing – Minimax Summary

A "MAX" move takes the best move for MAX, so takes the MAX utility of the children

A "MIN" move takes the best for MIN, hence the worst for MAX, so takes the MIN utility of the children

Games alternate in play between MIN and MAX

Nim game tree with minimax values (value in parentheses):

MIN   7 (1)
MAX   6-1 (1)   5-2 (1)   4-3 (1)
MIN   5-1-1 (0)   4-2-1 (1)   3-2-2 (0)   3-3-1 (1)
MAX   4-1-1-1 (0)   3-2-1-1 (1)   2-2-2-1 (0)
MIN   3-1-1-1-1 (0)   2-2-1-1-1 (1)
MAX   2-1-1-1-1-1 (0, loss for MAX)

Game Playing – Use of Minimax

The MIN node at the root has value +1

All moves by MIN lead to a state of value +1 for MAX

MIN cannot avoid losing

From the values on the tree one can read off the best moves for each player
• make sure you know how to extract these best moves ("perfect lines of play")
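The values on the Nim tree can be checked mechanically. The sketch below is one possible way to do it, assuming a state is a tuple of pile sizes sorted in descending order and that a player who cannot move loses (which is how the terminal values on the tree read). The names `successors` and `minimax` are illustrative.

```python
from functools import lru_cache
from typing import List, Tuple

State = Tuple[int, ...]  # pile sizes, sorted in descending order


def successors(state: State) -> List[State]:
    """All states reachable by splitting one pile into two unequal, non-empty piles."""
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # small runs over the smaller part, so small < pile - small always holds.
        for small in range(1, (pile - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (pile - small, small), reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def minimax(state: State, max_to_move: bool) -> int:
    """Utility for MAX: 1 = a win for MAX, 0 = a win for MIN."""
    children = successors(state)
    if not children:
        # Assumed rule: the player to move has no legal move and loses.
        return 0 if max_to_move else 1
    values = [minimax(child, not max_to_move) for child in children]
    # A MAX node takes the maximum of its children, a MIN node the minimum.
    return max(values) if max_to_move else min(values)


# MIN moves first from the single pile of 7, as on the tree.
print(minimax((7,), max_to_move=False))  # prints 1: MIN cannot avoid losing
```

Reading off a perfect line of play then amounts to repeatedly choosing a successor whose value equals the value of the current node.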

Game Playing – Bounded Minimax

For real games, search trees are much bigger and deeper than Nim

Cannot possibly evaluate the entire tree

Have to put a bound on the depth of the search

Game Playing – Bounded Minimax

The terminal states are no longer a definite win/loss
• actually they are really a definite win/draw/loss but with reasonable computer resources we cannot determine which

Have to heuristically/approximately evaluate the quality of the positions of the states

Evaluation of the utility function is expensive if it is not a clear win or loss

Game Playing – Bounded Minimax

Next Slide:

Artificial example of minimax bounded

Evaluate "terminal position" after all possible moves by MAX

(The numbers are invented, and just to illustrate the working of minimax)

MAX    A = 1
MIN    B = 1    C = -3

Utility values of "terminal" positions obtained by an evaluation function

[Figure legend: symbols for terminal position, agent, opponent]

Game Playing – Bounded Minimax

Example of minimax with bounded depth

Evaluate "terminal position" after all possible moves in the order:
1. MAX (a.k.a. "agent")
2. MIN (a.k.a. "opponent")
3. MAX

(The numbers are invented, and just to illustrate the working of minimax)

Assuming MAX plays first, complete the MIN/MAX tree

MAX    A = 1
MIN    B = 1    C = -3
MAX    D = 4    E = 1    F = 2    G = -3
       4  -5    -5  1    -7  2    -3  -8

[Figure legend: symbols for terminal position, agent, opponent]

Game Playing – Bounded Minimax

If both players play their best moves, then which "line" does the play follow?

MAX    A = 1
MIN    B = 1    C = -3
MAX    D = 4    E = 1    F = 2    G = -3
       4  -5    -5  1    -7  2    -3  -8

[Figure legend: symbols for terminal position, agent, opponent]
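The bounded example can be worked through in code. This is a minimal sketch, assuming the tree is given as nested lists whose integer leaves are the evaluation-function values from the figure (4, -5, -5, 1, -7, 2, -3, -8, left to right). The function name and the path encoding are illustrative.

```python
from typing import List, Tuple, Union

Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool) -> Tuple[int, List[int]]:
    """Return (value, path); path lists the child index chosen at each level."""
    if isinstance(node, int):
        return node, []  # "terminal" position: value from the evaluation function
    best_value, best_path = None, []
    for index, child in enumerate(node):
        value, path = minimax(child, not maximizing)
        better = (best_value is None
                  or (maximizing and value > best_value)
                  or (not maximizing and value < best_value))
        if better:
            best_value, best_path = value, [index] + path
    return best_value, best_path


# A -> (B, C); B -> (D, E); C -> (F, G); leaves in left-to-right order.
tree = [[[4, -5], [-5, 1]], [[-7, 2], [-3, -8]]]
value, path = minimax(tree, maximizing=True)
print(value, path)  # prints: 1 [0, 1, 1]
```

The path [0, 1, 1] reads as A to B, B to E, E to the leaf with value 1, which is the line that play follows when both sides choose their best moves.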

Game Playing – Perfect Play

Note that the line of perfect play leads to a terminal node with the same value as the root node

All intermediate nodes also have that same value

Essentially, this is the meaning of the value at the root node

Caveat: This only applies if the tree is not expanded further after a move because then the terminals will change and so values can change

Two-Ply Game: Revisited

MAX    3
MIN    3    2    2
       3 12 8    2 4 6    14 5 2

An Analysis

This algorithm is only good for games with a low branching factor. Why?

In general, the complexity is O(b^d) where:
b = average branching factor
d = number of plies

Is There Another Way?

Take chess, which on average has:
• 35 branches and
• usually at least 100 moves
• so game space is: 35^100

Is this a realistic game space to search?

Since time is an important factor in gaming, searching this game space is highly undesirable
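To get a feeling for the size of 35^100, the number can simply be computed. The sketch below uses the branching factor and depth quoted above. The search speed of 10^9 positions per second is an assumption made purely for illustration.

```python
import math

branching_factor = 35  # average for chess, as quoted above
plies = 100            # "usually at least 100 moves", as quoted above

game_space = branching_factor ** plies           # exact integer, b^d
digits = len(str(game_space))
exponent = plies * math.log10(branching_factor)  # game_space is about 10^exponent

# Assumption for illustration only: a machine examining 10**9 positions per second.
seconds = game_space // 10**9
years_exponent = math.log10(seconds) - math.log10(60 * 60 * 24 * 365)

print(f"35^100 has {digits} digits (about 10^{exponent:.1f})")
print(f"at 10^9 positions per second: about 10^{years_exponent:.0f} years")
```

Even under a far more generous speed assumption the exponent barely moves, which is why exhaustive search is ruled out.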

Imperfect Decisions

Why is it Imperfect?

Many games produce very large search trees.

Without knowledge of the terminal states the program is taking a guess as to which path to take.

Cutoffs must be implemented due to time restrictions, imposed either by the computer or by the game situation.

Evaluation Functions

A function that returns an estimate of the expected utility of the game from a given position.
• Given the present situation give an estimate as to the value of the next move.

The performance of a game-playing program is dependent on the quality of the evaluation functions.

How to Judge Quality

Evaluation functions must agree with the utility functions on the terminal states.

It must not take too long (trade-off between accuracy and time cost).

Should reflect actual chance of winning.

Design

Different evaluation functions must depend on the nature of the game.

Encode the quality of a position in a number that is representable within the framework of the given language.

Design a heuristic for value to the given position of any object in the game.

Different Types

Material Advantage Evaluation Functions
• Values of the pieces are judged independently of other pieces on the board. A value is returned based on the material value of the computer minus the material value of the player.

Weighted Linear Functions
• w1 f1 + w2 f2 + … + wn fn
• The w's are the weights of the pieces
• The f's are features of the particular position
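A material-advantage evaluation is the simplest weighted linear function: each feature f_i is the difference in the number of pieces of one type, and each weight w_i is the value of that piece type. The sketch below assumes conventional chess piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9); these weights and the dictionary-based board summary are assumptions for illustration.

```python
from typing import Dict

# Assumed weights w_i (conventional chess piece values, chosen for illustration).
PIECE_WEIGHTS: Dict[str, int] = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def evaluate(computer: Dict[str, int], player: Dict[str, int]) -> int:
    """Weighted linear evaluation: sum of w_i * f_i.

    f_i is the computer's count of piece type i minus the player's count,
    so the result is the computer's material value minus the player's.
    """
    return sum(weight * (computer.get(piece, 0) - player.get(piece, 0))
               for piece, weight in PIECE_WEIGHTS.items())


# Hypothetical position: the player has lost one knight, everything else is equal.
computer = {"P": 8, "N": 2, "B": 2, "R": 2, "Q": 1}
player = {"P": 8, "N": 1, "B": 2, "R": 2, "Q": 1}
print(evaluate(computer, player))  # prints 3
```

Because each piece is valued independently of the others, this function is cheap to compute but blind to positional features, which would have to be added as further f_i terms.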

Example

• Chess: Material Value. Each piece on the board is worth some value (Pawn = , Knights = 3 … etc.)
www.imsa.edu/~stendahl/comp/txt/gnuchess.txt

• Othello: Value given to # of certain color on the board and # of colors that will be converted
lglwww.epfl.ch/~wolf/java/html/Othello-desc.html

Different Types

Use probability of winning as the value to return.
• If A has a 100% chance of winning then its value to return is 1.00

Cutoff Search

Cutting off searches at a fixed depth that depends on the time available
• The deeper the search, the more information is available to the program and the more accurate the evaluation functions

Iterative deepening: when time runs out the program returns the deepest completed search.
• Is searching a node deeper better than searching more nodes?

Consequences

Evaluation function might return an incorrect value.
• If the search is cut off and the next move involves a capture then the value that is returned may be incorrect.

Horizon problem
• Moves that are pushed deeper into the search tree may result in an oversight by the evaluation function.
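Iterative deepening as described above can be sketched as a loop over increasing depth bounds that keeps the result of the last search to finish before the deadline. The code below is a minimal illustration under stated assumptions: the game is an explicit nested-list tree, and `estimate` is a deliberately crude, hypothetical evaluation function that scores every non-leaf position as 0.

```python
import time
from typing import List, Optional, Tuple, Union

Tree = Union[int, List["Tree"]]


class OutOfTime(Exception):
    """Raised inside the search once the deadline has passed."""


def estimate(node: Tree) -> int:
    """Hypothetical evaluation function: a leaf scores itself, any other node 0."""
    return node if isinstance(node, int) else 0


def search(node: Tree, depth: int, maximizing: bool, deadline: float) -> int:
    """Depth-bounded minimax that aborts when the deadline is exceeded."""
    if time.monotonic() > deadline:
        raise OutOfTime
    if isinstance(node, int) or depth == 0:
        return estimate(node)  # cutoff: apply the evaluation function
    values = [search(child, depth - 1, not maximizing, deadline) for child in node]
    return max(values) if maximizing else min(values)


def iterative_deepening(root: Tree, max_depth: int,
                        seconds: float) -> Optional[Tuple[int, int]]:
    """Return (depth, value) of the deepest search completed in time, or None."""
    deadline = time.monotonic() + seconds
    completed = None
    for depth in range(1, max_depth + 1):
        try:
            completed = (depth, search(root, depth, True, deadline))
        except OutOfTime:
            break  # discard the unfinished search, keep the last completed one
    return completed


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(iterative_deepening(tree, max_depth=2, seconds=60.0))  # prints (2, 3)
```

The shallower searches are repeated work, but since the number of nodes grows exponentially with depth, the deepest completed search dominates the total cost.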

Improvements to Cutoff

Evaluation functions should only be applied to quiescent positions.
• Quiescent Position: a position that is unlikely to exhibit wild swings in value in the near future.

Non-quiescent positions should be expanded until one is reached. This extra search is called a quiescence search.
• Will provide more information about that one node in the search tree but may result in the loss of information about the other nodes.

Alpha-Beta Pruning

Pruning

What is pruning?
• The process of eliminating a branch of the search tree from consideration without examining it.

Why prune?
• To eliminate searching nodes that are potentially unreachable.
• To speed up the search process.

Alpha-Beta Pruning

A particular technique to find the optimal solution according to a limited depth search using evaluation functions.

Returns the same choice as minimax cutoff decisions, but examines fewer nodes.

Gets its name from the two variables that are passed along during the search which restrict the set of possible solutions.

Definitions

Alpha: the value of the best choice (highest value) so far along the path for MAX.

Beta: the value of the best choice (lowest value) so far along the path for MIN.

Implementation

Set root node alpha to negative infinity and beta to positive infinity.

Search depth first, propagating alpha and beta values down to all nodes visited until reaching desired depth.

Apply evaluation function to get the utility of this node.

If parent of this node is a MAX node, and the utility calculated is greater than parent's current alpha value, replace this alpha value with this utility.

Implementation (Cont'd)

If parent of this node is a MIN node, and the utility calculated is less than parent's current beta value, replace this beta value with this utility.

Based on these updated values, it compares the alpha and beta values of this parent node to determine whether to look at any more children or to backtrack up the tree.

Continue the depth first search in this way until all potentially better paths have been evaluated.

Example: Depth = 4

[Figure: alpha-beta search on a depth-4 tree with alternating MAX and MIN levels; each node is annotated with its α and β values, starting from α = −∞, β = +∞ at the root.]
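The implementation steps can be sketched as a recursive function. This is a minimal illustration, assuming an explicit nested-list tree with integer leaves. It is run on the two-ply tree with leaves 3 12 8, 2 4 6, 14 5 2 from the "Two-Ply Game: Revisited" slide. The `visited` list is an addition for illustration, used only to show which leaves were actually evaluated.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, maximizing: bool,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[List[int]] = None) -> float:
    """Minimax value of node, skipping branches that cannot affect the result."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)  # record every leaf that is evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)  # best choice so far for MAX
            if alpha >= beta:
                break                  # MIN above will never allow this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)        # best choice so far for MIN
        if alpha >= beta:
            break                      # MAX above already has something better
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited: List[int] = []
print(alphabeta(tree, True, visited=visited), visited)
# prints: 3 [3, 12, 8, 2, 14, 5, 2]
```

The result equals the plain minimax value, but the leaves 4 and 6 are never evaluated: once the second MIN node has seen 2, it cannot beat the 3 that MAX already has.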

Effectiveness

The effectiveness depends on the order in which the search progresses.

If b is the branching factor and d is the depth of the search, the best case for alpha-beta is O(b^(d/2)), compared to minimax, which is always O(b^d).

Problems

If there is only one legal move, this algorithm will still generate an entire search tree.

Designed to identify a "best" move, not to differentiate between other moves.

Overlooks moves that forfeit something early for a better position later.

Evaluation of utility usually not exact.

Assumes opponent will always choose the best possible move.

Games that Include an Element of Chance

Chance Nodes

Many games have unpredictable outcomes caused by such actions as throwing dice or randomizing a condition.

Such games must include chance nodes in addition to MIN and MAX nodes.

For each node, instead of a definite utility or evaluation, we can only calculate an expected value.

Inclusion of Chance Nodes

Calculating Expected Value

For the terminal nodes, we apply the utility function.

We can calculate the expected value of a MAX move by applying an expectimax value to each chance node at the same ply.

After calculating the expected value of a chance node, we can apply the normal minimax-value formula.

Expectimax Function

Provided we are at a chance node C preceding MAX's turn, we can calculate the expected utility for MAX as follows:

• Let d_i be a possible dice roll or random event, where P(d_i) represents the probability of that event occurring.
• If we let S(C, d_i) denote the set of legal positions generated by dice roll d_i, we have the expectimax function defined as follows:

  expectimax(C) = Σ_i P(d_i) · max_{s ∈ S(C, d_i)} utility(s)

• Where the function max_{s ∈ S(C, d_i)} will return the move MAX will pick out of all the choices available.
• Alternately, you can generate an expectimin function for chance nodes preceding MIN's move.
• Together they are called the expectiminimax function.

Application to an Example

[Figure: game tree with chance nodes; each chance node has two branches with probabilities .6 and .4. Values per level, left to right:]

MAX      (root)
Chance   3.56
MIN      3.0   4.4
Chance   3.6   3.0   5.8   4.4
MAX      4   3   3   3   5   7   6   2
Leaves   2 4 3 2 3 1 1 2 3 5 2 7 5 6 1 2
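The backed-up values in the example can be reproduced with a short recursive function. This is a sketch under stated assumptions: nodes are encoded as `(kind, children)` tuples (an illustrative encoding), the tree starts at the chance node below the root, and the eight MAX-level values 4, 3, 3, 3, 5, 7, 6, 2 are used as leaves.

```python
from typing import Any


def expectiminimax(node: Any) -> float:
    """Value of a node: max, min, or probability-weighted average of its children."""
    if isinstance(node, (int, float)):
        return float(node)  # terminal node: apply the utility directly
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        # Expected value: sum of P(d_i) times the value reached under d_i.
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind!r}")


def chance(first: Any, second: Any) -> Any:
    """Chance node with the .6 / .4 probabilities used in the example."""
    return ("chance", [(0.6, first), (0.4, second)])


tree = chance(
    ("min", [chance(4, 3), chance(3, 3)]),  # chance values 3.6 and 3.0 -> MIN 3.0
    ("min", [chance(5, 7), chance(6, 2)]),  # chance values 5.8 and 4.4 -> MIN 4.4
)
print(round(expectiminimax(tree), 2))  # prints 3.56
```

The only change from plain minimax is the third case: a chance node neither maximizes nor minimizes but averages over its outcomes.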

Chance Nodes: Differences

For minimax, any order-preserving transformation of the leaf values does not affect the decision.

However, when chance nodes are introduced, only positive linear transformations will keep the same decision.

Complexity of Expectiminimax

Where minimax does O(b^m), expectiminimax will take O(b^m n^m), where n is the number of distinct rolls.

The extra cost makes it unrealistic to look too far ahead.

How much this affects our ability to look ahead depends on how many random events (or possible dice rolls) can occur.
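The claim that only positive linear transformations of the leaf values preserve the decision once chance nodes are present can be checked on a small case. The moves, probabilities and leaf values below are invented for illustration.

```python
from typing import Dict, List, Tuple

Outcomes = List[Tuple[float, float]]  # (probability, leaf value) pairs


def expected(outcomes: Outcomes) -> float:
    return sum(p * value for p, value in outcomes)


def best_move(moves: Dict[str, Outcomes]) -> str:
    """The move whose chance node has the highest expected value."""
    return max(moves, key=lambda name: expected(moves[name]))


# Invented example: two moves for MAX, each leading to a chance node.
moves = {"a1": [(0.9, 2), (0.1, 3)],   # expected value 2.1
         "a2": [(0.9, 1), (0.1, 4)]}   # expected value 1.3

# Order-preserving but non-linear rescaling of the leaf values.
rescale = {1: 1, 2: 20, 3: 30, 4: 400}
nonlinear = {m: [(p, rescale[v]) for p, v in o] for m, o in moves.items()}

# Positive linear transformation: v -> 10 * v + 5.
linear = {m: [(p, 10 * v + 5) for p, v in o] for m, o in moves.items()}

print(best_move(moves), best_move(nonlinear), best_move(linear))
# prints: a1 a2 a1
```

The ordering of the leaves is identical in all three versions, yet the non-linear rescaling flips the decision, so with chance nodes the evaluation function has to be meaningful in magnitude, not just in rank.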

Wrapping Things Up

Things to Consider

Calculating optimal decisions is intractable in most cases, thus all algorithms must make some assumptions and approximations.

The standard approach based on minimax, evaluation functions, and alpha-beta pruning is just one way of doing things.

These search techniques do not reflect how humans actually play games.

Demonstrating A Problem

Given this two-ply tree, the minimax algorithm will select the right-most branch, since it forces a minimum value of no less than 100.

This relies on the assumption that 100, 101, and 102 are in fact actually better than 99.

Summary

We defined the game in terms of a search.

Discussion of two-player games given perfect information (minimax).

Using cut-off to meet time constraints.

Optimizations using alpha-beta pruning to arrive at the same conclusion as minimax would have.

Complexity of adding chance to the decision tree.
