Zero-Sum Two Person Games

T.E.S. Raghavan
Department of Mathematics, Statistics and Computer Science, University of Illinois, Chicago, USA

Article Outline

Introduction
Games with Perfect Information
Mixed Strategy and Minimax Theorem
Behavior Strategies in Games with Perfect Recall
Efficient Computation of Behavior Strategies
General Minimax Theorems
Applications of Infinite Games
Epilogue
Acknowledgment
Bibliography

Introduction

Conflicts are an inevitable part of human existence. This is a consequence of the competitive stances of greed and the scarcity of resources, which are rarely balanced without open conflict. Epic poems of the Greek, Roman, and Indian civilizations which document wars between nation-states or clans reinforce the historical legitimacy of this statement. It can be deduced that domination is the recurring theme in human conflicts. In a primitive sense this is historically observed in the domination of men over women across cultures, while on a more refined level it can be observed in the imperialistic ambitions of nation-state actors. In modern times, a new source of conflict has emerged on an international scale in the form of economic competition between multinational corporations.

While conflicts will continue to be a perennial part of human existence, the real question at hand is how to formalize such conflicts mathematically in order to have a grip on potential solutions. We can use mock conflicts in the form of parlor games to understand and evaluate solutions for real conflicts. Conflicts are unresolvable when the participants have no say in the course of action. For example, one can lose interest in a parlor game whose entire course of action is dictated by chance. Examples of such games are Chutes and Ladders, Trade, Trouble, etc. Quite a few parlor games combine tactical decisions with chance moves. The game Le Her and the game of Parcheesi are typical examples. An outstanding example in this category is the game of backgammon, a remarkably deep game. In chess, the player who moves first is usually determined by a coin toss, but the rest of the game is determined entirely by the decisions of the two players. In such games, players make strategic decisions and attempt to gain an advantage over their opponents.

A game played by two rational players is called zero-sum if one player's gain is the other player's loss. Chess, Checkers, Gin Rummy, Two-finger Morra, and Tic-Tac-Toe are all examples of zero-sum two-person games. Business competition between two major airlines, two major publishers, or two major automobile manufacturers can be modeled as a zero-sum two-person game (even if the outcome is not precisely zero-sum). Zero-sum games can also be used to construct Nash equilibria in many dynamic non-zero-sum games [64].

Games with Perfect Information

Emptying a Box

Example 1 A box contains 15 pebbles. Players I and II remove between one and four pebbles from the box in alternating turns. Player I goes first, and the game ends when all pebbles have been removed. The player who empties the box on his turn is the winner, and he receives $1 from his opponent.

The players can decide in advance how many pebbles to remove in each of their turns. Suppose a player finds x pebbles in the box when it is his turn. He can decide to remove 1, 2, 3, or at most 4 pebbles. Thus a strategy for a player is any function f whose domain is X = {1, 2, ..., 15} and range is R = {1, 2, 3, 4} such that f(x) <= min(x, 4). Given strategies f, g for Players I and II respectively, the game evolves by executing the strategies decided in advance. For example, suppose

  f(x) = 2 if x is even, 1 if x is odd ;
  g(x) = 3 if x >= 3, x otherwise .

The alternate depletions lead to the following scenario:

  move by:   I   II   I   II   I   II   I   II
  removes:   1    3   1    3   1    3   1    2
  leaving:  14   11  10    7   6    3   2    0

In this case the winner is Player II. Actually, in his first move Player II made a bad move by removing 3 out of 14. Player I could have exploited this, but he did not! Though he made a good second move, he reverted to his naive strategy and made a bad third move. The question is: Can Player II ensure victory for himself by intelligently choosing a suitable strategy? Indeed, Player II can win the game with any strategy g* satisfying the conditions

  g*(x) = 1 if x - 1 is a multiple of 5
  g*(x) = 2 if x - 2 is a multiple of 5
  g*(x) = 3 if x - 3 is a multiple of 5
  g*(x) = 4 if x - 4 is a multiple of 5 .

Since the game starts with 15 pebbles, Player I must leave either 14, 13, 12, or 11 pebbles. Then Player II can in his turn remove 1, 2, 3, or 4 pebbles so that the number of pebbles Player I finds at the beginning of his turn is a multiple of 5. Thus Player II can leave the box empty in the last round and win the game.

Many other combinatorial games could be studied for optimal strategic behavior. We give one more example of a combinatorial game, called the game of Nim [13].

Nim Game

Example 2 Three baskets contain 10, 11, and 16 oranges respectively. In alternating turns, Players I and II choose a non-empty basket and remove at least one orange from it. The player may remove as many oranges as he wishes from the chosen basket, up to the number the basket contains. The game ends when the last orange is removed from the last non-empty basket. The player who takes the last orange is the winner.

In this game, as in the previous example, at any stage the players are fully aware of what has happened so far and what moves have been made. The full history and the state of the game at any instance are known to both players. Such a game is called a game with perfect information. How to plan future moves to one's advantage is not at all clear in this case. Bouton [13] proposed an ingenious solution to this problem which predates the development of formal game theory. His solution hinges on the binary representation of any number and the inequality 1 + 2 + 4 + ... + 2^n < 2^(n+1). The numbers 10, 11, 16 have the binary representations

  Number   Binary representation
  10       01010
  11       01011
  16       10000
  Column totals (in base 10 digits): 12021 .

Bouton made the following key observations:

1. If at least one column total is an odd number, then the player who is about to make a move can choose one basket and, by removing a suitable number of oranges, leave all column totals even.
2. If at least one basket is nonempty and all column totals are even, then the player who has to make a move will end up leaving an odd column total.

By looking for the first odd column total from the left, we notice that the basket with 16 oranges is the right choice for Player I. He can remove the leftmost 1 in the binary expansion of 16 and change each of the binary digits to its right to 0 or 1 as needed. The key observation is that the new number is strictly less than the original number, so in Player I's move at least one orange will be removed from a basket. Furthermore, the new column totals can all be made even: if an original column total is even we leave the corresponding digit as it is, and if an original column total is odd, we make it even by flipping the corresponding digit (a 1 to a 0 or a 0 to a 1) of the number for the basket with 16 oranges. Here the new binary expansion corresponds to removing all but 1 orange from basket 3. We have

  10 = 1010
  11 = 1011
   1 = 0001
  Column totals in base 10: 2022 .

In the next move, no matter what a player does, he has to flip at least one binary digit, so the column total in some column will become odd, and the move after that is for Player I. Thus Player I will be the first to empty the baskets and win the game.

For the game of Nim we found a constructive and explicit strategy for the winner regardless of any action by the opponent. Sometimes one may be able to assert who should be the winner without knowing any winning strategy for the player!

Definition 3 A zero-sum two person game has perfect information if, at each move, both players know the complete history so far.

There are many variations of Nim and other combinatorial games like Chess and Go that exploit the combinatorial structure of the game, or its end games, to develop winning strategies. The classic monograph on combinatorial game theory is Winning Ways for your Mathematical Plays by Berlekamp, Conway, and Guy [8], whose mathematical foundations were provided by Conway's earlier book On Numbers and Games. Such games are characterized by sequential moves by two players and a win-or-lose outcome. Since the entire history of past moves is common knowledge, the main thrust is in developing winning strategies for such games.
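Both examples above can be checked mechanically. The sketch below (ours, not part of the article) computes the losing positions of Example 1 by backward induction and Bouton's winning move for Example 2:

```python
from functools import lru_cache, reduce
from operator import xor

@lru_cache(maxsize=None)
def mover_wins(x):
    """Example 1: can the player about to move force a win with x pebbles left?
    Taking the last pebble wins; a position is winning iff some removal of
    1..min(x, 4) pebbles leaves the opponent in a losing position."""
    return any(not mover_wins(x - k) for k in range(1, min(x, 4) + 1))

def nim_move(heaps):
    """Example 2 (Bouton): a move leaving every binary column total even,
    i.e. zero nim-sum, or None if the mover is already in a losing position."""
    s = reduce(xor, heaps)
    if s == 0:
        return None
    for i, h in enumerate(heaps):
        if h ^ s < h:            # flip exactly the digits in the odd columns
            return i, h ^ s      # (basket index, new basket size)

print([x for x in range(1, 16) if not mover_wins(x)])  # -> [5, 10, 15]
print(nim_move([10, 11, 16]))                          # -> (2, 1)
```

The first line confirms that the multiples of 5 are exactly the positions lost for the player to move (so Player I, facing 15, loses against g*); the second reproduces the move "remove all but 1 orange from basket 3".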

Definition 4 A zero-sum two person game is called a win-lose game if there are no chance moves and the final outcome is either Player I wins or loses (Player II wins) the game. (In other words, there is no way for the game to end in a tie.)

The following is a fundamental theorem of Zermelo [70].

Theorem 5 Any zero-sum two person perfect information win-lose game with finitely many moves and finitely many choices in each move has a winner with an optimal winning strategy.

Proof Let Player I make the first move. Then, depending on the choices available, the game evolves to a new set of subgames which in their own right are also win-lose games of perfect information. Among these subgames, the one with the longest play will have fewer moves than the original game. By an induction on the length of the longest play, we can find a winner with a winning strategy, one for each subgame. Each player can develop good strategies for the original game as follows. Suppose the subgames are G_1, G_2, ..., G_k. Now among these subgames, let G_s be a game where Player I can ensure a victory for himself, no matter what Player II does in the subgame. In this case, Player I can determine at the very beginning the right choice of action which leads to the subgame G_s. A good strategy for Player I is simply the choice s in the first move followed by his good strategy in the subgame G_s. Player II's strategy for the original game is simply a k-tuple of strategies, one for each subgame. Player II must be ready to use an optimal strategy for the subgame G_r in case the first move of Player I leads to playing G_r, which is favorable to Player II. Suppose no subgame has a winning strategy for Player I. Then Player II will be the winner in each subgame. To achieve this, Player II must use his winning strategy in each subgame they are led to. Such a k-tuple of winning strategies, one for each subgame, is a winning strategy for Player II for the original game.

The Game of Hex

An interesting win-lose game was made popular by John Nash in the late forties among Princeton graduate students. While the original game is aesthetically pleasing with its hexagonal tiles forming a 10 x 10 rhombus, it is more convenient to use the following equivalent formulation for its mathematical simplicity. The version below is extendable to multi-person games and is useful for developing important algorithms [28].

Let B_n be a square board consisting of lattice points {(i, j): 1 <= i <= n, 1 <= j <= n}. The game involves occupation of unoccupied vertices by Players I and II. The board is enlarged with a frame on all sides. The frame F consists of the lattice points F = {(i, j): 0 <= i <= n + 1, 0 <= j <= n + 1, where either i = 0 or n + 1, or j = 0 or n + 1}. The frame on the west side W = {(i, j): i = 0} intersected with F and the frame on the east side E = {(i, j): i = n + 1} intersected with F are reserved for Player I. Similarly, the frame on the south side S = {(i, j): 0 < i < n + 1, j = 0} intersected with F and the frame on the north side N = {(i, j): 0 < i < n + 1, j = n + 1} intersected with F are reserved for Player II. Two lattice points P = (x1, y1), Q = (x2, y2) are called adjacent vertices iff either x1 <= x2, y1 <= y2 or x1 >= x2, y1 >= y2, and max(|x1 - x2|, |y1 - y2|) = 1. For example, the lattice points (4, 10) and (5, 11) are adjacent while (4, 10) and (5, 9) are not adjacent. Six vertices are adjacent to any interior lattice point of the Hex board B_n, while lattice points on the frame will have fewer than six adjacent vertices.

The game is played as follows: Players I and II, in alternate turns, choose a vertex from the available set of unoccupied vertices. The aim of Player I is to occupy a bridge of adjacent vertices that links a vertex on the west boundary with a vertex on the east boundary. Player II has a similar objective, to connect the north and south boundaries with a bridge.

Theorem 6 The game of Hex can never end in a draw. For any T in B_n occupied by Player I and the complement T^c occupied by Player II, either T contains a winning bridge for Player I or T^c contains a winning bridge for Player II. Further, only one of them can have a winning bridge.

Proof We label any vertex with 1 or 2 depending on who (Player I or Player II) occupies the vertex. Consider triangles formed by vertices that are mutually adjacent to each other. Two such triangles are called mates if they share a common side. Either all 3 vertices of a triangle are occupied by one player, or two vertices by one player and the third by the other player. For example, if P = (x1, y1), Q = (x2, y2), R = (x3, y3) are adjacent to each other, and if P, Q, R are occupied by, say, I, II, and I, they get the labels 1, 2, and 1 respectively. The triangle then has exactly 2 sides (PQ and QR) with vertices labeled 1 and 2. The algorithm described below involves entering such a triangle via one side with vertex labels 1 and 2 and exiting via the other side with vertex labels 1 and 2.

Suppose we start at the southwest corner triangle D_0 (in Fig. 1) with vertex A = (0, 0) occupied by Player I (labeled 1), B = (1, 0) occupied by Player II (labeled 2), and suppose C = (1, 1) is occupied by Player I (labeled 1). Since we want to stay inside the framed Hex board, the only way to exit D_0 via a side with vertices labeled 1 and 2 is to exit via BC. We move to the unique mate triangle D_1 which shares the common side BC, which

Zero-Sum Two Person Games, Figure 1 Hex path via mate triangles

has vertex labels 1 and 2. The mate triangle D_1 has vertices (1, 0), (1, 1), and (1, 0) - (0, 0) + (1, 1) = (2, 1). Suppose D = (2, 1) is labeled 2; then we exit via the side CD to the mate triangle with vertices C, D, and E = (1, 1) - (1, 0) + (2, 1) = (2, 2). Each time, we label the new vertex with its player's label, drop the other vertex of that same player from the current triangle, and move into the new mate triangle. In each iteration there is exactly one new mate triangle to move into. Since in the initial step we had a unique mate triangle to move into from D_0, there is no way for the algorithm to reenter a mate triangle visited earlier. This process must terminate at a vertex on the north or east boundary. One side of these triangles will all have the same label, forming a bridge which joins the appropriate boundaries and forms a winning path. The winning player's bridge will obstruct the bridge the losing player attempted to complete.

The game of Hex and its winning strategy is a powerful tool in developing algorithms for computing approximate fixed points. Hex is an example of a game where we do know that the first player can win, but we don't know how (for a sufficiently large board).

Approximate Fixed Points

Let I^2 be the unit square 0 <= x, y <= 1. Given any continuous function f = (f1, f2): I^2 -> I^2, Brouwer's fixed point theorem asserts the existence of a point (x, y) such that f(x, y) = (x, y). Our Hex path building algorithm, due to Gale [28], gives a constructive approach to locating an approximate fixed point.

Given eps > 0, by uniform continuity we can find a delta >= 1/n > 0 such that if (i, j) and (i', j') are adjacent vertices of a Hex board B_n, then

  |f1(i/n, j/n) - f1(i'/n, j'/n)| <= eps ,
  |f2(i/n, j/n) - f2(i'/n, j'/n)| <= eps .   (1)

Consider the 4 sets:

  H+ = {(i, j) in B_n : f1(i/n, j/n) - i/n > eps} ,   (2)
  H- = {(i, j) in B_n : f1(i/n, j/n) - i/n < -eps} ,   (3)
  V+ = {(i, j) in B_n : f2(i/n, j/n) - j/n > eps} ,   (4)
  V- = {(i, j) in B_n : f2(i/n, j/n) - j/n < -eps} .   (5)

Intuitively, the points in H+ are moved by f further to the right (with increased x coordinate) by more than eps. Points in V- are moved by f further down (with decreased y coordinate) by more than eps. We claim that these sets cannot cover all the vertices of the Hex board. If it were so, then we would have a winner, say Player I, with a winning path linking the east and west boundary frames. Since points of the east boundary have the highest x coordinate, they cannot be moved further to the right. Thus vertices in H+ are disjoint from the east boundary, and similarly vertices in H- are disjoint from the west boundary. The path must therefore contain vertices from both H+ and H-. However, for any (i, j) in H+ and (i', j') in H- we have

  f1(i/n, j/n) - i/n > eps ,
  i'/n - f1(i'/n, j'/n) > eps .

Summing the above two inequalities gives f1(i/n, j/n) - f1(i'/n, j'/n) + (i' - i)/n > 2 eps, and using (1) we get

  (i' - i)/n > eps .

Thus the points (i, j) and (i', j') cannot be adjacent, and this contradicts that they are part of a connected path. We have a contradiction.

Remark 7 The algorithm attempts to build a winning path and advances by entering mate triangles. Since the sets H+, H-, V+, V- cannot cover the Hex board, the partial bridge building must fail at some point, giving a vertex that is outside the union of the sets H+, H-, V+, V-. Hence we reach an approximate fixed point while building the bridge.

An Application of the Algorithm

Consider the continuous map of the unit square into itself given by:

  f1(x, y) = [x + max(-2 + 2x + 6y - 6xy, 0)] / [1 + max(-2 + 2x + 6y - 6xy, 0) + max(2x - 6xy, 0)] ,
  f2(x, y) = [y + max(2 - 6x - 2y + 6xy, 0)] / [1 + max(2 - 6x - 2y + 6xy, 0) + max(2y - 6xy, 0)] .

With eps = .05, we can start with a grid of delta = .1 (hopefully adequate) and find an approximate fixed point. In fact, for a spacing of .1 units, the iterations according to the Hex rule passed through the points listed in Table 1, with |f1(x, y) - x| and |f2(x, y) - y| given. Thus the approximate fixed point is x = .4, y = .3.

Zero-Sum Two Person Games, Table 1 Table giving the Hex building path

  (x, y)     |f1 - x|  |f2 - y|  L
  (.0, .0)   0         .6667     1
  (.1, 0)    .0167     .5833     2
  (.1, .1)   .01228    .4667     2
  (0, .1)    .0        .53333    1
  (.1, .2)   .007      .35       2
  (0, .2)    0         .4        1
  (.1, .3)   .002      .233      2
  (0, .3)    0         .26667    1
  (.1, .4)   .238      .116      1
  (.2, .4)   .194      .088      1
  (.2, .3)   .007      .177      2
  (.3, .4)   .153      .033      1
  (.3, .3)   .017      .067      2
  (.4, .4)   .116      0         1
  (.4, .3)   .0296     0         *

Extensive Games and Normal Form Reduction

Any game as it evolves can be represented by a rooted tree where the root vertex corresponds to the initial move. Each vertex of the tree represents a particular move of a particular player. The alternatives available at any given move are identified with the edges emanating from the vertex that represents the move. If a vertex is assigned to chance, then the game associates a probability distribution with the descending edges. The terminal vertices are called plays, and they are labeled with the payoff to Player I. In zero-sum games Player II's payoff is simply the negative of the payoff to Player I. The vertices for a player are further partitioned into information sets. Information sets must satisfy the following requirements:

- The number of edges descending from any two moves within an information set is the same.
- No information set intersects the unique unicursal path from the root to any end vertex of the tree in more than one move.
- Any information set which contains a chance move is a singleton.
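Returning briefly to the application of the Hex algorithm above: the grid and tolerance used there are small enough that the result can be checked by brute force. The following sketch (ours, using the formulas for f1 and f2 as we reconstruct them from the text) scans the same delta = .1 grid for eps-approximate fixed points:

```python
def f1(x, y):
    # map of the application above, as reconstructed (signs are our reading)
    a = max(-2 + 2*x + 6*y - 6*x*y, 0.0)
    b = max(2*x - 6*x*y, 0.0)
    return (x + a) / (1.0 + a + b)

def f2(x, y):
    c = max(2 - 6*x - 2*y + 6*x*y, 0.0)
    d = max(2*y - 6*x*y, 0.0)
    return (y + c) / (1.0 + c + d)

eps = 0.05
# all grid points that f moves by at most eps in each coordinate
approx = [(i / 10, j / 10) for i in range(11) for j in range(11)
          if abs(f1(i / 10, j / 10) - i / 10) <= eps
          and abs(f2(i / 10, j / 10) - j / 10) <= eps]
print(approx)  # (0.4, 0.3), the point found by the Hex walk, is in the list
```

The Hex-guided walk visits only a short path of grid points (Table 1), whereas this naive scan examines all 121; the point (.4, .3) appears in both.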

We will use the following example to illustrate the extensive form representation:

Example 8 Player I has 3 dice in his pocket. Die 1 is a fake die with all sides numbered one. Die 2 is a fake die with all sides numbered two. Die 3 is a genuine unbiased die. He chooses one of the 3 dice secretly, tosses the die once, and announces the outcome to Player II. Knowing the outcome but not knowing the chosen die, Player II tries to guess the die that was tossed. He pays $1 to Player I if his guess is wrong. If he guesses correctly, he pays nothing to Player I.

The game is represented by the tree in Fig. 2, with the root vertex assigned to Player I. The 3 alternatives at this move are to choose the die with all sides 1, to choose the die with all sides 2, or to choose the unbiased die. The end vertices of the edges descending from the root vertex are moves for chance. The certain outcomes are 1 and 2 if the die is fake; the outcome is one of the numbers 1, ..., 6 if the die chosen is genuine. The other ends of these edges are moves for Player II. These moves are partitioned into information sets V1 (corresponding to outcome 1), V2 (corresponding to outcome 2), and singleton information sets V3, V4, V5, V6 corresponding to outcomes 3, 4, 5, and 6 respectively. Player II must guess the die based on the information given. If he is told that the game has reached a move in information set V1, it simply means that the outcome of the toss is 1. He has two alternatives at each move of this information set: one corresponds to guessing that the die is fake and the other corresponds to guessing that it is genuine. The same applies to the information set V2. If the outcome is in V3, ..., V6, the clear choice is to guess the die as genuine. Thus a pure strategy (master plan) for Player II is to choose a 2-tuple with coordinates taking the values F or G. Here there are 4 pure strategies for Player II. They are: (F1, F2), (F1, G), (G, F2), (G, G). The first coordinate of the strategy indicates what to guess when the outcome is 1 and the second coordinate indicates what to guess when the outcome is 2. For all other outcomes II guesses that the die is genuine (unbiased).

The payoff to Player I when Player I uses a pure strategy i and Player II uses a pure strategy j is simply the expected income to Player I when the two players choose i and j simultaneously. This can as well be represented by a matrix A = (a_ij) whose rows and columns are pure strategies and whose entries are the corresponding expected payoffs. The payoff matrix A given by

        (F1,F2)  (F1,G)  (G,F2)  (G,G)
  F1      0        0       1       1
  F2      0        1       0       1
  G      1/3      1/6     1/6      0

is called the normal form reduction of the original extensive game.

Saddle Point

The normal form of a zero-sum two person game has a saddle point when there is a row r and column c such that the entry a_rc is the smallest in row r and the largest in column c. By choosing the pure strategy corresponding to row r, Player I guarantees a payoff of a_rc = min_j a_rj. By choosing column c, Player II guarantees a loss of no more than max_i a_ic = a_rc. Thus row r and column c are good pure strategies for the two players. In a payoff matrix A = (a_ij), row r is said to strictly dominate row t if a_rj > a_tj for all j. Player I, the maximizer, will avoid row t when it is dominated. If rows r and t are not identical and a_rj >= a_tj for all j, then we say that row r weakly dominates row t.

Example 9 Player I chooses either 1 or 2. Knowing Player I's choice, Player II chooses either 3 or 4. If the total T is odd, Player I wins $T from Player II. Otherwise Player I pays $T to Player II.

The pure strategies for Player I are simply s1 = choose 1 and s2 = choose 2. For Player II there are four pure strategies given by: t1: choose 3 no matter what I chooses. t2: choose 4 no matter what I chooses. t3: choose 3 if I chooses 1 and choose 4 if I chooses 2. t4: choose 4 if I chooses 1 and choose 3 if I chooses 2. This results in a normal form with payoff matrix A for Player I given by:

       (3,3)  (4,4)  (3,4)  (4,3)
  1     -4      5     -4      5
  2      5     -6     -6      5

Here we can delete column 4, which dominates column 3. We don't have row domination yet. We can delete column 2 as it weakly dominates column 3. Still we have no row domination after these deletions. We can delete column 1 as it weakly dominates column 3. Now we have strict row domination of row 2 by row 1, and we are left with the row 1, column 3 entry -4. This is a saddle point for this game. In fact we have the following:

Theorem 10 The normal form of any zero-sum two person game with perfect information admits a saddle point. A saddle point can be arrived at by a sequence of row or column deletions. A row that is weakly dominated by another row can be deleted. A column that weakly dominates another column can be deleted. In each iteration we can always find a weakly or strictly dominated row, or a weakly or strictly dominating column, to be deleted from the current submatrix.
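A pure saddle point, when one exists, can also be found by directly scanning the matrix rather than by dominance deletions. A sketch (ours, not the article's procedure), applied to the normal form of Example 9:

```python
def saddle_points(A):
    """All cells (r, c) whose entry is minimal in row r and maximal in column c."""
    return [(r, c)
            for r, row in enumerate(A)
            for c, v in enumerate(row)
            if v == min(row) and v == max(A[i][c] for i in range(len(A)))]

# Example 9: rows are Player I's choices 1, 2; columns are Player II's
# contingent strategies t1..t4 described above.
A = [[-4, 5, -4, 5],
     [5, -6, -6, 5]]
print(saddle_points(A))  # -> [(0, 2)]: row 1, column 3, entry -4
```

The scan confirms the result of the deletion argument: the unique saddle point is the row 1, column 3 entry -4.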

Zero-Sum Two Person Games, Figure 2 Game tree for a single throw with fake or genuine dice

Mixed Strategy and Minimax Theorem

Zero-sum two person games do not always have saddle points in pure strategies. For example, in the game of guessing the die (Example 8) the normal form has no saddle point. Therefore it makes sense for players to choose their pure strategies via a random mechanism. Any probability distribution on the set of all pure strategies of a player is called a mixed strategy. In Example 8 a mixed strategy for Player I is a 3-tuple x = (x1, x2, x3) and a mixed strategy for Player II is a 4-tuple y = (y1, y2, y3, y4). Here x_i is the probability that Player I chooses pure strategy i and y_j is the probability that Player II chooses pure strategy j. Since the players play independently and make their choices simultaneously, the expected payoff to Player I from Player II is

  K(x, y) = sum_i sum_j a_ij x_i y_j ,

where the a_ij are the elements of the payoff matrix A. We call K(x, y) the mixed payoff when the players choose mixed strategies x and y instead of pure strategies i and j.

Suppose x* = (1/8, 1/8, 3/4) and y* = (3/4, 0, 0, 1/4). Here x* guarantees Player I an expected payoff of 1/4 against any pure strategy j of Player II. By the affine linearity of K(x, y) in y it follows that Player I has a guaranteed expectation of 1/4 against any mixed strategy choice of II. A similar argument shows that Player II can choose the mixed strategy y* = (3/4, 0, 0, 1/4), which limits his maximum expected loss to 1/4 against any mixed strategy choice of Player I. Thus

  max_x K(x, y*) = min_y K(x*, y) = 1/4 .

By replacing the rows and columns with mixed strategy payoffs we have a saddle point in mixed strategies.

Historical Remarks

The existence of a saddle point in mixed strategies for Example 8 is no accident. All finite games have a saddle point in mixed strategies. This important theorem, called the minimax theorem, is the very starting point of game theory. While Borel (see under Ville [65]) considered the notions of pure and mixed strategies for zero-sum two person games that have symmetric roles for the players, he was able to prove the theorem only for some special cases. It was von Neumann [66] who first proved the minimax theorem, using some intricate fixed point arguments. While several proofs are available for the same theorem [29,34,44,48,49,65,69], the proofs by Ville and Weyl are notable from an algorithmic point of view. The proofs by Nash and Kakutani allow immediate extension to Nash equilibrium strategies in many-person non-zero-sum games. For zero-sum two person games, optimal strategies and Nash equilibrium strategies coincide. The following is the seminal minimax theorem for matrix games.

Theorem 11 (von Neumann) Let A = (a_ij) be any m x n real matrix. Then there exist a pair of probability vectors x = (x1, x2, ..., xm) and y = (y1, y2, ..., yn) such that for a unique constant v

  sum_{i=1}^m a_ij x_i >= v ,  j = 1, 2, ..., n ,
  sum_{j=1}^n a_ij y_j <= v ,  i = 1, 2, ..., m .

The probability vectors x, y are called optimal mixed strategies for the players and the constant v is called the value of the game.
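The claimed guarantees for x* = (1/8, 1/8, 3/4) and y* = (3/4, 0, 0, 1/4) in Example 8 are easy to verify numerically. A sketch (ours), using exact rational arithmetic:

```python
from fractions import Fraction as Fr

# payoff matrix of Example 8 (rows F1, F2, G; columns (F1,F2), (F1,G), (G,F2), (G,G))
A = [[Fr(0), Fr(0), Fr(1), Fr(1)],
     [Fr(0), Fr(1), Fr(0), Fr(1)],
     [Fr(1, 3), Fr(1, 6), Fr(1, 6), Fr(0)]]
x = [Fr(1, 8), Fr(1, 8), Fr(3, 4)]   # candidate optimal strategy for Player I
y = [Fr(3, 4), Fr(0), Fr(0), Fr(1, 4)]  # candidate optimal strategy for Player II

# Player I's expected payoff against each pure column, and against each pure row:
vs_cols = [sum(A[i][j] * x[i] for i in range(3)) for j in range(4)]
vs_rows = [sum(A[i][j] * y[j] for j in range(4)) for i in range(3)]
print(min(vs_cols), max(vs_rows))  # both 1/4, so the value is v = 1/4
```

Since x* guarantees at least 1/4 against every column and y* concedes at most 1/4 against every row, affine linearity of K extends both guarantees to all mixed strategies, confirming v = 1/4.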

H. Weyl [69] gave a complete algebraic proof, showing that the value and some pair of optimal strategies for the two players have all of their coordinates lying in the smallest ordered field containing the payoff entries. Unfortunately his proof was non-constructive. It turns out that the minimax theorem can be proved via linear programming in a constructive way which leads to an efficient computational algorithm a la the simplex method [18]. The key idea is to convert the problem into a pair of dual linear programming problems.

Solving for Value and Optimal Strategies via Linear Programming

Without loss of generality we can assume that the payoff matrix A = (a_ij)_{m x n} > 0, that is a_ij > 0 for all (i, j). Thus we are looking for some v such that:

  v = min v1   (6)

such that

  sum_{j=1}^n a_ij y_j <= v1 ,  i = 1, 2, ..., m ,   (7)
  y1, ..., yn >= 0 ,   (8)
  sum_{j=1}^n y_j = 1 .   (9)

Since the payoff matrix is positive, any v1 satisfying the constraints above will be positive, so with the substitution eta_j = y_j / v1 the problem can be reformulated as

  max 1/v1 = max sum_{j=1}^n eta_j   (10)

such that

  sum_{j=1}^n a_ij eta_j <= 1  for all i ,   (11)
  eta_j >= 0  for all j .   (12)

With A > 0, the eta_j's are bounded. The maximum of the linear function sum_j eta_j is attained at some extreme point of the convex set given by the constraints (11) and (12). By introducing nonnegative slack variables s1, s2, ..., sm we can replace the inequalities (11) by equalities (14). The problem reduces to

  max sum_{j=1}^n eta_j   (13)

subject to

  sum_{j=1}^n a_ij eta_j + s_i = 1 ,  i = 1, 2, ..., m ,   (14)
  eta_j >= 0 ,  j = 1, 2, ..., n ,   (15)
  s_i >= 0 ,  i = 1, 2, ..., m .   (16)

Of the various algorithms to solve a linear programming problem, the simplex algorithm is among the most efficient. It was first investigated by Fourier (1830), but no other work was done for more than a century. The need for industrial applications motivated active research and led to the pioneering contributions of Kantorovich [1939] (see the translation in Management Science [35]) and Dantzig [18]. It was Dantzig who brought the earlier investigations of Fourier to the forefront of modern applied mathematics.

Simplex Algorithm

Consider our linear programming problem above. Any solution eta = (eta_1, ..., eta_n), s = (s1, ..., sm) to the above system of equations is called a feasible solution. We could also rewrite the system as

  eta_1 C^1 + eta_2 C^2 + ... + eta_n C^n + s1 e^1 + s2 e^2 + ... + sm e^m = 1 ,
  eta_1, eta_2, ..., eta_n, s1, s2, ..., sm >= 0 .

Here C^j, j = 1, ..., n are the columns of the matrix A and the e^i are the columns of the m x m identity matrix. The vector 1 is the vector with all coordinates unity. With any extreme point (eta, s) = (eta_1, eta_2, ..., eta_n, s1, ..., sm) of the convex polyhedron of feasible solutions one can associate a set of m linearly independent columns, which form a basis for the column span of the matrix (A, I). Here the coefficients eta_j and s_i are equal to zero for coordinates other than those of the specific m linearly independent columns. By slightly perturbing the entries we can assume that any extreme point of the set of feasible solutions has exactly m positive coordinates. Two extreme feasible solutions are called adjacent if the associated bases differ in exactly one column.

The key idea behind the simplex algorithm is that an extreme point P = (eta*, s*) is an optimal solution if and only if there is no adjacent extreme point Q for which the objective function has a higher value. Thus when the algorithm is initiated at an extreme point which is not optimal, there must be an adjacent extreme point that strictly improves the objective function. In each iteration, a column from outside the basis replaces a column in the current basis, corresponding to an adjacent extreme point. Since there are m + n columns in all for the matrix (A, I), and

7 5 in each iteration we have strict improvement by our non- point. We find that y1 D 61 ; y2 D 61 is such a solution. degeneracy assumption on the extreme points, the proce- The procedure terminates because no adjacent solution dure must terminate in a finite number of steps resulting with y1 > 0; s1 > 0ory2 > 0; s1 > 0ory1 > 0; s2 > 0 in an optimal solution. or y2 > 0; s2 > 0 if any has higher objective function value. The algorithm terminates with the optimal value of Example 12 Players I and II simultaneously show either 1 D 12 . Thus the value of the modified game is 61 ,and 1or2fingers.IfT is the total number of fingers shown v1 61 12 61 1 then Player I receives from Player II $T when T,isodd the value of the original game is 12 5 D 12 . A good strat- and loses $T to Player II when T is even. egy for Player II is obtained by normalizing the optimal so- 7 5 lution of the linear program, it is 1 D 12 ;2 D 12 .Sim- The payoff matrix is given by ilarly, from the dual linear program we can see that the 7 5 23 strategy 1 D 12 ;2 D 12 is optimal for Player I. A D : 3 4 Fictitious Play Add 5 to each entry to get a new payoff matrix with all entries strictly positive. The new game is strategically same Though optimal strategies are not easily found, even naive as A. players can learn to steer their average payoff towards the value of the game from past plays by certain iterative pro- 38 : cedures. This learning procedure is known as fictitious 81 play. The two players make their next choice under the as- The linear programming problem is given by sumption that the opponent will continue to choose pure strategies at the same frequencies as what he/she did in the max 1:y1 C 1:y2 C 0:s1 C 0:s2 past. 
If x(n); y(n) are the empirical mixed strategies used by such that the two players in the first n rounds,theninroundn C 1 2 3 Player I pretends that Player II will continue to use y(n) in y1 6 7 the future and selects any row i such that 3810 6 y2 7 1 4 5 D : X X 8101 s1 1 (n) (n) ai j y D max aijy : s j i j 2 j j T We can start with the trivial solution (0; 0; 1; 1) .Thiscor- The new empirical mixed strategy is given by 1 2 responds to the basis e ; e with s1 D s2 D 1. The value of the objective function is 0. If we make y2 > 0, then the (nC1) 1 n (n) x D I C x : value of the objective function can be increased. Thus we n C 1 i n C 1 look for a solution to (Here Ii is the degenerate choice of pure strategy i .) This s1 D 0 ; s2 > 0 ; y2 > 0 intuitive learning procedure was proposed by Brown [14] and the following convergence theorem was proved by satisfying the constraints Robinson [56].

8y2 C 0s2 D 1 Theorem 13 X X y2 C s2 D 1 (n) (n) lim min aijxi D lim max aijy j D v : n j n i or to i j

s2 D 0 ; s1 > 0 ; y2 > 0 We will apply the fictitious play algorithm to the following satisfying the constraints example and get a bound on the value.

8y2 C s1 D 1 Example 14 Player I picks secretly a card of his choice from a deck of three cards numbered 1, 2, and 3. Player II y2 C 0s1 D 1 proceeds to guess player I’s choice. After each guess 1 7 Notice that y2 D 8 ; s2 D 8 is a solution to the first sys- player I announces player II’s guess as “High”, “Low” or tem and that the second system has no nonnegative so- “Correct” as the case may be. The game continues till lution. The value of the objective function at this ex- player II guesses player I’s choice correctly. Player II pays 1 treme solution is 8 . Now we look for an adjacent extreme to Player I $N where N is the number of guesses he made. Zero-Sum Two Person Games 3381
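The payoff matrix of this guessing game, derived next, can also be generated mechanically. The following is a minimal sketch in Python; encoding Player II's five reduced guessing strategies as preference orderings is our own device for illustration, not part of the original text.

```python
# Enumerate the payoffs N (number of guesses) for the card-guessing game.
# Each of Player II's five reduced pure strategies is encoded as a preference
# ordering over guesses; after a wrong guess, the "High"/"Low" announcement
# shrinks the set of cards still possible.
def guesses_needed(card, prefs):
    candidates = {1, 2, 3}
    count = 0
    for g in prefs:
        if g not in candidates:
            continue                      # ruled out by earlier announcements
        count += 1
        if g == card:
            return count
        # "High" means the card is lower than g, "Low" that it is higher.
        candidates = {c for c in candidates
                      if c != g and (c < g) == (card < g)}
    raise AssertionError("strategy failed to find the card")

# The strategies (1,2), (1,3), (2), (3,1), (3,2) as preference lists.
strategies = [(1, 2, 3), (1, 3, 2), (2, 1, 3), (3, 1, 2), (3, 2, 1)]
A = [[guesses_needed(card, s) for s in strategies] for card in (1, 2, 3)]

if __name__ == "__main__":
    for row in A:
        print(row)
```

Running this reproduces the 3 × 5 payoff matrix displayed below.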

The payoff matrix is given by

        (1,2)  (1,3)  (2)  (3,1)  (3,2)
  1   [   1      1     2     2      3  ]
  2   [   2      3     1     3      2  ]
  3   [   3      2     2     1      1  ]

Here the row labels are the possible cards chosen by Player I, and the column labels are pure strategies for Player II. For example, the pure strategy (1, 3) for Player II means that 1 is the first guess and, if 1 is incorrect, then 3 is the second guess. The elements of the matrix are payoffs to Player I. We can use cumulative totals, instead of averages, for the players to make their next choice of row or column based on the totals. We choose the row or column with the least index in case more than one row or one column meets the criterion. The totals for the first 10 rounds of fictitious play are given in Table 2.

Zero-Sum Two Person Games, Table 2

  Row      Column totals         Column   Row totals
  choice   so far                choice   so far
  R1        1   1   2   2   3    C1        1   2   3
  R3        4   3   4   3   4    C2        2   5   5
  R2        6   6   5   6   6    C3        4   6   7
  R3        9   8   7   7   7    C3        6   7   9
  R3       12  10   9   8   8    C4        8  10  10
  R3       15  12  11   9   9    C4       10  13  11
  R2       17  15  12  12  11    C5       13  15  12
  R2       19  18  13  15  13    C3       15  16  14
  R2       21  21  14  18  15    C3       17  17  16
  R1       22  22  16  20  18    C3       19  18  18

The smallest column total (16) and the largest row total (19) after 10 rounds give approximate lower and upper bounds for the value: 1.6 ≤ v ≤ 1.9.

Remark 15  Fictitious play is known to have a very poor rate of convergence to the value. While it works for all zero-sum two person games, it fails to extend to Nash equilibrium payoffs in bimatrix games, even when the game has a unique Nash equilibrium. It extends only to some very special classes, such as 2 × 2 bimatrix games and the so-called potential games. (See Miyasawa [1961], Shapley [1964], Monderer and Shapley [46], Krishna and Sjoestrom [41], and Berger [7].)

Search Games

Search games are often motivated by military applications. An object is hidden in space. While the space where the object is hidden is known, the exact location is unknown. The search strategy consists of either targeting a single point of the space and paying a penalty when the search fails, or continuing the search till the searcher gets close to the hidden location. If the search consists of many attempts, then a pure strategy is simply a function of the search history so far. Example 14 is a typical search game. The following are examples of some simple search games that have unexpected twists with respect to the value and optimal strategies.

Example 16  A pet shop cobra of known length t < 1 escapes from the shop and has settled in a nearby tree, somewhere along a particular linear branch of unit length. Due to its camouflage, the exact location [x, x + t] where it has settled on the branch is unknown. The shopkeeper chooses a point y of his choice and aims a bullet at the point y. In spite of his 100% accuracy, the cobra will escape permanently if the targeted point y is outside the settled location of the cobra.

We treat this as a game between the cobra (Player I) and the shopkeeper (Player II). Let the probability of survival be the payoff to the cobra. Thus

  K(x, y) = 1 if y < x or y > x + t ;
  K(x, y) = 0 otherwise.

The pure strategy spaces are 0 ≤ x ≤ 1 − t for the snake and 0 ≤ y ≤ 1 for the shopkeeper. It can be shown that the game has no saddle point but has optimal mixed strategies. The value function v(t) is a discontinuous function of t. In case 1/t is an integer n, a good strategy for the snake is to hide along [0, t], or [t, 2t], ..., or [(n−1)t, 1], chosen with equal chance. In this case an optimal strategy for the shopkeeper is to choose a random point in [0, 1]. The value is 1 − 1/n. In case 1/t is a fraction, let n = [1/t]; then an optimal strategy for the snake is to hide along [0, t], or [t, 2t], ..., or [(n−1)t, nt]. An optimal strategy for the shopkeeper is to shoot at one of the points 1/(n+1), 2/(n+1), ..., n/(n+1), chosen at random.

Example 17  While mowing the lawn, a lady suddenly realizes that she has lost her diamond engagement ring somewhere in her lawn. She has maximum speed s and will be able to locate the ring from its glitter if she is sufficiently close to it, say within a certain distance of the ring. What is an optimal search strategy that minimizes her search time?

If we treat Nature as a player against her, she is playing a zero-sum two person game where Nature would find pleasure in her delayed success in finding the ring.
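A fictitious-play run like the one summarized in Table 2 for Example 14 can be sketched as follows. This is an illustrative implementation with the least-index tie-break; since the printed table appears to break one tie differently, an individual run may deviate from it at ties, but the bracketing property of the bounds always holds (the value of this game is in fact 1.8).

```python
# Fictitious play with cumulative totals on the payoff matrix of Example 14.
# The row player maximizes, the column player minimizes; ties are broken
# toward the least index.
A = [[1, 1, 2, 2, 3],
     [2, 3, 1, 3, 2],
     [3, 2, 2, 1, 1]]
m, n = len(A), len(A[0])

col_totals = [0] * n   # cumulative payoff per column vs. the row history
row_totals = [0] * m   # cumulative payoff per row vs. the column history
rounds = 10
for k in range(rounds):
    # Row player best-responds to the column player's empirical play.
    i = max(range(m), key=lambda r: (row_totals[r], -r)) if k else 0
    for j in range(n):
        col_totals[j] += A[i][j]
    # Column player best-responds to the row player's empirical play.
    j = min(range(n), key=lambda c: (col_totals[c], c)) if k else 0
    for r in range(m):
        row_totals[r] += A[r][j]

lower = min(col_totals) / rounds  # guaranteed by the row player's empirical mix
upper = max(row_totals) / rounds  # conceded by the column player's empirical mix
print(lower, upper)
```

For any run the empirical mixed strategies guarantee lower ≤ v ≤ upper; here v = 1.8, achieved by the optimal strategies (0.4, 0.2, 0.4) and (0, 0.2, 0.6, 0.2, 0).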
Search Games on Trees

The following is an elegant search game on a tree. For many other search games the reader may refer to the monographs by Gal [27] and Alpern and Gal [1]. Also see [55].

Example 18  A bird has to look for a suitable location to build its nest for hatching eggs and protecting them against predator snakes. Having identified a large tree with a single predator snake in the neighborhood, the bird has to further decide where to build its nest on the chosen tree. The chance of survival of the eggs is directly proportional to the distance the snake travels to locate the nest. While birds and snakes work out their strategies based on instinct and evolutionary behavior, we can surely approximate the problem by the following zero-sum two person search game. Let T = (X, E) be a finite tree with vertex set X and edge set E. Let O ∈ X be the root vertex. A hider hides an object at a vertex x of the tree. A searcher starts at the root and travels along the edges of the tree such that the path traced covers all the terminal vertices. The search ends as soon as the searcher crosses the hidden location, and the payoff to the hider is the distance traveled so far.

By a simple domination argument we can as well assume that the optimal hiding locations are simply the terminal vertices.

Zero-Sum Two Person Games, Figure 3
Bird trying to hide at a leaf and snake chasing to reach the appropriate leaf via an optimal Chinese postman route starting at root O and ending at O

Theorem 19  The search game has a value and optimal strategies. The value coincides with the sum of all edge lengths. Any optimal strategy for the hider will necessarily restrict to hiding at one of the terminal vertices. Let the least distance traveled to exhaust all end vertices one by one correspond to a permutation of the end vertices in the order w_1, w_2, ..., w_k, and let the reverse ordering be its reverse permutation. Then an optimal strategy for the searcher is to choose one of these two permutations by the toss of a coin. The hider has a unique optimal mixed strategy that chooses each end vertex with positive probability.

Suppose the tree is a path with root O and a single terminal vertex x. Since the search begins at O, the longest trip is possible only when the hider hides at x, and the theorem holds trivially. In case the tree has just two terminal vertices besides the root vertex, the possible hiding locations are, say, x_1, x_2 with edge lengths a_1, a_2. The possible searches are via the paths O → x_1 → O → x_2, abbreviated O x_1 O x_2, or O → x_2 → O → x_1, abbreviated O x_2 O x_1. The payoff matrix can be written as

         O x_1 O x_2     O x_2 O x_1
  x_1  [    a_1          2 a_2 + a_1 ]
  x_2  [ 2 a_1 + a_2        a_2      ]

The value of this game is a_1 + a_2, the sum of the edge lengths. We can use induction on the number of subtrees to establish the value as the sum of edge lengths. We will use an example just to provide the intuition behind the formal proof.

Given any permutation of the end vertices (leaves) of the tree in Figure 3, let P be the shortest path from the root vertex that travels along the leaves in that order and returns to the root, and let the reverse path be P^(−1). Observe that it will cover all edges twice. Thus if the two paths P and P^(−1) are chosen with equal chance by the snake, the average distance traveled by the snake when it locates the bird's nest at an end vertex will be independent of the particular end vertex. For example, along the closed path O → t → d → t → a → t → x → b → x → c → x → t → O → u → s → e → s → y → f → y → g → y → s → u → h → u → j → u → O, the distance traveled by the snake to reach leaf e is (3 + 7 + 7 + ... + 8 + 3) = 74. If the snake travels along the reverse path to reach e, the distance traveled is (9 + 5 + 5 + ... + 5 + 4 + 3) = 66. For example, if it is to reach the vertex d, then via the path P the distance is (3 + 7). Via P^(−1) it must travel to e and then from e to d by the reverse path; this is (66 + 3 + ... + 6 + 6 + 7) = 130. Thus in both cases the average distance traveled is 70. The average distance is the same for every other leaf when P and P^(−1) are used.

The optimal Chinese postman route can allow all permutations subject to permuting the leaves of any subtree only among themselves. Thus the subtree rooted at t has leaves a, b, c, d and the subtree rooted at u has leaves e, f, g, h, j. For example, while permuting a, b, c, d only among themselves we have the further restriction that between b, c we cannot allow the insertion of a or d. For example, a, b, c, d and a, d, c, b are acceptable permutations, but not a, b, d, c; it can never be the optimal permuting choice. The same applies to the tree rooted at u. For example, h, j, e, g, f is part of an optimal Chinese postman route, but h, g, j, e, f is not. We can think of the snake and the bird playing the game as follows. The bird chooses to hide at a leaf of the subgame G_t rooted at t or at a leaf of the subgame G_u rooted at u. These leaves exhaust all leaves of the original game. The snake can restrict himself to the optimal route of each subgame. This can be thought of as a 2 × 2 game where the strategies for the two players (bird and snake) are:

Bird:
Strategy 1: Hide optimally at a leaf of G_t.
Strategy 2: Hide optimally at a leaf of G_u.

Snake:
Strategy 1: Search first the leaves of G_t along the optimal Chinese postman route of G_t and then search along the leaves of G_u.
Strategy 2: Search first the leaves of G_u along the optimal Chinese postman route of G_u and then search the leaves of G_t along its optimal postman route.

The expected outcome can be written as the following 2 × 2 game. (Here v(G_t), v(G_u) are the values of the subgames rooted at t, u respectively.)

          G_t G_u                            G_u G_t
  G_t  [  3 + v(G_t)                  2[9 + v(G_u)] + [3 + v(G_t)] ]
  G_u  [  2[3 + v(G_t)] + [9 + v(G_u)]          9 + v(G_u)         ]

Observe that the 2 × 2 game has no saddle point and hence has value 3 + 9 + v(G_t) + v(G_u). By induction we can assume v(G_t) = 24, v(G_u) = 34. Thus the value of this game is 70. This is also the sum of the edge lengths of the game tree. An optimal strategy for the bird can be recursively determined as follows.

Umbrella Folding Algorithm  Ladies, when storing umbrellas inside their handbags, shrink the central stem of the umbrella and then the stems around, all in one stroke. We can mimic a somewhat similar procedure for our game tree. We simultaneously shrink the edges [xc] and [xb] to x. In the next round the edges [a, t], [x, t], [d, t] can be simultaneously shrunk to t, and so on till the entire tree is shrunk to the root vertex O. We do know that the optimal strategy for the bird, when the tree is simply the subtree with root x and leaves b, c, is given by p(b) = 4/(4+2), p(c) = 2/(4+2). Now for the subtree with vertex t and leaves {a, b, c, d}, we can treat this as collapsing the previous subtree to x and treating the stem lengths of the new subtree with vertices {t, a, x, d} as though the three stems [ta], [tx], [td] have lengths 6, 5 + (4 + 2), 7. We can check that for this subtree game the leaves a, x, d are chosen with probabilities p(a) = 6/(6+9+7), p(x) = 9/(6+9+7), p(d) = 7/(6+9+7). Thus the optimal mixed strategy for the bird for choosing leaf b in our original tree game is to pass through the vertices t, x, b and is given by the product p(t) p(x) p(b). We can inductively calculate these probabilities.

Completely Mixed Games and Perron's Theorem on Positive Matrices

A mixed strategy x for Player I is called completely mixed if it is strictly positive (x > 0). A matrix game A is completely mixed if and only if all optimal mixed strategies for Player I and Player II are completely mixed. The following elegant theorem was proved by Kaplansky [36].

Theorem 20  A matrix game A with value v is completely mixed if and only if
1. The matrix is square.
2. The optimal strategies are unique for the two players.
3. If v ≠ 0, then the matrix is nonsingular.
4. If v = 0, then the matrix has rank n − 1, where n is the order of the matrix.

The theory of completely mixed games is a useful tool in linear algebra and numerical analysis [4]. The following is a sample application of this theorem.

Theorem 21 (Perron 1909)  Let A be any n × n matrix with positive entries. Then A has a positive eigenvalue λ_0 with a positive eigenvector which is also a simple root of the characteristic equation.

Proof  Let I be the identity matrix. For any λ > 0, the maximizing player prefers to play the game A rather than the game A − λI: the payoff gets worse as the diagonal entries are decreased. The value function v(λ) of A − λI is a non-increasing continuous function. Since v(0) > 0 and v(λ) < 0 for large λ, there is some λ_0 > 0 for which the value of A − λ_0 I is 0. Let y be optimal for Player II; then (A − λ_0 I) y ≤ 0 implies 0 < A y ≤ λ_0 y. Since the optimal y is completely mixed, for any optimal x of Player I we have (A − λ_0 I)^T x = 0. Thus x > 0 and the game is completely mixed. By (2) and (4), if (A − λ_0 I) u = 0 then u is a scalar multiple of y, and so the eigenvector y is geometrically simple. If B = A − λ_0 I, then B is singular and of rank n − 1. If (B_ij) is the cofactor matrix of the singular matrix B, then

  Σ_j b_ij B_kj = 0 , i = 1, ..., n .

Thus row k of the cofactor matrix is a scalar multiple of y. Similarly, each column of the cofactor matrix is a scalar multiple of x. Thus all cofactors are of the same sign and are different from 0. That is,

  d/dλ det(A − λI) | at λ_0 = −Σ_i B_ii ≠ 0 .

Thus λ_0 is also algebraically simple. (See [4] for the most general extensions of this theorem to the theorems of Perron and Frobenius and to the theory of M-matrices and power-positive and polynomially positive matrices.)

Behavior Strategies in Games with Perfect Recall

Consider any extensive game where the unique unicursal path from an end vertex w to the root x_0 intersects two moves x and y of, say, Player I. We say x < y if the unique path from y to x_0 is via the move x. Let U ∋ x and V ∋ y be the respective information sets. If the game has reached a move y ∈ V, Player I will know that it is his turn and that the game has progressed to some move in V. The game is said to have perfect recall if each player can remember all his past moves and the choices made in those moves. For example, if the game has progressed to a move of Player I in the information set V, he will remember the specific alternative chosen in any earlier move. A move x is possible for Player I with his pure strategy π_1 if for some suitable pure strategy π_2 of Player II the move x can be reached with positive probability using π_1, π_2. An information set U is relevant for a pure strategy π_1 of Player I if some move x ∈ U is possible with π_1. Let Π_1, Π_2 be the pure strategy spaces for Players I and II.

Let μ_1 = {q_{π_1}, π_1 ∈ Π_1} be any mixed strategy for Player I. The information set U for Player I is relevant for the mixed strategy μ_1 if for some q_{π_1} > 0, U is relevant for π_1. We say that the information set U for Player I is not relevant for the mixed strategy μ_1 if for all q_{π_1} > 0, U is not relevant for π_1. Let

  S^c = { π_1 : U is relevant for π_1 and π_1(U) = c } ,
  S   = { π_1 : U is relevant for π_1 } ,
  T^c = { π_1 : U is not relevant for π_1 and π_1(U) = c } ,
  T   = { π_1 : U is not relevant for π_1 } .

The behavior strategy induced by a mixed strategy μ_1 at an information set U for Player I is simply the conditional probability of choosing the alternative c in the information set U, given that the game has progressed to a move in U, namely

  β_1(U, c) = ( Σ_{π_1 ∈ S^c} q_{π_1} ) / ( Σ_{π_1 ∈ S} q_{π_1} )  if U is relevant for μ_1 ;
  β_1(U, c) = ( Σ_{π_1 ∈ T^c} q_{π_1} ) / ( Σ_{π_1 ∈ T} q_{π_1} )  if U is not relevant for μ_1 .

The following theorem of Kuhn [42] is a consequence of the assumption of perfect recall.

Theorem 22  Let μ_1, μ_2 be mixed strategies for Players I and II respectively in a zero-sum two person finite game of perfect recall. Let β_1, β_2 be the induced behavior strategies for the two players. Then the probability of reaching any end vertex w using μ_1, μ_2 coincides with the probability of reaching w using the induced behavior strategies β_1, β_2. Thus in zero-sum two person games with perfect recall, players can play optimally by restricting their strategy choices just to behavior strategies.

The following analogy may help us understand the advantages of behavior strategies over mixed strategies. A book has 10 pages with 3 lines per page. Someone wants to glance through the book, reading just 1 line from each page. A master plan (pure strategy) for scanning the book consists of choosing one line number for each page. Since each page has 3 lines, the number of possible plans is 3^10. Thus the set of mixed strategies is a set of dimension 3^10 − 1. There is another randomized approach for scanning the book: when page i is about to be scanned, choose line 1 with probability x_i1, line 2 with probability x_i2, and line 3 with probability x_i3. Since for each i we have x_i1 + x_i2 + x_i3 = 1, the dimension of such a strategy space is just 20. Behavior strategies are easier to work with. Further, Kuhn's theorem guarantees that we can restrict to behavior strategies in games with perfect recall.

In general, if there are k alternatives at each information set for a player and if there are n information sets for the player, the dimension of the mixed strategy space is k^n − 1. On the other hand, the dimension of the behavior strategy space is simply n(k − 1). Thus while the dimension of the mixed strategy space grows exponentially, the dimension of the behavior strategy space grows linearly. The following example will illustrate the advantages of using behavior strategies.

Example 23  Player I has 7 dice. All but one are fake. Fake die F_i has the same number i on all faces, i = 1, ..., 6. Die G is an ordinary unbiased die. Player I selects one of them secretly and announces the outcome of a single toss of the selected die to Player II. It is Player II's turn to guess which die was selected for the toss. He gets no reward for a correct guess but pays $1 to Player I for any wrong guess.

Player I has 7 pure strategies while Player II has 2^6 = 64 pure strategies. As an example, the pure strategy (F_1, G, G, F_4, G, F_6) for Player II is one which guesses the die as fake when the outcome revealed is 1 or 4 or 6, and guesses the die as genuine when the outcome is 2 or 3 or 5. The normal form of the game is a payoff matrix of size 7 × 64. For example, if G is chosen by Player I and (F_1, G, G, F_4, G, F_6) is chosen by Player II, the expected payoff to Player I is (1/6)[1 + 0 + 0 + 1 + 0 + 1] = 1/2. If F_2 is chosen by Player I, the expected payoff is 1 against the above pure strategy of Player II. Now Player II can use the following behavior strategy: if the outcome is i, then with probability q_i he guesses that the die is genuine, and with probability (1 − q_i) he guesses that it is the fake die F_i. The expected behavioral payoff to Player I when he chooses the genuine die with probability p_0 and chooses the fake die F_i with probability p_i, i = 1, ..., 6, is given by

  K(p, q) = Σ_{i=1}^6 (1/6) p_0 (1 − q_i) + Σ_{i=1}^6 p_i q_i .

Collecting the coefficients of the q_i's, we get

  K(p, q) = Σ_{i=1}^6 q_i ( p_i − (1/6) p_0 ) + p_0 .

By choosing p_i − (1/6) p_0 = 0, we get p_1 = p_2 = ... = p_6 = (1/6) p_0. Thus p = (1/2, 1/12, 1/12, 1/12, 1/12, 1/12, 1/12). For this mixed strategy of Player I, the payoff to Player I is independent of Player II's actions. Similarly, we can rewrite K(p, q) as a function of the p_k's, where

  K(p, q) = Σ_{k ≠ 0} p_k q_k + p_0 · (1/6) Σ_{r=1}^6 (1 − q_r) .

This expression can be made independent of the p_k's by choosing q_i = 1/2, i = 1, ..., 6. Since the behavioral payoff for these behavior strategies is 1/2, the value of the game is 1/2: Player I cannot do any better than 1/2, while Player II is able to limit his losses to 1/2.

Efficient Computation of Behavior Strategies

Introduction

In our above example with one genuine and six fake dice, we used Kuhn's theorem to narrow our search among optimal behavior strategies. Our success depended on exploiting the inherent symmetries in the problem. We were also lucky in our search when we were looking for optimals among equalizers.

From an algorithmic point of view, this is not possible with an arbitrary extensive game with perfect recall. While the normal form is appropriate for finding optimal mixed strategies, its complexity grows exponentially with the size of the vertex set. The payoff matrix in normal form is in general not a sparse matrix (a sparse matrix is one which has very few nonzero entries), a key issue for data storage and computational accuracy. By sticking to the normal form of a game with perfect recall we cannot take full advantage of Kuhn's theorem in its drastically narrowed-down search for optimals among behavior strategies. A more appropriate form for these games is the sequence form [63] and realization probabilities, to be described below. The behavior strategies that induce the realization probabilities grow only linearly in the size of the terminal vertex set. Another major advantage is that the sequence form induces a sparse matrix: it has at most as many non-zero entries as the number of terminal vertices, or plays.

Sequence Form

When the game moves to an information set U_1 of, say, Player I, the perfect recall condition implies that wherever the true move lies in U_1, the player knows the actual alternatives chosen in all of his past moves. Let σ_{u_1} denote the sequence of alternatives chosen by Player I in his past moves. If no past move of Player I occurs, we take σ_{u_1} = ∅. Suppose in U_1 Player I selects an action c with behavioral probability β_1(c); if the outcome is c, the new sequence is σ_{u_1} ∪ c. Thus any sequence s_1 for Player I is the string of choices in his moves along the partial path from the initial vertex to any other vertex of the tree. Let S_0, S_1, S_2 be the sets of all sequences for Nature (via chance moves), Player I, and Player II, respectively. Given behavior strategies β_0, β_1, β_2, let

  r_i(s_i) = Π_{c ∈ s_i} β_i(c) , i = 0, 1, 2 .

The functions r_i : S_i → R, i = 0, 1, 2, satisfy the following conditions:

  r_i(∅) = 1 ,  (17)

  r_i(σ_{u_i}) = Σ_{c ∈ A(U_i)} r_i(σ_{u_i} ∪ c) , i = 0, 1, 2 ,  (18)

  r_i(s_i) ≥ 0 for all s_i , i = 0, 1, 2 .  (19)
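As a quick sanity check of (17)-(19), here is a toy computation. The two-information-set structure (first choose l or r; after l, choose L or R) and the numeric behavior probabilities are made up for illustration only.

```python
# Realization probabilities r(s) = product of behavior probabilities along
# the sequence s, for a player with two information sets: first choose l or
# r, and after l choose L or R.  (Structure and numbers are illustrative.)
beta = {'l': 0.3, 'r': 0.7, 'L': 0.6, 'R': 0.4}

def r(seq):
    p = 1.0
    for c in seq:
        p *= beta[c]
    return p

sequences = ['', 'l', 'r', 'lL', 'lR']   # '' is the empty sequence
probs = {s: r(s) for s in sequences}
print(probs)
```

The empty sequence gets probability 1 (condition (17)), and at each information set the realization probability splits across the available actions (condition (18)).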

Conversely, given any such realization functions r_1, r_2, we can define behavior strategies, β_1 say for Player I, by

  β_1(U_1, c) = r_1(σ_{u_1} ∪ c) / r_1(σ_{u_1})

for c ∈ A(U_1) and r_1(σ_{u_1}) > 0. When r_1(σ_{u_1}) = 0 we define β_1(U_1, c) arbitrarily, so that Σ_{c ∈ A(U_1)} β_1(U_1, c) = 1. If the terminal payoff to Player I at a terminal vertex ω is h(ω), then by defining h(a) = 0 for all nodes a that are not terminal vertices, we can easily check that the behavioral payoff is

  H(β_1, β_2) = Σ_{s ∈ S} h(s) Π_{i=0}^{2} r_i(s_i) .

When we work with the realization functions r_i, i = 1, 2, we can associate with them the sequence form of the game: a payoff matrix whose rows correspond to sequences s_1 ∈ S_1 for Player I and whose columns correspond to sequences s_2 ∈ S_2 for Player II, with entries

  K(s_1, s_2) = Σ_{s_0 ∈ S_0} h(s_0, s_1, s_2) .

Unlike mixed strategies, the sequence functions r_1, r_2 satisfy additional constraints, given by the linear conditions above. It is convenient to denote the sequence functions r_1, r_2 by vectors x, y respectively. The vector x has |S_1| coordinates and the vector y has |S_2| coordinates. The constraints on x and y are linear, given by Ex = e, Fy = f, where the first row is the unit vector (1, 0, ..., 0) of appropriate size in both E and F. If U_1 is the collection of information sets for Player I, then the number of rows of E is 1 + |U_1|. Similarly, the number of rows of F is 1 + |U_2|. Except for the first row, each row has its starting entry equal to −1, followed by some 1's and 0's. Consider the following extensive game with perfect recall.

Zero-Sum Two Person Games, Figure 4

The set of sequences for Player I is given by S_1 = {∅, l, r, L, R}. The set of sequences for Player II is given by S_2 = {∅, c, d}. The sequence form payoff matrix is given by

                  ∅    c    d
        ∅    [    X    X    X  ]
        l    [    0    0    0  ]
  K  =  r    [    0    1    1  ]
        L    [    0    2    4  ]
        R    [    1    0    0  ]

Since no end vertex corresponds to s_1 = ∅ for Player I, the first row of A is identically 0, and so it is represented by X entries. The constraint matrices E and F are given by

  E = [  1   0   0   0   0 ]        F = [  1   0   0 ]
      [ −1   1   1   0   0 ]            [ −1   1   1 ]
      [  0  −1   0   1   1 ]

We are looking for a pair of vectors x°, y° such that y° is a best reply to x°, minimizing (x°, Ay) among all vectors satisfying Fy = f, y ≥ 0; similarly, x° is a best reply maximizing (x, Ay°) subject to Ex = e, x ≥ 0. The duals to these two linear programming problems are

  max (f, q)  such that  F^T q ≤ A^T x° ,  q unrestricted ;
  min (e, p)  such that  E^T p ≥ A y° ,  p unrestricted .

Since Ex = e with x ≥ 0 and Fy = f with y ≥ 0, these two problems can as well be viewed as the dual linear programs

  Primal: max (f, q)  such that  F^T q − A^T x ≤ 0 ,  Ex = e ,  x ≥ 0 ,  q unrestricted.

  Dual: min (e, p)  such that  E^T p − A y ≥ 0 ,  Fy = f ,  y ≥ 0 ,  p unrestricted.

We can essentially prove that:

Theorem 24  The optimal behavior strategies of a zero-sum two person game with perfect recall can be found by solving for optimal solutions of the dual linear programs induced by its sequence form. The linear programs have a size which, in its sparse representation, is linear in the size of the game tree.

General Minimax Theorems

The minimax theorem of von Neumann can be generalized if we notice the possible limitations for extensions.

Example 25  Players I and II secretly choose positive integers i, j respectively. The payoff matrix is given by

  a_ij = 1 if i > j ;  a_ij = −1 if i < j ;  a_ij = 0 if i = j .

The value of the game does not exist. The boundedness of the payoff is essential, and the following extension holds.

S-games

Given a closed bounded set S ⊂ R^m, Players I and II play the following game. Player II secretly selects a point s = (s_1, ..., s_m) ∈ S. Knowing the set S but not knowing the point chosen by Player II, Player I selects a coordinate i ∈ {1, 2, ..., m}. Player I receives from Player II the amount s_i.

Theorem 26  Given that S is a compact subset of R^m, every S-game has a value, and the two players have optimal mixed strategies which use at most m pure strategies. If the set S is also convex, Player II has an optimal pure strategy.

Proof  Let T be the convex hull of S. Here T is also compact. Let

  v = min_{t ∈ T} max_i t_i = max_i t_i° .

The compact convex set T and the open convex set G = { s : max_i s_i < v } are disjoint. By the weak separation theorem for convex sets there exist π ≠ 0 and a constant c such that

  (π, s) ≤ c for all s ∈ G  and  (π, t) ≥ c for all t ∈ T .

Using the property that v̄ = (v, v, ..., v) ∈ Ḡ and t° ∈ T ∩ Ḡ, we have (π, t°) = c. For any u ≥ 0, t° − u ∈ Ḡ. Thus (π, t° − u) ≤ c; that is, (π, u) ≥ 0. We can assume π is a probability vector, in which case

  v = (π, v̄) ≤ c = (π, t°) ≤ max_i t_i° = v .

Thus π is an optimal mixed strategy for Player I. Now Player II has in t° = Σ_j λ_j x^j a mixed strategy which chooses x^j with probability λ_j. Since t° is a boundary point of T = con S, by the Carathéodory theorem for convex hulls the convex combination above involves at most m points of S. Hence the theorem. See [50].

Geometric Consequences

Many geometric theorems can be derived by using the minimax theorem for S-games. Here we give as an example the following theorem of Berge [6], which follows from the theorem on S-games.

Theorem 27  Let S_i, i = 1, ..., m be compact convex sets in R^n. Let them satisfy the following two conditions:
1. S = ∪_{i=1}^m S_i is convex.
2. ∩_{i ≠ j} S_i ≠ ∅ for each j = 1, 2, ..., m.
Then ∩_{i=1}^m S_i ≠ ∅.

Proof  Suppose Player I secretly chooses one of the sets S_i and Player II secretly chooses a point x ∈ S. Let the payoff to Player I be d(S_i, x), where d is the distance of the point x from the set S_i. By our S-game arguments we have, for some probability vector λ = (λ_1, ..., λ_m) and some mixed strategy μ = (μ_1, ..., μ_m) over points x^1, ..., x^m of S,

  Σ_i λ_i d(S_i, x) ≥ v for all x ∈ S ,  (20)

  Σ_j μ_j d(S_i, x^j) ≤ v for all i = 1, ..., m .  (21)

Since the d(S_i, ·) are convex functions, the second inequality (21) implies

  d( S_i, Σ_j μ_j x^j ) ≤ v for all i .  (22)

The game admits a pure optimal strategy x° = Σ_j μ_j x^j for Player II. We are also given that

  ∩_{i ≠ j} S_i ≠ ∅ , j = 1, 2, ..., m .

For any optimal mixed strategy λ = (λ_1, ..., λ_m) of Player I, if some λ_i = 0 and we choose an x ∈ ∩_{j ≠ i} S_j, then from (20) we have 0 ≥ v, and thus v = 0. When the value v is zero, the third inequality (22) shows that x° ∈ ∩_i S_i. If λ > 0, the second inequality (21) becomes an equality for all i, and we have d(S_i, x°) = v. But since x° ∈ S, we have v = 0 and x° ∈ ∩_{i=1}^m S_i. (See Raghavan [53] for other applications.)
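The failure of the value in Example 25 can be contrasted with its finite truncations, each of which is a perfectly well-behaved matrix game. A small check (the truncation size N below is arbitrary):

```python
# Finite N x N truncation of Example 25:
#   a_ij = 1 if i > j, -1 if i < j, 0 if i = j.
# Every truncation is skew-symmetric and has a saddle point at (N, N) with
# value 0, yet in the unbounded game each player can beat any fixed choice
# of the opponent, so no value exists.
N = 50
a = [[(1 if i > j else -1 if i < j else 0) for j in range(1, N + 1)]
     for i in range(1, N + 1)]

maxmin = max(min(row) for row in a)                       # row player's floor
minmax = min(max(a[i][j] for i in range(N)) for j in range(N))  # column ceiling
print(maxmin, minmax)
```

For every N the two security levels coincide at 0; the trouble in the infinite game is precisely that no largest integer exists.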

Ky Fan–Sion Minimax Theorems

General minimax theorems are concerned with the following problem: Given two arbitrary sets X, Y and a real function K : X × Y → R, under what conditions on K, X, Y can one assert

sup_{x∈X} inf_{y∈Y} K(x, y) = inf_{y∈Y} sup_{x∈X} K(x, y) ?

A standard technique for proving general minimax theorems is to reduce the problem to the minimax theorem for matrix games. Such a reduction is often possible with some form of compactness of the space X or Y and a suitable continuity and convexity or quasi-convexity of the kernel K.

Definition 28 A function f : X → R is upper semi-continuous on X if and only if for any real c the set {x : f(x) < c} is open in X. A function f : X → R is lower semi-continuous on X if and only if for any real c the set {x : f(x) > c} is open in X.

Definition 29 Let X be a convex subset of a topological vector space. A function f : X → R is quasi-convex if and only if for each real c the set {x : f(x) < c} is convex. A function g is quasi-concave if and only if −g is quasi-convex. Clearly any convex function (concave function) is quasi-convex (quasi-concave).

The following minimax theorem is a special case of more general minimax theorems due to Ky Fan [21] and Sion [61].

Theorem 30 Let X, Y be compact convex subsets of linear topological spaces. Let K : X × Y → R be upper semi-continuous (u.s.c.) in x (for each fixed y) and lower semi-continuous (l.s.c.) in y (for each x). Let K(x, y) be quasi-concave in x and quasi-convex in y. Then

max_{x∈X} min_{y∈Y} K(x, y) = min_{y∈Y} max_{x∈X} K(x, y) .

Proof The compactness of the spaces and the u.s.c., l.s.c. conditions guarantee the existence of max_{x∈X} min_{y∈Y} K(x, y) and min_{y∈Y} max_{x∈X} K(x, y). We always have

max_{x∈X} min_{y∈Y} K(x, y) ≤ min_{y∈Y} max_{x∈X} K(x, y) .

If possible let max_x min_y K(x, y) < c < min_y max_x K(x, y) for some real c. Then every y ∈ Y lies in some open set {y : K(x, y) > c}, and every x ∈ X lies in some open set {x : K(x, y) < c}. By compactness we have finite subsets X1 ⊂ X, Y1 ⊂ Y such that for each y ∈ Y, and hence for each y ∈ Con Y1, there is an x ∈ X1 with K(x, y) > c, and for each x ∈ X, and hence for each x ∈ Con X1, there is a y ∈ Y1 with K(x, y) < c.

Without loss of generality let the finite sets X1, Y1 have minimum cardinalities m and n satisfying the above conditions. The minimum cardinality conditions have the following implications. The sets S_i = {y : K(x_i, y) ≤ c} ∩ Con Y1 are non-empty and convex. Further ∩_{i≠j} S_i ≠ ∅ for all j = 1, …, m, but ∩_{i=1}^m S_i = ∅. Now by Berge's theorem (Subsect. "Geometric Consequences"), the union of the sets S_i cannot be convex. Therefore there exists y_0 ∈ Con Y1 with K(x, y_0) > c for all x ∈ X1. Since K(·, y_0) is quasi-concave we have K(x, y_0) > c for all x ∈ Con X1. Similarly there exists an x_0 ∈ Con X1 such that K(x_0, y) < c for all y ∈ Y1, and hence for all y ∈ Con Y1 (by quasi-convexity of K(x_0, ·)). Hence c < K(x_0, y_0) < c, and we have a contradiction.

When the sets X, Y are mixed strategies (probability measures) on suitable Borel spaces, one can find a value for some games with optimal strategies for one player but not for both. See [50] and Alpern and Gal [1].

Applications of Infinite Games

S-games and Discriminant Analysis

Motivated by Fisher's enquiries [25] into the problem of classifying a randomly observed human skull into one of several known populations, discriminant analysis has exploded into a major statistical tool with applications to diverse problems in business, social sciences and biological sciences [33,54].

Example 31 A population Π has a probability density which is either f1 or f2, where fi is multivariate normal with mean vector μi, i = 1, 2, and variance–covariance matrix Σ, the same for both f1 and f2. Given an observation X from population Π, the problem is to classify the observation into the proper population with density f1 or f2. The costs of misclassification are c(1/2) > 0 and c(2/1) > 0, where c(i/j) is the cost of misclassifying an observation from Π_j as one from Π_i. The aim of the statistician is to find a suitable decision procedure that minimizes the worst risk possible.

This can be treated as an S-game where a pure strategy for Player I (nature) is the secret choice of the population, and a pure strategy for the statistician (Player II) is a partition of the sample space into two disjoint sets (T1, T2), such that observations falling in T1 are classified as from Π1 and those falling in T2 as from Π2. The payoffs to Player I (nature) when the observation is chosen from Π1, Π2 are given by the risks (expected costs):

r(1, (T1, T2)) = c(2/1) ∫_{T2} f1(x) dx ,

r(2, (T1, T2)) = c(1/2) ∫_{T1} f2(x) dx .

The following theorem, based on an extension of Lyapunov's theorem for non-atomic vector measures [43], is due to Blackwell [10] and Dvoretzky, Wald and Wolfowitz [20].

Theorem 32 Let

S = { (s1, s2) : s1 = c(2/1) ∫_{T2} f1(x) dx , s2 = c(1/2) ∫_{T1} f2(x) dx , (T1, T2) ∈ T } ,

where T is the collection of all measurable partitions of the sample space. Then S is a closed bounded convex set.

We know from the theorem on S-games (Theorem 22) that Player II (the statistician) has a minimax strategy which is pure. If v is the value of the game and if (π1*, π2*) is an optimal strategy for Player I, then we have

π1* c(2/1) ∫_{T2} f1(x) dx + π2* c(1/2) ∫_{T1} f2(x) dx ≥ v

for all measurable partitions (T1, T2). For any general partition, the above expected payoff to I simplifies to

π1* c(2/1) + ∫_{T1} [ π2* c(1/2) f2(x) − π1* c(2/1) f1(x) ] dx .

It is minimized when the integrand is ≤ 0 precisely on T1. Thus the optimal pure strategy (T1*, T2*) satisfies

T1* = { x : π2* c(1/2) f2(x) − π1* c(2/1) f1(x) ≤ 0 } .

This is equivalent to

T1* = { x : U(x) = (μ1 − μ2)′ Σ⁻¹ x − (1/2)(μ1 − μ2)′ Σ⁻¹ (μ1 + μ2) ≥ k } ,
T2* = { x : U(x) < k }

for some suitable k. Let α = (μ1 − μ2)′ Σ⁻¹ (μ1 − μ2). The random variable U is univariate normal with mean α/2 and variance α if x ∈ Π1, and mean −α/2 and variance α if x ∈ Π2. The minimax strategy for the statistician will be such that

c(2/1) ∫_{−∞}^{(k−α/2)/√α} (1/√(2π)) e^{−y²/2} dy = c(1/2) ∫_{(k+α/2)/√α}^{∞} (1/√(2π)) e^{−y²/2} dy .

The value of k can be determined by trial and error.

General Minimax Theorem and Statistical Estimation

Example 33 A coin falls heads with unknown probability θ. Not knowing the true value of θ, a statistician wants to estimate θ based on a single toss of the coin, with squared error loss.

Of course, the outcome is either heads or tails. If heads, he can estimate θ as x; if the outcome is tails, he can estimate θ as y. To keep the problem zero-sum, the statistician pays a penalty (θ − x)² when he proposes x as the estimate and a penalty (θ − y)² when he proposes y as the estimate. Thus the expected loss or risk to the statistician is given by

θ(θ − x)² + (1 − θ)(θ − y)² .

It was Abraham Wald [68] who pioneered this game theoretic approach. The problem of estimation is one of discovering the true state of nature based on partial knowledge about nature revealed by experiments. While nature reveals itself in part, the aim of nature is to conceal its secrets. The statistician has to make his decisions by using observed information about nature.

We can think of this as an ordinary zero sum two person game where a pure strategy for nature is to choose any θ ∈ [0, 1] and a pure strategy for Player II (the statistician) is any point in the unit square I = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}, with the payoff given above. Expanding the payoff function we get

K(θ, (x, y)) = θ²(−2x + 1 + 2y) + θ(x² − 2y − y²) + y² .

The statistician may try to choose his strategy in such a way that no matter what θ is chosen by nature, it has no effect on his penalty given the choice he makes for x, y. We call such a strategy an equalizer. We have an equalizer strategy if we can make the above payoff independent of θ. In fact, we used this trick earlier while dealing with behavior strategies. An equalizer is obtained by choosing x and y so that the coefficients of θ² and θ are zero.
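The "trial and error" determination of the cutoff k in the discriminant example above is in practice a one-dimensional root-finding problem, since the difference of the two risks is monotone in k. A sketch under illustrative assumptions (the costs c(2/1) = 1, c(1/2) = 2 and α = 4 are made-up numbers, and Φ is built from math.erf; none of this is from the original article):

```python
import math

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def excess(k, c21, c12, alpha):
    """c(2/1)*P(U < k | Pi_1) - c(1/2)*P(U >= k | Pi_2); increasing in k."""
    s = math.sqrt(alpha)
    return c21 * Phi((k - alpha / 2) / s) - c12 * (1.0 - Phi((k + alpha / 2) / s))

def minimax_cutoff(c21, c12, alpha, lo=-50.0, hi=50.0):
    """Bisect for the unique k equating the two misclassification risks."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid, c21, c12, alpha) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k = minimax_cutoff(c21=1.0, c12=2.0, alpha=4.0)
```

With equal costs the cutoff is k = 0 by symmetry, which serves as a quick sanity check on the routine.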

Choosing x and y so that the coefficients of θ² and θ vanish makes the payoff independent of θ, and we find that x = 3/4, y = 1/4 is the solution. While this may guarantee the statistician a constant risk of 1/16, one may wonder whether this is the best. We will show that it is best by finding a mixed strategy for mother nature that guarantees an expected payoff of 1/16 no matter what (x, y) is proposed by the statistician. Let F(θ) be the cumulative distribution function that is optimal for mother nature. Integrating the above payoff with respect to F we get

K(F, (x, y)) = ∫₀¹ (−2x + 1 + 2y) θ² dF(θ) + ∫₀¹ (x² − 2y − y²) θ dF(θ) + y² .

Let

m1 = ∫₀¹ θ dF(θ)  and  m2 = ∫₀¹ θ² dF(θ) .

In terms of m1, m2, x, y the expected payoff can be rewritten as

L((m1, m2), (x, y)) = m2(−2x + 1 + 2y) + m1(x² − 2y − y²) + y² .

For fixed values of m2, m1, the minimum of m2(−2x + 1 + 2y) + m1(x² − 2y − y²) + y² must satisfy the first order conditions

x = m2/m1  and  y = (m1 − m2)/(1 − m1) .

If m1 = 1/2 and m2 = 3/8, then x = 3/4 and y = 1/4 is optimal. In fact, a simple probability distribution with these moments can be found by choosing the point 1/2 − α with probability 1/2 and 1/2 + α with probability 1/2. Such a distribution has mean m1 = 1/2 and second moment m2 = (1/2)(1/2 − α)² + (1/2)(1/2 + α)² = 1/4 + α². When m2 = 3/8 we get α² = 1/8. Thus the optimal strategy for nature, also called the least favorable distribution, is to toss an ordinary unbiased coin and, if it is heads, select a coin which falls heads with probability (√2 − 1)/(2√2), and, if the ordinary coin toss is tails, select a coin which falls heads with probability (√2 + 1)/(2√2). For further applications of zero-sum two person games to statistical decision theory see Ferguson [23].

In general it is difficult to characterize the nature of optimal mixed strategies even for a C∞ payoff function. See [37]. Another rich source of infinite games are many simplified parlor games. We give a simplified poker model due to Borel (see the reference under Ville [65]).

Borel's Poker Model

Two risk neutral players I and II initiate a game by adding an ante of $1 each to a pot. Players I and II are dealt a hand, a random value u of U and a random value v of V respectively. Here U, V are independent random variables uniformly distributed on [0, 1].

The game begins with Player I. After seeing his hand he either folds, losing the pot to Player II, or raises by adding $1 to the pot. When Player I raises, Player II, after seeing his hand, can either fold, losing the pot to Player I, or call by adding $1 to the pot. If Player II calls, the game ends and the player with the better hand wins the entire pot.

Suppose players I and II restrict themselves to using only strategies g, h respectively, where

Player I: g(u) = fold if u ≤ x ; raise if u > x .
Player II: h(v) = fold if v ≤ y ; raise if v > y .

The computational details of the expected payoff K(x, y), based on the induced partitions of the (u, v) space, are given below in Table 3. The payoff K(x, y) is simply the sum of each area times the local payoff, giving

K(x, y) = 2y² − 3xy + x − y  when x < y .

Zero-Sum Two Person Games, Figure 5: Poker partition when x < y
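Returning to Example 33, the constancy of the equalizer risk and the optimality of the least favorable distribution can be checked numerically. A minimal sketch (the grid sizes and tolerances are arbitrary choices, not from the article):

```python
import math

def risk(theta, x=0.75, y=0.25):
    """Expected squared-error loss of estimating theta by x on heads, y on tails."""
    return theta * (theta - x) ** 2 + (1 - theta) * (theta - y) ** 2

# The equalizer (x, y) = (3/4, 1/4) has constant risk 1/16 in theta.
risks = [risk(k / 100) for k in range(101)]

# Least favorable prior: mass 1/2 on each of 1/2 -/+ 1/(2*sqrt(2)).
lo_pt = 0.5 - 1 / (2 * math.sqrt(2))
hi_pt = 0.5 + 1 / (2 * math.sqrt(2))

def bayes_risk(x, y):
    return 0.5 * risk(lo_pt, x, y) + 0.5 * risk(hi_pt, x, y)

# No pure strategy (x, y) achieves Bayes risk below 1/16 against this prior.
best = min(bayes_risk(i / 200, j / 200) for i in range(201) for j in range(201))
```

The grid minimum is attained at (x, y) = (3/4, 1/4), matching the equalizer solution.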

Zero-Sum Two Person Games, Table 3

Outcome               Action by players   Region     Area               Payoff
u ≤ x                 I drops out         A ∪ D ∪ F  x                  −1
u > x, v ≤ y          II drops out        E ∪ G      y(1 − x)           1
u < v, u > x, v > y   both raise          B          (1−y)(1−2x+y)/2    −2
u > v, u > x, v > y   both raise          C          (1−y)²/2           2

Zero-Sum Two Person Games, Table 4

Outcome               Action by players   Region         Area               Payoff
u ≤ x                 I drops out         H ∪ I ∪ J ∪ K  x                  −1
u > x, v ≤ y          II drops out        N              y(1 − x)           1
u < v, u > x, v > y   both raise          L              (1−x)²/2           −2
u > v, u > x, v > y   both raise          M              (1−x)(1+x−2y)/2    2

For x > y, the payoff K(x, y) is again the sum of each area in Table 4 times the local payoff, giving

K(x, y) = 2y² − 3xy + x − y  for x < y ,  K(x, y) = −2x² + xy + x − y  for x > y .

Also, K(x, y) is continuous on the diagonal (K(x, x) = −x²), concave in x and convex in y. By the general minimax theorem of Ky Fan or Sion, there exist pure optimal strategies x*, y* for K(x, y). We will find the value of the game by explicitly computing min_y K(x, y) for each x and then taking the maximum over x.

Let 0 < x̄ < 1. (Intuitively we see that in our search for a saddle point, x = 0 or x = 1 are quite unlikely.) Then

min_y K(x̄, y) = min{ inf_{y ≤ x̄} K(x̄, y) , inf_{y > x̄} K(x̄, y) } ,

and one checks that inf_{y ≤ x̄} K(x̄, y) = −x̄². Now for y > x̄ we have K(x̄, y) = 2y² − 3x̄y + x̄ − y. This being a convex function in y, it attains its minimum either at the boundary or at an interior point of the interval [x̄, 1]. We have K(x̄, x̄) = −x̄², K(x̄, 1) = 1 − 2x̄, or the minimum is attained at y = (3x̄ + 1)/4, the unique solution to ∂K(x̄, y)/∂y = 0. At y = (3x̄ + 1)/4 the value is K(x̄, y) = (−9x̄² + 2x̄ − 1)/8. Clearly for x̄ ≠ 1 we have (−9x̄² + 2x̄ − 1)/8 < −x̄² ≤ 1 − 2x̄. Thus min_y K(x̄, y) = (−9x̄² + 2x̄ − 1)/8. Now

max_x min_y K(x, y) = max_x (−9x² + 2x − 1)/8 = −1/9 ,

attained at x = 1/9, with minimizing y = (3·(1/9) + 1)/4 = 1/3. Thus min_y K(1/9, y) is −1/9, and a good pure strategy for II is to call when v > 1/3.

For other poker models see von Neumann and Morgenstern [67], Bellman and Blackwell [5], Binmore [9] and Ferguson and Ferguson [22], who discuss other discrete and continuous poker models.
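The saddle point just computed can be confirmed by brute force: evaluating the piecewise payoff on a grid and forming max_x min_y reproduces the value −1/9. A sketch (the grid resolution is an arbitrary choice; the x > y branch is derived from Table 4):

```python
def K(x, y):
    """Expected payoff of Borel's poker model under threshold strategies (x, y)."""
    if x < y:
        return 2 * y * y - 3 * x * y + x - y
    if x > y:
        return -2 * x * x + x * y + x - y   # from Table 4
    return -x * x                            # continuous across the diagonal

grid = [i / 300 for i in range(301)]
lower = max(min(K(x, y) for y in grid) for x in grid)   # approximates max_x min_y K
```

The maximizing grid point sits near x = 1/9, and the guaranteed payoff of x = 1/9 is exactly −1/9, in agreement with the analysis above.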

Zero-Sum Two Person Games, Figure 6: Poker partition when x > y

War Duels and Discontinuous Payoffs on the Unit Square

While general minimax theorems make some stringent continuity assumptions, these are rarely satisfied in modeling many war duels as games on the unit square. The payoffs are often discontinuous along the diagonal. The following is an example of this kind.

Example 34 Players I and II start at a distance 1 from a balloon and walk toward the balloon at the same speed. Each player carries a noisy gun which has been loaded with just one bullet. Player I's accuracy at x is p(x) and

Player II's accuracy at x is q(x), where x is the distance traveled from the starting point. Because the guns can be heard, each player knows whether or not the other player has fired. The player who fires and hits the balloon first wins the game.

Some natural assumptions are:

- The functions p, q are continuous and strictly increasing.
- p(0) = q(0) = 0 and p(1) = q(1) = 1.

If a player fires and misses, then the opponent can wait until his accuracy equals 1 and then fire. Player I decides to shoot after traveling distance x provided the opponent has not yet used his bullet; Player II decides to shoot after traveling distance y provided the opponent has not yet used his bullet. The following payoff reflects the outcome of this strategy choice:

K(x, y) = (1)p(x) + (−1)(1 − p(x)) = 2p(x) − 1   when x < y ,
K(x, y) = p(x) − q(x)                            when x = y ,
K(x, y) = (−1)q(y) + (1)(1 − q(y)) = 1 − 2q(y)   when x > y .

We claim that the players have optimal pure strategies and there is a saddle point. Consider

min_y K(x, y) = min{ 2p(x) − 1 , p(x) − q(x) , inf_{y < x} (1 − 2q(y)) } .

There is a unique solution to the equation p(x) + q(x) = 1, as both functions are strictly increasing. Let x* be that solution point. We get p(x*) + q(x*) = 1, and the value v satisfies v = p(x*) − q(x*) = 2p(x*) − 1.

Suppose the payoff has the following structure:

α if I wins ;
−β if II wins ;
γ if neither wins ;
0 if both shoot accurately and the game ends in a draw .

Depending on these values, while the value exists, one player may have an optimal pure strategy while the other may have only an epsilon-optimal pure strategy, close to but not the same as the solution of the equation p(x*) + q(x*) = 1.

Example 36 (silent duel) Players I and II have the same accuracy p(x) = q(x) = x. However, in this duel the players are deaf, so they do not know whether or not the opponent has fired. The payoff function is given by

K(x, y) = (1)x + (−1)y(1 − x) = x − y + xy   when x < y ,
K(x, y) = 0                                  when x = y ,
K(x, y) = (−1)y + (1)(1 − y)x = x − y − xy   when x > y .

This game has no saddle point. In this symmetric case, the value of the game, if it exists, must be zero, and one can directly verify that a suitable density guarantees it. For the duel in which only one of the players is deaf, let a = √6 − 2. Then the game has value v = 1 − 2a = 5 − 2√6, and an optimal strategy for Player I is given by a density which vanishes for 0 ≤ x < a.
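The firing point of the noisy duel solves p(x) + q(x) = 1 and is easily found by bisection, since p + q is strictly increasing. A sketch with illustrative accuracy functions p(x) = x and q(x) = x² (these particular p, q are not from the article):

```python
def solve_duel(p, q, tol=1e-12):
    """Find x* in [0, 1] with p(x*) + q(x*) = 1 for strictly increasing p, q."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p(mid) + q(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_star = solve_duel(lambda x: x, lambda x: x * x)
value = 2 * x_star - 1      # equals p(x*) - q(x*), since p + q = 1 at x*
```

Here x + x² = 1 gives x* = (√5 − 1)/2 ≈ 0.618, so the better shot (Player I) secures a positive value ≈ 0.236.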

There is, in addition, a point mass at x = 1: the deaf player has to maintain a sizeable suspicion until the very end!

The study of war duels is intertwined with the study of positive solutions to integral equations. The theorems of Krein and Rutman [40] on positive operators and their positive eigenfunctions are central to this analysis. For further details see [19,37,51].

Dynamic versions of zero sum two person games, where the players move among several games according to some Markovian transition law, lead to the theory of stochastic games [60]. The study of the value and the complex structure of optimal strategies of zero-sum two person stochastic games is an active area of research with applications to many dynamic models of competition [24].

While stochastic games study movement in discrete time, the parallel development of dynamic games in continuous time was initiated by Isaacs in several Rand reports culminating in his monograph on differential games [31]. Many military problems of pursuit and evasion, attrition and attack lead to games where the trajectory of a moving object is steered continuously by the actions of two players. Introducing several natural examples, Isaacs explicitly tried to solve many of these games via some heuristic principles. The minimax version of Bellman's optimality principle leads to the so-called Isaacs–Bellman equations. This highly non-linear partial differential equation on the value function plays a key role in this study.

Epilogue

The rich theory of zero sum two person games that we have discussed so far hinges on the fundamental notions of value and optimal strategies. When either the zero sum or the two person assumption is dropped, the games cease to have such well defined notions with independent standing. In trying to extend the notion of optimal strategies and the minimax theorem to bimatrix games, Nash [48] introduced the concept of an equilibrium point. Even more than this extension, it is this concept which is simply the most seminal solution concept for non-cooperative game theory. The Nash equilibrium solution is extendable to any non-cooperative N person game in both extensive and normal form. Many of the local properties of optimal strategies and the proof techniques of zero sum two person games do play a significant role in understanding Nash equilibria and their structure [15,32,39,52,53].

When a non-zero sum game is played repeatedly, the players can peg their future actions on past history. This leads to a rich theory of equilibria for repeated games [62]. Some of them, like the model of the Prisoner's dilemma, tacitly impose actual cooperation at equilibrium among otherwise non-cooperative players. This was first recognized by Schelling [58], and was later formalized by the so-called folk theorem for repeated games (Aumann and Shapley [1986], [3]). Indeed, that the Nash equilibrium per se is a weak solution concept is also one of the main messages of the folk theorem for repeated games. With a plethora of equilibria, it fails to have an appeal without further refinements. It was Selten who initiated the need for refining equilibria and came up with the notion of subgame perfect equilibria [59]. It turns out that subgame perfect equilibria are the natural solutions for many problems in sequential bargaining [57]. Often one searches for specific types of equilibria, like symmetric equilibria, or Bayesian equilibria for games with incomplete information [30]. Auctions as games exhibit this diversity of equilibria, and Harsanyi's Bayesian equilibria turn out to be the most appropriate solution concept for this class of games [47].

Zero sum two person games and their solutions will continue to inspire researchers in all aspects of non-cooperative game theory.

Acknowledgment

The author wishes to acknowledge the unknown referee's detailed comments in the revision of the first draft. More importantly, he drew the author's attention to the topic of search games and other combinatorial games. The author would like to thank Ms. Patricia Collins for her assistance in her detailed editing of the first draft of this manuscript. The author owes special thanks to Mr. Ramanujan Raghavan and Dr. A.V. Lakshmi Narayanan for their help in incorporating the graphics drawings into the LaTeX file.

Bibliography

1. Alpern S, Gal S (2003) The Theory of Search Games and Rendezvous. Springer
2. Aumann RJ (1981) Survey of repeated games. In: Essays in Game Theory and Mathematical Economics in Honor of Oskar Morgenstern. Bibliographisches Institut, Mannheim, pp 11–42
3. Axelrod R, Hamilton WD (1981) The evolution of cooperation. Science 211:1390–1396
4. Bapat RB, Raghavan TES (1997) Nonnegative Matrices and Applications. Encyclopedia of Mathematics. Cambridge University Press, Cambridge
5. Bellman R, Blackwell D (1949) Some two person games involving bluffing. Proc Natl Acad Sci USA 35:600–605
6. Berge C (1963) Topological Spaces. Oliver & Boyd, Edinburgh
7. Berger U (2007) Brown's original fictitious play. J Econ Theory 135:572–578
8. Berlekamp ER, Conway JH, Guy RK (1982) Winning Ways for Your Mathematical Plays, vols 1, 2. Academic Press, New York

9. Binmore K (1992) Fun and Games: A Text on Game Theory. DC Heath, Lexington
10. Blackwell D (1951) On a theorem of Lyapunov. Ann Math Stat 22:112–114
11. Blackwell D (1961) Minimax and irreducible matrices. J Math Anal Appl 3:37–39
12. Blackwell D, Girshick GA (1954) Theory of Games and Statistical Decisions. Wiley, New York
13. Bouton CL (1902) Nim, a game with a complete mathematical theory. Ann Math 3(2):35–39
14. Brown GW (1951) Iterative solution of games by fictitious play. In: Koopmans TC (ed) Activity Analysis of Production and Allocation. Wiley, New York, pp 374–376
15. Bubelis V (1979) On equilibria in finite games. Int J Game Theory 8:65–79
16. Chin H, Parthasarathy T, Raghavan TES (1973) Structure of equilibria in N-person noncooperative games. Int J Game Theory 3:1–19
17. Conway JH (1982) On Numbers and Games, Monograph 16. London Mathematical Society, London
18. Dantzig GB (1951) A proof of the equivalence of the programming problem and the game problem. In: Koopmans TC (ed) Activity Analysis of Production and Allocation, Cowles Commission Monograph 13. Wiley, New York, pp 333–335
19. Dresher M (1962) Games of Strategy. Prentice Hall, Englewood Cliffs
20. Dvoretzky A, Wald A, Wolfowitz J (1951) Elimination of randomization in certain statistical decision problems and zero-sum two-person games. Ann Math Stat 22:1–21
21. Fan K (1953) Minimax theorems. Proc Natl Acad Sci USA 39:42–47
22. Ferguson C, Ferguson TS (2003) On the Borel and von Neumann poker models. Game Theory Appl 9:17–32
23. Ferguson TS (1967) Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York
24. Filar JA, Vrieze OJ (1996) Competitive Markov Decision Processes. Springer, Berlin
25. Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7:179–188
26. Fourier JB (1890) Second extrait. In: Darboux G (ed) Oeuvres. Gauthier-Villars, Paris, pp 325–328; English translation: Kohler DA (1973)
27. Gal S (1980) Search Games. Academic Press, New York
28. Gale D (1979) The game of Hex and the Brouwer fixed-point theorem. Am Math Mon 86:818–827
29. Gale D, Kuhn HW, Tucker AW (1951) Linear programming and the theory of games. In: Activity Analysis of Production and Allocation. Wiley, New York, pp 317–329
30. Harsanyi JC (1967) Games with incomplete information played by Bayesian players, Parts I, II, and III. Manag Sci 14:159–182, 320–334, 486–502
31. Isaacs R (1965) Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. Wiley, New York; Dover paperback edition, 1999
32. Jansen MJM (1981) Regularity and stability of equilibrium points of bimatrix games. Math Oper Res 6:530–550
33. Johnson RA, Wichern DW (2007) Applied Multivariate Statistical Analysis, 6th edn. Prentice Hall, New York
34. Kakutani S (1941) A generalization of Brouwer's fixed point theorem. Duke Math J 8:457–459
35. Kantorowich LV (1960) Mathematical methods of organizing and planning production. Manag Sci 7:366–422
36. Kaplansky I (1945) A contribution to von Neumann's theory of games. Ann Math 46:474–479
37. Karlin S (1959) Mathematical Methods and Theory in Games, Programming and Economics, vols 1, 2. Addison Wesley, New York
38. Kohler DA (1973) Translation of a report by Fourier on his work on linear inequalities. Opsearch 10:38–42
39. Kreps VL (1974) Bimatrix games with unique equilibrium points. Int J Game Theory 3:115–118
40. Krein MG, Rutman MA (1950) Linear operators leaving invariant a cone in a Banach space. Amer Math Soc Transl 26:1–128
41. Krishna V, Sjostrom T (1998) On the convergence of fictitious play. Math Oper Res 23:479–511
42. Kuhn HW (1953) Extensive games and the problem of information. In: Contributions to the Theory of Games. Ann Math Stud 28:193–216
43. Lindenstrauss J (1966) A short proof of Liapounoff's convexity theorem. J Math Mech 15:971–972
44. Loomis LH (1946) On a theorem of von Neumann. Proc Natl Acad Sci USA 32:213–215
45. Miyazawa K (1961) On the convergence of the learning process in a 2 × 2 non-zero-sum two-person game. Econometric Research Program, Research Memorandum No. 33. Princeton University, Princeton
46. Monderer D, Shapley LS (1996) Potential games. Games Econ Behav 14:124–143
47. Myerson R (1991) Game Theory: Analysis of Conflict. Harvard University Press, Cambridge
48. Nash JF (1950) Equilibrium points in n-person games. Proc Natl Acad Sci USA 36:48–49
49. Owen G (1985) Game Theory, 2nd edn. Academic Press, New York
50. Parthasarathy T, Raghavan TES (1971) Some Topics in Two-Person Games. Elsevier, New York
51. Radzik T (1988) Games of timing related to distribution of resources. J Optim Theory Appl 58:443–471, 473–500
52. Raghavan TES (1970) Completely mixed strategies in bimatrix games. J Lond Math Soc 2:709–712
53. Raghavan TES (1973) Some geometric consequences of a game theoretic result. J Math Anal Appl 43:26–30
54. Rao CR (1952) Advanced Statistical Methods in Biometric Research. Wiley, New York
55. Reijnierse JH, Potters JAM (1993) Search games with immobile hider. Int J Game Theory 21:385–394
56. Robinson J (1951) An iterative method of solving a game. Ann Math 54:296–301
57. Rubinstein A (1982) Perfect equilibrium in a bargaining model. Econometrica 50:97–109
58. Schelling TC (1960) The Strategy of Conflict. Harvard University Press, Cambridge, Mass
59. Selten R (1975) Reexamination of the perfectness concept for equilibrium points in extensive games. Int J Game Theory 4:25–55
60. Shapley LS (1953) Stochastic games. Proc Natl Acad Sci USA 39:1095–1100
61. Sion M (1958) On general minimax theorems. Pac J Math 8:171–176
62. Sorin S (1992) Repeated games with complete information. In: Aumann RJ, Hart S (eds) Handbook of Game Theory, vol 1, chapter 4. North Holland, Amsterdam, pp 71–103
63. von Stengel B (1996) Efficient computation of behavior strategies. Games Econ Behav 14:220–246
64. Thuijsman F, Raghavan TES (1997) Stochastic games with switching control or ARAT structure. Int J Game Theory 26:403–408
65. Ville J (1938) Note sur la théorie générale des jeux où intervient l'habilité des joueurs. In: Borel E, Ville J (eds) Applications aux jeux de hasard, tome IV, fascicule II of the Traité du calcul des probabilités et de ses applications, by Émile Borel
66. von Neumann J (1928) Zur Theorie der Gesellschaftsspiele. Math Ann 100:295–320
67. von Neumann J, Morgenstern O (1947) Theory of Games and Economic Behavior, 2nd edn. Princeton University Press, Princeton
68. Wald A (1950) Statistical Decision Functions. Wiley, New York
69. Weyl H (1950) Elementary proof of a minimax theorem due to von Neumann. Ann Math Stud 24:19–25
70. Zermelo E (1913) Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels. Proc Fifth Congress Mathematicians. Cambridge University Press, Cambridge, pp 501–504