
Zero-Sum Games

Game Theory 2021

Ulle Endriss
Institute for Logic, Language and Computation
University of Amsterdam


Plan for Today

Today we are going to focus on the special case of zero-sum games and discuss two positive results that do not hold for games in general.
• new solution concepts: maximin and minimax strategies
• Minimax Theorem: maximin = minimax = NE for zero-sum games
• fictitious play: basic model for learning in games
• convergence result for the case of zero-sum games
The first part of this is also covered in Chapter 3 of the Essentials.

K. Leyton-Brown and Y. Shoham. Essentials of Game Theory: A Concise, Multidisciplinary Introduction. Morgan & Claypool Publishers, 2008. Chapter 3.


Zero-Sum Games

Today we focus on two-player games ⟨N, A, u⟩ with N = {1, 2}.
Notation: Given player i ∈ {1, 2}, we refer to her opponent as −i.
Recall: A zero-sum game is a two-player normal-form game ⟨N, A, u⟩ for which u_i(a) + u_{−i}(a) = 0 for all action profiles a ∈ A.
Examples include (but are not restricted to) games in which you can win (+1), lose (−1), or draw (0), such as Matching Pennies (the lefthand game below):

        H         T                   L         R
H   (1, −1)   (−1, 1)        T   (5, −5)   (−3, 3)
T   (−1, 1)   (1, −1)        B   (0, 0)    (2, −2)

(Payoffs written as pairs: row player's payoff first, column player's second.)


Constant-Sum Games

A constant-sum game is a two-player normal-form game ⟨N, A, u⟩ for which there exists a c ∈ ℝ such that u_i(a) + u_{−i}(a) = c for all a ∈ A.
Thus: A zero-sum game is a constant-sum game with constant c = 0.
Everything about zero-sum games to be discussed today also applies to constant-sum games, but for simplicity we only talk about the former.

Fun Fact: Football is not a constant-sum game, as you get 3 points for a win, 0 for a loss, and 1 for a draw. But prior to 1994, when the “three-points-for-a-win” rule was introduced, World Cup games were constant-sum (with 2, 0, 1 points, for win, loss, draw, respectively).


Maximin Strategies

The definitions on this slide apply to arbitrary normal-form games . . .
Suppose player i wants to maximise her worst-case expected utility (e.g., if all others conspire against her). Then she should play:

    s_i⋆ ∈ argmax_{s_i ∈ S_i} min_{s_{−i} ∈ S_{−i}} u_i(s_i, s_{−i})

Any such s_i⋆ is called a maximin strategy (usually there is just one).
Maximin solution: assume each player will play a maximin strategy.

Call max_{s_i} min_{s_{−i}} u_i(s_i, s_{−i}) player i's maximin value (or security level).
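In a 2×2 game, a mixed strategy is determined by a single probability, so the maximin strategy can be approximated by a simple grid search. The following Python sketch (my own illustrative helper names; a coarse approximation, not an exact computation) finds the row player's maximin strategy in Matching Pennies:

```python
# Grid search for the row player's maximin strategy in Matching Pennies.
# A mixed strategy is a single number p = probability of playing H.

def worst_case_utility(p):
    """Row player's expected utility against the opponent's worst response."""
    u_vs_H = p * 1 + (1 - p) * (-1)   # opponent plays H
    u_vs_T = p * (-1) + (1 - p) * 1   # opponent plays T
    return min(u_vs_H, u_vs_T)

# Search over a grid of candidate probabilities.
grid = [i / 1000 for i in range(1001)]
best_p = max(grid, key=worst_case_utility)

print(best_p, worst_case_utility(best_p))  # 0.5 0.0
```

The maximin strategy is to play H with probability 1/2, for a security level of 0, which matches the intuition that randomising uniformly is the only way to avoid exploitation in Matching Pennies.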


Exercise: Maximin and Nash

Consider the following two-player game:

        L        R
T   (8, 2)   (0, 0)
B   (0, 0)   (8, 4)

What is the maximin solution? How does this relate to Nash equilibria?

Note: This is neither a zero-sum nor a constant-sum game.


Exercise: Maximin and Nash Again

Now consider this fairly similar game, which is zero-sum:

        L         R
T   (8, −8)   (0, 0)
B   (0, 0)    (8, −8)

What is the maximin solution? How does this relate to Nash equilibria?


Minimax Strategies

Now focus on two-player games only, with players i and −i . . .
Suppose player i wants to minimise −i's best-case expected utility (e.g., to punish her). Then i should play:

    s_i⋆ ∈ argmin_{s_i ∈ S_i} max_{s_{−i} ∈ S_{−i}} u_{−i}(s_i, s_{−i})

Remark: For a zero-sum game, an alternative interpretation is that player i has to play first and her opponent −i can respond.

Any such s_i⋆ is called a minimax strategy (usually there is just one).

Call min_{s_i} max_{s_{−i}} u_{−i}(s_i, s_{−i}) player −i's minimax value.

So, swapping the roles of the players, i's own minimax value is min_{s_{−i}} max_{s_i} u_i(s_i, s_{−i}).


Equivalence of Maximin and Minimax Values

Recall: For two-player games, we have seen the following definitions.
• Player i's maximin value is max_{s_i} min_{s_{−i}} u_i(s_i, s_{−i}).
• Player i's minimax value is min_{s_{−i}} max_{s_i} u_i(s_i, s_{−i}).

Lemma 1 In a two-player game, maximin and minimax value coincide:

    max_{s_i} min_{s_{−i}} u_i(s_i, s_{−i}) = min_{s_{−i}} max_{s_i} u_i(s_i, s_{−i})

We omit the proof. For the case of two actions per player, there is a helpful visualisation in the Essentials. Note that one direction is easy:
(⩽) LHS is what i can achieve when she has to move first, while RHS is what i can achieve when she can move second. ✓

Remark: The lemma does not hold if we quantify over actions rather than strategies (counterexample: Matching Pennies).
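The remark can be checked directly. A minimal Python sketch (not part of the original slides) computing both values over pure actions for Matching Pennies:

```python
# Over pure actions, maximin and minimax differ in Matching Pennies.
# u1[a1][a2] = row player's payoff; actions 0 = H, 1 = T.
u1 = [[1, -1],
      [-1, 1]]

# Maximin over actions: commit to an action first, opponent responds.
maximin = max(min(row) for row in u1)

# Minimax over actions: opponent commits first, we respond.
minimax = min(max(u1[a1][a2] for a1 in range(2)) for a2 in range(2))

print(maximin, minimax)  # -1 1
```

Moving first costs a whole point either way; only with mixed strategies does the gap between −1 and +1 close (to 0, as the lemma guarantees).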


The Minimax Theorem

Recall: A zero-sum game is a two-player game with u_i(a) + u_{−i}(a) = 0.

Theorem 2 (Von Neumann, 1928) In a zero-sum game, a strategy profile is a NE iff each player's expected utility equals her minimax value.

Proof: Let v_i be the minimax/maximin value of player i (and v_{−i} = −v_i that of player −i).
(1) Suppose u_i(s_i, s_{−i}) ≠ v_i. Then one player does worse than she could (note that here we use the zero-sum property!). So (s_i, s_{−i}) is not a NE. ✓
(2) Suppose u_i(s_i, s_{−i}) = v_i. Then one player having a better response would mean the other's security level is actually lower. So (s_i, s_{−i}) is a NE. ✓

John von Neumann (1903–1957)

J. von Neumann. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100(1):295–320, 1928.


Learning in Games

Suppose you keep playing the same game against the same opponents. You might try to learn their strategies. A good hypothesis might be that the frequency with which player i plays action a_i is approximately her probability of playing a_i.
Now suppose you always best-respond to those hypothesised strategies. And suppose everyone else does the same. What will happen?
We are going to see that for zero-sum games this process converges to a NE. This yields a method for computing a NE for the (non-repeated) game: just imagine players engage in such "fictitious play".


Empirical Mixed Strategies

Given a history of actions H_i^ℓ = a_i^0, a_i^1, . . . , a_i^{ℓ−1} played by player i in ℓ prior plays of game ⟨N, A, u⟩, fix her empirical mixed strategy s_i^ℓ ∈ S_i:

    s_i^ℓ(a_i) = (1/ℓ) · #{k < ℓ | a_i^k = a_i}   for all a_i ∈ A_i

i.e., s_i^ℓ(a_i) is the relative frequency of a_i in H_i^ℓ.
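The definition translates directly into code. A minimal Python sketch (the function name is mine):

```python
from collections import Counter

def empirical_strategy(history):
    """Empirical mixed strategy: relative frequency of each action.

    `history` is the list [a^0, ..., a^(l-1)] of actions played so far.
    Actions never played simply get no entry (probability 0).
    """
    counts = Counter(history)
    l = len(history)
    return {action: count / l for action, count in counts.items()}

# Player i played H, H, T in the first three rounds:
print(empirical_strategy(["H", "H", "T"]))  # H with prob. 2/3, T with prob. 1/3
```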


Best Pure Responses

Recall: Strategy s_i⋆ ∈ S_i is a best response for player i to the (partial) strategy profile s_{−i} if u_i(s_i⋆, s_{−i}) ⩾ u_i(s_i′, s_{−i}) for all s_i′ ∈ S_i.
Due to expected utilities being convex combinations of plain utilities:

Observation 3 For any given (partial) strategy profile s_{−i}, the set of best responses for player i must include at least one pure strategy.

So we can restrict attention to best pure responses for player i to s_{−i}:

    a_i⋆ ∈ argmax_{a_i ∈ A_i} u_i(a_i, s_{−i})
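Computing such a best pure response just means comparing expected utilities across the finitely many actions. A Python sketch for the two-player case (helper names are illustrative):

```python
# Best pure response of player i to a mixed opponent strategy
# (by Observation 3, some pure strategy is always among the best responses).

def best_pure_response(u_i, s_minus_i):
    """u_i[a_i][a] = player i's utility; s_minus_i maps opponent actions to probs."""
    def expected_utility(a_i):
        return sum(p * u_i[a_i][a] for a, p in s_minus_i.items())
    return max(range(len(u_i)), key=expected_utility)

# Matching Pennies for the row player (actions 0 = H, 1 = T):
u1 = [[1, -1],
      [-1, 1]]
# Opponent plays H with probability 0.3, so matching favours T:
print(best_pure_response(u1, {0: 0.3, 1: 0.7}))  # 1 (play T)
```

Note that `max` breaks ties in favour of the lowest action index, which is one concrete instance of the deterministic tie-breaking assumed for fictitious play below.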


Fictitious Play

Take any action profile a^0 ∈ A for the normal-form game ⟨N, A, u⟩.
Fictitious play of ⟨N, A, u⟩, starting in a^0, is the following process:
• In round ℓ = 0, each player i ∈ N plays action a_i^0.
• In any round ℓ > 0, each player i ∈ N plays a best pure response to her opponents' empirical mixed strategies:

    a_i^ℓ ∈ argmax_{a_i ∈ A_i} u_i(a_i, s_{−i}^ℓ), where
    s_{i′}^ℓ(a_{i′}) = (1/ℓ) · #{k < ℓ | a_{i′}^k = a_{i′}}   for all i′ ∈ N and a_{i′} ∈ A_{i′}

Assume some deterministic way of breaking ties between maxima.
This yields a sequence a^0 ⇝ a^1 ⇝ a^2 ⇝ · · · with a corresponding sequence of empirical-mixed-strategy profiles s^0 ⇝ s^1 ⇝ s^2 ⇝ · · ·
Question: Does lim_{ℓ→∞} s^ℓ exist and is it a meaningful strategy profile?


Example: Matching Pennies

Let's see what happens when we start in the upper lefthand corner, i.e., in the action profile HH (and break ties between equally good responses in favour of H):

        H         T
H   (1, −1)   (−1, 1)
T   (−1, 1)   (1, −1)

Any strategy can be represented by a single probability (of playing H).

HH (1/1, 1/1) ⇝ HT (2/2, 1/2) ⇝ HT (3/3, 1/3) ⇝ TT (3/4, 1/4) ⇝ TT (3/5, 1/5)
⇝ TT (3/6, 1/6) ⇝ TH (3/7, 2/7) ⇝ TH (3/8, 3/8) ⇝ TH (3/9, 4/9)
⇝ TH (3/10, 5/10) ⇝ HH (4/11, 6/11) ⇝ HH (5/12, 7/12) ⇝ · · ·

Exercise: Can you guess what this will converge to?
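One way to check your guess is to simulate the process. A minimal Python sketch of fictitious play for Matching Pennies (my own implementation, with the same start in HH and ties broken in favour of H):

```python
# Fictitious play for Matching Pennies, starting in (H, H).
H, T = 0, 1
u1 = [[1, -1], [-1, 1]]   # row player wants to match
u2 = [[-1, 1], [1, -1]]   # column player wants to mismatch

def best_response(u, q):
    """Best pure response when the opponent plays H with probability q (ties -> H)."""
    eu_H = q * u[H][H] + (1 - q) * u[H][T]
    eu_T = q * u[T][H] + (1 - q) * u[T][T]
    return H if eu_H >= eu_T else T

counts = [[0, 0], [0, 0]]          # counts[i][a] = times player i played action a
a1, a2 = H, H                      # round 0: upper lefthand corner
for round_no in range(10000):
    counts[0][a1] += 1
    counts[1][a2] += 1
    q1 = counts[0][H] / (round_no + 1)   # empirical prob. that player 1 plays H
    q2 = counts[1][H] / (round_no + 1)
    a1, a2 = best_response(u1, q2), best_response(u2, q1)

print(q1, q2)  # both close to 0.5
```

Running this reproduces the trace above in the early rounds, and the empirical frequencies of H drift towards (1/2, 1/2), the mixed Nash equilibrium of Matching Pennies.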


Convergence Profiles are Nash Equilibria

In general, lim_{ℓ→∞} s^ℓ does not exist (no guaranteed convergence). But:

Lemma 4 If fictitious play converges, then it converges to a Nash equilibrium.

Proof: Suppose s⋆ = lim_{ℓ→∞} s^ℓ exists. To see that s⋆ is a NE, note that s_i⋆ is the strategy that i seems to play when she best-responds to s_{−i}⋆, which she believes to be the profile of strategies of her opponents. ✓

Remark: This lemma is true for arbitrary (not just zero-sum) games.


Convergence for Zero-Sum Games

Good news:

Theorem 5 (Robinson, 1951) For any zero-sum game and initial action profile, fictitious play will converge to a Nash equilibrium.

We know that if FP converges, then to a NE. Thus, we still have to show that it will converge. The proof of this fact is difficult and we are not going to discuss it here.

Julia Robinson (1919–1985)

J. Robinson. An Iterative Method of Solving a Game. Annals of Mathematics, 54(2):296–301, 1951.


Summary

We have seen that zero-sum games are particularly well-behaved:
• Minimax Theorem: your expected utility in a Nash equilibrium will simply be your minimax/maximin value
• Convergence of fictitious play: if each player keeps responding to their opponent's estimated strategy based on observed frequencies, these estimates will converge to a Nash equilibrium
Both results give rise to alternative methods for computing a NE.

What next? Players who have incomplete information (are uncertain) about certain aspects of the game, such as their opponents’ utilities.
