<<

Stat155 Game Theory Lecture 11: Midterm 1 Review

Peter Bartlett

September 29, 2016

Midterm 1

“This is an open book exam: you can use any printed or written material, but you cannot use a laptop, tablet, or phone (or any device that can communicate). There are three questions, each consisting of three parts. Each part of each question carries equal weight. Answer each question in the space provided.”

Topics

Combinatorial games
  Positions, moves, terminal positions, impartial/partisan, progressively bounded
  Progressively bounded impartial and partisan games
  The sets N and P
  Theorem: Someone can win
  Examples: Subtraction, Chomp, Nim, Rims, Staircase Nim, Hex
Zero sum games
  Payoff matrices, pure and mixed strategies, safety strategies
  Von Neumann’s minimax theorem
  Solving two player zero-sum games
    Saddle points
    Equalizing strategies
    Solving 2 × 2 games
    Dominated strategies
    2 × n and m × 2 games
    Principle of indifference
    Symmetry: Invariance under permutations

Definitions: Combinatorial games

A combinatorial game has:
  Two players, Player I and Player II.
  A set of positions $X$.
  For each player, a set of legal moves between positions, that is, a set of ordered pairs (current position, next position): $M_I, M_{II} \subset X \times X$.
Players alternately choose moves.
Play continues until some player cannot move.
Normal play: the player that cannot move loses the game.

Definitions: Combinatorial games

Terminology:
  An impartial game has the same set of legal moves for both players: $M_I = M_{II}$.
  A partisan game has different sets of legal moves for the players.
  A terminal position for a player has no legal move to another position: $x$ is terminal for Player I if there is no $y \in X$ with $(x, y) \in M_I$.
  A combinatorial game is progressively bounded if, for every starting position $x_0 \in X$, there is a finite bound on the number of moves before the game ends. (That is, if $B(x)$ denotes the maximum number of moves before the game ends when starting from $x$, then $B(x) < \infty$ for every $x \in X$.)
  A winning strategy for a player from position $x$: a mapping from non-terminal positions to legal moves that is guaranteed to result in a win for that player from that position.

Impartial combinatorial games and winning strategies

Theorem
In a progressively bounded impartial combinatorial game under normal play, $X = N \cup P$. That is, from any initial position, one of the players has a winning strategy.
Proof: induction on the number of moves until the end.
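The induction behind this theorem can be run as a recursion: a position is in N (the next player to move can force a win) if some move leads to a position in P, and it is in P (the previous player can force a win) otherwise. The sketch below is only an illustration; the function names and the choice of a subtraction game with moves {1, 2, 3} are assumptions, not part of the slides.

```python
from functools import lru_cache


def classify(position, moves):
    """Label a position 'N' or 'P' in an impartial game under normal play.

    `moves(pos)` must return the positions reachable in one move.
    Positions must be hashable, and the game must be progressively
    bounded so that the recursion terminates.
    """
    @lru_cache(maxsize=None)
    def is_n(pos):
        # N: some move leads to a P-position. A terminal position has no
        # moves, so any() is False there and the position is labelled P.
        return any(not is_n(nxt) for nxt in moves(pos))

    return "N" if is_n(position) else "P"


# Hypothetical example: a subtraction game where a move removes 1, 2 or 3 chips.
def subtraction_moves(n, allowed=(1, 2, 3)):
    return tuple(n - s for s in allowed if n - s >= 0)


labels = [classify(n, subtraction_moves) for n in range(9)]
```

For this particular subtraction set, the P-positions are exactly the multiples of 4.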

Key Ideas: Progressively bounded impartial games

P: Every move leads to N.
N: Some move leads to P (hence N cannot contain terminal positions).

Example: Nim

k piles of chips. Remove some (positive) number of chips from some pile. A player wins when they take the last chip.

Bouton’s Theorem
A Nim position $(x_1, \ldots, x_k)$ is in P iff the Nim-sum of its components is 0.

The Nim-sum of $x = (x_1, \ldots, x_k)$ is written $x_1 \oplus x_2 \oplus \cdots \oplus x_k$. The binary representation of the Nim-sum is the bitwise sum, in modulo-two arithmetic, of the binary representations of the components of $x$.

Proof: Show that $Z := \{(x_1, \ldots, x_k) : x_1 \oplus \cdots \oplus x_k = 0\}$ is P: every move from $Z$ must lead to $Z^c$; from $Z^c$, there is a move into $Z$.
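The Nim-sum is the bitwise XOR of the pile sizes, so the second half of the proof (from a position with nonzero Nim-sum there is a move to a position with Nim-sum zero) can be made concrete. The following is a minimal sketch; the function names are illustrative assumptions.

```python
from functools import reduce
from operator import xor


def nim_sum(piles):
    """Bitwise XOR (sum modulo two, bit by bit) of the pile sizes."""
    return reduce(xor, piles, 0)


def winning_move(piles):
    """Return (pile_index, new_size) leading to Nim-sum zero, or None.

    None means the position already has Nim-sum zero (it is in P),
    so every available move leads to a position with nonzero Nim-sum.
    """
    s = nim_sum(piles)
    if s == 0:
        return None
    for i, x in enumerate(piles):
        target = x ^ s  # size that would make the total Nim-sum zero
        if target < x:  # legal only if chips are removed from the pile
            return i, target
    return None  # not reached for non-negative pile sizes


move = winning_move((3, 4, 5))
```

For piles (3, 4, 5) the Nim-sum is 2, and reducing the first pile from 3 to 1 gives (1, 4, 5), whose Nim-sum is 0.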

Partisan Games

Recall:
  An impartial game has the same set of legal moves for both players: $M_I = M_{II}$.
  A partisan game has different sets of legal moves for the players.

Theorem
Consider a progressively bounded partisan combinatorial game under normal play, with no ties allowed. From any initial position, one of the players has a winning strategy.

Two-player zero-sum games

Definitions
  Player I has m actions, 1, 2, ..., m.
  Player II has n actions, 1, 2, ..., n.
  The payoff matrix $A \in \mathbb{R}^{m \times n}$ represents the payoff to Player I:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$

  If Player I chooses i and Player II chooses j, the payoff to Player I is $a_{ij}$ and the payoff to Player II is $-a_{ij}$.
  The sum of the payoff to Player I and the payoff to Player II is 0.

Definitions
  A mixed strategy is a probability distribution over actions.
  A mixed strategy for Player I is a vector

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} \in \Delta_m := \left\{ x \in \mathbb{R}^m : x_i \ge 0, \ \sum_{i=1}^m x_i = 1 \right\}.$$

Two-player zero-sum games

A mixed strategy for Player II is a vector

$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \in \Delta_n := \left\{ y \in \mathbb{R}^n : y_i \ge 0, \ \sum_{i=1}^n y_i = 1 \right\}.$$

A pure strategy is a mixed strategy where one entry is 1 and the others 0. (This is a canonical basis vector $e_i$.)

The expected payoff to Player I when Player I plays mixed strategy $x \in \Delta_m$ and Player II plays mixed strategy $y \in \Delta_n$ is

$$\mathbb{E}_{I \sim x} \mathbb{E}_{J \sim y}\, a_{IJ} = \sum_{i=1}^m \sum_{j=1}^n x_i a_{ij} y_j = x^\top A y = \begin{pmatrix} x_1 & x_2 & \cdots & x_m \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$
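The double sum $\sum_i \sum_j x_i a_{ij} y_j$ can be evaluated directly. The sketch below uses exact fractions; the 2 × 2 matrix and the two mixed strategies are made-up values chosen only for illustration.

```python
from fractions import Fraction as F


def expected_payoff(x, A, y):
    """Expected payoff x^T A y to Player I for mixed strategies x and y."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))


# Hypothetical payoff matrix and mixed strategies (not from the slides).
A = [[2, -1],
     [-1, 1]]
x = [F(1, 2), F(1, 2)]
y = [F(1, 4), F(3, 4)]

payoff = expected_payoff(x, A, y)

# Pure strategies are basis vectors: e_1 against e_2 picks out the entry a_12.
a_12 = expected_payoff([1, 0], A, [0, 1])
```

Here `payoff` evaluates to 1/8, and the pure-strategy call returns the matrix entry $a_{12} = -1$.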

Two-player zero-sum games

A safety strategy for Player I is an $x^* \in \Delta_m$ that satisfies

$$\min_{y \in \Delta_n} x^{*\top} A y = \max_{x \in \Delta_m} \min_{y \in \Delta_n} x^\top A y.$$

This mixed strategy maximizes the worst case expected gain for Player I.

Similarly, a safety strategy for Player II is a $y^* \in \Delta_n$ that satisfies

$$\max_{x \in \Delta_m} x^\top A y^* = \min_{y \in \Delta_n} \max_{x \in \Delta_m} x^\top A y.$$

This mixed strategy minimizes the worst case expected loss for Player II.

Von Neumann’s Minimax Theorem
For any two-person zero-sum game with payoff matrix $A \in \mathbb{R}^{m \times n}$,

$$\max_{x \in \Delta_m} \min_{y \in \Delta_n} x^\top A y = \min_{y \in \Delta_n} \max_{x \in \Delta_m} x^\top A y.$$

We call the optimal expected payoff the value of the game,

$$V := \max_{x \in \Delta_m} \min_{y \in \Delta_n} x^\top A y = \min_{y \in \Delta_n} \max_{x \in \Delta_m} x^\top A y.$$

LHS: Player I plays $x \in \Delta_m$ first, then Player II responds with $y \in \Delta_n$.
RHS: Player II plays $y \in \Delta_n$ first, then Player I responds with $x \in \Delta_m$.
Notice that we should always prefer to play last:

$$\max_{x \in \Delta_m} \min_{y \in \Delta_n} x^\top A y \le \min_{y \in \Delta_n} \max_{x \in \Delta_m} x^\top A y.$$

The astonishing part is that it does not help.

Safety strategies are optimal strategies:
For safety strategies $x^*$ for Player I and $y^*$ for Player II,

$$\min_{y \in \Delta_n} x^{*\top} A y = \max_{x \in \Delta_m} x^\top A y^* = x^{*\top} A y^* = V.$$
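For a fixed $x$, the map $y \mapsto x^\top A y$ is linear, so its minimum over $\Delta_n$ is attained at a pure strategy, and the worst case gain of $x$ is the smallest entry of $x^\top A$. The same holds for Player II with the largest entry of $A y$. The sketch below checks the identity for safety strategies on a made-up 2 × 2 game; the matrix and the candidate strategies are assumptions for illustration only.

```python
from fractions import Fraction as F


def worst_case_gain(A, x):
    """min over y in Delta_n of x^T A y, attained at some column of A."""
    m, n = len(A), len(A[0])
    return min(sum(x[i] * A[i][j] for i in range(m)) for j in range(n))


def worst_case_loss(A, y):
    """max over x in Delta_m of x^T A y, attained at some row of A."""
    m, n = len(A), len(A[0])
    return max(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))


# Hypothetical game with no saddle point.
A = [[2, -1],
     [-1, 1]]
x_star = [F(2, 5), F(3, 5)]  # candidate safety strategy for Player I
y_star = [F(2, 5), F(3, 5)]  # candidate safety strategy for Player II

gain = worst_case_gain(A, x_star)
loss = worst_case_loss(A, y_star)
uniform_gain = worst_case_gain(A, [F(1, 2), F(1, 2)])
```

Since `gain` and `loss` agree (both are 1/5), the two candidates are optimal and the value of this game is 1/5. The uniform strategy guarantees only 0, which is less than the value.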

Saddle points

Definition
A pair $(i^*, j^*) \in \{1, \ldots, m\} \times \{1, \ldots, n\}$ is a saddle point for a payoff matrix $A \in \mathbb{R}^{m \times n}$ if

$$\max_i a_{i j^*} = a_{i^* j^*} = \min_j a_{i^* j}.$$

If Player I plays $i^*$ and Player II plays $j^*$, neither player has an incentive to change.
Think of these as locally optimal strategies for the players.
They are also globally optimal strategies.

Theorem
If $(i^*, j^*)$ is a saddle point for a payoff matrix $A \in \mathbb{R}^{m \times n}$, then
  $e_{i^*}$ is an optimal strategy for Player I,
  $e_{j^*}$ is an optimal strategy for Player II, and
  the value of the game is $a_{i^* j^*}$.
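The saddle point condition says that an entry is simultaneously the largest in its column and the smallest in its row, which is easy to check by brute force. This is a minimal sketch with illustrative names and made-up matrices; note that the indices it returns are 0-based, unlike the 1-based indices in the definition.

```python
def saddle_points(A):
    """Return all (i, j), 0-based, where A[i][j] is a saddle point.

    A[i][j] must equal the maximum of column j and the minimum of row i.
    """
    m, n = len(A), len(A[0])
    return [(i, j)
            for i in range(m)
            for j in range(n)
            if A[i][j] == max(A[k][j] for k in range(m))
            and A[i][j] == min(A[i][l] for l in range(n))]


# Hypothetical examples (not from the slides).
with_saddle = [[3, 1, 4],
               [2, 0, 1],
               [5, 2, 6]]
without_saddle = [[2, -1],
                  [-1, 1]]

found = saddle_points(with_saddle)
none_found = saddle_points(without_saddle)
```

In the first matrix the entry 2 in the third row and second column is a saddle point, so the value of that game is 2. The second matrix has none, so its optimal strategies must be mixed.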

2 × 2 games

How to solve a 2 × 2 game
  1. Check for a saddle point. (Is the max of row mins = min of column maxes?)
  2. If there are no saddle points, find equalizing strategies.

Equalizing strategies satisfy:

$$x_1 a_{11} + (1 - x_1) a_{21} = x_1 a_{12} + (1 - x_1) a_{22},$$
$$y_1 a_{11} + (1 - y_1) a_{12} = y_1 a_{21} + (1 - y_1) a_{22}.$$

Solving gives

$$x_1 = \frac{a_{21} - a_{22}}{a_{21} - a_{22} + a_{12} - a_{11}}, \qquad y_1 = \frac{a_{12} - a_{22}}{a_{12} - a_{22} + a_{21} - a_{11}}.$$

Dominated pure strategies

Definition
A pure strategy $e_i$ for Player I is dominated by $e_{i'}$ in payoff matrix $A$ if, for all $j \in \{1, \ldots, n\}$,

$$a_{ij} \le a_{i'j}.$$

Solving 2 × n games

Payoff matrix

$$\begin{pmatrix} 2 & 3 & 1 & 5 \\ 4 & 1 & 6 & 0 \end{pmatrix}$$

[Figure (Ferguson, 2014)]

The maximum occurs at the intersection of the lines corresponding to columns 2 and 3.
The optimal strategy for Player I is $x = (5/7, 2/7)$.
Then Player II is indifferent between columns 2 and 3.
The optimal strategy for Player II involves only columns 2 and 3. We can find it by solving a 2 × 2 game.

Principle of indifference

Theorem
Suppose a game with payoff matrix $A \in \mathbb{R}^{m \times n}$ has value $V$. If $x \in \Delta_m$ and $y \in \Delta_n$ are optimal strategies for Players I and II, then

$$\text{for all } j, \ \sum_{l=1}^m x_l a_{lj} \ge V, \qquad \text{for all } i, \ \sum_{l=1}^n a_{il} y_l \le V,$$
$$\text{if } y_j > 0, \ \sum_{l=1}^m x_l a_{lj} = V, \qquad \text{if } x_i > 0, \ \sum_{l=1}^n a_{il} y_l = V.$$

This means that if one player is playing optimally, any action that has positive weight in the other player’s optimal mixed strategy is a suitable response.
It implies that any mixture of these “active actions” is also a suitable response.
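The equalizing-strategy formulas for a 2 × 2 game translate directly into code. The sketch below is illustrative: the function name is an assumption, and the formulas are only meaningful when the game has no saddle point, so that check is left to the caller. It is applied to columns 2 and 3 of the 2 × 4 example, reading the payoff matrix as rows (2, 3, 1, 5) and (4, 1, 6, 0).

```python
from fractions import Fraction as F


def solve_2x2_equalizing(A):
    """Equalizing strategies and value for a 2 x 2 game without a saddle point.

    Returns ((x1, x2), (y1, y2), value). If the game has a saddle point
    the formulas may produce weights outside [0, 1]; check for that first.
    """
    (a11, a12), (a21, a22) = A
    denom = a21 - a22 + a12 - a11
    if denom == 0:
        raise ValueError("degenerate game: equalizing equations have no unique solution")
    x1 = F(a21 - a22, denom)
    y1 = F(a12 - a22, denom)
    value = x1 * a11 + (1 - x1) * a21
    return (x1, 1 - x1), (y1, 1 - y1), value


full = [[2, 3, 1, 5],
        [4, 1, 6, 0]]
# Restrict to columns 2 and 3 (0-based indices 1 and 2).
sub = [[row[1], row[2]] for row in full]
x, y_sub, V = solve_2x2_equalizing(sub)

# Player I's expected payoff against each column of the full game.
against_columns = [x[0] * full[0][j] + x[1] * full[1][j] for j in range(4)]
```

This reproduces $x = (5/7, 2/7)$ with value 17/7, and Player II puts weights (5/7, 2/7) on columns 2 and 3. The payoffs against columns 1 and 4 are at least 17/7 and the payoffs against columns 2 and 3 equal 17/7, which matches the principle of indifference.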

Using the principle of indifference

Solving linear systems
Suppose that we have a payoff matrix $A$ and we suspect that an optimal strategy for Player I has certain components positive, say $x_1 > 0$, $x_3 > 0$.
Then we can solve the corresponding “indifference equalities” to find $y$, say

$$\sum_{l=1}^n a_{1l} y_l = V, \qquad \sum_{l=1}^n a_{3l} y_l = V.$$

Example

Diagonal payoff matrix

$$A = \begin{pmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{pmatrix}$$

The $a_{ii}$ are all positive, so we suspect that all $x_i, y_i > 0$ for the optimal strategies. Solve

$$x^\top A = \begin{pmatrix} V & V & V \end{pmatrix}: \qquad x = y = \begin{pmatrix} V/a_{11} \\ V/a_{22} \\ V/a_{33} \end{pmatrix}, \qquad V = \frac{1}{1/a_{11} + 1/a_{22} + 1/a_{33}}.$$
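The diagonal example can be checked numerically: the weights $V / a_{ii}$ must sum to 1, and every column of $x^\top A$ must equal $V$. A minimal sketch, assuming positive integer (or rational) diagonal entries; the function name and the sample values 1, 2, 3 are illustrative.

```python
from fractions import Fraction as F


def solve_diagonal_game(diag):
    """Optimal strategy (shared by both players) and value of a diagonal game.

    `diag` holds the positive diagonal entries a_11, ..., a_nn.
    """
    if any(a <= 0 for a in diag):
        raise ValueError("all diagonal entries must be positive")
    V = 1 / sum(1 / F(a) for a in diag)
    strategy = [V / a for a in diag]
    return strategy, V


# Hypothetical diagonal entries.
diag = [1, 2, 3]
strategy, V = solve_diagonal_game(diag)

# Indifference: x_i * a_ii is the payoff against column i, and all are equal to V.
payoffs = [s * a for s, a in zip(strategy, diag)]
```

With entries 1, 2, 3 the value is 6/11 and the common optimal strategy is (6/11, 3/11, 2/11).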

Symmetry

Definition
A game with payoff matrix $A \in \mathbb{R}^{m \times n}$ is invariant under a permutation $\pi_x$ on $\{1, \ldots, m\}$ if there is a permutation $\pi_y$ on $\{1, \ldots, n\}$ such that, for all $i, j$, $a_{ij} = a_{\pi_x(i), \pi_y(j)}$.

If $A$ is invariant under permutations $\pi_1$ and $\pi_2$ on $\{1, \ldots, m\}$, then it is invariant under $\pi_1 \circ \pi_2$.
If $A$ is invariant under some set $S$ of permutations, then it is invariant under the group $G$ of permutations generated by $S$ (that is, compositions and inverses).

Symmetry

Definition
A mixed strategy $x \in \Delta_m$ is invariant under a permutation $\pi_x$ on $\{1, \ldots, m\}$ if for all $i$, $x_i = x_{\pi_x(i)}$.

An orbit of a group $G$ of permutations is a set $O_i = \{\pi(i) : \pi \in G\}$.
If a mixed strategy $x$ is invariant under a group $G$ of permutations, then for every orbit, $x$ is constant on the orbit.

Theorem
If $A$ is invariant under a group $G$ of permutations, then there are optimal strategies that are invariant under $G$.

Example: Submarine Salvo

[Figure (Karlin and Peres, 2016)]

Symmetry

For a left-to-right flip, $\pi_x$ permutes bomb positions, $\pi_y$ permutes submarine positions.
In addition to left-to-right flips, we could add top-to-bottom flips. We have invariance under consecutive flips.
$x$ is invariant for the permutation corresponding to a left-to-right flip if $x_1 = x_3$, $x_4 = x_6$ and $x_7 = x_9$.
Example: for the group generated by horizontal, vertical, and diagonal flips, the orbits are

$$O_1 = \{1, 3, 7, 9\}, \qquad O_2 = \{2, 4, 6, 8\}, \qquad O_5 = \{5\}.$$

Submarine Salvo on orbits

Reduced payoff matrix (rows: Bomber; columns: Submarine):

            edge    center
corner      1/4     0
mid-edge    1/4     1/4
center      0       1

Each entry is a uniform average over orbits of the original payoffs.
For Bomber, corner is dominated by mid-edge.
Then for Submarine, center is dominated by edge.
Optimal strategies: Bomber plays mid-edge, Submarine plays edge.
Bomber puts weight 1/4 on each of 2, 4, 6, 8.
Submarine puts weight 1/8 on each of 12, 23, 14, 36, 47, 69, 78, 89.
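The orbits and the reduced payoffs can be recomputed by brute force. The sketch below rests on stated assumptions: squares of the 3 × 3 grid are numbered 1 to 9 row by row, a submarine occupies two adjacent squares, the payoff is 1 when the bomb hits the submarine and 0 otherwise, and the symmetry group is generated by the three flips. All function and variable names are illustrative.

```python
from fractions import Fraction as F


def sq(r, c):
    """Square number (1..9) for 0-based row r and column c."""
    return 3 * r + c + 1


def coords(s):
    """0-based (row, column) of square s."""
    return divmod(s - 1, 3)


# Generators of the symmetry group, acting on coordinates.
GENERATORS = [
    lambda r, c: (r, 2 - c),  # left-to-right flip
    lambda r, c: (2 - r, c),  # top-to-bottom flip
    lambda r, c: (c, r),      # diagonal flip
]


def move_square(g, s):
    return sq(*g(*coords(s)))


def move_sub(g, sub):
    return frozenset(move_square(g, s) for s in sub)


def orbit(start, act):
    """All images of `start` under the group generated by GENERATORS."""
    seen, frontier = {start}, [start]
    while frontier:
        item = frontier.pop()
        for g in GENERATORS:
            image = act(g, item)
            if image not in seen:
                seen.add(image)
                frontier.append(image)
    return frozenset(seen)


def reduced_payoff(bomb_orbit, sub_orbit):
    """Uniform average of the hit indicator over both orbits."""
    hits = sum(1 for b in bomb_orbit for s in sub_orbit if b in s)
    return F(hits, len(bomb_orbit) * len(sub_orbit))


# All submarine positions: horizontally or vertically adjacent pairs.
subs = ([frozenset({sq(r, c), sq(r, c + 1)}) for r in range(3) for c in range(2)] +
        [frozenset({sq(r, c), sq(r + 1, c)}) for r in range(2) for c in range(3)])

corner = orbit(1, move_square)
mid_edge = orbit(2, move_square)
center = orbit(5, move_square)
sub_edge = orbit(frozenset({1, 2}), move_sub)
sub_center = orbit(frozenset({2, 5}), move_sub)

table = {name: (reduced_payoff(o, sub_edge), reduced_payoff(o, sub_center))
         for name, o in [("corner", corner), ("mid-edge", mid_edge), ("center", center)]}
```

Under these assumptions the computed orbits and the entries of `table` agree with the orbits and the reduced payoff matrix given in the slides.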

Symmetry: Examples

Rook and pawn
Player I places a rook on a 4 × 4 board.
Player II places a pawn.
Player I gets payoff 1 if the rook can take the pawn, 0 if it cannot (or they both chose the same square).

Symmetries?
Orbits?
What is the reduced game (on orbits)?

Rook and bishop
Player I places a rook on a chessboard (8 × 8 squares).
Player II places a bishop.
Player I gets payoff 1 if the rook can take the bishop, −1 if the bishop can take the rook, 0 if neither can take the other (or they both chose the same square).

Symmetries?
Orbits?
What is the reduced game (on orbits)?
