Stat 155 Lecture Notes

Professor: Oscar Hernan Madrid Padilla Scribe: Daniel Raban

Contents

1 Combinatorial Games5 1.1 Subtraction game and definitions...... 5 1.2 Combinatorial games as graphs...... 6 1.3 Existence of a winning ...... 6 1.4 Chomp...... 6

2 Nim and Rim8 2.1 Nim...... 8 2.2 Rim...... 9

3 Staircase Nim and Partisan Games 10 3.1 Staircase Nim...... 10 3.2 Partisan Games...... 11 3.2.1 Partisan subtraction game...... 11 3.2.2 Hex...... 11

4 Two Player Zero-Sum Games 13 4.1 Pick a Hand...... 13 4.2 Zero-sum games...... 13 4.3 Von-Neumann’s theorem...... 15

5 Solving Two-player Zero-sum Games 16 5.1 Saddle points...... 16 5.2 Removing dominated pure strategies...... 17 5.3 2 × 2 games...... 18

1 6 Domination and the Principle of Indifference 19 6.1 Domination by multiple rows or columns...... 19 6.2 The principle of indifference...... 20 6.3 Using the principle of indifference...... 20

7 Symmetry in Two Player Zero-sum Games 22 7.1 Submarine Salvo...... 22 7.2 Invariant vectors and matrices...... 22 7.3 Using invariance to solve games...... 23

8 Nash Equilibria, Linear Programming, and von Neumann’s Minimax Theorem 25 8.1 Nash equilibria...... 25 8.1.1 Optimality of Nash equilibria...... 25 8.1.2 Indifference and Nash Equilibria...... 26 8.2 Solving zero-sum games using matrix inversion...... 26 8.3 Linear programming: an aside...... 27 8.4 Proof of von Neumann’s minimax theorem...... 28

9 Gradient Ascent, Series Games, and Parallel Games 30 9.1 Gradient ascent...... 30 9.2 Series and parallel games...... 31 9.2.1 Series games...... 31 9.2.2 Parallel games...... 31 9.2.3 Electric networks...... 32

10 Two Player General-Sum Games 33 10.1 General-sum games and Nash equilibria...... 33 10.2 Examples of general-sum games...... 34

11 Two-Player and Multiple-Player General-Sum Games 36 11.1 More about two-player general-sum games...... 36 11.1.1 Cheetahs and gazelles...... 36 11.1.2 Comparing two-player zero-sum and general-sum games...... 36 11.2 Multiplayer general-sum games...... 37

12 Indifference of Nash Equilibria, Nash’s Theorem, and Potential Games 40 12.1 Indifference of Nash equilibria in general-sum games...... 40 12.2 Nash’s theorem...... 41 12.3 Potential and Congestion games...... 42 12.3.1 Congestion games...... 42 12.3.2 Potential games...... 43

2 13 44 13.1 Criticisms of Nash equilibria...... 44 13.2 Evolutionarily stable strategies...... 44 13.3 Examples of strategies within populations...... 45

14 Evolutionary Game Theory of Mixed Strategies and Multiple Players 47 14.1 Relationships between ESSs and Nash equilibria...... 47 14.2 Evolutionary stability against mixed strategies...... 47 14.3 Multiplayer evolutionarily stable strategies...... 49

15 Correlated Equilibria and Braess’s 50 15.1 An example of inefficient Nash equilibria...... 50 15.2 Correlated strategy pairs and equilibria...... 50 15.3 Interpretations and comparisons to Nash equilibria...... 52 15.4 Braess’s paradox...... 52

16 The of Anarchy 54 16.1 Flows and latency in networks...... 54 16.2 The for linear and affine latencies...... 56 16.3 The impact of adding edges...... 58

17 Pigou Networks and Cooperative Games 59 17.1 Pigou networks...... 59 17.2 Cooperative games...... 60

18 Threat Strategies and Nash Bargaining in Cooperative Games 63 18.1 Threat strategies in games with transferable ...... 63 18.2 Nash bargaining model for nontransferable utility games...... 65

19 Models for 67 19.1 Nash’s bargaining theorem and relationship to transferable utility...... 67 19.2 Multiplayer transferable utility games...... 68 19.2.1 Allocation functions and Gillies’ ...... 68 19.2.2 Shapley’s axioms for allocation functions...... 70

20 71 20.1 Shapley’s axioms...... 71 20.2 Junta games...... 71 20.3 Shapley’s theorem...... 72

3 21 Examples of Shapley Value and 74 21.1 Examples of Shapley value...... 74 21.2 Examples of mechanism design...... 75

22 Voting 78 22.1 Voting preferences, communication, and ...... 78 22.2 Properties of voting systems...... 79 22.3 Violating IIA and Arrow’s Impossibility theorem...... 80

23 Impossibility Theorems and Properties of Voting Systems 82 23.1 The Gibbard-Satterthwaite theorem...... 82 23.2 Properties of voting systems...... 83 23.3 rules...... 84

4 1 Combinatorial Games

1.1 Subtraction game and definitions Consider a “subtraction game” with 2 players and 15 chips. Players alternate moves, with player 1 starting. At each move, the player can remove 1 or 2 chips. A player wins when they take the last chip (so the other player cannot move). Let x be the number of chips remaining. Suppose you move next. Can you guarantee a win? Let’s look at a few examples. If x ∈ {1, 2}, the player who moves can take the remaining chip(s) and win. If x = 3, the second player has the advantage; no matter what player 1 does, player 2 will be presented with 1 or 2 chips. Write N as the set of positions where the next player to move can guarantee a win, provided they play optimally. Write P as the set of positions where the other player, the player that moved previously, can guarantee a win, provided that they play optimally. So 0, 3 ∈ P , 1, 2 ∈ N. In the case of our original game, 15 ∈ P .

Definition 1.1. A combinatorial game is a game with two players (players 1 and 2) and a set X of positions. For each player, there is a set of legal moves between positions, M1,M2 ⊆ X × X (current position, next position). Players alternately choose moves, starting from some starting position x0, and play continues until some player cannot move. The game has a winner or loser and follows normal or misere play.

Definition 1.2. In a combinatorial game, normal play means that the player who cannot move loses the game.

Definition 1.3. In a combinatorial game, misere play means that the player who cannot move wins the game.

Definition 1.4. An impartial game has the same set of legal moves for both players; i.e. M1 = M2.A partisan game has different sets of legal moves for the players. Definition 1.5. A terminal position for a player is a position in which the player has no legal move to another position; i.e. x is terminal for player i if there is no y ∈ X with (x, y) ∈ Mi. Definition 1.6. A combinatorial grame is progressively bounded if, for every starting po- sition x0 ∈ X, there is a finite bound on the number of moves before the game ends. Definition 1.7. A strategy for a player is a function that assigns a legal move to each non-terminal position. If XNT is the set of non-terminal positions for player i, then Si : XNT → X is a strategy for player i if, for all x ∈ XNT ,(x, Si(x)) ∈ Mi. Definition 1.8. A winning strategy for a player from position x is a strategy that is guaranteed to a result in a win for that player.

5 Example 1.1. The subtraction game is an impartial combinatorial game. The positions are X = {0, 1, 2,..., 15}, and the moves are {(x, y ∈ X × X : y ∈ {x − 1, x − 2}}. The terminal position for both players is 0. The game is played using normal play. It is progressively bounded because from x ∈ X, there can be no more than x moves until the terminal position. A winning strategy for any starting position x ∈ N is S(x) = 3bx/3c.

1.2 Combinatorial games as graphs Impartial combinatorial games can be thought of as directed graphs. Think of the positions as nodes and the moves as directed edges between the nodes Terminal positions are nodes without outgoing edges.

Example 1.2. What does the graph look like for the subtraction game? Every edge from a node in P leads into a node in N. There is also an edge from every node in N to a node in P . The winning strategy chooses one of these edges.

Acyclic graphs correspond to progressively bounded games. B(x) is the maximum length along the graph from node x to a terminal position.

1.3 Existence of a winning strategy Theorem 1.1. In a progressively bounded, impartial combinatorial game, X = N ∪ P . That is, from any initial position, one of the players has a winning strategy.

Proof. By definition, N,P ⊆ X, so N ∪ P ⊆ X. We now show that X ⊆ N ∪ P . For each x ∈ X, we induct on B(x). If B(x) = 0, then we are in a winning position for one of the two players, so x ∈ N ∪ P . Now suppose that x ∈ N ∪ P holds when B(x) ≤ n. If B(x) = n + 1, then every legal move leads to y with B(y) ≤ n, so y ∈ N ∪ P . Consider all the legal next positions y. Either

1. All of these y are in N, which implies x ∈ P , or

2. Some legal move leads to a y ∈ P , which implies x ∈ N.

1.4 Chomp Chomp is an impartial combinatorial game. Two players take turns picking squares from a rectangular chocolate bar and eat everything above and to the right of the square they pick (including the square itself); the squares removed are called the “chomp.” The positions are the non-empty subsets of a chocolate bar that are left-closed and below-closed. The moves are {(x, y) ∈ X × X : y = x − chomp}. The terminal position is when only the bottom left square remains. The game follows normal play. Chomp is progressively bounded because from x ∈ X with |x| blocks remaining, there can be no more than |x| − 1 moves until the terminal position.

6 Theorem 1.2. In chomp, every non-terminal rectangle is in N.

Proof. We use a “strategy stealing” argument. From a rectangle r ∈ X, there is a legal move (r, r0) ∈ M that we can always choose to skip; that is, for any move (r0, s) ∈ M, we also have (r, s) ∈ M. There are two cases:

1. r0 ∈ P , which implies r ∈ N.

2. r0 ∈ N. In this case, there is an s ∈ P with (r0, s) ∈ M, But then we know that (r, s) ∈ M, also implying r ∈ N.

7 2 Nim and Rim

2.1 Nim Here is a combinatorial game called Nim. We have k piles of chips, and each turn, a player removes some (positive) number of chips from some pile. The player wins when they take the last chip. Nim is an impartial combinatorial game with positions

X = {(n1, . . . , nk): ni ≥ 0} .

The set of moves is

 2 (x, y) ∈ X : some i has yi < xi, yj = xj ∀j 6= i .

The terminal position is 0, and the game follows normal play. We can think of a position (x1, . . . , xi, 0,..., 0) as the position (xi, . . . , xi) in a smaller game. So we could instead define X = {(n1, . . . , nk): k ≥ 1, ni ≥ 0} , letting k be a part of the position. Nim is progressively bounded because from x ∈ X, P there can be no more than i xi moves until the terminal position.

Example 2.1. Which positions are in N or P ? 0 ∈ P , but n1 ∈ N. Also, (1, 1) ∈ P , and (1, 2) ∈ N. If n1 6= n2, then (n1, n2) ∈ N; but (n1, n1) ∈ P . To find the winning positions of Nim, we make the following definition.

Definition 2.1. Given a Nim position (x1, . . . , xk), the Nim-sum x1 ⊕ · · · ⊕ xk is defined as follows. Write x1, . . . , xk in binary, and add the digits in each place modulo 2; then interpret the result as the binary representation of a number.

6 0 1 1 0 12 1 1 0 0 13 1 1 0 1 0 1 1 1 = 7

Example 2.2. You can check your work with these examples to see if you understand how to get the Nim-sum of a position.

1. If x = 7, x has Nim-sum is 7.

2. If x = (2, 2), x has Nim-sum 0.

3. If x = (2, 3), it has Nim-sum 1.

4. If x = (1, 2, 3), it has Nim-sum 0.

8 Theorem 2.1 (Bouton). The Nim position (x1, . . . , xk) is in P iff the Nim-sum of its components is 0.

Proof. Let Z = {(x1, . . . , xk): x1 ⊕ · · · ⊕ xk = 0}. We will show that 1. Every move from X leads to a position outside Z.

2. For every position outsize Z, there is a move to Z, which implies that terminal positions are in Z. From this, it will follow that Z = P (exercise). To prove 1, note that removing chips from one pile only changes one row when com- puting the Nim-sum. So then some place in the binary representation of the Nim-sum is changed, making it nonzero. To prove 2, let j be the position of the leftmost 1 in the binary representation of the Nim-sum s = x1 ⊕ · · · ⊕ xk. There is an odd number of i ∈ {1, 2, . . . , k} with 1 in column j. Choose one such i. Now we replace xi by xi ⊕ s. That is, we make the move

(x1, . . . , xk) → (x1, . . . , xi−1, xi ⊕ s, xi+2, . . . , xk).

[insert picture] This decreases the value of xi, so it is a legal move. This also changes every 1 in the binary representation of the Nim-sum to 0, making the Nim-sum 0.

2.2 Rim Here is a game called Rim. Each position is a finite set of points in the plane and a finite set of continuous, non-intersecting loops, each passing though at least one point. Each turn, a player adds another loop. This game is progressively bounded. Proposition 2.1. Rim is equivalent to Nim, in the sense that we can define a mapping φ : X → XNim such that P = {x ∈ X : φ(x) ∈ PNim}.

Proof. For a position x, define φ(x) = (n1, . . . , nk), where the ni are the number of points in the interiors of the connected regions bounded by the loops. This allows all of the standard Nim moves; by drawing a loop (not containing any points in its interior) that passes through some number of points in a connected component, the corresponding chips are removed. It also allows some nonstandard moves, such as moves that create more piles. Why is P = {x ∈ X : φ(x) ∈ PNim}? φ(x) = 0 for terminal x, and some move from N leads to P ; this is true because all of the standard Nim moves are available as Rim moves. We now want to show that every move from P leads to N; we need only check that if φ(x) has Nim-sum zero, then any move to φ(y) has a nonzero Nim-sum. We know this is true for a standard Nim move, so we need only check that this is true when the pile that was diminished is split. Suppose we split xi into u and v, using up some of the vertices from xi. We have xi > u + v ≥ u ⊕ v. So the move changes to a nonzero Nim-sum.

9 3 Staircase Nim and Partisan Games

3.1 Staircase Nim Here is a game called Staircase Nim. Imagine a staircase with balls on the steps.1 Every turn, a player takes some (positive) number of balls from a step and moves these balls down one step on the staircase. The player who moves the last ball to the bottom step wins. This game is progressively bounded.

Proposition 3.1. Staircase Nim is equivalent to Nim, in the sense that we can define a mapping φ : X → XNim, such that P = {x ∈ X : φ(x) ∈ PNim}.

Proof. For a position x = (x1, x2, . . . , xk) (number of chips on each step), define φ(x) = (x2, x4, . . . , x2bk/2c) (the number of chips on the even steps). Define the set  Z := {x ∈ X : φ(x) ∈ PNim} = (x1, . . . , xk) ∈ X : x2 ⊕ x4 ⊕ · · · ⊕ x2bk/2c = 0 .

We want to show that Z = P . It is sufficient to show that

1. If x ∈ Z, x0 ∈ X \ Z for all x0 such that (x, x0) ∈ M.

2. If x ∈ X \ Z, ∃x0 ∈ Z such that (x, x0) ∈ M.

If we move balls from an even to an odd step, say we move from state x to x0. This just decreases one of the components in the vector x, so it corresponds to a Nim move. So if φ(x) has 0 Nim-sum, φ(x0) has nonzero Nim-sum. If we move balls from an even to an odd step, we increase the value of one of the piles in φ(x). This changes at least one place in the Nim sum, making φ(x0) have nonzero Nim-sum. So every move in Z leads to a move in X \ Z. If we start from x, where φ(x) 6= 0, then there is some move in Nim that makes the Nim-sum 0. We can make this move in Staircase Nim by taking balls on an even step and moving them to an odd step. So for every x ∈ X \ Z, there is a move (x, x0) such that x0 ∈ Z. 1These figures for Staircase Nim are modified versions of figures from the book Game Theory, Alive by Anna Karlin and Yuval Peres.

10 3.2 Partisan Games 3.2.1 Partisan subtraction game Here is an partisan subtraction game. Start with 11 chips. Player 1 can remove wither 1 or 4 chips per turn. Player 2 can remove either 2 or 3 chips per turn. The game is played under normal play. We can construct sets

Ni = {positions where Player i, playing next, can force a win} ,

Pi = {positions where, if Player i plays next, the previous player can force a win} .

In this game, {1, 2, 4, 5} ⊆ N1, {2, 3, 5} ⊆ N2, {0, 3} ⊆ P1, and {0, 1, 4} ⊆ P2. Theorem 3.1. Consider a progressively bounded partisan combinatorial game with no ties allowed. Then from any initial position, one of the players has a winning strategy.

3.2.2 Hex In the game of Hex, players alternate painting tiles on a board either yellow (Player 1) or blue (Player 2). The winner of the game is the first player who can construct a path from one side to the other.2

This game is partisan because one player can only paint squares blue, and the other can only paint squares yellow. This game is progressively bounded because there are only finitely many tiles. Hex has no ties; this is nontrivial to prove, and we will not prove it here.

Theorem 3.2. On a symmetric Hex board, the first player has a winning strategy.

Proof. We use a strategy-stealing argument. Assume for the sake of contradiction that the second player has a winning strategy (i.e. a mapping S from the set of positions to the set of destinations of legal moves); we will construct a winning strategy for Player 1. The first player plays an arbitrary first move m1,1. To play the n-th move, the first player calculates

2These Hex diagrams are modified versions of diagrams from the book Game Theory, Alive by Anna Karlin and Yuval Peres.

11 the position xn−1 of the board as if only moves m2,1, m2,2, . . . , m2,n−1 were played, and then plays m1,n = Srot(xn−1), where Srot is the strategy S applied to the board rotated 90 degrees with colors switched. And if m1,n is not a legal move because that hexagon has already been played, choose something else; an extra hexagon can only help. So Player 1 also has a winning strategy. This is a contradiction, so Player 2 cannot have a winning strategy. So Player 1 has the winning strategy.

12 4 Two Player Zero-Sum Games

4.1 Pick a Hand Consider a game of “Pick a Hand” with two players and two candies. The Hider puts both hands behind their back and chooses to either

1. Put 1 candy in their left hand (L1),

2. Put 2 candies in their right hand (R2). The second player, the Chooser, picks a hand and takes the candies in it. Both moves are made simultaneously. We can represent this by a matrix:

L1 R2 L 1 0 R 0 2 What if the players play randomly?

P (Chooser plays L) = x1,P (Chooser plays R) = 1 − x1,

P (Hider plays L1) = y1,P (Chooser plays R2) = 1 − y1. Say we are playing sequentially, with the Chooser going first. The expected gain when the Hider plays L1 is x1 · 1 + (1 − x1) · 0 = x1. The expected gain when the Hider plays R2 is x1 · 0 + (1 − x1) · 2 = x(1 − x1). Given these probabilities, the Holder can pick y1 to minimize the Chooser’s overall expected gain. The Chooser knows this, so the chooser should pick an x1 that maximizes their expected gain given that they know that the Holder will minimize their expected gain. In this case, the Chooser should pick x1 = 2/3. What if the Hider plays first? The Hider should also pick y1 = 2/3.

4.2 Zero-sum games Definition 4.1. A two player zero-sum game is a game where Player 1 has m actions 1, 2, . . . , m, and Player 2 has n actions 1, 2, . . . , n. The game has an m × n payoff matrix m×n A ∈ R , which represents the payoff to player 1.   a1,1 a1,2 ··· a1,n  a2,1 a2,2 ··· a2,n  A =    . . .   . . ··· .  am,1 am,2 ··· am,n

If Player 1 chooses i, and Player 2 chooses j, then the payoff to player 1 is ai,j, and the payoff to Player 2 is −ai,j.

13 Definition 4.2. A mixed strategy is a probability over actions. It is a vector   x1 ( m )  x2  X x =   ∈ ∆ := x ∈ m : x ≥ 0, x = 1  .  m R i i  .  i=1 xm for Player 1 and   y1 ( n ) y2 X y =   ∈ ∆ := x ∈ n : y ≥ 0, y = 1  .  n R i i  .  i=1 yn for Player 2.

Definition 4.3. A pure strategy is a mixed strategy where one entry is 1, and all the others are 0. This is a standard basis vector ei.

The expected payoff to Player 1 when Player 1 plays mixed strategy x ∈ ∆m and Player 2 plays mixed strategy y ∈ ∆m is

m n X X EI∼xEJ∼yaI,J = ciai,jyj i=1 j=1 = x>Ay     a1,1 a1,2 ··· a1,n y1  a2,1 a2,2 ··· a2,n  y2 = (x , x , . . . , x )     . 1 2 m  . . .   .   . . ··· .   .  am,1 am,2 ··· am,n yn

∗ Definition 4.4. A safety strategy for Player 1 is an x ∈ ∆m that satisfies

min (x∗)>Ay = max min x>Ay. y∈∆n x∈∆m y∈∆n

∗ A safety strategy for Player 2 is an y ∈ ∆n that satisfies

max x>Ay∗ = min max x>Ay. x∈∆m y∈∆n x∈∆m A safety strategy is the best strategy that Player 1 can use if they reveal their probability distribution to Player 2 before Player 2 makes a mixed strategy. This mixed strategy maximizes the worst case expected gain for Player 1. Safety strategies are optimal.

14 4.3 Von-Neumann’s minimax theorem Theorem 4.1 (Von-Neumann’s Minimax Theorem). For any two-person zero-sum game m×n with payoff matrix A ∈ R ,

min max x>Ay = max min x>Ay. y∈∆n x∈∆m x∈∆m y∈∆n

We will prove this in a later lecture. The left hand side says that Player 1 plays x first, and then Player 2 responds with y; the right hand side says that Player 2 plays y first, and then Player 1 responds with x. You might think that this is actually an inequality (≥) instead of an equality; this means playing last is preferable. But the minimax theorem says that it doesn’t matter whether you play first or second.

Definition 4.5. We call the optimal expected payoff the value of the game.

V = min max x>Ay = max min x>Ay. y∈∆n x∈∆m x∈∆m y∈∆n

15 5 Solving Two-player Zero-sum Games

5.1 Saddle points Consider a zero-sum game with the matrix

−1 1 5  5 3 4 . 6 2 1

Suppose both players choose their 2nd move; the payoff is a2,2 = 3. Should either player change their strategy? No. This would decrease the payoff for either player. This is called a saddle point, or a pure .

Definition 5.1. A pair (i∗, j∗) ∈ {1, . . . , m} × {1, . . . , n} is a saddle point for a payoff m×n matrix A ∈ R if max ai,j∗ = ai∗,j∗ = min ai∗,j. i j If Player 1 plays i∗, and Player 2 plays j∗, neither player has an incentive to change. Think of saddle points as locally optimal strategies for both players. We will also see that these are globally optimal.

∗ ∗ m×n Theorem 5.1. If (i , j ) is a saddle point for a payoff matrix A ∈ R , then

1. ei∗ is an optimal strategy for Player 1.

2. ej∗ is an optimal strategy for Player 2.

3. The value of the game is ai∗,j∗ . Proof. We have seen that we should always prefer to play last, but with a saddle point, the opposite inequality is also true:

min max x>Ay ≥ max min x>Ay y∈∆n x∈∆m x∈∆m y∈∆n > ≥ min ei∗ Ay y∈∆n > = ei∗ Aej∗ > = max x Aej∗ x∈∆m ≥ min max x>Ay. y∈∆n x∈∆m

> Observe that ai∗,j∗ = ei∗ Aej∗ .

16 5.2 Removing dominated pure strategies Another way to simplify a two-player zero-sum game is by removing dominated rows or columns.

Example 5.1. Here is a game called Plus One. Each player picks a number in {1, 2, . . . , n}. If i = j, the payoff is 0 If |i − j| = 1, the higher number wins 1. If |i − j| ≥ 2, the higher number loses 2. Here is the payoff matrix.

1 2 3 4 5 6 ··· n − 1 n 1 0 −1 2 2 2 2 ··· 2 2 2 1 0 −1 2 2 2 ··· 2 2 3 −2 1 0 −1 2 2 ··· 2 2 4 −2 −2 1 0 −1 2 ··· 2 2 5 −2 −2 −2 1 0 −1 ··· 2 2 6 −2 −2 −2 −2 1 0 ··· 2 2 ...... n − 1 −2 −2 −2 −2 −2 −2 .. 0 −1 n −2 −2 −2 −2 −2 −2 ··· 1 0

If one row is less than another (entry by entry), we can remove the lesser row from the matrix because Player 1 would never choose a strategy in that row. Similarly, we can drop columns that are larger in every entry than other columns. After we remove rows and columns, we get  0 −1 2   1 0 −1 . −2 1 0 Example 5.2. Here is a game called Miss-by-one. Player 1 and 2 choose numbers i, j ∈ {1, 2,..., 5}. Player 1 wins 1 if |i − j| = 1; otherwise, the payoff is 0. The matrix is

0 1 0 0 0 1 0 1 0 0   0 1 0 1 0 .   0 0 1 0 1 0 0 0 1 0 If we remove useless rows (1st and 5th) and columns (3rd), we get

1 0 0 0 0 1 1 0 . 0 0 0 1

17 5.3 2 × 2 games Consider a zero-sum game with matrix

LR T c d B a b

Assume all the values are different. Without loss of generality, a is the largest. There are six cases, then. The following four cases have saddle points:

1. a > b > c > d

2. a > b > d > c

3. a > c > b > d

4. a > c > d > b.

If there are no saddle points, we should equalize mixed strategies. Writing x1 = P (T ), we get V = b + x1(d − b),

V = a + x1(c − a). Solving this gives us a − b x = . 1 a − b + d − c In more general notation, we get

x1a1,1 + (1 − x1)a2,1 = x1a1,2 + (1 − x1)a2,2,

y1a1,1 + (1 − y1)a1,2 = y1a2,1 + (1 − y1)a2,2. Solving gives us a2,1 − a2,2 x1 = , a2,1 − a2,2 + a1,2 − a1,1

a1,2 − a2,2 y1 = . a1,2 − a2,2 + a2,1 − a1,1

18 6 Domination and the Principle of Indifference

6.1 Domination by multiple rows or columns Recall the concept of dominated rows or columns in a payoff matrix from last lecture.

Definition 6.1. A pure strategy ej for player 2 is dominated by ei0 in the payoff matrix A if for all i ∈ {1, . . . , m}, ai,j ≤ ai,j0 . We can extend this idea to include comparisons with multiple columns.

Definition 6.2. A pure strategy ej for player 2 is dominated by columns ej1 , . . . , ejk in the payoff matrix A if there is a convex combination y ∈ ∆n with yj = 0 and {` : y` 6= 0} = {j1, . . . , jk} such that, for all i ∈ {1, . . . , m},

n X ai,j ≥ ai,`y`. `=1

Theorem 6.1. If a pure strategy ej is dominated by columns ej1 , . . . , ejk , then we can remove column j from the matrix; i.e. there is an optimal strategy for Player 2 that sets yj = 0.

Proof. Letx ˜ ∈ ∆m andy ˜ ∈ ∆n. Then

n m > X X x˜ Ay˜ = x˜iai,`y˜` `=1 i=1 m m X X X = x˜iai,jy˜` + x˜iai,`y˜j `∈{1,...,n}\{j} i=1 i=1 m m k ! X X X X ≥ x˜iai,jy˜` + x˜i ai,js yjs y˜j `∈{1,...,n}\{j} i=1 i=1 s=1 m k m X X X X = x˜iai,jy˜` + x˜iai,js (yjs y˜j + yjs ) `∈{1,...,n}\{j} i=1 s=1 i=1 =x ˜>Ay,˜ where  y˜ ` ∈ {1, . . . , n}\{j , . . . , j , j}  I 1 k y˜ = 0 ` = j  yjs y˜j + yjs ` = js, s ∈ {1, . . . , k}. The same holds for dominated columns.

19 6.2 The principle of indifference We’ve seen a few examples where the optimal mixed strategy for one player leads to a from the other that is indifferent between actions. This is a general principle.

m×n Theorem 6.2. Suppose a game with payoff matrix A ∈ R has value V . If x ∈ ∆m and y ∈ ∆n are optimal strategies for Players 1 and 2, then

m n X X x`a`,j ≥ V ∀j, y`ai,` ≥ V ∀i, `=1 `=1

m n X X x`a`,j = V if yj > 0, y`ai,` = V if xi > 0. `=1 `=1 This means that if one player is playing optimally, any action that has positive weight in the other player’s optimal mixed strategy is a suitable response. It implies that any mixture of these “active actions” is a suitable response.

Proof. To prove the two inequalities, note that

m > 0 > X V = min x Ay ≤ x Aej = x a , 0 ` `,j y ∈∆n `=1

n X V = max (x0)>Ay ≥ e>Ay = x a . 0 i ` i,` x ∈∆m `=1 Pm Pn Recalling that i=1 xi = j=1 yj = 1, the inequalities give us

n n m m X X X X V = V yj ≤ xiai,jyj ≤ V xi = V. j=1 j=1 i=1 i=1

If either of the stated equalities did not hold, then we would have strict inequalities here, implying that V < V .

6.3 Using the principle of indifference Suppose we have a payoff matrix A, and we suspect that an optimal strategy for Player 1 has certain components positive, say x1 > 0, x3 > 0. Then we can solve the corresponding “indifference equalities” to find y, say:

n n X X a1,`y` = V, a3,`y` = V `=1 `=1

20 Example 6.1. Recall the game Plus One with payoff matrix

1 2 3 4 5 6 ··· n − 1 n 1 0 −1 2 2 2 2 ··· 2 2 2 1 0 −1 2 2 2 ··· 2 2 3 −2 1 0 −1 2 2 ··· 2 2 4 −2 −2 1 0 −1 2 ··· 2 2 5 −2 −2 −2 1 0 −1 ··· 2 2 6 −2 −2 −2 −2 1 0 ··· 2 2 ...... n − 1 −2 −2 −2 −2 −2 −2 .. 0 −1 n −2 −2 −2 −2 −2 −2 ··· 1 0 and reduced (after removing dominated rows and columns) payoff matrix

 0 −1 2   1 0 −1 . −2 1 0

We suspect that x1, x2, x3 > 0, so we solve

V  Ay = V  V to get that 1/4 y = 1/2 ,V = 0. 1/4

21 7 Symmetry in Two Player Zero-sum Games

7.1 Submarine Salvo Submarine Salvo is a game with a 3 × 3 grid.

7 8 9 4 5 6 1 2 3

One player picks two adjacent squares (vertically or horizontally) and hides a submarine on those squares. The other player picks a square and drops a bomb to blow up a submarine on that square, if it is there. The payoff matrix is

12 14 23 25 36 45 47 56 58 69 78 89 1 110000000000 2 101100000000 3 001010000000 4 010001100000 5 000101011000 6 000010010100 7 000000100010 8 000000001011 9 000000000101 Consider a transformation that flips the board from left to right. What happens to the payoff matrix? All we do is permute the rows and the columns of the matrix. Can we exploit this symmetry to help solve the game?

7.2 Invariant vectors and matrices m×n Definition 7.1. A game with payoff matrix A ∈ R is invariant under a permutation πx on {1, . . . , m} if there is a permutation πy on {1, . . . , n} such that for all i and j, ai,j = aπx(i),πy(j).

If A is invariant under π1 and π2, then A is invariant under π1 ◦ π2. So if A is invariant under some set S of permutations, then it is invariant under the group G of permutations generated by S.

Definition 7.2. A mixed strategy x ∈ ∆m is invariant under a permutation πx on

{1, . . . , m} if for all i, xi = xπx(i). Example 7.1. In Submarine Salvo, x is invariant for the permutation corresponding to a left-to-right flip if x1 = x3, x4 = x6, and x7 = x9.

22 Definition 7.3. An orbit of a group G of permutations is a set

Oi = {π(i): π ∈ G}.

Example 7.2. For the group generated by horizontal, vertical, and diagonal flips in Sub- marine Salvo, a few orbits are

O1 = {1, 3, 7, 9},O2 = {2, 4, 6, 8},O5 = {5}.

If a mixed strategy x is invariant under a group G of permutations, then for every orbit, x is constant on the orbit.

Theorem 7.1. If A is invariant under a group G of permutations, then there are optimal strategies x¯ and y¯ that are invariant under G.

Proof. Let x, y be an optimal strategies, and define the strategyx ¯ to have

1 X x¯i = xi0 , |Oi| 0 i ∈Oi

1 X y¯j = xj0 , |Oj| 0 j ∈Oj where Oi is the unique orbit containing move i for Player 1, and Oj is the unique orbit containing move j for Player 2. As an exercise, show that these are optimal.

7.3 Using invariance to solve games Using an optimal strategy that is symmetric across orbits, we can simplify a compli- cated payoff matrix. Letx ¯ andy ¯ be invariant optimal strategies. Let O1,...,O1 and 1 K1 O2,...,O2 be partitions of {1, . . . , m} and {1, . . . , n}, respectively. Letx ¯s be the value 1 K2 1 t 2 of xi for i ∈ Os , and lety ¯ be the value of yj for j ∈ Ot Then

m n K K   X X X1 X2 X X x¯iai,jy¯j =  x¯iai,jy¯j i=1 j=1 s=1 t=1 1 2 i∈Os j∈Ot   K1 K2 X X s X X t = x¯  ai,j y¯ s=1 t=1 1 2 i∈Os j∈Ot   K1 K2 X X 1 s X X ai,j 2 t = (|Os |x¯ )   (|Ot |y¯ ). |O1| · |O2| s=1 t=1 1 2 s t i∈Os j∈Ot

23 Note also that

K1 m K2 n X 1 s X X 2 t X |Os |x¯ = x¯i = 1, |Ot |y¯ = y¯j = 1, s=1 i=1 t=1 j=1 so we can simplify the matrix to a smaller payoff matrix on the orbits of moves (instead of on each move). The entries of the new matrix are the averages of the original ai,j elements over the orbits containing move i and move j for Players 1 and 2, respectively.

Example 7.3. In Submarine Salvo, we get the payoff matrix over orbits of actions

edge center corner 1/4 0 mid-edge 1/4 1/4 center 0 1

Solving this by finding dominated rows and columns, we get the optimal strategies

0 1 xˆ = 1 , yˆ = .   0 0

In terms of the original game, this means that an optimal strategy is for the Bomber is to put weight 1/4 for each mid-edge move and for the Submarine to put weight 1/8 on each of 1 2, 1 4, 2 3, 3 6, 4 7, 6 9, 7 8, and 8 9.

Example 7.4. In Rock, Paper, Scissors, each player’s moves fall into 1 orbit: O = {Rock, Paper, Scissors}. Then an optimal strategy for each player is

1/3 1/3 x¯ = 1/3 , y¯ = 1/3 . 1/3 1/3

24 8 Nash Equilibria, Linear Programming, and von Neumann’s Minimax Theorem

8.1 Nash equilibria 8.1.1 Optimality of Nash equilibria

∗ ∗ Definition 8.1. A pair (x , y ) ∈ ∆m × ∆n is a Nash equilibrium for a payoff matrix m×n A ∈ R if max x>Ay∗ = (x∗)>Ay∗ = min (x∗)>Ay. x∈∆m y∈∆n Think of these as locally optimal strategies. If Player 1 plays x∗ and Player 2 plays y∗, neither player has an incentive to change. Given a pair of safety strategies, we can get a Nash equilibrium, but a Nash equilibrium is a priori not necessarily a pair of safety strategies. The difference is that we do not require (x∗)>Ay∗ to be the value of the game. However, these are actually globally optimal strategies, as well.

Theorem 8.1. The pair (x∗, y∗) is a Nash equilibrium iff x∗ and y∗ are optimal.

Proof. ( =⇒ ) This is the same as the proof for the optimality of a saddle point.

min max x>Ay ≥ max min x>Ay y∈∆n x∈∆m x∈∆m y∈∆n ≥ min (x∗)>Ay y∈∆n = (x∗)>Ay∗ = max x>Ay∗ x∈∆m ≥ min max x>Ay. y∈∆n x∈∆m

( ⇐= ) The von Neumann minimax theorem implies that

(x∗)>Ay∗ ≥ min(x∗)>Ay y = max min x>Ay x y = min max x>Ay y x = max x>Ay∗ x ≥ (x∗)>Ay∗.

25 8.1.2 Indifference and Nash Equilibria Assume that a ∗ > . ∗ (x ) A = (a, . . . , a), . = Ay a for some constant a. Then

min(x∗)>Ay = a = (x∗)>Ay∗ = max x>Ay∗, y x so (x∗, y∗) is a Nash equilibrium. So x∗ an y∗ are optimal.

8.2 Solving zero-sum games using matrix inversion Here is a useful theorem that is a consequence of the principle of indifference. You can find the proof in the Ferguson book.

Theorem 8.2. Suppose the square matrix A is nonsingular and 1>A−11 6= 0. Then the game with matrix A has value V = (1>A−11)−1 and optimal strategies (x∗)> = V 1>A−1 and y∗ = VA−11, provided both x∗ ≥ 0 and y∗ ≥ 0.

3×3 Example 8.1. Let A ∈ R be   a1,1 0 0 A =  0 a2,2 0  0 0 a3,3 with each ai,i > 0. Using the theorem, we get

V = (1>A−11)−1     −1 1/a1,1 0 0 1 = (1, 1, 1)  0 1/a2,2 0  1 0 0 1/a3,3 1 1 = . 1/a1,1 + 1/a2,2 + 1/a3,3 We also get

(x∗)> = V 1>A−1   a1,1 0 0 = V (1, 1, 1)  0 a2,2 0  0 0 a3,3

26 1 = (1/a1,1, 1/a2,2, 1/a3,3), 1/a1,1 + 1/a2,2 + 1/a3,3

y∗ = VA−11     a1,1 0 0 1 = V  0 a2,2 0  1 0 0 a3,3 1 1 = (1/a1,1, 1/a2,2, 1/a3,3). 1/a1,1 + 1/a2,2 + 1/a3,3

8.3 Linear programming: an aside Definition 8.2. A linear program is an optimization problem involving the choice of a real vector to maximize a linear objective subject to linear constraints:

> > max x b such that d1 ≤ c1 x∈Rn . . > dk ≤ ck.

n > n Here, b ∈ R specifies the linear objective x → b x, and di ∈ R and ci ∈ R specify the i-th constraint. The set of values x that satisfy the constraints is a polytope (an intersection of half spaces). From the perspective of the row player, a two player zero-sum game is an opti- mization problem of the form

> > max min x Aei such that x1 ≤ 0 x∈Rn i∈{1,...,n} . . > xk ≤ 0 1>x = 1.

This is not a linear program; the constrants are linear, but hte objective is not. But we can > convert it to a linear program by introducting the slack variable Z = mini∈{1,...,n} x Aei. There are efficient (polynomial time) algorithms for solving linear programs. The col- umn player’s linear program is the dual of the row player’s linear program. In fact, for any concave maximization problem, like the row player’s linear program (we’ll call it the pri- mal problem), it is possible to define a dual convex minimization problem, like the column player’s linear program. This dual problem has a value that is at least as large the value of the primal problem.

27 In many important cases (such as our linear program), these values are the same. In optimization, this is called strong duality. This is von Neumann’s minimax theorem. The principle of indifference is a general property of dual optimization problems (called complementary duality).

8.4 Proof of von Neumann’s minimax theorem We want to prove the following theorem:

m×n Theorem 8.3. For any two-person zero-sum game with payoff matrix A ∈ R ,

min max x>Ay = max min x>Ay. y∈∆n x∈∆m x∈∆m y∈∆n

The textbook proves this theorem using the separating hyperplane theorem. We will prove this theorem in a more algorithmic way, developing an optimal strategy by learning from the other player’s optimal moves against ours. Consider a two-player zero-sum game that is repeated for T rounds. At teach round, the row player chooses an xt ∈ ∆m. Then the columns player chooses a yt ∈ ∆n, and the > row player receives a payoff of xt Ayt. The row player’s regret after T rounds is how much its total payoff falls short of the best in retrospect that it could have achieved against the column player’s choices with a fixed mixed strategy: T T X > X > RT = max x Ayt − xt Ayt. x∈∆m t=1 t=1 We will see that there are learning algorithms that have low regret against any sequence played by the column player. These learning algorithms don’t need to know anything about the game in advance; they just need to see, after each round, the column vector of payoffs corresponding to the column player’s choice.

Lemma 8.1. The existence of a row player with low regret (RT /T → 0 as T → ∞) implies the minimax theorem.

−1 PT Proof. Definex ¯ = T t=1 xt. Suppose that the column player plays a best response yt against the row player’s choice xt:

> > xt Ayt = min xt Ay. y∈∆n

−1 PT Definey ¯ = T t=1 yt. We then have

max min x>Ay ≥ min x¯>Ay x∈∆m y∈∆n y∈∆n

28 T 1 X > = min xt Ay y∈∆n T t=1 T 1 X > ≥ min xt Ay T y∈∆n t=1 T 1 X = x>Ay T t t t=1 T 1 X > RT = max x Ayt − x∈∆m T T t=1 R = max x>Ay¯ − T x∈∆m T R ≥ min max x>Ay − T y∈∆n x∈∆m T → min max x>Ay y∈∆n x∈∆m as T → ∞.

The proof shows thatx ¯ andy ¯ are asymptotically optimal, in the sense that the gain of x¯ and the loss ofy ¯ approach the value of the game. Next lecture, we’ll consider a specific low regret learning algorithm: gradient ascent.

29 9 Gradient Ascent, Series Games, and Parallel Games

9.1 Gradient ascent

Here, we will describe a low regret (in the sense that RT /T → 0 as T → ∞) learning algorithm for a two player zero-sum game. This will complete our proof of the von Neumann minimax theorem. Fix x1 ∈ ∆m. On round t, play xt, observe yt, and choose

xt+1 = P∆m (xt + ηAyt), where η us a step size and P∆m is the projection onto ∆m:

2 P∆m (x) = arg min ka − xk2. a∈∆m

> Note that if F (x) = x Ayt, ∇F (x) = Ayy. This is a “gradient ascent” algorithm because Ayt is the gradient of the payoff when the column player plays yt.

Theorem 9.1. Let G = maxy∈∆n kAyk. Then the gradient ascent algorithm with η = p 2 2/(G T ) has regret √ 2 RT ≤ 2G T. Proof. Note that

T T X > X > Rt = max x Ayy − xt Ayt x∈∆m t=1 t=1 T X > = max (x − xt) Ayy. x∈∆m t=1

Fix a strategy x. How does kx − xtk evolve?

kx − xt+1k = kx − P∆m (xt + ηAyt)k The distance to the projection is at most the distance to the original point.

≤ kx − xt − ηAytk

Use the identity that ka + bk2 = kak2 + 2a · b + kbk2.

2 > 2 = kx − xtk − 2η(x − xt) Ayt + η kAytk.

So we get that

> 2 2 2 2 2η(x − xt) Ayt ≤ kx − xtk − kx − xt+1k + η kAytk .

30 We can use this inequality to get

T T T X 1 X η X (x − x )>Ay ≤ (kx − x k2 − kx − x k2) + kAy k2 t t 2η t t+1 2 t t=1 t=1 t=1 T 1 η X = (kx − x k2 − kx − x k2) + kAy k2 2η 1 T +1 2 t t=1 2 ηT G2 ≤ + . η 2 Choosing η = p2/(G2T ) and taking the max over x on the left side gives the result.

9.2 Series and parallel games 9.2.1 Series games

Say we have two games, G1 and G2. How can we combine these into a single game?

Definition 9.1. A series game is a game in which every turn, both players first play G1 then both play G2. > If the players play x1 and y1 in G1 and then x2 and y2 in G2, the payoff is x1 Ay1 + > ∗ ∗ x2 A2y2. The two games decouple; Player 1 should play x1 and x2, and Player 2 should ∗ ∗ play y1 and y2. If G1 has value V1, and G2 has value V2, the series game has value V1 + V2.

9.2.2 Parallel games Definition 9.2. A parallel game is a game in which both players simultaneously decide which game to play, and an action in that game. If they choose the same game, they get the payoff from that game. If they choose different games, the payoff is 0.

Player 1 can either play x1 in G1 or x2 in G2. Player 2 can either play y1 in G1 or > y2 in G2. If they both play G1, the payoff is x1 A1y1. If they both play G2, the payoff is > x2 sA2y2. Otherwise, the payoff is 0. So the matrix for the game can be expressed as a block matrix: A 0  1 . 0 A2 We can split the decisions into choosing a mixture of games and then, with in each game, choosing a strategy. Withing G1, Player 1 only needs to consider payoffs in G1; if Player II chooses G2, the payoff is 0, so Player 1 is indifferent about actions in that case. Thus, the players should play optimal strategies within each game, and the only choice is which game to play. So we can reduce the payoff matrix to involve V1 and V2 only: V 0  1 . 0 V2

31 We can solve this to find that Player 1 should play G1 with probability

V2 V1 + V2 and that the value of the game is 1 V = . 1/V1 + 1/V2 What if we are playing k games in parallel? The payoff matrix becomes   V1 0 ··· 0  ..   0 V2 . 0    .  ......   . . . .  0 0 ··· Vk

If any entries are 0, this is a saddle point. If all entries are nonzero, the matrix is invertible and we can solve it by taking the inverse, as before. We also get 1 V = . 1/V1 + ··· + 1/Vk

9.2.3 Electric networks The way values combine in these games is identical to the way resistances combine in electric networks. For resistors connected in series, the equivalent resistance is the sum of the resistances of the resistors. For resistors connected in parallel, the equivalent resistance is the reciprocal of the sum of the reciprocals of the resistances.

32 10 Two Player General-Sum Games

10.1 General-sum games and Nash equilibria Definition 10.1. A two-person general-sum game is sepcified by two payoff matrices m×n A, B ∈ R . Simultaneously, Player 1 chooses i ∈ {1, . . . , m}, and Player 2 chooses j ∈ {1, . . . , n}. Player 1 receives payoff ai,j, and Player 2 receives payoff bi,j. Because it is easier to view, we will often write a single bimatrix, that is a matrix with ordered pair entries (ai,j, bi,j). Example 10.1. A zero-sum game is the case when B = −A.

Definition 10.2. A pure strategy ei for Player 1 is dominated by ei0 in the payoff matrix A if, for all j ∈ {1, . . . , n}, ai,j ≤ ai0,j. Similarly, a pure strategy ej for Player 2 is dominated by ej0 in the payoff matrix B if, for all i ∈ {1, . . . , n}, bi,j ≤ bi,j0 . ∗ Definition 10.3. A safety strategy for Player 1 is an x ∈ ∆m that satisfies max min x>Ay = min (x∗)>Ay. x∈∆m y∈∆n y∈∆n ∗ A safety strategy for Player 2 is a y ∈ ∆n that satisfies max min x>By = min x>By∗. y∈∆n x∈∆m x∈∆m So x∗ and y∗ maximize the worst case expected gain for Player 1 and Player 2, respec- tively. Recall that for zero-sum games, the safety strategy for Player 2 was defined using A (because in that case, B = −A):

min max x>Ay = max x>Ay∗. y∈∆n x∈∆m x∈∆m These definitions coincide because taking out the negative switches the max to a min (and vice versa).

∗ ∗ m×n Definition 10.4. A pair (x , y ) ∈ R is a Nash equilibruim for payoff matrices A, B ∈ m×n R if max x>Ay∗ = (x∗)>Ay∗, x∈∆m max(x∗)>Ay = (x∗)>By∗. y∈∆n This is a strategy where if Player 1 plays x∗ and Player 2 plays y∗, neither player has an incentive to unilaterally deviate. In other words, x∗ is a best response to y∗, and y∗ is a best response to x∗. For zero-sum games, we saw that Nash equilibria were safety strategies, and the payoff from playing them was the value of the game. However, in general-sum games, there might be many Nash equilibria, with different payoffs.

33 10.2 Examples of general-sum games Example 10.2. Here is the Prisoners’ Dilemma. Two suspects are imprisoned by the police, who ask each of them to confess. The charge is serious, but there is not enough evidence to convict the suspects. Separately (in different rooms), each prisoner is offered the following plea deal: • If one prisoner confesses, and the other prisoner remains silent, the confessor goes free, and their confession is used to sentence the other prisoner to ten years of jail.

• If both confess, they will both spend eight years in jail. • If both remain silent, the sentence is one year to each for the minor crime that can be proved without additional evidence. The payoff bimatrix for this game is silent confess silent (−1, −1) (−10, 0) confess (0, −10) (−8, −8). If each player solves their own payoff matrix, then they will each choose to confess with probability 1. Example 10.3. Two hunters are following a stag, when a hare runs by. Each hunter has to make a split-second decision: to chase the hare or to continue tracking the stag. The hunters must cooperate to catch the stag, but each hunter can catch the hare on his own. If they both go for the hare, they share it. The payoff bimatrix for this game is stag hare stag (4, 4) (0, 2) hare (2, 0) (1, 1). For each player, a safety strategy is to go for the hare. So (hare, hare) is a pure Nash equilibrium with payoff (1, 1). Another pure Nash equilibruim is (stag, stag). Let’s find a mixed Nash equilibrium. For ((x, 1−x), (y, 1−y)) to be a Nash equilibrium, the players don’t want to shift to a different mixture. If Player 2 plays first and plays (1 − y, y), the the payoff for Player 1 is 4 0  y   4y  (x, 1 − x) = (x, 1 − x) . 2 1 1 − y 2y + 1 − y

So Player 1 will play e1 if 4y > 2y + 1 − y. Player 1 will play e2 if 4y < 2y + 1 − y. This means that Player 1 will play a mixed strategy (x, 1 − x) if and only if 4y − 2y + 1 − y.

34 Similarly, if Player 2 plays second, Player 2 will play a safety strategy if and only if

4x = 2x + 1 − x.

Solving this, we get that ((1/3, 2/3), (1/3, 2/3)) is a mixed Nash equilibrium. The payoff is (4/3, 4/3).

Example 10.4. Player 1 is choosing between parking in a convenient but illegal parking spot (payoff 10 if they are not caught) and parking in a legal but inconvenient spot (payoff 0). If Player 1 parks illegally and is caught, they will pay a hefty fine (payoff −90). Player 2, the inspector representing the city, needs to decide whether to check for illegal parking. There is a small cost (payoff −1) to inspecting. However, there is a greater cost to the city if Player 1 has parked illegally since that can disrupt traffic (payoff −10). This cost is partially mitigated if the inspector catches the offender (payoff −6). The payoff bimatrix for this game is

inspect chill illegal (−90, −6) (10, −10) legal (0, −1) (0, 0).

Safety strategies are for Player 1 to park legally and for Player 2 to inspect the parking spot.3 There are no pure Nash equilibria. What about mixed Nash equilibria? For (x, y) to be a Nash equilibrium (where we implicitly mean the strategies are ((x, 1 − x), (y, 1 − y))), the players don’t want to shift to a different mixture. The strategies need to satisfy

0 = 10(1 − y) − 90y, −10x = −(1 − x) − 6x.

So (1/5, 1/10) is a Nash equilibrium. The expected payoff is (0, −2).

3Let this be a lesson to you.

35 11 Two-Player and Multiple-Player General-Sum Games

11.1 More about two-player general-sum games 11.1.1 Cheetahs and gazelles Here is another example of a two-player general-sum game. Example 11.1. Two cheetahs are chasing a pair of antelopes, one large and one small. Each cheetah has two possible strategies: chase the large antelope (L) or chase the small antelope (S). The cheetahs will catch any antelope they choose, but if they choose the same one, they must share the spoils. Otherwise, the catch is unshared. The large antelope is worth `, and the small one is worth s. The payoff bimatrix for this game is

large small large (`/2, `/2) (`, s) small (s, `)(s/2, s/2)

If ` ≥ 2s, then large is a dominant strategy. If ` ≤ 2s, then the pure Nash equilibria are (large, small) and (small, large). What about a mixed Nash equilibrium? If Cheetah 1 plays P(large) = x, then Cheetah 2’s payoffs are ` large L(x) = x + `(1 − x), 2 s small S(x) = sx + (1 − x). 2 Equilibrium is when these are equal: 2` − s x∗ = . ` + s For example, if ` = 8 and s = 6, then x∗ = 5/7. Think of x∗ as the proportion of a population that would greedily pursue the large gazelle. For a randomly chosen pair of cheetahs, if x > x∗, S(x) > L(x), and non-greedy cheetahs will do better (and vice versa). Evolution pushes the proportion to x∗; this is the evolutionarily stable strategy.

11.1.2 Comparing two-player zero-sum and general-sum games How do two player general-sum games differ from the zero-sum case?

• Zero-sum games – A pair of safety strategies is a Nash equilibrium (minimax theorem)

36 – There is always a Nash equilibrium. – If there are multiple Nash equilibria, they form a convex set, and the expected payoff is identical within that set. – Any two Nash equilibria give the same payoff. – If each player has an equalizing mixed strategy (that is, x>A = V 1> and Ay = V 1), then this pair of strategies is a Nash equilibrium (from the principle of indifference).

• General-sum games – A pair of safety strategies might be unstable. (opponent aims to maximize their payoff, not minimize mine). – There is always a Nash equilibrium (Nash’s theorem). – There can be multiple Nash equilibria with different payoff vectors. – If each player has an equalizing mixed strategy for their opponent’s payoff matrix > > (that is, x B = V21 and Ay = V11), then this pair of strategies is a Nash equilibrium.

11.2 Multiplayer general-sum games

A k-person general-sum game is specified by k utility functions Uj : S1 ×S2 ×· · ·×Sk → R. Player j can choose strategies sj ∈ Sj. Simultaneously, each player chooses a strategy. Player j receives the payoff uj(s1, . . . , sk). In the case where k = 2, we have the familiar u1(i, j) = ai,j and u2(i, j) = bi,j. For s = (s1, . . . , sk), we denote s−i as the strategies without the ith one:

s−i = (s1, . . . , si−1, si+1, . . . , sk).

We then write (si, s−i) as the full vector. ∗ ∗ Definition 11.1. A vector (s1, . . . , sk) ∈ S1 ×· · ·×Sk is a pure Nash equilibrium for utility functions u1, . . . , uk if for each player j ∈ {1, . . . , k},

∗ ∗ ∗ max uj(sj, s−j) = uj(sj , s−j). sj ∈Sj

∗ If the players play these sj , nobody has an incentive to unilaterally deviate; each player’s strategy is a best response to the other players’ strategies.

∗ ∗ Definition 11.2. A sequence (x1, . . . , xk) ∈ ∆S1 × · · · × ∆Sk is a Nash equilibrium (also called a strategy profile) for utility functions u1, . . . , uk if, for each player j ∈ {1, . . . , k},

∗ ∗ ∗ max uj(xj, x−j) = uj(xj , x−j). xj ∈∆Sj

37 Here, we define

∗ uj(x ) = Es1∼x,...,sk∼xk uj(s1, . . . , sk) X = x1(s1) ··· xk(sk)uj(s1, . . . , sk).

s1∈S1,...,sk∈Sk ∗ If the players play these mixed strategies xj , nobody has an incentive to unilaterally deviate; each player’s mixed strategy is a best response to the other players’ mixed strate- gies.

Lemma 11.1. Consider a k-player game where xi is the mixed strategy of player i. For each i, let Ti = {s : xi(s) > 0}. Then (x1, . . . , xk) is a Nash equilibrium if and only if for each i, there is a constant ci such that

1. For all si ∈ Ti, ui(si, x−i) = ci.

2. For all si ∈/ Ti, ui(si, x−i) ≤ ci. Example 11.2. Three firms will either pollute a lake in the following year or purify it. They pay 1 unit to purify, but it is free to pollute. If two or more pollute, then the water in the lake is useless, and each firm must pay 3 units to obtain the water that they need from elsewhere. If at most one firm pollutes, then the water is usable, and the firms incur no further costs. If firm 3 purifies, the cost trimatrix (cost = − payoff) is

purify pollute purify (1, 1, 1) (1, 0, 1) pollute (0, 1, 1) (3, 3, 4) If firm 3 pollutes, the cost trimatrix is purify pollute purify (1, 1, 0) (4, 3, 3) pollute (3, 4, 3) (3, 3, 3)

Three of the pure Nash equilibria are (purify, purify, pollute), (purify, pollute, purify), and (pollute, purify, purify). There is also the Nash equilibrium of (pollute, pollute, pollute), which is referred to as the “.” Let xi = (pi, 1 − pi) (that is, i purifies with probability pi). It follows from the previous lemma that these strategies are a Nash equilibrium with 0 < pi < 1 if and only if

ui(purify, x−i) = ui(pollute, x−i).

So if 0 < p1 < 1, then

p2p3 + p2(1 − p3) + p3(1 − p2) + 4(1 − p2)(1 − p3)

38 = 3p2(1 − p3) + 3p3(1 − p2) + 3(1 − p2)(1 − p3), or, equivalently, 1 = 3(p2 + p3 − 2p2p3). Similarly, we get 1 = 3(p1 + p3 − 2p1p3),

1 = 3(p1 + p2 − 2p1p2). Solving gives us two symmetric Nash equilibria: √ 3 ± 3 p = p = p = . 1 2 3 6

39 12 Indifference of Nash Equilibria, Nash’s Theorem, and Po- tential Games

12.1 Indifference of Nash equilibria in general-sum games Last lecture, we stated a useful lemma for multiplayer general-sum games.

Lemma 12.1. Consider a strategy profile x ∈ ∆S1 × · · · × ∆Sk . Let Ti = {s ∈ Si : xi(s) > 0}. Then x is a Nash equilibrium iff for each i there is a ci such that

1. For si ∈ Ti, ui(si, x−i) = ci (indifferent within Ti).

2. For si ∈ Si, ui(si, x−i) ≤ ci (no better response outside Ti).

Proof. ( =⇒ ) Suppose that x is a Nash equilibrium. Let i = 1 and c1 := u1(x). Then u1(s1, x−1) ≤ u1(x) = c1 for all s1 ∈ S1 be the definition of Nash equilibrium. Now observe that

c1 = u1(x) X = x1(s1) ··· xk(sk)u1(s1, . . . , sk)

s1∈T1,s2∈S2,...,Sk∈Sk   X X = x1(s1)  x2(s2) ··· xk(sk)u1(s1, . . . , sk) s1∈T1 s2∈S2,...,Sk∈Sk X = x1(s1)u1(s1, . . . , sk)

s1∈T1 X ≤ x1(s1)u1(x1, . . . , sk)

s1∈T1 X = x1(s1)c1

s1∈T1

= c1.

Since the inequality is actually an equality, we must have that u1(s1, . . . , sk) = u1(x1, . . . , sk) for each s1 ∈ T1. ( ⇐= ) Now assume that the latter conditions hold. Then X X u1(x) = u1(x1, x−1) = x1(s1)u1(s1, x−1) = x1(s1)c1 = c1,

s1∈T1 s1∈T1 and ifx ˜ ∈ ∆S1 , then X X u1(˜x1, x−1) = x˜1(s1)u1(s1, x−1) ≤ x˜1(s1)c1 = c1.

s1∈S1 s1∈S1

40 12.2 Nash’s theorem Theorem 12.1 (Nash). Every finite general-sum game has a Nash equilibrium.

Proof. We give a sketch of the proof for the two player case. We find an “improvement” map M(x, y) = (ˆx, yˆ), so that

1.ˆx>Ay > x>Ay (orx ˆ = x if such anx ˆ does not exist).

2. x>Ayˆ > x>Ay (ory ˆ = y if such any ˆ does not exist).

3. M is continuous.

A Nash equilibrium is a fixed point of M. The existence of a Nash equilibrium follows from Brouwer’s fixed-point theorem. > > How do we find M? Set ci(x, y) := max{ei Ay − x Ay, 0}. Then define

x1 + ci(x, y) xˆi = Pm . 1 + k=1 ck(x, y) We can constructy ˆ in a similar way.

Here is the precise statement of the theorem that does most of the work in the proof of Nash’s theorem.

Theorem 12.2 (Brouwer’s Fixed-Point Theorem). A continuous map f : K → K from d a convex, closed, bounded set K ⊆ R has a fixed point; that is, there exists some x ∈ K such that f(x) = x.

We will not provide a proof, but here is some intuition. In one dimension, a continuous map f from an interval [a, b] to the same interval must intersect the identity map (this is a diagonal of the square [a, b] × [a, b]). In two dimensions, this is related to the Hairy Ball theorem (a hair on a surface must point straight up somewhere). In general, the theorem is non-constructive, so it does not tell us how to get the fixed-point.

Remark 12.1. Not all games have a pure Nash equilibrium. There may only be mixed Nash equilibria.

41 12.3 Potential and Congestion games 12.3.1 Congestion games Example 12.1. Consider a game on the following graph:

Three people want to travel from location S to location T and pick a path on the graph. On each of the edges, there is a congestion vector related to how many people choose to take the edge. For example, the edge from B to T takes 2 minutes to traverse if 1 person travels along it, 4 minutes for each person if 2 people travel along it, and 8 minutes for each person if all 3 people travel along the edge. The players each want to minimize the time it takes for them to reach location T .

Definition 12.1. A has k players and m facilities {1, . . . , m} (edges). For Player i, there is a set Si of strategies that are sets of facilities, s ⊆ {1, . . . , m} (paths). k For facility j, there is a cost vector cj ∈ R , where cj(n) is the cost of facility j when it is used by n players. For a sequence s = (s1, . . . , sn), the of the players are defined by X costi(s) = −ui(s) = xj(nj(s)),

j∈si where nj(s) = |{i : j ∈ si}| is the number of players using facility i. A congestion game is egalitarian in the sense that the utilities depend on how many players use each facility, not on which players use it.

Theorem 12.3. Every congestion game has a pure Nash equilibrium.

Proof. We define a potential function Φ : S1 × · · · × Sk → R as

m nj (s) X X Φ(s) := cj(`) j=1 `=1

42 for fixed strategies for the k players s = (s1, . . . , sk). What happens when Player i changes 0 from si to si? We get that

0 ∆costi = costi(si, s−i) − costi(s) X X = cj(nj(s) + 1) − cj(nj(s)) 0 j∈(si,s−i) j∈(si,s−i) 0 = Φ(si, s−i) − Φ(si, s−i) = ∆Φ.

If we start at an arbitrary s, and update one player’s choice to decrease that player’s cost, the potential must decrease. Continuing updating other player?s strategies in this way, we must eventually reach a local minimum (there are only finitely many strategies). Since no player can reduce their cost from there, we have reached a pure Nash equilibrium. This gives an algorithm for finding a pure Nash equilibrium: update the choice of one player at a time to reduce their cost.

12.3.2 Potential games

Definition 12.2. A game has k players. For Player i, there is a set $S_i$ of strategies and a cost function $\text{cost}_i : S_1 \times \cdots \times S_k \to \mathbb{R}$. A potential game is one with a potential function $\Phi : S_1 \times \cdots \times S_k \to \mathbb{R}$ such that
$$\Phi(s_i', s_{-i}) - \Phi(s_i, s_{-i}) = \text{cost}_i(s_i', s_{-i}) - \text{cost}_i(s_i, s_{-i}).$$

Congestion games are an example of potential games. In considering congestion games, we actually proved the following theorem.

Theorem 12.4. Every potential game has a pure Nash equilibrium.

There is also a converse to the statement that congestion games are potential games.

Theorem 12.5. Every potential game has an equivalent congestion game.

Here, an equivalent game means we can find a way to map the strategies of one game to the strategies of the other so that the utilities are identical. But the congestion game might be much larger: for k players, each with $|S_i| = \ell$, the proof involves constructing a congestion game with $2k\ell$ facilities.

13 Evolutionary Game Theory

13.1 Criticisms of Nash equilibria

What's wrong with Nash equilibria? There are many criticisms one might have:

• Will all players know everyone’s utilities?

• Maximizing expected utility does not (explicitly) model risk.

• Will players maximize utility and completely ignore the impact on other players’ utilities?

• How can the players find a Nash equilibrium?

• How can the players agree on a Nash equilibrium to play?

• Will players actually randomize?

We will discuss some alternative equilibrium concepts:

1. Correlated equilibria

2. Evolutionary stability

3. Equilibria in perturbed games

13.2 Evolutionarily stable strategies

Say there is a population of individuals. There is a game played between randomly chosen pairs of individuals, where each individual has a pure strategy encoded in its genes. A higher payoff gives higher reproductive success. This can push the population towards stable mixed strategies.

Consider a two-player game with payoff matrices A, B, and suppose that it is symmetric ($A = B^\top$). Consider a mixed strategy x; think of x as the proportion of each pure strategy in the population. Suppose that x is invaded by a small population of mutants z (that is, individuals playing strategy z). The criteria for x to be an evolutionarily stable strategy will imply that, for small enough ε, the average payoff for the xs will be strictly greater than that for the zs, so the invaders will disappear.

Will the mix x survive? Say a player who plays x goes against an invader. Then the expected payoff is $x^\top A z$. If, instead, a player with strategy x goes against another one with strategy x, then the expected payoff is $x^\top A x$. Since $1 - \varepsilon$ is the proportion of players with strategy x, and ε is the proportion of players with strategy z, the utility of a player with strategy x is
$$(1 - \varepsilon)x^\top A x + \varepsilon x^\top A z = x^\top A((1 - \varepsilon)x + \varepsilon z).$$

Similarly, the utility for an invader is
$$(1 - \varepsilon)z^\top A x + \varepsilon z^\top A z = z^\top A((1 - \varepsilon)x + \varepsilon z).$$

Definition 13.1. A mixed strategy $x \in \Delta_n$ is an evolutionarily stable strategy (ESS) if, for any pure strategy z,

1. $z^\top A x \le x^\top A x$ (that is, (x, x) is a Nash equilibrium).

2. If $z^\top A x = x^\top A x$, then $z^\top A z < x^\top A z$.

13.3 Examples of strategies within populations

Example 13.1. Two players play a game of Hawks and Doves for a prize of value v > 0. They confront each other, and each chooses (simultaneously) to fight or to flee; these two strategies are called the "hawk" (H) and the "dove" (D) strategies, respectively. If they both choose to fight (two hawks), then each incurs a cost c, and the winner (each is equally likely) takes the prize. If a hawk faces a dove, the dove flees, and the hawk takes the prize. If two doves meet, they split the prize equally. The payoff bimatrix is

           H                        D
H    (v/2 − c, v/2 − c)    (v, 0)
D    (0, v)                     (v/2, v/2)

If, for example, we set v = c = 2, we get the payoff bimatrix

           H             D
H    (−1, −1)    (2, 0)
D    (0, 2)        (1, 1)

The pair (x, x) with x = (1/2, 1/2) is a Nash equilibrium. Is it an evolutionarily stable strategy? Consider a mutant pure strategy z. We have $z^\top A x \le x^\top A x$ because (x, x) is a Nash equilibrium. If $z^\top A x = x^\top A x$, is $z^\top A z < x^\top A z$? For z = (1, 0) (that is, H),
$$z^\top A z = -1 < -1/2 = x^\top A z.$$
For z = (0, 1) (that is, D),
$$z^\top A z = 1 < 3/2 = x^\top A z.$$
So x is an ESS.

Example 13.2. Consider a game of rock-paper-scissors. The payoff matrix for Player 1 is

        R     P     S
R      0    −1     1
P      1     0    −1
S     −1     1     0

The pair (x, x) with x = (1/3, 1/3, 1/3) is a Nash equilibrium. Is it an ESS? We need to check that if $z^\top A x = x^\top A x$, then $z^\top A z < x^\top A z$. But for any pure strategy z, $z^\top A x = 0 = x^\top A x$, while $z^\top A z = 0 = x^\top A z$, so the strict inequality fails. So x is not an ESS.
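These conditions are easy to check numerically. Below is a small Python sketch (my own, not from the notes) that tests the two ESS conditions against all pure-strategy invaders, applied to Hawks and Doves with v = c = 2 and to rock-paper-scissors.

import numpy as np

def is_ess(A, x, tol=1e-9):
    """Check the ESS conditions for mixed strategy x in the symmetric game A:
    for every pure z: (1) z'Ax <= x'Ax, and
    (2) if z'Ax = x'Ax, then z'Az < x'Az."""
    xAx = x @ A @ x
    for i in range(len(x)):
        z = np.eye(len(x))[i]
        if np.allclose(z, x):
            continue                        # z = x is not an invader
        zAx = z @ A @ x
        if zAx > xAx + tol:
            return False                    # (x, x) is not a Nash equilibrium
        if abs(zAx - xAx) <= tol and z @ A @ z >= x @ A @ z - tol:
            return False                    # the invader z would not die out
    return True

A_hd = np.array([[-1.0, 2.0], [0.0, 1.0]])  # Hawks and Doves with v = c = 2
print(is_ess(A_hd, np.array([0.5, 0.5])))   # True
A_rps = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], float)
print(is_ess(A_rps, np.ones(3) / 3))        # False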

The example of rock-paper-scissors shows us that cycles can occur, with the population shifting between strategies. This actually happens in nature.

Example 13.3. The males of the Uta Stansburiana lizard come in three colors. The colors correspond to different behaviors, which allow them to attract female mates:

1. Orange throat (aggressive, large harems, defeats blue throat)

2. Blue throat (less aggressive, small harems, defeats yellow striped)

3. Yellow striped (submissive, look like females, defeats orange throat4)

In nature, there is a 6 year cycle of shifting population proportions between these three colors.

4The yellow-striped lizards sneak into the territory of the orange throats and woo away the females.

14 Evolutionary Game Theory of Mixed Strategies and Multiple Players

14.1 Relationships between ESSs and Nash equilibria We have mentioned this before, but it is worth stating explicitly.

Theorem 14.1. Every ESS is a Nash equilibrium.

Proof. This follows from the definition. We have that for each pure strategy z, $z^\top A x \le x^\top A x$. Any mixed strategy is $w = \sum_{j=1}^n c_j z_j$ for $c_j \ge 0$ and $\sum_{j=1}^n c_j = 1$. Then
$$w^\top A x = \Big(\sum_{j=1}^n c_j z_j\Big)^\top A x = \sum_{j=1}^n c_j (z_j^\top A x) \le \sum_{j=1}^n c_j\, x^\top A x = x^\top A x.$$

Does this theorem have a converse?

Definition 14.1. A strategy profile $x^* = (x_1^*, \ldots, x_k^*) \in \Delta_{S_1} \times \cdots \times \Delta_{S_k}$ is a strict Nash equilibrium for utility functions $u_1, \ldots, u_k$ if for each $j \in \{1, \ldots, k\}$ and for each $x_j \in \Delta_{S_j}$ with $x_j \neq x_j^*$,
$$u_j(x_j, x^*_{-j}) < u_j(x_j^*, x^*_{-j}).$$
This is the same definition as for a Nash equilibrium, except that the inequality in the definition is strict. By the principle of indifference, only a pure Nash equilibrium can be a strict Nash equilibrium.

Theorem 14.2. Every strict Nash equilibrium is an ESS.

Proof. A strict Nash equilibrium has $z^\top A x < x^\top A x$ for $z \neq x$, so both conditions defining an ESS are satisfied. In particular, for the second condition, the case where $z^\top A x = x^\top A x$ for $z \neq x$ never occurs.

14.2 Evolutionary stability against mixed strategies

An ESS is a Nash equilibrium $(x^*, x^*)$ such that for all $e_i \neq x^*$, if $e_i^\top A x^* = (x^*)^\top A x^*$, then $e_i^\top A e_i < (x^*)^\top A e_i$. But what about mixed strategies?

Definition 14.2. A symmetric strategy $(x^*, x^*)$ is evolutionarily stable against mixed strategies (ESMS) if

1. $(x^*, x^*)$ is a Nash equilibrium.

2. For all mixed strategies $z \neq x^*$, if $z^\top A x^* = (x^*)^\top A x^*$, then $z^\top A z < (x^*)^\top A z$.

Sometimes, people refer to these as ESSs.

Theorem 14.3. For a two-player 2 × 2 symmetric game, every ESS is ESMS.

Proof. Assume that $x = (q, 1 - q)$ with $q \in (0, 1)$ is an ESS. Let $z = (p, 1 - p)$ for $p \in (0, 1)$ be such that $z^\top A x = x^\top A x$. Since $e_1^\top A x \le x^\top A x$, $e_2^\top A x \le x^\top A x$, and $z^\top A x = p\, e_1^\top A x + (1 - p)\, e_2^\top A x$, we must have that
$$e_1^\top A x = e_2^\top A x = x^\top A x.$$
Hence, q is obtained through the equalizing conditions, and
$$q = \frac{a_{1,2} - a_{2,2}}{a_{1,2} + a_{2,1} - a_{1,1} - a_{2,2}}.$$

Next, define the function $G(p) := x^\top A z - z^\top A z$. We want to show that G is positive for $p \neq q$. Expanding,
$$G(p) = (a_{2,1} - a_{1,1})[p^2 - pq] + (a_{1,2} - a_{2,2})[q - qp - p + p^2].$$

Since $e_1^\top A x = x^\top A x$, the second ESS condition gives $e_1^\top A e_1 < x^\top A e_1$. The latter is equivalent to $a_{1,1} < q a_{1,1} + (1 - q) a_{2,1}$, which gives us that $a_{1,1} < a_{2,1}$. Similarly, $a_{2,2} < a_{1,2}$. By inspection, we see that $G(0) = q(a_{1,2} - a_{2,2}) > 0$ and $G(1) = (1 - q)(a_{2,1} - a_{1,1}) > 0$. Moreover, $G'(p) = 0$ if and only if
$$0 = (a_{2,1} - a_{1,1})[2p - q] + (a_{1,2} - a_{2,2})[2p - q - 1],$$
which is equivalent to
$$2p[a_{1,2} + a_{2,1} - a_{1,1} - a_{2,2}] = q[a_{1,2} + a_{2,1} - a_{1,1} - a_{2,2}] + a_{1,2} - a_{2,2}.$$
From this, we get that the unique critical point is $p = q$, where $G(q) = 0$. Since G is a quadratic with positive leading coefficient, positive values at the endpoints, and minimum value 0 at $p = q$, we conclude that $G(p) > 0$ for all $p \neq q$, which is the ESMS condition.

Example 14.1. Here is an example where an ESS is not an ESMS. Consider the symmetric game with matrix
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 20 \\ 1 & 20 & 0 \end{pmatrix}.$$

Then $x^* = e_1$ is an ESS, but it is not an ESMS: for $z = (1/3, 1/3, 1/3)^\top$ we have $z^\top A x^* = 1 = (x^*)^\top A x^*$, but
$$z^\top A z = 5 > 1 = (x^*)^\top A z.$$

14.3 Multiplayer evolutionarily stable strategies

Consider a symmetric multiplayer game (that is, unchanged by relabeling the players). Suppose that a symmetric mixed strategy x is invaded by a small population of mutants z; x is replaced by $(1 - \varepsilon)x + \varepsilon z$. Will the mix x survive? The utility for x is, by linearity,
$$u_1(x, \varepsilon z + (1 - \varepsilon)x, \ldots, \varepsilon z + (1 - \varepsilon)x) = \varepsilon\big(u_1(x, z, x, \ldots, x) + u_1(x, x, z, x, \ldots, x) + \cdots + u_1(x, \ldots, x, z)\big) + (1 - (n - 1)\varepsilon)\, u_1(x, \ldots, x) + O(\varepsilon^2).$$

Similarly, the utility for z is
$$u_1(z, \varepsilon z + (1 - \varepsilon)x, \ldots, \varepsilon z + (1 - \varepsilon)x) = \varepsilon\big(u_1(z, z, x, \ldots, x) + u_1(z, x, z, x, \ldots, x) + \cdots + u_1(z, x, \ldots, x, z)\big) + (1 - (n - 1)\varepsilon)\, u_1(z, x, \ldots, x) + O(\varepsilon^2).$$

Definition 14.3. Suppose, for simplicity, that the utility for Player i depends on $s_i$ and on the set of strategies played by the other players, but is invariant to a permutation of the other players' strategies. A strategy $x \in \Delta_n$ is an evolutionarily stable strategy (ESS) if for any pure strategy $z \neq x$,

1. $u_1(z, x_{-1}) \le u_1(x, x_{-1})$ (x is a Nash equilibrium).

2. If $u_1(z, x_{-1}) = u_1(x, x_{-1})$, then for all $j \neq 1$, $u_1(z, z, x_{-1,-j}) < u_1(x, z, x_{-1,-j})$.

15 Correlated Equilibria and Braess's Paradox

15.1 An example of inefficient Nash equilibria

Example 15.1. Consider an example of traffic, where two drivers have to decide whether to stop or go. Stopping has a cost of 1, and going has a payoff of 1. However, if both cars go, they crash, and the cost is 100 to each driver. The payoff bimatrix is

            Go                    Stop
Go      (−100, −100)    (1, −1)
Stop    (−1, 1)              (−1, −1)

The pure Nash equilibria are (Go, Stop) and (Stop, Go). To find mixed Nash equilibria, we solve
$$\begin{pmatrix} -100 & 1 \\ -1 & -1 \end{pmatrix} \begin{pmatrix} y \\ 1 - y \end{pmatrix} = \begin{pmatrix} v_1 \\ v_1 \end{pmatrix},$$
which gives the Nash equilibrium ((2/101, 99/101), (2/101, 99/101)). Under the mixed Nash equilibrium, each player gets a payoff of −1. Can we do better? Here is a better solution. Suppose there is a traffic signal with

P((Red, Green)) = P((Green, Red)) = 1/2, and both players agree that Red means Stop and Green means Go. After they both see the traffic signal, the players have no incentive to deviate from the agreed actions. The expected payoff for each player is 0, higher than that of the mixed Nash equilibrium.

15.2 Correlated strategy pairs and equilibria

Definition 15.1. For a two-player game with strategy sets $S_1 = \{1, \ldots, m\}$ and $S_2 = \{1, \ldots, n\}$, a correlated strategy pair is a pair of random variables (R, C) with some joint probability distribution over pairs of actions $(i, j) \in S_1 \times S_2$.

Example 15.2. In the traffic example, the traffic light induces a correlated strategy pair with joint distribution

            Go     Stop
Go         0       1/2
Stop      1/2      0

Compare this definition with a pair of mixed strategies. Let $x \in \Delta_{S_1}$ and $y \in \Delta_{S_2}$ be such that $P(R = i) = x_i$ and $P(C = j) = y_j$. Then choosing the two actions (R, C) independently gives $P(R = i, C = j) = x_i y_j$. In the traffic signal example, we cannot have $P(\text{Stop}, \text{Go}) > 0$ and $P(\text{Go}, \text{Stop}) > 0$ without $P(\text{Go}, \text{Go}) > 0$.

Definition 15.2. A correlated strategy pair for a two-player game with payoff matrices A and B is a correlated equilibrium if

1. $\forall i, i' \in S_1$: $P(R = i) > 0 \implies E[a_{i,C} \mid R = i] \ge E[a_{i',C} \mid R = i]$.

2. $\forall j, j' \in S_2$: $P(C = j) > 0 \implies E[b_{R,j} \mid C = j] \ge E[b_{R,j'} \mid C = j]$.

Compare this with Nash equilibria. Let $(x, y) \in \Delta_{S_1} \times \Delta_{S_2}$ be a strategy profile, and let R and C be independent random variables with $P(R = i) = x_i$ and $P(C = j) = y_j$. Then (x, y) is a Nash equilibrium iff

1. $\forall i, i' \in S_1$: $P(R = i) > 0 \implies E[a_{i,C}] \ge E[a_{i',C}]$.

2. $\forall j, j' \in S_2$: $P(C = j) > 0 \implies E[b_{R,j}] \ge E[b_{R,j'}]$.

This is because
$$E[a_{i,C}] = \sum_{j \in S_2} a_{i,j} P(C = j) = \sum_{j \in S_2} a_{i,j} y_j = e_i^\top A y,$$
coupled with the principle of indifference. Since R and C are independent, these expectations and the conditional expectations are identical. Thus, a Nash equilibrium is a correlated equilibrium.

Example 15.3. Consider the pair of random variables (R, C) with joint distribution
$$\begin{pmatrix} 0 & 1/3 \\ 1/3 & 1/3 \end{pmatrix},$$
so P(Go, Go) = 0 and P(Go, Stop) = P(Stop, Go) = P(Stop, Stop) = 1/3. Is this a correlated equilibrium for the traffic example? We need to check that

$$E[a_{\text{Stop},C} \mid R = \text{Stop}] \ge E[a_{\text{Go},C} \mid R = \text{Stop}], \qquad E[a_{\text{Go},C} \mid R = \text{Go}] \ge E[a_{\text{Stop},C} \mid R = \text{Go}],$$
$$E[b_{R,\text{Stop}} \mid C = \text{Stop}] \ge E[b_{R,\text{Go}} \mid C = \text{Stop}], \qquad E[b_{R,\text{Go}} \mid C = \text{Go}] \ge E[b_{R,\text{Stop}} \mid C = \text{Go}].$$
Notice that $P(C = \text{Go} \mid R = \text{Stop}) = 1/2$, so
$$E[a_{\text{Stop},C} \mid R = \text{Stop}] = -1 > -99/2 = E[a_{\text{Go},C} \mid R = \text{Stop}].$$
Also, $P(C = \text{Go} \mid R = \text{Go}) = 0$, so
$$E[a_{\text{Go},C} \mid R = \text{Go}] = 1 > -1 = E[a_{\text{Stop},C} \mid R = \text{Go}].$$
The conditions for Player 2 hold by symmetry. What is the expected payoff for each player? For Player 1, it is
$$E[a_{R,C}] = \tfrac{1}{3} a_{\text{Go},\text{Stop}} + \tfrac{1}{3} a_{\text{Stop},\text{Go}} + \tfrac{1}{3} a_{\text{Stop},\text{Stop}} = -\tfrac{1}{3}.$$
For Player 2, it is the same.

15.3 Interpretations and comparisons to Nash equilibria

How do correlated equilibria compare to Nash equilibria? Nash's Theorem implies that there is always a correlated equilibrium. They are also easy to find via linear programming. It is not unusual for correlated equilibria to achieve better solutions for both players than Nash equilibria, as in the traffic example. We can think of a correlated equilibrium being implemented in two equivalent ways:

1. There is a random draw of a correlated strategy pair with a known distribution, and each player sees only their own strategy.

2. There is a draw of a random variable (an "external event") with a known probability distribution, and a private signal is communicated to the players about the value of the random variable. Each player chooses a mixed strategy that depends on this private signal (and the dependence is common knowledge).

Given any two correlated equilibria, you can combine them to obtain another: Imagine a public random variable that determines which of the correlated equilibria will be played. Knowing which correlated equilibrium is being played, the players have no incentive to deviate. The payoffs are convex combinations of the payoffs of the two correlated equilibria.

15.4 Braess's paradox

In 2009, New York City closed Broadway at Times Square with the aim of reducing traffic congestion. It was successful. It seems counterintuitive that removing options for transportation would reduce traffic congestion. But there are other examples, as well:

• In 2005, the Cheonggyecheon highway was removed, speeding up traffic in downtown Seoul, South Korea.

• 42nd Street in NYC closed for Earth Day. Traffic improved.

• In 1969, congestion decreased in Stuttgart, West Germany, after closing a major road.

Why does this happen? Drivers, acting rationally, seek the fastest route, which can lead to bigger delays (on average, and even for everyone).

Example 15.4. Consider the following network from A to destination B, where the latency of traffic on each edge depends on the proportion of the traffic flow traveling along that edge.

The optimal flow is for 1/2 of the traffic to travel through C and 1/2 of the traffic to travel through D. What happens when we add an edge from C to D?

A Nash equilibrium flow has all of the traffic travel to C, then to D, and then to B. This has a latency of 2 for every driver, as opposed to the optimal flow from before, which only had a latency of 3/2 for each driver. So adding edges is not always efficient.

16 The Price of Anarchy

16.1 Flows and latency in networks

Last time we saw Braess's paradox, in which a Nash equilibrium resulted in an inefficient flow in a network. How can we quantify the inefficiency of Nash equilibria in a network?

Definition 16.1. For a routing problem, we define the price of anarchy as
$$\text{price of anarchy} = \frac{\text{average travel time in the worst Nash equilibrium}}{\text{minimal average travel time}}.$$
Note that the minimum is over all flows. The flow minimizing average travel time is the socially optimal flow. The price of anarchy reflects how much average travel time can decrease in going from a Nash equilibrium flow (where all individuals choose a path to minimize their travel time) to a prescribed flow.5

Example 16.1. Consider the following network.

The price of anarchy of this network is 1. Finding the socially optimal strategy is equivalent to minimizing the function
$$f(x_1) = a x_1^2 + b(1 - x_1)^2.$$
Setting $f'(x_1) = 0$ is equivalent to $a x_1 = b(1 - x_1)$, which is the Nash equilibrium condition.

5This was first defined by Elias Koutsoupias and Christos Papadimitriou. They were awarded the 2012 Gödel Prize (with four others).

Example 16.2. Consider the following network.

A Nash equilibrium flow occurs when x = 1. We can find an optimal flow by minimizing the function $f(x) = x^2 + (1 - x)$. This is minimized at x = 1/2, so the socially optimal strategy gives an average time of 3/4. So the price of anarchy is 4/3.
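A quick numerical sketch of this example (not from the notes): grid-search the split x of traffic over the two edges and compare the Nash average latency with the optimum.

import numpy as np

x = np.linspace(0, 1, 10001)
avg_latency = x**2 + (1 - x) * 1      # L(f) when a fraction x uses the x-edge
x_opt = x[np.argmin(avg_latency)]     # socially optimal split
L_opt = avg_latency.min()
L_nash = 1.0                          # at x = 1, both paths have latency 1
print(x_opt, L_opt, L_nash / L_opt)   # ~0.5, 0.75, ~1.333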

Definition 16.2. A flow f from source s to destination t in a directed graph is a mixture of paths from s to t, with mixture weight $f_P$ for path P. We write the flow on an edge e as
$$F_e = \sum_{P : e \in P} f_P.$$

Definition 16.3. Latency on an edge e is a non-decreasing function of $F_e$, written $\ell_e(F_e)$. The latency on a path P is the total latency
$$L_P(f) = \sum_{e \in P} \ell_e(F_e).$$
The average latency is
$$L(f) = \sum_P f_P L_P(f) = \sum_e F_e \ell_e(F_e).$$

Definition 16.4. A flow f is a Nash equilibrium flow if, for all paths P and P′, $f_P > 0$ implies $L_P(f) \le L_{P'}(f)$. In equilibrium, each driver chooses some lowest-latency path with respect to the current choices of the other drivers.

16.2 The price of anarchy for linear and affine latencies

Theorem 16.1. For a directed, acyclic graph (DAG) with latency functions `e that are continuous, non-decreasing, and non-negative, if there is a path from source to destination, there is a Nash equilibrium unit flow.

Proof. Here is the idea of the proof. This is the non-atomic version of a congestion game. For the atomic version (finite number of players), we showed that there is a pure Nash equilibrium that can be found by descending a potential function. The same approach works here. The potential function is

$$\phi(f) = \sum_e \int_0^{F_e} \ell_e(x)\, dx.$$
If f is not a Nash equilibrium flow, then $\phi(f)$ is not minimal. Since φ is convex on a convex, compact set, it attains a minimum.
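As a concrete illustration (my own sketch, not from the notes), the following Python snippet minimizes this potential for the two-edge network of Example 16.2, with latencies l1(t) = t and l2(t) = 1 and unit total flow; x is the flow on the first edge.

from scipy.optimize import minimize_scalar

# phi(x) = integral of t on [0, x] plus integral of 1 on [0, 1 - x]
phi = lambda x: x**2 / 2 + (1 - x)
res = minimize_scalar(phi, bounds=(0.0, 1.0), method="bounded")
print(res.x)   # ~1.0: the Nash equilibrium flow puts all traffic on the x-edge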

Theorem 16.2. For linear latencies, that is, $\ell_e(x) = a_e x$ with $a_e \ge 0$, if f is a Nash equilibrium flow and $f^*$ is a socially optimal flow (that is, $L(f^*)$ is minimal), then

L(f) = L(f ∗).

Proof. Since f is a Nash equilibrium, every path with $f_P > 0$ has minimal latency, so there is no advantage to shifting any flow from f to any other flow. In particular, there is no advantage to shifting from f to $f^*$:
$$L(f) = \sum_{P : f_P > 0} f_P L_P(f) \le \sum_P f_P^* L_P(f) = \sum_P f_P^* \sum_{e \in P} \ell_e(F_e) = \sum_e \Big(\sum_{P : e \in P} f_P^*\Big) \ell_e(F_e) = \sum_e F_e^* \ell_e(F_e) = \sum_e a_e F_e^* F_e$$
$$= \sum_e a_e\big[-(F_e - F_e^*)^2/2 + (F_e^{*2} + F_e^2)/2\big] \quad \text{(magic)}$$
$$\le \sum_e a_e (F_e^{*2} + F_e^2)/2 = \sum_e \big(F_e^* \ell_e(F_e^*) + F_e \ell_e(F_e)\big)/2 = (L(f^*) + L(f))/2,$$
so $L(f) \le L(f^*)$.

Corollary 16.1. For linear latency functions, the price of anarchy is 1.

Remark 16.1. In the proof above, we used a quadratic inequality to bound $F_e^* F_e$; one could also use the Cauchy-Schwarz inequality to do the same. Quadratic inequalities are useful because for any α, we have
$$xy = -\Big(\alpha x - \frac{y}{2\alpha}\Big)^2 + \alpha^2 x^2 + \frac{y^2}{4\alpha^2} \le \alpha^2 x^2 + \frac{y^2}{4\alpha^2}.$$
This shows that
$$xy = \min_\alpha \Big(\alpha^2 x^2 + \frac{y^2}{4\alpha^2}\Big).$$
If x and y have the same sign, then we can choose $\alpha^2 = y/(2x)$ to give $xy = \alpha^2 x^2 + y^2/(4\alpha^2)$, so in this case, these inequalities are tight. In bounding the price of anarchy, we could use any of these inequalities to give a linear bound relating L(f) to $L(f^*)$. The choice of $\alpha^2 = 1/2$ gives the best linear bound.

Theorem 16.3. For affine latencies, that is, $\ell_e(x) = a_e x + b_e$ with $a_e, b_e \ge 0$, if f is a Nash equilibrium flow and $f^*$ is a socially optimal flow (that is, $L(f^*)$ is minimal), then
$$L(f) \le \frac{4}{3} L(f^*).$$
Proof. Recall, because there is no advantage to shifting from f to $f^*$,

$$L(f) = \sum_e F_e \ell_e(F_e) \le \sum_e F_e^* \ell_e(F_e).$$

Hence
$$L(f) - L(f^*) = \sum_e \big(F_e \ell_e(F_e) - F_e^* \ell_e(F_e^*)\big) \le \sum_e F_e^* \big(\ell_e(F_e) - \ell_e(F_e^*)\big) = \sum_e F_e^* a_e (F_e - F_e^*)$$
$$= \sum_e a_e\big((F_e/2)^2 - (F_e^* - F_e/2)^2\big) \quad \text{(more magic)}$$
$$\le \frac{1}{4} \sum_e F_e (a_e F_e + b_e) = \frac{L(f)}{4}.$$
So $L(f) \le (4/3) L(f^*)$.

Corollary 16.2. For affine latency functions, the price of anarchy is ≤ 4/3.

16.3 The impact of adding edges

As we saw before, adding edges to a network can reduce efficiency. We can quantify this in relation to the price of anarchy.

Theorem 16.4. Consider a network G with a Nash equilibrium flow $f_G$ and average latency $L_G(f_G)$, and a network H obtained from G by adding roads. Suppose that the price of anarchy in H is no more than α. Then any Nash equilibrium flow $f_H$ has average latency

LH (fH ) ≤ αLG(fG).

Proof.
$$L_H(f_H) \le \alpha L_H(f_H^*) \le \alpha L_H(f_G) = \alpha L_G(f_G),$$
where $f_H^*$ is a socially optimal flow in H, and $f_G$, viewed as a flow in H, has the same average latency as it does in G.

Removing edges might improve the Nash equilibrium flow's latency by up to the price of anarchy. Which edges should we remove? It turns out finding the best edges to remove is NP-hard. For affine latencies, even finding edges to remove that give approximately the biggest reduction is NP-hard! It is easy to efficiently compute a Nash equilibrium flow that approximates the minimal-latency Nash equilibrium flow within a factor of 4/3: just compute a Nash equilibrium flow for the full graph. Nothing better is possible; assuming P ≠ NP, there is no (4/3 − ε)-approximation algorithm.

17 Pigou Networks and Cooperative Games

17.1 Pigou networks

Last time, we studied the price of anarchy for linear and affine latencies. More generally, suppose we allow latency functions from some class $\mathcal{L}$. So far, we have considered the classes
$$\mathcal{L}_{\text{linear}} = \{x \mapsto ax : a \ge 0\}, \qquad \mathcal{L}_{\text{affine}} = \{x \mapsto ax + b : a, b \ge 0\}.$$
What about the class
$$\mathcal{L} = \Big\{x \mapsto \sum_d a_d x^d : a_d \ge 0\Big\}$$
of polynomial latencies? We will insist that latency functions are non-negative and non-decreasing. It turns out that the price of anarchy in an arbitrary network with latency functions chosen from $\mathcal{L}$ is at most the price of anarchy in a certain small network with these latency functions: a Pigou network.

Definition 17.1. The Pigou price of anarchy is the price of anarchy for the two-edge Pigou network with latency function ℓ on one edge, constant latency ℓ(r) on the other, and total flow r:
$$\alpha_r(\ell) = \frac{r\,\ell(r)}{\min_{0 \le x \le r}\{x \ell(x) + (r - x)\ell(r)\}}.$$

Theorem 17.1. For any network with latency functions from $\mathcal{L}$ and total flow 1, the price of anarchy is no more than
$$A(\mathcal{L}) := \max_{0 \le r \le 1} \max_{\ell \in \mathcal{L}} \alpha_r(\ell).$$

Proof.
$$L(f) = \sum_e F_e \ell_e(F_e) = \sum_e \frac{F_e \ell_e(F_e)}{\min_{0 \le x \le F_e}\{x \ell_e(x) + (F_e - x)\ell_e(F_e)\}} \min_{0 \le x \le F_e}\{x \ell_e(x) + (F_e - x)\ell_e(F_e)\}$$
$$= \sum_e \alpha_{F_e}(\ell_e) \min_{0 \le x \le F_e}\{x \ell_e(x) + (F_e - x)\ell_e(F_e)\} \le \sum_e \alpha_{F_e}(\ell_e)\big(F_e^* \ell_e(F_e^*) + (F_e - F_e^*)\ell_e(F_e)\big)$$
$$\le \max_{r \in [0,1],\, \ell \in \mathcal{L}} \alpha_r(\ell)\Big(\sum_e F_e^* \ell_e(F_e^*) + \sum_e (F_e - F_e^*)\ell_e(F_e)\Big) \le \max_{r \in [0,1],\, \ell \in \mathcal{L}} \alpha_r(\ell) \sum_e F_e^* \ell_e(F_e^*) = \max_{r \in [0,1],\, \ell \in \mathcal{L}} \alpha_r(\ell)\, L(f^*),$$
where the last inequality holds because $\sum_e (F_e - F_e^*)\ell_e(F_e) = L(f) - \sum_e F_e^* \ell_e(F_e) \le 0$ for a Nash equilibrium flow f.

Example 17.1. Consider a Pigou network with r = 1, nonlinear latency $\ell(x) = x^d$ on the top edge, and constant latency $\ell(1) = 1$ on the bottom edge. The Nash equilibrium flow is concentrated completely on the top edge: L(f) = 1. The socially optimal flow gives
$$L(f^*) = \min_x\big(1 - x + x^{d+1}\big) = 1 - d(d + 1)^{-(d+1)/d}.$$
The price of anarchy is
$$\frac{1}{1 - d(d + 1)^{-(d+1)/d}} \sim \frac{d}{\ln(d)}.$$

What about $\alpha_r(\ell)$? Let $g(x) = x\ell(x) + (r - x)\ell(r) = x^{d+1} + (r - x)r^d$. Setting the derivative to zero, we get that $x^* = r/(d+1)^{1/d}$ is the point where g attains its minimum. So
$$\alpha_r(\ell) = \frac{r\ell(r)}{g(x^*)} = \frac{r^{d+1}}{\dfrac{r^{d+1}}{(d+1)^{(d+1)/d}} + r^{d+1} - \dfrac{r^{d+1}}{(d+1)^{1/d}}} \sim \frac{d}{\log d}.$$
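A quick numerical sketch (mine, not from the notes) tabulating this price of anarchy against the d / ln(d) approximation:

import numpy as np

for d in (1, 2, 5, 10, 100):
    L_opt = 1 - d * (d + 1) ** (-(d + 1) / d)   # optimal average latency, r = 1
    poa = 1.0 / L_opt                           # the Nash flow has L(f) = 1
    approx = d / np.log(d) if d > 1 else float("nan")
    print(d, round(poa, 3), round(approx, 3))   # d = 1 recovers PoA = 4/3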

17.2 Cooperative games

Let's review noncooperative games. Players play their strategies simultaneously. They might communicate (or see a common signal, e.g. a traffic signal), but there is no enforced agreement. The natural solution concepts are Nash equilibrium and correlated equilibrium. What if the players can cooperate?

In cooperative games, players can make binding agreements. For example, in the prisoner's dilemma, the prisoners can make an agreement not to confess. Both players gain from an enforceable agreement not to confess. There are two types of agreements.

Definition 17.2. An agreement has transferable utility if the players agree what strategies to play and what additional side payments are to be made.

Definition 17.3. An agreement has nontransferable utility if the players choose a joint strategy, but there are no side payments.

Example 17.2. Consider the game with payoff bimatrix

$$\begin{pmatrix} (2, 2) & (6, 2) & (1, 2) \\ (4, 3) & (3, 6) & (5, 5) \end{pmatrix}$$

What should the players agree to play if they cannot transfer utility? Try it with a friend!6

Definition 17.4. The set of payoff vectors that the two players can achieve is called the feasible set.

With nontransferable utility, the feasible set is the convex hull of the entries in the payoff bimatrix.

Definition 17.5. A feasible payoff vector $(v_1, v_2)$ is Pareto optimal if the only feasible payoff vector $(v_1', v_2')$ with $v_1' \ge v_1$ and $v_2' \ge v_2$ is $(v_1', v_2') = (v_1, v_2)$.

Example 17.3. In our cooperative game example, the feasible region is the convex hull of the six payoff vectors in the bimatrix above. The Pareto boundary is the part of the feasible region with nothing to the right of or above it.

6If you do not have any friends, send me an email, and I will play this game with you.

Example 17.4. Consider the same payoff bimatrix as before, but now assume that the payoff is in dollars.
$$\begin{pmatrix} (2, 2) & (6, 2) & (1, 2) \\ (4, 3) & (3, 6) & (5, 5) \end{pmatrix}$$
The two players need to agree on what they will play, and they can pay each other to incentivize certain strategies. What is the best total payoff that can be shared? How should it be shared? Try it with a friend!

With transferable utility, the players can choose to shift a payoff vector. For example, suppose a pure strategy pair gives payoff $(a_{i,j}, b_{i,j})$. Suppose the players agree to play it, and Player 1 will give Player 2 a payment of p. The payment shifts the payoff vector from $(a_{i,j}, b_{i,j})$ to $(a_{i,j} - p, b_{i,j} + p)$. The feasible region looks like this:

Here, the Pareto boundary is the line y = −x + 10.

18 Threat Strategies and Nash Bargaining in Cooperative Games

18.1 Threat strategies in games with transferable utility

Players negotiate a joint strategy and a side payment. Since they are rational, they will agree on a Pareto optimal payoff vector. Why? Players might make threats (and counter-threats) to justify their desired payoff vectors. If an agreement is not reached, they could carry out their threats. But reaching an agreement gives higher utility, so the threats are only relevant to choosing a reasonable side payment. Since players are rational, they will play on the Pareto set, which is defined by the payoff vectors with the largest total payoff,
$$\sigma := \max_{i,j}(a_{i,j} + b_{i,j}).$$

They agree on a cooperative strategy $(i_0, j_0)$ that has $a_{i_0,j_0} + b_{i_0,j_0} = \sigma$. The players will agree on a final payoff vector $(a^*, b^*) = (a_{i_0,j_0} - p, b_{i_0,j_0} + p)$, where p is the side payment from Player 1 to Player 2. To arrive at $(a^*, b^*)$, the players agree on threat strategies $(x, y) \in \Delta_m \times \Delta_n$. We will explore how they decide on their threat strategies after we've seen how threat strategies and the final payoff vector are related. The threat strategies give a certain payoff vector, called the disagreement point,

$$d = (d_1, d_2) := (x^\top A y,\, x^\top B y).$$

Neither player will accept less than their disagreement point payoff. This restricts the negotiation to a segment of the Pareto boundary, from $(d_1, \sigma - d_1)$ to $(\sigma - d_2, d_2)$. The other details of the game are now irrelevant, so it's reasonable to choose the symmetric solution, the midpoint of this interval:
$$(a^*, b^*) = \Big(\frac{\sigma - d_2 + d_1}{2},\, \frac{\sigma - d_1 + d_2}{2}\Big).$$

Now that we know the role of the disagreement point, we can see how the players should choose it. Player 1 wants to choose a threat strategy x to maximize $(\sigma - d_2 + d_1)/2$, and Player 2 wants to choose a threat strategy y to maximize $(\sigma - d_1 + d_2)/2$, where $(d_1, d_2) = (x^\top A y, x^\top B y)$. This is equivalent to a zero-sum game, with payoff $d_1 - d_2$ for Player 1 and payoff $d_2 - d_1$ for Player 2:
$$d_1 - d_2 = x^\top A y - x^\top B y = x^\top (A - B) y.$$
Suppose $x^*$ and $y^*$ are the optimal strategies for this zero-sum game with value
$$\delta = (x^*)^\top (A - B) y^*.$$
Then Player 1's best threat strategy is $x^*$, Player 2's best threat strategy is $y^*$, and the disagreement point is
$$d = (d_1, d_2) = ((x^*)^\top A y^*,\, (x^*)^\top B y^*).$$
The final payoff vector is
$$(a^*, b^*) = \Big(\frac{\sigma + \delta}{2},\, \frac{\sigma - \delta}{2}\Big).$$

So Player 1 pays Player 2 $a_{i_0,j_0} - (\sigma + \delta)/2$.

Example 18.1. Consider a game with the following payoff bimatrix.
$$\begin{pmatrix} (2, 2) & (6, 2) & (1, 2) \\ (4, 3) & (3, 6) & (5, 5) \end{pmatrix}$$
The payoff matrix for the zero-sum game of the disagreement points is
$$A - B = \begin{pmatrix} 0 & 4 & -1 \\ 1 & -3 & 0 \end{pmatrix}.$$
The cooperative strategy should be (2, 3), which gives payoff σ = 10. When we solve the zero-sum game, column 1 is dominated by column 3, so we get $x^* = (3/8, 5/8)$ and $y^* = (0, 1/8, 7/8)$. The value is δ = −3/8. The disagreement point is approximately (3.58, 3.95), and the final payoff vector is $((\sigma + \delta)/2, (\sigma - \delta)/2) \approx (4.81, 5.19)$. So Player 1 should pay about 0.19 to Player 2.
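The threat game can be solved mechanically by linear programming. Here is a Python sketch using scipy, applied to the matrix A − B above; the LP encoding (maximize the guaranteed value over the simplex) is the standard one, but the variable names are my own.

import numpy as np
from scipy.optimize import linprog

M = np.array([[0.0, 4.0, -1.0], [1.0, -3.0, 0.0]])   # A - B
m, n = M.shape
# Variables (x_1, ..., x_m, v): maximize v subject to x'M e_j >= v for all j
# and x in the simplex.  linprog minimizes, so the objective is -v.
c = np.r_[np.zeros(m), -1.0]
A_ub = np.c_[-M.T, np.ones(n)]            # rows encode v - (M'x)_j <= 0
b_ub = np.zeros(n)
A_eq = np.c_[np.ones((1, m)), 0.0]        # sum of x_i equals 1
b_eq = [1.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * m + [(None, None)])
x_star, delta = res.x[:m], res.x[m]
print(x_star, delta)                       # ~(3/8, 5/8), value -3/8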

What if two cooperative strategies are optimal? Any choice gives the same Pareto boundary. They give different disagreement points, but the value of the zero sum game (d1 − d2) must be the same. So the payment will depend on the choice of cooperative strategy, but the final payoff vector will be the same.

18.2 Nash bargaining model for nontransferable utility games

In general, what are the ingredients of a bargaining problem? Suppose we have a compact, convex feasible set $S \subseteq \mathbb{R}^2$ and a disagreement point $d = (d_1, d_2) \in \mathbb{R}^2$. Think of the disagreement point as the utility that the players get from walking away and not playing the game. We'll assume every $x \in S$ has $x_1 \ge d_1$ and $x_2 \ge d_2$, with strict inequalities for some $x \in S$.

Definition 18.1. A solution to a bargaining problem is a function F that takes a feasible set S and a disagreement point d and returns an agreement point $a = (a_1, a_2) \in S$.

Here are Nash's axioms for a bargaining problem:

1. Pareto optimality: The agreement point shouldn’t be dominated by another point for both players.

2. Symmetry: This is about fairness: if nothing distinguishes the players, the solution should be similarly symmetric.

3. Affine covariance: Changing the units (or a constant offset) of the utilities should not affect the outcome of bargaining.

4. Independence of irrelevant attributes: This assumes that all of the threats the players might make have been accounted for in the disagreement point.

More formally, these are

1. Pareto optimality: the only feasible payoff vector $(v_1, v_2)$ with $v_1 \ge a_1$ and $v_2 \ge a_2$ is $(v_1, v_2) = (a_1, a_2)$.

2. Symmetry: If (x, y) ∈ S =⇒ (y, x) ∈ S and d1 = d2, then a1 = a2.

3. Affine covariance: For any affine transformation $\psi(x_1, x_2) = (\alpha_1 x_1 + \beta_1, \alpha_2 x_2 + \beta_2)$ with $\alpha_1 > 0$ and $\alpha_2 > 0$, for any S, and for any d, $F(\psi(S), \psi(d)) = \psi(F(S, d))$.

4. Independence of irrelevant attributes: For two bargaining problems (R, d) and (S, d), if $R \subseteq S$ and $F(S, d) \in R$, then $F(R, d) = F(S, d)$.

Theorem 18.1. There is a unique function F satisfying Nash's bargaining axioms. It is the function that takes S and d and returns the unique solution to the optimization problem
$$\max_{x_1, x_2} (x_1 - d_1)(x_2 - d_2)$$
subject to the constraints $x_1 \ge d_1$, $x_2 \ge d_2$, and $(x_1, x_2) \in S$.
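Numerically, the Nash solution can be found by maximizing the product over the feasible set. Below is a Python sketch (mine, not from the text); S is taken to be the convex hull of the payoff vectors from Example 17.2, parametrized by hull weights, and the disagreement point d = (3, 3) is a made-up choice for illustration.

import numpy as np
from scipy.optimize import minimize

pts = np.array([[2, 2], [6, 2], [1, 2], [4, 3], [3, 6], [5, 5]], float)
d = np.array([3.0, 3.0])                     # hypothetical disagreement point

def neg_nash_product(w):
    x = w @ pts                              # the point of S with hull weights w
    return -(x[0] - d[0]) * (x[1] - d[1])

cons = ({"type": "eq", "fun": lambda w: w.sum() - 1},)
res = minimize(neg_nash_product, np.ones(6) / 6, bounds=[(0, 1)] * 6,
               constraints=cons)
print(res.x @ pts)                           # the agreement point, ~(5, 5) here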

19 Models for Transferable Utility

19.1 Nash's bargaining theorem and relationship to transferable utility

Last time, we mentioned Nash's bargaining theorem.

Theorem 19.1. There is a unique function F satisfying Nash's bargaining axioms. It is the function that takes S and d and returns the unique solution to the optimization problem
$$\max_{x_1, x_2} (x_1 - d_1)(x_2 - d_2)$$
subject to the constraints $x_1 \ge d_1$, $x_2 \ge d_2$, and $(x_1, x_2) \in S$.

We have been talking about games with nontransferable utility, but this is also related to games with transferable utility.

Example 19.1. Consider a transferable utility game with disagreement point d and cooperative strategy with total payoff σ. Then the convex set S is the convex hull of the lines $\{(a_{i,j} - p, b_{i,j} + p) : p \in \mathbb{R}\}$. To maximize $(x_1 - d_1)(x_2 - d_2)$, we set $x_2 = \sigma - x_1$ and choose $x_1$ to maximize
$$(x_1 - d_1)(\sigma - x_1 - d_2) = -x_1^2 + (\sigma - d_2 + d_1)x_1 - d_1(\sigma - d_2).$$

This gives $x_1 = (\sigma - d_2 + d_1)/2$, matching the solution from the threat-strategy analysis. The Nash solution is unique. See the text for a slick proof. The Nash solution satisfies the bargaining axioms:

1. Pareto optimality: if a feasible point dominated the solution, it would have a larger value of $(x_1 - d_1)(x_2 - d_2)$.

2. Symmetry: You can check that this follows from uniqueness of the solution.

3. Affine covariance: $\alpha_1 x_1 + \beta_1 - (\alpha_1 d_1 + \beta_1) = \alpha_1(x_1 - d_1)$, so ψ rescales the objective by the constant $\alpha_1 \alpha_2 > 0$ without changing the maximizer.

4. Independence of irrelevant attributes: A maximizer in S that belongs to $R \subseteq S$ is still a maximizer in R.

Here is the idea of the proof of the theorem.

Proof. Any bargaining solution that satisfies the axioms is the Nash solution. For S and d, if the Nash solution is a, find the affine function ψ so that $\psi(a) = (1, 1)$ and $\psi(d) = (0, 0)$. If the Nash solution is $a = (1, 1)$ and $d = (0, 0)$, then the convex hull of S and its reflection across the diagonal is contained in $\{x_1 + x_2 \le 2\}$, so any symmetric, Pareto optimal F returns (1, 1) for this convex hull, and hence, by independence of irrelevant attributes, for S.

The affine covariance property is not always intuitively evident. Consider a region S, and a region S′ that is the image of S under an affine transformation. It can seem like Player 2 should have an advantage somehow, but the Nash solution is (1, 1) for the region S′. Is this how players would choose a solution in real life?

19.2 Multiplayer transferable utility games

19.2.1 Allocation functions and Gillies' core

Example 19.2. A customer in a marketplace is willing to buy a pair of gloves for $100. There are three players, one with a right glove and two with only left gloves, and they need to agree on who sells their glove and how to split the $100. This is more complicated than a two-player game: the players can form coalitions. Who holds the power and what's fair depends on how the different subsets of players depend on other players and contribute to the payoff.

Definition 19.1. For each subset S of players, let v(S) be the total value that would be available to be split by that subset of players no matter what the other players do. We call v a characteristic function.

Example 19.3. In our glove example, we have the following characteristic function:

$$v(\{1, 2, 3\}) = v(\{1, 2\}) = v(\{1, 3\}) = 100,$$
$$v(\{1\}) = v(\{2\}) = v(\{3\}) = v(\{2, 3\}) = v(\emptyset) = 0.$$

Definition 19.2. An allocation function is a map from a characteristic function v for n players to a vector $\psi(v) \in \mathbb{R}^n$. This is the payoff that is allocated to the n players.

What properties should an allocation function have?

1. Efficiency: The total payoff gets allocated. That is,
$$\sum_{i=1}^n \psi_i(v) = v(\{1, \ldots, n\}).$$

2. Stability: Each coalition is allocated at least the payoff it can obtain on its own. For each $S \subseteq \{1, \ldots, n\}$,
$$\sum_{i \in S} \psi_i(v) \ge v(S).$$

These conditions are called Gillies' core.7

Example 19.4. Let's go back to the left and right gloves example. The core conditions include
$$\sum_{i=1}^3 \psi_i(v) = v(\{1, 2, 3\}) = 100,$$
$$\psi_1(v) + \psi_2(v) \ge 100, \qquad \psi_1(v) + \psi_3(v) \ge 100.$$
Adding the last two inequalities gives $2\psi_1(v) + \psi_2(v) + \psi_3(v) \ge 200$; since the total is 100 and $\psi_2(v), \psi_3(v) \ge 0$, this forces $\psi_1(v) = 100$. There is exactly one solution: $\psi_1(v) = 100$ and $\psi_2(v) = \psi_3(v) = 0$.

Example 19.5. Consider a game where any pair of gloves sells for $1. The characteristic function is
$$v(\{1, 2\}) = v(\{1, 3\}) = v(\{2, 3\}) = v(\{1, 2, 3\}) = 1,$$

$$v(\{1\}) = v(\{2\}) = v(\{3\}) = v(\emptyset) = 0.$$
Then
$$\sum_{i=1}^3 \psi_i(v) = v(\{1, 2, 3\}) = 1,$$
$$\psi_1(v) + \psi_2(v) \ge 1, \qquad \psi_1(v) + \psi_3(v) \ge 1, \qquad \psi_2(v) + \psi_3(v) \ge 1.$$
Summing the three inequalities gives $2\sum_i \psi_i(v) \ge 3$, contradicting $\sum_i \psi_i(v) = 1$. There are no solutions!

Example 19.6. Consider a game where single gloves sell for $1, pairs sell for $10, and triples sell for $100. The characteristic function is

v({1}) = v({2}) = v({3}) = 1,

v({1, 2}) = v({1, 3}) = v({2, 3}) = 10,

7Donald B. Gillies was a Canadian-born mathematician, game theorist, and computer scientist at the University of Illinois at Urbana-Champaign.

$$v(\{1, 2, 3\}) = 100.$$
Then
$$\sum_{i=1}^3 \psi_i(v) = v(\{1, 2, 3\}) = 100,$$

$$\psi_1(v) \ge 1, \qquad \psi_2(v) \ge 1, \qquad \psi_3(v) \ge 1,$$
$$\psi_1(v) + \psi_2(v) \ge 10, \qquad \psi_1(v) + \psi_3(v) \ge 10, \qquad \psi_2(v) + \psi_3(v) \ge 10,$$
$$\psi_1(v) + \psi_2(v) + \psi_3(v) \ge 100.$$
There are many solutions!

As we can see, Gillies' core, while reasonable, can be empty or contain many allocations, so it does not always pin down a unique prediction.

19.2.2 Shapley's axioms for allocation functions

Here are Shapley's axioms for allocation functions.

1. Efficiency: $\sum_{i=1}^n \psi_i(v) = v(\{1, \ldots, n\})$.

2. Symmetry: If, for all $S \subseteq \{1, \ldots, n\}$ with $i, j \notin S$, $v(S \cup \{i\}) = v(S \cup \{j\})$, then $\psi_i(v) = \psi_j(v)$.

3. No freeloaders: For all i, if for all $S \subseteq \{1, \ldots, n\}$, $v(S \cup \{i\}) = v(S)$, then $\psi_i(v) = 0$.

4. Additivity: $\psi_i(v + u) = \psi_i(v) + \psi_i(u)$.

Theorem 19.2 (Shapley). Shapley's axioms uniquely determine the allocation ψ.

20 Shapley Value

20.1 Shapley's axioms

Here are Shapley's8 axioms for allocation functions.

1. Efficiency: $\sum_{i=1}^n \psi_i(v) = v(\{1, \ldots, n\})$.

2. Symmetry: If, for all $S \subseteq \{1, \ldots, n\}$ with $i, j \notin S$, $v(S \cup \{i\}) = v(S \cup \{j\})$, then $\psi_i(v) = \psi_j(v)$.

3. No freeloaders: For all i, if for all $S \subseteq \{1, \ldots, n\}$, $v(S \cup \{i\}) = v(S)$, then $\psi_i(v) = 0$.

4. Additivity: $\psi_i(v + u) = \psi_i(v) + \psi_i(u)$.

Shapley's theorem says that Shapley's axioms uniquely determine the allocation ψ. We call the unique allocation ψ(v) the Shapley value of the players in the game defined by the characteristic function v.

Theorem 20.1 (Shapley). The following allocation uniquely satisfies Shapley's axioms:
$$\psi_i(v) = E_\pi[\varphi_i(v, \pi)],$$
where the expectation is over uniformly chosen permutations π of $\{1, \ldots, n\}$ and
$$\varphi_i(v, \pi) = v(\pi(\{1, \ldots, k\})) - v(\pi(\{1, \ldots, k - 1\})), \qquad k = \pi^{-1}(i).$$

Example 20.1. For the identity permutation, π(i) = i,
$$\varphi_i(v, \pi) = v(\{1, \ldots, i\}) - v(\{1, \ldots, i - 1\}),$$
which is how much value i adds to $\{1, \ldots, i - 1\}$. And for a random π, $\varphi_i(v, \pi)$ is how much value i adds to the random set $\pi(\{1, \ldots, \pi^{-1}(i) - 1\})$.

20.2 Junta games

Example 20.2. A Junta9 game (J-veto game) is a game where there is a set $J \subseteq \{1, \ldots, n\}$ with all the power:
$$w_J(S) = 1_{(J \subseteq S)} = \begin{cases} 1 & \text{if } J \subseteq S, \\ 0 & \text{otherwise}. \end{cases}$$

8Lloyd Shapley was a professor of mathematics at UCLA. He won the 2012 Nobel Prize in economics.
9In Spanish, a "junta" is a small group with all the power. In Latin America, this has historically occurred many times.

For any permutation π,
$$\varphi_i(w_J, \pi) = w_J(\pi(\{1, \ldots, \pi^{-1}(i)\})) - w_J(\pi(\{1, \ldots, \pi^{-1}(i) - 1\})) = 1_{(i \in J,\; J \subseteq \pi(\{1, \ldots, \pi^{-1}(i)\}))} = 1_{(i \in J,\; \pi^{-1}(J) \subseteq \{1, \ldots, \pi^{-1}(i)\})},$$
so
$$\psi_i(w_J) = E_\pi[\varphi_i(w_J, \pi)] = 1_{(i \in J)}\, P\Big(\pi^{-1}(i) = \max_{j \in J} \pi^{-1}(j)\Big) = 1_{(i \in J)}\frac{1}{|J|}.$$
Check that this agrees with the axioms:

1. Efficiency: $\sum_{i=1}^n \psi_i(w_J) = 1 = w_J(\{1, \ldots, n\})$.

2. Symmetry: If for all $S \subseteq \{1, \ldots, n\}$ not containing i and j, $w_J(S \cup \{i\}) = w_J(S \cup \{j\})$ (and this is true for $i, j \in J$ and for $i, j \notin J$), then $\psi_i(w_J) = \psi_j(w_J)$.

3. Dummy: If for all $S \subseteq \{1, \ldots, n\}$, $w_J(S \cup \{i\}) = w_J(S)$ (and this is true for $i \notin J$), then $\psi_i(w_J) = 0$.

Lemma 20.1 (Characteristic functions as Junta games). We can write any v as a unique linear combination of the $w_J$.

Proof. Write v as a vector, with one coordinate for each subset $S \subseteq \{1, \ldots, n\}$. Write a matrix W, with rows indexed by $S \subseteq \{1, \ldots, n\}$, columns indexed by $J \subseteq \{1, \ldots, n\}$, and entries $w_J(S)$. If we order these subsets by cardinality, then this matrix is lower triangular, with 1s on its diagonal. Since W is invertible, we can solve the equation v = Wc to obtain a unique c, with one entry $c_J$ for each $J \subseteq \{1, \ldots, n\}$, and then we have
$$v(S) = \sum_J w_J(S)\, c_J.$$

20.3 Shapley's theorem

Let's prove Shapley's theorem.

Proof. First, we want to show that the allocation $\psi_i(v) = E_\pi[\varphi_i(v, \pi)]$ satisfies Shapley's axioms. For any fixed π, $\varphi_i(v, \pi)$ satisfies the efficiency, dummy, and additivity axioms. These axioms all involve linear constraints, so they are preserved under expectation. Symmetry follows from the randomization.

1. Efficiency: the sum telescopes:
$$\sum_{i=1}^n \varphi_i(v, \pi) = \sum_{k=1}^n \big[v(\pi(\{1, \ldots, k\})) - v(\pi(\{1, \ldots, k - 1\}))\big] = v(\{1, \ldots, n\}).$$

2. Dummy: $\pi(\{1, \ldots, \pi^{-1}(i)\}) = \pi(\{1, \ldots, \pi^{-1}(i) - 1\}) \cup \{i\}$, so if i is a dummy player,
$$\varphi_i(v, \pi) = v(\pi(\{1, \ldots, \pi^{-1}(i)\})) - v(\pi(\{1, \ldots, \pi^{-1}(i) - 1\})) = 0.$$

To prove uniqueness, represent v as a unique linear combination of Junta game characteristic functions $w_J(S) = 1_{(J \subseteq S)}$. Then $\psi_i(w_J) = 1_{(i \in J)}/|J|$ is the unique allocation satisfying the Shapley axioms for the Junta games. Additivity then implies that $\psi_i(v)$ is unique.

This proof actually gives us a nice way to compute the Shapley value: solve for the coefficients $c_J$ in $v(S) = \sum_J c_J w_J(S)$ by solving the linear system mentioned in the Junta game lemma.

Example 20.3. Consider a glove game like before, with characteristic function
$$v(\{1, 2, 3\}) = v(\{1, 2\}) = v(\{1, 3\}) = 100,$$
$$v(\{1\}) = v(\{2\}) = v(\{3\}) = v(\{2, 3\}) = v(\emptyset) = 0.$$
Solving the linear system, we get
$$v(S) = 100\, w_{\{1,2\}}(S) + 100\, w_{\{1,3\}}(S) - 100\, w_{\{1,2,3\}}(S),$$
and hence
$$\psi_1(v) = 100\Big(\frac{1}{2} + \frac{1}{2} - \frac{1}{3}\Big) = 100 \cdot \frac{2}{3}, \qquad \psi_2(v) = \psi_3(v) = 100\Big(\frac{1}{2} - \frac{1}{3}\Big) = 100 \cdot \frac{1}{6}.$$
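The permutation formula of Theorem 20.1 translates directly into code. Here is a Python sketch (mine, not from the notes) that recomputes the glove-game Shapley value by enumerating all permutations.

from itertools import permutations
from math import factorial

players = (1, 2, 3)

def v(S):
    """Glove game of Example 20.3: value 100 iff player 1 has a partner."""
    S = frozenset(S)
    return 100 if 1 in S and (2 in S or 3 in S) else 0

shapley = {i: 0.0 for i in players}
for pi in permutations(players):
    for k, i in enumerate(pi):
        # marginal contribution of i to the players before it in pi
        shapley[i] += v(pi[:k + 1]) - v(pi[:k])
for i in players:
    shapley[i] /= factorial(len(players))
print(shapley)   # {1: 66.67, 2: 16.67, 3: 16.67}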

21 Examples of Shapley Value and Mechanism Design

21.1 Examples of Shapley value

Example 21.1. Consider a situation where shareholder i holds i shares for i = 1, . . . , 4. A decision needs the support of shareholders with a total of six shares:
$$v(\{1,2,3,4\}) = v(\{1,2,3\}) = v(\{1,2,4\}) = v(\{1,3,4\}) = v(\{2,3,4\}) = v(\{2,4\}) = v(\{3,4\}) = 1,$$
and v(S) = 0 otherwise. Ordering the winning coalitions as {3,4}, {2,4}, {2,3,4}, {1,3,4}, {1,2,4}, {1,2,3}, {1,2,3,4}, we have the matrix
$$W = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{pmatrix},$$
with rows indexed by S, columns indexed by J, and entries $W_{S,J} = w_J(S)$. We can solve v = Wc for the vector c to get
$$v(S) = w_{\{2,4\}}(S) + w_{\{3,4\}}(S) + w_{\{1,2,3\}}(S) - w_{\{2,3,4\}}(S) - w_{\{1,2,3,4\}}(S).$$
Then we get the allocation
$$\psi_1(v) = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}, \qquad \psi_2(v) = \frac{1}{2} - \frac{1}{3} + \frac{1}{3} - \frac{1}{4} = \frac{1}{4},$$
$$\psi_3(v) = \frac{1}{2} + \frac{1}{3} - \frac{1}{3} - \frac{1}{4} = \frac{1}{4}, \qquad \psi_4(v) = \frac{1}{2} + \frac{1}{2} - \frac{1}{3} - \frac{1}{4} = \frac{5}{12}.$$
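The same numbers drop out of a short linear solve. The following Python sketch (not from the notes) builds the matrix W above, solves v = Wc, and assembles the Shapley value from $\psi_i(w_J) = 1_{(i \in J)}/|J|$.

import numpy as np

Js = [(3, 4), (2, 4), (2, 3, 4), (1, 3, 4), (1, 2, 4), (1, 2, 3), (1, 2, 3, 4)]
W = np.array([[1.0 if set(J) <= set(S) else 0.0 for J in Js] for S in Js])
v = np.ones(7)                       # v(S) = 1 for every winning coalition
c = np.linalg.solve(W, v)
print(dict(zip(Js, c)))              # coefficients +1, +1, -1, 0, 0, +1, -1

# Shapley value: psi_i is the sum over J containing i of c_J / |J|
psi = {i: sum(cJ / len(J) for J, cJ in zip(Js, c) if i in J)
       for i in (1, 2, 3, 4)}
print(psi)                           # {1: 1/12, 2: 1/4, 3: 1/4, 4: 5/12}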

Example 21.2. Players 1, 2, and 3 value a painting at $a_1$, $a_2$, and $a_3$ with $0 < a_1 < a_2 < a_3$. But Player 1 owns the painting, so the characteristic function is given by
$$v(\{1\}) = a_1, \qquad v(\{2\}) = v(\{3\}) = v(\{2, 3\}) = 0,$$
$$v(\{1, 2\}) = a_2, \qquad v(\{1, 3\}) = v(\{1, 2, 3\}) = a_3.$$
The rational outcome, which achieves the maximal value, is for Player 3 to own the painting. What payments should occur? We can compute that
$$v(S) = a_1 w_{\{1\}}(S) + (a_2 - a_1) w_{\{1,2\}}(S) + (a_3 - a_1) w_{\{1,3\}}(S) - (a_2 - a_1) w_{\{1,2,3\}}(S),$$
so we get
$$\psi_1(v) = a_1 + \frac{1}{2}(a_2 - a_1) + \frac{1}{2}(a_3 - a_1) - \frac{1}{3}(a_2 - a_1) = \frac{2a_1 + a_2 + 3a_3}{6},$$
$$\psi_2(v) = \frac{1}{2}(a_2 - a_1) - \frac{1}{3}(a_2 - a_1) = \frac{a_2 - a_1}{6},$$
$$\psi_3(v) = \frac{1}{2}(a_3 - a_1) - \frac{1}{3}(a_2 - a_1) = a_3 - \frac{a_2 - a_1}{6} - \frac{2a_1 + a_2 + 3a_3}{6}.$$

21.2 Examples of mechanism design

Example 21.3. In the women's badminton tournament in the 2012 London Olympics, there were sixteen teams, split into four groups (A, B, C, and D) of four teams each. Within a group, all pairs played a match. The top two per group advanced to a knockout tournament. In the knockout tournament, there were

1. Four quarterfinals: (i) A1 vs C2, (ii) A2 vs C1, (iii) B1 vs D2, (iv) B2 vs D1,

2. Two semifinals: Winners of (i) and (iii), and the winners of (ii) and (iv),

3. A bronze medal match between the semifinal losers,

4. A gold medal match between the semifinal winners.

However, there was an issue. There was an upset in Group D: Denmark (Pedersen/Juhl) beat the top-ranked Chinese team (Tian/Zhao), so these teams were D1 and D2, respec- tively. Thus, A1 would play C2, and if they won, would play Tian/Zhao (the top-ranked team) in a semifinal. But A2 would play C1 and, if they won, would play Pedersen/Juhl in a semifinal and would not play Tian/Zhao until the gold medal match. Since there was an upset, the rank 2 and 3 teams were playing each other in a match where the winner would play the highest ranked team in the semifinals and the loser would play a lower ranked team in the semifinals. So winning the last Group A match would likely lead to a bronze medal, whereas losing it would likely lead to a silver medal. Both teams tried to lose the match, and they were both disqualified. This was a failure of tournament design, probably in the way that the rank 2 and 3 teams were both in Group A.

When we design games and mechanisms, we aim to design the rules of a game so that the outcomes have certain desired properties.

• Elections
  – Consistent with voters' rankings,
  – Fair (symmetric).

• Auctions
  – Maximize revenue for the seller,
  – Pareto efficiency,
  – Calibrated (revealing bidders' values).

• Tournaments
  – The best team is most likely to win,
  – Players have an incentive to compete.

Example 21.4. Suppose there are two candidates for president, and all voters have a preference. How do we design an election to decide between the two candidates? Voters vote; the candidate with the most votes wins. The candidate that wins is the choice of at least half of the voters. Voters never have an incentive to vote against their preferences. Things aren't as simple with three candidates.

Example 21.5 (Condorcet's paradox10). Suppose 3 voters have the following preferences:

              1st   2nd   3rd
Voter 1     A      B      C
Voter 2     B      C      A
Voter 3     C      A      B

For every candidate, there is another candidate who is preferred by the majority. Suppose we choose candidates by a two-stage vote:

1. A vs B, then the winner vs C,

2. A vs C, then the winner vs B, or

3. B vs C, then the winner vs A.

There is no fair (symmetric) voting process that can assign a winner in these cases.

Example 21.6. Here is a voting system called plurality voting. Voters vote for one candidate; the candidate with the most votes wins. What are the disadvantages? This system encourages strategic voting. A vote for candidates ranked third or worse is wasted. This can lead to a winner who is lowest ranked by a majority.

Example 21.7. Here is a voting system called two-round voting (used in France). Voters vote for one candidate. If there is not sufficient support for one candidate, a second vote is held to decide between the top-ranked two candidates.

10Marquis de Condorcet lived in the 18th century.

Example 21.8. Here is a system called contingent voting. Voters rank the candidates. The first choices are counted. If there is not sufficient support for a single candidate, there is a second count to decide between the top-ranked two candidates. Votes supporting other candidates are distributed among the two remaining candidates according to the voters' preferences.

Example 21.9. Here is instant-runoff voting (used in Australia11). Voters rank all candidates. The number of top choices is counted. If no candidate has a majority of the top choices, the candidate with the fewest top choices is eliminated and that candidate's votes are allocated to the voters' next-ranked choices.

All of these voting systems are vulnerable to strategic voting.

Example 21.10. In a contingent voting system, say the voters have the following distribution of preferences:

            1st   2nd   3rd
30%      A      B      C
45%      B      C      A
25%      C      A      B

In this situation C is eliminated in round 1, and then A wins. What if 10% of the people in the second group lie about their preferences?

            1st   2nd   3rd
30%      A      B      C
35%      B      C      A
10%      C      B      A
25%      C      A      B

Then A is eliminated in round 1, and B wins.
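This flip is easy to reproduce in code. Here is a Python sketch of contingent voting (my own implementation of the rule described in Example 21.8, taking a strict majority of top choices as "sufficient support").

from collections import Counter

def contingent_winner(ballots):
    """ballots: list of (weight, ranking), with rankings like 'ABC'."""
    tops = Counter()
    for w, r in ballots:
        tops[r[0]] += w
    if max(tops.values()) > 50:                      # outright majority
        return tops.most_common(1)[0][0]
    finalists = {c for c, _ in tops.most_common(2)}  # top-two runoff
    runoff = Counter()
    for w, r in ballots:
        runoff[next(c for c in r if c in finalists)] += w
    return runoff.most_common(1)[0][0]

honest = [(30, "ABC"), (45, "BCA"), (25, "CAB")]
misreport = [(30, "ABC"), (35, "BCA"), (10, "CBA"), (25, "CAB")]
print(contingent_winner(honest), contingent_winner(misreport))   # A B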

What properties would we like voting methods to have? What methods possess these properties? We will discuss this next lecture.

11This was first used in Queensland in 1893.

22 Voting

22.1 Voting preferences, preference communication, and Borda count

What is a voting mechanism? What properties would we like voting mechanisms to have? What mechanisms possess these properties? How do we model voters' preferences? How do voters express their preferences? How do we combine that information? We will distinguish two outcomes:

1. A single winner (“voting rule”)

2. A ranking of all candidates (“ranking rule”)

Here are some models for voters’ preferences.

Example 22.1. Each voter has a ranking of the set of candidates: Voter i has a permutation π on Γ. However, this is not perfect. Sometimes an individual does not have a total order on the candidates. And sometimes an individual's preferences are not transitive (they might prefer A over B over C over A).

Example 22.2. Each voter has a utility associated with each candidate: Voter i has a utility function $u_i : \Gamma \to \mathbb{R}$. This is a more fine-grained model: it allows us to compare the total utility of different outcomes. However, it is more difficult to assign scores than to compare, and scores are typically incomparable between individuals.

Voters can communicate their preferences in many ways.

Example 22.3. Here are some ways voters can communicate their preferences.

1. Each voter assigns a score to each candidate.

2. Each voter assigns a ranking of the set of candidates.

3. Each voter approves of a subset of the set of candidates.

4. Each voter approves of a single candidate.

Last lecture, we discussed some examples of voting systems. Here is another.

Example 22.4. Here is a voting system called the Borda count. Voters provide a ranking of the candidates, from 1 to |Γ|. A candidate that is ranked in the i-th position is assigned |Γ| − i + 1 points. Candidates are ranked by the total number of points assigned.

22.2 Properties of voting systems

What formal assumptions do we make when modeling voting? There is a set Γ of candidates. Voter i has a preference relation $\succ_i$ defined on candidates that is:

1. Complete: for every A ≠ B, $A \succ_i B$ or $B \succ_i A$.

2. Transitive: for every A, B, C, if $A \succ_i B$ and $B \succ_i C$, then $A \succ_i C$.

Definition 22.1. A voting rule is a function f that maps a preference profile $\pi = (\succ_1, \ldots, \succ_n)$ to a winner from Γ.

Definition 22.2. A ranking rule is a function R that maps a preference profile $\pi = (\succ_1, \ldots, \succ_n)$ to a social ranking $\rhd$ on Γ, which is another complete, transitive preference relation.

A family of voting rules can be built by repeatedly applying a ranking rule:

While more than one candidate remains:
  eliminate the bottom-ranked k candidates,
  apply the ranking rule to the voters' preferences over the remaining candidates.

Example 22.5. If k = |Γ|−1, take the top-ranked candidate as the winner. If the ranking is based on voters’ top choices this is plurality voting.

Example 22.6. If k = |Γ| − 2 and the ranking is based on the voters’ top choices, this is contingent voting.

Example 22.7. If k = 1 and the ranking is based on voters’ top choices, this is instant- runoff voting.

Definition 22.3. A ranking rule R has the unanimity property if, whenever $A \succ_i B$ for all i, the ranking $\rhd = R(\succ_1, \ldots, \succ_n)$ satisfies $A \rhd B$; i.e., if all voters prefer candidate A over B, then candidate A should be ranked above B.

It is hard to imagine a “fair” voting rule that violates unanimity.

Definition 22.4. A ranking rule R is strategically vulnerable if there are a preference profile $(\succ_1, \ldots, \succ_n)$, a voter i, candidates A and B, and an alternative preference relation $\succ_i'$ such that, with
$$\rhd = R(\succ_1, \ldots, \succ_i, \ldots, \succ_n), \qquad \rhd' = R(\succ_1, \ldots, \succ_i', \ldots, \succ_n),$$
we have $A \succ_i B$ and $B \rhd A$, but $A \rhd' B$.

This means that Voter i has a preference relation $\succ_i$, but by stating an alternative preference relation $\succ_i'$, they can swap the ranking rule's preference between A and B to make it consistent with $\succ_i$.

Definition 22.5. Independence of irrelevant alternatives (IIA) is the following property of a ranking rule R. Given two voter preference profiles $(\succ_1, \ldots, \succ_n)$ and $(\succ_1', \ldots, \succ_n')$, define $\rhd := R(\succ_1, \ldots, \succ_n)$ and $\rhd' := R(\succ_1', \ldots, \succ_n')$. If, for all i, $A \succ_i B \iff A \succ_i' B$, then $A \rhd B \iff A \rhd' B$.

This says that the ranking rule's relative ranking of candidates A and B should depend only on the voters' relative rankings of these two candidates.

Example 22.8. Ranking based on runoff voting violates IIA. Consider the following example of strategic voting.

            1st   2nd   3rd
30%      A      B      C
45%      B      C      A
25%      C      A      B

Instant runoff gives us the ranking $A \rhd B \rhd C$. But if 10% of the people in the second group lie about their preferences, we get a different result.

            1st   2nd   3rd
30%      A      B      C
35%      B      C      A
10%      C      B      A
25%      C      A      B

Here, instant runoff gives us the ranking $B \rhd C \rhd A$. But when the 10% changed their preferences, they did not change their relative preferences between B and A. So IIA is violated.

22.3 Violating IIA and Arrow's Impossibility theorem

Theorem 22.1. Any ranking rule R that violates IIA is strategically vulnerable.

Proof. Suppose R violates IIA. Let $\pi = (\succ_1, \ldots, \succ_n)$, $\pi' = (\succ_1', \ldots, \succ_n')$, $\rhd = R(\pi)$, and $\rhd' = R(\pi')$ be such that for all i, $A \succ_i B \iff A \succ_i' B$, but $A \rhd B$ and $B \rhd' A$. Change the voters' rankings one by one to change π into π′:

$(\succ_1, \succ_2, \ldots, \succ_n)$:   $A \rhd B$
$(\succ_1', \succ_2, \ldots, \succ_n)$:   $\rhd^{(1)}$
$(\succ_1', \succ_2', \ldots, \succ_n)$:   $\rhd^{(2)}$
. . .
$(\succ_1', \succ_2', \ldots, \succ_n')$:   $B \rhd' A$

Then some voter on the path from π to π′ changes the social order of A and B, without changing their own relative ranking of A and B. So R is strategically vulnerable.

Definition 22.6. A ranking rule R is a dictatorship if there is a voter $i^*$ such that, for any preference profile $(\succ_1, \ldots, \succ_n)$ and $\rhd = R(\succ_1, \ldots, \succ_n)$, $A \rhd B \iff A \succ_{i^*} B$.

Theorem 22.2 (Arrow's Impossibility theorem). For |Γ| ≥ 3, any ranking rule R that satisfies both IIA and unanimity is a dictatorship.

Corollary 22.1. Any ranking rule R that satisfies unanimity and is not strategically vulnerable is a dictatorship.

Violating unanimity does not make sense, but a dictatorship is undesirable.12 Hence, strategic vulnerability is inevitable.

12You may disagree.

23 Impossibility Theorems and Properties of Voting Systems

23.1 The Gibbard-Satterthwaite theorem

Last time we introduced Arrow's13 Impossibility theorem.

Theorem 23.1 (Arrow’s Impossibility theorem). For |Γ| ≥ 3, any ranking rule R that satisfies both IIA and unanimity is a dictatorship.

Here is another impossibility theorem.

Definition 23.1. A voting rule f is a function that takes the voters’ preference profile π to the winner in Γ.

Definition 23.2. A voting rule f is onto the set Γ of candidates if, for all candidates A ∈ Γ, there is a preference profile π such that f(π) = A.

If f is not onto Γ, some candidate is excluded from winning.

Theorem 23.2 (Gibbard-Satterthwaite). For |Γ| ≥ 3, any voting rule f that is onto Γ and is not strategically vulnerable is a dictatorship.

Proof. The proof is by contradiction; we use f to construct a ranking rule that violates Arrow's theorem. Suppose f is onto Γ, not strategically vulnerable, and not a dictatorship. Define $\rhd = R(\pi)$ via
$$A \rhd B \iff f(\pi^{\{A,B\}}) = A, \qquad B \rhd A \iff f(\pi^{\{A,B\}}) = B,$$
where $\pi^S$ maintains the order of the candidates in S but moves them above all other candidates in all voters' preferences. If f is onto Γ and not strategically vulnerable, then for all $S \subseteq \Gamma$, $f(\pi^S) \in S$, so $\rhd$ is complete; otherwise, in the path from a $\pi_0 \in f^{-1}(S)$ to $\pi^S$, some voter switch would demonstrate a strategic vulnerability. Also, $\rhd$ is transitive; the same argument shows that $f(\pi^{\{A,B,C\}}) = A$ implies $A \rhd B$ and $A \rhd C$, so cycles are impossible. R satisfies unanimity because if $A \succ_i B$ for all i, then $\pi^{\{A,B\}} = (\pi^{\{A,B\}})^{\{A\}}$, so $A \rhd B$. By a similar argument, R satisfies IIA. So by Arrow's impossibility theorem, R is a dictatorship. But because f is not a dictatorship, neither is R. So we have a contradiction.

13Kenneth Arrow was a professor of Operations Research and Economics at Stanford. He won the Nobel Prize in Economics in 1972 and is considered the founder of modern social choice theory.

23.2 Properties of voting systems

Here are some more properties of voting systems. Are these desirable? Are they realistic?

Definition 23.3. A voting system is symmetric if permuting voters does not affect the outcome.

Definition 23.4. A voting system is monotonic if changing one voter’s preferences by promoting candidate A without changing any other preferences should not change the outcome from A winning to A not winning.

Definition 23.5. The Condorcet winner criterion is that if a candidate is majority- preferred in pairwise comparisons with any other candidate, then that candidate wins.

Definition 23.6. The Condorcet loser criterion is that if a candidate is preferred by a minority of voters in pairwise comparisons with all other candidates, then that candidate should not win.

Definition 23.7. The is that the winner always comes from the , the smallest nonempty set of candidates that are majority-preferred in pairwise comparisons with any candidate outside the set.

Definition 23.8. A voting system is reversal symmetric if when candidate A wins for some voter preference profile, candidate A does not win when the preferences of all voters are reversed.

Definition 23.9. Cancellation of ranking cycles holds if, when a set of |Γ| voters have preferences that are cyclic shifts of each other (e.g. $A \succ_1 B \succ_1 C$, $B \succ_2 C \succ_2 A$, and $C \succ_3 A \succ_3 B$), removing these voters does not affect the outcome.

Definition 23.10. Cancellation of opposing rankings holds if, when two voters have reversed preferences, removing these voters does not affect the outcome.

Definition 23.11. Participation holds if, when candidate A wins for some voter preference profile, adding a voter with $A \succ B$ does not change the winner from A to B.

Example 23.1. Which of these properties does instant runoff voting have? Recall that in instant runoff voting, we eliminate the candidate that is top-ranked by the fewest voters, remove that candidate from everyone’s rankings and repeat.

• Instant runoff voting satisfies symmetry because permuting the voters does not affect the outcome.

• Instant runoff voting does not satisfy monotonicity, however; our example from the last two lectures of strategic voting is a counterexample to monotonicity.

• Instant runoff voting does not satisfy the Condorcet winner criterion. Here is an example where B is preferred over any other candidate in pairwise comparisons, but A wins.

            1st   2nd   3rd
30%      A      B      C
45%      C      B      A
25%      B      A      C

• Instant runoff voting satisfies the Condorcet loser criterion. If the Condorcet loser makes it to the last round, they will lose the pairwise vote in that round; so they cannot win.

• Instant runoff voting does not satisfy the Smith criterion. In the above example, the Smith set is {B}, but A wins instead of B.

• Instant runoff voting is not reversal symmetric. In the following example, candidate A wins, and A still wins when every voter's preferences are reversed.

            1st   2nd   3rd                     1st   2nd   3rd
45%      A      B      C           45%      C      B      A
33%      B      C      A           33%      A      C      B
22%      C      A      B           22%      B      A      C

23.3 Positional voting rules

Definition 23.12. A positional voting rule is defined as follows. Let $a_1 \ge a_2 \ge \cdots \ge a_N$, where N = |Γ|. For each candidate, assign $a_i$ points for each voter that assigns that candidate rank i. The candidate with the largest total wins.

Example 23.2. Borda count14 is the positional voting rule with $a_i$ given by $N, N - 1, \ldots, 1$.

Example 23.3. Plurality is the positional voting rule with ai given by 1, 0,..., 0.

Example 23.4. k-Approval voting is the positional voting rule with $a_i$ given by 1, 1, . . . , 1, 0, . . . , 0 (k ones).

Positional voting rules satisfy symmetry, monotonicity, and cancellation of ranking cycles. However, they do not necessarily satisfy the Condorcet winner criterion.

14Jean-Charles de Borda was an 18th century French naval commander, scientist, and inventor. He created ballistics, mapping and surveying instruments, pumps, and metric trigonometric tables.
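All of these examples are instances of one scoring loop. Here is a Python sketch (mine, not from the notes) of a generic positional rule; the ballot data is made up for illustration.

from collections import defaultdict

def positional_winner(ballots, scores):
    """ballots: rankings like ('A', 'B', 'C'); scores: a_1 >= ... >= a_N."""
    total = defaultdict(float)
    for ranking in ballots:
        for position, candidate in enumerate(ranking):
            total[candidate] += scores[position]
    return max(total, key=total.get)

ballots = [("A", "B", "C")] * 3 + [("B", "C", "A")] * 2 + [("C", "A", "B")] * 2
print(positional_winner(ballots, (3, 2, 1)))   # Borda count
print(positional_winner(ballots, (1, 0, 0)))   # plurality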
