Repeated Games EC202 Lectures IX & X
Total Page:16
File Type:pdf, Size:1020Kb
Repeated Games EC202 Lectures IX & X Francesco Nava London School of Economics January 2011 Nava (LSE) EC202 —Lectures IX & X Jan2011 1/16 Summary Repeated Games: Definitions: Feasible Payoffs Minmax Repeated Game Stage Game Trigger Strategy Main Result: Folk Theorem Examples: Prisoner’sDilemma Nava (LSE) EC202 —Lectures IX & X Jan2011 2/16 Feasible Payoffs Q: What payoffs are feasible in a strategic form game? A: A profile of payoffs is feasible in a strategic form game if can be expressed as a weighed average of payoffs in the game. Definition (Feasible Payoffs) A profile of payoffs wi i N is feasible in a strategic form game f g 2 N, Ai , ui i N if there exists a distribution over profiles of actions p suchf that: g 2 wi = ∑a A p(a)ui (a) for any i N 2 2 Unfeasible payoffs cannot be outcomes of the game Points on the north-east boundary of the feasible set are Pareto effi cient Nava (LSE) EC202 —Lectures IX & X Jan2011 3/16 Minmax Q: What’sthe worst possible payoff that a player can achieve if he chooses according to his best response function? A: The minmax payoff. Definition (Minmax) The (pure strategy) minmax payoff of player i N in a strategic form 2 game N, Ai , ui i N is: f g 2 ui = min max ui (ai , a i ) a i A i ai Ai 2 2 Mixed strategy minmax payoffs satisfy: v i = min max ui (si , s i ) s i si The mixed strategy minmax is not higher than the pure strategy minmax. Nava (LSE) EC202 —Lectures IX & X Jan2011 4/16 Example: Prisoner’sDilemma Minmax Payoff: (1, 1) Feasible Payoff: contained in red boundaries Pareto Effi cient Payoffs: (2, 2; 3, 0) and (2, 2; 0, 3) Stage Game Payoffs u2 3 1 2 w s 2 wn 2,2 0,3 s 3,0 1,1 1 0 0 1 2 3 u1 Nava (LSE) EC202 —Lectures IX & X Jan2011 5/16 Example: Battle of the Sexes Minmax Payoff: (2, 2) Feasible Payoff: contained in red boundaries Pareto Effi cient Payoffs: (3, 3) Stage Game Payoffs u2 3 1 2 w s 2 wn 3,3 1,0 s 0,1 2,2 1 0 0 1 2 3 u1 Nava (LSE) EC202 —Lectures IX & X Jan2011 6/16 Repeated Game: Timing Consider any strategic form game G = N, Ai , ui i N f g 2 Call G the stage game An infinitely repeated game describes a strategic environment in which the stage game is played repeatedly by the same players infinitely many times Round 1 ... Round t ... ... ... Nava (LSE) EC202 —Lectures IX & X Jan2011 7/16 Repeated Games: Payoffs and Discounting The value to player i N of a payoff stream ui (1), ui (2), ..., ui (t), ... is: 2 f g ¥ t 1 (1 d) ∑ d ui (t) t=1 The term (1 d) amounts to a simple normalization ... and guarantees that a constant stream v, v, ... has value v f g Future payoffs are discounted at rate d An infinitely repeated game can be used to describe strategic environments in which there is no certainty of a final stage In such view d describes the probability that the game does not end at the next round which would result in a payoff of 0 thereafter Nava (LSE) EC202 —Lectures IX & X Jan2011 8/16 Repeated Games: Perfect Information and Strategies Today we restrict attention to perfect information repeated games In such games all players prior to each round observe the actions chosen by all other players at previous rounds Let a(s) = a1(s), ..., an(s) denote the action profile played at round s f g A history of play up to stage t thus consists of: h(t) = a(1), a(2), ..., a(t 1) f g In this context strategies map histories (ie information) to actions: ai (h(t)) Ai 2 Nava (LSE) EC202 —Lectures IX & X Jan2011 9/16 Dilemma Folk Theorem Prisoner’s Consider the prisoner’sdilemma discussed earlier: 1 2 w s wn 2,2 0,3 s 3,0 1,1 To understand how equilibrium behavior is affected by repetition, let’s show why (2, 2) is SPE of the infinitely repeated prisoner’sdilemma Folk theorem shows that any feasible payoff that yields to both players at least their minmax value is a SPE of the infinitely repeated game if the discount factor is suffi ciently high Nava (LSE) EC202 —Lectures IX & X Jan2011 10/16 Grim Trigger Strategies [Draw Paths...] Consider the following strategy (known as grim trigger strategy): w if either a(t 1) = (w, w) or t = 0 a (t) = i s otherwise If all players follow such strategy, no player can deviate and benefit at any given round provided that d 1/2 ≥ In subgames following a(t 1) = (w, w) no player benefits from a deviation if: (1 d) 3 + d + d2 + d3 + ... = 3 2d 2 d 1/2 ≤ , ≥ In subgames following a(t 1) = (w, w) no player benefits from a deviation since: 6 (1 d) 0 + d + d2 + d3 + ... = d 1 d 1 ≤ , ≤ Nava (LSE) EC202 —Lectures IX & X Jan2011 11/16 Folk Theorem Theorem (SPE Folk Theorem) In any two-person infinitely repeated game: 1 For any discount factor d, the discounted average payoff of each player in any SPE is at least his minmax value in the stage game 2 Any feasible payoff profile that yields to all players at least their minmax value is the discounted average payoff of a SPE if the discount factor d is suffi ciently close to 1 3 If the stage game has a NE in which each players’payoff is his minmax value, then the infinitely repeated game has a SPE in which every players’discounted average payoff is his minmax value Nava (LSE) EC202 —Lectures IX & X Jan2011 12/16 Testing SPE in Repeated Games Definition (One-Deviation Property) A strategy satisfies the one-deviation property if no player can increase his payoff by changing his action at the start of any subgame in which he is the first-mover given other players’strategies and the rest of his own strategy Fact A strategy profile in an extensive game with perfect information and infinite horizon is a SPE if and only if it satisfies the one-deviation property This observation can be used to test whether a strategy profile is a SPE of an infinitely repeated game as we did in the Prisoner’sdilemma example Nava (LSE) EC202 —Lectures IX & X Jan2011 13/16 Example To practice let’sshow why (1.5, 1.5) is also and SPE of the repeated PD: if [ai (t) = s and aj (t) = w and w a(k) / (s, s), (w, w) for k < t] ai (t + 1) = 8 or2 fif [t = 0 andg i = 1] > < s otherwise > If both follow such strategy,:> no player can deviate and benefit at any given round if d 1/2 ≥ After a(t 1) = (s, w), (w, s), no player benefits from a deviation if d 1/2: ≥ 1 (1 d) 3d + 3d3 + 3d5 + ... = 3d/(1 + d) ≤ (1 d) 2 + d + d2... = 2 d (1 d) 3 + 3d2 + 3d4... = 3/(1 + d) ≤ After a(t 1) = (w, w ), (s, s), no player benefits from a deviation since: (1 d) 0 + d + d2 + d3 + ... = d 1 d 1 ≤ , ≤ Nava (LSE) EC202 —Lectures IX & X Jan2011 14/16 Last Example Consider the following game — with minmax payoffs of (1, 1): 1 2 AB An 0,0 4,1 B 1,4 3,3 Two PNE: (A, B) and (B, A) with payoffs (1, 4) and (4, 1) Always playing (B, B) is SPE of the repeated game for d high enough Consider the following grim trigger strategy: (B, B) if a(s) = (B, B) for any s < t or t = 0 (B, B) for s < z 8 (B, A) if a(s) = for z 0, .., t 1 a(t) = (A, B) for s = z 2 f g > > (B, B) for s < z < (A, B) if a(s) = for z 0, .., t 1 (B, A) for s = z 2 f g > > Nava: (LSE) EC202 —Lectures IX & X Jan2011 15/16 No Profitable Deviation 1 2 AB An 0,0 4,1 B 1,4 3,3 If all follow such strategy, no player can deviate and benefit if d 1/3 ≥ Following a(s) = (B, B) for s < t no player deviates since: 8 (1 d) 4 + d + d2 + d3 + ... = 4 3d 3 d 1/3 ≤ , ≥ Following a(s) = (B, B) for s < z and a(z) = (A, B) no one deviates as: 8 (1 d) 0 + d + d2 + d3 + ... = d 1 d 1 ≤ , ≤ (1 d) 3 + 4d + 4d2 + 4d3 + ... = 3 + d 4 d 1 ≤ , ≤ Following a(s) = (B, B) for s < z and a(z) = (B, A) no one deviates for symmetric reasons. 8 Nava (LSE) EC202 —Lectures IX & X Jan2011 16/16.