Repeated Games
Prerequisites
Almost essential: Game Theory: Dynamic

REPEATED GAMES
MICROECONOMICS: Principles and Analysis
Frank Cowell
April 2018 Frank Cowell: Repeated Games 1

Overview
Repeated Games: Embedding the game in context
• Basic structure
• Equilibrium issues
• Applications

Introduction
• Another examination of the role of time
• Dynamic analysis can be difficult
  • more than a few stages
  • can lead to complicated analysis of equilibrium
• We need an alternative approach
  • one that preserves basic insights of dynamic games
  • for example, subgame-perfect equilibrium
• Build on the idea of dynamic games
  • introduce a jump
  • move from the case of comparatively few stages
  • to the case of arbitrarily many

Repeated games
• The alternative approach
  • take a series of the same game
  • embed it within a time-line structure
• Basic idea is simple
  • connect multiple instances of an atemporal game
  • model a repeated encounter between the players in the same situation of economic conflict
• Raises important questions
  • how does this structure differ from an atemporal model?
  • how does the repetition of a game differ from a single play?
  • how does it differ from a collection of unrelated games of identical structure with identical players?

History
• Why is the time-line different from a collection of unrelated games?
• The key is history
  • consider history at any point on the timeline
  • contains information about actual play
  • information accumulated up to that point
• History can affect the nature of the game
  • at any stage all players can know all the accumulated information
  • strategies can be conditioned on this information
• History can play a role in the equilibrium
  • some interesting outcomes aren’t equilibria in a single encounter
  • these may be equilibrium outcomes in the repeated game
  • the game’s history is used to support such outcomes

Repeated games: Structure
• The stage game
  • take an instant in time
  • specify a simultaneous-move game
  • payoffs completely specified by actions within the game
• Repeat the stage game indefinitely
  • there’s an instance of the stage game at time 0,1,2,…,t,…
  • the possible payoffs are also repeated for each t
  • payoffs at t depend on actions in stage game at t
• A modified strategic environment
  • all previous actions assumed as common knowledge
  • so agents’ strategies can be conditioned on this information
• Modifies equilibrium behaviour and outcome?

Equilibrium
• Simplified structure has potential advantages
  • whether significant depends on nature of stage game
  • concern nature of equilibrium
• Possibilities for equilibrium
  • new strategy combinations supportable as equilibria?
  • long-term cooperative outcomes
  • absent from a myopic analysis of a simple game
• Refinements of subgame perfection simplify the analysis:
  • can rule out empty threats
  • and incredible promises
  • disregard irrelevant “might-have-beens”

Overview
Repeated Games: Developing the basic concepts
• Basic structure
• Equilibrium issues
• Applications

Equilibrium: an approach
• Focus on key question in repeated games:
  • how can rational players use the information from history?
  • need to address this to characterise equilibrium
• Illustrate a method in an argument by example
  • outline for the Prisoner's Dilemma game
  • same players face same outcomes from their actions that they may choose in periods 1, 2, …, t, …
• Prisoner's Dilemma particularly instructive given:
  • its importance in microeconomics
  • pessimistic outcome of an isolated round of the game

Prisoner’s dilemma: Reminder
Payoffs in stage game (Alf's payoff first, Bill's second):

                    Bill
                 [left]   [right]
  Alf  [LEFT]     2,2      0,3
       [RIGHT]    3,0      1,1

• If Alf plays [RIGHT] Bill’s best response is [right]
• If Bill plays [right] Alf’s best response is [RIGHT]
• Nash Equilibrium: ([RIGHT], [right]) with payoffs (1,1)
• Outcome that Pareto dominates NE: ([LEFT], [left]) with payoffs (2,2)
• The NE is inefficient
• Could the Pareto-efficient outcome be an equilibrium in the repeated game?
• Look at the structure

Repeated Prisoner's dilemma
[Figure: game tree. In the stage game at t=1 Alf chooses [LEFT] or [RIGHT] and Bill chooses [left] or [right], giving payoffs (2,2), (0,3), (3,0), (1,1). The stage game at t=2 follows from each of these four outcomes.]
• Repeat this structure indefinitely…?

Repeated Prisoner's dilemma
[Figure: the stage game (Alf: [LEFT] or [RIGHT]; Bill: [left] or [right]; payoffs (2,2), (0,3), (3,0), (1,1)) repeated through time, shown at period 1 and at period t.]
• Let's look at the detail

Repeated PD: payoffs
• To represent possibilities in long run:
  • first consider payoffs available in the stage game
  • then those available through mixtures
• In the one-shot game payoffs simply represented
  • it was enough to denote them as 0,…,3
  • purely ordinal
  • arbitrary monotonic changes of the payoffs have no effect
• Now we need a generalised notation
  • cardinal values of utility matter
  • we need to sum utilities, compare utility differences
• Evaluation of a payoff stream:
  • suppose payoff to agent h in period t is $\upsilon^h(t)$
  • value of $(\upsilon^h(1), \upsilon^h(2), \ldots, \upsilon^h(t), \ldots)$ is given by
    $[1-\delta] \sum_{t=1}^{\infty} \delta^{t-1} \upsilon^h(t)$
  • where $\delta$ is a discount factor, $0 < \delta < 1$

PD: stage game
• A generalised notation for the stage game
  • consider actions and payoffs
  • in each of four fundamental cases
• Both socially irresponsible:
  • they play [RIGHT], [right]
  • get $(\underline{\upsilon}^a, \underline{\upsilon}^b)$ where $\underline{\upsilon}^a > 0$, $\underline{\upsilon}^b > 0$
• Both socially responsible:
  • they play [LEFT], [left]
  • get $(\upsilon^{*a}, \upsilon^{*b})$ where $\upsilon^{*a} > \underline{\upsilon}^a$, $\upsilon^{*b} > \underline{\upsilon}^b$
• Only Alf socially responsible:
  • they play [LEFT], [right]
  • get $(0, \overline{\upsilon}^b)$ where $\overline{\upsilon}^b > \upsilon^{*b}$
• Only Bill socially responsible:
  • they play [RIGHT], [left]
  • get $(\overline{\upsilon}^a, 0)$ where $\overline{\upsilon}^a > \upsilon^{*a}$
• A diagrammatic view

Repeated Prisoner’s dilemma payoffs
[Figure: space of utility payoffs, with $\upsilon^a$ on the horizontal axis and $\upsilon^b$ on the vertical axis. Marked: the origin 0, $\overline{\upsilon}^a$ and $\overline{\upsilon}^b$ on the axes, the Nash-equilibrium payoffs $(\underline{\upsilon}^a, \underline{\upsilon}^b)$ and the point $(\upsilon^{*a}, \upsilon^{*b})$.]
• Space of utility payoffs
• Payoffs for Prisoner's Dilemma
• Nash-Equilibrium payoffs
• Payoffs Pareto-superior to NE
• Payoffs available through mixing
• Feasible, superior points
• "Efficient" outcomes

Choosing a strategy: setting
• Long-run advantage in the Pareto-efficient outcome
  • payoffs $(\upsilon^{*a}, \upsilon^{*b})$ in each period
  • clearly better than $(\underline{\upsilon}^a, \underline{\upsilon}^b)$ in each period
• Suppose the agents recognise the advantage
  • what actions would guarantee them this?
  • clearly they need to play [LEFT], [left] every period
• The problem is lack of trust:
  • they cannot trust each other
  • nor indeed themselves:
  • Alf tempted to be antisocial and get payoff $\overline{\upsilon}^a$ by playing [RIGHT]
  • Bill has a similar temptation

Choosing a strategy: formulation
• Will a dominated outcome still be inevitable?
• Suppose each player adopts a strategy that
  1. rewards the other party's responsible behaviour by responding with the action [left]
  2. punishes antisocial behaviour with the action [right], thus generating the minimax payoffs $(\underline{\upsilon}^a, \underline{\upsilon}^b)$
• Known as a trigger strategy
• Why the strategy is powerful
  • punishment applies to every period after the one where the antisocial action occurred
  • if punishment invoked offender is “minimaxed for ever”
• Look at it in detail

Repeated PD: trigger strategies
• Take situation at t
• First type of history
• Response of other player to continue this history
• Second type of history
• Punishment response
• Trigger strategies $[s_T^a, s_T^b]$

  $s_T^a$:
    Bill’s action in 0,…,t        Alf’s action at t+1
    [left][left],…,[left]         [LEFT]
    Anything else                 [RIGHT]

  $s_T^b$:
    Alf’s action in 0,…,t         Bill’s action at t+1
    [LEFT][LEFT],…,[LEFT]         [left]
    Anything else                 [right]

• Will it work?

Will the trigger strategy “work”?
• Utility gain from “misbehaving” at t: $\overline{\upsilon}^a - \upsilon^{*a}$
• What is value at t of punishment from t + 1 onwards?
  • Difference in utility per period: $\upsilon^{*a} - \underline{\upsilon}^a$
  • Discounted value of this in period t + 1: $V := [\upsilon^{*a} - \underline{\upsilon}^a] / [1-\delta]$
  • Value of this in period t: $\delta V = \delta [\upsilon^{*a} - \underline{\upsilon}^a] / [1-\delta]$
• So agent chooses not to misbehave if
  • $\overline{\upsilon}^a - \upsilon^{*a} \le \delta [\upsilon^{*a} - \underline{\upsilon}^a] / [1-\delta]$
• But this is only going to work for specific parameters
  • value of $\delta$
  • relative to $\overline{\upsilon}^a$, $\underline{\upsilon}^a$ and $\upsilon^{*a}$
• What values of discount factor will allow an equilibrium?

Discounting and equilibrium
• For an equilibrium the condition must be satisfied for both a and b
• Consider the situation of a
• Rearranging the condition from the previous slide:
  • $\delta [\upsilon^{*a} - \underline{\upsilon}^a] \ge [1-\delta] [\overline{\upsilon}^a - \upsilon^{*a}]$
  • $\delta [\overline{\upsilon}^a - \underline{\upsilon}^a] \ge \overline{\upsilon}^a - \upsilon^{*a}$
• Simplifying this the condition must be
  • $\delta \ge \delta^a$
  • where $\delta^a := [\overline{\upsilon}^a - \upsilon^{*a}] / [\overline{\upsilon}^a - \underline{\upsilon}^a]$
• A similar result must also apply to agent b
• Therefore we must have the condition:
  • $\delta \ge \underline{\delta}$
  • where $\underline{\delta} := \max\{\delta^a, \delta^b\}$

Repeated PD: SPNE
• Assuming $\delta \ge \underline{\delta}$, take the strategies $[s_T^a, s_T^b]$ prescribed by the Table
• If there were antisocial behaviour at t consider subgame that would start at t + 1
  • Alf could not increase his payoff by switching from [RIGHT] to [LEFT], given that Bill is playing [right]
  • a similar remark applies to Bill
  • so strategies imply a NE for this subgame
  • likewise for any subgame starting after t + 1
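
A numerical check of the discount-factor condition may help here. The sketch below is only an illustration, not part of the original slides: it plugs in the stage-game payoffs from the "Reminder" slide, so that for each player the payoff from defecting against a cooperator is 3, the cooperative payoff is 2 and the punishment (Nash) payoff is 1. The names `critical_delta`, `deviation_value` and `trigger_sustains_cooperation` are hypothetical helpers introduced for this example. Values are normalised by [1−δ], as in the payoff-stream formula, so cooperating for ever is worth exactly the cooperative payoff.

```python
from fractions import Fraction

# Stage-game payoffs from the "Reminder" slide, as (Alf, Bill).
PAYOFFS = {
    ("LEFT", "left"): (2, 2),
    ("LEFT", "right"): (0, 3),
    ("RIGHT", "left"): (3, 0),
    ("RIGHT", "right"): (1, 1),
}


def critical_delta(temptation, cooperative, punishment):
    """Smallest discount factor at which one agent will not deviate:
    (temptation - cooperative) / (temptation - punishment)."""
    return Fraction(temptation - cooperative, temptation - punishment)


def deviation_value(temptation, punishment, delta):
    """Normalised value of deviating once and then being punished for ever:
    [1 - delta] * (temptation + delta * punishment / [1 - delta])."""
    return (1 - delta) * temptation + delta * punishment


# (temptation, cooperative, punishment) payoffs for each player.
alf = (PAYOFFS[("RIGHT", "left")][0],
       PAYOFFS[("LEFT", "left")][0],
       PAYOFFS[("RIGHT", "right")][0])
bill = (PAYOFFS[("LEFT", "right")][1],
        PAYOFFS[("LEFT", "left")][1],
        PAYOFFS[("RIGHT", "right")][1])

# The binding condition is the larger of the two critical values.
delta_min = max(critical_delta(*alf), critical_delta(*bill))


def trigger_sustains_cooperation(delta):
    """True if neither player gains by deviating from [LEFT], [left]."""
    return all(
        deviation_value(tempt, punish, delta) <= coop
        for tempt, coop, punish in (alf, bill)
    )


print(delta_min)  # 1/2
print(trigger_sustains_cooperation(Fraction(3, 5)))  # True
print(trigger_sustains_cooperation(Fraction(2, 5)))  # False
```

With these payoffs both players have the same critical value, 1/2, so cooperation under the trigger strategies is sustainable exactly when the discount factor is at least 1/2. Exact fractions are used so that the boundary case δ = 1/2 is not affected by floating-point rounding.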