Repeated Games with Discounting

Marco Pagnozzi
[email protected]
(Based on Notes by David Myatt)

Objective:
• Repeated games with discounting.
• Conditional equilibrium selection.
• Finitely vs. infinitely repeated games.
• Repeated competition and conditions for collusion.
• The Nash-threats folk theorems.

Repeating Yourself

Repeated simultaneous move games with discounting are a particular class of multi-stage games. They are often used to analyze collusive arrangements.

Players: A finite set $I = \{1, 2, \ldots, n\}$, with member $i$.
Stages: There are $T+1$ stages, with $t \in \{0, 1, \ldots, T\}$.
Actions: At each stage $t$, player $i$ chooses an action $a_i^t \in A_i$. (The (stage) action space includes mixtures over $A_i$.) The choice may be conditioned on the history of action choices in all stages up to $t-1$.
Payoffs: The (vNM) payoff to player $i$ in period $t$ is $u_i(a^t)$, where $a^t = \times_i a_i^t$. With discount factor $\delta$, the payoffs for the whole game are:

$$\Pi_i = (1-\delta) \sum_{t=0}^{T} \delta^t u_i(a^t).$$

(The leading constant in the payoffs is a normalisation; it gives the average per-period payoffs.)
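The normalisation can be illustrated numerically. The sketch below is not part of the original notes; the function name `normalised_payoff` and the example values are assumptions chosen for illustration. It evaluates $(1-\delta)\sum_t \delta^t u_t$ for a finite stream of stage payoffs and shows that a constant stream of 5 has a normalised value close to 5 over a long horizon, and $5(1-\delta^{T+1})$ over a short one.

```python
def normalised_payoff(stage_payoffs, delta):
    """Return (1 - delta) * sum_t delta**t * u_t for a finite payoff stream.

    `stage_payoffs` is the list u_0, ..., u_T (hypothetical input format).
    """
    return (1 - delta) * sum(delta ** t * u for t, u in enumerate(stage_payoffs))


# Long horizon: a constant stage payoff of 5 is (almost exactly) recovered.
constant_long = normalised_payoff([5.0] * 2000, delta=0.9)

# Short horizon (T = 1): the value is 5 * (1 - delta**2), which is below 5.
constant_short = normalised_payoff([5.0] * 2, delta=0.9)

print(constant_long, constant_short)
```

The gap between the two values is why the "average per-period" reading is exact only in the infinite-horizon limit.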

• In an infinitely repeated game, the payoffs (with discount factor $\delta$) are:

$$\Pi_i = \lim_{T \to \infty} \left[ (1-\delta) \sum_{t=0}^{T} \delta^t u_i(a^t) \right].$$

Conditional Equilibrium Selection

• Consider the following stage game (Row's payoff listed first):

              Left     Middle   Right
    Top       0, 0     3, 4     6, 0
    Middle    4, 3     0, 0     0, 0
    Bottom    0, 6     0, 0     5, 5

• Consider the mixed strategy $[T(1-\varepsilon) + M\varepsilon]$ versus the pure strategy $B$, for $\varepsilon$ very small. Row's payoffs are:

                                          L                  M                      R
    $T(1-\varepsilon) + M\varepsilon$     $4\varepsilon$     $3(1-\varepsilon)$     $6(1-\varepsilon)$
    $B$                                   0                  0                      5

• Bottom (and by symmetry, Right) is strictly dominated by the mixing. Hence, delete them to obtain:

                  L ($y$)      M ($1-y$)    Expected
    T ($x$)       0, 0         3, 4         $3(1-y)$
    M ($1-x$)     4, 3         0, 0         $4y$
    Expected      $3(1-x)$     $4x$

• Examine the payoffs graph:

[Figure: payoffs graph; horizontal axis $x$ from 0 to 1, vertical axis from 0 to 4. Left (or $y = 1$): solid; Middle (or $y = 0$): dash.]

⇒ There are 3 equilibria: 2 in pure strategies, 1 in mixed strategies.

    NE = {M, L} with payoff (4, 3);
         {T, M} with payoff (3, 4);
         {3/7, 3/7} with payoff (12/7, 12/7).

• But all three NE are Pareto dominated by {B, R}.

Repeating the Game

              L        M        R
    T         0, 0     3, 4     6, 0
    M         4, 3     0, 0     0, 0
    B         0, 6     0, 0     5, 5

• Can players do any better? Assume they repeat the game twice, with a discount factor $\delta$.
• Consider the following strategy:
(i) Play {B,R} in the first period, and {M,L} in the second.
(ii) If either player deviates in the first period, play the mixed equilibrium {3/7, 3/7} in the second period.
• In equilibrium, the strategy yields:

$$\Pi_i^E \ge 5 + 3\delta.$$

• If a player deviates in the first period he can obtain 6 instead of 5, but this leads to a penalty in the second period.
• By deviating from equilibrium, a player obtains:

$$\Pi_i^D = 6 + \frac{12}{7}\delta.$$

⇒ The strategy described is an equilibrium if a deviation is not profitable, i.e. if:

$$\Pi_i^E > \Pi_i^D \iff \delta > \frac{7}{9}.$$

The binding case is the player who receives 3 in {M,L}: $5 + 3\delta > 6 + \frac{12}{7}\delta \iff \frac{9}{7}\delta > 1$.

• If players are sufficiently patient, then they can obtain more than the stage game NE payoff.
• Notice that the strategy required:
(i) To play a stage game NE in the second period;
(ii) The presence of multiple stage game NE.
• Why?
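The threshold $\delta > 7/9$ can be re-derived with exact arithmetic. The following is a minimal sketch under the payoffs stated above; the helper name `two_period_threshold` and its parameter names are illustrative assumptions, not notation from the notes. It recomputes the mixed-equilibrium payoff 12/7 from the indifference condition and then the smallest discount factor at which the first-period gain from deviating is offset by the second-period penalty.

```python
from fractions import Fraction

# Reduced stage game (Bottom and Right deleted).
# Row is indifferent between T and M when 3(1 - y) = 4y, so y = 3/7;
# by symmetry Column is indifferent when x = 3/7.
y = Fraction(3, 3 + 4)
x = Fraction(3, 3 + 4)
mixed_payoff = 4 * y  # Row's expected payoff from M against Column's mix


def two_period_threshold(cooperate, deviate, reward, punish):
    """Delta at which cooperate + delta*reward equals deviate + delta*punish.

    All argument names are hypothetical labels for the four payoffs involved.
    """
    return Fraction(deviate - cooperate) / Fraction(reward - punish)


# Player who receives 3 in {M, L} (the binding case) and the one who receives 4.
threshold_low_reward = two_period_threshold(5, 6, 3, mixed_payoff)
threshold_high_reward = two_period_threshold(5, 6, 4, mixed_payoff)

print(mixed_payoff, threshold_low_reward, threshold_high_reward)
```

The larger of the two thresholds is the one that has to hold for the strategy profile to be an equilibrium.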

Conditional Equilibrium Selection II

              Left     Middle   Right
    Top       2, 2     0, 0     6, 0
    Middle    0, 0     4, 4     0, 0
    Bottom    0, 6     0, 0     5, 5

• Again, there are 3 Nash Equilibria in this game, but they are all dominated by {B,R}.
• Assume players repeat the game twice, with a discount factor $\delta$.
• Consider the following strategy:
(i) Play {B,R} in the first period, and {M,M} in the second.
(ii) If either player deviates in the first period, play the pure strategy equilibrium {T,L} in the second period.
• If a player deviates in the first period he can obtain 6 instead of 5, but this leads to a penalty of $(4 - 2) = 2$ in the second period.
⇒ The strategy described is an equilibrium for $\delta > 1/2$ (the gain of 1 today must not exceed the discounted penalty $2\delta$).
• Again, this required a stage game NE in the second period.

Equilibria of Finitely Repeated Games

• Payoff outcomes from repeated games can at least mimic the stage game:
– Play a stage game NE at every stage, contingent on the stage but not the history of plays.
– Then deviating cannot benefit within a stage, and does not change future play.
⇒ We have a subgame perfect equilibrium.
• But multiple stage NE allow more involved outcomes, since future choice may be contingent on current play (e.g. conditional equilibrium selection).
• Then deviation in a stage has two effects: it alters stage payoffs, and may change the equilibrium played in future stages.
• This enables various outcomes, conditional on a high enough discount factor.
• But when there is a unique stage game NE, the only subgame perfect equilibrium of the finitely repeated game is to play it in every stage.
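The second two-period construction (the stage game with payoffs (2, 2), (4, 4) and (5, 5) on the diagonal) can be checked by brute force. This is a sketch only: the dictionary layout and the function name `pure_nash_equilibria` are assumptions made for illustration. It enumerates the pure-strategy Nash equilibria, confirms that {B,R} is not one of them, and recomputes the threshold $\delta = 1/2$ from the deviation gain and the second-period penalty.

```python
from fractions import Fraction
from itertools import product

ROWS = ["T", "M", "B"]
COLS = ["L", "M", "R"]
# (Row payoff, Column payoff) for each cell of the stage game.
PAYOFFS = {
    ("T", "L"): (2, 2), ("T", "M"): (0, 0), ("T", "R"): (6, 0),
    ("M", "L"): (0, 0), ("M", "M"): (4, 4), ("M", "R"): (0, 0),
    ("B", "L"): (0, 6), ("B", "M"): (0, 0), ("B", "R"): (5, 5),
}


def pure_nash_equilibria(payoffs):
    """Return the cells where neither player gains from a unilateral deviation."""
    equilibria = []
    for r, c in product(ROWS, COLS):
        row_best = all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0] for r2 in ROWS)
        col_best = all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1] for c2 in COLS)
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria


pure_ne = pure_nash_equilibria(PAYOFFS)

# Row's best first-period deviation from {B, R}, and the second-period penalty
# from being moved from {M, M} to {T, L}.
gain = max(PAYOFFS[(r, "R")][0] for r in ROWS) - PAYOFFS[("B", "R")][0]
penalty = PAYOFFS[("M", "M")][0] - PAYOFFS[("T", "L")][0]
threshold = Fraction(gain, penalty)

print(pure_ne, threshold)
```

The enumeration covers pure strategies only; the third (mixed) equilibrium mentioned in the text is not searched for here.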

Finitely Repeated Prisoners' Dilemma

• Consider the two level pricing game, a Prisoners' Dilemma (Row's payoff listed first):

              High     Low
    High      3, 3     0, 5
    Low       5, 0     1, 1

⇒ There is a unique dominant strategy NE: {L,L}.
• Consider repeating this stage game $T$ times:
– In the last stage, it is a dominant strategy for both firms to price low.
– In the penultimate stage, current behaviour does not influence future payoffs. Hence both firms price low.
– Iterating back, firms price low at every stage.
⇒ Through the backward induction logic, the unique subgame perfect equilibrium of the finitely repeated game is to price low every period.
• Hence finitely repeated games may not yield outcomes beyond those of the stage game.
• The argument depends on the existence of a final period.

Repeated Competition and Collusion

• In a one-shot and in a finitely-repeated Bertrand game, firms price at marginal cost and earn zero profits.
• But infinite repetition yields the possibility of collusion.
• Suppose firms adopt the following (trigger) strategy:
– Charge the monopoly price in each period (and obtain an equal share of the monopoly profits), unless a player deviates, in which case switch to Bertrand equilibrium forever.
• The payoff from colluding, with $n$ firms in the market, discount factor $\delta$ and monopoly profit $\pi^M$, is:

$$\Pi^C = \sum_{t=0}^{\infty} \delta^t \frac{\pi^M}{n} = \frac{1}{1-\delta} \frac{\pi^M}{n}.$$

• A firm can deviate and undercut its rival. This yields $\pi^M$ for one period, but zero profits forever after (since it destroys collusion):

$$\Pi^D = \pi^M + 0.$$

• Hence a firm will not deviate iff:

$$\Pi^C \ge \Pi^D \iff \frac{1}{1-\delta} \frac{\pi^M}{n} \ge \pi^M \iff 1 \ge n(1-\delta) \iff \delta \ge 1 - \frac{1}{n}$$

(where $\delta$ is the discount factor, $\pi^M$ the monopoly profit, and $\Pi^C$, $\Pi^D$ the payoffs from colluding and from deviating).

⇒ If players are sufficiently patient, then tacit collusion is possible.
• But there are many equilibria in this game:
– Consider price $p \in [c, p^M]$ yielding payoff $\pi(p)$.
– The trigger strategy "choose price $p$ in each period; if a player deviates choose $c$ forever" is an equilibrium iff:

$$\frac{1}{1-\delta} \frac{\pi(p)}{n} \ge \pi(p) \iff \delta \ge 1 - \frac{1}{n}.$$

⇒ Any price between marginal cost and monopoly price can be sustained in equilibrium as long as $\delta \ge 1 - \frac{1}{n}$.

Conditions for Collusion

• The previous argument required an infinite horizon. This might seem unrealistic.
• An alternative interpretation is an indefinite horizon. For instance, the discount parameter $\delta$ may be the probability that the world continues tomorrow.
• The collusive outcome required a sufficiently high discount factor:
– Players need to care about the future; they trade the benefit of cheating today against the penalty of punishment tomorrow.
• $\delta$ becomes higher as the time periods shorten. Hence regular interaction helps to support collusion.
• If the future is uncertain, this corresponds to a lower $\delta$. Hence collusion is harder.
• The required $\delta$ depends on the number of firms.
• The temptation to cheat is the capture of the whole market, but the punishment is the loss of the $1/n$ share of the market during collusive phases.

The Nash-Threats Folk Theorem

• Consider the prisoners' dilemma (Row's payoff listed first):

              High     Low
    High      3, 3     0, 5
    Low       5, 0     1, 1

• The convex hull of the strategy profile payoffs represents the feasible payoffs.

[Figure: feasible payoff set of the prisoners' dilemma; both axes run from 0 to 5.]

• For sufficiently high $\delta$, any per-period feasible payoff vector higher than the stage-game NE payoff vector may be achieved.
• Such a payoff vector can be supported in the repeated game by a trigger strategy:
– Play the prescribed strategies to yield the required average payoff.
– If anyone deviates, revert to the stage NE forever.
• This is the Nash-threats Folk Theorem.
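The trigger-strategy conditions above share one structure: a constant cooperative payoff is compared with a one-period deviation payoff followed by permanent reversion to the stage NE. The sketch below makes that explicit. The function name `grim_trigger_threshold` and the normalisation of the monopoly profit to 1 are assumptions for illustration, not part of the notes. Rearranging $\frac{c}{1-\delta} \ge d + \frac{\delta p}{1-\delta}$ gives $\delta \ge \frac{d - c}{d - p}$.

```python
from fractions import Fraction


def grim_trigger_threshold(cooperate, deviate, punish):
    """Smallest delta with cooperate/(1-delta) >= deviate + delta*punish/(1-delta).

    cooperate: per-period payoff on the cooperative path
    deviate:   one-period payoff from the best deviation
    punish:    per-period stage-NE payoff after a deviation
    """
    return Fraction(deviate - cooperate) / Fraction(deviate - punish)


# Prisoners' dilemma: (High, High) pays 3, deviating pays 5, (Low, Low) pays 1.
pd_threshold = grim_trigger_threshold(3, 5, 1)

# Bertrand collusion with n firms, monopoly profit normalised to 1 (assumption):
# each firm earns 1/n when colluding, 1 when undercutting, 0 in the price war.
bertrand_thresholds = {
    n: grim_trigger_threshold(Fraction(1, n), 1, 0) for n in (2, 3, 5, 10)
}

print(pd_threshold, bertrand_thresholds)
```

The Bertrand values reproduce $1 - 1/n$, so the required patience rises with the number of firms.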

Minmax Threats and the Folk Theorem

• Definition: Player $i$'s minmax payoff (or reservation utility) is the lowest payoff that the other players can force upon $i$:

$$\underline{v}_i = \min_{a_{-i} \in A_{-i}} \max_{a_i \in A_i} u_i(a_i, a_{-i}).$$

• This is the worst other players can force on player $i$, recognizing that player $i$ will do her best in the circumstance.
⇒ Player $i$ cannot be forced below $\underline{v}_i$ in the stage game.
• The set of individually rational feasible payoffs is the subset of the feasible payoffs that give each player at least her minmax payoff.
• Folk Theorem: For every individually rational feasible payoff vector $v$, there exists a discount factor $\underline{\delta} < 1$ such that, for all $\delta \in (\underline{\delta}, 1)$, there is a Nash equilibrium of the repeated game with payoff $v$.
⇒ If players are sufficiently patient then any feasible, individually rational payoff can be obtained in equilibrium.

• Consider the following game (Row's payoff listed first):

           C        D
    C      4, 4     2, 0
    D      0, 2     1, 1

• C is a dominant strategy ⇒ (C, C) is the unique Nash equilibrium.
• The Nash-threat Folk Theorem only indicates that (4, 4) can be achieved.

[Figure: payoff set of this game; both axes run from 0 to 4.]
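The minmax definition can be applied mechanically to the 2x2 game in the text, reading its payoff matrix as (C,C) = (4, 4), (C,D) = (2, 0), (D,C) = (0, 2), (D,D) = (1, 1) with Row's payoff first. The sketch below is restricted to pure strategies, which is an assumption of this illustration (the definition allows mixtures); the names `pure_minmax_row` and `pure_minmax_col` are hypothetical.

```python
ACTIONS = ["C", "D"]
# U[(row_action, col_action)] = (Row payoff, Column payoff)
U = {
    ("C", "C"): (4, 4), ("C", "D"): (2, 0),
    ("D", "C"): (0, 2), ("D", "D"): (1, 1),
}


def pure_minmax_row(u):
    """Column picks the action that minimises Row's best-reply payoff."""
    return min(max(u[(r, c)][0] for r in ACTIONS) for c in ACTIONS)


def pure_minmax_col(u):
    """Row picks the action that minimises Column's best-reply payoff."""
    return min(max(u[(r, c)][1] for c in ACTIONS) for r in ACTIONS)


row_minmax = pure_minmax_row(U)
col_minmax = pure_minmax_col(U)

print(row_minmax, col_minmax)
```

Both values are below the Nash payoff of 4, which is what leaves room for equilibria other than (4, 4).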

• But consider the minmax payoffs:
– If Column chooses C, Row's best reply is C and Row gets 4.
– If Column chooses D, Row's best reply is C and Row gets 2.
⇒ Column can force Row to get 2.
• Minmax payoff for both players is 2. So many other outcomes can be achieved as subgame perfect equilibria:

[Figure: payoff set of this game; both axes run from 0 to 4.]

Repeated Competition with Noisy Demand

• The standard supergame-theoretic reasoning assumes that prices are observed.
• If secret price-cutting is available (i.e., a firm cannot observe the price set by its opponent), then the collusion rule must depend on a publicly-observed state variable.
• Consider Tirole's version of the Green and Porter model of collusion with noisy demand:
– With probability $(1-k)$, demand follows from the standard model.
– With probability $k$, market demand falls to zero.
• Then a player can have zero sales for two reasons:
(a) Her opponent has undercut her;
(b) Demand was zero.

Strategies with Noisy Demand

• Consider the following strategy:
– Play $p^M$ (monopoly price) and split the market as long as demand is positive;
– Play $p^C$ (marginal cost) for $T$ periods if zero demand is observed by either firm.
⇒ Players enter a price war phase whenever one of them observes zero demand.
• Notice that the event of entering a price war is commonly known to both players:
– If a player colludes and observes zero demand, she knows there will be a price war;
– If a player deviates, she knows her opponent observes zero demand and there will be a price war.

Calculating Payoffs and Supporting Collusion

• Suppose that players collude and charge $p^M$.
• Denote the expected payoff in the collusive phase as $V_C$ and the expected payoff in the price war phase as $V_W$. With discount factor $\delta$ and monopoly profit $\pi^M$:

$$V_C = \underbrace{(1-k)\left[\frac{\pi^M}{2} + \delta V_C\right]}_{\text{Keep colluding}} + \underbrace{k\,\delta V_W}_{\text{Enter war}}, \qquad V_W = \underbrace{\delta^T V_C}_{\text{Collude after } T \text{ periods}}.$$

• A deviating player obtains $V_D$:

$$V_D = (1-k)\left[\pi^M + \delta V_W\right] + k\,\delta V_W.$$

• To avoid a deviation by a player we require:

$$V_D \le V_C \iff (1-k)\left[\pi^M + \delta V_W\right] \le (1-k)\left[\frac{\pi^M}{2} + \delta V_C\right] \iff \pi^M + \delta V_W \le \frac{\pi^M}{2} + \delta V_C \iff \frac{\pi^M}{2} \le \delta (V_C - V_W).$$

• Solving $V_C$ and $V_W$ simultaneously (from the initial equations) yields:

$$V_C = \frac{(1-k)\,\pi^M/2}{1 - (1-k)\delta - k\delta^{T+1}}, \qquad V_W = \frac{(1-k)\,\delta^T \pi^M/2}{1 - (1-k)\delta - k\delta^{T+1}}.$$

• Substituting for $V_C$ and $V_W$ in the condition for $\delta$ we obtain:

$$2(1-k)\delta - (1-2k)\delta^{T+1} \ge 1.$$

• This is satisfied for appropriate values of $\delta$ and given a large enough $T$ (i.e., for $k$ small, $\delta$ large and $T$ large).

• The model has the following features:
– Agents never cheat in equilibrium, yet price wars still occur.
– Price wars are required to give players incentive to behave in equilibrium.
– Given low demand, a player would rather not "pull the trigger".
– Players enter a price war phase, since they expect their opponents to enter too.
• The existence of a public state variable is critical in the Green and Porter model.
• Folk Theorems are available under incomplete information.
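The incentive condition of the noisy-demand model, $2(1-k)\delta - (1-2k)\delta^{T+1} \ge 1$, can be explored numerically. The sketch below is illustrative only: the function names, the normalisation $\pi^M = 1$ and the parameter values are assumptions. It computes $V_C$ and $V_W$ from the closed forms, builds $V_D$ from its definition, and compares the direct test $V_D \le V_C$ with the reduced inequality.

```python
def phase_values(delta, k, T, monopoly_profit=1.0):
    """Closed-form V_C and V_W for discount factor delta, zero-demand
    probability k and price-war length T."""
    denom = 1 - (1 - k) * delta - k * delta ** (T + 1)
    v_c = (1 - k) * monopoly_profit / 2 / denom
    v_w = delta ** T * v_c
    return v_c, v_w


def deviation_value(delta, k, T, monopoly_profit=1.0):
    """V_D: take the whole market once, then enter the price war."""
    _, v_w = phase_values(delta, k, T, monopoly_profit)
    return (1 - k) * (monopoly_profit + delta * v_w) + k * delta * v_w


def collusion_sustainable(delta, k, T):
    """Reduced no-deviation condition from the text."""
    return 2 * (1 - k) * delta - (1 - 2 * k) * delta ** (T + 1) >= 1


# Hypothetical parameter values: a long price war deters cheating, a short one does not.
long_war = collusion_sustainable(delta=0.9, k=0.1, T=10)
short_war = collusion_sustainable(delta=0.9, k=0.1, T=1)

print(long_war, short_war)
```

With these values only the longer punishment phase supports collusion, in line with the remark that $T$ has to be large enough.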