Repeated Games Economics 302 - Microeconomic Theory II: Strategic Behavior

Shih En Lu

Simon Fraser University (with thanks to Anke Kessler)

ECON 302 (SFU) Repeated Games

Topics

1. Information Sets and Game Trees
2. Subgames in Repeated Games
3. Time Discounting
4. Equilibrium in Repeated Games
5. Some Strategies in the Repeated Prisoner’s Dilemma

Most Important Things to Learn

1. Know what an "information set" is, and be able to draw a game tree even if the game does not have perfect information.
2. Know what a "subgame" is in a repeated game.
3. Know how to work with discount factors.
4. Understand why, in a repeated game, a NE of the stage game might not have to be played in all stages.
5. Understand the similarities and differences between finitely and infinitely repeated games.
6. Know what "grim trigger" and "tit-for-tat" strategies entail.
7. Know how to find conditions on the discount factor for cooperation to be sustainable in a prisoner’s dilemma.

Introduction to Repeated Games

Games formed by playing a stage game over and over again.
Much studied due to numerous applications: competition (and potential collusion) in an oligopoly, customer relations, cleaning a shared apartment, etc.
We will focus on SPEs (and not worry too much about other NEs) of repeated games where the stage game has simultaneous moves and actions are perfectly observed after each stage. We consider three cases:

1. Finitely many stages, unique NE in stage game
2. Infinitely many stages, unique NE in stage game
3. Finitely many stages, multiple NEs in stage game

Perfect Information?

Recall the definition of SPE: a strategy profile where a Nash equilibrium is played in every subgame.
We’ve defined "subgame" for games of perfect information. But do repeated games feature perfect information?
Usually not! In fact, as long as there are at least two players, any game where the players sometimes play simultaneously doesn’t have perfect information.
Therefore, we need to use a more general definition of subgames.
First step: How to draw a game tree if a game doesn’t have perfect information?

Information Set

An information set is a set of nodes where:

1. the same player is acting at all nodes in the set; and
2. the player who is acting knows that she is at the information set, but cannot distinguish between the nodes within the information set.

Note: If a player knows that she is at a particular node, then that node is the only element in its information set.
Now, a player’s strategy must specify a course of action for each information set (rather than each node) where he/she acts: you can’t play differently at nodes that you can’t tell apart!

Example 1

Let’s draw a game tree for the following Prisoner’s dilemma:

              Cooperate   Defect
Cooperate     -1,-1       -10,0
Defect        0,-10       -8,-8

Example 2

Now suppose the Prisoner’s dilemma from Example 1 is played twice. The players observe each other’s action after the first stage. Let’s draw the game tree for this repeated game.

Subgames in Repeated Games

Subgames are parts of the game tree that can stand alone as a game. For the full definition, see the Supplementary Material at the end of this set of slides.
For repeated games, a subgame starts with a single node at the beginning of a stage, and includes everything following that node, up to the end of the game tree.
Why can’t it start with some other node?
How many subgames are there in Example 2?
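One way to approach the counting question is sketched below. It assumes every stage has the same number of action profiles (4 in the prisoner's dilemma) and that a subgame can only start at the single node reached after a complete history of earlier stages. The function name `count_subgames` is hypothetical, chosen for illustration.

```python
def count_subgames(stages, profiles_per_stage=4):
    """Count subgames of a repeated simultaneous-move game with observed actions.

    Assumption: one subgame starts after each possible history at the
    beginning of each stage, so the count is 1 + k + k^2 + ... + k^(T-1),
    where k is the number of action profiles per stage and T the number
    of stages. The whole game (empty history) is included.
    """
    return sum(profiles_per_stage**t for t in range(stages))


# Twice-repeated prisoner's dilemma: the whole game plus one subgame
# after each of the 4 possible stage-1 outcomes.
print(count_subgames(2))  # 5
```

Nodes where the second mover acts within a stage do not start subgames, because they share an information set with other nodes.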

Discount Factors (I)

Often, stages correspond to time periods, and the payoffs from each stage are realized at the end of the stage (rather than the end of the game).
How can we derive payoffs for the whole game from the payoffs within each stage? We need to know the relative value of payoffs from different time periods.
Definition: The discount factor between periods t and t + 1 is the value of a unit of payoff in period t + 1 relative to the value of a unit of payoff in period t.
Assumption: For a given player, the discount factor between periods t and t + 1 is the same for all t. The discount factor is often denoted δ (or β).
This implies that the "present discounted value" of a unit of payoff n periods from today is \(\delta^n\). That is, we assume exponential discounting.

Discount Factors (II)

We will usually assume δ ∈ [0, 1):
People are impatient.
You have a small chance of dying each day.
Firms (whose payoffs are usually assumed to be their profits) can invest $1 today to get more than $1 (on average) tomorrow.

If a player has discount factor δ and a payoff stream \(u_0, u_1, u_2, \dots\) in periods 0, 1, 2, ..., then the value of that payoff stream from period 0’s point of view is:

\[ \sum_{t=0}^{\infty} \delta^t u_t = u_0 + \delta u_1 + \delta^2 u_2 + \dots \]

Aside: Exponential discounting is a common assumption in economics, but there is evidence that on top of it, people place an extra premium on present payoffs. Search for "hyperbolic discounting" and "behavioral economics" for more information.
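The discounted sum on this slide can be computed directly for a finite payoff stream. The sketch below is only an illustration: the function name `present_value` and the sample stream are hypothetical, and it assumes 0 ≤ δ < 1 and that payoffs after the listed periods are zero.

```python
def present_value(payoffs, delta):
    """Value, from period 0's point of view, of payoffs u_0, u_1, u_2, ...

    Assumes exponential discounting with a constant discount factor
    delta in [0, 1); payoffs beyond the end of the list are treated as 0.
    """
    if not 0 <= delta < 1:
        raise ValueError("this sketch assumes 0 <= delta < 1")
    # The period-t payoff is weighted by delta**t.
    return sum(delta**t * u for t, u in enumerate(payoffs))


# Hypothetical stream: payoffs 3, 0 and 4 in periods 0, 1 and 2.
print(present_value([3, 0, 4], 0.5))  # 3 + 0.5*0 + 0.25*4 = 4.0
```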

Finitely Repeated Games with Unique NE in Stage Game (I)

Let’s go back to the prisoner’s dilemma:

              Cooperate   Defect
Cooperate     -1,-1       -10,0
Defect        0,-10       -8,-8

Unique NE is (Defect, Defect).
Suppose the game is played twice, and players have discount factor δ.
Let’s revisit the game tree from Example 2.
In SPE, what must happen in the last stage?
So what will players do in the first stage?

Finitely Repeated Games with Unique NE in Stage Game (II)

This reasoning applies whenever a stage game has just one NE, and whenever the number of stages is finite.
In SPE, in the last stage, the unique stage-game NE must be played no matter what happened in earlier stages ("history").
But given that, what players do in the second-to-last stage does not impact what happens in the last stage.
Therefore, the stage game’s NE must be played in the second-to-last stage: no player has a reason not to play a best response within that stage.
Can keep going with this reasoning.
Conclusion: When a stage game with a unique NE is repeated finitely many times, the repeated game has a unique SPE where the stage game’s NE is played at each stage, regardless of history.
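The backward-induction argument starts from the stage game's NE. The sketch below finds pure-strategy NEs of the prisoner's dilemma by brute force. It assumes the first number in each cell is the row player's payoff, and it checks pure strategies only (enough here, since Defect is strictly dominant). The names `PAYOFFS` and `pure_nash_equilibria` are hypothetical.

```python
from itertools import product

ACTIONS = ["Cooperate", "Defect"]

# PAYOFFS[(row action, column action)] = (row payoff, column payoff)
PAYOFFS = {
    ("Cooperate", "Cooperate"): (-1, -1),
    ("Cooperate", "Defect"): (-10, 0),
    ("Defect", "Cooperate"): (0, -10),
    ("Defect", "Defect"): (-8, -8),
}


def pure_nash_equilibria(actions, payoffs):
    """Return all action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a1, a2 in product(actions, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        # Row player: no alternative action does strictly better against a2.
        row_ok = all(payoffs[(d, a2)][0] <= u1 for d in actions)
        # Column player: no alternative action does strictly better against a1.
        col_ok = all(payoffs[(a1, d)][1] <= u2 for d in actions)
        if row_ok and col_ok:
            equilibria.append((a1, a2))
    return equilibria


print(pure_nash_equilibria(ACTIONS, PAYOFFS))  # [('Defect', 'Defect')]
```

Because this profile is the only stage-game NE, the argument on this slide pins it down in the last stage, then in the second-to-last stage, and so on.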

Mathematical Reminder

Suppose you have discount factor δ ∈ [0, 1), and you get a payoff of K in every period starting today. What is the total present value of that payoff stream?

\[ x = K + K\delta + K\delta^2 + \dots \]

Note that

\[ x\delta = K\delta + K\delta^2 + K\delta^3 + \dots = x - K. \]

Thus,

\[ x = \frac{K}{1 - \delta}. \]

Note: Can’t do this when x is infinite, i.e. when δ ≥ 1.
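As a numerical sanity check on the formula for a constant payoff stream, the sketch below compares K / (1 − δ) with a long truncated sum. The function names and the sample values K = 2 and δ = 0.9 are hypothetical, chosen only for illustration.

```python
def constant_stream_value(K, delta):
    """Closed-form present value of receiving K in every period from today on."""
    if not 0 <= delta < 1:
        raise ValueError("the closed form requires 0 <= delta < 1")
    return K / (1 - delta)


def truncated_value(K, delta, periods):
    """Present value of receiving K in periods 0, 1, ..., periods - 1 only."""
    return sum(K * delta**t for t in range(periods))


K, delta = 2.0, 0.9
closed_form = constant_stream_value(K, delta)
approximation = truncated_value(K, delta, 500)

# The two agree up to floating-point error, since delta**500 is negligible.
print(abs(closed_form - approximation) < 1e-9)  # True
```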

Infinitely Repeated Games

"Infinitely repeated" doesn’t mean that the game never ends. But there is no time where it ends for sure.
Discount factor captures probability of game ending at each period in addition to impatience.
Useful for thinking about many kinds of interactions: personal, professional, between organizations, etc.
Sometimes called "supergames."
There are infinitely many subgames, and they all look the same!

Repeated Prisoner’s Dilemma

       C        D
C      -1,-1    -10,0
D      0,-10    -8,-8

When this game is finitely repeated, what is the SPE?
When the game is infinitely repeated, is (D, D) every period still an SPE?
As we will see, many other SPEs may be possible: there is no last period where (D, D) has to happen.
Can be viewed as model of duopoly: "C" might be "Produce half the monopoly quantity" and "D" might be "Produce the Cournot quantity."

Grim Trigger Strategy (I)

One way to provide incentives for cooperation is to use the grim trigger strategy.
"Play C in the first period. Then play C if the outcome of all previous stages was (C, C) (i.e. the "history" is "(C, C) every period"); otherwise, play D."
The grim strategy of defecting is triggered by any previous defection, including own defections.
Is a strategy profile where both players use this strategy subgame-perfect?
Clearly, it is subgame-perfect in subgames where players always play D.

Grim Trigger Strategy (II)

In subgames where players play C, we need to check that they aren’t better off playing a different strategy.
Payoff from playing according to profile: \( -\frac{1}{1 - \delta} \).
If they deviate and play D, their opponent will play D forever, in which case their best response is to play D forever.
Best possible payoff from deviation: \( 0 + \delta \cdot \left( -\frac{8}{1 - \delta} \right) \).
Thus, for following Grim Trigger to be optimal, we need:

\[ -\frac{1}{1 - \delta} \geq -\frac{8\delta}{1 - \delta} \]

\[ \delta \geq \frac{1}{8} \]
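The same comparison can be written for generic prisoner's dilemma payoffs. In the sketch below, `reward` is the per-period payoff from (C, C), `temptation` is the payoff from playing D against C, and `punishment` is the per-period payoff from (D, D). These names, and the assumption temptation > reward > punishment, are illustrative and not from the slides. Rearranging reward / (1 − δ) ≥ temptation + δ · punishment / (1 − δ) gives δ ≥ (temptation − reward) / (temptation − punishment).

```python
from fractions import Fraction


def grim_trigger_threshold(reward, temptation, punishment):
    """Smallest delta at which following grim trigger beats deviating once.

    Assumes temptation > reward > punishment (a prisoner's dilemma).
    """
    return Fraction(temptation - reward, temptation - punishment)


def cooperation_is_optimal(delta, reward=-1, temptation=0, punishment=-8):
    """Compare cooperating forever with deviating now and being punished forever."""
    follow = reward / (1 - delta)
    deviate = temptation + delta * punishment / (1 - delta)
    return follow >= deviate


# Payoffs from the slides: (C, C) gives -1, D against C gives 0, (D, D) gives -8.
print(grim_trigger_threshold(-1, 0, -8))        # 1/8
print(cooperation_is_optimal(Fraction(1, 8)))   # True (exactly indifferent)
print(cooperation_is_optimal(Fraction(1, 10)))  # False
```

Exact fractions are used so that the boundary case δ = 1/8 is not blurred by floating-point rounding.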

Tit-for-Tat Strategy

Another strategy that may sustain cooperation in a prisoner’s dilemma is "tit-for-tat."
"Play C in the first period. Then play the opponent’s action in the previous period."
Are incentives to cooperate stronger or weaker than under grim trigger?
But forgiving may be useful in a world where people make mistakes, or where observed outcomes imperfectly correlate with unobserved actions.
Note: even when players are very patient (δ close to 1), both playing tit-for-tat usually does not form an SPE because you may be better off playing C even when your opponent played D in the previous period.
But it will form a NE if the payoffs and δ are such that a one-period punishment is sufficient.
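To see how the two strategies respond to a defection, the sketch below simulates play after player 1 defects once in the first period and then returns to the prescribed strategy. It is an illustration only: the function names and the `forced_first_move` parameter are hypothetical, and it shows the path of play rather than checking equilibrium conditions.

```python
def grim_trigger(own_history, opp_history):
    """Play C until anyone has ever played D, then play D forever."""
    if "D" in own_history or "D" in opp_history:
        return "D"
    return "C"


def tit_for_tat(own_history, opp_history):
    """Play C first, then copy the opponent's previous action."""
    return opp_history[-1] if opp_history else "C"


def play(strategy1, strategy2, periods, forced_first_move=None):
    """Simulate play; optionally force player 1's action in the first period."""
    h1, h2 = [], []
    for t in range(periods):
        a1 = strategy1(h1, h2)
        if t == 0 and forced_first_move is not None:
            a1 = forced_first_move  # one-time deviation by player 1
        a2 = strategy2(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return list(zip(h1, h2))


print(play(tit_for_tat, tit_for_tat, 4, forced_first_move="D"))
# [('D', 'C'), ('C', 'D'), ('D', 'C'), ('C', 'D')]
print(play(grim_trigger, grim_trigger, 4, forced_first_move="D"))
# [('D', 'C'), ('D', 'D'), ('D', 'D'), ('D', 'D')]
```

Under tit-for-tat the single defection echoes back and forth, which is why a player may prefer to play C rather than carry out the prescribed punishment.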

Recap of Repeated Games with Unique NE in Stage Game

If finitely repeated, in SPE, unique NE of stage game must be played at each stage, regardless of history.
If infinitely repeated, there is no last stage where, in SPE, players have to play the NE of the stage game.
This means that even when the stage game has a unique NE, there can be SPE where the stage-game NE is not always played. Specifically, players may not maximize their current-stage payoff because their actions may impact play in future stages.
For example, in the prisoner’s dilemma, when δ is high enough, there exists a SPE where the outcome is cooperation in every period.
This result can be generalized to other games: if an outcome is better for all players than a NE of the stage game, then when δ is high enough, that outcome can be sustained in SPE. This is (a weak version of) the "Folk Theorem."
High δ, i.e. patience, is important: for players not to play the action that maximizes their current-stage payoff, they need to care enough about the future to play another action.

Finitely Repeated Games with Multiple NEs in Stage Game

It’s still true that an NE must be played in the last stage. But now, which NE is played can depend on what has happened before.
Example: if a NE of the stage game Pareto dominates another one, then players might play the "good" NE if they have never deviated, and the "bad" NE otherwise.
This gives an incentive not to deviate in earlier period(s). If the incentive is strong enough and players are patient enough, players might not play a NE of the stage game in earlier period(s). (Note that this is allowed in SPE: any stage game but the last one, taken by itself, is NOT a subgame.)
Reasoning much like in infinitely repeated games, with the difference that a NE must be played in the last period.

Example

Suppose that the following stage game is played twice, and the discount factor is 1:

            Nice     Mean
Polite      5,0      0,0
Rude        6,-1     1,-1

Note that there are two pure-strategy NEs in the stage game: (Rude, Nice) and (Rude, Mean).
Verify that the following is an SPE, even though (Polite, Nice), which is not a stage-game NE, is played in stage 1:

1. Player 1: In stage 1, Polite; in stage 2, Rude
2. Player 2: In stage 1, Nice; in stage 2, Nice if player 1 was polite in stage 1, Mean if player 1 was rude in stage 1

What if the discount factor is 0.1 instead?
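The key incentive to check is player 1's choice in stage 1, since that is where a stage-game NE is not played. The sketch below compares player 1's total payoff from Polite and from Rude in stage 1, taking the rest of the proposed profile as given. It assumes the first number in each cell is player 1's payoff. The names `STAGE`, `stage2_reply`, `player1_total` and `player1_follows` are hypothetical.

```python
# STAGE[(player 1's action, player 2's action)] = (player 1 payoff, player 2 payoff)
STAGE = {
    ("Polite", "Nice"): (5, 0),
    ("Polite", "Mean"): (0, 0),
    ("Rude", "Nice"): (6, -1),
    ("Rude", "Mean"): (1, -1),
}


def stage2_reply(stage1_action):
    """Player 2's stage-2 action under the proposed profile."""
    return "Nice" if stage1_action == "Polite" else "Mean"


def player1_total(stage1_action, delta):
    """Player 1's discounted payoff from a stage-1 action.

    Player 2 plays Nice in stage 1; player 1 plays Rude in stage 2.
    """
    first = STAGE[(stage1_action, "Nice")][0]
    second = STAGE[("Rude", stage2_reply(stage1_action))][0]
    return first + delta * second


def player1_follows(delta):
    """True if Polite in stage 1 is at least as good as deviating to Rude."""
    return player1_total("Polite", delta) >= player1_total("Rude", delta)


print(player1_follows(1))    # True:  5 + 6 = 11 versus 6 + 1 = 7
print(player1_follows(0.1))  # False: 5 + 0.6 versus 6 + 0.1
```

The other conditions hold for any discount factor: Rude is player 1's best response in stage 2 whatever player 2 does, and player 2's payoff depends only on player 1's action, so player 2 is always indifferent. In this sketch the comparison 5 + 6δ ≥ 6 + δ holds exactly when δ ≥ 0.2.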

Supplementary Material: General Definition of Subgames (I)

Definition: A node h’s successors are all the nodes after h, all the way to the terminal nodes (end of the game tree).
Definition: Suppose you have a game G. A subgame of G consists of the part of the extensive form of G containing a single non-terminal node and all its successors, with the property that every information set of G is either entirely inside or entirely outside that set of nodes.
The last part of the definition can be rephrased: no information set of G contains both nodes inside and nodes outside of a subgame.

Supplementary Material: General Definition of Subgames (II)

Way to remember the definition: think of information sets as spider webs. Subgames are parts of the tree (except for terminal nodes) that you can detach by snapping a single branch and without tearing a web.
Note: The whole game is always a subgame.
To solve for SPE, do what we have been doing! Start with the small subgames toward the end of the tree, and work backwards. As you work backwards, you will be solving bigger and bigger subgames.
Backward induction is a special case of this procedure: in games of perfect information, every non-terminal node and its successors are a subgame.
