Problem Set 4 GTSB Fall 2015
Due on 11/15. If you are working with a partner, you and your partner may turn in a single copy of the problem set. Please show your work and acknowledge any additional resources consulted. Questions marked with an (∗) are intended for math-and-game-theory-heads who are interested in deeper, formal exploration, perhaps as preparation for grad school. The questions typically demonstrate the robustness of the results from class or other problems, and the answers do not change the interpretation of those results. Moreover, this material will not play a large role on the exam and tends to be worth relatively little on the problem sets. Some folks might consequently prefer to skip these problems.
1 Grim Trigger in the Repeated Prisoner’s Dilemma (70 points)
In one instance of the prisoner’s dilemma, each player chooses whether to pay some cost c > 0 in order to confer a benefit b > c onto the other player. The payoffs from a single iteration of this prisoner’s dilemma are therefore:
             Cooperate        Defect
Cooperate    (b − c, b − c)   (−c, b)
Defect       (b, −c)          (0, 0)
The repeated prisoner’s dilemma¹ is built out of several stages, each of which is a copy of the above game. At the end of each stage, the two players repeat the prisoner’s dilemma again with probability δ, where 0 ≤ δ ≤ 1. A strategy in the repeated prisoner’s dilemma is a rule which determines whether a player will cooperate or defect in each given stage. This rule may depend on which round it is, and on either player’s actions in previous rounds. For example, the grim trigger strategy is described by the following rule: cooperate if neither player has ever defected, and defect otherwise. The goal of this problem is to show that the strategy pair in which both players play grim trigger is a Nash equilibrium if δ > c/b.
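The setup above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the assignment: the names `stage_payoffs`, `grim_trigger`, `all_d`, and `expected_payoffs` are hypothetical, and the infinite game is approximated by truncating after `horizon` rounds, weighting round t by the probability δ^(t−1) that it is reached.

```python
from typing import Callable, List, Tuple

Action = str                             # "C" (cooperate) or "D" (defect)
History = List[Tuple[Action, Action]]    # past rounds as (own action, other's action)
Strategy = Callable[[History], Action]   # rule mapping a history to the next action


def stage_payoffs(a1: Action, a2: Action, b: float, c: float) -> Tuple[float, float]:
    """One-shot payoffs: cooperating costs c and gives the other player b."""
    u1 = (b if a2 == "C" else 0.0) - (c if a1 == "C" else 0.0)
    u2 = (b if a1 == "C" else 0.0) - (c if a2 == "C" else 0.0)
    return u1, u2


def grim_trigger(history: History) -> Action:
    """Cooperate if neither player has ever defected; defect otherwise."""
    never_defected = all(own == "C" and other == "C" for own, other in history)
    return "C" if never_defected else "D"


def all_d(history: History) -> Action:
    """Defect in every round, whatever has happened so far."""
    return "D"


def expected_payoffs(s1: Strategy, s2: Strategy, b: float, c: float,
                     delta: float, horizon: int = 500) -> Tuple[float, float]:
    """Approximate expected total payoffs when play continues with probability delta.

    Assumption: the sum is truncated after `horizon` rounds, so the result is
    only accurate when delta**horizon is negligible.
    """
    if not 0 <= delta < 1:
        raise ValueError("this sketch requires 0 <= delta < 1")
    h1: History = []
    h2: History = []
    u1 = u2 = 0.0
    reach_prob = 1.0  # probability that the current round is reached
    for _ in range(horizon):
        a1, a2 = s1(h1), s2(h2)
        p1, p2 = stage_payoffs(a1, a2, b, c)
        u1 += reach_prob * p1
        u2 += reach_prob * p2
        h1.append((a1, a2))   # each player records the round from their own side
        h2.append((a2, a1))
        reach_prob *= delta
    return u1, u2
```

Calling `expected_payoffs(grim_trigger, grim_trigger, b, c, delta)` and `expected_payoffs(all_d, grim_trigger, b, c, delta)` for a few parameter values is one way to sanity-check the closed-form answers you derive by hand below.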
(a) Suppose that player 1 and player 2 are both following the grim trigger strategy. What actions will be played in each stage of the repeated game? What are the payoffs to players 1 and 2 in each stage?
(b) Using your result from part (a), write down the expected payoff to player 1 from the entire repeated prisoner’s dilemma in terms of c, b, and δ. Hint: Remember that, if |δ| < 1:
a + aδ + aδ^2 + aδ^3 + ... = a/(1 − δ)
¹ Please consult Section 5 of the Game Theory handout on Repeated Games for details.
(c) Now we will check whether player 1 can improve his payoff by deviating from the grim trigger strategy. Argue that we only need to check the case where player 1 plays all-D, that is, player 1 defects in every round.
(d) Suppose that player 2 plays grim trigger and player 1 deviates from grim trigger and plays all-D. What is the total payoff to player 1 from the entire repeated prisoner’s dilemma?
(e) For grim trigger to be a Nash equilibrium, we need that the payoff to player 1 from playing grim trigger is greater than or equal to the payoff to player 1 from playing all-D, assuming player 2’s strategy is fixed. Using your results from parts (b) and (d), write down an inequality that must be satisfied in order for grim trigger to be a Nash equilibrium. Simplify this inequality to obtain the condition δ > c/b.
(f) (∗) - 10 points. Show that the Grim Trigger is a Subgame Perfect equilibrium in addition to being a Nash equilibrium [Hint: use the one-stage deviation principle]. For a formal discussion of subgame perfection, see the Game Theory Handout.
So far we have focused on the Grim Trigger because it is a relatively simple strategy to understand, but not necessarily because we think it is used in practice. Importantly, many of the insights we have learned from studying the Grim Trigger generalize to any Nash equilibrium.
(g) (∗) - 10 points. Show that in any Nash equilibrium in which both players play C at each period, player 2 must cooperate less in the future if player 1 were to deviate and play D at any period instead of C. Interpret this result in terms of ‘reciprocity,’ as discussed in lecture.
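The geometric series in the hint for part (b) can be checked numerically. The following is a minimal sketch with hypothetical helper names (`geometric_partial_sum`, `geometric_closed_form`); it compares a finite partial sum against the closed form a/(1 − δ) and is not a substitute for the algebra.

```python
def geometric_partial_sum(a: float, delta: float, n_terms: int) -> float:
    """Return a + a*delta + a*delta**2 + ... with n_terms terms."""
    total = 0.0
    term = a
    for _ in range(n_terms):
        total += term
        term *= delta  # next term of the series
    return total


def geometric_closed_form(a: float, delta: float) -> float:
    """Return a / (1 - delta), the limit of the series when |delta| < 1."""
    if not abs(delta) < 1:
        raise ValueError("the series only converges for |delta| < 1")
    return a / (1.0 - delta)
```

In the repeated game, a plays the role of a constant per-round payoff and δ^t is the probability that round t + 1 is reached, so the closed form is the expected total of that payoff stream.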
2 No Cooperation for Small δ (50 points)
In lecture, we argued that cooperative equilibria exist in the repeated prisoner’s dilemma if and only if δ > c/b. In problem 1, you showed that we can have a Nash equilibrium in which both players always cooperate (specifically, the equilibrium in which both players play grim trigger) if δ > c/b. In this problem, we will show that if δ < c/b, then the only Nash equilibrium is (all-D, all-D). That is, cooperative equilibria exist only if δ > c/b. Combined, your responses to these two questions thus provide a complete proof of our claim from lecture.
(a) Suppose that the strategy pair (s1, s2) is a Nash equilibrium, and let U1(s1, s2) and U2(s1, s2) be the payoffs to players 1 and 2, respectively. Show that U1(s1, s2) ≥ 0 and U2(s1, s2) ≥ 0.
(b) Notice that, in each round of the prisoner’s dilemma, the sum of the payoffs to players 1 and 2 is either 2(b − c), b − c, or 0. Show that, if (s1, s2) is any strategy pair, then U1(s1, s2) + U2(s1, s2) ≤ 2(b − c)/(1 − δ).
(c) Now assume δ < c/b. Using your result from part (b), show that U1(s1, s2) + U2(s1, s2) < 2b for any strategy pair (s1, s2). Use this to conclude that, if (s1, s2) is a Nash equilibrium, at least one player receives total payoff less than b.
(d) Suppose that, when players 1 and 2 play s1 and s2, both players cooperate in some round k. Without loss of generality, we may assume that k = 1 (otherwise we apply the argument from parts (a)-(c) to the subgame starting at round k, introducing a factor of δ^(k−1)). Using your result from part (c), show that one of the players can improve his payoff by deviating.
(e) Next we need to rule out the possibility of a round in which one player cooperates and the other defects. Repeat the argument of part (b) using the additional result that players 1 and 2 never simultaneously cooperate (so the sum of their payoffs in a given round is either b − c or 0). Show that U1(s1, s2) + U2(s1, s2) ≤ (b − c)/(1 − δ).
(f) Again assume that δ < c/b. Use your results from parts (a) and (e) to conclude that each player’s payoff is less than b; that is, U1(s1, s2) < b and U2(s1, s2) < b.
(g) Now suppose that, in the first round, player 1 cooperates and player 2 defects. By your reasoning from part (f), player 2 receives total payoff less than b. Show that player 2 can improve his payoff by deviating, so that (s1, s2) is not a Nash equilibrium.
Using this proof by contradiction, you have shown that a strategy pair (s1, s2) which involves cooperation in any period cannot be a Nash equilibrium if δ < c/b. It follows that (all-D, all-D) is the only equilibrium in this case.
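The observation behind part (b) of this problem, that each round contributes 2(b − c), b − c, or 0 to the sum of payoffs, can be explored by brute force. The sketch below is an illustration under stated assumptions, not a proof: it enumerates every sequence of action pairs over a small, finite number of rounds (the hypothetical parameter `rounds`) and weights round t by δ^(t−1). The function names are made up for this example.

```python
from itertools import product


def joint_stage_payoff(a1: str, a2: str, b: float, c: float) -> float:
    """Sum of both players' payoffs in one round: each cooperator adds b - c."""
    cooperators = (a1 == "C") + (a2 == "C")
    return cooperators * (b - c)


def max_joint_payoff(b: float, c: float, delta: float, rounds: int) -> float:
    """Largest expected joint payoff over all action paths of the given length."""
    stage_profiles = list(product("CD", repeat=2))  # (C,C), (C,D), (D,C), (D,D)
    best = float("-inf")
    for path in product(stage_profiles, repeat=rounds):
        total = sum(
            delta ** t * joint_stage_payoff(a1, a2, b, c)
            for t, (a1, a2) in enumerate(path)
        )
        best = max(best, total)
    return best


def joint_payoff_bound(b: float, c: float, delta: float) -> float:
    """The bound 2(b - c)/(1 - delta) on the sum of total payoffs."""
    return 2 * (b - c) / (1 - delta)
```

Because the number of paths grows as 4^rounds, this is only practical for a handful of rounds; it is meant to build intuition for why mutual cooperation in every round maximizes the joint payoff.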
3 Panchanathan and Boyd (2004)
Recall the model presented in Panchanathan and Boyd (2004):
“. . . we consider a large population subdivided into randomly formed social groups of size n. Social life consists of two stages. First, individuals decide whether or not to contribute to a one-shot collective action game at a net personal cost C in order to create a benefit B shared equally amongst the n−1 other group members, where B > C. Second, individuals engage in a multi-period ‘mutual aid game’. . . In each period of the mutual aid game, one randomly selected individual from each group is ‘needy’. Each of his n − 1 neighbours can help him an amount b at a personal cost c, where b > c > 0. Each individual’s behavioural history is known to all group members. This assumption is essential because it is known that indirect reciprocity cannot evolve when information quality is poor. The mutual aid game repeats with probability w and terminates with probability 1 − w, thus lasting for 1/(1 − w) periods on average.”
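The last sentence of the quoted passage, that the mutual aid game lasts 1/(1 − w) periods on average, can be checked with a short calculation. This is a minimal sketch with a hypothetical function name; it sums the probabilities that each period is reached, which equals the expected number of periods, and truncates the sum at `max_periods`.

```python
def expected_periods(w: float, max_periods: int = 10_000) -> float:
    """Approximate the expected number of periods when the game repeats with probability w.

    Period k is played with probability w**(k - 1), and the expected number of
    periods is the sum of these probabilities. Assumption: the sum is truncated
    at max_periods, which is accurate when w**max_periods is negligible.
    """
    if not 0 <= w < 1:
        raise ValueError("this sketch requires 0 <= w < 1")
    total = 0.0
    reach_prob = 1.0  # the first period is always played
    for _ in range(max_periods):
        total += reach_prob
        reach_prob *= w
    return total
```

For any w you try, the result can be compared against 1/(1 − w) directly.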
Recall, also, the “shunner” strategy:
[Figure 1: An Abstract Information Structure]
“Shunners contribute to the collective action and then try to help those needy individuals who have good reputations during the mutual aid game, but mistakenly fail owing to errors with probability e. . . Shunners never help needy recipients who are in bad standing.”