Collusion as Public Monitoring Becomes Noisy:

Experimental Evidence∗

Masaki Aoyagi† (Osaka University) and Guillaume R. Fréchette‡ (New York University)

October 13, 2005

Abstract

This paper studies collusion in an infinitely repeated game when the opponent's past actions are observed only through a noisy public signal. Attention is focused on a threshold strategy, which switches between cooperation and punishment phases based on the comparison between the realized public signal and a threshold. The paper develops a simple theory of when such a strategy supports the most efficient symmetric (perfect public) equilibrium, and characterizes its payoff as a function of noise in monitoring. The theoretical predictions are then tested in laboratory experiments. It is found that subjects' payoffs (i) decrease as noise increases, and (ii) are lower than the theoretical maximum for small noise, but exceed it for large noise. It is also estimated that the subjects' strategies are best described by a simple threshold strategy that looks only at the most recent public signal.

Keywords: Repeated games, collusion, cooperation.

∗Special thanks are given to John Kagel, who helped with comments and guidance throughout the development of the paper. We have benefited from discussions with Al Roth and Marco Casari and comments by Pedro Dal Bo, Georg Weizsacker and seminar participants at Université Laval, Harvard University, New York University, Université de Montréal, Carnegie Mellon University, RIETI, Rutgers, and the Stockholm School of Economics, as well as participants of the 2003 International Meeting of the Economic Science Association, and the 14th Summer Festival on Game Theory at Stony Brook. Alex Brown provided valuable research assistance. This research was supported in part by the Harvard Business School, by the Center for Experimental Social Science and the C.V. Starr Center. We are responsible for all remaining errors.
†ISER, 6-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan.
‡Department of Economics, 269 Mercer St. 7th Floor, New York, NY, 10003-6687.

1 Introduction

Imperfection in monitoring is an integral part of many competitive situations in reality. It is a particularly important topic in the field of industrial organization, where the signals are subject to various external shocks. Firms' ability to cooperate in such environments is of clear interest to researchers as well as regulation authorities. A key prediction of the theory of repeated games is that when players are patient, a simple strategy can sustain collusion even under imperfect public monitoring, where opponents' actions are observed only through a noisy public signal. In the repeated prisoners' dilemma (PD), for example, it is known that any payoff in an important class of symmetric equilibria can be achieved through the use of a grim-trigger strategy, which reverts to the one-shot Nash equilibrium in the event of particular signal realizations.

This paper is aimed at testing the theory of imperfect public monitoring in infinitely repeated games: It provides a theoretical characterization of the maximal equilibrium payoff as a function of noise in monitoring, and then uses laboratory experiments to test the theoretical predictions. The main focus of the paper is on the comparative statics of the effect of noise on the players' payoffs, and on the strategies they use to achieve collusion.

While imperfect monitoring has attracted much attention in economic theory, empirical work on the subject has been limited because of some fundamental difficulties. For example, it is not easy to identify the exact public signal the firms use to coordinate their actions: it could be price, shares in a nationwide or regional market, industry output, or a combination of any of these and other indicators. There are also difficulties with identifying the firms' strategic variables and their payoff structure. Free from these problems, a laboratory experiment in a controlled environment is considered an ideal alternative for the study of the subject.

Green and Porter (1984) are the first to provide a theoretical analysis of repeated games with imperfect public monitoring: In their model of quantity-setting oligopoly, the market price serves as a noisy public signal of firms' output quantities because of demand fluctuations. They put forth an equilibrium based on the trigger strategy as follows: The firms produce at the jointly monopolistic level as long as the realized price is above a certain threshold, but revert to the one-shot Cournot quantity for a fixed number of periods when it falls below the threshold. Because of the random component in demand, periodic price wars occur on the equilibrium path.

As in Green and Porter (1984), the present model has a one-dimensional public signal (price) whose distribution is monotonically related to the players' actions (quantities). While the set of repeated game strategies is large, we suppose that players coordinate their actions based on a simple rule. In particular, we suppose that players use a threshold strategy, which uses thresholds on the public signal as a coordination device: The players agree to take one action profile if the realized public signal is above a certain value, but take another action profile if it falls below it. Specifically, we consider a symmetric stage game with two actions for each player: As in the PD, the first action gives rise to an efficient symmetric profile, while the second action gives rise to an inefficient symmetric (one-shot) Nash equilibrium. We characterize the set of equilibrium payoffs when the two players' repeated game strategies are symmetric, sequentially rational, and public in the sense that today's actions are determined only by past public signals. In this case, it is known from the bang-bang property (Abreu et al. (1990)) that the highest equilibrium payoff is sustained by a grim-trigger strategy. Based on this fact, we identify sufficient conditions for the optimal grim-trigger strategy to trigger the punishment based on a threshold condition. We then proceed to derive an explicit and testable link between the maximal payoff and the level of noise in monitoring.1

In our experiments, we use a simple two-person two-action PD as the stage game, and assume a simple distribution for the noise component of the public signal. The distribution allows for an explicit derivation of the maximum payoff as a function of noise in monitoring. We have five treatments depending on the level of noise, from zero to infinity. In theory, positive cooperation is possible for the three low noise treatments, while the best equilibrium entails no cooperation for the two high noise treatments.

We have two major objectives in analyzing data from our experiments. First, we compare the players' actual payoffs against the theoretical maximum derived as above, and examine how they change with the noise in monitoring. Our findings are as follows:

1) The level of cooperation is positive for any noise level.

2) The level of cooperation is lower than the theoretical maximum for small noise but higher than that for large noise.

1 This should be contrasted with the qualitative conclusion of Kandori (1992) that the set of (symmetric and asymmetric) perfect public equilibrium payoffs strictly expands as monitoring becomes more accurate. We are unaware of any quantitative characterization of equilibrium payoffs as a function of noise in monitoring.


3) The level of cooperation decreases as noise increases.

4) For large noise for which theory predicts no cooperation, the level of cooperation is no higher than that in a one-shot game.

We also find a distinctive pattern in the evolution of play. In the low noise treatments for which cooperation is possible in theory, we observe that subjects increase the level of cooperation as they accumulate experience. In the high noise treatments, on the other hand, the subjects begin to behave more non-cooperatively as they become more experienced.

As stated above, we find that reducing noise in monitoring increases the level of cooperation. It should be emphasized that this is the first experiment to identify such a relationship between the noise in monitoring and the level of cooperation in the standard imperfect public monitoring setting. This result is far from obvious for the following reasons: First, as is well known, experiments on the one-shot PD often generate positive levels of cooperation. Given that their subjects cooperate even with no histories, these experiments may be interpreted as suggesting that past histories would play an insignificant role in their decisions in repeated games. The present experiment rejects this view. Second, our finding is at odds with those of some experiments on imperfect monitoring. These experiments, which model imperfect monitoring in ways significantly different from the present one, find no increase in cooperation when monitoring becomes "more accurate."2 Third, sustaining cooperation in a repeated game requires the ability to perform non-trivial reasoning even if it only involves a simple strategy. Our finding suggests that the subjects do in fact possess such capabilities.

Our second objective is to estimate the subjects' repeated game strategies. For this, we identify the strategy that best describes the data from a class of strategies that use thresholds. Specifically, we consider a class of strategies which start out in the cooperation phase, switch to the punishment phase when the public signal falls below a certain threshold, and return to the cooperation phase after some number of periods provided that the signal then is above another threshold. It can be seen that this class encompasses those strategies that are most frequently discussed in the literature: Included in this class are the trigger strategies as in Green and Porter

2 As discussed in the next section, these experiments deviate in some important ways from the standard models of imperfect public monitoring.

(1984) as well as the tit-for-tat strategy. In all but one noise treatment, we find that the subjects use only the most recent public signal in determining today's action: In other words, they choose the cooperative action today if the public signal yesterday is above a certain threshold, and choose the non-cooperative action otherwise.

It should be noted that our laboratory experiments are designed in strict accordance with the underlying assumptions of the tested theory. First, by the standard identification of the continuation probability with the discount factor, an infinitely repeated game is replaced by a repeated game with a random termination point. The noise is taken to be independent and identically distributed across periods and has full support regardless of actions. Payments to subjects are designed so that they are bounded and reveal no more information than the public signal during the course of play.3 Each pair of subjects understands that they observe the same stochastic signal after every period, and how its probability distribution is related to their actions.

The organization of the paper is as follows. In the next section, we present a brief review of the related literature. Section 3 presents a model of a repeated game with imperfect public monitoring, and Section 4 proves the optimality of a threshold strategy and characterizes its payoff. The experimental design is described in Section 5. Section 6 reports the results of our experiments: We first test the theoretical prediction on the players' payoffs and then estimate their strategies. We conclude in Section 7 with some discussions.

2 Related Literature

Most empirical work on repeated games with imperfect monitoring analyzes the data from the 1880s Joint Executive Committee (JEC) railroad cartel, with a special emphasis on the specification of the timing of regime shifts, i.e., switches between cooperation and punishment phases in the repeated game.4 Early work assumes that regime shifts follow a Bernoulli distribution (Porter (1983), Lee and Porter (1984)), while some later work uses a Markov chain in the estimation (Cosslett and Lee (1985)). Porter (1985) takes a detailed look into what triggers the regime shifts, the effect of market structure, and the determinants of price war duration. Ellison (1994), again using the JEC data, tests the Green and Porter (1984) model. In

3 See Section 6.
4 Green and Porter (1984) suggest that such regime shifts follow a Markov process of order equal to the length of punishment periods.

contrast to the prior estimates that were closer to the Cournot level, he finds collusive behavior to be much closer to the monopoly level. He identifies several factors as statistically significant determinants (i.e., triggers) of regime shifts. However, the estimated mechanism is not strong enough to deter cheating. Furthermore, he finds evidence of secret price cutting, which is not predicted by the model.

Experimental economics has focused much attention on the possibility of cooperation in various models of oligopolies including the PD and public goods games.5 For repeated games with perfect monitoring, the results of laboratory experiments generally indicate that repeated play generates cooperation strictly above the one-shot Nash equilibrium level and below the first-best level. However, there is no definitive conclusion on the strategies that players use to achieve cooperation. For example, there exists conflicting evidence as to the use of trigger strategies.6 It should also be noted that most of the early results need to be interpreted with caution as they pertain to repeated games with an "unknown horizon," where subjects are not informed of how long the game will last.7

Experiments on imperfect monitoring include Feinberg and Snyder (2002), Cason and Khan (1999), and Holcomb and Nelson (1997). These papers introduce monitoring imperfection in rather specific ways. Cason and Khan (1999) study a repeated public good experiment and compare standard perfect monitoring with perfect, but delayed, monitoring of past actions. They do not find any significant difference in the levels of contributions between the two treatments. Feinberg and Snyder (2002) consider a version of the repeated PD where each subject observes his own payoff in each period. They introduce imperfection by occasionally manipulating those payoff numbers, and compare the treatments with and without the ex post revelation of such manipulation. Less collusive behavior is found in the latter treatment. Holcomb and Nelson (1997), on the other hand, study a repeated duopoly model in which information about the opponent's quantity choice is randomly changed 50% of the time. They conclude that such manipulation "does significantly affect market outcomes" (p. 79). It should be noted that the formulation of imperfect monitoring

5 See Holt (1995) for a literature review.
6 See, for example, Sell and Wilson (1991), Feinberg and Husted (1993), and Engle-Warnick and Slonim (2002).
7 See Roth and Murnighan (1978), who point out that such a game yields significantly different results from an infinitely repeated game. They propose, to properly replicate an infinitely repeated game with discounting in an experimental setting, terminating the game after each period with a fixed probability.

in these papers is not in line with the assumptions of the standard theory. For example, in Feinberg and Snyder (2002) and Holcomb and Nelson (1997), monitoring is imperfect but private, since players in these models do not necessarily observe the same signal.8

In contrast to the above models, the formulation of imperfect monitoring in this paper strictly matches that of the standard theory. The distinguishing feature of our model is the assumption that the public signal is a one-dimensional real variable. This assumption has the following advantages: First, it closely replicates the oligopoly models where price serves as the public signal. Second, this specification naturally incorporates a monotone relationship between the signal and action: the higher the public signal, the more likely the other player has cooperated. This relationship is easy for the subjects to understand, and also justifies the use of the threshold strategy.

Besides imperfection in monitoring, some recent experiments look at factors that affect players' cooperative behavior. Duffy and Ochs (2003) study the effects of fixed versus random pairing in a repeated game. For parameter values that can sustain cooperation even with random matching, they find that cooperation emerges only in the fixed pairing case. Dal Bo (2003) compares a repeated game with random termination against one with a fixed and known length. He finds that cooperation in the former treatment is at a higher level.9

One of the main focuses of the present paper is the analysis of the players' strategies in repeated games. This is the subject of some recent experiments as follows. Mason and Phillips (2001) study the use of trigger strategies in a repeated Cournot duopoly game with perfect monitoring. They estimate the duration and severity of punishment by changing the stage payoffs corresponding to deviations. They conclude that the evidence is consistent with the use of trigger strategies, and that behavior is more consistent with the use of a strategy with long and mild punishment phases. Engle-Warnick and Slonim (2002) study the strategies played by subjects in repeated trust games with perfect monitoring. They look for the strategy that best fits the observed play from the set of pure strategies that can be expressed as deterministic finite automata, and conclude that some subjects use a grim-trigger

8 Cason and Khan (1999) and Holcomb and Nelson (1997) use finite horizon games but do not specify what information was given to their subjects about the duration of the game.
9 A much earlier experiment by Roth and Murnighan (1978) also studies the effect of the continuation probability on the level of cooperation. However, their experiment matches subjects to computerized opponents.

strategy.10 In comparison with the above papers, the threshold specification in the present paper allows for a direct estimation of the subjects' strategies based on a standard econometric technique.

3 Model

The set I of n players play a symmetric game infinitely often.11 Player i's action a_i in the stage game is chosen from the set A_i = {a_i⁰, â_i}. Let A = A_1 × ··· × A_n denote the set of action profiles a = (a_1, ..., a_n) of all players. After each period, players do not observe each other's actions but observe a random public signal z ∈ ℝ whose probability distribution is determined by the action profile a ∈ A played in that period. Denote by h(z | a) the density of the public signal z under the action profile a, and by H(z | a) the corresponding cumulative distribution. In the Cournot model with stochastic demand, for example, a_i and z correspond to firm i's output quantity and the realized price level, respectively.

Player i’s stage-payoff when the signal realization is z is given by wi(ai,z).It should be noted that i’s payoff depends on other players’ actions only through the public signal z, and hence does not provide more information than z itself.

Given the action profile a, let g_i(a) be player i's expected stage payoff:

g_i(a) = ∫_ℝ w_i(a_i, z) h(z | a) dz.

We assume that the stage game (A, (g_1, ..., g_n)) is a PD:

g_i(a_i⁰, â_{−i}) > g_i(â) > g_i(a⁰) ≥ g_i(â_i, a⁰_{−i}).

In other words, â = (â_1, ..., â_n) is the efficient symmetric profile, and a⁰ = (a_1⁰, ..., a_n⁰) is a one-shot Nash equilibrium as well as a mutual minmax profile. Furthermore, a_i⁰ is a profitable one-shot deviation from the efficient profile â. Denote the one-shot Nash equilibrium payoff by g⁰ = g_i(a⁰) and the efficient payoff by ĝ = g_i(â). A t-length public history is the history of signals z in periods 1 through t. A t-length private history of player i is the history of signals z and i's actions in periods

10 It should be noted that this estimation technique is not practical for games with a long expected horizon. In Engle-Warnick and Slonim (2002), the continuation probability is such that the expected length of the repeated game is five periods. It is ten in our case.
11 While our experiment is on two-person games, the theory is presented for n-person games as this entails little extra cost.

1 through t. The set of t-length public histories is given by ℝᵗ, while the set of t-length private histories of player i is given by ℝᵗ × A_iᵗ. Player i's (pure) strategy is a function σ_i : ⋃_{t=0}^∞ (ℝᵗ × A_iᵗ) → A_i. It is a public strategy if it is a function of the public history alone.

Let δ < 1 denote the players' common discount factor. Given a strategy profile σ, the expected payoff to player i is given by

π_i(σ) = (1 − δ) Σ_{t=1}^∞ δ^{t−1} g_iᵗ,

where g_iᵗ is the expected stage payoff in period t under the probability distribution induced by σ. The strategy profile σ is a (pure) equilibrium if for every i, π_i(σ) ≥ π_i(σ_i′, σ_{−i}) for any strategy σ_i′. An equilibrium σ = (σ_1, ..., σ_n) is public if σ_i is a public strategy for each i. A public equilibrium is perfect if σ_i is a best response to

σ_{−i} for each i after every public history, and is (strongly) symmetric if σ_1 = ··· = σ_n.
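The identification of δ with a continuation probability, used in the experimental design below, can be checked by simulation. The following is a minimal sketch of ours (not from the paper), with a constant hypothetical stage payoff: the expected sum of stage payoffs under random termination equals the discounted sum Σ_t δ^{t−1} gᵗ.

```python
# Sketch (ours, not from the paper): with termination probability 1 - delta
# after each period, the expected SUM of stage payoffs equals the discounted
# sum, i.e. pi_i(sigma) / (1 - delta) for a constant stage payoff.
import random

def one_cycle_total(stage_payoff, delta, rng):
    total = 0.0
    while True:
        total += stage_payoff          # constant hypothetical stage payoff
        if rng.random() > delta:       # terminate with probability 1 - delta
            return total

rng = random.Random(1)
delta, g = 0.9, 25.0
sims = [one_cycle_total(g, delta, rng) for _ in range(100_000)]
print(round(sum(sims) / len(sims), 1))   # close to g / (1 - delta) = 250
```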

4 Symmetric Equilibrium Payoffs

Our objective is to identify the set of symmetric perfect public equilibrium payoffs. As shown by Abreu et al. (1990), this can be accomplished by examining the following class of "grim-trigger" equilibria. For this, suppose that player i plays a public strategy σ̂_i that starts with â_i, and keeps playing â_i as long as the realized public signal z falls in a certain (Borel) subset Q of ℝ, but reverts to the action a_i⁰ forever otherwise. Let σ̂ = (σ̂_1, ..., σ̂_n) be the symmetric profile of such a strategy, and v be the expected payoff associated with σ̂. By the stationarity of play, v must satisfy

v = (1 − δ)ĝ + δ{v P(z ∈ Q | â) + g⁰ P(z ∉ Q | â)}, (1)

where P(z ∈ Q | â) = ∫_{z∈Q} h(z | â) dz is the probability that the public signal falls in the set Q under the action profile â. Solving (1) for v, we get

v = [(1 − δ)ĝ + δ g⁰ P(z ∉ Q | â)] / [1 − δ P(z ∈ Q | â)]. (2)

It is clear from (2) that the larger the value of P(z ∈ Q | â), the closer the payoff v is to ĝ. For σ̂ to be an equilibrium, playing â_i in the cooperation phase must be incentive compatible for player i: For any alternative action a_i ≠ â_i, v and Q must

satisfy

v ≥ (1 − δ) g_i(a_i, â_{−i}) + δ{v P(z ∈ Q | a_i, â_{−i}) + g⁰ P(z ∉ Q | a_i, â_{−i})}. (3)

Solving (3) for v, we get

v ≥ [(1 − δ) g_i(a_i, â_{−i}) + δ g⁰ P(z ∉ Q | a_i, â_{−i})] / [1 − δ P(z ∈ Q | a_i, â_{−i})]. (4)

Eliminating v from (2) and (4), we obtain

[1 − δ P(z ∈ Q | a_i, â_{−i})] / [1 − δ P(z ∈ Q | â)] ≥ [g_i(a_i, â_{−i}) − g⁰] / [ĝ − g⁰]. (5)
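Since the argument to this point is purely algebraic, a small numerical sketch may help fix ideas. The code below is ours, not the paper's, and all payoff and probability values passed in are hypothetical placeholders; it simply evaluates the grim-trigger value (2) and checks the incentive condition (5).

```python
# Sketch (not from the paper): evaluate the grim-trigger payoff (2) and the
# incentive condition (5) for hypothetical primitives.

def grim_trigger_value(g_hat, g0, delta, p_coop):
    # Equation (2), with p_coop = P(z in Q | a-hat).
    return ((1 - delta) * g_hat + delta * g0 * (1 - p_coop)) / (1 - delta * p_coop)

def incentive_holds(g_hat, g0, g_dev, delta, p_coop, p_dev):
    # Equation (5), with p_dev = P(z in Q | a_i, a-hat_{-i}).
    lhs = (1 - delta * p_dev) / (1 - delta * p_coop)
    rhs = (g_dev - g0) / (g_hat - g0)
    return lhs >= rhs

# Hypothetical numbers: cooperation worth 25, punishment 16, deviation 28.
print(grim_trigger_value(g_hat=25, g0=16, delta=0.9, p_coop=0.95))   # ~22.2
print(incentive_holds(g_hat=25, g0=16, g_dev=28, delta=0.9,
                      p_coop=0.95, p_dev=0.80))                      # True
```

The condition holds in this example because the deviation raises the probability of triggering the punishment from 0.05 to 0.20, which more than offsets the normalized one-shot gain.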

We make the following assumptions about the distribution of the public signal:

Assumption 1 The unilateral deviation a_i⁰ from the symmetric action profile â shifts down the distribution of z as measured in the likelihood ratio:

h(z′ | â) / h(z | â) ≥ h(z′ | a_i⁰, â_{−i}) / h(z | a_i⁰, â_{−i}) for any z′ ≥ z.

In a Cournot model, for example, Assumption 1 corresponds to assuming that a profitable increase in production lowers the distribution of the price. As is easily seen, Assumption 1 requires that H(· | â) first-order stochastically dominate H(· | a_i⁰, â_{−i}). The following theorem shows that under Assumption 1, the most efficient symmetric perfect public equilibrium payoff is replicated by a threshold grim-trigger strategy, which reverts to the punishment if and only if the public signal falls below a certain threshold.

Proposition 1 Suppose that Assumption 1 holds and let v be the maximal symmetric perfect public equilibrium payoff of the repeated game. If v > g⁰, then there exists a (pure) grim-trigger strategy profile σ̂ such that v = π_i(σ̂). Furthermore, σ̂_i can be taken to be a stationary threshold grim-trigger strategy that begins with â_i and continues with â_i as long as the realized public signal z is above some threshold k (i.e., Q = (k, ∞)), but reverts to a_i⁰ otherwise.

Proof. See the Appendix.

Let s : A → ℝ be a (deterministic) symmetric function of the action profile a, and x be a real-valued random variable whose distribution is independent of a. For

the rest of this paper, we assume that the public signal takes the following additive form: z = s(a) + x. In line with the assumption that z has full support, assume that x has a strictly positive density f over ℝ. Denote the corresponding cumulative distribution by F. It follows that h and f are related through

h(z | a) = f(z − s(a)) for each z ∈ ℝ and a ∈ A.

It is not difficult to verify that Assumption 1 is implied by the following set of assumptions on s and f:

Assumption 2 s(a_i⁰, â_{−i}) ≤ s(â).

Assumption 3 f(x − y) f(x′ − y′) ≥ f(x − y′) f(x′ − y) for any x ≥ x′ and y ≥ y′.12

It should be noted that Assumption 3 holds for a wide class of distributions including the normal and gamma distributions. We now turn to the characterization of the set of equilibrium payoffs in this setup. Proposition 1 allows us to focus on threshold grim-trigger strategies. Define

l = [g_i(a_i⁰, â_{−i}) − g⁰] / [ĝ − g⁰] > 1 and d = s(â) − s(a_i⁰, â_{−i}) ≥ 0.

Namely, l is the (normalized) one-shot gain from a deviation to a_i⁰, while d is the sensitivity of the public signal measured by the change in its expected value following such a deviation.

Let σ̂ = (σ̂_1, ..., σ̂_n) denote the threshold grim-trigger strategy profile that begins with â and reverts to a⁰ when z falls below the threshold k as specified in Proposition 1. We can rewrite (2) and (5) respectively as:

v = [(1 − δ)ĝ + δ g⁰ F(k − s(â))] / [1 − δ{1 − F(k − s(â))}], (6)

and

[1 − δ{1 − F(k − s(â) + d)}] / [1 − δ{1 − F(k − s(â))}] ≥ l. (7)

12 The function f satisfying this condition is known as a Pólya function of degree 2 (PF₂) (Karlin, 1968). Note that f is PF₂ if and only if the function f̂ defined by f̂(x, y) = log f(x − y) is supermodular.

Let K(δ) denote the set of thresholds k for which the above condition holds: K(δ) = {k ∈ ℝ : k satisfies (7)}. By construction, K(δ) is a closed set. In the Appendix, it is also shown that K(δ) is an interval. There exists a threshold grim-trigger equilibrium that supports the action profile â if and only if K(δ) ≠ ∅.13 By (6), the optimal threshold k* = k*(δ) that maximizes v is the smallest element of K(δ):

k*(δ) = min K(δ). (8)

It is also clear from (7) that K(δ) ≠ ∅ requires d > 0. The following proposition summarizes our observation.

Proposition 2 Suppose that Assumptions 2 and 3 hold. There exists a grim-trigger equilibrium σ̂ that plays the symmetric action profile â in the cooperation phase if and only if K(δ) ≠ ∅. In this case, the threshold grim-trigger equilibrium that plays â in the cooperation phase and uses the optimal threshold k*(δ) yields the payoff

v*(δ) = [(1 − δ)ĝ + δ g⁰ F(k*(δ) − s(â))] / [1 − δ + δ F(k*(δ) − s(â))].

Otherwise, v*(δ) = g⁰.

As an index that does not depend on specific payoff numbers, we introduce the following normalization of v*(δ):

y*(δ) = [v*(δ) − g⁰] / [ĝ − g⁰].

It can be verified that y*(δ) ∈ [0, 1] is unaffected by a positive affine transformation of the payoff numbers.14 By Proposition 2, y*(δ) can be explicitly written as

y*(δ) = (1 − δ) / [1 − δ + δ F(k*(δ) − s(â))] (9)

if K(δ) ≠ ∅, and y*(δ) = 0 otherwise.

13 By Proposition 1, if there exists no threshold grim-trigger equilibrium, then there exists no grim-trigger equilibrium which supports v > g⁰.
14 That is, y*(δ) stays the same when we add, subtract, or multiply a positive constant to the players' stage payoffs.

5 Experimental Design

The experiment tests the theory developed in the previous sections in the following environment. The expected stage payoffs are specified as follows:

1\2     L        H
L     25, 25   15, 28
H     28, 15   16, 16

As seen, L ("low output") corresponds to the cooperative action and H ("high output") represents the opportunity for a profitable one-shot deviation.15 Note that â = (L, L) and a⁰ = (H, H) in the notation of the previous section. The public signal z is generated through z = s(a) + x with the deterministic function s of the action profile and a random variable x specified as follows: The function s is given by

1\2    L    H
L     20   18
H     18   16

and the random variable x has the following distribution:

f(x) = (1/(2β)) e^(−|x|/β), and F(x) = 1 − (1/2) e^(−x/β) if x ≥ 0, (1/2) e^(x/β) if x < 0, (10)

where β > 0.16 As can be readily verified, f satisfies Assumption 3. Moreover, the parameter β represents the level of noise in the public signal since

E[x] = 0 and Var(x) = 2β².
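As an illustration (a sketch of ours, not part of the paper), the double-exponential noise in (10) can be simulated as an exponential magnitude with a random sign, which is distributionally equivalent; the empirical moments then match E[x] = 0 and Var(x) = 2β².

```python
# Sketch (not from the paper): simulate the public signal z = s(a) + x,
# where x is double-exponential with scale beta as in (10).
import random

def draw_noise(beta):
    # A Laplace draw: exponential magnitude with mean beta, random sign.
    magnitude = random.expovariate(1.0 / beta)
    return magnitude if random.random() < 0.5 else -magnitude

def draw_signal(s_a, beta):
    # s_a is s(a), e.g. s(L, L) = 20 in the design above.
    return s_a + draw_noise(beta)

random.seed(0)
beta = 4.0
xs = [draw_noise(beta) for _ in range(100_000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(round(mean, 2), round(var, 1))   # close to 0 and 2 * beta**2 = 32
```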

From (9), we can write the normalized maximum symmetric equilibrium payoff as

y*(δ) = (1 − δ) / [1 − δ + (δ/2) e^((k*(δ) − 20)/β)] if K(δ) ≠ ∅, and y*(δ) = 0 otherwise.

The explicit description of the set K(δ) of admissible thresholds is found in the Appendix.
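To make the mapping from β to y* concrete, the following is a minimal computational sketch (ours, not the authors' code) that recovers k*(0.9) by searching for the smallest threshold satisfying (7) under the stage game above, where the normalized deviation gain is l = (28 − 16)/(25 − 16) = 4/3 and the signal sensitivity is d = 20 − 18 = 2.

```python
# Sketch (not the authors' code): compute k*(0.9) and y*(0.9) for the
# experimental parameterization with double-exponential noise (10).
import math

DELTA = 0.9                       # continuation probability
S_COOP = 20.0                     # s(L, L)
D = 2.0                           # d = s(L, L) - s(H, L)
L_GAIN = (28 - 16) / (25 - 16)    # l = 4/3

def F(x, beta):
    # CDF of the double-exponential distribution in (10).
    return 1 - 0.5 * math.exp(-x / beta) if x >= 0 else 0.5 * math.exp(x / beta)

def satisfies_7(k, beta):
    # Incentive condition (7) for the threshold grim-trigger strategy.
    num = 1 - DELTA * (1 - F(k - S_COOP + D, beta))
    den = 1 - DELTA * (1 - F(k - S_COOP, beta))
    return num / den >= L_GAIN

def y_star(beta):
    ks = [k / 100 for k in range(-5000, 2001)]   # grid on [-50, 20]
    feasible = [k for k in ks if satisfies_7(k, beta)]
    if not feasible:                             # K(0.9) is empty
        return 0.0
    k_star = min(feasible)                       # k* = min K(0.9), eq. (8)
    return (1 - DELTA) / (1 - DELTA + DELTA * F(k_star - S_COOP, beta))

for beta in (1, 4, 10):
    print(beta, round(y_star(beta), 3))          # ~0.948, ~0.486, 0.0
```

Up to grid resolution, this reproduces the y*(β) values used in Section 6: approximately 0.948 for β = 1, 0.486 for β = 4, and 0 for β = 10.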

15 Our choice of this particular stage game is based on the fact that it scores high on the indexes proposed by Rapoport and Chammah (1965) and Roth and Murnighan (1978) that correlate with the level of cooperation in the infinitely repeated PD with perfect monitoring.
16 This distribution is simple enough to allow for the analytical derivation of the optimal threshold k*(δ). See the Appendix.

Figure 1: y* as a function of β (δ = 0.9) [y* on the vertical axis from 0 to 1; β on the horizontal axis from 0 to 6]

In the actual treatment, we need to (i) bound the subjects' payoffs while allowing z to have full support as assumed by the theory, and (ii) end each session in finite time. For (i), we have the subjects receive the expected stage payoffs g_i(a) at the conclusion of the experiment instead of having them receive the (stochastic) payoffs w_i(a_i, z) after each period.17 For (ii), we interpret the discount factor as a continuation probability and terminate the game after each period with a fixed probability. We set the termination probability equal to 0.1 so that the effective discount factor is δ = 0.9. Note that the theory in Section 4 holds precisely under these alternative specifications. The experiments proceed in the following steps. First, subjects are provided

17 In this specification, hence, the public signal z indicates the opponent's action choice but does not directly affect the players' payoffs. This design hence abstracts from the psychological impact of the payoffs as analyzed by Bereby-Meyer and Roth (2004). Alternatively, we could have paid the subjects w_i(a_i, z) after each period of play for some (bounded) function w_i. This payment method, however, would have involved presenting a complex function in the instructions to the subjects.

with the basic information about the game they will play. They are then matched in pairs to play a repeated game with imperfect public monitoring. As mentioned earlier, this repeated game has stochastic length and terminates in finite time almost surely.18 The sequence of play between any pair of subjects is referred to as a cycle. At the conclusion of every cycle, the subjects see on the screen their own payoffs in that cycle. They are then randomly rematched to play a new cycle. The information provided to the subjects at the outset includes: (i) the length of a cycle is randomly determined by the termination probability 0.1; (ii) they play against a randomly chosen subject in the session; (iii) the distribution of the random shocks to the public signal is given by (10).19 The random matching for each new cycle is done in a round-robin manner: A subject is matched with someone new as long as it is possible, and matched with someone they have played with previously thereafter.20 The first cycle to end after one hour of play marks the end of the session. Therefore, different sessions have different numbers of cycles.

In the experiments, we use four different values of β: β = 0 (no noise = perfect monitoring), β = 1 (low noise), β = 4 (medium noise), and β = 10 (high noise). Figure 1 plots y* ≡ y*(0.9) as a function of β. In addition to the above four treatments, we conduct a control treatment where subjects are anonymously and randomly matched after every period of play. This control hence eliminates the repeated game effect (i.e., δ = 0), and is designed to provide a benchmark for the first four treatments with respect to the level of cooperation. From the theoretical perspective, the one-shot environment in this treatment is equivalent to having infinitely large noise.21 For this reason, we refer to this last treatment as β = ∞. The length of those sessions is set equal to 75 periods, which is approximately the average number of periods in the other treatments. For the control treatment, the term cycle refers to the block of the initial 15 periods and every successive block of 10 periods.

Subjects were recruited through announcements in undergraduate classes, advertisements in the student newspaper, flyers posted on campus, and e-mail advertisements at the Ohio State University.

18 All games played in the same session terminate simultaneously.
19 The instructions given to the subjects can be found at http://homepages.nyu.edu/~gf35/print/afinstructions.pdf.
20 As it happened, the subjects were matched with someone they played with previously in only 14% of the cases.
21 In the actual implementation of the control treatment, we let the subjects observe their actions. With random and anonymous rematching after every period, however, the level of monitoring should be irrelevant.

Treatments   β = 0    β = 1    β = 4    β = 10   β = ∞
Sessions     2        2        2        2        2
Subjects     24 14    26 16    20 20    20 26    16 20
Cycles       8 10     6 7      4 5      5 8      —
Periods      74 91    73 78    66 75    73 69    74 75

Table 1: Subjects, cycles, and periods per session

This resulted in recruiting a broad cross section of undergraduate students. At the end of each experimental session, subjects were paid $0.017 for every point they accumulated in the experiment. Earnings ranged from $20.35 to $36.55. Details about the number of subjects and periods in each treatment are provided in Table 1.22

6 Results

6.1 Payoffs

We first examine the subjects' payoffs. Note that in the continuation probability formulation, the sum of stage payoffs for the duration of the game corresponds to the average discounted payoff of the infinitely repeated game. For each given value of β, let v̄(β) be the sum of stage payoffs averaged over all cycles and sessions, and let

ȳ(β) = [v̄(β) − g⁰] / [ĝ − g⁰]

be the normalization of v̄(β). When the subjects play a symmetric equilibrium of the repeated game, ȳ(β) should lie between the one-shot NE level 0 and the maximal symmetric equilibrium level y*(β) for each β.23

We refer to the three treatments (β = 0, 1, 4) for which cooperation is possible according to the theory (i.e., y*(β) > 0) as the cooperation treatments, and the two treatments (β = 10, ∞) for which it is not (i.e., y*(β) = 0) as the non-cooperation treatments. The general evolution of play between the cooperation and non-cooperation treatments is strikingly different: Figure 2 plots ȳ(β) by treatment

22 The β = 0 session with 14 subjects had a crash after the end of cycle 8, and was re-started for two additional cycles. The β = ∞ session with 16 subjects had 1 crash.
23 In the experiment, β is changed while δ = 0.9 is fixed. Hence, y* is indexed by β instead of δ.

and by cycle. (The unlabeled line corresponds to the β = ∞ treatment.) A few aspects of this figure are worth noting: For all treatments, the average payoffs start out almost identically, yet begin to differ substantially by cycle 3. Furthermore, by cycle 3 they almost reach the level they will eventually keep until the end of the data. ȳ has an upward trend over time for β = 0, 1, and 4, and an opposite, downward trend for β = 10 and ∞. This shows that subjects improve their ability to cooperate over time when cooperation is theoretically possible, but learn not to cooperate otherwise.

Figure 2: Evolution of ȳ by treatment and by cycle

In order to concentrate on stable behavior, the analysis in what follows excludes the first two cycles or the cycles that occur in the first 20 periods of play.24 Table 2 lists the values of y*(β) and ȳ(β) for each treatment. As seen, ȳ has the same relative ordering as y*: it increases as noise decreases, and the two non-cooperation treatments have relatively close levels. Diagrammatically, we have:

ȳ(0) > ȳ(1) > ȳ(4) > ȳ(10) ≈ ȳ(∞).

Moreover, ȳ(β) is significantly positive for both the cooperation and non-cooperation treatments at the 1% significance level.

24 Note that 20 is the expected number of periods for 2 cycles. In all but one session, the first 2 cycles last at least 20 periods. The exception is one session in which the first 2 cycles only had a total of 3 periods of play.

Table 2: y∗(β) and y¯(β) by treatments

Treatments β =1 β =4 β =10 β =0 0.017 0.000 0.000 β =1 0.031 0.000 β =4 0.000

Table 3: p-values of the one-sided Mann-Whitney test that y¯ decreases with noise for both the cooperation and non-cooperation treatments at the 1% significance level. y¯(β) lies in the predicted interval [0,y∗(β)] for two of the three cooperation treatments, but not for the non-cooperation treatments. In either case, it is not close to y∗(β):itistoolowwhenβ =0and 1, and too high for all other treatments. It should also be noted that y¯(10) and y¯( ) appear comparable. Each of these ∞ observations will be analyzed in turn. That y¯(β) decreases with β is formally established in Table 3, which gives the p- values of a Mann-Whitney test of the hypothesis that the y¯(β) for β on the left (row) is equal to the y¯(β) for β in the top (column) against the one sided hypothesis that the former is greater than the latter.25 Every test is statistically significant at the 5% level. On the other hand, the hypothesis that y¯(10) =y ¯( ) cannot be rejected ∞ (p =0.108, two-tailed Mann-Whitney test).26 The results of these tests support

25For all such tests, per subject averages are used instead of all the subject-cycle averages because it is likely that y¯i are correlated accross cycles for a given individual. 26Both ANOVA and Kruscal-Wallis tests reject at the 1% level the null hypothesis that the

18 n-htP.T epeie iue3dsrbsteeouino h aeo the of rate the of evolution the describes the 3 on experiments Figure in precise, action observed cooperative be cooperation of To levels PD. positive one-shot the to similar is This ramn fDlB,tert fcoeaini itemr hn5 yteend. the one-shot by the 5% In than more little end. a the is payo by cooperation di 0% Du of this of almost that rate believe treatment the We to matching Bo, drops Dal random cooperation of the treatment of In rate (2003): the Bo Ochs, Dal signi or is (2003) it experiments Ochs related example, in For those PD. to the compared when on high relatively is cooperation of level ett hwthat show to test qa to equal efc oioigtetet asdtelvlo oprto ntenon-cooperation the in cooperation of level the raised treatment, monitoring perfect ramn a oe no has treatment osntcnomt h ote most the to conform not does n-htenvironment. one-shot h eea hoeia rdcin htcoeaini airt uti hnnoise su when under sustain cooperation to that easier and is small, cooperation is that predictions theoretical general the igedt on.T drs hscnenw a average can we concern this address To point. c data one single session, a a within correlated are observations shge hnta nayssino h o-oprto treatments. non-cooperation the of session any in fact, that In than 0.01. higher of is p-value a with rejected is hypothesis .35 .45 .55 .65 .75 .85

27 Cooperation Rate .3 .4 .5 .6 .7 .8 .9 hscnlso osntcag vni ahssini rae steui fosrain If observation: of unit the as treated is session each if even change not does conclusion This ncnrdcint h hoy both theory, the to contradiction In ff 1 ubr,wihaedsge ognrt ihlvl fcoeainudrthe under cooperation of levels high generate to designed are which numbers, iue3 vlto fcoeain ae of Rates cooperation: of Evolution 3: Figure y beta=4 beta=0 2 ∗ tte1 ee o ahtetet nohrwrs h ujcs play subjects’ the words, other In treatment. each for level 1% the at 3 y ¯ ff shge for higher is c on ect 4 a i = ff 27 5 y Cycle rnei trbtdt h eeto ftepayo the of selection the to attributed is erence L for nteohrhn,ttssrjc h yohssthat hypothesis the reject t-tests hand, other the On swl sta fteato pro action the of that as well as 6 β β beta=10 beta=1 =0 =1 7 ffi ,1,4,and10. and in ymti equilibrium. symmetric cient fi 8 atyhge hnta eotdb Du by reported that than higher cantly β =4 9 ffi 19 y ¯ inl ag os sa di as is noise large ciently udageta ahssinsol etetdas treated be should session each that argue ould (10) hnfor than 10 y ¯ nayssino h oprto treatments cooperation the of session any in and

.05 .15 Joint.25 .35 Cooperation.45 .55 .65 .75 .1 .2 .3 .4 .5 .6 .7 .8 β y ¯ =10 y 1 ¯ yssinadueaMann-Whitney a use and session by ( L ∞ and beta=4 beta=0 2 ) and r signi are fi le ( β 3 ,L L, = ( ,L L, ∞ 4 ) h n-ie null one-sided The . fi ycycles by ) atypositive. cantly ffi .Theobserved 5 ff Cycle uta nthe in as cult arx Our matrix: 6 beta=10 beta=1 ff ff yand 7 yand y ¯ is 8 9 10 ee fpro oprto ntoetetet nrae vrtecus feach of course the over increases treatments those of in rates cooperation the 1 comparison, period In of tr only cooperation level our is strategies. in 1 trigger cooperation the reject 1 use period (1991) period subjects in Wilson and their action Sell that cooperative experiments, goods hypothesis of public rate repeated their the in that 39.1% found Having treatment. h aams lasehbtaction exhibit always must data the cinsol equal should action 1 ramnsa well. as treatments cooper- the of rates the that test Mann-Whitney choice two-sided ative the of p-values 4: Table unga 17) u Di xetdt eeaemr oprto hnayo h he PD three the of any than cooperation more generate to Du expected by is used matrices PD our (1978), Murnighan 28 codn oalfu nie rpsdb aootadCamh(95 n ohand Roth and (1965) Chammah and Rapoport by proposed indices four all to According Whenthesubjectsplaythemoste iue4 ae ftecoeaiechoice cooperative the of Rates 4: Figure

L cooperation npro r qa costreatments across equal are 1 period in ff .35 .45 .55 .65 .75 .85 .95 β β β Treatments β n cs(03 n a o(2003). Bo Dal and (2003) Ochs and y .4 .5 .6 .7 .8 .9 28 =4 =1 =0 =10 1 a i = beta=4 beta=0 2 L when 3 .3 .8 .0 0.000 0.000 0.287 0.630 β =1 β L 4 =0 ffi npro fayccei n cooperation any in cycle any of 1 period in 20 .4 .0 0.000 0.000 0.441 β in ymti qiiru,terperiod their equilibrium, symmetric cient =4 5 ,o .Udrti yohss hence, hypothesis, this Under 4. or 1, , amnsaehge.Frhroe the Furthermore, higher. are eatments Cycle 6 .0 0.000 0.000 β =10 beta=10 beta=1 L 7 npro ycycles by 1 period in β 0.230 8 = ∞ 9 10 session (Figure 4). By the last cycle, the rate of period 1 cooperation is 85% in the cooperation treatments. On the other hand, the rate of period 1 cooperation in the non-cooperation treatments is much lower at 41% in the last cycle. Table 4 reports the p-values for the test of the hypothesis that period 1 cooperation is the same across different treatments. The hypothesis that they are the same across all coop- eration treatments cannot be rejected.29 Neither can be the hypothesis that they are the same across the non-cooperation treatments. On the other hand, we can reject the hypothesis that they are the same between the cooperation and non-cooperation treatments. However, for all treatments, t-tests reject at any conventional level the hypothesis that the rates of period 1 cooperation equal unity.

6.2 Strategies

We next turn to the analysis of the strategies. As mentioned in the Introduction, we focus on "threshold strategies" that switch between cooperation and punishment phases based on thresholds on the public signal. This class includes variations of the trigger and tit-for-tat strategies that are most often discussed in the experimental analysis of repeated game strategies, as well as some forms of "private" strategies that choose actions based on one's own actions in the past. We use standard likelihood ratio tests to examine whether any particular specification of a threshold strategy describes the observed pattern of play. In all but one treatment, we find that the best fitting strategy is one that uses only the most recent public signal for transition between the phases. The analysis in this subsection excludes data from the control treatment.

Note first that any strategy that supports cooperation above the one-shot NE level must condition the current choice on the past public signals. In fact, this is what we observe in this experiment. In the perfect monitoring (β = 0) treatment, for example, if both players cooperated in the last period, each player cooperates in the current period 94% of the time. On the other hand, if one player cooperated and the other defected in the last period, then the rate of cooperation by the former in the current period decreases to 43%. The same trend can be found in the imperfect monitoring treatments. Figures 5, 6, and 7 show how a player who cooperated in the previous period chooses his action in the current period as a function of the most recent public signal z. In the β = 1 and β = 4 treatments, the rate of cooperation is

29 Neither ANOVA nor Kruskal-Wallis rejects the null hypothesis that the treatment has no impact on the rate of period 1 cooperation for β = 0, 1, and 4 (p-value > 0.1).

21 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 12.5<=z<15 15<=z<17.5 17.5<=z<20 20<=z<22.5 22.5<=z<25 25<=z<27.5 7 64 446 276 22 1

Figure 5: Rates of L as a function of the most recent public signal (β =1), [No. of Obs. Under the Category]

Figure 6: Rates of L as a function of the most recent public signal (β = 4) [No. of Obs. under each signal category]

22 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 -60<=z<15 15<=z<17.5 17.5<=z<20 20<=z<22.5 22.5<=z<25 25<=z<100 167 47 66 67 46 138

Figure 7: Rates of L as a function of the most recent public signal (β =10), [No. of Obs. Under the Category] clearly an increasing function of z.30 Even in the β =10treatment where the theory predicts no cooperation, we see the cooperation level increase around z =17.5.

6.2.1 Threshold strategies

While the above relationship between z and the rate of cooperation can be a consequence of many different strategies, we suppose that the subjects' behavior follows some simple rules. Specifically, we suppose that it is expressed by a finite automaton, which consists of a finite number of states as well as behavior and transition rules as follows: In the finite automaton representation of a strategy, a player is in one of the states in every period. The behavior rule determines the action he should choose today as a function of the current state, and the transition rule determines tomorrow's state as a function of the current state as well as the public signal and his own action choice today.31 A threshold strategy with T + 1 states (T = 1, 2, ...) is a finite automaton with the behavior and transition rules specified as follows. The behavior rule specifies action L (cooperation) in state 0, which corresponds to the cooperation phase, and action H (non-cooperation) in states 1, ..., T, which correspond to the punishment phase. The transition rule is described as follows:

30 In Figure 6, this observation holds with the exception of the first column, which has only one observation.
31 In the usual specification of a finite automaton, transition is assumed independent of one's own action. We have chosen a more general specification here in order to explicitly consider a possible adjustment a player may make after his own mistake.

The initial state is state 0, and the state in the next period is either state 0 or state 1 depending on the public signal z and the player's own action a_i in the current period.32 When in state t (t = 1, ..., T − 1), unconditional transition to state t + 1 takes place in the next period. When in state T, the state in the next period is either state T again or state 0 depending on the public signal z and the player's own action a_i. The transition at states 0 and T can be described in more detail as follows:

State 0: Stay in state 0 if the public signal z is above threshold a when the own action is L, or if z is above another threshold a + b1 when the own action is H. Move from state 0 to state 1 otherwise.

State T: Move from state T to state 0 if the public signal z > a + b2 + b3 when the own action is L, or if z > a + b2 when the own action is H. Stay in state T otherwise.

In Figure 8, for example, the number above each arrow indicates the threshold condition to be satisfied for the corresponding transition when the present action is L, and the number below each arrow indicates the same when the present action is H. We allow each threshold to be +∞ or −∞. Note that the threshold grim-trigger strategy discussed in Section 4 is a threshold strategy with T = 1, b2 = ∞, and b1 = b3 = 0.33 A trigger strategy with a fixed punishment length T as discussed by Green and Porter (1984) is also a threshold strategy, with b2 = −∞ and b1 = b3 = 0. Furthermore, the tit-for-tat strategy in the perfect monitoring environment is obtained by setting T = 1, a ∈ (s(L, H), s(L, L)] = (18, 20], b1 = b3 = −2, and b2 = 0. Note that a threshold strategy with b2 > −∞ has a punishment phase possibly longer than T periods. In general, a threshold strategy is public in the sense defined in Section 3 if b1 = b3 = 0. Otherwise, a threshold strategy is a private strategy which conditions its action choice on one's own private history. Note, however, that the on-the-path action is publicly determined even if b1 ≠ 0 and/or b3 ≠ 0. Therefore, if a threshold strategy gives rise to a symmetric equilibrium with perfection after every public history, then the associated payoff is still bounded from above by v* (or its normalization y*) described in Section 4.

32 Of course, when the player follows the behavior rule, his action in state 0 is L. We let the transition depend on the own action choice in order to explicitly account for the possibility that a player chooses an action different from that specified by the behavior rule by mistake.
33 When the game is of finite length, an equivalent strategy is obtained by letting b1 = 0 and T larger than the length of the game.

Figure 8: General threshold strategy with T = 2. [State 0 (play L): stay if z > a when the own action is L, or z > a + b1 when it is H; move to state 1 otherwise. State 1 (play H): move to state 2 for all z. State 2 (play H): return to state 0 if z > a + b2 + b3 when the own action is L, or z > a + b2 when it is H; stay in state 2 otherwise.]

For our discussion, we find it convenient to express the action choice by a threshold strategy without explicit reference to the states. Identify 1 with the cooperative action L and 0 with the non-cooperative action H, and for any sequences of public signals z_1, z_2, ... and own actions c_i1, c_i2, ..., let

s_it = 1 for t ≤ 1,

and define s_{i,t+1} and k_it (t ≥ 1) recursively by

k_it = { a              if s_it = 1 and c_it = 1,
         a + b1         if s_it = 1 and c_it = 0,
         a + b2         if s_{i,t−T+1} = ··· = s_it = 0 and c_it = 1,      (11)
         a + b2 + b3    if s_{i,t−T+1} = ··· = s_it = 0 and c_it = 0,
         ∞              otherwise,

and

s_{i,t+1} = 1{z_t > k_it}, (12)

where for any condition A, 1{A} = 1 if A holds and = 0 otherwise. It can be seen that k_{i,t−1} and s_it (t = 1, 2, ...) equal the threshold and action choice, respectively, prescribed by the threshold strategy with T + 1 states when the public signals are z_1, z_2, ..., z_{t−1} and the own action choices are c_i1, c_i2, ..., c_{i,t−1} in periods 1, ..., t − 1.
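As an illustration, here is a minimal sketch (ours, not the authors' code) of the deterministic threshold strategy as a (T + 1)-state automaton equivalent to (11)-(12); the threshold values in the usage example are hypothetical.

```python
# Sketch (not the authors' code): the (T+1)-state threshold strategy of
# Section 6.2.1. Action 1 stands for L (cooperate), 0 for H (defect).

class ThresholdStrategy:
    def __init__(self, a, b1, b2, b3, T):
        self.a, self.b1, self.b2, self.b3, self.T = a, b1, b2, b3, T
        self.state = 0                        # start in the cooperation phase

    def action(self):
        return 1 if self.state == 0 else 0    # L in state 0, H in states 1..T

    def update(self, z, own_action):
        if self.state == 0:                   # transition rule at state 0
            k = self.a if own_action == 1 else self.a + self.b1
            self.state = 0 if z > k else 1
        elif self.state < self.T:             # automatic move toward state T
            self.state += 1
        else:                                 # transition rule at state T
            k = self.a + self.b2 + self.b3 if own_action == 1 else self.a + self.b2
            self.state = 0 if z > k else self.T

# A threshold grim-trigger strategy: T = 1, b2 = +infinity, b1 = b3 = 0,
# with a hypothetical threshold a = 15.6.
grim = ThresholdStrategy(a=15.6, b1=0.0, b2=float("inf"), b3=0.0, T=1)
for z in (21.3, 14.9, 22.0):
    act = grim.action()
    grim.update(z, act)
    print(act, grim.state)   # punishment is absorbing once z falls below 15.6
```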

In actual estimation, we use a stochastic version of the threshold strategy instead of the deterministic definition given above. This is necessary in order to explicitly incorporate the possibility of mistakes by subjects, and to capture possible asymmetry across them. Specifically, we make the following modifications to (11) and

(12): First, we introduce a parameter γ0 and reformulate the threshold condition as

s_{i,t+1} = 1{γ0 z_t > κ_it} (t ≥ 1). (13)

Clearly, (13) is equivalent to (12) if γ0 > 0 and κ_it = γ0 k_it.34 We suppose that κ_it is given by (s_it = 1 for t ≤ 1):

κ_it = { α + ν_i               if s_it = 1 and c_it = 1,
         α + ν_i + γ1          if s_it = 1 and c_it = 0,
         α + ν_i + γ2          if s_{i,t−T+1} = ··· = s_it = 0 and c_it = 1,      (14)
         α + ν_i + γ2 + γ3     if s_{i,t−T+1} = ··· = s_it = 0 and c_it = 0,
         α + ν_i + γ4          otherwise.

In (14), note that the parameters α, γ1, γ2, γ3, γ4 are all assumed common across subjects, and that only ν_i is indexed by i. The relationship κ_it = γ0 k_it will be restored if γ4 = ∞ and

a = (α + ν_i)/γ0, b1 = γ1/γ0, b2 = γ2/γ0, b3 = γ3/γ0. (15)

The term ν_i, which is the only term indexed by i, is supposed to capture possible asymmetry across the subjects: The larger is ν_i, the higher the threshold, and hence the more likely is subject i to play the non-cooperative action H. Technically, ν_i is treated as a correlated random effect, and is assumed to be a random variable with the normal distribution N(ψζ_i, σ_ν) for some constants ψ, σ_ν and ζ_i. The variance σ_ν and the factor of proportion ψ are common across subjects, and are to be estimated from the data. On the other hand, ζ_i is set equal to the fraction of times that subject i chooses H in period 1 of each cycle under estimation: ζ_i serves as a proxy for i's tendency to play the non-cooperative action, given that any threshold strategy would play action L in period 1.35 We assume that the ν_i's are independent across subjects, and use them to test the symmetry assumption of the theory.

34 γ0 is required for a technical reason: It will allow the error term u_it introduced later to have the unit variance.
35 We assume that the mean of ν_i is proportional to ζ_i in order to deal with the initial conditions problem. (See Heckman (1981) or Chamberlain (1980) for the static case.) Under an alternative assumption that ν_i ∼ N(0, σ_ν²), the consistency of our estimate would require the (unlikely) independence of the c_i1's and ν_i's. (See Wooldridge (2002) for a clear exposition of the initial conditions problem and solutions to it.) The log likelihood is estimated using quadrature techniques with a twelve-point Gauss-Hermite quadrature. Weights and abscissae can be found in Abramowitz and Stegun (1972).

34 γ0 is required for a technical reason: It will allow the error term uit introduced later to have the unit variance. 35 We assume that the mean of νi is proportional to ζi in order to deal with the initial conditions problem. (See Heckman (1981) or Chamberlain (1980) for the static case.) Under an alternative 2 assumption that νi N 0,σ , the consistency of our estimate would require the (unlikely) inde- ∼ ν   26 β =0 β =1 β =4 β =10 α -1.437*** -0.819*** -0.569*** -0.430*** (0.127) (0.090) (0.102) (0.150) ψ 1.698*** 1.313*** 1.278*** 1.393*** (0.210) (0.379) (0.250) (0.250)

ρ 0.265§ 0.363§ 0.295§ 0.183§ LL -781.234 -577.886 -809.018 -774.420 Obs. 1908 1112 1360 1258 *, **, *** indicate statistical significance at 10%, 5%, and 1% respectively.

§ indicates statistical significance at 1% using a likelihood ratio test.

Table 5: Estimates of the random choice strategy

We introduce another element of randomness in the form of a random shock term which represents errors made by subjects. More specifically, for a random variable u_it, we suppose that i's action choice c_{i,t+1} in period t + 1 is determined in reference to κ_it + u_it rather than κ_it itself.36 In other words, c_{i,t+1} is determined by

c_{i,t+1} = 1{γ0 z_t > κ_it + u_it}. (16)

We assume that u_it is independent across subjects and across periods, and has the standard normal distribution N(0, 1). One intended effect of specifying the shock term u_it as in (16) is as follows: When the realized public signal z_t is close to his threshold κ_it/γ0, subject i makes errors more often than when z_t is far from it.37 Together, (13) and (16) define a limited dependent variable model with lagged dependent variables.
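Estimation of such a model requires integrating the random effect ν_i out of each subject's likelihood. The following is a rough sketch of ours (not the authors' code) of one subject's contribution using Gauss-Hermite quadrature, in the spirit of the twelve-point quadrature mentioned in footnote 35; all data values shown are hypothetical, and the function names are ours.

```python
# Sketch (not the authors' code): one subject's log likelihood for the probit
# model (13)-(16), with the random effect nu_i ~ N(psi*zeta_i, sigma_nu^2)
# integrated out by 12-point Gauss-Hermite quadrature.
import numpy as np
from numpy.polynomial.hermite import hermgauss
from scipy.stats import norm

def subject_loglik(choices, kappa_base, gamma0, signals, psi, zeta_i, sigma_nu):
    # choices[t] = 1{gamma0 * signals[t] > kappa_base[t] + nu_i + u_it},
    # where kappa_base[t] is kappa_it net of nu_i (alpha plus the gammas).
    nodes, weights = hermgauss(12)
    total = 0.0
    for node, w in zip(nodes, weights):
        nu = psi * zeta_i + np.sqrt(2.0) * sigma_nu * node  # change of variable
        index = gamma0 * signals - kappa_base - nu          # probit index
        p_coop = norm.cdf(index)                            # P(choice = 1 | nu)
        lik = np.prod(np.where(choices == 1, p_coop, 1 - p_coop))
        total += w * lik
    return np.log(total / np.sqrt(np.pi))   # Gauss-Hermite normalization

# Hypothetical data: three choices, each driven by the preceding signal.
z = np.array([21.0, 14.5, 19.0])
c = np.array([1, 0, 1])
kappa = np.full(3, -0.8)   # e.g. the state-0 threshold alpha
print(subject_loglik(c, kappa, gamma0=0.28, signals=z,
                     psi=1.3, zeta_i=0.1, sigma_nu=0.7))
```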

6.2.2 Testing of the random choice strategy

As a benchmark, we first estimate a random choice strategy, for which the expected probability of each action (H and L) is the same throughout the game. Specifically,

36 By modeling deviations as random shocks, we ignore any underlying motive for those deviations.
37 An alternative specification is one where the probability of mistakes is independent of the realized value of the public signal. Such a specification, however, is not compatible with the standard estimation techniques.

27 the estimated equation is given by

c_{i,t+1} = 1{α + ν_i + u_it < 0},

where ν_i and u_it are as defined in the previous section. Note that this is obtained from the general model (16) by setting γ0 = γ1 = γ2 = γ3 = γ4 = 0. Table 5 reports the estimates of this benchmark model, where

\rho = \frac{\sigma_\nu^2}{\sigma_\nu^2 + 1}

is used in place of σν², as is customary for random-effects estimates. As one would expect, the positive estimate of the coefficient ψ on ζi indicates that someone who is less likely to cooperate in period 1 is less likely to do so in any other period. The constant term α increases with noise, which implies that increasing noise tends to decrease cooperation. The random-effects specification is not rejected in any treatment, indicating that tendencies to cooperate vary across the subjects.38
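To illustrate the estimation described in footnote 35, here is a minimal sketch of the subject-level log likelihood for this benchmark model, with νi integrated out by Gauss-Hermite quadrature. The function name and data layout are our own assumptions; this is not the authors' code.

```python
import numpy as np
from scipy.stats import norm

def subject_loglik(choices, alpha, psi, zeta_i, sigma_nu, n_nodes=12):
    """choices: 0/1 array, 1 meaning play L, per c_{i,t+1} = 1{alpha + nu_i + u_it < 0};
    integrates nu_i ~ N(psi * zeta_i, sigma_nu^2) out by Gauss-Hermite quadrature."""
    choices = np.asarray(choices)
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    lik = 0.0
    for x, w in zip(nodes, weights):
        nu = psi * zeta_i + np.sqrt(2.0) * sigma_nu * x   # change of variables
        p_L = norm.cdf(-alpha - nu)                       # P(play L | nu)
        lik += (w / np.sqrt(np.pi)) * np.prod(np.where(choices == 1, p_L, 1.0 - p_L))
    return np.log(lik)
```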

6.2.3 Estimation of general threshold strategies

We now turn to the estimation of general threshold strategies. A few comments are in order. First, as noted earlier, the automatic transitions from state 1 to state 2, ..., and from state T − 1 to state T in the original specification of a threshold strategy are captured by setting γ4 = ∞ in (14). However, when γ4 = ∞, uit fails to give the desired random effect through (16). For this reason, we set γ4 equal to a finite value in our estimation.39 Although the choice of any particular value for γ4 is arbitrary, we set γ4 = γ0 z̄ − α − ψ, where z̄ is the highest realized value of the public signal z in each treatment. The action choice implied by (16) for this γ4 is the same as that for γ4 = ∞ for the observed signal values zt and the mean values of νi and uit. However, the two action choices differ from each other when νi + uit takes a large negative value.40

38 When testing for the significance of the random-effects specification, the fact that the null hypothesis is at the boundary of the parameter space is properly dealt with; see Gutierrez, Carter, and Drukker (2001).
39 Alternatively, γ4 could also be estimated from the data. In this case, however, the interpretation of the estimated strategy would become difficult, since T would no longer be the minimum punishment length. Furthermore, it will be seen shortly that the best estimate of T is always 1. In this sense, it does not appear important for our qualitative conclusions whether the value of γ4 is fixed this way or is estimated.

            β = 0        β = 1        β = 4        β = 10
α           9.543        4.410***    -0.245***    -0.318*
           (42.103)     (1.157)      (0.084)      (0.168)
γ0          0.564        0.283***     0.026***     0.008*
           (2.335)      (0.988)      (0.007)      (0.005)
γ1         -0.237        0.010        0.804***     0.079
           (4.957)      (0.011)      (0.103)      (0.065)
γ2         -0.167       -0.211        0.049*       0.010
           (0.580)      (0.245)      (0.026)      (0.009)
γ3         -0.147       -0.010        0.011*      -0.003
           (0.329)      (0.024)      (0.006)      (0.002)
ψ           0.908        1.192*       0.010*       1.460***
           (1.898)      (0.126)      (0.005)      (0.498)
T           1            1            1            1
ρ           0.116§       0.229§       0.313§       0.218§
LL       -636.384     -520.464     -775.364     -771.442
Obs.       1908         1112         1360         1258

*, **, *** indicate statistical significance at 10%, 5%, and 1% respectively.
§ indicates statistical significance at 1% using a likelihood ratio test.

Table 6: Parameter estimates of the general threshold strategy

Second, we do not place the restriction that γ0 > 0. As will be seen, however, the estimate of γ0 turns out to be positive in every treatment. Third, our estimation is done separately for each value of T, given that every sequence of play in the data is finite; the T with the highest log likelihood is then selected. It should be noted that a threshold strategy with a number of states T + 1 greater than the longest cycle in each session is identified with a grim-trigger strategy. Table 6 reports the estimates of the coefficients in (16). The standard errors are bootstrapped using a fixed T.41 The estimated T is 1 in every treatment.

40 The smaller the value we choose for γ4, the more likely the two action choices are to differ from each other. In essence, we have chosen the smallest value for which the action choices are the same for the mean values of νi and uit. We do not expect that other choices would yield significantly different conclusions. For the strategy with T = 1, of course, the choice of any value for γ4 is irrelevant.
41 Fifty replications using the full sample size are computed. Computation of standard errors for T (a discrete variable) is omitted; it would add little to the interpretation of the results.

Once again, the random-effects specification is not rejected for any treatment. It is also worth noting that γ0 is statistically significant in the β = 10 treatment, in contradiction to the prediction of the theory: this suggests that the subjects use the public signal even when they should not. Few coefficient estimates are statistically significant in the β = 0, β = 1, and β = 10 treatments, while all regressors are statistically significant in the β = 4 treatment. We suspect that the lack of statistical significance results from the inclusion of too many parameters. This point is examined in more detail in the next section. In general, the model has explanatory power: for each β ≤ 4, a likelihood ratio test rejects the random choice model described in Table 5 at the 1% significance level. On the other hand, for β = 10, the random choice model cannot be rejected even at the 10% level. We observe the following concerning the base threshold level a in (11): the coefficient estimates of α and γ0 both decrease with noise, and the ratio of those estimates (α/γ0) also decreases with noise. This indicates, from (15), that the subject-independent component of the base threshold a decreases with noise. On the other hand, the ratio of the estimates of ψ and γ0 (i.e., ψ/γ0) increases with noise, indicating that the weight on the subject-specific component of a increases with noise. In other words, as noise increases, the gap in base threshold levels widens across individuals.
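The bootstrap of footnote 41 is straightforward to sketch. In the snippet below, the resampling unit (subjects, drawn with replacement) and the interface of estimate_fn are our own assumptions; the text specifies only fifty replications at the full sample size with T held fixed.

```python
import numpy as np

def bootstrap_se(estimate_fn, subjects_data, reps=50, seed=0):
    """Re-estimate the model (with T fixed) on `reps` resamples of the full
    sample and return the standard deviation of the coefficient estimates."""
    rng = np.random.default_rng(seed)
    n = len(subjects_data)
    draws = []
    for _ in range(reps):
        resample = [subjects_data[j] for j in rng.integers(0, n, size=n)]
        draws.append(estimate_fn(resample))   # vector of coefficient estimates
    return np.asarray(draws).std(axis=0)
```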

6.2.4 Restricted threshold strategies

We now restrict some of the parameters of the general threshold specification and examine the performance of the resulting models. Specifically, the tested specifications are as follows:

Sa: b1 = b2 = b3 = 0

Sb: b1 = b2 = 0, b3 = −2

Sc: b1 = −2, b2 = −∞, b3 = 0

Sd: b1 = b3 = −2, b2 = 0

Se: b1 = b3 = 0, b2 = −∞

The specific values are chosen based on the following considerations: The restriction b1 = 0 implies that the threshold in state 0 is independent of the player's own action.

        γ1           γ2                γ3
Sa      0            0                 0
Sb      0            0                 −2γ0
Sc      −2γ0         γ0 z̲ − α − ψ      0
Sd      −2γ0         0                 −2γ0
Se      0            γ0 z̲ − α − ψ      0

z̲ denotes the lowest realization of z.

Table 7: Parameter restrictions under each specification

In other words, even if the player has deviated to H in state 0 and is himself responsible for the downward shift of the distribution of the public signal, he ignores its effect. When b1 = −2, on the other hand, the threshold in state 0 is lowered after an own deviation to H in state 0, to properly discount its effect on z.42 Similarly, when b3 = 0, the threshold in state T is independent of the player's own action, and when b3 = −2, it is lowered when the player has chosen action H (as intended in state T). The restriction on b2 concerns the length of the punishment phase: when b2 = −∞, the punishment phase is of fixed length T and the transition from state T to state 0 is automatic; when b2 = 0, the threshold in state T is the same as that in state 0, and the punishment phase is of variable length. When T = 1, threshold strategies with b2 = 0 resemble the tit-for-tat strategy in that their (on-the-path) reaction is determined only by the most recent public signal and does not depend on the state.
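As a stylized illustration of this transition structure, the sketch below encodes the five specifications and one period of the threshold automaton. The exact parameterization in (13)-(14) is given in the previous section; the cutoff function here is our own reconstruction of the verbal description above, with −∞ and +∞ standing in for the automatic transitions, and is not the estimated model itself.

```python
import math

NEG_INF = float("-inf")

SPECS = {            # (b1, b2, b3), as in the list above
    "Sa": (0.0, 0.0, 0.0),
    "Sb": (0.0, 0.0, -2.0),
    "Sc": (-2.0, NEG_INF, 0.0),
    "Sd": (-2.0, 0.0, -2.0),
    "Se": (0.0, NEG_INF, 0.0),
}

def cutoff(state, own_H, a, b1, b2, b3, T):
    """State-contingent threshold to which the public signal z is compared."""
    if state == 0:
        return a + (b1 if own_H else 0.0)      # own deviation may shift the cutoff
    if state == T:
        return a + b2 + (b3 if own_H else 0.0)
    return math.inf   # states 1, ..., T-1: move onward regardless of z

def step(state, z, own_H, a, spec, T=1):
    """One transition: back to state 0 when z clears the cutoff, otherwise one
    state further into the punishment phase (state T absorbs until released)."""
    b1, b2, b3 = SPECS[spec]
    return 0 if z > cutoff(state, own_H, a, b1, b2, b3, T) else min(state + 1, T)
```

With T = 1 and specification (Sa), step reduces to the tit-for-tat-like rule discussed above: the next action depends only on whether the most recent public signal exceeded a.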

The above restrictions on b1, b2, and b3 translate into restrictions on γ1, γ2, and γ3 through (15). Just as in the case of γ4 discussed earlier, we would like uit to generate randomness even when b2 = −∞. For this reason, we replace γ2 = −∞ by γ2 = γ0 z̲ − α − ψ, where z̲ denotes the lowest realization of z: when γ2 = γ0 z̲ − α − ψ, the inequality between γ0 zt and κit + uit stays the same as when γ2 = −∞ for each realization of zt if νi and uit are at or below their mean values, but is reversed when νi + uit takes a large positive value.

In sum, the restrictions for each model are those given in Table 7, together with γ0 > 0 and γ4 = γ0 z̄ − α − ψ (as mentioned earlier). Our findings can be summarized as follows: The minimal punishment length T is estimated to be 1 in each model.

42 Note that the probability that i's opponent played L given the public signal z and own action ai = L is the same as that given z − 2 and ai = H.

            β = 0        β = 1        β = 4        β = 10
α           8.391***     4.220***     0.309*      -0.330**
           (0.599)      (0.511)      (0.168)      (0.159)
γ0          0.504***     0.274***     0.048***     0.005*
           (0.031)      (0.027)      (0.007)      (0.003)
ψ           0.819***     1.253***     1.219***     1.389***
           (0.257)      (0.375)      (0.242)      (0.250)
ρ           0.124§       0.227§       0.286§       0.184§
LL       -639.219     -520.528     -784.588     -772.695
Obs.       1908         1112         1360         1258

*, **, *** indicate statistical significance at 10%, 5%, and 1% respectively.
§ indicates statistical significance at 1% using a likelihood ratio test.

Table 8: Parameter estimates of the (Sa) specification

In terms of their fit levels, the five models are ordered almost identically across all the treatments: (Sa) has the best fit, followed by (Sb), (Sd), (Se), and (Sc), in this order.43 The remaining results are obtained using likelihood ratio tests. In every treatment except β = 4, we cannot reject at the 10% level the hypothesis that the (Sa) specification fits the data as well as the general threshold specification does. In the β = 4 treatment, however, (Sa) is rejected at the 1% level. With a few exceptions, the corresponding hypothesis for each of the other specifications is rejected at the 1% level in every treatment. We depict the (Sa) specification in Figure 9. It can be checked that, given our parameter values, there exists a threshold k for which (Sa) is a symmetric Nash equilibrium strategy if and only if the noise level satisfies β ≤ 2 log((5δ − 1)/(5δ − 2)) ≈ 3.4026.44 This may in part explain the rejection of this strategy in the β = 4 treatment. The fact that (Sa) is not rejected for β = 10 indicates that some subjects may condition their behavior on the public signal even though the theory suggests that they should not. Again, the random effects and ψ both turn out to be statistically significant, implying the existence of asymmetry across subjects.

43 (Sd) and (Se) are reversed for β = 1, and (Se) and (Sc) are reversed for β = 10.
44 For δ > 2/5 and β ≤ 2 log((5δ − 1)/(5δ − 2)), the optimal threshold is given by

k^*(\beta, \delta) = 18 + \beta \log\left\{ \frac{\delta - 1 + \sqrt{(\delta - 1)^2 + e^{-2/\beta}\,(3 - 4e^{-2/\beta})}}{2\,(3 - 4e^{-2/\beta})} \right\} \le 18.

It can also be verified that (Sa) cannot be a symmetric perfect equilibrium strategy for any β.

[Figure 9: The (Sa) specification. A two-state automaton: in state 0 the player plays L; in state 1 he plays H. From either state, play moves to (or stays in) state 0 when z > a, and to state 1 when z ≤ a.]

As with the general threshold specification, the coefficient estimates of α and γ0, as well as the ratio of those estimates (α/γ0), all decrease with noise, while the estimate of ψ and the ratio of the estimates ψ/γ0 both increase with noise. Since the rejection of the (Sa) specification in the β = 4 treatment indicates that at least one of the parameters b1, b2, and b3 is non-zero, we test the following additional specifications:

Sa-i: b2 = b3 = 0

Sa-ii: b1 = b3 = 0

Sa-iii: b1 = b2 = 0

Both (Sa-ii) and (Sa-iii) are rejected at the 1% level, whereas (Sa-i) is not rejected at the 10% level.45 The estimates of γ0 and γ1 both turn out positive, implying that b1 = γ1/γ0 > 0, or that subjects have a higher threshold following their own deviation to H in state 0. It is not clear why the estimated strategy conditions on private actions in this treatment only. As a further examination of the (Sa) specification, we report below the results of two alternative tests that do not rely on the parametric assumptions used for the estimation of the threshold strategies.

First, we test the hypothesis that b1 = b3 = 0 in the perfect monitoring treatment (β = 0) using the sign test (Snedecor and Cochran 1980), which requires no parametric assumption.

45 The coefficient estimates are α =0.554, γ0 =0.059, γ1 =0.196, ψ =1.048,andρ =0.339.

Suppose that a player uses a threshold strategy with b1 = 0 in the perfect monitoring game. Then the condition for transition from state 0 to state 1 should be neutral with respect to the identity of the deviator: if the opponent's deviation in state 0 moves the player to state 1, then so does his own deviation. This can be tested as follows. Take any subject i and consider the following two sequences of play: in the first sequence, both i and his opponent j play L in period t − 2, and i plays L and j plays H in period t − 1; in the second sequence, both play L in t − 2, and i plays H and j plays L in t − 1. When b1 = 0, i's action in period t should be the same conditional on either sequence. We compare the rate at which subject i plays action H after the first sequence with that after the second sequence using a sign test. The null hypothesis that they are the same cannot be rejected (p-value = 0.51): 6 subjects play H more often when they played L in period t − 1, 3 less often, and 1 exactly the same number of times. Assuming T = 1, we can also test whether b3 = 0 by comparing the rate at which subject i plays L after the sequence ((H, H), (H, L)) with that after the sequence ((H, H), (L, H)). Again, the null hypothesis that they are the same cannot be rejected (p-value = 0.45): 2 subjects played L more often when they played L in t − 1, 5 less often, and 1 the same number of times. These results support the findings from the likelihood ratio tests; in particular, they imply that the specifications (Sb)-(Sd), which all have (b1, b3) ≠ (0, 0), are unlikely.46

To give a better idea of how well the (Sa) strategy fits, we check how well the deterministic specification of the (Sa) strategy (i.e., b1 = b2 = b3 = 0 in (11) and (12)) describes the data. Specifically, pick any subject, fix the base threshold level a, and consider the sequence of actions this strategy would generate given the sequence of public signals and the subject's own actions. For each value of a, we compare the actions thus generated against the actions actually chosen by the subject, and count the number of periods in which the former matches the latter. We then choose a so as to maximize the hit rate, i.e., the fraction of periods in which the two action choices coincide. The hit rate is 93% for the median player and 88% on average in the β = 0 treatment.47 Likewise, the median and average hit rates are 80% and 82% in the β = 1 treatment, 77% and 77% in the β = 4 treatment, and 67% and 71% in the β = 10 treatment. All these numbers are statistically different from 50% (a coin toss) at the 1% level.

46 While (Se) has (b1, b3) = (0, 0), it performs extremely poorly in the likelihood ratio test.
47 In other words, when the subjects are ranked by their hit rates, the hit rate of the median subject is 93%.

The results of these alternative tests provide further support for the subjects' use of the (Sa) strategy.
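The hit-rate computation itself is simple enough to state in a few lines. The sketch below is our own code under an assumed data layout (signals zt paired with the action chosen in t + 1); it uses the fact that, with T = 1 and b1 = b2 = b3 = 0, the predicted action depends only on the most recent public signal.

```python
import numpy as np

def best_hit_rate(z, actions_next, grid):
    """z[t]: public signal in period t; actions_next[t]: action chosen in t + 1,
    coded 'L' or 'H'. Returns the maximal fraction of periods in which the
    deterministic (Sa) rule (play L iff z > a) matches the observed actions."""
    z = np.asarray(z)
    actions_next = np.asarray(actions_next)
    best = 0.0
    for a in grid:
        predicted = np.where(z > a, "L", "H")
        best = max(best, float(np.mean(predicted == actions_next)))
    return best

# e.g. best_hit_rate(z[:-1], c[1:], np.linspace(z.min(), z.max(), 200))
```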

7 Discussion

As discussed in the Introduction, this paper analyzes cooperation in infinitely repeated games with imperfect public monitoring under the exact conditions of the standard theory. It studies the effect of noise levels in a standard oligopolistic setting where deviations monotonically shift the distribution of the one-dimensional public signal. Our findings suggest that subjects do cooperate in such an environment, and that their payoffs are a decreasing function of the level of noise, as predicted. The paper also analyzes the subjects' strategies by focusing attention on threshold strategies, which encompass trigger and other strategies that have been frequently discussed in the literature. Our estimates suggest that the subjects' strategies in most treatments have a remarkably simple representation: in every period, the strategy chooses the cooperative action when the public signal in the last period is above a certain threshold, and chooses the punitive action otherwise.

While the present paper limits its theory to symmetric equilibrium payoffs, the data suggest a certain degree of asymmetry in the subjects' strategies. A few comments are in order regarding this point. First, a theory of asymmetric equilibrium payoffs as a function of noise would be enormously complex. We think that the simplifying assumption of symmetry provides a good approximation for our qualitative findings on noise and cooperation.48 Second, while the consideration of asymmetric strategy profiles raises efficiency, it appears to add little to our analysis of payoffs: in the low noise treatments, the observed payoffs are lower than the maximal symmetric equilibrium payoff, and hence are also lower than the maximal asymmetric equilibrium payoff. In the high noise treatments, on the other hand, the observed payoffs do exceed the maximal symmetric equilibrium payoff. However, it is not easy to explain this through asymmetric equilibria either. The case in point is the control treatment, where the subjects' payoffs are strictly positive. In theory, however, the unique (symmetric or asymmetric) equilibrium in this treatment is the repetition of the one-shot Nash equilibrium, which yields zero.

48 In fact, symmetry is the working assumption of much of the experimental literature, which often finds asymmetries across subjects. See, for example, Roth (1995) and Casari, Ham and Kagel (2004).

One possible explanation for the observed deviations of the subjects' payoffs is trembling in choosing actions. Suppose, for example, that the noise is low, so that the efficient equilibrium strategy entails the cooperative action L most of the time on the equilibrium path. Then, when a player trembles, he switches from L to H more frequently, triggering a punishment and lowering his payoff below the level without trembling. On the other hand, if the noise is high, the efficient equilibrium strategy entails H most of the time, and trembling causes switching from H to L more often. This raises the player's payoff above the level without trembling. In comparison with real industrial settings, the subjects in our experiments play in an extremely simple environment with two stage actions and a single public signal. It remains to be seen whether the paper's observations continue to hold in a more complex environment that mimics reality. In this sense, more analysis is required before discussing the implications for social welfare.

Appendix

Proof of Proposition 1. By the bang-bang property of a perfect public equilibrium (Abreu et al. (1990, Theorem 3)), the maximal symmetric perfect public equilibrium payoff v can be generated by a stationary grim-trigger equilibrium σ which plays the symmetric action profile â throughout the cooperation phase and reverts to a0 if and only if z ∉ Q for some Q ⊂ R.49 Consider an alternative grim-trigger strategy profile σ̂ which begins with â and reverts to a0 if and only if z ≤ k, where k is such that

P(z \in Q \mid \hat a) = \int_Q h(z \mid \hat a)\, dz = \int_k^{\infty} h(z \mid \hat a)\, dz = P(z > k \mid \hat a). \tag{17}

It then follows from (2) that σ̂ and σ yield the same payoff. On the other hand, the incentive constraint (5) for this strategy can be written as

\frac{1 - \delta\{1 - H(k \mid a_i^0, \hat a_{-i})\}}{1 - \delta\{1 - H(k \mid \hat a)\}} \ge \frac{g_i(a_i^0, \hat a_{-i}) - g^0}{\hat g - g^0}. \tag{18}

In what follows, we show that σ̂ is also an equilibrium by verifying (18). Denote K = (k, ∞) and write, for ai ∈ Ai,

M(a_i) = \int_{K \setminus Q} h(z \mid a_i, \hat a_{-i})\, dz \quad \text{and} \quad N(a_i) = \int_{Q \setminus K} h(z \mid a_i, \hat a_{-i})\, dz.

49 Generation of a perfect public equilibrium payoff below v requires the use of a different Q in period 1.

Note that M(âi) = N(âi) by (17), and that

P(z > k \mid a_i, \hat a_{-i}) = P(z \in Q \mid a_i, \hat a_{-i}) + M(a_i) - N(a_i).

Note that (18) follows from P(z > k | a_i^0, â_{−i}) ≤ P(z ∈ Q | a_i^0, â_{−i}), or equivalently, M(a_i^0) ≤ N(a_i^0). Assumption 1 implies that

M(a_i^0) = \int_{K \setminus Q} h(z \mid a_i^0, \hat a_{-i})\, dz
         = h(k \mid a_i^0, \hat a_{-i}) \int_{K \setminus Q} \frac{h(z \mid a_i^0, \hat a_{-i})}{h(k \mid a_i^0, \hat a_{-i})}\, dz
         \le h(k \mid a_i^0, \hat a_{-i}) \int_{K \setminus Q} \frac{h(z \mid \hat a)}{h(k \mid \hat a)}\, dz
         = \frac{h(k \mid a_i^0, \hat a_{-i})}{h(k \mid \hat a)}\, M(\hat a_i),

and that

N(a_i^0) = \int_{Q \setminus K} h(z \mid a_i^0, \hat a_{-i})\, dz
         = h(k \mid a_i^0, \hat a_{-i}) \int_{Q \setminus K} \frac{h(z \mid a_i^0, \hat a_{-i})}{h(k \mid a_i^0, \hat a_{-i})}\, dz
         \ge h(k \mid a_i^0, \hat a_{-i}) \int_{Q \setminus K} \frac{h(z \mid \hat a)}{h(k \mid \hat a)}\, dz
         = \frac{h(k \mid a_i^0, \hat a_{-i})}{h(k \mid \hat a)}\, N(\hat a_i).

Therefore,

M(a_i^0) - N(a_i^0) \le \frac{h(k \mid a_i^0, \hat a_{-i})}{h(k \mid \hat a)} \{M(\hat a_i) - N(\hat a_i)\} = 0.

This completes the proof. //
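The threshold construction in (17) is easy to illustrate numerically. In the sketch below, the standard normal stands in for h(· | â) and the continuation region Q is hypothetical; both are our own illustrative choices, not objects from the paper.

```python
# Illustration of (17): choose k so that P(z > k | a-hat) = P(z in Q | a-hat),
# so the single-threshold trigger reverts with the same probability as Q does.
from scipy.stats import norm

def prob_Q(cdf):
    # hypothetical continuation region Q = (-inf, -1.0] U [2.0, 3.0)
    return cdf(-1.0) + (cdf(3.0) - cdf(2.0))

p = prob_Q(norm.cdf)        # P(z in Q | a-hat)
k = norm.ppf(1.0 - p)       # then P(z > k | a-hat) = p, as (17) requires
print(round(k, 3))
```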

Characterization of K(δ) and k*(δ). Proposition 3 below provides a characterization of the set K(δ) of thresholds for which player i finds playing âi incentive compatible.

Assumption 4 f is continuous, and f(0) = max_{x ∈ R} f(x).

Assumption 4 holds for many standard distributions, as well as for the one in (10) used in our experiment. The following proposition shows that the optimal threshold under such a distribution is always below the expected value of the public signal.

Proposition 3 Suppose that Assumptions 2 and 3 hold. Then K(δ) is a (possibly empty) closed interval. If, in addition, Assumption 4 holds, then the optimal threshold satisfies k*(δ) < s(â).

Proof. Define

W(k, a_i) = \int_{k - s(\hat a)}^{\infty} \left\{ l - \frac{f(x + d)}{f(x)} \right\} f(x)\, dx.

After some algebra, we see that (7) is equivalent to

W(k, a_i) \ge \frac{l - 1}{\delta}. \tag{19}

Since f satisfies Assumption 3 and d ≥ 0, it can be verified that f(x + d)/f(x) is weakly decreasing in x. Take any k and k′ with k < k′ such that W(k, a_i) ≥ (l − 1)/δ and W(k′, a_i) ≥ (l − 1)/δ, and take any k″ ∈ (k, k′). We then have

W(k'', a_i) = W(k, a_i) - \int_{k - s(\hat a)}^{k'' - s(\hat a)} \left\{ l - \frac{f(x + d)}{f(x)} \right\} f(x)\, dx,

and

W(k'', a_i) = W(k', a_i) + \int_{k'' - s(\hat a)}^{k' - s(\hat a)} \left\{ l - \frac{f(x + d)}{f(x)} \right\} f(x)\, dx.

Since the quantity inside the brackets in each integrand is weakly increasing in x, if the first integral is positive, so is the second; equivalently, if the second integral is negative, so is the first. In either case, we have W(k″, a_i) ≥ (l − 1)/δ. This implies that the set of k's which satisfy (8) is convex. That K(δ) is closed follows from the continuity of W in k. Suppose now that Assumption 4 holds. We then have

l - \frac{f(d)}{f(0)} > 0.

Since l − f(x + d)/f(x) is weakly increasing and continuous in x, if W(k, a_i) ≥ (l − 1)/δ for some k ≥ s(â), then W(s(â) − γ, a_i) > W(s(â), a_i) ≥ W(k, a_i) ≥ (l − 1)/δ for a sufficiently small γ > 0 as well. This shows that k*(δ) < s(â). //
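Condition (19) can also be checked numerically for any candidate threshold. The sketch below is our own code: f, l, d, and s(â) must be supplied by the user (per Assumptions 2-4), and W is evaluated by quadrature.

```python
import numpy as np
from scipy.integrate import quad

def W(k, f, l, d, s_hat):
    """W(k, a_i): integral of (l - f(x + d)/f(x)) f(x) over [k - s(a-hat), inf)."""
    integrand = lambda x: (l - f(x + d) / f(x)) * f(x)
    value, _ = quad(integrand, k - s_hat, np.inf)
    return value

def satisfies_19(k, f, l, d, s_hat, delta):
    # k belongs to K(delta) exactly when (19) holds
    return W(k, f, l, d, s_hat) >= (l - 1.0) / delta
```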

Description of K(δ). When the distribution of the random variable x is as specified in (10) with β > 0, the set K(δ) of effective thresholds is explicitly given as follows.

Let

\lambda = \log \frac{\delta^2 l}{(\delta l + 1 - l)^2}, \qquad \mu = \log \frac{\delta l}{\delta(2l - 1) - 2(l - 1)}, \qquad \nu = \log \frac{\delta}{\delta l - 2(l - 1)}.

Note that μ is well-defined when δ > 2(l − 1)/(2l − 1), and ν is well-defined when 2(l − 1)/l < δ < 1. Furthermore, whenever these quantities are well-defined, we have log l < λ < μ < ν. The following three cases arise depending on the value of the discount factor δ:

1) δ ∈ (0, 2(l − 1)/(2l − 1)]: K(δ) = ∅.

2) δ ∈ (2(l − 1)/(2l − 1), min{2(l − 1)/l, 1}):

K(δ) = [k3, k2]  if d/β ∈ [μ, ∞),
       [k1, k2]  if d/β ∈ [λ, μ),
       ∅         if d/β ∈ (0, λ),

where

k_1 = s(\hat a) + \beta \log\left\{ l^{-1/2}\left[ e^{-\lambda/2} - \left( e^{-\lambda} - e^{-d/\beta} \right)^{1/2} \right] \right\},
k_2 = s(\hat a) + \beta \log\left\{ l^{-1/2}\left[ e^{-\lambda/2} + \left( e^{-\lambda} - e^{-d/\beta} \right)^{1/2} \right] \right\},
k_3 = s(\hat a) + \beta \log \frac{2(1 - \delta)(l - 1)}{\delta\left( e^{d/\beta} - l \right)}.

In this case, we have (a) k1 > s(a_i^0, â_{−i}) if and only if d/β < μ, (b) k3 ≤ s(a_i^0, â_{−i}) if and only if d/β ≥ μ, and (c) k2 ≥ s(â).

3) δ ∈ [min{2(l − 1)/l, 1}, 1):

K(δ) = [k3, k4]  if d/β ∈ [ν, ∞),
       [k3, k2]  if d/β ∈ [μ, ν),
       [k1, k2]  if d/β ∈ [λ, μ),
       ∅         if d/β ∈ (0, λ),

where k1, k2, and k3 are defined as above and

k_4 = s(a_i^0, \hat a_{-i}) + \beta \log \frac{\delta\left( l e^{d/\beta} - 1 \right)}{2(l - 1)}.

In this case, we have (a) and (b) above, and (c′) k2 < s(â) if and only if d/β > ν.
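The case structure above lends itself to direct computation. The sketch below uses our reconstructed expressions for λ, μ, and ν (and implicitly assumes l < 2, so that case 3 can arise); it reports only which branch of the characterization applies.

```python
import math

def K_case(delta, l, d, beta):
    """Classify K(delta) according to the three cases above."""
    if delta <= 2 * (l - 1) / (2 * l - 1):
        return "empty"                                        # case 1
    lam = math.log(delta ** 2 * l / (delta * l + 1 - l) ** 2)
    mu = math.log(delta * l / (delta * (2 * l - 1) - 2 * (l - 1)))
    r = d / beta
    if delta < min(2 * (l - 1) / l, 1.0):                     # case 2
        if r >= mu:
            return "[k3, k2]"
        return "[k1, k2]" if r >= lam else "empty"
    nu = math.log(delta / (delta * l - 2 * (l - 1)))          # case 3
    if r >= nu:
        return "[k3, k4]"
    if r >= mu:
        return "[k3, k2]"
    return "[k1, k2]" if r >= lam else "empty"
```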

References

Abramowitz, M. and Stegun, I. A. (eds): 1972, Handbook of mathematical functions with formulas, graphs, and mathematical tables, U.S. Government Printing Office, Washington, D.C.

Abreu, D., Pearce, D. and Stacchetti, E.: 1990, Toward a theory of discounted repeated games with imperfect monitoring, Econometrica 58, 1041—1063.

Bereby-Meyer, Y. and Roth, A. E.: 2004, The fragility of cooperation. mimeo.

Casari, M., Ham, J. C. and Kagel, J. H.: 2004, Selection bias, demographic effects and ability effects in common value auction experiments. mimeo.

Cason, T. N. and Khan, F. U.: 1999, A laboratory study of voluntary public goods provision with imperfect monitoring and communication, Journal of Development Economics 58, 533—552.

Chamberlain, G.: 1980, Analysis of covariance with qualitative data, Review of Economic Studies 47, 225—238.

Cosslett, S. R. and Lee, L.-F.: 1985, Serial correlation in latent discrete variable models, Journal of Econometrics 27, 79—97.

Dal-Bo, P.: 2003, Cooperation under the shadow of the future: experimental evidence from infinitely repeated games. mimeo.

Duffy, J. and Ochs, J.: 2003, Cooperative behavior and the frequency of social interaction. mimeo.

Ellison, G.: 1994, Theories of cartel stability and the joint executive committee, Rand Journal of Economics 25, 37—57.

Engle-Warnick, J. and Slonim, R. L.: 2002, Inferring repeated-game strategies from actions: Evidence from trust game experiments. mimeo.

Feinberg, R. M. and Husted, T. A.: 1993, An experimental test of discount-rate effects on collusive behaviour in duopoly markets, Journal of Industrial Economics 41(2), 153—160.

Feinberg, R. M. and Snyder, C.: 2002, Collusion with secret price cuts: An experimental investigation, Economics Bulletin 3(6), 1—11.

Green, E. J. and Porter, R. H.: 1984, Noncooperative collusion under imperfect price information, Econometrica 52, 87—100.

Gutierrez, R. G., Carter, S. and Drukker, D. M.: 2001, sg160: On boundary-value likelihood-ratio tests, Stata Technical Bulletin Reprints 10, 269—273.

Heckman, J. J.: 1981, Structural Analysis of Discrete Data with Econometric Applications, MIT Press, Cambridge, MA, chapter The Incidental Parameters Problem and the Problem of Initial Conditions in Estimating a Discrete Time-Discrete Data Stochastic Process, pp. 179—195.

Holcomb, J. H. and Nelson, P. S.: 1997, The role of monitoring in duopoly market outcomes, Journal of Socio-Economics 26(1), 79—93.

Holt, C. A.: 1995, Handbook of Experimental Economics, Princeton University Press, Princeton, chapter Industrial Organization: A Survey of Laboratory Research, pp. 349—444.

Kandori, M.: 1992, The use of information in repeated games with imperfect monitoring, Review of Economic Studies 59, 581—593.

Karlin, S.: 1968, Total Positivity, Stanford University Press, Stanford.

Lee, L.-F. and Porter, R. H.: 1984, Switching regression models with imperfect sample separation information, with an application to cartel stability, Econometrica 52, 391—418.

Mason, C. F. and Phillips, O. R.: 2001, In support of trigger strategies: Experimental evidence from two-person non-cooperative games. mimeo.

Porter, R. H.: 1983, Optimal cartel trigger price strategies, Journal of Economic Theory 29, 313—338.

Porter, R. H.: 1985, On the incidence and duration of price wars, Journal of Industrial Economics 33, 415—426.

Rapoport, A. and Chammah, A. M.: 1965, Prisoner’s Dilemma: A Study in Conflict and Cooperation, The University of Michigan Press, Ann Arbor.

Roth, A. E.: 1995, Handbook of Experimental Economics, Princeton University Press, Princeton, chapter Bargaining Experiments, pp. 253—348.

Roth, A. E. and Murnighan, J. K.: 1978, Equilibrium behavior and repeated play of the prisoner's dilemma, Journal of Mathematical Psychology 17, 189—198.

Sell, J. and Wilson, R. K.: 1991, Trigger strategies in repeated-play public goods games: Forgiving, non-forgiving or non-existent. Working Paper, Rice University.

Snedecor, G. W. and Cochran, W. G.: 1980, Statistical Methods, Iowa State University Press, Ames, Iowa.

Wooldridge, J. M.: 2002, Econometric Analysis of Cross Section and Panel Data, The MIT Press, Cambridge.
