When Does Learning in Games Generate Convergence to Nash Equilibria? The Role of Supermodularity in an Experimental Setting

By YAN CHEN AND ROBERT GAZZALE*

This study clarifies the conditions under which learning in games produces convergence to Nash equilibria in practice. We experimentally investigate the role of supermodularity, which is closely related to the more familiar concept of strategic complementarities, in achieving convergence through learning. Using a game from the literature on solutions to externalities, we find that supermodular and "near-supermodular" games converge significantly better than those far below the threshold of supermodularity. From a little below the threshold to the threshold, the improvement is statistically insignificant. Increasing the parameter far beyond the threshold does not significantly improve convergence. (JEL C91, D83)

* Chen: School of Information, University of Michigan, 1075 Beal Avenue, Ann Arbor, MI 48109 (email: [email protected]); Gazzale: Department of Economics, Williams College, Fernald House, Williamstown, MA 01267 (email: [email protected]). We thank Jim Andreoni, Charles Brown, David Cooper, Julie Cullen, John DiNardo, Jacob Goeree, Ernan Haruvy, Daniel Houser, Peter Katuscak, and Jean Krisch for helpful discussions, seminar participants at the University of Michigan, Virginia Tech, and ESA 2001 (Barcelona and Tucson) for their comments, Emre Sucu for programming for the experiment, and Yuri Khoroshilov, Jim Leady, and Belal Sabki for excellent research assistance. We are grateful for the constructive comments of B. Douglas Bernheim, the co-editor, and three anonymous referees, which significantly improved the paper. The research support provided by NSF Grant SES-0079001 to Chen is gratefully acknowledged. Any errors are our own.

When do players learn to play Nash equilibria? The answer to this important question will help us identify when the outcomes predicted by theory will be realized in competitive environments involving real people. This question has been examined both theoretically (see Drew Fudenberg and David Levine, 1998, for a survey) and experimentally (see Colin Camerer, 2003, for a survey).

According to the theoretical literature, games with strategic complementarities (Paul Milgrom and John Roberts, 1991; Milgrom and Chris Shannon, 1994) have robust dynamic stability properties: under numerous learning dynamics, they converge to the set of Nash equilibria that bound the serially undominated set. The learning dynamics include Bayesian learning, fictitious play, adaptive learning, Cournot best reply, and many others. These games include the supermodular games of Donald Topkis (1979), Xavier Vives (1985, 1990), Russell Cooper and Andrew John (1988), and Milgrom and Roberts (1990). In supermodular games, each player's marginal utility of increasing her strategy rises with increases in her rivals' strategies, so that (roughly) the players' strategies are "strategic complements."

The existing literature recognizes that games with strategic complementarities encompass important economic applications of noncooperative game theory, for example, macroeconomics under imperfect competition (Cooper and John, 1988), search (Peter A. Diamond, 1982), bank runs (Douglas W. Diamond and Philip H. Dybvig, 1983; Andrew Postlewaite and Vives, 1987), network and adoption externalities (Dybvig and Chester S. Spatt, 1983), and mechanism design (Chen, 2002).

Past experimental studies of learning and mechanism design suggest that, in addition to equilibrium efficiency, mechanism choice should depend on whether players learn to play the equilibrium and on the nature of play on the path to equilibrium. In reviewing the experimental literature on incentive-compatible mechanisms for pure public goods, Chen (forthcoming) finds that mechanisms with strategic complementarities, such as the Groves-Ledyard mechanism under a high punishment parameter, converge robustly to the efficient equilibrium (Chen and Charles R. Plott, 1996; Chen and Fang-Fang Tang, 1998). Conversely, those far away from the threshold of strategic complementarities do not seem to converge (Vernon L. Smith, 1979; Ronald M. Harstad and Michael Marrese, 1982). In previous experiments, parameters are set either far away from the threshold for strategic complementarities (e.g., Chen and Tang, 1998) or very close to the threshold (e.g., Josef Falkinger et al., 2000).1 These experiments do not, however, systematically set the parameters below, close to, at, and above the threshold to assess the effect of strategic complementarities on convergence. This is the first such systematic experimental study of games with strategic complementarities.2 Consequently, this study answers three important questions that the theory on games with strategic complementarities does not address. First, as the parameters approach the threshold of strategic complementarities, will play converge to equilibrium gradually or abruptly? Second, is there a clear performance ranking among games with strategic complementarities? Third, how important is strategic complementarity compared to other factors? The answer to the first question can help us assess, a priori, whether a game "close" to being supermodular, such as the Falkinger mechanism (Falkinger, 1996), might also have good convergence properties. The answer to the second question will help us choose the best parameters within the class of games with strategic complementarities. The answer to the third question will help us assess the importance of strategic complementarities in learning and convergence.

1 Proofs of supermodularity of the Groves-Ledyard and Falkinger mechanisms are presented in Chen (in press).

2 By systematically varying the free parameter in the Groves-Ledyard mechanism, Jasmina Arifovic and John O. Ledyard (2001) study learning dynamics and mechanism convergence using genetic algorithms compared with experimental data.

To address these questions, this study adopts an experimental game from the literature on solutions to externalities. Hal R. Varian (1994) proposes a simple class of two-stage mechanisms, the compensation mechanisms, in which subgame-perfect equilibria implement efficient allocations. In a generalized version of Varian's mechanism (John Cheng, 1998), one can vary, without altering the equilibrium, a parameter that determines whether the condition for strategic complementarities is satisfied.

There have been two experimental studies of the compensation mechanisms, neither of which adopts a version with strategic complementarities. James Andreoni and Varian (1999) study the mechanism in the context of the Prisoners' Dilemma. They find that adding a commitment stage to the standard Prisoners' Dilemma game nearly doubles the amount of cooperation, to two-thirds. Yasuyo Hamaguchi et al. (2003) investigate a version with a larger strategy space and find 20-percent equilibrium play.

In this paper, we examine the compensation mechanism in an economic environment with a much larger strategy space. Furthermore, we choose various versions of the mechanism to study systematically the effect of strategic complementarities on convergence.

The paper is organized as follows. Section I introduces games with strategic complementarities and presents theoretical properties of the compensation mechanisms. Section II presents the experimental design. Section III introduces the set of hypotheses. Section IV presents experimental results on the level and speed of convergence, as well as efficiency. Section V presents the calibration of three learning models, validation of the models on a hold-out sample, and simulation of performance in the long run using a calibrated learning model. Section VI discusses our findings, and Section VII concludes.

I. Strategic Complementarity and the Compensation Mechanisms

In this section, we first introduce games with strategic complementarities and their theoretical properties. We then introduce and analyze a family of mechanisms in this class of games, namely the compensation mechanisms.

Games with strategic complementarities (Milgrom and Shannon, 1994) are those where, given an ordering of strategies, higher actions by one player provide an incentive for the other players to take higher actions as well.

These games include supermodular games, in which the incremental return to any player from increasing her strategy is a nondecreasing function of the strategy choices of other players (increasing differences). Furthermore, if a player's strategy space has more than one dimension, components of a player's strategy are complements (supermodularity). Membership in this class of games is easy to check. Indeed, for smooth functions in R^n, letting Pi be the strategy space and πi be the payoff function of player i, the following theorem characterizes increasing differences and supermodularity.

THEOREM 1 (Topkis, 1978): Let πi be twice continuously differentiable on Pi. Then πi has increasing differences in (pi, pj) if and only if ∂²πi/∂pih∂pjl ≥ 0 for all i ≠ j and all 1 ≤ h ≤ ki and all 1 ≤ l ≤ kj; and πi is supermodular in pi if and only if ∂²πi/∂pih∂pil ≥ 0 for all i and all 1 ≤ h < l ≤ ki.

Increasing differences means that an increase in the strategy of player i's rivals raises her marginal utility of playing a high strategy. The supermodularity requirement ensures complementarity among components of a player's strategies and is automatically satisfied in a one-dimensional strategy space. Note that a supermodular game is a game with strategic complementarities, but the converse is not true.

Supermodular games have interesting theoretical properties. In particular, they are robustly stable. Milgrom and Roberts (1990) prove that, in these games, learning algorithms consistent with adaptive learning converge to the set bounded by the largest and the smallest Nash equilibrium strategy profiles. Intuitively, a sequence is consistent with adaptive learning if players "eventually abandon strategies that perform consistently badly in the sense that there exists some other strategy that performs strictly and uniformly better against every combination of what the competitors have played in the not-too-distant past" (Milgrom and Roberts, 1990). This includes numerous learning dynamics, such as Bayesian learning, fictitious play, adaptive learning, and Cournot best reply.

While strategic complementarity is sufficient for convergence, it is not a necessary condition. Thus, while games with strategic complementarities ought to converge robustly to the Nash equilibrium, games without strategic complementarities may also converge under specific learning algorithms. Whether these specific learning algorithms are a realistic description of human learning is an empirical question.

While the theory on games with strategic complementarities predicts convergence to equilibrium, it does not address four practical issues. First, as the parameters of a game approach the threshold of strategic complementarities, does play converge gradually or abruptly? Second, is convergence faster further past the threshold? Third, how important is strategic complementarity compared to other features of a game that might also induce convergence to equilibrium? Last, for supermodular games with multiple Nash equilibria, will players learn to coordinate on a particular equilibrium? We choose a game that allows us to answer the first three questions. The fourth question has been addressed by John Van Huyck et al. (1990) and James Cox and Mark Walker (1998).3

3 Cox and Walker (1998) study whether subjects can learn to play Cournot duopoly strategies in games with two kinds of interior Nash equilibrium. Their type I duopoly has a stable interior Nash equilibrium under Cournot best-reply dynamics and is therefore dominance solvable (Herve Moulin, 1984). Their type II duopoly has an unstable interior Nash equilibrium and two boundary equilibria under Cournot best-reply dynamics, and is therefore not dominance solvable. They found that after a few periods subjects did play the stable interior, dominance-solvable equilibria, but they played neither the unstable interior equilibria nor the boundary equilibria. It is interesting to note that these duopoly games are submodular games. Being two-player games, they are also supermodular (Rabah Amir, 1996). The results of Cox and Walker (1998) illustrate the importance of uniqueness together with supermodularity in inducing convergence.

Specifically, we use the compensation mechanism to study the role of strategic complementarities in learning and convergence to equilibrium play. In the mechanism, each of two players offers to compensate the other for the "costs" of the efficient choice. Assume that when player 1's production equals x, her net profit is rx − c(x), where r is the market price and c(·) is a differentiable, positive, increasing, and convex cost function. Production causes an externality on player 2, whose payoff is −e(x), also assumed to be differentiable, positive, increasing, and convex. The mechanism is a two-stage game in which the unique subgame-perfect Nash equilibrium induces the Pareto-efficient outcome, x such that r = e′(x) + c′(x). In the first stage (the announcement stage), player 1 announces p1, a per-unit subsidy to be paid to player 2, while player 2 simultaneously announces p2, a per-unit tax to be paid by player 1. Announcements are revealed to both players. In the second stage (the production stage), player 1 chooses a production level, x. The payoff to player 1 is π1 = rx − c(x) − p2x − α(p1 − p2)², while the payoff to player 2 is π2 = p1x − e(x), where α > 0 is a free punishment parameter chosen by the designer.

We study a generalized version of the compensation mechanism (Cheng, 1998), which adds a punishment term, −β(p1 − p2)², to player 2's payoff function, thus making the payoff functions

(1)  π1 = rx − c(x) − p2x − α(p1 − p2)²,
     π2 = p1x − e(x) − β(p1 − p2)².

Using the generalized version, we solve the game by backward induction.
In the production stage, player 1 chooses the quantity that solves max_x rx − c(x) − p2x − α(p1 − p2)². The first-order condition, r − c′(x) − p2 = 0, characterizes the best response in the second stage, x(p2). In the announcement stage, player 1 solves max_{p1} rx − c(x) − p2x − α(p1 − p2)². The first-order condition is

(2)  ∂π1/∂p1 = −2α(p1 − p2) = 0,

which yields the best response function for player 1 as p1 = p2. Player 2 simultaneously solves max_{p2} p1x(p2) − e(x(p2)) − β(p1 − p2)². The first-order condition is

(3)  ∂π2/∂p2 = p1x′(p2) − e′(x)x′(p2) + 2β(p1 − p2) = 0,

which characterizes player 2's best response function.

The unique subgame-perfect equilibrium has p1 = p2 = p*, where p* is the Pigovian tax which induces the efficient quantity, x*. As the equilibrium does not depend on the value of β, it holds for the original version, where β = 0. When β is set appropriately, however, the generalized version is a supermodular mechanism. The following proposition characterizes the necessary and sufficient condition for supermodularity.

PROPOSITION 1 (Cheng, 1998): The generalized version of the compensation mechanism is supermodular if and only if α > 0 and β ≥ −½x′(p2).

The proof is simple. First, as the strategy space is one-dimensional, the supermodularity condition is automatically satisfied. Second, we use Theorem 1 to check for increasing differences. For player 1, from equation (2), we have ∂²π1/∂p1∂p2 = 2α > 0, while for player 2, from equation (3), we have ∂²π2/∂p1∂p2 = x′(p2) + 2β. Therefore, ∂²π2/∂p1∂p2 ≥ 0 if and only if β ≥ −½x′(p2).
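To make the proof concrete, the cross-partial condition can also be checked numerically. The sketch below is ours, not the authors' code; it uses the quadratic specification introduced below, under which x(p2) = (r − p2)/2c, so that the condition β ≥ −½x′(p2) becomes β ≥ 1/4c:

```python
# Minimal numerical check of Proposition 1 (our illustration, not from the paper).
# With quadratic cost c(x) = c*x^2, the second-stage best response is
# x(p2) = (r - p2)/(2c), so the supermodularity threshold is beta >= 1/(4c).

def pi2(p1, p2, r=24.0, c=1/80, e=1/40, beta=20.0):
    """Player 2's reduced-form payoff after substituting x(p2)."""
    x = max(0.0, (r - p2) / (2 * c))
    return p1 * x - e * x**2 - beta * (p1 - p2) ** 2

def cross_partial(f, p1, p2, h=1e-4):
    """Finite-difference estimate of d^2 f / (dp1 dp2)."""
    return (f(p1 + h, p2 + h) - f(p1 + h, p2 - h)
            - f(p1 - h, p2 + h) + f(p1 - h, p2 - h)) / (4 * h * h)

c = 1 / 80
for beta in (0.0, 18.0, 20.0, 40.0):   # the beta values used in the experiment
    d = cross_partial(lambda a, b: pi2(a, b, beta=beta), p1=10.0, p2=10.0)
    label = ">= 0, supermodular" if beta >= 1 / (4 * c) else "< 0, not supermodular"
    print(f"beta={beta:4.0f}: d2pi2/dp1dp2 = {d:+.3f} ({label})")
```

The cross partial equals x′(p2) + 2β = −40 + 2β for these parameters, so the sign flips exactly at β = 20, matching part (iv) of Proposition 2 below.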

To obtain analytical solutions, we use a quadratic cost function, c(x) = cx² with c > 0, and a quadratic externality function, e(x) = ex² with e > 0. We now summarize the best response functions, equilibrium solutions, and stability analysis in this environment.

PROPOSITION 2: Quadratic cost and externality functions yield the following characterizations:

(i) The best response functions for players 1 and 2 are

(4)  p1 = p2;

(5)  p2 = (er/4c²)/(e/4c² + β) + [(β − 1/4c)/(e/4c² + β)] p1;

and

     x = max{0, (r − p2)/2c}.

(ii) The subgame-perfect Nash equilibrium is characterized as (p1*, p2*, x*) = (er/(e + c), er/(e + c), r/[2(e + c)]).

(iii) If players follow Cournot best-reply dynamics, (p1*, p2*) is a globally asymptotically stable equilibrium of the continuous-time dynamical system for any α > 0 and β ≥ 0.

(iv) The game is supermodular if and only if α > 0 and β ≥ 1/4c.

PROOF: See Appendix.

The best-response functions presented in part (i) of Proposition 2 reveal interesting incentives. While player 1 has an incentive to always match player 2's price, player 2 has an incentive to match only when player 1 plays the equilibrium strategy. Also, at the threshold for strategic complementarity, β = 1/4c, player 2 has a dominant strategy, p2 = er/(e + c) = p2*. Part (iii) of Proposition 2 extends Cheng (1998), who shows that the original version of the compensation mechanism (β = 0) is globally stable under continuous-time Cournot best-reply dynamics.4 Cournot best reply is, however, a relatively poor description of human learning (Richard Boylan and Mahmoud El-Gamal, 1993). Therefore, global stability under Cournot best reply for any β ≥ 0 does not imply equilibrium convergence among human subjects. Part (iv) characterizes the threshold for strategic complementarity in our experiment, a more robust stability criterion than that characterized by part (iii).

4 Cheng (1998) also shows that the original mechanism is locally stable under discrete-time Cournot best-reply dynamics.

An intuition for how strategic complementarities affect the outcome of adaptive learning can be gained from analysis of the best response functions, equations (4) and (5). While player 1's best response function is always upward sloping, player 2's best response function, equation (5), is nondecreasing if and only if β ≥ 1/4c, i.e., when the game is supermodular. Beyond the threshold for strategic complementarity, both best-response functions are upward sloping and they intersect at the equilibrium. It is easy to verify graphically that adaptive learners, e.g., Cournot best reply, will converge to the equilibrium regardless of where they start.

To examine how α, which is unrelated to strategic complementarity, might affect behavior, we observe that when player 1 deviates from the best response by ε, i.e., p1 = p2 + ε, his profit loss is Δπ1 = −αε². This profit loss, which is proportional to the magnitude of α, is the deviation cost for player 1. Based on previous experimental evidence, the incentive to deviate from the best response decreases when the deviation cost increases. Chen and Plott (1996) call this the General Incentive Hypothesis, i.e., the error of game-theoretic models decreases as the incentive to best-respond increases. We therefore expect that an increase in α improves player 1's convergence to equilibrium. When player 1 plays the equilibrium strategy, player 2's best response is to play the equilibrium strategy as well. Therefore, we expect that an increase in α might improve player 2's convergence to equilibrium as well. It is not clear, however, whether the α-effects systematically change the effects of the supermodularity parameter β. We rely on experimental data to test the interaction of the α-effects with the β-effects.

II. Experimental Design

Our experimental design reflects both theoretical and technical considerations. Specifically, we chose an environment that allows significant latitude in varying the free parameters to better assess the performance of the compensation mechanism around the threshold of strategic complementarity. We describe this environment and the experimental procedures below.

A. The Economic Environment

We use the general payoff functions presented in equation (1) with quadratic cost and externality functions to obtain analytical solutions: c(x) = cx² and e(x) = ex². We use the following parameters: c = 1/80, e = 1/40, and r = 24. From Proposition 2, the subgame-perfect Nash equilibrium is (p1*, p2*, x*) = (16, 16, 320) and the threshold for strategic complementarity is β = 1/4c = 20.
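As a quick arithmetic check (our sketch, not part of the original paper), the equilibrium and threshold follow directly from Proposition 2:

```python
# Reproduce the equilibrium and threshold implied by Proposition 2
# for the experimental parameters (our illustration).
c, e, r = 1 / 80, 1 / 40, 24

p_star = e * r / (e + c)          # equilibrium price for both players
x_star = r / (2 * (e + c))        # efficient quantity
beta_threshold = 1 / (4 * c)      # supermodularity threshold

print(p_star, x_star, beta_threshold)   # -> 16.0 320.0 20.0
```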

In the experiment, each player chooses pi ∈ {0, 1, ..., 40}. Without the compensation mechanism, the profit-maximizing production level is x = r/2c = 960, three times higher than the efficient level. To reduce the complexity of player 1's problem, we use a grid size of 10 for the quantity and truncate the strategy space, i.e., player 1 chooses X ∈ {0, 1, ..., 50}, where X = x/10. The payoff functions presented to the subjects are adjusted accordingly. Truncating the strategy space to X ≤ 50 also reduces the possibility of player 2's bankruptcy. To reduce payoff asymmetry, we give player 1 a lump-sum payment of 250 points each round. Therefore, the equilibrium payoffs under the mechanism for the two players are π1 = 1530 and π2 = 2560.

The functional forms and specific parameter values are chosen for several reasons. First, equilibrium solutions are integers. Second, equilibrium prices and quantities do not lie in the center of the strategy space, thus avoiding equilibrium convergence as a result of focal points. Third, there is a salient gap between efficiency with and without the mechanism. Without the mechanism, the profit-maximizing production level is X = 50, resulting in an efficiency level of 68.4 percent.5 With the mechanism, the system achieves 100-percent efficiency in equilibrium. Finally, since the threshold for strategic complementarity is β = 20, there is a large number of integer β values to choose from both below and above the threshold.

5 We compute the efficiency level by using the ratio of total earnings without the mechanism and total earnings with the mechanism. Without the mechanism, at the production level of X = 50 (or x = 500), total earnings are π1 + π2 = (rx − cx²) − ex² = 2625. Therefore, the efficiency level is 2625/(1280 + 2560) = 0.684.

To study how strategic complementarity affects equilibrium convergence, we keep α = 20 and vary β = 0, 18, 20, and 40. To study whether α affects convergence, we also set α = 10 and vary β = 0, 20.
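Part (iii) of Proposition 2 can be illustrated by iterating the best-response functions (4) and (5) from an arbitrary starting point. The sketch below is ours and uses a discrete-time stand-in for the dynamics (Proposition 2(iii) concerns the continuous-time system; Cheng (1998) establishes local stability for the discrete-time case):

```python
# Iterate the best-response functions (4) and (5) from an arbitrary start
# (our sketch of discrete-time Cournot best-reply dynamics, cf. Proposition 2(iii)).
c, e, r = 1 / 80, 1 / 40, 24

def best_reply(p1, p2, beta):
    den = e / (4 * c * c) + beta
    br1 = p2                                                        # equation (4)
    br2 = (e * r / (4 * c * c) + (beta - 1 / (4 * c)) * p1) / den   # equation (5)
    return br1, br2

for beta in (0, 18, 20, 40):
    p1, p2 = 40.0, 0.0            # start at opposite corners of the price grid
    for t in range(60):           # one "round" per iteration, as in the experiment
        p1, p2 = best_reply(p1, p2, beta)
    print(f"beta={beta:2d}: after 60 rounds (p1, p2) = ({p1:.2f}, {p2:.2f})")
```

All four treatments converge to (16, 16) under this mechanical dynamic; at β = 20 player 2 jumps to the dominant strategy immediately. Whether human subjects behave this way is precisely the empirical question.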

B. Experimental Procedures

Our experiment involves 12 players per session—six player 1s (called Red players in the instructions) and six player 2s (Blue players). Each player remains the same type throughout the experiment. At the beginning of each session, subjects randomly draw a PC terminal number. Each then sits in front of the corresponding terminal and is given printed instructions. After the instructions are read aloud, subjects are encouraged to ask questions. The instruction period lasts between 15 and 30 minutes.

Each round, a player 1 is randomly matched with a player 2. Subjects are randomly rematched each round to minimize repeated-game effects. The random rematching protocol also minimizes the possibility that players collude on a high-subsidy and low-tax outcome.6 Each session consists of 60 rounds. As we are interested in learning, there are no practice rounds. Each round consists of two stages:

(i) Announcement Stage: each player simultaneously and independently chooses a price, pi ∈ {0, 1, ..., 40};
(ii) Production Stage: after (p1, p2) are chosen, player 1's computer displays player 2's price and a payoff table showing her payoff for each X ∈ {0, 1, ..., 50}. Player 1 then chooses a quantity, X. The server calculates payoffs and sends each player his payoff, the quantity chosen, and the prices submitted by him and his match.

6 If the players could commit to maximizing joint profits by choosing p1 and p2 cooperatively and x noncooperatively, they would choose, by equation (1), (p1, p2, x) = (4(α + β)er/[4(α + β)(e + c) − 1], r(4e(α + β) − 1)/[4(α + β)(e + c) − 1], 2r(α + β)/[4(α + β)(e + c) − 1]). With our choice of parameters, we get p1 = 24 and p2 = 12 for α = 20 and β = 0; p1 = 19.2 and p2 = 14.4 for α = 20 and β = 20; and p1 = 18 and p2 = 15 for α = 20 and β = 40; etc. If, in addition, they could choose (p1, p2, x) cooperatively, then for our parameter values we get (p1 − p2, x) = (min{480/[3(α + β) − 20], 40}, min{960(α + β)/[3(α + β) − 20], 50}).

To summarize, each subject knows both payoff functions, the choices made each round by himself and his match, as well as his per-period and cumulative payoffs. At any point, subjects have ready access to all of this information. The mechanism is thus implemented as a game of complete information. We do not know, however, how subjects processed this information, nor do we know their beliefs about the rationality of others.
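For concreteness, the production-stage payoff table displayed to player 1 can be reconstructed from equation (1), the grid X = x/10, and the 250-point lump-sum payment. This is our sketch, not the zTree code used in the lab:

```python
# Reconstruct player 1's production-stage payoff table (our sketch based on
# equation (1), the grid X = x/10, and the 250-point lump-sum payment).
c, e, r, alpha = 1 / 80, 1 / 40, 24, 20

def player1_payoff(X, p1, p2):
    x = 10 * X                                   # X in {0, ..., 50} maps to x = 10X
    return r * x - c * x**2 - p2 * x - alpha * (p1 - p2) ** 2 + 250

p1, p2 = 16, 16                                  # the announced prices this round
table = {X: player1_payoff(X, p1, p2) for X in range(51)}
best_X = max(table, key=table.get)
print(best_X, table[best_X])                     # -> 32 1530.0 at equilibrium prices
```

At the equilibrium prices the table peaks at X = 32 with 1,530 points, matching the equilibrium payoff reported above.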

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

β     α = 10                 α = 20                 Supermodular   Equilibrium (p1*, p2*, X*)
0     α10β00 (4 sessions)    α20β00 (5 sessions)    No             (16, 16, 32)
18    —                      α20β18 (4 sessions)    No             (16, 16, 32)
20    α10β20 (5 sessions)    α20β20 (5 sessions)    Yes            (16, 16, 32)
40    —                      α20β40 (4 sessions)    Yes            (16, 16, 32)

Both of these factors introduce uncertainty into the environment.

Table 1 presents features of the experimental sessions, including parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used zTree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one and a half hours. The exchange rate for all treatments was one dollar for 4,250 points. Average earnings were $22.82. Data are available from the authors upon request.

III. Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: the level and the speed.7 In theory, convergence implies that all players play the stage-game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

7 We thank anonymous referees for suggesting this separation and appropriate measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, Lb(t1, t2), measures the average proportion of Nash equilibrium play between rounds t1 and t2, i.e., Lb(t1, t2) = Σ_{t=t1}^{t2} L(t)/(t2 − t1 + 1), where 0 ≤ t1 ≤ t2 ≤ T and T is the total number of rounds.

We define the level of convergence for both a round and a block of rounds. The block convergence measure smooths out inter-round variation.

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition of the speed of convergence.

DEFINITION 2: For a given level of convergence, L* ∈ (0, 1], the speed of convergence is measured by the first round in which the level of convergence reaches L* and does not subsequently drop below this level, i.e., τ such that L(t) ≥ L* for any t ≥ τ.

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in the experimental data, Definition 2 is not practical there. We instead use the following observation, which enables us to measure convergence speed using estimates of the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let Ly(t) be the convergence level of treatment y at time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.
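Definitions 1 and 2 translate directly into code. The following is our sketch; `level` is a hypothetical per-round series L(1), ..., L(T), not experimental data:

```python
# Convergence measures from Definitions 1 and 2 (our sketch; `level` is a
# hypothetical sequence with level[t-1] = L(t), the proportion of Nash play in round t).

def block_level(level, t1, t2):
    """L_b(t1, t2): average proportion of equilibrium play in rounds t1..t2."""
    return sum(level[t1 - 1:t2]) / (t2 - t1 + 1)

def speed(level, L_star):
    """First round tau with L(t) >= L* for all t >= tau (Definition 2); None if never."""
    tau = None
    for t, L in enumerate(level, start=1):
        if L >= L_star:
            if tau is None:
                tau = t
        else:
            tau = None
    return tau

level = [0.1, 0.3, 0.5, 0.4, 0.6, 0.7, 0.7]      # toy data, 7 rounds
print(block_level(level, 4, 7))                  # -> 0.6
print(speed(level, 0.5))                         # -> 5 (the dip in round 4 resets it)
```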

OBSERVATION 1: If L1(1) ≥ L2(1) and ΔL1(t) > ΔL2(t) for all t ∈ [1, T − 1], then, given any L* ∈ (0, 1], the first treatment converges more quickly than the second treatment.

Based on the theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with those near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance, convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV. Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and the speed of convergence to the subgame-perfect equilibrium and to near equilibrium in each of the six treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2, and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that, in all of our analyses, the results for quantities largely mirror the results for player 2's price.8

8 Tables are available from the authors upon request.

Recall that the subgame-perfect Nash equilibrium for a stage game is (p1*, p2*, X*) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from the equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check for ε-equilibrium play by looking at the proportion of price announcements within ±1 of the equilibrium price, and quantity announcements within ±4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. The ε-equilibrium prediction is therefore (ε-p1*, ε-p2*, ε-x*) = ({15, 16, 17}, {15, 16, 17}, {28, ..., 32, ..., 36}).

Figures 1 and 2 contain box-and-whiskers plots of the prices in each treatment for all 60 rounds by players 1 and 2, respectively. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β00 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence (Lb(41, 60)) to subgame-perfect Nash equilibrium (panels A and B) and to ε-Nash equilibrium (panels C and D) for each session under each of the six treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column [8]) are the same.

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test. (See, e.g., Sidney Siegel and N. John Castellan, 1988.) The idea is simple and intuitive: by pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as extreme as the one observed when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample, and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.

While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects). Parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the level of convergence in p1*, p2*, and ε-p2*;10
(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly;
(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly;
(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2* and significantly increases the level of convergence in ε-p2*, but has no significant effect on p1* or ε-p1*;
(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1*, but not in ε-p1*, p2*, or ε-p2*.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.
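The procedure described in footnote 9 is compact enough to state in code. The following is our sketch of the one-tailed exact test; the example data are the session-level p2* convergence levels of α20β00 and α20β18 from panel B of Table 2:

```python
# One-tailed exact permutation (Fisher randomization) test, as described in
# footnote 9 (our sketch). H1: treatment A has a lower mean than treatment B.
from itertools import combinations

def permutation_pvalue(a, b):
    pooled = a + b
    observed = sum(a) / len(a) - sum(b) / len(b)
    count = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        ga = [pooled[i] for i in idx]
        gb = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = sum(ga) / len(ga) - sum(gb) / len(gb)
        total += 1
        if diff <= observed:          # separations at least as extreme as observed
            count += 1
    return count / total

a = [0.067, 0.083, 0.033, 0.075, 0.050]   # alpha20beta00 sessions (Table 2, panel B)
b = [0.133, 0.292, 0.108, 0.258]          # alpha20beta18 sessions (Table 2, panel B)
print(permutation_pvalue(a, b))           # -> 1/126, i.e., 0.008 as in Table 2
```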

[FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1. Six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 1's announced price (0 to 40) by round (1 to 60).]

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result to show that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20), the improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold, 20, to 40. Therefore, the marginal returns to being "more supermodular" diminish once the payoffs become supermodular.

[FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2. Six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 2's announced price (0 to 40) by round (1 to 60).]

Proposition 2 predicts convergence under Cournot best reply for any β ≥ 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a).

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A: Proportion of player 1 Nash equilibrium price (p1*)

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.000       0.008       0.158       0.008       —           0.044     α10β00 < α20β00   0.397
α20β00      0.000       0.083       0.083       0.042       0.058       0.053     α20β00 < α20β20   0.008***
α20β18      0.125       0.117       0.100       0.192       —           0.133     α20β00 < α20β18   0.008***
α10β20      0.242       0.067       0.067       0.067       0.208       0.130     α10β20 < α20β20   0.064*
α20β20      0.325       0.267       0.208       0.058       0.367       0.245     α20β18 < α20β20   0.064*
α20β40      0.175       0.400       0.158       0.133       —           0.217     α20β20 < α20β40   0.643

Panel B: Proportion of player 2 Nash equilibrium price (p2*)

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.025       0.042       0.025       0.067       —           0.040     α10β00 < α20β00   0.056*
α20β00      0.067       0.083       0.033       0.075       0.050       0.062     α20β00 < α20β20   0.032**
α20β18      0.133       0.292       0.108       0.258       —           0.198     α20β00 < α20β18   0.008***
α10β20      0.392       0.108       0.175       0.067       0.667       0.282     α10β20 < α20β20   0.504
α20β20      0.308       0.117       0.483       0.017       0.450       0.275     α20β18 < α20β20   0.238
α20β40      0.325       0.492       0.233       0.233       —           0.321     α20β20 < α20β40   0.365

Panel C: Proportion of player 1 ε-Nash equilibrium price (ε-p1*)

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.108       0.200       0.350       0.158       —           0.204     α10β00 < α20β00   0.183
α20β00      0.067       0.458       0.358       0.183       0.425       0.298     α20β00 < α20β20   0.087*
α20β18      0.317       0.625       0.300       0.583       —           0.456     α20β00 < α20β18   0.103
α10β20      0.475       0.200       0.300       0.375       0.492       0.368     α10β20 < α20β20   0.194
α20β20      0.450       0.558       0.475       0.133       0.775       0.478     α20β18 < α20β20   0.444
α20β40      0.458       0.492       0.375       0.442       —           0.442     α20β20 < α20β40   0.643

Panel D: Proportion of player 2 ε-Nash equilibrium price (ε-p2*)

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.133       0.200       0.242       0.225       —           0.200     α10β00 < α20β00   0.016**
α20β00      0.300       0.400       0.200       0.275       0.333       0.302     α20β00 < α20β20   0.016**
α20β18      0.717       0.667       0.342       0.700       —           0.606     α20β00 < α20β18   0.016**
α10β20      0.650       0.517       0.542       0.400       0.892       0.600     α10β20 < α20β20   0.564
α20β20      0.533       0.592       0.642       0.225       0.900       0.578     α20β18 < α20β20   0.556
α20β40      0.825       0.792       0.467       0.517       —           0.650     α20β20 < α20β40   0.294

Note: Significant at: * 10-percent level; ** 5-percent level; *** 1-percent level.

Recall from equation (5) and the subsequent discussion that, at the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1* in specifications (1) and (3), and ε-p2* in specifications (2) and (4). In specifications (1) and (2), the independent variables are: treatment dummies, Dy, where y = α10β00, α20β00, β18, α10β20, and β40; ln(Round); and a constant. We use dummies for different values of α and β rather than direct parameter values to avoid assuming a linear relationship of parameter effects, and we omit the dummy for α20β20. Therefore, in these two specifications, restricting learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in the convergence level between treatment y and α20β20.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                         (1) ε-Nash price 1   (2) ε-Nash price 2   (3) ε-Nash price 1   (4) ε-Nash price 2
D_α10β00                 −0.143 (0.048)***    −0.268 (0.050)***
D_α20β00                 −0.090 (0.060)       −0.217 (0.057)***
D_β18                    −0.005 (0.071)        0.004 (0.069)
D_α10β20                 −0.080 (0.060)        0.025 (0.073)
D_β40                     0.000 (0.068)        0.078 (0.078)
ln(Round)                 0.115 (0.014)***     0.167 (0.017)***     0.131 (0.021)***     0.183 (0.022)***
D_α10β00 ln(Round)                                                 −0.050 (0.019)***    −0.095 (0.022)***
D_α20β00 ln(Round)                                                 −0.031 (0.020)       −0.073 (0.021)***
D_β18 ln(Round)                                                    −0.001 (0.020)        0.004 (0.021)
D_α10β20 ln(Round)                                                 −0.024 (0.019)        0.008 (0.022)
D_β40 ln(Round)                                                    −0.002 (0.019)        0.023 (0.024)
Observations              9,720                9,720                9,720                9,720
Number of groups          162                  162                  162                  162
Log pseudo-likelihood     −5562.890            −5824.055            −5557.424            −5793.013

Wald tests (null hypotheses and χ²(1) statistics):
D_α10β00 = D_α20β00: 1.50 in (1), 1.31 in (2)
D_α10β00 ln(Round) = D_α20β00 ln(Round): 1.33 in (3), 1.23 in (4)
D_α10β20 − D_α10β00 = −D_α20β00: 0.05 in (1), 1.07 in (2)
D_α10β20 ln(Round) − D_α10β00 ln(Round) = −D_α20β00 ln(Round): 0.05 in (3), 1.03 in (4)

Notes: Coefficients are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. Significant at: * 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. The excluded dummy is D_α20β20. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.

Results from these specifications are largely consistent with Result 1, which uses a more conservative test. The coefficients on ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form, ln(Round), which yields a better log-likelihood than either a linear or a quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each treatment dummy with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient on ln(Round), the coefficient on the interaction term, Dy ln(Round), captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), ΔL(t), between different treatments. From the probit specification, L(t) = Φ(constant + Db ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive the slope of the probability function for treatment y with respect to t, ΔLy(t) = Φ′(constant + Db ln(t)) by/t.

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1* and 0.3383 for ε-p2*.
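A sketch of this slope calculation (ours; the coefficient values below are hypothetical placeholders, not the estimates in Table 3):

```python
# Slope of the convergence probability implied by the probit specification
# L(t) = Phi(const + b_y * ln t) (our sketch; coefficient values are hypothetical).
from math import exp, log, pi, sqrt

def phi(z):
    """Standard normal density, i.e., Phi'(z)."""
    return exp(-z * z / 2) / sqrt(2 * pi)

def slope(const, b_y, t):
    """dL/dt = Phi'(const + b_y * ln t) * b_y / t."""
    return phi(const + b_y * log(t)) * b_y / t

const, b_y = -1.0, 0.4                        # hypothetical estimates
for t in (1, 5, 20, 60):
    print(t, round(slope(const, b_y, t), 4))  # learning is fastest in early rounds
```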

All coefficients reported in Table 3 are increments of probability derivatives, Φ′(constant + Db)by, compared to the baseline case of α20β20.

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1* and ε-p2*;
(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed;
(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed;
(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed;
(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report the analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test examines whether the coefficient on D_α10β00 ln(Round) equals that on D_α20β00 ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between β18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start from and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1* convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α20 supermodular treatments. We use Round, ln(Round), and their interactions with α20β18 as independent variables to allow different convergence speeds toward the same level of convergence. In model (1), the dependent variable is player 1's ε-equilibrium play. There is no significant difference between α20β18 and the α20 supermodular treatments. In model (2), the dependent variable is player 2's ε-equilibrium play. Here there are significant differences between the near-supermodular and the α20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D_α20β18 ln(Round) implies a lower probability of early-round equilibrium play in α20β18 relative to the supermodular treatments. The β18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D_α20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

TABLE 4—CONVERGENCE SPEED: PROBIT MODELS WITH INDIVIDUAL-LEVEL CLUSTERING COMPARING α20β18 WITH THE α20 SUPERMODULAR TREATMENTS ACHIEVING SIMILAR CONVERGENCE LEVELS

Dependent variable: ε-Nash equilibrium play

                       (1) ε-Nash price 1   (2) ε-Nash price 2
ln(Round)              0.054 (0.040)        0.216 (0.043)***
Round                  0.004 (0.002)*       −0.001 (0.002)
D_α20β18 ln(Round)     −0.013 (0.032)       −0.066 (0.035)*
D_α20β18 Round         0.001 (0.002)        0.006 (0.003)**
Observations           4,680                4,680
Log pseudo-likelihood  −2879.07             −2993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. Significant at: * 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are D_α20β20 and D_α20β40.

TABLE 5—EFFICIENCY MEASURE

Efficiency: Last 20 rounds

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.600       0.671       0.804       0.858       —           0.733     α10β00 < α20β00   0.087*
α20β00      0.650       0.870       0.907       0.873       0.825       0.825     α20β00 < α20β20   0.095*
α20β18      0.963       0.952       0.889       0.959       —           0.941     α20β00 < α20β18   0.016**
α10β20      0.958       0.919       0.905       0.881       0.984       0.929     α10β20 < α20β20   0.714
α20β20      0.888       0.937       0.939       0.780       0.971       0.903     α20β18 < α20β20   0.833
α20β40      0.973       0.962       0.943       0.949       —           0.957     α20β20 < α20β40   0.064*

Note: Significant at: * 10-percent level; ** 5-percent level; *** 1-percent level.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α = 10, 20 and β = 0, 20, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture the welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [rx(t) − c(x(t)) − e(x(t))] / [rx* − c(x*) − e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 ≤ t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = Σ_{t=t1}^{t2} e(t)/(t2 − t1 + 1).

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency;
(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency;
(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency;
(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency;
(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.
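Both welfare measures are straightforward to compute. The sketch below (ours) reproduces the 68.4-percent benchmark of footnote 5 and computes block efficiency for a hypothetical quantity path `xs`:

```python
# Per-round and block efficiency from the definitions above (our sketch;
# `xs` is a hypothetical sequence of quantities x(t) chosen in rounds 1..T).
c, e, r = 1 / 80, 1 / 40, 24
x_star = r / (2 * (e + c))                       # efficient quantity, 320

def surplus(x):
    return r * x - c * x**2 - e * x**2

def efficiency(x):
    return surplus(x) / surplus(x_star)          # e(t) for a round with quantity x

def block_efficiency(xs, t1, t2):
    return sum(efficiency(x) for x in xs[t1 - 1:t2]) / (t2 - t1 + 1)

xs = [500, 400, 350, 320, 320]                   # toy path converging to x* = 320
print(round(efficiency(500), 3))                 # -> 0.684, as in footnote 5
print(round(block_efficiency(xs, 1, 5), 3))      # -> 0.922
```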
Figures 1 and 2 show supermodular mechanisms induce a signifi- that convergence continues to improve in later cantly higher proportion of equilibrium play rounds for several treatments, suggesting con- than mechanisms far from the supermodular tinued dynamics. Due to time, attention, and threshold. The new finding is that increasing ␤ resource constraints, it was not feasible to run from the threshold 20 to 40 weakly improves the experiment for much longer than 60 efficiency. rounds. Therefore, we rely on simulations to Second, the budget surplus at round t is the study continued convergence beyond 60 sum of the penalty terms, plus tax, minus sub- rounds. ϭ ␣ ϩ ␤ Ϫ sidy in that round, i.e., s(t) ( )(p1(t) To do so, we look for a learning algorithm 2 ϩ Ϫ p2(t)) p2(t)x(t) p1(t)x(t). Examining the which, when calibrated, closely approximates session-level budget surplus across all rounds, the observed dynamic paths over 60 rounds. we find the following results. First, 22 out of 27 The large empirical literature on learning in sessions result in a budget surplus. This comes games (see, e.g., Camerer, 2003, for a survey) from a combination of our choice of parameters suggests many models. Our interest here is not and the dynamics of play. Second, the only to compare the performance of various learning significant difference across treatments is that models. We look for learning models that match budget surpluses under ␣20␤18 and ␣20␤20 two criteria. First, we require a model that per- are significantly lower than those under forms well in a variety of experimental games. ␣20␤40 (p-values ϭ 0.029 and 0.024 respec- Second, given our experiment has complete in- tively). This finding points to a potential cost formation about the payoff structure, the model of driving up the punishment parameter, ␤.In needs to incorporate this information. In the other words, before the system equilibrates, a following subsections, we first introduce three high punishment parameter can worsen the learning models which meet our criteria. We budget imbalance problem inherent in the then look at how well the learning models pre- mechanism. dict the experimental data by calibrating each Overall, Results 1 through 3 suggest the fol- algorithm on a subset of the experimental data lowing observations regarding games with stra- and validating the models on a hold-out sample. tegic complementarities. First, in terms of We then report the forecasting results using one convergence level and speed, supermodular and of the calibrated algorithms. near-supermodular games perform significantly better than those far under the threshold. Sec- A. Three Learning Models ond, while there is no significant difference between supermodular and near-supermodular The models we examine are stochastic ficti- games in terms of convergence level, early per- tious play with discounting (hereafter shortened formance differences may lead to supermodular as sFP) (Yin-Wong Cheung and Daniel Fried- treatments reaching certain convergence levels man, 1997; Fudenberg and Levine, 1998), func- more quickly. Third, beyond the threshold, in- tional EWA (fEWA) (Teck-Hua Ho et al., creasing ␤ has no significant effect on either the 2001) and the payoff assessment learning model level or the speed of convergence, but has the (PA) (Rajiv Sarin and Farshid Vahid, 1999). disadvantage of risking higher budget imbal- We now give a brief overview of each model. ance. 
Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level, but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment provides complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model; interested readers are referred to the original papers for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p_{-i}(t+1), according to

(6)   p_{-i}(t+1) = \frac{p_{-i}(t) + \sum_{\tau=1}^{t-1} r^{\tau}\, p_{-i}(t-\tau)}{1 + \sum_{\tau=1}^{t-1} r^{\tau}}

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p_{-i}(t+1) = p_{-i}(t), while r = 1 yields the standard fictitious play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by:

(7)   \Pr(p_i \mid p_{-i}) = \frac{\exp(\lambda\, \pi(p_i, p_{-i}))}{\sum_{j=0}^{40} \exp(\lambda\, \pi(p_j, p_{-i}))}.

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p_{-i} increases.
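The two pieces of sFP, the discounted belief of equation (6) and the logit choice of equation (7), are compact enough to sketch directly. The payoff function pi and the 41-point price grid {0, ..., 40} are assumptions beyond the equations themselves:

```python
import math

# Sketch of logistic fictitious play. `history` lists the opponent's past
# prices (oldest first); `pi(p, p_other)` is an assumed payoff function.

def predicted_price(history, r):
    """Equation (6): discounted average of past opponent prices."""
    weights = [r ** tau for tau in range(len(history))]  # tau = 0 weights the latest round
    latest_first = list(reversed(history))
    return sum(w * p for w, p in zip(weights, latest_first)) / sum(weights)

def choice_probabilities(pi, p_hat, lam, prices=range(41)):
    """Equation (7): logit response to the predicted price p_hat."""
    payoffs = [pi(p, p_hat) for p in prices]
    peak = max(payoffs)                                  # shift for numerical stability
    expw = [math.exp(lam * (u - peak)) for u in payoffs]
    total = sum(expw)
    return [w / total for w in expw]
```

Setting r = 0 makes predicted_price return the opponent's latest price (Cournot best reply), while r = 1 returns the unweighted average (standard fictitious play), mirroring the special cases noted above.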

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with the actual payoffs π(p_i, p_{-i}) replaced by strategy attractions. In both variants, the strategy attraction, A_i^j(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)·N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)   A_i^j(t) = \frac{\phi N(t-1)\, A_i^j(t-1) + [\delta + (1-\delta)\, I(p_i^j, p_i(t))]\, \pi(p_i^j, p_{-i}(t))}{N(t)}

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function, φ_i(t), is given by

(9)   \phi_i(t) = 1 - 0.5 \sum_{j=1}^{m_{-i}} \left[ I(p_{-i}^j, p_{-i}(t)) - \frac{\sum_{\tau=1}^{t} I(p_{-i}^j, p_{-i}(\tau))}{t} \right]^2.

This function will be close to 1 when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = φ = 1, κ = 0), and Cournot best reply (φ = κ = 1, δ = 1).
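One round of fEWA updating can be sketched under the same assumptions as the sFP sketch (an assumed payoff function pi and the 0-40 price grid). The change detector implements equation (9); the attraction update implements equation (8):

```python
# Sketch of fEWA updating. `opp_history` lists observed opponent prices
# (oldest first); `pi(p, p_opp)` is an assumed payoff function.

def change_detector(opp_history, prices=range(41)):
    """Equation (9): phi(t) is near 1 when the latest play resembles history."""
    t = len(opp_history)
    deviation = 0.0
    for p in prices:
        latest = 1.0 if opp_history[-1] == p else 0.0
        frequency = sum(1 for q in opp_history if q == p) / t
        deviation += (latest - frequency) ** 2
    return 1.0 - 0.5 * deviation

def update_attractions(A, N_prev, own_price, opp_price, pi, phi, delta, kappa,
                       prices=range(41)):
    """Equation (8) plus the experience-weight update; in fEWA, delta = phi."""
    N = phi * (1 - kappa) * N_prev + 1.0
    updated = []
    for p, a in zip(prices, A):
        chosen = 1.0 if p == own_price else 0.0
        weight = delta + (1.0 - delta) * chosen   # foregone payoffs get weight delta
        updated.append((phi * N_prev * a + weight * pi(p, opp_price)) / N)
    return updated, N
```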

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, f_{jk}(h, t), to denote the similarity between the played strategy, k, and an unplayed strategy, j, at period t:

    f_{jk}(h, t) = \begin{cases} 1 - |j - k|/h, & \text{if } |j - k| < h \\ 0, & \text{otherwise.} \end{cases}

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_{jk}(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs, and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then:

(10)   u_j(t+1) = (1 - r f_{jk}(h, t))\, u_j(t) + r f_{jk}(h, t)\, \pi_k(t), \quad \forall j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, \tilde{u}_j(t) = u_j(t) + z_j(t). At time t he chooses strategy p_k if:

(11)   \tilde{u}_k(t) - \tilde{u}_j(t) > 0, \quad \forall p_j \neq p_k.

Note that mood shocks affect only his choices and not the manner in which assessments are updated.
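The PA rules are equally direct to sketch. The uniform mood shock on [−a, a] anticipates the calibration below; everything else follows equations (10) and (11):

```python
import random

# Sketch of payoff-assessment (PA) learning over strategies 0..40.

def bartlett(j, k, h):
    """Bartlett similarity between strategy j and the played strategy k."""
    return 1.0 - abs(j - k) / h if abs(j - k) < h else 0.0

def update_assessments(u, k_played, payoff, r, h):
    """Equation (10): pull assessments near the played strategy toward its payoff."""
    return [(1.0 - r * bartlett(j, k_played, h)) * u_j
            + r * bartlett(j, k_played, h) * payoff
            for j, u_j in enumerate(u)]

def choose_strategy(u, a):
    """Equation (11): maximize the mood-shock-distorted assessments."""
    shocked = [u_j + random.uniform(-a, a) for u_j in u]
    return max(range(len(u)), key=lambda j: shocked[j])
```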
B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach^14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data. Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round;
(ii) Simulated players select price announcements:
    (a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.^15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and a normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices;
    (b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA;
(iii) Simulated player 1's quantity choice is based on the following steps:
    (a) Determine each player 1's best response to p_2;
    (b) Determine whether player 1 will deviate from the best response via the actual probability of errors for each block;
    (c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response;
(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment;
(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA;
(vi) Proceed to the next round.

^14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.
^15 For example, the seven most frequently selected prices are divisible by 5.

The discount factor, r ∈ [0, 1], is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1 in the interval [1.5, 10.5] for fEWA, and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks, z, are drawn from a uniform distribution^16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.

^16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models where shocks were drawn from uniform, logistic, and double exponential distributions on two sets of experimental data and find that the performance of the three versions of PA was statistically indistinguishable.
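The grid search itself can be sketched in a few lines. Here simulate_paths() stands in for the Monte Carlo procedure in steps (i) through (vi) above, and the helper names are illustrative assumptions; the grids follow the search ranges just described:

```python
import itertools

# Illustrative grid search for the PA parameters (r, a, h) by minimizing the
# mean-squared deviation (MSD) between simulated and observed price paths.

def msd(simulated, observed):
    return sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)

def calibrate_pa(simulate_paths, observed_path):
    r_grid = [i / 10 for i in range(11)]       # r in [0, 1], step 0.1
    a_grid = range(0, 501, 50)                 # mood-shock bound, step 50
    h_grid = range(1, 11)                      # similarity window, step 1
    best_score, best_params = float("inf"), None
    for r, a, h in itertools.product(r_grid, a_grid, h_grid):
        score = msd(simulate_paths(r, a, h), observed_path)
        if score < best_score:
            best_score, best_params = score, (r, a, h)
    return best_params, best_score
```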

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r)              sFP                          PA
Block                  1     2     3     4         1     2     3     4
α10β00               1.0   0.7   0.9   1.0       0.1   0.0   0.9   1.0
α20β00               0.7   0.6   0.1   0.1       0.5   1.0   0.2   0.1
α20β18               1.0   1.0   0.9   0.9       0.2   1.0   0.3   0.2
α10β20               1.0   1.0   1.0   0.9       0.1   0.3   0.6   0.3
α20β20               1.0   1.0   1.0   0.9       0.2   0.1   0.5   0.2
α20β40               1.0   1.0   1.0   0.7       0.1   1.0   0.4   0.3

Sensitivity (λ)            sFP                        fEWA                 Shock interval (a), PA
Block               1     2      3      4      1     2     3     4        1     2     3     4
α10β00            1.5   4.4    4.5    2.8    1.6   4.5   4.4   3.7      500   500   250    50
α20β00            1.5   4.3   14.0   18.0    1.7   4.0   5.1   7.4       50   100   150    50
α20β18            1.6   8.0   11.5   18.3    1.5   5.3   6.2   9.5       50   300   500   150
α10β20            1.9   5.9    8.6   13.8    1.8   4.7   6.2   8.2      100   500   450   300
α20β20            2.8   5.6    8.1   11.2    2.4   3.8   6.1   6.7      200   300   350   350
α20β40            3.4   6.5   14.0   20.5    3.2   4.9   6.7   9.9      150    50   150    50

Similarity window (h), PA
Block               1     2     3     4
α10β00              1     1    10     9
α20β00             10    10     9     6
α20β18              5     4     6     1
α10β20              1     1     8     3
α20β20              2     2     7     1
α20β40              1     3     2     2

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.^17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β00 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0, indicating that players keep track of all past play when forming beliefs about opponents' next move.^18 Given these beliefs, the increasing sensitivity parameter, λ, for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibrated parameters of the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound of the interval from which shocks are drawn, a.
Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time^19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-ranks tests show that the MSD scores under PA are weakly lower than those under fEWA (z = −0.085, p-value = 0.068), and strictly lower than those under sFP (z = −0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

^17 We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.
^18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.
^19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play
Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40
1           0.955    0.934    0.940    0.930    0.888    0.940
2           0.960    0.946    0.890    0.917    0.973    0.878
3                    0.948             0.883    0.873
Overall     0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1           0.956    0.934    0.942    0.928    0.887    0.948
2           0.960    0.946    0.890    0.922    0.978    0.886
3                    0.946             0.879    0.871
Overall     0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1           0.952    0.933    0.914    0.941    0.888    0.916
2           0.957    0.944    0.899    0.922    0.917    0.898
3                    0.947             0.894    0.903
Overall     0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1           1.867    1.853    1.833    1.833    1.489    1.792
2           1.889    1.919    1.606    1.839    1.953    1.669
3                    1.917             1.447    1.539
Overall     1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall     0.976    0.976    0.976    0.976    0.976    0.976

Note: Session 3 entries appear only for the treatments with three hold-out sessions (α20β00, α10β20, and α20β20).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for fEWA in every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay in subsequent blocks the probability of deviation from the best response in the production stage.^20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.^21

^20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}, setting a lower bound of 10⁻⁸ to avoid the division-by-zero problem.
^21 Different sequences of random numbers produce slightly different parameter estimates. While the convergence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the ordinal selected.
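As a sketch, the two decay rules just described might look as follows. The deviation-probability rule follows footnote 20, while the exponential rate for the shock-interval decay is an illustrative assumption:

```python
# Post-58-round parameter rules used in the forecasting simulations (sketch).

def deviation_probability(pr4, k):
    """Footnote 20: quadratic decay of the deviation probability, floored at 1e-8."""
    return max(pr4 / (k - 4) ** 2, 1e-8) if k > 4 else pr4

def shock_bound(a4, k, rate=0.5):
    """Third specification: exponential decay of the mood-shock bound (assumed rate)."""
    return a4 * rate ** (k - 4) if k > 4 else a4
```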

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA

In presenting the results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following convergence rankings:

(i) p*1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;
(ii) ε-p*1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;
(iii) p*2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and
(iv) ε-p*2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

FIGURE 3. CONTINUED

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β00 treatments. In addition, an increase in α significantly improves convergence by a large margin (40 to 80 percent) for both players when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p*2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

[FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY. Two panels ("Player 1" and "Player 2") plot the percentage of ε-equilibrium play (0 to 100 percent) against simulation rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]

In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0 improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS (Values are for round 500 and based on 1,500 simulated games.)

Player 1 equilibrium price                                 Player 2 equilibrium price

Treatment   Probability   H1                 p-value       Probability   H1                 p-value
α10β00        0.073                                          0.082
α20β00        0.188       α20β00 > α10β00    0.000***        0.213       α20β00 > α10β00    0.000***
α20β18        0.265       α20β18 > α20β40    0.187           0.381       α20β18 > α10β20    0.264
α10β20        0.211       α10β20 > α20β00    0.000***        0.376       α10β20 > α20β20    0.001***
α20β20        0.243       α20β20 > α10β20    0.000***        0.353       α20β20 > α20β00    0.000***
α20β40        0.259       α20β40 > α20β20    0.007***        0.442       α20β40 > α20β18    0.000***

Player 1 ε-equilibrium price                               Player 2 ε-equilibrium price

Treatment   Probability   H1                 p-value       Probability   H1                 p-value
α10β00        0.216                                          0.232
α20β00        0.519       α20β00 > α10β00    0.000***        0.573       α20β00 > α10β00    0.000***
α20β18        0.713       α20β18 > α20β40    0.045**         0.870       α20β18 > α10β20    0.008***
α10β20        0.584       α10β20 > α20β00    0.000***        0.857       α10β20 > α20β20    0.000***
α20β20        0.662       α20β20 > α10β20    0.000***        0.834       α20β20 > α20β00    0.000***
α20β40        0.701       α20β40 > α20β20    0.000***        0.925       α20β40 > α20β18    0.000***

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA (The threshold is the percentage of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                      Player 1                          Player 2
Threshold     0.4   0.5   0.6   0.7        0.4   0.5   0.6   0.7   0.8   0.9
α20β18         54    83   126   316         38    52    62    87   127    —
α10β20        129   221    —     —          36    48    63    91   148    —
α20β20         61    91   149    —          39    51    61    84   137    —
α20β40        101   166   263   496         30    38    56    94   166   339

We now use definition 2 to examine the convergence speed in the long run. Table 9 presents the first round in which a treatment reaches a given level of convergence, L*. We omit the two β00 treatments since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L*. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L* are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L* = 90 percent.
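The Table 9 statistic, the first round after which play stays above a threshold for good, is easy to state precisely in code. A minimal sketch, with the per-round shares as an assumed input:

```python
# First round after which the share of epsilon-equilibrium play stays above
# `threshold` permanently; None mirrors the "—" entries in Table 9.

def first_round_over_for_good(shares, threshold):
    for t, share in enumerate(shares):
        if share >= threshold and all(s >= threshold for s in shares[t:]):
            return t + 1          # rounds are 1-indexed
    return None
```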

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure of Section IV. Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 in the β00 treatments.

[FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA. Top panel ("Simulation Data"): fraction of maximum welfare (75 to 100 percent) over rounds 50 to 450. Bottom panel ("Experimental Data"): fraction of maximum welfare (40 to 90 percent) over rounds 0 to 60. Both panels cover treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.^22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p_2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V and the matching hypothesis with a stochastic matching model.^23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.^24 We then compare the performance of the matching model with that of the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.
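The stochastic matching model is simple enough to state in a few lines. A sketch, with the price grid and function names as illustrative assumptions:

```python
import random

# Stochastic matching model: player 2 plays the anticipated player 1 price with
# probability 1 - eps and trembles uniformly over the other 40 prices otherwise.

def matching_choice(anticipated_price, eps, prices=range(41)):
    if random.random() < 1.0 - eps:
        return anticipated_price
    return random.choice([p for p in prices if p != anticipated_price])
```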
In one ing model, player 2 plays this price with prob- specification, we include the cost of matching as ability 1 Ϫ ␧, and one of the other 40 prices a regressor. If parameter changes make match- with probability ␧/40. We estimate ␧ using the ing more focal, then changes in the cost of first two sessions of each treatment.24 We then matching will not explain all of the changes in probability of matching.

^22 We thank an anonymous referee and the co-editor for pointing this out.
^23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.
^24 The estimated ε in each block are as follows: α10β00 = {1.0, .7, .6, .8}; α20β00 = {.7, .6, .5, .4}; α20β18 = {.7, .2, .3, .3}; α10β20 = {.5, .3, .3, .2}; α20β20 = {.1, 0, 0, 0}; α20β40 = {.2, 0, 0, .1}.
^25 In all treatments, the calibrated discount rate is r = 0.1.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS
(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1.)

Dependent variable: Probability that player 2's price equals the anticipated player 1 price

                         Model (1)        Model (2)
Dα10                      0.014            0.017
                         (0.043)          (0.041)
Dβ00                     −0.077           −0.043
                         (0.035)**        (0.036)
Dβ18                     −0.046           −0.032
                         (0.053)          (0.056)
Dβ40                     −0.008            0.041
                         (0.061)          (0.042)
ln(Round)                 0.041            0.028
                         (0.011)***       (0.010)***
Match Cost                                −0.001
                                          (0.000)***
ln(Round)·Dα10           −0.014           −0.012
                         (0.012)          (0.012)
ln(Round)·Dβ00            0.003           −0.000
                         (0.013)          (0.012)
ln(Round)·Dβ18            0.006            0.002
                         (0.021)          (0.020)
ln(Round)·Dβ40            0.001            0.015
                         (0.018)          (0.016)
Observations              9,588            9,588
Log pseudo-likelihood   −3,243.21        −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. Significant at: * 10-percent level; ** 5-percent level; *** 1-percent level. D_y is the dummy variable for treatment y; the excluded dummy is Dα20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.

Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient on Dβ00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price.
Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient on the previously significant Dβ00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 arises not because matching is more focal, but rather because an increase in β decreases the cost of matching.^26

^26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient on Dβ18 is positive, and its interaction with ln(Round) is negative, a finding which is not consistent with a focal-point hypothesis.

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

    p_1(t+1) = p_2(t);
    p_2(t+1) = m p_1(t) + n

where m = \frac{\beta - 1/(4c)}{\beta + e/(4c^2)} and n = \frac{er/(4c^2)}{\beta + e/(4c^2)}. The analogous differential-equation system is

(12)   \dot{p}_1 = p_2 - p_1;
       \dot{p}_2 = m p_1 - p_2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

    V(p_1, p_2) \equiv \frac{(p_1 - p_2)^2}{2} + \int_{p_0}^{p_1} (p - mp - n)\, dp

where p_0 \ge 0 is a constant. We now show that V(p_1, p_2) is a Lyapunov function for system (12).

Define G(p_1) \equiv \int_{p_0}^{p_1} (p - mp - n)\, dp. We get

    G(p_1) = \tfrac{1}{2}(1 - m)p_1^2 - n p_1 - \tfrac{1}{2}(1 - m)p_0^2 + n p_0.

As c > 0 and e > 0, for any β \ge 0 we always have m < 1. Therefore, the function G(p_1) has a global minimum, which is characterized by the first-order condition

    \frac{dG(p_1)}{dp_1} = (1 - m)p_1 - n = 0.

Therefore, p_1 = n/(1 - m) = er/(e + c) \equiv p^* is the global minimum. When p_1 = p_2 = p^*, the term (p_1 - p_2)^2/2 also reaches its global minimum. Therefore, V(p_1, p_2) attains its global minimum at p_1 = p_2 = p^*.

Next we show that \dot{V} \le 0:

    \dot{V} = \frac{\partial V}{\partial p_1}\dot{p}_1 + \frac{\partial V}{\partial p_2}\dot{p}_2
            = (p_1 - p_2 + (1 - m)p_1 - n)(p_2 - p_1) + (p_2 - p_1)(m p_1 - p_2 + n)
            = -2(p_1 - p_2)^2 \le 0.

Let B be an open ball around (p^*, p^*) in the plane. For all (p_1, p_2) \neq (p^*, p^*), \dot{V} < 0.

Therefore, (p^*, p^*) is a globally asymptotically stable equilibrium of (12).

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.
Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.
Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.
Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.
Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.
Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.
Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.
Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.
Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.
Chen, Yan and Plott, Charles R. "The Groves-
Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.
Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.
Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.
Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.
Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.
Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.
Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.
Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.
Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.
Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.
Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.
Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.
Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.
Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.
Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.
Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.
Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences, Working Paper No. 1122-2001.
Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.
Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.
Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.
Moulin, Herve. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.
Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.
Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.
Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.
Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.
Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.
Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.
Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.
Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.
Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.
Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.