Essays on the Promotion of Game-Theoretic Cooperation

by

Catherine Soo-Yeon Moon

Department of Economics Duke University

Date: Approved:

Vincent Conitzer, Supervisor

Rachel Kranton

Curtis Taylor

David McAdams

Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Economics in the Graduate School of Duke University, 2018

Copyright © 2018 by Catherine Soo-Yeon Moon. All rights reserved except the rights granted by the Creative Commons Attribution-Noncommercial License.

Abstract

This dissertation looks at how to identify and promote cooperation in a multiagent system, first theoretically, through a computational lens, and later empirically, through a human subject experiment. Chapter 2 studies the network dynamics leading to a potential unraveling of cooperation and identifies the subset of agents that can form an enforceable cooperative agreement with one another. This is an important problem, because cooperation is harder to sustain when information about defection, and thus the consequent punishment, transfers slowly through the network structure from a larger community. Chapter 3 examines a model that studies cooperation in a broader strategic context where agents may interact in multiple different domains, or games, simultaneously. Even if a game independently does not give an agent sufficient incentive to play the cooperative action, there may be hope for cooperation when multiple games with compensating asymmetries are put together. Exploiting compensating asymmetries, we can find an institutional arrangement that would either ensure maximum incentives for cooperation or require minimum subsidy to establish sufficient incentives for cooperation. Lastly, Chapter 4 studies a two-layered public good game to empirically examine whether community enforcement through existing bilateral relationships can encourage cooperation in a social dilemma situation. Here, it is found that how the situation is presented matters greatly to real-life agents, as their understanding of whether they are in a cooperative or a competitive, strategic setting changes the level of overall cooperation.

To my family

Contents

Abstract

List of Tables

List of Figures

List of Abbreviations and Symbols

Acknowledgements

1 Introduction
1.1 Understanding Cooperation and Collective Level Dynamics
1.2 Institutional Arrangements for Sustainable Cooperation
1.3 Connecting Theory and Data on Agent Behavior

2 Maximal Cooperation in Repeated Games on Social Networks
2.1 Related Literature
2.2 Model
2.3 Motivation and Illustrative Examples
2.3.1 A Pollution Reduction Example
2.3.2 Delayed Punishment due to Directionality of Information Flow
2.3.3 Discussion of Assumptions in the Model
2.4 Theoretical Analysis for Cooperation in Nash Equilibrium
2.5 Simulation Analysis on Random Graphs
2.5.1 Assumptions for Simulation
2.5.2 Equilibrium Defection Phase Transition
2.6 Analytical Expression for Phase Transition
2.7 Credibility of Threats in Equilibrium
2.8 Extension: Simulation Analysis of the Credible Equilibrium
2.9 Conclusion
2.10 Future Research

3 Role Assignment for Game-Theoretic Cooperation
3.1 Background: Repeated Games and the Folk Theorem
3.2 Motivating Example
3.3 Definitions
3.4 Related Literature
3.5 Theoretical Analysis
3.6 Complexity of ROLE-ASSIGNMENT
3.7 Algorithms for ROLE-ASSIGNMENT
3.7.1 Integer Program
3.7.2 Dynamic Program
3.8 Simulation Analysis
3.9 Conclusion

4 Framing Matters: Sanctioning in Public Good Games with Parallel Bilateral Relationships
4.1 Relevant Literature and Points of Difference
4.1.1 Sanctioning
4.1.2 Framing
4.2 Experimental Design
4.2.1 Global and Local Public Good Games Overlaid
4.2.2 Information Conditions
4.2.3 Experimental Subjects and Pre-Experiment Procedure
4.2.4 Timeline and Details of the Experiment
4.2.5 Payoff Calculations
4.3 Interesting Patterns in Data
4.4 Experimental Results
4.4.1 Primary Effect: Framing Matters
4.4.2 Machine Learning: Clustering Analysis and the Secondary Effect of Matching
4.4.3 Welfare Economics: Lorenz Curves and Gini Coefficients
4.5 Discussion
4.5.1 Cooperative vs. Uncooperative Frames of Mind
4.5.2 Potential Theoretical Extensions and Future Questions

5 Concluding Remarks

A Figures for Framing Matters: Sanctioning in Public Good Games with Parallel Bilateral Relationships

Bibliography

Biography

List of Tables

4.1 Table of the four different treatments given in the experiment
4.2 Comparison of treatments and test results of whether the homogeneity hypothesis can be rejected
4.3 Numbers of time trends assigned to cluster 1 (C1) and cluster 2 (C2) based on k-means clustering with k = 2

List of Figures

2.1 Three countries example
2.2 Three continents example
2.3 Algorithm finding the unique maximal set of forever-cooperating agents
2.4 3-D figure showing nodes' defection probabilities
2.5 Gradient figures showing agents' defection probabilities
2.6 Gradient figures showing the average number of iterations needed until convergence of the algorithm
2.7 A variant of Hoeffding bounds due to Angluin and Valiant (1979)
2.8 Color figures that compare the simulation results to the analytical expression
2.9 Gradient figures showing the fraction of cases in which Nash equilibrium is also a credible equilibrium
3.1 Two committees example
3.2 A Two-Player Active-Passive (2PAP) game
3.3 Integer program for ROLE-ASSIGNMENT
3.4 Plot showing the average runtime of solving an instance of ROLE-ASSIGNMENT through dynamic programming
3.5 Plot showing the average runtime of solving an instance of ROLE-ASSIGNMENT through integer programming
4.1 Timeline of the progression of a single round
4.2 Average contribution patterns observed within each treatment
4.3 Lorenz curves and Gini coefficients for the global and local games, for each treatment
4.4 Decision tree illustrating how framing can impact the level of cooperative actions of an agent facing a new game
A.1 Individual-level global game contribution time trends
A.2 Individual-level local game contribution time trends, for the matched framed (M+F) treatment
A.3 Individual-level local game contribution time trends, for the matched non-framed (M+nF) treatment
A.4 Individual-level local game contribution time trends, for the non-matched framed (nM+F) treatment
A.5 Individual-level local game contribution time trends, for the non-matched non-framed (nM+nF) treatment
A.6 Group-level global game contribution time trends
A.7 Group-level local game contribution time trends
A.8 Number of censored observations for global games
A.9 Number of censored observations for local games
A.10 Boxplot representing the group-level mean of global game contribution
A.11 Boxplot representing the group-level standard deviation of global game contribution level
A.12 Boxplot representing the pair-level mean of local game contribution
A.13 Boxplot representing the pair-level standard deviation of local game contribution

List of Abbreviations and Symbols

Abbreviations

MAID Multiagent influence diagram.

NA North America.

EU Europe.

AS Asia.

WLOG Without loss of generality.

ER Erdős–Rényi random graph model.

PA Barabási–Albert preferential-attachment random graph model.

2PAP Two-Player Active-Passive.

nPAP n-Player Active-Passive.

IP Integer program.

DP Dynamic program.

GAMUT A suite of game generators designed for testing game-theoretic algorithms.

CPLEX IBM ILOG CPLEX Optimizer.

M+F Matched framed treatment.

M+nF Matched non-framed treatment.

nM+F Non-matched framed treatment.

nM+nF Non-matched non-framed treatment.

IBRC Interdisciplinary Behavioral Research Center.

SSRI Social Science Research Institute.

MW Mann-Whitney U test.

CM Multivariate Cramér-von Mises test.

C1 Cluster 1, or the "cooperative" group.

C2 Cluster 2, or the "uncooperative" group.

PAM Partitioning around medoids.

Acknowledgements

I owe special thanks to my advisor, Vincent Conitzer, and to my committee members, Curtis Taylor, Rachel Kranton, and David McAdams. I thank Huseyin Yildirim, Kyle Jurado, Todd Sarver, James Roberts, Seth Sanders, Chris Timmins, Giuseppe (Pino) Lopomo, Thomas Nechyba, Mike McBride, Jana Schaich Borg, Philipp Sadowski, Paul Dudenhefer, Atila Ambrus, Craig Burnside, Peter Arcidiacono, Federico Bugni, Matthias Kehrig, and Dan Ariely, for the discussions and encouragement along the way. I also thank the participants at the Duke University Microeconomic Theory and DRIV seminars, as well as the Association for the Advancement of Artificial Intelligence (AAAI) 2015 Spring Symposium Series, the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 15), the 2015 Institute for Operations Research and the Management Sciences (INFORMS) Annual Meeting, the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI 16), the 5th World Congress of the Game Theory Society (GAMES 2016) and the 17th ACM Economics and Computing Conference (EC'16), for their valuable comments; the members of the MURI award W911NF-11-1-0332 grant for the helpful discussions during the annual review meetings. I am grateful to the Program for Advanced Research in the Social Sciences (PARISS) for their financial support. Finally, my deepest thanks to all my family members for their unyielding and constant support.

1 Introduction

Cooperation allows society as a whole to attain better outcomes than agents can achieve individually. Even though taking proactive action would generate long-term collective benefits internationally, individual countries struggle to act cooperatively in pollution abatement agreements, for reasons such as short-term upfront costs and the fear of being the workhorse. Oftentimes, identifying potential cooperators and carefully designing structural arrangements can promote cooperation. For example, countries trading carbon permits can be viewed as a pairing in which low-cost polluters make sacrifices within the pollution reduction game in return for high-cost polluters making sacrifices within the monetary payment game. I use game theory and computational methods to explore how to identify and promote sustainable cooperation in a multiagent system. Broadly, my research has three wings: (1) analyze the collective-level dynamics unraveling from a given situation, (2) find arrangements conducive to cooperation, and (3) understand the gap between game-theoretically rational behaviors and observed behaviors affected by confounding factors. Specifically, with game theory as a common lens to view and simplify the conflicting priorities each agent

faces, I use the actions of rational, self-interested agents as a guide to understanding the incentives underlying potential actions in the real world. This understanding enables me to algorithmically explore institutional arrangements in which robust, sustainable cooperation is possible. Lastly, to understand how insights from game theory extend to behavior in the real world, I conduct a human subject experiment where agents face decisions between self- and group-interested behaviors. Below, I share a high-level overview of the subsequent chapters (joint work with Vincent Conitzer).

1.1 Understanding Cooperation and Collective Level Dynamics

Members of a society face many potentially competing priorities; for example, their own self-interest often competes against societally beneficial group interest. Rational, self-interested agents, as studied in traditional game theory, want to maximize their own utility, and will cooperate only when group-interested actions align with that objective. When there is an immediate conflict between self- and group-interested behaviors, agents will cooperate only if they are presented with a high enough threat of consequent punishment for uncooperative behaviors. The first wing of my research is dedicated to understanding the collective-level dynamics of cooperative behaviors that would unfold in a given situation. Understanding an agent's incentive to cooperate becomes more complex when punishment is not as easy, due to characteristics such as delays in information transfer arising from social distance, and asymmetry of social relationships in the cost, benefit, and information associated with the interaction. In Chapter 2, Maximal Cooperation in Repeated Games on Social Networks, we study relationships with the above two characteristics, represented by a directed network of agents, and use game theory and algorithmic iterative elimination to efficiently identify the set of agents with sufficient incentive to cooperate. Our understanding of the interdependencies and the

consequent collective dynamics helps negotiation among enforceable parties to reach a cooperative agreement. For illustration, take China, South Korea, and Japan, and a pollution abatement agreement as an example. Geographically, South Korea and Japan share the East Sea, and China is located to the west of both countries. Consequently, China's pollution affects both Korea and Japan, but due to the dominant west wind, China is not affected by their pollution. However, Korea and Japan share a sea, so their pollution does affect each other. In this setting, since China receives no benefits, but only pays a cost, for the cooperative action of reducing pollution, China would not cooperate. Korea and Japan, on the other hand, depending on the cost and benefit structure, may decide to sustain cooperation. Identifying the set of cooperating agents (none, or Korea and Japan, in this case) provides pivotal information for interested agents to effectively form a coalition. Through computer simulations varying the value of cooperative relationships and the degree of discounting, we observe a phase transition with a sharp drop in cooperative behavior. We can mathematically derive an analytical expression approximating this fine line between cooperative and uncooperative societies. This insight into the subtleties of cooperation naturally leads to the second wing of my research: how to structure arrangements for sustainable cooperation.

1.2 Institutional Arrangements for Sustainable Cooperation

To ensure cooperation among self-interested agents, it is crucial to build into the relationship a high enough threat of punishment for uncooperative behavior. Oftentimes, in the literature, games are studied in isolation, even when, in reality, the same set of agents interacts in multiple different games together. Looking at a problem in an isolated manner can be limiting. Even if a game independently does not give an agent sufficient incentive to play the "cooperative" action, there may be hope for cooperation when multiple games with compensating asymmetries are put together.

In this situation, agents have incentives to cooperate in all the games they participate in, as long as the losses in some of the games are offset by gains in the others, and cooperation in the games with gains is conditional on cooperative actions in the games with losses. The second wing of my research aims to discover optimal institutional structures and arrangements to promote sustainable cooperation. In Chapter 3, Role Assignment for Game-Theoretic Cooperation, we formalize this setup as a problem of assigning roles within multiple projects to an overlapping set of agents. Using game-theoretic methods, we reduce the problem to a computational one, and find a worst-case bound on how long it would take to find a solution. We also provide an empirically useful integer program that solves role-assignment instances very quickly. This chapter and its algorithm provide answers to two important questions: (1) what is the institutional arrangement guaranteeing the most robust cooperation, and (2) what is the minimum subsidy needed to encourage cooperation when it is not naturally possible? To understand the implications of extending these insights to real-world implementation, my final wing of research attempts to experimentally understand the gap between the behaviors of game-theoretic, rational agents and agents in the real world.

1.3 Connecting Theory and Data on Agent Behavior

Real-world agents can (and often do) behave differently from the completely self-interested and rational agents studied in game theory. Understanding this gap, and bringing it back into the model, is important for better approximating cooperative dynamics.

In Chapter 4, Framing Matters: Sanctioning in Public Good Games with Parallel Bilateral Relationships, we run a human subject experiment designed to better understand the behavioral relationship between sanctioning and cooperation in public good games. While prior studies find that sanctioning is a robust mechanism for enforcing cooperation in public good games, they only study structures that add external punishment stages. In real-world settings, however, adding an additional domain of interaction for punishment purposes, on top of existing relationships, can be difficult, and sanctioning may take place informally through other domains. For example, while it would be nearly impossible to introduce a new domain of interaction for punishment purposes among countries in international climate agreements, sanctioning could take place informally in trade. Though a simple folk theorem argument could predict cooperation as the likely outcome, it is important to ask whether these other domains, such as trade relationships, can still act as effective sanctioning tools, given that they are of importance independently. To capture an important distinction in how potential sanctioning happens in the real world, we introduce a new experimental setting where agents have "local" bilateral public good games representing other domains of interaction. Surprisingly, we find that "framing", or pointing out the possible use of these other domains as sanctioning tools, actually lowered cooperation in a statistically significant way, unlike previous studies where the addition of explicit sanctioning stages led to higher contributions. Analyses based on machine learning and welfare economics corroborated that this framing led to a less cooperative mindset in the experiment. This suggests that understanding non-game-theoretic aspects, such as how societal norms develop, with respect to planning how to present situations, is important on the road to sustainable cooperation.

2 Maximal Cooperation in Repeated Games on Social Networks

In systems of multiple self-interested agents, we cannot impose behavior on the agents. For desired behavior to take place, it will have to be in equilibrium, but this is not necessarily the case, especially in one-shot games. It is well known in game theory that a much greater variety of behavior can be obtained in the equilibria of infinitely repeated games than in the equilibria of one-shot games. The standard example is that of the prisoner's dilemma. Defecting is a dominant strategy for both players in the one-shot version. In the repeated version, cooperation between the players can be sustained due to the threat of the loss of future cooperation. Consequently, modeling repeated play and solving for equilibria of the resulting games are crucial to the design of systems of multiple self-interested agents. The well-known folk theorem in game theory characterizes the payoffs that agents can obtain in the equilibria of repeated games. This theorem also serves as the basis of algorithms that compute equilibria of repeated games, which can be done in polynomial time for 2-player games (Littman and Stone (2005)). (This does not extend to games with more than 2 players (Borgs et al. (2010)), unless correlated

punishment is possible (Kontogiannis and Spirakis (2008)). Recent work shows that heuristic algorithms can nevertheless be effective (Andersen and Conitzer (2013)).) These results operate under the assumption that an agent's behavior is instantly observable to all other agents. For reasonably large multiagent systems, this can be a very limiting restriction. When an agent does not interact with another agent, it may take some time before one finds out about the other's defection. In such cases, it is more difficult to sustain cooperative behavior in equilibrium, because the punishment for defection will arrive later in the future and therefore be more heavily discounted. Then, under what conditions can we still sustain cooperation, and can we compute the resulting equilibria? These are the questions we set out to answer in this chapter. Graphical games (Kearns et al. (2001)) constitute a natural model for the interaction structure. One shortcoming of graphical games is that, typically, the graph is undirected. Here, we are also interested in modeling directed relationships, where b is affected by a's actions but not vice versa. Of course, this can be represented using an undirected edge as well, with some duplication of values in the utility table. However, besides being concerned with computational efficiency, we also want the edges to capture the flow of information. For example, suppose b is affected by a's actions, c is affected by b's actions, and a is affected by c's actions (but not vice versa). If a defects, c will initially not be aware of this. However, b will be, causing b to defect in the second round. At that point c will become aware of a defection, resulting in a defection by c in the third round, perhaps corresponding to a receiving a delayed punishment for defecting in the first round. In any case, allowing for directed edges in graphical games is a trivial modification.
(There are other graphical models that are used to represent games, such as multiagent influence diagrams (MAIDs) (Koller and Milch (2003)) and action-graph games (Jiang et al. (2011)), but graphical games provide the structure we need.)

2.1 Related Literature

Many papers on repeated public good games show full cooperation is possible when players become arbitrarily patient (Kandori (1992); Ellison (1994); Deb (2008); Takahashi (2010)), even with delayed monitoring (Kinateder (2008)), imperfect monitoring (Laclau (2012)), and local monitoring (Nava and Piccione (2014)) in graphical games. However, there is not much work characterizing the maximum level of cooperation sustainable in equilibrium. As an exception, Wolitzky (2013) explores the maximum level of contributions sustainable in equilibrium for a fixed discount factor for public good games under private "all or nothing" monitoring (player i changes her monitoring neighborhood every period, and perfectly observes contributions made within the monitoring neighborhood) and shows that grim-trigger strategies provide the strongest possible incentives for cooperation on-path. This work relates to ours, as cooperation in our framework can be interpreted as providing a local public good, where access to the "local public good" is asymmetric and only the player who benefits from the cooperation observes it. Given these complications, our model's decision variables are discrete (cooperate or defect), unlike Wolitzky's, which are continuous. Also closely related is work by Mihm et al. (2010), which also concerns sustaining cooperation on social networks in repeated play. An important technical difference is that in their model, edges necessarily transmit information in both directions. If we were to consider the special case of our model where every edge transmits information in both directions, our results would become rather trivial, because a deviating agent would be instantly punished by all its neighbors; on the other hand, Mihm et al. allow agents to play different actions for each of their neighbors, which we do not allow. Mihm et al. also require the game played to be the same on every edge.
In evolutionary game theory, repeated games on graphs are also sometimes considered, but there the network structure is used to let agents copy each other's strategies (e.g., imitate the best-performing neighbor). In this chapter, we focus on finding the maximal set of cooperating agents in equilibrium for a given discount factor, and on experimentally investigating the range of discount factors under which cooperation can be sustained. We also assess the credibility of the threats used in this equilibrium: we give a condition for threats to be credibly executed and experimentally investigate when this condition holds for the maximal set of cooperating agents.

2.2 Model

We suppose that there is a set of agents $A$ organized in a directed graph $G = (V, E)$ where $V = A$, i.e., agents are vertices in the graph. A directed edge indicates that the agent at the tail of the edge benefits if the agent at the head of the edge cooperates.

Associated with every agent (vertex) $i$ is a cooperation cost $\kappa_i \in \mathbb{R}$. Associated with every directed edge $(i, j) \in E$ is a cooperation benefit $\beta_{i,j} \in \mathbb{R}$. Agents play a repeated game, where in each round, each agent either cooperates or defects. Note that it is not possible for an agent to cooperate for some of its outgoing edges but defect for others within the same round. In round $t$, agent $i$ receives $-x_{i,t} \kappa_i + \sum_{j : (j,i) \in E} x_{j,t} \beta_{j,i}$, where $x_{i,t} \in \{0, 1\}$ indicates whether $i$ cooperated in round $t$ or not. That is, agent $i$ experiences cost $\kappa_i$ for cooperating and benefit $\beta_{j,i}$ for $j$ cooperating. Defection is irreversible, i.e., once an agent has defected it will defect forever, thereby effectively destroying its outgoing edges. At the end of round $t$, agent $i$ learns, for every $j$ with $(j, i) \in E$, whether $j$ cooperated or defected in round $t$, and nothing else. As is standard in repeated games, a strategy of a player maps the entire history that that player observed to an action. Payoffs in round $t$ are discounted by a factor $\delta^t$. We are interested in Nash equilibria of this game (and, later, in equilibrium refinements).
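As a concrete illustration, the per-round payoff can be written out in a few lines of Python. This is our own sketch, not code from the chapter; the `NetworkGame` class and its field names are illustrative, with the edge set $E$ encoded implicitly by the keys of `beta` (an entry `beta[(j, i)]` means agent $i$ benefits when $j$ cooperates):

```python
from dataclasses import dataclass

@dataclass
class NetworkGame:
    """Sketch of the repeated cooperation game on a directed graph.

    kappa[i] is agent i's per-round cost of cooperating; beta[(j, i)] is the
    per-round benefit i receives when j cooperates, so the keys of beta play
    the role of the edge set E.
    """
    agents: list
    kappa: dict
    beta: dict

    def round_payoff(self, i, x):
        """Payoff to agent i in one round, where x[k] in {0, 1} indicates
        whether agent k cooperated: -x_i * kappa_i + sum_j x_j * beta_{j,i}."""
        benefit = sum(self.beta[(j, k)] * x[j]
                      for (j, k) in self.beta if k == i)
        return -x[i] * self.kappa[i] + benefit
```

For instance, in the three-country example of Section 2.3.1, a round in which everyone cooperates gives South Korea $-\kappa$ plus the benefits from both Japan's and China's cooperation, while China receives only $-\kappa$, since no edge points from another agent to China.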

2.3 Motivation and Illustrative Examples

Our directed graph framework enables us to model asymmetry in both payoffs and information flow. (Undirected graphs can be seen as the special case of our framework where edges always point in both directions.) In this section, we illustrate our model using some examples.

2.3.1 A Pollution Reduction Example

Suppose there is a set of countries, {China, South Korea, Japan}, gathering together to try to reduce pollution. Geographically, South Korea and Japan share the Sea of Japan (also called the East Sea of Korea), and China is located to the west of both countries. Consequently, if South Korea does not reduce its pollution, Japan would find out and suffer due to the pollutants in the sea, and vice versa; but China would not be affected in either case. If China does not reduce its pollution, however, both South Korea and Japan would notice and suffer, with the pollutants traveling with the dominant west wind.

Figure 2.1: Three countries example.

In this simplified model, South Korea and Japan do not have a way to punish China even if China decides not to reduce pollution. However, Japan and South Korea may be able to negotiate and enforce an agreement between themselves, because when one of the two defects, the other can punish the former (enough) by defecting back. Thus, our model allows us to solve for the set of agents that would be able to negotiate an enforceable agreement.

2.3.2 Delayed Punishment due to Directionality of Information Flow

Figure 2.2: Three continents example.

As a similar example, consider three agents, North America (NA), Europe (EU), and Asia (AS) negotiating a pollution reduction agreement. With the dominant wind from the West, North America would notice Asia’s pollution, Europe would notice North America’s pollution, and Asia would notice Europe’s pollution. Hence, defection in North America could trigger Europe to defect in the next period, which in turn could trigger Asia to defect in the period after, at which point North America experiences a delayed punishment.
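This cascade can be simulated directly. The sketch below is our own illustration (the encoding of a pair `(j, i)` to mean "agent `i` observes agent `j`", matching the model of Section 2.2, and the function name are ours): it advances grim-trigger dynamics one round at a time, with an agent defecting the round after it first observes a defection.

```python
def cascade(edges, first_defector, rounds):
    """Grim-trigger defection cascade: returns the set of defectors after
    each round, starting from a single defection in round 1. A pair (j, i)
    in edges means agent i observes agent j."""
    defected = {first_defector}
    history = [set(defected)]
    for _ in range(rounds - 1):
        # Everyone who observes a current defector defects in the next round.
        newly = {i for (j, i) in edges if j in defected}
        defected |= newly
        history.append(set(defected))
    return history

# NA observes AS, EU observes NA, AS observes EU (the dominant-west-wind cycle):
continents = {("AS", "NA"), ("NA", "EU"), ("EU", "AS")}
```

Starting the cascade at North America yields defector sets {NA}, then {NA, EU}, then {NA, EU, AS} over three rounds, matching the delayed punishment described above.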

2.3.3 Discussion of Assumptions in the Model

One of our assumptions is that it is not possible for an agent to defect on one neighbor but to simultaneously cooperate with another neighbor. This makes sense in the examples above—e.g., it is not possible for China to pollute only towards Japan but not towards South Korea. Moreover, the cost of reducing pollution does not depend on one's neighbors. Another assumption is that defection is irreversible. Our results depend on this assumption: dropping it may allow arrangements with partial cooperation (where, for example, one agent is expected only to cooperate every other period) which we do not consider here. In any case, this assumption is reasonable in cases where continued cooperation requires upkeep—e.g., once you install a dirty coal-fired power plant it may be hard to shut it down; in accounting of nuclear resources, cooperation may mean following a tight protocol, and once an agent defects for a period it may become impossible to re-account for all the resources properly; etc.

2.4 Theoretical Analysis for Cooperation in Nash Equilibrium

We will show that we can, without loss of generality, restrict our attention to grim-trigger equilibria, in which there is some subset of agents $S \subseteq A$ that each cooperate until another agent in $S$ defects on them.

Definition 1. Player $i$'s strategy grim trigger for $S$ consists of $i$ cooperating until some player $j \in S$ with $(j, i) \in E$ defects, which is followed by $i$ defecting (forever).

Definition 2. The grim trigger profile $T[S]$ consists of all the agents in $S$ playing grim trigger for $S$, and all other players always defecting.

When one agent’s defection triggers another’s defection, the latter can set off further defections. This cascade of defections is what lets the information that there has been a defection travel through the graph. Accordingly, to determine how long it takes an agent to find out about the original defection, we need a definition of distance that takes into account that information is transmitted only through the (originally) cooperating agents.

Definition 3. For any subset $S$ of the agents and any $i, j \in S$, the distance from $i$ to $j$ through $S$, denoted $d(i, j, S)$, is the length of the shortest path from $i$ to $j$ that uses only agents in $S$. For a set of agents $G \subseteq S$, $d(G, j, S) = \min_{i \in G} d(i, j, S)$.

The next proposition establishes a necessary and sufficient condition for $T[S]$ to be a Nash equilibrium.

Proposition 1. (Equilibrium) T[S] is an equilibrium if and only if for all i ∈ S,

∑_{j∈S:(j,i)∈E} δ^{d(i,j,S)} β_{j,i} ≥ κ_i.

Proof. First observe that every player outside S is best-responding, because her actions do not affect any other players' future actions and, within any single round, defecting is a strictly dominant strategy. For a player i in S, without loss of generality we can focus on whether she would prefer to defect (starting) in the first round. If i were to defect, the total utility gain from reduced effort is exactly ∑_{t=0}^{∞} δ^t κ_i = κ_i/(1−δ), and the total utility loss from neighbors eventually defecting as a result of i's defection is ∑_{j∈S:(j,i)∈E} ∑_{t=d(i,j,S)}^{∞} δ^t β_{j,i} = ∑_{j∈S:(j,i)∈E} δ^{d(i,j,S)} β_{j,i}/(1−δ). The latter follows from the fact that i's defection will cause j to defect exactly d(i, j, S) rounds later. Thus, i has no incentive to defect if the former is no greater than the latter. Multiplying both sides by 1 − δ gives the desired inequality.
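The incentive constraint of Proposition 1 is directly checkable numerically. A minimal sketch of our own (here dist is assumed to be any function returning d(i, j, S), and beta and kappa are dictionaries of the payoff parameters):

```python
def incentive_holds(i, S, edges, beta, kappa, delta, dist):
    """Proposition 1: under T[S], agent i has no incentive to defect iff
    the discounted losses from triggered defections cover the effort cost:
    sum over in-neighbors j in S of delta^d(i,j,S) * beta[j,i] >= kappa[i]."""
    loss = sum(delta ** dist(i, j, S) * beta[(j, i)]
               for j in S if (j, i) in edges)
    return loss >= kappa[i]
```

For example, in a 2-cycle with β_{j,i} = 2 and κ_i = 1, the constraint reads 2δ ≥ 1, so cooperation is sustainable exactly when δ ≥ 1/2.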

The next proposition shows that no equilibria can have more agents cooperate forever than the grim-trigger equilibria. Intuitively, this is because grim-trigger profiles provide the maximum possible punishment for deviating agents.

Proposition 2. (Grim Trigger is WLOG) Suppose there exists a pure-strategy equilibrium in which S is the set of players that cooperates forever. Then T[S] is also an equilibrium.

Proof. For an arbitrary player i ∈ S, we must prove the inequality in Proposition 1. For the given equilibrium, consider some period τ at which every player outside S has defected (on the path of play). If player i considers defecting at this point, the total utility gain from reduced effort would be exactly ∑_{t=τ}^{∞} δ^t κ_i = δ^τ κ_i/(1−δ). On the other hand, the total utility loss from neighbors defecting as a result of i's defection is at most ∑_{j∈S:(j,i)∈E} ∑_{t=τ+d(i,j,S)}^{∞} δ^t β_{j,i} = δ^τ ∑_{j∈S:(j,i)∈E} δ^{d(i,j,S)} β_{j,i}/(1−δ). The latter follows from the fact that i's defection can only cause changes in j's behavior d(i, j, S) rounds later, because no information can pass through nodes that have already defected. By the equilibrium assumption, the latter expression is at least as large as the former. Multiplying both by (1−δ)/δ^τ gives the desired inequality.

The next lemma shows that the more other agents cooperate, the greater the incentive to cooperate.

Lemma 3. (Monotonicity) If S ⊆ S′ and the incentive constraint from Proposition 1 holds for i relative to S (so i ∈ S and ∑_{j∈S:(j,i)∈E} δ^{d(i,j,S)} β_{j,i} ≥ κ_i), then it also holds for i relative to S′.

Proof. We argue that ∑_{j∈S′:(j,i)∈E} δ^{d(i,j,S′)} β_{j,i} ≥ ∑_{j∈S:(j,i)∈E} δ^{d(i,j,S)} β_{j,i}, which establishes that the former is also at least κ_i. First, all summands are nonnegative, and the former expression has a summand for every j for which the latter expression has a summand. Second, for any i, j, we have that d(i, j, S′) ≤ d(i, j, S), because having additional agents cannot make the distance greater. Because δ < 1, it follows that δ^{d(i,j,S′)} ≥ δ^{d(i,j,S)}. Hence, for every j for which both expressions have a summand, the summand is at least as large in the former expression (corresponding to S′) as in the latter (corresponding to S). This establishes that the former expression is at least as large.

The next proposition shows that we cannot have multiple distinct maximal grim- trigger equilibria.

Proposition 4. (Maximality) If T[S] and T[S′] are both equilibria, then so is T[S ∪ S′].

Proof. Consider some i ∈ S ∪ S′; without loss of generality, suppose i ∈ S. We must show that the incentive constraint from Proposition 1 holds for i relative to S ∪ S′. But this follows from Lemma 3 and the facts that it holds for i relative to S and that S ⊆ S ∪ S′.

Proposition 4 implies that there exists a unique maximal set S* of forever-cooperating agents such that T[S*] is an equilibrium. We can use the following algorithm to find this unique maximal set S*. Initialize S_current to include all agents. Then, in each iteration, check, for each i ∈ S_current, whether the incentive constraint holds for i relative to S_current. Remove those i for which it does not hold from S_current. Repeat this until convergence; then, return S_current. Call this Algorithm 2.4, presented formally below.

Figure 2.3: Algorithm finding the unique maximal set of forever-cooperating agents.
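The elimination procedure admits a compact sketch. The following is our own Python illustration of Algorithm 2.4, not the figure's exact pseudocode; for brevity it recomputes all-pairs shortest paths with Floyd-Warshall rather than the faster method cited in Proposition 6:

```python
from itertools import product

def shortest_paths(S, edges):
    """All-pairs shortest path lengths d(i, j, S) using only agents in S
    (Floyd-Warshall for brevity)."""
    INF = float("inf")
    d = {(i, j): 0 if i == j else (1 if (i, j) in edges else INF)
         for i in S for j in S}
    for k, i, j in product(S, S, S):
        if d[i, k] + d[k, j] < d[i, j]:
            d[i, j] = d[i, k] + d[k, j]
    return d

def maximal_cooperating_set(agents, edges, beta, kappa, delta):
    """Algorithm 2.4 (sketch): iteratively remove agents whose Proposition 1
    incentive constraint fails relative to the current set, until convergence."""
    S = set(agents)
    while True:
        d = shortest_paths(S, edges)
        # delta ** inf == 0.0 for 0 < delta < 1, so unreachable j contribute nothing
        violating = {i for i in S
                     if sum(delta ** d[i, j] * beta[(j, i)]
                            for j in S if (j, i) in edges) < kappa[i]}
        if not violating:
            return S
        S -= violating
```

Because removing agents only increases the distances d(i, j, S) and hence shrinks the left-hand side of the constraint, agents can only become easier to eliminate over iterations, which is why the procedure converges.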

Proposition 5. (Correctness) Algorithm 2.4 returns the unique maximal set S* such that T[S*] is an equilibrium.

Proof. It suffices to show that if i is at some point eliminated from S_current in Algorithm 2.4, then there exists no set S such that i ∈ S and T[S] is an equilibrium. We prove this by induction on the round in which i is eliminated. Suppose it holds for all rounds before t; we prove it holds when i is eliminated in round t. Let S_t denote the set of agents that have not yet been eliminated in round t (including i). By the induction assumption, for any S such that T[S] is an equilibrium, we have S ⊆ S_t. But the incentive constraint from Proposition 1 does not hold for i relative to S_t, because i is eliminated in this round. But then, by Lemma 3, it also does not hold for i relative to any S ⊆ S_t. Hence, there is no S such that i ∈ S and T[S] is an equilibrium.

Proposition 6. (Runtime) The runtime of Algorithm 2.4 is O(|V|² |E| log |V|).

Proof. Because at least one agent is removed in every iteration before the last, there can be at most |V| + 1 iterations. In each iteration, we solve all-pairs shortest paths, which can be done in O(|V| |E| log |V|) time (Cormen et al. (2001)). Also, in each iteration, for every agent, we need to evaluate whether the incentive constraint holds, requiring us to sum over all its incoming edges. Because each edge has only one vertex for which it is incoming, we end up considering each edge at most once per iteration for this step, so that its runtime is dominated by the shortest-paths runtime.

2.5 Simulation Analysis on Random Graphs

In this section, we evaluate the techniques developed so far in simulations. After this, we continue with further theoretical development, which we subsequently evaluate in simulations as well.

2.5.1 Assumptions for Simulation

For our simulation analysis, we make the following additional assumptions on the cost and benefit structure. First, i's cost of cooperation is proportional to the number of outgoing edges i initially has: κ_i = ∑_{j:(i,j)∈E} 1, normalizing the per-edge cost to 1. Also, we assume a constant benefit to having an incoming edge, that is, β_{j,i} = β for all (j, i) ∈ E. This implies that the total benefit i receives is proportional to the number of incoming edges i has (from cooperating agents).

We generate random graphs based on two models: the Erdős-Rényi random graph model (ER) and the Barabási-Albert preferential-attachment random graph model (PA), with some modifications to generate directed rather than undirected graphs. The ER model requires n, the number of nodes, and p, the probability of a directed edge between two nodes. The PA model requires n, the number of nodes, e, the number of edges per node, and γ, a parameter determining the degree of "preference" given to nodes with more edges when a new node enters the network in the formation process (γ = 0 makes the probability of an existing node being selected for edge generation directly proportional to its current degree, whereas γ = 1 makes the probability of being selected equal among all the currently existing nodes). To obtain directed graphs from PA, we first turn any undirected edge into two directed edges (directed both ways). We then apply p_degeneration, the probability of removing a directed edge between any two nodes (we calibrate e and p_degeneration to keep p, the probability of a directed edge between two nodes, comparable to the ER cases and the analytical expression).

All the graphs presented in this chapter are with p = 0.3, and each point is averaged over 100 graphs. This probability is chosen as a representative value for presentation, but the patterns shown are consistent with graphs generated using different p values.
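Under the stated assumptions, generating a directed ER instance and its cost and benefit parameters takes only a few lines. The following is our own sketch (the chapter's PA variant with p_degeneration is omitted here):

```python
import random

def directed_er(n, p, seed=0):
    """Directed Erdos-Renyi graph: each ordered pair (i, j), i != j,
    receives a directed edge independently with probability p."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n) for j in range(n)
            if i != j and rng.random() < p}

def cost_benefit(n, edges, beta):
    """Chapter's simulation assumptions: kappa_i = out-degree of i
    (per-edge cost normalized to 1), and beta_{j,i} = beta on every edge."""
    kappa = {i: sum(1 for (a, _) in edges if a == i) for i in range(n)}
    betas = {e: beta for e in edges}
    return kappa, betas
```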

2.5.2 Equilibrium Defection Phase Transition

Given a graph, there are two parameters that can further affect i’s incentive to cooperate or defect. Varying β will change the value of incoming edges (relative to the cost of outgoing edges). On the other hand, varying δ will affect how an agent trades off current and future payoffs. Thus, we apply Algorithm 2.4 for varying values of δ and β.

Figure 2.4: Nodes' defection probabilities (the fractions of nodes that fail to cooperate in the maximally cooperative equilibrium) for values of β and δ (ER: n = 100, p = 0.3).

Figure 2.5 shows the resulting equilibrium cooperation and defection patterns for representative ER and PA specifications for different values of n. These are top-down views of graphs of the form in Figure 2.4. For all the gradient graphs, the transition pattern is similar to that of Figure 2.4. We see an apparent phase transition, with a sudden sharp drop in the defection probability, indicating a transition from everyone defecting to everyone cooperating. This implies that β and δ suffice to predict whether a node will cooperate; in particular, knowing a node’s centrality in the graph does little to help predict cooperation of that node, because it tends to be the case that the whole graph cooperates or defects together.

Figure 2.5: Gradient figures (top-down view of graphs of the form in Figure 2.4) showing agents' defection probabilities for different values of β and δ, for both ER and PA graphs with various numbers of nodes n. The x-axis shows the β values, ranging from 0 to 10, and the y-axis shows the δ values, ranging from 0 to 1. The white section indicates complete defection, while the black section indicates complete cooperation. Intermediate shades indicate partial cooperation.

Intuitively, we might expect a phase transition for the following reason. For high values of β and δ, it will be rare that any agents have an incentive to defect when all other agents play the grim trigger strategy. However, once we drop to values where some agents have an incentive to defect, we can no longer have these agents participate in the grim trigger strategy. This then increases the incentives for other agents to defect, so some of them no longer participate either, etc., quickly leading to a collapse of all cooperation.

As pointed out in the proof of Proposition 6, the algorithm returns the equilibrium after at most |V| + 1 iterations, but this is a very pessimistic upper bound, assuming exactly one agent is eliminated in each iteration. In simulations, significantly fewer iterations are required for convergence, as indicated in Figure 2.6. For a network with 100 agents (n = 100), on average, the algorithm converges within at most 12 iterations. Moreover, we see the typical pattern that runtime is greatest around where the phase transition happens. This pattern confirms our intuition that the cascade of defection set off by the initial defection results in the observed phase transition. Further away from the band, the algorithm converges within an average of 2 iterations.

Figure 2.6: Gradient figures showing the average number of iterations needed until convergence of the algorithm, for various values of β and δ. The specifications of each panel are the same as in Figure 2.5. In the bottom left, where we see total collapse of cooperation, Algorithm 2.4 converges within 2 iterations on average. In the top right, where we see total cooperation, Algorithm 2.4 (of course) converges within 1 iteration. The middle band requires more iterations (maximally 9 on average, for n = 100).

2.6 Analytical Expression for Phase Transition

In this section, we derive an analytical expression that approximates where the phase transition occurs, as a function of p, β, and δ, based on the ER model. (Though none of our analysis is for the PA model, in simulations the same expression appears to work well for the PA model as well.)

To get an intuitive understanding, let us consider under which conditions a "representative" agent is likely to be eliminated by the algorithm in the first iteration. The idea is that as n grows, with high probability all agents will resemble this representative agent. This representative agent has pn incoming edges and pn outgoing edges. If the agent defects, she will save ∑_{t=0}^{∞} pnδ^t = pn/(1−δ) in effort cost. On the other hand, her defection will cause agents that have an edge from her to defect as well. The representative agent has p²n agents with which she constitutes a 2-cycle (edges going both ways). These agents will defect the next round, at a cost of ∑_{t=1}^{∞} p²nδ^t β = δp²nβ/(1−δ) to the representative agent. Because we are interested in obtaining an approximation that is accurate when n is large, we may assume that all remaining pn − p²n agents that have an edge to the representative agent will hear of her defection and defect in the next round after that. (The probability that there is no path of length two from the representative agent to a given agent is (1−p²)^{n−2}, which quickly approaches 0 as n grows.) Hence, the cost from this second class of defections is ∑_{t=2}^{∞} (p−p²)nδ^t β = δ²(p−p²)nβ/(1−δ). Combining the three expressions by equating the savings to the cost, and rearranging, we obtain

1 − δβp − δ²β(1 − p) = 0

as the expression for where the phase transition occurs.

Next, we prove that this approximation becomes arbitrarily accurate as n increases, using Hoeffding bounds. Here, we use the variant of Hoeffding bounds in Figure 2.7 (Angluin and Valiant (1979)). We first show that if we are on the "defecting" side of the phase transition, then for sufficiently large n, with high probability all agents would be eliminated in the first iteration of the algorithm, that is, no agent will cooperate even if all other agents played grim trigger.
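Since the phase-transition expression is quadratic in δ, the critical discount factor for given β and p can be computed in closed form. A small sketch of our own (assuming β > 1 and 0 < p < 1, so that the relevant root lies in (0, 1)):

```python
import math

def critical_delta(beta, p):
    """Positive root of beta*(1-p)*delta^2 + beta*p*delta - 1 = 0,
    i.e., the delta at which the representative agent's savings from
    defection exactly equal the discounted losses."""
    a, b, c = beta * (1 - p), beta * p, -1.0
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

# Plugging the root back into the phase-transition expression gives ~0.
beta, p = 5.0, 0.3
d = critical_delta(beta, p)
residual = 1 - d * beta * p - d * d * beta * (1 - p)
```

For β = 5 and p = 0.3, this gives δ ≈ 0.36: above this discount factor the approximation predicts full cooperation, below it full defection.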

For Y = ∑_{i=1}^{n} X_i with the X_i independent random variables taking value 1 with probability p and 0 with probability 1 − p, where m = E[Y] = ∑_{i=1}^{n} p = np and 0 < b < 1, we have

P[Y ≤ (1 − b)m] ≤ e^{−b²m/2}

P[Y ≥ (1 + b)m] ≤ e^{−b²m/3}

Figure 2.7: A variant of Hoeffding bounds due to Angluin and Valiant (Angluin and Valiant (1979)).

Theorem 7. If β, δ, and p are such that 1 − δβp − δ²(1 − p)β > 0, then for any ε > 0, there exists N such that for all n > N, the probability that every agent is eliminated in the first iteration of Algorithm 2.4 is at least 1 − ε.
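The Angluin-Valiant tail bound of Figure 2.7 can be sanity-checked by simulation. A Monte Carlo sketch of our own (the parameter values are illustrative only):

```python
import math
import random

def upper_tail_vs_bound(n, p, b, trials=5000, seed=1):
    """Empirical estimate of P[Y >= (1+b)*np] for Y ~ Binomial(n, p),
    together with the Angluin-Valiant bound exp(-b^2 * np / 3)."""
    rng = random.Random(seed)
    m = n * p
    hits = sum(
        sum(rng.random() < p for _ in range(n)) >= (1 + b) * m
        for _ in range(trials)
    )
    return hits / trials, math.exp(-b * b * m / 3)
```

With n = 200, p = 0.3, and b = 0.5, the bound is e^{−5} ≈ 0.0067, while the empirical frequency is essentially zero; the bound is loose but decays exponentially in n, which is all the proofs need.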

Proof. There are three random variables that we will use in this theorem: Y¹ for the number of 2-cycles, Y² for the number of incoming edges, and Y³ for the number of outgoing edges, all for a given agent (vertex). (Thus, technically we should add a subscript i to each of these three random variables, where i is the agent in question, but we do not do so to minimize clutter.) Along the lines of the intuitive argument provided for the representative agent, we know that the agent's cost of defection from 2-cycles (when all other agents play grim trigger) is δY¹β/(1−δ), and the cost from larger cycles is at most δ²(Y² − Y¹)β/(1−δ) (as this quantity would correspond to the case where all the larger cycles are 3-cycles, which is the worst case for a defecting agent). Thus, the cost of defection for an agent is at most δY¹β/(1−δ) + δ²(Y² − Y¹)β/(1−δ) = (δβ(1−δ)Y¹ + δ²βY²)/(1−δ). The savings from defection, on the other hand, is Y³/(1−δ). Multiplying by (1−δ), we obtain that the agent will be eliminated in the first iteration of the algorithm if 0 < Y³ − δβ(1−δ)Y¹ − δ²βY².

Under the ER model, we have E[Y¹] = (n−1)p² and E[Y²] = E[Y³] = (n−1)p. Multiplying the inequality assumed in the statement of the theorem by (n−1)p, we obtain 0 < (n−1)p − δβ(n−1)p² − δ²(n−1)p(1−p)β = (n−1)p − δβ(1−δ)(n−1)p² − δ²β(n−1)p = E[Y³] − δβ(1−δ)E[Y¹] − δ²βE[Y²]. This is precisely the sufficient condition we obtained above for an agent being eliminated in the first round, except that now expectations of all random variables are taken. What remains to show is that as n increases, the random variables go to their expected values sufficiently fast that all agents will be eliminated in the first round with high probability. We use Hoeffding bounds to show this.

Because 0 < (n−1)p − δβ(1−δ)(n−1)p² − δ²β(n−1)p, we can choose values b₁, b₂, b₃ > 0 so that 0 < (n−1)p(1−b₃) − δβ(1−δ)(n−1)p²(1+b₁) − δ²β(n−1)p(1+b₂) = E[Y³](1−b₃) − δβ(1−δ)E[Y¹](1+b₁) − δ²βE[Y²](1+b₂) is satisfied. Then, using the Hoeffding bound, with high probability (for sufficiently large n), the actual number of 2-cycles is below (1+b₁) times its estimate for every node, the actual number of incoming edges is below (1+b₂) times its estimate for every node, and the actual number of outgoing edges is above (1−b₃) times its estimate for every node. Moreover, by the expression we chose the b_i to satisfy, as long as the actual values are this close to their estimates, the savings in fact outweigh the costs for every node.

Specifically, we first apply the Hoeffding bound to the number of 2-cycles to obtain P[Y¹ ≥ (1+b₁)(n−1)p²] ≤ e^{−b₁²(n−1)p²/3}. The right-hand side probability bound, even when multiplied by n to obtain n·e^{−b₁²(n−1)p²/3}, will decrease arbitrarily close to 0 as n increases. That is, we can find N₁ large enough that e^{−b₁²(n−1)p²/3} < ε/(3n) for n > N₁. Because the left-hand side is an upper bound on the probability that an agent will have an unusually large number of 2-cycles, applying the union bound over all agents (thereby multiplying by n), we obtain that all agents' numbers of 2-cycles stay below the upper bound with probability at least 1 − ε/3. Similarly, for the number of incoming edges, we know P[Y² ≥ (1+b₂)(n−1)p] ≤ e^{−b₂²(n−1)p/3}. Again, as n increases, n·e^{−b₂²(n−1)p/3} will decrease arbitrarily close to 0, and we can find N₂ large enough that e^{−b₂²(n−1)p/3} < ε/(3n) for n > N₂. Applying the union bound again, we obtain that each agent's number of incoming edges does not exceed the upper bound with probability at least 1 − ε/3. Lastly, applying the Hoeffding bound to the number of outgoing edges, P[Y³ ≤ (1−b₃)(n−1)p] ≤ e^{−b₃²(n−1)p/2}. Since n·e^{−b₃²(n−1)p/2} will decrease arbitrarily close to 0 as n increases, we can find N₃ large enough that e^{−b₃²(n−1)p/2} < ε/(3n) for n > N₃. Applying the union bound, we obtain that each agent's number of outgoing edges exceeds the lower bound with probability at least 1 − ε/3.

By applying the union bound once more, we obtain that for n > N = max{N₁, N₂, N₃}, all the random variables stay within their bounds, that is, Y¹ < E[Y¹](1+b₁), Y² < E[Y²](1+b₂), and Y³ > E[Y³](1−b₃), for all agents with probability at least 1 − ε. Hence, for n > N, with probability at least 1 − ε, for all agents Y³ − δβ(1−δ)Y¹ − δ²βY² > E[Y³](1−b₃) − δβ(1−δ)E[Y¹](1+b₁) − δ²βE[Y²](1+b₂) > 0 (the last inequality holding by our choice of the b_i). But this is precisely the condition for an agent being eliminated in the first iteration of the algorithm.

We now show that if we are on the "cooperating" side of the phase transition, then for sufficiently large n, with high probability no agents would be eliminated in the first iteration of the algorithm, thereby bringing it to a halt. Thus, with high probability all agents can cooperate forever in equilibrium.

Theorem 8. If β, δ, and p are such that 1 − δβp − δ²(1 − p)β < 0, then for any ε > 0, there exists N such that for all n > N, the probability that no agent is eliminated in the first iteration of Algorithm 2.4 is at least 1 − ε.

Proof. We start by showing that for sufficiently large n, the probability of any agent not finding out about another agent's defection within two steps becomes arbitrarily close to 0. Given an ordered pair of agents, the expression for this probability is (1−p)(1−p²)^{n−2}. This value quickly becomes smaller as n increases, so that even when multiplied by n(n−1) to consider all possible ordered pairs of agents in the network, the new expression will eventually become arbitrarily close to 0. That is, we can find N₀ large enough that (1−p)(1−p²)^{n−2} < ε/[4n(n−1)] holds for all n > N₀. Because the left-hand side is an expression for the probability that a given agent will not find out about another given agent's defection within two steps, applying the union bound over all ordered pairs of agents (thereby multiplying by n(n−1)), we obtain that any agent would find out about any other agent's defection within two steps with probability at least 1 − ε/4, when n > N₀.

The remainder of the proof mirrors that of Theorem 7. Again, there are three random variables that we will use in this theorem: Y¹ for the number of 2-cycles, Y² for the number of incoming edges, and Y³ for the number of outgoing edges, for a given agent (vertex). Based on the simplifications in Theorem 7, assuming that any agent would find out about any other agent's defection in at most two steps, the cost of defection of an agent is (δβ(1−δ)Y¹ + δ²βY²)/(1−δ), while the savings from defection is Y³/(1−δ). Multiplying by (1−δ), we obtain that the agent will not be eliminated in an iteration of the algorithm if 0 > Y³ − δβ(1−δ)Y¹ − δ²βY². Under the ER model, we have E[Y¹] = (n−1)p² and E[Y²] = E[Y³] = (n−1)p. Multiplying the inequality assumed in the statement of the theorem by (n−1)p, we obtain 0 > (n−1)p − δβ(n−1)p² − δ²(n−1)p(1−p)β = (n−1)p − δβ(1−δ)(n−1)p² − δ²β(n−1)p = E[Y³] − δβ(1−δ)E[Y¹] − δ²βE[Y²]. This is the expression we obtained above for an agent not being eliminated in an iteration, except that now expectations of all random variables are taken. Now we want to show that as n increases, the random variables go to their expected values sufficiently fast that no agent will be eliminated in the first round with high probability. We use Hoeffding bounds to show this.

The steps are identical to those in the proof of Theorem 7. Because 0 > (n−1)p − δβ(1−δ)(n−1)p² − δ²β(n−1)p, we can choose values b₁, b₂, b₃ > 0 so that 0 > (n−1)p(1+b₃) − δβ(1−δ)(n−1)p²(1−b₁) − δ²β(n−1)p(1−b₂) = E[Y³](1+b₃) − δβ(1−δ)E[Y¹](1−b₁) − δ²βE[Y²](1−b₂) is satisfied. Then, using the Hoeffding bound, with high probability (for sufficiently large n), the actual number of 2-cycles is above (1−b₁) times its estimate for every agent, the actual number of incoming edges is above (1−b₂) times its estimate for every agent, and the actual number of outgoing edges is below (1+b₃) times its estimate for every agent. Moreover, by the expression we chose the b_i to satisfy, as long as the actual values are this close to their estimates, the costs in fact outweigh the savings for every agent.

Specifically, we first apply the Hoeffding bound to the number of 2-cycles to obtain P[Y¹ ≤ (1−b₁)(n−1)p²] ≤ e^{−b₁²(n−1)p²/2}. The right-hand side probability bound, even when multiplied by n to obtain n·e^{−b₁²(n−1)p²/2}, will decrease arbitrarily close to 0 as n increases. That is, we can find N₁ large enough that e^{−b₁²(n−1)p²/2} < ε/(4n) for n > N₁. Because the left-hand side is an upper bound on the probability that an agent will have an unusually small number of 2-cycles, applying the union bound over all agents (thereby multiplying by n), we obtain that all agents' numbers of 2-cycles stay within the bounds with probability at least 1 − ε/4. For the number of incoming edges, we know P[Y² ≤ (1−b₂)(n−1)p] ≤ e^{−b₂²(n−1)p/2}. Again, as n increases, n·e^{−b₂²(n−1)p/2} will decrease arbitrarily close to 0, and we can find N₂ large enough that e^{−b₂²(n−1)p/2} < ε/(4n) for n > N₂. Applying the union bound again, we obtain that all agents' numbers of incoming edges exceed the lower bound with probability at least 1 − ε/4. Lastly, applying the Hoeffding bound to the number of outgoing edges, P[Y³ ≥ (1+b₃)(n−1)p] ≤ e^{−b₃²(n−1)p/3}. Since n·e^{−b₃²(n−1)p/3} will decrease arbitrarily close to 0 as n increases, we can find N₃ large enough that e^{−b₃²(n−1)p/3} < ε/(4n) for n > N₃. Applying the union bound, we obtain that each agent's number of outgoing edges does not exceed the upper bound with probability at least 1 − ε/4. Applying the union bound over these three cases, we know that for n > max{N₁, N₂, N₃}, all the random variables stay within their bounds, that is, Y¹ > E[Y¹](1−b₁), Y² > E[Y²](1−b₂), and Y³ < E[Y³](1+b₃), for all agents with probability at least 1 − 3ε/4.

Additionally, we need that every agent would find out about every other agent's defection within two steps; the first part of the proof showed that this happens with probability at least 1 − ε/4 for n > N₀. To finish the proof, we apply the union bound one last time, and obtain that for n > N = max{N₀, N₁, N₂, N₃}, with probability at least 1 − ε, no agent will be eliminated in the first iteration of the algorithm.

While these results show that our expression is accurate in the limit as the number of agents grows, when applying the same expression for a finite number n of agents, it is merely an analytical estimate of where cooperation will transition to defection (and this transition may be gradual for small numbers). Intuitively, our analytical estimate of where the phase transition happens is optimistic (meaning that our estimate would suggest cooperation is achieved in more cases than it actually is) in various ways. First, we made the assumption that a defection is detected by other agents in at most two rounds. In reality, some agents will take longer to find out, and this increases the incentive to defect. Second, in reality the notion of a representative agent is a fiction; at best, agents are all in a situation roughly similar to that of the representative agent, but there will be some noise. In particular, some agents will be in a situation that is slightly less conducive to cooperation (e.g., having more outgoing edges or fewer incoming edges). If so, such an agent may (barely) have an incentive to defect even if the representative agent (barely) has no incentive to defect. Thereby, such agents could set off a cascade of further defections, causing a collapse of cooperation. For both ways in which the estimate is optimistic, as p and n increase, we expect the estimate to become more accurate: in the first case because it becomes increasingly likely that another agent will find out in at most two rounds, and in the second case because agents' incentives to defect and cooperate will cluster relatively more closely around their expected values.

Figure 2.8: Color figures that compare the simulation results to the analytical expression (by showing the difference between them). The specifications of each panel are the same as in Figure 2.5. The analytical expression predicts cooperation on the upper right sector and defection on the rest. The pale orange sections indicate the analytical prediction and the simulation results matching up. The band in the middle indicates where the simulation result deviates from the analytical prediction. The white (black) section in each band indicates where the analytical prediction is cooperation (defection) while significant defection (cooperation) is observed in simulations.

How well does the analytical expression actually perform at predicting the results from simulations? This is shown in Figure 2.8. The results are as expected (given the previous discussion) for the ER graphs. The analytical expression starts to match very well once n is sufficiently large, and where it is off (close to the phase transition), it is off because it is too optimistic, predicting cooperation where cooperation in fact still collapses into defection. We were quite surprised to find that the analytical expression, which we after all developed based on intuitions about ER graphs, actually performs even better at predicting the phase transition in PA graphs. In these graphs, the analytical prediction is sometimes too optimistic and sometimes too pessimistic, but overall the match is remarkable.

2.7 Credibility of Threats in Equilibrium

So far, we have focused on the maximal set of cooperating agents, S*, in Nash equilibrium. However, these equilibria involve threats of grim-trigger defection that may not be credible. Suppose agent 1 has one otherwise isolated neighbor, agent 2, that has defected. If agent 1 defects, as she is supposed to in the grim trigger equilibrium, then she will also set off defections in the rest of the network, which may come at a significant cost to her. On the other hand, if she ignores the defection and continues to cooperate, then the other agents will never learn of the defection (since agent 2 is otherwise isolated) and continue to cooperate. The latter may be better for agent 1, in which case the grim trigger strategy is not credible. Hence, if we impose an additional credibility condition on the equilibrium, the maximal set of nodes cooperating in equilibrium may be strictly smaller than S* (it will always be a subset). In this section, we will give a condition for the grim trigger strategy to be credible for a given set of agents in the network (S ⊆ A). Hence, when T[S] is a Nash equilibrium and the condition holds, then T[S] is also a credible equilibrium. As we will later show, for sufficiently large networks, this condition always holds in our simulations for S*.

So far, we have avoided a precise definition of when equilibria are credible. Of course, the notion of threats that are not credible is common in game theory, motivating refinements of Nash equilibrium such as subgame-perfect Nash equilibrium. Subgame-perfect equilibrium will not suffice for our purposes, because generally our game has no proper subgames: the acting agent does not know which moves were just taken by agents at distance 2 or more. Hence, subgame perfection does not add any constraints. We need a stronger refinement. Common stronger equilibrium refinements (such as perfect Bayesian equilibrium or sequential equilibrium) generally require specifying the beliefs that the players have at information sets that are off the path of play. In fact, in our game, we can obtain the following very strong refinement, which we will call credible equilibrium: if an agent learns that some deviation from the equilibrium has taken place, then she will be best off following her equilibrium strategy (e.g., grim trigger) regardless of her beliefs about which deviations have taken place, assuming that the other agents also follow their equilibrium strategies from this round on.¹

We now proceed towards deriving our condition for T[S*] to be a credible equilibrium. The first lemma shows that we can restrict our attention to only a single neighbor defecting.

Lemma 9. (Sufficiency of Singleton Deviations) Suppose that, for agent i, there is a set of agents K (with i ∉ K and K ∩ {k : (k, i) ∈ E} ≠ ∅) such that if i believes that K is the set of all agents that have defected so far, and all other agents will play grim trigger from now on, then i prefers postponing defection for r (1 ≤ r ≤ ∞) rounds to defecting immediately (i.e., grim trigger is not credible). Then, for any k ∈ K ∩ {k : (k, i) ∈ E}, if i believes that {k} is the set of all agents that have defected so far, and all other agents will play grim trigger from now on, i prefers postponing defection for r rounds to defecting immediately as well.

Proof. Let S Ď A and S´i “ Sztiu. The intuition for the lemma is that, in both scenarios, the cost of cooperating for r more rounds is the same, but the bene- fit from postponing defection—which is that other nodes will cooperate longer— is always at least as large in the second case. This is because for an arbitrary

1 A similar notion from the literature on repeated games with imperfect private monitoring is that of belief-free equilibrium (Ely et al. (2005)), in which an agent’s continuation strategy is optimal regardless of her beliefs about opponents’ histories. This, however, is not true for the strategies we study: specifically, if agent i is in a situation where all agents other than i’s own neighbors have defected, and all i’s neighbors will defect in the next period, then i would be better off defecting this period—but she will not do so under the grim-trigger strategy, as she is not yet aware of any defection having taken place. In contrast, our notion of credible equilibrium conditions on the agent knowing that a defection has taken place.

node $j$ with $(j, i) \in E$, in the first scenario, $j$ will learn of the defection after $\min\{r + d(i, j, S),\ d(K, j, S_{-i})\}$ rounds instead of after $\min\{d(i, j, S),\ d(K, j, S_{-i})\}$ (for defecting immediately). Similarly, in the second scenario, $j$ will learn of the defection after $\min\{r + d(i, j, S),\ d(k, j, S_{-i})\}$ rounds instead of after $\min\{d(i, j, S),\ d(k, j, S_{-i})\}$ (for defecting immediately). Of course, $d(K, j, S_{-i}) \le d(k, j, S_{-i})$. If $d(K, j, S_{-i}) \le d(i, j, S)$, then postponing does not help (for this node $j$) in the first scenario, and it can only help in the second scenario. If $d(i, j, S) < d(K, j, S_{-i})$, then postponing helps (for this node $j$) in the first scenario, but it helps at least as much in the second scenario, for the following reason: in either scenario, without postponing, $j$ learns of the defection after $d(i, j, S)$ rounds, but with postponing, $j$ will learn of the defection at least as late in the second scenario as in the first scenario.

Lemma 10. (One-Round Postponement) Suppose that the following holds: if $i$ believes that $\{k\}$ (with $(k, i) \in E$) is the set of all agents that have defected so far, and all other agents will play grim trigger from now on, then $i$ prefers postponing defection for $r$ ($1 \le r \le \infty$) rounds to defecting immediately (i.e., grim trigger is not credible). Then, in these circumstances, $i$ will also prefer postponing defection for exactly 1 round to defecting immediately.

Proof. The intuition for the lemma is that the longer agent i waits to defect, the fewer other agents will be informationally affected—in the sense that they learn about a defection having taken place later—by i deciding to wait one additional round. This is because increasingly many agents will learn about the defection via paths not involving i. Hence, the incentive to wait is strongest in the first round.

Let $I = \{p : (p, i) \in E\} \setminus \{k\}$. If all $j \in S_{-i}$ are playing grim trigger for $S$, we can divide the agents in $I$ into three sets, by comparing Case I, where $i$ defects immediately, to Case II, where $i$ defects after waiting $r$ periods. The first set includes agents in $I$ that learn of a defection after the same number of periods in Case I and Case II. Call this set $A := \{j \in I : d(k, j, S) = d(k, j, S_{-i})\}$. The second set includes agents that find out strictly later in Case II, but by fewer than $r$ periods. Call this set $B := \{j \in I : d(k, j, S) + r > d(k, j, S_{-i}) > d(k, j, S)\}$. The last set consists of agents that find out exactly $r$ periods later in Case II. Call this set $C := \{j \in I : d(k, j, S) + r \le d(k, j, S_{-i})\}$.

Similarly, let Case III be the case where $i$ delays defection by exactly one period. Correspondingly, we can divide the agents in $I$ into two sets by comparing Cases I and III. The first set is identical to $A$. The second set includes agents that learn of a defection exactly one period later in Case III than in Case I. Call this set $D := \{j \in I : d(k, j, S) + 1 \le d(k, j, S_{-i})\} = B \cup C$.

Let $U^{I}, U^{II}, U^{III}$ denote the long-term discounted utility for agent $i$ in each of the three cases. We wish to show that $U^{II} > U^{I} \Rightarrow U^{III} > U^{I}$. So, we assume $U^{II} > U^{I}$. We have:

$$U^{I} = \sum_{j \in I} \sum_{t=0}^{d(k,j,S)-1} \beta_{ji} \delta^{t}$$

$$U^{III} = -\kappa_i + \sum_{j \in A} \sum_{t=0}^{d(k,j,S)-1} \beta_{ji} \delta^{t} + \sum_{j \in D} \sum_{t=0}^{d(k,j,S)} \beta_{ji} \delta^{t}$$

$$U^{II} = -\sum_{t=0}^{r-1} \kappa_i \delta^{t} + \sum_{j \in A} \sum_{t=0}^{d(k,j,S)-1} \beta_{ji} \delta^{t} + \sum_{j \in B} \sum_{t=0}^{d(k,j,S_{-i})-1} \beta_{ji} \delta^{t} + \sum_{j \in C} \sum_{t=0}^{d(k,j,S)+r-1} \beta_{ji} \delta^{t}$$

Using $D = B \cup C$, we may rewrite $U^{II} > U^{I}$ as

$$\sum_{t=0}^{r-1} \kappa_i \delta^{t} < \sum_{j \in B} \sum_{t=d(k,j,S)}^{d(k,j,S_{-i})-1} \beta_{ji} \delta^{t} + \sum_{j \in C} \sum_{t=d(k,j,S)}^{d(k,j,S)+r-1} \beta_{ji} \delta^{t} \le \sum_{j \in D} \sum_{t=d(k,j,S)}^{d(k,j,S)+r-1} \beta_{ji} \delta^{t} = \left(\sum_{t=0}^{r-1} \delta^{t}\right) \sum_{j \in D} \delta^{d(k,j,S)} \beta_{ji}$$

Dividing both sides by $\sum_{t=0}^{r-1} \delta^{t}$, we obtain $\kappa_i - \sum_{j \in D} \delta^{d(k,j,S)} \beta_{ji} < 0$, whose left-hand side is equal to $U^{I} - U^{III}$. Hence, $U^{III} > U^{I}$, as was to be shown.

Theorem 11. (Credible Equilibrium) Suppose that for some set $S \subseteq A$, $T[S]$ is a Nash equilibrium. Then $T[S]$ is a credible equilibrium if and only if for any $k, i \in S$ with $(k, i) \in E$, it holds that $\kappa_i - \sum_{j \in D} \delta^{d(k,j,S)} \beta_{ji} \ge 0$, where $D = \{j \in I : d(k, j, S) + 1 \le d(k, j, S_{-i})\}$.

Proof. Because $T[S]$ is a Nash equilibrium, it is credible if and only if immediate defection upon learning of a deviation is credible. The condition in the theorem is equivalent to saying that when a single neighbor has deviated alone, defecting immediately is better than waiting one round to defect. By Lemma 10, this implies that under these circumstances, defecting immediately is optimal. By Lemma 9, this implies that defecting immediately is optimal whenever the agent believes some nonempty subset has defected.
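The condition in Theorem 11 is mechanical to check on a concrete instance. The sketch below is illustrative Python, not part of the formal development: the adjacency-dict graph encoding, the names `kappa` and `beta` for $\kappa_i$ and $\beta_{ji}$, and the reading of $d(k, j, T)$ as shortest-path distance using only intermediate nodes in $T$ are our assumptions.

```python
from collections import deque

def dist(graph, src, dst, allowed):
    """Shortest-path distance from src to dst using only intermediate
    nodes in `allowed` (endpoints always permitted); inf if unreachable."""
    ok = set(allowed) | {src, dst}
    seen, q = {src}, deque([(src, 0)])
    while q:
        v, d = q.popleft()
        if v == dst:
            return d
        for w in graph.get(v, ()):
            if w in ok and w not in seen:
                seen.add(w)
                q.append((w, d + 1))
    return float("inf")

def credible(graph, S, kappa, beta, delta):
    """Check the Theorem 11 condition for T[S]: for every edge (k, i)
    with k, i in S, kappa[i] - sum over j in D of
    delta**d(k, j, S) * beta[(j, i)] must be nonnegative."""
    for i in S:
        for k in graph.get(i, ()):   # neighbors k of i in the (undirected) graph
            if k not in S:
                continue
            I = [p for p in graph.get(i, ()) if p != k]
            total = 0.0
            for j in I:
                d_S = dist(graph, k, j, S)
                d_Smi = dist(graph, k, j, set(S) - {i})
                if d_S + 1 <= d_Smi:   # j in D: i's waiting delays j's information
                    total += delta ** d_S * beta[(j, i)]
            if kappa[i] - total < 0:
                return False
    return True
```

On a three-node path, removing the middle node disconnects its neighbors, so waiting by the middle agent delays information the most; credibility for her then turns on $\kappa_i$ versus $\delta^2 \beta_{ji}$.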

2.8 Extension: Simulation Analysis of the Credible Equilibrium

In our previous simulation section, we did not consider whether $T[S^*]$ was a credible equilibrium. In this simulation section, we use the condition in Theorem 11 to determine when this is the case. Figure 2.9 shows the fraction of instances for which $T[S^*]$ is a credible equilibrium. Of course, this is always true when $S^* = \emptyset$. Generally, when the credibility condition fails, it is in the region where both the discount factor $\delta$ and the benefit multiplier $\beta$ are high. This makes intuitive sense: if these parameters are low, there is little reason to postpone defection. More significantly, we see that as $n$ increases, the fraction of cases where the condition holds quickly converges to 1 everywhere. This, too, makes intuitive sense: the main reason to postpone defection is to slow down the spread of the information that a defection has taken place. However, the larger the network, the less likely it is that an individual node can do much to keep this information from spreading.

Figure 2.9: Gradient figures showing the fraction of cases in which $T[S^*]$ is a credible equilibrium, for different values of $\beta$ and $\delta$. The specifications of each panel are the same as in Figure 2.5. In the bottom left of each panel (black section), $T[S^*]$ is always credible; in the top right it is not always (where it is brighter), though as $n$ grows the condition quickly starts holding everywhere. In the $n = 10$ cases, even for the combination of $\beta$ and $\delta$ values where grim trigger is least credible, $T[S^*]$ is a credible equilibrium about 60 percent of the time.

2.9 Conclusion

In this chapter, we considered the following problem. In repeated games in which the agents are organized in a social network, it may take more than one round for an agent to find out about a defection that happened elsewhere in the graph. If so, it may increase incentives for agents to defect, because the losses from resulting punishment will be delayed and therefore discounted more heavily. We restricted our attention to games in which the agents can either cooperate or defect. We proved that there exists a unique maximal set of forever-cooperating agents in such games. We also gave an efficient algorithm for computing this set, which relies on iteratively removing agents

from the set that cannot possibly be incentivized to cooperate forever, based on which agents are still in the set. We evaluated this algorithm on randomly generated graphs and found an apparent phase transition: when the relative cooperation benefit $\beta$ and the discount factor $\delta$ are high enough, all agents can cooperate forever, but once these are lowered beyond a threshold, we get a total collapse, with no agents cooperating. We gave an analytical expression for where the phase transition occurs and proved that it is correct in the limit. Lastly, we gave an easy-to-check condition for when the threats in the equilibrium are credible, and found in simulations that for large graphs this condition always holds.

2.10 Future research

One direction for future research is to generalize the techniques in our chapter to more general games on networks, in which agents’ action spaces are not restricted to cooperation and defection. We expect that many of the same phenomena would occur in such games, and similar techniques would apply, but several additional technical challenges would have to be overcome. The main issue is that in sufficiently general games, multiple agents would need to coordinate their punishment actions for them to be maximally effective (unlike in the game studied here, where defection is always maximally effective). Such coordination is known to pose computational challenges even without network structure (Kontogiannis and Spirakis (2008); Hansen et al. (2008); Borgs et al. (2010); Andersen and Conitzer (2013)), and in our context there will be further challenges in coordinating punishment because information spreads slowly.

3

Role Assignment for Game-Theoretic Cooperation

Role assignment is an important problem in the design of multiagent systems. When multiple agents come together to execute a plan, there is generally a natural set of roles to which the agents need to be assigned. There are, of course, many aspects to take into account in such role assignment. It may be impossible to assign certain combinations of roles to the same agent, for example due to resource constraints. Some agents may be more skilled at a given role than others. In this chapter, we assume agents are interchangeable and instead consider another aspect: if the agents are self-interested, then the assignment of roles has certain game-theoretic ramifications. A careful assignment of roles might induce cooperation whereas a careless assignment may result in incentives for an agent to defect. Specifically, we consider a setting where there are multiple minigames in which agents need to be assigned roles. These games are then infinitely repeated, and roles cannot be reassigned later on.1 It is well known, via the folk theorem, that sometimes cooperation can be sustained in infinitely repeated games due to the threat of future

1 Our model disallows reassigning agents because in many contexts, such reassignment is infeasible or prohibitively costly due to agents having built up personalized infrastructure or specialized expertise for their roles, as is easily seen in some of the examples in the next paragraph.

punishment. Nevertheless, some infinitely repeated games, in and of themselves, do not offer sufficient opportunity to punish certain players for misbehaving. If so, cooperation may still be attained by the threat of punishing the defecting agent in another (mini)game. But for this to be effective, the defecting agent needs to be assigned the right role in the other minigame. This chapter studies this game-theoretic role-assignment problem.

Our work contrasts with much work in game theory in which the model zooms in on a single setting without considering its broader strategic context. In such models, firms make production and investment decisions based on competition in a single market; teammates decide on how much effort to put in on a single project; and countries decide whether to abide by an agreement on, for instance, reducing pollution. In reality, however, it is rare to have an isolated problem at hand, as the same agents generally interact with each other in other settings as well. Firms often compete in several markets (e.g., on computers and phones, cameras, and displays); members of a team usually work on several projects simultaneously; and countries interact with each other in other contexts, say trade agreements, as well. Looking at a problem in such an isolated manner can be limiting. There are games where a player has insufficient incentive to play the “cooperative” action, as the payoff from that action and the threat of punishment for defection are not high enough. In such scenarios, putting together two or more games with compensating asymmetries can leave hope for cooperation. A firm may allow another firm to dominate one market in return for dominance in another; a team member may agree to take on an undesirable task on one project in return for a desirable one on another; and a country may agree to a severe emissions-reducing role in one agreement in return for being given a desirable role in a trade agreement.
In this chapter, we first formalize this setup. Subsequently, we give useful necessary and sufficient conditions for a role assignment to sustain cooperation. We then

consider the computational problem of finding a role assignment satisfying these conditions. We show that this problem is NP-hard. We then give two algorithms for solving the problem nonetheless, both of which can be modified to find the minimal subsidy necessary to induce cooperation as well. One relies on an integer program formulation; the other relies on a dynamic programming formulation. We show the latter solves the problem in pseudopolynomial time when the number of agents is constant. However, in our experiments, the former algorithm is significantly faster, as shown at the end of our chapter.

3.1 Background: Repeated Games and the Folk Theorem

When considering systems of multiple self-interested agents, behavior cannot be directly imposed on the agents. However, incentives for desirable behavior can be created by means of rewards and punishments. In one-shot games, there are no opportunities to reward cooperation or to punish defection. On the other hand, when the game is repeated and agents interact repeatedly over time, richer outcomes could arise in equilibrium. Intuitively, when there is potential for future interactions (given that agents care about future outcomes, that is, the discount factor $\delta$ is positive), outcomes not sustainable in the one-shot version can be attained given the right reward and punishment strategies. The folk theorem characterizes equilibrium payoffs that can be obtained in infinitely repeated games, as agents become arbitrarily patient (see, e.g., Fudenberg and Tirole (1991)). The focus is on infinitely repeated games with discount factor $\delta$

(or, equivalently, repeated games that end after each period with probability $1 - \delta$). The possibility of future rewards and punishments makes certain outcomes besides the static Nash equilibria of the stage game sustainable, if the agents care “enough” about future payoffs. To characterize which outcomes are sustainable in equilibrium, an agent’s minimax payoff—the payoff that the other agents can guarantee she gets

at most, if they set out to minimize her utility—is key. There can be no equilibrium where an agent receives less than her minimax payoff, because the agent could then deviate and receive more. The folk theorem states that basically all feasible payoff vectors that Pareto dominate the minimax payoff vector can be attained in equilibrium with the threat of sufficiently harsh punishment for deviating from the intended behavior. Based on the ideas behind the folk theorem, we develop a characterization theorem (Theorem 12) that will serve as the foundation for analyzing our problem of bundling roles within (mini)games for game-theoretic cooperation. This theorem provides an easy-to-check condition for whether cooperation can be attained under a given role assignment.

3.2 Motivating Example

Consider two individuals (e.g., faculty members or board members) who together are to form two distinct committees. Each of the committees needs a chair and another member; these are the roles we need to assign to the two individuals. Each committee’s chair can choose to behave selfishly or cooperatively. Each committee’s other member can choose to sabotage the committee or be cooperative. The precise payoffs differ slightly across the two committees because of their different duties. (For example, acting selfishly as the chair of a graduate admissions committee is likely to lead to different payoffs than acting selfishly as the chair of a faculty search committee.) Let us first consider each of these two minigames separately. If the minigame is only played once, the chair has a strictly dominant strategy of playing selfishly (and hence, by iterated dominance, the other member will sabotage the committee).

Even if the game is repeated (with a discount factor $\delta < 1$), we cannot sustain the (cooperate, cooperate) outcome forever. This is because the chair would receive

Figure 3.1: Two committees example.

a payoff of 2 in each round from this outcome—but defecting to playing selfishly would give her an immediate utility of 3 or 4 in that round, after which she can still guarantee herself a utility of at least 2 in each remaining round by playing selfishly.

Now let us consider the minigames together. If the same agent is assigned as chair in each minigame, again we could not sustain the (cooperate, cooperate) outcome in both minigames, because the chair would gain $3 + 4$ immediately from defecting and still be able to obtain $2 + 2$ in each round forever after. On the other hand, if each agent is chair in one game, then with a reasonably high discount factor, (cooperate, cooperate) can be sustained. For suppose the chair of the second committee deviates by acting selfishly in that committee. This will give her an immediate gain of $4 - 2 = 2$. However, the other agent can respond by playing selfishly on committee 1 and sabotaging committee 2 forever after. Hence in each later round the original defector can get only $1 + 2 = 3$ instead of the $2 + 2 = 4$ from both agents cooperating, resulting in a loss of 1 in each round relative to cooperation. Hence, if $\delta$ is such that $2 \le \delta/(1 - \delta)$, the defection does not benefit her in the long run. This shows that linking the minigames allows us to attain cooperative behavior where this would not have been possible in each individual minigame separately. It also illustrates the importance of assigning the roles carefully in order to attain cooperation.

One may wonder what would happen if we link the minigames in a single-shot
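The arithmetic in the example is easy to verify directly. A minimal sketch in Python, using the payoff numbers from the example above (the function names are our own):

```python
# Immediate gain from defecting as chair of committee 2: 4 - 2 = 2.
# Per-round loss under punishment thereafter: (2 + 2) - (1 + 2) = 1.
gain, per_round_loss = 4 - 2, (2 + 2) - (1 + 2)

def defection_profitable(delta):
    # Defection pays off iff gain > delta / (1 - delta) * per_round_loss.
    return gain > delta / (1 - delta) * per_round_loss

# Solving gain = delta / (1 - delta) * per_round_loss for delta:
threshold = gain / (gain + per_round_loss)
```

Solving $2 = \delta/(1 - \delta)$ gives the threshold $\delta = 2/3$: for any discount factor at or above it, the deviation is unprofitable.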

(i.e., not repeated) context. This would correspond to the case $\delta = 0$, so that the above formula indicates that cooperation is not attained in this case. In fact, linking minigames cannot help in single-shot games in general: in a single-shot model, any equilibrium of the (linked) game must consist simply of playing an equilibrium of each individual minigame. (Otherwise, a player could improve her overall payoff by deviating in a minigame where she is not best-responding.) Linking becomes useful only when the game is repeated, because then one’s actions in one minigame can affect one’s future payoffs in other minigames, by affecting other players’ future actions. This is why the repeated game aspect is essential to our model.

3.3 Definitions

When there is a risk of confusion, we distinguish between minigames (of which there are two in the example above, corresponding to the two committees) and the larger meta-game that the minigames together constitute after roles have been assigned to agents. Note that technically the players in minigames are roles, not agents, and the metagame among the agents is not defined until roles have been assigned to

them. Let N “ t1, . . . , nu be the set of agents and G the set of minigames. We assume each minigame has n roles (such as committee chair or committee mem- ber); if this is not the case, we can simply add dummy roles. For each minigame

g g P G and each role r in g, there is a set of actions Ar for the agent in that role

g g to play, and a utility function ur that takes as input a profile of actions ~a in g,

g consisting of one action ar for each role r in g, and as output returns the utility of the agent in role r from this minigame. An agent i’s total payoff in round t is

g g gPG urpi,gqp~a ptqq, where rpi, gq denotes the role assigned to agent i in minigame g andř ~agptq is the profile played in minigame g in round t. For the repeated game, we

8 t g g consider both discounted payoff ( t“0 δ gPG urpi,gqp~a ptqq) and limit average pay-

T ´1 gř g ř off (lim infT Ñ8p1{T q t“0 gPG urpi,gqp~a ptqq). We assume perfect monitoring, i.e., ř ř 41 agents observe all actions taken by all agents in prior rounds. All this is common

knowledge, and so is the role assignment rpi, gq once the agents play the game. Our main interest is in assessing whether a particular outcome can be sustained in repeated play. We assume that for each game g, for each role r in g, there

˚g g is a distinguished action ar P Ar that we call the target action. (There is no requirement that this action should be “cooperative” in the sense of increasing other agents’ utilities or maximizing the social utility; it can be any action, for example one that a principal assigning the roles would like to see happen for exogenous reasons. Alternatively, one can think of the principal having an extremely large utility for the target action being played, and including the principal’s utility in the social welfare.)

Our main question is whether there exists a role assignment function rpi, gq such that there is an equilibrium of the repeated game where every agent always plays the target action in every role assigned to her. For this question, the key issue is which roles (from different minigames) are bundled together, rather than which particular agent is assigned this bundle of roles. We postpone the definition of this question as a computational problem until we have done some further simplifying analysis.
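As a concrete rendering of this setup, minigames can be represented as maps from action profiles to per-role payoffs. The following is an illustrative Python sketch; the dictionary encoding and the helper name `round_payoff` are our own, not notation from the text:

```python
# Each minigame g maps an action profile (one action per role, in role
# order) to a tuple of per-role payoffs. An assignment gives each agent
# a role in every minigame: assignment[i][g] = role of agent i in game g.

def round_payoff(minigames, assignment, profiles):
    """Total payoff of each agent in one round.
    minigames:  list of dicts {action_profile_tuple: payoff_tuple_by_role}
    assignment: assignment[i][g] = role of agent i in minigame g
    profiles:   profiles[g] = the action profile played in minigame g"""
    n = len(assignment)
    return [sum(minigames[g][profiles[g]][assignment[i][g]]
                for g in range(len(minigames)))
            for i in range(n)]
```

For instance, with two copies of a prisoner's-dilemma-like minigame and agents swapping roles across the two games, each agent's round payoff is the sum of her two role payoffs.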

3.4 Related Literature

The assignment of roles in multiagent systems has of course received previous attention, especially in domains such as RoboCup soccer. However, we are not aware of any multiagent systems literature on assigning roles across multiple games in a way to achieve game-theoretically stable cooperation. The work by Grossi and Turrini (2012) that combines the concepts of dependence theory and game theory is distantly related to our work, focusing on identifying coalitions such that agents mutually benefit within this coalition.

In the economics literature, some of the first work to recognize the effect of playing multiple games in parallel took place in research on industrial behavior. In

1955, Corwin Edwards proposed the possibility that multimarket contact between firms could allow them to reach strategically stable arrangements that could not be reached in a single market and thereby foster anticompetitive outcomes. Subsequent theoretical works explore the effect of multimarket contact on economic performance (e.g., Bulow et al. (1985)) and the degree of cooperation sustainable with repeated competition (e.g., Bernheim and Whinston (1990)). Papers by Folmer and von Mouche (2007) and Just and Netanyahu (2000) concern how the structure of the linked component games affects the potential for cooperation. Both identify the expansion in the bargaining set through linkage as the key to potential increase in cooperation. The latter work further examines linking common game classes such as the prisoner’s dilemma, assurance, iterated dominance, and chicken games; they do not observe the chances of coming to a fully cooperative equilibrium increasing significantly except for the case of linking prisoner’s dilemma games.

The intuition that linking games relaxes incentive constraints and the results on what game structure allows linkage to lead to cooperation are both in accordance with the results by Jackson and Sonnenschein (2007). These authors formally address the relaxation of incentive constraints through linking decision problems and the resulting efficiency gains in the general context of social choice with preference announcements.

3.5 Theoretical Analysis

In this section, we use the ideas behind the folk theorem to analyze whether our problem has a solution or not. We show that whether it does comes down to a single number per minigame role. The intuition that allows us to show this is as follows. To determine whether a given agent i will defect (i.e., play something other than the target action in some role assigned to her), by the folk theorem, we may assume

that all other agents will play their target actions until some defection has taken place, after which they maximally punish agent $i$ (in all games, not just the ones in which she defected). Thus, in the round in which agent $i$ defects, she may as well play the single-round best response to the target actions in every role assigned to her; afterwards, she will forever receive the best she can do in response to maximal punishment. (Since we only consider Nash equilibrium, we do not have to worry about multiple agents deviating.) Target actions are limited to pure strategies (since agents cannot observe whether a specific mixed strategy was played), but we allow for mixed (even correlated, as we discuss below) actions in the punishment phase. The net effect of the defection on $i$'s utility may be positive or negative for any given role; whether $i$ will defect depends solely on the sum of these effects. We now formalize this. Recall that $\vec{a}^{*g}$ is the profile of target actions for minigame $g$.

Definition 4. Given a minigame $g$ and a role $r$ in $g$, let $c_r^g$ denote the (long-run) cooperation value for that role when the target actions are played by all players. With limit average payoffs, $c_r^g = u_r^g(\vec{a}^{*g})$. With discounted payoffs, $c_r^g = \sum_{t=0}^{\infty} \delta^t u_r^g(\vec{a}^{*g}) = u_r^g(\vec{a}^{*g})/(1 - \delta)$.

Next, we want to specify the defection value. This requires us to know what utility a player will get in rounds after defection, which depends on how effective the other players are in punishing. In a two-player game, the punishing player should play a minimax strategy—i.e., play as if she were playing a zero-sum game where her utility is the negative of that of the defecting player. With three or more players, an important question is whether the players other than the defector can coordinate (i.e., correlate) their strategies.2 If not, this leads to NP-hardness (Borgs et al. (2010)). Therefore, we assume that they can correlate, which allows polynomial-time computability (Kontogiannis and Spirakis (2008)). Formally, when the player in role $r$ has defected, the remaining players $-r$ will play $\arg\min_{\sigma_{-r}^g} \max_{a_r^g} u_r^g(a_r^g, \sigma_{-r}^g)$, where $\sigma_{-r}^g$ is a mixed strategy for the players $-r$ (allowed to be correlated if there are two or more players in $-r$). In the case of discounted payoffs, we also need to know the utility a player will get in the first round she defects; this is $\max_{a_r^g} u_r^g(a_r^g, a_{-r}^{*g})$.

2 For 3+ players, it strengthens our hardness result (to be discussed in Section 3.6) that it holds even with correlated punishment. Computational issues aside, without correlated punishment there are some (known) conceptual problems because the minimax theorem fails with 2 players against 1. Briefly, consider a variant of rock-paper-scissors where player 3 just plays R, P, or S, but players 1 and 2, who play together in a sense that they have identical payoffs, both pick from $\{0, 1\}$; 00

Definition 5. Given a minigame $g$ and a role $r$ in $g$, let $d_r^g$ denote the (long-run) defection value for that role. With limit average payoffs, $d_r^g = \min_{\sigma_{-r}^g} \max_{a_r^g} u_r^g(a_r^g, \sigma_{-r}^g)$. With discounted payoffs, $d_r^g = \max_{a_r^g} u_r^g(a_r^g, a_{-r}^{*g}) + \sum_{t=1}^{\infty} \delta^t \min_{\sigma_{-r}^g} \max_{a_r^g} u_r^g(a_r^g, \sigma_{-r}^g) = \max_{a_r^g} u_r^g(a_r^g, a_{-r}^{*g}) + \frac{\delta}{1-\delta} \min_{\sigma_{-r}^g} \max_{a_r^g} u_r^g(a_r^g, \sigma_{-r}^g)$.

In the end, what matters is the net effect of defection.

Definition 6. Given a minigame $g$ and a role $r$ in $g$, let $m_r^g = c_r^g - d_r^g$ denote the robustness measure for that role.
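These quantities can be computed directly from a normal-form minigame. The sketch below is illustrative Python using limit average payoffs; as a simplification it restricts the punishment phase to pure strategy profiles of the other roles, whereas the definitions above allow mixed and correlated punishment (for a game like the prisoner's dilemma the two coincide):

```python
from itertools import product

def robustness_measures(actions, u, target):
    """Limit-average c, d, m per role for one normal-form minigame.
    actions: list of action lists, one per role
    u(profile): tuple of per-role payoffs for an action profile
    target: the target action profile a*.
    Punishment is restricted to PURE strategy profiles of the other
    roles (an illustrative simplification of Definition 5)."""
    n = len(actions)
    c = list(u(tuple(target)))            # cooperation value per role
    d = []
    for r in range(n):
        others = [actions[q] for q in range(n) if q != r]
        best = float("inf")
        for punish in product(*others):
            # payoff role r can secure against this punishment profile
            val = max(u(tuple(punish[:r]) + (a,) + tuple(punish[r:]))[r]
                      for a in actions[r])
            best = min(best, val)
        d.append(best)                    # defection value per role
    return c, d, [c[r] - d[r] for r in range(n)]
```

In a standard prisoner's dilemma with target profile (C, C), this yields $c = (2, 2)$, $d = (1, 1)$, and robustness measures $m = (1, 1)$.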

The theorem now says that we can obtain the desired outcome if and only if there is an assignment that gives each agent a nonnegative sum of robustness measures.

Theorem 12. The repeated metagame with assignment $r(i, g)$ has an equilibrium (allowing correlated punishment) in which on the path of play, in every round $t$, in every minigame $g$, every role $r$ plays $a_r^{*g}$, if and only if the assignment $r(i, g)$ is such that for all $i$, $\sum_g m_{r(i,g)}^g \ge 0$.

Proof. We first prove the “if” direction, supposing that $\sum_g m_{r(i,g)}^g \ge 0$ for all $i$. Consider a grim trigger strategy profile where all players cooperate (play $a_r^{*g}$) in every role $r$ to which they have been assigned as long as everyone else does so; if some player (say $i$) has deviated, the other players $-i$ switch to maximally punishing $i$ via correlated punishment—that is, in every game $g$, they play a strategy in $\arg\min_{\sigma_{-r(i,g)}^g} \max_{a_{r(i,g)}^g} u_{r(i,g)}^g(a_{r(i,g)}^g, \sigma_{-r(i,g)}^g)$. We must show this strategy profile is an equilibrium. Consider an arbitrary agent $i$. Not deviating will give $i$ a long-term utility of $\sum_g c_{r(i,g)}^g$. What about deviating? Without loss of generality, suppose $i$ deviates in the first round. The highest expected payoff $i$ can obtain in that first round is $\sum_g \max_{a_{r(i,g)}^g} u_{r(i,g)}^g(a_{r(i,g)}^g, a_{-r(i,g)}^{*g})$; in every remaining round, she can obtain at most $\sum_g \min_{\sigma_{-r(i,g)}^g} \max_{a_{r(i,g)}^g} u_{r(i,g)}^g(a_{r(i,g)}^g, \sigma_{-r(i,g)}^g)$. Hence, her long-term utility is at most $\sum_g d_{r(i,g)}^g$. But by assumption, $\sum_g m_{r(i,g)}^g \ge 0$, which is equivalent to $\sum_g c_{r(i,g)}^g \ge \sum_g d_{r(i,g)}^g$. So agent $i$ has no incentive to deviate.

We now prove the “only if” direction, supposing that $\sum_g m_{r(i,g)}^g < 0$ for some $i$. Consider a strategy profile where on the path of play all players cooperate (play $a_r^{*g}$). Again, not deviating will give player $i$ a long-term utility of $\sum_g c_{r(i,g)}^g$. By the minimax theorem, $\min_{\sigma_{-r(i,g)}^g} \max_{a_{r(i,g)}^g} u_{r(i,g)}^g(a_{r(i,g)}^g, \sigma_{-r(i,g)}^g) = \max_{\sigma_{r(i,g)}^g} \min_{a_{-r(i,g)}^g} u_{r(i,g)}^g(\sigma_{r(i,g)}^g, a_{-r(i,g)}^g)$. Consider the deviating strategy where player $i$ plays, in every minigame $g$, a strategy from $\arg\max_{a_{r(i,g)}^g} u_{r(i,g)}^g(a_{r(i,g)}^g, a_{-r(i,g)}^{*g})$ in the first round, and then a mixed strategy from $\arg\max_{\sigma_{r(i,g)}^g} \min_{a_{-r(i,g)}^g} u_{r(i,g)}^g(\sigma_{r(i,g)}^g, a_{-r(i,g)}^g)$. This guarantees her a long-term utility of at least $d_{r(i,g)}^g$ in each minigame $g$. By assumption, $\sum_g m_{r(i,g)}^g < 0$, which is equivalent to $\sum_g c_{r(i,g)}^g < \sum_g d_{r(i,g)}^g$. So player $i$ has an incentive to deviate and the original strategy profile is not an equilibrium.

2 (continued) means rock, 01 means paper, 10 means scissors, and 11 means a fourth action (fragile) that always loses. If 1 and 2 can correlate, they can play the regular RPS minimax strategy guaranteeing payoff 0. But if they cannot, there is no joint strategy for 1 and 2 guaranteeing 0. On the other hand, the best that player 3 can guarantee is 0. So there is no well-defined value of the game and it is arguably less clear what will happen.

To determine whether cooperation can be attained, properties of the group of minigames as a whole matter, as opposed to those of individual minigames: even if there are many minigames with negative robustness measures for all agents, one more minigame could reverse the sum of the robustness measures to positive and result in cooperation. Because it is straightforward to compute the robustness measures from the minigames using the formulas above, Theorem 12 allows us to efficiently check whether a given role assignment has the desired behavior as an equilibrium. It also reduces the problem of finding such a role assignment to the following computational problem.

Definition 7. (ROLE-ASSIGNMENT) We are given the $m_r^g$ (a vector of $n|G|$ numbers). We are asked whether there exists a function $r(i, g)$ that maps players one-to-one to the roles of the game $g$, such that for all $i$, $\sum_g m_{r(i,g)}^g \ge 0$.

The reader may be concerned that perhaps the formal ROLE-ASSIGNMENT problem is harder than the problem we actually need to solve, because perhaps some instances of this problem would not correspond to any actual game. Theorem 13 establishes that, conversely, given any vector of robustness measures $m_r^g$ (a vector of $n|G|$ numbers), it is straightforward to construct a vector of $|G|$ normal-form minigames that results in these robustness measures. This implies that if there is an efficient algorithm to solve the original problem given in the normal-form representation, then any ROLE-ASSIGNMENT instance can also be solved efficiently by solving the corresponding normal-form instance. We omit some proofs to save space.
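For intuition, ROLE-ASSIGNMENT can be decided by exhaustive search over the per-game bijections from players to roles, which is exponential in $|G|$ and so only usable on tiny instances. An illustrative Python sketch (the list-of-lists encoding of the $m_r^g$ is our own):

```python
from itertools import permutations, product

def role_assignment_exists(m):
    """Brute-force check of ROLE-ASSIGNMENT.
    m[g][r] is the robustness measure of role r in minigame g (each
    minigame has n roles). Searches every per-game one-to-one map of
    players to roles; exponential in the number of minigames."""
    n = len(m[0])
    for assign in product(permutations(range(n)), repeat=len(m)):
        # assign[g][i] = role of player i in minigame g
        if all(sum(m[g][assign[g][i]] for g in range(len(m))) >= 0
               for i in range(n)):
            return True
    return False
```

For example, two mirrored minigames with role measures $(-3, 3)$ each admit a solution (each player takes the costly role once), while measures $(-3, 3)$ and $(-4, 3)$ do not.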

Theorem 13. Given a length-$n$ vector of robustness measures $m$, we can efficiently construct an $n$-player normal-form minigame with two actions per player that generates exactly these robustness measures.

Proof. We refer to the $i$-th element of the vector $m$ by $m_i$. From the robustness measure vector, we construct an $n$-player binary-action normal-form minigame, where each agent chooses whether to play the target action (cooperate) or not (defect). We first show how to construct the normal-form game in the limit average payoff model. If a given $m_i$ is nonnegative, then we let the $i$-th agent's payoff for when the action profile $\vec{a} = (\text{cooperate}, \ldots, \text{cooperate})$ is played be $m_i$. If, on the other hand, a given $m_i$ is negative, then we let the $i$-th agent's payoff for when $\vec{a} = (\text{cooperate}, \ldots, \text{cooperate})$ is played be 0, and all of $i$'s payoffs where $i$'s action $a_i$ is defect (regardless of the actions $a_{-i}$ of the others) be $-m_i$ (that is, there is a positive value for defection). All payoffs not yet specified are set to 0. In the discounted payoff model, we go through the same operations, except instead of using the payoffs $m_i$ and $-m_i$, we replace them with $(1-\delta)m_i$ and $-(1-\delta)m_i$, to account for the discounting. Finally, we show that the normal-form game we constructed produces the original set of robustness measures. If $m_i$ was nonnegative, $c_i = m_i$ and $d_i = 0$, and if $m_i$ was negative, $c_i = 0$ and $d_i = -m_i$, for both limit average and discounted payoffs.

Consequently, we get $c_i - d_i = m_i$ for all $i$, as desired.
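To make the construction concrete, the following Python sketch (our own illustration; the function names and game representation are assumptions, not from the text) builds the limit-average payoff table from a robustness vector and recovers $c_i - d_i = m_i$:

```python
from itertools import product

def construct_minigame(m):
    """Theorem 13 construction (limit average payoff model): given robustness
    measures m, return an n-player binary-action minigame as a dict mapping
    action profiles (1 = cooperate, 0 = defect) to payoff vectors."""
    n = len(m)
    payoffs = {}
    for profile in product([0, 1], repeat=n):
        vec = [0] * n
        for i, mi in enumerate(m):
            if mi >= 0 and all(a == 1 for a in profile):
                vec[i] = mi        # reward only at the all-cooperate profile
            elif mi < 0 and profile[i] == 0:
                vec[i] = -mi       # positive value for defecting
        payoffs[profile] = vec
    return payoffs

def measures_from_game(payoffs, n):
    """Recover m_i = c_i - d_i, where c_i is i's payoff at the all-cooperate
    profile and d_i is the payoff i can guarantee by defecting."""
    all_coop = tuple([1] * n)
    return [payoffs[all_coop][i]
            - min(p[i] for prof, p in payoffs.items() if prof[i] == 0)
            for i in range(n)]
```

For instance, `measures_from_game(construct_minigame([3, -2, 0]), 3)` returns `[3, -2, 0]`, as the proof asserts.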

3.6 Complexity of ROLE-ASSIGNMENT

In this section, we show that the ROLE-ASSIGNMENT problem is NP-complete. We first show that it is weakly NP-complete even in an extremely restricted special case, namely, the case where we have only two players and each minigame has the following structure.

Figure 3.2: A Two-Player Active-Passive (2PAP) game.

Each minigame has two roles, Active and Passive. Passive has no choice. Active can choose to defect, in which case both players get 0, or to cooperate, in which case Active gets $-x_g \le 0$ and Passive gets $x_g \ge 0$. Hence, the robustness values are $m^g_{\text{Active}} = -x_g$ and $m^g_{\text{Passive}} = x_g$, for all $g \in G$. We call these games Two-Player Active-Passive (2PAP) games. Intuitively, cooperating in a game means giving the other player a specified gift, and enabling cooperation in all games requires that we balance the gifts exactly between the players.3 This suggests a reduction from the PARTITION problem (which is only weakly NP-hard, allowing pseudopolynomial-time algorithms).

Theorem 14. ROLE-ASSIGNMENT is (weakly) NP-complete for 2PAP games.

Proof. To prove NP-hardness, we reduce from the PARTITION problem, in which we are given a set of integers $\{w_1, \ldots, w_q\}$ and are asked whether there exists a subset $S \subseteq \{1, \ldots, q\}$ such that $\sum_{j \in S} w_j = \sum_{j=1}^{q} w_j / 2$. For an arbitrary instance of the PARTITION problem, we construct a ROLE-ASSIGNMENT instance by creating a 2PAP game $g(j)$ with $x_{g(j)} = w_j$ for each $j \in \{1, \ldots, q\}$.

If a solution $S$ to the PARTITION instance exists, then assign agent 1 to the Active role in all $g(j)$ with $j \in S$ and to the Passive role in all other games. Then, we have $\sum_g m^g_{r(1,g)} = -\sum_{j \in S} w_j + \sum_{j \notin S} w_j = 0$ and $\sum_g m^g_{r(2,g)} = -\sum_{j \notin S} w_j + \sum_{j \in S} w_j = 0$. Hence a solution to the ROLE-ASSIGNMENT instance exists.

Conversely, if a solution to the ROLE-ASSIGNMENT instance exists, let $S$ be the set of all $j$ such that agent 1 is assigned to the Active role in $g(j)$. We know $0 \le \sum_g m^g_{r(1,g)} = -\sum_{j \in S} w_j + \sum_{j \notin S} w_j$ and $0 \le \sum_g m^g_{r(2,g)} = -\sum_{j \notin S} w_j + \sum_{j \in S} w_j$. Hence $\sum_{j \in S} w_j = \sum_{j \notin S} w_j$ and $S$ is a solution to the PARTITION instance.

In the coming subsection on the dynamic program algorithm, we will show that the ROLE-ASSIGNMENT problem can in fact be solved in pseudopolynomial time when there are at most a constant number of agents. We now proceed to exhibit strong NP-completeness for n-Player Active-Passive (nPAP) games, in which each

3 One may wonder why cooperation is desirable at all in these games, but note that the reduction will work just as well if Passive receives a sufficiently small bonus $\epsilon$ when receiving a gift and Active does not have to pay this bonus. Alternatively, the principal may have an exogenous reason for preferring cooperation.

minigame $g$ has one Active and $n-1$ Passive players, and the Active player can choose to make a specified gift $x_g$ that will be equally divided among the other players (so they each receive $x_g/(n-1)$). The reduction resembles the previous one but is based on the strongly NP-hard 3-PARTITION problem.
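As a sanity check on the two-player reduction above, this Python sketch (our own illustration, with hypothetical function names) maps a PARTITION instance to 2PAP robustness measures and brute-forces the resulting ROLE-ASSIGNMENT instance:

```python
from itertools import product

def partition_to_role_assignment(w):
    """Map a PARTITION instance {w_1, ..., w_q} to 2PAP robustness measures:
    game j gives -w_j to its Active player and +w_j to its Passive player."""
    return [(-wj, wj) for wj in w]

def solve_2pap(games):
    """Brute force: choose, per game, which of the two players is Active.
    An assignment is feasible iff both players' aggregate robustness
    measures are nonnegative (here, exactly zero, since they sum to 0)."""
    for assign in product([0, 1], repeat=len(games)):
        totals = [0, 0]
        for (m_active, m_passive), a in zip(games, assign):
            totals[a] += m_active        # player a is Active in this game
            totals[1 - a] += m_passive   # the other player is Passive
        if all(t >= 0 for t in totals):
            return assign
    return None
```

`solve_2pap(partition_to_role_assignment([1, 2, 3]))` finds an assignment (since $\{1, 2\}$ versus $\{3\}$ is a valid partition), while `[1, 2, 4]`, whose total is odd, yields `None`.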

Theorem 15. ROLE-ASSIGNMENT is (strongly) NP-complete for nPAP games.

3.7 Algorithms for ROLE-ASSIGNMENT

Here, we present two algorithms for ROLE-ASSIGNMENT.

3.7.1 Integer Program

First, we reduce ROLE-ASSIGNMENT to the integer program (IP) in Figure 3.3. Combining this with any IP solver results in an algorithm for ROLE-ASSIGNMENT.

The robustness measure $m^g_r$ for each minigame $g$ and each role $r$ in $g$ is a parameter of the IP. We have an indicator variable $b(i,g,r) \in \{0,1\}$ for each agent $i$, each minigame $g$, and each role $r$ in $g$. ($b(i,g,r) = 1$ if and only if $i$ is assigned role $r$ in $g$.) There is another variable $v$ which the solver will end up setting to the minimum aggregate robustness value any agent has, $\min_i \sum_g m^g_{r(i,g)}$; maximizing this is the objective of the IP.

Note that this is not necessarily pushing things towards an equitable solution, as $m^g_r$ is not the payoff to the agent. The point is that this lets the IP determine not only whether the ROLE-ASSIGNMENT instance has a solution (which is the case if and only if the optimal objective value is nonnegative), but also the "most robust" solution.4 The assignment does not affect the overall welfare of the agents, as the aggregate payoffs are constant and predetermined by the prescription of the "cooperation" actions. This IP can easily be modified to determine the minimal

4 The word "robustness" was chosen because a higher value implies the solution would be more robust to changes in the utilities.

subsidy necessary to induce cooperation, by adding a payment variable for each player, whose sum is then minimized (cf. cost of stability (Bachrach et al. (2009))).

maximize $v$
subject to
  $(\forall i)$  $v - \sum_g \sum_{r \text{ in } g} m^g_r \, b(i,g,r) \le 0$  (min. robustness)
  $(\forall g, r \text{ in } g)$  $\sum_i b(i,g,r) = 1$  (one player per role)
  $(\forall i, g)$  $\sum_{r \text{ in } g} b(i,g,r) = 1$  (one role per player per game)

Figure 3.3: Integer program for ROLE-ASSIGNMENT.
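For intuition about what the IP computes, its maximin objective can be reproduced by exhaustive search on tiny instances; the Python sketch below (our own stand-in, not the CPLEX-based implementation used in Section 3.8) searches over all one-to-one role assignments:

```python
from itertools import permutations, product

def max_min_robustness(minigames):
    """Exhaustive analogue of the IP in Figure 3.3: each minigame g is a
    length-n tuple of robustness measures (one per role); every game
    independently assigns players to roles via a permutation. Returns the
    maximum, over assignments, of the minimum aggregate robustness measure;
    the instance has a solution iff this value is nonnegative."""
    n = len(minigames[0])
    best = None
    for choice in product(permutations(range(n)), repeat=len(minigames)):
        totals = [0] * n
        for m_g, perm in zip(minigames, choice):
            for player, role in enumerate(perm):
                totals[player] += m_g[role]
        value = min(totals)
        best = value if best is None else max(best, value)
    return best
```

On two mirrored 2PAP games `[(-3, 3), (3, -3)]` the optimum is 0 (the gifts can be balanced exactly), so cooperation is sustainable.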

3.7.2 Dynamic Program

Even though the general $n$-player ROLE-ASSIGNMENT problem is strongly NP-complete (Theorem 15), below we give a dynamic programming (DP) algorithm that solves it in pseudopolynomial time when the number of agents is constant. For the purpose of presenting this algorithm, we assume that the payoffs are integers. (Of course, any rational numbers could be scaled up to integers.)

The algorithm takes as input the vector of robustness measures $m^g_r$ for each minigame $g \in G$ and each role $r$ in $g$. Let $L = \sum_g \min\{0, \min_r(m^g_r)\}$ (the lowest possible aggregate robustness measure for an agent from any subset of the games), $U = \sum_g \max\{0, \max_r(m^g_r)\}$ (the highest), and $X = U - L$. Also, let $P_g$ be a (size $n!$) set of vectors of length $|N|$, where each element is a permutation of $\{m^g_1, m^g_2, \ldots, m^g_{|N|}\}$. That is, $P_g$ is the set of all the possible robustness measure vectors from game $g$.

The algorithm fills up a table of size $|G| \times X^n$ containing Boolean values, with the first axis of the table ranging from 1 to $|G|$, and the other $n$ axes (one for each agent) ranging from $L$ to $U$. The table entry $T(g, k_1, k_2, \ldots, k_n)$ represents whether it is possible for each player $i$ to obtain an aggregate robustness measure of $k_i$ from

Figure 3.4: The plot represents the average runtime of solving an instance of ROLE-ASSIGNMENT through dynamic programming, given the social welfare maximizing target action profile. The x-axis represents how many minigames are to be assigned ($|G|$), and the y-axis represents how long it took to solve the instances, in seconds. Each data point in the graph is an average of multiple instances (ranging from 4 to 200, due to cases such as $n = 6$, where we decided to time out the program). While we studied many different game generators, as a representative case, we present the results for the case where $G$ consists of uniformly random games.

role assignments to the first $g$ minigames only (arbitrarily labeling the games as $1, \ldots, |G|$). We omit the formal description of the algorithm to save space. The ROLE-ASSIGNMENT instance then has a solution if and only if the last row of the table (for $g = |G|$) has a 1 for an entry with nonnegative values for the other axes, i.e., $(\exists k_1, \ldots, k_n \ge 0)\; T(|G|, k_1, \ldots, k_n) = 1$. In this row we can also find the maxmin aggregate robustness level that the integer programming algorithm finds, i.e., $\max\{v : (\exists k_1, \ldots, k_n \ge v)\; T(|G|, k_1, \ldots, k_n) = 1\}$. We can also determine the minimal subsidy necessary to induce cooperation by a single pass through this row. All that is needed is to bring up aggregate robustness values that are below zero to zero.
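The table-filling idea can be sketched compactly in Python; as an illustration (our own code, not the implementation evaluated below), we store only the reachable entries of each row as a set of tuples rather than a dense $|G| \times X^n$ table:

```python
from itertools import permutations

def role_assignment_dp(minigames):
    """DP over games: `reachable` holds every achievable vector
    (k_1, ..., k_n) of aggregate robustness measures after processing a
    prefix of the games (the nonzero entries of one row of T). Assumes
    integer measures. Returns True iff some vector is componentwise >= 0."""
    n = len(minigames[0])
    reachable = {tuple([0] * n)}
    for m_g in minigames:
        p_g = set(permutations(m_g))  # P_g: all role-to-player assignments
        reachable = {tuple(k + d for k, d in zip(vec, delta))
                     for vec in reachable for delta in p_g}
    return any(all(k >= 0 for k in vec) for vec in reachable)
```

A single pass over the final `reachable` set likewise yields the maxmin aggregate robustness level and the minimal subsidy described above.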

Figure 3.5: The plot represents the average runtime of solving an instance of ROLE-ASSIGNMENT through integer programming, given the social welfare maximizing target action profile. The x-axis represents how many minigames are to be assigned ($|G|$), and the y-axis represents how long it took to solve the instances, in seconds. Each data point in the graph is an average of 200 instances. The top of the range bar indicates the maximum time an individual instance required to be solved, and the bottom of the range bar indicates the shortest. The graphs presented here are from uniformly random games with no integral payoff restrictions.

Theorem 16. ROLE-ASSIGNMENT can be solved in pseudopolynomial time for a constant number of agents n.

Proof. The table has $|G| \times X^n$ entries, and filling in an entry requires up to $n!$ lookups. Moreover, $X \le |G| \cdot d$, where $d$ is the maximum difference between two robustness values in a minigame, which itself is $O(\upsilon - \lambda)$ where $\upsilon$ ($\lambda$) is the largest (smallest) single payoff in a minigame. Hence, with constant $n$, the algorithm is polynomial in $|G|$ and $\upsilon - \lambda$. (Of course, the input size is polynomial in $|G|$ and $\log(\upsilon - \lambda)$, which is why the algorithm is only pseudopolynomial.)

3.8 Simulation Analysis

In this section, we evaluate the two algorithms on random instances, generated using GAMUT (Nudelman et al. (2004)). For a given number $n$ of players, a given number $|G|$ of minigames, and a given game generator in GAMUT, we generate an instance by drawing $|G|$ $n$-player games with payoffs in the interval $[-5, 5]$ from the generator. (Though we have run the simulation on many different families of games (dispersion, coordination, $N$-player chicken, etc.), the runtimes do not appear to depend on the family, so we omit them due to limited space.) Because the DP algorithm requires payoffs to be integers, we round all the payoffs in each game to integers in $\{-5, -4, \ldots, 5\}$. We evaluate the IP algorithm on nondiscretized payoffs. (When we run it on the rounded payoffs, the IP algorithm is in fact even faster, and always returns the same solution as the DP.) CPLEX 12.6.0.0 and g++ 4.8.4 were used for the IP and DP, respectively.

We present the experimental results for when the target action is determined as the action profile maximizing the social welfare. (Similar patterns are observed when the target action is determined randomly.) We evaluate whether the target action can be sustained in an equilibrium. Both algorithms are guaranteed to return a solution, if one exists.

Figure 3.4 shows the results for the DP algorithm. Predictably, the runtime of the DP algorithm closely tracks the number of table entries that need to be filled in ($|G| \times X^n$). The number of table entries blows up quickly when $n$ increases. Figure 3.5 shows the results for the IP algorithm. The IP algorithm scales much, much better.

3.9 Conclusion

In this chapter, we have identified the problem of assigning roles to agents across multiple games in such a way that cooperative behavior becomes an equilibrium. We provided an easy-to-check necessary and sufficient condition for a given role assignment to induce cooperation, and used this to obtain hardness results as well as algorithms for the problem of finding such a role assignment. Our IP algorithm significantly outperformed our DP algorithm in experiments, even though the latter runs in pseudopolynomial time for constant numbers of agents.

We believe that there are many other important directions that can be studied in the context of game-theoretic role assignment. Our model can be extended to allow (perhaps costly) reassignment of roles as time progresses; different agent types that value roles differently, and preferences not only over roles but also over which type of agent one is matched with (providing connections to matching (Klaus et al. (2015)) and hedonic games (Aziz and Savani (2015))); side payments between agents (providing connections to matching with contracts (Hatfield and Milgrom (2005))); not every minigame being played in each round; generalizing from repeated games to stochastic or arbitrary extensive-form games; and so on. An alternative formulation, studying which payoff profiles can be sustained, can be considered as well. In the limit-average scenario this generalization is straightforward. In the discounted case, however, it is less obvious, because the immediate payoff from deviation becomes significant, and this payoff depends on which specific entry is played (especially when $\delta < 1$ is considered). We believe that this chapter provides a good foundation for such follow-up work.

The availability of a pseudopolynomial-time algorithm, when the number of agents is constant, also suggests that there may be potential for approximation algorithms. However, note that the problems as we have defined them are decision

55 problems, and it is not immediately obvious what the right optimization variant would be. One possibility may be to consider approximate equilibria.

4

Framing Matters: Sanctioning in Public Good Games with Parallel Bilateral Relationships

Cooperation among agents is crucial in achieving collective goals. However, we generally cannot expect every agent to choose actions that maximize societal benefit, especially if there are other actions available with greater private benefit to the agent. On a large scale, this is exemplified in struggles to address environmental problems. Collectively, everyone benefits from harmful emissions being reduced, but individual countries may prefer to keep their own emissions high and free-ride off the others' reductions. On a more immediate, day-to-day scale, we face similar struggles within teams, organizations, and communities. In a team project, agents face the temptation to slack off and free-ride on the members who carry the load. In attempting to achieve cooperation, one possibility is to sanction those who do not do their part. In the context of climate change, previous agreements explicitly prohibited sanctioning of noncooperative behaviors (Frankel (2005); United Nations Framework Convention on Climate Change (1997b,a); Economics focus (2003); Bond (2003)). In contrast, in the context of ozone layer depletion, where reduction efforts

were much more successful, bilateral trade sanctions were built into the treaty as an enforcement mechanism (United Nations (2017); Goldberg (1992); United Nations Environment Programme Ozone Secretariat (2017)). While it is hard to attribute the success in the latter case to this difference, it is natural to wonder how much the possibility of sanctions might contribute to achieving cooperation.

Prior work has investigated the role of sanctioning in establishing cooperation in public good games (Yamagishi (1986); Ostrom et al. (1992); Fehr and Gächter (2000, 2002); Masclet et al. (2003); Rege and Telle (2004); Andreoni et al. (2003); Gächter et al. (2008); Nikiforakis (2008); Cinyabuguma et al. (2006); Gintis (2000); Fehr et al. (2002); Gintis et al. (2003); Nikiforakis and Normann (2008); Carpenter (2007); Sefton et al. (2007)). In these prior studies, there were separate sanctioning stages in which subjects were explicitly given an opportunity to punish other players. Introducing such sanctioning stages, by itself, appears to help establish cooperation (Yamagishi (1986); Ostrom et al. (1992); Fehr and Gächter (2000, 2002); Masclet et al. (2003); Rege and Telle (2004); Andreoni et al. (2003); Gächter et al. (2008)). However, the sanctioning stages in these studies took a limited format, where individual players "purchase" disapproval votes and "cast" them at other players, consequently affecting the payoffs of the vote-receiving players. This means that this form of sanctioning may be difficult to institute without a "central planner" figure to introduce and enforce the formal sanctioning stage. Additionally, if "counterpunishment" stages are added in which those punished can counterpunish those who punished them, the effect goes away (Nikiforakis (2008)), unless it is obscured who did the punishing (Cinyabuguma et al. (2006)).

Instituting any formal sanctioning stages in a climate change agreement is probably not realistic.
Instead, any sanctioning is likely to take place informally in other domains of international relations, such as trade. Rather than anticipating a clear, direct, and immediate consequence of failing to cooperate, a country is likely to worry

more about a general loss of reputation in the international arena and consequent difficulties in establishing cooperation in other domains. While those domains thus may be used to sanction defectors, they are of significant importance in their own right, and their use for sanctioning is likely only one of multiple considerations in determining how to act there. We look at how existing bilateral relationships can be an effective medium for community enforcement of cooperation.

To investigate the effects of such informal sanctioning on cooperation, we design a multilayered public good game experiment that more accurately models the opportunities for sanctioning in global negotiations on climate change (and other similar problems). In our design, there is both a "global" public good game in which everyone participates (if one country reduces emissions, everyone else benefits), and a set of "local" public good games, one for each pair of participants, in which only those two participate (for example, the countries' bilateral trade relationship). Unlike existing papers on public good games with sanctioning, only these local public good games are available for sanctioning, and a subject may choose to view actions taken in these local games as entirely separate from those taken in the global game.

We have four treatment conditions in our study, corresponding to two binary variables that can be set independently. The first variable is "matching", which determines whether subjects are able to see how their partners in the local games match up to those in the global game. We have the on/off settings of this first variable represent two extremes of an accountability or observability variation. In the "on" setting, an agent knows exactly which of the other agents making a certain type of local game contribution made a specific global game contribution.
In the “off” setting, on the other hand, subjects do not know which of the local games corresponds to a specific global game behavior, and thus cannot use a local game to punish this defector in the global game. Therefore, setting the matching variable to “off” results in a type of control condition where most of the structure of the game is the same but

local games have become ineffective as a tool for sanctioning. The second variable is an on/off "framing" variable, which does not affect the structure of the game. This variable determines whether we cast a characteristic onto the relationship between the two types of games being played. If this variable is set to "on", then, at the beginning of the experiment, the subject is presented with additional text that explicitly points out the possibility of using the local games for sanctioning. This is a suggestive example of how the local games and global games can be related, which can affect behaviors empirically but does not introduce a structural change game-theoretically. When this variable is set to "off", we present the mechanics of the two games neutrally, without suggesting a particular manner of use. In this case, the relationship between the two games evolves naturally, and could still end up being the one that subjects arrive at under our suggestive framing condition.

The main results of this paper are that framing matters significantly, with cooperation decreasing when the possibility of sanctioning is pointed out, while matching makes only marginal differences to the contribution patterns. In light of previous research suggesting that sanctioning mechanisms can increase cooperation, these results are surprising, in the following two ways. (1) Only when the matching variable is set to "on" is targeted sanctioning possible, since matching allows for accurately linking players in the global and local games. Therefore, in light of previous work, one might have expected matching to have a significant positive effect. However, we do not observe that. Holding the framing variable constant, there is no statistically significant difference observed in the contribution patterns between the matched and non-matched treatments, implying that this fundamental change to the information structure of the game does not result in any noticeable changes in behavior.
(2) Because framing brings the possibility of punishment for failing to cooperate to the players’ attention, according to the sanctioning literature, one might have expected that agents would become more cooperative when the framing variable is set to its

"on" position, in fear of being punished otherwise (Yamagishi (1986); Ostrom et al. (1992); Fehr and Gächter (2000, 2002); Masclet et al. (2003); Gächter et al. (2008)). We in fact observe the opposite. The two framed treatments, where the presence of the sanctioning opportunities was brought to the agents' attention, suffer from contribution levels lower than those of the corresponding two non-framed treatments. Together, these results suggest that when sanctioning is possible only through existing other interactions, the effect can be quite different from that of a formalized sanctioning process, and framing the possibility of sanctioning can have unexpected effects in this context.

Our experimental results suggest that if the possibility of sanctioning parties through existing bilateral relations is discussed in climate change agreements, this may in fact reduce mitigation efforts. Without a rigid structure to ensure enforcement or commitment to punishment (as in the Montreal Protocol for mitigating ozone layer depletion), discussing the possibility of sanctioning may cause the parties involved to be more conservative in their initial mitigation efforts. Further analysis of our results shows that the non-framed cases exhibit more equitable levels of contribution, suggesting a more cooperative mindset.

Before discussing our results in detail, we give a short literature review and an overview of our experimental design. After analyzing the results, we close with a discussion of the implications for encouraging cooperative behavior in practice.

4.1 Relevant Literature and Points of Difference

There are two branches of literature that are useful background for understanding this paper. The first is on sanctioning, where the literature examines the impact and pattern of introducing a costly punishment mechanism and the consequent (predominant) increase and maintenance of higher levels of contributions. We explore how our experiment structurally differs from others, which may be one

of the two keys to understanding why our results are different. The second branch is on framing, where we put our work in context by discussing the dominant role of framing in the literature.

4.1.1 Sanctioning

Social dilemma situations are widely studied in the experimental literature, as an understanding of those situations can help identify factors that promote cooperative behaviors. The public good game is among the most studied frameworks, because it provides a simple situation with a direct tradeoff between self- and group-interested behaviors. The dominant traditional literature finds sanctioning to be a robust mechanism for eliciting cooperative behaviors in public good games, a result that contrasts with ours, as we will discuss further.

In the public good games literature, providing agents with sanctioning opportunities is said to be a robust mechanism for eliciting higher, cooperative contributions. While public good games without sanctioning opportunities exhibit contribution levels that decrease over time (Bohm (1972, 1983); Dawes (1980); Orbell et al. (1990); Marwell and Ames (1981)) (or over the number of remaining interactions (Isaac et al. (1994))), many papers show that when sanctioning opportunities are introduced, contribution levels can be sustained at a high level (Yamagishi (1986); Ostrom et al. (1992); Fehr and Gächter (2000, 2002); Masclet et al. (2003)). Agents are willing to punish others who make low contributions, even when the punishing decision only reduces their own monetary payoffs and has no potential future benefits, due to an extremely low likelihood of future interaction (Fehr and Gächter (2000)). This expectation of (even irrational) punishment in the face of noncooperative behaviors seems to make the presence of a sanctioning opportunity itself an even stronger credible threat to potential noncooperators. The ability to counter-punish, however, seems to potentially make cooperation vulnerable again (Nikiforakis (2008)),

though more exploration would be needed here, because that study's restricted structure for possible counter-punishments (that is, only if you were punished can you counter-punish) makes such a breakdown not too surprising.

Our experiment has a fundamental structural difference in design compared to other experiments in the literature on public good games with sanctioning. Similarly to the games in the literature, we have a two-staged design, where agents first play the standard public good games in the first stage of a round and then face the sanctioning stage in the second. However, the sanctioning stage in typical public good games takes a limited format of essentially "purchasing" punishment/disapproval votes and "casting" them at other agents, which is difficult to institute without an authority to form and enforce such external sanctioning stages. Further, this means that punishment carries no strategic tradeoff in isolation, but only gains strategic significance when placed in relation to the public good game played in the first stage. That is, if the sanctioning stage were isolated and played on its own repeatedly, the strictly dominant strategy would always be to assign no punishment points to the others, because the only action available to an agent would be to reduce her own payoff in order to reduce others' payoffs, to which others could respond by further reducing her payoff in retaliation. Our sanctioning stage, in contrast, mirrors existing one-on-one relationships in real life and is designed as a bilateral public good game, where there is a strategic tradeoff when viewed in isolation as well. Keeping a cooperative relationship is desirable for long-run interaction and payoff, though in the short run it is possible to gain from exploiting the other.
This allows us to assess whether these existing bilateral relationships on their own can act as a sanctioning mechanism, or whether it would be necessary to find a new way to integrate external sanctioning mechanisms for the sanctioning observed in the literature to be possible. To an extent, our sanctioning stage serves as a middle ground between the traditional external sanctioning structure and

Nikiforakis (2008)'s structure, where an additional stage for counter-punishment was introduced. This change in the sanctioning stage introduces an additional uncertainty: both the traditional structure and ours provide agents with opportunities to sanction other agents at the cost of their own payoff, but in our structure the extent to which one needs to bear the cost can be uncertain and endogenous to the interaction itself. Since agents can identify each action of others in the game and are informed of the presence of a potential sanctioning mechanism, we can use our matched framed treatment as the benchmark for comparisons with the literature.

4.1.2 Framing

The framing effect broadly refers to how agents react to manipulations of how a situation is presented to them. In the public good games context, there are two dominant directions in which framing and its effects have been studied: the first is positive-negative framing, which studies differences arising from whether the positive externality of contributing to the public good or the negative externality of not contributing is emphasized (Andreoni (1995); Park (2000); Hiroaki and Park (2010)); the second is the contrast between whether to "contribute" to a public good or to "take" from a public resource, described as give-take framing (Cubitt et al. (2011); Dufwenberg et al. (2011); Fosgaard et al. (2014); Khadjavi and Lange (2015); Cox (2015)).

Framing in our work relates more closely to that seen in the frame selection literature, where researchers study the spontaneous mental associations of a game with preexisting frames, in the absence of cues to color the interaction (Eriksson and Strimling (2014)). Eriksson and Strimling find that agents' contribution behaviors correspond to their interpretation of the situation. That is, agents who spontaneously selected the cooperative frame made contribution patterns similar to those of agents who were given a cooperative framing of the situation. They also find

that in the spontaneous frame selection conditions, it was very common for agents to associate games with cooperative settings (76–80%), while it was fairly common to associate them with competitive settings (37–44%). In our experiment, agents are given an additional medium of interaction, where we either "color" the medium as a sanctioning mechanism or leave it neutral. In the non-framed treatments, to which we compare the framed treatments, agents are given no prescribed expectation as to what actions should take place and, according to the frame selection literature, will find their own lens through which to interpret the situation. Our observation of higher contributions in the non-framed treatments is in accordance with what the frame selection literature would predict.

4.2 Experimental Design

We now present the details of the experimental setup.

4.2.1 Global and Local Public Good Games Overlaid

There are two types of games that subjects play simultaneously: the global game, which is the public good game in which all subjects participate, and a set of local games, which are bilateral public good games, one for each pair of subjects. In our experiments four subjects participate at a time, resulting in six local games. The global game represents, e.g., contributions to climate change mitigation, and the local games represent contributions made in other one-on-one relationships, e.g., tariff reductions in bilateral trades. While we are interested in the use of the local games for the purpose of sanctioning, given our motivation we do not cast them as a separate punishment stage, unlike existing literature. In both global and local games, agents are given endowments, and can then decide how much of this endowment to put into the common pot and how much to keep privately. Contributions to the common pot increase in value (at rates discussed

Table 4.1: Table of the four different treatments given in the experiment, corresponding to settings of the two binary variables, matching and framing.

             framing   no framing
matching     M+F       M+nF
no matching  nM+F      nM+nF

below) and are then split equally among participants in that game. This process is repeated ten times.

4.2.2 Information Conditions

There are four different treatment conditions, corresponding to two variables that are turned on or off (see Table 4.1). When we turn on the framing variable, we merely add a sentence in the description of the game to subjects, pointing out that the local game can be used to target a specific player, to reward or punish behaviors observed in the global game. Thus, the setting of the framing variable makes no difference to the structure of the actual game. On the other hand, turning on the matching variable introduces an actual change to the structure of the game played, by revealing to which local-game partner each player in the global game corresponds. That is, when the matching variable is turned on, agents are able to identify who in the global game is who in the local games, whereas when the matching variable is turned off, they are aware that each other player in the global game corresponds to the partner in some local game, but not which one. As a result, turning the matching variable off makes the local games useless for the purpose of targeted punishment or reward of players for actions in the global game. (One can still collectively reward or punish all agents for actions in the global game via the local games, but it is no longer possible to target a specific individual.) So as not to overwhelm subjects with too much information at decision points, subjects observe the contribution decisions of others in all the games in which they are participating, and those only. Specifically, there is no way for a

subject to know the behavior of her local-game partners in their other local games, because she is not participating in those games.

4.2.3 Experimental Subjects and Pre-Experiment Procedure

The experiment was conducted at the Interdisciplinary Behavioral Research Center (IBRC) lab of Duke University’s Social Science Research Institute (SSRI). The group of eligible participants was restricted to Duke students who were at least 18 years old (the age range can vary, as it includes students from undergraduate, graduate, and professional school programs). Students were asked not to sign up in groups, so that group behavior would not be distorted by playing the game against acquaintances. At most three experimental sessions were conducted per week, due to space restrictions and to ensure that enough students signed up for each allotted timeslot. Given that four participants were necessary for a session, up to six subjects were recruited at once. If the required number of subjects did not arrive by the appointment time, we waited up to 10 minutes for the remaining group to show up. If the required number of subjects or more showed up by the appointment time, we used a random number generator to select the four participants. The remaining subjects were given the promised show-up fee and sent back. Since the experiment instructions had not been shared at that point, they were allowed to sign up for the experiment once more. The participants were given the Informed Consent Forms, and upon filling the forms out, they were escorted to individual rooms in the back of the IBRC lab. There, I read the experiment script corresponding to the treatment they were assigned to. Afterwards, participants were given 5 minutes to absorb the information and ask me any questions they had. The experiment began after everyone verified that they understood the instructions and the doors were closed.

Figure 4.1: Timeline of the progression of a single round, showing each stage within the round.

4.2.4 Timeline and Details of the Experiment

In our experiment, four agents are assigned to a group. The group is assigned to one of the four treatments (see Table 4.1), and a subject plays in the global game of that group (involving all four subjects) and in three local games, one with each of the other subjects in the group. Subjects are informed of all the game play details corresponding to their assigned treatment at the beginning of the experiment; in particular, there is no deception. Ten groups of subjects were recruited per treatment, resulting in data from 160 distinct subjects overall. Each group of subjects plays 10 rounds, where a round consists of two action stages (see Figure 4.1). In the first stage, subjects play the global game. In the second stage, subjects play in each of their local games. After these two stages, the other players’ contributions to the common pot are revealed for all (and only) the games in which a subject participates, along with the agent’s round payoff and accumulated payoff, and the round concludes. This is slightly different from what is typically done in the literature (Yamagishi (1986); Ostrom et al. (1992); Fehr and Gächter (2000, 2002); Masclet et al. (2003); Gächter et al. (2008); Carpenter (2007)), where an explicit sanctioning stage occurs after the revelation of the round’s contribution results. As a result, in our setup an agent can only use a local game to sanction another player for behavior in the global game with a lag of one round.

4.2.5 Payoff Calculations

For each game within a round, a participant is given an endowment, any part of which she can contribute to the common pot. For the global game this endowment is 30 per round, and for each local game it is 10 per round. The amount of each endowment that a subject does not contribute to the common pot is immediately added to her overall token count, corresponding to the payment she will receive at the end of the game. The total amount contributed to the common pot of a game is multiplied by 1.6, and subsequently split evenly among the participants in that game. That is, at a given round $t$, when $Z$ is the set of agents within a particular game and $c^Z_{it}$ is the round-$t$ contribution made to the common pot of that game by agent $i \in Z$, each participant in that game will receive the following number of tokens from the pot:

$$\frac{1.6 \sum_{i \in Z} c^Z_{it}}{|Z|}$$

Note that tokens earned in one round cannot be used in later rounds, just as the endowment for one game cannot be used for another game. Hence, tokens earned in previous rounds cannot be used for sanctioning, unlike in many other studies (Yamagishi (1986); Ostrom et al. (1992); Fehr and Gächter (2000, 2002); Masclet et al. (2003); Gächter et al. (2008); Sefton et al. (2007); Carpenter (2007)).

At the end of the entire experiment, letting $c^Z_i = \sum_t c^Z_{it}$, the numbers of tokens accumulated by agent $j$ in the global game $G$ and a given local game $L$, respectively, are:

$$w^G_j = (300 - c^G_j) + \frac{1.6 \sum_i c^G_i}{4} \quad (4.1)$$

$$w^L_j = (100 - c^L_j) + \frac{1.6 \sum_i c^L_i}{2} \quad (4.2)$$

From these equations, we can see that, if the contributions of other subjects are held fixed, contributing is not in a subject’s best interest (from the perspective of maximizing the subject’s own wealth only): she receives back only 0.4 of each token contributed in the global game and 0.8 of each token contributed in a local game. Of course, the subject may be altruistic, or may reason that contributing now is a good way to encourage others to contribute more in the future. At the end of all 10 rounds, the tokens subjects have accumulated throughout the rounds, from both global and local games, are summed and converted to US dollar amounts. In the global game, the theoretical minimum an agent can receive is 120 tokens, while the maximum is 660. Similarly, in a single local game the minimum is 80 and the maximum is 180. We converted tokens to US dollars and added a constant amount, in such a way that the minimum total payment to a subject was $10 and the maximum was $25.
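As a sanity check on these payoff rules, the token accounting can be sketched in a few lines of Python. This is an illustrative reconstruction, not the experiment software; the function name and contribution profiles are hypothetical, while the endowments, multiplier, and group sizes come from the design above:

```python
# Illustrative sketch of the token accounting in Eqs. (4.1) and (4.2).
# Endowments (30 global / 10 local per round), the 1.6 multiplier, and
# group sizes (4 global / 2 local) follow the design described above;
# the contribution profiles below are hypothetical.

def game_payoff(endowment_per_round, contributions, agent):
    """Tokens an agent accumulates in one game over all rounds.

    contributions: dict mapping agent id -> list of per-round
    contributions to this game's common pot.
    """
    rounds = len(contributions[agent])
    kept = endowment_per_round * rounds - sum(contributions[agent])
    pot = sum(sum(c) for c in contributions.values())
    return kept + 1.6 * pot / len(contributions)

# Global game extremes (4 players, 10 rounds) reproduce the bounds in
# the text: a lone full contributor earns the minimum of 120 tokens,
# while a free rider among full contributors earns the maximum of 660.
full, none = [30] * 10, [0] * 10
print(game_payoff(30, {1: full, 2: none, 3: none, 4: none}, 1))  # minimum, 120
print(game_payoff(30, {1: none, 2: full, 3: full, 4: full}, 1))  # maximum, 660
```

The local-game bounds of 80 and 180 tokens stated above follow in the same way, with an endowment of 10 per round and |Z| = 2.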

4.3 Interesting Patterns in Data

In this section, we discuss distinctive patterns observed in the data.1

OBSERVATION 1: The collected data show a heavily censored contribution pattern, with an especially large cluster of maximal contributions in the non-matched non-framed treatment.

Censored observations refer to contributions made at the minimal or maximal level of possible contributions. In our particular setting, each agent is restricted by her own per-round endowment within each game in the amount she can contribute, for both global and local games. That is, in the global public good game, an agent can make a contribution between 0 and 30, while in each local public good game, an agent can make a contribution between 0 and 10. Censored observations thus refer to contributions of 0 and 30 for global games, and 0 and 10 for local games. In the collected data, we observe many censored observations (see Figure A.8 and Figure A.9 in Appendix A), especially with the number of censored observations

1 For a graphical representation of the full data, please refer to Appendix A.

taking up more than half of the observations in most rounds within the non-matched non-framed treatment. Even in the treatment where censoring is scarcest, it does not seem to be negligible, reaching almost 1/4 of the total observations (again, see Figures A.8 and A.9).

Accounting for censoring is important because it is ambiguous whether a censored observation comes from an agent wanting to contribute beyond (below) the maximal (minimal) contribution level but being restricted in her ability to do so, or from an agent wanting to contribute exactly the maximal (minimal) level and doing exactly that. For instance, when maximal contribution is sustained, as in the non-matched non-framed treatment, it is possible that, if given higher endowments, the agents would contribute more, though not necessarily to the maximal level.

OBSERVATION 2: Higher, more cooperative contribution patterns in the non-framed treatments.

Looking at the overall global game contribution levels (see Figures A.1 and A.6), we can hypothesize that the non-framed treatments tend to have a higher range of contribution levels than the framed treatments. Later results from clustering analyses, along with simple average patterns, verify this hypothesis: the framed cases form their initial average contribution levels between 10 and 15 and trend lower over time, while the non-framed cases form them between 15 and 20 and display a hump-shaped temporal pattern.

This observed pattern is unexpected from the perspective of the literature on public good games with sanctioning. That literature would presume that when a sanctioning medium is provided and sanctioning is identifiably possible, namely under the matched framed condition, contribution levels would be higher and also sustained at a higher level.
From Nikiforakis (2008), we can hypothesize that this difference potentially stems from the mutual nature of the

local game. Since the local game is a bilateral public good game, agents can easily be counter-punished for any punishment they impose, which could diminish the magnitude of the threat of sanctioning. This pattern is studied further in the next section.

OBSERVATION 3: A single cluster for the matched treatments and a division into two clusters for the non-matched treatments, in the global game contributions.

Looking at the overall group-level global game contributions (see Figure A.6), we can observe a potential effect of the matching variable. The contribution patterns from each group of the matched treatments exhibit relatively tightly clustered time trends, while the contribution patterns from the non-matched treatment groups show a more divergent pattern. In fact, we can visually identify two potential time trends in divergent directions. This pattern leads us to combine clustering analysis with statistical methods, to uncover the secondary effect of the matching variable.

OBSERVATION 4: Convergence in local game contribution patterns.

Looking especially at the overall local game contribution levels (see Figure A.1 for the global pattern and Figures A.2, A.3, A.4, and A.5 for the local patterns), we can see a prospect for convergent behaviors in the framed cases. Whereas the non-framed treatments show many behavioral patterns that look less correlated with other agents’ previous-round behaviors, many of the framed treatment pairs seem to exhibit a certain tagging behavior, responding as if to compensate for the previous round’s contribution differentials. The non-matched non-framed treatment is potentially a slight outlier, with many agents possibly non-responsively placing the maximal contribution.

Unfortunately, statistical analyses are not well suited to determining whether there is convergence in the contribution patterns. For a statistically significant measure of convergence, there need to be enough divergent behavior patterns, which would in

turn leave out the more convergent cases from the set of “statistically significantly” convergent cases.

OBSERVATION 5: Mixed contributions in the last two rounds of the game.

Through principles such as the unraveling of cooperation, game theory predicts that finitely repeated games have trouble sustaining cooperation. In the last round, agents realize they will not face any consequences if they act non-cooperatively. Because agents are aware that others will act non-cooperatively in the last round regardless of their own actions, non-cooperative behavior in the second-to-last round cannot yield any consequences, either. Continuing in this fashion, cooperation theoretically unravels completely.

In experiments on repeated games, cooperation generally does not unravel completely. Agents are aware of the benefits of sustaining cooperation, and they tend not to defect in the initial round unless they expect such behavior from the opponent (Palacios-Huerta and Volij (2009)). However, agents are still aware that there is no threat of sanctioning in the last round, where defecting is a strictly dominant strategy. In our experiment, we see mixed contributions in the last two rounds of the game. Even when a cooperative time trend is observed, there are agents who (as theory predicts) drop cooperation in the later rounds, especially the last two. Yet some agents sustain the cooperative pattern through the end of the game, when they are in a cooperative group. This mixed result suggests that some statistical analyses aimed at understanding the effects of specific treatments on behavior may benefit from excluding the last one or two rounds’ contributions.

4.4 Experimental Results

In this section, we focus on OBSERVATION 2 and OBSERVATION 3 from the observed patterns and examine the data from the viewpoints of statistics, machine learning, and welfare economics. Through statistical methods, we identify the primary effect of the variable “framing” and observe that it dominates the effect of the variable “matching”. Nonetheless, we are able to identify the secondary effect of the variable “matching” by combining machine learning and statistical methods. Finally, machine learning and welfare economics tools allow us to make a connection to the frame selection literature. Together, these analyses paint a coherent picture of the importance of framing for establishing a cooperative mindset.

4.4.1 Primary Effect: Framing Matters

For each of the four treatments (again, see Table 4.1), we have 10 groups participating in the study, made up of a total of 40 individuals, each with 10 rounds of contribution decision data in a global public good game (40 time trends) and three local public good games (120 time trends). We have performed both individual- and group-level analyses on these data, using parametric and non-parametric methods, and the results across these tests are consistent. Here, we only present results from two non-parametric tests whose independence assumptions are easy to justify: the Mann-Whitney U (Bauer (1972); Hollander and Wolfe (1999); R Core Team (2017)) and multivariate Cramér-von Mises (Baringhaus and Franz (2004); Franz (2014)) tests.2

We first review the Mann-Whitney U test and how we apply it to our data. The test considers numerical data drawn independently from two different distributions. (No assumption about the shape of the distributions is needed.) The goal of the test is to determine whether numbers drawn from one distribution tend to be larger than those drawn from the other. From our data, we let every group of subjects correspond to just a single number, namely the average of total

2 The results we obtain are consistent across many parametric and non-parametric analyses, even when using a composite time trend looking at global and local game trends together. For ease of presentation, we only share graphs that represent global and local games separately.

contributions made over the 10 rounds. (Here, we can consider either the total contributions made in the global game, or the total contributions made in all the local games combined.) This gives us ten data points per treatment, which can reasonably be held to be independent, as they were obtained from disjoint groups of subjects.

For the multivariate Cramér-von Mises test, we take each of the treatments as a

population represented by a random variable $P_{xy}$ with cdf $F_{P_{xy}}(p_{xy})$, where $p_{xy}$ is a 10-dimensional vector with the average contribution in each round within a group as a component. (Here, $x$ indicates whether the treatment is matched or not, and $y$ whether it is framed or not.) Unlike some other non-parametric tests, the multivariate Cramér-von Mises test can accommodate dependence in the contribution levels across rounds, because the multivariate random variable structure allows for dependence among its component random variables. For the test, we make bilateral comparisons between the four treatments, and evaluate whether we can reject that the two $P_{xy}$ are identical multivariate distributions. The critical values and p-values are calculated through permutation Monte-Carlo bootstrapping (Franz (2014); Baringhaus and Franz (2004)).

Results for both tests, for both global and local contributions, are in Table 4.2. The Mann-Whitney U test results show the relative differences in the total contribution averages. Note that to directly compare the numbers for the global game and the local games, an adjustment for the difference in endowments would be necessary. The Mann-Whitney U test allows us to confidently reject homogeneity when and only when there is a difference in the framing variable, with lower contributions when there is framing. In the multivariate Cramér-von Mises test comparisons, we also see that in both global games and local games, we can consistently reject homogeneity between framed and non-framed treatments. The only exception is one of the two cross-type bilateral comparisons, between the matched non-framed (M+nF)

and non-matched framed (nM+F) treatments in local games, where homogeneity cannot be rejected by this test. From the tests together, we can conclude that the framed and non-framed cases are statistically significantly different, and specifically that the framed cases have lower contribution levels than the non-framed cases.

Table 4.2: Comparison of treatments and test results of whether the homogeneity hypothesis can be rejected. MW is for the Mann-Whitney U test and CM for the multivariate Cramér-von Mises test. The two test results are complementary: between any two treatments, CM fully utilizes the data structure but tests for homogeneity only, whereas MW quantifies the relative difference of the 10-round total of average contributions. The possible range of numbers in MW global is 0 to 300, while it is 0 to 100 in MW local.

Treatments        CM global   CM local   MW global     MW local
M+F vs M+nF       *** ≠       *** ≠      *** -87.00    *** -20.91
M+F vs nM+F       =           =          (-16.37)      (-9.66)
M+F vs nM+nF      *** ≠       *** ≠      *** -119.00   *** -35.66
M+nF vs nM+F      ** ≠        =          * +70.87      * +13.95
M+nF vs nM+nF     =           =          (-28.12)      (-13.87)
nM+F vs nM+nF     ** ≠        ** ≠       ** -82.87     ** -26.54

Significance marked as 1% = ***, 5% = **, and 10% = *. For MW cases where the homogeneity hypothesis was not rejected, the estimates are enclosed in parentheses.
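To make the mechanics of the first test concrete, the Mann-Whitney U statistic can be computed in a few lines. This is an illustrative pure-Python sketch only (the actual analysis used the R implementations cited above), and the two samples of group totals below are hypothetical:

```python
# Rank-based Mann-Whitney U statistic, as used to compare treatments on
# per-group 10-round total contributions. The samples are hypothetical
# stand-ins for two treatments (10 group totals each).

def mann_whitney_u(xs, ys):
    """U statistic for sample xs against sample ys.

    Counts, over all pairs (x, y), the times x exceeds y,
    with ties counted as one half.
    """
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

framed = [80, 95, 100, 110, 120, 125, 130, 140, 150, 155]       # hypothetical
nonframed = [150, 160, 170, 180, 190, 200, 210, 220, 230, 240]  # hypothetical

# Under homogeneity U would be near len(xs) * len(ys) / 2 = 50; a value
# near 0 (or 100) indicates the first sample is systematically lower (higher).
print(mann_whitney_u(framed, nonframed))  # 1.5: first sample much lower
```

A significance level is then obtained from the distribution of U under the null, which in practice (as in the R routine used) is either tabulated exactly or approximated.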

4.4.2 Machine Learning: Clustering Analysis and the Secondary Effect of Matching

To gain a better understanding of the temporal patterns of contribution for each treatment, we performed a k-means clustering analysis on the game-level contribution series. k-means clustering is a common machine learning technique where, for a given number k, the algorithm partitions data points into k clusters of similar characteristics (R Core Team (2017); Hartigan and Wong (1979)). We take a two-step approach, first identifying the optimal number of clusters (for both the global and local game cases) using a version of the partitioning around medoids (PAM) method (Hennig (2015); Hennig and Liao (2013)), and then applying k-means clustering.
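The k-means step can be sketched as follows. This is a minimal pure-Python illustration with hypothetical 10-round trends; the actual analysis used the R routines cited above, and the PAM-based selection of k is not shown:

```python
# Minimal k-means (k = 2) on 10-dimensional average-contribution time
# trends, mirroring the clustering step described in the text. The input
# trends below are hypothetical.

def kmeans(points, k=2, iters=50):
    # Initialize centers with the first k points (a simple common choice).
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its cluster.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(d) / len(members) for d in zip(*members)]
    return assign, centers

# Hypothetical trends: two high, hump-shaped ("cooperative") and two low,
# declining ("uncooperative") global-game trends.
trends = [
    [18, 20, 22, 24, 25, 25, 24, 22, 15, 10],
    [16, 19, 21, 23, 24, 24, 23, 20, 14, 9],
    [12, 10, 9, 8, 7, 6, 5, 4, 3, 2],
    [13, 11, 10, 8, 7, 6, 5, 4, 2, 1],
]
assign, _ = kmeans(trends)
print(assign)  # [0, 0, 1, 1]: high trends share one label, low trends the other
```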

Table 4.3: Numbers of time trends assigned to cluster 1 (C1) and cluster 2 (C2) based on k-means clustering with k = 2. For both global and local games, the higher time trend cluster is C1, or the “cooperative” group, while the lower time trend cluster is C2, or the “uncooperative” group (see Figure 4.2). There are 10 global game time trends per treatment and 60 local game time trends per treatment provided as input.

Treatments   C1 global   C2 global   C1 local   C2 local
M+F          0           10          9          51
M+nF         6           4           34         26
nM+F         4           6           22         38
nM+nF        6           4           39         21

For us, a data point corresponds to a 10-dimensional within-game average contribution time trend. Thus, from each group of four subjects in a treatment, we obtain one global-game time trend and six local-game time trends. The average contributions are on the same scale across treatments and groups within games (ranging from 0 to 30 for all global games and 0 to 10 for all local games), so normalization is not necessary. PAM identifies the optimal number of clusters, k, as 2 for both global and local games. Through k-means clustering, we see that the two clusters identified through this method (see Figure 4.2) are naturally labeled as the cooperative and the uncooperative time trends. For both global and local games, there is a consistent pattern with respect to the treatment types. The time trends from the matched framed (M+F) treatment are dominantly assigned to the uncooperative cluster. Further, even though the time trends for the other treatments come out mixed, the framed treatments have more time trends assigned to the uncooperative cluster than to the cooperative cluster, while the non-framed treatments have more assigned to the cooperative cluster (see Table 4.3). This again reinforces that framing, by suggesting the possibility of using local games as sanctioning devices for uncooperative behavior (as was done in our experiment), leads to uncooperative interactions.

Through statistical methods alone, we are able to identify the effect of the framing

Figure 4.2: Each panel in the row of graphs shows the average global game contribution patterns observed within the treatment. The thick red and blue lines show the two clusters identified through the k-means clustering method, when the combined time trend data (irrespective of treatment conditions) were given as input. Table 4.3 shows how many time trends within each treatment were classified to each group. Local game patterns show similar trends, and are omitted for space.

variable, but not the matching variable, primarily because the framing variable has a dominating effect. Based on the results of the clustering analysis, however, we perform a secondary statistical analysis of the matching variable, and successfully identify a weaker, yet present, secondary effect, particularly in the framed treatments.3

We make two comparisons: first, a comparison of the C1 time trends from the non-matched framed (nM+F) treatment against the C2 time trends from the matched framed (M+F) treatment; and second, a comparison of the C1 and C2 time trends, both from the non-matched framed (nM+F) treatment. In both cases, we find that the time trends belonging to the non-matched framed (nM+F) treatment’s C1, the cooperative cluster, exhibit a statistically significant difference (at the 1% level) from the framed treatments’ time trends assigned to C2, the uncooperative cluster.

4.4.3 Welfare Economics: Lorenz Curves and Gini Coefficients

We now turn to an analysis of how the contributions are distributed among the subjects.4 Specifically, we inspect the Lorenz curves and Gini coefficients (Gastwirth (1972); Lorenz (1905); Zeileis (2014)). The Lorenz curve of the total contributions in each case is obtained by plotting the cumulative share of subjects against the cumulative share of total contributions. That is, we sort subjects by the amount they contributed, and show the fraction of the total contribution coming from the lowest x% of the population, for every x. When there is total equality in the contribution levels, we obtain the 45-degree line. The higher the inequality in the contribution

3 Within our data set, time trends for the framed treatments form tighter patterns. This lets us test the effect of the matching variable despite the small number of observations we have. We hypothesize that with more observations, we would be able to parse out a statistically significant effect of the matching variable under the non-framed treatments as well.
4 We perform a similar analysis of the subjects’ final wealth in the experiment as well, which is included in the appendix.

levels, the further the curve bows out from the 45-degree line. The Gini coefficient quantifies this inequality as twice the area between the 45-degree line and the Lorenz curve.

In Figure 4.3, we see that for the global games, the matched non-framed (M+nF) treatment has the most equitable distribution of contributions, followed by the non-matched non-framed (nM+nF), matched framed (M+F), and then the non-matched framed (nM+F) treatments. In the local games, the non-matched non-framed (nM+nF) and the matched non-framed (M+nF) treatments have more equitable distributions than the non-matched framed (nM+F) and matched framed (M+F) treatments. Gini coefficients and confidence intervals calculated via bootstrapping quantify the differences in the level of inequality (Fox and Weisberg (2011); Davison and Hinkley (1997)). The non-matched non-framed (nM+nF) and the matched non-framed (M+nF) treatments have similar, lower Gini coefficients, while the non-matched framed (nM+F) treatment has a distinctively higher Gini coefficient. The matched framed (M+F) treatment has a value closer to the two non-framed treatments in the global game, and a value similar to the non-matched framed (nM+F) treatment in the local game (see Figure 4.3). Overall, this suggests that people are in a more equitable mindset without framing.
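The Lorenz-curve construction and the Gini coefficient described above can be sketched as follows (pure Python; the contribution vectors are hypothetical, and the actual analysis used the R packages cited above):

```python
# Gini coefficient via the Lorenz curve: sort contributions, accumulate
# shares, and take twice the area between the 45-degree line and the
# Lorenz curve (trapezoidal rule over n equal-width population steps).

def gini(values):
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    cum = 0.0
    area = 0.0        # area under the Lorenz curve, starting from (0, 0)
    prev_share = 0.0
    for x in xs:
        cum += x
        share = cum / total
        area += (prev_share + share) / (2 * n)   # trapezoid of width 1/n
        prev_share = share
    # Area between the 45-degree line (area 1/2) and the curve, times two.
    return 1.0 - 2.0 * area

print(gini([10, 10, 10, 10]))  # 0.0: perfect equality among four contributors
print(gini([0, 0, 0, 40]))     # 0.75: one contributor holds everything
```

Bootstrapped confidence intervals like those in Figure 4.3 would then be obtained by resampling the contribution vector with replacement and recomputing this statistic.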

4.5 Discussion

4.5.1 Cooperative vs. Uncooperative Frame of Minds

We observed significantly lower contributions when the possibility of using pairwise relationships for sanctioning was pointed out. This is in spite of the fact that in previous studies, the addition of explicit sanctioning stages led to higher contributions. In contrast, obscuring the correspondence between identities in the global and local games, thereby effectively removing the option of targeted sanctioning, did not make a significant difference. Our other analyses, drawing upon machine learning

Figure 4.3: The first two graphs show Lorenz curves for the global and local games, respectively. The latter two graphs show Gini coefficients for each treatment calculated from the Lorenz curves, along with 95% confidence intervals, for the global and local games respectively. Overlap in the confidence intervals implies the corresponding treatments may not be distinguishable in their inequality levels.

and concepts from the economics of inequality, corroborate that framing led to a less cooperative mindset in the experiment.

In the clustering analyses, we see that time trends from the framed treatments cluster dominantly into the “uncooperative” (C2) group, while the time trends from the non-framed treatments divide between the “cooperative” (C1) group and the “uncooperative” (C2) group. This relates to the frame selection literature (Eriksson and Strimling (2014); Binmore (2006)). Under the framed treatments, that is, when people are reminded of the possibility of punishment and, further, the possibility of uncooperative behavior, agents are pushed into the more strategic, uncooperative frame of mind in interpreting the given game, and act accordingly. On the other hand, under the non-framed treatments, where agents are only exposed to a neutral description of the given game, they draw on their social norms to interpret whether the given game should be played with a cooperative or an uncooperative frame of mind. Through the welfare economics analyses, we can see that the non-framed treatments, with more cooperative, higher contributions, also exhibit more equitable contribution patterns, suggesting that sustaining a higher cooperative contribution pattern was aided by agents within a group maintaining similarly high contribution levels together.

This frame selection, and the resulting cooperative, higher contributions observed in our experiment, leads us to think about how a new game should be presented to participants in order to promote cooperation with real-life agents. Game-theoretically, framing, or how the situation is presented, makes no real difference. Framing, however, can impact whether cooperation arises from agents being in a cooperative frame of mind, or purely from agents expecting sufficiently severe negative consequences for their uncooperative behaviors, as illustrated in Figure 4.4.
If we can trigger the cooperative frame of mind, it would be a powerful tool, given that we would observe cooperation happening even when we do not expect it to happen.5

Figure 4.4: Decision tree illustrating how framing can impact the level of cooperative actions of an agent facing a new game.

4.5.2 Potential Theoretical Extensions and Future Questions

Our experimental findings, on the dominant effect of framing and on the potential for high cooperation through keeping agents in a cooperative frame of mind, suggest two directions for future research. The first direction is building a theoretical model that accommodates our observed results, which could then be used in designing systems that are in line with

5 An accessible example of the former case, cooperation happening when we do not expect it to happen, can be found in the story of a primary school that faced a late-pickup problem (Gneezy and Rustichini (2000b,a)). Some parents picked up their children late, which put extra strain on the teachers. To solve the problem, the school introduced a fine for parents who picked up their children late. Contrary to expectations, the fine increased the number of late pickups. Hoping to undo the damage, the school removed the fine; the late-pickup rate, however, did not go back down. The original “cooperative” behavior of picking children up on time stemmed from parents being in a cooperative frame of mind and acting on intrinsic motivation. The fine changed the nature of the relationship into one driven by extrinsic motivation, in which parents made a strategic decision about when to pick up their children given the price of the “late-hour child care” service. Removing the fine did not switch the relationship back, which in turn led the school to re-introduce the fine, at a prohibitively high price for late pickups.

promoting potentially cooperative behaviors. Concepts from contract theory, and a better understanding of the effect of uncertainty, may shed light on this venture. Local games, unlike the sanctioning stages in the previous literature on sanctioning in public good games, have value as independent games, and uncertainty about the intention behind a given action (what range of contributions counts as cooperative, and whether a low local contribution stems from uncooperative intent or from an intent to punish the other player) could lead to confusion and misunderstanding that ultimately undermines cooperation. Work in contract theory (Spier (1992); Malhotra and Murnighan (2002)) studies how the act of offering a contract that regulates agent behavior can signal the degree of trust within the system. Similarly to our framing, which pushed people into the uncooperative frame of mind, such contracts can hinder trust building, which in turn can hinder cooperation.

The second direction concerns a series of follow-up questions. If agents are pushed into the uncooperative frame of mind, are there ways to bring them back to the first question node (see Figure 4.4), or even to send them to the cooperative frame of mind? Are there other ways of framing the possibility of sanctioning through bilateral relationships that would push people toward the cooperative frame of mind and encourage cooperation, rather than toward the uncooperative frame of mind with the deleterious effect we observed? Alternatively, is it the case that the beneficial effects of a dedicated sanctioning mechanism, as observed in previous research, cannot be replicated by sanctioning through bilateral relationships that have value in their own right? Answers to these questions would be of major importance to the design of future agreements on climate change, as well as on many other issues.

5

Concluding Remarks

In this dissertation, we have looked at how to identify and promote cooperation in a multiagent system, both theoretically, through tools in computational game theory, and empirically, by analyzing a human subject experiment with tools from statistics, machine learning, and welfare economics. Oftentimes, we observe cooperation happening when we do not expect it to (for instance, charitable giving or open-source software) and cooperation failing to happen when we do expect it. Developing an empirical understanding of cooperation suggests ways to complement our theoretical understanding of it. Experimental research is a good starting point for the venture of connecting theory to data, because the data are collected in a more controlled, interpretable environment. Improving our theoretical models with an understanding of observed behaviors would allow us to predict and promote cooperative behavior in the future.
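As one concrete instance of the welfare-economics tooling mentioned above, inequality in a group's contributions can be summarized by the Gini coefficient (0 for perfectly equal contributions, approaching 1 as contributions concentrate in one player). The sketch below is illustrative and is not the code used in the dissertation's analyses; the example contribution vectors are made up.

```python
# Illustrative sketch (not the dissertation's actual code): the Gini
# coefficient of one round's contributions within a group.
def gini(values):
    """Gini coefficient via the sorted-rank formula; 0 = perfect equality."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0  # everyone contributed nothing: treat as perfectly equal
    # G = 2 * sum_i(i * x_(i)) / (n * sum(x)) - (n + 1) / n, with 1-based rank i
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

print(gini([10, 10, 10, 10]))  # equal contributions -> 0.0
print(gini([0, 0, 0, 30]))     # one contributor -> 0.75, the maximum for n = 4
```

A higher Gini value for a group flags rounds in which a few players carry the contribution burden, which is how the equity of contribution patterns can be compared across treatments.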

Appendix A

Figures for Framing Matters: Sanctioning in Public Good Games with Parallel Bilateral Relationships

[Plot: global public good game, contributions of individual players; one panel per treatment (M+F, M+non-F, non-M+F, non-M+non-F) and global game pair (1–10); y-axis: Contribution (0–30); x-axis: Round (1–10); legend: players 1–4.]

Figure A.1: Individual-level global game contribution time trends, separated by the treatments and groups.

[Plot: local public good game under matching and framing, contributions of individual players; one panel per local game pair (panel header: leading digits identify the global game pair, last digit the local game pair); y-axis: Contribution (0–10); x-axis: Round (1–10); legend: players 1–4.]

Figure A.2: Individual-level local game contribution time trends, for the matched framed (M+F) treatment, by groups.

[Plot: local public good game under matching and non-framing, contributions of individual players; one panel per local game pair (panel header: leading digits identify the global game pair, last digit the local game pair); y-axis: Contribution (0–10); x-axis: Round (1–10); legend: players 1–4.]

Figure A.3: Individual-level local game contribution time trends, for the matched non-framed (M+nF) treatment, by groups.

[Plot: local public good game under non-matching and framing, contributions of individual players; one panel per local game pair (panel header: leading digits identify the global game pair, last digit the local game pair); y-axis: Contribution (0–10); x-axis: Round (1–10); legend: players 1–4.]

Figure A.4: Individual-level local game contribution time trends, for the non-matched framed (nM+F) treatment, by groups.

[Plot: local public good game under non-matching and non-framing, contributions of individual players; one panel per local game pair (panel header: leading digits identify the global game pair, last digit the local game pair); y-axis: Contribution (0–10); x-axis: Round (1–10); legend: players 1–4.]

Figure A.5: Individual-level local game contribution time trends, for the non-matched non-framed (nM+nF) treatment, by groups.

[Plot: global public good game, average contributions of each pair over 10 rounds; one panel per treatment (matched/framed, matched/non-framed, non-matched/framed, non-matched/non-framed); y-axis: average contribution (0–30); x-axis: round (1–10); red solid curve: smooth curve fitted by LOESS; dark-green dashed horizontal line: overall mean.]

Figure A.6: Group-level global game contribution time trends, by treatment.

[Plot: local public good game, average contributions of each pair over 10 rounds; one panel per treatment (matched/framed, matched/non-framed, non-matched/framed, non-matched/non-framed); y-axis: average contribution (0–10); x-axis: round (1–10); red solid curve: smooth curve fitted by LOESS; dark-green dashed horizontal line: overall mean.]

Figure A.7: Group-level local game contribution time trends, by treatment.

[Bar charts: censored/uncensored global game contributions at rounds 1–10, one chart per round; x-axis: game type (M+F, M+nonF, nonM+F, nonM+nonF); y-axis: cases; bars per treatment, left to right: censored at 0, uncensored, censored at 30.]

Figure A.8: Number of censored observations for global games, per round, by treatment.

[Bar charts: censored/uncensored local game contributions at rounds 1–10, one chart per round; x-axis: game type (M+F, M+nonF, nonM+F, nonM+nonF); y-axis: cases; bars per treatment, left to right: censored at 0, uncensored, censored at 10.]

Figure A.9: Number of censored observations for local games, per round, by treatment.

[Boxplots: group-level means of global game contribution per round; one panel per treatment (matching/non-matching identity crossed with framing/non-framing); y-axis: 0–30; x-axis: round.]

Figure A.10: Boxplots of the group-level mean global game contribution per round, by treatment.

[Boxplots: group-level standard deviations of global game contribution per round; one panel per treatment (matching/non-matching identity crossed with framing/non-framing); y-axis: 0–30; x-axis: round.]

Figure A.11: Boxplots of the group-level standard deviation of global game contribution per round, by treatment.

[Boxplots: pair-level means of local game contribution per round; one panel per treatment (matching/non-matching identity crossed with framing/non-framing); y-axis: 0–10; x-axis: round.]

Figure A.12: Boxplots of the pair-level mean local game contribution per round, by treatment.

[Boxplots: pair-level contribution differences (dispersion) of local game contribution per round; one panel per treatment (matching/non-matching identity crossed with framing/non-framing); y-axis: 0–10; x-axis: round.]

Figure A.13: Boxplots of the pair-level standard deviation of local game contribution per round, by treatment.

Bibliography

Garrett Andersen and Vincent Conitzer. Fast equilibrium computation for infinitely repeated games. In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, pages 53–59, Bellevue, WA, USA, 2013.

J. Andreoni, W. Harbaugh, and L. Vesterlund. The carrot or the stick: Rewards, punishments, and cooperation. American Economic Review, 93:893–902, 2003.

James Andreoni. Warm-glow versus cold-prickle: the effects of positive and negative framing on cooperation in experiments. The Quarterly Journal of Economics, 110:1–21, 1995.

Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for Hamiltonian circuits and matchings. Journal of Computer and System Sciences, 18(2):155–193, 1979.

Haris Aziz and Rahul Savani. Hedonic games. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A. D. Procaccia, editors, Handbook of Computational Social Choice, chapter 15. Cambridge University Press, 2015.

Yoram Bachrach, Edith Elkind, Reshef Meir, Dmitrii Pasechnik, Michael Zuckerman, Jörg Rothe, and Jeffrey Rosenschein. The cost of stability in coalitional games. In Marios Mavronicolas and Vicky Papadopoulou, editors, Algorithmic Game Theory, chapter 12. Springer, 2009.

L. Baringhaus and C. Franz. On a new multivariate two-sample test. Journal of Multivariate Analysis, 88:190–206, 2004.

David F. Bauer. Constructing confidence sets using rank statistics. Journal of the American Statistical Association, 67:687–690, 1972.

B. Douglas Bernheim and Michael D. Whinston. Multimarket contact and collusive behavior. The RAND Journal of Economics, 21(1):1–26, 1990.

Ken Binmore. Why do people cooperate? Politics, Philosophy & Economics, 5:81–96, 2006.

Peter Bohm. Estimating demand for public goods: an experiment. European Economic Review, 3:111–130, 1972.

Peter Bohm. Revealing demand for an actual public good. Journal of Public Eco- nomics, 24:135–151, 1983.

Eric Bond. Climate change and the Kyoto Protocol, 2003.

Christian Borgs, Jennifer Chayes, Nicole Immorlica, Adam Tauman Kalai, Vahab Mirrokni, and Christos Papadimitriou. The myth of the Folk Theorem. Games and Economic Behavior, 70(1):34–43, 2010.

Jeremy I. Bulow, John D. Geanakoplos, and Paul D. Klemperer. Multimarket oligopoly: Strategic substitutes and complements. Journal of Political Economy, 93(3):488–511, 1985.

Jeffrey P. Carpenter. Punishing free-riders: How group size affects mutual monitoring and the provision of public goods. Games and Economic Behavior, 60:31–51, 2007.

Matthias Cinyabuguma, Talbot Page, and Louis Putterman. Can second-order punishment deter perverse punishment? Experimental Economics, 9:265–279, 2006.

Thomas Cormen, Charles Leiserson, Ronald Rivest, and Clifford Stein. Introduction to Algorithms. MIT Press, second edition, 2001.

Caleb Cox. Decomposing the effects of negative framing in linear public goods games. Economics Letters, 126:63–65, 2015.

Robin P. Cubitt, Michalis Drouvelis, and Simon Gächter. Framing and free riding: emotional responses and punishment in social dilemma games. Experimental Economics, 14:254–272, 2011.

A. C. Davison and D. V. Hinkley. Bootstrap Methods and their Applications. Oxford: Oxford University Press, 1997.

Robyn M. Dawes. Social dilemmas. Annual Review of Psychology, 31:169–193, 1980.

Joyee Deb. Cooperation and community responsibility: A folk theorem for repeated matching games with names. NYU Working Paper Series, 2008.

Martin Dufwenberg, Simon Gächter, and Heike Hennig-Schmidt. The framing of games and the psychology of play. Games and Economic Behavior, 73:459–478, 2011.

Economics focus. Atmospheric pressure. The Economist, 367:63–64, 2003.

Glenn Ellison. Cooperation in the prisoner’s dilemma with anonymous random matching. Review of Economic Studies, 61:567–588, 1994.

Jeffrey C. Ely, Johannes Hörner, and Wojciech Olszewski. Belief-free equilibria in repeated games. Econometrica, 73(2):377–415, 2005.

Kimmo Eriksson and Pontus Strimling. Spontaneous associations and label framing have similar effects in the public goods game. Judgment and Decision Making, 9:360–372, 2014.

Ernst Fehr and Simon Gächter. Cooperation and punishment in public goods experiments. American Economic Review, 90:980–994, 2000.

Ernst Fehr and Simon Gächter. Altruistic punishment in humans. Nature, 415:137–140, 2002.

E. Fehr, U. Fischbacher, and S. Gächter. Strong reciprocity, human cooperation, and the enforcement of social norms. Human Nature - An Interdisciplinary Biosocial Perspective, 13:1–25, 2002.

Henk Folmer and Pierre von Mouche. Linking of repeated games: When does it lead to more cooperation and Pareto improvements? Working paper 60.2007, FEEM Fondazione Eni Enrico Mattei, C.so Magenta, 63, 20123 Milano, Italy, May 2007.

Toke R. Fosgaard, Lars Hansen, and Erik Wengström. Understanding the nature of cooperation variability. Journal of Public Economics, 120:134–143, 2014.

John Fox and Sanford Weisberg. An R Companion to Applied Regression. Sage, Thousand Oaks CA, second edition, 2011.

Jeffrey Frankel. Climate and trade: Links between the Kyoto Protocol and WTO. Environment, 47:8–19, 2005.

Carsten Franz. Multivariate nonparametric Cramer-test for the two-sample-problem, 2014.

Drew Fudenberg and Jean Tirole. Game Theory. MIT Press, October 1991.

Simon Gächter, Elke Renner, and Martin Sefton. The long-run benefits of punishment. Science, 322:1510, 2008.

Joseph L. Gastwirth. The estimation of the Lorenz Curve and Gini Index. The Review of Economics and Statistics, 54:306–316, 1972.

H. Gintis, S. Bowles, R. Boyd, and E. Fehr. Explaining altruistic behavior in humans. Evolution and Human Behavior, 24:153–172, 2003.

H. Gintis. Strong reciprocity and human sociality. Journal of Theoretical Biology, 206:169–179, 2000.

Uri Gneezy and Aldo Rustichini. A fine is a price. Journal of Legal Studies, 29:1–17, 2000.

Uri Gneezy and Aldo Rustichini. Pay enough or don't pay at all. Quarterly Journal of Economics, 115:791–810, 2000.

Donald M. Goldberg. Provisions of the Montreal Protocol affecting trade, 1992.

Davide Grossi and Paolo Turrini. Dependence in games and dependence games. Autonomous Agents and Multi-Agent Systems, 25(2):284–312, 2012.

Kristoffer Arnsfelt Hansen, Thomas Dueholm Hansen, Peter Bro Miltersen, and Troels Bjerre Sørensen. Approximability and parameterized complexity of minmax values. In Proceedings of the Fourth Workshop on Internet and Network Economics (WINE), pages 684–695, Shanghai, China, 2008.

J. A. Hartigan and M. A. Wong. Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C: Applied Statistics, 28:100–108, 1979.

John William Hatfield and Paul R. Milgrom. Matching with contracts. American Economic Review, 95(4):913–935, September 2005.

Christian Hennig and Tim F. Liao. How to find an appropriate clustering for mixed-type variables with application to socio-economic stratification. Journal of the Royal Statistical Society. Series C: Applied Statistics, 62:309–369, 2013.

Christian Hennig. Flexible procedures for clustering, 2015.

Fujimoto Hiroaki and Eun-Soo Park. Framing effects and gender differences in voluntary public goods provision experiments. The Journal of Socio-Economics, 39:455–457, 2010.

Myles Hollander and Douglas A. Wolfe. Nonparametric Statistical Methods. John Wiley & Sons, New York, 2nd edition, 1999. 1st edition 1973.

R. Mark Isaac, James M. Walker, and Arlington W. Williams. Group size and the voluntary provision of public goods: Experimental evidence utilizing large groups. Journal of Public Economics, 54:1–36, 1994.

Matthew O. Jackson and Hugo F. Sonnenschein. Overcoming incentive constraints by linking decisions. Econometrica, 75(1):241–257, 2007.
Albert Xin Jiang, Kevin Leyton-Brown, and Nivan A. R. Bhat. Action-graph games. Games and Economic Behavior, 71(1):141–173, 2011.

Richard E. Just and Sinaia Netanyahu. The importance of structure in linking games. Agricultural Economics, 24(1):87–100, 2000.

Michihiro Kandori. Social norms and community enforcement. Review of Economic Studies, 59:63–80, 1992.

Michael Kearns, Michael Littman, and Satinder Singh. Graphical models for game theory. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pages 253–260, 2001.

Menusch Khadjavi and Andreas Lange. Doing good or doing harm: experimental evidence on giving and taking in public good games. Experimental Economics, 18:432–441, 2015.

Markus Kinateder. Repeated games played in a network. Fondazione Eni Enrico Mattei Working Papers, 2008.

Bettina Klaus, David Manlove, and Francesca Rossi. Matching under preferences. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A. D. Procaccia, editors, Handbook of Computational Social Choice, chapter 14. Cambridge University Press, 2015.

Daphne Koller and Brian Milch. Multi-agent influence diagrams for representing and solving games. Games and Economic Behavior, 45(1):181–221, 2003.

Spyros C. Kontogiannis and Paul G. Spirakis. Equilibrium points in fear of correlated threats. In Proceedings of the Fourth Workshop on Internet and Network Economics (WINE), pages 210–221, Shanghai, China, 2008.

Marie Laclau. A folk theorem for repeated games played on a network. Games and Economic Behavior, 76:711–737, 2012.

Michael L. Littman and Peter Stone. A polynomial-time Nash equilibrium algorithm for repeated games. Decision Support Systems, 39:55–66, 2005.

Max O. Lorenz. Methods of measuring the concentration of wealth. Publications of the American Statistical Association, 9:209–219, 1905.

Deepak Malhotra and J. Keith Murnighan. The effects of contracts on interpersonal trust. Administrative Science Quarterly, 47:534–559, 2002.

Gerald Marwell and Ruth E. Ames. Economists free ride, does anyone else? Journal of Public Economics, 15:295–310, 1981.

David Masclet, Charles Noussair, Steven Tucker, and Marie-Claire Villeval. Monetary and nonmonetary punishment in the voluntary contributions mechanism. American Economic Review, 93:366–380, 2003.

Maximilian Mihm, Russell Toth, and Corey Lang. What goes around comes around: Network structure and network enforcement, 2010. Working Paper.

Catherine Moon and Vincent Conitzer. Maximal cooperation in repeated games on social networks. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI), pages 216–223, Buenos Aires, Argentina, 2015.

Catherine Moon and Vincent Conitzer. Role assignment for game-theoretic coop- eration. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), pages 416–423, New York City, NY, USA, 2016.

Francesco Nava and Michele Piccione. Efficiency in repeated games with local inter- action and uncertain local monitoring. Theoretical Economics, 9:279–312, 2014.

Niko Nikiforakis and Hans-Theo Normann. A comparative statics analysis of pun- ishment in public good experiments. Experimental Economics, 11:358–369, 2008.

Nikos Nikiforakis. Punishment and counter-punishment in public good games: Can we really govern ourselves? Journal of Public Economics, 92:91–112, 2008.

Eugene Nudelman, Jennifer Wortman, Kevin Leyton-Brown, and Yoav Shoham. Run the GAMUT: A comprehensive approach to evaluating game-theoretic algorithms. In Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), pages 880–887, New York, NY, USA, 2004.

John Orbell, Robyn Dawes, and Alphons van de Kragt. The limits of multilateral promising. Ethics, 100:616–627, 1990.

Elinor Ostrom, James Walker, and Roy Gardner. Covenants with and without a sword: Self-governance is possible. The American Political Science Review, 86:404–417, 1992.

Ignacio Palacios-Huerta and Oscar Volij. Field centipedes. American Economic Review, 99:1619–1635, 2009.

Eun-Soo Park. Warm-glow versus cold-prickle: a further experimental study of framing effects on free-riding. Journal of Economic Behavior and Organization, 43:405–421, 2000.

R Core Team. The R stats package, 2017.

M. Rege and K. Telle. The impact of social approval and framing on cooperation in public good situations. Journal of Public Economics, 88:1625–1644, 2004.

Martin Sefton, Robert Shupp, and James M. Walker. The effect of rewards and sanctions in provision of public goods. Economic Inquiry, 45:671–690, 2007.

Kathryn Spier. Incomplete contracts and signaling. RAND Journal of Economics, 23:432–443, 1992.

Satoru Takahashi. Community enforcement when players observe partners' past play. Journal of Economic Theory, 145:42–62, 2010.

United Nations Environment Programme Ozone Secretariat. Handbook for the Montreal Protocol on Substances that Deplete the Ozone Layer, 2017.

United Nations Framework Convention on Climate Change. Kyoto Protocol, 1997.

United Nations Framework Convention on Climate Change. Kyoto Protocol to the United Nations Framework Convention on Climate Change, 1997.

United Nations. International Day for the Preservation of the Ozone Layer, 16 September, 2017.

Alexander Wolitzky. Cooperation with network monitoring. Review of Economic Studies, 80:395–427, 2013.

Toshio Yamagishi. The provision of a sanctioning system as a public good. Journal of Personality and Social Psychology, 51:110–116, 1986.

Achim Zeileis. Measuring inequality, concentration, and poverty, 2014.

Biography

My name is Catherine Moon. I was born on March 22, 1990, in New Brunswick, New Jersey. My family relocated to South Korea when I was five, and I moved back to the United States when I came to Duke University for my undergraduate studies in 2008. After much exploration, I decided to pursue a B.S. in Economics and a B.A. in Computer Science, and I graduated in 2012. I began my graduate education, again at Duke University, in 2012 for a Ph.D. in Economics, and I plan to graduate in May 2018.

During my undergraduate studies, I received the Sara Hall Brandaleone & Bruce Scholarships and the Duke Study-in-China Merit Scholarship. I finished my undergraduate studies with Honors and Highest Distinction in Economics, and my Honors Thesis, Possibility of Cost Offset in Expanding Health Insurance Coverage: Using Medical Expenditure Panel Survey 2008, was selected as one of the 2012 Allen Starling Johnson, Jr. Best Thesis Finalists.

During my Ph.D. studies, I received the Duke Economics Department Fellowship, the Artificial Intelligence Journal Travel Grant, and the Computing Research Association-Women, Coalition to Diversify Computing, Artificial Intelligence Journal Travel Grant. I was selected to participate in the 2017 CRA-W Grad Cohort Workshop and as a Program for Advanced Research in the Social Sciences (PARISS) Fellow.

From this dissertation, Chapters 2 and 3 are published in the Proceedings of the International Joint Conference on Artificial Intelligence (Moon and Conitzer, 2015, 2016).
