Discounting, Dynamic Consistency, and Cooperation in an Infinitely Repeated Game Experiment∗

Jeongbin Kim†

California Institute of Technology‡

May 2020

Abstract

In this paper, we explore the relationship between individuals’ time preferences and cooperation in an infinitely repeated prisoner’s dilemma experiment. On Amazon’s Mechanical Turk, we implement a novel longitudinal design in which subjects play one repeated game over several weeks, one stage game each week. We find that subjects’ measured time preferences are associated with various facets of cooperation in the play of a repeated game. First, consistent with a model of quasi-hyperbolic discounting and its application to repeated games, the degree of present bias is negatively correlated with cooperation. In contrast, there is a weaker association between discount factors and cooperation. Second, subjects with time consistent preferences are less likely to deviate from their plan of action. Third, subjects with time varying preferences are more likely to break cooperative relationships.

JEL codes: C91, D91

Keywords: Repeated game experiment, Longitudinal design, Time preferences, Cooperation

∗I am very grateful to my advisor Pedro Dal Bó for his guidance, financial support, and encouragement. I also appreciate Tom Palfrey’s thoughtful guidance and comments. I thank Geoffroy de Clippel, Louis Putterman, Charlie Sprenger, and Sevgi Yuksel for helpful comments and discussions. This experiment was approved by the IRB at Brown University (#1506001261).

†Email: [email protected]

‡Division of the Humanities and Social Science, Postdoctoral Scholar of .

1 Introduction

Understanding cooperation in human interactions is important in economics and other social sciences. Contributions to the theory of infinitely repeated games show that allowing for repeated interactions can make punishment for opportunistic behavior and rewards for cooperative behavior credible, and cooperation can then be supported as an equilibrium outcome. This result implies that behavior in a repeated game is an intertemporal choice between more immediate benefits from defection and later, but larger, overall benefits from cooperation. That is, time preferences play an important role in determining cooperative behavior in repeated games. However, as an empirical question, whether measured time preferences of human subjects are indeed associated with behavior in a repeated game has yet to be studied.

In this paper, we explore the relationship between time preferences and cooperation in an infinitely repeated prisoner’s dilemma experiment.1 We elicit time preferences of human subjects using the longitudinal experimental design of Halevy (2015) and show that measured time preferences are related to various aspects of cooperative behavior in a repeated prisoner’s dilemma game.2

The most common way to implement infinitely repeated games in an experiment is to use a random termination rule (Roth and Murnighan, 1978; Dal Bó and Fréchette, 2018). Under a constant, predetermined continuation probability that is commonly known to subjects, after each stage game subjects proceed to the next stage game with that probability; otherwise, the repeated game is terminated. Under some plausible assumptions such as risk neutrality, a random termination rule is theoretically equivalent to exponential discounting in the lab. However, the standard framework of random termination has some limitations for exploring the role of time preferences in repeated game experiments, for the following reasons.
First, there is no “time” horizon over which payoffs from each stage game can actually be discounted. Since subjects usually receive all their earnings from stage games at the end of the experiment,

1 Throughout the paper, “time preferences” refers to an individual subject’s patience (discount factor), stationarity, time consistency, and time invariance. Detailed definitions and the identification of these properties are given in the next section.

2 A growing body of literature has studied how we can measure individuals’ time preferences. See Frederick et al. (2002) for a critical review of early attempts to measure time preferences, and Andersen et al. (2008) and Andreoni and Sprenger (2012) for recent developments. Many recent studies find that measured time preferences are related to individual intertemporal decision making. For example, Meier and Sprenger (2010) show that individuals with present bias are more likely to have credit card debt.

payoffs are not separable across stage games, and in consequence, no actual discounting exists. For payoffs to be discounted over time, subjects should receive their payoffs at the end of each stage game. Even if subjects are paid at the end of every stage game in the standard laboratory setting, some problems remain. Time horizons between stage games are too short (probably a few minutes), and there would be no opportunity for payoffs from one stage game to be consumed before subjects receive payoffs from the next stage game.

Second, and relatedly, there is growing evidence from recent experiments that time and risk preferences are different. For instance, Andreoni and Sprenger (2012b) show that subjects perceive risky and delayed rewards differently.3 Because assessing randomness in termination arguably concerns risk preferences rather than time preferences, whether time preferences are related to behavior in repeated game experiments remains an open question.

Third, it is important to note that random termination only admits exponential discounting, i.e., stationary and dynamically consistent preferences. The evolving literature, however, has responded to widespread observations of dynamically inconsistent behavior, leading to some recent theoretical models that apply non-stationary discounting to repeated games. Dynamic inconsistency is clearly beyond the scope of exponential discounting. Taken together, it is essential to introduce “time” into the play of experimental repeated games.

Overcoming the above challenges, the main contribution of this paper is to introduce a novel longitudinal experimental design that allows us to study the role of time preferences in the play of a repeated game. A repeated prisoner’s dilemma (PD) game is played over time: subjects play one stage game and receive the associated payoffs each week. The design also includes an elicitation of subjects’ time preferences.
We conduct our experiment on Amazon’s Mechanical Turk (MTurk). Subjects are recruited and asked to participate in the experiment once a week until the session is finished.4 A repeated game is implemented by using a random termination rule with a continuation probability of 0.75. On top of random termination, we add actual “time” that separates stage games by the weekly time window. We present experimental evidence of how measured time preferences are related to various

3 See Chakraborty et al. (Forthcoming) for a theoretical discussion of the relationship between risk and time preferences.

4 In this paper, a “session” refers to a cohort of a longitudinal experiment. A session includes the elicitation of time preferences (week 0 and week 3) and one repeated game (week 1 and after, if applicable). Five sessions started on 4 different dates, and each subject was allowed to participate in only one session. See Section 2 for the experimental design in detail.

facets of cooperation. We consider patience (discount factor) and three different properties of time preferences à la Halevy (2015): stationarity (present bias), time consistency, and time invariance.

First, consistent with a model of quasi-hyperbolic discounting (e.g., Phelps and Pollak, 1968; Laibson, 1997) and its application to repeated games (Chade et al., 2008), we examine the influence of two parameters of time preferences: β, which measures the extent of present or future bias, and δ, a discount factor. We find that β is positively and significantly correlated with cooperation. Interestingly, a weaker relationship between δ and cooperation is found.

Second, we look at how time consistency and time invariance are related to cooperation. Time consistency requires that the preferred choice does not depend on the time at which decisions are made. To relate time consistency to behavior in a repeated game, we adopt the novel design of Dal Bó and Fréchette (2019) to elicit subjects’ plan of action. When subjects choose their action in week 1, they are also asked to specify their actions in week 2 for all possible contingent histories that could happen in week 1. If a subject in week 2 chooses an action different from the one she specified in week 1, she can be interpreted as exhibiting time inconsistent preferences. We find that subjects with time consistent preferences are significantly less likely to deviate from their plan of action in week 2 than subjects with time inconsistent preferences.

Third, time invariance indicates that subjects’ preferences at different times should be identical. Given that the structure of a subgame of a repeated game is equivalent to that of the original game, a decision to cooperate in week 2 can be similar to a decision to cooperate in week 1 if mutual cooperation occurred in week 1 and beliefs about the opponent’s cooperation are not eroded.
That said, a cooperative decision in week 2 by a subject who observes mutual cooperation in week 1 depends on whether her time preferences in week 2 differ from those in week 1. This reasoning implies that subjects with time invariant preferences are less likely to break cooperation than those who have time varying preferences. Consistent with this prediction, we find that after mutual cooperation in week 1, subjects with time invariant preferences are more likely to cooperate than subjects with time varying preferences.

This paper is related to the literature in different areas. First, this paper builds on recent developments in the theoretical and empirical literature on time preferences. For instance, in regard to measuring time preferences, Andersen et al. (2008) and Andreoni and Sprenger (2012a) provide experimental methods that take into account the curvature of utility. Halevy (2015) introduces three definitions of time preferences based on a longitudinal

experimental design that is closely followed by this paper. Using such methods, previous studies look at the relationships between measured time preferences and the intertemporal behavior of individuals. Meier and Sprenger (2010) find that present bias is correlated with an individual’s credit card debt. Augenblick et al. (2015) and Gine et al. (2019) examine the demand for commitment devices by individuals who exhibit dynamic inconsistency. While the literature exclusively focuses on intertemporal choices in the domain of individual decision making, we try to extend the scope of the literature to strategic interactions over time.

Second, the literature on repeated game experiments is closely related to this paper. The use of a random termination rule has been successfully established in lab experiments. For instance, Dal Bó (2005) and Dal Bó and Fréchette (2011) show that higher continuation probabilities lead to greater cooperation in repeated prisoner’s dilemma experiments.5 Based on the random termination framework, many studies try to establish the relationship between personal characteristics and cooperation in repeated game experiments. In contrast to the findings of this paper, many studies report no robust associations between personal traits and cooperation. Davis et al. (2016) is closely related to this paper in its theme. The authors measure and relate subjects’ discount factors to their behavior in repeated games under the standard framework of random termination. No evidence of robust correlations is found, and this paper discusses why time preferences may not matter in the standard framework. Several papers examine the relationship between risk aversion and cooperation in a repeated game experiment (Sabater-Grande and Georgantzis, 2002; Proto et al., 2019; Davis et al., 2016). Risk aversion is negatively correlated with cooperation only in Sabater-Grande and Georgantzis (2002), in which groups are assigned based on individuals’ risk aversion.
Proto et al. (2019) investigate the effect of intelligence on cooperation. They find that a group of subjects with higher IQ test scores cooperates more than a group of subjects with lower IQ test scores only when the continuation probability is high. Dreber et al. (2014) show that there is no association between giving behavior in a dictator game and cooperation when the set of equilibrium outcomes includes cooperation.

Regarding the experimental design of this paper, Wright (2013) also implements a longitudinal design of an infinitely repeated Bertrand game with students at the National University of Singapore. Subjects in Wright (2013) have two or three days between stage games to think about strategies without time pressure. The main difference is that while subjects in Wright (2013) receive all earnings at the end of the experiment, subjects in this study get stage

5 See Dal Bó and Fréchette (2018) for further discussion of repeated game experiments in the lab.

game payoffs each week. To make time preferences work, it is important that (1) stage game payoffs are paid separately over time and (2) the payment method is credible, so that uncertainty about future payments is minimized. Both conditions are met in this paper.

Third, this paper is also related to the small but growing literature on dynamic games that admits dynamic inconsistency. As mentioned earlier, Chade et al. (2008) is the first paper to incorporate quasi-hyperbolic discounting into repeated games. Obara and Park (2017) study repeated games with a punishment that can be harsher than a reversion to the stage-game Nash equilibrium. They show that present and future bias affect the pattern of the worst punishment strategy. Schweighofer-Kodritsch (2018) introduces dynamic inconsistency into the canonical Rubinstein bargaining model. He shows that different types of deviations from dynamic consistency determine the (non-)uniqueness of equilibrium bargaining outcomes. The longitudinal experimental design of this paper seems applicable to the above models to shed light on the role of time preferences in the play of dynamic games.

2 Experimental Design

2.1 Overall Design

The experiment was conducted on Amazon’s Mechanical Turk (MTurk), an online labor platform provided by Amazon. An increasing number of experiments in economics have been conducted on MTurk.6 The novelty of this experiment is its longitudinal design with time separable payment: the same subjects play a repeated game over several weeks and receive a stage game payoff each week. Once a week, the same subjects were invited to the MTurk page, where they participated in the task (either intertemporal choice, stage game, or both) and were paid for decisions made in that week.7

More specifically, in week 0, subjects were recruited from MTurk and participated in an incentivized task to measure their time preferences. In week 1 and subsequent weeks, the same subjects were invited to the MTurk page through the internal message system in MTurk, and played one stage PD game each week. The invitation message (with a link to the MTurk page) was forwarded by the MTurk message system to the email addresses registered for subjects’ MTurk accounts, so subjects did not need to log in to MTurk to check their messages. The timeline of a session is presented in Figure 1.

6 For instance, see Horton et al. (2011) for a comparison between laboratory experiments and online experiments on MTurk.

7 Upon accepting the task on the MTurk page, subjects were given a link to a survey website provided by Qualtrics. After finishing their task, subjects were asked to return to the MTurk page and to enter a code that was given at the end of the survey. This is one typical way of running surveys or experiments on MTurk, and all the payments were made on MTurk by transferring money from the experimenter’s account to the subjects’ accounts.

Figure 1: Timeline of a session

When the subjects were recruited in week 0, they were told that this experiment would last at least two weeks, but they were not informed of what they would be asked to do later. In week 1, they learned about the rules of the indefinitely repeated PD game, including the payoff matrix and the probability of continuation. To keep the timing of recruitment (or invitation) and payment consistent throughout the experiment, we recruited subjects and sent invitation messages only on Wednesdays, and we clearly announced that subjects who completed the task by 5:00 pm E.T. on Friday would be eligible to be paid before midnight E.T. on Friday of the same week.8

2.2 Measuring Time Preferences

2.2.1 Time Preferences Elicitation

We elicit subjects’ time preferences by using the experimental design proposed by Halevy (2015). This experimental design consists of two different time points at which subjects were asked to make intertemporal choices. In week 0, after being recruited, each subject was asked to make two blocks of decisions, and another block of decisions was given to each subject in week 3. For each decision block, the multiple price list (MPL) method was used: each list contains ten choices between two options, a sooner payment and a later payment that was delayed by one week. Sooner payments always paid subjects $0.50, and later payments

8In MTurk, workers usually do not exactly know when they will be paid by the requester to whom they submitted their work. Rather, requesters are asked to set a deadline by which they have to pay their workers and this deadline is known to workers before they decide to accept the task or not. The maximum possible duration of this deadline is 30 days. In this experiment, each week’s experiment started on Wednesday and subjects were clearly informed that they would be paid on Friday of the same week to avoid any kind of uncertainty from the timing of payments being made.

ranged from $0.50 to $0.68 (with $0.02 increments) and were arranged from top to bottom in an increasing order.9 In block 1 of decisions in week 0, each subject was asked to choose between sooner payments in week 0 (i.e., the same day) and later payments in week 1 (i.e., one week later). Block 2 of decisions in week 0 required subjects to choose between sooner payments in week 3 and later payments in week 4. In the week 3 elicitation task, for block 3 of decisions, subjects were asked to decide between sooner payments in week 3 (i.e., the same day) and later payments in week 4 (i.e., one week later). Let us denote by x1, x2, and x3 the switching point from a sooner payment to a later payment in block 1, block 2, and block 3, respectively. Figure 2 presents the timeline of the time preference elicitation, and t refers to a week on which corresponding intertemporal choices were made.

Figure 2: Timeline of the time preference elicitation

Note: t indicates the time (week) at which decisions are made.

To pay subjects in an incentive-compatible way, we use the design of the robustness treatment in Halevy (2015). When subjects made their decisions for block 1 and block 2 in week 0, they were told that only one decision for each block would be randomly chosen for their actual payment. In week 3, the decisions for block 3 were framed as a revision of subjects’ decisions for block 2. At this time, no feedback about subjects’ decisions on block 2 was given, and one of these revised decisions was randomly selected for payment.10

9 For more details and web-based instructions, see Appendix.

10 In the main treatment of Halevy (2015), when making decisions in week 0, subjects were informed that they would be asked to make decisions for block 3 in week 3, and that a coin toss would determine whether block 2 or block 3 would be implemented for their actual payment, with equal probability. We decided to use the robustness design to avoid making the decision problems too complicated for our subjects to understand. This choice also aims to avoid misleading subjects about the length of the experiment. When recruited, subjects did not know exactly how long the experiment would continue. Telling them that they would make decisions in week 3 might have led them to believe that the experiment would not finish before week 3, and consequently their behavior in a repeated game might have been influenced by this misperception. See Halevy (2015) for a discussion of incentive compatibility and these two payment designs.

After all the relevant decisions had been made, subjects were notified of the decisions that were randomly selected for their payments.
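The structure of one decision block can be illustrated with a short sketch. This is not the software used in the experiment; the names `SOONER`, `LATER`, `first_switch_row`, and `has_multiple_switches`, and the encoding of choices as "S" (sooner) and "L" (later), are assumptions made only for illustration.

```python
# Illustrative sketch of one multiple price list (MPL) block.
# All names and the "S"/"L" encoding are hypothetical.

SOONER = 0.50  # the sooner payment is always $0.50
# Ten later payments, $0.50 to $0.68 in $0.02 steps, in increasing order.
LATER = [round(0.50 + 0.02 * i, 2) for i in range(10)]


def first_switch_row(choices):
    """Return the index of the first row where the later payment is chosen.

    choices: list of ten strings, "S" for sooner or "L" for later.
    Returns None if the subject never chooses the later payment.
    """
    for row, choice in enumerate(choices):
        if choice == "L":
            return row
    return None


def has_multiple_switches(choices):
    """True if the subject changes option more than once down the list."""
    changes = sum(1 for a, b in zip(choices, choices[1:]) if a != b)
    return changes > 1


# Example: sooner in the first four rows, later afterwards.
example = ["S"] * 4 + ["L"] * 6
row = first_switch_row(example)
print(row, LATER[row])
```

Taking the first switch mirrors how the paper treats the small share of subjects with multiple switching points.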

2.2.2 Identification of Present Bias, Time Consistency, and Time Invariance

First, for stationarity and patience, we focus on β and δ.11 Subjects’ decisions for each block may reveal their intertemporal marginal rate of substitution between two different time points. Inferring accurate estimates of β and δ from these decisions requires more structural assumptions and information on subjects’ utility function and liquidity constraints.12 Given the limited data and that our primary purpose is to investigate how heterogeneity of time preferences is related to subjects’ behavior in an infinitely repeated game, we simply define β and δ from the decisions for blocks 1 and 2 as follows. We assume that subjects have linear utility and that their decisions for each block are narrowly bracketed in the sense that their intertemporal decisions are not affected by conditions outside the laboratory. We also assume that an individual is indifferent between two payments at the last point at which the individual prefers a sooner payment to a later payment.13 Given the switching point in block 2 (x2) between a sooner payment a and a later payment b (i.e., a = δb), δ can be calculated as a/b. From the switching point in block 1 (x1), we have a = βδb, and β can be obtained. If β equals 1, then preferences are stationary.

Second, comparing switching points across blocks enables us to identify time consistency and time invariance. Time consistency requires that the preferred choice does not depend on the time at which decisions are made. In other words, once a decision maker chooses intertemporal payments at t, he has no incentive at a later time t′ to deviate from his ex ante decision at t. This leads us to the following identification: if x3 = x2, then preferences are time consistent.

Time invariance indicates that subjects’ preferences at different times should be identical. If preferences are invariant, the decision maker’s evaluation does not account for a specific

11 In a model of quasi-hyperbolic discounting, a person with β-δ preferences evaluates a stream of payoffs with the sequence of quasi-hyperbolic discount factors 1, βδ, βδ², .... If β equals 1, this model is equivalent to the standard model of exponential discounting. If β < (>) 1, it can capture the notion of present (future) bias. See Laibson (1997).

12 See Andersen et al. (2008) and Andreoni and Sprenger (2012) for recent developments in measuring time preferences. See also Dean and Sautmann (2014) for a discussion of accounting for financial shocks and liquidity constraints.

13 Among 1,345 subjects, only 2.9% have multiple switching points for at least one block. For these subjects, we take the first switching point as their true switching point. Exclusion of these subjects does not affect the results in this paper.

date when the decision is made, but only considers the time delay between the time at which the decision is made and the time at which payments are given. Thus, if x3 = x1, then preferences are time invariant.14
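The identification above can be summarized in a minimal sketch. It assumes linear utility and narrow bracketing as stated in the text, and it takes as inputs the later amounts at which a subject is assumed indifferent in blocks 1 and 2. The function names `implied_parameters` and `classify`, and the example amounts, are hypothetical and not taken from the paper.

```python
# Illustrative sketch of the identification in this subsection.
# Function names and example values are hypothetical.

def implied_parameters(b1, b2, sooner=0.50):
    """Back out (beta, delta) from the indifference points of blocks 1 and 2.

    b1: later amount at which the subject is indifferent in block 1
        (week 0 vs. week 1), so that sooner = beta * delta * b1.
    b2: later amount at which the subject is indifferent in block 2
        (week 3 vs. week 4), so that sooner = delta * b2.
    """
    delta = sooner / b2
    beta = sooner / (delta * b1)  # algebraically equal to b2 / b1
    return beta, delta


def classify(x1, x2, x3):
    """Classify preferences from the switching points of blocks 1, 2, and 3."""
    return {
        "stationary": x1 == x2,       # equivalent to beta == 1 under the definitions above
        "time_consistent": x3 == x2,  # week-3 revision matches the week-0 plan
        "time_invariant": x3 == x1,   # same choice at the same delay, in different weeks
    }


beta, delta = implied_parameters(b1=0.56, b2=0.52)
print(round(beta, 3), round(delta, 3))
print(classify(x1=0.56, x2=0.52, x3=0.56))
```

In the example, the subject requires more compensation for a delay that starts today than for the same delay starting in three weeks, so the implied β is below 1 (present bias), and the week 3 choice replicates the week 0 choice at the same delay (time invariance) while departing from the earlier plan (time inconsistency).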

2.3 Infinitely Repeated Prisoner’s Dilemma

In week 1 and after, subjects participated in one infinitely repeated PD game. They played one stage game and received the associated payoffs each week. The probability of continuation is fixed at 0.75, and the payoff table shown to each subject is presented in Table 1. The same payoff structure was used in Dal Bó and Fréchette (2011), who reported a cooperation rate of 56.8% in the first stage game of the first repeated game. We use this as a benchmark. Throughout the experiment, cooperation and defection are represented as action 1 and action 2, respectively.

Table 1: Stage game payoffs

                 The other’s choice
Your choice      1                2
1                $0.48, $0.48     $0.12, $0.50
2                $0.50, $0.12     $0.25, $0.25
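As a rough check on the incentives implied by Table 1, the sketch below computes the smallest weekly discount factor at which mutual cooperation can be sustained. It rests on two assumptions that are not spelled out at this point in the text: players use grim-trigger strategies, and the effective discount factor is the continuation probability multiplied by the weekly discount factor δ. The function name `critical_weekly_discount` is hypothetical.

```python
# Illustrative sketch under the grim-trigger assumption; names are hypothetical.

def critical_weekly_discount(reward, temptation, punishment, continuation):
    """Smallest weekly discount factor supporting cooperation under grim trigger.

    With effective discount factor d = continuation * delta, cooperation is
    sustainable when
        reward / (1 - d) >= temptation + d * punishment / (1 - d),
    which rearranges to d >= (temptation - reward) / (temptation - punishment).
    """
    d_min = (temptation - reward) / (temptation - punishment)
    return d_min / continuation


# Payoffs from Table 1 and the continuation probability of 0.75.
delta_min = critical_weekly_discount(
    reward=0.48, temptation=0.50, punishment=0.25, continuation=0.75
)
print(round(delta_min, 3))  # 0.107
```

Under these assumptions the result agrees with the benchmark threshold of 0.107 reported in Section 3.1.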

Subjects played with the same counterpart in all weeks if no attrition happened within a pair. At the beginning of the first stage game, subjects were told that they would be re-matched with another person if the partner did not return to the experiment in a future week. The subjects were also told that the other person’s attrition would not affect their eligibility for participation until the session was terminated.15 Since our experiment did not allow for instantaneous feedback about stage games, the timing of feedback about stage game payoffs and the opponent’s attrition was deliberately controlled. In week 1 (i.e., the first stage game), subjects submitted their action without knowing their opponent’s action. In week 2, subjects were given the history of actions chosen in week 1 when choosing an action. In week 3 and in subsequent weeks, subjects also received feedback on whether they were re-matched

14 Halevy (2015) proves that any two of the three properties (stationarity, time invariance, and time consistency) imply the third, and shows that a substantial portion of subjects with time inconsistent preferences have stationary time preferences.

15 The exact wording is: “In the event, which we hope is unlikely, that your counterpart fails to continue the interaction in a future week, we will arrange for you to be able to continue playing with another counterpart or will make other arrangements to minimize any impact on your predicted earnings. Unless we inform you otherwise, you will definitely be playing with the same counterpart at each future stage.”

with another person.16 This implies that at least for the first two stage games (weeks 1 and 2), subjects’ action choices were not directly affected by the opponent’s attrition.17

It is important to note that random termination with time separable payments induces the same incentives as in theoretical models. Under the standard laboratory framework in which subjects receive all earnings at the end of the experiment, random termination is equivalent to exponential discounting only if subjects are risk neutral. Sherstyuk et al. (2013) show that paying subjects for their last stage game can accommodate different attitudes toward risk. As in our experiment, however, if stage game payoffs are paid separately over time, a random termination rule is equivalent to exponential discounting regardless of risk attitudes.

In one out of five sessions that we conducted, we elicited subjects’ plan of action. This session adopted the strategy elicitation method used in Dal Bó and Fréchette (2019). In week 1, after choosing an action for the first stage game, subjects were asked to specify a plan of action by answering four questions regarding their action in week 2, which covered all possible combinations of actions that could be chosen by the subject and the opponent in week 1.18 The plan of action specified in week 1 was not binding for the action choice in week 2. Without receiving feedback about their specified plan of action, subjects were asked to choose their action in the second stage game. As shown later, the elicitation of a plan of action allows us to study whether time inconsistent subjects were more likely to deviate from their plan.
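The comparison between a week 1 plan and the week 2 action can be sketched as follows. The data layout (a dictionary keyed by the pair of week 1 actions) and the name `deviated_from_plan` are assumptions for illustration only; actions follow the paper’s labels, 1 for cooperation and 2 for defection.

```python
# Illustrative sketch; the data layout and names are hypothetical.

def deviated_from_plan(plan, own_week1, other_week1, action_week2):
    """True if the week-2 action differs from the action planned in week 1.

    plan: dict mapping (own week-1 action, other's week-1 action) to the
          planned week-2 action, covering all four possible histories.
    """
    return plan[(own_week1, other_week1)] != action_week2


# Example plan: cooperate in week 2 only after mutual cooperation in week 1.
plan = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 2}

# Mutual cooperation occurred in week 1, but the subject defects in week 2.
print(deviated_from_plan(plan, own_week1=1, other_week1=1, action_week2=2))
```

Only the history that was actually realized in week 1 enters the comparison, so each subject contributes one deviation indicator.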

16 For those who lost their opponent, we tried to re-match subjects whose previous histories were as similar as possible. In case of a re-match, however, subjects did not see the new opponent’s history of actions before they were re-matched.

17 In addition to the difference regarding payments, another main difference between Wright (2013) and this study is how subjects’ attrition is handled. In Wright (2013), if an opponent failed to return to the experiment, the other person was also removed from the experiment with some compensation. Different approaches to attrition embed different kinds of uncertainty. While subjects in Wright (2013) face uncertainty about termination of the experiment through the opponent’s failure to return, subjects in this study face uncertainty about being re-matched with another person who also lost their opponent. This is an unavoidable feature of the longitudinal design in strategic interactions.

18 The exact wording is: “In addition to your choice above, you are asked to specify a plan of action. A plan of action is specified by answering 4 questions: After this week, if the experiment continues for one more week and (1) I last selected 1 and the other selected 1, then I will choose... (2) I last selected 1 and the other selected 2, then I will choose... (3) I last selected 2 and the other selected 1, then I will choose... (4) I last selected 2 and the other selected 2, then I will choose...”

2.4 Mechanisms to Prevent Attrition

Preventing attrition is important because the same subjects are asked to participate repeatedly in the experiment for several weeks. Two mechanisms were used to achieve that goal. First, when recruited in week 0, subjects were promised a significant completion bonus of $3 at the end of the session if they took part in all weeks. This completion bonus was paid on top of the earnings from the experiment. Second, subjects were told that if they did not complete their task for a week, they would not be invited to future tasks, nor would they be eligible for the completion bonus. After week 0, only MTurkers who were invited by our MTurk message were eligible for the experiment. These two mechanisms were clearly announced in week 0 when we recruited subjects.

2.5 Session Information

Five sessions started on four different dates between September and November 2015 (September 30th, October 14th, October 21st, and November 11th). These dates were chosen so that, for the time preference elicitation tasks, a sooner and a later payment were displayed on the same page of the calendar shown to subjects. This was done to avoid a possible bias: if sooner and later payments were in different months of the calendar, subjects might perceive a sooner payment as even closer to the date on which the decision was made. Two sessions started on October 14th. In one of those sessions, subjects were asked to specify a plan of action in week 1.

A total of 1,355 subjects participated in the experiment. MTurk allows employers to set up eligibility conditions for participation, such as country of residence and workers’ reputation regarding their performance in previous tasks, and we restricted the experiment to subjects who reside in the United States. Even with this eligibility condition, we found after the experiment that some subjects had IP addresses outside the United States. In the analysis below, we exclude these subjects, which leaves 1,345 subjects.

Table 2 presents the information for each session. Within each session, all pairs share the same sequence of random numbers for continuation so that all pairs finish their repeated game at the same time.19 With the continuation probability of 0.75, the expected length of a repeated game is 4 weeks, and the realized lengths for repeated games are 5, 4, 2, and 10

19 We created random numbers for continuation of the repeated games ahead of the experiment, so the length of each session was predetermined.

12 weeks. In week 3 of Session 4, subjects were told that the repeated game was finished in week 2 and participated only in the elicitation task (block 3). Table 2: Session information

            Starting date   Subjects   Length: session / repeated game (weeks)   Strategy elicitation
Session 1   9/30/15         270        6 / 5                                     No
Session 2   10/14/15        277        5 / 4                                     No
Session 3   10/14/15        265        5 / 4                                     Yes
Session 4   10/21/15        268        4 / 2*                                    No
Session 5   11/11/15        275        11 / 10                                   No

*Session 4 has two stage games of a repeated game. In week 3, subjects participate only in the elicitation task (block 3).
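The expected length of 4 weeks follows from the geometric distribution implied by a random termination rule. The Python sketch below illustrates this, together with one way a session length could be fixed in advance from pre-drawn random numbers. It assumes one uniform draw per week and that the first stage game is always played; the function names and the seed are illustrative and are not taken from the experiment's software.

```python
import random

def expected_length(continuation_prob):
    """Expected number of stage games when the game continues with a fixed probability."""
    return 1.0 / (1.0 - continuation_prob)

def predetermined_length(draws, continuation_prob):
    """Length implied by pre-drawn numbers: the game stops at the first draw >= continuation_prob."""
    length = 1  # assumption: the first stage game is always played
    for u in draws:
        if u >= continuation_prob:
            break
        length += 1
    return length

rng = random.Random(2015)  # arbitrary seed, used only to make the sketch reproducible
draws = [rng.random() for _ in range(200)]
print(expected_length(0.75))              # 4.0
print(predetermined_length(draws, 0.75))  # one realized length, shared by all pairs in a session
```

Because every pair in a session reads the same sequence of draws, all pairs end in the same week, which is the property the design relies on.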

3 Hypotheses

3.1 Discounting and Cooperation

Theoretical contributions in infinitely repeated games have shown that discount factors are essential in determining whether cooperation can be supported as an equilibrium outcome. Fudenberg and Maskin (1986) show that if players have a sufficiently high discount factor, δ, individually rational payoffs can be supported in a subgame perfect equilibrium. Abreu et al. (1990) show that the set of subgame perfect equilibrium payoffs expands in δ. As a benchmark, under the assumptions of complete information about payoffs and common discount factors, the threshold of the weekly discount factor, δ, above which cooperation can be supported as an equilibrium outcome is 0.107 in this experiment. Lehrer and Pauzner (1999) study a model of repeated games with differential discount factors under complete information. They show that the set of feasible payoffs can be larger than the convex hull of the stage game payoffs, as players can be better off by trading payoffs over time.20 Chade et al. (2008) study infinitely repeated games under quasi-hyperbolic discounting. They show that the intuition that the set of equilibrium payoffs increases in δ and β may not hold in general. However, they prove that for a class of games (including the repeated prisoner’s dilemma) in which the minmax point of the stage game coincides with a Nash equilibrium, with δ (β) fixed, the set of equilibrium outcomes expands as β (δ) increases. Taken together, these results lead to the following hypothesis.

20The environment of this experiment may embed incomplete information about different discount factors, as subjects do not know the discount factor of the person they are paired with. The characterization of the set of equilibrium payoffs under such incomplete information is still an open question.

Hypothesis 1 (Discount factor and Present bias). Subjects with higher β and δ will be more likely to cooperate.
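To illustrate why both β and δ matter for Hypothesis 1, the Python sketch below computes the smallest weekly δ at which a one-shot deviation from a grim trigger strategy is unprofitable under quasi-hyperbolic discounting. It is a simplified illustration and not the characterization in Chade et al. (2008): it assumes both players follow grim trigger in every future week, and the payoffs (temptation 50, reward 48, punishment 25) are hypothetical values, not the experiment's, chosen only so that the β = 1 case lands close to the 0.107 benchmark.

```python
def min_weekly_delta(temptation, reward, punishment, beta=1.0, continuation_prob=0.75):
    """Smallest weekly delta at which a one-shot deviation from grim trigger does not pay.

    Cooperating today is worth reward + beta * d / (1 - d) * reward and defecting is worth
    temptation + beta * d / (1 - d) * punishment, where d = continuation_prob * delta.
    """
    gain = temptation - reward   # immediate gain from defecting
    loss = reward - punishment   # per-week loss once cooperation has broken down
    d_min = gain / (gain + beta * loss)  # solves beta * d / (1 - d) * loss = gain
    return d_min / continuation_prob

# Hypothetical payoffs, NOT the experiment's payoff matrix.
for beta in (1.0, 0.9, 0.7):
    print(beta, round(min_weekly_delta(50, 48, 25, beta=beta), 3))
```

In this sketch a lower β (more present bias) raises the δ required for cooperation, which is the direction Hypothesis 1 predicts.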

3.2 Time Consistency and Cooperation

Time consistency requires that a decision maker’s choices not depend on the time at which decisions are made. The same idea applies to the play of a repeated game. For instance, at the beginning of a repeated game, a subject may have a plan of action in mind to cooperate in week 2 if both players cooperated in week 1. Suppose that week 2 arrives and both players cooperated in week 1. If he deviates from his plan of action by defecting, this could be because his preferences in week 2 differ from those in week 1. In other words, a subject with time consistent preferences is more likely to adhere to his ex ante plan of action than a subject with time inconsistent preferences. By eliciting subjects’ plan of action in week 1 and comparing it to their actual behavior in week 2, we test the following hypothesis.

Hypothesis 2 (Time consistency). In week 2, subjects who exhibit time consistency will be less likely to deviate from their plan of action specified in week 1 than subjects who exhibit time inconsistency.

3.3 Time Invariance and Cooperation

One key feature of a repeated game is that the structure of a subgame is identical to that of the original game. For instance, a subgame that starts in week 2 has an identical structure to that of the original repeated game that begins in week 1. After week 1, if there is no drastic change that may affect cooperation (e.g., beliefs about the other person’s action), choosing an action in week 2 can be regarded as similar to the intertemporal decision that subjects faced in week 1. For instance, suppose a subject cooperates in week 1 because she believes the opponent will also cooperate. Suppose that when week 2 arrives, she observes that both cooperated in week 1. What would she do? If she has the same time preferences in week 2 as she had in week 1 and holds a similar belief about the opponent, she will cooperate. If she chooses to defect, the change in her action in week 2 may be due to changes in her time preferences between weeks 1 and 2.

A subject with time invariant preferences should have identical time preferences at different points in time. That is, if everything else is equal, decisions made by subjects with time invariant preferences do not depend on specific calendar dates. It is then intuitive to expect that after mutual cooperation in week 1, subjects are more likely to cooperate in week 2 if they have time invariant preferences. This argument leads to the following hypothesis.

Hypothesis 3 (Time invariance). After mutual cooperation in week 1, subjects with time invariant preferences will be less likely to defect than subjects with time variant preferences in week 2.

A similar logic applies to week 2 behavior after mutual defection in week 1. However, given the belief that the opponent will defect, the extent to which changes in time preferences make deviations to cooperation profitable may be different. Therefore, it is an empirical question whether time (in)variance better explains deviations from mutual cooperation or from mutual defection.

4 Results

4.1 Implementation of a Longitudinal Experimental Design

This experiment is the first longitudinal design in which a repeated game is played over time with weekly payments. We first check whether the experiment was successfully conducted in MTurk by looking at subjects’ participation over time. The left panel in Figure 3 shows the number of subjects in each week across the sessions. The main observation is that there is a big drop in participation between week 0 and week 1. After week 1, however, the rate at which subjects fail to return decreases over time. Such patterns are also clearly depicted in the right panel of Figure 3. Overall, the rate of attrition is 13.4% between week 0 and week 1. Once subjects return in week 1, attrition rates decline sharply: from week 2 onward, the average rate of return is higher than 95%.21 As a result, 969 out of 1,345 (72%) subjects completed all weeks of their session.

Figure 3: Subjects’ participation over time

21Week 6 of Session 5 was the week of Christmas. Even in this week, 95.2% of subjects returned.

The high rate of return after week 1 is important to the play of a repeated game, because the probability of return may change the set of equilibrium outcomes. Suppose that the “effective” discount factor is continuation probability × probability of return × δ. Assuming (conservatively) a probability of return of 0.90 from the data, the new threshold of δ above which cooperation can be supported as an equilibrium outcome is 0.119, still low enough for the subjects to sustain cooperation. Even with a returning probability of 0.72 as the lower bound, the threshold for δ is 0.148. The high rate of return is also important for subjects to expect that their opponent is highly likely to return to the experiment.
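The adjusted thresholds can be reproduced with a few lines of arithmetic. The Python sketch below assumes that the critical value of the effective discount factor is 0.08; this number is not stated in the text and is back-solved from the reported benchmark (0.107 × 0.75 ≈ 0.08), so it should be read as an assumption.

```python
CONTINUATION_PROB = 0.75
# Assumption: critical value of the effective discount factor, back-solved from the
# reported weekly threshold of 0.107 (0.107 * 0.75 is roughly 0.08).
CRITICAL_EFFECTIVE = 0.08

def weekly_delta_threshold(return_prob, critical=CRITICAL_EFFECTIVE,
                           continuation_prob=CONTINUATION_PROB):
    """Smallest delta such that continuation_prob * return_prob * delta >= critical."""
    return critical / (continuation_prob * return_prob)

for p in (1.0, 0.90, 0.72):
    print(p, round(weekly_delta_threshold(p), 3))
# 1.0 0.107
# 0.9 0.119
# 0.72 0.148
```

Under this assumption the three printed values match the thresholds reported in the text for certain return, a return probability of 0.90, and the lower bound of 0.72.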

4.2 Present Bias, Impatience, and Cooperation

Given the proper implementation of the experimental design, we turn our attention to behavior in a repeated PD game. Figure 4 shows the overall cooperative behavior over time in each session. In week 1, the average rate of cooperation is 78.1%; after week 1, the rate declines slightly until week 5, resulting in an overall cooperation rate of 71.1% for all weeks. These rates of cooperation are higher than those in Dal Bó and Fréchette (2011). In their experiment, the average rate of cooperation for the first stage game of the first repeated game was 56.8% and the average rate of cooperation for the first repeated game was 56.1%.22 Note that there were two sessions that started on October 14th, with and without the elicitation of a plan of action. As in Dal Bó and Fréchette (2019), cooperative behavior in those sessions is not significantly different in week 1 and in all weeks (p-values: 0.191 and 0.999, respectively).23 Therefore, all five sessions are included in the analyses below.

Figure 4: Average cooperation over time

We then examine our first hypothesis regarding the relationship between subjects’ measured present bias (β) and impatience (δ), and cooperation. Table 3 presents marginal effects from probit regressions. To control for the effect of the subjective probability of attrition on behavior, we divide the whole sample into two groups: those who (1) participated in all weeks of the experiment (“No Attrition”) and (2) failed to return in the course of the experiment (“Attrition”). In week 1, we find that β is positively and significantly correlated with cooperation for the Attrition group, while no significant correlation is found in the No Attrition group. It is interesting to see that, although insignificant, the coefficients on δ for both groups are negative, contrary to our prediction. However, as subjects gain experience, measured time preferences seem to be more closely related to cooperation in a repeated game. In week 2, a 1% increase in β is now significantly correlated with a 0.52% increase in cooperation for the No Attrition group, while the coefficient for the Attrition group is still positive but loses its significance. For both groups, subjects with higher δ are more likely to cooperate, although the coefficients are not significant. In week 3 and after, β and δ are still positively associated with cooperation. For the Attrition group, a 1% increase in δ is now significantly associated with a 1.8% increase in cooperation. In columns (3)-(6), we also add a dummy variable indicating whether the opponent cooperated in the previous week, to control for beliefs about the opponent’s cooperation in the current week. Subjects who observed the opponent’s cooperation in the previous week are highly likely to cooperate in the current week. Taken together, β has a more consistent and stronger relationship with cooperation than δ.

22The difference in cooperation rates is partially due to the fact that subjects with less present bias and more patience participated longer in our experiment. As shown in Table 7 in the Appendix, attrition between week 0 and 1 is negatively correlated with β, and more patient subjects remain in the experiment longer. Based on the result regarding hypothesis 1, the rate of cooperation could have been lower if the first stage game had begun in week 0.

23P-values are assessed by probit regressions. In all regressions throughout the paper, unless specified otherwise, standard errors are clustered at the level of subject.

Table 3: β, δ, and cooperation (Probit - Marginal effects)

                  Week 1                      Week 2                      Week 3 and after
                  (1)           (2)           (3)           (4)           (5)           (6)
                  No Attrition  Attrition     No Attrition  Attrition     No Attrition  Attrition
β                 0.228         0.874**       0.518**       0.368         0.440*        0.877
                  (0.216)       (0.435)       (0.247)       (0.663)       (0.243)       (1.251)
δ                 -0.050        -0.452        0.084         0.765         0.127         1.830**
                  (0.168)       (0.355)       (0.192)       (0.495)       (0.176)       (0.876)
Opponent coop.                                0.476***      0.378***      0.710***      0.651***
                                              (0.036)       (0.109)       (0.019)       (0.086)

Observations      969           196           969           111           2,709         153

Notes: Dependent variable: cooperation=1, defection=0. Opponent cooperation=1 if the opponent in the previous week cooperated, otherwise 0. No attrition refers to subjects who participated in all weeks of the experiment. Attrition indicates subjects who failed to return in the course of the experiment. Clustered standard errors at the subject level in parentheses. Marginal effects are taken at the mean. ***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level.
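As a rough reading aid for Table 3, the Python sketch below converts a reported estimate and its standard error into a z-statistic and a two-sided p-value under a standard normal approximation. This is only a sanity check on a few entries copied from the table. The paper's own significance stars may be computed differently (for example, from the underlying probit coefficients), so agreement is not guaranteed in general.

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided p-value for estimate / std_error under a standard normal approximation."""
    z = abs(estimate / std_error)
    return math.erfc(z / math.sqrt(2.0))

def stars(p):
    """Significance stars at the 1, 5, and 10 percent levels."""
    return "***" if p < 0.01 else "**" if p < 0.05 else "*" if p < 0.10 else ""

# (label, marginal effect, clustered standard error) copied from Table 3
entries = [
    ("beta, week 1, Attrition", 0.874, 0.435),
    ("beta, week 2, No Attrition", 0.518, 0.247),
    ("beta, week 3 and after, No Attrition", 0.440, 0.243),
    ("delta, week 3 and after, Attrition", 1.830, 0.876),
    ("delta, week 1, No Attrition", -0.050, 0.168),
]
for label, est, se in entries:
    p = two_sided_p(est, se)
    print(f"{label}: z = {est / se:.2f}, p = {p:.3f} {stars(p)}")
```

For these five entries the approximation gives the same star levels as the table.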

Result 1. Subjects with less present bias are more likely to cooperate and the association between present bias and cooperation is stronger than that between patience and cooperation.

There are some remarks worth mentioning. First, while previous experiments did not find a robust relationship between measured preferences and cooperation (see Sabater-Grande and Georgantzis, 2002; Davis et al., 2016; Dreber et al., 2014; Proto et al., 2019), our experiment finds positive correlations between β and cooperation. Second, while subjects were asked to make their decision between Wednesday and Friday, payments were made on Friday evening, after subjects’ decisions had been collected. The unavoidable delays between when decisions were made and when subjects were paid could have made the role of present bias much weaker, as there is no “immediate” reward. However, the result above shows that subjects, even with such delays, may perceive play of a repeated game as decisions with immediate rewards, so that β is correlated with cooperation. Third, the short time window between stage games is possibly related to the relatively weak relationship between δ and cooperation. A week between stage games may not be long enough to induce meaningful differences in δ among subjects. In contrast, β depends less on the length of time windows. Therefore, an interesting question for future research is how different time windows between stage games affect play of a repeated game over time. Lastly, subjects may have uncertainty about being re-matched with another subject due to the opponent’s attrition. We do not have a measure of such uncertainty; instead, we look at the case in which subjects learn they were re-matched in the previous week. Table 8 in the Appendix shows that there is no significant effect of re-matching on cooperation in weeks 3 and 4.

4.3 Time Consistency and the Plan of Action

To study the relationship between time consistency and cooperation, we compare subjects’ behavior in week 2 with the plan of action that subjects specified in week 1. Table 4 presents marginal effects from probit regressions that use the session with the elicitation of a plan of action. Column 1 reveals that in week 2, subjects with time consistent preferences are significantly less likely to deviate from their plan of action than subjects with time inconsistent preferences. This result is robust to the inclusion of β and δ. Looking separately at deviations after different histories allows for a better understanding. Column 3 deals with deviations from a plan of action after mutual cooperation occurred in week 1. We find that the correlation is highly significant: subjects with time consistency are 15.1% less likely to deviate from a plan of action than subjects with time inconsistency. Among the 132 subjects who observed mutual cooperation in week 1, 97.2% originally planned to cooperate after mutual cooperation in week 1. Most subjects who deviated from this plan of action turn out to be time inconsistent. This result is also robust to the inclusion of β and δ and implies that time consistency and stationarity (i.e., β) work as different sources of dynamic inconsistency and reflect different aspects of cooperation. For the other histories in columns (5) and (6), 37.0%, 17.2%, and 6.2% of subjects planned to cooperate after CD, DC, and DD, respectively. However, we only find insignificant correlations for those paths, although time consistency is still negatively correlated with deviations.24

Table 4: Time consistency and deviation from the plan of action (Probit - Marginal effects)

                        All history                 After CC                    Other
                        (1)           (2)           (3)           (4)           (5)           (6)

Time consistency (=1)   -0.118**      -0.103*       -0.151**      -0.143***     -0.052        -0.027
                        (0.060)       (0.063)       (0.062)       (0.063)       (0.118)       (0.140)
β                                     0.139                       0.007                       0.716
                                      (0.454)                     (0.379)                     (1.069)
δ                                     -0.268                      -0.132                      -0.624
                                      (0.362)                     (0.308)                     (0.798)

Observations            204           204           132           132           72            72

Notes: Dependent variable: deviation from a plan of action=1, otherwise=0. Clustered standard errors in parentheses. Marginal effects are taken at discrete change from 0 to 1 for dummy variables. ***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level.

Result 2. After mutual cooperation in week 1, subjects with time consistency deviate less from their plan of action (i.e., cooperation) in week 2.

It is important to note that the result regarding time consistency is orthogonal to uncertainty about the opponent’s attrition. In week 1, subjects already know that they will be re-matched with another subject in week 2 if their current opponent fails to return to the experiment. That is, subjects’ plans of action specified in week 1 already take into account uncertainty about the current opponent’s attrition: the decision making environment in week 2 is exactly the one that subjects imagined in week 1. Therefore, the result above cleanly shows how time consistency is associated with deviations from plans of action.

24One possible reason for the insignificant correlations is the small number of observations for each path of histories (27, 29, and 16 observations for CD, DC, and DD, respectively).

4.4 Time Invariance and Persistent Cooperation

In this subsection, we test whether time invariant subjects are less likely to deviate from mutual cooperation than time variant subjects in week 2. Table 5 shows the marginal effects from probit estimations in which behavior in week 2 is examined. Columns 1 and 2 consider behavior in week 2 after mutual cooperation in week 1. In both columns, time invariance is significantly and negatively correlated with subjects’ deviation from the mutual cooperation that occurred in week 1. In column 1, subjects who exhibit time invariance are 6.5% less likely to break persistent cooperation than subjects who exhibit time variance. In column 2, the relationship between time invariance and deviation from mutual cooperation remains significant after controlling for β and δ. This result also indicates that time invariance reflects different aspects of cooperative behavior in a repeated game than β and δ. We also look at whether a similar argument applies to the case of mutual defection in week 1. Interestingly, subjects with time invariance are more likely to cooperate after mutual defection in week 1, but the coefficient is only marginally significant when β and δ are controlled for. However, the small number of observations for mutual defection in week 1 makes it hard to draw conclusions as definite as those for mutual cooperation.

Table 5: Time invariance and deviation from behavior in week 1 (Probit - Marginal effects)

                        After CC                    After DD
                        (1)           (2)           (3)           (4)

Time invariance (=1)    -0.065***     -0.059***     0.161         0.209*
                        (0.019)       (0.018)       (0.116)       (0.113)
β                                     0.073                       0.818
                                      (0.125)                     (1.057)
δ                                     0.148                       -1.147
                                      (0.094)                     (0.738)

Observations            642           642           53            53

Notes: Dependent variable: deviation from an action in week 1=1, otherwise=0. Clustered standard errors in parentheses. Marginal effects taken at discrete change from 0 to 1 for dummy variables. ***Significant at the 1 percent level. **Significant at the 5 percent level. *Significant at the 10 percent level.

Result 3. After mutual cooperation in week 1, subjects with time invariance are less likely to break persistent cooperation in week 2 than subjects with time variance.

Meanwhile, we also look at mutual cooperation that lasted longer than the first two weeks. We find that time invariance is no longer significantly correlated with deviation from cooperation that lasted for the previous two weeks. More than 97.5% of subjects who observed no defection in weeks 1 and 2 choose to cooperate in later weeks. This observation suggests that the cooperative relationship established in early weeks provides subjects with evidence of the benefits of future cooperation. In other words, possible variations in time preferences in week 3 are not enough to make the immediate rewards from defection more profitable than the delayed rewards from mutual and persistent cooperation in the future.

4.5 Are MTurkers Less Rational than Undergraduates?

In the previous subsections, we find that subjects’ measured time preferences are meaningfully associated with various facets of cooperation in a repeated PD game. One question that can be raised is how rational MTurkers are. We try to answer this question by comparing measured time preferences in our experiment with those of undergraduates in Halevy (2015).25 Figure 5 shows the distribution of β and δ. There is a negative relationship between δ and β by construction. Given that subjects’ choices in block 1 and 2 are highly correlated (Spearman’s rank correlation coefficient: 0.755 with p-value<0.001), 71.1% of subjects have β equal to 1, i.e., stationary time preferences; 16.4% and 12.5% of subjects have present bias (β < 1) and future bias (β > 1), respectively. At the aggregate level, the averages of β and δ are 0.995 and 0.906, respectively. These numbers are quite consistent with those in Halevy (2015). Two different amounts of money were used in his experiment, $10 and $100. The averages of estimated β are 0.994 and 1.00 for $10 and $100, respectively, in line with the average of β in this paper. The averages of estimated δ are 0.944 and 0.963 for $10 and $100, respectively, and are higher than the average of δ in this paper. The differences in discount factors may be due to the “magnitude effect”: small outcomes are usually discounted more than large outcomes.26

Figure 5: Distribution of β and δ

25The undergraduate subjects at the University of British Columbia in Canada participated in the experiment of Halevy (2015).

26See Frederick et al. (2002) for more experimental evidence of the magnitude effect.

We next look at the distribution of three properties of time preferences and check whether subjects in MTurk are more or less rational than undergraduate subjects in the lab. Table 6 compares the classification of subjects in Halevy (2015) and this experiment.27 The distributions of time preferences in both experiments appear very similar. Interestingly, the proportion of subjects who have stationary, time consistent, and time invariant preferences is higher in this experiment than in Halevy (2015). In the same vein, the proportion of subjects who exhibit the most irrational behavior (non-stationary, time inconsistent, and time varying preferences) is lower in MTurk than in the lab. These observations help us validate the quality of the data from MTurk, leading to the conclusion that subjects’ behavior in MTurk is comparable to, and even more rational than, that of undergraduates in the lab.
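The classification in Table 6 can be read as three pairwise comparisons of a subject's choices x1, x2, and x3. The Python sketch below encodes that reading. The mapping is an assumption inferred from the row labels of the table (stationarity as x1 = x2, time invariance as x1 = x3, time consistency as x2 = x3), with x1, x2, x3 presumably summarizing the choices in elicitation blocks 1 to 3; the function name and the returned dictionary are illustrative.

```python
def classify(x1, x2, x3):
    """Classify a subject from three comparable choices, following the rows of Table 6.

    Assumed mapping, read off the table: stationarity is x1 == x2,
    time invariance is x1 == x3, and time consistency is x2 == x3.
    """
    stationary = x1 == x2
    invariant = x1 == x3
    consistent = x2 == x3
    if stationary and invariant:
        category = "Time Invariant Stationary"
    elif invariant:
        category = "Time Invariant Non-Stationary"
    elif stationary:
        category = "Time Varying Stationary"
    elif consistent:
        category = "Non-Stationary Consistent"
    else:
        category = "Time Varying Non-Stationary Inconsistent"
    return {
        "category": category,
        "stationary": stationary,
        "time_invariant": invariant,
        "time_consistent": consistent,
        "present_bias": x2 < x1,  # identification rule stated in the note to Table 6
    }

print(classify(5, 5, 5)["category"])  # Time Invariant Stationary
print(classify(5, 3, 3)["category"])  # Non-Stationary Consistent
```

Any two of the three equalities imply the third, which is why only five categories appear: a subject satisfies all three properties, exactly one, or none.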

27Halevy (2015) used two different treatments for eliciting time preferences. Since the results from the two treatments are quite similar, we report the total number of subjects over the two treatments in Table 6. See Halevy (2015) for more details.

Table 6: Classification of subjects

                                                 This experiment   Halevy (2015) - $10   Halevy (2015) - $100
                                                 %                 %                     %
Time Invariant Stationary (x1 = x2 = x3)         45.95             38.07                 35.80
Time Invariant Non-Stationary (x2 ≠ x1 = x3)     5.88 (3.47)       7.95 (5.68)           10.23 (4.55)
Time Varying Stationary (x1 = x2 ≠ x3)           25.14             24.43                 21.02
Non-Stationary Consistent (x1 ≠ x2 = x3)         9.06 (4.91)       13.64 (5.68)          10.23 (5.11)
Time Varying Non-Stationary Inconsistent         13.97 (7.71)      15.91 (5.68)          22.72 (11.36)

Total (%)                                        100               100                   100
Observations                                     1,032             176                   176

Note: Present bias is identified if x2 < x1. Numbers in parentheses indicate the proportion of subjects who have preferences consistent with present bias.

5 Discussion and Conclusion

In this paper we measure and relate subjects’ time preferences to their behavior in an infinitely repeated prisoner’s dilemma game. We implement a longitudinal experimental design for a repeated game in which subjects play one stage game and receive associated payoffs each week. This design allows us to examine the relationship between various aspects of time preferences and cooperation. First, we find that the degree of present bias (non-stationarity) is negatively correlated with cooperation. There is a weaker association between the discount factor and cooperation. Second, in week 2, subjects with time consistent preferences are less likely to deviate from their plan of action specified in week 1 than subjects with time inconsistent preferences. Third, we find that subjects who exhibit time varying preferences are more likely to break cooperative relationships. The results regarding time consistency and invariance are robust to the inclusion of present bias and discount factor, implying that those two properties of time preferences capture aspects of intertemporal decisions in repeated games that are distinct from non-stationarity and patience.

Previous works in the literature found that measured time preferences are related to intertemporal choices in individual decision making, for instance, credit card borrowing (Meier and Sprenger, 2010), adolescent alcohol consumption and obedience to school code (Sutter et al., 2013), and smoking (Harrison et al., 2010). Beyond the scope of the existing literature, studying strategic interactions over time is important because people usually interact with others repeatedly over time in many real world situations. For instance, in firms and other organizations, many projects are conducted by a team or a partnership that may be maintained for months or years. This paper is the first attempt to explore the role of time preferences in such strategic interactions over time. Therefore, the experimental evidence provided here may shed light on some questions about individual behavior in organizations. Who is going to work hard (or shirk) on a team-based task? What will be optimal incentives for workers who have non-stationary, time varying, or time inconsistent preferences in a unit of teams?

The experimental result of this paper also raises a question regarding the role of time preferences in other dynamic games. Given that time inconsistency and variance have significant associations with cooperation in the play of a repeated game, theoretical endeavors that address dynamic inconsistency beyond non-stationarity in dynamic games will be an interesting avenue for future work. Based on such theoretical developments, the experimental design introduced in this paper can be applied to study the role of time preferences in other dynamic games such as Rubinstein bargaining, Multilateral bargaining, and in oligopoly.

References

[1] Abreu, Dilip, David Pearce, and Ennio Stacchetti (1990). “Toward a theory of discounted repeated games with imperfect monitoring.” Econometrica: 1041-1063.

[2] Andreoni, James, and Charles Sprenger (2012). “Estimating time preferences from convex budgets.” American Economic Review, 102(7): 3333-3356.

[3] Andreoni, James, and Charles Sprenger (2012b). “Risk preferences are not time preferences.” American Economic Review, 102(7): 3357-3376.

[4] Andersen, Steffen, Glenn W. Harrison, Morten I. Lau, and E. Elisabet Rutström (2008). “Eliciting risk and time preferences.” Econometrica, 76(3): 583-618.

[5] Augenblick, Ned, Muriel Niederle, and Charles Sprenger (2015). “Working over time: Dynamic inconsistency in real effort tasks.” Quarterly Journal of Economics, 130(3): 1067-1115.

[6] Chakraborty, Anujit, Yoram Halevy, and Kota Saito (Forthcoming). “The Relation between Behavior under Risk and over Time.” American Economic Review: Insights.

[7] Chade, Hector, Pavlo Prokopovych, and Lones Smith (2008). “Repeated games with present-biased preferences.” Journal of Economic Theory, 139(1): 157-175.

[8] Dal Bó, Pedro (2005). “Cooperation under the shadow of the future: experimental evidence from infinitely repeated games.” American Economic Review, 95(5): 1591-1604.

[9] Dal Bó, Pedro, and Guillaume R. Fréchette (2011). “The evolution of cooperation in infinitely repeated games: Experimental evidence.” American Economic Review, 101(1): 411-429.

[10] Dal Bó, Pedro, and Guillaume R. Fréchette (2018). “On the determinants of cooperation in infinitely repeated games: A survey.” Journal of Economic Literature, 56(1): 60-114.

[11] Dal Bó, Pedro, and Guillaume R. Fréchette (2019). “Strategy choice in the infinitely repeated prisoners’ dilemma.” American Economic Review, 109(11): 3929-3952.

[12] Davis, Douglas, Asen Ivanov, and Oleg Korenok (2016). “Individual characteristics and behavior in repeated games: an experimental study.” Experimental Economics, 19(1): 67-99.

[13] Dean, Mark, and Anja Sautmann (2014). “Credit constraints and the measurement of time preferences.” Available at SSRN 2423951.

[14] Dreber, Anna, Drew Fudenberg, and David G. Rand (2014). “Who cooperates in repeated games: The role of altruism, inequity aversion, and demographics.” Journal of Economic Behavior and Organization, 98: 41-55.

[15] Frederick, Shane, George Loewenstein, and Ted O’Donoghue (2002). “Time discounting and time preference: A critical review.” Journal of Economic Literature, 40(2): 351-401.

[16] Fudenberg, Drew, and David K. Levine (2006). “A dual-self model of impulse control.” American Economic Review, 96(5): 1449-1476.

[17] Fudenberg, Drew, and Eric Maskin (1986). “The folk theorem in repeated games with discounting or with incomplete information.” Econometrica: 533-554.

[18] Giné, Xavier, Jessica Goldberg, Dan Silverman, and Dean Yang (2018). “Revising commitments: Field evidence on the adjustment of prior choices.” Economic Journal, 128(608): 159-188.

[19] Halevy, Yoram (2015). “Time consistency: Stationarity and time invariance.” Econometrica, 83(1): 335-352.

[20] Harrison, Glenn W., Morten I. Lau, and E. Elisabet Rutström (2010). “Individual discount rates and smoking: Evidence from a field experiment in Denmark.” Journal of Health Economics, 29(5): 708-717.

[21] Horton, John J., David G. Rand, and Richard J. Zeckhauser (2011). “The online laboratory: Conducting experiments in a real labor market.” Experimental Economics, 14(3): 399-425.

[22] Laibson, David (1997). “Golden eggs and hyperbolic discounting.” Quarterly Journal of Economics: 443-477.

[23] Lehrer, Ehud, and Ady Pauzner (1999). “Repeated games with differential time preferences.” Econometrica, 67(2): 393-412.

[24] Meier, Stephan, and Charles Sprenger (2010). “Present-biased preferences and credit card borrowing.” American Economic Journal: Applied Economics, 2(1): 193-210.

27 [25] Obara, Ichiro, and Jaeok Park (2017).“Repeated games with general discounting.”Jour- nal of Economic Theory, 172: 348-375.

[26] Phelps, Edmund S., and Robert A. Pollak (1968). “On second-best national saving and game-equilibrium growth.” Review of Economic Studies, 35(2): 185-199.

[27] Proto, Eugenio, Aldo Rustichini, and Andis Sofianos (2019). “Intelligence, Personality, and Gains from Cooperation in Repeated Interactions.” Journal of Political Economy, 127(3): 1351-1390.

[28] Roth, Alvin E., and J. Keith Murnighan (1978). “Equilibrium behavior and repeated play of the prisoner’s dilemma.” Journal of Mathematical Psychology, 17(2): 189-198.

[29] Sabater-Grande, Gerardo, and Nikolaos Georgantzis (2002). “Accounting for risk aver- sion in repeated prisoners’ dilemma games: An experimental test.” Journal of Economic Behavior and Organization, 48(1): 37-50.

[30] Schweighofer-Kodritsch, Sebastian (2018). “Time preferences and bargaining.” Econo- metrica, 86(1): 173-217.

[31] Sherstyuk, Katerina, Nori Tarui, and Tatsuyoshi Saijo (2013). “Payment schemes in infinite-horizon experimental games.” Experimental Economics, 16(1): 125-153.

[32] Sutter, Matthias, Martin G. Kocher, Daniela Gl¨atzle-Rutzler,¨ and Stefan T. Trautmann (2013). “Impatience and uncertainty: Experimental decisions predict adolescents’ field behavior.” American Economic Review, 103(1): 510-531.

[33] Wright, Julian (2013). “Punishment strategies in repeated games: Evidence from experimental markets.” Games and Economic Behavior, 82: 91-102.

Appendices

A Additional analyses

A.1 Predictor of attrition

Because our longitudinal experiment is conducted over several weeks, subjects’ continued participation can itself be interpreted as an intertemporal decision: there is a tension between the immediate benefits of attrition and the later, but larger, overall benefits of participation. We examine whether β and δ measured at the beginning of the experiment predict attrition and the length of participation.28 If β and δ measure subjects’ time preferences, it is intuitive to expect that subjects with higher β and higher δ are (1) less likely to drop out and (2) more likely to participate longer in the experiment.

Table 7 presents the results. Columns 1-3 report the marginal effects from probit regressions in which the dependent variable equals 1 if attrition happens before the session has finished and 0 otherwise. Column 1 considers attrition over all weeks of the experiment. Columns 2 and 3 look at attrition more narrowly, focusing on the subjects who leave the experiment after week 0 and after week 1, respectively. Overall, subjects with higher β and δ are significantly less likely to drop out of the experiment. Looking at attrition after week 0 and after week 1 separately allows us to distinguish the effects of β and δ: attrition after week 0 is significantly and negatively correlated with β, but not with δ, whereas attrition after week 1 is explained only by δ.

28 It is less likely that subjects’ time constraints would affect their participation. We allow subjects to have at least 48 hours to complete their tasks upon receiving the invitation email, and subjects know that it takes less than 10 minutes to finish each week’s task.

Table 7: Time preferences, attrition, and the length of participation

| | (1) Probit, all weeks | (2) Probit, after week 0 | (3) Probit, after week 1 | (4) Tobit, up to week 3 | (5) OLS, all weeks |
|---|---|---|---|---|---|
| β | -0.491*** | -0.371*** | -0.084 | 4.336** | 3.458** |
| | (0.189) | (0.137) | (0.121) | (1.838) | (1.431) |
| δ | -0.401*** | -0.171 | -0.227*** | 3.678** | 2.422** |
| | (0.151) | (0.114) | (0.088) | (1.466) | (1.066) |
| Cooperation | | | -0.011 | | |
| | | | (0.019) | | |
| Opponent Coop. | | | -0.012 | | |
| | | | (0.019) | | |
| Constant | | | | -1.309 | -1.424 |
| | | | | (2.538) | (1.930) |
| Observations | 1,345 | 1,345 | 1,165 | 1,345 | 1,345 |
| R-squared | | | | | 0.006 |

Notes: Columns (1)-(3) report probit marginal effects. Dependent variable: for probit regressions, attrition=1, otherwise=0. For Tobit and OLS, the dependent variable is the number of weeks in which subjects participated. Cooperation and Opponent Coop. are 1 if a subject and an opponent cooperated in week 1, respectively, otherwise 0. Standard errors in parentheses are clustered at the level of individuals. Marginal effects for dummy variables are taken at the discrete change from 0 to 1. *** Significant at the 1 percent level. ** Significant at the 5 percent level. * Significant at the 10 percent level.

Based on these results, we further investigate whether β and δ predict how long subjects take part in the experiment. It is important to note that the dependent variable, the number of weeks in which subjects participated, is censored from above: subjects could have participated longer had the session not finished. Therefore, it is appropriate to use a Tobit regression to account for the censoring. In addition, the Tobit regression uses the data only up to week 3 because the shortest session lasted only four weeks. Columns 4 and 5 present the results from the Tobit regression and from an OLS regression that treats the dependent variable as uncensored. In both regressions, we find that both β and δ are significant predictors of subject participation: subjects with higher β and δ participate significantly longer in the experiment. Taken together, measured present bias and patience reflect intertemporal aspects of participation in the longitudinal experiment.
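To illustrate why censoring from above matters when choosing between Tobit and OLS, the following is a minimal simulation sketch; it is not the estimation code used for Table 7. It assumes a hypothetical latent participation length that is linear in a single stand-in regressor, caps it at three weeks, and compares one-regressor OLS slopes on the latent and the censored outcome. The function names (`ols_slope`, `simulate`), the true slope of 4, the noise level, and the cap are all illustrative assumptions.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate(n=5000, cap=3.0, seed=0):
    """Return OLS slopes on a latent outcome and on its right-censored version.

    All numbers are hypothetical and are not estimates from the experiment.
    """
    rng = random.Random(seed)
    # Stand-in for a measured time-preference parameter.
    x = [rng.uniform(0.0, 1.5) for _ in range(n)]
    # Weeks a subject would stay if sessions never ended (latent outcome).
    latent = [4.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    # Sessions end, so observed weeks are capped at `cap` (censoring from above).
    observed = [min(w, cap) for w in latent]
    return ols_slope(x, latent), ols_slope(x, observed)


slope_latent, slope_censored = simulate()
print(f"OLS slope on latent weeks:   {slope_latent:.2f}")
print(f"OLS slope on censored weeks: {slope_censored:.2f}")
```

Under these assumptions, the slope estimated on the censored outcome is attenuated relative to the slope on the latent outcome, which is the bias that a Tobit specification is meant to address.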

A.2 The effects of re-matching on behavior

One concern with our longitudinal design is the need to control for subjects’ beliefs about their opponent’s attrition. As we do not have a direct measure of beliefs about such uncertainty, we instead investigate whether learning about being re-matched in the previous week affects behavior in the current week.

Table 8: Re-matching and cooperation in week 3 and after (Probit - Marginal effects)

| | (1) Week 3 | (2) Week 4 | (3) Week 5 and after |
|---|---|---|---|
| Re-matching | 0.028 | 0.017 | 0.191*** |
| | (0.071) | (0.092) | (0.037) |
| β | 0.571* | 0.702* | 0.404 |
| | (0.334) | (0.392) | (0.285) |
| δ | 0.313 | -0.325 | 0.512* |
| | (0.233) | (0.248) | (0.309) |
| Opponent’s coop. | 0.668*** | 0.673*** | 0.759*** |
| | (0.030) | (0.029) | (0.028) |
| Observations | 832 | 809 | 1,221 |

Notes: Dependent variable: cooperation=1, defection=0. Re-matching=1 if subjects were re-matched in the previous week, otherwise 0. Opponent’s coop.=1 if the opponent cooperated in the previous week, otherwise 0. Clustered standard errors at the subject level in parentheses. Marginal effects are taken at the mean. *** Significant at the 1 percent level. ** Significant at the 5 percent level. * Significant at the 10 percent level.

Table 8 shows the marginal effects of re-matching on cooperation in probit regressions. Note that there is a time lag between when re-matching happens and when subjects receive feedback about being re-matched. For instance, if re-matching happens in week 2, then a subject is notified in week 3 that he was re-matched with another subject in week 2. Therefore, the regressions above use data from week 3 onward. We control for subjects’ time preferences by including β and δ, and we also include the opponent’s cooperation in the previous week. In weeks 3 and 4, being informed about re-matching in the previous week does not have a significant effect on cooperation in the current week, and the magnitudes of the coefficients are also very small. Only in later weeks, as shown in column (3), does re-matching lead subjects to cooperate significantly more. This result indicates that the effect of re-matching on behavior is not strong in early weeks, and it can be read as indirect evidence that uncertainty about the opponent’s attrition plays only a weak role in choosing an action.
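For readers less familiar with probit marginal effects, the two conventions mentioned in the table notes (effects evaluated at the mean, and a discrete change from 0 to 1 for dummy variables) can be sketched as follows. This is a minimal illustration under assumed values: the index coefficients and regressor means are hypothetical placeholders, not estimates from the experiment, and the helper names are ours. For a continuous regressor, the marginal effect at the mean is the standard normal density at the mean index times the coefficient; for a dummy such as re-matching, the discrete change is the difference in predicted probabilities with the dummy set to 1 and to 0.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def index(coefs, values):
    """Probit index x'b for regressors stored in dictionaries."""
    return sum(coefs[k] * values[k] for k in coefs)


def marginal_effect_at_mean(coefs, means, name):
    """dP(y=1)/dx_k = pdf(xbar'b) * b_k, evaluated at the regressor means."""
    return norm_pdf(index(coefs, means)) * coefs[name]


def discrete_change(coefs, means, name):
    """P(y=1 | dummy=1) - P(y=1 | dummy=0), other regressors at their means."""
    at_one = dict(means, **{name: 1.0})
    at_zero = dict(means, **{name: 0.0})
    return norm_cdf(index(coefs, at_one)) - norm_cdf(index(coefs, at_zero))


# Hypothetical index coefficients and sample means (not from the paper).
coefs = {"const": -1.0, "beta": 1.2, "delta": 0.8, "rematched": 0.1, "opp_coop": 1.9}
means = {"const": 1.0, "beta": 0.95, "delta": 0.90, "rematched": 0.2, "opp_coop": 0.5}

print("ME of beta at the mean:", round(marginal_effect_at_mean(coefs, means, "beta"), 3))
print("Discrete change, re-matching:", round(discrete_change(coefs, means, "rematched"), 3))
```

Because the density factor is at most about 0.4, a marginal effect at the mean is always smaller in magnitude than the underlying index coefficient, which is one reason the two should not be compared directly.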

B Screen shots for the experiment

B.1 The recruitment screen of the MTurk page

Figure B.1: The recruitment screen of the MTurk page

B.2 The instructions for week 0 (Qualtrics)

Figure B.2: The instructions for block 1 in week 0

Figure B.3: The instructions for block 2 in week 0

B.3 The instructions for a stage game in week 1 (Qualtrics)

Figure B.4: The instructions for a stage game in week 1

Figure B.5: The decision screen for a stage game in week 1

Figure B.6: The screen for specifying the plan of action in week 1
