Dynamic Probability of Reinforcement for Cooperation: Random Game Termination in the Centipede Game

JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 2018, 109, 349–364 NUMBER 2 (MARCH)

EVA M. KROCKOW, ANDREW M. COLMAN, AND BRIONY D. PULFORD

DEPARTMENT OF NEUROSCIENCE, PSYCHOLOGY AND BEHAVIOUR, UNIVERSITY OF LEICESTER, U.K.

Experimental games have previously been used to study principles of human interaction. Many such games are characterized by iterated or repeated designs that model dynamic relationships, including reciprocal cooperation. To enable the study of infinite game repetitions and to avoid endgame effects of lower cooperation toward the final game round, investigators have introduced random termination rules. This study extends previous research that has focused narrowly on repeated Prisoner’s Dilemma games by conducting a controlled experiment of two-player, random termination Centipede games involving probabilistic reinforcement and characterized by the longest decision sequences reported in the empirical literature to date (24 decision nodes). Specifically, we assessed mean exit points and cooperation rates, and compared the effects of four different termination rules: no random game termination, random game termination with constant termination probability, random game termination with increasing termination probability, and random game termination with decreasing termination probability. We found that although mean exit points were lower for games with shorter expected game lengths, the subjects’ cooperativeness was significantly reduced only in the most extreme condition with decreasing computer termination probability and an expected game length of two decision nodes.

Key words: Centipede game, random game termination, backward induction, endgame effects, cooperation, reciprocity

The research reported in this article was supported by awards from Friedrich Naumann Stiftung für die Freiheit to the first author and from the Leicester Judgment and Decision Making Endowment Fund (Grant RM43G0176) to the second and third authors. The authors are grateful to Eike Buabang, who helped with data collection, and Kevin McCracken and Jodil Davis, who helped with software development.

Address correspondence to: Eva M. Krockow, Department of Neuroscience, Psychology and Behaviour, University of Leicester, Leicester LE1 7RH, United Kingdom. E-mail: [email protected], Telephone: +44 (0)116 229 7084, Fax: +44 (0)116 229 7196

doi: 10.1002/jeab.320

© 2018 Society for the Experimental Analysis of Behavior

Cooperative human interactions have frequently been modeled by repeated or sequential games, including the Repeated Prisoner’s Dilemma game and the Centipede game. These games provide abstract decision contexts where two or more players can choose repeatedly between cooperation and defection, either cooperatively sharing a pot of money with the other player or selfishly choosing a larger share for themselves (Krockow, Colman, & Pulford, 2016a). These abstract decision tasks enable the study of fundamental principles underlying human relationships. However, an aspect that has often been neglected in experimental designs is the possible effect of external factors such as random interventions, modeled in game theory by a player called Nature. In real-life decision contexts, circumstances outside of the players’ control can prematurely end a sequence of cooperative turns, for example if one player is forced to leave, or dies.

To provide a methodological implementation of indefinite game repetitions with random stopping through external forces, random game termination was introduced as an alternative to finitely repeated games (Roth & Murnighan, 1978). Unlike more traditional games with explicit, finite horizons (e.g., Selten & Stoecker, 1986), this termination rule involves players being informed of the probability that a further game round will be played but not which particular round will be the last. Random game termination was claimed to avoid an endgame effect. The term ‘endgame’ refers to the final stage in a game of chess, but has also been applied to the analogous stage in an experimental game, that is, the final decisions in the game. The endgame effect, in turn, denotes the behavioral phenomenon that cooperation, even if stable throughout most of the game, suddenly drops when the players can predict that they are approaching the end of the interaction (Andreoni, 1988). Random game termination was introduced to allow for the study of infinitely extended games (Dal Bó & Fréchette, in press; Fréchette & Yuksel, 2017; Normann & Wallace, 2012). Based on these methodological advantages, random game termination rules may increase real-life applicability of repeated games, because human social interactions are rarely characterized by complete-information contexts with finite horizons (Dal Bó, 2005; Jiborn & Rabinowicz, 2003).

Most experimental research using random termination designs has been conducted on the repeated Prisoner’s Dilemma game (RPDG), the iterated version of the dyadic, one-shot Prisoner’s Dilemma (PD), frequently referred to as the most fundamental example of all social dilemmas (Colman & Pulford, 2015; Rapoport, Seale, & Colman, 2015; Roth, 1995). The PD, originally named by Tucker (1950/2001), describes a strategic decision context in which two suspects have been arrested for a joint crime. Both individuals have to choose (separately and simultaneously) between selling out the other person (defection) and staying quiet (cooperation). Their sentences will be long if both of them decide to sell out, and shorter if both remain silent, but if one person chooses betrayal while the other stays silent, then the defector will go free and the cooperator will suffer the maximum sentence. Despite describing a specific decision scenario, the PD can be abstracted to model a general strategic dilemma that crops up in many economic, political, and interpersonal interactions. The RPDG refers to decision contexts in which two individuals complete multiple PDs in a sequence. Roth and Murnighan’s (1978) first investigation of random termination rules in RPDGs suggested that lower termination probabilities increased cooperation relative to higher probabilities. More recently, Dal Bó (2005) conducted a comprehensive experiment on RPDGs using three different random termination rules with expected lengths of one, two, and four game rounds respectively. Additionally, they compared these conditions to finite-horizon games with matching numbers of expected game rounds. The results confirmed the earlier findings of Roth and Murnighan (1978), suggesting that decreasing the likelihood of game termination increased cooperation levels. Furthermore, the results showed that subjects were likely to cooperate more in the infinite-horizon RPDGs than in those of a finite length, even if matched for expected game length. If we interpret the probability of game continuation in such random-termination games as a proxy for the players’ (anticipated) relative frequency of rewarding payoffs, then the results are also arguably consistent with the matching law, according to which the relative frequency of responses (in this case, cooperative decisions) closely approximates the relative frequency of reinforcements in concurrent reinforcement schedules (Herrnstein, 1961).

However, only a few studies (e.g., Engle-Warnick & Slonim, 2004) have investigated random termination rules in games other than the RPDG, and no empirical research has studied random termination rules in Rosenthal’s (1981) Centipede game (CG) (see Fig. 1). In this sequential game with complete and perfect information, two players A and B take turns in deciding between two alternatives: a cooperative GO move that leads the game to continue horizontally across the game tree, and a noncooperative STOP move that terminates the game through an immediate, downward exit move, leaving the defector with a relatively favorable payoff compared to the other player. In the example CG, a GO choice always decreases a player’s payoff by three units and increases the co-player’s payoff by seven units. In this case, the joint payoffs of the player pair increase linearly from one exit point to another, but exponentially increasing versions are also frequently studied. The subgame perfect Nash equilibrium of the CG, as derived through backward induction (BI) reasoning, is the unconditional STOP move by Player A at the first decision node, even though both players would receive higher individual payoffs following just one cooperative move each (for a discussion of BI in the context of the CG see Aumann, 1995, 1998; Colman, Krockow, Frosch, & Pulford, 2016). This surprising conclusion of backward induction reasoning is also consistent with probability discounting, the finding that individuals generally prefer smaller certain rewards to larger lower-probability rewards, the effect being most accurately described by a hyperbolic probability discounting function (Green & Myerson, 2004; Myerson, Green & Morris, 2011).

Fig. 1. Centipede game with a linearly increasing payoff function. The game proceeds from left to right. Two players (A and B) alternate in choosing between cooperative GO moves that continue the interaction by moving horizontally to the right and noncooperative STOP moves that terminate the game by moving down. The numbers at the bottom and right are payoffs to both players, with those of Player A displayed above those of Player B.

While sharing many features of the RPDG, the CG provides a different decision context and deserves investigation in its own right. In the CG, the decision to defect terminates the entire interaction, and its consequences are irrevocable. Retaliation through strategies such as Tit for Tat is therefore not possible. Furthermore, it is characterized by a sequential, reciprocal move structure that may offer a closer model of many real decision situations than the simultaneous decision context of the RPDG (Krockow et al., 2016a). Finally, the payoffs of the standard RPDG remain constant throughout the decision sequence and therefore cannot model the same variety of dynamic incentive structures, including exponentially or linearly increasing payoffs, as the CG.

An example of the game’s application to real-life interactive decisions could include two neighboring couples who alternate helping each other with the baby-sitting. Neither of the couples particularly enjoys looking after the other family’s badly behaved children, and there is always the possibility that one couple could decide to end the relationship without further reciprocation. Nevertheless, in the long run, both couples benefit from the arrangement, because the cost of performing the chore is less than the benefit to the other couple.

In addition to this social decision-making context, the CG has biological applications, for example modeling certain animal mating behaviors. Hermaphrodite organisms (i.e., organisms with both female and male reproductive organs) such as the hermaphrodite sea bass have been found to distribute costly egg production by taking turns with their mates in laying small batches of eggs. This repeated exchange of small batches of eggs for fertilization (as opposed to the production of a large batch by one individual at a time) helps to prevent mutant sea bass with male reproductive organs only from fertilizing all eggs and swimming off without making a similarly large contribution to reproduction (Binmore, 1998). The CG thus provides an interesting experimental paradigm to study mutual trust and related topics of reciprocation, altruism, individual versus group benefits, and long-term versus short-term payoff maximization (e.g., Krockow et al., 2016a; Palacios-Huerta & Volij, 2009).

Whereas the CG has received increasing attention in the literature, with most empirical studies demonstrating high levels of cooperation and reliable deviations from equilibrium play (e.g., Bornstein, Kugler, & Ziegelmeyer, 2004; Krockow, Colman, & Pulford, 2016b; Krockow, Pulford, & Colman, 2015; McKelvey & Palfrey, 1992), only short CGs with finite horizons have been investigated so far. The longest CG used in a published, peer-reviewed experiment was Nagel and Tang’s (1998) 12-node game. That game was presented in reduced normal form, which had the additional advantage of assessing all intended exit points in the game: the structure and interdependence of players’ decisions in the sequential-move version mean that even the most cooperative player can never reach late exit nodes when paired with an early-defecting co-player. However, although the reduced normal form is likely to provide more accurate assessments of the prevalence of altruism in an experimental sample, it omits the sequential player interaction characteristic of the standard CG and reduces the length of time invested in each game. It presents a fundamentally different decision problem and may lead to significantly different behavior (Krockow et al., 2016a). Finally, no research to date has investigated long CGs with different termination rules, including random game termination, even though these could provide informative insights into decision-making situations under the risk of premature termination. Hence, there is a need for the investigation of longer CG sequences with a variety of termination rules (Krockow et al., 2016a).

The present study investigated CGs with up to 24 moves (twice as long as Nagel & Tang’s 1998 version) and linearly increasing payoffs. Additionally, we investigated the effects of four different termination rules, including two novel rules of random termination with increasing and decreasing probabilities of game termination throughout the decision sequence, respectively. No study to date appears to have combined random termination rules with finite game horizons. However, the finite design offers an advantage in the CG inasmuch as it allows for the calculation of mean exit points, an index of cooperation widely used in the previous CG literature. Furthermore, as Selten, Mitzkewitz, and Uhlich (1997) pointed out, infinitely repeated games are not feasible in practice. Experimental subjects always know that the game will have a finite duration, and the time slot they signed up for provides an effective upper bound. Consequently, no experimental game would ever be expected to be infinite.

The study reported below aimed to compare four CG conditions: A: no random game termination; B: random termination with a constant termination probability; C: random termination with increasing probability; and D: random termination with decreasing probability. These conditions were based on theoretical interest and their direct applicability to different real-life decision contexts.

Consider again the neighborly relationship of alternating childcare support which was presented as an example situation earlier. Random termination of the relationship through external factors beyond the neighbors’ control is possible and could follow several different functions. In its simplest form, the probability of the relationship being terminated by an external factor could take on a fixed value. For example, it is possible to imagine a lethal accident cutting the relationship short. Following each cooperative action by either neighbor, an accident could occur by chance, thus rendering either one of the neighboring families unable to engage in further baby-sitting. The probability of such an accident could be fixed (e.g., 1/4) and its value could depend on the general riskiness of the neighbors’ lifestyles.

In a slightly different variation of this scenario, one of the families could be living in a rented house from which the landlords could evict them at any time. The landlords may eventually use the property as their own future retirement home or as the prospective house for their children. In this scenario, the landlord’s choice would be the external factor potentially ending the neighbors’ relationship prematurely. Although the initial probability of the landlord evicting his tenants may be very low, the probability would increase over time.

Finally, consider this third variation of the baby-sitting scenario. The families may have moved to the neighborhood at an early age and with uncertain job prospects. Like many young professionals, they may initially depend on short-term work contracts or insecure temping jobs with zero-hour contracts. Given the initial job insecurity, a long-term stay in the area may be questionable, yielding a high early likelihood of forced relocation. Thus, job insecurity could be another external factor terminating the cooperative interaction between the neighbors. Over time and with increasing work experience, however, job security and financial stability are likely to improve, thus leading to a decreasing probability of the relationship being terminated by environmental factors. Each of the example scenarios maps onto one of our experimental conditions, with the first scenario corresponding to Condition B, the second to Condition C and the third to Condition D.

All conditions of the experiment shared the same maximum game length of 24 nodes but were designed to differ in their expected game lengths as based on the random termination probabilities. While Condition A without random termination had an expected game length of 24 nodes, all random termination conditions had lower expected lengths of approximately 4, 9, and 2 nodes, respectively. Previous literature reviewed above (e.g., Dal Bó, 2005) showed that random termination games of shorter expected lengths produced lower cooperation in the RPDG than games with longer expected lengths. Consequently, we hypothesized a similar decrease of cooperation in Centipede game conditions in which the computer was statistically more likely to end the game earlier. More specifically, we used the order of expected game lengths presented above to arrive at our predictions of cooperation levels in the individual conditions. Based on this order, Condition D with an expected length of just over two decision nodes was hypothesized to yield the lowest cooperation levels, followed by Condition B and then Condition C. Dal Bó (2005) reported that games with fixed lengths decreased cooperation compared to games with random termination rules. However, their treatment games were matched for expected game lengths. Given that our fixed-length game presented in Condition A was characterized by a comparatively high expected length of 24 decision nodes, we hypothesized that this condition would yield higher levels of cooperation than all random-termination conditions in the experiment.

Method

Subjects

A total of 148 undergraduate students from the University of Leicester with a mean age of 19.34 years (SD = 2.86 years) participated in the experiment (see Table 1). All were incentivized with a between-subjects random lottery

system. One person per testing session received the payoff from a randomly chosen game completed during the session. The mean cash remuneration of the selected subjects was £14.36 ($18.00). We chose to select one game for payment randomly rather than calculating an average across all games, because previous literature provided evidence that this method prevents subjects from responding to the individual game repetitions merely as parts of one large “supergame” (Bardsley et al., 2010; Bolle, 1990; Cubitt, Starmer, & Sugden, 1998). In particular, we wanted to ensure that subjects responded to every game as a separate decision context that could determine their total payoff in the experiment. Selecting only one subject per session for payment is common practice in research on experimental games, and informal feedback from subjects confirmed that they were sufficiently motivated by the chance of winning the money.

Table 1
Summary of session and subject details

Condition                  # of subjects   # of sessions   Subjects per session   Rounds per session
A: No random termination   40              2               22, 18                 20
B: Constant δ              34              2               18, 16                 20
C: Increasing δ            40              2               22, 18                 20
D: Decreasing δ            34              2               18, 16                 20

Design

Subjects were randomly allocated to one of four treatment conditions with different CGs. Each game offered a maximum of 24 subject moves, and the combined payoffs of both players at each node increased linearly from 4 at Node 1 to 100 at the natural end. The four treatment conditions varied only as regards the probability δ of random game termination by the computer, as follows. A: no random termination; B: constant termination probability $\delta_B = \frac{1}{4}$ following each subject move; C: increasing termination probability $\delta_C = 0, \frac{1}{44}, \frac{2}{44}, \ldots, \frac{21}{44}, \frac{22}{44}$; and D: decreasing termination probability $\delta_D = \frac{22}{44}, \frac{21}{44}, \ldots, \frac{2}{44}, \frac{1}{44}, 0$. In Condition C, the probability of game termination by the computer steadily increased from 0 at the first node to 1/2 at the last node. Hence, at the first node the computer never chose to terminate, and at the game’s end (i.e., the computer’s 24th decision node) it terminated the game in 50% of the cases. Conversely, in Condition D, the probability of game termination by the computer steadily decreased from 1/2 at the first node to 0 at the last node. Hence, at the first node the computer chose to terminate in 50% of the cases, and at the game’s end it never terminated. In both Conditions C and D, the mean value of δ is 1/4, which is why this value was chosen as the constant termination probability in Condition B. Based on the above probabilities, the expected termination points T by the computer were calculated to be as follows: Condition A, $T_A = 24.00$; Condition B, $T_B = 4.00$; Condition C, $T_C = 8.99$; Condition D, $T_D = 2.13$. The game trees displayed on screen for the different treatment conditions are shown in Figure 2. Detailed plots of the probability functions of games being randomly terminated by the computer at each exit point are provided in Figure 3.

As a general measure of cooperation, the subjects’ cooperation rates were calculated by dividing a player’s number of GO moves by the total number of moves that player made across all 20 game rounds. In the context of the present experiment, the proportion of GO moves provided a more accurate indication of individual cooperation levels than the mean exit points reported in previous studies (e.g., Krockow et al., 2016a), because it took into account the fewer decision opportunities in the three conditions with random termination rules, while also capturing the cooperative moves made in games which were prematurely terminated by the computer.

Additionally, players’ STOP probabilities were calculated for each individual decision node to estimate the likelihood of game termination at each point in the game. This was done by dividing the number of players who chose to STOP at each decision node by the total number of players who had reached the respective node.

Fig. 2. Specific Centipede game trees used in the present experiment: (a) Game tree used for Condition A: a long Centipede game with 24 decision nodes and no random termination by the computer; (b) Game tree used for Conditions B, C, and D: a long Centipede game with 24 decision nodes and random termination by the computer (random termination rules varied across the three conditions).

Materials

The testing sessions were carried out in a large computer laboratory. Each subject was seated at a computer desk, with all desks generously spaced out in the laboratory to avoid any communication between subjects. For the anonymous game interaction, a custom-made web-based game application was used which provided real-time feedback about the subjects’ choices, the computer’s choices and the current round number. The subjects were presented with the game tree of their respective treatment condition. To visualize the computer’s options for random termination in the last three conditions, additional decision nodes with the label C for computer were inserted into the game tree following each player’s decision nodes. Several detailed instruction slides explained the payoff function and the random termination rule for the relevant treatment condition. For example, in Condition C (increasing termination probability), the instructions read:

    The Computer is programmed to make random choices, prefers neither participant, and gains nothing itself.

    The probability that the Computer chooses GO steadily decreases from 1 (at the first circle) to 1/2 (at the last circle). This means that in the beginning it always chooses GO and at the end it chooses GO in 1 out of 2 times.

    The probability that the Computer chooses STOP steadily increases from 0 (at the first circle) to 1/2 (at the last circle). This means that in the beginning it never chooses STOP and at the end it chooses STOP in 1 out of 2 times.

Subsequent screen displays did not include reminders about the computer’s specific termination probabilities at each node. We made this decision despite recent literature suggesting that subjects’ responses to linear probability functions may frequently be distorted, with subjects behaving as though the likelihood of events with low probabilities is higher and the likelihood of events with high probabilities lower than they actually are (e.g., Zhang & Maloney, 2012). Given that the computer’s termination probabilities in Conditions C and D either increased or decreased by 1/44 (0.0227) with each of the computer’s decision nodes passed, we believed that the small fractions or decimal numbers would impose an even greater challenge to the subjects’ adaptive learning than the linear probability functions explained in the instruction slides. The subjects saw eight player nodes at a time, and the display shifted by eight nodes once the game continued beyond the eighth node. The display shifted again to the game’s final set of eight decision nodes if the subjects reached the 16th node. We chose to shift the game tree by eight nodes at a time, because a previous experiment by Krockow, Colman, and Pulford (2017) suggested that subjects struggled with a constantly moving window that always displayed the next eight decision nodes. Additionally, the experiment included a paper-based comprehension test to check for the understanding of the game’s basic features as well as the different termination rules.
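The expected termination points reported in the Design section (about 4.00, 8.99, and 2.13 for Conditions B, C, and D) can be approximately reproduced from the δ schedules. The following is a minimal sketch, not the authors' calculation. It assumes 23 computer decision nodes (the number of values in the listed schedules for C and D) and assumes that a game the computer never stops is counted as ending one point after the last computer node. The function name `expected_termination_point` is hypothetical.

```python
from fractions import Fraction


def expected_termination_point(deltas):
    """Expected index of the computer node at which the game is stopped.

    deltas[k] is the computer's STOP probability at its (k + 1)-th node.
    Assumption: if the computer never stops, the game is counted as
    ending one point after the last computer node.
    """
    expected = Fraction(0)
    survive = Fraction(1)  # probability that the computer has not stopped yet
    for delta in deltas:
        expected += survive      # adds P(termination point >= current node)
        survive *= 1 - delta     # computer chooses GO at this node
    expected += survive          # capped final point (computer never stopped)
    return expected


# Schedules as listed in the Design section (23 values each, an assumption for B).
delta_c = [Fraction(k, 44) for k in range(23)]   # 0, 1/44, ..., 22/44
delta_d = list(reversed(delta_c))                # 22/44, ..., 1/44, 0
delta_b = [Fraction(1, 4)] * len(delta_c)        # constant 1/4

for label, deltas in (("B", delta_b), ("C", delta_c), ("D", delta_d)):
    print(label, round(float(expected_termination_point(deltas)), 2))
```

Exact fractions are used so that the mean of the C and D schedules can be checked to equal 1/4 without rounding error, which is the stated reason for choosing 1/4 in Condition B.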

Fig. 3. Subjects’ exit percentages in the experiment and computer STOP probabilities. Graphs show the percentage of experimental games that were terminated by human subjects at each of the 25 exit nodes in our Centipede games. Additionally, the calculated probabilities of games being terminated by the computer are displayed at each node. Graphs A–D correspond to the four conditions with different types of random computer termination.
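The two cooperation measures defined in the Design section (cooperation rate and conditional STOP probability per node) can be sketched as follows. This is an illustration under assumptions, not the authors' analysis code: the record format, a list of `(last_node_reached, subject_stopped)` pairs per game, and both function names are hypothetical, and the toy data are not taken from the experiment.

```python
from collections import Counter


def cooperation_rate(moves):
    """Share of GO moves among all moves one player made across rounds."""
    return moves.count("GO") / len(moves)


def conditional_stop_probabilities(games, n_nodes=24):
    """STOP probability at each decision node, given the node was reached.

    games: list of (last_node_reached, subject_stopped) pairs, one per game
    (hypothetical format). A game that the computer ended has
    subject_stopped == False.
    """
    stops = Counter(node for node, stopped in games if stopped)
    probs = {}
    for node in range(1, n_nodes + 1):
        reached = sum(1 for last, _ in games if last >= node)
        if reached:  # skip nodes that no game reached
            probs[node] = stops[node] / reached
    return probs


# Toy data for illustration only.
games = [(1, True), (2, False), (3, True), (3, True), (4, False)]
print(cooperation_rate(["GO", "GO", "STOP", "GO"]))
print(conditional_stop_probabilities(games, n_nodes=4))
```

Dividing by the number of games that reached a node, rather than by all games, is what makes the measure conditional: late nodes are reached rarely in the random termination conditions, so raw exit counts alone would understate the tendency to defect there.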

Procedure Table 1). In each testing session, all subjects For each of the four conditions, two testing experienced the same condition, and they sessions were conducted, each of which con- were informed about this fact. The subjects tained between 16 and 22 subjects and took were instructed to focus only on their own approximately 50 min to complete (see materials and computer screens, and the 356 EVA M. KROCKOW et al. experimenters checked that these rules were Condition D (decreasing δ) it was 0.68. Hence, followed at all times across all testing sessions. in Condition C more than half of the games After completing the consent form, subjects were terminated by the subjects, whereas in were presented with detailed, animated the other two treatment conditions with ran- instructions on their computer screens. They dom termination, only around a third of the could work through the slides in their own games were ended by either of the human time, and were given the opportunity to ask subjects. questions in private. Then, they were asked to Taking a closer look at Figure 3, the distribu- ﬁll in a short comprehension test. The experi- tions of subjects’ exit moves show marked differ- menters checked all responses and corrected ences across treatment conditions. Although in any misunderstandings. Subsequently, the Condition A (no random termination) more experiment was started. The computer ran- than 50% of the games were stopped after the domly assigned all subjects to a player role in 20th exit point, not a single game in the other which they remained for the entire testing ses- treatment conditions was stopped after the 20th sion. The subjects were ignorant of the identity exit point. In Condition B (constant δ), games of their co-players, and they were randomly re- stopped by subjects followed a near normal dis- paired after each game round (i.e., after each tribution, with most game exits occurring at the game they completed). 
The re-pairing of third or fourth decision node and no game con- players was randomized with replacement, tinuing beyond the eighth decision node. In meaning that the ideal of perfect stranger Condition C (increasing δ), the pattern also matching (i.e., never encountering the same resembled a bell-shaped distribution but the dis- co-player twice) was not achieved. However, persion was larger. Most subjects exited this given the relatively large size of our testing ses- treatment condition at Node 6, but some games sions (compared to other CG research includ- continued for longer, with 19 being the latest ing Rapoport, Stein, Parco, & Nicholas, 2003), exit point reached. Finally, the exit distribution we do not believe this to be a problem. The of Condition D (decreasing δ), showed an web application provided them with real-time almost linear decrease across exit points. The feedback about all the moves made and on the majority of games (40%) that were exited by outcome of each game. Once each subject had human subjects stopped at Node 1, 30% completed 20 rounds of Centipede games, stopped at Node 2, 20% stopped at Node 3, and one subject was drawn at random for the lot- the ﬁnal 10% stopped at Nodes 4, 5, 6 and tery prize. The winner received his or her out- 7. Interestingly, the exit distributions described come (in pounds sterling) of one randomly above follow the probability function of game selected game which they completed during terminations by the computer. In Condition A the session. with zero possibility of computer termination throughout the game, subjects’ defection levels remain very low across many decision nodes Results before suddenly spiking close to the game’s The proportion of games ending at each end. In Conditions B, C, and D, which were exit node for the different conditions is shown characterized by high game termination proba- in Figures 3 and 4. 
Figure 3 displays the pro- bilities in the beginning, a much higher per- portions of games terminated by human centage of games were stopped at early exit players, and plots these results against the nodes by the subjects. Particularly Condition D probability functions of random computer ter- (decreasing δ) shows a close match between the mination. Figure 4 omits the probability func- subjects’ exit distributions and the computers’ tions, and shows the computer’s actual game linearly decreasing termination probabilities. terminations instead. For examples of individ- The overview of mean exit points is comple- ual behavior, please see the Appendix. mented by the display of players’ conditional As can be seen in Figure 4, a large propor- STOP probabilities at each node (see Fig. 5), tion of games in the treatment conditions with showing percentages of individuals who random computer stopping were in fact termi- reached each decision node and decided to nated by the computer. In Condition B (con- defect at that node. In Condition A stant δ) this proportion amounted to 0.65, in (no random termination), STOP probabilities Condition C (increasing δ) it was 0.49, and in are very low until Node 21, from which point RANDOM GAME TERMINATION 357

Fig. 4. Total exit percentages in the experiment. Graphs show the percentages of experimental games that were terminated at each of the 25 exit nodes in our Centipede games. The black bars represent the percentages of games stopped by experimental subjects. The grey bars represent the percentages of games ended by a computer move. Graphs A–D correspond to the four conditions with different types of random computer termination.

they steadily increase toward a mode of 100% at Node 25. In Condition B (constant δ), STOP probabilities are below 10% on Node 1, but increase almost steadily until Node 7, beyond which no game in this condition continued: The modal STOP probability was above 40% at Node 6. In Condition C (increasing δ), most STOP probabilities of subjects stayed below 20%, and the modal STOP probability was found at Node 19, where a third of all subjects stopped. Finally, in Condition D (decreasing δ), a small bell-curve of STOP probabilities was found: Starting with a percentage of approximately 10% at Node 1, STOP probabilities rise to almost 30% at Node 5 and then begin to fall again.

A Kruskal-Wallis H test was conducted to compare the normalized cooperation rates (i.e., the proportion of GO moves per total moves) per subject across conditions. Significant differences were found, χ²(3) = 14.95, p < .005, with a mean rank cooperation rate of

Fig. 5. Subjects’ STOP probabilities at each of the 24 decision nodes. Based on the experimental results, the graphs display calculated conditional probabilities (in percentages) of a subject choosing “STOP” assuming that they have reached the respective decision point. Graphs A–D correspond to the four conditions with different rules for random computer termination.
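
The conditional STOP probabilities described in the Fig. 5 caption can be illustrated with a minimal Python sketch. The function name and the tallies in the usage example are hypothetical and are not the study's data; the sketch only assumes that, for each decision node, one knows how many subjects reached the node and how many of them chose STOP there.

```python
def conditional_stop_percentages(reached, stopped):
    """Percentage of subjects choosing STOP at each node, given the node was reached.

    reached[k] -- number of subjects who arrived at decision node k + 1
    stopped[k] -- number of those subjects who chose STOP at that node
    Nodes that nobody reached yield None, because the conditional
    probability is undefined there.
    """
    if len(reached) != len(stopped):
        raise ValueError("reached and stopped must have the same length")
    percentages = []
    for n_reached, n_stopped in zip(reached, stopped):
        if n_stopped > n_reached:
            raise ValueError("more STOP moves than subjects at a node")
        if n_reached == 0:
            percentages.append(None)
        else:
            percentages.append(100.0 * n_stopped / n_reached)
    return percentages


# Hypothetical tallies for four nodes (illustrative only, not experimental data).
example = conditional_stop_percentages([100, 90, 60, 0], [10, 18, 15, 0])
print(example)
```

Conditioning on the node having been reached is what distinguishes these values from the raw exit percentages in Fig. 4, where every game contributes to exactly one exit node.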

89.40 for Condition A, 84.49 for Condition C, 59.87 for Condition D, and 59.85 for Condition B (for mean cooperation rates see also Table 2). Pairwise comparisons using Mann–Whitney U tests showed that Condition A (no random termination) with a mean rank of 43.95 had a significantly higher cooperation rate than Condition B (constant δ) with a mean rank of 29.91 (U = 422, p < .005, r = .33). Condition A also had a significantly higher cooperation rate with a mean rank of 43.78 than Condition D with a mean rank of 30.12 (decreasing δ) (U = 429, p < .05, r = .32). Furthermore, Condition C (increasing δ) with a mean rank of 44.18 was found to have a significantly higher cooperation rate than Condition B (constant δ) with a mean rank of 29.65 (U = 413, p < .005, r = .34). Condition C also had a significantly higher cooperation rate with a mean rank of 42.99 than Condition D (decreasing δ) with a mean rank of 31.04 (U = 460.5, p < .05, r = .28).

The mean percentages of GO moves per game round for all four conditions are displayed in Figure 6. Only the graphs of Condition A (no random termination) and Condition C (increasing δ) show discernible temporal trends, indicating an increase of cooperation over rounds. In Condition A, the mean percentage of GO moves increased from a value of approximately 89% in Round 1 to a value of approximately 96% in Round 20. Time series analyses confirmed the learning pattern apparent in Condition A. The SPSS Expert Modeler identified an exponential smoothing Holt linear trend model with parameters of α (level smoother) = 0.20 and γ (trend smoother) = 1.00, indicating a linearly increasing score pattern. The stationary R² model fit statistic was calculated to estimate the model’s goodness of fit. With an R² value of .75, the model can explain approximately 75% of the variance in the data and indicates a superior fit compared to a simple mean model used as a baseline for comparison. Additionally, the Ljung-Box statistic Q was calculated to test whether the model was correctly specified. The value of Q(16) = 18.38, (p = .302) showed that no significant temporal structure in the data set was unaccounted for by the Holt linear model identified.

In Condition C, the mean percentage of GO moves increased from a value of approximately 80% in Round 1 to values above 90% in later rounds. Again, time series analyses identified an exponential smoothing Holt linear trend model with parameters of α (level smoother) = 0.11 and γ (trend smoother) = 2.281E–6, indicating a linearly increasing score pattern. With a stationary R² value of .72, the model can explain approximately 72% of the variance of the data. Additionally, the Ljung-Box statistic Q was calculated; the value of Q(16) = 12.34, (p = .72) showed that no significant temporal structure in the data set was unaccounted for by our model.

Conditions B (constant δ) and D (decreasing δ) did not show any temporal trends. For both conditions, the SPSS Expert Modeler identified ARIMA (0,0,0), a model indicating nothing but white noise in the data across rounds and suggesting that no learning took place.

Discussion

This experiment aimed to extend previous research on repeated games with random termination rules by providing the first investigation of CGs with varying termination rules and long decision sequences. In particular, we used 24-node finite-horizon games and tested for effects of different rules of random computer termination (no random termination, constant, increasing, and decreasing termination probability) on human cooperation levels. All treatment conditions with random computer termination were controlled for average termination probability across the 24 decision nodes (the mean probability was 1/4 for each condition). However, the conditions varied regarding their expected computer termination points, ranging from TD = 2.13 to TC = 8.99. Our results revealed large differences between the four treatment conditions, with subjects’ mean exit points varying across conditions. Condition A (no random termination) yielded significantly higher mean exit points than Condition C (increasing δ), and both of these conditions yielded significantly higher means than Conditions B (constant δ) and D (decreasing δ). Matching the subjects’ mean exit points with the respective expected game lengths (as based on the random computer termination rules), the values of mean exit points follow the same order as the values of the expected game length. More specifically, games with a higher expected game length were stopped later than those with a lower expected game length. Additionally, inspection of results showed a close match between the percentages of subjects’ exit moves per decision node and the random termination probability associated with the respective node. This finding is in line with our hypotheses, and it supports previous experimental results (e.g., Dal Bó, 2005; Roth & Murnighan, 1978).

Table 2
Expected game length and cooperation rate

Condition                  No Termination   Constant δ   Increasing δ   Decreasing δ
Expected game length T     24.00            4.00         8.99           2.13
Cooperation rate, M (SD)   .92 (.10)        .86 (.10)    .92 (.06)      .80 (.22)
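
As a rough check on the expected game lengths in Table 2, the following Python sketch computes the expected computer termination point from a list of per-node termination probabilities. Two things are assumptions rather than the authors' stated method: the function name, and the convention that a game the computer never ends counts as 24 nodes. Only the no-termination and constant (δ = 1/4) schedules are shown, because the exact increasing and decreasing schedules are not given in this section.

```python
def expected_game_length(termination_probs):
    """Expected node at which the computer ends the game.

    termination_probs[k] is the probability that the computer terminates
    at node k + 1, given that the game is still running at that node.
    Assumption: if the computer never terminates, the length is counted
    as the total number of nodes.
    """
    expected = 0.0
    still_running = 1.0  # probability that no termination has occurred yet
    for node, p in enumerate(termination_probs, start=1):
        expected += node * still_running * p
        still_running *= 1.0 - p
    expected += len(termination_probs) * still_running
    return expected


no_termination = expected_game_length([0.0] * 24)   # Condition A
constant_delta = expected_game_length([0.25] * 24)  # Condition B, delta = 1/4
print(round(no_termination, 2), round(constant_delta, 2))
```

Under these assumptions the two schedules come out at 24 and approximately 4, in line with the first two columns of Table 2. Any other schedule with the same mean probability can be passed in the same way, which makes it easy to see how front-loading the termination probability shortens the expected game.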

Fig. 6. Mean cooperation rates (percentage of GO moves) for each of the 20 game rounds. Graphs A–D correspond to the four conditions with different types of random computer termination. Black lines show the observed values (i.e., the data obtained experimentally). Dotted lines show the fit line indicating the temporal data trend.
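
The fit lines in Fig. 6 come from Holt linear trend models identified by the SPSS Expert Modeler. The Python sketch below shows the standard Holt recursions for a level smoother α and a trend smoother γ, using the Condition A estimates reported in the Results (α = 0.20, γ = 1.00). The initialization (level from the first observation, trend from the first difference) is a simple textbook choice and may differ from what SPSS does, and the input series is made-up illustrative data, not the experimental values.

```python
def holt_linear_fit(series, alpha, gamma):
    """One-step-ahead fitted values from Holt's linear trend method.

    alpha -- level smoother, gamma -- trend smoother (both in [0, 1]).
    Assumed initialization: level = first value, trend = first difference.
    Returns (fitted_values, final_level, final_trend).
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    fitted = [series[0]]  # no forecast exists for the first observation
    for observed in series[1:]:
        forecast = level + trend
        fitted.append(forecast)
        new_level = alpha * observed + (1.0 - alpha) * forecast
        trend = gamma * (new_level - level) + (1.0 - gamma) * trend
        level = new_level
    return fitted, level, trend


# Hypothetical, perfectly linear series rising from 89% toward roughly 96%
# over 20 rounds (illustrative only).
rounds = [89.0 + 0.35 * t for t in range(20)]
fitted, final_level, final_trend = holt_linear_fit(rounds, alpha=0.20, gamma=1.00)
print(round(final_level, 2), round(final_trend, 2))
```

On an exactly linear series the recursions reproduce the data, which is why a Holt model is a natural description of a steady round-by-round increase in cooperation. For real analyses a tested library implementation would be preferable to this hand-written sketch.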

Interestingly, however, the decrease of the mean exit points was less severe than what could have been expected from the drastic decrease of expected game length across conditions. For example, although Condition A’s expected game length of 24 nodes was 12 times higher than the expected game length of Condition D (2 nodes), the mean exit point of subjects in Condition A was only 7.07 times higher than in Condition D. This indicates that cooperativeness did not increase proportionately with the expected length of the games.

Indeed, the comparison of subjects’ cooperation rates across treatment conditions confirmed this finding. Cooperation rates were surprisingly high across all conditions, with 98% of subjects choosing GO more than half of the time, and more than 10% always choosing GO. Significant differences in cooperation rates between conditions became apparent, but these differences did not follow the data patterns previously identified when using mean exit points as dependent variable in the analyses. Condition A (no random termination) and Condition C (increasing δ) yielded comparable mean cooperation rates of approximately .92. Condition B (constant δ) produced a mean rate of approximately .86, and Condition D (decreasing δ) generated the lowest cooperation rates (.80). However, due to comparatively high variances within groups, the only significant differences were found between Condition D on the one hand and Conditions A and C on the other hand, indicating that only Condition D, with the lowest expected game length TD = 2.13, resulted in a significant decrease in subjects’ cooperativeness compared to the control condition without random termination.

An explanation for the large variances within groups could be the importance of individual differences influencing cooperation rates. Although the treatment condition had an impact on behavior, other-regarding behavioral propensities (e.g., cooperative social value orientations) may have accounted for some of the variance (e.g., Krockow et al., 2016b; Pulford, Krockow, Colman, & Lawrence, 2016). Additionally, numeracy skills could have had an impact on decision making. The disproportionally large number of cooperative choices in conditions with shorter expected game lengths could be explained by the subjects’ inability to anticipate likely computer exit points from the termination probabilities. In future investigations, any confounding effects of numeracy and mathematical ability could be reduced by informing subjects about the expected game length of their condition before the start of each experiment.

A limitation of the present study’s research design concerns the simultaneous changes to both the expected game length and the computer’s termination rules across conditions. Based on the present design, it is not possible to be certain of the reasons for differences in the cooperation rates across the different games, but we believe that they are jointly influenced by expected game length and termination rules. Future research could extend this study by controlling treatment conditions for the expected game length (rather than the mean termination probability), while comparing different termination rules. Additionally, it is possible that an increase in stimulus control could be achieved by announcing the computer’s termination probabilities at each stage of the game.

When examining temporal data trends, it appears that learning occurred only in the treatment conditions with longer expected game lengths and either no random game termination or increasing probability of termination. In the standard 24-node game, cooperation rates increased linearly with increasing experience in the game, reaching very high rates of over 95% in the final game rounds. Hence, learning occurred in the opposite direction of equilibrium play. Similarly, in the condition with increasing termination probabilities, initial cooperation rates started at 83.3% and many reached percentages higher than 90 toward the final game rounds. This is an interesting finding, as the majority of experimental CG investigations reported decreases in cooperation over rounds (e.g., McKelvey & Palfrey, 1992; Rapoport et al., 2003). Our learning effects could be explained by the linear payoff function and comparatively low risk associated with each GO move in Condition A of the present study. Another reason may be the greater game length, which offers more opportunities for reciprocal cooperation (Krockow et al., 2016a).

Taken together, the findings suggest that CGs with far and finite horizons and linearly increasing payoff functions generate high levels of cooperation that increase with higher experience in the game. When these games are combined with different rules of random game termination by the computer, the subjects’ mean exit points typically decrease. However, subjects’ cooperativeness as assessed by the more accurate measure of cooperation rates may be affected only in conditions with very extreme conditions such as very low expected game lengths. In this experiment, only

Condition D, with decreasing termination probability and an expected game length of approximately two decision nodes, led to a significant decrease in cooperativeness relative to the control condition. Future research should investigate the effects that individual differences may have on cooperation levels in CGs and RPDGs with random termination rules. Interesting variables to investigate could be social value orientation and general numeracy skills. To increase external validity of the current study design further, follow-up research could dispense with the formal rules communicated to experimental subjects, because many real-life choices with probabilistic consequences are not presented with explicit probabilities. We tend instead in some situations to adapt our behavior to probabilities through learning. An experiment with learned instead of explicit probabilities would shift the experimental focus from rule-governed behavior (or instructional control) to a focus on contingency-shaped behavior (learned behavior) (e.g., Cerutti, 1989), which could correspond more closely to everyday experience.

Applying the findings of our abstract game context to the previous real-life examples of different baby-sitting scenarios presented in the introduction, it appears that mutual trust and reciprocal cooperation are common in prolonged decision contexts marked by a personal risk due to the other person’s possible defection. Cooperation is maintained even under circumstances of increased uncertainty including the relationship’s likely termination through an external force beyond the decision makers’ control. Only very extreme conditions, such as an expected interaction length of only two encounters, appear to lead to a significant decrease of cooperation.

References

Andreoni, J. (1988). Why free ride? Strategies and learning in public good experiments. Journal of Public Economics, 37(3), 291–304.
Aumann, R. J. (1995). Backward induction and common knowledge of rationality. Games and Economic Behavior, 8(1), 6–19. https://doi.org/10.1016/S0899-8256(05)80015-6
Aumann, R. J. (1998). On the Centipede game. Games and Economic Behavior, 23(1), 97–105. https://doi.org/10.1006/game.1997.0605
Bardsley, N., Cubitt, R., Loomes, G., Moffatt, P., Starmer, C., & Sugden, R. (2010). Experimental economics: Rethinking the rules. Princeton, NJ: Princeton University Press.
Binmore, K. G. (1998). Game theory and the social contract: Just playing (Vol. 2). Cambridge, MA: MIT Press.
Bolle, F. (1990). High reward experiments without high expenditure for the experimenter. Journal of Economic Psychology, 11(2), 157–167. https://doi.org/10.1016/0167-4870(90)90001-P
Bornstein, G., Kugler, T., & Ziegelmeyer, A. (2004). Individual and group decisions in the Centipede game: Are groups more “rational” players? Journal of Experimental Social Psychology, 40(5), 599–605. https://doi.org/10.1016/j.jesp.2003.11.003
Cerutti, D. T. (1989). Discrimination theory of rule-governed behavior. Journal of the Experimental Analysis of Behavior, 51(2), 259–276. https://doi.org/10.1901/jeab.1989.51-259
Colman, A. M., Krockow, E. M., Frosch, C. A., & Pulford, B. D. (2016). Rationality and backward induction in Centipede games. In N. Galbraith, E. Lucas, & D. E. Over (Eds.), The thinking mind: A Festschrift for Ken Manktelow (pp. 139–150). London: Routledge.
Colman, A. M., & Pulford, B. D. (2015). Psychology of game playing: Introduction to a special issue. Games, 6(4), 677–684. https://doi.org/10.3390/g6040677
Cubitt, R., Starmer, C., & Sugden, R. (1998). On the validity of the random lottery incentive system. Experimental Economics, 1(2), 115–131. https://doi.org/10.1007/BF01669298
Dal Bó, P. (2005). Cooperation under the shadow of the future: Experimental evidence from infinitely repeated games. American Economic Review, 95, 1591–1604. https://doi.org/10.1257/000282805775014434
Dal Bó, P., & Fréchette, G. R. (in press). On the determinants of cooperation in infinitely repeated games: A survey. Journal of Economic Literature.
Engle-Warnick, J., & Slonim, R. L. (2004). The evolution of strategies in a repeated trust game. Journal of Economic Behavior and Organization, 55, 553–573. https://doi.org/10.1016/j.jebo.2003.11.008
Fréchette, G. R., & Yuksel, S. (2017). Infinitely repeated games in the laboratory: Four perspectives on discounting and random termination. Experimental Economics, 20, 279–308. https://doi.org/10.1007/s10683-016-9494-z
Green, L., & Myerson, J. (2004). A discounting framework for choice with delayed and probabilistic rewards. Psychological Bulletin, 130, 769–792. https://doi.org/10.1037/0033-2909.130.5.769
Herrnstein, R. J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267–272. https://doi.org/10.1901/jeab.1961.4-267
Jiborn, M., & Rabinowicz, W. (2003). Reconsidering the Foole’s rejoinder: Backward induction in indefinitely iterated Prisoner’s dilemmas. Synthese, 136(2), 135–157. https://doi.org/10.1023/A:1024731815957
Krockow, E. M., Colman, A. M., & Pulford, B. D. (2016a). Cooperation in repeated interactions: A systematic review of Centipede game experiments, 1992-2016. European Review of Social Psychology, 27, 231–282. https://doi.org/10.1080/10463283.2016.1249640
Krockow, E. M., Colman, A. M., & Pulford, B. D. (2016b). Exploring cooperation and competition in the Centipede game through verbal protocol analysis. European Journal of Social Psychology, 46, 746–761. https://doi.org/10.1002/ejsp.2226
Krockow, E. M., Colman, A. M., & Pulford, B. D. (2017). Far but finite horizons promote cooperation in the Centipede game. Unpublished manuscript, Department of Neuroscience, Psychology and Behaviour, University of Leicester, UK.
Krockow, E. M., Pulford, B. D., & Colman, A. M. (2015). Competitive Centipede games: Zero-end payoffs and payoff inequality deter reciprocal cooperation. Games, 6(3), 262–272. https://doi.org/10.3390/g6030262
McKelvey, R. D., & Palfrey, T. R. (1992). An experimental study of the Centipede game. Econometrica, 60, 803–836. https://doi.org/10.2307/2951567
McKelvey, R. D., & Palfrey, T. R. (1998). Quantal response equilibria for extensive form games. Experimental Economics, 1, 9–41. https://doi.org/10.1007/BF01426213
Myerson, J., Green, L., & Morris, J. (2011). Modeling the effect of reward amount on probability discounting. Journal of the Experimental Analysis of Behavior, 95, 175–187. https://doi.org/10.1901/jeab.2011.95-175
Nagel, R., & Tang, F. F. (1998). Experimental results on the Centipede game in normal form: An investigation on learning. Journal of Mathematical Psychology, 42(2/3), 356–84. https://doi.org/10.1006/jmps.1998.1225
Normann, H. T., & Wallace, B. (2012). The impact of the termination rule on cooperation in a prisoner’s dilemma experiment. International Journal of Game Theory, 41(3), 707–718. https://doi.org/10.1007/s00182-012-0341-y
Palacios-Huerta, I., & Volij, O. (2009). Field centipedes. American Economic Review, 99, 1619–1635. https://doi.org/10.1257/aer.99.4.1619
Pulford, B. D., Krockow, E. M., Colman, A. M., & Lawrence, C. L. (2016). Social value induction and cooperation in the Centipede game. PLOS ONE, 11(3), 1–21. https://doi.org/10.1371/journal.pone.0152352
Rapoport, A., Seale, D. A., & Colman, A. M. (2015). Is tit-for-tat the answer? On the conclusions drawn from Axelrod’s tournaments. PLOS ONE, 10(7), 1–11, e0134128. https://doi.org/10.1371/journal.pone.0134128
Rapoport, A., Stein, W. E., Parco, J. E., & Nicholas, T. E. (2003). Equilibrium play and adaptive learning in a three-person Centipede game. Games and Economic Behavior, 43, 239–265. https://doi.org/10.1016/S0899-8256(03)00009-5
Rosenthal, R. W. (1981). Games of perfect information, predatory pricing and chain store paradox. Journal of Economic Theory, 25, 92–100. https://doi.org/10.1016/0022-0531(81)90018-1
Roth, A. E. (1995). Introduction to experimental economics. In J. Kagel & A. E. Roth (Eds.), Handbook of experimental economics (pp. 3–109). Princeton, NJ: Princeton University Press.
Roth, A. E., & Murnighan, J. K. (1978). Equilibrium behavior and repeated play of the Prisoner’s Dilemma. Journal of Mathematical Psychology, 17(2), 189–198. https://doi.org/10.1016/0022-2496(78)90030-5
Selten, R., Mitzkewitz, M., & Uhlich, G. R. (1997). Duopoly strategies programmed by experienced players. Econometrica, 65, 517–556. https://doi.org/10.2307/2171752
Selten, R., & Stoecker, R. (1986). End behavior in sequences of finite Prisoner’s Dilemma supergames: A learning theory approach. Journal of Economic Behavior and Organization, 7(1), 47–70. https://doi.org/10.1016/0167-2681(86)9002-1
Tucker, A. (2001). A two-person dilemma (Unpublished notes, Stanford University). Reprinted in E. Rasmussen (Ed.), Readings in games and information (pp. 7–8). Malden, MA: Blackwell. (Original work published 1950)
Zhang, H., & Maloney, L. T. (2012). Ubiquitous log odds: A common representation of probability and frequency distortion in perception, action, and cognition. Frontiers in Neuroscience, 6, 1. https://doi.org/10.3389/fnins.2012.00001

Received: August 4, 2017
Final Acceptance: February 12, 2018

Appendix

[Figure A1: four panels, each plotting exit point (vertical axis, 0 to 25) against game round (horizontal axis, 1 to 20). Panel titles: Condition A, Participant ID 118, Player role 2; Condition A, Participant ID 505, Player role 2; Condition C, Participant ID 207, Player role 1; Condition D, Participant ID 811, Player role 1.]

Fig. A1. Examples of individual participant behavior. For each condition, the decisions of one representative participant displaying typical behavior for that condition are shown. The exit points of these participants are displayed across the 20 game rounds. Those games terminated by the individual participant are marked by black circular shapes. Those games terminated by the other participant are marked by circular shapes with the letter “O”. Those games terminated by the computer (only applicable in Conditions B, C, and D) are marked by square shapes with the letter “C”.