bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

1 Honest signalling of trustworthiness

2

3 Gilbert Roberts

4 Simonside Cottage, 6 Wingates, Morpeth, NE65 8RW, UK

5 [email protected]

6

7 Abstract

8

9 Trust can transform conflicting interests into cooperation. But how can

10 individuals know when to trust others? Here, I develop the theory that

11 reputation building may signal cooperative intent, or ‘trustworthiness’. I model

12 a simple representation of this theory in which individuals (1) optionally invest

13 in a reputation by performing costly helpful behaviour (‘signalling’); (2)

14 optionally use others’ reputations when choosing a partner; and (3) optionally

15 cooperate with that partner. In evolutionary simulations, high levels of

16 reputation building; of choosing partners based on reputation; and of

17 cooperation within partnerships emerged. Costly helping behaviour evolved

18 into an honest signal of trustworthiness when it was adaptive for cooperators,

19 relative to defectors, to invest in the long-term benefits of a reputation for

20 helping. I show using that this occurs when cooperators gain

21 larger marginal benefits from investing in signalling than do defectors. This

22 happens without the usual costly signalling assumption that individuals are of

23 two ‘types’ which differ in quality. Signalling of trustworthiness may help

24 explain phenomena such as philanthropy, pro-sociality, collective action,

25 punishment, and advertising in humans and may be particularly applicable to

26 courtship in other animals.

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

27 Introduction

28

29 When individuals perform costly acts such as helping others, they may in doing so

30 reveal or actively display information about themselves. This information may then

31 induce others to behave preferentially towards them. Building up a good reputation

32 by being seen to be helpful may be adaptive through signalling (sensu (Maynard

33 Smith and Harper 2003)); and may have been a major driver of social behaviour

34 especially among humans (Alexander 1987). However, we lack a coherent theory of

35 when investing in a reputation for being trustworthy is adaptive.

36

37 One approach to understanding reputation-building is to view it as a strategic

38 investment in making individuals attractive partners (Roberts 1998). This theory,

39 sometimes referred to as reputation-based partner choice (Sylwester and Roberts

40 2010), postulates that acts such as charitable donations, volunteering, courtship gifts,

41 or contributions to public goods make individuals more attractive partners for

42 relationships. It is based on individuals interacting in social groups and being able to

43 choose among potential interaction partners who differ in partner value by observing

44 their behaviour (Roberts 1998). It is then inferred that those that honestly signal their

45 high quality (Gintis, Smith, and Bowles 2001) will have preferential access through

46 assortative pairing (Johnstone 1997) to mutually beneficial partnerships, which may

47 be social or sexual. This is predicted to lead to an escalation of investment in being

48 seen to be helpful so as to assortatively pair with the most cooperative partners, a

49 phenomenon called competitive (Roberts 1998, Van Vugt et al. 2007,

50 Raihani and Smith 2015, Roberts 2015)

51

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

52 Accumulating evidence from the lab and field supports the role of reputation building

53 in choice of partners (Sylwester and Roberts 2010; Barclay and Willer 2007; Hardy

54 and Van Vugt ; Sylwester and Roberts 2013; Raihani and Smith 2015). It is these

55 distinct stages of costly reputation building followed by profiting from partnerships

56 which means that reputation-based partner choice can potentially explain altruism

57 directed unconditionally rather than by direct or indirect reciprocity (Roberts 2015a).

58

59 Reputation-based partner choice differs fundamentally from indirect reciprocity as

60 shown in Figure 1 (see also Roberts 2015a). Indirect reciprocity is a process in which

61 one individual incurs a cost and benefits a second individual. A third individual then

62 pays a cost and benefits the original actor, who consequently profits when this

63 benefit exceeds its cost. Indirect reciprocity was discussed by (Alexander 1987),

64 modelled by (Nowak and Sigmund 1998) and demonstrated by (Wedekind and

65 Milinski 2000). However, any form of reciprocity is by definition conditional upon the

66 behaviour of a recipient and so is not considered here as a candidate process

67 underlying the phenomena of interest.

68

69 Some key differences between indirect reciprocity and reputation-based partner

70 choice may be summarized as follows:

71 (1) The two theories aim to explain very different phenomena: indirect reciprocity

72 aims to explain cooperation in large scale societies where individuals may not re-

73 meet. In contrast reputation-based partner choice aims to explain how acts of

74 apparent selflessness such as donations to charity, heroism, courtship feeding form

75 part of long term social and sexual partnerships with repeated interactions.

76 (2) Reputation-based partner choice is explicitly about signaling whereas indirect

77 reciprocity is not. A signal is an adaptive behavior which benefits the signaler by

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

78 changing a receiver’s behavior. This means that the cost invested in the signal may

79 or may not benefit one or more recipients of any benefit and that the receivers of the

80 signal may or may not be the recipients of such benefits. In contrast, indirect

81 reciprocity involves a donation of a benefit to a recipient at a cost to the actor.

82 (3) Indirect reciprocity is explicitly conditional in that it can only operate where

83 individuals help those who help others. Reputation-based partner choice has no such

84 conditionality. Individuals help others as a signal of cooperativeness. They benefit

85 from then being more likely to be chosen for profitable partnerships.

86 (4) Reputation-based partner choice involves partner choice for profitable

87 relationships. The basic theory of indirect reciprocity involves no partner choice. It

88 can be extended only in as much as individuals could theoretically choose who to

89 give to. It cannot be extended to involve choosing as a partner someone that one has

90 given to since this would involve direct reciprocity, which is a different theory of how

91 helping Is beneficial.

92 The two theories have been compared experimentally and reputation-based partner

93 choice has been found to be more effective in inducing reputational concerns

94 (Sylwester and Roberts 2013).

95

96 The theory of reputation-based partner choice explicitly incorporates Zahavi’s

97 (Grafen 1990, Zahavi 1997), a form of costly signalling theory

98 (Gintis, Smith, and Bowles 2001) whereby the signal (cooperative reputation) reveals

99 any underlying differences in some quality such as resources. Inherent in this

100 approach is that reputations are more than simply histories of how individuals have

101 previously acted, each decision maximizing short term gains. Rather, reputations are

102 costly investments and the challenge is then to determine how such investments can

103 be adaptive. Standard costly signalling models applied to helping behaviour are

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

104 based upon the notion that reputations allow observers to correctly deduce the

105 underlying quality of the signaller (Gintis et al. 2001). An alternative or additional

106 hypothesis is that costly investments might be honest signals of intent. That is, those

107 who develop a reputation for helping might be more likely to be more cooperative and

108 so may be preferred as partners (Roberts 1998; Van Vugt, Roberts, and Hardy 2007;

109 Barclay 2013; Barclay 2015; Silk 2002; Bliege Bird and Smith 2005). However, the

110 notion that reputations might signal future intent (rather than underlying quality) has

111 been stated verbally by several of these authors yet has surprisingly not been

112 developed theoretically. Thus, we do not have a coherent theory of when we can

113 expect individuals to invest in reputations.

114

115 I consider investment in reputation to be a form of signalling (Maynard Smith and

116 Harper 2003). The term ‘reputation’ is used to describe how other individuals judge

117 someone based on their past behaviour. Reputation is explicitly linked to past helping

118 behaviour in the model because reputation is formed when an individual indulges in

119 helping. Helping behaviour in the model does not benefit an identified recipient either

120 directly or indirectly so as to avoid any possible confusion with models of direct or

121 indirect reciprocity. Reputation in the model is gained through a specific costly act

122 and is not simply a record of past behaviour that may have reflected economic

123 choices. In other words, we consider an act that is performed not because it brings

124 reciprocal benefits, but because it changes the behaviour of other individuals. This

125 means that reputation fulfils the criteria of being a signal and not just a cue (Maynard

126 Smith and Harper 2003).

127

128 Here, I develop a model of how individuals choose partners for cooperative

129 relationships. The aim is to show when it can pay to choose a social (or sexual)

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

130 partner based on costly help, and consequently when helping provides a signal of

131 intent to cooperate (trustworthiness). In the model, individuals compete for access to

132 partnerships. They may or may not invest in a reputation by signaling, and signallers

133 may or may not be more likely to be chosen as partners. Once partnerships are

134 formed, the partners play rounds of a classical reciprocity game (Axelrod and

135 Hamilton, 1981). In social dilemmas of this sort, defection can provide a short-term

136 advantage, yet mutual cooperation pays when there are sufficient rounds within a

137 partnership. It can therefore pay to select cooperative partners. The model aims to

138 predict when signalling by helping others can be an honest signal of being a

139 cooperative partner, i.e., when it pays to signal one’s cooperative intentions,

140 trustworthiness, or commitment to a cooperative relationship, and when it pays to

141 choose partners based on such signals.

142

143 Methods

144

145 I used evolutionary computational simulation complemented by analytical work. The

146 structure of the simulation models is outlined in Figure 2, while Figure S1 provides a

147 more detailed flow diagram of the processes. Table S1 provides a reference list of

148 parameters. The simulation was implemented in C++ in Microsoft Visual Studio 2017.

149 Full programming code is provided in Supplementary Information.

150

151 Agents were allocated by the computer program to one of two roles: those in the

152 Competitor role competed to be selected to form partnerships with those in the

153 Selector role. Selectors were assumed to be seeking partners to form a mutually

154 beneficial relationship. Ahead of these partnerships, Competitors could pay a cost

155 and thereby develop a ‘reputation’ in the eyes of a Selector, or they could opt not to

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

156 pay this cost. Selectors could then decide whether to select those who had paid the

157 cost (signalled) or to accept any partner presented to them by the program.

158 Whenever Selectors accepted a partner, they then played a dyadic iterated social

159 dilemma game in which reciprocity led to mutual benefit (see ‘Play’ below). Through

160 evolutionary simulation, I investigated the fate of strategies of investing in a

161 cooperative reputation by giving unconditionally versus withholding help; of selecting

162 a partner with a good reputation or of accepting an assigned partner; and for both

163 roles, of cooperating versus defecting in subsequent dyadic interactions. Figure 2

164 summarizes these strategies.

165

166 Dividing agents into the two roles allows us to study the evolution of strategies of

167 signalling and choosing without knowing precisely how strategies are expressed in

168 the real world. We could allow all individuals to express both strategies, but this

169 would give us 16 strategies representing all possible combinations of signaling/ not

170 signaling; choosing/not choosing; playing C/D after signaling; playing C/D after

171 choosing. Findings from this more complex approach are consistent with those of the

172 simpler model and so are not presented here. Furthermore, the division into two roles

173 is consistent with other studies in the field including models of the costly signalling of

174 quality (Gintis, Smith et al. 2001, Jordan, Hoffman et al. 2016) and the ‘biological

175 market’ literature where individuals must first be classed as ‘buyers’ or ‘sellers’ (Noë

176 and Hammerstein 1994) (Barclay 2013).

177

178 The model assumes that individuals differ in their profitability as cooperation partners

179 solely in terms of their strategy. It therefore does not consider type differences due to

180 resources or abilities that form the basis of typical costly signalling models (Gintis et

181 al. 2001). I further assume that these expressed differences are detectable by

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

182 potential partners as a signal or reputation index; and that individuals can choose

183 partners for future interactions based on this. Crucially, there is no pre-set link

184 between displaying and continuing to cooperate after being chosen and thus

185 by signalling and then exploiting a partner is possible in the system.

186

187 Strategies

188 Agents in Selector and Competitor roles were assigned strategies by the program.

189 Competitors were given strategies of either investing in a Signal or not (No Signal).

190 Selectors had strategies of either choosing those who had invested in a Signal

191 (Choose); or accepting all assigned partners (Any). Agents in both roles also had

192 strategies of either cooperate (C); or defect (D) with the partner. Strategies were

193 assigned independently, so that all four combinations of the two binary decision rules

194 could evolve independently. The resulting roles and strategy combinations are

195 summarized in Figure 2.

196

197 Model Structure

198 The model employed a meta-population structure in which a default of N = 100

199 agents in each of the two roles was distributed on each of a default of i = 10 islands

200 (hence a population size of 2000). Islands were units of reproduction (see below).

201 Island structure was employed because it is thought to overcome problems with

202 unrepresentative results arising in single populations (Leimar and Hammerstein

203 2001). Agents interacted with other agents within their islands.

204

205 Initialization

206 At the start of each simulation, all Competitor agents were initialized with strategy

207 elements set to “No Signal, Defect”, while Selector agents were initialized with “Any,

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

208 Defect”, on the basis that we were interested in whether investing in signalling, and

209 choosing signallers, can become established through association with cooperation in

210 an environment which started with none of these. Agents in the Competitor role were

211 initialized with a reputation index of neutral, reflecting the fact that each agent starts a

212 generation with no interaction history. All agents started with a baseline payoff set to

213 1000 so that no payoffs would be negative and selection would be weak.

214

215 Signalling and partner choice

216 The program simulated a process by which Selectors formed partnerships from

217 among the Competitors. At the start of each generation all Competitor agents

218 carrying the Signal strategy element paid the cost of signalling (i.e their baseline

219 payoff was reduced by amount s) and their reputation index changed from ‘neutral’ to

220 ‘positive’ (Figure 2, Stage 1). The program then cycled through all Selectors and

221 assigned them partners at random from amongst the Competitors. If the Selector had

222 the strategy ‘Any’ then a partnership was formed with the first randomly assigned

223 Competitor. If the Selector had the strategy ‘Choose’ then it rejected randomly

224 assigned partners until assigned one that did carry the Signal strategy element. Any

225 Competitor that had previously been in partnership with a Selector was rejected. This

226 meant that there was no opportunity for and memory from experience about

227 which Competitors were cooperative, and therefore that partner choice could not

228 involve reciprocity. All selections were subject to a giving up time of twice the

229 population size (to avoid endless loops if no suitable partners could be found). This

230 meant that should no suitable partner be found, the Selector had to settle for the last

231 Competitor randomly assigned by the program. This process resulted in all Selectors

232 having one interaction for each cycle m through the group, while Competitors had an

233 average of one but had a variance depending on the extent of partner choice. Choice

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

234 was costly in that Choosy agents faced a probability h that they would get no partner.

235 This was implemented as a simple exogenous expected opportunity cost so that

236 choosiness could not drift when signalling was common, but that further assumptions

237 about how the costs of choice might vary were not made. By default, each island

238 metapopulation was cycled through m=50 times, meaning that each agent had a 0.5

239 probability on average of meeting each other agent.

240

241 Play

242 Once partnerships were formed, the agents played an iterated Prisoner’s Dilemma

243 game. Agents carried strategies of ‘Cooperate’ (C) and ‘Defect’ (D). Cooperators

244 played a simple conditional strategy of Cooperate until the partner defected

245 (sometimes called “Grudger”). This meant that if both were C agents then they both

246 cooperated on all rounds. Cooperation incurred a cost of c=1 payoff unit, with a

247 benefit of b=2. However, if both agents were of type D, then no cooperation occurred

248 in any rounds. If one agent played D and the other played C, then the former

249 received a benefit whilst the other paid a cost. Agents that were exploited in this way

250 responded by failing to cooperate in any further rounds (the Grudger strategy). The

251 iterated game therefore had payoff matrix as follows in each round (payoffs are to the

252 row player):

C D

C 1 -1

D 2 0

253

254 Given this payoff matrix, mutual cooperation could be profitable with repeated rounds

255 and conditional strategies (like Grudger). Below a critical number of rounds r,

256 Defectors would have the higher payoff, while above this value, Cooperators had a

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

257 higher payoff. This is the classic game structure and result of (Axelrod and Hamilton

258 1981). It implements the basic conundrum of social dilemmas in that there can be a

259 benefit in selecting cooperative partners for long term relationships, yet a risk that

260 partners will defect and exploit cooperative first plays.

261 Agents had relationships with multiple other agents. In each of m meetings an agent

262 followed the processes described above of competing for / selecting partners and

263 then playing the repeated social dilemma with those partners (see program structure

264 in Figure S1). As described in the subsection ‘Signalling and partner choice’, agents

265 could not be paired with those with which they had already had a relationship.

266 Therefore the model considered choice of partners for long term reciprocally

267 beneficial relationships (within which memory must occur) yet was uncontaminated

268 by memory for prior partners which could have led to reciprocity between

269 relationships.

270

271 Reproduction

272 Generations were discrete. At the end of each generation, strategy frequencies for

273 the following generation were assigned in proportion to their success in the previous

274 generation using the following method. The program cycled through all agents in

275 each of the two roles separately. Payoffs were summed across all agents having

276 each strategy in turn. This was done within each island and then across all islands,

277 resulting in local and global strategy payoffs. Agents were assigned strategies in the

278 next generation in proportion to the relative payoff of each strategy. An individual for

279 the next generation was derived locally (i.e. within the island) with probability 0.8 and

280 globally with probability 0.2. This followed the process recommended to reduce the

281 potential for genetic drift and allow migration of successful strategies to other islands

282 (Leimar and Hammerstein 2001). Population size was kept constant, with each

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

283 discrete generation being fully replaced by the next. Reproduction was accompanied

284 by mutation: with probability µ= 0.02 an individual’s strategy was replaced at random

285 (any strategy could change to any other strategy without having to track through an

286 imposed sequence). By default, simulations were run for 1000 generations and 10

287 replicates were run for each set of conditions. Simulations therefore allowed strategy

288 frequencies to evolve across generations, with those strategies accumulating the

289 highest payoff in one generation becoming more numerous in the next. All strategy

290 combinations for both roles could evolve independently.

291

292

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

293 Results

294

295 The aim of the simulation model was to determine whether a link between investment

296 in a costly act, ‘reputation-building’, could become linked to cooperating, so that

297 agents selecting partners would use reputation-building as a signal which provided

298 information about future cooperative behaviour.

299

300 Considering first the evolutionary dynamics of all strategies, agents in the Competitor

301 role evolved to invest in the signal and then cooperate (Figure 3a), while those in the

302 Selector role evolved a tendency to choose Signallers preferentially and also to

303 cooperate with their chosen partner (Figure 3b). Summing these strategies shows

304 that signalling, choosing and cooperating all evolved readily from an environment of

305 no signalling, accepting any partner and defecting (Figure 3c).

306

307 Cooperators were more likely to signal than were defectors: the proportions of

308 cooperators and defectors that signalled were 0.96 ± 0.01 and 0.42 ± 0.02

309 respectively (averaged over 10 simulations, of 100 generations; Figure 4a). Thus,

310 those agents which went on to cooperate in stage 2 were much more likely to have

311 signalled in stage 1. Looking at this from the perspective of a Selector, the probability

312 that a Signaller was a Cooperator was much higher than the probability that a Non-

313 signaller was a Cooperator (0.99 ± 0.00 and 0.65 ± 0.02 respectively; values from 10

314 simulations and 100 generations; Figure 4b)). This meant that choosing a Signaller

315 rather than a Non-Signaller increased the chance of a Cooperative partner by

316 approximately 52%. This probability was much greater when Signalling and

317 Cooperation were becoming established whereas the quoted statistics are heavily

318 weighted by the equilibrium conditions. These probabilities show that a link evolved

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

319 between Signalling and Cooperation so that it paid to choose Signallers because

320 they were more likely to be Cooperative partners. Reputation-based partner choice

321 therefore worked when an association evolved between signalling and going on to

322 cooperate with a partner, and therefore Selectors did use the signal in partner choice.

323 It appears that when Cooperators are better able to engage in a costly act of helping

324 others than are Defectors, this ability to engage in a costly act becomes useful

325 information for Selectors, and hence in turn can promote engaging in costly acts by

326 Cooperators in order to be chosen.

327

328 Analysis

329 Why does investing in a good reputation (signalling) mean agents are chosen as

330 partners, and how is the signal an honest indicator of future cooperation? To answer

331 this we need to show how signalling, choosing and cooperating become linked. This

332 will happen when the signal provides reliable information about the likelihood of an

333 individual going on to cooperate. This means we need to find where it is strategic for

334 cooperators but not defectors to signal. For signalling to be strategic, the costs of

335 signalling s must be offset by the benefits of increased payoff when chosen.

336 Simplifying by considering the payoff to cooperators and defectors on any one

337 interaction when a signaller (who signals once) is chosen by a Selector who

338 cooperates andwhen non-signallers achieve no interactions, then signalling should

339 be strategic for cooperators when: 1

340 Where b and c are the costs and benefits of cooperation respectively and w is the

341 chance of

342 interactions continuing in an iterated prisoner’s dilemma game. And for defectors

343 signalling is strategic when

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

344 Thus, given these simplifications, signalling is strategic for cooperators but not

345 defectors when:

ͯ 346 ͥͯ2 [Equation 1]

347 We can see from this result that there is a basic asymmetry between cooperation and

348 defection strategies in the payoffs they get from being chosen; and hence that

349 signalling provides information that the Competitor will go on to play Cooperate.

350 Specifically, given these assumptions, signalling and cooperating will be linked when

351 the payoff for signallers of maintaining reputation (in repeated rounds of mutual

352 cooperation) is greater than the short-term payoff from exploiting good reputation.

353 Conversely, it is not strategic for defectors to invest in a reputation (pay costs of

354 signalling cooperativeness) and then exploit on the first round because they cannot

355 recoup their signal costs when individuals remember they have defected within a

356 repeated game.

357

358 This can be understood from Figure 6. Part (a) presents the typical costly signalling

359 approach. This assumes that individuals differ in quality and this leads to differences

360 in their stable levels of investment in signalling. A costly signal is then an honest

361 indicator of underlying quality. This is the typical explanation for generosity:

362 essentially, individuals that are very generous must be of high quality (e.g. Gintis et al

363 2003). Figure 5 (b) shows schematically the alternative theory presented here. My

364 theory is that individuals differ in the benefits they get from each signal: provided

365 Cooperators play a responsive strategy, they can establish partnerships in which

366 they gain higher benefits from repeated rounds of cooperation than Defectors can

367 gain by exploiting a Cooperator once. This can be interpreted as corresponding to

368 the case in (Johnstone 1997a) where honesty in signalling is maintained due to

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

369 differential benefits from signalling. However, instead of individuals differing in ‘need’

370 (as in chicks begging for food from their parents), individuals here differ in the

371 benefits they can get by cooperating after signalling. Essentially, paying a high cost

372 up front is an honest signal of going on to cooperate: individuals simply cannot afford

373 to pay a high cost up front if they are simply going to go on to exploit responsive

374 Cooperators once.

375

376 Parameters

377 In the context of the simulations, the analysis predicts that Cooperators will have

378 higher marginal return s investing in signalling than Defectors. The analysis further

379 suggests there are two key factors affecting whether reputation-based partner choice

380 will be found: on the one hand, the cost of the signal, and on the other the benefits of

381 a cooperative relationship. Figure 6 displays the effects of varying the cost of

382 signalling (a-c) and the number of plays within a partnership (d-f). At low signal costs,

383 almost all agents evolve to invest in signalling and the majority use the signal in

384 choosing a partner, but as signal cost increases, so the proportions of signallers and

385 choosers declines. The proportion cooperating does not depend on signal cost (while

386 the other parameters were kept constant). These results are qualitatively in line with

387 the analytical prediction that signalling and choice will be found when signalling is

388 honest, which will be when a low cost of signalling and a high benefit from repeated

389 rounds make the marginal costs of signalling lower for co-operators than for

390 defectors. For example, when the number of plays is high, there are larger gains to

391 be made from cooperative partnerships and so co-operators have more to gain,

392 hence it can be worth signalling that one is a co-operator.

393

394

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

395 I went on to investigate systematically the effects of departures from all default

396 parameters used in the previously presented results by varying one parameter at a

397 time and keep others constant. Increasing the cost of choice by increasing the

398 proportion of times that choosy individuals missed out on partnerships led to a

399 gradual reduction in both signalling and choice itself, but had little impact on

400 cooperation (Figure S2). Increasing the benefit:cost ratio increased the long term

401 benefits of cooperation and so as predicted increased the levels of signalling and

402 choice (Figure S3). Increasing the population size, N, above the default level had

403 little effect on signalling but choice fell away, probably because choice is

404 proportionally less effective when searching through larger populations (Figure S4).

405 Signalling and choice fall away at high numbers of meetings (Figure S5). This is

406 because agents are allowed to meet any other agent only once, so partner choice

407 becomes impossible as the number of meetings rises towards the total number of

408 other agents to meet. Mutation rate had little quantitative impact on signalling, choice

409 or cooperation, showing that the conclusions are not dependent upon a particular

410 level of variation (Figure S6). The baseline value for payoff used did not impact on

411 behaviour, showing the results are robust to the strength of selection (Figure S7).

412 There was some variability in results at low numbers of islands, but at or above the

413 default value of 10, there was little further quantitative impact of varying population

414 structure in this way (Figure S8).

415

416

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

417 Discussion

418

419 These results illustrate how paying a cost to help others can be strategic for those

420 who go on to form mutually beneficial cooperative interactions and not for those who

421 take the short-term benefit of defection. They show that the differential benefits of

422 cooperating and defecting strategies can lead to costly helping behaviour being an

423 honest signal of future commitment to cooperation (trustworthiness). Reputation-

424 building provides information about strategy, based simply on the fundamental

425 difference between cooperators and defectors. Reputations evolve into honest

426 signals of commitment which it pays to attend to when seeking cooperative partners.

427 The results of the model are therefore in line with prior experimental findings that

428 costly investment in reputation-building can bring net benefits through access to

429 profitable partnerships (Sylwester and Roberts 2010). The conditions in which this is

430 found are given by Equation 1. In particular, there will be a trade-off between the cost

431 of signalling and the benefits of iterated cooperation, such that higher signal costs

432 can be supported when seeking to engage in long, profitable relationships. This

433 makes important predictions about when we can expect to see reputational displays

434 in humans and other animals, whether in courtship, in social interactions and in

435 marketing.

436

437 The results can be interpreted as showing how partner choice creates assortment

438 among co-operators (e.g. (Sherratt and Roberts 2012, Wang, Suri et al. 2012, Nax

439 and Rigos 2016). The success of cooperation depends upon the act of cooperating

440 leading to an increase in the chance of receiving help. By honestly signaling that they

441 will cooperate, agents in the model assort with other co-operators.

442

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

443 Reputation-based partner choice contrasts with existing attempts to explain

444 cooperative displays. Considering the popular notion of indirect reciprocity (which

445 depends on a conditional rule of “help those who help” on each play), the two-stage

446 structure employed here as a solution to social dilemmas (Roberts 1998) means that

447 the costs of reputation building in one game are offset by access to profitable

448 partnerships in a second. This crucial difference allows for unconditional help (e.g.

449 charitable giving, public goods contributions); it shows how helping is in an

450 individual’s strategic interests (compared to indirect reciprocity where there is a

451 conflict between increasing image score and demonstrating discrimination (Leimar

452 and Hammerstein 2001); and it requires no complex ‘moral’ rules of who to give to,

453 such as the ‘leading eight’ (Ohtsuki and Iwasa 2006).

454

455 The reputation-based partner choice model also differs from current applications of

456 costly signalling theory to reputations. These models are based on finding a

457 separating equilibrium between two types differing in underlying quality, whether

458 genetics, health, resources or competitive ability (Gintis, Smith, and Bowles 2001).

459 Costly signal honesty occurs when it signalling is condition dependent, so that low

460 quality individuals suffer higher marginal costs or lower marginal benefits from

461 signalling. In contrast, any individuals in the reputation-based partner choice model

462 could play strategies of cooperator or defector (there are no quality differences) and

463 the frequencies of these strategies evolve in proportion to payoffs (as opposed to

464 simply being assumed to exist as separate types as in costly signalling models). In

465 the reputation-based partner choice model, signalling then provides information

466 about strategy itself (not quality type) because cooperative strategists have a

467 comparative advantage (Ricardo 1891). In contrast to a recent model of third party

468 punishment as a signal of trust (Jordan et al. 2016), the reputation-based partner

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

469 choice model does not depend on unsubstantiated assumptions. In particular, there

470 is no assumption that individuals belong to fixed trustworthy or exploitative types that

471 cannot change in frequency due to selection, nor that trustworthiness is associated

472 with lower net costs of third-party punishment.

473

474 It is sometimes stated that a cooperative reputation indicates both ability and intent to

475 cooperate (Van Vugt, Roberts, and Hardy 2007; Barclay and Willer 2007), yet costly

476 signalling theory focuses on the former (Gintis, Smith, and Bowles 2001). In contrast,

477 the model presented here is about signals of intent, or strategy, with no necessary

478 difference in ability. Intention signalling is honest because those going on to a

479 cooperative partnership can afford to invest in signalling whereas those that exploit

480 co-operators will get a high payoff once, but this is likely to be insufficient to make

481 signalling profitable. By signalling, agents demonstrated that they were investing in a

482 long term relationship; and so it paid those choosing partners to prefer signallers.

483 This model provides theoretical evidence in support of the hypothesis that I had

484 previously presented only verbally (Roberts 1998). The concept of signalling

485 intentions is being recognized by a few authors as being important in explaining

486 human behaviour, for example in Martu hunters where “signals may be crucial for

487 ensuring trust and commitment with long-term partners” (Bird et al. 2018), and more

488 generally in hunter-gatherers where “although the focus of much traditional costly

489 signaling literature has been hunting prowess [there is also] the potential of hunting

490 to signal generosity and garner cooperation partners” (Stibbard Hawkes 2019).

491

492 The issue of maintaining reputation versus taking a short term temptation is

493 analogous to the condition for reciprocity in an iterated prisoner’s dilemma (Axelrod

494 and Hamilton 1981). When the ‘shadow of the future’ is large, then it pays to play Tit-

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

495 for-Tat in repeated games, i.e. where (b-c)/(1-w) > b. Simply changing the benefit of

496 defection (b) to be the cost of signalling results in the simplified equation derived

497 above for when it pays to cooperate. Thus in reciprocity it pays to start by

498 cooperating when there are sufficient benefits of repeated interaction, and

499 analogously in reputation building it pays to start by signalling when this cost is

500 exceeded by the benefits of being chosen for repeated cooperation.

501

502 As we have seen, the reputation-based partner choice model is based on paying an

503 upfront cost. In this it is convergent with the notion of commitment devices (Frank

504 1988). The rationale is that an initial investment commits one to a course of action, in

505 this case benefiting from long term cooperation as opposed to taking the temptation

506 to defect. One can predict that the greater the potential benefits from cooperative

507 relationships, the greater the up-front cost that can be sustained in signalling

508 commitment [compare recent work on commitment in language (Vullioud et al.,

509 2017). However, the reputation-based partner choice model also differs from a model

510 of ‘’ where low cost signals can indicate intent provided there are repeated

511 interactions (Silk, Kaldor, and Boyd 2000) in that it specifically explains costly

512 reputation-building and in that reputations are the result of single signalling events

513 rather than being learned over many rounds.

514

515 The work presented here necessarily has limitations in its application. Each signalling

516 system in the real world may have different properties. The model is intended to

517 apply to behavioural signals. Individuals are envisaged to perform some behaviour

518 directed towards observers who may or may not elect to respond to the behaviour.

519 Specific models of real-world systems would need to tailor the model to account for

520 factors such as how many potential partners can be influenced by one signal.

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

521 Furthermore, while this model follows standard practice in dividing individuals into

522 two roles (here Competitors and Selectors while the Biological Market literature

523 considers ‘buyers’ and ‘sellers’), further work should model the case where

524 individuals can perform both roles.

525

526 In conclusion, I have shown that introducing signalling changes the dynamic between

527 cooperative and defecting strategies because cooperators gain larger marginal

528 benefits from investing in signalling to a potential partner than do defectors.

529 Reputation itself provides an honest signal of long term commitment, or

530 trustworthiness, providing a means by which cooperators can select each other for

531 mutually beneficial relationships. The development of this theory is important in

532 adding substance to previous verbal acknowledgements that signals might indicate

533 future strategy in addition to giving information about quality or resources (Roberts

534 1998; Van Vugt, Roberts, and Hardy 2007; P. Barclay 2013; Pat Barclay 2015; Silk

535 2002; Bliege Bird and Smith 2005). The formal theory of honest signalling of

536 cooperative intentions that I have advanced here fits well within recent verbal

537 arguments for such a theory (Bliege Bird and Power 2015, Bird, Ready et al. 2018,

538 Stibbard-Hawkes 2019)

539

540 Tests of the reputation-based partner choice model have already been carried out

541 based on the verbal two-stage structure outlined in (G. Roberts 1998), and so the

542 model seeks to offer a formal explanation for existing experimental results showing

543 the signalling of helping as a signal of trustworthiness (Sylwester and

544 Roberts 2010; Barclay 2004; Fehrler and Przepiorka 2013) and commitment

545 (Yamaguchi, Smith, and Ohtsubo 2015). Nevertheless, future experiments using

546 economic games should test the quantitative predictions of the model while varying

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

547 its parameters. The reputation-based partner choice model forces us to think

548 carefully about whether signals indicate differences in underlying quality or in

549 strategy (although of course in practice the two may intertwine). Individuals may

550 reveal their long term cooperative strategy by helping, by donating to charity or

551 contributing to public goods; brands may display long term strategy through

552 advertising investment; primates may demonstrate investment in social bonds;

553 monogamous birds such as albatrosses may give commitment signals where long

554 relationships are decisive. The key factor is that reputation-building can signal that

555 individuals are committed for the long term.

556

557 References

558 Alexander, R. D. (1987). The Biology of Moral Systems. New York: Aldine de 559 Gruyter.

560 Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211,

561 1390–1396.

562 Axelrod, R. (1971). The Complexity of Cooperation: Agent-Based Models of

563 Competition and Collaboration. Princeton University Press.

564 Barclay, P. (2004). Trustworthiness and can also solve the

565 “tragedy of the commons.” Evolution and Human Behavior, 25(4), 209–220.

566 Barclay, P. (2013). Strategies for cooperation in biological markets, especially for

567 humans. Evolution and Human Behavior, 34(3), 164–175.

568 https://doi.org/10.1016/j.evolhumbehav.2013.02.002

569 Barclay, P. (2015). Reputation. In D. Buss (Ed.) Handbook of Evolutionary

570 Psychology (2nd ed., pp. 810–828). Hoboken, NJ: J Wiley and Sons.

571 Barclay, P., & Willer, R. (2007). Partner choice creates competitive altruism in

572 humans. Proceedings of the Royal Society B: Biological Sciences, 274(1610),

573 749–753. https://doi.org/10.1098/rspb.2006.0209

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

574 Bird, R. B., E. Ready and E. A. Power (2018). "The social significance of subtle

575 signals." Human Behaviour 2(7): 452-457.

576 Bliege Bird, R. and E. A. Power (2015). "Prosocial signaling and cooperation among

577 Martu hunters." Evolution and Human Behavior 36(5): 389-397.

578 Bliege Bird, R., & Smith, E. A. (2005). Signaling Theory, Strategic Interaction, and

579 Symbolic Capital. Current Anthropology, 46(2), 221–248.

580 https://doi.org/10.1086/427115

581 Fehrler, S., & Przepiorka, W. (2013). Charitable giving as a signal of trustworthiness:

582 Disentangling the signaling benefits of altruistic acts. Evolution and Human

583 Behavior, 34(2), 139–145.

584 https://doi.org/http://dx.doi.org/10.1016/j.evolhumbehav.2012.11.005

585 Frank, R. (1988). Passions within Reason: The Strategic Role of the Emotions. New

586 York: Norton & Co.

587 Ghang, W. and Nowak, M.A. (2015). Indirect Reciprocity with Optional Interactions.

588 Journal of Theoretical Biology 365, 1–11.

589 http://dx.doi.org/10.1016/j.jtbi.2014.09.036.

590 Gintis, H., Smith, E. A. and Bowles, S. (2001). Costly Signaling and Cooperation.

591 Journal of Theoretical Biology 213 (1), 103–119.

592 Grafen, A. (1990). Biological signals as handicaps. Journal of Theoretical Biology 593 144 (4), 517-546. 594

595 Hardy, C. L., & Van Vugt, M. (2006). Nice guys finish first: The competitive altruism

596 hypothesis. Personality and Social Psychology Bulletin, 32(10), 1402–1413.

597 Johnstone, R. A. (1997a). The evolution of animal signals. In Behavioural Ecology:

598 An Evolutionary Approach (4th ed.). Oxford: Blackwell.

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

599 Johnstone, R. A. (1997). The Tactics of Mutual Mate Choice and Competitive

600 Search. and , 40(1), 51–59. Retrieved from

601 http://www.jstor.org/stable/4601296

602 Jordan, J.J., Hoffman, M., Bloom, P. and Rand, D.G. (2016). Third-Party Punishment

603 as a Costly Signal of Trustworthiness. Nature 530 (7591), 473–76.

604 https://doi.org/10.1038/nature16981

605 http://www.nature.com/nature/journal/v530/n7591/abs/nature16981.html#sup

606 plementary-information.

607 Lachmann, M., Szamado, S. and Bergstrom, C.T. (2001). Cost and Conflict in Animal

608 Signals and Human Language. Proceedings of the National Academy of

609 Sciences of the United States of America 98 (23), 13189–94.

610 Leimar, O. and Hammerstein, P. (2001). Evolution of Cooperation through Indirect

611 Reciprocity. Proceedings of the Royal Society of London Series B-Biological

612 Sciences 268 (1468), 745–53.

613 Maynard Smith, J., and Harper, D. (2003). Animal Signals. New York: Oxford

614 University Press.

615 McNamara, John M., Barta, Z., Fromhage, L. and Houston, A.I. (2008). The

616 of Choosiness and Cooperation. Nature 451 (7175), 189–92.

617 http://www.nature.com/nature/journal/v451/n7175/suppinfo/nature06455_S1.h

618 tml.

619 Milinski, M., Semmann, D. and Krambeck, H. J. (2002). Reputation Helps Solve the

620 ‘Tragedy of the Commons.’ Nature 415 (6870), 424–426.

621 Noë, R. and P. Hammerstein (1994). "Biological Markets: Supply and 622 Demand Determine the Effect of Partner Choice in Cooperation, 623 and Mating." Behavioral Ecology and Sociobiology 35(1): 1-11.

624 Nax, H. H. and A. Rigos (2016). "Assortativity evolving from social dilemmas."

625 Journal of Theoretical Biology 395: 194-203.

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

626 Noë, R. and P. Hammerstein (1994). "Biological Markets: Supply and Demand

627 Determine the Effect of Partner Choice in Cooperation, Mutualism and

628 Mating." Behavioral Ecology and Sociobiology 35(1): 1-11.

629 Nowak, M. A. and Sigmund, K. (1998). Evolution of indirect reciprocity by image

630 scoring. Nature 393, 573-577.

631 Nowak, M. A., and Sigmund, K. (2005). Evolution of Indirect Reciprocity. Nature 437,

632 1291–1297.

633 Nowak, M. A. (2012). Evolving Cooperation. Journal of Theoretical Biology 299 (0),

634 1–8. http://dx.doi.org/10.1016/j.jtbi.2012.01.014.

635 Ohtsuki, H., and Iwasa, Y. (2006). The Leading Eight: Social Norms That Can

636 Maintain Cooperation by Indirect Reciprocity. Journal of Theoretical Biology

637 239 (4), 435–444.

638 Raihani, N. J., and Smith, S. (2015). Competitive Helping in Online Giving. Current

639 Biology.

640 Ricardo, David. (1891). Principles of Political Economy and Taxation. G. Bell and

641 sons.

642 Roberts, G. (1998). Competitive Altruism: From Reciprocity to the Handicap

643 Principle. Proceedings of the Royal Society of London Series B-Biological

644 Sciences 265 (1394), 427–31.

645 Roberts, G. (2015a). Human Cooperation: The Race to Give. Current Biology 25

646 (10), R425–27. http://dx.doi.org/10.1016/j.cub.2015.03.045.

647 Roberts, G. (2015b). Partner Choice Drives the Evolution of Cooperation via Indirect

648 Reciprocity. PLoS ONE 10 (6), e0129442.

649 https://doi.org/10.1371/journal.pone.0129442.

650 Searcy, W. A. and Nowicki, S. (2005). The Evolution of Animal :

651 Reliability and Deception in Signaling Systems. Princeton University Press.

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

652 Sherratt, T. N., and Roberts, G. (1998). The Evolution of Generosity and Choosiness

653 in Cooperative Exchanges. Journal of Theoretical Biology 193 (1), 167–77.

654 Sherratt, T. N., and Roberts, G. (2001). The Role of Phenotypic Defectors in

655 Stabilizing . Behavioral Ecology 12, 313–17.

656 Sherratt, T. N. and G. Roberts (2012). "When Paths to Cooperation Converge."

657 Silk, J. B., Kaldor, E. and Boyd, R. (2000). Cheap Talk When Interests Conflict.

658 Animal Behaviour 59, 423–432. http://dx.doi.org/10.1006/anbe.1999.1312.

659 Stibbard-Hawkes, D. N. E. (2019). "Costly signaling and the handicap principle in

660 hunter-gatherer research: A critical review." Evolutionary Anthropology:

661 Issues, News, and Reviews 28(3): 144-157.

662 Silk, J. B. (2002). Grunts, Girneys, and Good Intentions: The Origins of Strategic

663 Commitment in Nonhuman Primates. In Commitment: Evolutionary

664 Perspectives (Ed. by R. Nesse), 138–57. Russell Sage Press.

665 Sylwester, K. & Roberts, G. (2010). Cooperators Benefit through Reputation-Based

666 Partner Choice in Economic Games. Biology Letters 6, 659–662.

667 https://doi.org/10.1098/rsbl.2010.0209.

668 Sylwester, K. & Roberts, G. (2013). Reputation-Based Partner Choice Is an Effective

669 Alternative to Indirect Reciprocity in Solving Social Dilemmas. Evolution and

670 Human Behavior 34 (3), 201–206.

671 http://dx.doi.org/10.1016/j.evolhumbehav.2012.11.009.

672 Van Vugt, M., Roberts, G. and Hardy, C. (2007). Competitive Altruism: Development

673 of Reputation-Based Cooperation in Groups. In Handbook of Evolutionary

674 Psychology, edited by R. Dunbar and L. Barrett. Oxford: Oxford University

675 Press.

676 Vullioud, Clément F, Scott-Phillips T, and Mercier H. (2017). Confidence as an

677 Expression of Commitment: Why Misplaced Expressions of Confidence

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

678 Backfire. Evolution and Human Behavior 38 (1), 9-17.

679 http://dx.doi.org/10.1016/j.evolhumbehav.2016.06.002.

680 Wang, J., S. Suri and D. J. Watts (2012). "Cooperation and assortativity with dynamic

681 partner updating." Proceedings of the National Academy of Sciences 109(36):

682 14363.Wedekind, C., and Milinski, M. (2000). Cooperation through Image

683 Scoring in Humans. Science 288 (5467), 850–52.

684 Yamaguchi, M., Smith, A. and Ohtsubo, Y. (2015). Commitment Signals in Friendship

685 and Romantic Relationships. Evolution and Human Behavior 36 (6), 467–74.

686 http://dx.doi.org/10.1016/j.evolhumbehav.2015.05.002.

687 Wedekind, C., & Milinski, M. (2000). Cooperation through image scoring in humans.

688 Science, 288(5467), 850–852.

689 Zahavi, A., & Zahavi, A. (1997). The Handicap Principle. Oxford: Oxford University

690 Press.

691

692 Figure Legends

693 Figure 1. Mechanisms for benefitting from being seen to help others. A. Indirect

694 reciprocity. Cooperation in indirect reciprocity involves a series of one-off donations

695 in fluid populations to those who have been seen to help others. Here, the individual

696 represented in blue helps red, each blue arrow representing paying a cost to benefit

697 the recipient. Green observes that blue has helped, and in consequence, helps blue.

698 Likewise red then helps green. Yellow does not help anyone and in consequence is

699 not helped. B. Reputation based partner choice. In a social setting, individuals are

700 aware of others’ reputations, which are gained by being seen to perform a helpful act

701 to any individual. Through partner choice, cooperators assort for mutually beneficial

702 long-term relationships. Here, blue (playing in the Competitor role) performs an act

703 with a cost to itself and a benefit to any other party, here represented by a blue arrow

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

704 away from the other individuals in order to avoid confusion with direct or indirect

705 reciprocity. As a result, blue gains a positive reputation and so green (playing in the

706 Selector role) chooses blue as a partner for a mutually beneficial relationship (two-

707 way arrow). Red (playing in the Competitor role) does not perform a costly act but

708 yellow (playing in the Selector role) accepts any partner for further interaction. The

709 theory holds that those performing helpful acts are honestly signaling that they will

710 make cooperative partners, and so that it makes sense for those looking for partners

711 to choose on the basis of this signal.

712

713 Figure 2. Roles and strategies. The model was based on the concept that some

714 individuals designated as Selectors were seeking potential partners from among

715 those designated as Competitors to form cooperative partnerships. The computer

716 simulation model therefore began by dividing agents into those in the Competitor

717 Role and those in the Selector Role. Competitors can either Signal, or they Don’t

718 Signal (all strategy element names are capitalized and indicated in bold).

719 Subsequently, Selectors can either accept Any partner generated at random by the

720 program, or they can Choose a partner who signalled. Once paired, agents in either

721 role may Cooperate or Defect. This structure specifies the eight strategies in the

722 model, each being an independent combination of options represented by a pathway

723 along the arrows shown e.g., a Selector may play a strategy of ‘Any’, ‘Cooperate’.

724

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

725

726 Figure 3. Reputation building evolves. Strategy dynamics are plotted for the first

727 100 generations (as strategy frequencies remained stable during the remaining

728 generations of the 1000 for which the simulations were run) and are averaged across

729 10 simulations. (a) ‘Competitors’, showing all strategy combinations of ‘No Signal’,

730 ‘Signal’ in Stage 1, and ‘C’ (Cooperate) or ‘D’ (Defect) in an iterated simultaneous

731 Prisoner’s Dilemma in Stage 2. (b) ‘Selectors’, showing all strategy combinations of

732 accept ‘Any’ or ‘Choose’ an agent that has signalled and whether they go on to

733 cooperate with partner (as for part a). Parameters were population size N = 100;

734 generations = 1000; islands i = 10; c=1; b = 10; meetings m = 50; rounds r = 25;

735 signal cost s = 10; cost of choice h = 0.01; baseline payoff = 1000; mutation rate µ=

736 0.02 (see Table S1).

737

738 Figure 4. Evolutionary dynamics of signalling by cooperators and defectors. A.

739 The probability of signalling given that an agent is a co-operator [p(S|C] versus a

740 defector [p(S|D)]. B. The probability of being a cooperator given the agent is a

741 signaller [p(C|S)] versus a non-signaller [p(pC|no-S)]. Simulation parameters as in

742 Fig 3.

743

744 Figure 5. Signal honesty. The costs of signalling can maintain honesty in two ways.

745 In (a), individuals belong to high and low quality classes. These classes differ in their

746 investment in signalling and so the form of the signal indicates underlying quality. In

747 (b) the costs of signalling are the same for all, but Cooperators get greater benefits

748 from signalling when they engage in repeated interactions than do Defectors who can

749 only exploit or cheat a partner once. The figure illustrates schematically how

750 individuals playing repeated cooperation (C) can make a net profit despite paying

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

751 signal costs, when this may not be possible for those ‘cheating’ i.e. defecting on co-

752 operators. Between the maximum signal cost sustainable by D’s and the maximum

753 sustainable by C’s is the region in which any Signaller must be a Cooperator.

754 Adapted from (Johnstone 1997a) Fig 7.2a.

755

756 Figure 6. Effects of varying key parameters. Panels a-c show the effect of varying

757 signal cost whilst keeping other parameters constant (see Figure 3 Legend for

758 parameters) on the percentage of agents displaying choosy strategies; the

759 percentage signalling; and the percentage cooperating. In each case these are

760 calculated as means with standard errors of the given variable value in generation

761 100, computed across 10 simulations. They are therefore intended to indicate the

762 average value at equilibrium (see Figure 3 for how the evolutionary dynamics reach

763 equilibrium value within 100 generations). Similarly panels d-f show average

764 percentages of Signallers, Choosers and Cooperators keeping other parameters

765 constant (as in Figure 3) while varying the number of plays as the primary means of

766 increasing the benefits of a relationship with a partner.

767

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

768 Figure 1

769

770

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

771 Figure 2

772

773 bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

774 Figure 3

775

776

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

777 Figure 4

778

779

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

780 Figure 5

781

782

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.11.873208; this version posted December 12, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

783 Figure 6

784

785