FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

THE EVOLUTION OF DECEPTION IN SIGNALING SYSTEMS

By

CANDACE OHM

A Dissertation submitted to the Department of Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Degree Awarded: Fall Semester, 2013

Candace Ohm defended this dissertation on October 10, 2013. The members of the supervisory committee were:

Mike Mesterton-Gibbons, Professor Directing Dissertation

Mark Isaac, University Representative

Alec Kercheval, Committee Member

Warren Nichols, Committee Member

The Graduate School has verified and approved the above-named committee members, and certifies that the dissertation has been approved in accordance with the university requirements.

This is dedicated to my best friend, my limelight, and a man I lost along the way. To my dearest Grandpa, I will always love you.

ACKNOWLEDGMENTS

Thank you to my advisor, Dr. Mike Mesterton-Gibbons. Thank you for challenging me every step along the way, thank you for never accepting anything less than perfect, and most of all, thank you for teaching me the definition of independence. I couldn’t have asked for a better advisor.

I’d also like to thank a few good mentors and committee members I had along the way. To my committee, Dr. Nichols, Dr. Kercheval, and Dr. Isaac, thank you for your outstanding service to the academic community. Thanks to Jonathan Rowell and Patrick Fletcher; you were good friends that offered great advice. Thanks to an undergraduate professor and best friend, Dr. Pam Warton. I owe my graduate school career to Pam’s persistence, intelligence, stubbornness, and faith in me.

I’d also like to thank my friends and family. I made six irreplaceable friends along the way that made Tallahassee the place I grew up and the place I call home. Michelle, Tamzyn, Celes, Jenn, Jules, and Tara, thank you for standing by me through the process. Miss Jules, thanks for assisting with your technical editing skills. Thanks to my oldest friends, my brother Matt and cousin Ciressa, for always sharing your honest opinion and for always staying close. Thanks to my uncles, Uncle Craig and Uncle Ruggy, for always making me smile. Finally, thank you to my parents, especially my father, for teaching me the value of hard work.

TABLE OF CONTENTS

List of Tables
List of Figures
Abstract

1 Introduction
1.1 Deceptive Signaling
1.2 Signals and Learning
1.3 Agenda
1.4 Definitions

2 Review of the Literature
2.1 Classical Game Theory
2.1.1 The Hawk-Dove Game
2.2 Signaling Games
2.2.1 The Frequency Dynamics of Signaling Games
2.2.2 Stability Analysis
2.3 Deceptive Signaling in Animal Behavior
2.3.1 One-Receiver Reduced Game
2.3.2 Bluffing by Territory Holders
2.4 Extending the Current Literature

3 The Two-State Signaling Game with One Receiver
3.1 The General Model with Two States
3.1.1 The Replicator Equations
3.1.2 The Learning Dynamic Equation
3.1.3 A Brief Commentary
3.2 Example: Manipulative Mimics
3.2.1 The Dynamical System
3.2.2 Results

4 The Two-State Signaling Game with Multiple Receivers
4.1 The Replicator Equations
4.2 Deception in RHP
4.2.1 The Model
4.2.2 Strategies Within the Signaling System
4.2.3 Creating the Payoff Functions
4.2.4 The Dynamical System
4.2.5 Results

A n-State Games
A.1 The Replicator Equations
A.2 The Learning Dynamics

B Code

C Copyright Permission

References
Biographical Sketch

LIST OF TABLES

1.1 Requirements for a general theory on signaling
2.1 The Hawk-Dove game
2.2 Local stability analysis
2.3 Payoffs in the one-receiver game
2.4 Payoffs in the bluffing by territory holders model
2.5 Parameters of the bluffing by territory holders model
3.1 Summary of state variables and payoff functions
3.2 Dynamical system for the two-state two receiver model
3.3 Actions of the manipulative mimics game when a mimic is present
3.4 Actions of the predator against a toxic model
4.1 Variables in the n-receiver signaling game
4.2 Dynamical system for the two-state n receiver model
4.3 Possible states in the deceptive strength game
4.4 List of values for the deceptive strength game
4.5 Possible states of the deception in RHP game
4.6 Nash mixed equilibria of the deception in RHP game
4.7 Players’ actions in the deception in RHP game
4.8 Payoffs in the deception in RHP game
4.9 Dynamical system for the deception in RHP game
A.1 Variables in the n-state signaling game

LIST OF FIGURES

2.1 Lewis signaling games
2.2 The one-receiver game with interior equilibrium
3.1 Extensive form manipulation game
3.2 The mimic octopus
3.3 A tree of the manipulative mimics game
3.4 Manipulative mimics game vector field
4.1 Manipulation game with n receivers
4.2 Deception in RHP equilibria
4.3 Stable interior fixed points for 0.973 ≲ γ ≤ 1
A.1 The n-state model

ABSTRACT

In this dissertation, we create a dynamical learning model that helps to explain the evolution of deception in signaling systems. In our model, the signaler may choose to signal either of two possible states. We apply this model to Batesian mimicry and to deceptive signaling of fighting ability, or resource holding potential. We show how to expand this model to allow for multiple receivers as well as multiple possible states. External code for the Batesian mimicry and deceptive signaling of fighting ability models can be accessed via the following link: https://dl.dropboxusercontent.com/u/18728677/Dissertation%20Code.zip

CHAPTER 1

INTRODUCTION

John Maynard-Smith and David Harper define a signaling device as “an act or structure that alters the behavior of another organism, which has evolved because of that effect, and which is effective because the receiver’s response has also evolved” [28]. Signals can be externally visible features, behavior patterns, or sounds, all of which are “designed by natural selection” [25]. Darwinian theory suggests that signaling systems exist because signaling, on average, increases the long-term fitness of both the signaler and receiver. The value of a signal to a receiver is as a source of information. The receiver uses this information to choose the behavioral, physiological, or developmental responses that will maximize its fitness [37].

A signal is honest if some characteristic of the signal (which can include its absence or presence) is consistently correlated with some attribute of the signaler or its environment, and receivers benefit from having information about this attribute [37]. A signal is a handicap when its honesty is guaranteed because the signaler’s cost of producing the signal is greater than what is required for the signal to be effective [28]. For example, the antlers of a stag are a sign of strength and dominance because antlers are a stag’s primary tool in battle. However, larger antlers are more costly: there is a direct correlation between antler size and weight, and so larger antlers incur a greater cost over a stag’s lifetime. Thus, antlers are a handicap [50].

According to Zahavi’s handicap principle, all signals must be, on average, honest [50]. The handicap principle explains how honest signaling is able to evolve when a signaler has an obvious motivation to deceive or exaggerate the signal [11, 50]. The three conditions of the handicap principle state that: (1) on average, signals must be honest, (2) signals are costly, and (3) the differential costs of signaling correlate with the true quality of the signaler [11, 50].

Zahavi’s handicap principle, initially rejected by many scientists [28, 36, 37, 48], became generally accepted when Alan Grafen constructed various game-theoretic models. The models created by Grafen clarified the handicap principle and substantially increased its importance and scope. The handicap models created by Grafen [11] show, under fairly general conditions, that if the handicap principle’s conditions are met, then an evolutionarily stable signaling equilibrium exists, and that any signaling equilibrium must satisfy the conditions of the handicap principle. A signaling equilibrium is a pair of signaler and receiver strategies such that neither individual gains by selecting a different strategy [11]. Grafen further explained three possible interpretations of the handicap principle: (1) the possession of the handicap puts an extra risk upon the individual, (2) differences in quality are exhibited by revealing the handicap (for example, a stag’s quality is revealed in battle because its antlers, which are a handicap, are being utilized; in contrast, differences in quality are not exhibited when a stag is foraging because antlers do not aid in foraging), and (3) only high-quality males are capable of expressing the handicap.

1.1 Deceptive Signaling

If honesty is used to define the fundamental conjunction of signal level and information, where information is based on what the receiver wants to know, then virtually no signaling system will be completely honest [41]. Deceit occurs when the signaler’s fitness increases at the cost of the receiver’s fitness [30]. Deceptive signaling is able to evolve when it is less costly to send a deceptive signal than a truthful signal [11]. The signaler is able to select what information is transmitted, and so the signaler will elect only to emit signals that will induce a desirable behavior from the receiver [47].

Zahavi’s handicap principle does not preclude the existence of deception; rather, it suggests deception is equivalent to cheating in a signaling system [11, 28]. Cheating is expected in evolutionarily stable signaling systems, but the system is stable only if there is some reason why cheating is not profitable on most occasions. Otherwise the meaning of the signal is debased [11]. Explanations for the existence of deception that acknowledge the handicap principle include (1) studying the signaling system as a whole unit, (2) frequency-dependent signaling, and (3) costs associated with deceptive signaling.

Studying the signaling system as a whole unit requires looking at all types of signalers a receiver encounters and vice versa. For example, consider the angler fish. The angler fish has a fleshy growth, or “lure,” that hangs above its head. This worm-like lure is an evolved mechanism that serves to attract prey. The prey see the lure and are tricked: rather than feed on a tasty feast, they are feasted upon. Therefore, the worm-like lure is debatably a deceptive signal [11, 28].

Grafen argues that the lure is not actually a signaling device. Rather, he provides a definition for signaling that requires observing a signaling system as a whole unit [11]. In this case, the signalers would be the angler fish and the worm. The receivers are the species that are predators of the worm that are also prey of the angler fish. Thus we must also consider the viewpoint of the worm. A worm wriggles and twists because major selective forces, such as feeding, dispersal, and circulation of oxygen, shape its appearance and behavior. The worm does not intentionally move in this manner to attract predators; therefore, the worm is not signaling [11]. Since the worm does not intend to signal, this system is not a signaling system, and so the handicap principle does not apply.

According to Grafen, the lure of an angler fish is no more a signal for fish to approach and be eaten than the firing of a shotgun is a signal that invites suicide on the part of end users [11].

The handicap principle implies that the frequency of deceptive signals must depend on the frequency of honest signals, i.e., deceptive signals are frequency dependent. If deceptive signals have too much impact on the signaling system, they will disrupt it [11, 28]. Most likely, deceptive signalers are a minority, and the difference between honest signalers and deceptive signalers is the cost of the signal [11]. If the fraction of cheats becomes too large, the meaning of the signal loses its value [13].

Since deceptive signals are frequency dependent, there must be certain elements of the signaling system that cause the frequency of deceptive signaling to be rationed [28]. For example, consider the Great Tit, Parus major. In the absence of predators these birds will emit hawk alarms, causing other feeding birds to flee, and so the signaler is left to feast upon the remaining food with no competition. Thus the caller benefits at a cost to the receiver [28]. In this example of great tits, deceptive signals appear to be rationed in three ways: (1) false signals are not used when confronting an individual subordinate to the caller, (2) false signals are less likely to be used when food is readily available, and (3) false signals decrease when food is evenly distributed [28].

Costs associated with deceptive signaling are incurred by both the signaler and the receiver. Receivers incur assessment costs while signalers incur indirect costs. Assessment costs refer to all costs the receiver incurs while listening to and interpreting the signal. Receivers often can reduce assessment costs, but by doing so, they also reduce the effort spent assessing the signal, which can allow some level of deception to exist within the signaling system. However, if deception becomes too frequent, either the signal loses its value or the receiver will invest more in assessment costs to reduce the uncertainty of the signal [41].

A signaler’s indirect costs are the costs incurred that are not associated with the direct emission of the signal. For example, Batesian mimicry occurs when an edible species has evolved to adopt the warning colorations or markings of a toxic species. The African swallowtail butterfly, Papilio dardanus, is edible, but its markings match the same pattern as a non-edible version, i.e., a toxic model [28]. If the handicap principle is true, then it must be the case that the brightly colored markings are disadvantageous in some other aspect of the species’ life. It is possible that other predators may find all prey edible, or the bright colors exhibit some other handicap in the individual’s life, for example, a lack of camouflage [11].

1.2 Signals and Learning

John Krebs and Richard Dawkins argue that manipulation is a learned process that co-evolves with mind-reading, and that the two concepts are highly relevant to the study of animal communication [25, 40]. Mind-reading involves exploiting the targeted individual’s behavior while manipulation leads to actively changing the targeted individual’s behavior [25]. Animals are able to mind-read because the sequences of behavior follow statistical rules. Past interactions allow individuals to predict the outcome of future interactions. For example, an animal will most likely interpret another animal baring its teeth as a sign of aggression, based on previous interactions with teeth-baring animals.

For an animal, the data collection and statistical analysis process can be interpreted as either an evolutionary process or learning in an individual’s life. As an evolutionary process, natural selection acts on a mind-reader’s ancestors over a long period of time. Individuals learn from the behavior of their ancestors because favorable traits are selected for and become ingrained in their genetic coding. At the individual level, animals develop some process of learning in their own lifetime through repeated interactions [25].¹

Marian Stamp Dawkins and Tim Guilford argue that honesty, manipulation, and mind-reading are all strategic components of signal design which operate through receiver psychology. This implies that the efficacy of the signal relies upon the psychological landscape of the receiver [13]. The psychological landscape refers to any physiological features or psychological factors that might affect a receiver’s response to a signal. If a receiver finds certain signals memorable or attention-grabbing, then these features are more likely to be incorporated into the evolution of some signals. For example, if a female bird is attracted to a certain movement, symmetry pattern, or coloration while searching for food, then male birds of the same species might be expected to exhibit similar actions in displays [13].

1.3 Agenda

A theoretical framework is needed to describe how social interactions evolve in signaling systems [46, 12]. Marian Stamp Dawkins argues that a general theory of signaling should have at least three requirements [40]. The table shown below outlines these requirements.

¹ There are four tactics that an individual may adopt to read another individual’s behavior [47]. The first is implicit mind reading, where an individual’s appreciation of the link between, for example, another individual’s perception and their actions implicitly suggests what comes between the perception and the action. The second is counter-deception, where an individual is able to discriminate between its opponent’s true state of mind (such as intentions) and the overt ‘false’ behavior. Next is recognition of intervening variables, where the insight from certain sets of behaviors and/or conditions generates states in other individuals that can, in turn, predict sets of future actions. Lastly, experience projection occurs when an individual is able to use its own experience to predict how another individual will act [47].

Table 1.1: Requirements for a general theory on signaling

Conditions of the handicap principle   Interpretation
Lead to evolutionary stability         Mathematical model
Diversity of function                  Discernibility of the signal
Diversity of form                      Variations of the signal

The first requirement is that the system should lead to evolutionary stability, and so any useful theory should be formalized into a mathematical model. Second, it should explain the observed diversity of function of animal signals. Signaling systems with conspicuous displays, such as the claw-waving of fiddler crabs [36, 40], have been the most studied because conspicuous displays are energy-consuming or appear to involve high risk or considerable costs to the animal displaying them. However, most animal communication takes place through signals that are hardly noticeable [40]. Finally, a general theory of signaling should explain the observed diversity of form of animal signals. Even closely related species with similar sensory systems exhibit major differences in communicative strategies. For example, species of sea bass within the genus Hypoplectrus have adopted numerous color patterns, including black, blue with white stripes, and yellow with iridescent face stripes [40].

The rest of this dissertation proceeds as follows. We conclude the introduction with a short glossary of common terms that arise in the theory of signaling systems. In Chapter Two, we review the literature. We motivate evolutionary game theory as a tool to understand the long-term dynamics of signaling systems and we discuss deterministic evolutionary games in more detail. We introduce the Lewis signaling game, which provides a general outline for game-theoretic models that aim to explain the evolution of signaling [19, 20]. We provide a detailed discussion of a specific Lewis signaling game that shows how deception can evolve in a signaling system [36].

In Chapter Three, we develop a general model that allows interactions on two distinct levels, which we shall refer to as the signaling and confrontation interactions. This allows us to distinguish an individual’s signaling strategies from the confrontation that takes place after signaling occurs.

We apply the model to a specific case where signalers may send false information about the state of the environment: Batesian mimicry. Our results show that the frequency of deceptive signaling depends on the receiver’s cost of making an incorrect assessment of the signal. This provides evidence that honesty, manipulation, and mind-reading operate through receiver psychology, as suggested by Stamp Dawkins and Guilford [13].

In Chapter Four, we develop an extension of the model. We generalize the model to a signaling system with n different types of receivers. We apply the general model to deceptive signaling in an individual’s fighting ability, or resource holding potential [33]. Our model is an extension of a model developed by Rowell et al., which analyzes the signaling strategies of players with asymmetric strength. Both models assume a population with two types of individuals, strong and weak, that are competing over a resource. We make three major alterations to Rowell et al.’s assumptions. First, we do not assume that strong individuals always defeat weak individuals. Second, we assume dyadic interactions whereas Rowell et al. assume triadic interactions. Finally, our model assumes signaling takes place at the individual level while Rowell et al.’s model analyzes the overall signaling population. Our results show that weak receivers often disbelieve the signal, which implies they will engage in a battle more frequently. Thus, our results suggest that weaker males tend to be overly aggressive.

1.4 Definitions

Batesian Mimicry: Mimicry in which an edible animal is protected by its resemblance to a noxious one that is avoided by predators [28].

Cost: The loss of fitness resulting from making or receiving a signal.

Deception: Occurs when the signaler’s fitness increases at the cost of the receiver’s fitness [30].

Evolutionarily Stable Strategy: A strategy that cannot be bettered by any feasible alternative strategy, provided sufficient members of the population adopt it [30].

Handicap Signal: A signal whose reliability is ensured because its cost is greater than required by efficacy requirements; the signal may be costly to produce, or have costly consequences [28].

Manipulate: To skillfully control or influence an individual or situation [25].

Model: Toxic prey that advertises its toxicity by conspicuous signals [8].

Resource Holding Potential: A measure of the absolute fighting ability of a given individual [33].

Signal (Def. 1)*: Any act or structure which alters the behavior of other organisms, which evolved because of that effect, and which is effective because the receivers have also evolved [28].

Signal (Def. 2): Behavioral, physiological, or morphological characteristics fashioned or maintained by natural selection because they convey information to other organisms [37].

Signal (Def. 3): Let the set of possible actions by an actor be A, a set of character states of the actor be C, and the set of possible actions by an observer of the action be R. Then if the choice from A is a signal for C, we must have (1) natural selection could produce any rule relating the actor’s state to its action, and (2) natural selection can produce a rule in the observer relating A to a permutation of the elements of A [11].

*For the remainder of this dissertation, we shall adopt Maynard Smith and Harper’s definition of signal (definition 1). This definition is clearly stated, concise, and generally accepted by most scientists. This definition also captures the overall theme of this dissertation: what effect does a signaling system have on the evolution of signaler and receiver strategies?

CHAPTER 2

REVIEW OF THE LITERATURE

2.1 Classical Game Theory

Strategic behavior arises when an individual’s interaction outcome depends on the actions of other individuals. Game theory is a mathematical tool that is used to describe the strategic interactions between individuals. We define a game as the mathematical model of strategic interaction, and each individual in the game is called a player.

Games are defined by four concepts. The first is the set of players. We assume there are at least two players interacting with each other. Second, each player has a set of strategies, where strategies are constrained by the player’s information about the game. Players base strategy decisions on what information is known about the game. Third, there is a well-defined payoff function, where a player’s payoff function depends on the strategy selected by all individuals in the game. Lastly, there is a solution concept. The solution concept provides a rationale for determining which strategy each individual will select [31].

The main premise of classical game theory is that players are rational and choose a strategy based on the actions of their opponent. A pure strategy provides a complete definition of how a player will play the game. A mixed strategy occurs when a player’s strategy is selected probabilistically. A Nash equilibrium occurs when neither player has anything to gain by adopting another strategy, provided that the other player does not depart from its strategy.

An extensive-form game is an explicit representation of the sequencing of all players’ possible moves at every possible decision point and is typically represented using a tree that outlines players’ decisions at every stage of the game. A subgame is any part (or subtree) of the extensive-form game that meets the following criteria:

1. The initial node is a singleton information set

2. All successors of the initial node are contained in the subgame

3. Any successor of a node in the subgame must be contained in the subgame [10].

A key feature of a subgame is that, when viewed in isolation, it contains all the elements that constitute a game in its own right.

2.1.1 The Hawk-Dove Game

The Hawk-Dove game is a two-player, two-strategy game. The Hawk-Dove game assumes that players contest over a divisible resource valued at V, where V > 0. Each player must either choose Hawk and fight for the resource or choose Dove and decline to fight. If both players opt to play Hawk, then a battle ensues. The winner will gain the resource valued at V and the loser incurs a cost c, where c > 0. The traditional payoff matrix of the Hawk-Dove game is given by Maynard Smith in Evolution and the Theory of Games [27]:

Table 2.1: The Hawk-Dove game

                               Player 2
                    Hawk                      Dove
Player 1   Hawk     ((V − c)/2, (V − c)/2)    (V, 0)
           Dove     (0, V)                    (V/2, V/2)

Each player is equally likely to win the contest, hence wins with probability 1/2. Thus the expected payoff for either player when both players play Hawk is (V − c)/2. If one player selects Hawk and the other selects Dove, then the player who chose Hawk gains the resource and receives a payoff of V, while the player who elected to play Dove gains nothing and thus receives a payoff of 0. This corresponds to the entries in the first row, second column and second row, first column of the payoff matrix. If both players select Dove, then we assume the players split the resource, and so the payoff for both players is V/2, which corresponds to the entry in the second row and second column of the payoff matrix.

A mixed Nash equilibrium exists when V < c. Suppose Player 1 selects Hawk with probability x and Player 2 selects Hawk with probability y. Then the expected payoff for Player 1 if it selects Hawk is

\[ \frac{V-c}{2}\,y + V(1-y). \]

The expected payoff for Player 1 if it selects Dove is (V/2)(1 − y), because it will receive a payoff of V/2 if Player 2 plays Dove and 0 if Player 2 plays Hawk. Then a mixed strategy is optimal for Player 1 when the expected payoff of playing Hawk is equal to the expected payoff of playing Dove, or

\[ \frac{V-c}{2}\,y + V(1-y) = \frac{V}{2}(1-y). \tag{2.1} \]

A similar analysis yields the corresponding equation for Player 2. Player 2’s expected payoff of playing Hawk is equal to its expected payoff of playing Dove when

\[ \frac{V-c}{2}\,x + V(1-x) = \frac{V}{2}(1-x). \tag{2.2} \]

Solving equations (2.1) and (2.2), we find that a mixed Nash equilibrium occurs when both players select Hawk with probability x* = y* = V/c.

2.2 Signaling Games

In contrast to classical game theory, evolutionary game theory focuses on entire populations of players that are programmed to use a certain strategy or type of behavior [16]. It has been suggested that it is important to study signaling evolution dynamically because dynamical signaling models can show how signaler and receiver strategies change during the course of evolution [25, 7].

The Lewis signaling game (LSG) was introduced by David Lewis to study the emergence of conventions. The goal was to provide a general framework for studying the emergence of communication by developing a structured framework for a signaling game [19, 26]. The general framework of the LSG relies on two things: (1) an interactive model (or family of models) of potential signaling situations, and (2) an abstract dynamics of evolution or learning that can operate on such a model [20]. The LSG can be used with simple mathematical models to provide insight into the specific processes that may play a role in signaling evolution, for example, mutation-selection dynamics [14, 19], signals on networks [20], and deceptive signaling [36].

Assume there are two players, a signaler and a receiver. Suppose there are n possible states and the signaler is given biased information about the state of the system. The signaler sends one of n possible signals to the receiver, who observes the signal but does not observe the state, then selects one of n possible actions. Both parties are interested in coordinating a stipulation to associate each act with a signal and each signal with the appropriate act [19].

The common theme of a Lewis-style signaling game and similar game-theoretic models of communication is outlined as follows:

1. The game consists of two players, the signaler and the receiver.

2. The signaler has private information about an event. This information is unknown to the receiver. The event is chosen by nature according to some fixed probability distribution.

3. The signaler transmits a signal, which is then interpreted by the receiver.

4. The receiver performs an action which may depend on the observed signal.

5. Payoffs are distributed. The payoff of the signaler and receiver may depend on the state of the event, the signal emitted, and the receiver’s action in response to the signal [21].

The signaler’s strategy is a function from the set of possible states to the set of possible signals, and the receiver’s strategy is a function from the set of observed signals to the set of possible acts. We illustrate the concept of a Lewis signaling game in the graphic below.

[Diagram: an observed state a_k (from states a_1, ..., a_n) is mapped to a signaled state s_k (from signals s_1, ..., s_n), which is mapped to the receiver’s action r_k (from responses r_1, ..., r_n).]

Figure 2.1: Lewis signaling games

The signaler observes one of n possible states and sends one of n possible signals to the receiver. The receiver interprets and selects one of n possible actions.
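To make this structure concrete, here is a minimal Python sketch of a single round of a Lewis signaling game. The lookup-table strategies and the common-interest payoff rule (payoff 1 when the receiver’s act matches the state, 0 otherwise) are illustrative assumptions for this sketch only, not features of the models developed later in this dissertation:

    import random

    n = 3  # number of states, signals, and acts
    # A signaler strategy maps each state to a signal; a receiver strategy
    # maps each signal to an act.  Both start as arbitrary lookup tables.
    signaler = {a: random.randrange(n) for a in range(n)}
    receiver = {s: random.randrange(n) for s in range(n)}

    state = random.randrange(n)        # nature draws the state
    signal = signaler[state]           # signaler observes the state, emits a signal
    act = receiver[signal]             # receiver observes only the signal
    payoff = 1 if act == state else 0  # both parties rewarded when act matches state
    print(state, signal, act, payoff)

Coupling rounds like this to an evolutionary or learning dynamic, as described below, is what turns the one-shot game into a model of signaling evolution.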

2.2.1 The Frequency Dynamics of Signaling Games

Deterministic evolutionary signaling games are analyzed by creating a dynamical system that models the signaler and receiver strategies over time. One equation often used to model the frequency dynamics of evolutionary signaling games is the replicator equation. The replicator equation was introduced by Peter Taylor and Leo Jonker [44] and is intended to relate the growth rate of the proportion of each type of individual to its expected payoff with respect to the average payoff in the population. The dynamical system embraces the concept that more successful strategies will spread over time, because individuals with above-average payoffs will increase in frequency and individuals with below-average payoffs will decrease in frequency [17].

Suppose we have n strategies, labeled A_1, A_2, ..., A_n, which occur with frequencies x_1, x_2, ..., x_n, where x_i ∈ [0, 1] and \(\sum_{i=1}^{n} x_i = 1\). Let x = ⟨x_1, x_2, ..., x_n⟩ be the frequency vector. Let f_i(x) denote the payoff function associated with strategy A_i and let \(\bar{f}(x) = \sum_{i=1}^{n} x_i f_i(x)\) be the average fitness of the population. Then the general version of the replicator equation is

\[ \dot{x}_i = \bigl(f_i(x) - \bar{f}(x)\bigr)\, x_i, \tag{2.3} \]

where the dot above x_i represents the derivative with respect to time [16, 17, 19, 36, 44].

Notice that the sign of the dynamical equation is strictly determined by the sign of f_i(x) − f̄(x). When the payoff for strategy A_i is greater than the mean fitness, i.e., f_i(x) > f̄(x), this strategy will increase in frequency. Likewise, when f_i(x) < f̄(x), the frequency of individuals adopting strategy A_i will decrease. A dynamic is qualitatively adaptive if, according to that dynamic, a strategy that is not extinct increases its proportion of the population if its fitness is higher than the population average, decreases its population proportion if its fitness is less than the population average, and keeps the same population proportion if its fitness is equal to the population average. The replicator dynamics is a member of this class, but there are many other qualitatively adaptive dynamics [38].
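As an illustration of equation (2.3), the sketch below iterates the replicator dynamics for the Hawk-Dove game of Section 2.1.1 with forward-Euler steps. The payoff values, initial frequencies, and step size are illustrative assumptions:

    # Replicator dynamics for Hawk-Dove with strategies A1 = Hawk, A2 = Dove.
    V, c = 2.0, 3.0
    payoffs = [[(V - c) / 2, V],      # Hawk against (Hawk, Dove)
               [0.0,         V / 2]]  # Dove against (Hawk, Dove)

    x = [0.1, 0.9]  # initial frequencies of Hawk and Dove
    dt = 0.01
    for _ in range(100_000):
        f = [sum(payoffs[i][j] * x[j] for j in range(2)) for i in range(2)]
        fbar = x[0] * f[0] + x[1] * f[1]  # average fitness of the population
        x = [x[i] + dt * (f[i] - fbar) * x[i] for i in range(2)]
    print(x)  # Hawk frequency approaches V/c = 2/3

Because Hawk earns an above-average payoff whenever its frequency is below V/c and a below-average payoff above it, the population settles at the mixed-equilibrium frequency.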

Another qualitatively adaptive dynamic commonly used in deterministic evolutionary game theory is the learning dynamic [15, 36]. The learning dynamic is qualitatively similar to the replicator equation, but describes the behavior of individuals with distinct qualitative properties [36]. Suppose we have n players and each player can choose between two strategies labeled A_1 and A_2. Suppose player i adopts strategy A_1 with frequency x_i. Then the general version of the learning dynamic is given by

\[ \dot{x}_i = \frac{\partial f_i}{\partial x_i}\, x_i (1 - x_i), \tag{2.4} \]

where f_i(x) is the payoff function for player i. The logistic term x_i(1 − x_i) represents the frequency of random encounters of individuals playing different strategies [36]. Notice that the sign of ẋ_i agrees with the sign of ∂f_i/∂x_i, which ensures that the learning dynamic is qualitatively adaptive.

The learning dynamic can be interpreted in two ways: in a biological context or in a dynamical learning context [19, 36]. In a population genetics context, we consider fitness instead of payoff, or average fitness instead of average payoff [19]. In this sense, the state variables are Mendelian traits that correspond to two alleles at a locus, and the learning dynamic models the change in frequencies of each allele over time [17]. The learning dynamic shown in equation (2.4) is obtained as the classical continuous-time model for allele frequency dynamics, where f_i(x) represents the mean fitness for allele A_i with frequency x_i [36]. A second interpretation is learning over an individual’s lifetime. Allowing random encounters by individuals of different strategies allows individuals to compare payoffs; thus players learn by assessing the payoffs of other individuals. If a player observes that another player has a higher payoff, then it can either switch entirely to that strategy (this requires assuming the population is infinite and strategies are pure, so x_i represents a frequency in the total population) or it can increase the frequency with which it uses the more rewarding strategy. Models with this interpretation have been studied extensively [17].

2.2.2 Stability Analysis

An evolutionarily stable strategy (ESS) is a strategy that cannot be bettered by any feasible alternative strategy, provided sufficient members of the population adopt it. The best strategy for an individual often depends upon the strategies of other members of the population, and the resulting ESS may be a mixture of a number of these strategies [30]. In deterministic evolutionary game theory, ESSs occur at fixed points of the deterministic system. The fixed points of the dynamical system may be identified by solving the equation ẋ_i = 0 for each i ∈ {1, 2, ..., n}. A fixed point for the replicator equation shown in equation (2.3) will be a point that lies in the interior, on an edge, or on a vertex of the n-dimensional simplex. Notice that each vertex of the simplex is an equilibrium point: if x_i = 1 and x_j = 0 for all j ≠ i, then ẋ_j = 0 for all j. A fixed point for the learning dynamic shown in equation (2.4) will be a point that lies in the interior, on an edge, or on a vertex of the n-cube. Notice that all vertices of the n-cube are equilibrium points, because x_i = 0 or x_i = 1 for all i = 1, 2, ..., n will yield ẋ_i = 0 for all i.

Determining the stability of a fixed point will explain the behavior of the dynamical system for points that lie in a small open neighborhood of the fixed point. A local fixed-point stability analysis is performed by calculating the eigenvalues of the Jacobian at the equilibrium point. We define the Jacobian of the system by the matrix

\[ A = \begin{pmatrix} \dfrac{\partial \dot{x}_1}{\partial x_1} & \dfrac{\partial \dot{x}_1}{\partial x_2} & \cdots & \dfrac{\partial \dot{x}_1}{\partial x_n} \\[1ex] \dfrac{\partial \dot{x}_2}{\partial x_1} & \dfrac{\partial \dot{x}_2}{\partial x_2} & \cdots & \dfrac{\partial \dot{x}_2}{\partial x_n} \\ \vdots & & \ddots & \vdots \\ \dfrac{\partial \dot{x}_n}{\partial x_1} & \cdots & \cdots & \dfrac{\partial \dot{x}_n}{\partial x_n} \end{pmatrix}. \]

If all the eigenvalues of the matrix A have nonzero real part, then the equilibrium is said to be hyperbolic and the eigenvalues determine the local stability properties of the equilibrium. If the real parts of the eigenvalues are negative, then the equilibrium is a sink and is asymptotically stable. If the real parts of the eigenvalues are positive, then the equilibrium is called a source and repels nearby points away from the equilibrium point [42]. If the real parts of some of the eigenvalues are positive and some are negative, then the fixed point is called a saddle point. Sources and saddle points are unstable. If the point is non-hyperbolic, then local stability must be investigated using other means [38]. We summarize the conclusions of a fixed-point stability analysis in table 2.2.

Section 2.3 describes a particular LSG created by Rowell et al. [36] entitled “Why animals lie.” This model uses the learning dynamic shown in equation (2.4) to explain how deceptive signaling may evolve in animal societies. The main objective of the model is to calculate the frequencies with which one individual (the signaler) sends a deceptive signal and another individual (the receiver) believes this signal, and thereby to find equilibrium systems in which a certain level of deception may persist in an animal signaling system.

Table 2.2: Local stability analysis

Type     Stability   Real parts of the eigenvalues   Description
Sink     Stable      Negative                        Attractor; nearby points converge
Source   Unstable    Positive                        Repeller; nearby points diverge
Saddle   Unstable    Both negative and positive      Some nearby points converge, others diverge
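A numerical version of this procedure can be sketched as follows: approximate the Jacobian at a fixed point by central finite differences and classify the point by the signs of the real parts of its eigenvalues. The planar system used here is an arbitrary example with a fixed point at the origin (not one of the models in this chapter), and NumPy is assumed to be available:

    import numpy as np

    def xdot(x):
        # Example planar system with a fixed point at the origin
        return np.array([-x[0] + x[1] ** 2, -2.0 * x[1]])

    def jacobian(f, x, h=1e-6):
        # Central-difference approximation of the Jacobian of f at x
        J = np.zeros((len(x), len(x)))
        for j in range(len(x)):
            e = np.zeros(len(x))
            e[j] = h
            J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
        return J

    re = np.linalg.eigvals(jacobian(xdot, np.zeros(2))).real
    tol = 1e-8
    if np.all(re < -tol):
        print("sink: asymptotically stable")  # this example: eigenvalues -1, -2
    elif np.all(re > tol):
        print("source: unstable")
    elif np.any(np.abs(re) <= tol):
        print("non-hyperbolic: inconclusive")
    else:
        print("saddle: unstable")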

2.3 Deceptive Signaling in Animal Behavior

We consider a game with either two or three players. Player 1, the signaler, has two possible choices: send a true or a false signal. Suppose we have two types of receivers. For simplicity, we call them “R1” for a receiver of Type 1 and “R2” for a receiver of Type 2. After receiving the signal, R1 and R2 individuals interpret it and choose to believe or disbelieve.

Let τ be the fraction of time a true signal is sent, ρ be the fraction of time that a Type 1 receiver believes the signal is true, and υ be the fraction of time that a Type 2 receiver believes the signal is true. Then τ, ρ, and υ, where τ, ρ, υ ∈ [0, 1], are strategies for the signaler and receivers. If τ = 1, then the signaler always signals truthfully, τ = 0 implies the signaler always signals deceptively, and 0 < τ < 1 implies the signaler adopts a mixed strategy of truth/deception. Likewise, ρ = 1 (or υ = 1) means a Type 1 (Type 2) receiver always believes the signal, ρ = 0 (υ = 0) means R1 (R2) always disbelieves, and 0 < ρ < 1 (0 < υ < 1) means R1 (R2) adopts a mixed strategy of belief/disbelief.

In this model, we will adopt the learning dynamics equation given by equation (2.4). Recall that the learning dynamics equation can be interpreted on two different timescales, either evolutionary (across-generation) or behavioral (within-generation). First, we may interpret equation (2.4) as describing evolutionary change in a trait under selection. Behavioral alternatives (truth vs. deception, belief vs. disbelief) are Mendelian traits that correspond to two different alleles at a locus. The model also has a second interpretation, which is that signaling strategies change over time due to learning [36].

The learning dynamics model is conceptually related to the replicator dynamics model shown in equation (2.3). The classical replicator model assumes that the population is a mix of a large number of individuals playing pure strategies. Rowell et al. propose the learning dynamics model presented in equation (2.4) so that the same model can be applied to pure or mixed strategies.

Models presented in their paper cover deceptive signals to potential mates or competitors, where both signalers and receivers could adopt a mixed strategy [36].

Each player has two possible choices of actions, which means there are eight possible scenarios that may result. We may represent the game by the following payoff matrices, where payoffs are of the form (Type 1-Receiver, Type 2-Receiver, Signaler):

True signal (τ = 1):

                        R2 Believe (υ = 1)    R2 Disbelieve (υ = 0)
R1 Believe (ρ = 1)      (A1, B1, C1)          (A2, B2, C2)
R1 Disbelieve (ρ = 0)   (A3, B3, C3)          (A4, B4, C4)          (2.5)

False signal (τ = 0):

                        R2 Believe (υ = 1)    R2 Disbelieve (υ = 0)
R1 Believe (ρ = 1)      (A5, B5, C5)          (A6, B6, C6)
R1 Disbelieve (ρ = 0)   (A7, B7, C7)          (A8, B8, C8)

The first matrix corresponds to the players’ payoffs when the signal is true and the second matrix corresponds to the players’ payoffs when the signal is false. The top row in each matrix corresponds to the players’ payoffs when a Type 1 receiver believes the signal, while the second row corresponds to the players’ payoffs when a Type 1 receiver does not believe the signal. Likewise, the columns of the payoff matrices correspond to a Type 2 receiver believing or disbelieving the signal.

Payoffs are determined by multiplying the expected payoff by the frequency with which a player adopts a given strategy. For example, define

\[ J_i(\rho, \upsilon) = C_{1+i}\,\upsilon\rho + C_{2+i}\,\rho(1-\upsilon) + C_{3+i}\,(1-\rho)\upsilon + C_{4+i}\,(1-\rho)(1-\upsilon), \tag{2.6} \]

where i ∈ {0, 4}. Then J_0 yields the payoff for the signaler when a true signal is sent and J_4 yields the payoff for the signaler when a false signal is sent. From equation (2.5), the payoff for the signaler is given by

\[ f_S(\tau, \rho, \upsilon) = \tau J_0 + (1-\tau) J_4, \]

and so the payoff differential with respect to τ is given by

\[ \frac{\partial f_S}{\partial \tau} = (C_1 - C_5)\rho\upsilon + (C_2 - C_6)\rho(1-\upsilon) + (C_3 - C_7)(1-\rho)\upsilon + (C_4 - C_8)(1-\rho)(1-\upsilon). \]

Similarly, we can calculate ∂f_R1/∂ρ and ∂f_R2/∂υ:

\[ \frac{\partial f_{R1}}{\partial \rho} = (A_1 - A_3)\tau\upsilon + (A_2 - A_4)\tau(1-\upsilon) + (A_5 - A_7)(1-\tau)\upsilon + (A_6 - A_8)(1-\tau)(1-\upsilon), \]

\[ \frac{\partial f_{R2}}{\partial \upsilon} = (B_1 - B_2)\tau\rho + (B_3 - B_4)\tau(1-\rho) + (B_5 - B_6)(1-\tau)\rho + (B_7 - B_8)(1-\tau)(1-\rho). \]

This yields the following dynamical system:

\[ \dot{\tau} = \frac{\partial f_S}{\partial \tau}\,\tau(1-\tau), \qquad \dot{\rho} = \frac{\partial f_{R1}}{\partial \rho}\,\rho(1-\rho), \qquad \dot{\upsilon} = \frac{\partial f_{R2}}{\partial \upsilon}\,\upsilon(1-\upsilon). \]
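The system above is straightforward to integrate numerically. In the Python sketch below the payoff constants A_i, B_i, C_i are arbitrary random placeholders; a concrete model, such as those in Sections 2.3.1 and 2.3.2, would supply its own values:

    import random

    # Placeholder payoffs A[1]..A[8], B[1]..B[8], C[1]..C[8]; index 0 is unused.
    A = [0] + [random.uniform(-1, 1) for _ in range(8)]
    B = [0] + [random.uniform(-1, 1) for _ in range(8)]
    C = [0] + [random.uniform(-1, 1) for _ in range(8)]

    def dS(r, u):   # partial derivative of f_S with respect to tau
        return ((C[1] - C[5]) * r * u + (C[2] - C[6]) * r * (1 - u)
                + (C[3] - C[7]) * (1 - r) * u + (C[4] - C[8]) * (1 - r) * (1 - u))

    def dR1(t, u):  # partial derivative of f_R1 with respect to rho
        return ((A[1] - A[3]) * t * u + (A[2] - A[4]) * t * (1 - u)
                + (A[5] - A[7]) * (1 - t) * u + (A[6] - A[8]) * (1 - t) * (1 - u))

    def dR2(t, r):  # partial derivative of f_R2 with respect to upsilon
        return ((B[1] - B[2]) * t * r + (B[3] - B[4]) * t * (1 - r)
                + (B[5] - B[6]) * (1 - t) * r + (B[7] - B[8]) * (1 - t) * (1 - r))

    t, r, u, dt = 0.5, 0.5, 0.5, 0.01  # interior starting point
    for _ in range(10_000):            # forward-Euler steps
        t += dt * dS(r, u) * t * (1 - t)
        r += dt * dR1(t, u) * r * (1 - r)
        u += dt * dR2(t, r) * u * (1 - u)
    print(t, r, u)

The logistic factors keep every trajectory inside the unit cube, which is consistent with the observation below that the eight vertices are always equilibria.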

There is a subtle assumption in the derivation of the dynamical system that is unstated by the authors. Notice that the payoff differentials ∂f_R1/∂ρ and ∂f_R2/∂υ for the two types of receivers depend on the strategy of the other receiver: the payoff for a Type 1 receiver is a function of υ and the payoff for a Type 2 receiver is a function of ρ, which implies that Type 1 and Type 2 receivers must be interpreting the signal simultaneously. Since the receivers’ payoffs depend on each other, Rowell et al.’s three-player model assumes a three-player interaction between the signaler, the Type 1 receiver, and the Type 2 receiver.

The overall state of the model (τ, ρ, υ) can be thought of as a point within or on a face of the unit cube. The cube has a variety of possible equilibrium structures. The eight vertices will always be equilibria as a result of the logistic growth terms.

In the remainder of this section, we explain two specific examples that were presented in Rowell et al.’s paper. The first is a one-receiver reduced model. The second is intended to represent bluffing by territory holders.

2.3.1 One-Receiver Reduced Game

In this example, we have two players, a signaler and a receiver. Suppose the players’ payoff matrix can be represented by:

Table 2.3: Payoffs in the one-receiver game

                      Truthful (τ = 1)    Not truthful (τ = 0)
Believe (ρ = 1)       A1, C1              A5, C5
Not believe (ρ = 0)   A3, C3              A7, C7

For i ∈ {1, 3, 5, 7}, A_i represents the payoff to the receiver and C_i represents the payoff to the signaler. Notice the payoff matrix shown above can be constructed by setting υ = 1 and removing all unnecessary payoff values from equation (2.5).

Creating the dynamical system works similarly to the three-player model. In fact, set υ = 1 and the payoff function J_i shown in equation (2.6) still holds for i ∈ {0, 4}. Since there are two players in the interaction game, the dynamical model lies on the [0, 1] × [0, 1] square. The model may be represented by the equations:

\[ \dot{\tau} = \left[(C_1 - C_5)\rho + (C_3 - C_7)(1-\rho)\right] \tau(1-\tau), \]

\[ \dot{\rho} = \left[(A_1 - A_3)\tau + (A_5 - A_7)(1-\tau)\right] \rho(1-\rho). \]

The logistic term τ(1 − τ) or ρ(1 − ρ) ensures that the corners of the signaling square are always fixed points. There will be an interior fixed point if and only if (C_1 − C_5)ρ + (C_3 − C_7)(1 − ρ) = 0 and (A_1 − A_3)τ + (A_5 − A_7)(1 − τ) = 0. Thus, we have an interior fixed point if and only if (C_1 − C_5) and (C_3 − C_7) have opposite signs, and (A_1 − A_3) and (A_5 − A_7) have opposite signs [36].

A verbal analysis provides insight into the signaling interaction. Suppose the signaler and receiver have opposing interests. Then the receiver would prefer to believe a truthful signal and disbelieve a dishonest signal, and so

\[ A_1 > A_3 \quad \text{and} \quad A_5 < A_7. \]

If these two inequalities hold, then there exists a solution to (A_1 − A_3)τ + (A_5 − A_7)(1 − τ) = 0 for some τ* such that 0 < τ* < 1. Similarly, if the receiver always believes, then the signaler will receive a greater payoff by sending a deceptive signal; thus,

\[ C_1 < C_5. \]

These conditions combined define the nature of truthful and deceptive signals [36]. Given that A_1 > A_3 and A_5 < A_7, we are able to solve (A_1 − A_3)τ* + (A_5 − A_7)(1 − τ*) = 0, and so we conclude that the signaling system can contain some deception without complete breakdown. In order to find the fraction of belief/disbelief, we explore the relationship between C_3 and C_7.

If C_3 ≤ C_7, then τ̇ < 0 at any point in the interior, which will cause τ to decrease. As τ decreases, eventually we will have ρ̇ < 0. The system converges to complete breakdown: the signaler always lies (τ = 0) and the listener always disbelieves (ρ = 0).

Therefore, the only possibility of deception without complete breakdown occurs when C_3 > C_7. In this case, the dynamics are cyclic and the flow on the boundary of the square follows a counterclockwise rotation.

Figure 2.2: The one-receiver game with interior equilibrium

Qualitatively, this makes sense:

1. If the signaler always tells the truth, then the receiver will always believe

2. If the signaler always sends a deceptive message, then the receiver will never believe

3. If the receiver always believes, then the signaler will always send deceptive signals, and

4. If the receiver never believes, then the signaler will always send a true signal (because C_7 < C_3).
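This cycling is easy to reproduce numerically. The payoff values in the Python sketch below are illustrative choices satisfying the conditions derived above (A_1 > A_3, A_5 < A_7, C_1 < C_5, and C_3 > C_7); setting each bracketed factor of the dynamical system to zero gives the interior fixed point, and Euler integration shows trajectories circulating around it rather than settling:

    # One-receiver reduced game: interior equilibrium and cyclic dynamics.
    A1, A3, A5, A7 = 2.0, 0.0, 0.0, 2.0  # receiver: believe truth, doubt lies
    C1, C5, C3, C7 = 1.0, 2.0, 1.0, 0.0  # signaler: lying pays only if believed

    tau_star = (A7 - A5) / ((A1 - A3) + (A7 - A5))  # zero of the rho-dot bracket
    rho_star = (C7 - C3) / ((C1 - C5) + (C7 - C3))  # zero of the tau-dot bracket
    print("interior fixed point:", tau_star, rho_star)  # (0.5, 0.5)

    tau, rho, dt = 0.9, 0.9, 0.001
    for _ in range(50_000):  # trajectories orbit the interior point
        tau += dt * ((C1 - C5) * rho + (C3 - C7) * (1 - rho)) * tau * (1 - tau)
        rho += dt * ((A1 - A3) * tau + (A5 - A7) * (1 - tau)) * rho * (1 - rho)
    print("state after integration:", tau, rho)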

This system could arise if there is an intrinsic cost to giving a dishonest signal, because the signal cost arises when the signaler sends the signal rather than as a result of the receiver’s response. This is similar to the handicap mechanism because the intrinsic cost associated with deception keeps the signaler from being dishonest, so that all receivers will believe. Introducing a second receiver will prevent complete breakdown in a similar way: the second receiver can behave in a way that penalizes the signaler when the frequency of deception is too high. Thus, the cost of deception is frequency dependent, a result of the behavioral dynamics of the signaling interaction: the signaler gains from sending deceptive signals to one type of receiver but loses payoff to the other type when the frequency of deceptive signals is too high.

2.3.2 Bluffing by Territory Holders

The main emphasis of this example is to portray the strategy dynamics involved in signaling games. We assume the signaler is a territory holder whose calls are intended to show ownership. Listeners are individuals passing by the territory that may be interested in fighting the territory-holding male for possession of the resource.

One possible application of this model is to the fiddler crab, Uca annulipes. The fiddler crab has an enlarged claw called the brachychelous claw. If this claw is lost due to injury, then the crab will regenerate a new claw, called the leptochelous claw, which is weaker than the original because there is less muscle tissue. This regrown claw is also longer and leaner, which means crabs with a leptochelous claw can engage in an active (waving) display at a reduced energetic cost.

The game has two types of signalers and two types of receivers: strong signalers/receivers have a brachychelous claw and weak signalers/receivers have the regrown leptochelous claw. A signal is intended to say, “I am strong and this is my territory.” If a receiver believes the signal, then the receiver will always respect the signal. This implies that a battle will not escalate. A battle will escalate when the receiver does not believe the signal. In this case, a strong individual will always defeat a weak individual. Conflicts between individuals of equal size will yield a higher penalty on both individuals because of the associated risk of escalating violence. Payoffs are defined by the payoff matrices shown in table 2.4.

In this model, individuals are fighting over a territory resource valued at T. A small individual that signals and is believed will receive a benefit of T + ε, where ε > 0 is the added benefit a small individual receives from gaining the territory. For example, ε could represent the reduced cost of signaling as seen in the fiddler crab: fiddler crabs with a leptochelous claw have a longer, leaner claw, thus active displays consume less energy than the displays of a fiddler crab with a brachychelous claw.

Table 2.4: Payoffs in the bluffing by territory holders model. Payoff triples are of the form (big receiver, small receiver, signaler).

Big (honest) signaler (τ = 1):

                        SR: Respect (υ = 1)                 SR: Challenge (υ = 0)
BR: Respect (ρ = 1)     (b1, b2, T)                         (b1, −F3, T − F3)
BR: Challenge (ρ = 0)   (T/2 − F1 − I, b2, T/2 + I − F1)    (T/2 − F1 − I, 0, T/2 + I − F1)

Small (dishonest) signaler (τ = 0):

                        SR: Respect (υ = 1)                 SR: Challenge (υ = 0)
BR: Respect (ρ = 1)     (b1, b2, T + ε)                     (b1, (T + ε)/2 − F2 − I, (T + ε)/2 − F2 + I)
BR: Challenge (ρ = 0)   (T − F3, b2, −F3)                   (T − F3, 0, −F3)

If a challenge is escalated, then a large individual will always defeat a small individual, and individuals of equal size attain the resource with probability 1/2. Challenges also incur a cost. Let F1 and F2 be the cost of fighting a same-size individual for large and small males, respectively. Let F3 be the cost of a fight between different-size males. We assume F3 < min(F1, F2) because the tendency for aggression escalates when individuals are similar in size. We assume that the resource is more valuable for territory holders. Let 2I represent the advantage to territory holders when a fight between same-size males escalates. If a large or small listener chooses to respect the signal, then it will receive a benefit of b1 or b2, respectively. Parameters are summarized in table 2.5.

Table 2.5: Parameters of the bluffing by territory holders model

Parameter   Description
b1, b2      Background payoff opportunities for big and small listeners
T           Value of the territory
ε           Possible greater benefit from the territory to smaller males
F1, F2      Cost of fighting a same-size individual for big and small males, respectively
2I          Added benefit the territory holder receives in battles against same-size males
F3          Cost of a fight between different-size males, F3 < min(F1, F2)

Recall from page 17 that, in the general model, Rowell et al. define τ as the fraction of time a true signal is sent. Notice from the payoff matrices in table 2.4 that τ is instead used to represent the fraction of strong signalers relative to weak signalers, i.e., the top matrix represents players’ payoffs when the signaler is strong and the bottom matrix represents payoffs when the signaler is weak. The signaler’s strength is fixed; the signaler cannot choose whether it is strong or weak, and so τ does not represent the fraction of time an individual sends a false signal. Rather, Rowell et al. assume τ represents the overall relative frequency of strong versus weak signals emitted. Rowell et al. argue that even though this model is not derived from individual-level behavior, the results are qualitatively similar because when the payoff to strong signalers is higher than the payoff to weak signalers, there is an increase in strong relative to weak active signalers, and vice versa.

We argue that this creates a major complication in the model. Since τ represents the overall fraction of strong to weak signals, some information is lost. In particular, the model does not depend upon the overall composition of the population, i.e., the fraction of strong to weak territory holders. Therefore, Rowell et al.’s model could derive frequency dynamics about the fraction of signals emitted that is inconsistent with the actual composition of the population.

A second complication is the assumption that strong individuals always defeat weak individuals. It would be interesting to see if similar dynamics would result if this were not the case. Also, as stated on page 19, strong and weak receivers’ payoffs directly depend upon each other. Thus, the model assumes a three-player interaction between the signaler and both receivers takes place at every point in the signaling system. A weak receiver who challenges is automatically defeated if either (1) the signaler is strong, or (2) a strong receiver chooses to challenge. It is highly unlikely that three-player signaling interactions between a signaler, a weak receiver, and a strong receiver always occur, and we suggest an alternative model where a signaler interacts with only one receiver at a time, with the receiver’s type randomly selected based on the composition of the population.

In section 4.2 we create a similar model that addresses these concerns. We extend the “Bluffing by territory holders” model so that strong receivers defeat weak receivers only a fraction of the time. By doing so, we are able to define τ, ρ, and υ at the individual level rather than as the fraction of strong to weak territory holders. Finally, our model assumes two-player interactions: a strong receiver’s payoff does not directly depend on the weak receiver’s payoff and vice versa.

2.4 Extending the Current Literature

Lewis signaling games seek to explain how communication systems evolve [19, 20, 38]. The evolution of deceptive signaling is a concept that has recently been addressed in evolutionary game theory [5, 36, 43, 51], and Rowell et al.’s paper is one that analyzes the evolution of deception using learning dynamics [36]. That paper expounds on how a deceptive signaler is able to fool a receiver. In Rowell et al.’s model, the signaler is able to strategically exploit the behavior of the receiver. If the receiver disbelieves a true signal, the error can be very costly. Deceptive signalers, such as the weak individual portrayed in the “Bluffing by territory holders” example on page 23, are aware of this and will send false signals that take advantage of the receiver’s actions.

In Chapter Three, we begin by constructing a four-dimensional model where there are two possible states. The signaler is given privileged information: the signaler knows the state of the environment and the receiver does not. The receiver interprets the signal and selects an appropriate action based on its belief about the actual state of the environment. We apply this model to Batesian mimicry, as discussed on page 4, where the signaler uses deceptive signaling to trick the receiver into believing a toxic model is present. This signaling system requires interspecies communication, which is not considered in Rowell et al.’s paper [36]. Therefore, we briefly compare an individual’s learning mechanisms when communicating with individuals of a different species to the learning mechanisms used in intraspecies communication.

Our results show that there are two factors that determine the dynamics of the signaling system. The first is the fraction of deceptive signalers relative to the fraction of honest signalers, which is a result that supports Zahavi’s handicap principle [50]. The second factor is the receiver’s cost associated with making an incorrect assessment of the signal. Recall from page 5 that Stamp Dawkins and Guilford argue that the receiver’s psychological landscape makes a major contribution to the evolution of animal signaling. They argue that many signals depend largely upon the receiver’s ability to detect, discriminate, and remember a given signal [13]. Our model supports this theory because it shows that a receiver’s ability to correctly assess a signal is crucial to the evolutionarily stable equilibrium of the signaling system.¹

In Chapter Four, we extend the general model to a system where there is one signaler and n different types of receivers. We develop a model similar to Rowell et al.'s "Bluffing by territory holders" example on page 23, where our model addresses the critiques listed on page 25. There are three novel aspects to this model. First, we develop a game with three interactants: a signaler and two types of receivers. The signaler is randomly paired with one receiver based on the composition of the population. Second, our model assumes a strong individual defeats a weak individual with probability γ, where 0.5 ≤ γ ≤ 1. Third, our model derives the signaling frequency dynamics at the individual level rather than from the overall relative frequency of signals emitted. In the appendix, we generalize the model to a signaling system with n states.

The primary difference between the models presented in this dissertation and the published work discussed in this chapter is that our models make the distinction between individuals and players. In our models, we introduce into the signaling system a nonstrategic individual who is not a strategic player in the signaling game.

¹ The general model in Chapter 3 is on the next page, the Manipulative Mimics model is on page 39, the general model with n receivers is on page 49, the Deception in RHP model is on page 57, and the n-state model is on page 73.

CHAPTER 3

THE TWO-STATE SIGNALING GAME WITH ONE RECEIVER

In this chapter, we assume a game between two individuals, a signaler and a receiver. The signaler is given privileged information about the state of the environment and sends a signal to the receiver.

The receiver interprets the signal and predicts the actual state of the system. Players select an appropriate action or response to the signal, and payoffs are distributed. In Section 3.1 we derive the general form of the dynamical system. The general model is a four-dimensional dynamical system that models the fraction of time that signalers will send a truthful/deceptive signal and the fraction of time that receivers will believe/disbelieve the signal. In Section 3.2 we apply the model to a specific case of deceptive signaling: mimicry. In particular, we analyze the behavior of the mimic octopus.

3.1 The General Model with Two States

We will assume that the environment is in one of two possible states, called State 1 and State 2.

These states could have multiple interpretations, such as the type of animal present or information about the surrounding environment. Batesian mimicry, for example, is a signaling system where multiple types of animals are present. In a signaling system with two types, State 1 could refer to the presence of a Batesian mimic while State 2 is the presence of the unpalatable model. If the environment is in State 1, then the animal may send a truthful signal, indicating that the environment is State 1, or the signaler may send a deceptive signal, indicating that the environment is State 2.

In contrast, the signal could portray information about the surrounding environment. For example, State 1 could refer to the absence of a predator while State 2 could imply a predator is present, and so this interpretation could be used to model the deceptive signaling system of the great tits as described on page 3.

The extensive-form game is depicted in Figure 3.1.

[Figure 3.1: Extensive-form manipulation game. Nature selects State 1 (probability d) or State 2 (probability 1 − d). In State 1 the signaler chooses s11 (frequency τ1) or s12 (frequency 1 − τ1); in State 2 the signaler chooses s22 (frequency τ2) or s21 (frequency 1 − τ2). After receiving State 1, the receiver chooses r11 (frequency ρ1) or r12 (frequency 1 − ρ1); after receiving State 2, the receiver chooses r22 (frequency ρ2) or r21 (frequency 1 − ρ2).]

The game can be interpreted as follows. Nature selects a state of the system. With probability d the environment is in State 1 and with probability 1 − d the environment is in State 2. The signaler observes the state (but the receiver does not) and selects a signal. Let sij denote the action of signaling State j when the actual environment is State i. When the actual environment is State 1, the signaler will choose action s11 with frequency τ1 and action s12 with frequency 1 − τ1.

Similarly, if the environment is in State 2, then the signaler will select action s22 with frequency τ2 and action s21 with frequency 1 − τ2. After receiving the signal, the receiver interprets it.

Let rjk be the response when State j is the signal received and State k is the signal believed. If the signal received is State 1, the receiver will select the response r11 with frequency ρ1 and r12 with frequency 1 − ρ1. Similarly, if the signal received is State 2, then the receiver will select a response of r22 with frequency ρ2 and a response of r21 with frequency 1 − ρ2.

Notice that each player’s strategy contains two components: the frequency of sending/believing signals and the action/response corresponding to this signal. This is because we assume the extensive-form game requires interaction on two distinct levels. First, players are involved in a signaling interaction and second, players are involved in a confrontation interaction. The signaler emits a signal and the receiver interprets. The signaler and receiver must select a signal and inter- pretation, respectively, which correspond to the variables τi and ρj. After signaling takes place, the signaler and receiver engage in a post-signaling interaction i.e., confrontation. The values sij and rjk are actions in the confrontation interaction that depend on the strategies used when players engaged in pre-confrontation signaling.

Finally, we determine payoffs by defining arbitrary payoff functions. Let ψi(sij, rjk) represent the payoff for the signaler when the environment is in State i, State j is signaled, and the receiver assumes State k. We define φi(sij, rjk) as the payoff for the receiver when the environment is in State i, the signaler signals State j, and the receiver assumes State k. We summarize all variables in Table 3.1.

Depending upon the context of the signaling environment, either the replicator equations formalized by Hofbauer and Sigmund [15] or the learning dynamics proposed by Rowell et al. [36] can be applied to this model.

3.1.1 The Replicator Equations

In this section, we explain how to construct the replicator equations for the model shown in Figure 3.1. Recall from equation (2.3) on page 14 that the general form of the replicator equation is given by

ẋi = xi [fi(x) − f̄(x)],

where xi is the frequency of strategy i, x = ⟨x1, x2, ..., xn⟩ is the frequency vector, fi is the payoff for strategy i, and f̄ is the average payoff of the population.

Table 3.1: Summary of state variables and payoff functions

Player     Variable        Description
Signaler   τi              Frequency at which the signaler sends a true signal when the environment is State i
           sij             Action of signaling State j when the actual environment is State i
           ψi(sij, rjk)    Payoff for the signaler in State i when the signaler and receiver adopt action sij and response rjk, respectively
Receiver   ρj              Frequency at which the receiver believes a signal of State j
           rjk             Response when State j was signaled and State k is believed
           φi(sij, rjk)    Payoff for the receiver in State i when the signaler and receiver adopt action sij and response rjk, respectively

Define s = ⟨s11, s12, s22, s21⟩ and r = ⟨r11, r12, r21, r22⟩ as the action and response vectors for the signaler and receiver, respectively. Let τ = ⟨τ1, τ2⟩ and ρ = ⟨ρ1, ρ2⟩ be the frequency vectors for the signaler and receiver, respectively.

First, suppose nature selects State 1. The signaler has two choices, either select action s11 with probability (w.p.) τ1 or action s12 (w.p. 1 − τ1). If the signaler selects s11, it will receive a payoff of ψ1(s11,r11) if the receiver chooses response r11 (occurs w.p. ρ1) or will receive a payoff of

ψ1(s11,r12) if the receiver chooses response r12 (occurs w.p. 1 − ρ1). Then the payoff for adopting action s11 is given by

f11(s, r, ρ)= ρ1ψ1(s11,r11) + (1 − ρ1)ψ1(s11,r12). (3.1)

Alternatively, the signaler could have adopted action s12. Then the signaler will receive a payoff of ψ1(s12,r22) if the receiver chooses response r22 (occurs w.p. ρ2) and will receive a payoff of ψ1(s12,r21) if the receiver chooses response r21 (occurs w.p. 1−ρ2). Then the payoff for adopting

action s12 is given by

f12(s, r, ρ)= ρ2ψ1(s12,r22) + (1 − ρ2)ψ1(s12,r21). (3.2)

Combining equations (3.1) and (3.2), the average payoff for the signaler when the environment is State 1 can be written as

f1(s, r, ρ, τ)= τ1f11(s, r, ρ) + (1 − τ1)f12(s, r, ρ) (3.3)

Combining equations (3.1) and (3.3) yields the replicator dynamic for strategy τ1:

τ̇1 = τ1 [f11(s, r, ρ) − f1(s, r, ρ, τ)]
   = τ1(1 − τ1) [f11(s, r, ρ) − f12(s, r, ρ)]
   = τ1(1 − τ1) [ρ1ψ1(s11, r11) + (1 − ρ1)ψ1(s11, r12) − ρ2ψ1(s12, r22) − (1 − ρ2)ψ1(s12, r21)].     (3.4)

Note that the frequency of action s12 is 1 − τ1, and so τ̇1 sufficiently captures the frequency dynamics for the signaler's set of actions when State 1 is observed.

Now, suppose the actual environment is State 2. Then the signaler may choose action s22 (w.p.

τ2) or action s21 (w.p. 1 − τ2). If the signaler selects action s22, then it will receive a payoff of ψ2(s22,r22) if the receiver chooses response r22 (occurs w.p. ρ2) and will receive a payoff of

ψ2(s22,r21) if the receiver chooses response r21 (occurs w.p. 1 − ρ2). Then the payoff for action s22 can be written as

f22(s, r, ρ)= ρ2ψ2(s22,r22) + (1 − ρ2)ψ2(s22,r21). (3.5)

If the signaler selects action s21, (w.p. 1 − τ2), then it will receive a payoff of ψ2(s21,r11) if the receiver selects response r11 (occurs w.p. ρ1) and will receive a payoff of ψ2(s21,r12) if the receiver selects response r12 (occurs w.p. 1 − ρ1). Then the payoff for action s21 can be written as

f21(s, r, ρ)= ρ1ψ2(s21,r11) + (1 − ρ1)ψ2(s21,r12). (3.6)

Combining equations (3.5) and (3.6), we can calculate the average payoff for the signaler when the environment is State 2:

f2(s, r, ρ, τ)= τ2f22(s, r, ρ) + (1 − τ2)f21(s, r, ρ) (3.7)

By combining equations (3.5) and (3.7), we formulate the replicator equation for action s22:

τ̇2 = τ2 [f22(s, r, ρ) − f2(s, r, ρ, τ)]
   = τ2(1 − τ2) [f22(s, r, ρ) − f21(s, r, ρ)]
   = τ2(1 − τ2) [ρ2ψ2(s22, r22) + (1 − ρ2)ψ2(s22, r21) − ρ1ψ2(s21, r11) − (1 − ρ1)ψ2(s21, r12)].     (3.8)

Next we find the replicator equations for the receiver. Assume that State 1 is the signal received.

Then the receiver may select between responses r11 and r12. If the receiver selects r11, then it will receive a payoff of φ1(s11,r11) if the signaler selected action s11 (occurs w.p. τ1) and will receive a payoff of φ2(s21,r11) if the signaler selected action s21 (occurs w.p. 1 − τ2). Then the payoff for selecting response r11 is

g11(s, r,τ)= dτ1φ1(s11,r11) + (1 − d)(1 − τ2)φ2(s21,r11). (3.9)

If the receiver selects response r12, then it will receive a payoff of φ1(s11,r12) if the signaler selected action s11 (occurs w.p. τ1) and will receive a payoff of φ2(s21,r12) if the signaler selected action s21 (occurs w.p. 1 − τ2). Then the payoff for selecting response r12 is

g12(s, r,τ)= dτ1φ1(s11,r12) + (1 − d)(1 − τ2)φ2(s21,r12). (3.10)

Combining equations (3.9) and (3.10) yields an average payoff for the receiver when the signal received is State 1:

g1(s, r, τ, ρ)= ρ1g11(s, r,τ) + (1 − ρ1)g12(s, r,τ). (3.11)

Combining equations (3.9) and (3.11), we can write the replicator equation for response r11:

ρ̇1 = ρ1 [g11(s, r, τ) − g1(s, r, τ, ρ)]
   = ρ1(1 − ρ1) [g11(s, r, τ) − g12(s, r, τ)]
   = ρ1(1 − ρ1) [dτ1φ1(s11, r11) + (1 − d)(1 − τ2)φ2(s21, r11) − dτ1φ1(s11, r12) − (1 − d)(1 − τ2)φ2(s21, r12)].     (3.12)

Finally, suppose State 2 is the signal received. If the receiver selects response r22, then it will receive a payoff of φ2(s22,r22) if the signaler selected action s22 (occurs w.p. τ2) and will receive a payoff of φ1(s12,r22) if the signaler selected action s12 (occurs w.p. 1 − τ1). Then the payoff for playing response r22 is given by

g22(s, r,τ) = (1 − d)τ2φ2(s22,r22)+ d(1 − τ1)φ1(s12,r22). (3.13)

The payoff for response r21 can be written as

g21(s, r,τ) = (1 − d)τ2φ2(s22,r21)+ d(1 − τ1)φ1(s12,r21). (3.14)

Combining equations (3.13) and (3.14) allows us to calculate the average payoff when the receiver was given a signal of State 2:

g2(s, r, τ, ρ)= ρ2g22(s, r,τ) + (1 − ρ2)g21(s, r,τ). (3.15)

The replicator equation for response r22 can be found by combining equations (3.13) and (3.15):

ρ̇2 = ρ2 [g22(s, r, τ) − g2(s, r, τ, ρ)]
   = ρ2(1 − ρ2) [g22(s, r, τ) − g21(s, r, τ)]
   = ρ2(1 − ρ2) [(1 − d)τ2φ2(s22, r22) + d(1 − τ1)φ1(s12, r22) − (1 − d)τ2φ2(s22, r21) − d(1 − τ1)φ1(s12, r21)].     (3.16)

The computation above results in a dynamical system with four variables. The four-dimensional system is given by equations (3.4), (3.8), (3.12), and (3.16). We summarize by repeating these equations:

τ̇1 = [ρ1ψ1(s11, r11) + (1 − ρ1)ψ1(s11, r12) − ρ2ψ1(s12, r22) − (1 − ρ2)ψ1(s12, r21)] τ1(1 − τ1)

τ̇2 = [ρ2ψ2(s22, r22) + (1 − ρ2)ψ2(s22, r21) − ρ1ψ2(s21, r11) − (1 − ρ1)ψ2(s21, r12)] τ2(1 − τ2)

ρ̇1 = [dτ1φ1(s11, r11) + (1 − d)(1 − τ2)φ2(s21, r11) − dτ1φ1(s11, r12) − (1 − d)(1 − τ2)φ2(s21, r12)] ρ1(1 − ρ1)

ρ̇2 = [(1 − d)τ2φ2(s22, r22) + d(1 − τ1)φ1(s12, r22) − (1 − d)τ2φ2(s22, r21) − d(1 − τ1)φ1(s12, r21)] ρ2(1 − ρ2).     (3.17)

A comment on the parameter d, the probability that the actual environment is State 1, is in order. Notice that this probability occurs in the receiver's payoffs but not in the signaler's. It is omitted from the signaler's payoffs because we assume the signaler is perfectly aware of the state of the environment, so this information is already given when we calculate the signaler's payoffs. The receiver, however, is not aware of the state. Therefore, it is reasonable to assume that the receiver's payoff depends on the probability that the environment is in the given state.
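For concreteness, the system (3.17) can be integrated numerically. The following Python sketch is a minimal illustration, assuming arbitrary constant payoff values ψi(sij, rjk) and φi(sij, rjk) and an assumed value of d; none of these numbers come from the model itself.

from scipy.integrate import solve_ivp

d = 0.4  # assumed probability that the environment is in State 1

# Illustrative payoffs, keyed as (state i, signal j, belief k).
psi = {(1, 1, 1): 0.2, (1, 1, 2): 0.1, (1, 2, 2): 0.8, (1, 2, 1): -0.5,
       (2, 2, 2): 0.6, (2, 2, 1): -0.2, (2, 1, 1): 0.3, (2, 1, 2): 0.0}
phi = {(1, 1, 1): 0.5, (1, 1, 2): -0.1, (1, 2, 2): -0.4, (1, 2, 1): 0.7,
       (2, 2, 2): 0.9, (2, 2, 1): -0.6, (2, 1, 1): -0.3, (2, 1, 2): 0.2}

def rhs(time, y):
    # y = (tau1, tau2, rho1, rho2); each line is one equation of (3.17).
    t1, t2, r1, r2 = y
    dt1 = (r1*psi[1, 1, 1] + (1 - r1)*psi[1, 1, 2]
           - r2*psi[1, 2, 2] - (1 - r2)*psi[1, 2, 1]) * t1*(1 - t1)
    dt2 = (r2*psi[2, 2, 2] + (1 - r2)*psi[2, 2, 1]
           - r1*psi[2, 1, 1] - (1 - r1)*psi[2, 1, 2]) * t2*(1 - t2)
    dr1 = (d*t1*(phi[1, 1, 1] - phi[1, 1, 2])
           + (1 - d)*(1 - t2)*(phi[2, 1, 1] - phi[2, 1, 2])) * r1*(1 - r1)
    dr2 = ((1 - d)*t2*(phi[2, 2, 2] - phi[2, 2, 1])
           + d*(1 - t1)*(phi[1, 2, 2] - phi[1, 2, 1])) * r2*(1 - r2)
    return [dt1, dt2, dr1, dr2]

sol = solve_ivp(rhs, (0, 200), [0.5, 0.5, 0.5, 0.5])
print(sol.y[:, -1])  # long-run frequencies (tau1, tau2, rho1, rho2)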

3.1.2 The Learning Dynamic Equation

Recall from equation (2.4) that the general form of the learning dynamic equation can be written as

ẋi = xi(1 − xi) ∂f/∂xi,

where xi is the frequency of strategy i, x = ⟨x1, x2, ..., xn⟩ is the frequency vector, and f is the player's payoff function. In this section, we derive the learning dynamic equations when there are two possible states. When the actual environment is State 1, the signaler has two possible actions, s11 and s12, which occur with frequency τ1 and 1 − τ1, respectively. In this case, the payoff for the signaler was calculated in equation (3.3):

f1(s, r, ρ, τ) = τ1 f11(s, r, ρ) + (1 − τ1) f12(s, r, ρ)
              = τ1 [ρ1ψ1(s11, r11) + (1 − ρ1)ψ1(s11, r12)] + (1 − τ1) [ρ2ψ1(s12, r22) + (1 − ρ2)ψ1(s12, r21)],

and so, after calculating ∂f1/∂τ1, we arrive at the differential equation

τ̇1 = [ψ1(s11, r11)ρ1 − ψ1(s12, r22)ρ2 + ψ1(s11, r12)(1 − ρ1) − ψ1(s12, r21)(1 − ρ2)] τ1(1 − τ1).     (3.18)

When the actual environment is State 2, the signaler has another pair of actions, s22 and s21, which occur with frequency τ2 and 1 − τ2, respectively. The payoff for the signaler when the actual environment is State 2 was calculated in equation (3.7):

f2(s, r, ρ, τ) = τ2f22(s, r, ρ) + (1 − τ2)f21(s, r, ρ)

= τ2 [ρ2ψ2(s22,r22) + (1 − ρ2)ψ2(s22,r21)]

+(1 − τ2)[ρ1ψ2(s21,r11) + (1 − ρ1)ψ2(s21,r12)] .

We calculate ∂f2/∂τ2 and find the learning dynamic equation for τ2:

τ̇2 = [ψ2(s22, r22)ρ2 − ψ2(s21, r11)ρ1 + ψ2(s22, r21)(1 − ρ2) − ψ2(s21, r12)(1 − ρ1)] τ2(1 − τ2).     (3.19)

Next, we construct the learning dynamic equations for the receiver. If the signal received was

State 1, then the receiver has two possible responses, r11 and r12, which occur with frequency ρ1 and 1 − ρ1, respectively. In this case, the payoff for the receiver was calculated in equation (3.11):

g1(s, r, τ, ρ) = ρ1 g11(s, r, τ) + (1 − ρ1) g12(s, r, τ)

= ρ1 [dτ1φ1(s11,r11) + (1 − d)(1 − τ2)φ2(s21,r11)]

+(1 − ρ1)[dτ1φ1(s11,r12) + (1 − d)(1 − τ2)φ2(s21,r12)] .

We calculate the partial derivative ∂g1/∂ρ1, which yields the differential equation

ρ̇1 = [dτ1φ1(s11, r11) + (1 − d)(1 − τ2)φ2(s21, r11) − dτ1φ1(s11, r12) − (1 − d)(1 − τ2)φ2(s21, r12)] ρ1(1 − ρ1).     (3.20)

If the signal received was State 2, then the receiver has two possible responses, r22 and r21, which occur with frequency ρ2 and 1 − ρ2, respectively. The payoff for the receiver was calculated in equation (3.15):

g2(s, r, τ, ρ) = ρ2g22(s, r,τ) + (1 − ρ2)g21(s, r,τ)

= ρ2 [(1 − d)τ2φ2(s22,r22)+ d(1 − τ1)φ1(s12,r22)]

+(1 − ρ2) [(1 − d)τ2φ2(s22,r21)+ d(1 − τ1)φ1(s12,r21)] .

We calculate the partial derivative ∂g2/∂ρ2, which yields the differential equation

ρ̇2 = [(1 − d)(φ2(s22, r22) − φ2(s22, r21))τ2 + d(φ1(s12, r22) − φ1(s12, r21))(1 − τ1)] ρ2(1 − ρ2).     (3.21)

We conclude by rewriting the four learning dynamic equations given in equations (3.18), (3.19), (3.20), and (3.21) in Table 3.2. Notice that these equations are the same equations shown in equation (3.17).

Table 3.2: Dynamical system for the two-state one-receiver model

τ̇1 = [ψ1(s11, r11)ρ1 − ψ1(s12, r22)ρ2 + ψ1(s11, r12)(1 − ρ1) − ψ1(s12, r21)(1 − ρ2)] τ1(1 − τ1)
τ̇2 = [ψ2(s22, r22)ρ2 − ψ2(s21, r11)ρ1 + ψ2(s22, r21)(1 − ρ2) − ψ2(s21, r12)(1 − ρ1)] τ2(1 − τ2)
ρ̇1 = [d(φ1(s11, r11) − φ1(s11, r12))τ1 + (1 − d)(φ2(s21, r11) − φ2(s21, r12))(1 − τ2)] ρ1(1 − ρ1)
ρ̇2 = [(1 − d)(φ2(s22, r22) − φ2(s22, r21))τ2 + d(φ1(s12, r22) − φ1(s12, r21))(1 − τ1)] ρ2(1 − ρ2)

3.1.3 A Brief Commentary

It is well known that the replicator equation is equal to the learning dynamics equation when the system is two-dimensional [15, 16]. In the four-dimensional model constructed above, we show the replicator dynamics is equal to the learning dynamics. Although this is a new result, it is not very surprising, because players take on distinct roles (signaler and receiver) and there are two dynamical equations pertaining to each role. This allows us to simplify the expressions in a manner similar to that presented by Hofbauer and Sigmund in [15, 16].
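This equality can also be checked symbolically. The sketch below, assuming sympy is available, verifies for the State 1 signaler that the learning-dynamic gradient ∂f1/∂τ1 reduces to the payoff difference f11 − f12 that drives the replicator equation (3.4); the symbol names are placeholders for the abstract payoffs ψ1(s1j, rjk).

import sympy as sp

tau1, rho1, rho2 = sp.symbols('tau1 rho1 rho2')
p111, p112, p122, p121 = sp.symbols('psi111 psi112 psi122 psi121')

f11 = rho1*p111 + (1 - rho1)*p112   # payoff of the truthful action s11
f12 = rho2*p122 + (1 - rho2)*p121   # payoff of the deceptive action s12
f1 = tau1*f11 + (1 - tau1)*f12      # conditional average payoff, equation (3.3)

# d f1 / d tau1 equals f11 - f12, so both dynamics share the bracketed factor.
assert sp.simplify(sp.diff(f1, tau1) - (f11 - f12)) == 0
print('learning gradient equals the replicator payoff difference')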

In our model, the ESSs of the signaling and receiving strategies are calculated independently: the signaler's actions after observing State 1 are independent of its actions after observing State 2.

Similarly, the receiver’s strategy after receiving a signal of State 1 is independent of its strategy after receiving a signal of State 2. Then a signaler’s strategy is defined only by the actual state of the environment and a receiver’s strategy is defined only by the signal received. In a standard

Lewis signaling game, a player’s strategy is defined by a set of moves that defines a signal for every possible state (i.e., a signaler’s strategy is defined by a vector hτ1,τ2i and a receiver’s strategy is defined by a vector hρ1, ρ2i). With that said, our model allows players to adopt mixed strategies because τ1,τ2, ρ1, and ρ2 can take any value in the interval [0, 1] whereas, with the exception of

Rowell et al.’s model, the standard LSG only allows pure strategies [18, 19, 20].

Another interpretation is to consider our model as a subgame of the general LSG. Recall that a subgame, as defined on page 9, is any part of the extensive-form game that, in isolation, can constitute a game. We assume the state of the environment is fixed; the signaler observes the state and selects an appropriate signal, and the receiver reacts based on the signal received. Then each instance of observing the state and selecting a signal is a subgame defined by the actual state of the environment.

3.2 Example: Manipulative Mimics

The mimic octopus is a cephalopod that has evolved the fascinating ability to alter its shape and behavior to impersonate poisonous sea creatures, such as the lion fish, banded sea snake, and flounder. The lion fish has banded poisonous spines all over its body; the mimic octopus will mimic this by flaring its tentacles at the sides of its body and swimming slowly in the midst of the ocean. The flounder is a poisonous flat fish that skims across the bottom of the ocean, and so the mimic octopus will imitate a flounder's motions by making itself flat and wrapping its tentacles behind its head. The mimic octopus has also been spotted impersonating a banded sea snake when the octopus was attacked by small damselfishes: the octopus threaded six arms down a hole and raised the remaining two in opposite directions, banded and curled. In general, the mimic octopus will alter its color to camouflage with the background when passing certain predatory fishes [32].

The most obvious impersonations that the octopus makes are all of animals that produce strong toxins. Banded sea snakes have fangs, the spines of lion fishes are tipped with toxins, and the flounder is poisonous. The mimic octopus's manipulative behavior is hypothesized to be a mechanism intended to deceive predators for two reasons: (1) the mimic octopus has only been observed imitating creatures that are toxic or dangerous, and (2) the octopus's prey is mainly subterranean animals and fishes [32].

We assume a communication game between the mimic octopus and its predator. The mimic octopus is the signaler and the predator is the receiver. We seek the equilibrium state of truth/deception and belief/disbelief, which determines how often the mimic octopus will mimic the behavior of a poisonous sea creature and how often the predator will believe this signal.

Literature on information acquisition focuses nearly exclusively on within-species social learning, and the dynamics of interspecies communication have not been investigated theoretically [6, 36].

Social learning between species is widespread in the animal kingdom. Examples of interspecies communication include the pollination dance of bees [1], the dance of fireflies [39], and the foraging behavior of drongos [35]. It has been suggested that between-species social learning relies, at least partially, on the same mechanisms as individual learning [1]. Heterospecific information transfer is widespread and occurs in all the ecological and cognitive domains in which within-species social learning is also found [1], and so it is reasonable to assume the adaptive learning model given above should provide an excellent fit to model the manipulative behavior of the mimic octopus.

[Figure 3.2: The mimic octopus: (a) in a …, (b) normal foraging color, (c) impersonating a flounder, (d) a flounder model, (e) impersonating a lion fish, (f) a lion fish model, (g) impersonating the banded sea snake, and (h) a banded sea snake model.]

3.2.1 The Dynamical System

Assume the environment may have two possible states, State 1 and State 2. State 1 refers to the presence of a mimic octopus and State 2 is the presence of a toxic sea creature, such as the banded sea snake. If the mimic octopus signals State 1, the predator (receiver) assumes the signal

is true, i.e., a mimic octopus is present. If the mimic octopus signals State 2, then the predator must choose to believe or disbelieve the signal. If the predator believes the signal, then it assumes its interaction partner is a poisonous sea creature. If the predator does not believe the signal, then it assumes its interaction partner is the mimic octopus.

The mimic octopus may send either a truthful signal (the signal is State 1) and Flee the area or send a deceptive signal (the signal is State 2) and Not Flee the area. Thus, Flee/Not Flee are the actions adopted by the mimic octopus that correspond to a true/deceptive signal, respectively.

Suppose that a mimic octopus signals State 1 with frequency τ1 and signals State 2 with frequency 1 − τ1.

The presence of a toxic sea creature is a factor in the predator's payoff function because, if the predator receives a signal of State 2, the signaler may be a mimic octopus or the predator may actually be interacting with a toxic sea creature. After receiving a signal of State 2 (from a truthful toxic creature or a deceptive mimic octopus), the receiver may choose to believe/disbelieve this signal. Suppose that the receiver has two possible actions, Attack or Not Attack. A mimic octopus can send a false signal which, if believed, will prevent the predator from attacking and will allow the mimic octopus to Not Flee the area. A predator would prefer to Attack a mimic octopus and to Not Attack a toxic creature. The predator believes the signal and will Not Attack with frequency ρ2; the predator does not believe the signal and will Attack with frequency 1 − ρ2.

Assume that a predator encounters a mimic octopus with probability d and a toxic model with probability 1 − d. The parameter d can be interpreted as the fraction of time that a receiver encounters a mimic relative to a toxic, i.e., d is a weight of certainty for the receiver. It is an approximation adopted by the receiver to predict the actual state of the environment. In essence, d corresponds to the concept of mind-reading (performed by the receiver) as proposed by Krebs and Dawkins [25].

The graphic shown in Figure 3.3 summarizes players' strategies in the Manipulative Mimics game. The mimic octopus can send a false signal to a predator by mimicking a toxic model; if it does so, the octopus incurs a cost c. If the predator sees a toxic model, then the predator may choose to believe or disbelieve the signal. If the signal is believed, then the predator assumes a toxic creature is present and will Not Attack. If the signal is not believed, then the predator assumes a mimic octopus is present and will Attack.

[Figure 3.3: A tree of the manipulative mimics game. Nature selects a mimic octopus (probability d) or a toxic model (probability 1 − d). A mimic octopus signals truthfully (frequency τ1, action s11 = Flee) or deceptively (frequency 1 − τ1, action s12 = Not Flee). After receiving a signal of a toxic model, the predator believes it (frequency ρ2, response r22 = Not Attack) or disbelieves it (frequency 1 − ρ2, response r21 = Attack).]

Denote the mimic octopus's expected fitness by E[F], where

F = 1 if the octopus survives, and F = 0 otherwise.     (3.22)

The octopus has two choices of actions: either signal honestly and Flee or signal dishonestly and Not Flee. Let pf be the probability the octopus survives if it selects Flee. If the octopus sends a dishonest signal, then it impersonates a toxic model and the receiver may believe or disbelieve the signal. Let pb and pd be the octopus's survival probabilities if the predator believes or disbelieves the signal, respectively. We assume that the octopus's probability of survival is lowest when the receiver disbelieves the signal, or pd < min{pb, pf}, but we impose no relation between the parameters pb and pf. Suppose that a mimic octopus sends a false signal. If the predator chooses Not Attack and a mimic is present, then the predator will receive a payoff of 0. If the predator chooses Attack, then it will receive a payoff of 1 − pd.

Table 3.3: Actions of the manipulative mimics game when a mimic is present.

Signal, Freq.     P1's Action       Belief, Freq.       P2's Action         Payoffs
True, τ1          s11 = Flee        State 1, ρ1 = 1     r11 = Attack        (pf, −)
False, 1 − τ1     s12 = Not Flee    State 1, 1 − ρ2     r21 = Attack        (pd, 1 − pd)
False, 1 − τ1     s12 = Not Flee    State 2, ρ2         r22 = Not Attack    (pb, 0)

The payoff for sending a signal of State 1 is pf, because the predator will always believe a true signal. The payoff for sending a signal of State 2 is pb if the predator believes the signal and pd if the predator does not, and so the payoff of sending a signal of State 2 is pbρ2 + pd(1 − ρ2). Then the average payoff for a mimic octopus can be written as

f̄1(τ1, ρ2) = τ1 pf + (1 − τ1)[pb ρ2 + pd(1 − ρ2)].

The replicator equation for τ1 is given by

τ̇1 = τ1 [pf − f̄1(τ1, ρ2)]
   = τ1(1 − τ1) [pf − pb ρ2 − pd(1 − ρ2)].

Since the receiver’s payoff depends upon the frequency of mimics encountered relative to the frequency of toxic sea creatures, we must also know payoffs for the predator’s actions against a toxic model. Suppose that the predator receives a payoff of h (“harmless”) if the predator opts to

Not Attack a toxic creature and t (“toxic”) if the predator chooses to Attack. Since the predator prefers not to be poisoned, we assume t

If a signal of State 2 is received, then there are two possible ways the predator could have received the signal: either a mimic octopus sent a false signal or a toxic creature is present. If the predator believes the signal, then the predator chooses response r22 and thus will Not Attack. Then the payoff of believing a signal of State 2 is the probability that an octopus sent a false signal times the payoff of believing the signal, plus the probability that a toxic creature is present times the payoff of believing a toxic creature is present:

P2 Belief, Frequency P2’s Action Payoff State 1, 1 − ρ2 r21 = Attack t State 2, ρ2 r22 = Not Attack h

received the signal: either a mimic octopus sent a false signal or a toxic creature is present. If the predator believes the signal, then the predator chooses response r22 thus will Not Attack. Then the payoff of believing a signal of State 2 is the probability that an octopus sent a false signal times the payoff of believing the signal plus the probability that a toxic creature is present times the payoff of believing a toxic creature is present:

g22(τ1) = (1 − d)h. (3.23)

The payoff for the predator when the signal is not believed may be calculated similarly, but we sum payoffs from response r21 (Attack):

g21(τ1)= d(1 − τ1)(1 − pd) + (1 − d)t. (3.24)

Combining equations (3.23) and (3.24) the average payoff for the predator after receiving a signal of State 2 can be written as

g¯(τ1, ρ2)= ρ2g22(τ1) + (1 − ρ2)g21(τ1). (3.25)

Combining equations (3.23), (3.24), and (3.25) allows us to create the replicator equation for ρ2:

ρ̇2 = ρ2 [g22(τ1) − ḡ(τ1, ρ2)]
   = ρ2(1 − ρ2) [(1 − d)(h − t) − d(1 − τ1)(1 − pd)].

We summarize this section by rewriting the two-dimensional dynamical system:

τ̇1 = [pf − pb ρ2 − pd(1 − ρ2)] τ1(1 − τ1)
ρ̇2 = [(1 − d)(h − t) − d(1 − τ1)(1 − pd)] ρ2(1 − ρ2).     (3.26)
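A minimal numerical sketch of system (3.26) follows, under assumed illustrative parameter values chosen so that pb > pf and an interior fixed point exists; the survival probabilities and predator payoffs below are ours, not estimates for a real mimic-predator pair.

from scipy.integrate import solve_ivp

pf, pb, pd = 0.70, 0.90, 0.20   # survival probabilities (assumed)
d, h, t = 0.80, 0.50, -1.00     # mimic fraction and predator payoffs, t < h

rho2_star = (pf - pd) / (pb - pd)
tau1_star = (d*(1 - pd) - (1 - d)*(h - t)) / (d*(1 - pd))
print('interior fixed point:', rho2_star, tau1_star)

def rhs(time, y):
    tau1, rho2 = y
    dtau1 = (pf - pb*rho2 - pd*(1 - rho2)) * tau1*(1 - tau1)
    drho2 = ((1 - d)*(h - t) - d*(1 - tau1)*(1 - pd)) * rho2*(1 - rho2)
    return [dtau1, drho2]

sol = solve_ivp(rhs, (0, 500), [0.5, 0.5], max_step=0.5)
print(sol.y[:, -1])  # the orbit cycles around the interior fixed point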

3.2.2 Results

Equilibria occur when the equations given in equation (3.26) are equal to zero. This occurs at the four corners of the unit square, τ1 = 0, 1 and ρ2 = 0, 1. There is also a possible fixed point when

pf − pb ρ2 − pd(1 − ρ2) = 0   and   (1 − d)(h − t) − d(1 − τ1)(1 − pd) = 0,

which yields

ρ2* = (pf − pd) / (pb − pd)   and   τ1* = [d(1 − pd) − (1 − d)(h − t)] / [d(1 − pd)].

When pb < pf, then, regardless of the predator's strategy, the mimic octopus has a greater survival probability if it signals true and Flees than if it signals false. This implies it is always more beneficial to send a truthful signal. Thus the only signaling system that will result is one where the signaler always signals true and the receiver always believes.

When pb > pf, there are three possible dynamical structures that may evolve, depending on the sign of τ1*. If τ1* < 0, there is no interior equilibrium and the only stable equilibrium point is ρ2 = 1 and τ1 = 0. If τ1* > 0, the system has an interior fixed point; this system yields the same results as presented in Rowell et al.'s one-listener game [36]. Finally, if τ1* = 0, then there is a fixed point everywhere along the line τ1 = 0, where ρ2 < ρ2* yields an unstable fixed point and ρ2 > ρ2* yields a stable fixed point. In summary, a fixed point stability analysis yields one of the three vector fields shown in Figure 3.4.
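The case analysis above can be packaged as a small helper; this sketch, under the same style of assumed parameters as before, simply reports which of the three phase portraits in Figure 3.4 applies.

def classify(pf, pb, pd, d, h, t):
    if pb < pf:
        return 'honesty dominates: tau1 -> 1 and the signal is always believed'
    tau1_star = (d*(1 - pd) - (1 - d)*(h - t)) / (d*(1 - pd))
    if tau1_star < 0:
        return 'no interior equilibrium: stable deception at (rho2, tau1) = (1, 0)'
    if tau1_star > 0:
        return 'interior fixed point: cyclic dynamics'
    return 'line of equilibria along tau1 = 0'

print(classify(0.70, 0.90, 0.20, 0.80, 0.50, -1.00))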

The graph in Figure 3.4 (b) was studied extensively in [36]. Numerical simulations show that the interior fixed point is a center. Dynamics in the interior of the unit square are cyclic, resulting in periodic orbits in the clockwise direction. This model presents a system where deception is possible without complete breakdown, but it does not support stable deception [36].

The graph in Figure 3.4 (a) shows a system where complete deception is always believed and does not break down. The stable fixed point ρ2 = 1, τ1 = 0 implies the mimic will always deceive and the receiver will always believe the signal. The graph in Figure 3.4 (c) is similar, but allows for some values where disbelief occurs.

This differs from Rowell et al.'s two-player model because we introduce a third nonstrategic individual, the toxic sea creature. We have three types of individuals in the signaling system, but two strategic players in the signaling game. In their analysis, as summarized on page 20, Rowell et al. make assumptions on the parameters and show that the equilibrium of a two-player deceptive signaling interaction should result in a system similar to the one shown in Figure 3.4 (b). In this model, we show that the resulting equilibrium solution is not just a function of the signaler's and receiver's payoffs; rather, it is a function that also captures external factors of believing or disbelieving a signal.

Empirical arguments in the existing literature suggest that mimicry can evolve because Batesian mimics are rare relative to the models that are being mimicked [11, 28, 41, 50], but this dynamical system shows that is not exactly the case. The outcome of a stable mimicry system depends on the sign of d(1 − pd) − (1 − d)(h − t), which is a function of the fraction of Batesian mimics, the payoff difference of misjudging a false signal (Type 1 error), and the payoff difference of misjudging a true signal (Type 2 error). This implies the evolution of mimicry depends upon the fraction of mimics in the population as well as the receiver's payoff difference when making an incorrect assessment. Therefore, mimicry evolves based on the fraction of Batesian mimics relative to models as well as the cost incurred when the receiver makes a judgement error.

Recall from page 5 that Stamp Dawkins and Guilford argue that honesty, manipulation, and mind-reading are strategic components that operate through receiver psychology. Honest signals convey information that is important to the receiver and is easy for the receiver to detect, discriminate, or remember, whereas manipulative signals involve the signaler exploiting the receiver's psychology.

46 This is precisely how the mimic octopus is able to cheat in this signaling system. The receiver, which is a predator of the mimic octopus, incurs a substantial cost if it attacks a toxic model. The mimic octopus is able to thrive by mimicking the behavioral patterns and colors of the toxic model and exploiting the receiver’s psychological landscape.

47 1.0 1.0

0.8 0.8

0.6 0.6 1 1 Τ Τ

0.4 0.4

0.2 0.2

0.0 0.0

0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Ρ2 Ρ2 (a) (b)

(c) Legend ρ2 Fraction of time a mimic octopus sends a deceptive signal τ1 Fraction of time a predator believes the signal

Figure 3.4: Manipulative mimics game vector field

∗ ∗ (a) τ1 > 0 so the system has no interior equilibrium, (b) τ1 < 0 so the system has an interior ∗ equilibrium, and (c) τ1 = 0 and so there is an equilibrium line at τ1 = 0

CHAPTER 4

THE TWO-STATE SIGNALING GAME WITH MULTIPLE RECEIVERS

In this chapter, we create a model with one signaler and n types of receivers, labeled type i, i ∈ {1, 2, ..., n}. The signaler is randomly paired with a receiver based on the composition of the population. The signaler is given privileged information: the signaler knows the state of the environment but the receiver does not. A signal is selected and the signaler chooses an appropriate action in accordance with the signal. After receiving the message, the receiver interprets the message and selects an appropriate response.

Suppose we have two possible states, labeled G1 and G2. Let d be the probability that the environment is in State G1 and 1 − d be the probability the environment is in State G2.

Let τi be the frequency at which the signaler sends a true signal when the actual state is Gi. Let ρj^m be the frequency at which a type m receiver believes a signal of Gj. Suppose the signaler encounters a type m receiver with probability pm. We assume a receiver is always present, and so ∑_{i=1}^{n} pi = 1.

Let sij be the action of signaling Gj when the observed state is Gi. Let rjk^m be the response of a type m receiver when Gj was signaled and Gk was believed.

Let ψi(sij, rjk^m) be the payoff function for the signaler when the actual state is Gi. Let φi(sij, rjk^m) be the payoff for a type m receiver when the actual state is Gi, the signaler signals Gj, and the receiver believes the state is Gk.

[Figure 4.1: Manipulation game with n receivers. The tree has the same structure as Figure 3.1, except that the receiver's frequencies and responses carry a type superscript: a type m receiver uses frequencies ρ1^m, 1 − ρ1^m, 1 − ρ2^m, ρ2^m and responses r11^m, r12^m, r21^m, r22^m.]

4.1 The Replicator Equations

In this section, we will apply the replicator equation to the model shown in Figure 4.1. Recall from page 14 that the general form of the replicator equation can be written as

ẋi = xi [fi(x) − f̄(x)],

where xi is the frequency of a given strategy with index i, x = ⟨x1, x2, ..., xn⟩ is the frequency vector, fi(x) is the payoff for adopting a strategy with index i, and f̄(x) is the average payoff of the population. We summarize all variables in Table 4.1.

We begin by creating the replicator equation for the signaler. Define

s = ⟨s11, s12, s22, s21⟩   and   r = ⟨r11^1, r12^1, r21^1, r22^1, r11^2, r12^2, r21^2, r22^2, ..., r11^n, r12^n, r21^n, r22^n⟩

as the action and response vectors for the signaler and receivers, respectively. Let τ = ⟨τ1, τ2⟩ and ρ = ⟨ρ1^1, ρ2^1, ρ1^2, ρ2^2, ..., ρ1^n, ρ2^n⟩ be the frequency vectors for the signaler and receivers, respectively.

Table 4.1: Variables in the n-receiver signaling game

Player            Parameter   Interpretation
Signaler          sij         Action of signaling State Gj when the environment is Gi
                  τi          Frequency of action sii when the environment is Gi
                  ψi          Payoff function for the signaler in State Gi
Type m receiver   rjk^m       A type m receiver's response to signal Gj when State Gk is believed
                  ρj^m        A type m receiver's frequency of response rjj^m when the signal is Gj
                  φi          Payoff function for the receiver in State Gi

First, suppose nature selects State 1. The signaler has two choices: either select action s11 with probability (w.p.) τ1 or action s12 (w.p. 1 − τ1). If the signaler selects s11, it will receive a payoff of ψ1(s11, r11^m) if a type m receiver chooses response r11^m (occurs w.p. ρ1^m) or a payoff of ψ1(s11, r12^m) if a type m receiver chooses response r12^m (occurs w.p. 1 − ρ1^m). Since the signaler interacts with a type m receiver with probability pm, the payoff for adopting action s11 is given by

f11(s, r, ρ) = p1 [ρ1^1 ψ1(s11, r11^1) + (1 − ρ1^1) ψ1(s11, r12^1)] + ... + pn [ρ1^n ψ1(s11, r11^n) + (1 − ρ1^n) ψ1(s11, r12^n)]
            = ∑_{i=1}^{n} pi [ρ1^i ψ1(s11, r11^i) + (1 − ρ1^i) ψ1(s11, r12^i)].     (4.1)

Alternatively, the signaler could adopt action s12. Then the signaler will receive a payoff of ψ1(s12, r22^m) if a type m receiver chooses response r22^m (occurs w.p. ρ2^m) and a payoff of ψ1(s12, r21^m) if the receiver chooses response r21^m (occurs w.p. 1 − ρ2^m). The signaler interacts with a type m receiver with probability pm, and so the payoff for adopting action s12 is given by

f12(s, r, ρ) = ∑_{i=1}^{n} pi [ρ2^i ψ1(s12, r22^i) + (1 − ρ2^i) ψ1(s12, r21^i)].     (4.2)

Combining equations (4.1) and (4.2), the average payoff for the signaler when the environment is State 1 can be written as

f1(s, r, ρ, τ)= τ1f11(s, r, ρ) + (1 − τ1)f12(s, r, ρ) (4.3)

Combining equations (4.1) and (4.3) yields the replicator dynamic for strategy τ1:

τ̇1 = τ1 [f11(s, r, ρ) − f1(s, r, ρ, τ)]
   = τ1(1 − τ1) [f11(s, r, ρ) − f12(s, r, ρ)]
   = τ1(1 − τ1) [∑_{i=1}^{n} pi (ρ1^i ψ1(s11, r11^i) + (1 − ρ1^i) ψ1(s11, r12^i)) − ∑_{i=1}^{n} pi (ρ2^i ψ1(s12, r22^i) + (1 − ρ2^i) ψ1(s12, r21^i))].     (4.4)

Note that the frequency of action s12 is 1 − τ1, and so τ̇1 sufficiently captures the frequency dynamics for the signaler's set of actions when State 1 is observed.

Now, suppose the actual environment is State 2. Then the signaler may choose action s22 (w.p. τ2) or action s21 (w.p. 1 − τ2). If the signaler selects action s22, then it will receive a payoff of ψ2(s22, r22^m) if a type m receiver chooses response r22^m (occurs w.p. ρ2^m) and a payoff of ψ2(s22, r21^m) if the receiver chooses response r21^m (occurs w.p. 1 − ρ2^m). Then the payoff for action s22 can be written as

f22(s, r, ρ) = ∑_{i=1}^{n} pi [ρ2^i ψ2(s22, r22^i) + (1 − ρ2^i) ψ2(s22, r21^i)].     (4.5)

If the signaler selects action s21 (w.p. 1 − τ2), then it will receive a payoff of ψ2(s21, r11^m) if a type m receiver selects response r11^m (occurs w.p. ρ1^m) and a payoff of ψ2(s21, r12^m) if a type m receiver selects response r12^m (occurs w.p. 1 − ρ1^m). Then the payoff for action s21 can be written as

f21(s, r, ρ) = ∑_{i=1}^{n} pi [ρ1^i ψ2(s21, r11^i) + (1 − ρ1^i) ψ2(s21, r12^i)].     (4.6)

Combining equations (4.5) and (4.6), we can calculate the average payoff for the signaler when the environment is State 2:

f2(s, r, ρ, τ)= τ2f22(s, r, ρ) + (1 − τ2)f21(s, r, ρ) (4.7)

By combining equations (4.5) and (4.7), we formulate the replicator equation for action s22:

τ̇2 = τ2 [f22(s, r, ρ) − f2(s, r, ρ, τ)]
   = τ2(1 − τ2) [f22(s, r, ρ) − f21(s, r, ρ)]
   = τ2(1 − τ2) [∑_{i=1}^{n} pi (ρ2^i ψ2(s22, r22^i) + (1 − ρ2^i) ψ2(s22, r21^i)) − ∑_{i=1}^{n} pi (ρ1^i ψ2(s21, r11^i) + (1 − ρ1^i) ψ2(s21, r12^i))].     (4.8)

Next we find the replicator equations for a type m receiver. Assume that State 1 is the signal

received. Then the receiver may select between responses r11^m and r12^m. If the receiver selects r11^m, then it will receive a payoff of φ1(s11, r11^m) if the signaler selected action s11 (occurs w.p. τ1) and a payoff of φ2(s21, r11^m) if the signaler selected action s21 (occurs w.p. 1 − τ2). Then the payoff for selecting response r11^m is

g11^m(s, r, τ) = dτ1 φ1(s11, r11^m) + (1 − d)(1 − τ2) φ2(s21, r11^m).     (4.9)

If a type m receiver selects response r12^m, then it will receive a payoff of φ1(s11, r12^m) if the signaler selected action s11 (occurs w.p. τ1) and a payoff of φ2(s21, r12^m) if the signaler selected action s21 (occurs w.p. 1 − τ2). Then the payoff for selecting response r12^m is

g12^m(s, r, τ) = dτ1 φ1(s11, r12^m) + (1 − d)(1 − τ2) φ2(s21, r12^m).     (4.10)

Combining equations (4.9) and (4.10) yields an average payoff for a type m receiver when the signal received is State 1:

g1^m(s, r, τ, ρ) = ρ1^m g11^m(s, r, τ) + (1 − ρ1^m) g12^m(s, r, τ).     (4.11)

Combining equations (4.9) and (4.11), we can write the replicator equation for response r11^m as

ρ̇1^m = ρ1^m [g11^m(s, r, τ) − g1^m(s, r, τ, ρ)]
     = ρ1^m (1 − ρ1^m) [g11^m(s, r, τ) − g12^m(s, r, τ)]
     = ρ1^m (1 − ρ1^m) [dτ1 φ1(s11, r11^m) + (1 − d)(1 − τ2) φ2(s21, r11^m) − dτ1 φ1(s11, r12^m) − (1 − d)(1 − τ2) φ2(s21, r12^m)].     (4.12)

Finally, suppose State 2 is the signal received. If a type m receiver selects response r22^m, then it will receive a payoff of φ2(s22, r22^m) if the signaler selected action s22 (occurs w.p. τ2) and a payoff of φ1(s12, r22^m) if the signaler selected action s12 (occurs w.p. 1 − τ1). Then the payoff for playing response r22^m is given by

g22^m(s, r, τ) = (1 − d)τ2 φ2(s22, r22^m) + d(1 − τ1) φ1(s12, r22^m).     (4.13)

The payoff for response r21^m can be written as

g21^m(s, r, τ) = (1 − d)τ2 φ2(s22, r21^m) + d(1 − τ1) φ1(s12, r21^m).     (4.14)

Combining equations (4.13) and (4.14) allows us to calculate the average payoff when the receiver was given a signal of State 2:

g2^m(s, r, τ, ρ) = ρ2^m g22^m(s, r, τ) + (1 − ρ2^m) g21^m(s, r, τ).     (4.15)

The replicator equation for response r22^m can be found by combining equations (4.13) and (4.15):

ρ̇2^m = ρ2^m [g22^m(s, r, τ) − g2^m(s, r, τ, ρ)]
     = ρ2^m (1 − ρ2^m) [g22^m(s, r, τ) − g21^m(s, r, τ)]
     = ρ2^m (1 − ρ2^m) [(1 − d)τ2 φ2(s22, r22^m) + d(1 − τ1) φ1(s12, r22^m) − (1 − d)τ2 φ2(s22, r21^m) − d(1 − τ1) φ1(s12, r21^m)].     (4.16)

The computation above results in a dynamical system with 2 + 2n variables: τ1, τ2, and ρ1^m, ρ2^m for each receiver type m. The system is given by equations (4.4), (4.8), (4.12), and (4.16). We summarize the dynamical system in Table 4.2.

Table 4.2: Dynamical system for the two-state n-receiver model

τ̇1 = τ1(1 − τ1) [∑_{i=1}^{n} pi (ρ1^i ψ1(s11, r11^i) + (1 − ρ1^i) ψ1(s11, r12^i)) − ∑_{i=1}^{n} pi (ρ2^i ψ1(s12, r22^i) + (1 − ρ2^i) ψ1(s12, r21^i))]
τ̇2 = τ2(1 − τ2) [∑_{i=1}^{n} pi (ρ2^i ψ2(s22, r22^i) + (1 − ρ2^i) ψ2(s22, r21^i)) − ∑_{i=1}^{n} pi (ρ1^i ψ2(s21, r11^i) + (1 − ρ1^i) ψ2(s21, r12^i))]
ρ̇1^m = ρ1^m (1 − ρ1^m) [dτ1 φ1(s11, r11^m) + (1 − d)(1 − τ2) φ2(s21, r11^m) − dτ1 φ1(s11, r12^m) − (1 − d)(1 − τ2) φ2(s21, r12^m)]
ρ̇2^m = ρ2^m (1 − ρ2^m) [(1 − d)τ2 φ2(s22, r22^m) + d(1 − τ1) φ1(s12, r22^m) − (1 − d)τ2 φ2(s22, r21^m) − d(1 − τ1) φ1(s12, r21^m)]
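The table translates directly into code. The sketch below builds the right-hand side of the 2 + 2n dimensional system, assuming the payoffs are supplied as functions psi(i, j, k, m) and phi(i, j, k, m) standing for ψi(sij, rjk^m) and φi(sij, rjk^m); this interface is ours and purely illustrative.

def make_rhs(n, p, d, psi, phi):
    # State vector y = [tau1, tau2, rho1^1..rho1^n, rho2^1..rho2^n].
    def rhs(time, y):
        tau1, tau2 = y[0], y[1]
        rho1, rho2 = y[2:2 + n], y[2 + n:2 + 2*n]
        # Signaler: payoff differences averaged over receiver types.
        dtau1 = sum(p[i]*(rho1[i]*psi(1, 1, 1, i) + (1 - rho1[i])*psi(1, 1, 2, i)
                          - rho2[i]*psi(1, 2, 2, i) - (1 - rho2[i])*psi(1, 2, 1, i))
                    for i in range(n)) * tau1*(1 - tau1)
        dtau2 = sum(p[i]*(rho2[i]*psi(2, 2, 2, i) + (1 - rho2[i])*psi(2, 2, 1, i)
                          - rho1[i]*psi(2, 1, 1, i) - (1 - rho1[i])*psi(2, 1, 2, i))
                    for i in range(n)) * tau2*(1 - tau2)
        # Each receiver type m updates independently of the other types.
        drho1 = [(d*tau1*(phi(1, 1, 1, m) - phi(1, 1, 2, m))
                  + (1 - d)*(1 - tau2)*(phi(2, 1, 1, m) - phi(2, 1, 2, m)))
                 * rho1[m]*(1 - rho1[m]) for m in range(n)]
        drho2 = [((1 - d)*tau2*(phi(2, 2, 2, m) - phi(2, 2, 1, m))
                  + d*(1 - tau1)*(phi(1, 2, 2, m) - phi(1, 2, 1, m)))
                 * rho2[m]*(1 - rho2[m]) for m in range(n)]
        return [dtau1, dtau2, *drho1, *drho2]
    return rhs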

A similar argument to the one discussed on page 37 shows that the learning dynamics equal the replicator equations for this model. This is because receivers' strategies do not depend on each other: receivers interact only with the signaler, not with each other. Also, the signaler is unaware of the type of receiver in the interaction; therefore, the signaler is not able to alter its strategy based on its interaction partner.

This model differs from Rowell et al.’s model described on page 17 in a few aspects. First, we use two dynamical equations to represent the strategy frequencies for the signaler. Two dynamical equations are used because we wanted to construct a signaling system where the signaler’s choices were more complex than truth/deceit, and the construction adopted above provides a framework to do so. In this model, State 1 and State 2 can represent a system of truth/deception or it may simply offer two different signals that the signaler may adopt in the given situation. For example, signaling State 1 versus State 2 could indicate one signal for a ground predator and another for an aerial predator [34].

Second, Rowell et al.’s approach entangles payoffs when there are two types of receivers. Their model establishes a payoff matrix by determining whether the signaler sent a truthful signal or not, whether a type-1 receiver believed the signal or not, and whether a type-2 receiver believed the signal or not. In particular, payoffs for a type-1 receiver directly depend upon whether the type-2 receiver believed the signal (or not) and vice versa. Therefore, their model assumes three player interactions in the signaling game, which is an assumption unstated by the authors. The model shown above can be used to model a communication system with multiple types of receivers but the receivers do not interact with each other. The signaler’s payoff depends upon the actions of all receivers, the receivers’ payoff depends upon the action of the signaler, but the payoffs for distinct receivers do not directly depend upon each other, and so our model assumes two player interactions where one player is assigned the role of the signaler and the other player is the receiver.

In the Appendix we extend our model to allow n states, so that more than two signals are allowed. Each player has n(n − 1) strategies, totaling 2n(n − 1) strategies (see page 75).

For the remainder of this chapter, we set n = 2 and apply the general form of the model shown in Section 4.1 to an application where individuals are able to send false information about their resource holding potential (RHP), or fighting quality. This model is an extension of Rowell et al.'s "Bluffing by territory holders" model described on page 23. Recall that their model assumes two types of

56 individuals in the population, strong and weak, that engage in conflict over a territory. The owner of the territory signals its strength, and then a weak individual chooses to challenge or respect the signal.

In their model, strong individuals always defeat weak individuals, and so the outcome of an interaction is based solely on whether signalers send true/false signals and receivers believe/disbelieve the signal. No signal is a perfect predictor of the outcome of a fight between unequally matched opponents, because chance factors in defeating an opponent can affect the signaling interaction as well as a player's probability of engaging in a fight [41]. Therefore, we extend Rowell et al.'s model by creating a model where unequally matched opponents have a chance factor of winning the battle.

We assume strong individuals will defeat weak individuals with probability γ, where 0.5 ≤ γ ≤ 1.

A second difference we address is that we consider two player interactions. The signaler is paired with a strong or weak receiver, where the probability of being paired with either type is dependent upon the composition of the population.

Our main result is that weak receivers disbelieve the signal for most values of γ (for 0.5 ≤ γ ⪅ 0.965). We believe this result holds because of an assumption we make on fighting costs: we assume it is less costly for a weak individual to fight a strong individual than for a weak individual to fight another weak individual. We suggest this may be the cause of the overly aggressive behavior that is often exhibited by smaller or weaker males.

4.2 Deception in RHP

Suppose we have two types of individuals in the population, strong and weak. Players are given partial information: each player knows its own strength but does not know the strength of its opponent. We shall assume the signaler is a territory owner and the receiver is another individual passing by the territory who is interested in possessing it. Both players may select to play Hawk and fight for the territory or play Dove and forgo the territory.

The signaler sends a message indicating his strength, and so a signal is intended to say, “I am strong,” or “I am weak.” After interpreting the message, the receiver chooses to believe/disbelieve

57 the signal.

One biological example of this model occurs in the fiddler crab, Uca annulipes. The fiddler crab has one claw that is much larger and stronger, which is used for fighting to gain access to a territory.

If a fiddler crab loses a claw, then a new claw will grow back. The new claw is weaker, but less costly to display. The original claw is called a brachychelous claw and a regrown claw is called a leptochelous claw. Let crabs with a brachychelous claw represent strong individuals, or type-L

(for large) individuals, and crabs with a leptochelous claw represent weak individuals, or type-W individuals.

4.2.1 The Model

Suppose that a fraction d of the population is type-L and the remaining fraction (1 − d) is type-W. We assume that a signal is used to indicate the player’s type, and so signals indicate the strength of the signaler. We have four possible states that may exist; both players may be either strong or weak. Let Gij denote the game structure where the signaler is a Type-i player and the receiver is a Type-j player. The four possible combinations are shown in the table below.

Table 4.3: Possible states in the deceptive strength game.

Signaler’s Type Receiver’s Type The State

LL GLL L W GLW WL GWL W W GWW

The game structures GLL and GWW are the Hawk-Dove game when players have equal strength and the game structures GLW and GWL are the Hawk-Dove game when players have unequal strength. Players are competing over a resource with value V . We will assume that a strong individual defeats a weak individual with probability γ, where 0.5 ≤ γ ≤ 1.

If both players select Hawk, then a battle ensues. If both players are strong, then the loser incurs a cost cL. We assume that players of equal strength are equally likely to win the battle, and so each player receives an expected payoff of (V − cL)/2. Likewise, if both players are weak, then the loser will incur a cost cW, and each player receives an expected payoff of (V − cW)/2. If one player is strong and the other is weak, a strong loser will incur a cost ĉL and a weak loser will incur a cost ĉW. We will assume that it is less costly for a strong individual to lose against a weak individual than it is for a weak individual to lose against a strong individual, and so ĉL ≤ ĉW. Assume that a strong individual defeats a weak individual with probability γ. Then the expected payoff for a strong individual when fighting against a weak individual is γV − (1 − γ)ĉL. Similarly, the expected payoff for a weak individual against a strong individual is (1 − γ)V − γĉW.

Assume V < min(cL, cW). We summarize the parameters in Table 4.4.

Table 4.4: List of values for the deceptive strength game

Parameter    Interpretation                                           Constraints
V            Value of the resource                                    V < min(cL, cW)
cL, cW       Cost of fighting when players are of equal strength
ĉL, ĉW       Cost of fighting when players are of unequal strength    ĉL ≤ ĉW ≤ min(cW, cL)
d            Fraction of Large individuals                            0 ≤ d ≤ 1

We have four possible game structures depending on the types of players engaged in the game: GLL, GLW, GWL, and GWW. The games GLL and GWW represent the standard two-player Hawk-Dove game as defined by Maynard Smith and Price [29], while GLW and GWL represent the asymmetric Hawk-Dove game, in which a strong individual defeats a weak individual with probability γ. Players' payoffs for selecting Hawk or Dove in each of the four possible environmental states are given in Table 4.5.

Table 4.5: Possible states of the deception in RHP game

GLL:          Hawk                                      Dove
   Hawk       ((V − cL)/2, (V − cL)/2)                  (V, 0)
   Dove       (0, V)                                    (V/2, V/2)

GLW:          Hawk                                      Dove
   Hawk       (γV − (1 − γ)ĉL, (1 − γ)V − γĉW)          (V, 0)
   Dove       (0, V)                                    (V/2, V/2)

GWL:          Hawk                                      Dove
   Hawk       ((1 − γ)V − γĉW, γV − (1 − γ)ĉL)          (V, 0)
   Dove       (0, V)                                    (V/2, V/2)

GWW:          Hawk                                      Dove
   Hawk       ((V − cW)/2, (V − cW)/2)                  (V, 0)
   Dove       (0, V)                                    (V/2, V/2)

The games GLL and GWW represent interaction games between players of equal strength. The games GLW and GWL represent interaction games where players are of unequal strength and a strong individual defeats a weak individual with probability γ.

Let (x*ij, y*ij) be the mixed-strategy Nash equilibrium for the game Gij. Finding the mixed-strategy equilibria of the payoff matrices given above is a straightforward calculation, and we omit the work. See Section 2.1.1 for an example of how to calculate mixed-strategy equilibria. We summarize the equilibrium points in Table 4.6, where the values in the table correspond to the fraction of time an individual will play Hawk.

Table 4.6: Nash mixed-strategy equilibria of the deception in RHP game

x*LL = y*LL = V/cL
x*LW = y*WL = V/(2γ(ĉW + V) − V)
x*WL = y*LW = V/(V + 2(1 − γ)ĉL − 2γV)
x*WW = y*WW = V/cW
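The entries of Table 4.6 can be spot-checked numerically. In the sketch below, the helper solves the opponent-indifference condition of each 2x2 matrix from Table 4.5 directly; the helper and the parameter values are ours, assumed for illustration, and satisfy V < min(cL, cW) and ĉL ≤ ĉW.

V, cL, cW = 1.0, 3.0, 2.5
chatL, chatW, gamma = 1.5, 1.8, 0.55

def indifference_freq(hh, hd, dh, dd):
    # Opponent Hawk-frequency making a player with payoffs
    # (Hawk: hh, hd / Dove: dh, dd) indifferent between Hawk and Dove.
    return (dd - hd) / (hh - hd - dh + dd)

# Equal-strength games GLL and GWW: the classical V/c result.
print(indifference_freq((V - cL)/2, V, 0, V/2), '== V/cL =', V/cL)
print(indifference_freq((V - cW)/2, V, 0, V/2), '== V/cW =', V/cW)

# Unequal strength: the weak player's indifference pins down the strong
# player's Hawk frequency x*LW, and vice versa for x*WL.
x_LW = indifference_freq((1 - gamma)*V - gamma*chatW, V, 0, V/2)
x_WL = indifference_freq(gamma*V - (1 - gamma)*chatL, V, 0, V/2)
print(x_LW, '==', V/(2*gamma*(chatW + V) - V))
print(x_WL, '==', V/(V + 2*(1 - gamma)*chatL - 2*gamma*V))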

Since signalers are the territory owners, signalers do not wish to battle, because they already own the territory. Therefore we assume that a strong signaler will always signal strong. Since strong signalers do not signal weak, the only individuals signaling weak are weak individuals. This implies a receiver will know the signaler is weak if the signal is weak. The strong signaler's role in the signaling game is not strategic, because the same signal is sent in any interaction between a signaler and receiver. However, the weak signaler's role is strategic: the weak signaler can send a true signal and admit its weakness, or it may send a deceptive signal stating it is strong. Both the strong and weak receivers' roles are strategic, because any receiver must choose to believe or disbelieve a signal indicating its opponent is strong.

Let τ be the frequency at which a weak individual sends a signal of weak. Let ρ and υ be the frequencies at which a strong and a weak receiver, respectively, believe that their opponent is strong when the signal was strong. Let sij be the signaler's action in State i when State j was signaled. Let rjk and ujk be the strong and weak receivers' responses, respectively, when State j was signaled and State k is believed.

4.2.2 Strategies Within the Signaling System

In this model, players are engaged in the Hawk-Dove game with unequal strength, and players are unaware of the strength of their opponent. However, the receiver can presume the signaler's strength from pre-confrontation signaling. Recall from page 30 that we assume a player's strategy in the confrontation interaction depends upon pre-confrontation signaling. Thus, sij, rjk, and ujk are strategies in the Hawk-Dove game, where a player's elected strategy depends on its interpretation of its opponent's strength.

After interpreting the signal, receivers will develop a theory of the signaler's strength, and so the receiver has a biased perception of what the actual state of the environment is. We assume that the receiver will act optimally based on its perception of its opponent's strength. Therefore, we assume that receivers will choose the mixed-strategy Nash equilibrium value (MSE) that corresponds to their perceived state. Also, since the signaler has no information about the receiver's actual strength, we assume signalers select a strategy based on the composition of the population and whether the signal was true or not.

Suppose that the receiver is strong, which means the receiver knows the state is either G_LL or G_WL. The strong receiver believes the state is G_LL only if the signal received was strong and the signal was believed (response r_LL). We therefore set r_LL equal to the MSE of G_LL: r_LL = y*_LL. The receiver will believe the state is G_WL in either one of two cases:

1. The signal received was weak and the signal was believed (response r_WW), or

2. The signal received was strong and the signal was not believed (response r_LW).

In either case the strong receiver believes its opponent is weak, assumes the actual state is G_WL, and plays the MSE of G_WL: r_WW = r_LW = y*_WL.

Now suppose that the receiver is weak, so the receiver knows the state is either G_LW or G_WW. The receiver believes the actual state is G_LW when it believes the signaler is strong, which occurs precisely when the signal is strong and the receiver believes the signal (response u_LL). We set the corresponding action equal to the MSE of G_LW: u_LL = y*_LW. The receiver will believe the game is G_WW in either one of two cases:

1. The signal received was weak and the signal was believed (response u_WW), or

2. The signal received was strong and the signal was not believed (response u_LW).

In either case we set the corresponding action equal to the MSE of G_WW: u_WW = u_LW = y*_WW.

Next, we consider the signaler's set of possible actions, starting with the strong signaler.

The strong signaler knows the actual state is either G_LL or G_LW. Since signalers are given no information pertaining to the receiver's RHP, we assume the signaler adopts an MSE weighted by the composition of the population: the signaler assumes its opponent is strong (the state is G_LL) with probability d and weak (the state is G_LW) with probability 1 − d. The strong signaler will therefore play Hawk with probability

s_LL = d x*_LL + (1 − d) x*_LW.

Likewise, we assume that a weak individual signaling weak adopts an MSE weighted by the composition of the population. Since a weak signaler assumes the state is G_WL with probability d and G_WW with probability 1 − d, a weak individual signaling weak will play Hawk with probability

s_WW = d x*_WL + (1 − d) x*_WW.

Finally, we determine the action of a weak individual signaling strong, s_WL, which implies the signaler was deceptive. If the signaler chose a strategy based on its own strength, we would have s_WL = s_WW; if the signaler chose a strategy based on the strength of the signal emitted, we would have s_WL = s_LL. To account for both cases, we assume that a weak deceptive signaler adopts a linear combination of the action truthful to the actual environment and the action truthful to the signal:

s_WL = α s_WW + (1 − α) s_LL,   where 0 ≤ α ≤ 1.

In this sense α acts as a scaling parameter. A deceptive signaler acts as though the signal it sent were the actual state of the system when α = 0, and acts according to the Nash equilibrium of the actual state of the system when α = 1. Thus α is the fraction of time that a deceptive signaler acts according to its true (weak) state.

In summary, players' strategies are represented by the table below:

Table 4.7: Players' actions in the deception in RHP game

Action           Equilibrium strategy
r_LL             y*_LL
r_WW, r_LW       y*_WL
u_LL             y*_LW
u_WW, u_LW       y*_WW
s_LL             d x*_LL + (1 − d) x*_LW
s_WW             d x*_WL + (1 − d) x*_WW
s_WL             α s_WW + (1 − α) s_LL
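Table 4.7 translates directly into code. A brief sketch (the function name player_actions is hypothetical; the formulas follow Tables 4.6 and 4.7 as reconstructed above):

    def player_actions(V, gamma, cL, cW, chatL, chatW, d, alpha):
        # MSEs of Table 4.6 (y* values; recall x*_LW = y*_WL and x*_WL = y*_LW).
        y_LL, y_WW = V / cL, V / cW
        y_WL = V / (2 * gamma * (chatW + V) - V)
        y_LW = V / (V + 2 * (1 - gamma) * chatL - 2 * gamma * V)
        # Signaler actions average the MSEs over the population composition d.
        s_LL = d * y_LL + (1 - d) * y_WL   # x*_LL = y_LL, x*_LW = y_WL
        s_WW = d * y_LW + (1 - d) * y_WW   # x*_WL = y_LW, x*_WW = y_WW
        s_WL = alpha * s_WW + (1 - alpha) * s_LL
        return {"r_LL": y_LL, "r_WW": y_WL, "r_LW": y_WL,
                "u_LL": y_LW, "u_WW": y_WW, "u_LW": y_WW,
                "s_LL": s_LL, "s_WW": s_WW, "s_WL": s_WL}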

4.2.3 Creating the Payoff Functions

Let ψ_ij(s_ij, r_jk) be the payoff for the weak signaler when the state is G_ij, the signaler adopts action s_ij, and the receiver chooses response r_jk. Let φ_ij(s_ij, r_jk) and Υ_ij(s_ij, u_jk) be the payoffs for the strong and the weak receiver, respectively, when the signaler chooses action s_ij.

If the game is G_LL, then both players are strong. Since the signaler is strong, the signaler does not send a false signal. We define the payoff function for the receiver as the expected payoff based on the actions of each player: if the signaler chooses action s_ij and the strong receiver chooses response r_jk, then the payoff for the strong receiver is given by

φ_LL(s_ij, r_jk) = ((V − c_L)/2) s_ij r_jk + V (1 − s_ij) r_jk + (V/2)(1 − s_ij)(1 − r_jk),

which corresponds to the payoffs given by the payoff matrix for G_LL shown in equation (4.6).

If the game is G_LW, then the signaler is strong and the receiver is weak. If the strong signaler and weak receiver play Hawk with probabilities s_ij and u_jk, respectively, then the weak receiver's payoff may be represented by

Υ_LW(s_ij, u_jk) = ((1 − γ)V − γĉ_W) s_ij u_jk + V (1 − s_ij) u_jk + (V/2)(1 − s_ij)(1 − u_jk).

If the game is G_WL, then the signaler is weak and the receiver strong. If the signaler plays Hawk with probability s_ij and the receiver plays Hawk with probability r_jk, then the payoffs are given by

ψ_WL(s_ij, r_jk) = ((1 − γ)V − γĉ_W) s_ij r_jk + V s_ij (1 − r_jk) + (V/2)(1 − s_ij)(1 − r_jk),
φ_WL(s_ij, r_jk) = (γV − (1 − γ)ĉ_L) s_ij r_jk + V (1 − s_ij) r_jk + (V/2)(1 − s_ij)(1 − r_jk),

respectively. If the game is G_WW, then both the signaler and the receiver are weak, and we assume the Hawk-Dove game with equal strength. If the signaler plays Hawk with probability s_ij and the receiver plays Hawk with probability u_jk, then the payoffs for the signaler and receiver are given by

ψ_WW(s_ij, u_jk) = ((V − c_W)/2) s_ij u_jk + V s_ij (1 − u_jk) + (V/2)(1 − s_ij)(1 − u_jk),
Υ_WW(s_ij, u_jk) = ((V − c_W)/2) s_ij u_jk + V (1 − s_ij) u_jk + (V/2)(1 − s_ij)(1 − u_jk),

respectively. We conclude by summarizing all of the payoff functions in the table below:

Table 4.8: Payoffs in the deception in RHP game

ψ_WW(s_ij, u_jk) = ((V − c_W)/2) s_ij u_jk + V s_ij (1 − u_jk) + (V/2)(1 − s_ij)(1 − u_jk)
ψ_WL(s_ij, r_jk) = ((1 − γ)V − γĉ_W) s_ij r_jk + V s_ij (1 − r_jk) + (V/2)(1 − s_ij)(1 − r_jk)
φ_LL(s_ij, r_jk) = ((V − c_L)/2) s_ij r_jk + V (1 − s_ij) r_jk + (V/2)(1 − s_ij)(1 − r_jk)
φ_WL(s_ij, r_jk) = (γV − (1 − γ)ĉ_L) s_ij r_jk + V (1 − s_ij) r_jk + (V/2)(1 − s_ij)(1 − r_jk)
Υ_LW(s_ij, u_jk) = ((1 − γ)V − γĉ_W) s_ij u_jk + V (1 − s_ij) u_jk + (V/2)(1 − s_ij)(1 − u_jk)
Υ_WW(s_ij, u_jk) = ((V − c_W)/2) s_ij u_jk + V (1 − s_ij) u_jk + (V/2)(1 − s_ij)(1 − u_jk)

Payoffs when the signaler plays Hawk with probability s_ij, the strong receiver plays Hawk with probability r_jk, and the weak receiver plays Hawk with probability u_jk.
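As a quick sanity check, at pure strategies each payoff function reduces to the corresponding matrix entry. A minimal sketch for φ_LL in Python, with the Section 4.2.5 parameter values as hypothetical defaults:

    def phi_LL(s, r, V=0.5, cL=8.0):
        # Strong receiver's expected payoff in G_LL when the signaler plays
        # Hawk with probability s and the receiver with probability r.
        return 0.5 * (V - cL) * s * r + V * (1 - s) * r + 0.5 * V * (1 - s) * (1 - r)

    # Pure strategies recover the entries of the G_LL matrix:
    assert phi_LL(1, 1) == 0.5 * (0.5 - 8.0)   # Hawk vs Hawk: (V - cL)/2
    assert phi_LL(1, 0) == 0.0                 # receiver Dove vs Hawk: 0
    assert phi_LL(0, 1) == 0.5                 # receiver Hawk vs Dove: V
    assert phi_LL(0, 0) == 0.25                # Dove vs Dove: V/2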

4.2.4 The Dynamical System

Let s = ⟨s_WW, s_WL⟩, r = ⟨r_WW, r_LL, r_LW⟩, and u = ⟨u_WW, u_LL, u_LW⟩ be the action and response vectors for the signaler and for the two types of receiver, respectively. The weak signaler has two strategies, signal weak or signal strong, with corresponding actions s_WW and s_WL, which occur with frequencies τ and 1 − τ, respectively. Since a receiver always believes a weak signal, the payoff for adopting strategy s_WW equals the probability of facing a strong opponent times the payoff of facing a strong opponent, plus the probability of facing a weak opponent times the payoff of facing a weak opponent:

f_WW(s, r, u) = d ψ_WL(s_WW, r_WW) + (1 − d) ψ_WW(s_WW, u_WW). (4.17)

The payoff for adopting strategy s_WL involves four components: the receiver may be strong or weak, and the receiver may believe or disbelieve the signal. The receiver is strong with probability d and weak with probability 1 − d. If a strong receiver believes the signal, which occurs with frequency ρ, it plays Hawk with probability r_LL; if it disbelieves the signal (frequency 1 − ρ), it plays Hawk with probability r_LW. If a weak receiver believes the signal (frequency υ), it plays Hawk with probability u_LL; if it disbelieves the signal (frequency 1 − υ), it plays Hawk with probability u_LW. The payoff for a deceptive weak signaler can therefore be written as

f_WL(s, r, u, ρ, υ) = d (ρ ψ_WL(s_WL, r_LL) + (1 − ρ) ψ_WL(s_WL, r_LW)) + (1 − d)(υ ψ_WW(s_WL, u_LL) + (1 − υ) ψ_WW(s_WL, u_LW)). (4.18)

By combining equations (4.17) and (4.18), the average payoff for a weak signaler is

f(s, r, u, τ, ρ, υ) = τ f_WW(s, r, u) + (1 − τ) f_WL(s, r, u, ρ, υ).

We calculate ∂f/∂τ and arrive at the learning dynamic equation for the signaler:

τ̇ = [f_WW(s, r, u) − f_WL(s, r, u, ρ, υ)] τ(1 − τ)
  = [d ψ_WL(s_WW, r_WW) + (1 − d) ψ_WW(s_WW, u_WW)
    − d (ρ ψ_WL(s_WL, r_LL) + (1 − ρ) ψ_WL(s_WL, r_LW))
    − (1 − d)(υ ψ_WW(s_WL, u_LL) + (1 − υ) ψ_WW(s_WL, u_LW))] τ(1 − τ). (4.19)

The strong receiver has two strategies: believe or disbelieve the signal. If the strong receiver believes the signal, it plays Hawk with probability r_LL. Its opponent is either strong and honest, with frequency d (in which case it plays Hawk with probability s_LL), or weak and deceptive, with frequency (1 − d)(1 − τ) (in which case it plays Hawk with probability s_WL). The payoff for adopting strategy r_LL is therefore

g_LL(s, r, τ) = d φ_LL(s_LL, r_LL) + (1 − d)(1 − τ) φ_WL(s_WL, r_LL).

The payoff for adopting strategy r_LW can be calculated similarly:

g_LW(s, r, τ) = d φ_LL(s_LL, r_LW) + (1 − d)(1 − τ) φ_WL(s_WL, r_LW).

Then the average payoff for a strong receiver is

g(s, r, τ, ρ) = ρ g_LL(s, r, τ) + (1 − ρ) g_LW(s, r, τ).

We calculate ∂g/∂ρ and arrive at the learning dynamic equation for the strong receiver:

ρ̇ = [g_LL(s, r, τ) − g_LW(s, r, τ)] ρ(1 − ρ)
  = [d φ_LL(s_LL, r_LL) + (1 − d)(1 − τ) φ_WL(s_WL, r_LL)
    − d φ_LL(s_LL, r_LW) − (1 − d)(1 − τ) φ_WL(s_WL, r_LW)] ρ(1 − ρ). (4.20)

The weak receiver has two strategies: believe or disbelieve the signal. If the weak receiver believes the signal, it plays Hawk with probability u_LL. Its opponent is either (1) strong and honest, with frequency d (in which case it plays Hawk with probability s_LL), or (2) weak and deceptive, with frequency (1 − d)(1 − τ) (in which case it plays Hawk with probability s_WL). The payoff for adopting strategy u_LL is therefore

h_LL(s, u, τ) = d Υ_LW(s_LL, u_LL) + (1 − d)(1 − τ) Υ_WW(s_WL, u_LL).

The payoff for adopting strategy u_LW is calculated similarly:

h_LW(s, u, τ) = d Υ_LW(s_LL, u_LW) + (1 − d)(1 − τ) Υ_WW(s_WL, u_LW),

and so the average payoff for a weak receiver is

h(s, u, τ, υ) = υ h_LL(s, u, τ) + (1 − υ) h_LW(s, u, τ).

We calculate ∂h/∂υ and arrive at the learning dynamic equation for the weak receiver:

υ̇ = [h_LL(s, u, τ) − h_LW(s, u, τ)] υ(1 − υ)
  = [d Υ_LW(s_LL, u_LL) + (1 − d)(1 − τ) Υ_WW(s_WL, u_LL)
    − d Υ_LW(s_LL, u_LW) − (1 − d)(1 − τ) Υ_WW(s_WL, u_LW)] υ(1 − υ). (4.21)

Finally, we summarize by rewriting the three-dimensional dynamical system from equations (4.19), (4.20), and (4.21):

Table 4.9: Dynamical system for the deception in RHP game

τ̇ = [f_WW(s, r, u) − f_WL(s, r, u, ρ, υ)] τ(1 − τ)
  = [d ψ_WL(s_WW, r_WW) + (1 − d) ψ_WW(s_WW, u_WW)
    − d (ρ ψ_WL(s_WL, r_LL) + (1 − ρ) ψ_WL(s_WL, r_LW))
    − (1 − d)(υ ψ_WW(s_WL, u_LL) + (1 − υ) ψ_WW(s_WL, u_LW))] τ(1 − τ)

ρ̇ = [g_LL(s, r, τ) − g_LW(s, r, τ)] ρ(1 − ρ)
  = [d φ_LL(s_LL, r_LL) + (1 − d)(1 − τ) φ_WL(s_WL, r_LL)
    − d φ_LL(s_LL, r_LW) − (1 − d)(1 − τ) φ_WL(s_WL, r_LW)] ρ(1 − ρ)

υ̇ = [h_LL(s, u, τ) − h_LW(s, u, τ)] υ(1 − υ)
  = [d Υ_LW(s_LL, u_LL) + (1 − d)(1 − τ) Υ_WW(s_WL, u_LL)
    − d Υ_LW(s_LL, u_LW) − (1 − d)(1 − τ) Υ_WW(s_WL, u_LW)] υ(1 − υ)
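The system in Table 4.9 is straightforward to integrate numerically. The dissertation's own computations used Mathematica and XPPAUT (see Appendix B); what follows is an independent minimal sketch in Python built from the reconstructed formulas above, so the helper simulate, its defaults, and the clamping of the MSE are assumptions of this sketch rather than the original code:

    from scipy.integrate import solve_ivp

    # Parameter values from Section 4.2.5; gamma is passed in below.
    V, cL, cW, chatL, chatW, d, alpha = 0.5, 8.0, 8.0, 6.6, 6.6, 0.3, 0.1

    def simulate(gamma, y0=(0.5, 0.5, 0.5), T=2000.0):
        # Mixed-strategy equilibria (Table 4.6 as reconstructed above).
        y_LL, y_WW = V / cL, V / cW
        y_WL = V / (2 * gamma * (chatW + V) - V)
        y_LW = V / (V + 2 * (1 - gamma) * chatL - 2 * gamma * V)
        # For large gamma the interior MSE of G_WL ceases to exist; clamping
        # the Hawk frequency to [0, 1] is an assumption of this sketch.
        y_LW = min(max(y_LW, 0.0), 1.0)
        # Actions (Table 4.7).
        s_LL = d * y_LL + (1 - d) * y_WL
        s_WW = d * y_LW + (1 - d) * y_WW
        s_WL = alpha * s_WW + (1 - alpha) * s_LL
        r_LL, r_WW, r_LW = y_LL, y_WL, y_WL
        u_LL, u_WW, u_LW = y_LW, y_WW, y_WW
        # Payoff functions (Table 4.8).
        psi_WW = lambda s, u: 0.5*(V - cW)*s*u + V*s*(1 - u) + 0.5*V*(1 - s)*(1 - u)
        psi_WL = lambda s, r: ((1 - gamma)*V - gamma*chatW)*s*r + V*s*(1 - r) + 0.5*V*(1 - s)*(1 - r)
        phi_LL = lambda s, r: 0.5*(V - cL)*s*r + V*(1 - s)*r + 0.5*V*(1 - s)*(1 - r)
        phi_WL = lambda s, r: (gamma*V - (1 - gamma)*chatL)*s*r + V*(1 - s)*r + 0.5*V*(1 - s)*(1 - r)
        ups_LW = lambda s, u: ((1 - gamma)*V - gamma*chatW)*s*u + V*(1 - s)*u + 0.5*V*(1 - s)*(1 - u)
        ups_WW = lambda s, u: 0.5*(V - cW)*s*u + V*(1 - s)*u + 0.5*V*(1 - s)*(1 - u)

        def rhs(t, y):
            tau, rho, ups = y
            f_WW = d*psi_WL(s_WW, r_WW) + (1 - d)*psi_WW(s_WW, u_WW)
            f_WL = (d*(rho*psi_WL(s_WL, r_LL) + (1 - rho)*psi_WL(s_WL, r_LW))
                    + (1 - d)*(ups*psi_WW(s_WL, u_LL) + (1 - ups)*psi_WW(s_WL, u_LW)))
            g_LL = d*phi_LL(s_LL, r_LL) + (1 - d)*(1 - tau)*phi_WL(s_WL, r_LL)
            g_LW = d*phi_LL(s_LL, r_LW) + (1 - d)*(1 - tau)*phi_WL(s_WL, r_LW)
            h_LL = d*ups_LW(s_LL, u_LL) + (1 - d)*(1 - tau)*ups_WW(s_WL, u_LL)
            h_LW = d*ups_LW(s_LL, u_LW) + (1 - d)*(1 - tau)*ups_WW(s_WL, u_LW)
            return [(f_WW - f_WL)*tau*(1 - tau),
                    (g_LL - g_LW)*rho*(1 - rho),
                    (h_LL - h_LW)*ups*(1 - ups)]

        return solve_ivp(rhs, (0.0, T), list(y0)).y[:, -1]

    print(simulate(0.8))  # approximate long-run (tau, rho, ups) for one gamma

Varying the initial point y0 at fixed γ sketches the basins of attraction suggested by figure 4.2.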

4.2.5 Results

We explore how different values of γ affect the signaling system. When γ = 0.5, "strong" and "weak" individuals are in fact equal in resource holding potential; as γ increases to 1, the asymmetry in strength increases, that is, a strong individual defeats a weak individual with higher frequency. We fix all parameters except γ and allow γ to range between 0.5 and 1. The graphics in figure 4.2 summarize our findings.
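This sweep over γ can be scripted directly; a brief sketch reusing the simulate helper from the listing after Table 4.9:

    import numpy as np

    # Coarse sweep; where the dynamics cycle (e.g. panel (b) of figure 4.2)
    # the final state depends on T and on the initial point y0.
    for g in np.linspace(0.5, 1.0, 26):
        tau, rho, ups = simulate(g)
        print("gamma=%.2f  tau=%.2f  rho=%.2f  ups=%.2f" % (g, tau, rho, ups))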

For 0.5 ≤ γ ≲ 0.669, stable fixed points occur at the corners (τ, ρ, υ) = (0, 1, 0) and (1, 1, 0). Thus for small γ the strong receiver always believes, the weak receiver always disbelieves, and the signaler plays a pure strategy of either always telling the truth or always deceiving. When 0.669 ≲ γ ≲ 0.722, we get an infinite family of stable limit cycles on the face υ = 0, together with a stable fixed point at (1, 1, 0): the weak signaler always signals true, a strong receiver always believes, and a weak receiver never believes. In this system the weak signaler is either able to (1) signal falsely a portion of the time, deceiving a strong receiver a portion of the time, or (2) always send a truthful signal that a strong receiver always believes and a weak receiver never believes. This occurs because we assume it is less costly for a weak individual to fight a strong individual than to fight a weak individual, and so it is more beneficial for a weak receiver to take the risk of fighting a strong signaler than to forfeit the resource to a deceptive weak signaler.

When 0.722 ≲ γ ≲ 0.965, equilibria occur (1) where the signaler always signals falsely and neither receiver believes the signal, and (2) where the signaler always signals true, a strong receiver believes, and a weak receiver does not believe. Lastly, when 0.973 ≲ γ ≤ 1, we have two stable equilibrium points: one at (1, 1, 1), complete honesty and complete belief, and the other on the face ρ = 0. The graph shown below displays the change in τ and υ as γ increases from 0.973 to 1.

[Figure 4.3: Stable interior fixed points for 0.973 ≲ γ ≤ 1. The plot shows the equilibrium values of τ and υ (vertical axis) against γ (horizontal axis, 0.975 to 1.000).]

As γ increases from 0.973 to 1, τ and υ decrease. In this case the strong receiver always disbelieves, but the weak receiver believes the signal a small portion of the time (0.15 ≲ υ ≲ 0.25). The weak signaler's strategy changes drastically: when γ ≈ 0.973 the signal is true with frequency τ ≈ 0.75, and as γ approaches 1 this frequency drops roughly linearly to τ ≈ 0.2. The weak receiver can be fooled because strong individuals defeat weak individuals virtually all the time: if a weak receiver misjudges a signal of strong, the error is very costly, because it will almost surely lose the battle against its opponent.

Even though mixed strategies are allowed, we find pure Nash equilibria throughout most of the dynamics. One exception occurs in figure 4.2(b), where an infinite family of stable limit cycles occurs on the face υ = 0. This happens because the weak receiver's strategy converges to 0 very quickly relative to the weak signaler's and strong receiver's strategies. After the weak receiver's strategy converges, we are left with a two-dimensional system in which truth/deceive and believe/disbelieve strategies chase each other, similar to Rowell's two-dimensional model shown in Figure 2.2. In this system the weak receiver always disbelieves. The weak signaler signals falsely a portion of the time, which causes the strong receiver to disbelieve the signal. When the strong receiver disbelieves with high enough frequency, the weak signaler begins to tell the truth. Since the weak signaler is now truthful, the strong receiver starts believing the signal, which causes the frequency of false signals to increase again. Thus a fraction of deceptive signals is possible without complete breakdown, but the system does not converge to a stable fixed point.

It is interesting that for γ ≲ 0.965 a weak receiver always disbelieves the signal. This follows from the assumption ĉ_L, ĉ_W ≤ min(c_W, c_L), that is, it is more costly for an individual to fight an opponent of equal strength than one of unequal strength. This is a common assumption in game-theoretic models, and it yields an interesting result here: the weak receiver ignores the signal and always plays as though its opponent is weak. This implies that it is more beneficial for a weak receiver to take the risk of fighting a strong signaler, which may incur a cost ĉ_W, than to be deceived by a weak signaler and forfeit the resource. This causes weak receivers to become more aggressive for 0.5 ≤ γ ≲ 0.965. When 0.965 ≲ γ ≤ 1, a weak receiver fighting a strong signaler will be defeated almost all the time, and so the weak receiver always believes the signal, which makes it less aggressive. This model therefore provides one reason why weak individuals may be overly aggressive, that is, opt to play Hawk with greater frequency than the mixed Nash equilibrium of the Hawk-Dove game. Overly aggressive behavior by weaker individuals is also known as the Napoleon complex and has been observed in mollusks [49], sea anemones [3], fishes [2, 9, 45], hermit crabs [4, 24], and swordtail fish [23]. Other explanations for the Napoleon complex include asymmetries in the value of the resource, misperception by a weak individual of its actual fighting ability, and the Desperado Effect, which occurs when weak individuals have no alternative strategy for obtaining the resource [22].

[Figure 4.2: Deception in RHP equilibria. Five (τ, ρ, υ) phase portraits, panels (a) through (e), marking stable and unstable fixed points. Here τ is the frequency with which a weak signaler sends a truthful (weak) signal, ρ is the frequency with which a strong receiver believes the signal, and υ is the frequency with which a weak receiver believes the signal. Parameter values: α = 0.1, d = 0.3, V = 0.5, c_L = c_W = 8, ĉ_L = ĉ_W = 6.6. Panel (a): 0.5 ≤ γ ≲ 0.669; (b): 0.669 ≲ γ ≲ 0.722; (c): 0.722 ≲ γ ≲ 0.965; (d): 0.965 ≲ γ ≲ 0.973; (e): 0.973 ≲ γ ≤ 1.]

APPENDIX A

N-STATE GAMES

The two-state model works well for modeling deception in animal signaling systems: the signaler and receiver are each given a pair of choices, either signal true or false and believe or disbelieve. However, a more elaborate communication system may require more than two states to signal, or more than two possible interpretations of a signal. Here we create a model in which the signaler and receiver select from n possible signals.

In this model, players may differ in their interpretations of the state of the environment. The signaler has privileged information about the state of the environment and can attempt to bias the perception of the opposing player (the receiver). After receiving the message, the receiver interprets it and selects an appropriate action.

Suppose the environment can exist in one of n possible states, represented by G_1, G_2, ..., G_n, and that State G_i occurs with probability d_i, i = 1, 2, ..., n. The environment must exist in some state, and so Σ_{i=1}^{n} d_i = 1. The signaler observes the environment and selects a signal, which may be any of the possible states G_1, G_2, ..., G_n. Let s_ij be the action of signaling G_j when the observed state is G_i, and let τ_ij be the corresponding frequency of action s_ij. Given any state G_i, the frequencies of all possible signals must sum to one, so Σ_{j=1}^{n} τ_ij = 1.

If the receiver is given a signal of G_j, the receiver may assume the environment is in any of the possible states G_1, G_2, ..., G_n. Let ρ_jk be the frequency with which the receiver assumes the environment is G_k given a signal of G_j, and let r_jk be the response of believing that the state is G_k when State G_j is signaled. Given any signal G_j, the frequencies of all possible interpretations must sum to one, so Σ_{k=1}^{n} ρ_jk = 1.

[Figure A.1: The n-state model. The diagram lists the possible states G_1, ..., G_n, the signaler's actions s_i1, ..., s_in, and the receiver's responses r_j1, ..., r_jn; G_i denotes the observed state, s_ij the signaler's action, and r_jk the receiver's response.]

The signaler observes the state of the environment, where G_1, G_2, ..., G_n are the possible states. Given State G_i, the signaler may select any action s_i1, s_i2, ..., s_in; it selects action s_ij with frequency τ_ij. The receiver observes signal G_j and may select any possible response r_jk, with frequency ρ_jk.

Define ψ_i(s_ij, r_jk) and φ_i(s_ij, r_jk) to be the payoffs for Player 1 (the signaler) and Player 2 (the receiver), respectively, when the actual state is G_i, Player 1 signals G_j, and Player 2 believes the state is G_k.

A.1 The Replicator Equations

In this section, we apply the replicator equation to the model shown in figure A.1. Recall from page 14 that the general form of the replicator equation can be written as

ẋ_i = x_i [f_i(x) − f̄(x)],

where x_i is the frequency of a given strategy with index i, x = ⟨x_1, x_2, ..., x_n⟩ is the frequency vector, f_i(x) is the payoff for adopting the strategy with index i, and f̄(x) is the average payoff of the population.

When we apply this general form to our model, we derive a set of differential equations for the frequencies of signals, τ_ij, and the frequencies of responses, ρ_jk, where i, j, k ∈ {1, 2, ..., n}. The signaling game allows n possible signals for each of the n states, giving n(n − 1) independent strategy frequencies for the signaler; similarly, the receiver interprets each of n possible signals with n possible actions, giving another n(n − 1) independent frequencies. This yields a total of 2n(n − 1) equations in the n-state model. The complexity of the signaling system may often be reduced (see Sections 3.2 and 4.2 for examples).

We summarize all variables in the table below:

Table A.1: Variables in the n-state signaling game

Signaler:
  s_ij: action of signaling State G_j when the environment is G_i
  τ_ij: frequency of action s_ij when the environment is G_i
  ψ_i: payoff function for the signaler in State G_i

Receiver:
  r_jk: response to signal G_j when State G_k is believed
  ρ_jk: frequency of response r_jk when the signal is G_j
  φ_i: payoff function for the receiver in State G_i

We begin by creating the replicator equation for the signaler. Suppose the actual environment is State G_i; we calculate τ̇_ij for arbitrary j. The payoff for action s_ij is determined by the receiver's response to the signal. If State G_j is signaled, then the receiver has response set {r_j1, r_j2, ..., r_jn}, where each response r_jk occurs with frequency ρ_jk. Then the payoff for action s_ij equals the payoff for each response r_jk times the frequency with which the receiver adopts that response:

f_ij(ρ) = ρ_j1 ψ_i(s_ij, r_j1) + ρ_j2 ψ_i(s_ij, r_j2) + ··· + ρ_jn ψ_i(s_ij, r_jn) = Σ_{k=1}^{n} ρ_jk ψ_i(s_ij, r_jk). (A.1)

The average payoff for the signaler given State G_i is the average over every action s_im, where i (the actual state) is fixed and the action index m ranges from 1 to n:

f̄_i(τ, ρ) = τ_i1 f_i1(ρ) + τ_i2 f_i2(ρ) + ··· + τ_in f_in(ρ) = Σ_{m=1}^{n} τ_im f_im(ρ) = Σ_{m=1}^{n} τ_im (Σ_{k=1}^{n} ρ_mk ψ_i(s_im, r_mk)). (A.2)

Combining equations (A.1) and (A.2), we can write the replicator equation for action s_ij as

τ̇_ij = τ_ij [f_ij(ρ) − f̄_i(τ, ρ)]
     = τ_ij [f_ij(ρ) − Σ_{m=1}^{n} τ_im f_im(ρ)]
     = τ_ij [Σ_{k=1}^{n} ρ_jk ψ_i(s_ij, r_jk) − Σ_{m=1}^{n} τ_im (Σ_{k=1}^{n} ρ_mk ψ_i(s_im, r_mk))]. (A.3)

Next, we create the replicator equation for the receiver. Suppose that State G_j is the signal received. The payoff for any given response r_jk depends upon the actual state of the environment, which is unknown to the receiver. The signaler could have adopted any action s_1j, s_2j, ..., s_nj, which occur with frequencies τ_1j, τ_2j, ..., τ_nj, respectively. Recall that, for any i, the probability that the environment is in State G_i is d_i, and so the payoff for response r_jk can be written as

g_jk(τ) = d_1 τ_1j φ_1(s_1j, r_jk) + d_2 τ_2j φ_2(s_2j, r_jk) + ··· + d_n τ_nj φ_n(s_nj, r_jk) = Σ_{i=1}^{n} d_i τ_ij φ_i(s_ij, r_jk). (A.4)

When the signal is G_j, the receiver could select any response r_j1, r_j2, ..., r_jn, which occur with frequencies ρ_j1, ρ_j2, ..., ρ_jn. Then the average payoff for the receiver is given by

ḡ_j(τ, ρ) = ρ_j1 g_j1(τ) + ρ_j2 g_j2(τ) + ··· + ρ_jn g_jn(τ) = Σ_{m=1}^{n} ρ_jm g_jm(τ). (A.5)

By combining equations (A.4) and (A.5), the replicator equation can be written as

ρ̇_jk = ρ_jk [g_jk(τ) − ḡ_j(τ, ρ)]
     = ρ_jk [g_jk(τ) − Σ_{m=1}^{n} ρ_jm g_jm(τ)]
     = ρ_jk [Σ_{i=1}^{n} d_i τ_ij φ_i(s_ij, r_jk) − Σ_{m=1}^{n} ρ_jm (Σ_{i=1}^{n} d_i τ_ij φ_i(s_ij, r_jm))]. (A.6)

We conclude this section by rewriting the replicator equations found in (A.3) and (A.6):

τ̇_ij = τ_ij [Σ_{k=1}^{n} ρ_jk ψ_i(s_ij, r_jk) − Σ_{m=1}^{n} τ_im (Σ_{k=1}^{n} ρ_mk ψ_i(s_im, r_mk))],
ρ̇_jk = ρ_jk [Σ_{i=1}^{n} d_i τ_ij φ_i(s_ij, r_jk) − Σ_{m=1}^{n} ρ_jm (Σ_{i=1}^{n} d_i τ_ij φ_i(s_ij, r_jm))].
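To make equations (A.3) and (A.6) concrete, here is a minimal vectorized Euler step in Python; this is an illustrative sketch, not code from the dissertation. The arrays psi[i, j, k] and phi[i, j, k] are hypothetical placeholders for ψ_i(s_ij, r_jk) and φ_i(s_ij, r_jk) in a particular model:

    import numpy as np

    def replicator_step(tau, rho, d, psi, phi, dt=0.01):
        # tau[i, j]: frequency of signaling G_j in state G_i (rows sum to 1).
        # rho[j, k]: frequency of believing G_k on receiving G_j (rows sum to 1).
        f = np.einsum('jk,ijk->ij', rho, psi)        # payoff of action s_ij, eq. (A.1)
        fbar = (tau * f).sum(axis=1, keepdims=True)  # average payoff, eq. (A.2)
        g = np.einsum('i,ij,ijk->jk', d, tau, phi)   # payoff of response r_jk, eq. (A.4)
        gbar = (rho * g).sum(axis=1, keepdims=True)  # average payoff, eq. (A.5)
        tau = tau + dt * tau * (f - fbar)            # eq. (A.3)
        rho = rho + dt * rho * (g - gbar)            # eq. (A.6)
        return tau, rho

Because each row of increments sums to zero, this Euler step preserves the simplex constraints Σ_j τ_ij = 1 and Σ_k ρ_jk = 1 up to floating-point error.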

A.2 The Learning Dynamics

Recall from page 14 that the learning dynamic equation can be written as

ẋ_i = (∂f/∂x_i) x_i (1 − x_i),

where x_i is the frequency of a given strategy with index i, x = ⟨x_1, x_2, ..., x_n⟩ is the frequency vector, and f(x) is a player's payoff function.

First, we find the learning dynamics equation for the signaler for an arbitrary action s_ij with corresponding frequency τ_ij. Given State G_i, the average payoff for the signaler is given by equation (A.2):

f̄_i(τ, ρ) = Σ_{m=1}^{n} τ_im f_im(ρ) = Σ_{m=1}^{n} τ_im (Σ_{k=1}^{n} ρ_mk ψ_i(s_im, r_mk)).

To find an equation for τ̇_ij, we must first calculate ∂f̄_i/∂τ_ij. Since Σ_{j=1}^{n} τ_ij = 1, for any m ≠ j we write τ_im = 1 − Σ_{l≠m} τ_il. Thus, for any m,

∂(τ_im f_im)/∂τ_ij = f_im if j = m, and −f_im if j ≠ m.

Therefore the partial derivative of f̄_i with respect to a given frequency τ_ij is

∂f̄_i/∂τ_ij = f_ij(ρ) − Σ_{m≠j} f_im(ρ),

and so we may write the learning dynamics equation as

τ̇_ij = τ_ij (1 − τ_ij) [f_ij(ρ) − Σ_{m≠j} f_im(ρ)]. (A.7)

Next we find the learning dynamics equation for strategy ρ_jk with corresponding response r_jk. Given a signal of State G_j, the average payoff for the receiver is given by equation (A.5):

ḡ_j(τ, ρ) = Σ_{m=1}^{n} ρ_jm g_jm(τ).

Since Σ_{k=1}^{n} ρ_jk = 1, for any m ≠ k we write ρ_jm = 1 − Σ_{l≠m} ρ_jl. Thus, for any m,

∂(ρ_jm g_jm)/∂ρ_jk = g_jm if k = m, and −g_jm if k ≠ m.

Therefore the partial derivative of ḡ_j with respect to a given frequency ρ_jk is

∂ḡ_j/∂ρ_jk = g_jk(τ) − Σ_{h≠k} g_jh(τ),

and so we may write the learning dynamics equation as

ρ̇_jk = ρ_jk (1 − ρ_jk) [g_jk(τ) − Σ_{h≠k} g_jh(τ)]. (A.8)

We summarize by rewriting τ̇_ij and ρ̇_jk, found in equations (A.7) and (A.8):

τ̇_ij = τ_ij (1 − τ_ij) [f_ij(ρ) − Σ_{m≠j} f_im(ρ)],
ρ̇_jk = ρ_jk (1 − ρ_jk) [g_jk(τ) − Σ_{h≠k} g_jh(τ)].
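The learning dynamics (A.7) and (A.8) admit the same vectorized treatment; a companion sketch to the replicator step above, under the same assumptions (hypothetical placeholder payoff arrays psi and phi):

    import numpy as np

    def learning_step(tau, rho, d, psi, phi, dt=0.01):
        # One Euler step of the learning dynamics (A.7) and (A.8).
        f = np.einsum('jk,ijk->ij', rho, psi)        # payoff of action s_ij
        g = np.einsum('i,ij,ijk->jk', d, tau, phi)   # payoff of response r_jk
        # f_ij - sum_{m != j} f_im = 2 f_ij - sum_m f_im, and likewise for g.
        tau = tau + dt * tau * (1 - tau) * (2 * f - f.sum(axis=1, keepdims=True))
        rho = rho + dt * rho * (1 - rho) * (2 * g - g.sum(axis=1, keepdims=True))
        return tau, rho

Unlike the replicator step, these increments need not sum to zero across a row, so the frequencies should be renormalized periodically to stay on the simplex.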

APPENDIX B

CODE

The stability analyses for the "Manipulative mimics" and "Deception in RHP" models were done using Mathematica. The graphics for the "Deception in RHP" model were produced by importing data from XPPAUT into MATLAB. The code can be downloaded via the following link:

https://dl.dropboxusercontent.com/u/18728677/Dissertation%20Code.zip

APPENDIX C

COPYRIGHT PERMISSION

Copyright approval for figure 3.2.

REFERENCES

[1] A. Avarguès-Weber, E. H. Dawson, and L. Chittka. Mechanisms of social learning across species boundaries. Journal of Zoology, 2013.

[2] G.W. Barlow, W. Rogers, and N. Fraley. Do Midas cichlids win through prowess or daring? It depends. Behavioral Ecology and Sociobiology, 19:1–8, 1986.

[3] R.C. Brace and J. Pavey. Size-dependent dominance hierarchy in the anemone Actinia equina. Nature, 273:752–753, 1978.

[4] B.M. Dowds and R.W. Elwood. Shell wars: Assessment strategies and the timing of decisions in hermit crab shell fights. Behaviour, 85:1–24, 1983.

[5] R. Earley. Social eavesdropping and the evolution of conditional cooperation and cheating strategies. Philosophical Transactions of the Royal Society B, 365:2675–2686, May 2010.

[6] J.A. Endler. Some general comments on the evolution and design of animal communication systems. Philosophical Transactions: Biological Sciences, 340(1292):215–225, May 1993.

[7] M. Enquist, P. Hurd, and S. Ghirlanda. Signaling. In D. Westneat and C. Fox, editors, Evolutionary Behavioral Ecology, chapter 16, pages 266–284. Oxford University Press, New York, 2010.

[8] M. Enquist and O. Leimar. Evolution of fighting behaviour: The effect of variation in resource value. Journal of Theoretical Biology, 127:187–205, 1987.

[9] M.H. Figler and D.M. Einhorn. The territorial prior residence effect in convict cichlids (Cichlasoma nigrofasciatum Günther): temporal aspects of establishment and retention and proximate mechanisms. Behaviour, 85:157–181, 1983.

[10] R. Gibbons. Game Theory for Applied Economists. Princeton University Press, Princeton, NJ, 1992.

[11] A. Grafen. Biological signals as handicaps. Journal of Theoretical Biology, 144:517–546, August 1990.

[12] A. Grafen and R. A. Johnstone. Why we need ESS signaling theory. Philosophical Transactions: Biological Sciences, 340(1292):245–250, May 1993.

[13] T. Guilford and M. Stamp Dawkins. Receiver psychology and the evolution of animal signals. Animal Behaviour, 42:1–14, 1991.

[14] J. Hofbauer and S. Huttegger. Feasibility of communication in binary signaling games. Journal of Theoretical Biology, 254:843–849, 2008.

[15] J. Hofbauer and K. Sigmund. The Theory of Evolution and Dynamical Systems: Mathematical Aspects of Selection. Cambridge University Press, Cambridge, 1988.

[16] J. Hofbauer and K. Sigmund. Evolutionary Games and Population Dynamics. Cambridge University Press, 1st edition, 1998.

[17] J. Hofbauer and K. Sigmund. Evolutionary game dynamics. Bulletin of the American Mathematical Society, 40(4):479–519, July 2003.

[18] S. Huttegger. Robustness in signaling games. Philosophy of Science, 74:839–847, December 2007.

[19] S. Huttegger. Signaling games: Dynamics of evolution and learning. Preprint, pages 1–18, 2008.

[20] S.M. Huttegger, B. Skyrms, R. Smead, and K.J. Zollman. Evolutionary dynamics of Lewis signaling games: Signaling systems vs. partial pooling. Synthese, 172(1):177–191, 2010.

[21] G. Jäger. Evolutionary stability conditions for signaling games with costly signals. Journal of Theoretical Biology, 253:131–141, 2008.

[22] W. Just and M.R. Morris. The Napoleon complex: why smaller males pick fights. Evolutionary Ecology, 17:509–522, 2003.

[23] W. Just, M. Wu, and J.P. Holt. How to evolve a Napoleon complex. In Proceedings of the 2000 Congress on Evolutionary Computation, volume 2, pages 851–856. IEEE, 2000.

[24] E.R. Keeley and J.W.A. Grant. Asymmetries in the expected value of food do not predict the outcome of contests between convict cichlids. Animal Behaviour, 45:1035–1037, 1993.

[25] J.R. Krebs and R. Dawkins. Animal signals: Mind reading and manipulation. In J.R. Krebs and N.B. Davies, editors, Behavioural Ecology: An Evolutionary Approach, chapter 15, pages 380–402. Blackwell, Oxford, UK, 1984.

[26] D. Lewis. Convention. Harvard University Press, Cambridge, 1969.

[27] J. Maynard Smith. Evolution and the Theory of Games. Cambridge University Press, Cambridge, 1982.

[28] J. Maynard Smith and D. Harper. Animal Signals. Oxford University Press, New York, 2003.

[29] J. Maynard Smith and G. Price. The logic of animal conflict. Nature, 246(2):15–18, 1973.

[30] D. McFarland. Dictionary of Animal Behaviour. Oxford University Press, New York, 2006.

[31] M. Mesterton-Gibbons. Game-theoretic modeling. In B. Kaldis, editor, Encyclopedia of Philosophy and the Social Sciences, pages 377–378. SAGE Publications, 2013.

[32] M. Norman, J. Finn, and T. Tregenza. Dynamic mimicry in an Indo-Malayan octopus. Proceedings of the Royal Society of London B, 268(1478):1755–1758, September 2001.

[33] G.A. Parker. Assessment strategy and the evolution of fighting behaviour. Journal of Theoretical Biology, 47:223–243, March 1974.

[34] S. Perry and J.H. Manson. Manipulative Monkeys: The Capuchins of Lomas Barbudal. Harvard University Press, Cambridge, Massachusetts, 2008.

[35] A. Ridley, M. Child, and M. Bell. Interspecific audience effects on the alarm-calling behaviour of a kleptoparasitic bird. Biology Letters, 3:589–591, 2007.

[36] J. Rowell, S. Ellner, and H.K. Reeve. Why animals lie: How dishonesty and belief can coexist in a signaling system. The American Naturalist, 168:180–204, 2006.

[37] W. Searcy and S. Nowicki. The Evolution of Animal Communication: Reliability and Deception in Signaling Systems. Princeton University Press, Princeton, New Jersey, 2005.

[38] B. Skyrms. Stability and explanatory significance of some simple evolutionary models. Philosophy of Science, 67:94–113, 2000.

[39] B. Skyrms. Signals: Evolution, Learning, and Information. Oxford University Press, New York, 2010.

[40] M. Stamp Dawkins. Are there general principles of signal design? Philosophical Transactions: Biological Sciences, 340(1292):251–255, May 1993.

[41] M. Stamp Dawkins and T. Guilford. The corruption of honest signalling. Animal Behaviour, 41:865–873, 1991.

[42] S. Strogatz. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering. Perseus Books, Cambridge, MA, 1994.

[43] S. Számadó. Long-term commitment promotes honest status signalling. Animal Behaviour, 82(2):295–302, 2011.

[44] P.D. Taylor and L.B. Jonker. Evolutionarily stable strategies and game dynamics. Mathematical Biosciences, 40:145–156, 1978.

[45] G.F. Turner and F. Huntingford. A problem for game theory analysis: assessment and intention in male mouthbrooder contests. Animal Behaviour, 34:961–970, 1986.

[46] A. Whiten and R. Byrne, editors. Machiavellian Intelligence II: Extensions and Evaluations. Cambridge University Press, Cambridge, 1997.

[47] R.H. Wiley. The evolution of communication: Information and manipulation. In T.R. Halliday and P.J.B. Slater, editors, Animal Behaviour 2: Communication, chapter 5, pages 156–189. Blackwell Scientific Publications, 1983.

[48] D.S. Wilson, D. Near, and R.R. Miller. Machiavellianism: A synthesis of the evolutionary and psychological literatures. Psychological Bulletin, 119(2):285–299, 1996.

[49] S. Zack. A description and analysis of agonistic behaviour patterns in an opisthobranch mollusk, Hermissenda crassicornis. Behaviour, 5:238–267, 1975.

[50] A. Zahavi and A. Zahavi. The Handicap Principle: A Missing Piece of Darwin's Puzzle. Oxford University Press, New York, 1997.

[51] J. Zhuang, V. Bier, and O. Alagoz. Modeling secrecy and deception in a multiple-period attacker-defender signaling game. European Journal of Operational Research, 203(2):409–418, 2010.

BIOGRAPHICAL SKETCH

Candace Ohm was born and raised in a small southern Michigan town. She graduated from Addison High School in 2003, then attended Siena Heights University, graduating cum laude with a Bachelor of Arts in mathematics in 2006. She began her graduate career at Bowling Green State University, where she earned a Master of Arts in mathematics, and then attended Florida State University, where she received her Ph.D. under the guidance of Dr. Mike Mesterton-Gibbons in 2013.