<<

Recap Example Pareto Optimality and

Game intro

CPSC 532A Lecture 3

Game Theory intro CPSC 532A Lecture 3, Slide 1 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Lecture Overview

1 Recap

2 Example Matrix Games

3 Pareto Optimality

4 Best Response and Nash Equilibrium

Game Theory intro CPSC 532A Lecture 3, Slide 2 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Defining Games

Finite, n-person game: N, A, u : h i N is a finite set of n players, indexed by i A = A ... A , where A is the action set for player i 1 × × n i (a1, . . . , an) ∈ A is an action profile, and so A is the space of action profiles

u = u1, . . . , un , a function for each player, where u : Ah R i i 7→

Writing a 2-player game as a matrix: row player is player 1, column player is player 2 0 rows are actions a A1, columns are a A2 cells are outcomes,∈ written as a tuple of∈ utility values for each player

Game Theory intro CPSC 532A Lecture 3, Slide 3 56 3 and Coordination: Normal form games

when congestion occurs. You have two possible : C (for using a Correct implementation) and D (for using a Defective one). If both you and your colleague Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium adopt C then your average packet delay is 1ms (millisecond). If you both adopt D the Gamesdelay is 3ms, in because Matrix of Form additional overhead at the network router. Finally, if one of you adopts D and the other adopts C then the D adopter will experience no delay at all, but the C adopter will experience a delay of 4ms. These consequences are shown in Figure 3.1. Your options are the two rows, and yourHere’s colleague’s the TCP options Backoff are the Game columns. written In each as a cell, matrix the fi (“normalrst number represents yourform”). payoff (or, minus your delay), and the second number represents your colleague’s TCP user’s payoff.1 game C D Prisoner’s dilemma game C 1, 1 4, 0 − − −

D 0, 4 3, 3 − − −

Figure 3.1 The TCP user’s (aka the Prisoner’s) Dilemma.

GivenIt’s anthese example options of what prisoner’s should you dilemma adopt,. C or D? Does it depend on what you think your colleague will do? Furthermore, from the perspective of the network opera- tor, what kind of behavior can he expect from the two users? Will any two users behave Gamethe same Theory whenintro presented with this scenario? Will the behavior changeCPSC 532A if theLecture network 3, Slide 4 operator allows the users to communicate with each other before making a decision? Under what changes to the delays would the users’ decisions still be the same? How would the users behave if they have the opportunity to face this same decision with the same counterpart multiple times? Do answers to the above questions depend on how rational the agents are and how they view each other’s ? Game theory gives answers to many of these questions. It tells us that any rational user, when presented with this scenario once, will adopt D—regardless of what the other user does. It tells us that allowing the users to communicate beforehand will not change the . It tells us that for perfectly rational agents, the decision will remain the same even if they play multiple times; however, if the number of times that the agents will play this is infinite, or even uncertain, we may see them adopt C.

3.2 Games in normal form game in The normal form, also known as the strategic or matrix form, is the most familiar strategic form representation of strategic interactions in game theory. game in matrix form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanying the . Imagine the players of the game are two prisoners suspected of a crime rather than network users, that you each can either Confess to the crime or Deny it, and that the absolute values of the numbers represent the length of jail term each of you will get in each scenario.

c Shoham and Leyton-Brown, 2006

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Lecture Overview

1 Recap

2 Example Matrix Games

3 Pareto Optimality

4 Best Response and Nash Equilibrium

Game Theory intro CPSC 532A Lecture 3, Slide 5 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Games of Pure Competition

Players have exactly opposed There must be precisely two players (otherwise they can’t have exactly opposed interests)

For all action profiles a A, u1(a) + u2(a) = c for some constant c ∈ Special case: zero sum Thus, we only need to store a utility function for one player in a sense, it’s a one-player game

Game Theory intro CPSC 532A Lecture 3, Slide 6 Play this game with someone near you, repeating five times.

3.2 Games in normal form 59

constant-sum games are meaningful primarily in the context of two-player (though not necessarily two-) games.

Definition 3.2.3 A normal form game is constant sum if there exists a constant c such that for each strategy profile a A A it is the case that u (a) + u (a) = c. ∈ 1 × 2 1 2

For convenience, when we talk of constant-sum games going forward we will always assume that c = 0, that is, that we have a zero-sum game. If common-payoff games represent situations of pure coordination, zero-sum games represent situations of pure competition; one player’s gain must come at the expense of the other player. As in the case of common-payoff games, we can use an abbreviated matrix form to represent zero-sum games, in which we write only one payoff in each cell. This value represents the payoff of player 1, and thus the negative of the payoff of player 2. Note, though, that whereas the full matrix representation is unambiguous, when we use theRecap abbreviation Example we must Matrix explicit Games statePareto whether Optimality this matrix representsBest Responsea and common-payoff Nash Equilibrium game or a zero-sum one. MatchingA classical Penniesexample of a zero-sum game is the game of . In this matching game, each of the two players has a penny, and independently chooses to display either pennies game heads or tails. The two players then compare their pennies. If they are the same then player 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shownOne in Figureplayer 3.5. wants to match; the other wants to mismatch.

Heads Tails

Heads 1 1 −

Tails 1 1 −

Figure 3.5 Matching Pennies game.

The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, provides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or matrix of this zero-sum game is shown in Figure 3.6. In this game, each of the two Rochambeau players can choose either Rock, Paper, or Scissors. If both players choose the same game Gameaction, Theory there intro is no winner, and the are zero. Otherwise,CPSC each 532A of Lecture the actions 3, Slide 7 wins over one of the other actions, and loses to the other remaining action. In general, games tend to include elements of both coordination and competition. Prisoner’s Dilemma does, although in a rather paradoxical way. Here is another well- known game that includes both elements. In this game, called Battle of the Sexes, a husband and wife wish to go to the movies, and they can select among two movies: “Violence Galore (VG)” and “Gentle Love (GL)”. They much prefer to go together rather than to separate movies, but while the wife prefers VG the husband prefers GL. The payoff matrix is shown in Figure 3.7. We will return to this game shortly.

Multi Systems, draft of February 11, 2006 3.2 Games in normal form 59

constant-sum games are meaningful primarily in the context of two-player (though not necessarily two-strategy) games.

Definition 3.2.3 A normal form game is constant sum if there exists a constant c such that for each strategy profile a A A it is the case that u (a) + u (a) = c. ∈ 1 × 2 1 2

For convenience, when we talk of constant-sum games going forward we will always assume that c = 0, that is, that we have a zero-sum game. If common-payoff games represent situations of pure coordination, zero-sum games represent situations of pure competition; one player’s gain must come at the expense of the other player. As in the case of common-payoff games, we can use an abbreviated matrix form to represent zero-sum games, in which we write only one payoff value in each cell. This value represents the payoff of player 1, and thus the negative of the payoff of player 2. Note, though, that whereas the full matrix representation is unambiguous, when we use theRecap abbreviation Example we must Matrix explicit Games statePareto whether Optimality this matrix representsBest Responsea and common-payoff Nash Equilibrium game or a zero-sum one. MatchingA classical Penniesexample of a zero-sum game is the game of matching pennies. In this matching game, each of the two players has a penny, and independently chooses to display either pennies game heads or tails. The two players then compare their pennies. If they are the same then player 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shownOne in Figureplayer 3.5. wants to match; the other wants to mismatch.

Heads Tails

Heads 1 1 −

Tails 1 1 −

Figure 3.5 Matching Pennies game. Play this game with someone near you, repeating five times. The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, provides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or matrix of this zero-sum game is shown in Figure 3.6. In this game, each of the two Rochambeau players can choose either Rock, Paper, or Scissors. If both players choose the same game Gameaction, Theory there intro is no winner, and the utilities are zero. Otherwise,CPSC each 532A of Lecture the actions 3, Slide 7 wins over one of the other actions, and loses to the other remaining action. In general, games tend to include elements of both coordination and competition. Prisoner’s Dilemma does, although in a rather paradoxical way. Here is another well- known game that includes both elements. In this game, called Battle of the Sexes, a husband and wife wish to go to the movies, and they can select among two movies: “Violence Galore (VG)” and “Gentle Love (GL)”. They much prefer to go together rather than to separate movies, but while the wife prefers VG the husband prefers GL. The payoff matrix is shown in Figure 3.7. We will return to this game shortly.

Multi Agent Systems, draft of February 11, 2006 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Rock-Paper-Scissors

60 3 Competition and Coordination: Normal form games Generalized matching pennies.

Rock Paper Scissors

Rock 0 1 1 −

Paper 1 0 1 −

Scissors 1 1 0 −

Figure 3.6 Rock, Paper, Scissors game. ...Believe it or not, there’s an annual international competition for this game! VG GL

VG 2, 1 0, 0 Game Theory intro CPSC 532A Lecture 3, Slide 8

GL 0, 0 1, 2

Figure 3.7 Battle of the Sexes game.

3.2.2 Strategies in normal-form games We have so far defined the actions available to each player in a game, but not yet his set of strategies, or his available choices. Certainly one kind of strategy is to select pure strategy a single action and play it; we call such a strategy a pure strategy, and we will use the notation we have already developed for actions to represent it. There is, however, another, less obvious type of strategy; a player can choose to randomize over the set of available actions according to some ; such a strategy is called mixed strategy a mixed strategy. Although it may not be immediately obvious why a player should introduce into his choice of action, in fact in a multi-agent setting the role of mixed strategies is critical. We will return to this when we discuss solution concepts for games in the next section. We define a mixed strategy for a normal form game as follows.

Definition 3.2.4 Let (N, (A1,...,An), O, µ, u) be a normal form game, and for any set X let Π(X) be the set of all probability distributions over X. Then the set of mixed mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S S . 1 × · · · × n

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Games of

Players have exactly the same interests. no conflict: all players want the same things

a A, i, j, ui(a) = uj(a) ∀ ∈ ∀ we often write such games with a single payoff per cell why are such games “noncooperative”?

Game Theory intro CPSC 532A Lecture 3, Slide 9 Play this game with someone near you. Then find a new partner and play again. Play five times in total.

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion. The first is the class of common-payoff games. These are games in which, for every action profile, all players have the same payoff. common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j pure- Common-payoff games are also called pure coordination games, since in such games coordination the agents have no conflicting interests; their sole challenge is to coordinate on an game actionRecap that is maximallyExample Matrix beneficial Games to all.Pareto Optimality Best Response and Nash Equilibrium Because of their special nature, we often represent common value games with an Coordinationabbreviated form of Game the matrix in which we list only one payoff in each of the cells. As an example, imagine two drivers driving towards each other in a country without traffic rules, and who must independently decide whether to drive on the left or on the right. If the players choose the same side (left or right) they have some high utility, and otherwiseWhich they side have of a the low road utility. should The game you matrix drive on?is shown in Figure 3.4.

Left Right

Left 1 0

Right 0 1

Figure 3.4 . zero-sum game At the other end of the spectrum from pure coordination games lie zero-sum games, which (bearing in mind the comment we made earlier about positive affine transforma- constant-sum tions) are more properly called constant-sum games. Unlike common-payoff games, games c Shoham and Leyton-Brown, 2006 Game Theory intro CPSC 532A Lecture 3, Slide 10 58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion. The first is the class of common-payoff games. These are games in which, for every action profile, all players have the same payoff. common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j pure- Common-payoff games are also called pure coordination games, since in such games coordination the agents have no conflicting interests; their sole challenge is to coordinate on an game actionRecap that is maximallyExample Matrix beneficial Games to all.Pareto Optimality Best Response and Nash Equilibrium Because of their special nature, we often represent common value games with an Coordinationabbreviated form of Game the matrix in which we list only one payoff in each of the cells. As an example, imagine two drivers driving towards each other in a country without traffic rules, and who must independently decide whether to drive on the left or on the right. If the players choose the same side (left or right) they have some high utility, and otherwiseWhich they side have of a the low road utility. should The game you matrix drive on?is shown in Figure 3.4.

Left Right

Left 1 0

Right 0 1

Figure 3.4 Coordination game. Play this game with someone near you. Then find a new partner zero-sum game Atand the other play endagain. of the Play spectrum five times from inpure total. coordination games lie zero-sum games, which (bearing in mind the comment we made earlier about positive affine transforma- constant-sum tions) are more properly called constant-sum games. Unlike common-payoff games, games c Shoham and Leyton-Brown, 2006 Game Theory intro CPSC 532A Lecture 3, Slide 10 Play this game with someone near you. Then find a new partner and play again. Play five times in total.

60 3 Competition and Coordination: Normal form games

Rock Paper Scissors

Rock 0 1 1 Recap Example Matrix Games Pareto Optimality− Best Response and Nash Equilibrium

General Games:Paper Battle of1 the Sexes0 1 −

Scissors 1 1 0 − The most interesting games combine elements of cooperation and competition. Figure 3.6 Rock, Paper, Scissors game.

B F

B 2, 1 0, 0

F 0, 0 1, 2

Figure 3.7 Battle of the Sexes game.

3.2.2 Strategies in normal-form games We have so far defined the actions available to each player in a game, but not yet his

Gameset of Theorystrategies intro , or his available choices. Certainly one kind ofCPSC strategy 532A Lecture is to 3, selectSlide 11 pure strategy a single action and play it; we call such a strategy a pure strategy, and we will use the notation we have already developed for actions to represent it. There is, however, another, less obvious type of strategy; a player can choose to randomize over the set of available actions according to some ; such a strategy is called mixed strategy a mixed strategy. Although it may not be immediately obvious why a player should introduce randomness into his choice of action, in fact in a multi-agent setting the role of mixed strategies is critical. We will return to this when we discuss solution concepts for games in the next section. We define a mixed strategy for a normal form game as follows.

Definition 3.2.4 Let (N, (A1,...,An), O, µ, u) be a normal form game, and for any set X let Π(X) be the set of all probability distributions over X. Then the set of mixed mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S S . 1 × · · · × n

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

60 3 Competition and Coordination: Normal form games

Rock Paper Scissors

Rock 0 1 1 Recap Example Matrix Games Pareto Optimality− Best Response and Nash Equilibrium

General Games:Paper Battle of1 the Sexes0 1 −

Scissors 1 1 0 − The most interesting games combine elements of cooperation and competition. Figure 3.6 Rock, Paper, Scissors game.

B F

B 2, 1 0, 0

F 0, 0 1, 2

Figure 3.7 Battle of the Sexes game. Play this game with someone near you. Then find a new partner and play again. Play five times in total. 3.2.2 Strategies in normal-form games We have so far defined the actions available to each player in a game, but not yet his

Gameset of Theorystrategies intro , or his available choices. Certainly one kind ofCPSC strategy 532A Lecture is to 3, selectSlide 11 pure strategy a single action and play it; we call such a strategy a pure strategy, and we will use the notation we have already developed for actions to represent it. There is, however, another, less obvious type of strategy; a player can choose to randomize over the set of available actions according to some probability distribution; such a strategy is called mixed strategy a mixed strategy. Although it may not be immediately obvious why a player should introduce randomness into his choice of action, in fact in a multi-agent setting the role of mixed strategies is critical. We will return to this when we discuss solution concepts for games in the next section. We define a mixed strategy for a normal form game as follows.

Definition 3.2.4 Let (N, (A1,...,An), O, µ, u) be a normal form game, and for any set X let Π(X) be the set of all probability distributions over X. Then the set of mixed mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S S . 1 × · · · × n

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Lecture Overview

1 Recap

2 Example Matrix Games

3 Pareto Optimality

4 Best Response and Nash Equilibrium

Game Theory intro CPSC 532A Lecture 3, Slide 12 we have no way of saying that one agent’s interests are more important than another’s intuition: imagine trying to find the revenue-maximizing outcome when you don’t know what has been used to express each agent’s payoff Are there situations where we can still prefer one outcome to another?

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Analyzing Games

We’ve defined some canonical games, and thought about how to play them. Now let’s examine the games from the outside From the point of view of an outside observer, can some outcomes of a game be said to be better than others?

Game Theory intro CPSC 532A Lecture 3, Slide 13 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Analyzing Games

We’ve defined some canonical games, and thought about how to play them. Now let’s examine the games from the outside From the point of view of an outside observer, can some outcomes of a game be said to be better than others? we have no way of saying that one agent’s interests are more important than another’s intuition: imagine trying to find the revenue-maximizing outcome when you don’t know what currency has been used to express each agent’s payoff Are there situations where we can still prefer one outcome to another?

Game Theory intro CPSC 532A Lecture 3, Slide 13 An outcome o∗ is Pareto-optimal if there is no other outcome that Pareto-dominates it. can a game have more than one Pareto-optimal outcome? does every game have at least one Pareto-optimal outcome?

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Pareto Optimality

Idea: sometimes, one outcome o is at least as good for every agent as another outcome o0, and there is some agent who strictly prefers o to o0 in this case, it seems reasonable to say that o is better than o0 we say that o Pareto-dominates o0.

Game Theory intro CPSC 532A Lecture 3, Slide 14 can a game have more than one Pareto-optimal outcome? does every game have at least one Pareto-optimal outcome?

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Pareto Optimality

Idea: sometimes, one outcome o is at least as good for every agent as another outcome o0, and there is some agent who strictly prefers o to o0 in this case, it seems reasonable to say that o is better than o0 we say that o Pareto-dominates o0.

An outcome o∗ is Pareto-optimal if there is no other outcome that Pareto-dominates it.

Game Theory intro CPSC 532A Lecture 3, Slide 14 does every game have at least one Pareto-optimal outcome?

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Pareto Optimality

Idea: sometimes, one outcome o is at least as good for every agent as another outcome o0, and there is some agent who strictly prefers o to o0 in this case, it seems reasonable to say that o is better than o0 we say that o Pareto-dominates o0.

An outcome o∗ is Pareto-optimal if there is no other outcome that Pareto-dominates it. can a game have more than one Pareto-optimal outcome?

Game Theory intro CPSC 532A Lecture 3, Slide 14 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Pareto Optimality

Idea: sometimes, one outcome o is at least as good for every agent as another outcome o0, and there is some agent who strictly prefers o to o0 in this case, it seems reasonable to say that o is better than o0 we say that o Pareto-dominates o0.

An outcome o∗ is Pareto-optimal if there is no other outcome that Pareto-dominates it. can a game have more than one Pareto-optimal outcome? does every game have at least one Pareto-optimal outcome?

Game Theory intro CPSC 532A Lecture 3, Slide 14 58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion.3.2 Games The first in normal is the formclass of common-payoff games. These are games in which, for59 every action profile, all players have the same payoff. constant-sum games are meaningful primarily in the context of two-player (though not common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all necessarily two-strategy) games. game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j Definition 3.2.3 A normal form game is constant sum if there exists a constant c such thatCommon-payoff for each strategy games profile area alsoA calledApureit is coordination the case that gamesu (a,) since + u in(a) such = c games. pure- ∈ 1 × 2 1 2 coordination the agents have no conflicting interests; their sole challenge is to coordinate on an action that is maximally beneficial to all. game For convenience, when we talk of constant-sum games going forward we will always Because of their special nature, we often represent common value games with an assume that c = 0, that is, that we have a zero-sum game. If common-payoff games abbreviated form of the matrix in which we list only one payoff in each of the cells. 60 represent3 Competition situations and of pure Coordination: coordination, Normal zero-sum form games games represent situations of pure As an example, imagine two drivers driving towards each other in a country without competition; one player’s gain must come at the expense of the other player. traffic rules, and who must independently decide whether to drive on the left or on the As in the case of common-payoff games, we can use an abbreviated matrix form to right. If the players choose the same side (left or right) they have some high utility, and representRock zero-sum Paper games, Scissors in which we write only one payoff value in each cell. This otherwise they have a low utility. The game matrix is shown in Figure 3.4. value represents the payoff of player 1, and thus the negative of the payoff of player 2. Note, though, that whereas the full matrix representation is unambiguous, when we use Rock 0 1 1 Left Right the abbreviation− we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. Paper 1 0 1 A classical example of a− zero-sumLeft game1 is the0 game of matching pennies. In this matching game, each of the two players has a penny, and independently chooses to display either pennies game Scissors heads or tails. The two players then compare their pennies. If they are the same then 1 1 0 Right 0 1 player− 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shown in Figure 3.5. Figure 3.6 Rock, Paper, Scissors game. Figure 3.4 Coordination game. B F Heads Tails zero-sum game At the other end of the spectrum from pure coordination games lie zero-sum games, which (bearing in mind the comment we made earlier about positive affine transforma- constant-sum tions)B are2 more, 1 properly0, 0 calledHeadsconstant-sum1 games1. Unlike common-payoff games, − games c Shoham and Leyton-Brown, 2006 F 0, 0 1, 2 Tails 1 1 −

Figure 3.7 Battle of the Sexes game.Figure 3.5 Matching Pennies game.

The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, 3.2.2 Strategies in normal-form gamesprovides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or We have so far defined the actionsmatrix available of this to zero-sum each player game in is a game, shown but in Figure not yet 3.6. his In this game, each of the two Rochambeau game set of strategies, or his availableplayers choices. can Certainlychoose either oneRock, kind of Paper, strategy or Scissors. is to select If both players choose the same pure strategy a single action and play it; weaction, call such there a strategy is no winner, a pure and strategy the utilities, and we are will zero. use Otherwise, each of the actions the notation we have already developedwins over for one actions of the to other repres actions,ent it. and There loses is, to however, the other remaining action. another, less obvious type of strategy;In general, a player games can choose tend to to includerandomize elements over the of set both of coordination and competition. available actions according to somePrisoner’s probability Dilemma distribut does,ion; although such a in strategy a rather is paradoxical called way. Here is another well- mixed strategy a mixed strategy. Although it mayknown not game be immediately that includes obvious both elements. why a player In this shou game,ld called Battle of the Sexes, a introduce randomness into his choicehusband of andaction, wife inwish fact in to a go multi-agent to the movies, setting and the they role can select among two movies: of mixed strategies is critical. We“Violence will return Galore to this (VG)” when and we discuss “Gentle solution Love (GL)”. concepts They much prefer to go together for games in the next section. rather than to separate movies, but while the wife prefers VG the husband prefers GL. We define a mixed strategy forThe a normal payoff matrixform game is shown as follows. in Figure 3.7. We will return to this game shortly.

Definition 3.2.4 Let (N, (A1,...,An), O, µ, u) beMulti a normal Agent form Systems game,, draft and of for February any 11, 2006 set X let Π(X) be the set of all probability distributions over X. Then the set of mixed mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S S . 1 × · · · × n

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

56 3 Competition and Coordination: Normal form games when congestion occurs. You have two possible strategies: C (for using a Correct implementation) and D (for using a Defective one). If both you and your colleague adopt C then your average packet delay is 1ms (millisecond). If you both adopt D the delay is 3ms, because of additional overhead at the network router. Finally, if one of you adopts D and the other adopts C then the D adopter will experience no delay at all, but the C adopter will experience a delay of 4ms.

These consequencesRecap are shownExample in Matrix Figure Games 3.1. YourPareto options Optimality are the twoBest rows, Response and and Nash Equilibrium your colleague’s options are the columns. In each cell, the first number represents your payoff (or,Pareto minus your Optimal delay), and theOutcomes second number in rep Exampleresents your Games colleague’s TCP user’s payoff.1 game C D Prisoner’s dilemma game C 1, 1 4, 0 − − −

D 0, 4 3, 3 − − −

Figure 3.1 The TCP user’s (aka the Prisoner’s) Dilemma.

Given these options what should you adopt, C or D? Does it depend on what you think your colleague will do? Furthermore, from the perspective of the network opera- tor, what kind of behavior can he expect from the two users? Will any two users behave the same when presented with this scenario? Will the behavior change if the network operator allows the users to communicate with each other before making a decision? Under what changes to the delays would the users’ decisions still be the same? How would the users behave if they have the opportunity to face this same decision with the same counterpart multiple times? Do answers to the above questions depend on how Game Theory intro CPSC 532A Lecture 3, Slide 15 rational the agents are and how they view each other’s rationality? Game theory gives answers to many of these questions. It tells us that any rational user, when presented with this scenario once, will adopt D—regardless of what the other user does. It tells us that allowing the users to communicate beforehand will not change the outcome. It tells us that for perfectly rational agents, the decision will remain the same even if they play multiple times; however, if the number of times that the agents will play this is infinite, or even uncertain, we may see them adopt C.

3.2 Games in normal form game in The normal form, also known as the strategic or matrix form, is the most familiar strategic form representation of strategic interactions in game theory. game in matrix form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanying the numbers. Imagine the players of the game are two prisoners suspected of a crime rather than network users, that you each can either Confess to the crime or Deny it, and that the absolute values of the numbers represent the length of jail term each of you will get in each scenario.

c Shoham and Leyton-Brown, 2006

3.2 Games in normal form 59

constant-sum games are meaningful primarily in the context of two-player (though not necessarily two-strategy) games.

Definition 3.2.3 A normal form game is constant sum if there exists a constant c such that for each strategy profile a A A it is the case that u (a) + u (a) = c. ∈ 1 × 2 1 2

For convenience, when we talk of constant-sum games going forward we will always assume that c = 0, that is, that we have a zero-sum game. If common-payoff games 60 represent3 Competition situations and of pure Coordination: coordination, Normal zero-sum form games games represent situations of pure competition; one player’s gain must come at the expense of the other player. As in the case of common-payoff games, we can use an abbreviated matrix form to representRock zero-sum Paper games, Scissors in which we write only one payoff value in each cell. This value represents the payoff of player 1, and thus the negative of the payoff of player 2. Rock Note,0 though, that1 whereas the1 full matrix representation is unambiguous, when we use the abbreviation− we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. Paper 1 0 1 A classical example of a− zero-sum game is the game of matching pennies. In this matching game, each of the two players has a penny, and independently chooses to display either pennies game Scissors heads1 or tails. The1 two players0 then compare their pennies. If they are the same then player− 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shown in Figure 3.5. Figure 3.6 Rock, Paper, Scissors game.

B F Heads Tails

B 2, 1 0, 0 Heads 1 1 −

F 0, 0 1, 2 Tails 1 1 −

Figure 3.7 Battle of the Sexes game.Figure 3.5 Matching Pennies game.

The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, 3.2.2 Strategies in normal-form gamesprovides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or We have so far defined the actionsmatrix available of this to zero-sum each player game in is a game, shown but in Figure not yet 3.6. his In this game, each of the two Rochambeau game set of strategies, or his availableplayers choices. can Certainlychoose either oneRock, kind of Paper, strategy or Scissors. is to select If both players choose the same pure strategy a single action and play it; weaction, call such there a strategy is no winner, a pure and strategy the utilities, and we are will zero. use Otherwise, each of the actions the notation we have already developedwins over for one actions of the to other repres actions,ent it. and There loses is, to however, the other remaining action. another, less obvious type of strategy;In general, a player games can choose tend to to includerandomize elements over the of set both of coordination and competition. available actions according to somePrisoner’s probability Dilemma distribut does,ion; although such a in strategy a rather is paradoxical called way. Here is another well- mixed strategy a mixed strategy. Although it mayknown not game be immediately that includes obvious both elements. why a player In this shou game,ld called Battle of the Sexes, a introduce randomness into his choicehusband of andaction, wife inwish fact in to a go multi-agent to the movies, setting and the they role can select among two movies: of mixed strategies is critical. We“Violence will return Galore to this (VG)” when and we discuss “Gentle solution Love (GL)”. concepts They much prefer to go together for games in the next section. rather than to separate movies, but while the wife prefers VG the husband prefers GL. We define a mixed strategy forThe a normal payoff matrixform game is shown as follows. in Figure 3.7. We will return to this game shortly.

Definition 3.2.4 Let (N, (A1,...,An), O, µ, u) beMulti a normal Agent form Systems game,, draft and of for February any 11, 2006 set X let Π(X) be the set of all probability distributions over X. Then the set of mixed mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S S . 1 × · · · × n

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion. The first is the class of common-payoff games. These are games in which, for every action profile, all players have the same payoff. common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all 56 3 Competition and Coordination: Normal form games game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j when congestion occurs. You have two possible strategies: C (for using a Correct implementation)pure- and D (for usingCommon-payoff a Defective one). games If both are also you called and yourpure colleague coordination games, since in such games adopt C thencoordination your average packetthe delay agents is have1ms (millisecond). no conflictingIf interests; you both their adopt sole D the challenge is to coordinate on an delay is 3ms,game because of additionalaction overhead that is maximally at the network beneficial router. to all.Finally, if one of you adopts D and the other adoptsBecause C then the of D their adopter special will nature, experience we often no delay represent at all, common value games with an but the C adopter will experienceabbreviated a delay of form 4ms. of the matrix in which we list only one payoff in each of the cells. As an example, imagine two drivers driving towards each other in a country without These consequencesRecap are shownExample in Matrix Figure Games 3.1. YourPareto options Optimality are the twoBest rows, Response and and Nash Equilibrium your colleague’s options are thetraffic columns. rules, and In each who mustcell,the independently first number decide represents whether to drive on the left or on the your payoff (or,Pareto minus your Optimal delay),right. and If theOutcomes secondplayers number choose in the rep Exampleresents same side your (left Games colleague’s or right) they have some high utility, and TCP user’s payoff.1 otherwise they have a low utility. The game matrix is shown in Figure 3.4. game C D Left Right Prisoner’s dilemma game C 1, 1 4, 0 Left 1 0 − − −

D 0, 4 3, 3 Right 0 1 − − −

Figure 3.1 The TCP user’s (aka the Prisoner’s)Figure Dilemma. 3.4 Coordination game.

Given thesezero-sum options game what shouldAt you the other adopt, end C orof theD? spectrumDoes it depe fromnd pure on what coordination you games lie zero-sum games, think your colleague will do? Furthermore,which (bearing from in the mind perspec the commenttive of the we network made earlier opera- about positive affine transforma- tor, what kindconstant-sum of behavior can hetions) expect are from more the properly two users? called Willconstant-sum any two users games behave. Unlike common-payoff games, games the same when presented with this scenario? Will the behavior change if the network c Shoham and Leyton-Brown, 2006 operator allows the users to communicate with each other before making a decision? Under what changes to the delays would the users’ decisions still be the same? How would the users behave if they have the opportunity to face this same decision with the same counterpart multiple times? Do answers to the above questions depend on how Game Theory intro CPSC 532A Lecture 3, Slide 15 rational the agents are and how they view each other’s rationality? Game theory gives answers to many of these questions. It tells us that any rational user, when presented with this scenario once, will adopt D—regardless of what the other user does. It tells us that allowing the users to communicate beforehand will not change the outcome. It tells us that for perfectly rational agents, the decision will remain the same even if they play multiple times; however, if the number of times that the agents will play this is infinite, or even uncertain, we may see them adopt C.

3.2 Games in normal form game in The normal form, also known as the strategic or matrix form, is the most familiar strategic form representation of strategic interactions in game theory. game in matrix form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanying the numbers. Imagine the players of the game are two prisoners suspected of a crime rather than network users, that you each can either Confess to the crime or Deny it, and that the absolute values of the numbers represent the length of jail term each of you will get in each scenario.

c Shoham and Leyton-Brown, 2006

3.2 Games in normal form 59

constant-sum games are meaningful primarily in the context of two-player (though not necessarily two-strategy) games.

Definition 3.2.3 A normal form game is constant sum if there exists a constant c such that for each strategy profile a A A it is the case that u (a) + u (a) = c. ∈ 1 × 2 1 2

For convenience, when we talk of constant-sum games going forward we will always assume that c = 0, that is, that we have a zero-sum game. If common-payoff games represent situations of pure coordination, zero-sum games represent situations of pure competition; one player’s gain must come at the expense of the other player. As in the case of common-payoff games, we can use an abbreviated matrix form to represent zero-sum games, in which we write only one payoff value in each cell. This value represents the payoff of player 1, and thus the negative of the payoff of player 2. Note, though, that whereas the full matrix representation is unambiguous, when we use the abbreviation we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. A classical example of a zero-sum game is the game of matching pennies. In this matching game, each of the two players has a penny, and independently chooses to display either pennies game heads or tails. The two players then compare their pennies. If they are the same then player 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shown in Figure 3.5.

Heads Tails

Heads 1 1 −

Tails 1 1 −

Figure 3.5 Matching Pennies game.

The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, provides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or matrix of this zero-sum game is shown in Figure 3.6. In this game, each of the two Rochambeau players can choose either Rock, Paper, or Scissors. If both players choose the same game action, there is no winner, and the utilities are zero. Otherwise, each of the actions wins over one of the other actions, and loses to the other remaining action. In general, games tend to include elements of both coordination and competition. Prisoner’s Dilemma does, although in a rather paradoxical way. Here is another well- known game that includes both elements. In this game, called Battle of the Sexes, a husband and wife wish to go to the movies, and they can select among two movies: “Violence Galore (VG)” and “Gentle Love (GL)”. They much prefer to go together rather than to separate movies, but while the wife prefers VG the husband prefers GL. The payoff matrix is shown in Figure 3.7. We will return to this game shortly.

Multi Agent Systems, draft of February 11, 2006

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion. The first is the class of common-payoff games. These are games in which, for every action profile, all players have the same payoff. common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all 56 3 Competition and Coordination: Normal form games game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j when congestion occurs. You have two possible strategies: C (for using a Correct implementation)pure- and D (for usingCommon-payoff a Defective one). games If both are also you called and yourpure colleague coordination games, since in such games adopt C thencoordination your average packetthe delay agents is have1ms (millisecond). no conflictingIf interests; you both their adopt sole D the challenge is to coordinate on an delay is 3ms,game because of additionalaction overhead that is maximally at the network beneficial router. to all.Finally, if one of you adopts D and the other adoptsBecause C then the of D their adopter special will nature, experience we often no delay represent at all, common value games with an abbreviated form of the matrix in which we list only one payoff in each of the cells. 60 but the C adopter will experience a delay3 Competition of 4ms. and Coordination: Normal form games As an example, imagine two drivers driving towards each other in a country without These consequencesRecap are shownExample in Matrix Figure Games 3.1. YourPareto options Optimality are the twoBest rows, Response and and Nash Equilibrium your colleague’s options are thetraffic columns. rules, and In each who mustcell,the independently first number decide represents whether to drive on the left or on the your payoff (or,Pareto minus your Optimal delay),right.Rock and If theOutcomes secondplayers Paper number choose in Scissors the rep Exampleresents same side your (left Games colleague’s or right) they have some high utility, and TCP user’s payoff.1 otherwise they have a low utility. The game matrix is shown in Figure 3.4. game Rock 0 1 1 C− D Left Right Prisoner’s dilemma game Paper 1 0 1 C 1, 1 4, 0 − Left 1 0 − − − Scissors 1 1 0 D −0, 4 3, 3 Right 0 1 − − − Figure 3.6 Rock, Paper, Scissors game. Figure 3.1 The TCP user’s (aka the Prisoner’s)Figure Dilemma. 3.4 Coordination game. B F Given thesezero-sum options game what shouldAt you the other adopt, end C orof theD? spectrumDoes it depe fromnd pure on what coordination you games lie zero-sum games, think your colleague will do? Furthermore,which (bearing from in the mind perspec the commenttive of the we network made earlier opera- about positive affine transforma- tor, what kindconstant-sum of behavior can hetions) expectB are from2 more, 1 the properly0 two, 0 users? called Willconstant-sum any two users games behave. Unlike common-payoff games, games the same when presented with this scenario? Will the behavior change if the network c Shoham and Leyton-Brown, 2006 operator allows the users to communicateF 0, 0 with1 each, 2 other before making a decision? Under what changes to the delays would the users’ decisions still be the same? How would the users behave if they have the opportunity to face this same decision with the Figure 3.7 Battle of the Sexes game. same counterpart multiple times? Do answers to the above questions depend on how Game Theory intro CPSC 532A Lecture 3, Slide 15 rational the agents are and how they view each other’s rationality? 3.2.2 StrategiesGame theory in normal-form gives answers to games many of these questions. It tells us that any rational user, when presented with this scenario once, will adopt D—regardless of what the otherWe have user so does. far defined It tells the us actions that allowing available the to users each to player commun in aicategame, beforehand but not yet will his notset changeof strategies the outcome., or his available It tells us choices. that for perfectly Certainly ration one kindal agents, of strategy the decision is to select will pure strategy remaina single the action same and even play if they it; we play call multiple such a times; strategy however, a pure if strategythe number, and of we times will that use thethe agents notation will we play have this already is infinite, developed or even for uncertain, actions to we repres mayent see it. them There adopt is, C.however, another, less obvious type of strategy; a player can choose to randomize over the set of available actions according to some probability distribution; such a strategy is called mixed strategy3.2 Gamesa mixed strategy in normal. Although form it may not be immediately obvious why a player should introduce randomness into his choice of action, in fact in a multi-agent setting the role game in Theof mixed normal strategies form, also is critical. known We as will the returnstrategic to thisor matrix when wform,e discuss is the solution most concepts familiar strategic form representationfor games in the of next strategic section. interactions in game theory. game in matrix We define a mixed strategy for a normal form game as follows. form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanyingDefinition 3.2.4 the numbers.Let (N, Imagine(A1,...,A the playersn), of O, the µ, game u) be area two normalprisoners form suspected game, of anda crime for rather any thanset X networklet Π( users,X) thatbe the you set each of can all either probability Confess to distributions the crime or Deny over it,X and. Then that the the absolute set of valuesmixed of the numbers represent the length of jail term each of you will get in each scenario. mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S1 Sn. c Shoham and Leyton-Brown, 2006× · · · ×

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion.3.2 Games The first in normal is the formclass of common-payoff games. These are games in which, for59 every action profile, all players have the same payoff. constant-sum games are meaningful primarily in the context of two-player (though not common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all 56 necessarily3 Competition two-strategy) and Coordination: games. Normal form games game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j when congestion occurs. YouDefinition have two possible 3.2.3 A strategies:normal form C game(for using is constant a Correct sum if there exists a constant c such implementation) and D (for usingthatCommon-payoff a for Defective each strategy one). games profile If both area also youA called and1 yourApure2 itcolleague is coordination the case that gamesu1(a,) since + u2 in(a) such = c games. pure- ∈ × adopt C thencoordination your average packetthe delay agents is have1ms (millisecond). no conflictingIf interests; you both their adopt sole D the challenge is to coordinate on an action that is maximally beneficial to all. delay is 3ms,game because of additionalFor overhead convenience, at the when network we talk router. of constant-sum Finally, if one games of going forward we will always Because of their special nature, we often represent common value games with an you adopts D and the other adoptsassume C then that thec D= adopter 0, that will is, that expe werience have no a delay zero-sum at all, game. If common-payoff games abbreviated form of the matrix in which we list only one payoff in each of the cells. 60 but the C adopter will experiencerepresent a delay3 Competition ofsituations 4ms. and of pure Coordination: coordination, Normal zero-sum form games games represent situations of pure As an example, imagine two drivers driving towards each other in a country without These consequencesRecap are shownExamplecompetition; in Matrix Figure Games 3.1.one player’s YourPareto options gain Optimality must are the come two atBest rows, the Response expense and and Nash of th Equilibriume other player. traffic rules, and who must independently decide whether to drive on the left or on the your colleague’s options are the columns.As in the case In each of common-payoff cell, the first number games, representswe can use an abbreviated matrix form to right. If the players choose the same side (left or right) they have some high utility, and your payoff (or,Pareto minus your Optimal delay),representRock and theOutcomes zero-sum second Paper number games, in Scissorsrep inExample whichresents we your write Games colleague’s only one payoff value in each cell. This 1 otherwise they have a low utility. The game matrix is shown in Figure 3.4. TCP user’s payoff. value represents the payoff of player 1, and thus the negative of the payoff of player 2. game Rock Note,0 though, that1 whereas the1 full matrix representation is unambiguous, when we use C− D Left Right Prisoner’s the abbreviation we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. dilemma game Paper 1 0 1 C A classical1, 1 example4, 0 of a− zero-sumLeft game1 is the0 game of matching pennies. In this matching game,− each− of the− two players has a penny, and independently chooses to display either pennies game Scissors heads or tails. The two players then compare their pennies. If they are the same then 1 1 0 Right 0 1 Dplayer−0, 14 pockets3 both,, 3 and otherwise player 2 pockets them. The payoff matrix is − − − shown in Figure 3.5. Figure 3.6 Rock, Paper, Scissors game. Figure 3.1 The TCP user’s (aka the Prisoner’s)Figure Dilemma. 3.4 Coordination game. B F Heads Tails Given thesezero-sum options game what shouldAt you the other adopt, end C orof theD? spectrumDoes it depe fromnd pure on what coordination you games lie zero-sum games, think your colleague will do? Furthermore,which (bearing from in the mind perspec the commenttive of the we network made earlier opera- about positive affine transforma- constant-sum tions)B are2 more, 1 properly0, 0 calledHeadsconstant-sum1 games1. Unlike common-payoff games, tor, what kind of behavior can he expect from the two users? Will any two users behave− games the same when presented with this scenario? Will the behavior change if the network c Shoham and Leyton-Brown, 2006 operator allows the users to communicateF 0, 0 with1 each, 2 other beforeTails making1 a decision?1 Under what changes to the delays would the users’ decisions still be the− same? How would the users behave if they have the opportunity to face this same decision with the Figure 3.7 Battle of the Sexes game.Figure 3.5 Matching Pennies game. same counterpart multiple times? Do answers to the above questions depend on how Game Theory intro CPSC 532A Lecture 3, Slide 15 rational the agents are and how they view each other’s rationality? The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, 3.2.2 StrategiesGame theory in normal-form gives answers to games many of these questions. It tells us that any rational user, when presented with thisprovides scenario aonce, three-strategy will adopt generalization D—regardless of of the what matching- the pennies game. The payoff Scissors, or otherWe have user so does. far defined It tells the us actions thatmatrix allowing available of this the to zero-sum users each to player commun game in is aicategame, shown beforehand but in Figure not yet will 3.6. his In this game, each of the two Rochambeau game notset changeof strategies the outcome., or his available It tellsplayers us choices. that can for perfectly Certainlychoose either ration oneRock, kindal agents, of Paper, strategy the or decision Scissors. is to select will If both players choose the same pure strategy remaina single the action same and even play if they it; we playaction, call multiple such there a times; strategy is no however, winner, a pure ifand strategythe the number utilities, and of we timesare will zero. that use Otherwise, each of the actions thethe agents notation will we play have this already is infinite, developedwins or over even for one uncertain, actions of the to other we repres ma actions,yent see it. them and There loses adopt is, to C.however, the other remaining action. another, less obvious type of strategy;In general, a player games can choose tend to to includerandomize elements over the of set both of coordination and competition. available actions according to somePrisoner’s probability Dilemma distribut does,ion; although such a in strategy a rather is paradoxical called way. Here is another well- mixed strategy3.2 Gamesa mixed strategy in normal. Although form it mayknown not game be immediately that includes obvious both elements. why a player In this shou game,ld called Battle of the Sexes, a introduce randomness into his choicehusband of andaction, wife inwish fact in to a go multi-agent to the movies, setting and the they role can select among two movies: game in Theof mixed normal strategies form, also is critical. known We“Violence as will the returnstrategic Galore to thisor (VG)”matrix when and wform,e discuss “Gentle is the solution Love most (GL)”. concepts familiar They much prefer to go together strategic form representationfor games in the of next strategic section. interactionsrather than in game to separate theory. movies, but while the wife prefers VG the husband prefers GL. The payoff matrix is shown in Figure 3.7. We will return to this game shortly. game in matrix We define a mixed strategy for a normal form game as follows. form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanyingDefinition 3.2.4 the numbers.Let (N, Imagine(A1,...,A the playersn), of O, the µ, game u) be areMultia two normalprisoners Agent form Systems suspected game,, draftof anda crime of for February rather any 11, 2006 thanset X networklet Π( users,X) thatbe the you set each of can all either probability Confess to distributions the crime or Deny over it,X and. Then that the the absolute set of valuesmixed of the numbers represent the length of jail term each of you will get in each scenario. mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S1 Sn. c Shoham and Leyton-Brown, 2006× · · · ×

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Lecture Overview

1 Recap

2 Example Matrix Games

3 Pareto Optimality

4 Best Response and Nash Equilibrium

Game Theory intro CPSC 532A Lecture 3, Slide 16 Let a−i = a1, . . . , ai−1, ai+1, . . . , an . h i now a = (a−i, ai)

∗ Best response: ai BR(a−i) iff ∗ ∈ ai Ai, ui(a , a−i) ui(ai, a−i) ∀ ∈ i ≥

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Best Response

If you knew what everyone else was going to do, it would be easy to pick your own action

Game Theory intro CPSC 532A Lecture 3, Slide 17 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Best Response

If you knew what everyone else was going to do, it would be easy to pick your own action

Let a−i = a1, . . . , ai−1, ai+1, . . . , an . h i now a = (a−i, ai)

∗ Best response: ai BR(a−i) iff ∗ ∈ ai Ai, ui(a , a−i) ui(ai, a−i) ∀ ∈ i ≥

Game Theory intro CPSC 532A Lecture 3, Slide 17 Idea: look for stable action profiles.

a = a1, . . . , an is a (“pure strategy”) Nash equilibrium iff h i i, ai BR(a−i). ∀ ∈

Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Nash Equilibrium

Now let’s return to the setting where no agent knows anything about what the others will do What can we say about which actions will occur?

Game Theory intro CPSC 532A Lecture 3, Slide 18 Recap Example Matrix Games Pareto Optimality Best Response and Nash Equilibrium Nash Equilibrium

Now let’s return to the setting where no agent knows anything about what the others will do What can we say about which actions will occur?

Idea: look for stable action profiles.

a = a1, . . . , an is a (“pure strategy”) Nash equilibrium iff h i i, ai BR(a−i). ∀ ∈

Game Theory intro CPSC 532A Lecture 3, Slide 18 58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion.3.2 Games The first in normal is the formclass of common-payoff games. These are games in which, for59 every action profile, all players have the same payoff. constant-sum games are meaningful primarily in the context of two-player (though not common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all necessarily two-strategy) games. game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j Definition 3.2.3 A normal form game is constant sum if there exists a constant c such thatCommon-payoff for each strategy games profile area alsoA calledApureit is coordination the case that gamesu (a,) since + u in(a) such = c games. pure- ∈ 1 × 2 1 2 coordination the agents have no conflicting interests; their sole challenge is to coordinate on an action that is maximally beneficial to all. game For convenience, when we talk of constant-sum games going forward we will always Because of their special nature, we often represent common value games with an assume that c = 0, that is, that we have a zero-sum game. If common-payoff games abbreviated form of the matrix in which we list only one payoff in each of the cells. 60 represent3 Competition situations and of pure Coordination: coordination, Normal zero-sum form games games represent situations of pure As an example, imagine two drivers driving towards each other in a country without competition; one player’s gain must come at the expense of the other player. traffic rules, and who must independently decide whether to drive on the left or on the As in the case of common-payoff games, we can use an abbreviated matrix form to right. If the players choose the same side (left or right) they have some high utility, and representRock zero-sum Paper games, Scissors in which we write only one payoff value in each cell. This otherwise they have a low utility. The game matrix is shown in Figure 3.4. value represents the payoff of player 1, and thus the negative of the payoff of player 2. Note, though, that whereas the full matrix representation is unambiguous, when we use Rock 0 1 1 Left Right the abbreviation− we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. Paper 1 0 1 A classical example of a− zero-sumLeft game1 is the0 game of matching pennies. In this matching game, each of the two players has a penny, and independently chooses to display either pennies game Scissors heads or tails. The two players then compare their pennies. If they are the same then 1 1 0 Right 0 1 player− 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shown in Figure 3.5. Figure 3.6 Rock, Paper, Scissors game. Figure 3.4 Coordination game. B F Heads Tails zero-sum game At the other end of the spectrum from pure coordination games lie zero-sum games, which (bearing in mind the comment we made earlier about positive affine transforma- constant-sum tions)B are2 more, 1 properly0, 0 calledHeadsconstant-sum1 games1. Unlike common-payoff games, − games c Shoham and Leyton-Brown, 2006 F 0, 0 1, 2 Tails 1 1 −

TheFigure paradox 3.7 ofBattle Prisoner’s of the Sexes dilemma game.Figure: the 3.5 NashMatching equilibrium Pennies isgame. the only non-Pareto-optimal outcome! The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, 3.2.2 Strategies in normal-form gamesprovides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or We have so far defined the actionsmatrix available of this to zero-sum each player game in is a game, shown but in Figure not yet 3.6. his In this game, each of the two Rochambeau game set of strategies, or his availableplayers choices. can Certainlychoose either oneRock, kind of Paper, strategy or Scissors. is to select If both players choose the same pure strategy a single action and play it; weaction, call such there a strategy is no winner, a pure and strategy the utilities, and we are will zero. use Otherwise, each of the actions the notation we have already developedwins over for one actions of the to other repres actions,ent it. and There loses is, to however, the other remaining action. another, less obvious type of strategy;In general, a player games can choose tend to to includerandomize elements over the of set both of coordination and competition. available actions according to somePrisoner’s probability Dilemma distribut does,ion; although such a in strategy a rather is paradoxical called way. Here is another well- mixed strategy a mixed strategy. Although it mayknown not game be immediately that includes obvious both elements. why a player In this shou game,ld called Battle of the Sexes, a introduce randomness into his choicehusband of andaction, wife inwish fact in to a go multi-agent to the movies, setting and the they role can select among two movies: of mixed strategies is critical. We“Violence will return Galore to this (VG)” when and we discuss “Gentle solution Love (GL)”. concepts They much prefer to go together for games in the next section. rather than to separate movies, but while the wife prefers VG the husband prefers GL. We define a mixed strategy forThe a normal payoff matrixform game is shown as follows. in Figure 3.7. We will return to this game shortly.

Definition 3.2.4 Let (N, (A1,...,An), O, µ, u) beMulti a normal Agent form Systems game,, draft and of for February any 11, 2006 set X let Π(X) be the set of all probability distributions over X. Then the set of mixed mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S S . 1 × · · · × n

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

56 3 Competition and Coordination: Normal form games when congestion occurs. You have two possible strategies: C (for using a Correct implementation) and D (for using a Defective one). If both you and your colleague adopt C then your average packet delay is 1ms (millisecond). If you both adopt D the delay is 3ms, because of additional overhead at the network router. Finally, if one of you adopts D and the other adopts C then the D adopter will experience no delay at all, but the C adopter will experience a delay of 4ms. These consequences are shown in Figure 3.1. Your options are the two rows, and your colleague’s options are the columns. In each cell, the first number represents your payoff (or, minusRecapyour delay),Example and Matrix the Games second numberPareto rep Optimalityresents yourBest colleague’s Response and Nash Equilibrium TCP user’s payoff.1 game Nash Equilibria of Example Games C D Prisoner’s dilemma game C 1, 1 4, 0 − − −

D 0, 4 3, 3 − − −

Figure 3.1 The TCP user’s (aka the Prisoner’s) Dilemma.

Given these options what should you adopt, C or D? Does it depend on what you think your colleague will do? Furthermore, from the perspective of the network opera- tor, what kind of behavior can he expect from the two users? Will any two users behave the same when presented with this scenario? Will the behavior change if the network operator allows the users to communicate with each other before making a decision? Under what changes to the delays would the users’ decisions still be the same? How would the users behave if they have the opportunity to face this same decision with the same counterpart multiple times? Do answers to the above questions depend on how rational the agents are and how they view each other’s rationality? Game theoryGame gives Theory answers intro to many of these questions. It tells us that any rationalCPSC 532A Lecture 3, Slide 19 user, when presented with this scenario once, will adopt D—regardless of what the other user does. It tells us that allowing the users to communicate beforehand will not change the outcome. It tells us that for perfectly rational agents, the decision will remain the same even if they play multiple times; however, if the number of times that the agents will play this is infinite, or even uncertain, we may see them adopt C.

3.2 Games in normal form game in The normal form, also known as the strategic or matrix form, is the most familiar strategic form representation of strategic interactions in game theory. game in matrix form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanying the numbers. Imagine the players of the game are two prisoners suspected of a crime rather than network users, that you each can either Confess to the crime or Deny it, and that the absolute values of the numbers represent the length of jail term each of you will get in each scenario.

c Shoham and Leyton-Brown, 2006

3.2 Games in normal form 59

constant-sum games are meaningful primarily in the context of two-player (though not necessarily two-strategy) games.

Definition 3.2.3 A normal form game is constant sum if there exists a constant c such that for each strategy profile a A A it is the case that u (a) + u (a) = c. ∈ 1 × 2 1 2

For convenience, when we talk of constant-sum games going forward we will always assume that c = 0, that is, that we have a zero-sum game. If common-payoff games 60 represent3 Competition situations and of pure Coordination: coordination, Normal zero-sum form games games represent situations of pure competition; one player’s gain must come at the expense of the other player. As in the case of common-payoff games, we can use an abbreviated matrix form to representRock zero-sum Paper games, Scissors in which we write only one payoff value in each cell. This value represents the payoff of player 1, and thus the negative of the payoff of player 2. Rock Note,0 though, that1 whereas the1 full matrix representation is unambiguous, when we use the abbreviation− we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. Paper 1 0 1 A classical example of a− zero-sum game is the game of matching pennies. In this matching game, each of the two players has a penny, and independently chooses to display either pennies game Scissors heads1 or tails. The1 two players0 then compare their pennies. If they are the same then player− 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shown in Figure 3.5. Figure 3.6 Rock, Paper, Scissors game.

B F Heads Tails

B 2, 1 0, 0 Heads 1 1 −

F 0, 0 1, 2 Tails 1 1 −

TheFigure paradox 3.7 ofBattle Prisoner’s of the Sexes dilemma game.Figure: the 3.5 NashMatching equilibrium Pennies isgame. the only non-Pareto-optimal outcome! The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, 3.2.2 Strategies in normal-form gamesprovides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or We have so far defined the actionsmatrix available of this to zero-sum each player game in is a game, shown but in Figure not yet 3.6. his In this game, each of the two Rochambeau game set of strategies, or his availableplayers choices. can Certainlychoose either oneRock, kind of Paper, strategy or Scissors. is to select If both players choose the same pure strategy a single action and play it; weaction, call such there a strategy is no winner, a pure and strategy the utilities, and we are will zero. use Otherwise, each of the actions the notation we have already developedwins over for one actions of the to other repres actions,ent it. and There loses is, to however, the other remaining action. another, less obvious type of strategy;In general, a player games can choose tend to to includerandomize elements over the of set both of coordination and competition. available actions according to somePrisoner’s probability Dilemma distribut does,ion; although such a in strategy a rather is paradoxical called way. Here is another well- mixed strategy a mixed strategy. Although it mayknown not game be immediately that includes obvious both elements. why a player In this shou game,ld called Battle of the Sexes, a introduce randomness into his choicehusband of andaction, wife inwish fact in to a go multi-agent to the movies, setting and the they role can select among two movies: of mixed strategies is critical. We“Violence will return Galore to this (VG)” when and we discuss “Gentle solution Love (GL)”. concepts They much prefer to go together for games in the next section. rather than to separate movies, but while the wife prefers VG the husband prefers GL. We define a mixed strategy forThe a normal payoff matrixform game is shown as follows. in Figure 3.7. We will return to this game shortly.

Definition 3.2.4 Let (N, (A1,...,An), O, µ, u) beMulti a normal Agent form Systems game,, draft and of for February any 11, 2006 set X let Π(X) be the set of all probability distributions over X. Then the set of mixed mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S S . 1 × · · · × n

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion. The first is the class of common-payoff games. These are games in which, for every action profile, all players have the same payoff. common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all 56 3 Competition and Coordination: Normal form games game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j when congestion occurs. You have two possible strategies: C (for using a Correct implementation)pure- and D (for usingCommon-payoff a Defective one). games If both are also you called and yourpure colleague coordination games, since in such games adopt C thencoordination your average packetthe delay agents is have1ms (millisecond). no conflictingIf interests; you both their adopt sole D the challenge is to coordinate on an delay is 3ms,game because of additionalaction overhead that is maximally at the network beneficial router. to all.Finally, if one of you adopts D and the other adoptsBecause C then the of D their adopter special will nature, experience we often no delay represent at all, common value games with an but the C adopter will experienceabbreviated a delay of form 4ms. of the matrix in which we list only one payoff in each of the cells. These consequences are shownAs in an Figure example, 3.1. imagine Your options two drivers are the driving two rows, towards and each other in a country without your colleague’s options are thetraffic columns. rules, and In each who mustcell,the independently first number decide represents whether to drive on the left or on the your payoff (or, minusRecapyour delay),Exampleright. and Matrix If the Games secondplayers number choosePareto the rep Optimalityresents same side your (leftBest colleague’s or Response right) and the Nashy have Equilibrium some high utility, and TCP user’s payoff.1 otherwise they have a low utility. The game matrix is shown in Figure 3.4. game Nash Equilibria of Example Games C D Left Right Prisoner’s dilemma game C 1, 1 4, 0 Left 1 0 − − −

D 0, 4 3, 3 Right 0 1 − − −

Figure 3.1 The TCP user’s (aka the Prisoner’s)Figure Dilemma. 3.4 Coordination game.

Given thesezero-sum options game what shouldAt you the other adopt, end C orof theD? spectrumDoes it depe fromnd pure on what coordination you games lie zero-sum games, think your colleague will do? Furthermore,which (bearing from in the mind perspec the commenttive of the we network made earlier opera- about positive affine transforma- tor, what kindconstant-sum of behavior can hetions) expect are from more the properly two users? called Willconstant-sum any two users games behave. Unlike common-payoff games, games the same when presented with this scenario? Will the behavior change if the network c Shoham and Leyton-Brown, 2006 operator allows the users to communicate with each other before making a decision? Under what changes to the delays would the users’ decisions still be the same? How would the users behave if they have the opportunity to face this same decision with the same counterpart multiple times? Do answers to the above questions depend on how rational the agents are and how they view each other’s rationality? Game theoryGame gives Theory answers intro to many of these questions. It tells us that any rationalCPSC 532A Lecture 3, Slide 19 user, when presented with this scenario once, will adopt D—regardless of what the other user does. It tells us that allowing the users to communicate beforehand will not change the outcome. It tells us that for perfectly rational agents, the decision will remain the same even if they play multiple times; however, if the number of times that the agents will play this is infinite, or even uncertain, we may see them adopt C.

3.2 Games in normal form game in The normal form, also known as the strategic or matrix form, is the most familiar strategic form representation of strategic interactions in game theory. game in matrix form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanying the numbers. Imagine the players of the game are two prisoners suspected of a crime rather than network users, that you each can either Confess to the crime or Deny it, and that the absolute values of the numbers represent the length of jail term each of you will get in each scenario.

c Shoham and Leyton-Brown, 2006

3.2 Games in normal form 59

constant-sum games are meaningful primarily in the context of two-player (though not necessarily two-strategy) games.

Definition 3.2.3 A normal form game is constant sum if there exists a constant c such that for each strategy profile a A A it is the case that u (a) + u (a) = c. ∈ 1 × 2 1 2

For convenience, when we talk of constant-sum games going forward we will always assume that c = 0, that is, that we have a zero-sum game. If common-payoff games represent situations of pure coordination, zero-sum games represent situations of pure competition; one player’s gain must come at the expense of the other player. As in the case of common-payoff games, we can use an abbreviated matrix form to represent zero-sum games, in which we write only one payoff value in each cell. This value represents the payoff of player 1, and thus the negative of the payoff of player 2. Note, though, that whereas the full matrix representation is unambiguous, when we use the abbreviation we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. A classical example of a zero-sum game is the game of matching pennies. In this matching game, each of the two players has a penny, and independently chooses to display either pennies game heads or tails. The two players then compare their pennies. If they are the same then player 1 pockets both, and otherwise player 2 pockets them. The payoff matrix is shown in Figure 3.5.

Heads Tails

Heads 1 1 −

Tails 1 1 −

The paradox of Prisoner’s dilemmaFigure: the 3.5 NashMatching equilibrium Pennies isgame. the only non-Pareto-optimal outcome! The popular children’s game of Rock, Paper, Scissors, also known as Rochambeau, Rock, Paper, provides a three-strategy generalization of the matching-pennies game. The payoff Scissors, or matrix of this zero-sum game is shown in Figure 3.6. In this game, each of the two Rochambeau players can choose either Rock, Paper, or Scissors. If both players choose the same game action, there is no winner, and the utilities are zero. Otherwise, each of the actions wins over one of the other actions, and loses to the other remaining action. In general, games tend to include elements of both coordination and competition. Prisoner’s Dilemma does, although in a rather paradoxical way. Here is another well- known game that includes both elements. In this game, called Battle of the Sexes, a husband and wife wish to go to the movies, and they can select among two movies: “Violence Galore (VG)” and “Gentle Love (GL)”. They much prefer to go together rather than to separate movies, but while the wife prefers VG the husband prefers GL. The payoff matrix is shown in Figure 3.7. We will return to this game shortly.

Multi Agent Systems, draft of February 11, 2006

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion. The first is the class of common-payoff games. These are games in which, for every action profile, all players have the same payoff. common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all 56 3 Competition and Coordination: Normal form games game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j when congestion occurs. You have two possible strategies: C (for using a Correct implementation)pure- and D (for usingCommon-payoff a Defective one). games If both are also you called and yourpure colleague coordination games, since in such games adopt C thencoordination your average packetthe delay agents is have1ms (millisecond). no conflictingIf interests; you both their adopt sole D the challenge is to coordinate on an delay is 3ms,game because of additionalaction overhead that is maximally at the network beneficial router. to all.Finally, if one of you adopts D and the other adoptsBecause C then the of D their adopter special will nature, experience we often no delay represent at all, common value games with an abbreviated form of the matrix in which we list only one payoff in each of the cells. 60 but the C adopter will experience a delay3 Competition of 4ms. and Coordination: Normal form games These consequences are shownAs in an Figure example, 3.1. imagine Your options two drivers are the driving two rows, towards and each other in a country without your colleague’s options are thetraffic columns. rules, and In each who mustcell,the independently first number decide represents whether to drive on the left or on the your payoff (or, minusRecapyour delay),Exampleright.Rock and Matrix If the Games secondplayers Paper number choosePareto Scissors the rep Optimalityresents same side your (leftBest colleague’s or Response right) and the Nashy have Equilibrium some high utility, and TCP user’s payoff.1 otherwise they have a low utility. The game matrix is shown in Figure 3.4. game Nash Equilibria of Example Games Rock 0 1 1 C− D Left Right Prisoner’s dilemma game Paper 1 0 1 C 1, 1 4, 0 − Left 1 0 − − − Scissors 1 1 0 D −0, 4 3, 3 Right 0 1 − − − Figure 3.6 Rock, Paper, Scissors game. Figure 3.1 The TCP user’s (aka the Prisoner’s)Figure Dilemma. 3.4 Coordination game. B F Given thesezero-sum options game what shouldAt you the other adopt, end C orof theD? spectrumDoes it depe fromnd pure on what coordination you games lie zero-sum games, think your colleague will do? Furthermore,which (bearing from in the mind perspec the commenttive of the we network made earlier opera- about positive affine transforma- tor, what kindconstant-sum of behavior can hetions) expectB are from2 more, 1 the properly0 two, 0 users? called Willconstant-sum any two users games behave. Unlike common-payoff games, games the same when presented with this scenario? Will the behavior change if the network c Shoham and Leyton-Brown, 2006 operator allows the users to communicateF 0, 0 with1 each, 2 other before making a decision? Under what changes to the delays would the users’ decisions still be the same? How would the users behave if they have the opportunity to face this same decision with the Figure 3.7 Battle of the Sexes game. same counterpart multiple times? Do answers to the above questions depend on how rational the agents are and how they view each other’s rationality? Game Theory intro CPSC 532A Lecture 3, Slide 19 3.2.2 StrategiesGame theory in normal-form gives answers to games many of these questions. It tells us that any rational user, when presented with this scenario once, will adopt D—regardless of what the otherWe have user so does. far defined It tells the us actions that allowing available the to users each to player commun in aicategame, beforehand but not yet will his notset changeof strategies the outcome., or his available It tells us choices. that for perfectly Certainly ration one kindal agents, of strategy the decision is to select will pure strategy remaina single the action same and even play if they it; we play call multiple such a times; strategy however, a pure if strategythe number, and of we times will that use thethe agents notation will we play have this already is infinite, developed or even for uncertain, actions to we repres mayent see it. them There adopt is, C.however, another, less obvious type of strategy; a player can choose to randomize over the set of available actions according to some probability distribution; such a strategy is called mixed strategy3.2 Gamesa mixed strategy in normal. Although form it may not be immediately obvious why a player should introduce randomness into his choice of action, in fact in a multi-agent setting the role game in Theof mixed normal strategies form, also is critical. known We as will the returnstrategic to thisor matrix when wform,e discuss is the solution most concepts familiar strategic form representationfor games in the of next strategic section. interactions in game theory. game in matrix We define a mixed strategy for a normal form game as follows. form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanyingDefinition 3.2.4 the numbers.Let (N, Imagine(A1,...,A the playersn), of O, the µ, game u) be area two normalprisoners form suspected game, of anda crime for rather any thanset X networklet Π( users,X) thatbe the you set each of can all either probability Confess to distributions the crime or Deny over it,X and. Then that the the absolute set of valuesmixed of the numbers represent the length of jail term each of you will get in each scenario. mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S1 Sn. c Shoham and Leyton-Brown, 2006× · · · ×

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

The paradox of Prisoner’s dilemma: the Nash equilibrium is the only non-Pareto-optimal outcome!

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion.3.2 Games The first in normal is the formclass of common-payoff games. These are games in which, for59 every action profile, all players have the same payoff. constant-sum games are meaningful primarily in the context of two-player (though not common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all 56 necessarily3 Competition two-strategy) and Coordination: games. Normal form games game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j when congestion occurs. YouDefinition have two possible 3.2.3 A strategies:normal form C game(for using is constant a Correct sum if there exists a constant c such implementation) and D (for usingthatCommon-payoff a for Defective each strategy one). games profile If both area also youA called and1 yourApure2 itcolleague is coordination the case that gamesu1(a,) since + u2 in(a) such = c games. pure- ∈ × adopt C thencoordination your average packetthe delay agents is have1ms (millisecond). no conflictingIf interests; you both their adopt sole D the challenge is to coordinate on an action that is maximally beneficial to all. delay is 3ms,game because of additionalFor overhead convenience, at the when network we talk router. of constant-sum Finally, if one games of going forward we will always Because of their special nature, we often represent common value games with an you adopts D and the other adoptsassume C then that thec D= adopter 0, that will is, that expe werience have no a delay zero-sum at all, game. If common-payoff games abbreviated form of the matrix in which we list only one payoff in each of the cells. 60 but the C adopter will experiencerepresent a delay3 Competition ofsituations 4ms. and of pure Coordination: coordination, Normal zero-sum form games games represent situations of pure As an example, imagine two drivers driving towards each other in a country without These consequences are showncompetition; in Figure 3.1.one player’s Your options gain must are the come two at rows, the expense and of the other player. traffic rules, and who must independently decide whether to drive on the left or on the your colleague’s options are the columns.As in the case In each of common-payoff cell, the first number games, representswe can use an abbreviated matrix form to Recap Exampleright. Matrix If the Games players choosePareto the Optimality same side (leftBest or Response right) and the Nashy have Equilibrium some high utility, and your payoff (or, minus your delay),representRock and the zero-sum second Paper number games, Scissorsrep in whichresents we your write colleague’s only one payoff value in each cell. This 1 otherwise they have a low utility. The game matrix is shown in Figure 3.4. TCP user’s payoff. value represents the payoff of player 1, and thus the negative of the payoff of player 2. game Nash Equilibria of Example Games Rock Note,0 though, that1 whereas the1 full matrix representation is unambiguous, when we use C− D Left Right Prisoner’s the abbreviation we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. dilemma game Paper 1 0 1 C A classical1, 1 example4, 0 of a− zero-sumLeft game1 is the0 game of matching pennies. In this matching game,− each− of the− two players has a penny, and independently chooses to display either pennies game Scissors heads or tails. The two players then compare their pennies. If they are the same then 1 1 0 Right 0 1 Dplayer−0, 14 pockets3 both,, 3 and otherwise player 2 pockets them. The payoff matrix is − − − shown in Figure 3.5. Figure 3.6 Rock, Paper, Scissors game. Figure 3.1 The TCP user’s (aka the Prisoner’s)Figure Dilemma. 3.4 Coordination game. B F Heads Tails Given thesezero-sum options game what shouldAt you the other adopt, end C orof theD? spectrumDoes it depe fromnd pure on what coordination you games lie zero-sum games, think your colleague will do? Furthermore,which (bearing from in the mind perspec the commenttive of the we network made earlier opera- about positive affine transforma- constant-sum tions)B are2 more, 1 properly0, 0 calledHeadsconstant-sum1 games1. Unlike common-payoff games, tor, what kind of behavior can he expect from the two users? Will any two users behave− games the same when presented with this scenario? Will the behavior change if the network c Shoham and Leyton-Brown, 2006 operator allows the users to communicateF 0, 0 with1 each, 2 other beforeTails making1 a decision?1 Under what changes to the delays would the users’ decisions still be the− same? How would the users behave if they have the opportunity to face this same decision with the Figure 3.7 Battle of the Sexes game.Figure 3.5 Matching Pennies game. same counterpart multiple times? Do answers to the above questions depend on how rational the agents are and how they view each other’s rationality? Game Theory intro The popular children’s game of Rock, Paper, ScissorsCPSC 532A, also Lecture known 3, Slide 19 as Rochambeau, Rock, Paper, 3.2.2 StrategiesGame theory in normal-form gives answers to games many of these questions. It tells us that any rational user, when presented with thisprovides scenario aonce, three-strategy will adopt generalization D—regardless of of the what matching- the pennies game. The payoff Scissors, or otherWe have user so does. far defined It tells the us actions thatmatrix allowing available of this the to zero-sum users each to player commun game in is aicategame, shown beforehand but in Figure not yet will 3.6. his In this game, each of the two Rochambeau game notset changeof strategies the outcome., or his available It tellsplayers us choices. that can for perfectly Certainlychoose either ration oneRock, kindal agents, of Paper, strategy the or decision Scissors. is to select will If both players choose the same pure strategy remaina single the action same and even play if they it; we playaction, call multiple such there a times; strategy is no however, winner, a pure ifand strategythe the number utilities, and of we timesare will zero. that use Otherwise, each of the actions thethe agents notation will we play have this already is infinite, developedwins or over even for one uncertain, actions of the to other we repres ma actions,yent see it. them and There loses adopt is, to C.however, the other remaining action. another, less obvious type of strategy;In general, a player games can choose tend to to includerandomize elements over the of set both of coordination and competition. available actions according to somePrisoner’s probability Dilemma distribut does,ion; although such a in strategy a rather is paradoxical called way. Here is another well- mixed strategy3.2 Gamesa mixed strategy in normal. Although form it mayknown not game be immediately that includes obvious both elements. why a player In this shou game,ld called Battle of the Sexes, a introduce randomness into his choicehusband of andaction, wife inwish fact in to a go multi-agent to the movies, setting and the they role can select among two movies: game in Theof mixed normal strategies form, also is critical. known We“Violence as will the returnstrategic Galore to thisor (VG)”matrix when and wform,e discuss “Gentle is the solution Love most (GL)”. concepts familiar They much prefer to go together strategic form representationfor games in the of next strategic section. interactionsrather than in game to separate theory. movies, but while the wife prefers VG the husband prefers GL. The payoff matrix is shown in Figure 3.7. We will return to this game shortly. game in matrix We define a mixed strategy for a normal form game as follows. form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanyingDefinition 3.2.4 the numbers.Let (N, Imagine(A1,...,A the playersn), of O, the µ, game u) be areMultia two normalprisoners Agent form Systems suspected game,, draftof anda crime of for February rather any 11, 2006 thanset X networklet Π( users,X) thatbe the you set each of can all either probability Confess to distributions the crime or Deny over it,X and. Then that the the absolute set of valuesmixed of the numbers represent the length of jail term each of you will get in each scenario. mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S1 Sn. c Shoham and Leyton-Brown, 2006× · · · ×

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006

58 3 Competition and Coordination: Normal form games

C D

C a, a b, c

D c, b d, d

Figure 3.3 Any c>a>d>b define an instance of Prisoner’s Dilemma.

To fully understand the role of the payoff numbers we would need to enter into utility theory a discussion of utility theory. Here, let us just mention that for most purposes, the positive affine analysis of any game is unchanged if the payoff numbers undergo any positive affine transformation transformation; this simply means that each payoff x is replaced by a payoff ax + b, where a is a fixed positive real number and b is a fixed real number. There are some restricted classes of normal-form games that deserve special men- tion.3.2 Games The first in normal is the formclass of common-payoff games. These are games in which, for59 every action profile, all players have the same payoff. constant-sum games are meaningful primarily in the context of two-player (though not common-payoff Definition 3.2.2 A common payoff game, or team game, is a game in which for all 56 necessarily3 Competition two-strategy) and Coordination: games. Normal form games game action profiles a A1 An and any pair of agents i, j, it is the case that u (a) = u (a). ∈ × · · · × team game i j when congestion occurs. YouDefinition have two possible 3.2.3 A strategies:normal form C game(for using is constant a Correct sum if there exists a constant c such implementation) and D (for usingthatCommon-payoff a for Defective each strategy one). games profile If both area also youA called and1 yourApure2 itcolleague is coordination the case that gamesu1(a,) since + u2 in(a) such = c games. pure- ∈ × adopt C thencoordination your average packetthe delay agents is have1ms (millisecond). no conflictingIf interests; you both their adopt sole D the challenge is to coordinate on an action that is maximally beneficial to all. delay is 3ms,game because of additionalFor overhead convenience, at the when network we talk router. of constant-sum Finally, if one games of going forward we will always Because of their special nature, we often represent common value games with an you adopts D and the other adoptsassume C then that thec D= adopter 0, that will is, that expe werience have no a delay zero-sum at all, game. If common-payoff games abbreviated form of the matrix in which we list only one payoff in each of the cells. 60 but the C adopter will experiencerepresent a delay3 Competition ofsituations 4ms. and of pure Coordination: coordination, Normal zero-sum form games games represent situations of pure As an example, imagine two drivers driving towards each other in a country without These consequences are showncompetition; in Figure 3.1.one player’s Your options gain must are the come two at rows, the expense and of the other player. traffic rules, and who must independently decide whether to drive on the left or on the your colleague’s options are the columns.As in the case In each of common-payoff cell, the first number games, representswe can use an abbreviated matrix form to Recap Exampleright. Matrix If the Games players choosePareto the Optimality same side (leftBest or Response right) and the Nashy have Equilibrium some high utility, and your payoff (or, minus your delay),representRock and the zero-sum second Paper number games, Scissorsrep in whichresents we your write colleague’s only one payoff value in each cell. This 1 otherwise they have a low utility. The game matrix is shown in Figure 3.4. TCP user’s payoff. value represents the payoff of player 1, and thus the negative of the payoff of player 2. game Nash Equilibria of Example Games Rock Note,0 though, that1 whereas the1 full matrix representation is unambiguous, when we use C− D Left Right Prisoner’s the abbreviation we must explicit state whether this matrix represents a common-payoff game or a zero-sum one. dilemma game Paper 1 0 1 C A classical1, 1 example4, 0 of a− zero-sumLeft game1 is the0 game of matching pennies. In this matching game,− each− of the− two players has a penny, and independently chooses to display either pennies game Scissors heads or tails. The two players then compare their pennies. If they are the same then 1 1 0 Right 0 1 Dplayer−0, 14 pockets3 both,, 3 and otherwise player 2 pockets them. The payoff matrix is − − − shown in Figure 3.5. Figure 3.6 Rock, Paper, Scissors game. Figure 3.1 The TCP user’s (aka the Prisoner’s)Figure Dilemma. 3.4 Coordination game. B F Heads Tails Given thesezero-sum options game what shouldAt you the other adopt, end C orof theD? spectrumDoes it depe fromnd pure on what coordination you games lie zero-sum games, think your colleague will do? Furthermore,which (bearing from in the mind perspec the commenttive of the we network made earlier opera- about positive affine transforma- constant-sum tions)B are2 more, 1 properly0, 0 calledHeadsconstant-sum1 games1. Unlike common-payoff games, tor, what kind of behavior can he expect from the two users? Will any two users behave− games the same when presented with this scenario? Will the behavior change if the network c Shoham and Leyton-Brown, 2006 operator allows the users to communicateF 0, 0 with1 each, 2 other beforeTails making1 a decision?1 Under what changes to the delays would the users’ decisions still be the− same? How would the users behave if they have the opportunity to face this same decision with the TheFigure paradox 3.7 ofBattle Prisoner’s of the Sexes dilemma game.Figure: the 3.5 NashMatching equilibrium Pennies isgame. the only same counterpart multiple times? Do answersnon-Pareto-optimal to the above questions outcome! depend on how rational the agents are and how they view each other’s rationality? Game Theory intro The popular children’s game of Rock, Paper, ScissorsCPSC 532A, also Lecture known 3, Slide 19 as Rochambeau, Rock, Paper, 3.2.2 StrategiesGame theory in normal-form gives answers to games many of these questions. It tells us that any rational user, when presented with thisprovides scenario aonce, three-strategy will adopt generalization D—regardless of of the what matching- the pennies game. The payoff Scissors, or otherWe have user so does. far defined It tells the us actions thatmatrix allowing available of this the to zero-sum users each to player commun game in is aicategame, shown beforehand but in Figure not yet will 3.6. his In this game, each of the two Rochambeau game notset changeof strategies the outcome., or his available It tellsplayers us choices. that can for perfectly Certainlychoose either ration oneRock, kindal agents, of Paper, strategy the or decision Scissors. is to select will If both players choose the same pure strategy remaina single the action same and even play if they it; we playaction, call multiple such there a times; strategy is no however, winner, a pure ifand strategythe the number utilities, and of we timesare will zero. that use Otherwise, each of the actions thethe agents notation will we play have this already is infinite, developedwins or over even for one uncertain, actions of the to other we repres ma actions,yent see it. them and There loses adopt is, to C.however, the other remaining action. another, less obvious type of strategy;In general, a player games can choose tend to to includerandomize elements over the of set both of coordination and competition. available actions according to somePrisoner’s probability Dilemma distribut does,ion; although such a in strategy a rather is paradoxical called way. Here is another well- mixed strategy3.2 Gamesa mixed strategy in normal. Although form it mayknown not game be immediately that includes obvious both elements. why a player In this shou game,ld called Battle of the Sexes, a introduce randomness into his choicehusband of andaction, wife inwish fact in to a go multi-agent to the movies, setting and the they role can select among two movies: game in Theof mixed normal strategies form, also is critical. known We“Violence as will the returnstrategic Galore to thisor (VG)”matrix when and wform,e discuss “Gentle is the solution Love most (GL)”. concepts familiar They much prefer to go together strategic form representationfor games in the of next strategic section. interactionsrather than in game to separate theory. movies, but while the wife prefers VG the husband prefers GL. The payoff matrix is shown in Figure 3.7. We will return to this game shortly. game in matrix We define a mixed strategy for a normal form game as follows. form 1. The term ‘Prisoners’ Dilemma’ for this famous game theoretic situation derives from the original story accompanyingDefinition 3.2.4 the numbers.Let (N, Imagine(A1,...,A the playersn), of O, the µ, game u) be areMultia two normalprisoners Agent form Systems suspected game,, draftof anda crime of for February rather any 11, 2006 thanset X networklet Π( users,X) thatbe the you set each of can all either probability Confess to distributions the crime or Deny over it,X and. Then that the the absolute set of valuesmixed of the numbers represent the length of jail term each of you will get in each scenario. mixed strategy strategies for player i is Si = Π(Ai). The set of mixed strategy profiles is simply the profiles Cartesian product of the individual mixed strategy sets, S1 Sn. c Shoham and Leyton-Brown, 2006× · · · ×

By si(ai) we denote the probability that an action ai will be played under mixed strategy si. The subset of actions that are assigned positive probability by the mixed strategy si is called the support of si.

Definition 3.2.5 The support of a mixed strategy si for a player i is the set of pure strategies a s (a ) > 0 . { i| i i } c Shoham and Leyton-Brown, 2006