Backward Induction

ISCI 330 Lecture 14

March 1, 2007

Backward Induction ISCI 330 Lecture 14, Slide 1 Recap Backward Induction Lecture Overview

Recap

Backward Induction

Backward Induction ISCI 330 Lecture 14, Slide 2 I He does it to threaten player 2, to prevent him from choosing F , and so gets 5 I However, this seems like a non-credible threat I If player 1 reached his second decision node, would he really follow through and play H?

5.1 Perfect-information extensive-form games 109

¨H1 ¨¨ HH ¨ q H 0–2 ¨ 1–1 H 2–0 ¨¨ HH ¨¨ HH 2 ¨ 2 H 2 ¡A ¡A ¡A ¡ q A ¡ q A ¡ q A no ¡ A yes no ¡ A yes no ¡ A yes ¡ A ¡ A ¡ A ¡ A ¡ A ¡ A (0,0) (2,0) (0,0) (1,1) (0,0) (0,2) q q q q q q Figure 5.1 The Sharing game. Recap Backward Induction Notice that the deﬁnition contains a subtlety. An agent’s strategy requires a decision at each choice node, regardless of whether or not it is possible to reach that node given Subgame Perfectionthe other choice nodes. In the Sharing game above the situation is straightforward— player 1 has three pure strategies, and player 2 has eight (why?). But now consider the game shown in Figure 5.2. 1 A B 2 2

C D E F 1 (3,8) (8,3) (5,5) G H

(2,10) (1,0)

Figure 5.2 A perfect-information game in extensive form.

I There’sIn something order to deﬁne a complete intuitively strategy for this wrong game, each with of the players the must equilibrium choose an action at each of his two choice nodes. Thus we can enumerate the pure strategies (B,Hof), the(C,E players as) follows. I WhyS = would(A, G), ( playerA, H), (B, 1G), ever(B, H) choose to play H if he got to the 1 { } secondS = ( choiceC, E), (C,F node?), (D, E), (D,F ) 2 { } It isI importantAfter to all, noteG thatdominates we have to includeH thefor strategies him(A, G) and (A, H), even though once A is chosen the G-versus-H choice is moot. The deﬁnition of best response and Nash equilibria in this game are exactly as they are in for normal form games. Indeed, this example illustrates how every perfect- information game can be converted to an equivalent normal form game. For example, the perfect-information game of Figure 5.2 can be converted into the normal form im- age of the game, shown in Figure 5.3. Clearly, the strategy spaces of the two games are

Multi Agent Systems, draft of September 19, 2006

Backward Induction ISCI 330 Lecture 14, Slide 3 5.1 Perfect-information extensive-form games 109

C D E F 1 (3,8) (8,3) (5,5) G H

(2,10) (1,0)

Figure 5.2 A perfect-information game in extensive form.

I There’sIn something order to deﬁne a complete intuitively strategy for this wrong game, each with of the players the must equilibrium choose an action at each of his two choice nodes. Thus we can enumerate the pure strategies (B,Hof), the(C,E players as) follows. I WhyS = would(A, G), ( playerA, H), (B, 1G), ever(B, H) choose to play H if he got to the 1 { } secondS = ( choiceC, E), (C,F node?), (D, E), (D,F ) 2 { } It isI importantAfter to all, noteG thatdominates we have to includeH thefor strategies him(A, G) and (A, H), even though once A is chosen the G-versus-H choice is moot. I HeThe does deﬁnition it ofto best threaten response and playerNash equilibria 2, toin this prevent game are exactly him as from they choosing Fare, andin for normal so gets form games. 5 Indeed, this example illustrates how every perfect- information game can be converted to an equivalent normal form game. For example, theI perfect-informationHowever, this game ofseems Figure 5.2 like cana be non-credibleconverted into the normal threat form im- age of the game, shown in Figure 5.3. Clearly, the strategy spaces of the two games are I If player 1 reached his second decision node, would he really follow throughMulti Agent Systems and, play draft ofH September? 19, 2006

Backward Induction ISCI 330 Lecture 14, Slide 3 Recap Backward Induction Formal Deﬁnition

I Deﬁne subgame of G rooted at h:

I the restriction of G to the descendents of H.

I Deﬁne set of subgames of G:

I subgames of G rooted at nodes in G

I s is a subgame perfect equilibrium of G iﬀ for any subgame G0 of G, the restriction of s to G0 is a Nash equilibrium of G0

I Notes:

I since G is its own subgame, every SPE is a NE. I this deﬁnition rules out “non-credible threats”

Backward Induction ISCI 330 Lecture 14, Slide 4 I (A, G), (C,F ) is subgame perfect I (B,H) is an non-credible threat, so (B,H), (C,E) is not subgame perfect

I (A, H) is also non-credible, even though H is “oﬀ-path”

5.1 Perfect-information extensive-form games 109

¨H1 ¨¨ HH ¨ q H 0–2 ¨ 1–1 H 2–0 ¨¨ HH ¨¨ HH 2 ¨ 2 H 2 ¡A ¡A ¡A ¡ q A ¡ q A ¡ q A no ¡ A yes no ¡ A yes no ¡ A yes ¡ A ¡ A ¡ A ¡ A ¡ A ¡ A (0,0) (2,0) (0,0) (1,1) (0,0) (0,2) Recap q q q q q q Backward Induction Figure 5.1 The Sharing game. Back to the Example Notice that the deﬁnition contains a subtlety. An agent’s strategy requires a decision at each choice node, regardless of whether or not it is possible to reach that node given the other choice nodes. In the Sharing game above the situation is straightforward— player 1 has three pure strategies, and player 2 has eight (why?). But now consider the game shown in Figure 5.2. 1 A B 2 2

C D E F 1 (3,8) (8,3) (5,5) G H

(2,10) (1,0)

Figure 5.2 A perfect-information game in extensive form.

In order to deﬁne a complete strategy for this game, each of the players must choose I Whichan equilibria action at each of his from two choice the nodes. example Thus we can are enumerat subgamee the pure strategies perfect? of the players as follows. S = (A, G), (A, H), (B, G), (B, H) 1 { } S = (C, E), (C,F ), (D, E), (D,F ) 2 { } It is important to note that we have to include the strategies (A, G) and (A, H), even though once A is chosen the G-versus-H choice is moot. The deﬁnition of best response and Nash equilibria in this game are exactly as they are in for normal form games. Indeed, this example illustrates how every perfect- information game can be converted to an equivalent normal form game. For example, the perfect-information game of Figure 5.2 can be converted into the normal form im- age of the game, shown in Figure 5.3. Clearly, the strategy spaces of the two games are Backward Induction ISCI 330 Lecture 14, Slide 5 Multi Agent Systems, draft of September 19, 2006 5.1 Perfect-information extensive-form games 109

C D E F 1 (3,8) (8,3) (5,5) G H

(2,10) (1,0)

Figure 5.2 A perfect-information game in extensive form.

In order to define a complete strategy for this game, each of the players must choose I Whichan equilibria action at each of his from two choice the nodes. example Thus we can are enumerat subgamee the pure strategies perfect? of the players as follows. I (A, G), (C,F ) is subgame perfect S = (A, G), (A, H), (B, G), (B, H) 1 { } I (B,HS =)(C,is E an), (C,F non-credible), (D, E), (D,F ) threat, so (B,H), (C,E) is not 2 { } subgameIt is important perfect to note that we have to include the strategies (A, G) and (A, H), even though once A is chosen the G-versus-H choice is moot. I (A, H) is also non-credible, even though H is “off-path” The definition of best response and Nash equilibria in this game are exactly as they are in for normal form games. Indeed, this example illustrates how every perfect- information game can be converted to an equivalent normal form game. For example, the perfect-information game of Figure 5.2 can be converted into the normal form im- age of the game, shown in Figure 5.3. Clearly, the strategy spaces of the two games are Backward Induction ISCI 330 Lecture 14, Slide 5 Multi Agent Systems, draft of September 19, 2006 Recap Backward Induction Lecture Overview

Recap

Backward Induction

Backward Induction ISCI 330 Lecture 14, Slide 6 Recap Backward Induction Centipede Game

118 5 Reasoning and Computing with the Extensive Form

1 A 2 A 1 A 2 A 1 A (3,5) D q D q D q D q D q

(1,0) (0,2) (3,1) (2,4) (4,3)

Figure 5.9 The centipede game

I Play this as a fun game... place. In other words, you have reached a state to which your analysis has given a probability of zero. How should you amend your beliefs and course of action based on this measure-zero event? It turns out this seemingly small inconvenience actually raises a fundamental problem in game theory. We will not develop the subject further here, but let us only mention that there exist different accounts of this situation, and

theyBackward depend Induction on the probabilistic assumptions made, on what is commonISCI 330 knowledge Lecture 14, Slide (in 7 particular, whether there is common knowledge of rationality), and on exactly how one revises one’s beliefs in the face of measure zero events. The last question is intimately related to the subject of belief revision discussed in Chapter 2.

5.2 Imperfect-information extensive-form games

Up to this point, in our discussion of extensive-form games we have allowed players to specify the action that they would take at every choice node of the game. This implies that players know the node they are in, and—recalling that in such games we equate nodes with the histories that led to them—all the prior choices, including those of other agents. For this reason we have called these perfect-information games. We might not always want to make such a strong assumption about our players and our environment. In many situations we may want to model agents needing to act with partial or no knowledge of the actions taken by others, or even agents with limited memory of their own past actions. The sequencing of choices allows us to represent such ignorance to a limited degree; an “earlier” choice might be interpreted as a choice made without knowing the “later” choices. However, we cannot represent two choices made in the same play of the game in mutual ignorance of each other. The normal form, of course, is optimized for such modelling.

5.2.1 Definition Imperfect-information games in extensive form address this limitation. An imperfect- information game is an extensive-form game in which each player’s choice nodes are information sets partitioned into information sets; intuitively, if two choice nodes are in the same information set then the agent cannot distinguish between them. From the technical point of view, imperfect-information games are obtained by overlaying a partition structure, as defined in Chapter 1 in connection with models of knowledge, over a perfect-information game. imperfect- Definition 5.2.1 An imperfect-information game (in extensive form) is a tuple information (N,A,H,Z,χ,ρ,σ,u,I), where game c Shoham and Leyton-Brown, 2006

Recap Backward Induction Computing Subgame Perfect Equilibria

I Idea: Identify the equilibria in the bottom-most trees, and adopt these as one moves up the tree I Procedure I starting at each terminal node, move up to its parent node and label it with the utility values of the terminal node that would be the best response for the player who gets to choose at this parent node

I repeat this procedure, treating the highest already labeled nodes as terminal nodes, until the root is reached

I Stop when the root of the tree is labeled I Note: Before performing this procedure at any given node, it must be performed at all subnodes ﬁrst I the procedure doesn’t return an equilibrium strategy, but rather labels each node with a vector of real numbers.

I This labeling can be seen as an extension of the game’s utility function to the non-terminal nodes

I The equilibrium strategies: take the best action at each node.

Backward Induction ISCI 330 Lecture 14, Slide 8 Recap Backward Induction

118 Backward Induction 5 Reasoning and Computing with the Extensive Form

1 A 2 A 1 A 2 A 1 A (3,5) D q D q D q D q D q

(1,0) (0,2) (3,1) (2,4) (4,3)

Figure 5.9 The centipede game I What happens when we use this procedure on Centipede? I In the only equilibrium, player 1 goes down in the ﬁrst move. place. In other words, you have reached a state to which your analysis has given a probabilityI However, of zero. How this should outcome you amend is Pareto-dominated your beliefs and course by of all action but based one on this measure-zeroother outcome. event? It turns out this seemingly small inconvenience actually raises a fundamental problem in game theory. We will not develop the subject further I Two considerations: here, but let us only mention that there exist different accounts of this situation, and theyI dependpractical: on the probabilistic humansubjects assumptions don’t made, on go what down is common right knowledge away (in particular,I theoretical: whether there what is common should knowledge you ofdo rationali as playerty), and 2 on if exactly player how 1 one doesn’t revises one’sgo down? beliefs in the face of measure zero events. The last question is intimately related to the subject of belief revision discussed in Chapter 2. I SPE analysis says to go down. However, that same analysis says that P1 would already have gone down. How do you 5.2 Imperfect-informationupdate your extensive-form beliefs upon observation games of a measure zero event? Up to this point,I but in our if playerdiscussion 1 knowsof extensive-form that you’ll games do w somethinge have allowed else, players it is to specify the actionrational that they for would him take not at to every go choice down node anymore... of the game. a paradox This implies that playersI knowthere’s the node a whole they are literature in, and—recalling on this that question in such games we equate Backward Inductionnodes with the histories that led to them—all the prior choices, including thoseISCI of 330 other Lecture 14, Slide 9 agents. For this reason we have called these perfect-information games. We might not always want to make such a strong assumption about our players and our environment. In many situations we may want to model agents needing to act with partial or no knowledge of the actions taken by others, or even agents with limited memory of their own past actions. The sequencing of choices allows us to represent such ignorance to a limited degree; an “earlier” choice might be interpreted as a choice made without knowing the “later” choices. However, we cannot represent two choices made in the same play of the game in mutual ignorance of each other. The normal form, of course, is optimized for such modelling.