Game Theory Yiling Chen SEAS

AM 121: Intro to Optimization Models and Methods
Lecture 11: Game theory
Yiling Chen
SEAS

Lesson Plan
- Two player, zero-sum games
- 4 techniques to solve such games
- The minimax theorem and Nash equilibrium
- Solving poker (for those interested)
Very elegant connection between game theory and duality theory!

Example: Political Campaign
Two days, two cities. Available strategies:
- one day in each city (E)
- two days in city A
- two days in city B
Payoff for Row player:
- …'s of votes won from opponent
- (column player's payoff is - payoff to row player)

Each cell lists (payoff to row, payoff to column):

                  column
               E        A        B
        E    1, -1    2, -2    4, -4
row     A    1, -1    0, 0     5, -5
        B    0, 0     1, -1    -1, 1

Because the column player's payoff is always the negative of the row player's, the table can list the payoff to row only:

                  column
               E     A     B
        E      1     2     4
row     A      1     0     5
        B      0     1    -1

The family of games we consider
Two player, zero sum games
- $\{1,\dots,m\}$ strategies for row player and $\{1,\dots,n\}$ strategies for column player
- Denote entry $a_{ij} \in \mathbb{R}$ in payoff table, the payoff to row when row plays i and column plays j
- "Payoff" to players: $\pi_r(i,j) = a_{ij}$, $\pi_c(i,j) = -a_{ij}$
  $\Rightarrow \pi_c(i,j) + \pi_r(i,j) = 0$ (the "zero sum" property)
Goal: compute a solution to the game that provides an optimal strategy for each player

Solution Concept: Nash Equilibrium
Roughly speaking, a strategy profile is a NE iff every player's strategy is optimal given other players' strategies.

I. Solving via Iterated removal of strictly dominated strategies
Row: Strategy i is strictly dominated by strategy i' when
- $\pi_r(i',j) \ge \pi_r(i,j)$ for all $j \in \{1,\dots,n\}$ and
- $\pi_r(i',j) > \pi_r(i,j)$ for some $j \in \{1,\dots,n\}$
Column: Strategy j is strictly dominated by strategy j' when
- $\pi_c(i,j') \ge \pi_c(i,j)$ for all $i \in \{1,\dots,m\}$ and
- $\pi_c(i,j') > \pi_c(i,j)$ for some $i \in \{1,\dots,m\}$
Can apply iteratively: iterated removal of strictly dominated strategies

Example: Political Campaign
Can solve by iterated removal of strictly dominated strategies:

               E     A     B
        E      1     2     4
        A      1     0     5
        B      0     1    -1

Order of removal:
1. Row B (dominated by row E).
2. Column B (dominated by column E, once row B is gone).
3. Row A (dominated by row E, once column B is gone).
4. Column A (dominated by column E, once row A is gone).
Solution is (E,E).
Say game has value 1. (Payoff to row player in solution.)

Political Campaign: Variation 1
No dominated strategies! Consider "best-response" dynamics.

               E     A     B
        E     -3    -2     6
        A      2     0     2
        B      5    -2    -4

The arrow from (E,E) to (B,E) indicates that (B,E) has more payoff for row than (E,E) and that row would deviate and play B if it knew column was playing E.
Not all arrows are shown! (e.g., also one from (A,E) to (B,E) not shown.)
Unique stable solution is (A,A).
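The iterated removal procedure from Section I can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: the function name `iterated_removal` and the index-based representation are assumptions made here. It uses the dominance test exactly as the slides state it (at least as good everywhere, strictly better somewhere), which other texts often call weak dominance, and it stores only the payoff to row, so column prefers smaller entries.

```python
def iterated_removal(a):
    """Iteratively remove dominated strategies from a zero-sum game.

    a[i][j] is the payoff to row; column receives -a[i][j].
    Returns the lists of surviving row and column indices.
    (Hypothetical helper written for this example.)
    """
    rows = list(range(len(a)))
    cols = list(range(len(a[0])))

    def row_dominated(i):
        # Some other surviving row k is at least as good for row against
        # every surviving column and strictly better against at least one.
        return any(
            all(a[k][j] >= a[i][j] for j in cols)
            and any(a[k][j] > a[i][j] for j in cols)
            for k in rows if k != i
        )

    def col_dominated(j):
        # Column's payoff is -a[i][j], so column k beats column j when its
        # entries are no larger everywhere and strictly smaller somewhere.
        return any(
            all(a[i][j] >= a[i][k] for i in rows)
            and any(a[i][j] > a[i][k] for i in rows)
            for k in cols if k != j
        )

    while True:
        bad_row = next((i for i in rows if row_dominated(i)), None)
        if bad_row is not None:
            rows.remove(bad_row)
            continue
        bad_col = next((j for j in cols if col_dominated(j)), None)
        if bad_col is not None:
            cols.remove(bad_col)
            continue
        return rows, cols


names = ["E", "A", "B"]
campaign = [[1, 2, 4], [1, 0, 5], [0, 1, -1]]
rows, cols = iterated_removal(campaign)
print([names[i] for i in rows], [names[j] for j in cols])  # ['E'] ['E']
```

On the original campaign table the sketch removes row B, column B, row A, and column A in that order, leaving (E,E). On the Variation 1 table it removes nothing, which is why another technique is needed there.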
Political Campaign: Variation 1
Can also consider the "minimax criterion":

                 E     A     B     min value
        E       -3    -2     6        -3
        A        2     0     2         0     (maximin strategy)
        B        5    -2    -4        -4
max value        5     0     6
                       (minimax strategy: column A)

Saddle point: (A,A).
min value: worst case payoff row can expect to achieve when it plays each strategy.
max value: worst case payoff column can expect to achieve when it plays each strategy (stated as payoff to row).

II. Solving via Saddle points
Minimax criterion:
- Row: $\max_i \min_j \pi_r(i,j)$
- Column: $\min_j \max_i \pi_r(i,j)$ $(= -\max_j \min_i \pi_c(i,j))$
In example, same entry (A,A) yields the "maximin" value for row and the "minimax" value for column. Called a "saddle point".
This is why the maximin and minimax strategies form a stable point and solve this game. Even if column knows row is playing A, column does not want to deviate (and vice versa).

Political Campaign: Variation 2
Strategy B for row is dominated by strategy A. Apply minimax criterion?

                 E     A     B     min value
        E        0    -2     2        -2     (maximin strategy for row)
        A        3     4    -3        -3
        B        2     3    -4               (dominated by A)
max value        3     4     2
                             (minimax strategy for column: B)

But (E,B) is not a saddle point.
$\Rightarrow$ column can deviate and play A, then row can deviate and play A, and so on…

Political Campaign: Variation 2
Consider best-response dynamics on the same table.
No stable solution! If one player knows the other player's strategy then he or she has a useful deviation.

Simpler Example: Odds and Evens
Two players (Odd and Even) simultaneously choose the number of fingers (1 or 2) to put out. If the sum is odd, then Odd wins $1 from Even. If the sum is even, Even wins $1 from Odd.

Payoff to Odd (row):

                  Even
               one    two
Odd     one    -1     +1
        two    +1     -1

No stable solution. How do you play this game?
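The minimax criterion of Section II is easy to check mechanically: compute each row's minimum and each column's maximum, then compare the largest row minimum with the smallest column maximum. The sketch below assumes the table holds payoffs to row; the function name `saddle_point` is a hypothetical choice for this example.

```python
def saddle_point(a):
    """Return (i, j, value) for a pure-strategy saddle point, or None.

    a[i][j] is the payoff to row in a zero-sum game.
    (Hypothetical helper written for this example.)
    """
    # Worst case for row when it commits to each pure strategy.
    row_mins = [min(row) for row in a]
    # Worst case for column (largest payoff to row) for each pure strategy.
    col_maxs = [max(row[j] for row in a) for j in range(len(a[0]))]

    maximin = max(row_mins)
    minimax = min(col_maxs)
    if maximin != minimax:
        return None  # no saddle point in pure strategies

    # When the two values agree, the entry at this row and column equals
    # both of them, so neither player gains by deviating.
    return row_mins.index(maximin), col_maxs.index(minimax), maximin


variation_1 = [[-3, -2, 6], [2, 0, 2], [5, -2, -4]]
variation_2 = [[0, -2, 2], [3, 4, -3], [2, 3, -4]]
odds_and_evens = [[-1, 1], [1, -1]]

print(saddle_point(variation_1))     # (1, 1, 0), i.e. (A,A) with value 0
print(saddle_point(variation_2))     # None
print(saddle_point(odds_and_evens))  # None
```

The two `None` results match the slides: in Variation 2 the maximin value is -2 and the minimax value is 2, and in Odds and Evens they are -1 and +1, so neither game has a stable pure-strategy solution.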
Mixed strategies
Let $x = (x_1,\dots,x_m)$, $x_i \ge 0$, $\sum_i x_i = 1$ denote a mixed strategy of row player; the probability $x_i$ with which row plays each strategy $i \in \{1,\dots,m\}$
Let $y = (y_1,\dots,y_n)$, $y_j \ge 0$, $\sum_j y_j = 1$ denote a mixed strategy of column player
Define a maximin strategy $x^*$ as solving
  $\max_x \min_{j \in \{1,\dots,n\}} \sum_{i=1}^m x_i a_{ij}$
Find the mixed strategy x that maximizes my worst-case payoff (given that column knows what I will do)
Define a minimax strategy $y^*$ as solving
  $\min_y \max_{i \in \{1,\dots,m\}} \sum_{j=1}^n y_j a_{ij}$

Graphical method (row), Odds and Evens
Plot expected value to row for each pure strategy of column as mixed strategy of row varies.

                  Even
               one    two
Odd     one    -1     +1
        two    +1     -1

[Figure: expected value (EV) to row player, from -1 to 1, against $x_1$ = prob. "one" by row, from 0 to 1. Two lines: EV when column plays "one" ($1 - 2x_1$) and EV when column plays "two" ($2x_1 - 1$). They cross at $x_1 = 1/2$, marked "maximin for row".]
Lower envelope is the minimal payoff row would receive for each strategy $(x_1, 1-x_1)$

Graphical method (column), Odds and Evens
Plot expected value to row for each pure strategy of row as mixed strategy of column varies.
[Figure: EV to row player, from -1 to 1, against $y_1$ = prob. "one" by col, from 0 to 1. Two lines: EV when row plays "one" ($1 - 2y_1$) and EV when row plays "two" ($2y_1 - 1$). They cross at $y_1 = 1/2$, marked "minimax for column".]
Upper envelope is the largest $\pi_r$ $(= -\pi_c)$ for each strategy $(y_1, 1-y_1)$ by column

Mixed strategies
In the example, the "maximin criterion" suggests $x^* = (1/2, 1/2)$ and $y^* = (1/2, 1/2)$ as solutions
Note: the value of the maximin strategy to row = the value of the minimax strategy to column = 0
Also a stable solution: if column plays $y^*$ then row does not wish to deviate from $x^*$, and vice versa
Consider this a solution to the game

III. Solving via the Graphical method
The graphical method can be used in a game with only two pure strategies for each player.
Row: vary $x_1$ and plot EV to row for each possible strategy $j \in \{1,\dots,n\}$ of column. Take the lower envelope and find the $x_1^*$ that maximizes it.
Column: vary $y_1$ and plot EV to row for each possible strategy $i \in \{1,\dots,m\}$ of row. Take the upper envelope and find the $y_1^*$ that minimizes it.
Can also use graphical method as long as one of the players has only two pure strategies; can solve for other player algebraically.

IV. Solving via LP
Computing maximin strategy via LP

  $\max_x \min_j \sum_{i=1}^m a_{ij} x_i$
  s.t. $\sum_{i=1}^m x_i = 1$
       $x_i \ge 0$

                  y1    y2    y3
                  E     A     B
        x1  E     0    -2     2
        x2  A     3     4    -3

Equivalent LP:

  max v
  s.t. $v \le \sum_{i=1}^m a_{ij} x_i$, $\forall j \in \{1,\dots,n\}$
       $\sum_{i=1}^m x_i = 1$
       $x_i \ge 0$, $\forall i \in \{1,\dots,m\}$
       v free

Example: Political campaign
max v
s.t.
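As an illustration of the general LP on the 2x3 table (rows E and A against columns E, A, B), the constraints $v \le \sum_i a_{ij} x_i$ work out to $v \le 3x_2$, $v \le -2x_1 + 4x_2$ and $v \le 2x_1 - 3x_2$, together with $x_1 + x_2 = 1$ and $x_i \ge 0$. Because row has only two pure strategies here, this small LP can be solved exactly with the graphical method of Section III rather than an LP solver. The sketch below takes that route; it is not the lecture's method for general games, and the function name `maximin_two_rows` is an assumption made for this example.

```python
from fractions import Fraction
from itertools import combinations


def maximin_two_rows(a):
    """Maximin mixed strategy for row in a 2 x n zero-sum game.

    a[0][j] and a[1][j] are payoffs to row. Returns (x1, v), where x1 is
    the probability of row's first strategy and v is the guaranteed value.
    (Hypothetical helper written for this example.)
    """
    top = [Fraction(t) for t in a[0]]
    bottom = [Fraction(b) for b in a[1]]
    n = len(top)

    def worst_case(x1):
        # Lower envelope: column picks the pure strategy that hurts row most.
        return min(x1 * top[j] + (1 - x1) * bottom[j] for j in range(n))

    # The lower envelope is concave and piecewise linear, so its maximum is
    # at an endpoint of [0, 1] or at a point where two of the lines cross.
    candidates = [Fraction(0), Fraction(1)]
    for j, k in combinations(range(n), 2):
        slope_gap = (top[j] - bottom[j]) - (top[k] - bottom[k])
        if slope_gap != 0:
            x1 = (bottom[k] - bottom[j]) / slope_gap
            if 1 >= x1 >= 0:
                candidates.append(x1)

    best = max(candidates, key=worst_case)
    return best, worst_case(best)


x1, v = maximin_two_rows([[0, -2, 2], [3, 4, -3]])
print(x1, 1 - x1, v)  # 7/11 4/11 2/11
```

For the table above this gives $x^* = (7/11, 4/11)$ with value $v = 2/11$, and the same function returns $x_1 = 1/2$, $v = 0$ for Odds and Evens. When both players have more than two pure strategies, the envelope argument no longer applies and a general LP solver would be needed.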
