EECS 598-005: Theoretical Foundations of Machine Learning          Fall 2015

Lecture 18: Nash's Theorem and Von Neumann's Minimax Theorem

Lecturer: Jacob Abernethy    Scribes: Lili Su    Editors: Weiqing Yu and Andrew Melfi

18.1 Review: On-line Learning with Experts (Actions)

Setting  Given $n$ experts (actions), the general on-line setting involves $T$ rounds. For round $t = 1, \ldots, T$:

• The algorithm plays the distribution $p^t = \omega^t / \|\omega^t\|_1 \in \Delta_n$.

• The $i$-th expert (action) suffers the loss $\ell_i^t \in [0, 1]$.

• The algorithm suffers the loss $p^t \cdot \ell^t$.

Theorem 18.1 (Regret Bound for EWA).
$$\frac{1}{T} \sum_{t=1}^{T} p^t \cdot \ell^t \;\le\; \min_{i} \frac{1}{T} \sum_{t=1}^{T} \ell_i^t + O\!\left(\sqrt{\frac{\log n}{T}}\right) \;=\; \min_{p \in \Delta_n} \frac{1}{T} \sum_{t=1}^{T} p \cdot \ell^t + O\!\left(\sqrt{\frac{\log n}{T}}\right)$$

Note  The distribution $p = e_i$ puts all mass on the $i$-th action; since a linear function over the simplex $\Delta_n$ is minimized at a vertex, the two minima above coincide.

18.2 Two Player Game

Definition 18.2 (Two Player Game). A two player game is defined by a pair of matrices $M, N \in [0,1]^{n \times m}$.

Definition 18.3 (Pure Strategy). With a pure strategy in a two player game, P1 chooses an action $i \in [n]$, and P2 chooses an action $j \in [m]$. P1 thus earns $M_{ij}$, and P2 earns $N_{ij}$.

Definition 18.4 (Mixed Strategy). With a mixed strategy in a two player game, P1 plays with a distribution $p \in \Delta_n$, and P2 plays with a distribution $q \in \Delta_m$. P1 thus earns $p^\top M q = \sum_{i,j} p_i q_j M_{ij}$, and P2 earns $p^\top N q = \sum_{i,j} p_i q_j N_{ij}$.

Definition 18.5 (Zero-sum Game). A zero-sum game is a two player game in which the matrices satisfy $M = -N$.

18.3 Nash's Theorem

Definition 18.6 (Nash Equilibrium). In a two player game, a Nash equilibrium (NEq) is a pair of distributions $\tilde{p} \in \Delta_n$ for P1 and $\tilde{q} \in \Delta_m$ for P2 satisfying

• for all $p \in \Delta_n$: $\tilde{p}^\top M \tilde{q} \ge p^\top M \tilde{q}$,

• for all $q \in \Delta_m$: $\tilde{p}^\top N \tilde{q} \ge \tilde{p}^\top N q$.

Theorem 18.7 (Nash's Theorem). Every two player game has a Nash equilibrium (NEq). (Not all games have pure strategy equilibria.)

Lemma 18.8 (Brouwer's Fixed-point Theorem). Let $B \subseteq \mathbb{R}^d$ be a compact convex set, and let $f : B \to B$ be continuous. Then there exists $x \in B$ such that $x = f(x)$.

Proof Sketch of Nash's Theorem

1. Let $c_i(p, q) = \max\{0,\; e_i^\top M q - p^\top M q\}$ and $d_j(p, q) = \max\{0,\; p^\top N e_j - p^\top N q\}$, the gains available to each player from switching mass onto a single action.

2. Define a map $f : (p, q) \mapsto (p', q')$ by
$$p'_i = \frac{p_i + c_i(p, q)}{1 + \sum_{i' \in [n]} c_{i'}(p, q)}, \qquad q'_j = \frac{q_j + d_j(p, q)}{1 + \sum_{j' \in [m]} d_{j'}(p, q)}.$$

3. By Brouwer's fixed-point theorem, there exists a fixed point $(\tilde{p}, \tilde{q})$ with $f(\tilde{p}, \tilde{q}) = (\tilde{p}, \tilde{q})$.

4. Show that the fixed point $(\tilde{p}, \tilde{q})$ is a Nash equilibrium.

18.4 Von Neumann's Minimax Theorem

Theorem 18.9 (Von Neumann's Minimax Theorem).
$$\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q = \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q$$

Proof by Nash's Theorem

• Exercise.

Proof by the Exponential Weighted Average Algorithm

a) The "$\ge$" direction is straightforward. Let $p_1 \in \Delta_n$, $q_1 \in \Delta_m$ be the choices achieving $\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q = p_1^\top M q_1$, and let $p_2 \in \Delta_n$, $q_2 \in \Delta_m$ be the choices achieving $\max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q = p_2^\top M q_2$. Then
$$\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q = p_1^\top M q_1 \ge p_1^\top M q_2 \ge p_2^\top M q_2 = \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q.$$
An intuitive explanation for the first inequality: in $\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q$, $q$ is chosen to maximize $p^\top M q$ for the given $p$, so $p_1^\top M q_1 \ge p_1^\top M q$ for every $q$, in particular for $q = q_2$. The second inequality holds similarly, since $p_2$ minimizes $p^\top M q_2$ over $p$.

b) We show the "$\le$" direction holds up to an $O\!\left(\frac{1}{\sqrt{T}}\right)$ approximation.

Setting  Imagine playing a $T$-round game against a really hard adversary. For round $t = 1, \ldots, T$:

• Player 1 plays with the distribution $p^t = \omega^t / \|\omega^t\|_1 \in \Delta_n$.

• Player 2 plays with the distribution $q^t = \arg\max_{q \in \Delta_m} (p^t)^\top M q$.

• Let $\ell^t = M q^t$, so Player 1 suffers the loss $p^t \cdot \ell^t = (p^t)^\top M q^t$.

• Let $\omega^1 = (1, \ldots, 1)$, and update $\omega_i^{t+1} = \omega_i^t \exp(-\eta \ell_i^t)$.

Trick  Analyze $\frac{1}{T} \sum_{t=1}^{T} (p^t)^\top M q^t$.

1. Since each $q^t$ is a best response to $p^t$, and by Jensen's inequality (the maximum of an average is at most the average of the maxima),
$$\frac{1}{T} \sum_{t=1}^{T} (p^t)^\top M q^t = \frac{1}{T} \sum_{t=1}^{T} \max_{q \in \Delta_m} (p^t)^\top M q \ge \max_{q \in \Delta_m} \left( \frac{1}{T} \sum_{t=1}^{T} p^t \right)^{\!\top} M q \ge \min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q.$$
2. By the exponential weighted average algorithm,
$$\begin{aligned}
\frac{1}{T} \sum_{t=1}^{T} (p^t)^\top M q^t &= \frac{1}{T} \sum_{t=1}^{T} p^t \cdot \ell^t \\
&\le \min_{i} \frac{1}{T} \sum_{t=1}^{T} \ell_i^t + \epsilon_T \qquad \left( \epsilon_T := \frac{\mathrm{Regret}_T}{T} \right) \\
&= \min_{p \in \Delta_n} \frac{1}{T} \sum_{t=1}^{T} p \cdot \ell^t + \epsilon_T \\
&= \min_{p \in \Delta_n} \frac{1}{T} \sum_{t=1}^{T} p^\top M q^t + \epsilon_T \\
&= \min_{p \in \Delta_n} p^\top M \left( \frac{1}{T} \sum_{t=1}^{T} q^t \right) + \epsilon_T \\
&\le \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q + \epsilon_T.
\end{aligned}$$

Putting the results in 1. and 2. together, we have
$$\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q \le \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q + \epsilon_T.$$
Since $T$ can be chosen as large as we want, $\epsilon_T = O\!\left(\frac{1}{\sqrt{T}}\right)$ vanishes. This completes the proof of the "$\le$" direction.

Theorem 18.10 (Generalization of Von Neumann's Minimax Theorem). Let $X \subseteq \mathbb{R}^n$ and $Y \subseteq \mathbb{R}^m$ be compact convex sets, and let $f : X \times Y \to \mathbb{R}$ be a differentiable function with bounded gradients such that $f(\cdot, y)$ is convex in its first argument for every fixed $y$, and $f(x, \cdot)$ is concave in its second argument for every fixed $x$. Then
$$\inf_{x \in X} \sup_{y \in Y} f(x, y) = \sup_{y \in Y} \inf_{x \in X} f(x, y).$$
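The proof of the "$\le$" direction in Section 18.4 is constructive, and can be run as an actual algorithm: Player 1 maintains exponential weights over its rows while Player 2 best-responds each round, and the time-averaged payoff sandwiches the value of the game up to $\epsilon_T$. Below is a minimal Python sketch of that self-play loop; the test matrix (matching pennies shifted into $[0,1]$, whose value is $1/2$), the horizon $T$, and the step size $\eta$ are illustrative choices, not from the notes.

```python
import math

def ewa_game_value(M, T=2000, eta=None):
    """Approximate min_p max_q p^T M q by EWA self-play.

    Player 1 runs exponential weights on the losses l^t = M q^t;
    Player 2 best-responds with q^t = argmax_q (p^t)^T M q, which is
    a pure action since a linear function on the simplex is maximized
    at a vertex.
    """
    n, m = len(M), len(M[0])
    if eta is None:
        eta = math.sqrt(math.log(n) / T)   # illustrative step size
    w = [1.0] * n                          # omega^1 = (1, ..., 1)
    total = 0.0
    for _ in range(T):
        s = sum(w)
        p = [wi / s for wi in w]           # p^t = omega^t / ||omega^t||_1
        # Player 2's best response: the column j maximizing (p^t)^T M e_j.
        col_payoff = [sum(p[i] * M[i][j] for i in range(n)) for j in range(m)]
        j = max(range(m), key=lambda jj: col_payoff[jj])
        loss = [M[i][j] for i in range(n)]          # l^t = M q^t, q^t = e_j
        total += sum(p[i] * loss[i] for i in range(n))
        w = [w[i] * math.exp(-eta * loss[i]) for i in range(n)]
    return total / T   # within O(sqrt(log n / T)) of the game value

# Matching pennies with payoffs shifted into [0, 1]; its value is 1/2.
M = [[1.0, 0.0],
     [0.0, 1.0]]
value = ewa_game_value(M)
```

As the analysis predicts, `value` lands within roughly $\epsilon_T$ of the true value $1/2$; increasing `T` tightens the approximation.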
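The map $f$ from the proof sketch of Nash's Theorem can also be checked numerically: at a Nash equilibrium every improvement term $c_i$ and $d_j$ is zero, so $f$ fixes the point. A small sketch below, again using matching pennies shifted into $[0,1]$ (so $M, N \in [0,1]^{2 \times 2}$ as in Definition 18.2) and its known equilibrium $\tilde{p} = \tilde{q} = (1/2, 1/2)$ as the test case; the matrices and equilibrium are illustrative, not from the notes.

```python
def nash_map(M, N, p, q):
    """One application of the map f(p, q) = (p', q') from the proof sketch."""
    n, m = len(M), len(M[0])
    Mq = [sum(M[i][j] * q[j] for j in range(m)) for i in range(n)]       # (Mq)_i
    pN = [sum(p[i] * N[i][j] for i in range(n)) for j in range(m)]       # (p^T N)_j
    pMq = sum(p[i] * Mq[i] for i in range(n))
    pNq = sum(pN[j] * q[j] for j in range(m))
    c = [max(0.0, Mq[i] - pMq) for i in range(n)]    # c_i(p, q)
    d = [max(0.0, pN[j] - pNq) for j in range(m)]    # d_j(p, q)
    p2 = [(p[i] + c[i]) / (1 + sum(c)) for i in range(n)]
    q2 = [(q[j] + d[j]) / (1 + sum(d)) for j in range(m)]
    return p2, q2

# Matching pennies shifted into [0, 1]: equilibrium p = q = (1/2, 1/2).
M = [[1.0, 0.0], [0.0, 1.0]]
N = [[0.0, 1.0], [1.0, 0.0]]
p_eq, q_eq = [0.5, 0.5], [0.5, 0.5]
p2, q2 = nash_map(M, N, p_eq, q_eq)
# At the equilibrium all c_i and d_j vanish, so (p, q) is a fixed point of f.
```

Note that iterating `nash_map` from an arbitrary starting point is not guaranteed to converge to an equilibrium; Brouwer's theorem only asserts that a fixed point exists, which is why step 3 of the proof invokes it rather than an explicit iteration.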