Lecture 18: Nash's Theorem and Von Neumann's Minimax Theorem

EECS 598-005: Theoretical Foundations of Machine Learning — Fall 2015
Lecture 18: Nash's Theorem and Von Neumann's Minimax Theorem
Lecturer: Jacob Abernethy — Scribes: Lili Su — Editors: Weiqing Yu and Andrew Melfi

18.1 Review: On-line Learning with Experts (Actions)

Setting. Given $n$ experts (actions), the general on-line setting consists of $T$ rounds. For each round $t = 1, \dots, T$:

• The algorithm plays the distribution $p^t = \frac{\omega^t}{\|\omega^t\|_1} \in \Delta_n$.
• The $i$-th expert (action) suffers the loss $\ell_i^t \in [0, 1]$.
• The algorithm suffers the loss $p^t \cdot \ell^t$.

Theorem 18.1 (Regret Bound for EWA).
$$\frac{1}{T} \sum_{t=1}^{T} p^t \cdot \ell^t \;\le\; \min_i \frac{1}{T} \sum_{t=1}^{T} \ell_i^t + O\!\left(\sqrt{\frac{\log n}{T}}\right) \;=\; \min_{p \in \Delta_n} \frac{1}{T} \sum_{t=1}^{T} p \cdot \ell^t + O\!\left(\sqrt{\frac{\log n}{T}}\right)$$

Note. The equality holds because a linear function over the simplex is minimized at a vertex: the distribution $p = e_i$ means the algorithm puts all of its mass on the $i$-th action.

18.2 Two Player Game

Definition 18.2 (Two Player Game). A two player game is defined by a pair of matrices $M, N \in [0,1]^{n \times m}$.

Definition 18.3 (Pure Strategy). With a pure strategy in a two player game, P1 chooses an action $i \in [n]$, and P2 chooses an action $j \in [m]$. P1 thus earns $M_{ij}$, and P2 earns $N_{ij}$.

Definition 18.4 (Mixed Strategy). With a mixed strategy in a two player game, P1 plays with a distribution $p \in \Delta_n$, and P2 plays with a distribution $q \in \Delta_m$. P1 thus earns $p^\top M q = \sum_{i,j} p_i q_j M_{ij}$, and P2 earns $p^\top N q = \sum_{i,j} p_i q_j N_{ij}$.

Definition 18.5 (Zero-sum Game). A zero-sum game is a two player game in which the matrices satisfy $M = -N$.

18.3 Nash's Theorem

Definition 18.6 (Nash Equilibrium). In a two player game, a Nash equilibrium (NEq) is a pair of distributions $\tilde{p} \in \Delta_n$ for P1 and $\tilde{q} \in \Delta_m$ for P2 satisfying

• for all $p \in \Delta_n$: $\tilde{p}^\top M \tilde{q} \ge p^\top M \tilde{q}$,
• for all $q \in \Delta_m$: $\tilde{p}^\top N \tilde{q} \ge \tilde{p}^\top N q$.

Theorem 18.7 (Nash's Theorem). Every two player game has a Nash equilibrium. (Not all games have pure-strategy equilibria.)

Lemma 18.8 (Brouwer's Fixed-point Theorem). Let $B \subseteq \mathbb{R}^d$ be a compact convex set, and let $f : B \to B$ be continuous.
Then there exists $x \in B$ such that $x = f(x)$.

Proof Sketch of Nash's Theorem.

1. Let $c_i(p, q) = \max\{0,\; e_i^\top M q - p^\top M q\}$ for $i \in [n]$, and $d_j(p, q) = \max\{0,\; p^\top N e_j - p^\top N q\}$ for $j \in [m]$ — the gain each player would obtain by shifting mass to a single action.

2. Define a map $f : (p, q) \mapsto (p', q')$ by
$$p_i' = \frac{p_i + c_i(p, q)}{1 + \sum_{i' \in [n]} c_{i'}(p, q)}, \qquad q_j' = \frac{q_j + d_j(p, q)}{1 + \sum_{j' \in [m]} d_{j'}(p, q)}.$$

3. By Brouwer's fixed-point theorem, there exists a fixed point $(\tilde{p}, \tilde{q})$ with $f(\tilde{p}, \tilde{q}) = (\tilde{p}, \tilde{q})$.

4. Show that the fixed point $(\tilde{p}, \tilde{q})$ is a Nash equilibrium.

18.4 Von Neumann's Minimax Theorem

Theorem 18.9 (Von Neumann's Minimax Theorem).
$$\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q = \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q.$$

Proof by Nash's Theorem.

• Exercise.

Proof by the Exponential Weighted Average Algorithm.

a) The "≥" direction is straightforward. Let $p_1 \in \Delta_n$, $q_1 \in \Delta_m$ attain $\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q = p_1^\top M q_1$, and let $p_2 \in \Delta_n$, $q_2 \in \Delta_m$ attain $\max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q = p_2^\top M q_2$. Then
$$\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q = p_1^\top M q_1 \ge p_1^\top M q_2 \ge p_2^\top M q_2 = \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q.$$
An intuitive explanation of the first inequality: in $\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q$, $q_1$ is chosen to maximize $p_1^\top M q$, so $p_1^\top M q_1 \ge p_1^\top M q$ for any $q \ne q_1$ — in particular for $q = q_2$. The second inequality holds by the symmetric argument, since $p_2$ minimizes $p^\top M q_2$.

b) We show the "≤" direction holds up to an $O(1/\sqrt{T})$ approximation.

Setting. Imagine playing a $T$-round game against a really hard adversary. For each round $t = 1, \dots, T$:

• Player 1 plays with the distribution $p^t = \frac{\omega^t}{\|\omega^t\|_1} \in \Delta_n$.
• Player 2 plays with the distribution $q^t = \arg\max_{q \in \Delta_m} (p^t)^\top M q$.
• Let $\ell^t = M q^t$, so Player 1 suffers the loss $p^t \cdot \ell^t = (p^t)^\top M q^t$.
• Let $\omega^1 = (1, \dots, 1)$, and update $\omega_i^{t+1} = \omega_i^t \exp(-\eta \ell_i^t)$.

Trick. Analyze $\frac{1}{T} \sum_{t=1}^{T} (p^t)^\top M q^t$.

1. By Jensen's inequality (convexity of the max),
$$\frac{1}{T} \sum_{t=1}^{T} (p^t)^\top M q^t = \frac{1}{T} \sum_{t=1}^{T} \max_{q \in \Delta_m} (p^t)^\top M q \;\ge\; \max_{q \in \Delta_m} \left(\frac{1}{T} \sum_{t=1}^{T} p^t\right)^{\!\top} M q \;\ge\; \min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q.$$
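The self-play process described in the setting above is easy to simulate. The sketch below (a minimal illustration, not from the notes: numpy is assumed, the matrix `M`, horizon `T`, and the standard step size $\eta = \sqrt{\log n / T}$ are illustrative choices) runs EWA for Player 1 against a best-responding Player 2 and returns the time-averaged strategies, whose value approaches the value of the game at rate $O(\sqrt{\log n / T})$:

```python
import numpy as np

def ewa_minimax(M, T=2000):
    """Run EWA for Player 1 against a best-responding Player 2
    and return the averaged strategies and their game value."""
    n, m = M.shape
    eta = np.sqrt(np.log(n) / T)      # standard EWA tuning
    w = np.ones(n)                    # omega^1 = (1, ..., 1)
    p_avg, q_avg = np.zeros(n), np.zeros(m)
    for _ in range(T):
        p = w / w.sum()               # p^t = omega^t / ||omega^t||_1
        j = int(np.argmax(p @ M))     # best response: q^t = e_j
        loss = M[:, j]                # ell^t = M q^t
        w = w * np.exp(-eta * loss)   # multiplicative weights update
        p_avg += p / T
        q_avg[j] += 1.0 / T
    return p_avg, q_avg, float(p_avg @ M @ q_avg)

# Matching pennies (value 1/2, unique mixed equilibrium (1/2, 1/2)):
M = np.array([[1.0, 0.0], [0.0, 1.0]])
p, q, v = ewa_minimax(M)
```

On this matrix the averaged strategies concentrate near the uniform equilibrium and `v` approaches 1/2, matching the $\epsilon_T$-approximate equilibrium guarantee derived next.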
2. By the exponential weighted average algorithm (Theorem 18.1),
$$\begin{aligned}
\frac{1}{T} \sum_{t=1}^{T} (p^t)^\top M q^t &= \frac{1}{T} \sum_{t=1}^{T} p^t \cdot \ell^t \\
&\le \min_i \frac{1}{T} \sum_{t=1}^{T} \ell_i^t + \underbrace{\frac{\mathrm{Regret}_T}{T}}_{=:\,\epsilon_T} \\
&= \min_{p \in \Delta_n} \frac{1}{T} \sum_{t=1}^{T} p \cdot \ell^t + \epsilon_T \\
&= \min_{p \in \Delta_n} \frac{1}{T} \sum_{t=1}^{T} p^\top M q^t + \epsilon_T \\
&= \min_{p \in \Delta_n} p^\top M \left(\frac{1}{T} \sum_{t=1}^{T} q^t\right) + \epsilon_T \\
&\le \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q + \epsilon_T.
\end{aligned}$$

Putting the results of steps 1 and 2 together, we have
$$\min_{p \in \Delta_n} \max_{q \in \Delta_m} p^\top M q \;\le\; \max_{q \in \Delta_m} \min_{p \in \Delta_n} p^\top M q + \epsilon_T.$$
Since $T$ can be chosen as large as we want, $\epsilon_T = O(1/\sqrt{T})$ vanishes, which completes the proof of the "≤" direction.

Theorem 18.10 (Generalization of Von Neumann's Minimax Theorem). Let $X \subseteq \mathbb{R}^n$ and $Y \subseteq \mathbb{R}^m$ be compact convex sets, and let $f : X \times Y \to \mathbb{R}$ be a differentiable function with bounded gradients such that $f(\cdot, y)$ is convex in its first argument for every fixed $y$, and $f(x, \cdot)$ is concave in its second argument for every fixed $x$. Then
$$\inf_{x \in X} \sup_{y \in Y} f(x, y) = \sup_{y \in Y} \inf_{x \in X} f(x, y).$$
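Theorem 18.9 can also be verified exactly on small games: the quantity $\min_{p} \max_{q} p^\top M q$ is the optimum of a linear program, a classical equivalence not covered in these notes. The sketch below assumes scipy is available; the game matrix is the illustrative matching-pennies example, whose value is $1/2$ with unique optimal $p = (1/2, 1/2)$.

```python
import numpy as np
from scipy.optimize import linprog

def game_value(M):
    """Exact value of min_p max_q p^T M q (Player 1 minimizes loss M)
    via the standard LP: minimize v s.t. M^T p <= v*1, p in the simplex."""
    n, m = M.shape
    c = np.zeros(n + 1)
    c[-1] = 1.0                                   # variables (p_1..p_n, v); minimize v
    A_ub = np.hstack([M.T, -np.ones((m, 1))])     # p^T M e_j - v <= 0 for all j
    b_ub = np.zeros(m)
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])  # sum_i p_i = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return res.x[:n], float(res.fun)

p_star, v = game_value(np.array([[1.0, 0.0], [0.0, 1.0]]))
```

By the theorem, the same LP value would be obtained from the dual program for $\max_{q} \min_{p} p^\top M q$; comparing the two numerically is a quick sanity check of the minimax identity.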
