Stochastic Games

Stochastic games Antonín Kuceraˇ Preliminaries Games Strategies, plays Objectives Reachability objectives Stochastic Games The value Min strategies (in Formal Verification) Max strategies Determinacy Finite-state games BPA games Branching-time objectives Basic properties Deciding the winner Antonín Kuceraˇ Games with time Masaryk University Brno SFM-10:QAPL 2010 1/56 Stochastic games Game theory Antonín Kuceraˇ Preliminaries Games Strategies, plays Game theory studies the behavior of rational “players” who can Objectives make choice and attempt to achieve a certain objective.A Reachability objectives player’s success depends on the choices of the other players. The value Min strategies Max strategies stochastic games: Determinacy Finite-state games BPA games the impact of players’ choices in uncertain; Branching-time objectives the players’ choice can be randomized. Basic properties Deciding the winner Games with games in computer science: time formal semantics; communication protocols; Internet auctions; . many other things. SFM-10:QAPL 2010 2/56 Stochastic games Stochastic games in formal verification Antonín Kuceraˇ Preliminaries Games Strategies, plays Objectives Reachability objectives Our setting: The value Min strategies Max strategies state space: discrete Determinacy Finite-state games players: controller, environment BPA games Branching-time objectives: antagonistic objectives Basic properties Deciding the winner choice: turn-based, randomized Games with time information: perfect Is there a strategy for the controller such that the system satisfies a certain property no matter what the environment does? SFM-10:QAPL 2010 3/56 Stochastic games Outline Antonín Kuceraˇ Preliminaries Games Strategies, plays Objectives Reachability objectives The value Preliminaries. Min strategies Max strategies Games, strategies, objectives. Determinacy Finite-state games BPA games Stochastic games with reachability objectives. Branching-time objectives Basic properties The (non)existence of optimal strategies. Deciding the winner Games with Algorithms for finite-state games. time Stochastic games with branching-time objectives. Stochastic games with time. SFM-10:QAPL 2010 4/56 We want to measure the probability of certain subsets of Run(s). For every finite path w initiated in s, we define the probability of Run(w) in the natural way. This assignment can be uniquely extended to the (Borel) σ-algebra generated by all Run(w). F Thus, we obtain the probability space (Run(s); ; ). F P Stochastic games Markov chains Antonín Kuceraˇ Preliminaries Games Definition 1 (Markov chain) Strategies, plays Objectives 1 Reachability 4 objectives 1 s t 1 = (S; ; Prob) The value 2 3 M ! Min strategies 1 3 S is at most countable set of states; Max strategies 1 1 Determinacy 4 3 Finite-state games S S is a transition relation; BPA games !⊆ × u Branching-time Prob is a probability assignment. objectives Basic properties 1 Deciding the winner Games with time SFM-10:QAPL 2010 5/56 Stochastic games Markov chains Antonín Kuceraˇ Preliminaries Games Definition 1 (Markov chain) Strategies, plays Objectives 1 Reachability 4 objectives 1 s t 1 = (S; ; Prob) The value 2 3 M ! Min strategies 1 3 S is at most countable set of states; Max strategies 1 1 Determinacy 4 3 Finite-state games S S is a transition relation; BPA games !⊆ × u Branching-time Prob is a probability assignment. objectives Basic properties 1 Deciding the winner Games with time We want to measure the probability of certain subsets of Run(s). For every finite path w initiated in s, we define the probability of Run(w) in the natural way. This assignment can be uniquely extended to the (Borel) σ-algebra generated by all Run(w). F Thus, we obtain the probability space (Run(s); ; ). F P SFM-10:QAPL 2010 5/56 Stochastic games Turn-based stochastic games Antonín Kuceraˇ Preliminaries Games Strategies, plays Objectives Reachability objectives The value Definition 2 (Turn-based stochastic game) Min strategies Max strategies Determinacy G = (V; E; (V; V^; V ); Prob) Finite-state games BPA games 0:2 the setV is at most countable; Branching-time objectives each vertex has a successor; Basic properties 0:8 Deciding the winner Prob is positive; Games with time G is a Markov decision process (MDP) ifV ^ = orV = . 0:4 0:6 ; ; SFM-10:QAPL 2010 6/56 Stochastic games Strategies Antonín Kuceraˇ Preliminaries Games Strategies, plays Objectives Definition 3 (Strategy) Reachability LetG = (V; E; (V; V^; V ); Prob) be a game. A strategy for objectives The value player is a function σ which to every wv V ∗V assigns a Min strategies 2 Max strategies probability distribution over the set of outgoing edges ofv. Determinacy Finite-state games BPA games Branching-time A strategy for player ^ is defined analogously. objectives Basic properties Deciding the winner We can classify strategies according to Games with time memory requirements: history-dependent (H), finite-memory (F), memoryless (M) randomization: randomized (R), deterministic (D) Thus, we obtain the classes ofMD,MR,FD,FR,HD, andHR strategies. SFM-10:QAPL 2010 7/56 Stochastic games Plays Antonín Kuceraˇ Preliminaries Games Strategies, plays Objectives Reachability objectives Definition 4 (Play) The value Min strategies Max strategies LetG = (V; E; (V; V^; V ); Prob) be a game. Each pair (σ; π) of Determinacy Finite-state games strategies for player and player ^ determines a unique play BPA games G(σ,π), which is a Markov chain whereV + is the set of states and Branching-time objectives transitions are defined accordingly. Basic properties Deciding the winner Games with time Plays are infinite trees. For a pair of memoryless strategies (σ; π), the play G(σ,π) can be depicted as a Markov chain with the set of states V. SFM-10:QAPL 2010 8/56 Is there a strategy σ such that v = >0(v) in Gσ ? j G Is there a strategy σ such that v = >0(v >0u) in Gσ ? j G ^ F Obviously, there is no suchMR (or evenFR) strategy. wv wv 1=2j j 1 1=2j j Let σ(wv) = v u; v − v −−−−! −−−−−−! 1=2 3=4 7=8 15=16 v vv vvv vvvv 1=2 1=4 1=8 1=16 vu vvu vvvu vvvvu 1 1 1 1 Stochastic games Plays (2) Antonín Kuceraˇ Preliminaries Games Example 5 (A game and its play) Strategies, plays Objectives Reachability v u 1 objectives The value Min strategies Max strategies Determinacy Finite-state games BPA games Branching-time objectives Basic properties Deciding the winner Games with time SFM-10:QAPL 2010 9/56 Is there a strategy σ such that v = >0(v >0u) in Gσ ? j G ^ F Obviously, there is no suchMR (or evenFR) strategy. wv wv 1=2j j 1 1=2j j Let σ(wv) = v u; v − v −−−−! −−−−−−! 1=2 3=4 7=8 15=16 v vv vvv vvvv 1=2 1=4 1=8 1=16 vu vvu vvvu vvvvu 1 1 1 1 Stochastic games Plays (2) Antonín Kuceraˇ Preliminaries Games Example 5 (A game and its play) Strategies, plays Objectives Reachability v u 1 objectives The value Min strategies Max strategies >0 σ Determinacy Is there a strategy σ such that v = (v) in G ? Finite-state games j G BPA games Branching-time objectives Basic properties Deciding the winner Games with time SFM-10:QAPL 2010 9/56 Obviously, there is no suchMR (or evenFR) strategy. wv wv 1=2j j 1 1=2j j Let σ(wv) = v u; v − v −−−−! −−−−−−! 1=2 3=4 7=8 15=16 v vv vvv vvvv 1=2 1=4 1=8 1=16 vu vvu vvvu vvvvu 1 1 1 1 Stochastic games Plays (2) Antonín Kuceraˇ Preliminaries Games Example 5 (A game and its play) Strategies, plays Objectives Reachability v u 1 objectives The value Min strategies Max strategies >0 σ Determinacy Is there a strategy σ such that v = (v) in G ? Finite-state games j G BPA games >0 >0 σ Branching-time Is there a strategy σ such that v = (v u) in G ? objectives j G ^ F Basic properties Deciding the winner Games with time SFM-10:QAPL 2010 9/56 wv wv 1=2j j 1 1=2j j Let σ(wv) = v u; v − v −−−−! −−−−−−! 1=2 3=4 7=8 15=16 v vv vvv vvvv 1=2 1=4 1=8 1=16 vu vvu vvvu vvvvu 1 1 1 1 Stochastic games Plays (2) Antonín Kuceraˇ Preliminaries Games Example 5 (A game and its play) Strategies, plays Objectives Reachability v u 1 objectives The value Min strategies Max strategies >0 σ Determinacy Is there a strategy σ such that v = (v) in G ? Finite-state games j G BPA games >0 >0 σ Branching-time Is there a strategy σ such that v = (v u) in G ? objectives j G ^ F Basic properties Obviously, there is no suchMR (or evenFR) strategy. Deciding the winner Games with time SFM-10:QAPL 2010 9/56 Stochastic games Plays (2) Antonín Kuceraˇ Preliminaries Games Example 5 (A game and its play) Strategies, plays Objectives Reachability v u 1 objectives The value Min strategies Max strategies >0 σ Determinacy Is there a strategy σ such that v = (v) in G ? Finite-state games j G BPA games >0 >0 σ Branching-time Is there a strategy σ such that v = (v u) in G ? objectives j G ^ F Basic properties Obviously, there is no suchMR (or evenFR) strategy. Deciding the winner wv wv Games with 1=2j j 1 1=2j j time Let σ(wv) = v u; v − v −−−−! −−−−−−! 1=2 3=4 7=8 15=16 v vv vvv vvvv 1=2 1=4 1=8 1=16 vu vvu vvvu vvvvu 1 1 1 1 SFM-10:QAPL 2010 9/56 Stochastic games A taxonomy of objectives Antonín Kuceraˇ Preliminaries Games Each play of a game G is assigned a (numerical) yield.

Stochastic Games

Stochastic Game Theory Applications for Power Management in Cognitive Networks

Deterministic and Stochastic Prisoner's Dilemma Games: Experiments in Interdependent Security

Stochastic Games with Hidden States∗

570: Minimax Sample Complexity for Turn-Based Stochastic Game

Discovery and Equilibrium in Games with Unawareness∗

Efficient Exploration of Zero-Sum Stochastic Games

Modeling Replicator Dynamics in Stochastic Games Using Markov Chain Method

Nash Q-Learning for General-Sum Stochastic Games

Bayesian Analysis of Deterministic and Stochastic Prisoner's Dilemma Games

Stochastic Constraint Programming for General Game Playing with Imperfect Information

Perfect Conditional E-Equilibria of Multi-Stage Games with Infinite Sets

Monte-Carlo Tree Reductions for Stochastic Games Nicolas Jouandeau, Tristan Cazenave