Stochastic Games and Their Complexities Public Defense

Stochastic games and their complexities public defense Marcin Przyby lko University of New Caledonia University of Warsaw 03.04.2019 Supervisors: Teodor Knapik & Damian Niwiński Auxiliary supervisor: Micha l Skrzypczak 0 / 22 Stochastic games Games with probabilistic transitions backgammon board 1 / 22 Stochastic games Games with probabilistic transitions ... ... ... ... ... ... graph of configurations 1 / 22 Formal model Games on graphs I move a single token along the lines of a graph e m 0:2 e 0:5 0:8 n 0:5 0:8 a m a 0:2 I decide the outcome based on the history of the token a word over the labels of the vertices 2 / 22 Stochastic branching games I study I two-player zero-sum games e.g. chess I of infinite duration e.g. parity games I with randomness e.g. simple stochastic games I and a form of concurrency. i.e. stochastic branching games by Mio Strong model: subsume many turn-based games extension of Gale-Stewart games and some concurrent games. can encode Blackwell games 3 / 22 Branching game Game G = hB; Φi is played by two players (Eve and Adam) and consists of I a board B the graph where we move the tokens I and an objective Φ: plays(B) ! [0; 1]. that can be seen as a rulebook ... 4 / 22 Branching games, an example Board { (unfolding of a) binary graph, consisting of Adam's, Eve's, Nature's and branching vertices. c n e1 0:8 0:2 e2 a f1 f2 The black token marks the initial vertex. 5 / 22 Branching play Board Play c c n e1 0:8 0:2 e2 a f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e2 a f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 a f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a a f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a a a f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a a a f1 f2 f1 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a a a f1 f2 f1 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a a a f1 f2 f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a a a f1 f2 f1 f2 preplay { a labelled tree with tokens 6 / 22 Branching play Board Play c c n e1 n e1 0:8 0:2 e1 e2 e2 a a a f1 f2 f1 f2 play { an infinite labelled tree 6 / 22 Games and measure Fix a game G and a profile of strategies hσ; πi; σ; π : fL; Rg∗ ! fL; Rg σ,π the outcome generates a measure µ defined on a set TΓ; TΓ { a set of trees labelled with some finite setΓ. the objective is a functionΦ: TΓ ! [0; 1] Φ is called the pay-off function Eve wants to maximise the pay-off, Adam to minimise it. the value val (σ; π) of a profile is the expected pay-off EΦ wrt. the measure µσ,π 7 / 22 Two questions Problem (determinacy) Given a game G, is the game determined? pure and mixed determinacy Problem (game value) Given a game G, what is the value of the game G? six possible values 8 / 22 Game values and determinacy Let G be a branching game, I Eve's value: XE def val = supσ2ΣXE infπ2ΣA val (σ; π) I Adam's value: XA def val = infπ2ΣXA supσ2ΣE val (σ; π) X ranges over pure ("), behavioural (B), and mixed (M) strategies Strategies ∗ I pure σ : fL; Rg ! fL; Rg, ∗ I mixed σm 2 Dist(σ : fL; Rg ! fL; Rg). E BE ME ΣB ( ΣB ( ΣB A BA MA ΣB ( ΣB ( ΣB 9 / 22 Determinacy Holds valA ≥ valBA ≥ valMA ≥ valME ≥ valBE ≥ valE Game G is determined I under pure strategies if valA = valE , I under behavioural strategies if valBA = valBE , I under mixed strategies if valMA = valME . 10 / 22 valA valBA valMA valME valBE valE Game values: 3 1 1 1 1 4 2 2 4 0 Double matching pennies c c c x1 x2 x3 x4 0 1 f def L = t 2 plays(B) j (x3(t)=x4(t)) =) (x1(t)=x2(t)=x3(t)=x4(t)) example by Mio 11 / 22 Double matching pennies c c c x1 x2 x3 x4 0 1 f def L = t 2 plays(B) j (x3(t)=x4(t)) =) (x1(t)=x2(t)=x3(t)=x4(t)) example by Mio valA valBA valMA valME valBE valE Game values: 3 1 1 1 1 4 2 2 4 0 11 / 22 Regular branching games Fix a game G and a set of trees L ifΦ: TΓ ! [0; 1] of form 1 if t 2 L Φ(t) = 0 otherwise then L is called the winning set of G. L regular =) G regular branching game. regular means recoginsable by a finite automaton on trees 12 / 22 Regular pure branching games 12 / 22 Regular pure two-player games no Nature's vertices Main results: I games may not be determined, not even under mixed strategies; we provide an example I pure values are computable; winning strategies are enough I the determinacy under pure strategies can be decided. we can compute the values 13 / 22 Regular pure branching games Theorem (indeterminacy) There exists a regular finite branching pure game G such that valA valBA valMA valME valBE valE 1 1 1 0 0 0 the winning set is a difference of two open sets Theorem (computing the value) Let G be a regular finite pure branching game. Then the set of winning strategies SE (resp., SA) of Eve (resp., Adam) is regular and effectively computable. E A val = [SE 6= ;]; val = [SA 6= ;] 14 / 22 Regular stochastic branching games 14 / 22 Regular stochastic branching games Main results: I determined under mixed strategies, if the winning set is topologically closed; recall: difference of two closed sets can be indeterminate I the values are not computable, not even for one-player games; nor even mixed values of two-player pure games 15 / 22 Regular stochastic two-player games Theorem (determinacy) Let G be a regular stochastic branching game with a closed winning set. Then G is determined under mixed strategies. pay-off function is semicontinous =) apply Glicksberg's minimax theorem Theorem (computing the value) There is no algorithm that given a regular finitary branching game G computes the value of the game. reduction from probabilistic automata 16 / 22 Measures 16 / 22 Regular zero-player stochastic games Fix strategies σ; π and a board B, then µσ,π : 2TΓ * [0; 1] defined as σ,π µ (L) = valhB;Li(σ; π) is a Borel measure. We focus on the uniform measure on infinite binary trees µ∗ for any position u 2 fL; Rg∗ and any letter a 2 Γ ∗ −1 µ (ft 2 TΓ j t(u) = ag) = jΓj 17 / 22 Measures regular means recoginsable by a finite automaton Main result: the uniform measure of a regular set of trees L is computable, if L is defined by I a first-order formula with no descendant, I or a conjunctive query. both results utilise a form of locality 18 / 22 Measures Theorem (FO case) If ' is a first-order formula not using descendant, then µ∗(L(')) is rational and computable in three-fold exponential space. Theorem (CQs case) If ' is a Boolean combination of conjunctive queries, then µ∗(L(')) is rational and computable in exponential space. we reduce L(') to a clopen set having the same measure 19 / 22 Plantation game 19 / 22 Plantation game Goals: I practical part of the thesis; I presentation of potential use of discrete stochastic games; I beachhead for the future, more involved, cooperation. Problem to solve: I given a history of a fruit plantation a time series of measurements I produce a tool to predict and control the level of infestation. solution: stochastic games 20 / 22 Approach Utilise a standard framework: 1. convert raw data into series with finite number of values clustering 2. create a model simulating the changes partially observable Markov decision process 3. create a game-based reactive model a Gale-Stewart-like game with hidden states Result: simple and procedural framework creating reactive models for given time series of observations 21 / 22 Conclusions Determinacy: - mixed determinacy, if the winning set is closed; - mixed determinacy fails at the second level of Borel hierarchy; Values: - pure values of pure regular branching games are computable; - values of stochastic regular branching games are not. Measures: - uniform measure can be computed for some restricted first-order formulae. Applications: - a description of a roboust framework and a proposition of a possible implementation.

Stochastic Games and Their Complexities Public Defense

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support