Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16)

Epistemic GDL: A Logic for Representing and Reasoning about Imperfect Information Games

Guifei Jiang,1,2 Dongmo Zhang,1 Laurent Perrussel,2 and Heng Zhang3
1AIRG, Western Sydney University, Penrith, Australia
2IRIT, Université Toulouse Capitole, Toulouse, France
3School of Computer Sci. & Tech., Huazhong University of Sci. & Tech., Wuhan, China

Abstract

This paper proposes a logical framework for representing and reasoning about imperfect information games. We first extend the game description language (GDL) with the standard epistemic operators and provide it with a semantics based on the epistemic state transition model. We then demonstrate how to use the language to represent the rules of an imperfect information game and formalize its epistemic properties. We also show how to use the framework to reason about a player's own knowledge as well as other players' knowledge during game playing. Finally, we prove that the model-checking problem of the framework is in $\Delta_2^p$, which is the lowest among the existing similar frameworks, even though its lower bound is $\Theta_2^p$. These results indicate that the framework strikes a good balance between expressive power and computational efficiency.

1 Introduction

General Game Playing (GGP) is concerned with creating intelligent agents that understand the rules of arbitrary new games and learn to play these games without human intervention [Genesereth et al., 2005]. Representing and reasoning about games is a core technique in GGP. A formal game description language (GDL) has been the official language for GGP since 2005. GDL is defined as a high-level, machine-processable language for representing the rules of arbitrary games [Love et al., 2006]. Originally designed for perfect information games, GDL has recently been extended to GDL-II so as to incorporate imperfect information games, such as Poker and Backgammon [Thielscher, 2010].

Playing games with imperfect information poses an intricate reasoning challenge for players, since imperfect information requires a player to use the rules of a game to infer legal actions and to draw conclusions from her own knowledge about the current game state and about the knowledge of other agents. However, as a purely descriptive language, GDL-II is only a tool for describing the rules of an imperfect information game; it does not provide a facility for reasoning about how a player infers unveiled information based on the rules [Schiffel and Thielscher, 2011; 2014]. Indeed, some information is essential for players to proceed with a game. For example, players should always know their own available actions in non-terminal states and know their results in terminal states. Such epistemic properties of a game are normally implied by the game rules and thus need reasoning facilities to infer and verify them. Unfortunately, GDL-II (or GDL) is not designed for this purpose. To handle this issue, a few approaches have been proposed, mostly embedding GDL-II into a logical system, such as Situation Calculus or Alternating-time Temporal Epistemic Logic (ATEL), to use their reasoning facilities [Schiffel and Thielscher, 2011; Ruan and Thielscher, 2011; 2012; Huang et al., 2013]. As long as the target logics are expressive enough to interpret any GDL description, it is possible to use the inference mechanisms of these logics for reasoning about GDL-II games. However, a highly expressive logic may incur high complexity for reasoning tasks. For instance, Ruan and Thielscher [2012] propose an adaptation of ATEL to verify epistemic properties of GDL-II games, and show that the model-checking problem in that setting is 2EXPTIME-hard. Such high computational complexity may not be what we want.

This paper proposes a different approach to this problem. We introduce a logical framework, called EGDL, equipped with a language for representing imperfect information games and a semantical model that can be used for reasoning about game information and players' epistemic status. More importantly, we develop a model-checking algorithm for EGDL and show that the complexity of the model-checking problem for the logic can be significantly reduced to $\Delta_2^p$. Two major reasons help us to reduce the complexity. Firstly, our language is a conservative extension of GDL with the standard epistemic operators [Fagin et al., 2003]. We do this cautiously, without introducing the until operator or coalition operators. Secondly, we provide an imperfect recall semantics for knowledge. Other cases could be considered; nevertheless, the addition of perfect recall to GDL-II renders the model-checking problem of ATEL undecidable in general [Ruan and Thielscher, 2012]. Also, in many applications, especially when modeling extremely large games, imperfect recall may provide considerable empirical and practical advantages [Piccione and Rubinstein, 1997; Waugh et al., 2009; Busard et al., 2015]. Despite its moderate expressive power, we demonstrate with a running example that our language suffices for expressing game rules, formalizing essential epistemic properties and specifying the interactions of knowledge and actions. In this sense, EGDL strikes a good balance between expressive power and computational efficiency.

The rest of this paper is structured as follows: Section 2 establishes the syntax and the semantics of EGDL. Section 3 demonstrates its expressiveness and reasoning facility. Section 4 investigates the model-checking problem of EGDL. Finally, we conclude with a discussion of related work and future work.

2 The Framework

All games we consider in this paper are assumed to be played in multi-agent environments. A game signature $\mathcal{S}$ is a triple $(N, \mathcal{A}, \Phi)$, where
• $N = \{1, 2, \dots, k\}$ is a non-empty finite set of agents,
• $\mathcal{A} = \bigcup_{i \in N} A^i$, where $A^i$ consists of a non-empty finite set of actions for agent $i$ s.t. $A^i \cap A^j = \emptyset$ if $i \neq j$, and
• $\Phi = \{p, q, \dots\}$ is a finite set of propositional atoms for specifying individual features of a game state.

Hereafter we fix a game signature $\mathcal{S}$, and all concepts are based on this signature unless otherwise specified.

2.1 Epistemic State Transition Models

We begin by introducing the semantical structures used to model synchronous games with imperfect information. By synchronous we mean that all players move simultaneously. In particular, turn-based asynchronous games are modelled by only allowing the "noop" action for a player when it is not her turn.

Definition 1. An epistemic state transition (ET) model $M$ is a tuple $(W, \bar{w}, T, \{R_i\}_{i \in N}, L, U, g, V)$, where
• $W$ is a non-empty set of possible states.
• $\bar{w} \in W$, representing the initial state.
• $T \subseteq W$, representing the set of terminal states.
• $R_i \subseteq W \times W$ is an equivalence relation for agent $i$, indicating the states that are indistinguishable for $i$.
• $L \subseteq W \times \mathcal{A}$ is a legality relation, describing the legal actions at each state.
• $U : W \times D \to W$ is an update function, specifying state transitions, where $D$ is the set of joint actions $\prod_{i \in N} A^i$.
• $g : N \to 2^W$ is a goal function, specifying the winning states for each agent.
• $V : W \to 2^\Phi$ is a standard valuation function.

For $d \in D$, let $d(i)$ denote the $i$-th component of $d$, that is, agent $i$'s action in the joint action $d$. For convenience, let $L(w) = \{a \in \mathcal{A} \mid (w, a) \in L\}$ denote the set of all legal actions at state $w$. We now define the set of all possible ways in which a game can develop.

Definition 2. Let $M = (W, \bar{w}, T, \{R_i\}_{i \in N}, L, U, g, V)$ be an ET-model. A path $\delta$ is a sequence of states and joint actions $\bar{w} \xrightarrow{d_1} w_1 \xrightarrow{d_2} \cdots \xrightarrow{d_e} w_e$ such that for all $1 \le j \le e$ and any $i \in N$,
1. $d_j(i) \in L(w_{j-1})$ (that is, any action that is taken must be legal),
2. $w_j = U(w_{j-1}, d_j)$ (state transition), and
3. $\{\bar{w}, \dots, w_{e-1}\} \cap T = \emptyset$ (that is, only the last state may be terminal).

A path $\delta$ is complete if $w_e \in T$. Let $\mathcal{P}(M)$ denote the set of all complete paths in $M$. When $M$ is fixed, we simply write $\mathcal{P}$. Given $\delta \in \mathcal{P}$, the states on $\delta$ are called reachable states. Let $\delta[j]$ denote the $j$-th reachable state of $\delta$ and $\theta_i(\delta, j)$ denote the action of agent $i$ taken at stage $j$ of $\delta$. The length of $\delta$, written $|\delta|$, is defined as the number of actions.

The following definition, by extending equivalence relations over states to complete paths, characterizes precisely what an agent with imperfect recall and perfect reasoning can in principle know at a specific stage of a game.

Definition 3. Two complete paths $\delta, \delta' \in \mathcal{P}$ are imperfect recall (also called memoryless) equivalent for agent $i$ at stage $j \in \mathbb{N}$, written $\delta \approx_i^j \delta'$, iff $\delta[j] R_i \delta'[j]$.

That is, imperfect recall requires an agent to be aware only of the present state and to forget everything that happened before. This is similar to the notion of imperfect recall in ATL [Schobbens, 2004]. It should be noted that the paper focuses on imperfect recall; nevertheless, the framework is flexible enough to define ATL state-based perfect recall [Jamroga and van der Hoek, 2004] and GDL-II perfect recall [Thielscher, 2010].

To illustrate our framework, we use as our running example a variant of Tic-Tac-Toe, called Krieg-Tictactoe in [Schiffel and Thielscher, 2011].

Example. Krieg-Tictactoe is played by two players, cross x and nought o, who take turns marking grids in a $3 \times 3$ board. Unlike standard Tic-Tac-Toe, each player can see her own marks, but not those of her opponent, just like the chess variant Kriegspiel [Pritchard, 1994]. Players are informed of the turn-taking. The game ends if the board is completely filled or one player wins by having completed a horizontal, vertical or diagonal line of three with her own symbol.

To represent Krieg-Tictactoe in terms of the ET-model, we first describe the game signature, written $\mathcal{S}_{KT}$, as follows: $N_{KT} = \{x, o\}$, $A^i_{KT} = \{a^i_{j,k} \mid 1 \le j,k \le 3\} \cup \{noop^i\}$, and $\Phi_{KT} = \{p^i_{j,k}, tried(a^i_{j,k}), turn(i) \mid i \in \{x, o\} \text{ and } 1 \le j,k \le 3\}$, where $a^i_{j,k}$ denotes the action that player $i$ fills grid $(j,k)$ with her symbol, $noop^i$ denotes that player $i$ does action noop, $p^i_{j,k}$ represents the fact that grid $(j,k)$ is filled with player $i$'s symbol, $tried(a^i_{j,k})$ represents the fact that player $i$ has tried to fill grid $(j,k)$ before, and $turn(i)$ says that it is player $i$'s turn now.

We next specify the ET-model for this game, written $M_{KT}$, as follows. Let $W_{KT}$ be the set of possible states, consisting of all tuples $(t_x, t_o, x_{1,1}, \dots, x_{3,3})$ with $t_x, t_o \in \{0, 1\}$, where $t_x, t_o$ specify the turn-taking and each $x_{j,k}$ takes one of five values, recording that grid $(j,k)$ is occupied by the cross and not tried by the nought, occupied by the nought and not tried by the cross, occupied by the nought and tried by the cross, occupied by the cross and tried by the nought, or empty. The initial state $\bar{w}$ is the state with $t_x = 1$, $t_o = 0$ and every grid empty.
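As a rough illustration of Definitions 1 and 2, the sketch below encodes the path-relevant parts of a tiny ET-model in Python and enumerates its complete paths. The two-agent toy model, the class name `ETModel`, the function `complete_paths` and the `max_len` cut-off are all assumptions made for this example; they are not part of the formal framework, and the toy model is far smaller than $M_{KT}$.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ETModel:
    """Hypothetical container for the parts of an ET-model needed to build paths."""
    agents: tuple        # N, in a fixed order
    initial: str         # the initial state
    terminal: frozenset  # T, the terminal states
    legal: dict          # L, stored as: state -> {agent: set of legal actions}
    update: dict         # U, stored as: (state, joint action) -> next state


def complete_paths(m, max_len=20):
    """Enumerate complete paths as (states, joint_actions) pairs.

    A joint action is a tuple with one action per agent, ordered as m.agents.
    max_len is an illustrative safeguard against models with cycles.
    """
    found = []

    def extend(states, joint_actions):
        w = states[-1]
        if w in m.terminal:  # only the last state of a path may be terminal
            found.append((tuple(states), tuple(joint_actions)))
            return
        if len(joint_actions) >= max_len:
            return
        # every component of a joint action must be legal for its agent
        choices = [sorted(m.legal[w][i]) for i in m.agents]
        for d in product(*choices):
            extend(states + [m.update[(w, d)]], joint_actions + [d])

    extend([m.initial], [])
    return found


# Toy model (an assumption for illustration): x picks "a" or "b", o must noop.
toy = ETModel(
    agents=("x", "o"),
    initial="w0",
    terminal=frozenset({"w1", "w2"}),
    legal={"w0": {"x": {"a", "b"}, "o": {"noop"}}},
    update={("w0", ("a", "noop")): "w1", ("w0", ("b", "noop")): "w2"},
)

paths = complete_paths(toy)
for states, joint_actions in paths:
    # the length of a path is its number of joint actions
    print(states, joint_actions, "length", len(joint_actions))
```

In this encoding the length $|\delta|$ of a path is simply `len(joint_actions)`, and $\delta[j]$ is `states[j]`.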

For any two states $w = (t_x, t_o, x_{1,1}, \dots, x_{3,3})$ and $w' = (t'_x, t'_o, x'_{1,1}, \dots, x'_{3,3})$ in $W_{KT}$, $w R_x w'$ iff (1) $t_i = t'_i$ for any $i \in N_{KT}$, (2) for any $1 \le j,k \le 3$, grid $(j,k)$ is occupied by the cross in $w$ (whether or not tried by the nought) iff it is so in $w'$, and (3) for any $1 \le j,k \le 3$, grid $(j,k)$ is occupied by the nought and tried by the cross in $w$ iff it is so in $w'$. The equivalence relation for o is defined in a similar way. Let $V_{KT}$ be a valuation such that for each state $w \in W_{KT}$, $V_{KT}(w)$ contains $turn(i)$ iff $t_i = 1$; $p^x_{j,k}$ iff grid $(j,k)$ is occupied by the cross; $p^o_{j,k}$ iff grid $(j,k)$ is occupied by the nought; $tried(a^x_{j,k})$ iff grid $(j,k)$ is occupied by the nought and tried by the cross; and $tried(a^o_{j,k})$ iff grid $(j,k)$ is occupied by the cross and tried by the nought. Moreover, we assume that each player takes the same action at the same stage of all her indistinguishable complete paths, i.e., $\theta_i(\delta, j) = \theta_i(\delta', j)$ whenever $\delta \approx_i^j \delta'$. Due to the space limit, we refrain from explicitly listing the legality relation, the update function, and the terminal and goal states for the agents, as this is possible but considerably lengthy. However, the syntactic descriptions of the game given in the following section detail them in a more compact way.

2.2 The Language

Let us now introduce an epistemic extension of a variant of GDL [Zhang and Thielscher, 2015b] to represent games with imperfect information, and further provide a semantics for the language based on the epistemic state transition model. We call this framework EGDL for short.

Definition 4. The language, denoted by $\mathcal{L}$, consists of
• the finite set $\Phi$ of propositional atoms;
• pre-defined propositions: $initial$, $terminal$, $wins(\cdot)$, $legal(\cdot)$ and $does(\cdot)$;
• logical connectives $\neg$ and $\wedge$;
• temporal operator $\bigcirc$;
• epistemic operators: $K$ and $C$.

A formula $\varphi$ in $\mathcal{L}$ is defined by the following BNF:

$\varphi ::= p \mid initial \mid terminal \mid legal(a^i) \mid wins(i) \mid does(a^i) \mid \neg\varphi \mid \varphi \wedge \varphi \mid \bigcirc\varphi \mid K_i\varphi \mid C\varphi$

where $p \in \Phi$, $i \in N$ and $a^i \in A^i$.

Other connectives $\vee, \to, \leftrightarrow, \top, \bot$ are defined by $\neg$ and $\wedge$ in a standard way. Intuitively, $initial$ and $terminal$ specify the initial state and the terminal states of a game, respectively; $does(a^i)$ asserts that agent $i$ takes action $a$ at the current state; $legal(a^i)$ asserts that action $a$ is available to agent $i$ at the current state; and $wins(i)$ asserts that agent $i$ wins at the current state. The formula $\bigcirc\varphi$ means "$\varphi$ holds in the next state". All these components are inherited from GDL. The epistemic operators $K$ and $C$ are taken from the Modal Epistemic Logic [Fagin et al., 2003]. The formula $K_i\varphi$ is read as "agent $i$ knows $\varphi$", and $C\varphi$ as "$\varphi$ is common knowledge among all the agents in $N$".

We use the following abbreviations in the rest of the paper:

$\hat{K}_i\varphi =_{def} \neg K_i \neg\varphi \qquad \hat{C}\varphi =_{def} \neg C \neg\varphi \qquad E\varphi =_{def} \bigwedge_{i \in N} K_i\varphi$

where $\hat{K}_i$ and $\hat{C}$ are the dual operators of $K_i$ and $C$, respectively. $\hat{K}_i\varphi$ says "$\varphi$ is compatible with agent $i$'s knowledge", and similarly for $\hat{C}\varphi$. $E\varphi$ says "every agent in $N$ knows $\varphi$".

Let us illustrate the intuition of the language with our running example.

Example (continued). The rules of Krieg-Tictactoe are specified by EGDL in Figure 1.

1. $initial \leftrightarrow turn(x) \wedge \neg turn(o) \wedge \bigwedge_{j,k=1}^{3} (\neg(p^x_{j,k} \vee p^o_{j,k}) \wedge \neg(tried(a^x_{j,k}) \vee tried(a^o_{j,k})))$
2. $wins(i) \leftrightarrow (\bigvee_{j=1}^{3} \bigwedge_{l=0}^{2} p^i_{j,1+l}) \vee (\bigvee_{k=1}^{3} \bigwedge_{l=0}^{2} p^i_{1+l,k}) \vee (\bigwedge_{l=0}^{2} p^i_{1+l,1+l}) \vee (\bigwedge_{l=0}^{2} p^i_{1+l,3-l})$
3. $terminal \leftrightarrow wins(x) \vee wins(o) \vee \bigwedge_{j,k=1}^{3} (p^x_{j,k} \vee p^o_{j,k})$
4. $turn(i) \to \bigcirc\neg turn(i) \wedge \bigcirc turn(-i)$
5. $legal(noop^i) \leftrightarrow turn(-i)$, where $-i$ represents $i$'s opponent
6. $legal(a^i_{j,k}) \leftrightarrow turn(i) \wedge \neg p^i_{j,k} \wedge \neg tried(a^i_{j,k})$
7. $\bigcirc p^i_{j,k} \leftrightarrow terminal \vee p^i_{j,k} \vee (does(a^i_{j,k}) \wedge \neg(p^x_{j,k} \vee p^o_{j,k}))$
8. $\bigcirc tried(a^i_{j,k}) \leftrightarrow terminal \vee tried(a^i_{j,k}) \vee (does(a^i_{j,k}) \wedge p^{-i}_{j,k})$
9. $does(a^i_{j,k}) \to K_i(does(a^i_{j,k}))$
10. $initial \to E\,initial$
11. $(turn(i) \to E\,turn(i)) \wedge (\neg turn(i) \to E\neg turn(i))$
12. $(p^i_{j,k} \to K_i p^i_{j,k}) \wedge (\neg p^i_{j,k} \to K_i \neg p^i_{j,k})$
13. $(tried(a^i_{j,k}) \to K_i tried(a^i_{j,k})) \wedge (\neg tried(a^i_{j,k}) \to K_i \neg tried(a^i_{j,k}))$

Figure 1: An EGDL description of Krieg-Tictactoe.

The initial state, each player's winning states, the terminal states and the turn-taking are given by Rules 1-4. The preconditions of each action are specified by Rule 5 and Rule 6. The player who has the turn can fill any grid s.t. (i) it is not filled by herself, and (ii) she has never tried to fill the grid before. The other player can only do action noop. Rules 7 and 8 combine the frame axioms and the effect axioms [Reiter, 1991]. Rule 7 states that a grid is marked with a player's symbol in the next state if the player takes the corresponding action at the current state, or the grid has been filled by herself, or the game ends. Similarly, Rule 8 says that an action is tried by a player in the next state if the action is ineffective while still taken by the player at the current state, or it has been tried before, or the game ends.

The others are the epistemic rules. Rule 9 states that each player knows which action she is taking. Rule 10 and Rule 11 say that both players know the initial state and the turn-taking, respectively. Rule 12 says that each player knows which grid is filled or not by her own symbol. Similarly, Rule 13 states that each player knows which grid is tried or not by herself. Let $\Sigma_{KT}$ be the set of Rules 1-13. It should be noted that Rules 11-13 together specify the epistemic relations for each player: two states are indistinguishable for a player if their configurations of the game board are the same in her view.

2.3 Semantics

The semantics of the language is based on the epistemic state transition model.

Definition 5. Let $M$ be an ET-model. Given a complete path $\delta$ in $M$, a stage $j$ of $\delta$ and a formula $\varphi \in \mathcal{L}$, we say $\varphi$ is true at $j$ of $\delta$ under $M$, denoted by $M, \delta, j \models \varphi$, according to the following definition:

$M, \delta, j \models p$ iff $p \in V(\delta[j])$
$M, \delta, j \models \neg\varphi$ iff $M, \delta, j \not\models \varphi$
$M, \delta, j \models \varphi_1 \wedge \varphi_2$ iff $M, \delta, j \models \varphi_1$ and $M, \delta, j \models \varphi_2$
$M, \delta, j \models initial$ iff $\delta[j] = \bar{w}$
$M, \delta, j \models terminal$ iff $\delta[j] \in T$
$M, \delta, j \models wins(i)$ iff $\delta[j] \in g(i)$
$M, \delta, j \models legal(a^i)$ iff $a^i \in L(\delta[j])$
$M, \delta, j \models does(a^i)$ iff $\theta_i(\delta, j) = a^i$
$M, \delta, j \models \bigcirc\varphi$ iff if $j < |\delta|$, then $M, \delta, j+1 \models \varphi$
$M, \delta, j \models K_i\varphi$ iff for any $\delta' \in \mathcal{P}$ with $\delta \approx_i^j \delta'$, $M, \delta', j \models \varphi$
$M, \delta, j \models C\varphi$ iff for any $\delta' \in \mathcal{P}$ with $\delta \approx_N^j \delta'$, $M, \delta', j \models \varphi$

where $\approx_N^j$ is the transitive closure of $\bigcup_{i \in N} \approx_i^j$.

A formula $\varphi$ is globally true in an ET-model $M$, written $M \models \varphi$, if $M, \delta, j \models \varphi$ for any $\delta \in \mathcal{P}$ and any $0 \le j \le |\delta|$. A formula $\varphi$ is valid, written $\models \varphi$, if $M \models \varphi$ for any ET-model $M$. A formula $\varphi$ is called true at a state $w$ in $M$, written $M, w \models \varphi$, if it is true for all complete paths going through $w$, i.e., $M, \delta, j \models \varphi$ for any $\delta \in \mathcal{P}$ and any $j \ge 0$ with $\delta[j] = w$. Finally, let $\Sigma$ be a set of formulas in $\mathcal{L}$; then $M$ is a model of $\Sigma$ if $M \models \varphi$ for all $\varphi \in \Sigma$.

We now show that EGDL provides a sound description for Krieg-Tictactoe.

Proposition 1. $M_{KT}$ is a model of $\Sigma_{KT}$.

It follows that these game rules are common knowledge among the two players, which is just what we expect.

Corollary 1. $M_{KT} \models C\varphi$ for all $\varphi \in \Sigma_{KT}$.

3 Epistemic and Strategic Reasoning

In this section, we demonstrate the expressive power and flexibility of EGDL by showing how it allows us to specify epistemic properties and reason about agents' knowledge.

3.1 Epistemic Properties

The introduction of imperfect information raises new epistemic properties of a game. For instance, to make a game playable, each player should always know her own legal actions in the course of the game. This property, as well as some other well-known properties, can be naturally formulated in EGDL as follows: given $i \in N$ and $a^i \in A^i$,

(1) $initial \to C\,initial$
(2) $legal(a^i) \to K_i(legal(a^i))$
(3) $does(a^i) \to K_i(does(a^i))$
(4) $wins(i) \to K_i(wins(i))$
(5) $terminal \to C\,terminal$

Formulas (1) and (5) express that the initial state and the terminal states are common knowledge, respectively. Formula (2) says that each agent knows her own legal actions. In ATEL, this is a required semantic property, yet with no syntactic expression [Ågotnes, 2006]. Formula (3) asserts that each agent is aware of her own actions. This is called the "uniform" property of actions (strategies), also with no syntactic counterpart in ATEL [Van der Hoek and Wooldridge, 2003; Jamroga and van der Hoek, 2004]. Finally, formula (4) specifies that each agent should know her winning result.

Moreover, the above epistemic properties are precisely characterised by indistinguishable complete paths as follows:

Proposition 2. Let $M$ be an ET-model. Then
1. $M \models initial \to C\,initial$ iff for all $\delta, \delta' \in \mathcal{P}$ and any $j \in \mathbb{N}$, if $\delta \approx_N^j \delta'$, then ($\delta[j] = \bar{w}$ iff $\delta'[j] = \bar{w}$).
2. $M \models legal(a^i) \to K_i(legal(a^i))$ iff for all $\delta, \delta' \in \mathcal{P}$ and any $j \in \mathbb{N}$, if $\delta \approx_i^j \delta'$, then ($a^i \in L(\delta[j])$ iff $a^i \in L(\delta'[j])$).
3. $M \models does(a^i) \to K_i(does(a^i))$ iff for all $\delta, \delta' \in \mathcal{P}$ and any $j \in \mathbb{N}$, if $\delta \approx_i^j \delta'$, then ($\theta_i(\delta, j) = a^i$ iff $\theta_i(\delta', j) = a^i$).
4. $M \models wins(i) \to K_i(wins(i))$ iff for all $\delta, \delta' \in \mathcal{P}$ and any $j \in \mathbb{N}$, if $\delta \approx_i^j \delta'$, then ($\delta[j] \in g(i)$ iff $\delta'[j] \in g(i)$).
5. $M \models terminal \to C\,terminal$ iff for all $\delta, \delta' \in \mathcal{P}$ and any $j \in \mathbb{N}$, if $\delta \approx_N^j \delta'$, then ($\delta[j] \in T$ iff $\delta'[j] \in T$).

Obviously, not all games with imperfect information satisfy these epistemic properties. For instance, property (5) does not hold for Krieg-Tictactoe. Consider the two reachable states depicted in Figure 2. They are indistinguishable for player o. Yet the left one is a terminal state while the right one is not. In fact, according to the game rules, Krieg-Tictactoe satisfies all the other properties.

Figure 2: The indistinguishable states for o.

Observation 1. Formulas (1)-(4) are globally true in $M_{KT}$.

It should be noted that each EGDL-formula may be interpreted as a property of a game. Typically, globally true formulas describe properties of a particular game, such as the rules for Krieg-Tictactoe, while valid formulas specify general properties of a class of games and thus can be used to classify games. For instance, unlike Krieg-style board games, most card games have property (5).

3.2 Strategic Reasoning

Let us now show how to use EGDL to reason about agents' knowledge and actions based on the game rules. In the context of imperfect information, epistemic reasoning is closely related to strategic reasoning. To start with, the following proposition shows that EGDL is suitable for reasoning about players' knowledge, as it is a conservative extension of the standard Epistemic Logic $S5^C_n$ [Fagin et al., 2003].

Proposition 3. Given an EGDL-formula $\varphi$ without involving the operator $\bigcirc$ and the pre-defined propositions, $\varphi$ is valid in EGDL iff it is valid in $S5^C_n$.

This result indicates that EGDL is sufficient to provide a static characterization of agents' knowledge at a certain stage. For instance, with $S5^C_n$, we can derive the following formulas from the rules of Krieg-Tictactoe.

Observation 2.
1. $M_{KT} \models turn(i) \to C\,turn(i)$
2. $M_{KT} \models K_i(K_{-i}\, p^{-i}_{j,k} \vee K_{-i} \neg p^{-i}_{j,k})$
3. $M_{KT} \models K_i\, tried(a^i_{j,k}) \to K_i\, p^{-i}_{j,k}$
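Claims of this kind can be checked mechanically on small models. The following is a minimal Python sketch of the satisfaction relation of Definition 5, not a definitive implementation. Everything concrete in it is an assumption made for illustration: formulas are nested tuples, the helper names `holds` and `alike` are hypothetical, the three-state model is a toy rather than $M_{KT}$, $legal$ and $wins$ are omitted for brevity, $does$ is taken to be false at the last stage of a path, and $K_i$ only ranges over complete paths on which stage $j$ exists.

```python
def alike(m, paths, i, d, j):
    """Indices of complete paths agent i cannot tell apart from paths[d] at stage j."""
    cls = m["R"][i]  # R_i stored as: state -> equivalence-class label
    w = paths[d][0][j]
    return [e for e, (states, _) in enumerate(paths)
            if j < len(states) and cls[states[j]] == cls[w]]


def holds(m, paths, d, j, f):
    """True iff formula f holds at stage j of the complete path paths[d]."""
    states, acts = paths[d]
    w, op = states[j], f[0]
    if op == "atom":
        return f[1] in m["V"][w]
    if op == "initial":
        return w == m["initial"]
    if op == "terminal":
        return w in m["T"]
    if op == "does":  # ("does", agent, action); assumed false at the last stage
        return j < len(acts) and acts[j][f[1]] == f[2]
    if op == "not":
        return not holds(m, paths, d, j, f[1])
    if op == "and":
        return holds(m, paths, d, j, f[1]) and holds(m, paths, d, j, f[2])
    if op == "next":  # vacuously true when there is no next stage
        return j >= len(acts) or holds(m, paths, d, j + 1, f[1])
    if op == "K":  # ("K", agent, formula)
        return all(holds(m, paths, e, j, f[2]) for e in alike(m, paths, f[1], d, j))
    if op == "C":  # transitive closure of the union of all agents' relations
        seen, todo = {d}, [d]
        while todo:
            cur = todo.pop()
            for i in m["agents"]:
                for e in alike(m, paths, i, cur, j):
                    if e not in seen:
                        seen.add(e)
                        todo.append(e)
        return all(holds(m, paths, e, j, f[1]) for e in seen)
    raise ValueError(f"unknown operator: {op}")


# Toy model: x can tell w1 from w2, o cannot; atom p holds only at w1.
M = {
    "agents": ("x", "o"),
    "initial": "w0",
    "T": {"w1", "w2"},
    "V": {"w0": set(), "w1": {"p"}, "w2": set()},
    "R": {"x": {"w0": 0, "w1": 1, "w2": 2}, "o": {"w0": 0, "w1": 1, "w2": 1}},
}
# Its two complete paths, each as (states, joint actions keyed by agent).
P = [
    (("w0", "w1"), ({"x": "a", "o": "noop"},)),
    (("w0", "w2"), ({"x": "b", "o": "noop"},)),
]

p = ("atom", "p")
print(holds(M, P, 0, 1, ("K", "x", p)))  # x knows p at stage 1 of the first path
print(holds(M, P, 0, 1, ("K", "o", p)))  # o does not: she confuses w1 with w2
print(holds(M, P, 0, 1, ("C", p)))       # hence p is not common knowledge
```

The sketch enumerates all complete paths up front, so it is only practical for very small models; it is meant to make the clauses for $\bigcirc$, $K_i$ and $C$ concrete, not to reflect the complexity bounds discussed in Section 4.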

Clause 1 says the turn-taking is common knowledge. Clause 2 says a player knows that the opponent knows whether or not a grid is filled by the opponent herself. The last one says that if a player knows she has tried an action, then she knows the corresponding grid has been filled by the opponent. The last two properties are important when players gather information.

Furthermore, with the full expressive power of the language, we can use EGDL to specify agents' knowledge of particular game features and reason about how an agent's knowledge changes as a game progresses.

Observation 3.
1. $M_{KT} \models K_i\, p^i_{j,k} \to K_i \bigcirc p^i_{j,k}$
2. $M_{KT} \models K_i \bigcirc tried(a^i_{j,k}) \to \bigcirc K_i\, tried(a^i_{j,k})$
3. $M_{KT} \models initial \to C(\bigwedge_{j,k=1}^{3} legal(a^x_{j,k}) \wedge legal(noop^o))$
4. $M_{KT} \models does(a^i_{j,k}) \to \bigcirc K_i(p^i_{j,k} \vee tried(a^i_{j,k}))$

Intuitively, Clause 1 says that if a player knows a grid has been filled by herself, then she still knows this fact holds at the next state. Clause 2 says that a player is able to remember the grid she has tried to fill before. Clause 3 says that at the initial state the legal actions are common knowledge among the two players, and Clause 4 expresses that if a player takes an action now, then at the next state she will know that either the corresponding grid has been filled by her symbol or she has tried that action.

Most importantly, the interactions of actions and knowledge can be naturally formulated using EGDL. Specifically, they interact in three different ways:

(i) Knowledge is necessary for an agent to perform an action, which may be formulated by $does(a^i) \to K_i\varphi$. For instance, in Krieg-Tictactoe, with partial observation, a player might take an ineffective action by trying to fill a grid which has been filled by the opponent. We say a player $i$ takes a good action $a^i_{j,k}$, written $good(a^i_{j,k})$, if it is effective. It follows that, to take a good action, a player needs to know the grid she attempts to fill is empty. Formally, $M_{KT} \models good(a^i_{j,k}) \to K_i(\neg(p^x_{j,k} \vee p^o_{j,k}))$.

(ii) Performing an action may increase an agent's knowledge, which may be specified by $does(a^i) \to \bigcirc K_i\varphi$. For example, if a player takes an ineffective action, then she would know the corresponding grid has been filled by the other player. Consider the following complete path

$\delta = \bar{w} \xrightarrow{\langle a^x_{2,2},\, noop^o \rangle} w_1 \xrightarrow{\langle noop^x,\, a^o_{2,2} \rangle} w_2 \xrightarrow{\langle a^x_{1,1},\, noop^o \rangle} w_3 \cdots$

At stage 2, after player o tries to fill grid (2,2), by Rule 7 and Rule 13, she knows that the grid has been filled by player x. Thus, $M_{KT}, \delta, 1 \models does(a^o_{2,2}) \to \bigcirc K_o(tried(a^o_{2,2}) \wedge p^x_{2,2})$.

(iii) An agent makes her choice of actions based on her knowledge, which may be captured by $K_i\varphi \to does(a^i)$. Let us consider the following two basic actions:

$attack^i_{j,k} =_{def} K_i(does(a^i_{j,k}) \wedge \bigcirc wins(i)) \to does(a^i_{j,k})$
$block^i_{j,k} =_{def} K_i \bigcirc (does(a^{-i}_{j,k}) \wedge \bigcirc wins(-i)) \to does(a^i_{j,k})$

Intuitively, attack says that if a player knows that filling a grid leads to a win, then she should fill that grid. In contrast, block says that if a player knows her opponent can win by filling a grid in the next state, then the player must fill that grid at the current state to avoid an immediate loss.

4 Model Checking

In this section, we investigate the complexity of the model-checking problem for EGDL and develop a model-checking algorithm for EGDL.

The model-checking problem for EGDL, denoted by EGDL-MC, is the following: Given an EGDL-formula $\varphi$, an ET-model $M$, a path $\delta$ of $M$ and a stage $j$ on $\delta$, determine whether $M, \delta, j \models \varphi$ or not. In principle, two variants of EGDL-MC can be defined as follows: Given an ET-model $M$, a state $w$ of $M$ and an EGDL-formula $\varphi$, determine whether $M, w \models \varphi$, and determine whether $M \models \varphi$. It should be noted that all the bounds presented in this section remain true for these variants. Proofs are similar to those of EGDL-MC, or can be obtained by simple reductions to/from EGDL-MC.

Let us first consider the upper bound of the complexity for model-checking. Our goal is to show the following bound.

Theorem 1. EGDL-MC is in $\Delta_2^p$.

To prove this upper bound, according to the definition of $\Delta_2^p$, we need to prove that there is a polynomial-time deterministic Turing machine $\mathcal{M}$ with an NP-oracle such that $\mathcal{M}$ solves the model-checking problem for EGDL. To show the existence, let us start with a simple property of EGDL.

Let $\varphi$ be an EGDL-formula, and let $M = (\mathcal{F}, V)$ be an ET-model over $\mathcal{S}$, where $V$ is the valuation function of $M$ and $\mathcal{F}$ collects its remaining components. Take $\psi$ to be any subformula of $\varphi$ of the form $\Box\vartheta$, where $\Box$ is either $C$ or $K_i$ for some $i \in N$. We introduce a fresh propositional atom $p_\psi$ for $\psi$. Let $M_\psi$ be the ET-model $(\mathcal{F}, V_\psi)$ where $V_\psi$ is a valuation function defined by

$V_\psi(w) := V(w) \cup \{p_\psi\}$ if $M, w \models \psi$, and $V_\psi(w) := V(w)$ otherwise,

for each state $w$ of $M$. Let $\varphi_\psi$ denote the formula obtained from $\varphi$ by replacing $\psi$ by $p_\psi$. Then, by the definition of semantics for EGDL, the following property is clearly true.

Lemma 1. For every path $\delta$ of $M$ and every stage $j$ on $\delta$, it holds that $M, \delta, j \models \varphi$ iff $M_\psi, \delta, j \models \varphi_\psi$.

Thus, by applying the above lemma, the epistemic operators can be eliminated from the formula in a recursive way. For the EGDL-formulas without epistemic operators, we can show that the model-checking problem is tractable.

Lemma 2. The following problem is in PTIME: Given an ET-model $M$, a path $\delta$ of $M$, a stage $j$ on $\delta$ and an EGDL-formula $\varphi$ without involving any epistemic operators, determine whether $M, \delta, j \models \varphi$ or not.

Due to the space limit, we omit the proof here. Roughly speaking, in the context of model checking, the operator $\bigcirc$ can be simply regarded as a standard modal operator. Note that model-checking for the basic modal logic is in PTIME.

To construct the model $M_\psi$, the truth value of $\psi$ at each state of $M$ must be evaluated. To simplify the question, let us first consider a simple case as follows.

Lemma 3. The following problem is in NP: Given an ET-model $M$, a state $w$ of $M$ and an EGDL-formula $\hat{\Box}\varphi$, where $\hat{\Box} \in \{\hat{C}\} \cup \{\hat{K}_i : i \in N\}$ and $\varphi$ does not involve any epistemic operators, determine whether $M, w \models \hat{\Box}\varphi$ or not.

This lemma holds due to the observation that, given any path $\delta$ of $M$ and any $j \ge 0$, we have $M, \delta, j \models \varphi$ if, and only if, $M, \delta[j, j+k], 0 \models \varphi$, where $k$ is the number of occurrences of $\bigcirc$ in $\varphi$, and $\delta[j, j+k]$ denotes the segment of $\delta$ starting from position $j$ with length $k$. With it, one can design a nondeterministic Turing machine which first guesses a path of length $k$, and then checks the truth value of $\varphi$ under this path. By Lemma 2, the latter can be done in PTIME.

To eliminate all the epistemic operators, it remains to consider formulas with nested epistemic operators. With this complexity result for the non-nested case, we are now able to design an algorithm for the general case. Roughly speaking, the idea is to carry out the elimination of epistemic operators in a bottom-up way. As we can see in Algorithm 1, this idea is implemented in the algorithm elimeop.

Algorithm 1: elimeop($M$, $\varphi$)
Input: an ET-model $M$ and an EGDL-formula $\varphi$
Output: an ordered pair $(M', \varphi')$
begin
    switch $\varphi$ do
        case $\varphi$ is atomic do
            $M' \leftarrow M$; $\varphi' \leftarrow \varphi$;
        case $\varphi$ is of the form $\circ\psi$, where $\circ \in \{\neg, \bigcirc\}$ do
            $(N', \psi') \leftarrow$ elimeop($M$, $\psi$);
            $M' \leftarrow N'$; $\varphi' \leftarrow \circ\psi'$;
        case $\varphi$ is of the form $\psi \wedge \chi$ do
            $(N', \psi') \leftarrow$ elimeop($M$, $\psi$);
            $(N'', \chi') \leftarrow$ elimeop($N'$, $\chi$);
            $M' \leftarrow N''$; $\varphi' \leftarrow \psi' \wedge \chi'$;
        case $\varphi$ is of the form $\Box\psi$, where $\Box \in \{C, K_i\}$ do
            $(N', \psi') \leftarrow$ elimeop($M$, $\psi$);
            $V \leftarrow$ the valuation function of $N'$;
            for all $w$ in $W$ do
                if $N', w \models \hat{\Box}\neg\psi'$ is false then
                    $V(w) \leftarrow V(w) \cup \{p_\varphi\}$;
                end
            end
            $M' \leftarrow$ the model obtained from $N'$ by replacing the valuation function with $V$;
            $\varphi' \leftarrow p_\varphi$;
    end
    return $(M', \varphi')$;
end

It is easy to check that the algorithm can be implemented in a polynomial-time deterministic Turing machine with an NP-oracle. Here the procedure of checking $N', w \models \hat{\Box}\neg\psi'$ is used as the NP-oracle. By Lemma 3, the checking is in NP. In addition, Algorithm elimeop visits each subformula of $\varphi$ at most once, and the number of subformulas of $\varphi$ is not greater than the size of $\varphi$. These facts ensure that the Turing machine terminates in a polynomial number of stages.

With this algorithm, we can then devise an algorithm mc for the model-checking of EGDL such that, given any proper $M$, $\delta$, $j$ and $\varphi$ as input, mc works as follows:
• Firstly, mc calls the algorithm elimeop($M$, $\varphi$), and lets $(M', \varphi')$ be the result of this call.
• Next, mc checks whether $M', \delta, j \models \varphi'$ or not, and returns "true" if it holds, "false" otherwise.

By Lemma 1 and the definition of algorithm elimeop, it is not difficult to verify that $M, \delta, j \models \varphi$ if, and only if, $M', \delta, j \models \varphi'$. This ensures the soundness of the algorithm mc. On the other hand, by the previous analysis, the first stage can be implemented in a polynomial-time deterministic Turing machine with an NP-oracle; by Lemma 2, the second stage can be done in PTIME. Thus, the algorithm mc can be implemented in a polynomial-time deterministic Turing machine with an NP-oracle, which proves Theorem 1.

Next we identify a lower bound of the complexity of model-checking for EGDL.

Theorem 2. EGDL-MC is $\Theta_2^p$-hard.

Due to the space limit, we skip the proof here. The basic idea is to reduce the validity problem for Carnap's modal logic C to the model-checking problem of EGDL. The former has been proved to be $\Theta_2^p$-complete [Gottlob, 1995].

It should be noted that the lower bound shows that the algorithm mc is nearly optimal, since $\Delta_2^p$ and $\Theta_2^p$ are very close together and both lie in the second level of the polynomial hierarchy.

5 Conclusion

We have presented a logical framework for representing and reasoning about imperfect information games with imperfect recall players. The framework allows us to represent game rules, formalize epistemic properties, specify the interactions of knowledge and actions, and reason about agents' knowledge during game playing. We have also investigated the model-checking problem for the logic. These results indicate that we have made a reasonable compromise between expressive power and computational efficiency.

Most of the related work has been discussed in the Introduction. Besides that, the following is also worth mentioning. Ruan and Thielscher [2011] study the epistemic structure and expressiveness of GDL-II in terms of epistemic modal logic. They mainly provide a static characterization of players' knowledge at a certain stage without involving the temporal dimension. Haufe and Thielscher [2012] develop an automated reasoning method to deal with epistemic properties for GDL-II. Unlike ours, their method is restricted to positive-epistemic formulas. Our underlying language is from [Zhang and Thielscher, 2015b]. It was originally proposed for reasoning about strategies of asynchronous games with perfect information, while we investigate its epistemic extension for representing and reasoning about synchronous games with imperfect information.

Directions of future research are manifold. As we have mentioned, besides imperfect recall, the framework is flexible enough to specify other memory types. To obtain a complete picture of the relation between perfect or imperfect information, and perfect or imperfect recall, we plan to study properties of these memory types. We also want to investigate the satisfiability problem and the axiomatization of EGDL based on the current literature [Zhang and Thielscher, 2015a; Halpern et al., 2004].

Acknowledgments

We are grateful to Michael Thielscher for his valuable suggestions, and special thanks are due to four anonymous referees for their insightful comments. This research was partially supported by ANR-11-LABX-0040-CIMI within the program ANR-11-IDEX-0002-02.

References

[Ågotnes, 2006] Thomas Ågotnes. Action and knowledge in alternating-time temporal logic. Synthese, 149(2):375–407, 2006.
[Busard et al., 2015] Simon Busard, Charles Pecheur, Hongyang Qu, and Franco Raimondi. Reasoning about memoryless strategies under partial observability and unconditional fairness constraints. Information and Computation, 242:128–156, 2015.
[Fagin et al., 2003] Ronald Fagin, Yoram Moses, Joseph Y. Halpern, and Moshe Y. Vardi. Reasoning about Knowledge. MIT Press, 2003.
[Genesereth et al., 2005] Michael Genesereth, Nathaniel Love, and Barney Pell. General game playing: Overview of the AAAI competition. AI Magazine, 26(2):62–72, 2005.
[Gottlob, 1995] G. Gottlob. NP trees and Carnap's modal logic. Journal of the ACM, 42(2):421–457, 1995.
[Ruan and Thielscher, 2011] Ji Ruan and Michael Thielscher. The epistemic logic behind the game description language. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI'11), pages 840–845, 2011.
[Ruan and Thielscher, 2012] Ji Ruan and Michael Thielscher. Strategic and epistemic reasoning for the game description language GDL-II. In Proceedings of the 20th European Conference on Artificial Intelligence (ECAI'12), pages 696–701, 2012.
[Schiffel and Thielscher, 2011] Stephan Schiffel and Michael Thielscher. Reasoning about general games described in GDL-II. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI'11), pages 846–851, 2011.
[Schiffel and Thielscher, 2014] Stephan Schiffel and Michael Thielscher. Representing and reasoning about the rules of general games with imperfect information. Journal of Artificial Intelligence Research, 49:171–206, 2014.
[Schobbens, 2004] Pierre-Yves Schobbens. Alternating-time logic with imperfect recall.
Electronic Notes in The- [Halpern et al., 2004] Joseph Y Halpern, Ron Van Der Mey- oretical Computer Science, 85(2):82–93, 2004. den, and Moshe Y Vardi. Complete axiomatizations for reasoning about knowledge and time. SIAM Journal on [Thielscher, 2010] Michael Thielscher. A general game de- Computing, 33(3):674–703, 2004. scription language for incomplete information games. In Proceedings of the 24th AAAI Conference on Artificial In- [Haufe and Thielscher, 2012] Sebastian Haufe and Michael telligence (AAAI’10), pages 994–999, 2010. Thielscher. Automated verification of epistemic properties for general game playing. In Proceedings of the 13th In- [Van der Hoek and Wooldridge, 2003] Wiebe Van der Hoek ternational Conference on Principles of Knowledge Rep- and Michael Wooldridge. Cooperation, knowledge, and resentation and Reasoning (KR’12), pages 339–349, 2012. time: Alternating-time temporal epistemic logic and its ap- Studia Logica [Huang et al., 2013] Xiaowei Huang, Ji Ruan, and Michael plications. , 75(1):125–157, 2003. Thielscher. Model checking for reasoning about incom- [Waugh et al., 2009] Kevin Waugh, Martin Zinkevich, plete information games. In Proceedings of the 26th Aus- Michael Johanson, Morgan Kan, David Schnizlein, and tralasian Joint Conference on Advances in Artificial Intel- Michael H Bowling. A practical use of imperfect recall. ligence (AI’13), pages 246–258, 2013. In Proceedings of the 8th Symposium on Abstraction, [Jamroga and van der Hoek, 2004] Wojciech Jamroga and Reformulation, and Approximation (SARA’09), pages Wiebe van der Hoek. Agents that know how to play. Fun- 175–182, 2009. damenta Informaticae, 63(2):185–219, 2004. [Zhang and Thielscher, 2015a] Dongmo Zhang and Michael [Love et al., 2006] Nathaniel Love, Timothy Hinrichs, Thielscher. A logic for reasoning about game strategies. David Haley, Eric Schkufza, and Michael Gene- In Proceedings of the 29th AAAI Conference on Artificial sereth. 
General game playing: Game descrip- Intelligence (AAAI’15), pages 1671–1677, 2015. tion language specification. Stanford Logic Group [Zhang and Thielscher, 2015b] Dongmo Zhang and Michael Computer Science Department Stanford University. Thielscher. Representing and reasoning about game strate- http://logic.stanford.edu/reports/LG-2006-01.pdf, 2006. gies. Journal of Philosophical Logic, 44(2):203–236, [Piccione and Rubinstein, 1997] Michele Piccione and Ariel 2015. Rubinstein. On the interpretation of decision problems with imperfect recall. Games and Economic Behavior, 20(1):3–24, 1997. [Pritchard, 1994] David Brine Pritchard. The Encyclopedia of Chess Variants. Games & Puzzles, 1994. [Reiter, 1991] Raymond Reiter. The frame problem in the situation calculus: A simple solution (sometimes) and a completeness result for goal regression. Artificial Intelli- gence and Mathematical Theory of Computation: Papers in Honor of John McCarthy, 27:359–380, 1991.
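The two-step procedure mc built on Algorithm 1 (first eliminate epistemic operators bottom-up, then check an operator-free formula) can be illustrated with a small Python sketch. It is only a toy under strong assumptions and not the paper's construction: it uses a plain Kripke model with a finite state set instead of the epistemic state transition model with paths and stages, it handles only K_i (no C and no temporal operator), and it evaluates each knowledge check directly instead of calling an NP-oracle. The names Model, holds, elimeop, mc and the fresh atoms "_p1", "_p2", ... are hypothetical.

```python
from dataclasses import dataclass

# Formulas are nested tuples (an assumed encoding):
#   ("atom", name), ("not", f), ("and", f, g), ("K", agent, f)


@dataclass
class Model:
    states: set   # finite set of states W
    indist: dict  # agent -> set of (w, v) pairs the agent cannot tell apart
    val: dict     # state -> set of atom names true at that state


def holds(m, w, f):
    """Evaluate an epistemic-operator-free formula f at state w."""
    tag = f[0]
    if tag == "atom":
        return f[1] in m.val[w]
    if tag == "not":
        return not holds(m, w, f[1])
    if tag == "and":
        return holds(m, w, f[1]) and holds(m, w, f[2])
    raise ValueError("epistemic operator not eliminated: %r" % (f,))


def elimeop(m, f, counter=None):
    """Return (m2, f2) where f2 contains no K operator.

    Each subformula K_i g is replaced by a fresh atom that is added to
    exactly those states where K_i g holds (atoms "_pN" are assumed unused).
    """
    counter = counter if counter is not None else [0]
    tag = f[0]
    if tag == "atom":
        return m, f
    if tag == "not":
        n, g = elimeop(m, f[1], counter)
        return n, ("not", g)
    if tag == "and":
        n, g = elimeop(m, f[1], counter)
        n, h = elimeop(n, f[2], counter)  # second call reuses the updated model
        return n, ("and", g, h)
    if tag == "K":
        agent = f[1]
        n, g = elimeop(m, f[2], counter)
        counter[0] += 1
        p = "_p%d" % counter[0]
        new_val = {w: set(atoms) for w, atoms in n.val.items()}  # copy, no mutation
        for w in n.states:
            # K_agent g holds at w iff g holds at every state indistinguishable from w.
            if all(holds(n, v, g) for (u, v) in n.indist[agent] if u == w):
                new_val[w].add(p)
        return Model(n.states, n.indist, new_val), ("atom", p)
    raise ValueError("unknown formula: %r" % (f,))


def mc(m, w, f):
    """Model-check f at state w: eliminate K operators, then evaluate."""
    m2, f2 = elimeop(m, f)
    return holds(m2, w, f2)


# Toy model: agent "a" cannot distinguish states 1 and 2; agent "b" can.
m = Model(
    states={1, 2},
    indist={"a": {(1, 1), (1, 2), (2, 1), (2, 2)}, "b": {(1, 1), (2, 2)}},
    val={1: {"q"}, 2: set()},
)
q = ("atom", "q")
```

In this toy model, mc(m, 1, ("K", "a", q)) is false because q fails at state 2, which agent "a" cannot tell apart from state 1, whereas the nested formula ("K", "a", ("not", ("K", "a", q))) holds at both states. Each subformula is processed once, mirroring the observation that elimeop visits every subformula of φ at most once.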
