Complexity of Planning with Partial Observability

Jussi Rintanen Albert-Ludwigs-Universitat¨ Freiburg, Institut fur¨ Informatik Georges-Kohler-Allee,¨ 79110 Freiburg im Breisgau Germany

Abstract PSPACE We show that for conditional planning with partial observ- branching beliefs ability the existence problem of plans with success proba- bility 1 is 2-EXP-complete. This result completes the com- EXP = APSPACE EXPSPACE plexity picture for non-probabilistic propositional planning. We also give new more direct and informative proofs for the beliefs branching EXP-hardness of conditional planning with full observability and the EXPSPACE-hardness of conditional planning with- 2−EXP = AEXPSPACE out observability. The proofs demonstrate how lack of full observability allows the encoding of exponential space Tur- Figure 1: Effect of branching and partial observability on ing machines in the planning problem, and how the neces- complexity of planning: classical planning is PSPACE- sity to have branching in plans corresponds to the move to a complete, beliefs (partial observability) add space require- defined in terms of alternation from the cor- ment exponentially, and branching (nondeterminism) adds responding deterministic complexity class. Lack of full ob- alternation. servability necessitates the use of beliefs states, the number of which is exponential in the number of states, and alternation corresponds to the choices a branching plan can make. bilities of nondeterministic events do not matter, and a fi- nite discrete belief space can be used. In many applica- Introduction tions plans with success probability anywhere strictly below The of many forms of AI plan- 1 are not interesting. This is so especially in many engineer- ning is well known. The most important problem that has ing and manufacturing applications. This is in strong con- been analyzed is that of existence of plans, in some cases trast to many optimizations problems typically expressed as existence of plans having a certain success probability or - MDPs/POMDPs, for which existence of solutions is obvi- source consumption. Plan existence for classical planning ous, and the problem is to find a solution that is optimal or (deterministic, one initial state) is PSPACE-complete (By- close to optimal. lander 1994). Conditional planning with nondeterministic In this paper we show that for non-probabilistic (success operators, several initial states and full observability is EXP- probability 1) partially observable planning the plan exis- complete (Littman 1997). Conditional planning with non- tence problem is 2-EXP-complete. We outline new proofs deterministic operators and without observability (confor- of the EXP-hardness of conditional planning with full ob- mant planning) is EXPSPACE-complete (Haslum and Jon- servability and EXPSPACE-hardness of conditional plan- sson 2000). ning without observability, and obtain the 2-EXP-hardness Associating probabilities with nondeterministic choices proof as a generalization of both of these two new proofs. and imposing a lower bound on plan’s success probability The proofs very intuitively explain the problem complexi- make the problem more difficult. Without observability the ties in terms of the types of Turing machines simulated. problem of existence of plans with success probability ≥ c This paper completes the complexity picture of non- is undecidable (Madani et al. 2003). Both full observability probabilistic propositional planning in its most general and restriction to exponentially long plan executions make forms, as summarized in Figure 1. Transition from the state the problem decidable and bring it down to EXPSPACE and space to the belief space leads to exactly an exponential in- below (Littman et al. 1998; Mundhenk et al. 2000). crease in . From classical planning to con- The complexity of one important problem has remained formant planning this is from PSPACE to EXPSPACE, and unknown until now. For probabilistic planning under par- from nondeterministic full-information planning to nonde- tial observability, the special case of success probability 1 terministic planning with partial observability this is from is of great importance. The problem is decidable because APSPACE = EXP to AEXPSPACE = 2-EXP. Similarly, tran- for reaching the goals with probability 1 the exact proba- sition from the deterministic to the corresponding nondeter-

1 ministic planning problem1 means a transition from a de- PSPACE is the class of decision problems solvable by de- terministic complexity class to the corresponding alternat- terministic Turing machines that use a number of tape cells ing complexity class; this corresponds to the introduction of bounded by a polynomial on the input length n. Formally, branches into the plans. From classical planning to nonde- [ k terministic full information planning this is from PSPACE PSPACE = DSPACE(n ). to APSPACE = EXP, and from conformant planning to gen- k≥0 eral partially observable planning this is from EXPSPACE Similarly other complexity classes are defined in terms of to AEXPSPACE = 2-EXP. the time consumption (DTIME(f(n)), or time and space The structure of the paper is as follows. First we define al- consumption on alternating Turing machines (ATIME(f(n)) ternating Turing machines and explain the relations between and ASPACE(f(n))) (Balcazar´ et al. 1988; 1990). deterministic complexity classes and their alternating coun- S nk terparts, followed by a definition of the planning problems EXP = k≥0 DTIME(2 ) we address. In the rest of the paper we analyze the computa- S nk EXPSPACE = k≥0 DSPACE(2 ) k tional complexity of the fully observable, unobservable and S 2n partially observable planning problems, in first two cases 2-EXP = k≥0 DTIME(2 ) S k giving a new more direct hardness proof, and in the third APSPACE = k≥0 ASPACE(n ) case we establish the complexity for the first time. Before S nk AEXPSPACE = k≥0 ASPACE(2 ) concluding the paper we discuss related work. There are many useful connections between these classes Preliminaries: Complexity Classes (Chandra et al. 1981), for example EXP = APSPACE In this section we define alternating Turing machines and 2-EXP = AEXPSPACE. several complexity classes used in this paper. Preliminaries: Planning Definition 1 An alternating (ATM) is a tu- We formally define the conditional planning problem. ple hΣ, Q, δ, q0, gi where • Q is a finite set of states (the internal states of the ATM), Definition 2 Let A be a set of state variables. A problem instance in planning is hI, O, G, V i where I and G are for- • Σ is a finite alphabet (the contents of tape cells), mulae on A respectively describing the sets of initial and • δ is a transition function δ : Q × Σ ∪ {|, 2} → goal states, O is a set of operators hc, ei where c is a for- 2Σ∪{|}×Q×{,N,}, mula on A describing the precondition and e is an effect, and V ⊆ A is the set of observable state variables. Effects • q is the initial state, and 0 are recursively defined as follows. • g : Q → {∀, ∃, accept, reject} is a labeling of the states. 1. a and ¬a for a ∈ A are effects. The symbols | and 2 are the left-end-of-tape and the blank 2. e1∧· · ·∧en is an effect if e1, . . . , en are effects (the special symbol, respectively. We require that s = | and m = R case with n = 0 is the empty conjunction >.) for hs, q0, mi ∈ δ(q, |) and any q ∈ Q, that is, at the 3. c B e is an effect if c is a formula on A and e is an effect. beginning of the tape the movement is to the right and | may 4. e1| · · · |en is an effect if e1, . . . , en for n ≥ 2 are effects. not be overwritten. For hs0, q0, mi ∈ δ(q, s) such that s ∈ Σ, we require s0 ∈ Σ. Above e1 ∧ · · · ∧ en means that all the effects ei simulta- A configuration of a TM, consisting of the internal state q neously take place. The notation c B e is for conditionality, and the tape contents, is final if g(q) ∈ {accept,reject}. that is, effect e takes place if c is true in the current state. The acceptance of an input string by an ATM is defined Nondeterministic effects e1| · · · |en mean randomly choos- inductively starting from final configurations that are accept- ing one of the effects ei. ing. A final configuration is accepting if g(q) = accept. Compound observations and observations dependent on Non-final configurations are accepting if the state is univer- the last applied operator can be reduced in polynomial time sal (∀) and all the successor configurations are accepting or to the basic model defined above in which the same set V of if the state is existential (∃) and at least one of the succes- state variables is observable all the time. sor configurations is accepting. Finally, an ATM accepts a given input string if the initial configuration with initial state Definition 3 Let A be a set of state variables. A problem instance hI, O, G, V i on A is q0 and the input string on the work tape is accepting. A nondeterministic Turing machine (NDTM) is an ATM 1. fully observable if V = A, without universal states. A deterministic Turing machine is 2. unobservable if V = ∅, an NDTM with |δ(q, s)| = 1 for all q ∈ Q and s ∈ Σ. 3. partially observable if no restrictions on V are imposed.

1We can view conformant planning as deterministic planning in Plans are directed graphs with nodes of degree 1 labeled the belief space, because the successor belief state uniquely deter- with operators and edges from branch nodes labeled with mined by the action and the preceding belief state. formulae.

2 Definition 4 Let hI, O, G, V i be a problem instance in Planning without Observability planning. A plan is a triple hN, b, li where The plan existence problem in probabilistic planning with- • N is a finite set of nodes, out observability is undecidable. Madani et al. (2003) • b ∈ N is the initial node, prove this by using the close connection of the problem to • l : N → (O × N) ∪ 2L×N is a function that assigns each the emptiness problem of probabilistic finite automata (Paz node an operator and a successor node ho, ni ∈ O × N 1971; Condon and Lipton 1989). Without observability, or a set of formulae and successor nodes hφ, ni. plans are sequences of actions. The emptiness problem is Only the observable state variables V may occur in the about the existence of a word with an acceptance probabil- branch labels φ. Nodes with l(n) = ∅ are terminal. ity higher than c. This equals the planning problem when plans are identified with words and goal states are identified The definition of plan execution should be intuitively with accepting states. The acceptance probability may be in- clear. When a successor node of a branch node is chosen, creased by increasing the length of the word, and in general the result is undefined if more than one condition is true. from a given c no finite upper bound can be derived and the In this paper we do not discuss plans with loops (cycles.) problem is therefore undecidable. Loops are needed for some nondeterministic problems in- When c = 1 the situation is completely different. Prob- stances as a representation of plans with unbounded execu- abilities strictly between 0 and 1 can be identified, which tion length, like obtaining 12 by throwing dice repeatedly. leads to a finite discrete belief space. For any problem in- The complexity of the plan existence problem with or with- stance there is a finite upper bound on plan length, if a plan out loops is the same, but simulations of alternating Tur- exists, and repetitive strategies for increasing success prob- ing machines become more complicated because existence ability to 1 do not have to be used and they do not help in of looping plans does not exactly correspond to the accep- reaching 1. Notice that for every other fixed c ∈]0, 1[ un- tance criterion of alternating Turing machines. decidability holds as the general problem can be reduced to the fixed-c problem for any c ∈]0, 1[ by adding a first action Planning under Full Observability that scales every success probability to the fixed c. Littman (1997) showed that the plan existence problem The unobservable planning problem is easily seen to be of probabilistic planning with full observability is EXP- in EXPSPACE. Haslum and Jonsson (2000) point out this complete. EXP-hardness was shown by reduction from the fact and outline the proof which is virtually identical to the game G4 (Stockmeyer and Chandra 1979). The reduction PSPACE membership proof of plan existence for classical does not rely on exact probabilities, and hence also the non- planning (Bylander 1994) except that it works at the level of probabilistic problem is EXP-complete. belief states instead of states. Here we sketch a new proof based on the equality EXP = APSPACE. In Theorem 8 we generalize this proof and the EXPSPACE-hardness proof of Theorem 7 to a 2-EXP- Theorem 6 In the unobservable case, testing the existence hardness proof for the general partially observable problem. of plans with success probability 1 is in EXPSPACE.

Theorem 5 Planning with full observability is EXP-hard. Haslum and Jonsson (2000) also show that an unobserv- able planning problem similar to ours is EXPSPACE-hard. Proof: Sketch: The proof is a Turing machine simulation Their proof is a reduction from the EXPSPACE-hard uni- like the PSPACE-hardness proof of classical planning (By- versality problem of regular expressions with exponentiation lander 1994), but for the more powerful alternating Turing (Hopcroft and Ullman 1979). machines, yielding hardness for EXP = APSPACE. The dif- Our EXPSPACE-hardness proof simulates deterministic ference is caused by ∀ and ∃ states of the ATM. For ∀ states, exponential-space Turing machines by planning. The main the transition is chosen nondeterministically and a branch in problem to be solved is the simulation of the exponentially the plan chooses how execution continues thereafter, and for long tape. In the PSPACE-hardness proof of classical plan- ∃ states the plan chooses the transition. The ATM accepts if ning each tape cell can be represented by one state variable, and only if there is a plan that reaches the goal states under but with an exponentially long tape this kind of simulation all nondeterministic choices.  is not possible. Instead, we use a randomization technique Testing plan existence is easily seen to be in EXP. This test that forces the tape contents to be faithfully represented in is performed by traversing the state space backwards starting the plan which may have a doubly exponential length. S from D0 = G by computing Di+1 = o∈O preimgo(Di) ∪ D until D = D for some n ≥ 0. If I ⊆ D , then a i n n+1 n Theorem 7 Existence of plans in planning without observ- plan exists. The number of states is exponential, and com- ability is EXPSPACE-hard. puting Dn is polynomial time in the number of states. The preimage preimg (B) of a set B of states with respect to an o hΣ, Q, δ, q , gi operator o is the set of states from which a state in B is al- Proof: Let 0 be a deterministic Turing machine e(x) σ ways reached by o. Viewing operators o ∈ O as relations on with an exponential space bound , and an input string n i σ σ the set S of states, we define preimg (B) as of length . We denote the th symbol of by i. o For encoding numbers from 0 to e(n) + 1 we need m = {s ∈ S|sos0 for some s0 ∈ B, sos0 for no s0 ∈ S\B}. dlog2(e(n) + 2)e Boolean state variables.

3 0 0 We construct a problem instance in nondeterministic plan- Now, these effects τs,q(s , q , m) which represent possible ning without observability for simulating the Turing ma- transitions are used in the operators that simulate the DTM. chine. The size of the instance is polynomial in the size Let hs, qi ∈ (Σ ∪ {|, 2}) × Q and δ(s, q) = {hs0, q0, mi}. of the TM and the input string. If g(q) = ∃ then define the operator It turns out that when not everything is observable, instead 0 0 of encoding all tape cells in the planning problem, it is suf- os,q = h((h 6= w) ∨ s) ∧ q, τs,q(s , q , m)i. ficient to keep track of only one tape cell (which we call the watched tape cell) that is randomly chosen for every plan A correct simulation is clearly obtained assuming that o execution. when operator s,q is executed the current tape symbol is s o The set P of state variables consists of indeed . So assume that some s,q is the first operator that misrepresents the tape contents. Let h = c for some c. Now 1. q ∈ Q for the internal states of the TM, there is an execution (initial state) of the plan so that w = c. 2. wi for i ∈ {0, . . . , m − 1} for the watched tape cell, On this execution the precondition of os,q is not satisfied, and the plan is not executable. Hence a valid plan cannot 3. s ∈ Σ ∪ {|, 2} for contents of the watched tape cell, and contain operators that misrepresent the tape contents.  4. hi, i ∈ {0, . . . , m − 1} for the position of the R/W head. The uncertainty in the initial state is about which tape cell is the watched. Otherwise the formula encodes the initial Planning under Partial Observability configuration of the TM, and it is the conjunction of the fol- Showing that the plan existence problem for planning with lowing formulae. partial observability is in 2-EXP is straightforward. The eas- 1. q0 iest way to see this is to view the partially observable plan- ning problem as a nondeterministic fully observable plan- 2. ¬q for all q ∈ Q\{q } 0 ning problem with belief states viewed as states. An opera- 3. Contents of the watched tape cell: tor maps a belief state to another belief state nondeterminis- | ↔ (w = 0) tically: subset of the image of a belief state with respect to an 2 ↔ (w > n) operator is chosen by observations. Like pointed out earlier, s ↔ W (w = i) for all s ∈ Σ the for fully observable planning run in polyno- i∈{1,...,n},σi=s mial time in the size of the state space. The state space with 4. h = 1 for the initial position of the R/W head. the belief states as the states has a doubly exponential size in the size of the problem instance, and hence the So the initial state formula allows any values for state vari- runs in doubly exponential time in the size of the problem ables wi and the values of the state variables s ∈ Σ are instance. determined by the values of wi. The expressions w = i and The main result of the paper, the 2-EXP-hardness of par- w > i denote the obvious formulae for integer equality and tially observable planning, is obtained as a generalization inequality of the numbers encoded by w0, w1,.... We also of the proofs of Theorems 5 and 7. From the EXPTIME- use effects h := h + 1 and h := h − 1 for incrementing and hardness proof we take the simulation of alternation by non- decrementing the number encoded by variables hi. deterministic operators, and from the EXPSPACE-hardness The goal is the following formula. proof the simulation of the exponentially long working tape. _ G = {q ∈ Q|g(q) = accept} These easily yield a simulation of alternating Turing ma- chines with an exponential space bound, and thereby proof To define the operators, we first define effects correspond- of AEXPSPACE-hardness. ing to all possible transitions. 0 0 For all hs, qi ∈ (Σ ∪ {|, 2}) × Q and hs , q , mi ∈ (Σ ∪ Theorem 8 Testing existence of plans with success proba- 0 0 {|}) × Q × {L, N, R} define the effect τs,q(s , q , m) as α ∧ bility 1 is 2-EXP-hard. κ ∧ θ where the effects α, κ and θ are defined as follows. The effect α describes the change to the current tape sym- Proof: Let hΣ, Q, δ, q0, gi be any alternating Turing machine bol. If s = s0 then α = > as nothing on the tape changes. with an exponential space bound e(x). Let σ be an input 0 Otherwise, α = ((h = w) B (¬s ∧ s )) to denote that the string of length n. We denote the ith symbol of σ by σi. 0 new symbol in the watched tape cell is s and not s. We construct a problem instance in nondeterministic plan- The effect κ changes the internal state of the TM. Again, ning with partial observability for simulating the Turing ma- 0 0 either the state changes or does not, so κ = ¬q ∧ q if q 6= q chine. The problem instance has a size that is polynomial and > otherwise. If R/W head movement is to the right then in the size of the description of the Turing machine and the 0 0 κ = ¬q∧((h < e(n)) B q ) if q 6= q , and (h = e(n)) B ¬q input string. otherwise. This prevents reaching an accepting state if the Here we just list the differences to the proof of Theorem space bound is violated: no further operators are applicable. 7 needed for handling alternation. The effect θ describes the movement of the R/W head. The set of state variables is extended with ( h := h − 1 if m = L 1. s∗ for s ∈ Σ ∪ {|} for the symbol last written, and θ = > if m = N h := h + 1 if m = R 2. L, R and N for the last movement of the R/W head.

4 The observable state variables are L, N and R, q ∈ Q, and be turned into a tree by having several copies of the shared s∗ for s ∈ Σ. These are needed by the plan to decide how to subplans. Branches not directly following a nondetermin- proceed execution after a nondeterministic ∀-transition. istic operator causing the uncertainty can be moved earlier The initial state formula is conjoined with ¬s∗ for all s ∈ so that only plan nodes with a non-singleton belief state im- Σ ∪ {|}. The goal formula remains unchanged. mediately follow a nondeterministic operator. With these Next we define the operators. All the transitions may transformations there is an exact match between plans and be nondeterministic, and the important thing is whether the computation trees of the ATM. transition is for a ∀ state or an ∃ state. For a given input Because alternating Turing machines with an exponential symbol and a ∀ state, the transition corresponds to one non- space bound are polynomial time reducible to the nonde- deterministic operator, whereas for a given input symbol and terministic planning problem with partial observability, the an ∃ state the transitions corresponds to a set of deterministic plan existence problem is AEXPSPACE=2-EXP-hard.  operators. 0 0 Effects τs,q(s , q , m) = α ∧ κ ∧ θ are like in the proof of Theorem 7 except for the following modifications. The Impact of Determinism on Complexity effect α is modified to store the written symbols to the state Our proofs for the EXP-hardness and 2-EXP-hardness of variables s∗. If s = s0 then α = > as nothing on the tape plan existence for fully and partially observable planning use 0 0∗ ∗ changes. Otherwise, α = ((h = w) B (¬s∧s ))∧s ∧¬s . nondeterminism. Also the EXP-hardness proof of Littman The effect θ is similarly extended to store the tape movement (1997) uses nondeterminism. The question arises whether to L, N and R. complexity is lower when all operators are deterministic. ( (h := h − 1) ∧ L ∧ ¬N ∧ ¬R if m = L θ = N ∧ ¬L ∧ ¬R if m = N Theorem 9 The plan existence problem with full observ- (h := h + 1) ∧ R ∧ ¬L ∧ ¬N if m = R ability and deterministic operators is in PSPACE. 0 0 Now, these effects τs,q(s , q , m) which represent possible Proof: This is because iteration over an exponential num- transitions are used in the operators that simulate the ATM. ber of initial states needs only polynomial space, and testing Operators for existential states q, g(q) = ∃ and for universal goal reachability for each initial state needs only polyno- states q, g(q) = ∀ differ. Let hs, qi ∈ (Σ ∪ {|, 2}) × Q and mial space like in the PSPACE membership proof of classi- δ(s, q) = {hs1, q1, m1i,..., hsk, qk, mki}. cal planning (Bylander 1994).  If g(q) = ∃, then define k deterministic operators Under unobservability determinism does not reduce com- os,q,1 = h((h 6= w) ∨ s) ∧ q, τs,q(s1, q1, m1)i plexity: proof of Theorem 7 uses deterministic operators os,q,2 = h((h 6= w) ∨ s) ∧ q, τs,q(s2, q2, m2)i only. But for the general partially observable problem de- . terminism does reduce complexity. . os,q,k = h((h 6= w) ∨ s) ∧ q, τs,q(sk, qk, mk)i Theorem 10 The plan existence problem with partial ob- That is, the plan determines which transition is chosen. If servability and deterministic operators is in EXPSPACE. g(q) = ∀, then define one nondeterministic operator Proof: The idea is similar to the EXPSPACE membership os,q = h((h 6= w) ∨ s) ∧ q, (τs,q(s1, q1, m1)| proof of planning without observability: go through all pos- . sible intermediate stages of a plan by binary search. De- . terminism yields an exponential upper bound on the sum of τs,q(sk, qk, mk))i. the cardinalities of the belief states that are possible after That is, the transition is chosen nondeterministically. branching and a given number of actions, (see Figure 2), and We claim that the problem instance has a plan if and only determinism also makes it unnecessary to visit any belief if the Turing machine accepts without violating the space state more than once. Hence plan executions have doubly bound. If the Turing machine violates the space bound, then exponential length and binary search needs only exponen- h > e(n) and an accepting state cannot be reached because tial recursion depth. no further operator will be applicable. Let hC1,...,Cki be the classes of observationally indis- From an accepting computation tree of an ATM we can tinguishable states. Let S be the set of states. For belief P 0 construct a plan, and vice versa. The construction of the state B and set L of belief states with B0∈L |B | ≤ |S|, plan is recursive: accepting final configurations correspond test reachability with plans of depth 2i by the following pro- to terminal nodes of plans, ∃-nodes correspond to an oper- cedure. ator corresponding to the transition made in the node fol- procedure reach(B,L,i) lowed by the plan recursively constructed for the successor if i = 0 then configuration, and ∀-nodes correspond to the execution of a begin nondeterministic operator followed by a branch node that se- for each j ∈ {1, . . . , k} 0 0 lects the plan recursively constructed for the corresponding if imgo(B) ∩ Cj 6⊆ B for all o ∈ O and B ∈ L 0 0 successor configuration. and B ∩ Cj 6⊆ B for all B ∈ L Construction of computation trees from plans is similar, then return false but involves small technicalities. A plan with DAG form can end

5 N0 only: they both come down to PSPACE from EXPSPACE.

Related Work and Conclusions The complexity of constructing and evaluating policies for MDPs with the restriction to finite-horizon performance has been thoroughly analyzed by Mundhenk et al. (2000). They N1 N2 N3 Nn evaluate the computational complexity of policy evaluation and policy existence under different assumptions. While in the fully observable, unobservable and in the general case we have problems complete for EXP, EXPSPACE and 2- EXP, Mundhenk et al. have EXP, NEXP and EXPSPACE for history-dependent POMDPs with problem instances rep- resented as circuits, exponentially long horizons and same Figure 2: In a deterministic problem instance, the sum of observability restrictions. The latter complexities are this the cardinalities of the possible sets of states at plan nodes low because of the exponential horizon length. N1,...,Nn cannot exceed the cardinality of the possible set In this paper we have proved that the plan existence of states at the root node of the plan N0. Here N1,...,Nn problem of propositional non-probabilistic planning with are plan nodes that intersect the plan in the sense that none partial observability is 2-EXP-complete. Complexity of of these nodes is an ancestor node of another. classical planning and non-probabilistic propositional plan- ning with other observability restrictions (full observabil- deterministic non-deterministic ity, no observability) was already known (Bylander 1994; full observability PSPACE EXP Littman 1997; Haslum and Jonsson 2000), and our result no observability EXPSPACE EXPSPACE was the most important missing one. partial observability EXPSPACE 2-EXP We also gave more direct proofs of EXPSPACE-hardness of planning without observability and EXP-hardness of Table 1: Summary of complexities of planning with multi- planning with full observability, shedding more light to re- ple initial states, deterministic or non-deterministic opera- sults first proved by Haslum and Jonsson and by Littman, tors, and different degrees of observability. and discussed the impact of determinism on complexity. The results do not address plan optimality with respect to a cost measure, like plan size or plan depth, but it seems that return true; for many cost measures there is no increase in complexity end and that this is in many cases easy to prove. else 0 S P 0 for each L ⊆ 2 such that B0∈L0 |B | ≤ |B| and 0 0 0 References for every B ∈ L , B ⊆ Cj for some j ∈ {1, . . . , k} if reach(B,L0,i − 1) then Jose´ Luis Balcazar,´ Josep D´ıaz, and Joaquim Gabarro.´ begin Structural Complexity I. Springer-Verlag, Berlin, 1988. flag := true; Jose´ Luis Balcazar,´ Josep D´ıaz, and Joaquim Gabarro.´ 00 0 for each B ∈ L Structural Complexity II. Springer-Verlag, Berlin, 1990. if reach(B00,L,i − 1) = false then flag := false; if flag = true then return true Tom Bylander. The computational complexity of propo- end sitional STRIPS planning. Artificial Intelligence, 69(1- end 2):165–204, 1994. return false A. Chandra, D. Kozen, and L. Stockmeyer. Alternation. We can now test plan existence by calling reach(B,L,|S|) Journal of the ACM, 28(1):114–133, 1981. for every B = I ∩ Ci, i ∈ {1, . . . , k} and L = {G ∩ Ci|i ∈ Anne Condon and Richard J. Lipton. On the complexity {1, . . . , k}}. The algorithm always terminates, and a plan of space bounded interactive proofs (extended abstract). In exists if and only if answer true is obtained in all cases. Proceedings of the 30th IEEE Symposium on Foundations The space consumption is (only) exponential because the of Computer Science, pages 462–467. IEEE, 1989. recursion depth is exponential and the sets L0 ⊆ 2S with P 0 0 Patrik Haslum and Peter Jonsson. Some results on the B0∈L0 |B | ≤ |B| have size ≤ |S|. Sets L this small complexity of planning with incomplete information. In suffice because all operators are deterministic, and after Susanne Biundo and Maria Fox, editors, Recent Advances any number of actions, independently of how the plan has in AI Planning. Fifth European Conference on Planning branched, the sum of the cardinalities of the possible belief (ECP’99), number 1809 in Lecture Notes in Artificial In- states is not higher than the number of initial states.  telligence, pages 308–318. Springer-Verlag, 2000. The problem complexities are summarized in Table 1. John E. Hopcroft and Jeffrey D. Ullman. Introduction to Restriction to only one initial state affects the determinis- Automata Theory, Languages, and Computation. Addison- tic unobservable and partially observable planning problems Wesley Publishing Company, 1979.

6 M. L. Littman, J. Goldsmith, and M. Mundhenk. The com- putational complexity of probabilistic planning. Journal of Artificial Intelligence Research, 9:1–36, 1998. Michael L. Littman. Probabilistic propositional planning: Representations and complexity. In Proceedings of the 14th National Conference on Artificial Intelligence (AAAI-97) and 9th Innovative Applications of Artificial Intelligence Conference (IAAI-97), pages 748–754, Menlo Park, July 1997. AAAI Press. Omid Madani, Steve Hanks, and Anne Condon. On the un- decidability of probabilistic planning and related stochastic optimization problems. Artificial Intelligence, 147(1–2):5– 34, 2003. Martin Mundhenk, Judy Goldsmith, Christopher Lusena, and Eric Allender. Complexity of finite-horizon Markov decision process problems. Journal of the ACM, 47(4):681–720, 2000. Azaria Paz. Introduction to Probabilistic Automata. Aca- demic Press, 1971. Larry J. Stockmeyer and Ashok K. Chandra. Provably dif- ficult combinatorial games. SIAM Journal on Computing, 8(2):151–174, 1979.

7