Unifying the Causal Graph and Additive Heuristics

Malte Helmert
Albert-Ludwigs-Universität Freiburg
Institut für Informatik
Georges-Köhler-Allee 52
79110 Freiburg, Germany
[email protected]

Hector Geffner
ICREA & Universitat Pompeu Fabra
Passeig de Circumvalació 8
08003 Barcelona, Spain
[email protected]

Abstract

Many current heuristics for domain-independent planning, such as Bonet and Geffner's additive heuristic and Hoffmann and Nebel's FF heuristic, are based on delete relaxations. They estimate the goal distance of a search state by approximating the solution cost in a relaxed task where negative consequences of operator applications are ignored. Helmert's causal graph heuristic, on the other hand, approximates goal distances by solving a hierarchy of "local" planning problems that only involve a single state variable and the variables it depends on directly.

Superficially, the causal graph heuristic appears quite unrelated to heuristics based on delete relaxation. In this contribution, we show that the opposite is true. Using a novel, declarative formulation of the causal graph heuristic, we show that the causal graph heuristic is the additive heuristic plus context. Unlike the original heuristic, our formulation does not require the causal graph to be acyclic, and thus leads to a proper generalization of both the causal graph and additive heuristics. Empirical results show that the new heuristic is significantly better informed than both Helmert's original causal graph heuristic and the additive heuristic and outperforms them across a wide range of standard benchmarks.

Introduction

The causal graph heuristic introduced by Helmert (2004; 2006) represents one of the few informative heuristics for domain-independent planning that takes into account negative operator interactions. Unlike the additive heuristic used early in the HSP planner (Bonet and Geffner 2001) and the relaxed planning graph heuristic used in FF (Hoffmann and Nebel 2001), the causal graph heuristic is not based on the delete relaxation but on a deeper analysis of the problem structure as captured by its underlying causal graph. The causal graph is a directed graph where the nodes stand for the variables in the problem and links express the dependencies among them. The causal graph heuristic is defined for problems with acyclic causal graphs as the sum of the costs of plans for subproblems that include a variable and its parents in the graph. The local costs are not optimal (else the heuristic would be intractable) and are defined procedurally.

In this paper we introduce an alternative, declarative formulation of the causal graph heuristic that we believe is simpler and more general. The new heuristic reduces to Helmert's heuristic when the causal graph is acyclic, but requires neither acyclicity nor the causal graph itself. Like the additive heuristic, the new heuristic is defined mathematically by means of a functional equation, which translates into a shortest-path problem over a poly-size graph that can be solved by standard algorithms. Indeed, the only difference between this account of the causal graph heuristic and the normal additive heuristic is that the nodes in this graph, which stand for the atoms in the problem, are labeled with contextual information. The new formulation of the causal graph heuristic suggests a number of extensions, all of which have to do with the exploitation of implicit or explicit precedences among an operator's preconditions in order to capture side effects in the computation of the heuristic. Even without such extensions, we experimentally show that the new heuristic delivers considerably better heuristic guidance than both the causal graph heuristic and the additive heuristic across a large suite of standard benchmarks.

Multi-valued Planning Tasks

The causal graph heuristic is defined over a planning language with multi-valued variables based on the SAS+ language (Bäckström and Nebel 1995), where the basic atoms are of the form v = d, where v is a variable and d ∈ D_v is a value in v's domain D_v.

Formally, a multi-valued planning task (MPT) is a tuple Π = ⟨V, s_0, s_⋆, O⟩ where V is a set of variables v with associated finite discrete domains D_v, s_0 is a state over V characterizing the initial situation, s_⋆ is a partial state over V characterizing goal situations, and O is a set of operators that map one state into a possibly different state.

A state is a function s that maps each variable v ∈ V into a value s(v) in D_v. A partial state s′ is a state restricted to a subset V′ ⊆ V of variables. We write dom(s′) for the subset of variables on which s′ is defined. As is common in the Boolean setting, we often represent and treat such functions as the set of atoms v = d that they make true. For an atom x, we write var(x) for the variable associated with x. For example, if x is the atom v = d, then var(x) = v.

Given a state s and an atom v = d, s[v = d] denotes the state that is like s except for variable v, which it maps to d. We will also use similar notations like s[s′], where s′ is a partial state, to denote the state that is like s except for the variables in dom(s′), where it is like s′.

An operator o has a precondition pre(o) that is a partial state, and a set of effects or rules z → v = d, written also as o : z → v = d, where the condition z is a partial state, v ∈ V is a variable, and d is a value in D_v. An operator o is executable in a state s if pre(o) ⊆ s, and the result is a state s′ that is like s except that variables v are mapped into values d when o : z → v = d is an effect of o and z ⊆ s. A plan is a sequence of applicable operators that maps the initial state s_0 into a final state s_G with s_⋆ ⊆ s_G.

These definitions follow the ones by Helmert (2006), apart from the omission of axioms and derived variables. (This is for simplicity of presentation; all results in this paper readily generalize to the full formalism.) In addition, for simplicity and without loss of generality, we make two assumptions. Firstly, we assume that operator preconditions pre(o) are empty. This is because the value of none of the heuristics considered in this paper changes when preconditions p ∈ pre(o) are moved into the body z of all effects z → v = d. Secondly, we assume that the variable v that appears in the head of a rule z → v = d also appears in the body z. Effects z → v = d for which this is not true are to be replaced by a collection of effects v = d′, z → v = d, one for each value d′ ∈ D_v different from d; a transformation that preserves the semantics and complies with the above condition. Effects can thus all be written as v = d′, z′ → v = d.

While this language is not standard in planning, an automatic translation algorithm from PDDL into MPTs exists (Helmert 2008).

Copyright © 2008, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
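The state and operator semantics above are straightforward to mechanize. The following is a minimal illustrative sketch (the names `Operator`, `executable`, and `apply_op` are ours, not from the paper): states and partial states are dicts from variables to values, and rules o : z → v = d are (z, v, d) triples.

```python
from dataclasses import dataclass

@dataclass
class Operator:
    precondition: dict   # partial state pre(o): {variable: value}
    effects: list        # rules o : z -> v = d, as (z, v, d) triples

def executable(op, state):
    # o is executable in s iff pre(o) is a subset of s
    return all(state.get(v) == d for v, d in op.precondition.items())

def apply_op(op, state):
    # the result is like s, except that each rule z -> v = d
    # whose condition z is a subset of s sets v to d
    result = dict(state)
    for z, v, d in op.effects:
        if all(state.get(u) == e for u, e in z.items()):
            result[v] = d
    return result
```

Note that several rules of one operator may fire simultaneously, as in the operators b_i of the example discussed later, which change two variables at once.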
Causal Graph Heuristic

The causal graph heuristic h^CG(s) provides an estimate of the number of operators needed to reach the goal from a state s in terms of the estimated costs of changing the value of each variable v that appears in the goal from its value s(v) in s to its value s_⋆(v) in the goal:

    h^CG(s) =def Σ_{v ∈ dom(s_⋆)} cost_v(s(v), s_⋆(v))    (1)

The costs cost_v(d, d′) are defined with the help of two structures: the domain transition graphs DTG(v), which reveal the structure of the domain D_v associated with each variable v, and the causal graph CG(Π), which reveals the relation among the variables v in the problem Π.

The domain transition graph DTG(v) for a variable v ∈ V is a labelled directed graph with vertex set D_v and edges (d, d′) labelled with the conditions z for rules v = d, z → v = d′ in Π.

The causal graph CG(Π) for Π = ⟨V, s_0, s_⋆, O⟩ is the directed graph with vertex set V and arcs (v, v′) for v ≠ v′ such that v appears in the label of some arc in DTG(v′) or some operator o affects both v and v′ (i.e., Π contains effects o : z → v = d and o : z′ → v′ = d′ for some operator o).

The costs cost_v(d, d′) that determine the heuristic h^CG(s) in Eq. 1 are defined in terms of the causal graph CG(Π) and the domain transition graphs DTG(v). The definition assumes that CG(Π) is acyclic. When this is not so, Helmert's planner "relaxes" the causal graph by deleting some edges, defining the costs and the resulting heuristic over the resulting acyclic graph.¹

The measures cost_v(d, d′) stand for the cost of a plan π that solves the subproblem Π_{v,d,d′} with initial state s[v = d] and goal v = d′ that involves only the variable v and its parent variables in the causal graph. The measures cost_v(d, d′), however, do not stand for the optimal costs of these subproblems, whose computation is intractable (Helmert 2004), and are defined procedurally using a slight modification of Dijkstra's algorithm (Cormen, Leiserson, and Rivest 1990; Bertsekas 2000). Costs are computed in topological causal order, starting with the variables with no parents in the causal graph. (In practice, not all cost values need to be computed, but this optimization is not relevant here.)

We will not repeat the exact procedure for computing these costs (found in Fig. 18, Helmert 2006). Instead, we will explain the procedure in a way that makes the relationship between the causal graph heuristic and the additive heuristic more direct.

Let us recall first that in Dijkstra's algorithm, a cost label c(i) is associated with each node i in the graph, initialized to c(i) = 0 if i is the source node and c(i) = ∞ otherwise. In addition, an open list is initialized with all nodes. The algorithm then iteratively picks and removes the node i from open with least cost c(i), updating the values c(j) of all nodes j still in open to c(j) = min(c(j), c(i) + c(i, j)), where c(i, j) is the cost of the edge connecting node i to j in the graph. This is called the expansion of node i. The algorithm finishes when open is empty, after a number of iterations bounded by the number of nodes in the graph. The cost label c(i) of a node is optimal when selected for expansion and remains so until termination.

The cost c(i, j) of the directed edges (i, j) is assumed to be non-negative and is used in the computation only right after node i is expanded. Helmert's procedure takes advantage of this fact by setting the cost of such edges dynamically, right after node i is selected for expansion.

When v is a root variable in CG(Π), Helmert's procedure for solving the subproblem Π_{v,d,d′}, for a given d ∈ D_v and all d′ ∈ D_v, resulting in the costs cost_v(d, d′), is exactly Dijkstra's: the graph is DTG(v), the source node is d, the cost of all edges is set to 1, and upon completion, cost_v(d, d′) is set to c(d′) for all d′ ∈ D_v. For such variables, cost_v(d, d′) is indeed optimal for Π_{v,d,d′}.

For variables v with a non-empty set of parent variables v_i in the causal graph, with costs cost_{v_i}(e_i, e′_i) already computed for all e_i, e′_i ∈ D_{v_i}, Helmert's procedure for solving the subproblem Π_{v,d,d′} for a given d and all d′ ∈ D_v modifies Dijkstra's algorithm slightly by setting the cost of the edges (d_1, d_2) ∈ DTG(v) labelled with condition z to

    1 + Σ_{(v_i = e_i) ∈ z} cost_{v_i}(e′_i, e_i)    (2)

right after node d_1 has been selected for expansion, where e′_i is the value of variable v_i in the state s_{d_1} associated with node d_1. The state associated with a node d is defined as follows. The state s_d associated with the source node d is s, while if s_d is the state associated with the node d just selected for expansion, and the expansion of d causes c(d′) to decrease for some d′ ∈ D_v due to an edge (d, d′) labelled with z, then s_{d′} is set to s_d[z].

This procedure for solving the subproblems Π_{v,d,d′} is neither optimal nor complete. Indeed, it is possible for the costs cost_v(d, d′) to be infinite while the costs of Π_{v,d,d′} are finite. This is because the procedure achieves the intermediate values d″ greedily, carrying the side effects s_{d″} but ignoring their impact on the "future" costs to be paid in going from d″ to d′.

These limitations of the causal graph heuristic are known. What we seek here is an understanding of the heuristic in terms of the simpler and declarative additive heuristic.

¹ The original paper on the causal graph heuristic (Helmert 2004) also mentions, in passing, a generalization that does not require acyclicity. This generalization is very similar to the heuristic we introduce in this paper.
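For reference, the plain algorithm recapped above can be sketched as follows. This is our own illustrative code, not the paper's; `edge_cost(i, j)` is a callback (returning None where no edge exists) so that, as in Helmert's procedure, edge costs may be determined only at the moment node i is expanded.

```python
import heapq

def dijkstra(source, nodes, edge_cost):
    # c(i) = 0 for the source node, infinity otherwise
    c = {i: float("inf") for i in nodes}
    c[source] = 0
    open_heap = [(0, source)]
    expanded = set()
    while open_heap:
        ci, i = heapq.heappop(open_heap)
        if i in expanded:
            continue          # stale queue entry
        expanded.add(i)       # c(i) is optimal when i is selected for expansion
        for j in nodes:
            w = edge_cost(i, j)   # queried only right after i's expansion
            if w is not None and ci + w < c[j]:
                c[j] = ci + w     # c(j) = min(c(j), c(i) + c(i, j))
                heapq.heappush(open_heap, (c[j], j))
    return c
```

Helmert's modification keeps this skeleton but computes the edge weights of DTG(v) with Eq. 2 inside the callback, using the state s_{d_1} carried along with the expanded node.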

Additive Heuristic

For the language of MPTs Π = ⟨V, s_0, s_⋆, O⟩, the additive heuristic (Bonet, Loerincs, and Geffner 1997; Bonet and Geffner 2001) can be expressed as:

    h^add(s) =def Σ_{x ∈ s_⋆} h^add(x|s)    (3)

where x stands for the atoms v = d for v ∈ V and d ∈ D_v, and h^add(x|s) is an estimate of the cost of achieving x from s, given as:

    h^add(x|s) =def 0                                                  if x ∈ s
    h^add(x|s) =def min_{o : z → x} [ 1 + Σ_{x_i ∈ z} h^add(x_i|s) ]   if x ∉ s    (4)

This functional equation approximates the true cost function by assuming that the cost of joint conditions (in goals and effects) is additive. Such sums go away, however, if the goal is a single atom and no condition z features more than one atom. In such a case, the additive heuristic h^add coincides with the max heuristic h^max (Bonet and Geffner 2001), and both heuristics are optimal.

It follows from this that if x represents any atom v = d for some root variable v in the causal graph of Π, the value h^add(x|s) that follows from Eq. 4 is optimal, and thus in correspondence with the costs cost_v(s(v), d) computed by Helmert's procedure. For other values d′ of v, the costs cost_v(d′, d) are equivalent to the estimates h^add(x|s′) with s′ = s[v = d′].

This correspondence between the costs cost_v(d′, d) and the estimates h^add(x|s′) for root variables v raises the question of whether such costs can be characterized in the style of the additive heuristic also when v is not a root variable. Notice first that Eq. 2 used in Helmert's procedure is additive. At the same time, the procedure keeps track of side effects in a way that is not captured by Eq. 4, which evaluates the costs h^add(x_i|s) of all conditions x_i in the body of the rules o : z → x with respect to the same state s. This however suggests considering variations of Eq. 4 where these conditions x_i are evaluated in different states s_i, rendering the general pattern

    h(x|s) =def 0                                              if x ∈ s
    h(x|s) =def min_{o : z → x} [ 1 + Σ_{x_i ∈ z} h(x_i|s_i) ] if x ∉ s    (5)

where the states s_i over which some of the conditions x_i in a rule are evaluated may be different from the seed state s where the value of the heuristic needs to be assessed. We will refer to such states s_i that may be different from the seed state as contexts.

Context-enhanced Additive Heuristic

The use of Eq. 5 in place of Eq. 4 for defining a variation of the additive heuristic raises two questions:

1. how to choose the contexts s_i needed for evaluating the conditions x_i in a given rule o : z → x, and
2. how to restrict the number of total contexts s_i needed for computing the heuristic value of s.

We answer these two questions in line with Helmert's formulation of the causal graph heuristic, thus obtaining a heuristic that is closely related to both the causal graph and additive heuristics. We call the resulting heuristic the context-enhanced additive heuristic h^cea. We remark that there are other reasonable answers to these two questions, and we will discuss some generalizations later.

Let us recall first that, without loss of generality, we assume that all the rules have the form o : x′, z → x, where x and x′ are atoms that refer to the same variable, i.e., var(x) = var(x′). Also recall that s[s′] for state s and partial state s′ refers to the state that is like s except for the variables mentioned in s′, where it is equal to s′.

The answer to the first question implied by Helmert's formulation is that for rules o : x′, z → x, the condition x′ that refers to the same variable as x is achieved first, and the conditions in z are evaluated in the state s′ that results (Assumption 1).

The answer to the second question is that in the heuristic computation for a seed state s, the costs h^cea(x_i|s_i) for a context s_i are mapped into costs h^cea(x_i|s[x′_i]), where x′_i is the value of var(x_i) in s_i, meaning that information in the state s_i about other variables is discarded (Assumption 2). We will write h^cea(x|s[x′]) as h^cea(x|x′).

These assumptions lead to a heuristic that is very much like the additive heuristic except that a) the heuristic values h^cea(x|s′) are computed not only for the seed state s but for all states s′ = s[x′] where x′ is an atom that refers to the same variable as x, and b) during the computation, the preconditions other than x″ in the rules o : x″, z → x are evaluated in the state s(x″|x′) that results from achieving the "pivot" condition x″, producing then the context s(x|x′) associated with x. Formally, we get the equations:

    h^cea(x|x′) =def 0                                                                     if x = x′
    h^cea(x|x′) =def min_{o : x″, z → x} [ 1 + h^cea(x″|x′) + Σ_{x_i ∈ z} h^cea(x_i|x′_i) ] if x ≠ x′    (6)

where x′_i is the value of var(x_i) in the state that results from achieving x″ from x′, written as s(x″|x′) and obtained from

    s(x|x′) =def s[x′]                              if x = x′
    s(x|x′) =def s(x″|x′)[z, x][y_1, ..., y_n]      if x ≠ x′    (7)

where x″ is the atom for which o : x″, z → x is the rule that yields the minimum² in Eq. 6, and y_1, ..., y_n are the heads of all rules that must trigger simultaneously with this rule (i.e., o : x″, z_i → y_i for some z_i ⊆ z, for all i = 1, ..., n, for the same operator o). In words, when o : x″, z → x is the best (cost-minimizing) achiever for atom x from x′ according to Eq. 6, and s(x″|x′) is the state that is deemed to result from achieving x″ from x′ (i.e., the "side effect" of achieving x″ from x′), then s(x″|x′)[z, x, y_1, ..., y_n] is the state that is deemed to result from achieving x from x′.

The context-enhanced additive heuristic h^cea(s) is then defined as

    h^cea(s) =def Σ_{x ∈ s_⋆} h^cea(x|x_s)    (8)

where x_s is the atom that refers to var(x) in s, and h^cea(x|x′) is defined by Eqs. 6 and 7.
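For concreteness, the plain additive estimates of Eq. 4, which h^cea refines with contexts, can be computed as a least fixed point by naive Bellman-Ford-style sweeps. This is our own illustrative sketch (atoms are hashable tokens, rules are (body, head) pairs), not the computation used in the planner:

```python
def h_add(seed, goal, rules):
    """Solve Eq. 4 by relaxation and return the Eq. 3 sum over the goal."""
    INF = float("inf")
    cost = {atom: 0 for atom in seed}        # h(x|s) = 0 if x in s
    changed = True
    while changed:                           # relax until the least fixed point
        changed = False
        for body, head in rules:
            if all(b in cost for b in body):
                c = 1 + sum(cost[b] for b in body)   # additive body cost
                if c < cost.get(head, INF):
                    cost[head] = c
                    changed = True
    return sum(cost.get(g, INF) for g in goal)       # Eq. 3
```

Atoms never derived keep cost ∞, matching the case where no rule can achieve the goal under the relaxation.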
Example

As an illustration, consider a problem with a Boolean variable Y and a multi-valued variable X ∈ {0, ..., n}, represented by the atoms y, ¬y, and x_i, standing for the assignments Y = true, Y = false, and X = i for i = 0, ..., n, and operators a and b_i, i = 0, ..., n−1, with rules

    a : ¬y → y,    b_i : x_i, y → x_{i+1},    b_i : x_i, y → ¬y

(note how the operators b_i have two effects). We want to determine the value of h^cea(s) when x_0 and y hold in s and the goal is x_n. From Eq. 8, h^cea(s) = h^cea(x_n|x_0). The optimal plan for the problem is the operator sequence b_0, a, b_1, ..., a, b_{n−1}, for a cost of 2n − 1. The plans must increase the value of X one by one, and in between any two steps, the value of Y that is made false by each increase in X must be restored to true.

From Eq. 6, it follows that

    h^cea(x_0|x_0) = 0
    h^cea(x_{i+1}|x_0) = 1 + h^cea(x_i|x_0) + h^cea(y|y_0)    (9)

for i = 0, ..., n−1, where y_0 represents the value of Y in the state s(x_i|x_0) that results from achieving x_i from x_0. By Eq. 7, this state is characterized as:

    s(x_0|x_0) = s
    s(x_{i+1}|x_0) = s(x_i|x_0)[x_{i+1}, ¬y]    (10)

From Eq. 10, we see that ¬y holds in all states s(x_i|x_0) for i ≥ 1, so that y_0 = ¬y in Eq. 9 for i ≥ 1. However, y_0 = y in Eq. 9 for i = 0, since s(x_0|x_0) = s, and y is true in s. Clearly, we have h^cea(y|y) = 0 and h^cea(y|¬y) = 1, so that Eq. 9 becomes:

    h^cea(x_1|x_0) = 1 + h^cea(x_0|x_0) + 0
    h^cea(x_{i+1}|x_0) = 1 + h^cea(x_i|x_0) + 1    for i ≥ 1

With h^cea(x_0|x_0) = 0, we thus get h^cea(x_i|x_0) = 2i − 1 for all i ≥ 1, and therefore h^cea(s) = h^cea(x_n|x_0) = 2n − 1, so that h^cea(s) is optimal. The plain additive heuristic, on the other hand, is h^add(s) = n, which is optimal only for the delete relaxation.

Relationship to Causal Graph Heuristic

The causal graph underlying the problem above involves a cycle between the two variables X and Y. In the absence of such cycles, the following correspondence can be shown:³

Theorem 1 (h^cea vs. h^CG) If the causal graph CG(Π) is acyclic, then the causal graph heuristic h^CG and the context-enhanced additive heuristic h^cea are equivalent, i.e., for every state s, h^CG(s) = h^cea(s).

The sketch of the proof proceeds as follows. When CG(Π) is acyclic, a correspondence can be established between the costs cost_v(d, d′) defined by Helmert's procedure and the costs h^cea(x′|x) defined by Eq. 6 for x : v = d and x′ : v = d′, and between the states s_{d′} associated with the node d′ and the states s(x′|x) defined by Eq. 7.

These correspondences must be proved inductively: first on the root variables of the causal graph, and then on the execution of the modified Dijkstra procedure.

The first part is straightforward and was mentioned earlier: if v is a root variable, cost_v(d, d′) is the optimal cost of the subproblem Π_{v,d,d′}, which involves no other variables, and the costs h^cea(x′|x) are optimal as well, as there is a single condition in every rule and hence no sums or additivity assumptions, and all these conditions are mutex. At the same time, the states s_{d′} and s(x′|x) remain equivalent too.

If v is not a root variable, the correspondence between cost_v(d, d′) and s_{d′} on the one hand, and h^cea(x′|x) and s(x′|x) on the other, must hold as well for d′ = d before any node is expanded in Helmert's procedure. Assuming inductively that the correspondence holds also for all values e_i, e′_i ∈ D_{v_i} of all ancestors v_i of v in the causal graph, and for all values cost_v(d, d′) and states s_{d′} after the first i nodes have been expanded, it can be shown that the correspondence holds after i + 1 node expansions as well.

The above result suggests that the context-enhanced additive heuristic might be more informative than the causal graph heuristic: in problems with acyclic causal graphs, it is equally informative, and in problems with cycles in the causal graph, it does not need to ignore operator preconditions in order to break these cycles. We will later investigate this potential advantage empirically.

² If there are several rules minimizing the expression, some form of tie-breaking is needed to make the equations well-defined, for example by imposing an arbitrary strict order on rules. Different tie-breaking can lead to different heuristic estimates. This is not unusual; e.g., the causal graph heuristic and FF heuristic also depend on how ties between achievers of a proposition are broken.

³ The correspondence assumes that edges in domain transition graphs and rules in the problem are ordered statically in the same way, so that ties in Helmert's procedure and in Eqs. 6 and 7 are broken in the same way. Note that there is a direct correspondence between edges (d, d′) labelled with conditions z in DTG(v) and rules v = d, z → v = d′.

Relationship to Additive Heuristic

As stated earlier, there is also a close relationship between the context-enhanced additive heuristic and the plain additive heuristic. Indeed, it is easy to prove the following result:

Theorem 2 (h^cea vs. h^add) In problems where all state variables have Boolean domains, the additive heuristic h^add and the context-enhanced additive heuristic h^cea are equivalent, i.e., for every state s, h^add(s) = h^cea(s).

To see that this is true, observe that Eq. 6 defining h^cea(x|x′) simplifies significantly for Boolean domains. For the case where x ≠ x′, we must have x″ = x′, since x″ must be different from x in a rule o : x″, z → x and there are only two values in the domain of var(x). This means that the conditions in z are evaluated in the context of s(x′|x′), i.e., the same state in which h^cea(x|x′) is evaluated. Using this observation, it is easy to prove inductively that all costs are evaluated in the context of the seed state s, so that Eq. 6 for h^cea defines the same cost values as Eq. 4 for h^add.

Again, the result suggests that the context-enhanced additive heuristic might be more informative than the additive heuristic: in the case of all-Boolean state variables, the two heuristics are the same, while in general, h^cea may be able to use the context information to improve its estimates.

Computing the Heuristic for Cyclic Graphs

The context-enhanced additive heuristic h^cea reduces to the causal graph heuristic h^CG in MPTs with acyclic causal graphs, but does not require acyclicity, and indeed, does not use the notion of causal graphs or domain transition graphs. In this sense, the induction over the causal graph for defining the costs cost_v(d, d′), from variables to their descendants, is replaced in h^cea by an implicit induction over costs.

Despite the presence of a cyclic graph, the costs h^cea(x|x′) in Eqs. 6 and 7 can be computed using a modification of Dijkstra's algorithm, similar to Helmert's algorithm for h^CG. Indeed, Dijkstra's algorithm can be used for computing the normal additive heuristics as well, whether the causal graph is cyclic or not, although in practice the Bellman-Ford algorithm is preferred (Liu, Koenig, and Furcy 2002).

For h^cea, the Dijkstra-like algorithm operates on a graph containing one copy of DTG(var(x′)) for each atom x′ of the task. In the copy for atom x′, the distance from x′ to x corresponds to the heuristic value h^cea(x|x′) from Eq. 6. The algorithm initially assigns cost 0 to nodes corresponding to values h^cea(x′|x′) and cost ∞ to other nodes. It then expands nodes in order of increasing cost, propagating costs and contexts in a similar way to Helmert's procedure for h^CG.

However, while this approach leads to polynomial evaluation time, we have found it prohibitively expensive for use in a heuristic planner supposed to provide competitive results with h^CG and h^add. A critical speed-up can be obtained by separating the cost of a node in the Dijkstra algorithm (the associated h^cea(x|x′) value) from its priority (the time during the execution of the Dijkstra algorithm where the value is required). In particular, many nodes with low cost are never needed for the computation of h^cea(s) because they refer to contexts that never arise in the computation of the relevant cost estimates. By expanding nodes in order of increasing priority, rather than cost, we can compute h^cea(s) much more efficiently. Due to space restrictions, we do not discuss these algorithmic issues in detail.

Experiments

We have seen that the context-enhanced additive heuristic is a natural generalization of both the additive heuristic and the (acyclic) causal graph heuristic. In this section, we show that the unified heuristic is not just a theoretical concept, but can actually be put to good use in practice. For use in a heuristic planning system, a heuristic estimator must be informative, i.e., provide heuristic estimates that guide the search algorithm to a goal state without expanding too many search nodes, and it must be efficiently computable, i.e., require as little time as possible for each heuristic evaluation. Since h^cea clearly requires more computational effort than either of the heuristics it generalizes, the question is whether there are cases where it provides significantly better guidance, enough to overcome the higher per-node overhead.

We conducted two experiments to address this question. In the first experiment we investigate overall performance, while in the second we look more closely at the informativeness of the heuristics. As a search algorithm, we use greedy best-first search with deferred evaluation and preferred operators, as implemented in the Fast Downward planning system (Helmert 2006). All experiments were conducted with a 30 minute timeout and 2 GB memory limit on a computer with a 2.66 GHz Intel Xeon CPU.

As all heuristics were implemented within the same planning system and we want to compare the heuristics, not overall system performance, all runtimes reported in the following refer to search time (i.e., time spent by the search algorithm and heuristic evaluators), excluding overhead for parsing and grounding that is identical for all heuristics.

                         Helmert, 2004        New Results
Domain                  h^CG h^add h^FF   h^CG h^add h^FF h^cea
BLOCKSWORLD (35)          0    0    0      0    0    0    0
DEPOT (22)               14   11   10     10    7    5    5
DRIVERLOG (20)            3    3    2      0    0    1    2
FREECELL (80)             2   11    9      9    0    1    2
GRID (5)                  1    2    1      1    2    1    1
GRIPPER (20)              0    0    0      0    0    0    0
LOGISTICS (63)            0    0    0      0    6    1    0
MICONIC (150)             0    0    0      0    0    0    0
MOVIE (30)                0    0    0      0    0    0    0
MPRIME (35)               1    6    6      0    1    4    0
MYSTERY (19)              1    0    1      2    1    2    0
ROVERS (20)               3    5    3      0    0    0    0
SATELLITE (20)            0    0    2      0    0    0    0
ZENOTRAVEL (20)           0    0    0      0    0    0    0
Total (539)              25   38   34     22   17   15   10

Table 1: Number of unsolved problem instances by domain, for the benchmark suite considered by Helmert (2004). Eleven unsolvable MYSTERY instances excluded. First column shows number of problem instances in parentheses.
First Experiment

The first experiment, in which we investigate the overall performance of the context-enhanced additive heuristic, is modelled after the main experiment of the original paper on the causal graph heuristic (Helmert 2004). We evaluate on the same benchmarks (namely, the STRIPS benchmarks of IPC 1–3), we report the same data (number of problem instances solved in each domain and overall runtime results), and we compare the same heuristic estimators (additive heuristic, acyclic causal graph heuristic, FF heuristic). In addition, we provide results for the context-enhanced additive heuristic.

Table 1 reports the overall success rates of all heuristics, showing the number of problem instances not solved with each heuristic in each domain. For reference, we also include the results from Helmert's original experiment.

We first observe that for each of the three reference heuristics, we report better results than in the original experiment. This is partly due to differences in the experimental setup (e.g., Helmert only used a 5 minute timeout), but also due to the search enhancements used. The improvement is much more pronounced for the additive and FF heuristics than for the causal graph heuristic because the latter benefits less from the use of preferred operators. (For all three heuristics, disabling preferred operators degrades results, but the effect is least pronounced for the causal graph heuristic.)

Secondly, the table shows that the results for the context-enhanced additive heuristic are markedly better than for the other heuristics. While the closely related causal graph and additive heuristics leave 22 (h^CG) and 17 (h^add) out of 539 tasks unsolved, the context-enhanced additive heuristic solves all but 10 tasks, roughly halving the number of failures. The FF heuristic achieves the next best result by leaving 15 out of 539 tasks unsolved, 50% more than h^cea.

Aggregate runtime results are shown in Fig. 1. The graph for h^cea is consistently below h^CG and h^add, showing that the context-enhanced additive heuristic solves more problems than the two heuristics it generalizes for any timeout between one second and 30 minutes. The results for h^cea and h^FF are comparable. Their graphs intersect occasionally, so either heuristic outperforms the other for some timeouts.

[Figure 1: plot of search time (1s–1000s, log scale) against number of solved tasks (440–520) for h^cea, h^CG, h^add, and h^FF; image not reproduced here.]

Figure 1: Number of tasks solved by each heuristic within a given time, for Helmert's (2004) benchmark suite.

Domain                    Percent   Median   Average
ASSEMBLY                    100%     2034      2670
DEPOT                       100%    15.487    16.042
FREECELL                     94%    10.615    11.643
MPRIME                      100%    10.308    21.076
TRUCKS                       91%     8.377     5.758
GRID                        100%     4.735     4.117
DRIVERLOG                    94%     3.891     3.502
PIPESWORLD-NOTANKAGE         82%     3.057     4.814
ZENOTRAVEL                   92%     2.186     2.261
TPP                          84%     1.848     2.134
MYSTERY                      85%     1.846     2.210
BLOCKSWORLD                  71%     1.843     2.692
PIPESWORLD-TANKAGE           80%     1.818     2.592
AIRPORT                      78%     1.706     4.393
PATHWAYS                     80%     1.700     8.215
LOGISTICS                    96%     1.628     1.690
PSR-SMALL                    86%     1.364     1.436
PSR-LARGE                    76%     1.354     1.264
PSR-MIDDLE                   64%     1.050     1.077
ROVERS                       52%     1.038     1.334
OPENSTACKS                   52%     1.030     2.403
SCHEDULE                     35%     0.999     0.720
SATELLITE                    38%     0.966     0.583
STORAGE                      41%     0.939     0.569
OPTICALTELEGRAPHS             0%     0.923     0.923
MICONIC-FULLADL              35%     0.917     0.846
PHILOSOPHERS                 10%     0.612     0.617

Table 2: Comparison of node expansions of h^cea and h^CG: percentage of tasks where h^cea expands fewer nodes than h^CG, median reduction in node expansions, (geometric) average reduction in node expansions. Domains are ordered by decreasing rate of (median) improvement. Above the line, h^cea is more informed than h^CG.
The improvement is much causal graph and additive heuristics. In this experiment, we more pronounced for the additive and FF heuristics than for consider all domains from the previous International Plan- the causal graph heuristic because the latter benefits less ning Competitions (IPC 1–5), apart from those where each from the use of preferred operators. (For all three heuris- of the three heuristics can solve each benchmark instance tics, disabling preferred operators degrades results, but the in less than one second (thus excluding trivial domains like effect is least pronounced for the causal graph heuristic.) MOVIE and GRIPPER). Secondly, the table shows that the results for the context- We measure the informedness of a heuristic for a given enhanced additive heuristics are markedly better than for planning instance by the number of node expansions needed the other heuristics. While the closely related causal graph to solve it. Table 2 shows comparative results for the causal and additive heuristics leave 22 (hCG) and 17 (hadd) out of graph and context-enhanced additive heuristic. To be able to 539 tasks unsolved, the context-enhanced additive heuris- provide a comparison, we only consider instances solved by tic solves all but 10 tasks, roughly halving the number of both heuristics. We show both qualitative and quantitative failures. The FF heuristic achieves the next best result by results. Qualitatively, we show the percentage of tasks in leaving 15 out of 539 tasks unsolved, 50% more than hcea. each domain where hcea required fewer expansions. Larger Aggregate runtime results are shown in Fig. 1. The graph values than 50% indicate that hcea was more informed for for hcea is consistently below hCG and hadd, showing that most tasks of the domain. 
Quantitatively, we computed the the context-enhanced additive heuristic solves more prob- ratio between the number of node expansions of hCG and lems than the two heuristics it generalizes for any timeout hcea for each task (where values greater than 1 indicate that between one second and 30 minutes. The results for hcea and hcea required fewer expansions). For each domain, we report Domain Percent Median Average Discussion GRID 100% 2.714 2.244 The context-enhanced heuristic hcea is more general than DEPOT 93% 2.389 2.781 LOGISTICS 100% 2.236 3.117 the causal graph heuristic as it applies to problems with PIPESWORLD-TANKAGE 64% 1.901 2.255 cyclic causal graphs. It is also more general than the ad- MPRIME 91% 1.691 4.720 ditive heuristic, because it can take into account context in- add ZENOTRAVEL 92% 1.437 1.723 formation ignored by h . We have seen that this increase AIRPORT 74% 1.353 1.924 in generality translates to better heuristic guidance in many MYSTERY 78% 1.320 2.319 standard planning domains. PIPESWORLD-NOTANKAGE 65% 1.242 1.572 The heuristic can be generalized further, however, while ASSEMBLY 78% 1.145 1.356 remaining tractable, by relaxing the two assumptions that DRIVERLOG 69% 1.031 1.758 led us to it from the more general form in Eq. 5, where each OPTICALTELEGRAPHS 100% 1.023 1.023 condition x is evaluated in a potentially different context s PHILOSOPHERS 76% 1.019 1.362 i i OPENSTACKS 59% 1.003 1.070 in h(xi|si). The two assumptions were 1) that the contexts 0 PSR-SMALL 55% 1.000 1.007 si for all the conditions xi in a rule o : x , x1, . . . 
, xn → x 0 BLOCKSWORLD 49% 1.000 0.871 are all the same and correspond to the state s resulting from 0 0 STORAGE 47% 1.000 0.922 achieving the first condition x , and 2) that h(xi|s ) is re- 0 0 0 ROVERS 45% 1.000 0.992 duced to h(xi|s[xi]) where xi is the value of var(xi) in s , TPP 44% 1.000 0.978 thus effectively throwing away all the information in s0 that IDDLE PSR-M 44% 1.000 0.955 does not pertain to the variable associated with xi. Both PATHWAYS 30% 0.999 0.948 of these assumptions can be relaxed, leading to potentially MICONIC-FULLADL 31% 0.992 0.996 more informed heuristics, still expressible in the style of PSR-LARGE 42% 0.988 0.890 Eq. 5 and computable in polynomial time. We now discuss SATELLITE 17% 0.962 0.920 SCHEDULE 26% 0.931 0.650 some possibilities. FREECELL 25% 0.770 0.605 TRUCKS 20% 0.727 0.837 Generalizations The formulation that follows from Assumptions 1 and 2 Table 3: Comparison of node expansions of hcea and hadd. above presumes that in every rule z → x of the problem, Columns are as in Table 2. Above the first line, hcea is more a) there is a condition in the body of the rule that must be informed than hadd; below the second line, hcea is less in- achieved first (we call it the pivot condition), b) that this formed than hadd. In between, the median number of node pivot condition involves the same variable as the head, and expansions is identical. c) that no information involving the rest of the conditions in z is available or usable. Condition a) is not particularly restrictive as it is always the median of these values, indicating the typical advantage possible to add a dummy pivot condition. Condition b), on of hcea over hCG, and the geometric mean,4 indicating the the other hand, is restrictive, and often unnecessarily so. average advantage of hcea over hCG. Consider for example the following rule for “unlocking a The table clearly shows that hcea provides better heuristic door at a location”: CG estimates than h . 
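The footnote's point about aggregating ratios can be checked with a short computation. The snippet below is our own illustration (not from the paper): it contrasts the arithmetic and geometric means on the footnote's example, where one task improves by a factor of 5 and another degrades by the same factor.

```python
import math

def arithmetic_mean(ratios):
    return sum(ratios) / len(ratios)

def geometric_mean(ratios):
    # exp of the mean of logs; reciprocal ratios cancel exactly,
    # which is the behavior we want for expansion ratios
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

ratios = [5.0, 0.2]              # 5x better on one task, 5x worse on another
print(arithmetic_mean(ratios))   # 2.6 -- misleadingly suggests improvement
print(geometric_mean(ratios))    # ~1.0 (up to floating-point rounding)
```

The geometric mean is also symmetric in another sense: swapping the roles of the two heuristics replaces every ratio by its reciprocal and the aggregate by its reciprocal, which the arithmetic mean does not guarantee.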
The context-enhanced additive heuristic provides better estimates in 21 out of 27 domains; in 15 of these cases, the median improvement is a factor of 1.7 or better. In four cases, the median improvement is enormous, by a factor of 10 or better. Conversely, h^CG only provides better estimates in 6 out of 27 domains, and its advantage in the median case only exceeds a factor of 1.1 in a single case (PHILOSOPHERS), where it is 1.634.

Domain Percent Median Average
GRID 100% 2.714 2.244
DEPOT 93% 2.389 2.781
LOGISTICS 100% 2.236 3.117
PIPESWORLD-TANKAGE 64% 1.901 2.255
MPRIME 91% 1.691 4.720
ZENOTRAVEL 92% 1.437 1.723
AIRPORT 74% 1.353 1.924
MYSTERY 78% 1.320 2.319
PIPESWORLD-NOTANKAGE 65% 1.242 1.572
ASSEMBLY 78% 1.145 1.356
DRIVERLOG 69% 1.031 1.758
OPTICALTELEGRAPHS 100% 1.023 1.023
PHILOSOPHERS 76% 1.019 1.362
OPENSTACKS 59% 1.003 1.070

PSR-SMALL 55% 1.000 1.007
BLOCKSWORLD 49% 1.000 0.871
STORAGE 47% 1.000 0.922
ROVERS 45% 1.000 0.992
TPP 44% 1.000 0.978
PSR-MIDDLE 44% 1.000 0.955

PATHWAYS 30% 0.999 0.948
MICONIC-FULLADL 31% 0.992 0.996
PSR-LARGE 42% 0.988 0.890
SATELLITE 17% 0.962 0.920
SCHEDULE 26% 0.931 0.650
FREECELL 25% 0.770 0.605
TRUCKS 20% 0.727 0.837

Table 3: Comparison of node expansions of h^cea and h^add. Columns are as in Table 2. Above the first line, h^cea is more informed than h^add; below the second line, h^cea is less informed than h^add. In between, the median number of node expansions is identical.

Table 3 shows a similar comparison between the additive and context-enhanced additive heuristics. While the advantage for h^cea is less pronounced in this case, it is still quite apparent. The context-enhanced additive heuristic is more informed in the median case for 14 domains, and less informed for 7 domains. Only in two cases is the advantage of h^add larger than 1.1, and it is never larger than 1.376. On the other hand, h^cea shows an improvement by a factor of more than 1.1 for ten domains, and in three cases it exceeds a factor of 2.

Discussion

The context-enhanced heuristic h^cea is more general than the causal graph heuristic, as it applies to problems with cyclic causal graphs. It is also more general than the additive heuristic, because it can take into account context information ignored by h^add. We have seen that this increase in generality translates to better heuristic guidance in many standard planning domains.

The heuristic can be generalized further, however, while remaining tractable, by relaxing the two assumptions that led us to it from the more general form in Eq. 5, where each condition x_i is evaluated in a potentially different context s_i in h(x_i|s_i). The two assumptions were 1) that the contexts s_i for all the conditions x_i in a rule o : x', x_1, ..., x_n → x are all the same and correspond to the state s' resulting from achieving the first condition x', and 2) that h(x_i|s') is reduced to h(x_i|s[x'_i]), where x'_i is the value of var(x_i) in s', thus effectively throwing away all the information in s' that does not pertain to the variable associated with x_i. Both of these assumptions can be relaxed, leading to potentially more informed heuristics, still expressible in the style of Eq. 5 and computable in polynomial time. We now discuss some possibilities.

Generalizations

The formulation that follows from Assumptions 1 and 2 above presumes that in every rule z → x of the problem, a) there is a condition in the body of the rule that must be achieved first (we call it the pivot condition), b) that this pivot condition involves the same variable as the head, and c) that no information involving the rest of the conditions in z is available or usable.

Condition a) is not particularly restrictive, as it is always possible to add a dummy pivot condition. Condition b), on the other hand, is restrictive, and often unnecessarily so. Consider for example the following rule for "unlocking a door at a location":

have key(D), at = L → unlocked(D)    (11)

and say that the key is at a location L' different from L, while L_s is the current agent location. The cost of applying such a rule should include the cost of going from L_s to L' to pick up the key, as well as the cost of going from L' to L to unlock the door. However, this does not follow from the formulation above or from the causal graph heuristic. We can actually cast this rule into the format assumed by the formulation as

¬unlocked(D), have key(D), at = L → unlocked(D)

where for Boolean variables v, v abbreviates v = true and ¬v abbreviates v = false. However, as indicated in the proof sketch for Theorem 2, the context mechanism that the causal graph heuristic adds to the normal additive heuristic has no effect on rules z → v = d with Boolean variables v in the head. For a rule such as Eq. 11, it makes sense to treat the atom have key(D) as the pivot condition, even though it involves a variable that is different from the one mentioned in the rule head. Actually, the generalization of the heuristic that accounts for arbitrary pivot conditions in rules, as opposed to pivot conditions involving the head variable, is easy to accommodate. In the example above, the side effect of achieving the pivot condition have key(D) involves at = L', so the second condition in the rule, at = L, should be evaluated in that context, accounting for the cost of getting to the key location, and from there to the door.

A second generalization can be obtained by making use of further precedence constraints among the rule conditions. In the extreme case, if we have a total ordering among all of the rule conditions, we should evaluate the second condition in the context that results from achieving the first, the third condition in the context that results from achieving the second, and so on.
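This ordered evaluation can be sketched with a toy version of the key-and-door example above. Everything here is our own illustrative scaffolding rather than the paper's algorithm: the distance table and the h and progress helpers are hypothetical, and the point is only how the context is threaded from one condition to the next while costs are summed additively.

```python
def rule_cost(conditions, context, h, progress):
    """Evaluate a rule's conditions under a total order: each condition is
    costed in the current context, and later conditions see the side effects
    of achieving the earlier ones."""
    total = 0
    for x in conditions:
        total += h(x, context)           # evaluate x_i in the current context
        context = progress(x, context)   # thread x_i's side effects forward
    return total

# Toy model: agent starts at Ls, key is at Lk, door is at L.
distance = {("Ls", "Lk"): 3, ("Lk", "L"): 2, ("Ls", "L"): 4}

def h(x, ctx):
    if x == "have_key":
        return distance[(ctx["at"], "Lk")]   # walk to the key's location
    if x == "at=L":
        return distance[(ctx["at"], "L")]    # walk to the door from here
    return 0

def progress(x, ctx):
    if x == "have_key":                      # achieving have_key moves the agent
        return {**ctx, "at": "Lk", "have_key": True}
    if x == "at=L":
        return {**ctx, "at": "L"}
    return ctx

print(rule_cost(["have_key", "at=L"], {"at": "Ls"}, h, progress))
```

Threading the context charges 3 (go to the key) + 2 (key to door) = 5, whereas costing both conditions from the initial location, as a purely additive evaluation would, yields 3 + 4 = 7 and misses the interaction between the two conditions.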
Indeed, such an ordering among conditions or subtasks is one of the key elements that distinguishes HTN planning (Erol, Hendler, and Nau 1994) from "classical" planning. A variation of the context-enhanced additive heuristic can be used to provide an estimator capable of taking such precedence constraints into account.

The two generalizations above follow from relaxing Assumption 1. Assumption 2, which maps the estimates h(x_i|s') for contexts s' into the estimates h(x_i|s[x'_i]), where x'_i is the value of var(x_i) in s', throws away information in the context s' that does not pertain directly to the multi-valued variable associated with x_i but which may be relevant to it. In the causal graph heuristic, this assumption translates into the exclusion of the non-parent variables v' from the subproblems Π_{v,d,d'}. Instead, one may want to keep a bounded number of such variables v' in all subproblems. Such an extension would cause polynomial growth in complexity, and can be accommodated simply by replacing the assumption h(x_i|s') = h(x_i|s[x'_i]) implicit in Eq. 6 by an explicit assumption h(x_i|s') = h(x_i|s[x'_i, y']), where y' stands for the values of such "core variables" to be preserved in all contexts.

It must be said, however, that all these generalizations inherit certain commitments from the additive and causal graph heuristics, in particular, the additivity of the former and the greedy rule selection of the latter.

Summary

Defining heuristics mathematically rather than procedurally seems to pay off, as mathematical definitions are often clearer and more concise, and force us to express the assumptions explicitly, separating what is to be computed from how it is computed. Also, the functional equation form used in defining the additive heuristic, which is common in dynamic programming, appears to be quite general and flexible. It has been used to define the max heuristic h^max, the h^m heuristics (Haslum and Geffner 2000), and more recently the set-additive heuristic (Keyder and Geffner 2008) and cost-sharing heuristics (Mirkis and Domshlak 2007).

The resulting declarative formulation of the causal graph heuristic, called the context-enhanced additive heuristic, does not require acyclicity or causal graphs and leads to a natural generalization of both the causal graph and additive heuristics. We consider this observation an interesting contribution in its own right, as it establishes a strong connection between two previously unrelated research strands in heuristic search planning. In addition to its theoretical interest, the context-enhanced additive heuristic also proves to be very strong in practice.

Finally, the ideas presented in this paper admit some interesting generalizations, all of which have to do with the use of ordering information among operator preconditions to capture side effects in the computation of the heuristic. There are some interesting connections with HTN planning, where precedence constraints among preconditions or subtasks are common, which are also worth exploring.

Acknowledgments

This work was partly supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Center "Automatic Verification and Analysis of Complex Systems" (SFB/TR 14 AVACS). For more information, see http://www.avacs.org/.

Héctor Geffner is partially supported by Grant TIN2006-15387-C03-03 from MEC, Spain.

References

Bäckström, C., and Nebel, B. 1995. Complexity results for SAS+ planning. Computational Intelligence 11(4):625-655.
Bertsekas, D. P. 2000. Linear Network Optimization: Algorithms and Codes. The MIT Press.
Bonet, B., and Geffner, H. 2001. Planning as heuristic search. AIJ 129(1):5-33.
Bonet, B.; Loerincs, G.; and Geffner, H. 1997. A robust and fast action selection mechanism for planning. In Proc. AAAI-97, 714-719.
Cormen, T. H.; Leiserson, C. E.; and Rivest, R. L. 1990. Introduction to Algorithms. The MIT Press.
Erol, K.; Hendler, J. A.; and Nau, D. S. 1994. HTN planning: Complexity and expressivity. In Proc. AAAI-94, 1123-1128.
Haslum, P., and Geffner, H. 2000. Admissible heuristics for optimal planning. In Proc. AIPS 2000, 140-149.
Helmert, M. 2004. A planning heuristic based on causal graph analysis. In Proc. ICAPS 2004, 161-170.
Helmert, M. 2006. The Fast Downward planning system. JAIR 26:191-246.
Helmert, M. 2008. Understanding Planning Tasks - Domain Complexity and Heuristic Decomposition, volume 4929 of LNAI. Springer-Verlag.
Hoffmann, J., and Nebel, B. 2001. The FF planning system: Fast plan generation through heuristic search. JAIR 14:253-302.
Keyder, E., and Geffner, H. 2008. Heuristics for planning with action costs revisited. In Proc. ECAI 2008.
Liu, Y.; Koenig, S.; and Furcy, D. 2002. Speeding up the calculation of heuristics for heuristic search-based planning. In Proc. AAAI 2002, 484-491.
Mirkis, V., and Domshlak, C. 2007. Cost-sharing approximations for h+. In Proc. ICAPS 2007, 240-247.