
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

The FastMap Algorithm for Shortest Path Computations

Liron Cohen¹, Tansel Uras¹, Shiva Jahangiri², Aliyah Arunasalam¹, Sven Koenig¹ and T. K. Satish Kumar¹
¹University of Southern California  ²University of California, Irvine
{lironcoh, turas, arunasal, skoenig}@usc.edu, [email protected], [email protected]

Abstract

We present a new preprocessing algorithm for embedding the nodes of a given edge-weighted undirected graph into a Euclidean space. The Euclidean distance between any two nodes in this space approximates the length of the shortest path between them in the given graph. Later, at runtime, a shortest path between any two nodes can be computed with an A* search using the Euclidean distances as heuristic. Our preprocessing algorithm, called FastMap, is inspired by the data-mining algorithm of the same name and runs in near-linear time. Hence, FastMap is orders of magnitude faster than competing approaches that produce a Euclidean embedding using Semidefinite Programming. FastMap also produces admissible and consistent heuristics and therefore guarantees the generation of shortest paths. Moreover, FastMap applies to general undirected graphs for which many traditional heuristics, such as the Manhattan Distance heuristic, are not well defined. Empirically, we demonstrate that A* search using the FastMap heuristic is competitive with A* search using other state-of-the-art heuristics, such as the Differential heuristic.

1 Introduction and Related Work

Shortest path computations commonly occur in the inner procedures of many AI programs. In video games, for example, a large fraction of CPU cycles is spent on shortest path computations [Uras and Koenig, 2015]. Many other tasks in AI, including motion planning, temporal reasoning, and decision making [Russell and Norvig, 2009], also involve finding and reasoning about shortest paths. While Dijkstra's algorithm [Dijkstra, 1959] can be used to compute shortest paths in polynomial time, speeding up shortest path computations allows one to solve the aforementioned tasks faster. One way to do that is to use A* search with an informed heuristic [Hart et al., 1968].

A perfect heuristic is one that returns the true distance between any two nodes in a given graph. A* with such a heuristic and proper tie-breaking is guaranteed to expand nodes only on a shortest path between the given start and goal nodes. In general, computing the perfect heuristic between two nodes is as hard as computing the shortest path between them. Hence, A* search benefits from a perfect heuristic only if it is computed offline. However, precomputing all pairwise distances is not only time-intensive but also requires a prohibitive O(N²) memory, where N is the number of nodes. The memory requirements for storing all-pairs shortest paths can be somewhat addressed through compression [Botea and Harabor, 2013; Strasser et al., 2015].

Existing methods for preprocessing a given graph (without precomputing all pairwise distances) can be grouped into the following categories: Hierarchical abstractions that yield suboptimal paths have been used to reduce the size of the search space by abstracting groups of nodes [Botea et al., 2004; Sturtevant and Buro, 2005]. More informed heuristics [Björnsson and Halldórsson, 2006; Cazenave, 2006; Sturtevant et al., 2009] focus A* searches better, resulting in fewer expanded nodes. Hierarchies can also be used to derive heuristics during the search [Leighton et al., 2008; Holte et al., 1994]. Dead-end detection and other pruning methods [Björnsson and Halldórsson, 2006; Goldenberg et al., 2010; Pochter et al., 2010] identify areas of the graph that do not need to be searched to find shortest paths. Search with contraction hierarchies [Geisberger et al., 2008] is an optimal hierarchical method, where every level of the hierarchy contains only a single node. It has been shown to be efficient on road networks but seems to be less efficient on graphs with higher branching factors, such as grid-based game maps [Storandt, 2013]. N-level graphs [Uras and Koenig, 2014], constructed from undirected graphs by partitioning the nodes into levels, also allow for significant pruning during the search.

A different approach that does not rely on preprocessing of the graph is to use some notion of a geometric distance between two nodes as a heuristic of the distance between them. One such heuristic for gridworlds is the Manhattan Distance heuristic.¹ For many gridworlds, A* search using the Manhattan Distance heuristic outperforms Dijkstra's algorithm. However, in complicated 2D/3D gridworlds like mazes, the Manhattan Distance heuristic may not be sufficiently informed to focus A* searches effectively. Another issue associated with Manhattan Distance-like heuristics is that they are not well defined for general graphs.² For a graph that cannot be conceived in a geometric space, there is no closed-form formula for a "geometric" heuristic for the distance between two nodes because there are no coordinates associated with them.

For a graph that does not already have a geometric embedding in Euclidean space, a preprocessing algorithm can be used to generate one. As described before, at runtime, A* search would then use the Euclidean distance between any two nodes in this space as an estimate for the distance between them in the given graph. One such approach is Euclidean Heuristic Optimization (EHO) [Rayner et al., 2011]. EHO guarantees admissibility and consistency and therefore generates shortest paths. However, it requires solving a Semidefinite Program (SDP). SDPs can be solved in polynomial time [Vandenberghe and Boyd, 1996]. EHO leverages additional structure to solve them in cubic time. Still, a cubic preprocessing time limits the size of the graphs that are amenable to this approach.

The Differential heuristic is another state-of-the-art approach that has the benefit of a near-linear runtime. However, unlike the approach in [Rayner et al., 2011], it does not produce an explicit Euclidean embedding. In the preprocessing phase of the Differential heuristic approach, some nodes of the graph are chosen as pivot nodes. The distances between each pivot node and every other node are precomputed and stored [Sturtevant et al., 2009]. At runtime, the heuristic between two nodes a and b is given by max_p |d(a, p) − d(p, b)|, where p is a pivot node and d(·, ·) is the precomputed distance. The preprocessing time is linear in the number of pivots times the size of the graph. The required space is linear in the number of pivots times the number of nodes, although a more succinct representation is presented in [Goldenberg et al., 2011]. Similar preprocessing techniques are used in Portal-Based True Distance heuristics [Goldenberg et al., 2010].

In this paper, we present a new preprocessing algorithm, called FastMap, that produces an explicit Euclidean embedding while running in near-linear time. It therefore has the benefits of the small preprocessing time of the Differential heuristic approach and of producing an embedding from which a heuristic between two nodes can be quickly computed using a closed-form formula. Our preprocessing algorithm, dubbed FastMap, is inspired by the data-mining algorithm of the same name [Faloutsos and Lin, 1995]. It is orders of magnitude faster than SDP-based approaches for producing Euclidean embeddings. FastMap also produces admissible and consistent heuristics and therefore guarantees the generation of shortest paths.

The FastMap heuristic has several advantages: First, it is defined for general (undirected) graphs. Second, we observe empirically that, in gridworlds, A* using the FastMap heuristic runs faster than A* using the Manhattan or Octile distance heuristics. A* using the FastMap heuristic runs equally fast or faster than A* using the Differential heuristic, with the same memory resources. The (explicit) Euclidean embedding produced by FastMap also has representational benefits like recovering the underlying manifolds of the graph and/or visualizing them. Moreover, we observe that the FastMap and Differential heuristics have complementary strengths and can be easily combined to generate an even more informed heuristic.

2 The Origin of FastMap

The FastMap algorithm [Faloutsos and Lin, 1995] was introduced in the data-mining community for automatically generating geometric embeddings of abstract objects. For example, if we are given objects in the form of long DNA strings, multimedia datasets such as voice excerpts or images, or medical datasets such as ECGs or MRIs, there is no geometric space in which these objects can be naturally visualized. However, there is often a well-defined distance function between every pair of objects. For example, the edit distance³ between two DNA strings is well defined although an individual DNA string cannot be conceptualized in geometric space.

Clustering techniques, such as the k-means algorithm, are well studied [Alpaydin, 2010] but cannot be applied directly to domains with abstract objects because they assume that objects are described as points in geometric space. FastMap revives their applicability by first creating a Euclidean embedding for the abstract objects that approximately preserves the pairwise distances between them. Such an embedding also helps to visualize the abstract objects, for example, to aid physicians in identifying correlations between symptoms from medical records.

The data-mining FastMap gets as input a complete undirected edge-weighted graph G = (V, E). Each node vi ∈ V represents an abstract object Oi. Between any two nodes vi and vj there is an edge (vi, vj) ∈ E with weight D(Oi, Oj) that corresponds to the given distance between objects Oi and Oj. A Euclidean embedding assigns to each object Oi a K-dimensional point pi ∈ R^K. A good Euclidean embedding is one in which the Euclidean distance between any two points pi and pj closely approximates D(Oi, Oj).

One of the early approaches for generating such an embedding is based on the idea of multi-dimensional scaling (MDS) [Torgerson, 1952]. Here, the overall distortion of the pairwise distances is measured in terms of the "energy" stored in "springs" that connect each pair of objects. MDS, however, requires O(|V|²) time and hence does not scale well in practice. On the other hand, FastMap [Faloutsos and Lin, 1995] requires only linear time. Both methods embed the objects in a K-dimensional space for a user-specified K.

¹In a 4-neighbor 2D gridworld, for example, the Manhattan Distance between two cells (x1, y1) and (x2, y2) is |x1 − x2| + |y1 − y2|. Generalizations exist for 8-neighbor 2D and 3D gridworlds.

²Henceforth, whenever we refer to a graph, we mean an edge-weighted undirected graph unless stated otherwise.

³The edit distance between two strings is the minimum number of insertions, deletions or substitutions that are needed to transform one to the other.

FastMap works as follows: In the very first iteration, it heuristically identifies the farthest pair of objects Oa and Ob in linear time. It does this by initially choosing a random object Ob and then choosing Oa to be the farthest object


[Figure 1 omitted: two diagrams showing (a) the triangle formed by Oa, Ob and Oi, and (b) the projection onto the hyperplane perpendicular to the line OaOb.]

Figure 1: (a) The three sides of a triangle define its entire geometry. In particular, xi = (dai² + dab² − dib²)/(2dab). (b) Shows a geometric conceptualization of the recursive step in FastMap. In particular, Dnew(Oi′, Oj′)² = D(Oi, Oj)² − (xi − xj)².

away from Ob. It then reassigns Ob to be the farthest object away from Oa. Once Oa and Ob are determined, every other object Oi defines a triangle with sides of lengths dai = D(Oa, Oi), dab = D(Oa, Ob) and dib = D(Oi, Ob). Figure 1(a) shows this triangle. The sides of the triangle define its entire geometry, and the projection of Oi onto the line OaOb is given by xi = (dai² + dab² − dib²)/(2dab). FastMap sets the first coordinate of pi, the embedding of object Oi, to xi. In particular, the first coordinate of pa is xa = 0 and of pb is xb = dab. Computing the first coordinates of all objects takes only linear time since the distance between any two objects Oi and Oj for i, j ∉ {a, b} is never computed.

In the subsequent K − 1 iterations, the same procedure is followed for computing the remaining K − 1 coordinates of each object. However, the distance function is adapted for different iterations. For example, for the first iteration, the coordinates of Oa and Ob are 0 and dab, respectively. Because these coordinates fully explain the true distance between them, from the second iteration onwards, the rest of pa's and pb's coordinates should be identical. Intuitively, this means that the second iteration should mimic the first one on a hyperplane that is perpendicular to the line OaOb. Figure 1(b) explains this intuition. Although the hyperplane is never constructed explicitly, its conceptualization implies that the distance function for the second iteration should be changed in the following way: Dnew(Oi′, Oj′)² = D(Oi, Oj)² − (xi − xj)². Here, Oi′ and Oj′ are the projections of Oi and Oj, respectively, onto this hyperplane, and Dnew is the new distance function.

3 FastMap for Shortest Path Computations

In this section, we provide the high-level ideas for how to adapt the data-mining FastMap algorithm to shortest path computations. In the shortest path computation problem, we are given a non-negative edge-weighted undirected graph G = (V, E, w) along with a start node vs and a goal node vg. As a preprocessing technique, we can embed the nodes of G in a Euclidean space. As A* searches for a shortest path from vs to vg, it can use the Euclidean distance from v ∈ V to vg as a heuristic for v. The number of node expansions of A* search depends on the informedness of the heuristic which, in turn, depends on the ability of the embedding to preserve the pairwise distances.

The idea is to view the nodes of G as the objects to be embedded in Euclidean space. As such, the data-mining FastMap algorithm cannot directly be used for generating an embedding in linear time. The data-mining FastMap algorithm assumes that the distance dij between two objects Oi and Oj can be computed in constant time, independent of the number of objects. Computing the distance between two nodes, however, depends on the size of the graph. Another problem is that the Euclidean distances may not satisfy important properties such as admissibility or consistency. Admissibility guarantees that A* finds shortest paths while consistency allows A* to avoid re-expansions of nodes as well.

The first issue of having to retain (near-)linear time complexity can be addressed as follows: In each iteration, after we identify the farthest pair of nodes Oa and Ob, the distances dai and dib need to be computed for all other nodes Oi. Computing dai and dib for any single node Oi can no longer be done in constant time but requires O(|E| + |V| log |V|) time instead [Fredman and Tarjan, 1984]. However, since we need to compute these distances for all nodes, computing two shortest path trees rooted at nodes Oa and Ob yields all necessary distances. The complexity of doing so is also O(|E| + |V| log |V|), which is only linear in the size of the graph.⁴ The amortized complexity for computing dai and dib for any single node Oi is therefore near-constant time.

The second issue of having to generate a consistent (and thus admissible) heuristic is formally addressed in Theorem 1. The idea is to use L1 distances instead of L2 distances in each iteration of FastMap. The mathematical properties of the L1 distance can be used to prove that admissibility and consistency hold irrespective of the dimensionality of the embedding.

Algorithm 1 presents the data-mining FastMap algorithm adapted to the shortest path computation problem. The input is an edge-weighted undirected graph G = (V, E, w) along with two user-specified parameters Kmax and ε. Kmax is the maximum number of dimensions allowed in the Euclidean embedding. It bounds the amount of memory needed to store the Euclidean embedding of any node. ε is the threshold that marks a point of diminishing returns when the distance between the farthest pair of nodes becomes negligible. The output is an embedding pi ∈ R^K (with K ≤ Kmax) for each node vi ∈ V.

⁴Unless |E| = O(|V|), in which case the complexity is near-linear in the size of the input because of the log |V| factor.

The algorithm maintains a working graph G′ = (V, E, w′) initialized to G. The nodes and edges of G′ are always identical to those of G but the weights on the edges of G′ change with every iteration. In each iteration, the farthest pair (na, nb) of nodes in G′ is heuristically identified in near-linear time (line 4). The Kth coordinate [pi]_K of each node vi is computed using a formula similar to that for xi in Figure 1(a). However, that formula is modified to (dai + dab − dib)/2 to ensure admissibility and consistency of the heuristic. In each iteration, the weight of each edge


ALGORITHM 1: The FastMap algorithm. G = (V, E, w) is a non-negative edge-weighted undirected graph; Kmax is the user-specified upper bound on the dimensionality; ε is a user-specified threshold; K ≤ Kmax is the dimensionality of the computed embedding; pi is the Euclidean embedding of node vi ∈ V. Line 11 is equivalent to w′(u, v) = w(u, v) − ‖pu − pv‖1.

Input: G = (V, E, w), Kmax and ε.
Output: K and pi ∈ R^K for all vi ∈ V.

1:  w′ = w; K = 1;
2:  while Kmax > 0 do
3:      Let G′ = (V, E, w′);
4:      (na, nb) ← GetFarthestPair(G′);
5:      Compute shortest path trees rooted at na and nb on G′ to obtain dab, dai and dib for all vi ∈ V;
6:      if dab < ε then
7:          Break;
8:      for each v ∈ V do
9:          [pv]_K = (dav + dab − dvb)/2;
10:     for each edge (u, v) ∈ E do
11:         w′(u, v) = w′(u, v) − |[pu]_K − [pv]_K|;
12:     K = K + 1; Kmax = Kmax − 1;

is decremented to resemble the update rule for Dnew in Figure 1(b) (line 11). However, that update rule is modified to w′(u, v) = w′(u, v) − |[pu]_K − [pv]_K| to use the L1 distances instead of the L2 distances.

The method GetFarthestPair(G′) (line 4) computes shortest path trees in G′ a small constant number of times, denoted by τ.⁵ It therefore runs in near-linear time. In the first iteration, we assign na to be a random node. A shortest path tree rooted at na is computed to identify the farthest node from it. nb is assigned to be this farthest node. In the next iteration, a shortest path tree rooted at nb is computed to identify the farthest node from it. na is reassigned to be this farthest node. Subsequent iterations follow the same switching rule for na and nb. The final assignments of nodes to na and nb are returned after τ iterations. This entire process of starting from a randomly chosen node can be repeated a small constant number of times.⁶

Figure 2 shows the working of our algorithm on a small gridworld example.

3.1 Proof of Consistency

In this section, we prove the consistency of the FastMap heuristic. Since consistency implies admissibility, this also proves that A* with the FastMap heuristic returns shortest paths. We use the following notation in the proofs: w^i_xy is the weight on the edge between nodes x and y in the ith iteration; d^i_xy is the distance between nodes x and y in the ith iteration (using the weights w^i); px is the vector of coordinates produced for node x, and [px]_j is its jth coordinate;⁷ h^i_xy is the FastMap heuristic between nodes x and y after i iterations. Note that h^i_xy is the L1 distance between px and py at iteration i, that is, h^i_xy := Σ_{j=1..i} |[px]_j − [py]_j|. We also define Δ^{i+1}_xy := d^i_xy − d^{i+1}_xy. In the following proofs, we use the facts that |A| + |B| ≥ |A + B| and |A| − |B| ≤ |A − B|.

Lemma 1. For all x, y and i, d^i_xy ≥ 0.

Proof. We prove by induction that, in any iteration i, w^i_uv ≥ 0 for all (u, v) ∈ E. Thus, the weight of each edge in the ith iteration is non-negative and therefore d^i_uv ≥ 0 for all u, v. For the base case, w^1_uv = w(u, v) ≥ 0. We assume that w^i_uv ≥ 0 and show that w^{i+1}_uv ≥ 0. Let na and nb be the farthest pair of nodes identified in the ith iteration. From lines 9 and 11, w^{i+1}_uv = w^i_uv − |(d^i_au − d^i_av)/2 + (d^i_vb − d^i_ub)/2|. To show that w^{i+1}_uv ≥ 0, we show that w^i_uv ≥ |(d^i_au − d^i_av)/2 + (d^i_vb − d^i_ub)/2|. From the triangle inequality, for any node l, d^i_uv + min(d^i_ul, d^i_lv) ≥ max(d^i_ul, d^i_lv). Therefore, d^i_uv ≥ |d^i_ul − d^i_lv| and thus also 2 d^i_uv ≥ |d^i_ua − d^i_av| + |d^i_ub − d^i_bv|. This means that d^i_uv ≥ |d^i_au − d^i_av|/2 + |d^i_vb − d^i_ub|/2. Therefore, d^i_uv ≥ |(d^i_au − d^i_av)/2 + (d^i_vb − d^i_ub)/2|. This concludes the proof since w^i_uv ≥ d^i_uv.

Lemma 2. For all x, y and i, Δ^{i+1}_xy ≥ |[px]_i − [py]_i|.

Proof. Let ⟨u1 = x, ..., um = y⟩ be the shortest path from x to y in iteration i. By definition, d^i_xy = Σ_{j=1..m−1} w^i_{uj uj+1} and d^{i+1}_xy ≤ Σ_{j=1..m−1} w^{i+1}_{uj uj+1}. From line 11, w^{i+1}_{uj uj+1} = w^i_{uj uj+1} − |[p_uj]_i − [p_uj+1]_i|. Therefore, Δ^{i+1}_xy = d^i_xy − d^{i+1}_xy ≥ Σ_{j=1..m−1} |[p_uj]_i − [p_uj+1]_i|. This concludes the proof since Σ_{j=1..m−1} |[p_uj]_i − [p_uj+1]_i| ≥ |Σ_{j=1..m−1} ([p_uj]_i − [p_uj+1]_i)| = |[px]_i − [py]_i|.

Lemma 3. For all x, y, g and i, d^1_xy + h^i_yg − h^i_xg ≥ d^{i+1}_xy.

Proof. We prove the lemma by induction on i. The base case for i = 1 is implied by Lemma 2. We assume that d^1_xy + h^i_yg − h^i_xg ≥ d^{i+1}_xy and show that d^1_xy + h^{i+1}_yg − h^{i+1}_xg ≥ d^{i+2}_xy. We know that h^{i+1}_yg − h^{i+1}_xg = h^i_yg − h^i_xg − (|[px]_{i+1} − [pg]_{i+1}| − |[py]_{i+1} − [pg]_{i+1}|). Since |[px]_{i+1} − [pg]_{i+1}| − |[py]_{i+1} − [pg]_{i+1}| ≤ |[px]_{i+1} − [py]_{i+1}|, we have h^{i+1}_yg − h^{i+1}_xg ≥ h^i_yg − h^i_xg − |[px]_{i+1} − [py]_{i+1}|. Hence, d^1_xy + h^{i+1}_yg − h^{i+1}_xg ≥ (d^1_xy + h^i_yg − h^i_xg) − |[px]_{i+1} − [py]_{i+1}|. Using the inductive assumption, we get d^1_xy + h^{i+1}_yg − h^{i+1}_xg ≥ d^{i+1}_xy − |[px]_{i+1} − [py]_{i+1}|. By definition, d^{i+1}_xy = Δ^{i+2}_xy + d^{i+2}_xy. Substituting for d^{i+1}_xy, we get d^1_xy + h^{i+1}_yg − h^{i+1}_xg ≥ d^{i+2}_xy + (Δ^{i+2}_xy − |[px]_{i+1} − [py]_{i+1}|). Lemma 2 shows that Δ^{i+2}_xy ≥ |[px]_{i+1} − [py]_{i+1}|, which concludes the proof.

Theorem 1. The FastMap heuristic is consistent.

Proof. For all x, y, g and i: From Lemma 3, we know that d^1_xy + h^i_yg − h^i_xg ≥ d^{i+1}_xy. From Lemma 1, we know that d^{i+1}_xy ≥ 0. Put together, we have d^1_xy + h^i_yg ≥ h^i_xg. Finally, h^i_gg = Σ_{j=1..i} |[pg]_j − [pg]_j| = 0.

⁵τ = 10 in our experiments.
⁶This constant is also 10 in our experiments.
⁷The ith iteration sets the value of [px]_i.
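Algorithm 1 can be sketched in a few dozen lines of Python. This is our illustrative reimplementation, not the authors' code: the names `fastmap_embed`, `get_farthest_pair`, and `dijkstra` are our own, and the farthest-pair routine is simplified to a fixed number of alternating Dijkstra sweeps (the paper uses τ = 10).

```python
import heapq
import random

def dijkstra(adj, source):
    """Shortest-path distances from source; adj maps node -> {neighbor: weight}."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0.0
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def get_farthest_pair(adj, sweeps=10):
    """Heuristic farthest pair: alternate Dijkstra sweeps, each time jumping
    to the node farthest from the previous root (tau = 10 in the paper)."""
    nb = random.choice(sorted(adj))
    na = None
    for _ in range(sweeps):
        dist = dijkstra(adj, nb)
        na, nb = nb, max(dist, key=dist.get)
    return na, nb

def fastmap_embed(adj, k_max, eps=1e-9):
    """Algorithm 1 sketch: one coordinate per iteration; edge weights of the
    working graph G' are decremented by the L1 difference of the new coordinates."""
    work = {u: dict(nbrs) for u, nbrs in adj.items()}   # working graph G'
    p = {v: [] for v in work}
    for _ in range(k_max):
        na, nb = get_farthest_pair(work)
        da, db = dijkstra(work, na), dijkstra(work, nb)
        dab = da[nb]
        if dab < eps:                                   # point of diminishing returns
            break
        for v in work:                                  # line 9
            p[v].append((da[v] + dab - db[v]) / 2.0)
        for u in work:                                  # lines 10-11
            for v in work[u]:
                # Lemma 1 guarantees non-negativity; max() only guards round-off.
                work[u][v] = max(0.0, work[u][v] - abs(p[u][-1] - p[v][-1]))
    return p

def fastmap_heuristic(p, x, y):
    """The FastMap heuristic: L1 distance between embeddings (consistent, Thm 1)."""
    return sum(abs(a - b) for a, b in zip(p[x], p[y]))
```

On a path graph a–b–c with unit weights, the first iteration assigns coordinates 0, 1, 2 (or the reverse), after which all residual edge weights are zero and the loop terminates early.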

1430 Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

[Figure 2 omitted: ten gridworld panels (a)–(j) illustrating two FastMap iterations; see the caption below.]

Figure 2: Illustrates the working of FastMap. (a) shows a 4-neighbor gridworld with obstacles in black. (b) shows the graphical representation of (a) with the original unit weights on the edges. (c) shows the identified farthest pair of nodes. (d) shows two numbers in each cell representing the distances from na and nb, respectively. (e) shows the first coordinate produced for each cell. (f) shows the new edge weights for the next iteration. (g), (h) and (i) correspond to (c), (d) and (e), respectively, in the second iteration. (j) shows the produced 2D embedding.
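At runtime, the precomputed coordinates feed directly into A*. The sketch below is our illustration (the function name `astar` and the tiny graph and hand-made 1-D embedding are ours, standing in for Algorithm 1 output); it also sanity-checks the consistency condition h(u, g) ≤ w(u, v) + h(v, g) on every edge.

```python
import heapq

def astar(adj, h, start, goal):
    """A* search returning the shortest-path cost; with a consistent h,
    every node is expanded at most once."""
    g = {start: 0.0}
    pq = [(h(start, goal), start)]
    closed = set()
    while pq:
        _, u = heapq.heappop(pq)
        if u == goal:
            return g[u]
        if u in closed:
            continue
        closed.add(u)
        for v, w in adj[u].items():
            if g[u] + w < g.get(v, float("inf")):
                g[v] = g[u] + w
                heapq.heappush(pq, (g[v] + h(v, goal), v))
    return float("inf")

# Toy graph and a hand-made 1-D embedding standing in for FastMap output.
adj = {"s": {"m": 1.0}, "m": {"s": 1.0, "g": 1.0}, "g": {"m": 1.0}}
emb = {"s": [0.0], "m": [1.0], "g": [2.0]}
h = lambda x, y: sum(abs(a - b) for a, b in zip(emb[x], emb[y]))

# Sanity check of consistency on every edge: h(u, g) <= w(u, v) + h(v, g).
for u in adj:
    for v, w in adj[u].items():
        assert h(u, "g") <= w + h(v, "g") + 1e-9
```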

Theorem 2. The informedness of the FastMap heuristic increases monotonically with the number of dimensions.

Proof. This theorem follows from the fact that, for any two nodes x and g, h^{i+1}_xg = h^i_xg + |[px]_{i+1} − [pg]_{i+1}| ≥ h^i_xg.

4 Experimental Results

We performed experiments on many benchmark maps from [Sturtevant, 2012]. Figure 3 presents representative results. The FastMap heuristic (FM) and the Differential heuristic (DH) with equal memory resources⁸ are compared against each other. In addition, we include the Octile heuristic (OCT) as a baseline, which also uses a closed-form formula for the computation of its heuristic.

We observe that, as the number of dimensions increases: (a) FM and DH perform better than OCT; (b) the median number of expanded nodes when using the FM heuristic decreases (which is consistent with Theorem 2); and (c) the median absolute deviation (MAD) of the number of expanded nodes when using the FM heuristic decreases. When FM's MADs are high, the variabilities can possibly be exploited in future work using Rapid Randomized Restart strategies.

FastMap also gives us a framework for identifying a point of diminishing returns with increasing dimensionality. This happens when the distance between the farthest pair of nodes stops being "significant". For example, such a point is observed in Figure 3(f) around dimensionality 5.⁹

In mazes, such as in Figure 3(g), A* using the DH heuristic outperforms A* using the FM heuristic. This leads us to believe that FM provides good heuristic guidance in domains that can be approximated with a low-dimensional manifold. This observation also motivates us to create a hybrid FM+DH heuristic by taking the maximum of the two heuristics. Some relevant results are shown in Table 1. We use FM(K) to denote the FM heuristic with K dimensions and DH(K) to denote the DH heuristic with K pivots. For the results in Table 1, all heuristics have equal memory resources. We observe that the number of node expansions of A* using the FM(5)+DH(5) heuristic is always second best compared to A* using the FM(10) heuristic and A* using the DH(10) heuristic. On one hand, this decreases the percentage of instances on which it expands the least number of nodes (as seen in the second row of Table 1). But, on the other hand, its median number of node expansions is not far from that of the best technique in each breakdown.

5 Conclusions

We presented a near-linear time preprocessing algorithm, called FastMap, for producing a Euclidean embedding of a general edge-weighted undirected graph. At runtime, the Euclidean distances are used as a heuristic by A* for shortest path computations. We proved that the FastMap heuristic is admissible and consistent, thereby generating shortest paths. FastMap produces the Euclidean embedding in near-linear time, which is significantly faster than competing approaches for producing Euclidean embeddings with optimality guarantees that run in cubic time. We also showed that it is competitive with other state-of-the-art heuristics derived in near-linear preprocessing time. However, FastMap has the combined benefits of requiring only near-linear preprocessing time and producing explicit Euclidean embeddings that try to recover the underlying manifolds of the given graphs.

⁸The dimensionality of the Euclidean embedding for FM matches the number of pivots in DH.
⁹The distances between the farthest pair of nodes, computed on line 4 of Algorithm 1, for the first 10 dimensions are ⟨581, 36, 22, 15, 14, 10, 6, 6, 5, 4⟩.

Acknowledgments

This work was supported by NSF under grant numbers 1724392, 1409987, and 1319966.
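The hybrid FM+DH heuristic described above is simply the pointwise maximum of two consistent heuristics, which is itself consistent. A minimal sketch (our own helper names; `pivot_dists` would be a list of precomputed distance tables, one per pivot, produced by Dijkstra sweeps):

```python
def differential_heuristic(pivot_dists, a, b):
    """DH: max over pivots p of |d(a, p) - d(p, b)|, with d(., .) precomputed."""
    return max(abs(d[a] - d[b]) for d in pivot_dists)

def fastmap_heuristic(emb, a, b):
    """FM: L1 distance between FastMap coordinate vectors."""
    return sum(abs(x - y) for x, y in zip(emb[a], emb[b]))

def combined_heuristic(emb, pivot_dists, a, b):
    """FM+DH: the pointwise maximum of two consistent heuristics is consistent."""
    return max(fastmap_heuristic(emb, a, b),
               differential_heuristic(pivot_dists, a, b))
```

Splitting the memory budget between the two (as in FM(5)+DH(5)) trades a little peak informedness for robustness across map types.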


Map ‘lak503d’ (FM-WINS 570 / DH-WINS 329 / FM+DH-WINS 101):
                FM-WINS bin      DH-WINS bin      FM+DH-WINS bin
                Med      MAD     Med      MAD     Med      MAD
FM(10)          261      112     465      319     2,222    1,111
DH(10)          358      215     278      156       885      370
FM(5)+DH(5)     303      160     323      170       610      264

Map ‘brc300d’ (FM-WINS 846 / DH-WINS 147 / FM+DH-WINS 7):
                FM-WINS bin      DH-WINS bin      FM+DH-WINS bin
                Med      MAD     Med      MAD     Med      MAD
FM(10)          205      105     285      149       894      472
DH(10)          217      119     200      129       277       75
FM(5)+DH(5)     206      105     267      135       249       73

Map ‘maze512-32-0’ (FM-WINS 382 / DH-WINS 507 / FM+DH-WINS 111):
                FM-WINS bin      DH-WINS bin      FM+DH-WINS bin
                Med      MAD     Med      MAD     Med      MAD
FM(10)          1,649    747     11,440   9,861   33,734   13,748
DH(10)          3,107    2,569    2,859   2,194    8,156    4,431
FM(5)+DH(5)     2,685    2,091    3,896   2,992    7,439    4,247

Table 1: Shows the median and MAD numbers of A* node expansions for different maps using three different heuristics with equal memory resources on 1000 random instances. FM(10) denotes the FastMap heuristic with 10 dimensions, DH(10) denotes the Differential heuristic with 10 pivots, and FM(5)+DH(5) is a combined heuristic which takes the maximum of a 5-dimensional FastMap heuristic and a 5-pivot Differential heuristic. The results are split into bins according to winners (along with their number of wins).
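Table 1 (and the error bars in Figure 3) report medians and MADs rather than means. For reference, the median absolute deviation is the median of the absolute deviations from the median; a generic snippet (our own helper, not tied to the paper's data):

```python
from statistics import median

def mad(xs):
    """Median absolute deviation: the median of |x - median(xs)|."""
    m = median(xs)
    return median(abs(x - m) for x in xs)

# Illustrative use on a handful of node-expansion counts.
expansions = [261, 278, 303, 323, 358, 465]
center, spread = median(expansions), mad(expansions)
```

Unlike the mean and standard deviation, both statistics are robust to the heavy-tailed expansion counts that arise on hard instances.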

[Figure 3 omitted: nine plot panels (a)–(i), three per map; see the caption below.]

Figure 3: Shows empirical results on 3 maps from Bioware’s Dragon Age: Origins. (a) is map ‘lak503d’ containing 17,953 nodes and 33,781 edges; (d) is map ‘brc300d’ containing 5,214 nodes and 9,687 edges; and (g) is map ‘maze512-32-0’ containing 253,840 nodes and 499,377 edges. In (b), the x-axis shows the number of dimensions for the FastMap heuristic (or the number of pivots for the Differential heuristic). The y-axis shows the number of instances (out of 1,000) on which each technique expanded the least number of nodes. Each instance has randomly chosen start and goal nodes. (c) shows the median number of expanded nodes across all instances. Vertical error bars indicate the MADs. The figures in the second and third rows follow the same order. In the legends, “FM” denotes the FastMap heuristic, “DH” denotes the Differential heuristic, and “OCT” denotes the Octile heuristic.


References

[Alpaydin, 2010] Ethem Alpaydin. Introduction to Machine Learning. The MIT Press, 2nd edition, 2010.

[Björnsson and Halldórsson, 2006] Yngvi Björnsson and Kári Halldórsson. Improved heuristics for optimal path-finding on game maps. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2006.

[Botea and Harabor, 2013] Adi Botea and Daniel Harabor. Path planning with compressed all-pairs shortest paths data. In Proceedings of the 23rd International Conference on Automated Planning and Scheduling, 2013.

[Botea et al., 2004] Adi Botea, Martin Müller, and Jonathan Schaeffer. Near optimal hierarchical path-finding. Journal of Game Development, 1:7–28, 2004.

[Cazenave, 2006] Tristan Cazenave. Optimizations of data structures, heuristics and algorithms for path-finding on maps. In Proceedings of the IEEE Symposium on Computational Intelligence and Games, 2006.

[Dijkstra, 1959] Edsger W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1(1):269–271, 1959.

[Faloutsos and Lin, 1995] Christos Faloutsos and King-Ip Lin. FastMap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 1995.

[Fredman and Tarjan, 1984] Michael Fredman and Robert Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. In Proceedings of the 25th Annual Symposium on Foundations of Computer Science, 1984.

[Geisberger et al., 2008] Robert Geisberger, Peter Sanders, Dominik Schultes, and Daniel Delling. Contraction hierarchies: Faster and simpler hierarchical routing in road networks. In Proceedings of the 7th International Conference on Experimental Algorithms, 2008.

[Goldenberg et al., 2010] Meir Goldenberg, Ariel Felner, Nathan Sturtevant, and Jonathan Schaeffer. Portal-based true-distance heuristics for path finding. In Proceedings of the 3rd Annual Symposium on Combinatorial Search, 2010.

[Goldenberg et al., 2011] Meir Goldenberg, Nathan Sturtevant, Ariel Felner, and Jonathan Schaeffer. The compressed differential heuristic. In Proceedings of the 25th AAAI Conference on Artificial Intelligence, 2011.

[Hart et al., 1968] Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, SSC-4(2):100–107, 1968.

[Holte et al., 1994] Robert Holte, Chris Drummond, M. B. Perez, Robert Zimmer, and Alan Macdonald. Searching with abstractions: A unifying framework and new high-performance algorithm. In Proceedings of the 10th Canadian Conference on Artificial Intelligence, 1994.

[Leighton et al., 2008] Michael Leighton, Wheeler Ruml, and Robert Holte. Faster optimal and suboptimal hierarchical search. In Proceedings of the 1st International Symposium on Combinatorial Search, 2008.

[Pochter et al., 2010] Nir Pochter, Aviv Zohar, Jeffery Rosenschein, and Ariel Felner. Search space reduction using swamp hierarchies. In Proceedings of the 24th AAAI Conference on Artificial Intelligence, 2010.

[Rayner et al., 2011] Chris Rayner, Michael Bowling, and Nathan Sturtevant. Euclidean heuristic optimization. In Proceedings of the 25th AAAI Conference on Artificial Intelligence, 2011.

[Russell and Norvig, 2009] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2009.

[Storandt, 2013] Sabine Storandt. Contraction hierarchies on grid graphs. In Proceedings of the 36th Annual German Conference on Artificial Intelligence (KI), 2013.

[Strasser et al., 2015] Ben Strasser, Adi Botea, and Daniel Harabor. Compressing optimal paths with run length encoding. Journal of Artificial Intelligence Research, 54:593–629, 2015.

[Sturtevant and Buro, 2005] Nathan Sturtevant and Michael Buro. Partial pathfinding using map abstraction and refinement. In Proceedings of the 20th AAAI Conference on Artificial Intelligence, 2005.

[Sturtevant et al., 2009] Nathan Sturtevant, Ariel Felner, Max Barrer, Jonathan Schaeffer, and Neil Burch. Memory-based heuristics for explicit state spaces. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, 2009.

[Sturtevant, 2012] Nathan Sturtevant. Benchmarks for grid-based pathfinding. IEEE Transactions on Computational Intelligence and AI in Games, 4(2):144–148, 2012.

[Torgerson, 1952] Warren S. Torgerson. Multidimensional scaling: I. Theory and method. Psychometrika, 17(4):401–419, 1952.

[Uras and Koenig, 2014] Tansel Uras and Sven Koenig. Identifying hierarchies for fast optimal search. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, 2014.

[Uras and Koenig, 2015] Tansel Uras and Sven Koenig. Subgoal graphs for fast optimal pathfinding. In Steve Rabin, editor, Game AI Pro 2: Collected Wisdom of Game AI Professionals, chapter 15, pages 145–159. A K Peters/CRC Press, 2015.

[Vandenberghe and Boyd, 1996] Lieven Vandenberghe and Stephen Boyd. Semidefinite programming. SIAM Review, 38:49–95, 1996.
