
Efficient Fastest-Path Computations in Road Maps

2021, ArXiv


Renjie Chen
Max-Planck Institute for Informatics
Saarbrucken, Germany

Craig Gotsman
New Jersey Institute of Technology
Newark, NJ, USA

Abstract

In the age of real-time online traffic information and GPS-enabled devices, fastest-path computations between two points in a road network modeled as a directed graph, where each directed edge is weighted by a "travel time" value, are becoming a standard feature of many navigation-related applications. To support this, very efficient computation of these paths in very large road networks is critical. Fastest paths may be computed as minimal-cost paths in a weighted directed graph, but traditional minimal-cost path algorithms based on variants of the classic Dijkstra algorithm do not scale well, as in the worst case they may traverse the entire graph. A common improvement, which can dramatically reduce the number of traversed graph vertices, is the A* algorithm, which requires a good heuristic lower bound on the minimal cost. We introduce a simple, but very effective, heuristic function based on a small number of values assigned to each graph vertex. The values are based on graph separators and computed efficiently in a preprocessing stage. We present experimental results demonstrating that our heuristic provides estimates of the minimal cost which are superior to those of other heuristics. Our experiments show that when used in the A* algorithm, this heuristic can reduce the number of vertices traversed by an order of magnitude compared to other heuristics.

1. Introduction

The Shortest, Minimal-Cost and Fastest Path Problems

The shortest-path problem on graphs is one of the most fundamental problems in computer science, the graph being one of the most basic and common discrete structures, modeling an abundance of real-world problems involving networks.
In the most basic scenario, graph vertices represent entities in a network and an edge between two vertices indicates the existence of a link between them (e.g. a communication or social network). The shortest path between two vertices $s$ and $t$ in the graph is then the path between $s$ and $t$ containing the minimal number of edges. In the case of a communication network, this could be the cheapest way to route a message originating at $s$ to $t$. In the more general case, edges are assigned weights which measure a cost associated with traversing that edge. The shortest path then becomes a minimal-cost path, where the cost of the path is the sum of the costs of its edges. In the case of a communication network, the associated cost of an edge may be its conductance.

A very important type of network is a road map, the graph vertices representing road junctions and the edges road segments between the junctions. In the simplest scenario, the graph is planar, the vertices are embedded in the plane, namely have $(x, y)$ coordinates, and each edge is assigned a positive weight measuring its Euclidean length in the plane. The minimal-cost path between two vertices $s$ and $t$ is then the edge path of minimal Euclidean length between $s$ and $t$, which could indicate the shortest drive (or walk) between these two points.

In practice, in vehicle navigation applications, edges of the road map having the same length are not all equivalent for a driver, since the possible driving speed may vary depending on the category of the road. Highways are usually preferred, as they allow for higher speeds, thus a faster drive. Consequently, the more relevant weight assigned to a road segment is the so-called "travel time", which is the segment length divided by the maximal speed possible on that segment. The resulting minimal-cost path in this weighted graph is sometimes called the "fastest path".
The more realistic variant of this problem is when the graph edges are directed, namely the travel time along an edge may depend on the direction of the edge. In the special case of a one-way road, the edge exists in just one direction (or, equivalently, the travel time in the opposite direction is infinite).

Precomputation and Dynamic Fastest Path Problems

Traditional minimal-cost path algorithms do not scale well to very large networks. More practical algorithms rely on a (typically heavy) preprocessing of the graph, resulting in extra information to store along with the basic graph data, which is exploited in answering online $(s, t)$-minimal-cost queries efficiently. While effective, this approach introduces a complication. Using travel times as the costs of road network edges is useful to correctly model a real-world navigation problem, but it also imposes a dynamic character on the problem, as the maximal speed on a road segment is rarely constant: it changes over time depending on traffic conditions, hence the travel times are dynamic. Consequently, an algorithm which relies on preprocessing of the graph data in order to speed up the online $(s, t)$-fastest-path queries must deal with the dynamic nature of the data by periodic repetition of the preprocessing. This rules out the use of unduly heavy preprocessing.

Objective

The computation of minimal-cost paths in dynamic weighted graphs has been the subject of intense study over the past decades, and many techniques have been proposed to solve different variants of the problem. A complete survey of the state of the art is beyond the scope of this paper and we refer the interested reader to the survey and comparison of Bast et al. [1]. Our contribution is the description of a very effective heuristic function which can be used in the well-known A* algorithm for minimal-cost path computation on weighted directed graphs.
Computation of the heuristic is fast and can easily be repeated periodically to accommodate dynamic traffic conditions in road networks. A* with this heuristic can be used in conjunction with many other techniques to provide a more complete solution to the general problem.

2. The Dijkstra and A* Algorithms

While in practice we typically would like to solve the "point-to-point" minimal-cost path problem between a source vertex $s$ and a target vertex $t$ in a directed graph, it turns out that this indirectly involves computing the minimal-cost path from $s$ to many other vertices in the graph. Let $G = (V, E, w)$ be a directed graph with vertex set $V$ and edge set $E \subset V \times V$, such that the positive real cost of traversing the directed edge $(u, v)$ is $w(u, v)$. The minimal cost of a path from a given vertex $s \in V$ to a given vertex $t \in V$ may be obtained by computing the entire function $c_s(v)$ from $s$ to any other vertex $v \in V$ by solving the following linear program:

$$\max\; c_s(t) \quad \text{s.t.} \quad c_s(s) = 0, \qquad \forall \text{ edge } (u, v) \in E(G):\; c_s(v) - c_s(u) \le w(u, v)$$

Thinking of $c_s(v)$ as an "embedding" of the graph vertices on the real line, this means we would like to "stretch" $s$ and $t$ as far apart as possible on the line, subject to the constraint that the endpoints of any edge $(u, v)$ are separated by a distance of at most $w(u, v)$, the weight of the edge. Denote by

$$\nabla(u, v) = \frac{c_s(v) - c_s(u)}{w(u, v)}$$

the "gradient" of $c_s$ along the edge $(u, v)$. In the optimal solution $\nabla(u, v) = 1$ for all edges along the minimal-cost path, and $\nabla(u, v) \le 1$ for all other edges. Thus a gradient-descent path of $c_s$ starting at $s$ traces out the minimal-cost path. In practice, this linear program can be transformed into a dynamic programming problem, which in turn can be solved by the celebrated Dijkstra algorithm [4], which traverses the graph vertices guided by a priority queue of vertices.
The procedure terminates when $t$ is reached and a minimal-cost path is then generated by tracing the path backwards from $t$. If the priority queue empties before $t$ is reached, the search fails and a minimal-cost path does not exist (e.g. if the graph is not connected). The complexity of the most efficient implementation of the Dijkstra algorithm [5] is $O(m + n \log n)$, where $m$ is the number of graph edges and $n$ the number of graph vertices. Unfortunately, this is prohibitive, for the number of edges $m$ is typically much larger than the number of edges along the minimal-cost path.

The Dijkstra algorithm may be accelerated into a "guided" A* search [7] if there is additional domain knowledge in the form of a heuristic function $h(v, t)$, one that estimates the minimal cost from $v$ to $t$. The simplest example of a heuristic function for a plane graph with edge-length weights is the planar Euclidean distance $h(v, t) = \|p(v) - p(t)\|_2$, where $p(v)$ are the 2D coordinates of the position of $v$ in the plane. A* is guaranteed to find the shortest path if $h$ is admissible, namely is a lower bound on the true minimal cost. It is easy to see that the Euclidean distance mentioned above has this property. Like the Dijkstra algorithm, A* maintains a priority queue of OPEN vertices. If $h$ is not admissible, a path will still be found, but not necessarily the minimal-cost path. If $h$ satisfies the additional consistency (or monotonicity) condition

$$h(v, t) - h(u, t) \le w(u, v)$$

for every edge $(u, v)$ of the graph and every vertex $t$, then A* can be implemented more efficiently (no node needs to be processed more than once) and A* is equivalent to running Dijkstra's algorithm with the modified (still non-negative) edge weights:

$$w'(u, v) = w(u, v) + h(u, t) - h(v, t).$$

In practice, in addition to the OPEN priority queue, a list CLOSED is maintained. Once popped from OPEN, a vertex goes into CLOSED and is never considered again.
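The OPEN/CLOSED bookkeeping described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the adjacency-list format `adj`, the function name `a_star` and the toy grid are assumptions made for this example, and the CLOSED-set shortcut is only valid when $h$ is consistent.

```python
import heapq
import math


def a_star(adj, s, t, h=lambda v, t: 0.0):
    """A* search with an OPEN priority queue and a CLOSED set.

    adj maps each vertex to a list of (neighbor, weight) pairs (hypothetical
    format). h(v, t) is assumed consistent, so a vertex is never reopened.
    Returns (cost, path, number of vertices moved to CLOSED).
    """
    g = {s: 0.0}                      # best known cost from s
    parent = {s: None}
    closed = set()
    open_heap = [(h(s, t), s)]        # entries are (g + h, vertex)
    while open_heap:
        _, u = heapq.heappop(open_heap)
        if u in closed:
            continue                  # stale queue entry
        closed.add(u)
        if u == t:                    # trace the path backwards from t
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return g[t], path[::-1], len(closed)
        for v, w in adj[u]:
            cand = g[u] + w
            if v not in closed and cand < g.get(v, math.inf):
                g[v] = cand
                parent[v] = u
                heapq.heappush(open_heap, (cand + h(v, t), v))
    return math.inf, [], len(closed)  # OPEN emptied: t is unreachable


# Toy example: 5x5 grid with unit-length edges; vertices are (x, y) tuples.
grid = {(x, y): [((x + dx, y + dy), 1.0)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= x + dx < 5 and 0 <= y + dy < 5]
        for x in range(5) for y in range(5)}

cost_d, path_d, closed_d = a_star(grid, (0, 0), (4, 4))             # h = 0
cost_a, path_a, closed_a = a_star(grid, (0, 0), (4, 4), math.dist)  # Euclidean h
```

With the default $h \equiv 0$ the routine behaves like Dijkstra's algorithm, so comparing `closed_d` with `closed_a` gives a rough feel for how much a heuristic prunes the search on this toy grid.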
Note that the original Dijkstra algorithm is equivalent to A* with the trivial admissible and consistent heuristic $h(v, t) \equiv 0$. The following two theorems are useful in characterizing heuristics.

Theorem 1: If $h$ is consistent and $h(t, t) = 0$, then $h$ is admissible.

Proof: By induction on minimal cost from $t$. ♦

Theorem 2: If $h$ is derived from a metric function $m$, namely $h(u, t) = m(u, t)$ and $m(u, v) \le w(u, v)$ for all edges $(u, v)$, then $h$ is consistent.

Proof: Apply the triangle inequality and symmetry of $m$:

$$h(v, t) - h(u, t) = m(v, t) - m(u, t) \le m(v, u) = m(u, v) \le w(u, v). \;♦$$

Theorems 1 and 2 imply that the planar Euclidean heuristic mentioned above is admissible and consistent. The admissible heuristic $h_1$ is called more informed than the admissible heuristic $h_2$ if $h_1(v, t) \ge h_2(v, t)$ for all $v, t \in V$.

2.1 A* Heuristics

Much effort has been invested in designing good heuristics for A*. A complete account would be lengthy, and much of it is domain-dependent, so we discuss here just the most generic methods.

The Optimal Heuristic

In general, it is possible to precompute the optimal heuristic $h(v, t)$ by solving a convex semi-definite program (SDP) for the $O(n^2)$ values of $h$, forcing the conditions necessary for the heuristic to be admissible and consistent [9]. It relies on the convenient fact that it is sufficient for the heuristic to be "locally" admissible on single edges, namely $h(u, v) \le w(u, v)$ for every edge $(u, v)$, in order that it be admissible over arbitrary paths. This significantly reduces the number of linear inequality conditions in the semi-definite program to $O(m)$, and thus the complexity of the entire algorithm to $O(m^3)$. However, this complexity is still prohibitive and the method is not applicable to graphs containing more than a few thousand vertices. A number of improvements to this are possible, but the method still remains quite complicated.
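The consistency condition of this section and the edge-wise ("local") admissibility condition used above are easy to verify numerically on a small graph. The sketch below is only an illustration under assumptions of this example: the helper names, the adjacency-list format and the toy grid are hypothetical, and the brute-force loops are intended for tiny graphs only.

```python
import math


def is_consistent(adj, h, tol=1e-9):
    """Check h(v, t) - h(u, t) <= w(u, v) for every listed edge (u, v) and every t."""
    return all(h(v, t) - h(u, t) <= w + tol
               for t in adj for u in adj for v, w in adj[u])


def is_locally_admissible(adj, h, tol=1e-9):
    """Check the edge-wise condition h(u, v) <= w(u, v)."""
    return all(h(u, v) <= w + tol for u in adj for v, w in adj[u])


# Toy undirected 3x3 grid with unit edges; both orientations of each edge are listed.
grid = {(x, y): [((x + dx, y + dy), 1.0)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= x + dx < 3 and 0 <= y + dy < 3]
        for x in range(3) for y in range(3)}


def euclid(v, t):
    """Planar Euclidean heuristic: a metric bounded by the edge lengths."""
    return math.dist(v, t)


def inflated(v, t):
    """Deliberately too large, so it violates both conditions."""
    return 2.0 * math.dist(v, t)
```

On this grid `euclid` passes both checks, as Theorem 2 predicts, while `inflated` fails them.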
The Differential Heuristic DH

A very simple, but surprisingly effective, differential heuristic was proposed by Goldberg et al. [6] (who called it ALT) and independently by Chow [2]. It requires some preprocessing of the graph $G = (V, E, w)$. A small number (usually $k \le 10$) of "landmark" vertices (also called anchors/pivots/centers) $l_1, \ldots, l_k$ are chosen from $V(G)$. In a preprocessing step, for each vertex $v \in V(G)$, the vector of minimal costs $c(v) = (c(l_1, v), \ldots, c(l_k, v))$ is computed and stored. Then, at the online computation of the minimal-cost path from $s$ to $t$, the heuristic

$$h(v, t) = \max_{1 \le i \le k} |c(l_i, v) - c(l_i, t)| \tag{1}$$

is used. This heuristic requires $O(k(m + n \log n))$ preprocessing time and $O(kn)$ space to store. Given $v$ and $t$, $h(v, t)$ can be computed online in $O(k)$ time. It is convenient to think of $c(v)$ as an embedding of $v$ in $R^k$ and of $h(v, t)$ as the embedding distance between $v$ and $t$ using the $l_\infty$ norm:

$$h(v, t) = \|c(v) - c(t)\|_\infty$$

It is easy to see that $h(v, t)$ is exact, namely $h(v, t) = c(v, t)$, if $v$ is on one of the minimal-cost paths between $l_i$ and $t$. It is also easy to apply Theorems 1 and 2 to show that the differential heuristic is admissible and consistent.

The degrees of freedom in this heuristic are the choice of the landmark vertices. Goldberg et al. [6] show how to optimize these, concluding that a good choice is a set of landmarks which cover the graph well. In the special case of a plane (or close to plane) graph, a good choice is a set of vertices covering the boundary. In the sequel we will call this heuristic DH.

The FastMap Heuristic FM

Inspired by the interpretation of the differential heuristic as an embedding distance and by the FastMap algorithm used in machine learning, Cohen et al. [3] devised another embedding based on pairs of landmarks and defined the heuristic using the $l_1$ norm distance between the embeddings.
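Before turning to the details of FM, the differential heuristic of Eq. (1) can be made concrete with a short sketch for the undirected case. The function names, the adjacency-list format and the toy cycle graph are assumptions of this example, and a plain binary-heap Dijkstra stands in for whichever single-source routine is actually used.

```python
import heapq
import math


def dijkstra(adj, source):
    """Single-source minimal costs; adj[u] is a list of (v, weight) pairs."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def build_embedding(adj, landmarks):
    """Preprocessing: one Dijkstra per landmark, O(k n) stored values."""
    tables = [dijkstra(adj, l) for l in landmarks]
    return {v: tuple(tab[v] for tab in tables) for v in adj}


def dh(emb, v, t):
    """Eq. (1): l-infinity distance between the embeddings of v and t."""
    return max(abs(a - b) for a, b in zip(emb[v], emb[t]))


# Toy undirected 6-cycle with unit weights and two landmarks.
cycle = {i: [((i + 1) % 6, 1.0), ((i - 1) % 6, 1.0)] for i in range(6)}
emb = build_embedding(cycle, landmarks=[0, 3])
```

On this toy cycle, `dh(emb, 1, 2)` equals the true cost because vertex 1 lies on a minimal-cost path from landmark 0 to vertex 2, whereas `dh(emb, 1, 5)` is an uninformative zero: both vertices are equidistant from both landmarks.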
The FM algorithm proceeds by finding a pair of farthest vertices $(a_1, b_1)$, those having a large minimal-cost path between them, and computing for every vertex $v$:

$$f_1(v) = \frac{1}{2}\left(c(a_1, v) - c(b_1, v)\right).$$

Defining the following function on pairs of vertices:

$$h_1(u, v) = |f_1(u) - f_1(v)|,$$

the weight $w(u, v)$ of the graph edge $(u, v)$ is then modified by subtracting $h_1(u, v)$ from it, and the process is repeated $k - 1$ times on the modified graph to obtain the embedding vector $r(v) = (f_1(v), \ldots, f_k(v))$. The final heuristic is the $l_1$ embedding distance:

$$h(v, t) = \|r(v) - r(t)\|_1$$

The authors show that this heuristic is also admissible and consistent. In the sequel we will call this heuristic FM.

3. The Separator Heuristic SH

Since each landmark employed by the differential heuristic defines a cost field on the graph vertices, where every vertex is assigned the value of the minimal cost of a path between the vertex and the landmark, we first observe that this concept may be easily generalized. Instead of a landmark being a mere single vertex, it may be a set of vertices $S \subset V(G)$, and the cost of a vertex $v$ (relative to $S$) is defined as:

$$c(v, S) = \min_{u \in S} c(v, u)$$

This defines a more complicated distance field per landmark, to which the triangle inequality may be applied to obtain an analogous differential heuristic. Unfortunately, in practice this generalization does not add much power to that heuristic. Significantly more power can be obtained if the set $S$ is a separator of the graph, namely its removal (along with the edges incident on the removed vertices) results in $V$ being partitioned into three sets $U_1$, $S$ and $U_2 = V - U_1 - S$, such that there exist no edges between $U_1$ and $U_2$. This means that $S$ separates $U_1$ from $U_2$ and the separated graph contains at least two connected components, none of them mixing $U_1$ and $U_2$.
We may take advantage of the dichotomy on $V$ induced by $S$ by defining a signed cost field on $V$: positive in $U_1$ and negative in $U_2$. Denote this signed cost field by $C$. Fig. 1 shows the unsigned cost fields induced on a road network by a single landmark vertex or a set of 5 landmark vertices, compared to the signed cost field induced by a separator. As with the differential heuristic, we choose $k$ separators $S_1, \ldots, S_k$, and define the embedding $r(v) = (C(v, S_1), \ldots, C(v, S_k))$. The resulting heuristic is the $l_\infty$ embedding distance:

$$h(v, t) = \|r(v) - r(t)\|_\infty$$

Using a signed cost boosts the values of this heuristic significantly. It remains to show that it is still admissible and consistent. Since the use of the signed cost changes the rules of the game relative to the differential heuristic, we provide next a separate proof of admissibility and consistency. In the sequel we will call this heuristic SH.

Figure 1: Cost fields on undirected graph with edges weighted by Euclidean edge lengths: (left) Unsigned cost field induced by a single (magenta) landmark. (middle) Unsigned cost field induced by a set of 5 landmarks. (right) Signed cost field induced by a separator.

Figure 2: (left) Illustration of Case 2 of proof of Theorem 3. The blue path is the minimal-cost path between vertices $v$ and $t$, which must cross the separator $S_i$ at some vertex $s_i$. Vertices $a_i$ and $b_i$ are those on the separator having minimal cost to $v$ and $t$, respectively. (right) Analogous scenario for the planar Euclidean distance function. If $S_i$ is approximately parallel to the (dotted) bisector between $v$ and $t$, $c(a_i, b_i)$ will be small and $h(v, t)$ more informed.

Theorem 3: The separator heuristic is admissible and consistent.

Proof: Assume $S_i$ separates $G$ into $U_1$ and $U_2$. Denote by $a_i$ and $b_i$ the vertices of $S_i$ with minimal cost to $v$ and $t$, respectively.

Case 1: $v, t \in U_1$ or $v, t \in U_2$.
In this case $C(v, a_i)$ and $C(t, b_i)$ have the same sign and this case is similar to that of the differential heuristic. By definition:

$$c(v, a_i) \le c(v, b_i) \quad \text{and} \quad c(t, b_i) \le c(t, a_i)$$

By the triangle inequality:

$$c(v, b_i) \le c(t, b_i) + c(v, t)$$

so

$$c(v, t) \ge c(v, b_i) - c(t, b_i) \ge c(v, a_i) - c(t, b_i)$$

Again by the triangle inequality:

$$c(t, a_i) \le c(v, a_i) + c(v, t)$$

so

$$c(v, t) \ge c(t, a_i) - c(v, a_i) \ge c(t, b_i) - c(v, a_i)$$

Putting these two together:

$$c(v, t) \ge |c(v, a_i) - c(t, b_i)| = |C(v, a_i) - C(t, b_i)| = |C(v, S_i) - C(t, S_i)|$$

Case 2: $v \in U_1$ and $t \in U_2$ or vice versa. See Fig. 2 (left). Since $S_i$ separates $U_1$ and $U_2$, the minimal-cost path between $v$ and $t$ must contain at least one vertex $s_i \in S_i$. By definition:

$$c(v, a_i) \le c(v, s_i) \quad \text{and} \quad c(t, b_i) \le c(t, s_i)$$

By the subpath property of the minimal-cost path:

$$c(v, t) = c(v, s_i) + c(t, s_i) \ge c(v, a_i) + c(t, b_i) = C(v, a_i) - C(t, b_i) = C(v, S_i) - C(t, S_i)$$

Since $c(v, t)$ is always positive, we can, without loss of generality, flip the signs of $C$ so that:

$$c(v, t) \ge |C(v, S_i) - C(t, S_i)|$$

Since this is true for all $S_i$, we have

$$c(v, t) \ge \max_i |C(v, S_i) - C(t, S_i)| = h(v, t)$$

QED

Note that the separator itself may not be connected, and even if it is, it may separate the graph into more than two connected components, as in Fig. 3. This does not change any of the arguments above.

4.1 Computing the SH Heuristic

Although computing the heuristic is done in a preprocessing stage, it is still important that it be computable somewhat efficiently. In many applications (e.g. traffic-sensitive navigation) the edge weights are dynamic, namely change over time, so the heuristic must be updated periodically to reflect the new weights. Hence efficiency is important.
At first glance, it seems that computing the SH heuristic is much more complex than computing DH. DH requires a single-source minimal-cost computation over the entire graph for each of the $k$ landmarks, costing $O(k(m + n \log n))$ time. Using the same logic, it would seem that computing SH requires a similar computation for each vertex in the separators, whose size in a planar graph is $O(\sqrt{n})$ [8], thus costing $O(k\sqrt{n}(m + n \log n))$ time, which is significantly more than the complexity of computing DH. Fortunately, a straightforward "trick" reduces this complexity back down to the same proportions as DH. For each separator $S$, introduce a new "virtual" vertex $w_S$ to the graph connected to all vertices of $S$, and assign a zero weight to all these edges. Then computing the heuristic associated with $S$ is easily seen to reduce to a single-source minimal-cost path computation over the entire new graph for $w_S$.

Figure 3: A separator may disconnect the graph into (left) two or (right) more connected components.

4.2 Choosing Good Separators

The quality of the SH heuristic very much depends on the choice of separators. It seems that the most informed value of $h(v, t)$ is obtained when $v$ and $t$ are separated by one of the $S_i$ and the separator is compact, in the sense that it contains few vertices and these vertices are "close" to each other, i.e. the "cost diameter" of the separator is small. In this case the separator functions as a "bottleneck", through which the minimal-cost path between $v$ and $t$ must pass, and the three points $a_i$, $b_i$ and $s_i$ mentioned in the proof of Theorem 3 are very close to each other. Indeed, by the triangle inequality

$$c(v, t) \le c(v, a_i) + c(a_i, b_i) + c(t, b_i) = h(v, t) + c(a_i, b_i)$$

so

$$c(v, t) - h(v, t) \le c(a_i, b_i)$$

and if $a_i$ and $b_i$ are connected by a path of small cost, $c(a_i, b_i)$ is probably small.
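The bound $c(v, t) - h(v, t) \le c(a_i, b_i)$ can be checked on a toy undirected graph. The sketch below is illustrative only: it realizes the virtual-vertex trick of Section 4.1 as a multi-source Dijkstra (seeding every separator vertex at cost zero, which is equivalent), and the helper names, the `reference` argument used to pick the positive side $U_1$, and the toy graph are all assumptions of this example.

```python
import heapq
import math


def multi_source_costs(adj, sources):
    """Dijkstra from a set of sources; same result as adding a virtual vertex
    joined to every source by a zero-weight edge and searching from it."""
    dist = {v: math.inf for v in adj}
    heap = []
    for s in sources:
        dist[s] = 0.0
        heap.append((0.0, s))
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def signed_cost_field(adj, separator, reference):
    """C(v, S): +c(v, S) on the side containing `reference`, -c(v, S) elsewhere."""
    sep = set(separator)
    cost = multi_source_costs(adj, sep)
    side = {reference}                    # U1: reachable without entering S
    stack = [reference]
    while stack:
        u = stack.pop()
        for v, _ in adj[u]:
            if v not in sep and v not in side:
                side.add(v)
                stack.append(v)
    return {v: cost[v] if v in side else -cost[v] for v in adj}


def undirected(edges):
    """Build a symmetric adjacency list from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


# Toy graph: U1 = {0, 1}, separator S = {2, 3}, U2 = {4, 5}.
toy = undirected([(0, 1, 1.0), (0, 2, 1.0), (0, 3, 5.0), (2, 3, 3.0),
                  (2, 5, 6.0), (3, 5, 1.0), (4, 5, 1.0)])
C = signed_cost_field(toy, separator=[2, 3], reference=0)
h = abs(C[0] - C[5])                          # single-separator SH value h(0, 5)
c = multi_source_costs(toy, [0])[5]           # true minimal cost c(0, 5)
gap_bound = multi_source_costs(toy, [2])[3]   # c(a, b) with a = 2, b = 3
```

Here the closest separator vertices to 0 and 5 are 2 and 3 respectively, and the gap between the true cost and the heuristic equals the cost of travelling between them, so the bound is tight for this example.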
If the separator is not very compact it is difficult to guarantee that $c(a_i, b_i)$ will always be small. Indeed, it is easy to construct simple examples where $c(a_i, b_i)$ is very large. In analogy to the Euclidean planar case, a good rule of thumb is that if the separator is more or less parallel to the bisector between $v$ and $t$, $c(a_i, b_i)$ will be small. See Fig. 2 (right).

If the graph is a road network that contains highways with small travel times, it is quite effective to use these highways as separators, as they are typically also minimal-cost paths, so all the vertices of the separators are very "close" to each other. Care must be exercised to completely separate the graph along the highway, as typically there are overpasses and underpasses related to the highway, i.e. the graph may not be planar close to the highway.

When the highways do not cover the road network in a systematic manner, it is more practical to take advantage of the planar layout of the network and simply "slice up" the network by straight lines. The simplest approach is to use equally-spaced horizontal and vertical lines. Each such line defines a separator as the set of vertices that lie on one side of the line and are incident on edges intersecting the line. However, based on the analogy to planar bisectors mentioned above, it is also advantageous that these lines span a variety of angles. It may also be more practical to use a piecewise-linear polyline to manually (i.e. interactively) define the separator, as this allows better adaptation to the features of the network.

Another way to obtain compact separators is by using the very effective METIS [12] software package for computing compact balanced separators in graphs. See Fig. 4 for an example of a polyline separator and one generated by METIS. While we found that METIS generates very compact separators, it completely ignores the "cost diameter" of the separator, so is not optimal for our purposes.
We also found that it is difficult to control METIS and cause it to generate a variety of separators at different locations and angles.

Figure 4: Example of separators in the Bay area. (left) A separator defined by a dotted pink polyline effectively cutting through bridges across the bay and mountain ridges. The separator is the set of pink vertices which are incident on the cut edges to the left of the polyline. The blue and green regions are the largest connected components in the network after the separator is removed. Black regions are the union of the other (usually very small) connected components. (right) A more compact and more balanced separator computed by METIS.

4.3 Directed Graphs

The preceding discussion is valid for undirected graphs, in which the minimal-cost function is symmetric (thus also a metric): $c(u, v) = c(v, u)$. In reality, road networks are directed graphs, traffic flowing with different velocities in opposite directions, with one-way roads as an extreme case of zero flow in one direction, a fact that cannot be ignored in a real-world application. Thus $c(u, v)$, the travel time from $u$ to $v$, will typically be different from $c(v, u)$. The DH and SH heuristics described above may be generalized to the directed case by storing two values per coordinate, representing minimal costs in opposite directions. For example, given a landmark vertex $l$, the directed triangle inequalities relating to $c(s, t)$ are (see Fig. 5):

$$c(s, l) \le c(s, t) + c(t, l), \qquad c(l, t) \le c(l, s) + c(s, t)$$

implying:

$$c(s, t) \ge c(s, l) - c(t, l), \qquad c(s, t) \ge c(l, t) - c(l, s)$$

Thus the analog to (1) for the DH heuristic in the directed case is:

$$h(s, t) = \max\{\, c(s, l) - c(t, l),\; c(l, t) - c(l, s) \,\} \le c(s, t)$$

Figure 5: Computation of the DH heuristic based on the landmark $l$ in the directed case. The costs of the minimal-cost paths to and from $l$ are stored for all vertices.
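The two stored values per landmark can be obtained with one search on the graph and one on its reverse. The following sketch illustrates this for a single landmark; the function names and the toy ring graph are assumptions of this example, the graph is assumed strongly connected so that all stored costs are finite, and the result is clamped at zero because a negative lower bound on a travel time carries no information.

```python
import heapq
import math


def dijkstra(adj, source):
    """Single-source minimal costs; adj[u] is a list of (v, weight) arcs."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def reverse(adj):
    """Graph with every arc flipped; costs from l here are costs to l in adj."""
    rev = {v: [] for v in adj}
    for u, arcs in adj.items():
        for v, w in arcs:
            rev[v].append((u, w))
    return rev


def directed_dh(from_l, to_l, s, t):
    """max{c(s,l) - c(t,l), c(l,t) - c(l,s)}, clamped at zero."""
    return max(to_l[s] - to_l[t], from_l[t] - from_l[s], 0.0)


# Toy one-way ring 0 -> 1 -> 2 -> 3 -> 0 with unit travel times.
ring = {i: [((i + 1) % 4, 1.0)] for i in range(4)}
landmark = 0
from_l = dijkstra(ring, landmark)            # c(l, v) for all v
to_l = dijkstra(reverse(ring), landmark)     # c(v, l) for all v
```

On the ring, `directed_dh(from_l, to_l, 1, 3)` is exact, while in the opposite direction both differences are negative and the clamp returns zero.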
Note that, as opposed to the undirected case, $h(s, t)$ may (in rare cases) be negative, so it should be capped at zero:

$$h(s, t) = \max\{\, c(s, l) - c(t, l),\; c(l, t) - c(l, s),\; 0 \,\}$$

As in the undirected case, using the SH in the directed case requires using separators. These are defined and generated in the same way as in the undirected case, i.e. the separation property ignores the directionality of the edges. Since the cost is no longer symmetric, it is not as useful to use the concept of "cost field", rather just the concept of (undirected) connected components. Fig. 6 summarizes the computation of our SH heuristic based on a separator $S$ in the directed case:

Preprocessing: For each separator $S$, compute:
1. For every graph vertex $v$, $c(v, S)$: the minimal cost from $v$ to $S$.
2. For every graph vertex $v$, $c(S, v)$: the minimal cost from $S$ to $v$.
3. For every graph vertex $v$, the label of the (undirected) connected component it belongs to when all edges connecting $S$ to other vertices in the graph (in both directions) are removed.

Online Query: Given vertices $u, v$, the SH heuristic based on separator $S$ is:
1. If $u, v$ are in the same connected component, then $h_S(u, v) = \max\{\, 0,\; c(u, S) - c(v, S),\; c(S, v) - c(S, u) \,\}$
2. If $u, v$ are in different connected components, then $h_S(u, v) = c(u, S) + c(S, v)$

Figure 6: SH heuristic $h_S(u, v)$ for a directed weighted graph based on a separator $S$.

Computing $c(S, t)$ for all vertices $t$ is easy, as discussed in Section 4.1, through the use of a virtual vertex connected with edges of weight zero to all vertices of $S$, and then performing a one-to-all minimal-cost computation from that vertex. At first glance, it would seem that directly computing the opposite $c(t, S)$ is not that straightforward, but this may be solved by reversing the directions of all the graph edges.

5. Experimental Results

We have implemented the heuristics mentioned in this paper, namely the differential heuristic (DH), FastMap (FM) and our separator heuristic (SH), and compared how informed they are when approximating the travel time on a number of road networks whose edges are weighted with realistic travel times. We were not able to use the popular benchmark road networks from the 9th DIMACS Implementation Challenge (Shortest Paths) dataset [10], because these are undirected graphs, so do not reflect reality. Instead, we extracted directed graphs on the equivalent areas of New York, Colorado and the Bay Area from OpenStreetMap [11]. Table 1 shows the specs of those graphs. We were surprised to discover that these were 10x more detailed than those in the DIMACS Challenge. The edges of the graphs were weighted by the minimal travel time along that edge, which was computed as the Euclidean length of the edge (as computed from the latitude and longitude information per vertex) divided by the maximal speed on that edge, as extracted from OpenStreetMap.

In our experiments, we randomly chose 10,000 pairs of vertices from each map by randomly choosing two points $(s, t)$ uniformly distributed within the bounding box of the map, and then "snapping" those two points to the closest map vertex, as long as the snap was not too far. We then compared the true fastest path time $c(s, t)$ with the heuristic $h(s, t)$, when varying the number of "coordinates" used in the heuristics between 4, 6 and 8. We performed this experiment for the directed graph and an undirected version of the same graph, where the weight of an edge was taken as the minimal weight of the edges in each direction. In the directed case, we compared SH only to DH, as it is unclear how to generalize FM to the directed case. The DH landmarks were spread uniformly around the boundary of the network.
The FM landmark pairs were computed in the manner described by Cohen et al. [3], as pairs with distant travel times between them. The SH separators were chosen as a mix of METIS separators and polyline separators specified interactively to take advantage of bottlenecks in the networks.

For each pair of vertices $(s, t)$, we measure the relative quality of the heuristic:

$$\text{qual}(s, t) = \frac{h(s, t)}{c(s, t)}$$

which is a value in $[0, 1]$ reflecting how informed the heuristic is. Tables 2 and 3 show the mean and standard deviations of the heuristic qualities for the experiments we performed, on the undirected and directed graphs, respectively. Good values should be between 80% and 100%. The results show that in the undirected case, the SH heuristic is consistently more informed by 3% to 13% than the DH heuristic, which in turn is also 3% to 13% more informed than the FM heuristic. The results are similar in the directed case: SH is 3% to 12% more informed than DH.

| Graph | Vertices | Undirected Edges | Directed Edges |
|---|---|---|---|
| New York (NY) | 1,579,003 | 1,744,284 | 3,104,365 |
| Colorado (COL) | 5,154,659 | 5,400,186 | 10,454,829 |
| Bay Area (BAY) | 3,092,249 | 3,351,919 | 6,279,871 |

Table 1: Statistics of the graphs used in our experiments, as extracted from OpenStreetMap.

| $k$ | NY SH | NY DH | NY FM | COL SH | COL DH | COL FM | BAY SH | BAY DH | BAY FM |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 89 ± 13 | 84 ± 14 | 82 ± 17 | 87 ± 14 | 84 ± 13 | 71 ± 22 | 90 ± 15 | 77 ± 15 | 67 ± 21 |
| 6 | 91 ± 11 | 85 ± 13 | 84 ± 16 | 90 ± 12 | 85 ± 12 | 73 ± 21 | 91 ± 13 | 80 ± 15 | 71 ± 20 |
| 8 | 92 ± 10 | 87 ± 12 | 84 ± 16 | 92 ± 11 | 86 ± 12 | 75 ± 21 | 93 ± 10 | 83 ± 13 | 72 ± 20 |

Table 2: Mean and standard deviation of heuristic quality (%) as measured in our experiments, over 10,000 pairs of vertices on an undirected road network.
| $k$ | NY SH | NY DH | COL SH | COL DH | BAY SH | BAY DH |
|---|---|---|---|---|---|---|
| 4 | 88 ± 14 | 83 ± 15 | 87 ± 14 | 84 ± 13 | 89 ± 15 | 77 ± 15 |
| 6 | 91 ± 11 | 85 ± 13 | 90 ± 12 | 85 ± 12 | 91 ± 13 | 80 ± 15 |
| 8 | 92 ± 10 | 87 ± 12 | 92 ± 11 | 86 ± 12 | 93 ± 11 | 83 ± 13 |

Table 3: Mean and standard deviation of heuristic quality (%) as measured in our experiments, over 10,000 pairs of vertices on a directed road network.

Although a 3% improvement in the quality of the SH heuristic over the DH heuristic would seem rather small, it can make a surprisingly big difference in the performance of the A* algorithm. The effect of a good heuristic is to reduce the number of road network vertices traversed during the search for the fastest path. Thus the efficiency of a heuristic in conjunction with A* is measured as the number of vertices on the fastest path divided by the total number of vertices traversed by A*:

$$\text{eff}(s, t) = \frac{\#\text{vertices}(\text{fastest\_path}(s, t))}{\#\text{vertices}(\text{A*\_traversal}(s, t))}$$

The closer this number is to 1, the more efficient the heuristic is. The efficiency of the heuristic is the mean of this quantity over all possible pairs $(s, t)$. The best possible efficiency on a road network is typically 40%-50%, since any variant of A* must traverse at least the fastest path vertices and also their immediate neighbors. When a heuristic is used, the efficiency can drop dramatically to the vicinity of 1%, meaning 100 vertices of the graph are explored for every one vertex along the fastest path. Tables 4 and 5 compare the efficiencies of the different heuristics using the same formats as Tables 2 and 3. SH is more efficient than DH by a factor between 1.35 and 2.4 in the undirected case, and between 1.26 and 2.67 in the directed case. Figs. 7 and 8 give more details of the results for the simplest case of $k = 4$ on undirected and directed road networks.
The left column of each figure shows the four DH landmarks in red, the four FM pairs in blue and the four SH polyline separators in four other colors. The middle column shows the histogram of the distribution of the qualities of DH, FM and SH values in red, blue and green, respectively. The right column shows the histogram of the efficiencies, color-coded in the same way.

| k | NY: SH | NY: DH | NY: FM | COL: SH | COL: DH | COL: FM | BAY: SH | BAY: DH | BAY: FM |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 6.0 ± 9.9 | 3.6 ± 9.4 | 3.2 ± 6.7 | 3.4 ± 5.9 | 2.9 ± 6.4 | 1.5 ± 3.1 | 4.8 ± 8.9 | 3.5 ± 10.4 | 2.0 ± 5.1 |
| 6 | 7.1 ± 11.0 | 5.7 ± 11.9 | 3.7 ± 7.4 | 5.8 ± 8.5 | 3.8 ± 8.6 | 1.9 ± 4.9 | 8.1 ± 15.3 | 3.9 ± 10.8 | 2.7 ± 6.5 |
| 8 | 8.1 ± 12.5 | 6.0 ± 12.0 | 3.7 ± 7.5 | 7.3 ± 9.3 | 4.4 ± 9.7 | 2.2 ± 5.5 | 11.3 ± 17.7 | 4.7 ± 11.5 | 3.0 ± 7.1 |

Table 4: Mean and standard deviation of heuristic efficiency (%) for A* as measured in our experiments, over 1,000 pairs of vertices on an undirected road network.

| k | NY: SH | NY: DH | COL: SH | COL: DH | BAY: SH | BAY: DH |
|---|---|---|---|---|---|---|
| 4 | 5.6 ± 9.8 | 3.1 ± 7.9 | 3.4 ± 6.0 | 2.7 ± 5.6 | 4.8 ± 9.5 | 3.3 ± 10.8 |
| 6 | 6.7 ± 11.3 | 6.0 ± 12.3 | 5.8 ± 8.0 | 3.6 ± 8.2 | 8.0 ± 15.3 | 3.6 ± 11.2 |
| 8 | 8.2 ± 13.2 | 6.1 ± 12.3 | 7.4 ± 9.0 | 4.2 ± 9.4 | 11.2 ± 18.3 | 4.2 ± 11.9 |

Table 5: Mean and standard deviation of heuristic efficiency (%) for A* as measured in our experiments, over 1,000 pairs of vertices on a directed road network.

Figure 7: Comparison of heuristics using k = 4 coordinates on undirected weighted road networks. Red points are DH landmarks. Blue points joined by line segments are FM pairs. SH separators are in other colors. Top: New York (NY). Middle: Colorado (COL). Bottom: Bay Area (BAY). Left: Road network. Middle: Heuristic quality histogram. Right: Heuristic efficiency histogram.

Figure 8: Comparison of heuristics using k = 4 coordinates on directed weighted road networks. Red points are DH landmarks. Blue points joined by line segments are FM pairs. SH separators are in other colors. Top: New York (NY). Middle: Colorado (COL). Bottom: Bay Area (BAY).
Left: Road network. Middle: Heuristic quality histogram. Right: Heuristic efficiency histogram.

To better illustrate the efficiency of the SH heuristic compared to that of the DH heuristic, Figs. 9 and 10 show the vertices traversed by A* when searching for the fastest path using the different heuristics on the same (s, t) pair, in the undirected and directed cases. Despite the modest improvements in quality between DH and SH, the efficiency is improved by a factor of anywhere between 2.9 and 30.

It is interesting to understand better the effect of the location of the separator on the efficiency of A* using SH. Fig. 11 shows how A* traverses a road network when searching for the fastest path between two vertices using SH based on a single separator, in three different locations. For simplicity, this network is undirected and its edges are weighted by Euclidean edge lengths. The separators are all parallel to the "bisector" between the two vertices, but at different distances from the target. As long as the vertex under investigation is separated from the target vertex, the heuristic seems to be quite informed. This changes, sometimes quite dramatically, when the separator is crossed, indicating that the true power of the heuristic is in its separation property, as opposed to, e.g., the DH heuristic, which is based on no more than the very basic triangle inequality.

Figure 9: Efficiency of the heuristics with k = 4 coordinates on different weighted undirected road networks. Top: NY. Middle: COL. Bottom: BAY. Colored vertices show the vertices traversed during the A* search for the fastest path from the black vertex to the magenta vertex. Green: SH. Red: DH. Blue: FM. Magenta dotted lines indicate the SH separators. Cyan indicates the fastest path, usually taking advantage of highways. Panel labels (efficiency), in reading order: 38.4%, 1.8%, 1.3% (NY); 15.0%, 0.5%, 0.3% (COL); 5.8%, 0.9%, 0.2% (BAY).
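The efficiency measure eff(s, t), i.e. the number of vertices on the fastest path divided by the number of vertices traversed by A*, can be sketched in a few lines. This is a minimal illustration under stated assumptions: "traversed" is taken to mean vertices removed from the priority queue, the graph is a hypothetical synthetic grid, and the two heuristics compared (the zero heuristic, which reduces A* to Dijkstra, and a Manhattan-distance lower bound) are stand-ins rather than the SH, DH or FM heuristics evaluated above.

```python
import heapq


def grid_graph(n):
    """Hypothetical toy network: undirected n x n grid with travel times in 1..4."""
    adj = {(x, y): [] for x in range(n) for y in range(n)}
    for (x, y) in list(adj):
        for dx, dy in ((1, 0), (0, 1)):
            nb = (x + dx, y + dy)
            if nb in adj:
                w = 1.0 + (3 * x + 5 * y + 7 * dx) % 4
                adj[(x, y)].append((nb, w))
                adj[nb].append(((x, y), w))
    return adj


def a_star(adj, s, t, h):
    """A* with a consistent heuristic h(v, t); assumes t is reachable from s.

    Returns (path, expanded), where expanded counts vertices popped from the queue.
    """
    g = {s: 0.0}
    parent = {s: None}
    closed = set()
    heap = [(h(s, t), s)]
    while heap:
        _, u = heapq.heappop(heap)
        if u in closed:
            continue  # stale queue entry
        closed.add(u)
        if u == t:
            break
        for v, w in adj[u]:
            if v in closed:
                continue
            ng = g[u] + w
            if ng < g.get(v, float("inf")):
                g[v] = ng
                parent[v] = u
                heapq.heappush(heap, (ng + h(v, t), v))
    path, u = [], t
    while u is not None:  # walk parent pointers back to s
        path.append(u)
        u = parent[u]
    return path[::-1], len(closed)


def efficiency(adj, s, t, h):
    """eff(s, t) = #vertices(fastest path) / #vertices(A* traversal)."""
    path, expanded = a_star(adj, s, t, h)
    return len(path) / expanded


def zero(v, t):
    return 0.0  # no information: A* degenerates to Dijkstra


def manhattan(v, t):
    # Admissible and consistent here because every edge weight is at least 1.
    return float(abs(v[0] - t[0]) + abs(v[1] - t[1]))


adj = grid_graph(20)
s, t = (2, 3), (17, 15)
for name, h in (("zero", zero), ("manhattan", manhattan)):
    print(f"{name}: eff = {100 * efficiency(adj, s, t, h):.1f} %")
```

Averaging `efficiency` over many sampled (s, t) pairs would give a figure analogous to one cell of Tables 4 and 5. A more informed heuristic never expands more vertices than the zero heuristic in this setting, which is the mechanism behind the gaps seen in Figs. 9 and 10.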
Summary and Conclusion

We have proposed a relatively simple way to compute an admissible and consistent heuristic, SH, for the A* algorithm for computing the minimal-cost path in a weighted directed graph. In some sense, this heuristic may be viewed as a powerful generalization of the differential heuristic DH (originally called ALT), which has proven to be very effective in its own right. SH is based on the notion of graph separators, which may be generated automatically or manually on road networks. It is shown experimentally to be of higher quality (i.e. more informed) than DH by about 10%, yet this yields an increase in efficiency of up to an order of magnitude when used by A* to generate fastest paths in directed road networks with edges weighted by travel times. SH is applicable to both undirected and directed graphs and seems to perform similarly on both.

Like DH, SH may be used in conjunction with other types of optimizations of the A* algorithm (e.g. bi-directional search, reach-based and hierarchical methods) to independently boost its performance.

Figure 10: Efficiency of the heuristics with k = 4 coordinates on different directed road networks. Top: NY. Middle: COL. Bottom: BAY. Colored vertices show the vertices traversed during the A* search for the fastest path from the black vertex to the magenta vertex. Green: SH. Red: DH. Magenta dotted lines indicate the SH separators. Cyan indicates the fastest path, usually taking advantage of highways. Panel labels (efficiency), in reading order: 35.6%, 1.5% (NY); 6.7%, 0.6% (COL); 1.7%, 0.6% (BAY).

Figure 11: The effect of the location of the separator on the efficiency of the SH heuristic in an undirected road network whose edges are weighted by Euclidean edge lengths. Green vertices are those traversed by A* using SH with a single separator, marked in magenta, when computing the fastest path from the black source to the magenta target vertex. The separator is parallel to the bisector between the two vertices, but at different distances from the target.
Note the deterioration in the efficiency once the separator is crossed.

References

1. H. Bast, D. Delling, A. Goldberg, M. Mueller-Hannemann, T. Pajor, P. Sanders, D. Wagner and R.F. Werneck. Route planning in transportation networks. In Algorithm Engineering: Selected Results and Surveys (L. Kliemann and P. Sanders, Eds.), pp. 19-80, Springer, 2016.
2. E. Chow. A graph search heuristic for shortest distance paths. Proc. AAAI, 2005.
3. L. Cohen, T. Uras, S. Jahangiri, A. Arunasalam, S. Koenig and T.K. Satish Kumar. The FastMap algorithm for shortest path computations. Proc. IJCAI, 2018.
4. E.W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1(1), 1959.
5. M.L. Fredman and R.E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. Proc. IEEE FOCS, 1984.
6. A. Goldberg and C. Harrelson. Computing the shortest path: A* search meets graph theory. Proc. SODA, 2005.
7. P.E. Hart, N.J. Nilsson and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE TSSC, 4(2):100-107, 1968.
8. R.J. Lipton and R.E. Tarjan. A separator theorem for planar graphs. SIAM J. Appl. Math., 36(2):177-189, 1979.
9. C. Rayner, M. Bowling and N. Sturtevant. Euclidean heuristic optimization. Proc. AAAI, 2011.
10. http://www.diag.uniroma1.it/challenge9/
11. https://www.openstreetmap.org/
12. http://glaros.dtc.umn.edu/gkhome/metis/metis/overview