Experimental analysis of dynamic algorithms for the single source shortest paths problem

Daniele Frigioni; Mario Ioffreda; Umberto Nanni; Giulio Pasqualone

December 30, 1997

Abstract

In this paper we present the first experimental study of the fully dynamic single source shortest paths problem on directed graphs with positive real edge weights. In particular, we perform an experimental analysis of three different algorithms: Dijkstra's algorithm, and the two output bounded algorithms proposed by Ramalingam and Reps in [31] and by Frigioni, Marchetti-Spaccamela and Nanni in [18], respectively. The main goal of this paper is to provide first experimental evidence for: (a) the effectiveness of dynamic algorithms for shortest paths with respect to a traditional static approach to this problem; (b) the validity of the theoretical model of output boundedness for analyzing dynamic graph algorithms. Besides randomly generated graphs, which are useful to capture the "asymptotic" behavior of algorithms, we also run experiments on a widely used graph from the real world, i.e., the Internet graph.

Work partially supported by the ESPRIT Long Term Research Project ALCOM-IT under contract no. 20244, and by Progetto Finalizzato Trasporti 2 of the Italian National Research Council (CNR). The work of the first author was done while he was visiting the Max-Planck-Institut für Informatik, Im Stadtwald, 66123 Saarbrücken (Germany), supported by the NATO - Advanced Fellowships Programme n. 215.29 of the Italian National Research Council (CNR).

Portions of this work were presented at the 1st Workshop on Algorithmic Engineering (WAE'97), Venice, Italy, March 11-13, 1997.

Author affiliations. D. Frigioni: Max-Planck-Institut für Informatik, Im Stadtwald, 66123 Saarbrücken (Germany), and Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza". M. Ioffreda, U. Nanni, G. Pasqualone: Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", Via Salaria 113, 00198 Roma, Italy, {frigioni,ioffreda,nanni,pasqualo}@dis.uniroma1.it

1 Introduction

Considerable effort has been devoted in recent years to devising efficient algorithms for dynamic graph problems (e.g., see [6, 9, 13, 14, 15, 16, 18, 20, 23, 24, 25, 26, 30, 31, 32]), motivated by theoretical as well as practical applications. In the literature, the most used dynamic model is the following: we are given a graph $G$ and we want to answer queries on a property $P$ of $G$, while the graph is changing due to insertions and deletions of edges. For instance, if the graph represents a communication or a transportation network, the edge updates reflect changes in the real network, such as traffic conditions and link failures and recoveries. The main goal of a dynamic algorithm is to update the information on $P$ more efficiently than recomputing it from scratch after each update. If both insertions and deletions of edges are allowed, then we refer to the fully-dynamic problem; if only insertions (deletions) of edges are supported, then we refer to the semi-dynamic incremental (decremental) problem.

One of the most studied problems in this field is that of updating shortest paths in a dynamically changing graph. This problem is interesting in its own right and finds many important applications, including network optimization, document formatting, routing in communication systems, and robotics [7]. For a comprehensive review of the application settings for the static and dynamic shortest paths problem, we refer to [1, 30]. Several theoretical results have been provided in the literature for the dynamic maintenance of shortest paths in graphs, but, to the best of our knowledge, nothing is known from the experimental point of view. In this paper we provide the first experimental study of dynamic algorithms for the single source shortest paths problem.

1.1 Previous theoretical results

Many dynamic solutions have been proposed in the literature for the shortest paths problem, both for the single-source and the all-pairs versions [6, 9, 14, 15, 16, 18, 20, 26, 31, 32]. A fully dynamic solution for maintaining all-pairs shortest paths on planar graphs with unrestricted edge weights is given in [26], but the algorithm is complex and far from being practical. In [9] efficient dynamic solutions are provided for graphs with bounded treewidth when the weights of edges may change, but insertions of edges are not considered. An efficient solution for the all-pairs incremental problem is provided in [6], when the edge weights are integers in the range $[1..C]$. A decremental solution for the single source version of the problem on digraphs with $n$ vertices and $m$ edges, and integer edge weights in $[1..C]$, has been given in [16], working in $O(nC)$ amortized time per deletion. Further results concerning the dynamic shortest paths problem have been proposed in [14, 15, 31, 32].

Each of the solutions mentioned above is restricted to a specific setting, either in the kind of graph and edge weights considered, or in the set of updates allowed on the graph. To the best of our knowledge, neither a fully dynamic solution nor a decremental solution for the single source shortest paths problem on general graphs with real edge weights is currently known in the literature that is asymptotically faster (either in the worst case or in the amortized sense) than recomputing the new solution from scratch. These considerations suggest that it is very hard to devise efficient fully dynamic solutions for the single source shortest paths problem in the standard cost models, and this also holds for other important dynamic graph problems, for example transitive closure (see e.g., [24, 25]).

1.2 The output complexity model

For the reasons given above, some researchers have investigated the possibility of defining other cost models for the computational analysis of dynamic graph problems [4, 20, 30]. One of them is the output complexity model, first considered by Ramalingam and Reps in [30, 31], and then by other authors in [18, 19, 20]. In this model the cost of a dynamic algorithm is evaluated as a function of the number of updates to the output information of the problem determined by each input modification. The main motivation behind the use of this model is that it tries to capture the intrinsic cost required by a dynamic algorithm after each input update.

In [30, 31] Ramalingam and Reps propose to measure the cost of dynamic graph algorithms by using the following model: let $G = (V, E)$ be a graph, $P$ be a property to be maintained on $G$, and $\delta$ be an input modification (insertion, deletion or weight update of an edge) to be performed on $G$. The cost parameter is the extended size of the output updates, denoted as $\|\delta\|$. This parameter is the number of vertices that change their output value with respect to the property $P$ after the input change $\delta$, which they call affected vertices, plus the number of edges having at least one affected endpoint.

In [20], the authors propose to measure the computational cost of a dynamic graph algorithm in terms of the number of changes to the output information of the problem (which depends on the problem at hand). For example, in the case of the single source shortest paths problem for a given graph $G = (V, E)$ with source vertex $s$, the output information consists of the value of the minimum distance from $s$ of each vertex $x \in V$, and the description of a shortest paths tree rooted in $s$.
If $\delta$ is an input change to be performed on $G$, then the number of output updates, denoted as $|\delta|$, is the number of vertices in $V$ that either change their distance from the source, or must change their parent in the shortest paths tree while maintaining the same shortest distance, due to $\delta$. This notion of output complexity has also been extended in [20] to sequences of updates. Namely, given a graph $G = (V, E)$, let $\sigma = \langle \delta_1, \delta_2, \ldots, \delta_h \rangle$ be a sequence of edge modifications; each input modification $\delta_i \in \sigma$ is performed on graph $G_{i-1}$, with $G_0 \equiv G$, and gives the new graph $G_i$. After each input modification $\delta_i \in \sigma$ is specified, the current output information must be explicitly updated. The total number of output updates over sequence $\sigma$, denoted as $\Delta(G, \sigma)$, is the sum of the number of output updates caused by each input modification in the sequence. In this case no algorithm can process the sequence $\sigma$, performing explicit updates, in less than $O(|\sigma| + |\Delta(G, \sigma)|)$ time. If $A(G, \sigma)$ is the time required by algorithm $A$ to process a sequence $\sigma$ on graph $G$, with explicit updates, then algorithm $A$ requires $T$ amortized time per output update if:

$$A(G, \sigma) \le T \cdot |\Delta(G, \sigma)| + |\sigma| + \text{constant}.$$

Previous results concerning the dynamic single source shortest paths problem on digraphs with positive real edge weights in the above output complexity models have been proposed in [18, 19, 20, 31]. In [31] Ramalingam and Reps provide a fully dynamic algorithm requiring $O(\|\delta\| \log \|\delta\|)$ worst case time to update the shortest paths information after an input change $\delta$. In [18, 19] the authors propose fully dynamic algorithms for the single source shortest paths problem that work for any graph with positive real edge weights and have optimal space requirements and query time. The cost of the update operations depends on the class of the considered graph and on the number of affected vertices.
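
To make the two cost parameters concrete, the following Python sketch computes them for a single edge deletion by comparing distances before and after the update. It is only an illustration under stated assumptions: the function names, the dictionary-based graph encoding and the example graph are hypothetical; the count returned for $|\delta|$ covers only vertices whose distance changes (vertices that merely change parent at equal distance are ignored); and edges with an affected endpoint are counted in the updated graph.

```python
import heapq


def dijkstra(graph, source):
    """Static distances on a digraph encoded as graph[u][v] = weight."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def output_change_sizes(graph_after, dist_before, dist_after):
    """Return (number of distance-changing vertices, extended size).

    Assumption: parent-only changes are not counted, and incident
    edges are counted in the graph obtained after the update.
    """
    affected = {v for v in graph_after if dist_before[v] != dist_after[v]}
    incident = sum(
        1
        for u, nbrs in graph_after.items()
        for v in nbrs
        if u in affected or v in affected
    )
    return len(affected), len(affected) + incident


# Hypothetical example: delete edge (a, b) from a four-vertex digraph.
before = {"s": {"a": 1, "b": 5}, "a": {"b": 1}, "b": {"c": 1}, "c": {}}
after = {"s": {"a": 1, "b": 5}, "a": {}, "b": {"c": 1}, "c": {}}
sizes = output_change_sizes(after, dijkstra(before, "s"), dijkstra(after, "s"))
```

In this example only `b` and `c` change their distance, and two edges of the updated graph touch them, so an output bounded algorithm would be charged in terms of these small quantities rather than of the size of the whole graph.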

If only updates on the weights of edges are allowed, then, in the case of graphs with bounded genus (including planar graphs), bounded arboricity, bounded degree, bounded treewidth and bounded pagenumber [27], the proposed algorithms require $O(\log n)$ worst case time per affected vertex. For general graphs with $n$ vertices and $m$ edges they require $O(\sqrt{m} \log n)$ worst case time per affected vertex. If insertions and deletions of edges are also allowed, all the previous worst case bounds become amortized.

1.3 Experimental results

A lot of work has been done in the literature to evaluate the practical performance of static algorithms for the shortest paths problem (see e.g. [10, 11, 21]), but nothing is known about the experimental evaluation of dynamic shortest path algorithms. By contrast, experimental studies do exist for other important dynamic graph problems, for example connectivity and minimum spanning tree [3, 5]. In this paper we make a first step in this direction. We implemented the algorithm proposed by Ramalingam and Reps in [31] (denoted as RR) and the one proposed by Frigioni et al. in [18, 19] (denoted as FMN), and evaluated the practical performance of those algorithms in a fully dynamic environment. The main reason for choosing RR and FMN is that they are the only algorithms known in the literature that can handle fully-dynamic updates on general digraphs. Together with these algorithms we also considered a simple variant of the best theoretical implementation of Dijkstra's algorithm (Dij), suitably defined to be used in a fully-dynamic framework. All our implementations are written in C++ with the support of LEDA, the library of efficient data types and algorithms developed at the Max-Planck-Institut für Informatik in Saarbrücken (Germany) [28, 29]. This library allows the programmer to use predefined classes to solve many graph problems in an easy way.
We also implemented the algorithms as parameterized classes with integer and float edge weight types. With this platform of implementations we performed several tests on different instances (where an instance is defined by a given graph and a given update sequence), in order to verify how fast dynamic algorithms are in practice, i.e., beyond their theoretical characterization. Our experiments have been performed on three kinds of instances: (a) randomly generated instances, (b) particular graphs and update sequences that try to enforce special situations for the tested algorithms, (c) random sequences of updates performed on a real world graph, whose structure consists of the fragment of the Internet network visible from the RIPE server (more details are given in Section 6.4). These experiments lead basically to the following conclusions:

- Dynamic algorithms fulfill our expectations, outperforming the static one in a fully-dynamic environment: this is not surprising, of course, but we could not guess in advance how good they are.

- The theoretically best algorithm is not always the fastest. Namely, our experiments show that, in terms of CPU time, RR is faster than FMN in the case of random instances; this remains true although FMN considers a subset of the edges considered by RR (but uses more complex data structures).

- By using "ad hoc" experiments (devised only for this purpose), we show that FMN outperforms RR in some special cases, with a ratio between the running times (and the numbers of scanned edges) that is not bounded. This is meaningful since the computational models proposed by the authors to analyze the two algorithms differ to some extent.

- The experiments on the Internet graph have been performed by considering unitary edge weights (thus minimizing the number of "hops" needed to deliver information), and random sequences of updates. The effectiveness of the RR and FMN algorithms is also evident on this real world graph, where shortest paths problems have to be handled frequently.

The paper is organized as follows. In Section 2 we describe the notation used throughout the paper. In Sections 3, 4 and 5 we describe the considered algorithms Dij, RR, and FMN, respectively, and provide some implementation details. In Section 6 we discuss the results of our experiments. Finally, in Section 7 we provide some concluding remarks and discuss possible future work.

2 Notation

In the paper we use the following standard notation. Let $G = (V, E)$ be a digraph with $n$ vertices and $m$ edges, and let $s \in V$ be a fixed source vertex. A real positive weight $w_{x,y}$ is associated to each edge $(x, y) \in E$. We denote as $dist : V \to \mathbb{R}^{+}$ a distance function that gives, for each vertex $x \in V$, the minimum distance of $x$ from $s$, and as $T(s) = (V_T, E_T)$ a shortest paths tree rooted in $s$ for $G$, i.e., a spanning tree rooted at $s$ such that, for any vertex $v$, the path from $s$ to $v$ in $T(s)$ is a shortest path. For each $x \in V$, $T(x)$ denotes the subtree of $T(s)$ rooted at $x$. Any vertex $x \in V$ has one parent, denoted as $parent(x)$ (except for the source $s$), and a (possibly empty) set of children in $T(s)$. An edge $(x, y) \in E$ is a tree edge if $(x, y) \in E_T$, otherwise it is a non-tree edge. For each vertex $z \in V$, we denote as $in(z)$ and $out(z)$ the edges of $E$ incoming to and outgoing from $z$, respectively. For each vertex $z \in V$, $d(z)$ and $d'(z)$ denote the values of the distance of $z$ before and after an edge modification, respectively. The new parent of a vertex $z$, and the new shortest paths tree in the graph $G'$ obtained from $G$ after an edge operation, are denoted as $parent'(z)$ and $T'(s)$, respectively.

3 Dijkstra's algorithm

We first considered Dijkstra's algorithm [12] for our experiments. This is motivated by the fact that, when implemented with Fibonacci heaps [17], it represents the best known theoretical static solution of the single source shortest paths problem for digraphs with non-negative edge weights, and achieves a time bound of $O(m + n \log n)$.
In the case of integer edge weights in the range $[0..C]$, the best known solution for digraphs requires $O(m + n \sqrt{\log C})$ time and has been proposed in [2]. On the other hand, the best known static solution for undirected graphs with non-negative edge weights is due to Thorup [33] and it is optimal.

Our goal is to evaluate the performance of Dijkstra's algorithm when used in a fully dynamic environment, compared with algorithms that are actually fully dynamic. We used a very simple variant of the algorithm which is suitable for such a framework. The idea is to rebuild the shortest paths information from scratch using Dijkstra's algorithm only when the input update affects the current shortest paths information. As an example, we do not execute the algorithm when an edge that does not belong to the current shortest paths tree is deleted. In this way the static algorithm is allowed to work in a dynamic framework under the best possible conditions.

The basic strategy adopted in Dijkstra's algorithm [12] is the following. For each vertex $v$ in the digraph $G = (V, E)$ a distance label $d(v)$ and a parent label $p(v)$ are maintained. Furthermore, a set $S$ of vertices whose final shortest distance has already been determined is maintained. This means that, for each $v \in S$, we have $d(v) = dist(v)$. Initially all the vertices have a distance label equal to $+\infty$, except for the source, which belongs to $S$ and has label equal to 0. The algorithm repeatedly selects the vertex $z \in V \setminus S$ with minimum distance label and adds it to $S$, by setting $dist(z) = d(z)$ and $parent(z) = p(z)$. Then, all edges $(z, q) \in out(z)$ are traversed in order to verify whether vertex $z$ provides a path from the source to vertex $q$ that is shorter than the current distance label of $q$. If this is the case, then the distance label of vertex $q$ is decreased to the length of the path from the source to $q$ passing through $z$.
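
The variant of Dijkstra's algorithm used in a dynamic setting can be sketched in Python as follows. This is a minimal illustration, not the LEDA-based code used in the experiments: the class name `DynamicDijkstra`, the dictionary-based graph encoding and the `recomputations` counter are hypothetical, a binary heap (`heapq`) replaces the Fibonacci heap, and the test applied on insertions (recompute only if the new edge shortens the distance of its head) is an assumption, since the text gives only the deletion case as an example. The sketch also assumes that an inserted edge is not already present.

```python
import heapq


class DynamicDijkstra:
    """Static Dijkstra, re-run only when an update can affect the tree."""

    def __init__(self, graph, source):
        # graph[u][v] = positive weight; every vertex must be a key.
        self.graph = {u: dict(nbrs) for u, nbrs in graph.items()}
        self.source = source
        self.recomputations = 0
        self._recompute()

    def _recompute(self):
        self.recomputations += 1
        self.dist = {v: float("inf") for v in self.graph}
        self.parent = {v: None for v in self.graph}
        self.dist[self.source] = 0
        settled = set()  # the set S of the description above
        heap = [(0, self.source)]
        while heap:
            d, z = heapq.heappop(heap)
            if z in settled:
                continue  # stale heap entry
            settled.add(z)
            for q, w in self.graph[z].items():
                if d + w < self.dist[q]:  # z gives a shorter path to q
                    self.dist[q] = d + w
                    self.parent[q] = z
                    heapq.heappush(heap, (d + w, q))

    def insert_edge(self, x, y, w):
        self.graph[x][y] = w
        # Assumed test: the new edge matters only if it shortens d(y).
        if self.dist[x] + w < self.dist[y]:
            self._recompute()

    def delete_edge(self, x, y):
        del self.graph[x][y]
        # Deleting a non-tree edge leaves the current tree valid.
        if self.parent[y] == x:
            self._recompute()
```

Counting the calls to `_recompute` over an update sequence gives a rough idea of how often the static algorithm is actually triggered.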

The algorithm halts when the final shortest distances from the source have been determined for all the vertices in the digraph.

For our experiments we used the LEDA implementation of Dijkstra's algorithm [29], modified as described above to be used in a fully-dynamic environment. The data structures used to implement Dijkstra's algorithm are stored in a class called SSSP_DIJ and are very simple. An array dist stores the current distance of each vertex from the source, while an array pred maintains the current parent of each vertex in the shortest paths tree. Furthermore, the vertices still to be added to $S$ are stored in a Fibonacci heap (as implemented in LEDA), in which the priority of each vertex is the length of the shortest path found so far. In this way we obtained the most efficient theoretical implementation of Dijkstra's algorithm in the case of digraphs with non-negative real edge weights. The class constructor uses the LEDA implementation of Dijkstra's algorithm in order to build the initial shortest paths tree.

In [21] it has been shown that Dijkstra's algorithm can be implemented more efficiently from an experimental point of view, when the edge weights are integers in the range $[0..C]$, $C \ge 2$, by using a multi-level bucket implementation. In [22] the authors show that the expected number of decrease-key operations performed during the execution of Dijkstra's algorithm is small. This explains why Dijkstra codes based on binary heaps perform better in practice than ones based on Fibonacci heaps. In our implementations we use priority queues with different features: for all the algorithms a global priority queue is used for the nodes to be updated, and for the FMN algorithm local queues are used for the adjacency lists. We tested binary and Fibonacci heaps, and chose the latter.
We implemented exactly the same algorithms both for integer and for real edge weights, and hence did not use the multi-level bucket implementations, at least so far, but we are considering the possible improvements offered by this approach in the case of integer edge weights.

4 Ramalingam and Reps' algorithm

In this section we describe the first fully dynamic solution proposed in the literature for the single source shortest paths problem on general digraphs with positive real edge weights. It is the solution provided by Ramalingam and Reps in [31], whose performance is evaluated in terms of the parameter $\|\delta\|$, which is the number of vertices affected by the input change $\delta$, plus the number of edges with at least one affected endpoint.

The strategy used by Ramalingam and Reps is based on Dijkstra's algorithm, and can be summarized as follows. The algorithms maintain a subset of the edges of the digraph $G = (V, E)$, denoted as $SP$, during insertions and deletions of edges. This subset contains exactly the edges of $G$ that belong to at least one shortest path from the source $s$ to the other vertices of the graph. The graph with vertex set $V$ and edge set $SP$ is a dag, denoted as $SP(G)$, and contains all the shortest paths from $s$ to the other vertices of $G$.

In the case of an edge insertion, Ramalingam and Reps propose to use an adaptation of Dijkstra's algorithm, working on the set of vertices affected by that insertion. In particular, if $(v, w)$ is the inserted edge, vertices are stored in a priority queue with priority equal to their distance from vertex $w$. When a vertex $x$ with minimum priority is extracted from the heap, all edges $(x, y) \in out(x)$ are traversed; vertex $y$ is inserted in the heap, or its priority in the heap is updated, only if the shortest path from $s$ to $x$, plus edge $(x, y)$, yields a path shorter than the shortest path currently known for $y$. In this case the algorithm deletes all the edges entering $y$ in $SP(G)$ and adds edge $(x, y)$ to $SP(G)$.
On the other hand, if $d(x) + w_{x,y} = d(y)$ then edge $(x, y)$ is simply added to $SP(G)$. The algorithm halts when the shortest distances of all the affected vertices and the new dag $SP(G)$ have been computed.

In the case of an edge deletion, the algorithm of Ramalingam and Reps works in two phases. Suppose $(v, w)$ is the deleted edge. In the first phase the algorithm finds the vertices affected by that deletion, performing a computation similar to a topological sorting of the dag $SP(G)$. A work set is maintained containing vertices that have been identified as affected, but have not yet been processed. Initially, vertex $w$ is inserted in the work set only if there are no further edges in $SP(G)$ entering $w$ after the deletion. Vertices in the work set are processed one by one, and when vertex $u$ is processed, all edges outgoing from $u$ are deleted from $SP(G)$. All vertices that are identified as affected during this process are inserted in the work set.

In the second phase the new distances of the affected vertices from the source are computed, using an approach similar to Dijkstra's. In particular, if $A$ is the set of affected vertices computed during the first phase, and $B$ is the set of unaffected vertices, with $s \in B$, then the correct distance from the source is known for each vertex in $B$, and the new distance value has to be computed for each vertex in $A$. This problem is reduced to a computation of Dijkstra's algorithm as follows: introduce a new source vertex $s'$, by condensing into $s'$ the subgraph of $G$ induced by the vertices in $B$; consider the subgraph induced by the vertices in $A$ and, for each edge $(v, w)$ from a vertex $v$ outside $A$ to a vertex $w$ inside $A$, add an edge from $s'$ to $w$, with weight equal to the final distance of vertex $v$ plus the weight of edge $(v, w)$. Dijkstra's algorithm is then performed on the graph built in this way, starting from $s'$.
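
The two-phase deletion procedure can be sketched in Python as follows. This is an illustrative reading of the description above, not the code of [31] nor the LEDA class used in the experiments: the class name `RRSketch`, its attributes and the graph encoding are hypothetical, a binary heap replaces the Fibonacci heap, the condensed source $s'$ is simulated by seeding each affected vertex with its best distance through an unaffected predecessor, and the dag $SP(G)$ is refreshed at the end of the second phase rather than during it. Exact equality tests on distances are used, so the sketch is meant for integer weights.

```python
import heapq

INF = float("inf")


class RRSketch:
    def __init__(self, graph, source):
        self.out = {u: dict(nbrs) for u, nbrs in graph.items()}
        self.inc = {u: {} for u in graph}  # reversed adjacency
        for u, nbrs in graph.items():
            for v, w in nbrs.items():
                self.inc[v][u] = w
        self.dist = {v: INF for v in graph}
        self.dist[source] = 0
        self._dijkstra([(0, source)], set(graph))
        self.sp = set()                    # edges of the dag SP(G)
        self.sp_in = {v: 0 for v in graph}  # SP(G) in-degree of each vertex
        for u in self.out:
            for v in self.out[u]:
                self._set_sp(u, v, self._tight(u, v))

    def _tight(self, u, v):
        return self.dist[u] < INF and self.dist[u] + self.out[u][v] == self.dist[v]

    def _set_sp(self, u, v, present):
        if present and (u, v) not in self.sp:
            self.sp.add((u, v))
            self.sp_in[v] += 1
        elif not present and (u, v) in self.sp:
            self.sp.remove((u, v))
            self.sp_in[v] -= 1

    def _dijkstra(self, heap, region):
        heapq.heapify(heap)
        while heap:
            d, x = heapq.heappop(heap)
            if d > self.dist[x]:
                continue  # stale heap entry
            for y, w in self.out[x].items():
                if y in region and d + w < self.dist[y]:
                    self.dist[y] = d + w
                    heapq.heappush(heap, (d + w, y))

    def delete_edge(self, v, w):
        in_sp = (v, w) in self.sp
        self._set_sp(v, w, False)
        del self.out[v][w]
        del self.inc[w][v]
        if not in_sp:
            return set()  # no shortest path used the edge
        # Phase 1: find the affected vertices by peeling SP(G).
        affected = set()
        work = [w] if self.sp_in[w] == 0 else []
        while work:
            u = work.pop()
            affected.add(u)
            for y in self.out[u]:
                if (u, y) in self.sp:
                    self._set_sp(u, y, False)
                    if self.sp_in[y] == 0:
                        work.append(y)
        # Phase 2: Dijkstra on the affected vertices, seeded from outside.
        heap = []
        for a in affected:
            self.dist[a] = min(
                (self.dist[b] + c for b, c in self.inc[a].items() if b not in affected),
                default=INF,
            )
            if self.dist[a] < INF:
                heap.append((self.dist[a], a))
        self._dijkstra(heap, affected)
        # Refresh SP(G) on the edges with an affected endpoint.
        for a in affected:
            for b in self.inc[a]:
                self._set_sp(b, a, self._tight(b, a))
            for y in self.out[a]:
                self._set_sp(a, y, self._tight(a, y))
        return affected
```

Note that every loop in `delete_edge` touches only affected vertices and their incident edges, which is the intuition behind a cost expressed in terms of $\|\delta\|$.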

During this second phase the algorithm also updates, for each vertex $v$, the set of edges entering and leaving $v$ in $SP(G)$, in a way similar to that described above for the case of an edge insertion.

The algorithms of Ramalingam and Reps require $O(\|\delta\| \log \|\delta\|)$ worst case time to perform both insertions and deletions of edges. In addition, $O(m + n \log n)$ initialization time and $O(m + n)$ space are required, while queries can be answered in optimal time. The initialization time is the time required to execute a simple modification of Dijkstra's algorithm that computes the initial shortest distances of the vertices in $G$, and the initial dag $SP(G)$. The difference with respect to Dijkstra's algorithm is that, when the vertex $z \in V \setminus S$ with minimum distance label is selected and an edge $(z, q) \in out(z)$ is traversed, the following operations are performed. If $d(z) + w_{z,q} < d(q)$ then the distance label of vertex $q$ is given the value $d(z) + w_{z,q}$; all the edges entering $q$ in $SP(G)$ are deleted; edge $(z, q)$ is added to $SP(G)$. On the other hand, if $d(z) + w_{z,q} = d(q)$ then edge $(z, q)$ is simply added to $SP(G)$.

The data structures used to implement Ramalingam and Reps' algorithm are stored in a parameterized LEDA class called SSSP_RR, and can be summarized as follows. The current solution, that is, the dag $SP(G)$ of all the shortest paths from the source, is represented by a LEDA parameterized graph, where the labels of the vertices are the same as those of the graph embedded in the class. The distances of vertices from the source are stored in an array dist. The priority queues used in the algorithms have been implemented using Fibonacci heaps. The class constructor uses the modification of Dijkstra's algorithm described above for the initialization phase.

5 The algorithm of Frigioni et al.

In this section we describe the second fully dynamic solution proposed in the literature for the single source shortest paths problem on digraphs with positive real edge weights [18, 19]. The theoretical performance of the algorithms is evaluated in the output complexity model proposed in [20], where the number of output updates $|\delta|$ is the number of vertices affected by the input change $\delta$. The algorithms work for any graph, have optimal space requirements and query time, and, for several classes of graphs, they are only a logarithmic factor slower than the optimal one.

The strategy proposed in [18] is based on two main ingredients: the notion of level of an edge, and the notion of ownership of a vertex, which we describe in the following.

1. Let $G = (V, E)$ be a weighted digraph, and $z$ be a vertex of $G$. The backward level of edge $(z, q) \in out(z)$ and of vertex $q$, relative to vertex $z$, is the quantity $b\_level_z(q) = d(q) - w_{z,q}$. The forward level of edge $(v, z) \in in(z)$ and of vertex $v$, relative to vertex $z$, is the quantity $f\_level_z(v) = d(v) + w_{v,z}$.

The intuition behind these parameters is that the levels of an edge $(z, q)$ provide information about the shortest available path from $s$ to $q$ passing through $z$. For instance, let us suppose that, while processing an edge insertion, the algorithm has computed the new distance $d'(z)$ of vertex $z$ from $s$, and that there exists an edge $(z, q) \in out(z)$ such that $b\_level_z(q) - d'(z) = d(q) - w_{z,q} - d'(z) > 0$, i.e., $d(q) > d'(z) + w_{z,q}$. This means that after the insertion the path from the source to $q$ that passes through vertex $z$ is shorter than the shortest path from $s$ to $q$ in the graph before that insertion.

In the case of an edge deletion, let us suppose that the operation is performed on an edge in the current shortest path from $s$ to a vertex $z$. Let $l$ be the length of the shortest path from $s$ to $z$ found so far by the algorithm. Suppose now that there exists an edge $(q, z) \in in(z)$ such that $f\_level_z(q) - l = d(q) + w_{q,z} - l < 0$, i.e., $l > d(q) + w_{q,z}$. This means that the path from $s$ to $z$ that passes through vertex $q$ is shorter than the current path of length $l$.

2. In order to bound the number of edges scanned by the algorithms each time that a vertex changes its distance from the source, for each vertex $z \in V$, each of the sets $in(z)$ and $out(z)$ is partitioned into two subsets as follows. Every edge $(x, y) \in E$ has an owner that must be either $x$ or $y$. For each vertex $x \in V$, $\text{in-own}(x)$ denotes the subset of $in(x)$ containing the edges owned by $x$, and $\overline{\text{in-own}}(x)$ denotes the set of edges in $in(x)$ not owned by $x$. Analogously, $\text{out-own}(x)$ and $\overline{\text{out-own}}(x)$ represent the edges in $out(x)$ owned and not owned by $x$, respectively. The above partition satisfies the following property: i) $(z, q) \in \text{out-own}(z)$ if and only if $(z, q) \in \overline{\text{in-own}}(q)$; ii) $(z, q) \in \overline{\text{out-own}}(z)$ if and only if $(z, q) \in \text{in-own}(q)$.

The digraph $G$ has a $k$-bounded ownership if, for each vertex $x$, both $\text{in-own}(x)$ and $\text{out-own}(x)$ contain at most $k$ edges. The notion of $k$-bounded ownership is required only to bound the running times of the algorithms, but does not affect their behavior (the algorithms do not need to know the value $k$ or any upper bound on it). For each vertex $x$, the edges in $\overline{\text{in-own}}(x)$ and in $\overline{\text{out-own}}(x)$ are stored in two priority queues denoted as $F_x$ and $B_x$, with priorities given by their f_level and b_level, respectively. Any time that a vertex $z$ changes its distance from the source as a consequence of an input modification, the algorithms traverse all the edges owned by $z$ (at most $k$), and only the relevant ones among those not owned by $z$, using the intuition described above.

In the case of an edge insertion the number of output updates is given by the number of vertices that change their distance from the source $s$ as a consequence of that modification.
If the insertion of edge $(x, y)$ decreases the distance of vertex $y$ from $s$, a global priority queue $C$ is used, as in Dijkstra's algorithm, in order to find new distances from the source in nondecreasing order. Unlike Dijkstra's algorithm and that of Ramalingam and Reps, when a vertex $z$ is dequeued from $C$ and its new distance from the source $d'(z) < d(z)$ is computed, not all edges leaving $z$ are scanned in order to propagate the new distance $d'(z)$. In particular, the information stored in heap $B_z$ is used in order to select the edges $(z, q)$ in $\overline{\text{out-own}}(z)$ to be traversed. They are the edges whose priority in $B_z$ is greater than $d'(z)$, i.e., edges that lead to vertices $q$ that surely decrease their distance from the source, choosing $z$ as new parent in the shortest paths tree.

In the case of an edge deletion, the number of output updates is given by the number of vertices that either change their distance from the source $s$ as a consequence of that modification, or must change their parent in $T(s)$, so that the new shortest path from the source is as good as the previous one. The algorithm proposed in [18] for edge deletions works in two phases: first it finds all the vertices that have to be updated, and then it computes the new distances from the source and the new shortest paths tree. In particular, if $z$ is a vertex that increases its distance from the source due to an edge deletion, the algorithm has to deal with two tasks: (i) find the new shortest path from the source to $z$, choosing a new parent for $z$ in $T(s)$; (ii) find the neighbors of $z$ that also increase their distance from the source due to the edge deletion. The number of edges in $\overline{\text{in-own}}(z)$ scanned by the algorithm in the first phase, in order to find the best possible alternative path from the source to $z$, is bounded as follows. Only edges $(q, z)$ in $\overline{\text{in-own}}(z)$ with $f\_level_z(q) - d(z) = d(q) + w_{q,z} - d(z) < 0$ are considered, using the information stored in heap $F_z$.
The second phase performs a computation similar to Dijkstra's algorithm on the subgraph of $G$ induced by the vertices that change their distance from $s$ as a consequence of that deletion. In this phase only edges between affected vertices are traversed.

If the graph $G$ has a $k$-bounded ownership then the algorithms require $O(k \log n)$ time per output update. The initialization time of FMN is $O(m + n \log n)$. This is the time required to execute a simple modification of Dijkstra's algorithm that computes the initial shortest distances of the vertices in $G$, the initial shortest paths tree, and the initial ownership function for $G$, and sets up the local heaps $F_z$ and $B_z$ for each vertex $z$ in the graph.

The data structures used for the algorithm of Frigioni, Marchetti-Spaccamela and Nanni are implemented in a parameterized class called SSSP_FMN. The graph inside the class is parameterized and the label of each vertex $x$ is a pointer to a structure containing information about the vertex, including: its current distance from the source, two heaps $F_x$ and $B_x$ containing the edges in $\overline{\text{in-own}}(x)$ and $\overline{\text{out-own}}(x)$, respectively, and two further lists containing the edges owned by $x$. Each edge also points to a structure containing information about the weight of the edge and its location in the ownerships of vertices, in order to perform insertions and deletions in the corresponding structures in constant time. Two main private methods, MILDEST_SLOPE and CHANGE_OWNERSHIP, are used, respectively, to find the best alternative parent of a vertex that increases its distance from the source or changes its parent in the current shortest paths tree after an edge deletion, and to update the ownership of each vertex. A global heap is used, as in the other algorithms, to compute shortest paths from the source in nondecreasing order. For the FMN algorithm, we tested different alternatives for the local priority queues $F_x$ and $B_x$, and finally selected Fibonacci heaps.
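
The role of the backward levels during an insertion can be illustrated with a small Python sketch. It covers only the selection step on the not-owned outgoing edges of a single vertex $z$, under simplifying assumptions: the function names and the data layout are hypothetical, Python's `heapq` (a binary min-heap, used here with negated priorities) stands in for the Fibonacci heaps, and re-inserting the selected edges with their updated levels is left to the caller.

```python
import heapq


def backward_level(dist, q, weight):
    """b_level of edge (z, q): d(q) minus the weight of (z, q)."""
    return dist[q] - weight


def forward_level(dist, v, weight):
    """f_level of edge (v, z): d(v) plus the weight of (v, z)."""
    return dist[v] + weight


def build_backward_heap(dist, not_owned_out):
    """Build the heap B_z from {q: weight of (z, q)} for the edges
    leaving z that z does not own (levels negated for a max-heap)."""
    heap = [(-backward_level(dist, q, w), q) for q, w in not_owned_out.items()]
    heapq.heapify(heap)
    return heap


def edges_to_scan(backward_heap, new_dist_z):
    """Pop from B_z exactly the heads q with b_level > d'(z), i.e. the
    vertices that improve their distance through z; stop at the first
    edge that does not qualify, so the other edges are never scanned."""
    selected = []
    while backward_heap and -backward_heap[0][0] > new_dist_z:
        _, q = heapq.heappop(backward_heap)
        selected.append(q)
    return selected
```

For example, with current distances 10, 4 and 7 for three heads reached by edges of weight 2, 3 and 1, the backward levels are 8, 1 and 6; if the new distance of $z$ is 5, only the first and third edges are extracted, and the scan stops without looking at the edge of level 1.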
6 Experimental results

In this section we first summarize and briefly discuss the theoretical time and space bounds of the algorithms considered, then we describe the experimental setup, and finally we comment in detail on the results of our experiments. We recall that the considered algorithms are denoted as follows:

Dij: the LEDA implementation of Dijkstra's algorithm [12, 17, 28, 29], suitably modified in order to process dynamic updates efficiently (in the sense specified in Section 3);
RR: our implementation of the algorithm proposed by Ramalingam and Reps in [31], and described in Section 4;
FMN: our implementation of the algorithm proposed by Frigioni, Marchetti-Spaccamela, and Nanni in [18], and described in Section 5.

The table of Figure 1 summarizes the theoretical time and space bounds of the three algorithms considered. Recall that the parameter |δ| represents the number of vertices affected by the input change δ, while the parameter ‖δ‖ represents the number of affected vertices plus the number of edges with at least one affected endpoint. In the worst case the above parameters satisfy the following inequalities: |δ| ≤ ‖δ‖ ≤ n|δ|. Finally, the parameter k in the time bound for insertions and deletions of edges of FMN depends on the class of the considered graph, as described in Section 5; for example, for a general digraph with m edges we have k = √m.

Algorithm  Path query  Distance query  Insert-Delete    Preprocessing    Space
Dij        O(n)        O(1)            O(m + n log n)   O(m + n log n)   O(m + n)
RR         O(n)        O(1)            O(… log …)       O(m + n log n)   O(m + n)
FMN        O(n)        O(1)            O(k log n)       O(m + n log n)   O(m + n)

Figure 1: Theoretical time and space bounds of the considered algorithms.

All the algorithms explicitly store the shortest distances and the shortest paths of vertices from the source, and therefore distance and path queries can always be answered in optimal time.
In addition to this theoretical consideration, we experimentally checked that queries do not influence the performance of the algorithms, and hence we decided to report only on the updates.

The considered algorithms have the same asymptotic space requirements and the same preprocessing time, but some observations are worth noting. First of all, RR and FMN require additional space with respect to Dij: the first one for storing the dag of shortest paths SP(G), whose size is greater than the size of the shortest paths tree computed by Dij; the second one for storing the edges owned and not owned by each vertex. The preprocessing time of the three algorithms basically consists of executing Dijkstra's algorithm on the initial graph. In the case of RR the algorithm is suitably modified to build the initial dag SP(G). In the case of FMN it is modified in order to build the initial local data structures for the vertices of G, i.e., an initial ownership function for G and the local heaps of the vertices.

From a theoretical point of view FMN is the most efficient of the three algorithms considered in a fully dynamic environment with explicit updates. In fact, for several classes of graphs it is only a logarithmic factor away from the optimal one, that is, an algorithm that, after each input change, would update the shortest path information only for the vertices affected by that change. On the other hand, a common phenomenon is that a theoretically fast algorithm is outperformed in practice by a slower and simpler one, due to the large constants that can be hidden in the O-notation. In order to discover whether this phenomenon appears in our case, we performed an extensive experimental study of the three algorithms considered. In the remainder of this section we describe the experimental behavior of the three algorithms, in order to show which one is the most efficient from an experimental point of view.
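To make the two output parameters concrete, the short Python sketch below computes |δ| and ‖δ‖ from a graph and a given set of affected vertices, following the definitions recalled in this section. The dict-of-dicts graph representation and the function name are assumptions made for this illustration; how the affected set is obtained is left outside the sketch.

```python
def output_parameters(graph, affected):
    """Return (|delta|, ||delta||) for a given set of affected vertices.

    graph:    dict vertex -> dict of out-neighbour -> weight (assumed layout)
    affected: vertices whose shortest path information changes
    """
    affected = set(affected)
    # Edges with at least one affected endpoint.
    touching = sum(
        1
        for u, nbrs in graph.items()
        for v in nbrs
        if u in affected or v in affected
    )
    return len(affected), len(affected) + touching


# Small example: only t is affected, and two edges enter t.
example = {"s": {"a": 1, "b": 1}, "a": {"t": 1}, "b": {"t": 1}, "t": {}}
small, extended = output_parameters(example, {"t"})
```

On this example a single affected vertex already brings two extra edges into ‖δ‖, which is the kind of gap between the two parameters that the analysis of the algorithms distinguishes.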
The correctness of the various implementations was verified by comparing the results of the algorithms on different inputs. In particular, we checked whether the shortest distances of vertices from the source, computed by the three algorithms on the same input graphs, were the same after each input update over arbitrary sequences of edge modifications. Our experiments have been performed on a SUN SPARC Ultra-2 with 128 megabytes of RAM. CPU times have been measured in seconds by using the UNIX getrusage() function. Furthermore, we have used the memory management of LEDA.

In the reported experiments we have investigated the behavior of the considered algorithms from various points of view. In particular, we have considered the following four kinds of update operations on the edges: edge insertions and deletions, weight increases and decreases. We have reported the following experiments:

1. random sequences of edge updates: each update operation is chosen among the four kinds specified above, starting from an initial random graph;
2. modifying sequences of edge updates: these are still random sequences, but each operation is chosen among the operations that actually modify some shortest path from the source, starting from an initial random graph;
3. ad hoc sequences of edge updates on special graphs, which we describe below;
4. random sequences of edge updates on the Internet graph.

The first two groups of the reported experiments consider initial graphs having a variable number of vertices and constant density. For each experiment we report the average time on three instances. We performed experiments on graphs with different types of edge weights: positive integers in the ranges 1-100 and 1-2, and real values. We have separated the time spent by the algorithms to perform updates from their preprocessing time, i.e., the time needed to set up the different data structures. The initialization time of the various algorithms is reported separately in Figure 2.
As expected, the more complex data structures of FMN require a larger initialization time, but the difference is within a factor of about 1.5-1.7 between the time spent by the slowest algorithm (FMN) and the fastest one (Dij).

Figure 2: CPU time (in seconds) to initialize the data structures for Dij, RR, and FMN. Random directed graphs with 50-500 nodes, constant edge density 50%, integer edge weights (1-100); average of 3 instances.

6.1 Random sequences on random graphs

A random operation may or may not modify the current shortest paths information. In the latter case none of the three algorithms has to be executed on the current graph. The experimental results obtained for random sequences of updates are reported in Figure 3. The two dynamic algorithms RR and FMN, as expected, perform updates faster than their static counterpart Dij, with a substantially stable ratio in running time of about 25 between Dij and RR, and about 10-12 between Dij and FMN. Algorithm RR also performs better than FMN, by a factor of around 2.5. Hence, the simpler data structures of RR outweigh the larger number of edges scanned by this algorithm with respect to FMN (see Figure 5). We note that, as the density of the initial graph increases due to updates, the total time spent to perform the updates does not increase, even in the case of the static Dij algorithm. In fact, as the density of the graph increases, the work done to perform an update increases, but the probability that an edge operation requires some update decreases. Actually, only a small fraction of the edge operations requires updating the shortest path information.

6.2 Modifying sequences

In this case we substantially set to zero the probability that an edge update does not modify the current shortest path tree.
In this way all three algorithms considered have to perform some computation for each edge operation in the modifying sequence. In the case of Dij, the running time increases with the size of the graph, as expected from the asymptotic analysis.

Figure 3: CPU time (in seconds) to process random sequences of 10000 operations on random graphs with constant edge density, with (A) integer and (B) real edge weights. Random directed graphs with 50-500 nodes, constant edge density 50%, edge weights (A) integers (1-100) and (B) real values (1.0-100.0); average of 3 instances.

In the considered range of parameters, the running time of the dynamic algorithms RR and FMN is much more stable, and an order of magnitude lower than that of Dij. The ratio between the running times of RR and FMN is essentially similar to the case of random sequences (see Figure 4), although the number of edges considered for possible updates is smaller in the case of the FMN algorithm (see Figure 5). This phenomenon is probably due to the fact that each edge traversal performed by FMN has a logarithmic cost.

Figure 4: total CPU time to process random sequences of 1000 modifying operations on random graphs with constant edge density. Random directed graphs with 50-500 nodes, constant edge density 50%, integer edge weights (1-100); average of 3 instances.
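One possible way to decide whether a single operation is modifying is sketched below in Python. The sketch rests on assumptions that go beyond the text: it interprets "modifying" as "changes at least one distance from the source", it assumes positive weights with exact arithmetic (for instance integers), and it uses a dict-of-dicts graph together with a hypothetical tuple encoding of the four kinds of operations.

```python
INF = float("inf")

def changes_distances(graph, dist, op):
    """Return True if applying `op` changes at least one distance.

    graph: dict vertex -> dict of out-neighbour -> positive weight
    dist:  current distances of the vertices reachable from the source
    op:    ("insert", x, y, w), ("decrease", x, y, w),
           ("delete", x, y) or ("increase", x, y, w)
    """
    kind, x, y = op[0], op[1], op[2]
    if x not in dist:
        return False  # x is unreachable: no shortest path leaves x
    if kind in ("insert", "decrease"):
        return dist[x] + op[3] < dist.get(y, INF)
    # delete / increase: the edge must lie on some shortest path to y ...
    if dist[x] + graph[x][y] != dist.get(y, INF):
        return False
    # ... and y must have no other incoming edge that is equally good.
    return not any(
        u != x and u in dist and y in nbrs and dist[u] + nbrs[y] == dist[y]
        for u, nbrs in graph.items()
    )
```

Under these assumptions a modifying sequence could be produced by rejection sampling: draw random operations and keep only those for which the predicate holds on the current graph.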
6.3 Ad hoc sequences on special graphs

In this group of experiments we wanted to show that, if we increase the size of the neighborhood of the vertices to be updated, FMN is faster than RR by a ratio which can be proportional to the number of vertices in the graph, reflecting the theoretical difference between the parameters |δ| and ‖δ‖ that were used in the analysis of the two algorithms. Given two algorithms that solve the same problem, Andrew Goldberg has called a separator for the two algorithms a class of instances on which one of the two algorithms outperforms the other. As a separator between the FMN and RR algorithms we have chosen a special class of graphs with the following structure. The set of vertices consists of a top source vertex s, a bottom vertex t, and a set of vertices x_1, x_2, ..., x_k; the edges connect all the k vertices x_1, x_2, ..., x_k to both the source and the bottom. The sequence of updates consists of alternating insertions and deletions of a single edge (s, t) with a proper edge weight. In Figure 6 we show that the ratio between the time spent by RR and FMN can be arbitrarily large (as the neighborhood of the updated vertices increases in size). All the experiments use a graph with the fixed topology described above, but edge weights are chosen such that the number of vertices to be updated is variable.
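The structure of this class of graphs can be sketched in a few lines of Python. The edge directions (from s to each x_i, and from each x_i to t), the unit default weights, the vertex labels and the function names are assumptions made for this illustration; the text only fixes the topology and the alternation of insertions and deletions of (s, t).

```python
def separator_graph(k, w_top=1, w_bottom=1):
    """Build the graph with vertices s, t, x1..xk and edges s->xi, xi->t."""
    graph = {"s": {}, "t": {}}
    for i in range(1, k + 1):
        x = f"x{i}"
        graph["s"][x] = w_top      # edge from the source to x_i
        graph[x] = {"t": w_bottom}  # edge from x_i to the bottom vertex
    return graph

def toggle_sequence(length, weight=1):
    """Alternate insertion and deletion of the single edge (s, t)."""
    return [
        ("insert", "s", "t", weight) if i % 2 == 0 else ("delete", "s", "t")
        for i in range(length)
    ]

g = separator_graph(200)
num_nodes = len(g)
num_edges = sum(len(nbrs) for nbrs in g.values())
```

With k = 200 this yields 202 vertices and 400 edges, matching the smallest instance size listed in Figure 6. With the unit weights assumed here, toggling an edge (s, t) of weight 1 changes only the distance of t, while all k edges entering t have an affected endpoint.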
Figure 5: number of edges scanned while processing random sequences of 1000 modifying operations on graphs with 200 vertices and variable edge density (10%-90%); integer edge weights in the range (A) 1-100, and (B) unitary. Average of 3 instances.

#nodes  #edges  RR       FMN
202     400     0.70004  0.06570
402     800     1.33943  0.08198
602     1200    2.03070  0.11350
802     1600    2.75082  0.11100
1002    2000    3.46326  0.13705
1202    2400    5.01197  0.14609
1402    2800    5.98160  0.16497
1602    3200    6.64268  0.20075
1802    3600    7.42286  0.20708
2002    4000    8.13032  0.21597

Figure 6: ad hoc sequences of updates on special graphs with variable size. Total CPU time for ad hoc sequences of 1000 operations (single experiment).

6.4 Experiments on the Internet graph

In this section we describe the experiments we have carried out on the graph modeling the Internet network. We used a database describing the fragment of the network visible from one of the main European servers: Réseaux IP Européens (RIPE), at the site dbase.ripe.net. The interpretation of the database is based on the report ripe-81++ [8]. A basic notion of the database is the Autonomous System (AS), which includes a group of IP networks with a unique routing policy run by one or more network operators. Each AS has a unique identifier (aut-num).
The main RIPE object used for routing purposes is the Autonomous System Object, which stores descriptive, administrative and contact information about the corresponding AS, as well as its routing policies in relation to all neighboring ASes. An Autonomous System Object is described by a number of attributes; for our goals we considered only two of them: aut-num, which is the Autonomous System Number identifying the AS, and as-in, which describes the accepted information between AS neighbors. After filtering the information contained in that database, in order to reconstruct an explicit representation G = (V, E) of the network we have used the following rules:

a) each AS is a vertex of G;
b) if the attribute as-in of the autonomous system x specifies that x accepts routing information from AS y, then we add the edge (y, x, 1) to E, no matter what routing policy is described.

In this way we obtained an undirected graph G with 1259 vertices, 5104 edges, unitary edge weights and several connected components. The updates that we considered are fully dynamic sequences of edge deletions and reinsertions, i.e., without changing the initial topology of G. These experiments were chosen to simulate, as far as possible, a realistic example, with failure and recovery of links: purely random edge operations would lead the graph toward a more uniform degree of the vertices. We chose two different sources: the autonomous systems AS 1755 and AS 3561. The first one minimizes the number of nodes not reachable on the starting graph (399), while the other maximizes the outgoing degree of the source (82), still maintaining 403 isolated vertices (out of 1259). The sequences have been randomly generated, with equal probability (50%) for insertions and deletions (with the constraint that any inserted edge has been previously deleted, for the reasons explained above). Each sequence consists of 10000 operations.
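Rules a) and b), together with the deletion and reinsertion sequences just described, can be sketched in Python as follows. The input format (pairs of an aut-num and the list of ASes named in its as-in attribute, already parsed), the function names and the operation encoding are hypothetical and introduced only for this illustration; the sketch also assumes a graph with at least one edge.

```python
def build_as_graph(records):
    """Apply rules a) and b) to (aut_num, accepts_from) pairs.

    records: iterable of (aut_num, accepts_from), where accepts_from lists
             the ASes from which aut_num accepts routing information.
    """
    graph = {}
    for aut_num, accepts_from in records:
        graph.setdefault(aut_num, {})                   # rule a)
        for neighbor in accepts_from:
            graph.setdefault(neighbor, {})[aut_num] = 1  # rule b): edge (y, x, 1)
    return graph

def failure_recovery_sequence(graph, length, rng):
    """Random deletions and reinsertions that never leave the initial topology."""
    present = [(u, v) for u, nbrs in graph.items() for v in nbrs]
    removed, ops = [], []
    for _ in range(length):
        # Reinsert only an edge that was deleted earlier; otherwise delete.
        reinsert = bool(removed) and (not present or rng.random() < 0.5)
        src, dst = (removed, present) if reinsert else (present, removed)
        edge = src.pop(rng.randrange(len(src)))
        dst.append(edge)
        ops.append(("insert" if reinsert else "delete", *edge))
    return ops
```

The generator chooses between the two kinds of operation with probability 1/2 whenever both are possible; when no edge is currently deleted (or none is present) the choice is forced, so the split is only approximately even on very small graphs.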
The results, reported in Figure 7, provide average values on five sequences for each of the specified sources.

Total time (in seconds)
source   Dij         RR        FMN
AS1755   12.615026   0.653482  0.462232
AS3561   12.563434   0.651382  0.470504

Time ratio = total time / time(Dij)
source   Dij       RR      FMN
AS1755   100.00%   5.18%   3.66%
AS3561   100.00%   5.18%   3.75%

Scanned edges
source   Dij         RR       FMN
AS1755   6,725,010   20,466   15,401
AS3561   6,706,948   21,073   15,829

Edge ratio = scanned edges / edges(Dij)
source   Dij       RR      FMN
AS1755   100.00%   0.30%   0.23%
AS3561   100.00%   0.31%   0.24%

Figure 7: random sequences of 10000 edge insertions and deletions on the Internet graph.

7 Conclusions

We have shown, at least in the case of random instances, the remarkable advantages of using the dynamic algorithms RR and FMN for solving the fully dynamic single-source shortest-path problem on digraphs with positive real edge weights, with respect to the computation from scratch using the well-known Dijkstra's algorithm, which is widespread in a number of software applications. The resulting figures and tables show that in the conditions described above it is possible to save over 95% of the running time devoted to updating shortest paths, in a fully dynamic framework with explicit updates. Another result of our experiments is that RR is faster than FMN by a factor of 2-3 in the case of random instances, due to its simpler data structures (which also yield a more effective use of cache memory); on the other hand, the edges scanned by FMN are always a subset of those scanned by RR. In very special cases we can force RR to perform radically worse than FMN, with a ratio between both the running times and the numbers of scanned edges which is not bounded. In conclusion, RR usually performs better in running time, while FMN seems to be better suited when the worst case time is the main concern, or when we want to keep the number of edges to be considered for possible updates as low as possible.
In the experiments on the Internet graph, the low density and the large number of alternative shortest paths with the same length (that must be handled by RR) led FMN to perform better than we expected. We aim to carry out experiments in a more realistic framework. Incidentally, in these experiments it is impressive to compare the number of edges scanned in order to update shortest paths: the ratio between the static algorithm (actually used in routing) and the two dynamic ones is in the range of 500-600. We are considering suitable contexts in which to propose dynamic algorithms in the realm of communication networks.

Acknowledgments

We would like to thank Gianfranco Lanzilli for handling and analyzing the RIPE database, and Francesco Pugliese and Telecom Italia for providing support and for stimulating discussions.

References

[1] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms and Applications. Prentice Hall, Englewood Cliffs, NJ, 1993.
[2] R. K. Ahuja, K. Mehlhorn, J. B. Orlin, and R. E. Tarjan. Faster algorithms for the shortest paths problem. Journal of the ACM, 37(2):213-233, 1990.
[3] D. Alberts, G. Cattaneo, and G. F. Italiano. An empirical study of dynamic graph algorithms. In ACM-SIAM Symposium on Discrete Algorithms, pages 192-201, 1996. To appear in ACM Journal on Experimental Algorithmics.
[4] B. Alpern, R. Hoover, B. K. Rosen, P. F. Sweeney, and F. K. Zadeck. Incremental evaluation of computational circuits. In ACM-SIAM Symposium on Discrete Algorithms, pages 32-42, 1990.
[5] G. Amato, G. Cattaneo, and G. F. Italiano. Experimental analysis of dynamic minimum spanning tree algorithms. In ACM-SIAM Symposium on Discrete Algorithms, pages 1-10, 1997.
[6] G. Ausiello, G. F. Italiano, A. Marchetti-Spaccamela, and U. Nanni. Incremental algorithms for minimal length paths. Journal of Algorithms, 12(4):615-638, 1991.
[7] M. Barbehenn and S. Hutchinson. Efficient search and hierarchical motion planning by dynamically maintaining single-source shortest paths trees. IEEE Transactions on Robotics and Automation, 11(2):198-214, 1995.
[8] T. Bates, E. Gerich, L. Joncheray, J-M. Jouanigot, D. Karrenberg, M. Terpstra, and J. Yu. Representation of IP routing policies in a routing registry. Technical report, RIPE-181, October 1994.
[9] S. Chaudhuri and C. D. Zaroliagis. Shortest path queries in digraphs of small treewidth. In International Colloquium on Automata, Languages, and Programming, pages 244-255. Lect. Notes in Comp. Sci., 944, 1995.
[10] B. V. Cherkassky and A. V. Goldberg. Negative-cycle detection algorithms. In European Symposium on Algorithms, pages 349-363. Lect. Notes in Comp. Sci., 1996.
[11] B. V. Cherkassky, A. V. Goldberg, and T. Radzik. Shortest paths algorithms: Theory and experimental evaluation. In ACM-SIAM Symposium on Discrete Algorithms, pages 516-525, 1994.
[12] E. W. Dijkstra. A note on two problems in connection with graphs. Numerische Mathematik, 1:269-271, 1959.
[13] D. Eppstein, Z. Galil, G. F. Italiano, and A. Nissenzweig. Sparsification - a technique for speeding up dynamic graph algorithms. In IEEE Symposium on Foundations of Computer Science, pages 60-69, 1992.
[14] S. Even and H. Gazit. Updating distances in dynamic graphs. Methods of Operations Research, 49:371-387, 1985.
[15] E. Feuerstein and A. Marchetti-Spaccamela. On-line algorithms for shortest paths in planar graphs. Theoretical Computer Science, 116:359-371, 1993.
[16] P. G. Franciosa, D. Frigioni, and R. Giaccio. Semi-dynamic shortest paths and breadth-first search on digraphs. In Symposium on Theoretical Aspects of Computer Science, pages 33-46. Lect. Notes in Comp. Sci., 1200, 1997.
[17] M. L. Fredman and R. E. Tarjan. Fibonacci heaps and their use in improved network optimization algorithms. Journal of the ACM, 34:596-615, 1987.
[18] D. Frigioni, A. Marchetti-Spaccamela, and U. Nanni. Fully dynamic output bounded single source shortest path problem. In ACM-SIAM Symposium on Discrete Algorithms, pages 212-221, 1996.
[19] D. Frigioni, A. Marchetti-Spaccamela, and U. Nanni. Fully dynamic algorithms for maintaining shortest path trees. 1997. Submitted for publication.
[20] D. Frigioni, A. Marchetti-Spaccamela, and U. Nanni. Semi-dynamic algorithms for maintaining single source shortest path trees. Algorithmica, to appear, 1997.
[21] A. V. Goldberg and C. Silverstein. Implementations of Dijkstra's algorithm based on multi-level buckets. Technical Report 95-187, NEC Research Institute, Inc., November 1995. Also, Proceedings of Network Optimization Conference, 1996.
[22] A. V. Goldberg and R. E. Tarjan. Expected performance of Dijkstra's shortest path algorithm. Technical Report 96-062, NEC Research Institute, Inc., June 1996.
[23] M. R. Henzinger and V. King. Randomized dynamic graph algorithms with polylogarithmic time per operation. In ACM Symposium on Theory of Computing, pages 519-527, 1995.
[24] G. F. Italiano. Amortized efficiency of a path retrieval data structure. Theoretical Computer Science, 48:273-281, 1986.
[25] G. F. Italiano. Finding paths and deleting edges in directed acyclic graphs. Information Processing Letters, 28:5-11, 1988.
[26] P. N. Klein, S. Rao, M. Rauch, and S. Subramanian. Faster shortest-path algorithms for planar graphs. In ACM Symposium on Theory of Computing, pages 27-37, 1994.
[27] S. M. Malitz. Genus g graphs have pagenumber O(√g). Journal of Algorithms, 17:85-109, 1994.
[28] K. Mehlhorn and S. Näher. LEDA, a platform for combinatorial and geometric computing. Communications of the ACM, 38:96-102, 1995.
[29] K. Mehlhorn, S. Näher, and C. Uhrig. The LEDA user manual, version 3.5. Technical report, Max-Planck-Institut für Informatik, 1997.
[30] G. Ramalingam. Bounded Incremental Computation. Lect. Notes in Comp. Sci., 1089, 1996.
[31] G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest path problem. Journal of Algorithms, 21:267-305, 1996.
[32] H. Rohnert. A dynamization of the all pairs least cost path problem. In Symposium on Theoretical Aspects of Computer Science, pages 279-286. Lect. Notes in Comp. Sci., 182, 1985.
[33] M. Thorup. Undirected single source shortest paths in linear time. In IEEE Symposium on Foundations of Computer Science, 1997.
