Algorithms and Data Structures
Shortest Paths
Marcin Sydow
Web Mining Lab, PJWSTK

Topics covered by this lecture:

- Shortest Path Problem
- Variants
- Relaxation
- DAG
- Nonnegative Weights (Dijkstra)
- Arbitrary Weights (Bellman-Ford)
- (*) All-pairs algorithm

Example: Fire Brigade

Consider the following example. There is a navigating system for the fire brigade. The task of the system is as follows. When there is a fire at some point f in the city, the system has to quickly compute the shortest way from the fire brigade base to the point f. The system has access to full information about the current topology of the city (the map includes all the streets and crossings in the city) and the estimated latencies on all sections of streets between all crossings.

(Notice that some streets can be unidirectional. Assume that the system is aware of this.)

The Shortest Paths Problem

INPUT: a graph G = (V, E) with weights on edges given by a function w : E → R, and a starting node s ∈ V

OUTPUT: for each node v ∈ V:

- the shortest-path distance µ(s, v) from s to v, if it exists
- the parent of v in a shortest-path tree (if it exists), used to reconstruct the shortest path

The shortest path may not exist for one of two reasons:

- there may be no path from s to v in the graph
- there may be a negative cycle in the graph (in such a case there may be no lower bound on the path length)

Variants

When we design the best algorithm for the shortest-paths problem, we can exploit some special properties of the graph, for example:

- the graph G is directed or undirected
- the weights are non-negative, integer, etc. (each case can be approached differently to obtain the most efficient algorithm)
- the graph is acyclic, etc.

We will see different algorithms for different variants.

Shortest Paths - properties

If G = (V, E) is an arbitrary graph and s, v ∈ V, we use the following convention for the value of µ(s, v), the shortest-path distance from s to v in G:

- µ(s, v) = +∞ (when there is no path from s to v)
- µ(s, v) = −∞ (when there exists a path from s to v that contains a negative cycle)
- µ(s, v) = d ∈ R (otherwise)

(we write µ(v) instead of µ(s, v) when s is clear from the context)

Notice the following important property of shortest paths: a subpath of a shortest path is itself a shortest path (i.e. if (v_1, ..., v_{k-1}, v_k) is a shortest path from v_1 to v_k, then (v_1, ..., v_{k-1}) must be a shortest path from v_1 to v_{k-1}; simple proof by contradiction).

The idea of computing shortest paths

The general idea is similar to BFS from a source node s. We keep two attributes with each node v:

- v.distance: the shortest currently known distance from s to v
- v.parent: the predecessor of v on the shortest currently known path from s

Initialisation: s.distance = 0, s.parent = s, and all other nodes have distance set to infinity and parent set to null.

The attributes are updated by propagation via edges: this is called relaxation of an edge.

The Concept of Edge Relaxation

    relax((u, v)):          # (u, v) is an edge in E
        if u.distance + w(u, v) < v.distance:
            v.distance = u.distance + w(u, v)
            v.parent = u

The algorithms relax edges until the shortest paths have been found or a negative cycle has been discovered.

Relaxation has some important properties, e.g. (assuming the prior initialisation described before):

- after any sequence of relaxations, for all v ∈ V: v.distance ≥ µ(v) (simple proof by induction on the number of edge relaxations); in other words, the distance discovered by relaxations never drops below the shortest distance.

It can be proven (by induction) that relaxations are indeed the right tool to compute shortest paths.
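As a concrete companion to the pseudocode above, here is a minimal Python sketch of the initialisation and of relaxing a single edge. The Node class, its adj list of (neighbour, weight) pairs, and the function names are assumptions made for illustration; only the distance/parent bookkeeping follows the slides.

    import math

    class Node:
        """A graph node carrying the two attributes used by the shortest-path algorithms."""
        def __init__(self, name):
            self.name = name
            self.distance = math.inf   # shortest currently known distance from the source
            self.parent = None         # predecessor on that currently known path
            self.adj = []              # list of (neighbour, weight) pairs (assumed representation)

    def initialise(nodes, s):
        """s.distance = 0, s.parent = s; every other node gets infinite distance and null parent."""
        for v in nodes:
            v.distance = math.inf
            v.parent = None
        s.distance = 0
        s.parent = s

    def relax(u, v, w_uv):
        """Relax the edge (u, v) of weight w_uv, as in the pseudocode above."""
        if u.distance + w_uv < v.distance:
            v.distance = u.distance + w_uv
            v.parent = u

The algorithms in the remainder of the lecture differ only in the order in which edges are handed to relax.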
Correctness of Relaxation

Lemma. After performing a sequence R of edge relaxations that contains (as a subsequence) the relaxations of the edges of a shortest path p = (e_1, ..., e_k) from s to v, we have v.distance = µ(s, v).

Proof: Because p is a shortest path, we have µ(v) = w(e_1) + ... + w(e_k). Let v_i denote the target node of e_i, for 0 < i ≤ k (with v_0 = s). We show, by induction, that after the i-th of these relaxations, v_i.distance ≤ w(e_1) + ... + w(e_i). It is true at the beginning (s.distance == 0). Then, after the i-th relaxation, v_i.distance ≤ v_{i-1}.distance + w(e_i) ≤ w(e_1) + ... + w(e_i) (by the definition of relaxation and the induction hypothesis). Thus, after the k-th relaxation, v.distance ≤ µ(v). But v.distance cannot be lower than µ(v) (previous slide), so v.distance == µ(v) holds.

Shortest Paths in a Directed Acyclic Graph (case 1)

A directed acyclic graph (DAG) G = (V, E) is a simple case for computing shortest paths from a source s.

First, any DAG can be topologically sorted (e.g. by DFS) in O(m + n) time (where m = |E| and n = |V|, and edge traversal is the dominating operation), resulting in a sequence of vertices (v_1, ..., v_n).

Then, assuming s = v_j for some 0 < j ≤ n, we can relax all edges outgoing from v_j, then all edges outgoing from v_{j+1}, etc. Thus, each edge is relaxed at most once. Since each relaxation takes constant time, the total time complexity is O(m + n).
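A possible way to turn the DAG case into runnable code is sketched below; it reuses Node, initialise and relax from the sketch above. One assumption: the topological sort here uses Kahn's indegree-based algorithm rather than the DFS mentioned in the slides, which does not change the O(m + n) bound.

    import math
    from collections import deque

    def dag_shortest_paths(nodes, s):
        """Single-source shortest paths in a DAG: topological sort, then one relaxation per edge."""
        initialise(nodes, s)

        # Topological sort in O(m + n) (Kahn's algorithm); a DFS-based sort works equally well.
        indegree = {v: 0 for v in nodes}
        for u in nodes:
            for v, _ in u.adj:
                indegree[v] += 1
        queue = deque(v for v in nodes if indegree[v] == 0)
        order = []
        while queue:
            u = queue.popleft()
            order.append(u)
            for v, _ in u.adj:
                indegree[v] -= 1
                if indegree[v] == 0:
                    queue.append(v)

        # Relax the outgoing edges of each vertex in topological order: each edge at most once.
        for u in order:
            if u.distance < math.inf:      # vertices not reachable from s can be skipped
                for v, w_uv in u.adj:
                    relax(u, v, w_uv)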
Dijkstra's Algorithm: only non-negative weights (case 2)

Another case is when there are no negative weights on the edges (thus there are no negative cycles, but a cycle can exist, so topological sorting is not possible).

The idea is as follows: if we relax edges in the order of non-decreasing shortest-path distance from the source s, then each edge is relaxed exactly once.

High-level pseudo-code of Dijkstra's algorithm:

    initialise
    while (there is an unscanned node with finite distance):
        u = minimum-distance unscanned node
        relax all edges (u, v)
        u becomes scanned

In the following proof of correctness, for any node v we will denote v.distance as d(v), for short.

Correctness of Dijkstra's Algorithm

It is quite easy to observe that each node v reachable from s will be scanned by Dijkstra's algorithm (e.g. by induction on the unweighted length of the shortest path from s to v).

To complete the proof, we show that when any node v is scanned, d(v) = µ(v). By contradiction, let t be the first moment in time when some node v is scanned with d(v) > µ(v) (i.e. v contradicts what we want to prove). Let (s = v_1, ..., v_k = v) be the shortest path from s to this node and let i be the smallest index on this path such that v_i was not scanned before time t. Since v_1 = s and d(s) = 0 = µ(s), it must be that i > 1. By the definition of i, v_{i-1} was scanned before time t, and d(v_{i-1}) = µ(v_{i-1}) when it was scanned (due to the definition of t). Thus, when v_{i-1} is scanned, d(v_i) is set to d(v_{i-1}) + w(v_{i-1}, v_i) = µ(v_{i-1}) + w(v_{i-1}, v_i) = µ(v_i) (because the path is a shortest one). To summarise, at time t we have d(v_i) = µ(v_i) ≤ µ(v_k) < d(v_k), so v_i should be scanned instead of v_k (and they are different nodes, due to the strict inequality); a contradiction.

Algorithm (Dijkstra)

(pq: a priority queue with a decreaseKey operation; the priority is the value of the distance attribute)

    s.distance = 0
    pq.insert(s)
    s.parent = s
    for each v in V except s:
        v.distance = INFINITY
        v.parent = null
    while (!pq.isEmpty()):
        scannedNode = pq.delMin()
        for each v in scannedNode.adjList:
            if (v.distance > scannedNode.distance + w(scannedNode, v)):
                v.distance = scannedNode.distance + w(scannedNode, v)
                v.parent = scannedNode
                if (pq.contains(v)):
                    pq.decreaseKey(v)
                else:
                    pq.insert(v)

(To efficiently implement the contains and decreaseKey operations of the priority queue, an additional dictionary can be kept that maps nodes to their positions (pointers) in the priority queue.)

Dijkstra - analysis

data size: n = |V|, m = |E|

dominating operations: comparison of priorities (inside the priority queue), attribute update

initialisation: O(n)

loop: O(n × (delMin + insert) + m × decreaseKey) = O(n log n) + O(m log n) = O((n + m) log n) (if a binary heap is used)

(*) Faster
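For reference, a compact runnable sketch of the algorithm above in Python. Two assumptions that differ from the pseudocode: the graph is given as a dict mapping each node label to a list of (neighbour, weight) pairs, and instead of a priority queue with decreaseKey it uses the standard heapq module with lazy deletion (stale queue entries are skipped when popped); the O((n + m) log n) bound is preserved.

    import heapq
    import math

    def dijkstra(graph, s):
        """Single-source shortest paths for non-negative edge weights.

        graph: dict mapping a node label to a list of (neighbour, weight) pairs (assumed format);
        node labels are assumed comparable so that ties in the heap can be broken.
        Returns (distance, parent) dicts; distance[v] == math.inf means v is unreachable from s.
        """
        distance = {v: math.inf for v in graph}
        parent = {v: None for v in graph}
        distance[s] = 0
        parent[s] = s

        pq = [(0, s)]                  # (distance, node); new pushes replace decreaseKey (lazy deletion)
        scanned = set()
        while pq:
            d_u, u = heapq.heappop(pq)
            if u in scanned:           # stale entry: u was already scanned with a smaller key
                continue
            scanned.add(u)
            for v, w_uv in graph[u]:   # relax all edges outgoing from the scanned node
                if d_u + w_uv < distance[v]:
                    distance[v] = d_u + w_uv
                    parent[v] = u
                    heapq.heappush(pq, (distance[v], v))
        return distance, parent

For example, on a small hypothetical graph:

    g = {"s": [("a", 2), ("b", 5)], "a": [("b", 1)], "b": []}
    dist, par = dijkstra(g, "s")       # dist == {"s": 0, "a": 2, "b": 3}, par["b"] == "a"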