From a to B with Edsger and Friends
Total Page:16
File Type:pdf, Size:1020Kb
C H A P T E R 9 ■ ■ ■ 2 # 2 The shortest distance between two points is under construction. —Noelie Altito It’s time to return to the second problem from the introduction:1 how do you find the shortest route from Kashgar to Ningbo? If you pose this problem to any map software, you’d probably get the answer in less than a second. By now, this probably seems less mysterious than it (maybe) did initially, and you even have tools that could help you write such a program. You know that BFS would find the shortest path if all stretches of road had the same length, and you could use the DAG shortest path algorithm as long as you didn’t have any cycles in your graph. Sadly, the road map of China contains both cycles and roads of unequal length. Luckily, however, this chapter will give you the algorithms you need to solve this problem efficiently! And lest you think all this chapter is good for is writing map software, consider what other contexts the abstraction of shortest paths might be useful. For example, you could use it in any situation where you’d like to efficiently navigate a network, which would include all kinds of routing of packets over the Internet. In fact, the ’net is stuffed with such routing algorithms, all working behind the scenes. But such algorithms are also used in less obviously graph-like navigation, such as having characters move about intelligently in computer games. Or perhaps you’re trying to find the lowest number of moves to solve some form of puzzle? That would be equivalent to finding the shortest path in its state space—the abstract graph representing the puzzle states (nodes) and moves (edges). Or are you looking for ways to make money by exploiting discrepancies in currency exchange rates? One of the algorithms in this chapter will at least take you part of the way (see Exercise 9-1). Finding shortest paths is also an important subroutine in other algorithms that need not be very graph-like. For example, one common algorithm for finding the best possible match between n people and n jobs2 needs to solve this problem repeatedly. At one time, I worked on a program that tried to repair XML files, inserting start and end tags as needed to satisfy some simple XML schema (with rules such as “list items need to be wrapped in list tags”). It turned out that this could be solved easily by using one of the algorithms in this chapter. There are applications in operations research, VLSI manufacture, robotics—you name it. It’s definitely a problem you want to learn about. Luckily, although some of the 1 Don’t worry, I’ll revisit the “Sweden tour” problem in Chapter 11. 2 The min-cost bipartite matching problem, discussed in Chapter 10. 199 CHAPTER 9 ■ FROM A TO B WITH EDSGER AND FRIENDS algorithm can be a bit challenging, you’ve already worked through many, if not most, of their challenging bits in the previous chapters. The shortest path problem comes in several varieties. For example, you can find shortest paths (just like any other kinds of paths) in both directed and undirected graphs. The most important distinctions, though, stem from your starting points and destinations. Do you want to find the shortest from one node to all others (single source)? From one node to another (single pair, one to one, point to point)? From all nodes to one (single destination)? From all nodes to all others (all pairs)? Two of these—single source and all pairs—are perhaps the most important. Although we have some tricks for the single pair problem (see “Meeting in the middle” and “Knowing where you’re going,” later), there are no guarantees that will let us solve that problem any faster than the general single source problem. The single destination problem is, of course, equivalent (just flip the edges for the directed case). The all pairs problem can be tackled by using each node as a single source (and we’ll look into that), but there are special-purpose algorithms for that problem as well. Propagating Knowledge In Chapter 4, I introduced the idea of relaxation and gradual improvement. In Chapter 8, you saw the idea applied to finding shortest paths in DAGs. In fact, the iterative shortest path algorithm for DAGs (Listing 8-4) is not just a prototypical example of dynamic programming; it also illustrates the fundamental structure of the algorithms in this chapter: we use relaxation over the edges of a graph to propagate knowledge about shortest paths. Let’s review what this looks like. I’ll use a dict of dicts representation of the graph, and a dict D to maintain distance estimates (upper bounds), like in Chapter 8. In addition, I’ll add a predecessor dict, P, as for many of the traversal algorithms in Chapter 5. These predecessor pointers will form a so-called shortest path tree and will allow us to reconstruct the actual paths that correspond to the distances in D. Relaxation can then be factored out in the relax function in Listing 9-1. Note that I’m treating nonexistent entries in D as if they were infinite. (I could also just initialize them all to be infinite in the main algorithms, of course.) Listing 9-1. The Relaxation Operation inf = float('inf') def relax(W, u, v, D, P): d = D.get(u,inf) + W[u][v] # Possible shortcut estimate if d < D.get(v,inf): # Is it really a shortcut? D[v], P[v] = d, u # Update estimate and parent return True # There was a change! The idea is that we look for an improvement to the currently known distance to v by trying to take a shortcut through v. If it turns out not to be a shortcut, fine. We just ignore it. If it is a shortcut, we register the new distance and remember where we came from (by setting P[v] to u). I’ve also added a small extra piece of functionality: the return value indicates whether any change actually took place; that’ll come in handy later (though you won’t need it for all your algorithms). Here’s a look at how it works: >>> D[u] 7 >>> D[v] 13 >>> W[u][v] 3 >>> relax(W, u, v, D, P) 200 CHAPTER 9 ■ FROM A TO B WITH EDSGER AND FRIENDS True >>> D[v] 10 >>> D[v] = 8 >>> relax(W, u, v, D, P) >>> D[v] 8 As you can see, the first call to relax improves D[v] from 13 to 10, because I found a shortcut through u, which I had (presumably) already reached using a distance of 7, and which was just 3 away from v. Now I somehow discover that I can reach v by a path of length 8. I run relax again, but this time, no shortcut is found, so nothing happens. As you can probably surmise, if I now set D[u] to 4 and ran the same relax again, D[v] would improve, this time to 7, propagating the improved estimate from u to v. This propagation is what relax is all about. If you randomly relax edges, any improvements to the distances (and their corresponding paths) will eventually propagate throughout the entire graph—so if you keep randomly relaxing forever, you know that you’ll have the right answer. Forever, however, is a very long time … This is where the relax game (briefly mentioned in Chapter 4) comes in: we want to achieve correctness with as few calls to relax as possible. Exactly how few we can get away with depends on the exact nature of our problem. For example, for DAGs, we can get away with one call per edge—which is clearly the best we can hope for. As you’ll see a bit later, we can actually get that low for more general graphs as well (although with a higher total running time and with no negative weights allowed). Before getting into that, however, let’s take a look at some important facts that can be useful along the way. In the following, assume that we start in node s and that we initialize D[s] to zero, while all other distance estimates are set to infinity. Let d(u,v) be the length of the shortest path from u to v. • d(s,v) <= d(s,u) + W[u,v]. This is an example of the triangle inequality. • d(s,v) <= D[v]. For v other than s, D[v] is initially infinite, and we reduce it only when we find actual shortcuts. We never “cheat,” so it remains an upper bound. • If there is no path to node v, then relaxing will never get D[v] below infinity. That’s because we’ll never find any shortcuts to improve D[v]. • Assume a shortest path to v is formed by a path from s to u and an edge from u to v. Now, if D[u] is correct at any time before relaxing the edge from u to v, then D[v] is correct at all times afterward. The path defined by P[v] will also be correct. • Let [s, a, b, ... , z, v] be a shortest path from s to v. Assume all the edges (s,a), (a,b), … , (z,v) in the path have been relaxed in order. Then D[v] and P[v] will be correct. It doesn’t matter if other relax operations have been performed in between.