Massachusetts Institute of Technology                              Handout 4
6.854J/18.415J: Advanced Algorithms          Wednesday, September 14, 2005
David Karger

Problem Set 1 Solutions

Problem 1. Suppose we have a Fibonacci heap that is a single chain of k−1 nodes. The following operations make a chain of length k. Let min be the current minimum of the Fibonacci heap:

1. Insert items x1 < x2 < x3 < min in an arbitrary order.

2. Delete the minimum, which is x1.

3. Decrease the key of x3 to be −∞, i.e., the minimum.

4. Delete the minimum, which is x3 = −∞.

The second step is the key one: it removes x1, joins x2 and x3 into a chain, and then joins the original chain with the chain containing x2 and x3 (obtaining a tree where x2 is the root, with x3 and the original (k−1)-node chain as children). The third step just removes x3 from the chain, and the last step completely deletes it. The result is that x2 is now the root of the original chain, so we have constructed a chain of length k. For the base case, just insert a single node. Thus, we obtain a k-node chain with O(k) operations; therefore, we can construct an Ω(n)-node chain with n operations. Note that the decrease-key operation was essential for obtaining Ω(n) depth: without it, you can only obtain binomial heaps (which have logarithmic depth).
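The construction above can be checked on a toy implementation. The sketch below is our own minimal Fibonacci heap (insert, delete-min with consolidation, and a simplified decrease-key without cascading cuts, which this construction never triggers); the class and function names are illustrative, not from the handout.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.children = []

class FibHeap:
    def __init__(self):
        self.roots = []

    def insert(self, key):
        n = Node(key)
        self.roots.append(n)
        return n

    def _link(self, a, b):
        # make the larger-key root a child of the smaller-key root
        if b.key < a.key:
            a, b = b, a
        b.parent = a
        a.children.append(b)
        return a

    def delete_min(self):
        m = min(self.roots, key=lambda n: n.key)
        self.roots.remove(m)
        for c in m.children:
            c.parent = None
        self.roots.extend(m.children)
        degrees = {}                     # consolidate: link equal degrees
        for r in self.roots:
            d = len(r.children)
            while d in degrees:
                r = self._link(r, degrees.pop(d))
                d = len(r.children)
            degrees[d] = r
        self.roots = list(degrees.values())
        return m.key

    def decrease_key(self, node, key):
        # simplified: no cascading cuts (x3 is always a child of a root here)
        node.key = key
        p = node.parent
        if p is not None and key < p.key:
            p.children.remove(node)
            node.parent = None
            self.roots.append(node)

def build_chain(n):
    # grow a chain of length n with O(n) operations, as in the solution
    h = FibHeap()
    base = 0.0
    h.insert(base)                       # base case: a single node
    for _ in range(n - 1):
        h.insert(base - 3)               # x1
        x2 = h.insert(base - 2)
        x3 = h.insert(base - 1)
        h.delete_min()                   # removes x1, links x2-x3 and the chain
        h.decrease_key(x3, float('-inf'))
        h.delete_min()                   # removes x3
        base = x2.key                    # x2 is the new chain root
    return h

def chain_length(h):
    assert len(h.roots) == 1
    node, length = h.roots[0], 1
    while node.children:
        assert len(node.children) == 1   # verify it really is a chain
        node = node.children[0]
        length += 1
    return length

print(chain_length(build_chain(10)))     # → 10
```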

Problem 2. For each node, we store a counter of how many of its children have been removed (call count_i the counter of node i). To analyze the running time of the operations, we use the following potential function: φ = #roots + 2(Σ_i count_i)/(k−1).

The insert operation has O(1) amortized cost. Note that φ increases by 1 unit, as in the case of the original Fibonacci heap. Thus, the cost of insert does not change.

The decrease-key operation will have a lower amortized cost. Suppose there are c cascading cuts. Each cut turns a node whose counter had reached k−1 into a root (resetting its counter), and the parent of the last cut node gains one unit of counter. Then, the amortized cost of decrease-key is 1 + c + Δφ, with Δφ = c + (−2c(k−1) + 2)/(k−1). Concluding, the cost of decrease-key is 1 + c + c − 2c + 2/(k−1) = 1 + 2/(k−1). Note that in the original Fibonacci heaps (k = 2), this cost was 1 + 2 = 3.

Conversely, the delete-min operation will have a higher amortized cost. The analysis is the same as in the case of the original Fibonacci heaps. Thus, the amortized cost of delete-min is bounded by the maximum degree of a heap in our data structure.
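As a quick sanity check of this calculation, the dependence on c really does cancel. The snippet below is our own, purely numeric check (k is restricted to values where k−1 divides a power of two, so the floating-point arithmetic is exact):

```python
def amortized_decrease_key(c, k):
    # 1 (real work) + c (cuts) + Δφ, with φ = #roots + 2·Σ count_i / (k−1)
    delta_phi = c + (-2 * c * (k - 1) + 2) / (k - 1)
    return 1 + c + delta_phi

for k in (2, 3, 5):
    costs = {amortized_decrease_key(c, k) for c in range(100)}
    assert costs == {1 + 2 / (k - 1)}    # constant in c: 1 + 2/(k−1)
print(amortized_decrease_key(7, 2))      # → 3.0, the original bound
```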

To analyze the maximum degree, we use arguments similar to those used for the original Fibonacci heaps. Let S_m be the minimum number of nodes in a heap with degree m. We will try to find a recurrence formula for S_m and then lower bound S_m. Consider a node with degree m. Then the degree of its i-th child is at least max{0, i − k}. Considering that S_m = m + 1 for m = 0, …, k − 1, we have that S_m = k + Σ_{i=k}^{m} S_{i−k} for m ≥ k. Next, note that S_m − S_{m−1} = S_{m−k}. The solution for this recurrence is S_m ≥ Ω(λ_k^m), where λ_k is the largest solution to the characteristic equation λ^k − λ^{k−1} − 1 = 0 (note that in the case k = 2, the largest solution, λ_2, is the golden ratio). If M is the highest possible degree of a heap, then we have that S_M ≤ n, meaning that M ≤ log_{λ_k} n = O(log n / log λ_k). Thus, our modification slows down the running time of delete-min by a factor of log λ_2 / log λ_k (λ_k is a decreasing function of k).

Note: a common mistake was to take a potential function that gives a suitable amortized cost for only one operation. Remember that if you use a potential function, you have to check the running time of all operations using the same potential function. Another common mistake was to use for the analysis the same potential function as was used for the original Fibonacci heaps. That function does not give you a lower amortized cost for decrease-key (consider the case when there are no cuts).
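The growth rates λ_k are easy to compute numerically. The bisection code below is illustrative (not from the handout); it confirms that λ_2 is the golden ratio and shows the slowdown factor log λ_2 / log λ_k growing with k.

```python
import math

def lambda_k(k):
    # largest root of λ^k − λ^(k−1) − 1 = 0; writing it as λ^(k−1)(λ−1) − 1
    # shows the polynomial is increasing for λ > 1, with a sign change on
    # (1, 2), so bisection finds the unique (hence largest) root there
    lo, hi = 1.0, 2.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if mid ** k - mid ** (k - 1) - 1 < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

assert abs(lambda_k(2) - (1 + math.sqrt(5)) / 2) < 1e-9   # golden ratio
for k in (2, 3, 5, 10):
    factor = math.log(lambda_k(2)) / math.log(lambda_k(k))
    print(k, round(lambda_k(k), 4), round(factor, 2))
```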

Problem 3. (a) We can augment the priority queue P with a linked list l. We modify the insert operation so it just puts the element in the linked list l. Now we define a consolidate operation that adds the elements of the linked list to the priority queue. We do this by creating a new priority queue P′ containing only the items in the linked list l, using make-heap. This takes O(m) time, where m is the size of the linked list. We then merge the two queues P and P′ in O(log n) time. Therefore, the total consolidation time is O(m + log n).

We modify delete-min to first consolidate, and then call the original delete-min. We modify merge to first consolidate each of the augmented priority queues, and then call the original merge.

Consider a set of initially empty augmented priority queues {P′} (that may be merged later) on which all operations are performed. The potential function φ′ is defined as the sum of the sizes of the lists of each of the priority queues P′. Note that inserting into a particular priority queue takes O(1) amortized time. Delete-min on any particular priority queue also takes only O(log n_h) amortized time, where n_h is the size of that priority queue, since the O(m_h + log n_h) real work to consolidate is decreased to O(log n_h) amortized by the potential released from that queue's list before delete-min is processed. Now, consider the amortized time to merge two of the augmented priority queues. We spend amortized time of O(log n_h) + O(log n_h′) to consolidate each one, plus the real work of merging the two priority queues, which takes O(log n_h) time, assuming n_h > n_h′. The total amortized time, then, is O(log n_h).

(b) The basic idea is to use a heap of heaps (together with a list as in part (a)). The data structure is composed of several binary heaps P_1, …, P_k and a “master” binary heap M. The heaps P_1, …, P_k contain the elements of the data structure. The heap M contains as its elements the heaps P_1, …, P_k, which are keyed (compared) according to the values of their roots.

To insert an element into the data structure, we just add it to the linked list l. Delete-min first does a consolidation (which takes O(m + log n) time, where m is the length of l). Then, delete-min retrieves the “smallest” heap P_i from M. The root of P_i is the minimum element in the data structure. Remove the minimum from P_i (usual heap operation). If P_i is not empty, insert the modified P_i back into M.

In the consolidation step, we construct a binary heap P_{k+1} from l (if l is non-empty) and empty the list l. This can be done in O(m) using a standard heap construction algorithm. To finish the consolidation step, we insert P_{k+1} into M.

To analyze the running time, let the potential function be equal to the length of the list l. Then, insert takes O(1). Consolidation takes O(m + log n) real time, but O(log n) amortized time (since the length of the list decreases by m). Delete-min takes O(log n) amortized time (note that the depth of all heaps, M and P_1, …, P_k, is always O(log n)).
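Part (b) can be sketched with Python's heapq standing in for the binary heaps. The class below is our own illustration (the names and the tie-breaking id counter are implementation choices, not from the handout); the id only keeps heapq from ever comparing two inner heaps directly.

```python
import heapq
from itertools import count

class HeapOfHeaps:
    def __init__(self):
        self.M = []            # master heap of (root_key, id, inner_heap)
        self.l = []            # pending inserts (the list l)
        self._ids = count()

    def insert(self, x):       # O(1): just append to the list
        self.l.append(x)

    def _consolidate(self):    # O(m + log n): heapify l, insert it into M
        if self.l:
            heapq.heapify(self.l)
            heapq.heappush(self.M, (self.l[0], next(self._ids), self.l))
            self.l = []

    def delete_min(self):      # O(m + log n) real, O(log n) amortized
        self._consolidate()
        root, i, P = heapq.heappop(self.M)   # "smallest" inner heap
        heapq.heappop(P)                     # remove its root
        if P:                                # reinsert with its new root key
            heapq.heappush(self.M, (P[0], i, P))
        return root

h = HeapOfHeaps()
for x in [5, 3, 8, 1, 9, 2]:
    h.insert(x)
print([h.delete_min() for _ in range(6)])    # → [1, 2, 3, 5, 8, 9]
```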

Problem 4. Consider the offline algorithm: we process nodes in postorder (i.e., we traverse the nodes using DFS, and process a node only after processing all of its children). When we process a node a, we answer queries (a, b) such that b was processed earlier than a by doing a find in our union-find data structure D; the “name” of the result is the answer to the query. Then we union a with the parent of a, and set the name of the set-representative to be the name of the parent.

The relationship to persistent data structures is as follows. We view the order in which we process the nodes as time. Note that changes to the union-find data structure D occur exactly at the times the nodes are processed, so we can think of the data structure as changing over time: D_1, D_2, …, D_n. Suppose we run the above algorithm, but at each time t we process a node, we save the state of D_t. Now, suppose we wish to answer a query of the form (a, b). Suppose b was processed after a, at time t. Revert to the data structure D_t, and do a “find” of a. This would answer the query (a, b). The goal, then, is to design a persistent version of the union-find data structure to support the following two operations:
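The offline algorithm described above is essentially Tarjan's offline LCA algorithm, and can be sketched directly with a plain union-find. The tree encoding and query format below are our own illustrative choices; `ancestor[root(x)]` plays the role of the set's name.

```python
def offline_lca(children, root, queries):
    n = len(children)
    parent = list(range(n))            # plain union-find with path halving
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ancestor = list(range(n))          # the "name" of each set
    processed = [False] * n
    answers = {}
    qs = {}                            # queries indexed by each endpoint
    for a, b in queries:
        qs.setdefault(a, []).append(b)
        qs.setdefault(b, []).append(a)

    def dfs(u):                        # postorder processing
        for v in children[u]:
            dfs(v)
            parent[find(v)] = u        # union child's set into u's set
            ancestor[find(u)] = u      # name the merged set after u
        processed[u] = True
        for other in qs.get(u, []):
            if processed[other]:       # the other endpoint came earlier
                answers[frozenset((u, other))] = ancestor[find(other)]
    dfs(root)
    return answers

# Example tree: 0 has children 1 and 2; 2 has child 3.
children = [[1, 2], [], [3], []]
ans = offline_lca(children, 0, [(1, 3), (2, 3), (1, 2)])
assert ans[frozenset((1, 3))] == 0
assert ans[frozenset((2, 3))] == 2
assert ans[frozenset((1, 2))] == 0
```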

• find(x,t): Find the name of x's component at time t.

• union(w,p,t): Union the component with name w and the component with name p at time t.

We use the disjoint-forest implementation of the union-find data structure with the union by rank heuristic. For each node, the parent pointer will also store the timestamp t at which the parent pointer became non-null (note that this occurs exactly once for each node in the tree). Therefore, to do find(x,t), we walk up the parent pointers until we find a node whose parent pointer became non-null at a time later than t (or never did). However, we still need to find the name of this component. To do this, we create a log of the operations done on the union-find data structure. The log is an array mapping timestamps to the names of the components unioned. To compute the name at the root node, we look up the name of the parent component corresponding to the timestamp of the last edge traversed. Following parent pointers takes O(log n) time due to union by rank, so the find operation takes O(log n) time.

To do a union(w,p,t), we first do a find(w,t) and a find(p,t). Then we do union by rank and timestamp the added edge with t. Now we need to update the log: a log entry (w, p) is added to the t-th element in the log array. It is clear that the union operation takes O(log n) time, so the preprocessing takes O(n log n) time.
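A sketch of this timestamped structure in Python, with representation choices that are our own (components are named by element id, and the log keeps just the surviving name of each union rather than the full pair):

```python
class TimedUnionFind:
    # Partially persistent union-find: parent pointers never change once
    # set, and each carries the time at which it was set.  There is no
    # path compression (it would rewrite history); union by rank alone
    # keeps paths O(log n).
    def __init__(self, n):
        self.parent = [None] * n
        self.ptime = [None] * n      # time each parent pointer was set
        self.rank = [0] * n
        self.log = {}                # time -> surviving component name

    def find(self, x, t):
        last = None                  # time of the last edge traversed
        while self.parent[x] is not None and self.ptime[x] <= t:
            last = self.ptime[x]     # timestamps strictly increase going up
            x = self.parent[x]
        return x if last is None else self.log[last]

    def _root(self, x, t):
        while self.parent[x] is not None and self.ptime[x] <= t:
            x = self.parent[x]
        return x

    def union(self, w, p, t):
        # merge the components of w and p at time t; the result takes
        # the name of p's component (as in the offline algorithm)
        rw, rp = self._root(w, t), self._root(p, t)
        if rw == rp:
            return
        name = self.find(p, t)
        if self.rank[rw] > self.rank[rp]:   # union by rank
            rw, rp = rp, rw
        self.parent[rw] = rp
        self.ptime[rw] = t
        if self.rank[rw] == self.rank[rp]:
            self.rank[rp] += 1
        self.log[t] = name

uf = TimedUnionFind(4)
uf.union(0, 1, 1)          # at time 1, component {0, 1} is named 1
uf.union(2, 1, 2)          # at time 2, component {0, 1, 2} is named 1
print(uf.find(0, 0), uf.find(0, 1), uf.find(2, 1), uf.find(2, 2))  # → 0 1 2 1
```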