
CMPT 307 Notes on Greedy Algorithms (Should be read in conjunction with the lecture notes and the text.)

1 Introduction

A greedy algorithm builds a solution to a problem in steps. At each iteration, it adds a part of the solution. Which part to add next is determined by a greedy rule: among all the possibilities, choose the best one. The algorithm never backtracks or changes past choices. Depending on the problem, the greedy strategy may or may not work. However, there are many important problems that can be solved using the greedy strategy. We consider the following problems.

1. Coin Changing
2. Huffman Codes (Chapter 16.3)
3. Minimum-cost Spanning Trees (Chapter 23)
4. Single-Source Shortest Paths (Chapter 24)

A greedy algorithm proceeds step by step. Initially, the selected part of the solution is empty. At each step some candidate is added to the partial solution already obtained. This addition is guided by a selection function which uses a greedy rule. If, after the addition, the extended partial solution is no longer feasible, the candidate just added is removed, and it is never considered again. If the extended partial solution is still feasible, the added candidate stays. The process is repeated until a solution to the original problem is obtained. The generic function Greedy can be described as follows:

    function Greedy(C: set): set
      { C is the set of all candidates for the solution }
      S ← ∅                         { S stores the partial solution }
      while not solution(S) and C ≠ ∅ do
        x ← an element (or elements) of C chosen using the greedy rule select(x)
        remove x from C             { in some cases new elements get added to C as well }
        if feasible(S ∪ {x}) then S ← S ∪ {x}
      if solution(S) then return S
      else return “There is no solution”
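For readers who prefer running code, here is the same template as a Python sketch. The callback names mirror the pseudocode; this skeleton is our own rendering, not from the text, and it uses sets, so it fits problems whose candidates are distinct items.

    def greedy(candidates, solution, feasible, select):
        """Generic greedy skeleton; the callbacks supply the problem-specific parts."""
        C = set(candidates)
        S = set()
        while not solution(S) and C:
            x = select(C)            # greedy rule: pick the best remaining candidate
            C.remove(x)              # x is never considered again
            if feasible(S | {x}):
                S = S | {x}
        return S if solution(S) else None   # None plays the role of "no solution"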

2 Coin Changing

We are given an unlimited number of coins of denominations 1, 5, 10, 25. We want to give change to a customer using the smallest possible number of coins. The following greedy strategy is applied:

• candidate set: an unlimited set of coins of denominations 1, 5, 10, 25.
• solution: the total value of the chosen set of coins is exactly the amount we have to pay.
• feasible set: the total value of the chosen set does not exceed the amount to be paid.
• selection function (greedy rule): choose the highest-denomination coin whose value does not exceed the balance of the change.
• objective function: the number of coins used in the solution.
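To make the rule concrete, here is a minimal Python sketch of this strategy (the function name and interface are ours, not from the text):

    def greedy_change(amount, denominations=(25, 10, 5, 1)):
        """Make change with the greedy rule: always take the largest coin that fits."""
        coins = []
        for d in sorted(denominations, reverse=True):
            while amount >= d:
                coins.append(d)          # take one coin of denomination d
                amount -= d
        if amount != 0:                  # cannot happen while 1 is a denomination
            raise ValueError("no exact change possible")
        return coins

    print(greedy_change(67))             # [25, 25, 10, 5, 1, 1] -- six coins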

Theorem 1: The greedy algorithm is optimal for denominations 1, 5, 10, 25.

Proof: We use induction to prove that, to make change for an amount A, the greedy output and an optimal solution use the same number of coins. The cases A = 1, 2, 3, . . . , 24 can be readily verified. Suppose the greedy algorithm gives an optimal solution for every amount k < A, and suppose A ≥ 25. Let Opt be an optimal solution, and let b25, b10, b5 and b1 denote the respective numbers of coins of denominations 25, 10, 5 and 1 in Opt. We claim that Opt must use a coin of denomination 25, and we prove this by contradiction. Suppose that b25 = 0. Clearly b10 ≤ 2, b5 ≤ 1, and b1 ≤ 4; otherwise it is possible to improve the optimal solution (three 10s can be replaced by 25 + 5, two 5s by a 10, and five 1s by a 5). It is also not possible to have the combination b10 = 2 and b5 = 1; otherwise we could replace the two 10-denomination coins and the 5-denomination coin by a single 25-denomination coin. Therefore, the possible combinations for the 10 and 5 denomination coins are: b10 = 2, b5 = 0; b10 = 1, b5 = 1; b10 = 1, b5 = 0; b10 = 0, b5 = 1; and b10 = 0, b5 = 0. In all these cases, 1·b1 + 5·b5 + 10·b10 is at most 24, which is less than 25 ≤ A. Therefore, none of the above combinations can make change for A ≥ 25. Hence the assumption b25 = 0 is false. Since Opt uses a 25-denomination coin and the greedy algorithm also takes one, the greedy strategy gives optimal change for A − 25 by the induction hypothesis. Therefore, adding a coin of denomination 25 to the greedy output for A − 25 yields the greedy output for A. Thus, the greedy solution and the optimal solution use the same number of coins. □

Note: The above strategy does not work if there also exists a coin of denomination 12: for A = 29 the greedy algorithm outputs 25 + 1 + 1 + 1 + 1 (five coins), while 12 + 12 + 5 uses only three. We now describe a dynamic programming approach that solves the coin change problem for any list of k coins (d1, d2, . . . , dk) with d1 = 1 and di < di+1 for all i.

Problem: Given a list of k coins (d1, d2, . . . , dk) and a number n, we want to find nonnegative integers (b_{d1}, b_{d2}, . . . , b_{dk}) such that n = Σ_{i=1}^{k} d_i·b_{d_i} and Σ_{i=1}^{k} b_{d_i} is minimal.

Our subproblems consist of the optimal change for the amounts 1 through n. To keep track of the optimal solution for each subproblem we use an array bSum indexed by subproblem (i.e. bSum[i] contains the least number of coins needed to make change for i). The recurrence relation for bSum[i] can be described as follows:

bSum[d1] = bSum[d2] = · · · = bSum[dk] = 1
bSum[i] = min_{1 ≤ j ≤ k} { bSum[i − dj] + 1 }

In the above, we ignore the case i − dj < 0. The top-down and bottom-up algorithms should be easy to write. (Make sure that you know this.) There are O(n) subproblems, and each subproblem takes O(k) time to solve. Therefore, the dynamic programming solution of the coin change problem for any set of denominations runs in O(nk) time. The storage space requirement is O(n).
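Here is a bottom-up sketch in Python. The array name bSum follows the notes; the extra convention bSum[0] = 0 (our assumption) makes the base cases bSum[dj] = 1 fall out of the recurrence automatically:

    import math

    def min_coins(n, denoms):
        """bSum[i] = least number of coins that make change for i (bottom up)."""
        bSum = [0] + [math.inf] * n      # bSum[0] = 0: zero coins for amount 0
        for i in range(1, n + 1):
            for d in denoms:
                if d <= i:               # ignore the case i - d < 0
                    bSum[i] = min(bSum[i], bSum[i - d] + 1)
        return bSum[n]

    # Greedy fails for amount 29 when a 12-coin exists; the DP finds 12+12+5.
    print(min_coins(29, [1, 5, 10, 12, 25]))    # 3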

2.1 Problems

1. Show that the greedy strategy for the coin set (1, 5, 10, 25) is optimal for any amount less than 25. (In the induction proof we mentioned this is easy to verify.)

2. Determine whether greedy strategy is optimal for each of the following coin sets. If it is optimal give an argument to support your answer. If greedy strategy is not optimal, give a counterexample.

(a) (1,4,10) (b) (1,5,10,25,50) (c) (1,5,14,18)

3. Show that the greedy strategy is optimal for the coin set (d1, d2, . . . , dk), d1 < d2 < . . . < dk, and di−1 divides di, for i = 2, . . . , k.

3 Huffman Code

We are given an alphabet (a set of characters, e.g. English characters) and a string made up of these characters. The objective is to find binary codes for the characters that minimize the total length of the encoded string. The general strategy is to encode frequently used characters with short binary codes and use longer binary codes for infrequently appearing characters. Suppose we use the following encoding: B = 110, C = 010, D = 010110. When the string “010110” is decoded, we cannot say whether the string is “CB” or “D”. Therefore, the additional requirement on the encoding is that no code can be a prefix of another code. In the above, the code for C is a prefix of the code for D. The encoding could be of fixed length or of variable length. Consider the following table:

character              a     b     c     d     e     f
frequency             45    13    12    16     9     5
probability          .45   .13   .12   .16   .09   .05
variable-length code   0   101   100   111  1101  1100

For a string of length n, the expected length of the encoding is n·(1·.45 + 3·.13 + 3·.12 + 3·.16 + 4·.09 + 4·.05) = 2.24n bits.

Huffman’s algorithm. The idea is to build a binary tree T, with the leaves storing the characters, such that

Σ_{c ∈ alphabet} Pr(c) · d_T(c)

is minimum, where Pr(c) denotes the probability of the character c appearing in the text and d_T(c) denotes the depth of the leaf node storing c in T. The algorithm can be informally described as follows:

• The algorithm starts with a forest of |C| trees, each consisting of a single node labelled with a character and a weight (= character’s probability).

• While there is more than one tree in the forest do:

– Choose two trees, X and Y, with least weight.
– Construct a new tree Z with X and Y as its left and right children, respectively. Make the weight of Z the sum of the weights of X and Y.
– Delete X and Y from the forest; add Z.

• (Building of the tree is complete) Label the left links with “0” and the right links with “1”.

• Encode each character by concatenating the labels of the links on the path from the root to the leaf storing the character.
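Here is a compact Python sketch of the forest-merging loop, using heapq for EXTRACT-MIN. An insertion counter breaks ties between equal weights, so the codes produced may differ from the table above, but the expected length is the same. The names and representation are ours:

    import heapq

    def huffman_codes(freq):
        """freq maps character -> weight; returns character -> binary code string."""
        # Each heap entry is (weight, tiebreak, tree); a tree is a character
        # (leaf) or a pair (left, right) of trees (internal node).
        heap = [(w, i, c) for i, (c, w) in enumerate(freq.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            w1, _, x = heapq.heappop(heap)      # two trees X, Y of least weight
            w2, _, y = heapq.heappop(heap)
            heapq.heappush(heap, (w1 + w2, counter, (x, y)))   # new tree Z
            counter += 1
        codes = {}
        def walk(tree, code):
            if isinstance(tree, tuple):         # left links "0", right links "1"
                walk(tree[0], code + "0")
                walk(tree[1], code + "1")
            else:
                codes[tree] = code or "0"       # lone-character alphabet edge case
        walk(heap[0][2], "")
        return codes

    freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
    codes = huffman_codes(freq)
    total = sum(freq.values())
    print(sum(freq[c] * len(codes[c]) for c in freq) / total)   # 2.24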

It is easy to see that the algorithm produces a prefix code; this follows from the fact that the characters are all stored at the leaf nodes. Decoding is done by following the path dictated by the given code. The algorithm to construct T is formally described in the text (HUFFMAN(C), page 388). A priority queue is needed to implement EXTRACT-MIN(Q).

The correctness of the above greedy algorithm follows from the lemma below.

Lemma 1: Let C be an alphabet in which every character c ∈ C has a weight w[c]. Let x and y be two characters in C having the smallest weights. Then there exists an optimal prefix code of C in which the codewords for x and y have the same length and differ only in the last bit.

(The proof is given in the text, Lemma 16.2, page 388. We give a sketch below; our notation is slightly different. Please refer to Figure 16.6 on page 390 of the text: a and b are sibling leaves of maximum depth in an optimal tree T, the tree T′ is obtained from T by exchanging a with x, and T″ is obtained from T′ by exchanging b with y.) Let A(T) denote the expected length of the encoding of a character of the alphabet C using the prefix code given by T:

A(T) = Σ_{c ∈ C} (length of the encoding of c) × (probability of c in the text).

Now

A(T′) = A(T) − w(a)·d_T(a) − w(x)·d_T(x) + w(a)·d_{T′}(a) + w(x)·d_{T′}(x).

Since d_T(x) = d_{T′}(a) and d_T(a) = d_{T′}(x),

A(T′) = A(T) − (w(a) − w(x))·(d_T(a) − d_T(x)).

Therefore A(T) − A(T′) ≥ 0, since w(a) ≥ w(x) and d_T(a) ≥ d_T(x). Similarly, we can show that A(T′) − A(T″) ≥ 0. Therefore A(T) − A(T″) ≥ 0, so T″ also gives an optimal prefix code, and in T″ the leaves x and y are siblings. □

3.1 Problems

1. 16.3-1, 16.3-2, 16.3-7.

2. Prof. B needs to store text made up of the characters A with frequency 6, B with frequency 2, C with frequency 3, D with frequency 2, and E with frequency 8. Prof. B suggests using the variable-length codes A: “1”, B: “00”, C: “01”, D: “10”, E: “0”, which, he argues, store the text in less space than that used by an optimal Huffman code. Is the professor correct? Explain.

4 Minimum-cost spanning tree (MST)

Let G = (V, E) be a connected undirected graph with a cost associated with each edge. A spanning tree of G is an undirected tree that connects all the vertices in V. The cost of a spanning tree is the sum of the costs of its edges. Our objective is to find a spanning tree of G with minimum cost. The following two lemmas are important for any algorithm that computes an MST of G.

Lemma 1: Let G = (V, E) be a connected undirected graph and GT = (V, T ) a spanning tree of G. Then

1. ∀v1, v2 ∈ V , the path between v1 and v2 in GT is unique, and

2. if any edge in E − T is added to GT , a unique cycle results.

Proof: The first part of the lemma is easy: GT is a tree, and more than one path between a pair of vertices would result in a cycle. For the second part, if e = (v1, v2) ∈ E − T is added, there exist two paths between v1 and v2, resulting in a cycle. □

Lemma 2: Let G = (V, E) be a connected, undirected, weighted graph. Let A be a subset of E that is included in some minimum-cost spanning tree of G. Let (S, V − S) be any cut (or partition) of G that respects A (i.e. both endpoints of each edge in A lie in S, or both lie in V − S). Let e = (u, v) be a lowest-cost edge of E with one endpoint in S and the other in V − S. Then A ∪ {(u, v)} is included in some minimum-cost spanning tree of G.

Proof: Let T be a minimum-cost spanning tree that includes the edges of A. Suppose, to the contrary, that GT = (V, T) does not include the lowest-cost edge e = (u, v) between S and V − S. By the previous lemma, the addition of e = (u, v) to GT forms a cycle. See Figure 23.3 of the text (page 565). The cycle must contain an edge e′ = (x, y) ∈ T crossing S and V − S. Clearly, e′ is not an element of A. The weight of (x, y), w((x, y)), is greater than or equal to w((u, v)) (why?). Consider the graph G_{T′} = (V, T′), where T′ = T − {e′} ∪ {e}. G_{T′} is a tree, and

w(G_{T′}) = w(G_T) − w(e′) + w(e) ≤ w(G_T), since w(e′) ≥ w(e).

Since T is a minimum-cost spanning tree, this forces w(T′) = w(T). Therefore, T′ is also a minimum-cost spanning tree, and it contains the edges of A ∪ {(u, v)}. □

4.1 Kruskal’s Algorithm

Let G = (V, E) be a connected weighted graph. There are two standard ways to represent a graph: as a collection of adjacency lists or as an adjacency matrix. The adjacency-list representation is usually preferred if the graph is sparse (|E| is much less than |V|²). The adjacency-matrix representation may be preferred when the graph is dense (|E| is close to |V|²). (Read Section 22.1 of the text for more details.)

The set T of edges is initially empty. In each iteration of the algorithm an edge is selected and added to T. At every instant the partial graph formed by the vertices of G and the edges in T is a forest (several connected components). When T is empty, each vertex of G forms a distinct trivial connected component. The elements of T that belong to a connected component form a minimum-cost spanning tree for the vertices of that component. At the end of the algorithm only one connected component remains, so T is an MST of G. We consider the edges of G in order of increasing cost. If the selected edge joins two nodes in different connected components, we add the edge to T; the two components are then combined into one. If the selected edge joins two nodes in the same component, it is not added to T; otherwise, a cycle would be created.

The algorithm MST-KRUSKAL is described in the text (page 569). The description uses the functions MAKE-SET, FIND-SET and UNION. MAKE-SET(v) creates a component containing only the node v; FIND-SET(v) finds the component that contains v; UNION(u, v) merges the two components, one containing u and the other containing v. These operations require disjoint-set structures, which are discussed in Chapter 21 (we have not covered it). I will describe a data structure in class which allows an efficient implementation of the three operations: each FIND-SET operation costs O(log n) time, and each UNION operation takes O(1) time.

Analysis: How long does Kruskal’s algorithm take? Let n = |V| and m = |E|. Since the graph is connected, we may assume m ≥ n − 1. Observe that it takes O(m log m) time to sort the edges. The for loop is iterated m times, and each iteration involves a constant number of accesses to the union-find data structure on a collection of n items. Therefore Kruskal’s algorithm can be implemented in O((m + n) log n) time. Although this does not change the worst-case analysis, it is preferable to keep the edges in a min-heap; building the heap takes O(m) time. Edges are then extracted in sorted order one at a time, and as soon as the number of components is reduced to one, the algorithm stops.
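Here is a Python sketch of the whole procedure, with a simple union-find (union by rank and path compression, in the spirit of Chapter 21). All names here are ours:

    def kruskal(n, edges):
        """Vertices are 0..n-1; edges is a list of (cost, u, v). Returns T."""
        parent = list(range(n))          # MAKE-SET for every vertex
        rank = [0] * n

        def find(x):                     # FIND-SET with path compression
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        def union(x, y):                 # UNION by rank
            rx, ry = find(x), find(y)
            if rank[rx] < rank[ry]:
                rx, ry = ry, rx
            parent[ry] = rx
            if rank[rx] == rank[ry]:
                rank[rx] += 1

        T = []
        for cost, u, v in sorted(edges):        # edges in order of increasing cost
            if find(u) != find(v):              # endpoints in different components
                union(u, v)
                T.append((cost, u, v))
                if len(T) == n - 1:             # one component left: stop early
                    break
        return T

    print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))
    # [(1, 0, 1), (2, 1, 2), (4, 2, 3)] -- the edge (3, 0, 2) would close a cycle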

4.2 Problems

1. What can you say about the time required by Kruskal’s algorithm if, instead of providing a list of edges, the user supplies a matrix of distances, leaving to the algorithm the job of working out which edges exist?

2. Let G be a connected, weighted graph, and let v be a vertex of G. Suppose that the weights of the edges incident on v are distinct. Let e be the edge of minimum weight incident on v. Show that e is contained in every minimum-cost spanning tree of G.

3. 23.2-1, 23.2-2, 23.2-8.

5 Prim’s algorithm

Prim’s algorithm is another greedy algorithm. Unlike Kruskal’s algorithm, whose partial solutions are not necessarily connected, a partial solution of Prim’s algorithm is a tree. Prim’s algorithm begins with a start vertex s. Consider the partition (S, V − S) where S = {s}. We then find the lowest-cost edge with one endpoint in S (in the current tree) and the other endpoint in V − S. This edge is added to T, and S is updated to contain all the vertices touched by the edges in T. The process is repeated until T has |V| − 1 edges (i.e. S = V). The edges in T then determine a minimum-cost spanning tree. The correctness follows from Lemma 2, described earlier. Page 572 of the text formally describes Prim’s algorithm.

The key questions in an efficient implementation of Prim’s algorithm are how to update the partition (or cut) efficiently, and how to determine the minimum-cost cross edge quickly. To do this we make use of a priority queue data structure. The priority queue supports three operations.

insert(u,key): Insert u with the key value key in Q.

extractMin(): Extract the item with the minimum key value in Q.

decreaseKey(u, newKey): Decrease the value of u’s key to newKey.

A heap data structure can be used for our purpose. All three operations can be performed in O(log n) time, where n is the number of items in the heap. What do we store in the priority queue? Initially, we might think that we should store the edges that cross the cut, since that is what we select at each step of the algorithm. The problem is that when a vertex, say v, is moved from one side of the cut to the other side, this results in a complicated sequence of updates: we would need to delete from the priority queue all edges between v and S (they are no longer cross edges) and insert all edges between v and V − S (they have just become cross edges). There is a much more elegant solution, and that is what makes Prim’s algorithm so nice. With each vertex u ∈ V − S (not yet part of the current spanning tree) we associate a key value key[u], which is the weight of the lightest edge going from u to any vertex in S. We also store in π[u] the endpoint of this edge in S. If there is no edge from u to a vertex in S, we set key[u] = +∞. We will also need to know which vertices are in S and which are not. We do this by coloring the vertices in S black, which can be implemented using a color array.

MST-PRIM-Modified(G, w, r)
  for each u ∈ V do             { initialization }
    key[u] ← +∞
    color[u] ← white
  key[r] ← 0                    { start at the root }
  π[r] ← nil
  Q ← new PriQueue(V)           { put all the vertices in Q }
  while Q.nonEmpty() do         { until all the vertices are processed }
    u ← Q.extractMin()          { vertex with the lightest cross edge }
    for each v ∈ Adj[u] do
      if color[v] = white and w(u, v) < key[v] then
        key[v] ← w(u, v)        { new lighter edge out of v }
        Q.decreaseKey(v, key[v])
        π[v] ← u
    color[u] ← black
  { The array π defines the MST as an inverted tree rooted at r }
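Here is a Python sketch of MST-PRIM-Modified. Python’s heapq has no decreaseKey, so the sketch pushes the improved key and skips stale entries on extraction (lazy deletion), which gives the same O((n + m) log n) bound; the names are ours:

    import heapq

    def prim(adj, r):
        """adj maps u -> list of (v, w) pairs. Returns the predecessor array pi."""
        key = {u: float("inf") for u in adj}
        pi = {u: None for u in adj}
        black = set()                       # vertices already in the tree
        key[r] = 0
        Q = [(0, r)]                        # (key, vertex); duplicates allowed
        while Q:
            _, u = heapq.heappop(Q)
            if u in black:                  # stale entry: key was decreased later
                continue
            black.add(u)
            for v, w in adj[u]:
                if v not in black and w < key[v]:
                    key[v] = w              # new lighter edge crossing the cut
                    pi[v] = u
                    heapq.heappush(Q, (w, v))    # stands in for decreaseKey
        return pi

    adj = {0: [(1, 4), (2, 1)], 1: [(0, 4), (2, 2)], 2: [(0, 1), (1, 2)]}
    print(prim(adj, 0))                     # {0: None, 1: 2, 2: 0}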

To analyze Prim’s algorithm, we account for the time spent on each vertex as it is extracted from the priority queue. Here we use n = |V| and m = |E|. It takes O(log n) time to extract a vertex u from Q. For each incident edge, we spend potentially O(log n) time decreasing the key of the adjacent vertex. Thus the time per vertex u is O((1 + degree of u) · log n). Since the total degree in G is 2m, the total running time is O((n + m) log n).

5.1 Problems

1. What happens (a) in the case of Prim’s algorithm, (b) in the case of Kruskal’s algorithm, if we allow edges with negative lengths? Is it still sensible to talk about minimal spanning trees if edges with negative lengths are allowed?

2. A graph may have several different minimal spanning trees. Where is this possibility reflected in the algorithms?

3. Are there graphs where Prim’s algorithm is slower than Kruskal’s algorithm?

4. 23-1 (page 575)

5. Is the path between a pair of vertices in a min-cost spanning tree of an undirected graph necessarily a min-cost path? Is this true if the min-cost spanning tree is unique?

6. Suppose Prim’s algorithm and Kruskal’s algorithm choose the lexicographically first edge first. That is, if a choice must be made between two distinct edges, e1 = (u1, v1) and e2 = (u2, v2), then the following strategy is used. Suppose the vertices are numbered 1, 2, . . . , n. Choose e1 if u1 < u2, or u1 = u2 and v1 < v2. Otherwise choose e2. Prove that under these conditions, both algorithms construct the same minimum-cost spanning tree.

7. Design an algorithm for the max-cost spanning tree.

6 Single-Source Shortest Paths

Problem: Given a weighted graph G = (V, E) (directed or undirected), find shortest paths from a single source vertex s to all other vertices of G. As in the all-pairs shortest path problem, we allow edges with negative weights, but negative cycles in G are not allowed. Let δ(u, v) denote the shortest-path weight from u to v. Then

δ(u, v) = min { w(P) : P is a path from u to v },  or ∞ if there is no such path,

where the weight of a path P, w(P), is the sum of the weights of the edges on the path P. There are many variants of the shortest path problem:

• Single-destination shortest path problem This is just the reverse of the single source shortest path problem.

• Single-pair shortest path problem

• All-pairs shortest path problem

All shortest path problems have the following optimal substructure property: if a shortest path from u to v goes via the vertex z, then δ(u, v) = δ(u, z) + δ(z, v). Dijkstra’s algorithm is a greedy algorithm. (Remember the generic Greedy algorithm.) The sets C and S are, respectively, the set of available candidate nodes (still to be selected) and the set of nodes already chosen. At every moment S contains those nodes whose shortest-path distances from the source s are already known. At the outset S contains only s, and C = V − S. At each step of the algorithm we choose the node in C whose distance from the source is least, and we add it to S.

Shortest Path Tree

• A shortest path tree from source s is a subgraph G′ = (V′, E′) such that:

  1. V′ is the set of vertices reachable from s.
  2. G′ is a tree rooted at s.
  3. The unique path from s to a vertex v in G′ is a shortest path in G.

• The shortest path tree may be given by predecessors π[v]. The shortest path from s to v can be constructed from the predecessor array (the path is given in reverse order) as ⟨v, π[v], π[π[v]], π[π[π[v]]], . . . , s⟩.

We say that a path from s to some node is special if all the intermediate nodes along the path belong to S. At each step of the algorithm, an array d contains the length of the shortest special path to each node of the graph. The initialization step is:

Initialize-Single-Source(G, s)
  for each v ∈ V do
    d[v] ← ∞
    π[v] ← Nil
  d[s] ← 0

Dijkstra’s algorithm is given in the text (page 595). We describe below a version that avoids using RELAX.

DIJKSTRA(G, s)
 1. Initialize-Single-Source(G, s)
 2. C ← V
 3. S ← ∅                          { greedy loop follows }
 4. while S ≠ V do
 5.   u ← a vertex in C with minimum d[u]
 6.   C ← C − {u}
 7.   S ← S ∪ {u}
 8.   for each v ∈ C with (u, v) ∈ E do    { cross edges of the type (u, ∗) }
 9.     d[v] ← min(d[v], d[u] + w(u, v))
10.     π[v] ← u if d[v] changed its value in line 9 (i.e. the predecessor vertex changed)
11. return d and π
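A Python sketch of the algorithm follows; as in the Prim sketch, pushing updated distances and skipping stale heap entries stands in for selecting the minimum in line 5. A small helper (ours) also rebuilds a path from π as described earlier; note that vertices unreachable from s keep d = ∞:

    import heapq

    def dijkstra(adj, s):
        """adj maps u -> list of (v, w) with w >= 0. Returns (d, pi)."""
        d = {u: float("inf") for u in adj}
        pi = {u: None for u in adj}
        d[s] = 0
        done = set()                        # the set S of finished vertices
        Q = [(0, s)]
        while Q:
            _, u = heapq.heappop(Q)
            if u in done:                   # stale entry
                continue
            done.add(u)
            for v, w in adj[u]:
                if d[u] + w < d[v]:         # line 9: d[v] = min(d[v], d[u] + w)
                    d[v] = d[u] + w
                    pi[v] = u
                    heapq.heappush(Q, (d[v], v))
        return d, pi

    def path(pi, s, v):
        """Rebuild <v, pi[v], ..., s> and reverse it; None if v is unreachable."""
        p = []
        while v is not None:
            p.append(v)
            v = pi[v]
        return p[::-1] if p[-1] == s else None

    adj = {"s": [("a", 2), ("b", 5)], "a": [("b", 1)], "b": []}
    d, pi = dijkstra(adj, "s")
    print(d["b"], path(pi, "s", "b"))       # 3 ['s', 'a', 'b']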

Analysis of the algorithm: Suppose G has n vertices and m edges. The initialization step takes O(n) time. If d and π are stored in arrays, each iteration of the while loop takes O(n) time to select u. Since the edges incident on u are looked at only once (when u moves from V − S to S), Dijkstra’s algorithm can be implemented in a straightforward way in O(n²) time. This implementation is efficient if the graph is dense (i.e. m ∈ Θ(n²)). If m ∈ o(n²) (the graph is sparse), we can maintain the elements of d in a heap. Each extraction of the minimum value of the heap and each decrease of the value of an element of the heap can be performed in O(log n) time. In this way Dijkstra’s algorithm can be implemented in O((m + n) log n) time. If the graph is connected then m ≥ n − 1, and in this case Dijkstra’s algorithm is implementable in O(m log n) time.

What happens when some vertices are not reachable from the source vertex? In this case the shortest path cost from s to a non-reachable vertex is ∞. The algorithm formally described above indicates that the while loop is executed until S = V, even when some of the vertices are not reachable from s. What happens to the algorithm then?

Correctness of Dijkstra’s algorithm: Dijkstra’s algorithm is correct; that is, it finds shortest paths from a source vertex to all of the other vertices.

Proof: We use induction to show that at each iteration of Dijkstra’s algorithm, when u is removed from V − S, d[u] is the length of a shortest path from s to u. When we begin, u = s, d[u] = 0, and the length of a shortest path from s to s is zero, so the basis step is correct. We now assume that u has just moved from V − S to S and that for all vertices i previously moved to S, d[i] is the length of a shortest path from s to i, i.e. d[i] = δ(s, i).

First we show that if there exists a vertex z with a path from s of length less than d[u], then z was previously moved to S. Suppose, by way of contradiction, that z is still in V − S. Consider a shortest path P from s to z. Let z′ be the first vertex of V − S encountered on P as one moves from s to z. (The figure is drawn in class.) Let z″ be the predecessor of z′ on P. Then z″ is not in V − S, so d[z″] = δ(s, z″) by the induction hypothesis. Therefore,

d[z′] ≤ d[z″] + w(z″, z′) ≤ w(P) < d[u].

This shows that u is not the vertex of V − S with minimum d value (why?). This contradiction completes the proof that, if there is a path from s to a vertex z whose length is less than d[u], then z is already in S (i.e. already chosen).

The above claim establishes the fact that any path from s to a vertex in V − S has length at least d[u]. By construction, there is a path from s to u of cost d[u], so this is a shortest path from s to u. This completes the proof. □

6.1 Problems

1. Show, by giving an example, that if the edge lengths are allowed to be negative, Dijkstra’s algorithm does not always work correctly. I am assuming that there is no negative cycle in the graph.

2. Show that the single-source shortest paths constructed by Dijkstra’s algorithm on a connected undirected graph form a spanning tree.

3. Is the spanning tree formed by Dijkstra’s algorithm on a connected undirected graph a min-cost spanning tree?

4. Suppose we want to solve the single-source longest path problem. Can we modify Dijkstra’s algorithm to solve this problem by changing minimum to maximum? If so, then prove your algorithm is correct. If not, provide a counterexample.

5. Problems 24.3-2, 24.3-4, 24.3-8 of the text (page 601).
