Data Structures and Algorithms
Algorithm Design: Greedy Methods and Divide & Conquer
See references in Goodrich & Tamassia to Greedy Methods & Divide & Conquer:
• Optimisation Problems
• Greedy Methods
• Proving Greedy Methods Optimal
• Divide & Conquer

Programs = Algorithms + Data Structures (Niklaus Wirth)

Algorithms are often designed using common techniques, including:
• Greedy Methods
• Divide & Conquer
• Dynamic Programming
• Backtracking
• Brute Force
• Branch & Bound
• Linear Programming
• Integer Programming
• Neural Networks
• Genetic Algorithms
• Simulated Annealing
• ...

Optimisation Problems

Optimisation problems comprise:
• A set of constraints
• An optimisation function
A solution that satisfies the constraints is feasible. A feasible solution with the best possible value for the optimisation function is optimal.

Many real world problems can be formulated as optimisation problems. Operations Research studies such problems, amongst others, for large organisations like airlines, strategic commands, etc. Example problems:
• Find the shortest journey, visiting Edinburgh, Newcastle, Carlisle, Inverness, Glasgow.
• Find the maximum volume of arbitrary-sized crates that can be packed into a container.

Change Making

Intuition: attempt to construct a sum using the minimum number of coins.
A set of coin denominations, e.g. {1, 2, 5, 10, 20, 50}
A target, e.g. 67p
A solution is a sequence of coins c_i ∈ denominations, e.g. [20, 20, 20, 2, 2, 2, 1] or [20, 20, 20, 5, 1, 1]
The constraint is Σ_i c_i = target
The optimisation function is the length of the sequence, e.g. 7 or 6
The optimal solution has the minimum sequence length; of the two example solutions, the second (length 6) is the better.

Greedy Algorithm Design Technique

Key idea: Attempt to construct the optimal solution by repeatedly taking the 'best' feasible choice at each stage.

Greedy Alg. 1: Change Making
Choose the largest possible denomination coin at each stage:

  Target    Coin Selected
  67

Optimal solution =

Exercise: Trace the algorithm for target 48p, giving the optimal solution.

Greedy Alg. 2: Simple Shortest Path
Choose the cheapest edge to traverse from the end of the current path.

     40
  1 ----> 2
  |\      ^\
20| \30 10| \50
  |  \    |  \
  |   \   |   > 5
  |    \  |  /
  v     v | / 30
  3 ----> 4
     20

simpleShortestPath(1,5) =        length =
ShortestPath(1,5) =              length =

Dijkstra's Shortest Path: Choose the cheapest edge to traverse from any reached vertex.

Is the Greedy Algorithm Optimal?

It is usually easy to produce a greedy algorithm for a problem. However the algorithm may not always find an optimal solution. Hence it is essential to prove that a proposed algorithm does find an optimal solution.

Change Making Correctness Proof (by contradiction)

Fifties, twenties, tens, fives, twos and pennies are currency denominations. Twenties are smaller denominations than fifties; tens are smaller denominations than twenties; etc.

Let F10, T10, T, F, TW and P, respectively, be the number of fifties, twenties, tens, fives, twos and pennies in the change generated by the greedy algorithm. Let f10, t10, t, f, tw and p, respectively, be the number of fifties, twenties, tens, fives, twos and pennies in the change generated by an optimal algorithm.

We make the following observations:

1. From the way the greedy algorithm works, it follows that the total amount of change given in lower denominations is less than the value of the next higher denomination. That is, the change given in pennies is less than 2p; the change given in pennies and twos is less than 5p; etc. Therefore, T10 < 3, T < 2, F < 2, TW < 3, P < 2.

2. For the optimal change, we can establish t10 < 3, t < 2, f < 2, tw < 3, p < 2. To see this, note that if t10 >= 3, we can replace three twenties by a fifty and a ten and provide the change using one less coin. This is not possible, as f10 + t10 + t + f + tw + p is the fewest number of coins with which the change can be provided. Similarly for the other denominations. Hence, the total amount of change given in lower denominations is less than the value of the next higher denomination.

Now if F10 != f10, then either the greedy or the optimal solution must provide 50 pence or more in lower denominations. This violates the above observations. So F10 = f10. Similarly, if T10 != t10, then either the greedy or the optimal solution must provide 20p or more in lower denominations, which violates the above observations, so T10 = t10. We can show T = t, F = f, TW = tw and P = p in a similar way. Therefore, the greedy and optimal solutions are the same.
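As a sketch, Greedy Alg. 1 can be written in SML, assuming the denominations are supplied as a list in descending order (makeChange is an illustrative name, not from the slides):

(* Greedily take the largest denomination that still fits the remaining target. *)
fun makeChange (_, 0) = []
  | makeChange ([], _) = raise Fail "cannot make exact change"
  | makeChange (c :: cs, target) =
      if c <= target
      then c :: makeChange (c :: cs, target - c)   (* take coin c, reduce the target *)
      else makeChange (cs, target)                 (* c too large: try the next denomination *)

(* e.g. makeChange ([50, 20, 10, 5, 2, 1], 38) = [20, 10, 5, 2, 1] *)

The correctness proof above shows that, for these particular denominations, the sequence this greedy strategy returns is as short as any feasible sequence.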
Algorithm Design Technique 2: Divide & Conquer

Key idea:
1. Divide the problem into smaller independent subproblems
2. Solve the subproblems
3. Combine the solutions
Good for parallel programming.

D&C1: Detecting a Counterfeit

Problem: you are given a balance and a bag of 16 coins, one of which may be counterfeit. Counterfeit coins are lighter than genuine coins.

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16

Algorithms:
1. Naive: compare coin 1 with coins 2, 3, ... 16. Max. No. comparisons =
2. Better: Weigh pairs. Max. No. comparisons =
3. Best:
   1. Divide the coins into two sets of eight: A and B
   2. Solve: weigh A and B
   3. Combine: Counterfeit present iff (if and only if) A != B
   Max. No. comparisons =

D&C2: Identifying the Counterfeit

Problem: locate the counterfeit coin in the scenario above.

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16

Algorithms 1 and 2 above work, requiring at most 15 and 8 comparisons respectively.

Exercise: How many comparisons to identify 14 as the counterfeit using Algorithm 1? And using Algorithm 2?

A D&C algorithm:
1. Divide the coins into two equal-size sets: A and B
2. Solve: weigh A and B; the lighter set contains the counterfeit, so apply D&C to it
3. Combine: return the counterfeit coin
Max. No. comparisons = 4

Exercise: Does this algorithm work if there are
• at most two counterfeit coins, i.e. 0, 1 or 2 counterfeits?
• 0 or 3 counterfeit coins?
• some natural number 0, 1, 2, .. 16 of counterfeit coins?

D&C3: Quicksort

Quicksort was devised by C.A.R. Hoare and sorts a sequence (list/array/file/etc) into ascending order as follows.
1. Divide: Pick an element of the sequence (typically the first element) as the pivot. Split the sequence into two: his, all elements greater than the pivot, and los, all elements smaller than the pivot.
2. Solve: Apply D&C to his giving hiSort and apply D&C to los giving loSort.
3. Combine: Append loSort, the pivot and hiSort.
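A sketch of this scheme in SML for integer lists, using the his/los names above (the description leaves open where duplicates of the pivot go; this version keeps them in his):

fun quicksort [] = []
  | quicksort (pivot :: rest) =
      let
        val los = List.filter (fn x => x < pivot) rest    (* elements smaller than the pivot *)
        val his = List.filter (fn x => x >= pivot) rest   (* the rest, including duplicates of the pivot *)
      in
        quicksort los @ (pivot :: quicksort his)          (* combine: loSort, pivot, hiSort *)
      end

(* e.g. quicksort [5, 3, 8, 1, 5, 2] = [1, 2, 3, 5, 5, 8] *)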
Backtracking

Key idea: First organise the candidate solutions to make searching easy, e.g. as a graph or tree. Then search the solution space depth-first; if the node at the end of the current search proves infeasible, mark it as dead and backtrack to the previous node.

Live nodes are part of the current solution under investigation (often on a path).
Dead nodes are part of solutions that have proved to be infeasible.

More example backtracking algorithms are in Weiss Section 10.5.

Maze Search Problem

Find a path through a maze from top left (1,1) to bottom right (4,4), avoiding obstacles. The maze is represented as a matrix (1 marks an obstacle):

    1 2 3 4
 1  0 0 0 0
 2  0 1 1 1
 3  0 0 1 0
 4  0 0 0 0

Search objective: find a feasible path from top-left to bottom-right.

The solution space is an undirected graph:

1,1 ---- 2,1 ---- 3,1 ---- 4,1
 |        |        |        |
 |        |        |        |
1,2 ---- 2,2 ---- 3,2 ---- 4,2
 |        |        |        |
 |        |        |        |
1,3 ---- 2,3 ---- 3,3 ---- 4,3
 |        |        |        |
 |        |        |        |
1,4 ---- 2,4 ---- 3,4 ---- 4,4

Every path from (1,1) to (4,4) is a solution, but some cross obstacles, and are not feasible.

A Backtracking Alg: Maze Search (Depth First Search)

DFS tries to move Right, Down, Left, Up. The search is traced step by step on the maze above; when the current path reaches a dead end (Backtrack!) the search returns to the previous cell and tries the next direction.

Path:

Note: Backtracking is easily implemented in logic languages like Prolog, and in lazy functional languages like Haskell.

Exercise: Trace the algorithm for the following maze.

0 1 0 1
0 0 0 0
0 1 1 1
0 0 0 0

Branch&Bound

Key idea: Use a breadth first search of the solution space, but prune suboptimal solutions.

Good for parallelism: there are many solutions to consider simultaneously.
But ... multiple solutions consume resources, including memory.

Branch&Bound Alg: Maze Search (Breadth First Search)

The breadth first search is traced on the maze below, recording the number of live paths at each step (2 paths, 3 paths, 1 path, ...). The top path proves infeasible; prune: choose the shorter path, or arbitrarily between equal-length paths.

0 0 0 0
0 1 1 1
0 0 0 0
0 0 1 0

Path:

Dynamic Programming

Key idea: Store optimal solutions to subproblems, and use them to solve bigger problems.

Often used to produce an efficient version of a recursively specified problem, e.g. the Fibonacci numbers below.

Fibonacci Numbers

The Fibonacci numbers are a sequence of numbers defined as follows (in SML):

fun fib 0 = 0
  | fib 1 = 1
  | fib n = fib(n-1) + fib(n-2)
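The recursive definition above re-solves the same subproblems many times. As a sketch of the dynamic programming idea in SML, fibDP (an illustrative name, not from the slides) stores each Fibonacci value in a table as it is computed, so every subproblem is solved only once:

fun fibDP 0 = 0
  | fibDP n =
      let
        val memo = Array.array (n + 1, 0)          (* memo[i] will hold fib i *)
        fun fill i =
          if i > n then ()
          else ( Array.update (memo, i, Array.sub (memo, i - 1) + Array.sub (memo, i - 2))
               ; fill (i + 1) )
      in
        Array.update (memo, 1, 1);                 (* base cases: memo[0] = 0, memo[1] = 1 *)
        fill 2;
        Array.sub (memo, n)
      end

(* e.g. fibDP 30 = 832040, using 29 additions, where fib 30 makes well over a million recursive calls *)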