CS116 - Module 8 - Efficiency: Searching and Sorting

Cameron Morland, Winter 2020

Reminder: if you have not already, ensure you:
  • Read the Wikipedia article on the Binary search algorithm
  • Read the Wikipedia articles on Selection sort, Insertion sort, and Merge sort

Efficiency: Searching and Sorting Algorithms

Why are we doing this? Can't I just use [].sort() and sorted()?
Yes! Use available sorting functions whenever possible! But these algorithms are beautiful to analyse for efficiency.

Searching a List

Is 42 in the following list?
[97,24,6,87,0,78,90,77,4,24,50,30,68,44,62,46,93,47,1,81,30,48,26,45,99]

With a list in no particular order, we can do no better than Linear Search: check each value in the list.
  • If you find a value which is the target, return True.
  • If you arrive at the end of the list and have not found the target, return False.

Exercise: Write a function search(mylist, target) which implements linear search.

Running time of linear search

Best case: mylist[0] == target, and we return immediately; O(1).
Worst case: either mylist[-1] == target or target is not in mylist. We need to check every value; O(n).
At this point we always consider the worst case: O(n).

I need a volunteer...

Pick a number from 0 to 1000. I divide the region in 2 each step; we need only $\log_2 1000 \approx 10$ guesses. I could go to 1 000 000 with only 20 guesses, or 1 000 000 000 with only 30 guesses.

Recursive Binary Search

Is 42 in the following sorted list?
[0,1,4,6,24,24,26,30,30,44,45,46,47,48,50,62,68,77,78,81,87,90,93,97,99]

    def bs(L, t):
        '''
        Return True if t is in L, and False otherwise.

        bs: (listof Int) Int -> Bool
        '''
        if L == []:
            return False
        elif len(L) == 1:
            return L[0] == t
        elif t < L[len(L)//2]:  # first half
            return bs(L[:len(L)//2], t)
        else:  # second half
            return bs(L[len(L)//2:], t)

It's kind of like building a binary search tree, without the tree.

What's the running time? Every step we do O(n/2) = O(n) slicing, so our recurrence is $T(n) = O(n) + T(n/2)$, which gives us only O(n) running time, the same as linear search! To do better, don't slice the list.

Iterative Binary Search

We can do it recursively. How to do it iteratively? Use indices to keep track of the left and right ends of the region of interest: left indicates the lowest point it might be in, right indicates the highest point it might be in.

Exercise: Using a while loop, create a function binary_search(L, target) that returns True if target is in L and False otherwise.

    def binary_search(L, target):
        if L == [] or target > L[-1] or target < L[0]:
            return False
        left = 0
        right = len(L)
        while right - left > 1:
            middle = (right + left) // 2
            if L[middle] == target:
                return True
            if L[middle] > target:  # keep left half
                right = middle
            else:  # L[middle] < target; keep right half
                left = middle
        return L[left] == target or L[right] == target

Using left and right we could also do this recursively.

Testing Binary Search

Probably this code is terribly buggy. We should test at least:
  • empty list
  • list of length 1, with and without target
  • small lists of odd and even length
  • longer list:
      • The first and last values in the list.
      • Values less than the first and greater than the last.
      • Values near the middle of the list.
      • Values which would fit in the list, but are not present.

Efficiency

What is the runtime of each iteration? How many iterations do we need?
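One way to explore the second question empirically is to count loop iterations. The sketch below is not from the notes: binary_search_steps and worst_case_iterations are hypothetical names, and the loop mirrors an index-based binary search on a sorted list under the assumption that the target, if present, always lies in L[left:right].

```python
def binary_search_steps(L, target):
    """Return (found, iterations) for a search of the sorted list L.

    Hypothetical helper: an index-based binary search that also counts
    how many times the while loop body runs.
    """
    if L == [] or target > L[-1] or target < L[0]:
        return (False, 0)
    left = 0
    right = len(L)
    iterations = 0
    # Invariant: if target is in L, it is somewhere in L[left:right].
    while right - left > 1:
        iterations += 1
        middle = (right + left) // 2
        if L[middle] == target:
            return (True, iterations)
        if L[middle] > target:  # keep left half
            right = middle
        else:                   # keep right half
            left = middle
    return (L[left] == target, iterations)


def worst_case_iterations(n):
    """Largest iteration count over every target in list(range(n))."""
    L = list(range(n))
    return max(binary_search_steps(L, t)[1] for t in L)


# Each doubling of n should add at most about one iteration.
for n in [1000, 2000, 4000, 8000]:
    print(n, worst_case_iterations(n))
```

Each iteration does a constant amount of work (one index computation and at most two comparisons), so the printed counts track the total running time.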
Worst case running time is O(log n). If n = 1000, we need at most 10 or 11 iterations since $2^{10} = 1024$. Doubling the input size adds just 1 iteration. n = 4 000 000 000 takes only 32 iterations.

To make binary_search more useful, we can make it return the location where target was found, or -1 if it is not found.

Although the recursive solution we started with was O(n), we could build it recursively in O(log n) using indices instead of slicing.

Sorting

How can we sort a list into increasing order? Many ways. Here we consider only selection sort, insertion sort, and mergesort, as these can be easily analysed without statistics. (Many libraries use quicksort. Proper analysis of quicksort requires statistics, and is beyond the scope of this course.)

Selection Sort

  • Place the smallest item into L[0]
  • Place the second smallest item into L[1]
  • Place the third smallest item into L[2]
  • ...
After n − 1 steps, the list is sorted.

Exercise: Get a random ordering of the first 10 natural numbers, 0 ... 9:

    import random
    L = list(range(10))
    random.shuffle(L)
    print(L)

By hand, sort L using selection sort. Count the number of comparisons.

For 10 cards, 9 comparisons. For 9 cards, 8 comparisons. For 8 cards, 7 comparisons.
So we get $9 + 8 + 7 + \cdots + 2 + 1 = \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = 45$ comparisons. This is $O(n^2)$.

Insertion Sort

Enjoy "Sorting out Sorting" to 4:10.

  • The first item of the list is sorted.
  • Insert the first unsorted item of the list into the sorted part, keeping the sorted part sorted.
  • Continue to expand the sorted part until it is the entire list.

Running time? n times through, the ith time takes i moves. As before, $O(n^2)$ comparisons and moves.
We can reduce the number of comparisons to O(n log n) by using binary search, but we still need $O(n^2)$ moves. For large n it will be roughly twice as fast, but the same order.
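The binary-search variant of insertion sort can be sketched as follows. This is an illustration, not code from the notes: binary_insertion_sort is a hypothetical name, and it assumes the standard library's bisect.bisect_right is an acceptable stand-in for a hand-written binary search.

```python
from bisect import bisect_right


def binary_insertion_sort(L):
    """Sort L in place (hypothetical sketch of insertion sort with binary search).

    The sorted part is L[0:i]; each pass inserts L[i] into it.
    """
    for i in range(1, len(L)):
        item = L[i]
        # Binary search of the sorted part L[0:i]: O(log i) comparisons.
        pos = bisect_right(L, item, 0, i)
        # Shift L[pos:i] one place right: up to i moves, so still O(n^2) overall.
        L[pos + 1:i + 1] = L[pos:i]
        L[pos] = item


data = [5, 2, 9, 1, 5, 6]
binary_insertion_sort(data)
print(data)
```

Only the search step changes; the shifting of items is the same work as in ordinary insertion sort, which is why the order stays the same.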
Disassembling Selection Sort

Selection sort can be implemented as follows:
1. Begin with an empty list D and an unsorted list S.
2. Find the smallest item in S,
3. ... remove it from S,
4. ... and add it at the end of D.
5. Repeat steps 2 to 4 until S is empty.
6. Move everything from D to S.

Exercise: Implement selection sort using this framework. (It is possible to do this in 7 lines of code, maybe fewer.)

In Place algorithms

The notes contain the following in place selection sort:

    def selection_sort(L):
        n = len(L)
        positions = list(range(n-1))
        for i in positions:
            min_pos = i
            for j in range(i,n):
                if L[j] < L[min_pos]:
                    min_pos = j
            temp = L[i]
            L[i] = L[min_pos]
            L[min_pos] = temp

Mergesort

Mergesort is a Divide and Conquer algorithm. It works in the following manner:
1. Divide the list into two halves, using any method.
2. Sort the first half.
3. Sort the second half.
4. Merge the two sorted lists together to form a new sorted list.

Exercise: Write a function merge(A, B) which consumes A and B, each of which is a sorted list, and returns the list containing all items from A and B, still in sorted order.

What is the running time of merge(A, B)? O(len(A) + len(B)) = O(n) if the lengths are approximately equal.

Exercise: Implement mergesort using this framework and your merge function. Note: a list of length 1 is already sorted; use this as the base case for recursion.

Mergesort analysis

1. Divide the list into two halves, using any method. O(n)
2. Sort the first half. $T(n/2)$
3. Sort the second half. $T(n/2)$
4. Merge the two sorted lists together to form a new sorted list. O(n)

$T(n) = O(n) + 2T(n/2) \Rightarrow O(n \log n)$

Draw the tree!
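The four mergesort steps above can be sketched as follows. This is one possible solution to the two exercises, not the course's official one; it assumes mergesort returns a new list rather than mutating its argument, and it divides by slicing at the midpoint.

```python
def merge(A, B):
    """Return a new sorted list containing all items of sorted lists A and B."""
    result = []
    i = 0
    j = 0
    # Repeatedly take the smaller front item; each item is appended once, so O(len(A) + len(B)).
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            result.append(A[i])
            i += 1
        else:
            result.append(B[j])
            j += 1
    # At most one of these slices is non-empty.
    result.extend(A[i:])
    result.extend(B[j:])
    return result


def mergesort(L):
    """Return a new list with the items of L in increasing order."""
    if len(L) <= 1:  # base case: already sorted
        return L[:]
    mid = len(L) // 2                # step 1: divide, O(n) for the slices
    first = mergesort(L[:mid])       # step 2: T(n/2)
    second = mergesort(L[mid:])      # step 3: T(n/2)
    return merge(first, second)      # step 4: O(n)


print(merge([1, 4, 9], [2, 3, 10]))
print(mergesort([97, 24, 6, 87, 0, 78, 90]))
```

Using len(L) <= 1 as the base case also covers the empty list, which the length-1 base case alone would not.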
Disassembling Insertion Sort

Insertion sort depends on insertion. We need to insert the new item in the list at the right place so it stays sorted.

Exercise: Write a function insert_keep_sorted(D, item) which consumes a sorted list D and an integer item, and mutates D so it contains item while remaining sorted.

For example, if D = [1, 2, 17], then after insert_keep_sorted(D, 4), D is [1, 2, 4, 17].

Insertion Sort

Insertion sort can be implemented as follows:
1. Begin with an empty list D and an unsorted list S.
2. Find any item in S,
3. ... remove it from S,
4. ... and insert it into D, keeping D sorted.
5. Repeat steps 2 to 4 until S is empty.
6. Move everything from D to S.

Exercise: Implement insertion sort using this framework and your insert_keep_sorted function.
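A minimal sketch of this framework follows. It is one possible solution, not the official one: it assumes that "any item" may be the last item of S (so removal is cheap), and that insert_keep_sorted shifts larger items right one place at a time.

```python
def insert_keep_sorted(D, item):
    """Mutate the sorted list D so it contains item and remains sorted."""
    D.append(item)  # make room at the end
    pos = len(D) - 1
    # Shift larger items one place right until item's position is found.
    while pos > 0 and D[pos - 1] > item:
        D[pos] = D[pos - 1]
        pos -= 1
    D[pos] = item


def insertion_sort(S):
    """Mutate S so its items are in increasing order."""
    D = []                           # step 1
    while S != []:                   # step 5
        item = S.pop()               # steps 2 and 3: take the last item
        insert_keep_sorted(D, item)  # step 4
    S.extend(D)                      # step 6


D = [1, 2, 17]
insert_keep_sorted(D, 4)
print(D)

S = [97, 24, 6, 87, 0]
insertion_sort(S)
print(S)
```

With D = [1, 2, 17], inserting 4 leaves D as [1, 2, 4, 17]. Each insertion may shift every item already in D, which is where the quadratic number of moves comes from.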
