CS 361, Lecture 14: Bucket Sort

Jared Saia
University of New Mexico

Outline

  • Bucket Sort
  • Midterm Review

Bucket Sort

  • Bucket sort assumes that the input is drawn from a uniform distribution over the range $[0, 1)$
  • Basic idea is to divide the interval $[0, 1)$ into $n$ equal size regions, or buckets
  • We expect that a small number of elements in A will fall into each bucket
  • To get the output, we can sort the numbers in each bucket and just output the sorted buckets in order

Bucket Sort

    //PRE: A is the array to be sorted, all elements A[i] satisfy $0 \le A[i] < 1$.
    //POST: returns a list which is the elements of A in sorted order
    BucketSort(A){
      n = length(A)
      B = new List[n]    // n empty lists B[0],...,B[n-1]
      for (i=1;i<=n;i++){
        insert A[i] at end of list B[floor(n*A[i])];
      }
      for (i=0;i<=n-1;i++){
        sort list B[i] with insertion sort;
      }
      return the concatenated list B[0],B[1],...,B[n-1];
    }

Bucket Sort Analysis

  • Claim: If the input numbers are distributed uniformly over the range $[0, 1)$, then Bucket sort takes expected time $O(n)$
  • Let $T(n)$ be the run time of bucket sort on a list of size $n$
  • Let $n_i$ be the random variable giving the number of elements in bucket $B[i]$
  • Then $T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)$

Analysis

  • We know $T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)$
  • Taking expectation of both sides, we have

$$
\begin{aligned}
E(T(n)) &= E\Big(\Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)\Big) \\
        &= \Theta(n) + \sum_{i=0}^{n-1} E(O(n_i^2)) \\
        &= \Theta(n) + \sum_{i=0}^{n-1} O(E(n_i^2))
\end{aligned}
$$

  • The second step follows by linearity of expectation
  • The last step holds since for any constant $a$ and random variable $X$, $E(aX) = aE(X)$ (see Equation C.21 in the text)

Analysis

  • We claim that $E(n_i^2) = 2 - 1/n$
  • To prove this, we define indicator random variables: $X_{ij} = 1$ if $A[j]$ falls in bucket $i$ and 0 otherwise (defined for all $i$, $0 \le i \le n-1$, and $j$, $1 \le j \le n$)
  • Thus, $n_i = \sum_{j=1}^{n} X_{ij}$
  • We can now compute $E(n_i^2)$ by expanding the square and regrouping terms

Analysis

$$
\begin{aligned}
E(n_i^2) &= E\Big(\Big(\sum_{j=1}^{n} X_{ij}\Big)^2\Big) \\
         &= E\Big(\sum_{j=1}^{n} \sum_{k=1}^{n} X_{ij} X_{ik}\Big) \\
         &= E\Big(\sum_{j=1}^{n} X_{ij}^2 + \sum_{1 \le j \le n} \; \sum_{1 \le k \le n,\, k \ne j} X_{ij} X_{ik}\Big) \\
         &= \sum_{j=1}^{n} E(X_{ij}^2) + \sum_{1 \le j \le n} \; \sum_{1 \le k \le n,\, k \ne j} E(X_{ij} X_{ik})
\end{aligned}
$$

Analysis

  • We can evaluate the two summations separately. $X_{ij}$ is 1 with probability $1/n$ and 0 otherwise
  • Thus $E(X_{ij}^2) = 1 \cdot (1/n) + 0 \cdot (1 - 1/n) = 1/n$
  • Where $k \ne j$, the random variables $X_{ij}$ and $X_{ik}$ are independent
  • For any two independent random variables $X$ and $Y$, $E(XY) = E(X)E(Y)$ (see C.3 in the book for a proof of this)
  • Thus we have that

$$
\begin{aligned}
E(X_{ij} X_{ik}) &= E(X_{ij}) E(X_{ik}) \\
                 &= (1/n)(1/n) \\
                 &= 1/n^2
\end{aligned}
$$

Analysis

  • Substituting these two expected values back into our main equation, we get:

$$
\begin{aligned}
E(n_i^2) &= \sum_{j=1}^{n} E(X_{ij}^2) + \sum_{1 \le j \le n} \; \sum_{1 \le k \le n,\, k \ne j} E(X_{ij} X_{ik}) \\
         &= \sum_{j=1}^{n} (1/n) + \sum_{1 \le j \le n} \; \sum_{1 \le k \le n,\, k \ne j} (1/n^2) \\
         &= n(1/n) + n(n-1)(1/n^2) \\
         &= 1 + (n-1)/n \\
         &= 2 - 1/n
\end{aligned}
$$

Analysis

  • Recall that $E(T(n)) = \Theta(n) + \sum_{i=0}^{n-1} O(E(n_i^2))$
  • We can now plug in the equation $E(n_i^2) = 2 - 1/n$ to get

$$
\begin{aligned}
E(T(n)) &= \Theta(n) + \sum_{i=0}^{n-1} (2 - 1/n) \\
        &= \Theta(n) + \Theta(n) \\
        &= \Theta(n)
\end{aligned}
$$

  • Thus the entire bucket sort algorithm runs in expected linear time

Midterm

  • Midterm: Tuesday, March 23rd in class (the Tuesday after spring break)
  • You can bring 2 pages of "cheat sheets" to use during the exam. Otherwise the exam is closed book and closed note
  • Note that the web page contains links to prior classes and their midterms. Many of the questions on my midterm will be similar in flavor to these past midterms (and to exercises in the book)!

Review Session

  • I will have a review session on Monday, March 22nd from 5:30-6:30 in FEC 141 (conference room on first floor of FEC)
  • Please try to make it to this review session if at all possible

Midterm

  • 5 questions, about 20 points each
  • Hard but fair
  • There will be some time pressure, so make sure you can e.g. solve recurrences both quickly and correctly.
  • I expect a class mean of between 60 :( and 70 :) points

Question 1

Collection of true/false questions and short answer on:

  • Asymptotic notation: e.g. I give you a bunch of functions and ask you to give me the simplest possible theta notation for each
  • Recurrences: e.g. I ask you to solve a recurrence
  • Heaps: e.g. I ask you questions about properties of heaps and priority queues
  • Sorting Algorithms: heapsort, quicksort, bucketsort, mergesort (know resource bounds for these algorithms)
  • Probability: Random variables, expectation, linearity of expectation, birthday paradox, analysis of expected runtime of quicksort and bucketsort

Question 2

Solving recurrence relations:

  • Like problems on hw 4 and Problem 7-3 (Stooge Sort)
  • You'll need to know annihilators, change of variables, handling homogeneous and non-homogeneous parts of recurrences, recursion trees, and the Master Method
  • You'll need to know the formulas for sums of convergent and divergent geometric series

Question 3

Loop Invariant:

  • Will give you an algorithm and ask you to give the loop invariant you would use to show it is correct
  • You may also need to give initialization, maintenance and termination for your loop invariant
  • Similar to the hw problems and in-class exercises on loop invariants

Question 4

Recurrence proof using induction (i.e. the substitution method):

  • You'll need to give base case, inductive hypothesis and then show the inductive step
  • Similar to Exercises 7.2-1, 4.2-1 and 4.2-3

Question 5

Asymptotic notation:

  • Similar to book problems: 3.1-2, 3.1-5, 3.1-7

Asymptotic Notation

  • Let's now review asymptotic notation
  • I'll review for O notation, make sure you understand the other four types
  • $f(n) = O(g(n))$ if there exist positive constants $c$ and $n_0$ such that $0 \le f(n) \le c \cdot g(n)$ for all $n \ge n_0$
  • This means to show that $f(n) = O(g(n))$, you need to give positive constants $c$ and $n_0$ for which the above statement is true!
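As a sanity check on the definition of O notation just given, a proposed pair of constants $c$ and $n_0$ can be tested numerically over a finite range. The sketch below is illustrative only: the helper name `holds_on_range` and the cutoff `n_max` are assumptions, and a finite check can refute a bad choice of constants but never replaces the proof. The constants tried are the ones used in Examples 1 to 3 of this lecture.

```python
import math


def holds_on_range(f, g, c, n0, n_max=200):
    """Return True if 0 <= f(n) <= c * g(n) for every integer n in [n0, n_max].

    This only tests finitely many n, so it is evidence, not a proof.
    """
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max + 1))


# Example 1: 2^(2n) = O(5^n) with c = 1, n0 = 1
ex1 = holds_on_range(lambda n: 2 ** (2 * n), lambda n: 5 ** n, c=1, n0=1)

# Example 2: n + sqrt(n) = O(n) with c = 1.5, n0 = 4
ex2 = holds_on_range(lambda n: n + math.sqrt(n), lambda n: n, c=1.5, n0=4)

# Example 3: 2^(n+1) = O(2^n) with c = 2, n0 = 1
ex3 = holds_on_range(lambda n: 2 ** (n + 1), lambda n: 2 ** n, c=2, n0=1)

# A bad choice: with c = 1.5 the bound for Example 2 fails at n = 1
ex2_bad = holds_on_range(lambda n: n + math.sqrt(n), lambda n: n, c=1.5, n0=1)

if __name__ == "__main__":
    print(ex1, ex2, ex3, ex2_bad)
```

The failing case shows why $n_0$ matters: the same $c$ can work only once $n$ is large enough.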
Example 1

  • Prove that $2^{2n} = O(5^n)$
  • Goal: Show there exist positive constants $c$ and $n_0$ such that $2^{2n} \le c \cdot 5^n$ for all $n \ge n_0$

$$
\begin{aligned}
2^{2n} &\le c \cdot 5^n \\
4^n &\le c \cdot 5^n \\
(4/5)^n &\le c
\end{aligned}
$$

  • Hence for $c = 1$ and $n_0 = 1$, $2^{2n} \le c \cdot 5^n$ for all $n \ge n_0$

Example 2

  • Prove that $n + \sqrt{n} = O(n)$
  • Goal: Show there exist positive constants $c$ and $n_0$ such that $n + \sqrt{n} \le c \cdot n$ for all $n \ge n_0$

$$
\begin{aligned}
n + \sqrt{n} &\le c \cdot n \\
1 + \frac{1}{\sqrt{n}} &\le c
\end{aligned}
$$

  • Hence if we choose $n_0 = 4$, and $c = 1.5$, then it's true that $n + \sqrt{n} \le c \cdot n$ for all $n \ge n_0$

Example 3

  • Prove that $2^{n+1} = O(2^n)$
  • Goal: Show there exist positive constants $c$ and $n_0$ such that $2^{n+1} \le c \cdot 2^n$ for all $n \ge n_0$

$$
\begin{aligned}
2^{n+1} &\le c \cdot 2^n \\
2 \cdot 2^n &\le c \cdot 2^n \\
2 &\le c
\end{aligned}
$$

  • Hence for $c = 2$ and $n_0 = 1$, $2^{n+1} \le c \cdot 2^n$ for all $n \ge n_0$

A Procedure

Goal: prove that $f(n) = O(g(n))$

  1. Write down what this means mathematically
  2. Write down the inequality $f(n) \le c \cdot g(n)$
  3. Simplify this inequality so that $c$ is isolated on the right hand side
  4. Now find a $n_0$ and a $c$ such that for all $n \ge n_0$, this simplified inequality is true

In Class Exercise

Show that $n 2^n$ is $O(4^n)$

  • Q1: What is the exact mathematical statement of what you need to prove?
  • Q2: What is the first inequality in the chain of inequalities?
  • Q3: What is the simplified inequality where $c$ is isolated?
  • Q4: What is a $n_0$ and $c$ such that the inequality of the last question is always true?

Example: Substitution Method

  • Consider the following recurrence:

$$T(n) = 2^{2-n} \cdot T(n-1) \cdot T(n-1)$$

    where $T(1) = 2$.
  • Show that $T(n) = 2^n$ by induction. Include the following in your proof: 1) the base case(s) 2) the inductive hypothesis and 3) the inductive step.

Example: Substitution Method

  • Base Case: $T(1) = 2$ which is in fact $2^1$.
  • Inductive Hypothesis: For all $j < n$, $T(j) = 2^j$
  • Inductive Step: We must show that $T(n) = 2^n$, assuming the inductive hypothesis.

$$
\begin{aligned}
T(n) &= 2^{2-n} \cdot T(n-1) \cdot T(n-1) \\
T(n) &= 2^{2-n} \cdot 2^{n-1} \cdot 2^{n-1} \\
T(n) &= 2^n
\end{aligned}
$$

    where the inductive hypothesis allows us to make the replacements in the second step.

Todo

  • Enjoy Spring!
  • Study for Midterm
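The closed form from the substitution method example, $T(n) = 2^n$ for the recurrence $T(n) = 2^{2-n} \cdot T(n-1) \cdot T(n-1)$ with $T(1) = 2$, can be spot-checked by evaluating the recurrence directly. This is a small sketch, not part of the proof: the function name `t_recurrence` is illustrative, and exact rational arithmetic is used so that the negative powers of 2 introduce no rounding.

```python
from fractions import Fraction


def t_recurrence(n):
    """Evaluate T(n) = 2^(2-n) * T(n-1) * T(n-1) with T(1) = 2, exactly."""
    t = Fraction(2)  # base case T(1) = 2
    for m in range(2, n + 1):
        # Apply the recurrence once to move from T(m-1) to T(m).
        t = Fraction(2) ** (2 - m) * t * t
    return t


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, t_recurrence(n), 2 ** n)
```

Agreement for small $n$ is only evidence; the induction argument is what establishes the result for all $n$.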
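Returning to the BucketSort pseudocode from the first part of the lecture, the algorithm can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the lecture's own code: the names `bucket_sort`, `insertion_sort`, and `mean_squared_bucket_size` are illustrative, inputs are assumed to lie in $[0, 1)$, and the last function is a simulation (with an arbitrary fixed seed) that estimates $E(n_i^2)$ for comparison with the value $2 - 1/n$ derived in the analysis.

```python
import math
import random


def insertion_sort(items):
    """Sort a list in place with insertion sort and return it."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        # Shift larger elements one slot right to make room for key.
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key
    return items


def bucket_sort(a):
    """Return the elements of a in sorted order; assumes 0 <= x < 1 for all x."""
    n = len(a)
    buckets = [[] for _ in range(n)]  # B[0], ..., B[n-1]
    for x in a:
        if not 0 <= x < 1:
            raise ValueError("bucket_sort assumes inputs in [0, 1)")
        buckets[math.floor(n * x)].append(x)
    result = []
    for b in buckets:
        result.extend(insertion_sort(b))
    return result


def mean_squared_bucket_size(n, trials, seed=0):
    """Estimate E(n_i^2) by averaging squared bucket sizes over random inputs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counts = [0] * n
        for _ in range(n):
            counts[math.floor(n * rng.random())] += 1
        total += sum(c * c for c in counts)
    return total / (trials * n)


if __name__ == "__main__":
    data = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
    print(bucket_sort(data))
    print(mean_squared_bucket_size(100, 200), "vs", 2 - 1 / 100)
```

The range check makes the half-open precondition explicit: an input equal to 1 would map to index $n$, which is outside $B[0], \ldots, B[n-1]$.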
