Data Structures and Algorithms (CS210A)
Lecture 21 • Analyzing average running time of Quick Sort
1 Overview of this lecture
Main Objective: • Analyzing average time complexity of QuickSort using recurrence. – Using mathematical induction. – Solving the recurrence exactly. • The outcome of this analysis will be quite surprising!
Extra benefits: • You will learn a standard way of using mathematical induction to bound time complexity of an algorithm. You must try to internalize it.
2 QuickSort
3 Pseudocode for QuickSort(푺)
QuickSort(푺) { If (|푺|>1) Pick and remove an element 풙 from 푺;
(푺<풙, 푺>풙) Partition(푺,풙); return( Concatenate(QuickSort(푺<풙), 풙, QuickSort(푺>풙)) }
4 Pseudocode for QuickSort(푺) When the input 푺 is stored in an array
QuickSort(푨,풍, 풓) { If (풍 < 풓) 풊 Partition(푨,풍,풓); QuickSort(푨,풍, 풊 − ퟏ); QuickSort(푨,풊 + ퟏ, 풓) }
Partition : 풙푨[풍] as a pivot element, permutes the subarray 푨[풍 … 풓] such that elements preceding 풙 are smaller than 풙, 푨[풊]= 풙, and elements succeeding 풙 are greater than 풙. 5 Analyzing average time complexity of QuickSort
Part 1 Deriving the recurrence
6 Analyzing average time complexity of QuickSort
Assumption (just for a neat analysis):
• All elements are distinct.
• Each recursive call selects the first element of the subarray as the pivot element.
7 Analyzing average time complexity of QuickSort
A useful Fact: Quick sort is a comparison based algorithm.
0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 6 11 42 37 24 5 16 27 2 15 20 49 41 29 4 23 36 3
푒 3 푒4 푒9 푒8 푒6 푒2 푒5 푒7 푒1 푒3 푒4 푒9 푒8 푒6 푒2 푒5 푒7 푒1
Let 푒푖 : 푖th smallest element of 푨.
Observation: The execution of Quick sort depends upon the permutation of 푒푖’s and not on the values taken by 푒푖’s.
8 Analyzing average time complexity of QuickSort
푻(풏) : Average running time for Quick sort on input of size 풏.
(average over all possible permutations of {푒1, 푒2,… ,푒푛})
1 Hence, 푻(풏)= 푸(휋), 풏! 휋 where 푸(휋) is the time complexity (or no. of comparisons) when the input is permutation 휋.
Calculating 푻(풏) from definition/scratch is impractical, if not impossible.
9 Analyzing average time complexity of QuickSort
Permutations beginning with 푒2
Permutations beginning with 푒 1
Permutations beginning with 푒 3 All 풏! permutations of {푒1, 푒2,… ,푒푛}
Let P(풊) be the set of all those permutations of {푒1, 푒2,… ,푒푛} that begin with 푒푖. Question: What fraction of all permutations constitutes P(풊) ? 1 Answer: 풏 Let 푮(풏, 풊) be the average running time of QuickSort over P(풊).
Question: What is the relation between 푻(풏) and 푮(풏, 풊)’s ? ퟏ 푛 Answer: 푻(풏) = 풏 푖<1 푮(풏, 풊) Observation: We now need to derive an expression for 푮(풏, 풊).For this purpose, we need to have a closer look at the execution of QuickSort over P(풊).
10
Quick Sort on a permutation from P(풊).
0 1 2 3 4 5 6 7 8 > 푒푖 푨
< 푒푖 푒푖
What happens during Partition(푨,ퟎ,ퟖ)
11 Quick Sort on a permutation from P(풊).
(풏 − ퟏ)! 0 1 2 3 4 5 6 7 8 P (풊) 푨
푒 Many-to-one 푖 Is it also a mapping uniform mapping ?
Reasons: 0 1 2 3 4 5 6 7 8 - Partition() isYes “well -defined” S(풊) - Partition() just compares pivot with other elements. (풊 − ퟏ)!⨯ (풏 − 풊)! 푒푖 푒 … 푒 푒1… 푒푖;1 푖:1 푛 Lemma 1: 풏 − ퟏ S(풊): Permutations resulting from Partition(). There are exactly ( ) permutations from P(풊) that get mapped to one permutation in S(풊). 풊 − ퟏ 12 Homework: Try your best to prove Lemma 1. If you have still any doubt, please meet me. Quick Sort on a permutation from P(풊).
(풏 − ퟏ)! 0 1 2 3 4 5 6 7 8 P (풊) 푨
Many-to-one 푒푖 Using Lemma 1 Can mapping you now express 푮(풏, 풊) recursively ?
0 1 2 3 4 5 6 7 8 S(풊)
(풊 − ퟏ)!⨯ (풏 − 풊)! 푒푖 푒 … 푒 푒1… 푒푖;1 푖:1 푛 Lemma 1: 풏 − ퟏ There are exactly ( ) permutations from P(풊) that get mapped to one permutation in S(풊). 풊 − ퟏ 13 Analyzing average time complexity of QuickSort
푮(풏, 풊) = 푻(풊 − ퟏ) + 푻(풏 − 풊) + d 풏 ----1
We showed previously that : ퟏ 푛 푻(풏) = 풏 푖<1 푮(풏, 풊) ----2
Question: Can you express 푻(풏) recursively using 1 and 2? ퟏ 푛 푻(풏) = 풏 푖<1(푻(풊 − 1) + 푻(풏 − 풊)) + d풏 푻(ퟏ) = c
14 Analyzing average time complexity of QuickSort
Part 2 Solving the recurrence through mathematical induction
15 푻(ퟏ) = c ퟏ 푛 푻(풏) = 풏 푖<1(푻(풊 − 1) + 푻(풏 − 풊)) + d풏 ퟐ 푛;1 = 풏 푖<1 푻(풊) + d풏 Assertion A(풎): 푻(풎) ≤ a풎 퐥퐨퐠 풎 + b for all 풎 ≥ 1 Base case A(ퟎ) : Holds for b ≥ c Induction step: Assuming A(풎) holds for all 풎< 풏, we have to prove A(풏). ퟐ 푛;1 푻(풏) ≤ 풏 푖<1 (a풊 퐥퐨퐠 풊 + b) + d풏 ퟐ 푛;1 ≤ 풏( 푖<1 a풊 퐥퐨퐠 풊 ) + 2b + d풏 ퟐ 푛/2 푛;1 = ( a풊 퐥퐨퐠 풊 + 푛 a풊 퐥퐨퐠 풊 ) + 2b + d풏 풏 푖<1 푖< :1 2 ퟐ 푛/2 푛;1 ≤ ( a풊 퐥퐨퐠 풏/ퟐ + 푛 a풊 퐥퐨퐠 풏 ) + 2b + d풏 풏 푖<1 푖< :1 2 ퟐ 푛;1 푛/2 = 풏( 푖<1 a풊 퐥퐨퐠 풏 − 푖<1 a풊) + 2b + d풏 풏 풏 ퟐ 풏(풏;ퟏ) ( :ퟏ) = ( a 퐥퐨퐠 풏 − ퟐ ퟐ a) + 2b + d풏 풏 ퟐ ퟐ 풏 ≤ a 풏 − ퟏ 퐥퐨퐠 풏 − a + 2b + d풏 ퟒ 풏 = a풏퐥퐨퐠 풏+ b − a + b + d풏 ퟒ ≤ a풏퐥퐨퐠 풏+ b for a >ퟒ(b + d) 16
Analyzing average time complexity of QuickSort
Part 3 Solving the recurrence exactly
17 Some elementary tools 푛 ퟏ 퐇 풏 = 풊 푖<1 Question: How to approximate 퐇(풏) ?
Answer: 퐇(풏) 퐥퐨퐠풆 풏 + ᵞ, as 풏 increases where ᵞ is Euler’s constant ~0.58 Look at this figure, and relate Hint: 1 it to the curve for function f(x)= 1/x and its integration…
1/2 1/3 1/4 1/5 1/6
We shall calculate average number of comparisons during QuickSort using: • our knowledge of solving recurrences by substitution • our knowledge of solving recurrence by unfolding • our knowledge of simplifying a partial fraction (from JEE days) Students should try to internalize the way the above tools are used.
18
푻(풏) : average number of comparisons during QuickSort on 풏 elements. 푻(ퟏ) = ퟎ, 푻(ퟎ) = ퟎ, ퟏ 푛 푻(풏) = 풏 푖<1(푻(풊 − 1) + 푻(풏 − 풊)) + 풏 − ퟏ ퟐ 푛 = 풏 푖<1(푻(풊 − 1)) + 풏 − ퟏ 푛 풏푻(풏) = 2 푖<1(푻(풊 − 1)) + 풏(풏 − ퟏ) -----1 Question: How will this equation appear for 풏 − ퟏ ? 푛;1 (풏 − ퟏ)푻(풏 − ퟏ) = 2 푖<1 (푻(풊 − 1)) + (풏 − ퟏ)(풏 − ퟐ) -----2 Subtracting 2 from 1, we get 풏푻(풏) − (풏 − ퟏ)푻(풏 − ퟏ) = 2 푻(풏 − ퟏ) + 2(풏 − ퟏ) 풏푻(풏) − (풏 + ퟏ)푻(풏 − ퟏ) = 2(풏 − ퟏ) Question: How to solve/simplify it further ? 푻(풏) 푻(풏;1) ퟐ(풏−1) − = 풏+1 풏 풏(풏+1)
19
푻(풏) 푻(풏−1) ퟐ(풏−1) − = 풏:1 풏 풏(풏+1) ퟐ(풏;1) 푻(풎) 품(풏) − 품 풏 − ퟏ = where 품 풎 = 풏(풏:1) , 풎:1 Question: How to simplify RHS ? ퟐ(풏;1) ퟐ 풏:1 ;4 = = 풏(풏:1) 풏(풏:1) ퟐ 4 = − 풏 풏(풏:1) ퟐ 4 4 = − + 풏 풏 풏:1 4 2 = − 풏:1 풏 4 2 품(풏) − 품 풏 − ퟏ = − 풏:1 풏
20 4 2 품(풏) − 품 풏 − ퟏ = − 풏:1 풏 Question: How to calculate 품(풏) ? 4 2 품(풏 − ퟏ) − 품 풏 − ퟐ = − 풏 풏;ퟏ 4 2 품(풏 − ퟐ) − 품 풏 − ퟑ = − 풏;ퟏ 풏;ퟐ … = … 4 2 품(ퟐ) − 품 ퟏ = − 3 ퟐ 4 2 품(ퟏ) − 품 ퟎ = − 2 1 4 ퟏ 4 ퟏ Hence 품(풏) = + (2 풏 ) − 2 = + (2 풏 ) − 4 풏:1 풋<ퟐ 풋 풏:1 풋<ퟏ 풋 4 = + 2퐇(풏) − 4 풏:1 4 푻(풏) = (풏 + 1) ( + 2퐇(풏) − 4) 풏:1 = 2(풏 + ퟏ)퐇(풏) − 4풏
21 푻(풏) = 2(풏 + ퟏ)퐇(풏) − 4풏
= 2(풏 + 1) 퐥퐨퐠풆 풏 + 1.16 (풏 + ퟏ) − ퟒ풏 = 2풏 퐥퐨퐠풆 풏 − 2.84 풏 + O(1) = 2풏 퐥퐨퐠풆 풏
Theorem: The average number of comparisons during QuickSort on 풏 elements approaches 2풏 퐥퐨퐠풆 풏 − 2.84 풏. = 1.39 풏 퐥퐨퐠ퟐ 풏 − O(풏)
The best case number of comparisons during QuickSort on 풏 elements = 풏 퐥퐨퐠ퟐ 풏 The worst case no. of comparisons during QuickSort on 풏 elements = 풏(풏 − ퟏ)
22 Quick sort versus Merge Sort
No. of Comparisons Merge Sort Quick Sort
Average case 풏 퐥퐨퐠ퟐ 풏 1.39 풏 퐥퐨퐠ퟐ 풏 Best case 풏 퐥퐨퐠 풏 ퟐ 풏 퐥퐨퐠ퟐ 풏
Worst case 풏 퐥퐨퐠ퟐ 풏 풏(풏 − ퟏ) After seeing this table, no one would prefer Quick sort to Merge sort But Quick sort is still the most preferred algorithm in practice. Why ?
You will find the answer yourself in the next programming assignment 23