<<

Data Structures and (CS210A)

Lecture 21 • Analyzing average running time of Quick

1 Overview of this lecture

Main Objective: • Analyzing average time complexity of using recurrence. – Using mathematical induction. – Solving the recurrence exactly. • The outcome of this analysis will be quite surprising!

Extra benefits: • You will learn a standard way of using mathematical induction to bound time complexity of an . You must try to internalize it.

2 QuickSort

3 Pseudocode for QuickSort(푺)

QuickSort(푺) { If (|푺|>1) Pick and remove an element 풙 from 푺;

(푺<풙, 푺>풙) Partition(푺,풙); return( Concatenate(QuickSort(푺<풙), 풙, QuickSort(푺>풙)) }

4 Pseudocode for QuickSort(푺) When the input 푺 is stored in an array

QuickSort(푨,풍, 풓) { If (풍 < 풓) 풊 Partition(푨,풍,풓); QuickSort(푨,풍, 풊 − ퟏ); QuickSort(푨,풊 + ퟏ, 풓) }

Partition : 풙푨[풍] as a pivot element, permutes the subarray 푨[풍 … 풓] such that elements preceding 풙 are smaller than 풙, 푨[풊]= 풙, and elements succeeding 풙 are greater than 풙. 5 Analyzing average time complexity of QuickSort

Part 1 Deriving the recurrence

6 Analyzing average time complexity of QuickSort

Assumption (just for a neat analysis):

• All elements are distinct.

• Each recursive call selects the first element of the subarray as the pivot element.

7 Analyzing average time complexity of QuickSort

A useful Fact: Quick sort is a comparison based algorithm.

0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 6 11 42 37 24 5 16 27 2 15 20 49 41 29 4 23 36 3

푒 3 푒4 푒9 푒8 푒6 푒2 푒5 푒7 푒1 푒3 푒4 푒9 푒8 푒6 푒2 푒5 푒7 푒1

Let 푒푖 : 푖th smallest element of 푨.

Observation: The execution of Quick sort depends upon the permutation of 푒푖’s and not on the values taken by 푒푖’s.

8 Analyzing average time complexity of QuickSort

푻(풏) : Average running time for Quick sort on input of size 풏.

(average over all possible permutations of {푒1, 푒2,… ,푒푛})

1 Hence, 푻(풏)= 푸(휋), 풏! 휋 where 푸(휋) is the time complexity (or no. of comparisons) when the input is permutation 휋.

Calculating 푻(풏) from definition/scratch is impractical, if not impossible.

9 Analyzing average time complexity of QuickSort

Permutations beginning with 푒2

Permutations beginning with 푒 1

Permutations beginning with 푒 3 All 풏! permutations of {푒1, 푒2,… ,푒푛}

Let P(풊) be the set of all those permutations of {푒1, 푒2,… ,푒푛} that begin with 푒푖. Question: What fraction of all permutations constitutes P(풊) ? 1 Answer: 풏 Let 푮(풏, 풊) be the average running time of QuickSort over P(풊).

Question: What is the relation between 푻(풏) and 푮(풏, 풊)’s ? ퟏ 푛 Answer: 푻(풏) = 풏 푖<1 푮(풏, 풊) Observation: We now need to derive an expression for 푮(풏, 풊).For this purpose, we need to have a closer look at the execution of QuickSort over P(풊).

10

Quick Sort on a permutation from P(풊).

0 1 2 3 4 5 6 7 8 > 푒푖 푨

< 푒푖 푒푖

What happens during Partition(푨,ퟎ,ퟖ)

11 Quick Sort on a permutation from P(풊).

(풏 − ퟏ)! 0 1 2 3 4 5 6 7 8 P (풊) 푨

푒 Many-to-one 푖 Is it also a mapping uniform mapping ?

Reasons: 0 1 2 3 4 5 6 7 8 - Partition() isYes “well -defined” S(풊) - Partition() just compares pivot with other elements. (풊 − ퟏ)!⨯ (풏 − 풊)! 푒푖 푒 … 푒 푒1… 푒푖;1 푖:1 푛 Lemma 1: 풏 − ퟏ S(풊): Permutations resulting from Partition(). There are exactly ( ) permutations from P(풊) that get mapped to one permutation in S(풊). 풊 − ퟏ 12 Homework: Try your best to prove Lemma 1. If you have still any doubt, please meet me. Quick Sort on a permutation from P(풊).

(풏 − ퟏ)! 0 1 2 3 4 5 6 7 8 P (풊) 푨

Many-to-one 푒푖 Using Lemma 1 Can mapping you now express 푮(풏, 풊) recursively ?

0 1 2 3 4 5 6 7 8 S(풊)

(풊 − ퟏ)!⨯ (풏 − 풊)! 푒푖 푒 … 푒 푒1… 푒푖;1 푖:1 푛 Lemma 1: 풏 − ퟏ There are exactly ( ) permutations from P(풊) that get mapped to one permutation in S(풊). 풊 − ퟏ 13 Analyzing average time complexity of QuickSort

푮(풏, 풊) = 푻(풊 − ퟏ) + 푻(풏 − 풊) + d 풏 ----1

We showed previously that : ퟏ 푛 푻(풏) = 풏 푖<1 푮(풏, 풊) ----2

Question: Can you express 푻(풏) recursively using 1 and 2? ퟏ 푛 푻(풏) = 풏 푖<1(푻(풊 − 1) + 푻(풏 − 풊)) + d풏 푻(ퟏ) = c

14 Analyzing average time complexity of QuickSort

Part 2 Solving the recurrence through mathematical induction

15 푻(ퟏ) = c ퟏ 푛 푻(풏) = 풏 푖<1(푻(풊 − 1) + 푻(풏 − 풊)) + d풏 ퟐ 푛;1 = 풏 푖<1 푻(풊) + d풏 Assertion A(풎): 푻(풎) ≤ a풎 퐥퐨퐠 풎 + b for all 풎 ≥ 1 Base case A(ퟎ) : Holds for b ≥ c Induction step: Assuming A(풎) holds for all 풎< 풏, we have to prove A(풏). ퟐ 푛;1 푻(풏) ≤ 풏 푖<1 (a풊 퐥퐨퐠 풊 + b) + d풏 ퟐ 푛;1 ≤ 풏( 푖<1 a풊 퐥퐨퐠 풊 ) + 2b + d풏 ퟐ 푛/2 푛;1 = ( a풊 퐥퐨퐠 풊 + 푛 a풊 퐥퐨퐠 풊 ) + 2b + d풏 풏 푖<1 푖< :1 2 ퟐ 푛/2 푛;1 ≤ ( a풊 퐥퐨퐠 풏/ퟐ + 푛 a풊 퐥퐨퐠 풏 ) + 2b + d풏 풏 푖<1 푖< :1 2 ퟐ 푛;1 푛/2 = 풏( 푖<1 a풊 퐥퐨퐠 풏 − 푖<1 a풊) + 2b + d풏 풏 풏 ퟐ 풏(풏;ퟏ) ( :ퟏ) = ( a 퐥퐨퐠 풏 − ퟐ ퟐ a) + 2b + d풏 풏 ퟐ ퟐ 풏 ≤ a 풏 − ퟏ 퐥퐨퐠 풏 − a + 2b + d풏 ퟒ 풏 = a풏퐥퐨퐠 풏+ b − a + b + d풏 ퟒ ≤ a풏퐥퐨퐠 풏+ b for a >ퟒ(b + d) 16

Analyzing average time complexity of QuickSort

Part 3 Solving the recurrence exactly

17 Some elementary tools 푛 ퟏ 퐇 풏 = 풊 푖<1 Question: How to approximate 퐇(풏) ?

Answer: 퐇(풏)  퐥퐨퐠풆 풏 + ᵞ, as 풏 increases where ᵞ is Euler’s constant ~0.58 Look at this figure, and relate Hint:  1 it to the curve for f(x)= 1/x and its integration…

1/2 1/3 1/4 1/5 1/6

We shall calculate average number of comparisons during QuickSort using: • our knowledge of solving recurrences by substitution • our knowledge of solving recurrence by unfolding • our knowledge of simplifying a partial fraction (from JEE days) Students should try to internalize the way the above tools are used.

18

푻(풏) : average number of comparisons during QuickSort on 풏 elements. 푻(ퟏ) = ퟎ, 푻(ퟎ) = ퟎ, ퟏ 푛 푻(풏) = 풏 푖<1(푻(풊 − 1) + 푻(풏 − 풊)) + 풏 − ퟏ ퟐ 푛 = 풏 푖<1(푻(풊 − 1)) + 풏 − ퟏ 푛 풏푻(풏) = 2 푖<1(푻(풊 − 1)) + 풏(풏 − ퟏ) -----1 Question: How will this equation appear for 풏 − ퟏ ? 푛;1 (풏 − ퟏ)푻(풏 − ퟏ) = 2 푖<1 (푻(풊 − 1)) + (풏 − ퟏ)(풏 − ퟐ) -----2 Subtracting 2 from 1, we get 풏푻(풏) − (풏 − ퟏ)푻(풏 − ퟏ) = 2 푻(풏 − ퟏ) + 2(풏 − ퟏ) 풏푻(풏) − (풏 + ퟏ)푻(풏 − ퟏ) = 2(풏 − ퟏ) Question: How to solve/simplify it further ? 푻(풏) 푻(풏;1) ퟐ(풏−1) − = 풏+1 풏 풏(풏+1)

19

푻(풏) 푻(풏−1) ퟐ(풏−1) − = 풏:1 풏 풏(풏+1) ퟐ(풏;1) 푻(풎)  품(풏) − 품 풏 − ퟏ = where 품 풎 = 풏(풏:1) , 풎:1 Question: How to simplify RHS ? ퟐ(풏;1) ퟐ 풏:1 ;4 = = 풏(풏:1) 풏(풏:1) ퟐ 4 = − 풏 풏(풏:1) ퟐ 4 4 = − + 풏 풏 풏:1 4 2 = − 풏:1 풏 4 2  품(풏) − 품 풏 − ퟏ = − 풏:1 풏

20 4 2 품(풏) − 품 풏 − ퟏ = − 풏:1 풏 Question: How to calculate 품(풏) ? 4 2 품(풏 − ퟏ) − 품 풏 − ퟐ = − 풏 풏;ퟏ 4 2 품(풏 − ퟐ) − 품 풏 − ퟑ = − 풏;ퟏ 풏;ퟐ … = … 4 2 품(ퟐ) − 품 ퟏ = − 3 ퟐ 4 2 품(ퟏ) − 품 ퟎ = − 2 1 4 ퟏ 4 ퟏ Hence 품(풏) = + (2 풏 ) − 2 = + (2 풏 ) − 4 풏:1 풋<ퟐ 풋 풏:1 풋<ퟏ 풋 4 = + 2퐇(풏) − 4 풏:1 4  푻(풏) = (풏 + 1) ( + 2퐇(풏) − 4) 풏:1 = 2(풏 + ퟏ)퐇(풏) − 4풏

21 푻(풏) = 2(풏 + ퟏ)퐇(풏) − 4풏

= 2(풏 + 1) 퐥퐨퐠풆 풏 + 1.16 (풏 + ퟏ) − ퟒ풏 = 2풏 퐥퐨퐠풆 풏 − 2.84 풏 + O(1) = 2풏 퐥퐨퐠풆 풏

Theorem: The average number of comparisons during QuickSort on 풏 elements approaches 2풏 퐥퐨퐠풆 풏 − 2.84 풏. = 1.39 풏 퐥퐨퐠ퟐ 풏 − O(풏)

The best case number of comparisons during QuickSort on 풏 elements = 풏 퐥퐨퐠ퟐ 풏 The worst case no. of comparisons during QuickSort on 풏 elements = 풏(풏 − ퟏ)

22 Quick sort versus

No. of Comparisons Merge Sort Quick Sort

Average case 풏 퐥퐨퐠ퟐ 풏 1.39 풏 퐥퐨퐠ퟐ 풏 Best case 풏 퐥퐨퐠 풏 ퟐ 풏 퐥퐨퐠ퟐ 풏

Worst case 풏 퐥퐨퐠ퟐ 풏 풏(풏 − ퟏ) After seeing this table, no one would prefer Quick sort to Merge sort But Quick sort is still the most preferred algorithm in practice. Why ?

You will find the answer yourself in the  next programming assignment 23