Reading assignment CSE 417: and n Read Chapter 2 of The Computational Design Manual

Complexity Analysis &

Autumn 2002 Paul Beame

1 2

Complexity analysis Complexity

n Problem size n n The complexity of an algorithm associates a number T(n), the best/worst/average-case n Worst-case complexity: max # steps algorithm takes on any input of size n time the algorithm takes, with each problem size n. n Best-case complexity:min # steps algorithm takes on any input of size n n Mathematically, n Average -case complexity: avg # steps + + algorithm takes on inputs of size n n T: N ® R n that is T is a function that maps positive integers giving problem size to positive real numbers giving number of steps.

3 4

Complexity Why Worst-Case Analysis?

n Appropriate for time-critical applications, e.g. T(n) avionics

n Unlike Average-Case, no debate about what the right definition is

n Analysis often easier Time n Result is often representative of "typical" problem instances

n Of course there are exceptions… Problem size

5 6

1 O-notation etc Complexity

n Given two functions f and g:N®R T(n) n f(n) is O(g(n)) iff there is a constant c>0 so that f(n) is eventually always £ c g(n) 2n log2n n f(n) is W(g(n)) iff there is a constant c>0 so that f(n) is eventually

always³ c g(n) Time

n f(n) is Q(g(n)) iff there is are constants c1 and c2>0 sothat eventually always n log2n c1g(n) £ f(n) £ c2g(n) Problem size

7 8

Examples Working with O-W-Q notation

2 2 3 n 10n -16n+100 is O(n ) also O(n ) n Claim: For any a, b>1 logan is Q(logbn) 2 2 n 10n -16n+100 £ 11n for all n ³ 10 n logan=log ab × logbn so letting c=log ab we get that clog n £ log n £ clog n n 10n2-16n+100 is W (n2) also W (n) b a b 2 2 n 10n -16n+100 ³ 9n for all n ³16 2 2 n b b n Therefore also 10n -16n+100 is Q(n ) Claim: For any a and b>0, (n+a) is Q(n ) b b n (n+a) £ (2n) for n ³ |a| = 2bn b = cn b for c=2b so (n+a)b is O(n b) n 10n2-16n+100 is not O(n) also not W(n3)

b b n (n+a) ³ (n/2) for n ³ 2|a| -b b -b b b n Note: I don’t use notation f(n)=O(g(n)) =2 n =c’n for c’=2 so (n+a) is W(n )

9 10

Complexity Analysis Complexity analysis overview

n We have looked at

n type of complexity analysis Type of Complexity Type of n worst-case, best-case, average-case Analysis Bound Alg n types of function bounds T(n) grows A n O, W , Q like nlog 2n different running n These two considerations are time for each Function mapping Nice formula input string input length to approximating orthogonal to each other runtime of A running time n one can do any type of function bound with any type of complexity analysis Usually we represent the function in the middle using a recurrence relation rather than explicitly 11 12

2 General algorithm design paradigm Example

n Find a way to reduce your problem to n Mergesort

one or more smaller problems of the n on a problem of size at least 2

same type n Sort the first half of the numbers

n Sort the second half of the numbers

n When problems are really small solve n Merge the two sorted lists them directly n on a problem of size 1 do nothing

13 14

Cost of Merge Recurrence relation for Mergesort

n In total including other operations let’s say n Given two lists to merge size n and m each merge costs 3 per element output n Maintain pointer to head of each list “ceiling” round up n T(n)=T(én/2ù)+T(ën/2û)+3n for n³2 n Move smaller element to output and advance pointer n T(1)=1 “floor” round down n n Can use this to figure out T for any value of n n T(5)=T(3)+T(2)+3x5 m =(T(2)+T(1)+3x3)+(T(1)+T(1)+3x2)+15 =((T(1)+T(1)+3x2)+1+9)+(1+1+6)+15 =8+10+8+15=41 n+m n T(n)= 3n log2n Worst case n+m-1 comparisons Best case min(n,m) comparisons 15 16

Insertion Sort Recurrence relation for Insertion Sort

n For i=2 to n do n Let Tn(i) be the worst case cost of creating j¬i list that has first i elements sorted out of n. while(j>1 & X[ j ] > X[ j-1]) do n We want to know Tn(n) swap X[ j ] and X[ j-1] n The insertion of X[i] makes up to i-1 comparisons in the worst case

n T (i)=T (i-1)+i-1 for i>1 n i.e., For i=2 to n do n n n T (1)=0 since a list of length 1 is always Insert X[i] in the sorted list n sorted X[1],...,X[i-1] n Therefore Tn(n)=n(n-1)/2

17 18

3 Solving recurrence relations Series

n e.g. T(n)=T(n-1)+f(n) for n ³ 1 n S= 1 + 2 + 3 + ... + (n-1) T(0)=0 n S= (n-1)+(n-2)+(n-3)+ ... + 1 n fi() n n solution is T(n)=åi=1 2S=n + n + n + .... + n {n-1 terms} n 2S=n(n-1)

n so S=n(n-1)/2 n Insertion sort: Tn(i)=Tn(i-1)+ i-1

n n (-)i1 n Works generally when f(i)=a×i+b for all i so Tn(n)= åi=1 =n(n-1)/2 n Sum = average term size × # of terms

19 20

Quicksort Partition - two finger algorithm

n (X,left,right) n Partition(X, left,right) if left < right choose a random element to be a pivot and split=Partition(X, left, right) pull it out of the array, say at left end Quicksort(X, left, split-1) maintain two fingers starting at each end of the array Quicksort(X, split+1, right) slide them towards each other until you get a pair of elements where right finger has a smaller element and left finger has a bigger one (when compared to pivot) swap them and repeat until fingers meet put the pivot element where they meet

21 22

Partition - two finger algorithm In practice n Partition(X,left,right) n often choose pivot in fixed way as swap X[left], X[random(left, right)] n middle element for small arrays n median of 1st, middle, and last for larger arrays pivot ¬ X[left]; L ¬ left; R ¬right n median of 3 medians of 3 (9 elements in all) for while L pivot & R ³ left) do n also maintain two groups at each end of elements equal to the pivot

R ¬ R-1 n swap them all into middle at the end of Partition if L>R then swap X[L],X[R] n equal elements are bad cases for two fingers swap X[left],X[R] return R

23 24

4 Quicksort Analysis Quicksort Analysis Average Case

n Partition does n-1 comparisons on a list of n Recall length n n Partition does n-1 comparisons on a list of n pivot is compared to each other element length n n If pivot is ith largest then two sub-problems th are of size i-1 and n-i n If pivot is i largest then two sub-problems n If pivot is always in the middle get are of size i-1 and n-i T(n)=2T(n/2)+n-1 comparisons n Pivot is equally likely to be any one of n T(n) = nlog2n better than Mergesort 1st through nth largest n If pivot is always at the end get T(n)=T(n-1)+n-1 comparisons 1 n n T(n) = n(n-1)/2 like Insertion Sort T(n)n=-1+å(T(i-1)+-T(ni)) n i1=

25 26

Quicksort analysis Quicksort analysis

1 n T(n) T(n) = n -1+ å (T(i -1)+ T(n -i) ) Let Q(n)= n i =1 n +1 2 T(1) + 2 T(2) +... + 2 T(n -1) 2 = n -1 + \Q(n +1)£ Q(n)+ n n+1 1 1 1 \nT(n) = n(n-1) + 2 T(1) + 2 T(2) +... + 2 T(n -1) \Q(n) £ 2(1+ + +...+ ) = 2H » 2ln n =1.38log n 2 3 n n 2 (n +1)T(n +1) = (n +1)n + 2 T(1) + 2 T(2) + ... + 2 T(n)

\(n +1)T(n +1)-nT(n) = 2 T(n) + 2n n (Recall that ln n = 1/x dx) (n +1)T(n +1) = (n + 2)T(n) + 2n ò1 T(n +1) T(n) 2n \ = + \T(n)»1.38 nlog n n + 2 n +1 (n+ 1)(n+ 2) 2 27 28

“Gestalt” Analysis of Quicksort Quicksort execution

n Look at elements that ended up in positions j < k of the final sorted array

n The expected # of comparisons in Qsort = the expected # of j < k such that the jth and k th elements were compared th th = sum j < k Pr [j and k elts were compared]

j k

29 30

5 “Gestalt” Analysis of Quicksort “Gestalt” Analysis of Quicksort n Look at elements that end up in positions n The only time they might have been j < k of the final sorted array compared to each is when they were split into n What is the chance that they were compared separate sub-problems to each other during the course of the n The only way they could be split in a step is if the th algorithm? pivot was an element that ended up between j and kth in the final sorted array n They started off together in the same sub-problem th th n The pivot could be j or k n They ended up in different sub-problems n Those are the only cases when they are n The only time they might have been compared to compared each is when they were split into separate sub- n Chances of that happening is 2 out of (k -j+1) problems equally likely possibilities

31 32

Total cost of Quicksort n Total expected cost å 2 kj> k-j+1 n The contribution for each j is at most

æö1111 2++ç÷+…+£ 2logne èø234n n Total 2n logen = 1.38 n log2n

33

6