Revised 27/07/03

About This Lecture

Sorting - Comparisons and Data Movement

In this lecture we will learn about the two basic operations performed by sorting algorithms: comparison and element movement.

We will also learn how to express them as a number of access operations.

Cmput 115 - Lecture 8
Department of Computing Science, University of Alberta
©Duane Szafron 2000
Some code in this lecture is based on code from the book Java Structures by Duane A. Bailey or the companion structure package.


Outline
- Comparing Elements
- Moving Elements

The Sort Problem
Given a collection with elements that can be compared, put the elements in increasing or decreasing order.

index:  0  1  2  3  4  5  6  7  8
before: 60 30 10 20 40 90 70 80 50
after:  10 20 30 40 50 60 70 80 90


Operations
- Given a collection with elements that can be compared, put the elements in increasing or decreasing order.
- We must perform two operations to sort a collection:
  – compare elements
  – move elements
- The time to perform each of these two operations, and the number of times we perform each operation, is critical to the time it takes to sort a collection.

Comparing Primitive Values
- The algorithms we will consider are based on comparing individual elements.
- If the elements are primitive values, we can use the < operator to compare them.
- If the elements are objects, we cannot use <.
- However, Java has an interface called Comparable (in the java.lang package) that defines the method compareTo(Object object), which can be used to compare objects from classes implementing the Comparable interface.


Comparing Objects - compareTo()
- compareTo(Object object)
  – If the receiver is "less" than the argument, it returns a negative int.
  – If the receiver is "equal" to the argument, it returns the zero int.
  – If the receiver is "greater" than the argument, it returns a positive int.

Designing Classes for Sorting
- If we write our sorting algorithms using the compareTo() method, we can apply them to collections that hold objects from any classes that implement the Comparable interface.
- For example, the String and Integer classes implement Comparable.
- Any class designer who wants the objects in a class to be sortable using our algorithms only needs to make the class implement the Comparable interface.
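As a sketch of what such a class might look like (the Account class and its balance field are illustrative, not from the lecture; the modern generic form Comparable&lt;T&gt; is shown, whereas Java at the time used compareTo(Object)):

```java
// Hypothetical example: a class whose objects become sortable by our
// algorithms simply because it implements Comparable.
public class Account implements Comparable<Account> {
    private final int balance;

    public Account(int balance) {
        this.balance = balance;
    }

    // Negative if the receiver is "less" than the argument,
    // zero if "equal", positive if "greater".
    @Override
    public int compareTo(Account other) {
        return Integer.compare(this.balance, other.balance);
    }
}
```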


The Time for Comparing Values
- The time to actually compare two primitive values is small (one virtual machine instruction).
- The comparison time for values is dominated by the time it takes to access the two elements being compared.
- For example, the two array accesses take much more time than the actual comparison in the code:
  if (data[i] < data[j])
- Therefore, a comparison of primitive values "costs" two data accesses.

The Time for Comparing Objects (1)
- It takes longer to compare two Java objects than two Java primitive values.
- To compare objects, the compareTo() method must access not only the object, but its internal state as well.
- For example, the next slide shows the Java source code from the library (java.lang.String) to compare two Strings.
  – The details are not important; just notice that it requires many accesses to the array that each String uses to store its chars.
- The important point is that comparing objects can "cost" many data accesses.

The Time for Comparing Objects (2)

public int compareTo(String anotherString) {
    int len1 = this.count;
    int len2 = anotherString.count;
    int n = Math.min(len1, len2);
    char v1[] = this.value;
    char v2[] = anotherString.value;
    int i = this.offset;
    int j = anotherString.offset;
    while (n-- != 0) {
        char c1 = v1[i++];
        char c2 = v2[j++];
        if (c1 != c2) {
            return c1 - c2;
        }
    }
    return len1 - len2;
}

Moving Elements
- Besides comparing elements, the only other operation that is essential to sorting is moving elements.
- The exact code for moving elements depends on the type of collection and the pattern of element movement, but it consists of a series of data accesses.
- One common form of element movement is an exchange, which is done using a single temporary variable and three assignments.
- This process usually involves four container accesses and two local variable accesses.
- Since the local variable accesses often get mapped to registers or cache memory, we won't count them.
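The exchange just described can be written as a helper method. The lecture's swap() is used later but not shown on these slides, so this is a plausible reconstruction:

```java
public class Exchange {
    // Exchange the elements at positions i and j using a single
    // temporary variable and three assignments: four container
    // accesses in total (the temp accesses are not counted).
    public static void swap(Comparable[] anArray, int i, int j) {
        Comparable temp = anArray[i]; // container access 1
        anArray[i] = anArray[j];      // container accesses 2 and 3
        anArray[j] = temp;            // container access 4
    }
}
```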

Exchange (1) / Exchange Algorithm (2)

An exchange of c[i] and c[j] takes three assignments through a temp variable. Initially ref i refers to data a and ref j refers to data b:

temp = c[i];   // temp and c[i] both refer to data a
c[i] = c[j];   // c[i] and c[j] both refer to data b
c[j] = temp;   // c[j] now refers to data a: the references are exchanged

(Figure: diagrams of ref i, ref j, and temp before and after each assignment.)


Comparison and Movement Times
- To predict the total time for an algorithm, we can add the accesses used for comparison and the accesses used for movement in the algorithm.
- If the container holds primitive values, each comparison requires two accesses, but if the container holds objects the number of accesses may be harder to compute.
- If the algorithm uses exchanges, each exchange requires four accesses, but if the algorithm uses a different style of data movement, the number of accesses may be harder to compute.

Sorting
We will look at these sorting algorithms:
– Selection sort
– Insertion sort
– Merge sort
– Quick sort

Sorting - Selection Sort
Cmput 115 - Lecture 9

About This Lecture
In this lecture we will review the sorting algorithm called Selection Sort, which was presented in CMPUT 114. We will analyze the time and space complexity of a standard implementation for sorting arrays.

Outline
- Selection Sort Algorithm
- Selection Sort - Implementation for Arrays
- Time and Space Complexity of Selection Sort

Selection Sort Algorithm
Input:
  anArray – array of Comparable objects
  n – sort the elements in positions 0…(n-1)
Output:
  a sorted array [0..(n-1)]
Idea: First pick the largest item and store it at the end, then pick the 2nd largest item and put it in the 2nd last position, …
Algorithm:
For last = (n-1), (n-2), …, 1
  – Find the largest element in positions 0…last
  – Exchange it with the element at index last

Selection Sort Code - Arrays

public static void selectionSort(Comparable anArray[], int n) {
    // pre: 0 <= n <= anArray.length
    // post: values in anArray positions 0…(n-1) are in ascending order
    int maxIndex; // index of largest object
    int last;     // index of last unsorted element
    for (last = n-1; last > 0; last--) {
        maxIndex = getMaxIndex(anArray, last);
        swap(anArray, last, maxIndex);
    }
    // we could check to see if maxIndex != last
    // and only swap in this case.
}

Method - getMaxIndex() - Arrays

public static int getMaxIndex(Comparable anArray[], int last) {
    // pre: 0 <= last < anArray.length
    // post: return the index of the max value in
    //       positions 0…last of the given array
    int maxIndex; // index of largest object
    int index;    // index of inspected element
    maxIndex = last;
    for (index = last - 1; index >= 0; index--) {
        if (anArray[index].compareTo(anArray[maxIndex]) > 0)
            maxIndex = index;
    }
    return maxIndex;
}

code based on Bailey pg. 82

Counting Comparison Accesses
- How many comparison accesses are required for a selection sort of n elements?
- All the comparisons are done in getMaxIndex.
- getMaxIndex's loop does one comparison per iteration, and iterates "last" times, so each call to getMaxIndex does "last" comparisons.
- The loop body in the sort method is executed (n-1) times, with "last" taking on the values n-1, n-2, … 1. getMaxIndex is called once with each value of "last".
- The total number of comparisons is: (n-1) + (n-2) + … + 1 = (1 + 2 + … + n) - n = [n(n+1)/2] - n
- Since each comparison requires two accesses there are: n(n+1) - 2n = n² - n = O(n²) comparison accesses.

Selection Sort Code (recursion)

public static void selectionSort(Comparable anArray[], int n) {
    // pre: 0 <= n <= anArray.length
    // post: objects in anArray positions 0…(n-1) are in ascending order
    int maxIndex; // index of largest object
    if (n > 1) {
        maxIndex = getMaxIndex(anArray, n - 1);
        swap(anArray, n - 1, maxIndex);
        selectionSort(anArray, n - 1);
    }
}
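The n(n-1)/2 total can be checked by instrumenting the sort with a comparison counter. The counter is our addition, and int elements are used so the sketch is self-contained:

```java
public class SelectionCount {
    static long comparisons = 0;

    static int getMaxIndex(int[] anArray, int last) {
        int maxIndex = last;
        for (int index = last - 1; index >= 0; index--) {
            comparisons++; // one comparison per loop iteration
            if (anArray[index] > anArray[maxIndex]) maxIndex = index;
        }
        return maxIndex;
    }

    public static void selectionSort(int[] anArray, int n) {
        for (int last = n - 1; last > 0; last--) {
            int maxIndex = getMaxIndex(anArray, last);
            int temp = anArray[last];          // exchange, as in swap()
            anArray[last] = anArray[maxIndex];
            anArray[maxIndex] = temp;
        }
    }
}
```

For n = 9 the counter ends at 36 = 9*8/2, matching (n-1) + (n-2) + … + 1 = n(n-1)/2.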


Counting Move Accesses
- How many move accesses are required for a selection sort of n elements?
- The only time we do a move is in a reference exchange (swap), which requires 4 accesses.
- The sort method executes swap() once on each iteration of its loop. The loop iterates n-1 times.
- The total number of move accesses is: 4*(n-1) = O(n).
- Since the number of comparison accesses is O(n²), the move accesses are insignificant.
- In total the code does O(n²) accesses.

Time Complexity of Selection Sort
- The number of comparisons and moves is independent of the data (the initial order of elements doesn't matter).
- Therefore, the best, average and worst case time complexities of Selection Sort are all O(n²).

Space Complexity of Selection Sort
Besides the collection itself, the only extra storage for this sort is the single temp reference used in the swap method. Therefore, the space complexity of Selection Sort is O(n): the collection itself plus O(1) extra storage.

Sorting - Insertion Sort
Cmput 115 - Lecture 10


About This Lecture
- In this lecture we will learn about a sorting algorithm called the Insertion Sort.
- We will study its implementation and its time and space complexity.

Outline
- The Insertion Sort Algorithm
- Insertion Sort - Arrays
- Time and Space Complexity of Insertion Sort

Insertion Sort Algorithm
Input:
  anArray – array of Comparable objects
  n – sort the elements in positions 0…(n-1)
Output:
  a sorted array [0..(n-1)]
Idea:
(1) We first sort the first 1 object, then the first 2 objects, then the first 3 objects, …, then all n objects.
(2) At each step, we insert object k into an appropriate position between 0 and k.
Algorithm:
FOR (k = 1; k <= n-1; k++) DO
  – find the appropriate position for anArray[k] between 0..k
  – insert anArray[k] into anArray[0..k]

Insertion Sort Algorithm 1
The lower part of the collection is sorted and the higher part is unsorted.

index: 0  1  2  3  4  5  6  7  8
       60 30 10 20 40 90 70 80 50

Insert the first element of the unsorted part into the correct place in the sorted part.

       30 60 10 20 40 90 70 80 50

Insertion Sort Algorithm 2
(Figure: the array after each successive insertion step, ending fully sorted: 10 20 30 40 50 60 70 80 90.)

Insertion Sort Code - Arrays

public static void insertionSort(Comparable anArray[], int size) {
    // pre: 0 <= size <= anArray.length
    // post: values in anArray are in ascending order
    int index; // index of start of unsorted part
    for (index = 1; index < size; index++) {
        moveElementAt(anArray, index);
    }
}

code based on Bailey pg. 83


Moving Elements in Insertion Sort
- The Insertion Sort does not use an exchange operation.
- When an element is inserted into the ordered part of the collection, it is not just exchanged with another element.
- Several elements must be "moved".

index: 0  1  2  3          0  1  2  3
       10 30 60 20   ->    10 20 30 60

Multiple Element Exchanges
- The naïve approach is to just keep exchanging the new element with its left neighbor until it is in the right location.

10 30 60 20  ->  10 30 20 60  ->  10 20 30 60

- Every exchange costs four access operations.
- If we move the new element two spaces to the left, this costs 2*4 = 8 access operations.

Method - moveElementAt() - Arrays (exchange version)

public static void moveElementAt(Comparable anArray[], int last) {
    // pre: 0 <= last < anArray.length and anArray in
    //      ascending order from 0 to last-1
    // post: anArray in ascending order from 0 to last
    while ((last > 0) &&
           (anArray[last].compareTo(anArray[last - 1]) < 0)) {
        swap(anArray, last, last - 1);
        last--;
    }
}

code based on Bailey pg. 83

Avoiding Multiple Exchanges
- We can insert the new element in the correct place with fewer accessing operations - only 6 accesses!

  1. move = anArray[3];
  2. anArray[3] = anArray[2];
  3. anArray[2] = anArray[1];
  4. anArray[1] = move;

(Figure: 10 30 60 20, with 20 moving two places left to give 10 20 30 60.)

- In general, if an element is moved p places it only takes (2*p + 2) access operations, not (4*p) access operations as required by p exchanges.

Recall Element Insertion in a Vector
- This operation is similar to inserting a new element in a Vector.
- Each existing element was "moved" to the right before inserting the new element in its correct location.

Recall Vector Insertion Code

public void insertElementAt(Object object, int index) {
    // pre: 0 <= index <= size()
    // post: inserts the given object at the given index,
    //       moving elements from index to size()-1 to the right
    int i;
    this.ensureCapacity(this.elementCount + 1);
    for (i = this.elementCount; i > index; i--)
        this.elementData[i] = this.elementData[i - 1];
    this.elementData[index] = object;
    this.elementCount++;
}

code based on Bailey pg. 39

Differences from Element Insertion
In Vector element insertion:
  – We have a reference to the new element.
  – We know the index location for the new element.
In the Insertion sort:
  – We don't have a reference to the new element, only an index in the array where the new element is currently located.
  – We don't know the index location for the new element. We need to find the index by comparing the new element with the elements in the collection from right to left.

Method - moveElementAt() - Arrays

public static void moveElementAt(Comparable anArray[], int last) {
    // pre: 0 <= last < anArray.length and anArray in
    //      ascending order from 0 to last-1
    // post: anArray in ascending order from 0 to last
    Comparable move; // a reference to the element being moved
    move = anArray[last];
    while ((last > 0) && (move.compareTo(anArray[last - 1]) < 0)) {
        anArray[last] = anArray[last - 1];
        last--;
    }
    anArray[last] = move;
}

code based on Bailey pg. 83
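The insertionSort() and moveElementAt() methods, assembled into one self-contained class so they can be run together (a sketch; the lecture spreads this code over several slides):

```java
public class InsertionSortDemo {
    public static void insertionSort(Comparable[] anArray, int size) {
        // Grow the sorted lower part one element at a time.
        for (int index = 1; index < size; index++) {
            moveElementAt(anArray, index);
        }
    }

    static void moveElementAt(Comparable[] anArray, int last) {
        Comparable move = anArray[last];  // the element being inserted
        while (last > 0 && move.compareTo(anArray[last - 1]) < 0) {
            anArray[last] = anArray[last - 1]; // shift right: 2 accesses
            last--;
        }
        anArray[last] = move;             // final placement
    }
}
```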


Counting Comparisons
- How many comparison operations are required for an insertion sort of an n-element collection?
- The sort method calls moveElementAt() in a loop for the indexes: index = 1, 2, … n-1.
  for (index = 1; index < size; index++) { moveElementAt(anArray, index); }
- Each time moveElementAt() is executed for some argument, last, it does a comparison in a loop for some of the indexes: last, last-1, … 1.
  while ((last > 0) && (anArray[last].compareTo(anArray[last-1]) < 0)) { anArray[last] = anArray[last-1]; last--; }

Comparisons - Best Case
- In the best case there is 1 comparison per call, since the first comparison terminates the loop.
- move(a, 1), move(a, 2), …, move(a, n-1) each do 1 comparison.
- The total number of comparisons is: (n-1) * 1 = n - 1 = O(n)


Comparisons - Worst Case
- In the worst case there are "last" comparisons per call, since the loop is not terminated until last == 0.
- move(a, 1) does 1 comparison, move(a, 2) does 2, …, move(a, n-1) does n-1.
- The total number of comparisons is: 1 + 2 + … + (n-1) = [(n-1)*n] / 2 = O(n²)

Comparisons - Average Case 1
- In the average case it is equally probable that the number of comparisons is any number between 1 and "last" (inclusive) for each call.
- Note that the average for each call is: (1 + 2 + … + last)/last = [last*(last+1)]/[2*last] = (last+1)/2


Comparisons - Average Case 2
- In the average case, the total number of comparisons is:
  (1+1)/2 + (2+1)/2 + … + ((n-1)+1)/2
  = 1/2 + 1/2 + 2/2 + 1/2 + … + (n-1)/2 + 1/2
  = [1 + 2 + … + (n-1)]*(1/2) + (n-1)*(1/2)
  = [(n-1)*n/2]*(1/2) + (n-1)*(1/2)
  = [(n-1)*n]*(1/4) + 2*(n-1)*(1/4)
  = [(n-1)*n + 2*(n-1)]*(1/4)
  = [(n-1)*(n+2)]*(1/4) = O(n²)

Counting Moves
- How many move operations are required for an insertion sort of an n-element collection?
- The sort method calls moveElementAt() in a loop for the indexes: k = 1, 2, … n-1.
- Every time the method is called, the element is moved one place for each successful comparison.
- There is one move operation for each comparison, so the best, worst and average number of moves is the same as the best, worst and average number of comparisons.


Counting Accesses
- Each comparison requires 2 accesses.
- Each move requires 2 accesses.
- Each time that moveElementAt() is called, there are two other accesses, one before the while loop and one after.
- Since moveElementAt() is called n-1 times, there are 2*(n-1) = O(n) of these extra accesses.
- Therefore, the "order" of the best, worst and average number of accesses is the same as the "order" of the best, worst and average number of comparisons.

Time Complexity of Insertion Sort
- Best case O(n) accesses.
- Worst case O(n²) accesses.
- Average case O(n²) accesses.
- Note: this means that for nearly sorted collections, insertion sort is better than selection sort, even though in the average and worst cases they are the same: O(n²).

Space Complexity of Insertion Sort
Besides the collection itself, the only extra storage for this sort is the single temp reference used in the move element method. Therefore, the space complexity of Insertion Sort is O(n): the collection itself plus O(1) extra storage.

Sorting - Merge Sort
Cmput 115 - Lecture 11


About This Lecture
- In this lecture we will learn about a sorting algorithm called the Merge Sort.
- We will study its implementation and its time and space complexity.

Outline
- Merge: combining two sorted arrays
- Merge algorithm
- Time and Space complexity of Merge
- The Merge Sort Algorithm
- Merge Sort - Arrays
- Time and Space Complexity of Merge Sort

Merge Sort Algorithm
Input:
  anArray – array of Comparable objects
  n – sort the elements in positions 0…(n-1)
Output:
  a sorted array [0..(n-1)]
Idea: (1) split the given array into two halves, (2) sort each half separately, (3) merge the two sorted arrays into one sorted array
Algorithm: (1) sort anArray[0..n/2-1], (2) sort anArray[n/2..n-1], (3) merge the two arrays into one sorted array

Merging Two Sorted Arrays
Merge is an operation that combines two sorted arrays into one.

10 40 60  +  50 70 80 90  ->  10 40 50 60 70 80 90


Merge Algorithm – initial version
- For now, assume the result is to be placed in a separate array called result, which has already been allocated.
- The two given arrays are called front and back (the reason for these names will be clear later).
- front and back are in increasing order.
- For the complexity analysis, the size of the input, n, is the sum nfront + nback.

Merge Algorithm
- For each array keep track of the current position (initially 0).
- REPEAT until all the elements of one of the given arrays have been copied into result:
  – Compare the current elements of front and back
  – Copy the smaller into the current position of result (break ties however you like)
  – Increment the current position of result and of the array that was copied from
- Copy all the remaining elements of the other given array into result.

Merge Example (1) and (2)
Current positions start at 0 in each array. Repeatedly compare the current elements, copy the smaller into result, and update the current positions:

front: 10 40 60   back: 50 70 80 90
result:  ->  10  ->  10 40  ->  10 40 50  ->  …

Merge Example (3)
When one array is exhausted, copy the rest of the elements from the other array:

front: 10 40 60   back: 50 70 80 90
result: 10 40 50 60  ->  10 40 50 60 70 80 90

Merge Code – version 1 (1)

private static void merge(int[] front, int[] back, int[] result,
                          int first, int last) {
    // pre: all positions in front and back are sorted,
    //      result is allocated,
    //      (last-first+1) == (front.length + back.length)
    // post: positions first to last in result contain one copy
    //       of each element in front and back in sorted order.
    int f = 0;     // front index
    int b = 0;     // back index
    int i = first; // index in result
    while ((f < front.length) && (b < back.length)) {
        if (front[f] < back[b]) {
            result[i] = front[f];
            i++;
            f++;
        } else {
            result[i] = back[b];
            i++;
            b++;
        }
    }
    // continued on the next slide

Merge Code – version 1 (2)

    // copy remaining elements into result
    while (f < front.length) {
        result[i] = front[f];
        i++;
        f++;
    }
    while (b < back.length) {
        result[i] = back[b];
        i++;
        b++;
    }
}

Merge – complexity
- Every element in front and back is copied exactly once. Each copy is two accesses, so the total number of accesses due to copying is 2n.
- The number of comparisons could be as small as min(nfront, nback) or as large as (n-1). Each comparison is two accesses.
- In the worst case the total number of accesses is 2n + 2(n-1) = O(n).
- In the best case the total number of accesses is 2n + 2*min(nfront, nback) = O(n).
- The average case is between the worst and best case and is therefore also O(n).
- Memory required: 2n = O(n)

Merge Sort Algorithm
- Merge Sort sorts a given array (anArray) into increasing order as follows:
- Split anArray into two non-empty parts any way you like. For example:
  front = the first n/2 elements in anArray
  back = the remaining elements in anArray
- Sort front and back by recursively calling MergeSort with each one.
- Now you have two sorted arrays containing all the elements from the original array. Use merge to combine them, and put the result in anArray.

Merge Sort – (1) Split

anArray: 40 60 10 90 50 80 70
front: 40 60 10   back: 90 50 80 70

Merge Sort – (2) recursively sort front; (3) recursively sort back

anArray: 40 60 10 90 50 80 70
mergesort(front): 40 60 10  ->  10 40 60
mergesort(back):  90 50 80 70  ->  50 70 80 90

Merge Sort – (4) merge, and summary

Original array: 40 60 10 90 50 80 70
Split and recursively sort each part: 10 40 60 and 50 70 80 90
merge: final result 10 40 50 60 70 80 90

MergeSort Code – version 1

public static void mergesort(int[] anArray, int first, int last) {
    // pre: last < anArray.length
    // post: anArray positions first to last are in increasing order
    int size = (last - first) + 1;
    if (size > 1) {
        int frontsize = size / 2;
        int backsize = size - frontsize;
        int[] front = new int[frontsize];
        int[] back = new int[backsize];
        int i;
        for (i = 0; i < frontsize; i++) { front[i] = anArray[first + i]; }
        for (i = 0; i < backsize; i++) { back[i] = anArray[first + frontsize + i]; }
        mergesort(front, 0, frontsize - 1);
        mergesort(back, 0, backsize - 1);
        merge(front, back, anArray, first, last);
    }
}

MergeSort Call Graph (n=7)
Each box represents one invocation of the mergesort method, labeled with the range of positions it sorts:

0-6 -> 0-2, 3-6
0-2 -> 0-0, 1-2;   3-6 -> 3-4, 5-6
1-2 -> 1-1, 2-2;   3-4 -> 3-3, 4-4;   5-6 -> 5-5, 6-6

How many levels are there, in general, if the array is divided in half each time?
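Version 1, assembled into a single runnable class so the merge and mergesort methods can be exercised together (a sketch restating the slide code; nothing new except the class wrapper):

```java
public class MergeSortV1 {
    private static void merge(int[] front, int[] back, int[] result,
                              int first, int last) {
        int f = 0, b = 0, i = first;
        while (f < front.length && b < back.length) {
            if (front[f] < back[b]) { result[i++] = front[f++]; }
            else                    { result[i++] = back[b++];  }
        }
        // copy remaining elements from whichever array is not exhausted
        while (f < front.length) { result[i++] = front[f++]; }
        while (b < back.length)  { result[i++] = back[b++];  }
    }

    public static void mergesort(int[] anArray, int first, int last) {
        int size = (last - first) + 1;
        if (size > 1) {
            int frontsize = size / 2;
            int backsize = size - frontsize;
            int[] front = new int[frontsize];
            int[] back = new int[backsize];
            for (int i = 0; i < frontsize; i++) front[i] = anArray[first + i];
            for (int i = 0; i < backsize; i++) back[i] = anArray[first + frontsize + i];
            mergesort(front, 0, frontsize - 1);
            mergesort(back, 0, backsize - 1);
            merge(front, back, anArray, first, last);
        }
    }
}
```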

MergeSort Call Graph (general)
Suppose n = 2^k. Level 0 sorts n positions; level 1 has two boxes of n/2; level 2 has four boxes of n/4; …; the bottom level has n boxes of 1.
- How many levels are there?
- What value is in each box at level j?
- How many boxes are on level j?

MergeSort – complexity analysis (1)
- Each invocation of mergesort on p array positions does the following:
  – Copies all p positions once (# accesses = O(p))
  – Calls merge (# accesses = O(p))
- Observe that p is the same for all invocations at the same level; therefore the total # of accesses at a given level j is O((# invocations at level j) * pj).


MergeSort – complexity analysis (2)
- The total # of accesses at level j is O((# invocations at level j) * pj) = O(2^j * (n/2^j)) = O(n)
- In other words, the total # of accesses at each level is the same, O(n).
- The total # of accesses for the entire mergesort is the sum of the accesses for all the levels. Since the accesses at every level are the same – O(n) – this is (# levels)*O(n) = O(log(n))*O(n) = O(n*log(n))

Time Complexity of Merge Sort
- Best case O(n log(n))
- Worst case O(n log(n))
- Average case O(n log(n))
- Note that the insertion sort is actually a better sort than the merge sort if the original collection is almost sorted.


Space Complexity of Merge Sort (1)
- In any recursive method, space is required for the stack frames created by the recursive calls.
- The maximum amount of memory required for this purpose is (size of the stack frame) * (depth of recursion).
- The size of the stack frame is a constant, and for mergesort the depth of recursion (the number of levels) is O(log(n)).
- The memory required for the stack frames is therefore O(log(n)).

Space Complexity of Merge Sort (2)
- Besides the given array, there are two temporary arrays allocated in each invocation whose total size is the same as the number of positions to be sorted: at level j this is pj = n/2^j.
- This space is allocated before the recursive calls are made and is needed after the recursive calls have returned; therefore the maximum total amount of space allocated is the sum of n/2^j for j = 0…log(n).
- This sum is O(n) – it is a little less than 2*n.
- Therefore, the space complexity of Merge Sort is O(n), but doubling the collection storage may sometimes be a problem.


Making mergesort faster
- Although we cannot improve the big-O complexity of mergesort, we can make it faster in practice by doing two things:
  – Reducing the amount of copying
  – Allocating temporary storage once at the very outset
- We will make these improvements in 2 steps.

Reducing copying - back
- The back array is easy to eliminate. We just use the back portion of anArray in its place.
- The only significant change in the code is to the merge method, which now must be told where the "back" of anArray begins.
- We can also eliminate from merge the final loop which copies values from back into the final positions of anArray, since these will already be in the correct place in anArray.

MergeSort Code – version 2
(lines struck out from version 1 are shown as "removed" comments)

public static void mergesort(int[] anArray, int first, int last) {
    // pre: last < anArray.length
    // post: anArray positions first to last are in increasing order
    int size = (last - first) + 1;
    if (size > 1) {
        int frontsize = size / 2;
        int backsize = size - frontsize;
        int[] front = new int[frontsize];
        // removed: int[] back = new int[backsize];
        int i;
        for (i = 0; i < frontsize; i++) { front[i] = anArray[first + i]; }
        // removed: for (i = 0; i < backsize; i++) { back[i] = anArray[first + frontsize + i]; }
        mergesort(front, 0, frontsize - 1);
        // removed: mergesort(back, 0, backsize - 1);
        int backstart = first + frontsize;
        mergesort(anArray, backstart, last);
        // removed: merge(front, back, anArray, first, last);
        merge(front, anArray, first, backstart, last);
    }
}

Merge Code – version 2

private static void merge(int[] front, int[] anArray,
                          int first, int backstart, int last) {
    int f = 0;         // front index
    int b = backstart; // back index
    int i = first;     // index in result
    while ((f < front.length) && (b <= last)) {
        if (front[f] < anArray[b]) {
            anArray[i] = front[f];
            i++;
            f++;
        } else {
            anArray[i] = anArray[b]; // i <= b ALWAYS AT THIS POINT
            i++;
            b++;
        }
    }
    // copy remaining elements of front into the result (anArray)
    while (f < front.length) {
        anArray[i] = front[f];
        i++;
        f++;
    }
    // removed: the final loop that copied the rest of back, since
    // those elements are already in their correct positions
}
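Version 2, assembled into one runnable class to check the logic end-to-end (a sketch of the slide code; at this stage only the back array has been eliminated):

```java
public class MergeSortV2 {
    private static void merge(int[] front, int[] anArray,
                              int first, int backstart, int last) {
        int f = 0, b = backstart, i = first;
        while (f < front.length && b <= last) {
            if (front[f] < anArray[b]) { anArray[i++] = front[f++]; }
            else { anArray[i++] = anArray[b++]; } // i <= b always holds
        }
        while (f < front.length) { anArray[i++] = front[f++]; }
        // no loop for back: its remaining elements are already in place
    }

    public static void mergesort(int[] anArray, int first, int last) {
        int size = (last - first) + 1;
        if (size > 1) {
            int frontsize = size / 2;
            int[] front = new int[frontsize];       // copy of front half
            for (int i = 0; i < frontsize; i++) front[i] = anArray[first + i];
            int backstart = first + frontsize;
            mergesort(front, 0, frontsize - 1);     // sort the copy
            mergesort(anArray, backstart, last);    // sort back in place
            merge(front, anArray, first, backstart, last);
        }
    }
}
```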


Improving efficiency – front (1)
- front is as easy to eliminate as back in the mergesort method. We just use the front portion of anArray in its place.
- But the merge method must make a copy of the front portion of anArray before merging begins.
- This does not reduce copying at all, but it moves the temporary storage into the merge method, which means it is allocated AFTER the recursive calls and therefore less memory is needed in total.

Improving efficiency – front (2)
- In addition, instead of allocating the storage each time merge is called, we can allocate it once, before the first call to mergesort is made, and pass this extra array on all calls.
- This saves the time it takes to allocate memory and garbage collect it, which in the previous versions was done once for every invocation.

Sorting - Quick Sort
Cmput 115 - Lecture 12

About This Lecture
In this lecture we will learn about a sorting algorithm called the Quick Sort. We will study its implementation and its time and space complexity.


Outline
- The Quick Sort Algorithm
- Time and Space Complexity of Quick Sort

Algorithm – initial version
- As we did with Mergesort, we will first give a simple version of Quicksort and then make efficiency improvements.
- Quicksort can be seen as a variation of Mergesort in which front and back are defined in a different way.


Merge Sort Algorithm - reminder
- Merge Sort sorts a given array (anArray) into increasing order as follows:
- Split anArray into two non-empty parts any way you like. For example: front = the first n/2 elements in anArray; back = the remaining elements.
- Sort front and back by recursively calling MergeSort with each one.
- Now you have two sorted arrays containing all the elements from the original array. Use merge to combine them, and put the result in anArray.

Quicksort Algorithm
- Partition anArray into two non-empty parts. Pick any value in the array, pivot.
  small = the elements in anArray < pivot
  large = the elements in anArray > pivot
  Place pivot in either part, so as to make sure neither part is empty.
- Sort small and large by recursively calling Quicksort with each one.
- You could use merge to combine them, but because you know the elements in small are smaller than the elements in large, you can simply concatenate small and large, and put the result into anArray.

Quicksort – (1) Partition; (2) recursively sort small

anArray: 50 60 40 90 10 80 70   (pivot = 50)
small: 40 10 50   large: 60 90 80 70
quicksort(small): 40 10 50  ->  10 40 50

Quicksort – (3) recursively sort large; (4) concatenate

quicksort(large): 60 90 80 70  ->  60 70 80 90
concatenate: 10 40 50 + 60 70 80 90  ->  10 40 50 60 70 80 90 (final result)

Quicksort Algorithm – summary

Original array: 50 60 40 90 10 80 70   (pivot = 50)
Partition: 40 10 50 | 60 90 80 70
Recursively sort each part, then concatenate: 10 40 50 60 70 80 90 (final result)

Quick Sort Algorithm

Input:  anArray – array of Comparable objects
        n – sort the elements in positions 0…(n-1)
Output: a sorted array [0..(n-1)]
Algorithm:
        (1) split the given array into two arrays around a randomly chosen
            pivot, so that all objects in the first are less than all
            objects in the second,
        (2) sort the first using quick sort,
        (3) sort the second using quick sort,
        (4) concatenate the two arrays into one sorted array

Quicksort - Time Complexity (best)

Like mergesort, a single invocation of quicksort on an array of size p has
complexity O(p):
- p comparisons = 2*p accesses
- 2*p moves (copying) = 4*p accesses

Best case: every pivot chosen by quicksort partitions the array into
equal-sized parts. In this case quicksort has the same big-O complexity as
mergesort – O(n*log(n)).

Quicksort Call Graph – best case

Number of elements to sort (p) at each level of the call graph:

                  n
            n/2       n/2
         n/4   n/4  n/4   n/4
                  ...
        1 1 1 1 1 1 1 1 ...

Quicksort - Worst Case

Worst case: the pivot chosen is the largest or smallest value in the array.
Partition creates one part of size 1 (containing only the pivot), the other
of size p-1.

Quicksort Call Graph – worst case

Number of elements to sort (p) at each level of the call graph:

        n
      1   n-1
        1   n-2
          1   ...

Quicksort Time Complexity - Worst Case

- There are n-1 invocations of Quicksort (not counting base cases) with
  arrays of size p = n, (n-1), … 2.
- Since each of these does O(p) accesses, the total number of accesses is
  O(n) + O(n-1) + … + O(1) = O(n^2)
- Ironically, the worst case occurs when the list is sorted (or nearly
  sorted)!
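This quadratic behaviour is easy to observe by counting comparisons. The sketch below (hypothetical names, not from the lecture; it uses a simple one-scan partition for brevity rather than the lecture's two-pointer partition) sorts an already-sorted array with a first-element pivot; every partition peels off just one element, giving exactly n(n-1)/2 comparisons:

```java
public class WorstCaseDemo {
    static long comparisons = 0;    // counts pivot comparisons across all calls

    // Quicksort on anArray[first..last] using the first element as pivot.
    // With a sorted input the pivot is the minimum of its range, so the
    // "small" part is always empty and "large" has p-1 elements.
    static void quicksort(int[] anArray, int first, int last) {
        if (first >= last) return;
        int pivot = anArray[first];
        int boundary = first;                  // end of the region < pivot
        for (int i = first + 1; i <= last; i++) {
            comparisons++;                     // one comparison per non-pivot element
            if (anArray[i] < pivot) {
                boundary++;
                int tmp = anArray[boundary]; anArray[boundary] = anArray[i]; anArray[i] = tmp;
            }
        }
        int tmp = anArray[first]; anArray[first] = anArray[boundary]; anArray[boundary] = tmp;
        quicksort(anArray, first, boundary - 1);
        quicksort(anArray, boundary + 1, last);
    }

    // Sort an already-sorted array of size n and report the comparison count.
    static long countForSortedInput(int n) {
        comparisons = 0;
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;  // sorted input: the worst case
        quicksort(a, 0, n - 1);
        return comparisons;
    }
}
```

For n = 100 this yields 100*99/2 = 4950 comparisons, i.e. O(n^2) growth, whereas a well-balanced run would need on the order of n*log2(n) ≈ 664.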

Comparisons and Accesses - Average Case

- The average case must be between the best case and the worst case, but
  since the best case is O(n log(n)) and the worst case is O(n^2), some
  analysis is necessary to find the answer.
- Analysis yields a complex recurrence.
- On average, the elements are in random order after each partition, so
  about half should be smaller than the pivot and about half should be
  larger; the average case is therefore more like the best case.
- The average case number of comparisons turns out to be approximately:
  1.386*n*log(n) - 2.846*n
- Therefore, the average case time complexity is O(n log(n)).

Time Complexity of Quick Sort

    Best case      O(n log(n))
    Worst case     O(n^2)
    Average case   O(n log(n))

- Note that quick sort is inferior to insertion sort and merge sort if the
  list is sorted, nearly sorted, or reverse sorted.

Quicksort - Space Complexity (1)

- The memory needed is the original array plus the stack frame memory:
    n + O(depth of the call graph)
- The version we are analyzing has two temporary arrays (small and large)
  in each invocation of quicksort. If we make an extra scan of the array to
  determine their exact sizes, the additional memory for these is p.
- Because these are allocated before the recursive calls are made and
  needed after the calls return, the maximum total memory allocated is the
  sum of the memory needed on a path in the call graph.

Quicksort - Space Complexity (2)

- In the best case the maximum memory allocated is
    n + O(log n) + n + (n/2) + (n/4) + … + 2 = O(n)
- In the worst case the maximum memory allocated is
    n + O(n) + n + (n-1) + (n-2) + … + 2 = O(n^2)
- It is possible to reduce this to O(n). This is a very important
  improvement, even if the time remains O(n^2) in the worst case.
- The key idea is to re-arrange the code so that small and large are not
  used in the recursive calls or needed after the recursive calls return.
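The two series above can be checked numerically. This small sketch (names are illustrative) sums the temporary-array sizes along the deepest path of the call graph for each case:

```java
public class SpaceDemo {
    // Best case: each level of the path needs half as much temporary
    // storage as the one above: n + n/2 + n/4 + ... + 2 < 2n, i.e. O(n).
    static long bestCasePathMemory(long n) {
        long total = 0;
        for (long p = n; p >= 2; p /= 2) total += p;
        return total;
    }

    // Worst case: each level needs one element less than the one above:
    // n + (n-1) + ... + 2 = n(n+1)/2 - 1, i.e. O(n^2).
    static long worstCasePathMemory(long n) {
        long total = 0;
        for (long p = n; p >= 2; p--) total += p;
        return total;
    }
}
```

For n = 1024 the best-case path sums to 2046 (under 2n), while the worst-case path for n = 1000 sums to 500499, roughly n^2/2.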

Quicksort Algorithm – version 1

    Partition anArray into small and large
    quicksort(small)
    quicksort(large)
    Copy small into anArray positions 0…(smallsize-1)
    Copy large into anArray positions smallsize…(n-1)

    worst case space complexity: O(n^2)

Quicksort Algorithm – version 2

    Partition anArray into small and large
    Copy small into anArray positions 0…(smallsize-1)
    Copy large into anArray positions smallsize…(n-1)
    quicksort(anArray,0,smallsize-1)
    quicksort(anArray,smallsize,n-1)

    worst case space complexity: O(n)
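Version 2 can be sketched in Java as follows (the class and variable names are hypothetical; the lecture gives only pseudocode). The temporary arrays are copied back before the recursive calls, so they are no longer needed once recursion begins; the pivot is placed at the end of small so that neither recursive range can fail to shrink:

```java
public class QuicksortV2 {
    // Sort anArray[first..last] inclusive. Partition into temporary
    // small/large arrays, copy them back, then recurse on the two
    // regions of anArray itself.
    public static void quicksort(int[] anArray, int first, int last) {
        int n = last - first + 1;
        if (n <= 1) return;
        int pivot = anArray[first];
        // Extra scan to size the temporaries exactly (as in the lecture).
        int lessCount = 0;
        for (int i = first; i <= last; i++)
            if (anArray[i] < pivot) lessCount++;
        int smallSize = lessCount + 1;            // the pivot itself joins small
        int[] small = new int[smallSize];
        int[] large = new int[n - smallSize];
        int s = 0, l = 0;
        for (int i = first + 1; i <= last; i++) { // skip the pivot at `first`
            if (anArray[i] < pivot) small[s++] = anArray[i];
            else large[l++] = anArray[i];
        }
        small[s] = pivot;                         // pivot is >= everything in small
        // Copy back before recursing: small and large become garbage here.
        for (int i = 0; i < smallSize; i++) anArray[first + i] = small[i];
        for (int i = 0; i < large.length; i++) anArray[first + smallSize + i] = large[i];
        quicksort(anArray, first, first + smallSize - 1);
        quicksort(anArray, first + smallSize, last);
    }
}
```

Because small and large die before the recursive calls, at most one level of temporaries is live per path, which is what brings the worst-case space down to O(n).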

Quicksort – (1) Partition

    Original array, pivot = 50: 50 60 40 90 10 80 70

    small: 40 10 50    large: 60 90 80 70

Quicksort – (2) copy back

    Copy small, then large, into the original array:
    40 10 50 60 90 80 70

Quicksort – (3) recursively sort front

    quicksort the front part (positions 0…2):
    10 40 50 60 90 80 70

Quicksort – (4) recursively sort back

    quicksort the back part (positions 3…6):
    10 40 50 60 70 80 90

Eliminating small and large

- In version 2 of Quicksort, small and large are used as temporary storage
  in the course of re-arranging the values in anArray.
- It is possible to re-arrange the values in anArray, using only one
  temporary variable, so that:
  – the pivot is in its final position (pivotIndex)
  – all values in positions < pivotIndex are smaller than the pivot
  – all values in positions > pivotIndex are greater than the pivot

Quicksort Algorithm – version 3

    Partition anArray in-place so that the pivot is in its correct final
    position, pivotIndex, all smaller values are to its left, and all
    larger values are to its right.

    quicksort(anArray,first,pivotIndex-1)
    quicksort(anArray,pivotIndex+1,n-1)

- The space needed is still O(n), but half as much as in version 2.

In-place Partition Algorithm (1)

Our goal is to move one element, the pivot, to its correct final position
so that all elements to the left of it are smaller than it and all elements
to the right of it are larger than it. We will call this operation
partition(). We select the left element as the pivot:

    index: 0  1  2  3  4  5  6  7  8
    value: 60 30 10 20 40 90 70 80 50     (pivot = 60, left = 0, right = 8)

In-place Partition Algorithm (2)

Find the rightmost element that is smaller than the pivot element (50, at
index 8). Exchange the elements and increment the left:

    50 30 10 20 40 90 70 80 60           (pivot now at index 8, left = 1)

In-place Partition Algorithm (3)

Find the leftmost element that is larger than the pivot element (90, at
index 5). Exchange the elements and decrement the right:

    50 30 10 20 40 60 70 80 90           (pivot now at index 5, right = 7)

In-place Partition Algorithm (4)

Find the rightmost element that is smaller than the pivot element. Since
the right index passes the left, there is no such element, and the pivot
(60, at index 5) is in its final location:

    50 30 10 20 40 60 70 80 90

In-place Partition - Arrays

private static int partition(int anArray[], int left, int right) {
  // pre: left <= right
  // post: anArray[left] is in its correct sort location; that location is
  // returned. All elements to the left of this location are smaller than
  // the pivot, all elements to the right are larger.
  while (true) {
    while ((left < right) && (anArray[left] < anArray[right]))
      right--;
    if (left < right) {
      swap(anArray, left, right);
      left++;
    } else
      return left;
    while ((left < right) && (anArray[left] < anArray[right]))
      left++;
    if (left < right) {
      swap(anArray, left, right);
      right--;
    } else
      return right;
  }
}

code based on Bailey pg. 89
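Putting the pieces together, version 3 of quicksort is the partition above plus two recursive calls. The sketch below wraps the same partition logic in a complete class with the swap helper the code assumes (the class name is illustrative):

```java
public class QuicksortV3 {
    private static void swap(int[] anArray, int i, int j) {
        int tmp = anArray[i];
        anArray[i] = anArray[j];
        anArray[j] = tmp;
    }

    // In-place partition (after Bailey): moves anArray[left], the pivot,
    // to its final sorted position and returns that position.
    private static int partition(int[] anArray, int left, int right) {
        while (true) {
            // pivot is at `left`: move `right` inward past larger elements
            while ((left < right) && (anArray[left] < anArray[right])) right--;
            if (left < right) { swap(anArray, left, right); left++; }
            else return left;
            // pivot is now at `right`: move `left` inward past smaller elements
            while ((left < right) && (anArray[left] < anArray[right])) left++;
            if (left < right) { swap(anArray, left, right); right--; }
            else return right;
        }
    }

    // Version 3: partition in place, then recurse on the two sides of the pivot.
    public static void quicksort(int[] anArray, int first, int last) {
        if (first >= last) return;                      // 0 or 1 elements: done
        int pivotIndex = partition(anArray, first, last);
        quicksort(anArray, first, pivotIndex - 1);
        quicksort(anArray, pivotIndex + 1, last);
    }
}
```

Running it on the slides' example, `quicksort(new int[]{60, 30, 10, 20, 40, 90, 70, 80, 50}, 0, 8)` first moves the pivot 60 to index 5, exactly as in the partition walkthrough, and then sorts each side.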
