Roberto Hibbler
Dept. of Computer Science, Florida Institute of Technology
Melbourne, FL 32901
[email protected]

ABSTRACT
Given an array of elements, we want to arrange those elements into a sorted order. To sort those elements, we will need to make comparisons between the individual elements efficiently. Merge Sort uses a divide and conquer strategy to sort an array efficiently while making the least number of comparisons between array elements. Our results show that for arrays with large numbers of array elements, Merge Sort is more efficient than three other comparison sort algorithms: Bubble Sort[1], Insertion Sort[3], and Selection Sort[2]. Our theoretical evaluation shows that Merge Sort beats a quadratic time complexity, while our empirical evaluation shows that on average Merge Sort is 32 times faster than Insertion Sort[3], the currently recognized most efficient comparison algorithm, across ten different data sets.

Keywords
Merge Sort, comparisons, Selection Sort[2], arrange

1. INTRODUCTION
The ability to arrange an array of elements into a defined order is very important in Computer Science. Sorting is heavily used with online stores, where the order in which services or items were purchased determines what orders can be filled and who receives their order first. Sorting is also essential for the database management systems used by banks and financial systems, such as the New York Stock Exchange, to track and rank the billions of transactions that go on in one day. There are many algorithms which provide a solution to sorting arrays, including algorithms such as Bubble Sort[1], Insertion Sort[3], and Selection Sort[2]. While these algorithms are programmatically correct, they are not efficient for arrays with a large number of elements and exhibit quadratic time complexity.

We are given an array of comparable values. We need to arrange these values into either an ascending or descending order.

We introduce the Merge Sort algorithm. The Merge Sort algorithm is a divide-and-conquer algorithm. It takes as input an array and divides that array into sub-arrays of single elements. A single element is already sorted, and so the elements are merged back into sorted arrays two sub-arrays at a time, until we are left with a final sorted array. We contribute the following:

1. We introduce the Merge Sort algorithm.
2. We show that theoretically Merge Sort has a worst-case time complexity better than O(n²).
3. We show that empirically Merge Sort is faster than Selection Sort[2] over ten data sets.

This paper will discuss in Section 2 comparison sort algorithms related to the problem, followed by the detailed approach of our solution in Section 3, the evaluation of our results in Section 4, and our final conclusion in Section 5.

2. RELATED WORK
The three algorithms that we will discuss are Bubble Sort[1], Selection Sort[2], and Insertion Sort[3]. All three are comparison sort algorithms, just as Merge Sort is.

The Bubble Sort[1] algorithm works by continually swapping adjacent array elements if they are out of order until the array is in sorted order. Every iteration through the array places at least one element at its correct position. Although algorithmically correct, Bubble Sort[1] is inefficient for use with arrays with a large number of array elements and has an O(n²) time complexity. Knuth observed, also, that while Bubble Sort[1] shares the worst-case time complexity with other prevalent sorting algorithms, compared to them it makes far more element swaps, resulting in poor interaction with modern CPU hardware. We intend to show that Merge Sort needs to make on average fewer element swaps than Bubble Sort[1].

The Selection Sort[2] algorithm arranges array elements in order by first finding the minimum value in the array and swapping it with the array element that is in its correct position, depending on how the array is being arranged. The process is then repeated with the second smallest value until the array is sorted. This creates two distinctive regions within the array: the part that is sorted and the part that has not been sorted. Selection Sort[2] shows an improvement over Bubble Sort[1] by not comparing all the elements in its unsorted part until it is time for an element to be placed into its sorted position. This makes Selection Sort[2] less affected by the input's order. It is, however, still inefficient for arrays with a large number of array elements, and even with these improvements Selection Sort[2] still shares the same worst-case time complexity of O(n²). We intend to show that Merge Sort will operate at a worst-case time complexity faster than O(n²).

The Insertion Sort[3] algorithm takes elements from the input array and places those elements in their correct place in a new array, shifting existing array elements as needed. Insertion Sort[3] improves over Selection Sort[2] by only making as many comparisons as it needs to determine the correct position of the current element, while Selection Sort[2] makes comparisons against each element in the unsorted part of the array. In the average case, Insertion Sort[3]'s time complexity is O(n²), as is its worst case, the same as Bubble Sort[1] and Selection Sort[2]. The tradeoff of Insertion Sort[3] is that on average more elements are swapped, as array elements are shifted within the array with the addition of each new element. We intend to show that Merge Sort operates at an average-case time complexity faster than O(n²).
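The three algorithms above can be sketched compactly in Java, the language used for the evaluation in Section 4. This is a minimal illustration under our own naming, not the paper's implementation:

```java
import java.util.Arrays;

/** Minimal Java sketches of the three related comparison sorts.
 *  Illustrative only; the paper's own implementations are not given. */
public class ElementarySorts {

    /** Bubble Sort[1]: repeatedly swap adjacent out-of-order elements
     *  until a full pass makes no swaps. */
    public static void bubbleSort(int[] a) {
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int i = 0; i + 1 < a.length; i++) {
                if (a[i] > a[i + 1]) {            // adjacent pair out of order
                    int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
                    swapped = true;               // at least one swap this pass
                }
            }
        }
    }

    /** Selection Sort[2]: grow a sorted region by repeatedly selecting the
     *  minimum of the unsorted region and swapping it into place. */
    public static void selectionSort(int[] a) {
        for (int i = 0; i < a.length; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) min = j;       // scan unsorted part for minimum
            }
            int t = a[i]; a[i] = a[min]; a[min] = t;
        }
    }

    /** Insertion Sort[3]: insert each element into the sorted region,
     *  shifting larger elements right as needed. */
    public static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {        // compares only until position found
                a[j + 1] = a[j];                  // shift element right
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] a = {38, 27, 43, 3, 9, 82, 10, 1};
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // prints [1, 3, 9, 10, 27, 38, 43, 82]
    }
}
```

Note how the inner loops differ: Selection Sort always scans the whole unsorted region, while Insertion Sort stops as soon as the position is found, which is the advantage described above.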

3. APPROACH
A large array with an arbitrary order needs to be arranged in an ascending or descending order, either lexicographically or numerically. Merge Sort solves this problem by using two key ideas.

The first key idea of merge sort is that the problem can be divided and conquered: the array can be broken into smaller arrays, and those smaller arrays can be solved. The second key idea is that by dividing the array into halves, then recursively halving those halves until only single-element arrays remain, two sorted arrays can be merged into one, since a single-element array is already sorted. Refer to the following pseudocode:

Input – A: array of n elements
Output – array A sorted in ascending order

1. proc mergesort(A: array)
2.   var array left, right, result
3.   if length(A) <= 1
4.     return(A)
5.   var middle = length(A)/2
6.   for each x in A up to middle
7.     add x to left
8.   for each x in A after middle
9.     add x to right
10.  left = mergesort(left)
11.  right = mergesort(right)
12.  result = merge(left, right)
13.  return result

Figure 1: Shows the splitting of the input array into single element arrays.

Input – left: array of m elements, right: array of k elements
Output – array result sorted in ascending order

14. proc merge(left: array, right: array)
15.   var array result
16.   while length(left) > 0 and length(right) > 0
17.     if first(left) <= first(right)
18.       append first(left) to result
19.       left = rest(left)
20.     else
21.       append first(right) to result
22.       right = rest(right)
23.   end while
24.   if length(left) > 0
25.     append left to result
26.   if length(right) > 0
27.     append right to result
28.   return result

Figure 2: Shows the merging of the single element arrays during the Merge Step.

As the example shows, array A is broken in half continuously until the elements are in arrays of only a single element; those single elements are then merged together until they form a single sorted array in ascending order.

As the pseudocode shows, after the array is broken up into a left half and a right half (lines 5–9), the two halves are divided recursively (lines 10–11) until each is a single-element array. Then the elements of the two halves are compared to determine how the two arrays should be arranged (lines 16–22). Should either half contain elements not yet added to the sorted array after the comparisons are made, the remainder is appended so that no elements are lost (lines 24–27).

4. EVALUATION

4.1 Theoretical Analysis

4.1.1 Evaluation Criteria
All comparison-based sorting algorithms count comparisons of array elements as one of their key operations. The Merge Sort algorithm can therefore be evaluated by measuring the number of comparisons made between array elements: as the key operation, the number of comparisons determines the overall efficiency of the algorithm. We intend to show that, because the Merge Sort algorithm makes fewer comparisons than the currently acknowledged most efficient algorithm, Insertion Sort[3], Merge Sort is the most efficient comparison sort algorithm.
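As a concrete companion to the pseudocode, the following Java sketch implements mergesort and merge and counts element comparisons, the key operation measured here. This is our own illustrative rendering; the names and the counter are ours, not the paper's implementation:

```java
import java.util.Arrays;

/** Java rendering of the mergesort/merge pseudocode, instrumented to count
 *  element comparisons (the key operation of the evaluation). Sketch only. */
public class MergeSortSketch {
    static long comparisons = 0;   // comparisons made in the merge step

    public static int[] mergesort(int[] a) {
        if (a.length <= 1) return a;                       // base case: already sorted
        int middle = a.length / 2;
        int[] left  = mergesort(Arrays.copyOfRange(a, 0, middle));
        int[] right = mergesort(Arrays.copyOfRange(a, middle, a.length));
        return merge(left, right);
    }

    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            comparisons++;                                 // one element comparison
            if (left[i] <= right[j]) result[k++] = left[i++];
            else                     result[k++] = right[j++];
        }
        while (i < left.length)  result[k++] = left[i++];  // append any remainder
        while (j < right.length) result[k++] = right[j++]; // so no elements are lost
        return result;
    }

    public static void main(String[] args) {
        int[] sorted = mergesort(new int[]{38, 27, 43, 3, 9, 82, 10, 1});
        // prints [1, 3, 9, 10, 27, 38, 43, 82], comparisons = 17
        System.out.println(Arrays.toString(sorted) + ", comparisons = " + comparisons);
    }
}
```

For the eight-element example input, the counter reports 17 comparisons, which matches the worst-case count n log₂ n − n + 1 derived below for n = 8.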
In the following example, using the input A = (38 27 43 3 9 82 10 1), the division of the array (Figure 1) and how the array is merged back into a sorted array (Figure 2) are illustrated.

4.1.1.1 Merge Sort Case Scenarios

4.1.1.1.1 Worst Case
Merge Sort makes the element comparisons we want to measure during the merge step, where pairs of arrays are recursively merged into a single array. Merge Sort's worst case, depicted in Figure 3, is the scenario where during each recursive call of the merge step the two largest elements are located in different arrays. This forces the maximum number of comparisons to occur. In this case, the Merge Sort algorithm's efficiency can be represented by the number of comparisons made during each recursive call of the merge step, which is described by the following recurrence, where n denotes the array size and T(n) the total number of comparisons in the merge step:

T(n) = 2T(n/2) + n − 1    (1)
T(1) = 0                  (2)

Equation (1) gives the total number of comparisons that occur in Merge Sort as a function of the number of array elements. The term 2T(n/2) refers to the comparisons made in the two halves of the array before the arrays are merged, and the term n − 1 refers to the comparisons made in the merge step itself. Equation (2) states the base case: no comparisons are needed for a single-element array. With these two equations, we can determine the total number of comparisons by looking at each recursive call of the merge step. We next solve equation (1) by repeatedly expanding it and substituting:

T(n) = 2(2T(n/4) + n/2 − 1) + n − 1            (3)
     = 4T(n/4) + n − 2 + n − 1                 (4)
     = 4(2T(n/8) + n/4 − 1) + n − 2 + n − 1    (5)
     = 8T(n/8) + n − 4 + n − 2 + n − 1         (6)

By expanding equation (1) to get equations (3) through (6) we can discern a pattern. Using the variable k to indicate the depth of the recursion, we get the following equation:

T(n) = 2^k T(n/2^k) + kn − (2^k − 1)    (7)

We can solve equation (7) by using the base case in equation (2) and determining the value of k, the depth of the recursion:

2^k = n                                                (8)
k = log2 n                                             (9)
T(n) = n · 0 + n log2 n − (n − 1) = n log2 n − n + 1   (10)

By making the statement in equation (8), the recursive term of equation (7) reduces to the base case of equation (2). Solving equation (8) for k gives equation (9). Equation (9) can then be used to reduce equation (7) to equation (10), which represents the total number of comparisons of array elements in the merge step. This results in a Big O time complexity of O(n log n) overall.

4.1.1.2 Best Case
The best case of Merge Sort, depicted in Figure 3, occurs when the largest element of one array is smaller than any element in the other. In this scenario, only n/2 comparisons of array elements are made at each merge. Using the same process that we used to determine the total number of comparisons for the worst case, we get the following equations:

B(n) = 2B(n/2) + n/2           (11)
B(n) = 2^k B(n/2^k) + kn/2     (12)
B(n) = n · 0 + (n/2) log2 n    (13)

As before, equation (11) can be expanded to find a pattern; equation (12) is created by substituting the recursion depth k, and solving for k gives equation (13), the total number of comparisons for the best case of Merge Sort. This also results in a Big O time complexity of O(n log n), just like the worst case.

4.1.2 Insertion Sort[3] Case Scenarios

4.1.2.1 Worst Case
Insertion Sort[3] builds the sorted array one element at a time, removing one element from the input data and placing it in its correct location each iteration until the array is in sorted order. The worst case for Insertion Sort[3] occurs when the input array is in reverse of sorted order. In this case, every iteration of the inner loop scans and shifts the entire sorted region of the array before inserting the next element. Denoting n as the number of array elements, the number of comparisons can be described by the following equation:

C(n) = (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2    (14)

This results in a Big O time complexity of O(n²) in the worst case.

4.1.2.2 Best Case
The best case for Insertion Sort[3] is when the input array is already sorted. In this scenario, each element is removed from the input array and placed into the sorted array without the need to shift elements. This results in a Big O time complexity of O(n) in the best case.

4.1.3 Analysis
Comparing the time complexities of Merge Sort and Insertion Sort[3], Insertion Sort[3] beats Merge Sort in the best case, as Merge Sort has a Big O time complexity of O(n log n) while Insertion Sort[3] is O(n). However, taking the worst cases, Merge Sort is faster than Insertion Sort[3], with a time complexity of O(n log n) against Insertion Sort[3]'s n(n − 1)/2, which is equivalent to O(n²). This supports the theory that Merge Sort will overall be the better algorithm when given less than optimum input.
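The closed forms in equations (10), (13), and (14) can be checked numerically against their recurrences. The following small Java sketch (ours, for illustration) does so for power-of-two array sizes:

```java
/** Numerically checks the closed-form comparison counts against their
 *  recurrences for power-of-two array sizes. Illustrative sketch only. */
public class ComparisonCounts {
    /** Worst case, equation (1): T(n) = 2T(n/2) + n - 1, with T(1) = 0. */
    static long worst(long n) { return n <= 1 ? 0 : 2 * worst(n / 2) + n - 1; }

    /** Best case, equation (11): B(n) = 2B(n/2) + n/2, with B(1) = 0. */
    static long best(long n) { return n <= 1 ? 0 : 2 * best(n / 2) + n / 2; }

    /** Floor of log base 2 for positive n. */
    static long log2(long n) { return 63 - Long.numberOfLeadingZeros(n); }

    public static void main(String[] args) {
        for (long n = 2; n <= (1 << 20); n *= 2) {
            long lg = log2(n);
            // Equation (10): T(n) = n log2(n) - n + 1
            if (worst(n) != n * lg - n + 1) throw new AssertionError("eq (10) fails at n=" + n);
            // Equation (13): B(n) = (n/2) log2(n)
            if (best(n) != (n / 2) * lg) throw new AssertionError("eq (13) fails at n=" + n);
        }
        // Equation (14): Insertion Sort worst case = n(n-1)/2 comparisons
        long n = 1000, sum = 0;
        for (long i = 1; i < n; i++) sum += i;
        if (sum != n * (n - 1) / 2) throw new AssertionError("eq (14) fails");
        System.out.println("All closed forms match their recurrences.");
    }
}
```

For example, at n = 8 the worst-case recurrence yields 17 comparisons (8·3 − 8 + 1) and the best case yields 12 ((8/2)·3), both consistent with the derivations above.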

Figure 4: Depicts the slower speed of Insertion Sort vs Merge Sort with array sizes greater than 1000.

Figure 3: Depicts how single element arrays are merged together during the Merge step in the best and worst case.

4.2 Empirical Analysis

4.2.1 Evaluation Criteria
The goal of this evaluation is to demonstrate the improved efficiency of the Merge Sort algorithm based on the CPU runtime of its execution.

4.2.2 Evaluation Procedures
The efficiency of Merge Sort can be measured by determining the CPU runtime of an implementation of the algorithm sorting a number of elements versus the CPU runtime of an implementation of the Insertion Sort[3] algorithm. The dataset used is a server log of connecting IP addresses over the course of one day. All the IP addresses are first extracted from the log, in a separate process, and placed in a file. Subsets of the IP addresses are then sorted using both the Merge Sort algorithm and the Insertion Sort[3] algorithm with the following array sizes: 5, 10, 15, 20, 30, 50, 100, 1000, 5000, 10000, 15000, and 20000. Each array size was run ten times and the average of the CPU runtimes was taken. Both algorithms take as their parameter an array of IP addresses. For reproducibility, the dataset used for the evaluation can be found at "http://cs.fit.edu/~pkc/pub/classes/writing/httpdJan24.log.zip". The original dataset has over thirty thousand records; for the purposes of this experiment, only twenty thousand were used. The tests were run on a PC running Microsoft Windows XP with the following specifications: Intel Core 2 Duo CPU E8400 at 3.00 GHz with 3 GB of RAM. The algorithms were implemented in Java using the Java 6 API.

Figure 5: Depicts the faster speed of Insertion Sort until around array sizes of 1000.

4.2.2.1 Results and Analysis
Compared to the Insertion Sort[3] algorithm, the Merge Sort algorithm shows faster CPU runtimes for array sizes over 1000. A summary of the results is contained in Figure 4 and Figure 5.

With Merge Sort plotted in red and Insertion Sort in blue, Figure 5 shows the relative execution-speed curves of the two algorithms. It can be seen that for array sizes under 1000 the Insertion Sort[3] algorithm has much faster execution times, showing a clear advantage. This is probably due to the overhead incurred during the creation and deletion of new arrays during splitting. Around array sizes of 1000 the algorithms' curves intersect, and Merge Sort begins to show an advantage over Insertion Sort[3], which can be seen in Figure 4. As the size of the array grows larger, Insertion Sort[3]'s execution runtime increases at a faster pace than Merge Sort's. This shows that Insertion Sort[3] will get progressively worse relative to Merge Sort as the array size increases, while Merge Sort's efficiency will decrease at a slower rate. It can be concluded from the results that for array sizes over 1000, Insertion Sort[3] is unsuitable compared to the efficiency presented when using the Merge Sort algorithm.
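The paper does not reproduce its timing harness, but the measurement procedure described above (ten runs per array size, averaged) can be sketched as follows. The class and method names, the use of System.nanoTime, and the random stand-in data are our assumptions; the real evaluation sorted IP address strings from the server log:

```java
import java.util.Arrays;
import java.util.Random;

/** Minimal sketch of the timing procedure described above: for each array
 *  size, run the sort ten times on fresh copies of the data and average the
 *  runtime. Random integers stand in for the log's IP addresses. */
public class TimingHarness {

    /** Average wall-clock nanoseconds over ten runs, sorting copies of a. */
    public static double averageSortNanos(int[] a) {
        final int runs = 10;                           // ten runs per size, as in the paper
        long total = 0;
        for (int r = 0; r < runs; r++) {
            int[] copy = Arrays.copyOf(a, a.length);   // sort a fresh copy each run
            long start = System.nanoTime();
            Arrays.sort(copy);                         // stand-in for the sort under test
            total += System.nanoTime() - start;
        }
        return total / (double) runs;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);                   // fixed seed for repeatability
        for (int size : new int[]{5, 100, 1000, 20000}) {
            int[] data = rnd.ints(size).toArray();
            System.out.printf("n = %5d: avg %.0f ns%n", size, averageSortNanos(data));
        }
    }
}
```

Swapping `Arrays.sort` for the Merge Sort and Insertion Sort implementations under test, and the random data for the extracted IP addresses, reproduces the shape of the experiment.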

5. CONCLUSION
The purpose of this paper was to introduce the Merge Sort algorithm and show its improvements over its predecessors. Our theoretical analysis shows that compared to the Bubble Sort[1], Insertion Sort[3], and Selection Sort[2] algorithms, Merge Sort has a faster Big O worst-case time complexity of O(n log n). Our empirical analysis shows that compared to Insertion Sort[3], Merge Sort is 32 times faster for arrays larger than 1000 elements. This makes Merge Sort far more efficient than Insertion Sort[3] for array sizes larger than 1000.

Although Merge Sort has been shown to be better in the worst case for array sizes larger than 1000, it was still slower than Insertion Sort[3] for array sizes less than 1000. This can be explained by the overhead required for the creation and merging of all the arrays during the merge step. A future improvement to the Merge Sort algorithm would be to determine a way to reduce this overhead.

6. REFERENCES
[1] Knuth, D. Sorting by Exchanging. "The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition." 1997. pp. 106–110.
[2] Knuth, D. Sorting by Selection. "The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition." 1997. pp. 138–141.
[3] Knuth, D. Sorting by Insertion. "The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition." 1998. pp. 80–105.