A Heapify Based Parallel Sorting Algorithm
Journal of Computer Science 4 (11): 897-902, 2008
ISSN 1549-3636
© 2008 Science Publications

1Mwaffaq A. Abu Al hija, 2Arwa Zabian, 3Sami Qawasmeh and 4Omer H. Abu Al haija
1Department of Computer Science, Hail University, Hail, Saudi Arabia
2Department of Computer Science, Jadara University, Irbid, Jordan
3Department of Computer Science, Irbid National University, Irbid, Jordan
4Department of Computer Science, Irbid, Jordan

Corresponding Author: Mwaffaq A. Abu Al hija, Department of Computer Science, Hail University, Hail, Saudi Arabia

Abstract: Quick sort is a sorting algorithm whose worst case running time is $O(n^2)$ on an input array of n numbers. It is the best practical choice for sorting because it has the advantage of sorting in place. Problem statement: The behavior of quick sort is complex; we proposed an in-place $2^m$-thread parallel heap sort algorithm which had the advantage of sorting in place and had a better running time than classical sequential quick sort. Approach: The algorithm consisted of several stages. In the first stage, it split the input data into two partitions; in each following stage, it applied the same partitioning to the partitions produced by the prior stage, until $2^m$ partitions were reached, equal to the number of available processors. Finally, it used heap sort to sort in parallel the partitions, which were ordered with respect to one another but not internally sorted. Results: Results showed that the speed of the algorithm was about double the speed of classical Quick sort for a large input size. The number of comparisons needed was reduced significantly. Conclusion: In this study we proposed a sorting algorithm that uses fewer comparisons than the original quick sort, which in turn requires less running time to sort the same input data.

Key words: Parallel sorting, heap sort, Quick sort, in-place sorting

INTRODUCTION

Sorting is one of the most well studied problems in computer science because it is fundamental in many applications [1]. Quick sort is typically the fastest sorting algorithm, basically due to better cache behavior among other factors. However, the worst case running time for Quick sort is $O(n^2)$ for an input size n, which is unacceptable for large data sets. In this study, we propose PSA, the Partition and Swap Algorithm, which reduces the time needed for sorting with respect to Quick sort in the worst case. A comparison between Quick sort and the PSA algorithm in terms of running time is performed; the simulation results show that PSA is faster than Quick sort in the worst case for sorting the same input data size. Our algorithm is based on partitioning the input data into different partitions and then using the Heapify procedure in parallel to sort each partition.

The heap is a very fundamental data structure used in many application programs. A heap is a data structure that stores an array in a binary tree maintaining two properties: the first is that the value stored in every node is smaller than or equal to the value of its children. The second property is that the binary tree must be complete. This implies that a complete binary tree with height h has between $2^h$ and $2^{h+1}-1$ nodes. So, the height of a binary tree of n nodes is $O(\log_2 n)$. The basic operations of the heap are Build-Heap and Heapify. The Heapify operation arranges the heap to restore its heap property.

MATERIALS AND METHODS

Partition and Swap Algorithm (PSA): Given an array of size n, PSA divides the array into four sub-arrays in two steps to facilitate the sorting and to reduce the number of comparisons needed. The algorithm works in three phases: the first partition phase, the second partition phase and finally the Heapify phase.

In the first partition phase the algorithm divides the array A[n] into two sub-arrays A1, A2 based on the middle value midval, in such a manner that all the elements of A1 (also denoted left) are smaller than midval and all the elements of A2 (also denoted right) are greater than midval, if the data must be ordered in an ascending manner. (If the data must be ordered in a descending manner, all the elements of A1 must be greater than midval and all the elements of A2 must be smaller.) In the second partition phase, the partitioning of the first partition phase is repeated for each of A1, A2: A1 is divided into two subarrays A1l, A1r and A2 is divided into two subarrays A2l, A2r, then the elements of these four subarrays are ordered as in the previous phase. The third phase is the Heapify phase, in which the resultant subarrays A1l, A1r, A2l, A2r are organized into sorted subarrays using the Heapify procedure.

First partition phase:
• Given an array A of size n, find the middle location of the n elements; the pivot location can be found as n/2 (step 1 in the flowchart)
• For each part (left and right of the pivot), find the max and min values (step 2 in the flowchart)
• Find the middle value midval, which is calculated as follows:

$\mathit{midval} = (L_{\max} + L_{\min} + R_{\max} + R_{\min}) / 4$

A[0] and A[n] denote the first and last elements of the array.

Swap procedure:
    for i = 0 to n/2, j = n to n/2 do:
        if A[0] = Lmax > midval and Rmin = A[n] < Lmax
            Swap (Rmin, Lmax)
            if i ≠ j
                i = i+1, j = j-1
                Return if
            Else End
        Else
            i = i+1 and Return if

The procedure stops when i = j, which means that the two pointers have met and all the elements on the left and on the right have been visited and compared.
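The first partition phase can be illustrated with a minimal Python sketch. This is one possible reading of the description, not the authors' implementation: the function names `midval`, `partition_by_value` and `first_partition` are hypothetical, and the two-pointer loop is a standard in-place partition by value that stands in for the swap procedure. It returns the actual split index instead of assuming that the boundary stays at n/2.

```python
def midval(a, lo, hi):
    """Average of the min and max of the left and right halves of a[lo:hi]."""
    mid = (lo + hi) // 2  # pivot location; the right half is larger for odd lengths
    left, right = a[lo:mid], a[mid:hi]  # slices copy data; used here only for brevity
    return (max(left) + min(left) + max(right) + min(right)) / 4


def partition_by_value(a, lo, hi, pivot):
    """Rearrange a[lo:hi] in place so that values <= pivot precede values > pivot.

    Returns the split index s, such that a[lo:s] <= pivot < a[s:hi].
    """
    i, j = lo, hi - 1
    while i <= j:
        if a[i] <= pivot:
            i += 1          # a[i] is already on the correct (left) side
        elif a[j] > pivot:
            j -= 1          # a[j] is already on the correct (right) side
        else:
            a[i], a[j] = a[j], a[i]  # both misplaced: swap them
            i += 1
            j -= 1
    return i


def first_partition(a):
    """Apply the first partition phase to the whole list; returns the split index."""
    if len(a) < 2:  # assumption: nothing to partition for fewer than two items
        return len(a)
    return partition_by_value(a, 0, len(a), midval(a, 0, len(a)))


data = [9, 2, 7, 4, 1, 8, 3, 6, 12, 5, 11, 10]
split = first_partition(data)
print(data[:split], data[split:])
```

Because the split is made by value, the two parts need not have equal size. The index ranges in the text (0 to n/2 and n to n/2) suggest a boundary fixed at n/2, which this sketch deliberately does not assume.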
The stopping condition i = j is important to ensure the correctness of the algorithm and to ensure that all the elements have been compared. The comparisons are carried out for each element of the two subarrays A1 (left) and A2 (right), with i and j acting as two pointers (assuming the array is ordered in an ascending manner). In the case of n being an odd number, we take $\lfloor n/2 \rfloor$ to find the mid location (pivot location), in which case the right will have more elements than the left and the pointer i must continue to visit the elements until it meets j.

Second partition phase: The algorithm starts working in parallel to divide the two subarrays A1, A2 into four consecutive partitions A1l, A1r, A2l, A2r, and the swap procedure is repeated with the following counters: in A1, for i = 0 to n/4, j = n/2 to n/4; in A2, for i = n/2 to 3n/4, j = n to 3n/4. At the end of this phase we obtain four subarrays such that:

All the elements of A1l < all the elements of A1r and all the elements of A2l < all the elements of A2r

Heapify phase: In this phase, we use the Heapify procedure on all four resultant sub-arrays to organize them into sorted subarrays.

Figure 1 shows the flowchart that describes how PSA works. In the first step of the flowchart the initial array is divided into two sub-arrays and the midval location is found. In the second step, the min and max values for each partition are found and a series of swap operations is done. Step 3 indicates the last phase, which is the Heapify phase.

Fig. 1: The flow chart of PSA algorithm

Example: Figure 2 shows a numerical example of how PSA works on an array of size n = 12. In Fig. 2, A is divided into two sub-arrays A1, A2, where the bold items represent the swapped elements. In part A of Fig. 3, the sub-array A1 is divided into left1 and right1. In part B, A2 is divided into two sub-arrays left2, right2. Figure 4 shows how the four sorted sub-arrays are inserted in their locations in the final sorted array.

Fig. 2: PSA algorithm/first partition phase

Fig. 3: PSA algorithm/merge in place

Fig. 4: PSA Algorithm/Heapify Phase

RESULTS

To study the effectiveness of our algorithm PSA with respect to Quick sort and other sorting algorithms, we used Turbo Delphi to implement both Quick sort and PSA, writing generic code that could run in both programs. Turbo Delphi is an integrated development environment that runs under Win32 and is based on the object-oriented Pascal language. Our simulator runs on a Pentium 4, 2.66 GHz, with 512 MB. We implemented both Quick sort and PSA on the same machine and performed different simulations. Our simulation results show that the performance of PSA is better than that of Quick sort in terms of running time (Table 1).

Table 1: Comparison between the running time of Quick sort and PSA for different input sizes

Input size    Quick sort running time (ms)    PSA running time (ms)
1000          0.234                           0.171
5000          1.640                           0.829
10000         3.313                           2.139
50000         20.485                          12.203
100000        40.460                          26.090

[Figure: running time (ms) versus input size]

(1000 items), however the efficiency of our algorithm relative to Quick sort is about 64.56% for a large input size and does not change if the input size is increased from n to 10n (10000-100000 items). This is in accordance with our goal, which is to propose an
Fig.