A Heapify Based Parallel Sorting Algorithm

Total Page:16

File Type:pdf, Size:1020Kb

A Heapify Based Parallel Sorting Algorithm Journal of Computer Science 4 (11): 897-902, 2008 ISSN 1549-3636 © 2008 Science Publications A Heapify Based Parallel Sorting Algorithm 1Mwaffaq A. Abu Al hija, 2Arwa Zabian, 3Sami Qawasmeh and 4Omer H. Abu Al haija 1Department of Computer Science, Hail University, Hail, Saudi Arabia 2Department of Computer Science, Jadara University, Irbid, Jordan 3Department of Computer Science, Irbid National University, Irbid, Jordan 4Department of Computer Science, Irbid, Jordan Abstract: Quick sort is a sorting algorithm whose worst case running time is (n2 ) on an input array of n numbers. It is the best practical for sorting because it has the advantage of sorting in place. Problem statement: Behavior of quick sort is complex, we proposed in-place 2m threads parallel heap sort algorithm which had advantage in sorting in place and had better performance than classical sequential quick sort in running time. Approach: The algorithm consisted of several stages, in first stage; it splits input data into two partitions, next stages it did the same partitioning for prior stage which had been spitted until 2 m partitions was reached equal to the number of available processors, finally it used heap sort to sort respectively ordered of non internally sorted partitions in parallel. Results: Results showed the speed of algorithm about double speed of classical Quick sort for a large input size. The number of comparisons needed was reduced significantly. Conclusion: In this study we had been proposed a sorting algorithm that uses less number of comparisons with respect to original quick sort that in turn requires less running time to sort the same input data. Key words: Parallel sorting, heap sort, Quick sort, in-place sorting INTRODUCTION complete binary tree with height h has between 2h and 2h+1 -1 node. So, the height of a binary tree of n nodes is Sorting is one of the most well studied problems in O (log2 n). The basic operations of the heap are Build- computer science because it is fundamental in many Heap and Heapify. Heapify operation arranges the heap [1] applications. Quick sort is typically the fastest sorting to restore its heap property. algorithm, basically due to better cache behavior among other factors. However, the worst case running time for MATERIALS AND METHODS 2 Quick sort is O (n ) for an input size n, which is unacceptable for large data sets. In this study, we Partition and Swap Algorithm (PSA): Given an array propose PSA Partition and Swap Algorithm, which of size n, PSA divides the array into four sub-arrays in reduces the time needed for sorting with respect to two steps to facilitate the sorting and to reduce the Quiksort in the worst case. A comparison between number of comparisons needed. The algorithm works in Quick sort and PSA algorithm in terms of running time three phases the first partition phase, the second partition phase and finally the Heapify phase. is performed; the simulation results show that PSA is In the first partition phase the algorithm divides the faster than Quick sort in the worst case for sorting the [n] array A into two sub-arrays A , A based on the same input data size. Our algorithm is based on 1 2 middle value midval, in a manner that all the elements partitioning the input data to different partitions then of A1 (or denoted left) are smaller than the midval and using Heapify procedure in parallel to sort each all the elements of A2 (or denoted as right) are greater partition. The heap is a very fundamental data structure than the (midval), if the data must be ordered in an used in many application programs. A heap is a data ascendant manner. (If the data must be ordered in structure that stores an array in a binary tree descendent manner, all the elements of A1 must be maintaining two properties: the first is that the value greater than the midval and all the elements of A2 must stored in every node is smaller than or equal to the be smaller). In the second partition phase, the value of its children. The second property is that the partitioning in the first partition phase is repeated for binary tree must be complete. This implies that the each of A1, A2. Where A1 is divided into two subarrays Corresponding Author: Mwaffaq A. Abu Al hija, Department of Computer Science, Hail University, Hail, Saudi Arabia 897 J. Computer Sci., 4 (11): 897-902, 2008 [0] [n] A1l, A1r and A2 is divided into two sub arrays A2l, A2r A , A denote the first and last element in the array then the elements of these four sub arrays are ordered as in the previous phase. The third phase, is heavily phase Swap procedure: in which the resultant subarrays A1l, A1r, A2l, A2r are for i = 0 to n/2, j = n to n/2 do: organized using heavily procedure into sorted [0] subarrays. if A = Lmax > midval and [n] Rmin = A < Lmax First partition phase: Swap (Rmin, Lmax ) if i ≠ j • Given an array A of size n find the middle location i = i+1, j = j+1 of the n elements the pivot can be find by n/2. Step 1 in the flowchart Return if • For each part (left and right to the pivot) find the Else End max and min values (step 2 in the flowchart) Else • Find the mid value midval that can be calculated as i = i+1 and Return if following: The procedure stops when i = j, that means the two pointers are encountered and all the elements on the left Lmax + Lmin+ Rmax+Rmin /4⁄ and on the right have been visited and compared. This • For each element in the two subarrays A1 (left), A2 condition is important to ensure the correctness of the (right) do the following comparisons (if the array is algorithm and to ensure that all the elements have been ordered in an ascendant manner) For i, j are two compared. In the case of n being an odd number, we pointers, take n/2⁄ to find the mid location (pivot location), Fig. 1: The flow chart of PSA algorithm 898 J. Computer Sci., 4 (11): 897-902, 2008 Fig. 3: PSA algorithm/merge in place Fig. 2: PSA algorithm/first partition phase in which case the right will have more elements than the left and the pointer i must continue to visit the elements until it encounters j. Second partition phase: The algorithm starts working in parallel in dividing the two subarrays A1, A2 to four consecutive partitions A1l, A1r, A2l, A2r and the swap procedure will be repeated for the following counters in Fig. 4: PSA Algorithm/ Heapify Phase A1 for i = 0 to n/4, j = n/2 to n/4 and for A2 the counter will be: for i = n/2 to 3n/2, j = n to j = 3n/2. At the end of this phase we obtain four subarrays in represent the swapped elements. In part A of Fig. 3, the sub-array A is divided, to left1 and right1. In part B, such a manner that: 1 A2 is divided into two sub-arrays left 2, right 2. In All the elements of A1l < all the elements of A1r and all Fig. 4, it is shown how the four sorted sub-arrays are the elements of A2l < all the elements of A2r inserted in their location into the final sorted array. Heapify phase: In this phase, we use heavily procedure RESULTS on all the four resultant sub-arrays to organize them in sorted subarrays. To study the effectiveness of our algorithm PSA Figure 1 shows the flowchart that describes how with respect to Quick sort and other sorting algorithms, PSA work. In the first step of the flowchart the initial we have used turbo Delphi to implement both Quick array is divided into two sub-arrays and the midval- sort and PSA. Where we wrote a generic code that location is found. In the second step, the min and max could run on both programs. Turbo Delphi is an values for each partition are found and a series of swap integrated development environment that runs under win 32 and is built based on object-oriented Pascal operations are done. Step 3 indicates the last phase that language. Our simulator runs on Pentium 4, 2.66 GHZ, is heavily phase. 512 Mb. We have implemented both Quick sort and PSA on the same machine and we have performed Example: Figure 2 shows a numerical example of how different simulations. Our simulation results show that PSA works in an array of size n = 12. In Fig. 2, A is the performance of PSA is better than quick sort in divided into two sub-arrays A1, A2 where the bold items term of running time (Table 1). 899 J. Computer Sci., 4 (11): 897-902, 2008 Table 1: Comparison between the running time of Quick sort and PSA for different input size Quick sort running PSA running ) s Input size time (ms) time (ms) m ( 1000 0.234 0.171 e m 5000 1.640 0.829 i t g 10000 3.313 2.139 n i n 50000 20.485 12.203 n u 100000 40.460 26.090 R (1000 items), however the efficiency of our algorithm respectively to Quick sort is about 64.56% for a large input size and does not change if the input size Input size increased from n to 10 n (10000-100000 items). And that is accorded with our goal is to propose an Fig.
Recommended publications
  • Overview of Sorting Algorithms
    Unit 7 Sorting Algorithms Simple Sorting algorithms Quicksort Improving Quicksort Overview of Sorting Algorithms Given a collection of items we want to arrange them in an increasing or decreasing order. You probably have seen a number of sorting algorithms including ¾ selection sort ¾ insertion sort ¾ bubble sort ¾ quicksort ¾ tree sort using BST's In terms of efficiency: ¾ average complexity of the first three is O(n2) ¾ average complexity of quicksort and tree sort is O(n lg n) ¾ but its worst case is still O(n2) which is not acceptable In this section, we ¾ review insertion, selection and bubble sort ¾ discuss quicksort and its average/worst case analysis ¾ show how to eliminate tail recursion ¾ present another sorting algorithm called heapsort Unit 7- Sorting Algorithms 2 Selection Sort Assume that data ¾ are integers ¾ are stored in an array, from 0 to size-1 ¾ sorting is in ascending order Algorithm for i=0 to size-1 do x = location with smallest value in locations i to size-1 swap data[i] and data[x] end Complexity If array has n items, i-th step will perform n-i operations First step performs n operations second step does n-1 operations ... last step performs 1 operatio. Total cost : n + (n-1) +(n-2) + ... + 2 + 1 = n*(n+1)/2 . Algorithm is O(n2). Unit 7- Sorting Algorithms 3 Insertion Sort Algorithm for i = 0 to size-1 do temp = data[i] x = first location from 0 to i with a value greater or equal to temp shift all values from x to i-1 one location forwards data[x] = temp end Complexity Interesting operations: comparison and shift i-th step performs i comparison and shift operations Total cost : 1 + 2 + ..
    [Show full text]
  • Batcher's Algorithm
    18.310 lecture notes Fall 2010 Batcher’s Algorithm Prof. Michel Goemans Perhaps the most restrictive version of the sorting problem requires not only no motion of the keys beyond compare-and-switches, but also that the plan of comparison-and-switches be fixed in advance. In each of the methods mentioned so far, the comparison to be made at any time often depends upon the result of previous comparisons. For example, in HeapSort, it appears at first glance that we are making only compare-and-switches between pairs of keys, but the comparisons we perform are not fixed in advance. Indeed when fixing a headless heap, we move either to the left child or to the right child depending on which child had the largest element; this is not fixed in advance. A sorting network is a fixed collection of comparison-switches, so that all comparisons and switches are between keys at locations that have been specified from the beginning. These comparisons are not dependent on what has happened before. The corresponding sorting algorithm is said to be non-adaptive. We will describe a simple recursive non-adaptive sorting procedure, named Batcher’s Algorithm after its discoverer. It is simple and elegant but has the disadvantage that it requires on the order of n(log n)2 comparisons. which is larger by a factor of the order of log n than the theoretical lower bound for comparison sorting. For a long time (ten years is a long time in this subject!) nobody knew if one could find a sorting network better than this one.
    [Show full text]
  • Hacking a Google Interview – Handout 2
    Hacking a Google Interview – Handout 2 Course Description Instructors: Bill Jacobs and Curtis Fonger Time: January 12 – 15, 5:00 – 6:30 PM in 32‐124 Website: http://courses.csail.mit.edu/iap/interview Classic Question #4: Reversing the words in a string Write a function to reverse the order of words in a string in place. Answer: Reverse the string by swapping the first character with the last character, the second character with the second‐to‐last character, and so on. Then, go through the string looking for spaces, so that you find where each of the words is. Reverse each of the words you encounter by again swapping the first character with the last character, the second character with the second‐to‐last character, and so on. Sorting Often, as part of a solution to a question, you will need to sort a collection of elements. The most important thing to remember about sorting is that it takes O(n log n) time. (That is, the fastest sorting algorithm for arbitrary data takes O(n log n) time.) Merge Sort: Merge sort is a recursive way to sort an array. First, you divide the array in half and recursively sort each half of the array. Then, you combine the two halves into a sorted array. So a merge sort function would look something like this: int[] mergeSort(int[] array) { if (array.length <= 1) return array; int middle = array.length / 2; int firstHalf = mergeSort(array[0..middle - 1]); int secondHalf = mergeSort( array[middle..array.length - 1]); return merge(firstHalf, secondHalf); } The algorithm relies on the fact that one can quickly combine two sorted arrays into a single sorted array.
    [Show full text]
  • Data Structures & Algorithms
    DATA STRUCTURES & ALGORITHMS Tutorial 6 Questions SORTING ALGORITHMS Required Questions Question 1. Many operations can be performed faster on sorted than on unsorted data. For which of the following operations is this the case? a. checking whether one word is an anagram of another word, e.g., plum and lump b. findin the minimum value. c. computing an average of values d. finding the middle value (the median) e. finding the value that appears most frequently in the data Question 2. In which case, the following sorting algorithm is fastest/slowest and what is the complexity in that case? Explain. a. insertion sort b. selection sort c. bubble sort d. quick sort Question 3. Consider the sequence of integers S = {5, 8, 2, 4, 3, 6, 1, 7} For each of the following sorting algorithms, indicate the sequence S after executing each step of the algorithm as it sorts this sequence: a. insertion sort b. selection sort c. heap sort d. bubble sort e. merge sort Question 4. Consider the sequence of integers 1 T = {1, 9, 2, 6, 4, 8, 0, 7} Indicate the sequence T after executing each step of the Cocktail sort algorithm (see Appendix) as it sorts this sequence. Advanced Questions Question 5. A variant of the bubble sorting algorithm is the so-called odd-even transposition sort . Like bubble sort, this algorithm a total of n-1 passes through the array. Each pass consists of two phases: The first phase compares array[i] with array[i+1] and swaps them if necessary for all the odd values of of i.
    [Show full text]
  • Sorting Networks on Restricted Topologies Arxiv:1612.06473V2 [Cs
    Sorting Networks On Restricted Topologies Indranil Banerjee Dana Richards Igor Shinkar [email protected] [email protected] [email protected] George Mason University George Mason University UC Berkeley October 9, 2018 Abstract The sorting number of a graph with n vertices is the minimum depth of a sorting network with n inputs and n outputs that uses only the edges of the graph to perform comparisons. Many known results on sorting networks can be stated in terms of sorting numbers of different classes of graphs. In this paper we show the following general results about the sorting number of graphs. 1. Any n-vertex graph that contains a simple path of length d has a sorting network of depth O(n log(n=d)). 2. Any n-vertex graph with maximal degree ∆ has a sorting network of depth O(∆n). We also provide several results relating the sorting number of a graph with its rout- ing number, size of its maximal matching, and other well known graph properties. Additionally, we give some new bounds on the sorting number for some typical graphs. 1 Introduction In this paper we study oblivious sorting algorithms. These are sorting algorithms whose sequence of comparisons is made in advance, before seeing the input, such that for any input of n numbers the value of the i'th output is smaller or equal to the value of the j'th arXiv:1612.06473v2 [cs.DS] 20 Jan 2017 output for all i < j. That is, for any permutation of the input out of the n! possible, the output of the algorithm must be sorted.
    [Show full text]
  • 13 Basic Sorting Algorithms
    Concise Notes on Data Structures and Algorithms Basic Sorting Algorithms 13 Basic Sorting Algorithms 13.1 Introduction Sorting is one of the most fundamental and important data processing tasks. Sorting algorithm: An algorithm that rearranges records in lists so that they follow some well-defined ordering relation on values of keys in each record. An internal sorting algorithm works on lists in main memory, while an external sorting algorithm works on lists stored in files. Some sorting algorithms work much better as internal sorts than external sorts, but some work well in both contexts. A sorting algorithm is stable if it preserves the original order of records with equal keys. Many sorting algorithms have been invented; in this chapter we will consider the simplest sorting algorithms. In our discussion in this chapter, all measures of input size are the length of the sorted lists (arrays in the sample code), and the basic operation counted is comparison of list elements (also called keys). 13.2 Bubble Sort One of the oldest sorting algorithms is bubble sort. The idea behind it is to make repeated passes through the list from beginning to end, comparing adjacent elements and swapping any that are out of order. After the first pass, the largest element will have been moved to the end of the list; after the second pass, the second largest will have been moved to the penultimate position; and so forth. The idea is that large values “bubble up” to the top of the list on each pass. A Ruby implementation of bubble sort appears in Figure 1.
    [Show full text]
  • Title the Complexity of Topological Sorting Algorithms Author(S)
    Title The Complexity of Topological Sorting Algorithms Author(s) Shoudai, Takayoshi Citation 数理解析研究所講究録 (1989), 695: 169-177 Issue Date 1989-06 URL http://hdl.handle.net/2433/101392 Right Type Departmental Bulletin Paper Textversion publisher Kyoto University 数理解析研究所講究録 第 695 巻 1989 年 169-177 169 Topological Sorting の NLOG 完全性について -The Complexity of Topological Sorting Algorithms- 正代 隆義 Takayoshi Shoudai Department of Mathematics, Kyushu University We consider the following problem: Given a directed acyclic graph $G$ and vertices $s$ and $t$ , is $s$ ordered before $t$ in the topological order generated by a given topological sorting algorithm? For known algorithms, we show that these problems are log-space complete for NLOG. It also contains the lexicographically first topological sorting problem. The algorithms use the result that NLOG is closed under conplementation. 1. Introduction The topological sorting problem is, given a directed acyclic graph $G=(V, E)$ , to find a total ordering of its vertices such that if $(v, w)$ is an edge then $v$ is ordered before $w$ . It has important applications for analyzing programs and arranging the words in the glossary [6]. Moreover, it is used in designing many efficient sequential algorithms, for example, the maximum flow problem [11]. Some techniques for computing topological orders have been developed. The algorithm by Knuth [6] that computes the lexicographically first topological order runs in time $O(|E|)$ . Tarjan [11] also devised an $O(|E|)$ time algorithm by employ- ing the depth-first search method. Dekel, Nassimi and Sahni [4] showed a parallel algorithm using the parallel matrix multiplication technique. Ruzzo also devised a simple $NL^{*}$ -algorithm as is stated in [3].
    [Show full text]
  • Selected Sorting Algorithms
    Selected Sorting Algorithms CS 165: Project in Algorithms and Data Structures Michael T. Goodrich Some slides are from J. Miller, CSE 373, U. Washington Why Sorting? • Practical application – People by last name – Countries by population – Search engine results by relevance • Fundamental to other algorithms • Different algorithms have different asymptotic and constant-factor trade-offs – No single ‘best’ sort for all scenarios – Knowing one way to sort just isn’t enough • Many to approaches to sorting which can be used for other problems 2 Problem statement There are n comparable elements in an array and we want to rearrange them to be in increasing order Pre: – An array A of data records – A value in each data record – A comparison function • <, =, >, compareTo Post: – For each distinct position i and j of A, if i < j then A[i] ≤ A[j] – A has all the same data it started with 3 Insertion sort • insertion sort: orders a list of values by repetitively inserting a particular value into a sorted subset of the list • more specifically: – consider the first item to be a sorted sublist of length 1 – insert the second item into the sorted sublist, shifting the first item if needed – insert the third item into the sorted sublist, shifting the other items as needed – repeat until all values have been inserted into their proper positions 4 Insertion sort • Simple sorting algorithm. – n-1 passes over the array – At the end of pass i, the elements that occupied A[0]…A[i] originally are still in those spots and in sorted order.
    [Show full text]
  • Visualizing Sorting Algorithms Brian Faria Rhode Island College, [email protected]
    Rhode Island College Digital Commons @ RIC Honors Projects Overview Honors Projects 2017 Visualizing Sorting Algorithms Brian Faria Rhode Island College, [email protected] Follow this and additional works at: https://digitalcommons.ric.edu/honors_projects Part of the Education Commons, Mathematics Commons, and the Other Computer Sciences Commons Recommended Citation Faria, Brian, "Visualizing Sorting Algorithms" (2017). Honors Projects Overview. 127. https://digitalcommons.ric.edu/honors_projects/127 This Honors is brought to you for free and open access by the Honors Projects at Digital Commons @ RIC. It has been accepted for inclusion in Honors Projects Overview by an authorized administrator of Digital Commons @ RIC. For more information, please contact [email protected]. VISUALIZING SORTING ALGORITHMS By Brian J. Faria An Honors Project Submitted in Partial Fulfillment Of the Requirements for Honors in The Department of Mathematics and Computer Science Faculty of Arts and Sciences Rhode Island College 2017 Abstract This paper discusses a study performed on animating sorting al- gorithms as a learning aid for classroom instruction. A web-based animation tool was created to visualize four common sorting algo- rithms: Selection Sort, Bubble Sort, Insertion Sort, and Merge Sort. The animation tool would represent data as a bar-graph and after se- lecting a data-ordering and algorithm, the user can run an automated animation or step through it at their own pace. Afterwards, a study was conducted with a voluntary student population at Rhode Island College who were in the process of learning algorithms in their Com- puter Science curriculum. The study consisted of a demonstration and survey that asked the students questions that may show improve- ment when understanding algorithms.
    [Show full text]
  • Advanced Topics in Sorting
    Advanced Topics in Sorting complexity system sorts duplicate keys comparators 1 complexity system sorts duplicate keys comparators 2 Complexity of sorting Computational complexity. Framework to study efficiency of algorithms for solving a particular problem X. Machine model. Focus on fundamental operations. Upper bound. Cost guarantee provided by some algorithm for X. Lower bound. Proven limit on cost guarantee of any algorithm for X. Optimal algorithm. Algorithm with best cost guarantee for X. lower bound ~ upper bound Example: sorting. • Machine model = # comparisons access information only through compares • Upper bound = N lg N from mergesort. • Lower bound ? 3 Decision Tree a < b yes no code between comparisons (e.g., sequence of exchanges) b < c a < c yes no yes no a b c b a c a < c b < c yes no yes no a c b c a b b c a c b a 4 Comparison-based lower bound for sorting Theorem. Any comparison based sorting algorithm must use more than N lg N - 1.44 N comparisons in the worst-case. Pf. Assume input consists of N distinct values a through a . • 1 N • Worst case dictated by tree height h. N ! different orderings. • • (At least) one leaf corresponds to each ordering. Binary tree with N ! leaves cannot have height less than lg (N!) • h lg N! lg (N / e) N Stirling's formula = N lg N - N lg e N lg N - 1.44 N 5 Complexity of sorting Upper bound. Cost guarantee provided by some algorithm for X. Lower bound. Proven limit on cost guarantee of any algorithm for X.
    [Show full text]
  • Sorting Algorithm 1 Sorting Algorithm
    Sorting algorithm 1 Sorting algorithm In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions: 1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order); 2. The output is a permutation, or reordering, of the input. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and lower bounds. Classification Sorting algorithms used in computer science are often classified by: • Computational complexity (worst, average and best behaviour) of element comparisons in terms of the size of the list . For typical sorting algorithms good behavior is and bad behavior is .
    [Show full text]
  • Faster Shellsort Sequences: a Genetic Algorithm Application
    Faster Shellsort Sequences: A Genetic Algorithm Application Richard Simpson Shashidhar Yachavaram [email protected] [email protected] Department of Computer Science Midwestern State University 3410 Taft, Wichita Falls, TX 76308 Phone: (940) 397-4191 Fax: (940) 397-4442 Abstract complex problems that resist normal research approaches. This paper concerns itself with the application of Genetic The recent popularity of genetic algorithms (GA's) and Algorithms to the problem of finding new, more efficient their application to a wide variety of problems is a result of Shellsort sequences. Using comparison counts as the main their ease of implementation and flexibility. Evolutionary criteria for quality, several new sequences have been techniques are often applied to optimization and related discovered that sort one megabyte files more efficiently than search problems where the fitness of a potential result is those presently known. easily established. Problems of this type are generally very difficult and often NP-Hard, so the ability to find a reasonable This paper provides a brief introduction to genetic solution to these problems within an acceptable time algorithms, a review of research on the Shellsort algorithm, constraint is clearly desirable. and concludes with a description of the GA approach and the associated results. One such problem that has been researched thoroughly is the search for Shellsort sequences. The attributes of this 2. Genetic Algorithms problem make it a prime target for the application of genetic algorithms. Noting this, the authors have designed a GA that John Holland developed the technique of GA's in the efficiently searches for Shellsort sequences that are top 1960's.
    [Show full text]