On the Usage of Sorting Networks to Big Data

Blanca López and Nareli Cruz-Cortés
Artificial Intelligence Laboratory, Centro de Investigación en Computación, Instituto Politécnico Nacional (CIC-IPN), México D.F., México

Abstract. Sorting data is perhaps the most popular classical task in Computer Science. For most applications, the main goal is to minimize the number of comparisons and the execution time that the sorting algorithm consumes. Sorting Networks are algorithms that perform exactly the same comparisons to order any input permutation of a given input size; that is, no step depends on the result of a previous comparison. Thus, designing Sorting Networks with a minimal number of comparisons becomes a very important task. However, it is an NP-hard problem. Sorting Networks with a minimal number of comparisons (or at least close to the optimum) have been published in the specialized literature only for small input sizes, from 3 to 16. These input sizes are far too small for real-world problems. In this work we propose a new strategy that improves QuickSort performance on large inputs by coupling it with some Sorting Networks. The results show that it helps reduce the sorting execution time.

Keywords: Sorting Networks, QuickSort

1. Introduction

Sorting algorithms are perhaps among the most studied problems in Computer Science, from both the theoretical and the practical points of view. Their applications can be found in Data Processing Systems, Network Communication Systems, Image Processing, Artificial Intelligence, Cryptography, Computer Security and Information Systems, among many others. A large set of sorting algorithms can be found in the specialized literature, such as quicksort, bubble sort, merge sort, shell sort, heapsort, insertion sort, introsort, shear sort, etc. Choosing the most efficient algorithm usually depends on the type of application at hand.

In general, sorting algorithms can be classified into two groups: adaptive and non-adaptive. An adaptive algorithm executes its compare-interchange operations depending on the input data. Non-adaptive algorithms, on the other hand, have fixed operations that are executed regardless of the configuration of the input data (e.g. for all possible permutations). They always execute the same compare-interchange operations. Sorting Networks (SN) are an example of non-adaptive algorithms.

Taking advantage of the divide-and-conquer strategy used by QuickSort, we design a strategy in which some SN are coupled to it in order to reduce the comparisons performed by QuickSort.

The remainder of this paper is organized as follows. Section 2 presents some basic concepts about Quicksort and Sorting Networks. Section 3 explains the proposal. Section 4 presents the experiments and results. Finally, Section 5 draws some conclusions.

2. Basic Concepts

2.1 Quicksort Algorithm

Quicksort (also known as Partition-Exchange Sort) was first presented in 1960 by Tony Hoare [4]. It uses a divide-and-conquer strategy that divides a large list into two smaller sublists: one with the smallest values and another with the greatest. Then, each sublist is recursively ordered. The algorithm is as follows:

1) Choose an element from the list, which will be called the pivot.
2) Order the list in such a way that all the values less than the pivot are located to its left (before the pivot) and all the values greater than the pivot are located to its right (after the pivot). This way, the pivot value is in its final position.
3) For each sublist, repeat the previous steps recursively until the sublist size is zero or one.

This idea is illustrated in Figure 1. QuickSort is a very efficient algorithm that, in the average and best cases, makes $O(n \log n)$ comparisons to sort $n$ elements. In the worst case it makes $O(n^2)$. Some variants of this algorithm have been presented in [6][3], whose authors proposed modifications to reduce the execution time.

[Fig. 1. Recursive partition operation of the Quicksort algorithm. Panel labels: Pivot; Iteration 1 (two QS sublists); Iteration 2 (four QS sublists).]

2.2 Sorting Networks

SN are algorithms whose main feature is being oblivious: their current operations (comparisons) do not depend on the input data or on the previous comparisons [5][7]. Unlike other well-known sorting algorithms (bubble sort, quicksort, etc.), the sequence and number of comparisons are exactly the same no matter the input configuration (permutation). A SN exhibits two main features:

• The comparisons (called comparators) are fixed before the SN execution,
• Some comparisons can be executed in parallel.

A SN is composed of a set of comparators, each of which executes a compare-interchange action between two elements $(a, b)$. The element $a$ must not be greater than $b$; if it is, the values are interchanged to $(b, a)$. So, for a given input list of size $n$, the set of comparators forming the SN is applied to it, and the output is the list in monotonically non-decreasing order.

Typically, a SN is represented graphically by $n$ horizontal lines corresponding to the $n$ input data, together with vertical lines that represent comparisons between the value at the top end and the value at the bottom end. If the value at the top is greater than the value at the bottom, these values must be swapped.

The input data are placed at the left; after they have traveled across the horizontal lines and executed the comparisons found, the output is obtained at the right. The data must be sorted in ascending order from top to bottom.

See, for example, the SN for $n = 4$ inputs illustrated in Figure 2. Each input is placed on one of the horizontal lines labeled $x_0, x_1, x_2, x_3$. The vertical lines are the comparators $c_0, c_1, c_2, c_3, c_4$, each receiving two values; e.g., the comparator $c_0$ receives the values $x_0$ and $x_1$, and so on. All the data values go from left to right, executing a compare-interchange each time a comparator is found. So, the comparators $c_0$ and $c_1$ are executed first, then $c_2$ and $c_3$, and finally $c_4$. $c_0$ evaluates $4 > 2$, thus the values of $x_0$ and $x_1$ are swapped. $c_1$ evaluates $1 < 3$, so the values of $x_2$ and $x_3$ remain unchanged. This process continues until all the comparators have been applied, so the final sorted list $y_0, y_1, y_2, y_3$ at the right satisfies $y_0 \le y_1 \le y_2 \le y_3$.

[Fig. 2. Sorting network for n = 4 inputs. Inputs: x0 = 4, x1 = 2, x2 = 1, x3 = 3; comparators c0 to c4; outputs: y0 = 1, y1 = 2, y2 = 3, y3 = 4.]

As a matter of fact, if an optimal SN for input size $n$ can be designed (i.e. one with a minimal number of comparators), then it is the best way to sort $n$ data. Designing SN with a minimal number of comparators and/or high parallelism is a classical and interesting problem in Computer Science, and it remains an open research area.

It is important to notice that the optimal SN for input sizes greater than $n = 16$ are not known. Only lower bounds on the number of comparators are theoretically known [5]. The most studied SN is the one with input size $n = 16$, which is a relatively small value considering the huge quantity of information that modern systems must handle. The best known SN for $n = 16$ has only 60 comparators; for example, the one designed by Green [5] is illustrated in Figure 3.

[Fig. 3. SN with input size n = 16 designed by Green. It is the best known, with 60 comparators.]

In [2] K. E. Batcher proposed an interesting algorithm called Odd-Even Merge to merge two SN into one. That is, if we have a SN with input size $n$, it is possible to obtain a SN with input size $2n$ by merging two copies of the original SN of size $n$ each. By following this algorithm it is possible to obtain SN with larger input sizes (SN for input sizes greater than $n = 16$ are usually considered large).

As an example, consider increasing the input size to $2n$ starting from a SN for $n = 4$. A set of ordering operations and two lists of sizes $g$ and $h$ are considered. Figure 4 shows the two lists to be rearranged. The list $t$ contains the numbers $\{t_1, t_2, \ldots, t_g\}$ in order. The second list, $w$, is composed of $\{w_1, w_2, \ldots, w_h\}$. The output of the merging network has size $g + h$, and the numbers of the merged lists in ascending order are $\{u_1, \ldots, u_{g+h-1}, u_{g+h}\}$. That is, at first, a list of size $g + h$ can be built by a merging network from the odd-indexed numbers of the two input lists and the even-

[Figure: merging network diagram. Labels: inputs x1 to x3, y1 to y3, t1 to tg, w1 to wh; outputs u1 to u7; comparators C0 to C4. Caption not recovered.]

[Fig. 5. Odd-even mergesort scheme. A SN for n = 8 inputs is constructed from two SN for n = 4.]
Recommended publications
  • Batcher's Algorithm
    18.310 lecture notes, Fall 2010. Batcher’s Algorithm. Prof. Michel Goemans. Perhaps the most restrictive version of the sorting problem requires not only no motion of the keys beyond compare-and-switches, but also that the plan of comparison-and-switches be fixed in advance. In each of the methods mentioned so far, the comparison to be made at any time often depends upon the result of previous comparisons. For example, in HeapSort, it appears at first glance that we are making only compare-and-switches between pairs of keys, but the comparisons we perform are not fixed in advance. Indeed when fixing a headless heap, we move either to the left child or to the right child depending on which child had the largest element; this is not fixed in advance. A sorting network is a fixed collection of comparison-switches, so that all comparisons and switches are between keys at locations that have been specified from the beginning. These comparisons are not dependent on what has happened before. The corresponding sorting algorithm is said to be non-adaptive. We will describe a simple recursive non-adaptive sorting procedure, named Batcher’s Algorithm after its discoverer. It is simple and elegant but has the disadvantage that it requires on the order of $n(\log n)^2$ comparisons, which is larger by a factor of the order of $\log n$ than the theoretical lower bound for comparison sorting. For a long time (ten years is a long time in this subject!) nobody knew if one could find a sorting network better than this one.
  • Sorting Algorithms Correctness, Complexity and Other Properties
    Sorting Algorithms: Correctness, Complexity and other Properties. Joshua Knowles, School of Computer Science, The University of Manchester. COMP26912 - Week 9. LF17, April 1 2011. The Importance of Sorting. Important because: • Fundamental to organizing data • Principles of good algorithm design (correctness and efficiency) can be appreciated in the methods developed for this simple (to state) task. Every algorithms book has a large section on Sorting... On the Other Hand: • Progress in computer speed and memory has reduced the practical importance of (further developments in) sorting • quicksort() is often an adequate answer in many applications. However, you still need to know your way (a little) around the key sorting algorithms. Overview. What you should learn about sorting (what is examinable): • Definition of sorting. Correctness of sorting algorithms • How the following work: Bubble sort, Insertion sort, Selection sort, Quicksort, Merge sort, Heap sort, Bucket sort, Radix sort • Main properties of those algorithms • How to reason about complexity: worst case and special cases. Covered in: the course book; labs; this lecture; wikipedia; wider reading. Relevant Pages of the Course Book. Selection sort: 97 (very short description only). Insertion sort: 98 (very short). Merge sort: 219–224 (pages on multi-way merge not needed). Heap sort: 100–106 and 107–111. Quicksort: 234–238. Bucket sort: 241–242. Radix sort: 242–243. Lower bound on sorting: 239–240. Practical issues: 244. Some of the exercise on pp.
  • An Evolutionary Approach for Sorting Algorithms
    ORIENTAL JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY. An International Open Free Access, Peer Reviewed Research Journal. ISSN: 0974-6471. December 2014, Vol. 7, No. (3): Pgs. 369-376. Published By: Oriental Scientific Publishing Co., India. www.computerscijournal.org. Root to Fruit (2): An Evolutionary Approach for Sorting Algorithms. Pramod Kadam and Sachin Kadam, BVDU, IMED, Pune, India. (Received: November 10, 2014; Accepted: December 20, 2014). Abstract: This paper continues the earlier evolutionary study of the sorting problem and sorting algorithms (Root to Fruit (1): An Evolutionary Study of Sorting Problem) [1] and concludes with a chronological list of the early pioneers of the sorting problem and its algorithms. Later in the study, a graphical method is used to present the evolution of the sorting problem and sorting algorithms on a timeline. Key words: Evolutionary study of sorting, History of sorting, Early sorting algorithms, List of inventors for sorting. Introduction: In spite of plentiful literature and research in the sorting algorithm domain, the documentation is messy as far as credit is concerned [2]. Perhaps this problem is due to a lack of coordination and the unavailability of a common platform or knowledge base in the domain. An evolutionary study of sorting algorithms or the sorting problem is the foundation of a future knowledge base for the sorting problem domain [1]. [...] name and their contribution may have been skipped from the study. Therefore readers have every right to extend this study with valid proofs. Ultimately, our objective behind this research is clear: to strengthen the evolutionary study of sorting algorithms and move towards a good knowledge base that preserves the work of our forebears for the upcoming generation. Otherwise the coming generation could hardly receive information about sorting problems, and syllabi may be restricted to some
  • Sorting and Asymptotic Complexity
    SORTING AND ASYMPTOTIC COMPLEXITY. Lecture 14, CS2110 – Fall 2013. Reading and Homework: Textbook, chapter 8 (general concepts) and 9 (MergeSort, QuickSort). Thought question: Cloud computing systems sometimes sort data sets with hundreds of billions of items, far too much to fit in any one computer. So they use multiple computers to sort the data. Suppose you had N computers and each has room for D items, and you have a data set with N*D/2 items to sort. How could you sort the data? Assume the data is initially in a big file, and you’ll need to read the file, sort the data, then write a new file in sorted order.
    InsertionSort:
        // sort a[], an array of int
        for (int i = 1; i < a.length; i++) {
            // Push a[i] down to its sorted position in a[0..i]
            int temp = a[i];
            int k;
            for (k = i; 0 < k && temp < a[k-1]; k--)
                a[k] = a[k-1];
            a[k] = temp;
        }
    Worst-case: O(n^2) (reverse-sorted input). Best-case: O(n) (sorted input). Expected case: O(n^2); expected number of inversions: n(n-1)/4. Many people sort cards this way. Invariant of main loop: a[0..i-1] is sorted. Works especially well when input is nearly sorted.
    SelectionSort:
        // sort a[], an array of int (pseudocode)
        for (int i = 0; i < a.length; i++) {
            int m = index of minimum of a[i..];
            swap a[i] and a[m];
        }
    Another common way for people to sort cards. Runtime: worst-case O(n^2), best-case O(n^2).
  • Algoritmi Za Sortiranje U Programskom Jeziku C++ Završni Rad
    UNIVERSITY OF RIJEKA, FACULTY OF HUMANITIES AND SOCIAL SCIENCES IN RIJEKA, DEPARTMENT OF POLYTECHNICS. Sorting Algorithms in the C++ Programming Language (Algoritmi za sortiranje u programskom jeziku C++). Final thesis. Thesis supervisor: doc. dr. sc. Marko Maliković. Student: Alen Jakus. Rijeka, 2016. University of Rijeka, Faculty of Humanities and Social Sciences, Department of Polytechnics, Rijeka, Sveučilišna avenija 4. Committee for Final and Graduate Examinations. Rijeka, 7 April 2016. FINAL THESIS ASSIGNMENT (university undergraduate study programme in polytechnics). Candidate: Alen Jakus. Assignment: Sorting Algorithms in the C++ Programming Language. The solution must cover the following: 1. Give an overview of sorting algorithms. 2. Describe the selected sorting algorithms. 3. Illustrate the operation of the selected sorting algorithms with diagrams. 4. Describe the basic properties of the C++ programming language. 5. Describe in detail the data types, derived data forms, statements and other C++ elements used in the solutions of the selected problems. 6. Describe the solutions obtained from the written programs. 7. Provide the complete code in C++. The final thesis must comply with the Regulations on the Graduate Thesis and the Guidelines for Preparing the Final Thesis of the university undergraduate programme. Assignment handed to the candidate: 7 April 2016. Submission deadline: ____________________ Date of submission: ____________________ Assigned by: Doc. dr. sc. Marko Maliković. Brief description of the assignment: Give an overview of sorting algorithms. Describe the selected sorting algorithms.
  • Advanced Topics in Sorting
    Advanced Topics in Sorting. Outline: complexity, system sorts, duplicate keys, comparators. Complexity of sorting. Computational complexity: framework to study efficiency of algorithms for solving a particular problem X. Machine model: focus on fundamental operations. Upper bound: cost guarantee provided by some algorithm for X. Lower bound: proven limit on cost guarantee of any algorithm for X. Optimal algorithm: algorithm with best cost guarantee for X (lower bound ~ upper bound). Example: sorting. • Machine model = # comparisons (access information only through compares). • Upper bound = $N \lg N$ from mergesort. • Lower bound? [Figure: decision tree for sorting three values a, b, c; internal nodes are the comparisons a < b, b < c, a < c with yes/no branches, edges carry the code between comparisons (e.g., sequence of exchanges), and the leaves are the orderings a b c, a c b, c a b, b a c, b c a, c b a.] Comparison-based lower bound for sorting. Theorem. Any comparison-based sorting algorithm must use more than $N \lg N - 1.44 N$ comparisons in the worst case. Pf. • Assume input consists of $N$ distinct values $a_1$ through $a_N$. • Worst case dictated by tree height $h$. • $N!$ different orderings. • (At least) one leaf corresponds to each ordering. • Binary tree with $N!$ leaves cannot have height less than $\lg(N!)$. $h \ge \lg N! \ge \lg (N/e)^N = N \lg N - N \lg e \approx N \lg N - 1.44 N$ (the second inequality by Stirling's formula). Complexity of sorting. Upper bound: cost guarantee provided by some algorithm for X. Lower bound: proven limit on cost guarantee of any algorithm for X.
  • Sorting Algorithm 1 Sorting Algorithm
    Sorting algorithm. In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions: 1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order); 2. The output is a permutation, or reordering, of the input. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and lower bounds. Classification. Sorting algorithms used in computer science are often classified by: • Computational complexity (worst, average and best behaviour) of element comparisons in terms of the size of the list ($n$). For typical sorting algorithms good behavior is $O(n \log n)$ and bad behavior is $O(n^2)$.
  • How to Sort out Your Life in O(N) Time
    How to sort out your life in O(n) time. Karel Čížek, @kaja47, funkcionalne.cz. I said, "Kiss me, you're beautiful - These are truly the last days" (Godspeed You! Black Emperor, The Dead Flag Blues). Everyone, deep in their hearts, is waiting for the end of the world to come. (Haruki Murakami, 1Q84) ... Free lunch 1965 – 2022. Cramming More Components onto Integrated Circuits http://www.cs.utexas.edu/~fussell/courses/cs352h/papers/moore.pdf He pays his staff in junk. (William S. Burroughs, Naked Lunch) Sorting? quicksort and chill. HS 1964, QS 1959, MS 1945, RS 1887. quicksort, mergesort, heapsort, radix sort, multi-way merge sort, samplesort, insertion sort, selection sort, library sort, counting sort, bucketsort, bitonic merge sort, Batcher odd-even sort, odd–even transposition sort, radix quick sort, radix merge sort*, burst sort. binary search tree, B-tree, R-tree, VP tree, trie, log-structured merge tree, skip list, YOLO tree* vs. hashing. Robin Hood hashing https://cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf xs.sorted.take(k) (take (sort xs) k) qsort(lotOfIntegers) It may be the wrong decision, but fuck it, it's mine. (Mark Z. Danielewski, House of Leaves) I tell you, my man, this is the American Dream in action! We’d be fools not to ride this strange torpedo all the way out to the end. (HST, FALILV) Linear time sorting? I owe the discovery of Uqbar to the conjunction of a mirror and an Encyclopedia. (Jorge Luis Borges, Tlön, Uqbar, Orbis Tertius) Sorting out graph processing https://github.com/frankmcsherry/blog/blob/master/posts/2015-08-15.md Radix Sort Revisited http://www.codercorner.com/RadixSortRevisited.htm Sketchy radix sort https://github.com/kaja47/sketches (thinking|drinking|WTF)* I know they accuse me of arrogance, and perhaps misanthropy, and perhaps of madness.
  • Evaluation of Sorting Algorithms, Mathematical and Empirical Analysis of Sorting Algorithms
    International Journal of Scientific & Engineering Research, Volume 8, Issue 5, May-2017, p. 86. ISSN 2229-5518. Evaluation of Sorting Algorithms, Mathematical and Empirical Analysis of Sorting Algorithms. Sapram Choudaiah, P Chandu Chowdary, M Kavitha. ABSTRACT: Sorting is an important operation in many real-life applications, and a number of sorting algorithms exist to date. This paper continues the earlier evolutionary study of the sorting problem and sorting algorithms, which concluded with a chronological list of the early pioneers of the sorting problem and its algorithms. Later in the study, a graphical method is used to present the evolution of the sorting problem and sorting algorithms on a timeline. An extensive analysis has been done and compared with the traditional mathematical analysis of Bubble Sort, Selection Sort, Insertion Sort, Merge Sort and Quick Sort. Observations have been obtained by comparing with the existing approaches for all of these sorts. The "Empirical Analysis" consists of a rigorous complexity analysis of the various sorting algorithms, in which the comparisons and actual swaps of all the variables are counted. All algorithms were tested on random data of various sizes, from small to large. It is an attempt to compare the performance of various sorting algorithms, with the aim of comparing their speed when sorting integer inputs. The empirical data obtained by using the program reveals that Quick sort is the fastest algorithm and Bubble sort is the slowest. Keywords: Bubble Sort, Insertion Sort, Quick Sort, Merge Sort, Selection Sort, Heap Sort, CPU Time. Introduction: In spite of plentiful literature and research in the sorting algorithm domain, the documentation is messy as far as credit is concerned [2]. [...] more dimension to student for thinking [4]. Whereas, this thinking become a mark of respect to all our
  • Gsoc 2018 Project Proposal
    GSoC 2018 Project Proposal. Description: Implement the project idea "sorting algorithms benchmark and implementation (2018)". Applicant Information: Name: Kefan Yang. Country of Residence: Canada. University: Simon Fraser University. Year of Study: Third year. Major: Computing Science. Self Introduction: I am Kefan Yang, a third-year computing science student from Simon Fraser University, Canada. I have rich experience as a full-stack web developer, so I am familiar with different kinds of databases, such as PostgreSQL, MySQL and MongoDB. My understanding of database systems goes well beyond knowing how to use them. In the database course I took at the university, I implemented a simple SQL database. It supports basic SQL statements like select, insert, delete and update, and uses a B+ tree to index the records by default. The size of each node in the B+ tree is set to the disk block size to maximize the performance of disk operations. Also, several kinds of merging algorithms are used to perform a cross-table query. More details about this database project can be found here. Additionally, I have a very solid foundation in basic algorithms and data structures. I participated in the division 1 contest of the 2017 ACM-ICPC Pacific Northwest Regionals as a representative of Simon Fraser University, which shows my ability to understand and apply different kinds of algorithms. I believe the contest experience will be a big help for this project. Benefits to the PostgreSQL Community: The sorting routine is an important part of many modules in PostgreSQL. Currently, PostgreSQL is using the median-of-three quicksort introduced by Bentley and McIlroy in 1993 [1], which is somewhat outdated.
  • Sorting Algorithm 1 Sorting Algorithm
    Sorting algorithm. A sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) which require input data to be in sorted lists; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions: 1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order); 2. The output is a permutation (reordering) of the input. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2006). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and upper and lower bounds. Classification. Sorting algorithms are often classified by: • Computational complexity (worst, average and best behavior) of element comparisons in terms of the size of the list ($n$). For typical serial sorting algorithms good behavior is $O(n \log n)$, with parallel sort in $O(\log^2 n)$, and bad behavior is $O(n^2)$.
  • Introspective Sorting and Selection Algorithms
    Introspective Sorting and Selection Algorithms. David R. Musser, Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY, musser@cs.rpi.edu. Abstract: Quicksort is the preferred in-place sorting algorithm in many contexts, since its average computing time on uniformly distributed inputs is $\Theta(N \log N)$ and it is in fact faster than most other sorting algorithms on most inputs. Its drawback is that its worst-case time bound is $\Theta(N^2)$. Previous attempts to protect against the worst case by improving the way quicksort chooses pivot elements for partitioning have increased the average computing time too much; one might as well use heapsort, which has a $\Theta(N \log N)$ worst-case time bound but is on the average 2 to 5 times slower than quicksort. A similar dilemma exists with selection algorithms (for finding the i-th largest element) based on partitioning. This paper describes a simple solution to this dilemma: limit the depth of partitioning, and for subproblems that exceed the limit switch to another algorithm with a better worst-case bound. Using heapsort as the "stopper" yields a sorting algorithm that is just as fast as quicksort in the average case but also has a $\Theta(N \log N)$ worst-case time bound. For selection, a hybrid of Hoare's find algorithm, which is linear on average but quadratic in the worst case, and the Blum-Floyd-Pratt-Rivest-Tarjan algorithm is as fast as Hoare's algorithm in practice, yet has a linear worst-case time bound. Also discussed are issues of implementing the new algorithms as generic algorithms and accurately measuring their performance in the framework