Sorting: From Theory to Practice - Sorting out sorts (see libsort.cpp)

Total pages: 16

File type: PDF, size: 1020 KB

14.1 Sorting: From Theory to Practice (CPS 100)

* Why do we study sorting?
  - Because we have to
  - Because sorting is beautiful
  - It is an example of algorithm analysis in a simple, useful setting
* There are n sorting algorithms; how many should we study?
  - O(n)? O(log n)? ...
  - Why do we study more than one algorithm?
    - Some are good, some are bad, some are very, very sad
    - They illustrate paradigms of trade-offs and algorithmic design
  - Which sorting algorithm is best?
  - Which sort should you call from code you write?

14.2 Sorting out sorts (see libsort.cpp)

* Simple, O(n^2) sorts, for sorting n elements:
  - Selection sort: n^2 comparisons, n swaps, easy to code
  - Insertion sort: n^2 comparisons, n^2 moves, stable, fast
  - Bubble sort: n^2 everything; slow, slower, and ugly
* Divide-and-conquer faster sorts, O(n log n) for n elements:
  - Quicksort: fast in practice, O(n^2) worst case
  - Merge sort: good worst case, great for linked lists, but uses extra storage for vectors/arrays
* Other sorts:
  - Heap sort: basically priority-queue sorting
  - Radix sort: doesn't compare keys; uses digits/characters
  - Shell sort: quasi-insertion, fast in practice, non-recursive

14.3 Selection sort: summary

* Simple-to-code n^2 sort: n^2 comparisons, n swaps

    void selectSort(tvector<string>& a)
    {
        for(int k = 0; k < a.size(); k++) {
            int minIndex = findMin(a, k, a.size());
            swap(a[k], a[minIndex]);
        }
    }

* Number of comparisons: sum of k for k = 1 to n, i.e. 1 + 2 + ... + n = n(n+1)/2 = O(n^2)
  - Swaps?
  - Invariant: ????? (diagram label: the prefix a[0..k-1] is sorted and won't move; each of its elements is in its final position)

14.4 Insertion sort: summary

* Stable sort, O(n^2), good on nearly sorted vectors
  - Stable sorts maintain the order of equal keys
  - Good for sorting on two criteria: name, then age

    void insertSort(tvector<string>& a)
    {
        int k, loc;
        string elt;
        for(k = 1; k < a.size(); k++) {
            elt = a[k];
            loc = k;
            // shift until spot for elt is found
            while (0 < loc && elt < a[loc-1]) {
                a[loc] = a[loc-1];   // shift right
                loc = loc - 1;
            }
            a[loc] = elt;
        }
    }

  - Invariant: ????? (diagram label: the prefix a[0..k-1] is sorted relative to each other, but its elements are not yet in final position)
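The slide code above leans on Tapestry's tvector and a findMin helper that isn't shown. As a self-contained sketch of the same two quadratic sorts in standard C++ (std::vector in place of tvector, with findMin written inline as an assumed helper):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Selection sort: one swap per pass, n swaps total.
void selectSort(std::vector<std::string>& a) {
    for (size_t k = 0; k < a.size(); k++) {
        size_t minIndex = k;                    // find smallest in a[k..n-1]
        for (size_t j = k + 1; j < a.size(); j++)
            if (a[j] < a[minIndex]) minIndex = j;
        std::swap(a[k], a[minIndex]);
    }
}

// Insertion sort: shift the sorted prefix right until elt's spot is found.
void insertSort(std::vector<std::string>& a) {
    for (size_t k = 1; k < a.size(); k++) {
        std::string elt = a[k];
        size_t loc = k;
        while (0 < loc && elt < a[loc - 1]) {   // shift right
            a[loc] = a[loc - 1];
            loc--;
        }
        a[loc] = elt;
    }
}
```

Both run in O(n^2) comparisons; insertion sort does O(n) work on already-sorted input, which is the "good on nearly sorted vectors" point above.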
14.5 Bubble sort: summary of a dog

* For completeness you should know about this sort
  - Few, if any, redeeming features; really slow, really, really slow
  - Insertion sort can be coded to recognize an already-sorted vector; that is not worth doing for bubble sort, which is much slower than insertion sort anyway

    void bubbleSort(tvector<string>& a)
    {
        for(int j = a.size()-1; j >= 0; j--) {
            for(int k = 0; k < j; k++) {
                if (a[k] > a[k+1]) {
                    swap(a[k], a[k+1]);
                }
            }
        }
    }

* "Bubble" elements down the vector/array
  - Invariant: ????? (diagram label: the suffix beyond j is sorted, in final position)

14.6 Summary of simple sorts

* Selection sort makes only n swaps, good for "heavy" data
  - Moving objects with lots of state, e.g., ...
    - A string isn't heavy; why? (pointer and pointee)
    - What happens in Java?
    - Wrap heavy items in a "smart pointer proxy"
* Insertion sort is good on nearly sorted data; it's stable, it's fast
  - Also the foundation for Shell sort, a very fast non-recursive sort
  - More complicated to code, but still relatively simple, and fast
* Bubble sort is a travesty? But it's fast to code if you know it!
  - Can be parallelized, but on one machine don't go near it (see quotes at end of slides)

14.7 Quicksort: fast in practice

* Invented in 1962 by C.A.R. Hoare, who at first didn't understand recursion
  - Worst case is O(n^2), but avoidable in nearly all cases
  - In 1997 Introsort was published (Musser, "introspective sort")
    - Like quicksort in practice, but recognizes when it is going bad and switches to heapsort

    void quick(tvector<string>& a, int left, int right)
    {
        if (left < right) {
            int pivot = partition(a, left, right);
            quick(a, left, pivot-1);
            quick(a, pivot+1, right);
        }
    }

* Recurrence?

14.8 Partition code for quicksort

* Easy to develop partition from its invariant
  - What we want: a[left..pIndex] <= pivot and a[pIndex+1..right] > pivot
  - What we have: ?????????????? (the unexamined range from left to right)

    int partition(tvector<string>& a, int left, int right)
    {
        string pivot = a[left];
        int k, pIndex = left;
        for(k = left+1; k <= right; k++) {
            if (a[k] <= pivot) {
                pIndex++;
                swap(a[k], a[pIndex]);
            }
        }
        swap(a[left], a[pIndex]);
        return pIndex;
    }

* Loop invariant: with pivot X at a[left], a[left+1..pIndex] <= X, a[pIndex+1..k-1] > X, and a[k..right] is unexamined
  - A statement that is true each time the loop test is evaluated, used to verify the correctness of the loop
* Can swap a chosen pivot into a[left] before the loop (pivot index); nearly sorted data is then still OK

14.9 Analysis of Quicksort

* Average-case and worst-case analysis
  - Recurrence for the worst case: T(n) = T(n-1) + T(1) + O(n)
  - What about the average? T(n) = 2T(n/2) + O(n)
* Reason informally:
  - Two calls on vectors of size n/2
  - Four calls on vectors of size n/4
  - ... How many calls? How much work is done on each call?
* Partition: typically find the median of the left, middle, and right elements, swap it into place, and go
  - Avoids bad performance on nearly sorted data
* In practice: remove some (all?) recursion, avoid lots of "clones"

14.10 Tail recursion elimination

* If the last statement is a recursive call, the recursion can be replaced with iteration
  - The call cannot be part of an expression
  - Some compilers do this automatically

    void foo(int n)
    {
        if (0 < n) {
            cout << n << endl;
            foo(n-1);
        }
    }

    void foo2(int n)
    {
        while (0 < n) {
            cout << n << endl;
            n = n-1;
        }
    }

* What if the cout << and the recursive call were switched?
* What about recursive factorial? return n*factorial(n-1);

14.11 Merge sort: worst case O(n log n)

* Divide and conquer: a recursive sort
  - Divide the list/vector into two halves
    - Sort each half
    - Merge the sorted halves together
  - What is the complexity of merging two sorted lists?
  - What is the recurrence relation for merge sort as described? T(n) = 2T(n/2) + O(n)

14.12 Merge sort: lists or vectors

* Mergesort for vectors:

    void mergesort(tvector<string>& a, int left, int right)
    {
        if (left < right) {
            int mid = (right+left)/2;
            mergesort(a, left, mid);
            mergesort(a, mid+1, right);
            merge(a, left, mid, right);
        }
    }

* What is the advantage of a vector over a linked list for merge sort?
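The "remove some (all?) recursion" advice combines naturally with the tail-call idea: recurse only on the smaller side of the partition and loop on the larger, so the stack depth stays O(log n) even when quicksort degenerates. A sketch in standard C++, assuming the slide 14.8 partition scheme (pivot taken from a[left]):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Partition as on slide 14.8: pivot = a[left], returns its final index.
int partition(std::vector<std::string>& a, int left, int right) {
    std::string pivot = a[left];
    int pIndex = left;                       // invariant: a[left+1..pIndex] <= pivot
    for (int k = left + 1; k <= right; k++) {
        if (a[k] <= pivot) {
            pIndex++;
            std::swap(a[k], a[pIndex]);
        }
    }
    std::swap(a[left], a[pIndex]);           // pivot lands in its final position
    return pIndex;
}

// Quicksort with the tail call turned into a loop: recurse on the
// smaller side, iterate on the larger, bounding the stack at O(log n).
void quick(std::vector<std::string>& a, int left, int right) {
    while (left < right) {
        int p = partition(a, left, right);
        if (p - left < right - p) {
            quick(a, left, p - 1);           // smaller side: recurse
            left = p + 1;                    // larger side: loop
        } else {
            quick(a, p + 1, right);
            right = p - 1;
        }
    }
}
```

Introsort takes the complementary precaution: it tracks recursion depth and switches to heapsort when the depth exceeds a multiple of log n.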
* What's different when linked lists are used?
  - What about merging: what is the advantage of a linked list?
  - Do the differences affect complexity? Why?
  - A vector requires auxiliary storage (or very fancy coding)
* How does merge work?

14.13 Mergesort continued

* The vector code for merge isn't pretty, but it's not hard
  - Mergesort itself is elegant

    void merge(tvector<string>& a,
               int left, int middle, int right);
    // pre:  left <= middle <= right,
    //       a[left] <= ... <= a[middle],
    //       a[middle+1] <= ... <= a[right]
    // post: a[left] <= ... <= a[right]

* Why is this prototype potentially simpler for linked lists?
  - What will the prototype be? What is the complexity?

14.14 Summary of O(n log n) sorts

* Quicksort is relatively straightforward to code, and very fast
  - The worst case is very unlikely, but possible, therefore ...
  - But if lots of elements are equal, performance will be bad
    - One million integers from the range 0 to 10,000
    - How can we change partition to handle this?
* Merge sort is stable, it's fast, it's good for linked lists; harder to code?
  - Worst-case performance is O(n log n); compare quicksort
  - Extra storage needed for an array/vector
* Heapsort: more complex to code, good worst case, not stable
  - Basically a heap-based priority queue in a vector

14.15 Sorting in practice, see libsort.cpp

* Rarely will you need to roll your own sort, but when you do ...
  - What are the key issues?
* If you use a library sort, you need to understand the interface
* Know how to use the STL sorts even if you don't use the STL (slide 14.16)
  - The sort function takes iterators as parameters
  - vectors, strings, and other containers: "give me iterators"
    - What about linked-list iterators? Why aren't these "sortable"?
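The slides give merge's prototype but leave the body as an exercise. One possible implementation for vectors, using the O(n) auxiliary storage mentioned above (std::vector standing in for tvector):

```cpp
#include <cassert>
#include <string>
#include <vector>

// One way to fill in the merge body from slide 14.13: copy both runs
// into a scratch vector, then merge back into a[left..right]. Taking
// from the left run on ties keeps the sort stable.
void merge(std::vector<std::string>& a, int left, int middle, int right) {
    std::vector<std::string> aux(a.begin() + left, a.begin() + right + 1);
    int i = 0;                      // cursor into the left run of aux
    int j = middle - left + 1;      // cursor into the right run of aux
    for (int k = left; k <= right; k++) {
        if (i > middle - left)          a[k] = aux[j++]; // left run exhausted
        else if (j > right - left)      a[k] = aux[i++]; // right run exhausted
        else if (aux[j] < aux[i])       a[k] = aux[j++]; // strictly smaller on right
        else                            a[k] = aux[i++]; // ties favor left: stable
    }
}

// Mergesort driver, as on slide 14.12.
void mergesort(std::vector<std::string>& a, int left, int right) {
    if (left < right) {
        int mid = (right + left) / 2;
        mergesort(a, left, mid);
        mergesort(a, mid + 1, right);
        merge(a, left, mid, right);
    }
}
```

Merging two sorted runs of total length n is O(n), which is what makes the T(n) = 2T(n/2) + O(n) recurrence, and hence O(n log n) overall, work out.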
  - In C++ we have the STL, and sortall.cpp in Tapestry
    - The STL has sort and stable_sort
    - Tapestry has lots of sorts; its Mergesort is fast in practice, stable, and safe
  - In C the generic sort is complex to use because arrays are ugly
    - See libsort.cpp
  - In Java, guarantees and worst-case behavior are important
    - Why won't quicksort be used?
* Beware C's qsort: implementations vary widely and wildly across platforms
  - Last year it was slow on Solaris, this year it is fast. Why?
* Function objects permit the sorting criterion to be changed simply
* Calling the STL sort (slide 14.16):

    string s = "….";
    sort(s.begin(), s.end());

    tvector<string> vs;
    // fill vs with values
    sort(vs.begin(), vs.end());

14.17 In practice: templated sort functions

* Function templates permit us to write once and use several times, for several different types of vector
  - A template function "stamps out" a real function
  - Maintenance effort is saved, but the code is still large (why?)
* What properties must hold for vector elements?

14.18 Function object concept in Tapestry

* To encapsulate comparison (like operator <) in an object
  - Need a convention for the parameters: name and behavior
  - Enforceable by templates or by inheritance (or both)
* Name convention: know what the name of the function/method is
  - Two parameters: the (vector) elements being compared
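The reason linked-list iterators aren't "sortable" with the STL sort is that std::sort requires random-access iterators; a list's bidirectional iterators don't qualify, so std::list supplies its own member sort() instead (typically a merge sort, which suits linked nodes). A small sketch:

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>
#include <vector>

// std::sort works on random-access ranges: strings and vectors qualify.
std::string sortedChars(std::string s) {
    std::sort(s.begin(), s.end());
    return s;
}

std::vector<std::string> sortedWords(std::vector<std::string> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// std::list's iterators are only bidirectional, so
// std::sort(l.begin(), l.end()) won't compile; use the member sort.
std::list<std::string> sortedList(std::list<std::string> l) {
    l.sort();
    return l;
}
```

The same distinction explains the slide's question: the container, not the element type, decides whether the generic sort applies.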
* Properties that must hold for vector elements (slide 14.17):
  - Comparable using the < operator
  - Elements can be assigned to each other
* Template functions capture these property requirements in code
  - Part of generic programming
  - The newest Java (1.5 beta) has generics; older Java did not
* In Tapestry the comparison method is named compare; in the STL it is operator()
* compare returns an int; operator() returns a bool
  - operator() behaves like <, but is invoked like a function
  - For compare:
    - zero if the elements are equal
    - +1 (positive) if first > second
    - -1 (negative) if first < second

14.19 Function object example: Tapestry

    class StrLenComp // : public Comparer<string>
    {
      public:
        int compare(const string& a, const string& b) const
        // post: returns -1, +1, or 0 as a.length() is less than,
        //       greater than, or equal to b.length()
        {
            if (a.length() < b.length()) return -1;
            if (a.length() > b.length()) return 1;
            return 0;
        }
    };

14.20 Function object example: STL

    struct stllencomp
    {
        // for use with standard C++ sorting functions
        bool operator() (const string& a, const string& b) const
        {
            return a.length() < b.length();
        }
    };
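A usage sketch for the STL-style comparator: pass a function-object instance as the third argument to sort or stable_sort. std::stable_sort keeps equal-length strings in their input order, which is the stability point made on slide 14.4 (sort on two criteria):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// The STL-style length comparator from the slides.
struct stllencomp {
    bool operator()(const std::string& a, const std::string& b) const {
        return a.length() < b.length();
    }
};

// Sort by length; stable_sort preserves the original order of
// equal-length strings.
std::vector<std::string> byLength(std::vector<std::string> v) {
    std::stable_sort(v.begin(), v.end(), stllencomp());
    return v;
}
```

A Tapestry-style compare (returning -1/0/+1) can be adapted for the STL by wrapping it: operator() returns compare(a, b) < 0.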