Sorting Algorithms
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
Sort Algorithms 15-110 - Friday 2/28 Learning Objectives
Learning Objectives
• Recognize how different sorting algorithms implement the same process with different algorithms
• Recognize the general algorithm and trace code for three algorithms: selection sort, insertion sort, and merge sort
• Compute the Big-O runtimes of selection sort, insertion sort, and merge sort

Search Algorithms Benefit from Sorting
We use search algorithms a lot in computer science. Just think of how many times a day you use Google, or search for a file on your computer. We've determined that search algorithms work better when the items they search over are sorted. Can we write an algorithm to sort items efficiently?
Note: Python already has built-in sorting functions (sorted(lst) is non-destructive, lst.sort() is destructive). This lecture is about a few different algorithmic approaches for sorting.

Many Ways of Sorting
There are a ton of algorithms that we can use to sort a list. We'll use https://visualgo.net/bn/sorting to visualize some of these algorithms. Today, we'll specifically discuss three different sorting algorithms: selection sort, insertion sort, and merge sort. All three do the same action (sorting), but use different algorithms to accomplish it.

Selection Sort
Selection Sort Sorts From Smallest to Largest
The core idea of selection sort is that you sort from smallest to largest.
1. Start with none of the list sorted
2. Repeat the following steps until the whole list is sorted:
   a) Search the unsorted part of the list to find the smallest element
   b) Swap the found element with the first unsorted element
   c) Increment the size of the 'sorted' part of the list by one
Note: for selection sort, swapping the element currently in the front position with the smallest element is faster than sliding all of the numbers down in the list.
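The numbered steps above can be sketched in Python roughly as follows. This is a minimal illustration rather than the lecture's own code; the function name selection_sort and the choice to sort a copy (so the caller's list is left alone) are assumptions made for the example.

```python
def selection_sort(items):
    """Return a new list with the elements of items in ascending order."""
    result = list(items)  # work on a copy (an assumption of this sketch)
    n = len(result)
    # Step 1: none of the list is sorted, so the sorted part has size 0.
    for sorted_size in range(n):
        # Step 2a: search the unsorted part for the smallest element.
        smallest = sorted_size
        for j in range(sorted_size + 1, n):
            if result[j] < result[smallest]:
                smallest = j
        # Step 2b: swap it with the first unsorted element.
        result[sorted_size], result[smallest] = result[smallest], result[sorted_size]
        # Step 2c: the for loop grows the sorted part by one.
    return result
```

The nested loops make the quadratic cost visible: the inner search runs over a shrinking unsorted part on every pass of the outer loop.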
PROC SORT (Then And) NOW Derek Morgan, PAREXEL International
Paper 143-2019

ABSTRACT
The SORT procedure has been an integral part of SAS® since its creation. The sort-in-place paradigm made the most of the limited resources at the time, and almost every SAS program had at least one PROC SORT in it. The biggest options at the time were to use something other than the IBM procedure SYNCSORT as the sorting algorithm, or whether you were sorting ASCII data versus EBCDIC data. These days, PROC SORT has fallen out of favor; after all, PROC SQL enables merging without using PROC SORT first, while the performance advantages of HASH sorting cannot be overstated. This leads to the question: Is the SORT procedure still relevant to anyone other than the SAS novice or the terminally stubborn who refuse to HASH? The answer is a surprisingly clear “yes”. PROC SORT has been enhanced to accommodate twenty-first century needs, and this paper discusses those enhancements.

INTRODUCTION
The largest enhancement to the SORT procedure is the addition of collating sequence options. This is first and foremost recognition that SAS is an international software package, and SAS users no longer work exclusively with English-language data. This capability is part of National Language Support (NLS) and doesn’t require any additional modules. You may use standard collations, SAS-provided translation tables, custom translation tables, standard encodings, or rules to produce your sorted dataset. However, you may only use one collation method at a time.

USING STANDARD COLLATIONS, TRANSLATION TABLES AND ENCODINGS
A long time ago, SAS would allow you to sort data using ASCII rules on an EBCDIC system, and vice versa.
Overview of Sorting Algorithms
Unit 7 Sorting Algorithms
Simple Sorting algorithms
Quicksort
Improving Quicksort

Overview of Sorting Algorithms
Given a collection of items we want to arrange them in an increasing or decreasing order. You probably have seen a number of sorting algorithms including
• selection sort
• insertion sort
• bubble sort
• quicksort
• tree sort using BST's
In terms of efficiency:
• average complexity of the first three is $O(n^2)$
• average complexity of quicksort and tree sort is $O(n \lg n)$
• but their worst case is still $O(n^2)$, which is not acceptable
In this section, we
• review insertion, selection and bubble sort
• discuss quicksort and its average/worst case analysis
• show how to eliminate tail recursion
• present another sorting algorithm called heapsort

Selection Sort
Assume that data
• are integers
• are stored in an array, from 0 to size-1
• sorting is in ascending order
Algorithm
  for i = 0 to size-1 do
    x = location with smallest value in locations i to size-1
    swap data[i] and data[x]
  end
Complexity
If the array has n items, the i-th step will perform n-i operations. The first step performs n operations, the second step does n-1 operations, ..., the last step performs 1 operation.
Total cost: $n + (n-1) + (n-2) + \dots + 2 + 1 = n(n+1)/2$. Algorithm is $O(n^2)$.

Insertion Sort
Algorithm
  for i = 0 to size-1 do
    temp = data[i]
    x = first location from 0 to i with a value greater or equal to temp
    shift all values from x to i-1 one location forwards
    data[x] = temp
  end
Complexity
Interesting operations: comparison and shift. The i-th step performs i comparison and shift operations.
Total cost : 1 + 2 + ..
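The insertion sort pseudocode above translates almost line for line into Python. The sketch below is illustrative only; the function name insertion_sort and returning the list for convenience are assumptions of the example, and the linear scan for x follows the pseudocode rather than any tuned implementation.

```python
def insertion_sort(data):
    """Sort the list data in place in ascending order and return it."""
    for i in range(len(data)):
        temp = data[i]
        # x = first location from 0 to i with a value greater or equal to temp.
        # If every earlier value is smaller, the scan stops at x == i.
        x = 0
        while x < i and data[x] < temp:
            x += 1
        # Shift all values from x to i-1 one location forwards.
        for j in range(i, x, -1):
            data[j] = data[j - 1]
        data[x] = temp
    return data
```

In step i the scan and the shift together touch at most i elements, which is where the sum 1 + 2 + ... in the complexity note comes from.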
Lecture 11: Heapsort & Its Analysis
Agenda:
• Heap recall:
  – Heap: definition, property
  – Max-Heapify
  – Build-Max-Heap
• Heapsort algorithm
• Running time analysis
Reading:
• Textbook pages 127 – 138

(Binary-)Heap data structure (recall):
• An array A[1..n] of n comparable keys either ‘≥’ or ‘≤’
• An implicit binary tree, where
  – A[2j] is the left child of A[j]
  – A[2j + 1] is the right child of A[j]
  – $A[\lfloor j/2 \rfloor]$ is the parent of A[j]
• Keys satisfy the max-heap property: $A[\lfloor j/2 \rfloor] \ge A[j]$
• There are max-heap and min-heap. We use max-heap.
• A[1] is the maximum among the n keys.
• Viewing the heap as a binary tree, the height of the tree is $h = \lfloor \lg n \rfloor$. Call this the height of the heap (the number of edges on the longest root-to-leaf path).
• A heap of height k can hold between $2^k$ and $2^{k+1} - 1$ keys. Why? Since $\lg n - 1 < k \le \lg n \iff n < 2^{k+1}$ and $2^k \le n \iff 2^k \le n < 2^{k+1}$

Max-Heapify (recall):
• It makes an almost-heap into a heap.
• Pseudocode:
  procedure Max-Heapify(A, i)    **p 130
    **turn almost-heap into a heap
    **pre-condition: tree rooted at A[i] is almost-heap
    **post-condition: tree rooted at A[i] is a heap
    lc ← leftchild(i)
    rc ← rightchild(i)
    if lc ≤ heapsize(A) and A[lc] > A[i] then
      largest ← lc
    else
      largest ← i
    if rc ≤ heapsize(A) and A[rc] > A[largest] then
      largest ← rc
    if largest ≠ i then
      exchange A[i] ↔ A[largest]
      Max-Heapify(A, largest)
• Worst-case running time: $O(\lg n)$.
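A rough Python rendering of Max-Heapify may help when tracing it by hand. Note the assumptions: Python lists are 0-indexed, so this sketch uses children 2i + 1 and 2i + 2 instead of the lecture's 1-indexed 2j and 2j + 1, and it passes the heap size as an explicit parameter heap_size instead of calling heapsize(A). Both choices are conveniences of the example, not part of the lecture.

```python
def max_heapify(a, i, heap_size):
    """Turn the almost-heap rooted at index i into a max-heap, in place.

    Pre-condition: the subtrees rooted at the children of i are max-heaps.
    Uses 0-based indexing (children of i are 2*i + 1 and 2*i + 2).
    """
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[i]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        # Exchange with the larger child and continue down that subtree.
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


keys = [1, 9, 8, 4, 5]
max_heapify(keys, 0, len(keys))
print(keys)  # [9, 5, 8, 4, 1]
```

Each recursive call moves one level down the tree, so the number of calls is bounded by the height of the heap.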
Batcher's Algorithm
18.310 lecture notes, Fall 2010. Batcher’s Algorithm. Prof. Michel Goemans

Perhaps the most restrictive version of the sorting problem requires not only no motion of the keys beyond compare-and-switches, but also that the plan of comparison-and-switches be fixed in advance. In each of the methods mentioned so far, the comparison to be made at any time often depends upon the result of previous comparisons. For example, in HeapSort, it appears at first glance that we are making only compare-and-switches between pairs of keys, but the comparisons we perform are not fixed in advance. Indeed when fixing a headless heap, we move either to the left child or to the right child depending on which child had the largest element; this is not fixed in advance.

A sorting network is a fixed collection of comparison-switches, so that all comparisons and switches are between keys at locations that have been specified from the beginning. These comparisons are not dependent on what has happened before. The corresponding sorting algorithm is said to be non-adaptive.

We will describe a simple recursive non-adaptive sorting procedure, named Batcher’s Algorithm after its discoverer. It is simple and elegant but has the disadvantage that it requires on the order of $n(\log n)^2$ comparisons, which is larger by a factor of the order of $\log n$ than the theoretical lower bound for comparison sorting. For a long time (ten years is a long time in this subject!) nobody knew if one could find a sorting network better than this one.
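To make the "fixed in advance" idea concrete, here is a small Python sketch that first generates a list of comparator positions and only then applies it to data. It assumes the odd-even merge construction commonly associated with Batcher's name; the excerpt stops before giving the procedure, so the notes may present it differently. The function names are hypothetical, and the sketch assumes the number of positions is a power of two.

```python
def merge_network(lo, hi, r):
    """Yield comparators that merge positions lo..hi (inclusive), stride r."""
    step = 2 * r
    if step < hi - lo:
        yield from merge_network(lo, hi, step)      # merge the even positions
        yield from merge_network(lo + r, hi, step)  # merge the odd positions
        for i in range(lo + r, hi - r, step):       # final clean-up comparators
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def sorting_network(lo, hi):
    """Yield comparators sorting positions lo..hi (inclusive, power-of-two size)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from sorting_network(lo, mid)
        yield from sorting_network(mid + 1, hi)
        yield from merge_network(lo, hi, 1)


def apply_network(network, keys):
    """Run a fixed list of compare-and-switch steps over a copy of keys."""
    keys = list(keys)
    for i, j in network:
        if keys[i] > keys[j]:
            keys[i], keys[j] = keys[j], keys[i]
    return keys


# The plan is computed before any key is seen: it depends only on the size.
NETWORK_8 = list(sorting_network(0, 7))
print(apply_network(NETWORK_8, [5, 7, 1, 8, 2, 6, 4, 3]))
```

The positions compared never depend on the keys; only whether a switch happens does. That separation is what distinguishes a non-adaptive procedure from HeapSort in the passage above.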
Can We Overcome the n log n Barrier for Oblivious Sorting?
Wei-Kai Lin, Elaine Shi, Tiancheng Xie

Abstract
It is well-known that non-comparison-based techniques can allow us to sort n elements in $o(n \log n)$ time on a Random-Access Machine (RAM). On the other hand, it is a long-standing open question whether (non-comparison-based) circuits can sort n elements from the domain $[1..2^k]$ with $o(kn \log n)$ boolean gates. We consider weakened forms of this question: first, we consider a restricted class of sorting where the number of distinct keys is much smaller than the input length; and second, we explore Oblivious RAMs and probabilistic circuit families, i.e., computational models that are somewhat more powerful than circuits but much weaker than RAM. We show that Oblivious RAMs and probabilistic circuit families can sort $o(\log n)$-bit keys in $o(n \log n)$ time or $o(kn \log n)$ circuit complexity. Our algorithms work in the indivisible model, i.e., not only can they sort an array of numerical keys: if each key additionally carries an opaque ball, our algorithms can also move the balls into the correct order. We further show that in such an indivisible model, it is impossible to sort $\Omega(\log n)$-bit keys in $o(n \log n)$ time, and thus the $o(\log n)$-bit-key assumption is necessary for overcoming the $n \log n$ barrier. Finally, after optimizing the IO efficiency, we show that even the 1-bit special case can solve open questions: our oblivious algorithms solve tight compaction and selection with optimal IO efficiency for the first time.
CS 758/858: Algorithms
Prof. Wheeler Ruml
TA Sumanta Kashyapi
http://www.cs.unh.edu/~ruml/cs758
4 handouts: course info, schedule, slides, asst 1
2 online handouts: programming tips, formulas
1 physical sign-up sheet/laptop (for grades, piazza)
Wheeler Ruml (UNH) Class 1, CS 758 – 1 / 25

COVID
■ check your Wildcat Pass before coming to campus
■ if you have concerns, let me know

Algorithms Today
■ web: search, caching, crypto
■ networking: routing, synchronization, failover
■ machine learning: data mining, recommendation, prediction
■ bioinformatics: alignment, matching, clustering
■ hardware: design, simulation, verification
■ business: allocation, planning, scheduling
■ AI: robotics, games

Definition
Algorithm
■ precisely defined
■ mechanical steps
■ terminates
■ input and related output
What might we want to know about it?

Why?
■ Computer scientist ≠ programmer
  ◆ understand program behavior
  ◆ have confidence in results, performance
  ◆ know when optimality is abandoned
  ◆ solve ‘impossible’ problems
  ◆ sets you apart (eg, Amazon.com)
■ CPUs aren’t getting faster
■ Devices are getting smaller
■ Software is the differentiator
■ ‘Software is eating the world’ (Marc Andreessen, 2011)
■ Everything is computation

The Word: Abū ʿAbdallāh Muḥammad ibn Mūsā al-Khwārizmī
780-850 AD
Born in Uzbekistan, worked in Baghdad.
Hacking a Google Interview – Handout 2
Course Description
Instructors: Bill Jacobs and Curtis Fonger
Time: January 12 – 15, 5:00 – 6:30 PM in 32‐124
Website: http://courses.csail.mit.edu/iap/interview

Classic Question #4: Reversing the words in a string
Write a function to reverse the order of words in a string in place.
Answer: Reverse the string by swapping the first character with the last character, the second character with the second‐to‐last character, and so on. Then, go through the string looking for spaces, so that you find where each of the words is. Reverse each of the words you encounter by again swapping the first character with the last character, the second character with the second‐to‐last character, and so on.

Sorting
Often, as part of a solution to a question, you will need to sort a collection of elements. The most important thing to remember about sorting is that it takes O(n log n) time. (That is, the fastest sorting algorithm for arbitrary data takes O(n log n) time.)

Merge Sort: Merge sort is a recursive way to sort an array. First, you divide the array in half and recursively sort each half of the array. Then, you combine the two halves into a sorted array. So a merge sort function would look something like this:

    int[] mergeSort(int[] array) {
        if (array.length <= 1)
            return array;
        int middle = array.length / 2;
        // Copy each half; Arrays.copyOfRange excludes its end index.
        int[] firstHalf = mergeSort(
            java.util.Arrays.copyOfRange(array, 0, middle));
        int[] secondHalf = mergeSort(
            java.util.Arrays.copyOfRange(array, middle, array.length));
        return merge(firstHalf, secondHalf);
    }

The algorithm relies on the fact that one can quickly combine two sorted arrays into a single sorted array.
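The excerpt ends before showing how two sorted arrays are combined. The Python sketch below illustrates one common way to do it: walk both inputs with one index each and always take the smaller front element. The names merge and merge_sort mirror the handout's function names, but the bodies are an illustration written for this example, not the handout's code.

```python
def merge(left, right):
    """Combine two ascending lists into one ascending list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller front element; <= keeps equal items in input order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these still has elements left.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(array):
    """Return a new ascending list using recursive merge sort."""
    if len(array) <= 1:
        return list(array)
    middle = len(array) // 2
    return merge(merge_sort(array[:middle]), merge_sort(array[middle:]))
```

Each call to merge looks at every element once, so combining two halves of total length n takes time proportional to n.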
WLFC: Write Less in the Flash-Based Cache
Chaos Dong, Fang Wang, Jianshun Zhang
Huazhong University of Science and Technology, Wuhan, China

Abstract
Flash-based disk caches, for example Bcache [1] and Flashcache [2], have gained tremendous popularity in industry in the last decade because of their low energy consumption, non-volatile nature and high I/O speed. But these cache systems have worse write performance than read performance because of the asymmetric I/O costs and the internal GC mechanism. In addition to the performance issues, since NAND flash is a type of EEPROM device, its lifespan is also limited by the Program/Erase (P/E) cycles. So how to improve the performance and the lifespan of flash-based caches in write-intensive scenarios has always been a hot issue. Benefiting from Open-Channel SSDs (OCSSDs) [3], we propose a write-friendly flash-based disk cache system, which is called WLFC (Write Less in the Flash-based Cache).

[...] log-on-log [5]. At the same time, the Open-Channel SSD (OCSSD) has become increasingly popular in academia. Through a standard OCSSD interface, the behaviors of the flash memory can be managed by the host to improve the utilization of the storage.
In this paper, we mainly make three contributions to optimize write performance of the flash-based disk cache. First, we designed a write-friendly flash-based disk cache system upon Open-Channel SSDs, named WLFC (Write Less in the Flash-based Cache), in which requests are handled by a strictly sequential writing method to reduce write
Quick Sort Algorithm Song Qin Dept
Song Qin
Dept. of Computer Sciences, Florida Institute of Technology, Melbourne, FL 32901

ABSTRACT
Given an array with n elements, we want to rearrange them in ascending order. In this paper, we introduce Quick Sort, a divide-and-conquer algorithm to sort an N element array. We evaluate the $O(N \log N)$ time complexity in the best case and $O(N^2)$ in the worst case theoretically. We also introduce a way to approach the best case.

1. INTRODUCTION
Search engines rely heavily on sorting algorithms. When you search for some keyword online, the results are brought to you sorted by the importance of the web page. Bubble, Selection and Insertion Sort all have an $O(N^2)$ time complexity that limits their usefulness to small inputs of no more than a few thousand data points.

[...] each iteration. Repeat this on the rest of the unsorted region without the first element.
Bubble sort works as follows: keep passing through the list, exchanging adjacent elements that are out of order; when no exchanges are required on some pass, the list is sorted.
Merge sort [4] has an $O(N \log N)$ time complexity. It divides the array into two subarrays each with N/2 items. Conquer each subarray by sorting it. Unless the array is sufficiently small (one element left), use recursion to do this. Combine the solutions to the subarrays by merging them into a single sorted array.
In Bubble sort, Selection sort and Insertion sort, the $O(N^2)$ time complexity limits the performance when N gets very big.
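As a rough illustration of the divide-and-conquer idea named in the abstract, here is a short Python sketch of quick sort. The excerpt does not say how the paper chooses its pivot or how it approaches the best case, so the random pivot below is an assumption of this example, as is the function name quick_sort and the choice to build new lists rather than partition in place.

```python
import random


def quick_sort(items):
    """Return a new ascending list using quick sort with a random pivot."""
    items = list(items)
    if len(items) <= 1:
        return items
    # Assumption of this sketch: pick the pivot uniformly at random, which
    # makes the quadratic worst case unlikely on any particular input.
    pivot = random.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Divide: sort each side independently. Combine: concatenate.
    return quick_sort(smaller) + equal + quick_sort(larger)
```

When the pivot splits the input roughly in half at every level, there are about log N levels of work proportional to N each; when the pivot is always the smallest or largest element, there are N levels, which gives the quadratic worst case.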
Investigating the Effect of Implementation Languages and Large Problem Sizes on the Tractability and Efficiency of Sorting Algorithms
International Journal of Engineering Research and Technology. ISSN 0974-3154, Volume 12, Number 2 (2019), pp. 196-203
© International Research Publication House. http://www.irphouse.com
Temitayo Matthew Fagbola and Surendra Colin Thakur
Department of Information Technology, Durban University of Technology, Durban 4000, South Africa.
ORCID: 0000-0001-6631-1002 (Temitayo Fagbola)

Abstract
Sorting is a data structure operation involving a re-arrangement of an unordered set of elements with witnessed real life applications for load balancing and energy conservation in distributed, grid and cloud computing environments. However, the rearrangement procedure often used by sorting algorithms differs and significantly impacts on their computational efficiencies and tractability for varying problem sizes. Currently, which combination of sorting algorithm and implementation language is highly tractable and efficient for solving large sized-problems remains an open challenge. In this

[...] [1],[2]. Theoretically, effective sorting of data allows for an efficient and simple searching process to take place. This is particularly important as most merge and search algorithms strictly depend on the correctness and efficiency of sorting algorithms [4]. It is important to note that the rearrangement procedure of each sorting algorithm differs and directly impacts on the execution time and complexity of such algorithm for varying problem sizes and type emanating from real-world situations [5]. For instance, the operational sorting procedures of Merge Sort, Quick Sort and Heap Sort follow a divide-and-conquer approach characterized by key comparison, recursion and binary heap key reordering requirements, respectively [9].
Counting Sort and Radix Sort
Nick Smallbone, December 11, 2018

Radix sort is the oldest sorting algorithm which is still in general use; it predates the computer. On the left is Herman Hollerith, the inventor of radix sort. In the late 1800s, he built enormous electromechanical machines which used radix sort to tabulate US census data; he set up a company to sell the machines, and the company later became IBM. On the right you can see a 1950s IBM sorting machine running the radix sort algorithm.
Radix sort is unlike the sorting algorithms we have seen so far, in that it is not based on comparisons (≤). However, it only works on certain kinds of data – it is mainly used on numerical data, and is often faster than comparison-based sorting algorithms. Before seeing radix sort, we will look at a simpler algorithm, counting sort.

1 Counting sort #1: sorting a list of digits
Suppose you want to sort the first 100 digits of π, in ascending order, by hand. How would you do it?

31415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679

Counting sort is one approach to sorting lists of digits. It can be carried out by hand or on a computer. The idea is like this. First we count how many times each digit occurs in the input data (in this case, the first 100 digits of π):

Digit                  0  1  2   3   4   5  6  7  8   9
Number of occurrences  8  8  12  12  10  8  9  8  12  14

We can compute this table while looking through the input list only once, by keeping a running total for each digit.
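The counting step described above can be sketched in Python as follows. The excerpt stops after building the table, so the second step here (writing each digit out as many times as it was counted) is the usual way counting sort finishes and is included as an assumption about where the notes go next. The names PI_DIGITS, count_digits and counting_sort_digits are made up for the example.

```python
# The digit string from the text: the leading 3 followed by 100 decimals.
PI_DIGITS = (
    "3"
    "1415926535" "8979323846" "2643383279" "5028841971" "6939937510"
    "5820974944" "5923078164" "0628620899" "8628034825" "3421170679"
)


def count_digits(digits):
    """Return a 10-entry table: counts[d] is how often digit d occurs."""
    counts = [0] * 10
    for d in digits:  # a single pass, keeping a running total per digit
        counts[d] += 1
    return counts


def counting_sort_digits(digits):
    """Return the digits in ascending order without comparing any two of them."""
    result = []
    for digit, count in enumerate(count_digits(digits)):
        result.extend([digit] * count)
    return result


digits = [int(ch) for ch in PI_DIGITS]
print(count_digits(digits))
```

Both steps take time proportional to the length of the input plus the number of possible digit values, and neither step compares one input element with another.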