Worst-Case Efficient Sorting with QuickMergesort

PDF · 16 pages · 1020 KB

Stefan Edelkamp (King's College London, UK) and Armin Weiß (FMI, Universität Stuttgart, Germany). San Diego, January 7, 2019.

Comparison-based sorting: Quicksort, Heapsort, Mergesort

[Figure: running times, divided by n log n, in ns, for sorting n integers, n = 2^10 to 2^25, for std::sort (Quicksort/Introsort), std::partial_sort (Heapsort), and std::stable_sort (Mergesort). Left: random inputs. Right: random inputs with large elements in the middle and at the end. A timing-harness sketch follows the table below.]

Algorithm   | Fast on average | "in place" | O(n log n) worst case
Quicksort   | yes             | yes        | no
Heapsort    | no              | yes        | yes
Mergesort   | yes             | no         | yes

Wish to have three times "yes":
• Make Quicksort worst-case efficient: Introsort, median-of-medians pivot selection.
• Make Heapsort fast: Bottom-up Heapsort (not very fast).
• Make Mergesort in-place:
  – block-based merging (stable implementations: Grailsort, Wikisort);
  – rotation-based merging (stable, but O(n log² n));
  – use one half as a buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable).
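The following micro-benchmark sketch mirrors the plot's setup for the three libstdc++ entry points named in the legend; the harness, input size, and seed are our own illustration, not the authors' benchmark code. Note that std::partial_sort over the whole range degenerates to Heapsort, which is why the slides use it as the Heapsort baseline.

#include <algorithm>
#include <chrono>
#include <cmath>
#include <iostream>
#include <random>
#include <vector>

// Time one sorting function on a fresh copy of `input` and report
// nanoseconds per n log2(n), the normalization used in the plots.
template <typename Sort>
double nsPerNLogN(const std::vector<int>& input, Sort sort) {
    std::vector<int> v = input;
    auto t0 = std::chrono::steady_clock::now();
    sort(v);
    auto t1 = std::chrono::steady_clock::now();
    double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
    double n = static_cast<double>(input.size());
    return ns / (n * std::log2(n));
}

int main() {
    std::mt19937 gen(42);
    std::vector<int> input(1 << 20);           // 2^20 random integers
    for (int& x : input) x = static_cast<int>(gen());

    std::cout << "std::sort         "
              << nsPerNLogN(input, [](std::vector<int>& v) { std::sort(v.begin(), v.end()); }) << " ns\n";
    std::cout << "std::stable_sort  "
              << nsPerNLogN(input, [](std::vector<int>& v) { std::stable_sort(v.begin(), v.end()); }) << " ns\n";
    // partial_sort over the full range is Heapsort in practice
    std::cout << "std::partial_sort "
              << nsPerNLogN(input, [](std::vector<int>& v) { std::partial_sort(v.begin(), v.end(), v.end()); }) << " ns\n";
}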
Outline: QuickMergesort; our improvements and theoretical bounds; experiments.

Quicksort
1: procedure Quicksort(A[ℓ, …, r])
2:   if r > ℓ then
3:     pivot ← choosePivot(A[ℓ, …, r])
4:     cut ← partition(A[ℓ, …, r], pivot)
5:     Quicksort(A[ℓ, …, cut − 1])
6:     Quicksort(A[cut, …, r])
7:   end if
8: end procedure

After line 4: elements ≤ pivot on the left, elements ≥ pivot on the right. After lines 5 and 6: both parts are sorted recursively with Quicksort.
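A direct C++ rendering of this pseudocode might look as follows. The median-of-three choosePivot and the three-way std::partition are our simplifications (the slides leave both routines open); the three-way split merely guarantees progress on repeated elements.

#include <algorithm>

// Quicksort over the half-open range [l, r), following the pseudocode:
// choose a pivot, partition, recurse on both sides.
template <typename It>
void quicksort(It l, It r) {
    if (r - l <= 1) return;
    auto a = *l, b = *(l + (r - l) / 2), c = *(r - 1);
    auto pivot = std::max(std::min(a, b), std::min(std::max(a, b), c));  // median of 3
    It cut1 = std::partition(l, r, [&](const auto& x) { return x < pivot; });
    It cut2 = std::partition(cut1, r, [&](const auto& x) { return !(pivot < x); });
    // Now [l, cut1) < pivot, [cut1, cut2) == pivot, [cut2, r) > pivot.
    quicksort(l, cut1);
    quicksort(cut2, r);
}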
QuickMergesort
1: procedure QuickMergesort(A[ℓ, …, r])
2:   if r > ℓ then
3:     pivot ← choosePivot(A[ℓ, …, r])
4:     cut ← partition(A[ℓ, …, r], pivot)
5:     Mergesort(A[ℓ, …, cut − 1])
6:     QuickMergesort(A[cut, …, r])
7:   end if
8: end procedure

After line 4: elements ≤ pivot on the left, elements ≥ pivot on the right. After line 5: the left part is sorted with Mergesort, using the right part as a temporary buffer. After line 6: the right part is sorted recursively with QuickMergesort.

QuickMergesort in three steps:
1. Partition according to some pivot element.
2. Sort one part with Mergesort, using the other part as a temporary buffer for merging.
3. Sort the other part recursively with QuickMergesort.

[Worked example on the array 7 11 4 5 6 10 9 2 3 1 0 8: the slide animates the partition, the Mergesort of one part whose elements are swapped through the other part, and the recursive call on the remainder; the overlaid intermediate arrays are not fully recoverable from the extraction.]
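A minimal C++ sketch of the scheme, using the same median-of-three pivot and three-way std::partition as the Quicksort sketch above. One caveat: this sketch always mergesorts the smaller part, which keeps the buffer argument trivially valid, whereas the paper's variant mergesorts the larger part whenever the smaller one is big enough to buffer it.

#include <algorithm>

// Merge the sorted runs [l, m) and [m, r). The run [l, m) is first swapped
// into the buffer starting at b, then merged back. Buffer elements are
// only ever swapped, never overwritten, so the buffer may hold live
// (unsorted) data that survives intact.
template <typename It>
void mergeWithBuffer(It l, It m, It r, It b) {
    It iEnd = b + (m - l);
    std::swap_ranges(l, m, b);                 // move the left run into the buffer
    It i = b, j = m, out = l;
    while (i != iEnd && j != r)
        std::iter_swap(out++, (*j < *i) ? j++ : i++);
    while (i != iEnd)
        std::iter_swap(out++, i++);
    // any leftover tail of [m, r) is already in its final place
}

// Mergesort [l, r) using an external buffer starting at buf; the buffer
// must hold at least floor((r - l) / 2) elements.
template <typename It>
void mergeSortWithBuffer(It l, It r, It buf) {
    if (r - l <= 1) return;
    It m = l + (r - l) / 2;
    mergeSortWithBuffer(l, m, buf);
    mergeSortWithBuffer(m, r, buf);
    mergeWithBuffer(l, m, r, buf);
}

// QuickMergesort: partition, mergesort one part using the other as buffer,
// then continue (tail-recursion turned into a loop) on the remaining part.
template <typename It>
void quickMergesort(It l, It r) {
    while (r - l > 1) {
        auto a = *l, b = *(l + (r - l) / 2), c = *(r - 1);
        auto pivot = std::max(std::min(a, b), std::min(std::max(a, b), c));
        It cut1 = std::partition(l, r, [&](const auto& x) { return x < pivot; });
        It cut2 = std::partition(cut1, r, [&](const auto& x) { return !(pivot < x); });
        if (cut1 - l <= r - cut2) {
            mergeSortWithBuffer(l, cut1, cut2);   // right part serves as buffer
            l = cut2;
        } else {
            mergeSortWithBuffer(cut2, r, l);      // left part serves as buffer
            r = cut1;
        }
    }
}

The crucial invariant lives in mergeWithBuffer: buffer contents are permuted by swaps but never lost, so the part used as temporary space can still be sorted afterwards. Usage: given std::vector<int> v, call quickMergesort(v.begin(), v.end()).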
Recommended publications
• Fast Deterministic Selection (arXiv:1606.00484v2 [cs.DS], 4 Aug 2016)
Fast Deterministic Selection. Andrei Alexandrescu, The D Language Foundation. [email protected]

Abstract: The Median of Medians (also known as BFPRT) algorithm, although a landmark theoretical achievement, is seldom used in practice because it and its variants are slower than simple approaches based on sampling. The main contribution of this paper is a fast linear-time deterministic selection algorithm QuickselectAdaptive based on a refined definition of MedianOfMedians. The algorithm's performance brings deterministic selection—along with its desirable properties of reproducible runs, predictable run times, and immunity to pathological inputs—in the range of practicality. We demonstrate results on independent and identically distributed random inputs and on normally-distributed inputs. Measurements show that QuickselectAdaptive is faster than state-of-the-art baselines.

From the introduction: Quickselect is the selection algorithm most used in practice [8, 27, 31]. It can be thought of as a partial Quicksort [16]: like Quicksort, Quickselect relies on a separate routine to divide the input array into elements less than or equal to, and greater than or equal to, a chosen element called the pivot. Unlike Quicksort, which recurses on both subarrays left and right of the pivot, Quickselect only recurses on the side known to contain the kth smallest element. (The side to choose is known because the partitioning computes the rank of the pivot.) The pivot choosing strategy strongly influences Quickselect's performance: if each partitioning step reduces the array portion containing the kth order statistic by at least a fixed fraction, Quickselect runs in O(n) time; otherwise, ...
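The Quickselect loop described here, as a sketch; the median-of-three pivot below stands in for the paper's adaptive MedianOfMedians strategy.

#include <algorithm>
#include <cstddef>

// Quickselect: return the element of rank k (0-based) in [l, r).
// Each round partitions around a pivot and keeps only the side that
// can contain rank k, exactly as the excerpt describes.
template <typename It>
auto quickselect(It l, It r, std::ptrdiff_t k) {
    while (r - l > 1) {
        auto a = *l, b = *(l + (r - l) / 2), c = *(r - 1);
        auto pivot = std::max(std::min(a, b), std::min(std::max(a, b), c));
        It cut1 = std::partition(l, r, [&](const auto& x) { return x < pivot; });
        It cut2 = std::partition(cut1, r, [&](const auto& x) { return !(pivot < x); });
        if (k < cut1 - l) {
            r = cut1;                 // rank k lies in the "< pivot" block
        } else if (k >= cut2 - l) {
            k -= cut2 - l;            // rank k lies in the "> pivot" block
            l = cut2;
        } else {
            return pivot;             // rank k falls on the pivot value
        }
    }
    return *l;
}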
• Selection (Deterministic & Randomized): Finding the Median in Linear Time
Lecture 2: Selection (deterministic & randomized): finding the median in linear time.

2.1 Overview. Given an unsorted array, how quickly can one find the median element? Can one do it more quickly than by sorting? This was an open question for some time, solved affirmatively in 1972 by (Manuel) Blum, Floyd, Pratt, Rivest, and Tarjan. In this lecture we describe two linear-time algorithms for this problem: one randomized and one deterministic. More generally, we solve the problem of finding the kth smallest out of an unsorted array of n elements.

2.2 The problem and a randomized solution. Consider the problem of finding the kth smallest element in an unsorted array of size n. (Let's say all elements are distinct to avoid the question of what we mean by the kth smallest when we have equalities.) One way to solve this problem is to sort and then output the kth element. We can do this in time O(n log n) if we sort using Mergesort, Quicksort, or Heapsort. Is there something faster – a linear-time algorithm? The answer is yes. We will explore both a simple randomized solution and a more complicated deterministic one. The idea for the randomized algorithm is to start with the Randomized-Quicksort algorithm (choose a random element as "pivot", partition the array into two sets LESS and GREATER consisting of those elements less than and greater than the pivot respectively, and then recursively sort LESS and GREATER). Then notice that there is a simple speedup we can make if we just need to find the kth smallest element.
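The randomized algorithm, transcribed almost literally from this description (LESS and GREATER as separate vectors; an in-place partition would avoid the allocations). Equal elements are handled explicitly rather than assumed distinct.

#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Randomized selection: pick a random pivot, split into LESS and GREATER,
// and recurse only into the part containing the kth smallest.
// k is 1-based, matching the lecture's convention.
int randomizedSelect(std::vector<int> a, int k, std::mt19937& gen) {
    std::uniform_int_distribution<std::size_t> pick(0, a.size() - 1);
    int pivot = a[pick(gen)];
    std::vector<int> less, greater;
    for (int x : a) {
        if (x < pivot) less.push_back(x);
        else if (x > pivot) greater.push_back(x);
    }
    int numNotGreater = static_cast<int>(a.size() - greater.size());
    if (k <= static_cast<int>(less.size()))
        return randomizedSelect(std::move(less), k, gen);
    if (k > numNotGreater)
        return randomizedSelect(std::move(greater), k - numNotGreater, gen);
    return pivot;  // the kth smallest is the pivot itself
}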
• Simple and Fast BlockQuicksort Using Lomuto's Partitioning Scheme (arXiv:1810.12047v1 [cs.DS], 29 Oct 2018)
Simple and Fast BlockQuicksort using Lomuto's Partitioning Scheme. Martin Aumüller, Nikolaj Hass. October 30, 2018.

Abstract: This paper presents simple variants of the BlockQuicksort algorithm described by Edelkamp and Weiss (ESA 2016). The simplification is achieved by using Lomuto's partitioning scheme instead of Hoare's crossing-pointer technique to partition the input. To achieve a robust sorting algorithm that works well on many different input types, the paper introduces a novel two-pivot variant of Lomuto's partitioning scheme. A surprisingly simple twist to the generic two-pivot quicksort approach makes the algorithm robust. The paper provides an analysis of the theoretical properties of the proposed algorithms and compares them to their competitors. The analysis shows that Lomuto-based approaches incur a higher average sorting cost than the Hoare-based approach of BlockQuicksort.

From the introduction: ...a worst-case aware quicksort variant called introsort [15]; Oracle's Java, which uses a two-pivot quicksort algorithm [23]; and Microsoft's C#, which uses a traditional median-of-three quicksort approach. In the pursuit of improving the performance of quicksort-based implementations, a long line of papers [16, 23, 13, 3, 21] has looked at the benefits of introducing more than one pivot in the algorithm. Here, [13, 3, 21] showed that the memory access behavior improves with a small number of pivots — an effect that cannot be achieved by other more traditional means such as choosing a pivot from a sample of elements [18]. However, better cache behavior leaves the problem of branch mispredictions unsolved, which is one of the largest bottlenecks in today's processors.
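For reference, the classic Lomuto partitioning scheme the paper builds on; this is the textbook single-pivot version, not the paper's block or two-pivot variants.

#include <utility>

// Lomuto partition of the inclusive range a[lo..hi], using a[hi] as pivot.
// A single left-to-right sweep grows the "< pivot" prefix; the conditional
// branch taken per element is the misprediction source that block-based
// variants batch away.
template <typename T>
int lomutoPartition(T a[], int lo, int hi) {
    T pivot = a[hi];
    int i = lo;                        // a[lo..i-1] holds elements < pivot
    for (int j = lo; j < hi; ++j)
        if (a[j] < pivot)
            std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);            // place the pivot at its final index
    return i;                          // pivot's final index
}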
• Insights into the C++ Standard Library. Pavel Novikov (@cpp_ape), R&D, Align Technology
«Под капотом стандартной библиотеки C++» (Under the hood of the C++ standard library). Pavel Novikov, @cpp_ape, R&D, Align Technology.

What's ahead:
• use std::vector by default instead of std::list
• emplace_back and push_back
• prefer to use std::make_shared
• avoid static objects with non-trivial constructors/destructors
• small string optimization
• priority queue and heap algorithms
• sorting and selection
• guarantee of strengthened worst-case complexity O(n log n) for std::sort
• when to use std::sort, std::stable_sort, std::partial_sort, std::nth_element
• use unordered associative containers by default
• beware of missing the variable name

using GadgetList = std::list<std::shared_ptr<Gadget>>;

GadgetList findAllGadgets(const Widget &w) {
    GadgetList gadgets;
    for (auto &item : w.getAllItems())
        if (isGadget(item))
            gadgets.push_back(getGadget(item));
    return gadgets;
}

void processGadgets(const GadgetList &gadgets) {
    for (auto &gadget : gadgets)
        processGadget(*gadget);
}

Is this a reasonable use of std::list? std::list does no copy/move while adding elements, but it makes a memory allocation for each element. That is expensive, may fragment memory, and in general gives worse element locality compared to std::vector.

std::vector<T,A>::push_back() has amortized O(1) complexity:

std::vector<int> v;   // capacity = 0
v.push_back(1);       // allocate, capacity = 3
v.push_back(2);
v.push_back(3);
v.push_back(4);       // allocate, move 3 elements, capacity = 6
v.push_back(5);

With c the initial capacity and m the reallocation factor, pushing back N elements needs log_m(N/c) reallocations.
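One can watch this growth directly; the initial capacity and the growth factor are implementation details (libstdc++ doubles, MSVC grows by a factor of 1.5), so the printed capacities will vary.

#include <iostream>
#include <vector>

// Print push_back's reallocation points: capacity grows geometrically, so
// 1000 insertions trigger only a handful of reallocations, which is what
// makes push_back amortized O(1).
int main() {
    std::vector<int> v;
    auto lastCap = v.capacity();
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != lastCap) {
            lastCap = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << lastCap << '\n';
        }
    }
}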
  • Fast Deterministic Selection
Fast Deterministic Selection. Andrei Alexandrescu, The D Language Foundation, Washington, USA.

Abstract: The selection problem, in forms such as finding the median or choosing the k top ranked items in a dataset, is a core task in computing with numerous applications in fields as diverse as statistics, databases, Machine Learning, finance, biology, and graphics. The selection algorithm Median of Medians, although a landmark theoretical achievement, is seldom used in practice because it is slower than simple approaches based on sampling. The main contribution of this paper is a fast linear-time deterministic selection algorithm MedianOfNinthers based on a refined definition of MedianOfMedians. A complementary algorithm MedianOfExtrema is also proposed. These algorithms work together to solve the selection problem in guaranteed linear time, faster than state-of-the-art baselines, and without resorting to randomization, heuristics, or fallback approaches for pathological cases. We demonstrate results on uniformly distributed random numbers, typical low-entropy artificial datasets, and real-world data. Measurements are open-sourced alongside the implementation at https://github.com/andralex/MedianOfNinthers.

1998 ACM Subject Classification: F.2.2 Nonnumerical Algorithms and Problems. Keywords and phrases: Selection Problem, Quickselect, Median of Medians, Algorithm Engineering, Algorithmic Libraries. Digital Object Identifier: 10.4230/LIPIcs.SEA.2017.24.

1 Introduction. The selection problem is widely researched and has numerous applications. Selection is finding the kth smallest element (also known as the kth order statistic): given an array A of length |A| = n, a non-strict order ≤ over elements of A, and an index k, the task is to find the element that would be in slot A[k] if A were sorted.
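In the C++ standard library the selection problem is exposed as std::nth_element, typically an introselect-style hybrid rather than this paper's MedianOfNinthers; a median lookup following the definition above:

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// Find the element that would be in slot A[k] if A were sorted,
// here with k chosen as the median position.
int main() {
    std::vector<int> a = {9, 1, 8, 2, 7, 3, 6, 4, 5};
    std::size_t k = a.size() / 2;
    std::nth_element(a.begin(), a.begin() + k, a.end());
    std::cout << "median = " << a[k] << '\n';  // prints 5
}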
  • SI 335, Unit 7: Advanced Sort and Search
SI 335, Unit 7: Advanced Sort and Search. Daniel S. Roche ([email protected]). Spring 2016.

Sorting's back! After starting out with the sorting-related problem of computing medians and percentiles, in this unit we will see two of the most practically useful sorting algorithms: quickSort and radixSort. Along the way, we'll get more practice with divide-and-conquer algorithms and computational models, and we'll touch on how random numbers can play a role in making algorithms faster too.

1 The Selection Problem. 1.1 Medians and Percentiles. Computing statistical quantities on a given set of data is a really important part of what computers get used for. Of course we all know how to calculate the mean, or average, of a list of numbers: sum them all up, and divide by the length of the list. This takes linear time in the length of the input list. But averages don't tell the whole story. For example, the average household income in the United States for 2010 was about $68,000, but the median was closer to $50,000. Somehow the second number is a better measure of what a "typical" household might bring in, whereas the first can be unduly affected by the very rich and the very poor. Medians are in fact just a special case of a percentile, another important statistical quantity. Your SAT scores were reported both as a raw number and as a percentile, indicating how you did compared to everyone else who took the test. So if 100,000 people took the SATs, and you scored in the 90th percentile, then your score was higher than 90,000 other students' scores.
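To make the contrast concrete: the mean is a single linear pass, while a percentile is a selection problem. The sketch below uses a simplified nearest-rank index convention of our own choosing.

#include <algorithm>
#include <cstddef>
#include <vector>

// The mean: one linear pass over the data.
double mean(const std::vector<double>& xs) {
    double sum = 0;
    for (double x : xs) sum += x;
    return sum / xs.size();
}

// A percentile is a selection problem: std::nth_element finds the kth
// order statistic in linear expected time, so no full sort is needed.
// xs is taken by value because nth_element reorders it.
double percentile(std::vector<double> xs, double p) {  // p in [0, 100]
    std::size_t k = static_cast<std::size_t>(p / 100.0 * (xs.size() - 1));
    std::nth_element(xs.begin(), xs.begin() + k, xs.end());
    return xs[k];
}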
  • Basic Introduction Into Algorithms and Data Structures
Lecture given at the International Summer School Modern Computational Science (August 15–26, 2011, Oldenburg, Germany). Basic Introduction into Algorithms and Data Structures. Frauke Liers, Computer Science Department, University of Cologne, D-50969 Cologne, Germany.

Abstract: This chapter gives a brief introduction into basic data structures and algorithms, together with references to tutorials available in the literature. We first introduce fundamental notation and algorithmic concepts. We then explain several sorting algorithms and give small examples. As fundamental data structures, we introduce linked lists, trees and graphs. Implementations are given in the programming language C.

Contents: 1 Introduction · 2 Sorting (2.1 Insertion Sort; 2.1.1 Some Basic Notation and Analysis of Insertion Sort; 2.2 Sorting using the Divide-and-Conquer Principle; 2.2.1 Merge Sort; 2.2.2 Quicksort; 2.3 A Lower Bound on the Performance of Sorting Algorithms) · 3 Select the k-th smallest element · 4 Binary Search · 5 Elementary Data Structures (5.1 Stacks; 5.2 Linked Lists; 5.3 Graphs, Trees, and Binary Search Trees) · 6 Advanced Programming (6.1 References)

1 Introduction. This chapter is meant as a basic introduction into elementary algorithmic principles and data structures used in computer science. In the latter field, the focus is on processing information in a systematic and often automatized way. One goal in the design of solution methods (algorithms) is to make efficient use of hardware resources such as computing time and memory. It is true that hardware development is very fast; the processors' speed increases rapidly.
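The chapter's implementations are in C; as a quick illustration of the first algorithm it covers, here is the insertion-sort idea in C++.

#include <cstddef>
#include <vector>

// Insertion sort: grow a sorted prefix by inserting a[i] into its place
// among a[0..i-1], shifting larger elements one slot to the right.
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
    }
}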
  • Selection (Deterministic & Randomized): Finding the Median in Linear Time
Lecture 4: Selection (deterministic & randomized): finding the median in linear time.

4.1 Overview. Given an unsorted array, how quickly can one find the median element? Can one do it more quickly than by sorting? This was an open question for some time, solved affirmatively in 1972 by (Manuel) Blum, Floyd, Pratt, Rivest, and Tarjan. In this lecture we describe two linear-time algorithms for this problem: one randomized and one deterministic. More generally, we solve the problem of finding the kth smallest out of an unsorted array of n elements.

4.2 The problem and a randomized solution. A related problem to sorting is the problem of finding the kth smallest element in an unsorted array. (Let's say all elements are distinct to avoid the question of what we mean by the kth smallest when we have equalities.) One way to solve this problem is to sort and then output the kth element. Is there something faster – a linear-time algorithm? The answer is yes. We will explore both a simple randomized solution and a more complicated deterministic one. The idea for the randomized algorithm is to notice that in Randomized-Quicksort, after the partitioning step we can tell which subarray has the item we are looking for, just by looking at their sizes. So, we only need to recursively examine one subarray, not two. For instance, if we are looking for the 87th-smallest element in our array, and after partitioning the "LESS" subarray (of elements less than the pivot) has size 200, then we just need to find the 87th smallest element in LESS.
  • 1.6 Median Selection
Algorithms, Lecture 1: Recursion [Fa'10]. 1.6 Median Selection. So how do we find the median element of an array in linear time? The following algorithm was discovered by Manuel Blum, Bob Floyd, Vaughan Pratt, Ron Rivest, and Bob Tarjan in the early 1970s. Their algorithm actually solves the more general problem of selecting the kth largest element in an array, using the following recursive divide-and-conquer strategy. The subroutine Partition is the same as the one used in Quicksort.

Select(A[1 .. n], k):
  if n ≤ 25
    use brute force
  else
    m ← ⌈n/5⌉
    for i ← 1 to m
      B[i] ← Select(A[5i − 4 .. 5i], 3)      ⟨⟨Brute force!⟩⟩
    mom ← Select(B[1 .. m], ⌊m/2⌋)           ⟨⟨Recursion!⟩⟩
    r ← Partition(A[1 .. n], mom)
    if k < r
      return Select(A[1 .. r − 1], k)        ⟨⟨Recursion!⟩⟩
    else if k > r
      return Select(A[r + 1 .. n], k − r)    ⟨⟨Recursion!⟩⟩
    else
      return mom

If the input array is too large to handle by brute force, we divide it into ⌈n/5⌉ blocks, each containing exactly 5 elements, except possibly the last. (If the last block isn't full, just throw in a few ∞s.) We find the median of each block by brute force and collect those medians into a new array. Then we recursively compute the median of the new array (the median of medians — hence 'mom') and use it to partition the input array. Finally, either we get lucky and the median-of-medians is the kth largest element of A, or we recursively search one of the two subarrays.
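The same algorithm in C++, following the pseudocode (group size 5, brute force below 25 elements); we delegate partitioning to std::partition rather than the Quicksort routine the notes reference, and use a 0-based rank k counted from the smallest.

#include <algorithm>
#include <vector>

// Deterministic linear-time selection (median of medians).
// Returns the element of 0-based rank k within a[lo..hi).
int momSelect(std::vector<int>& a, int lo, int hi, int k) {
    int n = hi - lo;
    if (n <= 25) {                              // brute force on small inputs
        std::sort(a.begin() + lo, a.begin() + hi);
        return a[lo + k];
    }
    // Median of each block of 5 goes into b; b gets its own recursive call.
    std::vector<int> b;
    for (int i = lo; i < hi; i += 5) {
        int end = std::min(i + 5, hi);
        std::sort(a.begin() + i, a.begin() + end);
        b.push_back(a[i + (end - i) / 2]);
    }
    int mom = momSelect(b, 0, static_cast<int>(b.size()),
                        static_cast<int>(b.size()) / 2);
    // Partition a[lo..hi) around mom: [< mom][== mom][> mom].
    auto cut1 = std::partition(a.begin() + lo, a.begin() + hi,
                               [&](int x) { return x < mom; });
    auto cut2 = std::partition(cut1, a.begin() + hi,
                               [&](int x) { return x == mom; });
    int lt = static_cast<int>(cut1 - (a.begin() + lo));  // count of "< mom"
    int le = static_cast<int>(cut2 - (a.begin() + lo));  // count of "<= mom"
    if (k < lt) return momSelect(a, lo, lo + lt, k);
    if (k >= le) return momSelect(a, lo + le, hi, k - le);
    return mom;
}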
• 3 Prune-and-Search
3 Prune-and-Search. We use two algorithms for selection as examples for the prune-and-search paradigm. The problem is to find the i-smallest item in an unsorted collection of n items. We could first sort the list and then return the item in the i-th position, but just finding the i-th item can be done faster than sorting the entire list. As a warm-up exercise consider selecting the 1-st or smallest item in the unsorted array A[1..n].

  min = 1;
  for j = 2 to n do
    if A[j] < A[min] then min = j endif
  endfor.

The index of the smallest item is found in n − 1 comparisons, which is optimal. Indeed, there is an adversary argument, that is, with fewer than n − 1 comparisons we can change the minimum without changing the outcomes of the comparisons.

From the analysis of the expected running time: ...is even. The expected running time increases with increasing number of items, T(k) ≤ T(m) if k ≤ m. Hence,

  T(n) ≤ n + (1/n) · Σ_{m=1}^{n} max{T(m − 1), T(n − m)}
       ≤ n + (2/n) · Σ_{m=n/2+1}^{n} T(m − 1).

Assume inductively that T(m) ≤ cm for m < n and a sufficiently large positive constant c. Such a constant c can certainly be found for m = 1, since for that case the running time of the algorithm is only a constant. This establishes the basis of the induction. The case of n items reduces to cases of m < n items for which we can use the induction hypothesis. We thus get

  T(n) ≤ n + (2c/n) · Σ_{m=n/2+1}^{n} (m − 1)
       = n + c · (n − 1) − (c/2) · (n/2 − 1).
• Selection (Deterministic & Randomized): Finding the Median in Linear Time
Lecture 4: Selection (deterministic & randomized): finding the median in linear time.

4.1 Overview. Given an unsorted array, how quickly can one find the median element? Can one do it more quickly than by sorting? This was solved affirmatively in 1972 by (Manuel) Blum, Floyd, Pratt, Rivest, and Tarjan. In this lecture we describe two linear-time algorithms for this problem: one randomized and one deterministic. More generally, we give linear-time algorithms for the problem of finding the kth smallest out of an unsorted array of n elements.

4.2 Lower bounds for comparison-based sorting. We saw in the last lecture that Randomized Quicksort will sort any array of size n with only O(n log n) comparisons in expectation. Mergesort and Heapsort are two other algorithms that will sort any array of size n with only O(n log n) comparisons (and these are deterministic algorithms, so there is no "in expectation"). Can one hope to sort with fewer, i.e., with o(n log n) comparisons? Let us quickly see why the answer is "no", at least for deterministic algorithms (we will analyze lower bounds for randomized algorithms later). To be clear, we are considering comparison-based sorting algorithms that only operate on the input array by comparing pairs of elements and moving elements around based on the results of these comparisons. Sometimes it is helpful to view such algorithms as providing instructions to a data-holder ("move this element over here", "compare this element with that element and tell me which is larger"). The two key properties of such algorithms are: 1.
• Mergesort ‣ Counting Inversions ‣ Randomized Quicksort ‣ Median and Selection ‣ Closest Pair of Points
5. Divide and Conquer I: mergesort, counting inversions, randomized quicksort, median and selection, closest pair of points. Lecture slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. http://www.cs.princeton.edu/~wayne/kleinberg-tardos. Last updated on 3/12/18 9:21 AM.

Divide-and-conquer paradigm ("divide and conquer", attributed to Julius Caesar). Divide up the problem into several subproblems (of the same kind); solve (conquer) each subproblem recursively; combine the solutions to the subproblems into an overall solution. Most common usage: divide a problem of size n into two subproblems of size n/2, in O(n) time; solve (conquer) the two subproblems recursively; combine the two solutions into an overall solution, in O(n) time. Consequence: brute force is Θ(n²); divide-and-conquer is O(n log n).

Sorting problem (Sections 5.1–5.2). Given a list L of n elements from a totally ordered universe, rearrange them in ascending order.

Sorting applications. Obvious applications: organize an MP3 library; display Google PageRank results; list RSS news items in reverse chronological order. Some problems become easier once elements are sorted: identify statistical outliers; binary search in a database; remove duplicates in a mailing list. Non-obvious applications: convex hull; closest pair of points; interval scheduling / interval partitioning.

Mergesort. Recursively sort the left half; recursively sort the right half; merge the two halves to make a sorted whole.

  input:           A L G O R I T H M S
  sort left half:  A G L O R  I T H M S
  sort right half: A G L O R  H I M S T
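The mergesort of these slides in short C++ form, assuming a simple merge with O(n) extra space.

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Mergesort: sort each half recursively, then merge the two sorted halves
// into a sorted whole. Sorts the half-open index range [lo, hi).
void mergesort(std::vector<char>& a, std::size_t lo, std::size_t hi) {
    if (hi - lo <= 1) return;
    std::size_t mid = lo + (hi - lo) / 2;
    mergesort(a, lo, mid);   // sort left half
    mergesort(a, mid, hi);   // sort right half
    std::vector<char> merged;
    merged.reserve(hi - lo);
    std::merge(a.begin() + lo, a.begin() + mid,
               a.begin() + mid, a.begin() + hi,
               std::back_inserter(merged));
    std::copy(merged.begin(), merged.end(), a.begin() + lo);
}

// Example with the slides' input "A L G O R I T H M S":
// std::vector<char> v = {'A','L','G','O','R','I','T','H','M','S'};
// mergesort(v, 0, v.size());   // yields A G H I L M O R S T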