Analysis of Fast Insertion Sort
A Technical Report presented to the faculty of the School of Engineering and Applied Science, University of Virginia

by Eric Thomas
April 24, 2021

On my honor as a University student, I have neither given nor received unauthorized aid on this assignment as defined by the Honor Guidelines for Thesis-Related Assignments. Eric Thomas

Technical advisor: Lu Feng, Department of Computer Science

Analysis of Fast Insertion Sort: Efficient Variants of Insertion Sort
Eric Thomas
University of Virginia
[email protected]

ABSTRACT

As a widespread problem in programming language and algorithm design, many sorting algorithms, such as Fast Insertion Sort, have been created to excel at particular use cases. While Fast Insertion Sort has been demonstrated to have a lower running time than Hoare's Quicksort in many practical cases, re-implementation is carried out to verify those results again in C++ as well as Python, which involves comparing the running times of Fast Insertion Sort, Merge Sort, Heapsort, and Quicksort on random and partially sorted arrays of different lengths. Furthermore, the advantages and disadvantages of Fast Insertion Sort, in addition to other questions, flaws, insights, and directions for future work, are elaborated on. Fast Insertion Sort offers improved worst-case time complexity over Insertion Sort, while maintaining Insertion Sort's advantages of being simple, stable, and adaptive.

1 Introduction

Although Insertion Sort has many advantages when compared to other comparison sorting algorithms, such as being in-place, adaptive, stable, and online (that is, Insertion Sort can sort an array as it is received) as well as having a worst-case space complexity of Θ(1) (not including the input array), Insertion Sort has the downside of a high worst-case time complexity of Θ(n²). Nonetheless, Insertion Sort has small constants, which allows it to be faster than most other sorting algorithms for small input arrays. Currently, there are numerous sorting algorithms that are either variants of Insertion Sort (for example, Shellsort, Tree Sort, and Library Sort) or switch to Insertion Sort when the input array is small enough (for example, Timsort, Introsort, and Block Sort), thus accomplishing better average- and worst-case time complexity and preserving some, but not all, advantages of Insertion Sort.

Out of the aforementioned sorting algorithms, only Timsort and Block Sort are adaptive, stable, and achieve lower worst-case time complexity than Insertion Sort, though the former has a somewhat undesirable worst-case space complexity of O(n) and the latter suffers from a complicated implementation. According to Faro et al. [1], there does not seem to exist an iterative, in-place, and online variant of Insertion Sort that manages to have a worst-case time complexity of o(n²). Originating from Faro et al. [1], a family of Insertion Sort variants, referred to as Fast Insertion Sort, is the result of an endeavor to solve these problems.

2 Related Work

After reviewing research by "Mishra et al. [2], Yang et al. [3] and Beniwal et al. [4] [that] analyzed and compared different sorting algorithms in order to help users to find appropriate algorithm as per their need," Goel and Kumar [5] said that "an experimental study by Yang et al. [3] revealed that for smaller sized list, Insertion and Selection sort performs well, but for a larger sized list, merge and quicksort are good performers." As another notable use case for Insertion Sort, Goel and Kumar note that "dynamic sorting of modules in physical memory uses Insertion Sort algorithm to sort free spaces according to their size in fragmented memory," which is to "aggregate free spaces at one location and make room for new modules." Furthermore, they survey multiple improvements of Insertion Sort that have been made by researchers in the past.

Goel and Kumar [5] explain how "a new sorting algorithm for a nearly sorted list of elements was devised from the combination of IS and quickersort algorithm," which uses "an auxiliary list ... to remove the unordered pairs from the main list." After "sorting is done using IS or quickersort algorithm," they said that "results of both the lists are then merged to give a final ordered list." They point out another sorting algorithm, called "Fun-Sort," which uses a "repeated binary search approach on an unsorted array for IS and acquired Θ(n² log n) time complexity to sort an array of n elements." Since "gaps are left [in arrays] ... to accelerate insertions," they mention that "gapped insertion sort," also known as Library Sort or Binary Insertion Sort, "is studied to devise an algorithm with O(n log n) time complexity with high probability." They touch on "an improvement for IS ... proposed by Min [6] as [a] 2-element insertion sort where two elements from unsorted part of a list are inserted into sorted part," as opposed to one-element insertion in Insertion Sort, allowing for improved time complexity but increased space complexity. They cover how "Dutta et al. [7] designed an approach for data which is in opposite order," demonstrating that the "number of comparison operations are much less, i.e., nearly equal to 3(n−2)/2 in comparison to IS." In addition to an Insertion Sort that is based on 2-way expansion, called "an Adaptive Insertion Sort (AIS) ... to reduce the number of comparisons and shift operations," they also mention a "Doubly Inserted Sort, an approach similar to the max-min sorting algorithm [that] is used to scan list from both ends." Furthermore, they observe that "Mohammed et al. [8] proposed Bidirectional Insertion Sort (BCIS) based on the concept of IS, left and right comparators," reducing the number of "comparison and shift operations" to the point where the "average case and best case complexity of BCIS is O(n^1.5) and O(4n) respectively." Lastly, they remark how "there is hardly any variant of IS which has claimed to maintain online sorting feature of IS," with some even deviating "from in-place and stable sort feature of IS to give better performance."

Lam and Wong [9] said that "Rotated Insertion Sort, or just Rotated Sort for short, is based on the idea of the implicit data structure called rotated list," which is "where the relative ordering of the elements is stored implicitly in the pattern of the data structure, rather than explicitly storing the relative ordering using offsets or pointers." They note that "rotated list achieves O(n^1.5 log n) operations using constant O(w) bits temporary space, or O(n^1.5) operations with extra Θ(√n log n) bits temporary space, regardless of w," where w is the size of a computer word. Additionally, they present their original "Rotated Library Sort that combines the advantages" of Library Sort and Rotated Insertion Sort. Petersson and Moffat [10] elaborate on adaptive sorting algorithms, such as "Straight Insertion Sort" and "Natural Mergesort."

3 System Design

Fast Insertion Sort comes in two types [1]. One type, Fast Insertion Sort Nested, relies on nesting, because the h-th algorithm recursively calls the (h−1)-th algorithm, stopping when h = 0. The other type of Fast Insertion Sort is a "purely recursive" derivation of Fast Insertion Sort Nested, called simply Fast Insertion Sort or Fast Insertion Sort Recursive, which dynamically computes h [1].

Fast Insertion Sort works on sorted blocks whose size, k, is calculated from the following formula, where h is the "input size degree" [1]. Since k is the size of an array, k ∈ ℕ.

k = n^((h−1)/h)    (1)

Meanwhile, h is defined such that h ∈ ℕ and h ≥ 1, and it is computed from an additional formula below, where c is the "partitioning degree," c ∈ ℝ and c > 0, or a positive floating point in practice [1].

h = ⌈log_c n⌉    (2)

The general intuition is that if c^(h−1) < n ≤ c^h (which can be found with Formula 2), then Fast Insertion Sort partitions the input array into at most c blocks of size k. In other words, the block size, k, can be either increased by decreasing c or increasing h, or decreased by increasing c or decreasing h. This property allows the user to fine-tune the algorithm according to the expected size of the input array.

The procedure of Fast Insertion Sort is a for loop of index i from 0 to n − 1 where each iteration adds k to i. For each iteration, the minimum of k and n − i is determined, which is used as the new value of k for the rest of the iteration. If the case where n − i < k were not accounted for, the block size, k, would extend past the end of the input array.

Then Fast Insertion Sort recursively calls itself with a new starting index of i and a new input array size of k from the minimum function. In the case of Fast Insertion Sort Nested, the h argument is decremented by 1 in order to call the subsequent nested version of the algorithm. The final operation of each iteration is an "insert block" procedure that takes the sorted block from i to k − 1 and inserts it into the partition from the starting index of the input array (of the current function call) to k − 1, while also maintaining the order of that partition [1].
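To make the procedure concrete, the loop and recursion described above can be sketched in Python, one of the two languages used in the re-implementation. This is a minimal illustration of the Recursive variant, not Faro et al.'s reference code: the function names, the default partitioning degree c = 7, the integer rounding and clamping of k, and the element-by-element merge standing in for the "insert block" procedure are all assumptions made for this sketch.

```python
import math

def insert_block(a, lo, i, b):
    # Merge the sorted block a[i:i+b] into the sorted run a[lo:i],
    # shifting greater elements right as in Insertion Sort (stable).
    for j in range(i, i + b):
        v = a[j]
        m = j - 1
        while m >= lo and a[m] > v:
            a[m + 1] = a[m]
            m -= 1
        a[m + 1] = v

def fast_insertion_sort(a, start=0, n=None, c=7.0):
    # Recursive variant: h is recomputed from n on every call (Formula 2),
    # and the block size k follows from h (Formula 1).
    if n is None:
        n = len(a) - start
    if n <= 1:
        return
    h = max(1, math.ceil(math.log(n, c)))               # h = ceil(log_c n), h >= 1
    k = max(1, min(n - 1, round(n ** ((h - 1) / h))))   # k = n^((h-1)/h), clamped to [1, n-1]
    i = 0
    while i < n:
        b = min(k, n - i)                          # shrink the final block so it stays in bounds
        fast_insertion_sort(a, start + i, b, c)    # sort the block recursively
        insert_block(a, start, start + i, b)       # insert it into the sorted prefix
        i += k

data = [31, 4, 15, 9, 26, 5, 35, 8, 9, 7]
fast_insertion_sort(data)
print(data)  # [4, 5, 7, 8, 9, 9, 15, 26, 31, 35]
```

Clamping k to n − 1 guards against rounding k up to n for very small c, which would otherwise recurse on a block the size of the whole input; a Nested variant would instead take h as an explicit parameter and pass h − 1 to the recursive call.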
3.1 Explanation of Fast Insertion Sort

The high-level idea of Fast Insertion Sort is to extend a sorted, left subdivision of the input array by inserting a sorted block of a specific size, k, for each iteration, all while maintaining the order during insertion.

3.2 Properties of Fast Insertion Sort

In addition to keeping Insertion Sort's properties of being adaptive, stable, and online, Fast Insertion Sort achieves a