HKUST Institutional Repository

FunSort

Therese Biedl*, Timothy Chan*, Erik D. Demaine*, Rudolf Fleischer†, Mordecai Golin†, J. Ian Munro*

February

* Department of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada. Email: {biedl,tmchan,eddemaine,imunro}@uwaterloo.ca
† Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. Email: {rudolf,golin}@cs.ust.hk. R. Fleischer was partially supported by HKUST grant DAGEG.

HKUST Theoretical Computer Science Center Research Report HKUST-TCSC-2001-03

Abstract

In this paper we study greedy in-place sorting algorithms which miraculously happen to work in reasonable time. DumbSort, which repeatedly compares all possible pairs of array cells, sorts $n$ elements in $n$ cycles, or time $O(n^3)$. NotSoDumbSort, which only tests adjacent cells, also sorts in $n$ cycles, or in time $O(n^2)$. GuessSort, a randomized version of DumbSort, runs in expected time $O(n^2 \log n)$. And FunSort, an in-place variant of InsertionSort that performs repeated insertions by binary search into an initially unsorted array, sorts in time $O(n^2 \log n)$.

1 Introduction

Comparison-based sorting algorithms tend to be based on a few basic paradigms: insertion, exchanging, selection, and merging. Most methods operate by maintaining an invariant of sortedness on an increasingly longer segment of the array. The exchange sorts, e.g., Bubblesort and Quicksort, are notable exceptions to this approach. But even they maintain strong invariants on the progress made thus far. Shellsort, from one point of view, maintains sorted subarrays of increasingly greater length as the process continues. Perhaps "greater density" is a better term for the notion of the interleaved sorted segments of Shellsort. From another point of view, Shellsort is delightfully chaotic. At any stage, one is presented with a subregion which one hopes is reasonably close to sorted order, and then is to complete the task by using linear InsertionSort. Magically, it works pretty well.

Our starting point is the notion of
sorting by repeatedly moving values from current to more likely locations by performing binary searches. If an array were in sorted order except for the location of one value, then this approach would minimize both the number of comparisons and the number of moves necessary to complete the task. The approach would certainly place the set maximum and minimum in their proper spots and never move them again. If the array is close to being in sorted order, the process will generally make further progress. Our question is whether the general approach of binary insertion into an initially unordered array can be harnessed to yield a viable sorting algorithm.

We are able to adapt the method to correctly perform a sort and to bound the worst-case runtime reasonably accurately. However, we feel the greater contribution of this paper is to lay open the approach, both for purposes of algorithmic improvements and for improvements in the expected runtime of our approach. In particular, we observe that the expected position of an element binary-inserted into an array is approximately its rank.

Several variations on this theme are explored. In a more greedy approach, we move values to more likely locations by comparing all pairs of array cells and swapping their contents if they are not in the right order. We show that $n$ iterations, or cycles, of this brute-force approach suffice to sort any input, independent of the order of the comparisons in a cycle. Interestingly, it turns out that not all $\binom{n}{2}$ pairs have to be considered in each cycle. It is sufficient to have $n$ cycles that only compare the $n-1$ pairs of adjacent cells in any arbitrary order. Bubblesort is a special case of such an algorithm. Previously, it was only known that a larger number of cycles will sort correctly.

Our results are also of interest because all of our sorting algorithms are purely in-place. And, with the exception of the unordered binary searches algorithm, they are also oblivious, i.e., they could be realized by a sorting network. But in contrast
to other work on sorting networks, we are not interested in exploiting the parallelism of our sequential algorithms (to reduce the number of rounds or the depth of the network, for example).

This paper is organized as follows. In Section 2 we introduce our model of inversion-fixing algorithms and show that all inversion-fixing algorithms will eventually terminate. In Section 3 we give a simple proof, based on the 0-1 Principle, that DumbSort, which performs $n$ cycles of comparisons of all possible cell pairs, sorts correctly. We think that this is a more natural $n^3$-time slow-sort algorithm than Julstrom's Slow Sort, which is somehow artificially bloated. We then show that NotSoDumbSort, a streamlined version of DumbSort which only considers adjacent cells, also sorts correctly in $n$ cycles, reducing the runtime to $O(n^2)$. In Section 4 we introduce GuessSort, a randomized variant of DumbSort which sorts in expected time $O(n^2 \log n)$. In Section 5 we show that FunSort, which is InsertionSort with binary-search insertions into an initially unordered array, sorts in time $O(n^2 \log n)$. And we conclude with a few open problems in Section 6.

2 To Swap or not to Swap: That's our Question

Assume we are given $n$ elements from a totally ordered set, stored in an array of size $n$ whose cells we think of as being arranged from left to right in increasing order of index. For simplicity, we assume that all elements are different, but the results in this paper also hold in the general case. At any time, we denote the content of cell $i$ by $a_i$, for $1 \le i \le n$. Our goal is to sort these elements in nondecreasing order.

Intuitively, the simplest way to sort an array is to concentrate on just two elements at a time, from which we can extract only one bit of information. A pair $(i,j)$ of cells $i$ and $j$, where $i \neq j$, is good if the cell contents are correctly ordered, i.e., $a_i \le a_j$ if $i < j$, or $a_i \ge a_j$ if $i > j$. Otherwise, the pair is bad. Bad pairs are also called inversions. The number of inversions in the array can be between zero, if the array is correctly sorted, and $\binom{n}{2}$, if the array is sorted in
reverse order.

Following these ideas, the most basic method of sorting is to search for and swap bad pairs. De Bruijn called the action of comparing the contents of a pair of cells and leaving the smaller value in the left cell (i.e., the cell with lower index) a swap, or a miniswap if the two cells are adjacent. Note that swapping a good pair has no effect, whereas in a swap of a bad pair (an exchange swap) the cell contents are exchanged. The following simple observation seems to be folklore, but we could not find it stated properly in the literature, although it has been used. It implies that we can sort by repeatedly doing exchange swaps.

Lemma. An exchange swap strictly decreases the number of inversions.

Proof. Consider an exchange swap $(i,j)$, where $i < j$ and $a_i > a_j$ (see the figure). Let $(k,\ell)$ be a good pair before the swap, where $k < \ell$. Since the content of cell $i$ is decreasing and the content of cell $j$ is increasing, $(k,\ell)$ can only become a bad pair if $k = j$ or $\ell = i$. In the former case, $(i,\ell)$ was a bad pair before the swap (because $a_i$ moves to cell $j$), but it is good afterwards. In the latter case, $(k,j)$ was a bad pair before the swap (because $a_j$ moves to cell $i$), but it is good afterwards. So every good pair that turns bad is compensated by a bad pair that turns good. In any case, the number of bad pairs decreases because $(i,j)$ becomes a good pair.

Figure: If we swap the bad pair $(i,j)$, then the good pair $(k,\ell)$ becomes a bad pair. However, the bad pair $(i,\ell)$ becomes a good pair. We also accidentally turn the bad pairs $(i,m)$ and $(m,j)$ into good pairs, but that is ok.

We only consider sorting algorithms based on swaps, called exchange sorting algorithms by Knuth. We prefer to call them inversion-fixing algorithms. Knuth also calls sorting algorithms purely based on miniswaps primitive (in an exercise). The lemma implies that inversion-fixing algorithms cannot make an unbounded number of exchange swaps.

Corollary. Any inversion-fixing algorithm makes at most $\binom{n}{2} = O(n^2)$ exchange swaps.

Proof. Initially, the array can have at most $\binom{n}{2}$ inversions, and each exchange swap strictly decreases the number of inversions by the lemma.

So if we knew
the indices of the bad pairs, we could easily sort in $O(n^2)$ time. Unfortunately, we cannot expect to get this information for free. In the next three sections we will show several simple, but nevertheless correct and even semi-efficient, ways to find bad pairs. We only consider oblivious algorithms, where the sequence of comparisons is fixed in advance. In particular, it is independent of the input sequence or the outcome of individual comparisons. That makes the algorithms similar to sorting networks, except that we do not try to exploit any parallelism.

3 DumbSort: Test All, Hit One

A not too clever brute-force way of finding an exchange swap is to try all swaps. This approach is similar to the Bellman-Ford algorithm for computing shortest paths in a weighted graph, where in each step we know that there is one node of optimal value which we should relax next, but since we cannot identify that node we simply relax all nodes (hence the bad runtime of that algorithm).

A cycle is a sequence of all $\binom{n}{2}$ swaps in some arbitrary order (the swap sequence). A minicycle is a sequence of all $n-1$ miniswaps in some arbitrary order (the miniswap sequence). Note that no miniswap sequence can sort the input sequence. On the other hand, there is a swap sequence that sorts all sequences, namely the swap sequence $(1,2), (1,3), \ldots, (n-1,n)$. However, not all swap sequences can sort any input sequence. For example the sequence will not be sorted by the swap sequence the result would be Since any swap sequence contains at
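The notions of swap, cycle, and minicycle defined above can be made concrete with a small sketch. The following Python code is an illustration only, under stated assumptions: the function names `count_inversions`, `swap`, `dumb_sort`, and `not_so_dumb_sort` are hypothetical, cells are indexed from 0 rather than 1, and the order of swaps inside each cycle is shuffled with a seeded generator simply to mimic "some arbitrary order". It runs $n$ cycles, as stated in the text, and also checks on one example that an exchange swap lowers the number of inversions.

```python
import random
from itertools import combinations


def count_inversions(a):
    """Number of bad pairs (i, j) with i < j and a[i] > a[j]."""
    return sum(1 for i, j in combinations(range(len(a)), 2) if a[i] > a[j])


def swap(a, i, j):
    """Compare cells i < j and leave the smaller value in the left cell.

    Returns True if the contents were exchanged (an exchange swap).
    """
    if a[i] > a[j]:
        a[i], a[j] = a[j], a[i]
        return True
    return False


def dumb_sort(a, rng):
    """n cycles; each cycle performs all swaps in an arbitrary order."""
    n = len(a)
    pairs = list(combinations(range(n), 2))  # all pairs (i, j) with i < j
    for _ in range(n):
        rng.shuffle(pairs)  # arbitrary order, chosen here by shuffling
        for i, j in pairs:
            swap(a, i, j)


def not_so_dumb_sort(a, rng):
    """n minicycles; each minicycle performs all n-1 miniswaps in an arbitrary order."""
    n = len(a)
    pairs = [(i, i + 1) for i in range(n - 1)]  # adjacent cells only
    for _ in range(n):
        rng.shuffle(pairs)
        for i, j in pairs:
            swap(a, i, j)


rng = random.Random(0)  # hypothetical fixed seed, for reproducibility only
data = [5, 2, 7, 1, 6, 3, 4]

# An exchange swap strictly decreases the number of inversions.
b = data[:]
before = count_inversions(b)
exchanged = swap(b, 0, 3)  # cells 0 and 3 hold 5 and 1: a bad pair
after = count_inversions(b)

c, d = data[:], data[:]
dumb_sort(c, rng)
not_so_dumb_sort(d, rng)
print(exchanged, before > after, c == sorted(data), d == sorted(data))
```

A cycle costs $\binom{n}{2}$ comparisons and a minicycle costs $n-1$, so $n$ rounds of either give the $O(n^3)$ and $O(n^2)$ bounds quoted for DumbSort and NotSoDumbSort.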

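The idea from the introduction, repeated binary-search insertion into an initially unsorted array, can also be sketched in a few lines. This is a minimal sketch and not necessarily the exact FunSort procedure analyzed in the paper: the name `fun_sort`, the 0-based indices, the virtual sentinels at positions $-1$ and $n$, and the rule "stop after a full pass without an exchange" are all assumptions made for illustration. As in the text, the elements are assumed to be distinct. Every exchange performed here is an exchange swap of a bad pair, so by the lemma on exchange swaps the number of inversions strictly decreases and the loop must terminate.

```python
def fun_sort(a):
    """Sort a list of distinct values in place by binary-search insertions
    into the (possibly unsorted) list itself. Illustrative sketch only."""
    if len(set(a)) != len(a):
        raise ValueError("this sketch assumes distinct elements")
    n = len(a)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            x = a[i]
            # Binary search for x as if the list were sorted.
            # lo = -1 and hi = n act as sentinels for -infinity and +infinity.
            lo, hi = -1, n
            while hi - lo > 1:
                mid = (lo + hi) // 2
                if a[mid] < x:
                    lo = mid
                else:
                    hi = mid
            # Here hi == lo + 1, a[lo] < x (or lo == -1), a[hi] >= x (or hi == n).
            if i > hi:
                # a[hi] > x lies to the left of x, so (hi, i) is a bad pair.
                a[hi], a[i] = a[i], a[hi]
                changed = True
            elif i < lo:
                # a[lo] < x lies to the right of x, so (i, lo) is a bad pair.
                a[lo], a[i] = a[i], a[lo]
                changed = True
    return a


print(fun_sort([5, 2, 7, 1, 6, 3, 4]))
```

When a pass makes no exchange, every search has ended at the cell of the element it was looking for. Each element is then correctly ordered with respect to all cells probed on its search path, which forces the whole list to be sorted. Unlike the swap-based sketches, the cells probed here depend on the outcome of earlier comparisons, which is why this approach is the one exception to obliviousness mentioned in the introduction.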