Sorting in Linear Time?
Arne Andersson, Torben Hagerup, Stefan Nilsson, Rajeev Raman

Arne Andersson and Stefan Nilsson: Department of Computer Science, Lund University, Box 118, S-221 00 Lund, Sweden.
Torben Hagerup: Max-Planck-Institut für Informatik, D-66123 Saarbrücken, Germany. Supported by the ESPRIT Basic Research Actions Program of the EU under contract No. 7141 (project ALCOM II).
Rajeev Raman: Department of Computer Science, King's College London, Strand, London WC2R 2LS, U. K.

Abstract

We show that a unit-cost RAM with a word length of $w$ bits can sort $n$ integers in the range $0..2^w - 1$ in $O(n \log\log n)$ time, for arbitrary $w \ge \log n$, a significant improvement over the bound of $O(n \sqrt{\log n})$ achieved by the fusion trees of Fredman and Willard. Provided that $w \ge (\log n)^{2+\epsilon}$, for some fixed $\epsilon > 0$, the sorting can even be accomplished in linear expected time with a randomized algorithm.

Both of our algorithms parallelize without loss on a unit-cost PRAM with a word length of $w$ bits. The first one yields an algorithm that uses $O(\log n)$ time and $O(n \log\log n)$ operations on a deterministic CRCW PRAM. The second one yields an algorithm that uses $O(\log n)$ expected time and $O(n)$ expected operations on a randomized EREW PRAM, provided that $w \ge (\log n)^{2+\epsilon}$ for some fixed $\epsilon > 0$.

Our deterministic and randomized sequential and parallel algorithms generalize to the lexicographic sorting problem of sorting multiple-precision integers represented in several words.

1 Introduction

Sorting is one of the most fundamental computational problems, and $n$ keys can be sorted in $O(n \log n)$ time by any of a number of well-known sorting algorithms. These algorithms operate in the comparison-based setting, i.e., they obtain information about the relative order of keys exclusively through pairwise comparisons. It is easy to show that a running time of $O(n \log n)$ is optimal in the comparison-based model. However, this model may not always be the most natural one for the study of sorting problems, since real machines allow many other operations besides comparison. Using indirect addressing, for instance, it is possible to sort $n$ integers in the range $0..n - 1$ in linear time via bucket sorting, thereby demonstrating that the comparison-based lower bound can be meaningless in the context of integer sorting.

Integer sorting is not an exotic special case, but in fact is one of the sorting problems most frequently encountered. Aside from the ubiquity of integers in algorithms of all kinds, we note that all objects manipulated by a conventional computer are represented internally by bit patterns that are interpreted as integers by the built-in arithmetic instructions. For most basic data types, the numerical ordering of the representing integers induces a natural ordering on the objects represented; e.g., if an integer represents a character string in the natural way, the induced ordering is the lexicographic ordering among character strings. This is true even for floating-point numbers; indeed, the IEEE 754 floating-point standard was designed specifically to facilitate the sorting of floating-point numbers by means of integer-sorting subroutines [13, p. 228]. Most sorting problems therefore eventually boil down to sorting integers or, possibly, multiple-precision integers stored in several words.

Classical algorithms for integer sorting require assumptions about the size of the integers to be sorted, or else have a running time dependent on the size. Bucket sorting requires the $n$ input keys to be in the range $0..n - 1$. Radix sorting in $k$ phases, each phase implemented via bucket sorting, can sort $n$ integers in the range $0..n^k - 1$ in $O(nk)$ time. A more sophisticated technique, due to Kirkpatrick and Reisch [14], reduces this to $O(n \log k)$, but the fact remains that as the size of the integers to be sorted grows to infinity, the cost of the sorting also grows to infinity (or to $\Theta(n \log n)$, if we switch to a comparison-based method at the appropriate point).

If we allow intermediate results containing many more bits than the input numbers, we can actually sort integers in linear time independently of their size, as demonstrated by Paul and Simon [18] and Kirkpatrick and Reisch [14]. But again, from a practical point of view, this is not what we want, since a real machine is unlikely to have unit-time instructions for operating on integers containing a huge number of bits. Instead, if the input numbers are $w$-bit integers, we would like all intermediate results computed by a sorting algorithm to fit in $w$ bits as well; in the terminology of Kirkpatrick and Reisch, the algorithm should be conservative. In this case it is realistic to assume that a full repertoire of "reasonable" instructions can be applied to word-sized operands in constant time. In the remainder of the paper, when nothing else is stated, we will take "sorting" to mean sorting $w$-bit words on a unit-cost RAM with a word length of $w$ bits.

Fredman and Willard [9] were the first to show that $n$ arbitrary integers can be sorted in $o(n \log n)$ time by a conservative method. Their algorithm, based on fusion trees, sorts $n$ integers in $O(n \sqrt{\log n})$ time. We describe two simple algorithms that improve their result. It should be noted that fusion trees have other uses besides sorting, such as in efficient data structures, to which our results do not apply.

Our first algorithm works in $O(n \log\log n)$ time. It uses arithmetic instructions drawn from what we call the restricted instruction set, including comparison, addition, subtraction, bitwise AND and OR, and unrestricted bit shift, i.e., shift of an entire word by a number of bit positions specified in a second word. As is not difficult to see, these instructions are all in $AC^0$, i.e., they can be implemented through constant-depth, polynomial-size circuits with unbounded fan-in. Since this is known not to be the case for the multiplication instruction [4], which is essential for the fusion-tree algorithm, our algorithm can also be viewed as placing less severe demands on the underlying hardware; this answers a question posed by Fredman and Willard (an answer to this question is already implicit in [3]). Also, the algorithm by Fredman and Willard is nonuniform, in the sense that a number of precomputed constants depending on $w$ need to be included in the algorithm. Our algorithms need to know the value of $w$ itself, but no other precomputed constants.

Our second algorithm is randomized and works in $O(n)$ expected time, provided that $w \ge (\log n)^{2+\epsilon}$ for some fixed $\epsilon > 0$. Sufficiently large integers can thus be sorted in linear expected time by a conservative algorithm. The algorithm uses a full instruction set that augments the restricted instruction set with instructions for multiplication and random choice, where the latter takes an operand $s$ in the range $1..2^w - 1$ and returns a random integer drawn from the uniform distribution over $\{1, \ldots, s\}$ and independent of all other such integers.

Ben-Amram and Galil [5, Theorem 5] have shown that, under some circumstances, sorting requires $\Omega(n \log n)$ time on a RAM with an instruction set consisting of comparison, addition, subtraction, multiplication, and bitwise boolean operations. While it is possible to simulate left shifts using multiplication in their model, their lower bound does not apply if right shifts are allowed. We, on the other hand, assume that the complexity of left and right shifts is the same (as indeed [...] optimal if [...].

Our results flow from the combination of the two techniques of packed sorting and range reduction. Packed sorting, introduced by Paul and Simon [18] and developed further in [12] and [2], saves on integer sorting by packing several integers into a single word and operating simultaneously on all of them at unit cost. This is only possible, of course, if several integers to be sorted fit in one word, i.e., packed sorting is inherently nonconservative. Range reduction, on the other hand, reduces the problem of sorting integers in a certain range to that of sorting integers in a smaller range. The combination of the two techniques is straightforward: First range reduction is applied to replace the original full-size integers by smaller integers of which several fit in one word, and then these are sorted by means of packed sorting.

As a purely technical point, we assume a machine architecture that always allows us to address enough working memory for our algorithms, even when $w$ is barely larger than $\log n$ (this is an issue only for $w = O(\log n)$, in which case radix sorting works in linear time and space). Also, standard algorithms for multiple-precision arithmetic allow us to assume constant-time operations on words of $O(w)$ bits, rather than exactly $w$ bits.

2 Sorting in $O(n \log\log n)$ time

Our goal in this section is to prove the following theorem.

Theorem 1 For all given integers $n \ge 4$ and $w \ge \log n$, $n$ integers in the range $0..2^w - 1$ can be sorted in $O(n \log\log n)$ time on a unit-cost RAM with a word length of $w$ bits and the restricted instruction set.

For all positive integers $n$ and $b$ with $b \le w$, denote by $T(n, b)$ the worst-case time needed to sort $n$ integers of $b$ bits each, assuming $b$ and $w$ to be known. A sequential version of a parallel algorithm due to Albers and Hagerup [2] shows that $T(n, b) = O(n)$ for all $n \ge 4$ and $b \le \lfloor w / (\log n \log\log n) \rfloor$, i.e., provided that $\Omega(\log n \log\log n)$ keys can be packed into one word, sorting can be accomplished in linear time. This follows directly from Corollary 1 of [2]. (The corollary requires a certain quantity to be known, but it is easy to see that it suffices, in our case, to know the word length $w$.) We sketch the algorithm to illustrate its simplicity.
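As a side illustration of what "operating simultaneously on all of them at unit cost" can mean for packed keys, the following Python sketch compares several small keys held in one word using only subtraction, bitwise AND/OR and shifts, i.e., operations from the restricted instruction set. It is not the algorithm of [2] and not taken from the paper; it is one well-known primitive of this kind, shown under stated assumptions. The names `pack`, `flag_mask` and `packed_ge` and the layout (one spare test bit above each $b$-bit key) are hypothetical choices made for this example, and an unbounded Python integer stands in for a machine word, so the sketch assumes that `count * (b + 1)` does not exceed the word length $w$.

```python
def pack(keys, b):
    """Pack keys (each < 2**b) into one integer, one (b+1)-bit field per key.

    The extra high bit of every field is a test bit and is left at 0.
    (Layout chosen for this illustration only.)
    """
    word = 0
    for i, key in enumerate(keys):
        if not 0 <= key < (1 << b):
            raise ValueError("key does not fit in b bits")
        word |= key << (i * (b + 1))
    return word


def flag_mask(count, b):
    """Mask in which only the test bit of each of the `count` fields is set."""
    mask = 0
    for i in range(count):
        mask |= 1 << (i * (b + 1) + b)
    return mask


def packed_ge(x_word, y_word, count, b):
    """Compare all fields with one subtraction: result[i] is True iff x_i >= y_i."""
    h = flag_mask(count, b)
    # Setting the test bits of x makes every field of x at least 2**b, so the
    # subtraction never borrows across a field boundary. Afterwards the test
    # bit of field i is still set exactly when x_i >= y_i.
    diff = ((x_word | h) - y_word) & h
    # Unpacking into a list is for display only; a word-level algorithm would
    # keep working on `diff` directly.
    return [bool((diff >> (i * (b + 1) + b)) & 1) for i in range(count)]


xs = [3, 9, 15, 0]
ys = [5, 9, 2, 1]
print(packed_ge(pack(xs, 4), pack(ys, 4), len(xs), 4))
# prints [False, True, True, False]
```

The spare test bit costs one bit per key, which is why a key of $b$ bits occupies a field of $b + 1$ bits here; the single subtraction then does the work of `count` separate comparisons.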