Encyclopedia of Parallel Computing “00101” — 2011/4/16 — 13:39 — Page 1 — #2

A

Adaptive Bitonic Sorting

Gabriel Zachmann
Clausthal University, Clausthal-Zellerfeld, Germany

Definition

Adaptive bitonic sorting is a sorting algorithm suitable for implementation on EREW parallel architectures. Similar to bitonic sorting, it is based on merging, which is recursively applied to obtain a sorted sequence. In contrast to bitonic sorting, it is data-dependent. Adaptive bitonic merging can be performed in O(n/p) parallel time, p being the number of processors, and executes only O(n) operations in total. Consequently, adaptive bitonic sorting can be performed in O(n log n / p) time, which is optimal. So, one of its advantages is that it executes a factor of O(log n) fewer operations than bitonic sorting. Another advantage is that it can be implemented efficiently on modern GPUs.

Discussion

Introduction

This chapter describes a parallel sorting algorithm, adaptive bitonic sorting [], that offers the following benefits:

● It needs only the optimal total number of comparison/exchange operations, O(n log n).
● The hidden constant in the asymptotic number of operations is less than in other optimal parallel sorting methods.
● It can be implemented in a highly parallel manner on modern architectures, such as a streaming architecture (GPUs), even without any scatter operations, that is, without random-access writes.

One of the main differences between "regular" bitonic sorting and adaptive bitonic sorting is that regular bitonic sorting is data-independent, while adaptive bitonic sorting is data-dependent (hence the name). As a consequence, adaptive bitonic sorting cannot be implemented as a sorting network, but only on architectures that offer some kind of flow control. Nonetheless, it is convenient to derive the method of adaptive bitonic sorting from bitonic sorting.

Sorting networks have a long history in computer science research (see the comprehensive survey []). One reason is that sorting networks are a convenient way to describe parallel sorting algorithms on CREW-PRAMs or even EREW-PRAMs (which are also called PRACs, for "parallel random access computer").

In the following, let n denote the number of keys to be sorted, and p the number of processors. For the sake of clarity, n will always be assumed to be a power of 2. (In their original paper [], Bilardi and Nicolau have described how to modify the algorithms such that they can handle arbitrary numbers of keys, but these technical details will be omitted in this article.)

The first to present a sorting network with optimal asymptotic complexity were Ajtai, Komlós, and Szemerédi []. Also, Cole [] presented an optimal parallel approach, for the CREW-PRAM as well as for the EREW-PRAM. However, it has been shown that neither is fast in practice for reasonable numbers of keys [, ]. In contrast, adaptive bitonic sorting requires less than 2n log n comparisons in total, independent of the number of processors. On p processors, it can be implemented in O(n log n / p) time, for p ≤ n / log n. Even with a small number of processors it is efficient in practice: in its original implementation, the sequential version of the algorithm was slower than the best sequential algorithms only by a small constant factor (for the sequence lengths tested) [].

David Padua (ed.), Encyclopedia of Parallel Computing, © Springer Science+Business Media LLC


Fundamental Properties

One of the fundamental concepts in this context is the notion of a bitonic sequence.

Definition 1 (Bitonic sequence)  Let a = (a_0, ..., a_(n−1)) be a sequence of numbers. Then, a is bitonic, iff it monotonically increases and then monotonically decreases, or if it can be cyclically shifted (i.e., rotated) to become monotonically increasing and then monotonically decreasing.

As a consequence, all index arithmetic is understood modulo n, that is, index i + k ≡ (i + k) mod n, unless otherwise noted, so indices range from 0 through n − 1.

As mentioned above, adaptive bitonic sorting can be regarded as a variant of bitonic sorting. In order to capture the notion of "rotational invariance" (in some sense) of bitonic sequences, it is convenient to define the following rotation operator.

Definition 2 (Rotation)  Let a = (a_0, ..., a_(n−1)) and j ∈ N. We define a rotation as an operator R_j on the sequence a:

  R_j a = (a_j, a_(j+1), ..., a_(j+n−1))

Figure 1 shows some examples of bitonic sequences. In the following, any reasoning about bitonic sequences will be easier to understand if one considers them as being arranged in a circle or on a cylinder: then, there are only two inflection points around the circle. This is justified by Definition 1. Figure 2 depicts an example in this manner.

The rotation operation is performed by the network shown in Fig. 4. Such networks are comprised of elementary comparators (see Fig. 3). Two other operators are convenient to describe sorting.
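The two definitions above can be made concrete in a few lines of code. The following is a small Python sketch (not from the original article; the helper names are ours) that implements the rotation operator R_j and tests bitonicity directly from Definition 1:

```python
# Sketch: a direct check of Definition 1.
# A sequence is bitonic iff some rotation of it is increasing-then-decreasing.

def rotate(a, j):
    """The rotation operator R_j: R_j a = (a_j, a_{j+1}, ..., a_{j+n-1}), indices mod n."""
    n = len(a)
    return [a[(j + i) % n] for i in range(n)]

def is_up_down(a):
    """True iff a monotonically increases and then monotonically decreases."""
    n = len(a)
    i = 0
    while i + 1 < n and a[i] <= a[i + 1]:
        i += 1
    while i + 1 < n and a[i] >= a[i + 1]:
        i += 1
    return i == n - 1

def is_bitonic(a):
    """True iff some cyclic shift of a is increasing-then-decreasing."""
    return any(is_up_down(rotate(a, j)) for j in range(len(a)))

print(is_bitonic([2, 5, 9, 6, 3, 1]))   # True: rises then falls
print(is_bitonic([6, 3, 1, 2, 5, 9]))   # True: a rotation of the above
print(is_bitonic([1, 4, 2, 5, 3, 6]))   # False: more than two inflection points
```

The quadratic check suffices for illustration; it mirrors the "circle with two inflection points" view directly.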


Adaptive Bitonic Sorting. Fig. 1  Three examples of sequences that are bitonic. Obviously, the mirrored sequences (either way) are bitonic, too


Adaptive Bitonic Sorting. Fig. 2  Left: according to their definition, bitonic sequences can be regarded as lying on a cylinder or as being arranged in a circle. As such, they consist of one monotonically increasing and one monotonically decreasing part. Middle: in this point of view, the network that performs the L and U operators (see Fig. 5) can be visualized as a wheel of "spokes." Right: visualization of the effect of the L and U operators; the blue plane represents the median


Definition 3 (Half-cleaner)  Let a = (a_0, ..., a_(n−1)). Define

  La = (min(a_0, a_(n/2)), ..., min(a_(n/2−1), a_(n−1))),
  Ua = (max(a_0, a_(n/2)), ..., max(a_(n/2−1), a_(n−1))).

In [], a network that performs these operations together is called a half-cleaner (see Fig. 5). Such networks are comprised of elementary comparator/exchange elements (see Fig. 3).

Adaptive Bitonic Sorting. Fig. 3  Comparator/exchange elements

Adaptive Bitonic Sorting. Fig. 4  A network that performs the rotation operator

Adaptive Bitonic Sorting. Fig. 5  A network that performs the L and U operators

It is easy to see that, for any j and a,

  La = R_(−j mod n/2) L R_j a,   (1)

and

  Ua = R_(−j mod n/2) U R_j a.   (2)

This is the reason why the cylinder metaphor is valid. The proof needs to consider only two cases: j = 0 and 0 < j < n. In the former case, Eq. (1) can be verified trivially. In the latter case, Eq. (1) becomes

  L R_j a = (min(a_j, a_(j+n/2)), ..., min(a_(n/2−1), a_(n−1)), ..., min(a_(j−1), a_(j−1+n/2))) = R_(j mod n/2) La.

Thus, with the cylinder metaphor, the L and U operators basically do the following: cut the cylinder with circumference n at any point, roll it around a cylinder with circumference n/2, and perform position-wise the min and max operators, respectively. Some examples are shown in Fig. 6.

The following theorem states some important properties of the L and U operators.

Theorem 1  Given a bitonic sequence a,

  max{La} ≤ min{Ua}.

Moreover, La and Ua are bitonic, too.

In other words, each element of La is less than or equal to each element of Ua. This theorem is the basis for the construction of the bitonic sorter []. The first step is to devise a bitonic merger (BM). We denote a BM that takes as input bitonic sequences of length n by BM_n. A BM is recursively defined as follows:

  BM_n(a) = (BM_(n/2)(La), BM_(n/2)(Ua)).

The base case is, of course, a two-key sequence, which is handled by a single comparator. A BM can easily be represented as a network, as shown in Fig. 7. Given a bitonic sequence a of length n, one can show that

  BM_n(a) = Sorted(a).   (3)

It should be obvious that the sorting direction can be changed simply by swapping the direction of the elementary comparators.

Coming back to the metaphor of the cylinder, the first stage of the bitonic merger in Fig. 7 can be visualized as n/2 comparators, each one connecting an element of the cylinder with the opposite one, somewhat like spokes in a wheel. Note that here, while the cylinder can rotate freely, the "spokes" must remain fixed.
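The half-cleaner and the recursive bitonic merger can be sketched as follows (a sequential Python illustration, not from the article; the function names are ours, and n is assumed to be a power of 2):

```python
# Sketch of the half-cleaner (L and U operators) and the recursive
# bitonic merger BM_n(a) = (BM_{n/2}(La), BM_{n/2}(Ua)).

def half_cleaner(a):
    """Return (La, Ua): position-wise min/max of a_i and a_{i+n/2}."""
    h = len(a) // 2
    la = [min(a[i], a[i + h]) for i in range(h)]
    ua = [max(a[i], a[i + h]) for i in range(h)]
    return la, ua

def bitonic_merge(a):
    """Sort a bitonic sequence with the data-independent (n/2) log n comparisons."""
    if len(a) == 1:
        return a
    la, ua = half_cleaner(a)     # Theorem 1: max(La) <= min(Ua), both bitonic
    return bitonic_merge(la) + bitonic_merge(ua)

b = [3, 7, 12, 14, 11, 8, 4, 1]  # bitonic: increasing then decreasing
print(bitonic_merge(b))          # [1, 3, 4, 7, 8, 11, 12, 14]
```

Note that the comparisons performed here do not depend on the data at all, which is exactly the redundancy that adaptive bitonic merging removes.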



Adaptive Bitonic Sorting. Fig. 6  Examples of the result of the L and U operators. Conceptually, these operators fold the bitonic sequence (black), such that the part from indices n/2 through n − 1 (light gray) is shifted into the range 0 through n/2 − 1; then, L and U yield the lower (dark gray) and upper (medium gray) hull, respectively


Adaptive Bitonic Sorting. Fig. 7  Schematic, recursive diagram of a network that performs bitonic merging

From a bitonic merger, it is straightforward to derive a bitonic sorter, BS_n, that takes an unsorted sequence and produces a sorted sequence, either up or down. Like the BM, it is defined recursively, consisting of two smaller bitonic sorters and a bitonic merger (see Fig. 8). Again, the base case is the two-key sequence.

Analysis of the Number of Operations of Bitonic Sorting

Since a bitonic sorter basically consists of a number of bitonic mergers, it suffices to look at the total number of comparisons of the latter. The total number of comparators, C(n), in the bitonic merger BM_n is given by

  C(n) = 2 C(n/2) + n/2, with C(2) = 1,

which amounts to

  C(n) = (n/2) log n.

As a consequence, the bitonic sorter consists of O(n log^2 n) comparators.

Clearly, there is some redundancy in such a network, since n − 1 comparisons are sufficient to merge two sorted sequences. The reason is that the comparisons performed by the bitonic merger are data-independent.

Derivation of Adaptive Bitonic Merging

The algorithm for adaptive bitonic sorting is based on the following theorem.

Theorem 2  Let a be a bitonic sequence. Then, there is an index q such that

  La = (a_q, ..., a_(q+n/2−1)),   (4)
  Ua = (a_(q+n/2), ..., a_(q+n−1)).   (5)

(Remember that index arithmetic is always modulo n.)



Adaptive Bitonic Sorting. Fig. 8  Schematic, recursive diagram of a bitonic sorting network
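The recursive structure of the bitonic sorter can be sketched in a few lines of sequential Python (ours, not from the article; n is assumed to be a power of 2): each half is sorted, one of them in the opposite direction, so that their concatenation is bitonic and can be handed to the bitonic merger.

```python
# Sketch of the bitonic sorter BS_n: sort each half in opposite directions,
# then merge the resulting bitonic sequence (nonadaptive merge shown here).

def half_cleaner(a):
    h = len(a) // 2
    return ([min(a[i], a[i + h]) for i in range(h)],
            [max(a[i], a[i + h]) for i in range(h)])

def bitonic_merge(a):
    if len(a) == 1:
        return a
    la, ua = half_cleaner(a)
    return bitonic_merge(la) + bitonic_merge(ua)

def bitonic_sort(a):
    """BS_n: recursively sort the halves, flip one, then bitonic-merge."""
    if len(a) == 1:
        return a
    h = len(a) // 2
    up = bitonic_sort(a[:h])
    down = bitonic_sort(a[h:])[::-1]   # reversing a sorted list flips its direction
    return bitonic_merge(up + down)

print(bitonic_sort([5, 1, 7, 3, 2, 8, 6, 4]))   # [1, 2, 3, 4, 5, 6, 7, 8]
```

The helpers are repeated here only to keep the snippet self-contained.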

The following outline of the proof assumes, for the sake of simplicity, that all elements in a are distinct. Let m be the median of all a_i, that is, n/2 elements of a are less than or equal to m, and n/2 elements are larger. Because of Theorem 1,

  max{La} ≤ m < min{Ua}.

Employing the cylinder metaphor again, the median m can be visualized as a horizontal plane z = m that cuts the cylinder. Since a is bitonic, this plane cuts the sequence in exactly two places, that is, it partitions the sequence into two contiguous halves. (Actually, any horizontal plane, i.e., any percentile, partitions a bitonic sequence into two contiguous halves.) Since m is the median, each half must have length n/2. The indices where the cut happens are q and q + n/2. Figure 9 shows an example (in one dimension).

Adaptive Bitonic Sorting. Fig. 9  Visualization for the proof of Theorem 2

The following theorem is the final keystone for the adaptive bitonic sorting algorithm.

Theorem 3  Any bitonic sequence a can be partitioned into four subsequences a^1, a^2, a^3, a^4 such that either

  (La, Ua) = (a^1, a^4, a^3, a^2)   (6)

or

  (La, Ua) = (a^3, a^2, a^1, a^4).   (7)

Furthermore,

  |a^1| + |a^2| = |a^3| + |a^4| = n/2,   (8)
  |a^1| = |a^3|,   (9)

and

  |a^2| = |a^4|,   (10)

where |a| denotes the length of sequence a.

Figure 10 illustrates this theorem by an example. This theorem can be proven fairly easily, too: the length of the subsequences is just q and n/2 − q, where q is the same index as in Theorem 2. Assuming that max{a^1} < m < min{a^3}, nothing will change between those two subsequences (see Fig. 9). However, in that case min{a^2} > m > max{a^4}; therefore, by swapping a^2 and a^4 (which have equal length), the bounds max{(a^1, a^4)} < m < min{(a^3, a^2)} are obtained. The other case can be handled analogously.
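Theorem 3 can be checked numerically. The following Python sketch (ours, not from the article) searches for the partition boundary |a^1| and reports which pair of subsequences was exchanged by the half-cleaner:

```python
# Numerical sketch of Theorem 3: the half-cleaner output (La, Ua) arises from a
# by exchanging two equal-length subsequences, either a2 with a4 or a1 with a3.

def half_cleaner(a):
    h = len(a) // 2
    return ([min(a[i], a[i + h]) for i in range(h)],
            [max(a[i], a[i + h]) for i in range(h)])

a = [4, 9, 15, 11, 7, 5, 2, 3]                  # bitonic
la, ua = half_cleaner(a)
n, h = len(a), len(a) // 2
for t in range(h + 1):                          # t = |a1| = |a3|
    a1, a2, a3, a4 = a[:t], a[t:h], a[h:h + t], a[h + t:]
    if (la, ua) == (a1 + a4, a3 + a2):
        print("a2 and a4 exchanged, |a1| =", t)
    if (la, ua) == (a3 + a2, a1 + a4):
        print("a1 and a3 exchanged, |a1| =", t)
```

For this example the search reports exactly one boundary, with a^2 and a^4 exchanged.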



Adaptive Bitonic Sorting. Fig. 10  Example illustrating Theorem 3

Remember that there are n/2 comparator-and-exchange elements, each of which compares a_i and a_(i+n/2). They will perform exactly this exchange of subsequences, without ever looking at the data.

Now, the idea of adaptive bitonic sorting is to find the subsequences, that is, to find the index q that marks the border between the subsequences. Once q is found, one can (conceptually) swap the subsequences, instead of performing comparisons unconditionally. Finding q can be done simply by binary search driven by comparisons of the form (a_i, a_(i+n/2)). Overall, instead of performing n/2 comparisons in the first stage of the bitonic merger (see Fig. 7), the adaptive bitonic merger performs log n comparisons in its first stage (although this stage is no longer representable by a network).

Let C(n) be the total number of comparisons performed by adaptive bitonic merging, in the worst case. Then

  C(n) = 2 C(n/2) + log n = sum over i = 0, ..., log n − 1 of 2^i log(n/2^i),

with C(2) = 1. This amounts to

  C(n) = 2n − log n − 2.

The only question that remains is how to achieve the data rearrangement, that is, the swapping of the subsequences a^2 and a^4, or a^1 and a^3, respectively, without sacrificing the worst-case performance of O(n). This can be done by storing the keys in a perfectly balanced tree (assuming n = 2^k), the so-called bitonic tree. (The tree can, of course, store only 2^k − 1 keys, so the n-th key is simply stored separately.) This tree is very similar to a search tree, which stores a monotonically increasing sequence: when traversed in-order, the bitonic tree produces a sequence that lists the keys such that there are exactly two inflection points (when regarded as a circular list).

Instead of actually copying elements of the sequence in order to achieve the exchange of subsequences, the adaptive bitonic merging algorithm swaps O(log n) pointers in the bitonic tree. The recursion then works on the two subtrees. With this technique, the overall number of operations of adaptive bitonic merging is O(n). Details can be found in [].

Clearly, the adaptive bitonic sorting algorithm needs O(n log n) operations in total, because it consists of log n many complete merge stages (see Fig. 8).
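The binary-search idea and the resulting comparison count can be illustrated with an array-based Python sketch (ours, not from the article). Note that an array version still moves O(n) elements per stage, which is precisely the copying that the bitonic tree avoids, so this sketch only demonstrates the comparison count:

```python
# Array-based sketch of adaptive bitonic merging: find the swap boundary by
# binary search instead of comparing all n/2 pairs unconditionally.

count = [0]  # total comparisons performed

def gt(a, i, j):
    count[0] += 1
    return a[i] > a[j]

def first_true(pred, lo, hi):
    """Smallest i in [lo, hi] with pred(i) True (pred monotone, pred(hi) True)."""
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

def adaptive_merge(a, lo=0, n=None):
    """Sort the bitonic block a[lo:lo+n] in place; O(log n) comparisons per stage."""
    if n is None:
        n = len(a)
    if n == 1:
        return
    h = n // 2
    # The predicate "a[i] > a[i+h]" flips at most once over i = 0..h-1;
    # comparing a[h-1] with a[n-1] distinguishes the two cases of Theorem 3.
    if gt(a, lo + h - 1, lo + n - 1):
        # swap region is a suffix of the comparator positions
        t = first_true(lambda i: gt(a, lo + i, lo + i + h), 0, h - 1)
        swap_range = range(t, h)
    else:
        # swap region is a (possibly empty) prefix
        t = first_true(lambda i: not gt(a, lo + i, lo + i + h), 0, h - 1)
        swap_range = range(0, t)
    for i in swap_range:
        a[lo + i], a[lo + i + h] = a[lo + i + h], a[lo + i]
    adaptive_merge(a, lo, h)       # La
    adaptive_merge(a, lo + h, h)   # Ua

a = [3, 7, 12, 14, 11, 8, 4, 1]
adaptive_merge(a)
print(a)          # [1, 3, 4, 7, 8, 11, 12, 14]
print(count[0])   # grows as ~2n, versus (n/2) log n = 12 for the network at n = 8
```

The counter illustrates the C(n) = 2C(n/2) + log n recurrence from the text; the gain over (n/2) log n becomes substantial for larger n.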


It should also be fairly obvious that the adaptive bitonic sorter performs an (adaptive) subset of the comparisons that are executed by the (nonadaptive) bitonic sorter.

The Parallel Algorithm

So far, the discussion assumed a sequential implementation. Obviously, the algorithm for adaptive bitonic merging can be implemented on a parallel architecture, just like the bitonic merger, by executing recursive calls on the same level in parallel.

Unfortunately, a naïve implementation would require O(log^2 n) steps in the worst case, since there are log n levels. The bitonic merger achieves O(log n) parallel time, because all pairwise comparisons within one stage can be performed in parallel. But this is not straightforward to achieve for the log n comparisons of the binary-search method in adaptive bitonic merging, which are inherently sequential.

However, a careful analysis of the data dependencies between comparisons of successive stages reveals that the execution of different stages can be partially overlapped []. As La and Ua are being constructed in one stage by moving down the tree in parallel, layer by layer (occasionally swapping pointers), this process can be started for the next stage, which begins one layer beneath the one where the previous stage began, before the first stage has finished, provided the first stage has progressed "far enough" in the tree. Here, "far enough" means exactly two layers ahead.

This leads to a parallel version of the adaptive bitonic merge that executes in time O(n/p) for p ∈ O(n / log n), that is, it can be executed in O(log n) parallel time. Furthermore, the data that needs to be communicated between processors (either via memory or via communication channels) is in O(p).

It is straightforward to apply the classical sorting-by-merging approach here (see Fig. 8), which yields the adaptive bitonic sorting algorithm. This can be implemented on an EREW machine with p processors in O(n log n / p) time, for p ∈ O(n / log n).

A GPU Implementation

Because adaptive bitonic sorting has excellent scalability (the number of processors, p, can go up to n / log n) and the amount of inter-process communication is fairly low (only O(p)), it is perfectly suitable for implementation on stream processing architectures. In addition, although it was designed for a random-access architecture, adaptive bitonic sorting can be adapted to a stream processor, which (in general) does not have the ability of random-access writes. Finally, it can be implemented on a GPU such that there are only O(log^2 n) passes (by utilizing O(n / log n) (conceptual) processors), which is very important, since the number of passes is one of the main limiting factors on GPUs.

This section provides more details on the implementation on a GPU, called "GPU-ABiSort" [, ].

Algorithm 1: Adaptive construction of La and Ua (one stage of adaptive bitonic merging)

  input:  Bitonic tree, with root node r and extra node e, representing bitonic sequence a
  output: La in the left subtree of r plus root r, and Ua in the right subtree of r plus extra node e

  // phase 0: determine case
  if value(r) < value(e) then
      case = 1
  else
      case = 2
      swap value(r) and value(e)
  (p, q) = (left(r), right(r))
  for i = 1, ..., log n − 1 do    // phase i
      test = (value(p) > value(q))
      if test == true then
          swap values of p and q
          if case == 1 then
              swap the pointers left(p) and left(q)
          else
              swap the pointers right(p) and right(q)
      if (case == 1 and test == false) or (case == 2 and test == true) then
          (p, q) = (left(p), left(q))
      else
          (p, q) = (right(p), right(q))
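Algorithm 1, together with the recursive merging it enables, can be exercised in a short sequential Python sketch (ours; the class and helper names are assumptions, n = 2^k keys, increasing sorting direction):

```python
# Sketch of the bitonic tree and one La/Ua construction stage per the
# case-based algorithm, plus the recursive merge built on top of it.

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def build_tree(seq):
    """Bitonic tree: seq[0:n-1] in a balanced tree (in-order), seq[n-1] in the extra node."""
    def build(lo, hi):
        if lo == hi:
            return None
        mid = (lo + hi) // 2
        return Node(seq[mid], build(lo, mid), build(mid + 1, hi))
    return build(0, len(seq) - 1), Node(seq[-1])

def construct_la_ua(r, e, height):
    """One stage: afterwards La is in left(r) plus r, Ua is in right(r) plus e."""
    if r.value < e.value:
        case = 1
    else:
        case = 2
        r.value, e.value = e.value, r.value
    p, q = r.left, r.right
    for _ in range(height):               # phases 1 .. log(n) - 1
        test = p.value > q.value
        if test:
            p.value, q.value = q.value, p.value
            if case == 1:
                p.left, q.left = q.left, p.left
            else:
                p.right, q.right = q.right, p.right
        if (case == 1 and not test) or (case == 2 and test):
            p, q = p.left, q.left
        else:
            p, q = p.right, q.right

def merge(r, e, height):
    """Recursive adaptive bitonic merging on the tree (sorts a bitonic sequence)."""
    if r is None:
        return
    construct_la_ua(r, e, height)
    merge(r.left, r, height - 1)
    merge(r.right, e, height - 1)

def in_order(r, out):
    if r:
        in_order(r.left, out)
        out.append(r.value)
        in_order(r.right, out)

a = [3, 7, 12, 14, 11, 8, 4, 1]           # bitonic input, n = 8
root, extra = build_tree(a)
merge(root, extra, 2)                     # height = log2(n) - 1
out = []
in_order(root, out)
print(out + [extra.value])                # [1, 3, 4, 7, 8, 11, 12, 14]
```

Only value and pointer swaps touch the tree, O(log n) of them per stage, which is the source of the O(n) merge bound.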


Algorithm :Mergingabitonicsequencetoobtaina Algorithm :Simpli8edadaptiveconstructionofLa sorted sequence and Ua input :Bitonictree,withrootnoder and extra input :Bitonictree,withrootnoder and extra node e,representingbitonicsequencea node e,representingbitonicsequencea output:Sortedtree(producessort(a) when output: La in the le:subtree of r plus root r,andUa traversed in-order) in the right subtree of r plus extra node e construct La and Ua in the bitonic tree by  // phase  call merging recursively with left(r) as root and r if value(r) value(e) then value(r) value(e) as extra node swap and > left(r) right(r) call merging recursively with right(r) as root and swap pointers and p q left(r) right(r) e as extra node ( , )=( , ) for i , . . . , log n  do // phase i = − if value(p) value(q) then  increasing sorting direction, and it is thus not explicitely swap value(p) and value(q)  speci8ed. As noted above, the sorting direction must swap pointers> left(p) and left(q)  be reversed in the right branch of the recursion in the ( p, q )=(right(p) , right(q) )  bitonic sorter, which basically amounts to reversing the else  comparison direction of the values of the keys, that is, ( p, q )=(left(p) , left(q) )  compare for instead of in .  As noted above, the bitonic tree stores the sequence < >  a,...,an  in in-order, and the key an  is stored in “payload” data anyway, so the additional memory over-   the extra node− .Asmentionedabove,analgorithmthat− ( )   constructs La, Ua from a can traverse this bitonic tree head incurred by the child pointers is at most a factor   and swap pointers as necessary. 6e index q,whichis ..) ( )  mentioned in the proof for 6eorem ,isonlydeter-  mined implicitly. 6e two di7erent cases that are men- Outline of the Implementation   tioned in 6eorem  and Eqs. 
 and  can be distin- As explained above, on each recursion level j   n  guished simply by comparing elements a   and an . , . . . , log n of the adaptive bitonic sorting algorithm, log n j  j  =  6is leads to .Notethattherootofthebitonic− −  bitonic trees, each consisting of  nodes,  ( ) log n j j  n − + −  tree stores element a   and the extra node stores an . have to be merged into  bitonic trees of  nodes. −  Applying this recursively− yields .Notethatthebitonic− 6e merge is performed in j stages. In each stage k   tree needs to be constructed only once at the beginning ,..., j , the construction of La and Ua is executed on  k log n j k =  during setup time.  subtrees. 6erefore,   instances of the La / Ua  −  Because branches are very costly on GPUs, one construction algorithm can− be executed in parallel dur-  ⋅  should avoid as many conditionals in the inner loops ing that stage. On a stream architecture, this potential   as possible. Here, one can exploit the fact that Rn a parallelism can be exposed by allocating a stream con-  log n j k  n n  a  ,...,an , a,...,a   is bitonic, provided " a is sisting of  elements and executing a so-called = − +  bitonic too.− 6is operation− basicallyCorrected amounts to swap- kernel on each Proof element.  ' (  ping the two pointers left(root) and right(root).6e 6e La / Ua construction algorithm consists of j k   simpli8ed construction of La and Ua is presented in . phases, where each phase reads and modi8es a pair  −  (Obviously, the simpli8ed algorithm now really needs of nodes, p, q ,ofabitonictree.Assumethataker-   trees with pointers, whereas Bilardi’s original bitonic nel implementation performs the operation of a single  ( )  tree could be implemented pointer-less (since it is a phase of this algorithm. (How such a kernel implemen-   complete tree). 
However, in a real-world implementa- tation is realized without random-access writes will be   tion, the keys to be sorted must carry pointers to some described below.) 6e temporary data that have to be  Encyclopedia of Parallel Computing “00101” — 2011/4/16 — 13:39 — Page 9 — #10


preserved from one phase of the algorithm to the next are just two node pointers (p and q) per kernel instance. Thus, each element of the allocated stream consists of exactly these two node pointers. When the kernel is invoked on that stream, each kernel instance reads a pair of node pointers, (p, q), from the stream, performs one phase of the La/Ua construction algorithm, and finally writes the updated pair of node pointers (p, q) back to the stream.

Complexity

A simple implementation on the GPU would need O(log^2 n) phases (or "passes" in GPU parlance) per merge, which amounts to O(log^3 n) passes in total for adaptive bitonic sorting. This is already very fast in practice. However, the optimal complexity of O(log^2 n) passes can be achieved exactly as described in the original work [], that is, phase i of a stage k can be executed immediately after phase i + 2 of stage k − 1 has finished. Therefore, the execution of a new stage can start at every other step of the algorithm. The only difference from the simple implementation is that kernels now must write to only parts of the output stream, because other parts are still in use.

Eliminating Random-Access Writes

Since GPUs do not support random-access writes (at least, for almost all practical purposes, random-access writes would kill any performance gained by the parallelism), the kernel has to be implemented so that it modifies node pairs (p, q) of the bitonic tree without random-access writes. This means that it can output node pairs from the kernel only via a linear stream write. But this way, it cannot write a modified node pair back to the original location from where it was read. In addition, it cannot simply take an input stream (containing a bitonic tree) and produce another output stream (containing the modified bitonic tree), because then it would have to process the nodes in the same order as they are stored in memory, whereas the adaptive bitonic merge processes them in a random, data-dependent order.

Fortunately, the bitonic tree is a linked data structure where all nodes are directly or indirectly linked to the root (except for the extra node). This allows us to change the location of nodes in memory during the merge algorithm, as long as the child pointers of their respective parent nodes are updated (and the root and extra node of the bitonic tree are kept at well-defined memory locations). This means that for each node that is modified, its parent node has to be modified as well, in order to update its child pointers.

Notice that Algorithm 3 basically traverses the bitonic tree down along a path, changing some of the nodes as necessary. The strategy is simple: output every node visited along this path to a stream. Since the data layout is fixed and predetermined, the kernel can store the index of the children with the node as it is being written to the output stream. One child address remains the same anyway, while the other is determined while the kernel is still executing for the current node. Figure 11 demonstrates the operation of the stream program using the described stream output technique.

GPU-Specific Details

For the input and output streams, it is best to apply the ping-pong technique commonly used in GPU programming: allocate two such streams, and alternatingly use one of them as the input and the other one as the output stream.

Preconditioning the Input

For merge-based sorting on a PRAM architecture (and assuming p < n), it is a common technique to sort, in a first step, p blocks of n/p values locally, that is, each processor sorts n/p values using a standard sequential algorithm. The same technique can be applied here by implementing such a local sort as a kernel program. However, since there is no random write access to non-temporary memory from a kernel, the number of values that can be sorted locally by a kernel is restricted by the number of temporary registers.

On recent GPUs, the maximum output data size of a kernel is fairly small. Since usually the input consists of key/pointer pairs, the method starts with a local sort of a small, fixed number of key/pointer pairs per kernel. For such small numbers of keys, a simple algorithm with asymptotic complexity O(n^2) performs faster than asymptotically optimal algorithms.

After the local sort, a further stream operation converts the resulting sorted subsequences pairwise to bitonic trees. Thereafter, the GPU-ABiSort approach can be applied as described above, starting at the corresponding recursion level j.




Adaptive Bitonic Sorting. Fig. 11  To execute several instances of the adaptive La/Ua construction algorithm in parallel, where each instance operates on a bitonic tree of 7 nodes, three phases are required. This figure illustrates the operation of these three phases. On the left, the node pointers contained in the input stream are shown, as well as the comparisons performed by the kernel program. On the right, the node pointers written to the output stream are shown, as well as the modifications of the child pointers and node values performed by the kernel program according to Algorithm 3

The Last Stage of Each Merge

Adaptive bitonic merging, being a recursive procedure, eventually merges small subsequences. For such small subsequences, it is better to use a (nonadaptive) bitonic merge implementation that can be executed in a single pass over the whole stream.

Timings

The following experiments were done on arrays consisting of key/pointer pairs, where the key is a uniformly distributed random 32-bit floating-point value and the pointer a 4-byte address. Since one can assume (without loss of generality) that all pointers in the given array are unique, these can be used as secondary sort keys for the adaptive bitonic merge.

The experiments described in the following compare the implementation of GPU-ABiSort of [, ] with sorting on the CPU using the C++ STL sort function (an optimized quicksort implementation), as well as with the (nonadaptive) bitonic sorting network implementation on the GPU by Govindaraju et al., called GPUSort [].

Contrary to the CPU STL sort, the timings of GPU-ABiSort do not depend very much on the data to be sorted, because the total number of comparisons performed by adaptive bitonic sorting is not data-dependent.


Adaptive Bitonic Sorting. Table 1  Timings of the three sorting implementations

  n         | CPU sort   | GPUSort | GPU-ABiSort
  32,768    | 9–11 ms    | 4 ms    | 5 ms
  65,536    | 19–24 ms   | 8 ms    | 8 ms
  131,072   | 46–52 ms   | 18 ms   | 16 ms
  262,144   | 98–109 ms  | 38 ms   | 31 ms
  524,288   | 203–226 ms | 80 ms   | 65 ms
  1,048,576 | 418–477 ms | 173 ms  | 135 ms

Adaptive Bitonic Sorting. Fig.  Timings on a GeForce system. (There are two curves for the CPU sort, so as to visualize that its running time is somewhat data-dependent)

 Table shows the results of timings performed on same GPU by a factor .,and itisafactorfaster than   a PCI Express bus PC system with an AMD Athlon- astandardCPUsortingimplementation(STL).    + CPU and an NVIDIA GeForce  GTX  GPU with  MB memory. Obviously, the speedup Related Entries   of GPU-ABiSort compared to CPU sorting is .–. !AKS Network   for n .Furthermore,uptothemaximumtested !Bitonic Sort   sequence length n  ( ,,), GPU-ABiSort is ≥ !Lock-Free Algorithms   up to .times faster than GPUSort, and this speedup is = = !Scalability   increasing with the sequence length n,asexpected. !Speedup   6e timings of the GPU approaches assume that the  input data is already stored in GPU memory. When  embedding the GPU-based sorting into an otherwise Bibliographic Notes and Further   purely CPU-based application, the input data has to be Reading   transferred from CPU to GPU memory, and a:erwards As mentioned in the introduction, this line of research   the output data has to be transferred back to CPU mem- began with the seminal work of Batcher []inthe  ory. However, the overhead of this transfer is usually late s, who described parallel sorting as a network.   negligible compared to the achieved sorting speedup: Research of parallel sorting algorithms was reinvigo-   according to measurements by [], the transfer of one rated in the s, where a number of theoretical ques-   million key/pointer pairs from CPU to GPU and back tions have been settled [, , , , , ].   takes in total roughly ms on a PCI Express bus PC. Another wave of research on parallel sorting ensued  from the advent of a7ordable, massively parallel archi-  tectures, namely, GPUs, which are, more precisely,  streaming architectures. 
6is spurred the development   Conclusion of a number of practical implementations [, –, ,   Adaptive bitonic sorting is not only appealing from a , ].   theoretical point of view, but also fromCorrected a practical one. Proof  Unlike other parallel sorting algorithms that exhibit   optimal asymptotic complexity too, adaptive bitonic Bibliography  sorting o7ers low hidden constants in its asymptotic . Ajtai M, Komlós J, Szemerédi J () An O(n log n) sorting net-  work. In: Proceedings of the 8:eenth annual ACM symposium   complexity and can be implemented on parallel archi- on theory of computing (STOC ’), New York, NY, pp –   tectures by a reasonably experienced programmer. 6e . Akl SG () Parallel sorting algorithms. Academic, Orlando, FL   practical implementation of it on a GPU outperforms . Azar Y, Vishkin U () Tight comparison bounds on the com-   the implementation of simple bitonic sorting on the plexity of parallel sorting. SIAM J Comput ():–  Encyclopedia of Parallel Computing “00101” — 2011/4/16 — 13:39 — Page 12 — #13


4. Batcher KE (1968) Sorting networks and their applications. In: Proceedings of the 1968 Spring joint computer conference (SJCC), Atlantic City, NJ, vol 32, pp 307–314
5. Bilardi G, Nicolau A (1989) Adaptive bitonic sorting: an optimal parallel algorithm for shared-memory machines. SIAM J Comput 18(2):216–228
6. Cole R (1988) Parallel merge sort. SIAM J Comput 17(4):770–785; see correction in SIAM J Comput 22
7. Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to algorithms, 3rd edn. MIT Press, Cambridge, MA
8. Gibbons A, Rytter W (1988) Efficient parallel algorithms. Cambridge University Press, Cambridge, England
9. Govindaraju NK, Gray J, Kumar R, Manocha D (2006) GPUTeraSort: high performance graphics coprocessor sorting for large database management. Technical Report MSR-TR-2005-183, Microsoft Research (MSR). In: Proceedings of the ACM SIGMOD conference, Chicago, IL
10. Govindaraju NK, Raghuvanshi N, Henson M, Manocha D (2005) A cache-efficient sorting algorithm for database and data mining computations using graphics processors. Technical report, University of North Carolina, Chapel Hill
11. Greß A, Zachmann G (2006) GPU-ABiSort: optimal parallel sorting on stream architectures. In: Proceedings of the 20th IEEE international parallel and distributed processing symposium (IPDPS), Rhodes Island, Greece
12. Greß A, Zachmann G (2006) GPU-ABiSort: optimal parallel sorting on stream architectures. Technical report, IfI, TU Clausthal, Computer Science Department, Clausthal-Zellerfeld, Germany
13. Kipfer P, Westermann R (2005) Improved GPU sorting. In: Pharr M (ed) GPU Gems 2: programming techniques for high-performance graphics and general-purpose computation. Addison-Wesley, Reading, MA
14. Leighton T (1984) Tight bounds on the complexity of parallel sorting. In: STOC '84: Proceedings of the sixteenth annual ACM symposium on theory of computing, ACM, New York, NY, USA
15. Natvig L (1990) Logarithmic time cost optimal parallel sorting is not yet fast in practice! In: Proceedings of Supercomputing '90, New York, NY
16. Purcell TJ, Donner C, Cammarano M, Jensen HW, Hanrahan P (2003) Photon mapping on programmable graphics hardware. In: Proceedings of the ACM SIGGRAPH/Eurographics conference on graphics hardware (EGGH '03), ACM, New York
17. Satish N, Harris M, Garland M (2009) Designing efficient sorting algorithms for manycore GPUs. In: Proceedings of the 2009 IEEE international symposium on parallel and distributed processing (IPDPS), IEEE Computer Society, Washington, DC, USA
18. Schnorr CP, Shamir A (1986) An optimal sorting algorithm for mesh connected computers. In: Proceedings of the eighteenth annual ACM symposium on theory of computing (STOC), ACM, New York, NY, USA
19. Sintorn E, Assarsson U (2008) Fast parallel GPU-sorting using a hybrid algorithm. J Parallel Distrib Comput 68(10):1381–1388