On the usage of Sorting Networks to Big Data

Blanca López and Nareli Cruz-Cortés
Artificial Intelligence Laboratory, Centro de Investigación en Computación, Instituto Politécnico Nacional (CIC-IPN), México D.F., México

Abstract— Sorting data is perhaps the most popular classical task in Computer Science. For the majority of applications, the main goal is to minimize the number of comparisons and the execution time that the sorting algorithm consumes. Sorting Networks are algorithms that perform exactly the same number of comparisons to order any input permutation of a given input data size; that is, no step depends on the result of a previous comparison. Thus, designing Sorting Networks with a minimal number of comparisons becomes a very important task. However, it is an NP-hard problem. Optimal Sorting Networks (or at least networks close to the optimal) are published in the specialized literature only for small input data sizes, from 3 to 16. These input sizes are far too small to be used in real-world problems. In this work we propose a new strategy to improve QuickSort performance on large input data by coupling it with Sorting Networks. The results demonstrate that it helps to reduce the sorting execution time.

Keywords: Sorting Networks, QuickSort

1. Introduction

Sorting algorithms are perhaps one of the most studied topics in Computer Science, from both the theoretical and the practical points of view. Applications can be found in Data Processing Systems, Network Communication Systems, Image Processing, Artificial Intelligence, Cryptography, Computer Security, and Information Systems, among many others. A large set of sorting algorithms can be found in the specialized literature, such as quicksort, merge sort, shell sort, insertion sort, shear sort, etc. Choosing the most efficient algorithm usually depends on the type of application at hand. In general, sorting algorithms can be classified into two groups: adaptive and non-adaptive. An adaptive algorithm executes its compare-interchange operations depending on the input data. On the other hand, non-adaptive algorithms have fixed operations which are executed no matter the configuration of the input data (that is, for any of the possible permutations). They always execute the same compare-interchange operations. Sorting Networks (SN) are an example of non-adaptive algorithms.

Taking advantage of the divide-and-conquer strategy utilized by QuickSort, a strategy is designed in which some SN are coupled to it in order to reduce the comparisons that QuickSort performs.

The remainder of this paper is organized as follows. In Section 2 some basic concepts about Quicksort and Sorting Networks are presented. In Section 3 the proposal is explained. Section 4 presents the experiments and results. Finally, in Section 5 some conclusions are drawn.

2. Basic Concepts

2.1 Quicksort Algorithm

Quicksort (also known as Partition-Exchange Sort) was first presented in 1960 by Tony Hoare [4]. It uses a divide-and-conquer strategy, dividing a large list into two smaller sublists: one with the smallest values and another with the greatest. Then, each sublist is recursively ordered. The algorithm is as follows:
1) Choose an element from the list, which will be called the pivot.
2) Order the list in such a way that all the values which are less than the pivot are located to its left (before the pivot), and all the values greater than the pivot are located to its right (after the pivot). This way, the pivot value is in its final position.
3) For each sublist, repeat the previous steps recursively until the sublist size is zero or one.

This idea is illustrated in Figure 1. QuickSort is a very efficient algorithm that in the average and best cases makes O(n log n) comparisons for sorting n elements. In the worst case it makes O(n^2) comparisons. Some variants of this algorithm have been presented in [6][3], where the authors proposed modifications to reduce the execution time.

Fig. 1 RECURSIVE PARTITION OPERATION OF THE QUICKSORT ALGORITHM. (Iteration 1: a pivot splits the list into two QS sublists; Iteration 2: each sublist is split again by its own pivot into QS sublists, and so on.)

2.2 Sorting Networks

SN are algorithms whose main feature is being oblivious; that is, their operations (comparisons) do not depend on the input data or on the previous comparisons [5][7]. Unlike other well-known sorting algorithms (bubble sort, quicksort, etc.), the sequence and number of comparisons are exactly the same no matter the input configuration (permutation). The SN exhibit two main features:
• The comparisons (called comparators) are fixed before the SN execution.
• Some comparisons can be executed in parallel.

A SN is composed of a set of comparators, each of which executes a compare-interchange action between two elements (a, b). The element a must not be greater than b; if it is, the values are interchanged to (b, a). So, for a given input list of size n, the set of comparators forming the SN is applied to it, and the output is the list in monotonically non-decreasing order.

Typically, SN are represented graphically by n horizontal lines representing the n input data, together with vertical lines that represent comparisons between the value at the top end and the value at the bottom end. If the value at the top is greater than the value at the bottom, these values must be swapped.

The input data are placed at the left; after they have traveled across the horizontal lines and executed the comparisons found, the output is obtained at the right. The data must be sorted in ascending order from top to bottom.

See for example the SN for n = 4 inputs illustrated in Figure 2. Each input value is placed on one of the horizontal lines labeled x0, x1, x2, x3. The vertical lines are the comparators c0, c1, c2, c3, c4, each receiving two values; e.g., the comparator c0 receives the values x0 and x1, and so on. All the data values travel from left to right, executing a compare-interchange each time a comparator is found. So, the comparators c0 and c1 are executed first, then c2 and c3, and finally c4. c0 evaluates 4 > 2, thus the values of x0 and x1 are swapped. c1 evaluates 1 < 3, so the values of x2 and x3 remain unchanged. This process continues until all the comparators are applied, so the final sorted list y0, y1, y2, y3 at the right satisfies y0 ≤ y1 ≤ y2 ≤ y3.

Fig. 2 SORTING NETWORK FOR n = 4 INPUTS. (Inputs x0 = 4, x1 = 2, x2 = 1, x3 = 3 at the left; comparators c0 to c4; outputs y0 = 1, y1 = 2, y2 = 3, y3 = 4 at the right.)

As a matter of fact, if an optimal SN for input size n can be designed (i.e., with a minimal number of comparators), then it is the best manner to sort data of that size. Designing SN with a minimal number of comparators and/or high parallelism is a classical and interesting problem in Computer Science, and it remains an open research area.

It is important to notice that optimal SN for input sizes greater than n = 16 are not known. Only lower bounds on the number of comparators are theoretically known [5]. The most studied SN is the one with input size n = 16, which is a relatively small value considering the huge quantity of information that modern systems must handle. The best known SN for n = 16 has only 60 comparators; for example, the one designed by Green [5] is illustrated in Figure 3.

Fig. 3 SN WITH INPUT SIZE n = 16 DESIGNED BY GREEN. IT IS THE BEST KNOWN, WITH 60 COMPARATORS.

In [2] K. E. Batcher proposed an interesting algorithm called Odd-Even Merge to merge two SN into one. That is, if we have a SN with input size n, it is possible to obtain a SN with input size 2n by merging two copies of the original SN of size n each. By following this algorithm it is possible to obtain SN with larger input sizes (SN for input sizes greater than n = 16 are usually considered large).

As an example, consider doubling the input size to 2n starting from a SN for n = 4. Two ordered input lists, of sizes g and h, are considered; they are shown in Figure 4. The list "t" contains the numbers {t_1, t_2, ..., t_g} in order, and the second list, "w", is composed of {w_1, w_2, ..., w_h}. The output of the merging network has g + h elements: the numbers of the merged lists in ascending order, {u_1, ..., u_{g+h-1}, u_{g+h}}. This output is built by one merging network applied to the odd-indexed numbers of the two input lists and another applied to the even-indexed numbers of the two input lists. The lowest output of the odd merge is left alone and becomes the lowest number of the final list. The steps to merge the two lists are:
1) Merge the odd-indexed keys of "t", {t_1, t_3, t_5, ...}, with the odd-indexed keys of "w", {w_1, w_3, w_5, ...}, to form the ordered sequence {x_1, x_2, x_3, ...}.
2) Merge the even-indexed keys of "t", {t_2, t_4, t_6, ...}, with the even-indexed keys of "w", {w_2, w_4, w_6, ...}, to form the ordered sequence {y_1, y_2, y_3, ...}.
3) Then set u_1 = x_1 and use parallel comparators to set u_{2i} = min(x_{i+1}, y_i) and u_{2i+1} = max(x_{i+1}, y_i) for i = 1, 2, 3, ...

Steps 1 and 2 can be applied in parallel, and each of them involves about (g + h)/2 keys. More detailed information about the underlying theorem can be found in [2][5][1][8].

Fig. 4 ODD-EVEN MERGING SCHEME. (Inputs t_1, ..., t_g and w_1, ..., w_h enter an odd merge block and an even merge block; outputs u_1, ..., u_{g+h}.)

Regarding the construction of a SN for n = 8 from two SN for n = 4 with the Odd-Even Merge method, Figure 5 exhibits the complete set of comparators. It has 19 comparators in 6 steps or "layers". Steps 1 to 3 arrange the two lists of n = 4, both in non-decreasing order. In step 4, four comparators are used to re-arrange (merge) the 8 elements; then two comparators merge the four ordered lists (step 5); and finally, in step 6, three comparators merge the ordered sequences to form one ordered list containing all 8 elements.

Fig. 5 ODD-EVEN MERGESORT SCHEME. A SN FOR n = 8 INPUTS IS CONSTRUCTED FROM TWO SN FOR n = 4. (Steps 1 to 6.)

Fig. 6 ODD-EVEN MERGESORT SCHEME FOR ORDERING. A SN FOR n = 8 INPUTS IS CONSTRUCTED FROM TWO SN FOR n = 4. (Two 4-input SN with comparators C0 to C4 on inputs x1, x2, x3, ... and y1, y2, y3, ....)

Nevertheless, this method works better for small input sizes, and its performance decreases as the input size increases; i.e., for large input sizes the resulting SN has more comparators than the optimal.

Notice that for a given input data size n the corresponding SN has a fixed set of comparators, which means that a new SN must be designed if the input data size is different from n. In this sense the SN are less flexible than the conventional adaptive sorting algorithms.
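The walk-through of the 4-input SN in Section 2.2 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `SN4_LAYERS`, `apply_network` and `sorts_all_binary_inputs` are hypothetical, and the wiring of c2, c3 and c4 is an assumption (the usual 5-comparator network), since the text only states explicitly which lines c0 and c1 connect. The second function relies on the zero-one principle described by Knuth [5]: a comparator network that sorts every input of zeros and ones sorts every input.

```python
from itertools import product

# Comparators grouped in layers whose members could run in parallel.
# c0 and c1 follow the text; the wiring of c2, c3, c4 is an assumption.
SN4_LAYERS = [
    [(0, 1), (2, 3)],  # c0, c1
    [(0, 2), (1, 3)],  # c2, c3
    [(1, 2)],          # c4
]


def apply_network(layers, values):
    """Apply every comparator in order; return (sorted copy, comparisons made)."""
    data = list(values)
    comparisons = 0
    for layer in layers:
        for top, bottom in layer:
            comparisons += 1
            # Compare-interchange: the smaller value must end on the top line.
            if data[top] > data[bottom]:
                data[top], data[bottom] = data[bottom], data[top]
    return data, comparisons


def sorts_all_binary_inputs(layers, n):
    """Zero-one principle check over all 2**n inputs made of 0s and 1s."""
    return all(
        apply_network(layers, bits)[0] == sorted(bits)
        for bits in product((0, 1), repeat=n)
    )


result, count = apply_network(SN4_LAYERS, [4, 2, 1, 3])
print(result, count)                           # [1, 2, 3, 4] 5
print(sorts_all_binary_inputs(SN4_LAYERS, 4))  # True
```

The comparison count is 5 for any input of size 4, which is the oblivious behaviour that distinguishes a SN from an adaptive algorithm.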

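The Odd-Even Merge construction of Section 2.2 can also be sketched as a comparator generator. The code below is a commonly used recursive formulation of Batcher's method, not code taken from [2]; the function names are hypothetical and the number of inputs is assumed to be a power of two. Lines are numbered from 0, so the "odd-indexed" keys of the text correspond to even positions here.

```python
from itertools import product


def odd_even_merge(lo, hi, r):
    """Yield comparators that merge the two sorted halves of lines lo..hi (inclusive)."""
    step = 2 * r
    if step < hi - lo:
        yield from odd_even_merge(lo, hi, step)      # keys t_1, t_3, ... with w_1, w_3, ...
        yield from odd_even_merge(lo + r, hi, step)  # keys t_2, t_4, ... with w_2, w_4, ...
        for i in range(lo + r, hi - r, step):        # final min/max comparators
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def odd_even_merge_sort(lo, hi):
    """Yield comparators that sort lines lo..hi (inclusive, power-of-two count)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from odd_even_merge_sort(lo, mid)
        yield from odd_even_merge_sort(mid + 1, hi)
        yield from odd_even_merge(lo, hi, 1)


def depth(comparators, n):
    """Number of layers when each comparator is scheduled as early as possible."""
    ready = [0] * n
    for a, b in comparators:
        ready[a] = ready[b] = max(ready[a], ready[b]) + 1
    return max(ready, default=0)


def run(comparators, values):
    """Apply the comparators in order and return the resulting list."""
    data = list(values)
    for a, b in comparators:
        if data[a] > data[b]:
            data[a], data[b] = data[b], data[a]
    return data


net8 = list(odd_even_merge_sort(0, 7))
net16 = list(odd_even_merge_sort(0, 15))
print(len(net8), depth(net8, 8))  # 19 6
print(len(net16))                 # 63
print(all(run(net8, bits) == sorted(bits) for bits in product((0, 1), repeat=8)))  # True
```

For 8 inputs the sketch reproduces the 19 comparators in 6 layers mentioned in the text. For 16 inputs it yields 63 comparators, three more than the 60 of Green's network, which illustrates why the merged networks drift away from the optimal as the input size grows.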
3. Proposal

Considering that QuickSort is a very efficient algorithm based on a divide-and-conquer strategy, and that optimal SN are only known for small input data sizes, this proposal consists of coupling a SN with the QuickSort algorithm to order big input data in an efficient manner. The algorithm will be named Quick+SN.

The general idea is to apply QuickSort in the conventional way to the input data as many times as necessary until the sublists are small enough to be sorted by a given SN of input size n. This idea is illustrated in Figure 7. With this small change in QuickSort it is possible to improve its execution time while maintaining its flexibility, in the sense that the algorithm is able to order input data of any size.

Fig. 7 SCHEME OF THE PROPOSED QUICK+SN ALGORITHM. (Iteration 1: a pivot splits the list into two QS sublists; Iteration 2: each sublist is split by its own pivot, and the resulting sublists are handled by SN or by QS, and so on.)

Let us suppose that we have selected a particular SN to work with an input size equal to n = 16 (which is the greatest size for which a near-optimal SN is known). For a given input data A of size Z to be ordered (where Z is a large number), the algorithm Quick+SN is defined as follows:

1) Divide the array A into two parts (a_l and a_r) by selecting the element in the middle position as the pivot, denoted by m.
2) Compare each element of a_l and a_r against the pivot m, and move all the elements less than m to the left array a_l and all the elements greater than m to the right array a_r.
3) Verify whether the resulting lists are of size n:
• If so, the sublist is sorted by the SN.
• If not, go to step 1 recursively.

The output of this algorithm is the list A completely ordered.

Notice that Quick+SN works by applying a specific SN for a given input size n. Therefore, the SN can be applied to a resulting list only if it has exactly n elements. Each time QuickSort splits a list into two, it is not possible to know the resulting sizes beforehand. Hence, it is not possible to know in advance the number of times that the SN will be applied.

4. Experimental design

In order to assess the efficiency of the proposed algorithm, a set of experiments was conducted. Quick+SN was applied to data lists (numbers) with different input sizes and input permutations. Additionally, different SN were used. The configuration for each experiment is a combination of the following options:
• Input data size: Two lists composed of 1,000,000 and 10,000,000 numbers to be sorted were used as input.
• Input permutations: Three different configurations of the input lists were used: randomly generated numbers, numbers which are already ordered, and inversely ordered numbers.
• Two different SN were utilized, for n = 16 and n = 256.

Therefore, we have twelve experiment sets with different configurations. Each set was executed 30 times. The time and the number of comparisons performed by Quick+SN on each set were recorded. The statistical results for the time are shown in Tables 1 and 2, and those for the number of comparisons are shown in Tables 3 and 4. In all of them, the first column is the input configuration, the second column shows the size of the input data, and the next column shows the time consumed (in seconds) by Quick+SN or, in the other case, the number of comparisons performed. In the last column the original QuickSort results are shown. Tables 1 and 3 present the time and the comparisons performed with a SN for n = 16. The SN with n = 256 was designed by combining copies of Green's SN (shown in Figure 3) with the Odd-Even Merge algorithm mentioned in Section 2.2. Its performance is presented in Tables 2 and 4.

It can be observed that, for randomly ordered input data, Quick+SN with n = 16 outperforms the original QuickSort for the input of 10,000,000 numbers, whereas for 1,000,000 numbers the original QuickSort is slightly faster on average (Table 1). For the cases where the input data was already sorted or inverted, QuickSort obtained better results than Quick+SN with n = 16. In contrast, for all the configurations and input data sizes, Quick+SN with n = 256 obtained the best results, with the minimal execution times.

Table 1 STATISTICAL RESULTS (IN SECONDS) OF QUICK+SN AND QUICKSORT. A SN FOR n = 16 WAS UTILIZED.

Input configuration  Input size Z  Statistic  QS+SN with n = 16  Original QuickSort
Random               1,000,000     Average    0.129118           0.124313
                                   Median     0.114469           0.112387
                                   Best       0.112189           0.10922
                                   Worst      0.239472           0.226902
Random               10,000,000    Average    1.123340           1.132025
                                   Median     1.120732           1.131792
                                   Best       1.119377           1.123426
                                   Worst      1.149899           1.15022
Ordered              1,000,000     Average    0.111787           0.111292
                                   Median     0.109365           0.109441
                                   Best       0.109172           0.109157
                                   Worst      0.130594           0.126134
Ordered              10,000,000    Average    1.121114           1.116137
                                   Median     1.119669           1.116119
                                   Best       1.119038           1.116025
                                   Worst      1.135499           1.11630
Inverse              1,000,000     Average    0.100668           0.097639
                                   Median     0.100208           0.097374
                                   Best       0.099006           0.097194
                                   Worst      0.101557           0.102348
Inverse              10,000,000    Average    1.136367           1.127812
                                   Median     1.13466            1.125864
                                   Best       1.134569           1.121864
                                   Worst      1.18243            1.171423

Table 3 STATISTICAL RESULTS (IN COMPARISONS PERFORMED) OF QUICK+SN AND QUICKSORT. A SN FOR n = 16 WAS UTILIZED.

Input configuration  Input size Z  Statistic  QS+SN with n = 16  Original QuickSort
Random               1,000,000     Average    21123958.73        20962847.72
                                   Median     20113717           20095717
                                   Best       20049370           20049190
                                   Worst      29118486           28203428
Random               10,000,000    Average    236476813.09       236476536.64
                                   Median     236477269          236476378
                                   Best       236476282          236476088
                                   Worst      236477269          236476983
Ordered              1,000,000     Average    20200661.81        20152879.55
                                   Median     20049710           20049230
                                   Best       20049523           20049230
                                   Worst      21375129           20945874
Ordered              10,000,000    Average    236476813.09       236476536.64
                                   Median     236476728          236476378
                                   Best       236476282          236475960
                                   Worst      236477269          236476983
Inverse              1,000,000     Average    20200661.82        20152879.55
                                   Median     20049710           20049283
                                   Best       20049523           20049230
                                   Worst      21375129           20049230
Inverse              10,000,000    Average    236476813.09       236476536.67
                                   Median     236476728          236476378
                                   Best       236476282          236476307
                                   Worst      236477269          236476983
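To make the measured quantities concrete, the Quick+SN procedure of Section 3 and the three input configurations of Section 4 can be mimicked at small scale. The sketch below is not the authors' implementation. It makes several assumptions: the 4-input network of Figure 2 stands in for the 16- and 256-input SN, keys equal to the pivot are kept in a separate middle group, sublists of size zero or one end the recursion, and every pivot test counts as one comparison. All names (`quick_sn`, `make_inputs`, `SN4`) are hypothetical.

```python
import random

# Stand-in network: the 4-input SN of Figure 2 (wiring of c2..c4 assumed).
SN4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def quick_sn(values, stats, network=(), n=None):
    """QuickSort partitioning; sublists of exactly n items are sorted by the SN.

    With n=None the SN is never used, which gives the plain QuickSort baseline.
    """
    if len(values) <= 1:
        return list(values)
    if len(values) == n:
        data = list(values)
        for top, bottom in network:
            stats["comparisons"] += 1
            if data[top] > data[bottom]:
                data[top], data[bottom] = data[bottom], data[top]
        stats["sn_calls"] += 1
        return data
    pivot = values[len(values) // 2]     # element in the middle position
    left = [v for v in values if v < pivot]
    middle = [v for v in values if v == pivot]
    right = [v for v in values if v > pivot]
    stats["comparisons"] += len(values)  # one pivot test per element (assumption)
    return (quick_sn(left, stats, network, n) + middle
            + quick_sn(right, stats, network, n))


def make_inputs(size, seed=0):
    """Random, already ordered and inversely ordered lists of the same keys."""
    ordered = list(range(size))
    return {
        "Random": random.Random(seed).sample(ordered, size),
        "Ordered": ordered,
        "Inverse": ordered[::-1],
    }


for name, values in make_inputs(2000).items():
    hybrid = {"comparisons": 0, "sn_calls": 0}
    plain = {"comparisons": 0, "sn_calls": 0}
    assert quick_sn(values, hybrid, SN4, 4) == sorted(values)
    assert quick_sn(values, plain) == sorted(values)
    print(name, hybrid["comparisons"], hybrid["sn_calls"], plain["comparisons"])
```

The `sn_calls` counter reflects the remark in Section 3 that the number of SN applications cannot be known in advance, because it depends on how often the partitioning happens to produce a sublist of exactly n elements.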

Table 2 STATISTICAL RESULTS (IN SECONDS) OF QUICK+SN AND QUICKSORT. A SN FOR n = 256 WAS UTILIZED.

Input configuration  Input size Z  Statistic  QS+SN with n = 256  Original QuickSort
Random               1,000,000     Average    0.1110462           0.124313
                                   Median     0.108356            0.112387
                                   Best       0.107208            0.10922
                                   Worst      0.124825            0.226902
Random               10,000,000    Average    1.056246            1.132025
                                   Median     1.055244            1.131792
                                   Best       1.052281            1.123426
                                   Worst      1.074649            1.15022
Ordered              1,000,000     Average    0.109837            0.111292
                                   Median     0.10799             0.109441
                                   Best       0.107806            0.109157
                                   Worst      0.124511            0.126134
Ordered              10,000,000    Average    1.050993            1.116137
                                   Median     1.05931             1.116119
                                   Best       1.050836            1.116025
                                   Worst      1.052154            1.11630
Inverse              1,000,000     Average    0.092927            0.097639
                                   Median     0.09282             0.097374
                                   Best       0.092252            0.097194
                                   Worst      0.093349            0.102348
Inverse              10,000,000    Average    1.050894            1.127812
                                   Median     1.050839            1.125864
                                   Best       1.050763            1.121864
                                   Worst      1.051983            1.171423

Table 4 STATISTICAL RESULTS (IN COMPARISONS PERFORMED) OF QUICK+SN AND QUICKSORT. A SN FOR n = 256 WAS UTILIZED.

Input configuration  Input size Z  Statistic  QS+SN with n = 256  Original QuickSort
Random               1,000,000     Average    20238789.7          20962847.72
                                   Median     20095717            20095717
                                   Best       20019190            20049190
                                   Worst      28274331            28203428
Random               10,000,000    Average    236476741.45        236476536.64
                                   Median     236476378           236476378
                                   Best       236475960           236476088
                                   Worst      236479236           236476983
Ordered              1,000,000     Average    20132879.55         20152879.55
                                   Median     20049283            20049230
                                   Best       20049230            20049230
                                   Worst      20199704            20945874
Ordered              10,000,000    Average    236476541.45        236476536.64
                                   Median     236475960           236476378
                                   Best       236475960           236475960
                                   Worst      236479236           236476983
Inverse              1,000,000     Average    20148658.28         20152879.55
                                   Median     20049283            20049283
                                   Best       20049230            20049230
                                   Worst      20945874            20049230
Inverse              10,000,000    Average    236477000.60        236476569.67
                                   Median     236476378           236476378
                                   Best       236475960           236476378
                                   Worst      236479236           236476983

5. Conclusions

Taking advantage of their inherent features, a combination of the well-known QuickSort algorithm with Sorting Networks was proposed. The experimental results showed that the proposal is competitive, obtaining better execution times than the original QuickSort. It was also noticed that the execution times improved when SN for larger input data sizes were utilized. It is necessary to experiment with larger input data, and also with SN of different sizes. Further, a formal study of the algorithm's complexity is necessary.

6. Acknowledgments

The authors acknowledge support from CONACyT through projects number 180421 and 132073. The first author acknowledges support from CONACyT through a scholarship to pursue graduate studies at CIC-IPN.

References

[1] S. W. Baddar and K. E. Batcher. Designing Sorting Networks: A New Paradigm. Springer, 2011.
[2] K. E. Batcher. Sorting networks and their applications. In Proceedings of the April 30–May 2, 1968, Spring Joint Computer Conference, AFIPS '68 (Spring), pages 307–314, New York, NY, USA, 1968. ACM.
[3] D. Cantone and G. Cincotti. Quickheapsort, an efficient mix of classical sorting algorithms. In G. Gambosi, G. C. Bongiovanni, and R. Petreschi, editors, CIAC, volume 1767 of Lecture Notes in Computer Science, pages 150–162. Springer, 2000.
[4] C. A. R. Hoare. Algorithm 64: Quicksort. Commun. ACM, 4(7):321–, July 1961.
[5] D. E. Knuth. The Art of Computer Programming, Volume III: Sorting and Searching, 2nd Edition. Addison-Wesley, 1998.
[6] U. Kocamaz. Increasing the efficiency of quicksort using a neural network based algorithm selection model. Inf. Sci., 229:94–105, April 2013.
[7] D. G. O'Connor and R. J. Nelson. Sorting system with n-line sorting switch. United States Patent number 3,029,413, April 1962.
[8] Kenneth E. Batcher Sherenaz W. Al-Haj Baddar.