Utilizing the Linear Diophantine Problem of Frobenius for a Faster Shellsort Sequence

Dr. Bharti Temkin
Maximilian Berger

November 27, 2004

Abstract

TBD: Abstract of the paper

1 Introduction

TBD

2 The linear Diophantine problem of Frobenius

The linear Diophantine problem of Frobenius is equivalent to the coin exchange problem: given positive integers $x_1, \dots, x_n > 1$ that are relatively prime, what is the largest integer that cannot be represented as a linear combination of them with non-negative integer coefficients?

This problem has not yet been solved in general. However, [5] has solved it for the case $n = 2$:

Theorem 1. Assume the two positive integers $x_1$ and $x_2$ are relatively prime. Then every integer $x > x_1 x_2 - x_1 - x_2$ can be represented as a linear combination of $x_1$ and $x_2$, $x = c_2 x_2 + c_1 x_1$, with non-negative integer coefficients $c_1, c_2$. [5]

What about the integers $1 \le x \le (x_1 - 1)(x_2 - 1)$? Some of them can still be represented with non-negative integer coefficients, but some cannot. Those that cannot can be represented with one negative coefficient:

Theorem 2. Assume the two positive integers $x_1$ and $x_2$ are relatively prime. Then every positive integer that cannot be represented with non-negative integer coefficients $c_1, c_2$ as a linear combination $x = c_2 x_2 + c_1 x_1$ can be represented with one negative coefficient $-c_1'$, where $c_1' > 0$: $x = c_2 x_2 - c_1' x_1$.

TBD: Where is this proven?

We may safely limit the coefficient $c_2$ to the range $[0, x_1 - 1]$. Should $c_2$ be greater, we can reduce it by $x_1$ and increase $c_1$ by $x_2$ instead.

Having done this, we can arrange the integers in a table, using $c_1$ as the x-axis and $c_2$ as the y-axis, as shown in Figure 1.

c2 \ c1   -5  -4  -3  -2  -1 |   0   1   2   3
   4       3   8  13  18  23 |  28  33  38  43  ...
   3           1   6  11  16 |  21  26  31  36  ...
   2                   4   9 |  14  19  24  29  ...
   1                       2 |   7  12  17  22  ...
   0                         |   0   5  10  15  ...

Figure 1: Tabular display of the integers $x = c_2 x_2 + c_1 x_1$ for $x_1 = 5$ and $x_2 = 7$. Entries to the left of the column $c_1 = 0$ need a negative $c_1$.

This representation gives us another way to prove Theorem 2. Since $x_1$ and $x_2$ are positive, the entries grow along both axes: the entry at column $a$, row $b$ is larger than any entry in the same row at a column $i < a$, and larger than any entry in the same column at a row $j < b$.

The largest number with a negative coefficient $c_1$ must therefore be the one that has the largest possible $c_2$ and the largest possible $c_1$.

The largest possible $c_2$ is $x_1 - 1$ (as we defined earlier). The largest possible negative $c_1$ is $-1$. Therefore, the largest integer that cannot be represented is $x = (x_1 - 1) x_2 - 1 \cdot x_1$. This is equivalent to $x_1 x_2 - x_1 - x_2$. For $x_1 = 5$ and $x_2 = 7$ this gives 23, the entry at $c_1 = -1$, $c_2 = 4$ in Figure 1.

The next question is: how many integers $x \le (x_1 - 1)(x_2 - 1)$ cannot be represented with non-negative coefficients?

In the tabular representation, this question asks how many positive entries lie to the left of the column $c_1 = 0$.

For $c_2 = 0$ this is 0, since $0 \cdot x_2$ is always 0 and there are no positive integers less than 0.

For any $c_2$ this is $\lfloor c_2 x_2 / x_1 \rfloor$. In particular, for $c_2 = x_1 - 1$ this is $\lfloor (x_1 - 1) x_2 / x_1 \rfloor$.

If we sum these up for $c_2 = 0, \dots, x_1 - 1$ we get $\frac{(x_1 - 1)(x_2 - 1)}{2}$.

TBD: How?????

Theorem 3. Assume the two positive integers $x_1$ and $x_2$ are relatively prime. Then $\frac{(x_1 - 1)(x_2 - 1)}{2}$ integers cannot be represented as a linear combination of $x_1$ and $x_2$, $x = c_2 x_2 + c_1 x_1$, with non-negative integer coefficients $c_1, c_2$.

Unfortunately, the current research only gives good explanations for two relatively prime numbers. There are several papers that try to find upper and lower bounds for the general case. So far, no general formula has been found. ??

3 Shellsort

Shellsort, as suggested in TBD, is a repeated version of k-insertion sort. Insertion sort sorts a list by inserting every element into an already sorted list. k-insertion sort does the same for every $k$th element of a list, so that each subsequence of elements $k$ positions apart becomes sorted.

Definition 1. A list of $n$ elements $e_i$ is said to be k-ordered if $e_i \le e_{i + ck}$ for all $c, i \in \mathbb{N}^+$ with $i + ck \le n$, where $k \in \mathbb{N}^+$.

Shellsort calls k-insertion sort with decreasing values of $k$. To ensure that the list is sorted, the last step is $k = 1$.

4 Shellsort performance with relatively prime numbers

We will assume that $k, l \in \mathbb{N}^+$ are two relatively prime numbers. If we sort a list of elements $e_i$ with a k- and an l-insertion sort, we get a k,l-ordered list with the properties

$e_i \le e_{i + c_1 k}$

and

$e_j \le e_{j + c_2 l}$.

If we set $j = i + c_1 k$ we can combine these two:

$e_i \le e_{i + c_1 k + c_2 l}$

From Theorem 1 we know that $c_1 k + c_2 l$ can represent every integer $x \ge (k - 1)(l - 1)$, thus

$e_i \le e_{i + (k - 1)(l - 1) + 1}$

which leads us to the following

Theorem 4. Every element in a k,l-ordered list is less than $(k - 1)(l - 1)$ indexes away from its sorted position in a 1-sorted list, if $1 \le k < l$ and $k$ and $l$ are relatively prime.

Also, if we look at Theorem 3 we can see that

Theorem 5. Every element in a k,l-sorted list has at most $\frac{(k - 1)(l - 1)}{2}$ elements appearing before it that should appear later, if $1 \le k < l$ and $k$ and $l$ are relatively prime.

5 Shellsort performance with non relatively prime numbers

If we take two numbers $k, l$ that are not relatively prime, only multiples of the common factor $\gcd(k, l)$ can be represented in terms of $k$ and $l$.

Applied to Shellsort, this means that until the very last step there will always be values that may need to move through the whole list. Shellsort's strength, however, lies in eliminating those elements, so we arrive at the following

Assumption 1. For a Shellsort sequence to be effective, all the numbers have to be relatively prime.

6 Growth of a Shellsort sequence

The growth of the sequence is another important factor. If the sequence grows too slowly, too many Shellsort passes will be made, leading to unnecessary comparisons, which take too much time.

If the sequence grows too fast, the advantages of Shellsort are gone, and the behaviour gets closer and closer to that of straight insertion sort.

If we look at effective sequences that can be found in the literature, most of them grow approximately by a factor of 2. No rule for perfect Shell sequence growth has been found, but this seems to be good.

Assumption 2. The Shell sequence may grow neither too fast nor too slowly. A factor close to 2 seems to give the best results.

7 Constructing a Shell sequence

We now try to construct a Shell sequence based on Assumptions 1 and 2.

To satisfy Assumption 2 we construct an ideal sequence $f(n)$ with the following properties:

$f(1) = 1$  (1)

$f(n) = c \cdot f(n - 1)$, where $c \in \mathbb{R}$, $c > 1$  (2)

To satisfy Assumption 1 we define the sequence $s(n)$ as follows: $s(n)$ is the smallest integer greater than $f(n)$ that is relatively prime to all of $s(2), \dots, s(n - 1)$.

8 Finding the best growth factor

To find the best growth factor, we use the above method to construct Shell sequences. We then apply these to sort arrays of different sizes that contain random data.

For $c$ we used the range 1.50 to 3.00 in increments of 0.05. For the array size, we used $10^3$, $10^4$, $10^5$ and $10^6$. Each sort was done 5 times to assure accuracy. The results can be found in Figures 2, 3 and 4.

As you can see from Figure 4, the faster the sequence grows, the less overhead is involved, and the actual algorithm runs faster. This is, however, due to the fact that we were doing integer comparisons, which are very fast on modern computers, so the management overhead dominates the runtime.

If comparisons dominated the runtime, we would need to minimize them. If we look at Figure 2 we see multiple minima, at $c = 2.2$, $2.35$, $2.45$ and $2.55$. We will therefore examine the range 2.1 to 2.6 more closely, in steps of 0.01. The results can be found in Figure 5.
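The construction from Section 7 and the measurement from Section 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `shell_sequence` and `shellsort` are hypothetical, $s(1) = 1$ is assumed (the text only fixes $f(1) = 1$), "greater than $f(n)$" is read as strictly greater, and one comparison is counted per test of two list elements, which may differ from how the figures were measured.

```python
import math
import random


def shell_sequence(c, limit):
    """Gaps s(1), s(2), ... below `limit` for a growth factor c > 1."""
    gaps = [1]   # s(1) = 1 is an assumption; the text only gives f(1) = 1
    f = 1.0      # ideal sequence value f(1)
    while True:
        f *= c                    # f(n) = c * f(n - 1)
        s = math.floor(f) + 1     # smallest integer strictly greater than f(n)
        # Advance until s is relatively prime to s(2), ..., s(n - 1).
        while any(math.gcd(s, g) != 1 for g in gaps[1:]):
            s += 1
        if s >= limit:
            return gaps
        gaps.append(s)


def shellsort(items, gaps):
    """Return a sorted copy of `items` and the number of element comparisons."""
    a = list(items)
    comparisons = 0
    for k in sorted(gaps, reverse=True):   # decreasing k, last pass is k = 1
        for i in range(k, len(a)):         # k-insertion sort
            value = a[i]
            j = i
            while j >= k:
                comparisons += 1
                if a[j - k] <= value:
                    break
                a[j] = a[j - k]
                j -= k
            a[j] = value
    return a, comparisons


if __name__ == "__main__":
    print(shell_sequence(2.0, 50))   # [1, 3, 5, 11, 17, 37]
    data = [random.randrange(10**6) for _ in range(10**4)]
    _, count = shellsort(data, shell_sequence(2.2, len(data)))
    print(count)
```

For $c = 2$ the ideal values 2, 4, 8, 16, 32 are pushed up to 3, 5, 11, 17, 37, because 9 and 10 share factors with 3 and 5, and 33 to 36 share factors with earlier gaps.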
c      10^6        10^5      10^4     10^3
1.5    148096157   4033761   268781   19761
1.55    94642467   3489655   261369   18845
1.6     66307896   3222457   253077   18089
1.65    51195019   3071179   243260   17250
1.7     43806001   3076849   242129   17040
1.75    38690668   2969786   230646   16381
1.8     36338939   2922294   226307   15934
1.85    34896162   2850479   219403   15491
1.9     35119277   2883186   222804   15491
1.95    34096450   2770747   213018   15108
2       38363880   2872861   213661   14823
2.05    33867497   2740231   210997   14689
2.1     33571066   2714777   206737   14647
2.15    33354981   2695285   205777   14379
2.2     33162445   2677880   203966   14326
2.25    33428464   2698651   208228   14090
2.3     32995665   2655091   199975   13956
2.35    32921442   2648227   200888   14014
2.4     33329546   2676205   201823   13889
2.45    32853407   2631833   199408   13738
2.5     33138322   2654605   200421   13845
2.55    32992964   2628471   197599   13704
2.6     33033944   2628661   198060   13643
2.65    33684086   2685786   202121   13597
2.7     33978169   2713627   206348   13810
2.75    34582299   2757018   207685   13960
2.8     33991997   2672757   201107   13824
2.85    34088609   2710268   203217   13770
2.9     34198867   2697561   203390   13733
2.95    35282875   2806847   209515   14078
3       43128969   2934564   207531   13542

Figure 2: Number of data comparisons for different values of c (columns: array size)

c      10^6        10^5      10^4     10^3
1.5    296192315   8067523   537563   39522
1.55   189284934   6979310   522738   37691
1.6    132615792   6444915   506154   36179
1.65   102390038   6142358   486520   34501
1.7     87612002   6153698   484259   34081
1.75    77381337   5939572   461293   32762
1.8     72677879   5844589   452615   31869
1.85    69792324   5700958   438807   30983
1.9     70238555   5766372   445608   30983
1.95    68192900   5541494   426036   30216
2       76727761   5745722   427323   29646
2.05    67734994   5480462   421994   29378
2.1     67142132   5429554   413474   29295
2.15    66709963   5390571   411554   28759
2.2     66324891   5355760   407932   28653
2.25    66856929   5397303   416457   28180
2.3     65991331   5310182   399950   27912
2.35    65842884   5296454   401777   28028
2.4     66659092   5352411   403647   27778
2.45    65706815   5263666   398817   27477
2.5     66276645   5309210   400842   27691
2.55    65985928   5256942   395198   27408
2.6     66067889   5257323   396120   27286
2.65    67368172   5371573   404242   27194
2.7     67956339   5427254   412696   27620
2.75    69164598   5514036   415371   27920
2.8     67983994   5345514   402215   27648
2.85    68177219   5420536   406435   27540
2.9     68397734   5395123   406780   27466
2.95    70565750   5613694   419030   28156
3       86257939   5869128   415063   27084

Figure 3: Number of data movements for different values of c (columns: array size)
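The minima that Section 8 reads off Figure 2 can be checked mechanically. The sketch below copies the $10^6$ column of Figure 2 for $c$ between 2.05 and 2.65 and reports every value that is strictly smaller than both of its neighbours. The helper name `local_minima` is hypothetical, and restricting the data to this range is a choice made for the illustration.

```python
# Comparisons for array size 10^6, copied from Figure 2 (c = 2.05 .. 2.65).
COMPARISONS_1E6 = [
    (2.05, 33867497),
    (2.1, 33571066),
    (2.15, 33354981),
    (2.2, 33162445),
    (2.25, 33428464),
    (2.3, 32995665),
    (2.35, 32921442),
    (2.4, 33329546),
    (2.45, 32853407),
    (2.5, 33138322),
    (2.55, 32992964),
    (2.6, 33033944),
    (2.65, 33684086),
]


def local_minima(rows):
    """Return the c values whose count is strictly below both neighbours."""
    minima = []
    for (_, left), (c, mid), (_, right) in zip(rows, rows[1:], rows[2:]):
        if mid < left and mid < right:
            minima.append(c)
    return minima


print(local_minima(COMPARISONS_1E6))  # [2.2, 2.35, 2.45, 2.55]
```

The four values agree with the minima named in the text, which is why the range 2.1 to 2.6 is the one examined in finer steps.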
