Worst-Case Efficient Sorting with QuickMergesort

Stefan Edelkamp1 and Armin Weiß2

1King’s College London, UK

2FMI, Universit¨atStuttgart, Germany

San Diego, January 7, 2019 25

20 [ns] n

log 15 n

10

time per 5

Right: random with2 large10 elements215 in220 the middle225 and end. number of elements n

25

20 [ns] n

log 15 n

10

time per 5

210 215 220 225 number of elements n

std::sort (/) std::partial sort (Heapsort) std::stable sort (Mergesort) Running times (divided by n log n) for sorting integers. Left: random inputs.

Comparison-based sorting: Quicksort, Heapsort, Mergesort 25

20 [ns] n

log 15 n

10

time per 5

Right: random with2 large10 elements215 in220 the middle225 and end. number of elements n

Comparison-based sorting: Quicksort, Heapsort, Mergesort 25

20 [ns] n

log 15 n

10 time per 5

210 215 220 225 number of elements n

std::sort (Quicksort/Introsort) std::partial sort (Heapsort) std::stable sort (Mergesort) Running times (divided by n log n) for sorting integers. Left: random inputs. Comparison-based sorting: Quicksort, Heapsort, Mergesort 25 25

20 20 [ns] [ns] n n log log 15 15 n n

10 10 time per time per 5 5

210 215 220 225 210 215 220 225 number of elements n number of elements n

std::sort (Quicksort/Introsort) std::partial sort (Heapsort) std::stable sort (Mergesort) Running times (divided by n log n) for sorting integers. Left: random inputs. Right: random with large elements in the middle and end. Wish to have three times : Make Quicksort worst-case efficient: Introsort, -of- pivot selection Make Heapsort fast: Bottom-up Heapsort (not very fast) Make Mergesort in-place: block based merging (stable implementations: Grailsort, Wikisort) rotation based merging (stable, but O(n log2 n)) use one half as buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable)

Quicksort, Heapsort, Mergesort

Algorithm Fast on average “in place” O(n log n) worst case

Quicksort   

Heapsort   

Mergesort    Make Quicksort worst-case efficient: Introsort, median-of-medians pivot selection Make Heapsort fast: Bottom-up Heapsort (not very fast) Make Mergesort in-place: block based merging (stable implementations: Grailsort, Wikisort) rotation based merging (stable, but O(n log2 n)) use one half as buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable)

Quicksort, Heapsort, Mergesort

Algorithm Fast on average “in place” O(n log n) worst case

Quicksort   

Heapsort   

Mergesort   

Wish to have three times : Make Heapsort fast: Bottom-up Heapsort (not very fast) Make Mergesort in-place: block based merging (stable implementations: Grailsort, Wikisort) rotation based merging (stable, but O(n log2 n)) use one half as buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable)

Quicksort, Heapsort, Mergesort

Algorithm Fast on average “in place” O(n log n) worst case

Quicksort   

Heapsort   

Mergesort   

Wish to have three times : Make Quicksort worst-case efficient: Introsort, median-of-medians pivot selection Make Mergesort in-place: block based merging (stable implementations: Grailsort, Wikisort) rotation based merging (stable, but O(n log2 n)) use one half as buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable)

Quicksort, Heapsort, Mergesort

Algorithm Fast on average “in place” O(n log n) worst case

Quicksort   

Heapsort   

Mergesort   

Wish to have three times : Make Quicksort worst-case efficient: Introsort, median-of-medians pivot selection Make Heapsort fast: Bottom-up Heapsort (not very fast) rotation based merging (stable, but O(n log2 n)) use one half as buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable)

Quicksort, Heapsort, Mergesort

Algorithm Fast on average “in place” O(n log n) worst case

Quicksort   

Heapsort   

Mergesort   

Wish to have three times : Make Quicksort worst-case efficient: Introsort, median-of-medians pivot selection Make Heapsort fast: Bottom-up Heapsort (not very fast) Make Mergesort in-place: block based merging (stable implementations: Grailsort, Wikisort) use one half as buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable)

Quicksort, Heapsort, Mergesort

Algorithm Fast on average “in place” O(n log n) worst case

Quicksort   

Heapsort   

Mergesort   

Wish to have three times : Make Quicksort worst-case efficient: Introsort, median-of-medians pivot selection Make Heapsort fast: Bottom-up Heapsort (not very fast) Make Mergesort in-place: block based merging (stable implementations: Grailsort, Wikisort) rotation based merging (stable, but O(n log2 n)) Quicksort, Heapsort, Mergesort

Algorithm Fast on average “in place” O(n log n) worst case

Quicksort   

Heapsort   

Mergesort   

Wish to have three times : Make Quicksort worst-case efficient: Introsort, median-of-medians pivot selection Make Heapsort fast: Bottom-up Heapsort (not very fast) Make Mergesort in-place: block based merging (stable implementations: Grailsort, Wikisort) rotation based merging (stable, but O(n log2 n)) use one half as buffer to sort the other half (In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012], unstable) Outline

Outline: QuickMergesort Our improvements and theoretical bounds Experiments After line 4: ≤ pivot ≥ pivot

After line 5:

After line 6: both parts sorted recursively with Quicksort

Quicksort

1: procedure Quicksort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Quicksort(A[`, . . . , cut − 1]) 6: Quicksort(A[cut,..., r]) 7: end if 8: end procedure After line 4: ≤ pivot ≥ pivot

After line 5:

After line 6: both parts sorted recursively with Quicksort

Quicksort

1: procedure Quicksort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Quicksort(A[`, . . . , cut − 1]) 6: Quicksort(A[cut,..., r]) 7: end if 8: end procedure After line 5:

After line 6: both parts sorted recursively with Quicksort

Quicksort

1: procedure Quicksort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Quicksort(A[`, . . . , cut − 1]) 6: Quicksort(A[cut,..., r]) 7: end if 8: end procedure After line 4: ≤ pivot ≥ pivot After line 6: both parts sorted recursively with Quicksort

Quicksort

1: procedure Quicksort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Quicksort(A[`, . . . , cut − 1]) 6: Quicksort(A[cut,..., r]) 7: end if 8: end procedure After line 4: ≤ pivot ≥ pivot

After line 5: Quicksort

1: procedure Quicksort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Quicksort(A[`, . . . , cut − 1]) 6: Quicksort(A[cut,..., r]) 7: end if 8: end procedure After line 4: ≤ pivot ≥ pivot

After line 5:

After line 6: both parts sorted recursively with Quicksort After line 4: ≤ pivot ≥ pivot

After line 5:

After line 6: both parts sorted recursively with QuickMergesort

QuickMergesort

1: procedure Quicksort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Quicksort(A[`, . . . , cut − 1]) 6: Quicksort(A[cut,..., r]) 7: end if 8: end procedure After line 4: ≤ pivot ≥ pivot

After line 5:

After line 6: both parts sorted recursively with QuickMergesort

QuickMergesort

1: procedure Quicksort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) Mergesort 5: Quicksort(A[`, . . . , cut − 1]) 6: Quicksort(A[cut,..., r]) 7: end if 8: end procedure After line 4: ≤ pivot ≥ pivot

After line 5:

After line 6: both parts sorted recursively with QuickMergesort

QuickMergesort

1: procedure QuickMergesort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Mergesort(A[`, . . . , cut − 1]) 6: QuickMergesort(A[cut,..., r]) 7: end if 8: end procedure QuickMergesort

1: procedure QuickMergesort(A[`, . . . , r]) 2: if r > ` then 3: pivot ← choosePivot(A[`, . . . , r]) 4: cut ← partition(A[`, . . . , r], pivot) 5: Mergesort(A[`, . . . , cut − 1]) 6: QuickMergesort(A[cut,..., r]) 7: end if 8: end procedure After line 4: ≤ pivot ≥ pivot

After line 5:

After line 6: both parts sorted recursively with QuickMergesort 3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

| {z } | {z } merge two parts 90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

3 2 4 5 6 0 1 7 9 10 11 8 | |{z {z } sortsort with recursively Mergesortwith Mergesort

9 10 8 11 2 3 4 7 0 1 5 6

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

7 11 4 5 6 10 9 2 3 1 0 8 3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

| {z } | {z } merge two parts 90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

3 2 4 5 6 0 1 7 9 10 11 8 | |{z {z } sortsort with recursively Mergesortwith Mergesort

9 10 8 11 2 3 4 7 0 1 5 6

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

7 11 4 5 6 10 9 2 3 1 0 8 3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

| {z } | {z } merge two parts 90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

7 11 4 5 6 10 9 2 3 1 0 8 | |{z {z } sortsort with recursively Mergesortwith Mergesort

9 10 8 11 2 3 4 7 0 1 5 6

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

| {z } | {z } merge two parts 90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6

9 10 8 11 2 3 4 7 0 1 5 6

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort with Mergesort | {z } | {z } merge two parts 90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

| {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

| {z } | {z } merge two parts

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

0 101 82 113 284 1135 486 7 9 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 9 10 8 11 2 3 4 7 0 1 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

0 101 82 113 284 1135 486 7 9 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 9 10 8 11 2 3 4 7 0 1 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 10 8 11 2 3 4 7 9 1 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 10 8 11 2 3 4 7 9 1 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 1 8 11 2 3 4 7 9 10 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 1 2 11 8 3 4 7 9 10 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 1 2 3 8 11 4 7 9 10 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 1 2 3 4 11 8 7 9 10 5 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 486 7 09 101 115 68 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 1 2 3 4 5 8 7 9 10 11 6 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 48 7 09 101 115 6 | {z } sort recursively with QuickMergesort

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 1 2 3 4 5 6 7 9 10 11 8 7 11 4 5 6 10 9 2 3 1 0 8 | {z } sort with Mergesort

90 101 82 113 284 1135 48 7 09 101 115 6

QuickMergesort

1. Partition according to some pivot element. 2. Sort one part with Mergesort. 3. Sort the the other part recursively with QuickMergesort.

3 2 4 5 6 0 1 7 9 10 11 8 | {z } sort recursively with Mergesort

3 2 4 11 9 10 8 7 0 1 5 6 | {z } sort recursively with Mergesort

9 10 8 11 2 3 4 7 0 1 5 6 | {z } | {z } merge two parts 0 1 2 3 4 5 6 7 9 10 11 8 | {z } sort recursively with QuickMergesort Problem: worst case

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: resp. Introselect. In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n). Problem: worst case

Solution: choose the median as pivot: Quickselect resp. Introselect. In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)! Problem: worst case

Quickselect resp. Introselect. In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: Problem: worst case In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: Quickselect resp. Introselect. Problem: worst case

Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: Quickselect resp. Introselect. In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: Quickselect resp. Introselect. Problem: worst case In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: Quickselect resp. Introselect. Problem: worst case In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973] n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons

Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: Quickselect resp. Introselect. Problem: worst case In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements. Worst case of QuickMergesort

Theorem (Edelkamp, W. 2014, Wild 2018) QuickMergesort needs ≤ n log n − 0.83n (resp. n log n − 1.24n) +o(n) √ comparisons on average with median-of-three (resp. median of n).

Still quadratic worst-case (with median-of-three)!

Solution: choose the median as pivot: Quickselect resp. Introselect. Problem: worst case In-situ Mergesort [Elmasry, Katajainen, Stenmark 2012] Median-of-medians algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]

Theorem (Folklore) The median-of-medians algorithms needs at most 20n + o(n) comparisons to find the median of n elements.

n n Need to find the median of n, 2 , 4 ,... elements 40n comparisons Sufficient if one third is smaller/greater than the pivot: form groups of 3 elements compute the median of each group compute the median of the n/3 medians

···

M1 M2 M3 ... Mk

M

One third are ≤ p: p

Theorem Basic MoMQuickMergesort needs at most n log n + 13.8n + o(n) comparisons.

Median-of-medians QuickMergesort

Key observation: we do not need the exact median. form groups of 3 elements compute the median of each group compute the median of the n/3 medians

···

M1 M2 M3 ... Mk

M

One third are ≤ p: p

Theorem Basic MoMQuickMergesort needs at most n log n + 13.8n + o(n) comparisons.

Median-of-medians QuickMergesort

Key observation: we do not need the exact median.

Sufficient if one third is smaller/greater than the pivot: One third are ≤ p: p

Theorem Basic MoMQuickMergesort needs at most n log n + 13.8n + o(n) comparisons.

Median-of-medians QuickMergesort

Key observation: we do not need the exact median.

Sufficient if one third is smaller/greater than the pivot: form groups of 3 elements compute the median of each group compute the median of the n/3 medians

···

M1 M2 M3 ... Mk

M Theorem Basic MoMQuickMergesort needs at most n log n + 13.8n + o(n) comparisons.

Median-of-medians QuickMergesort

Key observation: we do not need the exact median.

Sufficient if one third is smaller/greater than the pivot: form groups of 3 elements compute the median of each group compute the median of the n/3 medians

···

M1 M2 M3 ... Mk

M

One third are ≤ p: p Median-of-medians QuickMergesort

Key observation: we do not need the exact median.

Sufficient if one third is smaller/greater than the pivot: form groups of 3 elements compute the median of each group compute the median of the n/3 medians

···

M1 M2 M3 ... Mk

M

One third are ≤ p: p

Theorem Basic MoMQuickMergesort needs at most n log n + 13.8n + o(n) comparisons. Step 2 (merge from the right):

Once the buffer is full, the final position for the largest element is “free”.

Expected result: Need a guarantee that one fifth are smaller/greater than the pivot: ···

M1 M2 M3 M4 M5 M6 ... Mk

0 0 ... 0 M1 M2 M`

M median of pseudomedians of 15 elements Theorem MoMQuickMergesort needs at most n log n + 4.57n + o(n) comparisons.

Merging with less buffer space (Reinhardt 1992)

Step 1 (merge from the left): Step 2 (merge from the right):

Need a guarantee that one fifth are smaller/greater than the pivot: ···

M1 M2 M3 M4 M5 M6 ... Mk

0 0 ... 0 M1 M2 M`

M median of pseudomedians of 15 elements Theorem MoMQuickMergesort needs at most n log n + 4.57n + o(n) comparisons.

Merging with less buffer space (Reinhardt 1992)

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Expected result: Need a guarantee that one fifth are smaller/greater than the pivot: ···

M1 M2 M3 M4 M5 M6 ... Mk

0 0 ... 0 M1 M2 M`

M median of pseudomedians of 15 elements Theorem MoMQuickMergesort needs at most n log n + 4.57n + o(n) comparisons.

Merging with less buffer space (Reinhardt 1992)

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result: ···

M1 M2 M3 M4 M5 M6 ... Mk

0 0 ... 0 M1 M2 M`

M median of pseudomedians of 15 elements Theorem MoMQuickMergesort needs at most n log n + 4.57n + o(n) comparisons.

Merging with less buffer space (Reinhardt 1992)

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result: Need a guarantee that one fifth are smaller/greater than the pivot: Theorem MoMQuickMergesort needs at most n log n + 4.57n + o(n) comparisons.

Merging with less buffer space (Reinhardt 1992)

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result: Need a guarantee that one fifth are smaller/greater than the pivot: ···

M1 M2 M3 M4 M5 M6 ... Mk

0 0 ... 0 M1 M2 M`

M median of pseudomedians of 15 elements Merging with less buffer space (Reinhardt 1992)

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result: Need a guarantee that one fifth are smaller/greater than the pivot: ···

M1 M2 M3 M4 M5 M6 ... Mk

0 0 ... 0 M1 M2 M`

M median of pseudomedians of 15 elements Theorem MoMQuickMergesort needs at most n log n + 4.57n + o(n) comparisons. Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result:

trade-off: effort to find good pivots vs. increased merging costs Undersampling: for θ ≥ 1 apply the median-of-pseudomedians-of-15 strategy to n/θ elements.

Theorem MoMQuickMergesort with undersampling factor θ = 2.2 needs at most n log n + 1.59n + o(n) comparisons.

Unbalanced merging and undersampling

For merging sequences of different size a smaller buffer suffices:

Step 1 (merge from the left): Step 2 (merge from the right):

Result:

trade-off: effort to find good pivots vs. increased merging costs Undersampling: for θ ≥ 1 apply the median-of-pseudomedians-of-15 strategy to n/θ elements.

Theorem MoMQuickMergesort with undersampling factor θ = 2.2 needs at most n log n + 1.59n + o(n) comparisons.

Unbalanced merging and undersampling

For merging sequences of different size a smaller buffer suffices:

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”. Result:

trade-off: effort to find good pivots vs. increased merging costs Undersampling: for θ ≥ 1 apply the median-of-pseudomedians-of-15 strategy to n/θ elements.

Theorem MoMQuickMergesort with undersampling factor θ = 2.2 needs at most n log n + 1.59n + o(n) comparisons.

Unbalanced merging and undersampling

For merging sequences of different size a smaller buffer suffices:

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right): trade-off: effort to find good pivots vs. increased merging costs Undersampling: for θ ≥ 1 apply the median-of-pseudomedians-of-15 strategy to n/θ elements.

Theorem MoMQuickMergesort with undersampling factor θ = 2.2 needs at most n log n + 1.59n + o(n) comparisons.

Unbalanced merging and undersampling

For merging sequences of different size a smaller buffer suffices:

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result: Undersampling: for θ ≥ 1 apply the median-of-pseudomedians-of-15 strategy to n/θ elements.

Theorem MoMQuickMergesort with undersampling factor θ = 2.2 needs at most n log n + 1.59n + o(n) comparisons.

Unbalanced merging and undersampling

For merging sequences of different size a smaller buffer suffices:

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result:

trade-off: effort to find good pivots vs. increased merging costs Theorem MoMQuickMergesort with undersampling factor θ = 2.2 needs at most n log n + 1.59n + o(n) comparisons.

Unbalanced merging and undersampling

For merging sequences of different size a smaller buffer suffices:

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result:

trade-off: effort to find good pivots vs. increased merging costs Undersampling: for θ ≥ 1 apply the median-of-pseudomedians-of-15 strategy to n/θ elements. Unbalanced merging and undersampling

For merging sequences of different size a smaller buffer suffices:

Step 1 (merge from the left): Once the buffer is full, the final position for the largest element is “free”.

Step 2 (merge from the right):

Result:

trade-off: effort to find good pivots vs. increased merging costs Undersampling: for θ ≥ 1 apply the median-of-pseudomedians-of-15 strategy to n/θ elements.

Theorem MoMQuickMergesort with undersampling factor θ = 2.2 needs at most n log n + 1.59n + o(n) comparisons. Not clear how to find worst-case instances. Simulation of worst cases: Discard the computed pivots and compute worst-case pivots using Quickselect. Minor details (e. g. random shuffle before Mergesort).

Experiments

Experiments with random permutations of 32-bit integers (other data types in proceedings) in C++. Simulation of worst cases: Discard the computed pivots and compute worst-case pivots using Quickselect. Minor details (e. g. random shuffle before Mergesort).

Experiments

Experiments with random permutations of 32-bit integers (other data types in proceedings) in C++. Not clear how to find worst-case instances. Discard the computed pivots and compute worst-case pivots using Quickselect. Minor details (e. g. random shuffle before Mergesort).

Experiments

Experiments with random permutations of 32-bit integers (other data types in proceedings) in C++. Not clear how to find worst-case instances. Simulation of worst cases: Minor details (e. g. random shuffle before Mergesort).

Experiments

Experiments with random permutations of 32-bit integers (other data types in proceedings) in C++. Not clear how to find worst-case instances. Simulation of worst cases: Discard the computed pivots and compute worst-case pivots using Quickselect. Experiments

Experiments with random permutations of 32-bit integers (other data types in proceedings) in C++. Not clear how to find worst-case instances. Simulation of worst cases: Discard the computed pivots and compute worst-case pivots using Quickselect. Minor details (e. g. random shuffle before Mergesort). – 13.05 ± 0.17 13.8 2.094 4.220 ± 0.007 4.57 0.275 1.218 ± 0.011 1.59

and simulated worst cases.

Counting comparisons

average case worst case Algorithm exp. theo. exp. theo. bMQMS 2.772 ± 0.02 MQMS 2.084 ± 0.001

MQMS11/5 0.246 ± 0.01

n 4 bMQMS / )

n MQMS

log 3 MQMS11/5 n − 2

1 (comparisons 0 210 213 216 219 222 225 228 number of elements n Number of comparisons (linear term) of MoMQuickMergesort variants 13.05 ± 0.17 13.8 4.220 ± 0.007 4.57 1.218 ± 0.011 1.59

and simulated worst cases.

Counting comparisons

average case worst case Algorithm exp. theo. exp. theo. bMQMS 2.772 ± 0.02 – MQMS 2.084 ± 0.001 2.094

MQMS11/5 0.246 ± 0.01 0.275

n 4 bMQMS / )

n MQMS

log 3 MQMS11/5 n − 2

1 (comparisons 0 210 213 216 219 222 225 228 number of elements n Number of comparisons (linear term) of MoMQuickMergesort variants Counting comparisons

average case worst case Algorithm exp. theo. exp. theo. bMQMS 2.772 ± 0.02 – 13.05 ± 0.17 13.8 MQMS 2.084 ± 0.001 2.094 4.220 ± 0.007 4.57

MQMS11/5 0.246 ± 0.01 0.275 1.218 ± 0.011 1.59

n 4 bMQMS / )

n MQMS

log 3 MQMS11/5 n

− MQMS11/5 (wc) 2 MQMS (wc)

1 (comparisons 0 210 213 216 219 222 225 228 number of elements n Number of comparisons (linear term) of MoMQuickMergesort variants and simulated worst cases. Running times

5.00 bMQMS 4.75 MQMS

MQMS11/5 4.50 MQMS11/5 (wc) [ns]

n 4.25 MQMS (wc) log bMQMS (wc) n 4.00

3.75 time per

3.50

3.25 210 213 216 219 222 225 228 number of elements n

Running times of different MoMQuickMergesort variants and their simulated worst cases for random permutations of 32-bit integers. Running times

5.0 In-situ Mergesort

MQMS11/5 4.5 std::sort (Introsort) [ns]

n std::partial sort (Heapsort) log 4.0 n std::stable sort (Mergesort) Wikisort 3.5 time per MQMS11/5 (wc)

3.0

210 213 216 219 222 225 228 number of elements n

Running times for random permutations of 32-bit integers. Running times 25 25

20 20 [ns] [ns] n n log log 15 15 n n

10 10 time per 5 time per 5

210 215 220 225 210 215 220 225 number of elements n number of elements n

std::partial sort MQMS 11/5 (Heapsort) std::sort std::sort (MQMS worst case stopper) (Introsort)

Running times for sorting integers. Left: random inputs. Right: Random with large elements in the middle and end. Running times

4.5 4.5 [ns] [ns] n n 4.0 4.0 log log n n

3.5 3.5 time per time per 3.0 3.0 210 215 220 225 210 215 220 225 number of elements n number of elements n

std::partial sort MQMS 11/5 (Heapsort) std::sort std::sort (MQMS worst case stopper) (Introsort)

Running times for sorting integers. Left: random inputs. Right: Random with large elements in the middle and end. MQMS11/5 is an unstable with n log n + 1.59n + o(n) comparisons in the worst case n log n + 0.275n + o(n) comparisons in the average case Implementation with stl-style interface1. Thank you!

Conclusion

Algorithm Fast on average “in place” O(n log n) worst case Quicksort    Heapsort    Mergesort    MoMQuickMergesort   

1Code available at https://github.com/weissan/QuickXsort Thank you!

Conclusion

Algorithm Fast on average “in place” O(n log n) worst case Quicksort    Heapsort    Mergesort    MoMQuickMergesort   

MQMS11/5 is an unstable sorting algorithm with n log n + 1.59n + o(n) comparisons in the worst case n log n + 0.275n + o(n) comparisons in the average case Implementation with stl-style interface1.

1Code available at https://github.com/weissan/QuickXsort Conclusion

Algorithm Fast on average “in place” O(n log n) worst case Quicksort    Heapsort    Mergesort    MoMQuickMergesort   

MQMS11/5 is an unstable sorting algorithm with n log n + 1.59n + o(n) comparisons in the worst case n log n + 0.275n + o(n) comparisons in the average case Implementation with stl-style interface1. Thank you!

1Code available at https://github.com/weissan/QuickXsort Experiments with larger records

20 In-situ Mergesort MQMS11/5 18 std::sort (Introsort)

[ns] 16

n std::partial sort (Heapsort) log 14 n std::stable sort 12 (Mergesort) Wikisort

time per 10 MQMS11/5 (wc) 8

6 210 212 214 216 218 220 222 number of elements n

Running times of MoMQuickMergesort (average and simulated worst case), hybrid QMS and other algorithms for random permutations 44-byte records with 4-byte keys. Experiments sorting pointers

60 In-situ Mergesort

MQMS11/5 50 std::sort (Introsort) [ns]

n std::partial sort 40 (Heapsort) log

n std::stable sort (Mergesort) 30 MQMS11/5 (wc) time per

20

210 212 214 216 218 220 222 number of elements n

Running times of MoMQuickMergesort (average and simulated worst case), hybrid QMS and other algorithms for random permutations of pointers to records. Experimental setup

Experiments with random permutations of 32bit integers (other data types in proceedings). running time and comparison count ≥ 100 measurements for each data point Test environment: Intel Core i5-2500K CPU (3.30GHz) with 16GB RAM Ubuntu Linux 64bit version 14.04.4 g++ (4.8.4) compiler with flags -O3 -march=native