Sorting Presorted Data

Sorting Presorted Data

Sorting presorted data Vincent Jugé LIGM – Université Gustave Eiffel, ESIEE, ENPC & CNRS 14/06/2021 Joint work with N. Auger, C. Nicaud, C. Pivoteau & G. Khalighinejad Université Gustave Eiffel Sharif University of Technology V. Jugé Sorting presorted data No! MergeSort has a worst-case time complexity of O(n log(n)) Can we do better? Proof: There are n! possible reorderings Each element comparison gives a 1-bit information END OF TALK! Thus log2(n!) ∼ n log2(n) tests are required Sorting data 0 2 2 3 4 0 1 5 4 1 2 3 0 0 1 1 2 2 2 3 3 4 4 5 V. Jugé Sorting presorted data No! Proof: There are n! possible reorderings Each element comparison gives a 1-bit information END OF TALK! Thus log2(n!) ∼ n log2(n) tests are required Sorting data 0 2 2 3 4 0 1 5 4 1 2 3 0 0 1 1 2 2 2 3 3 4 4 5 MergeSort has a worst-case time complexity of O(n log(n)) Can we do better? V. Jugé Sorting presorted data END OF TALK! Sorting data 0 2 2 3 4 0 1 5 4 1 2 3 0 0 1 1 2 2 2 3 3 4 4 5 MergeSort has a worst-case time complexity of O(n log(n)) Can we do better? No! Proof: There are n! possible reorderings Each element comparison gives a 1-bit information Thus log2(n!) ∼ n log2(n) tests are required V. Jugé Sorting presorted data Sorting data 0 2 2 3 4 0 1 5 4 1 2 3 0 0 1 1 2 2 2 3 3 4 4 5 MergeSort has a worst-case time complexity of O(n log(n)) Can we do better? No! Proof: There are n! possible reorderings Each element comparison gives a 1-bit information END OF TALK! Thus log2(n!) ∼ n log2(n) tests are required V. Jugé Sorting presorted data 0 1 1 0 2 1 0 2 0 2 0 1 5 × 0 4 × 1 3 × 2 0 0 0 0 0 1 1 1 1 2 2 2 Cannot we ever do better? In some cases, we should. 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 V. Jugé Sorting presorted data Cannot we ever do better? In some cases, we should. 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 1 0 2 1 0 2 0 2 0 1 5 × 0 4 × 1 3 × 2 0 0 0 0 0 1 1 1 1 2 2 2 V. Jugé Sorting presorted data 4 runs of lengths 5, 3, 1 and 3 2 New parameters: Number of runs (ρ) and their lengths (r1;:::; rρ) Pρ New parameters: Run-length entropy: H = i=1(ri =n) log2(n=ri ) New parameters: Run-length entropy: H 6 log2(ρ) 6 log2(n) Theorem [1,2,4,7,11] Some merge sort has a worst-case time complexity of O(n + n H) We cannot do better than Ω(n + n H)![4] Reading the whole input requires a time Ω(n) There are X possible reorderings, with X 21−ρ n 2n H=2 > r1 ::: rρ > Let us do better! 0 2 2 3 4 0 1 5 4 1 2 3 1 Chunk your data in non-decreasing runs V. Jugé Sorting presorted data Pρ New parameters: Run-length entropy: H = i=1(ri =n) log2(n=ri ) New parameters: Run-length entropy: H 6 log2(ρ) 6 log2(n) Theorem [1,2,4,7,11] Some merge sort has a worst-case time complexity of O(n + n H) We cannot do better than Ω(n + n H)![4] Reading the whole input requires a time Ω(n) There are X possible reorderings, with X 21−ρ n 2n H=2 > r1 ::: rρ > Let us do better! 4 runs of lengths 5, 3, 1 and 3 0 2 2 3 4 0 1 5 4 1 2 3 1 Chunk your data in non-decreasing runs 2 New parameters: Number of runs (ρ) and their lengths (r1;:::; rρ) V. Jugé Sorting presorted data Theorem [1,2,4,7,11] Some merge sort has a worst-case time complexity of O(n + n H) We cannot do better than Ω(n + n H)![4] Reading the whole input requires a time Ω(n) There are X possible reorderings, with X 21−ρ n 2n H=2 > r1 ::: rρ > Let us do better! 4 runs of lengths 5, 3, 1 and 3 0 2 2 3 4 0 1 5 4 1 2 3 1 Chunk your data in non-decreasing runs 2 New parameters: Number of runs (ρ) and their lengths (r1;:::; rρ) Pρ New parameters: Run-length entropy: H = i=1(ri =n) log2(n=ri ) New parameters: Run-length entropy: H 6 log2(ρ) 6 log2(n) V. Jugé Sorting presorted data We cannot do better than Ω(n + n H)![4] Reading the whole input requires a time Ω(n) There are X possible reorderings, with X 21−ρ n 2n H=2 > r1 ::: rρ > Let us do better! 4 runs of lengths 5, 3, 1 and 3 0 2 2 3 4 0 1 5 4 1 2 3 1 Chunk your data in non-decreasing runs 2 New parameters: Number of runs (ρ) and their lengths (r1;:::; rρ) Pρ New parameters: Run-length entropy: H = i=1(ri =n) log2(n=ri ) New parameters: Run-length entropy: H 6 log2(ρ) 6 log2(n) Theorem [1,2,4,7,11] Some merge sort has a worst-case time complexity of O(n + n H) V. Jugé Sorting presorted data We cannot do better than Ω(n + n H)![4] Reading the whole input requires a time Ω(n) There are X possible reorderings, with X 21−ρ n 2n H=2 > r1 ::: rρ > Let us do better! 4 runs of lengths 5, 3, 1 and 3 0 2 2 3 4 0 1 5 4 1 2 3 1 Chunk your data in non-decreasing runs 2 New parameters: Number of runs (ρ) and their lengths (r1;:::; rρ) Pρ New parameters: Run-length entropy: H = i=1(ri =n) log2(n=ri ) New parameters: Run-length entropy: H 6 log2(ρ) 6 log2(n) Theorem [1,2,4,7,11] TimSort has a worst-case time complexity of O(n + n H) V. Jugé Sorting presorted data Let us do better! 4 runs of lengths 5, 3, 1 and 3 0 2 2 3 4 0 1 5 4 1 2 3 1 Chunk your data in non-decreasing runs 2 New parameters: Number of runs (ρ) and their lengths (r1;:::; rρ) Pρ New parameters: Run-length entropy: H = i=1(ri =n) log2(n=ri ) New parameters: Run-length entropy: H 6 log2(ρ) 6 log2(n) Theorem [1,2,4,7,11] TimSort has a worst-case time complexity of O(n + n H) We cannot do better than Ω(n + n H)![4] Reading the whole input requires a time Ω(n) There are X possible reorderings, with X 21−ρ n 2n H=2 > r1 ::: rρ > V. Jugé Sorting presorted data PJ J P A J O 1 2 2 2 2 3 4 1 Invented by Tim Peters[3] 2 Standard algorithm in Python Standard algorithm————————for non-primitiveAndroid, arraysJava, Octave in 3 1st worst-case complexity analysis[6] – TimSort works in time O(n log n) 4 Refined worst-case analysis[7] – TimSort works in time O(n + n H) Bugs uncovered in Python & Java implementations[5,7] A brief history of TimSort 2001 ’02 ’03 ’04 ’05 ’06 ’07 ’08 ’09 ’10 ’11 ’12 ’13 ’14 ’15 ’16 ’17 ’18 ’19 ’20 ’21 V. Jugé Sorting presorted data PJ J P A J O 2 2 2 2 3 4 2 Standard algorithm in Python Standard algorithm————————for non-primitiveAndroid, arraysJava, Octave in 3 1st worst-case complexity analysis[6] – TimSort works in time O(n log n) 4 Refined worst-case analysis[7] – TimSort works in time O(n + n H) Bugs uncovered in Python & Java implementations[5,7] A brief history of TimSort 1 2001 ’02 ’03 ’04 ’05 ’06 ’07 ’08 ’09 ’10 ’11 ’12 ’13 ’14 ’15 ’16 ’17 ’18 ’19 ’20 ’21 1 Invented by Tim Peters[3] V. Jugé Sorting presorted data PJ J 3 4 3 1st worst-case complexity analysis[6] – TimSort works in time O(n log n) 4 Refined worst-case analysis[7] – TimSort works in time O(n + n H) Bugs uncovered in Python & Java implementations[5,7] A brief history of TimSort P A J O 1 2 2 2 2 2001 ’02 ’03 ’04 ’05 ’06 ’07 ’08 ’09 ’10 ’11 ’12 ’13 ’14 ’15 ’16 ’17 ’18 ’19 ’20 ’21 1 Invented by Tim Peters[3] 2 Standard algorithm in Python Standard algorithm————————for non-primitiveAndroid, arraysJava, Octave in V. Jugé Sorting presorted data PJ J 4 4 Refined worst-case analysis[7] – TimSort works in time O(n + n H) Bugs uncovered in Python & Java implementations[5,7] A brief history of TimSort P A J O 1 2 2 2 2 3 2001 ’02 ’03 ’04 ’05 ’06 ’07 ’08 ’09 ’10 ’11 ’12 ’13 ’14 ’15 ’16 ’17 ’18 ’19 ’20 ’21 1 Invented by Tim Peters[3] 2 Standard algorithm in Python Standard algorithm————————for non-primitiveAndroid, arraysJava, Octave in 3 1st worst-case complexity analysis[6] – TimSort works in time O(n log n) V.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    47 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us