Radix Sorts Examples Interface Digital { ! Strings Public Int Charat(Int K); ! 64-Bit Integers Public Int Length(Int); Static Int R(); }
Total Page:16
File Type:pdf, Size:1020Kb
Digital keys Many commonly-use key types are inherently digital (sequences of fixed-length characters) interface Radix Sorts Examples interface Digital { ! Strings public int charAt(int k); ! 64-bit integers public int length(int); static int R(); } • key-indexed counting This lecture: • LSD radix sort ! refer to fixed-length vs. variable-length strings • MSD radix sort ! R different characters • 3-way radix quicksort ! key type implements charAt() and length() methods • application: LRS ! code works for String and key types that implement Digital. Widely used in practice References: Algorithms in Java, Chapter 10 ! low-level bit-based sorts Intro to Algs and Data Structs, Section 6.1 ! string sorts Copyright © 2007 by Robert Sedgewick and Kevin Wayne. 3 Review: summary of the performance of sorting algorithms Frequency of execution of instructions in the inner loop: algorithm guarantee average extra assumptions space on keys 2 2 insertion sort N /2 N /4 no Comparable key-indexed counting 2 2 selection sort N /2 N /2 no Comparable LSD radix sort mergesort N lg N N lg N N Comparable MSD radix sort quicksort 1.39 N lg N 1.39 N lg N c lg N Comparable 3-way radix quicksort application: LRS lower bound: N lg N -1.44 N compares are required by any algorithm Q: Can we do better (despite the lower bound)? 2 4 Key-indexed counting: assumptions about keys Key-indexed counting Assume that keys are integers between 0 and R-1 Task: sort an array a[] of N integers between 0 and R-1 Plan: produce sorted result in array temp[] Examples: 1. Count frequencies of each letter using key as index ! char (R = 256) 2. Compute frequency cumulates which specify destinations ! short with fixed R, enforced by client ! int with fixed R, enforced by client a[] temp[] 0 d 0 int N = a.length; Reminder: equal keys are allowed in sorts int[] count = new int[R]; 1 a 1 2 c count[] 2 count for (int i = 0; i < N; i++) 3 f a 0 3 Applications: frequencies count[a[i]+1]++; 4 f b 2 4 ! sort phone numbers by area code 5 b c 5 5 ! sort classlist by precept compute for (int i = 1; i < 256; i++) 6 d 6 d cumulates d 6 ! Requirement: sort must be stable count[i] += count[i-1]; 7 b e 8 7 d ! Ex: Full sort on primary key, then stable radix sort on secondary key 8 f f 9 8 9 b 9 10 e 10 11 a 11 Implication: Can use key as an array index 6 keys < d, 8 keys < e so d’s go in a[6] and a[7] 5 7 Key-indexed counting Key-indexed counting Task: sort an array a[] of N integers between 0 and R-1 Task: sort an array a[] of N integers between 0 and R-1 Plan: produce sorted result in array temp[] Plan: produce sorted result in array temp[] 1. Count frequencies of each letter using key as index 1. Count frequencies of each letter using key as index 2. Compute frequency cumulates which specify destinations 3. Access cumulates using key as index to move records. a[] offset by 1 temp[] a[] temp[] [stay tuned] 0 d 0 0 d 0 a int N = a.length; int N = a.length; int[] count = new int[R]; 1 a 1 int[] count = new int[R]; 1 a 1 a 2 c count[] 2 2 c count[] 2 b count for (int i = 0; i < N; i++) 3 f a 0 3 count for (int i = 0; i < N; i++) 3 f a 2 3 b frequencies frequencies count[a[i]+1]++; 4 f b 2 4 count[a[i]+1]++; 4 f b 5 4 b 5 b c 3 5 5 b c 6 5 c 6 d 6 compute for (int i = 1; i < 256; i++) 6 d 6 d d 1 cumulates d 8 7 b e 2 7 count[i] += count[i-1]; 7 b e 9 7 d 8 f f 1 8 8 f f 12 8 e 9 9 move for (int i = 0; i < N; i++) 9 9 b records b f temp[count[a[i]]++] = a[i]; 10 e 10 10 e 10 f 11 a 11 11 a 11 f 2 d’s 6 8 Key-indexed counting Task: sort an array a[] of N integers between 0 and R-1 Plan: produce sorted result in array temp[] 1. Count frequencies of each letter using key as index 2. Compute frequency cumulates 3. Access cumulates using key as index to find record positions. 4. Copy back into original array a[] temp[] key-indexed counting 0 a 0 a int N = a.length; 1 a 1 a LSD radix sort int[] count = new int[R]; 2 b count[] 2 b MSD radix sort count for (int i = 0; i < N; i++) 3 b a 2 3 b frequencies count[a[i]+1]++; 4 b b 5 4 b 3-way radix quicksort 5 c c 6 5 c application: LRS compute for (int k = 1; k < 256; k++) 6 d 6 d cumulates d 8 count[k] += count[k-1]; 7 d e 9 7 d 8 e f 12 8 e move for (int i = 0; i < N; i++) 9 9 records f f temp[count[a[i]++]] = a[i] 10 f 10 f copy back for (int i = 0; i < N; i++) 11 f 11 f a[i] = temp[i]; 9 11 Review: summary of the performance of sorting algorithms Least-significant-digit-first radix sort Frequency of execution of instructions in the inner loop: LSD radix sort. ! Consider characters d from right to left ! Stably sort using dth character as the key via key-indexed counting. extra assumptions algorithm guarantee average space on keys sort key sort key sort key insertion sort N2 /2 N2 /4 no Comparable 0 d a b 0 d a b 0 d a b 0 a c e selection sort N2 /2 N2 /2 no Comparable 1 a d d 1 c a b 1 c a b 1 a d d mergesort N lg N N lg N N Comparable 2 c a b 2 e b b 2 f a d 2 b a d 3 f a d 3 a d d 3 b a d 3 b e d quicksort 1.39 N lg N 1.39 N lg N c lg N Comparable 4 f e e 4 f a d 4 d a d 4 b e e 5 b a d 5 b a d 5 e b b 5 c a b key-indexed counting N N N + R integers 6 d a d 6 d a d 6 a c e 6 d a b 7 b e e 7 f e d 7 a d d 7 d a d 8 f e d 8 b e d 8 f e d 8 e b b inplace version is possible and practical 9 b e d 9 f e e 9 b e d 9 f a d 10 e b b 10 b e e 10 f e e 10 f e d 11 a c e 11 a c e 11 b e e 11 f e e Q: Can we do better (despite the lower bound)? A: Yes, if we do not depend on comparisons sort must be stable 10 arrows do not cross 12 LSD radix sort: Why does it work? Review: summary of the performance of sorting algorithms Frequency of execution of instructions in the inner loop: Pf 1. [thinking about the past] ! If two strings differ on first character, key- sort key indexed sort puts them in proper relative order. extra assumptions ! 0 d a b 0 a c e algorithm guarantee average If two strings agree on first character, space on keys stability keeps them in proper relative order. 1 c a b 1 a d d 2 f a d 2 b a d insertion sort N2 /2 N2 /4 no Comparable 3 b a d 3 b e d 2 2 4 d a d 4 b e e selection sort N /2 N /2 no Comparable 5 e b b 5 c a b mergesort N lg N N lg N N Comparable 6 a c e 6 d a b 7 a d d 7 d a d quicksort 1.39 N lg N 1.39 N lg N c lg N Comparable 8 8 Pf 2. [thinking about the future] f e d e b b 9 b e d 9 f a d ! LSD radix sort WN WN N + R Digital If the characters not yet examined differ, 10 f e e 10 f e d it doesn't matter what we do now. 11 b e e 11 f e e ! If the characters not yet examined agree, stability ensures later pass won't affect order. in order by previous passes 13 15 LSD radix sort implementation Sorting Challenge Use k-indexed counting on characters, moving right to left Problem: sort a huge commercial database on a fixed-length key field Ex: account number, date, SS number B14-99-8765 756-12-AD46 public static void lsd(String[] a) CX6-92-0112 Which sorting method to use? 332-WX-9877 { 375-99-QWAX 1. insertion sort int N = a.length; CV2-59-0221 387-SS-0321 int W = a[0].length; 2.