Using Tabulation to Implement the Power of Hashing

Talk surveys results from
• M. Pătraşcu and M. Thorup: The power of simple tabulation hashing. STOC'11 and J. ACM'12.
• M. Pătraşcu and M. Thorup: Twisted Tabulation Hashing. SODA'13.
• M. Thorup: Simple Tabulation, Fast Expanders, Double Tabulation, and High Independence. FOCS'13.
• S. Dahlgaard and M. Thorup: Approximately Minwise Independence with Twisted Tabulation. SWAT'14.
• T. Christiani, R. Pagh, and M. Thorup: From Independence to Expansion and Back Again. STOC'15.
• S. Dahlgaard, M.B.T. Knudsen, E. Rotenberg, and M. Thorup: Hashing for Statistics over K-Partitions. FOCS'15.

Target
• Simple and reliable pseudo-random hashing.
• Providing algorithmically important probabilistic guarantees akin to those of truly random hashing, yet easy to implement.
• Bridging theory (assuming truly random hashing) with practice (needing something implementable).
• Many randomized algorithms are very simple and popular in practice, but they are often implemented with hash functions that are too simple, so their guarantees hold only for sufficiently random input.
• Hash functions that are too simple may work deceptively well in random tests, but the real world is full of structured data on which they may fail miserably (as we shall see later).
Applications of Hashing

Hash tables (n keys and 2n hashes: expect 1/2 keys per hash)
• chaining: follow pointers
• linear probing: sequential search in one array
• cuckoo hashing: search 2 locations, complex updates

[Figures: example hash tables for chaining, linear probing, and cuckoo hashing]

Sketching, streaming, and sampling:
• second moment estimation: $F_2(\bar{x}) = \sum_i x_i^2$
• sketch $A$ and $B$ to later find $|A \cap B| / |A \cup B|$

$|A \cap B| / |A \cup B| = \Pr_h[\min h(A) = \min h(B)]$

We need $h$ to be $\varepsilon$-minwise independent:

$(\forall)\, x \in S : \Pr[h(x) = \min h(S)] = \frac{1 \pm \varepsilon}{|S|}$
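The min-hash identity above can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function names (`h`, `minhash_sketch`, `estimate_jaccard`) are hypothetical, and salted BLAKE2b from Python's standard library stands in for a random hash function; it is not the tabulation hashing surveyed in this talk. Each of the k repetitions uses a different salt, and the fraction of repetitions where the two minima agree estimates $|A \cap B| / |A \cup B|$.

```python
import hashlib


def h(i: int, x: str) -> int:
    """i-th hash function: BLAKE2b with a per-function salt.

    Assumption: this is a stand-in for a (nearly) random hash function.
    """
    salt = i.to_bytes(8, "little")  # BLAKE2b accepts salts of up to 16 bytes
    digest = hashlib.blake2b(x.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash_sketch(s, k):
    """Return [min h_0(S), ..., min h_{k-1}(S)] for a non-empty set S of strings."""
    return [min(h(i, x) for x in s) for i in range(k)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of repetitions in which the two minima coincide."""
    matches = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return matches / len(sketch_a)


if __name__ == "__main__":
    A = {str(i) for i in range(100)}
    B = {str(i) for i in range(50, 150)}  # true similarity is 50/150 = 1/3
    k = 512
    print(estimate_jaccard(minhash_sketch(A, k), minhash_sketch(B, k)))
```

The sketches are computed independently for each set, so only the k minima need to be stored to compare A and B later.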
Wegman & Carter [FOCS'77]

We do not have space for truly random hash functions, but

Family $\mathcal{H} = \{h : [u] \to [b]\}$ is k-independent iff for random $h \in \mathcal{H}$:
• $(\forall)\, x \in [u]$, $h(x)$ is uniform in $[b]$;
• $(\forall)$ distinct $x_1, \ldots, x_k \in [u]$, $h(x_1), \ldots, h(x_k)$ are independent.

Prototypical example: degree $k-1$ polynomial
• $u = b$ prime;
• choose $a_0, a_1, \ldots, a_{k-1}$ randomly in $[u]$;
• $h(x) = (a_0 + a_1 x + \cdots + a_{k-1} x^{k-1}) \bmod u$.

Many solutions for k-independent hashing have been proposed, but they are generally slow for k > 3 and too slow for k > 5.
