Double Hashing
Double Hashing Intro & Coding

Hashing
● Hashing provides O(1) time on average for insert, search, and delete.
● Hash function: maps a big number or string to a small integer that can be used as an index into the hash table.
● Collision: two keys that map to the same index.

Hashing – a Simple Example
● Arbitrary Size → Fixed Size
● A table with 5 slots (indices 0 to 4), indexed by key % 5:

  Operation        Computation       Result
  Insert(0)        0 % 5 = 0         stored at index 0
  Insert(5321)     5321 % 5 = 1      stored at index 1
  Insert(-8002)    -8002 % 5 = 3     stored at index 3
  Insert(20000)    20000 % 5 = 0     collides with key 0 at index 0

  Index   0    1       2         3        4
  Key     0    5321    (empty)   -8002    (empty)

● Note: -8002 % 5 = 3 takes the non-negative remainder. In C or Java the % operator returns -2 for this input, so the result must be adjusted before it is used as an index.

Modular Hashing
● A good, simple, general-purpose approach to implementing a hash map
● Basic formula:
  ○ h(x) = c(x) mod m
    ■ Where c(x) converts x into a (possibly large) integer
● Generally want m to be a prime number
  ○ Consider m = 100
  ○ Only the two least significant decimal digits matter
    ■ h(1) = h(401) = h(4372901) = 1

Collision Resolution
● A strategy for handling the case when two or more keys to be inserted hash to the same index.
● Closed Addressing
  § Separate Chaining
● Open Addressing
  § Linear Probing
  § Quadratic Probing
  § Double Hashing

Separate Chaining
● Make each cell of the hash table point to a linked list of the records that have the same hash function value.
● Example from textbook ch3.4:

  Key   Hash   Value
  S     2      0
  E     0      1
  A     0      2
  R     4      3
  C     4      4
  H     4      5
  E     0      6
  X     2      7
  A     0      8
  M     4      9
  P     3      10
  L     3      11
  E     0      12

Open Addressing
● Store N key-value pairs in a hash table of size M > N, relying on empty entries in the table to help with collision resolution.
● If h(x) == h(y) == i
  ○ And x is stored at index i in an example hash table
  ○ If we want to insert y, we must try alternative indices
    ■ This means y will not be stored at HT[h(y)]
● We must select alternatives in a consistent and predictable way so that they can be located later.

Linear Probing
● Insert:
  ○ If we cannot store a key at its home index because of a collision, probe h_i(x) = (Hash(x) + i) % m for i = 0, 1, 2, ...
    ■ If h_0 = (Hash(x) + 0) % m is full, we try h_1
    ■ If h_1 = (Hash(x) + 1) % m is full, we try h_2
    ■ And so on ...
    ■ Until an open space is found
● Search:
  ○ If another key is stored at index i
    ■ Check i+1, i+2, i+3, ... until
      ● Key is found
      ● Empty location is found
      ● We circle through the buffer back to i
● Example with m = 7, keys inserted in this order:

  Key   Hash   Value
  S     5      0
  E     0      1
  A     5      2
  R     4      3
  C     4      4
  E     0      6
  A     5      8
  E     0      12

  Resulting table:

  Index   Key       Value
  0       E         12
  1       C         4
  2       (empty)
  3       (empty)
  4       R         3
  5       S         0
  6       A         8

Linear Probing
● Delete:
  ○ Simply setting the key’s table position to null will not work: a search stops at the first empty slot, so keys stored later in the same cluster would no longer be found.
  ○ Need to reinsert into the table all of the keys in the cluster that follow the deleted key (to its right, wrapping around the end of the table).
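The insert and search steps for linear probing can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the class name `LinearProbingTable` and the lookup dictionary `EXAMPLE_HASH` are assumptions introduced here, and the hash values are simply copied from the m = 7 example rather than computed.

```python
# Hash values copied from the m = 7 example; hard-coded for illustration only.
EXAMPLE_HASH = {"S": 5, "E": 0, "A": 5, "R": 4, "C": 4}


def example_hash(key):
    """Hypothetical hash function that just looks up the example's values."""
    return EXAMPLE_HASH[key]


class LinearProbingTable:
    """Open-addressing table that probes (Hash(x) + i) % m for i = 0, 1, 2, ..."""

    def __init__(self, m, hash_fn):
        self.m = m
        self.hash_fn = hash_fn
        self.keys = [None] * m
        self.vals = [None] * m

    def put(self, key, val):
        """Insert key, or update its value if it is already stored."""
        home = self.hash_fn(key)
        for i in range(self.m):
            idx = (home + i) % self.m
            if self.keys[idx] is None or self.keys[idx] == key:
                self.keys[idx] = key
                self.vals[idx] = val
                return idx
        raise RuntimeError("table is full")

    def get(self, key):
        """Return the value for key, or None if it is not in the table."""
        home = self.hash_fn(key)
        for i in range(self.m):
            idx = (home + i) % self.m
            if self.keys[idx] is None:   # empty location: key is absent
                return None
            if self.keys[idx] == key:    # key found
                return self.vals[idx]
        return None                      # circled back to the start


table = LinearProbingTable(7, example_hash)
pairs = [("S", 0), ("E", 1), ("A", 2), ("R", 3),
         ("C", 4), ("E", 6), ("A", 8), ("E", 12)]
for key, val in pairs:
    table.put(key, val)

for idx in range(table.m):
    print(idx, table.keys[idx], table.vals[idx])
```

With the example's keys, C has home index 4 but ends up at index 1, because indices 4, 5, 6, and 0 are already occupied when it is inserted.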
Linear Probing – Delete Example (Delete S, m = 7)

● Table before the delete:

  Index   Key       Value
  0       E         12
  1       C         4
  2       (empty)
  3       (empty)
  4       R         3
  5       S         0
  6       A         8

● Delete S: clear index 5. The rest of the cluster is (A, 8) at index 6, (E, 12) at index 0, and (C, 4) at index 1. Index 2 is empty, so the cluster ends there.
● (A, 8): delete the key-value pair from index 6, then re-insert it. Hash(A) = 5 and index 5 is now empty, so A moves to index 5.
● (E, 12): delete the key-value pair from index 0, then re-insert it. Hash(E) = 0, so E returns to index 0.
● (C, 4): delete the key-value pair from index 1, then re-insert it. Hash(C) = 4; indices 4 (R) and 5 (A) are full, so C is stored at index 6.
● Done. Table after the delete:

  Index   Key       Value
  0       E         12
  1       (empty)
  2       (empty)
  3       (empty)
  4       R         3
  5       A         8
  6       C         4
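The delete procedure above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the helper names `put`, `find`, `delete`, and `build` and the `HASH` dictionary are introduced here, with hash values copied from the m = 7 example. It first shows why simply clearing the slot fails, then deletes S by re-inserting the rest of the cluster.

```python
M = 7
# Hash values copied from the m = 7 example; hard-coded for illustration only.
HASH = {"S": 5, "E": 0, "A": 5, "R": 4, "C": 4}


def put(slots, key, val):
    """Store (key, val) in the first free slot of the probe sequence, or update it."""
    for i in range(M):
        idx = (HASH[key] + i) % M
        if slots[idx] is None or slots[idx][0] == key:
            slots[idx] = (key, val)
            return
    raise RuntimeError("table is full")


def find(slots, key):
    """Return the index holding key, or None if an empty slot is reached first."""
    for i in range(M):
        idx = (HASH[key] + i) % M
        if slots[idx] is None:
            return None
        if slots[idx][0] == key:
            return idx
    return None


def delete(slots, key):
    """Remove key, then re-insert every entry in the cluster that follows it."""
    idx = find(slots, key)
    if idx is None:
        return
    slots[idx] = None
    idx = (idx + 1) % M
    while slots[idx] is not None:
        moved_key, moved_val = slots[idx]
        slots[idx] = None                 # take the entry out ...
        put(slots, moved_key, moved_val)  # ... and probe for its slot again
        idx = (idx + 1) % M


def build():
    """Rebuild the example table from the slide's insertion order."""
    slots = [None] * M
    for key, val in [("S", 0), ("E", 1), ("A", 2), ("R", 3),
                     ("C", 4), ("E", 6), ("A", 8), ("E", 12)]:
        put(slots, key, val)
    return slots


# Naive delete: clearing S's slot makes A unreachable, because the search
# for A starts at index 5 and stops at the first empty slot.
naive = build()
naive[find(naive, "S")] = None
print(find(naive, "A"))  # None

# Delete with re-insertion keeps every remaining key reachable.
fixed = build()
delete(fixed, "S")
print(fixed)
```

After `delete(fixed, "S")`, A sits at index 5 and C at index 6, matching the walk-through.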
Double Hashing
● After a collision at index i, instead of attempting to place the key x at (i + 1) mod m, look at (i + h2(x)) mod m
  ○ h2() is a second, different hash function
    ■ Should still follow the same general rules as h() to be considered good, but needs to be different from h()
● h(x) == h(y) AND h2(x) == h2(y) should be very unlikely
  ○ Hence, it should be unlikely for two keys to use the same increment

Double Hashing – Example
● Insert keys: 4, 9, 14, 1, 19
● h(x) = x mod 5
● h2(x) = 3 – (x mod 3), which is always between 1 and 3, so the increment is never 0

  Key   h(x)            h2(x)                  Indices probed   Stored at
  4     4 mod 5 = 4     (not needed)           4                4
  9     9 mod 5 = 4     3 - (9 mod 3) = 3      4, 2             2
  14    14 mod 5 = 4    3 - (14 mod 3) = 1     4, 0             0
  1     1 mod 5 = 1     (not needed)           1                1
  19    19 mod 5 = 4    3 - (19 mod 3) = 2     4, 1, 3          3

  Index   0    1    2    3    4
  Key     14   1    9    19   4

Example
● https://www.cs.usfca.edu/~galles/visualization/ClosedHash.html
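The double hashing example can be replayed with a short sketch. The function names `h`, `h2`, and `insert` are assumptions chosen here to mirror the slide's notation, and the sketch only covers insertion of integer keys into a fixed-size table.

```python
def h(x, m):
    """Primary hash from the example: h(x) = x mod m."""
    return x % m


def h2(x):
    """Secondary hash from the example: h2(x) = 3 - (x mod 3), always in 1..3."""
    return 3 - (x % 3)


def insert(table, x):
    """Insert x using double hashing; return the list of indices probed."""
    m = len(table)
    step = h2(x)
    probes = []
    for i in range(m):
        idx = (h(x, m) + i * step) % m   # i-th probe: home index plus i increments
        probes.append(idx)
        if table[idx] is None:
            table[idx] = x
            return probes
    raise RuntimeError("no free slot found on this probe sequence")


table = [None] * 5
history = {}
for key in [4, 9, 14, 1, 19]:
    history[key] = insert(table, key)
    print(key, "probed", history[key])

print(table)
```

Because the table size 5 is prime, every increment from 1 to 3 is coprime with it, so each probe sequence can reach all five slots before repeating.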