Double Hashing
Intro & Coding Hashing
Hashing - provides O(1) time on average for insert, search, and delete.
Hash function - maps a large number or a string to a small integer that can be used as an index into a hash table.
Collision - two different keys map to the same index.
Hashing – a Simple Example
● Arbitrary Size → Fixed Size
Insert(0): 0 % 5 = 0
Insert(5321): 5321 % 5 = 1
Insert(-8002): -8002 % 5 = 3
Insert(20000): 20000 % 5 = 0, the same index as key 0 (a collision)

Index:  0     1      2     3      4
Key:    0     5321   -     -8002  -
Modular Hashing
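The simple example is already modular hashing with m = 5. A minimal sketch that replays its four inserts is shown here; the names `h`, `table`, and `collisions` are illustrative and not from the slides, and the sketch only records a collision rather than resolving it.

```python
def h(x: int, m: int = 5) -> int:
    """Map an integer of arbitrary size to an index in range(m)."""
    # Python's % returns a non-negative result for a positive modulus,
    # so -8002 % 5 is 3. In C or Java the same expression yields -2
    # and would need adjusting before it can be used as an index.
    return x % m

table = [None] * 5      # slots 0..4, all empty
collisions = []         # (key, index, key already stored there)

for key in (0, 5321, -8002, 20000):
    i = h(key)
    if table[i] is None:
        table[i] = key
    else:
        collisions.append((key, i, table[i]))

print(table)       # [0, 5321, None, -8002, None]
print(collisions)  # [(20000, 0, 0)]
```

The last insert is the interesting one: 20000 lands on the slot already taken by 0, which is the situation the collision resolution strategies later in the deck deal with.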
● Overall a good, simple, general approach to implementing a hash map
● Basic formula:
○ h(x) = c(x) mod m
■ Where c(x) converts x into a (possibly) large integer
● Generally want m to be a prime number
○ Consider m = 100
○ Only the two least significant decimal digits matter
■ h(1) = h(401) = h(4372901) = 1
Collision Resolution
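Collisions arise even with a simple modular hash. As a quick check of the m = 100 observation from the Modular Hashing slide, the sketch below compares m = 100 with the prime m = 97. The conversion function `c` is a hypothetical choice made for illustration (the slides do not define one), and 97 is just one convenient prime.

```python
def c(x) -> int:
    """Hypothetical c(x): turn a key into a (possibly large) integer."""
    if isinstance(x, int):
        return x
    # Treat the UTF-8 bytes of a string as one big base-256 number.
    return int.from_bytes(str(x).encode("utf-8"), "big")

def h(x, m: int) -> int:
    """Modular hash: h(x) = c(x) mod m."""
    return c(x) % m

keys = [1, 401, 4372901]
print([h(k, 100) for k in keys])  # [1, 1, 1]   all three collide
print([h(k, 97) for k in keys])   # [1, 13, 44] spread across the table
```

With m = 100 only the last two decimal digits of the key survive, so keys that share them always collide; a prime modulus mixes in the higher digits as well.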
● A strategy for handling the case when two or more keys to be inserted hash to the same index.
● Closed Addressing
○ Separate Chaining
● Open Addressing
○ Linear Probing
○ Quadratic Probing
○ Double Hashing
Separate Chaining
● Make each cell of the hash table point to a linked list of the records that have the same hash function value

Example from textbook ch3.4:
Key   Hash  Value
S     2     0
E     0     1
A     0     2
R     4     3
C     4     4
H     4     5
E     0     6
X     2     7
A     0     8
M     4     9
P     3     10
L     3     11
E     0     12
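A minimal sketch of separate chaining, replaying the key/hash/value table from the textbook example. The class name `ChainedTable` is illustrative, Python lists stand in for linked lists, and the hash values are copied from the table rather than computed, so the table size of 5 is an assumption inferred from the hash values 0 to 4.

```python
# Hash values copied from the example table (assumed table size m = 5).
HASH = {"S": 2, "E": 0, "A": 0, "R": 4, "C": 4,
        "H": 4, "X": 2, "M": 4, "P": 3, "L": 3}

class ChainedTable:
    def __init__(self, m, hash_fn):
        self.hash_fn = hash_fn
        # One chain per cell; a Python list plays the role of the linked list.
        self.chains = [[] for _ in range(m)]

    def put(self, key, value):
        chain = self.chains[self.hash_fn(key)]
        for pair in chain:
            if pair[0] == key:      # key already present: overwrite its value
                pair[1] = value
                return
        chain.append([key, value])  # new key: add it to this cell's chain

    def get(self, key):
        for k, v in self.chains[self.hash_fn(key)]:
            if k == key:
                return v
        return None

t = ChainedTable(5, HASH.__getitem__)
for value, key in enumerate("SEARCHEXAMPLE"):
    t.put(key, value)

for i, chain in enumerate(t.chains):
    print(i, chain)
```

This sketch appends new keys at the end of a chain; an implementation that inserts at the front would hold the same pairs in a different order. Cell 1 stays empty because no key in the example hashes to it.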
Open Addressing
● Store N key-value pairs in a hash table of size M > N, relying on the empty entries in the table to help with collision resolution
● If h(x) == h(y) == i
○ And x is stored at index i in an example hash table
○ If we want to insert y, we must try alternative indices
■ This means y will not be stored at HT[h(y)]
● We must select alternatives in a consistent and predictable way so that they can be located later
Linear Probing
● Insert:
○ If we cannot store a key at index i due to a collision, probe hi(x) = (Hash(x) + i) % m for i = 0, 1, 2, ...
■ If h0 = (Hash(x) + 0) % m is full we try for h1
■ If h1 = (Hash(x) + 1) % m is full we try for h2
■ And so on ...
■ Until an open space is found
● Search:
○ If another key is stored at index i
■ Check i+1, i+2, i+3 … until
● Key is found
● Empty location is found
● We circle through the buffer back to i

Example with m = 7. Keys in insertion order (inserting a key again overwrites its value):
Key   Hash  Value
S     5     0
E     0     1
A     5     2
R     4     3
C     4     4
E     0     6
A     5     8
E     0     12

Resulting table:
Index  Key  Value
0      E    12
1      C    4
2      -    -
3      -    -
4      R    3
5      S    0
6      A    8
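The insert and search rules for linear probing can be sketched as follows. The class `LinearProbingTable` and its method names are illustrative, and the hash values are copied from the example's key table (m = 7) rather than computed.

```python
# Hash values copied from the example's key table (m = 7).
HASH = {"S": 5, "E": 0, "A": 5, "R": 4, "C": 4}

class LinearProbingTable:
    def __init__(self, m, hash_fn):
        self.m = m
        self.hash_fn = hash_fn
        self.slots = [None] * m   # each slot is None or a (key, value) tuple

    def _probe(self, key):
        """Yield (Hash(x) + i) % m for i = 0, 1, ..., m - 1."""
        start = self.hash_fn(key) % self.m
        for i in range(self.m):
            yield (start + i) % self.m

    def put(self, key, value):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is None or slot[0] == key:   # open space, or same key: overwrite
                self.slots[idx] = (key, value)
                return idx
        raise RuntimeError("hash table is full")

    def get(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is None:        # empty location: the key is not in the table
                return None
            if slot[0] == key:      # key found
                return slot[1]
        return None                 # circled back to the start without finding it

t = LinearProbingTable(7, HASH.__getitem__)
for key, value in [("S", 0), ("E", 1), ("A", 2), ("R", 3),
                   ("C", 4), ("E", 6), ("A", 8), ("E", 12)]:
    t.put(key, value)

print(t.slots)
# [('E', 12), ('C', 4), None, None, ('R', 3), ('S', 0), ('A', 8)]
```

C is the key that shows the wrap-around: it hashes to 4, finds 4, 5, 6, and 0 occupied, and ends up at index 1.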
● Delete: Simply setting the key's table position to null will not work
○ A later search would stop at the empty slot and miss keys stored beyond it
○ Need to reinsert into the table all of the keys in the cluster to the right of the deleted key
● Example: Delete S, with m = 7 and hash values S = 5, E = 0, A = 5, R = 4, C = 4

Table before the delete:
Index  Key  Value
0      E    12
1      C    4
2      -    -
3      -    -
4      R    3
5      S    0
6      A    8

○ Delete the key-value pair (S, 0): index 5 becomes empty
○ Delete the key-value pair (A, 8) from index 6
○ Re-insert the key-value pair (A, 8): Hash(A) = 5 is now free, so A moves to index 5
○ Delete the key-value pair (E, 12) from index 0
○ Re-insert the key-value pair (E, 12): Hash(E) = 0 is free, so E returns to index 0
○ Delete the key-value pair (C, 4) from index 1
○ Re-insert the key-value pair (C, 4): Hash(C) = 4, indices 4 and 5 are full, so C moves to index 6
○ Index 2 is empty, so the cluster has ended: done

Table after the delete:
Index  Key  Value
0      E    12
1      -    -
2      -    -
3      -    -
4      R    3
5      A    8
6      C    4
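The delete-and-reinsert procedure can be sketched as below, replaying the "Delete S" example. The function names `put` and `delete` are illustrative, the hash values are copied from the example, and the sketch assumes the table always keeps at least one empty slot (otherwise the probing loops would not terminate).

```python
# Hash values copied from the example (m = 7).
HASH = {"S": 5, "E": 0, "A": 5, "R": 4, "C": 4}
M = 7

def put(slots, key, value):
    """Linear-probing insert. Assumes at least one slot is empty."""
    i = HASH[key] % M
    while slots[i] is not None and slots[i][0] != key:
        i = (i + 1) % M
    slots[i] = (key, value)

def delete(slots, key):
    """Remove key, then re-insert the rest of its cluster."""
    i = HASH[key] % M
    # Locate the key; reaching an empty slot means it is not in the table.
    while slots[i] is not None and slots[i][0] != key:
        i = (i + 1) % M
    if slots[i] is None:
        return False
    slots[i] = None
    # Walk the cluster after the deleted slot: remove and re-insert each
    # entry so that no key is left stranded behind the new gap.
    i = (i + 1) % M
    while slots[i] is not None:
        k, v = slots[i]
        slots[i] = None
        put(slots, k, v)
        i = (i + 1) % M
    return True

# State of the table before "Delete S".
slots = [("E", 12), ("C", 4), None, None, ("R", 3), ("S", 0), ("A", 8)]
delete(slots, "S")
print(slots)
# [('E', 12), None, None, None, ('R', 3), ('A', 8), ('C', 4)]
```

An alternative that avoids re-insertion is to mark deleted slots with a tombstone value; that is a different technique from the one on the slides and is not shown here.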
Double Hashing
● After a collision at index i, instead of attempting to place the key x at (i + 1) mod m, look at (i + h2(x)) mod m
○ h2() is a second, different hash function
■ Should still follow the same general rules as h() to be considered good, but needs to be different from h()
● h(x) == h(y) AND h2(x) == h2(y) should be very unlikely
○ Hence, it should be unlikely for two keys to use the same increment
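To make the "different increment" idea concrete, the sketch below lists the slots that double hashing would try for two keys with the same h(x). The function `probe_sequence` and the particular h and h2 used here (x mod 7 and 1 + (x mod 5)) are hypothetical choices for illustration, not taken from the slides.

```python
def probe_sequence(x, m, h, h2):
    """Indices tried for key x, in order: (h(x) + j * h2(x)) mod m, j = 0..m-1."""
    start, step = h(x) % m, h2(x)
    return [(start + j * step) % m for j in range(m)]

# Hypothetical hash functions for a table of size m = 7.
m = 7
h = lambda x: x % 7
h2 = lambda x: 1 + (x % 5)   # always in 1..5, so never a zero step

# 10 and 17 collide under h (both map to 3) but use different increments.
print(probe_sequence(10, m, h, h2))  # [3, 4, 5, 6, 0, 1, 2]
print(probe_sequence(17, m, h, h2))  # [3, 6, 2, 5, 1, 4, 0]
```

Both sequences visit every slot because the step is non-zero and shares no common factor with the prime table size; a step of 0, or one that divides m, would revisit the same few slots.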
Double Hashing – Example
● Insert Keys: 4, 9, 14, 1, 19
● h(x) = x mod 5
● h2(x) = 3 - (x mod 3)

Key  h(x)           h2(x) (only needed after a collision)  Placed at
4    4 mod 5 = 4    -                                      4
9    9 mod 5 = 4    3 - (9 mod 3) = 3                      (4 + 3) mod 5 = 2
14   14 mod 5 = 4   3 - (14 mod 3) = 1                     (4 + 1) mod 5 = 0
1    1 mod 5 = 1    -                                      1
19   19 mod 5 = 4   3 - (19 mod 3) = 2                     (4 + 2) mod 5 = 1 is full, (1 + 2) mod 5 = 3

Index:  0    1    2    3    4
Key:    14   1    9    19   4
Example
● https://www.cs.usfca.edu/~galles/visualization/ClosedHash.html
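The double hashing example with keys 4, 9, 14, 1, 19 can also be replayed in code. This is a minimal sketch using the slide's h(x) = x mod 5 and h2(x) = 3 - (x mod 3); the `insert` helper and its give-up-after-m-probes rule are assumptions made for illustration.

```python
def h(x):
    return x % 5

def h2(x):
    return 3 - (x % 3)   # always 1, 2, or 3, so the step is never 0

def insert(table, x):
    """Place x using double hashing; return the index where it landed."""
    m = len(table)
    i, step = h(x), h2(x)
    for _ in range(m):            # assumed limit: give up after m probes
        if table[i] is None:
            table[i] = x
            return i
        i = (i + step) % m        # collision: jump ahead by h2(x)
    raise RuntimeError("no free slot found")

table = [None] * 5
placed = [insert(table, k) for k in (4, 9, 14, 1, 19)]

print(placed)  # [4, 2, 0, 1, 3]
print(table)   # [14, 1, 9, 19, 4]
```

Keys 9, 14, and 19 all hash to index 4, but their increments (3, 1, and 2) send them to different slots, which matches the final table in the example.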