Double Hashing

Intro & Coding

Hashing

Hashing - provides O(1) time on average for insert, search, and delete.

Hash function - maps a big number or string to a small integer that can be used as an index into a hash table.

Collision - two keys resulting in the same index.

Hashing – a Simple Example

● Arbitrary Size → Fixed Size

Insert(0): 0 % 5 = 0
Insert(5321): 5321 % 5 = 1
Insert(-8002): -8002 % 5 = 3
Insert(20000): 20000 % 5 = 0 (collides with 0, which already occupies index 0)

Index   0   1      2   3       4
Value   0   5321   -   -8002   -

Modular Hashing
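
The inserts in the simple example are themselves modular hashing with m = 5, and they can be replayed in a few lines of Python. This is only a minimal sketch: the names h and table are hypothetical, and keys are assumed to be integers already. Python's % returns a value in 0..m-1 for a positive m, which matches -8002 % 5 = 3 in the example; in C or Java the same expression gives -2, so an adjustment would be needed there.

```python
def h(x, m=5):
    """Map an integer key of arbitrary size to an index in range(m)."""
    # For a positive modulus, Python's % never returns a negative value,
    # so negative keys such as -8002 still land inside the table.
    return x % m


table = [None] * 5  # five slots, indices 0..4
collisions = []

for key in (0, 5321, -8002, 20000):
    i = h(key)
    if table[i] is None:
        table[i] = key
    else:
        # No collision resolution yet: just record what happened.
        collisions.append((key, i, table[i]))

print(table)       # [0, 5321, None, -8002, None]
print(collisions)  # [(20000, 0, 0)]
```

The key 20000 is left out of the table on purpose; deciding where it should go is the job of a collision resolution strategy.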

● Overall a good, simple, general approach to implementing a hash map
● Basic formula:
  ○ h(x) = c(x) mod m
    ■ Where c(x) converts x into a (possibly) large integer
● Generally want m to be a prime number
  ○ Consider m = 100
  ○ Only the two least significant decimal digits matter
    ■ h(1) = h(401) = h(4372901) = 1

Collision Resolution

● A strategy for handling the case when two or more keys to be inserted hash to the same index.
● Closed Addressing
  § Separate Chaining
● Open Addressing
  § Linear Probing
  § Double Hashing

Separate Chaining

● Make each cell of the hash table point to a linked list of the records that have the same hash function value

Key   Hash   Value
S     2      0
E     0      1
A     0      2
R     4      3
C     4      4
H     4      5
E     0      6
X     2      7
A     0      8
M     4      9
P     3      10
L     3      11
E     0      12

(Example from textbook ch3.4)
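
A minimal sketch of separate chaining, replaying the key/hash/value table from the textbook example. Several things here are assumptions rather than part of the slides: the class name ChainedHashTable, the table size M = 5 (chosen because the listed hash values range over 0..4), and the use of Python lists for the chains instead of hand-written linked lists. The slides do not give the hash function itself, so the listed hash values are simply looked up from a dictionary.

```python
# Hash values copied from the example table; the real hash function is not
# given, so this lookup dictionary is a stand-in for it.
HASH = {"S": 2, "E": 0, "A": 0, "R": 4, "C": 4,
        "H": 4, "X": 2, "M": 4, "P": 3, "L": 3}
M = 5  # assumed table size


class ChainedHashTable:
    def __init__(self, m, hash_fn):
        self.m = m
        self.hash_fn = hash_fn
        self.chains = [[] for _ in range(m)]  # one chain per cell

    def put(self, key, value):
        chain = self.chains[self.hash_fn(key) % self.m]
        for pair in chain:
            if pair[0] == key:      # key already present: update in place
                pair[1] = value
                return
        chain.append([key, value])  # otherwise extend the chain

    def get(self, key):
        for k, v in self.chains[self.hash_fn(key) % self.m]:
            if k == key:
                return v
        return None


table = ChainedHashTable(M, HASH.__getitem__)
# Values 0..12 are assigned in insertion order, as in the example table.
for value, key in enumerate("SEARCHEXAMPLE"):
    table.put(key, value)

for index, chain in enumerate(table.chains):
    print(index, chain)
```

This sketch appends new keys at the end of a chain; inserting at the front is an equally valid choice and only changes the order within each chain.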

Open Addressing

● Open addressing stores N key-value pairs in a hash table of size M > N, relying on the empty entries in the table to help with collision resolution
● If h(x) == h(y) == i
  ○ And x is stored at index i in an example hash table
  ○ If we want to insert y, we must try alternative indices
    ■ This means y will not be stored at HT[h(y)]
● We must select alternatives in a consistent and predictable way so that they can be located later

Linear Probing

● Insert:
  ○ If we cannot store a key at index i due to collision
    ■ If h0 = (Hash(x) + 0) % m is full, we try h1
    ■ If h1 = (Hash(x) + 1) % m is full, we try h2
    ■ And so on ...
    ■ Until an open space is found
● Search:
  ○ If another key is stored at index i
    ■ Check i+1, i+2, i+3, … until
      ● Key is found
      ● Empty location is found
      ● We circle through the buffer back to i

Probe sequence: hi(x) = (Hash(x) + i) % m, with m = 7

Key   Hash   Value
S     5      0
E     0      1
A     5      2
R     4      3
C     4      4
E     0      6
A     5      8
E     0      12

Resulting table:

Index   Key   Value
0       E     12
1       C     4
2       -     -
3       -     -
4       R     3
5       S     0
6       A     8
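
The insert and search rules for linear probing can be sketched as follows, using the m = 7 example. The function names put and get and the parallel keys/vals lists are hypothetical, and the hash values are taken from the example table through a lookup dictionary because the slides do not give the hash function.

```python
# Hash values from the example table (stand-in for a real hash function).
HASH = {"S": 5, "E": 0, "A": 5, "R": 4, "C": 4}
M = 7

keys = [None] * M
vals = [None] * M


def put(key, value):
    """Insert or update, probing (Hash(x) + i) % m for i = 0, 1, 2, ..."""
    start = HASH[key] % M
    for i in range(M):
        idx = (start + i) % M
        if keys[idx] is None or keys[idx] == key:
            keys[idx], vals[idx] = key, value
            return idx
    raise RuntimeError("hash table is full")


def get(key):
    """Search along the same probe sequence that put() used."""
    start = HASH[key] % M
    for i in range(M):
        idx = (start + i) % M
        if keys[idx] is None:
            return None       # empty location found: key is absent
        if keys[idx] == key:
            return vals[idx]  # key found
    return None               # circled through the whole table


for key, value in [("S", 0), ("E", 1), ("A", 2), ("R", 3),
                   ("C", 4), ("E", 6), ("A", 8), ("E", 12)]:
    put(key, value)

print(keys)  # ['E', 'C', None, None, 'R', 'S', 'A']
print(vals)  # [12, 4, None, None, 3, 0, 8]
```

C hashes to 4 but ends up at index 1: indices 4, 5, 6, and 0 are already occupied when it is inserted, so the probe wraps around the end of the table.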

● Delete:
  ○ Simply setting the key’s table position to null will not work: a later search would stop at the empty slot and miss keys stored further along in the same cluster
  ○ Need to reinsert into the table all of the keys in the cluster to the right of the deleted key
● Example: Delete S (m = 7), starting from the table above
  ○ Remove (S, 0) from index 5
  ○ Delete the key-value pair (A, 8) at index 6 and re-insert it: Hash(A) = 5 is now free, so A moves to index 5
  ○ Delete the key-value pair (E, 12) at index 0 and re-insert it: Hash(E) = 0, so E stays at index 0
  ○ Delete the key-value pair (C, 4) at index 1 and re-insert it: Hash(C) = 4, indices 4 and 5 are occupied, so C moves to index 6
  ○ Index 2 is empty, so the cluster has ended: done

Table after deleting S:

Index   Key   Value
0       E     12
1       -     -
2       -     -
3       -     -
4       R     3
5       A     8
6       C     4
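
The delete-and-reinsert procedure can be sketched like this. The class name LinearProbingTable and its layout are hypothetical, and the hash values again come from the example table through a lookup dictionary. The sketch assumes the table always has at least one empty slot (M > N), as open addressing requires.

```python
class LinearProbingTable:
    def __init__(self, m, hash_fn):
        self.m = m
        self.hash_fn = hash_fn
        self.keys = [None] * m
        self.vals = [None] * m

    def put(self, key, value):
        i = self.hash_fn(key) % self.m
        for _ in range(self.m):
            if self.keys[i] is None or self.keys[i] == key:
                self.keys[i], self.vals[i] = key, value
                return
            i = (i + 1) % self.m
        raise RuntimeError("hash table is full")

    def delete(self, key):
        # Step 1: locate the key along its probe sequence.
        i = self.hash_fn(key) % self.m
        for _ in range(self.m):
            if self.keys[i] is None:
                return False  # empty slot reached: key is absent
            if self.keys[i] == key:
                break
            i = (i + 1) % self.m
        else:
            return False      # probed every slot without finding the key
        # Step 2: empty the slot.
        self.keys[i] = self.vals[i] = None
        # Step 3: remove and re-insert every entry in the rest of the
        # cluster, stopping at the first empty slot.
        i = (i + 1) % self.m
        while self.keys[i] is not None:
            k, v = self.keys[i], self.vals[i]
            self.keys[i] = self.vals[i] = None
            self.put(k, v)
            i = (i + 1) % self.m
        return True


HASH = {"S": 5, "E": 0, "A": 5, "R": 4, "C": 4}  # from the example table
t = LinearProbingTable(7, HASH.__getitem__)
for key, value in [("S", 0), ("E", 1), ("A", 2), ("R", 3),
                   ("C", 4), ("E", 6), ("A", 8), ("E", 12)]:
    t.put(key, value)

t.delete("S")
print(t.keys)  # ['E', None, None, None, 'R', 'A', 'C']
print(t.vals)  # [12, None, None, None, 3, 8, 4]
```

An alternative some implementations use is to mark deleted slots with a tombstone instead of re-inserting; the sketch follows the re-insertion approach described in the slides.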

Double Hashing

● After a collision, instead of attempting to place the key x at (i + 1) mod m, look at (i + h2(x)) mod m
  ○ h2() is a second, different hash function
    ■ Should still follow the same general rules as h() to be considered good, but needs to be different from h()
● h(x) == h(y) AND h2(x) == h2(y) should be very unlikely
  ○ Hence, it should be unlikely for two keys to use the same increment
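
A minimal sketch of insertion with double hashing. The function name insert is hypothetical; h and h2 are the two functions used in this section's worked example (h(x) = x mod 5, h2(x) = 3 - (x mod 3)), and the demo keys are the ones from that example. Two properties matter for any choice of h2: it must never return 0, or the probe would stay at the same index, and the step should share no common factor with m so that the probe sequence can reach every slot. A prime m guarantees the second property for any step between 1 and m - 1.

```python
def h(x, m):
    """Primary hash: the starting index."""
    return x % m


def h2(x):
    """Secondary hash: the step size, always in {1, 2, 3} and never 0."""
    return 3 - (x % 3)


def insert(table, x):
    """Place x at the first free slot along i, i + h2(x), i + 2*h2(x), ... (mod m)."""
    m = len(table)
    step = h2(x)
    i = h(x, m)
    for _ in range(m):
        if table[i] is None or table[i] == x:
            table[i] = x
            return i
        i = (i + step) % m  # jump by the key-specific increment
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 5
for key in (4, 9, 14, 1, 19):
    insert(table, key)

print(table)  # [14, 1, 9, 19, 4]
```

Keys 9, 14, and 19 all collide with 4 at index 4, but each uses a different increment (3, 1, and 2), so they spread out instead of forming one cluster.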

Double Hashing – Example

● Insert Keys: 4, 9, 14, 1, 19
● h(x) = x mod 5
● h2(x) = 3 - (x mod 3)

Key   h(x)            h2(x)
4     4 mod 5 = 4     (not needed)
9     9 mod 5 = 4     3 - (9 mod 3) = 3
14    14 mod 5 = 4    3 - (14 mod 3) = 1
1     1 mod 5 = 1     (not needed)
19    19 mod 5 = 4    3 - (19 mod 3) = 2

● Placement:
  ○ 4 goes to index 4
  ○ 9 collides at index 4; with step 3 it goes to (4 + 3) mod 5 = 2
  ○ 14 collides at index 4; with step 1 it goes to (4 + 1) mod 5 = 0
  ○ 1 goes to index 1
  ○ 19 collides at index 4; with step 2, (4 + 2) mod 5 = 1 is occupied, so it goes to (1 + 2) mod 5 = 3

Index   0    1   2   3    4
Key     14   1   9   19   4

Example

● https://www.cs.usfca.edu/~galles/visualization/ClosedHash.html