Hash Tables with Open Addressing
Total Page:16
File Type:pdf, Size:1020Kb
Outline 1 Open Addressing Computer Science 331 2 Operations Hash Tables with Open Addressing Search Insert Delete Mike Jacobson 3 Collision Resolution Department of Computer Science University of Calgary Linear Probing Quadratic Probing Lecture #20 Double Hashing Analysis 4 Summary Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 1 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 2 / 26 Open Addressing Open Addressing Open Addressing Example 0 1 2 3 4 5 6 7 T: NIL 25 2 NIL 12 NIL 14 22 In a hash table with open addressing, all elements are stored in the hash table itself. U = f1; 2;:::; 200g For 0 i < m, T [i] is either m = 8 an element of the dictionary being stored, T : as shown above NIL, or h0 : Function such that DELETED (to be described shortly!) h0 : f1; 2;:::; 200g ! f0; 1;:::; 7g Note: The textbook refers to DELETED as a \dummy value." Eg. h0(k) = k mod 8 for k 2 U: h0 used here for rst try to place key in table. Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 3 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 4 / 26 Open Addressing Open Addressing New De nition of a Hash Function Probe Sequence We may need to make more than one attempt to nd a place to insert an The sequence of addresses element. hh(k; 0); h(k; 1);:::; h(k; m 1)i We'll use hash functions of the form is called the probe sequence for key k h : U ¢ f0; 1;:::; m 1g ! f0; 1;:::; m 1g Initial Requirement: h(k; i): Location to choose to place key k on an i th attempt to insert the key, if the locations examined on attempts hh(k; 0); h(k; 1);:::; h(k; m 1)i 0; 1;:::; i 1 were already full. is always a permutation of the integers between 0 and m 1. This location is not used if already occupied This is highly desirable condition::: but it is not satis ed by some of (i.e., if not NIL or DELETED) the hash functions that are frequently used. We will disucss what happens in the general case later in these notes. Function h0(k) from previous slide was: h(k; 0) Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 5 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 6 / 26 Operations Search Operations Search Search Pseudocode Example: Search for 9 The following algorithm either returns an integer i such that T[i] is equal to k, or throws a notFoundException (because k is not stored in 0 1 2 3 4 5 6 7 the hash table). T: NIL 25 2 NIL 12 NIL 14 22 int search (key k) f h : function such that i = 0; do f h : U ¢ f0; 1;:::; 7g ! f0; 1;:::; 7g j = h(k, i); if (T[j] == k) f and h(k; i) = k + i mod 8 for k 2 U and 0 i 7: return j; g; Probes when Searching for 9 : i++; g while ((T[j] != nil) && (i < m)); throw notFoundException; g Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 7 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 8 / 26 Operations Insert Operations Insert Insert Pseudocode: One Algorithm Example: Insert 1 The following algorithm either reports where the key k has been inserted or throws an appropriate exception int insert (key k) f 0 1 2 3 4 5 6 7 i = 0; T: NIL 25 2 NIL 12 NIL 14 22 while (i < m) f j = h(k, i); if (T[j] == nil) f Probe sequence: T[j] = k; return j; g else f if (T[j] == k) f throw foundException; g; g; i++; g; throw tableFullException; g Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 9 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 10 / 26 Operations Delete Operations Delete Delete Pseudocode Example: Delete 22 The next method either deletes k or returns an exception to indicate that k was not in the table. 0 1 2 3 4 5 6 7 T: NIL 25 2 NIL 12 NIL 14 22 void delete (key k) f i=0; do f Probe sequence: j = h(k, i); if (T[j] == k) f T[j] = deleted; return; g; Insert 30? i++; g while ((T[j] != nil) && (i < m)); throw notFoundException; g Question: Why not set T [j] = NIL, above? Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 11 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 12 / 26 Operations Delete Operations Delete Complication Insert: Another Algorithm The \value" DELETED is never overwritten. Exercise: once T [j] is marked DELETED it is not used to store an element of Write another version of the \Insert" algorithm that allows the dictionary! \DELETED" to be overwritten with an input key k Eventually a hash table might report over ows on insertions, even if Don't Forget: Make sure k can never be stored in two or more the the dictionary it stores is empty! locations at the same time! Unfortunately, cannot simply overwrite DELETED with NIL: How to do this: can cause searches to fail when they should succeed because insert terminates when a NIL entry is reached Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 13 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 14 / 26 Collision Resolution Collision Resolution Linear Probing More General Probe Sequences Linear Probing As previously noted it is not always true, in practice, that the sequence of addresses Let h(k) = h(k; 0) hh(k; 0); h(k; 1);:::; h(k; m 1)i Simple Form of Linear Probing: is a permutation h(k; i) = h(k) + i mod m for i 1 Good News: In this more general situation, it is still true that the search algorithm will return an integer i such that T[i] is equal to k if the given key k is stored in the table Generalization: the exceptions FoundException and notFoundException (used h(k; i) = h(k) + ci mod m for i 1 in the algorithms given previously) will still be thrown (precisely) when they are needed for some nonzero constant c (not depending on k or i) Bad News: tableFullException might now be thrown even though there are still some nil entries in the table Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 15 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 16 / 26 Collision Resolution Linear Probing Collision Resolution Quadratic Probing Strengths and Weaknesses Quadratic Probing Strengths: Let h(k) = h(k; 0) If c = 1 (or gcd(c; m) = 1) then the probe sequence is a permutation of 0; 1;:::; m 1 Simple form of Quadratic Probing: This hash function is easy to compute: For i 1 h(k; i) = h(k) + i 2 mod m h(k; i) = h(k; i 1) + c mod m : = h(k; i 1) + 2i 1 mod m for i 1 If linear probing is used, you can delete from a hash table without using DELETED at all, but the algorithm is more complicated. Generalization: 2 h(k; i) = h(k) + c0i + c1i Weakness: for a constant c and a nonzero constant c . Primary Clustering: 0 1 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 17 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 18 / 26 Collision Resolution Quadratic Probing Collision Resolution Double Hashing Strengths and Weaknesses Double Hashing Suppose h and h are both hash functions depending only on k, i.e., Strengths: 0 1 If gcd(m; c) = 1 and m 3 is prime then the probe sequence includes h0; h1 : U ! f0; 1;:::; m 1g (slightly) more than half of 0; 1;:::; m 1 The hash function is easy to compute: and such that h1(k) 6 0 mod m h(k; i) h(k; i 1) = + i 0 1 for every key k: for constants and 0 1 Double Hashing: Weakness: h(i; k) = (h0(k) + i h1(k)) mod m Secondary Clustering: Eg. h0(k) = k mod m; h1(k) = 1 + (k mod m 1) Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 19 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 20 / 26 Collision Resolution Double Hashing Collision Resolution Analysis Strengths and Weaknesses Summary Deletions complicate things: Strengths: Hash tables with chaining are often superior unless deletions are extremely rare (or do not happen at all) If m is prime and gcd(h1(k); m) = 1 then the probe sequence for k is a permutation of 0; 1;:::; m 1 Analysis and experimental results both suggest extremely good Expected number of probes for searches is too high for these tables to be expected performance useful when is close to one, where number of locations storing keys or DELETED = Weakness: m A bit more complicated than linear (or quadratic) probing Remaining slides show results concerning tables produced by inserting n keys k1; k2;:::; kn into an empty table (so = n=m) Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 21 / 26 Mike Jacobson (University of Calgary) Computer Science 331 Lecture #20 22 / 26 Collision Resolution Analysis Collision Resolution Analysis The Best We Can Hope For Analysis of Linear Probing (with c = 1) Uniform Hashing Assumption: Each of the m! permutations is equally Assumption: Each of the mn sequences likely as a probe sequence for a key.