CSE 2331
Table Doubling
Dynamic Arrays
Allocate an array of size 10. What if you try to insert 11 elements in the array? Need to reallocate the array.
Reallocate to size 11. Insert element. Reallocate to size 12. Insert element. etc.
How much time does it take to insert 100 elements? How much time does it take to insert n elements?
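One way to answer these questions, as a sketch under an assumed cost model that the slide does not state: writing an element costs c, and each reallocation copies every element already stored at cost c per element. Starting from the full array of size 10, inserting element k (for k ≥ 11) first copies k − 1 elements.

```latex
T(n) = cn + \sum_{k=11}^{n} c\,(k-1)
     = cn + c\left(\frac{n(n-1)}{2} - 45\right) \in \Theta(n^2),
\qquad
T(100) = 100c + c\,(4950 - 45) = 5005c .
```

Under this model, growing the array by one slot at a time makes n insertions cost quadratic time.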
Open Address Hashing
Hash table of size m = 20.
For Θ(1) expected running time, need:
# elements n ≤ 20/2 = 10.
After inserting 10 elements, need to increase hash table size m.
Array Doubling
Start with array of size 2. After inserting 2 elements, replace with array of size 4. After inserting 4 elements, replace with array of size 8. After inserting 8 elements, replace with array of size 16. etc.
Array Doubling
function Insert(x, A, m, n)
/* A is an array of size m. */
/* n = # elements in A. */
  if (n = m) then
    Create new array A2 with size 2m;
    for i ← 1 to m do
      A2[i] ← A[i];
    end
    Replace array A with A2;
    m ← 2m;
  end
  n ← n + 1;
  A[n] ← x;
Example
A : [ , ]
Insert: 21
A : [21, ]
Insert: 22
A : [21, 22]
Insert: 41
A : [21, 22, 41, ] (Array size m = 4.)
Insert: 42
A : [21, 22, 41, 42]
Insert: 81
A : [21, 22, 41, 42, 81, , , ] (Array size m = 8.)
Insert: 82
Insert: 83
Insert: 84
A : [21, 22, 41, 42, 81, 82, 83, 84]
Insert: 61
A : [21, 22, 41, 42, 81, 82, 83, 84, 61, , , , , , , ] (Array size m = 16.)
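The Insert pseudocode and the example above can be sketched in Python. This is a minimal illustration, not the course's implementation: the class name DynamicArray and its fields are hypothetical, and the array is 0-indexed, unlike the slides.

```python
class DynamicArray:
    """Array that doubles its allocated size when full (0-indexed)."""

    def __init__(self, capacity=2):
        self.m = capacity            # allocated size
        self.n = 0                   # number of stored elements
        self.A = [None] * capacity   # backing storage

    def insert(self, x):
        if self.n == self.m:                 # array is full: double it
            A2 = [None] * (2 * self.m)
            for i in range(self.m):          # copy the m old elements
                A2[i] = self.A[i]
            self.A = A2
            self.m *= 2
        self.A[self.n] = x
        self.n += 1


arr = DynamicArray()
sizes = []
for x in [21, 22, 41, 42, 81, 82, 83, 84, 61]:
    arr.insert(x)
    sizes.append(arr.m)              # allocated size after each Insert
print(sizes)
```

The recorded sizes follow the example: the array grows to 4 on the third insert, to 8 on the fifth, and to 16 on the ninth.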
Running Time Analysis
function Insert(x, A, m, n)
  if (n = m) then
    Create new array A2 with size 2m;
    for i ← 1 to m do
      A2[i] ← A[i];
    end
    Replace array A with A2;
    m ← 2m;
  end
  n ← n + 1;
  A[n] ← x;
What is the running time of Insert when m ≠ n?
What is the running time of Insert when m = n?
Running Times
# elements   Array size m   Insert time
0            2              c
1            2              c
2            2              c + 4c
3            4              c
4            4              c + 8c
5            8              c
6            8              c
7            8              c
8            8              c + 16c
9            16             c
10           16             c
...          ...            ...
15           16             c
16           16             c + 32c
17           32             c
Running Time Analysis
Total running time for n insertions:
T(n) = cn + (Cost of doubling)
     = cn + (4c + 8c + 16c + 32c + ... + nc/2 + nc + 2nc)
     = cn + (2nc + nc + nc/2 + ... + 32c + 16c + 8c + 4c)
     = cn + 2nc(1 + 1/2 + 1/4 + ... + 2/n)
     ≤ cn + 4cn = 5cn.
T(n) = cn + (Cost of doubling) ≥ cn.
Since cn ≤ T (n) ≤ 5cn,
T (n) ∈ Θ(n).
Amortized Analysis
Cost of one single operation may vary greatly. Average cost is much lower than the highest cost.
Example: Array doubling.
Insert cost: c or c + 2cn.
Total cost of n Inserts: at most 5cn.
Average Insert cost: at most 5cn/n = 5c ∈ Θ(1).
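The amortized bound can be checked numerically. The sketch below assumes the cost model of the Running Times table (c per Insert, plus c times the new array size when the array doubles) and an initial array of size 2. The function name total_insert_cost is hypothetical.

```python
def total_insert_cost(n, c=1):
    """Total cost of n Inserts into an initially empty array of size 2."""
    m, stored, total = 2, 0, 0
    for _ in range(n):
        if stored == m:          # full: pay 2m*c to build the array of size 2m
            total += 2 * m * c
            m *= 2
        total += c               # pay c to write the new element
        stored += 1
    return total


for n in (1, 10, 100, 1000, 10**5):
    t = total_insert_cost(n)
    print(n, t, t / n)           # n, total cost, average cost per Insert
```

For every n the total lies between cn and 5cn, so the average cost per Insert is bounded by a constant even though a single Insert can cost c + 2cn.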
Hash Table Doubling
function Dict.Insert(K, D)
  m ← HashTable.size;
  if (HashTable.NumElements ≥ m/2) then
    Create new hash table HashTable2 with size 2m;
    for i ← 1 to HashTable.size do
      if (HashTable[i] is not empty) then
        /* Use a separate variable K2 so that the key K being inserted is not overwritten. */
        K2 ← HashTable[i].key;
        HashTable2.Insert(K2, HashTable[i].data);
      end
    end
    Replace HashTable with HashTable2;
  end
  HashTable.Insert(K, D);
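A minimal Python sketch of hash table doubling follows. The slides do not specify the probing scheme, so linear probing is an assumption here, as are the class name OpenAddressDict, the helper _probe_insert, and the find method. The table doubles once it is half full, as in Dict.Insert.

```python
class OpenAddressDict:
    """Open-address hash table (linear probing) that doubles at load factor 1/2."""

    def __init__(self, size=20):
        self.size = size
        self.num_elements = 0
        self.slots = [None] * size           # each slot is None or a (key, data) pair

    def _probe_insert(self, slots, key, data):
        """Place (key, data) in slots; return True if key was not already present."""
        i = hash(key) % len(slots)
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) % len(slots)         # linear probing: try the next slot
        is_new = slots[i] is None
        slots[i] = (key, data)
        return is_new

    def insert(self, key, data):
        if self.num_elements >= self.size / 2:       # half full: double the table
            slots2 = [None] * (2 * self.size)
            for entry in self.slots:                 # rehash every stored pair
                if entry is not None:
                    self._probe_insert(slots2, entry[0], entry[1])
            self.slots = slots2
            self.size *= 2
        if self._probe_insert(self.slots, key, data):
            self.num_elements += 1

    def find(self, key):
        i = hash(key) % self.size
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % self.size
        return None                                  # reached an empty slot: absent


d = OpenAddressDict(size=20)
for k in range(25):
    d.insert(k, k * k)
print(d.size, d.num_elements, d.find(7))
```

Because the table is never more than half full, every probe sequence reaches an empty slot, so both loops terminate.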
Deletions from Dynamic Tables
Pop
function Pop(x, A, m, n)
/* A is an array of size m. */
/* n = # elements in A. */
  if (n = 0) then error “Empty array.”;
  x ← A[n];
  n ← n − 1;
  return (x);
Dynamic Table Deletions
m = size of array A. n = number of elements in A. Want to shrink the array if n is a lot less than m.
Proposal: Create new array of size m/2 if n ≤ m/2.
What’s the problem?
Insert/Delete problem
Operation   # elements   Array size m   Time
            32           32
Insert                                  c + 64c
            33           64
Delete                                  c + 32c
            32           32
Insert                                  c + 64c
            33           64
Delete                                  c + 32c
            32           32
Insert                                  c + 64c
            33           64
Delete                                  c + 32c
            32           32
Dynamic Table Deletions
m = size of array A. n = number of elements in A. Want to shrink the array if n is a lot less than m.
Solution: Create new array of size m/2 if n ≤ m/4.
Dynamic Table Pop
function DynamicPop(x, A, m, n)
/* A is an array of size m. */
/* n = # elements in A. */
  if (n = 0) then error “Empty array.”;
  x ← A[n];
  n ← n − 1;
  if ((n ≤ m/4) and (m ≥ 4)) then
    Create new array A2 with size m/2;
    for i ← 1 to n do
      A2[i] ← A[i];
    end
    Replace array A with array A2;
    m ← m/2;
  end
  return (x);
Example
A : [31, 32, 33, 34, 35, 36, , , , , , , , , , ] (Array has size m = 16.)
DynamicPop
A : [31, 32, 33, 34, 35, , , , , , , , , , , ]
DynamicPop
A : [31, 32, 33, 34, , , , ] (Array has size m = 8.)
DynamicPop
A : [31, 32, 33, , , , , ]
DynamicPop
A : [31, 32, , ] (Array has size m = 4.)
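The example can be reproduced with a short Python sketch that combines doubling on Insert with halving on DynamicPop. The class name DynamicTable and the helper _resize are hypothetical, and indices are 0-based.

```python
class DynamicTable:
    """Array that doubles when full and halves when at most a quarter full."""

    def __init__(self, capacity=2):
        self.m = capacity            # allocated size
        self.n = 0                   # number of stored elements
        self.A = [None] * capacity

    def _resize(self, new_m):
        A2 = [None] * new_m
        for i in range(self.n):      # copy the n stored elements
            A2[i] = self.A[i]
        self.A = A2
        self.m = new_m

    def insert(self, x):
        if self.n == self.m:
            self._resize(2 * self.m)
        self.A[self.n] = x
        self.n += 1

    def pop(self):
        if self.n == 0:
            raise IndexError("Empty array.")
        x = self.A[self.n - 1]
        self.A[self.n - 1] = None
        self.n -= 1
        if self.n <= self.m / 4 and self.m >= 4:     # shrink rule from DynamicPop
            self._resize(self.m // 2)
        return x


t = DynamicTable(capacity=16)
for x in [31, 32, 33, 34, 35, 36]:
    t.insert(x)
history = []
for _ in range(4):
    t.pop()
    history.append((t.n, t.m))       # (# elements, array size) after each pop
print(history)
```

The array shrinks to size 8 when 4 elements remain and to size 4 when 2 remain, matching the example.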
Dynamic Table Pop
function DynamicPop(x, A, m, n)
  if (n = 0) then error “Empty array.”;
  x ← A[n];
  n ← n − 1;
  if ((n ≤ m/4) and (m ≥ 4)) then
    Create new array A2 with size m/2;
    for i ← 1 to n do
      A2[i] ← A[i];
    end
    Replace array A with array A2;
    m ← m/2;
  end
  return (x);
What is the running time of DynamicPop when n ≠ m/4?
What is the running time of DynamicPop when n = m/4?
Running Times
# elements   Array size m   DynamicPop time
18           64             c
17           64             c + 32c
16           32             c
...          ...            ...
10           32             c
9            32             c + 16c
8            16             c
7            16             c
6            16             c
5            16             c + 8c
4            8              c
3            8              c + 4c
2            4              c + 2c
1            2              c
Running Time Analysis
Assume number of elements n = array size m = 2^k.
Total running time for n deletes:
T(n) = cn + (Cost of table halving)
     ≤ cn + (cn + cn/2 + cn/4 + cn/8 + ... + 2c)
     = cn + cn(1 + 1/2 + 1/4 + 1/8 + ... + 2/n)
     ≤ cn + 2cn = 3cn.
T(n) = cn + (Cost of table halving) ≥ cn.
Since cn ≤ T (n) ≤ 3cn,
T (n) ∈ Θ(n).
Insert & Delete Running Time
n inserts and n deletes, in any order: the inserts need not all precede the deletes.
Total running time of the 2n operations is still Θ(n).
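The effect of the shrink threshold can be illustrated with a small cost simulation. This is a sketch under the cost model of the tables above (c per operation, plus c times the new array size on each resize). The function name total_cost and the operation string encoding are hypothetical.

```python
def total_cost(ops, shrink_at, m=2, c=1):
    """Cost of a sequence of 'I' (insert) and 'D' (delete) operations.

    The array doubles when full and halves when n <= shrink_at * m.
    """
    n, total = 0, 0
    for op in ops:
        total += c                       # every operation pays c
        if op == "I":
            if n == m:                   # full: double, pay for the new array
                m *= 2
                total += c * m
            n += 1
        else:
            n -= 1
            if n <= shrink_at * m and m >= 4:   # sparse: halve, pay for the new array
                m //= 2
                total += c * m
    return total


ops = "I" * 32 + "ID" * 1000             # fill 32 slots, then alternate at the boundary
print(total_cost(ops, shrink_at=1/2))    # proposal: shrink when n <= m/2
print(total_cost(ops, shrink_at=1/4))    # solution: shrink when n <= m/4
```

With the m/2 threshold every Insert/Delete pair in this sequence triggers a doubling and a halving, while with the m/4 threshold the array is resized only once after the initial fill, so the total stays linear in the number of operations.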
Data Structures for Disjoint Sets
Union-Find Data Structure
Represent disjoint sets: Each element is in exactly one set.
Operations:
  MakeSet(x) - Create a new set containing element x.
  Union(x, y) - Union of the sets containing x and y.
  FindSet(x) - Return a reference to a representative element of the set containing x.
Connected Component
[Figure: a graph whose vertices include v1, v2, v11, and v12.]
Are vi and vj in the same connected component?
Connected Component
/* Create a set for each vertex */
foreach vertex vi do MakeSet(vi);
while (Not Done) do
  ...
  /* Add edge (vi, vj) */
  if (FindSet(vi) ≠ FindSet(vj)) then
    Union(vi, vj);
  end
  ...
  /* Report if vx and vy are in the same component */
  if (FindSet(vx) = FindSet(vy)) then
    Print “vx and vy are in the same component.”
  end
  ...
end
Linked List Based Union-Find
[Figure: two linked lists, x1 → x2 → x3 → x4 → x5 and x6 → x7 → x8 → x9 → x10 → x11. Each node has a head pointer to the first node of its list; the first node has a tail pointer to the last node.]
procedure FindSet(x)
  return (x.head);
Linked List Based Union-Find
[Figure: the lists x1 → x2 → x3 → x4 → x5 and x6 → x7 → x8 → x9, each with head and tail pointers.]
procedure Union(x, y)
  x′ ← FindSet(x);
  y′ ← FindSet(y);
  x′.tail.next ← y′;
  w ← y′;
  while (w ≠ NULL) do
    w.head ← x′;
    w ← w.next;
  end
  x′.tail ← y′.tail;
procedure MakeSet(x)
  x.head ← x;
  x.tail ← x;
  x.next ← NULL;
Linked List: Weighted Union
procedure WeightedUnion(x, y)
  x′ ← FindSet(x);
  y′ ← FindSet(y);
  if (x′ = y′) then return;
  if (x′.length ≥ y′.length) then
    x′.tail.next ← y′;
    w ← y′;
    while (w ≠ NULL) do
      w.head ← x′;
      w ← w.next;
    end
    x′.length ← x′.length + y′.length;
    x′.tail ← y′.tail;
  else
    WeightedUnion(y, x);
  end
Weighted Union: Analysis
For a given node xi, how many times does xi.head change?
Min size of set containing xi   Cost of changing xi.head
1                               c
2                               c
4                               c
8                               c
...                             ...
n/2                             c
n                               c
Ki = Total cost of changing xi.head ≤ c + c + ... + c (log2(n) terms) = c log2(n).
Total for all xi: Σ_{i=1}^{n} Ki ≤ Σ_{i=1}^{n} c log2(n) = cn log2(n).
Weighted Union: Analysis
Lower bound:
WeightedUnion(x1, x2); WeightedUnion(x3, x4); WeightedUnion(x5, x6); WeightedUnion(x7, x8); ...
WeightedUnion(x1, x3); WeightedUnion(x5, x7); WeightedUnion(x9, x11); ...
WeightedUnion(x1, x5); WeightedUnion(x9, x13); ...
Time: (c + c + ... + c) + (2c + 2c + ... + 2c) + (4c + 4c + ... + 4c) + ... = cn log2(n),
where the first group has n terms, the second ⌊n/2⌋ terms, and the third ⌊n/4⌋ terms.
Takes Ω(n log(n)) time.
Linked List: Weighted Union
procedure MakeSet(x)
  x.head ← x;
  x.tail ← x;
  x.next ← NULL;
  x.length ← 1;
procedure FindSet(x)
  return (x.head);
Tree Based Union-Find
[Figure: forest of trees on nodes x1, ..., x17. Each node points to its parent; the roots x1, x2, x3, x4 point to themselves.]
procedure FindSet(x)
  if (x = x.parent) then return (x);
  else
    z ← FindSet(x.parent);
    return (z);
  end
procedure Union(x, y)
  x′ ← FindSet(x);
  y′ ← FindSet(y);
  y′.parent ← x′;
procedure MakeSet(x)
  x.parent ← x;
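The linked-list representation with WeightedUnion described above can be sketched in Python. This is an illustration under stated assumptions, not the course's code: the Node class and the function names are hypothetical, the recursive swap in the pseudocode is replaced by an in-place swap, and weighted_union returns the number of head updates so that the cost can be observed. The driver performs the pairing sequence from the lower-bound argument for n = 8.

```python
class Node:
    def __init__(self, name):
        self.name = name
        # MakeSet: a one-element list whose head and tail are the node itself
        self.head = self
        self.tail = self      # only meaningful on the head node
        self.next = None
        self.length = 1       # only meaningful on the head node


def find_set(x):
    return x.head


def weighted_union(x, y):
    """Append the shorter list to the longer one; return the number of head updates."""
    xh, yh = find_set(x), find_set(y)
    if xh is yh:
        return 0
    if xh.length < yh.length:
        xh, yh = yh, xh            # keep the longer list in front
    xh.tail.next = yh
    updates = 0
    w = yh
    while w is not None:           # relabel every node of the shorter list
        w.head = xh
        w = w.next
        updates += 1
    xh.length += yh.length
    xh.tail = yh.tail
    return updates


nodes = [Node(f"x{i}") for i in range(1, 9)]
total = 0
step = 1
while step < len(nodes):           # (x1,x2),(x3,x4),... then (x1,x3),(x5,x7), then (x1,x5)
    for i in range(0, len(nodes) - step, 2 * step):
        total += weighted_union(nodes[i], nodes[i + step])
    step *= 2
print(total)
```

Each round of this sequence relabels half of the nodes, and there are log2(n) rounds, which is why the number of head updates grows like n log(n).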
Height Based Union
procedure MakeSet(x)
  x.parent ← x;
  x.height ← 0;
procedure UnionByHeight(x, y)
  x′ ← FindSet(x);
  y′ ← FindSet(y);
  if (x′.height > y′.height) then y′.parent ← x′;
  else if (x′.height < y′.height) then x′.parent ← y′;
  else
    y′.parent ← x′;
    x′.height ← x′.height + 1;
  end
[Figure: forest of trees on nodes x1, ..., x17 with roots x1, x2, x3, x4.]
Proposition 1. If r is the root of tree T, then r.height is the height of tree T.
Height Based Union: Proof of Proposition 1
Proof. Proof by induction on the number of vertices.
Base case: If a tree T has one vertex, r, then r.height equals 0 and height(T) equals 0.
Induction hypothesis: If tree T has fewer than n vertices and r is the root of T, then r.height = height(T).
Induction Step: Show that if tree T has n > 1 vertices and r is the root of T, then r.height = height(T).
Tree T was created by UnionByHeight(x, y) where x is in tree Tx and y is in tree Ty. Let x′ be the root of Tx and y′ be the root of Ty.
By the induction hypothesis, x′.height = height(Tx) and y′.height = height(Ty).
Case I: x′.height > y′.height.
  r.height = x′.height = height(Tx) = height(T). (Why?)
Case II: x′.height < y′.height: Same argument as Case I.
Case III: x′.height = y′.height.
  r.height = x′.height + 1 = height(Tx) + 1 = height(T). (Why?)
Height Based Union
Definition. size(T) = number of vertices of tree T.
Proposition 2. If T has height h, then size(T) ≥ 2^h.
Height Based Union: Proof of Proposition 2
Proof. Proof by induction on size(T).
Base case: If size(T) = 1, then height(T) equals 0 and size(T) = 1 ≥ 2^height(T).
Induction hypothesis: If size(T) < n, then size(T) ≥ 2^height(T).
Induction Step: Show that if size(T) = n > 1, then size(T) ≥ 2^height(T).
Tree T was created by UnionByHeight(x, y) where x is in tree Tx and y is in tree Ty. Let x′ be the root of Tx and y′ be the root of Ty.
By the induction hypothesis, size(Tx) ≥ 2^height(Tx) and size(Ty) ≥ 2^height(Ty).
Case I: x′.height > y′.height.
  height(T) = r.height = x′.height = height(Tx) (Proposition 1).
  size(T) = size(Tx) + size(Ty) ≥ size(Tx) ≥ 2^height(Tx) = 2^height(T). (Why?)
Case II: x′.height < y′.height: Same argument as Case I.
Case III: x′.height = y′.height.
  height(T) = r.height = x′.height + 1 = height(Tx) + 1 (Prop. 1).
  height(Tx) = x′.height = y′.height = height(Ty) (Prop. 1).
  size(T) = size(Tx) + size(Ty) ≥ 2^height(Tx) + 2^height(Ty) = 2 × 2^height(Tx) = 2^(height(Tx)+1) = 2^height(T). (Why?)
Height Based Union
Corollary. If T has height h, then h ≤ log2(size(T)).
Proof. By Proposition 2, size(T) ≥ 2^h, so log2(size(T)) ≥ log2(2^h) = h.
Find With Path Compression
[Figure: forest of trees on nodes x1, ..., x17 with roots x1, x2, x3, x4.]
procedure FindSetPathCompress(x)
  if (x = x.parent) then return (x);
  else
    z ← FindSetPathCompress(x.parent);
    x.parent ← z;
    return (z);
  end
Find With Path Compression: Example
[Figure: example tree on nodes x1, ..., x18 with x15 at the top.]
Union by Rank with Path Compression
Combine height based union with path compression.
Replace x.height with x.rank.
Proposition 1: If r is the root of tree T, then height(T) ≤ r.rank.
Proposition: The cost of n UnionByRank and FindSetPathCompress operations is O(nα(n)), where α(n) is the inverse of the Ackermann function.
The Ackermann Function
A_1(n) = 2n.
A_2(n) = 2 × 2 × ··· × 2 (n factors) = 2^n.
A_3(n) = 2^2^...^2 (a tower of n 2s).
A_k(n) = A_{k−1}^(n)(1) = A_{k−1}(A_{k−1}(A_{k−1}(... (A_{k−1}(1)) ...))) (A_{k−1} applied n times).
Inverse of A_1(n) = ⌈n/2⌉.
Inverse of A_2(n) = ⌈log2(n)⌉.
Inverse of A_3(n) = log*(n), the number of times log2 must be applied, log2(log2(log2(... (log2(n)) ...))), before the result is at most 1.
α(n) = min{k : A_k(k) ≥ n}.
(Slightly different from the definition in Introduction to Algorithms by CLRS.)
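Union by rank combined with path compression can be sketched in Python as follows. The slides do not give pseudocode for UnionByRank, so this is an assumed version modeled on UnionByHeight with height replaced by rank. The class name DisjointSets and the dictionary-based storage are hypothetical, and the check that the two roots differ is an addition.

```python
class DisjointSets:
    """Tree-based union-find with union by rank and path compression."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        # FindSetPathCompress: every node on the search path ends up pointing at the root
        if self.parent[x] != x:
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        xr, yr = self.find_set(x), self.find_set(y)
        if xr == yr:                       # already in the same set (added check)
            return
        if self.rank[xr] < self.rank[yr]:
            xr, yr = yr, xr                # xr is the root of larger or equal rank
        self.parent[yr] = xr
        if self.rank[xr] == self.rank[yr]:
            self.rank[xr] += 1             # equal ranks: the new root's rank grows by one


# Connected components: add edges, then ask whether two vertices are connected.
ds = DisjointSets()
for v in range(1, 7):
    ds.make_set(v)
for u, v in [(1, 2), (2, 3), (4, 5)]:
    ds.union(u, v)
same_1_3 = ds.find_set(1) == ds.find_set(3)
same_3_4 = ds.find_set(3) == ds.find_set(4)
print(same_1_3, same_3_4)
```

The recursion in find_set stays shallow because, by the corollary above, a tree built by rank-based unions has height at most log2 of its size.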