CSE 638: Advanced Algorithms
Lectures 20 & 21 ( Cache-oblivious Priority Queue with Decrease-Keys )
Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2013 Cache-Oblivious Buffer Heap
Amortized I/O Bounds
Priority Queue Delete / Delete-Min Decrease-Key
Cache- Buffer Heap oblivious 1 N O log B2 B Cache- Tournament Tree aware
Binary Heap O(((log N ))) ( worst-case ) 2 Internal Memory 1 Fibonacci Heap O(((log 2 N ))) O ((( ))) Cache-Oblivious Buffer Heap: Structure
Consists of r = 1 + log 2N levels, where N = total number of elements.
For 0 ≤≤≤ i ≤≤≤ r - 1, level i contains two buffers:
Element Buffers Update Buffers element buffer Bi B U contains elements of the form 0 0 B U (x, kx), where x is the element 1 1
id, and kx is its key B2 U2
update buffer Ui B U contains updates ( Delete , i i Decrease-Key and Sink ), each B U augmented with a time-stamp. r - 1 r - 1 Fig: The Buffer Heap Cache-Oblivious Buffer Heap: Invariants
i Element Buffers Update Buffers Invariant 1 : |Bi|≤≤≤ 2
B0 U0
U Invariant 2 : B1 1
B2 U2 (a) No key in Bi is larger than any key in Bi+1
(b) For each element x in Bi, all updates yet Bi Ui to be applied on x reside in U0, U1, … , Ui
Br - 1 Ur - 1 Invariant 3 : Fig: The Buffer Heap (a) Each Bi is kept sorted by element id
(b) Each Ui ( except U0 ) is kept ( coarsely ) sorted by element id and time-stamp Cache-Oblivious Buffer Heap: Operations
The following operations are supported: − Delete (x): Deletes the element x from the queue. − Delete-Min ( ) : Extracts an element with minimum key from queue.
− Decrease-Key (x, kx): ( weak Decrease-Key ) ′′′ ′′′ If x already exists in the queue, replaces key k x of x with min( kx, k x),
otherwise inserts x with key kx into the queue.
A new element x with key kx can be inserted into queue by Decrease-Key (x, kx ). Cache-Oblivious Buffer Heap: Operations
Decrease-Key (x, kx) :
Insert the operation into U0 augmented with current time-stamp.
Delete (x) :
Insert the operation into U0 augmented with current time-stamp.
Delete-Min ( ) : Two phases: − Descending Phase ( Apply Updates ) − Ascending Phase ( Redistribute Elements ) Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) : 1. sort updates
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 2. apply updates: U1 − Delete (x)
B2 − Decrease-Key (x, kx) U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 U1
B2 U2 3. carry updates: − all updates that did not apply − applied Decrease-Keys as Deletes
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) : 1. sort updates: − merge segments B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 U1
B2 2. apply updates: U2 − Delete (x)
− Decrease-Key (x, kx) − Sink (x, kx)
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 U1
B2 U2
3. carry updates
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Descending Phase ( Apply Updates ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
− find 2k elements with 2k smallest keys Uk+1 − convert each remaining element x with
key kx to Sink (x, kx) Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1
push the Sinks to Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
move all elements from Bk to shallower levels leaving Bk empty
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) :
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: Delete-Min Delete-Min ( ) - Ascending Phase ( Redistribute Elements ) : element with minimum key
B0 U0
B1 U1
B2 U2
Bk-1 Uk-1
Bk Uk
Uk+1 Cache-Oblivious Buffer Heap: I/O Complexity
1 r−1 r − 1 Potential Function: 4 3 1 Φ ((()(H))) = rU0 +∑((()( riU −++))) i ∑ ((()( iB))) i B i=1 i = 0
Element Buffers Update Buffers direction of direction of B0 U0 elements updates
B1 U1
B2 U2
Bi Ui
updates generate elements
Br - 1 Ur - 1
Fig: Major Data Flow Paths in the Buffer Heap
Lemma: A Buffer Heap on N elements supports Delete , Delete-Min and 1 Decrease-Key operations cache-obliviously in amortized Ο log 2 N B I/Os each using Ο ((( N ))) space. Cache-Oblivious Buffer Heap: I/O Complexity
1 r−1 r − 1 Potential Function: 4 3 1 Φ ((()(H))) = rU0 +∑((()( riU −++))) i ∑ ((()( iB))) i B i=1 i = 0
Element Buffers Update Buffers direction of direction of B0 U0 elements updates
B1 U1
B2 U2
Bi Ui
updates generate elements
Br - 1 Ur - 1
Fig: Major Data Flow Paths in the Buffer Heap
Lemma: A Buffer Heap on N elements supports Delete , Delete-Min and 1 Decrease-Key operations cache-obliviously in amortized Ο log 2 N B I/Os each using Ο ((( N ))) space. Cache-Oblivious Buffer Heap: I/O Complexity
1 r−1 r − 1 Potential Function: 4 3 1 Φ ((()(H))) = rU0 +∑((()( riU −++))) i ∑ ((()( iB))) i B i=1 i = 0
Element Buffers Update Buffers direction of direction of B0 U0 elements updates
B1 U1
B2 U2
Bi Ui
updates generate elements
Br - 1 Ur - 1
Fig: Major Data Flow Paths in the Buffer Heap
Lemma: A Buffer Heap on N elements supports Delete , Delete-Min and 1 Decrease-Key operations cache-obliviously in amortized Ο log 2 N B I/Os each using Ο ((( N ))) space. Cache-Oblivious Buffer Heap: I/O Complexity
1 r−1 r − 1 Potential Function: 4 3 1 Φ ((()(H))) = rU0 +∑((()( riU −++))) i ∑ ((()( iB))) i B i=1 i = 0
Element Buffers Update Buffers direction of direction of B0 U0 elements updates
B1 U1
B2 U2
Bi Ui
updates generate elements
Br - 1 Ur - 1
Fig: Major Data Flow Paths in the Buffer Heap
Lemma: A Buffer Heap on N elements supports Delete , Delete-Min and 1 Decrease-Key operations cache-obliviously in amortized Ο log 2 N B I/Os each using Ο ((( N ))) space. Cache-Oblivious Buffer Heap: I/O Complexity
1 r−1 r − 1 Potential Function: 4 3 1 Φ ((()(H))) = rU0 +∑((()( riU −++))) i ∑ ((()( iB))) i B i=1 i = 0
Element Buffers Update Buffers direction of direction of B0 U0 elements updates
B1 U1
B2 U2
Bi Ui
updates generate elements
Br - 1 Ur - 1
Fig: Major Data Flow Paths in the Buffer Heap
Lemma: A Buffer Heap on N elements supports Delete , Delete-Min and 1 Decrease-Key operations cache-obliviously in amortized Ο log 2 N B I/Os each using Ο ((( N ))) space. Cache-Oblivious Buffer Heap: Layout
• All Bi’s are kept in a stack SB. • All Ui’s are kept in a stack SU. • An array As is maintained in a stack SA.
for 0 ≤≤≤ i ≤≤≤ r – 1 As[i] contains:
− |Bi|
− number of segments in Ui
− number of updates in each segment of Ui In both stacks lower level buffers are placed above higher level buffers. The left to right order of the elements in any buffer are maintained top to bottom in the stack. Cache-Oblivious Buffer Heap: Reconstruction
After each operation check whether ∑∑∑|Ui| ≥≥≥ ∑∑∑|Bi|, and if so,
Step 1: Sort the elements in SB by element id and level number.
Step 2: Sort the updates in SU by element id and time-stamp.
Step 3: Scan SB and Su simultaneously, and apply the updates in
Su on the elements of SB. Step 4: Reconstruct the data structure by filling the shallowest
levels with the current elements in SB, and emptying Su. Cache-Oblivious Buffer Heap: I/O Complexity Lemma : A BH supports Delete , Delete-Min , and Decrease-Key operations
in O((1 / B) log 2(N / B)) amortized I/Os each assuming a tall cache.
PotentialProof : We Method will :use the Potential Method .
Associates− Each credit Decrease-Key with the entire inserted data structure into U0 willinstead be treatedof specific as objects. a pair of 〈〈〈 〉〉〉 − D0operations:= initial state Decrease-Key of the data structure, Dummy .
−− DiEach= state component of data structure of the afteractual i-th cost operation, will have i = a1, ΘΘΘ 2,(1 …, / Bn) factor
− ΦΦΦassociated= a potential with function it whichmapping we will each drop Di to for a real simplicit numbery. ΦΦΦ(Di)
− ci / ai = actual / amortized cost of the i-th operation, i = 1, 2, …, n − For 0 ≤≤≤ i ≤≤≤ r - 1, let ui = |Ui|, and bi = |Bi|. n n Then = +ΦΦ − ⇒⇒⇒ = + ΦΦ − − If H isacDii the current((()( i)))( state((() D i−−−1))) of BH∑ acD, ii we ∑ define((()( potentialn ))) ((()( D 0 ))) of H as: i=1 i = 1
1 1 − define ΦΦΦ so that ΦΦΦ(D0) = 0 andr− ΦΦΦ(Di) ≥≥≥ 0 for r all − i. Φ H=4 ru + 3 riu −+ i + 1 b n((()( n))) 0 ∑ n ((()( )))(i ∑ ((()))) i i=1 i = 0 Then ∑ci= ∑ a i −Φ((()( Dn ))) ≤ ∑ a i i=1 i = 1 i = 1 Thus the total amortized cost is an upper bound on the total actual cost. Cache-Oblivious Buffer Heap: I/O Complexity Lemma : A BH supports Delete , Delete-Min , and Decrease-Key operations
in O((1 / B) log 2(N / B)) amortized I/Os each assuming a tall cache.
PotentialProof : We Method will :use the Potential Method .
Associates− Each credit Decrease-Key with the entire inserted data structure into U0 willinstead be treatedof specific as objects. a pair of 〈〈〈 〉〉〉 − D0operations:= initial state Decrease-Key of the data structure, Dummy .
−− DiEach= state component of data structure of the afteractual i-th cost operation, will have i = a1, ΘΘΘ 2,(1 …, / Bn) factor
− ΦΦΦassociated= a potential with function it whichmapping we will each drop Di to for a real simplicit numbery. ΦΦΦ(Di)
− ci / ai = actual / amortized cost of the i-th operation, i = 1, 2, …, n − For 0 ≤≤≤ i ≤≤≤ r - 1, let ui = |Ui|, and bi = |Bi|. n n Then = +ΦΦ − ⇒⇒⇒ = + ΦΦ − − If H isacDii the current((()( i)))( state((() D i−−−1))) of BH∑ acD, ii we ∑ define((()( potentialn ))) ((()( D 0 ))) of H as: i=1 i = 1
1 1 − define ΦΦΦ so that ΦΦΦ(D0) = 0 andr− ΦΦΦ(Di) ≥≥≥ 0 for r all − i. Φ H=4 ru + 3 riu −+ i + 1 b n((()( n))) 0 ∑ n ((()( )))(i ∑ ((()))) i i=1 i = 0 Then ∑ci= ∑ a i −Φ((()( Dn ))) ≤ ∑ a i i=1 i = 1 i = 1 Thus the total amortized cost is an upper bound on the total actual cost. Cache-Oblivious Buffer Heap: I/O Complexity
r−1 r − 1 Φ =4 + 3 −+ + 1 Amortized Cost of Reconstruction : ((()(H))) ru0 ∑((()( riu)))(i ∑ ((() i))) b i i=1 i = 0 r−1 r − 1 r −−−1 − ≤≤≤ r u Starts with ∑ u i ≥≥≥ ∑ b i . Thus cost of steps 1 to 4 is ∑∑∑ i. i=0 i = 0 r −−−1 i === 0 − Total potential drop is ≤≤≤ r ∑∑∑ u i. i === 0 r −−−1 r −−−1 − Amortized cost of reconstruction is ≤≤≤ r ∑∑∑ u i ≤≤≤ r ∑∑∑ u i . i === 0 i === 0
After each operation check whether ∑∑∑ui ≥≥≥ ∑∑∑bi, and if so,
Step 1: Sort the elements in SB by element id and level number.
Step 2: Sort the updates in SU by element id and time-stamp.
Step 3: Scan SB and Su simultaneously, and apply the updates in
Su on the elements of SB. Step 4: Reconstruct the data structure by filling the shallowest
levels with the current elements in SB, and emptying Su. X r=== Ο(log N ) I/O cost of sorting X elements = Οlog 2 X (( 2 )) B Cache-Oblivious Buffer Heap: I/O Complexity
r−1 r − 1 Φ =4 + 3 −+ + 1 Amortized Cost of Reconstruction : ((()(H))) ru0 ∑((()( riu)))(i ∑ ((() i))) b i i=1 i = 0 r−1 r − 1 r −−−1 − ≤≤≤ r u Starts with ∑ u i ≥≥≥ ∑ b i . Thus cost of steps 1 to 4 is ∑∑∑ i. i=0 i = 0 r −−−1 i === 0 − Total potential drop is ≤≤≤ r ∑∑∑ u i. i === 0 r −−−1 r −−−1 − Amortized cost of reconstruction is ≤≤≤ r ∑∑∑ u i ≤≤≤ r ∑∑∑ u i . i === 0 i === 0 Amortized Cost of Delete : − Actual cost = 1 . − Increase in potential = 4r.
− Amortized cost = 1 + 4 r = O(log 2N).
Delete (x) :
Insert the operation into U0 augmented with current time-stamp. [Stack Push /Pop requires O(1 / B) amortized I/Os each] Cache-Oblivious Buffer Heap: I/O Complexity
r−1 r − 1 Φ =4 + 3 −+ + 1 Amortized Cost of Reconstruction : ((()(H))) ru0 ∑((()( riu)))(i ∑ ((() i))) b i i=1 i = 0 r−1 r − 1 r −−−1 − ≤≤≤ r u Starts with ∑ u i ≥≥≥ ∑ b i . Thus cost of steps 1 to 4 is ∑∑∑ i. i=0 i = 0 r −−−1 i === 0 − Total potential drop is ≤≤≤ r ∑∑∑ u i. i === 0 r −−−1 r −−−1 − Amortized cost of reconstruction is ≤≤≤ r ∑∑∑ u i ≤≤≤ r ∑∑∑ u i . i === 0 i === 0 Amortized Cost of Delete : − Actual cost = 1 . Decrease-Key (x, k ) : − Increase in potential = 4r. x Insert the operation into U0 augmented with current time-stamp.
− Amortized cost [Each= 1 +Decrease-Key 4 r = O(logis considered2N). as two operations] Amortized Cost of Decrease-Key : − Actual cost = 2 ××× 1. − Increase in potential = 2 ××× 4r.
− Amortized cost = 2 + 8 r = O(log 2N). Cache-Oblivious Buffer Heap: I/O Complexity
Amortized Cost of Delete-Min : X I/O cost of sorting X elements = Ο log 2 X B Actual Cost:
− Cost of sorting U0 is ≤≤≤ ru 0. k 1 − Cost of examining the updates in U0, U1, …, Uk is∑∑∑ ((()(k− i + ))) u i . i === 0 k −−−1 − Cost of examining the elements in B0, B1, …, Bk-1 is ∑∑∑ b i . i === 0 ′′′ − Let bk and b k be the number of elements in Bk before and after the updates, respectively. X I/O cost of scanning X elements = Ο Then the total cost of examining B during updates and selectionB k X kI/O′′′ is costthe smallestof scanning level X withelements non-empty = Ο B after updates is max ((( b k , b k ))) . B k after updates. − Cost of accessing As is ≤≤≤ r .
k k −−−1 1 ′′′ Thus actual cost, cru≤+0 ∑((()( kiu −+++))) i ∑ b i max((()( bbrk , k ))) + i=0 i = 0 Cache-Oblivious Buffer Heap: I/O Complexity
Amortized Cost of Delete-Min : Actual Cost:
− Cost of sorting U0 is ≤≤≤ ru 0. k 1X − Cost of examining theI/O updates cost of inselection U0, U1, from …, U Xk elementsis∑∑∑ ((()(k− = i Ο + ))) u i . i === 0 k −−−1 B − Cost of examining the elements in B0, B1, …, Bk-1 is ∑∑∑ b i . i === 0 − Let b and b′′′ be the number of elements in B before and after k k As stores informationk on each the updates, respectively. buffer and has length = Ο ((( r )))
Then the total cost of examining Bk during updates and selection
after updates is max ((( b k , b k′′′ ))) .
− Cost of accessing As is ≤≤≤ r .
k k −−−1 1 ′′′ Thus actual cost, cru≤+0 ∑((()( kiu −+++))) i ∑ b i max((()( bbrk , k ))) + i=0 i = 0 Cache-Oblivious Buffer Heap: I/O Complexity
r−1 r − 1 Amortized Cost of Delete-Min : Φ H=4 ru + 3 riu −+ i + 1 b Φ ((()(H))) = ru0 +∑((()( riu −+)))(i ∑ ((() i + ))) b i i=1 i = 0 Potential Drop:
− Potential drop due to changes in update buffers is k k 4 3 31 ≥ru0 +∑((()( riu −))) i −−−((()( rk))) ∑ u i i =1 i = 0 − Potential drop due to changes in element buffers is k −−−1 ′′′ ≥∑∑∑ bi + max((()( bk , b k ))) i === 0
Thus, total drop in potential is k k −−−1 1 ′′′ ≥ru0 +∑((()( kiu −+))) i + ∑ b i + max((()( bbk , k ))) i=0 i = 0 Therefore, amortized cost of Delete-Min is
ccHˆ =+(((Φ((( after))) − Φ((( H before )))))) ≤= r Ο ((()(log 2 N ))) Cache-Oblivious Buffer Heap: I/O Complexity
1+++ε 0 We assume N ≫ M === Ω ((( B ))) for some ε >>> N ⇒⇒⇒ log2N === Ο log 2 B
Therefore, under the tall cache assumption,
Amortized I/O cost of each operation 1 N (Delete , Delete-Min and Decrease-Key ) is = Ο log 2 B B Removing the “Tall Cache” Assumption
i − Restrict the size of each update buffer: |Ui|≤≤≤ 2
− Now all buffers ( Ui and Bi) of the first log 2B levels occupy only O(B) blocks.
− No external I/O is required to access the first log 2B levels. − Thus amortized I/O cost of each operation is 1 1 N =Ο((()(log2N − log 2 B ))) = Ο log 2 B B B