Chapter 4 Data Structures

Algorithm Theory WS 2014/15

Fabian Kuhn

Fibonacci Heaps

Lazy-merge variant of binomial heaps:
• Do not merge trees as long as possible…

Structure: A Fibonacci heap H consists of a collection of trees satisfying the min-heap property.

Variables:
• H.min: root of the tree containing the (a) minimum key
• H.rootlist: circular, doubly linked, unordered list containing the roots of all trees
• H.size: number of nodes currently in H

Example

Figure: Cormen et al., Introduction to Algorithms

Cost of Delete-Min & Decrease-Key

Delete-Min:
1. Delete min. root and add H.min.childlist to H.rootlist — time: O(1)
2. Consolidate H.rootlist — time: O(length of H.rootlist)
• Step 2 can potentially be linear in n (size of H)

Decrease-Key (at node v):
1. If new key < parent key, cut sub-tree of node v — time: O(1)
2. Cascading cuts up the tree as long as nodes are marked — time: O(number of consecutive marked nodes)
• Step 2 can potentially be linear in n

Exercises: Both operations can take Θ(n) time in the worst case!

Cost of Delete-Min & Decrease-Key

• Cost of delete-min and decrease-key can be Θ(n)
– Seems a large price to pay to get insert and merge in O(1) time

• Maybe the operations are efficient most of the time?
– It seems to require a lot of operations to get a long root list and thus an expensive consolidate operation
– In each decrease-key operation, at most one node gets marked: we need a lot of decrease-key operations to get an expensive decrease-key operation

• Can we show that the average cost per operation is small?

• We can  requires

Amortization

• Consider a sequence o_1, o_2, …, o_n of operations (typically performed on some data structure D)

• t_i: execution time of operation o_i
• T ≔ t_1 + t_2 + ⋯ + t_n: total execution time

• The execution time of a single operation might vary within a large range (e.g., t_i ∈ [1, O(n)])

• The worst-case overall execution time might still be small ⟹ the average execution time per operation might be small in the worst case, even if single operations can be expensive

Analysis of Algorithms

• Best case

• Worst case

• Average case

• Amortized worst case

What is the average cost of an operation in a worst-case sequence of operations?

Example: Binary Counter

Incrementing a binary counter: determine the bit flip cost:

Op.  Counter  Cost
     00000
 1   00001    1
 2   00010    2
 3   00011    1
 4   00100    3
 5   00101    1
 6   00110    2
 7   00111    1
 8   01000    4
 9   01001    1
10   01010    2
11   01011    1
12   01100    3
13   01101    1
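The flip-cost pattern in this table is easy to reproduce; the following sketch (plain Python, counter stored as a little-endian bit list, illustrative names) counts the flips per increment:

```python
def increment(bits):
    """Increment a little-endian bit list; return the number of bit flips."""
    flips = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0          # flip 1 -> 0
        flips += 1
        i += 1
    if i < len(bits):
        bits[i] = 1          # exactly one flip 0 -> 1
        flips += 1
    return flips

bits = [0] * 5
costs = [increment(bits) for _ in range(13)]
print(costs)        # matches the Cost column above
print(sum(costs))   # total cost for 13 increments is 23 < 2 * 13
```

The total cost stays below twice the number of increments, which is exactly what the amortized analysis below formalizes.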

Accounting Method

Observation:
• Each increment flips exactly one 0 into a 1:
  001001111 ⟹ 001010000

Idea:
• Have a bank account (with initial amount 0)
• Paying x to the bank account costs x
• Take “money” from the account to pay for expensive operations

Applied to binary counter:
• Flip from 0 to 1: pay 1 to bank account (cost: 2)
• Flip from 1 to 0: take 1 from bank account (cost: 0)
• Amount on bank account = number of ones
⟹ We always have enough “money” to pay!
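The invariant can be checked mechanically; a small sketch (illustrative names, not from the slides) that tracks the account alongside the counter:

```python
# Accounting scheme: deposit 1 per 0->1 flip, withdraw 1 per 1->0 flip;
# `credit` mirrors the bank account (illustrative names).
def increment_with_account(bits, credit):
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0              # flip 1 -> 0: paid from the account
        credit -= 1
        i += 1
    if i < len(bits):
        bits[i] = 1              # flip 0 -> 1: real cost 1 + deposit 1
        credit += 1
    return credit

bits, credit = [0] * 8, 0
for _ in range(50):
    credit = increment_with_account(bits, credit)
    assert credit == sum(bits)   # credit always equals the number of 1-bits
    assert credit >= 0           # the account never goes negative
print(credit)   # 50 = 110010 in binary, so 3 ones remain
```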

Accounting Method

Op.  Counter  Cost  To Bank  From Bank  Net Cost  Credit
 0   00000     –       –        –          –        0
 1   00001     1       1        0          2        1
 2   00010     2       1        1          2        1
 3   00011     1       1        0          2        2
 4   00100     3       1        2          2        1
 5   00101     1       1        0          2        2
 6   00110     2       1        1          2        2
 7   00111     1       1        0          2        3
 8   01000     4       1        3          2        1
 9   01001     1       1        0          2        2
10   01010     2       1        1          2        2

Potential Function Method

• Most generic and elegant way to do amortized analysis! – But, also more abstract than the others…

• State of data structure / system: S ∈ 𝒮 (state space)

• Potential function Φ: 𝒮 → ℝ≥0

• Operation i:

– t_i: actual cost of operation i
– S_i: state after execution of operation i (S_0: initial state)
– Φ_i ≔ Φ(S_i): potential after execution of operation i
– a_i: amortized cost of operation i:
  a_i ≔ t_i + Φ_i − Φ_{i−1}

Potential Function Method

Operation i:

actual cost: t_i
amortized cost: a_i = t_i + Φ_i − Φ_{i−1}

Overall cost:

T ≔ Σ_{i=1}^{n} t_i = Σ_{i=1}^{n} a_i + Φ_0 − Φ_n

Binary Counter

• Potential function: Φ(counter) ≔ number of 1-bits in the counter

• Clearly, Φ_0 = 0 and Φ_i ≥ 0 for all i ≥ 0

• Actual cost t_i: one flip from 0 to 1, t_i − 1 flips from 1 to 0

• Potential difference: Φ_i − Φ_{i−1} = 1 − (t_i − 1) = 2 − t_i

• Amortized cost: a_i = t_i + Φ_i − Φ_{i−1} = 2
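This calculation can be verified numerically; the sketch below assumes an unbounded counter and recomputes a_i = t_i + Φ_i − Φ_{i−1} for each increment (illustrative helper names):

```python
# Potential Phi = number of 1-bits; verify a_i = t_i + Phi_i - Phi_{i-1} = 2.
def flip_cost(value):
    """Bit flips when incrementing `value`: trailing 1s flip to 0, one 0 flips to 1."""
    trailing_ones = 0
    while value & 1:
        trailing_ones += 1
        value >>= 1
    return trailing_ones + 1

def ones(value):
    return bin(value).count('1')      # the potential Phi

for v in range(100):
    t = flip_cost(v)
    assert t + ones(v + 1) - ones(v) == 2   # amortized cost is exactly 2
print("amortized cost is 2 for every increment")
```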

Back to Fibonacci Heaps

• Worst-case cost of a single delete-min or decrease-key operation is Ω(n)

• Can we prove a small worst‐case amortized cost for delete‐min and decrease‐key operations?

Remark:

• Data structure that allows operations O_1, …, O_k

• We say that operation O_p has amortized cost a_p if for every execution the total time is

T ≤ Σ_{p=1}^{k} n_p · a_p,

where n_p is the number of operations of type O_p.

Amortized Cost of Fibonacci Heaps

• Initialize-heap, is-empty, get-min, insert, and merge have worst-case cost O(1)

• Delete-min has amortized cost O(log n)
• Decrease-key has amortized cost O(1)

• Starting with an empty heap, any sequence of n operations with at most n_d delete-min operations has total cost (time)

O(n + n_d · log n).

• We will now need the marks…

• Cost for Dijkstra: O(|E| + |V| log |V|)

Simple (Lazy) Operations

Initialize-Heap H:
• H.rootlist ≔ ∅; H.min ≔ null

Merge heaps H and H′:
• concatenate root lists
• update H.min

Insert element e into H:
• create new one-node tree containing e ⟹ heap H′
• merge heaps H and H′

Get minimum element of H:
• return H.min
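The lazy operations above can be sketched as follows; `Node`, `FibHeap`, `rootlist`, and `min_node` are illustrative names, and a plain Python list stands in for the circular doubly linked root list (so the `+=` below is not O(1), unlike the real linked-list concatenation):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.children = []
        self.marked = False

class FibHeap:
    def __init__(self):                # Initialize-Heap
        self.rootlist = []
        self.min_node = None

    def merge(self, other):            # concatenate root lists, update min
        self.rootlist += other.rootlist
        if self.min_node is None or (other.min_node is not None
                                     and other.min_node.key < self.min_node.key):
            self.min_node = other.min_node

    def insert(self, key):             # one-node tree, then merge
        h = FibHeap()
        h.rootlist = [Node(key)]
        h.min_node = h.rootlist[0]
        self.merge(h)

    def get_min(self):                 # return H.min
        return self.min_node.key

h = FibHeap()
for k in [5, 2, 9]:
    h.insert(k)
print(h.get_min())   # 2
```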

Operation Delete-Min

Delete the node with minimum key from H and return its element:

1. m ≔ H.min;
2. if H.size > 0 then
3.   remove H.min from H.rootlist;
4.   add H.min.childlist (list of children) to H.rootlist;
5.   H.consolidate();
     // Repeatedly merge nodes with equal degree in the root list
     // until degrees of nodes in the root list are distinct.
     // Determine the element with minimum key (update H.min).
6. return m
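The consolidate step on its own can be sketched like this, assuming trees are represented as (key, children) tuples and rank = number of children (illustrative names, not the slides' linked structure):

```python
def link(a, b):
    """Merge two trees of equal rank; the smaller key becomes the root."""
    if b[0] < a[0]:
        a, b = b, a
    return (a[0], a[1] + [b])

def consolidate(rootlist):
    """Repeatedly merge trees of equal rank until all ranks are distinct."""
    by_rank = {}
    for tree in rootlist:
        while len(tree[1]) in by_rank:          # another tree of same rank?
            tree = link(tree, by_rank.pop(len(tree[1])))
        by_rank[len(tree[1])] = tree
    return list(by_rank.values())

roots = consolidate([(7, []), (3, []), (9, []), (1, [])])
print(len(roots), roots[0][0])   # one tree of rank 2, rooted at key 1
```

Four rank-0 trees collapse into a single rank-2 tree, exactly as four singleton roots would after a delete-min.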

Rank and Maximum Degree

Ranks of nodes, trees, heap:

Node v:
• rank(v): degree of v (number of children of v)

Tree T:
• rank(T): rank (degree) of the root node of T

Heap H:
• rank(H): maximum degree of any node in H

Assumption (n: number of nodes in H):
• rank(H) ≤ D(n)
– for a known function D(n)

Operation Decrease-Key

Decrease-Key(v, x): (decrease key of node v to the new value x)

1. if x ≥ v.key then return;
2. v.key ≔ x; update H.min;
3. if v ∈ H.rootlist ∨ x ≥ v.parent.key then return;
4. repeat
5.   p ≔ v.parent;
6.   H.cut(v);
7.   v ≔ p;
8. until ¬v.marked ∨ v ∈ H.rootlist;
9. if v ∉ H.rootlist then v.marked ≔ true;

Fibonacci Heaps: Marks

Cycle of a node:

1. Node v is removed from the root list and linked to a node ⟹ v.marked ≔ false.

2. A child node of v is cut and added to the root list ⟹ v.marked ≔ true.

3. A second child of v is cut ⟹ node v is cut as well and moved to the root list.

The boolean value v.marked indicates whether node v has lost a child since the last time v was made the child of another node.

Potential Function

System state characterized by two parameters:
• t(H): number of trees (length of H.rootlist)
• m(H): number of marked nodes that are not in the root list

Potential function: Φ(H) ≔ t(H) + 2·m(H)

Example: (figure: a heap whose root list contains 7 trees, with 2 marked nodes outside the root list)

• t(H) = 7, m(H) = 2 ⟹ Φ(H) = 7 + 2·2 = 11

Actual Time of Operations

• Operations: initialize-heap, is-empty, insert, get-min, merge
– actual time: O(1)
– Normalize unit time such that t_init, t_is-empty, t_insert, t_get-min, t_merge ≤ 1

• Operation delete-min:
– Actual time: O(D(n) + length of H.rootlist)
– Normalize unit time such that
  t_del-min ≤ D(n) + length of H.rootlist

• Operation decrease-key:
– Actual time: O(length of path to next unmarked ancestor)
– Normalize unit time such that
  t_decr-key ≤ length of path to next unmarked ancestor

Amortized Times

Assume operation i is of type:

• initialize-heap:

– actual time: 1, potential: Φ_{i−1} = Φ_i = 0
– amortized time: a_i = t_i + Φ_i − Φ_{i−1} = 1

• is-empty, get-min:

– actual time: 1, potential: Φ_i = Φ_{i−1} (heap doesn’t change)
– amortized time: a_i = t_i + Φ_i − Φ_{i−1} = 1

• merge:

– actual time: 1
– combined potential of both heaps: Φ_i = Φ_{i−1}
– amortized time: a_i = t_i + Φ_i − Φ_{i−1} = 1

Amortized Time of Insert

Assume that operation i is an insert operation:

• Actual time: 1

• Potential function:
– m(H) remains unchanged (no nodes are marked or unmarked, no marked nodes are moved to the root list)
– t(H) grows by 1 (one element is added to the root list)

⟹ t_i = 1, Φ_i − Φ_{i−1} = 1
• Amortized time: a_i = t_i + Φ_i − Φ_{i−1} = 2

Amortized Time of Delete-Min

Assume that operation i is a delete-min operation:

Actual time: t_i ≤ D(n) + |H.rootlist|
Potential function Φ = t(H) + 2·m(H):
• t(H): changes from |H.rootlist| to at most D(n) + 1
• m(H) (# of marked nodes that are not in the root list):
– no new marks
– if a node is moved away from the root list, its mark is set to false
⟹ value of m(H) does not increase!

t_i ≤ D(n) + |H.rootlist|, Φ_i − Φ_{i−1} ≤ D(n) + 1 − |H.rootlist|
Amortized time: a_i = t_i + Φ_i − Φ_{i−1} ≤ 2·D(n) + 1 = O(D(n))

Amortized Time of Decrease-Key

Assume that operation i is a decrease-key operation at node u:

Actual time: t_i ≤ length of path to next unmarked ancestor
Potential function Φ = t(H) + 2·m(H):

• Assume node u and nodes u_1, …, u_k are moved to the root list
– u_1, …, u_k are marked and moved to the root list; the mark of the next (unmarked) ancestor is set to true
• k marked nodes go to the root list, at most 1 node gets newly marked
• t(H) grows by k + 1, m(H) grows by at most 1 and is decreased by k

t_i ≤ k + 1, Φ_i − Φ_{i−1} ≤ (k + 1) + 2·(1 − k) = 3 − k
Amortized time: a_i = t_i + Φ_i − Φ_{i−1} ≤ (k + 1) + (3 − k) = 4 = O(1)

Complexities Fibonacci Heap

• Initialize-Heap: O(1)
• Is-Empty: O(1)
• Insert: O(1)
• Get-Min: O(1)
• Delete-Min: O(D(n)) amortized
• Decrease-Key: O(1) amortized
• Merge (heaps of size m and n, m ≤ n): O(1)

• How large can D(n) get?

Rank of Children

Lemma:

Consider a node v of rank k and let u_1, …, u_k be the children of v in the order in which they were linked to v. Then,

rank(u_i) ≥ i − 2.

Proof:

Size of Trees

Fibonacci Numbers: F_0 = 0, F_1 = 1, ∀k ≥ 2: F_k = F_{k−1} + F_{k−2}

Lemma: In a Fibonacci heap, the size of the sub-tree of a node with rank k is at least F_{k+2}.
Proof:

• S_k: minimum size of the sub-tree of a node of rank k

Size of Trees

S_0 = 1, S_1 = 2, ∀k ≥ 2: S_k ≥ 2 + Σ_{i=0}^{k−2} S_i

• Claim about Fibonacci numbers:

∀k ≥ 0: F_{k+2} = 1 + Σ_{i=0}^{k} F_i

Size of Trees

S_0 = 1, S_1 = 2, ∀k ≥ 2: S_k ≥ 2 + Σ_{i=0}^{k−2} S_i, F_{k+2} = 1 + Σ_{i=0}^{k} F_i

• Claim of lemma: S_k ≥ F_{k+2}

Size of Trees

Lemma: In a Fibonacci heap, the size of the sub-tree of a node with rank k is at least F_{k+2}.

Theorem: The maximum rank of a node in a Fibonacci heap of size n is at most D(n) = O(log n).
Proof:
• The Fibonacci numbers grow exponentially:

F_k = (1/√5) · ( ((1+√5)/2)^k − ((1−√5)/2)^k )

• For a node of rank k, we need at least F_{k+2} ≥ φ^k nodes (φ = (1+√5)/2)
⟹ D(n) ≤ log_φ n = O(log n).
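The exponential growth claim can be checked numerically; a sketch verifying the bound F_{k+2} ≥ φ^k used in the proof:

```python
import math

phi = (1 + math.sqrt(5)) / 2
F = [0, 1]
for _ in range(30):
    F.append(F[-1] + F[-2])           # F_0, F_1, ..., F_31

for k in range(1, 25):
    assert F[k + 2] >= phi ** k       # subtree of a rank-k node has >= phi^k nodes

# Hence a rank-k node needs phi^k descendants, so D(n) <= log_phi(n):
print(math.floor(math.log(10**6, phi)))   # largest possible rank for n = 10^6
```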

Summary: Binomial and Fibonacci Heaps

operation      Binomial Heap   Fibonacci Heap
initialize     O(1)            O(1)
insert         O(log n)        O(1)
get-min        O(1)            O(1)
delete-min     O(log n)        O(log n) *
decrease-key   O(log n)        O(1) *
merge          O(log n)        O(1)
is-empty       O(1)            O(1)

* amortized time

Minimum Spanning Trees

Prim Algorithm:

1. Start with any node v (v is the initial component)
2. In each step: Grow the current component by adding the minimum weight edge connecting the current component with any other node

Kruskal Algorithm:

1. Start with an empty edge set
2. In each step: Add the minimum weight edge e such that e does not close a cycle

Implementation of Prim Algorithm

Start at node s; very similar to Dijkstra’s algorithm:
1. Initialize d(s) ≔ 0 and d(v) ≔ ∞ for all v ≠ s
2. All nodes are unmarked

3. Get unmarked node u which minimizes d(u):

4. For all e = (u, v) ∈ E: d(v) ≔ min{d(v), w(e)}

5. Mark node u

6. Until all nodes are marked
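The loop above can be sketched with Python’s `heapq` instead of a Fibonacci heap; since binary heaps have no decrease-key, stale entries are skipped on pop (lazy deletion), giving O(m log n) rather than O(m + n log n):

```python
import heapq

def prim(n, edges, s=0):
    """n nodes 0..n-1, edges as (u, v, w) triples; returns total MST weight."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    marked = [False] * n
    pq = [(0, s)]                     # entries (d(v), v)
    total = 0
    while pq:
        d, u = heapq.heappop(pq)      # get unmarked node minimizing d(u)
        if marked[u]:
            continue                  # stale entry stands in for decrease-key
        marked[u] = True
        total += d
        for v, w in adj[u]:
            if not marked[v]:
                heapq.heappush(pq, (w, v))
    return total

edges = [(0, 1, 4), (0, 2, 1), (1, 2, 2), (1, 3, 5), (2, 3, 8)]
print(prim(4, edges))   # 1 + 2 + 5 = 8
```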

Implementation of Prim Algorithm

Implementation with Fibonacci heap:
• Analysis identical to the analysis of Dijkstra’s algorithm:

– n insert and n delete-min operations

– at most m decrease-key operations

• Running time: O(m + n log n)

Kruskal Algorithm

1. Start with an empty edge set
2. In each step: Add minimum weight edge e such that e does not close a cycle

(Figure: example run on a graph with edge weights 1–31.)

Implementation of Kruskal Algorithm

1. Go through edges in order of increasing weights

2. For each edge e: if e does not close a cycle, then

add e to the current solution

Union-Find Data Structure

Also known as Disjoint‐Set Data Structure…

Manages a partition of a set V of elements:
• set of disjoint sets

Operations:

• make_set(x): create a new set that only contains element x

• find(x): return the set containing x

• union(x, y): merge the two sets containing x and y

Implementation of Kruskal Algorithm

1. Initialization: For each node v: make_set(v)

2. Go through edges in order of increasing weights: Sort edges by edge weight

3. For each edge e = (u, v): if find(u) ≠ find(v) then add e to the current solution; union(u, v)
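The three steps can be sketched as follows, with a simple disjoint-set forest (path halving; illustrative names) standing in for the union-find structure:

```python
def kruskal(n, edges):
    """n nodes 0..n-1, edges as (u, v, w) triples; returns the MST edge list."""
    parent = list(range(n))                 # make_set for every node

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    mst = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):  # increasing weight
        if find(u) != find(v):              # e does not close a cycle
            parent[find(u)] = find(v)       # union
            mst.append((u, v, w))
    return mst

edges = [(0, 1, 4), (0, 2, 1), (1, 2, 2), (1, 3, 5), (2, 3, 8)]
print(sum(w for _, _, w in kruskal(4, edges)))   # 8
```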

Managing Connected Components

• Union‐find data structure can be used more generally to manage the connected components of a graph … if edges are added incrementally

• make_set(v) for every node v

• find(v) returns the component containing v

• union(u, v) merges the components of u and v (when an edge is added between the components)

• Can also be used to manage biconnected components

Basic Implementation Properties

Representation of sets:
• Every set S of the partition is identified with a representative: one of its members x ∈ S

Operations:
• make_set(x): x is the representative of the new set

• find(x): return the representative of the set S_x containing x

• union(x, y): unites the sets S_x and S_y containing x and y and returns the new representative of S_x ∪ S_y

Observations

Throughout the discussion of union-find:
• n: total number of make_set operations
• m: total number of operations (make_set, find, and union)

Clearly:
• m ≥ n
• There are at most n − 1 union operations

Remark:
• We assume that the make_set operations are the first n operations
– Does not really matter…

Implementation

Each set is implemented as a linked list:
• representative: first list element (all nodes point to the first element); in addition: pointers to the first and last elements

• sets: {1, 5, 8, 12, 43}, {7, 9, 15}; representatives: 5, 9

Linked List Implementation

make_set(x):
• Create list with one element: time: O(1)

find(x):
• Return first list element: time: O(1)

Linked List Implementation

union(x, y):
• Append list of y to list of x (and update the representative pointers of all moved elements):

Time: O(|S_y|)

Cost of Union (Linked List Implementation)

Total cost for n − 1 union operations can be Θ(n²):

• make_set(x_1), make_set(x_2), …, make_set(x_n),
union(x_2, x_1), union(x_3, x_2), …, union(x_n, x_{n−1})

Weighted-Union Heuristic

• In a bad execution, the average cost per union can be Θ(n)

• Problem: The longer list is always appended to the shorter one

Idea: • In each union operation, append shorter list to longer one!

⟹ Cost for union of sets S_x and S_y: O(min{|S_x|, |S_y|})

Theorem: The overall cost of m operations of which at most n are make_set operations is O(m + n log n).

Weighted-Union Heuristic

Theorem: The overall cost of m operations of which at most n are make_set operations is O(m + n log n).
Proof:
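A sketch of the linked-list representation with the weighted-union heuristic; `rep` and `members` are illustrative names, and union relabels the smaller set, matching the O(min{|S_x|, |S_y|}) cost:

```python
class LinkedListUF:
    def __init__(self):
        self.rep = {}       # element -> representative
        self.members = {}   # representative -> list of set members

    def make_set(self, x):
        self.rep[x] = x
        self.members[x] = [x]

    def find(self, x):                 # O(1): representative stored directly
        return self.rep[x]

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        if len(self.members[rx]) < len(self.members[ry]):
            rx, ry = ry, rx            # append shorter list to longer one
        for z in self.members.pop(ry):
            self.rep[z] = rx           # relabel each moved element
            self.members[rx].append(z)

uf = LinkedListUF()
for x in range(6):
    uf.make_set(x)
uf.union(0, 1); uf.union(2, 3); uf.union(0, 2)
print(uf.find(3) == uf.find(1))   # True
```

Every element changes its representative at most O(log n) times, since its set at least doubles whenever it is moved — which is exactly the proof idea of the theorem.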

Disjoint-Set Forests

• Represent each set by a tree

• Representative of a set is the root of the tree

Disjoint-Set Forests

make_set(x): create a new one-node tree

find(x): follow parent pointers to the root (the root’s parent pointer points to itself)

union(x, y): attach the tree of y to the tree of x

Bad Sequence

A bad sequence leads to tree(s) of depth Θ(n):

• make_set(x_1), make_set(x_2), …, make_set(x_n),
union(x_2, x_1), union(x_3, x_2), …, union(x_n, x_{n−1})

Union-By-Size Heuristic

Union of sets S_x and S_y:
• Roots of the trees representing S_x and S_y: r_x and r_y
• W.l.o.g., assume that |S_x| ≥ |S_y|
• Root of S_x ∪ S_y: r_x (r_y is attached to r_x as a new child)

Theorem: If the union-by-size heuristic is used, the worst-case cost of a find-operation is O(log n).
Proof:

Similar strategy: union-by-rank
• rank: essentially the depth of a tree

Union-Find Algorithms

Recall: m operations, n of the operations are make_set-operations

Linked List with Weighted-Union Heuristic:
• make_set: worst-case cost O(1)
• find: worst-case cost O(1)
• union: amortized worst-case cost O(log n)

Disjoint-Set Forest with Union-By-Size Heuristic:
• make_set: worst-case cost O(1)
• find: worst-case cost O(log n)
• union: worst-case cost O(log n)

Can we make this faster?

Path Compression During Find Operation

find(x):
1. if x ≠ x.parent then
2.   x.parent ≔ find(x.parent)
3. return x.parent
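Combining union-by-size with the path compression above gives the following sketch (recursive find exactly as in the pseudocode; class and field names are illustrative):

```python
class DisjointSetForest:
    def __init__(self, n):
        self.parent = list(range(n))   # make_set: every node is its own root
        self.size = [1] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx            # attach smaller tree to larger root
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]

dsf = DisjointSetForest(8)
for i in range(7):
    dsf.union(i, i + 1)                # the "bad" sequence is now harmless
print(dsf.find(0) == dsf.find(7))      # True
```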

Complexity With Path Compression

When using only path compression (without union‐by‐rank):

m: total number of operations
• f of which are find-operations
• n of which are make_set-operations
⟹ at most n − 1 are union-operations

Total cost: O(m + f · log_{1+f/n} n)

Union-By-Size and Path Compression

Theorem:

Using the combined union-by-rank and path compression heuristic, the running time of m disjoint-set (union-find) operations on n elements (at most n make_set-operations) is

O(m · α(m, n)),

where α(m, n) is the inverse of the Ackermann function.

Ackermann Function and its Inverse

Ackermann Function:

For k, ℓ ≥ 1:
A(1, ℓ) ≔ 2^ℓ,                    for ℓ ≥ 1
A(k, 1) ≔ A(k − 1, 2),            for k ≥ 2
A(k, ℓ) ≔ A(k − 1, A(k, ℓ − 1)),  for k, ℓ ≥ 2

Inverse of Ackermann Function:
α(m, n) ≔ min{k ≥ 1 | A(k, ⌊m/n⌋) > log n}

Inverse of Ackermann Function

• α(m, n) ≔ min{k ≥ 1 | A(k, ⌊m/n⌋) > log n}

⟹ since m ≥ n: A(k, ⌊m/n⌋) ≥ A(k, 1)
⟹ α(m, n) ≤ min{k ≥ 1 | A(k, 1) > log n}

• A(1, ℓ) = 2^ℓ, A(k, 1) = A(k − 1, 2), A(k, ℓ) = A(k − 1, A(k, ℓ − 1))
