CS 124 Section 1 Binary Min/Max Heaps

Yash Nair

February 2021

Yash Nair Section 1 February 2021 1 / 25 Table of Contents

What is a Heap and Why do we Use Them? Heap Basics Representing a Heap Implementing Heap Operations Problems

Yash Nair Section 1 February 2021 2 / 25 Purpose of Heaps

Data structure to find (and also remove) most extreme (i.e., maximal or minimal) element in a set we can add to quickly MAX-HEAP finds maximal element, MIN-HEAP finds minimal We will consider MAX-HEAP, but note that a MIN-HEAP is just a MAX-HEAP if we multiply the elements by 1 −

Yash Nair Section 1 February 2021 3 / 25 Using a Sorted Array is Slower than a Heap

Do the following operations using a sorted array: insert 5, insert 3, insert 2, ﬁnd and remove max, ﬁnd and remove max, insert 1

Yash Nair Section 1 February 2021 4 / 25 Comparing Operation Times

Data Structure Add Element Find Max Find and Remove Max Build from Unsorted Array

Sorted Array O(n) O(1) O(1) O(n log n)

Unsorted Array O(1) O(n) O(n) O(1)

Binary Heap O(log n) O(1) O(log n) O(n)

Yash Nair Section 1 February 2021 5 / 25 Heap Operations Overview

PEEK(): what is the largest element in the heap? EXTRACTMAX(): remove the largest element from the heap INSERT(x): insert the element x into the heap BUILD-HEAP(A): given an unsorted array, A, build a binary heap

Want to be able to do these operations quickly by using a binary tree structure.

Yash Nair Section 1 February 2021 6 / 25 CS 124 Section #2 Heaps and Graphs 2/8/16

1 Heaps Heaps are data structures that make it easy to ﬁnd the element with the most extreme value in a collection of elements. A Min-Heap prioritizes the element with the smallest value, while a Max-Heap prioritizes the element with the largest value. Because of this property, heaps are often used to implement priority queues. In section today, we will focus on the binary max heap. You can ﬁnd more about heaps by reading pages 151–169 in CLRS.

1.1 What is a heap? Max heaps are a very use data structure which maintains some structure on a list of numbers such that you can do the following operations. The goal is to be able to very quickly find the maximum element in a list, while supporting insertions and deletions to the list. • peek(): what is the largest element in the list? • extractmax(): remove the largest element from the list • insert(x): insert the element x into the list. The goal is to have come up with a way to implement heaps such that all three of the above operations are fast. Heaps will be useful when we talk about Dijkstra’s algorithm for shortest paths. An efficient way to do this is to use a binary tree to store the list of numbers. This won’t be an ordinary binary tree, but one that satisfies the heap property, which is: If x is a parent of y in the binary tree, then x>y 1.2 RepresentingBinary Trees a Heap as Arrays Example One way to represent binary trees more easily is to use an 1-indexed array. For example, the following array will be used to represent the following max-heap

[124, 109, 121, 50, 1, 110, 51]

124

109 121

50 1 110 51

We call the ﬁrst element in the heap element 1. Now, given an element i, we can ﬁnd its left and right children with a little arithmetic:

Yash Nair Section 1 February 2021 7 / 25 1 Binary Trees as Arrays

Can order elements of left-aligned binary tree and view them in array Letting the ﬁrst element (i.e., root) of tree have index 1, what are the following: PARENT(i) = LEFT(i) = RIGHT(i) =

Yash Nair Section 1 February 2021 8 / 25 CS 124 Section #2 Heaps and Graphs 2/8/16

1.1 What is a heap? Max heaps are a very use data structure which maintains some structure on a list of numbers such that you can do the following operations. The goal is to be able to very quickly find the maximum element in a list, while supporting insertions and deletions to the list. • peek(): what is the largest element in the list? • extractmax(): remove the largest element from the list • insert(x): insert the element x into the list. The goal is to have come up with a way to implement heaps such that all three of the above operations are fast. Heaps will be useful when we talk about Dijkstra’s algorithm for shortest paths. An efficient way to do this is to use a binary tree to store the list of numbers. This won’t be an ordinary binary tree, but one that satisfies the heap property, which is: Heap RepresentationIf x is a parent of y in the binary tree, then x>y 1.2 Representing a Heap One way to representHeap property: binary treesif morex is aeasily parent is to of usey anin 1-indexed the binary array. tree, For thenexample,x > they following array will be used to represent the following max-heap A heap is a binary tree that satisfies the heap property

[124, 109, 121, 50, 1, 110, 51]

124

109 121

50 1 110 51

We call the ﬁrst element in the heap element 1. Now, given an element i, we can ﬁnd its left and right children withYash a littleNair arithmetic: Section 1 February 2021 9 / 25

1 Exercise 1. How would you ﬁnd the index of the parent, left child and right child of the ith element in the heap? • Parent(i)= • Left(i)= • Right(i)=

Solution The array representation is simpliﬁed by calling the ﬁrst element in the heap 1: i • Parent(i)= 2 • Left(i)=2i ⌅ ⇧ • Right(i)=2i +1

Max-Heapify(1.3 Heap operationsH, N) Before we look at how peek, insertion and deletion work, let’s ﬁrst implement some helper functions thatInput: will beN usefulis fornode maintaining in the the left-aligned heap structure and binary building tree a heapH fromsuch scratch. that the N is 1.3.1the rootMax-Heapify of a MAX-HEAP, except that N may be smaller than its Max-Heapify(H, N): Given that the children of the node N in the Max-Heap H are each the rootchildren of a Max-Heap (i.e.,, but rearranges all lower the tree layers rooted at satisfyN to be atheMax-Heap heap. property) We will call this when weGoal: break the Rearrange heap property nodes in order so to ﬁx that it. tree rooted at N is a MAX-HEAP Max-Heapify(H, N) Require: N is the root of a Max-Heap except, possibly, at N (i.e., N could be smaller than its children) 1: (l, r) (Left(N), Right(N))

2: if Exists(l)andH[l] >H[N] then 3: largest l

4: else 5: largest N

6: end if 7: if Exists(r)andH[r] >H[largest] then 8: largest r

9: end if 10: if largest = N then 6 11: Swap(H[N],H[largest]) 12: Max-Heapify(H, largest) 13: end if Ensure: N is the root of a Max-Heap

Description:Yash Nair First, we set larger to be theSection bigger of1 N and `,thevalueofthenodeFebruaryN or 2021 its left 10 / 25 child. Then, we ﬁnd the larger of that and the value of N’s right child. We swap H[larger] with the root so that now the root is bigger in value than the two children. We recursively call max-heapify on the side that we swapped with. If no swap occurred, we do not recurse.

2 Exercise 2.Example • Run Max-HeapifyRun Max-Heapifywith withN =1Non= 1 on the below graph. What is the runtime? H = [10, 15, 7, 13, 9, 4, 1, 8, 6]

15 7

13 9 4 1

8 6

Yash Nair Section 1 February 2021 11 / 25 Note that every level besides the one with N =1does satisfy the max-heap property. That is an important assumption when calling max-heapify. • What is Max-Heapify’s run-time?

Solution

• Max-Heapify(H, N) returns a max-heap

[15, 13, 7, 11, 9, 4, 1, 8, 10]

13 7

11 9 4 1

8 10

• Runs in O(log n) time because node N is moved down at most log n times.

3 Runtime of Max-Heapify

O(log n) comparisons are made O(log n) swaps are made Total runtime: O(log n).

Yash Nair Section 1 February 2021 12 / 25 Runs in O(log n) time because node N is moved down at most log n times. •

Runs1.3.2 in O(log Build-Heapn) time because node N is moved down at most log n times. Building• a Heap Build-Heap(A): Given an unordered array, makes it into a max heap. 1.3.2 Build-Heap Build-Heap(AlgorithmA): Given an 2 unorderedBuild-Heap( array,A makes) it into a max heap. AlgorithmRequire: 2 Build-Heap(A Ais) an array. Require: Aforis ani array.= length(A)/2 down to 1 do for i = length(A)/2b down to 1 do c b Max-Heapifyc (A, i) Max-Heapify(A, i) end for end for

Description:Description:Begin at the leafBegin nodes atand the call leafmax-heapify nodes andso that call gradually,max-heapify the all the elements,so that gradually, the all the elements, starting at the bottom, will follow the heap property. Idea: Startstarting from last at the node bottom, with will child follow and the work heap up property.

Exercise (3). Run Build-HeapRunExerciseBuild-Heap on: (3)on.A =[2, 1, 4, 3, 6, 5] • Run Build-Heap on A =[22, 1, 4, 3, 6, 5] • 2 1 4

1 4 3 6 5

Running time (loose upper bound): •Yash Nair Section 1 February 2021 13 / 25 Running time (tight upper bound): • 3 6 5 Solution.

Running time (loose[2, 1, upper4, 3, 6, bound):5] • ! [2, 1, 5, 3, 6, 4] Running time (tight upper bound):! • [2, 6, 5, 3, 1, 4] ! [6, 3, 5, 2, 1, 4] O(nSolution.log n) because we perform n operations of max-heapify and those all take O(log n)time • each. O(n). Intuitively, we might be able to get a tighter bound[2, 1 because, 4, 3,Max-Heapify6, 5] is run • ! more often at lower points on the tree, when subtrees are shallow. Making use of the fact [2, 1, 5, 3, 6, 4] ! [2, 6, 5, 3, 1, 4] 4 ! [6, 3, 5, 2, 1, 4] O(n log n) because we perform n operations of max-heapify and those all take O(log n)time • each. O(n). Intuitively, we might be able to get a tighter bound because Max-Heapify is run • more often at lower points on the tree, when subtrees are shallow. Making use of the fact

4 Runtime of Build-Heap

Naive runtime analysis: O(n) calls to Max-Heapify, each takes O(log n) time, so O(n log n) Tighter analysis: When we run Max-Heapify from node of height h, runtime is O(h) Build-Heap does this for heights h = 0,..., log n At most n/2h+1 nodes at height h (Proof:b inductionc on h. BC: h = 0; thered are ate most n/2 nodes at height 0. IS: remove last layer, then new tree has at mostd n c n/2 nodes and height h in original tree is now height h 1, so, by IH,− d at mostc (n n/2 )/2h n/2h+1 ) Gives runtime bound− of − d c ≤ d e blog nc log n n h O(h) = O n d2h+1 e  2h  h=0 h=0 X X   log n ∞ ∞ h 1.5h 3 h = = 4, 2h ≤ 2h 4 h=0 h=0 h=0 X X X so runtime is O(n) Yash Nair Section 1 February 2021 14 / 25 Peek

Q: How to return the maximum element of MAX-HEAP, H, quickly? A: Return H[0] (i.e., the root)! Guaranteed to be maximal by Build-Heap.

Yash Nair Section 1 February 2021 15 / 25 that an n-element heap has height log n and at most n/2h+1 nodes of any height h,the b c d e total number of comparisons that will have to be made is

log n lg n b c n h h+1 O(h)=O n h d2 e 2 ! Xh=0 Xh=0 h It is not hard to see that h1=0 2h coverges to some constant (namely 2). So the runtime is O(n). P

1.3.3 Peek Exercise (4). If someone gave you a max heap H and wanted you to tell me the maximum element in H (without removing it), how would you do so? What is the run time?

Solution. You would simply return H[0] because the root of the max heap is guaranteed to be the largest Extract-Maxelement. This runs in O(1) time.

Remove largest1.3.4 element Extract-Max of heap, and ensure that heap structure is Extract-Max(H): Remove the element with the largest value from the heap and ﬁx everything maintained: so that the heap structure is maintained.

Algorithm 3 * Extract-Max(H) Require: H is a non-empty Max-Heap max H[root]

H[root] H[Size(H)] last element of the heap. { } Size(H) =1 Max-Heapify(H, root) return max

Description: Once we remove the max (the root), we need to replace it with something that’s already in the tree. We choose to take one of the leaves and move it to the root and then max- heapify the heap.

Exercise (5). 3 5 Run Extract-Max on H =[6, 3, 5, 2, 1, 4]. •

2 1 4

What is Extract-Max’s run time? • Yash Nair Section 1 February 2021 16 / 25 5

Solution. Extract-Max ﬁrst returns 6. It then moves the last element, 4, to the head (Why • the last element? It is guaranteed to be a leaf. It would be much harder to pluck out a node with children), and then Max-Heapify is used to maintain the heap structure:

[4, 3, 5, 2, 1] [5, 3, 4, 2, 1] !

O(log n) because we’re performing just one max-heapify operation. •

1.3.5 Insert Insert(H, v): Add the value v to the heap H.

Algorithm 4 Insert(H, v) Require: H is a Max-Heap, v is a new value. Size(H)+=1 H[Size(H)] v Set v to be in the next empty slot. { } N Size(H) Keep track of the node currently containing v. { } while N is not the root and H[Parent(N)]

end while

Description: Add the node to the leaf and then promote it upwards until it is smaller than its parent.

Exercise (6). Run Insert(H, v)withv =8and • H =[6, 3, 5, 2, 1, 4]

6 that an n-element heap has height log n and at most n/2h+1 nodes of any height h,the b c d e total number of comparisons that will have to be made is

log n lg n b c n h h+1 O(h)=O n h d2 e 2 ! Xh=0 Xh=0 h It is not hard to see that h1=0 2h coverges to some constant (namely 2). So the runtime is O(n). P

1.3.3 Peek Exercise (4). If someone gave you a max heap H and wanted you to tell me the maximum element in H (without removing it), how would you do so? What is the run time?

Runtime of Extract-MaxSolution. You would simply return H[0] because the root of the max heap is guaranteed to be the largest element. This runs in O(1) time.

1.3.4 Extract-Max Extract-Max(H): Remove the element with the largest value from the heap and ﬁx everything so that the heap structure is maintained.

Algorithm 3 * Extract-Max(H) Require: H is a non-empty Max-Heap max H[root]

H[root] H[Size(H)] last element of the heap. { } Size(H) =1 Max-Heapify(H, root) return max

Exercise (5). Run Extract-Max on H =[6, 3, 5, 2, 1, 4]. •

Yash Nair Section 1 February 2021 17 / 25

5 3 5

2 1 4

What is Extract-Max’s run time? •

[4, 3, 5, 2, 1] [5, 3, 4, 2, 1] Insert !

O(log n) because we’re performing just one max-heapify operation. •

1.3.5 Insert Insert(H, v): Add the value v to the heap H.

end while

Description: Add the node to the leaf and then promote it upwards until it is smaller than its parent.

Exercise (6). Run Insert(H, v)withv =8and • Yash Nair Section 1 February 2021 18 / 25 H =[6, 3, 5, 2, 1, 4]

6 Example

Run Insert(H, v) with v = 8 on the heap below:

3 5

2 1 4

What is Insert’s runtime? •

Solution. Insert v into H as follows: • [6, 3, 5, 2, 1, 4, 8] Yash Nair Section 1 ! February 2021 19 / 25 [6, 3, 8, 2, 1, 4, 5] ! [8, 3, 6, 2, 1, 4, 5] Here, the while loop is halted by the condition that N is the root. Insert’s runtime is O(log n) because you do the while loop at most log n times. •

Function Run-time peek O(1) Solution. extract-max O(log n) insert O(log n)

2 Depth First Search

In lecture, we talked how to represent graphs on a computer, as well as talked about DFS, BFS, and Dijkstra’s algorithm and their various applications. We will go more in depth into the shortest path algorithms in the next section.

2.1 Important Properties DFS is a systematic way to traverse a graph, touching all of the vertices of the graph once. • The run-time is O( V + E ). • | | | | DFS produces tree edges, forward edges, back edges, and cross edges. • DFS also produces pre and post-order numbers that tell you when a node was ﬁrst touched • and when it has been done searching all that node’s neighbors. DFS can be used for detecting cycles, ﬁnding topological sorts, and determining the strongly • connected components of a graph.

7 3 5

2 1 4

What is Extract-Max’s run time? •

[4, 3, 5, 2, 1] [5, 3, 4, 2, 1] Runtime of Insert ! O(log n) because we’re performing just one max-heapify operation. •

1.3.5 Insert Insert(H, v): Add the value v to the heap H.

end while

Description: Add the node to the leaf and then promote it upwards until it is smaller than its parent. While loop occurs O(log n) times Both stepsExercise on while(6). loop take constant time Run Insert(H, v)withv =8and Total runtime:• O(log n) H =[6, 3, 5, 2, 1, 4]

Yash Nair Section 1 February 2021 20 / 25 6 Recap

MAX-HEAP allows us to ﬁnd the maximum of a collection of elements quickly Max-Heapify, which ensures heap property, takes O(log n) time Build-Heap, which builds a MAX-HEAP give collection of numbers using Max-Heapify, takes O(n) time Peek, which returns maximal element in heap, takes O(1) time by returning the root Extract-Max, which removes the maximal element and then Max-Heapiﬁes to maintain the heap structure, takes O(log n) time Insert, which promotes the added node up until it is smaller than its parent to ensure heap structure, takes O(log n) time

Yash Nair Section 1 February 2021 21 / 25 0. Hand-Running Heap Operations

Start with the empty heap and perform the following operations: Insert(3), Insert(1), Insert(4), Insert(1), Extract-Max, Insert(5), Insert(9), Extract-Max, Insert(2), Extract-Max, Extract-Max, Insert(6). What does the resulting heap look like?

Yash Nair Section 1 February 2021 22 / 25 Exercise 1. How would you ﬁnd the index of the parent, left child and right child of the ith element in the heap? • Parent(i)= • Left(i)= • Right(i)=

Solution The array representation is simpliﬁed by calling the ﬁrst element in the heap 1: i • Parent(i)= 2 • Left(i)=2i ⌅ ⇧ • Right(i)=2i +1

1.1.3 IterativeHeap operations Max-Heapify (CLRS 6.2-5) Before we look at how peek, insertion and deletion work, let’s ﬁrst implement some helper functions Recallthat will the be pseudocode useful for maintaining for the the heap recursive structure implementation and building a heap from of scratch.Max-Heapify below.1.3.1 WriteMax-Heapify an iterative version, taking the same amount of time. In Max-Heapify(H, N): Given that the children of the node N in the Max-Heap H are each the particular,root of a Max-Heap your iterative, rearranges version the tree rootedmay atnotN tocall be atheMax-Heap original. We Max-Heapify will call this when algorithmwe break the nor heap itself. property in order to ﬁx it. Max-Heapify(H, N) Require: N is the root of a Max-Heap except, possibly, at N (i.e., N could be smaller than its children) 1: (l, r) (Left(N), Right(N))

2: if Exists(l)andH[l] >H[N] then 3: largest l

4: else 5: largest N

6: end if 7: if Exists(r)andH[r] >H[largest] then 8: largest r

9: end if 10: if largest = N then 6 11: Swap(H[N],H[largest]) 12: Max-Heapify(H, largest) 13: end if Ensure: N is the root of a Max-Heap Yash Nair Section 1 February 2021 23 / 25 Description: First, we set larger to be the bigger of N and `,thevalueofthenodeN or its left child. Then, we ﬁnd the larger of that and the value of N’s right child. We swap H[larger] with the root so that now the root is bigger in value than the two children. We recursively call max-heapify on the side that we swapped with. If no swap occurred, we do not recurse.

2 2. Tightness of Runtime Analyses

a.) Show that the O(log n) upper-bound on Max-Heapify’s runtime is tight. In other words, construct an inﬁnite family of arrays, An , with { }n∞=1 An a MAX-HEAP except, possibly, at the root (i.e., the root may be smaller than at least one of its children), and length(An) = n for all n, so that the running time, T (n), of Max-Heapify(An, 1) is Ω(log n).

b.) Let An be any inﬁnite family of arrays with length(An) = n. { }n∞=1 Prove that the running time, T (n), of Build-Heap(An) is Θ(n).

Yash Nair Section 1 February 2021 24 / 25 3. Sorting with a MAX-HEAP

Consider the following sorting algorithm: run Build-Heap, then repeatedly Extract-Max and append to a sorted list. How eﬃcient is this algorithm? Give a tight bound on the running time.

Yash Nair Section 1 February 2021 25 / 25