Data Structures and Algorithms (CS210A)

Lecture 29: • Building a Binary on 풏 elements in O(풏) time. • Applications of Binary heap : sorting • Binary trees: beyond searching and sorting

1 Recap from the last lecture

2 A complete binary

How many leaves are there in a Complete of size 풏 ?

풏/ퟐ

3 Building a Binary heap

Problem: Given 풏 elements {푥0, …, 푥푛−1}, build a binary heap H storing them.

Trivial solution: (Building the Binary heap incrementally) CreateHeap(H); For( 풊 = 0 to 풏 − ퟏ ) What is the time Insert(푥𝑖,H); complexity of this algorithm?

4 Building a Binary heap incrementally

What useful inference can you draw from Top-down this Theorem ? approach

The for inserting a leaf node = ?O (log 풏 ) # leaf nodes = 풏/ퟐ ,  Theorem: Time complexity of building a binary heap incrementally is O(풏 log 풏). 5 Building a Binary heap incrementally

What useful inference can you draw from Top-down this Theorem ? approach

The O(풏) time algorithm must take O(1) time

for each of the 풏/ퟐ leaves. 6 Building a Binary heap incrementally

Top-down approach

7 Think of alternate approach for building a binary heap In any complete 98 binaryDoes treeit suggest, how a manynew nodes approach satisfy to

heapbuild propertybinary heap ? ? 14 33

all leaf 37 11 52 32 nodes 

41 21 76 85 17 25 88 29 Bottom-up approach

47 75 9 57 23 heap property: “Every ? node stores value smaller than its children” We just need to ensure this property at each node. 8 Think of alternate approach for building a binary heap

98

14 33

37 11 52 32

41 21 76 85 17 25 88 29 Bottom-up approach

47 75 9 57 23 heap property: “Every ? node stores value smaller than its children” We just need to ensure this property at each node. 9 A new approach to build binary heap

1. Just copy the given 풏 elements {푥0, …, 푥푛−1} into an array H.

2. The heap property holds for all the leaf nodes in the corresponding complete binary tree.

3. Leaving all the leaf nodes, process the elements in the decreasing order of their numbering and the heap property for each of them.

10 A new approach to build binary heap 98

14 33

37 11 52 32

85 17 25 88 41 21 29 76

The first node to be 47 75 9 57 23 processed. H 98 14 33 37 11 52 32 85 17 25 88 41 21 29 76 47 75 9 57 23

11 A new approach to build binary heap 98

14 33

37 11 52 32

85 17 23 88 41 21 29 76

The second node to 47 75 9 57 25 be processed. H 98 14 33 37 11 52 32 85 17 23 88 41 21 29 76 47 75 9 57 25

12 A new approach to build binary heap 98

14 33

37 11 52 32

85 9 23 88 41 21 29 76

The third node to 47 75 17 57 25 be processed. H 98 14 33 37 11 52 32 85 9 23 88 41 21 29 76 47 75 17 57 25

13 A new approach to build binary heap 98

14 33

37 11 52 32

47 9 23 88 41 21 29 76

85 75 17 57 25 H 98 14 33 37 11 52 32 47 9 23 88 41 21 29 76 85 75 17 57 25

14 A new approach to build binary heap 98

v 54 21

9 11 33 29

47 17 23 88 41 52 32 76

85 75 37 57 25 H 98 54 21 9 11 33 29 47 17 23 88 41 52 32 76 85 75 37 57 25

15 A new approach to build binary heap 98

v 9 21

54 11 33 29

47 17 23 88 41 52 32 76

85 75 37 57 25 H 98 54 21 9 11 33 29 47 17 23 88 41 52 32 76 85 75 37 57 25

16 A new approach to build binary heap 98

v 9 21

17 11 33 29

47 54 23 88 41 52 32 76

85 75 37 57 25 H 98 54 21 9 11 33 29 47 17 23 88 41 52 32 76 85 75 37 57 25

17 A new approach to build binary heap 98

v 9 21

17 11 33 29

47 37 23 88 41 52 32 76

85 75 54 57 25 H 98 54 21 9 11 33 29 47 17 23 88 41 52 32 76 85 75 37 57 25

Let v be a node corresponding to index i in H. The process of restoring heap property at i called Heapify(i,H). 18 Heapify(풊,H)

Heapify(풊,H) { 풏  size(H) -1 ;

While ( ? and ? ) {

For node 풊, compare its value with those of its children If it is smaller than any of its children  Swap it with smallest child and move down …

Else stop !

} } 19

Heapify(풊,H)

Heapify(풊,H) { 풏  size(H) -1 ; Flag  true; While ( 풊 ≤ ( 풏 −?1 ) / 2 and Flag ?= true ) { 풎풊풏  풊; If( H[ 풊 ]>H[ ?ퟐ 풊 + ퟏ ] ) 풎풊풏  ퟐ풊 + ퟏ; If( ퟐ 풊 + ퟐ ≤ 풏 and ? H[ 풎풊풏 ]>H[ ퟐ 풊 + ퟐ ] ) 풎풊풏  ퟐ풊 + ퟐ; If(풎풊풏 ≠ 풊) { H(풊) ↔ H(풎풊풏); 풊  풎풊풏; } else Flag  false; } } 20

Building Binary heap in O(n) time How many nodes of height ℎ can there be 98

in a complete Binary Time to heapify node v ? tree of 풏 nodes ? v 54 21

9 11 33 29 Height( v)

47 17 23 88 41 52 32 76

85 75 37 57 25

H 98 54 21 9 11 33 29 47 17 23 88 41 52 32 76 85 75 37 57 25

Time complexity of algorithm = ? 푶(ℎ) ∙ 푵(ℎ) 21 ℎ No. of nodes of height ℎ A complete binary tree How many nodes of height ℎ can there be Hence the number of nodes 푛 of height ℎ is bounded by in a complete Binary ퟐ풉 tree of 풏 nodes ?

Each subtree is also a complete binary tree.  A subtree of height ℎ has at least 2ℎ nodes Moreover, no two subtrees of height ℎ in the given tree have any element in common22

Building Binary heap in O(n) time

푛 Lemma: the number of nodes of height ℎ is bounded by . ퟐℎ

log 푛 Hence Time complexity to build the heap = 푛 푶(ℎ) 풉=1 ퟐℎ log 푛 ℎ = 푛 푐 𝑖=1 ퟐℎ = O(푛)

As an exercise (using knowledge from your JEE log 푛 preparation days), show that ℎ is bounded by 2 ℎ=1 ퟐ풉

23 Sorting using a Binary heap

24 Sorting using heap

Build heap H on the given 풏 elements; While (H is not empty) { x  Extract-min(H); print x; } This is HEAP algorithm Time complexity : O(풏 log 풏)

Question: Which is the best : (, Heap sort, Quick sort) ? Answer: Practice programming assignment 

25

Binary trees: beyond searching and sorting

• Elegant solution for two interesting problem

• An important lesson:

Lack of proper understanding of a problem is a big hurdle to solve the problem

26 Two interesting problems on sequences

27 What is a sequence ?

A sequence S = ≺ 푥0, …, 푥푛−1 ≻ • Can be viewed as a mapping from [0, 푛]. • Order does matter.

28 Problem 1

Multi-increment

29 Problem 1

Given an initial sequence S = ≺ 푥0, …, 푥푛−1 ≻ of numbers, maintain a compact to perform the following operations: • ReportElement(풊):

Report the current value of 푥𝑖. • Multi-Increment(풊, 풋, ∆):

Add ∆ to each 푥푘 for each 풊 ≤ 풌 ≤ 풋

Example: Let the initial sequence be S = ≺ 14, 12, 23, 12, 111, 51, 321, -40 ≻ After Multi-Increment(2,6,10), S becomes ≺ 14, 12, 33, 22, 121, 61, 331, -40 ≻ After Multi-Increment(0,4,25), S becomes ≺ 39, 37, 58, 47, 146, 61, 331, -40 ≻ After Multi-Increment(2,5,31), S becomes ≺ 39, 37, 89, 78, 177, 92, 331, -40 ≻

30 Problem 1

Given an initial sequence S = ≺ 푥0, …, 푥푛−1 ≻ of numbers, maintain a compact data structure to perform the following operations: • ReportElement(풊):

Report the current value of 푥𝑖. • Multi-Increment(풊, 풋, ∆):

Add ∆ to each 푥푘 for each 풊 ≤ 풌 ≤ 풋

Trivial solution :

Store S in an array A[0.. 풏-1] such that A[풊] stores the current value of 푥풊. Multi-Increment(풊, 풋, ∆) { 풊 ≤ 풌 ≤ 풋 풌  풌 For ( ) A[ ] A[ ]+ ∆; O(풋 − 풊) = O(풏) }

ReportElement(풊){ return A[풊] } O(1)

31 Problem 1

Given an initial sequence S = ≺ 푥0, …, 푥푛−1 ≻ of numbers, maintain a compact data structure to perform the following operations: • ReportElement(풊):

Report the current value of 푥𝑖. • Multi-Increment(풊, 풋, ∆):

Add ∆ to each 푥푘 for each 풊 ≤ 풌 ≤ 풋

Trivial solution :

Store S in an array A[0.. 풏-1] such that A[풊] stores the current value of 푥풊.

Question: the source of difficulty in breaking the O (n) barrier for Multi-Increment() ? Answer: we need to explicitly maintain in S.

Question: who asked/inspired us to maintain S explicitly. Answer: 1. incomplete understanding of the problem 2. conditioning based on incomplete understanding 32

Towards efficient solution of Problem 1

Assumption: without loss of generality assume 풏 is power of 2.

Explore ways to maintain sequence S implicitly such that • Multi-Increment(풊, 풋, ∆) is efficient • Report(풊) is efficient too.

Main hurdle: To perform Multi-Increment(풊, 풋, ∆) efficiently

33 Problem 2

Dynamic Range-minima

34 Problem 2

Given an initial sequence S = ≺ 푥0, …, 푥푛−1 ≻ of numbers, maintain a compact data structure to perform the following operations efficiently for any 0 ≤ 풊 < 풋 < 풏. • ReportMin(풊, 풋):

Report the minimum element from {푥푘 | for each 풊 ≤ 풌 ≤ 풋} • Update(풊, a):

a becomes the new value of 푥𝑖.

AIM: • O(풏) size data structure. • ReportMin(풊, 풋) in O(log 풏) time. • Update(풊, a) in O(log 풏) time. All data structure lovers must ponder over these two problems . 35 We shall discuss them in the next class.