CS 270 Week 10 Oliver Kullmann

Sorting

1. Binary heaps
2. Heapification
3. Building a heap
4. HEAP-SORT
5. Priority queues
6. QUICK-SORT
7. Analysing QUICK-SORT
8. Tutorial

General remarks


We return to sorting, considering HEAP-SORT and QUICK-SORT.

Reading from CLRS for week 7:
1. Chapter 6, Sections 6.1 - 6.5.
2. Chapter 7, Sections 7.1, 7.2.

Discover the properties of binary heaps

Running example

First property: level-completeness

In week 7 we have seen binary trees:
1. We said they should be as “balanced” as possible.
2. Perfect are the perfect binary trees.
3. Now close to perfect come the level-complete binary trees:
   1. We can partition the nodes of a (binary) tree T into levels, according to their distance from the root.
   2. We have levels 0, 1, ..., ht(T).
   3. Level k has from 1 to 2^k nodes.
   4. If all levels k, except possibly level ht(T), are full (have precisely 2^k nodes in them), then we call the tree level-complete.

Examples

The tree

[Figure: binary tree with levels (1), (2, 3), (4, 5, 6, 7), (10, 13, 14, 15)]

is level-complete (level-sizes are 1, 2, 4, 4), while

[Figure: binary tree with levels (1), (2, 3), (4, 5, 6), (8, 9, 10, 11, 12, 13)]

is not (level-sizes are 1, 2, 3, 6).

The height of level-complete binary trees

For a level-complete binary tree T we have

    ht(T) = ⌊lg(#nds(T))⌋.

That is, the height of T is the binary logarithm of the number of nodes of T, after removal of the fractional part.

We said that “balanced” T should have ht(T) ≈ lg(#nds(T)). Now that’s very close.

Second property: completeness

To have simple and efficient access to the nodes of the tree, the nodes of the last layer should not be placed in random order:

Best is if they fill the positions from the left without gaps.
A level-complete binary tree with such a gap-less last layer is called a complete tree.

So the level-complete binary tree on the examples-slide is not complete, while the running-example is complete.

Third property: the heap-property

The running-example is not a binary search tree:

1. It would be too expensive to have this property together with the completeness property.
2. However we have another property related to order (not just related to the structure of the tree): The value of every node is not less than the value of any of its successors (the nodes below it).
3. This property is called the heap property.
4. More precisely it is the max-heap property.

Definition 1
A binary heap is a binary tree which is complete and has the heap property.

More precisely we have binary max-heaps and binary min-heaps.

Fourth property: Efficient index computation

Consider the numbering (not the values) of the nodes of the running-example:

1. This numbering follows the layers, beginning with the first layer and going from left to right.
2. Due to the completeness property (no gaps!) these numbers yield easy relations between a parent and its children.
3. If the node has number p, then the left child has number 2p, and the right child has number 2p + 1.
4. And the parent has number ⌊p/2⌋.

Efficient array implementation

For binary search trees we needed full-fledged trees (as discussed in week 7):

1. That is, we needed nodes with three pointers: to the parent and to the two children.
2. However now, for complete binary trees we can use a more efficient array implementation, using the numbering for the array-indices.

So a binary heap with m nodes is represented by an array with m elements:

C-based languages use 0-based indices (while the book uses 1-based indices).
For such an index 0 ≤ i < m the index of the left child is 2i + 1, and the index of the right child is 2i + 2, while the index of the parent is ⌊(i − 1)/2⌋ (for i ≥ 1; the root has no parent).

Float down a single disturbance


The idea of heapification

The input is an array A and an index i into A.

It is assumed that the binary trees rooted at the left and right child of i are binary (max-)heaps, but we do not assume anything on A[i].
After the “heapification”, the values of the binary tree rooted at i have been rearranged, so that it is a binary (max-)heap now.

For that, the procedure proceeds as follows:
1. First the largest of A[i], A[l], A[r] is determined, where l = 2i and r = 2i + 1 (the two children).
2. If A[i] is largest, then we are done.
3. Otherwise A[i] is swapped with the largest element, and we call the procedure recursively on the changed subtree.

Analysing heapification
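The three steps of heapification described above can be sketched in Java as follows. This is only an illustration under stated assumptions: it uses the 0-based indices of the array implementation (children 2i + 1 and 2i + 2) instead of the 1-based l = 2i, r = 2i + 1, and the class name `HeapifySketch` and the extra parameter `n` (the current heap length) are chosen here and do not come from the slides.

```java
final class HeapifySketch {
  /** Floats A[i] down within the heap A[0..n), assuming that the
      subtrees rooted at the children of i are already max-heaps. */
  static void heapify(final int[] A, final int n, final int i) {
    final int l = 2 * i + 1;  // left child (0-based)
    final int r = 2 * i + 2;  // right child (0-based)
    int largest = i;
    // Step 1: determine the largest of A[i], A[l], A[r] (children may not exist).
    if (l < n && A[l] > A[largest]) largest = l;
    if (r < n && A[r] > A[largest]) largest = r;
    // Step 2: if A[i] is largest, we are done.
    if (largest == i) return;
    // Step 3: swap with the largest child and recurse on the changed subtree.
    final int t = A[i]; A[i] = A[largest]; A[largest] = t;
    heapify(A, n, largest);
  }

  public static void main(final String[] args) {
    // Both subtrees of the root are max-heaps; only the root value 4 is out of place.
    final int[] A = {4, 16, 10, 14, 7, 9, 3, 2, 8, 1};
    heapify(A, A.length, 0);
    System.out.println(java.util.Arrays.toString(A));
    // prints [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
  }
}
```

In the example the value 4 travels from the root down to a leaf, so the number of swaps equals the height of the tree.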

Obviously, we go down from the node to a leaf (in the worst case), and thus the running-time of heapification is

    linear in the height h of the subtree.

This is O(lg n), where n is the number of nodes in the subtree (due to h = ⌊lg n⌋).

Heapify bottom-up


The idea of building a binary heap

One starts with an arbitrary array A of length n, which shall be re-arranged into a binary heap. Our example is

    A = (4, 1, 3, 2, 16, 9, 10, 14, 8, 7).

We repair (heapify) the binary trees bottom-up:
1. The leaves (the final part, from ⌊n/2⌋ + 1 to n) are already binary heaps on their own.
2. For the other nodes, from right to left, we just call the heapify-procedure.

Analysing building a heap
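The bottom-up construction described above can be sketched in Java, run on the example array. The sketch assumes 0-based indices, so the non-leaves are the indices 0, ..., ⌊n/2⌋ − 1 (instead of 1, ..., ⌊n/2⌋); the names `BuildHeapSketch`, `heapify` and `buildHeap` are chosen here for illustration, and heapification is written iteratively rather than recursively.

```java
final class BuildHeapSketch {
  /** Floats A[i] down within A[0..n); iterative form of heapification. */
  static void heapify(final int[] A, final int n, int i) {
    while (true) {
      final int l = 2 * i + 1, r = 2 * i + 2;
      int largest = i;
      if (l < n && A[l] > A[largest]) largest = l;
      if (r < n && A[r] > A[largest]) largest = r;
      if (largest == i) return;
      final int t = A[i]; A[i] = A[largest]; A[largest] = t;
      i = largest;  // continue in the changed subtree
    }
  }

  /** Re-arranges A into a binary max-heap, bottom-up. */
  static void buildHeap(final int[] A) {
    final int n = A.length;
    // The leaves are heaps already; handle the other nodes from right to left.
    for (int i = n / 2 - 1; i >= 0; --i) heapify(A, n, i);
  }

  public static void main(final String[] args) {
    final int[] A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7};
    buildHeap(A);
    System.out.println(java.util.Arrays.toString(A));
    // prints [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
  }
}
```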

Roughly we have O(n · lg n) many operations:
1. Here however it pays off to take into account that most of the subtrees are small (about half of the nodes are leaves, about a quarter have height 1, and so on).
2. Then we get run-time O(n), since the total work is proportional to n · (1/4 + 2/8 + 3/16 + ...), a series that converges.

So building a heap is linear in the number of elements.

Heapify and remove from last to first


The idea of HEAP-SORT

Now the task is to sort an array A of length n:
1. First make a heap out of A (in linear time).
2. Repeat the following until n = 1:
   1. The maximum element is now A[1]; swap that with the last element A[n], and remove that last element, i.e., n := n − 1.
   2. Now perform heapification for the root, i.e., i = 1. We have a binary (max-)heap again (of length one less).

The run-time is O(n · lg n).

All basic operations are (nearly) there

Recall that a (basic) (max-)priority queue has the operations:
   MAXIMUM
   DELETE-MAX
   INSERTION.

We use an array A containing a binary (max-)heap (the task is just to maintain the heap-property!):
1. The maximum is A[1].
2. For deleting the maximum element, we put the last element A[n] into A[1], decrease the length by one (i.e., n := n − 1), and heapify the root (i.e., i = 1).
3. And we add a new element by adding it to the end of the current array, and heapifying all its predecessors up on the way to the root.

Examples
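The three priority-queue operations described above can be sketched in Java. This is a minimal illustration, not a definitive implementation: it uses 0-based indices (so the maximum is A[0]), the class name `MaxHeapQueueSketch` and its method names are chosen here, and the growing of the array by doubling is an added assumption that the slides do not discuss. Insertion follows the description literally, calling heapification on every predecessor on the way to the root.

```java
final class MaxHeapQueueSketch {
  private int[] A = new int[8];  // the heap occupies A[0..n)
  private int n = 0;

  boolean isEmpty() { return n == 0; }

  /** MAXIMUM: the root of the heap. */
  int maximum() {
    if (n == 0) throw new java.util.NoSuchElementException("empty queue");
    return A[0];
  }

  /** DELETE-MAX: move the last element to the root, shrink, heapify the root. */
  int deleteMax() {
    final int max = maximum();
    A[0] = A[--n];
    heapify(0);
    return max;
  }

  /** INSERTION: append at the end, then heapify all predecessors up to the root. */
  void insert(final int v) {
    if (n == A.length) A = java.util.Arrays.copyOf(A, 2 * n);  // assumption: grow by doubling
    int i = n++;
    A[i] = v;
    while (i > 0) {
      i = (i - 1) / 2;  // parent index (0-based)
      heapify(i);
    }
  }

  /** Floats A[i] down within A[0..n). */
  private void heapify(int i) {
    while (true) {
      final int l = 2 * i + 1, r = 2 * i + 2;
      int largest = i;
      if (l < n && A[l] > A[largest]) largest = l;
      if (r < n && A[r] > A[largest]) largest = r;
      if (largest == i) return;
      final int t = A[i]; A[i] = A[largest]; A[largest] = t;
      i = largest;
    }
  }
}
```

A possible use: after inserting 3, 10, 5, 1, 8 in this order, repeated calls of `deleteMax()` return the values in decreasing order.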

Using our running-example, a few slides ago for HEAP-SORT:
1. Considering it from (a) to (j), we can see what happens when we perform a sequence of DELETE-MAX operations, until the heap only contains one element (we ignore here the shaded elements; they are visible only for the HEAP-SORT).
2. And considering the sequence in reverse order, we can see what happens when we call INSERTION on the respective first shaded elements (these are special insertions, always inserting a new max-element).

Analysis

MAXIMUM is a constant-time operation.
DELETE-MAX is one application of heapification, and so needs time O(lg n) (where n is the current number of elements in the heap).
INSERTION seems to need up to the current height many applications of heapification, and thus would look like O((lg n)^2), but it’s easy to see that it is O(lg n) as well (see the tutorial).

The idea of QUICK-SORT

Remember MERGE-SORT:

A divide-and-conquer algorithm for sorting an array in time O(n · lg n).
The array is split in half, the two parts are sorted recursively (via MERGE-SORT), and then the two sorted half-arrays are merged to the sorted (full-)array.

Now we split along an element x of the array:

We partition into elements ≤ x (first array) and > x (second array).
Then we sort the two sub-arrays recursively.

Done!

Remark on ranges

In the book arrays are 1-based:
1. So the indices for an array A of length n are 1, ..., n.
2. Accordingly, a sub-array is given by indices p ≤ r, meaning the range p, ..., r.

For Java-code we use 0-based arrays:
1. So the indices are 0, ..., n − 1.
2. Accordingly, a sub-array is given by indices p < r, meaning the range p, ..., r − 1.

Range-bounds for a sub-array are here now always left-closed and right-open!

So the whole array is given by the range-parameters 0, n.

The main procedure

    public static void sort(final int[] A, final int p, final int r) {
      assert(A != null);
      assert(p >= 0);
      assert(p <= r);
      assert(r <= A.length);
      final int length = r - p;
      if (length <= 1) return;
      place_partition_element_last(A, p, r);
      final int q = partition(A, p, r);
      assert(p <= q);
      assert(q < r);
      sort(A, p, q);
      sort(A, q + 1, r);
    }

The idea of partitioning in-place

An example

The code

Instead of i we use q = i + 1:

    private static int partition(final int[] A, final int p, final int r) {
      assert(p + 1 < r);
      final int x = A[r - 1];
      int q = p;
      for (int j = p; j < r - 1; ++j) {
        final int v = A[j];
        if (v <= x) { A[j] = A[q]; A[q++] = v; }
      }
      A[r - 1] = A[q];
      A[q] = x;
      return q;
    }

Selecting the pivot

The partitioning-procedure expects the partitioning-element to be the last array-element. So for selecting the pivot, we can just choose the last element:

    private static void place_partition_element_last(final int[] A, final int p, final int r) {}

However this makes it vulnerable to “malicious” choices, so we better randomise:

    private static void place_partition_element_last(final int[] A, final int p, final int r) {
      // The cast must apply to the whole product: "(int) Math.random()" alone is always 0.
      final int i = p + (int) (Math.random() * (r - p));
      { final int t = A[i]; A[i] = A[r - 1]; A[r - 1] = t; }
    }

A not unreasonable tree


Average-case

If we actually achieve that both sub-arrays are at least a constant fraction α of the whole array (in the previous picture, that’s α = 0.1), then we get

    T(n) = T(α · n) + T((1 − α) · n) + Θ(n).

That’s basically the second case of the Master Theorem (the picture says it’s similar to α = 1/2), and so we would get

    T(n) = Θ(n · log n).

And we actually get that:
   for the non-randomised version (choosing always the last element as pivot), when averaging over all possible input sequences (without repetitions);
   for the randomised version (choosing a random pivot), when averaging over all (internal!) random choices; here we do not have to assume something on the inputs, except that all values are different.

Worst-case

However, as the tutorial shows:

    The worst-case run-time of QUICK-SORT is Θ(n^2) (for both versions)!

This can be repaired, making also the worst-case run-time Θ(n · log n).
For example by using median-computation in linear time for the choice of the pivot.
However, in practice this is typically not worth the effort!

HEAP-SORT on sorted sequence

What does HEAP-SORT do on an already sorted sequence? And what’s the complexity? Consider the input sequence

    1, 2, ..., 10.
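To explore this question experimentally, HEAP-SORT as described earlier (build a heap, then repeatedly swap the maximum to the end and heapify the root) can be sketched in Java and run on the input 1, 2, ..., 10. The sketch assumes 0-based indices; the class name `HeapSortSketch` and the printing of the intermediate heap are choices made here for illustration.

```java
final class HeapSortSketch {
  /** Floats A[i] down within the heap A[0..n). */
  static void heapify(final int[] A, final int n, int i) {
    while (true) {
      final int l = 2 * i + 1, r = 2 * i + 2;
      int largest = i;
      if (l < n && A[l] > A[largest]) largest = l;
      if (r < n && A[r] > A[largest]) largest = r;
      if (largest == i) return;
      final int t = A[i]; A[i] = A[largest]; A[largest] = t;
      i = largest;
    }
  }

  /** Phase 1 of HEAP-SORT: make a max-heap out of A. */
  static void buildHeap(final int[] A) {
    for (int i = A.length / 2 - 1; i >= 0; --i) heapify(A, A.length, i);
  }

  /** Sorts A in ascending order. */
  static void heapSort(final int[] A) {
    buildHeap(A);
    int n = A.length;
    while (n > 1) {
      // The maximum is A[0]: swap it with the last heap element ...
      final int t = A[0]; A[0] = A[n - 1]; A[n - 1] = t;
      --n;                // ... remove that element from the heap ...
      heapify(A, n, 0);   // ... and repair the heap at the root.
    }
  }

  public static void main(final String[] args) {
    final int[] A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    final int[] H = A.clone();
    buildHeap(H);
    System.out.println("heap:   " + java.util.Arrays.toString(H));
    heapSort(A);
    System.out.println("sorted: " + java.util.Arrays.toString(A));
  }
}
```

Comparing the printed heap with the input shows how much the building phase has to rearrange a sequence that is already sorted.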

Simplifying insertion

When discussing insertion into a (max-)priority-queue, implemented via a binary (max-)heap, we just used a general addition of one element into an existing heap:

The insertion-procedure used heapification up on the path to the root.
Now actually we have always special cases of heapification. Namely which?

Change to the partitioning procedure

What happens if we change the line

    if (v <= x) { A[j] = A[q]; A[q++] = v; }

of function partition to

    if (v < x) { A[j] = A[q]; A[q++] = v; }

Can we do it?
Would it have advantages?

QUICK-SORT on constant sequences

What is QUICK-SORT doing on a constant sequence, in its three incarnations:
   pivot is last element
   pivot is random element
   pivot is median element?

One of the two sub-arrays will have size at most 1, and QUICK-SORT degenerates to an O(n^2) algorithm (which does nothing).

What can we do about it? We can refine the partition-procedure by not just splitting into two parts, but into three parts: all elements < x, all elements = x, and all elements > x.
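Such a split into three parts can be sketched in Java as follows. This is an illustration under assumptions, not the partition-procedure of the lecture: the names `ThreeWayPartitionSketch` and `partition3` are invented here, the pivot x is taken to be the last element of the range, and the scan follows the well-known three-way ("Dutch national flag") scheme. The procedure returns the bounds of the middle part, from which a pivot-index can then be derived.

```java
final class ThreeWayPartitionSketch {
  /** Rearranges A[p..r) into [ < x | = x | > x ] for x = A[r-1] and returns
      {lt, gt}, meaning that A[lt..gt) holds exactly the elements equal to x. */
  static int[] partition3(final int[] A, final int p, final int r) {
    final int x = A[r - 1];
    int lt = p;  // A[p..lt)  contains the elements < x
    int i = p;   // A[lt..i)  contains the elements = x
    int gt = r;  // A[gt..r)  contains the elements > x
    while (i < gt) {
      final int v = A[i];
      if (v < x) { A[i] = A[lt]; A[lt] = v; ++lt; ++i; }
      else if (v > x) { --gt; A[i] = A[gt]; A[gt] = v; }
      else ++i;
    }
    return new int[] {lt, gt};
  }

  /** QUICK-SORT on the range A[p..r), using the three-way split. */
  static void sort(final int[] A, final int p, final int r) {
    if (r - p <= 1) return;
    final int[] eq = partition3(A, p, r);
    // Pivot-index: the middle index of the part of all elements = x.
    final int q = eq[0] + (eq[1] - eq[0]) / 2;
    sort(A, p, q);
    sort(A, q + 1, r);
  }

  public static void main(final String[] args) {
    final int[] A = {7, 7, 7, 7, 7, 7, 7, 7};
    System.out.println(java.util.Arrays.toString(partition3(A, 0, A.length)));
    // prints [0, 8]: the whole constant range is the middle part
  }
}
```

On a constant sequence the middle part is the whole range, so the derived pivot-index splits the range into two halves of nearly equal size.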

Then we choose the pivot-index as the middle index of the part of all elements = x. We get O(n log n) for constant sequences.

Worst-case for QUICK-SORT

Consider sequences without repetitions, and assume the pivot is always the last element:

What is a worst-case input?
And what is QUICK-SORT doing on it?

Every already sorted sequence is a worst-case example!
QUICK-SORT behaves as with constant sequences.

Note that this is avoided with randomised pivot-choice (and, of course, with median pivot-choice).

Worst-case O(n log n) for QUICK-SORT

How can we achieve O(n log n) in the worst-case for QUICK-SORT?

The point is that just choosing, within our current framework, the median-element is not enough, but we need to change the framework, allowing to compute the median-index.
Best is to remove the function place_partition_element_last, and leave the partitioning fully to function partition.

Then the main procedure becomes (without the asserts):

    public static void sort(final int[] A, final int p, final int r) {
      if (r - p <= 1) return;
      final int q = partition(A, p, r);
      sort(A, p, q);
      sort(A, q + 1, r);
    }
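How the changed framework fits together can be sketched as a complete Java class, in which partition chooses the pivot itself. Note the assumptions: the class name `QuickSortSketch` and the helper `pivotIndex` are hypothetical, and `pivotIndex` is only a stand-in that returns a random index. For the worst-case bound O(n log n) it would have to be replaced by a linear-time median computation, which is not shown here.

```java
final class QuickSortSketch {
  /** Hypothetical stand-in for the pivot choice: a random index in [p, r).
      A linear-time median computation would be needed for the worst-case bound. */
  static int pivotIndex(final int[] A, final int p, final int r) {
    return p + (int) (Math.random() * (r - p));
  }

  /** Chooses the pivot itself, partitions A[p..r), and returns the pivot's final index. */
  static int partition(final int[] A, final int p, final int r) {
    final int i = pivotIndex(A, p, r);
    final int x = A[i];
    A[i] = A[r - 1];  // move the pivot to the last position
    A[r - 1] = x;
    int q = p;
    for (int j = p; j < r - 1; ++j) {
      final int v = A[j];
      if (v <= x) { A[j] = A[q]; A[q++] = v; }
    }
    A[r - 1] = A[q];
    A[q] = x;
    return q;
  }

  /** The main procedure without place_partition_element_last. */
  public static void sort(final int[] A, final int p, final int r) {
    if (r - p <= 1) return;
    final int q = partition(A, p, r);
    sort(A, p, q);
    sort(A, q + 1, r);
  }

  public static void main(final String[] args) {
    final int[] A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7};
    sort(A, 0, A.length);
    System.out.println(java.util.Arrays.toString(A));
    // prints [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
  }
}
```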