CS 270 Algorithms Week 10 Oliver Kullmann
Sorting

1. Binary heaps
2. Heapification
3. Building a heap
4. HEAP-SORT
5. Priority queues
6. QUICK-SORT
7. Analysing QUICK-SORT
8. Tutorial

General remarks

We return to sorting, considering HEAP-SORT and QUICK-SORT.

Reading from CLRS for week 7:
1. Chapter 6, Sections 6.1 - 6.5.
2. Chapter 7, Sections 7.1, 7.2.
Discover the properties of binary heaps

Running example

[Figure: the running example]
First property: level-completeness

In week 7 we have seen binary trees:
1. We said they should be as “balanced” as possible.
2. Perfect are the perfect binary trees.
3. Now close to perfect come the level-complete binary trees:
   1. We can partition the nodes of a (binary) tree T into levels, according to their distance from the root.
   2. We have levels 0, 1, ..., ht(T).
   3. Level k has from 1 to 2^k nodes.
   4. If all levels k, except possibly level ht(T), are full (have precisely 2^k nodes in them), then we call the tree level-complete.

Examples

The binary tree
[Figure: binary tree with nodes 1 (level 0); 2, 3 (level 1); 4, 5, 6, 7 (level 2); 10, 13, 14, 15 (level 3)]

is level-complete (level-sizes are 1, 2, 4, 4), while

[Figure: binary tree with nodes 1 (level 0); 2, 3 (level 1); 4, 5, 6 (level 2); 8, 9, 10, 11, 12, 13 (level 3)]

is not (level-sizes are 1, 2, 3, 6).

The height of level-complete binary trees
For a level-complete binary tree T we have

    ht(T) = ⌊lg(#nds(T))⌋.

That is, the height of T is the binary logarithm of the number of nodes of T, after removal of the fractional part.

We said that “balanced” T should have ht(T) ≈ lg(#nds(T)). Now that’s very close.
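The height formula follows from counting the nodes level by level. A short derivation, where the abbreviations n = #nds(T) and h = ht(T) are introduced here only for this purpose:

```latex
% Levels 0..h-1 are full, level h holds between 1 and 2^h nodes.
\[
  \Bigl(\sum_{k=0}^{h-1} 2^k\Bigr) + 1 \;\le\; n \;\le\; \sum_{k=0}^{h} 2^k
  \quad\Longrightarrow\quad
  2^h \;\le\; n \;\le\; 2^{h+1} - 1 \;<\; 2^{h+1}
  \quad\Longrightarrow\quad
  h \;\le\; \lg n \;<\; h + 1 ,
\]
so $h = \lfloor \lg n \rfloor$.
```

For the level-complete example tree with 1 + 2 + 4 + 4 = 11 nodes this gives ⌊lg 11⌋ = 3, matching its four levels 0, ..., 3.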
Second property: completeness

To have simple and efficient access to the nodes of the tree, the nodes of the last layer should not be placed in random order:

Best is if they fill the positions from the left without gaps.
A level-complete binary tree with such a gap-less last layer is called a complete tree.

So the level-complete binary tree on the examples-slide is not complete, while the running-example is complete.

Third property: the heap-property

The running-example is not a binary search tree:
1. It would be too expensive to have this property together with the completeness property.
2. However we have another property related to order (not just related to the structure of the tree): The value of every node is not less than the value of any of its successors (the nodes below it).
3. This property is called the heap property.
4. More precisely it is the max-heap property.

Definition 1
A binary heap is a binary tree which is complete and has the heap property.

More precisely we have binary max-heaps and binary min-heaps.

Fourth property: Efficient index computation
Consider the numbering (not the values) of the nodes of the running-example:

1. This numbering follows the layers, beginning with the first layer and going from left to right.
2. Due to the completeness property (no gaps!) these numbers yield easy relations between a parent and its children.
3. If the node has number p, then the left child has number 2p, and the right child has number 2p + 1.
4. And the parent has number ⌊p/2⌋.

Efficient array implementation

For binary search trees we needed full-fledged trees (as discussed in week 7):

1. That is, we needed nodes with three pointers: to the parent and to the two children.
2. However now, for complete binary trees we can use a more efficient array implementation, using the numbering for the array-indices.

So a binary heap with m nodes is represented by an array with m elements:

C-based languages use 0-based indices (while the book uses 1-based indices).
For such an index 0 ≤ i < m the index of the left child is 2i + 1, and the index of the right child is 2i + 2.
The index of the parent is ⌊(i − 1)/2⌋ (for i ≥ 1; the root i = 0 has no parent).
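The 0-based index relations can be sketched as three small Java helpers. The class name `HeapIndex` and the method names are chosen here for illustration only; they are not part of the lecture code.

```java
// Hypothetical helper class: index arithmetic for a binary heap
// stored in a 0-based array.
class HeapIndex {
    // Index of the left child of node i.
    static int left(final int i) { return 2 * i + 1; }

    // Index of the right child of node i.
    static int right(final int i) { return 2 * i + 2; }

    // Index of the parent of node i; only meaningful for i >= 1.
    // For i >= 1 Java's integer division coincides with the floor;
    // for the root (i = 0) it would return 0, not -1, since Java
    // truncates towards zero.
    static int parent(final int i) { return (i - 1) / 2; }
}
```

A child index is only valid if it is smaller than the number m of heap elements, so callers have to check `left(i) < m` and `right(i) < m` before accessing the array.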
Float down a single disturbance

[Figure]

The idea of heapification

The input is an array A and index i into A.
It is assumed that the binary trees rooted at the left and right child of i are binary (max-)heaps, but we do not assume anything on A[i].
After the “heapification”, the values of the binary tree rooted at i have been rearranged, so that it is a binary (max-)heap now.

For that, the algorithm proceeds as follows:

1. First the largest of A[i], A[l], A[r] is determined, where l = 2i and r = 2i + 1 (the two children, in the book's 1-based numbering).
2. If A[i] is largest, then we are done.
3. Otherwise A[i] is swapped with the largest element, and we call the procedure recursively on the changed subtree.
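The three steps above can be sketched in Java as follows. This is a minimal sketch under stated assumptions: it uses 0-based indices (so the children are 2i + 1 and 2i + 2), and the extra parameter `n` for the current heap size as well as the class name `Heapify` are introduced here for illustration.

```java
// Hypothetical sketch of heapification for a max-heap in a 0-based array.
class Heapify {
    // Precondition: the subtrees rooted at the children of i are max-heaps.
    // Postcondition: the subtree rooted at i is a max-heap.
    // Only A[0..n-1] is considered to belong to the heap.
    static void heapify(final int[] A, final int n, final int i) {
        final int l = 2 * i + 1;   // left child (0-based)
        final int r = 2 * i + 2;   // right child (0-based)
        // Step 1: determine the largest of A[i], A[l], A[r].
        int largest = i;
        if (l < n && A[l] > A[largest]) largest = l;
        if (r < n && A[r] > A[largest]) largest = r;
        // Step 2: if A[i] is largest, we are done.
        if (largest == i) return;
        // Step 3: swap and continue in the changed subtree.
        final int t = A[i]; A[i] = A[largest]; A[largest] = t;
        heapify(A, n, largest);
    }
}
```

For example, on the array (16, 4, 10, 14, 7, 9, 3, 2, 8, 1) the call `heapify(A, 10, 1)` lets the value 4 float down twice and yields (16, 14, 10, 8, 7, 9, 3, 2, 4, 1).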
Analysing heapification

Obviously, we go down from the node to a leaf (in the worst case), and thus the running-time of heapification is

    linear in the height h of the subtree.

This is O(lg n), where n is the number of nodes in the subtree (due to h = ⌊lg n⌋).

Heapify bottom-up

[Figure]

The idea of building a binary heap
One starts with an arbitrary array A of length n, which shall be re-arranged into a binary heap. Our example is

    A = (4, 1, 3, 2, 16, 9, 10, 14, 8, 7).

We repair (heapify) the binary trees bottom-up:

1. The leaves (the final part, from ⌊n/2⌋ + 1 to n) are already binary heaps on their own.
2. For the other nodes, from right to left, we just call the heapify-procedure.
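A minimal Java sketch of this bottom-up construction, assuming 0-based indices (so the inner nodes are the indices n/2 − 1 down to 0). The class name `BuildHeap` and the helper signature are assumptions made for this example.

```java
// Hypothetical sketch: building a max-heap bottom-up in a 0-based array.
class BuildHeap {
    static void buildHeap(final int[] A) {
        final int n = A.length;
        // The leaves A[n/2 .. n-1] are heaps on their own;
        // repair the remaining nodes from right to left.
        for (int i = n / 2 - 1; i >= 0; --i) heapify(A, n, i);
    }

    // Lets A[i] float down until the subtree rooted at i is a max-heap.
    static void heapify(final int[] A, final int n, int i) {
        while (true) {
            final int l = 2 * i + 1, r = 2 * i + 2;
            int largest = i;
            if (l < n && A[l] > A[largest]) largest = l;
            if (r < n && A[r] > A[largest]) largest = r;
            if (largest == i) return;
            final int t = A[i]; A[i] = A[largest]; A[largest] = t;
            i = largest;
        }
    }
}
```

Applied to the example A = (4, 1, 3, 2, 16, 9, 10, 14, 8, 7), this sketch produces the heap (16, 14, 10, 8, 7, 9, 3, 2, 4, 1).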
Analysing building a heap

A rough estimate gives O(n · lg n) many operations (⌊n/2⌋ calls of heapify, each taking time O(lg n)):

1. Here however it pays off to take into account that most of the subtrees are small.
2. Then we get run-time O(n).

So building a heap is linear in the number of elements.
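The linear bound can be made plausible by summing the heapification costs per height. The following sketch uses the standard estimate (as in CLRS, Section 6.3) that a heap with n nodes has at most ⌈n/2^(h+1)⌉ nodes of height h:

```latex
% Cost of building a heap: heapify at a node of height h costs O(h).
\[
  \sum_{h=0}^{\lfloor \lg n \rfloor}
    \Bigl\lceil \frac{n}{2^{h+1}} \Bigr\rceil \cdot O(h)
  \;=\; O\Bigl( n \cdot \sum_{h=0}^{\infty} \frac{h}{2^{h}} \Bigr)
  \;=\; O(n),
  \qquad\text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2 .
\]
```

So roughly half of the nodes are leaves and cost nothing, a quarter have height 1, and only a single node (the root) has the full height ⌊lg n⌋.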
Heapify and remove from last to first

[Figure: HEAP-SORT on the running example, panels (a) to (j)]

The idea of HEAP-SORT
Now the task is to sort an array A of length n:

1. First make a heap out of A (in linear time).
2. Repeat the following until n = 1:
   1. The maximum element is now A[1]: swap that with the last element A[n], and remove that last element, i.e., set n := n − 1.
   2. Now perform heapification for the root, i.e., i = 1. We have a binary (max-)heap again (of length one less).

The run-time is O(n · lg n).

All basic operations are (nearly) there

Recall that a (basic) (max-)priority queue has the operations:

- MAXIMUM
- DELETE-MAX
- INSERTION.

We use an array A containing a binary (max-)heap (the task is just to maintain the heap-property!):

1. The maximum is A[1].
2. For deleting the maximum element, we put the last element A[n] into A[1], decrease the length by one (i.e., n := n − 1), and heapify the root (i.e., i = 1).
3. And we add a new element by adding it to the end of the current array, and heapifying all its predecessors up on the way to the root.
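The three priority-queue operations can be sketched as a small Java class. Several things here are assumptions made for illustration: the class name `MaxPQ`, the fixed capacity, the 0-based indexing (so the maximum is `A[0]`), and the way the upward repair after an insertion is written, namely as repeated swaps with the parent, which is one possible way of realising the heapification of the predecessors.

```java
// Hypothetical sketch of a max-priority queue on top of a binary max-heap.
class MaxPQ {
    private final int[] A;  // the heap occupies A[0..n-1]
    private int n = 0;      // current number of elements

    MaxPQ(final int capacity) { A = new int[capacity]; }

    int size() { return n; }

    // MAXIMUM: the root of the heap.
    int maximum() {
        if (n == 0) throw new IllegalStateException("empty queue");
        return A[0];
    }

    // DELETE-MAX: move the last element to the root, shrink, heapify the root.
    int deleteMax() {
        final int max = maximum();
        A[0] = A[--n];
        heapify(0);
        return max;
    }

    // INSERTION: append at the end, then repair upwards towards the root.
    void insert(final int v) {
        if (n == A.length) throw new IllegalStateException("queue full");
        int i = n++;
        A[i] = v;
        while (i > 0 && A[(i - 1) / 2] < A[i]) {
            final int p = (i - 1) / 2;  // parent of i
            swap(i, p);
            i = p;
        }
    }

    // Lets A[i] float down until the subtree rooted at i is a max-heap.
    private void heapify(int i) {
        while (true) {
            final int l = 2 * i + 1, r = 2 * i + 2;
            int largest = i;
            if (l < n && A[l] > A[largest]) largest = l;
            if (r < n && A[r] > A[largest]) largest = r;
            if (largest == i) return;
            swap(i, largest);
            i = largest;
        }
    }

    private void swap(final int i, final int j) {
        final int t = A[i]; A[i] = A[j]; A[j] = t;
    }
}
```

Inserting the values of the running example one by one and then calling `deleteMax()` repeatedly returns them in decreasing order.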
Examples

Using our running-example, a few slides ago for HEAP-SORT:

1. Considering it from (a) to (j), we can see what happens when we perform a sequence of DELETE-MAX operations, until the heap only contains one element (we ignore here the shaded elements, which are visible only for the HEAP-SORT).
2. And considering the sequence in reverse order, we can see what happens when we call INSERTION on the respective first shaded elements (these are special insertions, always inserting a new max-element).

Analysis

- MAXIMUM is a constant-time operation.
- DELETE-MAX is one application of heapification, and so needs time O(lg n) (where n is the current number of elements in the heap).
- INSERTION seems to need up to the current height many applications of heapification, and thus would look like O((lg n)^2), but it’s easy to see that it is O(lg n) as well (see the tutorial).

The idea of QUICK-SORT
Remember MERGE-SORT:

A divide-and-conquer algorithm for sorting an array in time O(n · lg n).
The array is split in half, the two parts are sorted recursively (via MERGE-SORT), and then the two sorted half-arrays are merged to the sorted (full-)array.

Now we split along an element x of the array:

We partition into elements ≤ x (first array) and > x (second array).
Then we sort the two sub-arrays recursively.
Done!

Remark on ranges

In the book arrays are 1-based:

1. So the indices for an array A of length n are 1, ..., n.
2. Accordingly, a sub-array is given by indices p ≤ r, meaning the range p, ..., r.

For Java-code we use 0-based arrays:

1. So the indices are 0, ..., n − 1.
2. Accordingly, a sub-array is given by indices p < r, meaning the range p, ..., r − 1.

Range-bounds for a sub-array are here now always left-closed and right-open!

So the whole array is given by the range-parameters 0, n.

The main procedure

    public static void sort(final int[] A, final int p, final int r) {
      assert(A != null);
      assert(p >= 0);
      assert(p <= r);
      assert(r <= A.length);
      final int length = r - p;
      if (length <= 1) return;              // nothing to sort
      place_partition_element_last(A,p,r);  // pivot goes to A[r-1]
      final int q = partition(A,p,r);       // pivot ends up at index q
      assert(p <= q);
      assert(q < r);
      sort(A,p,q);                          // range p, ..., q-1
      sort(A,q+1,r);                        // range q+1, ..., r-1
    }

The idea of partitioning in-place
[Figure]

An example

[Figure]

The code
Instead of the book's index i we use q = i + 1:

    private static int partition(final int[] A, final int p, final int r) {
      assert(p+1 < r);
      final int x = A[r-1];                  // pivot: the last element
      int q = p;                             // A[p..q-1] holds the elements <= x
      for (int j = p; j < r-1; ++j) {
        final int v = A[j];
        if (v <= x) {A[j] = A[q]; A[q++] = v;}
      }
      A[r-1] = A[q]; A[q] = x;               // move the pivot to its final place
      return q;
    }

Selecting the pivot

The partitioning-procedure expects the partitioning-element to be the last array-element. So for selecting the pivot, we can just choose the last element:

    private static void place_partition_element_last(final int[] A, final int p, final int r) {}

However this makes it vulnerable to “malicious” choices, so it is better to randomise:

    private static void place_partition_element_last(final int[] A, final int p, final int r) {
      // The cast must apply to the whole product: (int) Math.random() alone is always 0.
      final int i = p + (int) (Math.random() * (r-p));
      {final int t = A[i]; A[i] = A[r-1]; A[r-1] = t;}
    }
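Putting the three procedures together gives a complete, runnable sketch of the randomised QUICK-SORT. The wrapping class `QuickSort` and the one-argument convenience method `sort(A)` are additions made here for illustration; the remaining code follows the procedures shown on the slides, with the random index computed as `p + (int) (Math.random() * (r - p))`.

```java
// Sketch: randomised QUICK-SORT on half-open ranges p, ..., r-1.
class QuickSort {
    // Convenience wrapper (an addition for this example): sort the whole array.
    static void sort(final int[] A) { sort(A, 0, A.length); }

    static void sort(final int[] A, final int p, final int r) {
        if (r - p <= 1) return;
        place_partition_element_last(A, p, r);
        final int q = partition(A, p, r);
        sort(A, p, q);
        sort(A, q + 1, r);
    }

    // Swaps a uniformly random element of A[p..r-1] to position r-1.
    static void place_partition_element_last(final int[] A, final int p, final int r) {
        final int i = p + (int) (Math.random() * (r - p));
        final int t = A[i]; A[i] = A[r - 1]; A[r - 1] = t;
    }

    // Partitions A[p..r-1] around x = A[r-1] and returns the final index of x.
    static int partition(final int[] A, final int p, final int r) {
        final int x = A[r - 1];
        int q = p;
        for (int j = p; j < r - 1; ++j) {
            final int v = A[j];
            if (v <= x) { A[j] = A[q]; A[q++] = v; }
        }
        A[r - 1] = A[q]; A[q] = x;
        return q;
    }
}
```

A call such as `QuickSort.sort(new int[]{3, 1, 2})` sorts the array in place; the result does not depend on the random choices, only the running time does.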
A not unreasonable tree

[Figure]

Average-case

If we actually achieve that both sub-arrays are at least a constant fraction α of the whole array (in the previous picture, that’s α = 0.1), then we get

    T(n) = T(α · n) + T((1 − α) · n) + Θ(n).

That’s basically the second case of the Master Theorem (the picture says it’s similar to α = 1/2), and so we would get

    T(n) = Θ(n · log n).

And we actually get that:

- for the non-randomised version (choosing always the last element as pivot), when averaging over all possible input sequences (without repetitions);
- for the randomised version (choosing a random pivot), when averaging over all (internal!) random choices; here we do not have to assume something on the inputs, except that all values are different.
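Why a constant-fraction split suffices can be seen from the recursion tree. The following is a sketch of the upper bound, assuming 0 < α ≤ 1/2 and writing c for a (hypothetical) constant such that partitioning n elements costs at most c · n:

```latex
% Every level of the recursion tree partitions at most n elements in total,
% and the longest root-to-leaf path shrinks the range by the factor (1 - alpha)
% in each step.
\[
  \text{depth} \;\le\; \log_{1/(1-\alpha)} n \;=\; \frac{\lg n}{\lg\frac{1}{1-\alpha}},
  \qquad
  T(n) \;\le\; c \cdot n \cdot \bigl( \log_{1/(1-\alpha)} n + 1 \bigr)
  \;=\; O(n \cdot \log n).
\]
```

For α = 0.1 the factor 1 / lg(1/(1 − α)) is about 6.6, so the tree is deeper than for α = 1/2, but only by a constant factor.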
Worst-case

However, as the tutorial shows:

    The worst-case run-time of QUICK-SORT is Θ(n^2) (for both versions)!

This can be repaired, making also the worst-case run-time Θ(n · log n).
For example by using median-computation in linear time for the choice of the pivot.
However, in practice this is typically not worth the effort!
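The quadratic bound corresponds to the most unbalanced split. As a sketch, assume that every partitioning step separates only the pivot from the remaining n − 1 elements:

```latex
% Most unbalanced case: one sub-array is empty, the other has n - 1 elements.
\[
  T(n) \;=\; T(n-1) + T(0) + \Theta(n)
  \quad\Longrightarrow\quad
  T(n) \;=\; \sum_{k=1}^{n} \Theta(k) \;=\; \Theta(n^2).
\]
```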
HEAP-SORT on sorted sequence

What does HEAP-SORT do on an already sorted sequence? And what’s the complexity? Consider the input sequence

    1, 2, ..., 10.
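To experiment with this question, HEAP-SORT can be sketched in Java as follows. This is a minimal sketch, assuming 0-based indices (the maximum is `A[0]`, the last heap element is `A[n-1]`); the class name `HeapSort` is chosen here for illustration.

```java
// Hypothetical sketch of HEAP-SORT on a 0-based array.
class HeapSort {
    static void sort(final int[] A) {
        int n = A.length;
        // Step 1: build a max-heap bottom-up.
        for (int i = n / 2 - 1; i >= 0; --i) heapify(A, n, i);
        // Step 2: repeatedly move the maximum to the end and shrink the heap.
        while (n > 1) {
            final int t = A[0]; A[0] = A[n - 1]; A[n - 1] = t;
            --n;
            heapify(A, n, 0);
        }
    }

    // Lets A[i] float down until the subtree rooted at i is a max-heap.
    static void heapify(final int[] A, final int n, int i) {
        while (true) {
            final int l = 2 * i + 1, r = 2 * i + 2;
            int largest = i;
            if (l < n && A[l] > A[largest]) largest = l;
            if (r < n && A[r] > A[largest]) largest = r;
            if (largest == i) return;
            final int t = A[i]; A[i] = A[largest]; A[largest] = t;
            i = largest;
        }
    }
}
```

Printing the array after each call of `heapify` (for example with `java.util.Arrays.toString`) on the input 1, 2, ..., 10 shows step by step how the sorted order is first destroyed by building the heap and then restored.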
Simplifying insertion

When discussing insertion into a (max-)priority-queue, implemented via a binary (max-)heap, we just used a general addition of one element into an existing heap:

The insertion-procedure used heapification up on the path to the root.
Now actually we always have special cases of heapification; namely which?

Change to the partitioning procedure

What happens if we change the line

    if (v <= x) {A[j] = A[q]; A[q++] = v;}

of function partition to

    if (v < x) {A[j] = A[q]; A[q++] = v;}

Can we do it?
Would it have advantages?

QUICK-SORT on constant sequences

What is QUICK-SORT doing on a constant sequence, in its three incarnations:
- pivot is last element
- pivot is random element
- pivot is median element?

One of the two sub-arrays will have size at most 1, and QUICK-SORT degenerates to an O(n^2) algorithm (which does nothing).

What can we do about it? We can refine the partition-procedure by not just splitting into two parts, but into three parts:

    all elements < x, all elements = x, and all elements > x.

Then we choose the pivot-index as the middle index of the part of all elements = x.
We get O(n log n) for constant sequences.
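A three-way partition of this kind can be sketched as follows. This is one possible realisation, not necessarily the one intended in the tutorial: the class name `QuickSort3` and the variables `lt`, `gt` are assumptions made here, and the pivot is simply taken as the last element of the range.

```java
// Hypothetical sketch: QUICK-SORT with a three-way partition.
class QuickSort3 {
    static void sort(final int[] A, final int p, final int r) {
        if (r - p <= 1) return;
        final int q = partition(A, p, r);
        sort(A, p, q);
        sort(A, q + 1, r);
    }

    // Rearranges A[p..r-1] around x = A[r-1] into three parts:
    // A[p..lt-1] < x, A[lt..gt-1] == x, A[gt..r-1] > x.
    // Returns the middle index of the part of elements equal to x.
    static int partition(final int[] A, final int p, final int r) {
        final int x = A[r - 1];
        int lt = p;  // end of the "< x" part
        int i = p;   // next unexamined element
        int gt = r;  // start of the "> x" part
        while (i < gt) {
            final int v = A[i];
            if (v < x) { A[i++] = A[lt]; A[lt++] = v; }    // extend "< x" part
            else if (v > x) { A[i] = A[--gt]; A[gt] = v; } // extend "> x" part
            else ++i;                                      // v == x stays in the middle
        }
        return lt + (gt - lt) / 2;
    }
}
```

On a constant sequence the whole range is the middle part, so the returned index splits it in half and the recursion depth is logarithmic.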
Worst-case for QUICK-SORT

Consider sequences without repetitions, and assume the pivot is always the last element:

What is a worst-case input?
And what is QUICK-SORT doing on it?

Every already sorted sequence is a worst-case example!
QUICK-SORT behaves as with constant sequences.

Note that this is avoided with randomised pivot-choice (and, of course, with median pivot-choice).

Worst-case O(n log n) for QUICK-SORT

How can we achieve O(n log n) in the worst-case for QUICK-SORT?

The point is that just choosing, within our current framework, the median-element is not enough, but we need to change the framework, allowing to compute the median-index.
Best is to remove the function place_partition_element_last, and leave the partitioning fully to function partition.

Then the main procedure becomes (without the asserts):

    public static void sort(final int[] A, final int p, final int r) {
      if (r-p <= 1) return;
      final int q = partition(A,p,r);
      sort(A,p,q);
      sort(A,q+1,r);
    }