‣ Mergesort ‣ Counting Inversions ‣ Randomized Quicksort ‣ Median and Selection ‣ Closest Pair of Points

Divide-and-conquer paradigm Divide-and-conquer. 5. DIVIDE AND CONQUER I Divide up problem into several subproblems (of the same kind). Solve (conquer) each subproblem recursively. ‣ mergesort Combine solutions to subproblems into overall solution. ‣ counting inversions Most common usage. ‣ randomized quicksort Divide problem of size n into two subproblems of size n / 2. O(n) time ‣ median and selection Solve (conquer) two subproblems recursively. ‣ closest pair of points Combine two solutions into overall solution. O(n) time Consequence. Brute force: Θ(n2). Lecture slides by Kevin Wayne Divide-and-conquer: O(n log n). Copyright © 2005 Pearson-Addison Wesley http://www.cs.princeton.edu/~wayne/kleinberg-tardos Last updated on 3/12/18 9:21 AM attributed to Julius Caesar 2 Sorting problem Problem. Given a list L of n elements from a totally ordered universe, 5. DIVIDE AND CONQUER rearrange them in ascending order. ‣ mergesort ‣ counting inversions ‣ randomized quicksort ‣ median and selection ‣ closest pair of points SECTIONS 5.1–5.2 4 Sorting applications Mergesort Obvious applications. Recursively sort left half. Organize an MP3 library. Recursively sort right half. Display Google PageRank results. Merge two halves to make sorted whole. List RSS news items in reverse chronological order. Some problems become easier once elements are sorted. input Identify statistical outliers. A L G O R I T H M S Binary search in a database. Remove duplicates in a mailing list. sort left half A G L O R I T H M S Non-obvious applications. sort right half Convex hull. Closest pair of points. A G L O R H I M S T Interval scheduling / interval partitioning. merge results Scheduling to minimize maximum lateness. A G H I L M O R S T Minimum spanning trees (Kruskal’s algorithm). ... 5 6 Merging Mergesort implementation Goal. Combine two sorted lists A and B into a sorted whole C. Input. List L of n elements from a totally ordered universe. Scan A and B from left to right. Output. The n elements in ascending order. Compare ai and bj. If ai ≤ bj, append ai to C (no larger than any remaining element in B). If ai > bj, append bj to C (smaller than every remaining element in A). MERGE-SORT(L) _________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________ IF (list L has one element) RETURN L. sorted list A sorted list B Divide the list into two halves A and B. 3 7 10 ai 18 2 11 bj 20 23 A ← MERGE-SORT(A). T(n / 2) B ← MERGE-SORT(B). T(n / 2) ERGE Θ(n) merge to form sorted list C L ← M (A, B). RETURN L. 2 3 7 10 11 _________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________ 7 8 A useful recurrence relation Divide-and-conquer recurrence: recursion tree Def. T (n) = max number of compares to mergesort a list of length n. Proposition. If T (n) satisfies the following recurrence, then T (n) = n log2 n. 0 n =1 assuming n Recurrence. T (n)= is a power of 2 2 T (n/2) + n n>1 0 n =1 <latexit sha1_base64="Z3k0jP42WP//6lnqi9e2pn3k2fw=">AAACo3icbVFNa9tAEF0pbZq4H3HSYy5DnYaUFkcyhaSElEAvPfSQFrsJWMKsViNnyWoldkfBrvAP7S0/pStHhdrpwC6PNzNvZt8mpZKWguC35288ebr5bGu78/zFy1c73d29n7aojMCRKFRhrhNuUUmNI5Kk8Lo0yPNE4VVy+6XJX92hsbLQQ5qXGOd8qmUmBSdHTbq/hkf6HURncN5cnSjBqdS1cIp20YnOAjiEiHBGNcgMDrQrCw9gAVE0DvonmMeuZgDRB3Ayx4NG6L2T0etdn/92dSLUaSs/6faCfrAMeAzCFvRYG5eTXW8vSgtR5ahJKG7tOAxKimtuSAqFbt/KYsnFLZ/i2EHNc7RxvTRpAW8dk0JWGHc0wZL9t6PmubXzPHGVOacbu55ryP/lxhVlp3EtdVkRavEwKKsUUAGN45BKg4LU3AEujHS7grjhhgty/7IyZaldolh5ST2rtBRFimusohkZ3rgYrnv2GIwG/U/98PvH3sVpa+cW22dv2BEL2Qm7YF/ZJRsxwe69TW/H6/qH/jf/hz98KPW9tuc1Wwk//gPFucbd</latexit> T (n) ≤ T ( n/2 )+T ( n/2 )+n n>1 <latexit sha1_base64="G9BiiiqVnrc2NLAK0QbOMus1fE8=">AAACy3icbVFdb9MwFHUyPkr56sYjL1d0oCKkLJmQ2mkCTeKFJzSklk2qo8pxbjprjhPZDmoJfeSP8K/4NzhZJtGO++Kjc67P/UpKKYwNwz+ev3fv/oOHvUf9x0+ePns+2D/4ZopKc5zxQhb6MmEGpVA4s8JKvCw1sjyReJFcf2r0i++ojSjU1K5LjHO2VCITnFlHLQa/pyP1FugpUInN06cJLoWqufM0mz49DeENUIsrW4PI4FDBB4gOYQOUzsNgjHnscqYjKjNZFBrU0TFQ3eLG9Z1zbESOQnZaA28ltev98da7T1GlXROLwTAMwjbgLog6MCRdnC/2vQOaFrzKUVkumTHzKCxtXDNtBZfopqoMloxfsyXOHVQsRxPX7TI38NoxKWRumKxQFlr23x81y41Z54nLzJm9MrtaQ/5Pm1c2m8S1UGVlUfGbQlklwRbQXAZSoZFbuXaAcS1cr8CvmGbcuvttVWm9S+Rbk9SrSglepLjDSruymjVbjHZ3dhfMjoOTIPr6fng26dbZIy/JKzIiERmTM/KZnJMZ4V7PC7yxN/G/+Nb/4f+8SfW97s8LshX+r7/FTNWb</latexit> between ⎣n / 2⎦ and n − 1 compares T (n) n = n Solution. T (n) is O(n log2 n). T (n / 2) T (n / 2) 2 (n/2) = n T (n / 4) T (n / 4) T (n / 4) T (n / 4) 4 (n/4) = n Assorted proofs. We describe several ways to solve this recurrence. log 2 n Initially we assume n is a power of 2 and replace ≤ with = in the recurrence. T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) T (n / 8) 8 (n/8) = n ⋮ ⋮ T(n) = n log2 n 9 10 Proof by induction Divide-and-conquer: quiz 1 Proposition. If T (n) satisfies the following recurrence, then T (n) = n log2 n. Which is the exact solution of the following recurrence? 0 n =1 assuming n 0 n =1 T (n)= is a power of 2 T (n)= 2 T (n/2) + n n>1 T ( n/2 )+T ( n/2 )+n 1 n>1 <latexit sha1_base64="Z3k0jP42WP//6lnqi9e2pn3k2fw=">AAACo3icbVFNa9tAEF0pbZq4H3HSYy5DnYaUFkcyhaSElEAvPfSQFrsJWMKsViNnyWoldkfBrvAP7S0/pStHhdrpwC6PNzNvZt8mpZKWguC35288ebr5bGu78/zFy1c73d29n7aojMCRKFRhrhNuUUmNI5Kk8Lo0yPNE4VVy+6XJX92hsbLQQ5qXGOd8qmUmBSdHTbq/hkf6HURncN5cnSjBqdS1cIp20YnOAjiEiHBGNcgMDrQrCw9gAVE0DvonmMeuZgDRB3Ayx4NG6L2T0etdn/92dSLUaSs/6faCfrAMeAzCFvRYG5eTXW8vSgtR5ahJKG7tOAxKimtuSAqFbt/KYsnFLZ/i2EHNc7RxvTRpAW8dk0JWGHc0wZL9t6PmubXzPHGVOacbu55ryP/lxhVlp3EtdVkRavEwKKsUUAGN45BKg4LU3AEujHS7grjhhgty/7IyZaldolh5ST2rtBRFimusohkZ3rgYrnv2GIwG/U/98PvH3sVpa+cW22dv2BEL2Qm7YF/ZJRsxwe69TW/H6/qH/jf/hz98KPW9tuc1Wwk//gPFucbd</latexit> <latexit sha1_base64="VRAYMk/LbRQG3UoGOQDaFFRdfHc=">AAACxXicbVFda9swFJW9ry77SrvHvVyWbqSMZXYYtCNsFPbSxw6StRCZIMvXqagsGUkuCSbsj+yP7d9Mdl1Y0t2nwzn389y0lMK6KPoThA8ePnr8ZO9p79nzFy9f9fcPflpdGY4zrqU2lymzKIXCmRNO4mVpkBWpxIv0+nujX9ygsUKrqVuXmBRsqUQuOHOeWvR/T4fqCL5Cj6a4FKrmvpfd9OgkgvdAHa5cDSKHQ+Vz4kPYAKXzaHSMReJzpkMqc6m1AfVpDNS0+Ajo5AOdQCNyFLLTGngnqY/xbvdvd917FFXWrbHoD6JR1AbcB3EHBqSL88V+cEAzzasCleOSWTuPo9IlNTNOcIn+rspiyfg1W+LcQ8UKtEnd2riBd57JIPfn5Fo5aNl/K2pWWLsuUp9ZMHdld7WG/J82r1x+ktRClZVDxW8H5ZUEp6H5CWTCIHdy7QHjRvhdgV8xw7jzn9ua0vYukW9dUq8qJbjOcIeVbuUMa1yMdz27D2bj0ZdR/OPz4PSks3OPvCFvyZDE5JickjNyTmaEB2EwDOJgHJ6FKnThzW1qGHQ1r8lWhL/+Aje6018=</latexit> − no longer assuming n is a power of 2 Pf. [ by induction on n ] Base case: when n = 1, T(1) = 0 = n log2 n. A. T(n) = n ⎣log2 n⎦ Inductive hypothesis: assume T(n) = n log2 n. B. T(n) = n ⎡log2 n⎤ Goal: show that T(2n) = 2n log2 (2n). ⎣log n⎦ C. T(n) = n ⎣log2 n⎦ + 2 2 − 1 recurrence n ⎡log n⎤ D. T(n) = n ⎡log2 n⎤ − 2 2 + 1 = log k 2 T(2n) = 2 T(n) + 2n <latexit sha1_base64="OIDlLBbcsVZR990sJ0LjHkzac4E=">AAACWXicbVDLSgMxFE3HV62vVpdugkVwUcqMCCoiCG5cVrC20KlDJr2toZlkSO6IZehv+DVu9R8EP8b0sbDVC0lOzrk3N/fEqRQWff+r4K2srq1vFDdLW9s7u3vlyv6j1Znh0ORaatOOmQUpFDRRoIR2aoAlsYRWPLyd6K0XMFZo9YCjFLoJGyjRF5yho6KyH17Ra+q20GZJlA+vg/GTomEtrNFQchDSHXoQndIhDc3kHpWrft2fBv0Lgjmoknk0okphP+xpniWgkEtmbSfwU+zmzKDgEsalMLOQMj5kA+g4qFgCtptPRxvTY8f0aF8btxTSKfu7ImeJtaMkdpkJw2e7rE3I/7ROhv2Lbi5UmiEoPmvUzyRFTSc+0Z4wwFGOHGDcCPdXyp+ZYRydmwtdpm+nwBcmyV8zJbjuwRIr8RUNGzsXg2XP/oLmaf2y7t+fVW8u5nYWySE5IickIOfkhtyRBmkSTt7IO/kgn4Vvz/OKXmmW6hXmNQdkIbyDH4oEtJc=</latexit> k=1 E. Not even Knuth knows. inductive hypothesis = 2 n log2 n + 2n = 2 n (log2 (2n) – 1) + 2n = 2 n log2 (2n). ▪ 11 12 Analysis of mergesort recurrence Digression: sorting lower bound Proposition. If T(n) satisfies the following recurrence, then T(n) ≤ n ⎡log2 n⎤. Challenge. How to prove a lower bound for all conceivable algorithms? 0 n =1 Model of computation. Comparison trees. T (n) ≤ T ( n/2 )+T ( n/2 )+n n>1 Can access the elements only through pairwise comparisons. <latexit sha1_base64="PlaAp4PYCkUIob8GuIy/Wl4iXIM=">AAACynicbVFdb9MwFHUyYKN8deORlys6UBGiJBOiQxNoEi+8IA2pZZPqqHKcm86aY0e2M7WK+sYf4Wfxb3CyTKId98VH51yf+5WWUlgXRX+CcOfe/Qe7ew97jx4/efqsv3/w0+rKcJxyLbW5SJlFKRROnXASL0qDrEglnqdXXxv9/BqNFVpN3KrEpGALJXLBmfPUvP97MlRvgJ4Aldg8PZriQqiae0+77tGTCF4Ddbh0NYgcDhV8hvgQ1kDpbDTGIvEpkyGVudTagHp/BNS0uDF96w0bkaOQndbAW0ltW3+5te5RVFnXw7w/iEZRG3AXxB0YkC7O5vvBAc00rwpUjktm7SyOSpfUzDjBJfqhKosl41dsgTMPFSvQJnW7yzW88kwGuR8m18pBy/77o2aFtasi9ZkFc5d2W2vI/2mzyuXHSS1UWTlU/KZQXklwGprDQCYMcidXHjBuhO8V+CUzjDt/vo0qrXeJfGOSelkpwXWGW6x0S2dYs8V4e2d3wfRo9GkU//gwOD3u1rlHXpCXZEhiMian5Bs5I1PCg93gXfAxGIffQxuuwvomNQy6P8/JRoS//gIlc9Vh</latexit>

‣ Mergesort ‣ Counting Inversions ‣ Randomized Quicksort ‣ Median and Selection ‣ Closest Pair of Points

Overview Parallel Merge Sort

Lecture 16: Lower Bounds for Sorting

Arxiv:1606.00484V2 [Cs.DS] 4 Aug 2016

Advanced Topics in Sorting

Optimal Node Selection Algorithm for Parallel Access in Overlay Networks

CS302 Final Exam, December 5, 2016 - James S

Dualheap Sort Algorithm: an Inherently Parallel Generalization of Heapsort

Divide and Conquer CISC4080, Computer Algorithms CIS, Fordham Univ

00 Fast K-Selection Algorithms for Graphics Processing Units

Introspective Sorting and Selection Algorithms

(Deterministic & Randomized): Finding the Median in Linear Time

Arxiv:1810.12047V1 [Cs.DS] 29 Oct 2018 Es EA21) H Ipicto Sachieved Is Simpliﬁcation the 2016)