Math 3012 Lecture 8 - Complexity and Problem Size
Luís Pereira, Georgia Tech
September 7, 2018

Complexity and Problem/Input Size

Terminology. We say that a problem has size n if the problem data can be described using n pieces of information, where each piece can be read in a constant amount of time.

Basic example. The problem data is a list of n numbers below 1000000. Each number is bounded, so it can be read in constant time, and the problem size is n.

Non-example. The problem data is a list of n numbers below $1000n$. Each number now needs about $\log_2(1000n)$ bits, so the problem size is $n \log_2(1000n) = n(\log_2 1000 + \log_2 n) \approx n \log_2 n$.

Running Time

Terminology. Suppose that for an input of size n an algorithm requires $f(n)$ steps/operations. We call $f(n)$ the running time of the algorithm.

Remark. Typically one cannot determine $f(n)$ precisely, but one can give a reasonable estimate.

Question. Given two algorithms for the same task with running times $f(n)$ and $g(n)$, how does one compare their running times?

Some common increasing functions

In practice $f(n) \to \infty$ as n increases. Some possible such functions are
$$\log^* n,\quad \log\log n,\quad \log n,\quad n^{0.1},\quad \sqrt{n},\quad n,\quad n\log n,\quad n^{1.1},\quad n^2,\quad n^3,\quad n^{\log n},\quad 2^n,\quad 10^n,\quad (\log n)^n,\quad (\sqrt{n})^n,\quad n^n,\quad 2^{2^n},\quad 2^{2^{2^n}}$$

Big-Oh notation

Definition. Given two (positive) functions $f(n)$ and $g(n)$, we write $f = O(g)$, or $f(n) = O(g(n))$, if there is a constant $C$ such that $f(n) \leq C \cdot g(n)$.

Informally: "f is never much bigger than g".

Remark. If there are an integer $M$ and a constant $C$ such that $f(n) \leq C \cdot g(n)$ for $n \geq M$, then there is some other constant $\tilde{C}$ such that $f(n) \leq \tilde{C} \cdot g(n)$ for all $n$.

In words: the condition $f = O(g)$ only depends on large $n$.

Big-Oh notation: example

Example. Suppose there are 3 algorithms for the same task, and
- algorithm 1 has running time $f_1(n) = n^3 + 3\log n$
- algorithm 2 has running time $f_2(n) = 2n^3 + n^2$
- algorithm 3 has running time $f_3(n) = 200n^2$
How do these running times compare?

Answer. For small $n$ it is $f_1(n) < f_2(n) < f_3(n)$. For large $n$ one has $f_2(n) \approx 2f_1(n)$, so $f_1 = O(f_2)$ and $f_2 = O(f_1)$. Further, $f_3(n)$ is eventually much smaller than $f_1(n)$ and $f_2(n)$, so $f_3 = O(f_1)$ and $f_3 = O(f_2)$, but neither $f_1 = O(f_3)$ nor $f_2 = O(f_3)$.

Little-Oh notation

Definition. Given two (positive) functions $f(n)$ and $g(n)$, we write $f = o(g)$, or $f(n) = o(g(n))$, if
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0.$$

Informally: "f is eventually much smaller than g".

Example. For $f_1(n) = n^3 + 3\log n$, $f_2(n) = 2n^3 + n^2$, and $f_3(n) = 200n^2$ one has $f_3 = o(f_1)$ and $f_3 = o(f_2)$.

Common increasing functions revisited

In the previous list, for any consecutive $f(n)$, $g(n)$ it is $f = o(g)$:
$$\log^* n \prec \log\log n \prec \log n \prec n^{0.1} \prec \sqrt{n} \prec n \prec n\log n \prec n^{1.1} \prec n^2 \prec n^3 \prec n^{\log n} \prec 2^n \prec 10^n \prec (\log n)^n \prec (\sqrt{n})^n \prec n^n \prec 2^{2^n} \prec 2^{2^{2^n}}$$

Alternative notation (warning: not commonly used):
- $f \preceq g$ means $f = O(g)$
- $f \prec g$ means $f = o(g)$

Piazza poll

Question. Consider the following running time functions:
- $f_1(n) = 2^n + 20\log n$
- $f_2(n) = n^3 + 2^n$
- $f_3(n) = n^4 + n^5$
Choose the option with two correct statements.

Answers:
(A) $f_1 = o(f_2)$ and $f_1 = O(f_3)$
(B) $f_1 = o(f_2)$ and $f_3 = O(f_1)$
(C) $f_2 = O(f_1)$ and $f_1 = O(f_3)$
(D) $f_2 = O(f_1)$ and $f_3 = O(f_1)$

Some motivating problems (1)

Goal: determine how difficult a given problem is.

Given a list S of numbers, consider the following problems:
I) What is the largest integer in S?
II) If a is the first number in S, are there integers b, c in S such that $a = b + c$?
III) Are there integers a, b, c in S such that $a = b + c$?
IV) Does S satisfy fair division? (That is, can S be divided into two parts with the same sum?)

Some motivating problems (2)

I) What is the largest integer in S? This can be done in $|S| = n$ steps.
II) If a is the first number in S, are there integers b, c in S such that $a = b + c$? This can be done in $\binom{n}{2} \approx \frac{n^2}{2}$ steps.
III) Are there integers a, b, c in S such that $a = b + c$? This can be done in $n\binom{n}{2} \approx \frac{n^3}{2}$ steps.

In practice one often ignores the constants; the important parts are $n$, $n^2$, $n^3$. (These three brute-force checks are sketched in code after the fair-division discussion below.)

Some motivating problems (3) - Fair division

IV) Does S satisfy fair division?
- There are $2^n$ ways to break S into parts $A \cup B$.
- Testing whether $\sum_{s_i \in A} s_i = \sum_{s_i \in B} s_i$ requires $n - 2$ additions and one comparison.
- Solving fair division via this algorithm therefore requires about $n 2^n$ steps.

Question from lecture 1. Consider S with $n = |S| = 1000000$. Alice thinks S can't be fairly divided, while Bob thinks it can. Who (if they are correct) would find it easier to convince Carlos?

Answer: Bob. If he has a partition $S = A \cup B$, that partition can be tested in about $n = 1000000$ steps. Alice, on the other hand, would need about $n 2^n = 1000000 \cdot 2^{1000000}$ steps to prove that she is right.

Observation. Possible answers can be tested in polynomial time, but we don't know how to find an answer in polynomial time.
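To make the step counts for problems I-III concrete, here is a minimal Python sketch of the three brute-force checks (an illustration, not code from the lecture; it assumes S is a Python list of integers, and the function names are my own):

```python
def largest(S):
    """Problem I: the largest number in S. One pass, about n steps."""
    best = S[0]
    for x in S[1:]:
        if x > best:
            best = x
    return best

def first_is_pair_sum(S):
    """Problem II: is S[0] = b + c for two entries b, c of S?
    Tries all C(n,2) ~ n^2/2 unordered pairs."""
    a, n = S[0], len(S)
    for i in range(n):
        for j in range(i + 1, n):
            if S[i] + S[j] == a:
                return True
    return False

def any_is_pair_sum(S):
    """Problem III: is a = b + c for some entries a, b, c of S?
    Runs the pair search once per choice of a: about n * n^2/2 = n^3/2 steps."""
    n = len(S)
    for a in S:
        for i in range(n):
            for j in range(i + 1, n):
                if S[i] + S[j] == a:
                    return True
    return False
```

The nesting depth mirrors the exponent: one loop for $n$, two loops for $\approx n^2/2$, three for $\approx n^3/2$.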
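The Bob/Alice asymmetry can be sketched the same way: checking one proposed partition is a single pass over the list, while the exhaustive search runs over every way of splitting S. Again a sketch under the same assumptions, with hypothetical names, rather than the lecture's own code:

```python
from itertools import combinations

def is_fair_split(S, A):
    """Bob's certificate check: A is a proposed sublist of S.
    About n additions and one comparison, i.e. polynomial time."""
    return 2 * sum(A) == sum(S)

def fair_division_exists(S):
    """Alice's task by brute force: try every split of S into two
    nonempty parts. Roughly 2^n subsets, each tested with about n
    additions: about n * 2^n steps, so only feasible for small n."""
    n, total = len(S), sum(S)
    if total % 2 != 0:
        return False  # shortcut: an odd total can never split evenly
    for k in range(1, n):
        for idx in combinations(range(n), k):
            if 2 * sum(S[i] for i in idx) == total:
                return True
    return False
```

This is exactly the gap formalized on the next slide: the certificate check runs in polynomial time, but no polynomial-time way to find a certificate is known.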
P vs. NP

Two important classes of problems:
- P: A problem is in the class P if an answer can be found in polynomial running time.
- NP: A problem is in the class NP if, given a possible answer, that answer can be tested/certified in polynomial running time.

Examples:
- Problems I, II, III from before are in P.
- Problem IV is in NP.

Observations:
- $P \subseteq NP$, i.e. all P problems are also NP problems.
- It is unknown whether $P = NP$ or not.

Sorting

Sorting problem. Given a list of numbers $a_1, a_2, a_3, \dots, a_n$, reorder the list so that it is in increasing order.

Observation. The basic steps in solving this problem are comparison questions like: Is $a_1 < a_2$? Is $a_3 < a_7$? Is $a_9 < a_{98}$? How many such questions are needed?

Brute force method. Ask all $\binom{n}{2}$ comparison questions. This solves the problem in $O(n^2)$ running time. How much can this running time be improved?

Worst case scenario bound on sorting

Observations:
- When reordering a list of size n there are $n!$ possibilities to consider.
- When asking a yes/no question, in the worst case scenario you still have $\frac{1}{2}$ as many possibilities or more.

Upshot: a foolproof sorting algorithm will always need to make at least $\log_2 n!$ comparisons.

Stirling's approximation

Fact (Stirling's approximation):
$$\lim_{n \to \infty} \frac{n!}{\sqrt{2\pi n}\left(\frac{n}{e}\right)^n} = 1$$

Consequence: $\log_2 n! \sim n \log_2 n$. (Taking $\log_2$ of Stirling's formula gives $\log_2 n! \approx n\log_2 n - n\log_2 e + \frac{1}{2}\log_2(2\pi n)$, and the first term dominates.)

Upshot: a sorting algorithm is considered optimal if its running time is $O(n \log n)$. There are many optimal algorithms: merge sort, heap sort, introsort, Timsort, Cubesort, and Block sort, among others.

A quick overview of merge sort

Key fact. Given two ordered lists of size k, merging them into a single ordered list requires about 2k comparisons.

Why? Strategy:
- Compare the lowest elements; move the smallest to the merged list.
- Compare the (new) lowest elements; move the smallest to the merged list.
- Etc.

Merge sort strategy:
1. Split the list of size n into two sublists of size $\frac{n}{2}$.
2. Sort each of the two sublists (using merge sort).
3. Merge the sublists.

Upshot: the running time satisfies $r(n) = 2r(n/2) + n$. One can show that this implies $r(n) = O(n \log n)$, as sketched below.
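Here is a minimal Python sketch of this strategy (an illustration, not code from the lecture; the names are my own):

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list.
    Uses about len(left) + len(right) comparisons: the '+ n' in the recurrence."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these is nonempty
    merged.extend(right[j:])
    return merged

def merge_sort(S):
    """Split, recursively sort each half, merge: r(n) = 2 r(n/2) + n."""
    if len(S) <= 1:
        return S
    mid = len(S) // 2
    return merge(merge_sort(S[:mid]), merge_sort(S[mid:]))
```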
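To see why the recurrence gives the claimed bound, one can unroll it, assuming for simplicity that $n = 2^k$ is a power of 2:
$$r(n) = 2r\left(\tfrac{n}{2}\right) + n = 4r\left(\tfrac{n}{4}\right) + 2n = \cdots = 2^k r\left(\tfrac{n}{2^k}\right) + kn = n\,r(1) + n\log_2 n = O(n \log n).$$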