
INTRODUCTION TO COMPUTATION COMPLEXITY

JIAN ZHANG [email protected]

“Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output” [1]. For example, an ascending sorting algorithm takes a sequence of numbers x1, x2, ..., xn as input and generates its order statistics x(1), x(2), ..., x(n) as output. There are several important questions that can be asked of each algorithm: (1) Is the algorithm correct for every possible input? (2) What is the time complexity (average, worst, best)? (3) What is the space (memory) complexity? (4) What kind of data structure is used? Time complexity analysis aims to address the second question.
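As a concrete illustration of the sorting example above, here is a minimal insertion sort sketch in Python (added for illustration, not part of the original notes; the function and variable names are arbitrary):

    def insertion_sort(xs):
        """Return the input numbers in ascending order: x(1), x(2), ..., x(n)."""
        result = list(xs)              # work on a copy; leave the input unchanged
        for i in range(1, len(result)):
            key = result[i]
            j = i - 1
            # shift larger elements one position to the right
            while j >= 0 and result[j] > key:
                result[j + 1] = result[j]
                j -= 1
            result[j + 1] = key
        return result

    print(insertion_sort([3, 1, 4, 1, 5]))   # [1, 1, 3, 4, 5]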

Growth of Functions

The time required by an algorithm is often a function of n, where n is the size of the input. The order of growth of the running time of an algorithm gives a simple characterization of its efficiency and enables us to compare the relative performance of different algorithms.

Definition 1.1. For a given function g(n), we denote by Θ(g(n)), O(g(n)) and Ω(g(n)) the sets of functions

Θ(g(n)) = {f(n) : there exist c1,c2,n0 > 0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0}

O(g(n)) = {f(n) : there exist c, n0 > 0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0}

Ω(g(n)) = {f(n) : there exist c, n0 > 0 such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n0}.

Instead of writing f(n) ∈ Θ(g(n)), people often write f(n) = Θ(g(n)) in practice. Clearly f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)). If f(n) = Θ(g(n)) (respectively O(g(n)), Ω(g(n))), we say that g(n) is an asymptotically tight (upper, lower) bound for f(n).

Example 1.2. (1) 2n^2 + 3n + 1 = 2n^2 + Θ(n) = Θ(n^2). (2) n + log n = Θ(n). (3) an^2 + bn + c = O(n^2). (4) an^2 + bn + c = O(n^5). The O-notation may or may not be asymptotically tight; in the example above, (3) is tight but (4) is not.
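To make example (3) concrete, one can exhibit explicit witnesses: for a, b, c > 0, taking the constant a + b + c and n0 = 1 gives an^2 + bn + c ≤ (a + b + c)n^2 for all n ≥ 1, since bn ≤ bn^2 and c ≤ cn^2 there. The small Python check below is an illustrative sketch added here (the coefficient values are arbitrary), not part of the original notes:

    # Check the O(n^2) witnesses for f(n) = a*n^2 + b*n + c with a, b, c > 0.
    a, b, c = 3.0, 5.0, 7.0        # arbitrary positive coefficients
    C = a + b + c                  # candidate constant
    n0 = 1                         # candidate threshold
    assert all(a*n*n + b*n + c <= C*n*n for n in range(n0, 10**5))
    print("a*n^2 + b*n + c <= (a+b+c)*n^2 holds for all n checked")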

Definition 1.3. The following o-notation is used to denote an upper bound that is not asymptotically tight:

o(g(n)) = {f(n) : for any c > 0 there exists a constant n0 > 0 such that 0 ≤ f(n) < cg(n) for all n ≥ n0}.

Example 1.4. (1) an^2 + bn + c = o(n^3). (2) n^k = o(n^(k+1)).

The Θ relation can be shown to be an equivalence relation; in other words, it satisfies the reflexivity, symmetry and transitivity properties. As a result, we can partition the functions into classes: each pair of functions inside one class is Θ-related, and any two functions from different classes are not. For many common classes we can establish an order < between them by using the definition of the o-notation. The following order is often used:

Θ(1) < Θ(log n) < Θ(n) < Θ(n log n) < Θ(n^2) < ... < Θ(n^k) < ... < Θ(2^n) < Θ(3^n) < ... < Θ(n!).
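The ordering above can be made tangible by tabulating a few representatives of these classes; the short Python sketch below is an illustration added here, not part of the original notes:

    import math

    # Tabulate representative functions from several growth classes.
    funcs = [("log n", lambda n: math.log2(n)),
             ("n", lambda n: n),
             ("n log n", lambda n: n * math.log2(n)),
             ("n^2", lambda n: n ** 2),
             ("2^n", lambda n: 2.0 ** n)]
    for n in (10, 20, 40, 80):
        row = ", ".join(f"{name} = {f(n):.3g}" for name, f in funcs)
        print(f"n = {n}: {row}")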

Note that it is not possible to put all such classes into one linear order. For example, the functions n and n^(1+sin n) cannot be compared using the O- or Ω-notation, because the exponent 1 + sin n keeps oscillating between 0 and 2.
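To see the incomparability concretely, note that the ratio n^(1+sin n) / n = n^(sin n) becomes both arbitrarily large and arbitrarily small as n grows, so neither function is O of the other. A quick numerical sketch (illustrative only, not from the original notes):

    import math

    # The ratio n^(1+sin n) / n = n^(sin n) neither stays bounded
    # nor stays bounded away from zero.
    ratios = [n ** math.sin(n) for n in range(2, 100000)]
    print("largest ratio seen :", max(ratios))
    print("smallest ratio seen:", min(ratios))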

Constant and Small-Order Factors

When we use the O-notation, constant factors are omitted and lower-order terms are discarded. For example, algorithms that take time 100n^2 and 2n^2 both fall into the same class Θ(n^2). On the other hand, two algorithms with running times 10^6 log n and 1.5n satisfy Θ(log n) < Θ(n), yet in practice it may well be the case that 10^6 log n > 1.5n for every input size that actually occurs. Remember that time complexity is asymptotic and determined by the tail of the sequence, so it is only guaranteed to hold as n → ∞. Nevertheless, in practice it often gives a fairly reasonable estimate and avoids tedious and difficult calculation.^1
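To get a feel for how large n must be before the asymptotics win, the sketch below (an illustration added here; the constants 10^6 and 1.5 are taken from the example above) compares the two running times at a few input sizes and shows the crossover happening only in the tens of millions:

    import math

    # Compare 10^6 * log2(n) with 1.5 * n: the asymptotically faster
    # algorithm is actually slower until n is in the tens of millions.
    for n in (10**3, 10**5, 10**7, 10**8):
        print(f"n = {n:>10d}: 10^6*log2(n) = {10**6 * math.log2(n):.3g},  1.5*n = {1.5 * n:.3g}")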

Recursive Solution

Recurrence relations are often used to compute the complexity of an algorithm. Here we give some results which will be useful in later analysis. Assume the time complexity of an algorithm is T(n), where n is the size of the input; for instance, in a sorting algorithm n is the number of values to be sorted.

(1) T(n) = T(n-1) + 1: Θ(n).
(2) T(n) = T(n-1) + n: Θ(n^2). Assuming T(1) = 1 we have T(n) = T(n-1) + n = ... = 1 + ... + n = n(n+1)/2.
(3) T(n) = T(n/2) + 1: Θ(log n). Assuming n = 2^k and T(1) = 1 we have T(2^k) = T(2^(k-1)) + 1 = ... = 1 + log n.
(4) T(n) = 2T(n/2) + n: Θ(n log n). Assuming n = 2^k and T(1) = 0 we have T(2^k) = 2T(2^(k-1)) + 2^k = ... = 2^k + ... + 2^k = k·2^k = n log n.
(5) T(n) = 2T(n/2) + 1: Θ(n). Assuming n = 2^k and T(1) = 1 we have T(2^k) = 2T(2^(k-1)) + 1 = ... = 1 + 2 + ... + 2^k = 2n - 1.
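The closed forms above can be sanity-checked by unrolling each recurrence directly; the small Python sketch below (added for illustration, not part of the original notes) does this for cases (2), (4) and (5):

    import math

    # Unroll the recurrences numerically and compare with the closed forms.
    def T2(n):                       # T(n) = T(n-1) + n, T(1) = 1
        return 1 if n == 1 else T2(n - 1) + n

    def T4(n):                       # T(n) = 2T(n/2) + n, T(1) = 0
        return 0 if n == 1 else 2 * T4(n // 2) + n

    def T5(n):                       # T(n) = 2T(n/2) + 1, T(1) = 1
        return 1 if n == 1 else 2 * T5(n // 2) + 1

    n = 2 ** 9                       # a power of two, so n/2 stays exact
    assert T2(n) == n * (n + 1) // 2
    assert T4(n) == n * int(math.log2(n))
    assert T5(n) == 2 * n - 1
    print("closed forms verified for n =", n)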

References

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, 2nd edition, MIT Press, 2001.

^1 In practice you can also use a profiling tool to estimate the time complexity of your code empirically. Try the GNU gprof tool under Linux/Unix.