Computational Complexity Heuristic Algorithms
Computational complexity: Heuristic Algorithms
Giovanni Righini, University of Milan, Department of Computer Science (Crema)

Definitions: problems and instances
A problem is a general question expressed in mathematical terms. Usually the same question can be asked about many examples: these are instances of the problem. For instance:
• Problem: "Is n prime?"
• Instance: "Is 7 prime?"
A solution S is the answer corresponding to a specific instance. Formally, a problem P is a function that maps instances from a set I into solutions (a set S):
P : I → S
A priori, we do not know how to compute this function: we need an algorithm.

Definitions: algorithms
An algorithm is a procedure with the following properties:
• it is formally defined;
• it is deterministic;
• it is made of elementary operations;
• it is finite.
An algorithm for a problem P is an algorithm whose steps are determined by an instance I ∈ I of P and produce a solution S ∈ S:
A : I → S
An algorithm therefore defines a function and also computes it. If this function coincides with P, the algorithm is exact; otherwise, it is heuristic.

Algorithm characteristics
A heuristic algorithm should be
1. effective: it should compute solutions whose value is close to the optimum;
2. efficient: its computational complexity should be low, at least compared with an exact algorithm;
3. robust: it should remain effective and efficient for any possible input.
To compute a solution, an algorithm needs some resources. The two most important ones are
• space (the amount of memory required to store data);
• time (the number of elementary steps to be performed to compute the final result).

Complexity
Time is usually considered the most critical resource because:
• time is subtracted from other computations more often than space;
• it is often possible to use very large amounts of space at very low cost, but the same does not hold for time;
• the need for space is upper bounded by the need for time, because space is re-usable.
It is intuitive that, in general, the larger an instance is, the larger the amount of resources needed to compute its solution. However, how the computational cost grows when the instance size grows is not always the same: it depends on the problem and on the algorithm. By computational complexity of an algorithm we mean the speed with which its consumption of computational resources grows as the size of the instance grows.

Measuring the time complexity
The time needed to solve a problem depends on:
• the specific instance to be solved;
• the algorithm used;
• the machine that executes the algorithm;
• ...
We want a measure of time complexity with the following characteristics:
• independent of the technology, i.e. it must be the same when the computation is done on different hardware;
• synthetic and formally defined, i.e. it must be represented by a simple and well-defined mathematical expression;
• ordinal, i.e. it must allow ranking algorithms according to their complexity.
The observed computing time does not satisfy these requirements.
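To make the last point concrete, the following small Python sketch (an illustration added here, not part of the original slides) instruments a toy algorithm with a counter of elementary comparisons: the counter depends only on the instance, so it is technology-independent, while the measured wall-clock time changes from machine to machine.

    import time

    def max_with_count(values):
        """Return (maximum, number of comparisons) for a non-empty list."""
        comparisons = 0
        best = values[0]
        for v in values[1:]:
            comparisons += 1           # one elementary comparison
            if v > best:
                best = v
        return best, comparisons

    for n in (10, 100, 1000):
        instance = list(range(n, 0, -1))          # an instance of size n
        start = time.perf_counter()
        _, ops = max_with_count(instance)
        elapsed = time.perf_counter() - start
        # ops is always n - 1, on any computer; elapsed is not reproducible
        print(n, ops, elapsed)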
Time complexity
The asymptotic worst-case time complexity of an algorithm provides the required measure in this way:
1. we measure the number T of elementary operations executed (which is computer-independent);
2. we define a number n that measures the size of any instance, e.g. the number of bits needed to encode it or, in a combinatorial optimization problem, the number of elements of the ground set;
3. we find the maximum number of elementary operations needed to solve instances of size n:
   T(n) = max { T(I) : I ∈ I_n }    for all n ∈ N
   (this reduces the complexity to a function T : N → N);
4. we approximate T(n) with a simpler function f(n), of which we are only interested in the asymptotic trend for n → +∞ (complexity is more important when instances are larger);
5. finally, we collect these functions into complexity classes.

Notation: Θ
T(n) ∈ Θ(f(n)) means that
∃ c1, c2 ∈ R+, n0 ∈ N : c1 f(n) ≤ T(n) ≤ c2 f(n) for all n ≥ n0,
where c1, c2 and n0 are constant values, independent of n. In other words, T(n) lies between c1 f(n) and c2 f(n)
• for a suitable "small" value c1,
• for a suitable "large" value c2,
• for any size larger than n0.
[Figure: plot of T(n) enclosed between c1 f(n) and c2 f(n) for n ≥ n0.]
Asymptotically, f(n) is an estimate of T(n) within a constant factor:
• for large instances, the computing time is proportional to f(n).

Notation: O
T(n) ∈ O(f(n)) means that
∃ c ∈ R+, n0 ∈ N : T(n) ≤ c f(n) for all n ≥ n0,
where c and n0 do not depend on n. In other words, T(n) is upper bounded by c f(n)
• for a suitable "large" value c,
• for any n larger than a suitable n0.
[Figure: plot of T(n) below c f(n) for n ≥ n0.]
Asymptotically, f(n) is an upper bound for T(n) within a constant factor:
• for large instances, the computing time is at most proportional to f(n).

Notation: Ω
T(n) ∈ Ω(f(n)) means that
∃ c > 0, n0 ∈ N : T(n) ≥ c f(n) for all n ≥ n0,
where c and n0 do not depend on n. In other words, T(n) is lower bounded by c f(n)
• for a suitable "small" value c,
• for any n larger than n0.
[Figure: plot of T(n) above c f(n) for n ≥ n0.]
Asymptotically, f(n) is a lower bound of T(n) within a constant factor:
• for large instances, the computing time is at least proportional to f(n).

Combinatorial optimization
In combinatorial optimization problems it is natural to define the size of an instance as the cardinality n of its ground set E.
An explicit enumeration algorithm
• considers each subset S ⊆ E,
• evaluates whether it is feasible (x ∈ X) in α(n) time,
• evaluates the objective function f(x) in β(n) time,
• records the best value found.
Since the number of solutions is exponential in n, its complexity is at least exponential, even if α(n) and β(n) are polynomial (as often occurs). A small code sketch of this scheme is given after the table below.

Polynomial and exponential complexity
In combinatorial optimization, the main distinction is between
• polynomial complexity: T(n) ∈ O(n^d) for a constant d > 0;
• exponential complexity: T(n) ∈ Ω(d^n) for a constant d > 1.
Algorithms of the former type are efficient; those of the latter type are inefficient. In general, heuristic algorithms are polynomial and they are used when the corresponding exact algorithms are exponential.
Assuming 1 operation per µsec:

    n     n^2 operations     2^n operations
    1     1 µsec             2 µsec
    10    0.1 msec           1 msec
    20    0.4 msec           1 sec
    30    0.9 msec           17.9 min
    40    1.6 msec           12.7 days
    50    2.5 msec           35.7 years
    60    3.6 msec           366 centuries
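The explicit enumeration scheme above can be written down directly. The following Python sketch (an illustration added here, not part of the original slides) uses a toy knapsack-like problem in which the feasibility check α(n) and the objective evaluation β(n) are both linear; the loop over all 2^n subsets is what makes the overall algorithm exponential.

    from itertools import combinations

    def enumerate_best(weights, values, capacity):
        """Explicit enumeration for a toy knapsack-like problem:
        maximize total value over subsets whose total weight fits the capacity."""
        n = len(weights)
        best_value, best_subset = None, None
        for k in range(n + 1):
            for subset in combinations(range(n), k):              # 2^n subsets in total
                if sum(weights[i] for i in subset) <= capacity:   # alpha(n): feasibility test
                    value = sum(values[i] for i in subset)        # beta(n): objective function
                    if best_value is None or value > best_value:
                        best_value, best_subset = value, subset   # record the best value found
        return best_value, best_subset

    # Tiny instance: even here 2^4 = 16 subsets are examined.
    print(enumerate_best([3, 5, 2, 4], [6, 8, 3, 5], capacity=7))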
Problem transformations and reductions
Sometimes it is possible and convenient to reformulate an instance of a problem P into an instance of a problem Q and then to transform the solution of the latter back into a solution of the former.

Polynomial transformation P → Q: given any instance of P,
• a corresponding instance of Q is defined in polynomial time;
• the instance of Q is solved by a suitable algorithm, providing a solution SQ;
• from SQ a corresponding solution SP is obtained in polynomial time.
Examples: VCP → SCP, MCP → MISP and MISP → MCP.

Polynomial reduction P → Q: given any instance of P,
• an algorithm A for Q is executed a polynomial number of times,
• solving instances of Q obtained in polynomial time from the instance of P and from the results of the previous runs;
• from the solutions computed, a solution of the instance of P is obtained.
Examples: BPP → PMSP and PMSP → BPP.
In both cases,
• if A is polynomial/exponential, the overall algorithm turns out to be polynomial/exponential;
• if A is exact/heuristic, the overall algorithm turns out to be exact/heuristic.

Optimization vs. decision
A polynomial reduction links optimization and decision problems.
• Optimization problem: given a function f and a feasible region X, what is the minimum of f over X?
  f* = min { f(x) : x ∈ X } = ?
• Decision problem: given a function f, a value k and a feasible region X, do solutions with a value not larger than k exist?
  ∃ x ∈ X : f(x) ≤ k ?
The two problems are polynomially equivalent:
• the decision problem can be solved by solving the optimization problem and then comparing the optimal value with k;
• the optimization problem can be solved by repeatedly solving the decision problem for different values of k, tuned by dichotomous search (a sketch of this search is given at the end of these notes).

Drawbacks of worst-case analysis
The worst-case time complexity has some relevant drawbacks:
• it does not consider the performance of the algorithm on easy/small instances; in practice the most difficult instances could be rare or unrealistic;
• it provides a rough estimate of the growth of the computing time, not of the computing time itself;
• the estimate can be so rough that it becomes useless;
• it may be misleading: algorithms with worse worst-case computational complexity can be very efficient in practice, even more so than algorithms with better worst-case computational complexity.

Other complexity measures
To overcome these drawbacks one could adopt different definitions of computational complexity:
• parameterized complexity expresses T as a function of some other relevant parameter k besides the size n of the instance: T(n, k);
• average-case complexity assumes a probability distribution on I and evaluates the expected value of T(I) over I_n:
  T(n) = E[ T(I) | I ∈ I_n ].
If the distribution has some parameter k, the average-case complexity is also parameterized, i.e. it takes the form T(n, k).
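The dichotomous search mentioned under "Optimization vs. decision" can be sketched as follows. This Python sketch is an illustration added here (not part of the original slides); it assumes an integer-valued objective, known lower and upper bounds on the optimal value, and a decision oracle decide(k) that answers whether a feasible solution of value at most k exists.

    def minimize_by_decision(decide, lower, upper):
        """Solve min f(x) over X by dichotomous (binary) search on k,
        given a decision oracle decide(k) == (exists x in X with f(x) <= k).
        Assumes integer objective values and decide(upper) == True;
        the oracle is called O(log(upper - lower)) times."""
        while lower < upper:
            k = (lower + upper) // 2
            if decide(k):
                upper = k        # a solution of value <= k exists
            else:
                lower = k + 1    # every feasible solution has value > k
        return lower             # the optimal value f*

    # Toy usage: the "unknown" optimum is 42, searched between 0 and 1000.
    print(minimize_by_decision(lambda k: k >= 42, 0, 1000))   # prints 42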