Appendix A THE COMPLEXITY OF PROBLEMS
A.1 Preliminaries

Many scheduling problems are combinatorial in nature: problems where we seek the optimum from a very large but finite number of solutions. Sometimes such problems can be solved quickly and efficiently, but often the best solution procedures available are slow and tedious. It therefore becomes important to assess how well a proposed procedure will perform. The theory of computational complexity addresses this issue. The seminal papers of complexity theory date from the early 70's (e.g., Cook, 1971 and Karp, 1972). Today, it is a wide field encompassing many sub-fields. For a formal treatment, the interested reader may wish to consult Papadimitriou (1994).

As we shall see, the theory partitions all realistic problems into two groups: the "easy" and the "hard" to solve, depending on how complex (hence how fast or slow) the computational procedure for that problem is. The theory defines still other classes, but all except the most artificial mathematical constructs fall into these two. It should be noted that "easy" or "hard" does not simply mean quickly or slowly solved. Sometimes, for small problem instances, "hard" problems may be more quickly solved than "easy" ones. As we shall see, the difficulty of a problem is measured not by the absolute time needed to solve it, but by the rate at which the time grows as the problem size increases.

To this point, we have not used the accepted terminology; we introduce it now. A problem is a well-defined question to which an unambiguous answer exists. Solving the problem means answering the question. The problem is stated in terms of several parameters, numerical quantities which are left unspecified but are understood to be predetermined. They make up the data of the problem. An instance of a problem gives specified values to each parameter.
A combinatorial optimization problem, whether maximization or minimization, has for each instance a finite number of candidates from which the answer, or optimal solution, is selected. The choice is based on a real-valued objective function which assigns a value to each candidate solution. A decision problem or recognition problem has only two possible answers, "yes" or "no". An example of an optimization problem is a linear program, which asks "what is the greatest value of cx subject to Ax ≤ b?", where bold characters denote n-dimensional vectors (lower case) or n × n matrices (upper case). To make this a combinatorial optimization problem, we might make the variable x bounded and integer-valued so that the number of candidate solutions is finite. A decision problem is "does there exist a solution to the linear program with cx ≥ k?"

To develop complexity theory, it is convenient to state all problems as decision problems. An optimization (say, maximization) problem can always be replaced by a sequence of problems of determining the existence of solutions with values exceeding k1, k2, .... An algorithm is a step-by-step procedure which provides a solution to a given problem; that is, to all instances of the problem. We are interested in how fast an algorithm is. We now introduce a measure of algorithmic speed: the time complexity function.

H. Emmons and G. Vairaktarakis, Flow Shop Scheduling: Theoretical Results, Algorithms, and Applications, International Series in Operations Research & Management Science 182, DOI 10.1007/978-1-4614-5152-5, © Springer Science+Business Media New York 2013
A.2 Polynomial versus Exponential Algorithms

Note that we always think of solving problems using a computer. Thus, an algorithm is a piece of computer code. Similarly, the size of a problem instance is technically the number of characters needed to specify the data, or the length of the input needed by the program. For a decision problem, an algorithm receives as input any string of characters, and produces as output either "yes" or "no" or "this string is not a problem instance." An algorithm solves the instance or string in time k if it requires k basic operations (e.g., add, subtract, delete, compare, etc.) to reach one of the three conclusions and stop. It is customary to use as a surrogate for instance size any number that is roughly proportional to the true value. We shall use the positive integer n to represent the size of a problem instance. In scheduling, this usually represents the number of jobs to be scheduled. In summary, for a decision problem Π:

Definition A.1 The Time Complexity Function (TCF) of algorithm A is:
TA(n) = maximal time for A to solve any string of length n.

In what follows, the big oh notation introduced by Hardy and Wright (1979) will be used when expressing the time complexity function. We say that, for two real-valued functions f and g, f(n) is O(g(n)), or f(n) is of the same order as g(n), if |f(n)| ≤ k · |g(n)| for all n ≥ 0 and some k > 0. An efficient, polynomially bounded, polynomial time, or simply polynomial algorithm is one which solves a problem instance in time bounded by a power of the instance size. Formally:
Definition A.2 An algorithm A is polynomial time if there exists a polynomial p such that

TA(n) ≤ p(n), ∀n ∈ Z+ ≡ {1, 2, ...}.

More specifically, an algorithm is polynomial of degree c, or has complexity O(n^c), or runs in O(n^c) time if, for some k > 0, the algorithm never takes longer than kn^c (the TCF) to solve an instance of size n.

Definition A.3 The collection P comprises all problems for which a polynomial time algorithm exists.

Problems which belong to P are the ones we referred to earlier as "easy". All other algorithms are called exponential time or just exponential, and problems for which nothing quicker exists are "hard". Although not all algorithms in this class have TCFs that are technically exponential functions, we may think of a typical one as running in O(c^p(n)) time for some polynomial p(n). Other examples of exponential rates of growth are n^n and n!.

We can now see how, as suggested earlier, the terms "hard" and "easy" are somewhat misleading, even though exponential TCFs clearly lead to far more rapid growth in solution times. Suppose an "easy" problem has an algorithm with running time bounded by, say, kn^5. Such a TCF may not be exponential, but it may well be considered pretty rapidly growing. Furthermore, some algorithms take a long time to solve even small problems (large k), and hence are unsatisfactory in practice even if the time grows slowly. On the other hand, an algorithm for which the TCF is exponential is not always useless in practice. The concept of the TCF is a worst case estimate, so complexity is only an upper bound on the amount of time required by an algorithm. This is a conservative measure and usually useful, but it is too pessimistic for some popular algorithms. The simplex algorithm for linear programming, for example, has a TCF that is O(2^m) where m is the number of constraints, but it has been shown (see Nemhauser et al., 1989) that for the average case the complexity is only O(nm) where n is the number of variables.
Thus, the algorithm is actually very fast for most problems encountered. Despite these caveats, exponential algorithms have running times that tend to increase at an exponential rate and often seem to "explode" when a certain problem size is exceeded. Polynomial time algorithms usually turn out to be of low degree (O(n^3) or better), run quite efficiently, and are considered desirable.
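The contrast between polynomial and exponential TCFs can be made concrete with a small sketch. The function names and sample sizes below are illustrative only, not taken from the text; the "times" are abstract counts of basic operations, not seconds.

```python
# Illustrative comparison of a polynomial TCF and an exponential TCF.
# For small n the exponential TCF may even be smaller, but it soon
# "explodes", exactly as described above.

def polynomial_tcf(n: int) -> int:
    """A polynomial TCF of degree 3: T(n) = n^3."""
    return n ** 3

def exponential_tcf(n: int) -> int:
    """An exponential TCF: T(n) = 2^n."""
    return 2 ** n

if __name__ == "__main__":
    for n in (5, 10, 20, 40, 60):
        print(n, polynomial_tcf(n), exponential_tcf(n))
```

At n = 5 the exponential count (32) is below the polynomial one (125), but by n = 60 the exponential count exceeds 10^18, illustrating why the growth rate, not the time on small instances, is the right measure of difficulty.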
A.3 Reducibility

A problem can be placed in P as soon as a polynomial time algorithm is found for it. Sometimes, rather than finding such an algorithm, we may place it in P by showing that it is "equivalent" to another problem which is already known to be in P. We explain what we mean by equivalence between problems with the following definitions.
Definition A.4 A problem Π′ is polynomially reducible, or simply reducible, to a problem Π (Π′ ∝ Π) if, for any instance I′ of Π′, an instance I of Π can be constructed in polynomially bounded time, such that, given the solution SI to I, the solution SI′ to I′ can be found in polynomial time.

We call the construction of the I that corresponds to I′ a polynomial transformation of I′ into I. Later, we will briefly mention a more general type of reducibility, in which the polynomial time requirements for constructing I and finding SI′ are relaxed. Until then, reduction will mean polynomial reduction.

Definition A.5 Two problems are equivalent if each is reducible (or simply reduces) to the other.

Since reduction, and hence equivalence, are clearly transitive properties, we can define equivalence classes of problems, where all problems in the same equivalence class are reducible (or equivalent) to each other. Consider polynomial problems. Clearly, for two equivalent problems, if one is known to be polynomial, the other must be, too. Also, if two problems are each known to be polynomial, they are equivalent. This is because any problem Π′ ∈ P is reducible to any other problem Π ∈ P in the following trivial sense. For any instance I′ of Π′, we can pick any instance of Π, ignore its solution, and find the solution to I′ directly. We conclude that P is an equivalence class. We state a third simple result for polynomial problems as a theorem.

Theorem A.1 If Π ∈ P, then Π′ ∝ Π ⇒ Π′ ∈ P.

Proof: Given any instance I′ of Π′, one can find an instance I of Π by applying a polynomial time transformation to I′. Since Π ∈ P, there is a polynomial time algorithm that solves I. Hence, using the transformation followed by the algorithm, I′ can be solved in polynomial time. □

Normally, to "reduce" means to "make simpler". Not so here. Keep in mind that if Π′ reduces to Π (Π′ ∝ Π) then, unless they are equivalent, Π is the more difficult problem. We can say that Π′ is a special case of Π.
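As an illustration of Definition A.4 (a sketch using two classical problems, not an example from the text): the Partition problem (can given positive integers be split into two subsets of equal sum?) reduces to the Subset-Sum problem (is there a subset summing to a target B?). The transformation simply sets B to half the total, so Partition plays the role of Π′ and Subset-Sum the role of Π. All function names below are hypothetical; the brute-force solver stands in for whatever algorithm solves Π.

```python
from itertools import combinations

def partition_to_subset_sum(values):
    """Polynomial transformation of a Partition instance I' into a
    Subset-Sum instance I.  Setting the target B = total/2 makes the
    "yes"/"no" answers coincide; if the total is odd, the Partition
    answer is trivially "no" and no instance is produced."""
    total = sum(values)
    if total % 2 != 0:
        return None
    return values, total // 2

def subset_sum(values, target):
    """Brute-force Subset-Sum solver (exponential; for tiny instances).
    Any correct solver for the target problem would do."""
    return any(sum(c) == target
               for r in range(len(values) + 1)
               for c in combinations(values, r))

def partition(values):
    """Answer Partition by reduction: transform, solve, read off the answer."""
    instance = partition_to_subset_sum(values)
    if instance is None:
        return False
    return subset_sum(*instance)
```

Note that the reduction itself (the transformation and the read-off of the answer) is polynomial regardless of how slow the Subset-Sum solver is; in the sense of the text, Partition is a special case of Subset-Sum.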
A.4 Classification of Hard Problems

In practice, we do not usually use reduction to show a problem is polynomial. We are more likely to start optimistically looking for an efficient algorithm directly, which may be easier than seeking another problem known to be polynomial, for which we can find an appropriate transformation. But suppose we cannot find either an efficient algorithm or a suitable transformation. We begin to suspect that our problem is not "easy" (i.e., is not a member of P). How can we establish that it is in fact "hard"?

We start by defining a larger class of problems, which includes P and also all the difficult problems we may ever encounter. To describe it, consider any combinatorial decision problem. For a typical instance, there may be a very large number of possible solutions which may have to be searched. Picture a candidate solution as a set of values assigned to the variables x = (x1, ..., xn). The question may be "for a given vector c, is there a feasible solution x such that cx ≤ B?" and the algorithm may search the solutions until it finds one satisfying the inequality (whereupon it stops with the answer "yes") or exhausts all solutions (and stops at "no"). This may well be a big job. But suppose we are told "the answer is 'yes', and here is a solution x that satisfies the inequality". We feel we must at least verify this, but that is trivial. Intuitively, even for the hardest problems, the amount of work to check that a given candidate solution confirms the answer "yes" should be small, even for very large instances. We will now define our "hard" problems as those which, though hard to solve, are easy to verify, where as usual "easy" means taking a time which grows only polynomially with instance size. To formalize this, let:
VA(n) = maximal time for A to verify that a given solution establishes the answer “yes” for any instance of length n.
Definition A.6 An algorithm Ã is nondeterministic polynomial time if there exists a polynomial p such that for every input of length n with answer "yes",

VÃ(n) ≤ p(n).
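The idea just formalized, that a claimed "yes" certificate is cheap to check even when finding one may require exponential search, can be sketched as follows. The verifier below is a hypothetical example for a subset-sum style question, not from the text: given a certificate (a list of indices naming a subset), it confirms the answer "yes" in time linear in the instance size.

```python
def verify_subset_sum(values, target, certificate):
    """Check in O(n) time that `certificate` (a list of indices into
    `values`) names a genuine subset summing to `target`.  Contrast
    this with *finding* such a subset, which may require examining
    exponentially many candidates."""
    if len(set(certificate)) != len(certificate):
        return False          # indices must name distinct elements
    if any(i < 0 or i >= len(values) for i in certificate):
        return False          # indices must be in range
    return sum(values[i] for i in certificate) == target
```

A single pass over the certificate suffices, which is exactly the polynomially bounded verification time VÃ(n) that Definition A.6 requires.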
Definition A.7 The collection NP comprises all problems for which a nondeterministic polynomial algorithm exists.

It may be noted that a problem in NP is solvable by searching a decision tree of polynomially bounded depth, since verifying a solution is equivalent to tracing one path through the tree. From this, it is easy to see that P ⊆ NP. Strangely, complexity theorists have been unable to show that P ⊂ NP; it remains possible that all the problems in NP could actually be solved by polynomial algorithms, so that P = NP. However, since so many brilliant researchers have worked on so many difficult problems in NP for so many years without success, this is regarded as being very unlikely. Assuming P ≠ NP, as we shall hereafter, it can be shown that the problems in NP include an infinite number of equivalence classes, which can be ranked in order of increasing difficulty; where an equivalence class C is more difficult than another class C′ if, for every problem Π ∈ C and every Π′ ∈ C′, Π′ ∝ Π but not Π ∝ Π′. There also exist problems that cannot be compared: neither Π′ ∝ Π nor Π ∝ Π′. Fortunately, however, all problems that arise naturally have always been found to lie in one of two equivalence classes: the "easy" problems in P, and the "hard" ones, which we now define. The class of NP-hard problems (NPH) is a collection of problems with the property that every problem in NP can be reduced to the problems in this class. More formally,
Definition A.8 NPH = {Π : ∀Π′ ∈ NP, Π′ ∝ Π}.

Thus each problem in NPH is at least as hard as any problem in NP. We know that some problems in NPH are themselves in NP, though some are not. Those that are include the toughest problems in NP, and form the class of NP-complete problems (NPC). That is,

Definition A.9 NPC = {Π : (Π ∈ NP) and (∀Π′ ∈ NP, Π′ ∝ Π)}.

The problems in NPC form an equivalence class. This is so because all problems in NP reduce to them; hence, since they are all in NP, they reduce to each other. The class NPC includes the most difficult problems in NP. As we mentioned earlier, by a surprising but happy chance, all the problems we ever encounter outside the most abstract mathematical artifacts turn out to belong to either P or NPC.

When tackling a new problem Π, we naturally wonder whether it belongs to P or NPC: is it "easy" or "hard"? As we said, to show that the problem belongs to P, we usually try to find a polynomial time algorithm, though we could seek to reduce it to a problem known to be polynomial. If we are unable to show that the problem is in P, the next step generally is to attempt to show that it lies in NPC; if we can do so, we are justified in not developing an efficient algorithm. To show that our problem Π is hard, we look for a problem, call it Π′, that has already been proven hard and can be reduced to our problem. That is, for any instance of the hard problem, we can efficiently construct an instance of our problem such that knowing the answer to our problem will immediately tell us the answer to the hard problem. Effectively, the hard problem Π′ is a special case of our problem Π. Now, if our problem were easy, the hard problem would be easy. But it is not. So our problem must be hard, too. This logic is summarized in the following theorem, which should be clear enough to require no proof.
Theorem A.2 ∀Π, Π′ ∈ NP, (Π′ ∈ NPC) and (Π′ ∝ Π) ⇒ Π ∈ NPC.

Thus, we need to find a problem Π′ ∈ NPC and show Π′ ∝ Π, thereby demonstrating that Π is at least as hard as any problem in NPC. To facilitate this, we need a list of problems known to be in NPC. Several hundred are listed in Garey and Johnson (1979) in a dozen categories such as Graph Theory, Mathematical Programming, Sequencing and Scheduling, Number Theory, etc., and more are being added all the time. Even given an ample selection, a good deal of tenacity and ingenuity are usually needed to pick one with appropriate similarities to ours and to fill in the precise details of the transformation. In the next section, we describe the basic technique for theorem proving in complexity theory, and conclude with an illustrative example.

A.5 Strong NP-Completeness

We now introduce one of the various ways NP-complete problems can be classified into smaller subclasses, the only one we will use in this monograph: the partitioning of the class NPC into the two sets, ordinary and strongly NP-complete problems. For a detailed description of these classes see Garey and Johnson (1979). In practical terms, an ordinary NP-complete problem can be solved using implicit enumeration algorithms like dynamic programming. In this case, the time complexity of the algorithm is not polynomial in the length of the input data, but it is polynomial in the size of these data. For instance, Partition is an NP-complete problem (to be defined shortly, in Sect. A.7), for which the input data are n positive integers vi (i = 1, 2, ..., n). Let V be the size of this data: V = Σi vi. Partition is solvable by dynamic programming in O(nV) time (see Martello and Toth, 1990). Evidently, this complexity is polynomial in V. To see that this complexity bound is not polynomial in the length of the data, consider the binary encoding scheme.
In this scheme each vi can be represented by a string of length O(log vi), and hence v1, ..., vn can be described by a string of length O(Σi log vi), which is no greater than O(n log V). We see that the time complexity O(nV) of the dynamic program (DP) is polynomial in the size V of the data, but not polynomial in the length of the input data, O(n log V). When the complexity of an algorithm is polynomial in the size of the data, but not in the length of the input, we refer to it as a pseudo-polynomial algorithm. An NP-complete problem solvable by a pseudo-polynomial algorithm is called ordinary NP-complete. Otherwise, the problem is strongly NP-complete.
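The pseudo-polynomial dynamic program discussed above can be sketched as follows. This is a standard O(nV) subset-sum recursion for Partition; the implementation details are a sketch under that standard scheme, not taken from Martello and Toth (1990).

```python
def partition_dp(values):
    """Decide Partition for positive integers `values` in O(nV) time,
    where V = sum(values).  The table has V/2 + 1 entries and each of
    the n values sweeps it once, so the work is polynomial in the
    magnitude V but exponential in the binary encoding length n log V."""
    total = sum(values)
    if total % 2 != 0:
        return False                 # odd total: no equal split exists
    target = total // 2
    # reachable[s] == True iff some subset of the values seen so far sums to s
    reachable = [True] + [False] * target
    for v in values:
        # Sweep downward so each value is used at most once.
        for s in range(target, v - 1, -1):
            if reachable[s - v]:
                reachable[s] = True
    return reachable[target]
```

For inputs whose values are written in binary, doubling the number of bits per value doubles V and hence the running time, which is exactly why this algorithm is pseudo-polynomial rather than polynomial.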
A.5.1 Pseudo-Polynomial Reduction

As we know, to show ordinary NP-completeness of Π, we start with an ordinary NP-complete Π′ and provide a polynomial reduction to Π. That is, for any instance I′ of Π′ we produce an instance I of Π in polynomial time, and given the solution SI of I, we produce a solution SI′ of I′, also in polynomial time. Now, if we could solve Π in polynomial time, we would have a sequence of three polynomial steps that would solve Π′. But we know Π′ is not polynomially solvable, and so Π cannot be, either. The same logic applies if we start with a strongly NP-complete Π′. Given a polynomial reduction, Π must also be strongly NP-complete: if Π were anything less (polynomial or ordinary NP-complete), Π′ would be, too. But now note: if either or both of the steps in the reduction were pseudo-polynomial, and if Π could be solved polynomially or pseudo-polynomially, we would still have an overall pseudo-polynomial solution to Π′, giving us the contradiction we need. This should provide the motivation for the following analogue of Definition A.4:

Definition A.10 A problem Π′ is pseudo-polynomially reducible to a problem Π (Π′ ∝ Π) if, for any instance I′ of Π′, an instance I of Π can be constructed in pseudo-polynomially bounded time, such that, given the solution SI to I, the solution SI′ to I′ can be found in pseudo-polynomial time.

This definition leads to the following extension of Theorem A.2:

Theorem A.3 ∀Π, Π′ ∈ NP, if Π′ is strongly NP-complete, and Π′ ∝ Π, then Π is strongly NP-complete.

This is a stronger result than Theorem A.2. However, it is not to our knowledge ever used, partly because Theorem A.2 seems to be sufficient, partly because pseudo-polynomial transformations are much harder to find than polynomial ones, and finally because Theorem A.3 does not seem to be widely known.
A.6 How to Show a Problem is NP-Complete

We now summarize the process of actually proving the NP-completeness, whether ordinary or strong, of a new problem Π of interest. Recall, we are dealing only with decision problems.

1. Show that Π ∈ NP.
That is, given a solution SΠ of Π we must be able to check whether SΠ provides a "yes" or "no" answer for Π in polynomial time. This is a technical requirement. After all, as we said earlier, "all the problems we ever encounter outside the most abstract mathematics turn out to belong to either P or NPC" and hence to NP. Thus, in practice, this step is commonly assumed without mention.

2. Find a problem Π′ ∈ NPC that reduces to Π.

This, of course, is the crux of the matter. It is not easy to do, requiring technical skills born of insight and experience. If a candidate problem Π′ is to serve our purposes, then by the definition of reduction in Sect. A.3, the following must be true and verifiable:

• For any instance I′ of Π′, we must be able to construct an instance I of Π such that I′ has the solution SI′ = yes if and only if I has the solution SI = yes.
• The times required to construct I from I′, and to construct SI′ from SI, must be polynomial [may be polynomial or pseudo-polynomial] in the size of (i.e., the length of input data required to specify) I′, when Π′ is ordinary [strongly] NP-complete.

3. Determine whether Π is ordinary or strongly NP-complete.

The precise complexity of Π depends largely on the complexity status of the known NP-complete problem Π′.

• If Π′ is strongly NP-complete, then Π is strongly NP-complete.

A.7 A Sample Proof