Average Case Complexity∗
Yuri Gurevich,† University of Michigan.

∗Springer LNCS 510, 1991, 615–628.
†Partially supported by NSF grant CCR89-04728. Address: EECS Dept., University of Michigan, 1301 Beal Ave., Ann Arbor, MI 48109-2122. Email: [email protected]

Abstract. We attempt to motivate, justify and survey the average case reduction theory.

1. Introduction

An NP decision problem may be specified by a set D of strings (instances) in some alphabet, another set W of strings in some alphabet and a binary relation R ⊆ W × D that is polynomial time computable and polynomially bounded (which means that the size |w| of w is bounded by a fixed polynomial of the size |x| of x whenever wRx holds). If wRx holds, w is called a witness for x. The decision problem specified by D, W and R, call it DP(D, W, R), may be stated thus: Given an element x of D, determine if there exists w ∈ W such that wRx holds. The corresponding search problem, call it SP(D, W, R), may be stated thus: Given an element x of D, determine if there exists a witness w for x and if so then exhibit such a witness.

Problems of the form SP(D, W, R) may be called NP search problems even though NP is supposed to be a class of decision problems. It will be convenient for us to use the term "an NP problem" to mean an NP decision problem (a genuine NP problem) or its search counterpart. In this talk, we deal only with NP decision and search problems even though methods of the average complexity theory may be applied to wider classes. NP and the class of NP search problems are sufficiently important.

The restriction to the case when instances and witnesses are strings is standard [GJ], even though it means that one may be forced to deal with string encodings of real objects of interest. The reason is that we need a clear computational model. If instances and witnesses are strings, the usual Turing machine model can be used. The size of a string is often taken to be its length, though this requirement can easily be relaxed.

A solution of an NP problem is a feasible decision or search algorithm. The question arises what is feasible. It is common to adopt the thesis

(1.0) Feasible algorithms are exactly polynomial time ones.

which can be split into two parts:

(1.1) Every polynomial time algorithm is feasible.

(1.2) Every feasible algorithm is polynomial time.

We do not believe in a unique feasibility concept; feasibility depends on application. In real time applications or in the case of huge databases (like that of the US Internal Revenue Service), even linear time may be prohibitively expensive. However, one important feasibility concept fits (1.1) well. It is often reasonable to assume that if your customer had time to produce an input object of some size n, then you can afford to spend time n^2 or n^3 etc. on the object. Of course, this "etc." should be treated with caution. Probably, you will be unable to spend time n^64. Fortunately, in practically important cases, these polynomials tend to be reasonable.

P-time (polynomial time) is a very robust and machine-independent concept closed under numerous operations. If you believe that (i) linear time algorithms are feasible and (ii) every algorithm with feasible procedures running in feasible time (counting a procedure call as one step) is feasible, then you are forced to accept all P-time algorithms as feasible. Finding a polynomial-time algorithm for an NP problem often requires ingenuity, whereas an exponential-time algorithm is there for free. Proving that a given NP problem can be solved in polynomial time, you feel that you made a mathematical advance. It seems that this feeling of mathematician's satisfaction contributed to the popularity of the P-time concept. (Notice however that a superpolynomial bound may require ingenuity and be a source of satisfaction as well.)

We do not know very strong arguments in favor of (1.2). Moreover, there are strong arguments against it. The most important one for our purpose in this talk is that sometimes there are satisfying practical ways to cope with hard NP problems in the absence of P-time algorithms. In the case of an optimization problem, one may have a fast approximating algorithm. In the case of a decision or search problem, one may have a decision or search algorithm that is usually fast or almost always fast or fast on average. Leonid Levin proposed one average case approach in [Le1]. His approach is the topic of this talk.

We hope this account is entertaining. It certainly isn't complete. The references include three student papers: [Gr], [Kn] and [Sc].

2. Polynomial on average

In the average case approach, a decision or search problem is supposed to be given together with a probability distribution on instances. A problem with a probability distribution on instances is called randomized (or distributional [BCGL]). Determining an appropriate distribution is a part of the formalization of the problem in question and it isn't necessarily easy. (The robustness of the average case approach with respect to probability distributions is of some help.) In this talk, we address only the task of solving a given randomized problem. The task is to devise an algorithm that solves the problem quickly on average with respect to the given distribution. No pretense is made that the algorithm is also good for the worst case.

One advantage of the average case approach is that it often works. It was possible a priori that, whenever you have a fast on the average algorithm, you also have an algorithm that is fast even in the worst case. This is not at all the case. Sometimes a natural probability distribution makes the randomized problem ridiculously easy. Consider for example the 3-coloring search problem when all graphs of the same size have the same probability. The usual backtracking solves this randomized search problem in (surprise!) at most 197 steps on average, never mind the size of the given instance [Wi]. The reason is that there are very simple and probable witnesses to non-colorability, like a clique of 4. The distribution is greatly biased toward the negative answer. The average time can be further cut down if the algorithm starts with a direct search for such witnesses. There are numerous examples of success in the cases of less biased distributions. This is, however, a topic of a different talk.

One may argue that, in many applications, the average case may be more important than the worst case. Imagine a factory that produces complete graphs. Each edge of a graph is produced separately using the same technology and there is a fixed probability p for an edge to be faulty. Suppose that, for whatever reason, it happens to be important to know whether a given graph has a hamiltonian circuit composed entirely of faulty edges. There is an algorithm A that solves the hamiltonian search problem with any fixed edge-probability distribution and that has an expected running time linear in the number of vertices [GS]. You may want to use that algorithm and open a hamiltonian shop. There may even be several such factories in your area. A customer will bring you a graph G with some number n of vertices. If it is customary that the charge depends only on the number of vertices, you may charge a fee proportional to the expected running time of A on n-vertex graphs, where the proportionality coefficient is chosen to ensure you a fair profit and competitiveness.

For the sake of fairness, we should mention situations where the failure to solve quickly one instance means the failure of the whole enterprise. Even then the worst case approach may be too conservative. In any case, it seems to us that, in many applications where polynomial time is feasible, polynomial on average time is feasible as well.

ily its length. Third, the probability distribution is polynomial time computable (P-time). The notion of P-time distributions was introduced in [Le1] and analyzed to some extent in [Gu1]. The requirement that the distribution is P-time will be discussed and relaxed in Section 6. Meantime, view it as some technical restriction that is often satisfied in practice.

Consider a function T from a domain D to the interval [0..∞] of the real line extended with ∞. Let E_n(T) be the expectation of T with respect to the conditional probability P_n(x) = P{x | |x| = n}. The definition of polynomiality on average seems obvious: T is polynomial on average (relative to D) if

(2.1) E_n(T) is bounded by a polynomial of n.

Unfortunately, this obvious answer is not satisfying. It is easy to construct D and T such that T is polynomial on average but T^2 is not. It seems reasonable to accept that (i) T is polynomial on average if E_n(T) = O(n), and (ii) if T is bounded by a polynomial of a function which is polynomial on average then T is polynomial on average. These two assumptions imply that T is polynomial on average if

(2.2) there exists ε > 0 such that E_n(T^ε)