
Average Case Complexity∗

Yuri Gurevich,† University of Michigan.

∗ Springer LNCS 510, 1991, 615–628.
† Partially supported by NSF grant CCR89-04728. Address: EECS Dept., University of Michigan, 1301 Beal Ave., Ann Arbor, MI 48109-2122. Email: [email protected]

Abstract. We attempt to motivate, justify and survey the average case reduction theory.

1. Introduction

An NP decision problem may be specified by a set D of strings (instances) in some alphabet, another set W of strings in some alphabet and a binary relation R ⊆ W × D that is polynomial time computable and polynomially bounded (which means that the size |w| of w is bounded by a fixed polynomial of the size |x| of x whenever wRx holds). If wRx holds, w is called a witness for x. The decision problem specified by D, W and R, call it DP(D, W, R), may be stated thus: Given an element x of D, determine if there exists w ∈ W such that wRx holds. The corresponding search problem, call it SP(D, W, R), may be stated thus: Given an element x of D, determine if there exists a witness w for x and if so then exhibit such a witness.

Problems of the form SP(D, W, R) may be called NP search problems even though NP is supposed to be a class of decision problems. It will be convenient for us to use the term "an NP problem" to mean an NP decision problem (a genuine NP problem) or its search counterpart. In this talk, we deal only with NP decision and search problems even though methods of the average complexity theory may be applied to wider classes. NP and the class of NP search problems are sufficiently important.

The restriction to the case when instances and witnesses are strings is standard [GJ], even though it means that one may be forced to deal with string encodings of real objects of interest. The reason is that we need a clear computational model. If instances and witnesses are strings, the usual Turing machine model can be used. The size of a string is often taken to be its length, though this requirement can easily be relaxed.

A solution of an NP problem is a feasible decision or search algorithm. The question arises what is feasible. It is common to adopt the thesis

(1.0) Feasible algorithms are exactly polynomial time ones.

which can be split into two parts:

(1.1) Every polynomial time algorithm is feasible.

(1.2) Every feasible algorithm is polynomial time.

We do not believe in a unique feasibility concept; feasibility depends on application. In real time applications or in the case of huge databases (like that of the US Internal Revenue Service), even linear time may be prohibitively expensive. However, one important feasibility concept fits (1.1) well.

It is often reasonable to assume that if your customer had time to produce an input object of some size n, then you can afford to spend time n^2 or n^3 etc. on the object. Of course, this "etc." should be treated with caution. Probably, you will be unable to spend time n^64. Fortunately, in practically important cases, these polynomials tend to be reasonable.

P-time (polynomial time) is a very robust and machine-independent concept closed under numerous operations. If you believe that (i) linear time algorithms are feasible and (ii) every algorithm with feasible procedures running in feasible time (counting a procedure call as one step) is feasible, then you are forced to accept all P-time algorithms as feasible. Finding a polynomial-time algorithm for an NP problem often requires ingenuity, whereas an exponential-time algorithm is there for free. Proving that a given NP problem can be solved in polynomial time, you feel that you made a mathematical advance. It seems that this feeling of mathematician's satisfaction contributed to the popularity of the P-time concept. (Notice however that a superpolynomial bound may require ingenuity and be a source of satisfaction as well.)

We do not know very strong arguments in favor of (1.2). Moreover, there are strong arguments against it. The most important one for our purpose in this talk is that sometimes there are satisfying practical ways to cope with hard NP problems in the absence of P-time algorithms. In the case of an optimization problem, one may have a fast approximating algorithm. In the case of a decision or search problem, one may have a decision or search algorithm that is usually fast or almost always fast or fast on average. Leonid Levin proposed one average case approach in [Le1]. His approach is the topic of this talk.

We hope this account is entertaining. It certainly isn't complete. The references include three student papers: [Gr], [Kn] and [Sc].

2. Polynomial on average

In the average case approach, a decision or search problem is supposed to be given together with a probability distribution on instances. A problem with a probability distribution on instances is called randomized (or distributional [BCGL]). Determining an appropriate distribution is a part of the formalization of the problem in question and it isn't necessarily easy. (The robustness of the average case approach with respect to probability distributions is of some help.) In this talk, we address only the task of solving a given randomized problem. The task is to devise an algorithm that solves the problem quickly on average with respect to the given distribution. No pretense is made that the algorithm is also good for the worst case.

One advantage of the average case approach is that it often works. It was possible a priori that, whenever you have a fast on the average algorithm, you also have an algorithm that is fast even in the worst case. This is not at all the case. Sometimes a natural probability distribution makes the randomized problem ridiculously easy. Consider for example the 3-coloring search problem when all graphs of the same size have the same probability. The usual backtracking solves this randomized search problem in (surprise!) at most 197 steps on average, never mind the size of the given instance [Wi]. The reason is that there are very simple and probable witnesses to non-colorability, like a clique of 4. The distribution is greatly biased toward the negative answer. The average time can be further cut down if the algorithm starts with a direct search for such witnesses. There are numerous examples of success in the cases of less biased distributions. This is, however, a topic of a different talk.
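The constant-on-average behaviour of backtracking on uniformly random graphs is easy to observe experimentally. The sketch below is our own illustration, not taken from [Wi]; it assumes the uniform model in which every edge is present independently with probability 1/2, and it simply counts backtrack-tree nodes as a proxy for running time.

    import random

    def random_graph(n, p=0.5, seed=None):
        # Adjacency matrix of a random graph: each edge present independently with prob. p.
        rnd = random.Random(seed)
        adj = [[False] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if rnd.random() < p:
                    adj[i][j] = adj[j][i] = True
        return adj

    def backtrack_nodes(adj):
        # Plain backtracking search for a 3-coloring; returns the number of
        # search-tree nodes visited (a proxy for running time).
        n = len(adj)
        color = [None] * n
        nodes = 0

        def extend(v):
            nonlocal nodes
            nodes += 1
            if v == n:
                return True
            for c in range(3):
                if all(not adj[v][u] or color[u] != c for u in range(v)):
                    color[v] = c
                    if extend(v + 1):
                        return True
            color[v] = None
            return False

        extend(0)
        return nodes

    if __name__ == "__main__":
        for n in (10, 20, 40, 80):
            trials = 200
            avg = sum(backtrack_nodes(random_graph(n, seed=t)) for t in range(trials)) / trials
            print(f"n={n:3d}  average backtrack nodes ~ {avg:.1f}")

The average node count stays essentially flat as n grows: on a uniformly random graph, partial colorings of the first few vertices already run into contradictions with high probability, for the reason stated above.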

One may argue that, in many applications, the average case may be more important than the worst case. Imagine a factory that produces complete graphs. Each edge of a graph is produced separately using the same technology and there is a fixed probability p for an edge to be faulty. Suppose that, for whatever reason, it happens to be important to know whether a given graph has a hamiltonian circuit composed entirely of faulty edges. There is an algorithm A that solves the hamiltonian search problem with any fixed edge-probability distribution and that has an expected running time linear in the number of vertices [GS]. You may want to use that algorithm and open a hamiltonian shop. There may even be several such factories in your area. A customer will bring you a graph G with some number n of vertices. If it is customary that the charge depends only on the number of vertices, you may charge a fee proportional to the expected running time of A on n-vertex graphs, where the proportionality coefficient is chosen to ensure you a fair profit and competitiveness.

For the sake of fairness, we should mention situations where the failure to solve quickly one instance means the failure of the whole enterprise. Even then the worst case approach may be too conservative. In any case, it seems to us that, in many applications where polynomial time is feasible, polynomial on average time is feasible as well. The question arises what does it mean that a function is polynomial (i.e. polynomially bounded) on average.

Following [BG], we define a domain to be a set U (the universe) with a function from U to natural numbers (the size function) and a probability distribution on U satisfying the following technical restrictions. First, the universe is a set of strings in some alphabet (so that the Turing machine model can be used). Second, there are only finitely many elements of positive probability and any given size. It is not required that the size of a string is necessarily its length. Third, the probability distribution is polynomial time computable (P-time). The notion of P-time distributions was introduced in [Le1] and analyzed to some extent in [Gu1]. The requirement that the distribution is P-time will be discussed and relaxed in Section 6. Meantime, view it as a technical restriction that is often satisfied in practice.

Consider a function T from a domain D to the interval [0..∞] of the real line extended with ∞. Let E_n(T) be the expectation of T with respect to the conditional probability P_n(x) = P{x | |x| = n}. The definition of polynomiality on average seems obvious: T is polynomial on average (relative to D) if

(2.1) E_n(T) is bounded by a polynomial of n.

Unfortunately, this obvious answer is not satisfying. It is easy to construct D and T such that T is polynomial on average but T^2 is not. It seems reasonable to accept that (i) T is polynomial on average if E_n(T) = O(n), and (ii) if T is bounded by a polynomial of a function which is polynomial on average then T is polynomial on average. These two assumptions imply that T is polynomial on average if

(2.2) there exists ε > 0 such that E_n(T^ε) is bounded by a polynomial of n

which is equivalent to

(2.2') (∃ε > 0) E_n(T^ε/n) = O(1).

The weak point of condition (2.2) is that it is uniform in n. It is possible, for example, that there exists a set Q of natural numbers such that T(x) is large when |x| ∈ Q but, for n ∈ Q, the probability that |x| = n is very small. The "official" definition requires a slightly weaker condition.
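To see how condition (2.1) fails to be closed under squaring, consider the following toy domain (our own illustration, not one from the text): strings of length n are uniformly distributed, one designated string of each length has cost 2^n, and every other string has cost n. The sketch below computes E_n(T) and E_n(T^2) exactly; the former stays below n + 1 while the latter grows like 2^n.

    from fractions import Fraction

    def expectations(n):
        # Uniform distribution on the 2**n strings of length n.
        # One designated string has cost 2**n, all others have cost n.
        p_big = Fraction(1, 2 ** n)
        e_t = (1 - p_big) * n + p_big * 2 ** n            # E_n(T)
        e_t2 = (1 - p_big) * n ** 2 + p_big * (2 ** n) ** 2  # E_n(T^2)
        return e_t, e_t2

    if __name__ == "__main__":
        for n in (5, 10, 20, 40):
            e_t, e_t2 = expectations(n)
            print(f"n={n:2d}  E_n(T) = {float(e_t):12.2f}   E_n(T^2) = {float(e_t2):.3e}")

Note that this particular T is still polynomial on average under (2.2), since already E_n(T^(1/2)) is bounded; the example only shows that the naive condition (2.1) is not robust.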

Definition 2.1. A function T from a domain D to [0..∞] is polynomial on average (in short, AP) if

(2.3) (∃ε > 0) E(T^ε/|x|) < ∞.

Theorem 2.1 [Gu1]. Conditions (2.2) and (2.3) are equivalent if there exists an integer k such that P{x : |x| = n} > n^(-k) for all sufficiently large n with P{x : |x| = n} > 0.

A more extensive discussion of the issue of AP functions can be found in [Gu1] and [BCGL]. We will return to that issue in Section 7.

3. A new setting

A randomized NP decision or search problem may be specified by a triple (D, W, R) where D, W and R are as above (in Section 1) except that D is a domain now. Randomized decision problems RDP(D, W, R) and randomized search problems RSP(D, W, R) play the roles of decision and search problems in the NP theory. The role of NP is played by the class RNP of randomized NP decision problems. The term "an RNP problem" will be used to mean an RNP decision problem or its search counterpart. An algorithm for an RNP problem is considered feasible if it runs in time that is AP relative to D; thus the role of P is played by the class AP of AP-time decidable RNP decision problems. (This is going to be revised.)

It is difficult to exhibit an RNP decision problem that is not AP. (Such a problem exists if NTime(2^O(n)) ≠ DTime(2^O(n)) [BCGL].) In particular, the existence of such a problem implies P ≠ NP. However the reduction theory allows one to exhibit maximally hard RNP problems.

We try to motivate an appropriate reduction notion. Notice that an AP-time function f from a domain D1 to a domain D2 need not constitute a reduction of Π1 = RDP(D1, W1, R1) to Π2 = RDP(D2, W2, R2) even if (∃w)(wR1x) ↔ (∃w)(wR2(fx)) for all x in D1. Why? Suppose that you have a decision algorithm A2 for Π2 that runs very fast on average. That decision algorithm can be pulled back by means of f to give the following decision algorithm A1 for Π1: Given x, compute y = f(x) and then apply A2 to y. It may happen unfortunately that the range of f has an unfair share of rare difficult instances of Π2 and A1 is not at all AP-time.

It would be natural to say that a function f from D1 to D2 reduces D1 to D2 if

(3.1) f is AP-time, and

(3.2) For every AP function T : D2 → [0..∞], the composition T ◦ f is AP.

Because of the universal quantification, however, the requirement (3.2) is not convenient to use. Fortunately, (3.2) can be simplified. Let P_i be the probability distribution of D_i and let T0(y) = P1(f^(-1)(y))/P2(y). Since E(T0) = 1, we have the following special case of (3.2):

(3.3) P1(f^(-1)(fx))/P2(fx) is AP.

Theorem 3.1 [BG]. Assume (3.1). Then (3.2) ↔ (3.3).

Say that D2 dominates D1 with respect to f if (3.3) holds. This is a slight variation on Levin's original definition of domination.

Definition 3.1. A function f from a domain D1 to a domain D2 reduces D1 to D2 if f is AP-time computable and D2 dominates D1 with respect to f.
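The role of the ratio T0 in (3.3) can be seen on a small finite example of our own devising: if the image of f is concentrated on elements that are rare under P2, the ratio blows up, and the pulled-back algorithm A1 inherits none of A2's average-case guarantees. The sketch below computes T0 for such an f (sizes are not modelled here; the sketch only illustrates the ratio itself).

    from fractions import Fraction

    # Toy illustration: D1 and D2 share the universe {0, ..., N-1}.
    # P1 is uniform; P2(y) is proportional to 2**-y, so large y are rare in D2.
    # f is the identity map, so the image of f sits on D2's rare elements
    # whenever x is large, and the domination ratio T0 blows up there.

    N = 16
    P1 = {x: Fraction(1, N) for x in range(N)}
    Z = sum(Fraction(1, 2 ** y) for y in range(N))
    P2 = {y: Fraction(1, 2 ** y) / Z for y in range(N)}

    def f(x):
        return x

    def domination_ratio(y):
        # T0(y) = P1(f^{-1}(y)) / P2(y), the quantity whose AP-ness is condition (3.3).
        pre = sum(p for x, p in P1.items() if f(x) == y)
        return pre / P2[y]

    if __name__ == "__main__":
        for y in (0, 4, 8, 15):
            print(f"y={y:2d}  T0(y) = {float(domination_ratio(y)):.1f}")
        # The expectation of T0 under P2 equals 1, as noted before (3.3):
        print("E_P2(T0) =", float(sum(P2[y] * domination_ratio(y) for y in range(N))))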

Definition 3.2. RDP(D1, W1, R1) reduces to RDP(D2, W2, R2) if there exists a reduction f of D1 to D2 such that (∃w)(wR1x) ↔ (∃w)(wR2(fx)) for all instances x ∈ D1 of positive probability.

Definition 3.3. RSP(D1, W1, R1) reduces to RSP(D2, W2, R2) if there exists a reduction f of D1 to D2 and a polynomial time computable function g from W2 to W1 satisfying, for every x ∈ D1 of positive probability, the following two conditions:

• (∃w1)(w1 R1 x) → (∃w2)(w2 R2 (fx)),

• w2 R2 (fx) → (g w2) R1 x.

Here is one example. Given an arbitrary RSP(D, W, R), form the direct product W′ of W and (the universe of) D. Let P(w, x) = x be the projection function from W′ to D, F be the restriction of P to the set {(w, x) : wRx}, and R′ be the graph {((w, x), x) : wRx} of F. The new problem RSP(D, W′, R′) is the problem of inverting F. Given an element x of D, determine whether F^(-1)(x) is nonempty and if so then exhibit an element of F^(-1)(x). Thus, every RNP search problem reduces to a randomized problem of inverting a polynomial-time computable function.

Corollary 3.1. Let Π1, Π2 and Π3 be RNP decision (resp. search) problems.

• If Π1 reduces to Π2 and Π2 reduces to Π3 then Π1 reduces to Π3.

• If Π1 reduces to Π2 and there exists an AP-time decision (resp. search) algorithm for Π2 then there exists an AP-time decision (resp. search) algorithm for Π1.

Corollary 3.2. Let D1 and D2 be two domains with the same universe and the same size function dominating each other with respect to the identity function. Then any RDP(D1, W, R) (resp. RSP(D1, W, R)) is solvable in AP-time if and only if RDP(D2, W, R) (resp. RSP(D2, W, R)) is so.

Corollary 3.2 witnesses a certain robustness of the average case approach.

Leonid Levin proved that a randomized version of the (bounded) tiling problem is complete for RNP [Le1]. Some additional problems complete for RNP with respect to reductions as defined above were given in [Gu1, Gu2]; in the next section, we describe one of them.

4. One RNP complete problem

In order to generalize the notion of the uniform probability distribution from finite sample spaces to domains, we need to fix some default probability distribution on natural numbers. Let us agree that the default probability of any n > 1 is proportional to n^(-1) · (log n)^(-2). Call a domain D uniform if elements of the same size have the same probability and the probability of {x : |x| = n} is proportional to the default probability of n. Further, define the direct product D1 × D2 of domains D1 and D2 to be the domain of pairs (a, b), where a ∈ D1 and b ∈ D2, such that |(a, b)| = |a| + |b| and P(a, b) = P(a) × P(b).

Recall that the modular group is the multiplicative group of two-by-two integer matrices of determinant 1. In this section, a matrix is an element of the modular group. Make a uniform domain out of the modular group in a natural way. The size |A| of a matrix A may be defined as the number of bits necessary to write the matrix down (using the binary notation for the entries); alternatively, it may be defined as the log of the maximal absolute value of its entries. (Notice that |A| is the size rather than the determinant of a matrix A.)
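The phrase "proportional to n^(-1) · (log n)^(-2)" makes sense because the corresponding series converges. The short sketch below (ours, purely illustrative) estimates the normalizing constant numerically and prints the resulting default probabilities of a few numbers.

    import math

    # The default probability of n > 1 is proportional to 1 / (n * (log n)**2).
    # The series converges, so the proportionality constant below is finite.

    def default_weight(n):
        return 1.0 / (n * math.log(n) ** 2)

    def normalizing_constant(limit=10**6):
        # Partial sum plus an integral estimate of the tail:
        # sum over n > limit of 1/(n log^2 n) is roughly 1/log(limit).
        partial = sum(default_weight(n) for n in range(2, limit))
        return partial + 1.0 / math.log(limit)

    if __name__ == "__main__":
        Z = normalizing_constant()
        print("normalizing constant ~", round(Z, 3))
        for n in (2, 10, 1000):
            print(f"default probability of {n:5d} ~ {default_weight(n) / Z:.5f}")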

A matrix pair (B, C) gives rise to an operator T_{B,C}(X) = BXC over the modular group which is linear (even though the modular group is not closed under addition) in the following sense: If X = Σ Y_i then T_{B,C}(X) = Σ T_{B,C}(Y_i). Andreas Blass proved that an arbitrary operator over the modular group is linear if and only if there exist matrices B and C such that either T(X) = BXC for all X or else T(X) is the transpose of BXC for all X. Moreover, any linear operator T uniquely extends to a linear operator on all two-by-two integer (or even complex) matrices; this gives rise to the standard representation of T by a four-by-four integer matrix. The two presentations are polynomial-time computable each from the other. Thus, an appropriate matrix pair (B, C) with one additional bit, indicating whether the transpose is applied, is a natural representation of the corresponding linear operator T. It is natural to define the domain of linear operators as the direct product of two copies of the matrix domain and the uniform domain of two elements. Let σ be a sufficiently large positive integer.

We are ready to define an RNP decision problem called Matrix Decomposition. The domain of Matrix Decomposition is the direct product of the matrix domain, σ copies of the domain of linear operators and the domain of natural numbers where |n| = n and P(n) is the default probability of n. In other words, an instance of Matrix Decomposition comprises a matrix A, a sequence S of σ linear operators and the unary notation for a natural number n. The corresponding question is whether there is a product P = T1 × ... × Tm of m ≤ n linear operators Ti ∈ S such that A = P(1).

Theorem 4.1 [Gu3]. Matrix Decomposition is RNP complete.

The proof of RNP hardness consists of the following steps. First, a randomized version of the (bounded) halting problem is proved complete for RNP. This result is implicit in [Le1] and explicit in [Gu1]. Second, the randomized halting problem is reduced to a randomized version of the (bounded) Post Correspondence Problem [Gu1]. Third, the randomized PCP is reduced to Matrix Decomposition [Gu3].

A simpler version of Matrix Decomposition is obtained by making S a sequence of matrices rather than linear operators. The question becomes whether A can be represented as a product of at most n S-matrices. We doubt that the modified problem is complete for RNP, but the similar problem for larger matrices, like 20 × 20, is complete [Ve].

5. Revision 1: Randomized reductions

The setting of Section 3 turns out to be too restrictive. Call a domain D flat if P(x) is bounded by 2^(-|x|^ε) for some ε > 0. Intuitively, a flat domain is akin to a uniform one. No element has a probability that is too big for its size. Many usual domains are flat. For example, any domain of graphs is flat if the size of a graph is the number of vertices (or the number of vertices plus the number of edges, or the size of the adjacency matrix) and the conditional probability distribution on n-vertex graphs is determined by the edge probability p(n) with n^(-2+δ) < p(n) < 1 − n^(-2+δ) for some fixed δ > 0.

Lemma 5.1 [Gu1]. If an RNP decision problem on a flat domain is complete for RNP then deterministic exponential time DTime(exp(n^O(1))) equals nondeterministic exponential time NTime(exp(n^O(1))).

Proof Sketch. Turn the given nondeterministic exponential time decision problem D0 into a very sparse RNP problem (D1, µ1) whose positive instances x have enormous (for their size) probabilities.

Given such an x, a reduction f of (D1, µ1) to a flat RNP decision problem (D, µ) produces an instance f(x) of a high probability and therefore a small size. An exponential time decision procedure for D together with a polynomial time procedure for computing f give a (deterministic) exponential time decision procedure for D0. QED

This proof contains no evidence that RNP problems with flat distributions are easy; it is natural to view the lemma as evidence that the reductions of Section 3 are not strong enough. Those reductions are many-one reductions and one obvious move is to generalize them to Turing reductions. However, the lemma survives the generalization. Leonid Levin had a fruitful idea: randomize. Allow the reducing algorithm to flip coins and produce a somewhat random output. Now the proof of the lemma fails: Instead of producing one instance f(x) of a high probability and small size, a randomizing reduction produces a multitude of instances of small probabilities and large sizes.

Using randomizing reductions, Levin and his student Ramarathnam Venkatesan constructed a natural randomized graph-coloring problem complete for RNP [VL]. Their paper demonstrates another reason, a very good one, for using randomizing reductions. Randomizing reductions allow us to use the structure of a random instance of the target problem (the one whose completeness one would like to prove). Given an instance x of the randomized halting problem, the reducing machine of Venkatesan and Levin flips coins to produce a random graph of an appropriate size. Then it massages this graph to make sure that x can be coded in with a sufficiently high probability.

Next, following [BG], we generalize the notion of domain reduction by allowing reductions to flip coins. All reductions as in Section 3 will be called deterministic from now on.

Consider a Turing machine M which takes inputs from a domain D and which can flip a fair coin. M can be viewed as a deterministic machine computing a function f(x, r) where x ∈ D and r is a sequence of (the outcomes of) coin flips. Call such f a random function on D. (One may object to the term "random function" on the grounds that a random function on A should be a randomly chosen function on A. The term "a random function" is fashioned in [BG] after well accepted terms like "a real function". A real function assigns real numbers to elements. A random function assigns random objects to (almost all) elements.) Formally, f is an ordinary (or deterministic) function on a different domain Df that is an extension of D. (Df has a somewhat unusual size function. Notice that the auxiliary input r does not have the status of the real input x because we are interested in measuring the running time of M in terms of the real input x only. The size of a pair (x, r) in Df is the size of x in D.)

Random functions compose and give (if you care) a nice category. Say that a random function f on D is AP-time computable if the deterministic function f is AP-time computable with respect to Df. We say that a domain D′ dominates a domain D with respect to a random function f from D to D′ if D′ dominates Df (in the sense of Section 3) with respect to the deterministic function f from Df to D′.

Definition 5.1. A reduction of a domain D1 to a domain D2 is an AP-time random function f from D1 to D2 such that D2 dominates D1 with respect to f.

Theorem 5.1. Let f be an AP-time random function from a domain D1 to a domain D2. Then the following statements are equivalent:

• D1 is dominated by D2 with respect to f.

• For every AP-time random function T from D2 to [0..∞], the composition T ◦ f is AP.

Corollary. Domain reductions compose.

The question arises when a reduction f of a domain D1 to a domain D2 reduces an RDP(D1, W1, R1) to an RDP(D2, W2, R2) or reduces an RSP(D1, W1, R1) to an RSP(D2, W2, R2). Say that f is (fully) correct if (∃w)(wR1x) ↔ (∃w)(wR2 f(x, r)) for all x ∈ D1 of positive probability and all r. Should we require that f is fully correct or not? In either case, we must allow – to be consistent – randomizing decision and search algorithms satisfying the corresponding correctness requirement.

It may seem that there is no sense in using fully correct randomizing reductions. Such a reduction can be made deterministic by pretending that all coins come up heads. This may ruin the domination requirement however. Fully correct reductions may be employed to overcome the phenomenon of flatness [Gu1, Gu3].

It is much more fruitful though to allow partially correct reductions [VL, BCGL, IL]. Say that a reduction f(x, r) of D1 to D2 reduces an RDP(D1, W1, R1) to an RDP(D2, W2, R2) with probability guarantee α(n) if, for every x ∈ D1 of positive probability, the probability of the event (∃w)(wR1x) ↔ (∃w)(wR2 f(x, r)) is at least α(|x|). Define partially correct reductions of RNP search problems, partially correct decision algorithms and partially correct search algorithms in a similar way. In the rest of this section, reducing (resp. solving) an RNP problem Π with correctness guarantee α means reducing (resp. solving) Π by an AP-time randomizing algorithm with correctness guarantee α.

If an RNP decision problem Π is solvable with a constant correctness guarantee α > 1/2 then, for any constant β < 1, there exists k such that running the given algorithm for Π on a given instance k times and taking the majority answer solves Π with correctness guarantee β. The situation is even better for search problems where one needs only one successful attempt; the inequality α > 1/2 can be replaced by α > 0. Of course, decreasing correctness guarantees can be boosted as well. In the case of search problems, it makes sense to allow inverse polynomial correctness guarantees. Iterating such an AP-time randomizing search algorithm a sufficient number of times gives an AP-time randomizing search algorithm with correctness guarantee close to 1. If an RNP search problem Π2 is solvable with an inverse polynomial correctness guarantee and an RNP search problem Π1 is reducible to Π2 with an inverse polynomial correctness guarantee then Π1 is solvable with an inverse polynomial correctness guarantee.

Define the revised counterpart of P in the new setting to be the class RAP of RNP decision problems solvable with a constant correctness guarantee exceeding 1/2, say, 2/3.

Notice, however, that the repetition technique for boosting the probability of correctness, which we applied to randomizing decision and search algorithms above, is not directly applicable to many-one randomizing reductions. Repeating a randomizing reduction k times results in k outputs in the target domain, not one as in the definition of reduction. In other words, such a repetition is a version of Turing (or truth-table) reduction, not a many-one reduction. At this point, we refer the reader to papers [VL, BCGL, IL] where partially correct reductions were successfully employed. It is our impression that the notion of reductions of RNP problems requires a little additional cleaning work.
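The repetition technique for decision algorithms is easy to make concrete. The sketch below is our own illustration and is not tied to any particular RNP problem: noisy_decide stands for a randomizing decision algorithm that answers correctly with probability alpha > 1/2 on every instance, and majority voting over k runs boosts the guarantee.

    import math
    import random

    def noisy_decide(truth, alpha, rnd):
        # A stand-in for a randomizing decision algorithm with correctness guarantee alpha.
        return truth if rnd.random() < alpha else (not truth)

    def majority_vote(truth, alpha, k, rnd):
        votes = sum(1 if noisy_decide(truth, alpha, rnd) else 0 for _ in range(k))
        return (votes > k / 2) == truth

    def boosted_accuracy(alpha, k, trials=20000, seed=0):
        rnd = random.Random(seed)
        return sum(majority_vote(True, alpha, k, rnd) for _ in range(trials)) / trials

    if __name__ == "__main__":
        alpha = 2 / 3
        for k in (1, 5, 25, 101):
            # Hoeffding-style bound on the majority error: exp(-2 k (alpha - 1/2)**2).
            bound = math.exp(-2 * k * (alpha - 0.5) ** 2)
            print(f"k={k:3d}  empirical correctness ~ {boosted_accuracy(alpha, k):.4f}"
                  f"  (error bound {bound:.4f})")

For search problems the boosting is even simpler: run the algorithm k times and keep any successful attempt, so any constant (or inverse polynomial) guarantee suffices, as noted above.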

The prospect of using Turing reductions raises a hope of bridging the gap between decision and search problems, of reducing search problems to decision problems. This works in the NP setting. For every NP search problem Π, there exists an obvious polynomial time Turing reduction of Π to an NP decision problem Π′. An instance of Π′ comprises an instance x of Π and a string u; the corresponding question is whether there exists a witness w for x (with respect to Π) with an initial segment u. This simple reduction does not work for RNP problems; it violates the domination condition. Using substantially more sophisticated randomizing Turing reductions, Ben-David, Chor, Goldreich and Luby were able to prove that every RNP search problem reduces to an appropriate RNP decision problem [BCGL].

6. Revision 2: P-samplable distributions

We return to the definition of domains in Section 2 and discuss probability distributions. For simplicity, restrict attention to domains where the universe is the set {0, 1}* of all binary strings and the size of a string is its length. In this case, the domain is completely defined by the probability distribution, and one may speak about the uniform distribution and about one distribution dominating another.

A probability distribution on {0, 1}* is called P-time computable if there exists a polynomial time algorithm that, given a string x, computes (a good approximation to) the probability of the collection of strings y such that |y| < |x| or else |y| = |x| and y precedes x in the lexicographical order. For example, the uniform distribution is P-time. The restriction to P-time distributions was used by Levin to construct the first RNP complete problem [Le1]. This restriction turns out to be too strict.

What distributions are likely to come up in applications? It is natural to assume that some randomizing algorithm is used to generate instances of a problem in question. Then there exists a computable function h(r) from sequences of coin flips to instances of our problem such that the probability µ(x) of an instance x is proportional to the uniform probability of h^(-1)(x). (We say "proportional" rather than "equal" because the generating algorithm is not required to always terminate.)

Remark. Every such distribution µ, never mind the complexity of the generating algorithm, is dominated by the so-called universal distribution reflecting the information complexity. Li and Vitányi notice that, in the case of the universal distribution, the average-case time complexity is "of the same order of magnitude as the corresponding worst-case complexity" [LV]. The idea is that, in particular, the universal distribution dominates the distribution that concentrates exclusively on the worst-case instances. In practice, of course, the generating algorithms satisfy severe resource bounds and the average-case complexity is often much lower than the worst-case complexity.

It is possible that the function h is easily invertible and preserves the order of strings. For example, h(r) may be the concatenation of r and some string h′(r). In such a case, the distribution µ is P-time. In general, however, one cannot count on µ being P-time. One may want to distinguish between the case when the µ in question is produced by nature or some other disinterested party and the case when the distribution is produced by an adversary. Even a disinterested party may inadvertently mess things up. Certainly one would expect an adversary to mess things up.
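A tiny generator makes the relation between h and µ concrete. The following toy machine is our own example, not one from the paper: it reads coin flips r, the number of leading 1s determines a length n, and the next n flips become the output string h(r). The sketch enumerates all short computations and accumulates µ(x) as the uniform measure of h^(-1)(x).

    from fractions import Fraction

    def h(bits):
        """Return (instance, coins consumed) or None if the computation needs more coins."""
        i = 0
        while i < len(bits) and bits[i] == 1:
            i += 1
        if i == len(bits):
            return None                      # still reading the unary length prefix
        n = i
        if len(bits) < i + 1 + n:
            return None                      # not enough coins for the output yet
        out = "".join(str(b) for b in bits[i + 1 : i + 1 + n])
        return out, i + 1 + n

    def induced_distribution(max_coins=12):
        # mu(x) is proportional to the uniform measure of h^{-1}(x): every terminating
        # computation that consumes m coins contributes 2^-m.  Computations needing more
        # than max_coins coins are ignored, which is where "proportional" enters.
        mu = {}

        def explore(prefix):
            res = h(prefix)
            if res is not None and res[1] == len(prefix):
                out = res[0]
                mu[out] = mu.get(out, Fraction(0)) + Fraction(1, 2 ** len(prefix))
                return
            if len(prefix) < max_coins:
                explore(prefix + [0])
                explore(prefix + [1])

        explore([])
        return mu

    if __name__ == "__main__":
        mu = induced_distribution()
        for x in ("", "0", "1", "00", "11"):
            print(f"mu({x!r}) = {mu[x]}")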

The following definition is implicit in [Le2] and explicit in [BCGL].

Definition 6.1. A distribution is P-samplable if it is generated by a coin-flipping Turing machine M (with no additional input) such that the length of every terminating computation of M is bounded by a fixed polynomial of the output.

Ben-David, Chor, Goldreich and Luby prove that (i) every P-time distribution is P-samplable, and (ii) if there exists a one-way function (a polynomial time computable function that is hard – in some technical sense – to invert) then there is a P-samplable distribution which is not dominated by any P-time distribution [BCGL]. Fortunately, Impagliazzo and Levin were able to prove the following theorem.

Theorem 6.1 [IL]. Every NP search problem with a P-samplable distribution reduces to an NP search problem with the uniform distribution.

7. Medians and boundedness on average

In this section, T is a function from a sample space X to [0..∞]. The function P[T > t] of t is continuous from the right, i.e., P[T > t] = inf{P[T > s] : s > t}. Also P[T ≤ t] is continuous from the right. P[T < t] and P[T ≥ t] are continuous from the left. Consider the set {t : P[T ≥ t] ≥ 1/2}. For example, it contains

M(T) = sup{t : P[T ≥ t] ≥ 1/2}.

If M(T) = 0 then P[T ≥ M(T)] = 1 > 1/2; otherwise, by the continuity from the left, P[T ≥ M(T)] = inf{P[T ≥ s] : s < M(T)} ≥ 1/2. If M(T) = ∞ then P[T ≤ M(T)] = 1 > 1/2; otherwise, by the continuity from the right, P[T ≤ M(T)] = sup{P[T ≤ s] : s > M(T)} ≥ 1/2. The number M(T) may be called the upper median for T. For brevity, we will say that M(T) is the median for T.

Notice that M(T) ≤ 2E(T). For, T is a function on some sample space X. Define another function

T′(x) = M(T) if T(x) ≥ M(T), and T′(x) = 0 otherwise.

Then T′ ≤ T and E(T) ≥ E(T′) = M(T) · P[T ≥ M(T)] ≥ M(T)/2.

Further, let T1, T2, ... be independent random variables distributed as T, let M(T, k) be the median of the sum T1 + ... + Tk, and, for 0 < a < 1, let Q(T, a) = sup{t : P[T ≥ t] ≥ a}; thus M(T, 1) = M(T) = Q(T, 1/2).

Lemma 7.1. There exists a constant a > 0 such that, for all sufficiently large k, Q(T, 1/k) ≤ M(T, k) ≤ k · Q(T, a/k).

Proof. Let s = Q(T, 1/k) and t = Q(T, a/k). Define A(x) = s if T(x) ≥ s and A(x) = 0 otherwise, and define B(x) = T(x) if T(x) ≤ t and B(x) = ∞ otherwise, so that A(x) ≤ T(x) ≤ B(x) for all x. Let A1, ..., Ak (resp. B1, ..., Bk) be independent random variables distributed as A (resp. B); then M(A, k) ≤ M(T, k) ≤ M(B, k). Since P[A = s] = P[T ≥ s] ≥ 1/k, we have P[A1 + ... + Ak ≥ s] ≥ 1 − (1 − 1/k)^k > 1/2 and therefore M(A, k) ≥ s. We prove the second inequality. Since (1 − a/k)^k converges to e^(-a), which exceeds 1/2 when a < ln 2, let k be sufficiently large so that (1 − a/k)^k > 1/2. It suffices to prove that M(B, k) < kt + ε for every ε > 0. Let q = P[B ≤ t], so that q = P[T ≤ t] ≥ 1 − a/k. Clearly, B1 + ... + Bk ≤ kt whenever every Bi ≤ t; hence P[B1 + ... + Bk ≥ kt + ε] ≤ 1 − q^k < 1/2 and therefore M(B, k) < kt + ε. QED

Call T bounded on average if there exists ε > 0 such that E(T^ε) < ∞.

Theorem 7.1. T is bounded on average if and only if Q(T, 1/k) is bounded by a polynomial of k if and only if M(T, k) is bounded by a polynomial of k.

Proof. By Lemma 7.1, it suffices to prove only the first equivalence. Let q(k) = Q(T, 1/k). Without loss of generality, T is unbounded, P[T = ∞] = 0 and ∞ is the only limit point for the range of T. (If the range of T has other limit points, replace T with ⌈T⌉.) Then q(k) = sup{t : P[T ≥ t] ≥ 1/k} = max{t : P[T ≥ t] ≥ 1/k} and therefore P[T = q(k)] > 0. Further, without loss of generality, we may assume that the sequence q(k) is strictly monotone. (If q(k) = q(k+1) = ... = q(k+j−1) < q(k+j), pass to the subsequence of distinct values.)

It is clear that if q(k) is bounded by a polynomial of k then T is bounded on average. On the other hand, every value of T belongs to some interval [q(k), q(k+1)), and we have

P[q(k−1) < T ≤ q(k)] ≤ P[T > q(k−1)] − P[T ≥ q(k+1)] ≤ 1/(k−1) − 1/(k+1) = O(k^(-2)),

P[q(k−1) < T ≤ q(k+2)] ≥ P[T ≥ q(k)] − P[T > q(k+2)] ≥ 1/k − 1/(k+2) = Ω(k^(-2)).

Comparing E(T^ε) with the series Σ q(k)^ε · k^(-2) by means of these estimates, one checks that E(T^ε) < ∞ for some ε > 0 if and only if q(k) is bounded by a polynomial of k. QED

Now fix a domain. The notion of boundedness on average allows us to give the following useful characterization of AP functions.

Lemma 7.2. A function T from some domain D to [0..∞] is AP if and only if there exists a polynomial p and a bounded on average function B from D to [0..∞] such that T(x) = p(|x|) · B(x).

Proof. If E(T(x)^(1/k)/|x|) < ∞, the desired B(x) = T(x)/|x|^k. Suppose that T(x) ≤ |x|^i · B(x), E(B(x)^(1/j)) < ∞ and k = max(i, j). Then E(T(x)^(1/k)/|x|) ≤ E(B^(1/j)) < ∞. QED
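The quantities in Lemma 7.1 are easy to probe numerically. The sketch below is our own illustration: for a Pareto-like T with P[T ≥ t] = 1/t (t ≥ 1), we estimate the median M(T, k) of the sum of k independent copies by simulation and compare it with the quantiles Q(T, 1/k) and k · Q(T, a/k).

    import random
    import statistics

    def sample_T(rnd):
        # Inverse-CDF sampling for P[T >= t] = 1/t, t >= 1.
        return 1.0 / (1.0 - rnd.random())

    def Q(a):
        # For this T, Q(T, a) = sup{t : P[T >= t] >= a} = 1/a.
        return 1.0 / a

    def M(k, trials=4000, seed=1):
        rnd = random.Random(seed)
        sums = [sum(sample_T(rnd) for _ in range(k)) for _ in range(trials)]
        return statistics.median(sums)

    if __name__ == "__main__":
        a = 0.5   # a constant as in Lemma 7.1; we pick 1/2 for concreteness
        for k in (2, 8, 32, 128):
            print(f"k={k:4d}  Q(T,1/k)={Q(1/k):8.1f}  M(T,k)~{M(k):10.1f}"
                  f"  k*Q(T,a/k)={k * Q(a / k):10.1f}")

Here Q(T, 1/k) = k, and the observed medians fall comfortably between the two bounds of Lemma 7.1.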

It is easy to check that the class of bounded on average functions is closed under, say, pointwise maxima (h(x) = max(f(x), g(x))) and products. Hence the class of AP functions is closed under pointwise maxima and products.

8. One alternative to P=?NP

Is theoretical computer science a part of mathematics? A good case can be made in favor of the thesis that theoretical computer science is a part of applied mathematics. The word "applied" is essential here. Even a remote connection to possible applications gives important guidance. A case in point is the famous question P=?NP, so very popular in theoretical computer science. The importance of the P=?NP question is related to the thesis identifying feasible and polynomial time computations, thesis (1.0) of Section 1. (There is a beautiful article of Trakhtenbrot on the original motivation for the P=?NP question [Tr].) In this connection, one who isn't convinced that P captures feasibility may question the centrality of the P=?NP question. A purer mathematician may be unmoved. The question was asked. The challenge was posed. Now is the time to solve the question rather than to try to get around it.

In [Gu2], the P=?NP question was criticized for a bias toward the positive solution and an alternative question RAP=?RNP, the counterpart of the P=?NP question in the average-case approach, was advertized. In that connection, the following game between Challenger and Solver was considered. Challenger chooses an NP problem Π and then repeatedly picks instances of Π, and Solver tries to solve them. If P=NP then Solver can win the game: By the more convincing part (1.1) of the thesis (1.0), polynomial time algorithms are feasible. On the other hand, it is not clear that Challenger can make Solver work much harder than he does if P ≠ NP. It may be very hard to find hard instances of Π.

Now assume that Challenger employs a randomizing algorithm generating instances of Π in time polynomial in the output. (Polynomiality in the output is a feasibility requirement in this context. There is a limit on Challenger's time.) This makes Π an RNP problem. We consider (randomizing) decision or search algorithms feasible if they run in time polynomial on average. Thus, Solver can win if RAP=RNP.

What happens if RAP ≠ RNP? Can Challenger win the game? The answer seems to be yes. The following argument involves medians.

Suppose that Π is any RNP problem that is not RAP, and T_S(x, r) is the running time of Solver's algorithm on instance x and a sequence r of coin flips. By Theorem 7.1, the median M(T_S, k) of the time that Solver needs to solve k instances is not bounded by a polynomial of k. Furthermore, Challenger's generating process may be altered in such a way that the expectation E(T_C) of Challenger's time for generating one instance is bounded [Gu2]. Then the expected time for generating independently k instances is Θ(k) and therefore (recall the remark that M(T) ≤ 2E(T) in Section 7) the median M(T_C, k) of Challenger's time needed to produce k instances is Θ(k). Thus, the median M(T_S, k) of Solver's time is not bounded by any polynomial of the expectation k · E(T_C) or the median M(T_C, k) of Challenger's time.

It is easy to see that RAP ≠ RNP if there exists a one-way function. It is not known whether the converse is true. The converse fails under some oracle [IR], but the question is open and very exciting. The existence of one-way functions allows cryptography in a meaningful sense [ILL, Ha]. If the existence of one-way functions follows from RAP ≠ RNP then the question RAP=?RNP is beautifully balanced. Either all RNP problems are easy on average or else there are problems that are hard on average but then they can be used to do cryptography.

Acknowledgement. We are happy to thank Andreas Blass and Leonid Levin for most useful discussions, Ramarathnam Venkatesan and Moshe Vardi for comments on a draft of this paper, and our student Jordan Stojanovski who was asked to prove Theorem 7.1 and did a good job.

References

[BCGL] Shai Ben-David, Benny Chor, Oded Goldreich and Michael Luby, "On the Theory of Average Case Complexity", Symposium on Theory of Computing, ACM, 1989, 204–216.

[BG] Andreas Blass and Yuri Gurevich, "On the Reduction Theory for Average-Case Complexity", in Proc. of CSL'90, 4th Workshop on Computer Science Logic (Eds. E. Börger, H. Kleine Büning and M. Richter), Springer LNCS, 1991.

[GJ] Michael R. Garey and David S. Johnson, "Computers and Intractability: A Guide to the Theory of NP-Completeness", Freeman, New York, 1979.

[Go] Oded Goldreich, "Towards a Theory of Average Case Complexity: A Survey", TR-531, Computer Science Dept., Technion, Haifa, Israel, March 1988.

[Gr] Per Grape, "Complete Problems with L-Samplable Distributions", 2nd Scandinavian Workshop on Algorithm Theory, Springer Lecture Notes in Computer Science 447, 1990, 360–367.

[Gu1] Yuri Gurevich, "Average Case Complexity", J. Computer and System Sciences (a special issue on FOCS'87), to appear.

[Gu2] Yuri Gurevich, "The Challenger-Solver Game: Variations on the Theme of P=?NP", Bulletin of the European Assoc. for Theor. Computer Science, October 1989, 112–121.

[Gu3] Yuri Gurevich, "Matrix Decomposition Problem is Complete for the Average Case", Symposium on Foundations of Computer Science, IEEE Computer Society Press, 1990, 802–811. A full version of this paper, coauthored by Blass and Gurevich, is being prepared for publication.

[GS] Yuri Gurevich and Saharon Shelah, "Expected Computation Time for Hamiltonian Path Problem", SIAM J. on Computing 16:3 (1987), 486–502.

[Ha] Johan Håstad, "Pseudo-Random Generators under Uniform Assumptions", Symposium on Theory of Computing, ACM, 1990, 395–404.

[IR] Russell Impagliazzo and Steven Rudich, private communication.

[IL] Russell Impagliazzo and Leonid A. Levin, "No Better Ways to Generate Hard NP Instances than Picking Uniformly at Random", Symposium on Foundations of Computer Science, IEEE Computer Society Press, 1990, 812–821.

[ILL] Russell Impagliazzo, Leonid A. Levin and Michael Luby, "Pseudo-Random Generation from One-Way Functions", 21st Symposium on Theory of Computing, ACM, New York, 1989, 12–24.

[Jo] David S. Johnson, "The NP-Completeness Column", Journal of Algorithms 5 (1984), 284–299.

[Kn] P.M.W. Knijnenburg, "On Randomizing Decision Problems: A Survey of the Theory of Randomized NP", Tech. Report RUU-CS-88-15, Rijksuniversiteit Utrecht, The Netherlands, March 1988.

[Le1] Leonid A. Levin, "Average Case Complete Problems", STOC 1984; the final version in SIAM Journal on Computing, 1986.

[Le2] Leonid A. Levin, "One-Way Functions and Pseudo-Random Generators", Symposium on Theory of Computing, ACM, 1985, 363–375.

[LV] Ming Li and Paul M. B. Vitányi, "Average Case Complexity under the Universal Distribution Equals Worst Case Complexity", Manuscript, 1989.

[Sc] Robert E. Schapire, "The Emerging Theory of Average Case Complexity", Tech. Report MIT/LCS/TM-431, June 1990.

[Tr] Boris A. Trakhtenbrot, "A Survey of Russian Approaches to Perebor (Brute-Force Search) Algorithms", Annals of the History of Computing 6:4 (1984), 384–400.

[VL] Ramarathnam Venkatesan and Leonid Levin, "Random Instances of a Graph Coloring Problem are Hard", Symposium on Theory of Computing, ACM, 1988, 217–222.

[Ve] Ramarathnam Venkatesan, private correspondence.

[Wi] Herbert S. Wilf, "Some Examples of Combinatorial Averaging", American Math. Monthly 92 (1985), 250–261.
