Probabilistically Checkable Debate Systems and Approximation Algorithms for PSPACE-Hard Functions (Extended Abstract)*

Anne Condon†  Joan Feigenbaum‡  Carsten Lund§  Peter Shor¶

† University of Wisconsin, Computer Sciences Department, 1210 West Dayton Street, Madison, WI 57306 USA, condon@cs.wisc.edu. Supported in part by NSF grants CCR-9100886 and CCR-9257241.
‡ AT&T Bell Laboratories, Room 2C473, 600 Mountain Avenue, P. O. Box 636, Murray Hill, NJ 07974-0636 USA, jf@research.att.com.
§ AT&T Bell Laboratories, Room 2C324, 600 Mountain Avenue, P. O. Box 636, Murray Hill, NJ 07974-0636 USA, lund@research.att.com.
¶ AT&T Bell Laboratories, Room 2D149, 600 Mountain Avenue, P. O. Box 636, Murray Hill, NJ 07974-0636 USA, shor@research.att.com.
* The full version of this paper has been submitted for journal publication and is available as DIMACS TR 93-10.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
25th ACM STOC '93-5/93/CA, USA
© 1993 ACM 0-89791-591-7/93/0005/0305...$1.50

Abstract

We initiate an investigation of probabilistically checkable debate systems (PCDS's), a natural generalization of the probabilistically checkable proof systems studied in [1, 2, 3, 8]. A PCDS for a language L consists of a probabilistic polynomial-time verifier V and a debate between player 1, who claims that the input x is in L, and player 0, who claims that the input x is not in L. We show that there is a PCDS for L in which V flips O(log n) random coins and reads O(1) bits of the debate if and only if L is in PSPACE. This characterization of PSPACE is used to show that certain PSPACE-hard functions are as hard to approximate as they are to compute exactly.

1 Introduction

Suppose that two candidates, B and C, agree to a debate format. Voter V is too busy to catch more than a very small number of bits of the debate. How does V decide which of B or C won the debate? In this paper, we show that if B and C choose the right debate format, V's problem is solved. By listening to a few randomly chosen sound bites of the debate, V can with near certainty figure out who won.

Similarly, suppose that B or C is giving a speech to a set of voters V_1, ..., V_n, represented by finite automata. He would like to give the speech that results in acceptance (votes) by the greatest number of V_i's. We show that not only can he not compute this maximum exactly, but he cannot come within an arbitrary constant factor, unless he has access to an oracle (political consultant) with the full power of PSPACE.

Our work builds on the recent progress that has been made in the theory of probabilistically checkable proof systems (PCPS's). Results about the language-recognition power of PCPS's have led to lower bounds on the difficulty of approximating NP-hard functions. In this paper, we define probabilistically checkable debate systems (PCDS's). We prove several results about the language-recognition power of PCDS's and then use them to obtain lower bounds on the difficulty of approximating PSPACE-hard functions.

Let us describe the background for this work in more detail. Loosely speaking, a language L has a PCPS if, for every x ∈ L, there is a string π such that a probabilistic verifier V can be convinced with high probability that x ∈ L. The class PCP(r(n), q(n)) consists of those languages recognizable by PCPS's in which the verifier uses O(r(n)) coin flips and looks at O(q(n)) bits. It was recently shown that PCP(log n, 1) = NP (cf. [1, 2]).

Results on the power of classes PCP(r(n), q(n)) can be used to show that many important approximation
problems are hard, unless there is some unexpected collapse of complexity classes. The first result along these lines showed that MAX-CLIQUE was difficult to approximate [8]. The seminal result of [8] has been improved several times, and it is now known that there is an ε such that approximating MAX-CLIQUE within a factor of n^ε is as difficult as solving NP-complete problems exactly [1]. Furthermore, there is a large class of natural optimization problems, those complete for the class MAX-SNP defined in [15], that do not have polynomial-time approximation schemes unless P = NP; that is, for each of these problems, there is an ε such that approximating the optimal solution within ratio ε is as hard as solving NP-complete problems exactly [1]. This result on MAX-SNP shows that many well-known optimization problems are hard to approximate closely, including Traveling Salesman with Triangle Inequality, MAX-SAT, and MAX-CUT.

A PCDS is a generalization of a PCPS. In a PCDS for L, there are two computationally powerful players, 1 and 0 (called B and C at the beginning of this section) and a probabilistic polynomial-time verifier V. Players 1 and 0 play a game in which they alternate writing out strings on a debate tape π. Player 1's goal is to convince V that an input x ∈ L, and player 0's goal is to convince V that x ∉ L. When the debate is over, V looks at x and π and decides whether x ∈ L (player 1 wins the debate) or x ∉ L (player 0 wins the debate). Suppose V flips O(r(n)) random coins and reads O(q(n)) bits of π. If, under the best strategies of players 1 and 0, V's decision is correct with high probability, then we say that L is in PCD(r(n), q(n)).

Specifically, we say that a language L is in PCD(r(n), q(n)) if it has a nonadaptive PCDS with one-sided error in which players 1 and 0 write on a debate tape, and then V makes O(r(n)) coin flips and queries O(q(n)) bits based on these coin flips. By nonadaptive, we mean that the choice of bits queried by V is based solely on the input and the coin flips. By one-sided error, we mean that whenever x ∈ L, V must correctly decide that x ∈ L, no matter which sequence of O(r(n)) coins is flipped (assuming correct play on the part of player 1). When x ∉ L, V is allowed to conclude incorrectly that x ∈ L with probability at most ε, for some fixed ε < 1.

With the above definition in hand, we can state our main results about the language-recognition power of PCDS's.

Theorem: PSPACE = PCD(log n, 1).

This result is best possible, because we can show that PCD(log n, q(n)) is contained in PSPACE, for any function q.

The following is a technical building block, interesting in its own right, that is used in the proof that PSPACE = PCD(log n, 1): If r(n) = Ω(log n), then PCD(r(n), q(n)) contains the same languages if the verifier reads O(q(n)) rounds of the debate as it does if the verifier reads O(q(n)) bits of the debate.

We then use our main result about the language-recognition power of PCDS's to prove lower bounds on the difficulty of approximating PSPACE-hard functions. Let MAX Q3SAT be the following natural optimization version of the canonical PSPACE-complete language QBF. Suppose Φ = Q_1 x_1 Q_2 x_2 ... Q_n x_n φ(x_1, x_2, ..., x_n) is a quantified boolean formula, with Q_i ∈ {∃, ∀}, and φ in 3CNF. Suppose that the variables of the formula are chosen, in order of quantification, by two players 0 and 1, where player 0 chooses the universally quantified variables and player 1 chooses the existentially quantified variables. If player 1 can guarantee that k clauses of φ will be satisfied by the resulting assignment, regardless of what player 0 chooses, we say that k clauses of Φ are simultaneously satisfiable. We let MAX Q3SAT be the function that maps a quantified 3CNF formula Φ to the maximum number of simultaneously satisfiable clauses.

Theorem: There is a constant 0 < ε < 1 such that approximating MAX Q3SAT within ratio ε is PSPACE-hard. Thus MAX Q3SAT is as hard to approximate as it is to compute exactly.

We use reductions to prove that certain other PSPACE-hard functions are PSPACE-hard to approximate in a stronger sense. These include maximization versions of the Finite Automata Intersection problem, shown PSPACE-complete by Kozen [12], and the Generalized Geography problem, shown PSPACE-complete by Schaefer [17]. We show that there is a constant ε such that approximating these problems within ratio n^ε is PSPACE-hard.

The rest of this paper is organized as follows. We define PCDS's, and all of our other terms, precisely in Section 2. Our results on the language-recognition power of PCDS's are given in Section 3. Those on approximation of PSPACE-hard functions are given in Section 4. Section 5 contains open questions and a discussion of subsequent related results.

We have omitted some proofs because of space limitations; they can all be found in our journal submission (DIMACS TR 93-10). These results first appeared in our Technical Memorandum [6].
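For intuition, the game that defines MAX Q3SAT can be evaluated by exhaustive minimax on very small instances. The following is a minimal sketch, not an algorithm from this paper: the function name `max_q3sat`, the encoding of quantifiers as the letters 'E' and 'A', and the representation of a literal as a (variable index, sign) pair are all assumptions made for illustration. The running time is exponential in the number of variables, as one would expect for a PSPACE-hard function.

```python
from typing import List, Tuple

# Illustrative encoding (an assumption, not from the paper):
# a literal is (variable_index, is_positive); a clause is a list of literals.
Clause = List[Tuple[int, bool]]


def max_q3sat(quantifiers: str, clauses: List[Clause]) -> int:
    """Maximum number of clauses player 1 can guarantee to satisfy.

    quantifiers[i] is 'E' if player 1 chooses variable i (existential)
    and 'A' if player 0 chooses it (universal). Variables are assigned
    in order of quantification. Exhaustive minimax, tiny inputs only.
    """
    n = len(quantifiers)

    def count_satisfied(assignment: List[bool]) -> int:
        # A clause is satisfied if at least one literal agrees with the assignment.
        return sum(
            any(assignment[var] == positive for var, positive in clause)
            for clause in clauses
        )

    def play(i: int, assignment: List[bool]) -> int:
        if i == n:
            return count_satisfied(assignment)
        outcomes = [play(i + 1, assignment + [b]) for b in (False, True)]
        # Player 1 maximizes at existential variables, player 0 minimizes
        # at universal variables.
        return max(outcomes) if quantifiers[i] == "E" else min(outcomes)

    return play(0, [])


# Example: clauses (x0 or x1), (not x0 or not x1), (x0 or not x1).
example = [[(0, True), (1, True)], [(0, False), (1, False)], [(0, True), (1, False)]]
print(max_q3sat("EA", example))  # player 1 picks x0, then player 0 picks x1
```

With the quantifier prefix "EA", player 0 can always falsify one of the three example clauses, so only two are simultaneously satisfiable; with the prefix "EE" all three are.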

2 Preliminaries

We first review the definition of a PCPS. A verifier is a probabilistic polynomial-time Turing machine that takes as input a pair x, π, where x ∈ {0, 1}*, and either accepts or rejects. A language L has a probabilistically checkable proof system, or PCPS, with error probability ε if there is a verifier V with the following properties.

● For all x in L, there is a string π such that V accepts with probability 1 on input x, π.

● For all x not in L, on all strings π, V accepts with probability at most ε on input x, π.

We say that the verifier makes q(n) queries if the number of bits of π read by the verifier is at most q(n), when the input is of size n. PCP(r(n), q(n)) is the class of languages that have probabilistically checkable proof systems with error probability 1/3 in which the verifier uses O(r(n)) random bits and O(q(n)) queries.

We next extend this to define PCDS's. A probabilistically checkable debate system, or PCDS, consists of a verifier V and a debate format D. As before, the verifier is a probabilistic polynomial-time Turing machine that takes as input a pair x, π, where π ∈ {0, 1}*, and outputs 1 or 0. We interpret these outputs to mean "player 1 won the debate" and "player 0 won the debate," respectively.

A debate format is a pair of functions f(n), g(n). Informally, for a fixed n, a debate between two players, 0 and 1, consistent with format f(n), g(n), contains g(n) rounds. At round i ≥ 1, player i mod 2 chooses a string of length f(n).

For each n, corresponding to the debate format D is a debate tree. This is a complete binary tree of depth f(n)g(n) such that, from any node, one edge is labeled 0 and the other is labeled 1. A debate is any string of length f(n)g(n). Thus, there is a one-to-one correspondence between debates and the paths in the debate tree. Moreover, a debate is the concatenation of g(n) substrings of length f(n). Each substring is called a round of the debate, and each debate of this debate tree has g(n) rounds.

Again for a fixed n, a debate subtree is a subtree of the debate tree of depth f(n)g(n) such that each node at level i (the root is at level 0) has one child if i div f(n) is even, and it has two children if i div f(n) is odd. Informally, the debate subtree corresponds to a list of "responses" of player 1, against all possible "arguments" of player 0 in the debate.

A language L has a PCDS with error probability ε if there is a pair (D = (f(n), g(n)), V) with the following properties.

● For all x in L, there is a debate subtree such that, for all debates π labeling a path of this subtree, V outputs 1 with probability 1 on input x, π. In this case, we say that x is accepted by (D, V).

● For all x not in L, on all debate subtrees, there exists a debate π labeling some path of the subtree such that V outputs 1 with probability at most ε on input x, π. In this case, we say that x is rejected by (D, V).

This definition allows "one-sided error," analogous to the type of errors that are allowed in the complexity class coRP. All our results also hold for a "zero-sided error" definition, with three possible outputs, 1, 0, and Λ, for "player 1 won," "player 0 won," and "I don't know who won," respectively. In this case, the verifier must never declare the losing player to be a winner, but it may, both in the case that x ∈ L and in the case that x ∉ L, say that it doesn't know who won.

As in the theory of PCPS's, we say that the verifier makes q(n) queries if the number of bits of π read by the verifier is at most q(n) when the input is of size n. The verifier V in a PCDS is required to be nonadaptive, by which we mean that the bits of π read by V depend solely on the input and the coin flips. If L has a PCDS with error probability 1/3 in which V flips O(r(n)) coins and reads O(q(n)) bits of π, we say that L ∈ PCD(r(n), q(n)).

It will be convenient in later proofs to use a generalized class, GPCD(r(n), q(n)). GPCD(r(n), q(n)) is defined exactly as PCD(r(n), q(n)), except that the verifier of a GPCDS nonadaptively queries O(q(n)) rounds of the debate π (rather than O(q(n)) bits of the debate). Thus, in a GPCDS, there is no restriction on the number of bits queried by the verifier in each round.

Next, we give some definitions relating to approximability of PSPACE-hard functions. Let f be any real-valued function with domain D ⊆ {0, 1}*. Let A be an algorithm that, on input x ∈ {0, 1}*, produces an output A(x). We say that A approximates f within ratio ε(n), 0 < ε(n) < 1, if for all x ∈ D, ε(|x|) ≤ A(x)/f(x) ≤ 1/ε(|x|). If ε(n) > 1, then "A approximates f within ratio ε(n)" means that 1/ε(|x|) ≤ A(x)/f(x) ≤ ε(|x|). If algorithm A computes the function g, we also say that g approximates f within ratio ε.

The function f has a polynomial-time approximation scheme, or PTAS, if for each ε, 0 < ε < 1, there is a polynomial-time algorithm A that approximates f within ratio ε.

We say that a function g is PSPACE-hard if PSPACE ⊆ P^g, i.e., if every language in PSPACE is polynomial-time reducible to g. By "approximating f within ratio ε(n) is PSPACE-hard," we mean that, if g approximates f within ratio ε(n), then g is PSPACE-hard.

3 Complexity-Theoretic Results

Our first theorem on the language-recognition power of PCDS's addresses the question of whether verifiers that read O(q(n)) rounds of the debate tape have more power than verifiers that read O(q(n)) bits of the debate tape. Surprisingly, for r(n) = log n, the answer is no.

Theorem 3.1 For every q(n), PCD(log n, q(n)) = GPCD(log n, q(n)).

Sketch of proof: We use the results in [1, 3] that show that, given q input strings and an NP predicate about the strings, it is possible to write down a probabilistically checkable proof that these strings satisfy the predicate. This proof can be verified probabilistically using O(log n) random bits and O(q) query bits (of the proof and the inputs), provided that the inputs are encoded using a good error-correcting code.

Given a GPCDS D for some language L, we construct a PCDS D' for L by simulating D and forcing the players to encode their moves using the error-correcting code. After the simulated version of D has been played, we let player 1 play one more move in which, for each random seed R of V, player 1 proves using a probabilistically checkable proof that V(R) accepts the simulated game. Note that the statement "V(R) accepts" is a predicate that is computable in polynomial time from the O(q) inputs that V(R) reads.

Some technical problems arise, because player 0 may not properly encode some or all of its moves in the simulated version of D. We show in our journal version how to overcome these problems. ∎

If r(n) = o(log n), then it is not necessarily true that GPCD(r(n), q(n)) ⊆ PCD(r(n), q(n)). For example, it is clear that GPCD(0, 1) is the entire polynomial-time hierarchy, whereas PCD(0, 1) is just P.

Let Φ = ∃x_1 ∀x_2 ... Q_n x_n φ(x_1, ..., x_n) be an instance of QBF; without loss of generality, we assume that quantifiers alternate strictly. The assignment tree A for Φ is the complete binary tree of depth n, where one edge from every internal node is labeled "true" and the other "false." Each path P in the tree corresponds to an assignment of the variables; we say P satisfies φ if this assignment satisfies φ. Call edges at odd-numbered levels (that is, corresponding to existentially quantified variables) 1-edges and edges at even-numbered levels 0-edges. An ∃-assignment subtree A_1 is a subtree of A that has two 0-edges from each node at each even level and one 1-edge from each node at each odd level. Similarly, a ∀-assignment subtree A_0 is a subtree of A that has two 1-edges from each node at each odd level and one 0-edge from each node at each even level. An ∃-assignment subtree A_1 is optimal if it maximizes (over all ∃-assignment subtrees) the number of paths that satisfy φ. Similarly, a ∀-assignment subtree A_0 is optimal if it maximizes (over all ∀-assignment subtrees) the number of paths that do not satisfy φ. Note that if Φ ∈ QBF, all paths of an optimal ∃-assignment subtree satisfy φ, whereas if Φ ∉ QBF, then no path of an optimal ∀-assignment subtree satisfies φ.

Theorem 3.2 PSPACE = GPCD(log n, 1).

Proof: The direction GPCD(log n, 1) ⊆ PSPACE is straightforward. To prove the other direction, we show that QBF ∈ GPCD(log n, 1). Let Φ = ∃x_1 ∀x_2 ... Q_n x_n φ(x_1, ..., x_n) be an instance of QBF in which quantifiers alternate strictly. Let A_0, A_1 be optimal ∀- and ∃-assignment subtrees of Φ, respectively. Note that Φ is a true QBF if and only if the unique path P of length n that is in both A_0 and A_1 satisfies φ.

We give a debate format and a protocol of players 0 and 1 that enable the players to record parts of the assignment subtrees A_0 and A_1 in such a way that a verifier can efficiently check whether or not path P satisfies φ.

In the debate, the players alternately play rounds, starting with player 1. Roughly, in one round, player i does two things: presents a challenge to player 1 - i and responds to previous challenges written by player 1 - i. A challenge by player 1 - i to player i is simply a path of the assignment subtree A_i that ends in a (1 - i)-edge. The response of player i to this challenge is the edge that extends this path in A_i (if the path is not already of length n).

Note that a challenge to player 1 (resp. 0) has to be a path of A_1 (resp. A_0). How can player 0 write down such a path without knowing A_1? Essentially, in round t, player 0 must present paths that are consistent with what player 1 wrote in rounds 1 through t - 1. We will elaborate on this point below.
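The relationship between optimal assignment subtrees and the truth of Φ can be checked by brute force on small formulas. The sketch below is illustrative only: the names `best_count` and `qbf_true` are hypothetical, φ is passed as a Python predicate on a tuple of truth values, and variables x_1, x_3, ... are taken to be existential because the prefix alternates strictly and starts with ∃. At a level where the subtree keeps one edge, the best choice is the child with the larger count; at a level where it keeps both edges, the counts add.

```python
from typing import Callable, Sequence, Tuple

Formula = Callable[[Sequence[bool]], bool]


def is_existential(level: int) -> bool:
    # Levels are 1-based variable indices; x_1, x_3, ... are existential.
    return level % 2 == 1


def best_count(n: int, phi: Formula, exists_subtree: bool) -> int:
    """Number of 'good' paths in an optimal assignment subtree.

    exists_subtree=True: optimal ∃-assignment subtree, counting paths
    that satisfy phi. exists_subtree=False: optimal ∀-assignment
    subtree, counting paths that do NOT satisfy phi.
    """
    def walk(prefix: Tuple[bool, ...]) -> int:
        level = len(prefix) + 1
        if level > n:
            return int(bool(phi(prefix)) == exists_subtree)
        below = [walk(prefix + (b,)) for b in (False, True)]
        if is_existential(level) == exists_subtree:
            return max(below)  # the subtree keeps a single edge here
        return sum(below)      # the subtree keeps both edges here

    return walk(())


def qbf_true(n: int, phi: Formula) -> bool:
    """Direct evaluation of the quantified formula, for comparison."""
    def walk(prefix: Tuple[bool, ...]) -> bool:
        level = len(prefix) + 1
        if level > n:
            return bool(phi(prefix))
        below = [walk(prefix + (b,)) for b in (False, True)]
        return any(below) if is_existential(level) else all(below)

    return walk(())


def phi_or(a: Sequence[bool]) -> bool:   # x1 or x2
    return a[0] or a[1]


def phi_eq(a: Sequence[bool]) -> bool:   # x1 equals x2
    return a[0] == a[1]


print(qbf_true(2, phi_or), best_count(2, phi_or, True))
print(qbf_true(2, phi_eq), best_count(2, phi_eq, False))
```

For n = 2 every assignment subtree has exactly two paths. The formula ∃x_1 ∀x_2 (x_1 ∨ x_2) is true and both paths of an optimal ∃-assignment subtree satisfy φ, while ∃x_1 ∀x_2 (x_1 = x_2) is false and neither path of an optimal ∀-assignment subtree satisfies φ, in line with the note above.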

Play proceeds as follows: In round t, one of the players writes a tree T_t. If player i is honest, then in round t ≡ i mod 2, player i writes the (unique) smallest subtree T_t of A_i such that T_t contains all challenges by player 1 - i at rounds j < t. The debate format is thus (f(n), g(n)), where g(n) is chosen to make the error probability small enough and f(n) is chosen to allow encoding of binary trees of the appropriate form; we will explain in detail how to choose f and g when we prove correctness of the protocol.

Before specifying the algorithm of the verifier V, we give some examples of debates in which player 1 is honest. The values assigned to the variables depend on the input formula and are unimportant for the purposes of this discussion.

Example 1: Suppose that player 0 is honest as well.

[Figure: debate of Example 1. Round 1: x1=T. Round 2: x1=T, x2=T. Round 3: x1=T, x2=T, x3=F. ... Round n: x1=T, x2=T, x3=F, ..., xn=T. Round n + 1: same as round n. ... Round g: same as round n.]

In round t, 1 ≤ t ≤ n, player i ≡ t mod 2 assigns a value to x_t by writing down a path of length t that extends the path written in round t - 1. The path written in round n is P, the intersection of A_1 and A_0. In rounds n + 1 through g, the path P is repeated in each round. V will declare player 1 the winner (i.e., accept the input) if and only if P satisfies φ.

Example 2: Suppose that Φ is a true QBF and that player 0 cheats in an effort to convince V to reject. One way player 0 may do this is to play a move that is not a subtree of A_1, that is, to lie about player 1's previous moves.

[Figure: debate of Example 2. Round 1: x1=F. Round 2: two paths, x1=T, x2=T and x1=F, x2=F. Round 3: x1=F, x2=F, x3=T. ...]

Note that, in round 3, the honest player 1 need only extend the FF path of the tree played by 0 in round 2. Because the TT path that appears in round 2 is not in A_1, i.e., because it is not consistent with player 1's move in round 1, it does not satisfy the definition of a challenge.

Example 3: Once again, suppose that Φ is true and that player 0 cheats. This time, player 0 does so by lying about player 0's own moves rather than those of player 1.

[Figure: debate of Example 3. Round 1: x1=F. Round 2: x1=F, x2=T. Round 3: x1=F, x2=T, x3=T. Round 4: x1=F, x2=F. Round 5: two paths, x1=F, x2=T, x3=T and x1=F, x2=F, x3=F. ...]

In this example, both the FF path of round 4 and the FT path of round 2 require a response by player 1 in round 5, because both are legitimate challenges, i.e., neither is inconsistent with player 1's previous moves.

A move T_t by an honest player i must have the following properties: (i) at most one edge from every node is an i-edge (because T_t is a subtree of A_i) and (ii) the edge to every leaf of depth < n + 1 is an i-edge (because player i responds to any recorded challenge by player 1 - i). A tree T_t satisfying these two properties is valid. (In Example 3 above, player 0 should choose which of the paths FTT or FFF of round 5 to extend in round 6, because extending both of them would result in an invalid move T_6.) Also, we define a valid challenge by player 1 - i to player i at round j to be the (unique) longest challenge that lies in T_j, if T_j is valid.

Note that if player i is honest, then T_{t-2} is a subtree of T_t. This is because both T_{t-2} and T_t respond to all valid challenges from rounds j < t - 2. Also, T_t has at most one more leaf than T_{t-2}, namely the leaf of the path that contains the valid challenge at round t - 1, if it lies on a different path from previous valid challenges.

We take the number of rounds to be g = g(n) > 4n; this will ensure that the error probability is no more than 2(n - 1)/(g - 3), as explained below. We let f(n), the length of a round, be such that any binary tree of depth at most n with at most g(n) leaves can be described using f(n) bits, via some simple encoding of trees to strings. This is sufficient, since the number of leaves of T_{g(n)} is at most g(n).

Below we state V's algorithm formally and prove it correct, but first we explain intuitively how V can catch a cheating player by examining only a constant number of rounds of the debate. Suppose that the input formula is true and hence that player 1 follows the protocol honestly. If they are valid, the trees T_g (player 1's last move) and T_{g-1} (player 0's last move) intersect in a unique path, say P(g). If it is of length n, then P(g) satisfies φ, and V will certainly declare player 1 the winner. Thus player 0 must try to "stall" and prevent P(g) from growing to length n. Consider a move T_t, where t < g - 1 is even. T_t contains a (unique) longest prefix of P(g), say P(t). Informally, we say that player 0 "cooperates" if |P(t)| > |P(t - 1)|. Player 0 can cooperate in fewer than n rounds, or else P(g) is of length n. Essentially, our formal proof shows that, if player 0 indeed cooperates in fewer than n rounds, then many intermediate moves T_t are either invalid or not subtrees of the final move T_{g-1}. Either way, V can detect that player 0 is cheating by examining only a constant number of randomly chosen moves.

We now give a formal statement of V's algorithm and a proof of its correctness. V first examines the last subtrees, T_{g-1} and T_g, written by players 0 and 1 respectively. If T_{g-1} is not valid, V accepts. If T_g is not valid, V rejects. Otherwise, let P(g) be the (unique)
path in both T_{g-1} and T_g. If the length of P(g) is n and P(g) satisfies φ, V accepts. If the length of P(g) is n and P(g) does not satisfy φ, then V rejects.

Otherwise, the length of P(g) is less than n. In this case, V chooses a random round t, 1 < t < g, in which player 1 plays, and examines rounds t and t - 1. If T_t is not valid or is not a subtree of T_g, V rejects. Similarly, if T_{t-1} is not valid or is not a subtree of T_{g-1}, V accepts. Otherwise, let P' be the longest path in both T_{t-1} and T_t. If the last edge of P' is not a 0-edge, then V rejects. Otherwise, V accepts. This completes the statement of V's algorithm.

Now, suppose that Φ ∈ QBF and that player 1 is honest. We show that V accepts with probability 1. This is true if the length of P(g) is n, since in this case P(g) is a path in A_1 and hence satisfies φ. If the length of P(g) is less than n, then for all rounds t in which player 1 plays, T_t is valid and a subtree of T_g. Suppose that T_{t-1} is also valid and a subtree of T_{g-1}. It remains to show that the longest path P' in both T_{t-1} and T_t must end in a 0-edge.

First, note that the length of P' must be < n; otherwise P' is contained in both T_{g-1} and T_g and therefore P' = P(g), which contradicts our assumption that the length of P(g) is < n. Also, since T_{t-1} is valid, all paths of T_{t-1} of length < n that end in a 1-edge are followed by a 0-edge. Furthermore, if P' ends in a 1-edge, the 0-edge following P' in T_{t-1} must also be in T_t, since the path formed by P' and this 0-edge is a prefix of the valid challenge of player 0 to player 1 at round t - 1, and player 1 is honest. This contradicts the fact that P' is the longest path in both T_{t-1} and T_t. Hence P' must end in a 0-edge.

We next show that if Φ ∉ QBF and player 0 is honest, then V accepts with probability at most 2(n - 1)/(g - 3). If the length of P(g) is n, V rejects, since in this case P(g) is a path in A_0 and hence does not satisfy φ. Hence suppose that the length of P(g) is less than n; thus player 1 cannot be honest. We claim that in this case, there can only be n - 1 values of t such that V accepts when rounds t and t - 1 are examined. Since V chooses t randomly and uniformly from (g - 3)/2 choices, the error probability is at most 2(n - 1)/(g - 3). The number of choices for t is (g - 3)/2, because V never chooses player 1's first move, since it is not preceded by a move of player 0.

If T_t is valid, let p(t) be the length of P(t), where P(t) is the longest prefix of P(g) in T_t. For any t such that player 1 plays in round t, we show that, if V accepts on examining rounds t and t - 1, then p(t) > p(j) for all j < t such that T_j is valid. This implies the claim, since then V accepts only the first round for which p(i) = 1, p(i) = 2, ..., p(i) = n - 1, and there are at most n - 1 such rounds.

Suppose, then, that V accepts on examining rounds t and t - 1. Suppose that j < t is a round of player 1, where T_j is valid. We need to show p(j) < p(t). We first show that P(j) is a prefix of P(t - 1), which implies that p(j) ≤ p(t - 1). This is because P(j), with the last edge removed if it is a 0-edge, is the prefix of a valid challenge of player 1 to player 0 at round j, and since player 0 is honest, player 0 responds to this challenge at round t - 1. To complete the proof, we show that p(t - 1) < p(t). Note that it must be the case that P', the longest path in both T_{t-1} and T_t, ends in a 0-edge, since V accepts. Also, this path must be a prefix of P(t - 1) and P(t). In fact, this path must equal P(t - 1). (Otherwise, the 1-edge following P' in P(t - 1) must be in T_g. Since T_t is a subtree of T_g, this 1-edge must be the 1-edge of T_t following P', contradicting the fact that P' is the longest path in both T_{t-1} and T_t.) Finally, the 1-edge following P' in T_t must be in T_g (since V checks for this) and also in T_{g-1} (since player 0 is honest). Thus P(t) contains P(t - 1) as a prefix, and it also contains an additional 1-edge. This implies that p(t) > p(t - 1) ≥ p(j), as required. ∎

Corollary 3.3 PSPACE = PCD(log n, 1).

4 Approximation Algorithms for PSPACE-hard Functions

In this section, we give many examples of PSPACE-hard functions that are hard to approximate. We consider maximization versions of the problem of deciding whether a quantified Boolean formula is true and show that one version can be approximated within ratio 1/2, yet there is some constant 1/2 < ε < 1 such that approximating the function within ratio ε is PSPACE-hard. We prove even stronger results for the maximization versions of several other PSPACE-complete problems. For example, there is a constant ε such that approximating Finite Automata Intersection (cf. Kozen [12]) and Generalized Geography (cf. Schaefer [17]) within ratio n^ε is PSPACE-hard.

We first consider variants of a well-known PSPACE-complete problem, that of deciding whether a quantified Boolean formula is true. Let Φ = Q_1 x_1 Q_2 x_2 ... Q_n x_n φ(x_1, x_2, ..., x_n), where each Q_i ∈ {∃, ∀}, each x_i is a Boolean variable, and φ is in 3 conjunctive normal form (3CNF). Suppose that the variables of Φ are chosen, in order of quantification, by two
players 0 and 1, where player 0 chooses the universally quantified variables and player 1 chooses the existentially quantified variables. If player 1 can guarantee that k clauses of Φ will be satisfied by the resulting assignment, regardless of what player 0 chooses, we say that k clauses of Φ are simultaneously satisfiable. We let MAX Q3SAT be the function, with domain the set of quantified formulas in 3CNF, that maps a quantified formula Φ to the maximum number of simultaneously satisfiable clauses.

The proof of the following theorem is similar to that of [1] on the non-approximability of MAX 3SAT; it uses the fact that PSPACE = PCD(log n, 1).

Theorem 4.1 There is a constant 0 < ε < 1 such that approximating MAX Q3SAT within ratio ε is PSPACE-hard.

We next provide an example of a function that can be approximated to within some constant ratio but cannot be approximated to within an arbitrary constant ratio, unless PSPACE = P. We say a quantified formula is balanced if every clause of the formula contains some existentially quantified variable. We define Balanced Q3SAT to be the set of those true quantified formulas in 3CNF that are also balanced. This language is easily seen to be PSPACE-complete. We define the corresponding function MAX Balanced Q3SAT to be the function MAX Q3SAT, restricted to the domain of balanced quantified formulas. Lemma 4.2 follows from techniques of Johnson [11], and Lemma 4.3 is a simple corollary of Theorem 4.1.

Lemma 4.2 There is a polynomial-time algorithm that approximates MAX Balanced Q3SAT within ratio 1/2.

Lemma 4.3 There is a constant 0 < ε < 1 such that approximating MAX Balanced Q3SAT within ratio ε is PSPACE-hard.

We now consider a problem from automata theory. Let FA-INT be the set of sequences A1, A2, ..., Am of deterministic finite-state automata having the same input alphabet Σ such that there exists a string w that is accepted by all the automata. Kozen [12] showed the problem to be PSPACE-complete. The function MAX FA-INT has as its domain the set of all sequences A1, A2, ..., Am of deterministic finite-state automata having the same input alphabet Σ and maps such a sequence to the largest number k such that there exists a string w that is accepted by k of the automata.

Theorem 4.4 There is a constant ε > 0 such that approximating MAX FA-INT within ratio n^ε is PSPACE-hard.

Briefly, this theorem is proved by a reduction from a variant of MAX Q3SAT. Given an instance Φ of this variant, an automaton is constructed for each clause in such a way that all automata accept a string w if and only if Φ is true. The string w (of exponential length) happens to be an enumeration of all the possible paths of an "assignment subtree" that describes the assignments of player 1 to the existentially quantified variables, against all possible assignments of player 0 to the universally quantified variables. A major ingredient of the proof is to show that if a proper subset of size k of these automata accepts a string w', then in fact there is a set of k clauses of Φ that player 1 can guarantee will be satisfied, for all possible assignments of player 0.

Schaefer [17] defined Generalized Geography (denoted GGEOG) to be an abstraction of a popular car game in which two players alternately list the names of countries, each beginning with the last letter of the previous country, until one player cannot list a new country. A corresponding game can be played on a directed graph G. A marker is initially placed on a start node s, and two players, 0 and 1, alternately move it along an edge, with the constraint that player 1 starts and each edge can be used only once. The first player unable to move loses. Informally, a natural optimization version of GGEOG is to compute how long player 1 can keep the game going, even if player 1 does not eventually win the game. We say (G,s) can be played for k rounds if player 1 has a strategy that causes the marker to move along k edges of the graph before the game ends. We define MAX GGEOG to be the function whose domain is the set of pairs (G,s), where G is a directed graph with node s, that maps a pair (G,s) to the maximum number k of rounds that (G,s) can be played.

Theorem 4.5 There is a constant ε > 0 such that approximating MAX GGEOG within ratio n^ε is PSPACE-hard.

Again, this theorem is proved by a reduction from a variant of MAX Q3SAT. The reduction is similar to that of Schaefer [17], modified to preserve non-approximability.
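To make the definition of MAX GGEOG concrete, the following is a minimal brute-force sketch in Python. It is an illustration only and is not the reduction used in the proof. It assumes that player 0 plays adversarially (trying to end the game as early as possible), which is how we read "player 1 has a strategy that causes the marker to move along k edges". The function names max_ggeog and play, and the edge-list representation of G, are hypothetical choices made for this example. The search takes exponential time, so it is suitable only for very small graphs.

```python
from functools import lru_cache


def max_ggeog(edges, start):
    """Brute-force value of MAX GGEOG on a small directed graph.

    edges: list of (u, v) pairs; each edge may be traversed at most once.
    start: the start node s, where the marker is initially placed.
    Player 1 moves first and tries to maximize the number of edges
    traversed; player 0 is assumed to minimize it.
    """
    edges = tuple(edges)

    @lru_cache(maxsize=None)
    def play(node, used, player):
        # `used` is a bitmask over edge indices that were already traversed.
        outcomes = [
            1 + play(v, used | (1 << i), 1 - player)
            for i, (u, v) in enumerate(edges)
            if u == node and not (used & (1 << i))
        ]
        if not outcomes:
            return 0  # the player to move is stuck, so the game ends
        return max(outcomes) if player == 1 else min(outcomes)

    return play(start, 0, 1)


# Player 1 moves s->a, then player 0 can end the game early via a->b.
short_game = [("s", "a"), ("a", "b"), ("a", "c"), ("c", "d")]
# Adding the path s->e->f->g gives player 1 a line that player 0 cannot cut.
long_game = short_game + [("s", "e"), ("e", "f"), ("f", "g")]
```

On short_game the value is 2, because player 0 answers s->a with a->b and player 1 is then stuck. On long_game player 1 can open with s->e instead, which forces three moves in total.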

5 Open Questions and Subsequent Related Work

Open questions include:

Question 5.1 Is there a Collapse Theorem for PCDS's? That is, is PCD(r, q) with debate format (f(n), g(n)) equal to PCD(r, q) with debate format (f'(n), g(n)/2), for some f' = poly(f)? It can be shown that debate format (poly(n), k) is not equivalent to debate format (poly(n), k'), for certain constants k ≠ k', unless the polynomial-time hierarchy collapses. What if g(n) ~ n?

Question 5.2 We have given examples of PSPACE-hard functions that do not have PTAS's, unless some unexpected collapse occurs. It is straightforward to define a PSPACE-hard function that does have a PTAS, but the straightforward examples are artificial. Is there a natural PSPACE-hard function that has a PTAS?

Question 5.3 Each hardness-of-approximation result given in Section 4 requires a tailor-made reduction. Is there a PSPACE version of the function class MAX-SNP? If there were, it might allow us to prove in one fell swoop that an entire class of maximization problems is PSPACE-hard to approximate within certain ratios, just as Arora et al. [1] prove that all MAX-SNP problems are NP-hard to approximate within certain ratios.

A polynomial-round Arthur-Merlin game (cf. [4]) can be thought of as a PCDS in which r(n) and q(n) are both arbitrary polynomials and one of the debaters simply chooses random moves. In this context, the fact that AM(poly) = PSPACE (cf. [13, 18]) means that, if r(n) and q(n) are both arbitrary polynomials, then the universal debater in a PCDS can be replaced by a random debater without loss of generality. In a forthcoming paper, we show that, even if r(n) = log n and q(n) = 1, we can replace the universal debater by a random debater and still retain the power to recognize any language in PSPACE. This fact has implications for the hardness of approximating stochastic PSPACE-hard functions, of the type studied by Papadimitriou [14].

6 Acknowledgements

We thank and Mario Szegedy for helpful discussions in the formative stages of this work. We thank Jin-yi Cai, Lenore Cowen, Uriel Feige, David Johnson, Madhu Sudan, and Mihalis Yannakakis for their comments on earlier versions. Finally, we thank Nick Reingold for helping us typeset Section 3.

References

[1] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy, Proof Verification and Hardness of Approximation Problems, Proc. 33rd Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, 1992, pp. 14-23.

[2] S. Arora and M. Safra, Probabilistic Checking of Proofs, Proc. 33rd Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, 1992, pp. 2-13.

[3] L. Babai, L. Fortnow, L. Levin, and M. Szegedy, Checking Computations in Polylogarithmic Time, Proc. 23rd Symposium on Theory of Computing, ACM, New York, 1991, pp. 21-31.

[4] L. Babai and S. Moran, Arthur-Merlin Games: A Randomized Proof System and a Hierarchy of Complexity Classes, J. Comput. System Sci., 36 (1988), pp. 254-276.

[5] E. Berlekamp and L. Welch, Error Correction of Algebraic Block Codes, US Patent Number 4,633,470. Appears in [10].

[6] A. Condon, J. Feigenbaum, C. Lund, and P. Shor, Probabilistically Checkable Debate Systems and Approximation Algorithms for PSPACE-Hard Functions, AT&T Bell Laboratories Technical Memorandum, January 12, 1993.

[7] A. K. Chandra, D. C. Kozen, and L. J. Stockmeyer, Alternation, J. ACM, 28 (1981), pp. 114-133.

[8] U. Feige, S. Goldwasser, L. Lovász, M. Safra, and M. Szegedy, Approximating Clique is Almost NP-Complete, Proc. 32nd Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, 1991, pp. 2-12.

[9] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.

[10] P. Gemmell and M. Sudan, Highly resilient correctors for polynomials, Inf. Proc. Letters, 43 (1992), pp. 169-174.

[11] D. S. Johnson, Approximation algorithms for combinatorial problems, J. Comput. System Sci., 9 (1974), pp. 256-278.

[12] D. Kozen, Lower bounds for natural proof systems, Proc. 18th Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, 1977, pp. 254-266.

[13] C. Lund, L. Fortnow, H. Karloff, and N. Nisan, Algebraic methods for interactive proof systems, J. ACM, 39 (1992), pp. 859-868.

[14] C. Papadimitriou, Games Against Nature, J. Comput. System Sci., 31 (1985), pp. 288-301.

[15] C. Papadimitriou and M. Yannakakis, Optimization, Approximation, and Complexity Classes, J. Comput. System Sci., 43 (1991), pp. 425-440.

[16] R. Rubinfeld and M. Sudan, Testing polynomial functions efficiently and over rational domains, Proc. 3rd Symposium on Discrete Algorithms, ACM/SIAM, 1992, pp. 23-32. A more efficient tester is to appear in the journal version.

[17] T. J. Schaefer, On the Complexity of Some Two-Person Perfect-Information Games, J. Comput. System Sci., 16 (1978), pp. 185-225.

[18] A. Shamir, IP = PSPACE, J. ACM, 39 (1992), pp. 869-877.
