Polynomial Space 1 Context Sensitive Languages

2 Linear Bounded Automata

3 Polynomial Space

Klaus Sutner Carnegie Mellon University 2019/03/26

Towards Polynomial Space 2 Context-Sensitive Grammars 3

Definition (CSG) A context-sensitive grammar (CSG) is a grammar where all productions are of Logarithmic space and linear (deterministic) space clearly make algorithmic the form sense. But how about nondeterministic linear space? αAβ αγβ where γ = ε → 6 Generalizing context-free grammars naturally leads to context-sensitive grammars and the obvious parsing “algorithm” for a context-sensitive language 2 For technical reasons, some authors also allow S ε in which case S may not is in NSPACE(n) (and, by Savitch, in SPACE(n )). → appear on the righthand side of any production. A language is context-sensitive if it can be generated by a context-sensitive grammar. Somewhat surprisingly, it’s just a small step from there to all of PSPACE. Note the constraint that the replacement string γ = ε; as a consequence we 6 have α β implies α β ⇒ | | ≤ | |

Brute-Force Recognition 4 Needless to say . . . 5

Lemma Lemma Every context-sensitive language is decidable. Not all decidable languages are context-sensitive.

Proof. Proof. Here is a cute diagonalization argument for this. ? Suppose w Σ and n = w . In any potential derivation (αi)i

We know that the language Alas, we also need to show that (G) L. L = anbncn n 1 L ⊆ { | ≥ } This is a bit harder: we need to show that the productions cannot be abused in is not context free. Here is a context-sensitive grammar G for L: some unintended way to generate other strings: recall that there is no let V = S, B and set restriction on the order in which productions can be applied, they just have to { } match. S aSBc abc → | E.g., the following is allowed: cB Bc → bB bb S aaSBcBc aaSBBcc → ⇒ ⇒

A typical derivation looks like Exercise S an+1bc(Bc)n an+1bBncn+1 an+1bn+1cn+1 Figure out the details. ⇒ ⇒ ⇒ It follows by induction that L (G). ⊆ L

Example: More Counting 8 Closure Properties 9

It is also known that the language

? L = x a, b, c #ax = #bx = #cx { ∈ { } | } Theorem Context-sensitive languages are closed under union, concatenation, Kleene star is not context free. But, again, it is easily context-sensitive: and reversal. let V = S, A, B, C and set They are also closed under ε-free homomorphisms. { } S S0 ε → | S0 S0ABC ABC → | Proof. Straightforward by manipulating the grammar. 2 XY YX for all X,Y A, B, C → ∈ { } A a → B b → Note that arbitrary homomorphisms do not work in this case: they erase too C c much information and can force too large a search. → Note that most productions are actually context free. The critical part is the commutation productions for A, B, C . { }

Two Innocent Questions 10 Normal Form 11

Theorem (Kuroda) Are CSL closed under intersection? Every context-sensitive grammar can be written with productions of the form Are CSL closed under complement? A BC AB CDA a → → →

The answer is Yes in both cases (so this is quite different from context-free languages). The proof is very similar to the argument for Chomsky normal form for CFG (using only productions A BC and A a). → → The proof for intersection can be based on a machine model, and is much easier than the proof for complement (which requires a special and very Note that the recognition algorithm becomes particularly simple when the CSG surprising counting technique; see next lecture). is given in : we first get rid of all terminals and then operate only on pairs of consecutive variables. Monotonicity 12 Monotonic is Context-Sensitive 13

Theorem A language is context-sensitive iff it is generated by a monotonic grammar.

Derivations in CSGs are length-non-decreasing. Correspondingly, define a Proof. grammar to be monotonic if all productions are of the form Using LBAs introduced below, it suffices to show that every monotonic π : α β where α Γ?V Γ?, β Γ?, α β language can be accepted by an LBA. Initially, some terminal string → ∈ ∈ | | ≤ | | a1, a2, . . . , an is written on the tape. As we have seen already, a monotonic grammar can only generate a decidable Search handle: the head moves to the right a random number of places, language. say, to ai. In fact, these look rather similar to context-sensitive grammars except that we Check righthand side: the machine verifies that ai, . . . , aj = β where α β is a production. are now allowed to manipulate the context (but see 2 slides down). In → particular, every CSG is monotonic. Replace by lefthand side: the block ai, . . . , aj is replaced by α, possibly leaving some blanks. Collect: remove possible blanks by shifting the rest of the tape left. This loop repeats until the tape is reduced to S and we accept. If any of the guesses goes wrong we reject.

Tiny Example 14 And the Difference? 15

One can also directly convert monotonic productions to context-sensitive ones. CFL recognition is cubic time, via a dynamic programming algorithm that As a simple example, consider a commutativity rule basically enumerates all possible parse trees.

Why would a similar approach not also work for CSLs? As Kuroda NF shows, AB BA → on the face of it, the productions seem only mildly more complicated. Here are equivalent context-sensitive rules: Recall how one can associate undecidability with CFGs: we are interested in the AB AX language (M) of all accepting computations of a TM: → C AX BX → BX BA # C0 # C1 # ... # Cn # → Here X is a new variable and the green variable is the one that is being Unfortunately, this language fails to be context-free: the copy language ww w Σ? is not context-free, and this one is worse. replaced. Note how the left/right context is duly preserved. { | ∈ }

Tricks 16 And Context-Sensitive? 17

Consider the complement Σ? (M). For context-sensitive languages there is no problem: we can directly generate − C (M). This requires a bit of work using a grammar directly, but is fairly easy C Consider a representation where we use alternating mirror images: to see from the machine model (see below): an LBA can easily perform the necessary checks. op op # C0 # C1 # C2 # C3 ... # Cn # This works essentially since palindromes are context-free. As a consequence even some the most basic questions about CSG are undecidable. This is really quite surprising.

Claim Theorem The resulting language of non-computations is context-free. It is undecidable whether a CSG generates the empty language. Decidability Overview 18

x LL = L = Σ? L = KL K = ∈ ∅ ∩ ∅ 1 Context Sensitive Languages regular YYYYY DCFL YYYYN CFL YYNNN 2 Linear Bounded Automata CSL YNNNN decidable YNNNN semi-dec. NNNNN 3 Polynomial Space

Standard decision problems for various language classes.

Needless to say, for the decidable ones we would like a more precise complexity classification (in which case it may matter how precisely the instance is given).

A Machine Model? 20 Backwards Derivation 21

In order to find a parser for CSL it seems natural to look for an associated Suppose L is context-sensitive via G. The idea is to run a derivation of G machine models: backwards, starting at a string x of terminals. semidecidable — Turing machines To this end, nondeterministically guess a handle in the current string, a place context free — pushdown automata where there is a substring of the form β, where α β is a production in G. → regular — finite state machines Erase β and replace it by α. Rinse and repeat.

The original string x is in L iff we can ultimately reach S this way. We need a machine model that is stronger than pushdown automata (FSM plus stack), but significantly weaker than full Turing machines. Note that two stacks won’t work, we need a different approach. Of course, this is yet another path existence problem in a suitable digraph.

Linear Bounded Automata 22 The Myhill-Landweber-Kuroda Theorem 23

Definition A (LBA) is a type of one-tape, nondeterministic The development happened in stages: acceptor where the input is written between special end-markers and the computation can never leave the space between these Myhill 1960 considered deterministic LBAs. markers (nor overwrite them). Landweber 1963 showed that they produce only context-sensitive languages. Kuroda 1964 generalized to nondeterministic LBAs and showed that this Thus the initial configuration looks like produces precisely all the context-sensitive languages.

#q0x1x2 . . . xn#

and the tape head can never leave this part of the tape. Theorem It may seem that there is not enough space to perform any interesting A language is accepted by a (nondeterministic) LBA iff it is context-sensitive. computations on an LBA, but note that we can use a sufficiently large tape alphabet to “compress” the input to a fraction of its original size and make room. Proof 24 Digression: Physical Realizability II 25

It was recognized already in the 1950s that Turing machines are, in many ways, too general to describe anything resembling the type of computation that was One direction is by definition of an LBA. possible on emerging digital computers. For the opposite direction it is a labor of love to check in great gory detail that the workings of an LBA can be described by a context-sensitive grammar. The Rabin-Scott finite state machine paper was one radical attempt to impose a radical constraint on Turing machines that brings them into the realm of 2 “feasible” computation.

Myhill’s introduction of LBA is another attempt at constructive restrictions. As Note that nondeterminism is critical here: grammars are naturally we now know, LBAs are still a rather generous interpretation of the notion of nondeterministic and a deterministic machine would have to search through all feasible computation; real, practical algorithms need further constraints. possible choices. That seems to require more than just linear space (open problem, see below). Still, it is a perfectly good model and there are many interesting problems that fit perfectly into this framework.

Intersection Closure 26 Why Not CFG? 27

Theorem CSL are closed under intersection. In essence, the argument says that we can combine two LBAs into a single one that checks for intersection. This is entirely similar to the argument for FSMs. Proof.

Burning Question: Why can’t we do the same for PDAs? Given two LBAs Mi for Li. we can construct a new LBA for L1 L2 by using ∩ a 2-track tape alphabet Γ = Σ Σ. × The upper track is used to simulate M , the lower track is used to simulate M . 1 2 Because we cannot in general combine two stacks into a single one (though It is easy to check that the simulating machine is again a LBA (it will sweep this works in some cases; the stack height differences need to be bounded). back and forth over the whole tape, updating both tracks by one step on the But in general, two stacks suffice to simulate a Turing machine. way). 2

More Problems 29

1 Context Sensitive Languages

So context-sensitive recognition is naturally in NSPACE(n), and by Savitch, in SPACE(n2).

2 Linear Bounded Automata Two natural questions: Are there other natural decision problems of this kind?

Is there any reason to consider larger space classes, say, SPACE(n42)? 3 Polynomial Space Quantified Boolean Formulae 30 Validity of QBF 31

Definition Problem: Validity of Quantified Boolean Formulae (QBF) A quantified Boolean formula (QBF) is a formula consisting of propositional Instance: A quantified Boolean sentence ϕ. connectives “and,” “or” and “not,” as well as existential and universal Question: Is ϕ valid? quantifiers.

If all variables in a QBF are bounded by a quantifier then the formula has a Note that many properties of propositional formulae can easily be expressed in truth value: we can simply expand it out as in terms of QBF. For example, ϕ(x1, x2, . . . , xn) is satisfiable iff

x ϕ(x) ϕ(0) ϕ(1) x1, . . . , xn ϕ(x1, x2, . . . , xn) is valid ∃ 7→ ∨ ∃ x ϕ(x) ϕ(0) ϕ(1) ∀ 7→ ∧ Likewise, the formula is a tautology iff

In the end we are left with an exponential size propositional formula without x1, . . . , xn ϕ(x1, x2, . . . , xn) is valid ∀ variables which can simply be evaluated in linear time.

QBF 32 Validity Tester 33

n 1 Lemma for x = 0..2 − do Validity of QBF can be checked by a deterministic LBA. v = 0 m 1 for y = 0..2 − do if ϕ(x, y) Proof. As a case in point, consider the formula then v = 1; break if v = 0 then return No x1, . . . , xn y1, . . . , ym ϕ(x, y) ∀ ∃ return Yes The argument easily extends to any other formula.

To check validity we use two loops, one counting from 0 to 2n 1 for x and Of course, the running time is exponential, but linear space is quite enough. m − another counting from 0 to 2 1 for y. 2 −

QBF Solvers 34 Succinct Representations 35

A quantified Boolean formula can be construed as a succinct representation of a much larger ordinary Boolean formula: deciding validity becomes harder because the input is much shorter.

http://satisfiability.org/ Succinct representations appear in many places:

http://www.qbflib.org/

There is a lot of current research on building powerful QBF solvers. Succinct Hypercube 36 Games 37

Many games such as Hex, Checkers, Chess, . . . are naturally finite (at least with the right set of rules). Thus, from the complexity perspective, they are trivial.

d However, they often generalize to an infinite class of games. For example, Hex Vertex set: 2 can obviously be played on any n n board. × Edge set: (x, y) 2d 2d such that dist(x, y) = 1 ∈ ×

Here dist(x, y) refers to the Hamming distance: i xi = yi . |{ | 6 }|

More generally, we could use a Boolean formula Φ(x, y) to determine the edges in a graph G(2d, Φ) on 2d.

And the formula could be represented by a ROBDD.

Incidentally, John Nash proved that there is no draw in Hex (in 1949).

Winning 38 Integer Circuit Evaluation 39

Consider a circuit with n integer inputs; we think of x Z as the singleton x . ∈ { } Internal nodes all have indegree 2 (but unbounded outdegree), and come in There are several natural questions one can ask about these generalized games: three types: Union A B ∪ Is a particular board position winning? Sum A B = a + b a A, b B ⊕ { | ∈ ∈ } Does the first player have a winning strategy? Product A B = a b a A, b B ⊗ { · | ∈ ∈ } We assume there is exactly one output node of outdegree 0, the subset of Z If the total number of moves in a game is polynomial in the size of the board appearing there is the result of evaluating the circuit. (Checkers, Hex, Reversi), then these questions are typically in PSPACE: we can systematically explore the game tree to check if a position is winning. Problem: Integer Circuit Evaluation (ICE) Instance: A ICE ciruit C, an integer a. Question: Is a in the output set?

PSPACE 40 Comment 41

A B is linear; we might also admit arbitrary finite sets as input. ∪ But: A B and A B are both potentially quadratic, so the sets defined by ⊕ ⊗ an ICE can be exponentially large. This result may seem a bit contrived, but it actually is quite natural: there is a long-standing question of how circuit evaluation compares to formula evaluation (it should be slightly harder if the fan-out is higher than 1). Lemma ICE is in PSPACE. For example, Boolean circuit evaluation is P-complete, but evaluating a Boolean formula is in NC1 (presumably a smaller class based on circuits, more later). Proof. We cannot simply evaluate the circuit in polynomial space: the sets might be too large. More generally, there are interesting connections between various kinds of But we can nondeterministically guess the one element in each set that is circuits (or straight-line programs) and small complexity classes. actually needed to obtain the target value a in the end, and follow these witnesses throughout the circuit. 2 Boring Completeness 42 Quantified Boolean Formulae 43

Theorem As before with NP, it is not too hard to come up with a totally artificial Validity testing for quantified Boolean formulae is PSPACE-complete. PSPACE-complete problem:

t e K = e # x # 1 x accepted by Me in t = x + e space { | | | } Proof. Because of the padding, we can easily check membership in K in linear space. Membership in PSPACE is easy to see: we can check validity by brute-force using just linear space. And the standard reduction shows that the problem is hard. To see this, note that every block of existential and universal quantifiers can be considered as a binary counter: for k Boolean variables we have to check values Not too interesting, here is a much better problem. 0,..., 2k 1. − This is exponential time but linear space.

Proof Cont’d 44 Quoi? 45

For hardness one uses the fact that a configuration C0 of a Turing machine can be expressed by a (large) collection C of Boolean variables much as in Cook-Levin, plus ideas from Savitch’s theorem. So Φk(C1,C2) means that for some witness C and all choices of D1 and D2 we have We will construct a quantified Boolean formula Φk of polynomial size such that

k t Φk(C1,C2) t 2 (C10 M C20 ). (C1,C) = (D1,D2) (C,C2) = (D1,D2) implies Φk 1(D1,D2) ⇐⇒ ∃ ≤ ` ∨ −

This just gets us around having to write down Φk 1 twice. This is straightforward for k = 0, 1: copy the appropriate parts of the − Cook-Levin argument. 2

For k > 1 note that Φk(C1,C2) iff C (Φk 1(C1,C) Φk 1(C,C2)). ∃ − ∧ − Unfortunately, this direct approach causes an exponential blow-up in size. Therefore we use a trick: Depending on your temperament, you will recognize this as a brilliant technical maneuver, or dismiss it as a slimy trick. Φk(C1,C2) C D1,D2 ((C1,C) = (D1,D2) ⇐⇒ ∃ ∀ ∨ (C,C2) = (D1,D2) Φk 1(D1,D2)) ⇒ −

Context Sensitive Recognition 46 CSR Completeness 47

Theorem Context Sensitive Recognition is PSPACE-complete. We have seen that for context-sensitive grammars it is decidable whether a word is generated by the grammar. Proof. For hardness use the following well-known fact from language theory: Problem: Context Sensitive Recognition (CSR) ε / L NSPACE(n) implies L is context-sensitive (via monotonic grammars). Instance: A CSG G = V, Σ,P,S and a word x Σ?. ∈ ∈ h i ∈ ? Question: Is x in (G)? Now let L Σ be in PSPACE, say, L SPACE(p(n)). Define L ⊆ ∈ p( x ) L0 = x # 1 | | x L { | ∈ } In fact, we can solve this problem in NSPACE(n) by guessing where in the current string to apply a production of G backwards. By the previous remark L0 is context-sensitive. In fact, a grammar G for L0 can be constructed in polynomial time from a Turing machine for L. Ultimately we will wind up in S iff the given string is in the language. p( x ) Hence the map f defined by f(x) = (G, x # 1 | | ) is a polynomial time reduction from L to CSR. 2 PSPACE and FSMs 48 The Proof 49

Here is a question regarding the intersection of a family of DFAs. The problem is in linear NSPACE as follows.

Problem: DFA Intersection Let ni = i and set n = ni. |A | Instance: A list ,..., of DFAs. 1 m Q A A set pi = init( i) Question: Is ( i) empty? A L A for i = 1..n do T guess a Σ ∈ Note that m, the number of machines, is not fixed. We can check Emptiness in compute pi = δi(pi, a) linear time on the accessible part of the product machine = i; alas, the if pi Fi for all i A A ∈ latter has exponential size in general. then return Yes Q return No

Theorem (Kozen 1977) As usual, this “algorithm” may produce false No’s, but a Yes guarantees a The DFA Intersection Problem is PSPACE-complete. yes-instance.

For hardness, let T be a Turing machine with polynomial space bound Here is machine k, where 0 k p(n) 3. p(n) n. Consider IDs encoded as strings A ≤ ≤ − ≥ It reads #Σkxyz and remembers x, y and z; #a1a2 . . . al p b1b2 . . . br It then skips p(n) k 3 letters, reads a # and skips k more letters. A whole computation has the form − −

Then the machine reads the next three letters x0, y0, z0 and checks that #I0#I1 ... #Im 1$ − they are compatible with the transition relation of the Turing machine. # and $ are new markers and we may safely assume that the length of each ID Lastly, it skips another p(n) k 3 letters, and either reads a # and is exactly p(n) (pad with blank tape symbols). Also, assume that m is even − − (change the TM). starts all over, or reads $ and accepts.

The idea is to construct a family of DFAs that make sure that I2i I2i+1 is For example, if x = p, y = a and δ(p, a) = (b, q, +1) then x0 = b, y0 = q and 7→ correct, and another family that checks I2i+1 I2i+2. We will only deal with z0 = z. Of course, there may be multiple possibilities. 7→ the first case.

Determinization 53

Combining “even” k machines with corresponding “odd” machines insures A that every accepted string is a proper sequence of IDs. We can think of a nondeterministic automaton to a succinct representation A Lastly, we add one more DFA that checks that the sequence of IDs starts at the of an equivalent : the accessible part of the power B proper initial ID, and ends in an accepting ID (say, after emptying out the automaton as detailed in the standard Rabin-Scott construction. tape). The size of has no better than an exponential bound B All the DFAs together accept precisely one string (coding the accepting 2|A| |B| ≤ computation of M) iff M(ε) . ↓ It is straightforward to modify the construction to check for M(x) instead. ↓ Unfortunately, this bound is tight in general. It remains tight when one 2 considers only semiautomata where Q = F = I, even when they are nearly deterministic and co-deterministic. Blow-Up Example 1 54 Blow-Up Example 2 55

Consider the following DFA = [n], Σ, δ, 1, 1 where [n] = 1, . . . , n , a A h { }i { } Σ = a, b, c and the transition function is given by { } b a Here is a 6-state semiautomaton. a b δa a cyclic shift on [n], X = b: machine is deterministic δb the transposition that interchanges 1 and 2, and co-deterministic, so the power X automaton has size 1. b δc sends 1 and 2 to 2, identity elsewhere. X = a: the power automaton has 6 b a maximal size 2 . a b It is well-known that has a transition semigroup of maximal size nn. A Furthermore, the reversal rev( ) of produces a power automaton of maximal A A a size 2n, and the power automaton turns out to be reduced.

Power Automaton Reachability 56

For hardness one can exploit a feature of Kozen’s hardness proof of DFA Problem: Power Automaton Reachability (PAR) Intersection: all the DFAs in the argument have a unique final state. Instance: A nondeterministic automaton , a set P Q of states. A ⊆ Question: Is P a state in the power automaton of ? So we can let = be the disjoint union of the Kozen machines, and set A A A

S P = fin( i) i = 1, . . . , m { A | } Theorem 2 PAR is PSPACE-complete. We will see shortly that counting the states of the power automaton is similarly Proof. hard. Membership is PSPACE follows from the usual guess-one-letter-at-a-time argument (as opposed to guessing the whole witness string).

And More 58

Many of the generalized games from above turn out to be PSPACE-complete.

ICE is PSPACE-complete.

Testing whether two regular expressions are equivalent is PSPACE-complete (even if one of them is Σ?). The same holds true for nondeterministic finite state machines.