Introduction to Tree Language Theory

Hitoshi Ohsaki

National Institute of Advanced Industrial Science and Technology (AIST)

seminar talk (6/10)

2009

VI. P & NP

TM & k-TM

k-TM (k ≥ 1) : MTM with
– one control-unit
– one read-only tape (called input tape)
– k read/write tapes (called working tapes)

k-NTM : non-deterministic k-TM

Cf. TM , one control-unit + one read/write tape

Corollary

TM = k-TM = k-NTM (k ≥ 1)

Proof

The left equivalence follows from the fact that MTM is simulated by TM and the reverse is trivial. By a proof similar to that of the previous fact (see the slides of the previous talk), one can show that k-NTM is simulated by NTM and the reverse also holds. Since NTM is simulated by TM, the right equivalence follows. □

Running time & working space

Given k-TM ℳ = (Γ, Σ, Q, q0, Qfin, ∆)

time_ℳ(w) : the number of moves until ℳ halts on input w
(called running time of ℳ on w)

space_ℳ(w) : the number of cells in working tapes which ℳ visited at least once until ℳ halts on input w
(called working space of ℳ on w)

For the length n of input

T_ℳ(n) : max { time_ℳ(w) | w ∈ Σ∗ : |w| = n }

S_ℳ(n) : max { space_ℳ(w) | w ∈ Σ∗ : |w| = n }
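As an illustration of these definitions, the following Python sketch computes the worst-case running time and working space by brute force over all words of length n. The machine `toy_machine`, the alphabet `SIGMA` and the helper `worst_case` are assumptions made for this example only; they do not come from the lecture.

```python
from itertools import product

SIGMA = "ab"  # hypothetical input alphabet


def toy_machine(w):
    """Hypothetical 1-TM: copies the input to its working tape symbol by
    symbol and halts as soon as it has copied a 'b' (or reached the end).
    Returns the pair (running time on w, working space on w)."""
    moves = 0
    cells = 0
    for symbol in w:
        moves += 1          # one move: read the symbol, write it, shift right
        cells += 1          # one more working-tape cell visited
        if symbol == "b":
            return moves, cells
    moves += 1              # final move: read the blank after the input, halt
    return moves, cells


def worst_case(machine, n, sigma=SIGMA):
    """Maximum running time and working space over all words of length n."""
    results = [machine("".join(letters)) for letters in product(sigma, repeat=n)]
    worst_time = max(time for time, _ in results)
    worst_space = max(space for _, space in results)
    return worst_time, worst_space
```

For this toy machine the maximum is reached on the word consisting only of `a`, so the worst case over words of length n is n + 1 moves and n cells.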

Question

How should we define T_L (or S_L) for language L ?

Linear speed-up

Given k-TM ℳ_1, there exists (k + 1)-TM ℳ_2 such that
– ℒ(ℳ_1) = ℒ(ℳ_2)
– T_{ℳ_2}(n) ≤ (1/2) T_{ℳ_1}(n) for sufficiently large n ∈ N

if lim_{n→∞} T_{ℳ_1}(n)/n = ∞

Proof

Consider k = 1, but the following proof can be generalized to arbitrary k ≥ 1. Let ℳ_1 = (Γ, Σ, Q, q0, Qfin, ∆), then define ℳ_2 = (Γ′, Σ, Q′, q0′, Qfin′, ∆′) with

Γ′ = (Γ × Γ × Γ) ∪ Σ,
Q′ = { (p, i) | p ∈ Q, 1 ≤ i ≤ 3 } ∪ { q0′, qacc, qrej } and
Qfin′ = { qacc }.

Here the blank for ℳ_2 is (♯, ♯, ♯), and the states q0′, qacc, qrej are fresh symbols. Transition function ∆′ is defined according to ℳ_1’s move on each 3 cells (“chunk” of cells at 2i−1, 2i, 2i+1). That is, initially, for input a_0 a_1 ··· a_n, ℳ_2 copies it to (k+1)th working tape in compressed manner, e.g.

(♯, a_0, a_1) (a_1, a_2, a_3) ··· (a_{n−1}, a_n, ♯) if n is even ;
(♯, a_0, a_1) (a_1, a_2, a_3) ··· (a_{n−2}, a_{n−1}, a_n) if n is odd.

For this initialization, including the running time to return the tape-head on (k+1)th working tape to the leftmost position, ℳ_2 needs n + (1/2) n + c moves (c constant). Next, using (k+1)th working tape as the input tape of ℳ_2, ℳ_2 simulates ℳ_1.
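The compressed copy of the input described above can be sketched in Python as follows. The function name `compress` and the use of `#` for the blank symbol are assumptions made for this illustration; the sketch only builds the sequence of overlapping chunks and does not model the tape-head moves.

```python
def compress(word, blank="#"):
    """Pack a_0 ... a_n into overlapping 3-cell chunks.

    Chunk number i holds the cells at positions 2i-1, 2i, 2i+1, so that
    consecutive chunks share one cell; positions outside the word are blank.
    """
    last = len(word) - 1                      # index n of the last symbol

    def cell(pos):
        return word[pos] if 0 <= pos <= last else blank

    return [(cell(2 * i - 1), cell(2 * i), cell(2 * i + 1))
            for i in range(last // 2 + 1)]
```

For example, `compress("abc")` gives the two chunks `("#", "a", "b")` and `("b", "c", "#")`, which is the case where n is even.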

For the move of ℳ_2, suppose, for instance, that ℳ_1 with the state p reads y_i on the input tape, and then ℳ_1 moves on the working tape as shown in the figure. If ℳ_1’s tape-head on the input tape stays, say on y_{i+1}, in the same chunk even after the other head moves on to the cell of b6, ∆′ contains the following map :

⟨ (p, 2), (♯, ♯, ♯), (y_{i−1}, y_i, y_{i+1}), (b3, b4, b5) ⟩ ↦ ⟨ (q, 3), (♯, ♯, ♯), (y_{i−1}, y_i, y_{i+1}), (c1, c2, c3), S, →, S ⟩

[Figure: working tape of ℳ_1 in state p with contents b1 b2 b3 b4 b5 b6 b7 ; ⇓ just after moving on to the cell of b6, in state q, with contents b1 b2 c1 c2 c3 b6 b7]

On the right-hand side of the above mapping, “S” stands for stay. Because ℳ_2’s head on the input tape is no longer needed, it keeps staying on a blank. The head on (k+1)th-working tape also stays on the same cell, but the current state must be changed from (p, 2), meaning that ℳ_1’s tape-head reads y_i, to (q, 3), meaning that the head reads y_{i+1}. During the computation, ℳ_2 should halt on qacc (resp. qrej) if ℳ_1 accepts (rejects) the input. Moreover, since ℳ_1 is deterministic, ℳ_2 must be deterministic. By construction, the simulation can be done within at most (1/2) T_{ℳ_1}(n) + c′ (c′ constant) moves. Since lim_{n→∞} T_{ℳ_1}(n)/n = ∞, we have that for large n ∈ N, T_{ℳ_2}(n) ≤ (1/2) T_{ℳ_1}(n). □

Remarks

1. The size of chunk is not necessarily 3. When the size is i (≥ 3), the simulation can be done within (1/(i−1)) T_{ℳ_1}(n)

2. ℳ_2 requires |Σ|^i tape symbols and (|Q| × i) + 3 state symbols

3. If lim_{n→∞} T_{ℳ_1}(n)/n² = ∞, one can construct k-TM ℳ_2 that simulates k-TM ℳ_1 (without introducing a new working tape)

4. From the proof, one can see that S_{ℳ_2}(n) is also improved :

Proposition (Tape compression)

Given k-TM ℳ_1, there exists (k + 1)-TM ℳ_2 such that
– ℒ(ℳ_1) = ℒ(ℳ_2)
– S_{ℳ_2}(n) ≤ (1/2) S_{ℳ_1}(n) for sufficiently large n ∈ N

Note

If the size of the chunk is i (≥ 3), the simulation can be done within (1/(i−1)) S_{ℳ_1}(n)

DTIME & DSPACE

Let T, S be 1-variable polynomials with positive coefficients

DTIME(T) : languages accepted by k-TM ℳ such that T_ℳ(n) ≤ T(n) for every (sufficiently large) n ∈ N

DSPACE(S) : languages accepted by k-TM ℳ such that S_ℳ(n) ≤ S(n) for every (sufficiently large) n ∈ N

Let 𝒫 be the set of 1-variable polynomials with positive coefficients

P : ⋃_{T ∈ 𝒫} DTIME(T)

PSPACE : ⋃_{S ∈ 𝒫} DSPACE(S)

Note

P ⊆ PSPACE
(∵ Polynomially time-bounded computation requires at most polynomial tape space)

NTIME & NSPACE

Let T, S be 1-variable polynomials with positive coefficients

NTIME(T) : languages accepted by k-NTM ℳ such that T_ℳ(n) ≤ T(n) for every (sufficiently large) n ∈ N

NSPACE(S) : languages accepted by k-NTM ℳ such that S_ℳ(n) ≤ S(n) for every (sufficiently large) n ∈ N

Let 𝒫 be the set of 1-variable polynomials with positive coefficients

NP : ⋃_{T ∈ 𝒫} NTIME(T)

NPSPACE : ⋃_{S ∈ 𝒫} NSPACE(S)

Note

P ⊆ NP ⊆ NPSPACE, PSPACE ⊆ NPSPACE

Example of P

The membership problem for CFG in Chomsky normal form :
instance is grammar G = (Σ, Q, q0, Qfin, ∆), word w in Σ∗
solution is “yes” if w ∈ ℒ(G) ; “no” otherwise

Proof (for the problem being in P)

Let w = a_1 a_2 ··· a_n (n ≥ 0). By assumption of the problem, transition rules in ∆ are in the forms of p → q r, p → a, q0 → ε for p ∈ Q, q, r ∈ Q − {q0}, a ∈ Σ.

1. If n = 0, return “yes” if q0 → ε ∈ ∆ ; otherwise return “no.”

2. Next, execute the following program :

    for ( i = 1 , i ≤ n , i++ )
      if ∃ p → a_i in ∆
        T(i, i) := T(i, i) ∪ {p} ;

3. Then, execute the following program :

    for ( ℓ = 2 , ℓ ≤ n , ℓ++ )
      for ( i = 1 , i ≤ n − ℓ + 1 , i++ )
        j := i + ℓ − 1 ;
        for ( k = i , k ≤ j − 1 , k++ )
          if ∃ p → q r in ∆ & ∃ q ∈ T(i, k) & ∃ r ∈ T(k+1, j)
            then T(i, j) := T(i, j) ∪ {p} ;

4. If q0 ∈ T(1, n), return “yes” ; otherwise return “no.”
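The table-filling procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the grammar is passed as two sets of rules (`unary` for p → a, `binary` for p → q r), the flag `start_derives_empty` stands for the rule q0 → ε, and all of these names are hypothetical. Indices are 0-based, j = i + ℓ − 1, and the split point k ranges over i, ..., j − 1 so that both parts are non-empty.

```python
def cyk(word, unary, binary, start, start_derives_empty=False):
    """Decide membership for a grammar in Chomsky normal form.

    unary  : set of pairs (p, a) for rules p -> a
    binary : set of triples (p, q, r) for rules p -> q r
    """
    n = len(word)
    if n == 0:
        return start_derives_empty
    # table[i][j] holds the nonterminals deriving word[i..j] (inclusive)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for p, a in unary:
            if a == word[i]:
                table[i][i].add(p)
    for length in range(2, n + 1):            # length of the sub-word
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):             # split into [i..k] and [k+1..j]
                for p, q, r in binary:
                    if q in table[i][k] and r in table[k + 1][j]:
                        table[i][j].add(p)
    return start in table[0][n - 1]


# Hypothetical example grammar for { a^m b^m | m >= 1 }:
# S -> A X | A B,  X -> S B,  A -> a,  B -> b
UNARY = {("A", "a"), ("B", "b")}
BINARY = {("S", "A", "X"), ("S", "A", "B"), ("X", "S", "B")}
```

With the example grammar, `cyk("aabb", UNARY, BINARY, "S")` holds, while `cyk("aab", UNARY, BINARY, "S")` does not.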

The innermost loop is done in k1 × n running time, hence this computation is done in at most k2 × n³ (k1, k2 constants). This implies that the problem is in DTIME(n³). □

Example of NP

The bounded halting problem for NTM :

instance is NTM ℳ = (Γ, Σ, Q, q0, Qfin, ∆), w in Σ∗, k in N
solution is “yes” if w ∈ ℒ(ℳ) within k moves ; “no” otherwise

Proof

Suppose ⟨Ψ(ℳ), Ψ(w)⟩ is the binary code for non-deterministic UTM ℳ_NU, then the question if w ∈ ℒ(ℳ) is the membership problem ⟨Ψ(ℳ), Ψ(w)⟩ ∈ ℒ(ℳ_NU). Let n be the size of input for this problem, then the total length of the word that represents ℳ and w with delimiters is at most 2|Γ| + 2|Q| + |Γ|² × |Q|² + |w| + c1. Encoding ℳ and w by Ψ takes (c2 × |Γ|² × |Q|²) + (c3 × |w|) running time in proportion to n. The simulation of ℳ takes (c4 × |Γ|² × |Q|²) + c5 for each move. In this simulation, at each step, ℳ_NU tries to find a transition rule which can be applied, and then reflects the result on the tapes. By assumption, the simulation must be stopped within k iterations. Hence, the problem is in NTIME(n). □

Consult books (e.g. [1]) or search on the web for more examples of NP-problems.
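To make the question "is w accepted within k moves" concrete, here is a small Python sketch. It is not the non-deterministic universal machine of the proof: it is a deterministic breadth-first exploration of all branches, which may take exponential time, and it assumes a single-tape machine with a one-way infinite tape for brevity. The function name, the transition format and the blank symbol `#` are assumptions made for this illustration.

```python
def accepts_within(delta, q0, finals, word, k, blank="#"):
    """True iff some branch of the machine reaches a final state in <= k moves.

    delta : dict (state, symbol) -> set of (state, symbol, move),
            with move in {-1, 0, +1}
    """
    start = (q0, 0, tuple(word))          # (state, head position, tape)
    frontier = {start}                    # configurations after t moves
    for _ in range(k + 1):                # t = 0, 1, ..., k
        if any(state in finals for state, _, _ in frontier):
            return True
        successors = set()
        for state, head, tape in frontier:
            symbol = tape[head] if head < len(tape) else blank
            for new_state, written, move in delta.get((state, symbol), ()):
                cells = list(tape) + [blank] * (head + 1 - len(tape))
                cells[head] = written
                new_head = max(head + move, 0)   # cannot leave the tape
                successors.add((new_state, new_head, tuple(cells)))
        frontier = successors
    return False


# Hypothetical NTM: scans right and may guess that the current 'b' is
# the witness, so it accepts exactly the words containing a 'b'.
DELTA = {
    ("q0", "a"): {("q0", "a", +1)},
    ("q0", "b"): {("q0", "b", +1), ("qf", "b", 0)},
}
```

On the word `aab` the example machine needs three moves, so `accepts_within(DELTA, "q0", {"qf"}, "aab", 3)` holds and the same call with k = 2 does not.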

[1] M.R. Garey & D.S. Johnson: Computers and Intractability – A Guide to the Theory of NP-completeness, Freeman, 1979.

Polynomial slow-down (simulation of MTM)

2-TM ℳ_1 can be simulated by TM ℳ_2 in c × T_{ℳ_1}(n)² (c : constant)

Proof

Let Γ be the set of tape symbols of ℳ_1. Similar to the proof of “MTM=TM” in the previous talk on TM, suppose ◇ is a fresh symbol, and then, define (single tape) TM ℳ_2 whose tape symbols are Γ ∪ {◇} ∪ { ā | a ∈ Γ }. In this proof, ℳ_2’s single tape is divided into 2 tracks as shown in the figure. The symbols in input tape and working tape are alternately placed on the single tape. The left- and right-end of the sequence are marked by ◇. The locations of ℳ_1’s tape-heads on the input tape and on the working tape are represented by symbols with overline, e.g. ā.

[Figure: input tape (♯ a b c ♯) and working tape (♯ f g ♯ ♯) of ℳ_1, and the single tape of ℳ_2 on which their symbols are placed alternately between two ◇ marks]

Let n be the length of the input. For each move, the simulation takes at most 2 × T_{ℳ_1}(n) + 9 moves, where 8 (= 2 × 4) moves for adjusting the head and overwriting overlined-symbols to normal symbols, and 1 move for reading ◇. Since the number of iterations is T_{ℳ_1}(n), the simulation takes 2 × T_{ℳ_1}(n)² + 9 × T_{ℳ_1}(n) in total. For the initialization, it takes c1 × n² (c1 constant). □

Note that the above proof can be generalized to the simulation of k-TM (k ≥ 2).

Summary of TM & MTM & NTM

[Diagram relating (single-tape) TM, (single-tape) NTM and (k-working-tape) MTM ; edge labels : polynomial slow-down, linear speed-up, ∃ simulation, linear speed-up]

NTM is simulated by MTM in P-time iff P = NP

Reducibility

Given languages L over Σ, M over Γ ( Σ ⊆ Γ )

L ≤_m M : L is many-one reducible to M if
∃ k-TM ℳ such that for every input w ∈ Σ∗, w ∈ L if and only if ℳ halts on w with w′ ∈ M as output on working tape k *
( ∃f : w ∈ L iff f(w) ∈ M )

L ≤_m^P M : L is polynomial-time reducible to M if
L ≤_m M & T_ℳ(n) ≤ T(n) for polynomial T
( ∃f polynomial-time computable function : w ∈ L iff f(w) ∈ M )

L ≤_m^log M : L is log-space reducible to M if
L ≤_m M & S_ℳ(n) ≤ c × log n (c : constant)

* kth-tape is write-only and is not taken into account as working space

P- and NP-completeness

Given language L over Σ

L is called NP-hard if for every language M ∈ NP, M ≤_m^P L
L is called NP-complete if L is NP-hard & L ∈ NP
L is called P-hard if for every language M ∈ P, M ≤_m^log L
L is called P-complete if L is P-hard & L ∈ P

Note

≤_m^log ⊆ ≤_m^P
∵ DSPACE(T(n)) ⊆ NSPACE(T(n)) ⊆ ⋃_{c>0} DTIME(2^{c × T(n)}) if T(n) ≥ log(n)

Moreover, the following statements are equivalent :
1. P = NP (1 ⇒ 3 From previous observation)
2. NP-complete problem is in P (2 ⇒ 1 P, NP are closed under ≤_m^P)
3. P-complete problem is NP-complete (3 ⇒ 2 Obvious)

P- and NP-complete problems (tricky example)

The bounded halting problem for TM :

instance is TM ℳ = (Γ, Σ, Q, q0, Qfin, ∆), w in Σ∗, k in N
solution is “yes” if w ∈ ℒ(ℳ) within k moves ; “no” otherwise

This problem is P-complete

Proof

Similar to the bounded halting problem for NTM, it is easy to show that this problem is in P. Next, let L be a language over Σ in P. Then, there exists TM ℳ_L with polynomial T(n) such that L = ℒ(ℳ_L) and for every w ∈ L, ℳ_L halts on w within T(|w|) running time. Take this TM ℳ_L, an (arbitrary) word w from Σ∗, and k = T(|w|), then w ∈ L if and only if ℳ_L halts on w within k moves. □
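The reduction used in this proof maps a word w to the instance made of the machine for L, the word w itself, and k = T(|w|). The Python sketch below illustrates it under simplifying assumptions: a deterministic single-tape machine stands in for the k-TM of the lecture, acceptance is taken to mean reaching a final state, and the names `accepts_within`, `reduce_to_bounded_halting` and the example machine `EVEN_A` are hypothetical.

```python
def accepts_within(delta, q0, finals, word, k, blank="#"):
    """Run a deterministic single-tape TM for at most k moves.

    delta : dict (state, symbol) -> (state, symbol, move), move in {-1, 0, +1}
    """
    state, head, tape = q0, 0, list(word)
    for _ in range(k):
        if state in finals:
            return True
        symbol = tape[head] if head < len(tape) else blank
        if (state, symbol) not in delta:
            return False                      # the machine is stuck: reject
        state, written, move = delta[(state, symbol)]
        if head == len(tape):
            tape.append(blank)                # extend the tape on demand
        tape[head] = written
        head = max(head + move, 0)
    return state in finals


def reduce_to_bounded_halting(machine, bound, word):
    """Map w to the bounded halting instance (machine, w, k) with k = T(|w|)."""
    return machine, word, bound(len(word))


# Hypothetical machine for L = { a^m | m even }, halting within |w| + 1 moves.
EVEN_A = (
    {("even", "a"): ("odd", "a", +1),
     ("odd", "a"): ("even", "a", +1),
     ("even", "#"): ("acc", "#", 0)},
    "even",
    {"acc"},
)


def in_language(word):
    """Decide w in L by solving the reduced bounded halting instance."""
    (delta, q0, finals), w, k = reduce_to_bounded_halting(
        EVEN_A, lambda n: n + 1, word)
    return accepts_within(delta, q0, finals, w, k)
```

The map itself only copies its input and evaluates the polynomial bound, so all the work of deciding membership is left to the bounded halting instance.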

The above example can be modified as an example of NP-complete problems, which means that : The bounded halting problem for NTM is NP-complete

Other NP-complete problems

Satisfiability problem for Boolean formulas [Cook 1971] :
instance is propositional Boolean formula φ
solution is “yes” if there is an assignment that satisfies φ ; “no” otherwise
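As a concrete reading of the satisfiability problem, the following Python sketch tries every assignment in turn. It assumes, for simplicity, that the formula is given in conjunctive normal form as a list of clauses of signed integers (an encoding chosen for this illustration, not taken from the lecture), and it runs in time exponential in the number of variables.

```python
from itertools import product


def satisfiable(clauses, num_vars):
    """Brute-force satisfiability test for a CNF formula.

    clauses : list of clauses, each a list of non-zero integers;
              literal v means "variable v is true", -v means "v is false"
    """
    for values in product([False, True], repeat=num_vars):
        # a clause is satisfied when at least one of its literals is true
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False
```

For instance, the formula (x1 ∨ x2) ∧ (¬x1 ∨ x2) ∧ ¬x2 is encoded as `[[1, 2], [-1, 2], [-2]]` and has no satisfying assignment.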

Hamilton circuit in directed graphs :
instance is directed graph G
solution is “yes” if there is a cycle that passes through every vertex in G exactly once ; “no” otherwise

Clique in graphs :
instance is directed graph G, k in N
solution is “yes” if G has a complete sub-graph (each pair of vertices in the sub-graph is connected by an edge) of size k ; “no” otherwise

Savitch’s theorem

PSPACE = NPSPACE

Proof

It suffices to show ⊇, because the reverse is trivial. Let L be a language in NPSPACE, then there exists k-NTM ℳ_1 = (Γ, Σ, Q, q0, Qfin, ∆) which halts on input w using at most S(|w|) space. Suppose k = 1, but this proof can be generalized to any k ≥ 1. We construct below k′-TM ℳ_2 that accepts L and that uses at most S(|w|)² space. Assume for ℳ_1 that it accepts w when and only when ℳ_1 cleans up the working tape (so that it contains only blanks) and halts with the final state qfin and the tape-head on the input tape placed at the leftmost position.

Observation : Since ℳ_1 uses for input w at most S(|w|) space, w ∈ L if and only if w is accepted within |Γ|^{S(|w|)} × (|Q| × S(|w|) × |w|) moves, where |Γ|^{S(|w|)} is the number of possible configurations of working tape and (|Q| × S(|w|) × |w|) is the number of possible combinations of state and positions of (two) tape-heads. Hence, w ∈ L if and only if w is accepted within 2^{c × S(|w|)} moves (c : constant and computable).

This observation brings the idea to define k′-TM ℳ_2 such that ℳ_2 accepts input w if w is accepted by ℳ_1 within 2^{c × S(|w|)} moves ; ℳ_2 rejects w otherwise. One should notice that ℳ_2 does not necessarily simulate ℳ_1 move by move, but it has to determine whether w is accepted by ℳ_1 within 2^{c × S(|w|)} moves.

Define the procedure reach(α, β, k) as follows : α, β are configurations (words of input and working tapes together with the current state and the locations of tape-heads) of ℳ_1, k is a non-negative integer, and C is initially the set of configurations of ℳ_1 whose length is less than or equal to S(|w|) + |w|.

    reach( α , β , k )
      if k = 0 & α = β
        then return “yes” ;
      if k = 1 & ℳ_1 moves from α to β in one step
        then return “yes” ;
      if k ≥ 2 then
        while ∃γ ∈ C do
          if reach( α, γ, ⌈k/2⌉ ) = “yes” &
             reach( γ, β, ⌊k/2⌋ ) = “yes”
            then return “yes” break ;
            else C := C − {γ} ;
        od
      return “no”

Let α0 and αfin be the initial and final configurations. Given w, by assumption, they are unique. If reach(α0, αfin, 2^{c × S(|w|)}) = “yes,” there exists a sequence α0, α1, . . . , αn (= αfin) such that ℳ_1 moves from α_i to α_{i+1}, or α_i = α_{i+1}, so w is accepted by ℳ_1. If reach(α0, αfin, 2^{c × S(|w|)}) = “no,” w is not accepted by ℳ_1. For reach(α0, αfin, 2^{c × S(|w|)}), the recursive calls in the program are nested at most log 2^{c × S(|w|)} = c × S(|w|) deep. Each stack frame requires at most 2 × S(|w|) space. Hence, this program can be realized by k′-TM using 2c × S(|w|)² space. □
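The divide-and-conquer idea behind reach can be sketched in Python as follows. This is an illustration under stated assumptions rather than the construction on a Turing machine: configurations are arbitrary hashable values from an explicit finite collection `configs`, the one-step relation is a hypothetical predicate `step(a, b)`, and every call iterates over `configs` afresh instead of shrinking a shared set. The sketch also accepts α = β when k = 1, so that k is an upper bound on the number of moves, in line with the sequence in which α_i = α_{i+1} is allowed.

```python
def reach(alpha, beta, k, step, configs):
    """True iff beta is reachable from alpha in at most k moves.

    step(a, b) tells whether the machine moves from a to b in one step;
    configs is the finite collection of all candidate configurations.
    """
    if k == 0:
        return alpha == beta
    if k == 1:
        return alpha == beta or step(alpha, beta)
    upper, lower = (k + 1) // 2, k // 2       # ceil(k/2) and floor(k/2)
    # try every configuration gamma as the midpoint of the computation
    return any(reach(alpha, gamma, upper, step, configs)
               and reach(gamma, beta, lower, step, configs)
               for gamma in configs)


# Hypothetical example: eight configurations 0..7, one move goes from i to i+1.
CONFIGS = range(8)


def next_config(a, b):
    return b == a + 1
```

Only the current pair of configurations and the counter have to be remembered at each level, and the depth of the recursion is logarithmic in k, which is where the space bound in the proof comes from.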

Corollary

P ⊆ NP ⊆ PSPACE = NPSPACE

Exercise

1. Show that if languages A, B ∈ P, then A ∪ B, A ∩ B, (A)^c, A · B ∈ P.

2. Show that if languages A, B ∈ NP, then A ∪ B, A ∩ B, A · B ∈ NP.

3. Show that for all languages A, B, C over Σ, A ≤_m^P A (reflexivity), A ≤_m^P B & B ≤_m^P C ⇒ A ≤_m^P C (transitivity), A ≤_m^P B ⇒ (A)^c ≤_m^P (B)^c.

4. Which of the following statements hold? Explain the reason why.
(a) A ≤_m^P B & B ∈ P ⇒ A ∈ P
(b) A ≤_m^P B & B ∈ NP ⇒ A ∈ NP
(c) A ≤_m^P B & B ∈ PSPACE ⇒ A ∈ PSPACE
(d) A ≤_m^P B ⇒ A ≤_m^P (B)^c

5. Select one of the problems listed under “Other NP-complete problems” (or find another example), and then show that this problem is NP-complete.

6. Show that P = NP if and only if a finite language is NP-complete.

7. Show that the following problem is NP-complete : Given a finite set U, subsets S_1, ···, S_n (⊆ U) and an integer k, the question if there exist k subsets S_{i1}, ···, S_{ik} such that S_{i1} ∪ ··· ∪ S_{ik} = U.

Appendix : What if P = NP ...

Most researchers believe P ≠ NP today. But if this long-standing open question “P = NP?” is positively solved, what happens ? :

It turns out
– NP = coNP, P = PH (the union of all complexity classes in the polynomial hierarchy), and thus, polynomial hierarchy is collapsed

Furthermore
– SSL, RSA, PGP are no longer secure infrastructures, so
– E-commerce business is exposed to serious menace of security flaws

On the other hand
– better predictions of weather, earthquakes and other natural phenomena are established
– mathematicians could be replaced by efficient theorem-discovering programs (Gödel 1956).

Additionally
– one who has solved the question first receives the prize of $1M from CMI [1]

[1] P vs NP: Millennium Prize Problems, Clay Mathematics Institute of Cambridge, USA, May 2000. Information available at http://www.claymath.org/millennium/

Copyright (version Jul-01-2009) © 2009 Hitoshi Ohsaki

National Institute of Advanced Industrial Science and Technology (AIST) – Senri-site, AIST Kansai.

Office: Shin-Senri Nishi 1-2-14 (MSK bldg. 5th floor), Toyonaka, Osaka 560-0083, Japan

URL: http://staff.aist.go.jp/hitoshi.ohsaki/

All rights reserved.

No part of this lecture material may be reproduced in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior consent of the author.