4.3. General Phrase-Structure Grammars
4. Turing Machines and R. E. and Context-Sensitive Languages

A phrase-structure grammar is a 4-tuple G = (N, T, S, P), where P ⊆ (N ∪ T)* × (N ∪ T)* is a finite semi-Thue system on N ∪ T such that |ℓ|_N ≥ 1 for all (ℓ → r) ∈ P (see Section 2).

Theorem 4.20
For each phrase-structure grammar G, the language L(G) is recursively enumerable.

Proof. For G = (N, T, S, P) we describe a 2-tape NTM M that accepts the language L(G): The input w is stored on tape 1, and on tape 2, M guesses a derivation of G, starting from S. As soon as a terminal word z has been generated on tape 2, M checks whether z = w holds. In the affirmative, M halts, while in the negative, M enters an infinite loop. Hence, L(M) = L(G).

Prof. Dr. F. Otto (Universität Kassel) Automata and Grammars 269 / 294

Theorem 4.21
If L is a recursively enumerable language, then there exists a phrase-structure grammar G such that L(G) = L.

Proof. Let L ⊆ Σ* be r.e., and let M = (Q, Σ, Γ, ✷, δ, q_0, q_1) be a 1-TM s.t. L(M) = L. We construct a grammar G = (N, Σ, A_1, P) by taking

  N := ((Σ ∪ {ε}) × Γ) ∪ Q ∪ {A_1, A_2, A_3},

and defining P as follows:

  (1) A_1 → q_0 A_2,
  (2) A_2 → [a,a] A_2 for all a ∈ Σ,
  (3) A_2 → A_3,
  (4) A_3 → [ε,✷] A_3,
  (5) A_3 → ε.

Using these productions G generates a sentential form

  q_0 [a_1,a_1][a_2,a_2]···[a_n,a_n][ε,✷]···[ε,✷]

for a word w = a_1 a_2 ··· a_n ∈ Σ*. This sentential form encodes the initial configuration of the TM M on input w.

Further, P contains the following productions for simulating M:

  (6) q[a,X] → [a,Y]p, if δ(q,X) = (p,Y,R),
  (7) [b,Z]q[a,X] → p[b,Z][a,Y] for all b ∈ Σ ∪ {ε} and Z ∈ Γ, if δ(q,X) = (p,Y,L),
  (8) q[a,X] → p[a,Y], if δ(q,X) = (p,Y,0).
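The simulating productions (6)–(8) can be generated mechanically from the transition function δ. A minimal sketch (the encoding of a composite symbol [a,X] as a Python tuple, the marker 'eps', and the function name are our own, not from the lecture):

```python
# Sketch: generating the simulating productions (6)-(8) of Theorem 4.21
# from a TM's transition function delta. A composite symbol [a, X] is
# encoded as a tuple (a, X); 'eps' stands for the empty first component.

def simulation_productions(delta, sigma, gamma):
    """delta: dict mapping (q, X) -> (p, Y, move) with move in {'R', 'L', '0'}.
    sigma: input alphabet, gamma: tape alphabet.
    Returns a list of productions (lhs, rhs), each side a tuple of symbols."""
    prods = []
    track = list(sigma) + ['eps']          # first components a in Sigma ∪ {eps}
    for (q, X), (p, Y, move) in delta.items():
        for a in track:
            if move == 'R':                # (6) q[a,X] -> [a,Y]p
                prods.append(((q, (a, X)), ((a, Y), p)))
            elif move == '0':              # (8) q[a,X] -> p[a,Y]
                prods.append(((q, (a, X)), (p, (a, Y))))
            else:                          # (7) [b,Z]q[a,X] -> p[b,Z][a,Y]
                for b in track:
                    for Z in gamma:
                        prods.append((((b, Z), q, (a, X)), (p, (b, Z), (a, Y))))
    return prods

# Example: one right-moving step delta(q0, '_') = (q1, '_', 'R')
delta = {('q0', '_'): ('q1', '_', 'R')}
prods = simulation_productions(delta, {'a'}, {'_', 'a'})
```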
If q_0 w ✷^m ⊢*_M u p v holds for some u, v ∈ Γ*, p ∈ Q, and m ≥ 0, then

  q_0 [a_1,a_1][a_2,a_2]···[a_n,a_n][ε,✷]^m →*_P [a_1,u_1]···[a_k,u_k] p [a_{k+1},v_1]···[a_n,v_{n−k}][ε,v_{n−k+1}]···[ε,v_{n+m−k}],

where u = u_1 u_2 … u_k and v = v_1 v_2 … v_{n+m−k}.

Finally, P also contains some productions for treating halting configurations:

  (9) [a,X] q_1 → q_1 a q_1,
      q_1 [a,X] → q_1 a q_1,   for all a ∈ Σ ∪ {ε} and X ∈ Γ,
      q_1 → ε.

If q_0 w ⊢*_M u q_1 v, that is, if w ∈ L(M), then:

  A_1 →* q_0 [a_1,a_1][a_2,a_2]···[a_n,a_n][ε,✷]^m
      →* [a_1,u_1]···[a_k,u_k] q_1 [a_{k+1},v_1]···[a_n,v_{n−k}]···[ε,v_{n+m−k}]
      →* a_1 a_2 ··· a_n = w,

which shows that L(M) ⊆ L(G).

Conversely, if w ∈ L(G), then we see from the construction above that M halts on input w, which implies that L(G) = L(M).

Corollary 4.22
A language L is recursively enumerable if and only if it is generated by a phrase-structure grammar.

From the various characterizations of the class of r.e. languages, we obtain the following closure properties.

Corollary 4.23
(a) The class RE is closed under union, intersection, product, Kleene star, reversal, and inverse morphisms.
(b) The class RE is not closed under complementation.

4.4. Context-Sensitive Languages

A grammar G = (N, T, S, P) is context-sensitive if each production (ℓ → r) ∈ P has the form αXβ → αuβ, where X ∈ N, α, β, u ∈ (N ∪ T)*, and u ≠ ε. In addition, G may contain the production (S → ε) if S does not occur on the right-hand side of any production.
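The derivation-guessing step in the proof of Theorem 4.20 can also be carried out deterministically, by enumerating the sentential forms of a phrase-structure grammar breadth-first: the search halts on words of L(G) and may diverge otherwise. A sketch (the function name and the cut-off parameter max_steps are our own; the cut-off only keeps the demonstration finite):

```python
# Sketch of the semi-decision procedure behind Theorem 4.20: enumerate
# all sentential forms of G breadth-first and accept when the input word
# appears. This replaces the nondeterministic guessing of the 2-tape NTM M.

def derives(productions, start, w, max_steps=10):
    """productions: list of (lhs, rhs) string pairs; start: start symbol.
    Returns True if w is derivable from start within max_steps BFS levels."""
    level, seen = {start}, {start}
    for _ in range(max_steps):
        nxt = set()
        for form in level:
            if form == w:
                return True
            for lhs, rhs in productions:
                i = form.find(lhs)
                while i != -1:              # rewrite every occurrence of lhs
                    new = form[:i] + rhs + form[i + len(lhs):]
                    if new not in seen:
                        seen.add(new)
                        nxt.add(new)
                    i = form.find(lhs, i + 1)
        level = nxt
    return w in level

# Tiny illustration with a context-free fragment: S -> aSb | ab,
# so exactly the words a^n b^n are derivable.
P = [("S", "aSb"), ("S", "ab")]
```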
A language L is called context-sensitive if there exists a context-sensitive grammar G such that L = L(G). By CSL(Σ) we denote the set of all context-sensitive languages on Σ, and CSL denotes the class of all context-sensitive languages.

A grammar G = (N, T, S, P) is called monotone if |ℓ| ≤ |r| holds for each production (ℓ → r) ∈ P. Also a monotone grammar may contain the production (S → ε) if S does not occur on the right-hand side of any production.

Theorem 4.24
(a) For each context-sensitive grammar G, there is a monotone grammar G′ such that L(G′) = L(G).
(b) For each monotone grammar G′, there is a context-sensitive grammar G such that L(G) = L(G′).

Proof.
(a): By definition, each context-sensitive grammar is monotone.
(b): Let G′ = (N′, T, S′, P′) be a monotone grammar. From G′ we construct G′′ = (N′′, T, S′, P′′) by taking

  N′′ := N′ ∪ {A_a | a ∈ T}  and  P′′ := h(P′) ∪ {A_a → a | a ∈ T},

where h is defined through A ↦ A (A ∈ N′) and a ↦ A_a (a ∈ T). G′′ is monotone, and all the new productions are context-sensitive. It remains to replace the productions in h(P′) by context-sensitive productions.

Let A_1 ··· A_m → B_1 ··· B_n (2 ≤ m ≤ n) be a production from h(P′), where A_i, B_j ∈ N′′. (Productions with m = 1 are already context-sensitive.) We introduce new nonterminals Z_1, Z_2, …, Z_m and replace the above production by the following ones:

  A_1 A_2 ··· A_m → Z_1 A_2 ··· A_m
  Z_1 A_2 ··· A_m → Z_1 Z_2 A_3 ··· A_m
  ⋮
  Z_1 Z_2 ··· Z_{m−1} A_m → Z_1 ··· Z_m B_{m+1} ··· B_n
  Z_1 ··· Z_m B_{m+1} ··· B_n → B_1 Z_2 ··· Z_m B_{m+1} ··· B_n
  ⋮
  B_1 ··· B_{m−1} Z_m B_{m+1} ··· B_n → B_1 ··· B_{m−1} B_m B_{m+1} ··· B_n.

The new productions are context-sensitive, and it is easily seen that they simulate the old production.
On the other hand, the new productions can only be used for this purpose, as the new nonterminals Z_1, Z_2, …, Z_m do not occur in any other production. By repeating the process above for each production from h(P′), we obtain a context-sensitive grammar G such that L(G) = L(G′).

Example. Let G be the following monotone grammar:

  G = ({S, B}, {a, b, c}, S, {S → aSBc, S → abc, cB → Bc, bB → bb}).

We claim that L(G) = L := {a^n b^n c^n | n ≥ 1}.

Claim 1: L ⊆ L(G).

Proof of Claim 1. We show that a^n b^n c^n ∈ L(G) by induction on n. For n = 1, we have S → abc. Assume that S →* a^n b^n c^n has been shown for some n ≥ 1. We consider the following derivation:

  S → aSBc →* a · a^n b^n c^n · Bc →* a^{n+1} b^n B c^{n+1} → a^{n+1} b^{n+1} c^{n+1}.

Claim 2: L(G) ⊆ L.

Proof of Claim 2. By applying production 1 repeatedly, we obtain a sentential form a^n S (Bc)^n, which is rewritten into a sentential form a^n S α by production 3, where α ∈ {B, c}^+ and |α|_B = |α|_c = n. In order to get rid of S, production 2 must be applied, that is, we obtain a^n S α → a^{n+1} b c α. To get rid of all nonterminals B in α, all occurrences of c must be moved to the right, and then all B are rewritten into b by production 4, that is, a^{n+1} b c α →* a^{n+1} b B^n c^{n+1} →* a^{n+1} b^{n+1} c^{n+1}. Hence, L(G) ⊆ L.

Together, Claims 1 and 2 show that L(G) = L.

Each context-free grammar in CNF (Theorem 3.9) is context-sensitive. On the other hand, {a^n b^n c^n | n ≥ 1} ∉ CFL (see the first example after Theorem 3.14).

Corollary 4.25
CFL ⊊ CSL.
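The derivation of Claim 1 can be replayed concretely: starting from S and applying the four productions of G in the order used there yields a^3 b^3 c^3. A small sketch (the helper apply_leftmost is ours, not part of the lecture notes):

```python
# Replaying the Claim 1 derivation of a^3 b^3 c^3 in the monotone grammar
# G = ({S,B}, {a,b,c}, S, {S -> aSBc, S -> abc, cB -> Bc, bB -> bb}).

def apply_leftmost(form, lhs, rhs):
    """Rewrite the leftmost occurrence of lhs in form; error if absent."""
    i = form.index(lhs)
    return form[:i] + rhs + form[i + len(lhs):]

form = "S"
form = apply_leftmost(form, "S", "aSBc")     # S => aSBc
form = apply_leftmost(form, "S", "aSBc")     # => aaSBcBc
form = apply_leftmost(form, "S", "abc")      # => aaabcBcBc
while "cB" in form:                          # move every B left past the c's
    form = apply_leftmost(form, "cB", "Bc")  # => aaabBBccc
while "bB" in form:                          # rewrite the B's into b's
    form = apply_leftmost(form, "bB", "bb")
print(form)                                  # prints aaabbbccc
```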
A grammar G = (N, T, S, P) is in Kuroda Normal Form if it only contains productions of the following forms:

  (A → a), (A → BC), (AB → CD),

where a ∈ T and A, B, C, D ∈ N. With respect to the production (S → ε), we have the same restriction as before.

Theorem 4.26
Given a context-sensitive grammar G, one can effectively construct an equivalent context-sensitive grammar G′ that is in Kuroda Normal Form.
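Whether a grammar is in Kuroda Normal Form can be checked mechanically by inspecting the shape of each production. A sketch (the representation of productions as pairs of symbol tuples and the function name are ours; the special case S → ε is omitted):

```python
# Sketch: checking whether productions are in Kuroda Normal Form, i.e.
# of the shapes A -> a, A -> BC, or AB -> CD with a in T and A,B,C,D in N.
# The exceptional production S -> eps is not handled here.

def in_kuroda_nf(productions, N, T):
    """productions: list of (lhs, rhs) pairs, each side a tuple of symbols."""
    for lhs, rhs in productions:
        if len(lhs) == 1 and lhs[0] in N:
            # A -> a  or  A -> BC
            ok = (len(rhs) == 1 and rhs[0] in T) or \
                 (len(rhs) == 2 and all(s in N for s in rhs))
        elif len(lhs) == 2 and all(s in N for s in lhs):
            # AB -> CD
            ok = len(rhs) == 2 and all(s in N for s in rhs)
        else:
            ok = False
        if not ok:
            return False
    return True

N, T = {"S", "A", "B"}, {"a", "b"}
P = [(("S",), ("A", "B")), (("A", "B"), ("B", "A")), (("A",), ("a",))]
```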