What Was Wrong with the ?

Makoto Kanazawa Hosei University

Chomsky Hierarchy of Formal Languages

recursively enumerable

context- sensitive

context- free

regular

Enduring impact of Chomsky’s work on theoretical computer science. 15 pages devoted to the Chomsky hierarchy.

Hopcroft and Ullman 1979 No mention of the Chomsky hierarchy. 450 INDEX The same with the latest edition of Carmichael, R. D., 444 CNF-formula, 302 Cartesian product, 6, 46 Co-Turing-recognizableHopcroft, language, 209 Motwani, and Ullman (2006). CD-ROM, 349 Cobham, Alan, 444 Certificate, 293 Coefficient, 183 CFG, see Context-free grammar Coin-flip step, 396 CFL, see Context-free language Complement operation, 4 Chaitin, Gregory J., 264 Completed rule, 140 Chandra, Ashok, 444 Complexity class Characteristic sequence, 206 ASPACE(f(n)),410 Checkers, game of, 348 ATIME(t(n)),410 Chernoff bound, 398 BPP,397 Chess, game of, 348 coNL,354 Chinese remainder theorem, 401 coNP,297 , 108–111, 158, EXPSPACE,368 198, 291 EXPTIME,336 Chomsky, Noam, 444 IP,417 Church, Alonzo, 3, 183, 255 L,349 Church–Turing thesis, 183–184, 281 NC,430 CIRCUIT-SAT,386 NL,349 Circuit-satisfiability problem, 386 NP,292–298 CIRCUIT-VALUE,432 NPSPACE,336 Circular definition, 65 NSPACE(f(n)),332 Clause, 302 NTIME(f(n)),295 Clique, 28, 296 P,284–291,297–298 CLIQUE,296 PH,414 Sipser 2013Closed under, 45 PSPACE,336 Closure under complementation RP,403 context-free languages, non-, 154 SPACE(f(n)),332 deterministic context-free TIME(f(n)),279 languages, 133 ZPP,439 P,322 Complexity theory, 2 regular languages, 85 Composite number, 293, 399 Closure under concatenation Compositeness witness, 401 context-free languages, 158 COMPOSITES,293 NP,322 Compressible string, 267 P,322 theory, 3 regular languages, 47, 60 decidability and undecidability, Closure under intersection 193–210 context-free languages, non-, 154 recursion theorem, 245–252 regular languages, 46 reducibility, 215–239N: finite alphabet of nonterminal Semi-Thue System/StringClosure under Rewriting star Turing machines, 165–182 context-free languages, 158 Computable function, 234 System NP,327 Computation history symbols P,327 context-free languages, 225–226 regular languages, 62 defined, 220 Closure under union linear bounded automata,Σ: finite alphabet of terminal symbols context-free languages, 158 221–225 Production α → βNP,322 α, β ∈ (N ∪ Σ)* Post Correspondence Problem, P,322 227–233 regular languages, 45, 59 reducibility, 220–233 One-step rewriting γαδ ⇒ γβδ

Language L(G) = { w ∈ Σ* ∣ S ⇒* w }

Type 0 unrestricted

Type 1 |α| ≤ |β| Type 2 α ∈ N Type 3 α ∈ N, β ∈ Σ N

LBA: linear-bounded automata = rewriting machine systems models NSPACE(n) Turing machines PDA: pushdown automata type 0 recursively Turing machines enumerable FA: finite automata

context- LBA type 1 sensitive

type 2 context- free PDA

type 3 regular FA recursively Context-sensitive languages are enumerable type 0 computationally very complex. Some context-sensitive languages are type 1 context- sensitive PSPACE-complete.

PSPACE-complete

context- type 2 free regular type 3

• Context-sensitive languages are computationally very complex.

• There is a huge gap between context-free and context- sensitive.

• Many intermediate classes of languages have been considered in theory.

• In particular, “mildly context-sensitive” grammars have been investigated by computational linguists.

Indexed Grammars

Hopcroft and Ullman 1979 recursively What type of rewriting system is an enumerable type 0 ?? type 1 context- sensitive

?? indexed

context- type 2 free regular type 3

Type 0

Production α → β

One-step rewriting γαδ ⇒ γβδ

Indexed

Production T → ABC

One-step rewriting Tfg ⇒ AfgBfgCfg

An indexed grammar is not an instance of a type 0 grammar. recursively Indexed languages are also sort of enumerable type 0 complex.

type 1 context- sensitive

indexed

NP-complete

context- type 2 free regular type 3

68 STUART M. SHffiBER 68 STUART M. SHffiBER 2. Robert Berwick and Amy Weinberg. The Grammatical Basis of Linguistic Performance: Language Use and Acquisition. MIT Press, Cambridge, Massachusetts, 1984. APPLICABILITYGERALD GAZDAR OF INDEXED GRAMMARS 71 70 GERALD GAZDAR2. Robert Berwick and Amy Weinberg. The Grammatical Basis of Linguistic Performance: Gazdar 1988 3. JoanLanguage Bresnan, Use editor. and Acquisition. The Mental MIT Representation Press, Cambridge, of Grammatical Massachusetts, Relations. 1984. MIT Press, GERALD GAZDAR UNIVERSITY OF SUSSEX push items onto, pop3. JoanCambridge, items Bresnan, from,Massachusetts, editor. andThe Mental 1982.copy Representation the stack. of What Grammatical we Relations.end up MIT Press, of orthographic conventions: nonterminal symbols are indicated by upper 4. LucaCambridge, Cardelli Massachusetts, and Peter 1982.Wegner. Understanding types, data abstraction and UNIVERSITY OF SUSSEX with now is no longer4. polymorphism.Luca equivalent Cardelli Draft andto thepaper,Peter CF-PSGs 1985.Wegner. Understanding but is significantly types, data more abstraction and case letters (A, B, C),SCHOOL terminal OF SOCIALsymbols SCIENCES by lower case letters (a, b, e), 5. Noampolymorphism. Chomsky. Aspects Draft paper, of the 1985.Theory of Syntax. MIT Press, Cambridge, Massachusetts, possibly empty stringsSCHOOL of terminals OF SOCIAL and nonterminals SCIENCES by W, W , W , etc., powerful, namely 5.the Noam1965. indexed Chomsky. Aspectsgrammars of the Theory (Aho, of Syntax. 1968). MIT Press,This Cambridge, class Massachusetts,of BRIGHTON 1 2 grammars has been6. Noamalluded1965. Chomsky. to aLectures number on Government of times and in Binding.the recent Foris Publications,linguistic Dordrecht, indices by lower case italic lettersBRIGHTON k), and stacks of indices by square Holland, 1982. 6. . Lectures on Government and Binding. Foris Publications, Dordrecht, brackets and periods ([], [ .. ], [i, .. ]) where [i, .. ] is a stack whose topmost literature: by Klein7. Steven Holland,(1981) Fortune, 1982. in Daniel connection Leivant, and Michaelwith O'Donnell.nested Thecomparative Expressiveness of Simple and Second Order Type Structures. Research Report RC 8542, IBM Thomas J. Watson 7. Steven Fortune, Daniel Leivant, and Michael O'Donnell. The Expressiveness of Simple index is i, [] is an empty, and [.. ] is a possibly empty stack of indices. As for constructions, by DahlResearch (1982) Center, in connectionYorktown Heights, with New topicalised York, 1980. pronouns, by and Second Order Type Structures. Research Report RC 8542, IBM Thomas J. Watson the rules themselves, Aho (1968)APPLICABILITY uses one notation, and Hopcroft and Engdahl (1982) and8. Gerald ResearchGazdar Gazdar. Center, (1982)Phrase Yorktown Structurein Heights,connection Grammar, New York, pages with1980. 131-186. Scandinavian D. Reidel, Dordrecht, APPLICABILITY 8. GeraldHolland, Gazdar. 1982.] , pages 131-186. D. Reidel, Dordrecht, Ullman (1979) use another.OF INDEXED The notation GRAMMARS used below is essentially just a unbounded dependencies,9. Gerald Gazdar, by Huybregts Ewan Klein, Geoffrey(1984) K. ilDdPullum, Pulman and Ivan andA. Sag. Ritchie Generalized Phrase Holland, 1982.] OF INDEXED GRAMMARS (1984) in connection9. GeraldStructure with Gazdar, Grammar. Dutch, Ewan Blackwell Klein, by Geoffrey MarshPublishing, K. Pullum, andOxford, Parteeand England, Ivan A. and(1984) Sag. Harvard Generalized in University Phrase redundant, and intendedlyTO NATURAL more perspicuous, LANGUAGES variant of that employed by connection with variablePress,Structure binding, Cambridge, Grammar. and Massachusetts, Blackwell doubtless Publishing, 1985. elsewhere Oxford, England, as well. and Harvard University Hopcroft and Ullman. TO NATURAL LANGUAGES 10. RonaldPress, Kaplan.Cambridge, Personal Massachusetts, communication, 1985. 1985. Indexed grammars11.10. MartinRonald fall Kaplan.Kay.in Unificationbetween Personal communication,Grammar. CF-PSGs Technical 1985. and Report, context-sensitive Xerox Palo Alto Research In the standard formulations, an indexed grammar can contain rules of Center, Palo Alto, California, 1983. APPLICABILITY OF INDEXED GRAMMARS 73 grammars in their generative11. Martin Kay. capacity. Unification Every Grammar. context-free Technical Report, language Xerox Palo(CPL) Alto Research three72 differentGERALD sorts: GAZDAR 12. FernandoCenter, Palo C. N. Alto, Pereira California, and Stuart 1983. M. Shieber. The semantics of grammar formalisms seen as computer languages. In of the Tenthn International Conference on is an indexed language12. Fernando (IL), C.but N. notPereira conversely. and Stuart M. ThusShieber. anbne The semanticsis an of grammarIL, for formalisms stack to all nonterminal .daughters 1. in all rule* types, and (b) restricts (4) 51] Computationalseen as computer Linguistics, languages. StanfoId In University,of Stanford, the Tenth California, International 2-7 JulyConference 1984. on (1) 1. A[.. ] -> W[ .. ] INTRODUCI10N example, but it is not13. Fernando a CPL. C. And N. Pereira every and ILDavid is H.a D.context-sensitive Warren. Parsing as deduction. language, In 1. INTRODUCI10N* Computational Linguistics, StanfoId University, Stanford, California, 2-7 July 1984. additionsIfi we1. to take theA[ stack..the ] class-> to B[i,casesof context-free .. ] where the phrase mother structure category grammars has but (CF- a single but not conversely.13. ofUntilFernandothe 21st recently, AnnualC. N. Pereira Meeling there and of Davidthe Associationwere H. D. noWarren. for goodComputatiolJal Parsing arguments as deduction. Linguistics, Into pages 137- 144,ofthe Massachusetts 21st Annual Meeling Institute of the of AssociationTechnology, for Cambridge, ComputatiolJal Massachusetts, Linguistics, 15-17pages June137- daughter.PSGs)Ifi i.i.we Neitherand takeA[i, modify the.. ]aspect class ->it so W[of thatof ..context-free ] the (i) grammarsdefinition phrase are appears structureallowed to togrammars bemake essential, use (CF- of and suggest that the natural1983.144, Massachusetts languages Institute (NLs) of Technology,fell outside Cambridge, the Massachusetts,CFLs (see 15-17 June thingsfmitePSGs) are feature probablyand modify systems only it so anddone that (ii) that(i) rules grammars way are in permittedorder are allowed to facilitateto manipulateto make doing use the ofproofs. 14. Ritchie,1983. Graeme. The Computational Complexity of Sentence Derivation in Functional I shall refer to rules that have one or other of these three forms as 8 Pullum and Gazdar, Unification1982, for Grammar. defense In ofProceedings this claim), of the but11th workInternational by CulyConference on Supposefeaturesfmite we feature in allow arbitrary systemsthe ways,rule and typesthen (ii) what shownrules we are endin (2) permittedup within addition is equivalentto manipulate to those to what thelisted in 14. Ritchie, Graeme. The Computational Complexity of Sentence Derivation in Functional H&U rules. The first type of rule simply copies the stack to all (1985), Huybregts (1984)ComputationalUnification and Grammar. ShieberLinguistics, In pages(1985) Proceedings 5!W-586, does Universityof thenow 11th ofindicate Bonn,International Bonn, rather West-Germany, Conference on (1): wefeatures started in out arbitrary with. Suppose,ways, then however, what we that end we up takewith theis equivalent class of context- to what AugustComputational 1986. Linguistics, pages 5!W-586, University of Bonn, Bonn, West-Germany, strongly that they15. do. William However, C. Rounds no and NLRobert phenomena Kasper. A Complete are knownLogical Calculus which for RecoId nonterminalfreewe startedphrase daughters. outstructure with. Suppose, grammarsThe second however, and typemodify that of weit rule so take that pushes the (i) class grammars a newof context- index are onto August 1986. (2) i. would imply that 15.NLs StructuresWilliam exist C. Representing Roundswhich andfall LinguisticRobert outside Kasper. Information. the A Completeindexed In Proceedings Logical class Calculusof the(see 15th for Annual RecoId the stackallowedfree phrasehanded toA[ employ ..structure ] down-> a single toWiDgrammars its designated B[unique.. ] andW2 nonterminalDmodify feature it that so that takesdaughter. (i) stacks grammars Andof items arethe third SymposiumStructures onRepresenting Logic in ComputerLinguistic Science, Information. Massachusetts In Proceedings Institute of ofthe Technology, 15th Annual drawnallowedii. from to employsome finite a single set asdesignated its values, feature and (ii) that rules takes are stacks permitted of items to Gazdar and Pullum, 1985,Cambridge, for Massachussetts,a survey). June 1986. type of rule popsA[) an-> index WiD off B[i, the .. ] stackWZp and distributes what is left to its Symposium on Logic in Computer Science, Massachusetts Institute of Technology, drawn from some finite set as its values, and (ii) rules are permitted to Hopcroft and Ullman16. StuartCambridge, M.write Shieber. Massachussetts, that The design"of June ofthe 1986.a computer many languagegeneralizations for linguistic information. of In nonterminaliii. daughters.A[l, .. ] -> WIn1 the[]E[ rules, .. ] W2 UA and B are nonterminal symbols 16. Stuart M. Shieber.Of the TenthThe design International of a computer Conference language on Computational for linguistic information. Linguistics, In Stanford University, Stanford, California, 2-7 July 1984. (not necessarily* This paper originates distinct) in aand talk givenW is to a the string Workshop ofterminal on Scandinavian and/or nonterminaland the context-free grammars that haveOf the been Tenth proposed,International Conferencea class calledon Computational 'indexed' Linguistics, Rules of type (2.i) would allow the stack to be carried down onto a appears the most 17.natural, StuartStanford M. in University,Shieber. that itAn Stanford, arises Introduction California,in ato wide Unification-Based 2-7 July variety 1984. of Approaches contexts" to Grammar. symbols.Theory* This ofA paper Grammar compound originates held inin aTrondheim talksymbol given to in ofthethe WorkshopSummerthe form of on 1982 Scandinavian A[ and.. ]orgamzed means byand Lars thatthe the 17. VolumeStuart M.4 of Shieber. Lecture NoteAn Introduction Series, Center to forUnification-Based the Study of Language Approaches and Information, to Grammar. singleHellan.Theory nonterminal I ofam Grammar indebted designated heldto Jens in TrondheimErik Fenstaddaughter, in forthe drawingSummer and rulesmy of attention1982 of and type toorgamzed an (2.ii) error by inand Larsmy (2.iii) (1979:389). One purposeStanford,Volume 4 California,of of Lecturethe 1986. Note present Series, Centerpaper for isthe Studyto make of Language indexed and Information, nonterminalApresentationHellan. I am there. indebted bears A subsequent tothe Jens stack Erik version Fenstad [.. ]. was A forgivencompound drawing to the my Santa attention Cruz Workshopto ofan errorthe on formin themy W[ .• ] 18. Stuart M. Shieber. A simple reconstruction of GPSG. In Proceedings of the 11th wouldMathematics also allow of Grammars such a anddesignated Languages organized daughter by Geoffrey to have K. Pullumits stack in June incremented 1985. I . grammars more accessibleStanford, toCalifornia, linguists 1986. and computational linguists, and standspresentation for a there.string A subsequent of terminal version was and/or given to thenonterminal Santa Cruz Workshop symbols on the each 18. InternationalStuart M. Shieber. Conference A simpleon Computational reconstruction Linguistics, of GPSG. University In Proceedings of Bonn, of the Bonn, 11th or decremented,amMathematics grateful ofto Grammars Fernandorespectively. andPereira, Languages As Carl shown organized Pollard, in byand the Geoffrey Stuart appendix, K. Shieber Pullum permittinginfor June relevant 1985. I . such more directly relevantWestInternational Germany,to their Conference 25-29 concerns. August on 1986.Computational It assumes Linguistics, some University passive of Bonn, Bonn, nonterminalconversationsam grateful symbol-of to BillFernando Marsh which andPereira, Geoff bears Carl Pullum thePollard, stackfor theirand [ ..detailed ].Stuart Terminal Shiebercomments symbolsfor on relevantearlier cannot additionaldrafts, and rule to'Dikran types Karageuzian has no andconsequences his colleagues for for the workthe theyclass did of on languagesthe diagrams such 19. StuartWest M.Germany, Shieber. 25-29 Using August restriction 1986. to extend parsing algorithms for complex-feature-r conversations to Billif Marsh W =and BcDe,Geoff Pullum for their detailed= B[ .. comments ]e on earlier bear Thus, then W[ •. ] D[.. ]e. Stacks of B[z] competence in mathematical19. basedStuart formalisms.M. Shieber. linguistics, InUsing restriction but notof tothe very extend 22nd much. Annualparsing InalgorithmsMeeting accord of for the withcomplex-feature- Association for grammarsanddrafts, figures and can foundto'Dikran generate. below. Karageuzian Support for and work his colleagueson this paper for was the workprovided they by did the on ESRC the diagrams (UK), by the purpose just mentioned,Computationalbased formalisms. I shall Linguistics, Inpresent University grammarsof of the Chicago, 22nd informally Annual Chicago, Meeting Dlinois, as of Julysets the 1985. Association of for indicesandgrants figuresare to ,;thus Stanfordfound below.associated University Support from forwith thework Nationalnonterminals on this Sciencepaper was Foundation providedand transmitted (BNs.8102406)by the ESRC (UK), andby rules. b 20. PatrickComputational H. Winston Linguistics, and Karen University A. Prendergast. of Chicago, The Chicago, AI Business: Dlinois, Commercial July 1985. Uses of The thebyproductions grantsSloan Foundationto Stanford in an Universityby Indexed the Center from Grammar forthe the National Study can Scienceof be Language written Foundation and as Information, (BNs.8102406)(refer [Gazdar and andby 85a], the rules, rather than as ArtifiCialthe official Intelligence. n-tuples. MIT Press, The Cambridge, start symbolMassachusetts, will 1984. always be r Indeed,grants the from iLs the Sloan'are Foundation exactly characterised and System Development by a classFoundation of automata to the Center known for as 20. Patrick H. Winston and Karen A. Prendergast. The AI Business: Commercial Uses of the Sloan Foundation by the Center for the Study of Language and Information, and by ArtifiCial Intelligence. MIT Press, Cambridge, Massachusetts, 1984. original(one-wayAdvanced grantsdefinition fromnondeterministic) Study the of inSloan' Indexed the BehavioralFoundation Grammars Sciences. andnested System appears Developmentstack in [Ahoautomata Foundation 68]) (Aho, to the Center 1969). for The "S" or and the relevant sets of terminals, indices, and nonterminals can I b r indicesAdvanced that Studymake in upthe3. Behavioral stacksSOME are Sciences. EXAMPLE drawn from GRAMMARS some finite vocabulary though simply be inferred from looking at what appears in the rules. \ I 69 As we go down the tree, nonterminal as are produced, and index as are the.Y'[··]--+ stacks ..Ydi, themselves .. ] .....Yn[i, are.. ] (push) not boundedor 69 in size. Any upper bound on the TheU. Reylebest and way C. Rohrerto see (eds.), how indexed grammars work is to look at an loaded onto the stack. WheD we get to the middle, signalled by the change size of stacks restricts the class of grammars to the context-free class. example,NaturalU. Reyle such Language and asC. Rohrer(3), Parsing which (eds.), and Linguistic shows Theories,a grammar 69-94. for aDbD. of category from A to B, then nonterminal bs are produced, and index as .Y [i, © ..1988 ]--+.\ by 1 [D. •• ] Reidel.....Y PUblishingn [ •• ] Company.or r TheNatural types Language of.rule Parsing shownVijayashanker and Linguisticin (1) are Theories, just 69-94.those 1987 permitted in Hopcroft and are removed from the stack. The stack thus records how many 2. THE FORM OF RULES: A STACK-ORIENTED NOTATION r (3)© 1988S[ .• ]by D. ReidelaA[z, PUblishing.• ] Company. Ullman's defInition-> of the class of grammars--what I shall refer to as the nonterminal as were originally produced, and this record can then be used standardA[ definition... ] -> aA[a, Notice.. ] that this defInition (a) copies all or most of the In exhibiting schematic rules and trees below, I shall maintain a number r The productions in an Indexed Grammar can be written as (refer [Gazdar 85a], the to produce exactly as many nonterminal bs. One slight complication is the A[.. ] -> B[.. ] r use of the end-marker index z. This is necessary because the standard (!fIWt original definition of Indexed Grammars appears in [Aho 68]) In an IndexedB[a, .. ] ->Grammar, bB[.. ] a stack of svmbols. (called. the indices) is associated \vith formulations of index grammars allow a non-empty stack to simply rl B[z, .. ] -> b \ nontenninals during derivations. [.. ] is used to represent the present stack associated disappear if the category the· stack appears on only has terminal .Y'[··]--+ ..Ydi, .. ] .....Yn[i, .. ] (push) or withThis a nonterminal. looks more In thecomplicated first production than the it actuallyparent (th.s is. ofAll the it production)does is generate distributes daughters. We could avoid the need for end-marker indices of this kind by r trees such as that shown in (4): requiring that the stack distributed over daughters must be empty when .Y [i, .. ]--+.\ 1 [ •• ] .....Y n [ •• ] or r copies of its stack to its children after pushing the symbol i at the top of the stack. Thus, every daughter is nonterminal. As shown in the appendix, such a r we can see how the stack information is shared by all the symbols in me right-hand constraint would not affect the class of languages generated, but it does r simplify the formulation of certain kinds of grammar. Thus we could of the production. (!fIWt simplify the grammar in (3) above by replacing each occurrence of z with In an Indexed Grammar, a stack of svmbols (called. the indices) is associated \vith r . Q. When I come to discuss certain abstract tree configurations later in the l In an LIG the productions have the fonn nontenninals during derivations. [.. ] is used to represent the present stack associated paper, I shall simply ignore end-marker indices. rr with a nonterminal. In the first production the parent (th.s of the production) distributes I Notice that (4) is a right-linear (and hence finite state) tree, and that copies of its stack to its children after pushing the symbol i at the top of the stack. Thus, r .."'( [i, .. ]--+ ."'( 1 [] • • •.,:r j [ .. ] . . .•"'( n [] (pop) or r we can see how the stack information is shared by all the symbols in me right-hand rr of the production. I LIGIn andiffers LIG fromthe productions IG in that thehave stack the isfonn passed on to only one of the children. [] corre- rr sponds to a new but empty stack. The productions of LIG' s can be generalized [0 be of .J... ":"<'? r the form r .."'( [i, .. ]--+ ."'( 1 [] • • •.,:r j [ .. ] . . .•"'( n [] (pop) or rr r LIG differs from IG in that the stack is passed on to only one of the children. [] corre- r sponds to a new but empty stack. The productions of LIG' s can be generalized [0 be of rr the form r r 79 rr r

r 79 r TAG: tree-adjoining grammar; LIG: “Convergence” of Mildly Context- Sensitive Grammar Formalisms linear indexed grammar; HG:

TAG ≡ LIG ≡ HG

Vijay-Shanker and Weir 1994

recursively There is a book where the author calls enumerable type 0 the linear indexed grammars “type 1.5”. type 1 context- sensitive

type 1.5? linear indexed

context- type 2 free regular type 3

Type 0

Production α → β

One-step rewriting γαδ ⇒ γβδ

Linear indexed

Production T[] → AB[]C

One-step rewriting T[ fg] ⇒ AB[ fg]C

A linear indexed grammar is not an instance of a type 0 grammar. recursively There are actually almost no other enumerable type 0 grammar formalisms that are instances of Chomsky’s type 0 grammars. type 1 context- sensitive

indexed

linear indexed

context- type 2 free regular type 3

§ 1. GENERATIVE GRAMMARS AND LINGUISTIC COMPETENCE 9

ments may provide useful, in fact, compelling evidence for such a theory. To avoid what has been a continuing misunderstanding, it is perhaps worth while to reiterate that a is not a model for a speaker or a hearer. It attempts to characterize in the most neutral possible terms the knowledge of the language that provides the basis for actual use of language by a speaker- hearer. When we speak of a grammar as generating a sentence with a certain structural description, we mean simply that the grammar assigns this structural description to the sentence. How did Chomsky come to use semi- When we say that a sentence has a certain derivation with respect to a particular generative grammar, we say nothing about how Thuethe speaker→ or Posthearer might proceed,→Chomsky in some practical or Thue systems as the most general

Tan JouHNLefficient ow SmBouc way, LOGIC to construct such a derivation. These questions Volume 12,belong Number 1,to March the 1947 theory of language use — the theory of per- type of grammars? formance. No doubt, a reasonable model of language use will incorporate, as a basic component, the generative grammar that I believe Post (1947) coined the term expresses RECURSIVE the speaker-hearer's UNSOLVABILITY knowledge OF A PROBLEMof the language; OF THUE but this generative grammar does EMIL not, L. POSTin itself, prescribe the char- “semi-Thue system”. Alonzoacter Church or functioning suggested of to a theperceptual writer modelthat a orcertain a model problem of speech of Thue [6]' mightproduction. be proved unsolvableFor various by theattempts methods to ofclarify [5]. Wethis proceed point, to seeprove the Chomsky (1957), Gleason (1961), Miller and Chomsky (1963), and problem recursively unsolvable,Chomsky that is, unsolvable 1965 in the sense of Church [1], but bymany a method other meetingpublications. the special needs of the problem. Thue'sConfusion (general) problemover this is matter the following. has been Given sufficiently a finite persistent set of symbols to al, suggest that a terminological change might be in order. Never- a2, ... , a, , we consider arbitrary strings (Zeichenreihen) on those symbols, that is,theless, rows ofI think symbols that eachthe termof which "generative is in the grammar" given set. is Nullcompletely strings are in- appropriate, and have therefore continued to use it. The term cluded. We further have given a finite set of pairs of corresponding strings on "generate" is familiar in the sense intended here in logic, the ai's, (Al , B1), (A2 , B2), , I (An , B,). A string R is said to be a substring particularly in Post's theory of combinatorial systems. Further- of a string S if S can be written in the form URV, that is, S consists of the letters, more, "generate" seems to be the most appropriate translation in order of occurrence, of some string U, followed by the letters of R, followed by for Humboldt's term erzeugcn, which he frequently uses, it seems, the letters of some string V. Strings P and Q are then said to be similar if Q in essentially the sense here intended. Since this use of the term can be obtained from P by replacing a substring Ai or Bi of P by its correspond- "generate" is well established both in logic and in the tradition FORMALent Bi, Ai. REDUCTIONSClearly, if P and OFQ are THE similar, GENERAL Q and P COMBINATORIALare similar. Finally, P of linguistic theory, I can see no reason for a revision of and Q are said to be equivalent DECISION if there PROBLEM.* is a finite set R1 , R2, * * *, R, of strings on terminology. a,, * * *, a, such that in the sequence of strings P, R1, R2, ... , RX Q each string except the last is similar By toEMIL the L.following POST. string. It is readily seen that this relation between strings on a,, * * *, a, ,is indeed an equivalence relation. Thue's problem is then the problem of determining for arbitrarily given strings A, B on al, * * *, a;, whether, or no, A and B are equivalent. 1. Introduction. It is not new to the literature that the usual form of This problem, at least for the writer, is more readily placed if it is restated a symbolic in terms logic of a withspecial its form parenthesis of the canonical notation systems and infinite of [3]. set In ofthat variables notation, can be transformedstrings C and Dinto are similarone in ifwhich D can thebe obtained enunciations, from C i.by e., applying formulas to Cof one the system, of the arefollowing finite operations:sequences of letters,' the different letters constituting a once-and-for-all PAiQ given produces finite PBQ,set. If PBQ the primitiveproduces lettersPAQ, iof = such1, 2, a *system , n. (1) are represented In these operations by a,, thea2, operational, ap, an arbitrary variables P, enunciation Q represent arbitrary of the systemstrings will take the Strings form A andai, Ba2* will a,,then i be- equivalent1, 2, 3, if, Bij can1, be2, obtained , 1. In from describing A by starting the with A, and applying in turn a finite sequence of operations (1). That is, A Postbasis of such a systemCanonical it is convenient to use new Systems letters to represent finite and B are equivalent if B is an assertion in the "canonical system"2 with primi- sequences tive assertion of the A andabove operations primitive (1). Thue'sletters. general If then problem A, B, thus * , becomesE represent the

the decision sequences FORMALproblem ai, forREDUCTIONS a the2 .class . . ofa,,OF all THEaj, canonical aj2GENERAL ... systems aj COMBINATORIAL, * of* *,this a)t1 "Thue a,2 type." * a., respectively, ABR ThisE will general represent problem could the DECISION easilysequence be PROBLEM.*proved ail arecursively 2 **aO aj,unsolvable aj2 aj.- if, a,1 instead am2 of the pair of operations for each i of (1), we merely had the first operation of * .* am,,. each pair.3 In fact, by direct methods By EMIL such L. POST. as those of [31, we easily reduce the We shall say that such a system is in canonical form if its basis has the decision problem of an arbitrary "normal system" [3] to the decision problem following of such structure.2a system of "semi-ThueThe primitive type," assertions the known of recursive the system unsolvability are a specified of the 1. Introduction. It is not new to the literature that the usual form of finite Received set a symbolicof Octoberenunciations logic 26, with 1946. its of Presented parenthesis the above to notation the form. American and The infinite Mathematical operations set of variables Society of the can November system are a specified 2, 1946. befinite transformed set of intoproductionts, one in which each the enunciations, of the following i. e., formulas form: of the 1 Numbers system, in are brackets finite sequencesrefer to the of letters,'bibliography the different at the end letters of the constituting paper. a 2 Null assertions, however, now being allowed. once-and-for-all gllPi'lgiven finite g12PVt2 set. If the91M primitive gmPi'.1 letters 91 ('M+1)of such a system are J That is, using the language of propositions instead of operations, if we merely had an represented by a,, a2, , ap, an arbitrary enunciation of the system will take implication where (1) g2lPi"ihas an equivalence. g22PV'2 . .2m2Pi"m_2 g2(m2+1) the form ai, a2* a,, i - 1, 2, 3, 1 , ij 1, 2, , 1. In describing the basis of such a system it is convenient to use new letters to represent finite sequences of the above P(o) primitive d (e) gkmkP,'letters. If then (k) A,ghk B, * , E represent theThis sequences content downloaded ai, a 2 from . . 133.25.240.37. a,, aj, aj2 on ... Thu, aj 28, * Jun * 2018*, a)t1 12:34:17 a,2 UTC* a., respectively, All use producesubject to http://about.jstor.org/terms ABR E will represent the sequence ail a 2 **aO aj, aj2 aj.- a,1 am2 * .* am,,. gqlPil gXi2P . . . ,q'Mpim ,QM+l' We shall say that such a system is in canonical form if its basis has the following structure.2 The primitive assertions of the system are a specified * Received November 14, 1941; Revised April 11, 1942. finite set of enunciations of the above form. The operations of the system are a 'More exactly, "strings" of "marks," to use terms of C. I. Lewis (A Survey of An indexed specified grammar finite set of is productionts, an instance each of theof followinga Post form: canonical system. Symbolic Logic, Berkeley, 1918: chapter VI, sec. III). 2 This formulation stems gllPi'l from g12PVt2 the "Generalization 91M gmPi'.1 91 ('M+1)by Postulation" of the writer's

"Introduction to a general theory g2lPi"i g22PV'2of elementary . .2m2Pi"m_2 propositions," g2(m2+1) American Journal of Mathematics, vol. 43 (1921), pp. 163-185 (see p. 176). We take this opportunity to make the following Emendation: Lemma 1 thereof (pp. 177-178) requires the added P(o) d (e) gkmkP,' (k) ghk condition that the expressions replacing the r's do not involve any letter upon which a produce substitution is made in the given deductive process. This necessitates several minor changes in the proof of the gqlPiltheorem gXi2P there . . following.. ,q'Mpim ,QM+l'Actually, both Lemma 1 and its companion Lemma 2 admit of further simplification, with the proof of the theorem then * Received November 14, 1941; Revised April 11, 1942. being valid as it stands. 'More exactly, "strings" of "marks," to use terms of C. I. Lewis (A Survey of Symbolic Logic, Berkeley, 1918: chapter VI, sec. III). 197 2 This formulation stems from the "Generalization by Postulation" of the writer's "Introduction to a general theory of elementary propositions," American Journal of Mathematics, vol. 43 (1921), pp. 163-185 (see p. 176). We take this opportunity to make the following Emendation: Lemma 1 thereof (pp. 177-178) requires the added condition that the expressions replacing the r's do not involve any letter upon which a This substitution content downloaded is made in the from given 133.25.167.229 deductive process. on Fri, This 29 necessitates Jun 2018 04:50:34several minor UTC changes in the proofAll of use the subject theorem to therehttp://about.jstor.org/terms following. Actually, both Lemma 1 and its companion Lemma 2 admit of further simplification, with the proof of the theorem then being valid as it stands. 197

This content downloaded from 133.25.167.229 on Fri, 29 Jun 2018 04:50:34 UTC All use subject to http://about.jstor.org/terms Perhaps because of this striking Why Didn’t Chomsky Use Post Canonical Systems? result?

REDUCTIONS OF TI-IE COMIBINATORIAL DECISION PROBLEM. 199

the production whenever itPost contains 1947 the enuniciations represelnted by the several premises of the prodluction. A very special case of the canonical form is what we term the normal form. A system in canonical form will be said to be in normal form if it has but one primitive assertion, and, ea el of its productions is in the form

gP

produces

Pg'.

The main purpose of the presenit paper is to demonstrate that every system in canonical form can formally be reduced to a system in normal form. The two forms may therefore in fact be said to be equipotent.- More precisely, we prove the following

THEOREM. Giver a system in canonical form with primitive letters a1, a2 < * , , ap, a system in1 normal form with primitive letters a, a2, , aiu, a 1, a'2, * , a' can be set utp such that the assertions of the system in canonical form are exactly those assertions of the system in normal formn which involve no other letters than al, a2, , ap.

As a result of this theorem the decision problem for a system in canonical form is reduced to the decision problenm for the corresponding systemi in normial form. For an enunciation of the former system is an assertion when and only when it is an assertion of the latter system. Hence any procedure whiclh could effectively determine for an arbitrary enunciation of the system in normal form whether it is or is not an assertionrecursively thereof would automatically do the same for the system in canonical form. Now by methocls such as those referred to in the opening sentence of thisenumerable introduction, it can be shown that the type problem 0 of determining for an arbitrary well-formed formula in the X-calculus of Church whether it has or has not a normnal form (Church) ) can be reduced to the decision problem for a particular system in our canonical form. While type Church1 has proved the above problenmcontext- unsolvable in a certaini technical sense, in the interest of economy we invokesensitive his identification of X-definability with effective calculability to conclude that as a resuilt the decision problem for that particular system in canonical form, and hence for the class of systems in canonical form, is unsolvable. We are thus led to the more surprising result

5 Alonzo Church, " An unsolvable problemindexed of elementary number theory," American Journal of Mathematics, vol. 58 (1936), pp. 345-363.

linear This content downloaded from 133.25.167.229 on Fri, 29 Jun 2018 04:50:34 UTC All use subject to http://about.jstor.org/termsindexed

context- type 2 free regular type 3

It is silly to define well-formed Should Context-Free Grammars Be Thought of as String Rewriting Systems? sentences in terms of derivations if what we really want is to assign tree S ⇒ NP VP ⇒ Kim VP ⇒ Kim V NP ⇒ Kim saw NP ⇒ Kim saw Sandy S ⇒ NP VP ⇒ Kim VP ⇒ Kim V NP ⇒ Kim V Sandy ⇒ Kim saw Sandy structures to sentences. S ⇒ NP VP ⇒ NP V NP ⇒ Kim V NP ⇒ Kim saw NP ⇒ Kim saw Sandy S ⇒ NP VP ⇒ NP V NP ⇒ Kim V NP ⇒ Kim V Sandy ⇒ Kim saw Sandy S ⇒ NP VP ⇒ NP V NP ⇒ NP saw NP ⇒ Kim saw NP ⇒ Kim saw Sandy S ⇒ NP VP ⇒ NP V NP ⇒ NP saw NP ⇒ NP saw Sandy ⇒ Kim saw Sandy S ⇒ NP VP ⇒ NP V NP ⇒ NP V Sandy ⇒ Kim V Sandy ⇒ Kim saw Sandy S ⇒ NP VP ⇒ NP V NP ⇒ NP V Sandy ⇒ NP saw Sandy ⇒ Kim saw Sandy S NP VP

Kim V NP

saw Sandy All we need is the subrelation of ⇒G* Bottom-up Interpretation of Context- Free Rules restricted to N × Σ*. Just a general type of inductive definition.

Bi ⇒G* vi (i = 1,…,n) A → w0 B1 w1 … Bn wn ∈ P

A ⇒G* w0 v1 w1 … vn wn

L(G) = { w ∈ Σ* | S ⇒G* w }

Inductive Definition = CFG Interpreted Bottom-up

CFGs as Logic Programs on Strings

A → w0 B1 w1 … Bn wn

A(w0 x1 w1 … xn wn) ← B1(x1),…,Bn(xn)

Horn clause

L(G) = { w ∈ Σ* | G ⊢ S(w) } It’s a small step to use n-ary predicates Derivation = Proof Tree for n ≥ 2.

S(Kim saw Sandy) NP(Kim) VP(saw Sandy)

V(saw) NP(Sandy)

G ⊢ S(Kim saw Sandy)

It’s best to think of an MCFG as a kind Multiple Context-Free Grammars of logic program. Each rule is a definite clause. Nonterminals are predicates on strings.

A(α1,…,αq) ← B1(x1,1,…,x1,q1),…,Bn(xn,1,…,xn,qn)

n ≥ 0, q, qi ≥ 1, αk ∈ (Σ ∪ { xi,j | i ∈ [1,n], j ∈ [1,qi] })* each xi,j occurs exactly once in (α1,…,αq)

• q = dim(A) (dimension of A) • dim(S) = 1 • L(G) = { w ∈ Σ* | G ⊢ S(w) }

S(x1#x2) ← D(x1, x2) D(ε, ε) ← { w#wR | w ∈ D1* } D(x1y1, y2x2) ← E(x1,x2), D(y1,y2)

E(ax1ā, āx2a) ← D(x1,x2)

2-MCFG S(aaāāaā#āaāāaa) 2-ary branching D(aaāāaā, āaāāaa)

E(aaāā, āāaa) D(aā, āa)

D(aā, āa) E(aā, āa) D(ε, ε)

E(aā, āa) D(ε, ε) D(ε, ε)

D(ε, ε) derivation tree S(x1…xm) ← A(x1,…,xm) A(ε,…,ε) ←

A(a1 x1 a2,…,a2m−1 xm a2m) ← A(x1,…,xm)

non-branching m-MCFG

n n n n m-MCFL { a1 a2 … a2m−1 a2m | n ≥ 0 } (m−1)-MCFL Seki et al. 1991 2-MCFL 1-MCFL = CFL

MCFGs as Logic Programs on Strings

Surface without Structure word order and tractability issues in natural language analysis

..., daß Frank Julia Fred schwimmen helfen sah

...that Frank saw Julia help Fred swim

...dat Frank Julia Fred zag helpen zwemmen

Annius Groenink Elementary Formal Systems

Smullyan 1961

Elementary Formal Systems

Smullyan 1961

“Elementary Formal Systems can be looked at as variants of the canonical languages of Post …. The reason for our choice of elementary formal systems, in lieu of Post’s canonical systems, is that their structure is more simply described, and their techniques are more easily applied. The general notion of “production” used in Post systems is replaced by the simpler logistic rules of substitution and detachment (modus ponens), which are more easily formalized.”

Chomsky Hierarchy

Rewriting Machines Logic Programs on Strings Languages Systems Elementary Formal Systems Type 0 Turing r.e. (Smullyan 1961) Length-Bounded EFS (Arikawa CSL = Type 1 LBA et al. 1989) NSPACE(n) Simple LMG (Groenink 1997) / Poly-time Hereditary EFS (Ikeda and P Turing Arimura 1997) MCFG MCFL Type 2 PDA Simple EFS (Arikawa 1970) CFL Type 3 FA Reg recursively enumerable

context- sensitive

multiple context-free

head

context- free

regular

A long quote, from an interview Chomsky on Mathematical held in the year when Hopcroft and Linguistics (1979) Ullman’s textbook was published.

. . .

. . .

Chomsky (2004): Generative Enterprise Revisited

Chomsky on Mathematical Linguistics (1979)

. . .

Chomsky (2004): Generative Enterprise Revisited Summary

• The four levels of the Chomsky hierarchy are not of equal importance.

• The choice of semi-Thue systems as the most general form of grammar hampered investigation of some important language classes.

• Smullyan’s elementary formal system provides the right level of generality and simplicity.

• Bottom-up semantics of rules is basic.

• Natural generalization of “inductive definitions”.

What is the Significance of Grammar Formalisms?

• Abstract conception of grammars.

• Makes possible investigation of algorithmic properties of grammars.

• Should have good formal properties (like familiar closure properties).

• The choice of a grammar formalism should not be thought of as a tight characterization of possible human grammars.

• A grammar formalism does not have explanatory power.