What Was Wrong with the Chomsky Hierarchy?
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Structure of Index Sets and Reduced Indexed Grammars Informatique Théorique Et Applications, Tome 24, No 1 (1990), P
INFORMATIQUE THÉORIQUE ET APPLICATIONS R. PARCHMANN J. DUSKE The structure of index sets and reduced indexed grammars Informatique théorique et applications, tome 24, no 1 (1990), p. 89-104 <http://www.numdam.org/item?id=ITA_1990__24_1_89_0> © AFCET, 1990, tous droits réservés. L’accès aux archives de la revue « Informatique théorique et applications » im- plique l’accord avec les conditions générales d’utilisation (http://www.numdam. org/conditions). Toute utilisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit contenir la présente mention de copyright. Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques http://www.numdam.org/ Informatique théorique et Applications/Theoretical Informaties and Applications (vol. 24, n° 1, 1990, p. 89 à 104) THE STRUCTURE OF INDEX SETS AND REDUCED INDEXED GRAMMARS (*) by R. PARCHMANN (*) and J. DUSKE (*) Communicated by J. BERSTEL Abstract. - The set of index words attached to a variable in dérivations of indexed grammars is investigated. Using the regularity of these sets it is possible to transform an mdexed grammar in a reducedfrom and to describe the structure ofleft sentential forms of an indexed grammar. Résumé. - On étudie Vensemble des mots d'index d'une variable dans les dérivations d'une grammaire d'index. La rationalité de ces ensembles peut être utilisée pour transformer une gram- maire d'index en forme réduite, et pour décrire la structure des mots apparaissant dans les dérivations gauches d'une grammaire d'index. 1. INTRODUCTION In this paper we will further investigate indexed grammars and languages introduced by Aho [1] as an extension of context-free grammars and lan- guages. -
The Mathematics of Syntactic Structure: Trees and Their Logics
Computational Linguistics Volume 26, Number 2 The Mathematics of Syntactic Structure: Trees and their Logics Hans-Peter Kolb and Uwe MÈonnich (editors) (Eberhard-Karls-UniversitÈat Tubingen) È Berlin: Mouton de Gruyter (Studies in Generative Grammar, edited by Jan Koster and Henk van Riemsdijk, volume 44), 1999, 347 pp; hardbound, ISBN 3-11-016273-3, $127.75, DM 198.00 Reviewed by Gerald Penn Bell Laboratories, Lucent Technologies Regular languages correspond exactly to those languages that can be recognized by a ®nite-state automaton. Add a stack to that automaton, and one obtains the context- free languages, and so on. Probably all of us learned at some point in our univer- sity studies about the Chomsky hierarchy of formal languages and the duality be- tween the form of their rewriting systems and the automata or computational re- sources necessary to recognize them. What is perhaps less well known is that yet another way of characterizing formal languages is provided by mathematical logic, namely in terms of the kind and number of variables, quanti®ers, and operators that a logical language requires in order to de®ne another formal language. This mode of characterization, which is subsumed by an area of research called descrip- tive complexity theory, is at once more declarative than an automaton or rewriting system, more ¯exible in terms of the primitive relations or concepts that it can pro- vide resort to, and less wedded to the tacit, not at all unproblematic, assumption that the right way to view any language, including natural language, is as a set of strings. -
Genome Grammars
Genome Grammars Genome Sequences and Formal Languages Andreas de Vries Version: June 17, 2011 Wir sind aus Staub und Fantasie Andreas Bourani, Nur in meinem Kopf (2011) Contents 1 Genetics 5 1.1 Cell physiology ............................. 5 1.2 Amino acids and proteins ....................... 6 1.2.1 Geometry of peptide bonds .................. 8 1.2.2 Protein structure ......................... 10 1.3 Nucleic acids ............................... 12 1.4 DNA replication ............................. 14 1.5 Flow of information for cell growth .................. 16 1.5.1 The genetic code ......................... 18 1.5.2 Open reading frames, coding regions, and genes ...... 19 2 Formal languages 23 2.1 The Chomsky hierarchy ......................... 25 2.2 Regular languages ............................ 28 2.2.1 Regular expressions ....................... 30 2.3 Context-free languages ......................... 31 2.3.1 Linear languages ........................ 33 2.4 Context-sensitive languages ...................... 34 2.4.1 Indexed languages ....................... 35 2.5 Languages and machines ........................ 38 3 Grammar of DNA Strings 42 3.1 Searl’s approach to DNA language .................. 42 3.2 Gene regulation and inadequacy of context-free grammars ..... 45 3.3 DNA splicing rule ............................ 46 A Mathematical Foundations 49 A.1 Notation ................................. 49 A.2 Sets .................................... 49 A.3 Maps ................................... 52 A.4 Algebra ................................. -
Hierarchy and Interpretability in Neural Models of Language Processing ILLC Dissertation Series DS-2020-06
Hierarchy and interpretability in neural models of language processing ILLC Dissertation Series DS-2020-06 For further information about ILLC-publications, please contact Institute for Logic, Language and Computation Universiteit van Amsterdam Science Park 107 1098 XG Amsterdam phone: +31-20-525 6051 e-mail: [email protected] homepage: http://www.illc.uva.nl/ The investigations were supported by the Netherlands Organization for Scientific Research (NWO), through a Gravitation Grant 024.001.006 to the Language in Interaction Consortium. Copyright © 2019 by Dieuwke Hupkes Publisher: Boekengilde Printed and bound by printenbind.nl ISBN: 90{6402{222-1 Hierarchy and interpretability in neural models of language processing Academisch Proefschrift ter verkrijging van de graad van doctor aan de Universiteit van Amsterdam op gezag van de Rector Magnificus prof. dr. ir. K.I.J. Maex ten overstaan van een door het College voor Promoties ingestelde commissie, in het openbaar te verdedigen op woensdag 17 juni 2020, te 13 uur door Dieuwke Hupkes geboren te Wageningen Promotiecommisie Promotores: Dr. W.H. Zuidema Universiteit van Amsterdam Prof. Dr. L.W.M. Bod Universiteit van Amsterdam Overige leden: Dr. A. Bisazza Rijksuniversiteit Groningen Dr. R. Fern´andezRovira Universiteit van Amsterdam Prof. Dr. M. van Lambalgen Universiteit van Amsterdam Prof. Dr. P. Monaghan Lancaster University Prof. Dr. K. Sima'an Universiteit van Amsterdam Faculteit der Natuurwetenschappen, Wiskunde en Informatica to my parents Aukje and Michiel v Contents Acknowledgments xiii 1 Introduction 1 1.1 My original plan . .1 1.2 Neural networks as explanatory models . .2 1.2.1 Architectural similarity . .3 1.2.2 Behavioural similarity . -
The Complexity of Narrow Syntax : Minimalism, Representational
Erschienen in: Measuring Grammatical Complexity / Frederick J. Newmeyer ... (Hrsg.). - Oxford : Oxford University Press, 2014. - S. 128-147. - ISBN 978-0-19-968530-1 The complexity of narrow syntax: Minimalism, representational economy, and simplest Merge ANDREAS TROTZKE AND JAN-WOUTER ZWART 7.1 Introduction The issue of linguistic complexity has recently received much attention by linguists working within typological-functional frameworks (e.g. Miestamo, Sinnemäki, and Karlsson 2008; Sampson, Gil, and Trudgill 2009). In formal linguistics, the most prominent measure of linguistic complexity is the Chomsky hierarchy of formal languages (Chomsky 1956), including the distinction between a finite-state grammar (FSG) and more complicated types of phrase-structure grammar (PSG). This dis- tinction has played a crucial role in the recent biolinguistic literature on recursive complexity (Sauerland and Trotzke 2011). In this chapter, we consider the question of formal complexity measurement within linguistic minimalism (cf. also Biberauer et al., this volume, chapter 6; Progovac, this volume, chapter 5) and argue that our minimalist approach to complexity of derivations and representations shows simi- larities with that of alternative theoretical perspectives represented in this volume (Culicover, this volume, chapter 8; Jackendoff and Wittenberg, this volume, chapter 4). In particular, we agree that information structure properties should not be encoded in narrow syntax as features triggering movement, suggesting that the relevant information is established at the interfaces. Also, we argue for a minimalist model of grammar in which complexity arises out of the cyclic interaction of subderivations, a model we take to be compatible with Construction Grammar approaches. We claim that this model allows one to revisit the question of the formal complexity of a generative grammar, rephrasing it such that a different answer to the question of formal complexity is forthcoming depending on whether we consider the grammar as a whole, or just narrow syntax. -
Mildly Context-Sensitive Grammar-Formalisms
Mildly Context Sensitive Grammar Formalisms Petra Schmidt Pfarrer-Lauer-Strasse 23 66386 St. Ingbert [email protected] Overview ,QWURGXFWLRQ«««««««««««««««««««««««««««««««««««« 3 Tree-Adjoining-Grammars 7$*V «««««««««««««««««««««««««« 4 CoPELQDWRU\&DWHJRULDO*UDPPDU &&*V ««««««««««««««««««««««« 7 /LQHDU,QGH[HG*UDPPDU /,*V ««««««««««««««««««««««««««« 8 3URRIRIHTXLYDOHQFH«««««««««««««««««««««««««««««««« 9 CCG ĺ /,*«««««««««««««««««««««««««««««««««« 9 LIG ĺ 7$*«««««««««««««««««««««««««««««««««« 10 TAG ĺ &&*«««««««««««««««««««««««««««««««««« 13 Linear Context-)UHH5HZULWLQJ6\VWHPV /&)56V «««««««««««««««««««« 15 Multiple Context-Free-*UDPPDUV««««««««««««««««««««««««««« 16 &RQFOXVLRQ«««««««««««««««««««««««««««««««««««« 17 References««««««««««««««««««««««««««««««««««««« 18 Introduction Natural languages are not context free. An example non-context-free phenomenon are cross-serial dependencies in Dutch, an example of which we show in Fig. 1. ik haar hem de nijlpaarden zag helpen voeren Fig. 1.: cross-serial dependencies A concept motivated by the intention of characterizing a class of formal grammars, which allow the description of natural languages such as Dutch with its cross-serial dependencies was introduced in 1985 by Aravind Joshi. This class of formal grammars is known as the class of mildly context-sensitive grammar-formalisms. According to Joshi (1985, p.225), a mildly context-sensitive language L has to fulfill three criteria to be understood as a rough characterization. These are: The parsing problem for L is solvable in -
Cs.FL] 7 Dec 2020 4 a Model Programming Language and Its Grammar 11 4.1 Alphabet
Describing the syntax of programming languages using conjunctive and Boolean grammars∗ Alexander Okhotin† December 8, 2020 Abstract A classical result by Floyd (\On the non-existence of a phrase structure grammar for ALGOL 60", 1962) states that the complete syntax of any sensible programming language cannot be described by the ordinary kind of formal grammars (Chomsky's \context-free"). This paper uses grammars extended with conjunction and negation operators, known as con- junctive grammars and Boolean grammars, to describe the set of well-formed programs in a simple typeless procedural programming language. A complete Boolean grammar, which defines such concepts as declaration of variables and functions before their use, is constructed and explained. Using the Generalized LR parsing algorithm for Boolean grammars, a pro- gram can then be parsed in time O(n4) in its length, while another known algorithm allows subcubic-time parsing. Next, it is shown how to transform this grammar to an unambiguous conjunctive grammar, with square-time parsing. This becomes apparently the first specifica- tion of the syntax of a programming language entirely by a computationally feasible formal grammar. Contents 1 Introduction 2 2 Conjunctive and Boolean grammars 5 2.1 Conjunctive grammars . .5 2.2 Boolean grammars . .6 2.3 Ambiguity . .7 3 Language specification with conjunctive and Boolean grammars 8 arXiv:2012.03538v1 [cs.FL] 7 Dec 2020 4 A model programming language and its grammar 11 4.1 Alphabet . 11 4.2 Lexical conventions . 12 4.3 Identifier matching . 13 4.4 Expressions . 14 4.5 Statements . 15 4.6 Function declarations . 16 4.7 Declaration of variables before use . -
From Indexed Grammars to Generating Functions∗
RAIRO-Theor. Inf. Appl. 47 (2013) 325–350 Available online at: DOI: 10.1051/ita/2013041 www.rairo-ita.org FROM INDEXED GRAMMARS TO GENERATING FUNCTIONS ∗ Jared Adams1, Eric Freden1 and Marni Mishna2 Abstract. We extend the DSV method of computing the growth series of an unambiguous context-free language to the larger class of indexed languages. We illustrate the technique with numerous examples. Mathematics Subject Classification. 68Q70, 68R15. 1. Introduction 1.1. Indexed grammars Indexed grammars were introduced in the thesis of Aho in the late 1960s to model a natural subclass of context-sensitive languages, more expressive than context-free grammars with interesting closure properties [1,16]. The original ref- erence for basic results on indexed grammars is [1]. The complete definition of these grammars is equivalent to the following reduced form. Definition 1.1. A reduced indexed grammar is a 5-tuple (N , T , I, P, S), such that 1. N , T and I are three mutually disjoint finite sets of symbols: the set N of non-terminals (also called variables), T is the set of terminals and I is the set of indices (also called flags); 2. S ∈N is the start symbol; 3. P is a finite set of productions, each having the form of one of the following: (a) A → α Keywords and phrases. Indexed grammars, generating functions, functional equations, DSV method. ∗ The third author gratefully acknowledges the support of NSERC Discovery Grant funding (Canada), and LaBRI (Bordeaux) for hosting during the completion of the work. 1 Department of Mathematics, Southern Utah University, Cedar City, 84720, UT, USA. -
Reflection in the Chomsky Hierarchy
Journal of Automata, Languages and Combinatorics 18 (2013) 1, 53–60 c Otto-von-Guericke-Universit¨at Magdeburg REFLECTION IN THE CHOMSKY HIERARCHY Henk Barendregt Institute of Computing and Information Science, Radboud University, Nijmegen, The Netherlands e-mail: [email protected] Venanzio Capretta School of Computer Science, University of Nottingham, United Kingdom e-mail: [email protected] and Dexter Kozen Computer Science Department, Cornell University, Ithaca, New York 14853, USA e-mail: [email protected] ABSTRACT The class of regular languages can be generated from the regular expressions. These regular expressions, however, do not themselves form a regular language, as can be seen using the pumping lemma. On the other hand, the class of enumerable languages can be enumerated by a universal language that is one of its elements. We say that the enumerable languages are reflexive. In this paper we investigate what other classes of the Chomsky Hierarchy are re- flexive in this sense. To make this precise we require that the decoding function is itself specified by a member of the same class. Could it be that the regular languages are reflexive, by using a different collection of codes? It turns out that this is impossi- ble: the collection of regular languages is not reflexive. Similarly the collections of the context-free, context-sensitive, and computable languages are not reflexive. Therefore the class of enumerable languages is the only reflexive one in the Chomsky Hierarchy. In honor of Roel de Vrijer on the occasion of his sixtieth birthday 1. Introduction Let (Σ∗) be a class of languages over an alphabet Σ. -
Appendix A: Hierarchy of Grammar Formalisms
Appendix A: Hierarchy of Grammar Formalisms The following figure recalls the language hierarchy that we have developed in the course of the book. '' $$ '' $$ '' $$ ' ' $ $ CFG, 1-MCFG, PDA n n L1 = {a b | n ≥ 0} & % TAG, LIG, CCG, tree-local MCTAG, EPDA n n n ∗ L2 = {a b c | n ≥ 0}, L3 = {ww | w ∈{a, b} } & % 2-LCFRS, 2-MCFG, simple 2-RCG & % 3-LCFRS, 3-MCFG, simple 3-RCG n n n n n ∗ L4 = {a1 a2 a3 a4 a5 | n ≥ 0}, L5 = {www | w ∈{a, b} } & % ... LCFRS, MCFG, simple RCG, MG, set-local MCTAG, finite-copying LFG & % Thread Automata (TA) & % PMCFG 2n L6 = {a | n ≥ 0} & % RCG, simple LMG (= PTIME) & % mildly context-sensitive L. Kallmeyer, Parsing Beyond Context-Free Grammars, Cognitive Technologies, DOI 10.1007/978-3-642-14846-0, c Springer-Verlag Berlin Heidelberg 2010 216 Appendix A For each class the different formalisms and automata that generate/accept exactly the string languages contained in this class are listed. Furthermore, examples of typical languages for this class are added, i.e., of languages that belong to this class while not belonging to the next smaller class in our hier- archy. The inclusions are all proper inclusions, except for the relation between LCFRS and Thread Automata (TA). Here, we do not know whether the in- clusion is a proper one. It is possible that both devices yield the same class of languages. Appendix B: List of Acronyms The following table lists all acronyms that occur in this book. (2,2)-BRCG Binary bottom-up non-erasing RCG with at most two vari- ables per left-hand side argument 2-SA Two-Stack Automaton -
Noam Chomsky
Noam Chomsky “If you’re teaching today what you were teaching five years ago, either the field is dead or you are.” Background ▪ Jewish American born to Ukranian and Belarussian immigrants. ▪ Attended a non-competitive elementary school. ▪ Considered dropping out of UPenn and moving to British Palestine. ▪ Life changed when introduced to linguist Zellig Harris. Contributions to Cognitive Science • Deemed the “father of modern linguistics” • One of the founders of Cognitive Science o Linguistics: Syntactic Structures (1957) o Universal Grammar Theory o Generative Grammar Theory o Minimalist Program o Computer Science: Chomsky Hierarchy o Philosophy: Philosophy of Mind; Philosophy of Language o Psychology: Critical of Behaviorism Transformational Grammar – 1955 Syntactic Structures - 1957 ▪ Made him well-known within the ▪ Big Idea: Given the grammar, linguistics community you should be able to construct various expressions in the ▪ Revolutionized the study of natural language with prior language knowledge. ▪ Describes generative grammars – an explicit statement of what the classes of linguistic expressions in a language are and what kind of structures they have. Universal Grammar Theory ▪ Credited with uncovering ▪ Every sentence has a deep structure universal properties of natural that is mapped onto the surface human languages. structure according to rules. ▪ Key ideas: Humans have an ▪ Language similarities: innate neural capacity for – Deep Structure: Pattern in the mind generating a system of rules. – Surface Structure: Spoken utterances – There is a universal grammar that – Rules: Transformations underlies ALL languages Chomsky Hierarchy ▪ Hierarchy of classes of formal grammars ▪ Focuses on structure of classes and relationship between grammars ▪ Typically used in Computer Science and Linguistics Minimalist Program – 1993 ▪ Program NOT theory ▪ Provides conceptual framework to guide development of linguistic theory. -
The Evolution of the Faculty of Language from a Chomskyan Perspective: Bridging Linguistics and Biology
doi 10.4436/JASS.91011 JASs Invited Reviews e-pub ahead of print Journal of Anthropological Sciences Vol. 91 (2013), pp. 1-48 The evolution of the Faculty of Language from a Chomskyan perspective: bridging linguistics and biology Víctor M. Longa Area of General Linguistics, Faculty of Philology, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain e-mail: [email protected] Summary - While language was traditionally considered a purely cultural trait, the advent of Noam Chomsky’s Generative Grammar in the second half of the twentieth century dramatically challenged that view. According to that theory, language is an innate feature, part of the human biological endowment. If language is indeed innate, it had to biologically evolve. This review has two main objectives: firstly, it characterizes from a Chomskyan perspective the evolutionary processes by which language could have come into being. Secondly, it proposes a new method for interpreting the archaeological record that radically differs from the usual types of evidence Paleoanthropology has concentrated on when dealing with language evolution: while archaeological remains have usually been regarded from the view of the behavior they could be associated with, the paper will consider archaeological remains from the view of the computational processes and capabilities at work for their production. This computational approach, illustrated with a computational analysis of prehistoric geometric engravings, will be used to challenge the usual generative thinking on language evolution, based on the high specificity of language. The paper argues that the biological machinery of language is neither specifically linguistic nor specifically human, although language itself can still be considered a species-specific innate trait.