Final Lecture: Applications, Chomsky Hierarchy, and Recap

Total Page:16

File Type:pdf, Size:1020Kb

Final Lecture: Applications, Chomsky Hierarchy, and Recap Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Final lecture: Applications, Chomsky hierarchy, and Recap H. Geuvers and J. Rot Institute for Computing and Information Sciences Radboud University Nijmegen Version: 2016-17 J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 1 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Outline Applications of CFGs Beyond CFGs Recap J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 2 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Programming languages Most programming languages are (deterministic) context-free. There are tools to automatically build: • a lexical analyzer (`lexer') from regular expressions. \if x = 2 then P else Q00 if x = 2 then P else Q ; • a parser from a CFG. ifthenelse ; = P Q x 2 J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 4 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Lindenmayer systems \The development of an organism...may be considered as the execution of a `developmental program' present in the fertilized egg... A central task of developmental biology is to discover the underlying algorithm from the course of development." { Lindenmayer & Rozenberg (1976) Example: Start with A, expand once per iteration: A ! AB 0 A B ! A 1 AB 2 ABA 3 ABAAB 4 ABAABABA ::: J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 5 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Lindenmayer systems Drawing a Lindenmayer system: • F: move forward • +: rotate counter clockwise • −: rotate clockwise • [: push location/angle • ]: pop location/angle Example: F ! F + F − −F + F 0 F 1 F+F--F+F 2 F+F--F+F+F+F--F+F--F+F--F+F+F+F--F+F 3 ::: J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 6 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Lindenmayer systems S ! F [+S][−S] S ! F − [[S] + S] + F [+FS] − S F ! FF J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 7 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Lindenmayer systems (cont'd) Example: Penrose Tiling (P3) S ! [N] + +[N] + +[N] + +[N] + +[N] M ! OF + +PF − − − −NF [−OF − − − −MF ] + + N ! +OF − −PF [− − −MF − −NF ]+ O ! −MF + +NF [+ + +OF + +PF ]− P ! − − OF + + + +MF [+PF + + + +NF ] − −NF J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 8 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Natural language S = hsentencei ! hnoun-phraseihverb-phrasei: hsentencei ! hnoun-phraseihverb-phraseihobject-phrasei: hnoun-phrasei ! hnameiharticleihnouni hnamei ! John j Jill hnouni ! bicycle j mango harticlei ! a j the hverb-phrasei ! hverbi j hadverbihverbi hverbi ! eats j rides hadverbi ! slowly j frequently hadjective-listi ! hadjectiveihadjective-listi j λ hadjectivei ! big j juicy j yellow hobject-phrasei ! hadjective-listihnamei hobject-phrasei ! harticleihadjective-listihnouni Jill frequently eats a juicy yellow mango. belongs to this language J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 9 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Natural language • Many sentences can be modelled using CFGs, e.g.: ...because I saw Cecilia feed the hippopotamuses . • But some (particularly crazy ) natural languages have non-context-free features like,cross-serial dependencies: ...omdat ik Cecilia de nijlpaarden zag voeren. • To capture these, one needs more power than CFGs J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 10 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Context-sensitive languages • Whereas context-free grammars have rules like this: X ! w X 2 V ; w 2 (Σ [ V )∗ • ...a context-sensitive grammar has rules like this: αX β ! αwβ with X 2 V , α; β; w 2 (Σ [ V )∗, w 6= λ. • Context-sensitive grammars generate context-sensitive languages. J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 12 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Context-sensitive languages Example: fanbncn j n > 0g S ! aBC j aSBC CB ! XB XB ! XC XC ! BC aB ! ab bB ! bb bC ! bc cC ! cc J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 13 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Turing machines • An unrestricted grammar has rules like: u ! v u; v 2 (Σ [ V )∗ • Recognisable by Turing machines • The stack is replaced by an infinite tape: ··· ··· • Transitions look like this: a=b; a=b; ! p q p q which read a, write b, and move left or right on the tape • We no longer need a separate input. Just use the tape: ··· a b b ··· J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 14 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Enumerable and computable languages • Languages recognisable by Turing machine are called enumerable languages • A languages is computable if both L and L = Σ∗ − L are enumerable. • In other words, there is a Turing machine with terminates telling us w 2 L and a (possibly different) one that terminates telling us w 2= L • Example: fan j n is not primeg. • Church-Turing thesis: \computable" () \computable by Turing machine" • (To be continued: Berekenbaarheid, 2nd year) J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 15 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Chomsky hierarchy fw j w describes a terminating Turing machine g;::: fan j n is primeg;::: fanbncn j n > 0g;::: fanbn j n > 0g;::: fan j n > 0g;::: Regular Context-free Context-sensitive Computable Enumerable J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 16 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Trade-offs Bigger classes of languages: • More languages can be described. • But, you can say less about them. w 2 L? time memory L1 = L2? Regular yes jwj const. yes Deterministic context-free yes jwj jwj no Context-free yes jwj3 jwj2 no Context-sensitive yes 2jwj jwjk no Computable yes 1 1 no Enumerable if w 2 L 1 1 no J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 17 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Exam topic: regular expressions You should know: • the definition of regular expression • how to compute a regular expression from an NFAλ using the elimination-of-states method • how to build an NFAλ from a regular expression using the `toolkit' J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 19 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Exam topic: finite automata You should know: • the definition of DFA, NFA, NFAλ • how to construct a DFA, NFA or NFAλ for a given language • how to construct a DFA from an NFA (the subset construction) • the constructions on DFAs for complement and intersection (product construction) of languages J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 20 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Exam Topic: regular languages You should know: • the closure properties of regular languages (union, intersection, complement), and how to use them to prove that a language is regular (or is not regular) n n • typical non-regular languages (fa b j n 2 Ng and palindromes) • Kleene's theorem • how to apply the pumping lemma to show that a language is not regular J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 21 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Exam topic: grammars You should know: • the definition of context-free grammar (CFG) • how to generate strings in a CFG (with leftmost derivations) • how to find the language generated by a (simple) CFG • how to construct a CFG that generates a given (context-free) language • the definition of regular grammar • how to construct a regular grammar from a NFA • how to construct an NFAλ from a regular grammar J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 22 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Exam topic: pushdown automata and CFLs You should know: • the definition of pushdown automata (PDAs) • how to give a PDA for a given (simple context-free) language • how find the language accepted by a given PDA • how to construct a PDA from a CFG • how to construct a CFG from a PDA • the closure properties of context-free languages J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 23 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Remarks on sets Beware that ; 6= f;g ; 6= λ ; 6= fλg. and ;· L = L ·; = ; and fλg · L = L · fλg = L. Symbols: • w 2 L: \is in". • L1 ⊆ L2: is a subset of, i.e. everything in L1 is also in L2. • L1 [ L2:union, the things in either of the two sets. • L1 \ L2: intersection, only the things in both sets. • L: complement, the words not in L, L = Σ∗ − L. Terminology: Language described using set-notation, examples: ∗ n m fw 2 fa; bg j jwja is eveng; fa b j n < mg; ; J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 24 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Remarks on words and languages Languages • contain words. • can be infinite, but words are finite. The language L∗ • always contains λ. • is not the same as fw n j w 2 L; n ≥ 0g. • is L0 [ L1 [ L2 [··· . J. Rot Version: 2016-17 Formal Languages, Grammars and Automata 25 / 28 Applications of CFGs Beyond CFGs Recap Radboud University Nijmegen Remarks on proofs To prove that L1 = L2, show that • L1 ⊆ L2 (for all w, w 2 L1 implies w 2 L2), and • L1 ⊇ L2 (for all w, w 2 L2 implies w 2 L1). Proof by contradiction • If regular languages have property X , and L1 does not have property X , then L1 is not regular.
Recommended publications
  • The Mathematics of Syntactic Structure: Trees and Their Logics
    Computational Linguistics Volume 26, Number 2 The Mathematics of Syntactic Structure: Trees and their Logics Hans-Peter Kolb and Uwe MÈonnich (editors) (Eberhard-Karls-UniversitÈat Tubingen) È Berlin: Mouton de Gruyter (Studies in Generative Grammar, edited by Jan Koster and Henk van Riemsdijk, volume 44), 1999, 347 pp; hardbound, ISBN 3-11-016273-3, $127.75, DM 198.00 Reviewed by Gerald Penn Bell Laboratories, Lucent Technologies Regular languages correspond exactly to those languages that can be recognized by a ®nite-state automaton. Add a stack to that automaton, and one obtains the context- free languages, and so on. Probably all of us learned at some point in our univer- sity studies about the Chomsky hierarchy of formal languages and the duality be- tween the form of their rewriting systems and the automata or computational re- sources necessary to recognize them. What is perhaps less well known is that yet another way of characterizing formal languages is provided by mathematical logic, namely in terms of the kind and number of variables, quanti®ers, and operators that a logical language requires in order to de®ne another formal language. This mode of characterization, which is subsumed by an area of research called descrip- tive complexity theory, is at once more declarative than an automaton or rewriting system, more ¯exible in terms of the primitive relations or concepts that it can pro- vide resort to, and less wedded to the tacit, not at all unproblematic, assumption that the right way to view any language, including natural language, is as a set of strings.
    [Show full text]
  • Hierarchy and Interpretability in Neural Models of Language Processing ILLC Dissertation Series DS-2020-06
    Hierarchy and interpretability in neural models of language processing ILLC Dissertation Series DS-2020-06 For further information about ILLC-publications, please contact Institute for Logic, Language and Computation Universiteit van Amsterdam Science Park 107 1098 XG Amsterdam phone: +31-20-525 6051 e-mail: [email protected] homepage: http://www.illc.uva.nl/ The investigations were supported by the Netherlands Organization for Scientific Research (NWO), through a Gravitation Grant 024.001.006 to the Language in Interaction Consortium. Copyright © 2019 by Dieuwke Hupkes Publisher: Boekengilde Printed and bound by printenbind.nl ISBN: 90{6402{222-1 Hierarchy and interpretability in neural models of language processing Academisch Proefschrift ter verkrijging van de graad van doctor aan de Universiteit van Amsterdam op gezag van de Rector Magnificus prof. dr. ir. K.I.J. Maex ten overstaan van een door het College voor Promoties ingestelde commissie, in het openbaar te verdedigen op woensdag 17 juni 2020, te 13 uur door Dieuwke Hupkes geboren te Wageningen Promotiecommisie Promotores: Dr. W.H. Zuidema Universiteit van Amsterdam Prof. Dr. L.W.M. Bod Universiteit van Amsterdam Overige leden: Dr. A. Bisazza Rijksuniversiteit Groningen Dr. R. Fern´andezRovira Universiteit van Amsterdam Prof. Dr. M. van Lambalgen Universiteit van Amsterdam Prof. Dr. P. Monaghan Lancaster University Prof. Dr. K. Sima'an Universiteit van Amsterdam Faculteit der Natuurwetenschappen, Wiskunde en Informatica to my parents Aukje and Michiel v Contents Acknowledgments xiii 1 Introduction 1 1.1 My original plan . .1 1.2 Neural networks as explanatory models . .2 1.2.1 Architectural similarity . .3 1.2.2 Behavioural similarity .
    [Show full text]
  • The Complexity of Narrow Syntax : Minimalism, Representational
    Erschienen in: Measuring Grammatical Complexity / Frederick J. Newmeyer ... (Hrsg.). - Oxford : Oxford University Press, 2014. - S. 128-147. - ISBN 978-0-19-968530-1 The complexity of narrow syntax: Minimalism, representational economy, and simplest Merge ANDREAS TROTZKE AND JAN-WOUTER ZWART 7.1 Introduction The issue of linguistic complexity has recently received much attention by linguists working within typological-functional frameworks (e.g. Miestamo, Sinnemäki, and Karlsson 2008; Sampson, Gil, and Trudgill 2009). In formal linguistics, the most prominent measure of linguistic complexity is the Chomsky hierarchy of formal languages (Chomsky 1956), including the distinction between a finite-state grammar (FSG) and more complicated types of phrase-structure grammar (PSG). This dis- tinction has played a crucial role in the recent biolinguistic literature on recursive complexity (Sauerland and Trotzke 2011). In this chapter, we consider the question of formal complexity measurement within linguistic minimalism (cf. also Biberauer et al., this volume, chapter 6; Progovac, this volume, chapter 5) and argue that our minimalist approach to complexity of derivations and representations shows simi- larities with that of alternative theoretical perspectives represented in this volume (Culicover, this volume, chapter 8; Jackendoff and Wittenberg, this volume, chapter 4). In particular, we agree that information structure properties should not be encoded in narrow syntax as features triggering movement, suggesting that the relevant information is established at the interfaces. Also, we argue for a minimalist model of grammar in which complexity arises out of the cyclic interaction of subderivations, a model we take to be compatible with Construction Grammar approaches. We claim that this model allows one to revisit the question of the formal complexity of a generative grammar, rephrasing it such that a different answer to the question of formal complexity is forthcoming depending on whether we consider the grammar as a whole, or just narrow syntax.
    [Show full text]
  • What Was Wrong with the Chomsky Hierarchy?
    What Was Wrong with the Chomsky Hierarchy? Makoto Kanazawa Hosei University Chomsky Hierarchy of Formal Languages recursively enumerable context- sensitive context- free regular Enduring impact of Chomsky’s work on theoretical computer science. 15 pages devoted to the Chomsky hierarchy. Hopcroft and Ullman 1979 No mention of the Chomsky hierarchy. 450 INDEX The same with the latest edition of Carmichael, R. D., 444 CNF-formula, 302 Cartesian product, 6, 46 Co-Turing-recognizableHopcroft, language, 209 Motwani, and Ullman (2006). CD-ROM, 349 Cobham, Alan, 444 Certificate, 293 Coefficient, 183 CFG, see Context-free grammar Coin-flip step, 396 CFL, see Context-free language Complement operation, 4 Chaitin, Gregory J., 264 Completed rule, 140 Chandra, Ashok, 444 Complexity class Characteristic sequence, 206 ASPACE(f(n)),410 Checkers, game of, 348 ATIME(t(n)),410 Chernoff bound, 398 BPP,397 Chess, game of, 348 coNL,354 Chinese remainder theorem, 401 coNP,297 Chomsky normal form, 108–111, 158, EXPSPACE,368 198, 291 EXPTIME,336 Chomsky, Noam, 444 IP,417 Church, Alonzo, 3, 183, 255 L,349 Church–Turing thesis, 183–184, 281 NC,430 CIRCUIT-SAT,386 NL,349 Circuit-satisfiability problem, 386 NP,292–298 CIRCUIT-VALUE,432 NPSPACE,336 Circular definition, 65 NSPACE(f(n)),332 Clause, 302 NTIME(f(n)),295 Clique, 28, 296 P,284–291,297–298 CLIQUE,296 PH,414 Sipser 2013Closed under, 45 PSPACE,336 Closure under complementation RP,403 context-free languages, non-, 154 SPACE(f(n)),332 deterministic context-free TIME(f(n)),279 languages, 133 ZPP,439 P,322 Complexity
    [Show full text]
  • Reflection in the Chomsky Hierarchy
    Journal of Automata, Languages and Combinatorics 18 (2013) 1, 53–60 c Otto-von-Guericke-Universit¨at Magdeburg REFLECTION IN THE CHOMSKY HIERARCHY Henk Barendregt Institute of Computing and Information Science, Radboud University, Nijmegen, The Netherlands e-mail: [email protected] Venanzio Capretta School of Computer Science, University of Nottingham, United Kingdom e-mail: [email protected] and Dexter Kozen Computer Science Department, Cornell University, Ithaca, New York 14853, USA e-mail: [email protected] ABSTRACT The class of regular languages can be generated from the regular expressions. These regular expressions, however, do not themselves form a regular language, as can be seen using the pumping lemma. On the other hand, the class of enumerable languages can be enumerated by a universal language that is one of its elements. We say that the enumerable languages are reflexive. In this paper we investigate what other classes of the Chomsky Hierarchy are re- flexive in this sense. To make this precise we require that the decoding function is itself specified by a member of the same class. Could it be that the regular languages are reflexive, by using a different collection of codes? It turns out that this is impossi- ble: the collection of regular languages is not reflexive. Similarly the collections of the context-free, context-sensitive, and computable languages are not reflexive. Therefore the class of enumerable languages is the only reflexive one in the Chomsky Hierarchy. In honor of Roel de Vrijer on the occasion of his sixtieth birthday 1. Introduction Let (Σ∗) be a class of languages over an alphabet Σ.
    [Show full text]
  • Noam Chomsky
    Noam Chomsky “If you’re teaching today what you were teaching five years ago, either the field is dead or you are.” Background ▪ Jewish American born to Ukranian and Belarussian immigrants. ▪ Attended a non-competitive elementary school. ▪ Considered dropping out of UPenn and moving to British Palestine. ▪ Life changed when introduced to linguist Zellig Harris. Contributions to Cognitive Science • Deemed the “father of modern linguistics” • One of the founders of Cognitive Science o Linguistics: Syntactic Structures (1957) o Universal Grammar Theory o Generative Grammar Theory o Minimalist Program o Computer Science: Chomsky Hierarchy o Philosophy: Philosophy of Mind; Philosophy of Language o Psychology: Critical of Behaviorism Transformational Grammar – 1955 Syntactic Structures - 1957 ▪ Made him well-known within the ▪ Big Idea: Given the grammar, linguistics community you should be able to construct various expressions in the ▪ Revolutionized the study of natural language with prior language knowledge. ▪ Describes generative grammars – an explicit statement of what the classes of linguistic expressions in a language are and what kind of structures they have. Universal Grammar Theory ▪ Credited with uncovering ▪ Every sentence has a deep structure universal properties of natural that is mapped onto the surface human languages. structure according to rules. ▪ Key ideas: Humans have an ▪ Language similarities: innate neural capacity for – Deep Structure: Pattern in the mind generating a system of rules. – Surface Structure: Spoken utterances – There is a universal grammar that – Rules: Transformations underlies ALL languages Chomsky Hierarchy ▪ Hierarchy of classes of formal grammars ▪ Focuses on structure of classes and relationship between grammars ▪ Typically used in Computer Science and Linguistics Minimalist Program – 1993 ▪ Program NOT theory ▪ Provides conceptual framework to guide development of linguistic theory.
    [Show full text]
  • The Evolution of the Faculty of Language from a Chomskyan Perspective: Bridging Linguistics and Biology
    doi 10.4436/JASS.91011 JASs Invited Reviews e-pub ahead of print Journal of Anthropological Sciences Vol. 91 (2013), pp. 1-48 The evolution of the Faculty of Language from a Chomskyan perspective: bridging linguistics and biology Víctor M. Longa Area of General Linguistics, Faculty of Philology, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain e-mail: [email protected] Summary - While language was traditionally considered a purely cultural trait, the advent of Noam Chomsky’s Generative Grammar in the second half of the twentieth century dramatically challenged that view. According to that theory, language is an innate feature, part of the human biological endowment. If language is indeed innate, it had to biologically evolve. This review has two main objectives: firstly, it characterizes from a Chomskyan perspective the evolutionary processes by which language could have come into being. Secondly, it proposes a new method for interpreting the archaeological record that radically differs from the usual types of evidence Paleoanthropology has concentrated on when dealing with language evolution: while archaeological remains have usually been regarded from the view of the behavior they could be associated with, the paper will consider archaeological remains from the view of the computational processes and capabilities at work for their production. This computational approach, illustrated with a computational analysis of prehistoric geometric engravings, will be used to challenge the usual generative thinking on language evolution, based on the high specificity of language. The paper argues that the biological machinery of language is neither specifically linguistic nor specifically human, although language itself can still be considered a species-specific innate trait.
    [Show full text]
  • Complexity of Natural Languages
    Complexity of natural languages Computational complexity: the expressive power of (and the resources needed in order to process) classes of languages Linguistic complexity: what makes individual constructions or sentences more difficult to understand This is the dog, that worried the cat, that killed the rat, that ate the malt, that lay in the house that Jack built. This is the malt that the rat that the cat that the dog worried killed ate. The Chomsky hierarchy of languages A hierarchy of classes of languages, viewed as sets of strings, ordered by their “complexity”. The higher the language is in the hierarchy, the more “complex” it is. In particular, the class of languages in one class properly includes the languages in lower classes. There exists a correspondence between the class of languages and the format of phrase-structure rules necessary for generating all its languages. The more restricted the rules are, the lower in the hierarchy the languages they generate are. The Chomsky hierarchy of languages Recursively enumerable languages Context-sensitive languages Context-free languages Regular languages The Chomsky hierarchy of languages Regular (type-3) languages: Grammar: right-linear or left-linear grammars Rule form: A → α where A ∈ V and α ∈ Σ∗ · V ∪ {}. Computational device: finite-state automata The Chomsky hierarchy of languages Context-free (type-2) languages: Grammar: context-free grammars Rule form: A → α where A ∈ V and α ∈ (V ∪ Σ)∗ Computational device: push-down automata The Chomsky hierarchy of languages Context-sensitive
    [Show full text]
  • Representations and Characterizations of Languages in Chomsky Hierarchy by Means of Insertion-Deletion Systems
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by idUS. Depósito de Investigación Universidad de Sevilla Representations and Characterizations of Languages in Chomsky Hierarchy by Means of Insertion-Deletion Systems Gheorghe P˘aun(A,B) Mario J. P´erez-Jim´enez(B,C) Takashi Yokomori(D) (A)Institute of Mathematics of the Romanian Academy PO Box 1-764 – 014700 Bucharest – Romania [email protected], [email protected] (B)Department of Computer Science and Artificial Intelligence Univ. of Sevilla – Avda Reina Mercedes s/n – 41012 Sevilla – Spain [email protected] (D)Department of Mathematics Faculty of Education and Integrated Arts and Sciences Waseda University – 1-6-1 Nishiwaseda – Shinjyuku-ku Tokyo 169-8050 – Japan [email protected] Abstract. Insertion-deletion operations are much investigated in lin- guistics and in DNA computing and several characterizations of Turing computability were obtained in this framework. In this note we contribute to this research direction with a new characterization of this type, as well as with representations of regular and context-free languages, mainly starting from context-free insertion systems of as small as possible complexity. For instance, each recur- sively enumerable language L can be represented in a way similar to the celebrated Chomsky-Sch¨utzenberger representation of context-free lan- guages, i.e., in the form L = h(L(γ) ∩ D), where γ is an insertion system of weight (3, 0) (at most three symbols are inserted in a context of length zero), h is a projection, and D is a Dyck language. A similar represen- tation can be obtained for regular languages, involving insertion systems of weight (2,0) and star languages, as well as for context-free languages – this time using insertion systems of weight (3, 0) and star languages.
    [Show full text]
  • Bounded Languages Described by GF(2)-Grammars
    Bounded languages described by GF(2)-grammars Vladislav Makarov∗ February 17, 2021 Abstract GF(2)-grammars are a recently introduced grammar family that have some un- usual algebraic properties and are closely connected to unambiguous grammars. By using the method of formal power series, we establish strong conditions that are nec- essary for subsets of a∗b∗ and a∗b∗c∗ to be described by some GF(2)-grammar. By further applying the established results, we settle the long-standing open question of proving the inherent ambiguity of the language f anbmc` j n 6= m or m 6= ` g, as well as give a new, purely algebraic, proof of the inherent ambiguity of the language f anbmc` j n = m or m = ` g. Keywords: Formal grammars, finite fields, bounded languages, unambiguous grammars, inherent ambiguity. 1 Introduction GF(2)-grammars, recently introduced by Bakinova et al. [3], and further studied by Makarov and Okhotin [15], are a variant of ordinary context-free grammars, in which the disjunction is replaced by exclusive OR, whereas the classical concatenation is replaced by a new operation called GF(2)-concatenation: K L is the set of all strings with an odd number of partitions into a concatenation of a string in K and a string in L. There are several motivating reasons behind studying GF(2)-grammars. The first of them they are a class of grammars with better algebraic properties, compared to ordinary grammars and similar grammar families, because the underlying boolean semiring logic was replaced by a logic of a field with two elements.
    [Show full text]
  • Calibrating Generative Models: the Probabilistic Chomsky-Schutzenberger¨ Hierarchy∗
    Calibrating Generative Models: The Probabilistic Chomsky-Schutzenberger¨ Hierarchy∗ Thomas F. Icard Stanford University [email protected] December 14, 2019 1 Introduction and Motivation Probabilistic computation enjoys a central place in psychological modeling and theory, ranging in application from perception (Knill and Pouget, 2004; Orban´ et al., 2016), at- tention (Vul et al., 2009), and decision making (Busemeyer and Diederich, 2002; Wilson et al., 2014) to the analysis of memory (Ratcliff, 1978; Abbott et al., 2015), categorization (Nosofsky and Palmeri, 1997; Sanborn et al., 2010), causal inference (Denison et al., 2013), and various aspects of language processing (Chater and Manning, 2006). Indeed, much of cognition and behavior is usefully modeled as stochastic, whether this is ultimately due to indeterminacy at the neural level or at some higher level (see, e.g., Glimcher 2005). Ran- dom behavior may even enhance an agent’s cognitive efficiency (see, e.g., Icard 2019). A probabilistic process can be described in two ways. On one hand we can specify a stochastic procedure, which may only implicitly define a distribution on possible out- puts of the process. Familiar examples include well known classes of generative mod- els such as finite Markov processes and probabilistic automata (Miller, 1952; Paz, 1971), Boltzmann machines (Hinton and Sejnowski, 1983), Bayesian networks (Pearl, 1988), topic models (Griffiths et al., 2007), and probabilistic programming languages (Tenenbaum et al., 2011; Goodman and Tenenbaum, 2016). On the other hand we can specify an analytical ex- pression explicitly describing the distribution over outputs. Both styles of description are commonly employed, and it is natural to ask about the relationship between them.
    [Show full text]
  • Learnability and Language Acquisition
    The data Chomsky hierarchy Bottom up Problems Conclusion Expressive power Learnability and Language Acquisition Alexander Clark Department of Computer Science Royal Holloway, University of London Tuesday The data Chomsky hierarchy Bottom up Problems Conclusion Outline The data Chomsky hierarchy Swiss German Bottom up Problems Conclusion (Ed Stabler, Greg Kobele, Makoto Kanazawa) The data Chomsky hierarchy Bottom up Problems Conclusion Outline The data Chomsky hierarchy Swiss German Bottom up Problems Conclusion More precisely, we observe sounds/gestures. • We don’t observe the meanings, but we are aware of • entailment relations between propositions. (or something similar) The sound/meaning relation is not one-to-one, but there is • a lot of structure. (knowing what was said tells you a lot about the meaning!) The data Chomsky hierarchy Bottom up Problems Conclusion What are the linguistic data? We observe sound/meaning pairs. • The data Chomsky hierarchy Bottom up Problems Conclusion What are the linguistic data? We observe sound/meaning pairs. • More precisely, we observe sounds/gestures. • We don’t observe the meanings, but we are aware of • entailment relations between propositions. (or something similar) The sound/meaning relation is not one-to-one, but there is • a lot of structure. (knowing what was said tells you a lot about the meaning!) The data Chomsky hierarchy Bottom up Problems Conclusion Sound/meaning pairs s5 m5 s4 m4 s3 m3 s2 m2 s1 m1 The data Chomsky hierarchy Bottom up Problems Conclusion Structural descriptions Traditional view grammar g SD φ sound/SM meaning/CI The data Chomsky hierarchy Bottom up Problems Conclusion Example Grammar might be a context-free grammar, augmented • with some compositional semantics.
    [Show full text]