Leveraging Multimodal Redundancy for Dynamic Learning, with SHACER — a Speech and Handwriting Recognizer


Leveraging Multimodal Redundancy for Dynamic Learning, with SHACER — a Speech and HAndwriting reCognizER

Edward C. Kaiser

B.A., American History, Reed College, Portland, Oregon (1986)
A.S., Software Engineering Technology, Portland Community College, Portland, Oregon (1996)
M.S., Computer Science and Engineering, OGI School of Science & Engineering at Oregon Health & Science University (2005)

A dissertation submitted to the faculty of the OGI School of Science & Engineering at Oregon Health & Science University in partial fulfillment of the requirements for the degree Doctor of Philosophy in Computer Science and Engineering

April 2007

© Copyright 2007 by Edward C. Kaiser. All Rights Reserved.

The dissertation "Leveraging Multimodal Redundancy for Dynamic Learning, with SHACER — a Speech and HAndwriting reCognizER" by Edward C. Kaiser has been examined and approved by the following Examination Committee:

Philip R. Cohen, Professor, Oregon Health & Science University, Thesis Research Adviser
Randall Davis, Professor, Massachusetts Institute of Technology
John-Paul Hosom, Assistant Professor, Oregon Health & Science University
Peter Heeman, Assistant Professor, Oregon Health & Science University

Dedication

To my wife and daughter.

Acknowledgements

This dissertation in large part owes its existence to the patience, vision and generosity of my advisor, Dr. Phil Cohen. Without his aid it could not have been finished. I would also like to thank Professor Randall Davis for his interest in this work and his time in reviewing and considering it as the external examiner. I am extremely grateful to Peter Heeman and Paul Hosom for their efforts in participating on my thesis committee. There are many researchers with whom I have worked closely at various stages during my research and to whom I am indebted for sharing their support and their thoughts.
I would like to thank Ron Cole for his early support of my desire to study computer science, and Wayne Ward for many valuable discussions about parsing and grammar compilation early in my time as a doctoral student. During my work on the CALO project I have been blessed with many fruitful collaborations, and I would like to acknowledge my gratitude to Alex Rudnicky and Alan Black for their willingness to help in the development of SHACER. I am also grateful to David Demirdjian for his unfailing friendship and support in many related research endeavors. I have enjoyed being part of both the Center for Human Computer Communication (CHCC) and now of Adapx during my years as a PhD student. My thanks to Paulo Barthelmess, Xiao Huang, Matt Wesson, Xioaguang Li, Sanjeev Kumar, David McGee, Michael Johnston, Andrea Coradini and others for their efforts over the years, without which my work, and SHACER in particular, would not have been possible. To my fellow PhD students, Rebecca and Raj, I wish all the best for their forthcoming theses. Finally, I would like to thank my wife and daughter for their trust and love over these last years.

I am grateful to the Defense Advanced Research Projects Agency (DARPA) for financially supporting my thesis research through Contract No. NBCHD030010 for the CALO program. I note that any opinions, findings and conclusions or recommendations expressed in this material are my own and do not necessarily reflect the views of DARPA or the Department of Interior-National Business Center (DOI-NBC). I would also like to thank SRI for letting me work on parts of my thesis as part of the CALO project.

Contents

Dedication
Acknowledgements
Abstract

1 Introduction
  1.1 MOTIVATION
  1.2 THE PROBLEM: NEW LANGUAGE AND THE NEED FOR DYNAMIC LEARNING
  1.3 MULTIMODAL REDUNDANCY AS A BASIS FOR DYNAMIC LEARNING
  1.4 LEVERAGING MULTIMODAL REDUNDANCY
    1.4.1 Problems to be addressed
    1.4.2 Hypothesis
    1.4.3 Argument

2 Empirical Studies of Multimodal Redundancy
  2.1 A WORKING HYPOTHESIS OF MULTIMODAL REDUNDANCY
    2.1.1 Multimodal Complementarity Versus Redundancy
    2.1.2 Out-Of-Vocabulary Words In Natural Speech Processing Contexts
    2.1.3 Dynamic Learning Of Out-Of-Vocabulary Words
    2.1.4 Multimodality In Learning And Teaching
  2.2 STUDY OF MULTIMODAL REDUNDANCY
    2.2.1 Term Frequency - Inverse Document Frequency
    2.2.2 Corpora Description
    2.2.3 Study Results
  2.3 DISCUSSION
    2.3.1 Study Implications
    2.3.2 Multimodal Understanding of Human-Human Interaction
  2.4 SUMMARY

3 SHACER: Overview
  3.1 TASK DOMAIN
    3.1.1 Gantt Charts
    3.1.2 Meeting Corpus
    3.1.3 Ambient Cumulative Interface
  3.2 SHACER PROCESSING
    3.2.1 Modal Recognition
    3.2.2 Phonetic Alignment
    3.2.3 Pronunciation Refinement
    3.2.4 Integration
    3.2.5 Learning
    3.2.6 Understanding Abbreviations
  3.3 SUMMARY

4 Prelude to SHACER: Multimodal New Vocabulary Recognition
  4.1 HIGH LEVEL GOALS
    4.1.1 Dynamic Learning
    4.1.2 The Vocabulary Problem
  4.2 RELATED WORK
    4.2.1 Hybrid Fusion for Speech to Phone Recognition
    4.2.2 Multimodal Semantic Grounding
  4.3 The MNVR Approach
    4.3.1 Mutual Disambiguation
    4.3.2 System Description
    4.3.3 Two-Level Recursive Transition Network Grammar
  4.4 EVALUATION: BASELINE PERFORMANCE TEST
  4.5 CONCLUSION
    4.5.1 Robotic and Intelligent System Learning From Demonstration
    4.5.2 Discussion of MNVR Test Results
  4.6 SUMMARY

5 SHACER: Alignment and Refinement
  5.1 PHONE ENSEMBLE RECOGNITION
    5.1.1 Phone Recognition in Spoken Document Retrieval
    5.1.2 SHACER's Hybrid Word/Phone Recognition Approach
  5.2 INK SEGMENTATION
    5.2.1 Segmentation Rules
    5.2.2 Iterative Best-Splitting of Segmented Stroke Groups
    5.2.3 Correct Segmentation Rates
  5.3 IDENTIFYING HANDWRITING AND SPEECH REDUNDANCY
    5.3.1 Word Spotting and Cross-Domain Matching in SHACER
    5.3.2 Multiparser Alignment Requests Within SHACER
    5.3.3 Phonetic Articulatory-Features as an Alignment Distance Metric
    5.3.4 The Mechanics Of Phonetic Alignment
    5.3.5 Alignment Procedure Modifications for SHACER
  5.4 REFINEMENT OF HANDWRITING/SPEECH SEGMENT PRONUNCIATION
    5.4.1 Lattice Term Sequence Extraction
    5.4.2 Positional Phone-Bigram Modeling
    5.4.3 Second Pass Phone Recognition
  5.5 SUMMARY

6 SHACER: Integration
  6.1 INPUT SOURCES
    6.1.1 Saliency of Various Input Combination
  6.2 CHOOSING THE BEST COMBINATIONS OF INPUT INFORMATION
  6.3 INTEGRATION EXAMPLES
    6.3.1 Lattice Alignment Fusion: 'Fred Green'
    6.3.2 Speech/Handwriting Fusion: 'Cindy Black'
    6.3.3 Speech/Handwriting Fusion: 'Buy Computer'
    6.3.4 Speech/Handwriting Fusion: Discussion
  6.4 DISCOVERING HANDWRITING ABBREVIATION SEMANTICS
    6.4.1 The Role of the Word/Phrase-Spotting Recognizer
    6.4.2 Comparing Word/Phrase-Spotter Recognition to Handwriting
  6.5 SUMMARY

7 SHACER: Testing
  7.1 WORKING HYPOTHESIS
  7.2 MULTIMODAL TURN END-POINTING
    7.2.1 Fixed-Threshold Wait-Times for Turn Segmentation
    7.2.2 Parsing and Edge-Splitting
    7.2.3 Testing Edge-Splitting in Charter
    7.2.4 Baseline Word and Phone Recognition
    7.2.5 Word/Phrase Spotter Out-Of-Vocabulary (OOV) Enrollment
  7.3 TWO PHASE PROCESSING
  7.4 FILE-BASED ACCUMULATORS FOR LEARNING PERSISTENCE
    7.4.1 Adding to Speech and Handwriting Dictionaries
    7.4.2 Expanding the Master Speech Recognizer's Abbreviation Table
    7.4.3 Biasing Handwriting Recognition
    7.4.4 Affix Tracking for Mis-Recognition Correction
  7.5 PLAUSIBILITY TESTING
    7.5.1 Speech/Handwriting Combination Benefits
    7.5.2 Testing Articulatory-Feature Based Phonetic Distances
  7.6 TEST OUTCOMES
    7.6.1 Results on Development Test Set
    7.6.2 Results on Held-Out Test Set
  7.7 SUMMARY

8 Conclusion and Future Work
  8.1 CONCLUSION
    8.1.1 Specificity of Contribution
    8.1.2 Timeliness of Contribution
    8.1.3 Significance of Contribution
    8.1.4 Contributions
  8.2 FUTURE WORK
    8.2.1 Moving Toward Stochastic Decision Processes
    8.2.2 Better Modal Recognizers
    8.2.3 Better Structural Understanding of Syntax and Semantics
    8.2.4 Beyond Handwriting and Speech: MOOVR
    8.2.5 Better Understanding of Multimodal Redundancy

Bibliography

A Articulatory Features and Phonetic Distances
  A.1 ARTICULATORY FEATURE TABLE AND PSEUDO-CODE
    A.1.1 Articulatory Feature Table
  A.2 PER PHONE ARTICULATORY FEATURE VALUE TABLE
  A.3 EXAMPLE PHONE DISTANCE TABLE

B SHACER Grammars
  B.1 GRAMMAR DESCRIPTION AND FORMAT
  B.2 CONSTRAINED SYLLABARY GRAMMAR
  B.3 CONSTRAINED PHONE GRAMMAR
  B.4 WORD PHRASE GRAMMAR ...