Unit 2 Structures Handout


2. The definition of a language as a structure of structures

2.1. Phonetics and phonology

Relevance for studying language in its natural or primary medium: oral sounds rather than written symbols.

- Phonic medium: the range of sounds produced by the speech organs insofar as they play a role in language.
- Speech sounds: individual sounds within that range.

Phonetics is the study of the phonic medium: the study of the production, transmission, and reception of human sound-making used in speech.
  e.g. classification of sounds as voiced vs voiceless: /b/ vs /p/

Phonology is the study of the phonic medium not in itself but in relation to language.
  e.g. application of voice to the explanation of differences within the system of language: house (n) vs house (v), use (n) vs use (v)

2.1.1. Phonetics

It is usually divided into three branches, which study the phonic medium from three points of view:

- Articulatory phonetics: speech sounds according to the way in which they are produced by the speech organs.
- Acoustic phonetics: speech sounds according to the physical properties of their sound-waves.
- Auditory phonetics: speech sounds according to their perception and identification.

Articulatory phonetics has the longest tradition, and its progress in the 19th century contributed a standardized and internationally accepted system of phonetic transcription: the origins of the International Phonetic Alphabet used today, which relies on sound symbols and diacritics.

It studies production in relation to the vocal tract, i.e., organs such as:

- lungs
- trachea or windpipe, containing:
  - larynx
  - vocal folds
  - glottis
- pharyngeal cavity
- nose
- mouth, containing:
  - fixed organs: teeth and teeth ridge, hard palate, pharyngeal wall
  - mobile organs: lips, tongue, soft palate, jaw

According to their function and participation, sounds may take several features:

- Voice: voiced vs voiceless sounds, according to the participation of the vocal folds, e.g.
  /b, d, g, z, v/ vs /p, t, k, s, f/
- Nasality: nasal vs oral sounds, according to the participation of the velum or soft palate, e.g. /n/ vs /s/
- …

Acoustic phonetics examines the physical nature of sounds according to variables like:

- sound quality
- pitch
- loudness
- length

Auditory phonetics studies the perception of sounds by hearers based on two mechanisms: physiological and psychological.

A complete description of a sound should include information concerning all three stages/fields: production, transmission, and reception.

Speech sounds are classified as:

Consonants or consonantal-type: their articulation requires a closure or narrowing. They can be classified internally according to:

- Place of articulation:
  - bilabial /p, b, m/
  - labio-dental /f, v/
  - dental /θ, ð/
  - alveolar /t, d, l, n, s, z/
  - retroflex /r/
  - palatal /j/
  - velar /k, g, ŋ/
  - …
- Manner of articulation:
  - complete closure:
    - plosive /p, b, t, d, k, g/
    - affricate /tʃ/
    - nasal /m, n, ŋ/
  - intermittent closure: roll /r/
  - partial closure: lateral /l/
  - narrowing: fricative /f, v, θ, ð, s, z, ʃ/

Vowels or vowel-type: their articulation is not accompanied by closure or narrowing. They can be classified internally according to:

- Position of the soft palate (raised vs lowered)
- Opening formed by the lips (rounded vs unrounded)
- Part of the tongue raised (front, centre or back) and degree of raising (close, close-mid, open-mid, open)

Other vowel-type sounds are:

- diphthongs
  - towards ɪ: /aɪ/ e.g. why, /eɪ/ e.g. play, /ɔɪ/ e.g. point
  - towards ʊ: /aʊ/ e.g. house, /oʊ/ e.g. low, /əʊ/ e.g. no
- triphthongs
  - towards ɪ: /aɪə/ e.g. fire, /eɪə/ e.g. player, /ɔɪə/ e.g. employer
  - towards ʊ: /aʊə/ e.g. flower, /oʊə/ e.g. lower, /əʊə/ e.g. mower
- semi-vowels /w, j/

Suprasegmental (prosodic) features: they affect not just a segment but long stretches of utterances:

- stress position
- rhythm
- intonation

2.1.2. Phonology

- Phoneme: the smallest linguistic unit which may bring about a change of meaning.
- Allophone (or phonemic alternant): a variant form of a phoneme where the variation does not alter the unit's basic identity.

Languages may differ phonologically in respect of:

- the number of phonological elements and their inventories,
- the syntagmatic relations that determine the phonological well-formedness of possible combinations.

Phonemic analysis studies:

- sequences longer than phonemes (syllables)
- alterations and processes of phonemes in connected speech:
  - assimilation: progressive or regressive, within a word or at word boundaries
  - elision
  - juncture
  - …

2.2. Morphology

Traditional approach: the form of language based on the notion WORD.

A revision: there are different types of words, which makes "word" a common term that is difficult to use in technical language:

- lexical word
- grammatical word
- orthographical word
- phonological word

Two concepts arise from this revision:

- word-forms: the various forms that a lexeme may take once inflected, e.g. sing, sings, sang, sung
- lexemes: the smallest distinctive units in the lexicon of a language, e.g. SING, as in a dictionary entry

On these grounds, morphology is then approached as:

- the study of inflection and derivation in a language
- the study of the morphemes of a language

The basic units are:

- morpheme: the smallest meaningful unit of a language, e.g. plural number
- morph: the actual realization of a morpheme, e.g. -s for plural number
- allomorph (or morphemic alternant): a variant form of a morpheme where the variation does not alter the morpheme's basic identity, e.g. -s (/s/), -s (/z/), -ee-, …

The realization of morphemes can be bound or free (according to whether they are dependent or independent, respectively).
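The link between the voiced vs voiceless feature and the allomorphs of the plural morpheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `plural_allomorph` and the two sound sets are hypothetical, the /ɪz/ variant after sibilants is added here and is not listed in the handout, and irregular plurals such as the -ee- type are not covered.

```python
# Hypothetical, simplified sound sets (assumptions for illustration only).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}   # take /ɪz/ (not in the handout)
VOICELESS = {"p", "t", "k", "f", "θ"}          # voiceless non-sibilants take /s/


def plural_allomorph(final_sound: str) -> str:
    """Return the regular English plural allomorph for a base ending in final_sound."""
    if final_sound in SIBILANTS:
        return "ɪz"
    if final_sound in VOICELESS:
        return "s"
    # Every other final sound (voiced consonants and vowels) takes /z/.
    return "z"


# Each word is paired with its final sound, not its final letter.
examples = {"cat": "t", "dog": "g", "bus": "s", "day": "eɪ"}
results = {word: plural_allomorph(sound) for word, sound in examples.items()}
```

The point of the sketch is that one morpheme (plural number) is realized by several morphs whose choice is predictable from a phonetic feature of the base, so the variation does not alter the morpheme's identity.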
The combination of morphemes does not always take place linearly, that is, morphs do not always occur one after the other, so formal alterations may occur in various degrees:

i) Attachment (with or without morphological change):
   Inflectional morphology: go > goes
   Derivational morphology: touch > touchy

   Attachment may entail duplication of a final element in the base:
   Inflectional morphology: stop > stopped
   Derivational morphology: cut > cutter

   Phonological and/or orthographical change as a result of attachment:
   Inflectional morphology: bite /baɪt/ > bit /bɪt/
   Derivational morphology: happy /ˈhæpi/ > happily /ˈhæpili/

ii) Deletion (with or without phonological change):
   Inflectional morphology: bleed > bled
   Derivational morphology: tragedy > tragic

iii) Partial suppletion:
   Inflectional morphology: woman > women
   Derivational morphology: sing > song

iv) Complete suppletion (portmanteau morph):
   Inflectional morphology: go > went
   Derivational morphology: good > well

v) No morphological (sometimes phonological) change at all (zero morph):
   Inflectional morphology: set (non-remote) > set (remote)
   Derivational morphology: go (v) > go (n)

Morphology is a gradient or scale with the following two extreme fields:

Inflectional morphology

It studies inflections, i.e., affixes whose function is to signal various grammatical relationships of the same lexeme. Inflections are usually specific to each word-class; in fact, word-classes are often defined in languages using inflectional morphemes as their distinctive criterion:

- Nouns inflect for number and case
- Verbs inflect for tense, aspect, …
- Adjectives and adverbs inflect for degree
- Pronouns inflect for case, number, person…

Derivational morphology

It studies derivation, that is, the formation of new lexemes.
Two basic types of lexemes can be considered:

- Simple: units with only one constituent and not formed by any word-formation process
- Complex: units with more than one constituent and affected by a word-formation process

The major word-formation processes are:

- Affixation:
  - prefixation: an affix precedes the base, e.g. replace
  - suffixation: an affix follows the base, e.g. placement
  - infixation: an infix sits inside the base (usually in compounds), e.g. syntactico-semantic
- Conversion: the base is reclassified as a new word-class without formal alteration, e.g. bus (v), catch (n)
  Voicing and stress shift may accompany conversion, e.g. believe (v), contrast (v)
- Compounding: one base is added to another to form a new one, e.g. sunrise (from sun + rise)
  - Endocentric: internal-centred, e.g. gasworks
  - Exocentric (bahuvrihi): external-centred, e.g. gasbag
- Blending: parts of two bases form a new lexeme, e.g. electrocute (from electricity and execute)
- Back-formation: a suffix-like ending is deleted from the base, e.g. edit (from editor)
- Shortening:
  - clipping: a part of the base is deleted, e.g. exam (from examination)
  - acronymization: a lexeme is formed with initials, e.g. UK (from United Kingdom)
    Acronyms can be pronounced as words (NATO) or as initials (UFO).
- …

2.3. Syntax

Syntax is globally concerned with the grammaticality of word-strings: it establishes whether sequences of words (phrases, clauses, sentences) are built in accordance with the grammar of a language system:
  e.g. *morning this vs this morning

Syntax studies word-strings, which can be classified as:

Phrase: a sequence of words
- typically containing more than one unit
- lacking a subject+predicator structure
- built around a head or centre.
There are:
  - noun phrases (NPs)
  - verb phrases (VPs)
  - adjective phrases (AdjPs)
  - adverb phrases (AdvPs)
  - prepositional phrases (PrepPs)

Clause: a sequence of words
- intermediate between a phrase and a sentence
- containing a subject+predicator structure
- which may be subordinate or not.
There are:
  - main clauses (MCls.)
  - coordinate clauses (CoordCls.)
  - subordinate clauses (SubCls.)

Sentence: a sequence of words
- which is the largest structural unit at this level
- which cannot be subordinate.

Sentences are thus not linear sequences, but multi-layered sequences. This structure is brought out by techniques like Immediate Constituents (ICs) analysis and its bracketing and labelling.

2.4. Semantics

Semantics as the study of meaning, and difficulties in the definition of meaning:

- prevailing wrong concepts
- previous non-linguistic approaches
- hardly any agreement

A starting point can be that meanings are ideas or concepts which can be transferred from the mind of the speaker to the mind of the hearer by embodying them, as it were, in the forms of one language or another.

Lexical vs sentence meaning

Lexical meaning

Lexicology (the study of the lexicon) and lexicography (the application of the former study to dictionary-making) for the study of meaning.

The structure of meaning is based on the notions of:

- lexical field
- sememe
- seme

resulting in a network of meaning features and related concepts.
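The idea of a network of meaning features can be made concrete with a small Python sketch in which each sememe is modelled as a set of semes. Everything in it is an assumption made for illustration: the lexical field, the feature labels (HUMAN, MALE, and so on) and the function names do not come from the handout, and real componential analyses are considerably less tidy.

```python
# Hypothetical lexical field: each lexeme's sememe is a set of semes (features).
LEXICAL_FIELD = {
    "man": {"HUMAN", "MALE", "ADULT"},
    "woman": {"HUMAN", "FEMALE", "ADULT"},
    "boy": {"HUMAN", "MALE", "YOUNG"},
    "girl": {"HUMAN", "FEMALE", "YOUNG"},
}


def shared_semes(field):
    """Semes common to every lexeme: what holds the field together."""
    return set.intersection(*field.values())


def distinctive_semes(field, a, b):
    """Semes found only in lexeme a, and semes found only in lexeme b."""
    return field[a] - field[b], field[b] - field[a]


common = shared_semes(LEXICAL_FIELD)
man_vs_woman = distinctive_semes(LEXICAL_FIELD, "man", "woman")
```

In this toy field the shared seme defines the field as a whole, while the distinctive semes are what keep two neighbouring lexemes apart.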