Archisegment-Based Letter-To-Phone Conversion for Concatenative Speech Synthesis in Portuguese
Recommended publications
-
Pronunciation Rules in Portuguese Regional Speech (PORT REG) for Coarticulation Process
Pronunciation Rules in Portuguese Regional Speech (PORT REG) for Coarticulation Process Sara Candeias¹ and Jorge Morais Barbosa² ¹ Instituto de Telecomunicações, Department of Computers and Electrical Engineering, University of Coimbra, PORTUGAL ² Department of Portuguese Language, Faculty of Letters, University of Coimbra, PORTUGAL Abstract. This paper describes one aspect of an ongoing work to incorporate pronunciation variability in the Portuguese (PORT) speech system. This work focuses on the linguistic rules to improve the grapheme-(multi)phone transcription algorithm that will be implemented. Portuguese ‘Beira Interior’ regional speech (PORT-BI REG) is considered to be in the realm of coarticulation (post-lexical) phenomena. A set of linguistic rules for most of the common vowel transformation in an utterance (vocalic segments at both the left and right edges of the word) is presented. The analysis focuses on the distinctive features that originate vowel sound challenges in connected speech. The results are interesting from the point of view of setting up models to reconstruct a grapheme-phone transcription algorithm for Portuguese multi-pronunciation speech systems. We propose that the linguistic documentation of Portuguese minority speech can be an optimal start for Portuguese speech system development process, too. Keywords: Text-to-Speech; coarticulation (phonology); structural analysis (linguistic features); pronunciation instruction (phonetic). 1 Introduction Several frameworks have been proposed for the grapheme-to-phone transcription module for Portuguese language, such as [2, 3, 12]. However, the problem with the Portuguese regional speech under development is the shortage of speech and text corpora. This is one of the reasons why their linguistic structure has been very poorly investigated, especially at linguistic levels such as phonetics. -
Jennings on the Trail of Pessoa Or Dimensions of Poetical Music
Jennings on the Trail of Pessoa or dimensions of poetical music Pedro Marques* Keywords Fernando Pessoa, Hubert Jennings, Roy Campbell, Peter Rickart, translation, versification, musicality, The thing that hurts and wrings, What grieves me is not, What saddens me is not. Abstract Here we present two unpublished essays by Hubert Jennings about the challenges of translating the poetry of Fernando Pessoa: the first one of them, brief and fragmentary, is analyzed in the introduction; the second, longer and also covering issues besides translation, is presented in the postscript. Having as a starting point the Pessoan poem “O que me doe” and three translations compared by Hubert Jennings, this presentation examines some aspects of poetic musicality in the Portuguese language: verse measurement, stress dynamics, rhymes, anaphors, and parallelisms. The introduction also discusses how much the English versions of the poem, which are presented by Jennings, recreate (or not) the musical-poetic dimensions of the original text. Palavras-chave Fernando Pessoa, Hubert Jennings, Roy Campbell, Peter Rickart, tradução, versificação, musicalidade, O que me doe, O que me dói. Resumo Reproduzem-se aqui dois ensaios inéditos de Hubert Jennings sobre os desafios de se traduzir a poesia de Fernando Pessoa: o primeiro deles, breve e fragmentário, é analisado numa introdução; o segundo, mais longo e versando também sobre questões alheias à tradução, é apresentado em postscriptum. A partir do poema pessoano “O que me dói” e de três traduções comparadas por Hubert Jennings, esta apresentação enfoca alguns aspectos da música poética em língua portuguesa: medida do verso, dinâmica dos acentos, rimas, anáforas e paralelismos. -
The Kleene Language for Weighted Finite-State Programming: User Documentation, Version 0.9.4.0
The Kleene Language for Weighted Finite-State Programming: User Documentation, Version 0.9.4.0 This Document is Work in Progress Corrections and Suggestions Are Welcome Kenneth R. Beesley SAP Labs, LLC P.O. Box 540475 North Salt Lake Utah 84054, USA 19 May 2014 Copyright © 2014 SAP AG. Released under the Apache License, Version 2. http://www.apache.org/licenses/LICENSE-2.0.html Preface What is Kleene? Kleene is a programming language that can be used to create many useful and efficient linguistic applications based on finite-state machines (FSMs). These applications include tokenizers, spelling checkers, spelling correctors, morphological analyzer/generators and shallow parsers. FSMs are also widely used in speech synthesis and speech recognition. Kleene allows programmers to define, build, manipulate and test finite-state machines using regular expressions and right-linear phrase-structure grammars; and Kleene supports variables, rule-like expressions, user-defined functions and familiar programming-language control syntax. The FSMs can include acceptors and two-projection transducers, either weighted under the Tropical Semiring or unweighted. If this makes no sense, the purpose of the book is to explain it. Pre-edited Kleene scripts can be run from the command line, and a graphical user interface is provided for interactive learning, programming and testing. Operating Systems Kleene runs on OS X and Linux, requiring Java version 1.5 or higher. Prerequisites This book assumes only superficial familiarity with regular languages, regular relations and finite-state machines. While readers need not have any experience with finite-state programming, those who have no programming experience at all, e.g. -
Singing in English in the 21St Century: a Study Comparing
SINGING IN ENGLISH IN THE 21ST CENTURY: A STUDY COMPARING AND APPLYING THE TENETS OF MADELEINE MARSHALL AND KATHRYN LABOUFF Helen Dewey Reikofski Dissertation Prepared for the Degree of DOCTOR OF MUSICAL ARTS UNIVERSITY OF NORTH TEXAS August 2015 APPROVED: Jeffrey Snider, Major Professor Stephen Dubberly, Committee Member Benjamin Brand, Committee Member Stephen Austin, Committee Member and Chair of the Department of Vocal Studies James C. Scott, Dean of the College of Music Costas Tsatsoulis, Interim Dean of the Toulouse Graduate School Reikofski, Helen Dewey. Singing in English in the 21st Century: A Study Comparing and Applying the Tenets of Madeleine Marshall and Kathryn LaBouff. Doctor of Musical Arts (Performance), August 2015, 171 pp., 6 tables, 21 figures, bibliography, 141 titles. The English diction texts by Madeleine Marshall and Kathryn LaBouff are two of the most acclaimed manuals on singing in this language. Differences in style between the two have separated proponents to be primarily devoted to one or the other. An in-depth study, comparing the precepts of both authors, and applying their principles, has resulted in an understanding of their common ground, as well as the need for the more comprehensive information, included by LaBouff, on singing in the dialect of American Standard, and changes in current Received Pronunciation, for British works, and Mid-Atlantic dialect, for English language works not specifically North American or British. Chapter 1 introduces Marshall and The Singer’s Manual of English Diction, and LaBouff and Singing and Communicating in English. An overview of selected works from Opera America’s resources exemplifies the need for three dialects in standardized English training. -
Metathesis Is Real, and It Is a Regular Relation A
METATHESIS IS REAL, AND IT IS A REGULAR RELATION A Dissertation submitted to the Faculty of the Graduate School of Arts and Sciences of Georgetown University in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Linguistics By Tracy A. Canfield, M.S. Washington, DC November 17, 2015 Copyright 2015 by Tracy A. Canfield All Rights Reserved METATHESIS IS REAL, AND IT IS A REGULAR RELATION Tracy A. Canfield, M.S. Thesis Advisor: Elizabeth C. Zsiga, Ph.D. ABSTRACT Regular relations are mathematical models that are widely used in computational linguistics to generate, recognize, and learn various features of natural languages. While certain natural language phenomena – such as syntactic scrambling, which requires a re-ordering of input elements – cannot be modeled as regular relations, it has been argued that all of the phonological constraints that have been described in the context of Optimality Theory can be, and, thus, that the phonological grammars of all human languages are regular relations; as Ellison (1994) states, "All constraints are regular." Re-ordering of input segments, or metathesis, does occur at a phonological level. Historically, this phenomenon has been dismissed as simple speaker error (Montreuil, 1981; Hume, 2001), but more recent research has shown that metathesis occurs as a synchronic, predictable phonological process in numerous human languages (Hume, 1998; Hume, 2001). This calls the generalization that all phonological processes are regular relations into doubt, and raises other -
A STUDY of L2 KANJI LEARNING PROCESS Analysis of Reading and Writing Errors of Swedish Learners in Comparison with Level-Matched Japanese Schoolchildren
A STUDY OF L2 KANJI LEARNING PROCESS Analysis of Reading and Writing Errors of Swedish Learners in Comparison with Level-matched Japanese Schoolchildren Fusae Ivarsson Department of Languages and Literatures Doctoral dissertation in Japanese, University of Gothenburg, 18 March, 2016 Fusae Ivarsson, 2016 Cover: Fusae Ivarsson, Thomas Ekholm Print: Reprocentralen, Campusservice Lorensberg, Göteborgs universitet, 2016 Distribution: Institutionen för språk och litteraturer, Göteborgs universitet, Box 200, SE-405 30 Göteborg ISBN: 978-91-979921-7-6 http://hdl.handle.net/2077/41585 ABSTRACT Ph.D. dissertation at the University of Gothenburg, Sweden, 18 March, 2016 Title: A Study of L2 Kanji Learning Process: Analysis of reading and writing errors of Swedish learners in comparison with level-matched Japanese schoolchildren. Author: Fusae Ivarsson Language: English, with a summary in Swedish Department: Department of Languages and Literatures, University of Gothenburg, Box 200, SE-405 30 Gothenburg, Sweden ISBN: 978-91-979921-7-6 http://hdl.handle.net/2077/41585 The present study investigated the characteristics of the kanji learning process of second language (L2) learners of Japanese with an alphabetic background in comparison with level-matched first language (L1) learners. Unprecedentedly rigorous large-scale experiments were conducted under strictly controlled conditions with a substantial number of participants. Comparisons were made between novice and advanced levels of Swedish learners and the respective level-matched L1 learners (Japanese second and fifth graders). The experiments consisted of kanji reading and writing tests with parallel tasks in a practical setting, and identical sets of target characters for the level-matched groups. Error classification was based on the cognitive aspects of kanji. -
Typesetting Catalan Texts with TeX
Typesetting Catalan Texts with TeX Gabriel Valiente Feruglio Universitat de les Illes Balears Departament de Ciències Matemàtiques i Informàtica E-07071 Palma de Mallorca (Spain) Internet: dmigvaO@ps.uib.es Robert Fuster Universitat Politècnica de València Departament de Matemàtica Aplicada Camí de Vera, 14. E-46071 València (Spain) Internet: mat5rfc@cci.upv.es Abstract As with other non-American English languages, typesetting Catalan texts imposes some special requirements on TeX. These include a particular set of hyphenation patterns and support for a special ligature: unlike other Romanic languages, Catalan incorporates the middle point in the l·l digraph. Hyphenation rules for Catalan are reviewed in this paper, after a short introduction to hyphenation by TeX. A minimal set of hyphenation patterns covering all Catalan accents and diacritics is also presented. A discussion about the l·l ligature concludes the paper. This work represents a first step towards the Catalan TLP (TeX Language Package), under development within the TWGMLC (Technical Working Group on Multiple Language Coordination), where the first author is chairing the Catalan linguistic subgroup. Resum Així com en altres llengües, la composició de textos escrits en català demana requeriments especials al TeX. Aquests inclouen un conjunt particular de patrons de guionat, així com suport per a un lligam especial ja que, a diferència d'altres llengües romàniques, el català incorpora el punt volat al dígraf l·l. En aquest paper es fa una introducció al guionat amb TeX i es revisen les regles de guionat per al català. Tanmateix, es presenta un conjunt mínim de patrons de guionat que cobreix tots els accents i marques diacrítiques del català. -
Department of English Linguistics School of English and American Studies Eötvös Loránd University Budapest, Hungary Table of Contents
The Odd Yearbook 2014 Department of English Linguistics School of English and American Studies Eötvös Loránd University Budapest, Hungary Table of Contents: Foreword; Fodor, Brigitta: Scottish Vowel Length: Regular Vowel Length Alternations and the Raising of /ae/ in Scottish Standard English; Pándi, Julianna Sarolta: Flapping in American English: A Theoretical Approach; Szalay, Tünde: L Vocalisation in Three English Dialects; Szabó, Petra Florina: Social and Regional Variation and Intrusive /r/; Szabó, Lilla Petronella: “You’re a good friend, bro!”—A Corpus-based Investigation of the Meanings of ‘bro’; Kiss, Angelika: A remark on the Individual/Stage-Level Predicate Distinction in English; Biczók, Bálint: Approaches to Antecedent-Contained Deletion; Tomacsek, Vivien: Approaches to the Structure of English Small Clauses; Kucsera, Márton: Restrictive relative clauses in Alignment Syntax. Foreword This is the ninth volume in the series ODD Yearbook, which is the collection of undergraduate -
The Perception and Production of Portuguese Mid-Vowels by Native Speakers of American English
Brigham Young University BYU ScholarsArchive Theses and Dissertations 2004-03-20 The Perception and Production of Portuguese Mid-Vowels by Native Speakers of American English Richard Ryan Kendall Brigham Young University - Provo Follow this and additional works at: https://scholarsarchive.byu.edu/etd Part of the Spanish and Portuguese Language and Literature Commons BYU ScholarsArchive Citation Kendall, Richard Ryan, "The Perception and Production of Portuguese Mid-Vowels by Native Speakers of American English" (2004). Theses and Dissertations. 5. https://scholarsarchive.byu.edu/etd/5 This Thesis is brought to you for free and open access by BYU ScholarsArchive. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of BYU ScholarsArchive. THE PERCEPTION AND PRODUCTION OF PORTUGUESE MID-VOWELS BY NATIVE SPEAKERS OF AMERICAN ENGLISH By Richard R. Kendall A thesis submitted to the faculty of Brigham Young University in partial fulfillment of the requirements for the degree of Master of Arts Department of Spanish and Portuguese Brigham Young University April 2004 Copyright © 2004 Richard R. Kendall All Rights Reserved ABSTRACT THE PERCEPTION AND PRODUCTION OF PORTUGUESE MID-VOWELS BY NATIVE SPEAKERS OF AMERICAN ENGLISH Richard R. Kendall Department of Spanish and Portuguese Master of Arts This thesis examines the difficulties that beginning and advanced American learners of Portuguese have correctly perceiving and producing the Portuguese mid-vowels /ɛ e ɔ o/. The beginning learners were enrolled in their second semester of Portuguese and had rudimentary knowledge of Portuguese. The advanced learners had all lived in Brazil for nearly two years and were enrolled in a more advanced Portuguese course. -
University of Florida Thesis Or Dissertation Formatting
THE ACQUISITION OF ORTHOGRAPHIC-PHONOLOGICAL CORRESPONDENCE RULES IN L2 AND L3 PORTUGUESE: ERROR RESOLUTION, INTERFERENCE, AND GENERALIZABILITY By SHARON BARKLEY A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2010 © 2010 Sharon Barkley To Herman O. ACKNOWLEDGMENTS I thank my dissertation chair, Dr. Gillian Lord, for her enduring patience, wise counsel, speedy feedback, and constant availability and willingness to help. I thank my committee members, Dr. Theresa Antes, Dr. Libby Ginway, and Dr. Edith Kaan, for their interest, advice, and dedication in reading this dissertation. I thank my raters and fellow teaching assistants, Andrea Ferreira and Quinn Hansen, for their friendship and for the hours they spent cheerfully listening to and transcribing the data for this project. I thank my statistics consultant, Claudio Fuentes, for his invaluable help in the statistical analysis of the data. I thank my professors and colleagues in the Linguistics Program and the Department of Spanish and Portuguese Studies, for their input and support, especially Dr. Laurie Massery and Dr. Rose Smouse. I thank my teaching supervisor, Dr. Libby Ginway, my fellow teaching assistants, and my many students for all that I have learned from them while teaching Portuguese. I thank my family and friends for their continued love, support, encouragement and prayers throughout this degree program, especially Bill and Mary Barkley, Dr. Thomas Barkley, Sharon Dice, Carol Hall, Tom and Florence Barkley, Liz Horner, Chad Preston, and the members of the “Supper Club.” Finally, I thank God for His innumerable gifts to me, and I pray that He may bless these many people who have so patiently accompanied me on this journey. -
Non-Dravidian Elements and (Non)Diasystematic Change in Malayalam
To appear in Constructions in Contact 2: Language change, multilingual practices, and additional language acquisition eds. Höder, S. & Boas, H. Title: Non-Dravidian elements and (non)diasystematic change in Malayalam Savithry Namboodiripad (University of Michigan, Ann Arbor) Abstract: This chapter applies a Diasystematic Construction Grammar (DCxG) approach to account for non-Dravidian vocabulary and phonology in Malayalam, a high-contact Dravidian language. The distinction made in DCxG between diaconstructions, which are language non-specific, and idioconstructions, which are language-specific, proves useful in accounting for semantic specialization and phonological heterogeneity due to language contact. Notably, increased contact with English has led in some cases to decreased phonological adaptation, as some constructions change from diaconstructions to idioconstructions: Non-diasystematic change. Taken together, this chapter argues that any analysis of Malayalam must account for non-Dravidian subpatterns, and including language labels as part of speakers' linguistic knowledge enhances our understanding of the dynamics of language contact. Keywords: Malayalam, English, language contact, language change, semantic specialization, loanword adaptation, historical linguistics, sociolinguistics 1 High-contact languages are not categorically different By introducing a distinction between diaconstructions, which are language non-specific, and idioconstructions, which are language-specific, Diasystematic Construction Grammar (DCxG) captures the fact that borders between languages, while often blurred, also shape language use (Höder 2012). Building on other construction-theoretic approaches, DCxG characterizes all languages as consisting of emergent subpatterns of constructions. This includes hybrid or high-contact languages, which differ only in that those subpatterns can be attributed (by analysts, not necessarily speakers) as belonging to a particular language. -
Stress and Syllable Structure in English
Stress and Syllable Structure in English: Approaches to Phonological Variations* San Duanmu, Hyo-Young Kim, and Nathan Stiennon University of Michigan Abstract 1. What is phonological variation? We use phonological variation to refer to alternative forms that can be used for more or less similar purposes. For example, in English a word made of CVCVCV can have stress on the first syllable, as in Canada, or on the second syllable, as in banana. There is no reason why the stress pattern could not have been the other way round, i.e. for Canada to have stress on the second syllable and for banana to have stress on the first. Nor is there any reason why stress in such words cannot be all on the first syllable, or all on the second. English just happens to use both forms. Similarly, an English word can be VC, such as Ann, CVC, such as sit, or CCCVC, such as split. There is no reason why a word must use one or another form and English just happens to use all those forms. Besides variations within a language, there are also variations across different languages. For example, before the nuclear vowel Standard Chinese allows CG- but not CC-, whereas English allows both CG- and CC-. Similarly, Standard Chinese only allows [–n] and [–ŋ] after the nuclear vowel, whereas English allows many more consonants. * Portions of this work were presented at University of Michigan in 2002, the Second North American Phonology Conference in 2002, Wayne State University in 2004, Peking University in 2004, the 10th Mid-Continental Workshop on Phonology in 2004, and National Chengchi University, Taipei, in 2005.