DOCUMENT RESUME ED 052 654 FL 002 384 TITLE Speech Research

Total Page:16

File Type:pdf, Size:1020Kb

DOCUMENT RESUME ED 052 654 FL 002 384 TITLE Speech Research DOCUMENT RESUME ED 052 654 FL 002 384 TITLE Speech Research: A Report on the Status and Progress of Studies on Lhe Nature of Speech, Instrumentation for its Investigation, and Practical Applications. 1 July - 30 September 1970. INSTITUTION Haskins Labs., New Haven, Conn. SPONS AGENCY Office of Naval Research, Washington, D.C. Information Systems Research. REPORT NO SR-23 PUB DATE Oct 70 NOTE 211p. EDRS PRICE EDRS Price MF-$0.65 HC-$9.87 DESCRIPTORS Acoustics, *Articulation (Speech), Artificial Speech, Auditory Discrimination, Auditory Perception, Behavioral Science Research, *Laboratory Experiments, *Language Research, Linguistic Performance, Phonemics, Phonetics, Physiology, Psychoacoustics, *Psycholinguistics, *Speech, Speech Clinics, Speech Pathology ABSTRACT This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. The reports contained in this particular number are state-of-the-art reviews of work central to the Haskins Laboratories' areas of research. Tre papers included are: (1) "Phonetics: An Overview," (2) "The Perception of Speech," (3) "Physiological Aspects of Articulatory Behavior," (4) "Laryngeal Research in Experimental Phonetics," (5) "Speech Synthesis for Phonetic and Phonological Models," (6)"On Time and Timing in Speech," and (7)"A Study of Prosodic Features." (Author) SR-23 (1970) SPEECH RESEARCH A Report on the Status and Progress ofStudies on the Nature of Speech,Instrumentation for its Investigation, andPractical Applications 1 July - 30 September1970 U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE OFFICE OF EDUCATION THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROMTHE PERSON OR ORGANIZATION ORIGINATING IT,POINTS OF VIEW OR OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION POSITION OR POLICY. t Haskins Laboratories 0 270 Crown Street New Haven, Conn. 06510 0 0 Distribution of this document isunlimited. (This document contains no information notfreely available to the it primarily for general public. Haskins Laboratories distributes library use.) ACKNOWLEDGEMENTS The research reported here was made possible in part by support from the following sources: Information Systems Branch, Office of Naval Research Contract N00014-67-A-0129-0001 Req. No. NR 048-225 National Institute of Dental Research Grant DE-01774 National Institute of Child Health and Human Development Grant HD-01994 Research and Development Division of the Prosthetic and Sensory Aids Service, Veteran Administration Contract V-1005M-1253 National Institutes of Health General Research Support Grant FR-5596 9 UNCLASSIFIED Security Classification DOCUMENT CONTROL DATA R&D 'Security classification of title, body of abstract and indexing annotatitr1 must he entered IvIton the overall report is rirsilied) 1 OPIGIN A TING ACTIVITY (Corporate author) 2a. REPORT SECURITY CLASSIFICATION Haskins Laboratories, Inc. Unclassified 270 Crown Street 2b. GROUP New Haven, Conn. 06510 N/A 3. REPORT TITLE . Status Report on Speech Research, Nt. 23,July-September1970 4. DESCRIPTIVE NOTES (Type of report and inclusive dates) Interim Scientific Report . 5. AU THORISI (First name, middle initial, last name) . Staff of Haskins Laboratories; Franklin S. Cooper, P.I. eiREPORT DATE 7a. TOTAL NO. OF PAGES 7b. NO. OF REFS October 1970 218 750 8.9. CONTRACT OR GRANT NO. 9a. ORIGINATOR'S REPORT NUMBERIS) ONR Contract NO0014-67-0129-0001 b. PROJECT NO. SR-23 (1970) NIDR: Grant E-01774 cNICHD: Grant HD-01994 9b. OTHER REPORT NOIS) (Any other numbers that may be assigned this report) NIH/DRFR: Grant FR-5596 d.VA/PSAS Contract V-1005M-1253 None . 10.DISTRIBUTION STATEMENT Distribution of this document in unlimited.* . 11. SUPPLEMENTARY NOTES 12. SPONSORING MILITARY ACTIVITY . N/A See No. 8 . 13. ABSTRAC T This report (for 1 July - 30 September 1970) isone of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical,applications. The reports contained in this particular number are state-of-the-art reviews of work central to the Laboratories'areas of research: Phonetics:An Overview The Perception of Speech Physiological Aspects of Articulatory Behavior Laryngeal Research in Experimental Phonetics Speech Synthesis for Phonetic and Phonological Models On Time and Timing in Speech A Study of Prosodic Features . Mial=m1Q. (PAGE 1) 6..., I Nov ::,1 Dirl Fe" Lsdocument contains no information UNCLASSIFIED sir," oloi-SO7-6811 Security Classification A.-31402 . not freely available to the general public. It is dietrfatted primarily for library use. 111W.I.AgATFUrt - Security Classification ,,........................ma. ffirsamem.-mora 1 LINK A I KEY WORDS LINK 8 LINK C ,ROLE IKT ROLE ROLE IT-7-1T Phonetics Perception of speech Physiology of speech production Laryngeal research Speech synthesis Timing in speech Prosodic features - Electroinyography - . 1 1 . , ,.............i.....--,.... --............. (BACK) DDI NO V 65 11473 UNCLASSIFIED SPI 0101-0076821 Security Classification A 31309 CONTENTS I. Introductory Note 1 II. Extended Reports and Manuscripts Phonetics: An Overview 3 The Perception of Speech 15 Physiological Aspects of Articulatory Behavior 49 Laryngeal Research in Experimental Phonetics 69 Speech Synthesis for Phonetic and Phonological Models 117 On Time and Timing in Speech 151 A Study of Prosodic Features 179 III. Manuscripts for Publication, Reports, and Oral Papers 209 INTRODUCTORY NOTE ABOUT STATUS REPORT 23 This issue of the Status Report on Speech Research contains state-of- the-art reviews of work central to the Laboratories' areas of research. They were written by staff members for Volume XII of the series Current Trends in Linguistics. This series, edited by Thomas A. Sebeok and pub- lished by Mouton & Co., began to appear in 1963. Fourteen volumes are projected, the final volume to appear in 1974. Each takes as its domain linguistic scholarship in a particular geographical area or an aspect of the field of linguistics. Specialists have contributed chapters to each volume assessing the state of knowledge and research activity in areas in which they themselves are productive. Volume XII is to appear in 1972 in three tomes under the title Lin- guistics and Adjacent: Arts and Sciences. It is to include the following sections: Part One: History of Linguistics Part Two: Linguistics and Philosophy Part Three: Semiotics Part Four: Linguistics and the Verbal Arts Part Five: Special Languages Part Six: Linguistic Aspects of Translation Part Seven: Linguistics and Psychology Part Eight: Linguistics and Sociology Part Nine: Linguistics and Anthropology Part Ten: Linguistics and Economics Part Eleven:Linguistics and Education Part Twelve:Phonetics Part Thirteen:Bio-Medical Applications Part Fourteen: Computer Applications Part Fifteen: Linguistics as a Pilot Science In addition to Thomas A. Sebeok, the editor of the series, Arthur S. Abramson, Dell Hymes, Herbert Rubenstein, and Edward Stankiewicz are serving as associate editors, and Bernard Spolsky as assistant editor of this volume. The associate editor in charge of Part Twelve, Arthur S. Abramson, as well as five contributors, Michael Studdert-Kennedy, Katherine S. Harris, Ignatius G. Mattingly, Leigh Lisker, and Philip Lieberman are affiliated with Haskins Laboratories. A sixth contributor to that part, Masayuki Sawashima, did most of his work on his contribution while he was a guest member of the Haskins research staff, although he is normally at the University of Tokyo. Mr. Peter de Ridder of Mouton & Co. has kindly allowed us to follow our normal procedure of including in our Status Reports on Speech Research manuscripts accepted for publication elsewhere.The circumstances are unusual in that this issue of our Status Report comprises a sizeable bloc of material soon to appear under the copyright of a publishing house. We trust Mouton's courtesy to us will be repaid if the sampling of articles presented here stimulates interest on the part of our readership in acquiring the whole volume for their institutions or themselves once it has been released. 1 6 A.S. Abramson's "Overview" necessarily refers to all ten articles in his section and not just the five contributions from Haskins Laboratories. For the convenience of our readers, we have listed below the Table of Con- tents for Part Twelve: Phonetics. Arthur S. Abramson Phonetics: An Overview D.B. Fry Phonetics in the Twentieth Century John M. Heinz Speech Acoustics Michael Studdert-Kennedy The Perception of Speech Katherine S. Harris Physiological Aspects of Articulatory Behavior Masayuki Sawashima Laryngeal Research in Experimental Pho- netics Ignatius G. Mattingly Speech Synthesis for Phonetic and Phono- logical Models Leigh Lisker On Time and Timing in Speech Philip Lieberman A Study of Prosodic Features J.C. Catford Phonetic Field Work Ancrg Mal6cot Crosslanguage Phonetics 2 Phonetics: An Overview Arthur S. Abramson Haskins Laboratories, New Haven Phonetics is traditionally concerned with the ways in which the sounds of speech are produced, but the resulting descriptions normally mix auditory factors with articulatory ones, thus depending ultimately upon percepts of the phonetician. This would be true even in a laboratory report using such terms as "high back vowel" that do not themselves stem from instrumental analysis. At the very least, then, we must include the hearing of speech sounds in the definition of phonetics to the extent of allowing
Recommended publications
  • THE DEVELOPMENT of ACCENTED ENGLISH SYNTHETIC VOICES By
    THE DEVELOPMENT OF ACCENTED ENGLISH SYNTHETIC VOICES by PROMISE TSHEPISO MALATJI DISSERTATION Submitted in fulfilment of the requirements for the degree of MASTER OF SCIENCE in COMPUTER SCIENCE in the FACULTY OF SCIENCE AND AGRICULTURE (School of Mathematical and Computer Sciences) at the UNIVERSITY OF LIMPOPO SUPERVISOR: Mr MJD Manamela CO-SUPERVISOR: Dr TI Modipa 2019 DEDICATION In memory of my grandparents, Cecilia Khumalo and Alfred Mashele, who always believed in me! ii DECLARATION I declare that THE DEVELOPMENT OF ACCENTED ENGLISH SYNTHETIC VOICES is my own work and that all the sources that I have used or quoted have been indicated and acknowledged by means of complete references and that this work has not been submitted before for any other degree at any other institution. ______________________ ___________ Signature Date iii ACKNOWLEDGEMENTS I want to recognise the following people for their individual contributions to this dissertation: • My brother, Mr B.I. Khumalo and the whole family for the unconditional love, support and understanding. • A distinct thank you to both my supervisors, Mr M.J.D. Manamela and Dr T.I. Modipa, for their guidance, motivation, and support. • The Telkom Centre of Excellence for Speech Technology for providing the resources and support to make this study a success. • My colleagues in Department of Computer Science, Messrs V.R. Baloyi and L.M. Kola, for always motivating me. • A special thank you to Mr T.J. Sefara for taking his time to participate in the study. • The six Computer Science undergraduate students who sacrificed their precious time to participate in data collection.
    [Show full text]
  • La Voz Humana
    UNIVERSIDAD DE CHILE FACULTAD DE ARTES MAGISTER EN ARTES MEDIALES SHOUT! Tesis para optar al Grado de Magíster en Artes Medialess ALUMNO: Christian Oyarzún Roa PROFESOR GUÍA: Néstor Olhagaray Llanos Santiago, Chile 2012 A los que ya no tienen voz, a los viejos, a los antiguos, a aquellos que el silencio se lleva en la noche. ii AGRADECIMIENTOS Antes que nada quiero agradecer a Néstor Olhagaray, por el tesón en haberse mantenido como profesor guía de este proyecto y por la confianza manifestada más allá de mis infinitos loops y procastinación. También quisiera manifestar mis agradecimientos a Álvaro Sylleros quién, como profesor del Diplomado en Lutería Electrónica en la PUC, estimuló y ayudó a dar forma a parte sustancial de lo aquí expuesto. A Valentina Montero por la ayuda, referencia y asistencia curatorial permanente, A Pablo Retamal, Pía Sommer, Mónica Bate, Marco Aviléz, Francisco Flores, Pablo Castillo, Willy MC, Álvaro Ceppi, Valentina Serrati, Yto Aranda, Leonardo Beltrán, Daniel Tirado y a todos aquellos que olvido y que participaron como informantes clave, usuarios de prueba ó simplemente apoyando entusiastamente el proyecto. Finalmente, quiero agradecer a Claudia González, quien ha sido una figura clave dentro de todo este proceso. Gracias Clau por tu sentido crítico, por tu presencia inspiradora y la ejemplar dedicación y amor por lo que haces. iii Tabla de Contenidos RESÚMEN v INTRODUCCIÓN 1 A Hard Place 1 Descripción 4 Objetivos 6 LA VOZ HUMANA 7 La voz y el habla 7 Balbuceos, risas y gritos 11 Yo no canto por cantar 13
    [Show full text]
  • Part 2: RHYTHM – DURATION and TIMING Część 2: RYTM – ILOCZAS I WZORCE CZASOWE
    Part 2: RHYTHM – DURATION AND TIMING Część 2: RYTM – ILOCZAS I WZORCE CZASOWE From research to application: creating and applying models of British RP English rhythm and intonation Od badań do aplikacji: tworzenie i zastosowanie modeli rytmu i intonacji języka angielskiego brytyjskiego David R. Hill Department of Computer Science The University of Calgary, Alberta, Canada [email protected] ABSTRACT Wiktor Jassem’s contributions and suggestions for further work have been an essential influence on my own work. In 1977, Wiktor agreed to come to my Human-Computer Interaction Laboratory at the University of Calgary to collaborate on problems asso- ciated with British English rhythm and intonation in computer speech synthesis from text. The cooperation resulted in innovative models which were used in implementing the world’s first completely functional and still operational real-time system for arti- culatory speech synthesis by rule from plain text. The package includes the software tools needed for developing the databases required for synthesising other languages, as well as providing stimuli for psychophysical experiments in phonetics and phonology. STRESZCZENIE Pu bli ka cje Wik to ra Jas se ma i wska zów ki do dal szej pra cy w spo sób istot ny wpły nę - ły na mo ją wła sną pra cę. W 1997 ro ku Wik tor zgo dził się na przy jazd do mo je go La - bo ra to rium In te rak cji Czło wiek -Kom pu ter na Uni wer sy te cie w Cal ga ry aby wspól nie za jąć się pro ble ma mi zwią za ny mi z ryt mem w bry tyj skim an giel skim oraz in to na cją w syn te zie mo wy z tek stu.
    [Show full text]
  • Aac 2, 3, 4, 5, 6, 50, 51, 52, 53, 55, 56, 57, 58, 59, 60, 61, 62
    315 Index A augmentative and alternative communication (AAC) 130, 145, 148 AAC 2, 3, 4, 5, 6, 50, 51, 52, 53, 55, 56, 57, Augmentative and Alternative Communication 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, (AAC) 50 69, 130, 131, 138, 140, 141, 142, 143, augmented speech 116, 126 144, 145, 147, 148, 149, 150, 153, 156, Augmented Speech Communication 117 158, 159, 160, 161, 162, 163, 164, 165, Augmented speech communication (ASC) 116 172, 173, 174, 175, 192, 194, 198, 205, automatic categorization 220 206, 207, 208, 213, 214, 215, 216, 257, automatic speech recognition 100 258, 259, 260, 263, 265 Automatic Speech Recognition (ASR) 189 able-bodied people 220 accessibility 12, 14, 15, 21 B acoustic energy 31 adaptive differential pulse code modulation background noise 56, 67, 130, 132, 134, 135, (ADPCM) 32 136, 137, 141, 142, 144 AIBO 17 BCI 17 aided communication techniques 205 bit rate 31, 32, 33, 41, 45, 47 aided communicator 234, 235, 241, 243, 245, bits per second (bps) 31 248, 250, 252 Blissymbols 235, 238, 239, 240, 241, 242, 243, aided language development 235 244, 246, 247, 250, 256 allophones 34, 35 brain-computer interfaces (BCI) 17 alternate reader 17 Brain Computer Interfaces (BCI’s) 6 alternative and augmentative communication brain-machine interface (BMI) 17 (AAC) 2, 93 C alternative communication 1 amyotrophic lateral sclerosis (ALS) 1 CALL 189, 190, 191, 192, 193, 196, 198, 199, analog-to-digital converter 30, 31 201, 202, 203 anti-aliasing (low-pass) filters 3 1 CASLT 188, 189, 190, 191, 193, 198, 199 aphasia 148, 149,
    [Show full text]
  • SPEECH ACOUSTICS and PHONETICS Text, Speech and Language Technology VOLUME 24
    SPEECH ACOUSTICS AND PHONETICS Text, Speech and Language Technology VOLUME 24 Series Editors Nancy Ide, Vassar College, New York Jean V´eronis, Universited´ eProvence and CNRS, France Editorial Board Harald Baayen, Max Planck Institute for Psycholinguistics, The Netherlands Kenneth W. Church, AT&TBell Labs, New Jersey, USA Judith Klavans, Columbia University, New York, USA David T. Barnard, University of Regina, Canada Dan Tufis, Romanian Academy of Sciences, Romania Joaquim Llisterri, Universitat Autonoma de Barcelona, Spain Stig Johansson, University of Oslo, Norway Joseph Mariani, LIMSI-CNRS, France The titles published in this series are listed at the end of this volume. Speech Acoustics and Phonetics by GUNNAR FANT Department of Speech, Music and Hearing, Royal Institute of Technology, Stockholm, Sweden KLUWER ACADEMIC PUBLISHERS DORDRECHT / BOSTON / LONDON A C.I.P Catalogue record for this book is available from the Library of Congress. ISBN 1-4020-2789-3 (PB) ISBN 1-4020-2373-1 (HB) ISBN 1-4020-2790-7 (e-book) Published by Kluwer Academic Publishers, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. Sold and distributed in North, Central and South America by Kluwer Academic Publishers, 101 Philip Drive, Norwell, MA 02061, U.S.A. In all other countries, sold and distributed by Kluwer Academic Publishers, P.O. Box 322, 3300 AH Dordrecht, The Netherlands. Printed on acid-free paper All Rights Reserved C 2004 Kluwer Academic Publishers No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
    [Show full text]
  • Gender, Ethnicity, and Identity in Virtual
    Virtual Pop: Gender, Ethnicity, and Identity in Virtual Bands and Vocaloid Alicia Stark Cardiff University School of Music 2018 Presented in partial fulfilment of the requirements for the degree Doctor of Philosophy in Musicology TABLE OF CONTENTS ABSTRACT i DEDICATION iii ACKNOWLEDGEMENTS iv INTRODUCTION 7 EXISTING STUDIES OF VIRTUAL BANDS 9 RESEARCH QUESTIONS 13 METHODOLOGY 19 THESIS STRUCTURE 30 CHAPTER 1: ‘YOU’VE COME A LONG WAY, BABY:’ THE HISTORY AND TECHNOLOGIES OF VIRTUAL BANDS 36 CATEGORIES OF VIRTUAL BANDS 37 AN ANIMATED ANTHOLOGY – THE RISE IN POPULARITY OF ANIMATION 42 ALVIN AND THE CHIPMUNKS… 44 …AND THEIR SUCCESSORS 49 VIRTUAL BANDS FOR ALL AGES, AVAILABLE ON YOUR TV 54 VIRTUAL BANDS IN OTHER TYPES OF MEDIA 61 CREATING THE VOICE 69 REPRODUCING THE BODY 79 CONCLUSION 86 CHAPTER 2: ‘ALMOST UNREAL:’ TOWARDS A THEORETICAL FRAMEWORK FOR VIRTUAL BANDS 88 DEFINING REALITY AND VIRTUAL REALITY 89 APPLYING THEORIES OF ‘REALNESS’ TO VIRTUAL BANDS 98 UNDERSTANDING MULTIMEDIA 102 APPLYING THEORIES OF MULTIMEDIA TO VIRTUAL BANDS 110 THE VOICE IN VIRTUAL BANDS 114 AGENCY: TRANSFORMATION THROUGH TECHNOLOGY 120 CONCLUSION 133 CHAPTER 3: ‘INSIDE, OUTSIDE, UPSIDE DOWN:’ GENDER AND ETHNICITY IN VIRTUAL BANDS 135 GENDER 136 ETHNICITY 152 CASE STUDIES: DETHKLOK, JOSIE AND THE PUSSYCATS, STUDIO KILLERS 159 CONCLUSION 179 CHAPTER 4: ‘SPITTING OUT THE DEMONS:’ GORILLAZ’ CREATION STORY AND THE CONSTRUCTION OF AUTHENTICITY 181 ACADEMIC DISCOURSE ON GORILLAZ 187 MASCULINITY IN GORILLAZ 191 ETHNICITY IN GORILLAZ 200 GORILLAZ FANDOM 215 CONCLUSION 225
    [Show full text]
  • WIKI on ACCESSIBILITY Completion Report March 2010
    WIKI ON ACCESSIBILITY Completion report March 2010 By Nirmita Narasimhan Programme Manager Centre for Internet and Society Project Title: Wiki on “Accessibility, Disability and the Internet in India” Page | 1 REPORT Accessibility wiki: accessibility.cis-india.org The wiki project was envisaged and funded by the National Internet Exchange of India (www.nixi.in) and has been executed by the Centre for Internet and Society (www.cis-india.org), Bangalore. Project Start date: May 2009 End date: February 2010. Background India has a large percentage of disabled persons in its population— estimated to be over seven per cent as per the Census of 2001. Even this figure is believed to be a gross under representation of the total number of disabled persons residing in this large and diverse country. Taken in figures, this amounts to roughly 70-100 million persons with disabilities in the territory of India. Out of this number, a mere two per cent residing in urban areas have access to information and assistive technologies which enable them to function in society and enhance their performance. There are several reasons for this, one of them being that there is a deplorable lack of awareness which exists on the kinds of disabilities and about ways in which one can provide information and services to disabled persons. Parents, teachers, government authorities and society at large are all equally unaware about the options which exist in technology today to enable persons with disabilities to carry on independent and productive lives. Barring a few exceptions, India is still trapped in an era where a white cane and a Braille slate symbolises the future for blind people, while the world has progressed to newer forms of enabling technology such as screen readers, daisy players, the Kindle and so on.
    [Show full text]
  • Estudios De I+D+I
    ESTUDIOS DE I+D+I Número 51 Proyecto SIRAU. Servicio de gestión de información remota para las actividades de la vida diaria adaptable a usuario Autor/es: Catalá Mallofré, Andreu Filiación: Universidad Politécnica de Cataluña Contacto: Fecha: 2006 Para citar este documento: CATALÁ MALLOFRÉ, Andreu (Convocatoria 2006). “Proyecto SIRAU. Servicio de gestión de información remota para las actividades de la vida diaria adaptable a usuario”. Madrid. Estudios de I+D+I, nº 51. [Fecha de publicación: 03/05/2010]. <http://www.imsersomayores.csic.es/documentos/documentos/imserso-estudiosidi-51.pdf> Una iniciativa del IMSERSO y del CSIC © 2003 Portal Mayores http://www.imsersomayores.csic.es Resumen Este proyecto se enmarca dentro de una de las líneas de investigación del Centro de Estudios Tecnológicos para Personas con Dependencia (CETDP – UPC) de la Universidad Politécnica de Cataluña que se dedica a desarrollar soluciones tecnológicas para mejorar la calidad de vida de las personas con discapacidad. Se pretende aprovechar el gran avance que representan las nuevas tecnologías de identificación con radiofrecuencia (RFID), para su aplicación como sistema de apoyo a personas con déficit de distinta índole. En principio estaba pensado para personas con discapacidad visual, pero su uso es fácilmente extensible a personas con problemas de comprensión y memoria, o cualquier tipo de déficit cognitivo. La idea consiste en ofrecer la posibilidad de reconocer electrónicamente los objetos de la vida diaria, y que un sistema pueda presentar la información asociada mediante un canal verbal. Consta de un terminal portátil equipado con un trasmisor de corto alcance. Cuando el usuario acerca este terminal a un objeto o viceversa, lo identifica y ofrece información complementaria mediante un mensaje oral.
    [Show full text]
  • Masterarbeit
    Masterarbeit Erstellung einer Sprachdatenbank sowie eines Programms zu deren Analyse im Kontext einer Sprachsynthese mit spektralen Modellen zur Erlangung des akademischen Grades Master of Science vorgelegt dem Fachbereich Mathematik, Naturwissenschaften und Informatik der Technischen Hochschule Mittelhessen Tobias Platen im August 2014 Referent: Prof. Dr. Erdmuthe Meyer zu Bexten Korreferent: Prof. Dr. Keywan Sohrabi Eidesstattliche Erklärung Hiermit versichere ich, die vorliegende Arbeit selbstständig und unter ausschließlicher Verwendung der angegebenen Literatur und Hilfsmittel erstellt zu haben. Die Arbeit wurde bisher in gleicher oder ähnlicher Form keiner anderen Prüfungsbehörde vorgelegt und auch nicht veröffentlicht. 2 Inhaltsverzeichnis 1 Einführung7 1.1 Motivation...................................7 1.2 Ziele......................................8 1.3 Historische Sprachsynthesen.........................9 1.3.1 Die Sprechmaschine.......................... 10 1.3.2 Der Vocoder und der Voder..................... 10 1.3.3 Linear Predictive Coding....................... 10 1.4 Moderne Algorithmen zur Sprachsynthese................. 11 1.4.1 Formantsynthese........................... 11 1.4.2 Konkatenative Synthese....................... 12 2 Spektrale Modelle zur Sprachsynthese 13 2.1 Faltung, Fouriertransformation und Vocoder................ 13 2.2 Phase Vocoder................................ 14 2.3 Spectral Model Synthesis........................... 19 2.3.1 Harmonic Trajectories........................ 19 2.3.2 Shape Invariance..........................
    [Show full text]
  • Half a Century in Phonetics and Speech Research
    Fonetik 2000, Swedish phonetics meeting in Skövde, May 24-26, 2000 (Expanded version, internal TMH report) Half a century in phonetics and speech research Gunnar Fant Department of Speech, Music and Hearing, KTH, Stockholm, 10044 Abstract This is a brief outlook of experiences during more than 50 years in phonetics and speech research. I will have something to say about my own scientific carrier, the growth of our department at KTH, and I will end up with an overview of research objectives in phonetics and a summary of my present activities. Introduction As you are all aware of, phonetics and speech research are highly interrelated and integrated in many branches of humanities and technology. In Sweden by tradition, phonetics and linguistics have had a strong position and speech technology is well developed and internationally respected. This is indeed an exciting field of growing importance which still keeps me busy. What have we been up to during half a century? Where do we stand today and how do we look ahead? I am not attempting a deep, thorough study, my presentation will in part be anecdotal, but I hope that it will add to the perspective, supplementing the brief account presented in Fant (1998) The early period 1945-1966 KTH and Ericsson 1945-1949 I graduated from the department of Telegraphy and Telephony of the KTH in May 1945. My supervisor, professor Torbern Laurent, a specialist in transmission line theory and electrical filters had an open mind for interdisciplinary studies. My thesis was concerned with theoretical matters of relations between speech intelligibility and reduction of overall system bandwidth, incorporating the effects of different types of hearing loss.
    [Show full text]
  • Chapter 10 Speech Synthesis Ch T 11 a T Ti S H R Iti Chapter 11 Automatic
    Chapter 10 Speech Synthesis Chapt er 11 Aut omati c Speech Recogniti on 1 An Enggpineer’s Perspective Speech production Speech analysis ShSpeech Speech Speech quality coding recognition assessment Speech Speech Speaker synthesis enhancement recognition 2 3 4 History Long before modern electronic signal processing was invented, speech researchers tried to build machines to create human speech. Early examples of 'speaking heads' were made by Gerbert of Aurillac (d. 1003), Albertus Magnus (1198-1280), and Roger Bacon (1214-1294). In 1779, the Danish scientist Christian Kratzenstein, working at the time at the Russian Academy of Sciences, built models of the human vocal tract that could produce the five long vowel sounds (a, e, i, o and u). 5 Kratzenstein's resonators 6 Engineering the vocal tract: Riesz 1937 7 Homer Dudley 1939 VODER Synthesizing speech by electrical means 1939 World’s Fair •Manually controlled through complex keyboard •Operator training was a problem 8 Cooper’s Pattern Playback Haskins Labs for investigating speech perception Works like an inverse of a spectrograph Light from a lamp goes through a rotating disk then througgph spectrog ram into photovoltaic cells Thus amount of light that gets transmitted at eachfh frequency b and correspond s to amount of acoustic energy at that band 9 Cooper’s Pattern Playback 10 Modern TTS systems 1960’s first full TTS: Umeda et al (1968) 1970’s Joe Olive 1977 concatenation of linear-prediction diphones Speak and Spell 1980’s 1979 MIT MITalk (Allen, Hunnicut, Klatt) 1990’s-present Diphone synthesis Unit selection synthesis 11 Types of Modern Synthesis Articulatory Synthesis: Model movements of articulators and acoustics o f voca l trac t Formant Synthesis: Start with acoustics,,/ create rules/filters to create each formant Concatenative Synthesis: Use databases of stored speech to assemble new utterances.
    [Show full text]
  • Wormed Voice Workshop Presentation
    Wormed Voice Workshop Presentation micro_research December 27, 2017 1 some worm poetry and songs: The WORM was for a long time desirous to speake, but the rule and order of the Court enjoyned him silence, but now strutting and swelling, and impatient, of further delay, he broke out thus... [Michael Maier] He worshipped the worm and prayed to the wormy grave. Serpent Lucifer, how do you do? Of your worms and your snakes I'd be one or two; For in this dear planet of wool and of leather `Tis pleasant to need neither shirt, sleeve, nor shoe, 2 And have arm, leg, and belly together. Then aches your head, or are you lazy? Sing, `Round your neck your belly wrap, Tail-a-top, and make your cap Any bee and daisy. Two pigs' feet, two mens' feet, and two of a hen; Devil-winged; dragon-bellied; grave- jawed, because grass Is a beard that's soon shaved, and grows seldom again worm writing the the the the,eeeronencoug,en sthistit, d.).dupi w m,tinsprsool itav f bometaisp- pav wheaigelic..)a?? orerdi mise we ich'roo bish ftroo htothuloul mespowouklain- duteavshi wn,jis, sownol hof." m,tisorora angsthyedust,es, fofald,junss ownoug brad,)fr m fr,aA?a????ck;A?stelav aly, al is.'rady'lfrdil owoncorara wns t.) sh'r, oof ofr,a? ar,a???????a? fu mo towess,eethen hrtolly-l,."tigolav ict,a???!ol, w..'m,elyelil,tstreamas..n gotaillas.tansstheatsea f mb ispot inici t.) owar.**1 wnshigigholoothtith orsir.tsotic.'m, sotamimoledug imootrdeavet..t,) sh s,tranciror."wn sieee h asinied.tiear wspilotor,) bla av.nicord,ier.dy'et.*tite m.)..*d, hrouceto hie, ig il m, bsomoug,.t.'l,t, olitel bs,.nt,.dotr tat,)aa? htotitedont,j alesil, starar,ja taie ass.nishiceroouldseal fotitoonckysil, m oitispl o anteeeaicowousomirot.
    [Show full text]