A History of Vocoder Research at Lincoln Laboratory

Total Page:16

File Type:pdf, Size:1020Kb

A History of Vocoder Research at Lincoln Laboratory B.Gold A History of Vocoder Research at Lincoln Laboratory During the 1950s, Lincoln Laboratory conducted a study of the detection of pitch in speech. This work led to the design ofvoice coders (vocoders)-devices that reduce the bandwidth needed to convey speech. The bandwidth reduction has two benefits: it lowers the cost oftransmission and reception ofspeech, and it increases the potential privacy. This article reviews the past 30 years ofvocoder research with an emphasis on the work done at Lincoln Laboratory. ifI could detennine what there is in the very rapidly basically a transmitter of waveforms. In fact, changingcomplexspeechwavethatcorrespondstothe telephone technology to date has been almost simple motions ojthe Zips and tongue, I could then totally oriented toward transmission while the analyze speechjorthesequantities, I wouldhaveaset ojspeech defining signals that could be handled as subject of speech modeling has remained of low-jrequency telegraph currents with resulting ad­ peripheral practical interest, in spite ofyears of vantages ojsecrecy, and more telephone channels in research by engineers and scientists from Lin­ the sameJrequency space as well as a basic under­ coln Laboratory, Bell Laboratories, and other standingoJthecarriernatureojspeechbywhichthe lip research centers. reader interprets speechjrom simple motions. -Homer Dudley, 1935 Modern methods ofspeech processing really beganinthe UnitedStateswiththe development Historical Background oftwo devices: the voice operated demonstrator (voder) [4] and channel voice coder (vocoder) [51. More than 200 years ago, the Hungartan Both devices were pioneered by Homer Dudley, inventorW. von Kempelen was the first to build whose philosophy ofspeech processing is quot­ a mechanical talking contrivance that could ed at the beginning of this article. mimic, albeit in a rudimentary fashion, the The appearance of the voder at the 1939 vartoils sounds of human speech [1, 2]. The World's FairinSanFranciscoandNewYorkCity device, constructed around 1780, proved that elicited intense curiosity. The device was con­ human speech production could be replicated. trolled by a human operator who operated a The next major development in the field was console similar to a piano keyboard (Fig. 1). The undoubtedly the most important: Alexander keys on the keyboard switched the appropriate Graham Bell's invention of the telephone. Al­ bandpass fIlters into the system; by depressing though there is no need to chronicle the well­ two or three of the keys and setting the wrist known events that led to the invention, nor any barto thebuzz(Le., voicing) condition, anopera­ need to discuss its subsequent impact on tor could produce vowel and nasal sounds. If humancommunications, itisinterestingto note the Wlist bar was set to the hiss (Le., voice­ that Bell's primary profession was that of a less) condition, sounds such as the voiceless speech scientist, Le., someone with knowledge fricatives could be generated. Special keys ofthe intricacies ofthe humanvocal apparatus. were used to produce the plosive and affrica­ Reference 3, which contains a description of tive sounds (the chin cheese and) injaw), and Bell's Harp telephone, reveals the inventor's a pitch pedalwas used to control the intonation keen understanding of the rudiments of the of the sound. spectral envelope ofspeech. While the voder was the first electronic syn­ But the telephone that Bell invented was thesizer, the channel vocoder was the first The Lincoln Laboratory Journal. Volume 3, Number 2 (1990) 163 Gold - A History oJVocoder Research at Lincoln Laboratory dard). Thus the channel vocoder was qUickly recognized as a means ofreducing the speech bit rate to some number that could be handled through the average telephone channel. Even­ tually the military set the standard rate at 2.4 kb/sec. To understand Dudley's conceptofmore tele­ phone channels in the same frequency space, it is important to realize that human speech pro­ duction depends on relatively slow changes in the vocal-tract articulators such as the tongue and the lips. Thus, ifwe could develop accurate models of the articulators and estimate the parameters of their motion, we could create an analysis-synthesis system that has a low data rate. Dudley managed to bypass the difficult task of modeling the individual articulators by realizing that he could lump all articulator motion into one time-varying spectral envelope. Furthermore, Dudley understood that, to a first approximation, the excitation produced by vo­ cal-cord vibrations (for voiced sounds) and re­ gions ofvocal-tractconstriction (for consonants) could be separated from the actual speech Fig. 1-0riginal sketch of voder by S. W. Watkins (8 Sep­ tember 1937). sounds that are produced by the motion of the tongue, the lips, and the participation of the nasal passage. Both the spectrum envelope and analysis-synthesis system: that is, the channel excitation parameters must vary slowly (be­ vocoder derived certain parameters from a cause the articulators themselves vary slowly) speech wave, and the parameters were then and thus transmission ofthese derived parame­ used to control a synthesizer that reproduced ters requires only several hundred hertz of the speech. bandwidth rather than the usual 3 to 10 kHz To paraphraseDudley, vocoders could lead to needed to transmitspeechvia normal telephony advantages of more secure communications, methods. and a greater number of telephone channels in In an interesting aside, an informative and the same frequency space. Both of Dudley's entertaining paper by W.R. Bennett (6) gives a predictions were correct, but the exact ways in historical survey of the X-System of secret te­ which they came to pass (or are coming to pass) lephony that high-level Allied personnel used probably differed from what he imagined. extensively dUring World War II. For example, In 1929, digital communications was un­ during the invasion of Normandy in 1944, the known. When digitization did become feasible, system was used for communications between researchers realized that the process could London and Washington. Details about X-Sys­ provide a means of making communications tem have only recentlybecome declassified, and less vulnerable to eavesdropping. Digitization, from thatinformationX-Systemappears to have however, required wider transmission band­ been a sophisticated version of Dudley's chan­ widths. For example, the current 3-kHz band­ nel vocoder. X-System included such features width ofa local telephone line cannottransmita as pulse-coded modulation transmission, loga­ pulse-coded modulation speech signal that is rithmic encoding of the channel signal, and, of coded to 64 kb/sec (the present telephone stan- course, enciphered speech. 164 The Lincoln Laboratory Journal. Volume 3. Number 2 (1990) Gold - A History oJVocoder Research at Lincoln Laboratory Advances in Telephony tain that the growth ofcellular phone traffic will require vocoding techniques. In an article that appeared in National Geo­ graphic magazine [7], F.B. Colton gives an en­ grossing account of the history of telephone Work at Lincoln Laboratory transmission. Figure 2, which is from that ar­ Lincoln Laboratory's history in vocoding ticle, shows the telephone wires on lower Broad­ commenced in the 1950s with extensive re­ way in New York City in 1887. It is clear that search in pattern recognition, which led to progress in telephony could easily have been work on detecting the pitch of speech. Some broughtto a haltifnotfor suchimprovementsas of the Laboratory's contributions in vocoding underground cables and multiplexing tech­ include the niques. At present, fiber optical transmission 1. development ofpitch-detection algorithms and lasers are allowing another great leap for­ and hardware that have served as stan­ ward in capacity as telephone traffic continues dards for almost 30 years, to increase. 2. development ofthe first vocoder hardware Such a technological advance brings to mind for satellite and aircraft narrowband the following question: Will optical fibers with speech communications, their enormous bandwidth make vocoding 3. creation of the first real-time computer technology obsolete? The author believes the programs that used linear predictive cod­ answer is no because of the great potential ing (LPC) and homomorphic vocoding, growth of radio telephones, e.g., cellular 4. use ofvocoders for packet speech commu­ phones. The permissible bandwidths of such nications, devices are limited by nature; thus it seems cer- 5. use of high-frequency (HF) channels for vocoding, 6. invention of a novel analysis-synthesis system (the Sine Transform Coder), and 7. introductionofmethods thatallowvocoder research and development to be carried out in real time on general-purpose com­ puters. This article discusses the first, second, third, and seventh contributions, and gives references to the remaining items. The Vocoder Figure 3 is a schematic diagram ofa vocoder. In the left ofthefigure, thevocal-source analyzer finds the correct time-Varying excitation para­ meters of the input speech (discussed in the following section, "Pitch Detection"). A parame­ ter is modeled as either a quasi-periodic pulse train or a noise source. The vocal-tract analyzer finds the time-varying shape of the spectral envelope of the input speech by means of an appropriate algorithm (discussed in the subse­ quent section, "Speech Synthesis"). The excita­ Fig. 2-Telephone wires on lower Broadway in New York tion parameters and spectral envelope
Recommended publications
  • The Phase Vocoder: a Tutorial Author(S): Mark Dolson Source: Computer Music Journal, Vol
    The Phase Vocoder: A Tutorial Author(s): Mark Dolson Source: Computer Music Journal, Vol. 10, No. 4 (Winter, 1986), pp. 14-27 Published by: The MIT Press Stable URL: http://www.jstor.org/stable/3680093 Accessed: 19/11/2008 14:26 Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=mitpress. Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit organization founded in 1995 to build trusted digital archives for scholarship. We work with the scholarly community to preserve their work and the materials they rely upon, and to build a common research platform that promotes the discovery and use of these resources. For more information about JSTOR, please contact [email protected]. The MIT Press is collaborating with JSTOR to digitize, preserve and extend access to Computer Music Journal. http://www.jstor.org MarkDolson The Phase Vocoder: Computer Audio Research Laboratory Center for Music Experiment, Q-037 A Tutorial University of California, San Diego La Jolla, California 92093 USA Introduction technique has become popular and well understood.
    [Show full text]
  • TAL-Vocoder-II
    TAL-Vocoder-II http://kunz.corrupt.ch/ TAL - Togu Audio Line © 2011, Patrick Kunz Tutorial Version 0.0.1 Cubase and VST are trademarks of Steinberg Soft- und Hardware GmbH TAL - Togu Audio Line – 2008 1/9 TAL-Vocoder-II ........................................................................................................... 1 Introduction ................................................................................................................. 3 Installation .................................................................................................................. 4 Windows .............................................................................................................. 4 OS X .................................................................................................................... 4 Interface ...................................................................................................................... 5 Examples .................................................................................................................... 6 Credits ........................................................................................................................ 9 TAL - Togu Audio Line – 2008 2/9 Introduction TAL-Vocoder is a vintage vocoder emulation with 11 bands that emulates the sound of vocoders from the early 80’s. It includes analog modeled components in combination with digital algorithms such as the SFFT (Short-Time Fast Fourier Transform). This vocoder does not make a direct convolution
    [Show full text]
  • The Futurism of Hip Hop: Space, Electro and Science Fiction in Rap
    Open Cultural Studies 2018; 2: 122–135 Research Article Adam de Paor-Evans* The Futurism of Hip Hop: Space, Electro and Science Fiction in Rap https://doi.org/10.1515/culture-2018-0012 Received January 27, 2018; accepted June 2, 2018 Abstract: In the early 1980s, an important facet of hip hop culture developed a style of music known as electro-rap, much of which carries narratives linked to science fiction, fantasy and references to arcade games and comic books. The aim of this article is to build a critical inquiry into the cultural and socio- political presence of these ideas as drivers for the productions of electro-rap, and subsequently through artists from Newcleus to Strange U seeks to interrogate the value of science fiction from the 1980s to the 2000s, evaluating the validity of science fiction’s place in the future of hip hop. Theoretically underpinned by the emerging theories associated with Afrofuturism and Paul Virilio’s dromosphere and picnolepsy concepts, the article reconsiders time and spatial context as a palimpsest whereby the saturation of digitalisation becomes both accelerator and obstacle and proposes a thirdspace-dromology. In conclusion, the article repositions contemporary hip hop and unearths the realities of science fiction and closes by offering specific directions for both the future within and the future of hip hop culture and its potential impact on future society. Keywords: dromosphere, dromology, Afrofuturism, electro-rap, thirdspace, fantasy, Newcleus, Strange U Introduction During the mid-1970s, the language of New York City’s pioneering hip hop practitioners brought them fame amongst their peers, yet the methods of its musical production brought heavy criticism from established musicians.
    [Show full text]
  • Part 2: RHYTHM – DURATION and TIMING Część 2: RYTM – ILOCZAS I WZORCE CZASOWE
    Part 2: RHYTHM – DURATION AND TIMING Część 2: RYTM – ILOCZAS I WZORCE CZASOWE From research to application: creating and applying models of British RP English rhythm and intonation Od badań do aplikacji: tworzenie i zastosowanie modeli rytmu i intonacji języka angielskiego brytyjskiego David R. Hill Department of Computer Science The University of Calgary, Alberta, Canada [email protected] ABSTRACT Wiktor Jassem’s contributions and suggestions for further work have been an essential influence on my own work. In 1977, Wiktor agreed to come to my Human-Computer Interaction Laboratory at the University of Calgary to collaborate on problems asso- ciated with British English rhythm and intonation in computer speech synthesis from text. The cooperation resulted in innovative models which were used in implementing the world’s first completely functional and still operational real-time system for arti- culatory speech synthesis by rule from plain text. The package includes the software tools needed for developing the databases required for synthesising other languages, as well as providing stimuli for psychophysical experiments in phonetics and phonology. STRESZCZENIE Pu bli ka cje Wik to ra Jas se ma i wska zów ki do dal szej pra cy w spo sób istot ny wpły nę - ły na mo ją wła sną pra cę. W 1997 ro ku Wik tor zgo dził się na przy jazd do mo je go La - bo ra to rium In te rak cji Czło wiek -Kom pu ter na Uni wer sy te cie w Cal ga ry aby wspól nie za jąć się pro ble ma mi zwią za ny mi z ryt mem w bry tyj skim an giel skim oraz in to na cją w syn te zie mo wy z tek stu.
    [Show full text]
  • Overview of Voice Over IP
    Overview of Voice over IP February 2001 – University of Pennsylvania Technical Report MS-CIS-01-31* Princy C. Mehta Professor Sanjay Udani [email protected] [email protected] * This paper was written for an Independent Study course. Princy Mehta Overview of Voice over IP Professor Udani Table of Contents ACRONYMS AND DEFINITIONS...............................................................................................................3 INTRODUCTION...........................................................................................................................................5 IMPLEMENTATION OF VOICE OVER IP...............................................................................................6 OVERVIEW OF TCP/IP ...................................................................................................................................6 PACKETIZATION.............................................................................................................................................7 COMPONENTS OF VOIP..................................................................................................................................8 SIGNALING ....................................................................................................................................................8 H.323 .............................................................................................................................................................8 Logical Entities..........................................................................................................................................9
    [Show full text]
  • SPEECH ACOUSTICS and PHONETICS Text, Speech and Language Technology VOLUME 24
    SPEECH ACOUSTICS AND PHONETICS Text, Speech and Language Technology VOLUME 24 Series Editors Nancy Ide, Vassar College, New York Jean V´eronis, Universited´ eProvence and CNRS, France Editorial Board Harald Baayen, Max Planck Institute for Psycholinguistics, The Netherlands Kenneth W. Church, AT&TBell Labs, New Jersey, USA Judith Klavans, Columbia University, New York, USA David T. Barnard, University of Regina, Canada Dan Tufis, Romanian Academy of Sciences, Romania Joaquim Llisterri, Universitat Autonoma de Barcelona, Spain Stig Johansson, University of Oslo, Norway Joseph Mariani, LIMSI-CNRS, France The titles published in this series are listed at the end of this volume. Speech Acoustics and Phonetics by GUNNAR FANT Department of Speech, Music and Hearing, Royal Institute of Technology, Stockholm, Sweden KLUWER ACADEMIC PUBLISHERS DORDRECHT / BOSTON / LONDON A C.I.P Catalogue record for this book is available from the Library of Congress. ISBN 1-4020-2789-3 (PB) ISBN 1-4020-2373-1 (HB) ISBN 1-4020-2790-7 (e-book) Published by Kluwer Academic Publishers, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. Sold and distributed in North, Central and South America by Kluwer Academic Publishers, 101 Philip Drive, Norwell, MA 02061, U.S.A. In all other countries, sold and distributed by Kluwer Academic Publishers, P.O. Box 322, 3300 AH Dordrecht, The Netherlands. Printed on acid-free paper All Rights Reserved C 2004 Kluwer Academic Publishers No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
    [Show full text]
  • Software Sequencers and Cyborg Singers
    Edinburgh Research Explorer Software Sequencers and Cyborg Singers Citation for published version: Prior, N 2009, 'Software Sequencers and Cyborg Singers: Popular Music in the Digital Hypermodern', New Formations: A Journal of Culture, Theory, Politics, vol. 66, no. Spring, pp. 81-99. https://doi.org/10.3898/newf.66.06.2009 Digital Object Identifier (DOI): 10.3898/newf.66.06.2009 Link: Link to publication record in Edinburgh Research Explorer Document Version: Publisher's PDF, also known as Version of record Published In: New Formations: A Journal of Culture, Theory, Politics Publisher Rights Statement: © Prior, N. (2009). Software Sequencers and Cyborg Singers: Popular Music in the Digital Hypermodern. New Formations, 66(Spring), 81-99 doi: 10.3898/newf.66.06.2009. General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 26. Sep. 2021 SOFTWARE SEQUENCERS AND CYBORG SINGERS: POPULAR MUSIC IN THE DIGITAL HYPERMODERN Nick Prior It has been almost twenty years since Andrew Goodwin’s classic essay, ‘Sample and Hold’, claimed that pop music had entered a new phase of digital reproduction.1 If the digital sampler was postmodernism’s musical engine, then hip hop was its recombinant form, and the erosion of divisions between original and copy the celebrated consequence.
    [Show full text]
  • A Vocoder (Short for Voice Encoder) Is a Synthesis System, Which Was
    The Vocoder Thomas Carney 311107435 Digital Audio Systems, DESC9115, Semester 1 2012 Graduate Program in Audio and Acoustics Faculty of Architecture, Design and Planning, The University of Sydney A vocoder (short for voice encoder) is Dudley's breakthrough device a synthesis system, which was initially analysed wideband speech, developed to reproduce human converted it into slowly varying control speech. Vocoding is the cross signals, sent those over a low-band synthesis of a musical instrument with phone line, and finally transformed voice. It was called the vocoder those signals back into the original because it involved encoding the speech, or at least a close voice (speech analysis) and then approximation of it. The vocoder was reconstructing the voice in also useful in the study of human accordance with a code written to speech, as a laboratory tool. replicate the speech (speech synthesis). A key to this process was the development of a parallel band pass A Brief History filter, which allowed sounds to be The vocoder was initially conceived filtered down to a fairly specific and developed as a communication portion of the audio spectrum by tool in the 1930s as a means of attenuating the sounds that fall above coding speech for delivery over or below a certain band. By telephone wires. Homer Dudley 1 is separating the signal into many recognised as one of the father’s of bands, it could be transmitted easier the vocoder for his work over forty and allowed for a more accurate years for Bell Laboratories’ in speech resynthesis. and telecommunications coding. The vocoder was built in an attempt to It wasn’t until the late 1960s 2 and save early telephone circuit early 1970s that the vocoder was bandwidth.
    [Show full text]
  • Estudios De I+D+I
    ESTUDIOS DE I+D+I Número 51 Proyecto SIRAU. Servicio de gestión de información remota para las actividades de la vida diaria adaptable a usuario Autor/es: Catalá Mallofré, Andreu Filiación: Universidad Politécnica de Cataluña Contacto: Fecha: 2006 Para citar este documento: CATALÁ MALLOFRÉ, Andreu (Convocatoria 2006). “Proyecto SIRAU. Servicio de gestión de información remota para las actividades de la vida diaria adaptable a usuario”. Madrid. Estudios de I+D+I, nº 51. [Fecha de publicación: 03/05/2010]. <http://www.imsersomayores.csic.es/documentos/documentos/imserso-estudiosidi-51.pdf> Una iniciativa del IMSERSO y del CSIC © 2003 Portal Mayores http://www.imsersomayores.csic.es Resumen Este proyecto se enmarca dentro de una de las líneas de investigación del Centro de Estudios Tecnológicos para Personas con Dependencia (CETDP – UPC) de la Universidad Politécnica de Cataluña que se dedica a desarrollar soluciones tecnológicas para mejorar la calidad de vida de las personas con discapacidad. Se pretende aprovechar el gran avance que representan las nuevas tecnologías de identificación con radiofrecuencia (RFID), para su aplicación como sistema de apoyo a personas con déficit de distinta índole. En principio estaba pensado para personas con discapacidad visual, pero su uso es fácilmente extensible a personas con problemas de comprensión y memoria, o cualquier tipo de déficit cognitivo. La idea consiste en ofrecer la posibilidad de reconocer electrónicamente los objetos de la vida diaria, y que un sistema pueda presentar la información asociada mediante un canal verbal. Consta de un terminal portátil equipado con un trasmisor de corto alcance. Cuando el usuario acerca este terminal a un objeto o viceversa, lo identifica y ofrece información complementaria mediante un mensaje oral.
    [Show full text]
  • Half a Century in Phonetics and Speech Research
    Fonetik 2000, Swedish phonetics meeting in Skövde, May 24-26, 2000 (Expanded version, internal TMH report) Half a century in phonetics and speech research Gunnar Fant Department of Speech, Music and Hearing, KTH, Stockholm, 10044 Abstract This is a brief outlook of experiences during more than 50 years in phonetics and speech research. I will have something to say about my own scientific carrier, the growth of our department at KTH, and I will end up with an overview of research objectives in phonetics and a summary of my present activities. Introduction As you are all aware of, phonetics and speech research are highly interrelated and integrated in many branches of humanities and technology. In Sweden by tradition, phonetics and linguistics have had a strong position and speech technology is well developed and internationally respected. This is indeed an exciting field of growing importance which still keeps me busy. What have we been up to during half a century? Where do we stand today and how do we look ahead? I am not attempting a deep, thorough study, my presentation will in part be anecdotal, but I hope that it will add to the perspective, supplementing the brief account presented in Fant (1998) The early period 1945-1966 KTH and Ericsson 1945-1949 I graduated from the department of Telegraphy and Telephony of the KTH in May 1945. My supervisor, professor Torbern Laurent, a specialist in transmission line theory and electrical filters had an open mind for interdisciplinary studies. My thesis was concerned with theoretical matters of relations between speech intelligibility and reduction of overall system bandwidth, incorporating the effects of different types of hearing loss.
    [Show full text]
  • “It Talks!” a Very Brief History of Voice Synthesis
    “It Talks!” a very brief history of voice synthesis Mark Schubin, HPA Tech Retreat, 2018 February 23 1 “human voice” one of the oldest organ stops Mark Schubin, HPA Tech Retreat, 2018 February 23 2 When? We don’t know, but organ-like instrument 3rd-century BCE programmable automatic instruments in 9th-century Baghdad reconstruction of automated reed instrument hydraulis, 1st-century BCE Dion, Greece photo: QuartierLatin1968 Mark Schubin, HPA Tech Retreat, 2018 February 23 3 The Captain’s Parrot & the Magician “Captain Birdseye” Mark Schubin, HPA Tech Retreat, 2018 February 23 4 Pope Sylvester II (999-1003) said to have had a “brazen head” that could answer questions yes or no Mark Schubin, HPA Tech Retreat, 2018 February 23 5 Alexander Graham Bell (& Trouve) “How are you grandmamma? Mark Schubin, HPA Tech Retreat, 2018 February 23 6 Mark Schubin, HPA Tech Retreat, 2018 February 23 7 Mark Schubin, HPA Tech Retreat, 2018 February 23 8 18th & 19th Centuries part of Charles Wheatstone’s version of Wolfgang von Kempelen’s talking machine Mark Schubin, HPA Tech Retreat, 2018 February 23 9 Joseph Faber’s Euphonia Mark Schubin, HPA Tech Retreat, 2018 February 23 10 Electronics: Initially Not for Synthesis “vocoder” 1940 Homer Dudley, "The Carrier Nature of Speech" version Bell System Technical Journal, October 1940 images courtesy of Bell Labs Archive Mark Schubin, HPA Tech Retreat, 2018 February 23 11 But There Were World’s Fairs in 1939 “vocoder” “voder” Homer Dudley, "The Carrier Nature of Speech" Bell System Technical Journal, October
    [Show full text]
  • Spoken Language Generation and Un- Derstanding by Machine: a Problems and Applications-Oriented Overview
    See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/256982222 Spoken Language Generation and Understanding by Machine: A Problems and Applications Oriented Overview Chapter · January 1980 DOI: 10.1007/978-94-009-9091-3_1 CITATIONS READS 6 826 1 author: David Hill The University of Calgary 58 PUBLICATIONS 489 CITATIONS SEE PROFILE Some of the authors of this publication are also working on these related projects: Computer-Human Interaction View project Animating speech View project All content following this page was uploaded by David Hill on 03 December 2015. The user has requested enhancement of the downloaded file. Spoken Language Generation and Un- derstanding by Machine: a Problems and Applications-Oriented Overview David R. Hill Department of Computer Science, University of Calgary, AB, Canada © 1979, 2015 David R. Hill Summary Speech offers humans a means of spontaneous, convenient and effective communication for which neither preparation nor tools are required, and which may be adapted to suit the developing requirements of a communication situation. Although machines lack the feedback, based on understanding and shared experience that is essential to this form of communication in general, speech is so fundamental to the psychol- ogy of the human, and offers such a range of practical advantages, that it is profitable to develop and apply means of speech communication with machines under constraints that allow the provision of adequate dialog and feedback. The paper details the advantages of speech com- munication and outlines speech production and perception in terms relevant to understanding the rest of the paper. The hierarchy/heterar- chy of processes involved in both automatic speech understanding and the machine generation of speech is examined from a variety of points of view and the current capabilities in terms of viable applications are noted.
    [Show full text]