“It Talks!” a Very Brief History of Voice Synthesis

Mark Schubin, HPA Tech Retreat, 2018 February 23

“human voice”
one of the oldest organ stops

When? We don’t know, but
organ-like instrument, 3rd-century BCE
programmable automatic instruments in 9th-century Baghdad
reconstruction of automated reed instrument
hydraulis, 1st-century BCE, Dion, Greece
photo: QuartierLatin1968

The Captain’s Parrot & the Magician
“Captain Birdseye”

Pope Sylvester II (999-1003)
said to have had a “brazen head” that could answer questions yes or no

Alexander Graham Bell (& Trouve)
“How are you grandmamma?”

18th & 19th Centuries
part of Charles Wheatstone’s version of Wolfgang von Kempelen’s talking machine

Joseph Faber’s Euphonia

Electronics: Initially Not for Synthesis
“vocoder”, 1940 version
Homer Dudley, "The Carrier Nature of Speech," Bell System Technical Journal, October 1940
images courtesy of Bell Labs Archive

But There Were World’s Fairs in 1939
“vocoder”, “voder”
Homer Dudley, "The Carrier Nature of Speech," Bell System Technical Journal, October 1940
images courtesy of Bell Labs Archive

Speech Manually Synthesized
“voder” (also known as Pedro)
images courtesy of Bell Labs Archive

AI Gone Bad: “I’m Sorry, Dave.”
HAL 9000 computer eye
from 2001: A Space Odyssey

HAL Regresses As It Loses Its Mind
from 2001: A Space Odyssey

IBM 704
1 hr. to output 17 sec.

1990 Duet with Perry Cook’s “Sheila”
http://www.cs.princeton.edu/~prc/SingingSynth.html
used with permission

The New York Times, April 9, 1975
photo courtesy of Bell Labs Archive

Voder Singing
https://www.youtube.com/watch?v=5hyI_dM5cGo&t=207s

Voder Talking
https://www.youtube.com/watch?v=5hyI_dM5cGo&t=207s

Now You Know
these slides at bit.ly/hpa18-voice