9/26/14

Speech Synthesis

Heather Campbell
September 24, 2014

Why do I care?

• Bridge between speech science and assistive technology to help those with motor speech challenges

• Better understand technology used today – some of the same technology in my lab is used today – how did it get to be this advanced?

• Yet first speech generating devices (SGDs) not used as assistive technology – automata as oddity – cool machine

Earliest Speech Synthesis (18th-19th Centuries)

Anecdotes from 13th century of attempts to build machines that imitate human speech, deemed heretical by the church, destroyed.

1779, Christian Kratzenstein, Russian Academy of Sciences, machine modeling the human vocal tract that could produce the five long vowel sounds

1791, Wolfgang von Kempelen, bellows-operated "acoustic-mechanical speech machine", machine added models of the tongue and lips, enabling it to produce consonants as well as vowels.

1837, Charles Wheatstone - "speaking machine" based on von Kempelen's design

(Flanagan, 1972, as cited by Haskins, 2008)

Early Speech Synthesis (20th Century)

1857, M. Faber - "Euphonia"
– invention in circus
– amazed people and also scared them
– voice described as ghostly and monotonous

1922, J. Q. Stewart's electrical analog of the vocal organs

(as cited by Haskins, 2008)

Early Speech Synthesis (20th Century)

1937, R. Riesz – modeled human vocal tract with valves and chambers

1930s, "Voice Decoder" - automatically analyzed speech into its fundamental tone and resonances

1939, Homer Dudley - The Voder (Voice Operating Demonstrator) - keyboard-operated voice synthesizer

(as cited by Haskins, 2008)

(Klatt, 1987, as cited by Haskins, 2008)


Early Speech Synthesis (20th Century)

1940s-1950 – Franklin Cooper - Haskins Laboratories - Pattern Playback -- converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. (as cited by Haskins, 2008)

1960 - Reg Maling - POSSUM - patient-operated selector mechanism - scanned through images on illuminated display (Wikipedia, 2014)

Since the 1970s

Rapid development associated with technological advancements

1980s/1990s - MITalk system, Dennis Klatt "Klattalk" MIT, Bell Labs system - used sophisticated voicing source - formed basis for many systems used today - first multilingual language-independent systems, making extensive use of natural language processing methods – included 6000 words that do not follow letter-sound conversions (Koul, 2011, p. 267)

(Lemmetty, 1999)

Great audio examples of speech synthesis history 1939-1985: http://www.cs.indiana.edu/rhythmsp/ASA/Contents.html

With Computers Comes Synthesized Speech

• create novel words/messages - messages created from letters/words, phrases, sentences, pictures, symbols

Concatenative Synthesis
• Unit selection synthesis – combines small parts of prerecorded speech (sounds/syllables)
• requires large database
• relatively natural sounding, though not very smooth

Statistical Parametric Synthesis
• "hidden Markov models" (HMMs) and "deep neural network" (DNN) models, formant synthesis
• Machine learning technique – speech waveform constructed from generated parameters
• smooth at boundaries, though not very natural sounding

(Lemmetty, 1999)

Digitized Speech

Use prerecorded speech, limited to only a few phrases, best in institutional settings to communicate basic needs - switch operated (Koul, 2011)

Sentient Systems 1983 - "EyeTyper"
Company later became DynaVox – first SGD with touch-screen technology

Later digital speech generators:
SpringBoard Lite
ChatBox
ActionVoice
BigMack
Cheap Talk
Speak Easy
Digivox/Dynamo (by DynaVox)

Concatenative Synthesis

Input for SGDs

Text-to-Speech (TTS) uses keyboard, switch, or touch screen: enter words and device uses algorithm to generate synthetic speech

• Usually concatenative synthesis
• Can choose your own voice

Examples
• CereProc
• Proloquo2Go
• ModelTalker
(Patel, 2014)
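
The concatenative approach named above (small prerecorded units joined into new messages) can be illustrated with a toy sketch. This is not how CereProc, Proloquo2Go, or ModelTalker are implemented: the unit names, the sine tones standing in for recorded speech, the sample rate, and the crossfade length are all assumptions made for illustration. The short crossfade at each join is one simple way to soften the boundary roughness that concatenative voices are known for.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def tone(freq_hz, dur_ms):
    """Stand-in for a recorded unit: a plain sine tone of the given length."""
    n = SAMPLE_RATE * dur_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

# Hypothetical unit database; a real one holds many recorded sounds/syllables.
UNIT_DB = {
    "he": tone(220.0, 100),
    "llo": tone(330.0, 150),
}

def concatenate(unit_names, fade_ms=10):
    """Join units in order, crossfading linearly across each boundary."""
    fade = SAMPLE_RATE * fade_ms // 1000
    out = []
    for name in unit_names:
        unit = UNIT_DB[name]
        # No overlap for the first unit, because `out` is still empty.
        overlap = min(fade, len(out), len(unit))
        start = len(out) - overlap
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)  # weight of the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[overlap:])
    return out

speech = concatenate(["he", "llo"])
```

Setting `fade_ms=0` gives a hard join with no smoothing, which makes the boundary problem easy to compare.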


Powered by CereProc - the Bush-o-matic

http://www.idyacy.com/cgi-bin/bushomatic.cgi#

CereProc

https://www.cereproc.com/en/support/live_demo

Proloquo2Go

http://www.assistiveware.com/product/proloquo2go/voices http://www.assistiveware.com/product/proloquo2go

Personalized Synthetic Voices

Rupal Patel – combines voice (source) from target talker and speech from speech donor (filter) – to produce customized synthetic voice as part of VocaliD project (Patel, 2013)

- added to TTS ModelTalker version 1 device

Research showing high intelligibility and recognition of voices (Mills, Bunnell, & Patel, 2014).

TEDWomen Clip – Rupal Patel

(Patel, 2013) https://www.youtube.com/watch?v=d38LKbYfWrs
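
The source and filter split behind personalized voices can be sketched in a few lines. This is only the general source-filter idea, not the VocaliD method: the pulse-train source, the two-pole resonators, the sample rate, and the formant values are all assumptions chosen for illustration. The same filter settings (standing in for the donor's speech) are driven by two different source pitches (standing in for two target talkers).

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def pulse_train(f0_hz, n_samples):
    """Source: one unit pulse per pitch period, a crude stand-in for voicing."""
    period = round(SAMPLE_RATE / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator(signal, center_hz, bandwidth_hz):
    """Filter: a two-pole resonance, a crude stand-in for one formant."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)  # pole radius, below 1
    a1 = 2 * r * math.cos(2 * math.pi * center_hz / SAMPLE_RATE)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0  # previous two output samples
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(f0_hz, formants, n_samples=800):
    """Pass the source through each formant resonator in turn."""
    sig = pulse_train(f0_hz, n_samples)
    for center_hz, bandwidth_hz in formants:
        sig = resonator(sig, center_hz, bandwidth_hz)
    return sig

# Illustrative (center, bandwidth) pairs in Hz; not measured from any speaker.
DONOR_FORMANTS = [(700.0, 100.0), (1200.0, 120.0)]

voice_a = synthesize(110.0, DONOR_FORMANTS)  # lower-pitched source
voice_b = synthesize(220.0, DONOR_FORMANTS)  # higher-pitched source, same filter
```

Because only the source changes between `voice_a` and `voice_b`, the two outputs share the same resonances but differ in pitch, which is the kind of separation the slide describes.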


SOURCES

Haskins Laboratories. (1996-2008). Talking Heads: Simulacra - The Early History of Talking Machines. From http://www.haskins.yale.edu/featured/heads/simulacra.html.

Lemmetty, S. (1999). Review of Speech Synthesis Technology. (Masters), Helsinki University of Technology, Helsinki, Finland.

Mills, T., Bunnell, H. T., & Patel, R. (2014). Towards Personalized Speech Synthesis for Augmentative and Alternative Communication. Augmentative and Alternative Communication, 30(3), 226-236.

Patel, R. (2013). Synthetic voices, as unique as fingerprints. From https://www.ted.com/talks/rupal_patel_synthetic_voices_as_unique_as_fingerprints#.

Koul, R. (2011). Synthetic Speech. In O. Wendt (Ed.), Assistive technology principles and applications for communication disorders and special education (pp. 530). Bingley, U.K.: Emerald.

Wikipedia. (2014). Speech-generating device. From http://en.wikipedia.org/wiki/Speech-generating_device
