Information State Based Speech Recognition

Total Page:16

File Type:pdf, Size:1020Kb

GOTHENBURG MONOGRAPHS IN LINGUISTICS 41

Information State Based Speech Recognition

Rebecca Jonson

Doctoral dissertation in Linguistics, University of Gothenburg

© Rebecca Jonson 2010
Printed by Reprocentralen, University of Gothenburg, 2010.
Dissertation edition, May 2010
Published at: http://hdl.handle.net/2077/22169
Distribution: Institutionen för filosofi, lingvistik och vetenskapsteori, Göteborgs Universitet, Box 200, 405 30 Göteborg

Abstract

Ph.D. dissertation in Linguistics at University of Gothenburg, Sweden, 2010
Title: Information State Based Speech Recognition
Author: Rebecca Jonson
Language: English
Department: Department of Philosophy, Linguistics and Theory of Sciences, University of Gothenburg, Box 200, SE-405-30 Göteborg
Series: Gothenburg Monographs in Linguistics 41
Published at: http://hdl.handle.net/2077/22169

One of the pitfalls in spoken dialogue systems is the brittleness of automatic speech recognition (ASR). ASR systems often misrecognize user input and they are unreliable when it comes to judging their own performance. Recognition failures and deficient confidence estimation affect the performance of a dialogue system as a whole and the impression it makes on a user. Humans outperform ASR systems on most tasks related to speech understanding. One of the reasons is that humans make use of much more knowledge. For example, humans appear to take a variety of knowledge-based aspects of the current dialogue into account when processing speech. The main purpose of this thesis is to investigate whether speech recognition can also benefit from the use of higher level knowledge sources and dialogue context when used in spoken dialogue systems.
One of the major contributions of this thesis is to provide more insight into what type of knowledge sources in spoken dialogue systems would be potential contributors to the task of ASR and how such knowledge can be represented computationally. In the framework of information state based dialogue management we have an important source of semantic and pragmatic knowledge represented in the information state. We will investigate if the knowledge in the information state can help to alleviate the search problem and reliability estimation in speech recognition. We call this knowledge and context aware approach to speech recognition information state based speech recognition.

The first part of this thesis investigates approaches to obtaining better initial language models more rapidly for spoken dialogue systems and ways of dynamically selecting the most appropriate models based on the dialogue context. The second part of this thesis concerns the use of the speech recognition output and investigates how additional knowledge sources can enhance a dialogue system's decision-making on how to proceed and make use of speech recognition hypotheses. The thesis presents several experimental studies addressing the issues described above and proposes an integration of the explored techniques into the GoDiS dialogue system.

Keywords: dialogue systems, speech recognition, language modelling, dialogue move, dialogue context, ASR, higher level knowledge, linguistic knowledge, N-Best re-ranking, confidence scoring, confidence annotation, information state, ISU approach.

Acknowledgements

A long journey is coming to an end and it is time to express my gratitude to everyone travelling with me along the way. This journey would not even have started if it was not for my supervisor Prof. Robin Cooper, who was the first to encourage me to become a researcher, and gave me the opportunity to carry out my PhD studies remotely from Spain.
Moreover, without Robin holding the compass and helping me to navigate I would probably be lost on the seven seas. Thanks for your guidance, support, patience and never-ending encouragement. Thanks for reading and re-reading every single paragraph of the manuscript. Also my deep gratitude for contributing to this thesis with your perspicaciousness, brilliant ideas and insightful comments.

In fact, the inspiration for this voyage started long before my PhD studies with my first encounter with speech recognition and dialogue systems. Thanks to Prof. Jim Hieronymous and Prof. Luis Hernández for introducing me to the world of speech recognition and to Staffan Larsson for evoking my interest in dialogue systems and introducing me to the GoDiS "club". The research questions in this thesis started to grow while working at Telefónica R&D (TID). I am therefore sincerely grateful to my dear TID colleagues for all interesting discussions. Gracias!

This thesis has clearly benefitted from all constructive feedback and insightful comments from my second supervisor Rolf Carlson, from Oliver Lemon (Chapter 8), from Steve Young and Karl Weilhammer (Chapter 4 and Chapter 7), from Joakim Nivre (Chapter 4, Chapter 6 and Chapter 8), from Stina Ericsson and from David Hjelm as well as from all valuable comments from anonymous reviewers of my published papers.

For a voyage to be able to reach its end it is also necessary to have the machinery working. A big thank you to Robert Andersson for being a brilliant remote system administrator keeping the boat's engine healthy and up to date. Warm thanks to the Gothenburg Dialogue lab people (Aarne, Ann-Charlotte, Björn, David, Håkan, Jessica, Peter, Staffan, Stina) who have contributed in many different ways and given me a helping hand with GF and GoDiS whenever needed.
A special thanks to David Hjelm for your technical help, for inspiring discussions about speech and dialogue and for being a brilliant colleague both at the university and at Artificial Solutions. I also owe a great deal to all students and colleagues who participated in the experiments.

An asset when travelling is all the people you meet. I am so lucky to have had the opportunity, thanks to GSLT¹, to get to know all amazing GSLT PhD students. I am also happy to have met so many wonderful colleagues, both at the Department of Linguistics and at Artificial Solutions. I am sincerely grateful to have had the opportunity to participate in the TALK project² and being able to meet so many interesting people sharing the same research interests, as well as receiving funding for my work and for the conference travels to present my work.

This journey would not have been possible without the support of my family and friends. Thanks to my parents for always believing in me and encouraging whatever journey I decide to embark on. Besos for Bella for being an extraordinary sister. Kramar to my brother Joakim and his wonderful kids. Lots of love to my everlasting friends (Annica, Kristin, Magdalena, Mira, Sofia, Therese) for being so close although we are so far away. My deepest thanks to Oscar for being the best possible travelmate and making every new day a thrilling adventure.

Finally, thanks beforehand to everyone that has the intention of reading parts of this thesis. Enjoy the journey!

¹ The Swedish Graduate School of Language Technology
² TALK (Talk and Look, Tools for Ambient Linguistic Knowledge), IST-507802

Contents

1 Introduction
  1.1 Why is speech so difficult for dialogue systems?
  1.2 Research questions
  1.3 Thesis outline

I Preliminaries

2 Background
  2.1 A brief introduction to ASR
    2.1.1 Digital signal processing
    2.1.2 Acoustic modelling
    2.1.3 Language modelling
    2.1.4 Decoding
    2.1.5 The three fundamental ranges
  2.2 A brief introduction to spoken dialogue systems
    2.2.1 Spoken language understanding
    2.2.2 Dialogue management
    2.2.3 Natural language generation
    2.2.4 A brief historical background
  2.3 Evaluation metrics in ASR
    2.3.1 Perplexity
    2.3.2 Word error rate and sentence error rate
    2.3.3 Word error rate vs concept error rate
    2.3.4 Dialogue move error rate
    2.3.5 Significance testing
  2.4 Survey of attempts to compensate for ASR deficiencies in spoken dialogue systems
    2.4.1 Improving the front end
    2.4.2 Improving digital signal processing
    2.4.3 Improving acoustic modelling
    2.4.4 Improving language modelling
    2.4.5 Improving ASR hypotheses selection
    2.4.6 Improving confidence annotation
    2.4.7 Improvement on other levels
  2.5 Conclusions
3 Baseline systems
  3.1 The information state update approach
  3.2 The TrindiKit platform for dialogue system development
  3.3 The GoDiS dialogue system
    3.3.1 GoDiS information state
    3.3.2 GoDiS dialogue moves
    3.3.3 GoDiS plans and accommodation
    3.3.4 GoDiS rule format
    3.3.5 GoDiS grounding behaviour
  3.4 The baseline GoDiS applications
    3.4.1 DJ-GoDiS: The MP3 player
    3.4.2 AgendaTalk: The talking calendar
  3.5 The TrindiKit logging format
Recommended publications
  • The Definitive Guide
    Mastering the MP3 Audio Experience. MP3: The Definitive Guide. O'REILLY®. Scot Hacker. Facebook's Exhibit No. 1062, Page 1. O'REILLY®: Beijing · Cambridge · Farnham · Köln · Paris · Sebastopol · Taipei · Tokyo. MP3: The Definitive Guide by Scot Hacker. Copyright © 2000 O'Reilly & Associates, Inc. All rights reserved. Printed in the United States of America. Published by O'Reilly & Associates, Inc., 101 Morris Street, Sebastopol, CA 95472. Editor: Simon Hayes. Production Editor: Maureen Dempsey. Cover Designer: Hanna Dyer. Printing History: March 2000: First Edition. Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly & Associates, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps. The association between the image of a hermit crab and MP3 is a trademark of O'Reilly & Associates, Inc. While every precaution has been taken in the preparation of this book, the publisher assumes no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. Library of Congress Cataloging-in-Publication Data: Hacker, Scot. MP3: the definitive guide / Scot Hacker. 1st ed. p. cm. ISBN 1-56592-661-7 (alk. paper). 1. MP3. 2. MP3 players. 3. Music-Computer programs. 4. Internet (Computer network)-Computer programs. I. Title. ML74.4.M6 H33 2000. 780'.285'65-dc21. 00-025403. ISBN: 1-56592-661-7 [4/00] [M]
  • Video Compression: MPEG-4 and Beyond
    Video Compression: MPEG-4 and Beyond. Ali Saman Tosun. Abstract: Technological developments in networking technology and in computers make the use of video possible. Storing and transmitting uncompressed raw video is not a good idea; it requires large storage space and bandwidth. Special algorithms which take the characteristics of the video into account can compress the video with a high compression ratio. In this paper I will give an overview of the standardization efforts on video compression (MPEG-1, MPEG-2, MPEG-4, MPEG-7), and I will briefly explain the current video compression trends. Table of Contents: 1. Introduction; 2. H.261; 3. H.263; 4. H.263+; 5. MPEG (5.1 MPEG-1, 5.2 MPEG-2, 5.3 MPEG-3, 5.4 MPEG-4, 5.5 MPEG-7); 6. J.81; 7. Fractal-Based Coding; 8. Model-based Video Coding; 9. Scalable Video Coding; 10. Wavelet-based Coding; Summary; References; List of Acronyms. (http://www.cis.ohio-state.edu/~jain/cis788-99/compression/index.html) 1. Introduction: Over the last couple of years there has been a great increase in the use of video in digital form due to the popularity of the Internet.
  • Mp3
    mp3 book: Table of Contents (david.weekly.org, January 4 2002; http://david.weekly.org/mp3book/toc.php3)
    Chapter 0: Introduction
    ● What's In This Book
    ● Who This Book Is For
    ● How To Read This Book
    Chapter 1: The Hype
    ● What Is Internet Audio and Why Do People Use It?
    ● Some Thoughts on the New Economy
    ● A Brief History of Internet Audio
      ❍ Bell Labs, 1957 - Computer Music Is Born
      ❍ Compression in Movies & Radio - MP3 is Invented!
      ❍ The Net Circa 1996: RealAudio, MIDI, and .AU
    ● The MP3 Explosion
      ❍ 1996 - The Release
      ❍ 1997 - The Early Adopters
      ❍ 1998 - The Explosion
      ❍ sidebar - The MP3 Summit
      ❍ 1999 - Commercial Acceptance
    ● Why Did It Happen?
      ❍ Hardware
      ❍ Open Source -> Free, Convenient Software
      ❍ Standards
      ❍ Memes: Idea Viruses
    ● Conclusion
    Chapter 2: The Guts of Music Technology
    ● Digital Audio Basics
    ● Understanding Fourier
    ● The Biology of Hearing
    ● Psychoacoustic Masking
      ❍ Normal Masking
      ❍ Tone Masking
      ❍ Noise Masking
    ● Critical Bands and Prioritization
    ● Fixed-Point Quantization
    ● Conclusion
    Chapter 3: Modern Audio Codecs
    ● MPEG Evolves
      ❍ MP2
      ❍ MP3
      ❍ AAC / MPEG-4
    ● Other Internet Audio Codecs
      ❍ AC-3 / Dolbynet
      ❍ RealAudio G2
      ❍ VQF
      ❍ QDesign Music Codec 2
      ❍ EPAC
    ● Summary
    Chapter 4: The New Pipeline: The New Way To Produce, Distribute, and Listen to Music
    ● Digital
  • Modeling of an MPEG Layer-3 Encoder and Decoder in Ptolemy
    Modeling of an MPEG Layer-3 Encoder and Decoder in Ptolemy Literature Survey Patrick Brown EE382C – Embedded Software Systems Spring, 2000 Abstract MPEG Audio Layer-3, or “MP3,” has rapidly become the most popular format for the encoding and compression of digital audio and has found widespread application in both hardware and software. This literature survey provides a brief introduction to the MPEG (Moving Picture Experts Group) standard for audio encoding and explores the feasibility of formally modeling an encoder and a decoder. It also describes some of the available literature on the MPEG standard and associated research. The tremendous popularity of the standard has led to a large amount of literature, research, and source code related to the encoding and decoding of MPEG audio. The focus of the actual project will be a model of an MP3 encoder and decoder in Ptolemy [1]. This model will be used to generate C code that can be compared, in terms of speed and memory efficiency, to several of the widely used codecs currently available. Introduction MPEG Audio Layer-3, more commonly known as “MP3,” is part of the set of standards created by the Moving Picture Experts Group (MPEG) [2]. Currently, there are three complete standards that have been created by MPEG and adopted by the International Organization for Standardization (ISO). They are • MPEG-1, approved Nov. 1992 • MPEG-2, approved Nov. 1994 • MPEG-4, approved Oct. 1998 (version 1) and Dec. 1999 (version 2) Each of these standards is made up of several parts. The three main parts to all three are systems, video, and audio.
  • MP3 Complete by Guy Hart-Davis and Rhonda Holmes
    SYBEX® Sample Chapter. MP3 Complete by Guy Hart-Davis and Rhonda Holmes. ISBN #0-7821-2899-8. MP3 Basics. Copyright ©2000 SYBEX Inc., 1151 Marina Village Parkway, Alameda, CA 94501. World rights reserved. No part of this publication may be stored in a retrieval system, transmitted, or reproduced in any way, including but not limited to photocopy, photograph, magnetic or other record, without the prior agreement and written permission of the publisher.
    Part i: MP3 Basics. Learn to:
    • Understand why digital audio is important
    • Understand the advantages and disadvantages of MP3
    • Choose the right hardware for enjoying quality digital audio
    • Know what's legal with digital audio, and what's not
    Chapter 1: MP3 and Digital Audio. Featuring:
    • What is digital audio?
    • Why digital audio is an exciting technology
    • The short version of how MP3 works
    • Advantages and disadvantages of MP3
    • Competing digital-audio formats
    In this chapter, we'll discuss why digital audio is such a hot topic and how it affects you. We'll start by talking about what digital audio is, what you can do with it, and why you should be interested in it if you're not already. Then we'll switch the focus to MP3 for a bit. We'll give you the short version of how MP3 works, its advantages and disadvantages, and why it's currently dominating the digital-audio scene.
  • TIM Open Labs 1964 1971 CSELT, the R&D Center of Telecom Italia, the First Italian Electronic Switching Center Born in Torino Was Built
    TIM Group. TIM HiTech, Turin, July 2019. The TIM laboratories: TIM Open Labs.
    A little bit of history of Innovation in Telecom Italia Group:
    1964: CSELT, the R&D Center of Telecom Italia, born in Torino
    1971: The first Italian electronic switching center was built
    1976: World first communication on fiber optic
    1982: World first fiber optic cable for experimenting with the TV signal is layered in Turin
    1988: MPEG and MP3 standards for compressing the audio video signal
    1999: First UMTS phone call in the urban environment
    2001: CSELT becomes TILab
    2009: Turin starts the first experimentation in LTE technology
    2016: "Technology Innovation" Department is born; First European Open IoT Lab
    2017: Announcement: Turin chosen as 5G City
    2018: CTIO is created
    5G, a change of paradigm …? First the use cases, then the technology … 2011 2018 1. First Use cases, 1. Tech framework defined, 2. Then requirements, 2. Use cases developed 3. Then Technology …
    The characteristics of 5G in brief:
    Speed increases: 20 Gbit/second. 5G is 20 times faster than 4G/LTE.
    Latency decreases: 1 - 4 milliseconds in response time. 5G is 5 to 20 times more responsive than 4G/LTE.
    Connections multiply: 1 million connections per square km. 5G handles 10 times more connections than 4G/LTE.
    5G: a bit of vision…
    The laboratories in via Reiss-Romoli: fixed network, mobile, 5G, IoT, services; research, innovation, engineering, development,
  • Video Coding Standards: JPEG and MPEG
    Video Codec Design. Iain E. G. Richardson. Copyright © 2002 John Wiley & Sons, Ltd. ISBNs: 0-471-48553-5 (Hardback); 0-470-84783-2 (Electronic). Video Coding Standards: JPEG and MPEG. 4.1 INTRODUCTION. The majority of video CODECs in use today conform to one of the international standards for video coding. Two standards bodies, the International Standards Organisation (ISO) and the International Telecommunications Union (ITU), have developed a series of standards that have shaped the development of the visual communications industry. The ISO JPEG and MPEG-2 standards have perhaps had the biggest impact: JPEG has become one of the most widely used formats for still image storage and MPEG-2 forms the heart of digital television and DVD-video systems. The ITU's H.261 standard was originally developed for video conferencing over the ISDN, but H.261 and H.263 (its successor) are now widely used for real-time video communications over a range of networks including the Internet. This chapter begins by describing the process by which these standards are proposed, developed and published. We describe the popular ISO coding standards, JPEG and JPEG-2000 for still images, MPEG-1, MPEG-2 and MPEG-4 for moving video. In Chapter 5 we introduce the ITU-T H.261, H.263 and H.26L standards. 4.2 THE INTERNATIONAL STANDARDS BODIES. It was recognised in the 1980s that video coding and transmission could become a commercially important application area. The development of video coding technology since then has been bound up with a series of international standards for image and video coding.
  • O'reilly MP3 the Definitive Guide.Pdf
    MP3: The Definitive Guide. Scot Hacker. Publisher: O'Reilly. First Edition March 2000. ISBN: 1-56592-661-7, 400 pages. MP3: The Definitive Guide introduces the power-user to just about all aspects of MP3 technology. It delves into detail on obtaining, recording, and optimizing MP3 files using both commercial and Open Source methods. Coverage is complete for four platforms: Windows, Macintosh, Linux, and BeOS. In-depth chapters describe all aspects of the MP3 experience from distributing, streaming, broadcasting, converting, and playing to archiving your collection. Readers will learn how to test their equipment, optimize their encoding times, evaluate their playback options, control and organize a collection, and even burn their own CDs or distribute their own music to a massive worldwide audience over the Internet. In addition, the author fills readers in on the complex legal issues surrounding MP3 files. Everything you need to know to enjoy MP3 today and tomorrow is contained in this single volume. Copyright © 2000 O'Reilly & Associates, Inc. All rights reserved. Printed in the United States of America. Published by O'Reilly & Associates, Inc., 101 Morris Street, Sebastopol, CA 95472. The O'Reilly logo is a registered trademark of O'Reilly & Associates, Inc. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly & Associates, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps. The use of the hermit crab image in association with MP3 is a trademark of O'Reilly & Associates, Inc.
  • MPEG-4 Low Delay Audio Coding
    MPEG-4 Low Delay Audio Coding. Jürgen Herre, Fraunhofer Institute for Integrated Circuits (FhG-IIS), Erlangen, Germany. (Dr. Jürgen Herre, Fraunhofer Institut Integrierte Schaltungen, page 1)
    Overview: • Speech coding vs. perceptual audio coding • MPEG-4 Version 2 low delay coding • Delay sources in perceptual audio coding • Low delay AAC coder • Test results • Summary
    Speech Coding vs. Audio Coding
    MPEG-4 Version 2 Low Delay Audio Coding. Target: • High audio and speech quality and • Low bitrate and • Low algorithmic delay (20 ms). Solution: MPEG-4 Version 2 Low Delay Audio Coder: • Derived from MPEG-2/4 "Advanced Audio Coding" (AAC) • Specific modifications for low-delay operation
    Delay Sources in Perceptual Audio Coding: • Framing delay • Filterbank delay • Look-ahead delay for block switching • Use of bit reservoir ⇒ Overall delay: $t_{delay} = \frac{N_{framing} + N_{filterbank} + N_{look-ahead}}{F_s} + t_{bitres}$
    Example: Delay of AAC Codec (48 kHz / 64 kbps): • Framing delay: 1024 samples • Filterbank delay: 1024 samples • Look-ahead delay for block switching: 576 samples • Use of bit reservoir: 74.7 ms ⇒ Overall delay: $t_{delay} = \frac{1024 + 1024 + 576}{48000} + 74.7\,\mathrm{ms} = 129.4\,\mathrm{ms}$
    Low Delay AAC Codec (48 kHz, min. delay mode): • Reduced framing delay: 480 samples • Reduced filter bank delay: 480 samples • No block switching ⇒ no look-ahead delay: 0 samples • Minimal bit reservoir: 0...32 bits ⇒ Overall delay: $t_{delay} = \frac{480 + 480 + 0}{48000} + 0\,\mathrm{ms} = 20\,\mathrm{ms}$
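The delay figures quoted in the snippet above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the overall delay is the sum of the framing, filterbank and look-ahead delays (in samples) divided by the sampling rate, plus the bit reservoir delay; the function name `overall_delay_ms` and its parameter names are hypothetical and chosen only for illustration, they do not come from the slides.

```python
def overall_delay_ms(framing, filterbank, look_ahead, sample_rate_hz, bitres_ms):
    """Overall codec delay in milliseconds (illustrative helper, not from the slides).

    framing, filterbank, look_ahead: delay contributions in samples
    sample_rate_hz: sampling rate F_s in Hz
    bitres_ms: delay caused by the bit reservoir, in milliseconds
    """
    block_samples = framing + filterbank + look_ahead
    # Convert the sample-based delay to milliseconds, then add the reservoir delay.
    return 1000.0 * block_samples / sample_rate_hz + bitres_ms


# Values quoted in the snippet for AAC at 48 kHz / 64 kbps.
aac_delay = overall_delay_ms(1024, 1024, 576, 48000, 74.7)

# Values quoted for low delay AAC at 48 kHz in minimum delay mode.
ld_delay = overall_delay_ms(480, 480, 0, 48000, 0.0)

print(f"AAC:    {aac_delay:.1f} ms")
print(f"AAC-LD: {ld_delay:.1f} ms")
```

Under these assumptions the sample-based part dominates the low delay figure, while for standard AAC the bit reservoir contributes more than half of the total.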
  • March 10 - 12, 2008 San Diego, CA
    March 10 - 12, 2008, San Diego, CA. San Diego Marriott Hotel & Marina. www.VoiceSearchConference.com
    Primary Sponsors. Supporting Sponsors. Caller Experience Analytics: Another Innovative Solution From BBN Technologies.
    Welcome to the inaugural Voice Search Conference. Thank you for helping to move this fundamental development forward. Voice Search leverages advances in speech technology to make applications and services easier to use. The basic Voice Search principle is to reduce complexity and the number of steps that a user must take to achieve an objective. Our guiding principle in producing this conference is to reduce the number of steps you need to take to make the most of this development in advancing your business. Our objective has been to draw together experts and practical examples of what is being done and what can be done in applying Voice Search. We believe that Voice Search can simplify user interfaces and accelerate the adoption of many creative products and services. Our many sponsors agree and have made it easier for us to deliver a top-notch user experience for attendees. We hope you agree. Enjoy the conference!
    The program committee:
    Bill Meisel, President, TMA Associates
    K.W. "Bill" Scholz, President, AVIOS, and President, NewSpeech LLC
    Tom Schalk, Vice President, Voice Technology, ATX Group
    AVIOS Board of Directors:
    Sara Basson, Program Director - Human Ability, IBM - T.J. Watson Research Center
    Neal Bernstein, Senior Director, Local & Mobile Search, Microsoft
    Michael Cohen, Manager, Speech Technology Group, Google
    Susan Hura, Principal, SpeechUsability
    Alan Knipe, Founder, StarNet Systems
    James Larson, Columnist, Information Today, Inc.
    Alexander Rudnicky, Principal Systems Scientist, Carnegie Mellon University
    Thomas Schalk, Vice President, Voice Technology, ATX Group, Inc.
    K.W. "Bill" Scholz, President, AVIOS, and President, NewSpeech LLC
    Marketta Silvera, Chief Executive Officer, Apptera
    Kim Silverman, Manager, Spoken Language Technologies, Apple Computer