Enabling Non-Speech Experts to Develop Usable Speech-User Interfaces
Anuj Kumar
CMU-HCII-14-105
August 2014

Human-Computer Interaction Institute
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213 USA

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Copyright © Anuj Kumar 2014. All rights reserved.

Committee:
Florian Metze (Co-Chair), Carnegie Mellon University
Matthew Kam (Co-Chair), Carnegie Mellon University & American Institutes for Research
Dan Siewiorek, Carnegie Mellon University
Tim Paek, Microsoft Research

The research described in this dissertation was supported by the National Science Foundation under grants IIS-1247368 and CNS-1205589, a Siebel Scholar Fellowship (2014), Carnegie Mellon's Human-Computer Interaction Institute, and Nokia. Any findings, conclusions, or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the above organizations or corporations.

KEYWORDS: Human-Computer Interaction, Machine Learning, Non-Experts, Rapid Prototyping, Speech-User Interfaces, Speech Recognition, Toolkit Development

Dedicated to Mom and Dad for their eternal support, motivation, and love.

ABSTRACT

Speech user interfaces (SUIs) such as Apple's Siri, Microsoft's Cortana, and Google Now are becoming increasingly popular. However, despite years of research, such interfaces work well only for specific user populations, such as adult native speakers of English, when in fact many other users, such as non-native speakers or children, stand to benefit at least as much, if not more. The principal obstacle to developing SUIs for these users, or for other acoustic or language conditions, is the expertise, time, and cost required to build an initial system that works reasonably well and that can be deployed to collect more data and to establish a base of loyal users.