Download Doctoral Thesis
Total Page:16
File Type:pdf, Size:1020Kb
Czech Technical University in Prague Faculty of Electrical Engineering Department of Computer Graphics and Interaction Designing Text Entry Methods for Non-Verbal Vocal Input by Ing. OndˇrejPol´aˇcek A thesis submitted to the Faculty of Electrical Engineering, Czech Technical University in Prague, in partial fulfilment of the requirements for the degree of Doctor. PhD programme: Electrical Engineering and Information Technology Branch of study: Information Science and Computer Engineering June 2014 Thesis Supervisor: prof. Ing. Pavel Slav´ık,CSc. Department of Computer Graphics and Interaction Faculty of Electrical Engineering Czech Technical University in Prague Karlovo n´am.13 121 35 Praha 2 Czech Republic Thesis Co-Supervisor: Ing. ZdenˇekM´ıkovec, Ph.D. Department of Computer Graphics and Interaction Faculty of Electrical Engineering Czech Technical University in Prague Karlovo n´am.13 121 35 Prague 2 Czech Republic Copyright c 2014 by Ing. OndˇrejPol´aˇcek ii Abstract and contributions In recent years, a lot of research effort can be observed within the field of computer ac- cessibility focusing on design of text entry methods for people with various impairments. However, only little work have been devoted to non-verbal vocal input (NVVI), which is an interaction modality suitable for motor-impaired people. The main goal of this thesis is to study how non-verbal vocal input (NVVI) can be ap- plied to novel text entry methods and techniques in order to improve quality of life of motor-impaired people. This goal involves several aspects that needs to be explored in order to build a full picture of the problem. Solving each aspect brings us towards bet- ter understanding of the technology and the target users. Namely, this thesis focuses on applicability, acceptability, and accuracy of NVVI, combination of text input and NVVI, and text input optimization. Main contributions of the thesis are the following: 1. A novel predictive text entry method based on n-grams and its evaluation with the target group. 2. Evaluation of an ambiguous text entry method with the target group. 3. Two novel text entry methods based on scanning. 4. A novel method for real-time segmentation of speech and non-verbal vocal input. 5. Subjective comparison of non-verbal vocal commands. The thesis identifies the threshold of applicability of NVVI, defines its optimal target group, and shows how people with disabilities can be included by proper interaction design. The thesis also proposes guidelines, which offer NVVI designers a set of recommendations on how non-verbal vocal commands can be used efficiently. A novel method for speech and NVVI segmentation has been developed in order to improve accuracy of the input and filter spontaneous utterances. Several novel text entry methods have been designed and studied in the thesis. They improve the text input in specific contexts and surpass their predecessors in terms of entry rate or subjective perception. Models of the designed text entry methods have been proposed in order to optimize some of parameters that influence their performance. The work described in the thesis improves understanding of the areas of non-verbal vocal input and text entry methods. Partial results of the thesis were published in peer-reviewed journals and at international conferences. iii Acknowledgements First of all, I would like to express my gratitude to my thesis supervisor, prof. Pavel Slav´ık for his mentoring throughout my doctoral studies, valuable advices, and for taking care of my financial support. I would also like to thank Dr. Adam J. Sporka and Dr. Zdenˇek M´ıkovec for consulting the research directions and helping me with design of experiments described in the thesis. My thanks also go to Dr. Thomas Grill for his support during my sabbatical leave at the University of Salzburg. I would also like to thank all co-authors for their support and constructive criticism. Furthermore, I would like to express my thanks to all people who participated in studies and experiments. Finally, my greatest thanks to my family and friends whose support was of great importance during my doctoral studies. My work has been partially supported by MSMT under the research program MSM 6840770014, EC funded projects VitalMind (ICT-215387) and Veritas (IST-247765), project TextAble (LH12070; MSMT Czech Republic), SGS grant Automatically Gener- ated UIs in Nomadic Applications (SGS10/290/OHK3/3T/13; FIS 10-802900), and grants FR-TI2/128 and TE01020415. iv Contents 1 Introduction 1 1.1 Motivation . .2 1.2 Challenges . .4 1.3 Contributions of the Thesis . .5 1.4 Dissertation Organization . .5 I State of the Art 7 2 Non-Verbal Vocal Input 8 2.1 Classification of Non-Verbal Vocal Input . .9 2.1.1 Sound Signal Features . .9 2.1.2 Input Channel Types . .9 2.2 Pitch-Based NVVI . 10 2.2.1 Pitch-Based Vocal Gestures . 11 2.3 Overview of NVVI Applications . 12 2.3.1 Keyboard Emulation . 12 2.3.2 Pitch-Based Mouse Emulation . 13 2.3.3 Timbre-Based Mouse Emulation . 15 2.3.4 Gaming . 15 2.3.5 Mobile applications . 16 2.3.6 Other NVVI Applications . 16 2.4 Summary . 17 3 Text Input for Motor-Impaired People 18 3.1 Introduction . 18 3.1.1 Related Overviews . 19 3.2 Selection Techniques . 20 3.2.1 Direct Selection . 20 3.2.2 Scanning . 21 3.2.3 Pointing and Gestures . 26 v 3.3 Character layouts . 27 3.3.1 Static distribution . 28 3.3.2 Dynamic distributions . 28 3.4 Use of Language Models . 29 3.4.1 Statistical Language Models . 30 3.4.2 Prediction . 31 3.4.3 Ambiguous Keyboards . 31 3.5 Interaction Modalities . 33 3.5.1 Gaze interaction . 34 3.5.2 Acoustic modality . 34 3.5.3 Bio signals . 35 3.6 Experimental Setup of Text Entry Methods Evaluations . 35 3.6.1 Basic Experimental Setups . 36 3.6.2 Measures . 37 3.7 Overview and Evaluation of Text Entry Methods . 38 3.7.1 Overview of Text Entry Methods . 39 3.7.2 Evaluation of Text Entry Methods . 41 3.8 Summary . 43 II Contribution 45 4 Overview of contributions 46 4.1 Limitations of the State of the Art . 46 4.2 How this Thesis Extends the State of the Art . 46 4.2.1 Technology-Oriented Contributions . 47 4.2.2 Human-Oriented Contributions . 47 4.3 Summary . 48 5 Segmentation of Speech and Humming 49 5.1 Motivation . 49 5.2 Related work . 50 5.2.1 IAC method . 51 vi 5.3 Segmentation method using MFCC and RMS . 52 5.4 Evaluation . 54 5.4.1 Experiment data . 54 5.4.2 Results and Discussion . 57 5.5 Summary . 58 6 Comparative Study of Pitch-Based Gestures 59 6.1 Motivation . 59 6.2 Related work . 60 6.3 Experiment . 60 6.3.1 Organization . 61 6.3.2 NVVI test application . 61 6.3.3 Selected Gestures . 62 6.3.4 Quantitative questionnaire. 64 6.3.5 Participants . 64 6.4 Results . 65 6.4.1 Quantitative results . 65 6.4.2 Qualitative results . 67 6.5 Discussion . 69 6.5.1 Guidelines for the Design of Pitch-Based Gestures . 69 6.6 Summary . 70 7 Scanning Keyboard Design 71 7.1 Motivation . 71 7.2 Related work . 72 7.3 Scanning Techniques . 73 7.3.1 Row-Column Scanning . 73 7.3.2 N-ary Search Scanning . 74 7.3.3 Row-Column Scanning on an Array . 77 7.4 Evaluation . 78 7.4.1 Scanning keyboard designs . 78 7.4.2 Measuring performance . 79 vii 7.4.3 Experiment . 81 7.5 Results . 82 7.5.1 Discussion . 86 7.6 Summary . 88 8 Ambiguous Keyboard Design 89 8.1 Motivation . 89 8.2 Related work . 90 8.3 Text Entry Method . 90 8.3.1 QANTI, the predecessor of CHANTI . 90 8.3.2 Design of CHANTI . 92 8.4 Evaluation . 94 8.4.1 The Study Organization . 94 8.4.2 NVVI Training . ..