Octopus: Evaluating Touchscreen Keyboard Correction and Recognition Algorithms Via “Remulation”

Session: Keyboards and Hotkeys
CHI 2013: Changing Perspectives, Paris, France

Xiaojun Bi1, Shiri Azenkot2, Kurt Partridge1, Shumin Zhai1
1Google Inc., Mountain View, CA, USA, {bxj, kep, zhai}@google.com
2University of Washington, Seattle, WA, USA, [email protected]

ABSTRACT
The time and labor demanded by a typical laboratory-based keyboard evaluation are limiting resources for algorithmic adjustment and optimization. We propose Remulation, a complementary method for evaluating touchscreen keyboard correction and recognition algorithms. It replicates prior user study data through real-time, on-device simulation. To demonstrate remulation, we have developed Octopus, an evaluation tool that enables keyboard developers to efficiently measure and inspect the impact of algorithmic changes without conducting resource-intensive user studies. It can also be used to evaluate third-party keyboards in a “black box” fashion, without access to their algorithms or source code. Octopus can evaluate both touch keyboards and word-gesture keyboards. Two empirical examples show that Remulation can efficiently and effectively measure many aspects of touch screen keyboards at both macro and micro levels. Additionally, we contribute two new metrics to measure keyboard accuracy at the word level: the Ratio of Error Reduction (RER) and the Word Score.

Author Keywords
Simulation; Text Entry; Touch Screen Interaction

ACM Classification
H5.2 [Information interfaces and presentation]: User Interfaces. - Graphical user interfaces.

General terms
Design; Human Factors

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CHI 2013, April 27–May 2, 2013, Paris, France.
Copyright © 2013 ACM 978-1-4503-1899-0/13/04...$15.00.

INTRODUCTION
Today, touchscreen keyboards are used by hundreds of millions of people around the world as their default text entry method. To reduce typing errors, most of these “Smart Touch Keyboards” (STKs) correct errors automatically as users type. However, occasionally the error correction system itself makes a mistake, with undesirable and sometimes humorous consequences [6,20].

An increasingly popular alternative to the STK is the smart gesture keyboard (SGK). An SGK recognizes words based on the user’s finger gestures (Figure 1, right). SGK’s are also known as Shape Writing keyboards [21, 28], gesture keyboards [5] or word-gesture keyboards [29] in the literature. Different forms of SGKs have been commercially distributed in many products such as ShapeWriter, SlideIT, Swype, Flex T9, TouchPal, and the Android 4.2 stock keyboard (Gesture Typing) [2]. SGKs face the same correction challenges as STKs because they must map ambiguous finger gestures to words.

Figure 1. Illustrations of Smart Touch Keyboard (Left) and Smart Gesture Keyboard (Right)

HCI research has explored various techniques for error prevention, including adapting decoding algorithms to hand posture [4], and personalizing ten-finger typing for large touchscreens [7].

As with other advanced UI technologies such as speech recognition, effective and efficient evaluation is critical to the improvement of smart keyboards. An evaluation generally consists of (1) data collection and (2) data analysis. Our goal is to facilitate both stages of the process.

Collecting keyboard output data typically involves a laboratory experiment with a dozen or more participants. The time and labor required by these experiments make frequent evaluation of small algorithmic changes infeasible. Moreover, current data analysis techniques do not provide important information about keyboard algorithms. For example, they do not explicitly and quantitatively measure an STK’s ability to correct user errors, and the typical accuracy metrics (e.g., the MSD error rate [16]) examine output at the character-level, and do not reflect today’s keyboards’ word-level behaviors.

To expand the repertoire of tools and methods for evaluating STKs and SGKs, we propose Remulation, a novel keyboard evaluation approach that measures correction and recognition algorithms of keyboards by replaying previously collected user data through real-time on-device simulation. It takes touch screen events recorded during prior user studies as input, and injects them into a live mobile keyboard, at the same rate as they were collected. The original experiment is thereby replicated on a working keyboard. Remulation can efficiently evaluate many (not all) aspects of both STKs and SGKs without conducting laboratory experiments, and can be done repeatedly as if the same group of participants were employed to type the same input tirelessly over and over on different keyboards.

Typical use cases for Remulation include: (1) a developer seeks to evaluate the impact of an algorithmic change to her keyboard, (2) a researcher seeks to evaluate different versions of the same third-party keyboard, and (3) a device manufacturer wants to select a third-party keyboard to embed in its devices without inspecting the keyboards’ proprietary source code or algorithms.

To effectively measure the accuracy of a keyboard algorithm, we introduce the metrics of Word Score, which reflects the number of correct words out of every 100 input words from a given test dataset, and Ratio of Error Reduction (RER), which quantifies an STK’s error prevention and correction ability. Both metrics measure keyboard accuracy at the word level.

In the rest of this paper we demonstrate the Remulation approach and its corresponding data analysis methods by designing and implementing Octopus, a Remulation-based keyboard evaluation tool. We put Remulation and Octopus into practice by applying them to two STKs and two SGKs, revealing various insights at both the macro and micro levels.

In summary, this work contributes a novel approach to evaluating touchscreen keyboards, which includes:

• Remulation, a new approach for evaluating keyboard correction and recognition algorithms by replicating prior user study data with real-time simulation
• The design and implementation of Octopus, a Remulation-based keyboard evaluation tool and system
• A demonstration of Remulation and Octopus in real use, with evaluations of two STKs and two SGKs
• RER and Word Score, new metrics for evaluating keyboard accuracy.

RELATED WORK
Collecting natural use data and applying them to train recognition algorithms has been widely adopted as a research methodology in AI (e.g., speech and handwriting recognition).

[...] (Android in particular but other OSs in principle) and evaluate keyboards in a “black box” fashion.

In what follows we discuss relevant prior work on (1) simulating human text entry for evaluating keyboard performance without traditional laboratory studies, and (2) current analysis techniques used for assessing a text entry method’s accuracy.

Simulating Text Entry
In 1982, Rumelhart and Norman [23] described a model for simulating skilled typists on physical typewriters, with the goal of understanding human typing behavior. Their model focused on predicting keystroke timing and simulated key transposition and doubling errors. In our work, we rely on direct data replication rather than complex modeling of human performance, and seek to evaluate keyboard algorithms rather than theorize user behavior.

Since the early 1990’s, research interest has shifted from studying physical typewriters to soft keyboards. The Fitts-digraph model, first proposed by Lewis [12], was used to estimate average text entry speeds based on movement time between pairs of keys and digraph frequencies. This model has been a popular performance prediction tool for keyboard evaluation with different layouts using a single finger or a stylus [12,13,14,19], and has been used as an objective function for keyboard optimization [26]. A two-thumb physical keyboard predictive model has also been proposed [15]. Unlike these models, which predict an upper bound for average text entry speed assuming a certain error rate implied by Fitts’ law (4% per target), our Remulation and analysis method assesses keyboard error rates and accuracies using the same speed that had naturally occurred in the data collection experiment. Also, we focus on evaluating keyboards with similar appearances but different algorithms, instead of different layouts.

Measuring Accuracy in Text Entry
Standard text entry accuracy metrics compare the transcribed and presented strings. The Minimum String Distance (MSD) [16] is often used to measure the “distance” between the two strings, based on the number of character insertions, deletions, and replacements needed to turn one string into another. While this has been effective in measuring traditional physical keyboards that literally output every letter typed, it is insufficient for measuring STKs. STKs embed a dictionary or language model and do not necessarily map touch points to text on an individual letter basis. Similarly, SGKs usually work at the word level: they recognize a continuous finger gesture to output a single word [21, 28, 29]. Since both SGKs and STKs operate at word-
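
The contrast between character-level and word-level accuracy can be illustrated with a minimal Python sketch. The function names `msd` and `word_score` are hypothetical, chosen here for illustration. `msd` is the standard Levenshtein edit distance. `word_score` assumes that intended and output words are already aligned one to one, which is a simplification; the excerpt only states that Word Score reflects the number of correct words out of every 100 input words, so the exact computation used by the authors may differ.

```python
from typing import Sequence


def msd(presented: str, transcribed: str) -> int:
    """Minimum string distance: the fewest character insertions,
    deletions, and replacements that turn one string into the other."""
    # prev[j] holds the distance between the first i-1 characters of
    # `presented` and the first j characters of `transcribed`.
    prev = list(range(len(transcribed) + 1))
    for i, p in enumerate(presented, start=1):
        curr = [i]
        for j, t in enumerate(transcribed, start=1):
            cost = 0 if p == t else 1
            curr.append(min(prev[j] + 1,          # delete p
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # replace or match
        prev = curr
    return prev[-1]


def word_score(intended: Sequence[str], output: Sequence[str]) -> float:
    """Correct words per 100 input words.

    Assumption: the two sequences are aligned position by position.
    """
    if not intended or len(intended) != len(output):
        raise ValueError("expected two non-empty, equally long word lists")
    correct = sum(a == b for a, b in zip(intended, output))
    return 100.0 * correct / len(intended)
```

For the intended words `["the", "quick", "brown", "fox"]` and the output `["the", "quick", "brown", "box"]`, `msd("fox", "box")` is only 1, yet the whole word is wrong from the user's point of view, and `word_score` drops to 75.0.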

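
The replay idea behind Remulation, injecting recorded touch events at the same rate as they were collected, can be sketched as follows. This is a minimal Python illustration under stated assumptions and not the Octopus implementation: `TouchEvent`, its fields, and the `inject` callback are hypothetical names, and no Android API is used. On a real device, `inject` would stand in for whatever mechanism delivers events to the live keyboard.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class TouchEvent:
    """Hypothetical record of one logged touch event."""
    t_ms: int     # timestamp from the original study, in milliseconds
    x: float
    y: float
    action: str   # e.g. "down", "move", "up"


def replay(events: Iterable[TouchEvent],
           inject: Callable[[TouchEvent], None],
           sleep: Callable[[float], None] = time.sleep,
           clock: Callable[[], float] = time.monotonic) -> None:
    """Deliver events to `inject`, preserving the recorded spacing."""
    ordered = sorted(events, key=lambda e: e.t_ms)
    if not ordered:
        return
    start = clock()
    t0 = ordered[0].t_ms
    for ev in ordered:
        # Schedule against the start time rather than the previous event,
        # so that per-event delays do not accumulate into drift.
        target = start + (ev.t_ms - t0) / 1000.0
        delay = target - clock()
        if delay > 0:
            sleep(delay)
        inject(ev)
```

Passing `sleep` and `clock` as parameters is a design choice made for this sketch: it lets the timing logic be exercised with a fake clock, without waiting in real time.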