Paper, We Describe Three User Studies Undertaken to Investigate Users ’Typing Abilities
Total Page:16
File Type:pdf, Size:1020Kb
Int. J. Human-Computer Studies 106 (2017) 44–62 Contents lists available at ScienceDirect International Journal of Human-Computer Studies journal homepage: www.elsevier.com/locate/ijhcs Is it too small?: Investigating the performances and preferences of users when typing on tiny QWERTY keyboards Xin Yi a,b,c, Chun Yu a,b,c,∗, Weinan Shi a,b,c, Yuanchun Shi a,b,c a Department of Computer Science and Technology, Tsinghua University, China b Tsinghua National Laboratory for Information Science and Technology, China c Beijing Key Lab of Networked Multimedia, China a r t i c l e i n f o a b s t r a c t Keywords: Typing on tiny QWERTY keyboards on smartwatches is considered challenging or even impractical due to the Text entry limited screen space. In this paper, we describe three user studies undertaken to investigate users ’typing abilities Tiny keyboard and preferences on tiny QWERTY keyboards. The first two studies, using a smartphone as a substitute for a Smartwatch smartwatch, tested five different keyboard sizes (2, 2.5, 3, 3.5 and 4 cm). Study 1 collected typing data from QWERTY participants using keyboards and given asterisk feedback. We analyzed both the distribution of touch points Typing pattern (e.g., the systematic offset and shape of the distribution) and the effect of keyboard size. Study 2 adopted a Bayesian algorithm based on a touch model derived from Study 1 and a unigram word language model to perform input prediction. We found that on the smart keyboard, participants could type between 26.8 and 33.6 words per minute (WPM) across the five keyboard sizes with an uncorrected character error rate ranging from 0.4% to 1.9%. Participants ’ subjective feedback indicated that they felt most comfortable with keyboards larger than 2.5 cm. Study 3 replicated the 3.0 and 3.5 cm keyboard tests on a real smartwatch and verified that in terms of text entry speed, error rate and user preference, there was no significant difference between the results measured on a smartphone and that on a smartwatch with same sized keys. This study result indicated that the results of Study 1 and 2 are applicable to smartwatch devices. Finally, we conducted a simulation to investigate the performance of different touch/language models based on our collected data. The results showed that using either a bigram language model or a detailed touch model can effectively correct imprecision in users ’input. Our results suggest that achieving satisfactory levels of text input on tiny QWERTY keyboards is possible. © 2017 Elsevier Ltd. All rights reserved. 1. Introduction keyboards such as those on smartphones. Using word-level techniques, users do not necessarily have to select each individual key accurately; In recent years, smartwatches have become increasingly popular and rather, the algorithm can intelligently predict the user ’s intended word have garnered attention from both the public and researchers. However, based on prior knowledge of the language (e.g., word frequency) and text entry on smartwatches remains challenging due to the extremely the user ’s touch characteristics (e.g., the distribution of touch points). limited available screen space. Many consumer smartwatches lack sup- To this end, many studies have been conducted to quantify users ’touch port for text entry altogether (e.g., Apple Watch and MOTO 360), which input characteristics, also known as a touch model . The results, which significantly limits the usability of these smartwatches. Currently, only are almost always collected on smartphone keyboards, are valuable for a few commercial smartwatches feature a built-in keyboard (e.g., Mi- designing high-performance prediction algorithms for text entry. How- crosoft Band 1 and Samsung Gear S2 2 ). ever, knowledge of whether these results also apply to tiny QWERTY A number of techniques have been proposed by researchers to enable keyboards is still lacking. To our knowledge, no previous study has pro- text entry on QWERTY keyboards on smartwatches; most are character- vided empirical results on touch point distribution in a smartwatch sce- level input techniques that facilitate the selection of individual keys nario. (e.g., Zoomboard, Oney et al., 2013 , and ZShift, Leiva et al., 2015 ). How- In this work, we examined user performances and preferences when ever, word-level input is also an important feature of modern software typing on tiny QWERTY keyboards with the ultimate goal of exploring ∗ Corresponding author at: Department of Computer Science and Technology, Tsinghua University, China. E-mail addresses: [email protected] , [email protected] (C. Yu). 1 https://www.microsoft.com/microsoft-band . 2 http://www.samsung.com/global/galaxy/gear-s2/ . http://dx.doi.org/10.1016/j.ijhcs.2017.05.001 Received 1 March 2016; Received in revised form 28 April 2017; Accepted 1 May 2017 Available online 25 May 2017 1071-5819/© 2017 Elsevier Ltd. All rights reserved. X. Yi et al. Int. J. Human-Computer Studies 106 (2017) 44–62 the feasibility of easy-to-implement, high-performance text entry algo- subjectively perceived the keyboard as being more usable as the key- rithms on smartwatches. We iteratively conducted three studies on tiny board size increased. However, the input speed was still very slow ( < QWERTY keyboards with sizes ranging from 2.0 to 4.0 cm. In the first 10 WPM). study, we collected typing data from twenty participants. An asterisk Character-level techniques can be effective when entering Out-Of- would appear and a “hit ”sound were played when a touch event was Vocabulary (OOV) words (e.g., names and passwords). However, when detected ( Azenkot and Zhai, 2012; Findlater et al., 2011 ). Despite a sys- entering common words or sentences, the performance of these tech- tematic offset in touch points, the participants could achieve an input niques becomes awkward because careful aiming or multiple operations rate of between 21.4 and 24.8 words per minute (WPM) with fairly high (taps/swipes) are required to enter each individual character. Conse- precision. quently, these designs limit the text entry speed in such scenarios. In the second study, we evaluated the performance of a naive Bayesian algorithm derived from the data in Study 1. By leveraging a 2.2. Word-level techniques touch model and a unigram word language model, participants were able to achieve an input rate of 32.6 WPM with a character error rate Word-level techniques take advantage of language redundancy to of 0.6% on a 3.0 cm QWERTY keyboard. Interestingly, larger keyboards resolve ambiguities or noise in the user ’s input. Thus, they facilitate the (i.e., 3.5 and 4.0 cm) did not significantly increase the performance in task of entering individual words. Users tap only once for each character terms of input speed, accuracy or user preference. or draw a gesture to enter the word as opposed to the multiple operations In the third study, we replicated the tests of the 3.0 and 3.5 cm key- required for entry in the methods described above. Recently, Gordon boards on a real smartwatch. This study served to verify the assumption et al. (2016) demonstrated that with the help of word-level techniques, that testing keyboards with the same size on a smartphone instead of both touch typing and gesture typing could achieve over 22 WPM on a smartwatch would not lead to significant differences in the measured tiny QWERTY keyboards. However, no detailed results were provided speed, accuracy or subjective ratings and indicated that the results of on the pattern of touch point distribution during typing. Because our Studies 1 and 2 could be applied to real smartwatches. goal is to focus on touch typing, we review word-level techniques on To explore the effect of different touch and language models, we con- touch keyboards. Currently, there are two major types of word-level ducted a simulation comparing the error correction ability of different techniques: ambiguous keyboards and the Bayesian approach. touch/language models using the data collected in Study 1. We found that using either a bigram language model or a detailed touch model 2.2.1. Ambiguous keyboards could achieve a top-1 error rate below 8%. This suggested that both in- An ambiguous keyboard groups multiple characters into single keys corporating a powerful language model and improving the modeling of and uses a predefined dictionary to resolve input ambiguity. For ex- users ’touch behavior could benefit users ’text entry performances on ample, the T9 keyboard ( Grover et al., 1998 ) may be the most famous tiny QWERTY keyboards. ambiguous keyboard on smartphones; it requires only 8 numeric keys (0 and 1 are not included) rather than the 26 keys of a standard QWERTY 2. Related work layout. On smartwatches, where the screen is very small, this design significantly decreases the number of displayed keys, thus allowing an Various techniques have been used to facilitate text entry on smart- increase in the displayed key size. On the other hand, T9 ’s shortcom- watches. Some employed auxiliary sensors or hardware to extend the ing is that its ambiguous keyboards are vulnerable to input imprecision. input expressivity of smartwatches, for example, using a magnetometer Users must accurately target each key, which can be difficult on smart- to enable in-air gestures ( Harrison and Hudson, 2009 ), tilting the de- watches. vice to select characters ( Partridge et al., 2002; Sazawal et al., 2002 ), In 2004, Dunlop (2004) first studied the feasibility of entering text and combining physical panning, twisting, tilting, and clicking as input on a smartwatch using four alphabetic buttons and a central space key. methods ( Xiao et al., 2014 ). However, touch is one of the most common Unfortunately, the author only reported the average time and number modalities in daily life (e.g., on smartphones and tablets); consequently, of errors when entering each sentence; therefore, the results could not this review focuses on text entry techniques based on tiny keyboards on be extrapolated to WPM or error rates directly.