Towards an Adaptive Communication Aid with Text Input from Ambiguous Keyboards

Karin Harbusch Michael Kiihn University Koblenz–Landau, Computer Science Department Universitatsstr. 1, D-56070 Koblenz, GERMANY hrbh,hn}n–blnz.d

Abstract computer–assisted text input. Often, speech im- pairments coincide with severe motor impair- Ambiguous keyboards provide efficient ments. Standard keyboards or graphical input typing with low motor demands. In devices are often unsuitable for motor impaired our project l concerning the develop- users. Sometimes, only the operation of one or a ment of a communication aid, we em- very small number of physical switches is possible phasize adaptation with respect to the via buttons, joystick, eye–tracking or otherwise. sensory input. At the same time, we These two contexts of use are considerably dif- wish to impose individualized language ferent: Mobile communication typically happens models on the text determination pro- in the context of asynchronous telecommunication cess. UKO–II is an open architecture (although fast exchange via SMS or e–mail some- based on the Emacs text editor with a times develops into a synchronous communication server/client interface for adaptive lan- situation). Alternative and augmentative (AAC) guage models. Not only the group of methods typically deal with communication strate- motor impaired people but also users of gies in synchronous, face–to–face contexts where, watch–sized devices can profit from this e.g., an electronic communication aid is used to ambiguous typing. produce a text that is synthesized by a text–to- 1 Introduction speech system. (Of course, the produced text can also be utilized in asynchronous telecommunica- Written text for communication is of growing im- tion.) portance in e–mails, SMS, newsgroups, web pages However, in both contexts the challenging goal — even in synchronous communication situations is to efficiently produce short pieces of — usually like chatting, transmitted by electronic devices highly variable — natural language text under dif- (computers, cellular phones, handhelds). Com- ficult circumstances. The small size of the device puter assisted text entry methods such as ambigu- is one factor prohibiting the use of a full keyboard, ous keyboards are feasible for synchronous and the other factor is the user's restricted motor func- even for asynchronous communication scenarios tion. Both application areas share the aim of a per- as they allow complex communication on small sonalized language model to be most effective for electronic devices. Various systems on the mobile the user. phone and handheld market promise a solution to easier and faster text entry. 2 Efficient text input methods People with communication disorders are a Two main classes of efficient text input methods second group of users who can benefit from can be identified. First, on a standard QWERTY

1 The project is partially funded by the DFG — German 2As we concentrate on free text entering devices, we ig- Research Foundation — under grant HA 2716/2-1. nore icon–based systems (cf. Lonke et al. (1999)).

207 keyboard, input can be accelerated by predicting editor, which already includes many text entry and completion of commands and other strings manipulation functions useful in our context. Fur- (Darragh and Witten, 1992), which reduces the thermore, operating system support (e.g. sockets), number of keystrokes necessary to enter a word. basic applications like mail, and a development Motion impaired users who cannot access a full environment with extensive documentation are at keyboard are slowed down because they have to the programmer's fingertips. All components in select each individual key in multiple steps (scan- the communication aid dealing with input/output ning). have been implemented as Emacs Lisp modules. Second, ambiguous keyboards give rise to com- Our communication aid called UKO—II (Fig. l ) munication based on a reduced number of keys is adaptable in two ways: First, the system is cus- (down to 4, cf. Fig. 1). Typing on these devices, tomizable to differing keyboard layouts and to the the user presses the key corresponding to the let- selection of word suggestions or additional edit- ter only once. When the key corresponding to ing commands Second, a layered structure of lan- the space bar is pressed, a is consulted guage models controls the disambiguation process to find all corresponding to the ambiguous and adapts to the user's text input. We discuss both code. modes in turn.

The advantages of an ambiguous keyboard with _QI word disambiguation for users of AAC devices are W prnt ntn 22 pr i outlined by Kushler (1998). The efficiency of an ai t re ambiguous keyboard approximates one keystroke ea i per letter. Apart from literacy, no memorization ai t i of special encodings is required. Attention to the e e display is required only after the word has been i typed. A keyboard with fewer and larger keys may Raw--* , - xEmacs: UKO Text - "° 1 UKO matches ko aq cegi allow easier direct selection for users who other- Comma wise may depend on scanning. S L1VWX r t z ttn I ttn 2 I ttn I ttn 0 An obstacle to both strategies, prediction and disambiguation, may arise from gaps in the elec- tronic lexicon. If a word is not known to the sys- Figure 1: UKO-II Emacs text editing interface tem, the user of an ambiguous keyboard has to with the ambiguous keyboard for English. leave the typing mode in order to enter the word by other means. Another drawback of ambiguous Our text entry interface presumes (n > 1) text entry is the increased cognitive load imposed physical buttons. This parameter is determined ei- on users while typing the word: They may be un- ther by the user's motor functions or the device's able to see the letters of the word already typed available buttons. For n > 4, a genetic algo- and therefore have to memorize the input position. rithm computes a distribution of letters that opti- 3 The adaptive UKO-II system mizes the length of suggested word lists with re- spect to the fixed word frequency information pro- Assistive devices have to respond to dramatically vided by the lexicon. We utilize the frequencies varying needs (Edwards, 1995). Therefore, in or- of the CELEX database (Baayen et al., 1995) ei- der to be useful, they should allow adaptation to ther for German or English; cf. Kilian and Garbe specific requirements. We decided basically to (2001) for off—line design of the entire keyboard design an open architecture for a communication layout. If < 3, the keys have to be selected on a system with publicly available sources 3 . virtual keyboard (scanning). Scaffolding for our implementation is provided In our project the keyboard is tailored to a user by the programmable and extendable Emacs text with cerebral palsy. No more than four buttons

For a collection of Open Source assisfivetechnologies, can be accessed directly. Three buttons are am- see TFLUTHCenter(http://vom.trace.mdedu/linux4 biguous letter keys with sets of letters assigned to

208 them. The fourth button invokes letter deletion, domains such as particular school subjects. command mode or word disambiguation. Words Texts in the various domains have been col- are entered by pressing the corresponding ambigu- lected. Their frequencies and contextual in- ous key once for each letter. Only after the word is formation are estimated in this layer. completed, the user disambiguates the input by se- 4. The general language model stems from lecting the intended word in a list of hits provided large corpora; cf. CELEX frequencies by the language model. Fig. 1 depicts the situation (Baayen et al., 1995). Furthermore, the user after the word "aid" has been typed — by pressing can add personal vocabulary such as proper the middle, the right and the middle button again names. (key sequence "232") — and before the user se- lects the intended word in the list of suggestions4 . Except for the list, the layers are com- If the target word is not known to the system, bined by interpolating the probabilities for any it is possible to spell the word and to include it word proposal. Alternatively, the user chooses ex- in the lexicon for future use. Other actions in the plicitly between the local text model, a domain command mode provide text navigation and edit- model or the general model in order to disam- ing as well as activation of the speech output sys- biguate a word. tem. These actions are triggered either by over- We have implemented several language models loading the three letter keys with commands, or by providing the user with ranked lists of predicted entering and disambiguating a command name. words for ambiguous input. Communication be- The ranking in the list of suggestions for an am- tween a language model and the text entry inter- biguous code is determined by a statistical lan- face is handled in a client/server setting imple- guage model. In the simplest case, word fre- mented by sockets. Sockets enable a clearly dis- quencies extracted from corpora determine the or- tinct interface to the language model components. dering. As is known from various applications, An interesting technical option of the client/server unconditional probabilities can be improved by architecture is to use a language model server that adding user–tailored constraints. We provide the is located on another device, e.g. the notebook user with a situated and personalized language used in the classroom or the communication aid model consisting of different layers: of another user.

1. The stop word model comprises a list of a 4 Related work few hundred highly frequent stop words that Prediction–based systems are widely applied in are not supposed to vary in their distribution the commercial area of communication aids (cf. with respect to text genres, styles, etc. These words are proposed with highest likelihood if the PAL system by Swiffin et al. (1987) and et al. the corresponding code matches. WordQ by Shein (1998)). As we do not investigate prediction–based methods, we only re- 2. The local text model is incrementally con- fer to recent work in this area, such as Baroni et structed while writing a personal document. al. (2002) and Fazly (2002). Recently mentioned words are proposed with An interesting recent development in the area higher likelihood than the general model of ambiguous keyboards is the work performed by would do (various formulae for shuffling the (Tanaka-Ishii et al., 2002). They describe an am- competitive suggestions are currently eval- biguous text input system with five or less letter uated (Harbusch et al., 2003)). Further- keys. Word predictions are computed on the ba- more, we have implemented a word fre- sis of prediction by partial matching (PPM) at the quency adaptation for the text model. word level. The letters are assigned to the keys in 3. Various domain specific models allow ap- alphabetical order. This approach favorably com- propriate suggestions in different semantic pares to ours. However, in our approach the keys

4In the worst case, this list consists of 50 words in English have been assigned non–alphabetically after opti- and 75 in German, respectively. misation with respect to a large corpus.

209 Other work on typing with word disambigua- with disabilities. Cambridge University Press, Cam- tion focusses on the nine letter keys of a standard bridge, MA. phone keyboard (e.g. Forcarda (2001), Skiena and Afsaneh Fazly. 2002. The use of syntax in word com- Rau (1996)), and can be traced back to the early pletion utilities. MSc thesis, Department of Com- 1980s (Witten, 1982, pp. 120-122). Work in alter- puter Science, University of Toronto, Canada. native and augmentative communication preced- Mikel L. Forcada. 2001. Corpus-based stochastic ing Kushler (1998) deals with key—by—key dis- finite-state predictive text entry for reduced key- ambiguation for efficient text input (Levine and boards: Application to Catalan. Procesamiento del Lenguaje Natural, 27:65-70. Goodenough-Trepagnier, 1990; Arnott and Javed, 1992). Karin Harbusch, Saga Hasan, Hajo Hoffmann, Michael Kiihn, and Bernhard Schiller. 2003. Domain— 5 Conclusion specific disambiguation for typing with ambiguous We have presented UKO—II, an adaptive ambigu- keyboards. In Proceedings of the EACL workshop on Language Modeling fc)r Text Entry Methods. ous keyboard providing ranked lists of word sug- gestions from customized language models. Michael Kiihn and Rim Garbe. 2001. Predictive and With respect to the adaptation of the system's highly ambiguous typing for a severely speech and motion impaired user. In C. Stephanidis, editor, user interface, we are transferring the keyboard to Universal Access in Human-Computer Interaction, a hand—held PC in order to make the every—day pages 933-937. Lawrence Erlbaum, Mahwah, NJ. use by a wheelchair user more convenient. Pro- Cliff Kushler. 1998. AAC using a reduced keyboard. viding access to cellular phone communication is In (CSUN Center of Disabilities, 1998). also on our agenda. As to the various language models, we have de- Stephen H. Levine and Cheryl Goodenough- Trepagnier. 1990. Customised text entry devices signed all four layers. On the level of domain for motor-impaired users. Applied Ergonomics, models, we have modelled school topics and two 21(1):55-62. different research topics. Currently we run evalu- Filip T. Loncke, John Clibbens, Helen H. Arvidson, ation studies on the competition formulae for the and Lyle Lloyd. 1999. Augmentative and Alter- rankings in the final list of suggestions native Communication: New Directions in Research and Practice. Whurr, London, UK. References Fraser Shein, Tom Nantais, Rose Nishiyama, Cynthia John L. Arnott and Muhammad Y. Javed. 1992. Prob- Tam, and Paul Marshall. 1998. Word cueing for abilistic character disambiguation for reduced key- persons with writing difficulties: WordQ. In (CSUN boards using small text samples. AAC Augmentative Center of Disabilities, 1998). and Alternative Communication, 8(1): 215-223 . Steven Skiena and Harald Rau. 1996. Dialing for doc- R. Harald Baayen, Richard Piepenbrock, and Leon Gu- uments: an experiment in information theory. Jour- likers. 1995. The CELEX lexical database (re- nal of Visual Languages and Computing, 7:79-95. lease 2), [CD-ROM]. Linguistic Data Consortium, Andy L. Swiffin, John L. Arnott, and Alan F. Newell. Philadelphia, PA. 1987. The use of syntax in a predictive commu- Marco Baroni, Johannes Matiasek, and Harald Trost. nication aid for the physically handicapped. In 2002. Wordform- and class-based prediction of the Richard D. Steele and William Gerrey, editors, Proc. components of German nominal compounds in an of the 10th Annual Conference on Rehabilitation AAC system. In (Tseng, 2002), pages 57-63. Technology, pages 124-126, Washington, DC. Kumiko Tanaka-Ishii, Yusuke Inutsuka, and Masato CSUN Center of Disabilities. 1998. Online proceed- Takeichi. 2002. Entering text with a four-button ings of the Technology and Persons with Disabilities device. (Tseng, 2002), pages 988-994. Conference 1998. California State University, Nor- In tri dge, CA. Shu-Chuan Tseng, editor. 2002. Proceedings of the 19th International Conference on Computational John J. Darragh and Ian H. Witten. 1992. The Reactive Linguistics (COLING 2002), Taipei, Taiwan. Keyboard. Cambridge Univ. Press, Cambridge, MA. Ian H. Witten. 1982. Principles of Computer Speech. Alistair D.N. Edwards, editor. 1995. Extra-ordinary Academic Press, London, UK. human-computer interaction: Interfaces for users

210