Towards an Adaptive Communication Aid with Text Input from Ambiguous Keyboards

Karin Harbusch, Michael Kühn
University Koblenz-Landau, Computer Science Department
Universitätsstr. 1, D-56070 Koblenz, GERMANY

Abstract

Ambiguous keyboards provide efficient typing with low motor demands. In our project [1] concerning the development of a communication aid, we emphasize adaptation with respect to the sensory input. At the same time, we wish to impose individualized language models on the text determination process. UKO-II is an open architecture based on the Emacs text editor with a server/client interface for adaptive language models. Not only the group of motor impaired people but also users of watch-sized devices can profit from this ambiguous typing.

[1] The project is partially funded by the DFG (German Research Foundation) under grant HA 2716/2-1.

1 Introduction

Written text for communication is of growing importance in e-mails, SMS, newsgroups, web pages, and even in synchronous communication situations like chatting, transmitted by electronic devices (computers, cellular phones, handhelds). Computer assisted text entry methods such as ambiguous keyboards are feasible for synchronous and even for asynchronous communication scenarios as they allow complex communication on small electronic devices. Various systems on the mobile phone and handheld market promise a solution to easier and faster text entry.

People with communication disorders are a second group of users who can benefit from computer-assisted text input. Often, speech impairments coincide with severe motor impairments. Standard keyboards or graphical input devices are often unsuitable for motor impaired users. Sometimes, only the operation of one or a very small number of physical switches is possible via buttons, joystick, eye-tracking or otherwise.

These two contexts of use are considerably different: Mobile communication typically happens in the context of asynchronous telecommunication (although fast exchange via SMS or e-mail sometimes develops into a synchronous communication situation). Alternative and augmentative (AAC) methods typically deal with communication strategies in synchronous, face-to-face contexts where, e.g., an electronic communication aid is used to produce a text that is synthesized by a text-to-speech system. (Of course, the produced text can also be utilized in asynchronous telecommunication.)

However, in both contexts the challenging goal is to efficiently produce short pieces of (usually highly variable) natural language text under difficult circumstances. The small size of the device is one factor prohibiting the use of a full keyboard; the other factor is the user's restricted motor function. Both application areas share the aim of a personalized language model to be most effective for the user.

2 Efficient text input methods

Two main classes of efficient text input methods can be identified. [2] First, on a standard QWERTY keyboard, input can be accelerated by prediction and completion of commands and other word strings (Darragh and Witten, 1992), which reduces the number of keystrokes necessary to enter a word. Motion impaired users who cannot access a full keyboard are slowed down because they have to select each individual key in multiple steps (scanning).

[2] As we concentrate on free text entering devices, we ignore icon-based systems (cf. Lonke et al. (1999)).

Second, ambiguous keyboards give rise to communication based on a reduced number of keys (down to 4, cf. Fig. 1). Typing on these devices, the user presses the key corresponding to the letter only once. When the key corresponding to the space bar is pressed, a dictionary is consulted to find all words corresponding to the ambiguous code.

The advantages of an ambiguous keyboard with word disambiguation for users of AAC devices are outlined by Kushler (1998). The efficiency of an ambiguous keyboard approximates one keystroke per letter. Apart from literacy, no memorization of special encodings is required. Attention to the display is required only after the word has been typed. A keyboard with fewer and larger keys may allow easier direct selection for users who otherwise may depend on scanning.

An obstacle to both strategies, prediction and disambiguation, may arise from gaps in the electronic lexicon. If a word is not known to the system, the user of an ambiguous keyboard has to leave the typing mode in order to enter the word by other means. Another drawback of ambiguous text entry is the increased cognitive load imposed on users while typing the word: They may be unable to see the letters of the word already typed and therefore have to memorize the input position.

3 The adaptive UKO-II system

Assistive devices have to respond to dramatically varying needs (Edwards, 1995). Therefore, in order to be useful, they should allow adaptation to specific requirements. We decided to design an open architecture for a communication system with publicly available sources. [3]

[3] For a collection of Open Source assistive technologies, see the TRACE Center (http://vom.trace.mdedu/linux4

Scaffolding for our implementation is provided by the programmable and extendable Emacs text editor, which already includes many text entry and manipulation functions useful in our context. Furthermore, operating system support (e.g. sockets), basic applications like mail, and a development environment with extensive documentation are at the programmer's fingertips. All components in the communication aid dealing with input/output have been implemented as Emacs Lisp modules.

Our communication aid called UKO-II (Fig. 1) is adaptable in two ways: First, the system is customizable to differing keyboard layouts and to the selection of word suggestions or additional editing commands. Second, a layered structure of language models controls the disambiguation process and adapts to the user's text input. We discuss both modes in turn.

Figure 1: UKO-II Emacs text editing interface with the ambiguous keyboard for English.

Our text entry interface presumes n (n > 1) physical buttons. This parameter is determined either by the user's motor functions or the device's available buttons. For n > 4, a genetic algorithm computes a distribution of letters that optimizes the length of suggested word lists with respect to the fixed word frequency information provided by the lexicon. We utilize the frequencies of the CELEX database (Baayen et al., 1995) either for German or English; cf. Kilian and Garbe (2001) for off-line design of the entire keyboard layout. If n < 3, the keys have to be selected on a virtual keyboard (scanning).

In our project the keyboard is tailored to a user with cerebral palsy. No more than four buttons can be accessed directly. Three buttons are ambiguous letter keys with sets of letters assigned to them. The fourth button invokes letter deletion, command mode or word disambiguation. Words are entered by pressing the corresponding ambiguous key once for each letter. Only after the word is completed does the user disambiguate the input by selecting the intended word in a list of hits provided by the language model. Fig. 1 depicts the situation after the word "aid" has been typed (by pressing the middle, the right and the middle button again, i.e. key sequence "232") and before the user selects the intended word in the list of suggestions. [4]

If the target word is not known to the system, it is possible to spell the word and to include it in the lexicon for future use. Other actions in the command mode provide text navigation and editing as well as activation of the speech output system. These actions are triggered either by overloading the three letter keys with commands, or by entering and disambiguating a command name.

The ranking in the list of suggestions for an ambiguous code is determined by a statistical language model. In the simplest case, word frequencies extracted from corpora determine the ordering. As is known from various applications, unconditional probabilities can be improved by adding user-tailored constraints. We provide the user with a situated and personalized language model consisting of different layers:

1. The stop word model comprises a list of a few hundred highly frequent stop words that are not supposed to vary in their distribution [...]

[...] domains such as particular school subjects. Texts in the various domains have been collected. Their frequencies and contextual information are estimated in this layer.

4. The general language model stems from large corpora; cf. CELEX frequencies (Baayen et al., 1995). Furthermore, the user can add personal vocabulary such as proper names.

Except for the stop word list, the layers are combined by interpolating the probabilities for any word proposal. Alternatively, the user chooses explicitly between the local text model, a domain model or the general model in order to disambiguate a word.

We have implemented several language models providing the user with ranked lists of predicted words for ambiguous input. Communication between a language model and the text entry interface is handled in a client/server setting implemented by sockets. Sockets enable a clearly distinct interface to the language model components. An interesting technical option of the client/server architecture is to use a language model server that is located on another device, e.g. the notebook used in the classroom or the communication aid of another user.

4 Related work

Prediction-based systems are widely applied in the commercial area of communication aids (cf.
