

J Multimodal User Interfaces (2015) 9:275–286, DOI 10.1007/s12193-015-0182-7

ORIGINAL PAPER

A survey of assistive technologies and applications for blind users on mobile platforms: a review and foundation for research

Ádám Csapó1,3 · György Wersényi1 · Hunor Nagy1 · Tony Stockman2

Received: 13 December 2014 / Accepted: 29 May 2015 / Published online: 18 June 2015 © The Author(s) 2015. This article is published with open access at Springerlink.com

Abstract This paper summarizes recent developments in audio and tactile feedback based assistive technologies targeting the blind community. Current technology allows applications to be efficiently distributed and run on mobile and handheld devices, even in cases where computational requirements are significant. As a result, electronic travel aids, navigational assistance modules, text-to-speech applications, as well as virtual audio displays which combine audio with haptic channels are becoming integrated into standard mobile devices. This trend, combined with the appearance of increasingly user-friendly interfaces and modes of interaction, has opened a variety of new perspectives for the rehabilitation and training of users with visual impairments. The goal of this paper is to provide an overview of these developments based on recent advances in basic research and application development. Using this overview as a foundation, an agenda is outlined for future research in mobile interaction design with respect to users with special needs, as well as ultimately in relation to sensor-bridging applications in general.

Keywords Assistive technologies · Sonification · Mobile applications · Blind users

Ádám Csapó
[email protected]
1 Széchenyi István University, Győr, Hungary
2 Queen Mary University, London, UK
3 Institute for Computer Science and Control, Hungarian Academy of Sciences, Budapest, Hungary

1 Introduction

A large number of visually impaired people use state-of-the-art technology to perform tasks in their everyday lives. Such technologies consist of electronic devices equipped with sensors and processors capable of making "intelligent" decisions. Various feedback devices are then used to communicate results effectively. One of the most important and challenging tasks in developing such technologies is to create a user interface that is appropriate for the sensorimotor capabilities of blind users, both in terms of providing input and interpreting output feedback. Today, the use of commercially available mobile devices shows great promise in addressing such challenges. Besides being equipped with increasing computational capacity and sensor capabilities, these devices generally also provide standardized possibilities for touch-based input and perceptually rich auditory-tactile output. As a result, the largest and most widespread mobile platforms are rapidly evolving into de facto standards for the implementation of assistive technologies.

The term "assistive technology" in general is used within several fields where users require some form of assistance. Although these fields can be diverse in terms of their scope and goals, user safety is always a key issue. Therefore, besides augmenting user capabilities, ensuring their safety and well-being is also of prime importance. Designing navigational aids for visually impaired people is an exemplary case where design decisions must in no way detract from users' awareness of their surroundings through natural channels. In this paper, our goal is to provide an overview of theoretical and practical solutions to the challenges faced by the visually impaired in various domains of everyday life. Recent developments in mobile technology are highlighted, with the goal of setting out an agenda for future research in CogInfoCom-supported assistive technologies. The paper is structured as follows. In Sect.
2, a summary is provided icon illustration described above, one could imagine a pre- of basic auditory and haptic feedback techniques in gen- specified but abstract pattern of tones to symbolize the event eral, along with solutions developed in the past decades both of someone entering or leaving a virtual space. Whenever a in academia and for the commercial market. In Sect. 3,an data-oriented perspective is preferred, as in transferring data overview is given of generic capabilities provided by state- to audio, the term ‘sonification’ is used, which refers to the of-the-art mobile platforms which can be used to support “use of non-speech audio to convey information or percep- assistive solutions for the visually impaired. It is shown tualize data” [6] (for a more recent definition, the reader is that both the theoretical and practical requirements relat- referred to [7]). ing to assistive applications can be addressed in a unified Since the original formulation of these concepts, sev- way through the use of these platforms. A cross-section of eral newer kinds of auditory representations have emerged. trending mobile applications for the visually impaired on the For an overview of novel representations—including repre- Android and iOS platforms is provided in Sect. 4. Finally, sentations with speech-like and/or emotional characteristics an agenda is set out for future research in this area, ranging (such as spearcons, spindexes, auditory and spe- from basic exploration of perceptual and cognitive issues, moticons), as well as representations used specifically for to the development of improved techniques for prototyping navigation and alerting information (such as musicons and and evaluation of sensor-bridging applications with visually morphocons)—the reader is referred to the overview pro- impaired users. vided in [8]. In the haptic and tactile domains, a distinction that is some- what analogous to the auditory domain exists between iconic 2 Overview of auditory and haptic feedback and message-like communications, although not always methods in a clearly identifiable sense. Thus, while MacLean and Enriquez suggest that haptic icons are conceptually closer The research fields dealing with sensory interfaces between to earcons than auditory icons, in that they are message-like users and devices often acknowledge a certain duality (“our approach shares more philosophically with [earcons], between iconic and more abstract forms of communication. but we also have a long-term aim of adding the intuitive In a sense, this distinction is intuitive, but its value in interface benefits of Gaver’s approach…”) [9], the same authors in a design has also been experimentally justified with respect to different publication write that “haptic icons, or hapticons, different sensory modalities. [are] brief programmed forces applied to a user through a We first consider the case of auditory interfaces, in which haptic interface, with the role of communicating a simple the icon-message distinction is especially strong. Auditory idea in manner similar to visual or auditory icons” [10]. Sim- icons were defined in the context of everyday listening ilarly, Brewster and Brown define tactons and tactile icons as “caricatures of everyday sounds” [1–3]. 
This was the as interchangeable terms, stating that both are “structured, first generalization of David Canfield-Smith’s original visual abstract messages that can be used to communicate mes- icon concept [4] to modalities other than vision, through sages non-visually” [11]. Such a view is perhaps tenable as a theory that separates ‘everyday listening’ from ‘musi- long as no experimentally verifiable considerations suggest cal listening’. Briefly expressed, Gaver’s theory separates that the two terms should be regarded as referring to differ- cases where sounds are interpreted with respect to their ent concepts. Interestingly, although the terms ‘haptification’ perceptual-musical qualities, as opposed to cases where they and ‘tactification’ have been used in analogy to visualization are interpreted with respect to a physical context in which the and sonification, very often these terms arise independently same sound is generally encountered. As an example of the of any data-oriented approach (i.., it is the use of haptic latter case, the sound of a door being opened and closed could and tactile feedback—as opposed to no feedback—that is for instance be used as an icon for somebody entering or leav- referred to as haptification and tactification). ing a virtual environment. Earcons were defined by Blattner, In both the auditory and haptic/tactile modalities, the Sumikawa and Greenberg as “non-verbal audio messages task of creating and deploying useful representations is a used in the user–computer interface to provide information multifaceted challenge which often requires a simultaneous to the user about some computer object, operation, or inter- reliance on psychophysical experiments and trial-and-error action” [5]. Although this definition does not in itself specify based techniques. While the former source of knowledge whether the representation is iconic or message-like, the is important in describing the theoretical limits of human same paper offers a distinction between ‘representational’ perceptual capabilities—in terms of e.. just-noticeable- and ‘abstract’ earcons, thus acknowledging that such a dual- differences (cf. e.g. [12–14]) and other factors—the latter ity exists. Today, the term ‘earcon’ is used exclusively in the can be equally important as the specific characteristics of the second sense, as a concept that is complementary to the iconic devices employed and the particular circumstances in which 123 J Multimodal User Interfaces (2015) 9:275–286 277 an application is used can rarely be controlled for in advance. The former goal is supported by the growing availability This is well demonstrated by the large number of auditory of (mobile) processing power, while the latter is supported and haptic/tactile solutions which have appeared in assistive by the growing availability of unencumbered wearable tech- technologies in the past decades, and by the fact that, despite nologies. A recent example of a solution which aims to make huge differences among them, many have been used with a use of these developments is a product of a company by the significant degree of success. An overview of relevant solu- name “Artificial Vision For the Blind”, which incorporates a tions is provided in the following section. pair of glasses from which haptic feedback is transmitted to the palm [28,29]. 2.1 Systems in assistive engineering based on tactile The effective transmission of distance information is a solutions key issue if depth information has to be communicated to users. 
The values of distance to be represented are usually Historically speaking, solutions supporting vision using the proportional or inversely proportional to tactile and auditory tactile modality appeared earlier than audio-based solutions, attributes, such as frequency or spacing between impulses. therefore, a brief discussion of tactile-only solutions is pro- Distances reproduced usually range from 1 to 15 . Instead vided first. of a continuous simulation of distance, discrete levels of dis- Systems that substitute tactile stimuli for visual informa- tance can be used by defining areas as being e.g. “near” or tion generally translate images from a camera into electrical “far”. or vibrotactile stimuli, which can then be applied to various parts of the body (including the fingers, the palm, the back or 2.2 Systems in assistive engineering based on auditory the tongue of the user). Experiments have confirmed the via- solutions bility of this approach in supporting the recognition of basic shapes [15,16], as well as reading [17,18] and localization In parallel to tactile feedback solutions, auditory feedback tasks [19]. has also increasingly been used in assistive technologies ori- Several ideas for such applications have achieved com- ented towards the visually impaired. Interestingly, it has been mercial success. An early example of a device that supports remarked that both the temporal and frequency-based res- reading is the Optacon device, which operates by transcod- olution of the auditory sensory system is higher than the ing printed letters onto an array of vibrotactile actuators in a resolution of somatosensory receptors along the skin [30]. 24 × 6 arrangement [17,20,21]. While the Optacon was rel- For several decades, however, this potential advantage of atively expensive at a price of about 1500 GBP in the 1970s, audition over touch was difficult to take advantage of due it allowed for reading speeds of 15–40 words per minute [22] to limitations in processing power. For instance, given that (others have reported an average of about 28 wpm [23], while sound information presented to users is to be synchronized the variability of user success is illustrated by the fact that one with the frame rate at which new data is read, limitations of the authors of the current paper knew and observed at least in visual processing power would have ultimately affected two users with optacon reading speeds of over 80 wpm). A the precision of feedback as well. Even today, frame rates camera extension to the system was made available to allow of about 2–6 frames per second (fps) are commonly used, for the reading of on-screen material. despite the fact that modern camera equipment is easily capa- An arguably more complex application area of tactile sub- ble of capturing 25 fps, and that human auditory capabilities stitution for vision is navigation. This can be important not would be well suited to interpreting more information. only for the visually impaired, but also for applications in Two early auditory systems—both designed to help blind which users are subjected to significant cognitive load (as users with navigation and obstacle detection—are Son- evidenced by the many solutions for navigation feedback in icGuide and the LaserCane [31]. SonicGuide uses a wearable both on-the-ground and aerial navigation [24–26]). 
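As a concrete illustration of the distance mappings described above, the short sketch below (our own illustrative example, not code from any of the cited systems) converts a measured obstacle distance into a vibration pulse interval that shortens as the obstacle gets closer, and also quantizes the range into the kind of discrete "near"/"far" zones mentioned in the text.

```python
# Illustrative sketch only (not from any cited system): map obstacle distance
# to feedback parameters, either continuously or via discrete zones.

def pulse_interval(distance_m, d_min=0.3, d_max=15.0, t_min=0.05, t_max=1.0):
    """Return the pause between vibration pulses in seconds.

    Closer obstacles give shorter intervals (faster pulsing), mirroring the
    inverse-proportional mappings used by many travel aids.
    """
    d = max(d_min, min(distance_m, d_max))
    norm = (d - d_min) / (d_max - d_min)      # 0.0 (very close) .. 1.0 (far)
    return t_min + norm * (t_max - t_min)

def distance_zone(distance_m, near=1.0, far=5.0):
    """Quantize distance into discrete zones instead of a continuous value."""
    if distance_m <= near:
        return "near"
    if distance_m <= far:
        return "medium"
    return "far"

if __name__ == "__main__":
    for d in (0.5, 2.0, 8.0, 14.0):
        print(f"{d:5.1f} m -> pulse every {pulse_interval(d):.2f} s ({distance_zone(d)})")
```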
ultrasonic echolocation system (in the form of a pair of eye- One of the first commercially available devices for nav- glass frames) to provide the user with cues on the azimuth igation was the Mowat sensor, from Wormald International and distance of obstacles [32]. Information is provided to the Sensory Aids, which is a hand-held device that uses ultra- user by directly mapping the ultrasound echoes onto audible sonic detection of obstacles and provides feedback in the sounds in both ears (one ear can be used if preferred, leaving form of tactile vibrations that are inversely proportional to the other free for the perception of ambient sounds). distance. Another example is Videotact, created by Fore- The LaserCane system works similarly, although its inter- Thought Development LLC, which provides navigation cues face involves a walking cane and it uses infrared instead through 768 titanium electrodes placed on the abdomen [27]. of ultrasound signals in order to detect obstacles that are Despite the success of such products, newer ones are still relatively close to the cane [33]. It projects beams in three being developed so that environments with increasing levels different directions in order to detect obstacles that are above of clutter can be supported at increasing levels of comfort. the cane (and thus possibly in front of the chest of the user), 123 278 J Multimodal User Interfaces (2015) 9:275–286 in front of the cane at a maximum distance of about 12 ft, and the location of features in the environment that may be of in front of the cane in a downward direction (e.g., to detect interest to the user. The sounds used by SWAN include navi- curbs and other discontinuities in terrain surface). Feedback gation beacons (earcon-like sounds), object sounds (through to the user is provided using tactile vibrations for the forward- spatially localized auditory icons), surface transitions, and oriented beam only (as signals in this direction are expected location information and announcements (brief prerecorded to be relatively more frequent), while obstacles from above speech samples). and from the terrain are represented by high-pitched and low- General-purpose systems for visual-to-audio substitution pitched sounds, respectively. (e.g. with the goal of allowing pattern recognition, move- A similar, early approach to obstacle detection is the Not- ment detection, spatial localization and mobility) have also tingham Obstacle Detector that works using a ∼16 Hz ultra- been developed. Two influential systems in this category are sonic detection signal and is somewhat of a complement to the vOICe system developed by Meijer [38] and the Pros- the Mowat sensor (it is also handheld and also supports obsta- thesis Substituting Vision with Audition (PSVA) developed cle detection based on ultrasound, albeit through audio feed- by Capelle and his colleagues [31]. The vOICe system trans- back). Eight gradations of distance are assigned to a musical lates the vertical dimension of images into frequency and the scale. However, as the system is handheld, the position at horizontal dimension of images into time, and has a resolu- which it is held compared to the horizontal plane is impor- tion of 64 pixels × 64 pixels (more recent implementations tant. Blind users have been shown to have lower awareness of use larger displays [30]). 
The PSVAsystem uses similar con- limb positions, therefore this is a drawback of the system [34] cepts, although time is neglected and both dimensions of the More recent approaches are exemplified by the Real-Time image are mapped to frequencies. Further, the PSVA system Assistance Prototype (RTAP) [35], which is camera-based uses a biologically more realistic, retinotopic model, in which and is more sophisticated in the kinds of information it con- the central, foveal areas are represented in higher resolution veys. It is equipped with stereo cameras applied on a helmet, (thus, the model uses a periphery of 8 × 8 pixels and a foveal a portable laptop with Windows and small stereo head- region in the center of 8 × 8 pixels). phones. Disadvantages are the limited panning area of ±32◦, the lack of wireless connection, the laptop-sized central unit, the use of headphones blocking the outside world, low reso- 2.3 Systems in assistive engineering based on auditory lution in distance and “sources too close” perception during and tactile solutions binaural rendering, and the unability to detect objects at the ground level. The latter can be solved by a wider viewfield or Solutions combining the auditory and tactile modalities have using the equipment as complementary to the white stick that thus far been relatively rare, although important results and is responsible for detection of steps, stones etc. The RTAP solutions have begun to appear in the past few years. Despite uses 19 discrete levels for distance. It also provides the advan- some differences between them, in many cases the solutions tage of explicitly representing the lack of any obstacle within represent similar design choices. a certain distance, which can be very useful in reassuring HiFiVE is one example of a technically complex vision the user that the system is still in operation. Further, based support system that combines sound with touch and manip- on the object classification capabilities of its vision system, ulation [39–41]. Visual features that are normally perceived the RTAP can filter objects based on importance or proxim- categorically are mapped onto speech-like (but non-verbal) ity. Tests conducted using the system have revealed several auditory phonetics (analogous to spemoticons, only not important factors in the success of assistive solutions. One emotional in character). All such sounds comprise three syl- such factor is the ability of users to remember auditory events lables which correspond to different areas in the image: one (for example, even as they disappear and reappear as the syllable for color, and two for layout. For example, “way-lair- user’s head moves). Another important factor is the amount roar” might correspond to “white-grey” and “left-to-right”. of training and the level of detail with which the associated Changes in texture are mapped onto fluctuations of vol- training protocols are designed and validated. ume, while motion is represented through binaural panning. A recent system with a comparable level of sophistica- Guided by haptics and tactile feedback, users are also enabled tion is the System for Wearable Audio Navigation (SWAN), to explore various areas on the image via finger or hand which was developed to serve as a safe pedestrian navigation motions. Through concurrent feedback, information is pre- and orientation aid for persons with temporary or permanent sented on shapes, boundaries and textures. visual impairments [36,37]. 
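To make the vOICe-style mapping described above more concrete, the sketch below (a much simplified illustration of the general idea, not the actual vOICe implementation) scans a small grayscale image from left to right over one second, maps row position to sine frequency and pixel brightness to amplitude, and writes the result to a WAV file.

```python
# Simplified, illustrative image-to-soundscape scan in the spirit of the
# mapping described above (not the actual vOICe code): columns are played
# left to right, row height controls pitch, brightness controls loudness.
import math
import struct
import wave

SAMPLE_RATE = 16000

def image_to_soundscape(image, duration_s=1.0, f_low=200.0, f_high=3200.0):
    rows, cols = len(image), len(image[0])
    samples_per_col = int(SAMPLE_RATE * duration_s / cols)
    # Top rows get high frequencies, bottom rows low ones.
    freqs = [f_high - (f_high - f_low) * r / (rows - 1) for r in range(rows)]
    samples = []
    for c in range(cols):
        column = [image[r][c] for r in range(rows)]
        for n in range(samples_per_col):
            t = (c * samples_per_col + n) / SAMPLE_RATE
            value = sum(b * math.sin(2 * math.pi * f * t)
                        for b, f in zip(column, freqs))
            samples.append(value / rows)  # keep the summed signal in [-1, 1]
    return samples

def write_wav(path, samples):
    with wave.open(path, "w") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))

if __name__ == "__main__":
    # A bright diagonal line on a dark background as an 8 x 8 test image.
    img = [[1.0 if r == c else 0.0 for c in range(8)] for r in range(8)]
    write_wav("soundscape.wav", image_to_soundscape(img))
```

Real systems of course add stereo panning, higher resolutions and live camera input, but the core of the mapping takes this form.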
SWANconsists of an audio-only The HiFiVE system has seen several extensions since it output and a tactile input via a dedicated handheld interface was originally proposed, and these have in some cases led to device. Once the user’s location and head direction are deter- a relative rise to prominence of automated vision processing mined, SWAN guides the user along the required path using approaches. This has enabled the creation of so-called audio- a set of beacon sounds, while at the same time indicating tactile objects, which represent higher-level combinations 123 J Multimodal User Interfaces (2015) 9:275–286 279 of low-level visual attributes. Such objects have auditory 3 Generic capabilities of mobile computing representations, and can also be explored through ‘tracers’— platforms i.e., communicational entities which either systematically present the properties of corresponding parts of the image Recent trends have led to the appearance of generic mobile (‘area-tracers’), or convey the shapes of particular items computing technologies that can be leveraged to improve (‘shape-tracers’). users’ quality of life. In terms of assistive technologies, this The See ColOr system is another recent approach to com- tendency offers an ideal working environment for developers bining auditory feedback with tactile interaction [42]. See who are reluctant to develop their own, specialized hard- ColOr combines modules for local perception, global percep- ware/software configurations, or who are looking for faster tion, alerting and recognition. The local perception module ways to disseminate their solutions. uses various auditory timbres to represent colors (through a The immense potential behind mobile communication hue-saturation-level technique reminiscent of Barrass’s TBP platforms for assistive technologies can be highlighted model [43]), and rhythmic patterns to represent distance as through various perspectives, including general-purpose measured on the azimuth plane. The global module allows computing, advanced sensory capabilities, and crowdsourc- users to pinpoint one or more positions within the image ing/data integration capabilities. using their fingers, so as to receive comparative feedback relevant to those areas alone. The alerting and recognition 3.1 General-purpose computing modules, in turn, provide higher-level feedback on obsta- cles which pose an imminent threat to the user, as well as The fact that mobile computing platforms offer standard on “auditory objects” which are associated with real-world APIs for general-purpose computing provides both applica- objects. tion developers and users with a level of flexibility that is At least two important observations can be made based on very conducive to developing and distributing novel solu- these two approaches. The first observation is that whenever tions. As a result of this standardized background (which audio is combined with finger or hand-based manipulations, can already be seen as a kind of ubiquitous infrastructure), the tactile/haptic modality is handled as the less prominent of there is no longer any significant need to develop one-of-a- the two modalities—either in the sense that it is (merely) used kind hardware/software configurations. Instead, developers to guide the scope of (the more important) auditory feedback, are encouraged to use the “same language” and thus col- or in the sense that it is used to provide small portions of the laboratively improve earlier solutions. 
A good example of complete information available, thereby supporting a more this is illustrated by the fact that the latest prototypes of explorative interaction from the user. The second observa- the HiFiVE system are being developed on a commodity tion is that both of these multi-modal approaches distinguish tablet device. As a result of the growing ease with which between low-level sensations and high-level perceptions— new apps can be developed, it is intuitively clear that a with the latter receiving increasing support from automated growing number of users can be expected to act upon the processing. This second point is made clear in the See ColOr urge to improve existing solutions, should they have new system, but it is also implicitly understood in HiFiVE. ideas. This general notion has been formulated in several domains in terms of a transition from “continuous improve- ment” to “collaborative innovation” [44]; or, in the case of 2.4 Summary of key observations software engineering, from “software product lines” to “soft- ware ecosystems” [45]. Based on the overview presented in Sect. 2, it can be con- cluded that feedback solutions range from iconic to abstract, and from serial to parallel. Given the low-level physical 3.2 Advanced sensory capabilities nature of the tasks considered (e.g. recognizing visual char- acters and navigation), solutions most often apply a low-level State-of-the-art devices are equipped with very advanced, and direct mapping of physical changes onto auditory and/or and yet generic sensory capabilities enabling tight interac- tactile signals. Such approaches can be seen as data-driven tion with the environment. For instance, the Samsung Galaxy sonification (or tactification, but in an analogous sense to S5 (which appeared in 2014) has a total of 10 built-in sen- sonification) approaches. Occasionally, they are comple- sors. This is a very general property that is shared by the mented by higher-level representations, such as the earcon- vast majority of state-of-the-art mobile devices. Typically based beacon sounds used in SWAN, or the categorical supported sensors include: verbalizations used in HiFiVE. From a usability perspec- tive, such approaches using mostly direct physical, and some – GPS receivers, which are very useful in outdoor environ- abstract metaphors seem to have proven most effective. ments, but also have the disadvantage of being inaccurate, 123 280 J Multimodal User Interfaces (2015) 9:275–286

slow, at times unreliable and of being unusable in indoor in mobile phone and tablet use by this population is the environments fact that the Royal National Institute of the Blind in the UK – Gyro sensors, which detect the rotation state of the mobile runs a monthly “Phone Watch” event at its headquarters in device based on three axes (it is interesting to note that London. the gyroscopes found on most modern devices are suf- ficiently sensitive to measure acoustic signals of low 4.1 Applications for Android frequencies, and through signal processing and machine learning, can be used as a microphone [46]). In this subsection, various existing solutions for Android are – Accelerator sensors, which detect the movement state of presented. the device based on three axes – Magnetic sensors (compass), which have to be calibrated, 4.2 Text, speech and typing and can be affected by strong electromagnetic fields. Android includes a number of facilities for text-to-speech The accuracy of such sensors generally depends on the spe- based interaction as part of its Accessibility Service. In partic- cific device, however, they are already highly relevant to ular, TalkBack, KickBack, and SoundBack are applications applications which integrate inputs from multiple sensors and designed to help blind and visually impaired users by allow- will only improve in accuracy in the coming years. ing them to hear and/or feel their selections on the GUI [61]. These applications are also capable of reading text out loud. 3.3 Crowdsourcing and data integration capabilities The Sound quality of TalkBack is relatively good compared to other screen-readers for PCs, however, proper language A great deal of potential arises from the ability of mobile versions of SVOX must be installed in order to get the communication infrastructures to aggregate semantically rel- required quality, and some of these are not free. On the other evant data that is curated in a semi-automated, but also hand, operating vibration feedback cannot be switched off crowd-supported way. This capability allows for low-level and the system sometimes reads superfluous information on sensor data to be fused with user input and (semantic) back- the screen. Users have reported that during text-messaging, ground information in order to achieve greater precision in errors can occur if the contact is not already in the address functionality. book. The software is only updated for the latest Android A number of recent works highlight the utility of this versions. Although all of them are preinstalled on most con- potential, for example in real-time emergency response [47, figurations, the freely available IDEAL Accessible App can 48], opportunistic data dissemination [49], crowd-supported be used to incorporate further enhancements in their func- sensing and processing [50], and many others. Crowdsourced tionality. ICT based solutions are also increasingly applied in the assis- The Classic Text to Speech Engine includes a combination tive technology arena, as evidenced by several recent works of over 40 male and female voices, and enables users to listen on e.g. collaborative navigation for the visually impaired in to spoken renderings of e.g. text files, e-books and translated both the physical world and on the Web [51–55]. Here, sen- texts [62]. The app also features voice support in key areas sor data from the mobile platform (e.g., providing location like navigation. 
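As a rough illustration of the text-to-speech facilities described above, the snippet below uses the third-party pyttsx3 package (our choice for the example; TalkBack and the Classic Text to Speech Engine rely on the platform's own TTS services) to read a string aloud at an adjustable speaking rate.

```python
# Minimal text-to-speech sketch using the third-party pyttsx3 package
# (pip install pyttsx3). Illustrative only; the Android services discussed
# above use the platform's built-in TTS engines instead.
import pyttsx3

def speak(text, words_per_minute=150):
    engine = pyttsx3.init()
    engine.setProperty("rate", words_per_minute)  # speaking rate
    engine.say(text)
    engine.runAndWait()                           # block until playback ends

if __name__ == "__main__":
    speak("Battery at eighty percent. Next stop: city library.")
```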
In contrast to SVOX, this application is free, information) are generally used to direct users to the infor- but with limited language support both for reading and the mation that is most relevant, or to the human volunteer who human-device user interface. has the most knowledge with respect to the problem being BrailleBack works together with the TalkBack app to pro- addressed. vide a combined and speech experience [63]. This allows users to connect a number of supported refreshable Braille displays to the device via Bluetooth. Screen content 4 State-of-the-art applications for mobile platforms is then presented on the Braille display and the user can navi- gate and interact with the device using the keys on the display. Both Android and iOS offer applications that support visually Ubiquitous Braille based typing is also supported through the impaired people in performing everyday tasks. Many of the BrailleType application [64]. applications can also be used to good effect by sighted per- The Read for the Blind application is a community- sons. In this section, a brief overview is provided of various powered app that allows users to create audio books for the solutions on both of these platforms. blind, by reading books or short articles from magazines, Evidence of the take up of these applications can be seen newspapers or interesting websites [65]. on email lists specific to blind and visually impaired users The ScanLife Barcode and QR Reader [62] enables users [56–58], exhibitions of assistive technology [59], and spe- to read barcodes and QR codes. This can be useful not only cialist publications [60]. A further symptom of the growth in supermarkets, but in other locations as well using even a 123 J Multimodal User Interfaces (2015) 9:275–286 281 relatively low resolution camera of 3.2 MP,however, memory the many apps created by the Eyes-Free Project that helps demands can be a problem. blind people in navigation by providing real-time vibration The identification of colors is supported by application feedback if they are not moving in the correct direction [62]. such as the Color Picker or Seenesthesis [66,67]. More The accuracy based on the in-built GPS can be low, making generally, augmented/virtual magnifying glasses—such as it difficult to issue warnings within 3–4 m of accuracy. This Magnify—also exist to facilitate reading for the weak-sighted can be eliminated by having a better GPS receiver connected [62]. Magnify only works adequately if it is used with a cam- via Bluetooth. era of high resolution. Similarly, Intersection Explorer provides a spoken account of the layout of streets and intersections as the user drags her 4.3 Speech-based command interfaces finger across a map [74]. A more comprehensive application is The vOICe for The Eyes-Free Shell is an alternative home screen or launcher Android [75]. The application maps live camera views to for the visually impaired as well as for users who are unable soundscapes, providing the visually impaired with an aug- to focus on the screen (e.g. while driving). The application mented reality based navigation support. The app includes a provides a way to interact with the touch screen to check talking color identifier, talking compass, talking face detec- status information, launch applications, and direct dial or tor and a talking GPS locator. It is also closely linked with message specific contacts [68]. 
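Color identifiers of the kind mentioned above essentially reduce a camera pixel to the nearest entry in a small palette of named colors. A minimal sketch of that lookup follows (our own illustration; the palette values are arbitrary examples rather than those used by any particular app).

```python
# Illustrative nearest-named-color lookup, similar in spirit to the color
# identifier apps mentioned above. The palette is an arbitrary example.
PALETTE = {
    "black": (0, 0, 0),       "white":  (255, 255, 255),
    "red":   (255, 0, 0),     "green":  (0, 128, 0),
    "blue":  (0, 0, 255),     "yellow": (255, 255, 0),
    "grey":  (128, 128, 128), "brown":  (139, 69, 19),
}

def name_color(rgb):
    """Return the palette name closest to an (R, G, B) pixel value."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(PALETTE, key=lambda name: dist2(PALETTE[name], rgb))

if __name__ == "__main__":
    print(name_color((240, 250, 245)))  # -> "white"
    print(name_color((130, 60, 25)))    # -> "brown"
```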
While the Eyes-Free Shell the Zxing barcode scanner and the Google Goggles apps by and the Talking Dialer screens are open, the physical controls allowing for them to be launched from within its own context. on the keyboard and navigation arrows are unresponsive and The vOICe uses pitch for height and loudness for brightness different characters are assigned to the typing keys. Widgets in one-second left to right scans of any view: a rising bright can be used, the volume can be adjusted and users can find line sounds as a rising tone, a bright spot as a beep, a bright answers with Eyes-Free Voice Search. filled rectangle as a noise burst, a vertical grid as a rhythm Another application that is part of the Accessibility Ser- (cf. Section 2.2 and [38]). vice is JustSpeak [69]. The app enables voice control of the Android device, and can be used to activate on-screen 4.5 Applications for iOS controls, launch installed applications, and trigger other com- monly used Android actions. In this subsection, various existing solutions for iOS are pre- Some of the applications may need to use TalkBack as sented. well.
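Speech-based command interfaces such as JustSpeak ultimately map a recognized utterance to a device action. The sketch below (a simplified illustration with hypothetical action names, not the JustSpeak implementation) shows one way such a dispatcher can be structured once a speech recognizer has produced plain text.

```python
# Simplified command dispatcher of the kind a voice-control service needs
# once speech recognition has produced text. Action names are hypothetical.
COMMANDS = {
    "open":   lambda arg: f"launching app '{arg}'",
    "call":   lambda arg: f"dialing contact '{arg}'",
    "volume": lambda arg: f"setting volume to {arg}",
}

def dispatch(utterance):
    """Split a recognized utterance into a verb and an argument, then act."""
    words = utterance.lower().split()
    if not words:
        return "no command recognized"
    verb, arg = words[0], " ".join(words[1:])
    action = COMMANDS.get(verb)
    return action(arg) if action else f"unknown command '{verb}'"

if __name__ == "__main__":
    print(dispatch("open calendar"))
    print(dispatch("call anna"))
    print(dispatch("volume 60"))
```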

4.4 Navigation

4.6 Text, speech and typing

Several Android apps can be used to facilitate navigation One of the most important apps is VoiceOver. It provides sub- for users with different capabilities in various situations (i.e., stantive screen-reading capabilities for Apple’s native apps both indoor and outdoor navigation in structured and unstruc- but also for many third party apps developed for the iOS plat- tured settings). form. VoiceOver renders text on the screen and also employs Talking Location enables users to learn their approximate auditory feedback in response to user interactions. The user position through WiFi or mobile data signals by shaking the can control speed, pitch and other parameters of the audi- device [70]. Although often highly inaccurate, this can nev- tory feedback by accessing the Settings menu. VoiceOver ertheless be the only alternative in indoor navigation, where supports Apple’s Safari web browser, providing element by GPS signals are not available. The app allows users to send element navigation, as well as enabling navigation between SMS messages to friends with their location, allowing them headings and other web page components. This sometimes to obtain help when needed. This is similar to the idea behind provides an added bonus for visually impaired users, as Guard My Angel, which sends SMS messages, and can also the mobile versions of web sites are often simpler and less send them automatically if the user does not provide a “heart- cluttered than their non-mobile counterparts. Importantly beat” confirmation [71]. VoiceOver can easily be switched on and off by pressing the Several “walking straight” applications have been devel- home button three times. This is key if the device is being oped to facilitate straight-line walking [72,73]. Such appli- used alternately by a visually impaired and a sighted user, as cations use built-in sensors (i.e. mostly the magnetic sensor) the way in which interactions work is totally different when to help blind pedestrians. VoiceOver is running. Through the augmentation of mobile capabilities with data Visually impaired users can control their device using services comes the possibility to make combined use of GPS VoiceOver by using their fingers on the screen or by hav- receivers, compasses and map data. WalkyTalky is one of ing an additional keyboard attached. There are some third 123 282 J Multimodal User Interfaces (2015) 9:275–286 party applications which do not conform to Apple guidelines output. While this solution helps to mediate the problem of on application design and so do not work with VoiceOver. headphones masking ambient sounds, its use means that the Voice Over can be used well in conjunction with the user is hearing both ambient sounds and whatever is going oMoby app, which searches the Internet based on photos on in the app, potentially leading to overload of the auditory taken with the iPhone camera and returns a list of search channel or distraction during navigation tasks. results [76]. oMoby also allows for the use of images from With an application called Light Detector, the user can the iPhone photo library and supports some code scanning. It transform any natural or artificial light source he/she encoun- has an unique image recognition capability and is free to use. ters into sound [85]. By pointing the iPhone camera in any Voice Brief is another utility for reading emails, feeds, direction the user will hear higher or lower pitched sounds weather, news etc. [77]. depending on the intensity of the light. 
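A Light Detector-style mapping needs little more than averaging pixel intensity and scaling it into a frequency range; a minimal sketch of that mapping is shown below (our own illustration, not the app's actual code).

```python
# Illustrative brightness-to-pitch mapping in the spirit of Light Detector:
# brighter camera frames produce higher-pitched tones.
def brightness_to_frequency(pixels, f_low=200.0, f_high=2000.0):
    """Map the mean intensity of a grayscale frame (values 0-255) to Hz."""
    mean = sum(pixels) / len(pixels)
    return f_low + (f_high - f_low) * (mean / 255.0)

if __name__ == "__main__":
    dark_frame = [20] * 100
    bright_frame = [230] * 100
    print(f"dark:   {brightness_to_frequency(dark_frame):.0f} Hz")
    print(f"bright: {brightness_to_frequency(bright_frame):.0f} Hz")
```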
Users can check Dragon Dictation can help to translate voice into text [78]. whether the lights are on, where the windows and doors are Speaking and adding punctuation as needed verbally, Apple’s closed, etc. Braille solution, BrailleTouch offers a split-keyboard design Video Motion Alert (VM Alert) is an advanced video in the form of a braille cell to allow a blind user to type [79]. processing application for the iPhone capable of detecting The left side shows dots 1, 2 and 3 while the right side holds motion as seen through the iPhone camera [86]. VM Alert dots 4, 5 and 6. The right hand is oriented over the iPhone’s can use either the rear or front facing camera. It can be con- Home button with the volume buttons on the left edge. figured to sound a pleasant or alarming audible alert when it In a way similar to various Android solutions, the Recog- detects motion and can optionally save images of the motion nizer app allows users to identify cans, packages and ID cards conveniently to the iPhone camera roll. by through camera-based barcode scans [79]. Like Money Reader, this app incorporates object recognition functional- 4.8 Navigation ity. The app stores the image library locally on the phone and does not require an Internet connection. The iOS platform also provides applications for navigation The LookTel Money Reader recognizes currency and assistance. The Maps app provided by default with the iPhone speaks the denomination, enabling people experiencing is accessible in the sense that its buttons and controls are visual impairments or blindness to quickly and easily iden- accessible using VoiceOver, and using these one can set tify and count bills [80]. By pointing the camera of the iOS routes and obtain turn by turn written instructions. device at a bill the application will tell the denomination in Ariadne GPS also works with VoiceOver [87]. Talking real-time. maps allow for the world to be explored by moving a fin- The iOS platform also offers color identification (for free) ger around the map. During exploration, the crossing of a called Color ID Free [81]. street is signaled by vibration. The app has a favorites fea- ture that can be used to announce stops on the bus or train, 4.7 Speech-based command interfaces or to read street names and numbers. It also enables users to navigate large buildings by pre-programming e.g. classroom The TapTapSee app is designed to help the blind and visually locations. Rotating maps keep the user centered, with terri- impaired identify objects they encounter in their daily lives tory behind the user on the bottom of the screen and what [82]. By double tapping the screen to take a photo of anything, is ahead on the top portion. Available in multiple languages, at any angle, the user will hear the app speak the identification Ariadne GPS works anywhere Google Maps are available. back. Similarly to WalkyTalk, low resolution GPS receivers can A different approach is possible through crowdsourced be a problem, however this can be solved through external solutions, i.e. by connecting to a human operator. VizWiz is receivers connected to the device [70]. an application that records a question after taking a picture GPS Lookaround also uses VoiceOver to speak the name of any object [83]. The query can be sent to the Web Worker, of the street, city, cross-street and points of interest [79]. IQ Engines or can be emailed or shared on Twitter. 
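The BrailleTouch layout described above is built around the standard six-dot braille cell, with dots 1-3 under the left hand and dots 4-6 under the right. The sketch below (our own illustration) lists the dot patterns for the letters a-j and matches a finger chord against them, which is essentially the lookup such a keyboard performs.

```python
# Standard six-dot braille patterns for the letters a-j, expressed as sets of
# raised dots (dots 1-3 form the left column, 4-6 the right column).
BRAILLE = {
    "a": {1},          "b": {1, 2},       "c": {1, 4},
    "d": {1, 4, 5},    "e": {1, 5},       "f": {1, 2, 4},
    "g": {1, 2, 4, 5}, "h": {1, 2, 5},    "i": {2, 4},
    "j": {2, 4, 5},
}

def decode_chord(dots):
    """Return the letter whose pattern matches a chord of pressed dots."""
    for letter, pattern in BRAILLE.items():
        if pattern == set(dots):
            return letter
    return None  # unknown or unsupported chord

if __name__ == "__main__":
    print(decode_chord({1, 2, 5}))  # -> "h"
    print(decode_chord({1, 4}))     # -> "c"
```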
Web Users can shake the iPhone to create a vibration and swishing worker is a human volunteer who will review and answer sound indicating the iPhone will deliver spoken information the question. On the other hand, IQ Engines is an image about a location. recognition platform. BlindSquare also provides information to visually A challenging aspect of daily life for visually impaired impaired users about their surroundings [79]. The tool uses users is that headphones block environmental sounds. “Aware- GPS and a compass to identify location. Users can find out ness! The Headphone App” allows one to listen to the details of local points of interest by category, define routes headphones while also hearing the surrounding sounds [84]. to be walked, and have feedback provided while walking. It uses the microphone to pick up ambient sounds while lis- From a social networking perspective, BlindSquare is closely tening to music or using other apps using headphones for linked to FourSquare: it collects information about the user’s 123 J Multimodal User Interfaces (2015) 9:275–286 283 environment from FourSquare, and allows users to check in while helping them work together. The challenge lies is using to FourSquare by shaking the iPhone. open-ended voice-only commands directed at an automatic speech recognition module. 4.9 Gaming and serious gaming Deep sea (A Sensory Deprivation Video Game) and Aurifi are audio-only games in which players have a mask that Although not directly related to assistive technologies, audio- obscures their vision and takes over their hearing, plunging only solutions can help visually impaired users access them into a world of blackness occupied only by the sound training and entertainment using so-called serious gaming of their own breathing or heartbeat [99,100]. Although these solutions [88–90]. games try to mimic “horrifying environments” using audio One of the first attempts was Shades of Doom, an audio- only, at best a limited experience can be provided due to only version of the maze game called Doom, which is not the limited capabilities of the playback system. In general, available on mobile platforms [91]. The user had to navigate audio-only games (especially those using 3D audio, stereo through a maze, accompanied by steps, growling of a monster panning etc.) require high-quality headphones. and the goal was to find the way out without being killed. The game represents interesting new ideas, but the quality of the sound it uses, as well as its localization features have been 5 Towards a research agenda for mobile assistive reported to be relatively poor. technology for visually impaired users A game called Vanished is another “horror game” that relies entirely on sound to communicate with the player. It Given the level of innovation described in the above sections has been released on iPhone, but there is also a version for in this paper, and the growth in take up of mobile devices, Android [92]. A similar approach is BlindSide, an audio-only both phones and tablets, by the visually impaired commu- adventure game promising a fully-immersive 3D world [93]. nity worldwide, the potential and prospects for research into Nevertheless, games incorporating 3D audio and directional effective Interaction Design and User Experience for non- simulation may not always provide high quality experience, visual use of mobile applications is enormous. 
The growing depending on the applied playback system (speakers, head- prevalence of small screen devices, allied to the continual phone quality etc.). growth in the amount of information—including cognitive Papa Sangre is an audio-only guidance-based navigation content—that is of interest to users, means that this research game for both Android and iOS [94]. The user can walk/run will undoubtedly have relevance to sensor-bridging applica- by tapping left and right on the bottom half, and turn by slid- tions targeted at the mobile user population as a whole [101]. ing the finger across the top half of the screen while listening In the following subsections, we set out a number of areas to 3D audio cues for directional guidance. and issues for research which appear to be important to the Grail to the Thief tries to establish a classic adventure further development of the field. game like Zork or Day of the Tentacle, except that instead of text-only descriptions it presents the entire game through 5.1 Research into multimodal non-visual interaction audio [95]. A Blind Legend is an adventure game that uses binaural 1. Mode selection. Where a choice exists, there is relatively audio to help players find their bearings in a 3D environment, little to guide designers on the choice of mode in which and allows for the hero to be controlled through the phone’s to present information. We know relatively little about touchscreen using various multi-touch combinations in dif- which mode, audio or tactile, is better suited to which ferent directions [96]. type of information. Audio archery brings archery to Android. Using only the 2. Type selection: Little is also known about how informa- ears and reflexes the goal is to shoot at targets. The game tion is to be mapped to the selected mode, i.e. whether is entirely auditory. Users hear a target move from left to through (direct) representation-sharing or (analogy- right. The task consists in performing flicking motions on based) representation-bridging as described in [102,103]. the screen with one finger to pull back the bow, and releasing 3. Interaction in a multimodal context. Many studies have it as the target is centered [97]. The game has been reported been conducted on intersensory integration, sensory to work quite well with playback systems with two speakers cross-effects and sensory dominance effects in visu- at a relatively large distance which allows for perceptually ally oriented desktop and virtual applications [104–107]. rich stereo imagery. However, few investigations have taken place, in a non- Agents is an audio-only adventure game in which players visual and mobile computing oriented context, into the control two field agents solely via simulated “voice calls” consumption of information in a primary mode while on their mobile phone [98]. The task is to help them break being presented with information in a secondary mode. into a guarded complex, then to safely make their way out, How does the presence of the other mode detract from 123 284 J Multimodal User Interfaces (2015) 9:275–286

cognitive resources for processing information delivered impaired people. The focus was on the use of sound and vibra- in the primary mode? What information can safely be rel- tion (haptics) whenever applicable. It was demonstrated that egated to yet still perceived effectively in the secondary the most commonly used mobile platforms—i.e. Google’s mode? Android and Apple’s iOS—both offer a large variety of 4. What happens when users themselves are given the abil- assistive applications by using the built-in sensors of the ity to allocate separate information streams to different mobile devices, and combining this sensory information with presentation modes? the capability of handling large datasets, as well as cloud 5. What is the extent of individual differences in the ability resources and crowdsourced contributions in real-time. A to process auditory and haptic information in unimodal recent H2020 project was launched to develop a navigational and multimodal contexts? Is it a viable goal to develop device running on a mobile device, incorporating depth- interpretable tuning models allowing users to fine-tune sensitive cameras, spatial audio, haptics and development feedback channels in multimodal mobile interactions of training methods. [108,109]? 6. Context of use. How are data gained experimentally on Acknowledgments This project has received funding from the Euro- the issues above effected when examined in real, mobile pean Union’s Horizon 2020 research and innovation programme under grant Agreement No. 643636 “Sound of Vision”. This research was contexts of use? Can data curated from vast numbers realized in the frames of TÁMOP 4.2.4. A/2-11-1-2012-0001 “National of crowdsourced experiments be used to substitute, or Excellence Program—Elaborating and operating an inland student and complement laboratory experimentation? researcher personal support system”. The project was subsidized by the European Union and co-financed by the European Social Fund.

5.2 Accessible development lifecycle Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecomm The following issues relate to the fact that the HCI litera- ons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit ture on user-centered prototype development and evaluation to the original author(s) and the source, provide a link to the Creative is written assuming a visual context. Very little is available Commons license, and indicate if changes were made. about how to go about these activities with users with no or little sight. 1. How might the tools used for quickly creating paper- References based visual prototypes be replaced by effective counter- 1. Gaver (1986) Auditory icons: using sound in computer inter- parts for a non-visual context? What types of artifact are faces. Human Comput Interact 2(2):167–177 effective in conducting prototyping sessions with users 2. Gaver W (1988) Everyday listening and auditory icons. PhD the- with little or no vision? It should be a fundamental prop- sis, University of California, San Diego erty of such materials that they can be perceived and 3. Gaver W (1989) The SonicFinder: an interface that uses auditory icons. Human Comput Interact 4(1):67–94 easily altered by the users in order that an effective two- 4. Smith (1975) Pygmalion: a computer program to model and way dialogue between developers and users takes place. stimulate creative thought. PhD thesis, Stanford University, Dept. 2. To what extent are the standard means of collecting eval- of Computer Science uation data from sighted users applicable to working with 5. Blattner M, Sumikawa D, Greenberg (1989) Earcons and icons: their structure and common design principles. Human Comput visually impaired users? For example, how well does Interact 4(1):11–44 speak-aloud protocol work in the presence of spoken 6. Kramer G (ed) (1994) Auditory display: sonification, audifica- screen-reader output? What are the difficulties posed for tion, and auditory interfaces. Santa Fe Institute Studies in the evaluation given the lack of a common vocabulary for Sciences of Complexity. Proceedings volume XVIII. Addison- Wesley, Reading expressing the qualities of sounds and haptic interactions 7. Hermann , Hunt A, Neuhoff JG (2011) The sonification hand- by non-specialist users? book. Logos, Berlin 3. Are there techniques that are particularly effective in eval- 8. Csapo A, Wersenyi G (2014) Overview of auditory representa- uating applications for visually impaired users which give tions in human–machine interfaces. ACM Comput Surveys 46(2) 9. Maclean , Enriquez M (2003) Perceptual design of haptic icons. realistic results while addressing health and safety? For In: In Proceedings of eurohaptics, pp 351–363 example, what is the most effective yet realistic way of 10. Enriquez M, MacLean K (2003) The hapticon editor: a tool in sup- evaluating an early stage navigation app? port of haptic communication research. In: Proceedings of the 11th symposium on haptic interfaces for virtual environment and tele- operator systems (HAPTICS’03). IEEE Computer Society, Los 5.3 Summary Angeles, pp 356–362 11. Brewster S, Brown (2004) Tactons: structured tactile messages for non-visual information display. 
In: Proceedings of the fifth This paper briefly summarized recent developments on conference on Australasian user interface (AUIC’04), vol 28, mobile devices in assistive technologies targeting visually Dunedin, pp 15–23 123 J Multimodal User Interfaces (2015) 9:275–286 285
