Binaural Navigation for the Visually Impaired with a Smartphone
ICMC 2015 – Sept. 25 - Oct. 1, 2015 – CEMI, University of North Texas

Lee Tae Hoon, Manish Reddy Vuyyuru, T Ananda Kumar (NUS High School); Simon Lui (Singapore University of Technology and Design)

ABSTRACT

We aimed to determine the feasibility of binaural navigation and to demonstrate piloting of blindfolded subjects using only recorded (virtual) binaural cues. First, the subjects' localization accuracy, sensitivity to angular deviations and susceptibility to front-back confusion were measured using setups modified from previous work on localization capacity. Afterwards, a software prototype that retrieves the relevant binaural cues from a database to keep a subject on a predetermined path, by determining the subject's head orientation and coordinates, was developed on the Android platform. It was scored by the root-mean-squared deviation (RMSD) of blindfolded subjects from the route. Experimental results show that precise piloting of blindfolded subjects is achievable using only virtual binaural cues. We also discuss the growing potential for more sophisticated uses of binaural media.

Copyright: © 2015 Lee Tae Hoon et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

1. MOTIVATION

The ability to deduce the location of a sound source from auditory cues (localization) is a behavioral instinct developed from birth [1]. By recruiting the visual cortex for localization, the visually impaired process auditory cues better than people without any visual impairment [2]. Even so, outdoor navigation via environmental cues remains remarkably difficult for them. Since audio navigation systems for the visually impaired rely primarily on verbal instructions rather than spatial audio localization [3][4][5], this investigation was initiated to explore the feasibility of recorded (virtual) binaural cues for indoor and outdoor navigation.

2. BACKGROUND

Three main auditory cues are responsible for auditory localization: interaural intensity differences (IID), interaural time differences (ITD) and monaural spectral cues (MSC). IID and ITD account for sound localization laterally (left, ahead, right). The sound received at the contralateral ear is shadowed by the head, which produces the IID; this cue is more pronounced at higher frequencies, where the wavelength of the sound is smaller than the distance between the two ears (>1600Hz). The arrival time delay caused by the physical separation of the ears on the head produces the ITD, which is more pronounced for sounds of lower frequencies (<1600Hz) [6]. MSC account for sound localization in the median plane (front, above, back, below); they arise from the differences in the way sounds interact with the head, torso, pinna and ear canal of the listener.

3. OVERALL METHODOLOGY

30 subjects were selected from an initial pool of volunteers after screening against hearing disabilities using standard hearing tests. Subjects' heads were centered at the origin, and angles to a subject's left of the nasal axis are defined as positive, angles to the right as negative (see Figure 1). Markers were laid at specific angles at a 1.5m radius from the origin. A 0.500s metronome click covering the frequency spectrum from 500Hz to 5000Hz, played over a 100mm diameter speaker levelled to the seated subjects' ear height, served as the audio stimulus. An individualized spatial audio database of virtual binaural cues was built for every subject by recording the clicks played from the marked positions with in-ear binaural microphones. The MS-TFB-2-80049 ultra-low noise microphones are small and worn in-ear, thus capturing MSC without disturbing the anthropometry of the external features, and binaural, thus capturing both ITD and IID.

Figure 1. Setup of Experiment 1.
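In implementation terms, each subject's database amounts to a mapping from recording azimuth to a short stereo click. The following Python sketch is not part of the original paper: the file layout, the soundfile dependency and names such as load_subject_database are assumptions, illustrating one minimal way such an individualized database could be stored and queried for the cue nearest a desired angle.

import glob
import os
import soundfile as sf  # assumed dependency for reading the binaural WAV recordings

def load_subject_database(subject_dir):
    """Load every binaural click recorded for one subject.

    Files are assumed to be named like 'click_-35.wav', where the number is the
    azimuth in degrees (left of the nasal axis positive) it was recorded from.
    Returns a dict mapping azimuth -> stereo sample array.
    """
    database = {}
    for path in glob.glob(os.path.join(subject_dir, "click_*.wav")):
        angle = float(os.path.splitext(os.path.basename(path))[0].split("_")[1])
        samples, _sample_rate = sf.read(path)  # shape (n_samples, 2): left/right ear channels
        database[angle] = samples
    return database

def nearest_cue(database, target_angle):
    """Return the recorded binaural cue whose azimuth is closest to target_angle."""
    best_angle = min(database, key=lambda a: abs(a - target_angle))
    return best_angle, database[best_angle]

A nearest-angle lookup of this kind is the simplest option; interpolating between neighbouring recordings would be a natural refinement, but it is not something the paper describes.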
4. ABSOLUTE LOCALIZATION

4.1 Method (Real Sound Source)

The audio stimulus was played from the reference angles 75°, 55°, 10°, -35° and -65° at a radius of 1.5m from the origin. The angles were deliberately picked from the frontal azimuth, since subjects instinctively turn to face sounds coming from the back before attempting to pinpoint the source. Subjects, unaware of the reference angles, were tasked to point a laser at the source after exposure to each audio stimulus. The laser landed on a sheet of paper arched into a semi-circle of radius 1.8m from the origin and positioned at the height of the subjects' torsos, capturing the localization data. The subjects were blindfolded and fastened to a chair. Head movements, tracked by a laser, were damped with a neck brace. Wrists were immobilized with a device resting on a series of armrests arranged around the subjects at chest level, so that the laser pointer was aligned with the direction the arms were pointing. Every subject underwent a session consisting of 30 trials. In every trial the audio stimulus, repeated 5 times with a delay of 1.10s to allow time for processing, was played from the speaker at the 5 angles in a randomized order. Subjects then pointed the laser at the perceived sound source; the position where the laser landed on the arched sheet was marked and the angular deviation from the actual sound source was measured. By the end of the 30 trials, all 5 reference angles had been tested 30 times each.

4.2 Method (Virtual Sound Source)

Instead of a real sound source (i.e. the speaker), the stimulus is retrieved from the subject's personal spatial audio database and delivered via in-ear stereo earphones to preserve the auditory cues. The stimulus, setup and task are otherwise identical to the method in Section 4.1.

4.3 Results and Discussion

Figure 2. Mean Unsigned Error (MUSE) for real sounds from 3 subjects and the average are shown.

Figure 3. MUSE for virtual sounds from 3 subjects and the average are shown.

Figure 4. Sample Standard Deviation of Signed Errors (SSD SE) for real sounds from 3 subjects and the average are shown.

Figure 5. SSD SE values for virtual sounds from 3 subjects and the average are shown.

No statistically meaningful difference between the localization of real and virtual sound sources is observed. However, single-sample t-test statistics show that there is sufficient evidence at the 1% significance level to conclude that the angular deviation of the perceived source from the real sound source is significant. The most likely culprit, a lack of coordination when pointing at the perceived source, cannot be solely held accountable, because the sample standard deviation of signed errors (SSD SE) increases as the absolute angular distance from the nasal axis increases. This systematic behavior of the SSD SE hints at a psychoacoustic tendency of subjects to perceive sounds away from the source: localization accuracy is greatest at the nasal axis and decays with increasing unsigned angular distance from the axis. This conclusion is in agreement with previous work on human auditory localization of real sound sources [7].

Auditory localization in a navigation task is a continuous process. Sounds presented in a free field are localized by instinctively turning to face the perceived source, which recursively shifts the source closer to the nasal axis for accurate localization. Thus, the localization errors (<15°) far from the nasal axis and the errors (<5°) close to the nasal axis are within reasonable ranges for binaural navigation.
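The summary statistics referred to above (MUSE, SSD SE and the single-sample t-test on the pointing errors) are straightforward to compute from the per-trial angular deviations. The following Python sketch is not the authors' analysis code; it assumes the signed errors for one reference angle are available in degrees and uses NumPy and SciPy.

import numpy as np
from scipy import stats  # assumed dependency for the single-sample t-test

def localization_stats(signed_errors_deg):
    """Summarize pointing errors for one reference angle.

    signed_errors_deg: perceived angle minus true angle, in degrees,
    one value per trial (positive = perceived to the subject's left).
    """
    errors = np.asarray(signed_errors_deg, dtype=float)
    muse = np.mean(np.abs(errors))             # Mean Unsigned Error (MUSE)
    ssd_se = np.std(errors, ddof=1)            # Sample Standard Deviation of Signed Errors (SSD SE)
    t_stat, p_value = stats.ttest_1samp(errors, popmean=0.0)  # is the mean signed deviation nonzero?
    return muse, ssd_se, t_stat, p_value

# Example with 30 hypothetical trials at one reference angle; p < 0.01 would indicate a
# significant deviation of the perceived source from the real source.
muse, ssd_se, t, p = localization_stats([4.2, -1.5, 6.8, 3.1, -0.7] * 6)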
5. RELATIVE LOCALIZATION

Figure 6. Setup of Experiment 2.

Localization errors close to the nasal axis were then investigated. A relative localization paradigm [8] was employed to determine sensitivity to small angular deviations of a sound source about the axis. Only virtual sound sources were used.

5.1 Method

1° intervals were marked from 10° to -10° at a radius of 1.5m from the origin, and the subjects' spatial audio databases were updated with binaural recordings from each of the 21 points. S1 denotes the reference angle, 0°, and S2 any of the 21 points. Subjects were played S1 followed by S2 through in-ear stereo earphones and tasked with determining the position of S2 relative to S1 (right, left, on). By doing so, the errors from a lack of coordination that are inherent to an absolute localization paradigm [9] were eliminated. The responses were categorized into 3 types: Type 0 when subjects correctly perceived a change or the absence of a change and stated the correct position of S2 relative to S1; Type 1 when subjects perceived no change although there was a change; and Type 2 when subjects perceived a change but stated the wrong position of S2 relative to S1. Each subject underwent 30 sessions, with each session involving all 21 recordings played as S2 in a randomized order.

5.2 Results and Discussion

Figure 9.

6.1 Method

MSC can be too subtle for listeners, resulting in front-back confusion [10]. Its extent and its effect on the feasibility of binaural navigation were therefore studied next. The reference angle, S1, is set to -90°, and S2 can take any of 61 values from -60° to -120° at 1° intervals. The experimental method is otherwise identical to that detailed in Section 5.1.

6.2 Results and Discussion
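The three response types used in Sections 5.1 and 6.1 reduce to a simple scoring rule on each pair of true S2 offset and reported answer. The Python sketch below is an illustration of that rule rather than code from the paper; the handling of a false alarm at the 0° point is an assumption, since the paper does not spell out that case.

def classify_response(true_offset_deg, reported):
    """Score one relative-localization trial (Sections 5.1 / 6.1).

    true_offset_deg: signed position of S2 relative to S1 in degrees
                     (positive = to the subject's left, 0 = same position).
    reported: the subject's answer, one of "left", "right" or "on".
    Returns 0, 1 or 2 following the paper's three response types.
    """
    truth = "on" if true_offset_deg == 0 else ("left" if true_offset_deg > 0 else "right")
    if reported == truth:
        return 0          # Type 0: change (or absence of change) and direction both correct
    if truth != "on" and reported == "on":
        return 1          # Type 1: a change was present but not perceived
    # Type 2: a change was perceived but the stated direction is wrong.
    # (False alarms on the 0-degree point are grouped here as well -- our assumption.)
    return 2

# Example: S2 was 3 degrees to the left of S1 but the subject answered "right" -> Type 2.
assert classify_response(3, "right") == 2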