A device for human ultrasonic echolocation

Jascha Sohl-Dickstein, Santani Teng, Benjamin M. Gaub, Chris C. Rodgers, Crystal Li, Michael R. DeWeese and Nicol S. Harper

Jascha Sohl-Dickstein is with Stanford University and Khan Academy (e-mail: [email protected]). Santani Teng is with Massachusetts Institute of Technology (e-mail: [email protected]). Benjamin M. Gaub is with University of California, Berkeley (e-mail: [email protected]). Chris C. Rodgers is with Columbia University (e-mail: [email protected]). Crystal Li is with University of California, Berkeley (e-mail: [email protected]). Michael R. DeWeese is with University of California, Berkeley (e-mail: [email protected]). Nicol S. Harper is with University of Oxford (e-mail: [email protected]).

Abstract—Representing and interacting with the external world constitutes a major challenge for persons who are blind or visually impaired. Nonvisual sensory modalities such as audition and somatosensation assume primary importance. Some non-human species, such as bats and dolphins, sense by active echolocation, in which emitted acoustic pulses and their reflections are used to sample the environment. Active echolocation is also used by some blind humans, who use signals such as tongue clicks or cane taps as mobility aids. The ultrasonic pulses employed by echolocating animals have several advantages over those used by humans, including higher spatial resolution and fewer diffraction artifacts. However, they are unavailable to unaided humans. Here we present a device that combines principles of ultrasonic echolocation and spatial hearing to present human users with environmental cues that are i) not otherwise available to the human auditory system and ii) richer in object and spatial information than the more heavily processed sonar cues of other assistive devices. The device consists of a wearable headset with an ultrasonic emitter and stereo microphones affixed with artificial pinnae. The echoes of ultrasonic pulses are recorded and time-stretched to lower their frequencies into the human auditory range before being played back to the user. We tested performance among naive and experienced sighted volunteers using a set of localization experiments. Naive subjects were able to make laterality and distance judgments, suggesting that the echoes provide innately useful information without prior training. Naive subjects were unable to make elevation judgments. However, in localization testing one trained subject demonstrated an ability to judge elevation as well. This suggests that the device can be used to effectively examine the environment and that the human auditory system can rapidly adapt to these artificial echolocation cues. We contend that this device has the potential to provide significant aid to blind people in interacting with their environment.

Index Terms—echolocation, ultrasonic, blind, assistive device

I. INTRODUCTION

In environments where vision is ineffective, some animals have evolved echolocation — perception using reflections of self-made sounds. Remarkably, some blind humans are also able to echolocate to an extent, frequently with vocal clicks. However, animals specialized for echolocation typically use much higher sound frequencies for their echolocation, and have specialized capabilities to detect time delays in sounds. The most sophisticated echolocation abilities are found in microchiropteran bats (microbats) and odontocetes (dolphins and toothed whales). For example, microbats catch insects on the wing in total darkness, and dolphins hunt fish in opaque water. Arguably simpler echolocation is also found in oilbirds, swiftlets, Rousettus megabats, some shrews and tenrecs, and even rats [1]. Microbats use sound frequencies ranging from 25–150 kHz in echolocation, and use several different kinds of echolocation calls [2]. One call, the broadband call or chirp, a brief tone (< 5 ms) sweeping downward over a wide frequency range, is used for localization at close range. A longer-duration call, the narrowband call, with a narrower frequency range, is used for detection and classification of objects, typically at longer range. In contrast to microbats, odontocetes use clicks, shorter in duration than bat calls and with sound frequencies up to 200 kHz [3]. Odontocetes may use shorter calls because sound travels ∼4 times faster in water, whereas bat calls may be longer to have sufficient energy for echolocation in air. Dolphins can even use echolocation to detect features that are unavailable via vision: for example, dolphins can tell visually identical hollow objects apart based on differences in thickness [4].

Humans are not typically considered among the echolocating species. Remarkably, however, some blind persons have demonstrated the use of active echolocation in their daily lives, interpreting reflections from self-generated tongue clicks for such tasks as obstacle detection [5], distance discrimination [6], and object localization [7], [8]. The underpinnings of human echolocation in blind (and sighted) people remain poorly characterized, though some informative cues [9], neural correlates [10], and models [11] have been proposed. While the practice of active echolocation via tongue clicks is not commonly taught, it is recognized as an orientation and mobility method [12], [13]. However, most evidence in the existing literature suggests that human echolocation ability, even in blind, trained experts, does not approach the precision and versatility found in organisms with highly specialized echolocation mechanisms.

An understanding of the cues underpinning human auditory spatial perception is crucial to the design of an artificial echolocation device. Lateralization of sound sources depends heavily on binaural cues in the form of timing and intensity differences between sounds arriving at the two ears. For vertical and front/back localization, the major cues are direction-dependent spectral transformations of the incoming sound induced by the convoluted shape of the pinna, the visible outer portion of the ear [14]. Auditory distance perception is less well characterized than the other dimensions, though evidence suggests that intensity and the ratio of direct-to-reverberant energy play major roles in distance judgments [15]. Notably, the ability of humans to gauge distance using pulse-echo delays has not been well characterized, though these serve as the primary distance cues for actively echolocating animals [16].

Here we present a device, referred to as the Sonic Eye, that uses a forehead-mounted speaker to emit ultrasonic "chirps" (FM sweeps) modeled after bat echolocation calls. The echoes are recorded by bilaterally mounted ultrasonic microphones, each mounted inside an artificial pinna, also modeled after bat pinnae to produce direction-dependent spectral cues. After each chirp, the recorded chirp and reflections are played back to the user at 1/m of normal speed, where m is an adjustable magnification factor. This magnifies all temporally based cues linearly by a factor of m and lowers frequencies into the human audible range. For empirical results reported here, m is 20 or 25 as indicated. That is, cues that are normally too high or too fast for the listener to use are brought into the usable range simply by replaying them more slowly.

Fig. 1. (a) Diagram of components and information flow. (b) Photograph of the current hardware. (c) Photograph of one of the artificial pinnae used, modeled after a bat ear.

Although a number of electronic travel aids that utilize sonar have been developed (e.g., [17], [18], [19], [20]), very few provide information other than range-finding or a processed localization cue, and none appears to be in common use. Studies of human hearing suggest that it is very adaptable to altered auditory cues, such as those provided by different binaural spacing or altered pinna shapes. Additionally, in blind subjects the visual cortex can be recruited to also represent auditory cues [10]. Our device is designed to utilize the intrinsic cue-processing and learning capacity of the human auditory system [21], [22]. Rather than provide a heavily processed input to the user as other devices have, which ultimately will be relatively impoverished in terms of information about the world, we provide a minimally processed input that, while initially challenging to use, has the capacity to be much more informative about the world and to integrate better with the innate human spatial hearing system. The relatively raw echoes contain not just distance information but vertical and horizontal location, as well as texture, geometric, and material cues. Behavioral testing suggests that novice users can quickly judge the laterality and distance of objects, and with experience can also judge elevation; the Sonic Eye thus demonstrates potential as an assistive mobility device.

Fig. 2. Schematic of waveforms at several processing stages, from ultrasonic speaker output to the stretched pulse-echo headphone signal presented to the user. Red traces correspond to the left ear signal, and blue traces to the right ear signal.
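The core processing step, recording at an ultrasonic sample rate and replaying the samples m times slower, can be sketched numerically. The following is a minimal illustration only, not the device's actual firmware; the sample rate, chirp band, and object distance are invented for the example:

```python
import numpy as np

fs = 400_000   # capture sample rate in Hz (illustrative, not the device's)
m = 20         # magnification factor, as in the text

# A downward FM sweep, loosely modeled on a microbat broadband call:
# 5 ms, sweeping 80 kHz down to 40 kHz.
dur, f_hi, f_lo = 0.005, 80_000.0, 40_000.0
t = np.arange(0.0, dur, 1.0 / fs)
phase = 2 * np.pi * (f_hi * t + (f_lo - f_hi) / (2 * dur) * t**2)
chirp = np.sin(phase)

# An object d meters away reflects the chirp back after a round trip of 2d/c.
c, d = 343.0, 1.0                 # speed of sound in air (m/s), distance (m)
k = int(round(2 * d / c * fs))    # pulse-echo delay in samples (~5.8 ms)

mix = np.zeros(k + chirp.size)
mix[:chirp.size] += chirp         # outgoing chirp
mix[k:] += 0.3 * chirp            # attenuated echo

# "Time stretching" is simply playback at fs/m: every frequency is divided
# by m and every interval, including the pulse-echo delay, grows by m.
fs_play = fs / m
heard_band = (f_lo / m, f_hi / m)   # (2.0 kHz, 4.0 kHz): comfortably audible
heard_delay = k / fs_play           # ~117 ms: easily perceived by a listener
```

Note that no resampling is required: the same samples are handed to the audio output at a lower rate, which is part of what makes the cues "minimally processed."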
A preliminary version of this work was presented in poster

tapered using a cosine
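The section breaks off mid-sentence here, but the fragment indicates that the pulse is tapered using a cosine ramp, a standard way to suppress the broadband "click" that an abrupt onset or offset would add to the emitted chirp. A generic sketch of such a taper follows; the ramp length and sample rate are invented parameters, not the device's actual settings:

```python
import numpy as np

def cosine_taper(x, fs, ramp_ms=0.5):
    """Fade x in and out with raised-cosine ramps of ramp_ms milliseconds,
    suppressing the spectral splatter of a hard onset/offset."""
    n = int(round(ramp_ms * 1e-3 * fs))
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(n) / n))  # rises 0 -> ~1
    y = np.asarray(x, dtype=float).copy()
    y[:n] *= ramp          # fade in
    y[-n:] *= ramp[::-1]   # fade out
    return y

# Tapering a constant pulse: the edges go to zero, the middle is untouched.
pulse = np.ones(2000)
tapered = cosine_taper(pulse, fs=400_000, ramp_ms=0.5)
```

This is equivalent in spirit to windowing the pulse with a Tukey (cosine-tapered) window; only the short onset and offset regions are attenuated, so the chirp's frequency content is essentially preserved.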