Analyzing Visual Attention via Virtual Environments
Haikun Huang1 Ni-Ching Lin2 Lorenzo Barrett1 Darian Springer1 Hsueh-Cheng Wang2 Marc Pomplun1 Lap-Fai Yu1
1University of Massachusetts Boston, 2National Chiao Tung University

Figure 1: Humans rely on their visual attention to directional signage to navigate in an unfamiliar environment.

Abstract

The widespread popularity of consumer-grade virtual reality devices such as the Oculus Rift and the HTC Vive provides new, exciting opportunities for visual attention research in virtual environments. Via these devices, users can navigate in and interact with virtual environments naturally, and their responses to different dynamic events can be closely tracked. In this paper, we explore the possibility of using virtual environments to study how directional signage may guide human navigation in an unfamiliar environment.

Keywords: Virtual Environments, Visual Attention, Visualization

Concepts: •Computing methodologies → Perception; Virtual reality;

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). © 2016 Copyright held by the owner/author(s).

1 Introduction

Humans navigate and interact with everyday environments primarily based on their visual attention. For example, a tourist walks to his or her destination in an unfamiliar subway station (Figure 1) by following the directional signage (e.g., text, signs, maps) he or she sees in the environment. Therefore, in order to optimize the flow and safety of human crowds, research on human visual attention is important and can lead to various practical applications that improve the design of public facilities (e.g., subway stations) and increase commercial benefits through the appropriate design of commercial premises (e.g., shopping malls).

The potential benefits brought about by visual attention research are huge, yet much of the existing research is still preliminary and confined to the laboratory. The major bottleneck is that most existing experiments are conducted in highly controlled environments. For example, due to technical limitations, experiments on human visual attention are mostly done on 2D images rather than in realistic 3D environments. Hence, the experimental findings can hardly be generalized to support the development of practical tools for handling everyday scenarios. In recent years, there have been significant breakthroughs in virtual reality technology, which make virtual 3D environments practical for conducting human visual attention experiments. Using state-of-the-art virtual reality devices such as the Oculus Rift VR headset and the Virtuix Omni platform, one can navigate in virtual 3D environments much as in the real world. Visual perception in this type of virtual reality is highly realistic, and perceptual data can be collected for analyzing human visual attention and behavior that are almost identical to those in the real world.

In this project, we propose to conduct visual attention research in virtual reality and to use the analysis results to devise automatic 3D scene analysis tools for everyday scenarios. Specifically, we aim to identify the key visual features that influence human navigation and decision-making in everyday environments. We then train machine learning models to statistically encode these visual features. These models enable us to build 3D scene analysis tools for various applications.

2 Related Work

Visual attention is an important research direction in cognitive science for studying and computationally modeling human behavior. People rely on their visual attention to navigate everyday environments. In order to direct their attention, they continually shift their gaze to retrieve information for decision making [Hwang et al. 2009; Hwang et al. 2011; Wu et al. 2014]. For example, by paying attention to directional signage (e.g., traffic signs), people can cross roads safely. Visual attention is known to be attracted by low-level features such as high contrast, by depictions of certain objects (e.g., cars, pedestrians, or text), and by high-level scene context. The attraction of attention by low-level visual saliency has been studied extensively. Some particular classes of objects, however, are known to attract eye fixations independently of their low-level visual saliency. These include text [Wang and Pomplun 2012; Wang et al. 2012; Wang et al. 2014], people, and other objects with special meaning.

In past decades, visual attention studies were typically carried out using static stimuli (e.g., still 2D images). However, real-world scene inspection in everyday activity is generally dynamic. Eye movements have been tracked on a screen while viewers watch video stimuli (sports and movies), but that scenario is still very different from real-world scene inspection, such as the 3D world perceived by a person walking down a street. Although wearable eye tracking systems have become available, it is still difficult to analyze the correspondence between eye movement data and the objects in real-world 3D environments captured by a scene camera. Recent advances in virtual reality provide a new opportunity to study human visual attention in a much more realistic yet controllable manner, by tracking people's eye fixations in virtual 3D environments.
So far, little attempt has been made to use virtual 3D environments to conduct visual attention research, or to use visual attention data to devise 3D scene analysis tools. Human visual attention data can provide very useful information for analyzing and designing 3D scenes, so that scene designs automatically generated by graphics techniques [Yu et al. 2011; Yu et al. 2016] are optimized with respect to human ergonomics and visual attention. The generated designs can be put to practical use in the real world, for example, when planning the layout of airports or train stations and deciding where to place signs, maps, or other types of information relevant to navigation.

3 Approach

Figure 2: Overview of our approach.

Virtual Reality Platform. We build a virtual reality platform on which a user can freely navigate in and view a virtual environment. Figure 2 shows our virtual reality platform, which is composed of the Virtuix Omni motion-tracking platform and the Oculus Rift headset. Refer to [Avila and Bailey 2014] for details of the devices. The Virtuix Omni tracks the current location of the user in the virtual environment. The Oculus Rift tracks the position and orientation of the user's head (camera), from which we can approximately compute the current position of the user's visual attention. We model the attended region as a conic range centered on the viewing ray, with the intensity of visual attention dropping off following a Gaussian distribution. We collect visual attention data from a number of users and visualize the visual attention pattern by overlaying a heatmap of its intensity onto the virtual environment.

Virtual Environment. We construct a 3D model of the Kenmore subway station in Boston to conduct our experiments. Figure 3 shows some screenshots. The station contains roughly 30 to 40 directional signs, which tell people the correct directions to walk to a platform or an exit of the station. We choose to conduct our experiments in this station because it is a place where most pedestrians navigate with a particular goal in mind (e.g., to take a train at a certain platform), and their navigation decisions rely heavily on the instructions given by the directional signage.

Figure 3: The virtual Kenmore subway station showing different directional signs.

We build the 3D model of the station manually. First, we perform a visual inspection of the station, taking pictures and marking the locations of all the directional signage. Then, based on the station's indoor map, we build the 3D layout of the station and add realistic details such as chairs, trash bins, ticket machines, and gates.

4 Conclusion

We proposed a novel virtual reality platform for conducting visual attention experiments. In particular, we discussed how such a platform can be used to study how human visual attention and navigation behavior are guided by the directional signage in an environment. We believe this platform will be useful for conducting many other visual attention experiments. In the future, we will also incorporate an eye tracking device into our virtual reality platform to obtain more accurate positions of visual attention.

References

AVILA, L., AND BAILEY, M. 2014. Virtual reality for the masses. IEEE Computer Graphics and Applications, 5, 103–104.

HWANG, A. D., HIGGINS, E. C., AND POMPLUN, M. 2009. A model of top-down attentional control during visual search in complex scenes. Journal of Vision 9, 5, 25.

HWANG, A. D., WANG, H.-C., AND POMPLUN, M. 2011. Semantic guidance of eye movements in real-world scenes. Vision Research 51, 10, 1192–1205.

MAW, N. N., AND POMPLUN, M. 2004. Studying human face recognition with the gaze-contingent window technique. In CogSci.

WANG, H.-C., AND POMPLUN, M. 2012. The attraction of visual attention to texts in real-world scenes. Journal of Vision 12, 6, 26.

WANG, H.-C., LU, S., LIM, J.-H., AND POMPLUN, M. 2012. Visual attention is attracted by text features even in scenes without text. In CogSci.

SA '16, December 05–08, 2016, Macao. ISBN: 978-1-4503-4548-4/16/12. DOI: http://dx.doi.org/10.1145/2992138.2992152