Immersive and Collaborative Data Visualization Using Virtual Reality Platforms
Total Page:16
File Type:pdf, Size:1020Kb
2014 IEEE International Conference on Big Data Immersive and Collaborative Data Visualization Using Virtual Reality Platforms Scott Davidoff, Jeffrey S. Norris Ciro Donalek, S. G. Djorgovski, Alex Cioc, Anwell Wang, Jet Propulsion Laboratory Jerry Zhang, Elizabeth Lawler, Stacy Yeh, Pasadena, CA 91109, USA Ashish Mahabal, Matthew Graham, Andrew Drake [Scott.Davidoff, jeffrey.s.norris]@jpl.nasa.gov California Institute of Technology Giuseppe Longo Pasadena, CA 91125, USA University Federico II [donalek,george]@astro.caltech.edu Napoli, Italy [email protected] Abstract — Effective data visualization is a key part of the The key point is that “big data science” is not about data: discovery process in the era of “big data”. It is the bridge it is about discovery and understanding of meaningful between the quantitative content of the data and human patterns hidden in the data. Massive and complex data sets – intuition, and thus an essential component of the scientific path no matter how content-rich or how expensively obtained – are from data into knowledge and understanding. Visualization is of no use if we cannot discover interesting patterns in them. also essential in the data mining process, directing the choice of the applicable algorithms, and in helping to identify and remove This applies to data both from measurements, and data bad data from the analysis. However, a high complexity or a generated as models or simulations that need to be compared high dimensionality of modern data sets represents a critical and interpreted. obstacle. How do we visualize interesting structures and patterns Visualization is the main bridge between the quantitative that may exist in hyper-dimensional data spaces? A better understanding of how we can perceive and interact with multi- content of the data and human intuition, and it can be argued dimensional information poses some deep questions in the field of that we cannot really understand or intuitively comprehend cognition technology and human-computer interaction. To this anything (including mathematical constructs) that we cannot effect, we are exploring the use of immersive virtual reality visualize in some way. Humans have a remarkable pattern platforms for scientific data visualization, both as software and recognition system in our heads, and the ability for knowledge inexpensive commodity hardware. These potentially powerful discovery in data-driven science depends critically on our and innovative tools for multi-dimensional data visualization can ability to perform effective and flexible visual exploration. also provide an easy and natural path to a collaborative data This may be one of the key methodological challenges for the visualization and exploration, where scientists can interact with data-rich science in the 21st century. their data and their colleagues in the same visual space. Immersion provides benefits beyond the traditional “desktop” Different data analysis methods, tools, or approaches also visualization tools: it leads to a demonstrably better perception of depend critically on the geometry and topology of data a datascape geometry, more intuitive data understanding, and a distribution in the relevant parameter space. A blind better retention of the perceived relationships in the data. application of algorithms based of erroneous assumptions Keywords — astroinformatics; visualization; virtual reality; inevitably leads to wrong or misleading results. Visualization data analysis; big data; pattern recognition thus must be an integral part of any data mining process. It is also essential in the data preparation process, as missing or I. INTRODUCTION potentially anomalous measurements can be readily identified Challenges of the “big data” in terms of the data rates and by inspection, and removed from the analysis. volumes are well known. Even more interesting than the A high complexity or a high dimensionality of modern growth of data volumes is the dramatic increase in data data sets thus represents a critical obstacle: we are biologically complexity: multi-dimensional data sets often combine optimized to see the world and the patterns in it in 3 numerical measurements, images, spectra, time series, dimensions. How do we visualize structures that may exist in categorical labels, text, etc. Feature vectors with tens, hyper-dimensional data spaces? For example, there may exist hundreds, or even thousands of dimensions at once clustering in tens of dimensions, or multivariate correlations encapsulate the key challenge of data-driven discovery: the embedded in higher dimensionality manifolds. In general, scope of data and processing introduce new opportunities to some low-dimensionality projection of high-dimensional find connections in data, while at the same time they demand a structures would smear out the structures that may be present new set of tools to support the particular qualities of in the data, and thus render them effectively unrecognizable. investigating massive data. Various dimensionality reduction methods do exist, but may 978-1-4799-5666-1/14/$31.00 ©2014 IEEE 609 not always be applicable. The more dimensions we can RiftTM headset, Leap MotionTM “3D mouse”, Microsoft visualize effectively, the higher are our chances of recognizing KinectTM sensor), that can be competitive with vastly more potentially interesting patterns, correlations, or outliers. expensive cave-type visualization facilities on analytic power and flexibility, and also an incomparable portability. This paper is a progress report on our initial exploration of the utility and optimal use practices of immersive Virtual Our initial focus is on an effective visualization of highly Reality (VR) as a platform for an interactive, collaborative, dimensional data that can be represented as feature vectors (or scientific data visualization and visual exploration. data points) in some high dimensionality data parameter space. We plan to address visualization of non-discrete or continuum II. VIRTUAL REALITY AS VISUALIZATION PLATFORM variables or multidimensional density fields in the future. VR has been shown to lead to better discovery in domains We also leverage the substantial commercial software that whose primary dimensions are spatial. Early work development for the virtual environments such as video games demonstrated that immersion helps scientists more effectively or virtual worlds (VWs), with the associated engines/libraries, investigate paleontology [1], brain tumors [2], shape perception and focus on the development of scientific data visualization [3], underground cave analysis structures [4], MRI [5], organic tools operating within such environments. At the same time, chemistry [6] and physics [7]. Data visualization has been these platforms provide an easy and natural path to a shown to support highly abstract multi-dimensional analyses. collaborative data visualization and exploration, where Many researchers look to visualization to support exploration scientists can interact with their data and their colleagues in the of large datasets. same visual space. One of our goals is to investigate the confluence of III. SOME PRELIMINARY DEVELOPMENTS immersive virtual reality and abstract visualization, and the interplay between novel technology and human perception to As a first step towards addressing these challenges, we generate insight into both. VR and abstract data visualization conducted some preliminary inquiries and technical have each independently been demonstrated to improve science demonstrations as described below. outcomes, but to the best of our knowledge, there is no We conducted an experiment in the possible scientific and evaluation of how immersive virtual reality can use abstract scholarly uses of virtual worlds (which are general purpose representation of high-dimensional data in support of scientific immersive VR platforms) such as Second LifeTM (SL) and investigation. Because immersive visualization can multiply OpenSimulator (OpenSim), an open source version of the SL the effectiveness of desktop visualization, we propose that software. The initial experiments were done through the immersive visualization become one of the foundations to Meta-Institute for Computational Astrophysics (MICA), the explore the higher dimensionality and abstraction that are first professional scientific organization based in VWs [16, 17, attendant with “big data”. 26], subsequently moving to an OpenSim-based VW, vCaltech. Separately, VR is also proving it can support collaborative Use of such “off the shelf” VWs offers several advantages. tasks. When used as a medium for telepresence [8] – the First, all of the graphical rendering, geometry and interactivity feeling of being there – VR has been shown to increase issues are already solved; all that needs to be done is scripting situation awareness [9] and the vividness [10], interactivity [8] for a specific purpose of data visualization. They come with and media richness [11], as well as the proprioceptive [12] or suitable software libraries and building tools. Their cost is zero kinesthetic [13] properties of remote experiences. These and user interaction is built in. qualities have been shown to enhance collaborative teleoperation scenarios like remote driving [14] and surgery We have developed scripts for a general, immersive [15], just to cite a few. However, so far there has been no visualization of highly dimensional data sets in VWs, using the evaluation of how immersive VR can use abstract SL Linden Scripting Language (LSL),