Advanced Visualization and Interactive Applications in the Immersive Cabin

Kaloian Petkov and Arie Kaufman, Fellow, IEEE
Department of Computer Science and CEWIT, Stony Brook University, Stony Brook, NY 11794-4400, USA
[kpetkov, ari]@cs.stonybrook.edu

Abstract— We present a visualization and interaction framework for the Immersive Cabin (IC), which is a 5-wall CAVE-like immersive 3D environment. The back-end rendering is provided by a cluster of 5 workstations with 2 GPUs each. We utilize a pair of projectors for each wall and an external LCD shutter system that is synchronized with active stereo glasses. Each workstation produces a synchronized pair of stereoscopic images for a single wall in the IC. If only one GPU is dedicated to rendering, the second one is utilized as a computational device using C/C++ and the NVIDIA CUDA extensions. In addition to the depth cues and surround immersion from the visualization system, we use wireless optical head and hand tracking to further enhance the data exploration capabilities. Combined with a range of interaction and navigation tools, our system can support a variety of interactive applications, including architectural and automotive pre-visualization, urban planning, medical imaging, and simulation and rendering of physical phenomena.

Keywords-Virtual Reality, CAVE, Immersive Cabin, GPU, Virtual Colonoscopy

I. INTRODUCTION

The size of data has grown in recent years to the point where traditional techniques for its visualization and analysis are not sufficient. It is not uncommon to be presented with multi-gigabyte volumetric datasets from fluid dynamics simulations, high resolution CT scans, or data structures so complex that projections onto a 2D display are ineffective. A number of techniques have been developed for the visualization of large-scale complex data. Facilities such as the CAVE [1] and devices such as Head Mounted Displays (HMDs) provide a much larger field of view into a virtual environment and further improve the perception of spatial relationships in the data by utilizing stereoscopic rendering. Despite their advantages, both solutions present a number of challenges that ultimately limit their appeal and traction outside the visualization field. HMDs are usually bulky and heavily wired, and although they present a visually unobstructed view into the data, their usage is associated with eye strain, headaches and other medical issues. CAVEs on the other hand offer a more natural visualization environment that is amenable to augmented reality applications. However, building fully enclosed facilities remains a challenge and many installations have a limited number of display surfaces. Although fully enclosed CAVEs exist [2], they present a number of engineering challenges that result in significant cost increases.

We have proposed and constructed the Immersive Cabin (IC) [3] as a robust and affordable design for a fully enclosed visualization environment. The IC consists of 4 projection walls and a floor, and the design is compatible with multiple projectors per surface and the new 120Hz stereo projectors. Our design utilizes an automatic door to provide access into the IC, which compares favorably cost-wise with facilities that use heavy duty machinery to move the back wall. Our facility has been used for a number of projects and in this paper we present a summary of the advances in terms of the software framework, the IC instruments and some of the applications.

II. EQUIPMENT AND CONSTRUCTION

A. Immersive Cabin

The IC is a compact, fully enclosed visualization environment with 5 projection surfaces and an automatic door as part of the back wall. Each projection node is equipped with the following:

• Dual Hitachi SXGA+ projectors
• Beacon SX+ shutter and synchronization
• IR emitter

The layout of the installation is illustrated in Fig. 1 and additional details about the construction are available in the paper by Qiu et al. [3]. The shutters in front of the projectors are connected to the Beacon system and operate at 120Hz. The IR emitter is also connected to the Beacon system and allows for synchronization with the LCD shutter glasses. The 5 Beacon systems are daisy-chained so that all the shutters in the IC are synchronized.

Figure 1. Diagram for an installation of the Immersive Cabin.

Each pair of projectors is driven by a single workstation with two NVIDIA Quadro FX 4600 boards. Although perfect synchronization between the GPUs and the projectors is possible with the use of NVIDIA G-Sync boards, our setup can operate with software-based framelocking. The G-Sync boards are mandatory for installations without an external shutter system, e.g. when using native 120Hz projectors for stereoscopic rendering.
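Concretely, the per-node drawing loop can be organized as in the following minimal sketch of quad-buffered active stereo. The Camera type, drawScene and swapBuffersWhenClusterReady are hypothetical placeholders, not the IC framework's actual code:

    // Minimal sketch (hypothetical API): one active-stereo frame on a
    // render node, drawn into an OpenGL quad-buffer stereo context.
    #include <GL/gl.h>

    struct Camera { Camera shiftedHorizontally(float d) const; /* view state */ };
    void drawScene(const Camera& eyeCam);       // assumed scene-graph render call
    void swapBuffersWhenClusterReady();         // assumed cluster-wide swap barrier

    void renderStereoFrame(const Camera& cam, float eyeSeparation)
    {
        glDrawBuffer(GL_BACK_LEFT);             // left-eye image
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        drawScene(cam.shiftedHorizontally(-0.5f * eyeSeparation));

        glDrawBuffer(GL_BACK_RIGHT);            // right-eye image
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        drawScene(cam.shiftedHorizontally(+0.5f * eyeSeparation));

        // With G-Sync hardware the swap is locked across nodes in hardware;
        // otherwise this call blocks until all nodes report a completed
        // frame (see the software framelock in Section III.A).
        swapBuffersWhenClusterReady();
    }

The external Beacon shutters and the synchronized glasses then alternate the two eye images at 120Hz.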

This work was partially supported by NSF grants CCF-0702699, CCF-0448399, DMS-0528363, DMS-0626223, NIH grant CA082402, and the Center of Excellence in Wireless and Information Technology (CEWIT).

Our cluster contains a single head node which is responsible for managing all the machines and processing all user interactions. In addition, we have a 32-node cluster with dual CPUs at each node and an NVIDIA Quadro FX 4500 graphics board. These machines are not used for direct output to the projection surfaces in the IC and are instead dedicated to off-line rendering and natural phenomena simulations.

B. Tracking System

The IC utilizes an infrared tracking system from NaturalPoint that contains 8 IR cameras with wide-angle lenses, 2 synchronization hubs and the ARENA motion capture software. The cameras are mounted along the top edge of the IC walls and provide sufficient coverage for head and gesture tracking throughout most of the interior space. The synchronization hubs are connected to the sync signal of the Beacon shutter system so that the pulsed IR illumination from the cameras does not interfere with the IR signal for the shutter glasses. We use between 3 and 5 markers to define rigid bodies for the tracking software, and special care has to be taken so that the marker arrangements are sufficiently different in the topological sense. For the LCD shutter glasses we use a 5 tape marker pattern that reduces the occurrence of optical occlusions, while for other objects, 3 markers may be sufficient. In the case of hand gloves, protruding spherical markers are used, as occlusions prevent the efficient use of flat tape markers.

Fig. 2 illustrates the coverage areas for tracking that we achieve in the IC. The blue cloud represents the area that is seen by a specified number of cameras, and we provide the results for coverage by 3, 6 and 8 cameras. A point needs to be seen by at least 3 cameras in order to obtain its position in 3D space, and a rigid body can contain between 3 and 6 points. Although most of the volume in the IC is covered by at least 3 cameras, the illustration in Fig. 2 does not account for optical occlusions, which are a significant problem. Rather, we have optimized the camera placement for optimal tracking near the center of the IC and towards the front screen.

Figure 2. Blue cloud represents the tracking coverage areas seen simultaneously by (left to right) 3, 6 and 8 cameras. The positions near the center of the IC that are seen by 6 cameras are optimal for head and gesture tracking since optical occlusions are minimized.

C. Input Devices

Our system supports standard input devices such as keyboards and mice, although they are not particularly suitable for the exploration of virtual environments. We primarily rely on wireless gamepads as the main navigation tool, and the framework currently supports the Logitech Rumblepad 2 and the Xbox 360 wireless controllers. In our experiments, the 3DConnexion Space Navigator has provided superior performance for certain medical and architectural applications as it combines 6 degrees of freedom in a very intuitive and compact device. However, in practice, the absence of native wireless models greatly complicates the setup.

Our IR tracking system is deployed to provide a different modality of user interactions. Head tracking is used to interact with the visual system directly by allowing the user unobstructed movement inside the volume of the IC. The position and orientation of the user are both taken into account when rendering the virtual world, which allows, for example, peeking around a corner or simulating an object that is inside the IC. In addition, the tracking system allows the development of a more intuitive user interface that relies on natural gestures instead of tracked devices. The off-axis view frustum induced by the tracked head position is computed per wall, as sketched below.
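The sketch below follows the standard generalized perspective projection construction for one wall. The small vector helpers are local to the example; the wall corners pa, pb, pc (lower-left, lower-right, upper-left) and the tracked eye position pe are assumed to share the tracker's coordinate system:

    #include <cmath>

    // Minimal vector helpers for the sketch.
    struct Vec3 { float x, y, z; };
    static Vec3  sub(Vec3 a, Vec3 b) { return {a.x-b.x, a.y-b.y, a.z-b.z}; }
    static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    static Vec3  cross(Vec3 a, Vec3 b) {
        return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
    }
    static Vec3  normalize(Vec3 v) {
        float s = 1.0f / std::sqrt(dot(v, v));
        return {v.x*s, v.y*s, v.z*s};
    }

    // Off-axis frustum extents for one wall, given its lower-left,
    // lower-right and upper-left corners (pa, pb, pc) and the tracked
    // eye position pe; returns glFrustum-style parameters for near plane n.
    void wallFrustum(Vec3 pa, Vec3 pb, Vec3 pc, Vec3 pe, float n,
                     float& l, float& r, float& b, float& t)
    {
        Vec3 vr = normalize(sub(pb, pa));       // wall right axis
        Vec3 vu = normalize(sub(pc, pa));       // wall up axis
        Vec3 vn = normalize(cross(vr, vu));     // wall normal, towards viewer

        Vec3 va = sub(pa, pe), vb = sub(pb, pe), vc = sub(pc, pe);
        float d = -dot(va, vn);                 // eye-to-wall distance

        l = dot(vr, va) * n / d;
        r = dot(vr, vb) * n / d;
        b = dot(vu, va) * n / d;
        t = dot(vu, vc) * n / d;
        // The full view transform additionally rotates world space into the
        // wall basis (vr, vu, vn) and translates the eye to the origin.
    }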
III. RENDERING FRAMEWORK

A. Framework Design

Designing a flexible framework for distributed GPU rendering is generally a challenging task. Compared to a traditional 3D engine, our system needs to support a network of workstations with single or multiple rendering pipes, multiple display surfaces per node and stereo rendering. Support for more traditional display layouts, such as multiple angled monitors on a desk, is essential for debugging and application development. The Chromium library [4] provides support for rendering on a cluster of nodes by directly manipulating the stream of rendering commands issued by a graphics API such as OpenGL. A set of stream filters is used at each rendering node to create a parallel graphics system in which the visualization application is not necessarily aware of the underlying rendering platform. While this approach greatly simplifies the process of creating Virtual Reality applications, network bandwidth becomes a major bottleneck with modern applications that may utilize hundreds of high resolution textures, advanced shaders and multiple rendering passes per frame [5]. CGLX [6] follows a similar approach by providing a replacement for the OpenGL library that abstracts the visualization surfaces. VRJuggler [7] and Equalizer [8] on the other hand operate at a higher level, using a runtime library to manage multiple instances of the application on the cluster computers. With this approach, only the user input is transmitted over the network, which results in significant bandwidth savings. On the other hand, the visualization framework is less transparent to the programmer and may require significant changes to existing code.

Our framework for distributed rendering is based on a master-slave model similar to the architecture of VRJuggler, in which instances of the application are executed on each of the rendering nodes. The user interacts directly with the head node of the cluster, which is responsible for processing user input, tracking data and physics interactions. The head node also transmits updates for dynamic scene elements (e.g. cameras and objects under the control of the physics engine). For clusters with genlock/framelock capabilities, rendering synchronization between the cluster nodes is handled in hardware; otherwise our framework falls back to a software implementation of framelocking based on the head node's message passing capabilities, as sketched below.
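One possible organization of such a master-slave frame loop is sketched here. The Cluster type and its message passing primitives are hypothetical stand-ins for the framework's actual networking layer:

    // Hypothetical sketch of the master-slave frame loop with a
    // software framelock; not the IC framework's actual API.
    struct FrameState {
        unsigned frameId;
        float    cameraMatrix[16];   // head-tracked camera for this frame
        // ... other dynamic scene elements (physics-driven objects, etc.)
    };

    struct Cluster {                 // assumed message passing wrapper
        void       broadcastState(const FrameState&);
        void       broadcastSwapToken(unsigned frameId);
        void       waitForAllFrameAcks(unsigned frameId);
        FrameState receiveState();
        void       sendFrameAck(unsigned frameId);
        void       waitForSwapToken(unsigned frameId);
    };
    void updateFromInputTrackingAndPhysics(FrameState&);  // assumed
    void renderFrame(const FrameState&);                  // both eyes, cf. Sec. II.A
    void swapBuffers();                                   // assumed platform swap

    // Head node: process input, tracking and physics, then release the frame.
    void headNodeFrame(Cluster& net, FrameState& state)
    {
        updateFromInputTrackingAndPhysics(state);
        net.broadcastState(state);               // dynamic scene updates
        net.waitForAllFrameAcks(state.frameId);  // every node has rendered
        net.broadcastSwapToken(state.frameId);   // all nodes swap together
        ++state.frameId;
    }

    // Render node: render on receipt, then block until the swap token.
    void renderNodeFrame(Cluster& net)
    {
        FrameState s = net.receiveState();
        renderFrame(s);
        net.sendFrameAck(s.frameId);
        net.waitForSwapToken(s.frameId);         // software framelock barrier
        swapBuffers();
    }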
B. SceniX Rendering

The main rendering capabilities of our framework are based on the NVIDIA SceniX library [9]. SceniX provides a robust implementation of the scene graph with a plug-in mechanism for loading and saving of scene data and related file formats. Our assets are primarily stored in the COLLADA format [10], although we have developed custom plug-ins for a number of data formats used in other scene graph implementations and our own internal research. The rendering system uses the most current implementation of OpenGL, the NVIDIA Cg 3 language for writing shaders, and supports modern GPU features such as geometry shaders and GPU-based tessellation. A generic sketch of such a loader plug-in mechanism is shown below.
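The sketch below illustrates one way a loader plug-in registry keyed by file extension can be organized. The interface is purely illustrative and does not reproduce SceniX's actual plug-in API:

    #include <map>
    #include <memory>
    #include <string>

    struct SceneNode;  // root of a loaded subgraph (assumed scene-graph type)

    class SceneLoader {
    public:
        virtual ~SceneLoader() = default;
        virtual std::shared_ptr<SceneNode> load(const std::string& path) = 0;
    };

    class LoaderRegistry {
        std::map<std::string, std::unique_ptr<SceneLoader>> byExtension;
    public:
        void add(const std::string& ext, std::unique_ptr<SceneLoader> l) {
            byExtension[ext] = std::move(l);
        }
        // Dispatch on the file extension, e.g. ".dae" -> COLLADA plug-in.
        std::shared_ptr<SceneNode> load(const std::string& path) {
            const auto dot = path.rfind('.');
            if (dot == std::string::npos) return nullptr;
            auto it = byExtension.find(path.substr(dot));
            return it == byExtension.end() ? nullptr : it->second->load(path);
        }
    };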
C. Single-Pass Volume Rendering

Our volume rendering module is tightly integrated with the scene graph. We have implemented a GPU-based single-pass ray-casting approach [11] with iso-surface extraction, a Cook-Torrance light transport model and distance-based light attenuation. The code is implemented in the NVIDIA Cg language and the resulting shader is bound to standard primitives in the scene graph. A 3D box is sufficient to initialize the starting and ending positions for the volume integration pass so that the final images represent a valid stereoscopic pair. In addition, we intersect the box with the near clipping plane of the scene's camera so that we have valid starting points for user positions inside the volume data. This is essential for medical applications that involve navigating inside an organ. The rays' ending positions are also modified by the scene geometry so that we can support the rendering of environmental effects such as smoke around an airplane. As can be seen in Fig. 3, the smoke correctly interacts with the solid object in the scene. The core ray-marching loop is sketched below.

Figure 3. Volume rendering for [left] medical visualization and [right] environmental effects.
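For concreteness, this CPU-side reference sketch shows the front-to-back integration loop with early ray termination; the actual IC implementation is a Cg fragment shader. sampleVolume and transferFunction are assumed helpers, and the Vec3 utilities are those from the projection sketch in Section II:

    #include <cmath>

    struct RGBA { float r, g, b, a; };
    float sampleVolume(Vec3 p);        // assumed trilinear density lookup
    RGBA  transferFunction(float v);   // assumed classification + shading

    RGBA castRay(Vec3 entry, Vec3 exit, float step)
    {
        RGBA  dst = {0, 0, 0, 0};
        Vec3  dir = normalize(sub(exit, entry));
        float len = std::sqrt(dot(sub(exit, entry), sub(exit, entry)));

        // March from the entry point (box or near-plane intersection) to
        // the exit point (box face or scene geometry, cf. smoke in Fig. 3).
        for (float s = 0.0f; s < len; s += step) {
            Vec3  p   = {entry.x + dir.x*s, entry.y + dir.y*s, entry.z + dir.z*s};
            RGBA  src = transferFunction(sampleVolume(p));

            // Front-to-back compositing.
            dst.r += (1 - dst.a) * src.a * src.r;
            dst.g += (1 - dst.a) * src.a * src.g;
            dst.b += (1 - dst.a) * src.a * src.b;
            dst.a += (1 - dst.a) * src.a;

            if (dst.a > 0.99f) break;  // early ray termination
        }
        return dst;
    }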
D. OptiX Rendering

The standard OpenGL rendering can produce high quality images at real-time rates; however, it is not particularly suitable for certain visual effects that are related to the transport of light in the scene. In particular, sharp specular reflections and refractions, accurate shadows and global illumination effects are difficult to produce. Such effects are typically associated with high quality off-line rendering and ray-tracing. A number of GPU-based ray-tracing algorithms have been proposed that can achieve interactive frame-rates even for large geometric models [12-15]. Our raytracing module is based on NVIDIA OptiX [15] and it is tightly coupled into the scene graph. Since both raytracing and OpenGL rendering traverse the same underlying scene representation, using OptiX as a renderer requires only minor changes to the user interface. Implementing advanced effects such as global illumination requires writing additional OptiX programs in the CUDA language. The performance is significantly lower than with OpenGL; however, we can achieve interactive rendering for scenes with low to moderate complexity (Fig. 4).

Figure 4. Complex reflections and shadows at real-time speeds with NVIDIA OptiX.

E. Off-line Advanced Rendering

Even though the OptiX rendering can produce high quality images at interactive frame rates, the quality of the output is reduced compared to traditional Reyes rendering systems [16]. We have implemented a translation from the SceniX scene graph to a Renderman-compatible interface, as sketched below. This module has been tested with Pixar's Renderman Pro Server and an open-source Renderman-compliant renderer. Our second implementation uses the now deprecated NVIDIA Gelato renderer, which is based on a shading language similar to Renderman, but uses NVIDIA Quadro-based GPUs to accelerate certain computations. The comparison between the rendered outputs is presented in Fig. 5. In terms of deployment to the IC, off-line rendering is currently executed individually on each rendering node and for each display surface. Although this allows for immediate visual feedback as parts of the images are rendered, it is highly inefficient. A future direction would be to dedicate nodes from the Visualization Cluster for back-end rendering only and process all surfaces at the same time.

Figure 5. Chess scene rendered with (left) OpenGL, (center) NVIDIA Gelato and (right) Renderman Pro Server.
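To make the scene-graph-to-Renderman translation concrete, the sketch below emits a minimal RIB (RenderMan Interface Bytestream) stream from a depth-first traversal. SceneNode, Mesh and their accessors are assumed placeholders; the RIB requests themselves (Display, Projection, WorldBegin, ConcatTransform, PointsPolygons) are standard:

    #include <cstdio>
    #include <string>
    #include <vector>

    struct Mesh { std::string pointsPolygonsRIB() const; };   // assumed accessor
    struct SceneNode {
        std::string localMatrixRIB() const;                   // 16 floats as RIB text
        const Mesh* mesh() const;                             // null for group nodes
        const std::vector<SceneNode>& children() const;
    };

    void emitNode(FILE* rib, const SceneNode& node)
    {
        std::fprintf(rib, "AttributeBegin\n");
        std::fprintf(rib, "ConcatTransform [%s]\n", node.localMatrixRIB().c_str());
        if (node.mesh()) {
            std::fprintf(rib, "Surface \"plastic\"\n");       // placeholder material
            std::fprintf(rib, "%s\n", node.mesh()->pointsPolygonsRIB().c_str());
        }
        for (const SceneNode& child : node.children())
            emitNode(rib, child);                             // depth-first traversal
        std::fprintf(rib, "AttributeEnd\n");
    }

    void exportFrame(FILE* rib, const SceneNode& root, const char* image)
    {
        std::fprintf(rib, "Display \"%s\" \"file\" \"rgba\"\n", image);
        std::fprintf(rib, "Projection \"perspective\" \"fov\" [45]\n");
        std::fprintf(rib, "WorldBegin\n");
        emitNode(rib, root);
        std::fprintf(rib, "WorldEnd\n");
    }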

IV. APPLICATIONS

A. Urban Planning and Design

The highly immersive and interactive nature of visualizations in the IC lends itself to applications that deal with large volumes of data. In particular, the planning and design of large urban environments involves visualizations at multiple scales of detail, from large scale overviews of the entire city to small scale details at the street level. The IC allows very intuitive transitions between the levels, while the immersive visualization helps the user retain the context of the overall model. In addition, real-time plume dispersion models [17] can take advantage of the powerful GPU cluster of the IC and our volume rendering framework to provide interactive simulations and training for national security and natural disaster scenarios. Fig. 6 illustrates the exploration of a large urban environment designed for the IC. The user has an overview of the entire city and access to details at the middle and small scales. This model renders at interactive frame rates on the IC's GPU cluster.


Figure 6. Exploration of a large urban environment.

B. Architectural Pre-visualization

Architectural flythroughs are a very natural application for the IC and take advantage of the extended depth perception and the immersion over 2D workstation displays. We primarily utilize the OpenGL renderer to provide more fluid animations, although for certain scenes with complex geometry the OptiX renderer may provide competitive performance. Our framework provides tools such as custom clipping planes, transparency shaders, high-quality materials and environmental effects, which are particularly useful for rendering detailed building exteriors and interiors. Head tracking is fully supported, although it is generally not used when the models are examined by more than one person or when the rendering performance is below 15 frames per second. Fig. 7 shows snapshots from the exploration of two buildings at the Stony Brook University campus – the Advanced Energy Research and Technology Center (AERTC) and the Simons Center for Geometry and Physics. In both cases, the visualization was available to the architects and the engineers before the construction began.

Figure 7. Architectural rendering of [left] AERTC and [right] Simons Center for Geometry and Physics, both at Stony Brook University.

C. Virtual Colonoscopy

Our volume rendering module is primarily designed to support medical visualization applications such as Virtual Colonoscopy (VC) [18,19] in immersive environments. The rendering framework can achieve interactive performance in the IC for a 512×512×451 16-bit CT scan of a patient's abdomen. Since the examination in VC consists of traversing the colon along the centerline, we rely heavily on optimizations such as empty space skipping and early ray termination, while additional computational resources are dedicated to high quality lighting and shading at the iso-surface representing the internal colon wall. We have also developed a conformal visualization that maps visual information from the missing ceiling in the IC to the side walls. The conformal property ensures that angles are preserved locally, and as a result shapes are also preserved locally (see the note below). This is particularly advantageous for Virtual Colonoscopy since the user can examine the entire surface of the colon in a partially immersive environment such as the IC, and polyps retain their circular shape even though their size may be distorted.

Figure 8. Interactive exploration of the colon with conformal visualization during VC. Panels: top projection, front projection, conformal front projection.

Figure 9. Virtual Colonoscopy in the IC.
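As background for the conformal property used above (a standard fact from complex analysis, not the specific construction of our ceiling-to-walls mapping), the local behavior of a conformal map can be read off from its first-order expansion:

\[
f(z) \;\approx\; f(z_0) + f'(z_0)\,(z - z_0),
\qquad f \text{ holomorphic},\ f'(z_0) \neq 0 .
\]

An infinitesimal neighborhood of $z_0$ is therefore rotated by $\arg f'(z_0)$ and uniformly scaled by $|f'(z_0)|$: angles between curves, and hence local shapes, are preserved, while sizes change by the position-dependent factor $|f'(z_0)|$. This is exactly why a polyp keeps its circular cross-section under the mapping even though its apparent size may vary.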
CONCLUSIONS AND FUTURE WORK

We have presented a rendering framework for the Immersive Cabin that takes advantage of the newest GPU technologies for real-time rendering, as well as high quality raytracing. We have also developed interaction techniques based on standard devices and a wireless tracking system based on IR cameras. Our facility has been used in a number of applications including mixed scale urban visualization, architectural pre-visualization and Virtual Colonoscopy.

Our next project for the IC focuses on the development of a natural user interface based on the wireless tracking of the user's body. The goal is to combine the more traditional hand gestures with a system based on somatic learning. On the rendering side, we plan to replace the deprecated NVIDIA Gelato renderer with technology from Mental Ray, as well as to introduce incremental improvements and new rendering techniques to the rest of the framework.

ACKNOWLEDGMENT

This work has been supported by NIH grant R01EB7530 and NSF grants IIS0916235, CCF0702699 and CNS-0959979. The VC datasets have been provided through the NIH, courtesy of Dr. Richard Choi, Walter Reed Army Medical Center.

REFERENCES

[1] C. Cruz-Neira, D. J. Sandin, T. A. DeFanti, R. V. Kenyon, and J. C. Hart, "The CAVE: Audio Visual Experience Automatic Virtual Environment," Communications of the ACM, vol. 35, pp. 64-72, 1992.
[2] A. Hogue, M. Robinson, M. R. Jenkin, and R. S. Allison, "A vision-based head tracking system for fully immersive displays," Proceedings of the Workshop on Virtual Environments, 2003, pp. 179-187.
[3] F. Qiu, B. Zhang, K. Petkov, L. Chong, A. Kaufman, K. Mueller, and X. D. Gu, "Enclosed Five-Wall Immersive Cabin," Proceedings of the 4th International Symposium on Advances in Visual Computing, 2008, pp. 891-900.
[4] G. Humphreys, M. Houston, R. Ng, R. Frank, S. Ahern, P. Kirchner, and J. T. Klosowski, "Chromium: A Stream Processing Framework for Interactive Rendering on Clusters," ACM Transactions on Graphics, vol. 21, pp. 693-702, 2002.
[5] O. G. Staadt, J. Walker, C. Nuber, and B. Hamann, "A survey and performance analysis of software platforms for interactive cluster-based multi-screen rendering," Workshop on Virtual Environments, 2003, pp. 261-270.
[6] K.-U. Doerr and F. Kuester, "CGLX: A Scalable, High-performance Visualization Framework for Networked Display Environments," IEEE Transactions on Visualization and Computer Graphics, vol. 99, 2010.
[7] A. Bierbaum, C. Just, P. Hartling, K. Meinert, A. Baker, and C. Cruz-Neira, "VR Juggler: A Virtual Platform for Virtual Reality Application Development," International Conference on Computer Graphics and Interactive Techniques, 2008.
[8] S. Eilemann, M. Makhinya, and R. Pajarola, "Equalizer: A Scalable Parallel Rendering Framework," IEEE Transactions on Visualization and Computer Graphics, vol. 15, pp. 436-452, 2009.
[9] NVIDIA Corporation, NVIDIA SceniX Application Acceleration Engine. Available: http://www.nvidia.com/object/scenix.html (accessed 24 Aug 2010).
[10] M. Barnes, "COLLADA," ACM SIGGRAPH, 2006.
[11] M. Hadwiger, J. M. Kniss, C. Rezk-Salama, and D. Weiskopf, Real-Time Volume Graphics. A K Peters, 2006.
[12] B. Liu, L.-Y. Wei, X. Yang, Y.-Q. Xu, and B. Guo, "Nonlinear Beam Tracing on a GPU," Microsoft Research MSR-TR-2007-34, 2007.
[13] S. Popov, J. Günther, H.-P. Seidel, and P. Slusallek, "Stackless KD-Tree Traversal for High Performance GPU Ray Tracing," Computer Graphics Forum, vol. 26, pp. 415-424, 2007.
[14] M. Zlatuska and V. Havran, "Ray Tracing on a GPU with CUDA -- Comparative Study of Three Algorithms," Journal of WSCG, vol. 18, pp. 69-75, 2010.
[15] S. G. Parker, J. Bigler, A. Dietrich, H. Friedrich, J. Hoberock, D. Luebke, D. McAllister, M. McGuire, K. Morley, A. Robison, and M. Stich, "OptiX: A General Purpose Ray Tracing Engine," ACM Transactions on Graphics, vol. 29, pp. 1-13, 2010.
[16] R. L. Cook, L. Carpenter, and E. Catmull, "The Reyes Image Rendering Architecture," Computer Graphics, vol. 21, pp. 95-102, 1987.
[17] F. Qiu, Y. Zhao, Z. Fan, X. Wei, H. Lorenz, J. Wang, S. Yoakum-Stover, A. Kaufman, and K. Mueller, "Dispersion Simulation and Visualization for Urban Security," IEEE Visualization, 2004, pp. 553-560.
[18] L. Hong, S. Muraki, A. Kaufman, D. Bartz, and T. He, "Virtual voyage: interactive navigation in the human colon," ACM SIGGRAPH, 1997, pp. 27-34.
[19] C. D. Johnson and A. H. Dachman, "CT Colonography: The Next Colon Screening Examination?," Radiology, vol. 216, pp. 331-341, 2000.