Architectures and Codecs for Real-Time Light Field Streaming
Reprinted from Journal of Imaging Science and Technology 61(1): 010403-1–010403-13, 2017. © Society for Imaging Science and Technology 2017.

Péter Tamás Kovács, Tampere University of Technology, Tampere, Finland; Holografika, Budapest, Hungary. E-mail: [email protected]
Alireza Zare, Tampere University of Technology, Tampere, Finland; Nokia Technologies, Tampere, Finland
Tibor Balogh, Holografika, Budapest, Hungary
Robert Bregović and Atanas Gotchev, Tampere University of Technology, Tampere, Finland

Abstract. Light field 3D displays represent a major step forward in visual realism, providing glasses-free spatial vision of real or virtual scenes. Applications that capture and process live imagery have to process data captured by potentially tens to hundreds of cameras and control tens to hundreds of projection engines making up the human-perceivable 3D light field using a distributed processing system. The associated massive data processing is difficult to scale beyond a specific number and resolution of images, limited by the capabilities of the individual computing nodes. The authors therefore analyze the bottlenecks and data flow of the light field conversion process and identify possibilities to introduce better scalability. Based on this analysis, they propose two different architectures for distributed light field processing. To avoid using uncompressed video data all along the processing chain, the authors also analyze how the operation of the proposed architectures can be supported by existing image/video codecs. © 2017 Society for Imaging Science and Technology.

Received July 17, 2016; accepted for publication Oct. 30, 2016; published online Dec. 20, 2016. Associate Editor: Hideki Kakeya.

INTRODUCTION
Three-dimensional (3D) displays1,2 represent a new class of terminal devices that make a major step forward in realism, toward displays that can show imagery indistinguishable from reality. 3D display technologies use different approaches to make the human eyes see spatial information. While the most straightforward approaches use glasses or other headgear to achieve separation of left and right views, autostereoscopic displays achieve similar effects without the necessity of wearing special glasses. Typical autostereoscopic display technologies include parallax barrier,3 lenticular lens,4 volumetric,5 light field6,7 and pure holographic8 displays, most of them available commercially, each with its unique set of associated technical challenges. Projection-based light field 3D displays9–11 represent one of the most practical and scalable approaches to glasses-free 3D visualization, achieving, as of today, a 100 Mpixel total 3D resolution and supporting a continuous parallax effect.

Various applications where 3D spatial visualization has added value have been proposed for 3D displays. Some of these applications involve displaying artificial/computer-generated content, such as CAD design, service and maintenance training, animated movies and 3D games, and driving and flight simulation. Other applications require the capturing, transmission/storage and rendering of 3D imagery, such as 3D TV and 3D video conferencing (3D telepresence). From the second group, applications that require live 3D image transmission are the technically most demanding, as 3D visual information has much more information content than its 2D counterpart. In state-of-the-art 3D displays, the total pixel count can be 1–2 orders of magnitude higher than in common 2D displays, making such applications extremely data intensive.

We focus on the problems associated with applications that require real-time streaming of live 3D video. We consider projection-based light field 3D displays as the underlying display technology (with tens or hundreds of projection engines) coupled with massive multi-camera arrays for light field capturing (with tens or hundreds of cameras). We explore possibilities that can potentially work in real time on today's hardware. Please note that the majority of the problems discussed here also apply to other capture and display setups in which the necessary data throughput cannot be handled by a single data processing node, and as such distributing the workload becomes necessary.

Our novel contribution presented in this article lies in: analysis of the typical data flow taking place during conversion of multiview content to a display-specific light field representation; analysis of bottlenecks in a direct (brute-force) light field conversion process; presentation of two possible approaches for eliminating such bottlenecks; and a suitability analysis of existing image and video codecs for supporting the proposed approaches.

The article is organized as follows. First, light fields and light field displays, as well as different content generation and rendering methods for light field displays, are introduced. Then the proposed architectures for scalable light field decompression and light field conversion are described, followed by a comparative analysis of the discussed architectures and codecs.

IS&T International Symposium on Electronic Imaging 2017, Stereoscopic Displays and Applications XXVIII.

LIGHT FIELDS, LIGHT FIELD DISPLAYS AND CONTENT CREATION
The propagation of visible light rays in space can be described by the plenoptic function, which is a 7D continuous function, parameterized by location in 3D space (3 parameters), angles of observation (2 parameters), wavelength and time.12 In real-world implementations, a light field, which is a simplified 3D or 4D parameterization of the plenoptic function, is used, which allows visual information to be represented in terms of rays with their positions in space and directionality.13,14 In a 4D light field, ray positions are identified either by using two planes and the hit points of the light ray on both planes, or by using a single hit point on a plane and the direction in which the light ray propagates. This 4D light field describes a static light field in a half-space. For a more detailed description, the reader is referred to Ref. 15.

Having a light field properly reproduced provides the user with 3D effects such as binocularity and continuous motion parallax. Today's light field displays can typically reproduce a horizontal-only-parallax light field, which allows the simplification of the light field representation to 3D.

Figure 1. Basic principle of projection-based light field displays: projection modules project light rays from the back, toward the holographic screen. Object points are visible where two light rays appear to cross each other consistently. Human eyes can see different imagery at different locations due to the direction selective light emission property of the screen.

Optical System and Image Properties
Projection-based light field displays are based on a massive array of projection modules that project light rays onto a special holographic screen to reproduce an approximation of the continuous light field. The light rays originating from one source (later referred to as light field slices) hit the screen at different positions all over the screen surface and pass through the screen while maintaining their directions. Light rays from other sources will also hit the screen all over the screen surface, but from slightly different directions (see Figure 1). The light rays that cross the screen at the same screen position but in different directions make up a pixel that has the direction selective light emission property, which is a necessary condition for creating glasses-free 3D displays. By showing different but consistent information in slightly different directions, it is possible to show different images to [...] 180°, the image can be walked around. The 3D image can be observed without glasses, and displayed objects can appear floating in front of the screen, or behind the screen surface, like with holograms.

This image formation is achieved by controlling the light rays so that they mimic the light field that would be present if the real object were placed in the same physical space, up to the angular resolution provided by the projection modules. Such glasses-free display technology has dramatic advantages in application fields like 3D TV, and especially 3D video conferencing, where true eye contact between multiple participants at both sides can only be achieved by using a 3D display.16 These same applications require real-time processing (capture, transmission and visualization) of live images, which places special requirements on the rendering system.

Projection-based light field displays from Holografika have been used for the purposes of this work. The interested reader is referred to Ref. 7 for a complete description of the display system. Note that other projection-based light field displays based on very similar architectures also exist;10,11,17 thus our findings are relevant for those displays too.

Rendering System and Data Flow
The previously described distributed optical system is served by a parallel rendering system, implemented as a cluster of multi-GPU computers. In the simple case, each GPU output drives a single optical module, multiple outputs of the same GPU correspond to adjacent optical modules, while one computer (having multiple GPUs) corresponds to a bigger group of adjacent modules. As in Ref. 18, we define the part of the whole light field that is reproduced by a single optical module as a light field slice. One such slice is not visible from a single point of observation, nor does it represent a single
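The contiguous module-to-hardware mapping described above (one GPU output per optical module, adjacent outputs on the same GPU, adjacent GPUs in the same computer) can be sketched as a simple index calculation. This is an illustrative sketch only: the function name `module_assignment` and the default counts of outputs per GPU and GPUs per node are assumptions for the example, not figures from the paper.

```python
def module_assignment(module_idx, outputs_per_gpu=2, gpus_per_node=4):
    """Map a projection-module index to a (node, gpu, output) triple under a
    contiguous layout: each GPU output drives one module, outputs of the same
    GPU drive adjacent modules, and each node covers a contiguous module group.
    The default cluster shape (2 outputs/GPU, 4 GPUs/node) is hypothetical."""
    modules_per_node = outputs_per_gpu * gpus_per_node
    node = module_idx // modules_per_node          # which computer in the cluster
    within_node = module_idx % modules_per_node    # module offset inside that node
    gpu = within_node // outputs_per_gpu           # which GPU inside the node
    output = within_node % outputs_per_gpu         # which output of that GPU
    return node, gpu, output

# Example: with the assumed shape, module 9 lands on the second node,
# first GPU, second output.
print(module_assignment(9))  # -> (1, 0, 1)
```

The same arithmetic, run in reverse, tells a capture-side node which light field slices it must produce for a given group of modules.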
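The two equivalent 4D light field parameterizations introduced earlier (two-plane hit points versus one hit point plus a direction) can be made concrete with a small geometric sketch. The function names and the choice of planes at z = 0 and z = 1 are assumptions made for this example, not part of the paper's formulation.

```python
import numpy as np

def ray_to_two_plane(origin, direction, z_uv=0.0, z_st=1.0):
    """Intersect a ray with two parallel planes z = z_uv and z = z_st and
    return the two-plane light field coordinates (u, v, s, t)."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    if abs(direction[2]) < 1e-12:
        raise ValueError("ray is parallel to the parameterization planes")
    t_uv = (z_uv - origin[2]) / direction[2]   # parameter at the first plane
    t_st = (z_st - origin[2]) / direction[2]   # parameter at the second plane
    u, v = (origin + t_uv * direction)[:2]
    s, t = (origin + t_st * direction)[:2]
    return u, v, s, t

def two_plane_to_point_direction(u, v, s, t, z_uv=0.0, z_st=1.0):
    """Convert (u, v, s, t) to the equivalent hit-point-plus-direction form."""
    p = np.array([u, v, z_uv])                 # hit point on the first plane
    d = np.array([s - u, t - v, z_st - z_uv])  # vector from first to second hit
    return p, d / np.linalg.norm(d)

# A ray through (1, 2, 0) heading in direction (1, 0, 1):
print(ray_to_two_plane((1.0, 2.0, 0.0), (1.0, 0.0, 1.0)))  # -> (1.0, 2.0, 2.0, 2.0)
```

Note that rays parallel to the planes cannot be represented, which is why this parameterization covers a half-space, as stated in the text.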
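To give a feel for why live light field streaming is so data intensive, a back-of-envelope calculation for the 100 Mpixel total 3D resolution mentioned in the introduction is useful. The pixel depth and frame rate below (24-bit RGB, 30 fps) are illustrative assumptions, not figures from the paper.

```python
def uncompressed_rate_gbps(total_pixels, bytes_per_pixel=3, fps=30):
    """Uncompressed video bandwidth in gigabits per second for a display with
    the given total pixel count. Defaults (24-bit RGB, 30 fps) are assumed
    for illustration only."""
    return total_pixels * bytes_per_pixel * 8 * fps / 1e9

# A 100 Mpixel light field display at the assumed settings:
print(uncompressed_rate_gbps(100e6))  # -> 72.0 (Gbit/s)
```

Even at this rough estimate, the aggregate rate far exceeds what a single processing node or link can handle, which motivates both the distributed architectures and the codec analysis proposed in the article.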