Image Compositing Based on Virtual Cameras

Virtual Studios continue at the academic level as well as the practical level. Over the last few years, virtual studio technol- Image ogy has become very popular in television program production. NHK developed probably the first practical virtual studio system1,2 in the world and in 1991 used it in a science program, “Nano- Compositing space” (see Figure 1). Since then we have contin- ued to study the integration of computer graphics and image compositing systems. I’ll begin with an overview of image composit- Based on Virtual ing, then clarify its direction and research targets. Then, focusing on camerawork-related technology, I’ll explain the concept of a virtual camera before describing several image compositing sys- Cameras tems NHK developed that exploit virtual camera technology. Finally, I’ll address future issues regarding other aspects of the system besides cam- Masaki Hayashi erawork. NHK Science and Technical Research Laboratories Image compositing Basically, image compositing involves adjust- oday, actors in television programs ing and matching various conditions in separate- The virtual camera and movies commonly appear within ly shot or generated images, and synthesizing method presented a landscape wholly created by com- these images to create a completely different here integrates puter graphics. Computer graphics scene. The conditions might include such things computer graphics evolvedT for the most part as a computer-oriented as camerawork, lighting, resolution of pickups, and real-time video technology, which makes it possible to create atmosphere, semantic content, and so on. Which processing scenes from nothing by numerical calculation. conditions to match for picture synthesis depends techniques in a new Image compositing technology, on the other on the purpose of the image compositing. type of virtual studio. hand, has a different tradition. It evolved in the Generally, image compositing is done for the At NHK we field of video production as a technique for syn- following reasons: developed several thesizing multiple images taken by a real camera. image compositing Computer graphics technology has made remark- ❚ To create a scene in which it appears that the systems based on this able progress, and today the computer algorithms synthesized objects occupy a particular place method. First we look that produce the final image have reached a very (realism). at the virtual camera high level of sophistication. By contrast, image method, then some compositing technology relies on already filmed ❚ For artistic effect, often with little or no con- of the systems video footage, and the methods that produce the cern for realism (artwork). developed with it. final image still depend on expert human skill and Then we discuss experience. Examples of the former include the virtual studio future directions and Technology for integrating computer graphics approach and the image compositing effects seen possibilities not only and image compositing involves the convergence in so many recent special-effects movies (such as for image of these two cultural traditions, which already Jurassic Park). Examples of the latter encompass compositing but also have influenced one another over a long period of videos created for the title and credit segments of for the development time. The recent popularity of image-based ren- TV shows, and music promotion videos, which of sophisticated dering and motion capture in computer graphics combine special video effects with various types systems for editing on the one hand and the virtual studio approach of image components such as camera images, text and producing in image compositing on the other hand testify to images, computer graphics, and so on. television programs. this ongoing mutual interaction. However, Of these two types of application, the former because convergence has only just begun, a com- probably lends itself more readily to realization by prehensive system that integrates computer technology. The latter depends more on human graphics and image compositing does not yet artistic sensibility and therefore would be hard to exist. Efforts to achieve such a system will have to reduce to a technological process. Yet it would be 36 1070-986X/98/$10.00 © 1998 IEEE . of great technological interest to make a machine capable of automatically generating artistic synthesized images based on, for example, a semantic understanding of the video content. Here, we will focus on the pursuit of realism. The objective in this case is achieved by creating a new realistic image out of multiple image components, matching their attribute conditions as closely as possible. Specific types of conditions include 1. Camerawork—pan, tilt, roll, 3D positioning (x, y, z), zoom, focus 2. Lighting—color temperature, intensity, direction of light source, and so forth 3. Filming conditions—resolution of pickups, recording characteristics (for example, video or film?) 4. Environmental conditions—fog, shadows, and Figure 1. Scene from the so on TV program “Nano- In this work, we at NHK propose a filming sys- space.” 5. Interaction between objects tem that outputs the desired image by feeding the external data of filming conditions 1 through 5. In the computer graphics field, where a scene We refer to this as a virtual camera. Both the com- is literally created out of nothing, conditions 1 puter graphics technique and image processing through 4 have been studied thoroughly. You can technique may contribute to a virtual camera. Fig- readily obtain the desired computer graphics ure 2 shows a general schematic of an image com- images simply by entering data as matching val- positing system based on the virtual camera ues for conditions in a computer. However, the concept. In a conventional virtual studio, for image compositing field has produced no clear-cut example, the virtual camera for the background is answers—despite the ability to use many aspects a real-time graphics workstation, and the virtual of computer vision—because the conditions are camera for the foreground is a television camera. generally determined by the ambient physical Output data from a sensor mounted on the tele- conditions at the time of shooting. vision camera serves as the filming conditions. Turning to condition 5, we can think of many cases where interactions between objects occur. Figure 2. Image For example, two people filmed on different occa- compositing based on sions appear to shake hands, rain falls and objects the virtual camera get drenched, someone walks along leaving foot- Computer graphics concept. (Virtual camera #2) prints behind him, and so on. Effects such as these are quite difficult to achieve automatically, so in Real TV camera Multilayered the actual production of TV programs and movies with control port image Output today, special-effects professionals still do most of (Virtual camera #2) compositing scene these jobs manually. TV camera + The virtual studio creates a realistic composite image processor scene synchronized with foreground camerawork (Virtual camera #3) by driving computer graphics using camera oper- ation data. In other words, although matching Filming conditions camerawork condition 1 achieves the main pur- camera work, lighting pose in the virtual studio, matching conditions 2 pickups, fog, shadow through 5 currently mostly relies on experience object interaction, etc. and trial and error. 37 . Figure 3. Virtual y y y Object coordinate Real d shooting process. camera 2D plate on which real object image is mapped Real object z O x z i z Ri x x World coordinate World coordinate (a) (b) y y Virtual Vi object Virtual object z Virtual z camera Ui x x World coordinate World coordinate Obtained image (c) (d) (e) Virtual camera developed commercial systems are available. By Two fundamental approaches realize the virtu- contrast, image-based rendering has a much short- al camera: er history. It needs special techniques whenever different filming parameters are attempted for ❚ The first approach generates images by using video materials already filmed. This is obviously a computer graphics to set the filming parame- serious constraint, and the issue has been studied ters (model-based rendering). intensively in recent years to find a solution.3,4 As mentioned, this article primarily concerns ❚ The second approach obtains images by pro- camerawork as filming parameters. Here, based on cessing the original images taken with a real an image-based rendering approach, a method camera to match the filming parameters called virtual shooting provides a different kind of (image-based rendering). camerawork from that with an actual camera. At NHK we have developed a number of image com- The first approach imposes few constraints on positing systems that apply this virtual shooting the filming parameters, but it’s difficult to replace method to a virtual studio. all the real objects by computer graphics when Virtual shooting creates the impression of dif- rendering naturalistic landscapes. Furthermore, ferent camerawork in an image already shot. The this approach requires immense computational virtual shooting process described in this article resources to produce images that closely simulate appears in Figure 3. Once a 3D object is filmed to the filming parameters. Turning to the second produce a 2D image, this image is mapped onto a approach for realizing the virtual camera, let’s take 2D surface and then refilmed using different camerawork as an example. We first film a real camerawork. This process requires the following scene using a wide angle and high resolution, and data: store the image data in a computer. Then we get panning and tilting effects by altering the size of 1. Real camera data—the direction, position, and the area. angle of view of the real camera when the Although the second approach can certainly object was filmed handle photorealistic images, it imposes major constraints on the filming parameters. In com- 2. Real object data—the orientation and position puter graphics, the first approach is generally of the real object called model-based rendering, and the second method is called image-based rendering.

Load more