(19) Europäisches Patentamt
European Patent Office
Office européen des brevets

(11) EP 1 037 167 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 20.09.2000 Bulletin 2000/38

(51) Int. Cl.7: G06T 15/20

(21) Application number: 00301953.6

(22) Date of filing: 09.03.2000

(84) Designated Contracting States: AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
Designated Extension States: AL LT LV MK RO SI

(30) Priority: 15.03.1999 US 268397

(71) Applicant: SUN MICROSYSTEMS, INC., Palo Alto, California 94303 (US)

(72) Inventor: Deering, Michael F., Los Altos, California 94024 (US)

(74) Representative: Harris, Ian Richard, D. Young & Co., 21 New Fetter Lane, London EC4A 1DA (GB)

(54) System and method for generating and playback of three-dimensional movies

(57) A system and method for generating and playing back three-dimensional (3D) movies are disclosed. The system is capable of partially rendering frames without relying upon exact viewpoint information. The partially rendered frames may be rendered to the extent possible without performing viewpoint-dependent processes, and then compressed and stored to a carrier medium. To reduce the amount of data to be stored, the viewer's possible viewpoints may be restricted (e.g., by defining a viewpoint-limiting volume or region). The resulting partially-rendered geometry data may be compressed using geometry compression. During playback, the compressed frames are read as a stream and decompressed. Any final viewpoint-dependent rendering operations may then be performed (e.g., some lighting calculations and atmospheric effects, some fogging, specular highlighting, and reflections). A sensor such as a head-tracker may provide real-time viewpoint information that may be used by the playback system. After rendering, the frames are rasterized and then displayed in stereo.

Description

BACKGROUND OF THE INVENTION

1. Field of the Invention

[0001] This invention relates generally to the field of computer graphics and, more particularly, to three-dimensional (3D) movies.

2. Description of the Related Art

[0002] Traditional motion pictures provide the audience with what may be called a "two-dimensional" experience. Most motion pictures are projected on flat screens with the audience observing the movie from afar. Aside from improvements in sound playback and larger film/screen formats, the movie-going experience has not changed significantly for a number of years.

[0003] Some films have been produced in visual stereo, e.g., with the audience wearing red-blue or polarized glasses. Similarly, head-mounted displays have also allowed viewers to see visual stereo images and films. But visual stereo has not seen widespread success relative to standard motion pictures. This may be because the stereo effect is typically limited to a few special effects in which a few objects appear to have dimension or "leap out" at the audience. It is likely that the effect that simple visual stereo provides does not outweigh the added cost of production and the hassle of wearing glasses or head-mounted displays to view the image.

[0004] However, recent advances in computer graphics and display technologies have hinted at the possibility of a more realistic experience for movie goers. New sensing technologies such as head-tracking may increase realism from simple visual stereo to a truly three-dimensional (3D) viewing experience.

[0005] As used herein, the term "visual stereo" refers to the process of generating two images (i.e., one for the viewer's left eye and another for the viewer's right eye). These images may be referred to herein as stereo component images. In contrast to simple visual stereo, 3D images and movies go one step further by dynamically changing the viewpoint (ideally on a frame-by-frame basis) for each of the two stereo component images. In this way, an object viewed in 3D will not only appear to leap out of the screen, but objects behind it will become obscured or un-obscured as the viewer changes their viewpoint to different positions around the object. In other words, a traditional stereo image of a globe will appear to have depth, but a three-dimensional image of the globe will allow the viewer to move around the globe to see more than one side of it.

[0006] Thus, head-tracking sensors allow graphics systems to recalculate the component images according to where the viewer is in relation to the display device. Head tracking may allow a coffee table display to display a 3D football game that viewers can view from almost any angle they desire (e.g., from a bird's eye view directly overhead or from the side of the fifty yard line). To change their viewpoint, viewers simply move around the coffee table. The head-tracking sensors will determine where the viewer is and display different component images accordingly. When used in connection with a head-mounted display, head tracking allows panoramic scenes that "wrap around" viewers without the need for large projection screens that physically surround the viewer. Such head-tracked stereo systems produce a look-around holographic feel qualitatively different from older fixed image displays.

[0007] However, these new systems have yet to see widespread acceptance as movie playback devices. One reason is that these systems typically rely upon re-rendering each scene every time the viewer changes their viewpoint. For example, as the viewer's head changes position (or orientation, in some systems), the scene being displayed is re-rendered from scratch using the new viewpoint. To produce a realistic computer-generated image, a tremendous number of calculations must be performed. Even with current advances in microprocessors, most systems still fall significantly short of being able to render detailed realistic scenes in real-time. Given the high frame rates needed for smooth movie-like animation, it may be many years before processing power approaches the levels necessary to completely render complex images (e.g., like those in the movie Toy Story) at frame rates high enough to rival current motion pictures.

[0008] Thus, an efficient system and method for generating and playing back realistic 3D movies with viewer-changeable viewpoints in real-time is desired. In particular, a system and method capable of reducing the number of calculations to be performed for real-time playback of 3D movies is desired.

SUMMARY OF THE INVENTION

[0009] Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Features of the dependent claims may be combined with those of the independent claims as appropriate and in combinations other than those explicitly set out in the claims.

[0010] The problems outlined above may at least in part be solved by a system capable of partially rendering frames without relying upon exact viewpoint information. The partially rendered frames may be rendered to the extent possible without performing viewpoint-dependent processes, and then compressed and stored to a carrier medium. To reduce the amount of data to be stored, the viewer's possible viewpoints may be restricted (e.g., by defining a viewpoint-limiting volume or region).

[0011] During playback, the compressed frames are read as a stream, decompressed, and then rasterized for display. As part of the rasterization process, the viewer's viewpoint may be determined and used to perform viewpoint-dependent effects such as some lighting and atmospheric effects (e.g., fogging, specular highlighting, and reflections).
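As a rough illustration of the viewpoint-limiting volume described in [0010], the following sketch clamps a requested viewpoint into an axis-aligned box. The box representation and the function names are assumptions made for illustration; the patent does not prescribe any particular data structure.

```python
# Hypothetical sketch: an axis-aligned viewpoint-limiting volume.
def clamp(value, lo, hi):
    """Force a scalar into the closed interval [lo, hi]."""
    return max(lo, min(hi, value))

def limit_viewpoint(requested, box_min, box_max):
    """Return the closest allowed viewpoint to the requested one.

    requested, box_min, and box_max are (x, y, z) tuples in world space.
    """
    return tuple(clamp(r, lo, hi)
                 for r, lo, hi in zip(requested, box_min, box_max))

# Example: the viewer tries to leave the allowed region along +x.
print(limit_viewpoint((5.0, 0.2, -1.0), (-1.0, 0.0, -2.0), (1.0, 2.0, 2.0)))
# prints (1.0, 0.2, -1.0)
```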

[0012] As used herein, the term "real-time" refers to performing an action, task, or process rapidly enough so that the user or viewer will not be substantially distracted by the amount of time taken to perform the task. As used herein, the term "including" means "including, but not limited to."

[0013] Movies may be thought of as linear story telling. Thus, in 3D movies the viewer may be like Scrooge (i.e., dragged along by ghosts of Christmas past, present, and future). The viewer can see everything in partial or full three dimensions, but generally may not interact with anything (in some embodiments interaction may be allowed). Furthermore, the viewer can only go to the places where the ghost goes. Note that this covers not only pre-scripted linear story telling, but also most forms of remote viewing (e.g., live sporting events).

[0014] Looked at another way, current general purpose computers take much longer to place and create non-trivial geometry than special purpose rendering hardware does to render it (albeit with simplistic hardwired surface shading). This trend is likely to get worse rather than better for some time to come. But for pre-scripted linear story telling, nearly all the geometry creation and much of the surface shading can be pre-computed. Thus, the primary run-time task is the rendering of large numbers of colored micropolygons. Therefore, 3D movies may be produced and viewed by rendering (e.g., in real-time or near real-time) streams of compressed geometry (e.g., in a head-tracked stereo viewing environment).

[0015] Linear narratives may be pre-computed using arbitrarily complex animation, physics of motion, shadowing, texturing, and surface modeling techniques, but with the viewpoint-dependent image rendering deferred until playback time. The geometry may be produced by a geometry shader that outputs micropolygons in world space via programmable shaders in a realistic rendering package (e.g., Pixar's RenderMan). The technology may be used for a number of different applications, including entertainment, scientific visualization, and live 3D television.

[0016] In one embodiment, a method for generating three-dimensional movies comprises receiving three-dimensional graphics data representing a scene. A viewpoint limitation (e.g., a volume or a two-dimensional region with or without orientation limitations) is specified for said scene. Next, one or more frames representing the scene are partially rendered. These partially rendered frames are then compressed and output to a carrier medium for storage or transmission.

[0017] "Partially rendering" may comprise performing one or more non-viewpoint-dependent (i.e., viewpoint-independent) calculations, such as some lighting calculations (e.g., viewpoint-independent reflections, viewpoint-independent shadows, viewpoint-independent depth of field effects, viewpoint-independent bump mapping, and viewpoint-independent displacement mapping) and tessellating the three-dimensional graphics data into micropolygons to meet one or more predetermined criteria. These predetermined criteria may include smoothness and micropolygon size in world space. In other embodiments they may also include estimated worst-case micropolygon size in screen space. This estimated worst-case micropolygon size in screen space may be estimated using the distance from a closest permitted viewpoint to each micropolygon and/or the worst-case angle from all permitted viewpoints to the micropolygon.

[0018] In addition to geometry compression, in some embodiments the micropolygons may be culled by deleting micropolygons that are not visible from any valid viewpoint (i.e., from any viewpoint that satisfies the specified viewpoint limitations). The compressed frames are configured to be rasterized in real-time by applying viewpoint-dependent lighting effects. Frames may be stored separately, or they may be reused as a basis for interpolation. The specified viewpoint limitation information may also be stored with the frames for use during playback.

[0019] A method for playing back 3D movies is also contemplated. In one embodiment, the method may comprise reading a plurality of compressed frames from a carrier medium and then decompressing the compressed frames into a plurality of world-space micropolygons. The viewer may then provide input (e.g., via a head-tracker) as to where the viewpoint should be. Using this viewpoint, the micropolygons may be translated to screen space, and then rasterized to generate a frame for display on a display device.

[0020] In one embodiment, the playback unit may also apply viewpoint-dependent lighting, texturing, atmospheric, depth-of-field and/or other effects to the micropolygons. The rasterization may be performed in real-time, with real-time viewer viewpoint information from head-trackers. Compressed audio may also be read with the compressed 3D movie frames, and the audio may be manipulated to reflect the viewer's ear location information (e.g., as provided by a head-tracker).

[0021] A software program and computer system configured to generate three-dimensional movies are also contemplated. The software program may comprise a plurality of instructions, wherein the instructions are configured to perform the method of generating 3D movies as described above. Similarly, a software program and playback system configured to play back 3D movies are also contemplated.
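To gather the generation method of paragraphs [0016] and [0017] into one place, a schematic sketch follows. Every function body below is a placeholder standing in for the tessellation, culling, viewpoint-independent lighting, and geometry-compression stages; the names and data shapes are assumptions, not interfaces defined by the patent.

```python
# Schematic outline of the generation method; only the ordering of the
# stages reflects the text, the stage bodies are placeholders.
def tessellate_to_micropolygons(scene, viewpoint_limits):      # tessellation
    return list(scene["geometry"])

def cull_invisible(micropolygons, viewpoint_limits):           # visibility culling
    return micropolygons

def light_viewpoint_independent(micropolygons):                # "mostly lit" micropolygons
    return micropolygons

def compress_geometry(micropolygons):                          # geometry compression
    return bytes(len(micropolygons))

def build_frame(scene, viewpoint_limits):
    mp = tessellate_to_micropolygons(scene, viewpoint_limits)
    mp = cull_invisible(mp, viewpoint_limits)
    mp = light_viewpoint_independent(mp)
    return compress_geometry(mp)        # written to the carrier medium

frame = build_frame({"geometry": [("tri", 0), ("tri", 1)]}, viewpoint_limits=None)
print(len(frame), "bytes of placeholder compressed frame data")
```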


BRIEF DESCRIPTION OF THE DRAWINGS

[0022] The foregoing, as well as other objects, features, and advantages of this invention may be more completely understood by reference to the following detailed description of exemplary embodiments when read together with the accompanying drawings in which:

Figure 1 is a flowchart illustrating one embodiment of a method for generating three-dimensional (3D) movies;

Figure 2 is a flowchart illustrating one embodiment of a method for playing back three-dimensional (3D) movies generated using the method of Figure 1;

Figure 3 illustrates one embodiment of a computer system capable of generating and playing back 3D movies generated using the method of Figure 1;

Figure 4 is a diagram of one embodiment of a 3D movie playback system;

Figure 5 is a diagram of another embodiment of a 3D movie playback system;

Figure 6 is a diagram of yet another embodiment of a 3D movie playback system;

Figure 7 is a block diagram illustrating more details of one embodiment of the playback system of Figure 4; and

Figure 8 is a block diagram illustrating another embodiment of a 3D movie playback system.

[0023] While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the present invention as defined by the appended claims.

DETAILED DESCRIPTION OF SEVERAL EMBODIMENTS

Method for Generating Three-Dimensional Movies - Figure 1

[0024] Figure 1 is a flowchart illustrating one embodiment of a method for generating three-dimensional (3D) movies. First, three-dimensional (3D) graphics data representing a scene in a movie is received (step 100). The 3D graphics data may include data in a number of different formats. For example, three-dimensional objects that are part of the scene may be represented as volumes, surfaces, or 3D objects that have been tessellated into a plurality of polygons (e.g., triangles or quadrilaterals). The 3D graphics data may also include objects modeled with NURBS (non-uniform rational B-splines), volume elements, subdivision surfaces, meshes and other techniques. The 3D data may be generated by computer animators, by capture devices, 3D cameras, 3D digitizers, or other techniques. Depending upon the format in which the 3D graphics data is received, it may be manipulated and transformed before being transformed into a 3D movie.

[0025] Once the data has been received, real-time viewpoint limitations may be specified (step 102). For example, if a scene in the movie involves a car chase, the viewer's viewpoint may be limited to positions within a police car chasing down a felon fleeing a crime scene. Viewpoint limitations may be specified in a number of different ways. One possible method is specifying a viewpoint limitation volume. Any user-specified viewpoints within this volume for the particular scene will be processed as valid. However, if the viewer attempts to change their viewpoint to a position outside the specified limitation volume, then the playback system will force the viewpoint to remain within the predetermined viewpoint volume (more details on the playback system are described further below). This may prevent users from traveling beyond the bounds of the current scene (e.g., through walls). This may also constrain the amount and tessellation of micropolygons, and may also avoid some playback level-of-detail issues.

[0026] In addition to limiting the viewer's viewpoint by position (i.e., where the viewer is), limitations on the orientation of the viewpoint (i.e., which direction the viewer is looking) may also be specified. Returning once again to the car chase example, if the user is limited to viewpoints within the police car chasing the felon, the viewer may be limited to viewpoint orientations that are directed out the windshield of the police car. This may prevent the viewer from missing an explosion caused by the felon crashing his vehicle into a gasoline station because the viewer was looking out the police car's back window.

[0027] Next, the 3D graphics data may be partially rendered (step 104). As shown by the figure, a number of different processes may be performed as part of partially rendering the 3D graphics data. First, the 3D graphics data may be tessellated into "micropolygons" positioned in object space (step 104A). Traditional micropolygons are polygons that have a size near or below a single pixel in screen space. However, since tessellation into micropolygons is performed without knowing the exact viewpoint, some estimations may be performed when tessellating. For example, since micropolygon size is related to the distance from the viewpoint to the micropolygon, the worst-case (e.g., the closest) viewpoint may be used when tessellating. Similarly, micropolygon size may also be related to the angle from the viewpoint to the micropolygon. Thus a worst-case angle may also be used during tessellation. The distance and angle may be used separately or in combination to obtain the desired tessellation. The use of properly sized micropolygons may improve the realism of the final image produced when compared with using large triangles having areas covering many pixels. In some embodiments, partial rendering may be performed using a specialized "geometry shader" as described in greater detail below.
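The worst-case estimate mentioned in [0027] can be pictured with a small calculation: for a micropolygon edge of a given world-space length, the largest screen-space footprint occurs at the closest permitted viewpoint and the most face-on viewing angle. The pinhole projection model, resolution, and field-of-view figures below are illustrative assumptions only.

```python
import math

def worst_case_screen_size(edge_world, closest_distance,
                           screen_height_px=1080, vertical_fov_deg=60.0):
    """Rough upper bound on the screen-space size (in pixels) of a micropolygon edge.

    edge_world       -- edge length of the micropolygon in world units
    closest_distance -- distance from the closest permitted viewpoint to the micropolygon
    A face-on (worst-case) viewing angle is assumed, so no foreshortening is applied.
    """
    pixels_per_world_unit = screen_height_px / (
        2.0 * closest_distance * math.tan(math.radians(vertical_fov_deg) / 2.0))
    return edge_world * pixels_per_world_unit

# A 2 mm micropolygon that can never be approached closer than 2 m
# stays at roughly one pixel under these assumptions.
print(round(worst_case_screen_size(0.002, 2.0), 2))
```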

[0028] Based upon the viewpoint limitations specified during step 102, certain micropolygons may be eliminated from the 3D graphics data if they cannot influence visible pixels in the final image (step 104B). For example, if the 3D graphics data contains a tree and the viewer's viewpoint is limited to positions and orientations far above the tree, then the leaves on the lower branches of the tree may be eliminated (i.e., culled) if there is no way the viewer could possibly see them from within the specified viewpoint limitation volume.

[0029] Next, the selected micropolygons that are visible from valid viewpoints may be "mostly lit" and colored (step 104C). The micropolygons are "mostly lit" by applying viewpoint-independent lighting and then calculating color information for each micropolygon vertex. The micropolygons are referred to as "mostly lit" because some final lighting that is viewpoint-dependent is not applied at this stage. Instead, the final viewpoint-dependent component of the lighting (e.g., reflections and specular highlighting) may be applied during playback. This may be changed on a per-frame basis. For example, some frames or scenes may not have any viewpoint-dependent lighting components. Others may have a large number of viewpoint-dependent lighting components. In some embodiments, the lighting and coloring calculations may be performed in world space (other spaces such as object space may also be used). Included in the lighting and color calculations are non-viewpoint-dependent aspects of processes such as texture mapping, most shadowing, some atmospheric effects, and some depth of field effects (step 104D). For example, shadows cast by objects are typically viewpoint-independent. Thus, in some embodiments the determination of whether a particular light source contributes completely, partially, or not at all to the lighting calculation may be performed at this stage. Viewpoint-dependent components such as specular components can be deferred until playback.

[0030] As used herein, a calculation is "viewpoint-independent" if the calculation may be performed without reference to the particular viewpoint used by the viewer to view the scene, or if the calculation may be performed using only the viewpoint limitations defined for the scene. For example, if the viewpoint is limited to the interior of a car, trees that are in the distant background may have depth of field effects applied during partial rendering. This is because the range of permissible viewpoints is small relative to the distance from all possible viewpoints to the trees.

[0031] Once partial rendering (step 104) has been completed, each individual frame of micropolygons (still in world space) may be compressed (step 106). While a number of different types of compression are possible, geometry compression may be particularly effective for reducing the storage requirements of the micropolygons. One such method of geometry compression is described in U.S. Patent No. 5,793,371, entitled "Method and Apparatus for Geometric Compression of Three-Dimensional Graphics Data," issued on August 11, 1998.

[0032] With the enormous number of micropolygons generated by each frame, geometry compression is useful to be able to reduce the storage requirements for the frames. During playback, use of compression also greatly reduces the bandwidth necessary for real-time transfer of the frames from disk. To render the compressed geometry in real-time, a compression format that supports hardware decompression may be particularly useful.

[0033] Finally, the compressed frames may be output to a carrier medium (step 108). In some embodiments, the compressed frames may be stored to a computer-readable medium such as RAM, ROM, flash memory, CD-ROM, DVD-ROM, or magnetic disk. In one embodiment, the frames may be streamed to an array of DVD-ROM recorders (e.g., in a RAID configuration). In other embodiments, the compressed frames may be streamed to a transmission medium such as a computer network line or wireless/satellite link coupled to the internet or broadcast (e.g., VHF or UHF).

[0034] Depending upon the amount of data to be stored and the speed of the playback hardware, each frame may be stored as a completely separate set of micropolygons. In another embodiment, portions of some frames may be reused. In still other embodiments, only key frames may be completely stored (with intermediate frames being interpolated during playback from key frames).

[0035] Note that the flowchart depicted in Figure 1 is for explanatory purposes and is not meant to be limiting. In some embodiments, the steps may be performed in a different order, in parallel (e.g., steps 102 and 104A), or some steps may be eliminated (e.g., step 104A or the shadowing of step 104D). Additional steps may also be performed.

[0036] In some embodiments, each frame may be stored as a separate compressed object in a separate file (e.g., 3 megabytes in size). For playback, these individual files may then be loaded into main memory and rendered sequentially. Depending upon the exact embodiment, each frame may have one or more of the following associated with it: (i) a list of specular light sources, (ii) a viewpoint-limiting volume constraining allowable viewpoints, (iii) a list of specular materials, (iv) an array of compressed geometry objects, and (v) a list of bindings of specular materials and light sources to compressed geometry objects.
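A sketch of the kind of per-frame record enumerated in [0036] appears below. The five items come from the paragraph above; the field names, types, and the Python dataclass form are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class FrameRecord:
    """One frame stored as a separate compressed object."""
    specular_lights: List[str] = field(default_factory=list)            # (i)
    viewpoint_limit_volume: Tuple[Vec3, Vec3] = ((0.0, 0.0, 0.0),
                                                 (1.0, 1.0, 1.0))        # (ii)
    specular_materials: List[str] = field(default_factory=list)         # (iii)
    compressed_geometry: List[bytes] = field(default_factory=list)      # (iv)
    bindings: List[Tuple[int, int, int]] = field(default_factory=list)  # (v) geometry/material/light

frame = FrameRecord(specular_lights=["headlight"],
                    compressed_geometry=[b"\x00" * 64])
print(len(frame.compressed_geometry[0]), "bytes of compressed geometry in this frame")
```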

[0037] As previously noted, in some embodiments the director/cinematographer may constrain the viewer to a particular region to prevent the viewer from wandering randomly around the virtual set. Depending upon the desired effect, such a region may still allow for movement of the viewer's body within a several-foot volume for effective stereo display. The micropolygon generation need not be uniform in world space (e.g., it may be uniform within a range of tolerances over the limited sighting angles). This allows objects much further away to be tessellated into correspondingly larger micropolygons. There is also the opportunity for backface and occlusion culling of objects or portions of objects that are not visible from any allowed view angle on a given frame. This may also be useful when capturing micropolygons from the physical world. 3D cameras need not capture all possible surfaces. Instead, only those potentially visible from the possible view angles may be captured.

Geometry Shader

[0038] Many programmable shaders, e.g., the surface shader in Pixar's RenderMan application, are a superset of the processing needed for partially rendering 3D movies as described above. In these programmable shaders, surfaces are typically tessellated into polygons (e.g., quadrilaterals or triangles) that meet specified criteria such as surface curvature, size in screen space, or size in world space. Given explicit constraints on playback viewpoint and screen resolution, screen-space size criteria may also be chosen to generate controlled screen-space-size polygons at playback. Typical shader programs then apply a lighting and shading model to these micropolygons, including texture and shadow map look-up, ambient and diffuse lighting calculations, or even radiosity computations. These calculations are typically applied in world space if effects such as displacement shaders are to be supported. Normally, as a final step, the micropolygon is projected into screen space, and then rasterized into a z-buffered frame buffer.

[0039] However, some shading languages explicitly support the concept of a geometry shader, in which the final projection and rasterization steps are not applied. Instead, the lit micropolygons are output. For example, the print command in Pixar's RenderMan version 3.8 may be used to extract lit micropolygons. Because RenderMan internally defers final mesh connectivity until the rasterization stage, the world-space micropolygons generated by this print command are unstitched quad meshes from a higher-order surface tessellation. As a post process, the connectivity information may be recomputed and stitching strips may be added to fill in the cracks. This may significantly increase (e.g., double) the number of micropolygons. An intelligent re-tessellator may be used to reduce this overhead.

[0040] To handle the deferral of specular lighting to playback time, a custom shader that excludes specular computations may be used. To make sure that occlusion culling is not overly aggressive, all opaque surfaces may be set to be very slightly transparent (e.g., 0.9999). Otherwise, the custom shader may differ little from more traditional RenderMan shaders. Effects applied after the micropolygons are extracted (e.g., motion blur and final anti-aliasing) do not show up in the compressed triangle stream. Some view-dependent effects, such as reflection or environment mapping, and atmospheric shaders may be excluded or used with care. Transparency may also require particular attention.

[0041] Alternately, colored micropolygons can be extracted from real-world scenes by the use of 3D scanners or cameras. Eventually, real-time 3D capture, conversion to geometry, compression, and transmission may support a form of 3D television. While such geometry acquisition is technically challenging, the same method described herein could be used for playback.

[0042] A geometry compressor may be added as an output processor on the stream of micropolygons generated by RenderMan. In one embodiment, the triangles may have individual vertex normal and color components. Changes in specular color and specular power may be encoded at a higher object level for use at run-time. The same may be done with the color and placement of the specular light sources. Shadowing of objects from specular lights can be achieved by breaking up each object by the light sources used. In some embodiments with both per-vertex colors and normals present, the compressed geometry may average about 7 bytes per triangle. Other sizes and compression ratios are also possible and contemplated. For example, more specialized microtriangle compression techniques may potentially be able to reduce the average microtriangle storage requirement to 2 or 3 bytes.

[0043] One possible rendering configuration may allow a "budget" of one quarter of a million triangles per frame and a playback rate of 24 frames per second. Other values and rates are also contemplated, e.g., 400 thousand triangles per frame and a playback rate of 15 Hz. Assuming each triangle comprises 7 bytes of information, a sustained data transfer rate of about 42 megabytes per second may be useful in such an embodiment. One possible method for achieving this transfer rate is an array of three-way interleaved fiber-channel disk drives or possibly an eight-way interleaved array of consumer 5X DVD drives. With a storage capacity of 8.5 gigabytes per side, eight DVD disks would support 17 minutes of continuous playback. Of course, faster 3D rendering hardware would change these numbers, but so would better compression techniques. Other configurations and data rates are also possible and contemplated.
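The data-rate figure in [0043] follows from simple arithmetic, reproduced below with the numbers quoted in the text (a quarter of a million 7-byte triangles per frame at 24 frames per second). The seconds-per-gigabyte line is a derived figure, not one stated in the patent.

```python
# Transfer-rate arithmetic for the rendering budget quoted above.
triangles_per_frame = 250_000      # one quarter of a million triangles per frame
bytes_per_triangle = 7             # compressed geometry with per-vertex normal and color
frames_per_second = 24

rate = triangles_per_frame * bytes_per_triangle * frames_per_second
print(f"sustained transfer rate: {rate / 1e6:.0f} megabytes per second")   # 42

# At that rate each gigabyte of storage holds roughly 24 seconds of playback.
print(f"seconds of playback per gigabyte: {1e9 / rate:.0f}")
```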


[0044] Note that each of the methods described herein may be implemented solely in software, solely in hardware, or as a combination of both software and hardware.

Method for Playing Back Three-Dimensional Movies - Figure 2

[0045] Figure 2 is a flowchart illustrating one embodiment of a method for playing back three-dimensional (3D) movies generated using the method of Figure 1. First, a stream of compressed frames is read and decompressed from a carrier medium (step 120). As previously noted, the carrier medium may be either a transmission medium or a computer-readable medium. In one embodiment, the 3D movie may be distributed across the internet. In another embodiment, the 3D movie may be distributed as a set of DVD-ROMs that are configured to be simultaneously played back using an array of DVD-ROM players in a RAID configuration.

[0046] During playback, the playback system may be configured to receive viewer input as to where the viewer desires their viewpoint to be and (optionally) in which direction they desire to look (step 124). This information may be received in a number of different ways. For example, a head-tracking sensor may be used to determine the position of a viewer relative to the display device. Similarly, a data glove, wand, joystick, or keyboard may be used to receive information as to where the viewer desires the viewpoint to be positioned. Other alternatives include head-tracking using an infrared reflective dot placed on the user's forehead, or using a pair of glasses with head- and/or eye-tracking sensors built in. One method for using head- and hand-tracking is disclosed in U.S. Patent No. 5,446,834 (entitled "Method and Apparatus for High Resolution Virtual Reality Systems Using Head Tracked Display," by Michael Deering, issued August 29, 1995). Other methods for head tracking are also possible and contemplated (e.g., infrared sensors, electromagnetic sensors, capacitive sensors, video cameras, sonic and ultrasonic detectors, clothing-based sensors, video tracking devices, conductive ink, strain gauges, force-feedback detectors, fiber optic sensors, pneumatic sensors, magnetic tracking devices, and mechanical switches). This viewer viewpoint-related information may be received in real-time (e.g., at or near the frame playback rate) or near-real-time.

[0047] For each frame, the viewer's viewpoint information may be processed along with the previous viewpoint information and the viewpoint limitation read from the carrier medium. For example, hysteresis may be implemented so that subtle movements of the viewer's head (in embodiments utilizing a head-tracker for viewpoint information) do not introduce a "jitter" in the viewpoint. As part of this processing, the playback system may be configured to compare the calculated viewpoint with the viewpoint limitation information (step 126). If the viewer's specified viewpoint for a particular frame is outside the particular frame's viewpoint-limiting volume, then the system may be configured to force or limit the current viewpoint to the closest viewpoint to the viewer's specified viewpoint that meets the frame's specified viewpoint limitations (step 128).

[0048] Next, once the viewpoint for the frame has been determined, the playback system may be configured to apply viewpoint-dependent lighting and other viewpoint-dependent optional effects, e.g., atmospheric effects, specular highlights, reflections, and depth of field effects (step 130). The amount of processing may be varied on a per-frame basis to maintain a minimum desired frame rate. In some embodiments, the 3D movie producers may also dynamically adjust the amount of viewpoint-dependent processing necessary to maintain a minimum frame rate during playback (e.g., based on a particular minimum playback system configuration). Other embodiments of the playback system may be configured to perform as much viewpoint-dependent processing as possible given a predetermined minimum frame rate. If the frame rate is about to drop below the minimum, the playback system may be configured to reduce or eliminate certain viewpoint-dependent effects (e.g., eliminating any reflections).

[0049] Next, the micropolygons may be translated from world space to screen space based on the viewpoint for the particular frame (step 132). The screen-space micropolygons may then be rasterized into a buffer (e.g., a frame buffer or sample buffer) for eventual display (step 136).

[0050] A number of different displays may be used in the playback system. For example, head-mounted displays, traditional CRT displays, gas plasma displays, LCD displays, traditional projectors (CRT, LCD, or digital micromirror), or ocular projectors that project images directly onto the viewer's retina may all be used. If the display device is not head-mounted, glasses may be worn by viewers to separate the image perceived by each eye. The glasses may be passive color filters (e.g., red-blue or red-green), passive polarized, or active (e.g., LCD-shutter type). In some embodiments using non-head-mounted displays, such eyeglasses may be optional.

[0051] In some embodiments, the micropolygons may be rasterized into a sample buffer that stores multiple samples per display pixel. These samples may then be filtered to generate visually pleasing anti-aliased images. In addition, some embodiments may be configured to support multiple viewers simultaneously. Each viewer may have their own viewpoint and display. The playback system may be configured to render a different stereo image for each viewer based upon each viewer's particular viewpoint.

[0052] In some more complex embodiments, the playback system could perform some viewpoint-dependent occlusion-culling operations to reduce the depth complexity. If objects were compressed at multiple levels of detail and decoded in a viewpoint-dependent method, then the constraints on allowable viewing positions could be reduced (at the cost of increasing the amount of data stored and transferred for each frame). Objects with non-specular shading that does not change between frames (e.g., background objects) may be transmitted or stored only at key frames in the file (e.g., similar to MPEG-2 compression I-frames). In other embodiments, some dynamically animated objects may be included in frames to add an additional level of interaction. Branch points to different animation or movie sequences could also be added (e.g., like those used in CD-I). Compressed audio may also be read with the compressed 3D movie frames, and the audio may be manipulated to reflect the viewer's ear locations (as provided by the head-tracker) and to improve realism.
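A schematic loop tying together playback steps 120 through 136 is sketched below. The stage functions are trivial placeholders with assumed names, and the frame-rate guard that sheds reflections is only a crude rendering of the adaptive behavior described in [0048].

```python
import time

# Placeholder stages; real implementations would perform geometry decompression,
# viewpoint-dependent lighting, world-to-screen transformation, and rasterization.
def decompress(frame_bytes): return list(frame_bytes)
def apply_viewpoint_dependent_effects(mp, viewpoint, reflections=True): return mp
def transform_to_screen_space(mp, viewpoint): return mp
def rasterize(mp): return mp
def display(image): pass
def limit_viewpoint(viewpoint, limits): return viewpoint   # see the clamping sketch earlier

def playback_loop(frames, read_viewpoint, limits, min_fps=24.0):
    drop_reflections = False
    for compressed in frames:
        start = time.perf_counter()
        micropolygons = decompress(compressed)                      # step 120
        viewpoint = limit_viewpoint(read_viewpoint(), limits)       # steps 124-128
        lit = apply_viewpoint_dependent_effects(
            micropolygons, viewpoint, reflections=not drop_reflections)  # step 130
        screen = transform_to_screen_space(lit, viewpoint)          # step 132
        display(rasterize(screen))                                  # step 136
        # If this frame ran long, shed the most expensive viewpoint-dependent
        # effect on the next one to hold the minimum frame rate.
        drop_reflections = (time.perf_counter() - start) > (1.0 / min_fps)

playback_loop([b"\x00" * 8, b"\x01" * 8],
              read_viewpoint=lambda: (0.0, 0.0, 0.0), limits=None)
```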

Playback System - Figures 3-6

[0053] Referring now to Figure 3, one embodiment of a playback system 80 configured to play back 3D movies is shown. The playback system may be implemented as any of a number of different systems, including a computer system, network PC, Internet appliance, a television (including HDTV systems and interactive television systems), personal digital assistants (PDAs), video display glasses, and other devices which are capable of receiving compressed frames and displaying 3D graphics.

[0054] As shown in the figure, the playback system 80 comprises a system unit 82 configured to read, decompress, and rasterize compressed 3D movies, and a video monitor or display device 84 coupled to the system unit 82. As previously noted, the display device 84 may be any of various types of display monitors (head-mounted or non-head-mounted). Optional eyeglasses 90 may be used with non-head-mounted displays. As described above, the glasses may be passive or active. Various input devices may be connected to the computer system, including a keyboard 86 and/or a mouse 88, or other input device (e.g., a trackball, digitizer, tablet, six-degree-of-freedom input device, head tracker, eye tracker, data glove, body sensors, etc.). Application software may be executed by the computer system 80 to read, decompress, and display the 3-D movies on display device 84. In one embodiment, playback system 80 may comprise an array of storage devices (e.g., an array of eight 5X DVD-ROM drives coupled in a RAID configuration) configured to read compressed 3D movies from computer-readable media. Other embodiments are possible and contemplated.

[0055] Figure 4 illustrates another embodiment of a 3D movie playback system 70. In this embodiment, the system comprises a head-mounted display device 72, head-tracking sensors 74, and a data glove 76. Head-mounted display 72 may be coupled to system unit 82 via a fiber optic link 94, or one or more of the following: an electrically-conductive link, an infra-red link, or a wireless (e.g., RF) link. In this embodiment, system unit 82 does not utilize a storage array. Instead, system unit 82 receives compressed 3D movies via broadcast, internet link, or satellite link (e.g., at a rate of 42 megabytes per second). Other embodiments are possible and contemplated.

[0056] Figure 5 illustrates another embodiment of a 3D movie playback system 60. In this embodiment, a digital projector 62 is configured to read, decompress, and rasterize 3D movies. Projector 62 is configured to project the 3D movie onto a display screen 64. Viewers may view display screen 64 with or without optional color filter glasses 68. Head-tracking sensor 66 is configured to determine the position of multiple viewers' heads with respect to display screen 64. For example, head-tracking sensor 66 may detect viewers' positions by an infra-red reflective dot placed on each viewer's forehead (or on optional glasses 68). In multi-viewer embodiments, projector 62 may be configured to perform multiplexing of the image projected onto screen 64. For example, with two viewers, four-way multiplexing may be performed (e.g., viewer 1 left eye, viewer 1 right eye, viewer 2 left eye, viewer 2 right eye).

[0057] Figure 6 illustrates yet another embodiment of playback system 50. In this embodiment, playback system 50 is implemented as a coffee table. A 3D image 54 is projected from top surface 52 (e.g., by a gas plasma display). As viewers move around the table, the 3D image changes to reflect the viewer's current position as detected by integral IR head-tracking sensors 56. The representation of image 54 is shown as perceived from viewpoint 58 (i.e., typical gas plasma displays cannot actually project an image into space, so image 54 is viewed from above by looking into top surface 52).

System Block Diagrams - Figures 7-8

[0058] Referring now to Figure 7, a simplified block diagram illustrating the playback system of Figure 1 is shown. Elements of the system that are not necessary for an understanding of the present invention are not shown for convenience. As shown, playback system 80 includes a central processing unit (CPU) 102 coupled to a high-speed memory bus or system bus 104, also referred to as the host bus 104. A system memory 106 may also be coupled to high-speed bus 104.

[0059] Host processor 102 may comprise one or more processors of varying types, e.g., microprocessors, multi-processors and CPUs. The system memory 106 may comprise any combination of different types of memory subsystems, including random access memories (e.g., static random access memories or "SRAMs", synchronous dynamic random access memories or "SDRAMs", and Rambus dynamic random access memories or "RDRAMs", among others) and mass storage devices (e.g., DVD-ROM arrays, hard drive arrays, etc.). The system bus or host bus 104 may comprise one or more communication or host computer buses (for communication between host processors, CPUs, and memory subsystems) as well as specialized subsystem buses.
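Returning to the multi-viewer projector of [0056], the four-way field ordering it describes can be sketched as a simple repeating schedule. The function name and tuple format are assumptions; only the viewer/eye ordering comes from the text.

```python
from itertools import cycle

def stereo_field_schedule(num_viewers):
    """Order in which per-eye images are time-multiplexed onto a shared screen."""
    return [(viewer, eye)
            for viewer in range(1, num_viewers + 1)
            for eye in ("left", "right")]

# With two viewers this gives: viewer 1 left eye, viewer 1 right eye,
# viewer 2 left eye, viewer 2 right eye, then the cycle repeats.
schedule = cycle(stereo_field_schedule(2))
for _ in range(6):
    print(next(schedule))
```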

[0060] A 3-D graphics system or graphics system 112 may be coupled to the high-speed memory bus 104. The 3-D graphics system 112 may be coupled to the bus 104 by, for example, a crossbar switch or other bus connectivity logic. It is assumed that various other peripheral devices, or other buses, may be connected to the high-speed memory bus 104. Optional head-tracker 114 may be coupled to memory bus 104 or directly to graphics system 112. Optional network adapter/communications device 116 may also be coupled to bus 104 or directly to graphics system 112. It is noted that the 3-D graphics system may be coupled to one or more of the buses in playback system 80 and/or may be coupled to various types of buses. In addition, the 3D graphics system may be coupled to a communication port and thereby directly receive graphics data from an external source, e.g., the Internet or a network. As shown in the figure, display device 84 is connected to the 3-D graphics system 112 comprised in playback system 80.

[0061] Host CPU 102 may transfer information to and from the graphics system 112 according to a programmed input/output (I/O) protocol over host bus 104. Alternately, graphics system 112 may access the memory subsystem 106 according to a direct memory access (DMA) protocol or through intelligent bus mastering.

[0062] The graphics system may receive graphics data from any of various sources, including the host CPU 102 and/or the system memory 106, other memory or computer-readable media (e.g., a DVD-ROM array), or from an external source such as a transmission medium, e.g., a network, the Internet, or a 3D television broadcast.

[0063] Note that while graphics system 112 is depicted as part of computer system 80, graphics system 112 may also be configured as a stand-alone device (e.g., with its own built-in display or link to a projector or head-mounted display). Graphics system 112 may also be configured as a single-chip device or as part of a system-on-a-chip or a multi-chip module.

[0064] Figure 8 illustrates yet another embodiment of 3D movie playback system 80. In this embodiment, playback system 80 comprises input unit 210, which may be a network connection, a bus adapter, bus interface logic, or other input mechanism with which to receive data. Input unit 210 may be configured to receive a stream of frames of a compressed 3D movie from a network, a modem, or other storage or transmission device. The frames received by input unit 210 are conveyed to decompressor 212 for decompression. Decompressor 212 may be a hardware decompressor (preferred) or a software decompressor. The decompressed frames are then conveyed to graphics processor 214, which is configured to translate the constituent micropolygons of the decompressed frame from world space to screen space. This may be performed using a viewpoint determined from information provided by head-tracking sensor 218. Once transformed to screen space, graphics processor 214 applies any viewpoint-dependent lighting effects. The completely lit micropolygons are then rasterized by graphics processor 214 into sample/frame buffer 216. Display device 84 is refreshed from sample/frame buffer 216.

[0065] Although the embodiments above have been described in considerable detail, other versions are possible. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications. Note that the headings used herein are for organizational purposes only and are not meant to limit the description provided herein or the claims attached hereto.

Claims

1. A method for generating three-dimensional movies comprising:

receiving three-dimensional graphics data representing a scene;
specifying viewpoint limitations for said scene;
partially rendering, in a primarily viewpoint-independent manner, one or more frames representing said scene;
compressing said one or more partially rendered frames; and
outputting said compressed frames to a carrier medium.

2. The method as recited in claim 1, wherein said partially rendering comprises tessellating said three-dimensional graphics data into micropolygons to meet one or more predetermined criteria.

3. The method as recited in claim 2, wherein said one or more predetermined criteria comprise smoothness and a predetermined size function for micropolygon size in world space.

4. The method as recited in claim 2, wherein said one or more predetermined criteria comprise estimated worst-case micropolygon size in screen space.

5. The method as recited in claim 4, wherein said estimated worst-case micropolygon size in screen space is estimated using the distance from a closest permitted viewpoint to a given micropolygon.

6. The method as recited in claim 4, wherein said estimated worst-case size in screen space is estimated using the combination of a worst-case distance from a closest permitted viewpoint to a given micropolygon and a worst-case viewing angle from said viewpoint to a given micropolygon.


7. The method as recited in any preceding claim, wherein said partially rendering comprises selecting micropolygons visible from viewpoints complying with said viewpoint limitations.

8. The method as recited in any preceding claim, wherein said partially rendering comprises performing a subset of lighting calculations for said selected micropolygons, wherein said subset of lighting calculations are mostly viewpoint-independent.

9. The method as recited in claim 8, wherein said subset of lighting calculations includes viewpoint-independent shadowing.

10. The method as recited in claim 8, wherein said subset of lighting calculations is selected from the group comprising: viewpoint-independent reflections, viewpoint-independent depth of field effects, viewpoint-independent bump mapping, and viewpoint-independent displacement mapping.

11. The method as recited in any preceding claim, wherein said specifying viewpoint limitations comprises specifying a viewpoint-limiting volume.

12. The method as recited in claim 11, wherein said viewpoint-limiting volume has a depth of zero and is a viewpoint-limiting surface.

13. The method as recited in any preceding claim, wherein said specifying viewpoint limitations comprises specifying a set of valid viewpoints and a set of valid viewpoint orientations.

14. The method as recited in any preceding claim, wherein said graphics data represents one or more three-dimensional graphics objects, and wherein said method further comprises tessellating said graphics data into a plurality of micropolygons in world space.

15. The method as recited in any preceding claim, wherein said rendering further comprises applying one or more lighting effects to said micropolygons, wherein said one or more lighting effects are viewpoint-independent.

16. The method as recited in any preceding claim, wherein said carrier medium is selected from the group comprising: transmission media and computer-readable media.

17. The method as recited in any preceding claim, wherein said compressed frames are configured to be read in a stream and rasterized in real-time using user-controlled viewpoint information.

18. The method as recited in any preceding claim, wherein said compressed frames are configured to be rasterized in real-time by applying viewpoint-dependent lighting effects.

19. The method as recited in any preceding claim, wherein each compressed frame comprises separate data.

20. The method as recited in any preceding claim, further comprising outputting said viewpoint limitations to said carrier medium.

21. The method as recited in any preceding claim, further comprising indicating when one or more portions of a particular frame may be reused in a subsequent frame.

22. The method as recited in any preceding claim, wherein only key frames are stored and intermediate frames are interpolated therefrom.

23. A method for playing back three-dimensional movies comprising:

reading a plurality of compressed frames from a carrier medium;
decompressing said compressed frames into a plurality of world-space micro-polygons;
inputting viewer-controlled viewpoint information;
translating said micropolygons to screen-space;
rasterizing said micropolygons to generate a frame; and
displaying said frame on a display device.

24. The method as recited in claim 23, further comprising:

applying viewpoint-dependent lighting effects to said micropolygons; and
applying viewpoint-dependent atmospheric effects to said micropolygons.

25. The method as recited in any one of claims 23 or 24, further comprising applying viewpoint-dependent texture effects to said micropolygons.

26. The method as recited in any one of claims 23 to 25, further comprising applying viewpoint-dependent reflection effects to said micropolygons.

27. The method as recited in any one of claims 23 to 26, further comprising comparing said viewer-controlled viewpoint information with viewpoint limitations read from said carrier medium.


28. The method as recited in any one of claims 23 to 27, further comprising forcing said viewer-controlled viewpoint information to comply with viewpoint limitations read from said carrier medium.

29. The method as recited in any one of claims 23 to 28, wherein said translating, rasterizing, and displaying are performed in real-time.

30. The method as recited in any one of claims 23 to 29, wherein each decompressed frame is translated to screen space, rasterized, and displayed in visual stereo.

31. The method as recited in any one of claims 23 to 30, further comprising:

reading a stream of compressed audio from said carrier medium;
decompressing said stream of compressed audio;
processing said decompressed audio to reflect said real-time viewer-controlled ear location information; and
playing back said decompressed and processed audio.

32. A software program configured to generate three-dimensional movies, wherein said software program comprises a plurality of instructions, wherein said plurality of instructions are configured to:

receive three-dimensional graphics data representing a scene;
specify viewpoint limitations for said scene based on viewer input;
partially render one or more frames representing said scene;
compress said one or more partially-rendered frames; and
output said compressed frames to a first carrier medium.

33. A software program configured to playback three-dimensional movies, wherein said software program comprises a plurality of instructions, wherein said plurality of instructions are configured to:

read a plurality of compressed frames from a first carrier medium;
decompress said compressed frames into a plurality of world-space micro-polygons;
input viewer-controlled viewpoint information;
translate said micropolygons to screen-space;
rasterize said micropolygons to generate a frame; and
display said frame on a display device.

34. The software program as recited in claim 32 or claim 33, wherein said first carrier medium is selected from the group comprising: (i) computer-readable media and (ii) transmission media.

35. The software program as recited in any one of claims 32 to 34, embodied on a second carrier medium.

36. The software program as recited in claim 35, wherein said second carrier medium is selected from the group consisting of: (i) computer-readable media and (ii) transmission media.

37. The software program as recited in claim 35 or claim 36, wherein the first and second carrier media are the same carrier medium.

38. A graphics system configured to playback previously generated three-dimensional movies, said graphics system comprising:

an input means for reading a stream of compressed frames from a carrier medium;
a decompressor coupled to receive and decompress said compressed frames into a plurality of partially-rendered world-space micro-polygons;
a graphics processor coupled to receive said partially-rendered world-space micropolygons; and
a sensor coupled to provide said graphics processor with viewer viewpoint information,
wherein said graphics processor is configured to translate said world-space micropolygons to screen-space using said viewer viewpoint information, and wherein said graphics processor is configured to rasterize said screen-space micropolygons to generate one or more 3D frames in stereo.

39. The system as recited in claim 38, wherein said input means is further configured to receive viewpoint limitation information with said compressed frames, wherein said graphics processor is configured to force said viewer viewpoint information to comply with said viewpoint limitation information.

40. The system as recited in claim 38 or claim 39, further comprising a display device and a sample buffer, wherein said sample buffer is coupled between said graphics processor and said display device, wherein said sample buffer is super-sampled relative to the number of pixels displayable by said display device.

41. The system as recited in any one of claims 38 to 40, wherein said graphics processor is further configured to perform a subset of lighting calculations on said partially-rendered world-space micro-polygons.

42. The system as recited in any of claims 38 to 41, wherein said subset of lighting calculations includes all viewpoint-dependent lighting calculations.

43. The system as recited in any of claims 38 to 42, wherein said viewpoint-dependent calculations are selected from the group comprising: viewpoint-dependent reflections and viewpoint-dependent depth of field effects.
