Reality Engine Graphics
Total Page:16
File Type:pdf, Size:1020Kb
RealityEngine Graphics Kurt Akeley Silicon Graphics Computer Systems Abstract Silicon Graphics Iris 3000 (1985) and the Apollo DN570 (1985). Toward the end of the ®rst-generation period advancesin technology The RealityEngineTM graphics system is the ®rst of a new genera- allowed lighting, smooth shading, and depth buffering to be imple- tion of systems designed primarily to render texture mapped, an- mented, but only with an order of magnitude less performance than tialiased polygons. This paper describes the architecture of the was available to render ¯at-shaded lines and polygons. Thus the RealityEngine graphics system, then justi®es some of the decisions target capability of these machines remained ®rst-generation. The made during its design. The implementation is near-massively par- Silicon Graphics 4DG (1986) is an example of such an architecture. allel, employing 353 independent processors in its fullest con®gura- tion, resulting in a measured ®ll rate of over 240 million antialiased, Because ®rst-generation machines could not ef®ciently eliminate texture mapped pixels per second. Rendering performance exceeds hidden surfaces, and could not ef®ciently shade surfaces even if the 1 million antialiased, texture mapped triangles per second. In ad- application was able to eliminate them, they were more effective dition to supporting the functions required of a general purpose, at rendering wireframe images than at rendering solids. Begin- high-end graphics workstation, the system enables realtime, ªout- ning in 1988 a second-generation of graphics systems, primarily the-windowº image generation and interactive image processing. workstations rather than terminals, became available. These ma- chines took advantage of reduced memory costs and the increased availability of ASICs to implement deep framebuffers with multiple CR Categories and Subject Descriptors: I.3.1 [Computer rendering processors. These framebuffers had the numeric ability Graphics]: Hardware Architecture; I.3.7 [Computer Graphics]: to interpolate colors and depths with little or no performance loss, Three-Dimensional Graphics and Realism - color, shading, shad- and the memory capacity and bandwidth to support depth buffering owing, and texture with minimal performance loss. They were therefore able to render solids and full-frame scenesef®ciently, as well as wireframe images. The Silicon Graphics GT (1988)[11] and the Apollo DN590 (1988) 1 Introduction are early examples of second-generation machines. Later second- generation machines, such as the Silicon Graphics VGX[12] the This paper describes and to a large extent justi®es the architecture Hewlett Packard VRX, and the Apollo DN10000[4] include texture chosen for the RealityEngine graphics system. The designers think mapping and antialiasing of points and lines, but not of polygons. of this system as our ®rst implementation of a third-generation Their performances are substantially reduced, however, when tex- graphics system. To us a generation is characterized not by the ture mapping is enabled, and the texture size (of the VGX) and scope of capabilities of an architecture, but rather by the capabili- ®ltering capabilities (of the VRX and the DN10000) are limited. ties for which the architecture was primarily designed ± the target capabilities with maximized performance. Because we designed The RealityEngine system is our ®rst third-generation design. Its our ®rst machine in the early eighties, our notion of ®rst generation target capability is the rendering of lighted, smooth shaded, depth corresponds to this period. Floating point hardware was just be- buffered, texture mapped, antialiased triangles. The initial target coming available at reasonableprices, framebuffer memory was still performance was 1/2 million such triangles per second, assuming quite expensive, and application-speci®cintegrated circuits (ASICs) the triangles are in short strips, and 10 percent intersect the viewing were not readily available. The resulting machines had workable frustum boundaries. Textures were to be well ®ltered (8-sample lin- transformation capabilities, but very limited framebuffer process- ear interpolation within and between two mipmap[13] levels) and ing capabilities. In particular, smooth shading and depth buffering, 1024 large enough ( 1024 ) to be usable as true images, rather which require substantial framebuffer hardware and memory, were than simply as repeated textures. Antialiasing was to result in high- not available. Thus the target capabilities of ®rst-generation ma- quality images of solids, and was to work in conjunction with depth chines were the transformation and rendering of ¯at-shaded points, buffering, meaning that no application sorting was to be required. lines, and polygons. These primitives were not lighted, and hidden Pixels were to be ®lled at a rate suf®cient to support 30Hz ren- surface elimination, if required, was accomplished by algorithms dering of full-screen images. Finally, the performance on second- implemented by the application. Examples of such systems are the generation primitives (lighted, smooth shaded, depth buffered) was 2011 N. Shoreline Blvd., Mountain View, CA 94043 USA, [email protected] to be no lower than that of the VGX, which renders roughly 800,000 such mesh triangles per second. All of these goals were achieved. The remainder of this paper is in four parts: a description of the architecture, some speci®cs of features supported by the architec- ture, alternatives considered during the design of the architecture, and ®nally some appendixes that describe performance and imple- mentation details. buffer, with the pixel assignments ®nely interleaved to insure that even small triangles are partially rasterized by each of the Fragment System Bus Generators. Each Fragment Generator computes the intersection of Command geometry the set of pixels that are fully or partially covered by the triangle and Processor board the set of pixels in the framebuffer that it is responsible for, gener- ating a fragment for each of these pixels. Color, depth, and texture Geometry coordinates are assigned to each fragment based on the initial and Engines slope values computed by the Geometry Engine. A subsample mask Triangle Bus is assigned to the fragment based on the portion of each pixel that Fragment Generators is covered by the triangle. The local copy of the texture memory is indexed by the texture coordinates, and the 8 resulting samples are Image reduced by linear interpolation to a single color value, which then Engines modulates the fragment's color. The resulting fragments, each comprising a pixel coordinate, a color, a depth, and a coverage mask, are then distributed to the raster memory board raster memory board Image Engines. Like the Fragment Generators, the Image Engines are each assigned a ®xed subset of the pixels in the framebuffer. These subsets are themselves subsets of the Fragment Generator allocations, so that each Fragment Generator communicates only video display generator board with the 16 Image Engines assigned to it. Each Image Engine manages its own dynamic RAM that implements its subset of the framebuffer. When a fragment is received by an Image Engine, Figure 1. Board-level block diagram of an intermediate con®gu- its depth and color sample data are merged with the data already ration with 8 Geometry Engines on the geometry board, 2 raster stored at that pixel, and a new aggregate pixel color is immediately memory boards, and a display generator board. computed. Thus the image is complete as soon as the last primitive has been rendered; there is no need for a ®nal framebuffer operation to resolve the multiple color samples at each pixel location to a single displayable color. 2 Architecture Before describing each of the rendering operations in more detail, we make the following observations. First, after it is separated by The RealityEngine system is a 3, 4, or 6 board graphics accelerator the Command Processor, the stream of rendering commands merges that is installed in a MIPS RISC workstation. The graphics system only at the Triangle Bus. Second, triangles of suf®cient size (a and one or more MIPS processors are connected by a single system function of the number of raster memory boards) are processed by bus. Figure 1 is a board-level block diagram of the RealityEngine almost all the processors in the system, avoiding only 5, 7, or 11 graphics accelerator. The geometry board comprises an input FIFO, Geometry Engines. Finally, small to moderate FIFO memories are the Command Processor, and 6, 8, or 12 Geometry Engines. Each included at the input and output of each Geometry Engine, at the raster memory board comprises 5 Fragment Generators (each with input of each Fragment Generator, and at the input of each Image its own complete copy of the texture memory), 80 Image Engines, Engine. These memories smooth the ¯ow of rendering commands, and enough framebuffer memory to allocate 256 bits per pixel to a helping to insure that the processors are utilized ef®ciently. 1024 1280 framebuffer. The display generator board supports all video functions, including video timing, genlock, color mapping, and digital-to-analog conversion. Systems can be con®gured with 2.1 Command Processor 1, 2, or 4 raster memory boards, resulting in 5, 10, or 20 Fragment That the Command Processor is required at all is primarily a func- Generators and 80, 160, or 320 Image Engines. tion of the OpenGLTM [8][7] graphics language. OpenGL is modal, To get an initial notion of how the