(12) United States Patent
Baldwin

(10) Patent No.: US 8,643,659 B1
(45) Date of Patent: Feb. 4, 2014

(54) SHADER WITH GLOBAL AND INSTRUCTION CACHES

(75) Inventor: David R. Baldwin, Weybridge (GB)

(73) Assignee: 3DLabs Inc., Ltd., Hamilton (BM)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1387 days.

(21) Appl. No.: 10/958,758

(22) Filed: Oct. 5, 2004

Related U.S. Application Data

(60) Provisional application No. 60/533,532, filed on Dec. 31, 2003.

(51) Int. Cl.
G09G 5/36 (2006.01)

(52) U.S. Cl.
USPC 345/557

(58) Field of Classification Search
USPC 345/418, 506, 519, 557; 712/219, 228
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

4,928,224 A 5/1990 Zulian
[entry illegible in source] et al. 712/228
7,245,302 B1* 7/2007 Donham et al. 345/519

* cited by examiner

Primary Examiner: Jeffrey Chow

(74) Attorney, Agent, or Firm: 3DLabs Inc., Ltd.

(57) ABSTRACT

An instruction cache and data cache used to virtualize the storage of global data and instructions used by graphics shaders. Present day hardware design stores the global data and instructions used by the shaders in a fixed amount of registers or writable control store (WCS). However, this traditional approach limits the size and the complexity of the shaders that can be supported. By virtualizing the storage of the global data and instructions, the amount of global or state memory available to the shader and the length of the shading programs are no longer constrained by the physical on-chip memory.

20 Claims, 13 Drawing Sheets

[Representative drawing: FRAGMENT SHADING UNIT 110, Instruction Cache 321, Fragment Cache, Virtual Memory 320.]
[Sheet 1 of 13: block diagram of the FRAGMENT SHADING UNIT 110, showing a memory controller, arbiter, fragment cache, filter cache, command stream controller, instruction cache, global cache, sequencer, fragment processor, circular buffer, and associated FIFOs.]

[Sheet 2 of 13, FIG. 1A-A: P20 Core Architecture Block Diagram, showing the T&L subsystem, visibility subsystem 1A160, SD subsystem 1A180, and pixel subsystem 1A190.]

[Sheet 3 of 13, FIG. 1A-B: P20 Core Architecture Block Diagram, legend: message stream, parameter stream, fragment stream, read/write memory interface, read-only memory interface, feedback connections (e.g. wait for completion, bin synchronization, context restore), data daisy chain, request daisy chain, deep FIFO (2-deep ones not shown).]
[Sheet 4 of 13, FIG. 1A-C: P20 Core Architecture Block Diagram (1A120, 1A145), showing texture LOD, texture primary cache, and texture filter blocks, logically in parallel but physically daisy chained.]

[Sheet 5 of 13: no labels recoverable.]

[Sheet 6 of 13: Binning Subsystem 1A110, showing Bin Rasteriser, Bin Manager, Overlap, PF Addr, and PF Data blocks (1C112, 1C113, 1C116).]

[Sheet 7 of 13: WID Subsystem 1A150 (WID Cache 1D152, WID Addr 1D151); Visibility Subsystem 1A160 (Vis Data 1E164).]

[Sheet 8 of 13: FIG. 1F.]

[Sheet 9 of 13: 1A170, showing Texture Index 1G172, Texture Primary Cache, Texture Secondary Cache, Texture Filter, and Texture Addr blocks (1G174, 1G175, 1G176).]

[Sheet 10 of 13: SD Subsystem 1A180; Pixel Subsystem 1A190 (Pixel Addr, Pixel Data).]

[Sheet 11 of 13, FIG. 1: computer system with SYSTEM BUS 431, bridge/memory controller, microprocessor, keyboard I/F manager, RAM, L2 cache, flash/NV memory, display, HDD 470, disk I/F, FDD, CD-ROM, ROM-BIOS, PCMCIA, audio I/F, and speaker (other numerals: 425, 427, 445).]

[Sheet 12 of 13, FIG. 2 (PRIOR ART): FRAGMENT SHADING UNIT with fragment processor, filter cache, arbiter, read-only command stream, and FIFOs (230, 240).]

[Sheet 13 of 13, FIG. 3: Instruction Cache 321, Virtual Memory 320, 323. FIG. 4 (PRIOR ART): FRAGMENT SHADING UNIT 210, Global Registers, WCS (230, 250).]

SHADER WITH GLOBAL AND INSTRUCTION CACHES

FIELD OF THE INVENTION

The present inventions relate to computer graphics and, more particularly, to a computer graphics rendering architecture.

BACKGROUND AND SUMMARY OF THE INVENTION

3D Computer Graphics

One of the driving features in the performance of most single-user computers is computer graphics. This is particularly important in computer games and workstations, but is generally very important across the personal computer market.

For some years, the most critical area of graphics development has been in three-dimensional (3D) graphics. The peculiar demands of 3D graphics are driven by the need to present a realistic view, on a computer monitor, of a three-dimensional scene. The pattern written onto the two-dimensional screen must, therefore, be derived from the three-dimensional geometries in such a way that the user can easily “see” the three-dimensional scene (as if the screen were merely a window into a real three-dimensional scene). This requires extensive computation to obtain the correct image for display, taking account of surface textures, lighting, shadowing, and other characteristics.

The starting point (for the aspects of computer graphics considered in the present application) is a three-dimensional scene, with specified viewpoint and lighting (etc.). The elements of a 3D scene are normally defined by sets of polygons (typically triangles), each having attributes such as color, reflectivity, and spatial location. (For example, a walking human, at a given instant, might be translated into a few hundred triangles which map out the surface of the human’s body.) Textures are “applied” onto the polygons, to provide detail in the scene. (For example, a flat, carpeted floor will look far more realistic if a simple repeating texture pattern is applied onto it.) Designers use specialized modelling software tools, such as 3D Studio, to build textured polygonal

The most challenging 3D graphics applications are dynamic rather than static. In addition to changing objects in the scene, many applications also seek to convey an illusion of movement by changing the scene in response to the user's input. Whenever a change in the orientation or position of the camera is desired, every object in a scene must be recalculated relative to the new view. As can be imagined, a fast-paced game needing to maintain a high frame rate will require many calculations and many memory accesses.

Texturing

There are different ways to add complexity to a 3D scene. Creating more and more detailed models, consisting of a greater number of polygons, is one way to add visual interest to a scene. However, adding polygons necessitates paying the price of having to manipulate more geometry. 3D systems have what is known as a “polygon budget,” an approximate number of polygons that can be manipulated without unacceptable performance degradation. In general, fewer polygons yield higher frame rates.

The visual appeal of computer graphics rendering is greatly enhanced by the use of “textures”. A texture is a two-dimensional image which is mapped into the data to be rendered. Textures provide a very efficient way to generate the level of minor surface detail which makes synthetic images realistic, without requiring transfer of immense amounts of data. Texture patterns provide realistic detail at the sub-polygon level, so the higher-level tasks of polygon processing are not overloaded. See Foley et al., Computer Graphics: Principles and Practice (2.ed. 1990, corr. 1995), especially at pages 741-744; Paul S. Heckbert, “Fundamentals of Texture Mapping and Image Warping,” Thesis submitted to Dept. of EE and Computer Science, University of California, Berkeley, Jun. 17, 1994; Heckbert, “Survey of Computer Graphics,” IEEE Computer Graphics, November 1986, pp. 56; all of which are hereby incorporated by reference. Game programmers have also found that texture mapping is generally a very efficient way to achieve very dynamic images without requiring a hugely increased memory bandwidth for data handling.

A typical graphics system reads data from a texture map, processes it, and writes color data to display memory. The processing may include mipmap filtering which requires access to several maps.
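
By way of illustration only, the following sketch suggests why mipmap filtering requires access to several maps. It is not taken from the specification: the function names (build_mip_chain, bilinear, trilinear), the 2x2 box-filter downsampling, and the trilinear blend between two adjacent levels are all assumptions chosen for clarity. Each filtered sample records the texel reads it performs, so that the number of maps touched can be inspected.

```python
def build_mip_chain(base):
    """Build [level 0, level 1, ...] from a square power-of-two texture
    by averaging 2x2 texel blocks (an assumed, simple downsampling rule)."""
    levels = [base]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [(prev[2 * y][2 * x] + prev[2 * y][2 * x + 1] +
              prev[2 * y + 1][2 * x] + prev[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(half)]
            for y in range(half)
        ])
    return levels


def bilinear(level, u, v, reads):
    """Bilinearly filter one map at normalized (u, v) in [0, 1].
    Every texel fetch is appended to `reads` as (map_size, x, y)."""
    size = len(level)
    x = min(max(u * size - 0.5, 0.0), size - 1.0)
    y = min(max(v * size - 0.5, 0.0), size - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, size - 1), min(y0 + 1, size - 1)
    fx, fy = x - x0, y - y0

    def fetch(tx, ty):
        reads.append((size, tx, ty))
        return level[ty][tx]

    top = fetch(x0, y0) * (1 - fx) + fetch(x1, y0) * fx
    bottom = fetch(x0, y1) * (1 - fx) + fetch(x1, y1) * fx
    return top * (1 - fy) + bottom * fy


def trilinear(levels, u, v, lod):
    """Blend bilinear samples from the two mip levels that bracket `lod`.
    Returns (filtered value, list of texel reads)."""
    lod = min(max(lod, 0.0), len(levels) - 1.0)
    lo = int(lod)
    hi = min(lo + 1, len(levels) - 1)
    t = lod - lo
    reads = []
    a = bilinear(levels[lo], u, v, reads)
    b = bilinear(levels[hi], u, v, reads)
    return a * (1 - t) + b * t, reads


# Usage: a 4x4 checkerboard sampled halfway between level 0 and level 1.
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
levels = build_mip_chain(checker)
value, reads = trilinear(levels, 0.5, 0.5, 0.5)
maps_touched = {size for size, _, _ in reads}
```

In this sketch a single filtered sample performs eight texel reads spread over two maps of different sizes, which is one way to see how mipmap filtering multiplies texture memory traffic relative to a single unfiltered lookup.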