<<

Attention:

This material is copyright  1995-1997 Chris Hecker. All rights reserved. You have permission to read this article for your own education. You do not have permission to put it on your website (but you may link to my main page, below), or to put it in a book, or hand it out to your class or your company, etc. If you have any questions about using this article, send me email. If you got this article from a web page that was not mine, please let me know about it. Thank you, and I hope you enjoy the article, Chris Hecker definition six, incorporated [email protected] http://www.d6.com/users/checker PS. The email address at the end of the article is incorrect. Please use [email protected] for any correspondence. BEHIND THE SCREEN by Chris Hecker

An Open Letter to : Do the Right Thing for the 3D Game Industry

debate is raging in the game development community on an incredibly important topic: 3D for the PC, and specifically, versus OpenGL. This debate has its share of contentless flames, but at its core A is an issue that will affect the daily lives of 3D game developers for many years to come. Being one of vendors have voting seats). In the My opinion started to change last 14 these developers, I care deeply about space I have here, I’m not going to be year, after a few important events. how this issue is resolved. To put it able to describe either API, so I’m going First, a bunch of people at SGI got fed very bluntly, I believe it would be best to assume you’re familiar with both up with Microsoft’s game evangelists for the 3D game industry if Microsoft (check the references at the end of the telling developers that OpenGL was canceled Direct3D Immediate Mode article for more information on the inherently slow. They decided to and put all their 3D immediate mode APIs themselves). prove that it was at least as fast as resources behind their OpenGL team. Next, the history: This debate has Direct3D — if not faster — with a This article will give my rationale for been going on for a long time. I demo at Siggraph ‘96. This event got that statement. stopped looking for the original everyone’s attention, and indeed, The API issue has many twists and “Direct3D versus OpenGL” Usenet post- focused it on Microsoft’s implementa- turns. Debating it is like wrestling Jell- ing when I found one (a reply, no less) tion of OpenGL, which was starting to O; every time you think you’ve got it dating all the way back to March 1996. get pretty fast as well. I took an inter- cornered, it squeezes out somewhere The conventional wisdom used to be est in the issue at this time, and start- else. For this reason, I’m going to pro- that OpenGL was inherently slow — ed reading Usenet posts by knowl- ceed very methodically and lay out a too slow for games — and that edgeable 3D engineers from many logical framework for my opinion. I’m Microsoft had to design their own API. I different companies. Eventually, I sure to leave some holes through bought into this wisdom when I was at became convinced not only that which someone can squeeze if they are OpenGL wasn’t inherently slow, but intent on disagreeing with me, but I Where’s Physics, Part 4 ? that it was inherently faster than think I’ll provide a vast preponderance I’m going to take a short break from our Direct3D at the limit, for reasons I’ll of evidence to back up my claims. Note physics series to cover a topic of great detail below. Next, as everyone proba- that most of the ideas I’ll present have importance to the 3D game industry. bly knows, John Carmack of id been stated by other people in various I’ll be back with physics next issue. Software released a position statement places, so I can’t claim to have origi- in which he made known his choice nated many of these arguments myself. of 3D API: OpenGL. He ported Quake Microsoft (and even helped spread it to OpenGL to prove that the API has there) about two or three years ago. In what it takes for the highest end game Background fact, everyone I knew was convinced programming. Finally, Microsoft irst, some clarifications: When I OpenGL was big and slow, and the only announced plans to update Direct3D F refer to Direct3D in this article, I solution for games was a new and dif- to address some of the issues people mean Direct3D Immediate Mode, not ferent API. In retrospect, I realize that were raising. . Retained Mode will be didn’t understand the technical issues; That’s the history in two paragraphs. useful to some developers, but high- what’s worse, I didn’t know that I did- You’ll notice I keep using the term end 3D games probably won’t use it n’t understand the issues. Even worse “inherent” with regards to perfor- since they need more control over yet, no one within earshot understood mance. I should describe what I mean database traversal and culling. Just to the technical issues well enough to by this. When I say API A is “inherent- be totally clear, Direct3D is a explain why we were all wrong. The ly faster” than API B, I mean that on Microsoft-designed API; OpenGL was people who really understood OpenGL the vast majority of hardware, the dri- originally designed by SGI, but is now were in the workstation business at this ver and application writers for A will handled by an independent time, and didn’t realize 3D games were have more opportunities for optimiza- Architecture Review Board (the ARB, about to become an important market tions, and programs running on top of where Microsoft, SGI, and several other segment. A will be faster in general, than equiva-

http://www.gdmag.com APRIL-MAY 1997 GAME DEVELOPER BEHIND THE SCREEN

lent programs running on B. So, clearly Performance Direct3D, I’ll still cover the specific a bit of hand waving and faith is technical reasons that OpenGL is an involved in saying one API is inherent- et’s start with performance. The inherently faster immediate mode API. ly faster than another. Still, I think it’s L first and most telling thing about If you already agree that OpenGL is possible to take knowledge of the prob- the performance issue is that no one at inherently as fast or faster than lem domain and make a convincing Microsoft (or anywhere, actually) has Direct3D and you just want to see all case for “inherent speed” based on been able to give me a single technical the other overwhelming reasons why things like memory bandwidth and reason why Direct3D is even OpenGL’s OpenGL is superior, you can skip this access patterns, bus speeds, existence equal in 3D performance, let alone its whole next section. It’s going to be proofs embodied in current high-end better. I’m talking about technical heavy reading — remember, I need to hardware, and so on. The inherent engineers on the Direct3D team; when try to prevent the Jell-O from squeez- speed is important, since we don’t asked point blank for an architectural ing out. want to run into performance ceilings reason why Direct3D is (or could be) The performance comparison has imposed by our API. inherently faster than OpenGL, they two aspects. First, we’ll compare This argument takes place on many admit they have none. The only people OpenGL to Direct3D execute buffers. levels. I’m going to show that OpenGL who routinely say Direct3D is faster for Then, we’ll compare OpenGL to the is inherently faster than Direct3D. games are the Microsoft game evange- new Direct3D DrawPrimitive API. However, even if they’re just equal in lists, and they’re marketing people, not performance, the conclusion that technical engineers. Microsoft con- 16 OpenGL is superior still holds, as you’ll stantly refers to Direct3D as being Execute Buffers see. I’m reminded of one of Dave “designed for games,” but when hen comparing OpenGL’s out- Baraff’s dynamics papers that I read pressed, no one seems to be able to Wput model to Direct3D’s execute recently. To paraphrase, “We can always come up with how that translates into buffers, we must consider the three solve this problem because A is never actual performance improvements over types of 3D vertex data: static data, singular, but even if it is singular we can OpenGL. where the vertices of the model don’t solve the problem anyway for this other Although this lack of technical change relative to one another (like the reason.” response is pretty damning for fuselage of an airplane or the arm of a hierarchical model); dynamic data, sible — for the driver writer who’d like the card by the CPU. In DMA-based where most vertices change every frame to optimize for performance. Direct3D hardware, the 3D card starts up an (like undulating water, morphing geom- simply does not allow 3D hardware asynchronous memory transfer to read etry, or a continuous-skinned animating manufacturers the flexibility they need the data directly from main memory figure); and finally, partially static data, to optimize static vertex data render- without needing the CPU. There are, of which is a mix of the previous two. ing. In the interest of full disclosure, I course, hybrids that use both. Each STATIC DATA. For static data, should point out that Direct3D does technique has advantages and disad- OpenGL’s display lists are clearly better have a little known and currently vantages, and which is faster is still an designed than Direct3D’s execute unimplemented function you can call open topic of research (for example, buffers, because they’re opaque and to “optimize” the execute buffers and the current high-end SGI hardware noneditable. By “opaque,” I mean the make them noneditable, but it’s switches dynamically between the two, application doesn’t know the format of unclear whether this feature will ever and in the PC game world, the display list. By “noneditable,” I be implemented. Even if it is eventual- prefers DMA and 3Dfx uses IO). Since mean once created, the display list’s ly implemented, it still wouldn’t allow there’s no consensus as to which is internal data cannot be changed. These Direct3D’s theoretical performance on best, it’s vitally important that the 3D two features together make it possible static data to match OpenGL’s theoreti- API allow either or both, optimally and for driver writers to compile the display cal performance; OpenGL display lists without preference. list into a format appropriate for their have other performance-enhancing Consider how OpenGL’s data-for- hardware and to upload the list onto features execute buffers do not, such as matless function-call model handles the card, even into memory that’s inac- the ability to invoke other display lists. dynamically changing vertex data on 17 cessible to the application. DYNAMIC DATA. 3D hardware accepts both types of 3D hardware. On IO Contrast OpenGL’s display lists with your application’s rendering com- hardware, the OpenGL model allows Direct3D’s execute buffers, which have mands in one of two basic ways: IO or the application to generate the vertex a fixed format dictated by Direct3D DMA. In an IO-based architecture, the data and get it out to the accelerator and are editable by the application 3D card memory maps its data registers with as little data conversion and cache after being created. These features and backs them up with a FIFO, allow- pollution — and as much fine-grained make life much harder — if not impos- ing the data to be written directly to parallelism — as possible. On DMA BEHIND THE SCREEN

hardware, the model allows the data to Finally, some have proposed making one with any other points I’ve missed, be written directly to the card’s DMA the execute buffers writable and resi- either for or against execute buffers. buffers in exactly the optimal format dent in the card’s memory. In this case, for the hardware, without conversion the hardware designer not only needs to or from an intermediate vertex for- to handle all the parsing problems DrawPrimitive mat. The OpenGL model scales from mentioned previously, but also must n some ways, it seems that rasterization-only hardware (such as design the hardware to allow the CPU I Microsoft agrees with my conclu- current PC boards) all the way up to random access to — and even reading sion about execute buffers. The latest high-end hardware that can accept the from — the card-resident execute feature to be added to Direct3D is the model-space vertices directly. buffers and to have interlock protec- DrawPrimitive API. This is a way to Direct3D execute buffers, on the tion to prevent modification of the draw triangles without batching them other hand, deny you maximum per- buffers while they’re being executed. up in execute buffers. I’ve heard differ- formance for dynamic vertex data in No one, to my knowledge, has imple- ent rationalizations for adding this API, one of several ways. On IO-based cards, mented this kind hardware. from the performance limits of execute the application must write out an exe- It’s clear that for dynamic vertex data, buffers that I just discussed, to execute cute buffer full of vertices to main OpenGL allows much more flexibility buffers simply being “too hard for peo- memory, which will then be parsed by for both the application and the 3D ple to use.” Whatever the reason, Direct3D and written to the card. This hardware. This flexibility directly trans- DrawPrimitive isn’t going to avoid all means that your vertex data came out lates into better memory access charac- of Direct3D’s performance problems, 18 of your application, was reformatted teristics and higher performance. I although it will avoid some of the into a big, bandwidth-munching main should point out that a rasterization- memory traffic problems associated memory buffer, then read out of main only card will almost always deal with with execute buffers. Direct3D will still memory, munged by Direct3D, and dynamic vertex data. This is obvious have the data reformatting problems, sent to the card. This extra layer of with a moment’s thought; as the camera for example (the API will take memory traffic and reformatting is moves in 3D, the screen coordinates of Direct3D’s standard vertex formats), so death for high-throughput rendering. all the primitives will change in each your application will still have to read OpenGL sends vertex data either frame. Thus, this section directly applies from its database and convert its ver- directly to the hardware or through a to the vast majority of currently avail- tices into Direct3D structures, and then much smaller and cache-friendly sin- able commodity 3D cards for the PC. pass a pointer to those structures to the gle-primitive buffer. PARTIALLY STATIC DATA. Finally, we API. One might assume applications Now consider how Direct3D execute come to partially static vertex data. If I could use Direct3D’s vertex structures buffers work with DMA-based designs. had to pick the weakest part of my per- internally and save a conversion, but The best case here (or perhaps I should formance argument, it would be here. this puts onerous constraints on the say the “least-worst” case) for Direct3D If you assume that you only need to application. OpenGL doesn’t have is if the 3D hardware can somehow update a few vertices per frame, it these format constraints (see the paper directly DMA and parse the execute seems that editable execute buffers comparing OpenGL to PEX, referenced buffer format. Consider what is could have some benefit. However, this below, for more details on this format- involved in this task alone; the hard- is only the case if the card can directly ting issue). ware must be able to read all current DMA and parse the buffers. If Direct3D Basically, at this point, Direct3D can and future Direct3D vertex and primi- has to parse the buffer itself, then you choose between emphasizing execute tive formats, all commands, and all might as well have dynamic vertex buffers and running into the perfor- flags. Even the Direct3D engineers data, and then all the previous sec- mance problems I’ve mentioned, or it admit that this isn’t likely to happen, tion’s criticisms apply. Also, I’ve yet to can emphasize DrawPrimitive, in which and the hardware engineers from the hear a really compelling and uncon- case we’re stuck with an immature and PC 3D hardware companies I’ve talked trived example for partially static ver- poorly designed clone of OpenGL that’s to agree (no hardware that I know of tex data. The more the static vertex missing some of the architectural deci- does this, either). In addition, the data gets, the easier it is to break up sions that make OpenGL fast. Either application must cope with execute into multiple, completely static sets way, game developers — and players — buffers that are different sizes on differ- (OpenGL’s display lists support this lose as we give away performance and ent cards, none of which might be the mixing of static data by allowing them get nothing in return. optimal size for your application’s data. to invoke other lists). The less static the It gets even worse for Direct3D if the vertex data is, the more it starts to DMA-based card doesn’t parse the exe- resemble dynamic vertex data. Add to Everything Else cute buffers. In this case, the applica- that the disadvantage that you still ersonally, I feel performance scal- tion writes out an execute buffer, then have the data reformatting issues with P ability should be the primary the Direct3D driver must parse the Direct3D, and I’d say it’s at best a wash issue Microsoft considers when decid- buffer, write out the data again to the for both APIs on this one. ing which API to support for game card’s DMA buffers, and then start the So, I think that’s all she wrote for developers; I think I’ve shown that DMA transfer. This is yet another layer execute buffers from a performance per- OpenGL wins handily in this arena. of memory traffic over and above the spective — OpenGL trounces Direct3D. However, there are scads of other rea- IO-based card example. I’d be very interested to hear from any- sons for choosing OpenGL, even if we

GAME DEVELOPER APRIL-MAY 1997 http://www.gdmag.com BEHIND THE SCREEN

ignore performance. I’ll go into those To sum up, Microsoft can extend its OPENGL IS ONLY GOOD FOR SGI now. The reasons are so convincing OpenGL just as fast as they can extend HARDWARE. I used to believe this that even if OpenGL and Direct3D Direct3D. As an added benefit to game myself, but I did a little checking. It were somehow only evenly matched in developers, individual vendors can try turns out that people have done performance, OpenGL would still be a out extensions without needing OpenGL (or its predecessor, IrisGL) on better API for the game industry. Microsoft’s approval. This process is just about every type of hardware As I said before, no one at Microsoft imaginable, from span renderers to was able to give me a technical reason FOR FURTHER INFO full-. Furthermore, why Direct3D was better than OpenGL. consider the number of different That didn’t stop them from giving me Usenet News OpenGL vendors. Check out glQuake a bunch of nontechnical reasons, The never-ending Usenet threads are running on a 3Dfx or an Intergraph. which I’ll address here. available on www.dejanews.com. The more you think about it, the more DRIVERS. Currently (February 1997), Search for OpenGL and Direct3D and you’ll realize Direct3D’s exposed ver- there are more Direct3D drivers than then hunker down for a week’s worth of tex formats and execute buffers dictate OpenGL drivers available for Windows reading. the design of the hardware much more 95. However, this discrepancy is simply so than does OpenGL. That means a matter of time, resources, and evan- John Carmack’s OpenGL Position there’s less room for innovation by gelism. All major 3D card manufactur- Statement hardware designers, and again, we ers have now committed to doing http://redwood.gatsbyhouse.com/ developers lose. 20 OpenGL drivers (glQuake certainly quake/jc122396.txt OPENGL HAS NO CAPS BITS. Direct3D helped motivate people on this front). If that site is down, you can find it on has a mechanism by which you can Microsoft could intensify this develop- DejaNews. You can also find Alex St. test whether a driver supports certain ment and shorten the gap by giving John’s reply to Carmack’s statement on features, like a Z-buffer or alpha blend- their OpenGL team more resources. DejaNews. ing (you test capabilities bits, or caps DIRECTX INTEGRATION. Direct3D is, SGI’s OpenGL Page, including the 1.1 bits). In contrast, OpenGL requires that by definition, integrated with Specification all features be implemented, whether DirectDraw. Still, nothing says OpenGL http://www.sgi.com/Technology/ in hardware or emulated. At first can’t be integrated as well. In fact, by OpenGL glance, it seems that Direct3D has the the time you read this, Microsoft’s advantage here, but that’s not the case. OpenGL-DirectDraw bindings should A Comparison of OpenGL and PEX First, caps bits can’t express the rich- be announced and possibly available in ftp://ftp.sgi.com/opengl/doc/ ness of 3D hardware. For example, a beta. As with drivers, this will be nonis- analysis.ps.Z card that can Z-buffer and have desti- sue in a matter of months. Microsoft’s Direct3D Immediate Mode nation alpha — but not both at the EXTENSIONS. Some people at pages same time — is a real possibility, but Microsoft claim they’ll be able to http://www.microsoft.com/msdn/sdk/ you can’t express that in caps bits. extend Direct3D for game developers’ platforms/doc/sdk//src/ Second, just because a caps bit says a needs faster than OpenGL could be directx_400.htm feature is supported doesn’t mean you extended. This is not only incorrect, http://www.microsoft.com/mediadev/ want to use it, since as we’ve all seen it’s backwards. OpenGL has an official graphics/drawprim.htm on this first generation of hardware, it and time-tested extensions mecha- might be slower than software (or it nism, where vendors can do propri- Microsoft’s OpenGL Docs might not even actually be supported, Microsoft has good OpenGL documenta- etary extensions first, then move them given an unscrupulous vendor). The tion on their web site as well, but I can’t to multivendor extensions, and finally only true solution to this problem is to figure out how to get the direct URL. move them into the next OpenGL profile each feature at installation time Basically, go to the following URL and specification upon ARB approval (of and give the user a choice. You can do select the “Documentation” tab, then which Microsoft is a voting member). this equally well on either OpenGL or “3D Graphics” in the little drop-down Microsoft’s OpenGL team has done Direct3D. box, then wait for the to published extensions already, as have load. Good luck. SGI, , and others. The only way http://www.microsoft.com/msdn/sdk/ for a vendor to get extensions into NonDebatables default.htm Direct3D is to beg Microsoft to put think I’ve covered most of the them in — there is no way to do it Brian Hook’s Online 3D Papers I debatable points. In addition, there themselves. Hardware vendors always http://www.wksoftware.com/ are some points going for OpenGL that mention this as a major problem with publications.html no one debates. Direct3D. What’s more, OpenGL’s EASE OF USE AND ELEGANCE. No one structureless interface makes it much argues with the fact that OpenGL is eas- easier to seamlessly integrate exten- already in place and working in the ier to use and more elegant than sions into the API; adding multitexture OpenGL community. So, if you want Direct3D. This was the central thesis of support to OpenGL is going to be triv- to do a nifty 3Dfx-specific trick in your Carmack’s position statement. He feels ial, while adding it to Direct3D is going next game, you can do it with usability is even more important than to require creating a new vertex type. OpenGL, but not with Direct3D. the performance of the API. For more of

http://www.gdmag.com APRIL-MAY 1997 GAME DEVELOPER his opinion, you should read his state- Summary afraid of Microsoft and the repercus- ment for yourself. It’s in the references. sions it would cause for their compa- SPECIFICATION AND CONFORMANCE. an’t Microsoft fix these problems? nies. Again, this is not competition. It’s also inarguable that OpenGL is Of course they can. They can con- And again, we developers lose. As a better specified. You can read this tinue to change and patch Direct3D final reason for not having both APIs, specification on the Web; again, check until, as Carmack says, it “sucks less.” hardware vendors are forced to spread the references. The spec is based on However, it’ll take them years to reach themselves thin to support more than years of 3D experience, and it does a the level of polish and performance one API, and driver support suffers. great job of balancing specificity and that OpenGL has right now; why So, for all these reasons — for the ambiguity, allowing hardware manu- should we developers have to put up good of the game development indus- facturers room to innovate while pro- with it? For the same effort, Microsoft try — I urge Microsoft to cancel viding software developers with a con- can give their OpenGL team more Direct3D Immediate Mode and fully sistent, high-performance API. In resources to make OpenGL even better, embrace OpenGL as the immediate addition, each OpenGL implementa- and the whole industry benefits and mode game development API of tion must pass a battery of confor- advances. choice. I also urge game developers to mance tests. Direct3D has no compa- Can’t we have both APIs and let take a closer look at Microsoft’s rable specification, and no are them compete for mindshare? Sure, but OpenGL. If you like what you see, use conformance tests available as of this currently the Microsoft OpenGL team is it. Remember to share your opinion writing. prohibited from evangelizing OpenGL with Microsoft and 3D hardware man- DOCUMENTATION AND SAMPLES. to game developers because that would ufacturers. Finally, if you’re a 3D hard- 21 OpenGL is also better documented, run counter to the current “strategy.” ware manufacturer, get your OpenGL with many good books about it avail- That’s not fair technical competition. drivers done as soon as possible. able in stores. The few Direct3D books Also, I showed a draft of this article to Chris Hecker usually inserts a funny available only cover Retained Mode in engineers from several top-echelon PC bio here at the end, but he feels strongly any detail. There are tons of well-writ- 3D hardware manufacturers to get their enough about this topic that he’ll ten OpenGL samples available on the technical review; they all said they’d refrain for an issue. He can be reached Web, also in contrast to Direct3D. say the same thing if they weren’t at [email protected].