Enabling the Immersive 3D Web with COLLADA & WebGL
Rita Turkowski – June 30, 2010

Introduction

The web is built on standard document formats (HTML, XML, CSS) and standard communication protocols (HTTP, TCP/IP), and it gained momentum through the principle and power of the hyperlink (URL / URI)(1). Browsers and servers implementing those standards have revolutionized publishing and searching. Not only can documents link to other documents, but because the data is provided in a standard open language, it is also possible to link to specific content inside a document, enabling the rich navigation and search functionality that we take for granted today.

Two standard technologies developed by the Khronos Group(2), WebGL and COLLADA, are opening the way for extending the web experience into the third dimension (3D). Thanks to this work, 3D applications can now take advantage of the fundamentals of the web while natively exploiting 3D graphics hardware acceleration.

This white paper describes both Khronos standards at an abstract level and the evolution of an ecosystem that will enable immersive 3D web applications to evolve from currently disparate implementations (i.e., “walled gardens” or “silos”) into standard web publishing accessible natively in a web browser.

What is WebGL?

WebGL(3) is a low-level JavaScript API enabling web applications to take advantage of 3D graphics hardware acceleration in a standard way. Currently, the only way to provide interactive display of 3D content of appreciable quality in a web browser is to create a separate application loaded as a plug-in. Plug-ins are problematic: they require end-user installation that is not transparent, placing a burden on the end user, and they are frequently forbidden by companies' security guidelines, which impedes the general adoption of 3D content.
Consequently, the current situation fosters an ecosystem of "walled garden" communities as opposed to a World Wide Web experience where each user can publish, access, and search content. In contrast, the WebGL standard will provide native access to the graphics hardware by extending the HTML specifications with a new set of objects and functions for 3D graphics. The contribution of this new API is twofold: first, it removes the need to install external plug-ins, thus widening and accelerating its adoption; second, it allows web application developers to rely on a standard that boasts a very large community of qualified 3D professionals. In fact, WebGL is directly derived from the OpenGL ES 2.0 specification, making it a bare-bones yet extremely efficient and powerful graphics API. In this scenario, users of WebGL-enabled browsers, such as Apple Safari, Google Chrome, Mozilla Firefox, Opera and mobile solutions by Nokia, will benefit from the richness of newly created applications, such as 3D virtual worlds, directly from the browser.

Technically speaking, WebGL is an extension to the HTMLCanvasElement (as defined by the W3C’s WHATWG HTML 5 specification Canvas element(4)), being specified and standardized by the Khronos Group. The HTMLCanvasElement represents an element on the page into which graphic images can be rendered using a programmatic interface. The only interface currently standardized by the W3C is the CanvasRenderingContext2D. The Khronos WebGL specification describes another interface, WebGLRenderingContext, which faithfully exposes OpenGL ES 2.0 functionality. WebGL brings OpenGL ES 2.0 to the web by providing a 3D drawing context to the familiar HTML5 Canvas element through JavaScript objects that offer the same level of functionality.

What is COLLADA?

COLLADA(5) is an intermediate language for interactive 3D applications.
It has been designed from the ground up with well-established web technologies: it is essentially XML for 3D assets, using URLs for linking content. COLLADA enables content to flow from content creation tools to interactive applications; it is a lossless, extensible, declarative language well suited for persistent content serialization and retrieval. The content can then be processed with standard tools in its native XML encoding and adapted to any target application. COLLADA is also being used as an interchange format, providing a bridge between authoring applications.

Because COLLADA is a royalty-free open standard based on standard XML technology, its use as a publishing format has been growing steadily with its adoption by Google Earth(6) and other GIS applications, web 3D engines such as Papervision3D(7) and Google O3D(8), and asset repository systems such as 3DVIA(9) and Google 3D Warehouse(10).

Motivation for WebGL

The inclusion of native 3D rendering capabilities inside web browsers, as witnessed by the interest and participation in the Khronos Group's WebGL(3) project, aims at simplifying the development of 3D for the web. It does this by eliminating the need to create a 3D web plug-in, which requires a non-trivial download and manual installation before the end user can view any 3D content. WebGL benefits by harnessing the widely used standard OpenGL ES 2.0 API directly.

Some background is useful here. Many graphics programmers today leverage the OpenGL family of APIs, which are supported by a number of hardware drivers. OpenGL ES is increasingly found on a number of devices (e.g., 3D games for the iPhone are supported via OpenGL ES). Driver support is therefore already good and likely to continue to improve over time. In fact, OpenGL ES 2.0 will work on the desktop even though it is not natively supported by desktop drivers. It can be implemented on top of desktop OpenGL with some care, provided the appropriate restrictions are imposed.
Google created project ANGLE(11) explicitly to implement a compliant OpenGL ES 2.0 API on top of Microsoft’s D3D9, which browsers will most commonly use on Windows(12).

The Khronos WebGL Working Group is motivated to specify how to bring hardware-accelerated 3D graphics to the web for a number of reasons:

- JavaScript performance has radically improved.
- JavaScript is the de facto programming language of the web.
- Many interesting applications are web applications.
- Cloud-based application development is growing.

New capabilities on the web should solve for the widest use case. For example, libraries such as jQuery and Dojo provide programmer-facing abstractions and syntactic help, so that developers rarely need to use the underlying "raw" primitives. Likewise, by providing a shader-based API like OpenGL ES 2.0 within WebGL, Khronos is solving for the widest possible number of use cases, leaving room for new libraries to handle many tasks, including model parsing. This provides a technical infrastructure that benefits both business models and academic research models.

Motivation for COLLADA

The burden COLLADA set out to alleviate is that 3D environments are difficult, expensive and time-consuming to create and manage. Content creation represents by far the largest share of the budget for creating 3D applications. The more that existing content and code for models, shading, animation, special effects, physics, etc. can be shared and syndicated, the more efficiently production pipelines can operate. For instance, as virtual goods are a growing business model for the game industry, and end users increasingly aspire to evolve from consumers into creators and publishers of their own content, COLLADA’s popularity is growing as an asset exchange and publishing format. Many 3D web applications being developed with WebGL are leveraging this trend of content reuse and taking advantage of COLLADA.

Taking advantage of COLLADA and WebGL

WebGL provides native low-level 3D graphics hardware acceleration for JavaScript and is directly accessible within standard web browsers. COLLADA provides XML-encoded content that WebGL applications can use directly. Neither WebGL nor COLLADA specifies how the web application should use the content; they are complementary technologies that a JavaScript web application can leverage, and both, by design, avoid forcing a specific visual representation. Likewise, neither enforces a specific usage of the content or the graphics hardware. This is the key to enabling innovation and remaining future-proof.

What makes COLLADA such a good fit for web-based applications is its inherent use of web technologies. This makes it appealing to WebGL developers. COLLADA uses web technology for its syntax (XML) and makes direct use of resource identifiers (URIs) to query server databases. As such, COLLADA is structurally a compliant citizen of the WWW.

In this particular case (web delivery), as for most other applications (e.g., native games), an encoding specific to the application is necessary. Conditioning can be done COLLADA-in, COLLADA-out for web content. In this context, standard XML tools (such as Ant and XSLT) can be used to process COLLADA .dae XML documents exported by popular DCC tools to create files that can then be easily imported into interactive applications. This explains why COLLADA is widely used on the web today and relied upon by the web’s 3D content repositories, e.g., Google 3D Warehouse, 3DVIA, Daz3D and others.

Note that there is a significant trend to move computation into the digital cloud(13), which means that some (or all) of the content pipeline could be moved into the cloud, changing the balance of delivery versus authoring (specifically in the case of the web).
Consequently, operators will most likely become interested in compression and streaming technology once 3D content bandwidth usage grows significantly.¹

That said, there is no formal web-based bridge between WebGL and COLLADA. Work beyond implementing both specifications is needed to realize COLLADA content in a WebGL-enabled browser. WebGL is a low-level, immediate-mode graphics API with little notion of higher-level visual scene elements such as hierarchical transforms, shapes, instancing, and materials, all of which are central to COLLADA. Therefore, to connect the two, some sort of transformation or glue code must be implemented, even if it is only a retained-mode rendering engine in its simplest form.
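
As a rough illustration of what such glue code might look like, the JavaScript sketch below keeps a tiny retained scene hierarchy and flattens it into a draw list with one world matrix per drawable. It is only a sketch under stated assumptions: the names SceneNode, flatten, translation and multiply are hypothetical and come from neither specification, and the geometry object is a placeholder for mesh data that an application would parse from a COLLADA document and upload to WebGL buffers.

```javascript
// Minimal retained-mode scene graph sketch. All names here are illustrative
// assumptions, not part of the COLLADA or WebGL specifications.
// Matrices are 4x4 and column-major, the layout WebGL uniforms expect.

function identity() {
  return new Float32Array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]);
}

function translation(x, y, z) {
  const m = identity();
  m[12] = x; // the translation lives in the fourth column
  m[13] = y;
  m[14] = z;
  return m;
}

// Returns a * b for column-major matrices (element (row, col) is at col * 4 + row).
function multiply(a, b) {
  const out = new Float32Array(16);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      out[col * 4 + row] = sum;
    }
  }
  return out;
}

// A node holds a local transform, optional geometry, and child nodes.
// Several nodes may reference the same geometry object (instancing).
class SceneNode {
  constructor(name, localMatrix = identity(), geometry = null) {
    this.name = name;
    this.localMatrix = localMatrix;
    this.geometry = geometry;
    this.children = [];
  }

  add(child) {
    this.children.push(child);
    return this;
  }
}

// Walks the hierarchy depth-first, accumulating parent transforms, and
// returns a flat list of { name, geometry, world } entries ready to draw.
function flatten(node, parentWorld = identity(), out = []) {
  const world = multiply(parentWorld, node.localMatrix);
  if (node.geometry) {
    out.push({ name: node.name, geometry: node.geometry, world });
  }
  for (const child of node.children) {
    flatten(child, world, out);
  }
  return out;
}

// Hypothetical content: one wheel mesh instanced twice under a moving parent.
const wheelMesh = { id: "wheel-mesh", vertexCount: 36 }; // placeholder for GPU buffers
const car = new SceneNode("car", translation(10, 0, 0));
car.add(new SceneNode("front-wheel", translation(1, 0, 0), wheelMesh));
car.add(new SceneNode("rear-wheel", translation(-1, 0, 0), wheelMesh));

const drawList = flatten(car);
for (const entry of drawList) {
  // In a browser, this is where the immediate-mode WebGL calls would go,
  // e.g. gl.uniformMatrix4fv(...) with entry.world, then gl.drawArrays(...).
  console.log(entry.name, "x =", entry.world[12]);
}
```

Keeping the hierarchy in plain JavaScript objects and issuing WebGL calls only while walking the flattened list is one possible design; a real engine would also need to handle materials, shaders, and the parsing of the COLLADA document itself.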