XML3D – Interactive 3D Graphics for the Web

Kristian Sons∗ Felix Klein† Dmitri Rubinstein‡ Sergiy Byelozyorov§ Philipp Slusallek¶ DFKI Saarbrücken Saarland University DFKI Saarbrücken Saarland University DFKI Saarbrücken Saarland University Saarland University

Figure 1: A modified version of the Chromium browser with XML3D support showing an extended Wikipedia page about Venice with realtime navigation and interaction with a 3D model of a famous Venetian palace (left) and a car configurator demonstrating the tight integration of 2D and 3D content within the same web page inside the modified browser (right).

Abstract

Web technologies provide the basis to distribute digital information worldwide and in realtime but they have also established the Web as a ubiquitous application platform. The Web evolved from simple text data to include advanced layout, images, audio, and recently streaming video. Today, as our digital environment becomes increasingly three-dimensional (e.g. 3D cinema, 3D video, consumer 3D displays, and high-performance 3D processing even in mobile devices) it becomes obvious that we must extend the core Web technologies to support interactive 3D content.

Instead of adapting existing graphics technologies to the Web, XML3D uses a more radical approach: We take today's Web technology and try to find the minimum set of additions that fully support interactive 3D content as an integral part of mixed 2D/3D Web documents.

XML3D enables portable cross-platform authoring, distribution, and rendering of and interaction with 3D data. As a declarative approach XML3D fully leverages existing web technologies including HTML, Cascading Style Sheets (CSS), the Document Object Model (DOM), and AJAX for dynamic content. All 3D content is exposed in the DOM, fully supporting DOM scripting and events, thus allowing Web designers to easily apply their existing skills. The design of XML3D is based on modern programmable graphics hardware, e.g. it supports efficient mapping to GPUs without maintaining copies. It also leverages a new approach to specify shaders independently of specific rendering techniques or graphics APIs. We demonstrated the feasibility of our approach by integrating XML3D support into two major open browser frameworks from Mozilla and WebKit as well as providing a portable implementation based on JavaScript and WebGL.

CR Categories: I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—Virtual Reality; I.3.6 [Methodology and Techniques]: Standards—Languages

Keywords: XML3D, HTML5, DHTML, DOM, CSS, web integrated, real-time, ray tracing

∗e-mail: [email protected]
†e-mail: [email protected]
‡e-mail: [email protected]
§e-mail: [email protected]
¶e-mail: [email protected]

1 Motivation

Originally designed as an information network, the World Wide Web has moved to an interactive multi-purpose platform for all kinds of applications, including web-mail, social networks, online shops, web mappings, and collaborative encyclopedias such as Wikipedia. The web consists of ubiquitous W3C standards and quickly emerging de facto standards such as

• HTML [W3C 2010] as a markup language for content and structure,
• CSS [W3C 2010] as a presentation definition language for the style,
• DOM [W3C 2009d] representing the hierarchical structure of the web document and
• JavaScript [ECMA ] for client-side DOM scripting.

These technologies comprise what was termed Dynamic HTML (DHTML) and constitute - together with asynchronous server access through the XMLHttpRequest [W3C 2009f] API - the technology stack for nearly all interactive and animated web sites in the browser.

However, HTML itself only supports the description of text and simple, box-shaped 2D graphics including images and video. More flexible 2D vector graphics support has later been added through SVG [W3C 2009e]. However, support for interactive 3D graphics is not yet available in any browser. It can be added through plug-ins that manage the 3D content internally, often using a game engine or an X3D renderer. However, the content managed by these plug-ins is completely separate from the rest of the web page. Using it in Web pages requires learning new APIs, new data models, and often unusual scripting languages. Furthermore, these plug-ins are largely incompatible with each other and often cannot be installed in certain (business) environments. As a result, there is hardly any 3D content on the Web, even though it could significantly enhance the capabilities and usability of many Web applications and despite the fact that it would readily be supported on essentially all the existing hardware platforms for the Web – even down to small and mobile devices.

We believe that a standard technology for 3D graphics on the web should reuse as much as possible of what web technologies already provide. It should thus extend the stack of Web technologies as well as expand the capabilities of all Web documents and applications. An integrated 3D technology would be familiar to web developers and compatible with existing libraries and tools used in web development. Our approach is therefore to design a 3D technology compatible with DHTML that reuses a maximum of its concepts, including Cascading Style Sheets [W3C 2009a] and the Document Object Model [W3C 2009d].

On the Web most of the content developers will not be graphics experts, making it mandatory to provide an intuitive approach to 3D graphics that eliminates most low-level details anyway. Note that this is similar to text: HTML/CSS also do not provide all the options of a professional text layout engine but are "good enough" for the majority of use cases on the Web. Over time these capabilities can and should be increased in a controlled manner, though.

2 Related Work

In the following, we focus on and discuss existing technologies that add sophisticated 2D and 3D graphics capabilities to the web. We also briefly discuss other, non-declarative, API-driven approaches that do not define a markup language but provide the functionality to create content procedurally. While some of these approaches show great results, the use of a 3D system with an API is orthogonal to declarative web documents and therefore difficult to compare to our approach.

2.1 X3D

X3D is an ISO Standard [Web3DConsortium 2008] file format to represent 3D scenes. X3D has an XML encoding, an encoding that is backwards-compatible with VRML97, as well as a FastInfoSet-based binary encoding. In addition to the 3D content description, it also defines a full runtime environment, including scripting and an event system.

Besides several stand-alone browsers that implement the whole or parts of the X3D specification, there is a plug-in based web-browser integration model, where the scene is stored and managed by the plug-in and can be accessed and modified using the Scene Access Interface (SAI), which is also part of the standard. SAI bindings are defined for EcmaScript/JavaScript and Java.

Apart from the optional XML encoding, the X3D standard does not reference or leverage other W3C standards like CSS or DOM Events [W3C 2000]. The style of an X3D document in terms of shading and transformations is not separable from the geometry description using CSS. X3D defines its own event model, which is incompatible with DOM Events. While the SAI defines an integration model for DOM nodes, it is limited to a one-time import of DOM nodes with no bidirectional update or synchronization mechanisms.

The most frequently used node for storing geometry in X3D is the IndexedFaceSet node. While it is a very flexible geometry representation, for many configurations the data must be pre-processed before it can be passed to low-level graphics APIs, like OpenGL or DirectX, which require VertexArrays for good performance. However, the original data must stay available for possible changes via routing or SAI access, which requires storing multiple copies of the data.

The X3D standard defines other geometry nodes that support unindexed or single-indexed vertex data, but these nodes are seldom used. For most of the existing content and for output from common X3D exporters an interim conversion step is necessary to achieve hardware-friendly data structures.

2.2 X3DOM

X3DOM [Behr et al. 2009] is an approach to embed the X3D scene graph into the DOM of a web page. Just like SVG, the 3D content should be displayed in place of the declaration without the need to install a plug-in. To seamlessly integrate X3D into the DOM, its functionality is stripped down to visualization components while dynamics, distribution, security, and scripting are managed through the web technologies provided by the browser.

The proposed architecture of X3DOM consists of a connector component between the browser (front end) and an X3D runtime (backend). The connector transforms the X3DOM scene graph, declared inside the DOM, into the scene graph of the X3D backend. Afterwards, any change of the scene is synchronized between these two representations. While the X3D backend is responsible for rendering the image, media linked in the X3D scene graph can be resolved using the browser's URI/URL streaming mechanism. The current implementation of X3DOM is based on WebGL [Khronos 2009], which includes a simple X3D runtime implemented in JavaScript.

While the approach of X3DOM is similar to XML3D, it tries to add an existing 3D graphics format to the web, rather than consistently extending the current technology where necessary. This way, existing X3D content can be reused in the browser - as long as it does not exceed the proposed DOM profile.

But although X3D is stripped down to visualization and interaction components, it still contains features that do not fit well with the ubiquitous use of DHTML in recent Web 2.0 applications. Thus X3D does not separate the style and layout from the content as is done in HTML with CSS, and authors can realize interaction with sensor nodes and routes, a mechanism that the web community is unfamiliar with. The common way to add interaction to a web page is via DOM Events and DOM Scripting. Adding DOM Events and CSS to X3DOM would result in two opposed mechanisms to achieve the same behaviour, and it is necessary to define which mechanism takes precedence over the other. The same applies to the non-trivial task of synchronizing the tree-based DOM structure with the graph-based X3D structure. Additionally, this approach shares all the limitations that come with X3D as discussed above.

Although X3DOM is an obvious approach to integrate X3D into the DOM, it is not an intuitive way for web authors to handle 3D objects in the DOM.

2.3 SVG

Scalable Vector Graphics (SVG) [W3C 2009e] was introduced in 2001 by W3C as an XML-based file format for describing two-dimensional vector graphics. SVG integrates and leverages other W3C standards, like CSS2 and DOM2. A lot of features are modeled after HTML. All modern browsers have or intend to have native support for SVG embedded into XHTML [W3C 2002]. HTML5 [W3C 2010] allows inline SVG even in text-encoded HTML. SVG also uses DOM events: event handlers such as onmouseover and onclick can be assigned to any SVG graphical object, just as in HTML.

SVG has capabilities for declarative 2D animations via SMIL [W3C 2008] animation elements. But since SMIL does not fit well into DOM concepts and these elements are not very well supported by authoring tools, they are rarely used. Instead most web applications use DOM scripting capabilities for dynamic SVG content. The CSS3 Animations [W3C 2009c] proposal could be a good alternative for declarative animations.

SVG has some other design issues. Though SVG leverages CSS, most properties can also be expressed as attributes, which can lead to ambiguous behavior. Providing a use element, SVG allows the reuse of complete sub-scene graphs. This makes the CSS inheritance down the referenced tree even more complex and hard to implement. Also SVG could leverage existing HTML elements rather than defining its own, i.e [...]

On the other hand, we have a program or script, included in the web document, that creates the 3D content, but where the content itself is never explicitly visible. As a result, the representation of the 3D content is inherently different and separate from the rest of the web page's content and incompatible with a vast amount of programs, scripts and tools used in web development. Thus, for a 3D graphics technology that should be used as an extension of web documents, the imperative approach is inherently unfit.

3 Properties of HTML

In order to integrate support for 3D graphics into web documents, we need to understand their properties. Therefore, we have a closer look at how HTML documents are structured and used by web applications.

The content of complex web pages and web applications is often generated on the server dynamically. There are many document management systems that are used to generate the HTML output. The output is sent to the client and rendered by different web browsers on different platforms using a variety of different layout and rendering engines. Because of the common [...] Web applications are portable across platforms, browsers, and renderers.

Another important concept of modern [...] is the separation

For embedding AnySL-shaders we use